Abstract

Background
Energy metabolism and pyroptosis are integral to the pathogenesis of diabetic nephropathy (DN). However, the precise roles of energy metabolism and pyroptosis in DN development remain unclear. This study aims to elucidate the roles of energy metabolism- and pyroptosis-related differentially expressed genes (EMAPRDEGs) in DN development.

Methods
EMAPRDEGs were identified by querying the GeneCards and Gene Expression Omnibus (GEO) databases. Subsequent analyses included Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment, Gene Set Enrichment Analysis (GSEA), and Protein-Protein Interaction (PPI) network analysis. Additionally, mRNA-miRNA, mRNA-drug, and mRNA-transcription factor (TF) interaction networks were constructed. Differential expression and receiver operating characteristic (ROC) curve analyses were performed to evaluate the diagnostic potential of EMAPRDEGs. Immune cell infiltration in DN was assessed using the ssGSEA algorithm, and the expression levels of EMAPRDEGs in DN tissues were validated by quantitative real-time PCR (qRT-PCR).

Results
Thirteen EMAPRDEGs were identified, with GO and KEGG analyses indicating their involvement in energy metabolism pathways. GSEA revealed significant enrichment of these genes in biological pathways associated with diabetic nephropathy. PPI network analysis highlighted the central role of these genes within the relevant pathways. Predictive modeling demonstrated interactions between EMAPRDEGs, 69 miRNAs, and 117 TFs. Immune infiltration analysis showed substantial alterations in immune cell populations, with ADH1B and PC showing a significant correlation with natural killer cells and memory B cells. ROC curve analysis confirmed the diagnostic potential of EMAPRDEGs for diabetic nephropathy. qRT-PCR validated the expression patterns of CASP1, IL-18, PDK4, and FBP1, which were consistent with the bioinformatics predictions.
Conclusion
Bioinformatics analysis identified 13 candidate EMAPRDEGs, among which CASP1, IL-18, PDK4, and FBP1 emerge as potential biomarkers for diabetic nephropathy.

Keywords: GEO database, Diabetic nephropathy, Energy metabolism and pyroptosis, Bioinformatics, qRT-PCR

1. Introduction

Diabetic nephropathy (DN) represents a major microvascular complication of diabetes and a leading cause of end-stage renal disease globally, impacting both developing and developed nations [1]. As kidney damage progresses, patients with DN may eventually require chronic renal replacement therapy. Current international guidelines for managing chronic kidney disease (CKD) in type 2 diabetes emphasize controlling hypertension and hyperglycemia, alongside the use of renin-angiotensin system (RAS) blockers, which help reduce urinary protein excretion and inhibit renal fibrosis. However, these treatments often fail to arrest disease progression [2]. Consequently, there is an urgent need for further research into the pathogenesis of DN to facilitate early diagnosis and refine therapeutic strategies.

Genetic susceptibility, dysregulation of energy metabolism, and inflammatory processes are all critical factors in the progression of DN. Energy metabolism plays a pivotal role in the development and progression of DN, with the underlying molecular mechanisms being multifaceted. These include the activation of inflammatory responses, oxidative stress, advanced glycation end products (AGEs), and the production of reactive oxygen species (ROS) [3]. Additionally, metabolic shifts, such as the transition from oxidative phosphorylation to glycolysis, have been observed in podocytes exposed to high glucose levels [4]. This metabolic reprogramming is closely linked to podocyte dedifferentiation and contributes significantly to the pathogenesis of diabetic nephropathy.
Beyond the direct effects of hyperglycemia on renal cells, lipid accumulation resulting from lipid metabolism disorders also plays a key role in the onset and progression of DN. Lipid toxicity, arising from lipid accumulation within kidney tissues, induces inflammation and fibrosis, thereby exacerbating the disease process [5].

Pyroptosis, a recently identified form of programmed cell death, is increasingly recognized as a critical mechanism in the pathogenesis of DN [6]. This process is primarily mediated by caspase-1 and gasdermin-D (GSDMD), with the NLRP3/caspase-1/GSDMD pathway serving as a central signaling cascade. Activation of NLRP3 triggers inflammasome formation, leading to the activation of caspase-1, which subsequently cleaves pro-IL-1β and pro-IL-18 into their active forms. Caspase-1 also cleaves GSDMD, and the released N-terminal domain of GSDMD translocates to the plasma membrane, where it forms pores that cause membrane rupture and the release of cellular contents, culminating in pyroptosis [7].

Pyroptosis is a distinct form of programmed cell death (PCD) characterized by its inflammatory nature and specific molecular mechanisms. In contrast to apoptosis, which is a non-inflammatory process leading to cell shrinkage and fragmentation without triggering an immune response [8], pyroptosis culminates in cell lysis and the release of pro-inflammatory cytokines such as IL-1β and IL-18. Unlike necroptosis, another form of inflammatory cell death, pyroptosis is uniquely associated with the Gasdermin family of proteins. These proteins form pores in the cell membrane, resulting in cell swelling and eventual rupture. While necroptosis is regulated by receptor-interacting protein kinases (RIPK1 and RIPK3) and often serves as a compensatory mechanism when apoptosis is impaired, pyroptosis is directly triggered by pathogens or danger signals, establishing it as a critical component of the innate immune response [9].
Recent research suggests that dysregulated energy metabolism pathways can modulate programmed cell death [10,11]. Notably, studies have established a link between energy metabolism and pyroptosis in conditions such as acute kidney injury [12], myocardial injury [13], and cancer [14]. For instance, one study demonstrated that ERRα inhibits pyroptosis in an NLRP3-dependent manner while promoting glycolytic metabolism, thereby conferring cisplatin resistance in endometrial cancer cells [14]. Furthermore, Hao et al. identified that microRNA-17-5p (miR-17-5p) inhibits death receptor-6 (DR-6), thereby supporting the survival of renal tubular epithelial cells during hypoxic/ischemic kidney injury [15], highlighting the close relationship between energy metabolism and pyroptosis in renal tubular cells. However, the gene expression patterns related to energy metabolism and pyroptosis in DN remain unclear. To date, no comprehensive research has specifically examined the interplay between energy metabolism and pyroptosis in DN. This study, therefore, aims to identify key targets associated with energy metabolism and pyroptosis in DN, followed by the prediction of potential therapeutic agents. Further research into the interplay between energy metabolism and pyroptosis could lead to novel therapeutic strategies targeting DN.

In this study, we identified energy metabolism- and pyroptosis-related differentially expressed genes (EMAPRDEGs) and performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses, as well as Gene Set Enrichment Analysis (GSEA). Differential gene expression and Receiver Operating Characteristic (ROC) analyses were conducted to assess the diagnostic value of these genes. Protein-protein interaction (PPI) networks were constructed using the STRING database to explore interactions between differentially expressed genes (DEGs) in DN and control groups.
Additionally, microRNAs (miRNAs) and transcription factors (TFs) targeting EMAPRDEGs were predicted, and gene-drug interaction networks were established. Immune infiltration analysis was performed using single-sample Gene Set Enrichment Analysis (ssGSEA), followed by the examination of EMAPRDEG expression levels in patients with DN and healthy controls. In summary, this study investigates the involvement of EMAPRDEGs in DN pathogenesis, offering insights into potential therapeutic avenues.

2. Materials & methods

2.1. Data download

Gene expression data associated with DN were retrieved from the GEO database using the GEOquery package [16], including the datasets GSE30528 [17] and GSE96804 [18,19]. Dataset GSE30528, derived from Homo sapiens and utilizing the GPL571 platform, consists of 22 samples, with 9 samples from patients with DN (group: DN) and 13 from healthy controls (group: Control). Dataset GSE96804, sourced from the GPL17586 platform and also originating from Homo sapiens, includes 61 samples, comprising 41 samples from patients with DN (group: DN) and 20 from healthy controls (group: Control). Both datasets were included in this analysis. To address potential batch effects between datasets GSE30528 and GSE96804, the R package sva (Version 3.50.0) [20] was used for batch effect removal before conducting differential expression analysis. The Surrogate Variable Analysis (SVA) method adjusts for experimental batches or other biological factors that might introduce systematic biases, thereby improving the consistency of sample expression patterns. This step reduces the risk that the results are confounded by batch effects, enhancing the reliability of subsequent analyses. Detailed information about the datasets is presented in Table 1.
Following batch effect correction, the two datasets were combined, resulting in a unified dataset comprising 50 DN samples and 33 control samples.

Table 1. GEO Dataset Information list.

| | GSE30528 | GSE96804 |
|---|---|---|
| Platform | GPL571 | GPL17586 |
| Experiment type | Expression profiling by array | Expression profiling by array |
| Species | Homo sapiens | Homo sapiens |
| Tissue | glomeruli | glomeruli |
| Samples in Control group | Control (13) | Control (20) |
| Samples in Disease group | DN (9) | DN (41) |
| Reference | Transcriptome analysis of human diabetic kidney disease. | Dissection of Glomerular Transcriptional Profile in Patients With Diabetic Nephropathy: SRGAP2a Protects Podocyte Structure and Function. Identification of Transcription Regulatory Relationships in Diabetic Nephropathy. |

DN, Diabetic nephropathy.

To identify genes related to energy metabolism and pyroptosis (EMAPRGs), a comprehensive search was performed using the terms "Energy metabolism" and "Pyroptosis" in the GeneCards database [21] (https://www.genecards.org/). Filtering with the criteria "Protein Coding" and "Score > 2", a total of 564 energy metabolism-related genes (EMRGs) and 59 pyroptosis-related genes (PRGs) were identified. Additionally, 629 EMRGs [22,23,24,25] and 52 PRGs [26] were identified from the published literature. The energy metabolism-related genes from the two sources were intersected and deduplicated, resulting in 105 EMRGs, while the pyroptosis-related genes were merged and deduplicated to obtain 26 PRGs. The integration of these sets revealed 131 EMAPRGs.

2.2. Analysis of differentially expressed genes

Differential expression analysis of transcriptome data from the DN datasets was conducted using the limma package. Genes with |logFC| > 0.15 and P-value < 0.05 were classified as DEGs.
Among these, genes with logFC > 0.15 and P-value < 0.05 were categorized as up-regulated, while those with logFC < −0.15 and P-value < 0.05 were classified as down-regulated. To identify EMAPRDEGs, the intersection of all DEGs from the DN datasets and EMAPRGs was obtained and visualized using a Venn diagram. The overlapping genes were designated as EMAPRDEGs for further analysis. The results of the differential expression analysis were visualized through volcano plots generated with the ggplot2 package in R. Heatmaps of EMAPRDEGs were constructed using the pheatmap package (Version 1.0.12) in R.

2.3. Functional and pathway enrichment analysis

GO analysis was performed across three main domains: Molecular Function (MF), Cellular Component (CC), and Biological Process (BP) [27]. KEGG [28], a widely used database, provides comprehensive information on genomes, biological processes, diseases, and medications. The R package clusterProfiler [29] was employed to perform GO and KEGG analyses on the list of differentially expressed genes. Statistical significance was considered for adjusted P-values (P.adj) < 0.05 and false discovery rate (FDR) values (Q.value) < 0.25, with P-values corrected using the Benjamini-Hochberg (BH) method to minimize false positives.

2.4. Gene set enrichment analysis (GSEA)

GSEA [30], a statistical method for evaluating the enrichment of predefined gene sets in a ranked list of genes, was applied to assess the relationship between gene sets and specific phenotypes. In this study, genes from the DN datasets were ranked based on their logFC values, and GSEA was performed on all differentially expressed genes using the clusterProfiler package. The analysis included a seed value of 2022, 5000 permutations, and a gene set size range of 10–500 genes. The "c2.all.v2022.1.Hs" gene set from the MSigDB database [31] was used.
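The Benjamini-Hochberg (BH) correction applied to the enrichment P-values in Sections 2.3 and 2.4 can be illustrated with a minimal Python sketch. This is not the code used in the study (which relied on clusterProfiler in R); the function name `bh_adjust` and the example P-values are hypothetical and serve only to show how adjusted P-values are derived and compared with the 0.05 cutoff.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down to the smallest so that the
    # adjusted values never decrease as the raw P-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for five gene sets (illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = bh_adjust(raw)
significant = [p < 0.05 for p in adj]
for r, a, s in zip(raw, adj, significant):
    print(f"raw={r:.3f}  adjusted={a:.5f}  P.adj<0.05: {s}")
```

In this hypothetical example the third and fourth gene sets have raw P-values below 0.05 but adjusted values above it, which is the kind of false positive the correction is meant to filter out.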
Statistical significance was defined by adjusted P-values (P.adj) < 0.05 and FDR values (Q.value) < 0.25, with P-values corrected using the BH method to control for false positives.

2.5. Differential expression analysis and ROC analysis of EMAPRDEGs

The Mann-Whitney U test, also known as the Wilcoxon rank-sum test, was employed to evaluate the differences in EMAPRDEG expression between the disease and control groups in the DN datasets. Group comparison plots were created using the ggplot2 package in R, providing a visual representation of the differential expression analysis results. For diagnostic evaluation of the EMAPRDEGs, the receiver operating characteristic (ROC) curve [32] was constructed to depict the relationship between sensitivity and specificity. The pROC package in R was used to generate the ROC curve for the EMAPRDEGs and to compute the area under the curve (AUC) value, which serves as a measure of diagnostic accuracy in DN. For an informative marker, the AUC ranges from 0.5 (no better than chance) to 1 (perfect discrimination), with a higher AUC indicating better diagnostic performance. Specifically, an AUC between 0.5 and 0.7 suggests relatively low diagnostic accuracy, an AUC between 0.7 and 0.9 signifies moderate accuracy, and an AUC exceeding 0.9 reflects high diagnostic accuracy.

2.6. Protein-protein interaction network (PPI) and functional similarity analysis

The STRING database [33], a comprehensive resource for protein interaction analysis, was utilized in this study with a focus on human species interactions. Interactions were filtered with a minimum interaction score of 0.150, which corresponds to the low-confidence threshold in STRING. The filtered interaction data were then used to construct a PPI network, which was visualized using Cytoscape [34] software to provide an intuitive representation of the complex interaction landscape.

GO annotation provides an empirical framework for assessing functional similarities across genes and genomes, serving as a vital tool in bioinformatics.
The GOSemSim package [35] was employed to evaluate the semantic similarity of EMAPRDEGs, calculating a composite score based on the geometric mean of three GO domains: BP, CC, and MF. The ggplot2 package was then used to visualize the results of this functional similarity analysis.

The GeneMANIA [36] database facilitates hypothesis generation regarding gene functions, allowing for the analysis and prioritization of genes based on their interactions. It also supports the prediction of gene functions by identifying genes with similar profiles to a query gene. In this study, genes with similar functions to the hub genes were predicted using GeneMANIA, and the resulting interaction network was downloaded. The figure displays differentially expressed genes in the inner circle, while the outer circle represents genes with analogous functions. The color of the connecting lines indicates the functional relationships between these genes.

2.7. mRNA-miRNA, mRNA-Drug and mRNA-TF interaction network

The Starbase 3.0 database [37], which contains extensive interaction data on miRNA-ncRNA, miRNA-mRNA, miRNA-RNA, and RNA-RNA relationships, was utilized to identify miRNA targets based on experimental data from CLIP-seq and degradome sequencing. A filtering criterion of pancancerNum > 10 was applied to select mRNA-miRNA interactions. Cytoscape software was then employed to visualize the mRNA-miRNA interaction network.

The Comparative Toxicogenomics Database (CTD) [38] was leveraged to predict both primary and secondary drug targets for the EMAPRDEGs. To explore the interactions between these genes and drugs, the "ReferenceCount" > 4 criterion was applied to select significant mRNA-drug interactions. Cytoscape was used to visualize and integrate these mRNA-drug interaction data, generating a comprehensive network.

TFs that interact with the EMAPRDEGs were identified using the CHIPBase database (version 3.0) [37].
The mRNA-TF interaction pairs were filtered based on the sum of "Num. samples (upstream)" and "Num. samples (downstream)" being greater than 8. Cytoscape was then employed to generate a visual representation of the mRNA-TF interaction network, illustrating the connections between the mRNAs and their associated transcription factors.

2.8. Single-sample gene-set enrichment analysis (ssGSEA)

The ssGSEA [39] methodology was applied to calculate the relative abundance of immune cell infiltrates. Using the ssGSEA algorithm from the GSVA package in R, enrichment scores were computed to quantify the infiltration of various immune cell types across the samples. Boxplots were generated to depict the differences in immune cell infiltration between the disease and control groups in the DN datasets. The gene expression matrix from the DN datasets was then analyzed to examine correlations between immune cell profiles and disease status. Additionally, the association between immune cells and EMAPRDEGs was analyzed, with the R package ggplot2 utilized to visualize the correlation data.

2.9. Quantitative real-time PCR (qRT-PCR)

Between July and August 2024, a total of nine whole-blood specimens were collected from the First Affiliated Hospital of Nanchang University, consisting of six samples from patients with DN and three from healthy controls, all aged between 30 and 65 years. Informed consent was obtained from all participants, and the study received ethical approval from the Institutional Ethics Committee (Ethical number: (2024) CDYFYYLK (07–026)). A volume of 3–5 mL of whole blood was drawn from each participant into EDTA-containing tubes for white blood cell (WBC) enrichment. Blood samples from patients with DN were collected prior to any treatment. Total RNA was extracted from the enriched samples. Reverse transcription was performed using the Servicebio® RT First Strand cDNA Synthesis Kit (Servicebio, Wuhan, China), according to the manufacturer's protocol.
For quantitative PCR (qPCR), the 2 × SYBR Green qPCR Master Mix (None ROX) from Servicebio was employed, following the provided guidelines. The thermocycling conditions were as follows: an initial denaturation at 95 °C for 5 min, followed by 40 cycles of denaturation at 95 °C for 10 s and annealing/extension at 60 °C for 30 s. ACTIN was used as the reference gene for data normalization, and gene expression was calculated using the 2^−ΔΔCt method. The primers used for the experiment are listed in Table 8.

Table 8. Primer sequences for quantitative real-time PCR.

| Gene Names | Forward (5′-3′) | Reverse (3′-5′) |
|---|---|---|
| CASP3 | TGTTTGTGTGCTTCTGAGCC | CACGCCATGTCATCATCAAC |
| ADH1B | TGCTGAACCAAACTGTGCTG | TGCAACATTGGCTAAGTCGG |
| TALDO1 | CAGCACAGATGCCCGCTTA | CGGCCCGGAATCTTCTTTAGTA |
| PDK4 | GGAGCATTTCTCGCGCTACA | ACAGGCAATTCTTGTCGCAAA |
| SDHB | AAGCATCCAATACCATGGGG | TCTATCGATGGGACCCAGAC |
| PC | GGCTACACCTACCCAGAC | GGAGTCAAACACACGGAAGAC |
| CASP1 | GAGCTTCAGTCAGGTCCATCA | TCTGAGGTCAACATCAGCTCC |
| FBP1 | AGCAGTCAAAGCCATCTCTTC | ACGTCCAGCTTCTTAACTTGA |
| ACOT13 | AGCAGCATGACCCAGAACCTA | GGAGCGTGCCCAGTTTATTAGTA |
| IL18 | GACTGGCTGTGACCCTATCT | TTCCATTTTGTTGTGTCCTG |
| PCK1 | AGCATTCAACGCCAGGTTC | CGAGTCTGTCAGTTCAATACCAA |
| IQGAP1 | TTCTATGCAGCTTTCTCGGG | CTGTCGAACTAAGTATCCACGG |
| PFKFB3 | GATGCCCTTCAGGAAAGCCT | TCCCCGACGTTGAACACTTT |
| ACTIN | AGCCTTCCTTCCTGGGCAT | TGATCTTCATTGTGCTGGGTG |

2.10. Statistical analysis

Data processing and analysis were conducted using R software (Version 4.2.2). For comparisons of continuous variables between two groups, the independent Student's t-test was applied when the data followed a normal distribution. In cases where the data did not exhibit a normal distribution, the Mann-Whitney U test (also known as the Wilcoxon rank-sum test) was employed to assess differences between the groups. The Kruskal-Wallis test was utilized for comparisons involving more than two groups. Categorical variables between two groups were analyzed using either the Chi-square test or Fisher's exact test, depending on the dataset's characteristics.
Spearman's correlation analysis was applied to explore the relationships among variables. All P-values were calculated as two-tailed, and a P-value of less than 0.05 was considered statistically significant.

3. Results

3.1. Technology Roadmap

The technology roadmap of this study is depicted in Fig. 1.

Fig. 1. Technology Roadmap. DN, Diabetic nephropathy. DEG, Differentially expressed genes. EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes. GSEA, Gene set enrichment analysis. GO, Gene ontology. KEGG, Kyoto encyclopedia of genes and genomes. ssGSEA, single-sample gene-set enrichment analysis. ROC, Receiver operating characteristic curve. PPI, Protein-protein interaction network. TF, Transcription factors. RBP, RNA binding protein. qRT-PCR, Quantitative real-time PCR.

3.2. Data set correction

Batch correction for the DN datasets was initially performed using the R package sva, resulting in corrected datasets. To evaluate the effectiveness of the correction, a comparative analysis was conducted on the datasets before and after batch correction. This included generating boxplots and principal component analysis (PCA) plots (Fig. 2A–D), which showed a clear reduction in batch effects post-correction, as evident from the improved clustering and distribution in the corrected datasets.

Fig. 2. Dataset correction. A. Boxplot of the DN datasets before correction, showing the distribution of the data before batch effect removal. B. PCA plot of the DN datasets before correction, showing the PCA results of the samples before batch effect removal, with the first two principal components (PC1 and PC2) on the X and Y axes, respectively. C. Boxplot of the corrected DN datasets, showing the distribution of the data after batch effect removal. D. PCA plot of the corrected DN datasets, showing the PCA results of the samples after batch effect removal. Dataset GSE30528 is shown in light blue and dataset GSE96804 in light red. DN, Diabetic nephropathy. PCA, principal component analysis.

3.3. Analysis of differentially expressed genes

To identify DEGs between the DN and control groups, differential expression analysis was carried out using the limma package in R. A total of 1308 genes were identified as DEGs based on the criteria of |logFC| > 0.15 and P-value < 0.05. Specifically, 608 genes were upregulated (logFC > 0.15) and 700 genes were downregulated (logFC < −0.15) in the DN group relative to the control group, as revealed by the differential analysis and visualized in the volcano plot (Fig. 3A). To identify EMAPRDEGs, the DEG dataset was intersected with the EMAPRGs. Thirteen genes were identified as EMAPRDEGs for further analysis (Table 2), and their overlap was illustrated in a Venn diagram (Fig. 3B).

Fig. 3. Analysis of differentially expressed genes. A. Volcano plot of differential analysis results between DN and Control groups in the DN datasets. B. Venn diagram of DEGs and EMAPRGs in the DN datasets. C. Differential expression heatmap of EMAPRDEGs in the DN datasets. D. Chromosomal localization map. DN, Diabetic nephropathy. DEGs, Differentially expressed genes. EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes. Light red is the diabetic nephropathy (DN) group, light blue is the Control group.
In the heat map, red represents high expression and blue represents low expression. ADH1B, Alcohol Dehydrogenase 1B; CASP3, Caspase 3; CASP1, Caspase 1; PFKFB3, 6-Phosphofructo-2-Kinase/Fructose-2,6-Biphosphatase 3; IQGAP1, IQ Motif Containing GTPase Activating Protein 1; IL18, Interleukin 18; SDHB, Succinate Dehydrogenase Complex Iron Sulfur Subunit B; TALDO1, Transaldolase 1; PCK1, Phosphoenolpyruvate Carboxykinase 1; ACOT13, Acyl-CoA Thioesterase 13; PDK4, Pyruvate Dehydrogenase Kinase 4; FBP1, Fructose-Bisphosphatase 1; PC, Pyruvate Carboxylase.

Table 2. EMAPRDEGs in DN datasets.

| Gene | logFC | AveExpr | t | P.Value | adj.P.Val | B | group |
|---|---|---|---|---|---|---|---|
| ADH1B | 0.367566 | 1.887138 | 6.427951 | 6.67E-09 | 5.94E-07 | 10.13912 | up |
| CASP3 | 0.235233 | 1.968116 | 7.635668 | 2.69E-11 | 1.41E-08 | 15.43041 | up |
| CASP1 | 0.205021 | 1.840216 | 4.795175 | 6.65E-06 | 9.50E-05 | 3.552582 | up |
| PFKFB3 | 0.176863 | 1.769025 | 3.248742 | 0.001648 | 0.006893 | −1.59571 | up |
| IQGAP1 | 0.163056 | 2.067314 | 3.633964 | 0.000471 | 0.002566 | −0.4441 | up |
| IL18 | 0.158416 | 1.76176 | 4.109908 | 8.93E-05 | 0.000697 | 1.105967 | up |
| SDHB | −0.1729 | 2.020003 | −5.18056 | 1.41E-06 | 2.94E-05 | 5.023465 | down |
| TALDO1 | −0.18622 | 2.171838 | −5.59438 | 2.52E-07 | 8.25E-06 | 6.666832 | down |
| PCK1 | −0.1926 | 2.201158 | −3.88242 | 0.000201 | 0.001317 | 0.348477 | down |
| ACOT13 | −0.19821 | 1.935578 | −4.48063 | 2.25E-05 | 0.000242 | 2.400666 | down |
| PDK4 | −0.20604 | 1.868367 | −5.52267 | 3.41E-07 | 1.01E-05 | 6.377741 | down |
| FBP1 | −0.21668 | 2.191973 | −4.4887 | 2.18E-05 | 0.000236 | 2.429621 | down |
| PC | −0.21989 | 2.112074 | −5.0895 | 2.05E-06 | 3.91E-05 | 4.670416 | down |

DN, Diabetic nephropathy. EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes.

A heatmap depicting the differential expression of the 13 EMAPRDEGs between the DN and control groups is shown in Fig. 3C, where gene names are arranged in descending order of logFC. To investigate the chromosomal location of the 13 EMAPRDEGs, annotations were made using the RCircos package (Fig. 3D).
These EMAPRDEGs were located on chromosomes 1, 4, 6, 7, 9, 10, 11, and 15, with chromosome 11 showing the highest concentration of EMAPRDEGs (four genes). The proximity of these genes on chromosome 11 may suggest a genomic correlation among them.

3.4. Functional and pathway enrichment analysis

To elucidate the underlying biological mechanisms associated with the 13 EMAPRDEGs (CASP3, ADH1B, TALDO1, PDK4, SDHB, PC, CASP1, FBP1, ACOT13, IL18, PCK1, IQGAP1, PFKFB3), GO analysis was conducted, focusing on MFs, BPs, and CCs (Table 3). The analysis revealed that these EMAPRDEGs were primarily enriched in BPs such as hexose metabolic processes, monosaccharide metabolic processes, pyruvate metabolism, glucose metabolic processes, and cellular carbohydrate metabolic processes. This suggests a potential role of these genes in regulating energy production and metabolism. Regarding CC, enrichment was observed in the mitochondrial matrix, which implicates these genes in mitochondrial functions, particularly in the respiratory chain and energy production. The mitochondria's role in energy metabolism and apoptosis is well-documented, further supporting the involvement of these genes in regulating cellular energy homeostasis and apoptosis. In terms of MF, the enriched terms included carbohydrate phosphatase activity, sugar-phosphatase activity, monosaccharide binding, carboxylic acid binding, and carbohydrate binding. These functions are closely related to sugar phosphorylation and enzyme activity, which are essential for regulating substrate flux through various metabolic pathways.

Subsequently, KEGG pathway enrichment analysis was performed on the 13 EMAPRDEGs (Table 3). The results highlighted significant enrichment in key metabolic pathways, including carbon metabolism, the citrate cycle (TCA cycle), glycolysis/gluconeogenesis, and the AMPK signaling pathway, all of which are central to cellular energy production.
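The enrichment P-values in Table 3 can be related to the GeneRatio and BgRatio columns through the hypergeometric over-representation test, which is the standard test behind this kind of GO/KEGG enrichment. The following Python sketch is an illustration under that assumption and is not the authors' code; the function name `enrichment_pvalue` and its argument names are hypothetical. It takes the counts for the pyruvate metabolic process term (GeneRatio 4/12, BgRatio 106/18800) directly from Table 3.

```python
from math import comb


def enrichment_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: query genes annotated to the term (GeneRatio numerator)
    n: query genes with any annotation (GeneRatio denominator)
    K: background genes annotated to the term (BgRatio numerator)
    N: background genes with any annotation (BgRatio denominator)
    """
    total = comb(N, n)
    # Sum the probabilities of drawing k, k+1, ... annotated genes.
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


# Pyruvate metabolic process (GO:0006090), counts taken from Table 3.
p = enrichment_pvalue(k=4, n=12, K=106, N=18800)
print(f"P(X >= 4) = {p:.4e}")
```

Under these assumptions the result should land close to the 4.5642E-07 reported for this term in Table 3. The same function can be applied to any other row by substituting its GeneRatio and BgRatio counts.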
To visually represent these findings, bar charts (Fig. 4A) and bubble charts (Fig. 4B) were constructed for the GO and KEGG enrichment analyses. Additionally, network diagrams for biological processes (BP, Fig. 4C), molecular functions (MF, Fig. 4D), cellular components (CC, Fig. 4E), and KEGG pathways (Fig. 4F) were generated. These network diagrams illustrate the connections between the EMAPRDEGs and their respective annotations. The size of each node in the diagrams corresponds to the number of molecules within each entry, with larger nodes indicating a higher quantity of associated molecules.

Table 3. GO/KEGG enrichment analysis results.

| ONTOLOGY | ID | Description | GeneRatio | BgRatio | pvalue | p.adjust | qvalue |
|---|---|---|---|---|---|---|---|
| BP | GO:0019318 | hexose metabolic process | 5/12 | 242/18800 | 2.4951E-07 | 8.1852E-05 | 4.0358E-05 |
| BP | GO:0005996 | monosaccharide metabolic process | 5/12 | 261/18800 | 3.6304E-07 | 8.1852E-05 | 4.0358E-05 |
| BP | GO:0006090 | pyruvate metabolic process | 4/12 | 106/18800 | 4.5642E-07 | 8.1852E-05 | 4.0358E-05 |
| BP | GO:0006006 | glucose metabolic process | 4/12 | 201/18800 | 5.8699E-06 | 0.00078951 | 0.00038927 |
| BP | GO:0044262 | cellular carbohydrate metabolic process | 4/12 | 287/18800 | 2.3905E-05 | 0.00142899 | 0.00070457 |
| CC | GO:0005759 | mitochondrial matrix | 3/12 | 473/19594 | 0.00261407 | 0.09149242 | 0.08254955 |
| MF | GO:0019203 | carbohydrate phosphatase activity | 2/12 | 10/18410 | 1.7476E-05 | 0.00074273 | 0.0004323 |
| MF | GO:0050308 | sugar-phosphatase activity | 2/12 | 10/18410 | 1.7476E-05 | 0.00074273 | 0.0004323 |
| MF | GO:0048029 | monosaccharide binding | 2/12 | 71/18410 | 0.00094398 | 0.02674614 | 0.01556741 |
| MF | GO:0031406 | carboxylic acid binding | 2/12 | 173/18410 | 0.00544685 | 0.05507475 | 0.03205589 |
| MF | GO:0030246 | carbohydrate binding | 2/12 | 270/18410 | 0.01283648 | 0.05507475 | 0.03205589 |
| KEGG | hsa01200 | Carbon metabolism | 4/11 | 115/8164 | 1.1426E-05 | 0.00033135 | 0.00021048 |
| KEGG | hsa00020 | Citrate cycle (TCA cycle) | 3/11 | 30/8164 | 7.2441E-06 | 0.00033135 | 0.00021048 |
| KEGG | hsa00620 | Pyruvate metabolism | 3/11 | 47/8164 | 2.8572E-05 | 0.00055238 | 0.00035088 |
| KEGG | hsa00010 | Glycolysis/Gluconeogenesis | 3/11 | 67/8164 | 8.3175E-05 | 0.00120604 | 0.00076609 |
| KEGG | hsa04152 | AMPK signaling pathway | 3/11 | 121/8164 | 0.00048045 | 0.00557325 | 0.00354018 |

GO, Gene ontology; BP, Biological process; CC, Cellular component; MF, Molecular function; KEGG, Kyoto encyclopedia of genes and genomes.

Fig. 4. Functional enrichment (GO) and pathway enrichment (KEGG) analysis. A. Bar graph of GO and KEGG enrichment analysis results of EMAPRDEGs. B. Bubble plot of GO and KEGG enrichment analysis results of EMAPRDEGs. The ordinate shows the GO and KEGG terms. The bubble plot shows the significance and gene counts of the entries enriched for the 13 EMAPRDEGs in the GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) categories. Each bubble represents a specific biological process (BP), cellular component (CC), molecular function (MF), or pathway (KEGG), and its size and color indicate the number and statistical significance of the genes involved, respectively. C-E. Network diagram of GO enrichment analysis results of EMAPRDEGs (C: BP, D: CC, E: MF). F. Network diagram of KEGG enrichment analysis results of EMAPRDEGs. In the network diagrams (C–F), red dots represent specific pathways and blue dots represent specific genes. GO, Gene ontology. KEGG, Kyoto encyclopedia of genes and genomes. BP, Biological process. CC, Cellular component. MF, Molecular function. EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes. The screening criteria for GO/KEGG enrichment items were P.adj < 0.05 and FDR value (Q.value) < 0.25, and the P-value correction method was Benjamini-Hochberg (BH).

3.5. GSEA enrichment analysis

GSEA was conducted to explore the correlation between gene expression and relevant biological processes, cellular components, and molecular functions in the DN datasets. Enrichment analysis was conducted to identify pathways with significant statistical enrichment (P.adj < 0.05 and FDR value [Q.value] < 0.25). The top four pathways exhibiting the highest normalized enrichment score (NES) were selected for presentation. These results were visualized through a mountain plot (Fig. 5A) and GSEA classical enrichment plots (Fig. 5B–E). The analysis revealed substantial gene enrichment in processes such as collagen fibril assembly and the formation of other multimeric structures, as illustrated in Fig. 5B. Furthermore, pathways related to the inflammatory response mechanism (Fig. 5C), MET-mediated motility promotion (Fig. 5D), and the regulation of Wnt/β-catenin signaling by small molecule compounds (Fig. 5E) were also identified (Table 4).

Fig. 5. GSEA of the DN datasets. A. Mountain plot of the four main biological functions from GSEA of the DN datasets. B-E. Genes in the DN datasets were significantly enriched in Assembly of collagen fibrils and other multimeric structures (B), Inflammatory response pathway (C), MET promotes motility (D), and Regulation of Wnt/β-catenin signaling by small molecule compounds (E). GSEA, Gene set enrichment analysis. DN, Diabetic nephropathy. The screening criteria of gene set enrichment analysis (GSEA) were P.adj < 0.05 and FDR value (Q.value) < 0.25, and the P-value correction method was Benjamini-Hochberg (BH).

Table 4. GSEA enrichment analysis results in DN datasets.

| Description | setSize | EnrichmentScore | NES | pvalue | p.adjust | qvalue |
|---|---|---|---|---|---|---|
| REACTOME_ASSEMBLY_OF_COLLAGEN_FIBRILS_AND_OTHER_MULTIMERIC_STRUCTURES | 47 | 0.64151544 | 2.37927903 | 5.467E-08 | 7.1588E-06 | 5.8347E-06 |
| WP_INFLAMMATORY_RESPONSE_PATHWAY | 26 | 0.67719699 | 2.14019254 | 2.2418E-05 | 0.00084271 | 0.00068684 |
| REACTOME_MET_PROMOTES_CELL_MOTILITY | 33 | 0.5985638 | 2.03937632 | 4.8794E-05 | 0.00161981 | 0.00132022 |
| WP_REGULATION_OF_WNT_BCATENIN_SIGNALING_BY_SMALL_MOLECULE_COMPOUNDS | 14 | 0.71292448 | 1.95151918 | 0.00055144 | 0.01192424 | 0.00971875 |

GSEA, Gene set enrichment analysis. DN, Diabetic nephropathy.

3.6.
Differential expression analysis and ROC analysis of EMAPRDEGs

The expression profiles of the 13 EMAPRDEGs (CASP3, ADH1B, TALDO1, PDK4, SDHB, PC, CASP1, FBP1, ACOT13, IL18, PCK1, IQGAP1, PFKFB3) were compared between the DN and control groups in the DN datasets. The results are displayed in the group comparison plot ([130]Fig. 6A) and show significant differences in the expression of all 13 EMAPRDEGs between the DN and control groups (P < 0.05). Subsequently, ROC curves were constructed using the expression levels of these genes in the DN and control groups ([131]Fig. 6B–F). CASP3 exhibited high diagnostic accuracy for DN, with an AUC of 0.918 ([132]Fig. 6B). The other genes, ADH1B (AUC = 0.851), TALDO1 (AUC = 0.875), PDK4 (AUC = 0.853, [133]Fig. 6C), SDHB (AUC = 0.798), PC (AUC = 0.805), CASP1 (AUC = 0.772, [134]Fig. 6D), FBP1 (AUC = 0.759), ACOT13 (AUC = 0.772), IL18 (AUC = 0.715, [135]Fig. 6E), PCK1 (AUC = 0.728), IQGAP1 (AUC = 0.729, [136]Fig. 6F), and PFKFB3 (AUC = 0.707), displayed moderate diagnostic accuracy (AUC between 0.7 and 0.9).

Fig. 6. Differential expression analysis and ROC analysis of EMAPRDEGs. A. Group comparison plot of EMAPRDEGs in the DN and control groups in the DN datasets. B–F. ROC curves of EMAPRDEGs between the DN and control groups in the DN datasets: CASP3, ADH1B, and TALDO1 (B); PDK4, SDHB, and PC (C); CASP1, FBP1, and ACOT13 (D); IL18 and PCK1 (E); IQGAP1 and PFKFB3 (F). DN, Diabetic nephropathy. EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes. The symbol ∗∗ is equivalent to P < 0.01, which is highly statistically significant. The symbol ∗∗∗ is equivalent to P < 0.001, which is also highly statistically significant. The closer the AUC is to 1, the better the diagnostic performance. An AUC of 0.7–0.9 indicates moderate accuracy, and an AUC > 0.9 indicates high accuracy. ROC, Receiver operating characteristic curve; AUC, Area under the curve. Note: True Positive Rate (TPR) = TP/(TP + FN); True Negative Rate (TNR) = TN/(TN + FP). TP (True Positives): the number of samples that the model correctly predicts to be positive. FN (False Negatives): the number of samples that the model incorrectly predicts to be negative. TN (True Negatives): the number of samples that the model correctly predicts to be negative. FP (False Positives): the number of samples that the model incorrectly predicts to be positive.

3.7. PPI network and functional similarity analysis

The STRING database was employed to analyze PPIs among the 13 EMAPRDEGs (CASP3, ADH1B, TALDO1, PDK4, SDHB, PC, CASP1, FBP1, ACOT13, IL18, PCK1, IQGAP1, PFKFB3). After filtering for EMAPRDEGs that interacted with other nodes, all 13 EMAPRDEGs were retained for further analysis, and a PPI network was constructed and visualized using Cytoscape software ([139]Fig. 7A).

Fig. 7. PPI network and functional similarity analysis. A. Protein-protein interaction network (PPI network). B. Functional similarity box plot of EMAPRDEGs. C. Interaction network of EMAPRDEGs and functionally similar genes, as predicted by the GeneMANIA website. Circles show the EMAPRDEGs and the genes with similar functions, and the line colors represent the type of functional connection. PPI network, Protein-protein interaction network. EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes.

Subsequently, functional similarity analysis of the 13 EMAPRDEGs in the PPI network was performed. The R package GOSemSim was used to compute the semantic similarity among GO terms, sets of GO terms, gene products, and gene clusters.
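The idea behind a per-gene functional similarity score can be illustrated with a minimal sketch. It is a simplified stand-in, not the method used in the study: it scores each gene pair by the Jaccard overlap of their GO term sets, whereas GOSemSim implements graph-based and information-content-based measures. The gene names and gene-to-term assignments below are hypothetical and serve only to make the example runnable.

```python
from itertools import combinations

# Hypothetical GO annotations, for illustration only. The GO IDs exist in
# Table 3, but the gene-to-term assignments are NOT taken from the study.
go_annotations = {
    "GENE_A": {"GO:0019318", "GO:0006090", "GO:0006006"},
    "GENE_B": {"GO:0019318", "GO:0006006", "GO:0005996"},
    "GENE_C": {"GO:0006090"},
}


def jaccard(terms_a, terms_b):
    """Jaccard overlap of two GO term sets (a crude proxy for semantic similarity)."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def pairwise_similarity(annotations):
    """Similarity score for every unordered gene pair."""
    return {
        frozenset((g1, g2)): jaccard(annotations[g1], annotations[g2])
        for g1, g2 in combinations(sorted(annotations), 2)
    }


def mean_similarity(annotations):
    """Average similarity of each gene to all other genes in the set."""
    sims = pairwise_similarity(annotations)
    result = {}
    for gene in annotations:
        scores = [score for pair, score in sims.items() if gene in pair]
        result[gene] = sum(scores) / len(scores) if scores else 0.0
    return result


# Genes ranked from most to least functionally similar to the rest of the set.
ranking = sorted(mean_similarity(go_annotations).items(),
                 key=lambda item: item[1], reverse=True)
for gene, score in ranking:
    print(f"{gene}: {score:.3f}")
```

In a box plot such as Fig. 7B, each gene would be represented by the distribution of its pairwise scores rather than by the mean alone; the mean is used here only to produce a simple ranking.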
The functional similarity results for key genes were presented as boxplots ([142]Fig. 7B). The analysis revealed that FBP1 exhibited the highest degree of functional similarity with the other genes in the network. Additionally, the GeneMANIA database was used to explore the relationships between the 13 EMAPRDEGs and other genes ([143]Fig. 7C). These 13 EMAPRDEGs were linked to other genes primarily through co-expression and physical interactions, further highlighting their involvement in interconnected biological processes.

3.8. mRNA-miRNA, mRNA-Drug, mRNA-TF interaction network

The Starbase 3.0 database was used to predict miRNAs interacting with the 13 EMAPRDEGs (CASP3, ADH1B, TALDO1, PDK4, SDHB, PC, CASP1, FBP1, ACOT13, IL18, PCK1, IQGAP1, PFKFB3). Interactions were filtered using the criterion pancancerNum > 10, and the results were visualized with Cytoscape software ([144]Fig. 8A). The resulting mRNA-miRNA interaction network comprised 6 mRNAs (CASP3, IQGAP1, PC, PDK4, PFKFB3, SDHB) and 55 miRNAs, connected by 69 distinct mRNA-miRNA interactions. The individual interactions are detailed in [145]Table 5.

Fig. 8. Interaction networks of mRNA-miRNA, mRNA-Drug, and mRNA-TF. A. mRNA-miRNA interaction network of EMAPRDEGs; orange circles are mRNAs and light pink V-shaped blocks are miRNAs. B. mRNA-drug interaction network of EMAPRDEGs; orange circles are mRNAs and green triangular blocks are drugs. C. mRNA-TF interaction network of EMAPRDEGs; orange circles are mRNAs and blue diamonds are transcription factors (TFs). EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes. TF, Transcription factor.

Table 5. mRNA-miRNA interaction network nodes.

mRNA | miRNA
CASP3 | hsa-miR-29a-3p, hsa-miR-30a-5p, hsa-miR-101-3p, hsa-miR-30c-5p, hsa-miR-30b-5p
IQGAP1 | hsa-miR-15a-5p, hsa-miR-16-5p, hsa-miR-17-5p, hsa-miR-26b-5p, hsa-miR-33a-5p, hsa-miR-103a-3p, hsa-miR-107, hsa-miR-30c-5p, hsa-miR-15b-5p, hsa-miR-30b-5p, hsa-miR-135a-5p, hsa-miR-186-5p, hsa-miR-188-5p, hsa-miR-320a, hsa-miR-106b-5p, hsa-miR-30e-5p, hsa-miR-361-5p, hsa-miR-362-5p, hsa-miR-374a-5p, hsa-miR-339-5p, hsa-miR-335-5p, hsa-miR-500a-3p, hsa-miR-532-5p, hsa-miR-590-5p, hsa-miR-660-5p, hsa-miR-362-3p, hsa-miR-502-3p, hsa-miR-374b-5p, hsa-miR-1296-5p, hsa-miR-1301-3p
PC | hsa-miR-182-5p
PDK4 | hsa-miR-16-5p, hsa-miR-17-5p, hsa-miR-20a-5p, hsa-miR-23a-3p, hsa-miR-27a-3p, hsa-miR-93-5p, hsa-miR-96-5p, hsa-miR-181a-5p, hsa-miR-182-5p, hsa-miR-205-5p, hsa-miR-15b-5p, hsa-miR-141-3p, hsa-miR-106b-5p, hsa-miR-200a-3p, hsa-miR-301a-3p, hsa-miR-130b-3p, hsa-miR-148b-3p, hsa-miR-671-5p, hsa-miR-454-3p, hsa-miR-708-5p, hsa-miR-301b-3p, hsa-miR-2355-5p
PFKFB3 | hsa-let-7d-5p, hsa-miR-17-5p, hsa-miR-19b-3p, hsa-miR-20a-5p, hsa-miR-93-5p, hsa-miR-106b-5p, hsa-miR-130b-3p, hsa-miR-335-5p, hsa-miR-454-3p, hsa-miR-532-3p
SDHB | hsa-miR-543

Additionally, the CTD database was used to predict potential drugs or small molecule compounds interacting with the 13 EMAPRDEGs. mRNA-drug interaction pairs were filtered using the criterion "reference count" > 4, and the results were visualized using Cytoscape software ([149]Fig. 8B). This analysis identified 6 mRNAs (ADH1B, CASP3, FBP1, IL18, PC, PFKFB3) and 51 different drug molecules, connected by 53 unique mRNA-drug interactions. Detailed interaction data are provided in [150]Table 6.
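The node and edge counts reported for these networks follow from simple set operations on the filtered interaction pairs. For the mRNA-drug network, 53 interactions involve only 51 distinct compounds because two compounds in Table 6 are each linked to two mRNAs ((+)-JQ1 compound with CASP3 and IL18; Valproic Acid with FBP1 and PC). The sketch below illustrates the filtering and counting step under stated assumptions: the five named pairs appear in Table 6, but every reference count, the `GENE_X` row, and the function name `build_network` are hypothetical.

```python
from collections import defaultdict

# (mRNA, compound, reference_count). The first five pairs appear in Table 6;
# all reference counts and the GENE_X row are hypothetical placeholders.
RECORDS = [
    ("CASP3", "(+)-JQ1 compound", 6),
    ("IL18", "(+)-JQ1 compound", 5),
    ("FBP1", "Valproic Acid", 7),
    ("PC", "Valproic Acid", 5),
    ("PFKFB3", "Oxygen", 9),
    ("GENE_X", "Compound Y", 2),  # hypothetical pair that fails the filter
]


def build_network(records, min_refs=4):
    """Keep pairs whose reference count exceeds min_refs and summarize the network."""
    edges = {(mrna, drug) for mrna, drug, refs in records if refs > min_refs}
    targets = defaultdict(set)
    for mrna, drug in edges:
        targets[drug].add(mrna)
    return {
        "mrnas": {mrna for mrna, _ in edges},
        "compounds": set(targets),
        "edges": edges,
        # Compounds linked to more than one mRNA explain why the number of
        # edges can exceed the number of distinct compounds.
        "shared": {drug: sorted(m) for drug, m in targets.items() if len(m) > 1},
    }


network = build_network(RECORDS)
print(len(network["mrnas"]), "mRNAs,",
      len(network["compounds"]), "compounds,",
      len(network["edges"]), "interactions")
print("Shared compounds:", network["shared"])
```

The same counting logic applies to the mRNA-miRNA and mRNA-TF networks, with the filter replaced by the respective database criterion.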
Table 6. mRNA-Drug interaction network nodes.

mRNA | Drug
ADH1B | Benzo(a)pyrene; Tetrachlorodibenzodioxin
CASP3 | 1-Methyl-4-phenylpyridinium; ABT-737; Acrolein; Arsenic Trioxide; benzoylcarbonyl-aspartyl-glutamyl-valyl-aspartyl-fluoromethyl ketone; bisphenol A; Bortezomib; Cadmium Chloride; Camptothecin; Cannabidiol; Capsaicin; chromium hexavalent ion; Cisplatin; cobaltous chloride; Curcumin; Daunorubicin; Doxorubicin; Drugs, Chinese Herbal; Endosulfan; Ethanol; Etoposide; Fenretinide; Fluorouracil; Gemcitabine; Genistein; Glucose; Hydrogen Peroxide; hydroquinone; (+)-JQ1 compound; Mitomycin; Niclosamide; ochratoxin A; Oxidopamine; Paclitaxel; Paraquat; Particulate Matter; Plant Extracts; Quercetin; Resveratrol; Rotenone; sodium arsenite; Staurosporine; Thapsigargin; titanium dioxide; Tretinoin; Vorinostat
FBP1 | Valproic Acid
IL18 | (+)-JQ1 compound; Lipopolysaccharides
PC | Valproic Acid
PFKFB3 | Oxygen

To explore the interactions between the 13 EMAPRDEGs and transcription factors (TFs), a search in the CHIPBase database (version 3.0) was conducted. The screening criterion for mRNA-TF interactions was a combined count of upstream and downstream samples exceeding 8. The mRNA-TF interaction network was visualized using Cytoscape ([152]Fig. 8C). This network comprised 12 mRNAs (CASP3, ADH1B, TALDO1, PDK4, SDHB, PC, CASP1, FBP1, ACOT13, IL18, IQGAP1, PFKFB3) and 45 TFs, with 117 documented interactions between mRNAs and TFs, as detailed in [153]Table 7.

Table 7. mRNA-TF interaction network nodes.

mRNA | TF
CASP3 | CTCF, ELF1, FOSL2, HNF4A, JUND, NRF1, BRD3, RELA, CEBPB, CEBPA, ERG, GABPA, RAD21, SPI1
ACOT13 | CTCF, ELF1, FOSL2, HNF4A, JUND, NRF1, BRD3, RELA, CEBPB
ADH1B | AR, FOXA1, GATA6
CASP1 | CEBPB, EP300, FOS, FOSL2, FOXA1, FOXA2, HNF4A, JUN, JUND, MAX, MYC, NR3C1, TEAD4
FBP1 | HNF4A, RUNX1, SPI1, ESR1
IL18 | RAD21, RUNX1, SPI1, STAG1, CEBPB, CTCF
IQGAP1 | ELF1, SPI1
PC | BCL3, BHLHE40, CEBPA, CEBPB, CREB1, CTCF, EGR1, ELF1, EP300, ERG, ESR1, ESRRA, FOS, FOSL1, FOSL2, GABPA, HES2, HNF4A, JUN, JUND, MAX, MYC, NR3C1, POLR2A, RAD21, RUNX1, SMARCA4, SP1, SPI1, STAT3, USF1, USF2, YY1
PDK4 | AR, CEBPB, ERG, ESR1, FOXA1, FOXA2, HNF4A, SPI1
PFKFB3 | MAX
SDHB | CEBPB, CREB1, CTCF, EGR1, ELF1, EP300, ERG, ESR1, FOXA1, GABPA, MAX, NRF1, RAD21, SMC3, SPI1, STAG1, USF1, USF2, YY1
TALDO1 | MAFK, MAX, MYC, NRF1, REST

TF, Transcription factor.

3.9. Single-sample gene set enrichment analysis (ssGSEA)

The ssGSEA algorithm was employed to assess differences in immune cell infiltration between the DN and control groups across 28 distinct immune cell types. The results were visualized using a group comparison plot ([155]Fig. 9A). Statistically significant differences (P < 0.05) were observed in the infiltration levels of 15 immune cell types between the DN and control groups. A correlation analysis was then performed on the infiltration levels of these 15 immune cell types across the disease and control samples, with the results depicted in [156]Fig. 9B.
In the DN dataset, the strongest positive correlation (r = 0.82) was observed between regulatory T cells and memory B cells, whereas the strongest negative correlation (r = −0.58) was identified between immature dendritic cells and immature B cells.

Fig. 9. Immune infiltration analysis of the DN datasets (ssGSEA). A. Box plot comparing immune cell infiltration between the DN and control groups in the DN datasets. B. Heat map of correlations between the 15 immune cell types with significant differences in the DN datasets. C. Dot plot of correlations between EMAPRDEG expression and the infiltration abundance of the 15 immune cell types in the DN datasets. The symbol ns is equivalent to P ≥ 0.05, which is not statistically significant. The symbol ∗ is equivalent to P < 0.05, which is statistically significant; the symbol ∗∗ is equivalent to P < 0.01, which is highly statistically significant; the symbol ∗∗∗ is equivalent to P < 0.001, which is also highly statistically significant. DN, Diabetic nephropathy. EMAPRDEGs, Energy metabolism and pyroptosis related differentially expressed genes. ssGSEA, single-sample gene-set enrichment analysis. An absolute correlation coefficient (r value) below 0.3 indicates weak or no correlation, 0.3–0.5 indicates weak correlation, 0.5–0.8 indicates moderate correlation, and above 0.8 indicates strong correlation. Light blue denotes the control group, and light red denotes the diabetic nephropathy (DN) group. In the correlation heat map, red indicates positive correlation, blue indicates negative correlation, and the depth of color represents the strength of the correlation.

Further analysis explored the relationship between the abundance of the 15 immune cell types in the DN samples and the expression levels of the 13 EMAPRDEGs. This correlation was visualized through a dot plot ([159]Fig. 9C).
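The correlation coefficients reported in this section, and the strength categories defined in the Fig. 9 legend, can be illustrated with a minimal sketch. It assumes Pearson's r, since the section does not state which coefficient was used; a rank-based coefficient would need a different function. The function names are illustrative, and the handling of values falling exactly on a category boundary (0.5 or 0.8) is an assumption, because the legend does not specify it.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Map |r| to the categories given in the Fig. 9 legend.

    Boundary handling is an assumption: 0.5 is treated as moderate and
    0.8 as moderate, with only values above 0.8 treated as strong.
    """
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak or none"
    if magnitude < 0.5:
        return "weak"
    if magnitude <= 0.8:
        return "moderate"
    return "strong"


# Classify the two coefficients reported for the immune cell heat map.
for r in (0.82, -0.58):
    print(r, correlation_strength(r))
```

Under these thresholds, the regulatory T cell and memory B cell pair (r = 0.82) falls in the strong category, while the immature dendritic cell and immature B cell pair (r = −0.58) falls in the moderate category.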
The dot plot revealed that ADH1B exhibited the strongest positive correlation, with memory B cells (r = 0.79), while PC showed the strongest negative correlation, with natural killer cells (r = −0.70).

3.10. Verification of EMAPRDEGs expression in DN

qRT-PCR analysis was performed to measure the expression levels of the EMAPRDEGs in both DN and control blood samples. Significant differences in expression (P < 0.05) were observed for four genes (CASP1, IL-18, PDK4, and FBP1). Specifically, CASP1 and IL-18 were expressed at higher levels in the DN group than in the control group, while PDK4 and FBP1 showed reduced expression in the DN group. No significant differences in expression were found for CASP3, ADH1B, TALDO1, SDHB, PC, ACOT13, PCK1, IQGAP1, and PFKFB3 between the DN and control groups ([160]Fig. 10).

Fig. 10. Validation of the differential expression of the potential diagnostic markers PC, SDHB, ADH1B, ACOT13, PDK4, CASP1, IQGAP1, TALDO1, PCK1, PFKFB3, FBP1, CASP3, and IL18 via qRT-PCR. ∗P < 0.05, ∗∗P < 0.01; ns: not significant.

4. Discussion

DN remains a leading cause of chronic kidney disease, posing a significant global health challenge [[163]1]. Early detection and therapeutic interventions are urgently needed to halt its progression. Previous studies have shown that elevated glucose levels trigger oxidative stress and DNA damage, contributing to DN pathogenesis [[164]40]. In recent years, the regulatory mechanisms underlying pyroptosis in DN progression have been increasingly elucidated, with key pathways involving oxidative stress and activation of the NLRP3 inflammasome. Pyroptosis has been implicated in renal tubule fibrosis, and persistent inflammation is thought to drive renal interstitial fibrosis, ultimately impairing overall kidney function [[165]41].
The development of DN is closely tied to disruptions in energy metabolism, and emerging evidence suggests that pyroptosis may influence these metabolic pathways [[166][12], [167][13], [168][14]]. Understanding these mechanisms could facilitate the development of novel therapeutic strategies. Consequently, identifying biomarkers associated with energy metabolism and pyroptosis is crucial for the diagnosis and treatment of DN. Our bioinformatics analysis identified 13 key EMAPRDEGs: CASP3, ADH1B, TALDO1, PDK4, SDHB, PC, CASP1, FBP1, ACOT13, IL18, PCK1, IQGAP1, and PFKFB3. GO analysis revealed that these genes are predominantly involved in energy metabolism, with a particular focus on the regulation of mitochondrial function. These findings align with previous BP and MF enrichment results. KEGG pathway analysis further corroborated the involvement of these genes in critical cellular energy metabolism pathways. GSEA highlighted significant enrichment in several pathways, including collagen assembly, inflammatory response, cell migration, and Wnt/β-catenin signaling. Notably, Wnt/β-catenin signaling has been linked to renal cell injury and tubulointerstitial fibrosis in DN [[169]42,[170]43]. The enriched pathways associated with collagen fiber assembly underscore the role of extracellular matrix accumulation in DN progression, a finding consistent with previous research [[171]44]. To further investigate the interactions among the identified EMAPRDEGs in DN, a PPI network was constructed, and the functional similarity of these genes was assessed. Additionally, this study explored potential miRNA and TF interactions, as well as drugs related to DN, and analyzed the expression of EMAPRDEGs in tissue-infiltrating immune cells. The diagnostic potential of these genes was also evaluated. 
Together, these analyses provide novel insights into the roles of energy metabolism and pyroptosis in DN pathogenesis and offer promising avenues for the development of targeted diagnostic and therapeutic strategies. PDK4, a pivotal enzyme in the energy metabolism pathway, exhibits a broad spectrum of activities and is implicated in both physiological and pathological processes across various tissues and systems. In particular, PDK4 regulates glucose oxidation in the myocardium [[172]45], modulates glucagon signaling in the liver and pancreas [[173]46], controls chronic inflammation in the skeletal muscle system [[174]47], and contributes to endovascular calcification during advanced glycosylation [[175]48]. Moreover, PDK4 has been identified as a potential inhibitor of tumor growth [[176]49]. In a high-glucose (HG) environment, downregulation of PDK4 has been shown to protect MPC5 cells by inhibiting Vascular Endothelial Growth Factor A (VEGFA) [[177]50]. These observations highlight a novel protective role of PDK4 at the cellular level, emphasizing its potential as a diagnostic biomarker for chronic kidney disease [[178]51]. Further research has revealed a strong association between PDK4 upregulation and the progression of diabetic nephropathy. In HG conditions, PDK4 suppresses the expression of antioxidant enzymes by inhibiting the activity of nuclear factor erythroid 2-related factor 2 (Nrf2), thereby exacerbating oxidative stress [[179]52]. These findings align with the results of the current study. Interleukin-18 (IL-18), a critical marker of inflammasome activation, is intricately linked to energy metabolism. Studies on IL-18 gene knockout (IL-18−/−) mice have shown that IL-18 deficiency leads to a 47 % greater increase in body weight compared to wild-type mice, along with the development of hyperglycemia, hyperlipidemia, and insulin resistance with age [[180]53]. 
Notably, IL-18−/− mice with dyslipidemia also exhibit inhibition of the Wnt signaling pathway, reduced expression of cyclin D1 (Ccnd1), and disruption of circadian rhythms [[181]54]. IL-18 contributes to the pathogenesis of DN through its proinflammatory effects [[182]55], regulation of oxidative stress, and interaction with key signaling pathways [[183]7]. Notably, elevated IL-18 levels have been identified as a key predictor of DN onset in diabetic patients [[184]56,[185]57], a finding consistent with our own results, suggesting that IL-18 may serve as a reliable biomarker for DN. The development of a novel anti-IL-1R7 antibody, which mitigates IL-18-driven inflammation by blocking NFκB activation and reducing pro-inflammatory cytokines such as IFNγ and IL-6 in human cells and PBMCs, further underscores the therapeutic potential of targeting IL-18 signaling in diseases characterized by excessive IL-18 activity, including DN [[186]58]. Caspase-1 (CASP1), a pivotal mediator of inflammation [[187]59], is also implicated in energy metabolism [[188]60]. Increased expression of CASP1 in DN is associated with exacerbated inflammatory responses, a key factor in the progression of DN [[189]6]. The identification of CASP1 as a differentially expressed gene in our study highlights the importance of inflammation in DN pathogenesis and supports the potential of targeting inflammatory pathways as a therapeutic strategy. Fructose-1,6-bisphosphatase 1 (FBP1), a rate-limiting enzyme in gluconeogenesis, plays a pivotal role in regulating glucose production. In diabetes, FBP1 upregulation has been shown to improve insulin resistance [[190]61]. However, the precise mechanisms through which FBP1 impacts DN and its potential therapeutic implications remain to be fully elucidated and require further investigation. FBP1 exhibits the highest degree of functional similarity to other key genes, suggesting that it may play a comparable role in regulating energy metabolism. 
miRNAs, an endogenous class of small, non-coding RNAs, are essential regulators of gene expression. We constructed an mRNA-miRNA network comprising 55 miRNAs predicted to target six EMAPRDEGs through 69 interactions, some of which have been experimentally validated. For instance, hsa-miR-27a-3p has been implicated in the regulation of DN progression and is proposed as a potential therapeutic target [[191]62]. Similarly, hsa-miR-15b-5p has been shown to protect podocytes from damage in DN [[192]50]. Additionally, an mRNA-TF network was constructed, comprising 45 TFs and 117 predicted mRNA-TF interactions. Several compounds in the mRNA-drug network have also been studied in DN. Notably, a randomized, double-blind, placebo-controlled clinical trial has demonstrated the efficacy of resveratrol in reducing albuminuria in patients with DN [[193]63]. Furthermore, a murine study highlighted the role of quercetin in preventing mesangial cell proliferation during the early stages of DN through modulation of the Hippo signaling pathway [[194]64], while a rat study found that curcumin alleviates DN by inhibiting inflammatory gene expression via reversal of caveolin-1Tyr(14) phosphorylation, which affects TLR4 activation [[195]65]. The infiltration of immune cells and their correlation with the expression of EMAPRDEGs were assessed in both patients with DN and controls. Notable changes in immune cell populations were observed in DN, with ADH1B and PC exhibiting significant associations with memory B cells and natural killer (NK) cells, respectively. The expression level of ADH1B has been found to correlate with various immune checkpoints, highlighting its potential role in regulating the immune microenvironment [[196]66]. Research suggests that tissue-resident lymphocytes in the kidney are capable of rapid responses to pathogens and environmental stimuli [[197]67]. The accumulation of these immune cells may exacerbate the inflammatory response, thereby contributing to the progression of DN.
PC, a key regulator of cellular metabolism, particularly in cancer cells, has been linked to cellular proliferation and migration [[198]68], suggesting that alterations in its expression could influence the immune microenvironment in DN. The immune cell correlation heatmap revealed a strong positive correlation between regulatory T cells and memory B cells (r = 0.82), as well as a moderate negative correlation between immature dendritic cells and immature B cells (r = −0.58). These findings warrant further investigation to clarify the underlying mechanisms. To evaluate the diagnostic potential of the identified genes for DN, ROC curve analysis was performed. All 13 EMAPRDEGs showed diagnostic value for DN, with AUCs ranging from 0.707 to 0.918. qRT-PCR confirmed that the expression levels of CASP1, IL-18, PDK4, and FBP1 were consistent with the bioinformatics analysis results. While our approach builds on methodologies previously utilized [[199][69], [200][70], [201][71], [202][72]], certain limitations remain. First, the sample size may have compromised the robustness of our findings, as suggested by the qRT-PCR validation, in which only four of the 13 genes differed significantly between groups. Future studies should increase the sample size to enhance the reliability of the results. Second, although the FDR method was applied to minimize false positives, the study design still carries a risk of multiple comparison bias, which must be rigorously controlled. Third, as our study relies on GEO data, potential biases related to sample selection, data diversity, quality, and consistency must be considered. To address these issues in future experimental verification, integrating multiple data sources, expanding the sample size and diversity, and ensuring data quality will be critical. Additionally, the prognostic value of the identified EMAPRDEGs remains uncertain.
This study marks the first effort to link genes involved in energy metabolism and pyroptosis in DN, suggesting that these genes may play significant roles in the pathogenesis and progression of the disease. However, further validation is necessary to substantiate these findings.

5. Conclusions

Bioinformatics analysis identified 13 EMAPRDEGs potentially associated with DN. Among these, CASP1, IL-18, PDK4, and FBP1 emerged as promising biological markers for DN, warranting further validation. CASP1 is linked to an enhanced inflammatory response in DN, and its expression level may serve as an indicator of inflammation severity. Rising levels of IL-18, a key regulator of both inflammation and energy metabolism, are considered a key predictor of DN progression.

CRediT authorship contribution statement

Shan He: Writing – original draft, Methodology, Data curation. Jian Ye: Writing – original draft, Data curation. Yu Wang: Funding acquisition. Lu yang Xie: Software. Si Yi Liu: Supervision. Qin kai Chen: Writing – review & editing, Supervision, Methodology.

Ethics approval and consent to participate and publication

This study was reviewed and approved by the Ethics Committee of the First Affiliated Hospital of Nanchang University, with the approval number: (2024) CDYFYYLK (07–026), dated July 22, 2024. All participants provided written informed consent to participate in the study and for their data to be published. This study was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki, as established by the World Medical Association.

Data availability statement

Data are available on reasonable request. All data relevant to the study are included in the article or uploaded as supplemental information. Gene expression data from the GEO repository ([203]https://www.ncbi.nlm.nih.gov/geo/) include the following datasets: [204]GSE30528, [205]GSE96804.
Funding This study was supported by the Fund of Jiangxi Provincial Natural Science Foundation (Grant number: 20212BAB206028, 20232BAB206034). Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgements