Abstract Background Oxidative stress plays a crucial role in the development of diabetic foot ulcers (DFU). However, its underlying mechanisms are not fully understood. The purpose of this study was to use bioinformatics and preliminary validation methods to preliminarily reveal the oxidative stress landscape in DFU. Methods Based on the single-cell and bulk RNA sequencing data of DFU, we conducted differential genes screening, machine learning, PPI network construction, immune infiltration analysis, drug prediction, TF-mRNA-miRNA network, cell-cell interaction, pseudotime trajectory analysis, external cohort validation, and in vitro experiments to develop the oxidative stress landscape in DFU. Results Bulk RNA-seq analysis identified 63 oxidative stress-related genes of DFU (DORGs), and the top 59 genes were screened out for key nodes with close functional associations. Functional enrichment analysis showed significant involvement in oxidative stress response. Drug prediction highlighted Thymoquinone and Erlotinib as potential therapeutic candidates. Machine learning algorithms (SVM-RFE, LASSO and RF) identified BCL2 and 和FOXP2 as candidate hub DORGs for DFU diagnosis. Immune cell infiltration analysis indicated a significant presence of naive B cells and CD8 T cells in DFU. The analysis of single-cell RNA sequencing identified a total of 31,787 cells across 10 distinct clusters, with a notably lower proportion of fibroblasts in DFU group than that in the control group. The expression patterns of BCL2 and 和FOXP2 across the different groups were consistent with findings from bulk RNA sequencing analysis. Notably, fibroblasts derived from DFU patients exhibited the highest oxidative stress scores. Intercellular signaling analysis indicated that fibroblasts serve as crucial communication cells, primarily engaged in COLLAGEN signaling network. Additionally, fibroblasts are categorized into five distinct clusters. Among these, COL6A5+ fibroblasts constitute the predominant cluster in DFU and exhibit low differentiation potential. Furthermore, in vitro experiments successfully established a DFU oxidative stress model of fibroblasts, revealing reduced migration ability in the absence of cell death. Both in vitro findings and external data corroborated the decreased expression levels of BCL2and和FOXP2in DFU. Conclusion The oxidative stress-related genes BCL2 and FOXP2 could serve as diagnostic markers for DFU. Furthermore, we identified the novel pathogenic mechanism associated with oxidative stress in DFU fibroblasts. This study may offer new insights for the diagnosis and treatment of DFU. Supplementary Information The online version contains supplementary material available at 10.1186/s13062-025-00672-5. Keywords: Diabetic foot ulcer, Single-cell RNA, Bulk RNA, Oxidative stress, Fibroblast Introduction The global burden of DFU is substantial, with epidemiological data suggesting that 19–34% of the 537 million people with diabetes worldwide will develop DFU during their lifetime, often facing severe consequences including amputation (20% of cases) and elevated one-year mortality (10%) [[32]1–[33]3]. Reactive oxygen species (ROS) have been identified as key mediators in the pathogenesis of diabetic foot ulcer (DFU), but their exact mechanistic role is still incompletely characterized [[34]4]. Although hyperglucose-induced ROS overproduction is known to disrupt redox homeostasis, initiate inflammatory cascades, and impair the wound healing process [[35]5, [36]6], several questions remain regarding the role of ROS in the progression of DFU, including: (1) which molecular targets in DFU-affected are affected by ROS; (2) which specific skin cells are affected by ROS; and (3) which pathways in these skin cells are influenced. In recent decades, the rapid advancement of disease genomics has established bulk transcriptome sequencing (bulk RNA-seq) as a primary tool for transcriptomic research, leading to the identification of numerous genetic alterations that serve as effective therapeutic targets for DFU [[37]7]. Additionally, single-cell RNA sequencing (scRNA-seq) provides insights into cellular transcriptome heterogeneity, enabling the assessment of gene expression distribution at the cellular level [[38]8]. Given this advantage, many studies have concentrated on identifying potential biomarkers for diabetic complications by integrating bulk RNA-seq and scRNA-seq analyses, which can accurately evaluate the early progression of these complications [[39]9–[40]11]. In this study, we employed both scRNA-seq and bulk RNA-seq data to conduct a systematic bioinformatics analysis aimed at evaluating the oxidative stress levels in the diseased skin of patients with DFU. We identified two oxidativestress-related genes that exhibited the highest significance and determined the major cell clusters affected by oxidative stress. Additionally, we validated the accuracy and applicability of our bioinformatics conclusions through the use of external cohorts and in vitro experiments. Furthermore, we explored the relationships between oxidativestress-related genes, microRNAs (miRNAs), and transcription factors, outlined the immune infiltration landscape, analyzed drugs targeting the oxidativestress-related genes. In conclusion, this study examined the potential molecular mechanisms and therapeutic strategies associated with oxidative stress in DFU, which may offer new insights into DFU research. Methods Bulk RNA-seq data sources, processing and differential expressed gene screening Figure [41]1 illustrates the study flowchart. Bulk RNA-seq data ([42]GSE80178 and [43]GSE143735) were obtained from the GEO database ([44]https://www.ncbi.nlm.nih.gov/geo/). Detailed dataset information is presented in Table [45]1. Recognizing the significant batch effects across multiple datasets, we de-batched the data to eliminate noise and normalize it. A differential expression gene (DEG) screening was subsequently performed to identify key genes associated with the disease. The following steps were conducted using R software (version 4.4.1). First, the two DFU datasets were calibrated, normalized, and log-transformed using the “affy” package [[46]12]. Next, the “SVA” package was employed to remove batch effects, merge the DFU datasets, and the “UMAP” package was utilized for visualization in UMAP format [[47]13, [48]14]. The merged DFU dataset comprises 7 control samples and 10 DFU samples. Following this, DEGs were screened using the “LIMMA” package, with a p-value threshold set at 0.05 and|log2 fold change (FC)| > 1.0 [[49]15]. Finally, volcano plots and heat maps were generated using Sangerbox ([50]http://vip.sangerbox.com/login.html) to display the expression patterns of the DEGs. Fig. 1. [51]Fig. 1 [52]Open in a new tab A flowchart showed the overall Idea of this study. (1) Bulk RNA-seq analysis: Identified 63 oxidative stress-related DFU genes (DORGs), with 59 key nodes showing close functional associations and enrichment in oxidative stress pathways. (2) Machine learning: SVM-RFE, LASSO, and RF algorithms pinpointed BCL2 and FOXP2 as diagnostic hub DORGs. (3) Immune infiltration: Revealed significant increases in naïve B cells and CD8 + T cells in DFU lesions. (4) scRNA-seq analysis: Mapped 31,787 cells across 10 clusters, showing reduced fibroblasts in DFU and consistent BCL2/FOXP2 expression with bulk data; Fibroblast in DFU exhibit elevated oxidative stress. (5) Validation: In vitro DFU models confirmed impaired fibroblast migration, suppressed BCL2/FOXP2, and elevated oxidative stress, corroborated by external datasets Table 1. Basic information on the GEO datasets used in the study GSE series Type Sample size Platform Tissue CON DFU [53]GSE80178 Bulk RNA-seq 3 6 GPL166686 Skin [54]GSE143735 Bulk RNA-seq 4 4 [55]GPL11154 Skin [56]GSE199939 Bulk RNA-seq 11 10 [57]GPL24676 Skin [58]GSE165816 scRNA-seq 8 3 [59]GPL24676 Skin [60]Open in a new tab Weighted gene co-expression network analysis The systems biology strategy Weighted Gene Co-expression Network Analysis (WGCNA) was employed to investigate the correlation between genes [[61]16]. WGCNA identifies co-expressed gene modules and hub genes in bulk RNA-seq data, revealing functional networks and key regulators underlying biological traits. First, the median absolute deviation (MAD) of each gene was calculated, excluding the 50% of genes with the smallest MAD. Second, the differentially expressed gene (DEG) expression matrix was filtered using the “goodSamplesGenes” function to eliminate unqualified genes and samples, leading to the construction of a scale-free co-expression network. Third, adjacency was computed using the “soft” thresholding power (b) based on co-expression similarity, which was subsequently converted into a topological overlap matrix (TOM) to assess gene ratios and dissimilarity. Fourth, modules were identified through hierarchical clustering and dynamic tree cutting, with genes grouped using average linkage and a TOM-based dissimilarity metric (minimum group size = 30). Finally, the module eigengene dissimilarity was calculated, a cutoff was applied, and modules were combined for further analysis. PPI network and enrichment analysis of oxidative Stress-related genes To systematically identify oxidative stress-related genes using the GeneCards database ([62]https://www.genecards.org), we performed an advanced search with the following parameters: The exact keyword "oxidative stress" was entered in the search field. The initial search results were then refined by applying a relevance score threshold >7.0, which represents the top 15% of genes with the strongest documented associations to oxidative stress based on GeneCards' proprietary scoring algorithm. This algorithm integrates multiple lines of evidence including: (1) frequency of co-occurrence in scientific literature, (2) experimental validation from curated databases, (3) pathway enrichment in oxidative stress-related biological processes, and (4) disease associations. Additionally, the Venn diagram tool from Sangerbox was employed to identify overlapping genes from LIMMA, WGCNA, and OSRGs. The genes that overlapped were designated as Oxidative Stress-related Genes in DFU (DORGs). To elucidate the relationships among these genes, we used the STRING database ([63]https://cn.string-db.org/) to construct a protein-protein interaction (PPI) network. This approach allowed us to determine the interactions among protein-coding genes, with a minimum interaction score set at 0.400. Cytoscape software (version 3.10.1) was then employed to create a modified visualization based on the data obtained from STRING, and the “MCODE” plug-in was utilized to identify key interacting genes [[64]17]. We conducted Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, specifying the preanalysis species as “Homo sapiens”. The GO analysis encompassed three categories: biological process (BP), cellular component (CC), and molecular function (MF). Finally, a bubble chart was generated using the Bioinformatics platform ([65]https://www.bioinformatics.com.cn). Drug prediction Drug prediction is based on DORGs and utilizes the DSigDB database to identify potential therapeutic drugs through Enrichr ([66]https://maayanlab.cloud/Enrichr/). The top 10 drugs with the highest scores are then selected and displayed using Sangerbox’s bubble chart tool. Machine learning To further screen DORGs for the early diagnosis of DFU, three machine learning algorithms were employed. SVM-RFE, a feature selection method based on support vector machines, is capable of processing high-dimensional data, reducing redundant features, and significantly enhancing the sensitivity and specificity of genetic diagnosis [[67]18]. LASSO, a regression technique, is used for variable selection and regularization, improving the predictive accuracy and interpretability of statistical models [[68]19]. RF serves as a suitable filtering method, characterized by its lack of restrictions on variable conditions and its ability to predict continuous variables with notable accuracy, sensitivity, and specificity [[69]20]. SVM-RFE, LASSO and RF algorithms were conducted using the ‘e1071’, ‘glmnet’ and ‘randomForest’ R packages, respectively. The intersecting genes identified by these three machine learning methods are regarded as central genes for diagnosing DFU. Subsequently, the expression data of the candidate hub genes in the merged DFU dataset were extracted, and violin plots were utilized to visualize the expression differences between groups. TF-mRNA-miRNA network construction The construction of the TF-mRNA-miRNA network predicts the target transcription factors (TFs) and miRNAs of candidate hub genes, facilitating a deeper understanding of gene regulatory mechanisms. This approach offers a theoretical basis for the development of targeted treatment strategies. To predict the target miRNAs of the candidate hub genes, we utilized the miRWalk database ([70]http://mirwalk.umm.uni-heidelberg.de/) with a score threshold set at > 0.90. The target TFs of the candidate central genes were predicted using the TFTF tool [[71]21], selecting the datasets “FIMO_JASPAR”, “PWMEnrich_JASPAR”, and “GTRD” to identify TFs present in these three datasets. The TF-mRNA-miRNA network was visualized using Cytoscape (version 3.10.1). Immune infiltration analysis To explore the immune status of DFU patients and the correlation between hub genes and immune cells, we conducted immune infiltration analysis. CIBERSORT, an in silico method, characterizes immune cell subsets within high-dimensional genomic data from bulk tissue samples using validated signature matrices [[72]22]. Immune cell infiltration analysis was performed using the “Cibersort” R package [[73]23], and boxplots were utilized to visually compare the proportions of various immune cell types between the two groups. Additionally, a heat map illustrating the correlation among 22 types of infiltrating immune cells was generated using the “corrplot” R package [[74]24]. Similarly, the “ggstatsplot” R package was employed to create a lollipop plot depicting the correlation between hub genes and differentially infiltrating immune cells [[75]25]. Single-cell RNA seq data source, processing and cell annotation ScRNA-seq data ([76]GSE165816) was downloaded from GEO database. Perform the following steps using the Seurat package (version 3.4.1) in R software (version 4.4.1). The CreateSeuratObject function was employed to generate a Seurat object. To eliminate low-quality cells, consistent filtering criteria were applied to each Seurat object. The specific filtering conditions included: the number of genes per cell (nFeature_RNA) ranging from 300 to 5000, the proportion of mitochondrial genes (mt_percent) being less than 15%, the proportion of red blood cell genes (HB_percent) being less than 3%, and the Unique molecular identifier (UMI) count (nCount_RNA) exceeding 1000 while remaining below 97% of the maximum value. Following the filtering process, data normalization (NormalizeData), hypervariable gene screening (FindVariableFeatures), and data standardization (ScaleData) were conducted on the Seurat object. Subsequently, principal component analysis (RunPCA) was performed to reduce the dimensionality of the data. The RunHarmony function was then utilized to correct for batch effects in the single-cell RNA data, specifying the orig.ident variable in the input Seurat object and its metadata as batch information. After correction, the harmony dimensionality reduction results for the first five principal components were examined. To assess the effectiveness of batch effect removal, a principal component map (PCA) based on the Harmony-corrected data was generated using DimPlot. Next, ElbowPlot was employed to determine the optimal number of principal components, and based on these results, harmony dimensionality reduction data were utilized in the FindNeighbors and FindClusters functions for cluster analysis, with resolution parameters adjusted to optimize clustering outcomes (dim = 1:20, resolution = 0.2). Finally, the RunUMAP function was applied to perform UMAP dimensionality reduction, resulting in a two-dimensional visualization map. Cells were manually annotated based on characteristic patterns of marker genes, with the primary reference database for cell marker genes being CellMarker 2.0 ([77]http://117.50.127.228/CellMarker/index.html). Identification and characterization of oxidative stress-related cell clusters in DFU First, the expression data of hub genes from scRNA-seq analysis were extracted to validate the results of the bulk RNA-seq analysis. Box plots were employed to illustrate the expression levels of these core genes across different groups, while violin plots were utilized to depict the expression differences among various cell types. Additionally, the FeaturePlot function was applied to visualize the expression of core genes on UMAP plots for a more intuitive representation. Following this, oxidative stress was characterized across various cell groups. Based on data from the GeneCards database and previous research, the DORGs were categorized into two groups (Supplementary Table [78]6). The first group, termed the OS_up gene set, comprises genes associated with elevated oxidative stress, while the second group, referred to as the OS_down gene set, encompasses genes characterized by low oxidative stress levels. Specifically, the AddModuleScore function from the Seurat package was employed to perform gene set scoring on the single-cell data, and box plots, bubble plots, and UMAP plots were utilized to depict the characterization of scores for the OS_up gene set. Similarly, the representation of the OS_down gene set scores was obtained. Analysis of intercellular signal communication To determine the patterns of incoming and outgoing signals from the cell cluster in DFU, the CellChat package was employed to infer and analyze crosstalk between cells [[79]26]. The normalized Seurat data was utilized as the CellChat object, and CellChatDB.human was selected as the receptor-ligand interaction database. Communication probabilities were calculated using the computeCommunProb function, which demonstrates the number and strength of cell interactions. A scatter plot was created to visualize the contributions of sender and receiver cells in two-dimensional space. The netAnalysis_signalingRole_heatmap function was employed to display the top ten signaling pathways that contribute to the output and input signals of the cell cluster. Additionally, the extractEnrichedLR function was used to extract all significant interacting ligand-receptor pairs and associated signaling genes for a given signaling pathway, thereby illustrating intercellular communication mediated by a single ligand-receptor pair. Furthermore, a violin plot was generated to display the expression levels of all genes within a selected signaling pathway across the cell clusters. Cluster classification and pseudotime analysis Following the determination of the main cell cluster affected by oxidative stress in DFU, a further cluster analysis of this primary cell cluster is conducted. Akin to the previous step, harmony dimensionality reduction is applied in the FindNeighbors and FindClusters functions for cluster analysis, with resolution parameters adjusted to enhance clustering effectiveness (dimensions set to 1:6 and resolution to 0.2). The R package CytoTRACE (version 0.3.3) is utilized to calculate the stemness of each cell cluster, thereby inferring the temporal sequence of cell differentiation [[80]27]. Additionally, single-cell pseudotime analysis is conducted using the R package Monocle2 (version 2.24.0). Dimensionality reduction is achieved through the UMAP method, followed by visualization using the “PLOT_CELL_TRACTURE” function [[81]28]. Subsequently, different cell clusters are sorted according to their pseudo-chronological order. DFU cell model establishment and scratch experiment L929 dermal fibroblasts were obtained from the Institute of Biochemistry and Cell Biology at the Chinese Academy of Sciences in Shanghai, China. The L929 cells were cultured in DMEM (25 mM glucose) supplemented with 10% FBS. To simulate the high oxidative stress microenvironment characteristic of diabetic feet, cells were co-cultured with varying concentrations of H[2]O[2] [[82]29] at 0, 5, 10, 25, 50, and 100 µM for 24 h. The relative proliferation rate of the cells was assessed using the CCK-8 method to determine the optimal concentration of H[2]O[2]. Subsequently, live/dead fluorescence was employed to further evaluate the activity and morphology of L929 cells at this optimal concentration. To establish a cell model exhibiting high oxidative stress, a 2’,7’-dichlorodifluorodiazene (DCFH-DA) fluorescent probe was utilized for quantitative analysis of oxidative stress. Initially, cells in the logarithmic growth phase were treated with media under different culture conditions: the blank group (0% FBS and 33 mM Glu), the CON group (5% FBS and 33 mM Glu), and the DFU group (5% FBS, 33 mM Glu, and 25 µM H[2]O[2]) for 2 h, respectively. Following treatment, the medium was removed, and the cells were washed twice with PBS. Then, 10 µM of DCFH-DA was added to the cell culture dish and incubated for 30 min in the dark. During this process, DCFH-DA is hydrolyzed by intracellular esterases to form non-fluorescent DCFH, which is subsequently oxidized to DCF, a fluorescent compound that reflects the level of oxidative stress within the cells. After incubation, the dye solution was removed, and the cells were washed twice with PBS to eliminate excess dye. Finally, the stained cells were observed using a fluorescence microscope. The intensity of the DCF fluorescence signal directly reflected the production of intracellular ROS. To evaluate cell migration ability under various treatment conditions, a cell migration model was established using a scratch test. First, cultured cells from the blank group, CON group, and DFU group were inoculated into 24-well culture plates, with three replicate wells for each group. The medium for the blank group consisted of 0% FBS and 33 mM glucose (Glu), while the CON group was supplemented with 5% FBS and 33 mM Glu. The DFU group medium contained 5% FBS, 33 mM Glu, and 25 µM H₂O₂. Cells were inoculated into each well at a density of 80-90% confluence. After 24 h of cell culture, the tip of a 10 µL pipette was used to lightly draw a straight line in the center of each well to create an artificial scratch. Following the scratch procedure, PBS was used to remove any remaining floating cells, and the corresponding medium was replenished. At different time points (0 h, 24 h, 48 h), cell migration in the scratched area was photographed using a microscope, and cell mobility was subsequently calculated. The scratch wound healing proportion was quantitatively analyzed using ImageJ software (version 1.51) through the following procedure: First, the scratch area was converted to a binary image using the threshold method. Then, the wound area at 0, 24, and 48 hours was calculated by pixel analysis. Finally, the healing rate was determined as: Healing Proportion (%) = (1 - terminal wound area/initial wound area) × 100%. RT-qPCR, Western blot and validation of external data set To verify the expression of the hub genes in fibroblasts, RT-qPCR was employed. After treating L929 cells under the conditions of the CON and DFU groups for 24 hours, total RNA was isolated using Trizol, followed by reverse transcription to create a cDNA template. Subsequently, RT-qPCR was conducted using the SYBR Green Real-Time PCR Master Mix Plus (Vazyme). Analyses were performed in accordance with the MIQE guidelines. The internal reference gene utilized was GAPDH. Table [83]2 presents the amplification sequences of the primers used. Following 24-hour L929 cell treatment, proteins were extracted using RIPA buffer with protease inhibitors. Lysates (30 µg) were separated by 10% SDS-PAGE and transferred to PVDF membranes. After blocking with 5% skim milk, membranes were incubated overnight with primary antibodies (anti-BCL2/FOXP2, 1:1000, Proteintech), then HRP-conjugated secondaries (1:5000). Signals were detected via ECL, with GAPDH as the loading control. We selected [84]GSE199939 as an external dataset to validate the results of the bioinformatics analysis. Detailed information regarding the dataset is provided in Table [85]1. The ‘LIMMA’ package was then used to standardize the data and extract the expression data of the selected the hub genes from the dataset. Finally, the Sangerbox platform ([86]http://vip.sangerbox.com/login.html) was employed to visualize the expression data using violin plots. Table 2. Amplification primer sequences Gene Primer type Sequences (5’ to 3’) FOXP2 Forward AAGGAGCAGGAGAGGAGGTGT Reverse GGTCTTGGTGTCTCCCCTGTC BCL2 Forward TGCCCGCTTGACTGTGAAG Reverse CCTGTTGCTGTTGATGGGA GADPH Forward CCACTCTTCCACCTTCGAT Reverse CACCACCCTGTTGCTGTA [87]Open in a new tab Statistics All statistical analyses and presentations were performed using R software (version 4.4.1) and GraphPad Prism (version 9.5.0). All statistical tests used are defined in the figure legends. Statistical significance was set at P or adjusted P < 0.05. Results Bulk rna-seq analysis reveals DEGs between DFU and control groups The overall analytical workflow is depicted in Fig. [88]1. The UpSet plot in Fig. [89]2A illustrates the shared and unique DEGs identified in the two bulk RNA-seq datasets ([90]GSE143735 and [91]GSE80178). As shown in Fig. [92]2B and C, after the removal of batch effects, the differences between the original datasets were significantly reduced. A total of 1,033 DEGs were identified in the DFU combined dataset using the LIMMA method, of which 33 were upregulated and 1,000 were downregulated. A volcano plot and heatmap of the DFU DEGs are presented in Fig. [93]2D and E. Fig. 2. [94]Fig. 2 [95]Open in a new tab Identification of DEGs in merged DFU dataset. (A) UpSet plot showing the overlap of differentially expressed genes across two diabetic foot datasets. (B-C) UMAP plots of the diabetic foot datasets before and after batch effect removal. (D) Volcano plot of differentially expressed genes in the merged diabetic foot dataset. The x-axis represents the log2 fold change, and the y-axis represents the negative log10 of the p-value. Red triangles indicate upregulated genes. Green triangles indicate downregulated genes. (E) Heatmap of differentially expressed genes in the diabetic foot dataset. Each row represents a differentially expressed gene, and each column represents a sample from either the disease or control group Identification of key DFU module genes Through the DFU combined dataset ([96]GSE143735 and [97]GSE80178), WGCNA was employed to identify the genes involved in DFU development. Based on scale independence and mean connectivity, we selected β = 28 (scale-free R² = 0.9) as the “soft” threshold (Fig. [98]3A and B). Figure [99]3C shows a cluster tree of DFU patients and control datasets. Using this power, five gene co-expression modules (GCMs) were generated, as illustrated in Fig. [100]3D and E, and represented in different colors. The correlation between DFU incidence and GCMs is depicted in Fig. [101]3F. The Greenyellow module, comprising 1,604 genes (Supplementary Table [102]2), exhibited the highest correlation with DFU (correlation coefficient = -0.78, p = 2.3 × 10⁻⁴), identifying it as the key module for subsequent analysis. Additionally, we calculated the correlation between gene significance and gene module membership for DFU patients in the Greenyellow module. As anticipated, a significant positive correlation was observed (r = 0.63), as shown in Fig. [103]3G. Therefore, the genes in the Greenyellow module demonstrate the most significant correlation with DFU. Fig. 3. [104]Fig. 3 [105]Open in a new tab Identification of module genes in DFU. (A, B) Selection of β = 28 as the soft threshold based on the analysis of scale independence and average connectivity. (C) Dendrogram of diabetic foot and control samples based on hierarchical clustering. (D) Gene co-expression modules highlighted in different colors beneath the gene tree. (E) Heatmap of adjacency for feature genes. (F) Heatmap of module relationships with diabetic foot. The green-yellow module shows a significant correlation with diabetic foot. (G) Correlation plot of samples and gene significance within the green-yellow module PPI network and enrichment analysis of oxidative stress-related genes We obtained 1,658 OSRGs from the GeneCards database (Supplementary Table [106]3). Figure [107]4A presents a Venn diagram identifying 63 high-confidence DORGs that satisfy three independent selection criteria: (1) differentially expressed genes from LIMMA analysis (DFU-related transcriptional changes), (2) a module of co-expressed genes from WGCNA (phenotypically related functional networks in DFU), and (3) predefined OSRGs (mechanically annotated oxidative stress genes).Therefore, the expression levels of these 63 DORGs are considered reliable indicators of oxidative stress status in DFU. Subsequently, we constructed a PPI network composed of proteins derived from DORGs using the online analysis platform STRING to elucidate the interactions among these genes. The MCODE plug-in of Cytoscape software was then employed for modular analysis, with the filtering criteria set as follows: degree cutoff = 2.0, node score cutoff = 0.2, k-hub = 2, and maximum depth = 100. Based on these criteria, we identified the top 59 key nodes, which we considered to be the most influential core genes. As illustrated in Fig. [108]4B, the size and color depth of the circles represent the degree of interconnectedness of the proteins; a greater number of connections to a central protein indicates a stronger relationship, highlighting its importance. The 63 DORGs were further subjected to functional enrichment analyses using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. The most enriched GO terms were classified into Biological Process (BP), Cellular Component (CC), and Molecular Function (MF), with major categories including response to oxidative stress, mitochondrial intermembrane space, and chemoattractant activity, among others (Fig. [109]4C). The most enriched KEGG pathways associated with the DORGs predominantly included those involved in prostate cancer, chemical carcinogenesis - receptor activation, and gastric cancer (Fig. [110]4D). Detailed results of the enrichment analysis can be found in Supplementary Table [111]4. Fig. 4. [112]Fig. 4 [113]Open in a new tab Enrichment analysis of oxidative stress-related genes in DFU. (A) Venn diagram showing the identification of 63 overlapping genes through LIMMA, WGCNA, and the oxidative stress genes. (B) PPI network of the top 59 correlated genes based on the 63 overlapping genes, analyzed using the MCODE plugin in Cytoscape. (C) Bar plot displaying the GO analysis of the overlapping genes, including BP, CC and MF. (D) Bubble plot showing the KEGG analysis of the overlapping genes. (E) Bubble plot depicting the predicted sensitive drugs based on the associated genes. Abbreviations: LIMMA, linear model for microarray data; WGCNA, Weighted Gene Co-expression Network Analysis; PPI, protein-protein interaction; GO, Gene Ontology; BP, biological process; CC, cellular component; MF, molecular function; KEGG, Kyoto Encyclopedia of Genes and Genomes Drug prediction The Enrichr platform, utilizing the DSigDB database, was employed to predict potentially effective intervention drugs. The top 10 compounds were ranked based on Enrichr’s combined score, and only those with an enrichment p value < 0.05 were selected (Supplemental Table [114]4). Among them, Thymoquinone (CTD00000119) and Erlotinib (TTD00007882) emerged as top candidate drugs, showing the highest predicted relevance to the target genes (Fig. [115]4E). Identification of candidate hub DORGs via machine learning Three machine learning algorithm pairs, namely SVM-RFE, RF, and LASSO, were employed to screen hub DORGs from a total of 63 DORGs, with the objective of identifying candidate genes for the early diagnosis of DFU. The SVM-RFE algorithm revealed that the five most critical genes strongly correlated with DFU were BCL2, FOXP2, SYNPO2, BMP4, and BCL6 (Fig. [116]5A and B). The RF algorithm identified BCL2, FOXP2, and AHR as the three most important genes (Fig. [117]5C and D). The results of the LASSO model indicated that the development of DFU was associated with the expression of four genes: FOXP2, FBN1, BCL2, and BCL6 (Fig. [118]5E and F). Finally, a Venn diagram analysis identified BCL2 and FOXP2 as cross-genes, which are regarded as hub DORGs with the most significant impact on DFU and the highest diagnostic value. Notably, both BCL2 and FOXP2 exhibited low expression levels in DFU (Supplementary Fig. [119]1). Fig. 5. [120]Fig. 5 [121]Open in a new tab Machine learning-based selection of candidate diagnostic biomarkers for DFU. (A, B) SVM-RFE algorithm identifies key biomarkers, with the highest accuracy of 0.897 and minimum error of 0.103 for 5 genes. (C, D) Random forest algorithm selects key biomarkers, with error decreasing as the number of decision trees increases, eventually stabilizing, and genes are ranked by importance score. (E, F) LASSO algorithm for biomarker selection, with the optimal gene number (n = 4) corresponding to the lowest point on the curve. (G) Venn diagram showing two common candidate diagnostic genes identified by all three algorithms. (H) Network diagram illustrating the relationships between the two candidate diagnostic genes, miRNAs, and transcription factors. Circles represent mRNA, V shapes represent transcription factors, and diamonds represent miRNAs TF-mRNA-miRNA network The miRWalk database and the TFTF tool were employed to predict the potential miRNA and TF targets of the hub DORGs. As shown in Fig. [122]5H, four common miRNA targets and three common TF targets were identified. These targets may represent key signaling molecules that warrant further investigation in subsequent research. A detailed list of these targets is available in Supplementary Table [123]5. Immune cell infiltration analysis To explore the immune status of DFU, we performed immune cell infiltration assays of the DFU combined dataset ([124]GSE143735 and [125]GSE80178). As illustrated in Fig. [126]6A, the DFU group exhibited a significantly higher number of naive B cells and CD8 T cells compared to the CON group. The correlation heat map of immune cells indicated that memory B cells were positively correlated with activated dendritic cells (r = 0.54), while activated NK cells showed a positive correlation with resting mast cells (r = 0.66). Conversely, a negative correlation was observed between activated NK cells and M0 macrophages (r = -0.70), as well as between resting NK cells and activated NK cells (r = -0.85) (Fig. [127]6B). Among the immune cells, CD8 T cells displayed the strongest positive correlation with BCL2, whereas naive CD4 T cells exhibited the strongest negative correlation (Fig. [128]6C). Additionally, follicular helper T cells were the immune cells most positively correlated with FOXP2, while Tregs T cells showed the most negative correlation (Fig. [129]6D). Fig. 6. [130]Fig. 6 [131]Open in a new tab Immune cell infiltration analysis. (A) Box plot showing the comparison of the proportions of 22 immune cell types between the DFU and control groups. (B) Correlation of the 22 immune cell types, where color and bubble size represent the degree of correlation and significance. (C, D) Correlation between two hub genes and infiltrating immune cells, with bubble size and color indicating the strength and significance of the correlation. *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001 Cell type identification and characterization of hub DORGs through scRNA-seq data Following the quality control of the [132]GSE165816 scRNA-seq data (Supplementary Fig. [133]2), a total of 31,787 cells were annotated into ten distinct cell types, including smooth muscle cells, fibroblasts, stromal endothelial cells, vascular endothelial cells, T cells, mononuclear macrophages, melanocytes, B cells, lymphatic endothelial cells, and Schwann cells (Fig. [134]7A, B). To illustrate the differences in cell composition between CON group and DFU group, we calculated the proportions of various cell types within each group (Fig. [135]7C). The results indicated that the proportion of fibroblasts in the CON group was higher than that in the DFU group. Additionally, we extracted the expression levels of BCL2 and FOXP2 for comparison between the two groups, both of which demonstrated decreased levels in the DFU group compared to the CON group (Fig. [136]7D and G). This finding aligns with the results of prior bioinformatics analyses. Further examination of the differences between cell clusters revealed that both genes were differentially expressed in fibroblasts (Fig. [137]7E and H). The UMAP plot offers a more visual representation of this trend (Fig. [138]7F and I). Fig. 7. [139]Fig. 7 [140]Open in a new tab Oxidative stress-related analysis of DFU based on scRNA-seq data. (A) UMAP plot of skin cells from 8 healthy control and 3 DFU tissue samples. (B) Bubble plot showing the marker genes for each cell clusters. (C) Stacked bar plot displaying the cell clusters in the two groups. (D-F) BCL2 expression analysis: box plot comparing BCL2 expression between the diabetic foot and control groups, violin plot showing inter-group expression differences across cell clusters, and feature plots of BCL2 expression levels in each cell cluster. (G-I) FOXP2 expression analysis: box plot comparing FOXP2 expression between the diabetic foot and control groups, violin plot showing inter-group expression differences across cell clusters, and feature plots of FOXP2 expression levels in each cell cluster. (J-K) OS_up gene set score: violin plot, bubble plot, and feature plots showing high oxidative stress scores across different cell clusters. (M-O) OS_down gene set score: violin plot, bubble plot, and feature plots showing low oxidative stress scores across different cell clusters.*p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001. Characteristics of oxidative stress in different cell clusters of DFU patients Base on the DFU combined dataset ([141]GSE143735 and [142]GSE80178), we identified 63 DORGs. Utilizing the GeneCards database and insights from previous studies (Supplementary Table [143]6), 63 DORGs were categorized into two groups: 31 genes that promote oxidative stress, classified as the OS_up gene set, and 32 repressed genes, classified as the OS_down gene set. Figure [144]7J and K illustrate fibroblasts from DFU patients exhibited the highest scores for the OS_up gene set. Figure [145]7L provides a direct comparison of the distribution of OS_up gene set scores across different cell clusters. Additionally, Fig. [146]7M and N indicate that fibroblasts from DFU patients recorded low scores for the OS_down gene set, while Fig. [147]7O presents the distribution of OS_down gene set scores among the various cell clusters. Cell-cell communication analyses in DFU patients This section of the research was derived from dataset [148]GSE165816. As shown in Fig. [149]8A and B, the number and intensity of cell communication were predominantly concentrated in smooth muscle cells, fibroblasts, and stromal endothelial cells. Figure [150]8C illustrates the signal intensities of each cell type as both receivers and effectors. Fibroblasts and smooth muscle cells primarily function as signal senders and receivers, while basal endothelial cells are major receivers and less likely to act as senders. Figure [151]8D and E demonstrate that the outgoing signaling patterns between cells are primarily associated with COLLAGEN, LAMININ, and MIF. In the top-ranked COLLAGEN signaling network, fibroblasts and smooth muscle cells contributed the most to signal output, whereas basal endothelial cells, smooth muscle cells, and melanocytes contributed to signal reception (Fig. [152]8F). Finally, the bar and violin plots provide ligand-receptor specific information regarding the COLLAGEN signaling network. It is evident that the ligands are predominantly derived from signaling molecules of the COL family, with these ligands mainly distributed in fibroblasts and smooth muscle cells. The primary receptors identified were CD44, SDC4, and SDC1. CD44 was present in eight cell types, excluding lymphatic endothelial cells and vascular endothelial cells, while SDC4 and SDC1 were primarily found in stromal endothelial cells. These findings suggest that fibroblasts, as major signal exporters and partial signal receivers, play a central role in the upstream pathological processes of DFU. Fig. 8. [153]Fig. 8 [154]Open in a new tab Cell-cell communication analysis of DFU. (A-B) Circular plots showing the number and intensity of interactions between cell clusters in DFU. The thickness of the connections represents communication probability and circle size represents number of cells in each cell group. (C) 2D plot displaying the communication intensity between cell clusters in the diabetic foot. (D-E) Heatmaps illustrating the contribution of each cell cluster to outgoing and incoming signaling pathways in the diabetic foot. (F) Heatmap showing the contribution of each cell cluster, as sources and targets, to the COLLAGEN signaling pathway in DFU. (G) Bar plot displaying the contribution of ligand-receptor pairs to the COLLAGEN signaling network in DFU. (H) Violin plot showing the distribution of ligand-receptor pairs across different cell clusters in the COLLAGEN signaling network Identification of fibroblast subclusters and analysis of differentiation trajectories This section of the research was derived from dataset [155]GSE165816. After establishing that fibroblasts exhibited the highest oxidative stress intensity in DFU and demonstrated robust communication with surrounding cells, we categorized them into subgroups. Figure [156]9A illustrates the division of fibroblasts into five distinct clusters: SFRP5 + fibroblasts, MGP + fibroblasts, FAP + fibroblasts, CXCL1 + fibroblasts, and COL6A5 + fibroblasts. Figures [157]9B and C highlight the compositional differences among these clusters in the two constituent fibrocytes. Notably, fibroblasts in the CON group were predominantly MGP + fibroblasts (89.99%), with a smaller proportion of SFRP5 + fibroblasts (8.37%). In contrast, DFU patients primarily exhibited COL6A5 + fibroblasts (50.18%), FAP + fibroblasts (39.54%), and CXCL1 + fibroblasts (9.44%). Figures [158]9D and E present the stemness of fibroblast subsets, as calculated by the CytoTRACE algorithm, ranked from high to low as follows: FAP + fibroblasts, CXCL1 + fibroblasts, MGP + fibroblasts, and finally, SFRP5 + fibroblasts and COL6A5 + fibroblasts. Ultimately, as illustrated in Figs. [159]9F and G, we inferred the differentiation trajectory among fibroblast subsets as follow: (1) FAP + fibroblasts serve as the primary starting point for development; (2) CXCL1 + fibroblasts, SFRP5 + fibroblasts and MGP + fibroblasts occupy an intermediate state; (3) the terminal states are represented by COL6A5 + fibroblasts. Fig. 9. [160]Fig. 9 [161]Open in a new tab Pseudotime analysis in fibroblast subclusters. (A) Bubble plot showing the marker genes for each fibroblast subcluster. (B) UMAP plot displaying the distribution of fibroblast subclusters in the control and DFU groups. (C) Stacked bar plot showing the percentage of each fibroblast subcluster in the control and DFU groups. (D) Box plot illustrating the differentiation potential of each fibroblast subcluster. (E) UMAP plot showing the differentiation potential of fibroblast subclusters. (F) Trajectory plot displaying the pseudotime analysis results for fibroblast subclusters classified by different subgroups. (G) Trajectory plot showing the pseudotime analysis results for fibroblast subclusters classified by different differentiation times In vitro experiments and validation with external data set Initially, we employed the CCK-8 method to assess the toxicity of H[2]O[2] to L929 cells. As illustrated in Fig. [162]10A, H[2]O[2] concentrations below 25 µM exhibited no significant toxicity to the cells. Subsequently, we co-cultured L929 cells with 25 µM H[2]O[2] for 24 h, observing that at this concentration, L929 cells maintained normal activity and morphology without evident cell death (Fig. [163]10B). To simulate the in vivo cellular environment of DFU, we established a high-sugar and high-reactive oxygen species environment in the L929 cell culture medium. Utilizing the DCFH-DA fluorescent probe kit, we determined that L929 cells exhibited a heightened oxidative stress state when stimulated with 33 mM glucose and 25 µM H[2]O[2] (Fig. [164]10C and D). Furthermore, to evaluate the alterations in the migration ability of L929 cells under a high oxidative stress microenvironment, we measured the cell migration rates of the blank group, the CON group, and the DFU group. Figure [165]10E and F demonstrate that the migration rate of L929 cells in the DFU group was significantly lower than that in the CON group. Additionally, quantitative analyses revealed significant downregulation of both hub DORGs in the DFU group, with concordant reduction observed at both mRNA and protein levels (Fig. [166]10G and H). In the external dataset [167]GSE199939, two DORGs exhibited a similar trend (Fig. [168]10I), further validating the reliability of the bioinformatics analysis results. Fig. 10. [169]Fig. 10 [170]Open in a new tab Validation of bioinformatics analysis results using in vitro experiments and external dataset. (A) Line graph showing the effect of H[2]O[2] on fibroblast proliferation. (B) Live-dead fluorescence staining showing no significant effect of 25 µM hydrogen peroxide on fibroblast activity. (C-D) DCFH-DA staining showing the high ROS in L929 cells, with a bar graph demonstrating significantly higher fluorescence intensity in the DFU group compared to the control group. (E) Scratch assay showing cell migration efficiency in the blank, control, and DFU groups. (F) The bar graph illustrating a significant reduction in cell migration efficiency in the DFU group. (G) Bar graph showing significant differences in the expression levels of two hub genes between the control and DFU groups. (H) The immunoblot plots and bar graphs demonstrate the significant differences in the protein levels of the two hub genes between the control and DFU groups. (I) Violin plot displaying significant differences in the expression levels of two hub genes in the DFU external dataset [171]GSE199939. Abbreviation: DCFH-DA, 2’,7’-Dichlorofluorescin diacetate Discussion Oxidative stress is recognized as a key initiating mechanism for various pathological conditions, leading to a series of biological effects such as lipid peroxidation of cell membranes, protein oxidation, and DNA damage. These biological changes ultimately contribute to the onset of multiple diseases. Growing evidence suggests that oxidative stress plays a significant role in the pathogenesis and progression of DFU, exacerbating the severity of the condition [[172]30, [173]31]. Clinical studies have indicated that serum markers related to oxidative stress, such as the prooxidant–antioxidant balance (PAB), can serve as non-invasive biomarkers for the early diagnosis and management of DFU [[174]32]. However, the specific cells, early signaling molecules, molecular mechanisms, and intercellular communication networks influenced by oxidative stress remain unclear. In recent years, the development of bioinformatics analyses for various diseases has led to numerous studies exploring the pathophysiology and potential biomarkers of DFU by analyzing DEGs between normal and diabetic foot skin tissues [[175]33, [176]34]. Recent studies utilizing scRNA-seq have significantly advanced our understanding of the DFU microenvironment, with particular attention given to immune cells (e.g., macrophages and monocytes), microvascular endothelial cells, and keratinocytes. Emerging evidence has revealed the critical involvement of fibroblast heterogeneity in DFU pathogenesis [[177]35–[178]37]. Notably, scRNA-seq analyses have identified distinct fibroblast subpopulations characterized by elevated expression of PLA2G2A, MMP1, and CHI3L1 that preferentially accumulate at wound sites, where they actively participate in immune-inflammatory responses and extracellular matrix remodeling processes [[179]38]. Furthermore, recent investigations have demonstrated that pro-inflammatory fibroblast subsets exhibit heightened ferroptosis activity and specifically express the molecular marker CGNL1 [[180]39]. These specialized fibroblast subpopulations appear to contribute to DFU progression through unique differentiation trajectories and altered intercellular communication networks, suggesting their potential as therapeutic targets for DFU management. Despite significant advances in scRNA-seq applications in DFU research, limited research has integrated single-cell RNA data with multiple batch RNA datasets while specifically focusing on oxidative stress processes in DFU. In this study, we combined two bulk RNA sequencing datasets to identify DEGs and focused on oxidative stress-related genes. The intersecting DEGs, key module genes, and oxidative stress-related genes led to the identification of 63 key oxidative stress-related genes in DFU. While no recent literature directly addresses the therapeutic potential of thymoquinone and erlotinib in DFU, existing mechanistic studies suggest their plausible efficacy: Thymoquinone demonstrates antioxidant activity through both direct free radical scavenging and Nrf2-mediated upregulation of endogenous antioxidant enzymes [[181]40], while erlotinib’s suppression of inflammation through pro-inflammatory factor inhibition may provide synergistic therapeutic effects [[182]41]. Subsequently, we used three machine learning approaches to select two hub DORGs as potential diagnostic and prognostic biomarkers for DFU. Furthermore, single-cell RNA data was utilized to pinpoint the oxidative stress intensity in the main cell clusters involved in the disease. We then analyzed the signaling pathways, intercellular communication, and differentiation trajectories of these cell clusters in DFU. Finally, external datasets and in vitro experiments were employed to validate our bioinformatic results, confirming the reliability and reproducibility of our findings. Given the significant impact of DFU on the healthcare system, early diagnosis and the targeted inhibition of specific signaling pathways related to DFU may provide essential support and evidence for healthcare professionals. This study offers a valuable strategy for future healthcare practitioners, potentially offering novel insights into the diagnosis and therapeutic research of DFU. Early diagnosis of DFU can significantly reduce the risk of adverse clinical outcomes. However, current early diagnostic methods for DFU are ambiguous, lacking specific indicators or scoring criteria. Accurate diagnosis of DFU usually relies on comprehensive evaluation, including clinical assessment, imaging studies, and microbiological analysis [[183]42]. In this study, we identified 63 oxidative stress genes whose expression in DFU significantly differed from that in healthy controls. The hub DORGs BCL2 and FOXP2 may serve as markers to differentiate DFU patients from healthy individuals. B-cell lymphoma 2 (BCL2) is not only a gene that encodes an anti-apoptotic protein but also one that antagonizes oxidative stress [[184]43]. Xu et al. found that melatonin, by regulating BCL2 expression, inhibits testicular cell apoptosis, alleviates oxidative stress, and protects the testes from lipotoxic damage [[185]44]. In the context of bupivacaine-induced oxidative stress in nerve cells, BCL2 expression was inhibited; however, the knockout of TXNIP restored BCL2 downregulation by mitigating oxidative stress, thereby reducing bupivacaine-induced spinal cord neurotoxicity and cell apoptosis [[186]45]. BCL2 plays a crucial role in the treatment of DFU. Gynura divaricata accelerated the healing process of diabetic foot by activating Nrf2 signaling pathway, regulating BCL2 expression, alleviating oxidative stress, and promoting angiogenesis and wound healing [[187]46]. In summary, oxidative stress in diabetic foot ulcers leads to increased cell apoptosis and reduced angiogenesis, which may result from decreased BCL2 levels in the affected cells. Forkhead box protein P2 (FOXP2) has long been recognized as a crucial gene associated with language and speech abilities, contributing to brain development and various neurological functions [[188]47]. In recent years, an increasing number of studies have identified FOXP2’s involvement in the pathological processes of various diseases. The FOXP2/LAMA4 axis regulates preeclampsia by inhibiting trophoblast cell apoptosis while promoting migration, invasion, and angiogenesis [[189]48]. Overexpression of FOXP2 aberrantly activates the carcinogenic MET signaling pathway, and inhibition of this pathway effectively reverses the carcinogenic phenotype induced by FOXP2 [[190]49]. Additionally, FOXP2 promotes epithelial-to-mesenchymal transition and cell cycle arrest in renal tubular cells, thereby accelerating renal fibrosis in a mouse model of chronic kidney disease [[191]50]. However, few studies have investigated the relationship between FOXP2 and oxidative stress. Wu et al. found that bats with echolocation disorders exhibited symptoms akin to Parkinson’s disease, characterized by significantly reduced FOXP2 expression in the superior colliculus, while proteins associated with inflammation, oxidative stress, and apoptosis were markedly increased in the substantia nigra [[192]51]. This finding suggests that FOXP2 expression is suppressed during oxidative stress in neural tissues. Similarly, in this study, we observed that FOXP2 expression was diminished in the skin of DFU, where oxidative stress was more pronounced, indicating that FOXP2 may function as an antagonist to oxidative stress in DFU. Further mechanistic research is warranted to confirm this hypothesis. The specific regulatory network governing oxidative stress in DFU remains incompletely understood, with TFs and miRNAs likely playing crucial roles in this regulation. Research has indicated that levels of miR-181b are reduced in the renal arteries of diabetic patients. Transfection with miR-181b mimics has been shown to inhibit ROS production and enhance endothelial-dependent vasodilation in the aortas of diabetic mice [[193]52]. Additionally, Kumar et al. developed ginger-derived nanoparticles (GDNP) that protect the transcription factor FOXA2 from Akt-1-mediated phosphorylation, thereby restoring its function and preventing insulin resistance [[194]53]. Identifying key TFs and miRNA targets is essential for advancing gene engineering and biotechnology research related to DFU. In this study, we identified four shared miRNA targets and three TF targets based on two hub DORGs, which could serve as significant focal points for future research. Fibroblasts exhibit high plasticity and multifunctionality, playing critical roles in regulating angiogenesis, inflammation, and differentiation potential. The dysfunction of fibroblasts is a key factor in the pathological processes of DFU [[195]54]. In this study, we found that, compared to the control group, the proportion of fibroblasts in DFU skin was significantly reduced, and these fibroblasts exhibited an increased oxidative stress state. In terms of intercellular communication, fibroblasts are prominent in both the intensity and quantity of their cellular interactions. The collagen signaling network, which serves as the primary intercellular communication pathway, is predominantly regulated by fibroblasts acting as key signal transmitters. This network is crucial for the development and healing of DFU. Under normal conditions, collagen promotes wound healing, enhances tissue structure, and maintains the integrity of the skin and blood vessels. However, diabetes-induced hyperglycemia, oxidative stress, and inflammation suppress collagen synthesis and impair its normal function in wound healing [[196]55]. Based on the findings of this study, we hypothesize that in DFU patients, the elevated oxidative stress in fibroblasts and their reduced numbers lead to weakened collagen signaling, insufficient collagen synthesis, and disorganized collagen fiber arrangement, which further impair wound healing and skin strength, thereby hindering wound repair. To effectively treat DFU, it is significant to reduce oxidative stress levels in fibroblasts, sustain fibroblast activity, and improve the function of the collagen signaling pathway. Following this, we performed a detailed subgroup analysis focusing on fibroblasts. In this study, we classified fibroblasts into five distinct subtypes. MGP + fibroblasts and SFRP5 + fibroblasts emerged as the two predominant subtypes in the CON group. The differentiation potential of these two subtypes, among all five fibroblast subtypes, was assessed to be moderate, and their numbers significantly exceeded those of the other subtypes. We speculate that fibroblasts in healthy skin tissue maintain a stable skin self-renewal rate due to their homeostatic balance. At this stage, the differentiation capacity of fibroblasts is not fully realized, resulting in a moderate level of differentiation. Conversely, fibroblasts in the DFU group exhibited a disordered internal environment, leading to the differentiation into a greater variety of subtypes, including COL6A5+, FAP+, and CXCL1 + fibroblasts. The differentiation potential of fibroblasts in the DFU group was characterized as extreme and diverse. Notably, over 50% of fibroblasts in DFU group were COL6A5 + subtype, which exhibited the most limited differentiation capacity among all fibroblast subpopulations. In contrast, FAP + and CXCL1 + fibroblasts exhibited higher differentiation potential; however, their total number remains lower than that of COL6A5 + fibroblasts. The proportion of fibroblasts in the skin tissue of the DFU group significantly decreased to 20.5%, compared to 28.53% in the CON group. Numerous studies have indicated that oxidative stress is a critical factor contributing to senescence in dermal fibroblasts [[197]56, [198]57]. Integrating these findings with the results of this study, we hypothesize that the dominant fibroblasts in the DFU group tend to transition to less capable subtypes and display diminished self-renewal ability under the influence of oxidative stress, which may be a key factor in the reduction of fibroblast numbers observed in DFU. However, our study has several limitations. First, the sample size of the bulk RNA-seq datasets was small, as we only retrieved three datasets that met our criteria. Due to limited sample sizes in current datasets, our analysis may have reduced sensitivity in identifying DEGs with critical expression changes (1.5–2.0 fold).We combined two datasets for the preliminary analysis, and to ensure the rigor of our results, we conducted cell experiments and utilized a third external dataset. Second, the oxidative stress-related genes were primarily sourced from public databases. Although preliminary screening was performed, there may be genes that are not strongly correlated with the study subjects, and their functional priority in DFU-specific pathological processes still needs to be further confirmed by systematic functional genomics methods such as CRISPR screening. Third, the current study lacks assessment of gene function regulation within the native tissue microenvironment - particularly the dynamic vascular-neural-immune interactions - which would require more sophisticated humanized mouse models or multi-tissue organoid systems. These limitations will be systematically addressed in our ongoing research through: (1) expansion to a 200-patient clinical validation cohort, and (2) development of 3D multicellular coculture platforms incorporating endothelial, neuronal and immune components. Conclusion We conducted a comprehensive bioinformatics analysis to identify DEGs between healthy individuals and patients with DFU, resulting in the identification of 63 differentially expressed oxidative stress-related genes in DFU (DORGs). Additionally, we employed machine learning algorithms to pinpoint two hub DORGs, BCL2 and FOXP2, which may serve as potential diagnostic markers for the disease. In vitro experiments and external dataset were utilized to validate the expression levels of the two hub DORGs. Furthermore, our profiling of DFU oxidative stress suggests fibroblasts may play a central role in this pathological context. The identified transcriptional regulators and cell-cell communication pathways offer preliminary insights into DFU mechanisms, which may contribute to pathogenesis studies. These findings enhance our understanding of the impact of oxidative stress on the pathogenesis in DFU and may offer new avenues for exploring potential combination treatment strategies. Electronic supplementary material Below is the link to the electronic supplementary material. [199]Supplementary Material 1^ (1.3MB, rar) Author contributions Concept and design: JL, MP, LH. Technical Support: MP. Methodology and experiment: JL, LH. Acquisition, analysis or interpretation of data: WL, MP. Software and statistical analysis: JL, WL. Pictures and Tables: JL, WL. Drafting of the manuscript: JL, LH, HX. Language modification and guidance: HX, MP. All authors approved the final version of the manuscript for submission. Funding This research was funded by Social science and technology development foundation of Fengxian District, Shanghai (No. 20241001) and Science and Technology Commission Foundation of Shanghai (No. 24SF1903004). Data availability The public datasets downloaded and analyzed in this study can be found in the GEO data repository ([200]https://www.ncbi.nlm.nih.gov/geo/), and included the following accession numbers as follows: [201]GSE80178 [[202]58], [203]GSE143735 [[204]59], [205]GSE143735 [[206]36], and [207]GSE165816 [[208]38]. Declarations Competing interests The authors declare no competing interests. Footnotes Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Jialiang Lin and Linjuan Huang contributed equally to this work. Contributor Information Haijun Xiao, Email: xiaohaijun89@163.com. Mingmang Pan, Email: lighting2013@foxmail.com. References