Abstract

Background

Ulcerative colitis (UC) is a chronic inflammatory disease of the colonic mucosa with increasing incidence worldwide. Growing evidence highlights the pivotal role of nicotinamide adenine dinucleotide (NAD+) metabolism in UC pathogenesis, prompting our investigation into the subtype-specific molecular underpinnings and diagnostic potential of NAD+ metabolism-related genes (NMRGs).

Methods

Transcriptome data from UC patients and healthy controls were downloaded from the GEO database, specifically GSE75214 and GSE87466. We performed unsupervised clustering based on differentially expressed NAD+ metabolism-related genes (DE-NMRGs) to classify UC cases into distinct subtypes. GSEA and GSVA identified potential biological pathways active within these subtypes, while the CIBERSORT algorithm assessed differential immune cell infiltration. Weighted gene co-expression network analysis (WGCNA) combined with differential gene expression analysis was used to pinpoint specific NMRGs in UC. Robust gene features for subtyping and diagnosis were selected using two machine learning algorithms. Nomograms were constructed and their effectiveness was evaluated using receiver operating characteristic (ROC) curves. Reverse transcription quantitative polymerase chain reaction (RT-qPCR) was conducted to verify gene expression in cell lines.

Results

UC patients were classified into two subtypes based on DE-NMRG expression levels, with Cluster A exhibiting enhanced self-repair capabilities during inflammatory responses and Cluster B showing greater inflammation and tissue damage. Through comprehensive bioinformatics analyses, we identified four key biomarkers (AOX1, NAMPT, NNMT, PTGS2) for UC subtyping, and two (NNMT, PARP9) for its diagnosis. These biomarkers are closely linked to various immune cells within the UC microenvironment, particularly NAMPT and PTGS2, which were strongly associated with neutrophil infiltration.
Nomograms developed for subtyping and diagnosis demonstrated high predictive accuracy, achieving area under the curve (AUC) values up to 0.989 and 0.997 in the training set and up to 0.998 and 0.988 in validation sets. RT-qPCR validation showed a significant upregulation of NNMT and PARP9 in inflamed versus normal colonic epithelia, underscoring their diagnostic relevance.

Conclusion

Our study reveals two NAD+ subtypes in UC, identifying four biomarkers for subtyping and two for diagnosis. These findings suggest potential therapeutic targets and may contribute to advancing personalized treatment strategies for UC, potentially improving patient outcomes.

Keywords: ulcerative colitis, NAD+ metabolism, bioinformatics, machine learning, immune cell infiltration, subtype, diagnosis

1. Introduction

UC is a chronic inflammatory bowel disease (IBD) that primarily affects the colonic mucosa, beginning in the rectum and potentially extending to the entire colon. It is clinically characterized by recurrent episodes of bloody diarrhea and abdominal pain. Globally, the incidence of UC is increasing, with an estimated five million people affected as of 2023 (1). As a chronic disease, UC significantly affects the quality of life of patients, necessitating continuous medical care and potentially leading to severe complications, including colorectal cancer (2). Despite advances in treatment, including immunosuppressants and biologics, 10%–20% of patients suffer from recurrent and treatment-resistant symptoms, with some requiring colectomy (3). The complex interplay of environmental triggers, genetic predispositions, and immune dysregulation complicates the etiology of UC (4). At the molecular level, UC is characterized by the activation of immune cells, including T cells, macrophages, and dendritic cells, which infiltrate the colonic mucosa (5).
These immune cells release pro-inflammatory cytokines, such as TNF-α, IL-1β, and IL-6, contributing to the chronic inflammation observed in UC (6). Additionally, the intestinal epithelial barrier is compromised, allowing microbial products to trigger further immune activation and inflammatory responses (7). The dysregulation of key signaling pathways, such as NF-κB and JAK/STAT, plays a crucial role in sustaining this inflammatory environment (8, 9). Furthermore, genetic factors, including mutations in immune-related genes like NOD2 and IL-23R, have been associated with an increased risk of UC (10, 11). Together, these observations highlight the need for further research to better understand UC pathogenesis and to develop more effective diagnostic and therapeutic options.

NAD+ is essential to cellular metabolism, where it plays a critical role in oxidative reactions and energy production. Beyond its metabolic functions, NAD+ helps maintain cellular health through its involvement in DNA repair, signal transduction, and cell death regulation (12). Recent studies have emphasized its importance in regulating inflammation and immune responses, which are crucial in the pathophysiology of various chronic diseases, including autoimmune and inflammatory conditions (13–15). In UC, disturbances in NAD+ metabolism are associated with the exacerbated inflammatory environment characteristic of the disease (16). Despite this, the specific impact of NAD+ metabolism on UC remains unclear (17). Investigating NAD+ metabolism in UC may therefore lead to new therapeutic strategies that alleviate inflammation and promote mucosal healing, providing new directions for research and treatment of the disease.

The genes identified in our study (AOX1, NAMPT, NNMT, PTGS2, and PARP9) play crucial roles in inflammation and immune cell metabolism.
AOX1 is involved in the regulation of oxidative stress, a key factor in inflammatory responses (18). NAMPT plays a pivotal role in NAD+ biosynthesis, thus influencing immune cell energy metabolism and inflammatory cytokine production (19–23). NNMT has been linked to regulating methylation processes in immune cells, affecting their activation and function in inflammatory environments (24–27). PTGS2 (COX-2) is a well-known enzyme involved in the production of prostaglandins, which mediate inflammation and immune responses (28, 29). Finally, PARP9 is implicated in DNA repair and the regulation of immune cell survival, particularly in response to inflammatory stimuli (30–32). These genes contribute to the modulation of immune cell infiltration and inflammation in UC, underscoring their potential as biomarkers for both disease subtyping and diagnosis.

In this study, we collected UC samples from public databases and employed unsupervised clustering to categorize them into two NAD+ metabolism-related subtypes, clusters A and B. Our analyses revealed that these subtypes exhibited varying responses to inflammation: cluster A exhibited improved self-repair capabilities, whereas cluster B was prone to more severe inflammation and tissue injury. Subsequently, we identified key biomarkers related to NAD+ metabolism using differential analysis, WGCNA, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. These biomarkers exhibited high predictive accuracy for UC subtyping and diagnosis. RT-qPCR validation of these findings offers potential new strategies and scientific bases for UC diagnosis and personalized treatment.

2. Materials and methods

2.1. Data acquisition and processing

The UC datasets were sourced from the Gene Expression Omnibus (GEO) database, as presented in the flowchart in Figure 1.
The training sets utilized GSE75214 and GSE87466, comprising 161 UC samples and 32 normal tissue samples (33, 34). GSE75214 contained a total of 97 UC samples, of which 74 were active UC samples and 23 were inactive UC samples. To ensure the accuracy of the analysis, we excluded the 23 inactive UC samples, so the final dataset used for analysis included 74 active UC patients from this series. The SOFT files from these datasets were imported using the GEOquery package in R software (version 4.3.1). To merge gene expression data from multiple datasets, we first normalized the data and mapped the probe IDs to gene symbols. When multiple probes corresponded to a single gene, the avereps function from the limma package was used to compute average expression values. Subsequently, datasets were merged, and batch effects across datasets were adjusted using the ComBat function from the sva package. ComBat adjusts for systematic technical variation between datasets while retaining biological signal. Principal component analysis (PCA) was used to verify the effectiveness of the batch effect adjustment, which demonstrated successful removal of batch effects while maintaining the biological structure of the data. The external validation sets included GSE92415, GSE206285, and GSE66407 (35–38), which underwent the same preprocessing methods as the training set. Supplementary Table 1 presents detailed information on all datasets.

Figure 1. Flowchart of the research.

2.2. Acquisition of NMRGs

NMRGs were curated from multiple databases, including the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (hsa00760), Reactome database (R-HSA-196807), and GeneCards database (NAD+ Metabolism Pathway). After removing duplicates, 54 NMRGs were identified. An intersection with all genes in the training set was performed, resulting in 47 NMRGs selected for subsequent analysis.
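The curation step just described (pool several gene lists, drop duplicates, keep only genes measured in the training set) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the function name `curate_gene_set` and the toy gene lists are hypothetical and do not reflect the actual contents of KEGG, Reactome, or GeneCards.

```python
def curate_gene_set(sources, measured_genes):
    """Pool gene lists from several sources, then keep the measured ones.

    sources: iterable of gene-symbol lists (one per database).
    measured_genes: gene symbols present in the expression matrix.
    Returns (all curated symbols, curated symbols that are measured).
    """
    curated = set()
    for genes in sources:
        # A set drops duplicates that appear in more than one source.
        curated.update(symbol.upper() for symbol in genes)
    measured = {symbol.upper() for symbol in measured_genes}
    return sorted(curated), sorted(curated & measured)


# Hypothetical toy lists, for illustration only.
kegg = ["NAMPT", "NNMT", "AOX1"]
reactome = ["NAMPT", "PARP9", "CD38"]
genecards = ["NNMT", "PTGS2"]
training_genes = ["NAMPT", "NNMT", "PARP9", "PTGS2", "TP53"]

all_nmrgs, usable_nmrgs = curate_gene_set(
    [kegg, reactome, genecards], training_genes
)
print(len(all_nmrgs), len(usable_nmrgs))
```

In the study the same two counts were 54 curated NMRGs and 47 NMRGs present in the training set.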
Supplementary Table 2 presents detailed information on these 47 NMRGs.

2.3. Identification and enrichment analysis of differentially expressed genes between UC and normal

Differential expression analysis between UC and normal samples in the training set was conducted using the limma package (39), with significant differential expression defined by |log2 fold change (FC)| > 0.5 and false discovery rate (FDR) < 0.05. The resulting differentially expressed genes (DEGs) were visualized as volcano plots and heatmaps using the ggplot2 package. Enrichment analysis of these DEGs for Gene Ontology (GO) and KEGG pathways was performed using the clusterProfiler package, with gene annotation facilitated by the org.db package. Pathways with both p-values and q-values < 0.05 were considered significantly enriched.

2.4. Identification of NAD+ subtypes by DE-NMRGs

To focus on NAD+ metabolism-related genes, we intersected the DEGs between UC and normal with the list of NMRGs. This intersection yielded the DE-NMRGs, which were used for further analysis. We then performed unsupervised clustering on the UC samples using the ConsensusClusterPlus package based on the expression levels of these DE-NMRGs to identify subtypes of UC (40). The optimal number of clusters was determined by evaluating the cumulative distribution function, consistency clustering scores, and consensus clustering plots. Additionally, PCA was used to differentiate between the NAD+ subtypes. The boxplot and heatmap of these DE-NMRGs between subtypes were generated using the ggpubr and pheatmap packages, respectively.

2.5. Identification and enrichment analysis of DEGs between NAD+ subtypes

Differential expression analysis between NAD+ subtypes was conducted using the limma package, setting thresholds of |log2FC| > 0.5 and an adjusted p-value (FDR) < 0.05 to identify significant DEGs between NAD+ subtypes.
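The DEG criterion used in Sections 2.3 and 2.5 combines a Benjamini–Hochberg adjustment with a fold-change cutoff. The sketch below shows that filtering logic in Python under stated assumptions: the per-gene p-values and log2 fold changes are taken as given (in the study they come from limma's moderated statistics, which are not reproduced here), and the function names and toy values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2fc, p_values, lfc_cut=0.5, fdr_cut=0.05):
    """Keep genes with |log2FC| > lfc_cut and BH-adjusted p < fdr_cut."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log2fc, fdr)
        if abs(lfc) > lfc_cut and q < fdr_cut
    ]


# Hypothetical toy input: four genes with made-up statistics.
genes = ["g1", "g2", "g3", "g4"]
log2fc = [1.2, -0.8, 0.3, 0.9]
p_values = [0.001, 0.01, 0.02, 0.5]

fdr = benjamini_hochberg(p_values)
degs = select_degs(genes, log2fc, p_values)
print(degs)
```

Here g3 is dropped for its small fold change and g4 for its large adjusted p-value, so a gene has to pass both filters to count as a DEG.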
Gene set enrichment analysis (GSEA) was performed with the clusterProfiler package (41, 42), and results were visualized using the enrichplot package. Gene set variation analysis (GSVA) was performed using the GSVA and GSEABase packages (43), with visualization facilitated by the pheatmap package. All gene sets were sourced from the molecular signature database (MSigDB). Gene sets with an FDR < 0.01 were considered statistically significant.

2.6. Immune cell infiltration analysis between NAD+ subtypes

The CIBERSORT package was used to analyze the abundance of 22 types of infiltrating immune cells in all samples (44), and the results were visualized using the ggplot2 package. Interactions among immune cells were examined using the corrplot package to further analyze the impact of immune cells in UC. Comparisons of relative immune cell abundances between normal samples and different NAD+ subtypes were visualized using the ggpubr package. Statistical comparisons were performed using the Wilcoxon rank-sum test, with P < 0.05 considered statistically significant.

2.7. Construction of co-expression networks in UC based on WGCNA

Co-expression networks were constructed from UC samples using the WGCNA package (45). Samples were subjected to hierarchical clustering to identify and remove outliers. The optimal soft-thresholding power was determined based on the scale-free topology fit index (R² > 0.85). The gene expression matrix was transformed into a weighted adjacency matrix, which was subsequently converted into a topological overlap matrix (TOM). The TOM facilitated module detection via hierarchical clustering of the gene dendrogram, employing the dynamic tree-cutting method to identify modules and compute the module eigengenes, representing the principal components of gene expression profiles within each module.
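The two matrix transformations named above can be illustrated with a small Python sketch. It assumes an unsigned network, in which the adjacency is the absolute Pearson correlation raised to the soft-thresholding power; the text does not state which network type was used, so this is an assumption. The function names and toy profiles are hypothetical, and the power of 10 is the value reported later in the Results for the subtype network.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, power):
    """Unsigned weighted adjacency: |cor|**power, zero on the diagonal."""
    g = len(profiles)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i != j:
                adj[i][j] = abs(pearson(profiles[i], profiles[j])) ** power
    return adj


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu*a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity (diagonal is zero)
    tom = [[1.0 if i == j else 0.0 for j in range(g)] for i in range(g)]
    for i in range(g):
        for j in range(g):
            if i != j:
                shared = sum(
                    adj[i][u] * adj[u][j] for u in range(g) if u not in (i, j)
                )
                tom[i][j] = (shared + adj[i][j]) / (
                    min(k[i], k[j]) + 1 - adj[i][j]
                )
    return tom


# Hypothetical toy profiles: genes 0-2 are perfectly (anti)correlated,
# gene 3 is uncorrelated with gene 0.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, -1.0, -1.0, 1.0],
]
adj = soft_threshold_adjacency(profiles, power=10)
tom = topological_overlap(adj)
```

Raising correlations to a high power suppresses weak links (a correlation of 0.5 becomes about 0.001 at power 10), and the TOM then rewards gene pairs that also share neighbours. Hierarchical clustering on 1 − TOM, the usual dissimilarity for module detection, is not shown.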
Correlations between each subtype and the eigengene of each module, together with the corresponding p-values, were calculated to quantify the association of each module with the different NAD+ subtypes. Gene significance scores within each module were computed to reflect their relative importance to different NAD+ subtypes.

2.8. Identification of key genes between NAD+ subtypes and construction of a classification model through machine learning

To identify key genes for differentiating NAD+ subtypes, we first intersected the NMRGs, the DEGs between NAD+ subtypes, and the genes from WGCNA modules, yielding several candidate NMRGs. Subsequently, we applied two machine learning techniques, LASSO and RF, to further select among these NMRGs. LASSO regression, performed using the glmnet package in R, employs regularization to aid in feature selection, aiming to enhance the predictive accuracy and interpretability of the model. Additionally, RF analysis, conducted using the randomForest package in R, was selected for its high accuracy, sensitivity, and specificity, making it particularly suitable for handling biological data with complex interactions. The genes identified by both algorithms were considered hub genes capable of effectively distinguishing between different NAD+ subtypes of UC. Scatter plots were generated using the regplot package to evaluate the typing efficacy of these genes, and ROC analysis was conducted with the pROC package to further validate the predictive performance of the model.

2.9. Construction of a diagnostic model for UC based on NMRGs

Similarly, we intersected the NMRGs, the DEGs between UC and normal, and the genes from WGCNA modules to identify several candidate NMRGs. These NMRGs were further refined using LASSO regression and RF analyses, with the genes identified by both algorithms taken as hub genes capable of diagnosing UC.
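To make concrete how LASSO regularization performs feature selection, the following Python sketch fits an L1-penalised linear regression by coordinate descent and reports which coefficients remain non-zero. It is a simplified illustration rather than the glmnet procedure used in the study: the outcome is treated as continuous instead of as a class label, the penalty `lam` is fixed by hand instead of being chosen by cross-validation, and the gene names and data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    X: list of rows (samples) with centred columns; y: centred response.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho, norm = 0.0, 0.0
            for i in range(n):
                # Prediction from every feature except j.
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta


# Hypothetical toy data: the response depends strongly on gene_a,
# weakly on gene_b, and not at all on gene_c.
gene_names = ["gene_a", "gene_b", "gene_c"]
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [2.1, -1.9, 1.9, -2.1]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [name for name, b in zip(gene_names, beta) if b != 0.0]
print(selected)
```

The penalty shrinks the strong coefficient and sets the weak ones exactly to zero, which is why genes with non-zero coefficients can be read off as the selected features.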
The co-expression network for this diagnostic model was likewise constructed with the WGCNA package, but the scale-free topology fit index (R²) threshold was set at 0.80 to meet the specific needs of the diagnostic model. The constructed diagnostic model was validated using scatter plots and ROC analysis to assess its efficacy in UC diagnosis.

2.10. Cell culture and RT-qPCR

To test experimentally the gene expression changes identified in the diagnostic model, we used the normal human colonic epithelial cell line NCM460. The cells were divided into control and LPS-treated (10 μg/mL) groups. LPS stimulation was chosen to model the inflammatory environment characteristic of UC. LPS, a bacterial endotoxin, activates immune responses through Toll-like receptor 4 (TLR4), which plays a central role in UC pathogenesis by triggering inflammation and disrupting the epithelial barrier. LPS-induced inflammation in NCM460 cells mimics the epithelial cell response to microbial stimuli in UC (46, 47). Cells were cultured in RPMI 1640 medium supplemented with 10% fetal bovine serum at 37°C in a 5% CO2 atmosphere. After reaching the logarithmic growth phase, the treatment group cells were exposed to LPS for 24 h. RNA extraction was performed using TaKaRa's RNAiso Plus (Trizol), and reverse transcription was conducted using TOYOBO's ReverTra Ace® qPCR RT Master Mix. Fluorescent quantitative PCR analysis was performed on the ABI 7900HT FAST system using Thermo's Power SYBR Green PCR Master Mix. Experimental data were analyzed using GraphPad Prism 9.5.0 software, with statistical significance assessed using an unpaired t-test, and P < 0.05 considered statistically significant.

2.11. Statistical analysis

R software (version 4.3.1) was used for data analysis.
The statistical significance of normally distributed continuous variables was assessed using independent Student's t-tests, while differences in non-normally distributed continuous variables were evaluated using the Wilcoxon rank-sum test. For multiple comparisons, the Benjamini–Hochberg method was applied to adjust the p-values and control the FDR. This method ranks the p-values from all tests and adjusts them based on their rank relative to the total number of tests, ensuring that the FDR is controlled. Furthermore, ROC analysis was used to evaluate the typing and diagnostic biomarkers and models. Spearman correlation analysis was employed to examine the relationships between infiltrating immune cells and gene biomarkers. All statistical tests were two-tailed, with significance levels set at P < 0.05. Significance was indicated as follows: "ns" denotes P > 0.05, "*" denotes P < 0.05, "**" denotes P < 0.01, and "***" denotes P < 0.001.

3. Results

3.1. Identification of DEGs in UC

We selected and downloaded two human UC microarray datasets (GSE75214 and GSE87466) from the GEO online database. After careful screening, the study included 161 patients with UC and 32 control participants. Specifically, GSE75214 included 74 UC tissues in an active disease state and 11 normal colonic tissues, while GSE87466 included 87 UC tissues and 21 normal colonic tissues. After removing batch effects, the two datasets were merged into a UC training set, resulting in 17,348 genes. Samples from these independent datasets exhibited distinct clustering before batch effect removal (Figure 2A) but clustered together post-removal (Figure 2B). Subsequently, using the limma package in R with a threshold of FDR < 0.05 and |log2FC| > 0.5, we identified 2,935 DEGs, including 1,738 upregulated and 1,197 downregulated genes (Figures 2C, D).

Figure 2. Identification and enrichment analysis of DEGs.
(A, B) Two datasets (GSE75214, GSE87466) were combined into one dataset after removing batch effects. Sample relationships before and after batch effect removal. (C, D) Volcano plot and heatmap showing DEGs between UC samples and normal samples. (E) GO enrichment analysis results of DEGs. (F) KEGG enrichment analysis results of DEGs.

3.2. Functional and pathway enrichment analysis of DEGs

We conducted GO and KEGG enrichment analyses on the 2,935 DEGs to gain deeper insights into their biological functions. The GO enrichment analysis revealed that DEGs were significantly enriched in multiple biological processes, cellular components, and molecular functions. Specifically, the enriched biological processes included mononuclear cell differentiation, positive regulation of cytokine production, and lymphocyte differentiation; the cellular components primarily involved the collagen-containing extracellular matrix and the external side of plasma membrane; and the molecular functions included active transmembrane transporter activity and actin binding (Figure 2E). In the KEGG enrichment analysis, genes were predominantly enriched in pathways including the mitogen-activated protein kinase (MAPK) signaling pathway, endocytosis, and chemokine signaling pathways (Figure 2F).

3.3. Identification of two UC subtypes based on DE-NMRGs

We intersected the identified DEGs with NMRGs to elucidate the NAD+ subtypes in UC and identified 14 NMRGs as DEGs between UC and normal samples (DE-NMRGs) (Figure 3A). Based on the expression profiles of these 14 DE-NMRGs, we performed consensus unsupervised clustering analysis on the 161 UC samples in the training set. The results indicated that, at k = 2, the patients with UC clustered into two subgroups with good internal consistency and stability (Figures 3B–D).
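Consensus clustering judges stability by how consistently pairs of samples are grouped together across repeated subsampling runs. The Python sketch below computes only that consensus value for each pair, under stated assumptions: the clustering of each subsample is taken as given, and the function name `consensus_matrix` and the toy runs are hypothetical and are not the ConsensusClusterPlus interface.

```python
from itertools import combinations


def consensus_matrix(samples, runs):
    """Fraction of runs in which two samples share a cluster.

    samples: list of sample identifiers.
    runs: list of dicts mapping the samples drawn in that run to a
          cluster label; samples not drawn in a run are absent.
    Only runs that contain both samples count towards a pair.
    """
    together = {pair: 0 for pair in combinations(samples, 2)}
    co_sampled = {pair: 0 for pair in combinations(samples, 2)}
    for labels in runs:
        for a, b in combinations(samples, 2):
            if a in labels and b in labels:
                co_sampled[(a, b)] += 1
                if labels[a] == labels[b]:
                    together[(a, b)] += 1
    return {
        pair: together[pair] / co_sampled[pair]
        for pair in co_sampled
        if co_sampled[pair] > 0
    }


# Hypothetical toy runs: each run clusters a subsample of three samples.
# Cluster labels are arbitrary within a run; only co-membership matters.
samples = ["s1", "s2", "s3", "s4"]
runs = [
    {"s1": 0, "s2": 0, "s3": 1},
    {"s1": 1, "s2": 1, "s4": 0},
    {"s2": 0, "s3": 1, "s4": 1},
    {"s1": 0, "s3": 1, "s4": 1},
]
consensus = consensus_matrix(samples, runs)
print(consensus[("s1", "s2")], consensus[("s1", "s3")])
```

Values close to 0 or 1 for every pair indicate a stable partition, which is the pattern a clean consensus matrix heatmap displays at the chosen k.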
Combined with the results from the consensus matrix heatmap (Figure 3E), we categorized the 161 UC samples into two subtypes: cluster A (n = 96) and cluster B (n = 65). PCA further confirmed the clear separation between these two subtypes (Figure 3F). The box plot and heatmap display the differential gene expression patterns of DE-NMRGs between subtypes (Figures 3G, H).

Figure 3. Identification of two NAD+ subtypes in UC. (A) Venn diagram showing the overlapping genes between DEGs and NMRGs. (B) Consensus cumulative distribution function (CDF) plot showing the area under the curve for k = 2–9. (C) Relative change in the area under the CDF curve. (D) Tracking plot showing the sample subtypes for different values of k. (E) Consensus matrix heatmap for k = 2. (F) PCA plot showing the distribution of the two subtypes. (G, H) Boxplot (G) and heatmap (H) displaying the differential expression of DE-NMRGs between the two NAD+ subtypes. * p < 0.05; *** p < 0.001.

3.4. Functional enrichment analysis between NAD+ subtypes

We conducted GSEA and GSVA enrichment analyses to explore the biological and behavioral differences between the two subtypes. In the GSEA analysis, the subtypes exhibited some distinctions: pathways including drug metabolism (other enzymes) and pentose and glucuronate interconversions were significantly enriched in subtype A (Figure 4A), while pathways such as the chemokine signaling pathway and complement and coagulation cascades were significantly enriched in subtype B (Figure 4B). Furthermore, we performed GSVA analysis to assess the differences in pathway activities and biological functions between the two subtypes.
The results indicated that pathways including maturity-onset diabetes of the young and ascorbate and aldarate metabolism were upregulated in subtype A, whereas glycosaminoglycan biosynthesis (chondroitin sulfate) and primary immunodeficiency were upregulated in subtype B (Figure 4C). Additionally, based on the Reactome pathways, subtype A was primarily involved in sulfide oxidation to sulfate and beta-oxidation of butanoyl-CoA to acetyl-CoA, while subtype B was primarily involved in interleukin-10 signaling and CD22-mediated BCR regulation (Figure 4D).

Figure 4. Pathway enrichment analysis reveals distinct biological behaviors of NAD+ subtypes in UC. (A, B) GSEA highlights pathways significantly enriched in subtypes A and B. (C, D) GSVA results. (C) Enriched pathways based on KEGG pathways. (D) Enriched pathways based on Reactome pathways.

3.5. Assessment of immune cell infiltration between NAD+ subtypes

To explore the potential molecular mechanisms through which the molecular subtypes influence UC progression, we used CIBERSORT to estimate the infiltration proportions of different immune cells in UC samples and examined their relationship with the NAD+ subtypes. Compared with subtype A, subtype B exhibited significantly lower abundance of M2 macrophages and resting mast cells, and significantly higher abundance of M0 macrophages, M1 macrophages, activated mast cells, and neutrophils. Additionally, compared to normal tissue, UC exhibited higher abundance of activated memory CD4+ T cells, follicular helper T cells, M0 macrophages, M1 macrophages, activated mast cells, and neutrophils, while CD8+ T cells, resting memory CD4+ T cells, M2 macrophages, and resting mast cells were less abundant in UC (Figures 5A, B).
Consequently, the immune cell infiltration pattern of subtype A appears to be intermediate between subtype B and normal tissue. Furthermore, a negative correlation was observed between neutrophils and M2 macrophages (r = –0.51), and positive correlations were observed between neutrophils and activated mast cells (r = 0.64) and between resting mast cells and M2 macrophages (r = 0.54) (Figure 5C).

Figure 5. Immune cell infiltration profiles related to NAD+ subtypes in UC. (A) Heatmap showing the relative abundance of 22 immune cell types in different NAD+ subtype samples. (B) Boxplot visualizing the distribution and variability of immune cell relative abundance in NAD+ subtypes. (C) Correlation matrix describing the interactions between different immune cells. ns, not significant; * p < 0.05; ** p < 0.01; *** p < 0.001.

3.6. Differential analysis and WGCNA analysis between NAD+ subtypes

We conducted a differential analysis between the two subtypes and identified 1,597 DEGs. Subsequently, based on the entire gene expression profile, we applied the WGCNA algorithm to construct a co-expression network and identify the key modules most correlated with the NAD+ subtypes. We used Pearson correlation coefficients to cluster the samples, and after removing outliers, we plotted a sample clustering dendrogram (Figure 6A). The optimal soft-thresholding power was set to 10 to maintain a scale-free topology and high connectivity (Figure 6B). Using hierarchical clustering, the clustering tree was divided and merged into six modules with different colors (Figures 6C, D). Among these modules, the black module (containing 2,386 genes) exhibited the highest correlation with subtype B (R = 0.72), and the blue module (containing 1,852 genes) was most correlated with subtype A (R = 0.68) (Figure 6E).
Additionally, module membership and gene significance were strongly correlated in the black module (cor = 0.87) (Figure 6F) and in the blue module (cor = 0.82) (Figure 6G). Therefore, the black and blue modules were selected for further analysis.

Figure 6. WGCNA between NAD+ subtypes. (A) Sample dendrogram generated after clustering using Pearson correlation coefficients and removal of outliers. (B) Determination of the soft-thresholding power in WGCNA. (C) Dendrogram of all DEGs between subtypes, clustered based on differential measurements, dividing genes into six different modules, each representing a co-expressed gene cluster. (D) Bar graph illustrating the significance measurements of the identified gene modules. (E) Heatmap of the UC module feature genes and their correlations with different NAD+ subtypes. (F, G) Scatter plots demonstrating the relationship between module membership and gene significance within the black and blue modules.

3.7. Construction and validation of an NAD+ related typing model

We constructed an NAD+-related predictive model to further clarify the role of NAD+ genes in the heterogeneity of patients with UC. Initially, we intersected the 1,597 DEGs between NAD+ subtypes, key module genes identified using the WGCNA algorithm, and all NMRGs, yielding eight intersecting genes (Figure 7A). Subsequently, we entered the expression profiles of these eight intersecting genes into a LASSO regression analysis, determined the optimal value of λ, and selected seven potential key genes with non-zero coefficients in the training set (Figures 7B, C). Additionally, we implemented the RF algorithm in the training set and identified four effective predictive factors (Figures 7D, E). We identified four hub genes by merging the genes selected by these machine learning algorithms (Figure 7F): AOX1, NAMPT, NNMT, and PTGS2.
We constructed a nomogram in the training set (Figure 7G) using these four hub genes and established ROC curves to evaluate the classification performance of each gene and the nomogram (Figures 7H, I). The AUCs for these hub genes were as follows: AOX1 (AUC = 0.914), NAMPT (AUC = 0.891), NNMT (AUC = 0.939), and PTGS2 (AUC = 0.932), with the nomogram achieving an AUC of 0.989. These results indicate the accuracy of this model in predicting the NAD+ subtypes of UC.

Figure 7. Construction of an NAD+ related typing model. (A) Venn diagram showing the intersection of DEGs between subtypes, key module genes identified by the WGCNA algorithm, and NMRGs, resulting in eight intersecting genes. (B, C) Feature gene selection using LASSO regression. (D, E) Feature gene selection using RF algorithms. (F) Venn diagram displaying four candidate hub genes identified by the aforementioned machine learning algorithms as the core of the predictive model. (G) Nomogram of the NAD+ related typing model in the training set. (H, I) The ROC curves of the four hub genes (AOX1, NAMPT, NNMT, and PTGS2) and the nomogram in the training set.

We conducted further external validation of these hub genes. We accessed a UC dataset from the GEO online database, GSE206285, and constructed a nomogram in this validation set based on these four hub genes (Figure 8A). We established ROC curves to evaluate the classification performance of each gene and the nomogram. In the validation set, the AUCs for these hub genes were as follows: AOX1 (AUC = 0.825), NAMPT (AUC = 0.965), NNMT (AUC = 0.907), and PTGS2 (AUC = 0.988), with the nomogram achieving an AUC of 0.998. These results demonstrate the accuracy of these four key genes in predicting the NAD+ subtypes of UC (Figures 8B, C).
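For readers less familiar with the metric, an AUC equals the probability that a randomly chosen sample from the positive group receives a higher score than a randomly chosen sample from the negative group, with ties counted as one half. The Python sketch below computes it directly from that definition; it is an illustration with hypothetical scores and labels, not the pROC routine used in the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: predicted scores, higher meaning more likely positive.
    labels: 1 for the positive group, 0 for the negative group.
    Ties between a positive and a negative score count as half.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy scores for three positive and three negative samples.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
print(round(auc, 3))
```

Eight of the nine positive-negative pairs are ordered correctly here, so the AUC is 8/9. An AUC of 0.998, as reported for the validation nomogram, means almost every such pair is ordered correctly.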
Additionally, we evaluated the relationship between these four hub genes and immune infiltration by conducting Spearman's correlation analyses between gene expression levels and immune cell content. The results indicated that all four key genes were strongly correlated with immune cell content (Figure 8D), and a strong correlation was observed among these four hub genes (Figure 8E).

Figure 8. Validation of the NAD+ related typing model. (A) Nomogram of NAD+ related typing model in the validation set. (B, C) ROC curves for the four hub genes (AOX1, NAMPT, NNMT, and PTGS2) and the nomogram in the validation set. (D) Heatmap of the Spearman correlation coefficients between the expression of the four hub genes and the content of various immune cells. *p < 0.05; **p < 0.01; ***p < 0.001. (E) Network diagram illustrating the interrelationships among the four hub genes.

3.8. Construction and validation of an NAD+ related diagnostic model in UC

In addition to the typing model, we established a diagnostic model based on NAD+ metabolism genes to further clarify the role of NAD+ genes in predicting UC. Initially, we used the WGCNA algorithm on the entire gene expression profile to construct a co-expression network and identify the key modules most correlated with UC and normal samples, selecting the brown module (R = 0.65) for further analysis (Figure 9A, Supplementary Figure 1). We intersected the DEGs identified between UC and normal tissues, the WGCNA brown module genes, and the NMRGs, obtaining seven intersecting genes (AOX1, CD38, NAMPT, NNMT, PTGS2, PARP14, and PARP9) (Figure 9B). Subsequently, we further filtered these genes using LASSO regression (Figures 9C, D) and RF (Figures 9E, F), merging the genes selected by the machine learning algorithms and identifying two hub genes (Figure 9G).
A nomogram was constructed based on these two hub genes in the training set (Figure 9H), and ROC curves were plotted to evaluate the classification performance of each gene and of the nomogram (Figures 10A, B). The AUCs for the hub genes were NNMT (AUC = 0.976) and PARP9 (AUC = 0.976), with the nomogram achieving an AUC of 0.993. These results indicate the accuracy of this model in predicting UC.
Figure 9. Construction of an NAD+ related diagnostic model in UC. (A) Heatmap of the UC module feature genes and their correlations with UC and normal samples in WGCNA. (B) Venn diagram showing the intersection of DEGs between UC and normal samples, key module genes identified by the WGCNA algorithm, and NMRGs, resulting in seven intersecting genes. (C, D) Feature gene selection using LASSO regression. (E, F) Feature gene selection using the RF algorithm. (G) Venn diagram displaying two candidate hub genes identified by the aforementioned machine learning algorithms as the core of the predictive model. (H) Nomogram of the NAD+ related diagnostic model in the training set.
Figure 10. Validation of the NAD+ related diagnostic model in UC. (A, B) ROC curves for the two hub genes (NNMT and PARP9) and the nomogram in the training set. (C–H) ROC curves for the two hub genes (NNMT and PARP9) and the nomograms in the validation sets. (I, J) RT-qPCR results for the two hub genes (NNMT and PARP9) in the NAD+ related diagnostic model. **p < 0.01; ***p < 0.001.
We conducted further external validation of the model using three UC datasets from the GEO database (GSE206285, GSE92415, and GSE66407). Nomograms were constructed based on the two hub genes in the validation sets (Supplementary Figure 2), and ROC curves were plotted to evaluate the classification performance of each gene and of the nomogram (Figures 10C–H).
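As a rough illustration of how a two-gene nomogram of this kind can be read, the sketch below converts a logistic regression model into nomogram points and a predicted probability. It assumes the nomogram is backed by a logistic regression model, which this section does not state explicitly. The intercept, coefficients, and expression ranges are hypothetical placeholders chosen for the example and are not estimates from this study.

```python
import math

# Hypothetical model parameters (NOT from the study).
INTERCEPT = -17.0
MODEL = {
    "NNMT": {"beta": 1.5, "min": 4.0, "max": 10.0},
    "PARP9": {"beta": 1.0, "min": 5.0, "max": 9.0},
}


def points_per_unit(model):
    """Scale factor so the predictor with the largest effect range spans 0-100 points."""
    max_effect = max(abs(v["beta"]) * (v["max"] - v["min"]) for v in model.values())
    return 100.0 / max_effect


def nomogram_points(model, sample):
    """Points per predictor, measured from the end of its range that scores zero."""
    scale = points_per_unit(model)
    points = {}
    for gene, v in model.items():
        # A positive coefficient scores zero at the minimum, a negative one at the maximum.
        reference = v["min"] if v["beta"] >= 0 else v["max"]
        points[gene] = v["beta"] * (sample[gene] - reference) * scale
    return points


def predicted_probability(model, intercept, sample):
    """Logistic transform of the linear predictor."""
    linear_predictor = intercept + sum(v["beta"] * sample[g] for g, v in model.items())
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical sample with log-scale expression values.
sample = {"NNMT": 8.0, "PARP9": 7.0}
points = nomogram_points(MODEL, sample)
total = sum(points.values())
prob = predicted_probability(MODEL, INTERCEPT, sample)
print(points, round(total, 1), round(prob, 3))
```

In a printed nomogram, the total points axis is aligned with a probability axis, so the two-step lookup of points and then probability mirrors the two functions above.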
In the validation set GSE206285, the AUC values for the hub genes were NNMT (AUC = 0.988) and PARP9 (AUC = 0.807), with the nomogram achieving an AUC of 0.988 (Figures 10C, D). In the validation set GSE92415, the AUC values were NNMT (AUC = 0.974) and PARP9 (AUC = 0.962), with the nomogram achieving an AUC of 0.985 (Figures 10E, F). In the validation set GSE66407, the AUC values were NNMT (AUC = 0.974) and PARP9 (AUC = 0.871), with the nomogram achieving an AUC of 0.976 (Figures 10G, H). These results confirm the accuracy of these two key genes in the diagnosis of UC.
Subsequently, we conducted RT-qPCR experiments on the two hub genes, NNMT and PARP9, to further confirm their roles. NNMT and PARP9 expression was significantly increased in colonic epithelial cells following LPS treatment compared with the control group (Figures 10I, J). These RT-qPCR results corroborated the differential expression observed in the dataset analysis, highlighting the potential of these genes as biomarkers in the diagnosis of UC.
4. Discussion
UC is a chronic IBD that significantly affects the health and quality of life of patients. It is a major global public health concern, and its complex etiology and recurrent nature make research into effective diagnostic and personalized treatment approaches critical. Previous studies have identified NAD+ metabolism as a critical pathway in UC pathogenesis. To further explore the disease mechanisms, we categorized 161 UC samples collected from public databases into two distinct NAD+ subtypes, clusters A and B. Cluster A exhibited stronger metabolic regulation and self-repair capabilities, while Cluster B was associated with more intense immune responses and severe tissue damage.
Additionally, we demonstrated that NNMT and PARP9 are effective diagnostic biomarkers in UC, while AOX1, NAMPT, NNMT, and PTGS2 are discriminative markers for UC subtyping. Models developed using these biomarkers may help predict disease progression more accurately and optimize treatment plans.
The GO and KEGG enrichment analyses of DEGs between UC samples and normal controls further supported the characterization of UC as an immune-mediated inflammatory disease. The GO enrichment analysis revealed the central role of immune system regulation in UC, particularly the differentiation and function of monocytes and lymphocytes. Previous studies have suggested that the aberrant activity of these cells in UC may exacerbate the condition by promoting the release of inflammatory mediators and modulating the function of immune cells, leading to persistent tissue damage (48–51). Moreover, the excessive production of cytokines contributes to ongoing tissue injury (6, 52). KEGG enrichment analysis corroborated the GO findings, revealing the activation of critical pathways, including the MAPK and chemokine signaling pathways. Previous studies have reported that the MAPK signaling pathway is closely associated with UC progression (53, 54), with several therapeutic drugs alleviating UC symptoms by modulating this pathway (55, 56). Activation of the chemokine signaling pathway facilitates the migration of immune cells, including neutrophils and regulatory T cells, to inflamed areas, sustaining the inflammatory response and propelling disease progression (36, 57, 58). This was further validated in our subsequent analyses of immune cell infiltration. Elevated levels of activated memory CD4+ T cells, follicular helper T cells, M0 macrophages, M1 macrophages, activated mast cells, and neutrophils in UC tissues highlight the significant increase in inflammation and immune activation in UC (59, 60).
Conversely, lower levels of resting memory CD4+ T cells, M2 macrophages, and resting mast cells in UC may reflect the limited functionality of these regulatory and reparative cells in the disease (61). These analyses confirm the significance of UC as an immune-mediated inflammatory disease and provide potential targets for future therapeutic interventions.
By comparing enriched pathways between the two UC subtypes, we observed that pathways related to metabolism and biosynthesis, including drug metabolism, carbohydrate conversion, and steroid hormone biosynthesis, were predominantly enriched in cluster A. This suggests that cluster A possesses enhanced metabolic regulation and self-repair capabilities, which may help control the spread of inflammation, alleviate tissue damage, and promote the regeneration of damaged tissue (62–64). Conversely, pathways related to immunity and inflammation, including chemokine signaling, cytokine-cytokine receptor interaction, and complement and coagulation cascades, were significantly enriched in cluster B. This indicates that cluster B may be associated with more intense immune responses and severe tissue damage, where active inflammatory pathways could lead to rapid accumulation of immune cells and amplification of inflammatory reactions, complicating disease progression and treatment (65, 66). Further analysis of immune cell infiltration between the subtypes supports these findings. The infiltration levels of M0 and M1 macrophages, activated mast cells, and neutrophils are higher in subtype B than in subtype A, whereas the levels of M2 macrophages and resting mast cells are lower. This reflects a significantly different immune environment in subtype B compared with subtype A, potentially indicative of more intense inflammatory responses and reduced anti-inflammatory or tissue repair capabilities (67, 68).
Additionally, we found that the pattern of immune cell infiltration in subtype A lies between that of subtype B and normal tissues, suggesting that subtype A resembles normal tissue more closely than subtype B does. These differences reveal fundamental distinctions between subtypes A and B in immune response mechanisms, immune cell types and activity, and potential pathological processes. This highlights the importance of developing personalized treatment plans based on specific subtypes to optimize therapeutic outcomes and improve patient prognosis.
Following comprehensive bioinformatics analysis and experimental validation, we propose that NNMT and PARP9 are effective diagnostic biomarkers for UC, while AOX1, NAMPT, NNMT, and PTGS2 can differentiate between UC subtype clusters A and B. NNMT (nicotinamide N-methyltransferase) is a cytoplasmic enzyme primarily involved in the N-methylation of nicotinamide (Nam); it reduces precursors of NAD+ through methylation and Nam excretion (69). Notably, NNMT helps maintain high levels of inflammatory signaling and sustained signal transduction by eliminating excess nicotinamide (24–27, 70). Consequently, elevated NNMT expression in UC may enhance the activation of inflammatory pathways and increase disease activity and tissue damage. Therefore, monitoring NNMT expression levels may aid in diagnosing UC and in differentiating between disease activity states or subtypes, especially those related to inflammatory responses and metabolic status. In addition, PARP (poly(ADP-ribose) polymerase), which uses NAD+ as a substrate, facilitates ADP-ribosylation reactions during DNA damage (71), with increased PARP activity leading to decreased NAD+ levels (72). This process is crucial for DNA repair and for regulating inflammatory responses.
PARP9 (poly(ADP-ribose) polymerase family member 9), a member of the PARP family, shows expression patterns closely associated with immune responses and cellular stress states in various inflammatory diseases (30, 73–75). In UC, the upregulated expression of PARP9 may relate to its role in cellular stress responses. AOX1 (aldehyde oxidase 1), a broad-spectrum oxidase, has demonstrated potential as a biomarker in various cancers; its low expression in clear cell renal cell carcinoma and prostate cancer correlates with poor prognosis, suggesting tumor-suppressing capabilities (76, 77). Although studies in UC are still limited, the role of AOX1 in regulating oxidative stress and inflammatory responses indicates its potential as a valuable target for personalized treatment in UC. NAMPT (nicotinamide phosphoribosyltransferase) is critical in regulating the NAD+ pool and inflammatory responses in UC. It is both a key enzyme in NAD+ biosynthesis and a cytokine, influencing cellular metabolism and immune responses within UC (20–23, 78). Elevated levels of NAMPT may reflect an adaptive response to metabolic demands and mucosal damage in UC (70, 79, 80), highlighting its potential as a biomarker for disease severity and subtype differentiation. PTGS2 (prostaglandin-endoperoxide synthase 2, COX-2) is a crucial enzyme that converts arachidonic acid into prostaglandins, which are essential in inflammation and pain responses. In UC, significant upregulation of PTGS2 marks an inflammatory feature of the disease (81). Studies have reported that increased PTGS2 expression is closely associated with UC severity and progression, demonstrating its potential as a biomarker for disease activity and a therapeutic target (29, 82).
Furthermore, the NAD+ metabolism pathway is closely linked to the regulation of inflammatory responses, and changes in NAD+ metabolism may affect PTGS2 activity (28, 83), implying a potential interaction between NAD+ metabolism and PTGS2 in UC pathogenesis and management.
5. Conclusion
In conclusion, this study identified two UC subtypes associated with NAD+ metabolism, and our analysis of the differences between these subtypes highlighted the significant role of NAD+ metabolism in UC. We identified NNMT and PARP9 as diagnostic biomarkers for UC, and AOX1, NAMPT, NNMT, and PTGS2 as biomarkers for differentiating the NAD+ metabolism subtypes of UC. The nomograms developed from these biomarkers demonstrated high accuracy and reliability in the diagnosis and subtyping of UC, indicating the potential application of these biomarkers in UC treatment strategies. Future research should investigate the expression patterns of these genes in different patients with UC and their impact on treatment responses, which could help optimize treatment plans and advance therapeutic strategies for UC.
Acknowledgments