Abstract

Background and objective
Cardiovascular-Kidney-Metabolic (CKM) syndrome reflects the interrelated pathophysiology of obesity, insulin resistance, type 2 diabetes, chronic kidney disease, and cardiovascular disease. Conventional CKM staging often detects risk only after substantial organ dysfunction and may overlook early metabolic heterogeneity. This study aimed to employ plasma metabolomics to identify metabolic subtypes linked to CKM severity and explore early biomarkers for high-risk individuals.

Methods
A cross-sectional study was conducted involving 163 adults, of whom 86 were clinically staged as CKM 0–3 according to the criteria proposed by the American Heart Association (AHA). Plasma samples underwent untargeted metabolomic and lipidomic profiling using liquid chromatography–mass spectrometry (LC–MS). Unsupervised clustering identified metabolic subtypes, with validation via random forest analysis. Group differences were assessed using orthogonal partial least squares–discriminant analysis (OPLS-DA) and logistic regression classifiers.

Results
A total of 390 metabolites, categorized into 9 superclasses and 30 subclasses, were identified. Three distinct metabolic clusters emerged: Cluster 1 (glycerophospholipid-enriched), Cluster 2 (fatty acyl–dominant), and Cluster 3 (glycolipid-enriched). At the individual differential metabolite level, Cluster 1 exhibited a generally low metabolic status, Cluster 2 demonstrated an intermediate metabolic profile, and Cluster 3 showed a high metabolic status. High-risk CKM individuals were predominantly assigned to Cluster 3 (p < 0.001). Within each cluster, OPLS-DA effectively differentiated high- and low-risk individuals based on lipid profiles, highlighting triglycerides, fatty acids, phosphatidylcholines, sphingolipids, and acylcarnitines as key discriminators. Secondary clustering among patients with stage 3 CKM revealed substantial metabolic heterogeneity.
A panel of 20 metabolites achieved high diagnostic performance for identifying individuals with stage 3 CKM (AUC = 0.875).

Conclusions
Untargeted plasma metabolomic profiling reveals distinct metabolic subtypes corresponding to CKM severity and uncovers marked heterogeneity within the high-risk group. Key metabolite signatures may enhance early risk stratification and support more personalized management strategies beyond conventional CKM staging.

Graphical abstract
Overview of study design, metabolic subtypes, CKM stratification, and key findings.

Supplementary Information
The online version contains supplementary material available at 10.1186/s12933-025-02881-8.

Keywords: Cardiovascular-kidney-metabolic (CKM) syndrome, Liquid chromatography–mass spectrometry (LC–MS), Metabolomics, Lipidomics, Unsupervised clustering, Metabolic endotypes

Introduction
Cardiovascular-Kidney-Metabolic (CKM) syndrome serves as a comprehensive framework that acknowledges the intricate interconnection of the pathophysiological processes underlying obesity, insulin resistance, type 2 diabetes (T2D), chronic kidney disease (CKD), and cardiovascular disease (CVD) [1–5]. Epidemiological studies indicate that approximately 90% of adults satisfy the criteria for at least Stage 1, characterized by excess adiposity or early metabolic disturbances [6–9]. In contrast, approximately 10–15% have progressed to Stages 3–4, which are indicative of subclinical or overt cardiovascular and renal events superimposed on existing metabolic risk factors [10, 11]. Individuals in these advanced stages are subject to a two- to four-fold increased risk of experiencing myocardial infarction, stroke, heart failure hospitalization, progression to end-stage kidney disease, and premature mortality [12–15].
Yet conventional clinical markers, such as glycated hemoglobin (HbA₁c), clinical lipid panels, and estimated glomerular filtration rate (eGFR), frequently indicate high-risk status only after significant organ damage has occurred, constraining the window for timely intervention [16–18]. In addition, patients classified within the same CKM stage may display considerable heterogeneity; individuals with comparable body mass index (BMI), blood lipid profiles, or eGFR values can exhibit distinct metabolic, inflammatory, and molecular profiles, which may affect their clinical outcomes [19–21]. These observations underscore the necessity for methodologies that capture early metabolic changes and define metabolic subtypes beyond conventional clinical metrics, enabling earlier identification of high-risk individuals and more targeted preventive interventions [22, 23]. Liquid chromatography–mass spectrometry (LC–MS)-based high-throughput untargeted metabolomics facilitates the sensitive and specific quantification of a vast array of endogenous metabolites, ranging from lipids and amino acids to energy intermediates and microbiome-derived compounds [24, 25]. Metabolomic profiling has already enhanced prediction models for T2D, CVD, and CKD by detecting pathophysiological perturbations invisible to routine assays [26–28]. For instance, increases in branched-chain and aromatic amino acids have been strongly associated with the onset of diabetes, while certain ceramide (Cer) and lysophospholipid species independently predict the development of atherosclerotic plaques [29–32]. Notably, because metabolomics captures the integrated downstream effects of genetic factors, dietary influences, and environmental exposures, it is particularly well-suited to uncover subclinical disease states and dynamic risk profiles [33].
Moreover, unsupervised clustering of untargeted metabolomic data has successfully uncovered latent sub-phenotypes in diseases such as nonalcoholic fatty liver disease and heart failure [34–36], yet few studies have applied this approach to explore latent metabolic subtypes within the CKM framework. Filling this gap could reveal novel biomarkers and mechanistic pathways, ultimately improving early risk stratification. In this exploratory study, untargeted LC–MS-based metabolomic and lipidomic profiling was performed on fasting plasma from a health-check cohort. Using unsupervised clustering, the study aimed to identify metabolically distinct subgroups and evaluate their associations with clinical CKM stages. This approach may contribute to improved risk stratification and facilitate early identification of individuals with high-risk metabolic phenotypes, thereby informing more personalized prevention strategies.

Methods

Study design and participants
This cross-sectional study enrolled 163 adults (aged 18–75 years) from Shandong Provincial Hospital between January and December 2024 [25]. Exclusion criteria were active infection, malignancy, chronic inflammatory disease, recent surgery (< 3 months), and use of lipid-modifying or immunomodulatory medications. All participants provided written informed consent under a protocol approved by the Institutional Review Board of Shandong Provincial Hospital. Clinical data, including age, sex, ethnicity, medication use, and history of hypertension and T2D, were collected via standardized questionnaires. BMI was calculated from measured height and weight. Carotid plaque data were obtained via carotid ultrasound examination. Systolic and diastolic blood pressures were measured in triplicate after 5 min of rest using an automated sphygmomanometer. Fasting blood samples were obtained between 7:00 and 9:00 AM after an overnight fast of ≥ 8 h.
Routine laboratory parameters, namely fasting glucose, lipid panel (total cholesterol, high-density lipoprotein cholesterol [HDL-C], low-density lipoprotein cholesterol [LDL-C], triglycerides [TG]), serum creatinine, and urea nitrogen, were measured in the central clinical laboratory using standardized assays.

Plasma collection
Venous blood (10 mL) was drawn into K2-EDTA tubes, kept on ice, and centrifuged at 1500×g for 10 min at 4 °C. Plasma aliquots (500 μL) were stored at − 80 °C.

LC–MS-based untargeted metabolomics profiling
Metabolomics analysis was performed as previously described [37]. Briefly, 400 μL of cold methanol (MeOH) containing internal standards was added to 50 μL of plasma sample. After vortex mixing, proteins were removed by centrifugation. The extract was lyophilized and then reconstituted with a water/methanol mixture (4:1, v/v). After dissolving and centrifuging, the supernatant was transferred into the LC–MS system for analysis. The internal standards used for metabolomics analysis and their final concentrations were as follows: Carnitine C2:0-d3 (0.03 μg/mL), Carnitine C10:0-d3 (0.02 μg/mL), Carnitine C16:0-d3 (0.025 μg/mL), Lysophosphatidylcholine 19:0 (LPC19:0, 0.125 μg/mL), Free Fatty Acid C16:0-d3 (FFA C16:0-d3, 0.4 μg/mL), Free Fatty Acid C18:0-d3 (FFA C18:0-d3, 0.4 μg/mL), Cholic Acid-d4 (CA-d4, 0.3 μg/mL), Chenodeoxycholic Acid-d4 (CDCA-d4, 0.3 μg/mL), Phenylalanine-d5 (Phe-d5, 0.5 μg/mL), Leucine-d3 (Leu-d3, 0.7 μg/mL), and Tryptophan-d5 (Trp-d5, 0.6 μg/mL). Measurement was performed using ultra-high-performance liquid chromatography (UHPLC) (Shimadzu) coupled with a Triple TOF 5600 plus mass spectrometer (AB SCIEX, Framingham, USA) system. A Waters BEH C8 column (2.1 mm × 50 mm, 1.7 μm) and an HSS T3 column (2.1 mm × 50 mm, 1.8 μm) were used for separation in positive and negative modes, respectively. The flow rate was set to 0.4 mL/min and the column temperature was maintained at 60 °C.
The elution gradient started at 5% B, held for 0.5 min, then increased linearly to 40% B at 2.0 min, reached 100% B at 8.0 min, and held at this concentration for 2 min. Finally, the column was returned to 5% B within 0.1 min and held for 1.9 min for equilibration. In positive mode, the mobile phases consisted of water (0.1% formic acid, phase A) and acetonitrile (0.1 mM formic acid, phase B). In negative mode, the mobile phases consisted of water (6.5 mM ammonium bicarbonate, phase A) and 95% methanol (6.5 mM ammonium bicarbonate, phase B). For mass spectrometry (MS) parameters, the flow rates of sheath gas and curtain gas were set to 55 and 35 psi, respectively. The scan range was set to m/z 100–1250 with a collision energy of 10 V. For dd-MS2 mode, the scan range was set to m/z 50–1250 with collision energy at 35 ± 15 V. In positive ion mode, the ion spray voltage floating was set to 5.5 kV, and the capillary temperature was 550 °C. In negative ion mode, the ion spray voltage floating was − 4.5 kV, and the capillary temperature was 450 °C.

LC–MS-based untargeted lipidomics profiling
Lipidomics analysis was conducted according to a previously described method [38]. Lipid extraction was performed using the MeOH/H2O/MTBE technique [39]. Briefly, 300 μL of MeOH containing internal standards was added to 40 μL of plasma sample, followed by vortex mixing. Then, 1 mL of methyl tert-butyl ether (MTBE) was added, and the mixture was vortexed for 10 min. Next, 300 μL of H2O was added, and the mixture was vortexed to form a two-phase system. After centrifugation, 400 μL of the supernatant was lyophilized and stored at − 80 °C. Prior to analysis, the lyophilized samples were reconstituted with ACN/IPA/H2O (65:30:5, v/v/v) containing 5 mM ammonium acetate. A 5 μL aliquot of the sample was transferred into the LC–MS system for analysis.
The internal standards used for lipidomics analysis and their final concentrations were as follows: Phosphatidylcholine 38:0 (PC 38:0, 1.67 μg/mL), Phosphatidylethanolamine 34:0 (PE 34:0, 0.83 μg/mL), Lysophosphatidylcholine 19:0 (LPC 19:0, 0.67 μg/mL), Sphingomyelin 12:0 (SM 12:0, 0.83 μg/mL), Triglyceride 45:0 (TG 45:0, 1.33 μg/mL), Ceramide 17:0 (Cer 17:0, 0.33 μg/mL), Free Fatty Acid 16:0-d3 (FFA 16:0-d3, 0.67 μg/mL), and Free Fatty Acid 18:0-d3 (FFA 18:0-d3, 0.67 μg/mL). Lipidomics analysis was performed using a Waters ACQUITY UHPLC (Shimadzu) coupled with an AB SCIEX Triple Q-TOF 5600 Plus (Concord, Canada). A Waters BEH C8 column (2.1 mm × 100 mm, 1.7 μm) was used for lipid separation. The mobile phases consisted of 3:2 (v/v) acetonitrile (ACN)/H2O (10 mM AcAm, phase A) and 9:1 (v/v) isopropanol (IPA)/ACN (10 mM AcAm, phase B). The flow rate was set to 0.26 mL/min, and the column temperature was maintained at 55 °C. The elution gradient started at 32% B, was held at this concentration for 1.5 min, then increased linearly to 85% B at 15.5 min, reached 97% B at 15.6 min, and was held at this concentration for 2.4 min. Finally, the column was returned to 32% B within 0.1 min and held for 1.9 min for equilibration. The ion spray voltage for MS was set to 5500 V and 4500 V in positive and negative ion modes, respectively. The interface heater temperature was set to 500 °C in positive mode and 550 °C in negative mode. The flow rates for ion source gas 1, ion source gas 2, and curtain gas were set to 50, 50, and 35 psi in positive ion mode and 55, 55, and 35 psi in negative ion mode, respectively. The MS scan range was 300–1250 Da in positive mode and 150–1250 Da in negative mode. Samples were run in a randomized order to minimize systematic bias. Prior to analysis, 10 blank injections were performed to ensure baseline stability and clean system conditions. 
Quality control (QC) samples were analyzed as the first ten injections to condition and stabilize the mass spectrometry instruments. Within the analytical batch, a blank sample and a QC sample were injected after every 10 study samples.

Annotation of metabolites and data processing
Raw LC–MS data were processed using MS-DIAL (Version 4.90) software, with metabolite and lipid identification referenced against an in-house database. Compounds were annotated based on their accurate mass, chromatographic retention time, and MS/MS fragmentation patterns. Specifically, 174 metabolites were identified in the metabolomics analysis, and 324 lipids were identified in the lipidomics analysis. After removing compounds detected in both ionization modes, and those with a low signal-to-noise ratio or a high coefficient of variation, 390 unique metabolites were retained. MultiQuant software was used for quantification of raw data. The metabolites and lipids in each sample were normalized based on their corresponding internal standards. Processed peak areas were log₁₀-transformed and Pareto-scaled; missing values (< 5%) were imputed by k-nearest neighbors (k = 5) in MetaboAnalyst 5.0.

Metabolite classification and pathway enrichment
Annotated metabolites were categorized by HMDB class (e.g. glycerophospholipids, glycerolipids, sphingolipids, fatty acyls, acylcarnitines). One-way ANOVA followed by False Discovery Rate (FDR) correction (Benjamini–Hochberg, FDR < 0.05) was applied to identify differentially abundant metabolites across clusters. KEGG pathway enrichment analysis was performed using the hypergeometric test (FDR < 0.05) in MetaboAnalyst, with all annotated metabolites serving as the background reference.

External validation using public metabolomics cohorts
To validate the CKM-associated metabolite panel, we reanalyzed three publicly available untargeted metabolomics datasets related to cardiovascular or kidney conditions, accessed from the Metabolomics Workbench.
These datasets were selected to represent relevant disease axes of the CKM spectrum: kidney disease, coronary atherosclerosis, and microvascular dysfunction. For validation, we intersected our top 30 CKM differential metabolites with those detected in each dataset, and performed classification modeling using random forest and ROC analysis.

1. ST000816: CKD cohort (diabetic kidney disease). This dataset includes plasma lipidomics profiling of patients with type 2 diabetes, stratified by progression of diabetic nephropathy. A total of 100 samples were used for validation, including 50 progressors and 50 non-progressors, based on eGFR decline over time. We used the baseline samples only to compare future progression status. Sixteen of the CKM-featured metabolites overlapped with this dataset.
2. ST003661: CTO cohort (chronic total occlusion of coronary arteries). This study includes plasma samples from patients undergoing percutaneous coronary intervention (CTO-PCI) for chronic total occlusion. Samples were collected at multiple timepoints (pre-PCI, 24 h post-PCI, 72 h post-PCI), and from healthy matched controls. For our analysis, we compared the pre-operative CTO group (n = 31) with healthy controls (n = 20) to focus on pre-treatment cardiovascular signatures. Eight of the CKM-related metabolites were detected in this dataset.
3. ST003275: CM cohort (coronary microvascular dysfunction). This dataset includes 75 serum samples, comprising 56 patients with coronary microvascular dysfunction (CMD) and 19 matched healthy controls, collected from a clinical cardiology cohort. It provides a distinct dimension of cardiovascular risk focused on microvascular pathophysiology. Twelve of the CKM-related metabolites overlapped with this dataset.

Statistical analysis
All statistical analyses were conducted using R software (version 4.2.1) and GraphPad Prism 9.
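As background to these analyses, the log₁₀ transformation and Pareto scaling applied during data processing can be illustrated with a minimal sketch. It is written in Python purely for illustration; the study itself used MetaboAnalyst 5.0 and R. The function name `pareto_scale`, the example peak areas, and the choice of the sample (n − 1) standard deviation are assumptions, not details taken from the study.

```python
import math


def pareto_scale(peak_areas):
    """Log10-transform positive peak areas, then Pareto-scale them.

    Pareto scaling mean-centers a feature and divides by the square
    root of its standard deviation (not by the standard deviation).
    """
    logged = [math.log10(v) for v in peak_areas]
    n = len(logged)
    mean = sum(logged) / n
    # Assumption: sample standard deviation with an n - 1 denominator.
    sd = math.sqrt(sum((x - mean) ** 2 for x in logged) / (n - 1))
    if sd == 0:
        # A constant feature carries no information after centering.
        return [0.0] * n
    return [(x - mean) / math.sqrt(sd) for x in logged]


# Hypothetical peak areas for one metabolite across three samples.
example_areas = [1.0e4, 1.0e5, 1.0e6]
scaled = pareto_scale(example_areas)
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation shrinks the dominance of high-variance features while keeping more of the original data structure, which is one common reason for choosing it in metabolomics.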
For unsupervised clustering, k-means, hierarchical, and consensus clustering (R package ConsensusClusterPlus) were compared with cluster numbers ranging from k = 2 to 6. Optimal clustering was achieved using k-means (k = 3), which minimized random forest OOB error (randomForest) and yielded stable subgroup assignments. Cluster separation was visualized using PCA and t-SNE, with loadings examined to identify key metabolites contributing to principal components. To assess clinical relevance, cluster assignments were overlaid on CKM stage data available for 86 participants. Chi-square tests evaluated stage distribution differences across clusters. Logistic regression models (glm), adjusted for age, sex, and BMI, were used to estimate absolute risk differences (RDs) and corresponding 95% confidence intervals (CIs) for high-risk status (Stage 3), using Cluster 1 as the reference group. Classification performance of cluster labels was assessed by ROC curve analysis, with AUC and 95% CIs computed using DeLong’s method. Within each cluster, OPLS-DA was employed to contrast high- vs low-risk individuals. Model performance was evaluated by R^2Y and Q^2 values using sevenfold cross-validation and 200-permutation testing. Metabolites with variable importance in projection (VIP) > 1.0 and FDR-adjusted p < 0.05 (Student’s t-test) were considered discriminant. Heatmaps (pheatmap) visualized standardized abundances. To explore intra-stage heterogeneity, high-risk individuals (n = 21) were re-clustered using k-means (k = 3), with subgroup stability validated by PCA and random forest OOB error. Discriminative metabolites for each endotype were identified using VIP and FDR filtering. For biomarker discovery, significant metabolites (VIP > 1.0, FDR < 0.05) from all clusters were subjected to feature selection via random forest and ranked by mean decrease in Gini index. 
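The filtering rule used above (VIP > 1.0 together with a Benjamini–Hochberg FDR-adjusted p < 0.05) can be sketched as follows. This is an illustrative Python version, not the R code used in the study; the function name `benjamini_hochberg` and the metabolite names, VIP scores, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical candidates: (name, VIP score, raw p-value).
candidates = [
    ("metabolite_A", 1.8, 0.01),
    ("metabolite_B", 1.2, 0.04),
    ("metabolite_C", 0.7, 0.03),
    ("metabolite_D", 1.5, 0.20),
]
q_values = benjamini_hochberg([p for _, _, p in candidates])

# Keep features passing both the VIP and the FDR thresholds.
selected = [
    name
    for (name, vip, _), q in zip(candidates, q_values)
    if vip > 1.0 and q < 0.05
]
```

In this toy input only `metabolite_A` passes both thresholds: `metabolite_B` has a raw p-value below 0.05 but an adjusted value above it, which shows why the adjustment matters when many metabolites are tested at once.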
Logistic regression models incorporating the top 10, 20, and 30 ranked metabolites were trained using tenfold cross-validation. Model performance was evaluated using ROC analysis for AUC, sensitivity, and specificity. To formally compare the discriminative ability of competing models, we applied DeLong’s non-parametric test for correlated ROC curves, reporting ΔAUC, two-sided p-value, and 95% confidence interval. Predicted probabilities from both the clinical model and the metabolite-enhanced model were exported to the nricens package. We then computed the integrated discrimination improvement (IDI) as the difference in mean predicted risk change between cases and controls, and the category-free net reclassification index (cfNRI) as the sum of improvement in upward reclassification among cases and downward reclassification among controls. Continuous variables were expressed as mean ± standard deviation (SD) or median (interquartile range, IQR), and categorical variables as counts (%). Group comparisons were performed using independent t-tests or Mann–Whitney U-tests for continuous data and χ^2 or Fisher’s exact tests for categorical data, as appropriate. Multiple testing was adjusted using the Benjamini–Hochberg FDR correction. A two-sided p < 0.05 was considered statistically significant. No a priori power calculation was conducted given the exploratory nature of the study.

Results

Unsupervised, omics-based clustering defines three metabolic subtypes
Unsupervised K-means clustering was performed on 390 plasma metabolites measured by LC–MS in 163 adults. This method was selected because it yielded the lowest out-of-bag (OOB) error in random forest analysis among several clustering approaches (Supplementary Fig. 1). K-means clustering identified three distinct metabolomic subtypes, each characterized by a globally divergent biochemical profile (Fig. 1A).
Notably, metabolites that were differentially abundant across clusters (FDR < 0.05) (Supplementary Table 1) were predominantly enriched in KEGG pathways related to linoleic acid metabolism, tyrosine metabolism, central carbon metabolism in cancer, fatty acid elongation, and retrograde endocannabinoid signaling (Fig. 1B). Pathway enrichment analysis underscored fundamental differences in the underlying metabolic processes, offering mechanistic insights into the distinct metabolic profiles of each cluster.

Fig. 1. Unsupervised consensus clustering of metabolomic and lipidomic profiles in a health-check cohort. A Principal-component analysis (PCA) score plot of 163 participants based on combined metabolite and lipid measurements, colored by consensus cluster (1–3) with 95% confidence ellipses. B Bubble plot of KEGG pathway enrichment for metabolites driving cluster separation (rich factor on the x-axis; bubble size = metabolite count; color = −log₁₀ adjusted p-value). C Stacked bar charts showing the relative abundance of major lipid subclasses (fatty acyls, glycerolipids, glycerophospholipids, glycolipids, sphingolipids, organic compounds, others) in each cluster. D Heatmap of z-scored abundances for the top 20 cluster-discriminating metabolites; samples (columns) are grouped by cluster and metabolites (rows) are hierarchically clustered

To further investigate these differences, the total abundance of each metabolite subclass was quantified. Cluster 1 was predominantly defined by an enrichment of glycerophospholipids, particularly phosphatidylcholines (PCs) and phosphatidylethanolamines (PEs), suggesting enhanced glycerophospholipid biosynthesis or turnover. Cluster 2 was primarily characterized by an abundance of fatty acyls, indicative of heightened energy metabolism, particularly fatty acid metabolic activity.
In contrast, Cluster 3 exhibited a marked enrichment of glycolipids, especially glycosphingolipids, pointing to potential alterations in glycosylation pathways or glycosphingolipid metabolism (Fig. 1C). Heatmap analysis of differential metabolites across the three clusters provided finer resolution at the individual metabolite level. Although Cluster 1 generally exhibited lower overall metabolic activity, it showed specific elevations in lysine, leucine, phosphatidylcholine (PC 42:7), and two hexosylceramides (HexCers), suggesting targeted perturbations in amino acid and glycosphingolipid metabolism. Cluster 2 demonstrated intermediate metabolic activity, characterized notably by increased levels of phosphatidylcholines (PCs), ceramides (Cers), sphingomyelins (SMs), and fatty acids (FAs), highlighting the central role of lipid remodeling. In contrast, Cluster 3 exhibited the highest global metabolic status, with broadly elevated levels across most differential metabolites, including triacylglycerols (TGs), diacylglycerols (DGs), cholesteryl esters (CEs), Cers, SMs, PCs, and some amino acids (glutamine, L-glutamic acid, L-tyrosine) (Fig. 1D). Together, these integrative analyses reveal a coherent and hierarchical pattern of metabolic differentiation among the three clusters, spanning from global KEGG pathway distinctions and metabolite subclass composition to specific metabolite-level perturbations, thereby defining three biologically meaningful metabolic subtypes within the analyzed population.

Mapping of CKM stages across metabolic clusters and identification of high-risk stage-specific differential metabolites within each cluster
The distribution of CKM risk stages was examined across the three previously defined metabolic clusters. Cluster 3 exhibited a significantly higher proportion of individuals classified as high-risk (Stage 3), whereas Clusters 1 and 2 were predominantly composed of individuals at lower-risk stages (Fig. 2A).
These differences in CKM stage distribution were statistically significant (chi-square test, p < 0.001). Clinically, membership in Cluster 3 was strongly associated with high-risk stage, reflected by an adjusted risk difference (RD) of 47.6% (95% CI 15.7%–79.8%) compared with Cluster 1, which served as the reference group (Fig. 2B). ROC analysis of cluster-based classification yielded an AUC of 0.736 (95% CI 0.615–0.860), indicating moderate discriminatory accuracy for identifying individuals at high CKM risk (Fig. 2C).

Fig. 2. Consensus clusters, CKM stage associations, and within-cluster metabolite differences by risk status. A Stacked bar chart showing the proportion of CKM stages 0–3+ in each consensus cluster; numbers indicate sample counts and χ^2 p < 0.001. High-risk is defined as CKM stage ≥ 3. B Forest plot of hazard ratios (HR; 95% CI) and risk differences (RD) for progression to high CKM stage in Clusters 2 and 3 versus Cluster 1, both unadjusted and adjusted for age, sex and BMI (vertical dashed line at HR = 1). C ROC curve from a Random Forest classifier distinguishing high-risk (stage ≥ 3) from low-risk participants, with shaded ribbon showing 95% CI; AUC = 0.736 (CI 0.613–0.860). D Within-cluster comparisons of high- versus low-risk CKM individuals. Each row represents one cluster. Left: OPLS-DA score plots showing separation between high-risk (red) and low-risk (blue) individuals. Middle left: Z-score plots of representative differential metabolites. Middle right: Volcano-style VIP plots showing discriminative metabolites by log₂ fold-change and significance. Right: OPLS-DA loading plots indicating metabolite contributions to separation by CKM risk status

The clinical markers associated with CKM syndrome varied significantly across the three metabolic clusters.
Cluster 3 was characterized by lower estimated glomerular filtration rate (eGFR), higher fasting glucose and HbA1c levels, increased body mass index (BMI) and triglycerides, as well as elevated systolic and diastolic blood pressure compared to Clusters 1 and 2 (Supplementary Fig. 2 and Supplementary Table 2). Most of these differences reached statistical significance (p < 0.05), highlighting the distinct clinical profile associated with Cluster 3. OPLS-DA was conducted separately within each cluster to identify key metabolite drivers underlying these associations, comparing high-risk individuals (Stage 3) to those at lower risk (Stages 0–2) (Fig. 2D). In Cluster 1, the OPLS-DA model clearly separated the two groups (Fig. 2D, Top). Elevated concentrations of triglycerides (e.g., TG 47:1, TG 49:1) and ceramides (e.g., Cer 44:1), along with decreased levels of PEs (e.g., PE 40:4), were observed in the high-risk group. This lipidomic pattern may reflect an early transition from a homeostatic lipid state to one characterized by lipid accumulation and stress-induced lipid remodeling [40, 41]. In Cluster 2, high-risk subjects demonstrated an accumulation of specific glycerophospholipids, most notably PCs, along with decreased levels of lysophosphatidylcholines (LPCs) and SMs, relative to low-risk individuals (Fig. 2D, Middle). This lipidomic profile suggests that CKM progression in this subgroup may involve dysregulation of membrane composition, activation of inflammatory pathways, and altered turnover within sphingolipid and glycerophospholipid networks. High-risk individuals in Cluster 3 showed increased levels of acylcarnitines (e.g., CAR 10:2, CAR 14:1), FAs (e.g., FA 19:0, FA 22:1), HexCers (e.g., HexCer 42:2), and LPC 18:1, along with decreased concentrations of TGs (e.g., TG 50:3) (Fig. 2D, Bottom).
These alterations are indicative of mitochondrial stress, defective β-oxidation, and disrupted sphingolipid metabolism—hallmarks of emerging metabolic decompensation [[87]42, [88]43]. In each OPLS-DA model, the top discriminating metabolites (VIP > 1) were consistent with these patterns, highlighting subclass-specific lipid alterations in the high-risk subsets within each metabolic cluster. Metabolomic subtypes of high-risk CKM To gain deeper insights into metabolic heterogeneity at the most advanced stage of CKM, all high-risk individuals categorized as Stage 3 were analyzed regardless of their initial metabolic cluster assignment. Unsupervised k-means clustering of individuals with Stage 3 CKM revealed three metabolically distinct subgroups. This clustering structure was robustly supported by principal component analysis (Fig. [89]3A), highlighting substantial metabolic heterogeneity even at the most advanced stage of the disease. A random forest classifier was applied to identify key metabolites driving subgroup differentiation. The top 20 metabolites, ranked by mean decrease in Gini index, are presented in Fig. [90]3B. Notably, lysophosphatidic acids (e.g., LPA 16:0, LPA 18:0), LPCs (e.g., LPC 14:1, LPC 18:1), acylcarnitines (e.g., CAR 18:2, CAR 14:0), and several TGs (e.g., TG 52:3, TG 56:10) emerged as major contributors to subgroup identity. Heatmap visualization of the top discriminatory metabolites further revealed three distinct lipidomic profiles (Fig. [91]3C). LPC-LPA-TG-enriched subgroup (Cluster 1): Marked by high levels of LPCs and LPAs, along with elevated TG and acylcarnitines. TG-enriched subgroup (Cluster 3): Characterized by broadly elevated TG species across multiple chain lengths, suggesting enhanced neutral lipid accumulation. Mixed-profile subgroup (Cluster 2): Exhibited overall reductions in a combination of features from the other two subgroups, reflecting a relatively low metabolic profile. Fig. 3. [92]Fig. 
3 [93]Open in a new tab Unsupervised clustering and metabolic features within the high-risk (CKM Stage ≥ 3) subgroup. A PCA score plot of high-risk participants colored by consensus sub-cluster (blue/red) with 95% confidence ellipses. B Dot plot of z-scored abundances for the top 20 metabolites discriminating the two high-risk sub-clusters; each point = one sample. C Heatmap of the same 15 metabolites across high-risk samples, ordered by sub-cluster and hierarchically clustered These lipid signatures underscore that even among patients with similarly advanced CKM risk, there exists considerable metabolic heterogeneity. This suggests that different lipidomic pathways may drive high-risk phenotypes in a subgroup-specific manner. Lipid-centric metabolite signatures differentiate CKM risk and enable risk assessment The metabolomic profiles of high-risk (Stage 3) and low-risk (Stages 0–2) subjects were compared using OPLS-DA. The resulting OPLS-DA score plot demonstrated clear separation between the two groups, indicating robust discrimination (Fig. [94]4A). To pinpoint metabolites driving this separation, univariate filtering was applied to the OPLS-DA results. Several lipid-related metabolites, including DGs (DG 40:7, DG 38:5, DG 36:3, DG 34:2, DG 36:4), TGs (TG 58:6, TG 56:7, TG 54:5, TG 58:9), and LPA (LPA 16:0), exhibited significant alterations. DGs and TGs were elevated in high-risk individuals, whereas LPA was decreased (Fig. [95]4B). Fig. 4. [96]Fig. 4 [97]Open in a new tab Metabolic differences between high- and low-risk participants and diagnostic performance of key biomarkers. A OPLS-DA score plot comparing low-risk (CKM < 3, blue) versus high-risk (CKM ≥ 3, red) participants, with 95% confidence ellipses. B Volcano-style VIP plot showing top differential metabolites between high- and low-risk groups. Each point reflects a metabolite's importance (VIP score as circle size),–log₁₀ (p value) on the x-axis. 
(C) Heatmap of the most discriminatory metabolites across all participants, scaled by Z-score. Samples are grouped by CKM risk stage and metabolites are hierarchically clustered. (D) Bar plot of the top 30 metabolites ranked by Mean Decrease in Gini from a Random Forest classifier distinguishing CKM high- vs low-risk individuals. Color denotes relative importance. (E) ROC curves for classifiers built on the top 10, 20 and 30 metabolites (AUCs = 0.863, 0.875, 0.850; shaded ribbons = 95% CI).

These differences were further visualized in a heatmap of the most differential metabolites. Interestingly, the high-risk CKM group exhibited an active metabolic state, whereas the low-risk group displayed a relatively quiescent metabolic profile (Fig. 4C). Specifically, the high-risk group showed elevated levels of multiple TGs and DGs, while certain Cers and SMs tended to be reduced. Additional alterations were observed in acylcarnitines and PCs, further reinforcing the lipid-centered nature of CKM risk stratification.

Next, a random forest analysis was applied to rank the importance of the differential metabolites. The top 30 metabolites by mean decrease in Gini are plotted, illustrating the most influential features. Consistent with the bubble plot (Fig. 4B), many of the highest-ranked metabolites were lipid species, indicating that these metabolites are key drivers of the risk stratification (Fig. 4D). The diagnostic performance of the selected metabolites was evaluated using ROC analysis. Logistic regression models based on the top 10, 20, and 30 ranked metabolites achieved AUC values of 0.863, 0.875, and 0.850, respectively. The 20-metabolite model yielded the highest AUC (0.875), suggesting that including additional variables beyond 20 did not improve, and slightly reduced, predictive accuracy.
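As a reading aid for the AUC values above, the following minimal sketch computes an AUC as the probability that a randomly chosen high-risk sample receives a higher model score than a randomly chosen low-risk sample (the Mann-Whitney formulation). It is an illustration only and not the pipeline used in this study; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive sample has the higher score.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = high-risk (Stage 3), 0 = low-risk (Stages 0-2).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]  # e.g. predicted probabilities
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly -> 0.75
```

Under this reading, an AUC of 0.875 would mean that the model ranks a high-risk individual above a low-risk individual in roughly 87.5% of such pairs.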
Importantly, even the 10-metabolite model performed well (AUC ≈ 0.86), demonstrating that a relatively small panel of metabolites can capture much of the diagnostic signal (Fig. 4E). Together, these results imply that the selected metabolites, particularly the DG, TG and LPA species identified, could serve as the basis for a candidate metabolite-based diagnostic model for high-risk CKM status. In summary, the results show that OPLS-DA and machine-learning selection pinpoint a concise set of lipid metabolites that sharply differentiate high-risk from low-risk groups, laying the groundwork for a biomarker panel.

To assess the diagnostic contribution of the metabolite panel in identifying high-risk CKM status, its performance was compared with conventional clinical markers. ROC analysis showed that the models based on conventional clinical metabolic markers (BMI, triglycerides, glucose) and renal/blood pressure markers (SBP, DBP, BUN, eGFR) yielded AUCs of 0.726 and 0.765, respectively. Notably, integrating all clinical variables (metabolic and renal–BP markers) further improved the performance (AUC = 0.816) (Supplementary Fig. 3A–C). However, compared with this integrated clinical model, the metabolite panel conferred only a small, non-significant gain (ΔAUC = 0.052; DeLong p = 0.78) (Supplementary Fig. 3D). Reclassification indices were likewise modest and non-significant (IDI = 0.039; cfNRI = −0.25), indicating limited incremental discrimination in the present sample.

To externally validate the CKM-associated metabolite panel, we reanalyzed three independent public cohorts involving cardiovascular- or renal-related diseases, as direct CKM datasets were unavailable. Specifically, we used: (1) ST000816, a diabetic kidney disease cohort comparing progressors vs. non-progressors; (2) ST003661, a coronary CTO cohort comparing pre-operative patients with healthy controls; and (3) ST003275, a coronary microvascular disease cohort comparing patients and controls.
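For readers less familiar with the integrated discrimination improvement (IDI) reported in the clinical-model comparison above, the sketch below shows how the index is commonly defined: the gain in mean predicted risk among events minus the gain among non-events when moving from a reference model to a new model. It is a minimal illustration under that standard definition, not the analysis code of this study; the function names and the toy predicted probabilities are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def idi(labels, p_old, p_new):
    """Integrated discrimination improvement.

    labels: 1 for events (e.g. high-risk), 0 for non-events.
    p_old:  predicted probabilities from the reference model.
    p_new:  predicted probabilities from the new model.
    """
    ev_old = [p for y, p in zip(labels, p_old) if y == 1]
    ev_new = [p for y, p in zip(labels, p_new) if y == 1]
    ne_old = [p for y, p in zip(labels, p_old) if y == 0]
    ne_new = [p for y, p in zip(labels, p_new) if y == 0]
    if not ev_old or not ne_old:
        raise ValueError("need at least one event and one non-event")
    gain_events = _mean(ev_new) - _mean(ev_old)        # should rise
    gain_non_events = _mean(ne_new) - _mean(ne_old)    # should fall
    return gain_events - gain_non_events


# Hypothetical toy data, not values from this study.
toy_labels = [1, 1, 0, 0]
toy_p_old = [0.60, 0.50, 0.40, 0.30]
toy_p_new = [0.70, 0.60, 0.30, 0.30]
toy_idi = idi(toy_labels, toy_p_old, toy_p_new)  # approximately 0.15
```

A value near zero, such as the IDI of 0.039 reported above, indicates that the new model separates the mean predicted risks of events and non-events only slightly more than the reference model does.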
Random forest (MeanDecreaseAccuracy) and ROC analyses were performed within each cohort for the CKM-associated metabolites to assess their classification value (Supplementary Fig. 4A–B). Notably, LPC 18:1, PC 38:7, LPC 20:0, and DG 38:5 showed consistently high importance and diagnostic performance (AUC > 0.7) across multiple cohorts. These results support the generalizability of the CKM lipid signature, suggesting it reflects broader metabolic disruptions common to cardiometabolic and renal risk.

Discussion

A comprehensive characterization of metabolic molecular features can enhance the understanding of inter-individual relationships and heterogeneity. This study integrated metabolomic and lipidomic data analyses to investigate the molecular characteristics of CKM syndrome, providing a comprehensive framework that recognizes the intricate interconnections among the pathophysiological processes underlying obesity, insulin resistance, T2D, CKD and CVD. Both shared and stage-specific molecular profiles of CKM syndrome were identified across different stages, including low-risk (stages 0–2) and high-risk (stage 3) individuals. The presence of shared molecular features suggests a potential progression between stages and indicates that the current disease classification lacks clear molecular boundaries. Consequently, routine diagnosis and treatment strategies may overlook the molecular heterogeneity and interconnectedness among patients at different stages of CKM.

To further explore alternative strategies for stratifying CKM syndrome, three metabolic subtypes were redefined using clustering analysis integrating both metabolomic and lipidomic data. Although the resulting clusters differed from the original clinical classifications, they demonstrated distinct molecular patterns. Cluster 1 consisted of the most heterogeneous samples across CKM stages.
Both Cluster 1 and Cluster 2 were characterized by a relatively low metabolic state, particularly in metabolites related to amino acid and lipid metabolism, suggesting a lower likelihood of progression to high-risk CKM for patients in Cluster 1. In contrast, Cluster 3 was enriched with high-risk CKM patients and exhibited a pronounced high metabolic state involving both amino acid and lipid pathways. While Clusters 1 and 2 were predominantly composed of low-risk CKM patients, high-risk individuals were distributed across all three clusters. This suggests that low-risk CKM patients may still progress to advanced stages, highlighting molecular heterogeneity and offering potential biomarkers for early detection and prevention.

Conventional clinical markers, such as glucose, HbA₁c, standard lipid panels, blood pressure, and eGFR, typically show significant changes only after substantial organ damage has occurred [44–46]. Recent advances in metabolomics have begun to address this limitation [25, 34, 47, 48]. For example, large-scale analyses from the UK Biobank identified metabolic clusters with lipid and amino acid profiles that outperformed traditional lipid markers in predicting cardiometabolic outcomes [49]. However, few studies have explicitly focused on the syndromic concept of CKM, which encompasses concurrent cardiovascular, renal, and metabolic dysfunction. In this study, key discriminative features were distilled into a concise 20-lipid panel, achieving an AUC of 0.875 for distinguishing high-risk from low-risk CKM patients; however, its incremental gain over the fully integrated clinical model (ΔAUC = 0.052) was not statistically significant by DeLong testing (p = 0.78). Even so, the panel holds promise for refining risk stratification and informing therapeutic decisions in CKM syndrome.
Integrating such lipidomic endotyping into established tools (e.g., the Framingham Risk Score or KDIGO guidelines) could enhance predictive accuracy, especially for individuals whose conventional markers lie near diagnostic thresholds.

The results of this study demonstrate that the majority of high-risk CKM patients are concentrated within the metabolically active Cluster 3 (Figs. 1D and 2A). Notably, reclassification of all high-risk CKM individuals yielded consistent results (Fig. 3C), suggesting that heightened metabolic activity serves as a key risk factor for CKM. This hypermetabolic subtype is marked by an enrichment of fatty acyls, with elevated levels of both saturated and unsaturated FAs (e.g., FA 19:0, FA 22:1) and acylcarnitines (e.g., CAR 10:2, CAR 14:1), indicating enhanced mitochondrial β-oxidation coupled with incomplete fatty acid catabolism [50, 51]. The accumulation of intermediate acylcarnitines may reflect mitochondrial overload and subsequent cellular stress, potentially contributing to insulin resistance, endothelial dysfunction, and inflammation. Elevated free FAs have previously been implicated in the disruption of insulin signaling and the activation of inflammatory pathways in metabolic tissues, which may exacerbate cardiovascular and renal injury risks, particularly given this subtype’s prominence among Stage 3 CKM patients [52–54].

In parallel, this hypermetabolic subtype exhibits increased concentrations of sphingolipids, including Cers, SMs, and HexCers, indicative of disrupted sphingolipid metabolism. Consistent with previous reports [55, 56], high-risk CKM individuals demonstrated elevated glycosphingolipids (e.g., HexCers), implicating endothelial impairment, pro-inflammatory activation, and insulin resistance.
The accumulation of Cers and complex SMs may contribute to vascular smooth muscle cell proliferation and renal fibrosis, mechanisms closely associated with CKD progression and cardiovascular complications. Preclinical studies suggest that targeting enzymes involved in sphingolipid biosynthesis (e.g., serine palmitoyltransferase, glucosylceramide synthase) may offer therapeutic potential, highlighting promising avenues for intervention in this metabolic phenotype [57, 58].

Furthermore, this subtype exhibits a distinct metabolic signature characterized by glycerophospholipid enrichment, with elevated levels of LPCs (e.g., LPC 18:1, LPC 14:1) and PCs (e.g., PC 42:7), suggesting disruptions in membrane lipid homeostasis. LPCs are bioactive lipids known to promote endothelial activation, oxidative stress, and inflammation [59, 60]. Clinical studies have linked increased LPC levels to the progression of diabetic nephropathy and the instability of coronary artery plaques, indicating a potential vascular-inflammatory profile for this subtype. Lastly, concurrent elevations in TGs, DGs, and cholesteryl esters further support the presence of dyslipidemia-related vascular complications in this high-risk group [61, 62].

Limitations and future directions

Despite its novel insights, this study has several limitations. First, the single-center cohort of 163 adults (of whom only 86 had complete CKM staging data) was relatively homogeneous, which may constrain the generalizability of the findings across broader ethnic and geographic populations. Second, the cross-sectional design precludes causal inference: although strong associations between metabolic endotypes and CKM severity were observed, only longitudinal follow-up can determine whether shifts in metabolomic and lipidomic profiles precede disease progression or merely accompany it.
Third, the mechanistic interpretations are inferred from pathway analyses and existing literature, rather than from direct experimental validation in cellular or animal models. Finally, the exclusive focus on metabolomics and lipidomics precludes the integration of complementary omics layers (such as genomics, transcriptomics, and proteomics) that could illuminate upstream drivers and downstream effectors of CKM pathobiology. Future work should focus on validating these data-driven endotypes in larger, multiethnic cohorts and tracking their stability and prognostic value over time through longitudinal studies; elucidating the causal roles of key pathways in preclinical models to confirm mechanistic links; integrating metabolomics and lipidomics with genomics, transcriptomics, and proteomics to uncover upstream regulators and downstream effectors; and ultimately testing whether endotype-guided interventions can normalize metabolic profiles and improve clinical outcomes in CKM syndrome.

Conclusions

This study demonstrates that plasma metabolomics and lipidomics can reveal clinically meaningful metabolic endotypes within CKM syndrome, and a concise lipid panel robustly predicts high-risk status. Moving forward, metabolomic and lipidomic stratification holds promise for earlier detection, individualized risk management, and mechanism-driven therapeutics in CKM patients.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (1.7 MB, docx)

Acknowledgements