Abstract

Colorectal cancer (CRC) is the third most commonly diagnosed malignant tumour worldwide. Although colon cancer (CC) and rectal cancer (RC) are often discussed together, there is a global trend towards considering them as two separate disease entities. It is necessary to choose the appropriate treatment for CC and RC based on their own characteristics. Hence, it is of great importance to find effective biomarkers to distinguish CC from RC. In the present study, a total of 343 participants were recruited, including 132 healthy individuals, 101 patients with CC, and 110 patients with RC. The concentrations of 93 metabolites were determined by using a combination of dried blood spot sampling and direct infusion mass spectrometry. Multiple algorithms were applied to characterize altered metabolic profiles in CC and RC. Significantly altered metabolites were screened in the training set for distinguishing RC from CC. A biomarker panel including Glu, C0, C8, C20, Gly/Ala, and C10:1 was tested with tenfold cross-validation and an independent test set, and showed the potential to distinguish between RC and CC. This metabolomics analysis summarizes the metabolic differences between RC and CC, which might provide further guidance on novel clinical designs for the two diseases.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-96004-8.

Keywords: Colon cancer, Rectal cancer, Metabolomics, Dried blood spot sampling, Mass spectrometry

Subject terms: Colon cancer, Rectal cancer, Predictive markers

Introduction

Colorectal cancer (CRC) is the third most common malignancy and the second leading cause of cancer death worldwide^1. A total of 0.9 million deaths from CRC and 1.8 million newly diagnosed cases are reported annually^2. The incidence of CRC is gradually increasing, and the number of new patients worldwide is predicted to reach 2.5 million by 2035^3,4.
Although the survival of CRC patients has recently improved owing to screening methods in developed countries, 25% of patients had already progressed to stage IV disease at diagnosis^2. Compared with that of patients with localized CRC, the 5-year survival rate of patients with metastatic CRC is low, ranging from 14% to 20%^5,6. Hence, if CRC is detected at an early stage, its incidence and mortality are reduced, and a better prognosis is obtained^7.

According to CRC statistics from 2015 to 2019, approximately 39% of CRC cases occurred in the proximal colon, 24% in the distal colon, and 30% in the rectum^8. Although colon cancer (CC) and rectal cancer (RC) are often discussed together, they have different embryological origins, anatomies, and functions. These differences are closely associated with patient morbidity, risk factors, and treatment strategies and may even impact disease prognosis^9–11. Therefore, there is an urgent need to systematically clarify the molecular mechanisms and find predictive biomarkers specific to CC and RC for early-stage CRC diagnosis as well as better targeted and specific therapies.

Carcinoembryonic antigen (CEA) has been recognized as the most accepted and routinely used CRC biomarker for disease screening, for the prediction of treatment response and patient survival, and for the detection of recurrence^12. However, preoperative CEA levels are positive in only 40–60% of CRC patients at initial diagnosis^13. Furthermore, it is controversial whether the detection of CEA can reduce CRC mortality for postoperative patients or select patients at stage II who could benefit from adjuvant chemotherapy^14,15. Notably, CEA cannot be used to distinguish CC from RC. For these reasons, finding new efficient and noninvasive biomarkers specific to CC and RC is necessary.
Metabolomics is an omics technology used to systematically detect, identify and quantify small-molecule metabolites in a biological system. It could offer an efficient method to find biomarkers for abnormalities^16, characterize disease perturbation^17 and biological pathways^18, and assist in disease diagnosis^19. Metabolic reprogramming is an important hallmark of cancer; thus, further systematic characterization of metabolites in CC and RC is warranted. A metabolic biomarker panel for CRC has been identified via targeted liquid chromatography‒mass spectrometry (LC‒MS)^20. A serum metabolomics study of 22 CC patients and 23 RC patients was performed to identify potential metabolic markers for CC and RC, and different metabolic profiles were found between the two diseases^21. However, the sample size of this study was limited, and further clinical sample analysis is still needed for patients with CC or RC.

Dried blood spot (DBS) sampling was used in the present study. DBS is a micro-volume sampling technique in which blood samples are collected with a finger prick. Compared with conventional whole blood sampling techniques, DBS requires a smaller blood volume, has relatively high stability, lowers the risk of infection by pathogens, and provides several other advantages, such as simpler sample collection and storage, easier sample transfer, and lesser invasiveness. A combination of DBS and MS can offer high-throughput, stable, and reliable detection for a wide range of analytes and has been used to monitor cancer metabolic reprogramming and select high-sensitivity and high-specificity cancer biomarkers^22,23. In this study, a combination of DBS and MS was used to detect 93 amino acids, carnitine/acylcarnitines, and related ratios in healthy individuals and patients with CC or RC.
After systematic selection, 6 metabolites were screened to construct a prediction model for distinguishing RC from CC in the training set. This model was further validated by tenfold cross-validation on the training set and also assessed in an independent test set. Our findings suggest new insights for CRC detection.

Materials and methods

Study participants

Participants were recruited from the First Affiliated Hospital of Jinzhou Medical University. Healthy individuals (n = 132), CC patients (n = 101), and RC patients (n = 110) were enrolled. The 343 participants were randomly divided into a training set and a test set in a 4:1 ratio. The training set included 106 healthy individuals, 81 CC patients, and 88 RC patients. The remaining participants were assigned to the test set, which included 26 healthy individuals, 20 CC patients, and 22 RC patients. The selection of potential biomarkers and the construction of prediction models were carried out on the training set. The biomarker candidates were evaluated on the test set. The characteristics of the healthy controls (HC) and all patients are shown in Table 1. There were no significant differences in age or gender among the HC, CC, and RC groups. Blood samples from patients were collected before inpatient treatment. Inclusion criteria for patients were: (i) diagnosed with CC or RC and (ii) age over 18 years. Patients with metabolic disease, cardiovascular disease, mental disease, infection, other types of malignant disease, or acute disease were excluded. Inclusion criteria for HC were: (i) healthy individual and (ii) age over 18 years. Subjects with missing data were removed. This study was conducted in accordance with the principles of the Declaration of Helsinki and approved by the Ethics Committee of the First Affiliated Hospital of Jinzhou Medical University. Written informed consent was obtained from each participant.

Table 1.
Clinical characteristics of all participants

Characteristic | HC (training) | CC (training) | RC (training) | P-value (training) | HC (test) | CC (test) | RC (test) | P-value (test)
Total | 106 | 81 | 88 | - | 26 | 20 | 22 | -
Sex: male | 61 | 46 | 59 | 0.2973 | 18 | 11 | 17 | 0.2978
Sex: female | 45 | 35 | 29 | - | 8 | 9 | 5 | -
Age: mean (SD) | 58.5755 (9.8676) | 57.1358 (11.3223) | 60.1364 (10.1372) | 0.1647 | 61.4615 (8.0361) | 63.7000 (8.4111) | 64.2727 (8.5478) | 0.1071
Age: range | 20–79 | 22–79 | 29–79 | - | 38–78 | 39–73 | 37–75 | -
TNM stage I | - | 22 | 20 | - | - | 6 | 7 | -
TNM stage II | - | 27 | 28 | - | - | 7 | 5 | -
TNM stage III | - | 21 | 23 | - | - | 4 | 6 | -
TNM stage IV | - | 11 | 17 | - | - | 3 | 4 | -

HC, healthy control; CC, colon cancer; RC, rectal cancer.

Reagents

Acetonitrile was of high-performance liquid chromatography grade. Acetonitrile, methanol and high-purity water were acquired from Thermo Fisher (Waltham, MA, USA). Acetyl chloride and 1-butanol were obtained from Sigma-Aldrich (St Louis, MO, USA). Isotope-labeled amino acid internal standards (IS) and carnitine/acylcarnitine IS were purchased from Cambridge Isotope Laboratories, Inc. (Tewksbury, MA). The MassCheck® Amino Acids, Acylcarnitines, Succinylacetone Dried Blood Spot Controls at two levels were obtained from Chromsystems Instruments and Chemicals GmbH (Munich, Germany). DBS cards were purchased from Liding medical (Guiyang, China).

Sample collection and pretreatment

Amino acid and carnitine/acylcarnitine ISs were dissolved in 1 ml of pure methanol. The dissolved isotope ISs were mixed together to obtain a stock solution, which was stored at 4 °C. The working solution was prepared by 100-fold dilution of the stock solution with pure methanol. High-level and low-level QC sample solutions were used to verify data quality. QC samples were prepared on DBS cards. Fasting blood samples were taken from each participant to avoid dietary interference. Capillary blood samples were collected directly on a DBS card by a finger prick. The first blood drop was discarded, and the second drop was collected on the DBS card. After that, a disc of 3 mm in diameter was punched from the DBS card and transferred into a Millipore MultiScreen HV 96-well plate (Millipore, Billerica, MA, USA) to extract metabolites.
Each well contained 100 µl of working solution and one DBS disc. The plate was then gently shaken at room temperature for 20 min. After centrifugation for 2 min at 1500 × g, the filtrate was transferred into a new 96-well plate. As the QC solution, MassCheck Amino Acids, Acylcarnitines, Succinylacetone Dried Blood Spot Controls were added to check the stability of the MS analysis. During this process, 2 high-level and 2 low-level QC sample solutions were added into 4 blank wells and handled as real samples. The QC solution and filtrate were dried under a flow of pure nitrogen gas at 50 °C and then derivatized with 60 µl of a mixture of 1-butanol and acetyl chloride (90:10, v/v) at 65 °C for 20 min. After drying again under pure nitrogen gas at 50 °C, each dried sample was immediately mixed with 100 µl of mobile phase for metabolomics analysis.

Metabolomics analysis

We performed quantitative metabolomics analysis via direct injection MS on an AB Sciex 4000 QTrap system (AB Sciex, Framingham, MA) coupled with a positive-ion electrospray ionization source. The injection volume was 20 µl. The mobile phase consisted of acetonitrile/water (80:20). The initial flow rate was 0.2 ml/min. The flow rate was decreased stepwise to 0.01 ml/min within 0.08 min and held there until 1.5 min. It was then returned to 0.2 ml/min and held constant for 0.5 min. The main MS parameters were set as follows: ion spray voltage of 4.5 kV, curtain gas pressure of 20 psi, auxiliary gas temperature of 350 °C, and sheath and auxiliary gas pressures of 35 psi. The MS scan modes and parameters are listed in Supplementary Table S1. Control of the mass spectrometer, MS data collection, and spectral alignment were done with Analyst 1.6.0 software (AB Sciex). ChemoView 2.0.2 (AB Sciex) was used to quantify the concentrations of metabolites.
Data analysis

Multivariate data analysis was used to give a systematic overview of metabolic pattern discrimination among HC, CC, and RC. Two pattern recognition methods, principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA), were applied to the training set after data normalization. The statistical significance of each PLS-DA model was assessed by a 200-time permutation test. An R2 value close to 1 indicates good model quality, and a Q2 value less than 0 in the permutation test suggests that the model is not over-fitted and that the analysis of differential metabolites is reliable. Variable importance projection (VIP) scores were calculated from PLS-DA and used to highlight metabolic changes that contribute significantly to the discrimination between sample groups.

Univariate statistical analysis was further performed to identify significantly altered metabolites between the groups in the training set. The normality of the data distribution for each metabolite was assessed with the Shapiro-Wilk test. For normally distributed data, a parametric t-test was used to compare mean metabolite concentrations between two groups: homogeneity of variance was checked with Levene's test, followed by the standard t-test for data with equal variances and Welch's t-test for data with unequal variances. The Wilcoxon-Mann-Whitney test was used for non-parametric data. Adjusted p-values were calculated with the Benjamini-Hochberg procedure in order to control the false discovery rate (FDR). The Kruskal-Wallis test was performed for comparisons among the three groups. Volcano plots with thresholds of VIP > 1, fold change (FC) > 1.2 or < −1.2, and adjusted p-value < 0.05 were constructed to identify importantly altered metabolites between two groups. The metabolic data were normalized for the following analysis. Two-class analysis using significance analysis of micro arrays (SAM) was performed to further determine the significantly differential metabolites between two groups in the training set.
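The screening criteria described above (Benjamini-Hochberg adjustment followed by the VIP, fold-change, and adjusted p-value thresholds) can be illustrated with a minimal Python sketch. This is not the authors' SAS code. The record layout, the function names, and the signed fold-change convention (ratio when the case mean is higher, negative reciprocal otherwise, which is one common way to obtain values such as −1.2) are assumptions made for illustration, and the example values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def signed_fold_change(mean_case, mean_control):
    """Assumed convention: ratio if >= 1, otherwise the negative reciprocal."""
    ratio = mean_case / mean_control
    return ratio if ratio >= 1 else -1.0 / ratio


def select_metabolites(records, vip_cut=1.0, fc_cut=1.2, alpha=0.05):
    """Keep metabolites with VIP > 1, |FC| > 1.2 and adjusted p < 0.05."""
    adjusted = benjamini_hochberg([r["p"] for r in records])
    selected = []
    for record, q in zip(records, adjusted):
        fc = signed_fold_change(record["mean_case"], record["mean_control"])
        if record["vip"] > vip_cut and abs(fc) > fc_cut and q < alpha:
            selected.append(record["name"])
    return selected


# Hypothetical metabolites M1..M4 (invented numbers, for illustration only).
example = [
    {"name": "M1", "p": 0.01, "vip": 1.5, "mean_case": 2.0, "mean_control": 1.0},
    {"name": "M2", "p": 0.04, "vip": 1.3, "mean_case": 1.5, "mean_control": 1.0},
    {"name": "M3", "p": 0.03, "vip": 0.8, "mean_case": 1.0, "mean_control": 2.0},
    {"name": "M4", "p": 0.20, "vip": 1.1, "mean_case": 1.1, "mean_control": 1.0},
]
print(select_metabolites(example))
```

In this toy input only M1 passes all three filters: M2 has a raw p-value below 0.05, but its adjusted p-value rises above the threshold once the four tests are corrected together.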
A t-statistic with a 200-time permutation test was implemented in SAM. Based on the significantly altered metabolites between two groups, the MetaboAnalyst 5.0 platform was used to identify the most significantly affected pathways with the filter criteria of pathway impact score > 0.05 and −log(p) > 1. After that, a stepwise logistic regression approach was used to identify the most significant independent variables, and a significance level of < 0.1 was required for a variable to enter and to stay in the model. These potential biomarkers were then used to build prediction models by binary logistic regression. Models were evaluated by ten-fold cross-validation on the training set and also assessed in an independent test set. Receiver operating characteristic (ROC) curves were created to assess the sensitivity and specificity of potential biomarkers and to determine their ability to distinguish between two groups. SAS software was used for statistical analysis.

Results

Demographic characteristics of the study participants

The workflow of this study is depicted in Fig. 1. Three groups were enrolled in this study. The training set included 61 (58%) males and 45 (42%) females for HC, 46 (57%) males and 35 (43%) females for patients with CC, and 59 (67%) males and 29 (33%) females for patients with RC. In the training set, mean age was 58.5755 ± 9.8676 years (range 20–79 years) in the HC group, 57.1358 ± 11.3223 years (range 22–79 years) in patients with CC, and 60.1364 ± 10.1372 years (range 29–79 years) in patients with RC. The remaining 68 participants were assigned to the test set for evaluating the performance of the prediction models. There were no significant differences among the three groups in terms of age and gender in either the training set or the test set.

Fig. 1. Study design, data collection and analysis workflow.
CC: colon cancer; RC: rectal cancer; MS: mass spectrometry; SAM: significance analysis of micro arrays; PLS-DA: partial least squares discriminant analysis; ROC: receiver operating characteristic.

Metabolic profiles of HC, CC, and RC groups

A combination of DBS sampling and MS detection was used to measure 93 metabolites in the HC, CC, and RC groups, including 23 amino acids, 35 carnitine/acylcarnitines, and 35 related ratios, as shown in Supplementary Table S2. The general trend of all detected metabolites among the three groups was assessed by unsupervised PCA. The PCA score plot showed a trend towards separation among the three groups (Supplementary Figure S1A), suggesting metabolic alterations among the study groups. In the PCA model, the tight clustering of QCs confirmed the stability of the analytical system (Supplementary Figure S1B). In addition, supervised PLS-DA was carried out to maximize classification among the three groups. As shown in Supplementary Figure S1C, the metabolic profile could distinguish among the three groups. According to the results of a 200-time permutation test, over-fitting of the PLS-DA model did not occur in the present study, as depicted in Supplementary Figure S1D.

Comparison of metabolic profiles between HC and CC groups

The PLS-DA score plot (Fig. 2A) illustrated a significant separation between the HC and CC groups based on the 93 metabolites, without over-fitting (Fig. 2B). The corresponding VIP values of the 93 metabolites were calculated to identify the important variables contributing to the classification between HC and CC. As illustrated in Fig. 2C and D, important metabolites were selected by using PLS-DA, statistical tests, and FC calculation. Fifty metabolites were identified with adjusted p-value < 0.05 and FC > 1.2 or FC < −1.2. Furthermore, 37 metabolites were selected with the criterion of VIP > 1 and FC > 1.2 or FC < −1.2.
Altogether, 36 metabolites were screened with the criterion of VIP > 1, adjusted p-value < 0.05, and FC > 1.2 or FC < −1.2, as illustrated in Fig. 2E. The SAM method was further implemented to identify significantly altered metabolite levels in CC compared with HC (Fig. 2F). Among the 36 selected metabolites, SAM showed 6 significantly upregulated metabolites and 26 significantly downregulated metabolites in CC compared with HC. These 32 metabolites were then used for clustering analysis in a heat map (Fig. 2G) and displayed significant separation between CC and HC. Metabolic pathway analysis showed that oxidation of branched chain fatty acids, Asp metabolism, beta oxidation of very long chain fatty acids, carnitine synthesis, Arg and Pro metabolism, urea cycle, mitochondrial beta-oxidation of long chain saturated fatty acids, and ammonia recycling were mainly altered in CC (Fig. 2H).

Fig. 2. Metabolic profiles of blood amino acids and carnitine/acylcarnitines to distinguish patients with CC from HC. (A) PLS-DA score plot. (B) Statistical significance of the obtained PLS-DA model was evaluated by a 200-time permutation test. The R2-intercept and Q2-intercept in the permutation test were 0.1640 and −0.2090, respectively. (C) Volcano plot combining adjusted p-value and log2 FC. Significantly altered metabolites were identified with adjusted p-value < 0.05 and FC > 1.2 or < −1.2. The selected metabolites are colored in blue. (D) Volcano plot combining VIP and log2 FC. Metabolites with VIP > 1 and FC > 1.2 or < −1.2 were identified. (E) Venn diagram of the altered metabolites between patients with CC and HC based on volcano plot analysis. Thirty-six metabolites were selected with adjusted p-value < 0.05, VIP > 1, and FC > 1.2 or < −1.2. (F) SAM method for two-sample significance analysis in patients with CC and HC. (G) Heat map cluster analysis for the 36 selected metabolites.
Red colors indicate upregulated metabolites, and blue colors indicate downregulated metabolites in patients with CC compared with HC. (H) Pathway enrichment analysis of the differential metabolites between the two groups. Abbreviations: CC: colon cancer; HC: healthy control; PLS-DA: partial least squares discriminant analysis; FC: fold change; VIP: PLS-DA variable importance projection; SAM: significance analysis of micro arrays.

Comparison of metabolic profiles between HC and RC groups

A supervised PLS-DA model (Fig. 3A) was built without over-fitting (Fig. 3B). The PLS-DA score plot showed that the 93 detected metabolites separated RC from HC (Fig. 3A). As illustrated in Fig. 3C–E, 29 metabolites with adjusted p-value < 0.05 and FC > 1.2 or FC < −1.2, and 24 metabolites with VIP > 1 and FC > 1.2 or FC < −1.2 were screened. Altogether, 23 metabolites were selected with the criterion of adjusted p-value < 0.05, VIP > 1, and FC > 1.2 or FC < −1.2. Among the 23 selected metabolites, 13 metabolites were significantly upregulated, and 6 metabolites were significantly downregulated in patients with RC compared with HC (Fig. 3F,G). A metabolic pathway analysis was then performed with these selected metabolites. The pathways of urea cycle, aspartate metabolism, beta oxidation of very long chain fatty acids, Arg and Pro metabolism, oxidation of branched chain fatty acids, ammonia recycling, Hcy degradation, malate-aspartate shuttle, and phosphatidylethanolamine biosynthesis were altered in RC (Fig. 3H).

Fig. 3. Metabolic profiles of blood amino acids and carnitine/acylcarnitines to distinguish patients with RC from HC. (A) PLS-DA score plot. (B) Statistical significance of the obtained PLS-DA model was evaluated by a 200-time permutation test. The R2-intercept and Q2-intercept in the permutation test were 0.1770 and −0.2220, respectively.
(C) Volcano plot combining adjusted p-value and log2 FC. Significantly altered metabolites were identified with adjusted p-value < 0.05 and FC > 1.2 or < −1.2. The selected metabolites are colored in blue. (D) Volcano plot combining VIP and log2 FC. Metabolites with VIP > 1 and FC > 1.2 or < −1.2 were identified. (E) Venn diagram of the altered metabolites between patients with RC and HC based on volcano plot analysis. Twenty-three metabolites were selected with adjusted p-value < 0.05, VIP > 1, and FC > 1.2 or < −1.2. (F) SAM method for two-sample significance analysis in patients with RC and HC. (G) Heat map cluster analysis for the 23 selected metabolites. Red colors indicate upregulated metabolites, and blue colors indicate downregulated metabolites in patients with RC compared with HC. (H) Pathway enrichment analysis of the differential metabolites between the two groups. Abbreviations: RC: rectal cancer; HC: healthy control; PLS-DA: partial least squares discriminant analysis; FC: fold change; VIP: PLS-DA variable importance projection; SAM: significance analysis of micro arrays.

Comparison of metabolic profiles between CC and RC groups

A PLS-DA model (Fig. 4A) was built to separate RC from CC based on the 93 metabolites without over-fitting (Fig. 4B). As shown in Fig. 4C–E, 23 metabolites with adjusted p-value < 0.05 and FC > 1.2 or FC < −1.2 and 25 metabolites with VIP > 1 and FC > 1.2 or FC < −1.2 were screened. Altogether, 20 metabolites were selected with the criterion of adjusted p-value < 0.05, VIP > 1, and FC > 1.2 or FC < −1.2. The SAM results and clustering analysis showed 17 significantly upregulated metabolites and 3 significantly downregulated metabolites in RC compared with CC, as illustrated in Fig. 4F and G.
Metabolic pathway analysis showed that the pathways of carnitine synthesis, Arg and Pro metabolism, urea cycle, biotin metabolism, and beta oxidation of very long chain fatty acids were altered between RC and CC (Fig. 4H).

Fig. 4. Metabolic profiles of blood amino acids and carnitine/acylcarnitines to distinguish RC from CC. (A) PLS-DA score plot. (B) Statistical significance of the obtained PLS-DA model was evaluated by a 200-time permutation test. The R2-intercept and Q2-intercept in the permutation test were 0.2140 and −0.2090, respectively. (C) Volcano plot combining adjusted p-value and log2 FC. Significantly altered metabolites were identified with adjusted p-value < 0.05 and FC > 1.2 or < −1.2. The selected metabolites are colored in blue. (D) Volcano plot combining VIP and log2 FC. Metabolites with VIP > 1 and FC > 1.2 or < −1.2 were identified. (E) Venn diagram of the altered metabolites between CC and RC based on volcano plot analysis. Twenty metabolites were selected with adjusted p-value < 0.05, VIP > 1, and FC > 1.2 or < −1.2. (F) SAM method for two-sample significance analysis in CC and RC. (G) Heat map cluster analysis for the 20 selected metabolites. Red colors indicate upregulated metabolites, and blue colors indicate downregulated metabolites in RC compared with CC. (H) Pathway enrichment analysis of the differential metabolites between the two groups. Abbreviations: CC: colon cancer; RC: rectal cancer; PLS-DA: partial least squares discriminant analysis; FC: fold change; VIP: PLS-DA variable importance projection; SAM: significance analysis of micro arrays.

Prediction models to differentiate HC, CC, and RC groups

In order to improve predictive performance and to avoid collinearity and over-fitting, potential biomarkers were assessed by stepwise binary logistic regression in the training set.
The identified potential biomarkers were used to build prediction models via binary logistic regression, followed by validation. Ten-fold cross-validation on the training set was adopted as internal validation of each prediction model. An independent test set was also used to evaluate model performance.

A total of 7 metabolites, including Arg, Glu, Val, Gly/Ala, C2/C0, C10:2, and C14:2, were screened as potential biomarkers to distinguish CC from HC in the training set. A prediction formula was established to classify the two groups:

logit(P) = −1.9213 + 0.2704 × Arg + 0.0172 × Glu − 0.0367 × Val + 0.9310 × Gly/Ala + 5.7244 × C2/C0 − 2.4138 × C10:2 − 1.8907 × C14:2

ROC curves were drawn to evaluate the potential of this metabolic panel to distinguish CC from HC, as shown in Fig. 5A. The kappa value was then calculated to evaluate the consistency between predicted and actual results. As shown in Table 2, the area under the ROC curve (AUC) ranged from 0.9173 for the test set to 0.9543 for the training set. In the training set, the sensitivity and specificity were 0.8642 and 0.9340, respectively. The model also yielded good discrimination of CC patients from HC both in tenfold cross-validation (sensitivity: 0.8395; specificity: 0.9057) and in the independent test set (sensitivity: 0.9000; specificity: 0.8462). All the kappa values were higher than 0.6900 (p-value < 0.0001) (Table 2). Overall, these results demonstrated good performance of the metabolic profile in distinguishing CC from HC.

Fig. 5. ROC curves of the selected metabolites for HC, patients with CC, and patients with RC. (A) ROC curves of binary logistic regression for distinguishing patients with CC from HC. (B) ROC curves for distinguishing patients with RC from HC. (C) ROC curves for distinguishing RC from CC.
The blue line is for the training set, the red dashed line for tenfold cross-validation, and the green stars for the test set. Abbreviations: ROC: receiver-operating characteristic; HC: healthy control; CC: colon cancer; RC: rectal cancer.

Table 2. Performance of selected metabolites among HC, CC and RC groups

Comparison | Data set | AUC (95% CI) | Sensitivity | Specificity | Accuracy | Kappa statistic | P-value for kappa statistic
HC versus CC | Training set | 0.9543 (0.9274–0.9813) | 0.8642 | 0.9340 | 0.9037 | 0.8028 | < 0.0001
HC versus CC | Ten-fold cross validation | 0.9373 (0.8988–0.9759) | 0.8395 | 0.9057 | 0.8717 | 0.7371 | < 0.0001
HC versus CC | Test set | 0.9173 (0.8274–1.0000) | 0.9000 | 0.8462 | 0.8478 | 0.6922 | < 0.0001
HC versus RC | Training set | 0.9168 (0.8800–0.9536) | 0.7614 | 0.9151 | 0.8402 | 0.6735 | < 0.0001
HC versus RC | Ten-fold cross validation | 0.9027 (0.8567–0.9487) | 0.8068 | 0.8491 | 0.8247 | 0.6458 | < 0.0001
HC versus RC | Test set | 0.8566 (0.7447–0.9685) | 0.8182 | 0.8077 | 0.8125 | 0.6237 | < 0.0001
CC versus RC | Training set | 0.8221 (0.7590–0.8852) | 0.7159 | 0.8272 | 0.7633 | 0.5286 | < 0.0001
CC versus RC | Ten-fold cross validation | 0.8005 (0.7342–0.8668) | 0.7273 | 0.8025 | 0.7574 | 0.5161 | < 0.0001
CC versus RC | Test set | 0.8182 (0.6894–0.9470) | 0.8636 | 0.8000 | 0.8095 | 0.6182 | < 0.0001

HC, healthy control; CC, colon cancer; RC, rectal cancer; AUC, area under receiver-operating characteristic curve; CI, confidence interval.

Arg, Ser, C4-OH, Cit/Arg, C2/C0, C5DC/C5-OH, and C10:2, obtained from stepwise binary logistic regression, were used to build a prediction model to distinguish between RC and HC. The equation is as follows:

logit(P) = −3.8442 + 0.1546 × Arg + 0.0382 × Ser + 13.0867 × C4-OH − 0.2239 × Cit/Arg + 2.5836 × C2/C0 + 1.7459 × C5DC/C5-OH − 3.5051 × C10:2

As shown in Table 2 and Fig. 5B, the AUCs ranged from 0.8566 for the test set to 0.9168 for the training set. The tenfold cross-validated sensitivity and specificity were 0.8068 and 0.8491, respectively. Furthermore, sensitivity and specificity in the test set were 0.8182 and 0.8077, respectively.
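As an illustration of how the RC-versus-HC equation and the metrics in Table 2 relate, the following Python sketch converts logit(P) into a probability and recomputes sensitivity, specificity, accuracy, and Cohen's kappa from a confusion matrix. The coefficients are copied from the equation above. The function names are hypothetical, the metabolite inputs are assumed to be in the same units the authors used for fitting (not stated in this section), and the confusion-matrix counts for the test set (18 true positives, 4 false negatives, 21 true negatives, 5 false positives) are not reported by the authors; they are inferred here from the group sizes (22 RC, 26 HC) and the reported sensitivity and specificity.

```python
import math

# Coefficients copied from the RC-versus-HC equation in the text.
RC_VS_HC = {
    "Arg": 0.1546,
    "Ser": 0.0382,
    "C4-OH": 13.0867,
    "Cit/Arg": -0.2239,
    "C2/C0": 2.5836,
    "C5DC/C5-OH": 1.7459,
    "C10:2": -3.5051,
}
RC_VS_HC_INTERCEPT = -3.8442


def predicted_probability(sample, coefficients=RC_VS_HC,
                          intercept=RC_VS_HC_INTERCEPT):
    """Inverse-logit of the linear predictor; raises KeyError if a value is missing."""
    logit = intercept + sum(
        weight * sample[name] for name, weight in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Cohen's kappa from counts."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Inferred (not reported) counts for the RC-versus-HC test set.
metrics = binary_metrics(tp=18, fn=4, tn=21, fp=5)
print({name: round(value, 4) for name, value in metrics.items()})
```

With these inferred counts, the rounded values agree with the RC-versus-HC test-set row of Table 2 (sensitivity 0.8182, specificity 0.8077, accuracy 0.8125, kappa 0.6237), which suggests that the reported kappa is the usual two-class Cohen's kappa at the chosen classification cut-off.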
In addition, all the kappa values were higher than 0.6000 (p-value < 0.0001), which implied good consistency between predicted and actual results.

Glu, C0, C8, C20, Gly/Ala, and C10:1, obtained from stepwise binary logistic regression in the training set, were used to build a prediction model to distinguish between RC and CC. The equation of the binary logistic regression model is as follows:

logit(P) = −1.1075 − 0.0097 × Glu + 0.0562 × C0 + 8.9730 × C8 + 25.386 × C20 − 0.8515 × Gly/Ala + 7.6863 × C10:1

The performance of this model is shown in Table 2 and Fig. 5C. The AUCs were 0.8221, 0.8005, and 0.8182 in the training set, tenfold cross-validation, and test set, respectively. The sensitivity ranged from 0.7159 to 0.8636, and the specificity ranged from 0.8000 to 0.8272. Furthermore, all the kappa values were higher than 0.5000 (p-value < 0.0001).

Discussion

Given the different molecular characteristics and metastatic patterns of CC and RC, they are two distinct entities requiring different treatment strategies. Therefore, there is an urgent need to identify reliable and robust biomarkers specific to CC and RC for CRC diagnosis and management. Many studies have reported that metabolic reprogramming is a hallmark of cancer, is involved in cancer progression, and is even related to tumorigenesis^24. A biomarker panel for CRC was identified based on the altered urinary metabolites between patients with CRC and control samples^20. Despite an increasing number of studies in which potential biomarkers for CRC are identified via a metabolomics approach^25, these studies seldom focus on the differences in metabolic profiles between CC and RC. We conducted a metabolomics study combining DBS sampling and MS technology to systematically define metabolic profiles for CC and RC based on detected amino acids, carnitine/acylcarnitines and related ratios. Altered metabolic profiles were observed among HC, CC and RC patients.
After systematic selection, metabolite panels were screened to differentiate the three groups. We then validated defined metabolites using a binary logistic regression approach. A panel of 7 metabolites was used for building a prediction model and had good sensitivity, specificity, and consistency in distinguishing CC from HC. In addition, after systematically profiling metabolic changes in HC and patients with RC, a panel of 7 metabolites was screened to construct a prediction model, which showed satisfactory prediction ability (AUCs ranging from 0.8566 to 0.9168). Furthermore, 6 metabolites were identified as potential biomarkers with a good ability to distinguish RC from CC (AUCs ranging from 0.8005 to 0.8221).

Amino acids play vital roles in cellular metabolism^25 and are responsible for the formation of various components involved in cell proliferation^26. In particular, targeting amino acid metabolism may be a promising strategy for cancer therapy, which suggests the critical role of amino acid metabolism in cancer^27. Metabolic profile analysis revealed diverse significantly dysregulated pathways in CC and RC, among which Asp metabolism and Arg and Pro metabolism emerged as significantly altered pathways in patients with these diseases when compared with HC.

Arg was used to construct a prediction model for distinguishing CC from HC in the present study. As a semi-essential amino acid, Arg levels were significantly upregulated both in CC patients and in RC patients compared with HC. Arg can be generated by arginine succinate synthetase (ASS1) and arginine succinate lyase (ASL). The expression of ASS was significantly increased^28, and ASL was overexpressed in CRC patients^29. The altered expression of these two enzymes may contribute to upregulated Arg biosynthesis. In addition, overexpressed Arg transporter CAT-1 and human member 14 of the solute carrier family 6 (SLC6A14) also influence the intracellular Arg levels.
The Arg metabolism pathway has been shown to be hyperactive in CRC patients, and Arg supplementation could increase the risk of CRC^[110]30. Once Arg metabolism is disrupted, malignancy can easily occur^[111]31. These findings suggest that altered Arg metabolism is involved in the CRC process. The increased levels of Arg also changed the Cit/Arg ratio in CC patients compared with that in RC patients. An altered ratio has also been observed in patients with other cancers^[112]32. Cit, a direct precursor of Arg, is closely coupled with Arg metabolism. Cit can be obtained via the oxidation of Arg by nitric oxide synthase (NOS), and altered NOS expression in cancer may further lead to an imbalance between Cit and Arg^[113]32.

We also found altered levels of Glu in patients with CC compared with HC, and Glu was used in building the model for distinguishing CC from RC. As a nonessential amino acid, Glu is not only an important bioenergetic substrate for the proliferation of normal and neoplastic cells but also a potential growth factor for tumour development^[114]33. Altered Glu metabolism has been observed in cancer patients^[115]34, and increased levels of Glu in CC patients compared with HC and RC patients were found in our study. The utilization of Glu is considerably greater in cancer cells than in normal human tissue cells^[116]35. Glu has also been found to be involved in glutaminolysis for energy generation^[117]36. Glu can be converted by glutamate dehydrogenase (GLUD) or a group of transaminases into α-ketoglutarate, an intermediate metabolite in the tricarboxylic acid (TCA) cycle. Taken together, the increased levels of Glu in CC patients relative to HC and RC patients may indicate the high energy demand of CC.

Val, a branched-chain amino acid (BCAA), was also selected as a potential biomarker to construct a prediction model to distinguish CC from HC. BCAAs are reportedly associated with cancer progression^[118]37.
3-Hydroxyisobutyryl-CoA hydrolase (HIBCH) has been found to play critical roles in CRC. High expression of HIBCH has been linked to poor survival in patients with CRC and is associated with increased cell growth, decreased autophagy, and apoptosis resistance in CRC cells^[119]38. This enzyme is a vital mitochondrial protein in Val catabolism and catalyses the conversion of 3-hydroxyisobutyryl-CoA to 3-hydroxyisobutyrate^[120]39. This product can then be converted to succinyl-CoA and enter the TCA cycle. These findings indicate that high expression of HIBCH may change the levels of Val and further affect CC energy metabolism.

The Gly/Ala ratio was greater in CC patients than in HC and RC patients. Gly and Ala have shown potential as markers for tumour stage, recurrence, and likelihood of rapid recurrence, and the Gly/Ala ratio has been shown to differ between primary and recurrent meningiomas^[121]40. Gly is involved in one-carbon metabolism, and the Gly cleavage system (GCS) can supply one-carbon units to cancer cells^[122]41. In addition to being converted to α-ketoglutarate, Glu can also be converted to other amino acids, such as Ala. Altered Glu metabolism may therefore also influence Ala levels. Thus, an altered Glu/Ala ratio may indicate a need for fuel to support CC cell growth and further support the disordered Glu metabolism in CC patients compared with HC and RC patients.

In this study, abnormalities in Ser metabolism were observed in RC patients compared with HC. Ser metabolism plays a critical role in tumour cell proliferation and survival. The oxidation of 3-phosphoglycerate to 3-phosphonooxypyruvate, catalysed by phosphoglycerate dehydrogenase (PHGDH), is the first committed step of de novo Ser biosynthesis^[123]42. Higher mRNA levels of PHGDH have been reported in CRC patients^[124]42, which may influence Ser levels in RC patients.
Carnitine, an essential quasi-vitamin in human metabolism, is regarded as a biomarker of mitochondrial function and facilitates fatty acid β-oxidation in the mitochondria as well as in peroxisomes^[125]43,[126]44. Because carnitine transports fatty acids into the mitochondrial matrix, carnitine/acylcarnitine levels could reflect disordered long-chain fatty acid β-oxidation^[127]45,[128]46, and the β-oxidation of long-chain fatty acids is an essential source of energy in cancer cells^[129]47. This study revealed decreased levels of C14:2 in CC patients compared with HC and decreased levels of C20 in CC patients compared with RC patients; these two metabolites were therefore used to build the respective prediction models. The decrease in long-chain acylcarnitines in CC patients compared with HC and RC patients indicated that more fatty acids were needed to enter mitochondria to supply the energy that sustains CC cell survival. In the interconversion of fatty acyl-CoAs and acylcarnitines, the carnitine palmitoyltransferase (CPT) enzyme system is considered rate-limiting and could influence the efficiency of β-oxidation. Carnitine palmitoyltransferase 1A (CPT1A), a key enzyme in the fatty acid oxidation pathway, has been reported to be upregulated in CC cells to support cancer cell growth. Increased CPT1A levels may cause the consumption of long-chain acylcarnitines in CC cells^[130]48. In the acylcarnitine profiles, the levels of the medium-chain acylcarnitine C10:2 in CC patients and of the short-chain acylcarnitine C4-OH and the medium-chain acylcarnitine C10:2 in RC patients were significantly altered compared with those in HC. CPT2, a key enzyme that catalyses the conversion of acyl groups from acylcarnitine to acyl-CoA^[131]49, is downregulated in CRC cell lines and tissues^[132]50, which may further influence the carnitine shuttle system and result in altered production of short- and medium-chain acylcarnitines.
The C2/C0 ratio has been regarded as an indicator of β-oxidation of even-numbered fatty acids^[133]51. An increased C2/C0 ratio was observed in this study. This result is in line with other studies^[134]52 and suggests altered β-oxidation of even-numbered fatty acids in CC and RC patients. Hence, these changes in carnitine/acylcarnitines may be indicative of disordered fatty acid oxidation in CC and RC patients.

This study has some limitations. First, it was a single-centre case-control study, and a multi-institution study with a larger sample size is needed to further validate the results. Second, CRC commonly develops from focal changes within benign or precancerous polyps. We believe that this study would be more comprehensive if an adequate number of patients with polyps were recruited. Third, only a limited number of metabolites were assessed in this study because of the coverage-cost trade-off, and fatty acids remain to be assessed for their potential as biomarkers. Finally, more patients with CC or RC need to be recruited to perform a metabolomics analysis of patients at different stages of these diseases.

Conclusion

A combination of DBS sampling and direct infusion MS was applied to detect metabolites in healthy individuals, patients with CC, and patients with RC. Our findings demonstrate significant metabolic differences between CC and RC. Metabolic biomarker panels distinguished among HC, CC, and RC with satisfactory sensitivity and specificity. Thus, the selected metabolites have the potential to be used as novel biomarkers for discriminating between CC and RC.

Electronic supplementary material

Below is the link to the electronic supplementary material.
Supplementary Material 1 (15.7 KB, xlsx)
Supplementary Material 2 (17.3 KB, xlsx)
Supplementary Material 3 (309.8 KB, docx)

Abbreviations

CRC: Colorectal cancer
CC: Colon cancer
RC: Rectal cancer
HC: Healthy control
Glu: Glutamic acid
C8: Octanoylcarnitine
C20: Arachidic carnitine
Gly: Glycine
Ala: Alanine
C0: Free carnitine
CEA: Carcinoembryonic antigen
LC‒MS: Liquid chromatography‒mass spectrometry
DBS: Dried blood spot
QC: Quality control
PCA: Principal component analysis
PLS-DA: Partial least squares discriminant analysis
VIP: Variable importance in projection
FDR: False discovery rate
FC: Fold change
SAM: Significance analysis of microarrays
ROC: Receiver operating characteristic
AUC: Area under ROC curve
Val: Valine
C2: Acetylcarnitine
Arg: Arginine
Ser: Serine
C4-OH: Hydroxybutyrylcarnitine
Cit: Citrulline
C5DC: Glutarylcarnitine
C10:2: Decadienoylcarnitine
ASS1: Argininosuccinate synthetase
ASL: Argininosuccinate lyase
NOS: Nitric oxide synthase
GLUD: Glutamate dehydrogenase
TCA: Tricarboxylic acid
BCAA: Branched-chain amino acid
GCS: Glycine cleavage system
PHGDH: Phosphoglycerate dehydrogenase
CPT1A: Carnitine palmitoyltransferase 1A

Author contributions

X.W. wrote the main manuscript text. Q.Y., L.L., and X.W. performed metabolite detection and statistical analysis. P.Y. and Z.Z. supervised data analysis and manuscript preparation. All authors reviewed the manuscript.

Funding

This study was supported by grants from the Guizhou Provincial Basic Research Program (Grant No. ZK[2022] General 494), the Guizhou Provincial Basic Research Program (Grant No. ZK[2021] General 503), and the Science and Technology Foundation of Health Commission of Guizhou Province (Grant No. [2023]3-gzwkj2023-188).

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Declarations

Competing interests

The authors declare no competing interests.
Institutional review board statement

This study was approved by the Ethics Committee of the First Affiliated Hospital of Jinzhou Medical University.

Informed consent

This study was conducted in accordance with the principles of the Declaration of Helsinki. Written informed consent was obtained from each research participant.

Footnotes

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Xue Wu and Qi Yang contributed equally to this work.

Contributor Information

Peng Yang, Email: 1874660968@qq.com.
Zhitu Zhu, Email: zhituzhu12@163.com.

References