Abstract Cancer is widely regarded as a leading cause of death in humans, with colon adenocarcinoma (COAD) ranking among the most prevalent types. Cuproptosis is a novel form of cell death mediated by protein lipoylation. Cuproptosis-related genes (CRGs) participate in tumourigenesis and development. Their role in pan-cancer and COAD require further investigation. This study comprehensively evaluated the relationship among CRGs, pan-cancer, and COAD. Our research revealed the differential expression of CRGs and the cuproptosis potential index (CPI) between normal and tumour tissues, and further explored the correlation of CRGs or CPI with prognosis, immune infiltration, tumor mutant burden(TMB), microsatellite instability (MSI), and drug sensitivity in pan-cancer. Gene set enrichment analysis (GSEA) revealed that oxidative phosphorylation and fatty acid metabolism pathways were significantly enriched in the high CPI group of most tumours. FDX1 and CDKN2A were chosen for further exploration, and we found an independent association between FDX1 and CDKN2A and prognosis, immune infiltration, TMB, and MSI in pan-cancer. Furthermore, a prognostic risk model based on the association between CRGs and COAD was built, and the correlations between the risk score and prognosis, immune-related characteristics, and drug sensitivity were analysed. COAD was then divided into three subtypes using cluster analysis, and the differences among the subtypes in prognosis, CPI, immune-related characteristics, and drug sensitivity were determined. Due to the level of LIPT1 was notably positive related with the risk score, the cytological identification was carried out to identify the association of LIPT1 with proliferation and migration of colon cancer cells. In summary, CRGs can be used as potential prognostic biomarkers to predict immune infiltration levels in patients with pan-cancer. In addition, the risk model could more accurately predict the prognosis and immune infiltration levels of COAD and better guide the direction of clinical medication. Thus, FDX1, CDKN2A, and LIPT1 may serve as prospective new targets for cancer therapy. Keywords: Cuproptosis, FDX1, CDKN2A, LIPT1, Pan-cancer, Colon adenocarcinoma 1. Introduction Cancer is the leading cause of death and a significant economic burden worldwide [[45]1,[46]2]. Globally, there were approximately 23.6 million new cancer cases and 10 million deaths from cancer in 2019, with increases of 26.3 % and 20.9 %, respectively, from 2010 to 2019 [[47]3]. Colon adenocarcinoma (COAD) is one of the most common cancers worldwide, along with lung, liver, and breast cancers. Neoadjuvant chemotherapy, radiotherapy, immunotherapy, and targeted therapy have shown strong anticancer effects in clinical practice [[48][4], [49][5], [50][6], [51][7]]. While advancements in treatment options and early screening have contributed to a gradual decrease in the mortality rate of COAD, unfortunately, the incidence of COAD continues to rise [[52]8,[53]9]. In addition, drug resistance and cancer metastasis pose many challenges in developing effective cancer treatments. Therefore, identifying more effective therapeutic targets for early detection and prognosis prediction is important to improve the prognosis of patients with cancer. Cuproptosis, a novel form of cell death discovered by Tsvetkov et al., and cell death is induced by intracellular copper [[54]10,[55]11]. The mechanism of cuproptosis involves the binding of excess intracellular copper to fatty acylated proteins in the tricarboxylic acid (TCA) cycle within the mitochondria. These lipoylated copper-bound proteins aggregate and the Fe–S cluster protein is downregulated, triggering proteotoxic stress and eventual cell death [[56]10,[57][12], [58][13], [59][14]]. Copper is an indispensable cofactor and its homeostasis is essential for many physiological reactions. Dysregulation of intracellular copper bioavailability can lead to oxidative stress and cytotoxicity [[60]15,[61]16]. One study showed that cytoproptosis-related cell death is mainly achieved by enhancing reactive oxygen species (ROS) levels, which makes cancer cells more susceptible to increased oxidative stress [[62]17]. Studies have shown that Cuproptosis-related genes(CRGs) can predict the prognosis and immune cell infiltration in various cancers, including renal clear cell carcinoma, hepatocellular carcinoma, soft tissue sarcomas, melanoma, osteosarcoma, head and neck squamous cell carcinoma, glioma and lung adenocarcinoma [[63][18], [64][19], [65][20], [66][21], [67][22], [68][23], [69][24], [70][25], [71][26], [72][27], [73][28], [74][29], [75][30]]. Currently, there is a lack of pertinent literature available to study the connection between CRGs and their impact on pan-cancer and specifically COAD. In the present study, we comprehensively analysed the mRNA levels of ten CRGs (FDX1, LIPT1, LIAS, DLD, PDHA1, PDHB, DLAT, MTF1, GLS, and CDKN2A) and investigated the correlation of CRGs or cuproptosis potential index (CPI) with prognosis, immune infiltration, TMB, MSI, and drug sensitivity. Furthermore, the association between the expression of CRGs and gene methylation, pathway enrichment analysis of the ten CRGs, and the mutational landscape (including CNV and SNV) of CRGs in pan-cancer were explored. Due to the role of FDX1 and CDKN2A in some cancers have been studied, FDX1 and CDKN2A were chosen for further exploration to analyse the association between FDX1 and CDKN2A and prognosis, immune infiltration, TMB, and MSI in pan-cancer. Besides, we constructed a prognostic risk model and investigated the association between the risk score and prognosis, immune cell infiltration, immune checkpoint expression, and drug sensitivity in COAD. COAD was then divided into three subtypes using cluster analysis, and the differences among the subtypes in prognosis, CPI, immune-related characteristics, and drug sensitivity were determined. Finally, we verified the association of genes in the risk model with the proliferation and migration of colon cancer cells in vitor. In conclusion, CRGs can be used as prognostic biomarkers to predict immune cell infiltration levels in patients with pan-cancer. In addition, the risk model could more accurately predict the prognosis and immune infiltration levels of COAD and better guide the direction of clinical medication. Besides, FDX1, CDKN2A and LIPT1 may serve as novel targets for cancer therapy. 2. Materials and methods 2.1. Download transcriptome and clinical data The Fragments per kilobase million (FPKM) transcriptome data including 33 cancers were downloaded from The Cancer Genome Atlas (TCGA) database. Then, sample annotations and survival information from "[76]https://xenabrowser. net/datapages/." were downloaded and collated. The CPI value was calculated using single-sample gene set enrichment analysis (and Gene Set Variation Analysis (GSVA). 2.2. Differential expression analysis The differential expression levels of CRGs and CPI were calculated in 14 cancers compared to normal tissues (only in 14 cancers was the number of paired normal tissues more than 10, and other cancers were excluded), including BLCA, ESCA, KICH, LICH, LUSC, LUAD, STAD, BRCA, COAD, HNSC, KIRC, PRAD, KIRP, and THCA. Wilcox tests were used to assess the differentially expressed CRGs and CPI in tumour tissues compared with those in normal tissues. The criteria of |log2(fold change)| > 2/3 and P < 0.05 were regarded as statistically significant. The "pheatmap" R package was used to cluster the differential genes. 2.3. Gene set enrichment analysis and survival analysis GSEA was carried out by using "cluster Profiler" and GSVA. The "ggplot2″ is mainly used for image visualisation. The value of –log[10](FDR) represents the relationship between the expression of the CRGs and the enrichment pathway. We considered the false discovery rate (FDR) to be < 0.05, which was considered significant. The "survminer" R package was used by Kaplan-Meier curves, log-rank tests, and univariate Cox proportional hazards regression to calculate the p values and hazard ratio (HR) with 95 % confidence interval (CI). HR is the hazard ratio of the high-risk group to the low-risk group. If the HR values were <1, the genes were considered protective, and HR values greater than 1 indicated that the gene was a risk factor. 2.4. Gene methylation and gene variation analysis The association between the expression level of CRGs with gene methylation was assessed by Spearman correlation coefficient. The gene mutation data of patients with pan-cancer from the TCGA database was downloaded, and the variation of CNV (copy number variation) and SNV (single nucleotide variation) among distinct tumours were analysed by the "maftools" R package. 2.5. Association of CPI with TMB and MSI Pearson correlation coefficient was used to assess the relationships between tumour mutation burden (TMB) and Microsatellite Instability (MSI) with CPI in pan-cancer patients and visualised as radar plots. The overall number of coding errors of somatic genes, base substitutions, and gene insertions or deletions per million bases was defined as the TMB. The TMB value for each tumour sample was calculated based on exome sequencing data from TCGA. MSI has been primarily discussed as a deficient mismatch repair system. Previous studies were collected to summarise MSI data across different cancers. 2.6. Immune-related characteristics and drug sensitivity analysis Immune infiltration data according to CIBERSORT, and immune-related signature scores were calculated using ssGSEA for pancreatic cancer. The correlation between CPI and various immune-related characteristics was calculated by Spearman correlation coefficient in pan-cancer patients using the "MCPcounter" R package. The GDSC and CTRP databases were used to investigate the relationship between CRGs expression and drug sensitivity using Spearman's correlation coefficient. Image visualisation is mainly presented by the "ggplot2″ R package. 2.7. Construction of risk model in COAD and ROC curves CRGs were characterised using the LASSO Cox regression algorithm, and the selected genes were used to establish a prognostic risk model. The risk scores of all pan-cancer patients were calculated based on this formula: [MATH: riskScore=i=1nCoefi*xi :MATH] (Coefi is coefficient, xi is expression per gene). The cutoff point was the median risk score. Subsequently, we divided the pan-cancer patients into high- and low-risk groups. The Kaplan-Meier "survival" R package was used to plot the ROC curve, and the area under the curve (AUC) was used to judge the credibility of the prediction. 2.8. Immune-related characteristics and drug sensitivity analysis in COAD Patients with COAD were divided into high- and low-risk groups based on median risk scores. The Wilcoxon test was used to analyse the differences in immune cell infiltration, immune checkpoint expression, and immune response (TIDE score) between high- and low-risk populations, and the IMgor210 immunotherapy cohort was used to validate the risk model. The R package "pRRophetic" was used to analyse the difference between the high- and low-risk group regarding chemotherapeutic drug sensitivity. 2.9. Cluster analysis and subgroups analysis The "ConsensusClusterPlus" R package was used to divide the COAD patients into subtypes through consistent cluster analysis based on the CRGs' expression. Differences in survival, CPI, immune cell infiltration, immune checkpoint expression, and immune response between subgroups were analysed using chi-square tests. COAD cell lines from the GDSC database were grouped according to their COAD subtypes. The drug sensitivity was compared among subgroups, and the differences in drug sensitivity were quantified using AUC by Kruskal tests. 2.10. qRT-PCR Total RNAs were extracted using TRIzol reagent (Thermo Fisher Scientific, USA) according to the manufacturer's advised protocol. A NanoDrop spectrophotometer was used to determine the quality and quantity of the RNA samples. A PrimeScript RT reagent kit with a genomic DNA eraser (TaKaRa, Tokyo, Japan) was used for cDNA synthesis. SYBR qPCR Master Mix (TaKaRa, Tokyo, Japan) was used for qRT-PCR. The relative mRNA expression level was computed according to the 2^−ΔΔCt approach. All primers were synthesised by Henan Qingke BioCompany (China). All experiments were performed in triplicates. The primer sequences of qRT-PCR are shown in [77]Table S1. 2.11. Transwell and clone assays A transwell assay was used to evaluate cell migration ability. For the transwell assay, SW620, HCT116 and HT29 cells were suspended in a serum-free medium. The concentration of cells was adjusted to 2 × 105 cells/mL, and the 200 μL cell suspension was subsequently seeded into the upper side of a transwell chamber (8 μm pore size; Costar, Cambridge, MA). Then, 600 μl DMEM complete medium containing 10 % FBS was added into the lower chambers. After incubation for 36h, the non-migratory cells on the upper side of the Transwell chambers were gently swabbed. Subsequently, the cells in the transwell chambers were fixed with 4 % paraformaldehyde and stained with 0.1 % crystal violet. Finally, the chambers were washed three times with PBS, air-dried, and photographed using an inverted microscope (Leica MZ8; Leica Microsystems, Wetzlar, Germany). A clone assay was used to evaluate cell proliferative ability. For the cloning assay, the cells were suspended and counted, and 1000 cells were seeded into 6-well plates (Corning, USA), followed by 10 days of incubation. Finally, cells were fixed with 4 % paraformaldehyde and stained with 0.1 % crystal violet. 2.12. Statistical analysis All statistical analyses were performed using R 4.1.1 and GraphPad Prism 8.0. A P-value less than 0.05 was regarded as statistically significant, and the P-value was two-sided. The R package of "cluster Profiler" and GSVA were performed to analyse the GSEA, and FDR value < 0.05. Wilcox tests were used to assess the differentially expressed CRGs and CPI in tumour tissues compared with those in normal tissues. The criteria were |log2(fold change) | > 2/3 and P < 0.05. 3. Results 3.1. The expression level of CPI and gene set enrichment analysis based on CPI in pan-cancer We chose tumours in which several paired normal samples from more than 10 patients, and 14 tumours satisfied the requirement and were analysed to determine the difference in CPI between normal and tumour tissues. The results showed that CPI was higher in KICH, LIHC, LUSC, LUAD, and STAD tumours and lower in BRCA, COAD, HNSC, KIRC, KIRP, and THCA tumours (P < 0.05) ([78]Fig. 1A). Fig. 1. [79]Fig. 1 [80]Open in a new tab CPI difference and pathway enrichment analysis of CRGs. (A) The CPI difference between normal and tumour tissues. (B) The differential expression of pathways between high and low CPI groups. (C/D) The situation of each CRGs enrichment in the two oxidative phosphorylation pathways and fatty acid metabolism, respectively. The size of the circle represents the degree of correlation. GSEA was performed for each tumour based on the transcriptomes of the two tumour groups with the highest 30 % and the lowest 30 % CPI to investigate the different pathways between the high and low CPI groups in the 14 tumours. We found that the oxidative phosphorylation and fatty acid metabolism pathways were significantly enriched in the high CPI group of most tumours, including ESCA, HNSC, KICH, KIRP, LIHC, LUAD, and THCA ([81]Fig. 1B). In addition, based on the expression of CRGs, ssGSEA was performed in pan-cancer patients, and the analysis showed the situation of gene enrichment in different pathways. Two pathways, oxidative phosphorylation and fatty acid metabolism, were significantly enriched in the FDX1 and PDHB high-expression groups ([82]Fig. 1C and D). The results showed that tumour metabolism was stronger in the high CPI group, and the expression levels of FDX1 and PDHB were positively correlated with these two pathways. However, the related mechanisms of FDX1 and PDHB metabolism need to be studied further. 3.2. Differential expression of CRGs in pan-cancer Among the 10 CRGs, CDKN2A was highly expressed in the tumour tissues, whereas almost all the other nine genes showed low expression in the tumour groups ([83]Fig. 2A). Previous studies have shown that FDX1 could affect the prognosis and mediate the metabolism of LUAD, and the expression of CDKN2A and FDX1 in clear cell renal cell carcinoma was also correlated with immune infiltration levels and PD-1 expression [[84]31,[85]32]. Thus, we focused on these two CRGs and analysed their differences in expression in the tumour groups ([86]Figs. S1A and B). We found that the expression of FDX1 was lower in COAD, LUAD, THCA, LIHC, HNSC, KIRC, and KIRP samples than in normal samples, whereas that of CDKN2A was distinctly higher in COAD, BRCA, LUAD, THCA, LIHC, KICH, KIRC, and KIRP samples. Fig. 2. [87]Fig. 2 [88]Open in a new tab Expression levels of CRGs and the association with tumour prognosis and the difference of CPI between subtypes of tumours. (A)The heatmap with expression levels of CRGs. (B) The heatmap of correlation between CRGs and tumour prognosis. (C) The difference of CPI between clinical subtypes of tumours. 3.3. Relationships between CRGs with tumour prognosis and the different CPI among subtypes of tumours By analysing the association of CRGs with tumour prognosis, we found that CRGs play different roles in the prognosis of different tumours. For example, FDX1 could serve as a protective factor in LICH and KIRC, while it is a risk factor in LGG. CDKN2A could act as a risk factor in LICH, KIRC, THCA, KICH, and ACC, while it is a protective factor in HNSC ([89]Fig. 2B). Therefore, CRGs may serve as prognostic indicators. We obtained tumours with clinical subtypes (BLCA, BRCA, COAD, GBM, HNSC, KIRC, LUAD, LUSC, and STAD) and analysed the differences in the CPI among the tumour subtypes. We found that the CPI significantly differed among the BRCA, GBM, HNSC, KIRC, and STAD subtypes ([90]Fig. 2C). 3.4. The mutational landscape of CRGs in pan-cancer We analysed 33 types of tumours to identify the CNV levels in the CRGs. We found that CDKNZA, PDHB, and other genes have a large proportion of copy number deletions in various cancers and that DLD has a large degree of copy number amplification in almost all cancers. In addition, each CRG had many amplified copies in OV, UCS, and LUAD ([91]Fig. 3A). We analysed 10088 cancer patients, including 31 types of tumours, to identify the level of single nucleotide variation and found that CDKN2A was mutated at a higher frequency in tumours (SKCM, HNSC, PAAD, and LUSC), whereas in UCEC, CRGs were all mutated to some extent ([92]Fig. 3B). Fig. 3. [93]Fig. 3 [94]Open in a new tab Mutational landscape of CRGs and the correlation of CPI with immune variables in pan-cancer and the association of expression level with methylation of CRGs. (A)The copy number variation of CRGs in pan-cancer. (B) The single nucleotide variation of CRGs in pan-cancer. (C) Correlation of CPI with TMB in pan-cancer. (D) Association of CPI with MSI in pan-cancer. (E) Correlation between mRNA expression and methylation of CRGs. 3.5. Correlation of CRGs with TMB and MSI in pan-cancer TMB and MSI can predict the effects of immunotherapy in clinical settings. The higher the TMB and MSI values, the better the effect of immunotherapy. Thus, in this study, we analysed the association of CPI with TMB and MSI to predict the therapeutic direction. The radar chart shows that CPI is positively correlated with TMB in BRCA, BLCA, STAD, SKCM, PRAD, and LUAD, but is negatively associated with TMB in UCEC ([95]Fig. 3C); CPI is positively correlated with MSI in STAD, KIRC, and DLBC, while it is negatively correlated with MSI in THCA, PRAD, OV, LUSC, and COAD ([96]Fig. 3D). We also analysed the relationship of FDX1 and CDKN2A with TMB and MSI and found that FDX1 was significantly associated with TMB and MSI in DLBC and STAD ([97]Fig. S2A). In the ACC, CDKN2A had a significant positive correlation with TMB, and CDKN2A had a positive correlation with MSI in KICH ([98]Fig. S2B). 3.6. Association between mRNA expression and methylation of CRGs As shown in [99]Fig. 3E, in most cancers, CRG expression is negatively correlated with methylation. For example, the expression of LIPT1, LIAS, CDKN2A, and DLAT negatively correlated with methylation in TGCT, PRAD, TGCT, ACC, STAD, BLCA, STAD, and PRAD. 3.7. Correlation of CRGs with immune-related features in pan-cancer According to previous studies, we have known that the relationship of CRGs with immune infiltration has been researched in many tumours, such as osteosarcoma, hepatocellular carcinoma, HNSC, soft tissue sarcoma, KIRC, glioma, and cutaneous melanoma [[100][22], [101][23], [102][24], [103][25], [104][26], [105][27],[106]29]. Therefore, we also analysed the association between CPI and immune infiltration in pan-cancers. We found that CPI had a negative correlation with immune-related features such as infiltration score, CD4^+ T, NK, gammadelta, Tfh, and CD8^+ T, but a positive correlation with immune-related features such as nTreg, effector memory, neutrophil, Th1, and DC in pan-cancer ([107]Fig. 4A). We further investigated the relationship between FDX1 and CDKN2A and immune infiltration in pan-cancer. Obtaining the data according to CIBERSORT, we found that in most cancers, FDX1 was positively interrelated with regulatory T cells ([108]Fig. S3A); CDKN2A was positively associated with memory CD4^+ T cell resting ([109]Fig. S3B). Fig. 4. [110]Fig. 4 [111]Open in a new tab Correlation of CPI with immune-related features in pan-cancer and relationship of CRGs with drug sensitivity in GDSC and CTRP database. (A) Association of CPI with immune-related features in pan-cancer. (B) Correlation of CRGs and drug sensitivity in GDSC database. (C) Relationship of CRGs and drug sensitivity in CTRP database. 3.8. Correlation between CRGs and drug sensitivity We obtained the TOP30 drugs from the GDSC database and found that CDKN2A was positively correlated with almost all drug susceptibilities, such as Bleomycin, PD-0332991, Nutlin-3a(−), whereas LIAS, PDHB, and GLS were negatively correlated with drug sensitivity, mainly including Nutlin-3a(−) and CEP-701 ([112]Fig. 4B). Among the TOP30 drugs obtained from the CTRP database, we found that LIAS was significantly negatively associated with almost all drug sensitivities; MK-1775, tivantinib, COL-3, and other CRGs (including LIPT1, LIAS, DLD, PDHB, DLAT, MTF1, and CDKN2A) were also negatively correlated with almost all drug sensitivities ([113]Fig. 4C). Therefore, CRG expression of CRGs can predict the direction of clinical medication. 3.9. Established a prognostic risk model in COAD based on CRGs COAD is one of the leading cancers worldwide, and its incidence is high at present; however, no relevant article has been published on the correlation of CRGs with COAD. Therefore, we developed a risk model to investigate the relationship between CRGs and prognosis, immune-related characteristics, and drug sensitivity. Genes were screened by utilising LASSO Cox regression ([114]Fig. 5A), and the partial likelihood deviance curve was plotted versus log (λ) ([115]Fig. 5B). Finally, four key CRGs were identified to build the risk model and calculate the risk score of patients with COAD according to the following algorithm: risk score = (0.1345) × LIPT1+(−0.075) × PDHB+(−0.315) × DLAT+(0.1208) × CDKN2A. [116]Fig. 5C shows the relationship between the risk score and survival status. The top represents the scatter plot of the risk score from low to high, and different colours represent different risk groups; the middle represents the scatter plot of the association of the risk score with survival time and survival status, indicating that the risk score is negatively correlated with survival time and survival status. The bottom is a heatmap that represents the situation of gene expression contained in the risk model. The protective CRGs have a higher expression, whereas the risky CRGs have a lower expression in the low-score group. KM survival curves were plotted, in which the different groups were tested using the log-rank test, and the OS was shorter in the high-risk group than in the low-risk group ([117]Fig. 5D). The ROC curves of the risk model at 1, 3, and 5 years were drawn, and the AUC values were 0.654, 0.689, and 0.697, respectively, indicating that the accuracy of the risk model was higher ([118]Fig. 5E). Fig. 5. [119]Fig. 5 [120]Open in a new tab A prognostic risk model was established in COAD. (A)The LASSO Cox regression for ten CRGs. (B) The partial likelihood deviance curve versus log (λ). (C) The top represents the scatter plot of the risk score from low to high, and different colours represent different risk groups; the middle represents the scatter plot of about the association of the risk score with survival time and survival status; the bottom is a heatmap that represents the situation of gene expression contained in the risk model. (D) KM survival curves in high- and low-risk score groups. (E) The ROC curves of the risk model at 1, 3, and 5 years. (For interpretation of the references to colour in this figure legend, the