Abstract

Gastric cancer (GC) remains a prevalent and aggressive malignancy with a poor prognosis. This study aimed to identify diagnostic and prognostic biomarkers and to explore their potential functions in GC. A total of 598 upregulated and 506 downregulated genes were identified in GC patients. Among these, survival-related differentially expressed genes (DEGs), including ADAMTS12, F5, and VCAN, were highlighted. Pan-cancer analyses revealed their dysregulation across multiple tumor types. A novel prognostic signature, incorporating F5 and VCAN, stratified GC patients into low- and high-risk groups, demonstrating significant differences in overall survival and robust predictive performance. ADAMTS12, strongly associated with advanced clinical stages and poor prognosis, was validated in an independent cohort and exhibited promising diagnostic potential. RT-PCR and western blot analyses confirmed its high expression in GC tissues and cell lines. Functional assays further demonstrated that ADAMTS12 promotes GC cell proliferation and invasion. In summary, this study provides critical insights into the molecular landscape of GC, offering a potential prognostic tool and therapeutic target.

Supplementary Information

The online version contains supplementary material available at 10.1007/s12672-024-01724-4.

Keywords: Gastric cancer, Biomarker, ADAMTS12, Prognostic signature

Introduction

According to GLOBOCAN 2020, gastric cancer (GC) is the fifth most common cancer worldwide and the fourth leading cause of cancer-related death [1, 2]. In 2020, over one million individuals were diagnosed with GC, and more than 750,000 died of the disease globally. The incidence of GC exhibits significant regional variation.
According to World Health Organization (WHO) statistics, Asian countries, particularly China, Japan, and South Korea, are recognized as high-incidence regions for GC [3, 4]. In contrast, the incidence of GC is relatively low in North America and Europe. Age and gender also influence GC incidence. The disease is generally more prevalent among middle-aged and elderly individuals, particularly those aged 60 and older [5]. Currently, the tumor-node-metastasis (TNM) staging of postoperative specimens is a widely used clinical tool for predicting long-term survival outcomes in patients with gastrointestinal cancer [6]. However, research has revealed several significant limitations of the TNM staging method in clinical practice [7]. Therefore, identifying novel biomarkers to predict the long-term prognosis of GC patients and stratify their risk is critically important. Such advancements would enable more precise, individualized treatment approaches, addressing a key challenge in GC management.

Messenger RNA (mRNA) plays a crucial role in cancer research, particularly in the prediction and diagnosis of tumor prognosis [8]. Differential mRNA expression levels serve not only as potential biomarkers for early cancer detection and prognosis prediction but also as valuable tools for guiding individualized treatment strategies by distinguishing high-risk patients from low-risk groups [9]. In-depth analysis of mRNA in tumor tissues enables researchers to identify expression patterns linked to patient survival and disease progression, offering critical insights for more accurate prognostic assessments [10, 11]. Moreover, mRNA research facilitates the identification of tumor subtypes, providing valuable insights into the biological characteristics of tumors and their clinical manifestations.
Most importantly, mRNA analysis supports the discovery of potential therapeutic targets, laying a theoretical foundation for developing novel treatment strategies and driving progress in personalized medicine and precision therapy [12–14].

The Cancer Genome Atlas (TCGA), a comprehensive cancer genomics research initiative, offers an extensive repository of molecular data across various levels, including the genome, transcriptome, and epigenome, to support cancer research. The primary goal of TCGA is to lay the groundwork for personalized cancer treatment by conducting detailed molecular analyses to uncover the unique molecular characteristics of different cancer types [15, 16]. TCGA plays a broad role in cancer research, including molecular classification and subtype identification, discovery of cancer driver genes, identification of biomarkers, development of personalized treatment strategies, and advancement of bioinformatics tools [17]. By analyzing TCGA data, researchers can gain a deeper understanding of the molecular mechanisms driving cancer, providing critical insights to support clinical treatment and personalized medicine.

In this study, we analyzed TCGA datasets and identified three critical biomarkers for GC patients. Among them, our attention focused on ADAMTS12. Our findings suggested that ADAMTS12 may be a novel biomarker and therapeutic target for GC patients.

Materials and methods

Patients and collection of samples

GC tissues and paired adjacent normal tissues were collected from 10 GC patients who underwent surgical resection at Qingdao Chengyang People’s Hospital between 2022 and 2023. None of the patients received preoperative treatments, including radiation or chemotherapy. To support future research, half of the tissues were promptly snap-frozen in liquid nitrogen, while the remaining half was stored at −80 °C.
The residual tissue portions were fixed in formalin and embedded in paraffin. All procedures were conducted in accordance with the Declaration of Helsinki, and the study was approved by Qingdao Chengyang People’s Hospital (Approval number: QDCY2022137). Written informed consent was obtained from all participants.

Cell culture and transfection

The GC cell lines SGC7901, MKN-45, HGC-27, and AGS, along with the normal gastric epithelial cell line GES-1, were purchased from the Chinese Academy of Sciences (Shanghai, China). All cells were cultured at 37 °C with 5% carbon dioxide in media supplemented with 10% fetal bovine serum (FBS; Gibco, Carlsbad, California, USA) and 1% penicillin/streptomycin (Gibco). The siRNA oligonucleotides targeting ADAMTS12 (si-ADAMTS12-1 and si-ADAMTS12-2) were designed and synthesized by RiboBio (Guangzhou, China). Cells were transfected using Lipofectamine 3000 (Invitrogen, USA) according to the manufacturer’s instructions.

Real-time PCR (RT-PCR)

Total RNA was extracted from both the specimens and cells using TRIzol reagent (Invitrogen). The RNA was then reverse-transcribed into cDNA using the PrimeScript™ RT Reagent Kit (Takara, Dalian, China). RNA quantification was performed using the ABI PRISM 7900 Real-Time PCR System (Applied Biosystems). GAPDH was used as the internal control for normalization. The relative gene expression levels were determined using the 2^−ΔΔCt method. The PCR conditions were as follows: an initial 3-min denaturation at 94 °C, followed by 30 cycles of 45 s at 94 °C, 45 s at 57 °C, and 45 s at 72 °C, with a final extension at 72 °C for 10 min. The specific primers were as follows: ADAMTS12 forward 5′-ACTTTGGGGCGCTTTGCTAT-3′ and reverse 5′-GCAGGCCCTTGATAAAATGC-3′; GAPDH forward 5′-CTCAGACACCATGGGGAAGGTGA-3′ and reverse 5′-ATGATCTTGAGGCTGTTGTCATA-3′.

Colony formation assay

Transfected MKN-45 and HGC-27 cells were seeded into 6-well plates at a density of 1 × 10^3 cells per well.
The plates were incubated at 37 °C for 2 weeks in either DMEM or RPMI 1640, each supplemented with 10% FBS. After incubation, cells were fixed in methanol for 30 min, stained with 1% crystal violet dye, and then counted. Each experiment was performed in triplicate.

5-Ethynyl-2′-deoxyuridine (EdU) assay

To assess cell proliferation, the EdU assay (RiboBio, Guangzhou, China) was performed. After 48 h of transfection, the cells were treated with 50 μM EdU. Apollo staining and DAPI labeling were then conducted to visualize EdU-positive cells.

Cell invasion

Transfected MKN-45 and HGC-27 cells (200 µl of cell suspension) were seeded into the upper chambers of Transwell inserts (Corning Inc.), which had an 8 µm pore size and were placed in 24-well plates. The Transwell membrane was pre-coated with Matrigel (BD Biosciences). Simultaneously, 600 µl of RPMI-1640 medium supplemented with 10% FBS was added to the lower chambers. The cells were incubated at 37 °C for 24 h, after which they were fixed and stained with 0.5% crystal violet solution (Sigma-Aldrich; Merck KGaA) for 15 min at room temperature. The number of invading cells was counted using a light microscope (Olympus Corporation) at ×100 magnification. Images were captured from five random fields and analyzed using ImageJ software (version 1.8.0.112) from the National Institutes of Health.

Western blot analysis

Cells were lysed using RIPA buffer (Merck, China). Protein concentrations were quantified and normalized using the BCA assay. The lysates were separated by SDS-PAGE (10–15% polyacrylamide gels) and transferred onto PVDF membranes. The membranes were incubated overnight at 4 °C with primary antibodies, including anti-ADAMTS12 (1:500, Proteintech, USA) and anti-β-actin (1:5000, CST, USA). Afterward, the membranes were incubated with the appropriate secondary antibodies (anti-rabbit antibodies, Immunoway) at room temperature (25 °C) for 2 h. The blots were developed using ECL detection.
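The 2^−ΔΔCt calculation named in the RT-PCR section above can be illustrated with a short worked example. This is a minimal sketch in Python, not the analysis code used in this study: the function name and the Ct values are hypothetical, and the calculation assumes a single reference gene (GAPDH here) and roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Each argument is a mean Ct value. "sample" is the condition of
    interest (e.g. tumor tissue), "control" is the calibrator
    (e.g. paired normal tissue), "ref" is the reference gene.
    """
    # dCt: target Ct normalized to the reference gene within each condition
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # ddCt: normalized sample relative to normalized control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # Assumes the amount of product doubles every cycle
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for illustration only (not data from this study)
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # dCt = 6
    ct_target_control=26.0, ct_ref_control=18.0,  # dCt = 8
)
print(fold_change)  # ddCt = -2, so the fold change is 4.0
```

A fold change above 1 indicates higher target expression in the sample than in the control; identical ΔCt values in both conditions give exactly 1.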
Data collection

To investigate stomach adenocarcinoma (STAD), we collected HTSeq-Count data, clinicopathological information, and overall survival (OS) data from the TCGA database. Clinicopathological characteristics of 406 patients were included, along with transcriptome data from 375 STAD tissues and 359 normal tissues obtained from GTEx. The clinical characteristics of the clinical cohort are provided in Table S1.

Data preprocessing and screening of the differentially expressed mRNAs (DEMs)

To screen for differentially expressed genes (DEGs) in the gene expression profiles from the TCGA datasets, the limma package in R software (version 3.5.3) was employed. To analyze the differential expression between GC and normal tissues from the TCGA datasets, the edgeR package in R was used. The significance threshold was set at |log2 (fold change)| > 4 and an adjusted p-value of less than 0.05. The resulting DEMs were visualized using volcano plots and heatmaps.

Function enrichment analysis of DEMs

Gene Ontology (GO) analysis is a widely used method in bioinformatics and biological research that systematically classifies and annotates genes and their products to better understand their functions in biological processes. It encompasses three main categories: molecular function, cellular component, and biological process. Molecular function describes the activities of gene products at the molecular level, cellular component specifies their locations within cells, and biological process outlines their involvement in cellular and organismal functions. By categorizing genes based on these functional categories, GO analysis provides a comprehensive approach to understanding gene functions. It is commonly applied to interpret high-throughput experimental data, such as gene expression profiles or proteomics data.
For differentially expressed gene sets, GO analysis helps researchers understand their functional localization, roles in biological processes, and molecular functions, enabling insights into their roles and regulatory networks in specific biological contexts.

KEGG analysis is another widely used method in bioinformatics and systems biology aimed at understanding gene and protein functions, particularly in the context of metabolic and signaling pathways. The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a comprehensive database that provides detailed biological information, including insights into genomics, biochemistry, and more. In KEGG analysis, researchers compare gene sets of interest with data from the KEGG database, linking genes to specific metabolic and signaling pathways. This approach helps uncover the roles of gene sets in cellular metabolism and signal transduction, revealing how genes collaborate to perform essential biological functions.

For gene function analysis, the Database for Annotation, Visualization, and Integrated Discovery (DAVID) version 6.8 (https://david.ncifcrf.gov) was utilized. The DAVID analysis module is designed to transform data into biological insights, accelerating the study of genome-level datasets. To elucidate the biological significance of the differentially expressed genes (DEGs) identified in GC progression, we used DAVID to conduct GO functional and KEGG pathway enrichment analysis on the DEMs obtained from the Venn analysis. A false discovery rate (FDR) of less than 0.05 was considered statistically significant.

Identification of survival-related DEMs and establishment of the prognostic gene signature

DEMs associated with overall survival (OS) were identified using the TCGA STAD dataset. DEMs with a p-value of less than 0.05 were considered statistically significant and included for further analysis.
To refine the number of DEMs in the selected panel, a Lasso-penalized Cox regression analysis was performed. This analysis utilized tenfold cross-validation, implemented through the glmnet package in R, to optimize prediction performance. To generate a predictive gene signature for gastric cancer patients, a risk score was calculated for each patient as the sum of the mRNA expression level of each selected gene multiplied by its regression coefficient (β) from the Lasso Cox regression model. The optimal cutoff for the predictive gene signature was determined using X-Tile software, which classified patients into high-risk and low-risk groups based on this cutoff. To evaluate the prognostic value of the gene signature, several methods were employed, including Kaplan–Meier survival analysis, Harrell’s concordance index (C-index), the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, and a calibration plot comparing predicted and observed overall survival.

Statistical analysis

A one-way analysis of variance (ANOVA) was used to compare gene expression patterns between normal gastric tissues and GC samples. The Pearson chi-square test was employed to examine categorical data. To assess overall survival across different groups, Kaplan–Meier (K–M) survival analysis was conducted, along with a two-sided log-rank test. A two-tailed p-value of less than 0.05 was considered statistically significant.

Results

Identification of DEMs in GC and functional enrichment analysis

First, we analyzed TCGA datasets to identify DEGs between GC samples and normal samples. Based on criteria of |LogFC| > 4 and p < 0.05, we identified 598 upregulated genes and 506 downregulated genes in GC samples (Fig. 1A and B). Next, we performed functional enrichment analysis to explore the potential functions of these genes.
KEGG analysis revealed that the 598 upregulated genes were primarily enriched in the p53 signaling pathway, viral protein interaction with cytokines and cytokine receptors, viral myocarditis, and the NF-kappa B signaling pathway. Conversely, the 506 downregulated genes were mainly associated with retrograde endocannabinoid signaling, the cAMP signaling pathway, protein digestion and absorption, prion diseases, and Parkinson’s disease. Furthermore, GO analysis showed that the upregulated genes were predominantly involved in the regulation of mitotic sister chromatid separation, nuclear division, and sister chromatid segregation, while the downregulated genes were primarily linked to sodium ion homeostasis, sodium ion export across the plasma membrane, response to muscle activity, and organic acid catabolic processes (Fig. 1C).

Fig. 1. Differential expression analysis and functional enrichment in GC. A and B Heatmap and volcano plot illustrating the differential expression of genes between GC samples and normal samples. Genes with |LogFC| > 4 and p < 0.05 are highlighted in red (upregulated) and blue (downregulated). The differential expression analysis was performed using the limma package in R, with Benjamini–Hochberg correction for multiple testing. C Functional enrichment analysis of DEMs, including KEGG pathway analysis and Gene Ontology (GO) analysis. The upregulated genes are associated with pathways such as p53 signaling, viral interactions, and sister chromatid segregation, while downregulated genes are linked to cAMP signaling and sodium ion homeostasis

Identification of survival-related DEMs in GC patients

We then integrated sequencing data from GC patients in the TCGA datasets with corresponding clinical information to further identify survival-related DEGs in GC patients.
Using a threshold of p < 0.01, we performed Kaplan–Meier (K–M) survival analysis on the 598 upregulated genes and 506 downregulated genes and identified 10 survival-related DEGs in GC patients, including CLRN3, BCL11B, PDSS1, PRSS3, MYB, ADAMTS12, EPCAM, MAP7, F5, and VCAN (Fig. 2A and B). Given the consistency between expression patterns and clinical significance, we focused on ADAMTS12, F5, and VCAN. The expression of these genes was significantly elevated in GC specimens and was associated with poor prognosis (Fig. 2C). Additionally, we performed pan-cancer analysis and found that ADAMTS12 (Fig. 3A), VCAN (Fig. 3B), and F5 (Fig. 3C) exhibited dysregulated expression in various types of tumors, underscoring their important roles in tumor progression.

Fig. 2. Kaplan–Meier survival analysis of DEMs in GC. A and B Kaplan–Meier survival curves presenting the overall survival of GC patients based on the expression levels of identified DEMs. The log-rank test was used to compare survival curves between groups, with a significance level set at p < 0.01. C Forest plot illustrating hazard ratios for the identified 10 survival-related DEMs in GC patients, including CLRN3, BCL11B, PDSS1, PRSS3, MYB, ADAMTS12, EPCAM, MAP7, F5, and VCAN

Fig. 3. Pan-cancer analysis of A ADAMTS12, B VCAN, and C F5 in various tumor types. Statistical significance was assessed using one-way ANOVA for comparisons between multiple groups, with post hoc Tukey’s test applied to determine pairwise differences. A p-value < 0.05 was considered statistically significant

The prognostic value of ADAMTS12, F5 and VCAN in pan-cancer analysis

To further explore the clinical significance of ADAMTS12, F5, and VCAN in tumors, we conducted a pan-cancer analysis. As shown in Figs.
4A and B, we observed that ADAMTS12 expression was significantly associated with overall survival (OS) and disease-free survival (DFS) in several types of tumors, including KIPAN, BLCA, KIRP, ACC, MESO, PAAD, and UVM. Additionally, the prognostic value of VCAN and F5 in pan-cancer analysis is shown in Figures S1A and S1B, and Figures S2A and S2B. The expression of both VCAN and F5 was also found to be associated with the prognosis of multiple tumor types.

Fig. 4. Pan-cancer analysis revealed the prognostic impact of ADAMTS12 expression. A Association of ADAMTS12 expression with overall survival in various tumor types, including KIPAN, BLCA and KIRP. B Correlation of ADAMTS12 expression with disease-free survival across multiple tumor types. Kaplan–Meier survival curves were generated, and differences in survival were assessed using the log-rank test, with a significance level set at p < 0.05

Construction of a new gene prognostic signature for GC patients

The above analyses revealed that ADAMTS12, F5, and VCAN were closely associated with the survival outcomes of GC patients. Therefore, we performed LASSO regression to construct a prognostic signature for GC. The coefficients of ADAMTS12, F5, and VCAN are presented in Fig. 5A. The model achieved the best fit when two of the three candidate genes (ADAMTS12, F5, and VCAN) were included (Fig. 5B). The risk score model classified 370 GC tumor samples into low- or high-risk subgroups based on a median threshold, and a heatmap of F5 and VCAN expression in GC was also generated (Fig. 5C). Furthermore, overall survival analysis showed that the high-risk group had significantly poorer survival outcomes (Fig. 5D). To assess the prognostic performance of the model, time-dependent receiver operating characteristic (ROC) analysis was performed.
The area under the ROC curve (AUC) values for one-year, three-year, and five-year overall survival were 0.595, 0.619, and 0.739, respectively (Fig. 5E).

Fig. 5. Construction and evaluation of a gene prognostic signature for GC. A Coefficient plot of ADAMTS12, F5, and VCAN in the LASSO regression model. The optimal penalty parameter (lambda) was determined using tenfold cross-validation to prevent overfitting. B Selection of the optimal number of genes for the prognostic signature using LASSO regression, with the minimum lambda indicated. C Heatmap displaying the expression levels of F5 and VCAN in GC tumor samples classified into low- and high-risk subgroups based on the risk score model, with significant differences analyzed using Student’s t-test (p < 0.05). D Kaplan–Meier survival analysis comparing overall survival between low- and high-risk groups defined by the prognostic signature, with the log-rank test used to assess differences in survival (p < 0.05). E ROC analysis assessing the prognostic model’s performance over one, three, and five years, with area under the curve (AUC) values calculated to evaluate predictive accuracy

High expression of ADAMTS12 in GC and its association with clinical variables

We focused on ADAMTS12, one of the three survival-related genes, as its expression and function in GC have not been well characterized. Compared to non-tumor specimens, GC tissues exhibited a significant increase in ADAMTS12 expression (Fig. 6A). Additionally, we observed that high ADAMTS12 expression was associated with the age of GC patients (Fig. 6B), but it did not correlate with gender (Fig. 6C) or tumor grade (Fig. 6D). Furthermore, high ADAMTS12 expression was positively associated with advanced clinical stage (Fig. 6E) and T stage (Fig. 6F), but it was not correlated with M stage (Fig. 6G) or N stage (Fig. 6H).
The association between ADAMTS12 expression and various clinical factors is shown in the heatmap (Fig. 6I). Additionally, we evaluated the prognostic significance of ADAMTS12 expression in GC patients using Cox regression analyses. The results of the multivariate analysis, shown in Table 1, demonstrate that clinical stage, age, and ADAMTS12 expression independently predict poor prognosis in GC patients.

Fig. 6. Association of ADAMTS12 expression with clinical variables in GC. A Expression levels of ADAMTS12 in GC specimens compared with non-tumor specimens from TCGA datasets. Statistical significance was determined using the Student’s t-test (p < 0.05). B–H Correlation between ADAMTS12 expression and several clinical factors of GC patients, assessed using Spearman or Pearson correlation coefficients, depending on the data distribution, with p-values reported for each correlation. I Heatmap illustrating the association between ADAMTS12 expression and various clinical factors in GC, with a two-sided p-value < 0.05 indicating statistical significance in the associations

Table 1. Univariate and multivariate analysis of overall survival in GC patients

| Characteristics | Total (N) | Univariate hazard ratio (95% CI) | Univariate p value | Multivariate hazard ratio (95% CI) | Multivariate p value |
|---|---|---|---|---|---|
| Pathologic stage | 347 | | | | |
| Stage I & stage II | 160 | Reference | | Reference | |
| Stage III & stage IV | 187 | 1.947 (1.358–2.793) | < 0.001 | 1.947 (1.357–2.793) | < 0.001 |
| Age | 367 | | | | |
| ≤ 65 | 163 | Reference | | Reference | |
| > 65 | 204 | 1.620 (1.154–2.276) | 0.005 | 1.673 (1.176–2.381) | 0.004 |
| Gender | 370 | | | | |
| Female | 133 | Reference | | | |
| Male | 237 | 1.267 (0.891–1.804) | 0.188 | | |
| ADAMTS12 | 370 | | | | |
| Low | 184 | Reference | | Reference | |
| High | 186 | 1.544 (1.108–2.151) | 0.010 | 1.583 (1.122–2.233) | 0.009 |

ADAMTS12 expression was increased in GC using our cohorts

We then collected 10 pairs of GC specimens and corresponding normal tissues to evaluate the expression of ADAMTS12 in GC.
RT-PCR results revealed a significant increase in ADAMTS12 expression in GC specimens compared to normal tissues (Fig. 7A). Subsequent ROC analysis confirmed the strong diagnostic value of ADAMTS12 in distinguishing GC specimens from normal tissues, with an AUC of 0.84 (Fig. 7B). Additionally, RT-PCR and western blot analyses of several GC cell lines showed a marked upregulation of ADAMTS12 expression in four different GC cell lines (Fig. 7C and D). Importantly, these findings were consistent with the results from TCGA datasets. The elevated expression of ADAMTS12 in both clinical specimens and cell lines highlights its potential as a diagnostic marker for GC.

Fig. 7. A RT-PCR analysis comparing ADAMTS12 expression in 10 pairs of GC specimens and normal specimens, revealing a distinct increase in GC specimens. Statistical significance was determined using the paired Student’s t-test (p < 0.05). B ROC analysis confirming ADAMTS12 as a robust diagnostic marker for distinguishing GC specimens from normal specimens, with an AUC of 0.84, calculated using the trapezoidal rule. C RT-PCR results showing ADAMTS12 expression in various GC cell lines compared to normal controls, analyzed using one-way ANOVA followed by Tukey’s post hoc test (p < 0.05). D Western blot results demonstrating significantly elevated ADAMTS12 expression in various GC cell lines compared to normal controls, with quantification performed using ImageJ software and analyzed by one-way ANOVA followed by Tukey’s post hoc test (p < 0.05)

Knockdown of ADAMTS12 expression suppressed the proliferation and invasion of GC cells

To elucidate the functional role of ADAMTS12 in GC progression, we began by employing siRNA-mediated knockdown of ADAMTS12 expression in MKN-45 and HGC-27 cells. The efficiency of si-ADAMTS12 transfection was confirmed by RT-PCR and western blot assays, as shown in Fig. 8A.
Subsequent functional assays, namely colony formation and EdU assays, revealed a significant suppression of cellular proliferation in both MKN-45 and HGC-27 cells following ADAMTS12 knockdown (Fig. 8B and C). Additionally, Transwell assays demonstrated a marked reduction in cell invasion capacity upon downregulation of ADAMTS12 expression in both cell lines (Fig. 8D). These findings collectively suggest that ADAMTS12 plays a critical role in promoting cell proliferation and invasion in GC, highlighting its potential as a therapeutic target for inhibiting GC progression.

Fig. 8. Suppression of GC cell proliferation and invasion upon knockdown of ADAMTS12 expression. A RT-PCR and western blot results confirming the efficient knockdown of ADAMTS12 expression in MKN-45 and HGC-27 cells using si-ADAMTS12, with statistical significance assessed by the paired Student’s t-test (p < 0.05). B and C Colony formation assays and EdU assays illustrating a significant reduction in the proliferation of MKN-45 and HGC-27 cells upon ADAMTS12 knockdown, analyzed by one-way ANOVA followed by Tukey’s post hoc test (p < 0.05). D Transwell assay demonstrating inhibited invasion of MKN-45 and HGC-27 cells following knockdown of ADAMTS12 expression, with statistical significance evaluated using the Mann–Whitney U test (p < 0.05)

Discussion

In clinical practice, the diagnosis and prognosis prediction of GC patients rely on a combination of methods [18]. Initially, doctors gather preliminary information by reviewing the patient’s symptoms and conducting a physical examination [19]. Subsequently, imaging examinations, such as gastroscopy and X-ray, provide detailed information about the tumor’s structure. Blood biomarker tests, including CA 19-9 and CEA, serve as complementary tools for diagnosis and predicting disease progression [20, 21].
Additionally, pathological examinations involve obtaining tissue specimens through biopsies to determine histological types and assess lymph node involvement. Molecular biomarker testing, including HER2 and EGFR, provides deeper molecular insights that aid in formulating personalized treatment plans [22, 23]. The TNM staging system is employed to stage cancer, evaluating the severity of the disease and predicting prognosis [24, 25]. Targeted treatment strategies, such as anti-HER2 therapy, focus on specific molecular markers for treatment. However, these methods also have limitations. Gastroscopy is invasive and may not be suitable for all patients. Imaging examinations have limited resolution and may miss early or small lesions. The specificity of blood biomarkers is often limited, and their levels may be elevated in other diseases. Pathological examinations require tissue sampling, which can limit access to deep or hard-to-reach areas. Molecular biomarker analysis involves complex techniques and higher costs. The TNM staging system does not account for tumor molecular characteristics, and targeted treatments are only effective for patients with specific target markers.

mRNA has diverse potential applications in tumor diagnosis and prognosis prediction. First, through comprehensive molecular analysis, distinct mRNA expression patterns associated with specific cancer types or subtypes can be identified in tumor tissues, providing potential biomarkers for early cancer diagnosis and subtype classification. Second, by analyzing mRNA expression levels, a more accurate assessment of cancer prognosis can be made, offering valuable insights for the development of personalized treatment plans by healthcare professionals [26]. Additionally, mRNA analysis aids in determining the molecular subtypes of tumors, providing essential information for understanding the biological heterogeneity of cancer [27, 28].
In this study, we analyzed transcriptome data from GC patients in the TCGA datasets and identified 598 upregulated genes and 506 downregulated genes. Furthermore, functional enrichment analysis revealed that the differentially expressed genes are associated with activated biological processes such as the p53 signaling pathway, immune responses, and viral infections. In contrast, biological processes related to cell signaling, neural system functions, and metabolism appear to be suppressed. These findings provide valuable insights for further exploring the relationship between gene expression changes and specific physiological conditions or disease states.

We then identified survival-related differentially expressed mRNAs (DEMs) in GC patients by integrating TCGA sequencing data with clinical information. Ten DEMs, including CLRN3, BCL11B, PDSS1, PRSS3, MYB, ADAMTS12, EPCAM, MAP7, F5, and VCAN, were found to be significantly associated with survival outcomes in GC patients. Among these, ADAMTS12, F5, and VCAN stood out due to their consistent upregulation in GC specimens, which correlated with poor prognosis. Pan-cancer analysis further highlighted the dysregulated expression of ADAMTS12, VCAN, and F5 across multiple tumor types, emphasizing their potential roles in tumorigenesis. In this analysis, ADAMTS12, F5, and VCAN were associated with both overall and disease-free survival in various cancers, reinforcing their prognostic significance. The upregulation of these genes in GC and their consistent association with poor prognosis underscores their clinical relevance in GC and beyond. Additionally, we developed a prognostic model using F5 and VCAN, which separated GC patients into risk groups with significantly different overall survival and yielded AUC values of 0.595, 0.619, and 0.739 for one-, three-, and five-year overall survival. Our findings not only provide potential prognostic biomarkers for GC but also highlight the broader implications of ADAMTS12, F5, and VCAN in cancer biology.

Among ADAMTS12, F5, and VCAN, our attention focused on ADAMTS12.
ADAMTS12 is an enzyme of the metalloproteinase family that plays a crucial role in the extracellular matrix [[102]29]. The term “A Disintegrin-like” in its name signifies its association with the ADAM family (A Disintegrin and Metalloproteinase), while “Thrombospondin Motifs” refers to the presence of domains similar to those found in thrombospondin. The protein encoded by ADAMTS12 exhibits metalloproteinase activity, allowing it to cleave and degrade protein molecules within the extracellular matrix [[103]30, [104]31]. In both physiological and pathological conditions, ADAMTS12 participates in various biological processes, including cell migration, tissue repair, angiogenesis, and the metabolism of collagen and proteoglycans. Its regulated activity is closely linked to the onset and progression of several diseases, including cancer, arthritis, and cardiovascular disorders [[105]32–[106]34]. In cancer research, alterations in ADAMTS12 expression are often observed and correlate with tumor initiation, progression, and prognosis. Through mechanisms influencing extracellular matrix remodeling, cell–cell interactions, and signaling pathway regulation, ADAMTS12 likely plays a role in the biological processes of tumors [[107]35–[108]37]. In addition, Chen et al. analyzed the expression of ADAMTS12 in GC tissues and its involvement in the glycolysis pathway using bioinformatics and experimental methods. Their findings demonstrated that ADAMTS12 promotes the proliferation and glycolysis of GC cells, while metformin can inhibit these effects. In vivo experiments further confirmed that metformin suppresses the proliferation and glycolysis of GC cells via ADAMTS12, suggesting that ADAMTS12 could be a potential target for metformin therapy in GC [[109]38]. Moreover, Hou et al. found that ADAMTS12 expression was significantly elevated in gastric cancer samples and correlated with poor prognosis.
Gene Set Enrichment Analysis (GSEA) indicated that high ADAMTS12 expression was associated with cancer and immune-related pathways, while low expression was linked to the oxidative phosphorylation pathway. Additionally, CIBERSORT analysis revealed a positive correlation between ADAMTS12 expression and macrophage infiltration, along with a negative correlation with T follicular helper cells, suggesting its role in modulating the tumor microenvironment and metabolic reprogramming in gastric cancer [[110]39]. In this study, we also found that ADAMTS12 was highly expressed in GC specimens and predicted a poor prognosis. Importantly, RT-PCR and western blot analyses confirmed that ADAMTS12 expression was markedly increased in GC specimens, consistent with the above findings. Our findings suggest ADAMTS12 as a novel prognostic and diagnostic biomarker for GC patients. Moreover, in vitro assays showed that knockdown of ADAMTS12 markedly suppressed the proliferation and invasion of GC cells, suggesting that it acts as a tumor promoter in GC.
Several limitations of this study should be acknowledged. Firstly, although our analysis extensively utilized bioinformatics tools, including functional enrichment, survival analysis, and the construction of a prognostic signature, the absence of in vivo validation is a notable limitation. In vivo experiments are essential for confirming the functional roles of genes like ADAMTS12, F5, and VCAN in GC progression. Secondly, while the clinical information provided by the TCGA datasets offers valuable insights, it is important to recognize the inherent heterogeneity among GC patients. Variations in clinical characteristics, treatment regimens, and response patterns among individuals may affect the robustness and generalizability of our findings in clinical settings.
Thirdly, the prognostic signature developed using LASSO regression, while offering a predictive model, introduces a certain level of complexity. The limited number of genes selected (VCAN and F5) may not fully capture the multifaceted nature of GC prognosis. Furthermore, the predictive performance of the model should be validated in independent cohorts to confirm its reliability.
Conclusion
In this study, we analyzed TCGA datasets to identify differentially expressed genes (DEGs) in GC, which are involved in multiple key pathways and biological processes. By integrating sequencing data with clinical information from GC patients, we identified survival-related DEGs, with a particular focus on ADAMTS12, F5, and VCAN. The high expression of these genes is closely associated with poor prognosis, and the prognostic signature we constructed demonstrates strong predictive performance at multiple time points. Further investigation of ADAMTS12 confirmed its significant role in GC, with its elevated expression correlating closely with clinical and pathological features. In conclusion, our findings not only provide insights into the molecular characteristics of GC but also offer a novel perspective for prognosis assessment based on gene signatures. However, further validation and independent studies are needed to confirm the reliability of these results for clinical applications.
Supplementary Information
[111]Supplementary Material 1.
Acknowledgements