Abstract

   Osteosarcoma, an aggressive bone malignancy predominantly affecting
   children and adolescents, is characterized by a poor prognosis and high
   mortality rates. The development of reliable prognostic tools is
   critical for advancing personalized treatment strategies. However,
   identifying robust gene signatures to predict osteosarcoma outcomes
   remains a significant challenge. In this study, we analyzed gene
   expression data from 138 osteosarcoma samples across two multicenter
   cohorts and identified 14 consensus prognosis-associated genes via
   univariate Cox regression analysis. Using 66 combinations of 10 machine
   learning (ML) algorithms, we developed a machine learning-derived
   prognostic signature (MLDPS) optimized by the average C-index across
   TARGET, [42]GSE21257, and merged cohorts. The MLDPS effectively
   stratified osteosarcoma patients into high- and low-risk score groups,
   achieving strong predictive performance for 1-, 3-, and 5-year overall
   survival (AUC range: 0.852 — 0.963). The MLDPS, comprising seven genes
   (CTNNBIP1, CORT, DLX2, TERT, BBS4, SLC7A1, NKX2-3), exhibited superior
   predictive accuracy compared to 10 established gene signatures. The
   findings of the MLDPS carry significant clinical implications for
   osteosarcoma treatment. Patients with a high-risk score demonstrated
   worse prognosis, increased metastasis risk, reduced immune
   infiltrations, and greater sensitivity to immunotherapy. Conversely,
   low-risk patients exhibited prolonged survival and distinct drug
   sensitivities. These findings underscore the potential of MLDPS to
   guide risk stratification, inform personalized therapeutic strategies,
   and improve clinical management in osteosarcoma.

Supplementary Information

   The online version contains supplementary material available at
   10.1038/s41598-025-00179-z.

   Keywords: Osteosarcoma, Machine learning, Tumor immunotherapy, Risk
   score, Prognosis

   Subject terms: Cancer, Computational biology and bioinformatics, Drug
   discovery, Immunology, Biomarkers, Oncology, Risk factors

Introduction

   Osteosarcoma, a rare yet aggressive primary malignant bone tumor,
   predominantly affects pediatric and adolescent populations^[43]1,[44]2.
   Despite advancements in surgical resection, chemotherapy, and
   immunotherapy, survival outcomes remain poor, largely due to the high
   incidence of metastasis and chemotherapy resistance^[45]3,[46]4. The
   five-year survival rate for osteosarcoma patients exhibiting
   chemotherapy resistance or metastasis is a mere 20-30%^[47]5–[48]7.
   Although the TNM staging system provides basic prognostic
   insights^[49]8, it has limited capacity to capture the molecular
   complexity and heterogeneity of osteosarcoma. Consequently, researchers
   are prioritizing the discovery of novel molecular biomarkers to improve
   clinical outcome prediction and optimize therapeutic
   strategies^[50]9–[51]11.

   The treatment landscape for sarcomas, including osteosarcoma, is
   evolving through the integration of molecular and genomic insights.
   Immune checkpoint inhibitors, tumor microenvironment-targeted
   therapies, and combination regimens show promise in sarcoma management;
   however, their application in osteosarcoma remains exploratory. For
   example, genetic profiling of uterine leiomyosarcoma has uncovered
   frequent mutations in tumor suppressor genes such as TP53, RB1, and
   PTEN, pointing towards potential therapeutic targets^[52]12.
   Nonetheless, adjuvant chemotherapy yields inconsistent results across
   early-stage uterine leiomyosarcoma subtypes^[53]13,[54]14. Similarly,
   while immune checkpoint inhibitors demonstrate efficacyin advanced
   patientscancers^[55]15–[56]20, their role in osteosarcoma requires
   further validation. However, accurately forecasting prognosis has
   proved to be challenging due to the variations in the efficacy of
   treatment approaches^[57]21–[58]24. These findings emphasize the
   difficulty of establishing universal adjuvant therapies for diverse
   sarcoma subtypes and accentuate the necessity for individualized
   approaches grounded in tumor-specific biology. To significantly improve
   osteosarcoma survival rates, it is important to ascertain innovative
   biomarkers capable of accurately predicting clinical outcomes and
   therapeutic responsiveness. Therefore, high-confidence prognostic gene
   signatures are essential for guiding prognosis and treatment regimens.

   As high-throughput sequencing technology advances, researchers are
   increasingly confronted with vast amounts of biological and clinical
   data, collectively termed “Big Data”^[59]25–[60]27. These datasets
   enable the systematic identification of gene signatures and molecular
   pathways driving cancer progression^[61]28, including
   osteosarcoma^[62]29–[63]31. Previous studies have constructed
   prognostic signatures associated with hypoxia^[64]32, apoptosis^[65]33,
   energy metabolism^[66]22, and the immune tumor microenvironment
   (TME)^[67]34. Nevertheless, the absence of reliable biomarkers often
   leads to either overtreatment or undertreatment, causing significant
   financial burden, severe adverse effects, or accelerated disease
   progression^[68]35,[69]36.

   To address this gap, machine learning (ML), a subfield of artificial
   intelligence, has emerged as a powerful tool for constructing
   predictive models from large-scale datasets^[70]37,[71]38. ML
   algorithms has proven instrumental in biomedical research, facilitating
   biomarker discovery, therapeutic target identification, and mechanistic
   insights across diseases^[72]39–[73]41. For instance, Zhang et
   al.^[74]42 leveraged ML to develop a six-gene oxidative stress
   signature for predicting acute myeloid leukemia outcomes. Similarly,
   recent studies have employed ML to identify diagnostic and prognostic
   biomarkers in osteosarcoma^[75]43,[76]44, such as Shi et al.’s six-gene
   risk model (MYC, COL13A1, UHRF2, MT1A, ACTB,
   GBP1)al^[77]23.Nevertheless, integrating ML with bioinformatics to link
   prognostic signatures to diagnostic biomarkers and elucidate their
   clinical significance remains underexplored in osteosarcoma.

   To identify an optimal biomarker, we constructed an ML-derived
   prognostic signature (MLDPS) by implementing 66 ML algorithm
   combinations based on consensus prognostic signature genes (CPSGs). The
   MLDPS demonstrated robust predictive capabilities for overall survival
   (OS), metastasis, immunotherapy response, and drug treatment efficacy
   in osteosarcoma cohorts. After integrating key clinicopathological
   characteristics and 10 established osteosarcoma signatures, MLDPS
   exhibited consistent and significantly improved predictive performance.
   Overall, our research provides a valuable resource for guiding early
   diagnosis, prognostic evaluation, stratified treatment, personalized
   therapy, and post-treatment monitoring of osteosarcoma in clinical
   settings.

Materials and methods

Public osteosarcoma dataset collection and preprocessing

   The TARGET-osteosarcoma cohort
   ([78]https://ocg.cancer.gov/programs/target) consists of 85 samples
   with survival data and follow-up periods exceeding zero. We also
   acquired the [79]GSE21257 dataset from the Gene Expression Omnibus
   ([80]https://www.ncbi.nlm.nih.gov/geo/), which includes 53 osteosarcoma
   samples with survival information and follow-up periods greater than
   zero for all patients. In total, 138 samples were included in this
   study, encompassing 14,174 genes from the TARGET and [81]GSE21257
   cohorts.

Univariate Cox regression analysis

   Univariate Cox regression analysis was conducted on the [82]GSE21257
   and TARGET cohorts using intersecting genes. CPSGs were selected for
   further investigation based on the criteria of P < 0.01 and consistent
   hazard ratios (HRs) that were either above or below 1 in both cohorts.

Functional enrichment analysis

   The CPSGs were used to perform Gene Ontology (GO) and Kyoto
   Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses.
   The R package ‘clusterProfiler’ was employed for Gene Set Enrichment
   Analysis (GSEA) along with the GO and KEGG analyses^[83]45–[84]49. Gene
   Set Variations Analysis (GSVA) was conducted using the R package
   ‘GSVA’^[85]50. The biocarta, KEGG, Reactome, wikipathways, and Hallmark
   gene sets used in GSVA were downloaded from the Molecular Signatures
   Database (MSigDB;
   [86]http://software.broadinstitute.org/gsea/msigdb/index.jsp) version
   6.2.

Development of MLDPS

   To construct the MLDPS and ensure high accuracy and generalizability,
   we integrated 10 widely used ML algorithms: gradient boosting machine
   (GBM), survival support vector machine (Survival-SVM), supervised
   principal components (SuperPC), ridge regression, partial least squares
   regression for Cox (plsRcox), elastic network (Enet), random forest
   (RSF), least absolute shrinkage and selection operator (LASSO),
   CoxBoost, and Stepwise Cox. Notably, RSF, LASSO, CoxBoost, and Stepwise
   Cox effectively reduce dimensions and screen variables.

   The development pipeline of the MLDPS is outlined as follows:
     * Univariate Cox analysis was performed on the merged cohort (created
       by removing batch effects and redundancies from the TARGET and
       [87]GSE21257 cohorts) to identify CPSGs, defined as genes with a
       significance level of P < 0.01 and consistent HR orientation across
       the two cohorts.
     * Ten ML algorithms were used to create 66 algorithms combinations to
       develop the most predictive MLDPS with superior C-index
       performance.
     * After establishing the model on the training cohort, we evaluated
       it on two separate testing cohorts. The average C-index was
       calculated for each model, with the model exhibiting the highest
       value being deemed optimal.

Prognostic value of MLDPS and potential clinical application

   Patients from the TARGET, [88]GSE21257, and merged cohorts were
   stratified into high- and low-risk score groups based on the median
   MLDPS value. The prediction accuracy of the MLDPS was assessed using
   receiver operating characteristic (ROC) curves and by calculating the
   time-dependent area under the ROC curve (AUC). The prognostic
   significance of the MLDPS was evaluated using Kaplan–Meier (KM) curves.

Collection and calculation of osteosarcoma’s published signatures

   Numerous prognostic signatures have been developed for osteosarcoma,
   focusing on stratified management and precise treatment, including
   those based on glycolysis- and cholesterol synthesis–related
   genes^[89]22, M1 macrophage–related genes^[90]24, and
   apoptosis–associated genes^[91]33. To compare the predictive efficacy
   of the MLDPS with established prognostic signatures, we systematically
   reviewed 10 published articles on prognostic models. Subsequently, we
   computed the risk scores for each signature in the three cohorts using
   the genes and coefficients outlined in the respective articles. In
   addition, we conducted a comparative analysis of key clinical
   characteristics, such as gender, age, and metastasis, between the high-
   and low-risk score groups in the merged cohort.

Assessment of immune infiltrations

   Immune infiltrations were evaluated in the merged cohort using the R
   ‘IOBR’ package^[92]51 and various algorithms, including
   MCPcounter^[93]52, EPIC^[94]53, xCell^[95]54, CIBERSORT^[96]55, IPS,
   quanTIseq^[97]56, ESTIMATE^[98]57, and TIMER^[99]58. We compared
   microenvironment-related scores, such as ESTIMATEScore, ImmuneScore,
   MicroenvironmentScore, and StromalScore, between the high- and low-risk
   score groups in the merged cohort. The correlation between the MLDPS
   risk score and chemokines, as well as chemokine receptors, in the
   combined cohort was determined utilizing the Pearson method to
   calculate the correlation coefficient and P-value. Correlation
   coefficients greater than 0 signify a positive correlation, while those
   less than 0 indicate a negative correlation. A P-value below 0.05 was
   considered statistically significant.

Response to immunotherapy

   Tumor Immune Dysfunction and Exclusion (TIDE)
   ([100]http://tide.dfci.harvard.edu/)^[101]59 is an integrated web tool
   used to forecast patients’ responses to immunotherapy. TIDE scores were
   calculated for each patient in the merged cohort. Subsequently,
   immunotherapeutic responses, represented by dysfunction scores and TIDE
   scores, were compared between high- and low-risk score groups.

Prediction of drug sensitivity

   To further compare differences between high- and low-risk score groups,
   drug sensitivity analysis was performed to calculate the IC50 values of
   drugs for each sample using the ‘oncoPredict’ software package. Higher
   IC50 values indicate decreased sensitivity to treatment.

Statistical analysis

   All statistical analyses were conducted using R software (version
   4.2.2). AUC analysis was performed using the ‘timeROC’ package. The KM
   method, combined with the log-rank test, was employed to assess
   differences in survival rates among various patient subgroups.

Results

Development of osteosarcoma CPSGs using univariate Cox regression

   Prior to biological data analysis, the collected datasets were assessed
   for batch effects, which were found to be present in both datasets
   (Fig. [102]1A). To address this, the datasets were merged for
   downstream analysis, using the R packages ‘limma’ and “sva” to remove
   batch effects and reduce dataset size (Fig. [103]1B). Next, CPSGs were
   screened from the merged dataset consisting of 138 samples and 14,174
   genes using univariate Cox regression analysis. The robust MLDPS was
   subsequently constructed using the 14 CPSGs, which showed consistent
   prognostic outcomes in both the [104]GSE21257 and TARGET cohorts
   (Fig. [105]1C). The 14 CPSGs comprised 12 risky genes (P-value < 0.01,
   HR > 1) and two protective genes (P-value < 0.01, HR < 1)
   (Fig. [106]1C).

Fig. 1.

   [107]Fig. 1
   [108]Open in a new tab

   Removal of batch effects and univariate regression analysis in
   osteosarcoma. (A) Principle-component Analysis (PCA) plot before batch
   effect removal. (B) PCA plot after batch effect removal. (C) Univariate
   Cox regression analysis results for 14 CPSGs in the [109]GSE21257 and
   TARGET cohorts.

   GO enrichment analysis revealed that the 14 CPSGs were predominantly
   enriched in the biological process (BP) of “pattern specification
   process” (Fig. [110]2A), the cell component (CC) of “mitochondrial
   inner membrane” (Fig. [111]2B), and the molecular function (MF) of
   “transcription coactivator binding,” “telomeric DNA binding,” and
   “telomerase RNA binding” (Fig. [112]2C). KEGG analysis identified the
   “Arginine and proline metabolism” as the most significant signaling
   pathway (Fig. [113]2D). These findings lay a groundwork for deeper
   investigations into the mechanisms of osteosarcoma and uncover pivotal
   biological processes and pathways that may influence its prognosis.

Fig. 2.

   [114]Fig. 2
   [115]Open in a new tab

   GO and KEGG pathway analyse of CPSGs. The GO enrichment analysis of 14
   CPSGs. The enriched BPs (A), CCs (B), and MFs (C) are noted. (D) The
   KEGG pathway enrichment analysis of 14 CPSGs.

Construction of MLDPS

   Risk prediction models were developed in the TARGET training cohort
   using a combination of 10 well-established ML algorithms. The
   predictive performance of all models was evaluated by calculating the
   average C-index across the TARGET, [116]GSE21257 and merged cohorts.
   The optimal model (MLDPS) was built by selecting the combination of
   LASSO and RSF with the highest average C-index (0.866) (Fig. [117]3A).
   The LASSO algorithm effectively identified the most critical CPSGs,
   while the RSF algorithm filtered the most valuable model. Finally,
   seven potential gene signatures (CTNNBIP1, CORT, DLX2, TERT, BBS4,
   SLC7A1, and NKX2-3) were identified, as illustrated in Fig. [118]3B.

Fig. 3.

   [119]Fig. 3
   [120]Open in a new tab

   Construction and evaluation of the MLDPS effectiveness. (A) The
   C-indexes values of 10 well-established ML algorithm combinations
   across three cohorts. (B) Seven CPSGs selected for the final model
   using the LASSO algorithm. (C) Time-dependent ROC analysis for
   predicting OS in the TARGET cohort. (D) AUC values of the
   time-dependent ROC curves in the TARGET cohort. (E) KM curves for OS in
   the TARGET cohort.

Robust prediction performance of MLDPS

   To estimate the discriminatory capability and clinical utility of the
   MLDPS, we plotted ROC and survival curves. As shown in Fig. [121]3C and
   D, the AUCs for OS in the TARGET cohort at 1, 3, and 5 years were
   0.949, 0.957, and 0.963, respectively. Similarly, the model exhibited
   robust performance in the [122]GSE21257 cohort, achieving AUCs of
   0.954, 0.869, and 0.852 (Fig. [123]4A and B). The merged cohort had
   AUCs of 0.935, 0.913, and 0.912 (Fig. [124]4D and E). These findings
   indicate the robust prognostic capability of the MLDPS model for
   osteosarcoma patients, with AUCs ranging from 0.852 to 0.963 for OS at
   1, 3, and 5 years across all three cohorts. Furthermore, KM analysis
   confirmed that the MLDPS could effectively stratify osteosarcoma
   patients into distinct high- and low-risk score groups. Patients in the
   high-risk score group, exhibited significantly worse survival outcomes
   compared to the low-risk score group across all three cohorts
   (P < 0.0001 for all cohorts) (refer to Figs. [125]3E and [126]4C, and
   Fig. [127]4F).

Fig. 4.

   [128]Fig. 4
   [129]Open in a new tab

   Prediction performance of MLDPS in the [130]GSE21257 and merged
   cohorts. Time-dependent ROC analysis for predicting OS in the (A)
   [131]GSE21257 and (D) merged cohorts. AUC values of the time-dependent
   ROC curves in the (B) [132]GSE21257 and (E) merged cohorts. KM curves
   for OS in the (C) [133]GSE21257 and (F) merged cohorts.

Re-evaluation of 10 previously published signatures in osteosarcoma

   The expeditious advancement of high-throughput sequencing has boosted
   the stratified management and precise therapeutic approaches for
   tumors^[134]60. Recently, numerous prognostic signatures for
   osteosarcoma have been developed using ML algorithms, including RSF and
   CoxBoost, leveraging extensive and reliable
   datasets^[135]23,[136]61,[137]62. A total of 10 prognostic signatures
   were collected from published sources to perform a comparative
   assessment of the MLDPS’ predictive efficacy relative to these models.
   The evaluation was carried out using the C-index metric across three
   distinct cohorts. These results demonstrated that the MLDPS achieved
   significantly higher predictive accuracy than the other models across
   all three cohorts, consistently ranking first (Fig. [138]5). This
   underscores the robustness and reliability of the MLDPS. Notably,
   certain prognostic signatures, such as those reported in
   PMID34894177^[139]23, PMID37234254^[140]9, and PMID34337075^[141]32,
   exhibited higher C-index values in one or two cohorts but performed
   poorly in the others. This discrepancy may stem from impaired
   generalization ability caused by overfitting, as plotted in
   Fig. [142]5.

Fig. 5.

   [143]Fig. 5
   [144]Open in a new tab

   Comparisons of C-indexes values between MLDPS and 10 expression-based
   signatures. C-indexes values of MLDPS and 10 published signatures in
   the TARGET, [145]GSE21257, and merged cohorts. *P < 0.05, **P < 0.01,
   ***P < 0.001, and ****P < 0.0001.

The clinical signature of MLDPS

   Univariate Cox regression analysis revealed that the MLDPS, published
   signatures (excluding PMID35598382^[146]33), and metastasis were
   statistically significant within the merged cohort (Fig. [147]6A).
   However, no statistically significant differences were observed for
   gender or age (Fig. [148]6A). Further analysis was conducted to compare
   OS and metastasis between high- and low-risk score groups. Compared to
   the low-risk score group, the high-risk score group exhibited
   significantly inferior OS (Fig. [149]6B). Patients in the high-risk
   score group showed a greater propensity for developing metastasis,
   potentially contributing to their unfavorable prognosis (Fig. [150]6C).

Fig. 6.

   [151]Fig. 6
   [152]Open in a new tab

   The clinical signature of MLDPS and other signatures in the merged
   cohort. (A) Univariate Cox regression analysis of MLDPS and other
   signatures in the merged cohort. (B) Comparison of osteosarcoma between
   the high- and low-risk score groups. Left: Differences in the MLDPS
   risk scores between the OS groups. Right: Composition percentage of the
   high- and low-risk score groups in osteosarcoma. (C) Comparison of
   metastasis between the high- and low-risk score groups. Left:
   Differences in the MLDPS risk scores between the metastasis groups.
   Right: Composition percentage of the high- and low-risk score groups in
   metastasis.

   Collectively, the timeROC curve, KM survival analysis, and Cox
   regression analysis outcomes consistently demonstrated that the MLDPS
   accurately forecasted the prognosis of osteosarcoma patients across all
   three cohorts. These findings suggest that the MLDPS holds promise as a
   valuable tool for clinical application.

The underlying biological mechanisms of MLDPS

   Heatmaps (Fig. [153]7A and B) depicted the top 50 genes showing the
   most significant positive and negative correlations with the risk
   score. GO enrichment analysis of co-expressed genes positively
   associated with the MLDPS exhibited significant enrichment in various
   BPs, namely “ossification,” “ncRNA processing,” “embryonic organ
   morphogenesis,” “ribosome biogenesis,” “rRNA metabolic process,” and
   “rRNA processing” (Fig. [154]7C). Enrichment was observed in CCs such
   as “ribosome,” “mitochondrial matrix,” “ribosomal subunit,” “asymmetric
   synapse,” and “cytosolic ribosome” (Fig. [155]7D). With respect to MFs,
   “structural constituent of ribosome” (Fig. [156]7E) was significantly
   enriched. Notably, KEGG analysis showed “Ribosome” as the only pathway
   with significant enrichment (Fig. [157]7F).

Fig. 7.

   [158]Fig. 7
   [159]Open in a new tab

   Function and pathway analysis of MLDPS. (A) Heatmap of the top 50
   positive correlation genes with the MLDPS risk score; (B) Heatmap of
   the top 50 negative correlation genes with the MLDPS risk score; The GO
   enrichment analysis of the top 500 positive correlation genes,
   including BP (C), CC (D) and MF (E); (F) The KEGG pathways enrichment
   analysis of the top 500 positive correlation genes.

   To understand the biological implications of MLDPS, we performed a GSEA
   analysis. The findings indicated a positive correlation between a
   high-risk score and the upregulation of key gene sets involved in
   translation, rRNA processing in the nucleus and cytosol, the major
   pathway of rRNA processing in the nucleolus and cytosol, and other
   biological pathways (Supplementary Fig. 1A). GSVA was utilized to
   analyze the effects of MLDPS on biological pathways in osteosarcoma. A
   heatmap was used to visualize the top 20 biological processes with
   statistically significant differences, ranked in ascending order of
   their P-values. As depicted in Supplementary Fig. 1B, significant
   pathway differences were observed between the high- and low-risk score
   groups. These observations may offer a potential explanation for the
   unfavorable prognosis associated with the high-risk score group.

Immune landscape of MLDPS

   A thorough analysis of the TME in osteosarcoma was conducted using the
   ‘IOBR’ R package^[160]51. The heatmap exhibited variations in the
   immunological composition between high- and low-risk score groups
   (Fig. [161]8A). As shown in Fig. [162]8B, the high-risk score group
   demonstrated notably diminished ESTIMATEScore, ImmuneScore,
   MicroenvironmentScore, and StromalScore, suggesting an
   immunosuppressive condition.

Fig. 8.

   [163]Fig. 8
   [164]Open in a new tab

   Landscapes of the TME between the high- and low-risk score groups. (A)
   Heatmap indicates the difference in the TME between the high- and
   low-risk score groups. (B) The diversities of the ESTIMATEscore,
   ImmuneScore, MicroenvironmentScore, and StromalScore between the high-
   and low-risk score groups. *P < 0.05, **P < 0.01, ***P < 0.001,
   ****P < 0.0001.

   Further analysis identified a negative correlation between the risk
   score and the gene expression levels of most chemokines and chemokine
   receptors (Fig. [165]9A). Specifically, the correlation coefficients
   between the risk score and the expression levels of CCL8, CSF1, CCR1,
   CCR2, CCR4, and CCR7 were all below − 0.2, with corresponding P-values
   less than 0.05 (Fig. [166]9B).

Fig. 9.

   [167]Fig. 9
   [168]Open in a new tab

   The correlations between MLDPS and gene expression of chemokines and
   their receptors. (A) Correlation heatmap between MLDPS and expression
   of chemokines and their receptors. (B) Correlation analysis of MLDPS
   and CCL8, CSF1, CCR1, CCR2, CCR4 and CCR7 gene expression. *P < 0.05,
   **P < 0.01, ***P < 0.001, ****P < 0.0001.

   TIDE algorithms, widely used for predicting responses to cancer
   immunotherapy, indicate suboptimal immunotherapy outcomes when TIDE
   values are elevated^[169]59. In the merged cohort, TIDE algorithms
   predicted responses to immunotherapy for 138 patients. Of these, 46
   were identified as responders (TIDE score < 0), whereas the remaining
   92 were classified as non-responders (Fig. [170]10A). To evaluate the
   predictive power of the MLDPS for cancer immunotherapy response,
   dysfunction and TIDE scores were calculated. Osteosarcoma patients with
   elevated MLDPS risk scores exhibited decreased dysfunction and TIDE
   scores, indicating a higher likelihood of responding to immunotherapy
   in the merged cohort (Fig. [171]10B − D). Overall, these findings
   indicat that individuals in the high-risk score group are more likely
   to benefit from immunotherapy.

Fig. 10.

   [172]Fig. 10
   [173]Open in a new tab

   Immunotherapeutic response prediction by TIDE. (A) Patients were
   classified as potential responders or non-responders to immunotherapy
   based on their TIDE values in the merged cohort. (B) Differences in
   Dysfunction scores between high- and low-risk score groups (P < 0.001).
   (C) Percent weight of clinical response to immunotherapeutic in high-
   and low-risk score groups. (D) Differences in TIDE scores between high-
   and low-risk score groups (P = 0.02).

Comparative analysis of drug sensitivity between high- and low-risk score
groups

   Precision medicine necessitates clinicians to promptly recognize
   patients sensitive to different treatments in order to provide tailored
   and personalized interventions^[174]63,[175]64. Hence, the drug
   sensitivity between the two risk score groups was evaluated by the
   ‘oncoPredict’ R package. Drug sensitivity analysis demonstrated a
   notable discrepancy in the drug response between the two risk score
   groups. Specifically, the high-risk score group exhibited a correlation
   with decreased IC50 values for drugs such as ABT737_1910, AZ6102_2109,
   AZD5153_1706, AZD5991_1720, AZD6738_1917, and BDP.00009066_1866.
   Conversely, the low-risk score group was correlated with decreased IC50
   values for drugs such as AZ960_1250, BMS.754807_2171,
   Entospletinib_1630, IAP_5620_1428, PF.4708671_1129, and Ribociclib_1632
   (Fig. [176]11). Precise prediction of drug sensitivity may have the
   potential to enhance patient treatment outcomes.

Fig. 11.

   [177]Fig. 11
   [178]Open in a new tab

   Association between the risk model and drug sensitivity in
   osteosarcoma. Drug sensitivity of osteosarcoma patients with high- and
   low-risk score. *P < 0.05, **P < 0.01, ***P < 0.001.

Discussion

   Osteosarcoma is a rare malignant tumor originating in bone tissue that
   predominantly afflicts adolescents and young adults^[179]65,[180]66.
   Clinicians and researchers face critical challenges due to the lack of
   readily available biomarkers for screening, stratified treatment, and
   prognostic follow-up, potentially resulting in inadequate
   treatment^[181]67. To bridge these gaps, novel methodologies are needed
   to identify osteosarcoma biomarkers that accurately reflect the
   biological characteristics of the disease.

   Recent studies have developed various prognostic signatures for
   osteosarcoma, many of which focus on specific biological pathways such
   as glycolysis, ferroptosis, and immunity^[182]68–[183]70. However, this
   focused approach neglects crucial biological processes that may also
   contribute to the oncogenesis and progression of osteosarcoma.
   Moreover, existing research indicates that researchers tend to select
   modeling algorithms based on their knowledge limitations and personal
   preferences^[184]23,[185]61,[186]71. To compensate for these