Abstract

Introduction

   Osteosarcoma (OS) is an invasive and lethal malignancy showing a low
   5 year survival rate, underscoring the need for identifying new
   therapeutic targets and their inhibitors to enhance prevention and
   treatment strategies.

Methods

   In this study, in vitro experiments including CCK-8 assay,
   anchorage-independent growth assays, and plate cloning assays were used
   to detect the anti-proliferation ability of natural compound tangeretin
   towards OS cells. An integrated approach was performed including WGCNA
   and network pharmacology to identify the key genes of tangeretin for
   the treatment of OS. Multigene diagnostic model, reverse transcription
   quantitative polymerase chain reaction (RT-qPCR) analysis along with
   molecular docking analysis were further conducted to validate the
   reliability of the targets obtained by bioinformatics methods.
   Single-cell and gene enrichment analyses were chosen to explore the
   mechanism of tangeretin in OS.

Results

   Hub genes identified by the bioinformatics strategy included ABCC1,
   AKR1C3, BACE1, and CA12. RT-qPCR validation and molecular docking
   analysis confirmed that ABCC1 and BACE1 were the most likely potential
   targets. A multigene diagnostic model for OS demonstrated moderate
   accuracy of the hub genes. Single-cell sequencing results indicated
   that these two hub targets were closely related to OS and provided more
   potential mechanisms for targeting OS.

Conclusion

   Our research highlights the therapeutic potential of the natural
   compound tangeretin and its antineoplastic mechanisms in OS. It offers
   new insights into the molecular mechanisms of tangeretin, paving the
   way for the development of effective OS treatments.

   Keywords: Osteosarcoma, Natural compound, Tangeretin, Therapeutic
   targets, Network pharmacology, WGCNA

Introduction

   As a primary bone malignancy largely impacting adolescents and kids,
   Osteosarcoma (OS) represents the most prevalent form of bone cancer in
   this age group [[38]1]. It accounts for approximately 5% of all
   pediatric tumors and an alarming 20% of all bone malignancies [[39]1].
   Even though chemotherapy, radiotherapy, and surgery are regarded as the
   most efficient clinical therapies, the overall survival proportion only
   reaches about 15–17% [[40]2]. Hence, it is highly indispensable to
   identify efficient molecular targets and their suppressors
   [[41]3–[42]5].

   Natural compounds that were derived from fruits, vegetables, and
   medicinal plants were regarded as promising inhibitor resources for
   tumors [[43]6–[44]8]. Previous studies demonstrated that many
   flavonoids obtained from medicinal plants displayed promising
   anti-cancer properties through carcinogen inactivation,
   anti-proliferative effect, cell cycle arrestment, and apoptosis
   induction, etc. [[45]9, [46]10]. Tangeretin, which was found
   comprehensively in fruits of the citrus genus, pertains to a set of
   polymethoxylated flavonoids [[47]11]. Tangeretin elevated the Bax
   expression and decreased the expression of Bcl-2 in the BFTC-905 cell
   line, which also elicited premature and late apoptosis [[48]12].
   Tangeretin could enhance the anticancer activity of metformin by
   improving the activation production of both AMPK and ROS [[49]13].
   Nevertheless, little work has been done to value the anti-cancer
   activities and elucidate the underlying mechanisms of tangeretin on OS.

   This study first confirmed the inhibitory effect of tangeretin on OS
   and explored its potential mechanisms through a combined approach of
   bioinformatics and network pharmacology. A multigene diagnostic model
   was developed to assess the accuracy of four identified targets, with a
   moderate accuracy demonstrated ([AUC] = 0.575). ABCC1 and BACE1 were
   identified as the most likely targets through RT-qPCR and molecular
   docking analysis. We identified key core targets and elucidated the
   potential mechanisms of tangeretin in treating OS, providing new
   insights for both OS treatment and the clinical application of natural
   compounds.

Materials and methods

Cell lines and reagents

   Tangeretin (CAS# 481-53-8) was purchased from Shanghai Rechemscience
   (Shanghai, China). Cell substrate (RPMI1640, DMEM) and Fetal bovine
   serum (FBS) were supplied by BI (Shenzhen, China). RevertAid™ First
   Strand Scientific cDNA Synthesis Kit was supplied by Thermo Fisher
   Scientific (USA). Cell Counting Kit-8 Assay Kit was offered by
   Meilunbio (Dalian, China). Human OS cell lines 143B and KHOS were
   provided by the cell bank of the Chinese Academy of Sciences
   (Shanghai). 143B and KHOS cell lines were sustained in McCoy’s 5A
   medium (Corning) integrated with 1% streptomycin-penicillin (Gibco) and
   10% FBS (Gibco). All cell substrates were preserved under 5% CO[2] at
   37%.

Cell proliferation

   Cell proliferation progress was assessed via Cell Counting Kit-8 Assay
   method. 143B cells and KHOS cells (1 × 10^3 each well) were fostered
   with 96-well plates all night. Concentrations of tangeretin
   (0,5,10,20,40 μM) were integrated into the above cells on next day.
   CCK-8 assay was dispensed into each well of the 96-well plates. The
   reaction time persisted for 1 h within a 5% CO[2] moist hatcher under
   light-exclusion conditions. Absorbance was then gauged
   spectrophotometrically at 450 nm.

Anchorage-independent growth assays

   143B and KHOS cells (8 × 10^3 each well) were suspended in Eagle’s
   Basal Medium (BME) comprising 0.33% agar and 10% fetal bovine serum
   (FBS) in the top layer, with discrepant doses of tangeretin (0, 5, 10,
   20, 40 μM) were integrated into the blended agar in the top layer and
   base layer. The substrates were cultured in a hatcher containing 5%
   CO[2] for approximately 1 week at 37 ℃. Then, colonies were checked and
   counted through IN Cell Analyzer 6000.

Plate cloning assays

   143B cells and KHOS cells (800 each well) were fostered with 6-well
   plates for 8 days, and diverse concentrations of tangeretin (0, 5, 10,
   20, 40 μM) were integrated into the above cells. When the clones had
   grown to the appropriate size, the cell culture medium was removed, and
   clones were subjected to staining with 0.5% crystal violet for a period
   of 4 min. Subsequently, clones were washed with deionized water,
   photographed, and then enumerated through IN Cell Analyzer 6000.

Acquisition and analysis of the microarray data

   We have obtained two microarray data sets related to OS ([50]GSE14359,
   [51]GSE42572) from the Gene Expression Omnibus Database (GEO)
   ([52]https://www.ncbi.nlm.nih.gov/geo/) and their basic information was
   shown in Table [53]1. The R program inSilicoMerging was used to combine
   the datasets [54]GSE14359 and [55]GSE42572, and the batch effect
   [[56]14] was eliminated then using the technique developed by Johnson
   WE et al. [[57]15] to produce the final normalization profile.

Table 1.

   Summary statistics
   Data sets number Platform information OS group Normal control Species
   [58]GSE14359     [59]GPL96            18       2              Homo sapien
   [60]GSE42572     [61]GPL13376         7        5              Homo sapien
   [62]Open in a new tab

Differential expression analysis

   The analysis of differential expression was undertaken by the R package
   linear models for microarray (limma) (version 3.40.6), with the aim of
   identifying differential genes between normal controls and OS groups.
   Genes having an expression value of 0 for over 50% of the instances
   were removed in the acquired expression profiling datasets.
   Subsequently, the voom function was utilized to convert the data, and
   the lmFit function was further employed to carry out multiple linear
   regression. Log-odds of differential expression, moderated F-statistics
   and t-statistics were calculated by eBay's functionality via empirical
   Bayes mediacy of the standard errors with regard to a representative
   value, consequently achieving the distinctive significance of each
   gene.

   The genes that displayed differential expression genes (DEGs) between
   normal controls and OS groups were chosen via using a filtering
   criterion of |log 2 FC|> 1 and p < 0.05.

Weighted gene co-expression network analysis (WGCNA)

   The Median Absolute Deviation (MAD) for individual genes was computed
   based on their respective expression profiles to assess the variability
   in gene expression across the samples. Initially, the upper 50% of
   genes exhibiting the highest MAD values were removed, as these genes
   exhibited extreme variability that could potentially represent noise
   rather than meaningful biological signals. Next, the goodSamplesGenes
   function from the WGCNA R package was employed to systematically
   identify and eliminate both outlier genes and samples that failed to
   meet quality control standards. This function helped to filter out any
   samples or genes with poor quality expression profiles, ensuring that
   only high-quality data were used for the downstream analysis.
   Subsequently, a scale-free gene co-expression network was constructed
   using the WGCNA method. The sensitivity parameter was set at 3, a value
   chosen to balance the trade-off between sensitivity and specificity in
   the co-expression network construction. This configuration aimed to
   capture the most relevant co-expressed gene pairs without introducing
   too much noise. A module dendrogram was then constructed through
   hierarchical clustering of genes based on their co-expression patterns,
   and a cut line was chosen to separate genes into distinct modules. The
   chosen threshold for module merging was set at a distance of 0.25,
   which meant that modules with a similarity distance lower than this
   value were merged into one co-expression module. This resulted in the
   identification of 23 unique co-expression modules. To assess the
   biological relevance of these modules, a correlation analysis was
   performed between each module and the clinical traits associated with
   OS, such as tumor progression, survival rates, and treatment response.
   The modules that exhibited a significant correlation with OS clinical
   scores were selected for further exploration. To identify the most
   relevant genes within these co-expression modules, Gene Significance
   (GS) and Module Membership (MM) values were calculated for each gene.
   GS measures the correlation between a gene’s expression profile and the
   clinical traits of interest, whereas MM indicates the strength of the
   association between a gene and its corresponding module. Genes with
   high GS and MM values were considered to be central to the module and
   relevant to the clinical traits of OS. A MM value greater than 0.8 and
   a GS value greater than 0.1 were established as the boundary criteria
   for selecting key OS-related genes.

OS feature genes extraction and functional enrichment analysis

   DEGs were cross-referenced with the candidate genes identified through
   WGCNA to derive the OS-related feature genes [[63]16]. To investigate
   the signaling pathways and distinctive biological characteristics of
   the feature gene, Gene Ontology (GO) and Kyoto Encyclopedia of Genes
   and Genomes (KEGG) functional enrichment analysis were conducted based
   on the R package named clusterProfiler (version 3.14.3), with a
   significant set to  p < 0.05.

Tangeretin targets prediction

   The prediction of tangeretin targets was performed using the Swiss
   Target Prediction database ([64]http://swisstargetprediction.ch/) and
   the PharmMapper database
   ([65]http://www.lilab-ecust.cn/pharmmaper/submitfile.html). The species
   Homo sapiens was selected for taxonomic categorization. Potential
   targets exhibiting a probability score > 0 were initially identified
   and subsequently processed through a deduplication procedure to obtain
   the final results of tangeretin-associated molecular targets. The
   target protein sequences were translated into corresponding gene
   identifiers, followed by the systematic exclusion of non-human genes
   through the utilization of the Uniprot database
   ([66]http://www.uniprot.org/).

Establishment of drug-target-illness connection networks and analysis of
functional enrichment

   The recognition of overlapping between tangeretin targets and OS
   feature genes was performed using Venny 2.1.0, an online bioinformatics
   tool (available at
   [67]http://bioinformatics.psb.ugent.be/webtools/Venn/). The
   protein–protein interaction network was established utilizing the
   STRING database ([68]http://string-db.org/) [[69]17]. The
   ClusterProfiler package (version 3.14.3) in R was used to conduct GO
   and KEGG enrichment analysis. Statistical significance was determined
   using a threshold of p < 0.05 for all enrichment analyses.

Hub genes identification and GSEA analysis for tangeretin for the treatment
of OS

   The identification of key regulatory genes associated with tangeretin
   treatment in OS was obtained through the integration of DEGs, WGCNA
   modules, and target gene datasets. The protein–protein interaction
   network was established using the STRING database. The identified hub
   genes comprise ABCC1, AKR1C3, BACE1, and CA12. To investigate the
   underlying signaling cascades and potential biological functions of the
   target genes, we performed a comprehensive Gene Set Enrichment Analysis
   (GSEA) [[70]18]. Genes were ranked based on differential expression and
   their enrichment across multiple important pathways was analyzed. The
   Enrichment Score (ES) and Normalized Enrichment Score (NES) were
   calculated for each gene set, with statistical significance determined
   by p-values and false discovery rates (FDR).

Hub genes validation

   The transcriptomic profiling data, including raw read counts and
   associated clinical metadata for OS samples, were acquired from the
   TARGET database (accessible at
   [71]https://ocg.cancer.gov/programs/target) at level 3 RNA sequencing
   resolution. The statistical significance of survival differences among
   two or more distinct groups was assessed by the log-rank test within
   the Kaplan–Meier survival analysis. Subsequently, the predictive
   performance of the model was evaluated through time-dependent receiver
   operating characteristic (ROC) curve analysis. Feature extraction was
   performed by the Least Absolute Shrinkage and Selection Operator
   (LASSO) regression method, utilizing a tenfold cross-validation
   approach to ensure robust model evaluation. The glmnet package
   implemented in the R programming environment was performed to complete
   the analysis mentioned above. A predictive model was developed through
   multivariate Cox proportional hazards regression analysis. The analysis
   mentioned above was performed utilizing the survival package within the
   R programming environment. Initially, multivariate Cox proportional
   regression analysis was conducted utilizing the stepwise function to
   identify the ideal model, which will subsequently serve as the
   definitive predictive model. The statistical analysis of survival data
   was performed using Kaplan–Meier methodology. We acquired the p-value
   through the log-rank test, which was used to determine whether there
   was a significant difference between the survival curves of different
   groups. We also obtained the hazard ratio (HR) and its 95% confidence
   interval through univariate Cox regression, which was performed to
   assess the impact of a certain factor on survival risk and the
   uncertainty range of this impact. The statistical analyses were
   executed utilizing R packages based on the Foundation for R Statistical
   Computing (version 4.0.3 released in 2020). Statistical significance
   was determined using a threshold of  p < 0.05. A superior predictive
   performance was indicated by higher area under the curve (AUC) values
   along with lower p-values of Log-rank. We identified the genes
   demonstrating predictive performance with maximum AUC values, as
   determined through survival ROC analysis during the initial stage of
   model optimization.

Computer docking model

   In silico docking was conducted using the Schrödinger Suite 2019
   software. The three-dimensional (3D) structure of tangeretin was
   obtained from PubChem. The crystal composition of ABCC1, AKR1C3, BACE1,
   and CA12 were offered by the PDB database. The stimulation of docking
   as well as the choice of the genetic algorithm settings were performed
   by AutoDockTools software (version 1.5.7). The molecular docking
   analysis and preparation were conducted using Pymol software.

Evaluating the expression of chosen genes through RT-qPCR

   TRIzol was utilized with the aim of isolating the overall mRNA of OS
   cells guided by the manufacture’s standard instructions. The appraisal
   of RNA sample quality and quantity was conducted through distinct
   analytical methods, utilizing conventional agarose gel electrophoresis
   for qualitative evaluation and NanoDrop spectrophotometric analysis for
   quantitative assessment. The transcription of complementary DNA (cDNA)
   from RNA was performed using the RevertAid First Strand cDNA Synthesis
   Kit (Thermo Fisher Scientific, USA), following the protocol provided by
   the manufacturer. RT-qPCR analysis was conducted utilizing the RealQ
   Plus Master Mix Green reagent in conjunction with the Applied
   Biosystems 7500 Real-Time PCR platform. The specific RT-qPCR primer
   sequences are presented in Table [72]2. GAPDH served as an internal
   reference gene to quantify and normalize target gene expression levels,
   as well as to normalized the relative mRNA expression levels for the
   purpose of data analysis. Next, the 2^−△△CT approach was performed to
   analyze the data. Each sample was tested three times to ensure the
   accuracy of the data.

Table 2.

   The details of specific RT-qPCR primer sequences
   Gene name Forward primer (5’ to 3’) Reverse primer (5’ to 3’)
   ABCC1     AGGAATTGGTTGTATAGAAG      AAGGATGTTAGAGGTGAT
   AKR1C3    ACCTCTATCTTATTCATTCTC     ATCCTTACACTTCTCCAT
   BACE1     ACCTCTTCTATCTAATCCTT      AACCACTCTGAATTATGTAA
   CA12      AACCTGGAATGGAGACTT        GAATATGGAGTATCTTAGTAGCA
   [73]Open in a new tab

Analyze the single-cell data using t-SNE method

   The single-cell datasets along with their annotation outcomes were
   obtained in h5 file format from the Tumor Immune Single-cell Hub
   (TISCH) database. The MAESTRO and Seura packages within the R
   programming environment were utilized in the aim of analyzing the
   single-cell RNA sequencing data. The cells were re-clustered using the
   t-SNE method.

Statistical analysis

   GraphPad Prism 8.0 was used for all statistical analyses, with
   quantitative results presented as mean ± SD. To determine significant
   differences between groups, either the unpaired Student’s t-test or
   one-way analysis of variance (ANOVA) was employed, depending on the
   experimental design. Post-hoc tests were conducted where appropriate to
   further analyze multiple group comparisons. To ensure sufficient
   statistical power, power analysis was performed prior to the
   experiments to justify the sample sizes for the in vitro assays,
   including cell proliferation, colony formation, and migration assays.
   Power calculations were conducted based on effect size, variability,
   and significance thresholds to confirm that the study was sufficiently
   powered to detect meaningful differences. Specifically, power
   calculations were aimed to achieve a power of 80% or greater with a
   significance level of 0.05, ensuring that the study had a high
   probability of detecting true effects and minimizing the risk of type I
   and type II errors. Statistical significance was determined with the
   following thresholds: *p < 0.05, **p < 0.01, and ***p < 0.001.

Results

Tangeretin inhibits the proliferation of OS

   After preliminary screening of natural compounds, we found that
   tangeratin has an inhibitory effect on OS. In an effort to delve deeper
   into the effects of tangeretin, we evaluated the proliferation of 143B
   and KHOS cells following exposure to varying concentrations of
   tangeretin (0, 10, 20, and 40 μM) utilizing proliferation assays.
   Findings illustrated an important dose-dependent inhibition of
   proliferation in 143B and KHOS cells post-treatment with tangeratin
   (Fig. [74]1A–B). Quantitative analysis of plate clone formation assays
   revealed a marked suppression of clonogenic potential in cells exposed
   to tangeretin, as evidenced by both decreased colony diameter and lower
   colony counts. (Fig. [75]1C–D). Comparative analysis presented that
   tangeretin exhibited superior inhibition of colony formation in 143B
   and KHOS (Fig. [76]1E–F). These findings implied that tangeretin
   treatment efficiently hindered cellular proliferation and
   colony-forming capacity in a dose-dependent manner.

Fig. 1.

   [77]Fig. 1
   [78]Open in a new tab

   Tangeretin attenuates OS cell proliferation. A, B The influences of
   discrepant tangeretin concentrations on cell progress capacity of 143B
   and KHOS at 1 day, 2 days, and 3 days were examined via the CCK-8
   assay.; C, D Colony formation cell progress trial was executed to
   appraise the impact of tangeretin (0, 5, 10, 20, 40 μM) on the progress
   of 143B (C) and KHOS (D). E, F Anchorage-independent cell progress
   trial was executed to appraise the impact of tangeretin (0, 5, 10, 20,
   40 μM) on the progress of 143B (E) and KHOS (F). Statistical analysis
   was executed through the Students’ unpaired t-test. *p < 0.05,
   **p < 0.01, and ***p < 0.001 signify key disparities in the groups. The
   outcomes aligned with three isolated experiments

Tangeretin target prediction

   To further investigate the mechanism of tangeratin in OS, we firstly
   utilized two computational prediction platforms to perform
   comprehensive target prediction analyses for tangeretin. The molecular
   structure of tangeretin, represented by either SMILES notation or SDF
   format, was entered into the two computational platforms: Swiss Target
   Prediction (accessible at [79]http://www.Swisstargetprediction.ch/) and
   the PharmMapper database
   ([80]http://www.lilab-ecust.cn/pharmmapper/submitfile.htlm). Human
   subjects were screened, and potential targets with a probability score
   greater than zero were identified. Duplicate entries were subsequently
   eliminated to derive a refined set of biologically relevant targets.
   The protein targets were mapped to their corresponding genes through
   the Uniprot database (available at [81]http://www.uniprot.org/),
   yielding a total of 345 target genes. This approach represents a vital
   step in elucidating the pharmacological profile and potential
   therapeutic applications of tangeretin at the molecular level.

Differential genes identification between OS and normal controls

   We employed bioinformatics approaches to explore the molecular
   signatures of OS, with particular emphasis on identifying protein
   expression profiles that distinguish OS groups from normal controls.
   The integration of [82]GSE14359 and [83]GSE42572 datasets was conducted
   using the inSilicoMerging R package jointly with the approach described
   by Johnson et al. [[84]15]. This method enabled the effective removal
   of batch effects, resulting in a consolidated expression data profile
   comprising 32 distinct samples. Following batch effect elimination, it
   was apparent that the data distribution between each dataset was
   consistent according to the bar plot, with median values aligning along
   a single horizontal axis. This result indicated successful
   normalization of inter-batch variations, suggesting that the datasets
   were now comparable for subsequent analyses. (Fig. [85]2A and 2B). The
   expression density plot revealed significant heterogeneity in sample
   distributions across datasets before batch effect elimination,
   indicative of substantial technical variation. Following batch effect
   removal, the datasets demonstrated improved alignment, with comparable
   distribution patterns characterized by similar central tendencies and
   dispersion measures (Fig. [86]2C and 2D). The spatial distribution
   patterns of individual datasets were visualized using uniform manifold
   approximation and projection (UMAP) dimensionality reduction, with
   comparative analyses conducted both prior to and following batch effect
   elimination (Fig. [87]2E and 2F).

Fig. 2.

   [88]Fig. 2
   [89]Open in a new tab

   Normalization procedure for integrating multiple datasets from GEO. A
   Visualization of gene expression distributions across both datasets
   prior to normalization. B Comparative analysis of expression profiles
   following normalization procedures. C Density distribution patterns of
   gene expression levels in unnormalized datasets. D Normalized
   expression density profiles demonstrating data standardization effects.
   E UMAP dimensional reduction analysis of pre-normalized datasets. F
   UMAP visualization of normalized datasets illustrating clustering
   patterns post-normalization

   Following the normalization of the expression profile, differential
   gene expression analysis was performed using stringent criteria (|log2
   fold change|> 1, p-value < 0.05), which identified a significant number
   of DEGs. The DEGs were graphically represented through a volcano plot
   (Fig. [90]3A), accompanied by a heatmap that highlighted the 50 most
   significantly elevated-regulated and the 50 most significantly
   reduced-regulated genes (Fig. [91]3B).

Fig. 3.

   [92]Fig. 3
   [93]Open in a new tab

   The DEG analysis resulted from the [94]GSE14359 and [95]GSE42572
   datasets. A volcano plots where up-regulated genes are denoted in red,
   while down-regulated genes are marked in green. B Hierarchical
   clustering heatmap of the 50 most significantly differentially
   expressed genes between OS and normal tissue samples

Weighted gene-co-expression network analysis (WGCNA)

   The MAD was computed for individual genes based on the normalized
   expression profile. Following this, half of the genes exhibiting the
   lowest MAD values were excluded from subsequent analyses. Outlier
   detection and sample filtering were performed by the Sample Genes
   algorithm implemented in the WGCNA R package. Subsequently, a
   scale-free gene co-expression network was constructed through WGCNA
   analysis, with the soft-thresholding power parameter determined at 3.
   This configuration yielded an average network connectivity of 5.87 and
   achieved a scale-free topology fit index of 0.95 (Fig. [96]4A, [97]B).
   With a module merging threshold of 0.25, sensitivity parameter adjusted
   to 3, and minimum module size established at 30, 23 unique
   co-expression modules were successfully identified (Fig. [98]4C). The
   parameter selection represents a balanced method between biological
   relevance and module specificity, as proved by the resulting module
   count and size distribution. Subsequently, through comprehensive
   correlation analyses between individual module and clinical traits
   (Fig. [99]4D), as well as the association between module membership
   (MM) and gene significance (GS), the turquoise module containing 178
   genes emerged as the most significant candidate for subsequent
   exploration (correlation coefficient = -0.58, p = 4.7e^−4). The scatter
   plot analysis focusing on the correlation between GS and MM
   demonstrated a highly significant association, with a p-value of
   2.1e-^67 (Fig. [100]4E).

Fig. 4.

   [101]Fig. 4
   [102]Open in a new tab

   Outcomes of the WGCNA. A Scale-free topology of the standardized
   expression profile. B Mean connectivity values derived from the
   standardized expression profile. C Hierarchical clustering dendrogram
   from gene expression profiles. D The association between distinct gene
   modules and clinical traits, where light blue and blue hues represent
   positive and negative correlations, respectively. E The association
   between the degree of module membership (MM) and gene significance (GS)
   within the turquoise module was examined

Identification and functional annotation of OS related genes and pathway
enrichment evaluation

   Through the intersection between the 178 genes identified by WGCNA and
   the 635 DEGs, a subset of 130 overlapping genes was obtained
   (Fig. [103]5A). These genes were subsequently selected as
   characteristic biomarkers for OS progression. Functional annotation and
   pathway enrichment analyses were conducted on the 130 candidate genes
   utilizing The GO and KEGG databases. Gene ontology biological process
   (GO-BP) analysis revealed essential enrichment of the identified
   feature genes in critical cellular pathways, particularly associated
   with cellular demise mechanisms, including both programmed cell death
   and apoptosis (Fig. [104]5B). The Gene Ontology Cellular Component
   (GO-CC) analysis unveiled significant enrichment primarily in
   extracellular compartments, including the extracellular matrix,
   vesicular structures, and specific subdomains of the extracellular
   region (Fig. [105]5C). The top three significantly enriched categories
   identified through Gene Ontology Molecular Function (GO-MF) analysis
   were associated with cell adhesion molecule binding, structural
   components of the extracellular matrix, and integrin-mediated binding
   interactions (Fig. [106]5D). Furthermore, according to KEGG pathway
   analysis, the signature genes were substantiated predominantly
   associated with key biological processes, including lysosomal function,
   autophagic pathways, and metabolic modulation of
   glycolysis/gluconeogenesis (Fig. [107]5E).

Fig. 5.

   [108]Fig. 5
   [109]Open in a new tab

   Identification of OS characteristic genes and their functional
   annotation. A The overlapping gene sets between WGCNA module genes and
   DEGs. B Bubble chart illustrating the KEGG pathway enrichment results,
   highlighting the top 10 significantly enriched pathways among
   OS-associated genes. C–E GO enrichment outcomes, depicting the premier
   10 enriched biological procedures categories (C), premier 10 cellular
   component terms (D), and the premier 10 molecular functions categories
   (E). In all bubble plots, the color gradient and bubble diameter
   respectively indicate the statistical significance (p-values) and the
   number of genes involved in each term

Tangeretin treatment OS potential targets acquisition and single gene GSEA
analysis

   Through comprehensive analysis, four candidate target genes of
   tangeretin in OS were identified by intersecting 130 characteristic
   genes with 345 obtained tangeretin-associated target genes. This
   integrative approach revealed key molecular targets potentially
   mediating the therapeutic effects of tangeretin in OS (Fig. [110]6A).
   To elucidate the functional significance of target genes in OS
   progress, we conducted GSEA analysis at the individual gene level to
   identify potential pathways involved. The analysis implied that
   tangeretin predominantly modulates genes associated with critical
   biological processes, including cell cycle regulation, proteasomal
   degradation, meiotic division in oocytes, base excision repair
   mechanisms, and DNA replication pathways (Fig. [111]6C-D).

Fig. 6.

   [112]Fig. 6
   [113]Open in a new tab

   Identification and functional characterization of potential molecular
   targets associated with the therapeutic effects of tangeretin in OS. A
   The overlapping gene sets between OS-related molecular signatures and
   tangeretin targets. B Circular visualization of KEGG enrichment pathway
   enrichment analysis, highlighting the biological processes influenced
   by tangeretin in OS. C, D, E, F GSEA enrichment analysis of ABCC1 (C),
   AKR1C3 (D), BACE1 (E), and CA12 (F)

Hub genes validation

   A predictive polygenic model was established through the implementation
   of logistic regression analysis, utilizing standardized gene expression
   profiles as input data. The model was trained on normalized
   transcriptomic data to establish robust associations between gene
   expression patterns and phenotypic outcomes. This computational
   framework employed statistical learning techniques to integrate
   multiple genetic factors and quantify their collective contribution to
   complex traits. The logistic regression approach provided a
   probabilistic framework for estimating the likelihood of specific
   phenotypic manifestations based on the combined effects of multiple
   genetic loci. As a result, four hub genes (ABCC1, AKR1C3, BACE1, CA12)
   were identified to construct the diagnostic model utilizing stepwise
   regression analysis. The findings indicated that the predictive model
   developed using these key target genes demonstrated significant
   discriminatory power in differentiating OS samples from normal
   controls, achieving an AUC value of 0.575 for the three-year survival
   analysis (Fig. [114]7A-D).

Fig. 7.

   [115]Fig. 7
   [116]Open in a new tab

   A The association between the expression levels of CA12, BACE1, AKR1C3,
   and ABCC1 with survival outcomes in the Target dataset. The upper panel
   displays a gradient of gene expression intensities, categorized by
   distinct color codes representing varying expression clusters. The
   central panel depicts the correlation between gene expression profiles
   and corresponding survival durations and statuses across different
   samples. The lower panel presents a heatmap visualization of gene
   expression patterns. B The KM survival analysis for CA12, BACE1,
   AKR1C3, and ABCC1 within the Target dataset. The log-rank test was
   employed to compare survival differences between expression groups,
   with HR denoting the hazard ratio for the high-expression cohort
   relative to the low-expression group. C The ROC curves as well as AUC
   values for CA12, BACE1, AKR1C3, and ABCC1 at different time points,
   where a higher AUC value indicates a stronger predictive ability of the
   gene

Quantitative assessment of gene expression profiles for targeted genetic
markers by RT-qPCR

   To confirm the findings from computational analyses, we implemented
   reverse RT-qPCR assays targeting ABCC1, AKR1C3, BACE1, and CA12 genes.
   These experiments were conducted using cDNA derived from OS cells that
   had been exposed to tangeretin treatment, along with corresponding
   untreated control samples. The results were shown in Fig. [117]8A-B,
   the expression of BACE1 and ABCC1 was significantly decreased relative
   to the control.

Fig. 8.

   [118]Fig. 8
   [119]Open in a new tab

   A, B RT-qPCR Analysis of BACE1 and ABCC1 Expression: The expression
   levels of BACE1 and ABCC1 were quantified using RT-qPCR in OS cells
   treated with tangeretin. Statistical significance was determined using
   the unpaired Student’s t-test, with *p < 0.05, **p < 0.01, and
   ***p < 0.001 indicating significant differences between the treatment
   and control groups. The data represent results from three independent
   experiments, ensuring reproducibility and reliability of the findings.
   C, D Computational Docking Models: Molecular docking simulations were
   performed to assess the binding affinity of tangeretin with BACE1 (C)
   and ABCC1 (D), providing insights into their potential interactions at
   the molecular level. The docking results were visualized, with binding
   energies indicating the strength of the ligand-receptor interactions.
   E, H t-SNE Clustering of Single-Cell Data for BACE1 (E) and ABCC1 (H):
   t-distributed stochastic neighbor embedding (t-SNE) was used to
   visualize the clustering of single-cell data, highlighting the distinct
   populations of cells expressing BACE1 (E) and ABCC1 H. Each cell type
   was color-coded to show differentiation, helping to identify specific
   cell populations involved in OS. F, I Spatial Distribution of Gene
   Expression: t-SNE plots (F for BACE1 and I for ABCC1) showed the
   spatial distribution of BACE1 and ABCC1 gene expression across
   individual cells. The color gradient indicated the relative expression
   levels, with darker colors representing lower expression and brighter
   colors reflecting higher gene activity. G, J Quantitative Bar Graphs of
   Gene Expression Across Cell Types: Bar graphs (G for BACE1 and J for
   ABCC1) summarize the relative expression levels of these genes in
   different cell types. This comparative analysis provided a clear
   overview of how gene expression patterns varied across various cell
   populations, offering insights into their potential roles in OS
   progression and treatment response

Molecular docking validation

   To further verify whether BACE1 and ABCC1 were therapeutic targets for
   OS, we employed molecular docking analysis to assess the binding
   affinity between ligand molecules and their corresponding receptor
   targets. The computational approach allowed for the systematic
   evaluation of molecular interactions and binding potential at the
   atomic level. The stability of ligand-receptor interactions was
   inversely correlated with binding energy values, where lower energy
   levels indicated stronger molecular associations. Molecular docking
   analyses revealed that binding affinities below −5.0 kcal/mol
   demonstrated optimal binding characteristics, whereas interactions with
   energy values exceeding −4.25 kcal/mol represented baseline binding
   capabilities [[120]19]. The two core proteins (ABCC1 and BACE1) were
   docked against tangeretin using Autodock software, and docking scores
   are shown in Fig. [121]8C, 8D. The docking results were analyzed by
   Pymol software (Fig. [122]8C, 8D). The results showed that the binding
   energy of two core proteins and tangeretin all was ≤ -5 kcal/mol.

Analyze the single-cell data using t-SNE method

   To explore the cellular composition and molecular profiles linked to
   two candidate hub genes, we employed t-distributed Stochastic Neighbor
   Embedding (tSNE) for dimensionality reduction and visualized the
   single-cell transcriptomic data. This unsupervised approach enabled us
   to identify distinct cell populations and their associated gene
   expression patterns. The highest levels of BACE1 expression were
   observed in several cell types, including fibroblasts, malignant cells,
   monocyte-macrophages, endothelial cells, osteoblasts, conventional
   CD4 + T cells, plasma cells, and exhausted CD8 + T cells, all of which
   exhibit associations with OS (Fig. [123]8E–G). In parallel, the highest
   levels of ABCC1 expression were predominantly observed in
   monocyte-derived macrophages, CD8 + exhausted T cells, vascular
   endothelial cells, osteoblasts, fibroblast populations, malignant
   cells, and plasma cell lineages (Fig. [124]8H–J). This expression
   pattern suggested a potential role of ABCC1 in diverse cellular
   processes across multiple tissue types. The consistent upregulation in
   these specific cell populations might indicate functional significance
   in cellular transport mechanisms and drug resistance pathways. Notably,
   the expression profile demonstrated tissue-specific regulation, with
   particularly strong signals detected in both normal and transformed
   cellular compartments. These findings provided valuable insights into
   the potential biological functions and therapeutic implications of
   BACE1 and ABCC1 in OS.

Discussion

   As the most prevalent primary bone malignancy, OS, presents a
   considerable challenge in enhancing patient survival rates and
   treatment globally [[125]20]. Traditional chemotherapy, relying on the
   same drugs since the 1980s, is often ineffective due to resistance and
   severe side effects. The high heterogeneity of OS cells further
   complicates the development of targeted therapies. As a result, there
   is a pressing need for new therapeutic strategies. Natural dietary
   compounds, such as tangeretin, have gained attention for their
   potential health benefits, including cancer prevention. However, the
   therapeutic efficacy and mechanisms of tangeretin in OS remain unclear.
   In this study, we applied Weighted Gene Co-expression Network Analysis
   (WGCNA) to reconstruct gene co-expression networks, aiming to reveal
   key genes and transcriptional regulation mechanisms involved in OS
   pathogenesis.

   ABCC1 (MRP1), a constituent of the ABC transporter superfamily, is
   initially characterized by Cole and colleagues in 1992 [[126]21]. This
   membrane-associated protein has been proven to confer cellular
   resistance to multiple chemotherapeutic agents, including etoposide,
   doxorubicin, and vincristine, through its drug efflux activity
   [[127]22]. The accessibility of the ABCC1 enhancer was significantly
   elevated in various malignancies including Clear Cell Renal Cell
   Carcinoma (ccRCC), Glioblastoma Multiforme (GBM), and Uterine Corpus
   Endometrial Carcinoma (UCEC), representing a subset of genes that
   exhibit downregulation following exposure to ultraviolet radiation
   [[128]23]. The ABCC1 gene is responsible for encoding
   multidrug-resistance protein-1 (MRP1), which plays a pivotal role in
   tumor progression by facilitating drug efflux mechanisms in
   neuroblastoma cells [[129]24] Additionally, ABCC1 contributes to
   tumorigenesis by modulating lipid-signaling pathways in uterine
   leiomyoma cells [[130]25]. Similar to other members of the ABC
   transporter family, ABCC1 functions as a membrane-bound efflux pump
   responsible for the cellular export of various chemotherapeutic
   compounds. Recent studies have demonstrated that the inhibition of
   specific protein kinases through tyrosine kinase inhibitors presents a
   promising novel therapeutic strategy for OS management [[131]26,
   [132]27]. It has also to be taken into account that ABC substrates play
   key roles among tyrosine kinase inhibitors [[133]28, [134]29].
   Tetrandrine-mediated inhibition of ABC transporters has been
   demonstrated to enhance apoptotic processes in OS cells [[135]30],
   while also exhibiting the potential to eradicate OS-derived cancer stem
   cell populations. Moreover, Emerging evidence indicates that OS stem
   cells exhibit phenotypic and functional similarities with cancer stem
   cells (CSCs) derived from various malignancies [[136]31–[137]33].
   Intrinsic chemoresistance in OS is largely driven by the upregulation
   of ABC transporter proteins, a mechanism observed across various tumor
   types. This suggests a shared survival strategy among CSCs,
   contributing to treatment resistance in OS. Based on bioinformatic
   analysis, ABCC1 emerged as a pivotal hub gene in OS, identified through
   its gene expression patterns, network centrality, and functional
   relevance. This highlights ABCC1 as a key player in OS progression,
   suggesting its potential as a therapeutic target for overcoming
   chemoresistance and eliminating OS stem cells.

   The β-site amyloid precursor protein cleaving enzyme 1 (BACE1) gene and
   its related proteins play a multifaceted role in cancer prevention and
   treatment, including regulating cell proliferation, affecting the tumor
   microenvironment, serving as therapeutic targets, and being associated
   with CSCs characteristics. The results indicate that BACE1 could
   potentially serve as a novel therapeutic target in cancer treatment
   strategies [[138]34]. Research investigates the radiosensitization
   effects of BACE1 functional inhibition in various cancer cell lines,
   suggesting BACE1 as a potential radiosensitization target for cancer
   radiotherapy[[139]35]. Elevated expression levels of BACE1 and BACE2
   have been demonstrated to correlate with the progression of multiple
   human malignancies [[140]34]. Emerging evidence indicates that BACE1
   plays a pivotal role in modulating various signaling cascades
   implicated in the development of malignancies. [[141]36]. In mammary
   carcinoma cells, the amyloid precursor protein (APP), which interacts
   with BACE1, has been proven to exert remarkably regulatory effects on
   cellular proliferation processes [[142]37]. The novel function of BACE1
   in promoting cellular viability within pancreatic carcinoma cells was
   initially documented, alongside a comparative analysis of the
   suppressive impacts exerted by various inhibitors on cell survival
   [[143]38]. BACE1/2 fosters cancer progression by affecting tumor
   microenvironmental characteristics, particularly through their
   influence on neutrophil extracellular traps (NETs), and therefore,
   their inhibition has a regulatory effect on tumor growth [[144]39].
   Research indicates that pharmacological inhibition of BACE1 using
   MK-8931 induces phenotypic reprogramming of pro-tumorigenic macrophages
   (pTAMs) into tumor-suppressive macrophages (sTAMs), thereby exerting
   anti-tumor effects in glioblastoma models [[145]40]. BACE1 has emerged
   as a promising therapeutic target in prostate cancer management. The
   expression of this gene has been detected in prostate cancer (PCa)
   tissue specimens, and treatment with the BACE1-specific pharmacological
   inhibitor MK-8931 has demonstrated significant inhibition of PCa cell
   proliferation [[146]41]. The experimental findings demonstrated that
   pharmacological inhibition of BACE1 or genetic knockout of BACE1
   significantly altered TAM polarization. Specifically, these
   interventions facilitated the phenotypic transition from the protumoral
   M2-like state to the antitumoral M1-like state, thereby contributing to
   the observed tumor growth inhibition mediated by BACE1 modulation
   [[147]42]. The transcriptional abundance of BACE1-AS in hepatocellular
   carcinoma specimens demonstrates significant associations with
   clinicopathological parameters, with elevated BACE1-AS expression
   serving as an independent prognostic indicator for diminished overall
   survival and increased risk of disease recurrence in hepatocellular
   carcinoma patients [[148]43]. Targeting BACE1 holds therapeutic
   potential across various cancers, making it a promising molecular
   target for drug development. Its involvement in tumorigenesis warrants
   further preclinical and clinical exploration.

   While the study provides valuable insights into the effects of
   tangeretin on OS and identified key molecular targets, there are
   several limitations that should be considered. First, while the
   computational target prediction and pathway enrichment analyses suggest
   potential targets of tangeretin, the identification of these targets
   relies heavily on in silico methods, which may not fully capture the
   complexity of in vivo interactions. Additionally, the experimental
   validation of these targets was conducted only through in vitro cell
   line models, which may not entirely reflect the tumor microenvironment
   and the full range of genetic heterogeneity observed in OS patients.
   Moreover, although the RT-qPCR and molecular docking analyses confirmed
   the interaction between tangeretin and key targets like ABCC1 and
   BACE1, the study did not include detailed in vivo studies to assess the
   pharmacokinetics, toxicity, or long-term therapeutic efficacy of
   tangeretin, limiting the translation of these findings to clinical
   applications. Furthermore, the focus on a few hub genes may overlook
   other potential molecular players involved in tangeretin's mechanism of
   action, and the absence of clinical trial data leaves unanswered
   questions about the drug's relevance and effectiveness in human OS
   cases. Finally, while single-cell RNA sequencing data provided useful
   insights into the expression profiles of ABCC1 and BACE1 across
   different cell populations, further studies are needed to validate the
   functional significance of these findings in the context of OS
   treatment and progression.

Conclusion

   In conclusion, our comprehensive investigation employing WGCNA, network
   pharmacology, RT-qPCR, and molecular docking has revealed that ABCC1
   and BACE1 emerge as promising molecular targets implicated in the
   regulation of OS. These findings offer novel insights and potential
   avenues for the development of therapeutic interventions and
   pharmacological strategies targeting OS.

Author contributions

   Conceptualization: J. T. R.Y. and Z. Y., Writing—Original Draft: R.Y.;
   Software and Formal analysis: X. L., H. Z. and Y. Y.; Investigation: R.
   S. and Y.W.; Writing—Review & Editing: All authors; Funding
   acquisition: R.Y. X.L. and Z.Y.

Funding

   This work was supported by the program of TCM research in Henan
   Province, China (2022ZY1136, 2022ZY1123, 2024ZY2111), Medical Science
   and Technology Project of Henan Province, China (LHGJ20240454).

Data availability

   The data and materials are all included in this manuscript.

Declarations

Ethics approval and consent to participate

   Not applicable.

Consent for publication

   Not applicable.

Competing interests

   The authors declare no competing interests.

Footnotes

   Publisher's Note

   Springer Nature remains neutral with regard to jurisdictional claims in
   published maps and institutional affiliations.

   Ruoping Yanzhang and Zhaojie Yang contributed equally to this work.

References