Abstract

Background

   Postmenopausal osteoporosis (PMOP) represents as a significant health
   concern, particularly as the population ages. Currently, there is a
   paucity of comprehensive descriptions regarding the immunoregulatory
   mechanisms and early diagnostic biomarkers associated with PMOP. This
   study aims to examine immune-related differentially expressed genes
   (IR-DEGs) in the peripheral blood mononuclear cells of PMOP patients to
   identify immunological patterns and diagnostic biomarkers.

Methods

   The [37]GSE56815 dataset from the Gene Expression Omnibus (GEO)
   database was used as the training group, while the [38]GSE2208 dataset
   served as the validation group. Initially, differential expression
   analysis was conducted after data integration to identify IR-DEGs in
   the peripheral blood mononuclear cells of PMOP. Subsequently, feature
   selection of these IR-DEGs was performed using RF, SVM-RFE, and LASSO
   regression models. Additionally, the expression of IR-DEGs in distinct
   bone marrow cell subtypes was analyzed using single-cell RNA sequencing
   (scRNA-seq) datasets, allowing the identification of cellular
   communication patterns within various cell subgroups. Finally,
   molecular subtypes and diagnostic models for PMOP were constructed
   based on these selected IR-DEGs. Furthermore, the expression levels of
   characteristic IR-DEGs were examined in rat osteoporosis (OP) models.

Results

   Using machine learning, six IR-DEGs (JUN, HMOX1, CYSLTR2, TNFSF8,
   IL1R2, and SSTR5) were identified. Subsequently, two molecular subtypes
   of PMOP (subtype 1 and subtype 2) were established, with subtype 1
   exhibiting a higher proportion of M1 macrophage infiltration. Analysis
   of the scRNA-seq dataset revealed 11 distinct cell clusters. It was
   noted that JUN was significantly overexpressed in M1 macrophages, while
   HMOX1 showed a marked elevation in endothelial cells and M2
   macrophages. Cell communication results suggested that the PMOP
   microenvironment features increased interactions among M2 macrophages,
   CD8^+ T cells, Tregs, and fibroblasts. The diagnostic model based on
   these six IR-DEGs demonstrated excellent diagnostic performance
   (AUC = 0.927). In the OP rat model, the expression of IL1R2 and TNFSF8
   were significantly elevated.

Conclusion

   JUN, HMOX1, CYSLTR2, TNFSF8, IL1R2, and SSTR5 may serve as promising
   molecular targets for diagnosing and subtyping patients with PMOP.
   These results offer novel perspectives on the early diagnosis of PMOP
   and the advancement of personalized immune-based therapies.

   Keywords: Postmenopausal osteoporosis, Immune, Diagnosis, Molecular
   subtype, Biomarkers

1. Introduction

   Osteoporosis is a bone disorder marked by diminished bone strength and
   loss of bone mass, leading to a substantially higher risk of fractures
   [[39]1]. Postmenopausal osteoporosis (PMOP) is the most prevalent form,
   primarily driven by an imbalance in bone homeostasis due to diminished
   estrogen secretion in postmenopausal women [[40]2]. Specifically,
   decreased ovarian function causes bone resorption to exceed bone
   formation during the remodeling process [[41]3,[42]4]. This imbalance
   leads to a rapid decline in both bone density and quality, thereby
   increasing the risk of fractures. PMOP represents a substantial global
   health concern, imposing significant physical and economic burdens on
   elderly women. In the United States, nearly 20 % of women aged 50 and
   above suffer from hip osteoporosis, with approximately 50 %
   experiencing a substantial reduction in hip bone mass [[43]5]. Another
   study indicates that the prevalence of osteoporosis in women escalates
   from 4 % at age 50 to a staggering 52 % at age 80 [[44]6]. Thus, early
   diagnosis and timely intervention for PMOP are of paramount importance.

   Currently, clinical diagnosis of PMOP heavily relies on bone mineral
   density (BMD) testing, considered the most critical indicator for
   predicting fracture risk. This includes assessing BMD through
   dual-energy X-ray absorptiometry (DXA) [[45]7,[46]8]. However, for most
   female PMOP patients, clinical symptoms are notably absent until
   explicit fractures occur. BMD testing only provides a limited
   reflection of bone quantity and poses challenges for precise and prompt
   early-stage PMOP diagnosis [[47]9]. Presently, there is a dearth of
   biomarkers for early PMOP diagnosis, which is essential for timely
   intervention and treatment to enhance patient outcomes.

   Estrogen deficiency is a significant etiological factor in PMOP, with
   its receptors widely distributed among osteoblasts, osteoclasts, and
   hematopoietic cells [[48]10]. Estrogen may suppress bone resorption by
   regulating osteoclast formation. Previous research has emphasized the
   strong connection between the pathophysiological mechanisms of PMOP and
   immune regulation within the body [[49]11,[50]12]. From an
   immunological perspective, as aging process intensifies, immune
   defenses weaken of body, maintaining a state of chronic inflammation
   that could activate osteoclasts and exacerbate bone loss [[51]13].
   However, the mechanisms behind bone metabolism and immune regulation in
   PMOP remain unclear [[52]14]. Recent advancements in genomics provide
   promising prospects for exploring biomarkers in PMOP patients at a
   micro-level [[53]15]. Therefore, investigating the immune
   microenvironment of PMOP patients, identifying immune-related
   biomarkers for diagnosis and classification, could aid in early
   diagnosis and personalized immune-based therapies, offering molecular
   targets for PMOP patients.

   This study is the first to investigate the expression profile of
   immune-related genes and the immune microenvironment in PMOP patients
   from an immunological standpoint. Utilizing machine learning
   algorithms, distinctive biomarkers were identified, leading to the
   establishment of reliable immune-related molecular subtypes and
   diagnostic models ([54]Fig. 1). This provides a foundation for early
   PMOP diagnosis and personalized immunotherapy for patients.

Fig. 1.

   [55]Fig. 1
   [56]Open in a new tab

   The flowchart of this study.

2. Materials and methods

2.1. Data source

   Data from PMOP patients were sourced from the Gene Expression Omnibus
   (GEO) database ([57]https://www.ncbi.nlm.nih.gov/geo/), specifically
   from the [58]GSE56815 and [59]GSE2208 datasets. The [60]GSE56815
   dataset, derived from array-based gene expression profiles, includes
   gene expression matrices of peripheral blood mononuclear cells from 20
   PMOP patients and 20 healthy controls, all sequenced using the
   [61]GPL96 platform. Similarly, the [62]GSE2208 dataset contains 9 PMOP
   patients and 20 healthy controls, also sequenced on the [63]GPL96
   platform. To ensure consistency in the training set results,
   [64]GSE56815 was used as the training set, while [65]GSE2208 served as
   the validation set for subsequent analysis. Furthermore, we utilized
   “Perl” software [[66]16] to convert probe names into gene symbols for
   further analysis. Detailed dataset information can be found in
   [67]Table 1.

Table 1.

   Dataset source and information.
   Dataset Platform Sample Cell Type
   [68]GSE56815 [69]GPL96 20 PMOP vs 20 Control Mononuclear cells derived
   from peripheral blood
   [70]GSE2208 [71]GPL96 9 PMOP vs 20 Control Mononuclear cells derived
   from peripheral blood
   [72]GSE147287 [73]GPL24676 1 PMOP Mononuclear cells derived from bone
   marrow
   [74]Open in a new tab

2.2. Analysis of immune-related differentially expressed genes (IR-DEGs)

   A total of 2483 immune-related genes (see [75]Supplementary Table 1)
   were obtained from the ImmPort database ([76]https://immport.org).
   Subsequently, Using the “limma” package [[77]17], the expression matrix
   of immune-related genes from 20 PMOP and 20 healthy samples was
   extracted. Differential expression analysis was then conducted to
   identify significantly differentially expressed immune-related genes
   (IR-DEGs) between PMOP patients and healthy controls. Visualization was
   performed using differential heatmaps and box plots. The criteria for
   selection were set as log fold change (log|FC|)> 0.585 and adjusted
   P < 0.05.

2.3. Enrichment analysis

   To explore the potential molecular mechanisms of IR-DEGs between PMOP
   patients and healthy samples, Gene Ontology (GO) enrichment and Kyoto
   Encyclopedia of Genes and Genomes (KEGG) pathway were analyzed. The GO
   functional analysis encompassed Molecular Functions (MF), Cellular
   Components (CC), and Biological Processes (BP). The top 5 enriched
   functions and the top 10 KEGG pathways were visualized using bar graphs
   in different colors. We used a significance threshold of P < 0.05 to
   determine which functions or pathways were significantly enriched.

2.4. Characteristic gene selection

   To identify key genes, we employed machine learning algorithms to
   select characteristic IR-DEGs. Initially, we compared the residual
   error values of two commonly used Support Vector Machine Recursive
   Feature Elimination (SVM-RFE) [[78]18] and Random Forest (RF) [[79]19]
   models, to determine the most suitable machine learning algorithm.
   Subsequently, a Least Absolute Shrinkage and Selection Operator (LASSO)
   [[80]20] regression was employed to identify the selection of
   characteristic genes. The final set of characteristic IR-DEGs was
   determined by taking the intersection of genes identified by both
   machine learning algorithms. Additionally, we analyzed the chromosomal
   locations of these characteristic IR-DEGs.

2.5. Single-cell data and analysis

   The ScRNA-seq dataset [81]GSE147287 of PMOP patients was obtained from
   the GEO database, comprising the raw matrix data of one PMOP patient.
   Data processing and filtering were conducted using the “Seurat” package
   in R software (version 4.3.2). Standard data preprocessing procedures
   involved determining the number of genes, cells, and total molecules.
   Cells with a detected gene count less than 500 or exceeding 7500 were
   discarded. Cells with high mitochondrial gene content (≥20 %) were also
   removed. Subsequently, data standardization was performed using the
   “ScaleData” function in “Seurat,” and 2000 highly variable genes (genes
   showing significant differences between some cells) were selected for
   principal component analysis (PCA) dimensionality reduction and cell
   identification across all samples. Appropriate PC values (PC = 10) and
   resolution (0.5) were chosen for further optimization of cell
   clustering. Visualization and storage of DEGs for each cluster were
   carried out. Additionally, each cluster was subsequently defined and
   annotated based on the specific expression patterns of marker genes.
   Finally, differential expression analysis was conducted for key IR-DEGs
   across different clusters.

2.6. Cell cycle and cell communication

   The cell cycle is a well-conserved and ongoing process. The “tricycle”
   package was employed to predict the specific positions of cells in the
   cell cycle based on data projection. The cell cycle stages were
   inferred as five phases: G1.S phase, S phase, G2 phase, G2/M phase, and
   M.G1 phase, represented by the UAMP of the data. The proportions of
   different cell cycle phases were further calculated for each distinct
   cluster. Additionally, the “CellChat” package was utilized to analyze
   chondrocyte-like samples from OA joints, calculating intercellular
   communication within each cell cluster of PMOP samples to reveal the
   communication strength of each signaling pathway. A significance
   threshold of P < 0.05 was applied, and the visualization was based on
   the number and intensity of connections between various cell clusters.

2.7. Determination of molecular subtypes in PMOP patients

   Conducting disease subtyping through clustering analysis is beneficial
   for personalized treatment of patients. Therefore, we utilized
   Consensus Clustering analysis to identify molecular subtypes in PMOP
   patients. Initially, PMOP patients were subjected to Consensus
   Clustering using the “ConsensusClusterPlus” package [[82]21,[83]22]. We
   generated eight clustering models (K = 2–9) based on different K values
   and visualized them using clustering heatmaps. The clustering
   parameters were set as follows: max K = 9, reps = 50, pFeature = 1,
   pItem = 0.8. Subsequently, we calculated the consensus scores for the
   K = 2–9 clustering models and created cumulative distribution functions
   (CDF) along with area under the curve (AUC) to identify the most
   appropriate clustering K value.

   Moreover, to confirm significant differences between PMOP molecular
   subtypes with distinct features, we conducted PCA [[84]23] for
   subtyping 20 PMOP patients. Subsequently, we examined the expression
   differences of key IR-DEGs across the identified molecular subtypes.

2.8. Analysis of immune microenvironment in PMOP patients

   The pathogenesis of PMOP may be associated with immune regulation.
   Therefore, we conducted a comprehensive exploration of the immune
   microenvironment in PMOP patients. Initially, we performed immune cell
   infiltration analysis and immunoscore calculation for each PMOP sample
   and healthy control using the “CIBERSORT” package [[85]24].
   Subsequently, the infiltration levels of 22 immune cell types were
   compared between PMOP patients and control samples, with the
   significant differences visualized through cluster heatmaps and box
   plots. Finally, the variations in infiltration levels of these 22
   immune cell types among different PMOP subtypes (subtype 1 and subtype
   2) were visualized through histograms and box plots.

2.9. Immune correlation analysis of characteristic IR-DEGs

   We further explored the immunoregulatory relationship of characteristic
   IR-DEGs with PMOP patients from an immunological perspective.
   Subsequently, we conducted correlation analyses between these key
   IR-DEGs and the 22 immune cell types. A significance threshold was set
   at P < 0.05 and |R|>0.3 to identify genes with a pronounced correlation
   with immune cells.

2.10. Single-sample gene set enrichment analysis (ssGSEA)

   ssGSEA is a widely used method to assess the enrichment of target genes
   under specific biological conditions [[86]25]. In this study, PMOP
   samples from the training group were divided into low- and
   high-expression groups for each characteristic IR-DEG, based on the
   median expression level of these genes. This categorization allowed for
   the analysis of biologically significant functions or pathways that
   were significantly enriched for each f characteristic IR-DEG. A
   significance threshold of P < 0.05 was utilized to determine these
   enriched biological functions or pathways.

2.11. Diagnostic model based on characteristic IR-DEGs

   We further developed a diagnostic model for PMOP using the
   characteristic IR-DEGs, evaluating their ability to distinguish between
   PMOP patients and control samples. Specifically, we employed the
   “glmnet” and “pROC” packages [[87]26] to build ROC curves for the
   characteristic genes and performed ROC evaluations for each gene.
   Subsequently, we established a logistic regression model for the
   characteristic genes to generate an overall ROC curve and calculate the
   AUC values. Finally, we employed the independent [88]GSE2208 dataset as
   a validation set to validate the ROC curves constructed for the
   characteristic genes.

2.12. PPI and gene-gene interaction networks

   To investigate the potential interaction mechanisms among
   characteristic IR-DEGs, we constructed a Protein-Protein Interaction
   (PPI) network using the online STRING database
   ([89]https://string-db.org). Genes lacking interactions were excluded,
   and the selected minimum confidence score was set to 0.15. Furthermore,
   we established a gene-gene interaction network for characteristic
   IR-DEGs based on the GENEMANIA online database
   ([90]http://genemania.org), visually represented as a circular plot.
   Additionally, we created gene-miRNA and gene-disease regulatory
   networks through the NetworAnalyst database
   ([91]https://www.networkanalyst.ca).

2.13. Establishment of PMOP rat model and RT-qPCR

   Purchase 24 female SD rats, aged 4–5 months and weighing 400–500g, from
   the Guangdong Province Medical Experimental Animal Center. After a week
   of routine feeding in the lab, administer 2 % pentobarbital sodium
   intraperitoneally (40 mg/kg) for anesthesia, then prepare and disinfect
   the abdominal area. Randomly divide the 24 rats into two groups: 12 in
   the PMOP group and 12 in the sham group. The PMOP group undergoes
   bilateral ovariectomy, while the sham group only has the adipose tissue
   around the ovaries removed, followed by suturing and disinfection of
   the incisions. Resume normal diet post-operation and establish the PMOP
   model after 8 weeks. After the model was established, 24 rats were
   anesthetized with an intraperitoneal injection of sodium pentobarbital,
   and blood was collected via the jugular vein. The rats were then
   euthanized, and the right femoral tissue was extracted.

   Extract total RNA from femur tissue and peripheral blood using the
   Simply Total RNA Extraction Kit (BioFlux, Japan), according to the
   manufacturer's instructions. Transcribe cDNA using the SweScrip RT I
   First Strand Synthesis Kit (Service Bio, Guangzhou, China) as per the
   instructions provided. Employ the MyiQ PCR instrument to detect the
   expression of the genes JUN, HMOX1, CYSLTR2, TNFSF8, IL1R2, and SSTR5.
   Use GAPDH for normalization. Primer sequences are listed in [92]Table
   2. The animal experiments conducted in this study were approved by the
   Medical Research and Clinical Trial Ethics Committee of the First
   People's Hospital of Huzhou (Approval No. 2022-SDYW-004).

Table 2.

   The primers used for characteristic immune-related DEGs.
   Targeted genes   Forward (5′–3′)      Reverse (3′–5′)
   JUN            AACTCGGACCTTCTCACGTC GGTCGGTGTAGTGGTGATGT
   CYSLTR2        TCATGCTCAACCTGGCCATT GGCCCAGTCCCCAAATATCC
   SSTR5          CCTCCACACCAAGCTGGAAT CAACAGGTAGAGCACAGGCA
   TNFSF8         ACACTGCAGCTCGTCATCAA TCCACCCGGACAGATATGGT
   IL1R2          TGTAAACGCCAGGCTGGAAA TTTGGTTTGGGCTGGAAGGG
   HMOX1          GCATGTCCCAGGATTTGTCC ACCAGCTTAAAGCCTTCCCT
   GAPDH          TTCACCACCATGGAGAAGGC CTCGTGGTTCACACCCATCA
   [93]Open in a new tab

2.14. Statistics analysis

   For normally distributed data, comparisons between two groups were
   conducted using two-sample independent t-tests. In cases where
   normality assumptions were not met, the Wilcoxon test was employed for
   comparing two groups. Statistical significance was defined as a P-value
   less than 0.05. All gene expression matrices and statistical analyses
   were conducted using R version 4.1.3 ([94]https://www.R-project.org/)
   along with relevant packages.

3. Results

3.1. Identification and enrichment analysis of IR-DEGs

   The [95]GSE56815 dataset comprised 40 patients with low bone density
   and 40 patients with high bone density. After excluding premenopausal
   individuals with low bone density and postmenopausal individuals with
   high bone density, we obtained a control group consisting of 20
   premenopausal individuals with high bone density and 20 patients with
   PMOP. Similarly, the [96]GSE2208 dataset included 10 patients with high
   bone density and 9 PMOP patients, resulting in a control group of 5
   premenopausal individuals with high bone density and 9 PMOP patients
   after excluding 5 premenopausal individuals with low bone density. We
   utilized the “limma” package to extract immune-related genes from the
   data matrices and conducted differential expression analysis,
   identifying a total of 327 IR-DEGs (see [97]Supplementary Table 2). The
   top 10 IR-DEGs were visually represented through differential heatmaps
   and box plots ([98]Fig. 2A and B).

Fig. 2.

   [99]Fig. 2
   [100]Open in a new tab

   Identification and Functional Analysis of Immune-Related Differentially
   Expressed Genes (IR-DEGs). (A) Differential heat map of the top 10
   IR-DEGs, with red representing upregulated genes and blue representing
   downregulated genes. (B) Differential box plot of the top 10 IR-DEGs,
   with red indicating the PMOP group and blue indicating the control
   group. (C) The top 10 significant enrichment of KEGG pathways. (D) The
   top 5 significantly enriched biological functions (BP) revealed by GO
   functional analysis, the top 4 enriched cellular components, and the
   top 5 enriched molecular functions. Significance: ∗∗P < 0.01,
   ∗∗∗P < 0.001.

   We performed functional and pathway enrichment analyses to investigate
   the possible biological roles and mechanisms of the IR-DEGs. The KEGG
   pathway analysis indicated the IR-DEGs were significantly enriched in
   pathways related to blood coagulation, wound healing, and hemostasis
   ([101]Fig. 2C). GO functional analysis revealed that these IR-DEGs were
   strongly linked to biological processes (BP) related to wound healing,
   lymphocyte differentiation, and coagulation. In terms of cellular
   components (CC), the IR-DEGs were primarily associated with the
   external side of the plasma membrane, the lumen of cytoplasmic
   vesicles, and secretory granules. Molecular functions (MF) were mainly
   enriched in phospholipid binding, DNA-binding transcription activator
   activity, and collagen binding ([102]Fig. 2D). These results suggest
   that IR-DEGs could contribute to the regulation of PMOP and may have an
   impact on the hematological system.

3.2. Selection of characteristic IR-DEGs

   To select characteristic genes, we initially compared the residual
   error values of RF and SVM-RFE. Using box plots and residual
   distribution curves, we determined that SVM-RFE was the appropriate
   selection method ([103]Fig. 3A and B). In the SVM-RFE analysis,
   cross-validation accuracy plots indicated that 17 features achieved an
   accuracy of 0.825, while cross-validation error plots showed that these
   17 features had an error rate of 0.175 ([104]Fig. 3C and D).
   Consequently, 17 SVM-RFE-selected characteristic genes were obtained
   ([105]Supplementary Table 3). Next, we performed LASSO regression
   analysis with cross-validation on IR-DEGs, resulting in 13
   characteristic genes ([106]Fig. 3E and F). By further intersecting
   these genes, we identified 6 characteristic IR-DEGs: JUN, HMOX1,
   CYSLTR2, TNFSF8, IL1R2, and SSTR5 ([107]Fig. 3G). Chromosome location
   analysis indicated that JUN, IL1R2, TNFSF8, CYSLTR2, SSTR5, and HMOX1
   were situated on chromosomes 1, 2, 9, 13, 16, and 22, respectively
   ([108]Fig. 3H).

Fig. 3.

   [109]Fig. 3
   [110]Open in a new tab

   Identification of characteristic IR-DEGs using machine learning
   methods. (A and B) Residuals from the Random Forest (RF) and Support
   Vector Machine-Recursive Feature Elimination (SVM-RFE) models. (C)
   Change curves in the predicted true values for each IR-DEG. (D) Change
   curves in the predicted error values for each IR-DEG. (E) Gene
   coefficient plots from the Least Absolute Shrinkage and Selection
   Operator (LASSO) regression analysis. (F) Cross-validation curves. (G)
   Characteristic IR-DEGs obtained by the intersection of genes selected
   by LASSO and SVM-RFE. (H) Chromosomal positions of the 6 characteristic
   IR-DEGs.

3.3. Identification of different cell types, cell cycle and cell
communication analysis

   To further investigate the occurrence and progression of PMOP, the
   immune microenvironment of PMOP patients was analyzed at the cellular
   level using scRNA-seq data set ([111]GSM4423510). After quality
   control, filtering, PCA dimensionality reduction, and t-SNE clustering
   analysis, a total of 12 different cell types were identified ([112]Fig.
   4A). Using feature markers, these cells were defined as 11 different
   subtypes: M1 macrophage (FCGR3B), Treg (NRP1), CD8^+ T cell (GZMK and
   CD8A), Endothelial cell (PECAM1), M2 macrophage (CD163), RBC (HBD), B
   cell (CD37), lymphoid-primed multi-potential progenitors (LMPP) (CAMP),
   Fibroblast (FGF7), NK cell (NKG7), Epithelial_cell (CDH1) ([113]Fig. 4B
   and C). The heatmap shows significant differences in gene markers among
   different cell clusters ([114]Fig. 4D). Subsequently, UMAP was employed
   to visualize the expression of key genes (JUN, HMOX1, CYSLTR2, TNFSF8,
   and IL1R2) in cells ([115]Fig. 5A). Differential expression analysis of
   key IR-DEGs across cell clusters showed that JUN expression was
   markedly elevated in M1 macrophages compared to the other six cell
   types (Treg, CD8^+ T cell, Endothelial cell, M2 macrophage, RBC, and B
   cell). HMOX1 showed significantly higher expression in endothelial
   cells and M2 macrophages. CYSLTR2, TNFSF8, and IL1R2 were expressed
   less in various cell types and showed no significant differences
   ([116]Fig. 5B).

Fig. 4.

   [117]Fig. 4
   [118]Open in a new tab

   Cell clustering analysis and annotation using the tSNE method. (A)
   Identification of 12 cell clusters based on the tSNE method. (B)
   Determination of 11 cell types based on cell-specific markers. (C)
   Specific markers for the 11 cell types. (D) Heatmap illustrating the
   classification of the 11 cell types.

Fig. 5.

   [119]Fig. 5
   [120]Open in a new tab

   Cell cycle and cell communication analysis in PMOP patients. (A)
   Expression levels of five characteristic IR-DEGs in bone marrow cells.
   (B) Visualization of differential expression of five characteristic
   IR-DEGs among different cell types. (C) Proportions of different cell
   types in various cell cycles. (D) Visualization of the cell cycle
   distribution among different cell types. (E) Interactions among the 11
   different cell types.

   Additionally, to explore potential connections and molecular mechanisms
   between different cell types, further analysis of the cell cycle and
   cell communication was conducted. M1 macrophages were found to be
   present in almost all cell cycles (M.G1, G2.M, G2, S, G1.S) and
   constituted a significant proportion, while Treg cells were mainly
   concentrated in the S and G1.S phases of the cell cycle ([121]Fig. 5C
   and D). Cell communication analysis revealed that M2 macrophages, CD8^+
   T cells, Treg cells, and fibroblasts exhibited more interactive cell
   communication pairs in the PMOP microenvironment, with M1 macrophages
   having the highest number of interacting cells ([122]Fig. 5E). These
   findings suggest that macrophages may play a crucial role in the immune
   microenvironment of PMOP.

3.4. Determination of molecular subtypes in PMOP patients

   Subtyping PMOP patients helps in further analyzing the differences in
   the immune microenvironment between different subtypes, thereby aiding
   the development of personalized immunotherapy approaches. Based on six
   characteristic IR-DEGs, we employed consensus cluster analysis to
   identify molecular subtypes within PMOP patients. Specifically, based
   on the clustering matrix heatmaps for K = 2 to K = 4, K = 2 was
   considered the optimal clustering solution ([123]Fig. 6A).
   Subsequently, the Consensus Cluster Score results indicated that both
   subtypes achieved scores exceeding 0.80 for K = 2 ([124]Fig. 6B).
   Furthermore, the determination of the best K value was reinforced by
   the Consensus CDF plot and the area under the CDF curve ([125]Fig. 6C
   and D). Consequently, K = 2 was confirmed as the most appropriate
   clustering solution, revealing two subtypes, Subtype 1 (comprising 8
   samples) and Subtype 2 (comprising 12 samples). Principal Component
   Analysis (PCA) results highlighted marked dissimilarities between
   Subtypes 1 and 2 ([126]Fig. 6E).

Fig. 6.

   [127]Fig. 6
   [128]Open in a new tab

   Construction of molecular subtypes in peripheral blood monocytes of
   PMOP patients based on characteristic IR-DEGs. (A) Heatmap of different
   K-means clustering matrices (K = 2–4). (B) Consistency clustering
   scores of different subtypes (K = 2–9). (C) Cumulative distribution
   function (CDF) plot determining the optimal K value (K = 2). (D) Delta
   area plot. (E) Principal component analysis (PCA) of PMOP subtypes 1
   and 2. (F) Differential expression analysis of the 6 IR-DEGs between
   Subtypes 1 and 2. Significance: ∗∗P < 0.01, ∗∗∗P < 0.001.

   Additionally, we analyzed the differential expression of characteristic
   genes between the two subtypes ([129]Fig. 6F). The heatmap and boxplot
   results showed significant differences in HMOX1 and IL1R2 expression
   levels between Subtype 1 and Subtype 2, with both genes showing notably
   higher expression in Subtype 2.

3.5. Analysis of immune microenvironment in PMOP patients

   Previous research has indicated that alterations in the immune status
   of PMOP patients may contribute to ongoing bone destruction,
   emphasizing the interaction between immune cells and bone
   [[130]27,[131]28]. In this context, we performed an in-depth analysis
   of the immune microenvironment in PMOP patients. The “CIBERSORT”
   algorithm was employed to identify and score 22 distinct immune cell
   types in both PMOP patients and healthy controls. The results were
   visually represented through a heatmap and a boxplot ([132]Fig. 7A and
   B). The analysis revealed significant upregulation of resting dendritic
   cells and naive B cells in PMOP patients, while follicular helper T
   cells and naive CD4 T cells showed substantial downregulation in their
   infiltration levels within the PMOP patient group. These findings
   suggest a potential involvement of T cells and B cells in the immune
   regulation mechanisms underlying PMOP. Furthermore, within PMOP subtype
   2, resting NK cells, and monocytes exhibited significantly elevated
   immune infiltration, while M1 macrophages, M2 macrophages, and
   eosinophils had higher infiltration levels in subtype 1 ([133]Fig. 7C).

Fig. 7.

   [134]Fig. 7
   [135]Open in a new tab

   Analysis of the immune microenvironment in PMOP patients. (A) Heatmap
   depicting the correlation between the infiltration abundance of 22
   immune cell types between the PMOP group and control group. (B)
   Differential expression analysis of the infiltration abundance of 22
   immune cell types between the PMOP group and control group. (C)
   Differential expression analysis of immune cell infiltration abundance
   between PMOP Subtypes 1 and 2. Significance: ∗P < 0.05, ∗∗P < 0.01,
   ∗∗∗P < 0.001.

3.6. Immune correlation analysis

   Additionally, we performed a correlation analysis between each key
   IR-DEG and the proportions of 22 immune cell types. Notably, HMOX1,
   IL1R2, and SSTR5 demonstrated significant correlations with immune
   cells ([136]Fig. 8A–C). Specifically, HMOX1 exhibited a significant
   negative correlation with resting mast cells (R = −0.53, P = 0.018),
   while IL1R2 showed a significant positive correlation with memory B
   cells (R = 0.51, P = 0.02). Additionally, SSTR5 exhibited a significant
   negative correlation with resting memory CD4 T cells (R = −0.55,
   P = 0.013).

Fig. 8.

   [137]Fig. 8
   [138]Open in a new tab

   Correlation analysis of key IR-DEGs with 22 immune cell types. (A)
   HMOX1 exhibits a significant negative correlation with resting Mast
   cells. (B) IL1R2 shows a significant positive correlation with
   activated dendritic cells. (C) SSTR5 demonstrates a significant
   negative correlation with resting T cells CD4 memory.

3.7. ssGSEA enrichment analyses

   Additionally, we investigated the potential molecular mechanisms and
   pathways related to the six IR-DEGs. ssGSEA-KEGG pathway enrichment
   analysis revealed the following associations: JUN was primarily
   enriched in pathways related to acute myeloid leukemia and galactose
   metabolism ([139]Fig. 9A–F). HMOX1 showed prominent enrichment in
   pathways associated with acute myeloid leukemia and focal adhesion.
   CYSLTR2 was notably enriched in pathways such as axon guidance,
   cytokine-cytokine receptor interaction, and drug metabolism. TNFSF8 was
   enriched primarily in pathways including focal adhesion, huntington's
   disease, and nod-like receptor signaling pathway. IL1R2 was enriched in
   pathways linked to acute myeloid leukemia and lysosome. SSTR5 was
   predominantly enriched pathways connected to aldosterone regulated
   sodium reabsorption and asthma. In summary, these IR-DEGs may influence
   the development of PMOP by regulating amino acid metabolism and
   potentially have close associations with acute myeloid leukemia.

Fig. 9.

   [140]Fig. 9
   [141]Open in a new tab

   Single-Sample Gene Set Enrichment Analysis (ssGSEA) of the 6 Key
   IR-DEGs. ssGSEA results for JUN (A), HMOX1 (B), CYSLTR2 (C), TNFSF8
   (D), IL1R2 (E) and SSTR5 (F) between high-expression and low-expression
   groups.

3.8. Construction of PMOP diagnostic model

   To investigate the diagnostic value of the six IR-DEGs, the diagnostic
   models were constructed ([142]Fig. 10A and B). ROC curves for each gene
   were generated in the training set, demonstrated that, except for SSTR5
   (AUC = 0.690), all other characteristic genes exhibited AUC values
   exceeding 0.700, with JUN (AUC = 0.700), HMOX1 (AUC = 0.772), CYSLTR2
   (AUC = 0.725), TNFSF8 (AUC = 0.705), and IL1R2 (AUC = 0.780). The AUC
   value for the logistic regression model was 0.927 (95 % CI:
   0.837–0.995). Furthermore, using another dataset ([143]GSE2208) for
   validation and ROC curves were established ([144]Fig. 10C and D). Due
   to the limited sample size, information for only three characteristic
   genes was available: JUN (AUC = 0.933), CYSLTR2 (AUC = 0.711), and
   IL1R2 (AUC = 0.733). All these AUC values exceeded 0.70. In the
   validation set, the AUC value for the logistic regression model was
   0.956 (95 % CI: 0.800–1.000), indicating that the IR-DEGs hold
   substantial diagnostic value for PMOP.

Fig. 10.

   [145]Fig. 10
   [146]Open in a new tab

   Diagnostic value of 6 characteristic IR-DEGs in PMOP. (A) ROC curves
   for each of the 6 characteristic IR-DEGs (JUN, HMOX1, CYSLTR2, TNFSF8,
   IL1R2, and SSTR5) in the training dataset ([147]GSE56815). (B) ROC
   curve for the Logistic regression model constructed based on the 6
   characteristic IR-DEGs in the [148]GSE56815 dataset. (C) ROC curves for
   each of the 3 characteristic IR-DEGs in the validation dataset
   ([149]GSE2208). (D) ROC curve for the Logistic regression model
   constructed based on the 3 characteristic IR-DEGs in the [150]GSE2208
   dataset.

3.9. Construction of PPI and gene-gene interaction networks

   To investigate the potential molecular mechanisms of the six
   characteristic IR-DEGs, a PPI network was constructed using the STRING
   database ([151]Fig. 11A). This network consisted of 4 nodes and 6
   edges. Subsequently, utilizing GENEMANIA database references, including