Abstract

Background

   Heart failure (HF) represents the terminal stage of various
   cardiovascular disorders, with immunogenic cell death (ICD) potentially
   influencing HF progression through modulation of immune cell activity.
   This study aimed to identify ICD-associated biomarkers in patients with
   HF and explore their underlying mechanisms.

Methods

   Data from [40]GSE57338, [41]GSE3586 and [42]GSE5406 were retrieved from
   the Gene Expression Omnibus (GEO) database. Differential expression
   analysis and weighted gene co-expression network analysis (WGCNA) were
   employed to identify candidate genes, followed by enrichment analysis
   and Protein-Protein Interaction (PPI) network construction. Candidate
   biomarkers were selected using two machine learning approaches and
   validated for expression levels, with receiver operating characteristic
   (ROC) curve analysis determining the final biomarkers. A nomogram model
   was built based on the biomarkers, followed by molecular regulatory
   network analysis, gene set enrichment analysis (GSEA), immune
   infiltration assessment, and drug prediction. Additionally, key cells
   were selected for pseudo-time and cell communication analysis using the
   [43]GSE183852 dataset. Next, pseudotemporal analysis was also performed
   on key cell subpopulations. Real-time quantitative PCR (RT-qPCR) was
   employed to validate the biomarkers.

Results

   Three biomarkers, CD163, FPR1, and VSIG4, were identified as having
   significant diagnostic value for HF. GSEA revealed their enrichment in
   ribosomal and immune cell-related pathways. These biomarkers were
   notably correlated with CD8 T cells and M2 macrophages. Carbachol and
   etynodiol were predicted to interact with all three biomarkers.
   Single-cell RNA sequencing identified nine cell types, with expression
   of the biomarkers confined to monocytes and macrophages. Strong cell
   communication was observed between these cell types and fibroblasts.
   Expression of CD163 and VSIG4 decreased over time in monocytes and
   macrophages, whereas FPR1 showed an upward trend. In addition, the
   expression levels of CD163 and VSIG4 increased in subpopulations of
   monocytes and macrophages, whereas FPR1 showed a decreasing trend.
   RT-qPCR results confirmed significant down-regulation of CD163, FPR1,
   and VSIG4 in patients with HF and animal models.

Conclusions

   This study identified and validated three ICD-related biomarkers in
   HF—CD163, FPR1, and VSIG4—offering a novel theoretical foundation for
   the clinical diagnosis and treatment of HF.

   Keywords: immunogenic cell death, heart failure, biomarker, single-cell
   RNA sequencing analysis, monocytes and macrophages

1. Introduction

   Heart failure (HF), the terminal stage of various cardiovascular
   diseases, affects approximately 56.2 million people worldwide ([44]1,
   [45]2). Despite lifestyle changes and advances in medical care that
   have stabilized age-adjusted incidence rates, the prevalence and
   mortality rates of HF remain high, highlighting the need for further
   research to identify improved management strategies ([46]3). Although
   HF was once considered non-immune-mediated, recent studies have
   demonstrated the involvement of the immune system in its
   pathophysiology, and clinical trials on immune modulation therapy for
   HF have been conducted ([47]4). Consequently, modulating immune
   responses to maintain stability may serve as a promising strategy to
   delay HF progression.

   Immunogenic cell death (ICD), a unique form of regulated cell death
   that occurs as a downstream effect of tumor-specific immune responses,
   has been extensively studied in cancer immunotherapy ([48]5, [49]6),
   with emerging research in cardiovascular diseases. Endothelial cell ICD
   in atherosclerosis has been linked to the initiation of adaptive immune
   responses, sustaining chronic inflammation within plaques ([50]7). In
   coronary artery disease, stratification based on ICD-related genes
   (IRGs) enables the development of risk models and immune subtypes that
   facilitate treatment decisions ([51]8). Moreover, ICD has been explored
   as a diagnostic tool for ischemic stroke in elderly women, identifying
   key biomarkers for diagnosis ([52]9). However, the mechanisms
   underlying ICD in HF remain unexplored.

   This study utilized machine learning techniques to identify ICD
   biomarkers in HF, followed by immune infiltration analysis, targeted
   drug prediction, gene set enrichment analysis (GSEA), single-cell data
   clustering and annotation, cell communication analysis, and pseudotime
   analysis. The findings revealed the functional and potential molecular
   mechanisms of these biomarkers at both the transcriptomic and cellular
   levels, providing a novel theoretical framework for the clinical
   diagnosis and treatment of HF.

2. Materials and methods

2.1. Data collection

   RNA data from [53]GSE57338 (sequencing platform: [54]GPL11532) was
   obtained from the Gene Expression Omnibus (GEO) database
   ([55]https://www.ncbi.nlm.nih.gov/geo/), comprising 136 normal left
   ventricular tissue samples and 177 left ventricular tissue samples from
   patients with HF ([56]10). Additionally, RNA data from [57]GSE3586
   (sequencing platform: [58]GPL3050) was downloaded, containing 15 normal
   left ventricular tissue samples and 13 left ventricular tissue samples
   from patients with HF ([59]11). Moreover, the [60]GSE5406 dataset
   contained 16 normal and 194 HF patients’ heart tissue samples. The data
   were obtained from the [61]GPL96 platform using chip sequencing
   technology, mainly for biomarkers expression validation. The
   single-cell dataset [62]GSE183852 was retrieved from the GEO website
   (sequencing platform: [63]GPL24676), including heart tissue samples
   from 5 patients with HF and 2 normal heart tissue samples ([64]12). A
   total of 34 ICD-associated genes were obtained from the literature
   ([65]13) ([66] Additional file 1 ).

2.2. Differential expression analysis

   Differential expression analysis was conducted using the R package
   “limma” (v 3.58.1) ([67]14), applying the screening criteria of
   |log[2]fold change (FC)| > 0.5 and P < 0.05 to compare HF and control
   samples in the [68]GSE57338 dataset. Volcano plots of the
   differentially expressed gene (DEGs) were visualized using the R
   package “ggplot2” (v 3.4.1) ([69]15), highlighting the top 10 up- and
   down-regulated DEGs. Heatmaps of the top 10 DEGs were generated using
   the R package “ComplexHeatmap” (v 2.4.0) ([70]16).

2.3. Weighted gene co-expression networks analysis

   To calculate the single-sample gene set enrichment analysis (ssGSEA)
   scores for ICD-related genes across 313 samples, the ssGSEA algorithm
   from the R package “GSVA” (v 1.46.0) ([71]17) was applied, and box
   plots were created using “ggplot2” (v 3.4.1).

   WGCNA was performed on the [72]GSE57338 dataset using the R package
   “WGCNA” (v 1.72.5) ([73]18), with ssGSEA scores as the feature. Initial
   clustering of samples identified and excluded abnormal samples. The
   soft threshold (power) was determined based on an R^2 > 0.85 and mean
   connectivity = 0. The dynamic tree cutting algorithm, with a minimum
   gene number of 50 per module and a module merging threshold of 0.3, was
   applied to define gene modules. Genes were color-coded, and the “grey”
   module (containing unclassified genes) was excluded. Pearson
   correlation coefficients were calculated between the modules and ssGSEA
   scores, with a heatmap generated to highlight modules with significant
   correlation (|cor| > 0.5, P < 0.05). Genes within these modules were
   identified as key module genes.

2.4. Enrichment analysis of candidate genes and protein-protein interactions
network analysis

   The R package “ggvenn” (v 1.7.3) ([74]19) was employed to identify the
   intersection between DEGs and key module genes, resulting in the
   selection of candidate genes. These genes were then converted from
   SYMBOL to ENTREZID using the human genome database org.Hs.eg.db (v
   3.18.0) ([75]20). Candidate genes underwent Gene Ontology (GO) and
   Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment
   analysis with the R package “ClusterProfiler” (v 3.16.0) ([76]21), with
   a threshold of P < 0.05. To construct PPI networks, candidate genes
   were analyzed using the search tool for the retrieval of interacting
   genes (STRING) database ([77]https://string-db.org) with a confidence
   score of 0.4. PPI networks were then visualized with Cytoscape (v
   3.10.0) ([78]22). The Cytohubba plugin in Cytoscape (v 3.10.0) was
   utilized to rank candidate genes using six algorithms: Maximum
   Connectivity Component (MCC), Minimum Network Connectivity (MNC),
   Degree of Minimum Network Connectivity (DMNC), Degree, Closeness, and
   Betweenness. Based on the ranking results, the top 20 genes from each
   algorithm were extracted, and their intersection was used to identify
   the final candidate key genes. UpSet plots were generated using the R
   package (v 1.4.0) ([79]23).

2.5. Screening candidate biomarkers by machine learning

   Candidate key genes were further screened based on sample grouping
   information from [80]GSE57338 using the support vector
   machine-recursive feature elimination (SVM-RFE) algorithm (10-fold
   cross validation) (v 4.1.4) ([81]24) to obtain feature genes. The R
   package “randomForest” (v 3.2.2) ([82]25) was used for random forest
   algorithm analysis of the feature genes, incorporating sample grouping
   information from [83]GSE57338. A total of 500 decision trees were
   computed using the randomForest function, and the MeanDecreaseGini
   values for each feature gene were visualized in a bar chart. The median
   of the MeanDecreaseGini values (MeanDecreaseGini measures the effect of
   each variable on the heterogeneity of observations at each node in the
   classification tree, thus assessing the importance of the variable. The
   larger the value, the higher the importance of the variable) was
   calculated, and genes with values above the median were selected as
   candidate biomarkers. Correlation analysis of the candidate biomarkers
   was performed using the R package “corrplot” (v 0.92) ([84]26), with
   thresholds of |cor| > 0.3 and P < 0.05.

2.6. Expression validation of candidate biomarkers

   Expression differences of candidate biomarkers between HF and normal
   samples were analyzed using the grouping information from [85]GSE3586
   and [86]GSE57338, with a threshold of P < 0.05. Box plots were
   constructed using the R package “ggplot2” (v 3.4.1). Candidate
   biomarkers showing differential expression between groups and
   consistent trends across both datasets were selected for receiver
   operating characteristic (ROC) analysis. ROC curves for candidate
   biomarkers were generated using the R package “pROC” (v 1.18.0)
   ([87]27), and the area under the curve (AUC) was calculated, with
   biomarkers defined as those having an AUC > 0.7. To validate biomarkers
   expression, differential expression analysis was performed in the
   [88]GSE5406 dataset.

2.7. Construction of a nomogram

   In the [89]GSE57338 dataset, a nomogram was constructed using the R
   package “rms” (v 5.1.4) ([90]28) to evaluate the risk of developing HF,
   based on the expression of identified biomarkers. The predictive
   performance of the nomogram was assessed by plotting the ROC curve with
   the R package “pROC” (v 1.18.0).

2.8. Gene set enrichment analysis

   Spearman correlation analysis was performed between each biomarker and
   the remaining genes across all [91]GSE57338 samples using the R package
   “psych” (v 2.2.9) ([92]29), generating correlation coefficients. Genes
   were then ranked according to these coefficients, yielding gene lists
   associated with each biomarker. GSEA was performed using the sorted
   results and the R package “ClusterProfiler” (v 3.16.0), with
   “c2.kegg.v7.4.symbols.gmt” and “c5.go.v7.4.symbols.gmt” from the
   Molecular Signatures Database (MSigDB,
   [93]https://www.gsea-msigdb.org/gsea/msigdb/index.jsp) as reference
   gene sets. The top 5 most significant signaling pathways were
   visualized using the enrichplot package (P < 0.05 and |Normalized
   Enrichment Score (NES)| > 1) (v 1.18.3) ([94]20).

2.9. Immune infiltration analysis

   The CIBERSORT algorithm (v 1.03) ([95]30) was employed to calculate the
   relative abundance of 22 immune cell types ([96]31) in HF and normal
   samples from the [97]GSE57338 dataset. Immune cells with a result of 0
   were excluded. Differential immune cells (P < 0.05) were identified,
   and box plots were constructed for visualization. Spearman correlation
   analysis was used to assess the relationships among differential immune
   cells and between biomarkers and immune cells (|cor| > 0.3 and P <
   0.05). A correlation matrix was created using the R package “corrplot”
   (v 0.92) ([98]26), and a heatmap was plotted using the R package
   “pheatmap” (v 1.0.12) ([99]32).

2.10. Regulatory network analysis

   MiRNAs targeting the biomarkers were predicted using the microRNA
   database (miRDB, [100]http://mirdb.org) and the starBase database
   ([101]http://starbase.sysu.edu.cn/), and the intersection of miRNAs
   from both databases was extracted. Based on these predictions, a
   miRNA-biomarker network was constructed using Cytoscape (v 3.10.0).
   Transcription factors (TFs) related to the biomarkers were identified
   using the TRRUST database ([102]http://www.grnpedia.org/trrust/), while
   the disease signatures database (DSigDB, [103]https://www.dsigdb.org/)
   was used to identify drugs targeting the biomarkers. A biomarker-drug
   network was then created and visualized.

2.11. Single-cell RNA sequencing analysis

   The single-cell RNA sequencing data from [104]GSE183852 were processed
   into Seurat objects using the R package “Seurat” (v 4.4.0) ([105]33).
   Quality control was performed by applying the following parameters: 200
   < nFeature_RNA < 4,000, nCount_RNA < 10,000, and Mt < 10%. Genes
   covered by fewer than three cells were removed. Hypervariable genes
   were selected using variance stabilization transformation (vst), and
   the highly variable genes (HVGs) were retained for further analysis.
   The LabelPoints function was applied to identify the top 10 most
   variable genes, and the Scale Data function was used for normalization.
   Principal component analysis (PCA) was performed on the HVGs for
   dimensionality reduction. The p-value for PCs 1 to 15 was calculated
   using the Jackstraw function, and variance drop values for PCs were
   computed using the Elbowplot function. Based on the elbow plot,
   appropriate PCs were selected for subsequent analysis (P < 0.05).
   Uniform Manifold Approximation and Projection (UMAP) clustering
   analysis was applied to identify cell clusters (resolution = 0.5).
   Cellular annotation was performed according to the literature
   ([106]12). The Dotplot function was used to visualize the expression of
   the three biomarkers in the cells, and cells expressing all three
   biomarkers were selected as key cells. Enrichment analysis for each
   cell subtype was conducted using the analyze_sc_clusters function from
   the R package “ReactomeGSA” (v 1.12.0) ([107]34). The pathways function
   was used to extract enrichment results, and a heatmap displayed the top
   ten enriched pathways in each cell subtype. Cell subtype interactions
   were explored using the R package “CellChat” (v 1.6.1) ([108]35) to
   conduct communication analysis. Trajectory differentiation of key cell
   clusters was simulated using the R package “Moncle” (v 2.30.1)
   ([109]36). The dynamic trend of biomarker expression during cell
   differentiation was plotted using the plot_pseudo-time_heatmap
   function. Next, the marker genes of key cell subpopulations were
   selected for annotation based on the CellMarker 2.0 database
   ([110]https://ngdc.cncb.ac.cn/databasecommons/database/id/6110), and
   the final key cell subpopulations were identified based on the specific
   expression of these genes in different clusters. To further explore the
   expression dynamics and temporal trajectories of biomarkers in the key
   cells, the annotated key cell subpopulations were analyzed by the
   proposed timeline trajectory analysis. Using the R package Monocle2 (v
   2.24.1) ([111]37), the distribution of biomarkers in each key cell
   subtype was projected onto a root and multiple branches, a single-cell
   trajectory map was constructed, and the dynamic trend of biomarker
   expression during cell differentiation was plotted. Subsequently, in
   order to analyze the relationship between differentiation states and
   subtypes of key cells, stacked maps of cell subpopulations in different
   differentiation states were drawn. Based on the subtype annotation
   results, the proportions of cell types under different groupings were
   first visualized. Wilcoxon test. Finally, the differences in the
   expression of NOS2, TNF, ARG1, and MRC1 genes in Monocyte&Macrophage
   between HF and control samples were analyzed and statistically analyzed
   using the Wilcoxon test.

2.12. Human Subjects and Extraction of PBMC

   Patients with HF admitted to the First Hospital of Shanxi Medical
   University were selected as the HF group, and a control group was
   matched with the HF group based on age, gender, and other underlying
   diseases besides HF. Based on the expression of biomarkers obtained
   through bioinformatics, the sample size was calculated using PASS.15,
   resulting in a total of 15 pairs of samples. In the morning of the
   second day after admission, venous blood was collected into EDTA tubes,
   and peripheral blood lymphocytes were isolated within 2 hours using
   human peripheral blood lymphocyte separation liquid (Solarbio, China).
   The trial protocol was approved by the Scientific Research Ethics
   Review Committee of the First Hospital of Shanxi Medical University
   (NO. KYLL-2024-236), and all patients provided written informed
   consent.

2.13. Animal model (echocardiography)

   SSPF-grade male Sprague-Dawley rats (180–200 g, 6–8 weeks old) were
   used to establish a chronic HF model ([112]38). HF was induced by
   permanently ligating the left coronary artery in rats, while
   sham-operated rats underwent the same surgical procedure without artery
   ligation. Six weeks post-ligation, high-resolution echocardiography was
   performed using the Vevo 770 system (Visualsonics) with a 40 MHz RMV
   704 scanhead to assess cardiac function. Rats with an ejection fraction
   (EF) < 40% were considered to have successfully developed HF, and those
   that did not develop HF were excluded. After completing
   echocardiography, the animals were euthanized, and tissues were
   collected for analysis. The experimental protocol was approved by the
   Animal Experimental Center Ethics Committee of Beijing Yongxinkangtai
   Science and Technology Development Co., Ltd. (NO. YXKT2024L010).

2.14. Staining

   Hearts were fixed in 4% paraformaldehyde at room temperature for 48
   hours, followed by dehydration and embedding. The samples were
   sectioned at 5μm thickness, dewaxed, rehydrated, and stained with
   Hematoxylin and Eosin (HE) and Masson stains. For IHC staining, primary
   antibodies targeting CD163 (1:200, Selleck, F1548) was incubated
   overnight at 4 °C. Then, second antibody was incubated at 37°C for 1
   hour. Chromogen development was accomplished with DAB. Images were
   captured under a microscope (Olympus, Japan).

2.15. Real-time quantitative PCR

   Following tissue homogenization, total RNA was extracted using Trizol
   (Thermo Fisher Scientific, USA). cDNA synthesis was carried out using
   PrimeScript RT Master Mix (Takara, Japan) according to the
   manufacturer’s protocol. Real-time quantitative PCR (qPCR) analysis was
   performed with SYBR Green Master Mix (DBI Bioscience, Germany) on a
   QuantStudio3 real-time PCR instrument (Thermo Fisher Scientific, USA),
   with GAPDH as an internal control. Relative mRNA expression levels were
   quantified using the 2^-ΔΔCt method. Primer sequences are provided in
   [113]Table 1 .

Table 1.

   Primer sequences for quantitative real-time PCR.
   Species Target gene   Primer sequence (5’to3’)
   Human   VSIG4 Forward AAGCAACATCTACAGTGAAGCAGTC
                 Reverse ATGATGAGGATGATGGCAAAGACAG
           FPR1  Forward AGTGGACATCAACTTGTTCGGAAG
                 Reverse ACGGTGCGGTGGTTCTGG
           CD163 Forward ACAATGAAGATGCTGGCGTGAC
                 Reverse TCTCTGAATCTCCACCTCAACTGTC
           GAPDH Forward CGTATCGGACGCCTGGTT
                 Reverse AGGTCAATGAAGGGGTCGTT
   Rat     VSIG4 Forward AGCTGCCGATCTTTGCCATAATC
                 Reverse TCCTGCTCACCTCATAGACATACTC
           FPR1  Forward CCGTGAACACTTGAGGAACATACC
                 Reverse GGATTGGGTTGAGGCAGCTATTG
           CD163 Forward GAATCACAGCATGGCACAGGTC
                 Reverse CACAAGAGGAAGGCAATGAGAAGG
           GAPDH Forward GACATGCCGCCTGGAGAAAC
                 Reverse AGCCCAGGATGCCCTTTAGT
   [114]Open in a new tab

2.16. Statistical analysis

   Statistical analyses were conducted using R software (v 4.2.2) and
   GraphPad Prism 9. Differences between two groups were assessed using
   the Wilcoxon rank sum test, with statistical significance defined as P
   < 0.05.

3. Results

3.1. Acquisition of key module genes

   A total of 441 DEGs were identified, including 236 up-regulated and 205
   down-regulated genes in HF ([115] Additional files 2a-b ). The ssGSEA
   scores for ICD-related genes significantly differed between HF and
   normal samples ([116] Additional file 2c ). In the WGCNA analysis of
   the [117]GSE57338 dataset, no outlier samples were detected ([118]
   Additional file 2d ). The soft threshold was determined to be 7 ([119]
   Additional file 2e ). Similar modules were merged from the
   co-expression matrix, resulting in 11 identified gene modules
   (excluding the gray module for unclassified genes), with each module
   represented by a different color ([120] Figure 1a ). The yellow module
   (cor = 0.72, P = 5.8 × 10^–17) demonstrated the strongest correlation
   with ICD-related gene ssGSEA scores. Consequently, the 432 genes within
   the yellow module were designated as key module genes ([121] Figure 1b
   ).

Figure 1.

   [122]Cluster dendrogram and module-trait relationships are depicted. On
   the left, a dendrogram shows hierarchical clustering with color-coded
   modules at the bottom. On the right, a heatmap illustrates module-trait
   correlations, with values ranging from negative (blue) to positive
   (red) associations. Values and p-values are shown for each module.
   [123]Open in a new tab

   Acquisition of key module genes. (a) Co-expression module
   identification. (b) Heatmap showing the correlation between modules and
   phenotypes.

3.2. Identification and enrichment analysis of candidate genes and PPI

   In this study, 47 candidate genes were identified through the
   intersection of DEGs and key module genes ([124] Figure 2a ). The
   obtained candidate genes were subject to gene ID conversion, though
   FCGR1B could not be successfully converted. GO enrichment analysis
   revealed 272 GO terms, comprising 224 biological processes (BP), 24
   cellular components (CC), and 24 molecular functions (MF) (P < 0.05)
   ([125] Figure 2b ). The candidate genes were significantly enriched in
   pathways such as the positive regulation of inflammatory response,
   secretory granule membrane, and RAGE receptor binding. Additionally,
   the candidate genes were enriched in 26 KEGG pathways (P < 0.05),
   including staphylococcus aureus infection, phagosome, and neutrophil
   extracellular trap formation ([126] Figure 2c ). These results implied
   that candidate genes may play important roles in antimicrobial
   immunity, inflammatory response and cellular damage repair.

Figure 2.

   [127]A series of visual data representations: a) A Venn diagram showing
   overlap between DEG and ModuleGene, highlighting 47 shared genes. b) A
   circular heatmap with GO terms, showing downregulated and upregulated
   pathways, accompanied by a descriptive table. c) A circular ribbon plot
   displaying relationships between genes and KEGG pathways, using varied
   colors for different pathways. d) A network diagram illustrating gene
   interactions with nodes and connecting lines. e) A bar graph showing
   intersection sizes for various metrics such as MCC and MNC, with a
   focus on a size of 16. Each graphic emphasizes different aspects of
   gene interaction and regulation.
   [128]Open in a new tab

   Identification and enrichment analysis of candidate genes and PPI. (a)
   Venn diagram depicting the overlap between differentially expressed
   genes (DEGs) and key module genes. (b) Gene Ontology (GO) enrichment
   analysis results. (c) Kyoto Encyclopedia of Genes and Genomes (KEGG)
   enrichment analysis results. (d) Protein-Protein Interaction (PPI)
   network. (e) Upset plot representing the PPI network.

   The candidate genes were further subjected to PPI network construction,
   resulting in 42 genes, such as TLR2, FPR1, and MRC1, and 240
   gene-to-gene pairs, including TLR2-CD163 and VSIG4-CD14 ([129]
   Figure 2d ).

   To optimize the screening of candidate genes, the genes were ranked
   using different algorithms. The top 20 genes from each algorithm were
   extracted, and the intersection of these top 20 genes was taken.
   Finally, 16 genes were identified as the candidate key genes for
   further analysis ([130] Figure 2e ).

3.3. Machine learning for candidate biomarker screening

   Based on the sample grouping information from [131]GSE57338, the
   SVM-RFE algorithm was applied for screening, resulting in 13 feature
   genes: CD163, VSIG4, FCER1G, CCR1, CCL5, FPR1, TLR2, C1QB, CD14, MSR1,
   CD68, MRC1, and CYBB ([132] Figure 3a ). The MeanDecreaseGini values
   for each feature gene ranged from 0 to 30, with notable differences
   observed between the genes ([133] Figure 3b ). By calculating the
   median of the MeanDecreaseGini values, six genes greater than the
   median were selected as candidate biomarkers: CD163, VSIG4, FCER1G,
   CCR1, CCL5, and FPR1. Among these, CCL5 showed a negative correlation
   with VSIG4 and CD163, while the remaining five genes exhibited positive
   correlations with each other (P < 0.01) ([134] Figure 3c ). The
   correlation between these genes suggested that they may work in concert
   at different stages of the immune response or in different types of
   immune cells.

Figure 3.

   [135]Three-panel figure: (a) Line graph showing cross-validation
   accuracy increasing with additional variables, peaking around the
   fifteenth. (b) Bar chart ranking variables by importance in a random
   forest model, with VSIG4 and CD163 being the most significant. (c)
   Correlation matrix visually depicting relationships among variables
   such as CCL5, CCR1, and others, using colored circles to indicate
   strength and direction.
   [136]Open in a new tab

   Machine learning for candidate biomarker screening. (a) Results of the
   Support Vector Machine-Recursive Feature Elimination (SVM-RFE) model.
   (b) Bar chart depicting the MeanDecreaseGini scores for candidate
   genes. (c) Correlation analysis of candidate biomarkers. *P < 0.05,
   ***P < 0.001.

3.4. Diagnosis and evaluation of biomarkers

   In [137]GSE57338, the six candidate biomarkers demonstrated significant
   differences between HF and normal samples (P < 0.05), with CD163, FPR1,
   and VSIG4 showing decreased expression in HF samples ([138] Figure 4a
   ). In [139]GSE3586, only CD163, VSIG4, CCR1, and FPR1 were expressed,
   with CD163, FPR1, and VSIG4 levels significantly reduced in HF samples,
   consistent with the expression patterns observed in [140]GSE57338
   ([141] Figure 4b ). Consequently, CD163, FPR1, and VSIG4 were selected
   for ROC analysis, which revealed that the AUC for all three biomarkers
   exceeded 0.7 in both datasets, confirming their potential as HF
   biomarkers ([142] Figures 4c-h ). Next, the expression analysis of
   CD163, FPR1, and VSIG4 in the [143]GSE5406 dataset showed that all
   three were significantly under-expressed in the HF group compared to
   the normal group ([144] Additional file 3 ). The expression patterns
   were consistent with those in the [145]GSE57338 and [146]GSE3586
   datasets.

Figure 4.

   [147]Box plots depict biomarker expression for groups HF and NHF in
   train (a) and verify (b) datasets, with statistical significance
   indicated by asterisks. ROC curves for VSIG4 (c, f), CD163 (d, h), and
   FPR1 (e, g) show performance, displaying AUC values of 0.764, 0.906,
   0.827, 0.896, 0.841, and 0.815 respectively.
   [148]Open in a new tab

   Diagnosis and evaluation of biomarkers. (a) Expression levels of
   candidate genes in the training set, with the horizontal axis
   representing genes and the vertical axis indicating gene expression
   levels (Wlicoxon rank sum test, ****P < 0.0001). (b) Expression levels
   of candidate genes in the validation set, with similar axis labels and
   significance markers (Wlicoxon rank sum test, *P < 0.05, **P < 0.01,
   ns: P > 0.05). (c) ROC curve analysis of the VSIG4 biomarker in the
   validation set. (d) ROC curve analysis of the CD163 biomarker in the
   training set. (e) ROC curve analysis of the FPR1 biomarker in the
   training set. (f) ROC curve analysis of the VSIG4 biomarker in the
   training set. (g) ROC curve analysis of the FPR1 biomarker in the
   validation set. (h) ROC curve analysis of the CD163 biomarker in the
   validation set.

   The nomogram model demonstrated that these three biomarkers could
   accurately predict the risk of HF occurrence. ROC analysis of the
   nomogram yielded an AUC of 0.913, indicating that the predictive
   accuracy of the nomogram model was significantly superior to
   single-gene predictions ([149] Additional file 4 ). It also suggested
   that the onset and progression of HF may involve complex interactions
   of multiple genes or biological pathways.

3.5. Functional analysis of biomarkers

   Further analysis of the signaling pathways involving CD163, FPR1, and
   VSIG4 revealed that CD163 was enriched in 76 pathways, including
   ribosome, Parkinson’s disease, leishmania infection, Fc gamma
   R-mediated phagocytosis, and B cell receptor signaling ([150] Figure 5a
   ). FPR1 was enriched in 79 pathways, including ribosome, leishmania
   infection, Parkinson’s disease, cytokine-cytokine receptor interaction,
   and chemokine signaling ([151] Figure 5b ). VSIG4 was enriched in 85
   pathways, including ribosome, Fc gamma R-mediated phagocytosis, B cell
   receptor signaling, leishmania infection, and chemokine signaling
   ([152] Figure 5c ). Notably, all three biomarkers were enriched in
   pathways related to ribosome function, immune cells, and immune
   factors. These findings provided a basis for further investigation of
   the potential applications of biomarkers in immunomodulation, disease
   diagnosis and therapy.

Figure 5.

   [153]Three line graphs labeled a, b, and c show running enrichment
   scores against rank in ordered datasets. Each graph features multiple
   pathways, including Ribosome and B Cell Receptor Signaling Pathway.
   Graph a includes Parkinson's Disease and Leishmania Infection. Graph b
   includes Cytokine-Cytokine Receptor Interaction. Graph c includes Fc
   Gamma R-Mediated Phagocytosis. Each graph has a ranked list
   visualization below.
   [154]Open in a new tab

   Functional analysis of biomarkers. (a) GSEA enrichment analysis of the
   CD163 gene. (b) GSEA enrichment analysis of the FPR1 gene. (c) GSEA
   enrichment analysis of the VSIG4 gene.

3.6. Analysis of immune cell infiltration

   To further explore immune status differences between HF and normal
   samples, immune infiltration analysis was performed on [155]GSE57338
   samples, revealing differences in the abundance of 22 immune cell types
   between samples from patients with HF and normal samples ([156]
   Additional file 5 ). Immune cells with a result of 0 in 30% of the
   samples were excluded, leaving 12 immune cell types for subsequent
   analysis. Five immune cell types showed significant differences between
   the groups: M2 macrophages, resting mast cells, plasma cells, CD8^+ T
   cells, and T regulatory cells (Tregs) (P < 0.05) ([157] Figure 6a ).
   Correlation analysis among these five immune cell types revealed a
   strong positive correlation between CD8^+ T cells and Tregs, while
   plasma cells exhibited negative correlations with Tregs, CD8^+ T cells,
   M2 macrophages, and resting mast cells (|cor| > 0.3, P < 0.05) ([158]
   Figure 6b ). The correlation heatmap between biomarkers and the five
   immune cell types showed that VSIG4 had a strong positive correlation
   with M2 macrophages, and M2 macrophages positively correlated with
   CD163 and FPR1. In contrast, CD8^+ T cells and plasma cells negatively
   correlated with CD163, FPR1, and VSIG4, respectively. Resting mast
   cells demonstrated an inverse correlation with CD163 and FPR1 (|cor| >
   0.3, P < 0.05) ([159] Figure 6c ). The above results suggested that
   biomarkers may be involved in disease onset and progression by
   modulating immune responses and cellular functions.

Figure 6.

   [160]Panel a shows a box plot of immune cell fractions comparing two
   groups, HF and NHF, with significant differences in macrophages M2, NK
   cells resting, T cells CD4 memory resting, T cells CD8, and T cells
   regulatory Tregs. Panel b is a correlation matrix highlighting
   relationships between various immune cell types with scales for
   positive and negative correlations. Panel c presents a heatmap of
   correlations between immune cells and markers CD163, FPR1, and VSIG4,
   indicating significance with asterisk levels from p < 0.05 to p <
   0.001.
   [161]Open in a new tab

   Analysis of immune cell infiltration. (a) Box plot illustrating immune
   cell infiltration differences (Wlicoxon rank sum test, *P < 0.05, **P <
   0.01, ***P < 0.001, ****P < 0.0001, ns: P > 0.05). (b) Correlation of
   differential immune cell types (*P < 0.05, **P < 0.01, ***P < 0.001).
   (c) Correlation between biomarkers and differential immune cells (*P <
   0.05, **P < 0.01, ***P < 0.001).

3.7. Molecular regulatory network and drug prediction

   Prediction of miRNA interactions with the three biomarkers revealed
   that VSIG4 was regulated by four miRNAs, including hsa-miR-665; CD163
   was regulated by 11 miRNAs, including hsa-miR-4262; while no miRNA
   regulatory relationships were found for FPR1 ([162] Figure 7a ). TFs
   regulating the biomarkers were also analyzed, revealing that no TFs
   regulated VSIG4 or FPR1, but eight TFs, including SOX9, were found to
   regulate CD163 ([163] Figure 7b ). These findings provided important
   clues for further understanding of immune markers and their regulatory
   networks in HF.

Figure 7.

   [164]Three network diagrams showing interactions. (a) MicroRNA
   interactions with CD163 and VSIG4, displaying connections to various
   hsa-miR identifiers. (b) CD163 linked to several transcription factors,
   including SPI1, CUX1, and SOX9. (c) Relationships among medications,
   microRNAs, and proteins VSIG4, FPR1, and CD163, with nodes representing
   compounds like simvastatin, etoposide, and ribavirin.
   [165]Open in a new tab

   Molecular regulatory network and drug prediction. (a) Regulatory
   network between the CD163 gene and miRNAs, where pink nodes represent
   biomarkers and blue nodes represent interacting miRNAs. (b) Regulatory
   network between the CD163 gene and transcription factors (TFs). (c)
   Diagram of drug prediction for biomarkers.

   A total of 74 biomarker-drug/compound relationships were identified.
   The network analysis suggested that carbachol and etynodiol may have
   potential effects on all three biomarkers. Additionally, six compounds
   were shared between CD163 and FPR1—prednisolone, flunisolide,
   fludroxycortide, halcinonide, ribavirin, and isoflupredone—while five
   compounds were shared between FPR1 and VSIG4, including anisomycin,
   trichostatin A, cephaeline, emetine, and beclometasone ([166] Figure 7c
   ). By understanding the role of these drugs in regulating the
   expression of immune markers, more effective therapeutic strategies may
   be developed in the future.

3.8. Single-cell RNA sequencing analysis

   Following quality control, 23,963 genes and 49,042 cells were
   identified ([167] Additional file 6 ). The top 2000 HVGs were selected,
   and the 10 genes exhibiting the greatest variation were identified
   ([168] Additional file 7a ). PCA was performed on the selected HVGs,
   and the top 10 principal components (PCs) were chosen for further
   analysis (P < 0.05) ([169] Additional files 7b-c ). UMAP clustering
   analysis was conducted prior to cell annotation, resulting in the
   identification of 14 distinct cell clusters ([170] Additional file 7d
   ). Nine cell types and their corresponding markers were extracted for
   annotation based on the reference ([171]12). Subsequently, cell
   annotation revealed eight distinct cell types: endothelium,
   fibroblasts, pericytes, monocytes and macrophages, natural killer and T
   lymphocytes (NK&T cells), neurons, B cells, and smooth muscle cells
   ([172] Figure 8a ; [173]Additional file 8 ). Monocytes and macrophages
   expressing all three biomarkers were designated as key cells ([174]
   Figure 8b ). To explore the biological pathways and functions of these
   cell subtypes in HF development, enrichment analysis revealed that
   pericytes and smooth muscle cells were significantly associated with
   ATP-sensitive potassium channels and BDNF activation of NTRK2 (TRKB)
   signaling, while NK&T cells and B cells were predominantly enriched for
   activation of Na-permeable kainate receptors and hydroxycarboxylic
   acid-binding receptors ([175] Figure 8c ). The above results implied
   that these cell types act synergistically through multiple mechanisms
   and may provide new targets and ideas for the treatment of HF.

Figure 8.

   [176]Panel a displays a UMAP plot differentiating cell types by color,
   including endothelium, fibroblasts, pericytes, monocytes and
   macrophages, NK and T cells, neurons, B cells, and smooth muscle cells.
   Panel b is a dot plot showing expression of features CD163, FPR1, and
   VSIG4 across cell identities, depicted by dot size and color. Panel c
   is a heatmap illustrating gene expression across cell types with a
   color gradient from red to blue.
   [177]Open in a new tab

   Single-cell RNA sequencing analysis. (a) UMAP plot for different cell
   types. (b) Expression profile plot of biomarkers. (c) Pathway
   enrichment analysis of cell subtypes.

   Analysis of cell communication between the eight cell types showed that
   fibroblasts and neurons exhibited the highest number of ligand-receptor
   pairs, indicating the strongest interaction between these two cell
   types. Fibroblasts also demonstrated a higher probability of
   communication with monocytes and macrophages, NK&T cells, and B cells
   ([178] Figures 9a, b ; a: plot of probability of cellular
   communication, b: plot of number of cellular communications). The
   high-frequency interaction of fibroblasts with these immune cells
   suggested that they may play an important role in tissue repair and
   remodeling in immune responses, and inflammation.

Figure 9.

   [179]Graphs and charts depicting cellular interactions and data. Panels
   a and b show network charts illustrating interactions among cell types
   like fibroblasts and macrophages. Panel c shows pseudotime trajectory
   plots, highlighting different cell states. Panel d displays clustering
   based on components, marked by distinct colors. Panel e shows graphs of
   relative expression over pseudotime for markers CD163, FPR1, and VSIG4,
   with expression trends represented by lines.
   [180]Open in a new tab

   Cell subtype communication analysis. (a-b) Diagrams of cell
   communication results. (c) Results of pseudotime and state analysis in
   single-cell pseudotemporal analysis. Darker colors indicate the most
   advanced state of development, while lighter colors indicate more
   mature development. The cells were divided into 9 different periods
   according to their developmental state. (d) Cell pseudotime analysis of
   Seurat clusters. Different cell clusters presented different positions
   at various nodes of the developmental trajectory. (e) Dynamic atlas of
   biomarkers.

   Monocytes and macrophages were projected onto a root with 9 branches,
   traversing 9 nodes along their developmental trajectory. Clusters 0 and
   3 marked the initial stages of monocyte and macrophage development,
   while clusters 4 and 6 were primarily located at the final stages of
   cellular differentiation ([181] Figures 9c, d ). This dynamic
   developmental trajectory may reflected how immune cells progressively
   differentiate and regulate their functions in the body according to
   different needs.

   Given the specific expression of the biomarkers in monocytes and
   macrophages, the gene expression of the three biomarkers was analyzed
   across the pseudo-time series. The expression of CD163 showed a
   decreasing trend over time, with slight increases at certain nodes of
   the developmental cycle, but overall, the expression in the cells
   declined. In contrast, FPR1 exhibited an upward trend, indicating its
   potential significant role in cellular development and differentiation.
   The expression pattern of VSIG4 mirrored that of CD163 ([182] Figure 9e
   ). This expression pattern suggested that their immunosuppressive or
   reparative functions may be gradually replaced by other functions.

   To further explore the biomarker expression of monocyte and macrophage
   subpopulations at different stages of differentiation, 13 cells were
   first clustered and annotated into 5 subpopulations based on marker
   genes ([183] Table 2 ; [184]Additional file 9a-c ). Subsequently, the
   five cell subpopulations were analyzed in a proposed time series. As
   shown in [185]Additional file 9d , cells gradually differentiated over
   time, with darker blue representing earlier differentiation. Each cell
   subpopulation mapped to a different differentiation time and
   corresponded to a different differentiation state, with darker red
   indicating the earliest type of differentiation. As cells
   differentiated, the expression of CD163 and VSIG4 in key cell
   subpopulations gradually increased, while the expression of FPR1 slowly
   decreased ([186] Additional file 9e ). Next, stacked plots of cell
   subpopulations in different differentiation states ([187] Additional
   file 9f ) showed that M1 macrophages were distributed in all
   differentiation states, especially more in state 2 and state 5;
   Intermediate monocytes were distributed only in state 1; and
   Non-classical monocytes were mainly distributed in state 1 and state 3;
   M2 macrophages were concentrated in state 3 and state 4 in the later
   stages of differentiation; Classical monocytes were found mainly in
   state 1 and state 5. Subsequently, the proportions of cell subtypes
   under different groupings were visualized ([188] Additional file 10a ).
   By comparing NOS2, TNF, ARG1 and MRC1 gene expression in
   Monocyte&Macrophage between HF and control samples, TNF and MRC1 were
   found to be significantly different between the two groups ([189]
   Additional files 10b-e ). This provided important clues to a deeper
   understanding of the function of monocytes and macrophages and their
   role in disease.

Table 2.

   Marker gene annotation information for key cell subpopulations.
              TNF                M1 macrophage
   MERTK, CD163, STAB1, MRC1     M2 macrophage
     BASP1, CXCL8, GPR183      Classical monocyte
             FCN1            Non-classical monocyte
            FCGR3A           Intermediate monocyte
   [190]Open in a new tab

3.9. Clinical and animal validation of Hub genes

   To validate the expression levels of ICD-related hub genes in HF, PBMCs
   were extracted from 15 clinical patients with HF and controls for
   RT-qPCR analysis. Results revealed significant down-regulation of
   CD163, FPR1, and VSIG4 in patients with HF ([191] Figure 10a ). Further
   investigation was conducted in heart tissues using the HF rat model.
   Echocardiography showed reduced left ventricular ejection fraction
   (LVEF) and left ventricular fractional shortening index (LVFS),
   alongside increased left ventricular end-systolic diameter (LVIDs) and
   left ventricular end-diastolic diameter (LVIDd) in HF rats ([192]
   Figures 10b-c ). The ratios of heart weight to body weight and lung
   weight to tibia length were significantly elevated ([193] Figure 10d ).
   HE staining revealed prominent cardiomyocyte hypertrophy, with
   inflammatory cell infiltration in the HF group ([194] Figure 10e ).
   Masson staining indicated severe fibrosis in the HF group ([195]
   Figure 10f ), and the difference in fibrosis between the two groups was
   significant ([196] Figure 10g ). Cardiac tissue RT-qPCR results
   confirmed that CD163, FPR1, and VSIG4 were significantly down-regulated
   in HF rats ([197] Figure 10h ). The results of immunohistochemistry
   showed that the expression of CD163^+ cells was decreased in the
   myocardial tissue of HF mice ([198] Additional file 11 ). These results
   suggested that down-regulation of CD163, FPR1, and VSIG4 expression in
   HF patients and HF rat models may be closely associated with
   dysregulation of the immune system, decreased cardiac function, and
   tissue damage.

Figure 10.

   [199]a) Bar graphs comparing the relative expression levels of CD163,
   FPR1, and VSIG4 between NHF and HF, showing statistically significant
   differences. b) Echocardiography images displaying heart functions in
   two conditions. c) Bar graphs showing EF, FS, LVIDd, and LVIDs
   percentages, comparing sham and HF groups with significant differences.
   d) Bar graphs showing HW/BW and LV/BW comparisons between sham and HF.
   e-f) Histological images at 4x and 40x magnification, comparing sham
   and HF cardiac tissues with different staining methods. g) Bar graph of
   collagen volume fraction in sham and HF, showing significant increase
   in HF. h) Bar graphs showing relative expression levels of CD163, FPR1,
   and VSIG4 between sham and HF with significant differences.
   [200]Open in a new tab

   Clinical and animal validation of hub genes. (a) Expression of CD163,
   FPR1, and VSIG4 in peripheral blood mononuclear cells of patients with
   HF and NHF individuals (Unpaired t test, **P < 0.01, ***P < 0.001).
   (b-c) Echocardiograms of the HF rat model and sham group (Unpaired t
   test, ***P < 0.001). (d) The ratios of heart weight to body weight and
   lung weight to tibia length in rat model (Unpaired t test, *P < 0.05).
   (e-f) HE and Masson staining of rat hearts. (g) Collagen volume
   fraction(%) calculated by Masson staining (Unpaired t test, ***P <
   0.001). (h) Expression of CD163, FPR1, and VSIG4 in the hearts of HF
   and sham groups (Unpaired t test, *P < 0.05, **P < 0.01).

4. Discussion

   Cardiac immunology has recently emerged as a focal area of research.
   While some aspects of immune regulation in HF are understood, many
   questions remain to be addressed. ICD is a form of programmed cell
   death induced by antigens and adjuvants, triggering downstream immune
   responses. However, the role and mechanisms of ICD in HF
   pathophysiology remain unclear. In this study, three ICD-related
   biomarkers—CD163, FPR1, and VSIG4—were identified in patients with HF
   using transcriptomic and single-cell dataset analyses ([201] Additional
   file 12 ). Previous studies have shown that these three genes, as
   combined markers, may act synergistically to affect the occurrence and
   development of HF and non-alcoholic fatty liver disease by regulating
   mechanisms such as immune response and monocyte migration. In addition,
   their association with natural killer (NK) cells and macrophages was
   also found, further supporting their important role in the immune
   response ([202]39).

   Single-cell sequencing data in this study were obtained from the
   research by Koenig et al. ([203]12). Unlike the study by Koenig, our
   work systematically integrated multiomics analyses (including
   transcriptomes and single-cell sequencing), machine-learning approaches
   (e.g., SVM-RFE and random forests), and immune infiltration
   assessments, which were not comprehensively combined in their study.
   Furthermore, this study identified the association of CD163, FPR1, and
   VSIG4 with ICD, a connection that Koenig et al. did not investigate.
   Specifically, during ICD, certain molecules, especially ANXA1, may
   enhance local inflammatory responses by binding to FPR1 receptors and
   activating macrophages and monocytes. At the same time, the activation
   of fibroblasts may promote vascular wall structural changes and
   fibrosis ([204]40). Therefore, targeting FPR1 or ICD-related pathways
   may be a potential strategy for the treatment of ascending aortic
   aneurysm. When tumor cells develop ICD through radiotherapy or other
   therapeutic modalities, macrophages recognize tumor cell death signals
   through CD163 receptors. CD163^+ macrophages are normally in an
   immunosuppressive state and help tumors evade immune surveillance by
   promoting Treg cell infiltration and inhibiting effector T cell
   function ([205]41). In addition, carbon ion radiotherapy has been shown
   to effectively reduce fiber deposition in scar tissue by inducing ICD
   of fibroblasts, slowing their proliferation and promoting their death
   ([206]42). Another study pointed out that ICD may affect
   cancer-associated fibroblasts by regulating immune responses, thereby
   altering tumor progression and patient survival prognosis. Although no
   association between ICD and macrophages or fibroblasts has been found
   in HF, these immune cells may affect the occurrence and development of
   HF through the ICD process. Additionally, potential therapeutic targets
   were proposed via drug prediction, such as carbachol and etynodiol,
   which target all three biomarkers. Collectively, this study not only
   extends the findings of Koenig et al. but also offers novel insights
   and references for future research in HF.