Abstract

   Graves' disease (GD) is an autoimmune disorder characterized by
   hyperthyroidism resulting from autoantibody-induced stimulation of the
   thyroid gland. Despite recent advancements in understanding GD's
   pathogenesis, the molecular processes driving disease progression and
   treatment response remain poorly understood. In this study, we aimed to
   identify crucial immunogenic factors associated with GD prognosis and
   immunotherapeutic response. To achieve this, we implemented a
   comprehensive screening strategy that combined computational
   immunogenicity-potential scoring with multi-parametric cluster analysis
   to assess the immunomodulatory genes in GD-related subtypes involving
   stromal and immune cells. Utilizing weighted gene co-expression network
   analysis (WGCNA), we identified co-expressed gene modules linked to
   cellular senescence and immune infiltration in CD4^+ and CD8^+ GD
   samples. Additionally, gene set enrichment analysis enabled the
   identification of hallmark pathways distinguishing high- and low-immune
   subtypes. Our WGCNA analysis revealed 21 gene co-expression modules
   comprising 1,541 genes associated with immune infiltration components
   in various stages of GD, including T cells, M1 and M2 macrophages, NK
   cells, and Tregs. These genes primarily participated in T cell
   proliferation through purinergic signaling pathways, particularly
   neuroactive ligand-receptor interactions, and DNA binding transcription
   factor activity. Three genes, namely PRSS1, HCRTR1, and P2RY4,
   exhibited robustness in GD patients across multiple stages and were
   involved in immune cell infiltration during the late stage of GD
   (p < 0.05). Importantly, HCRTR1 and P2RY4 emerged as potential
   prognostic signatures for predicting overall survival in
   high-immunocore GD patients (p < 0.05). Overall, our study provides
   novel insights into the molecular mechanisms driving GD progression and
   highlights potential key immunogens for further investigation. These
   findings underscore the significance of immune infiltration-related
   cellular senescence in GD therapy and present promising targets for the
   development of new immunotherapeutic strategies.

   Keywords: Graves' disease, Immunotherapeutic biomarkers, Immune
   infiltration, WGCNA, Gene co-expression modules, Molecular mechanisms

Highlights

     * •
       Identified key immunogenic factors for GD prognosis and therapy
       response
     * •
       Utilized computational immunogenicity-potential scoring and
       multi-parametric cluster analysis
     * •
       Co-expression modules related to immune infiltration and cellular
       senescence in CD4^+ and CD8^+ GD samples identified
     * •
       Three genes (PRSS1, HCRTR1, P2RY4) were identified as potential key
       immunogens for GD therapy and overall survival prediction
     * •
       Immune infiltration-related cellular senescence in GD underscores
       the importance of new immunotherapeutic strategies.

1. Introduction

   Graves' disease (GD), a prevalent autoimmune disorder, affects the
   thyroid gland causing hyperthyroidism, often seen in individuals
   between the ages of 30 and 50 [[35]1]. Women are more likely to be
   diagnosed with GD, with a female-to-male ratio of around 7:1 [[36]2].
   The incidence of GD is higher in industrialized countries such as the
   United States and Europe, with a prevalence of 1–2% and 0.5–2%,
   respectively [[37]3,[38]4]. However, the incidence of GD in Asia is
   relatively lower, with a reported rate of 0.3–0.7%. A study published
   in the Journal of Clinical Endocrinology & Metabolism has indicated an
   increasing trend of GD incidence in China, with a rate of 0.5–0.8%
   [[39]5,[40]6].

   The standard treatment options for GD include antithyroid medications,
   radioactive iodine (RAI) therapy, and thyroidectomy [[41][7], [42][8],
   [43][9]]. Close monitoring and adjustments to the treatment plan may be
   necessary for some patients [[44]10,[45]11]. Although current
   treatments effectively manage hyperthyroidism, they do not address the
   underlying immune activation and autoantibody production [[46]11].
   Hence, there is a need for further research and development of
   treatments targeting the underlying immunological mechanisms in GD.

   GD is characterized by immune pathology that involves the production of
   autoantibodies targeting the thyroid-stimulating hormone receptor
   (TSHR) in the thyroid gland, which results in hyperthyroidism due to
   excessive secretion of thyroid hormones [[47]12,[48]13]. The immune
   response in GD also involves the participation of T and B cells, which
   play critical roles in the disease progression [[49]14]. T cells
   secrete cytokines that promote inflammation and stimulate the thyroid
   gland, while B cells produce autoantibodies against thyroid-specific
   antigens, which contribute to ongoing immune activation and excessive
   thyroid hormone secretion [[50]14,[51]15].

   The exact cause of immune-mediated autoantibody production in
   autoimmune disorders is not yet fully understood, although genetic
   susceptibility, viral infections, and endocrine imbalances have been
   suggested as contributing factors. The initial triggers for
   autoantibody production and immune activation are still unknown, and
   the precise interactions between immune cells and autoantibodies in the
   thyroid gland require further investigation [[52]12,[53]16]. However,
   recent advances in computational tools, such as the
   immunogenicity-potential scoring (IPS) [[54]17,[55]18],
   microenvironment cell populations-counter (MCP) [[56][19], [57][20],
   [58][21]], X-cells algorithms [[59]22,[60]23], and weighted gene
   co-expression network analysis (WGCNA) [[61]24,[62]25], have enabled
   researchers to gain a deeper understanding of the molecular mechanisms
   underlying autoantibody production and the complex interactions between
   stromal and immune cells in the thyroid gland [[63]26,[64]27]. These
   cutting-edge tools allow for large-scale gene expression analysis to
   identify key biological pathways and processes involved in the path
   immunology of autoimmune disorders, providing valuable insights into
   the underlying mechanisms of these diseases [[65]28].

   In this study, we present a pioneering investigation into the immune
   landscape of GD utilizing cutting-edge computational tools. Our
   approach integrates advanced methodologies, including IPS, MCP, and
   X-cell algorithms, complemented by the well-established WGCNA
   [[66]26,[67]27]. This departure from conventional research strategies
   aims to fill crucial knowledge gaps regarding the molecular origins of
   immune-mediated autoantibody production in the GD [[68]28]. By
   leveraging large-scale gene expression analysis, we uncover key
   biological pathways, unraveling the intricate immunological processes
   that characterize autoimmune disorders. This unique amalgamation of
   methodologies not only propels our comprehension of GD immunology but
   also lays the groundwork for revolutionary diagnostic and therapeutic
   avenues with profound implications. Specifically, we aim to determine
   the disrupted immune network associated with GD pathogenesis and
   evaluate the feasibility of utilizing a prognostic marker to predict
   the immune microenvironment status in GD patients. By combining the
   expression data of stromal and immune cells in GD-affected tissues, our
   study seeks to derive a novel understanding of the intricate
   interactions between immune cells and autoantibodies in GD and enhance
   our capacity to forecast and manage this debilitating autoimmune
   disorder. Our research has the potential to provide valuable insights
   into the pathophysiology of GD and inform the development of more
   effective diagnostic and therapeutic strategies.

2. Materials and methods

2.1. Study design

   In this observational study, transcriptomic data of GD obtained from
   The Cancer Genome Atlas (TCGA) and The Cancer Genome Atlas Glioma
   (CGGA) collections were analyzed to identify cellular and molecular
   pathways associated with senescence in GD through an immune
   infiltration-guided strategy. The study employed IPS, MCP, and Xcell
   algorithms to comprehensively analyze the immune cell fractions and
   immune clustering of the immune infiltration components in GD datasets.
   The IPS algorithm was utilized to evaluate the overall abundance of
   immune cells within the transcriptomic data. This computational
   approach entails a thorough assessment of the expression profiles
   linked to various immune cell types. IPS offers a quantitative measure,
   enabling the determination of the global immune cell landscape in the
   context of GD [[69]17,[70]18]. Additionally, MCP was employed to
   scrutinize the specifics of immune cell populations present in the GD
   datasets. This algorithm utilizes advanced computational methods to
   deconvolute bulk transcriptomic data, providing insights into the
   quantitative distribution of distinct immune cell subsets. By
   leveraging gene expression signatures associated with various immune
   cell types, MCP enhances the resolution for characterizing specific
   immune cell populations contributing to the GD microenvironment
   [[71]19,[72]20]. Moreover, the distribution analysis of immune cell
   populations in the GD datasets was conducted using the Xcell algorithm.
   Xcell employs a sophisticated computational framework to evaluate the
   relative proportions of diverse immune cell types within the GD
   transcriptomic landscape [[73]22,[74]23]. By leveraging specific gene
   expression profiles associated with various immune cells, Xcell
   enhances our understanding of the spatial arrangement and interplay of
   immune cell populations in the context of GD. These algorithms
   collectively contribute to a comprehensive assessment of the immune
   landscape, providing valuable information on both the overall abundance
   and specific distribution patterns of immune cell populations
   associated with GD [[75]26,[76]27].

   Furthermore, the WGCNA was used to identify genes related to the immune
   hub and involved in glioma immune invasion. Gene Ontology (GO) and
   Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were
   performed to uncover specific signaling pathways, and the reliability
   of the immune-hub genes was evaluated through clinical prognosis
   analysis. A protein-protein interaction analysis was also performed to
   predict potential mechanisms. The study aimed to provide a
   comprehensive understanding of the immune profile of GD and its
   potential implications for prognosis and therapy. The sample abundance
   scores for immune cells were separately calculated for patient samples
   from different clinical stages and ages [[77]29]. The flowchart in
   [78]Fig. 1 summarizes the preparation, processing, analysis, and
   validation of the data.

Fig. 1.

   [79]Fig. 1
   [80]Open in a new tab

   Flowchart describing the schematic overview of the study design for
   investigating immune infiltration characteristics in patients with GD.

2.2. Data acquisition

   In this study, transcriptomic datasets were collected from the NCBI GEO
   database to investigate epigenetic profiling in CD4^+ and CD8^+ T cells
   of patients with GD. The datasets contained information on changes in
   genes related to T cell receptor signaling, providing valuable insights
   into the underlying mechanisms of GD. The study sample consisted of 38
   GD patients and 31 healthy individuals from the [81]GSE71956 cohort.
   The researchers extracted total RNA from sorted CD4^+ and CD8^+ T cells
   to analyze DNA methylation patterns of specific genes and examine their
   correlation with gene expression profiles in GD. This approach enabled
   the identification of potential targets for therapeutic intervention
   and a deeper understanding of the molecular pathways involved in GD
   pathogenesis. The study followed the PRISMA statement recommendations
   to ensure transparency and reproducibility of the research process.

2.3. Data pre-processing

   Before the gene expression analysis, the raw data were preprocessed
   using the Limma package in R/Bioconductor, with version R 2.14.0.
   Quantile normalization was conducted on the data to ensure the accurate
   representation of gene expression levels, which was implemented using
   the Limma package. To match the microarray probe sets with their
   corresponding genes, an Affymetrix-released annotation file for each
   platform was used. To eliminate redundancy, probes that mapped to
   multiple genes were removed, and the average expression level of the
   probes that mapped to the same gene was used. Lastly, hierarchical
   clustering was carried out to eliminate any noisy data from the
   analysis.

2.4. Gene co-expression network reconstruction

   The WGCNA approach was employed to identify co-expressed gene modules
   related to cellular senescence and immune infiltration in the GD
   [[82]28,[83]30]. To begin with, three biological co-expression networks
   of immune pathways were constructed using the WGCNA R package, and a
   weighted adjacency matrix and a topological overlap matrix were built
   to evaluate the correlation among the genes involved in the senescence
   [[84]30,[85]31]. The authors identified biologically relevant modules
   that had a minimum size of 30 and a split depth of 3. To understand the
   association of gene sets with the disease immunophenotype, the
   module-phenotype relationship was calculated, and the study highlighted
   that the glioma-immune microenvironment can impact cellular
   senescence's pro- or anti-immunopathology tendencies and therapy
   response. Additionally, the IPS, MCP, and Xcell algorithms were
   employed to evaluate the immune cell infiltration level, and the
   correlation between the immune score and hub genes was determined.
   Finally, the Chi-square test was carried out to assess the significance
   of immuno-module genes in each group, to comprehend the immune
   infiltration-related cellular senescence genes in glioma.

2.5. Gene and pathway enrichment analysis

   To gain insights into the biological processes and molecular functions
   related to the targeted immune hubs in glioma, several analyses were
   conducted using multiple datasets such as DisGeNET [[86]32], Clinical
   Variations [[87]24], DisGeNET [[88]25], BeFree [[89]26], BioCyc
   [[90]27], KEGG [[91]28], and MSigDBBioCarta [[92]30]. ToppGene and the
   clusterProfiler package were used to perform GO analysis and KEGG
   analysis to uncover enriched pathways across multiple datasets
   [[93][33], [94][34], [95][35]]. The gene set enrichment analysis (GSEA)
   method was used to conduct pathway enrichment analysis based on 50
   hallmark pathways from the Molecular Signature Database [[96]36]. The
   Enrichplot package was used to visualize the results, and Kaplan-Meier
   diagrams were created to determine the prognostic pattern of the top
   overlapping oncogenic pathways. These analyses aimed to reveal the
   underlying mechanisms of the targeted immune hubs and their impact on
   patient prognosis [[97]37].

2.6. Protein-protein interaction network construction

   To analyze the relationships between genes, a protein-protein
   interaction (PPI) network was constructed from the genes in the
   enriched modules. The PPI network was constructed using the Search Tool
   for the Retrieval of Interacting Genes (STRING) and visualized using
   Cytoscape [[98]38]. In the context of our analysis, STRING served as a
   valuable resource for comprehending the functional associations and
   potential interactions between genes. The constructed PPI network,
   visualized through Cytoscape, provided an insightful representation of
   gene connectivity and potential functional relationships. Hub genes
   were identified as those with node connectivity of ≥5, indicating that
   they play a crucial role in the network due to their many connections
   to other genes. The PPI network visualization allows for the
   identification of potential interactions and functional relationships
   between genes, providing valuable information for further analysis. The
   integration of STRING and Cytoscape in this analysis not only enhances
   the clarity of the PPI network but also contributes to the
   identification of crucial interactions, providing a foundation for
   further in-depth analysis of the functional relationships between
   genes.

2.7. Development of a comprehensive and predictive nomogram

   To investigate the relationship between the expression of the key
   immune-related gene and clinical grade, the researchers utilized the
   survival package and survminer package in R for survival analysis.
   Clinicopathological grading and survival time records of the GD cases
   in the dataset were analyzed, and the correlation between the target
   immune-related gene and clinical grade was evaluated. The Kaplan-Meier
   method was employed to plot the survival curve, with the log-rank test
   used as a statistical significance test. A p-value of less than 0.05
   was considered significant. Univariate and multivariate Cox regression
   analyses were conducted to identify independent prognostic factors. A
   comprehensive and predictive nomogram was developed using the "rms" R
   package, which incorporated the immune-score signature, age, gender,
   and stages. The performance and accuracy of the nomogram were evaluated
   using receiver operating characteristics (ROC) and calibration curves.

2.8. Data analysis

   All data analysis in this study was carried out using the R software,
   version 4.0.3. Statistical tests such as the Wilcoxon signed-rank test
   were used to compare two groups, while the Kruskal-Wallis test was
   applied to compare three or more groups. A p-value less than 0.05 was
   considered statistically significant, and all tests were two-sided.
   Descriptive statistics such as mean ± standard deviation (SD) and
   medians (ranges) were used to present normally and non-normally
   distributed variables, respectively. Categorical variables were
   reported as numbers and percentages. T-tests or one-way ANOVA were used
   for data analysis, while Spearman's rank correlation coefficient test
   was used to identify the effect of the cut-off threshold and construct
   the ROC curve. The relationship between sensitivity and specificity was
   determined between groups. An adjusted p-value less than 0.05 was
   considered statistically significant.

3. Results

3.1. Identification of immune infiltration-related senescence genes

   In this study, we aimed to identify senescence genes that have a role
   in the immune system and their connection with chronic diseases and GD.
   To achieve this, we utilized three algorithms - IPS, MCP, and Xcell -
   to measure different immune cell subsets in the TCGA cohort based on
   their phenotypic characteristics. S1 Figure shows the results of these
   analyses for each sample, and we examined the immune molecular
   characteristics at different stages of CD4^+ and CD8^+ GD samples. Our
   results indicated a significant decrease in immune scores as GD
   progressed, while purity decreased in higher grades ([99]Fig. 2A). We
   also observed that GD was more prevalent in the 30–50 age group than in
   those above 50 years old ([100]Fig. 2B). By using these algorithms, we
   evaluated the immune scores and infiltrations of various immune
   infiltration components such as B cells, M1 Macrophages, M2
   Macrophages, NK cells, and Tregs at different stages of GD ([101]Fig.
   2C). Our analysis indicated that the infiltration of these immune cells
   was lower in the later stages of GD, except for DCs, which increased in
   severe stages. We also found a positive correlation between GA-related
   genes and immune scores. Additionally, we assessed the accuracy of the
   immunogenicity score of IPS, AZ-IPS, and CP-IPS and observed that
   different immunogenicity algorithms of GD decreased in the late stages
   of the disease. However, the MHC-I binding prediction component of the
   IPS algorithm (MHC 1-IPC) increased in the severe stage of GA,
   indicating an accurate prediction of MHC binding affinity for the
   development of effective cancer immunotherapies ([102]Fig. 2D).
   Overall, our study provides insights into the role of senescence genes
   in the immune system and their association with chronic diseases and
   GD.

Fig. 2.

   [103]Fig. 2
   [104]Open in a new tab

   Differences in immune infiltration characteristics among patients with
   different stages of GD (A) and different ages (B). (C) The nine immune
   cell scores differed among the three stages based on IPS, MCP, and
   Xcell algorithms. (D) Immune cell fractions between different stages of
   GD. Values are mean ± SEM, *p < 0.05, **p < 0.001.

3.1.1. Immuno-module screening

   To analyze samples with high immune scores at the systems level, we
   utilized the WGCNA approach to construct a biological network. Samples
   were divided into high-expression outliers and low-expression groups
   based on the interquartile range, with values 1.5 times greater than
   the interquartile range being classified as high-expression outliers.
   Both IPC and MCP algorithms were used to analyze and select the high
   immune score samples ([105]Fig. 3 A and 3B respectively). Out of the 38
   samples, 11 samples with low immune scores were identified as outliers
   and removed from the study. The high immune expression group consisted
   of 68% (27/38) of the TCGA samples and 32% (12/38) of the samples
   ([106]Fig. 3). For further investigation of immune-related genes and
   hubs, we performed additional analysis on the high immune score
   samples, given the focus of our study on GD, an autoimmune disorder. We
   sorted the sample results according to all immunoassay measurements,
   after screening them using both the IPC and MCP algorithms, as outlined
   in the S1 Table.

Fig. 3.

   [107]Fig. 3
   [108]Open in a new tab

   Immune screening of gene expression in CD4^+ and CD8^+ GD samples from
   the TCGA database was evaluated based on the IPS (A) and MCP (B)
   algorithms. The vertical gray line denotes a subset of high immune
   score expression outliers (right) determined by the 1.5 × interquartile
   range. Box plots (far right) display aggregate expression of the genes
   in high and low immune score groups. Statistical analysis was performed
   using the Mann-Whitney U test.

3.1.2. Immuno-module constructing

   The study utilized the WGCNA method to construct a co-expression
   network of genes associated with the development of GD. Analysis of the
   microarray data identified 1,541 differentially expressed genes, and 21
   co-expression modules were found to be linked with GD ([109]Fig. 4A).
   We identified five modules through cluster dendrogram analysis, each
   labeled with a unique color: blue, cyan, dark red, midnight blue, and
   royal blue, which showed similar module integration ([110]Fig. 4).
   Among the selected immuno-module, our analysis resulted in a scale-free
   network that displayed a relative balance between scale independence
   and means connectivity when we used a soft thresholding power of 6
   ([111]Fig. 4B). Interestingly, the blue module contained taxonomic
   genes that were different from the genes in the other modules. We
   employed eigengene co-expression analysis to merge the dynamic modules
   into three using a threshold of 0.5, and this process confirmed the
   reliability of the dark-red module, as illustrated in [112]Fig. 4B. Our
   study's key finding was that 75% (52/69) of the genes in the dark-red
   module were significantly correlated with immune response in GD, making
   it the critical module between the normal mild GD and severe GD groups
   (p = 0.035). We have listed the genes of the selected immune module in
   the S2 Table. We also conducted dimensionless topological network
   analysis between different stages and age groups to identify potential
   candidate key immunogens linked to GD progression, which was used to
   scale the adjacency matrix ([113]Fig. 4C). Our results indicated that
   the dark-red module was the most significant module linked to
   late-stage GD and contained 69 mRNAs that were potential candidates for
   further investigation ([114]Fig. 4D; p = 0.004). Moreover, our module
   correlation analysis revealed statistically significant positive
   correlations between age and pathological stage characteristics with
   the dark red, midnight blue, and royal blue modules (all p < 0.05).

Fig. 4.

   [115]Fig. 4
   [116]Open in a new tab

   Weighted gene correlation network analysis (WGCNA) of CD4^+ and CD8^+
   GD samples based on stage (mild, moderate, and severe) and age range.
   (A) Clustering dendrogram and module-trait analysis. Different colors
   of the column indicate different hub modules. (B) Analysis of the
   scale-free fit index and mean connectivity for various soft-threshold
   powers. (C) Clustering dendrogram of all genes with dissimilarity based
   on the topological overlap and assigned module colors. (D) Heatmap for
   the correlation between immune modules and traits. (For interpretation
   of the references to color in this figure legend, the reader is