Abstract

Autoimmune diseases (ADs) are a broad range of diseases in which the immune response to self-antigens causes damage or disorder of tissues, and genetic susceptibility is regarded as the key etiological factor of ADs. Accumulating evidence suggests that different ADs have certain commonalities. However, theoretical research on the similarity between ADs is still limited. In this work, we first computed the genetic similarity between 26 ADs based on three measurements: network similarity (NetSim), functional similarity (FunSim), and semantic similarity (SemSim), and systematically identified three significant pairs of similar ADs: rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE), myasthenia gravis (MG) and autoimmune thyroiditis (AIT), and autoimmune polyendocrinopathies (AP) and uveomeningoencephalitic syndrome (Vogt-Koyanagi-Harada syndrome, VKH). We then investigated the Gene Ontology terms and pathways enriched for the three significant AD pairs through functional analysis. By cluster analysis of the similarity matrix of the 26 ADs, we found that the three significant AD pairs fell into three different disease clusters, and the ADs within each disease cluster might have high genetic similarity. We also detected the risk genes shared by the ADs belonging to the same disease cluster. Overall, our findings provide insight into the genetic commonalities of different ADs, and contribute to the discovery of novel biomarkers and the development of new therapeutic methods for ADs.

Keywords: ADs, genetic susceptibility, network similarity, functional similarity, semantic similarity, autoimmune tautology

Introduction

Autoimmune diseases (ADs) are characterized by abnormal immune responses that can damage human tissues as a result of the loss of immune tolerance to self-antigens (Margo and Harman, 2016).
ADs affect more than 5% of the global population, and their incidence and mortality have increased markedly (Ji et al., 2016). Possible causes include genetic, environmental, hormonal, and immunological factors (Stojanovich and Marisavljevich, 2008). However, neither the underlying mechanism of action nor the etiology of ADs is clear, and there is still no effective cure for these diseases (Rosenblum et al., 2012; Li et al., 2017). ADs share several clinical signs and symptoms, physiopathological mechanisms, and environmental and genetic factors, which indicates that they have a common origin. This has been called the autoimmune tautology. A growing body of evidence supports the existence of the autoimmune tautology among various ADs:

1) Different ADs exhibit the same phenotypic characteristics (Anaya, 2017). These diseases, whether organ-specific or systemic, show tissue and organ damage and inflammatory pathological features (Place and Kanneganti, 2020).

2) Different ADs exhibit the same clinical characteristics. Clinically, the results of serological examinations of patients often overlap. The same patient may also suffer from two or more ADs simultaneously, which has been called polyautoimmunity (PolyA) (Anaya, 2014). In addition, ADs tend to cluster within families (Cardenas-Roldan et al., 2013).

3) Different ADs exhibit the same genetic characteristics. ADs are caused by mutations at multiple loci in the human genome and share the same main genetic loci (Anaya et al., 2006). For example, a previous study indicated that the human leukocyte antigen (HLA) is a susceptibility gene shared by multiple ADs (Cruz-Tapias et al., 2012). Ueda et al.
(2003) found that a molecule encoded by CTLA4 was vital for negative regulation of the immune system and could increase the risk of several ADs, such as Graves disease, autoimmune hypothyroidism, and type 1 diabetes mellitus (T1D), which indicated that ADs might share similar pathogenic mechanisms. Li et al. (2015) showed that different pediatric ADs shared genetic variants. They analyzed the clinical cases of ten different ADs and found that many of these diseases were familial and that patients often suffered from several ADs at the same time. In that study, 27 significant genetic risk loci were identified, of which 22 were shared by at least two ADs and 19 were shared by at least three ADs. Thus, identification of risk genes shared by multiple ADs may help to explain the development of PolyA.

4) In addition, different ADs also exhibit the same epigenetic characteristics. Epigenetic studies have found that ADs share similar epigenetic mechanisms. For instance, the DNA promoter regions in the target cells of systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA) showed low methylation (Quintero-Ronderos and Montoya-Ortiz, 2012).

The strong similarity among ADs provides a deeper understanding of their common underlying mechanisms, and also prompts researchers to classify ADs. Therefore, studies on the genetic similarity of ADs can help to dissect AD pathogenesis, and contribute to the discovery of novel biomarkers and the development of new therapeutic methods for ADs, which is extremely important in clinical research. In this study, we used three measurements, namely network similarity, functional similarity, and semantic similarity, to analyze the genetic similarity between 26 ADs (the workflow diagram is shown in Figure 1). We identified three significant pairs of similar ADs by multi-step computational approaches.
In addition, based on the similarity matrix of the 26 ADs, we used cluster analysis to find other ADs that were similar to the significant pairs of similar ADs. The risk genes shared by the ADs belonging to the same disease cluster could be promising biomarkers for ADs. Our findings provide a novel perspective on the genetic commonalities of different ADs and should facilitate research on AD mechanisms.

FIGURE 1. The workflow diagram for this study.

Materials and Methods

Collection of AD Terms and AD-Related Genes

The AD terms (category C20.111), comprising 68 diseases, were acquired from the Medical Subject Headings (MeSH, https://www.nlm.nih.gov/mesh/meshhome.html). After removing the complications of ADs, we extracted human disease-related genes from the Genetic Association Database (GAD, https://geneticassociationdb.nih.gov/) (Becker et al., 2004) and mapped these genes to the AD terms for integration. The disease gene sets consisted of 267 related genes of 26 ADs (Supplementary Table S1).

Calculation of Network Similarity Between ADs

We downloaded human protein-protein interaction data from the Human Protein Reference Database (HPRD, http://www.hprd.org/) (Peri et al., 2003) and used Cytoscape (v3.8.2) (Shannon et al., 2003) to construct a human protein-protein interaction network. The topological properties of AD-related genes in this network were computed (Hidalgo et al., 2009; Chavali et al., 2010). The gene set of disease $d$ was defined as $G = \{g_1, g_2, g_3, \ldots, g_i, \ldots, g_k\}$.
To assess the network similarity between ADs, we first calculated the average topological properties of each AD in the network as follows:

$$R=\frac{\sum_{1\le i\le k} r_i}{k},\quad C=\frac{\sum_{1\le i\le k} c_i}{k},\quad B=\frac{\sum_{1\le i\le k} b_i}{k},\quad S=\frac{\sum_{1\le i\le k} s_i}{k},\quad \text{and}\quad H=\frac{\sum_{1\le i\le k} h_i}{k} \qquad (1\text{-}5)$$

where $k$ represents the number of genes in $G$, $g_i$ is the $i$th gene of $G$, $r_i$ is the degree of $g_i$, $c_i$ is the clustering coefficient of $g_i$, $b_i$ is the betweenness centrality of $g_i$, $s_i$ is the average shortest path length of $g_i$, $h_i$ is the neighborhood connectivity of $g_i$, $R$ is the degree of $d$, $C$ is the clustering coefficient of $d$, $B$ is the betweenness centrality of $d$, $S$ is the average shortest path length of $d$, and $H$ is the neighborhood connectivity of $d$.

As shown in Figure 1, $d_1$ and $d_2$ are two ADs from MeSH, and $G_1$ and $G_2$ are the gene sets related to $d_1$ and $d_2$. The average topological properties of $d_1$ and $d_2$ were defined as the vectors $T_1 = \{R_1, C_1, B_1, S_1, H_1\}$ and $T_2 = \{R_2, C_2, B_2, S_2, H_2\}$, respectively. We defined the network similarity (NetSim) score between $d_1$ and $d_2$ as the Pearson correlation coefficient (PCC) of $T_1$ and $T_2$:

$$\rho_{T_1,T_2}=\frac{\operatorname{cov}(T_1,T_2)}{\sigma_{T_1}\,\sigma_{T_2}} \qquad (6)$$

where $\operatorname{cov}(T_1,T_2)$ is the covariance of $T_1$ and $T_2$, and $\sigma_{T_1}$ and $\sigma_{T_2}$ are the standard deviations of $T_1$ and $T_2$.

Calculation of Functional Similarity Between ADs

The data on the functional interactions of genes were downloaded from HumanNet, a human gene functional interaction network based on Gene Ontology annotation (Lee et al., 2011). Each interaction in HumanNet has a log likelihood score (LLS) that measures the probability of a functional association between genes (Cheng et al., 2014).
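Looking back at the NetSim definition in Eqs. 1-6, the computation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`disease_profile`, `pearson`, `net_sim`) and the toy values are hypothetical, and each gene is assumed to be supplied as a tuple of its five topological properties in the order (degree, clustering coefficient, betweenness centrality, average shortest path length, neighborhood connectivity).

```python
from math import sqrt
from statistics import mean


def disease_profile(gene_props):
    """Average each topological property over a disease's genes (Eqs. 1-5).

    gene_props: list of 5-tuples, one tuple per gene (hypothetical input layout).
    Returns the vector T = [R, C, B, S, H].
    """
    return [mean(column) for column in zip(*gene_props)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors (Eq. 6)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


def net_sim(gene_props_d1, gene_props_d2):
    """NetSim between two diseases: PCC of their averaged property vectors."""
    return pearson(disease_profile(gene_props_d1), disease_profile(gene_props_d2))


if __name__ == "__main__":
    # Toy values only; real inputs would come from the PPI network analysis.
    d1 = [(2, 0.5, 0.1, 3.0, 4.0), (4, 0.1, 0.3, 2.0, 6.0)]
    d2 = [(3, 0.4, 0.2, 2.5, 5.0), (5, 0.2, 0.1, 2.0, 9.0)]
    print(net_sim(d1, d2))
```

The text does not say whether the five properties were rescaled before the correlation was taken, so the sketch applies no rescaling.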
We downloaded the LLSs between human genes from HumanNet and normalized them as follows:

$$LLS_N(g_i,g_j)=\frac{LLS(g_i,g_j)-LLS_{min}}{LLS_{max}-LLS_{min}} \qquad (7)$$

where $g_i$ and $g_j$ are the $i$th and $j$th genes, respectively. $LLS_N(g_i,g_j)$ is the LLS between $g_i$ and $g_j$ after normalization, and $LLS(g_i,g_j)$ is the LLS between $g_i$ and $g_j$. $LLS_{min}$ and $LLS_{max}$ are the minimum and maximum LLS in HumanNet, respectively. The functional similarity (FunSim) score between a pair of genes was defined as follows:

$$FunSim(g_i,g_j)=\begin{cases}1, & i=j\\ LLS_N(g_i,g_j), & i\neq j \text{ and } e(i,j)\in E(HumanNet)\\ 0, & i\neq j \text{ and } e(i,j)\notin E(HumanNet)\end{cases} \qquad (8)$$

where $e(i,j)$ represents the interaction edge between $g_i$ and $g_j$, and $E(HumanNet)$ is the set of all edges in HumanNet. Next, we defined the functional association between a gene $g$ and a gene set $G = \{g_1, g_2, g_3, \ldots, g_i, \ldots, g_k\}$ as follows:

$$F_G(g)=\max_{1\le i\le k}\big(FunSim(g,g_i)\big),\quad g_i\in G \qquad (9)$$

where $k$ represents the number of genes in $G$ and $g_i$ is the $i$th gene of $G$. $G_1 = \{g_{11}, g_{12}, \ldots, g_{1i}, \ldots, g_{1m}\}$ and $G_2 = \{g_{21}, g_{22}, \ldots, g_{2j}, \ldots, g_{2n}\}$ are the gene sets related to $d_1$ and $d_2$, respectively; $m$ is the number of genes in $G_1$, and $n$ is the number of genes in $G_2$. We defined the FunSim score of $d_1$ and $d_2$ as follows:

$$FunSim(d_1,d_2)=\frac{\sum_{1\le i\le m}F_{G_2}(g_{1i})+\sum_{1\le j\le n}F_{G_1}(g_{2j})}{m+n},\quad g_{1i}\in G_1,\ g_{2j}\in G_2 \qquad (10)$$

Calculation of Semantic Similarity Between ADs

Semantic similarity measures the closeness between two terms according to a given ontology (Zhang and Lai, 2015; Zhang and Lai, 2016; Del Prete et al., 2018). We applied the Resnik method in our study. The human disease terms were obtained from the Human Disease Ontology (DO, http://www.disease-ontology.org) (Schriml et al., 2019).
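Returning to the FunSim definition in Eqs. 7-10, the following Python sketch walks through the four steps on a toy network. It is an illustration only: the function names, the representation of the edge list as a dictionary keyed by `frozenset` gene pairs, and the gene labels "A" to "D" are all assumptions made for the example, not part of the original pipeline.

```python
def normalize_lls(lls):
    """Min-max normalise raw LLS edge scores (Eq. 7).

    lls: dict mapping frozenset({gene_a, gene_b}) -> raw LLS (hypothetical layout).
    """
    lo, hi = min(lls.values()), max(lls.values())
    if hi == lo:
        raise ValueError("all LLS values are identical; cannot normalise")
    return {edge: (score - lo) / (hi - lo) for edge, score in lls.items()}


def gene_fun_sim(a, b, lls_n):
    """FunSim between two genes (Eq. 8): 1 if identical, the normalised LLS
    if the pair is an edge of the network, and 0 otherwise."""
    if a == b:
        return 1.0
    return lls_n.get(frozenset((a, b)), 0.0)


def gene_to_set(g, gene_set, lls_n):
    """Functional association F_G(g): best match of g within gene_set (Eq. 9)."""
    return max(gene_fun_sim(g, h, lls_n) for h in gene_set)


def disease_fun_sim(genes_d1, genes_d2, lls_n):
    """FunSim between two diseases (Eq. 10)."""
    total = sum(gene_to_set(g, genes_d2, lls_n) for g in genes_d1)
    total += sum(gene_to_set(g, genes_d1, lls_n) for g in genes_d2)
    return total / (len(genes_d1) + len(genes_d2))


if __name__ == "__main__":
    # Toy edge list with made-up scores.
    raw = {
        frozenset(("A", "B")): 1.0,
        frozenset(("B", "C")): 3.0,
        frozenset(("C", "D")): 2.0,
    }
    lls_n = normalize_lls(raw)
    print(disease_fun_sim(["A", "B"], ["B", "C"], lls_n))
```

One consequence of Eq. 7 that the sketch makes visible is that the weakest edge in the network normalizes to 0, so it contributes the same FunSim as a missing edge.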
The DO covers the breadth of common and rare diseases, organized as a directed acyclic graph in which a node represents a DO term and an edge represents an "IS_A" relationship between diseases. The information content (IC) of each DO term was calculated as follows:

$$IC(d)=-\log\left(\frac{n}{N}\right) \qquad (11)$$

where $d$ is a disease term of DO, $n$ is the number of genes related to $d$, and $N$ is the total number of genes related to DO. As shown in Figure 1, $d_1$ and $d_2$ are two AD terms of DO, and $d_{MICA}$ is the most informative common ancestor (MICA) of $d_1$ and $d_2$, that is, the ancestor with the maximum IC among all the common ancestors of the two terms in the ontology. We defined the semantic similarity (SemSim) score of $d_1$ and $d_2$ as follows:

$$SemSim(d_1,d_2)=IC(d_{MICA}) \qquad (12)$$

Calculation of Integrated Similarity Between ADs

To identify more reliable pairs of similar ADs, we integrated the above three similarity scores (NetSim, FunSim, and SemSim) to comprehensively determine the level of similarity between ADs as follows:

$$IntegratedSim(d_1,d_2)=\sqrt[3]{NetSim(d_1,d_2)\times FunSim(d_1,d_2)\times SemSim(d_1,d_2)} \qquad (13)$$

where $d_1$ and $d_2$ are two AD terms of MeSH.

Functional Analysis of Related Genes of Similar ADs

Functional enrichment analysis of Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways for the related genes of similar ADs was performed to infer potential biological processes and pathways using the DAVID Bioinformatics Tool (http://david.abcc.ncifcrf.gov/, version 6.7) (Huang Da et al., 2009). The p-values for the biological processes and pathways were adjusted for the false discovery rate (FDR) by the Benjamini-Hochberg method. Biological processes and pathways with an FDR less than 0.05 were considered statistically significant functional categories.
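The semantic and integrated scores of Eqs. 11-13 can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the ontology is assumed to be pre-processed into a dictionary `ancestors` that maps each term to the set of its ancestors including itself, the gene counts are invented, and the function names are hypothetical. The cube root is taken as a real root that keeps the sign of the product, which is an assumption made here because a Pearson-based NetSim can in principle be negative and the text does not say how such cases were handled.

```python
from math import copysign, log


def information_content(n_genes, n_total):
    """IC of a term: -log(n / N) (Eq. 11). Rare terms get a high IC."""
    return -log(n_genes / n_total)


def resnik_sim(d1, d2, ancestors, gene_counts, n_total):
    """SemSim as the IC of the most informative common ancestor (Eq. 12).

    ancestors: dict term -> set of ancestor terms, each set including the term itself.
    gene_counts: dict term -> number of genes annotated to the term.
    """
    common = ancestors[d1] & ancestors[d2]
    if not common:
        return 0.0
    return max(information_content(gene_counts[t], n_total) for t in common)


def integrated_sim(net, fun, sem):
    """Cube root of the product of the three scores (Eq. 13)."""
    product = net * fun * sem
    # Real cube root; plain `product ** (1 / 3)` would be complex for negatives.
    return copysign(abs(product) ** (1.0 / 3.0), product)


if __name__ == "__main__":
    # Toy three-level ontology with invented gene counts.
    ancestors = {
        "RA": {"RA", "autoimmune disease", "disease"},
        "SLE": {"SLE", "autoimmune disease", "disease"},
    }
    gene_counts = {"disease": 100, "autoimmune disease": 10, "RA": 4, "SLE": 5}
    sem = resnik_sim("RA", "SLE", ancestors, gene_counts, n_total=100)
    print(sem, integrated_sim(0.8, 0.6, sem))
```

Note that the Resnik score is an information content and so is not bounded by 1; the sketch leaves it unscaled because no rescaling step is described.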
Cluster Analysis for ADs Based on Three Kinds of Similarity

To determine whether multiple ADs had high genetic similarity, hierarchical clustering was performed on the integrated similarity scores between the 26 ADs based on the Euclidean distance. Let $d_1$ and $d_2$ be two of the 26 ADs. We defined the Euclidean distance between $d_1$ and $d_2$ as follows:

$$E(d_1,d_2)=\sqrt{\sum_{1\le i\le 26}(x_{1i}-x_{2i})^2} \qquad (14)$$

where $x_{1i}$ is the integrated similarity score between $d_1$ and the $i$th disease of the 26 ADs, and $x_{2i}$ is the integrated similarity score between $d_2$ and the $i$th disease of the 26 ADs. The 26 ADs were clustered based on the Euclidean distances between diseases, so that ADs belonging to the same disease cluster had shorter Euclidean distances between them.

Results

Identification of Potential Pairs of Similar ADs

To identify pairs of genetically similar diseases, we calculated the similarity between 26 ADs based on three measurements. All AD pairs were sorted in descending order according to their NetSim, FunSim, and SemSim scores. To enhance the reliability of the study, we selected the ten AD pairs in the intersection of the top 50 AD pairs by NetSim score, the top 50 AD pairs by FunSim score, and the top 50 AD pairs by SemSim score for further analysis (Figure 2 and Supplementary Tables S2–4). These were considered potential pairs of similar ADs.

FIGURE 2. Venn diagram analysis of three groups of AD pairs ranked by NetSim, FunSim, and SemSim, respectively.

Further, a network was generated from the ten AD pairs and their related genes, including 14 ADs and 247 genes (Figure 3), and its topological properties were investigated. We extracted from the network the 35 genes with a degree of at least three as possible AD relevant genes. The top five genes by degree were HLA-DRB1, HLA-DQB1, CTLA4, HLA-DQA1, and TNF, indicating that these genes were critical in multiple ADs.
Previous studies have demonstrated that these genes are associated with many ADs and exert various functions in human autoimmune disorders. In addition, RA and multiple sclerosis (MS) were involved in more similarity relationships than the other ADs in this network, implying that pathogenic mechanisms of RA and MS might be shared by most ADs in the network. Moreover, a pair of diseases with a higher number of shared genes is likely to have higher genetic similarity. Thus, the ten potential pairs of similar ADs were ranked by their number of shared genes (Table 1). The top-ranked disease pair was RA and SLE, followed by T1D and RA, and RA and MS.

FIGURE 3. Network of potential pairs of similar ADs and AD-gene relationships. The blue nodes represent genes, and the size of these nodes corresponds to the node degree. The green nodes represent ADs. The gray edges represent disease-gene relationships, and the orange edges represent potential AD similarity relationships.

TABLE 1. The ten potential pairs of similar ADs ranked by number of shared genes.

Rank | Autoimmune disease | Autoimmune disease | Number of shared genes
1 | Arthritis, Rheumatoid | Lupus Erythematosus, Systemic | 40
2 | Diabetes Mellitus, Type 1 | Arthritis, Rheumatoid | 28
3 | Arthritis, Rheumatoid | Multiple Sclerosis | 23
4 | Lupus Erythematosus, Systemic | Multiple Sclerosis | 20
5 | Multiple Sclerosis | Sjogren’s Syndrome | 9
6 | Myasthenia Gravis | Thyroiditis, Autoimmune | 7
7 | Addison Disease | Graves Disease | 6
8 | Hepatitis, Autoimmune | Sjogren’s Syndrome | 3
9 | Polyendocrinopathies, Autoimmune | Uveomeningoencephalitic Syndrome | 2
10 | Purpura, Thrombocytopenic, Idiopathic | Still’s Disease, Adult-Onset | 1

Functional Implication of the Genes Related to Multiple ADs

To explore the common genetic mechanisms of a variety of ADs, we performed functional enrichment analysis of GO and KEGG for the 35 AD relevant genes (Figure 4A).
The cutoff criterion was an FDR less than 0.05. The top ten significant GO terms in the biological process (BP) category were mainly associated with immune response, antigen processing and presentation, and interferon gamma (IFNγ)-related functions (Figure 4B). Notably, IFNG was involved in the top three GO terms and was defined as a hub gene. IFNG encodes IFNγ, a cytokine that is critical for innate and adaptive immunity against viral, bacterial, and protozoan infections. Aberrant IFNγ expression is associated with a number of ADs, such as RA and SLE (Hu and Ivashkiv, 2009; Barrat et al., 2019). The top ten significant KEGG pathways contained four AD-correlated pathways, namely "inflammatory bowel disease (IBD)," "autoimmune thyroid disease," "type 1 diabetes mellitus," and "rheumatoid arthritis," which suggested that the 35 genes might be involved in the initiation and development of multiple ADs (Figure 4C). HLA-DQB1, HLA-DRB1, HLA-DPB1, and HLA-DQA1 were involved in all of the top ten KEGG pathways and were defined as hub genes. These four genes are HLA class II genes, whose alleles have been suggested to contribute to susceptibility and resistance to ADs (Wieber et al., 2021).

FIGURE 4. Identification and functional enrichment analysis of AD relevant genes. (A) The genes correlated with at least three ADs derived from the AD similarity network, ranked by the number of related diseases. (B) Circos plot of the top ten significant GO terms in the BP category. (C) Circos plot of the top ten significant KEGG pathways. The genes are displayed on the left half of the circos plots. The right half represents different GO terms or KEGG pathways with different colors. A gene is linked to a GO term or KEGG pathway by the colored bands.
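The FDR cutoff used in this analysis depends on Benjamini-Hochberg adjustment of the enrichment p-values. The short Python sketch below shows the standard step-up adjustment on invented p-values. It is not DAVID's implementation, and the function name `benjamini_hochberg` and the input values are assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and a
    running minimum taken from the largest rank downwards keeps the
    adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_terms(terms, pvalues, cutoff=0.05):
    """Keep the terms whose adjusted p-value is below the cutoff."""
    adjusted = benjamini_hochberg(pvalues)
    return [t for t, q in zip(terms, adjusted) if q < cutoff]


if __name__ == "__main__":
    # Invented terms and p-values, for illustration only.
    terms = ["term_a", "term_b", "term_c", "term_d"]
    pvals = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(pvals))
    print(significant_terms(terms, pvals))
```

With four tests, for example, the smallest p-value is multiplied by 4 and the largest by 1, which is why a term can pass an unadjusted threshold yet fail the FDR threshold.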
Identification of Significant Pairs of Similar ADs

To identify more reliable pairs of similar ADs, we integrated the three measurements to compute integrated similarity scores between the 26 ADs (see Materials and Methods). All AD pairs were sorted in descending order according to their integrated similarity scores, and the top ten AD pairs were extracted for further analysis (Table 2). These ten AD pairs contained three of the potential pairs of similar ADs: RA and SLE, myasthenia gravis (MG) and autoimmune thyroiditis (AIT), and autoimmune polyendocrinopathies (AP) and uveomeningoencephalitic syndrome (Vogt-Koyanagi-Harada syndrome, VKH). These three pairs were defined as the significant pairs of similar ADs.

TABLE 2. Top ten pairs of ADs ranked by integrated similarity scores.

Rank | Autoimmune disease | Autoimmune disease | Integrated similarity score
1 | Polyendocrinopathies, Autoimmune | Uveomeningoencephalitic Syndrome | 0.704362256
2 | Thyroiditis, Autoimmune | Graves Disease | 0.605323256
3 | Myasthenia Gravis | Thyroiditis, Autoimmune | 0.540495323
4 | Addison Disease | Hepatitis, Autoimmune | 0.53677382
5 | Uveomeningoencephalitic Syndrome | Addison Disease | 0.534304723
6 | Arthritis, Rheumatoid | Lupus Erythematosus, Systemic | 0.527496589
7 | Pemphigoid, Bullous | Polyendocrinopathies, Autoimmune | 0.52566119
8 | Hepatitis, Autoimmune | Myasthenia Gravis | 0.507462033
9 | Anemia, Hemolytic, Autoimmune | Purpura, Thrombocytopenic, Idiopathic | 0.499501
10 | Guillain-Barre Syndrome | Still’s Disease, Adult-Onset | 0.499448795

Functional Analysis of Related Genes of Significant Pairs of Similar ADs

To reveal the underlying mechanisms shared by two similar ADs, we performed functional enrichment analysis of the related genes of RA and SLE, MG and AIT, and AP and VKH. The cutoff criterion was an FDR less than 0.05. The related genes of RA and SLE were significantly enriched in GO terms mainly involved in immune response, inflammatory response, and IFNγ.
The significantly enriched pathways included RA, inflammatory bowel disease (IBD), and tuberculosis, among others (Figures 5A,B). The related genes of MG and AIT were mainly related to immune response (GO), antigen processing and presentation (GO), autoimmune thyroid disease (AITD) (KEGG), and allograft rejection (KEGG) (Figures 5C,D). Moreover, the related genes of AP and VKH were mainly associated with antigen processing and presentation in GO and with pathways such as viral myocarditis, Staphylococcus aureus infection, AITD, and intestinal immune network for IgA production (Figures 5E,F).

FIGURE 5. Functional enrichment analysis of related genes of RA and SLE (A-B), MG and AIT (C-D), and AP and VKH (E-F). Enriched functional terms are sorted in descending order according to their -log10(FDR), and the top ten significant GO terms in the BP category and KEGG pathways of each significant pair of similar ADs are used for further analysis.

Hierarchical Clustering Result of 26 ADs Based on Three Similarity Measurements

To determine whether there was high genetic similarity among multiple ADs, we applied hierarchical clustering to the integrated similarity matrix of the 26 ADs. Disease clusters consisting of AD pairs with integrated similarity scores greater than 0.3 were considered significant. As shown in Figure 6A, three significant disease groups were identified from the 26 ADs, and the ADs of each disease group might have high genetic similarity. The three significant pairs of similar ADs were located in three different clusters. In cluster one, bullous pemphigoid might be similar to AP and VKH. The ADs of cluster one are involved in endocrine autoimmunity. For instance, bullous pemphigoid has been shown to be related to immunodysregulation polyendocrinopathy enteropathy X-linked syndrome (Mcginness et al., 2006).
In cluster two, pemphigus, Graves disease, Sjogren’s syndrome, Addison disease, and autoimmune hepatitis might be similar to MG and AIT. The ADs of cluster two contain the main AITDs (AIT and Graves disease) and ADs frequently involved in PolyA (AIT, Graves disease, and Sjogren’s syndrome) (Amador-Patarroyo et al., 2012; Botello et al., 2020). In cluster three, T1D and MS might be similar to RA and SLE. The ADs of cluster three are all chronic inflammatory ADs and share multiple genetic susceptibility loci (Richard-Miceli and Criswell, 2012). We then performed pathway enrichment analysis of the related genes of the ADs in the three significant clusters. The related genes of the ADs in all three disease clusters were significantly enriched in the T1D and IBD pathways (Figures 6B–D). Therefore, we infer that T1D and IBD may participate in PolyA in various AD patients.

FIGURE 6. Cluster analysis for 26 ADs and pathway enrichment analysis for the ADs of each significant cluster. (A) The clustering heatmap illustrating the classification of 26 ADs based on Euclidean distances and integrated similarity scores between the 26 ADs. Similarity values greater than 0.3 are marked in the matrix. (B) The top ten significant KEGG pathways of related genes of ADs of cluster one ranked by -log10(FDR). (C) The top ten significant KEGG pathways of related genes of ADs of cluster two ranked by -log10(FDR). (D) The top ten significant KEGG pathways of related genes of ADs of cluster three ranked by -log10(FDR).

Next, we detected the causal genes shared by the ADs belonging to the same significant disease cluster; these genes could be used for PolyA research. As shown in Table 3, the ADs of cluster one shared one gene (HLA-DQA1); the ADs of cluster two shared two genes (HLA-DQB1 and HLA-DRB1); and the ADs of cluster three shared eight genes (TNF, HLA-DRB1, PDCD1, PTPN22, CCR5, IL6, HLA-DQB1, and CTLA4).
Identification of these genes will contribute to the discovery of novel prognostic, diagnostic, and therapeutic markers and to the justification of drug repurposing for ADs.

TABLE 3. The shared causal genes among the ADs which belong to the same significant disease cluster.

Disease cluster | Shared genes
Cluster one | HLA-DQA1
Cluster two | HLA-DQB1, HLA-DRB1
Cluster three | TNF, HLA-DRB1, PDCD1, PTPN22, CCR5, IL6, HLA-DQB1, CTLA4

Italics refer to gene symbols (gene names).

Discussion

In recent years, numerous studies have confirmed that different ADs are similar in various respects. Nevertheless, these studies focused on only a few ADs and lacked a comprehensive analysis of the similarity between ADs from a genetic perspective. To date, various disease similarity methods have been developed (Dozmorov, 2019). In this study, we calculated the similarity scores between 26 ADs by means of three similarity measurements. To ensure the accuracy of the subsequent analysis, we combined the results of NetSim, FunSim, and SemSim to evaluate all the AD pairs. We found ten potential pairs of similar ADs and used them to build a network giving an overview of AD-AD and AD-gene relationships, which provided essential clues for understanding the mechanisms shared by multiple ADs. Based on the AD pairs in this network, we detected three significant pairs of similar ADs (RA and SLE, MG and AIT, and AP and VKH), and then investigated the shared functional terms for each significant AD pair. We also applied cluster analysis to the integrated similarity matrix of the 26 ADs to find other ADs that were genetically similar to the significant AD pairs, and identified the risk genes shared within each disease cluster. These results still need to be verified by further studies, but we hope that our observations can help researchers to dissect the complex pathogenesis of ADs.
In the functional enrichment analysis of the 35 AD relevant genes, we mainly focused on GO terms involved in immune response, antigen processing and presentation, and IFNγ. The immune response is how the immune system defends against foreign invaders, such as bacteria or viruses (Chaplin, 2010). ADs are triggered by an aberrant immune response that damages healthy body parts and is influenced by a large number of genes (Hill et al., 2008; Gregersen and Olsson, 2009). We concluded that the 35 genes might trigger a variety of ADs. In addition, dysfunction of antigen processing and presentation might influence the emergence of ADs (Ritz and Seliger, 2001). In the human body, antigens are processed into peptides of a certain length in association with major histocompatibility complex (MHC) molecules. T cells recognize these fragmented peptides bound to the MHC and initiate immune responses (Purcell et al., 2016; Kelly and Trowsdale, 2019; Kotsias et al., 2019). As different ADs share the characteristic that risk is conferred by genes encoded within the MHC locus, antigen presentation generally seems to be crucial in ADs (Riedhammer and Weissert, 2015). For example, processing and presentation of self-antigens by different antigen presenting cells may result in MS (Stoeckle and Tolosa, 2010). Finally, IFNγ is a pleiotropic cytokine secreted by immune cells that plays a critical role in innate and adaptive immunity (Tau and Rothman, 1999; Schoenborn and Wilson, 2007). Abnormal IFNγ expression is correlated with a considerable number of ADs. Although IFNγ can mediate clearance of pathogenic insults, chronic exposure to IFNγ is thought to cause many ADs, such as RA and SLE (Nielen et al., 2004; Lu et al., 2016). The complex role of IFNγ in ADs also has important therapeutic implications.
The above evidence demonstrates that these three functional aspects play important roles in AD-related mechanisms.

With regard to the three significant pairs of similar ADs, several studies have confirmed these similarity relationships. For example, Wang et al. (2020) found that familial RA, SLE, and primary Sjögren’s syndrome shared common genetic characteristics, and that genetic variations in T cell receptor signaling pathway genes might become novel molecular targets for therapeutic interventions for the three ADs. Liu et al. (2019) found that the T cell receptor could become a promising diagnostic marker for RA and SLE. In addition, previous studies have confirmed that MG and AIT are similar in many respects (Marino et al., 1997; Lopomo and Berrih-Aknin, 2017), and that AIT frequently accompanies MG (Mao et al., 2011; Kubiszewska et al., 2016). Both diseases are organ-specific ADs with a clear pathogenic effect of antibodies. Moreover, MG and AIT share the same predisposing genes (such as PTPN22, CTLA4, and HLA) and pathological mechanisms (such as T-cell immune-mediated mechanisms). Thus, we infer that research on the genetic similarity of ADs can help to explain the similar phenotypic and clinical features of ADs. These reports are consistent with our current results. Experimental studies on these AD pairs are urgently needed to provide information on their intrinsic mechanisms, and further validation of these disease relationships in clinical trials would help translate them into clinical practice.

In addition, the pathway enrichment analysis of the related genes of the significant pairs of similar ADs pointed to possible PolyA. For example, the related genes of RA and SLE were enriched in the pathways of IBD, T1D, and AITD. It was reported that AIT frequently coexisted with RA and SLE (Ordonez-Canizares et al., 2020).
Another study showed that AITD, RA, SLE, and IBD were observed in Sjögren’s syndrome patients with PolyA (Amador-Patarroyo et al., 2012). The related genes of MG and AIT were enriched in the pathways of T1D, IBD, and RA. A previous study found that latent and overt PolyA in patients with AITD was associated with gastrointestinal, endocrinological, rheumatological, dermatological, and neurological ADs (Botello et al., 2020). PolyA is not uncommon, and multiple ADs that coexist in a single patient may share the same etiopathogenesis. Some genetic studies on ADs ignored the coexistence of other autoimmune conditions by using anachronistic nomenclature (i.e., primary or secondary ADs) (Rojas-Villarraga et al., 2012). We hope that researchers will take PolyA into account and consider whether patients have latent or overt PolyA in AD studies.

With regard to the result of the cluster analysis of the 26 ADs, a number of reports support our findings. For example, for disease cluster two, a recent study found that chemokines were associated with the early phases of the autoimmune response in AIT, Graves disease, and Addison disease (Fallahi et al., 2020). For disease cluster three, another study found major common gene expression changes in the target tissues of T1D, MS, RA, and SLE (Szymczak et al., 2021).

This study predicted AD pairs and clusters with high genetic similarity, as well as potential risk genes, biological processes, and pathways involved in multiple types of ADs. Although the two diseases in an AD pair with a high similarity score may have different phenotypic or clinical features, they are likely to elicit autoimmune responses in the human body in similar or identical ways. Consequently, we reason that genetically similar ADs might be treated with similar therapeutics and drugs. We hope that these findings can aid in elucidating AD mechanisms and provide further references for researchers.