Abstract

Ulcerative colitis (UC) and rheumatoid arthritis (RA) are immune-mediated inflammatory diseases (IMIDs) with similar symptoms and shared genetic features. However, the relationship between UC and RA has not been investigated thoroughly. Therefore, this study aimed to establish the differentially expressed genes (DEGs) and potential therapeutic targets in UC and RA. Three microarray datasets (GSE38713, GSE1919, and GSE12251) were selected from the Gene Expression Omnibus (GEO) database for analysis. We used R software to identify the DEGs and performed enrichment analyses. Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and Cytoscape software were used to construct the protein-protein interaction (PPI) network and identify the hub genes. A regulatory network based on the constructed PPI was generated using StarBase and PROMO databases. We identified a total of 1542 and 260 DEGs in UC and RA, respectively. There were 72 common DEGs identified in both UC and RA, including 63 upregulated genes (DEGs1) and nine downregulated genes (DEGs2). The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses of DEGs1 and DEGs2 in the PPI network revealed that the enriched genes were involved in immunity. A total of 30 hub genes were selected based on a degree of ≥10 in the PPI network; three hub genes (SRGN, PLEK, and FCGR3B) were found to be upregulated in UC and RA, and downregulated in UC patients who responded to infliximab treatment. The DEGs and hub genes identified in the current study offer new insight into the underlying functional mechanisms and present potential prognostic indicators and therapeutic targets in UC and RA.

Keywords: bioinformatical analysis, hub genes, ulcerative colitis, rheumatoid arthritis, differentially expressed genes

Introduction

Ulcerative colitis (UC) is a chronic inflammatory disease that mainly involves the colon.
The incidence and prevalence of UC have increased worldwide, thus placing a significant burden on human society (Ng et al., 2018). The intestinal symptoms that accompany UC include bloody diarrhea, and a third of patients with UC present with extraintestinal manifestations. Among these manifestations, arthritis is the most commonly identified (Ungaro et al., 2017). Rheumatoid arthritis (RA) is an autoimmune disease that is characterized by inflammation, stiffness of joints accompanied by pain, loss of mobility, and joint deformity, and its incidence has increased substantially in the past 30 years (Safiri et al., 2019). Studies report that patients with UC have an increased risk of RA (Wilson et al., 2016; Bae et al., 2017; Halling et al., 2017). UC and RA are immune-mediated inflammatory diseases (IMIDs); hence, they likely share similar pathogenesis, genes, and antigens. Previous studies have revealed several common genes associated with both UC and RA, including the human leukocyte antigen (HLA-B27), interleukin 15, peptidyl arginine deiminase type 4 (PADI4), and prostaglandin receptor EP4 (PTGER4) (Klausen et al., 1992; Mosquera-Martinez, 2001; Chen et al., 2008; Perdigones et al., 2010). UC and RA also share some common drugs for their treatment. TNF-α antagonists, such as infliximab, have been approved as first- or second-line treatment of patients with UC and RA (Rubin et al., 2019; Smolen et al., 2020). Despite extensive research on UC and RA, there is still a gap in understanding differentially expressed genes (DEGs) and possible targets for the treatment of UC and RA. Our study aimed to determine DEGs and possible targets for the treatment of UC and RA through bioinformatical analysis. In this study, we analyzed three gene expression datasets (GSE38713, GSE1919, and GSE12251) downloaded from the Gene Expression Omnibus (GEO) database.
Comprehensive bioinformatics and enrichment analyses were used to determine independent DEGs and differentially coexpressed genes (DCGs). We constructed a protein-protein interaction (PPI) network to identify hub genes using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database and Cytoscape software (version 3.7.2). Moreover, we identified three potential therapeutic target genes related to UC and RA and constructed their regulatory network using the StarBase and PROMO databases. This network comprises the microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and transcription factors (TFs) associated with these genes. The potential therapeutic targets shared by UC and RA identified here are expected to provide novel insights into the biological mechanisms linked with these two diseases.

Materials and Methods

Data Source

GEO^1 is a public repository containing high-throughput sequencing and microarray datasets. We selected three gene expression microarray datasets (GSE38713, GSE1919, and GSE12251) from the GEO database. The GSE38713 and GSE12251 datasets were generated on the GPL570 platform (HG-U133_Plus_2; Affymetrix Human Genome U133 Plus 2.0 Array), while GSE1919 was generated on the GPL91 platform (HG_U95A; Affymetrix Human Genome U95A Array).

Identification of DEGs

The R software (version 3.6.3)^2 and the limma package^3 in Bioconductor^4 were used to detect DEGs between UC samples, RA samples, or infliximab-response samples and their corresponding control groups (Ritchie et al., 2015). DEGs were identified using the selection criteria of adjusted P-value <0.05 and |logFC| >1.0. The intersections of the DEG lists were calculated using a Venn diagram webtool^5.

Gene Ontology and Pathway Enrichment Analysis of DEGs

Gene Ontology (GO) is a universal tool for defining the biological process (BP), cellular component (CC), and molecular function (MF) of numerous genes.
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway is a database that contains multiple biological pathways for several organisms. GO and pathway analyses provide insight into the functions and pathways in which these genes participate and into their primary roles. The enrichment analyses of DEGs were performed using the clusterProfiler package^6 in Bioconductor, and a P-value less than 0.05 was considered statistically significant (Yu et al., 2012).

Protein-Protein Interaction Network Construction and Module Analysis

A protein-protein interaction (PPI) network reveals specific and non-specific interactions among proteins and helps identify core genes. STRING (version 11.0)^7 is a freely accessible database that collects, scores, and integrates data and is used to predict functional relationships between proteins (Szklarczyk et al., 2019). Interactions among the DEGs with a combined score >0.4 in STRING were considered functional links, and the resulting PPI network was constructed using the Cytoscape software (version 3.7.2)^8 (Smoot et al., 2011; Wang et al., 2016a, 2018; Li et al., 2017; Zhao et al., 2019). Subsequently, we used the MCODE plugin to identify densely connected modules from the PPI network with the criteria of K-core = 2, degree cutoff = 2, max depth = 100, and node score cutoff = 0.2 (Bandettini et al., 2012).

Selection and Analysis of Hub Genes

The degree of each protein node was calculated using the Cytoscape plugin CytoHubba to find hub genes (Chin et al., 2014). In this study, genes with a degree ≥10 were selected as hub genes. Subsequently, we used the corrplot package in R software to calculate the correlation between hub genes based on Pearson correlation analysis.

Construction of Regulatory Network

The network of genes and their corresponding miRNAs and lncRNAs was constructed using StarBase^9, a publicly available database that mainly focuses on miRNA-target interactions (Li et al., 2014).
The transcription factors (TFs) of genes were downloaded from PROMO^10, a public database for predicting the TFs of various genes through DNA sequences (Farre et al., 2003). The above-mentioned tools were combined to construct a multi-factor regulation network.

Results

Identification of DEGs

Three gene expression datasets (GSE38713, GSE1919, and GSE12251) were selected in this study (Figure 1). The GSE38713 dataset was derived from 15 UC tissue samples and 10 control samples. The GSE1919 dataset was derived from 5 RA samples and 5 control samples. The GSE12251 dataset was derived from 22 patients with UC, of which 12 responded to and 10 did not respond to infliximab treatment. We used the limma package to identify the DEGs in the three datasets with adjusted P <0.05 and |logFC| >1. There were 1542 DEGs in the GSE38713 dataset, including 978 upregulated genes and 564 downregulated genes (Supplementary Table 1). There were 260 DEGs in the GSE1919 dataset, including 134 upregulated and 126 downregulated genes (Supplementary Table 2). In the GSE12251 dataset, 68 downregulated genes were identified (Supplementary Table 3). A Venn diagram was generated to show the overlap between the GSE38713 and GSE1919 datasets; the overlap comprised 63 upregulated genes (DEGs1) (Figure 2A) and 9 downregulated genes (DEGs2) (Figure 2B). Furthermore, 1470 DEGs were found only in UC (DEGs3) and 188 DEGs were found only in RA (DEGs4).

FIGURE 1. Flow diagram of the study design.

FIGURE 2. Differentially expressed genes (DEGs) in UC and RA. The 63 upregulated DEGs (A) and 9 downregulated DEGs (B) shared by UC and RA. GO analyses of the DEGs found only in UC (C) and only in RA (D), with adjusted P-values.
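The thresholding and Venn-style overlap described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they used limma in R and a Venn webtool); the `GeneStat` record, the function names, and the toy genes `G1` to `G6` with their values are hypothetical and serve only to show the logic. The last lines check that the counts reported in the text are arithmetically consistent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneStat:
    """One row of a differential-expression table (hypothetical structure)."""
    gene: str
    log_fc: float
    adj_p: float


def split_degs(stats, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (upregulated, downregulated) gene sets using the criteria
    adjusted P-value < p_cutoff and |logFC| > lfc_cutoff."""
    up, down = set(), set()
    for s in stats:
        if s.adj_p < p_cutoff and abs(s.log_fc) > lfc_cutoff:
            (up if s.log_fc > 0 else down).add(s.gene)
    return up, down


def partition(up_a, down_a, up_b, down_b):
    """Split two DEG lists into shared-up, shared-down, and list-specific sets.
    A gene regulated in opposite directions is not counted as shared."""
    common_up = up_a & up_b
    common_down = down_a & down_b
    common = common_up | common_down
    only_a = (up_a | down_a) - common
    only_b = (up_b | down_b) - common
    return common_up, common_down, only_a, only_b


# Toy input: invented values, not taken from GSE38713 or GSE1919.
uc_stats = [
    GeneStat("G1", 2.3, 0.001),
    GeneStat("G2", -1.8, 0.01),
    GeneStat("G3", 1.5, 0.2),    # fails the adjusted P-value cutoff
    GeneStat("G4", 0.6, 0.001),  # fails the |logFC| cutoff
    GeneStat("G5", 1.2, 0.03),
]
ra_stats = [
    GeneStat("G1", 1.4, 0.02),
    GeneStat("G2", -1.1, 0.04),
    GeneStat("G5", 0.9, 0.01),   # fails the |logFC| cutoff
    GeneStat("G6", -2.0, 0.001),
]

uc_up, uc_down = split_degs(uc_stats)
ra_up, ra_down = split_degs(ra_stats)
common_up, common_down, only_uc, only_ra = partition(uc_up, uc_down, ra_up, ra_down)

# Consistency check on the counts reported in the text.
shared_reported = 63 + 9
independent_uc = (978 + 564) - shared_reported
independent_ra = (134 + 126) - shared_reported
```

With the reported counts, `independent_uc` evaluates to 1470 and `independent_ra` to 188, matching DEGs3 and DEGs4.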
GO Enrichment Analyses of Independent DEGs in UC and RA

GO analysis of GSE38713 indicated that the DEGs3 in UC were mainly involved in leukocyte migration, humoral immune response, and regulation of the inflammatory response under BP. The analysis also indicated that these DEGs were mainly involved in collagen-containing extracellular matrix, external side of plasma membrane, and immunoglobulin complex under CC. Likewise, the terms antigen binding, extracellular matrix structural constituent, and immunoglobulin receptor binding were enriched under MF (Figure 2C). The GO analysis of DEGs4 in the GSE1919 dataset showed that the terms response to antibiotic, response to glucocorticoid, and response to corticosteroid were mainly enriched under BP. The terms enriched under CC were external side of plasma membrane, contractile fiber part, and clathrin-coated vesicle membrane. Moreover, the terms enriched under MF were DNA-binding transcription repressor activity, coreceptor activity, and virus receptor activity (Table 1 and Figure 2D).

TABLE 1. The GO enrichment analysis of DEGs3 and DEGs4 (top 3 terms according to p.adjust).
DEGs   Ontology  ID          Description                                   Count  p.adjust
DEGs3  BP        GO:0050900  Leukocyte migration                           138    3.44E-41
DEGs3  BP        GO:0006959  Humoral immune response                       110    2.09E-37
DEGs3  BP        GO:0050727  Regulation of inflammatory response           120    1.06E-30
DEGs3  CC        GO:0062023  Collagen-containing extracellular matrix      109    2.07E-32
DEGs3  CC        GO:0009897  External side of plasma membrane              98     1.92E-26
DEGs3  CC        GO:0019814  Immunoglobulin complex                        53     3.78E-20
DEGs3  MF        GO:0003823  Antigen binding                               57     1.83E-21
DEGs3  MF        GO:0005201  Extracellular matrix structural constituent   47     1.61E-13
DEGs3  MF        GO:0034987  Immunoglobulin receptor binding               31     2.48E-13
DEGs4  BP        GO:0046677  Response to antibiotic                        14     0.000109
DEGs4  BP        GO:0051384  Response to glucocorticoid                    10     0.000109
DEGs4  BP        GO:0031960  Response to corticosteroid                    10     0.000193
DEGs4  CC        GO:0009897  External side of plasma membrane              16     1.88E-06
DEGs4  CC        GO:0044449  Contractile fiber part                        8      0.008248
DEGs4  CC        GO:0030665  Clathrin-coated vesicle membrane              6      0.008248
DEGs4  MF        GO:0001227  DNA-binding transcription repressor activity  8      0.000361
DEGs4  MF        GO:0015026  Coreceptor activity                           5      0.005158
DEGs4  MF        GO:0001618  Virus receptor activity                       5      0.020157

Protein-Protein Interaction Network Construction and Module Analysis

Protein-protein interaction network analysis is a useful method for understanding biological responses in health and disease. In this study, protein interactions among the DEGs1 and DEGs2 were analyzed using the STRING database. A total of 69 nodes and 251 edges with combined scores >0.4 were included and visualized using Cytoscape software (Figure 3). The MCODE plugin identified five densely connected modules containing 39 of the genes in DEGs1 and DEGs2 (Figure 4A). KEGG and GO enrichment analyses of these 39 genes were carried out using the clusterProfiler package. GO analysis revealed that these genes are involved in immunity (Figure 4B), and KEGG pathway analysis revealed them to be mainly involved in viral myocarditis, leishmaniasis, and allograft rejection (Figure 4C).

FIGURE 3.
PPI network of the DEGs1 and DEGs2, constructed using the STRING database and Cytoscape software. Red points represent upregulated genes, and blue points represent downregulated genes.

FIGURE 4. Modular analysis of the PPI network identified five key modules (A). GO (B) and KEGG (C) enrichment analyses of the 39 genes in these modules, with P-values.

Hub Gene Selection and Analysis

Using a criterion of degree ≥10 with the CytoHubba plugin, we identified a total of 30 hub genes. The scores of the hub genes are presented in Figure 5A. The correlation between these 30 hub genes was investigated using the corrplot package, and Pearson scores >0.95 were taken to indicate a strong correlation between hub genes. The correlation between 29 pairs of hub genes was considered significant (Figure 5B and Supplementary Table 4).

FIGURE 5. Hub gene selection and analysis performed with CytoHubba. The score of hub genes was based on the EPC algorithm (A). Pearson correlation analysis was used to calculate the correlation of hub genes (B).

Hub Genes in UC Response to Infliximab Treatment

We used the limma package to identify the DEGs in the GSE12251 dataset and found 68 downregulated genes. The overlap between the hub genes from GSE38713 and GSE1919 and the DEGs from GSE12251 comprised three protein-coding genes: SRGN (serglycin), PLEK (pleckstrin), and FCGR3B (Fc fragment of IgG receptor IIIb). All three genes were upregulated in UC and RA samples compared to those in the control samples in the GSE38713 and GSE1919 datasets. On the other hand, these genes were downregulated in UC patients who responded to infliximab treatment in the GSE12251 dataset. These results suggest that the three genes may play an important role during infliximab treatment of UC and RA.
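The hub-gene selection described above (node degree ≥10 in the PPI network, followed by pairwise Pearson correlation with a cutoff of 0.95) can be sketched in plain Python. This is only an illustration under stated assumptions: the authors used CytoHubba and the corrplot package in R, the text does not specify which values were correlated (per-sample expression values are assumed here), and the edge list, node names, and expression vectors below are invented.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def node_degrees(edges):
    """Count the degree of each node in an undirected network,
    ignoring duplicate edges and self-loops."""
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degrees = Counter()
    for pair in unique:
        for node in pair:
            degrees[node] += 1
    return degrees


def hub_genes(edges, min_degree=10):
    """Return the nodes whose degree is at least min_degree."""
    return {g for g, d in node_degrees(edges).items() if d >= min_degree}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def strongly_correlated_pairs(expr, cutoff=0.95):
    """Return gene pairs whose Pearson correlation exceeds the cutoff."""
    return [
        (a, b)
        for a, b in combinations(sorted(expr), 2)
        if pearson(expr[a], expr[b]) > cutoff
    ]


# Toy network: one node linked to eleven neighbours, plus one extra edge.
toy_edges = [("HUB", f"N{i}") for i in range(11)] + [("N0", "N1")]
toy_hubs = hub_genes(toy_edges)

# Toy expression values across four samples (invented).
toy_expr = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.1],
    "C": [4.0, 1.0, 3.0, 2.0],
}
toy_pairs = strongly_correlated_pairs(toy_expr)
```

In the toy network only `HUB` reaches the degree cutoff, and only the pair `("A", "B")` exceeds the correlation cutoff. Intersecting a hub set like this with the genes downregulated in infliximab responders is then a plain set intersection.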
Multi-factor Regulation Network Construction

We used the StarBase and PROMO databases to predict the miRNAs, lncRNAs, and TFs of SRGN, PLEK, and FCGR3B, and found a total of 16 miRNAs, 40 lncRNAs, and 41 TFs. The data for these three genes and their miRNAs, lncRNAs, and TFs were integrated into a regulatory network and visualized using Cytoscape software (Figure 6).

FIGURE 6. Multi-factor regulation network of SRGN, PLEK, and FCGR3B, constructed using the StarBase and PROMO databases.

Discussion

There has been a considerable increase in the incidence and prevalence of UC and RA worldwide (Molodecky et al., 2012; Safiri et al., 2019). These diseases can lead to functional disabilities, severe decline in quality of life, and an increased risk of cancer (Jess et al., 2012; Simon et al., 2015; Myasoedova et al., 2019). Furthermore, UC has been reported to be concomitant with RA (Wilson et al., 2016). As IMIDs, UC and RA might have overlapping pathogenic pathways. Inflammatory and immune regulatory pathways, such as Fcγ receptor signaling, are linked to the pathogenesis of IMIDs (Castro-Dopico and Clatworthy, 2019; Virtanen et al., 2019). Additionally, gut microbiota has been reported to play a role in IMIDs (Liu et al., 2013; Forbes et al., 2018; Imhann et al., 2018). Treatment with TNF-α antagonists has been firmly established as an effective therapeutic approach for RA (Kievit et al., 2008); however, non-responsiveness to infliximab (a TNF-α antagonist) is common in patients with UC (Kievit et al., 2008; Wong and Cross, 2017). The main purpose of our study was to identify the common DEGs in UC and RA, thereby revealing potential targets for predicting the therapeutic effect of TNF-α antagonists and for treating UC and RA. In this study, we identified 72 overlapping DEGs in both UC and RA, of which 63 were upregulated (DEGs1) and 9 were downregulated (DEGs2).
Independent DEGs included 915 upregulated and 555 downregulated genes in UC (DEGs3), and 71 upregulated and 117 downregulated genes in RA (DEGs4). GO analysis revealed that DEGs3 were significantly enriched in inflammatory and immune pathways, which play a central role in the development of UC and RA, while DEGs4 were mainly enriched in drug responses. Enrichment analyses of the genes in the key modules of the constructed PPI network revealed that they were mainly enriched in immune-related pathways and cellular organization processes, such as leukocyte cell-cell adhesion, extracellular matrix organization, T cell activation, and leukocyte migration. The adhesion and migration of leukocytes, as well as the activation of T cells, have been linked to the pathogenesis of UC and RA (Thomas and Baumgart, 2012; Reynisdottir et al., 2016; McNaughton et al., 2018; Rabe et al., 2019). Finally, a total of 30 hub genes were identified, among which three hub genes (SRGN, PLEK, and FCGR3B) were found to be upregulated in UC and RA samples (GSE38713 and GSE1919) and downregulated in UC patients who responded to infliximab treatment (GSE12251). These findings suggest that these three genes might be predictive markers and therapeutic targets for UC and RA. SRGN encodes the proteoglycan serglycin and is mainly expressed in hematopoietic cells. Many studies have reported that SRGN promotes tumor invasion and metastasis in colorectal cancer, non-small cell lung cancers, multiple myeloma, nasopharyngeal carcinoma, and breast cancer (Li et al., 2011; Korpetinou et al., 2013; Purushothaman and Toole, 2014; Guo et al., 2017; Xu et al., 2018). SRGN is also involved in inflammatory processes through the regulation of numerous inflammatory mediators, such as TNF-α, and the activation of the NF-κB signaling pathway (Zernichow et al., 2006; Korpetinou et al., 2014; Scuruchi et al., 2019).
These processes, which are triggered by the binding of SRGN to the CD44 receptor, could promote inflammation (Misra et al., 2015). PLEK, a substrate of protein kinase C, is involved in various adaptive immune responses (Cremonesi et al., 2012). Although the underlying mechanisms of PLEK are still unclear, many studies have linked it to certain diseases. PLEK might be a susceptibility locus for venous thromboembolism, and its expression is increased in UC, periodontitis, and celiac disease (Song et al., 2015; Pascual et al., 2016; Lindstrom et al., 2019; Medrano et al., 2019). In diabetes, PLEK has been reported to promote the secretion of proinflammatory cytokines such as TNF-α and IL-1β in mononuclear phagocytes; these cytokines have already been linked to an increased risk of UC and RA (Ding et al., 2007; Hermanns et al., 2016). We speculate that SRGN and PLEK are involved in the pathogenesis of UC and RA through an increase in inflammatory factors. Many studies have shown that the copy number variation (CNV) of FCGR3B is linked to autoimmune and inflammatory diseases. Low copy number and the deletion of FCGR3B increase the risk of RA (Tsang-A-Sjoe et al., 2016; Wang et al., 2016b; Rahbari et al., 2017; Zheng et al., 2017), and FCGR3B gene copy number has also been suggested to increase susceptibility to UC, which indicates that FCGR3B might be a key gene involved in the pathogenesis of both diseases (Asano et al., 2013). Although the specific mechanisms of action of FcγRIIIb in IMIDs remain unclear, studies have revealed that FcγRIIIb is a stimulatory Fc gamma receptor that promotes neutrophil recruitment and the capture and clearance of immune complexes (ICs); the deletion of FCGR3B may lead to immune-complex-mediated diseases (Dijstelbloem et al., 2001; Willcocks et al., 2008; Chen et al., 2012).
We speculate that FCGR3B is involved in the inflammatory processes in IgG-IC-FcγR signaling (Mathsson et al., 2006; Uo et al., 2013; Bersellini Farinotti et al., 2019). The limitations of our study are as follows. Our study is a retrospective analysis with a small sample size. Hence, our findings need to be validated using a larger cohort and prospective studies. We did not assess the potential therapeutic roles of SRGN, PLEK, and FCGR3B in UC and RA; therefore, further clinical research is needed to investigate whether they could be used as predictive factors for infliximab efficacy in patients with UC and RA. Finally, we did not explore the specific mechanisms of these three genes in UC and RA, which warrants further studies.

Conclusion

In conclusion, our study identified 72 DEGs and 30 hub genes common to both UC and RA. GO and KEGG analyses of the independent DEGs and the hub genes in UC and RA might reveal a novel prospective relationship between UC and RA. In addition, we found three hub genes (SRGN, PLEK, and FCGR3B) that were significantly associated with infliximab treatment in UC. These genes need to be explored further for their clinical relevance as potential prognostic indicators and therapeutic targets in UC and RA.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Author Contributions

YC and HL collected the papers, analyzed the data and conclusions, and drafted the manuscript. LL reviewed the data and conclusions. JS presented the idea of this manuscript, supported the funding, analyzed the conclusions, and drafted and revised the manuscript. All authors contributed to the article and approved the submitted version.
Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

BP, biological process; CC, cellular component; CNV, copy number variation; DEGs, differentially expressed genes; FCGR3B, Fc fragment of IgG receptor IIIb; ICs, immune complexes; lncRNAs, long non-coding RNAs; IMIDs, immune-mediated inflammatory diseases; MF, molecular function; miRNAs, microRNAs; NFkB, nuclear transcription factor kappaB; OS, overall survival; PPI, protein-protein interaction; PLEK, pleckstrin; RA, rheumatoid arthritis; SRGN, serglycin; TF, transcription factors; UC, ulcerative colitis.

Funding. Supported by grants from the National Natural Science Foundation of China (No. 81770545 and 81701746) and MDT Project of Clinical Research Innovation Foundation, Renji Hospital, School of Medicine, and Shanghai Jiao Tong University (PYI-17-003).

^1 http://www.ncbi.nlm.nih.gov/geo/
^2 https://www.r-project.org/
^3 http://www.bioconductor.org/packages/release/bioc/html/limma.html
^4 http://www.bioconductor.org/
^5 http://bioinfogp.cnb.csic.es/tools/venny/index.html
^6 http://www.bioconductor.org/packages/release/bioc/html/clusterProfiler.html
^7 http://string-db.org
^8 http://www.cytoscape.org/
^9 http://starbase.sysu.edu.cn/index.php
^10 http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2020.572194/full#supplementary-material

Supplementary Table 1 Differentially expressed genes in GSE38713.
Supplementary Table 2 Differentially expressed genes in GSE1919.
Supplementary Table 3 Differentially expressed genes in GSE12251.
Supplementary Table 4 Correlations between hub genes.

References