Abstract Objective: To perform systematic transcriptomic analysis of multiple sclerosis (MS) risk genes in peripheral blood mononuclear cells (PBMCs) of subjects with distinct MS stages and describe the pathways characterized by dysregulated gene expressions. Methods: We monitored gene expression levels in PBMCs from 3 independent cohorts for a total of 297 cases (including clinically isolated syndromes (CIS), relapsing-remitting MS, primary and secondary progressive MS) and 96 healthy controls by distinct microarray platforms and quantitative PCR. Differential expression and pathway analyses for distinct MS stages were defined and validated by literature mining. Results: Genes located in the vicinity of MS risk variants displayed altered expression in peripheral blood at distinct stages of MS compared with the healthy population. The frequency of dysregulation was significantly higher than expected in CIS and progressive forms of MS. Pathway analysis for each MS stage–specific gene list showed that dysregulated genes contributed to pathogenic processes with scientific evidence in MS. Conclusions: Systematic gene expression analysis in PBMCs highlighted selective dysregulation of MS susceptibility genes playing a role in novel and well-known pathogenic pathways. __________________________________________________________________ MS is an inflammatory demyelinating disease of the CNS with evidence of immune dysfunction.^[59]1 The first clinical episode with features suggestive of MS is classified as clinically isolated syndrome (CIS), unless lesion dissemination in time and space ratifying MS diagnosis is evidenced.^[60]2 Approximately, 85% of individuals develop the relapsing-remitting form of MS (RR-MS), whereas 15% of MS individuals experience a progressive course (primary progressive MS, PP-MS). After a variable time, most RR-MS subjects advance to a secondary progressive (SP) phase (SP-MS), where neurologic worsening takes place without periods of remission.^[61]3 Several MS risk variants have been uncovered by genome-wide association studies (GWAS), and many of them are located close to immunologically relevant genes,^[62]4 suggesting that immune dysfunction may be partly genetically determined. Although effective in highlighting MS risk alleles, GWAS have failed in dissecting the genetic components of the susceptibility to distinct MS clinical forms. In fact, MS is a multifactorial disorder determined by the complex interaction between genetic and environmental factors, whose integration occurs at the epigenetic level and determines gene expression. Our group has recently shown the importance of blood transcriptomics in uncovering gene expression changes and transcriptional regulators in MS.^[63]5 In this study, we have (1) examined the expression levels of known MS susceptibility genes in peripheral blood mononuclear cells (PBMCs) of CIS, RR-MS, PP-MS, SP-MS, and control individuals, (2) identified a panel of blood transcriptional signatures for distinct MS forms, and (3) explored the pathways contributed by the dysregulated susceptibility genes. METHODS Human subjects and blood sampling. Investigations were conducted according to the principles expressed in the Declaration of Helsinki, and peripheral blood was drawn after signing of the institutional informed consent. We recruited 142 patients with MS (46 CIS, 52 RR-MS, 23 PP-MS, and 21 SP-MS) and 40 healthy controls (HCs) of Italian origin for the generation of the main transcriptomic data set by Illumina platform. Demographic and clinical parameters are shown in [64]table 1. A second distinct cohort comprising 21 RR-MS, 15 PP-MS, 13 SP-MS, and 27 HCs was enrolled according to the same inclusion criteria and used for validation with a distinct array platform ([65]table 1). Finally, a third independent cohort comprising 31 CIS, 30 RR-MS, 24 PP-MS, 21 SP-MS, and 29 HCs was included and used for validation experiments by quantitative PCR (q-PCR) ([66]table 1). Patients with MS were diagnosed according to McDonald criteria^[67]6 and were not suffering from any other acute or chronic inflammatory diseases or other autoimmune disorders. Furthermore, they had not started any immunomodulatory therapy for MS yet, as recruitment was performed over the last 15 years, in a period when decision to treat was not established and widespread as nowadays. Blood sampling was performed between 30 and 90 days after the first clinical attack in patients with CIS, and at least 4 weeks after the last clinical attack or steroid treatment for RR-MS subjects. All participants had peripheral blood counts within the reference range. All blood samplings were performed between 9 and 12 am Table 1. Subjects demographics and clinical information graphic file with name NEURIMMINFL2016011197TT1.jpg [68]Open in a new tab PBMC isolation and RNA extraction. PBMCs were isolated using a discontinuous density gradient (Lymphoprep; Nycomed, Oslo, Norway). Viable cells were counted by Trypan blue (Sigma-Aldrich, Milan, Italy) exclusion. Total RNA was extracted using Tri Reagent (Ambion; Applied Biosystems, Monza, Italy) and stored at −80°C. Generation of gene expression data sets. RNA quality was checked using Bioanalyzer 2100 (Agilent, Milan, Italy). Complementary RNA (cRNA) synthesis was performed using the Illumina TotalPrep RNA Amplification Kit (Ambion) according to the manufacturer's protocol. Hybridization of the cRNA relative to the first casistics was performed on Illumina Human Ref-8 v2 arrays (Illumina, Son, Netherlands). GenomeStudio GX Software (Illumina, San Diego, CA) was used to extract the array raw data. All the raw data were background subtracted by NEC method and normalized using cubic spline normalization as implemented in the software. We performed batch correction using COMBAT to remove batch effects in the PBMC expression data.^[69]7 The second independent transcriptomic data set was generated by Affymetrix GeneChip Human Genome U133 Plus 2.0 Array. All the necessary preprocessing steps were performed using robust multiarray average algorithm present in the Affy package in Bioconductor and keeping genes with multiple probes as individual transcripts. The MIAMI compliant microarray data have been deposited in the EBI ArrayExpress database (Accession numbers E-MTAB-4890 and E-MTAB-5151). Probes with a mean intensity value lower than 100 in all experimental groups were filtered out. Furthermore, we applied Pearson correlation to identify and remove probes correlating with age in the healthy population of the Illumina study, where age ranged from 22 to 57 years. To identify differentially expressed genes (DEGs) between each clinical form and the healthy population, we performed a 2-sample t test with a p value threshold of 0.05. A detailed flow diagram of this study is depicted in [70]figure 1. To verify whether stage-specific transcriptomes were enriched in dysregulated risk genes, we measured the frequency of differential expression of GWAS genes and compared it to that of DEG in the global transcriptome by χ^2 test with Yates' correction in GraphPad. Figure 1. Flow chart of the study. [71]Figure 1 [72]Open in a new tab CIS = clinically isolated syndrome; DEG = differentially expressed gene; FDR = false discovery rate; GWAS = genome-wide association studies; HC = healthy control; MS = multiple sclerosis; PP-MS = primary progressive MS; RR-MS = relapsing-remitting MS; RT-PCR = real-time PCR; SP-MS = secondary progressive MS. Collection of GWAS genes and their mapping to gene expression data. Genes located in the vicinity of MS susceptibility loci were collected from the supplementary tables 3 and 8 published by the International Multiple Sclerosis Genetics Consortium^[73]4 and reporting 48 gene variants from previous GWAS studies^[74]8[75]–[76]10 and 110 non–major histocompatibility complex (MHC) susceptibility variants described in 2013.^[77]4 Figure e-1 at [78]Neurology.org/nn summarizes the complete list of GWAS genes considered in our study. To map MS susceptibility genes to the relative Illumina probe identifiers, all the filtered probes in the Illumina data set were submitted to DAVID Gene Accession Conversion Tool,^[79]11 using Illumina HumanRef-8 v2 as a background reference so to retrieve Entrez and gene symbol identifiers for each probe. The same probe list was submitted to BioMart ID,^[80]12 conversion package in R Bioconductor,^[81]13 as a further control. In this way, gene expression levels were retrieved for 182 Illumina probes corresponding to 146 GWAS genes ([82]figure 1). Similarly, GWAS genes were annotated to the corresponding Affymetrix probes using BioMart ID,^[83]12 conversion package in R Bioconductor,^[84]13 and gene expression levels were retrieved for 344 Affymetrix probes relative to 149 MS susceptibility genes. cDNA synthesis and real-time PCR. The purity and the quality of RNA were analyzed using NanoDrop8000 (Thermo-Scientific, Waltham, MA). Two-microgram aliquot of total RNA was reverse transcribed to cDNA using random primers and Superscript III reverse transcriptase (all from Invitrogen, Carlsbad, CA) following the manufacturer's protocol. Real-time (RT-PCR) was performed using TAQMan Universal Master Mix (Applied BioSystems). PSMB1 was used as housekeeping gene based on its low coefficient of variation in the Illumina transcriptomic data set. Transcript levels of the target gene CD86 were measured by the TAQMan assay corresponding to the gene sequence detected by the Illumina probe and expressed as percentage of the housekeeping gene. Normality of data distribution was assessed by Kolmogorov-Smirnov statistics, and nonparametric Mann-Whitney test was used to compare mean values. All p values were 2-sided and subjected to a significance level of 0.05. Pathway analysis and literature mining. Pathway enrichment analysis was performed starting from the distinct dysregulated expression profiles isolated in each clinical form of disease. MS specialty modules in MetaCore (Thomson Reuters, New York, NY) are manually curated pathways associated with MS pathology, which are then ranked based on statistical significance. We selected pathways passing the false discovery rate (FDR)-corrected p value threshold of 0.05 and containing 3 DEGs and at least 1 GWAS DEG. For literature mining, PubMed database was used to collect scientific evidence about the role of the identified biological pathways in MS pathogenesis by searching the name of the pathway and “Multiple Sclerosis” or “EAE.” RESULTS Transcriptomic analysis in MS. Genes located in the vicinity of MS susceptibility loci (hereon called as GWAS genes) were collected from the supplementary tables 3 and 8 published by the International Multiple Sclerosis Genetics Consortium^[85]4 (see also figure e-1). To examine the expression levels of the 162 GWAS genes, we generated high-throughput gene expression data from PBMC of CIS, RR-MS, PP-MS, SP-MS individuals, and HCs ([86]table 1) by Illumina arrays, recovered the gene symbols for GWAS genes, and mapped them to the relative Illumina probes. As shown in [87]figure 1, there were 72 probes for 68 expressed GWAS genes passing signal intensity filter. Next, we tested whether these probes were differentially expressed in any disease group compared with healthy controls. We observed 29 GWAS DEGs in CIS, 19 in RR-MS, 35 in PP-MS, and 35 in SP-MS ([88]table 2). Notably, 6 GWAS DEGs were concordantly dysregulated in all disease groups compared with HCs, and several other GWAS DEGs were common to at least 2 MS forms ([89]figure 2A). Among them, NDFIP1, AHI1, FCRL1, and STAT4 were concordantly regulated in CIS and RR-MS but not in progressive forms of disease, whereas 11 MS risk genes, including TYK2, IL7R, and TNFRSF1A, were differentially expressed in PP-MS and SP-MS but not in CIS and RR-MS subjects ([90]figure 2A). Finally, some GWAS transcripts were altered at single stages of disease, suggesting distinct pathogenic alterations in distinct disease courses ([91]figure 2A). To verify whether MS transcriptome was enriched in dysregulated GWAS genes, we measured the frequency of differential gene expression in the list of expressed GWAS genes and in the global transcriptome for each disease subtype ([92]figure 1, [93]table 2) and verified whether the frequency of GWAS DEG was significantly higher in the GWAS gene list than the expected frequency of dysregulation in a random selection of transcripts using χ^2 statistics. As shown in [94]table 2, we found a clear and significant enrichment of GWAS DEG in CIS, PP-MS, SP-MS, and not in RR-MS, demonstrating that the frequency of dysregulation in genes located in the vicinity of MS susceptibility variants is higher than expected in at least 3 distinct stages of disease. Then, we checked the expression of the identified GWAS DEG in PBMC of an independent cohort of subjects including 21 RR-MS, 15 PP-MS, 13 SP-MS, and 27 HCs, for which gene expression was generated using the Affymetrix array platform ([95]table 1). After applying the same statistical threshold of the Illumina data set, several dysregulated GWAS DEGs were identified for each MS form in the Affymetrix data set. In fact, a total of 26 GWAS DEGs were concordantly upregulated (or downregulated) compared with the healthy population in 2 independent transcriptomic data sets ([96]figure 2B). Finally, we used a third case-control cohort to validate the expression of CD86, one of the GWAS genes upregulated in all forms of disease in the Illumina and Affymetrix data sets, by q-PCR ([97]figures 2C and e-2). Table 2. Probability of GWAS DEG enrichment at distinct stages of MS graphic file with name NEURIMMINFL2016011197TT2.jpg [98]Open in a new tab Figure 2. Dysregulated MS risk genes and pathways at distinct stages of disease. [99]Figure 2 [100]Open in a new tab (A) MS susceptibility genes dysregulated at distinct MS stages compared with as measured by Illumina microarrays. (B) Dysregulated MS susceptibility genes validated in 2 distinct array platforms. For A and B, fold-change heatmaps represent upregulated (red), downregulated (green), and unchanged (gray) expressions in the disease group compared with the healthy population. (C) Real-time PCR validation of CD86 transcript in peripheral blood mononuclear cells of a third case-control cohort. Refer also to figure e-2 for data elaboration after removal of a few outliers (samples with expression levels above 500). *p < 0.05, ***p < 0.005. (D) Venn diagram representing number of common and unique enriched pathways for each disease stage. CIS = clinically isolated syndrome; HC = healthy control; MS = multiple sclerosis; PP-MS = primary progressive MS; RR-MS = relapsing-remitting MS; SP-MS = secondary progressive MS. Biological pathways contributed by MS GWAS genes. To verify whether the dysregulated GWAS genes were involved in specific biological pathways, we applied the MetaCore pathway analysis program which uses a curated annotation database of biological functions and identifies the significantly enriched pathways for a given DEG list. Critically considering the limited number of dysregulated GWAS genes for a pathway analysis, for each disease group, we enriched the list of GWAS DEG with those dysregulated genes which were not located in the vicinity of MS susceptibility variants, performed pathway analysis using the global DEG list as input, and selected those pathways passing FDR-corrected p value cut-off of 0.05 and containing 3 global DEGs and at least 1 GWAS DEG for each disease form ([101]figure 1). We found 165, 95, 184, and 223 significant pathways for CIS, RR-MS, PP-MS, and SP-MS respectively, with 65 pathways shared by all the disease courses and others common to at least 2 or 3 clinical forms ([102]figure 2D, table e-1). Of interest, the 65 shared pathways belonged to major biological themes relative to transcription and translation, cell signaling and proliferation, cancer, cytoskeleton remodeling and cell migration, apoptosis, immunity, and nervous system ([103]table 3). Functions related to immunity and cell signaling were predominant and included pathways regulating innate and adaptive immunity, and triggered by lipids, cytokines, and growth factors. An interactive link in the table e-1 opens the graphical maps representing the overall genes and the dysregulated MS transcripts involved in each pathway. It is important that when searching for scientific evidence about a role of the pathways in MS or its animal model by literature mining, we found it in most cases ([104]table 3). Thus, bioinformatical annotation of gene expression data led to the identification of a core of pathogenetic functions common to all MS stages and characterized by dysregulation in genes including MS susceptibility genes. Table 3. Shared pathways across MS stages [105]graphic file with name NEURIMMINFL2016011197TT3.jpg graphic file with name NEURIMMINFL2016011197TT3A.jpg [106]Open in a new tab DISCUSSION In this study, we provide evidence that genes located in the vicinity of MS risk variants display dysregulated expression in peripheral blood of subjects with distinct MS stages compared with healthy individuals, that the frequency of dysregulation is significantly higher than expected, and that those genes contribute to pathogenic pathways in MS. The most recent Immunochip study has highlighted more than 100 novel non-MHC risk variants in MS and has underlined that several genes close to the identified polymorphisms play a key role in immunity, thus suggesting genetic susceptibility to a dysimmune state.^[107]4 On the other hand, genetics contributes only to part of the risk of developing MS and has been unsuccessful in determining predisposition to distinct clinical courses of MS, indicating an important role for environmental factors. Thus, the potential effect of the identified GWAS genes during distinct stages of MS remains elusive and needs clarification. As a first step into this direction, we have verified whether the GWAS genes postulated by the Immunochip analysis have altered transcript expression in peripheral mononuclear cells in MS subjects starting from the early phase of disease (CIS) to the distinct clinical courses most of the patients experience. Here, we show that although a core of GWAS genes is concordantly dysregulated in all MS stages when compared with the healthy population, each disease group is characterized by a distinct set of dysregulated GWAS genes, clearly indicating that alteration in the expression of MS risk genes is not a stable feature reproduced throughout the distinct phases and forms of disease, but is a selective event for each disease stage. Notably, 26 MS susceptibility genes were concordantly regulated in 2 independent cohorts measured with distinct array platforms which use different probes to detect gene expression, emphasizing the robustness of several dysregulations. A GWAS DEG in our study was CD86 (B7-2), a costimulatory molecule expressed by a wide range of leukocytes^[108]14 and critically involved in T-lymphocyte activation and proliferation.^[109]15,[110]16 Monocytes of patients with RR-MS display higher expression of CD86 compared with healthy controls and interferon β treatment triggers reversal regulation of CD86.^[111]17 Here, we demonstrate the overexpression of CD86 transcripts in PBMC at all the stages of MS compared with HCs in 2 independent microarray data sets and in a third independent cohort using q-PCR. Further, we report that the expression of JDP2, MAF, MAPK3, and RGS1 is upregulated in MS. Of interest, RGS1 is also upregulated by pathogenic Th17 T cells during experimental autoimmune encephalomyelitis (EAE).^[112]18 Conversely, BACH2, a critical transcriptional factor regulating the balance between tolerance and immunity by repressing CD4^+ T-cell differentiation,^[113]19 is downregulated in pathogenic T cells in EAE mice^[114]18 and in the blood of RR-MS subjects.^[115]20 Our study confirms this observation and extends it to the other MS stages. Very limited information is present in the literature about the expression of other GWAS genes in MS blood cells. A recent report has highlighted the downregulation of the transcription factor EOMES in whole blood of CIS and a pooled group of MS subjects with various disease courses compared with HCs by both RNA-seq and RT-PCR analysis.^[116]21 We confirm the low transcript levels in CIS but not in the distinct clinical forms when analyzed separately. Furthermore, our study dissects the previously reported differential regulation of IKZF3, FOXP1, ZNF438 in CIS, and all MS clinical forms^[117]21 and reproduces IL7R downregulation in PP-MS.^[118]22 The regulation of the other GWAS genes during MS finds the first description in our study. Concerning the frequency of dysregulation of MS susceptibility genes, a recent work shows that CD4-positive pathogenic T cells during EAE are significantly enriched in dysregulated GWAS genes.^[119]18 It is important that this observation is conserved in human PBMC, as we demonstrate a significantly high frequency of dysregulated GWAS genes in CIS and progressive forms of MS but not in RR-MS. As the potential to find transcriptional dysregulation in RR-MS data, however, was comparable with that in CIS (25.68% and 23.67% global dysregulation in CIS and RR-MS, respectively), the lack of enrichment in GWAS DEG in RR-MS has most probably a biological reason, whose explanation remains obscure. Thus, to our knowledge, we report the first systematic expression analysis for MS GWAS genes using large cohorts of untreated subjects with distinct MS stages and HCs, and describe several novel GWAS dysregulations, whose single functional effects on MS pathogenesis deserve further investigations in appropriate animal models. In addition, genome-wide expression profiling has provided relevant information about the overall quality and frequency of dysregulation in PBMC transcriptomes at distinct MS stages. Previous studies have searched for pathways in MS using the full list of GWAS genes under the assumption that all GWAS genes are dysregulated in MS in all tissues and at all disease stages.^[120]23,[121]24 Our study demonstrates that this hypothesis may be wrong, as MS GWAS genes are not all dysregulated at the transcriptional level in PBMC and that there may be differences among disease courses. To investigate the combined effects of multiple dysregulations, we have searched for biological processes significantly enriched in DEG, including MS risk genes, as we have considered that MS is not a pure genetic disorder and pathogenic pathways may be contributed by transcriptional dysregulation determined by environmental factors as well. We report here the complete list of pathways obtained for each disease stage. Notably, several of them were detected in single forms of disease, whereas others were common to more than 1 MS stage with 65 pathways appearing in all the distinct MS subtypes. The shared pathways involve well-known functions in MS pathogenesis as cell signaling and migration, immunity, and apoptosis. Scientific evidence about a pathogenic role for most of these shared pathways is already available in the literature; however the contribution of the single pathways to the distinct MS stages still needs to be clarified. Furthermore, we describe a series of novel candidate pathogenic pathways in MS, which deserve investigation and validation. Overall, we offer the scientific community a large transcriptomic data set for expression studies in MS, the evidence about dysregulation of several known MS risk genes in blood at distinct MS stages, and the complete list of significant pathways contributed by GWAS genes for each disease form. This information may serve as the basis for novel translational medicine studies investigating the molecular framework underlying MS pathogenesis. Supplementary Material Data Supplement [122]supp_4_3_e337__index.html^ (948B, html) ACKNOWLEDGMENT