Abstract Immunoglobulin A (IgA) mediates mucosal responses to food antigens and the intestinal microbiome and is involved in susceptibility to mucosal pathogens, celiac disease, inflammatory bowel disease, and IgA nephropathy. We performed a genome-wide association study of serum IgA levels in 41,263 individuals of diverse ancestries and identified 20 genome-wide significant loci, including 9 known and 11 novel loci. Co-localization analyses with expression QTLs prioritized candidate genes for 14 of 20 significant loci. Most loci encoded genes that produced immune defects and IgA abnormalities when genetically manipulated in mice. We also observed positive genetic correlations of serum IgA levels with IgA nephropathy, type 2 diabetes, and body mass index, and negative correlations with celiac disease, inflammatory bowel disease, and several infections. Mendelian randomization supported elevated serum IgA as a causal factor in IgA nephropathy. African ancestry was consistently associated with higher serum IgA levels and greater frequency of IgA-increasing alleles compared to other ancestries. Our findings provide novel insights into the genetic regulation of IgA levels and its potential role in human disease. Subject terms: Genome-wide association studies, Mucosal immunology, IgA nephropathy __________________________________________________________________ Immunoglobulin A protects against infectious disease and contributes to autoimmune and inflammatory disorders. Here, the authors perform a genome-wide association study for serum IgA levels, identifying 20 genome-wide significant loci, providing new insights into the genetic regulation of IgA levels. Introduction Immunoglobulin A (IgA) provides protection against mucosal infections and contributes to the pathogenesis of autoimmune and inflammatory disorders^[127]1,[128]2. Most of the IgA production occurs at the mucosal surfaces along the gastrointestinal and respiratory tracts, but a large portion of circulating IgA is contributed by bone-marrow plasma cells^[129]3. IgA neutralizes mucosal pathogens^[130]4 and enhanced IgA responsiveness has been reported in various respiratory and gastrointestinal infections, including acute SARS-CoV-2 infection^[131]5–[132]7. Increased serum IgA levels are a common phenomenon in patients with IgA nephropathy^[133]8,[134]9, diabetes^[135]10 and metabolic syndrome^[136]11. The serum concentration of IgA can be influenced by a combination of inherited factors and environmental exposures, including age, sex, and lifestyle factors^[137]11–[138]14. The heritability of serum IgA levels has been estimated in the range 20–60%^[139]15–[140]18. Several GWAS have investigated genetic determinants of serum IgA levels in individuals from either European or East Asian ancestry, and nine significant GWAS loci have been identified to date^[141]16,[142]19,[143]20. Notably, African and other more diverse populations have not been included in prior studies, and limited data exist on the ancestral differences in IgA levels. In this study, we conducted a genetic analysis of serum IgA levels in 41,263 individuals, including 22,229 diverse participants across 16 ancestry-defined cohorts with genome-wide imputed data combined with 19,034 individuals with summary statistics data on significant and suggestive association signals (P < 10^−6) from the previous IgA level GWAS by deCODE Genetics and Lund University (deCODE-Lund)^[144]20. We identified novel genetic determinants of serum IgA levels and, through comprehensive functional annotations, we prioritized candidate causal genes at each of the IgA-associated loci. We then investigated the shared genetic architecture between serum IgA levels and other human traits using several approaches, including genome-wide genetic correlation analysis, co-localization of GWAS loci, Mendelian randomization, and phenome-wide association studies. Our results provide new insights into the genetic regulation of serum IgA levels and its role in genetic susceptibility to several human diseases, including immune, infectious, kidney and cardio-metabolic traits. Results Ancestral differences in serum levels of IgA We first tested for differences in the distribution of serum IgA levels among 12,260 multi-ancestry individuals broadly classified into 4 major groups based on genetic ancestry (1751 African American, 6791 European, 1257 East Asian, and 2461 Latinx or admixed ancestry individuals). First, we performed laboratory measurements of serum IgA levels in all 5420 participants of the Multi-Ethnic Study of Atherosclerosis (MESA) with available serum samples, providing us with the largest multi-ancestry cohort with standardized IgA measurements. For group comparisons, we generated standardized residuals of log-transformed serum IgA levels after adjustment for age and sex. Notably, MESA participants of African ancestry had significantly higher mean age and sex-adjusted serum IgA levels compared to all other ancestries (Fig. [145]1a). We next examined the distribution of adjusted IgA levels in the pooled dataset of 6840 diverse non-MESA participants included in our genetic studies. In this independent dataset, we replicated the strong association of genetic African ancestry with higher IgA levels after age and sex adjustment (Fig. [146]1b). The admixture analysis across the MESA cohort confirmed weak, but highly statistically significant positive correlation between African ancestry and age- and sex-adjusted serum IgA levels (r = 0.026, P = 4.6 × 10^−33, Fig. [147]1c). This correlation remained significant after additional adjustment for body mass index (BMI) and diabetes. Fig. 1. Ancestral differences in serum IgA levels. [148]Fig. 1 [149]Open in a new tab a discovery analysis across the four major ancestral groups in MESA demonstrates that African ancestry is associated with higher adjusted IgA levels, b replication analysis in all non-MESA study participants confirms higher mean IgA levels in individuals of African ancestry, c mean adjusted IgA levels (±95% confidence intervals) as a function of African ancestry fraction, demonstrating that individuals in the upper quartile (>75%) of African ancestry have the highest serum IgA levels (N = 5420 MESA participants); standardized residuals generated by regression of log-transformed serum IgA levels against age and sex were significantly correlated with the African ancestry fraction (P = 4.6 × 10^−33), and this relationship remained highly significant after additional adjustment for BMI and diabetes (P = 3.7 × 10^−23), The boxplots in (a, b, and d) depict the median (horizontal line), upper/lower quartiles (boxes), and range (whiskers); the red diamond point denotes the mean value per ancestry group; two-sided unadjusted t test: *P < 0.05; **P < 0.01; ***P < 0.001. d distributions of the GPS for IgA levels in the 1000 Genomes (Phase 3) populations demonstrating higher GPS in African (AFR) compared to European (EUR), Admixed American (AMR), and East Asian (EAS) populations, e mean AFR-EUR difference in IgA-increasing allele frequency for each quartile of GPS weights (LD-score-adjusted effect sizes under the assumption of 0.01 fraction of casual variants, N = 67267 variants included); error bars correspond to 95% confidence intervals around the mean. Multi-ancestry GWAS meta-analysis Next, we aimed to identify genetic loci controlling serum IgA levels using GWAS. We performed a GWAS meta-analysis of 16 diverse ancestry-defined cohorts comprised of 22,229 individuals genotyped genome-wide, combined with 4699 significant and suggestive association signals from an independent cohort of 19,034 North Europeans previously published by deCODE-Lund^[150]20. Each of the 16 cohorts was genotyped with high-density SNP arrays and imputed using the latest genome sequence reference panels (see Table [151]1 and Online Methods for details). These cohorts were not ascertained based upon any specific immune or disease phenotype. Within each of the cohorts, IgA phenotypes were defined as standardized residuals of log-transformed IgA levels regressed against age and sex. The final combined dataset comprised of 41,263 individuals with serum IgA measurements (35,094 European, 1751 African American, 1957 East Asian, and 2461 Latinx or admixed-ancestry individuals). Table 1. Baseline characteristics of participants in the GWAS cohorts Cohorts Ancestry N[Total] N[Male] N[Female] Mean age MESA (European) European 2280 1132 1148 64.17 MESA (African) African 1275 599 676 64.05 MESA (Admixed 1) Admixed 474 209 265 63.38 MESA (Admixed 2) Admixed 750 369 381 63.17 MESA (East Asian) East Asian 641 319 322 62.48 eMERGE (European) European 4261 1622 2640 56.45 eMERGE (African) African 476 326 150 50.06 eMERGE (East Asian) East Asian 73 29 44 45.38 eMERGE (Admixed 1) Admixed 235 84 151 45.63 eMERGE (Admixed 2) Admixed 1002 427 575 54.97 German (European) European 156 104 52 44.88 French (European) European 103 30 73 – Chinese (East Asian) East Asian 467 318 149 32.39 Japanese (East Asian) East Asian 776 523 252 33.79 U.S. (European) European 93 53 40 35.66 Swedish (PMID: 24676358) European 9167 4361 4806 64.50 deCODE-Lund (PMID: 28628107) European 19,034 – – – All Discovery 41,263 6144 6918 51.18 [152]Open in a new tab The results of joint meta-analysis are provided in Fig. [153]2a and Table [154]2 with regional association plots shown in Supplementary Fig. [155]1. We observed minimal genomic inflation of the final meta-analysis summary statistics (λ = 1.016, Supplementary Fig. [156]2), confirming negligible effects of population stratification. In total, we identified 20 genome-wide significant independent loci, including nine known and 11 novel loci based on P < 5 × 10^−8. We detected no significant heterogeneity in associations across all cohorts, and the meta-analysis results were robust under both fixed effects and trans-ancestry (TransMeta^[157]21) random effects models. Stepwise conditional analyses of the genome-wide significant loci revealed that the HLA locus harbored at least two independently genome-wide significant variants (Supplementary Fig. [158]3), but no additional independent signals were detected outside of the HLA region. Forests plots in Supplementary Figs. [159]4, [160]5 provide detailed comparisons of effect estimates by ancestry for each of the 20 genome-wide significant loci, while Supplementary Data [161]1 provides the breakdown of effect estimates and P-values for each individual cohort. We additionally identified eight suggestive signals with P < 1 × 10^−6 (Supplementary Table [162]1), including TNFSF13 locus previously reported in GWAS of Han Chinese ancestry participants^[163]19. Using linkage disequilibrium (LD)-score regression method^[164]22, we estimated the genome-wide SNP-based heritability of age- and sex-adjusted IgA levels at approximately 7% (95%CI: 2–11%). Fig. 2. Trans-ethnic meta-analyses across all cohorts identified 20 genome-wide significant loci. [165]Fig. 2 [166]Open in a new tab a Manhattan plot depicting a total of 11 novel loci (red) as well as 9 known loci (purple) identified in the meta-analysis; the dotted horizontal line indicates a genome-wide significance threshold (P = 5 × 10^−8); the y-axis depicts -log10(P-value) for the fixed effects meta-analysis (two-sided, unadjusted), and is truncated to accommodate large peak at RUNX3 locus. b Correlation between average frequency of the independently significant alleles associated with higher IgA levels (x-axis) and their age and sex adjusted effect size (y-axis, standardized betas). c Pleiotropic effects of the IgA GWAS loci; GWAS loci for IgA levels are in green; other traits are in purple; arrows represent allelic associations that are identical to or in tight LD (r^2 > 0.5) with the IgA effect alleles; concordant effects indicated in red; opposed effects in blue. Table 2. Effect estimates for 20 genome-wide significant loci for IgA levels by trans-ancestry meta-analysis Locus CHR BP (hg19) SNP Effect Allele Beta P-value FE P-value RE I2 Q Note RUNX3 1 25291697 rs188468174 C 0.876 3.42E−92 1.37E−91 0.0 0.55 Known TNFSF4, TNFSF18 1 173163568 rs7518129 G 0.056 1.06E−16 4.24E−16 3.3 0.42 Known GPATCH2 1 217563106 rs73100295 T 0.356 3.91E−08 3.11E−08 0.0 0.98 New IL1R1 2 102689031 rs13427957 C 0.040 6.19E−10 3.86E−10 33.3 0.08 New ELL2 5 95277555 rs3777175 G 0.084 7.81E−30 3.13E−29 0.0 0.50 Known ANKRD55, IL6ST 5 55438580 rs6859219 C 0.073 1.41E−20 4.38E−20 12.4 0.31 Known HLA 6 31106893 rs1265094 A 0.076 1.86E−31 7.43E−31 37.4 0.06 Known RUNX2 6 45526470 rs1200427 T 0.059 6.85E−14 2.24E−13 20.1 0.23 New CITED2 6 139975943 rs17069163 T 0.050 5.94E−11 2.01E−10 37.3 0.06 New ZP3, SSC4D 7 76034150 rs55722505 C 0.048 8.61E−14 3.44E−13 23.6 0.18 New TNFSF8, TNFSF15 9 117692882 rs3181356 T 0.069 1.13E−18 2.67E−18 0.0 0.75 Known TMEM258, FADS2 11 61595564 rs968567 T 0.116 2.42E−41 3.05E−41 14.9 0.29 Known OVOL1, RELA 11 65555524 rs10896045 A 0.066 2.57E−22 1.03E−21 0.0 0.84 New POU2AF1 11 111267394 rs4938518 T 0.056 7.01E−16 2.80E−15 0.0 0.74 New HDAC7 12 48214825 rs7487637 G 0.054 9.97E−15 3.99E−14 4.2 0.41 New SH2B3 12 111833788 rs10774624 G 0.050 1.37E−13 3.25E−13 0.0 0.54 Known TRAF3 14 103239630 rs12147883 C 0.048 5.42E−14 3.51E−14 8.4 0.35 New LITAF 16 11717832 rs113962704 T 0.054 1.91E−12 6.02E−12 0.0 0.54 New CTF1 16 30916129 rs1458201 A 0.052 2.02E−11 8.07E−11 25.1 0.16 New HORMAD2, LIF 22 30448399 rs193473 A 0.067 1.58E−20 6.30E−20 0.0 0.51 Known [167]Open in a new tab Beta: per allele effect estimate from linear regression expressed in age, sex, and ancestry-adjusted standard deviation units of IgA distribution. The last column indicates known (previously reported) and new (newly discovered) loci. Gene names indicated in italics. FE fixed effects meta-analysis, RE random effects (TransMeta) meta-analysis, I2 Heterogeneity Index, Q Cochrane’s heterogeneity test P-value. As expected, the effect sizes of independently associated variants were inversely related to their minor allelic frequencies (Fig. [168]2b). Two relatively rare ancestry-specific genome-wide significant variants exhibited the largest effects, including the previously reported RUNX3 locus supported by the European cohorts (IgA-lowering allele rs188468174-T, Beta = −0.88, P = 3.42 × 10^−92, European MAF = 1%, absent in African genomes), and the novel GPATCH2 locus supported predominantly by the African ancestry cohorts (IgA-increasing allele rs73100295-T, Beta = 0.36, P = 3.91 × 10^−8, MAF = 9% in African ancestry cohorts, MAF = 2% in admixed cohorts, rare in Europeans). Interestingly, the IgA-increasing alleles at 12 of the 20 genome-wide significant loci (60%) were more frequent in African compared to European ancestry, suggesting an enrichment of IgA-increasing alleles in African genomes. To assess if a similar trend was present for all IgA-increasing alleles, we derived a genome-wide polygenic score (GPS) for serum IgA levels (see Online Methods) and compared its distributions between major ancestral populations of the 1000 Genomes Phase 3 reference dataset. The African (AFR) reference population had the highest average GPS of all other ancestral populations (Fig. [169]1d), with the mean value over two standard deviations higher compared to the European (EUR) population (t-test P < 2 × 10^−16). The observed distributional differences in the GPS by ancestry had a strikingly parallel pattern to the differences in serum IgA levels (Fig. [170]1a, b). Because the GPS weights are constant when scoring populations, these observations must be driven by inter-population differences in allelic frequencies. These findings were also consistent with the results of our admixture analysis demonstrating positive correlation between the fraction of African ancestry and IgA levels, leading us to hypothesize that the IgA-increasing alleles could have provided some degree of fitness advantage in Africa, or decreased fitness in non-African environment. To assess for potential polygenic adaptation, we next tested African-European frequency difference for the IgA-increasing alleles for correlation with their corresponding GPS weights (LD-score-adjusted effect sizes assuming 1% fraction of causal variants). We detected a significant positive rank correlation among the top 1% variants with the largest effect (Spearman’s r = 0.01, P = 0.0043, N = 67,267 top variants, Fig. [171]1e). When extended genome-wide by LD score regression^[172]23, this correlation became non-significant (r[g] = 0.05, P = 0.32, N = 6,710,977 variants). Similarly, we detected no significant genome-wide correlation by LD score regression between polarized singleton density scores (tSDS)^[173]24 for IgA increasing alleles and their effect sizes (r[g] = −0.07, P = 0.31, N = 3,635,846 variants with tSDS scores available). Pathway and tissue enrichments In a protein–protein interaction (PPI) network analysis of positional candidate genes from the non-HLA loci, we observed greater network connectivity than expected by chance (permutation P = 0.002, Supplementary Fig. [174]6), suggesting that the gene products physically interact and thus may participate in common biological processes. Our pathway enrichment analyses based on genome-wide summary statistics^[175]25 identified five significantly enriched pathways at Bonferroni-adjusted P < 0.05 (Fig. [176]3a and Supplementary Table [177]2), including cytokine signaling in immune system (P = 1.2 × 10^−6), signaling by interleukins (P = 1.1 × 10^−5), cytokine-cytokine receptor interactions (P = 5.2 × 10^−6), TNFs and their physiological receptors (P = 1.4 × 10^−5), and IL-6-type cytokine receptor ligand interactions (P = 5.0 × 10^−5). Using data-driven expression-prioritized integration for complex traits (DEPICT) analysis^[178]26, we further prioritized 17 tissues and cell types at FDR < 0.05 based on our GWAS results, with the strongest enrichment in bone marrow cells, hematopoietic system, as well as blood and myeloid cells (Fig. [179]4b and Supplementary Table [180]3). Fig. 3. Pathway and gene set enrichment analyses. [181]Fig. 3 [182]Open in a new tab a Pathway enrichment analysis for genes at the significant GWAS loci (two-sided enrichment test P-values). b Gene-set enrichment for genes that cause abnormal IgA level. c abnormal immune tolerance; and (d) abnormal response to infection when genetically manipulated in mice. The y-axis shows the fixed effects meta-analysis –log10 (P-value) for the variant with the lowest two-sided unadjusted P-value in each candidate gene. The dashed line corresponds to genome-wide significance (P = 5 × 10^−8). Enrichment P-value corresponds to the two-sided Fisher exact test comparing the observed number of genes with association signals below the genome-wide threshold against the number expected under binomial distribution. Fig. 4. Colocalization and enrichment analyses across different tissue/cell types. [183]Fig. 4 [184]Open in a new tab a Colocalization analysis with gene expression QTLs (eQTLs) in whole blood and primary immune cells; each row indicates one gene and each column denotes one tissue/cell type; the color from red to white shows the probability of sharing the same causal variant between GWAS loci and eQTLs (PP4). b Tissue/cell type enrichment in DEPICT; the y-axis represents the −log10 of the two-sided adjusted p-value for the enrichment test and x-axis shows the first level MeSH tissue and cell type annotations. c Integrative analysis of eQTL and hQTL for LITAF locus. The upper and lower panels show the regional plots of the LITAF locus for IgA GWAS and eQTLs in monocyte, respectively. The y-axis represents the −log10 of the unadjusted two-sided p-value for the fixed effects GWAS meta-analysis (top) and the Wald test from linear regression of genotype on gene expression level (bottom); x-axis shows the chromosome positions. The lowest panel denotes the positions of LITAF gene and a hQTL peak (H3K27ac) in monocytes. Pleiotropic associations based on GWAS catalog We next interrogated the pleiotropic effect of individual loci by annotating lead SNPs and their proxies against the GWAS catalog database (see Online Methods)^[185]27. Most of the genome-wide significant loci had previous GWAS associations with immune-mediated disorders, infections, or hematological traits (Fig. [186]2c and Supplementary Data [187]2). In particular, eight non-HLA loci, SH2B3, ANKRD55, HDAC7, RCOR1/TRAF3, TNFSF4, POU2AF1, FADS2/TMEM258, and OVOL1/RELA, displayed either concordant or opposed effects on 18 different autoimmune and inflammatory disorders, suggesting that genetic regulation of IgA levels may play a pervasive role in the control of autoimmunity and inflammation. The SH2B3 locus displayed the highest degree of pleiotropy, being associated with 79 different GWAS traits. We also found that the alleles associated with higher serum IgA levels at both SH2B3 and HORMAD2/LIF loci were associated with increased risk of tonsillectomy, a procedure frequently performed in the setting of recurrent pharyngitis^[188]28. Moreover, concordant effects on high blood pressure were found at three loci, including SH2B3, CTF1 and HDAC7, consistent with the epidemiologic association of high blood pressure with increased IgA levels^[189]11. Interestingly, the SH2B3 locus showed concordant risk effects on several cardiovascular traits, including coronary artery disease and hypertension, but had an opposed effect on LDL cholesterol. The effect on BMI was concordant with IgA levels for SH2B3 and RCOR1/TRAF3 loci, also consistent with the reported correlation between higher serum IgA levels and obesity^[190]11. Functional annotations of GWAS loci The majority of lead SNPs at genome-wide significant loci map to non-coding (intronic or intergenic) regions (Supplementary Table [191]4). The lead SNPs at five loci had proxies in the 5′ or 3′UTR regions, in RUNX2, ELL2, TNFSF15, TNFSF4, TRAF3, and FADS2 genes, and three loci had missense proxies, in FBXL19, ELL2 and SH2B3 genes. At the previously reported ELL2 locus, we identified the Thr298Ala missense variant in ELL2 exon 7 (rs3815768, P = 2.9 × 10^−27) in linkage disequilibrium with the lead SNP (rs3777175, P = 7.8 × 10^−30, r^2 = 0.69) and located to the end of the ELL2 domain required for transcriptional elongation^[192]29. At the known SH2B3 locus, rs3184504 in tight LD with the top SNP (r^2 = 0.94) is a missense variant in the canonical transcript of SH2B3 gene, although this variant is predicted to be benign by PolyPhen2. Given that other top signals map to non-coding regions, we evaluated their potential regulatory function by systematic eQTL co-localization analyses using whole blood^[193]30 and 13 primary immune cell types^[194]31. The co-localization probability (PP4) exceeded 50% in at least one cell type for 14 of 20 GWAS loci, prioritizing biologically plausible candidate genes at each of these loci (Fig. [195]4a and Supplementary Table [196]5). We will next summarize some of our positive functional annotation findings for the top non-HLA GWAS loci. RUNX2 and RUNX3 loci The known RUNX3 locus on chr.1p36.11 represents one of the strongest signals in our meta-analysis with the largest effect size (rs188468174, Beta = 0.88, P = 3.42 × 10^−92). This locus was previously associated with IgG glycosylation^[197]32 and IgA levels^[198]20, and strongly replicated in our study. Interestingly, we have also picked up a novel locus with smaller effect on chr.6p21.1 encoding RUNX2, a related transcription factor (rs1200427, Beta = 0.06, P = 6.85 × 10^−14). RUNX transcription factors are essential regulators of diverse developmental and signaling pathways^[199]33,[200]34. Both RUNX2 and RUNX3 physically interact^[201]35 and have been linked to retinoic-acid- and TGF-β-induced IgA class switching^[202]36. The novel RUNX2 locus co-localized with the eQTL for the RUNX2 gene in whole blood with PP4 = 0.99; the IgA-decreasing allele at the index SNP (rs1200427-G) was associated with higher mRNA levels of RUNX2 (Fig. [203]4a and Supplementary Table [204]5). Notably, the lead SNP is in LD with rs1200428 (r^2 = 0.51), the 3′UTR variant in RUNX2 that has the strongest eQTL effects in blood. A similar phenomenon was previously reported for the RUNX3 locus, where the minor allele of the top SNP (rs188468174-T) was associated with lower IgA levels and increased mRNA abundance of the long isoform of RUNX3^[205]20. Therefore, our study solidifies the evidence for a genetic control model of RUNX transcription factors, in which increased RUNX expression suppresses IgA class switching, reducing circulating IgA levels. LITAF locus The new locus on chr.16p13.13 (rs113962704, Beta =0.05, P = 1.91 × 10^−12) encodes lipopolysaccharide-induced TNF-alpha factor (LITAF), a transcription factor regulating TNF-alpha expression in intestinal macrophages^[206]37,[207]38. Notably, macrophage-specific deficiency of LITAF in mice leads to attenuated TNF and IL-6 response upon LPS stimulation^[208]39. Our analyses revealed that this locus co-localized with eQTL for LITAF exclusively in monocytes. Moreover, the top SNP represented a monocyte-specific hQTL (H3K27ac)^[209]40, suggesting that it alters LITAF enhancer activity in the monocytic lineage (Fig. [210]4a, c). The IgA-increasing allele (rs113962704-T) was associated with higher mRNA expression of LITAF in monocytes, supporting the hypothesis that this transcription factor provides a stimulus for IgA production by altering monocyte function. IL1R1, TRAF3, ANKRD55, and HDAC7 loci These four loci exhibited T-cell specific eQTL effects. The new locus on chr.2q11.2 (rs13427957, Beta = 0.04, P = 6.19 × 10^−10) contains the IL1R1 gene encoding Interleukin 1 Receptor Type 1, an important T-cell receptor involved in cytokine-induced immune and inflammatory responses^[211]41. This locus co-localizes with eQTL for MIR4772 in Th1 cells (PP4 = 0.69) with concordant effect on IgA levels, and IL1R1 is predicted to be a target of MIR4772 by miRDB^[212]42 and TargetScan^[213]43. A mouse knock-out of IL1R1 in Th2 cells had decreased IgA levels^[214]44. The new locus on chr.14q32.32 (rs12147883, Beta = 0.05, P = 5.42 × 10^−14) contains TRAF3 encoding TNF Receptor Associated Factor 3, a protein participating in the CD40 signaling, inhibition of non-classical NF-κB signaling, and regulation of class switch recombination in B cells^[215]45–[216]47. Interestingly, this locus co-localizes with eQTL for TRAF3 specifically in T cell lineage, where the IgA-increasing variant is associated with higher TRAF3 expression. In contrast to its inhibitory functions in B-cells, TRAF3 is known to promote many T-cell effector functions through enhancing signaling by the T-cell receptor-CD28 complex^[217]48,[218]49. Another locus on chr.5q11.2 (rs6859219, Beta = 0.07, P = 1.41 × 10^−20) encoding ANKRD55 (closest gene) and IL6ST has previously been associated with IgA levels^[219]20, IgG glycosylation^[220]50 and increased risk of multiple sclerosis, rheumatoid arthritis, and Crohn’s disease (Fig. [221]1c, Supplementary Data [222]2). Notably, ANKRD55 gene expression is specific to CD4+ T cells, and mouse studies suggest that ANKRD55 is induced by inflammation and displays T-cell regulatory functions^[223]51. We co-localized this locus with eQTL for ANKRD55 in T regulatory cells (PP4 = 0.51) as well as in whole blood (PP4 = 0.98) with concordant effect of ANKRD55 expression and IgA levels. Moreover, we found that rs6859219 intersects two genetically regulated histone-modification peaks (H3K4me1 and H3K27ac) that are T cell-specific^[224]40 (Supplementary Fig. [225]7a). The fourth locus on chr.12q13.11 (rs7487637, Beta = 0.05, P = 9.97 × 10^−15) contains HDAC7 encoding an important histone deacetylase regulating differentiation and function of T-cells^[226]52. This new locus co-localizes with eQTL for HDAC7 specifically in T-cells, where the IgA-increasing allele is associated with higher mRNA levels of HDAC7. OVOL1/RELA locus The new locus on chr.11q13.1 (rs10896045, Beta = 0.07, P = 2.57 × 10^−22) encodes multiple candidate transcripts including OVOL1, RELA, and several others. The co-localization analysis suggested that this locus was shared with IgA nephropathy, with the IgA-increasing allele associated with increased risk of IgA nephropathy, a kidney disease due to IgA deposition in the glomeruli. Previous GWAS also pointed to the protective role of this locus from atopic dermatitis (Fig. [227]2c). While the index SNP localizes to the intronic portion of OVOL1, a putative transcription factor of poorly defined function, the nearby RELA gene encodes a subunit of NF-κB complex^[228]53. We found no significant eQTL effects for either RELA or OVOL1 in blood or primary immune cells. However, the lead SNP had significant eQTL effects on OVOL1 transcript in thyroid (P = 3.1 × 10^−12), spleen (P = 1.5 × 10^−6) and EBV-transformed lymphocytes (4.7 × 10^−9) with the IgA-increasing allele (rs10896045-A) associated with higher expression of OVOL1 gene in all three GTEx tissues. POU2AF1 locus The new locus on chr.11q23.1 (rs4938518, Beta = 0.056, P = 7.01 × 10^−16) encodes POU2AF1 (POU class 2 homeobox associating factor 1), the gene involved in B-cell antigen responses required for the formation of germinal centers^[229]54. Mouse knock-out of POU2AF1 leads to increased B-cell apoptosis and decreased IgA production^[230]55. The IgA-decreasing allele at this locus has been associated with increased risk of primary biliary cholangitis (Fig. [231]2c and Supplementary Data [232]2). Additional newly discovered loci The new locus on 16p11.2 (rs1458201, Beta = 0.052, P = 2.02 × 10^−11) contains CTF1 (encoding cardiotrophin-1) involved in multiple immune-related pathways including cytokine signaling in immune system and interleukins, IL-6-type cytokine receptor ligand interactions and signaling (Fig. [233]3a). The new locus on chr.6q24.1 (rs17069163, Beta = 0.05, P = 5.94 × 10^−11) contains CITED2, which encodes a CBP/P300 interacting trans-activator functioning as a molecular switch of TGF-α and TGF-β induced signaling^[234]56, and previously implicated in immune homeostasis and tolerance^[235]57. The new locus on chr.1q41 (rs73100295, Beta = 0.36, P = 3.91 × 10^−08) encodes GPATCH2, but its role in the immune system regulation is unknown. Lastly, our co-localization analysis of the new locus on chr.7q11.23 (rs55722505, Beta = 0.048, P = 8.61 × 10^−14) suggested several candidate effector genes including SRCRB4D, DTX2 and YWHAG in whole blood, ZP3 and SSC4D in monocytes, and POMZP3 in naïve CD8 and B cells (Fig. [236]4a). Additional known loci validated in this study We replicated the previously reported ELL2 locus (rs3777175, Beta = 0.08, P = 7.81 × 10^−30), which co-localized with blood eQTL for ELL2 (PP4 = 0.53, Fig. [237]4a and Supplementary Table [238]5). Consistent with previous reports^[239]58, the IgA-increasing allele (rs3777175-G) was associated with lower transcript level of ELL2 (P = 6.76 × 10^−24), prioritizing ELL2 (encoding elongation factor for RNA polymerase II 2) as the most likely causal gene at this locus. The known TMEM258/FADS2 locus (rs968567, Beta = 0.12, P = 2.42 × 10^−41) co-localized with several candidate genes, including FADS1 and FADS2 (fatty acid desaturases regulating unsaturation of fatty acids, co-localized across most blood cell types) and the TMEM258 gene (co-localized specifically in T cells), with the IgA-increasing allele (rs968567-T) associated with higher expression of all three transcripts. Notably, rs968567 falls into a genetically regulated histone-modification peak (H3K4me1) for FADS2 in T cells, monocytes, and neutrophils^[240]40 (Supplementary Fig. [241]7b). TMEM258 represents a particularly attractive candidate gene at this locus, since mice deficient in TMEM258 exhibit severe intestinal inflammation^[242]59. We additionally replicated three known loci encoding members of tumor necrosis factor ligand superfamily; TNFSF4/TNFSF18 locus on chr.1q25.1^[243]20 (rs7518129, Beta = 0.06, P = 1.06 × 10^−16, TNFSF4 is the closest gene), TNFSF8/TNFSF15 locus on chr.9q33.1^[244]16 (rs3181356, Beta = 0.07, P = 1.13 × 10^−18, TNFSF8 is the closest gene), and a suggestive TNFSF13 locus on chr.17p13.1^[245]19 (rs3803800, Beta = 0.06, P = 9.41 × 10^−08). These loci encode powerful TNFSF cytokines with partially overlapping receptors^[246]60. TNFSF8/TNFSF15 and TNFSF13 loci are also associated with the risk of IgA nephropathy, while TNFSF4/TNFSF18 locus has previously been associated with the risk of eczema, asthma and narcolepsy, all with concordant effects to IgA levels (Fig. [247]2c and Supplemental Table [248]5). Convergence with relevant mouse phenotypes To further prioritize causal genes at our GWAS loci, we tested for significant gene set overlaps with human orthologs that cause immune-related phenotypes when disrupted in mice. This included gene sets for abnormal IgA levels (120 genes), abnormal immune tolerance (408 genes) and abnormal response to infection (562 genes) from the Mouse Genome Informatics (MGI) database^[249]55. We identified significant overlap for 13 genes linked to abnormal IgA levels in mice (enrichment P = 5.8 × 10^−5), 29 linked to abnormal immune tolerance (enrichment P = 6.3 × 10^−6) and 42 linked to abnormal response to infection (enrichment P = 1.1 × 10^−5) (Fig. [250]3b and Supplementary Tables [251]6–[252]8). Among the 120 genes with abnormal IgA phenotypes in mice, six additional genes (TRIM13, NCR1, IRF5, FGR, ARHGAP15 and REL) surpassed a Bonferroni-corrected threshold (P = 0.05/22169 = 2.26 × 10^−06) when tested against our GWAS results; these represent plausible new candidate genes for regulation of IgA production in humans. Relationship to the risk of IgA nephropathy and tonsillectomy Given the known role of the IgA system in the pathogenesis of IgA nephropathy, a kidney disease caused by glomerular deposition of IgA in the setting of pharyngitis and other mucosal infections, we specifically tested for shared genetic architecture between IgA levels, IgA nephropathy, and tonsillectomy^[253]61 by systematic lookups and co-localization analyses of genome-wide significant loci. Among all 20 independent loci associated with higher IgA levels, 8 had nominal associations with increased risk of IgA nephropathy (P < 0.05), all with concordant effects (Supplementary Table [254]9). Among these, there were five genome-wide significant loci with high probability of shared causal variants (PP4 > 0.7), TNFSF4/TNFSF18, ANKRD55/IL6ST, OVOL1/RELA, SH2B3, and HORMAD2/LIF (Supplementary Table [255]10). There were also four loci with an overlapping genomic position, but high probability of different causal variants between the two traits (PP3 > 0.7), HLA, CTF1, TRAF3, and TNFSF8/TNFSF15. Between IgA levels and tonsillectomy, we observed two co-localized loci (SH2B3 and HORMAD2/LIF), both with concordant effects, and another two loci (HLA and CTF1) with high probability of different causal variants (Supplementary Table [256]11). Remarkably, the HORMAD2/LIF locus was genome-wide significant in all three GWAS and co-localized across all three traits, suggesting a common genetic mechanism (Fig. [257]5a, b). Fig. 5. Colocalization and Mendelian randomization analyses based on GWAS for serum IgA levels, IgA nephropathy and tonsillectomy. [258]Fig. 5 [259]Open in a new tab a Regional plot of the HORMAD2/LIF locus for IgA levels (without the deCODE-Lund cohort, top panel), IgA nephropathy (middle panel), and tonsillectomy (lower panel). The two-sided unadjusted P-value corresponds to the fixed effects GWAS meta-analysis. b Co-localization analysis of HORMAD2/LIF locus across the three traits; PP4 is the posterior probability of co-localization. c Mendelian randomization analysis using IgA level (GWAS N = 41,263) as an exposure, IgA nephropathy (GWAS N = 38,897) as an outcome, and co-localizing loci as instruments. The error bars correspond to 95% confidence intervals for the effect size. The x and y axis represent effect sizes of the genetic variants associated with the exposure and outcome, respectively. P-value corresponds to the two-sided inverse variance weighted (IVW) Mendelian Randomization test. To further explore the genetic relationships between these traits, we performed two sample Mendelian Randomization (MR) between the co-localized IgA level-associated loci as an exposure instrument and IgA nephropathy and tonsillectomy as disease outcomes. Using this strategy, we estimated significant causal effects between serum IgA levels and IgA nephropathy (inverse variance-weighted OR 9.70 per SD of exposure, 95%CI: 6.80–13.8, P < 0.001; Fig. [260]5c), supporting IgA level as a strong causal mediator of disease risk for these loci. Sensitivity analysis confirmed that all co-localized loci contributed with concordant effects, there were no outlier effects, and there was no evidence of directional horizontal pleiotropy (Egger intercept test P = 0.58). Moreover, this effect remained highly significant when instrumental variables were expanded to encompass all genome-wide significant non-HLA loci for serum IgA levels (inverse variance-weighted OR = 2.49 per SD of IgA level, 95%CI: 1.56–3.99, P < 0.001). In contrast, we detected no significant causal effects between IgA levels and tonsillectomy (P = 0.26), or tonsillectomy and IgA nephropathy (P = 0.96). Genetic correlations with other complex traits studied by GWAS We assessed genome-wide genetic correlations (r[g]) of IgA levels with 52 complex traits and diseases, including 13 immune-mediated disorders, 23 infectious diseases, and 16 cardio-metabolic traits using stratified LD score regression^[261]22,[262]23 (Fig. [263]6a and Supplementary Table [264]12). In the analysis that excluded the HLA region (to remove potential bias from large effects and extended LD at this locus), we confirmed positive genetic correlation with IgA nephropathy (r[g] = 0.35, P = 0.002), tonsillectomy (r[g] = 0.20, P = 0.01), type 2 diabetes (r[g] = 0.18, P = 0.01), and BMI (r[g] = 0.13, P = 0.03). We observed a negative genetic correlation with Crohn’s disease (r[g] = −0.19, P = 0.005), celiac disease (r[g] = −0.21, P = 0.007), and inflammatory bowel disease (r[g] = −0.13, P = 0.04). The observed negative correlation with traits that involve gut inflammation could be potentially explained by the protective anti-inflammatory effects of mucosal IgA. Among infectious disease GWAS, we observed mainly negative genetic correlations, including with bacterial meningitis (r[g] = −0.47, P = 0.005) and shingles (r[g] = −0.46, P = 0.009), despite the fact that most existing GWAS for infections are either underpowered or based only on self-report^[265]61. For most, but not all phenotypes, genetic correlations with and without HLA were comparable, as summarized in Supplementary Fig. [266]8 and Supplementary Table [267]12. Fig. 6. Genetic relationships between IgA levels and human disease traits. [268]Fig. 6 [269]Open in a new tab a genome-wide genetic correlation analyses between IgA levels and autoimmune, infectious, and cardio-metabolic traits after exclusion of the HLA region (*P < 0.05; two-sided unadjusted P-values for genetic correlation by LD score regression). Supplementary Table [270]12 provides genetic correlations with and without HLA with P-values for each trait, and references to the original GWAS studies; the error bars