Graphical abstract graphic file with name fx1.jpg [33]Open in a new tab Highlights * • Genetic predisposition to SLE confers risk of CAD * • Increased CVD risk in SLE involves atherosclerosis rather than cardiac dysfunction * • PPI-based MR identifies pathways with causal implications between SLE and CAD * • Pathway analysis informs therapeutic selection for managing CAD in SLE __________________________________________________________________ Kain et al. report a positive genetic relationship between SLE and CAD using traditional and comprehensive Mendelian randomization (MR) approaches. MR analysis coupled with network modeling highlights the shared genetic risk underlying the two diseases and identifies putative molecular pathways involved in the development of CAD in SLE. Introduction Systemic lupus erythematosus (SLE) is a female-predominant autoimmune disease characterized by immune dysregulation and multi-organ inflammation that is frequently associated with the development of cardiovascular disease (CVD).[34]^1^,[35]^2 SLE exhibits hyperactivity of the innate and adaptive immune systems, increased production of numerous autoantibodies, and disturbed cytokine balance.[36]^3 Although CVD is not a diagnostic criterion of SLE and was not included in the original descriptions of the disease, it is currently the main cause of death in SLE,[37]^4^,[38]^5^,[39]^6 with coronary artery disease (CAD) directly responsible for one-third to one-half of all CVD cases and 30% of deaths.[40]^7^,[41]^8^,[42]^9 Notably, whereas mortality from infections and active disease has decreased in SLE patients, CVD-related death rates have not improved,[43]^10 and the standardized mortality ratio related to CVD has actually increased.[44]^11 Women with SLE have a significantly increased risk of stroke and myocardial infarction along with elevated incidence of asymptomatic atherosclerosis compared with the general population.[45]^12^,[46]^13 Furthermore, traditional CVD risk factors, such as cholesterol, blood pressure, and smoking status, fail to fully account for the overall higher risk of acute CVD events in SLE, although the underlying mechanisms remain unknown.[47]^14^,[48]^15^,[49]^16^,[50]^17 This lack of an understanding for the increased risk of CVD in SLE has resulted in limited treatment options and the puzzling juxtaposition that despite the efficacy of statins and angiotensin-converting enzyme inhibitors/angiotensin receptor blockers in treating the general population, they appear to have little effect on CVD outcomes in SLE patients.[51]^5^,[52]^18 As a result, even though SLE has a prevalence of only about 70 per 100,000, it ranks among the leading causes of death in young women,[53]^1 despite the omission of lupus diagnoses in almost half of SLE patients’ death certificates.[54]^19^,[55]^20 Genetic predisposition imposes important risk factors for both SLE and CVD.[56]^21^,[57]^22^,[58]^23 To date, genetic association studies of SLE patients with and without CVD have been limited in size and have detected only a few common genetic risk loci, including IRF8, STAT4, IL19, and SRP54-AS1.[59]^22^,[60]^24^,[61]^25 Mendelian randomization (MR) is a causal inference method using genotypes as “treatments” when randomized controlled trials are not feasible. By measuring and correlating the effect sizes of exposure-associated genetic variants in large-scale genetic association studies on traits of interest, a causal effect of the exposure on the outcome can be estimated. Here, we report the application of multiple, complementary MR methods to identify causal paths from SLE-associated variants to CAD using summary statistics from genetic association studies. Using multiple MR algorithms, we have identified large sets of SLE causal variants that also impart genetic risk for CAD as well as those that appear to diminish the risk of CAD. Using innovative approaches to build molecular pathways from genetic risk factors,[62]^26 we have developed a map of SLE-derived biological processes with causal implications for CAD that may account for the genetic basis of the association between these two apparently dissimilar clinical entities and may also provide insights into the shared mechanisms underlying each. Understanding the pathogenesis of genetic variants underlying the increased CAD risk in SLE can ultimately provide insight into the immune and inflammatory components of atherosclerosis, as well as revealing opportunities for targeted therapeutics. Results Pathway analysis reveals gene networks implicated by genetic variants associated with both SLE and CAD To explore the shared genetic predispositions for SLE and CAD, we first identified single nucleotide polymorphisms (SNPs) associated with each trait in five SLE and one CAD multi-ancestral genetic association study.[63]^27^,[64]^28^,[65]^29^,[66]^30^,[67]^31^,[68]^32 In total, 96 SNPs were associated with both conditions ([69]Figure S1A). Notably, the majority of the overlapping SNPs mapped to the human leukocyte antigen (HLA) region of chromosome 6. To identify putative gene(s) influenced by each of the 96 SNPs associated with both SLE and CAD, we mapped causal SNPs to genes,[70]^26 identifying 189 unique genes encoding 135 proteins in STRINGdb ([71]Figure S1B). Stratified linkage disequilibrium score regression (S-LDSC) was then used to validate the biological relevance of SNP-predicted gene and protein sets by assessing whether they captured more disease heritability than expected by chance with respect to all genes and STRINGdb proteins, respectively.[72]^33 Application of S-LDSC using genome-wide association study (GWAS) summary statistics for SLE (GCST003155[73]^27), CVD (GCST004280[74]^34), and two CAD datasets (CAD-I, GCST000998[75]^35 and CAD-II, GCST001479[76]^36) determined that the 189 genes predicted by the 96 overlapping SNPs were significantly (p < 0.05) enriched for genomic regions capturing the genetic heritability of CAD and CVD ([77]Figure S1C). Nearly identical results were also obtained using the smaller subset of 135 protein-coding genes ([78]Figure S1C). However, application of standard LDSC, which was restricted to the use of the relatively small, European-only SLE GWAS,[79]^27 did not reveal a significant level of genetic correlation between the diseases ([80]Figure S1D). To assess molecular networks encoded by the set of 135 protein-encoding genes predicted from the overlapping SLE/CAD SNPs, a protein-protein interaction (PPI) network was generated, and unsupervised clustering revealed 12 distinct gene clusters that were functionally enriched in a diverse range of immunological and cellular categories ([81]Figure S1E), many with relevance to SLE and CAD/CVD, including cluster 1 characterized by canonical pathways for “Antigen presentation pathway” and “B cell development,” along with “Sudden cardiac death,” and cluster 3 enriched in “Atherosclerosis signaling” and “Lupus erythematosus, systemic,” among others ([82]Table S1). Although the molecular pathways associated with SNP-predicted genes suggested a convergence of biological processes underlying SLE and CAD, it remained uncertain whether the finding of overlapping SNPs implied shared genetic causation. The subsequent studies presented here explore this in detail. MR estimates a positive correlation between effects of SLE-associated non-HLA variants on SLE and CAD MR methods were employed to estimate the association between effect sizes of relevant variants on SLE and CAD. We first applied six MR methods using various sets of SLE-associated instrumental variables (IVs) to determine whether they tend to confer shared (positive) effects on SLE and CAD, noting that this initial approach did not satisfy all assumptions for IV validity or IV independence and therefore could only provide an estimated association ([83]Figure 1A). Initial exploratory analyses employing IVs derived from the Immunochip and GWASs suggested a net positive association for non-HLA SLE-associated SNPs on CAD ([84]Figures 1B and 1C). Even when using SLE IVs determined with the more stringent significance level and removal of known pleiotropic associations to CVD or confounders such as cholesterol, obesity, blood pressure, insulin resistance, and smoking, the indication of a positive causal relationship between SLE and CAD remained ([85]Figure 1C, bottom row). Figure 1. [86]Figure 1 [87]Open in a new tab MR demonstrates a positive association of effect sizes of SLE-associated non-HLA SNPs on SLE and CAD (A) Graphical depiction of the two-stage approach for an initial exploratory analysis using expanded groups of SNPs as IVs followed by a confirmatory analysis using highly curated IVs. (B and C) Forest plots of six MR causal estimates (β ± standard error). For results, gray indicates insignificant (p > 0.05) and red indicates positive causal estimates determined by each MR method. (B) Immunochip-derived SLE-associated non-HLA SNPs were used as IVs for SLE; summary statistics from both the SLE Immunochip study (left) and SLE GWAS (right) were used for the exposure; summary statistics from the CAD GWAS were used for the outcome. (C) Additional MR analyses using SNPs associated (p < 10^−6) with SLE in the Immunochip and GWAS study (rows 1 and 2) or Phenoscanner reaching genome-wide significance (p < 5 × 10^−8, row 3) were used as IVs; summary statistics for the exposure and outcome are indicated. MR analyses excluding the entire short arm of chromosome 6 and excluding only the extended HLA region (chr6: 27–34 Mb, left columns). Right columns show MR analyses using the same sets of SNPs excluding pleiotropic SNPs associated (p < 10^−5) with either CAD directly or CVD-related confounders, included on the Phenoscanner platform. To validate the robustness of our estimated associations by satisfying stringent requirement for IV selection, we carried out two-sample MR analyses using multi-ancestral, non-HLA SNPs strongly associated (p < 5 × 10^−8) with SLE, excluding SNPs weakly associated (p < 10^−5) with CVD or confounders ([88]Table S2), followed by stringent linkage disequilibrium (LD) clumping to ensure IV independence[89]^37 (R^2 = 0.001, 100 kb window, 1000 Genomes EA reference population) ([90]Figure 2A). SLE GWAS summary statistics were used for exposure,[91]^27 and multiple CAD GWASs were used for outcome (GCST005194,[92]^32 CAD-a and GCST005195,[93]^32 CAD-b). Since CAD is causative of myocardial infarction (MI) and atherosclerosis is common to CAD and ischemic stroke (IS), MR was carried out using summary statistics for two additional MI GWASs (GCST003117,[94]^38 MI-a and GCST011365,[95]^39 MI-b) and IS (GCST006906[96]^40). Summary statistics for cardiomyopathy from the FinnGen biobank analysis (finn-b-I9_CARDMYO, CM) and atrial fibrillation (GCST006414,[97]^41 AFib), which are not associated with atherosclerosis or CAD, were also included for comparison. After LD clumping, 60 independent SNPs were included in the SLE GWAS and then harmonized with each outcome-GWAS pair before use as IVs for SLE exposure ([98]Table S9). Between 43 and 56 harmonized IVs for SLE exposure were then tested using 16 MR methods, some of which account for additional IV invalidity, pleiotropy, or heterogeneity, to estimate causal relationships with the various atherosclerotic and cardiac conditions. The majority of MR methods resulted in significant (p < 0.05) positive causal estimates of SLE-associated variants on CAD-a and both MI GWASs, but not for cardiomyopathy or AFib ([99]Figures 2B and [100]S2A; [101]Table S3). Directional pleiotropy was only detected between the SLE and CAD-b GWASs by the MR-Egger intercept test ([102]Figure S2A), indicating potential bias in the causal estimates based on effect sizes using these summary statistics. Overall, inverse variance weighting (IVW), weighted median, penalized weighted median, maximum likelihood, robust adjusted profile score (RAPS), and pleiotropy residual sum and outlier (PRESSO) were significant in four out of five outcome GWASs ([103]Figure 2B). These results establish a positive causal effect of SLE on CAD and suggest that the increased CVD risk associated with SLE is likely to involve atherosclerosis rather than other aspects of cardiac pathology. Figure 2. [104]Figure 2 [105]Open in a new tab MR demonstrates a net positive causal effect of SLE-associated non-HLA SNPs on CAD (A) MR diagram for testing the causal effects of SLE on CAD with respect to instrument relevance to the exposure, exclusion from the outcomes (i.e., CAD, MI, IS), and independence from confounding factors. LD clumping (R^2 < 0.001) was used to obtain independent IVs. (B) Forest plots of MR causal estimates (β ± standard error) for SLE on CAD (CAD-a, CAD-b), MI (MI-a, MI-b), IS, cardiomyopathy (CM), and atrial fibrillation (AFib) GWAS using 16 MR methods. Missing PRESSO-OC (outlier correction) estimates indicate insignificant global tests for horizontal pleiotropy. For results, gray indicates insignificant (p > 0.05), red indicates positive (p < 0.05), and blue indicates negative (p < 0.05) causal estimates determined by each MR method. Numbers within forest plots indicate the SNPs used as IVs after harmonization. NOME, no measurement error. (C) Application of S-LDSC using summary statistics for SLE, CVD, and CAD GWAS to estimate the heritability (coefficient ± standard error) of the 284 SNP-predicted genes (top) and 160 SNP-predicted proteins from STRINGdb (bottom). Bar color indicates coefficient significance. (D) Cluster metastructures for the 160 putative protein-coding genes are based on PPI networks. Functional and cell-type enrichments for each cluster were determined using BIG-C (black labels) and I-scope (red labels), respectively. Black labels over colored shadings represent shared functional annotations for the clusters they surround. To eliminate the possibility that the positive causal estimate of SLE on CAD is bidirectional and therefore unlikely to represent a true causal relationship, MR was also carried out in the reverse direction, with CAD or MI as exposure and SLE as the outcome. Importantly, none of the 14 methods yielded a significant positive causal estimate of CAD or MI on SLE. Of interest, however, significant (p < 0.05) negative causal estimates of CAD and MI on SLE were observed for approximately half of the 14 MR methods tested ([106]Figure S2B and [107]Table S3). To understand the pathways underlying the positive causal estimates of SLE on CAD in greater detail, all SLE-associated SNPs included as putative IVs before harmonization with each GWAS were mapped to genes. Consistent with satisfying the exclusion restriction criteria and independence assumption with respect to traits imposing significant CVD risk, S-LDSC results demonstrated that the 284 genes and 160 predicted proteins captured a significant portion of SLE heritability (p = 3.46 × 10^−5 and 3 × 10^−5, respectively), but not that of CVD or CAD ([108]Figure 2C). Proteins predicted from the SLE IVs were then integrated into connectivity networks in STRINGdb ([109]Figure 2D). Cluster annotations were dominated by processes commonly dysregulated in SLE as expected, including canonical pathways for “Systemic lupus erythematosus in B cell signaling,” “Th1 pathway,” and “Th2 pathway,” as well as gene ontology (GO) terms for “Regulation of immune response” (GO:0050776) and “Negative regulation of B cell activation” (GO:0050869) ([110]Table S4). Interestingly, disease associations were enriched in various autoimmune diseases (“Lupus erythematosus, systemic,” “Aicardi-Goutières syndrome,” and “Hashimoto’s disease”), along with cardiovascular dysfunction, such as “Arterial embolism and thrombosis,” “Hypertension,” “Plaque, atherosclerotic,” and “Ischemic heart disease” ([111]Table S4). Single-SNP MR identifies gene networks implicated by SLE-associated variants with positive and negative causal estimates on CAD We next employed single-SNP MR (SSMR) to identify specific SLE-associated variants with positive or negative estimates on CAD. SSMR applied to SLE-associated SNPs, including those in the HLA region, reveal that the majority of negative causal SNPs are located on the short arm of chromosome 6; all but one were tightly packed around the HLA region, spanning chr6:28,014,374–33,683,352 ([112]Figure S3). When excluding the short arm of chromosome 6 and SNPs associated with CVD or confounders, SSMR identified 80 and 96 SLE-associated variants with significant (p < 0.05) positive ([113]Figure 3A, top 25) and negative ([114]Figure 3B, top 25) causal estimates on CAD, respectively ([115]Figure S4 and [116]Table S3). The majority of positive causal SNPs were distributed on chromosomes 1, 2, and 4, whereas over 50% of negative causal SNPs were on chromosomes 7, 11, and 17 ([117]Figures 3C and 3E). Figure 3. [118]Figure 3 [119]Open in a new tab Analysis of SLE-associated SNP-predicted genes with causal effects on CAD by single-SNP MR (A and B) Forest plots (β ± standard error) of the top 25 (by absolute value of causal estimates) positive (A) and negative (B) causal non-HLA SNPs identified by single-SNP MR (SSMR) using the Wald ratio method. (C and E) Pie charts illustrating the chromosomal distribution of 80 positive (C) and 96 negative (E) causal SLE SNPs on CAD. (D and F) Cluster metastructures for the 200 (D) and 184 (F) predicted genes from positive and negative causal SNPs identified by single-SNP MR. Functional and cell-type enrichments for each cluster were determined using BIG-C (black labels) and I-scope (red labels), respectively. Bold black labels over colored shadings represent shared functional annotations for the clusters they surround. (G and H) S-LDSC using summary statistics for SLE, CVD, and CAD GWAS to estimate the heritability (coefficient ± standard error) of genes (open bars) and SNP-predicted proteins (hashed bars) predicted by positive (G) and negative (H) causal SNPs determined by SSMR. Bar color indicates coefficient significance. Non-HLA SLE variants with either significant positive or negative causal estimates on CAD were separately mapped to 236 ([120]Figure 3D) and 244 ([121]Figure 3F) predicted genes, respectively, for clustering and pathway analysis. Positive SNP-predicted gene clusters were enriched in the canonical pathway for “Antigen presentation” and functional categories for major histocompatibility complex (MHC) class I, as well as epigenetic processes, transcription, and endocytosis ([122]Figure 3E and [123]Table S5), whereas negative SNP-predicted gene clusters were dominated by processes related to cell death (“Pyroptosis” [GO:0070269], cluster 2; “Regulation of oxidative stress-induced cell death” [GO:1903201], cluster 6) and protein degradation (“Proteasome,” cluster 6 and “Autophagy,” cluster 8; [124]Figure 3F and [125]Table S5). Finally, gene sets predicted by both positive and negative causal SNPs captured a significant portion of SLE heritability but not that of CAD or CVD, consistent with their selection as IVs for SLE ([126]Figures 3G and 3H). Pathway analysis of HLA-region variants associated with SLE risk and protective of CAD Risk haplotypes in the HLA region heavily contribute to susceptibility for SLE[127]^42 and CAD.[128]^43 However, accurate genotyping of HLA alleles and corresponding GWAS effect size estimates are notoriously unreliable.[129]^44 Additionally, the complex genetic architecture of this region makes mapping HLA variants to genes especially challenging given the extensive LD and high density of genes in this region. Nonetheless, an examination of the HLA area (chr6:28.5–33.5 Mb) revealed 30 SNPs significantly (p < 10^−6) associated with both SLE and CAD in their respective GWAS. While these SNPs are not independently associated variants, all 30 SNPs had positive effect sizes for SLE but were negative for CAD ([130]Figure S5A), possibly reflecting the extensive LD in this region. Connectivity mapping and clustering of the 69 protein-encoding genes predicted from these 30 SNPs revealed six distinct clusters dominated by processes dysregulated in SLE, including the functional categories for MHC class I and MHC class II in clusters 1 and 6, along with canonical pathways for “Th1 and Th2 activation,” “B cell development,” and “Notch signaling”, as well as GO term for “Interferon-gamma mediated signaling pathway” (GO:0060334) ([131]Figures S5B and S5C). Other pathways of interest involving “Complement system abnormalities,” “LXR/RXR activation,” and “21-hydroxylase deficiency” were predicted by cluster 3 ([132]Figure S5C). PPI-based MR predicts specific sets of SLE-associated variants and gene pathways causal of CAD To obtain a more comprehensive view of the possible impact of SLE-derived molecular pathways on atherosclerosis, we mapped SLE-associated, non-HLA Immunochip SNPs with net positive causal estimates on CAD by MR to genes and pathways regardless of their associations with CVD-related traits. In total, 838 SNPs predicted 2,336 putative genes and 1,501 proteins that collectively captured a significant amount of SLE, but not CAD or CVD, heritability ([133]Figure 4A); these 1,501 proteins clustered into 46 distinct clusters based on PPI connectivity ([134]Figure 4B). We then grouped SLE-associated SNP mapping to genes in each of the 46 PPI-based clusters for use as SLE IVs to estimate cluster-specific associations with atherosclerotic traits. Initial application of MR-IVW to these 46 subsets of SLE SNP-derived IVs yielded 16 and 9 significant (p < 0.05) positive and negative causal estimates, respectively ([135]Figures 4B, 4C, and [136]S6A) for CAD. Additional MR methods, including mode- and median-based methods, MR-Egger, MR-RAPS, MR-PRESSO, and maximum likelihood, were carried out for further validation of the PPI-based MR-IVW causal estimates ([137]Table S3). Clusters were grouped into tiers with respect to consistency across the various MR methods, with tier 1 clusters yielding significant positive or negative causal estimates by almost all (at least 14/16) MR methods and tier 2 clusters yielding significant positive or negative causal estimates by MR-IVW or at least seven MR methods ([138]Figures 4B and 4C). Finally, when examined individually, 20 of the 46 clusters specifically captured SLE heritability by PPI-based S-LDSC, many with significant causal estimates on CAD by PPI-based MR ([139]Figure 4D and [140]Table S6). Figure 4. [141]Figure 4 [142]Open in a new tab SLE-derived gene network with causal implications for CAD and PPI-based MR (A) S-LDSC using summary statistics for SLE, CVD, and CAD GWAS to estimate the heritability (coefficient ± standard error) of the 2,336 genes (open bars) and 1,501 proteins (hashed bars) predicted by 838 Immunochip SNPs associated with SLE. Bar color indicates coefficient significance. (B) Functional and cell-type enrichments for cluster metastructures were determined using BIG-C (black labels) and I-scope (red labels), respectively. Bold black labels over colored shadings represent shared BIG-C functional annotations for the clusters they surround. Node size is proportional to the number of SNPs (height) mapping to the genes in each cluster (width). For node color, red and blue indicate significant positive or negative estimates, respectively for 14/16 MR methods used (tier 1); light red and light blue indicate significant positive or negative estimates by MR-IVW or at least 7/16 MR methods (tier 2); gray indicates insignificant. Thickness of the yellow border is roughly proportional to the negative log of the MR-IVW p value. Green border indicates clusters with −log(MR-IVW p value) > 3. (C) Forest plots from PPI-based MR showing estimates (β ± standard error) calculated by MR-IVW for select positive and negative clusters. (D) PPI-based S-LDSC (coefficient ± standard error) using GWAS summary statistics for SLE, CVD, and CAD. Bar color indicates coefficient significance. In an effort to support these results by expanding the size of the network, we added 914 multi-ancestral, non-HLA SNPs associated with SLE on the Phenoscanner database to the analysis. Overall, 1,708 unique SNPs predicted 3,272 putative genes and 1,972 proteins that collectively captured a significant amount of SLE heritability, but not that of CAD or CVD ([143]Figure 5A) and clustered into 67 distinct sets of protein-coding genes ([144]Figure 5B). PPI-based MR-IVW using these 67 clusters of SLE SNPs as IVs yielded 24 and 11 significant (p < 0.05) positive and negative causal estimates on CAD, respectively ([145]Figures 5C, 5D, and [146]S6B), many of which captured SLE heritability, but not that of CAD or CVD, by PPI-based S-LDSC ([147]Figure 5E and [148]Table S6). Figure 5. [149]Figure 5 [150]Open in a new tab Comprehensive PPI-based MR predicts sets of SLE-associated variants and pathways causal of CAD (A) S-LDSC using GWAS summary statistics for SLE, CVD, and CAD to estimate the heritability (coefficient ± standard error) of the 3,272 genes (open bars) and 1,972 protein-coding genes (hashed bars) predicted by 1,708 combined Immunochip- and Phenoscanner-derived SNPs. Bar color indicates coefficient significance. (B) Functional and cell-type enrichments for cluster metastructures were determined using BIG-C (black labels) and I-scope (red labels), respectively. Bold black labels over colored shadings represent shared BIG-C functional annotations for the clusters they surround. Node size is proportional to the number of SNPs (height) mapping to the genes in each cluster (width). For node color, red and blue indicate significant positive or negative estimates, respectively for 14/16 MR methods used (tier 1); light red and light blue indicate significant positive or negative estimates by MR-IVW or at least 7/16 MR methods (tier 2); purple indicates mixed estimates; gray indicates insignificant. Thickness of the yellow border is roughly proportional to the negative log of the MR-IVW p value. Green border indicates clusters with a negative log of the MR-IVW p value >3. (C and D) Forest plots from PPI-based MR showing estimates (β ± standard error) calculated by 16 MR methods for select positive (C) and negative (D) clusters. The number of SNPs used as IVs for each cluster are indicated in the plots. (E) PPI-based S-LDSC (coefficient ± standard error) using GWAS summary statistics for SLE, CVD, and CAD. Bar color indicates coefficient significance. To ensure that the majority of predicted causal clusters are not a result of random chance or multiple-hypothesis testing, simulations were carried out to estimate the false discovery rate. Results from these simulations, which account for both LD and pleiotropy, indicate that SLE-derived and PPI-clustered modules, as opposed to randomly generated SNP-to-gene modules, demonstrate a higher rate of significant causal estimates on CAD ([151]Figure S7). Furthermore, to assess the reproducibility of the cluster-specific causal estimates, PPI-based MR was repeated using CVD-related GWAS datasets on the MR-Base platform.[152]^45 The PPI-based MR-IVW causal estimates were highly consistent using summary statistics from two CAD and two MI GWAS on MR-Base, but not cardiomyopathy or AFib ([153]Table S7), suggesting that the stratified causal estimates on CAD are associated with the atherosclerotic component of CVD. Together, these results support the conclusion that the PPI-based MR results are atherosclerosis specific and unlikely to be trivial results of random chance or multiple-hypothesis testing. SLE-derived clusters in all positive and negative causal tiers were annotated using multiple functional and cellular composition tools ([154]Figure 5B and [155]Table S8). These results show that a wide range of SNP-predicted biological functions known to be involved in SLE pathogenesis have causal implications for CAD by MR, such as “Neutrophil degranulation” (clusters 2 and 43), “Th1 and Th2 activation” and/or “Th17 activation” (clusters 3, 5, 8, and 9), “Interferon signaling” (cluster 8), “Leukocyte extravasation signaling” (cluster 12), “Leukocyte transendothelial migration” (cluster 28), and “Leukocyte adhesion to endothelial cells” (cluster 2). In addition to immune-related pathways, many of these positive causal clusters were enriched in disease phenotypes associated with CVD, including “Th1 cell activation and proliferation in atherosclerosis” (cluster 9) and “Lipid and atherosclerosis” (cluster 12). Interestingly, several clusters were enriched in autonomic nervous system control related to cardiac function (“Cardiac muscle contraction,” clusters 13 and 33) or “Neuroinflammation” (cluster 60) ([156]Figure 5B and [157]Table S8). In contrast, SLE-derived clusters with negative causal estimates on CAD were enriched for oxidative stress (cluster 10), nitric oxide (clusters 24, 40, and 64), and high-density lipoprotein cholesterol (clusters 24 and 50) ([158]Figure 5B and [159]Table S8). Pathway enrichment was further reflected in assigned functional categories, with reactive oxygen species protection (clusters 24 and 45), nuclear receptor transcription (cluster 40), and ubiquitylation and SUMOylation (cluster 64) dominating clusters with protective estimates on CAD ([160]Figure 5B). These results are highly consistent with the enrichments associated with negative causal variants in our single-SNP MR and HLA-specific pathway analyses. PPI-based MR stratifies SNPs, genes, and networks underlying the positive and negative causal effects of SLE on CAD To further validate the causal effects of the 67 SNP-to-gene modules identified by PPI-based MR ([161]Figure 6A), we carried out additional MR analyses with respect to PPI-based MR cluster groupings after accounting for pleiotropy and LD. Causal estimates of SLE on CAD with IVs derived from clusters meeting the tier 1 or the tier 1 and 2 criteria, as well as those that surpassed the MR-IVW p value <0.00075 threshold, were universally more positive, significant, and consistent than those based on all SNPs ([162]Figures 6B and 6C; [163]Table S3). Similarly, negative causal estimates for SLE on CAD were obtained using IVs meeting the negative tier 1 and MR-IVW p value <0.00075 thresholds from the 67-cluster network. In contrast, IVs derived from clusters with insignificant causal estimates generally failed to reach significance in either direction. While these trends were observed using summary statistics from both the SLE GWAS and SLE Immunochip, causal estimates were more significant using the SLE Immunochip, consistent with its larger sample size. Importantly, these results demonstrate that PPI-based MR can be used to identify independent IVs satisfying MR assumptions that underlie both positive and negative causal effects of SLE on CAD. Figure 6. [164]Figure 6 [165]Open in a new tab PPI-based MR identifies SLE SNPs with positive and negative causal effects on CAD (A) Workflow depicting PPI-based MR. (B and C) PPI-based MR validation. Forest plots (β ± standard error) from 16 MR methods using summary statistics from the SLE GWAS (B) or SLE Immunochip (C) as the exposure and CAD GWAS as the outcome. SLE-associated non-HLA SNPs mapping to positive and negative clusters, separately (by tier) and together (“All SNPs”) were used as IVs after excluding CVD and confounder-associated SNPs followed by stringent LD clumping (R^2 = 0.001) and harmonization. The number of SNPs used as IVs for each SNP set are indicated in the plots. For results, gray indicates insignificant (p > 0.05), dark red indicates positive (p < 0.00075), red indicates positive (p < 0.05), dark blue indicates negative (p < 0.00075), and blue indicates negative (p < 0.05) by each MR method. Pathway analysis facilitates drug prediction Pathways associated with positive causal clusters were used to facilitate identification of new therapeutic interventions for managing the unique inflammatory environment contributing to CAD in SLE ([166]Figure 7A). Canonical pathways related to immune function in clusters 2, 3, 5, and 8 predicted drugs targeting T and B cells and inflammatory cytokines, including daratumumab (CD38), belimumab (TNFSF13), elotuzumab (SLAMF7), abatacept (CD80/86), iberdimide (IKZF1/IKZF3), and sarliumab (IL6R). Broader analysis of pathway categories also suggested the utility of targeting interferon signaling with anifrolumab (cluster 8), as well as antiplatelet/coagulant therapy to combat dyslipidemia (cluster 5).[167]^46 Additional noteworthy targets include PCSK9 (cluster 5), a protease involved in the degradation and recycling of the low-density lipoprotein (LDL) receptor targeted by alirocumab and evolocumab, and oxidated LDL molecules (cluster 5) targeted by orticumab ([168]Figure 7B). Figure 7. [169]Figure 7 [170]Open in a new tab Genes and molecular pathways associated with positive causal clusters identify therapeutic interventions for managing CAD in SLE (A) All tier 1 and a selection of tier 2 clusters were functionally annotated using BIG-C, IPA, and the EnrichR database. Select drugs acting on direct gene targets or on any of the associated pathways (italics) are listed. (B) Venn diagram summarizing therapies that might uniquely impact SLE or CAD and those that may target pathways common to both diseases. Discussion Although genetic association studies have been successful in mapping disease loci in both immune and cardiovascular diseases, the genetic and molecular basis for the increased CAD predisposition in SLE patients has remained largely unexplained. Considering the limited data on CAD in SLE, we developed an approach that utilized GWAS summary statistics for both diseases to identify and interpret various sets of SLE-associated variants with causal implications for CAD. New findings suggest that the causal relationship with SLE appears to be focused on the atherosclerotic process, evidenced by positive estimates with CAD, MI, and IS, but not other cardiac conditions such as cardiomyopathy or AFib. Furthermore, we developed and carried out PPI-based MR approach to identify specific sets of SLE variants mapping to biologically relevant gene sets with causal implications for CAD. By coupling various MR methods with network modeling and variant interpretation, we not only provided substantial evidence of shared genetic risk but also identified the putative molecular pathways involved in the development of CAD in SLE. Moreover, a number of the immune and inflammatory pathways identified in these analyses could well contribute to the pathogenesis of CAD even in the absence of SLE or other recognized autoimmune conditions. This points to the larger implication that CAD itself is a heterogeneous condition and subpopulations, such as those driven by SLE-associated processes, might require potentially distinct treatment strategies, at least partially motivated by unique genetic predispositions. Causal inference using traditional MR methods rely on strict assumptions for independent IVs; however, given the extensive pleiotropy underlying complex traits such as SLE and CVD, efforts to satisfy these assumptions can result in biasing the analyses by excluding previously established associations. Furthermore, the exclusion of SNPs associated with CVD-related traits results in the loss of relevant molecular information. While the use of SLE IVs that are also associated with CVD or confounders in traditional MR disqualifies the causal estimates from representing an effect on CAD directly through SLE, these SNPs can be just as important with respect to understanding the relevant biological pathways underlying CAD in SLE. Similarly, stringent LD clumping to obtain an independent set of IVs not only reduces the statistical power of MR [171]^47 but also can omit additional SNPs, genes, and pathways underlying CAD in SLE. Due to our rigorous efforts to satisfy the assumptions and account for LD in the traditional MR analyses while also employing numerous MR methods that account for IV invalidity, pleiotropy, or heterogeneity, these results may give overly conservative estimates of the causal effects and underlying mechanisms as a result of overpruning. To overcome these limitations of traditional MR, we developed and employed a PPI-based MR approach using networks comprehensively derived from large sets of SLE-associated SNPs, regardless of their associations with CVD-related traits. By generating cluster-specific associations between effect sizes on SLE and CAD, biologically relevant SNP-to-gene modules can be categorized as having shared (positive estimates) or opposing (negative estimates) effects on SLE and CAD. Traditional MR using independent, SLE-specific IVs mapping to positive and negative clusters separately, confirmed that the groups of causal clusters are representative of positive and negative causal effects on CAD through SLE, respectively. We believe that our PPI-based MR approach is particularly beneficial in cases when the exposure is complex and heterogeneous, such as SLE which embodies a diverse range of molecular and pathophysiological mechanisms that we expect to impose unique casual effects on CAD. Genetic variants are typically mapped to genes with respect to genomic location, identifying genes containing and/or nearby the SNPs of interest. Additionally, more recent advances have given rise to identification of trans-acting genomic regions that can epigenetically and/or transcriptionally influence genes at distant locations. This is especially important for complex, polygenic traits, such as SLE and CAD, of which most associated variants are non-coding. Here, we link SNPs to genes via amino acid changes in encoded proteins, proximity, expression quantitative trait loci predictions, and regulatory elements in an effort to be as comprehensive as possible. Our subsequent PPI-based clustering elucidated a broad range of biologically relevant molecular networks within the diverse set of implicated genes and importantly served to filter out noise. Furthermore, our PPI-based MR approach served to highlight SNP-to-gene modules contributing most to the causal effects of SLE on CAD. Together, these results demonstrate how SLE genetics can be used to identify both known and novel loci and pathways with causal implications for CAD. Numerous biologically relevant SNP-to-gene modules were determined to have positive causal effects on CAD through SLE by MR, spanning inflammatory factors, adaptive and innate immunity, intracellular signaling, cell differentiation, microRNA and mRNA processing, mitochondrial function, and more. A wide range of enrichments among positive causal clusters have been hypothesized and/or demonstrated to contribute to CVD in SLE patients, including glucocorticoids, neutrophil cell death (NETosis) and degranulation, tumor necrosis factor-like weak inducer of apoptosis (TWEAK) signaling, canonical and alternative complement pathways, Th1 differentiation, and lipid and lipoprotein metabolism, among others. Considering the drastically increased prevalence and mortality of CAD in SLE, the considerable portion of SLE-associated risk variants with negative causal effects on CAD was unexpected and suggested that numerous variants contributing to SLE have atheroprotective effects. Further SNP-to-gene mapping and detailed pathway analyses revealed that these variants are involved in various processes, predominantly related to oxidative stress and cholesterol homeostasis, whose atheroprotective effects have been found to be impaired in certain disease-related contexts such as SLE. For example, the enzyme responsible for maintaining cholesterol homeostasis though lipoprotein lipase synthesis, cholesterol 27-hydroxylase, has been shown to be decreased in human monocytes and aortic endothelial cells of SLE patients and is thought to impair the protective mechanism of efflux of cellular cholesterol.[172]^48 Cyp27a1 is the gene that encodes cholesterol 27-hydroxylase and is a liver X receptor (LXR) target activated by oxysterols as well as a target of retinoid X receptor (RXR) and peroxisome proliferator activated receptor (PPAR) in human macrophages.[173]^49 LXR activation has additional proatherogenic and atheroprotective effects, as LXR activation in the liver promotes atherosclerosis via excess lipogenesis whereas LXR activation in macrophages and dendritic cells has anti-inflammatory effects, linking lipid metabolism, immune cell function, and inflammation.[174]^50 Our approach also has the advantage of identifying “actionable” points of therapeutic intervention with the potential to impact the inflammatory environment associated with CAD in SLE. This is especially important given that CAD risk in SLE cannot be fully accounted for by the increased prevalence of traditional atherosclerotic risk factors. SLE subjects therefore may derive particular benefit from treatments that mitigate inflammatory intermediates such as type I interferons with anifrolumab. Our findings also highlight additional putative targets, including PCSK9 involved in LDL receptor recycling. Inhibitors of PCSK9 activity, such as alirocumab and evolocumab, are approved by the Food and Drug Administration to treat hyperlipidemia and may prove to be effective in controlling atherosclerosis in chronic inflammatory conditions.[175]^51 Finally, recent reports also support targeting oxidized LDL molecules (anti-oxLDL, orticumab) for the prevention of cardiovascular events in SLE.[176]^52 Limitations of the study Limitations of this study include those related to the data integrated in our pipeline. First, SLE genetic association studies have been restricted in size and scope, yielding limited power and genomic coverage, especially considering the extensive heterogeneity and polygenicity of lupus. To maximize both power and scope, we used the largest genetic association study for SLE, which is limited to Immunochip SNPs, the largest SLE GWAS, as well as SLE-associated SNPs pooled from the Phenoscanner platform. However, most genetic association studies, including the multi-ancestral data used in this study, are heavily biased toward European ancestries[177]^53. This is especially problematic given the increased CVD morbidity and mortality in SLE patients of African ancestry[178]^54 in addition to the ancestry-dependent disparities observed in both SLE and CAD. It is also of note that certain risk factors leading to distinct phenotypic outcomes such as CAD are likely to be impacted by environmental factors that cannot be accounted for by genetics alone. This is important with respect to the higher disease burden observed in patients of African ancestry, where barriers to treatment (such as delayed diagnosis and/or limited access to a specialist) may contribute to elevated mortality in this population and further underscores the importance of generating large datasets with diverse patient populations. In addition, the ability to map genetic variants to implicated genes is limited to known SNP-to-gene relationships included in Ensembl’s variant effect predictor (VEP), Geno-type-Tissue Expression (GTEx), and Human ACtive Enhancer to interpret Regulatory variants (HACER) databases. Although putative causal pathways associated with the HLA region are intriguing, mapping of the SNPs within the HLA region to genes is challenging because of the extensive LD across the region. Additionally, genes included in our PPI networks and clusters are limited to protein-coding genes and interactions included in STRINGdb. This is a potential shortcoming of our pipeline, especially considering the large number of non-coding genes implicated in our SNP-to-gene predictions in addition to the growing evidence highlighting the contributions of non-coding long RNAs and microRNAs in both SLE and CAD.[179]^55^,[180]^56 Similarly, the ability to annotate gene clusters functionally is limited and potentially biased by the data underlying the numerous enrichment platforms used in our pathway analyses. Ingenuity Pathway Analysis (IPA), EnrichR, which pools a myriad of public databases, and cell and functional analytic tools were all utilized to obtain orthogonal and reproducible annotations. Ultimately, however, our robust SNP-to-gene mapping approach, which included multiple sources of information in combination with biologically informed clustering employing numerous sources of annotation, enabled comprehensive analysis of both small and large sets of genetic variants to specific pathways with excellent reproducibility. In summary, we have employed various approaches to clearly identify shared genetic risk factors for SLE and CAD. These results have provided new information about common molecular pathways in SLE and CAD, as well as the genetic and molecular information needed to consider novel therapeutic interventions in these conditions. STAR★Methods Key resources table REAGENT or RESOURCE SOURCE IDENTIFIER Deposited data __________________________________________________________________ GTEX v.6.8 The Geno-type-Tissue Expression (GTEx) Project [181]GTEXportal.org Pre-processed summary statistics The Broad Institute [182]https://alkesgroup.broadinstitute.org/LDSCORE/all_sumstats/ UK Biobank Biobank UK [183]https://www.ukbiobank.ac.uk/ Cardiomyopathy GWAS FinnGen Biobank Finn-a-I9_CARDMYO; [184]https://www.finngen.fi/fi MR-Base Hemani et al., 2018[185]^45 [186]https://www.mrbase.org Phenoscanner Staley et al., 2016[187]^57; Kamat et al., 2019[188]^58 [189]www.phenoscanner.medschl.cam.ac.uk __________________________________________________________________ Software and algorithms __________________________________________________________________ Bioconductor (R) Open source [190]https://www.bioconductor.org Ingenuity pathway analysis (IPA) Qiagen [191]https://www.qiagenbioinformatics.com S-LDSC Finucane et al., 2015[192]^33 [193]https://github.com/bulik/ldsc LDSC Bulik-Sullivan et al., 2015[194]^59 [195]https://github.com/bulik/ldsc Cytoscape v.3.9.1 Shannon et al., 2003[196]^60 [197]https://cytoscape.org Clustermaker2 v.1.2.1 Morris et al., 2011[198]^61 [199]https://apps.cytoscape.org TwoSample MR Hemani et al., 2018[200]^45 [201]https://github.com/MRCIEU/TwoSampleMR PPI-based MR This manuscript [202]https://doi.org/10.6084/Lm9.figshare.21225251 __________________________________________________________________ Other __________________________________________________________________ Human Active Enhancers to interpret Regulatory variants (HACER) Wang et al., 2019[203]^62 [204]http://bioinfo.vanderbilt.edu/AE/HACER Variant Effect Predictor (VEP) McLaren et al., 2016[205]^63 [206]ensembl.org/info/docs/tools/vep Search Tool for the Retrieval of Interacting Genes/proteins (STRING) v. 11.0b Szklarczyk et al., 2021[207]^64 [208]https://string-db.org EnrichR Chen et al., 2013[209]^65 [210]https://maayanlab.cloud/Enrichr/ Search Tool for Interacting Chemicals (STITCH) Kuhn et al., 2009[211]^66 [212]http://stitch.embl.de Library of Integrated Network-based Cellular Signatures (LINCS) Keenan et al., 2018[213]^67 [214]http://www.lincs.hms.harvard.edu/db/ [215]Open in a new tab Resource availability Lead contact Further information and requests should be directed to the lead contact, Katherine A. Owen ([216]kate.owen@ampelbiosolutions.com). Materials availability This study did not generate new unique reagents. Data and code availability * • This paper analyzes existing, publicly available data. All GWAS and Immunochip studies are referenced. * • Original software code and documentation to conduct PPI-based MR analyses have been deposited on figshare ([217]www.figshare.com; [218]https://doi.org/10.6084/m9.figshare.21225251) and is publicly available as of the date of publication. * • Any additional information required to reanalyze the data reported in this report is available from the [219]lead contact upon request. Experimental models and subject details This study did not use any experimental models or enroll human subjects. Method details Identification of SLE- and CAD-associated SNPs and overlap SNPs associated with each disease were obtained from previous GWAS and Immunochip studies. For CAD, we used a comprehensive multi-ancestral meta-analysis of GWAS.[220]^32 For SLE, we included results of multiple GWAS and Immunochip studies to account for as many ancestries as possible.[221]^27^,[222]^28^,[223]^29^,[224]^30^,[225]^31 In total, 7,222 and 16,163 unique SNPs were significantly (p < 10^−6) associated with SLE and CAD, respectively, and were employed in these studies. A full list of the SNPs, chromosome locations, positions and sources used are detailed in [226]Table S9. Identification of SNP-predicted genes Expression quantitative trait loci (eQTLs) were identified using GTEx[227]^68 version 6.8 ([228]GTEXportal.org) and mapped to their associated eQTL expression genes (E-Genes). To find SNPs in enhancers and promoters, and their associated transcription factors and downstream target genes (T- Genes), we queried the atlas of Human Active Enhancers to interpret Regulatory variants[229]^62 (HACER, [230]http://bioinfo.vanderbilt.edu/AE/HACER). To find SNPs in exons of protein-coding genes (C-Genes) and include proximal genes (P-Genes, within 5kb), we queried the human Ensembl genome browser’s variant effect predictor[231]^63 (VEP, ensembl.org/info/docs/tools/vep, GRCh38.p12). Network analysis and visualization Protein-protein interaction (PPI) networks of SNP-predicted protein-coding genes were generated by STRING[232]^64 ([233]https://string-db.org, version 11.0b), and resulting networks were imported into Cytoscape[234]^60 (version 3.6.1) for visualization and partitioned with MCODE via the clusterMaker2[235]^61 (version 1.2.1) plugin. Metastructures are based on PPI networks. For all metastructures, node gradient shading is proportional to intra-cluster connectivity, cluster size indicates number of genes per cluster and edge weight indicates inter-cluster connections. Functional gene set analysis Predicted genes were examined using Biologically Informed Gene Clustering (BIG-C; version 4.4.). BIG-C is a custom functional clustering tool developed to annotate the biological meaning of large lists of genes and has been previously described.[236]^69^,[237]^70^,[238]^71 I-Scope is a custom clustering tool used to identify immune cell types in large gene datasets.[239]^72 The Ingenuity Pathway Analysis (IPA; [240]https://www.qiagenbioinformatics.com) platform and EnrichR[241]^65 ([242]https://maayanlab.cloud/Enrichr/) web server provided additional molecular pathway enrichment analysis. Drug candidate identification Drug candidates were identified using LINCS,[243]^67 STITCH[244]^66 (v5.0), IPA and literature mining. Each of the database tools includes either a programmatic method of matching existing therapeutics to their targets or else is a list of drugs and targets for achieving the same end. Quantification and statistical anaysis Linkage disequilibrium score regression (LDSC) genetic correlations LDSC[245]^59 was used to estimate genome-wide genetic correlations between traits using GWAS summary statistics. Pre-processed summary statistics from SLE, CAD and CVD GWAS were obtained from the Broad webpage ([246]https://alkesgroup.broadinstitute.org/LDSCORE/all_sumstats/). Using the LDSC software provided on github ([247]https://github.com/bulik/ldsc) and reference data on the Broad webpage ([248]https://alkesgroup.broadinstitute.org/LDSCORE/), including European LD scores 'eur_w_ld_chr' or 'weights_hm3_no_hla' as weights for analyses excluding the HLA region. Using standard parameters, the "ldsc.py" (with the "--rg" flag) script was used to generate genome-wide genetic correlation estimates between SLE and CVD or CAD. Stratified linkage disequilibrium score regression (S-LDSC) S-LDSC[249]^33 was used to obtain gene-set specific disease-heritability estimates using GWAS summary statistics. Pre-processed summary statistics from SLE, CAD and CVD GWAS were obtained from Broad webpage ([250]https://alkesgroup.broadinstitute.org/LDSCORE/all_sumstats/). Using the S-LDSC software provided on github ([251]https://github.com/bulik/ldsc) and reference data on the Broad webpage ([252]https://alkesgroup.broadinstitute.org/LDSCORE/), annotation and LD score files were generated for each SNP-predicted gene- and protein-set, separately. Using standard parameters, the “make_annot.py” and “ldsc.py” (with the “--l2” flag) scripts were first used to generate the gene-set-specific annotation and LD files, then the “ldsc.py” (with the “--h2-cts” flag) script was used to generate stratified heritability scores for each GWAS. Selection of valid, independent instrumental variables for traditional MR analysis Traditional MR methods, such as MR-IVW, operate under three strict assumptions for instrumental variable (IV) validity: 1) the relevance assumption, 2) the exclusion restriction criteria assumption, and 3) the independence assumption. To satisfy the relevance assumption, SNPs significantly (genome-wide significance p value < 5 × 10^−8) associated with SLE[253]^27^,[254]^28^,[255]^29^,[256]^73^,[257]^74^,[258]^75^,[259]^76 ^,[260]^77^,[261]^78^,[262]^79^,[263]^80^,[264]^81^,[265]^82^,[266]^83^ ,[267]^84^,[268]^85^,[269]^86^,[270]^87 were obtained from the Phenoscanner database ([271]www.phenoscanner.medschl.cam.ac.uk)(82,83) ([272]Table S9). To satisfy the exclusion restriction criteria and independence assumptions, 89,336 SNPs weakly associated (p value < 1 × 10^−5) with CVD and confounders including cholesterol, obesity, blood pressure, insulin resistance, smoking, age-related diseases, and many more, were excluded from being IVs for SLE-exposure (see [273]Table S2 for the full list of excluded traits). HLA-region SNPs were conservatively removed from MR analyses by excluding the short-arm of chromosome 6. Stringent LD clumping[274]^37 was employed using the clump data (R^2 = 0.001, 100kb window, 1000G EA reference population) function to generate an independent set of 60 SLE-IVs harmonized for each GWAS. Mendelian randomization (MR) MR was used to test for causal relationships between SLE and CAD using the MR-Base[275]^45 ([276]https://www.mrbase.org) TwoSampleMR[277]^45 package in R ([278]https://github.com/MRCIEU/TwoSampleMR). Various sets of SLE-associated genetic variants used as instrumental variables (IVs) and summary statistics for SLE-exposure were manually imported into R and summary statistics were carried out for MR-base compatibility using the ‘format data’ command. All effect sizes and standard errors were obtained from the exposure summary statistics used in each analysis, regardless of the study in which each IV was associated with the exposure. Given the availability of well-powered CAD/MI GWAS on MR-Base, IVs for CAD and MI were directly obtained from each exposure GWAS using the ‘extract instruments’ command for the bidirectional analyses. Data from the SLE and all CVD-related GWAS studies used in our MR analyses are publicly available and also accessible through the MR-Base software, which was used to obtain the outcome summary statistics via the ‘extract outcome data’ command. The ‘allele harmonization’ command was used to ensure the effect estimates of the exposure and outcome are based on matching alleles, excluding SNPs with completely mismatching alleles from the MR analysis or reversing the effect and non-effect alleles along with the effect estimates when applicable. Because of the allele harmonization step and because some SNPs are absent from the available summary statistics, a small proportion of SNPs used as IVs are absent from the final MR calculations. Up to sixteen individual MR methods were carried out through the TwoSampleMR package, including inverse variance weighted (IVW), simple mode, weighted mode, simple median, weighted median (WMedian), MR-Egger, MR-PRESSO (raw and outlier-corrected), MR-RAPS, and two sample maximum likelihood (ML). The ‘MR report’ function was used to generate a summary containing heterogeneity and directional pleiotropy tests and scatterplots ([279]Figure S2). MR-IVW and MR-Egger heterogeneity test results (Q-value) indicate whether significant heterogeneity was detected, which does not necessarily indicate biased causal estimates. MR-Egger intercept indicate whether significant directional horizontal pleiotropy was detected, which usually indicates biased causal estimates. For single-SNP MR, the ‘MR single-SNP’ function was also carried out using the Wald Ratio method. Full details of all MR results are included in [280]Table S2 and a summary of all the main findings are included in [281]Table S10. PPI-based MR SLE-associated variants from the Immunochip[282]^31 and Phenoscanner database[283]^57^,[284]^58 were linked to their most likely genes, and the genes used to generate PPI-informed gene clusters. The SLE-associated SNPs mapping to genes in each of PPI-based clusters were then extracted to “reverse engineer” subsets of SNPs that could be used separately as SLE-IVs for MR to independently estimate the causal effects of each PPI-informed SNP-to-Gene module on CAD. Up to sixteen MR methods were carried out for each SNP-to-gene module through the TwoSampleMR package. In additional analyses (related to [285]Figure 6) using the combined Immunochip and Phenoscanner SNP dataset, filtering eliminated SNPs weakly associated (p value < 1 × 10^-5) with CVD and confounders including cholesterol, obesity, blood pressure, insulin resistance, smoking, age-related diseases, and many more ([286]Table S2). HLA-region SNPs were conservatively removed from MR analyses by excluding the short-arm of chromosome 6. Stringent LD clumping[287]^37 was employed using the clump_data (R^2 = 0.001, 100kb window, 1000G EA reference population) function to generate an independent set of SLE-IVs. Various analyses were performed using independent, valid IVs derived from ‘All SNPs’, SNPs mapping to ‘Insignificant’ clusters, ‘Positive Tier 1’ clusters, ‘Positive Tier 1 and Tier 2’ clusters, ‘Positive MR-IVW p value<0.00075’ clusters, ‘Negative MR-IVW p value<0.00075’ clusters, ‘Negative Tier 1’ clusters, and ‘Negative Tier 1 and Tier 2’ clusters. Monte Carlo simulations for expected MR results using random sets of Immunochip-derived SNP-to-gene modules Monte Carlo simulations were implemented and performed to estimate the false discovery rate with respect to significant PPI-based MR causal estimates. 120,026 Immunochip SNPs included in the SLE summary statistics were mapped to putative genes using the VEP, including regulatory effects, to generate an Immunochip SNP-to-Gene library with 67,211 unique SNPs mapping to 7,602 STRINGdb proteins. In each simulation, a random set of 3–152 SNP-predicted proteins were selected from the 7,602 proteins and used to extract up to 400 Immunochip SNPs. MR-IVW was then performed for SLE on CAD using harmonized, non-HLA SNPs (via removal of the entire short-arm of chromosome 6) from the simulated set of Immunochip SNPs as IVs. By using our Immunochip derived SNP-to-Gene dictionary for random selection of protein clusters and associated SNPs to generate random sets of IVs, our simulations account for both a high degree of LD and pleiotropy, especially considering the major influence of loci associated with diabetes in development of the Immunochip. Acknowledgments