Graphical abstract

   graphic file with name fx1.jpg
   [33]Open in a new tab

Highlights

     * •
       Genetic predisposition to SLE confers risk of CAD
     * •
       Increased CVD risk in SLE involves atherosclerosis rather than
       cardiac dysfunction
     * •
       PPI-based MR identifies pathways with causal implications between
       SLE and CAD
     * •
       Pathway analysis informs therapeutic selection for managing CAD in
       SLE
     __________________________________________________________________

   Kain et al. report a positive genetic relationship between SLE and CAD
   using traditional and comprehensive Mendelian randomization (MR)
   approaches. MR analysis coupled with network modeling highlights the
   shared genetic risk underlying the two diseases and identifies putative
   molecular pathways involved in the development of CAD in SLE.

Introduction

   Systemic lupus erythematosus (SLE) is a female-predominant autoimmune
   disease characterized by immune dysregulation and multi-organ
   inflammation that is frequently associated with the development of
   cardiovascular disease (CVD).[34]^1^,[35]^2 SLE exhibits hyperactivity
   of the innate and adaptive immune systems, increased production of
   numerous autoantibodies, and disturbed cytokine balance.[36]^3 Although
   CVD is not a diagnostic criterion of SLE and was not included in the
   original descriptions of the disease, it is currently the main cause of
   death in SLE,[37]^4^,[38]^5^,[39]^6 with coronary artery disease (CAD)
   directly responsible for one-third to one-half of all CVD cases and 30%
   of deaths.[40]^7^,[41]^8^,[42]^9 Notably, whereas mortality from
   infections and active disease has decreased in SLE patients,
   CVD-related death rates have not improved,[43]^10 and the standardized
   mortality ratio related to CVD has actually increased.[44]^11 Women
   with SLE have a significantly increased risk of stroke and myocardial
   infarction along with elevated incidence of asymptomatic
   atherosclerosis compared with the general population.[45]^12^,[46]^13
   Furthermore, traditional CVD risk factors, such as cholesterol, blood
   pressure, and smoking status, fail to fully account for the overall
   higher risk of acute CVD events in SLE, although the underlying
   mechanisms remain unknown.[47]^14^,[48]^15^,[49]^16^,[50]^17 This lack
   of an understanding for the increased risk of CVD in SLE has resulted
   in limited treatment options and the puzzling juxtaposition that
   despite the efficacy of statins and angiotensin-converting enzyme
   inhibitors/angiotensin receptor blockers in treating the general
   population, they appear to have little effect on CVD outcomes in SLE
   patients.[51]^5^,[52]^18 As a result, even though SLE has a prevalence
   of only about 70 per 100,000, it ranks among the leading causes of
   death in young women,[53]^1 despite the omission of lupus diagnoses in
   almost half of SLE patients’ death certificates.[54]^19^,[55]^20

   Genetic predisposition imposes important risk factors for both SLE and
   CVD.[56]^21^,[57]^22^,[58]^23 To date, genetic association studies of
   SLE patients with and without CVD have been limited in size and have
   detected only a few common genetic risk loci, including IRF8, STAT4,
   IL19, and SRP54-AS1.[59]^22^,[60]^24^,[61]^25 Mendelian randomization
   (MR) is a causal inference method using genotypes as “treatments” when
   randomized controlled trials are not feasible. By measuring and
   correlating the effect sizes of exposure-associated genetic variants in
   large-scale genetic association studies on traits of interest, a causal
   effect of the exposure on the outcome can be estimated. Here, we report
   the application of multiple, complementary MR methods to identify
   causal paths from SLE-associated variants to CAD using summary
   statistics from genetic association studies. Using multiple MR
   algorithms, we have identified large sets of SLE causal variants that
   also impart genetic risk for CAD as well as those that appear to
   diminish the risk of CAD. Using innovative approaches to build
   molecular pathways from genetic risk factors,[62]^26 we have developed
   a map of SLE-derived biological processes with causal implications for
   CAD that may account for the genetic basis of the association between
   these two apparently dissimilar clinical entities and may also provide
   insights into the shared mechanisms underlying each. Understanding the
   pathogenesis of genetic variants underlying the increased CAD risk in
   SLE can ultimately provide insight into the immune and inflammatory
   components of atherosclerosis, as well as revealing opportunities for
   targeted therapeutics.

Results

Pathway analysis reveals gene networks implicated by genetic variants
associated with both SLE and CAD

   To explore the shared genetic predispositions for SLE and CAD, we first
   identified single nucleotide polymorphisms (SNPs) associated with each
   trait in five SLE and one CAD multi-ancestral genetic association
   study.[63]^27^,[64]^28^,[65]^29^,[66]^30^,[67]^31^,[68]^32 In total, 96
   SNPs were associated with both conditions ([69]Figure S1A). Notably,
   the majority of the overlapping SNPs mapped to the human leukocyte
   antigen (HLA) region of chromosome 6. To identify putative gene(s)
   influenced by each of the 96 SNPs associated with both SLE and CAD, we
   mapped causal SNPs to genes,[70]^26 identifying 189 unique genes
   encoding 135 proteins in STRINGdb ([71]Figure S1B). Stratified linkage
   disequilibrium score regression (S-LDSC) was then used to validate the
   biological relevance of SNP-predicted gene and protein sets by
   assessing whether they captured more disease heritability than expected
   by chance with respect to all genes and STRINGdb proteins,
   respectively.[72]^33 Application of S-LDSC using genome-wide
   association study (GWAS) summary statistics for SLE
   (GCST003155[73]^27), CVD (GCST004280[74]^34), and two CAD datasets
   (CAD-I, GCST000998[75]^35 and CAD-II, GCST001479[76]^36) determined
   that the 189 genes predicted by the 96 overlapping SNPs were
   significantly (p < 0.05) enriched for genomic regions capturing the
   genetic heritability of CAD and CVD ([77]Figure S1C). Nearly identical
   results were also obtained using the smaller subset of 135
   protein-coding genes ([78]Figure S1C). However, application of standard
   LDSC, which was restricted to the use of the relatively small,
   European-only SLE GWAS,[79]^27 did not reveal a significant level of
   genetic correlation between the diseases ([80]Figure S1D).

   To assess molecular networks encoded by the set of 135 protein-encoding
   genes predicted from the overlapping SLE/CAD SNPs, a protein-protein
   interaction (PPI) network was generated, and unsupervised clustering
   revealed 12 distinct gene clusters that were functionally enriched in a
   diverse range of immunological and cellular categories
   ([81]Figure S1E), many with relevance to SLE and CAD/CVD, including
   cluster 1 characterized by canonical pathways for “Antigen presentation
   pathway” and “B cell development,” along with “Sudden cardiac death,”
   and cluster 3 enriched in “Atherosclerosis signaling” and “Lupus
   erythematosus, systemic,” among others ([82]Table S1). Although the
   molecular pathways associated with SNP-predicted genes suggested a
   convergence of biological processes underlying SLE and CAD, it remained
   uncertain whether the finding of overlapping SNPs implied shared
   genetic causation. The subsequent studies presented here explore this
   in detail.

MR estimates a positive correlation between effects of SLE-associated non-HLA
variants on SLE and CAD

   MR methods were employed to estimate the association between effect
   sizes of relevant variants on SLE and CAD. We first applied six MR
   methods using various sets of SLE-associated instrumental variables
   (IVs) to determine whether they tend to confer shared (positive)
   effects on SLE and CAD, noting that this initial approach did not
   satisfy all assumptions for IV validity or IV independence and
   therefore could only provide an estimated association ([83]Figure 1A).
   Initial exploratory analyses employing IVs derived from the Immunochip
   and GWASs suggested a net positive association for non-HLA
   SLE-associated SNPs on CAD ([84]Figures 1B and 1C). Even when using SLE
   IVs determined with the more stringent significance level and removal
   of known pleiotropic associations to CVD or confounders such as
   cholesterol, obesity, blood pressure, insulin resistance, and smoking,
   the indication of a positive causal relationship between SLE and CAD
   remained ([85]Figure 1C, bottom row).

Figure 1.

   [86]Figure 1
   [87]Open in a new tab

   MR demonstrates a positive association of effect sizes of
   SLE-associated non-HLA SNPs on SLE and CAD

   (A) Graphical depiction of the two-stage approach for an initial
   exploratory analysis using expanded groups of SNPs as IVs followed by a
   confirmatory analysis using highly curated IVs.

   (B and C) Forest plots of six MR causal estimates (β ± standard error).
   For results, gray indicates insignificant (p > 0.05) and red indicates
   positive causal estimates determined by each MR method. (B)
   Immunochip-derived SLE-associated non-HLA SNPs were used as IVs for
   SLE; summary statistics from both the SLE Immunochip study (left) and
   SLE GWAS (right) were used for the exposure; summary statistics from
   the CAD GWAS were used for the outcome. (C) Additional MR analyses
   using SNPs associated (p < 10^−6) with SLE in the Immunochip and GWAS
   study (rows 1 and 2) or Phenoscanner reaching genome-wide significance
   (p < 5 × 10^−8, row 3) were used as IVs; summary statistics for the
   exposure and outcome are indicated. MR analyses excluding the entire
   short arm of chromosome 6 and excluding only the extended HLA region
   (chr6: 27–34 Mb, left columns). Right columns show MR analyses using
   the same sets of SNPs excluding pleiotropic SNPs associated (p < 10^−5)
   with either CAD directly or CVD-related confounders, included on the
   Phenoscanner platform.

   To validate the robustness of our estimated associations by satisfying
   stringent requirement for IV selection, we carried out two-sample MR
   analyses using multi-ancestral, non-HLA SNPs strongly associated
   (p < 5 × 10^−8) with SLE, excluding SNPs weakly associated (p < 10^−5)
   with CVD or confounders ([88]Table S2), followed by stringent linkage
   disequilibrium (LD) clumping to ensure IV independence[89]^37 (R^2 =
   0.001, 100 kb window, 1000 Genomes EA reference population)
   ([90]Figure 2A). SLE GWAS summary statistics were used for
   exposure,[91]^27 and multiple CAD GWASs were used for outcome
   (GCST005194,[92]^32 CAD-a and GCST005195,[93]^32 CAD-b). Since CAD is
   causative of myocardial infarction (MI) and atherosclerosis is common
   to CAD and ischemic stroke (IS), MR was carried out using summary
   statistics for two additional MI GWASs (GCST003117,[94]^38 MI-a and
   GCST011365,[95]^39 MI-b) and IS (GCST006906[96]^40). Summary statistics
   for cardiomyopathy from the FinnGen biobank analysis
   (finn-b-I9_CARDMYO, CM) and atrial fibrillation (GCST006414,[97]^41
   AFib), which are not associated with atherosclerosis or CAD, were also
   included for comparison. After LD clumping, 60 independent SNPs were
   included in the SLE GWAS and then harmonized with each outcome-GWAS
   pair before use as IVs for SLE exposure ([98]Table S9). Between 43 and
   56 harmonized IVs for SLE exposure were then tested using 16 MR
   methods, some of which account for additional IV invalidity,
   pleiotropy, or heterogeneity, to estimate causal relationships with the
   various atherosclerotic and cardiac conditions. The majority of MR
   methods resulted in significant (p < 0.05) positive causal estimates of
   SLE-associated variants on CAD-a and both MI GWASs, but not for
   cardiomyopathy or AFib ([99]Figures 2B and [100]S2A; [101]Table S3).
   Directional pleiotropy was only detected between the SLE and CAD-b
   GWASs by the MR-Egger intercept test ([102]Figure S2A), indicating
   potential bias in the causal estimates based on effect sizes using
   these summary statistics. Overall, inverse variance weighting (IVW),
   weighted median, penalized weighted median, maximum likelihood, robust
   adjusted profile score (RAPS), and pleiotropy residual sum and outlier
   (PRESSO) were significant in four out of five outcome GWASs
   ([103]Figure 2B). These results establish a positive causal effect of
   SLE on CAD and suggest that the increased CVD risk associated with SLE
   is likely to involve atherosclerosis rather than other aspects of
   cardiac pathology.

Figure 2.

   [104]Figure 2
   [105]Open in a new tab

   MR demonstrates a net positive causal effect of SLE-associated non-HLA
   SNPs on CAD

   (A) MR diagram for testing the causal effects of SLE on CAD with
   respect to instrument relevance to the exposure, exclusion from the
   outcomes (i.e., CAD, MI, IS), and independence from confounding
   factors. LD clumping (R^2 < 0.001) was used to obtain independent IVs.

   (B) Forest plots of MR causal estimates (β ± standard error) for SLE on
   CAD (CAD-a, CAD-b), MI (MI-a, MI-b), IS, cardiomyopathy (CM), and
   atrial fibrillation (AFib) GWAS using 16 MR methods. Missing PRESSO-OC
   (outlier correction) estimates indicate insignificant global tests for
   horizontal pleiotropy. For results, gray indicates insignificant
   (p > 0.05), red indicates positive (p < 0.05), and blue indicates
   negative (p < 0.05) causal estimates determined by each MR method.
   Numbers within forest plots indicate the SNPs used as IVs after
   harmonization. NOME, no measurement error.

   (C) Application of S-LDSC using summary statistics for SLE, CVD, and
   CAD GWAS to estimate the heritability (coefficient ± standard error) of
   the 284 SNP-predicted genes (top) and 160 SNP-predicted proteins from
   STRINGdb (bottom). Bar color indicates coefficient significance.

   (D) Cluster metastructures for the 160 putative protein-coding genes
   are based on PPI networks. Functional and cell-type enrichments for
   each cluster were determined using BIG-C (black labels) and I-scope
   (red labels), respectively. Black labels over colored shadings
   represent shared functional annotations for the clusters they surround.

   To eliminate the possibility that the positive causal estimate of SLE
   on CAD is bidirectional and therefore unlikely to represent a true
   causal relationship, MR was also carried out in the reverse direction,
   with CAD or MI as exposure and SLE as the outcome. Importantly, none of
   the 14 methods yielded a significant positive causal estimate of CAD or
   MI on SLE. Of interest, however, significant (p < 0.05) negative causal
   estimates of CAD and MI on SLE were observed for approximately half of
   the 14 MR methods tested ([106]Figure S2B and [107]Table S3).

   To understand the pathways underlying the positive causal estimates of
   SLE on CAD in greater detail, all SLE-associated SNPs included as
   putative IVs before harmonization with each GWAS were mapped to genes.
   Consistent with satisfying the exclusion restriction criteria and
   independence assumption with respect to traits imposing significant CVD
   risk, S-LDSC results demonstrated that the 284 genes and 160 predicted
   proteins captured a significant portion of SLE heritability (p =
   3.46 × 10^−5 and 3 × 10^−5, respectively), but not that of CVD or CAD
   ([108]Figure 2C).

   Proteins predicted from the SLE IVs were then integrated into
   connectivity networks in STRINGdb ([109]Figure 2D). Cluster annotations
   were dominated by processes commonly dysregulated in SLE as expected,
   including canonical pathways for “Systemic lupus erythematosus in B
   cell signaling,” “Th1 pathway,” and “Th2 pathway,” as well as gene
   ontology (GO) terms for “Regulation of immune response” (GO:0050776)
   and “Negative regulation of B cell activation” (GO:0050869)
   ([110]Table S4). Interestingly, disease associations were enriched in
   various autoimmune diseases (“Lupus erythematosus, systemic,”
   “Aicardi-Goutières syndrome,” and “Hashimoto’s disease”), along with
   cardiovascular dysfunction, such as “Arterial embolism and thrombosis,”
   “Hypertension,” “Plaque, atherosclerotic,” and “Ischemic heart disease”
   ([111]Table S4).

Single-SNP MR identifies gene networks implicated by SLE-associated variants
with positive and negative causal estimates on CAD

   We next employed single-SNP MR (SSMR) to identify specific
   SLE-associated variants with positive or negative estimates on CAD.
   SSMR applied to SLE-associated SNPs, including those in the HLA region,
   reveal that the majority of negative causal SNPs are located on the
   short arm of chromosome 6; all but one were tightly packed around the
   HLA region, spanning chr6:28,014,374–33,683,352 ([112]Figure S3). When
   excluding the short arm of chromosome 6 and SNPs associated with CVD or
   confounders, SSMR identified 80 and 96 SLE-associated variants with
   significant (p < 0.05) positive ([113]Figure 3A, top 25) and negative
   ([114]Figure 3B, top 25) causal estimates on CAD, respectively
   ([115]Figure S4 and [116]Table S3). The majority of positive causal
   SNPs were distributed on chromosomes 1, 2, and 4, whereas over 50% of
   negative causal SNPs were on chromosomes 7, 11, and 17 ([117]Figures 3C
   and 3E).

Figure 3.

   [118]Figure 3
   [119]Open in a new tab

   Analysis of SLE-associated SNP-predicted genes with causal effects on
   CAD by single-SNP MR

   (A and B) Forest plots (β ± standard error) of the top 25 (by absolute
   value of causal estimates) positive (A) and negative (B) causal non-HLA
   SNPs identified by single-SNP MR (SSMR) using the Wald ratio method.

   (C and E) Pie charts illustrating the chromosomal distribution of 80
   positive (C) and 96 negative (E) causal SLE SNPs on CAD.

   (D and F) Cluster metastructures for the 200 (D) and 184 (F) predicted
   genes from positive and negative causal SNPs identified by single-SNP
   MR. Functional and cell-type enrichments for each cluster were
   determined using BIG-C (black labels) and I-scope (red labels),
   respectively. Bold black labels over colored shadings represent shared
   functional annotations for the clusters they surround.

   (G and H) S-LDSC using summary statistics for SLE, CVD, and CAD GWAS to
   estimate the heritability (coefficient ± standard error) of genes (open
   bars) and SNP-predicted proteins (hashed bars) predicted by positive
   (G) and negative (H) causal SNPs determined by SSMR. Bar color
   indicates coefficient significance.

   Non-HLA SLE variants with either significant positive or negative
   causal estimates on CAD were separately mapped to 236 ([120]Figure 3D)
   and 244 ([121]Figure 3F) predicted genes, respectively, for clustering
   and pathway analysis. Positive SNP-predicted gene clusters were
   enriched in the canonical pathway for “Antigen presentation” and
   functional categories for major histocompatibility complex (MHC) class
   I, as well as epigenetic processes, transcription, and endocytosis
   ([122]Figure 3E and [123]Table S5), whereas negative SNP-predicted gene
   clusters were dominated by processes related to cell death
   (“Pyroptosis” [GO:0070269], cluster 2; “Regulation of oxidative
   stress-induced cell death” [GO:1903201], cluster 6) and protein
   degradation (“Proteasome,” cluster 6 and “Autophagy,” cluster 8;
   [124]Figure 3F and [125]Table S5). Finally, gene sets predicted by both
   positive and negative causal SNPs captured a significant portion of SLE
   heritability but not that of CAD or CVD, consistent with their
   selection as IVs for SLE ([126]Figures 3G and 3H).

Pathway analysis of HLA-region variants associated with SLE risk and
protective of CAD

   Risk haplotypes in the HLA region heavily contribute to susceptibility
   for SLE[127]^42 and CAD.[128]^43 However, accurate genotyping of HLA
   alleles and corresponding GWAS effect size estimates are notoriously
   unreliable.[129]^44 Additionally, the complex genetic architecture of
   this region makes mapping HLA variants to genes especially challenging
   given the extensive LD and high density of genes in this region.
   Nonetheless, an examination of the HLA area (chr6:28.5–33.5 Mb)
   revealed 30 SNPs significantly (p < 10^−6) associated with both SLE and
   CAD in their respective GWAS. While these SNPs are not independently
   associated variants, all 30 SNPs had positive effect sizes for SLE but
   were negative for CAD ([130]Figure S5A), possibly reflecting the
   extensive LD in this region. Connectivity mapping and clustering of the
   69 protein-encoding genes predicted from these 30 SNPs revealed six
   distinct clusters dominated by processes dysregulated in SLE, including
   the functional categories for MHC class I and MHC class II in clusters
   1 and 6, along with canonical pathways for “Th1 and Th2 activation,” “B
   cell development,” and “Notch signaling”, as well as GO term for
   “Interferon-gamma mediated signaling pathway” (GO:0060334)
   ([131]Figures S5B and S5C). Other pathways of interest involving
   “Complement system abnormalities,” “LXR/RXR activation,” and
   “21-hydroxylase deficiency” were predicted by cluster 3
   ([132]Figure S5C).

PPI-based MR predicts specific sets of SLE-associated variants and gene
pathways causal of CAD

   To obtain a more comprehensive view of the possible impact of
   SLE-derived molecular pathways on atherosclerosis, we mapped
   SLE-associated, non-HLA Immunochip SNPs with net positive causal
   estimates on CAD by MR to genes and pathways regardless of their
   associations with CVD-related traits. In total, 838 SNPs predicted
   2,336 putative genes and 1,501 proteins that collectively captured a
   significant amount of SLE, but not CAD or CVD, heritability
   ([133]Figure 4A); these 1,501 proteins clustered into 46 distinct
   clusters based on PPI connectivity ([134]Figure 4B). We then grouped
   SLE-associated SNP mapping to genes in each of the 46 PPI-based
   clusters for use as SLE IVs to estimate cluster-specific associations
   with atherosclerotic traits. Initial application of MR-IVW to these 46
   subsets of SLE SNP-derived IVs yielded 16 and 9 significant (p < 0.05)
   positive and negative causal estimates, respectively ([135]Figures 4B,
   4C, and [136]S6A) for CAD. Additional MR methods, including mode- and
   median-based methods, MR-Egger, MR-RAPS, MR-PRESSO, and maximum
   likelihood, were carried out for further validation of the PPI-based
   MR-IVW causal estimates ([137]Table S3). Clusters were grouped into
   tiers with respect to consistency across the various MR methods, with
   tier 1 clusters yielding significant positive or negative causal
   estimates by almost all (at least 14/16) MR methods and tier 2 clusters
   yielding significant positive or negative causal estimates by MR-IVW or
   at least seven MR methods ([138]Figures 4B and 4C). Finally, when
   examined individually, 20 of the 46 clusters specifically captured SLE
   heritability by PPI-based S-LDSC, many with significant causal
   estimates on CAD by PPI-based MR ([139]Figure 4D and [140]Table S6).

Figure 4.

   [141]Figure 4
   [142]Open in a new tab

   SLE-derived gene network with causal implications for CAD and
   PPI-based MR

   (A) S-LDSC using summary statistics for SLE, CVD, and CAD GWAS to
   estimate the heritability (coefficient ± standard error) of the 2,336
   genes (open bars) and 1,501 proteins (hashed bars) predicted by 838
   Immunochip SNPs associated with SLE. Bar color indicates coefficient
   significance.

   (B) Functional and cell-type enrichments for cluster metastructures
   were determined using BIG-C (black labels) and I-scope (red labels),
   respectively. Bold black labels over colored shadings represent shared
   BIG-C functional annotations for the clusters they surround. Node size
   is proportional to the number of SNPs (height) mapping to the genes in
   each cluster (width). For node color, red and blue indicate significant
   positive or negative estimates, respectively for 14/16 MR methods used
   (tier 1); light red and light blue indicate significant positive or
   negative estimates by MR-IVW or at least 7/16 MR methods (tier 2); gray
   indicates insignificant. Thickness of the yellow border is roughly
   proportional to the negative log of the MR-IVW p value. Green border
   indicates clusters with −log(MR-IVW p value) > 3.

   (C) Forest plots from PPI-based MR showing estimates (β ± standard
   error) calculated by MR-IVW for select positive and negative clusters.

   (D) PPI-based S-LDSC (coefficient ± standard error) using GWAS summary
   statistics for SLE, CVD, and CAD. Bar color indicates coefficient
   significance.

   In an effort to support these results by expanding the size of the
   network, we added 914 multi-ancestral, non-HLA SNPs associated with SLE
   on the Phenoscanner database to the analysis. Overall, 1,708 unique
   SNPs predicted 3,272 putative genes and 1,972 proteins that
   collectively captured a significant amount of SLE heritability, but not
   that of CAD or CVD ([143]Figure 5A) and clustered into 67 distinct sets
   of protein-coding genes ([144]Figure 5B). PPI-based MR-IVW using these
   67 clusters of SLE SNPs as IVs yielded 24 and 11 significant (p < 0.05)
   positive and negative causal estimates on CAD, respectively
   ([145]Figures 5C, 5D, and [146]S6B), many of which captured SLE
   heritability, but not that of CAD or CVD, by PPI-based S-LDSC
   ([147]Figure 5E and [148]Table S6).

Figure 5.

   [149]Figure 5
   [150]Open in a new tab

   Comprehensive PPI-based MR predicts sets of SLE-associated variants and
   pathways causal of CAD

   (A) S-LDSC using GWAS summary statistics for SLE, CVD, and CAD to
   estimate the heritability (coefficient ± standard error) of the 3,272
   genes (open bars) and 1,972 protein-coding genes (hashed bars)
   predicted by 1,708 combined Immunochip- and Phenoscanner-derived SNPs.
   Bar color indicates coefficient significance.

   (B) Functional and cell-type enrichments for cluster metastructures
   were determined using BIG-C (black labels) and I-scope (red labels),
   respectively. Bold black labels over colored shadings represent shared
   BIG-C functional annotations for the clusters they surround. Node size
   is proportional to the number of SNPs (height) mapping to the genes in
   each cluster (width). For node color, red and blue indicate significant
   positive or negative estimates, respectively for 14/16 MR methods used
   (tier 1); light red and light blue indicate significant positive or
   negative estimates by MR-IVW or at least 7/16 MR methods (tier 2);
   purple indicates mixed estimates; gray indicates insignificant.
   Thickness of the yellow border is roughly proportional to the negative
   log of the MR-IVW p value. Green border indicates clusters with a
   negative log of the MR-IVW p value >3.

   (C and D) Forest plots from PPI-based MR showing estimates (β ±
   standard error) calculated by 16 MR methods for select positive (C) and
   negative (D) clusters. The number of SNPs used as IVs for each cluster
   are indicated in the plots.

   (E) PPI-based S-LDSC (coefficient ± standard error) using GWAS summary
   statistics for SLE, CVD, and CAD. Bar color indicates coefficient
   significance.

   To ensure that the majority of predicted causal clusters are not a
   result of random chance or multiple-hypothesis testing, simulations
   were carried out to estimate the false discovery rate. Results from
   these simulations, which account for both LD and pleiotropy, indicate
   that SLE-derived and PPI-clustered modules, as opposed to randomly
   generated SNP-to-gene modules, demonstrate a higher rate of significant
   causal estimates on CAD ([151]Figure S7). Furthermore, to assess the
   reproducibility of the cluster-specific causal estimates, PPI-based MR
   was repeated using CVD-related GWAS datasets on the MR-Base
   platform.[152]^45 The PPI-based MR-IVW causal estimates were highly
   consistent using summary statistics from two CAD and two MI GWAS on
   MR-Base, but not cardiomyopathy or AFib ([153]Table S7), suggesting
   that the stratified causal estimates on CAD are associated with the
   atherosclerotic component of CVD. Together, these results support the
   conclusion that the PPI-based MR results are atherosclerosis specific
   and unlikely to be trivial results of random chance or
   multiple-hypothesis testing.

   SLE-derived clusters in all positive and negative causal tiers were
   annotated using multiple functional and cellular composition tools
   ([154]Figure 5B and [155]Table S8). These results show that a wide
   range of SNP-predicted biological functions known to be involved in SLE
   pathogenesis have causal implications for CAD by MR, such as
   “Neutrophil degranulation” (clusters 2 and 43), “Th1 and Th2
   activation” and/or “Th17 activation” (clusters 3, 5, 8, and 9),
   “Interferon signaling” (cluster 8), “Leukocyte extravasation signaling”
   (cluster 12), “Leukocyte transendothelial migration” (cluster 28), and
   “Leukocyte adhesion to endothelial cells” (cluster 2). In addition to
   immune-related pathways, many of these positive causal clusters were
   enriched in disease phenotypes associated with CVD, including “Th1 cell
   activation and proliferation in atherosclerosis” (cluster 9) and “Lipid
   and atherosclerosis” (cluster 12). Interestingly, several clusters were
   enriched in autonomic nervous system control related to cardiac
   function (“Cardiac muscle contraction,” clusters 13 and 33) or
   “Neuroinflammation” (cluster 60) ([156]Figure 5B and [157]Table S8).

   In contrast, SLE-derived clusters with negative causal estimates on CAD
   were enriched for oxidative stress (cluster 10), nitric oxide (clusters
   24, 40, and 64), and high-density lipoprotein cholesterol (clusters 24
   and 50) ([158]Figure 5B and [159]Table S8). Pathway enrichment was
   further reflected in assigned functional categories, with reactive
   oxygen species protection (clusters 24 and 45), nuclear receptor
   transcription (cluster 40), and ubiquitylation and SUMOylation (cluster
   64) dominating clusters with protective estimates on CAD
   ([160]Figure 5B). These results are highly consistent with the
   enrichments associated with negative causal variants in our single-SNP
   MR and HLA-specific pathway analyses.

PPI-based MR stratifies SNPs, genes, and networks underlying the positive and
negative causal effects of SLE on CAD

   To further validate the causal effects of the 67 SNP-to-gene modules
   identified by PPI-based MR ([161]Figure 6A), we carried out additional
   MR analyses with respect to PPI-based MR cluster groupings after
   accounting for pleiotropy and LD. Causal estimates of SLE on CAD with
   IVs derived from clusters meeting the tier 1 or the tier 1 and 2
   criteria, as well as those that surpassed the MR-IVW p value <0.00075
   threshold, were universally more positive, significant, and consistent
   than those based on all SNPs ([162]Figures 6B and 6C; [163]Table S3).
   Similarly, negative causal estimates for SLE on CAD were obtained using
   IVs meeting the negative tier 1 and MR-IVW p value <0.00075 thresholds
   from the 67-cluster network. In contrast, IVs derived from clusters
   with insignificant causal estimates generally failed to reach
   significance in either direction. While these trends were observed
   using summary statistics from both the SLE GWAS and SLE Immunochip,
   causal estimates were more significant using the SLE Immunochip,
   consistent with its larger sample size. Importantly, these results
   demonstrate that PPI-based MR can be used to identify independent IVs
   satisfying MR assumptions that underlie both positive and negative
   causal effects of SLE on CAD.

Figure 6.

   [164]Figure 6
   [165]Open in a new tab

   PPI-based MR identifies SLE SNPs with positive and negative causal
   effects on CAD

   (A) Workflow depicting PPI-based MR.

   (B and C) PPI-based MR validation. Forest plots (β ± standard error)
   from 16 MR methods using summary statistics from the SLE GWAS (B) or
   SLE Immunochip (C) as the exposure and CAD GWAS as the outcome.
   SLE-associated non-HLA SNPs mapping to positive and negative clusters,
   separately (by tier) and together (“All SNPs”) were used as IVs after
   excluding CVD and confounder-associated SNPs followed by stringent LD
   clumping (R^2 = 0.001) and harmonization. The number of SNPs used as
   IVs for each SNP set are indicated in the plots. For results, gray
   indicates insignificant (p > 0.05), dark red indicates positive
   (p < 0.00075), red indicates positive (p < 0.05), dark blue indicates
   negative (p < 0.00075), and blue indicates negative (p < 0.05) by each
   MR method.

Pathway analysis facilitates drug prediction

   Pathways associated with positive causal clusters were used to
   facilitate identification of new therapeutic interventions for managing
   the unique inflammatory environment contributing to CAD in SLE
   ([166]Figure 7A). Canonical pathways related to immune function in
   clusters 2, 3, 5, and 8 predicted drugs targeting T and B cells and
   inflammatory cytokines, including daratumumab (CD38), belimumab
   (TNFSF13), elotuzumab (SLAMF7), abatacept (CD80/86), iberdimide
   (IKZF1/IKZF3), and sarliumab (IL6R). Broader analysis of pathway
   categories also suggested the utility of targeting interferon signaling
   with anifrolumab (cluster 8), as well as antiplatelet/coagulant therapy
   to combat dyslipidemia (cluster 5).[167]^46 Additional noteworthy
   targets include PCSK9 (cluster 5), a protease involved in the
   degradation and recycling of the low-density lipoprotein (LDL) receptor
   targeted by alirocumab and evolocumab, and oxidated LDL molecules
   (cluster 5) targeted by orticumab ([168]Figure 7B).

Figure 7.

   [169]Figure 7
   [170]Open in a new tab

   Genes and molecular pathways associated with positive causal clusters
   identify therapeutic interventions for managing CAD in SLE

   (A) All tier 1 and a selection of tier 2 clusters were functionally
   annotated using BIG-C, IPA, and the EnrichR database. Select drugs
   acting on direct gene targets or on any of the associated pathways
   (italics) are listed.

   (B) Venn diagram summarizing therapies that might uniquely impact SLE
   or CAD and those that may target pathways common to both diseases.

Discussion

   Although genetic association studies have been successful in mapping
   disease loci in both immune and cardiovascular diseases, the genetic
   and molecular basis for the increased CAD predisposition in SLE
   patients has remained largely unexplained. Considering the limited data
   on CAD in SLE, we developed an approach that utilized GWAS summary
   statistics for both diseases to identify and interpret various sets of
   SLE-associated variants with causal implications for CAD. New findings
   suggest that the causal relationship with SLE appears to be focused on
   the atherosclerotic process, evidenced by positive estimates with CAD,
   MI, and IS, but not other cardiac conditions such as cardiomyopathy or
   AFib. Furthermore, we developed and carried out PPI-based MR approach
   to identify specific sets of SLE variants mapping to biologically
   relevant gene sets with causal implications for CAD. By coupling
   various MR methods with network modeling and variant interpretation, we
   not only provided substantial evidence of shared genetic risk but also
   identified the putative molecular pathways involved in the development
   of CAD in SLE. Moreover, a number of the immune and inflammatory
   pathways identified in these analyses could well contribute to the
   pathogenesis of CAD even in the absence of SLE or other recognized
   autoimmune conditions. This points to the larger implication that CAD
   itself is a heterogeneous condition and subpopulations, such as those
   driven by SLE-associated processes, might require potentially distinct
   treatment strategies, at least partially motivated by unique genetic
   predispositions.

   Causal inference using traditional MR methods rely on strict
   assumptions for independent IVs; however, given the extensive
   pleiotropy underlying complex traits such as SLE and CVD, efforts to
   satisfy these assumptions can result in biasing the analyses by
   excluding previously established associations. Furthermore, the
   exclusion of SNPs associated with CVD-related traits results in the
   loss of relevant molecular information. While the use of SLE IVs that
   are also associated with CVD or confounders in traditional MR
   disqualifies the causal estimates from representing an effect on CAD
   directly through SLE, these SNPs can be just as important with respect
   to understanding the relevant biological pathways underlying CAD in
   SLE. Similarly, stringent LD clumping to obtain an independent set of
   IVs not only reduces the statistical power of MR [171]^47 but also can
   omit additional SNPs, genes, and pathways underlying CAD in SLE. Due to
   our rigorous efforts to satisfy the assumptions and account for LD in
   the traditional MR analyses while also employing numerous MR methods
   that account for IV invalidity, pleiotropy, or heterogeneity, these
   results may give overly conservative estimates of the causal effects
   and underlying mechanisms as a result of overpruning.

   To overcome these limitations of traditional MR, we developed and
   employed a PPI-based MR approach using networks comprehensively derived
   from large sets of SLE-associated SNPs, regardless of their
   associations with CVD-related traits. By generating cluster-specific
   associations between effect sizes on SLE and CAD, biologically relevant
   SNP-to-gene modules can be categorized as having shared (positive
   estimates) or opposing (negative estimates) effects on SLE and CAD.
   Traditional MR using independent, SLE-specific IVs mapping to positive
   and negative clusters separately, confirmed that the groups of causal
   clusters are representative of positive and negative causal effects on
   CAD through SLE, respectively. We believe that our PPI-based MR
   approach is particularly beneficial in cases when the exposure is
   complex and heterogeneous, such as SLE which embodies a diverse range
   of molecular and pathophysiological mechanisms that we expect to impose
   unique casual effects on CAD.

   Genetic variants are typically mapped to genes with respect to genomic
   location, identifying genes containing and/or nearby the SNPs of
   interest. Additionally, more recent advances have given rise to
   identification of trans-acting genomic regions that can epigenetically
   and/or transcriptionally influence genes at distant locations. This is
   especially important for complex, polygenic traits, such as SLE and
   CAD, of which most associated variants are non-coding. Here, we link
   SNPs to genes via amino acid changes in encoded proteins, proximity,
   expression quantitative trait loci predictions, and regulatory elements
   in an effort to be as comprehensive as possible. Our subsequent
   PPI-based clustering elucidated a broad range of biologically relevant
   molecular networks within the diverse set of implicated genes and
   importantly served to filter out noise. Furthermore, our PPI-based MR
   approach served to highlight SNP-to-gene modules contributing most to
   the causal effects of SLE on CAD. Together, these results demonstrate
   how SLE genetics can be used to identify both known and novel loci and
   pathways with causal implications for CAD.

   Numerous biologically relevant SNP-to-gene modules were determined to
   have positive causal effects on CAD through SLE by MR, spanning
   inflammatory factors, adaptive and innate immunity, intracellular
   signaling, cell differentiation, microRNA and mRNA processing,
   mitochondrial function, and more. A wide range of enrichments among
   positive causal clusters have been hypothesized and/or demonstrated to
   contribute to CVD in SLE patients, including glucocorticoids,
   neutrophil cell death (NETosis) and degranulation, tumor necrosis
   factor-like weak inducer of apoptosis (TWEAK) signaling, canonical and
   alternative complement pathways, Th1 differentiation, and lipid and
   lipoprotein metabolism, among others.

   Considering the drastically increased prevalence and mortality of CAD
   in SLE, the considerable portion of SLE-associated risk variants with
   negative causal effects on CAD was unexpected and suggested that
   numerous variants contributing to SLE have atheroprotective effects.
   Further SNP-to-gene mapping and detailed pathway analyses revealed that
   these variants are involved in various processes, predominantly related
   to oxidative stress and cholesterol homeostasis, whose atheroprotective
   effects have been found to be impaired in certain disease-related
   contexts such as SLE. For example, the enzyme responsible for
   maintaining cholesterol homeostasis though lipoprotein lipase
   synthesis, cholesterol 27-hydroxylase, has been shown to be decreased
   in human monocytes and aortic endothelial cells of SLE patients and is
   thought to impair the protective mechanism of efflux of cellular
   cholesterol.[172]^48 Cyp27a1 is the gene that encodes cholesterol
   27-hydroxylase and is a liver X receptor (LXR) target activated by
   oxysterols as well as a target of retinoid X receptor (RXR) and
   peroxisome proliferator activated receptor (PPAR) in human
   macrophages.[173]^49 LXR activation has additional proatherogenic and
   atheroprotective effects, as LXR activation in the liver promotes
   atherosclerosis via excess lipogenesis whereas LXR activation in
   macrophages and dendritic cells has anti-inflammatory effects, linking
   lipid metabolism, immune cell function, and inflammation.[174]^50

   Our approach also has the advantage of identifying “actionable” points
   of therapeutic intervention with the potential to impact the
   inflammatory environment associated with CAD in SLE. This is especially
   important given that CAD risk in SLE cannot be fully accounted for by
   the increased prevalence of traditional atherosclerotic risk factors.
   SLE subjects therefore may derive particular benefit from treatments
   that mitigate inflammatory intermediates such as type I interferons
   with anifrolumab. Our findings also highlight additional putative
   targets, including PCSK9 involved in LDL receptor recycling. Inhibitors
   of PCSK9 activity, such as alirocumab and evolocumab, are approved by
   the Food and Drug Administration to treat hyperlipidemia and may prove
   to be effective in controlling atherosclerosis in chronic inflammatory
   conditions.[175]^51 Finally, recent reports also support targeting
   oxidized LDL molecules (anti-oxLDL, orticumab) for the prevention of
   cardiovascular events in SLE.[176]^52

Limitations of the study

   Limitations of this study include those related to the data integrated
   in our pipeline. First, SLE genetic association studies have been
   restricted in size and scope, yielding limited power and genomic
   coverage, especially considering the extensive heterogeneity and
   polygenicity of lupus. To maximize both power and scope, we used the
   largest genetic association study for SLE, which is limited to
   Immunochip SNPs, the largest SLE GWAS, as well as SLE-associated SNPs
   pooled from the Phenoscanner platform. However, most genetic
   association studies, including the multi-ancestral data used in this
   study, are heavily biased toward European ancestries[177]^53. This is
   especially problematic given the increased CVD morbidity and mortality
   in SLE patients of African ancestry[178]^54 in addition to the
   ancestry-dependent disparities observed in both SLE and CAD. It is also
   of note that certain risk factors leading to distinct phenotypic
   outcomes such as CAD are likely to be impacted by environmental factors
   that cannot be accounted for by genetics alone. This is important with
   respect to the higher disease burden observed in patients of African
   ancestry, where barriers to treatment (such as delayed diagnosis and/or
   limited access to a specialist) may contribute to elevated mortality in
   this population and further underscores the importance of generating
   large datasets with diverse patient populations. In addition, the
   ability to map genetic variants to implicated genes is limited to known
   SNP-to-gene relationships included in Ensembl’s variant effect
   predictor (VEP), Geno-type-Tissue Expression (GTEx), and Human ACtive
   Enhancer to interpret Regulatory variants (HACER) databases. Although
   putative causal pathways associated with the HLA region are intriguing,
   mapping of the SNPs within the HLA region to genes is challenging
   because of the extensive LD across the region. Additionally, genes
   included in our PPI networks and clusters are limited to protein-coding
   genes and interactions included in STRINGdb. This is a potential
   shortcoming of our pipeline, especially considering the large number of
   non-coding genes implicated in our SNP-to-gene predictions in addition
   to the growing evidence highlighting the contributions of non-coding
   long RNAs and microRNAs in both SLE and CAD.[179]^55^,[180]^56
   Similarly, the ability to annotate gene clusters functionally is
   limited and potentially biased by the data underlying the numerous
   enrichment platforms used in our pathway analyses. Ingenuity Pathway
   Analysis (IPA), EnrichR, which pools a myriad of public databases, and
   cell and functional analytic tools were all utilized to obtain
   orthogonal and reproducible annotations. Ultimately, however, our
   robust SNP-to-gene mapping approach, which included multiple sources of
   information in combination with biologically informed clustering
   employing numerous sources of annotation, enabled comprehensive
   analysis of both small and large sets of genetic variants to specific
   pathways with excellent reproducibility.

   In summary, we have employed various approaches to clearly identify
   shared genetic risk factors for SLE and CAD. These results have
   provided new information about common molecular pathways in SLE and
   CAD, as well as the genetic and molecular information needed to
   consider novel therapeutic interventions in these conditions.

STAR★Methods

Key resources table

   REAGENT or RESOURCE SOURCE IDENTIFIER
   Deposited data
     __________________________________________________________________

   GTEX v.6.8 The Geno-type-Tissue Expression (GTEx) Project
   [181]GTEXportal.org
   Pre-processed summary statistics The Broad Institute
   [182]https://alkesgroup.broadinstitute.org/LDSCORE/all_sumstats/
   UK Biobank Biobank UK [183]https://www.ukbiobank.ac.uk/
   Cardiomyopathy GWAS FinnGen Biobank Finn-a-I9_CARDMYO;
   [184]https://www.finngen.fi/fi
   MR-Base Hemani et al., 2018[185]^45 [186]https://www.mrbase.org
   Phenoscanner Staley et al., 2016[187]^57; Kamat et al., 2019[188]^58
   [189]www.phenoscanner.medschl.cam.ac.uk
     __________________________________________________________________

   Software and algorithms
     __________________________________________________________________

   Bioconductor (R) Open source [190]https://www.bioconductor.org
   Ingenuity pathway analysis (IPA) Qiagen
   [191]https://www.qiagenbioinformatics.com
   S-LDSC Finucane et al., 2015[192]^33 [193]https://github.com/bulik/ldsc
   LDSC Bulik-Sullivan et al., 2015[194]^59
   [195]https://github.com/bulik/ldsc
   Cytoscape v.3.9.1 Shannon et al., 2003[196]^60
   [197]https://cytoscape.org
   Clustermaker2 v.1.2.1 Morris et al., 2011[198]^61
   [199]https://apps.cytoscape.org
   TwoSample MR Hemani et al., 2018[200]^45
   [201]https://github.com/MRCIEU/TwoSampleMR
   PPI-based MR This manuscript
   [202]https://doi.org/10.6084/Lm9.figshare.21225251
     __________________________________________________________________

   Other
     __________________________________________________________________

   Human Active Enhancers to interpret Regulatory variants (HACER) Wang
   et al., 2019[203]^62 [204]http://bioinfo.vanderbilt.edu/AE/HACER
   Variant Effect Predictor (VEP) McLaren et al., 2016[205]^63
   [206]ensembl.org/info/docs/tools/vep
   Search Tool for the Retrieval of Interacting Genes/proteins (STRING) v.
   11.0b Szklarczyk et al., 2021[207]^64 [208]https://string-db.org
   EnrichR Chen et al., 2013[209]^65 [210]https://maayanlab.cloud/Enrichr/
   Search Tool for Interacting Chemicals (STITCH) Kuhn et al.,
   2009[211]^66 [212]http://stitch.embl.de
   Library of Integrated Network-based Cellular Signatures (LINCS) Keenan
   et al., 2018[213]^67 [214]http://www.lincs.hms.harvard.edu/db/
   [215]Open in a new tab

Resource availability

Lead contact

   Further information and requests should be directed to the lead
   contact, Katherine A. Owen ([216]kate.owen@ampelbiosolutions.com).

Materials availability

   This study did not generate new unique reagents.

Data and code availability

     * •
       This paper analyzes existing, publicly available data. All GWAS and
       Immunochip studies are referenced.
     * •
       Original software code and documentation to conduct PPI-based MR
       analyses have been deposited on figshare ([217]www.figshare.com;
       [218]https://doi.org/10.6084/m9.figshare.21225251) and is publicly
       available as of the date of publication.
     * •
       Any additional information required to reanalyze the data reported
       in this report is available from the [219]lead contact upon
       request.

Experimental models and subject details

   This study did not use any experimental models or enroll human
   subjects.

Method details

Identification of SLE- and CAD-associated SNPs and overlap

   SNPs associated with each disease were obtained from previous GWAS and
   Immunochip studies. For CAD, we used a comprehensive multi-ancestral
   meta-analysis of GWAS.[220]^32 For SLE, we included results of multiple
   GWAS and Immunochip studies to account for as many ancestries as
   possible.[221]^27^,[222]^28^,[223]^29^,[224]^30^,[225]^31 In total,
   7,222 and 16,163 unique SNPs were significantly (p < 10^−6) associated
   with SLE and CAD, respectively, and were employed in these studies. A
   full list of the SNPs, chromosome locations, positions and sources used
   are detailed in [226]Table S9.

Identification of SNP-predicted genes

   Expression quantitative trait loci (eQTLs) were identified using
   GTEx[227]^68 version 6.8 ([228]GTEXportal.org) and mapped to their
   associated eQTL expression genes (E-Genes). To find SNPs in enhancers
   and promoters, and their associated transcription factors and
   downstream target genes (T- Genes), we queried the atlas of Human
   Active Enhancers to interpret Regulatory variants[229]^62 (HACER,
   [230]http://bioinfo.vanderbilt.edu/AE/HACER). To find SNPs in exons of
   protein-coding genes (C-Genes) and include proximal genes (P-Genes,
   within 5kb), we queried the human Ensembl genome browser’s variant
   effect predictor[231]^63 (VEP, ensembl.org/info/docs/tools/vep,
   GRCh38.p12).

Network analysis and visualization

   Protein-protein interaction (PPI) networks of SNP-predicted
   protein-coding genes were generated by STRING[232]^64
   ([233]https://string-db.org, version 11.0b), and resulting networks
   were imported into Cytoscape[234]^60 (version 3.6.1) for visualization
   and partitioned with MCODE via the clusterMaker2[235]^61 (version
   1.2.1) plugin. Metastructures are based on PPI networks. For all
   metastructures, node gradient shading is proportional to intra-cluster
   connectivity, cluster size indicates number of genes per cluster and
   edge weight indicates inter-cluster connections.

Functional gene set analysis

   Predicted genes were examined using Biologically Informed Gene
   Clustering (BIG-C; version 4.4.). BIG-C is a custom functional
   clustering tool developed to annotate the biological meaning of large
   lists of genes and has been previously
   described.[236]^69^,[237]^70^,[238]^71 I-Scope is a custom clustering
   tool used to identify immune cell types in large gene datasets.[239]^72
   The Ingenuity Pathway Analysis (IPA;
   [240]https://www.qiagenbioinformatics.com) platform and EnrichR[241]^65
   ([242]https://maayanlab.cloud/Enrichr/) web server provided additional
   molecular pathway enrichment analysis.

Drug candidate identification

   Drug candidates were identified using LINCS,[243]^67 STITCH[244]^66
   (v5.0), IPA and literature mining. Each of the database tools includes
   either a programmatic method of matching existing therapeutics to their
   targets or else is a list of drugs and targets for achieving the same
   end.

Quantification and statistical anaysis

Linkage disequilibrium score regression (LDSC) genetic correlations

   LDSC[245]^59 was used to estimate genome-wide genetic correlations
   between traits using GWAS summary statistics. Pre-processed summary
   statistics from SLE, CAD and CVD GWAS were obtained from the Broad
   webpage
   ([246]https://alkesgroup.broadinstitute.org/LDSCORE/all_sumstats/).
   Using the LDSC software provided on github
   ([247]https://github.com/bulik/ldsc) and reference data on the Broad
   webpage ([248]https://alkesgroup.broadinstitute.org/LDSCORE/),
   including European LD scores 'eur_w_ld_chr' or 'weights_hm3_no_hla' as
   weights for analyses excluding the HLA region. Using standard
   parameters, the "ldsc.py" (with the "--rg" flag) script was used to
   generate genome-wide genetic correlation estimates between SLE and CVD
   or CAD.

Stratified linkage disequilibrium score regression (S-LDSC)

   S-LDSC[249]^33 was used to obtain gene-set specific
   disease-heritability estimates using GWAS summary statistics.
   Pre-processed summary statistics from SLE, CAD and CVD GWAS were
   obtained from Broad webpage
   ([250]https://alkesgroup.broadinstitute.org/LDSCORE/all_sumstats/).
   Using the S-LDSC software provided on github
   ([251]https://github.com/bulik/ldsc) and reference data on the Broad
   webpage ([252]https://alkesgroup.broadinstitute.org/LDSCORE/),
   annotation and LD score files were generated for each SNP-predicted
   gene- and protein-set, separately. Using standard parameters, the
   “make_annot.py” and “ldsc.py” (with the “--l2” flag) scripts were first
   used to generate the gene-set-specific annotation and LD files, then
   the “ldsc.py” (with the “--h2-cts” flag) script was used to generate
   stratified heritability scores for each GWAS.

Selection of valid, independent instrumental variables for traditional MR
analysis

   Traditional MR methods, such as MR-IVW, operate under three strict
   assumptions for instrumental variable (IV) validity: 1) the relevance
   assumption, 2) the exclusion restriction criteria assumption, and 3)
   the independence assumption. To satisfy the relevance assumption, SNPs
   significantly (genome-wide significance p value < 5 × 10^−8) associated
   with
   SLE[253]^27^,[254]^28^,[255]^29^,[256]^73^,[257]^74^,[258]^75^,[259]^76
   ^,[260]^77^,[261]^78^,[262]^79^,[263]^80^,[264]^81^,[265]^82^,[266]^83^
   ,[267]^84^,[268]^85^,[269]^86^,[270]^87 were obtained from the
   Phenoscanner database ([271]www.phenoscanner.medschl.cam.ac.uk)(82,83)
   ([272]Table S9). To satisfy the exclusion restriction criteria and
   independence assumptions, 89,336 SNPs weakly associated (p
   value < 1 × 10^−5) with CVD and confounders including cholesterol,
   obesity, blood pressure, insulin resistance, smoking, age-related
   diseases, and many more, were excluded from being IVs for SLE-exposure
   (see [273]Table S2 for the full list of excluded traits). HLA-region
   SNPs were conservatively removed from MR analyses by excluding the
   short-arm of chromosome 6. Stringent LD clumping[274]^37 was employed
   using the clump data (R^2 = 0.001, 100kb window, 1000G EA reference
   population) function to generate an independent set of 60 SLE-IVs
   harmonized for each GWAS.

Mendelian randomization (MR)

   MR was used to test for causal relationships between SLE and CAD using
   the MR-Base[275]^45 ([276]https://www.mrbase.org) TwoSampleMR[277]^45
   package in R ([278]https://github.com/MRCIEU/TwoSampleMR). Various sets
   of SLE-associated genetic variants used as instrumental variables (IVs)
   and summary statistics for SLE-exposure were manually imported into R
   and summary statistics were carried out for MR-base compatibility using
   the ‘format data’ command. All effect sizes and standard errors were
   obtained from the exposure summary statistics used in each analysis,
   regardless of the study in which each IV was associated with the
   exposure. Given the availability of well-powered CAD/MI GWAS on
   MR-Base, IVs for CAD and MI were directly obtained from each exposure
   GWAS using the ‘extract instruments’ command for the bidirectional
   analyses. Data from the SLE and all CVD-related GWAS studies used in
   our MR analyses are publicly available and also accessible through the
   MR-Base software, which was used to obtain the outcome summary
   statistics via the ‘extract outcome data’ command. The ‘allele
   harmonization’ command was used to ensure the effect estimates of the
   exposure and outcome are based on matching alleles, excluding SNPs with
   completely mismatching alleles from the MR analysis or reversing the
   effect and non-effect alleles along with the effect estimates when
   applicable. Because of the allele harmonization step and because some
   SNPs are absent from the available summary statistics, a small
   proportion of SNPs used as IVs are absent from the final MR
   calculations. Up to sixteen individual MR methods were carried out
   through the TwoSampleMR package, including inverse variance weighted
   (IVW), simple mode, weighted mode, simple median, weighted median
   (WMedian), MR-Egger, MR-PRESSO (raw and outlier-corrected), MR-RAPS,
   and two sample maximum likelihood (ML). The ‘MR report’ function was
   used to generate a summary containing heterogeneity and directional
   pleiotropy tests and scatterplots ([279]Figure S2). MR-IVW and MR-Egger
   heterogeneity test results (Q-value) indicate whether significant
   heterogeneity was detected, which does not necessarily indicate biased
   causal estimates. MR-Egger intercept indicate whether significant
   directional horizontal pleiotropy was detected, which usually indicates
   biased causal estimates. For single-SNP MR, the ‘MR single-SNP’
   function was also carried out using the Wald Ratio method. Full details
   of all MR results are included in [280]Table S2 and a summary of all
   the main findings are included in [281]Table S10.

PPI-based MR

   SLE-associated variants from the Immunochip[282]^31 and Phenoscanner
   database[283]^57^,[284]^58 were linked to their most likely genes, and
   the genes used to generate PPI-informed gene clusters. The
   SLE-associated SNPs mapping to genes in each of PPI-based clusters were
   then extracted to “reverse engineer” subsets of SNPs that could be used
   separately as SLE-IVs for MR to independently estimate the causal
   effects of each PPI-informed SNP-to-Gene module on CAD. Up to sixteen
   MR methods were carried out for each SNP-to-gene module through the
   TwoSampleMR package.

   In additional analyses (related to [285]Figure 6) using the combined
   Immunochip and Phenoscanner SNP dataset, filtering eliminated SNPs
   weakly associated (p value < 1 × 10^-5) with CVD and confounders
   including cholesterol, obesity, blood pressure, insulin resistance,
   smoking, age-related diseases, and many more ([286]Table S2).
   HLA-region SNPs were conservatively removed from MR analyses by
   excluding the short-arm of chromosome 6. Stringent LD clumping[287]^37
   was employed using the clump_data (R^2 = 0.001, 100kb window, 1000G EA
   reference population) function to generate an independent set of
   SLE-IVs. Various analyses were performed using independent, valid IVs
   derived from ‘All SNPs’, SNPs mapping to ‘Insignificant’ clusters,
   ‘Positive Tier 1’ clusters, ‘Positive Tier 1 and Tier 2’ clusters,
   ‘Positive MR-IVW p value<0.00075’ clusters, ‘Negative MR-IVW p
   value<0.00075’ clusters, ‘Negative Tier 1’ clusters, and ‘Negative Tier
   1 and Tier 2’ clusters.

Monte Carlo simulations for expected MR results using random sets of
Immunochip-derived SNP-to-gene modules

   Monte Carlo simulations were implemented and performed to estimate the
   false discovery rate with respect to significant PPI-based MR causal
   estimates. 120,026 Immunochip SNPs included in the SLE summary
   statistics were mapped to putative genes using the VEP, including
   regulatory effects, to generate an Immunochip SNP-to-Gene library with
   67,211 unique SNPs mapping to 7,602 STRINGdb proteins. In each
   simulation, a random set of 3–152 SNP-predicted proteins were selected
   from the 7,602 proteins and used to extract up to 400 Immunochip SNPs.
   MR-IVW was then performed for SLE on CAD using harmonized, non-HLA SNPs
   (via removal of the entire short-arm of chromosome 6) from the
   simulated set of Immunochip SNPs as IVs. By using our Immunochip
   derived SNP-to-Gene dictionary for random selection of protein clusters
   and associated SNPs to generate random sets of IVs, our simulations
   account for both a high degree of LD and pleiotropy, especially
   considering the major influence of loci associated with diabetes in
   development of the Immunochip.

Acknowledgments