Abstract
Background
   Significant differences in immune responses, prevalence or
   susceptibility of diseases and treatment responses have been described
   between males and females. Despite this, sex-differentiation analysis
   of the genetic architecture of inflammatory proteins is largely
   unexplored. We performed sex-stratified meta-analysis after protein
   quantitative trait loci (pQTL) mapping using inflammatory biomarkers
   profiled using targeted proteomics (Olink inflammatory panel) of two
   population-based cohorts of Europeans.
Results
   Even though, around 67% of the pQTLs demonstrated shared effect between
   sexes, colocalization analysis identified two loci in the males
   (LINC01135 and ITGAV) and three loci (CNOT10, SRD5A2, and LILRB5) in
   the females with evidence of sex-dependent modulation by pQTL variants.
   Furthermore, we identified pathways with relevant functions in the
   sex-biased pQTL variants. We also showed through cross-validation that
   the sex-specific pQTLs are linked with sex-specific phenotypic traits.
Conclusion
   Our study demonstrates the relevance of genetic sex-stratified analysis
   in the context of genetic dissection of protein abundances among
   individuals and reveals that, sex-specific pQTLs might mediate
   sex-linked phenotypes. Identification of sex-specific pQTLs associated
   with sex-biased diseases can help realize the promise of individualized
   treatment.
Supplementary Information
   The online version contains supplementary material available at
   10.1186/s12864-024-10065-z.
   Keywords: Sex-stratified genetic analysis, Inflammation, pQTL, GWAS,
   Biomarkers
Background
   Differences in inflammation between males and females can explain
   sex-biased susceptibility and severity of various common diseases such
   as atherosclerosis, cancer, autoimmune diseases and many more [[35]1].
   While females are less prone to infectious diseases than males, they
   account for more than 80% of individuals presented with autoimmune
   diseases such as systemic lupus erythematosus (SLE), rheumatic
   arthritis (RA) and autoimmune thyroid disease [[36]2]. Sex differences
   in immunological responses and disease susceptibility may be influenced
   by the complex interaction of sex hormones [[37]3], host microbiome
   [[38]4], immune-regulatory genes located on the X-chromosome [[39]5]
   and environmental exposures [[40]6–[41]8]. It is also possible that
   genetic polymorphisms associated with immune phenotypes partially
   account for sex-based differences in immune response. In this regard
   and given the recent recognition of sex-stratified analysis, studies in
   the context of epigenetics [[42]9, [43]10] and transcriptomics [[44]11,
   [45]12] have been conducted. The disproportionate sex influence on all
   these phenotypes, including treatment efficacy [[46]13], makes it
   imperative to understand the role of sex on modulation of inflammation
   in complex traits and diseases which will ultimately help our quest of
   developing personalized treatments.
   Many of the key inflammatory markers which are released into the
   bloodstream during inflammation and are associated with chronic
   diseases are proteins, the functional molecules encoded by the genome
   [[47]14]. A plethora of protein quantitative trait loci (pQTL) studies
   have been conducted to characterize genetic variants associated with
   circulating protein levels in both healthy and disease individuals
   [[48]15–[49]21]. Despite the burgeoning number of studies reporting
   sex-based differences in the immune response, most of the research on
   pQTLs analysis mainly adjust for sex differences in the model without
   seeking to identify sex-specific pQTLs. A recent study clearly
   demonstrated sex-dependent effect on the circulating concentrations of
   inflammatory proteins [[50]22]. Therefore, identifying the precise
   factors driving the inter-individual variability in inflammatory
   protein levels especially in sex-dependent fashion will help to
   comprehensively understand sex differences in inflammatory-driven
   diseases and to make meaningful prediction of individual risk for
   diseases.
   In the present study, using two population-based cohorts, we aimed to
   identify which genetic variants affect inflammatory protein levels in
   sex-specific manner. We identified and compared genetic variants
   associated with protein concentrations profiled with the Olink
   Inflammation panel using meta-analytic approach in males and females
   separately. We demonstrate that while the regulation of numerous pQTL
   variants is independent of sex, some key loci act discordantly between
   sexes and are correlated with sex-dependent traits.
Results
   We aimed to identify single nucleotide polymorphisms (SNPs) associated
   with plasma protein concentrations profiled in males and females
   samples separately using two different population-based cohorts. To
   enhance statistical power and to detect robust signals, pQTL
   association results of 66 inflammatory proteins and 4,028,465 SNPs of
   both cohorts were integrated using meta-analysis. A general overview of
   the cohorts and analyses conducted is displayed in Fig. [51]1.
Fig. 1.
   [52]Fig. 1
   [53]Open in a new tab
   Graphical representation of study cohorts, design and analysis
   conducted. This study utilized two population-based cohorts (500FG &
   300 BCG) of individuals of European decent. Imputed genetic data and
   plasma protein abundances profiled with the Olink inflammatory panel
   were available for protein quantitative trait (pQTL) mapping in
   sex-dependent fashion. The resulting summary statistics were integrated
   using the meta-analytic approach for males and females separately.
   Colocalization and functional enrichment analysis of the identified
   meta-analyzed pQTL variants were conducted. Finally, we cross-validated
   the identified pQTL variants with previously published molecular traits
Identification of pQTLs in males
   In the males (N = 318), we observed a total of six genome-wide
   significant pQTLs (P < 5 × 10^–8) which comprises of two trans-pQTL
   variants (rs3213964 and rs11207327) regulating TNFRSF9 and CD5 levels
   respectively. The four cis-acting pQTL variants (lying 250 kb window
   around the tested protein-coding gene) were associated with CCL25, CD6,
   IL-10RB and IL-18R1 proteins. Summary of the meta-analyzed results are
   represented with Manhattan plot (Fig. [54]2A) while regional
   association plots are used to zoom into the genomic regions surrounding
   the index pQTL variants of each significant pQTL locus (Fig. [55]2B).
   In general, the cis-acting pQTL variants showed stronger genetic
   associations. For example, the most strongly associated intronic SNP
   rs7605284 with IL-18R1 (P = 7.22 × 10^–19,Zscore = 8.871) followed by
   SNP rs2843699 correlating with IL-10RB (P = 3.29 × 10^–12,
   Zscore = -6.965) were all residing in the cis-regions. Interestingly,
   trans-pQTL variant (rs11207327) correlating with CD5 levels resides in
   the LINC01135 locus, suggesting that genetic polymorphisms in both the
   protein-coding and non-coding regions play a role in regulating
   variations in plasma protein levels. The most significantly associated
   genetic variant for each protein in the males is reported in
   Table [56]1. Detailed summary statistics for the results of pQTL
   mapping in each cohort and meta-analyzed results with suggestive
   associations for the males are presented in Table [57]S1.
Fig. 2.
   [58]Fig. 2
   [59]Open in a new tab
   Summary of pQTL meta-analysis results in males. A Manhattan plot
   depicting the association results of significant genetic variants
   (P < 0.05). The red bold horizontal line marks the genome-wide
   significant threshold (p-value < 1 × 10^–8) and the black dashed
   denotes the suggestive threshold (p-value = 5 × 10^–5). Top pQTL
   variants and their correlated proteins are displayed on the plot. B
   Regional association plots of the genome-wide significant loci
   (p < 1 × 10^–8). The -log10 association p-values are plotted on the
   y-axis against physical position (NCBI build 36) of each marker on the
   x-axis. The pQTL variants are color coded according to their
   correlation coefficient (r^2) with the top SNP using the hg19/1000
   Genomes European samples
Table 1.
   Summary statistics of index-pQTLs for males after meta-analysis
MarkerName  A1 A2 Weight Zscore P.value  Direction chr pos        Protein
rs7605284   t  g  318    8.871  7.23E-19  +  +     2   103044175  IL.18R1
rs8105816   t  c  318    7.001  2.54E-12  +  +     19  8099017    CCL25
rs2843699   t  c  318    -6.965 3.29E-12 –         21  34663850   IL.10RB
rs4939488   t  c  318    6.944  3.80E-12  +  +     11  60786474   CD6
rs3213964   a  g  318    -5.647 1.63E-08 –         2   187540732  TNFRSF9
rs11207327  a  g  318    -5.457 4.85E-08 –         1   59371517   CD5
rs7259738   c  g  318    -5.425 5.79E-08 –         19  53308330   CD40
rs2041190   a  g  318    5.407  6.42E-08  +  +     17  32634541   MCP.2
rs3098547   a  g  318    5.395  6.86E-08  +  +     15  27873562   HGF
rs9771768   a  g  318    5.392  6.98E-08  +  +     8   139165953  IL7
rs1986471   t  c  318    5.337  9.43E-08  +  +     3   127001452  OPG
rs9323902   t  g  318    -5.296 1.19E-07 –         14  94572110   LAP.TGF.beta.1
rs9983324   a  g  318    -5.264 1.41E-07 –         21  21130809   CXCL6
rs678815    c  g  318    -5.258 1.46E-07 –         11  102713777  MMP.1
rs72800254  a  g  318    -5.224 1.75E-07 –         16  83201614   TNFRSF14
rs58814158  t  g  318    -5.208 1.91E-07 –         13  110820255  FGF.23
rs924935    a  g  318    5.161  2.45E-07  +  +     3   16172032   CCL28
rs7530589   a  g  318    -5.134 2.83E-07 –         1   22756991   TRANCE
rs2346994   a  g  318    5.109  3.24E-07  +  +     16  26478953   CCL23
rs7429951   a  g  318    -5.104 3.33E-07 –         3   18031227   uPA
rs12578430  t  c  318    5.096  3.47E-07  +  +     12  104093212  CDCP1
rs72783206  a  g  318    -5.067 4.03E-07 –         2   34173056   IL.17A
rs7151919   a  g  318    -5.049 4.43E-07 –         14  21139505   CSF.1
rs1963248   t  g  318    5.04   4.65E-07  +  +     1   74346006   FGF.19
rs112953000 t  c  318    -5.01  5.45E-07 –         21  37365315   MCP.4
rs2604330   a  g  318    4.994  5.92E-07  +  +     8   15729358   TNFB
rs2568222   a  g  318    4.987  6.12E-07  +  +     2   85377979   PD.L1
rs73232950  c  g  318    4.981  6.32E-07  +  +     21  46208128   CASP.8
rs1233829   a  t  318    4.972  6.62E-07  +  +     1   198041374  CXCL11
rs10977778  c  g  318    -4.97  6.69E-07 –         9   9459604    CXCL10
rs2543063   t  c  318    4.954  7.27E-07  +  +     8   39832166   Flt3L
rs112770619 a  g  318    -4.942 7.73E-07 –         8   119484547  TWEAK
rs4942873   t  c  318    4.941  7.76E-07  +  +     13  50392047   IL6
rs60844779  a  t  318    4.925  8.45E-07  +  +     5   149456772  CCL19
rs2165385   t  c  318    -4.924 8.46E-07 –         18  27371049   CST5
rs4854604   a  c  318    4.924  8.50E-07  +  +     3   133780263  CD8A
rs34011530  c  g  318    4.894  9.87E-07  +  +     11  1689461    NT.3
rs2552275   c  g  318    4.892  9.96E-07  +  +     8   6022581    CD244
rs9324481   c  g  318    -4.887 1.03E-06 –         8   139121726  CXCL1
rs9502959   t  c  318    -4.877 1.08E-06 –         6   206599     TRAIL
rs1814451   t  c  318    4.876  1.09E-06  +  +     17  47540123   DNER
rs72977655  t  c  318    -4.867 1.13E-06 –         2   235708696  EN.RAGE
rs4335544   a  c  318    4.854  1.21E-06  +  +     11  21,330,182 ADA
rs2570672   c  g  318    4.85   1.24E-06  +  +     8   6026307    IL18
rs75674858  a  g  318    4.847  1.25E-06  +  +     4   44388139   IL8
rs10933215  t  c  318    4.843  1.28E-06  +  +     2   228661357  IL10
rs4328681   a  g  318    4.833  1.34E-06  +  +     2   45261612   MMP.10
rs357292    t  g  318    4.79   1.67E-06  +  +     5   38923732   OSM
rs493480    c  g  318    4.777  1.78E-06  +  +     15  55879574   CXCL9
rs9855230   a  g  318    4.751  2.03E-06  +  +     3   64642056   CCL4
rs4752902   a  g  318    4.742  2.12E-06  +  +     11  48148779   LIF.R
rs11017041  a  c  318    -4.741 2.12E-06 –         10  131795756  TGF.alpha
rs1034527   t  c  318    -4.738 2.16E-06 –         8   120939125  SCF
rs744768    a  g  318    4.732  2.22E-06  +  +     7   132014951  CCL3
rs10756341  t  c  318    4.725  2.31E-06  +  +     9   12203051   ST1A1
rs1216465   a  c  318    4.723  2.33E-06  +  +     11  100343360  MCP.1
rs11881877  a  g  318    -4.712 2.46E-06 –         19  53401760   IL.12B
rs57254266  a  g  318    4.711  2.47E-06  +  +     11  5352458    VEGFA
rs9771768   a  g  318    4.704  2.55E-06  +  +     8   139165953  CXCL5
rs4839524   a  g  318    -4.704 2.55E-06 –         1   116800201  X4E.BP1
rs167500    a  c  318    4.684  2.81E-06  +  +     18  1907316    SIRT2
rs85023     a  g  318    4.623  3.78E-06  +  +     20  45181296   CX3CL1
rs1759325   t  c  318    -4.616 3.91E-06 –         10  59838496   CCL11
rs7964436   t  c  318    4.581  4.64E-06  +  +     12  15561776   CCL20
rs7628951   a  g  318    4.562  5.06E-06  +  +     3   56457705   AXIN1
rs8060853   a  g  318    4.553  5.29E-06  +  +     16  30403858   FGF.21
   [60]Open in a new tab
Identification of pQTLs in females
   Next, we repeated the meta-analysis of the 66 proteins in the females
   (N = 359) to uncover SNPs controlling inflammatory protein levels in
   females. We identified a total of nine genome-wide significant pQTLs
   comprising four trans pQTL variants (rs12634152, rs608574, rs12977062
   and rs67015567) which affected IL-12B, IL18, PD-L1 and CX3CL1
   abundances respectively. The five cis-acting pQTL variants were
   associated with proteins such as IL-18R1, MCP-2, CCL25, CD6, and
   IL-10RB. The SNP rs1985329 showing the most significant association
   (P = 1.15 × 10^–26) is within an intron mapping to IL18R1. All the 9
   pQTLs showed stronger associations in females than in males except
   rs2843699-IL-10RB association (P = 2.43 × 10^–11, Zscore = -6.678) in
   females and (P = 3.29 × 10^–12,Zscore = -6.965) in males. The general
   meta-analyzed results are summarized with Manhattan plot (Fig. [61]3A)
   and the genomic regions around each of the genome-wide significant pQTL
   variants are represented with regional association plots (Fig. [62]3B).
   The trans-pQTL SNP rs12634152 on chromosome 3 which significantly
   correlated with IL-12B mapped to the LPP (Lipoma-preferred partner)
   locus. Interestingly, genetic polymorphisms in the LPP gene are
   potential risk factors for autoimmune diseases such as celiac disease
   and Addison’s diseases [[63]23, [64]24]. Table [65]2 shows the lead
   pQTL results for all the 66 proteins interrogated in the females.
   Suggestive associations for each cohort identified in the females after
   pQTL mapping and meta-analysis are presented in Table [66]S2.
Fig. 3.
   Fig. 3
   [67]Open in a new tab
   Summary of pQTL meta-analysis results in females. A Manhattan plot
   illustrating the association results of significant genetic variants
   (P < 0.05). The red bold horizontal line denotes the genome-wide
   significant threshold (p-value < 1 × 10^–8) and the black dashed
   denotes the suggestive threshold (p-value = 5 × 10^–5). Top pQTL
   variants and their correlated proteins are displayed on the plot. B
   Regional association plots of the genome-wide significant loci
   (p < 1 × 10^–8). The -log10 association p-values are plotted on the
   y-axis against physical position (NCBI build 36) of each marker on the
   x-axis. The pQTL variants are color coded according to their
   correlation coefficient (r^2) with the top SNP using the hg19/1000
   Genomes European samples
Table 2.
   Summary statistics of index-pQTLs for females after meta-analysis
   MarkerName Allele1 Allele2 Weight Zscore P.value Direction chr pos
   Protein
   rs1985329 t c 359 -10.689 1.15E-26 – 2 103058208 IL.18R1
   rs4939488 t c 359 7.983 1.43E-15  +  +  11 60786474 CD6
   rs1204494 t c 359 -7.017 2.27E-12 – 19 8060420 CCL25
   rs2843699 t c 359 -6.678 2.43E-11 – 21 34663850 IL.10RB
   rs2041190 a g 359 6.336 2.36E-10  +  +  17 32634541 MCP.2
   rs608574 c g 359 -6.167 6.95E-10 – 2 31833066 IL18
   rs12634152 t c 359 -6.116 9.58E-10 – 3 188121019 IL.12B
   rs67015567 t c 359 -5.7 1.20E-08 – 3 32828703 CX3CL1
   rs12977062 t c 359 5.476 4.34E-08  +  +  19 54755922 PD.L1
   rs175133 t c 359 -5.414 6.17E-08 – 11 60859791 CCL4
   rs62094708 a g 359 -5.397 6.78E-08 – 18 56159522 Flt3L
   rs6486739 a g 359 5.391 7.01E-08  +  +  12 129440318 CST5
   rs13075467 t c 359 -5.338 9.39E-08 – 3 135609551 CXCL6
   rs1954112 t c 359 -5.304 1.13E-07 – 14 86674145 CD244
   rs10816341 t c 359 -5.293 1.21E-07 – 9 98440825 CASP.8
   rs1880614 t c 359 5.255 1.48E-07  +  +  7 53247740 TRANCE
   rs1741298 t c 359 -5.25 1.52E-07 – 20 4133990 HGF
   rs1959748 t c 359 5.248 1.53E-07  +  +  14 86683256 CCL3
   rs7710132 c g 359 5.244 1.57E-07  +  +  5 51345337 TGF.alpha
   rs56875031 c g 359 -5.235 1.65E-07 – 8 6252045 FGF.21
   rs6842405 t c 359 5.211 1.88E-07  +  +  4 179558777 IL10
   rs36018306 t g 359 -5.195 2.04E-07 – 19 52655428 OSM
   rs4299800 c g 359 -5.18 2.22E-07 – 6 169337246 FGF.23
   rs33234 a g 359 5.179 2.24E-07  +  +  12 31030018 EN.RAGE
   rs562966 t c 359 5.168 2.36E-07  +  +  11 120076116 TNFRSF14
   rs16922761 t c 359 5.162 2.44E-07  +  +  11 95708152 MMP.10
   rs2191835 t c 359 -5.148 2.63E-07 – 2 229942970 IL.17A
   rs6460958 a g 359 -5.13 2.89E-07 – 7 12572904 CXCL9
   rs7936105 t c 359 -5.03 4.91E-07 – 11 77610834 TNFB
   rs10270821 t c 359 -5.02 5.18E-07 – 7 124051449 TNFRSF9
   rs9270845 t c 359 -4.999 5.77E-07 – 6 32570571 MCP.1
   rs7818631 t c 359 4.976 6.48E-07  +  +  8 96427828 FGF.19
   rs2102759 a g 359 -4.963 6.93E-07 – 2 161866727 CCL23
   rs13254474 t c 359 -4.959 7.08E-07 – 8 2918923 IL8
   rs7843880 a g 359 -4.944 7.67E-07 – 8 9099173 CCL11
   rs73085912 a g 359 4.92 8.67E-07  +  +  3 54197988 MMP.1
   rs17523444 a g 359 -4.919 8.69E-07 – 5 15657540 VEGFA
   rs324347 a g 359 4.91 9.09E-07  +  +  5 104092939 OPG
   rs3772382 t c 359 4.908 9.19E-07  +  +  3 24185307 LIF.R
   rs2303147 t c 359 -4.906 9.28E-07 – 19 49143025 CD5
   rs396832 a g 359 4.903 9.45E-07  +  +  21 37299882 IL7
   rs35617738 a g 359 -4.894 9.87E-07 – 3 171187483 CCL28
   rs1943821 t c 359 4.891 1.00E-06  +  +  18 70986060 uPA
   rs9430101 t c 359 4.863 1.15E-06  +  +  1 213014295 CCL19
   rs8033014 a g 359 4.863 1.16E-06  +  +  15 50437063 X4E.BP1
   rs4076388 t c 359 4.85 1.23E-06  +  +  3 72333947 ST1A1
   rs12403928 t c 359 4.838 1.31E-06  +  +  1 190564863 CDCP1
   rs10733202 t g 359 -4.838 1.31E-06 – 9 10027632 CXCL10
   rs10832590 t c 359 -4.815 1.47E-06 – 11 16318856 CD40
   rs1342420 t c 359 4.814 1.48E-06  +  +  20 6138415 TWEAK
   rs980325 a g 359 -4.814 1.48E-06 – 6 153148350 NT.3
   rs8105276 a g 359 4.8 1.59E-06  +  +  19 9101857 CCL20
   rs654186 a t 359 4.797 1.61E-06  +  +  6 132640349 TRAIL
   rs16887551 a g 359 -4.796 1.62E-06 – 8 38470624 AXIN1
   rs62273562 a g 359 4.758 1.95E-06  +  +  3 102908341 MCP.4
   rs2266374 t c 359 -4.746 2.08E-06 – 10 14736230 ADA
   rs10916078 a g 359 4.743 2.10E-06  +  +  1 223857726 SIRT2
   rs2368328 a g 359 -4.722 2.33E-06 – 10 28654250 CXCL11
   rs4547370 t g 359 -4.71 2.48E-06 – 17 63281454 CSF.1
   rs75956752 t c 359 -4.695 2.67E-06 – 13 90524286 CD8A
   rs73159896 t g 359 4.685 2.80E-06  +  +  22 17295981 CXCL1
   rs55984748 t g 359 -4.672 2.98E-06 – 20 739264 IL6
   rs2092212 c g 359 -4.645 3.40E-06 – 22 43318948 SCF
   rs6727186 t c 359 -4.612 3.98E-06 – 2 69103817 DNER
   rs9267393 a c 359 -4.585 4.55E-06 – 6 31469365 LAP.TGF.beta.1
   rs9324265 a g 359 -4.566 4.98E-06 – 13 112835484 CXCL5
   [68]Open in a new tab
Colocalization analysis implicates sex-specific pQTL loci
   Next, to identify genomic regions that are shared between males and
   females or unique pQTLs variants, we performed genetic colocalization
   analysis for all the genome-wide significant loci. A tested region with
   posterior probability (PP4 >  = 0.75) show a common association in both
   males and females. In the males, we compared the six significant loci
   with the same regions in the females. There was strong evidence of
   shared causal variant with PP4 values ranging from 0.94 to 0.99 in four
   genomic loci: CCL25, CD6, IL10RB and IL18R1 (Fig. [69]4A.).
   Interestingly, at the LINC01135 and ITGAV loci (Fig. [70]4B), the PP4
   values were just 0.06 and 0.07 respectively, which suggests lack of
   shared causal variant between sexes. In fact, index SNP look-up showed
   that the index SNPs in these regions specifically regulate proteins in
   males. For example, at the LINC01135 locus, the upstream variant
   rs11207327 showed strong association with CD5 (P = 4.85 × 10^–8,
   Zcore = -5.457) in males but was not significant in females (P = 0.26,
   Zscore = -1.126). We also observed a higher PPH1 value (0.693) and a
   lower PP3 value of 0.0185, supporting the evidence that only males have
   significant associations in the tested region (Table [71]S3).
   Individuals carrying the G allele on average produced the highest
   abundance of CD5 levels. Similarly, at the ITGAV locus, while the index
   SNP rs3213964 surpassed the genome-wide significant threshold
   (P = 1.63 × 10^–8, Zscore = -5.647) in the males, this effect was
   completely masked in the females (P = 0.23, Zscore = -1.188). This
   observation was further by supported by the higher PP1 and lower PP3
   values 0.894 and 0.0069 (Table [72]S3).
Fig. 4.
   [73]Fig. 4
   [74]Open in a new tab
   Illustration of colocalization analysis results in males. A Locus
   comparison plots of shared genomic loci between males and females (B)
   Locus comparison plots of male-specific genomic loci. The protein names
   and -log[10] association p-values are displayed on the vertical axis
   and the names of the loci and chromosomes are represented on the
   horizontal axis. The posterior probability values (PP4) are also
   indicated on the plot
   We observed on average higher production of TNFRSF9 levels for
   individuals with the A allele. To elucidate whether the identified two
   male-specific loci were not driven by sample size variation or
   up-scaling of sample size via meta-analysis, we interrogated such
   effect in the biggest cohort (500FG) with comparatively similar sample
   size (Males = 184, females = 187) and a similar pattern was observed
   (Figure [75]S1A). Of note, of all the male-specific loci detected, the
   lead pQTL SNPs were all trans-pQTLs.
   We next performed similar colocalization analysis in the nine loci
   detected after pQTL mapping in females. For six loci, namely IL-10RB,
   CCL8, IL-18RAP, LPP, CD6, and ELAVL1 we did not observe any
   female-specific effects as affirmed by the strong colocalization with
   PP4 values ranging from 0.89 to 0.99 (Fig. [76]5A). On the other hand,
   no evidence of shared causal variant was detected in three loci and the
   lead SNPs in these regions exhibited trans-association only in females
   (Fig. [77]5B). For example, at the CNOT10 locus (PP4 = 0.05), the
   downstream SNP rs67015567 on chromosome 3, linked with CX3CL1 levels
   was significant in females (P = 1.2 × 10^–8, Zscore = -5.70) but not in
   the males (P = 0.98, Zscore = -0.028). While the females carrying the C
   allele on averaged produced the highest level of CX3CL1, we observed
   the contrary in the males. Also, at the LILRB5 locus (PP4 = 0.06),
   while the association of SNP rs12977062 on chromosome 19 with PD-L1
   reached genome-wide significant in females (P = 4.34 × 10^–8,
   Zscore = 5.476), no significant association was detected in males
   (P = 0.37, Zscore = -0.901). Similarly for PD-L1 levels, females with
   the T allele produced the highest levels while their counterpart males
   produced the least. We further observed female-specific effect at the
   SRD5A2 locus (PP4 = 0.19) where the association strength of the lead
   SNP rs608574 on chromosome 2 with IL18 was significant
   (P = 6.95 × 10^–10, Zscore = -6.167) in females, but not in males
   (P = 0.051, Zscore = -1.952). Individuals carrying the C allele
   produced the highest levels of IL18 but the average levels for the C
   and G alleles were comparable in the males. The female-specific genetic
   variants were also apparent in the analysis considering only the
   largest cohort (Figure [78]S1B), suggesting that sample size
   differences cannot account for this effect. In all the three genomic
   regions, we observed lower PP2 values compared to PP1 values,
   suggesting the presence of significant associations only in. females
   (Table [79]S4). Furthermore, we observed an interaction effect between
   sex and the genotypes of the sex-specific pQTLs, and all but one was
   statistically significant (Figure [80]S2, Table [81]S5), supporting the
   robustness of the findings.
Fig. 5.
   [82]Fig. 5
   [83]Open in a new tab
   Illustration of colocalization analysis results in females. A Locus
   comparison plots of shared genomic loci between males and females (B)
   Locus comparison plots of female-specific genomic loci. The protein
   names and strength of association (-log[10] p-values) are displayed on
   the vertical axis against the chromosomal physical position on the
   horizontal axis. The posterior probability values (PP4) are also
   indicated on the plot
Conditional analysis identifies secondary pQTL signals
   Colocalization analysis with coloc is built under the assumption of a
   single causal variant per trait and we observed instances of complex LD
   relationship between some index pQTL variants and neighboring SNPs. For
   example, variants associated with TNFRSF9 (Fig. [84]2) and IL18 and
   PDL1 (Fig. [85]3) among the males- and females- specific loci. After
   performing conditional analysis on the top pQTL variants, as summarized
   in Table [86]S6, no genetic variants reached the significant threshold
   (pC < 5 × 10^–8) for IL18 and CX3CL1 proteins. In the case of TNFRSF9
   conditioned on rs3213964, 28 pQTL variants were significant, However,
   these variants showed various degree of LD correlation with the top
   pQTL variant (R^2 ranging from 0.239 to 0.754). Interestingly, for CD5
   conditioned on rs11207327, the four significant pQTL variants
   (rs2764912, rs11207331, rs10889121, and rs4912382) have no correlation
   with the top SNP (R^2 ranging from 0 to 0.1). Similar observation was
   made for PDL1 conditioned on rs12977062. The four significant pQTL
   variants (rs2361797, rs73058787, rs380731, and rs36068997), correlated
   poorly with the index pQTL (R^2 ranging from 0 to 0.1), suggesting the
   presence of multiple or independent associations at the sex-specific
   loci.
Exploring the proportion of sex-specific pQTLs across the genome
   After identifying sex-specific pQTLs using the genome-wide significant
   threshold, we next wondered to what extent the pQTLs are sex-dependent
   if we reduce the statistical threshold. To do this, we selected pQTL
   variants for each of the 66 proteins used for meta-analysis separately
   for each sex at a nominal threshold (P < 0.05). In the males, the total
   number of pQTL variants association (P < 0.05) range between 160,302
   and 198, 838 depending on the protein we tested (Fig. [87]6A). The
   number of these pQTLs with significant association (P < 0.05) only in
   males ranged between 152,955 and 189,844 (Figure S3A), which in
   percentage terms, spans between 3.80% to 4.71% of the total genetic
   variants of over 4.0 million interrogated (Figure [88]S3C). In females,
   the identified pQTLs (P < 0.05) with consistent effect size direction
   in both cohorts for all the 66 proteins ranged from 178,773 to 199,759
   (Fig. [89]6B). The number of these pQTL variants that were significant
   (P < 0.05) only in females ranged from 169,779 to 191,123 (Fig. [90]6B)
   which shows a percentage of 4.21% to 4.74% (Figure [91]S3C).
Fig. 6.
   [92]Fig. 6
   [93]Open in a new tab
   Distribution of pQTLs after meta-analysis in both cohorts. Total number
   of pQTL variants with consistent effect size direction in both cohorts
   after meta-analysis (P < 0.05) in males (A) and in females (B). C
   Number of male-specific pQTL variants with strong suggestive cut-off
   (P < 5 X 10^–5) per protein after meta-analysis of both cohorts in
   males. D Number of female-specific pQTL variants with strong suggestive
   cut-off (P < 5 X 10^–5) per protein after meta-analysis of both cohorts
   in females
   Aiming to minimize the degree of false positive associations, we next
   applied a strong suggestive association threshold (P < 5 X 10^–5). The
   numbers of sex-specific pQTLs ranged from 79 to 309 for the males
   (Fig. [94]6C) and from 64 to 520 (Fig. [95]6D) in females. The limited
   percentage of sex-specific pQTLs at a lenient threshold or the limited
   numbers of sex-specfifc pQTL variants (P < 5 X 10^–5) relative to all
   the SNPs tested, suggests that most of the genetic variants associated
   with protein abundances act in sex independent manner. This is
   supported by the findings of colocalization analysis among the
   genome-wide significant hits, whereby approximately 67% (males = 4/6,
   females = 6/9) of pQTLs have similar effects in both sexes.
Functional and regulatory annotation of sex-specific pQTLs
   The functional consequences of the sex-specific pQTLs with suggestive
   evidence of association (P < 5 × 10^–5) shows that most of these pQTLs
   are within introns and intergenic regions, for both the males
   (Fig. [96]7A) and females (Fig. [97]7B). The sex-specific pQTLs were
   least represent in coding genomic regions such as UTRs and exons. We
   next examined the regulatory potential of the of the sex-specific pQTLs
   and observed that 39.5% and 37.1% of the pQTLs identified had
   RegulomeDB score of 5 for both the males (Fig. [98]7C) and females
   (Fig. [99]7D) respectively. We observed significant enrichment of pQTLs
   with RegulomeDB score of 5 among genetic variants with RegulomeDB
   scores in the database based on chi-squared test (X-squared = 754.57,
   df = 1, p-value = 2.26 × 10^–16). This observation indicates that most
   of the identified pQTLs are likely to alter transcription factor
   binding sites (TFBs) and are therefore regulatory. In fact, lower
   RegulomeDB scores provide evidence that, the pQTL variants are located
   in a functional region and pQTL variants falling within the category
   one was as low as 1.5% and 1.4% in the males and females respectively.
Fig. 7.
   [100]Fig. 7
   [101]Open in a new tab
   Illustration of functional annotation results of sex-specific pQTLs (p
   value = 5 × 10^–5). Distribution of sex-specific pQTL variants’
   functional consequences in males (A) and females (B). Bar plots
   distributions of RegulomeDB scores indicating the regulatory potential
   of pQTL variants for males (C) and females (D). Interpretation of the
   scores is sandwiched in between the bar plots. Line plots of
   Transcription factor enrichment analysis of male-specific pQTL variants
   (E) and female-specific pQTL variants (F). The top 25 enriched TFs are
   ploted on the x-axis and the level of significance are indicated in the
   color legend
   Given that most of the identified pQTLs were predicted to alter TF
   binding sites, we mapped the sex-specific pQTL variants with suggestive
   evidence association to TFBs to uncover the affected transcription
   factors (TFs). Several TFs were identified after TF enrichment analysis
   in the males (Fig. [102]7E) with DUX4, CREB1 and ELK4 being the three
   most enriched TFs. In the females (Fig. [103]7F), TFs such as FOXP2,
   ELK1, and FOXA1 were mostly enriched. Knowing that pQTL variants change
   or disrupts TF-binding is crucial to understand the molecular
   mechanisms of how pQTL variants impact protein abundances.
Biological interpretation of TFs and gene sets
   Next, we sought to gain mechanistic insight into the predicted TFs (see
   methods) and curated gene sets (mapping pQTL variants to genes) from
   the sex-specific pQTL variants (P < 5 × 10^–5). To do this, we
   performed over-representation analysis to identify enriched biological
   pathways. Among the predicted TFs (Table [104]S7 & Table [105]S8), we
   selected significant TFs with P-value < 0.05 which means that only TFs
   identified not by chance are used for further analysis. According to
   the Reactome database, several pathways were significantly enriched for
   the TFs in the males (Fig. [106]8A, Table [107]S9). Pathways such as
   FOXO-mediated transcription of cell cycle genes and Signaling by
   Activin and Signaling by NODAL were identified. We also identified
   enrichment of TFs in pathways such as Endogenous sterols, ESR-mediated
   signaling and Estrogen-dependent nuclear events downstream of
   ESR-membrane signaling in the females (Fig. [108]8B, Table [109]S10).
   GO ontology analysis shows that the TFs for the males (Figure [110]S4A)
   and females (Figure [111]S4B) are generally involved in metabolic,
   biological and developmental processes.
Fig. 8.
   [112]Fig. 8
   [113]Open in a new tab
   Biological interpretation of sex-biased pQTLs variants (p
   value = 5 × 10^–5). Pathway enrichment analysis using significantly
   enriched TFs matched to the sex-specific pQTL variants in males (A) and
   in females (B). Pathway enrichment analysis of annotated gene sets in
   males (C) and in females (D). P-value < 0.05 with Bonferroni multiple
   correction method was set for significantly enriched terms (category).
   Genes and or TF names related to the pathways are displayed on the
   circular plot
   To explore biological relevance of the gene sets using the Reactome
   database, while pathways such as Extracellular matrix organization,
   Collagen formation, Collagen biosynthesis and modifying enzymes and
   Protein–protein interactions at synapses were identified among the top
   15, others such as Regulation of insulin secretion were significant
   pathways discovered for males (Fig. [114]8C, Table [115]S11), while for
   females the mapped gene-sets were enriched for pathways such as
   Ca2 + pathway, Platelet homeostasis, Muscle contraction (Fig. [116]8D,
   Table [117]S12). The Neuronal system however appeared as common pathway
   in both sexes. Gene ontology analysis for males (Figure [118]S5A) and
   females (Figure [119]S5B) revealed that the curated genes-sets are
   overall involved in biological, metabolic, developmental and
   reproduction processes. The identification and understanding of
   sex-specific pathways is crucial to design effective therapeutics,
   especially for the diseases that are expressed differently between
   sexes.
Sex-dependent pQTL variants overlap with sex-related molecular traits
   We cross referenced the genome-wide significant index pQTL variants
   with other phenotypic traits, not limited to gene expression,
   metabolites, and epigenetic markers, to explore any potential
   biological pleiotropy. Overall, in the males, all the pQTLs variants
   showed association with various diseases or traits and expression
   levels of genes (Figure [120]S6). For example, male-specific pQTL
   rs3213946 is also eQTL for several genes (ITGAV, FAM171B, ATP6V1B2 and
   ZC3H15) and associated with diseases such as inflammatory bowel
   disease, coronary artery disease, and attention deficit hyperactivity
   disorder. Another male-specific pQTL rs11207327 linked with CD5 levels
   also modulates the expression of genes (JUN, KRT79, TACSTD2), and it is
   associated with diseases and traits (Asthma, self-reported haemorrhoids
   and treatment with estriol product). Evidence of prior associations
   with metabolites (e.g., X-12441, citrate and Hypoxanthine) and
   epigenetic markers (e.g., H3K4me1, H3K27ac and cg07810476) were also
   uncovered.
   Similarly, in females, the index pQTL variants are also eQTLs and
   correlate with other traits (Figure [121]S7). For instance, SNP
   rs608574 linked with IL18 and showing association only females,
   regulates genes such as SRD5A2, CAPN13, SLC30A6, NLRC4, XDH and SPAST,
   and it is as well associated to traits and diseases (e.g., height, age
   started oral contraceptive pill, single delivery by caesarean section,
   treatment with diltiazem, chronic sinusitis, birth control pills and
   fistulae involving female genital tract). Another pQTL SNP rs12977062
   associated with PD-L1 regulates multiple genes, not limited to GP6,
   TPM3P6 and LILRB5. Associations with epigenetic markers (e.g.,
   percent-splice-in and cg15691140), other traits and diseases (treatment
   with noriday tablet, convulsions, birth weight, height, treatment with
   tridestra tablet and atherosclerosis) were also detected. These
   observations cross validate the identified pQTL variants and justifies
   the roles of pQTLs in explaining the mechanisms of diseases.
Discussion
   This study investigated the contribution of host genetics to sex
   differences in inflammatory proteins production capacity by conducting
   meta-analysis of pQTL summary statistics in males and females
   separately using two population-based cohorts. Given the well-known
   sexual dimorphism in most complex traits and diseases, there is growing
   awareness for large scale genetic studies to unravel sex-specific
   genetic factors.
   In this study, we provide evidence for the contribution of common
   autosomal SNPs for differences in inflammatory biomarker between sexes
   by identifying genome-wide significant pQTLs in the sex-stratified
   analyses for males (6 pQTLs) and for females (9 pQTLs). In general, we
   observed that majority of the pQTL effects, from the tested autosomal
   SNPs (4,028,465) on the 66 proteins, are shared between males and
   females with approximately 5% displaying sex-specific effects. This
   observation is concordant with previously published evidence,
   highlighting the lack of strong sex-biased genetic effects on complex
   traits. For example, large scale twin studies across 50 human traits
   observed sex-specific genetic factors for 25% of the traits with
   limited sex-specific genetic variants except for the apparent
   puberty-related traits [[122]25]. Another study targeting specifically
   20 neuropsychiatric and behavioral traits, showed that between-trait
   genetic correlation estimates were not significantly different between
   males and females [[123]26]. In fact, a recent study showed that
   sex-specific eQTLs do not account for the sex-specific trait
   associations and demonstrated through power analysis that millions of
   GWAS samples are required to detect sex-specific trait associations
   driven by sex-biased eQTLs [[124]27]. Therefore, even though the
   limited percentage of sex-specific pQTLs identified in our study is in
   line with previous findings, statistical power could account for the
   minimal percentage observed as well.
   Albeit most of the genome-wide pQTLs appeared to exhibit shared effect
   between males and females, colocalization analyses also highlights
   specific genomic-loci with sex-biased effects. The two loci harboring
   male-specific pQTLs are LINC01135 and ITGAV, which are lncRNA and
   protein coding genes respectively. While the function of LINC0113 gene
   is not completely known, recent study has demonstrated its role in
   prostate cancer- skin cancer affecting males [[125]28]. Interestingly,
   the sentinel pQTL variants at these loci (ITGAV (rs3213964) and
   LINC0113 (rs11207327)) were previously reported to be associated with
   various diseases and intermediate traits such as gene expression and
   metabolites which are likely to mediate the observed associations. The
   SNP rs3213964 at ITGAV gene is evident through cross validation
   approach to be associated with male-related phenotypic traits such as
   malignant neoplasm of testis, sitting height and self-reported
   testicular cancer which supports the stronger associations detected in
   males.
   Furthermore, three pQTLs (CNOT10 (rs67015567), LILRB5 (rs12977062) and
   SRD5A2 (rs608574)) were identified to modulate proteins specifically in
   females. The CNOT10 gene is related to metabolic pathways as it is
   predicted to be involved in catabolic process and also participates in
   negative regulation of translations. The LILRB5 gene is involved in the
   regulation of immune pathways and its role in negative regulation of
   cytokine production is reported [[126]29]. We further showed that the
   lead SNP rs12977062 at the LILRB5 gene is correlated with expression of
   other genes and epigenetic markers. The SRD5A2 gene is involved in
   pathways such as metabolism of steroid hormones and widely known for
   the regulation of testosterone biosynthetic process. Although
   testosterone is traditionally thought of as males sex hormones, females
   also produce it but in lower quantities [[127]30]. Supporting this, the
   SRD5A2 gene also converts progesterone or corticosterone into their
   corresponding 5-alpha-3-oxosteroids [[128]31] and mutations in the
   SRD5A2 gene has been associated with human intersex condition termed as
   pseudohermaphroditism [[129]32]. These unique characteristics of this
   gene makes it possible to function in either sex, but we speculate that
   genetic polymorphisms in this gene might regulate phenotypes
   differently. In line with this claim, the lead SNP rs608574 at SRD5A2
   gene overlap with female dominated traits, among others treatment with
   livial tablet and fistulae involving female genital tract, supporting
   the stronger association of this variants specifically in females.
   Surprisingly, all the genome-wide significant pQTL variants identified
   with sex-specific effects are trans-pQTLs, which suggests that the
   uncovered pQTL variants may not have direct effect on the protein
   coding gene but other proteins and pathways are involved in the
   mechanism underlying the genetic sex-differences in protein levels.
   Interestingly and supporting this argument, over-representation
   analysis of genes mapping to the female-specific pQTLs implicated
   pathways such as Ca2^+ pathway and cardiac conduction which has
   functional interplay in cardiovascular diseases, with increased
   susceptibility among females in developed countries [[130]33].
   Regulation of calcium ion (Ca2^+) cycle is important for cardiac
   contraction and relaxation and estrogen levels in plasma can affect
   cardiac function [[131]34]. Aside geometric differences of the healthy
   heart, functional differences such as contractility exist between sexes
   with smaller cardiac output and larger ejection fraction are reported
   in females compared to males [[132]35]. On the other hand, pathway
   analysis highlighted the function of genes annotated to the
   male-specific pQTLs to collagen formation and protein–protein
   interaction.
   Furthermore, large proportion of these pQTLs variants are TFBS
   variants, suggesting their gene regulatory role. The TFs identified are
   enriched among sex-related pathways. For instance, in the females are
   Endogenous sterols, ESR-mediated signaling and Estrogen-dependent
   nuclear events downstream of ESR-membrane signaling. Higher
   concentrations of endogenous sex steroids and mutations of estrogen
   signaling receptor (ESR) genes are some of the underlying causes of
   breast cancer [[133]36, [134]37]. In the males, pathways such as
   signaling by activin and signaling by nodal were detected. Activin is
   mainly produced in the male reproductive tract and helps main cell–cell
   interactions, especially in the testis and prostate [[135]38].
   Similarly, the nodal signaling pathway is known to regulate fetal
   testicular development and uncontrolled nodal signaling has been
   implicated in testicular cancer [[136]39]. While these downstream
   observations strongly support the validity of the identified
   sex-dependent pQTL variants, further mechanistic studies are required
   to discern the mechanisms by which these pathways drive sex-specific
   pQTLs.
Limitations of the study
   It is important to acknowledge that, exclusion of the sex chromosome
   variants is a major limitation. Despite several evidence pointing to
   the autosomal genomic region to driving phenotypic sex-differences
   [[137]6], the X chromosome is known to accommodate the largest number
   of genes related to the immune system [[138]40]. Also, even though, we
   employed meta-analysis of two different cohorts to increase statistical
   power, upscaling of the sample sizes and the utilization of methods
   that consider effect size differences between males and females, could
   help identify additional genome-wide significant loci. Additionally,
   increasing the number of proteins by utilizing the Olink explore panel
   is desirable for future work. Finally, to be able to ascertain whether
   the identified pQTLs are population-specific or shared, and to
   subsequently facilitate the translation of the findings into clinical
   practice, extending this analysis to diverse populations is warranted.
Conclusion
   In conclusion, we identified and characterized sex-specific pQTLs,
   which is crucial to discerning the underlying mechanisms of complex
   diseases, traits and biomarkers. Given that proteins determine almost
   all cellular process, ultimately dictate the phenotypic expression, and
   dysregulation of proteins are also implicated in several diseases, the
   identified sex-specific pQTL variants could be use as genetic
   instruments through mendelian randomization to interrogate how these
   inflammatory mediators are causally implicated in sex-biased phenotypes
   and/or diseases. This information will eventually help understand
   disease biology and also facilitate the development of new therapeutics
   with strong efficacy in one sex.
Materials and methods
Study cohorts
   The 500FG cohort is a population-based cohort of healthy individuals of
   Western European ancestry consisting of 237 males and 296 females with
   age range of 18-75 years. The 300BCG cohort is another population-based
   cohort of 325 (142 males and 183 females) healthy individuals of
   Western European origin with age range of 18–71 years. Sex
   classification of research participants was based on sex assigned at
   birth.
   Both cohorts are part of the Human Functional Genomics Project
   (w[139]www.humanfunctionalgenomics.org), aimed at identifying the
   genetic and non-genetic determinants of the immune response.
Measurement of inflammatory protein biomarkers
   Inflammatory protein levels were measured using targeted proteomics
   (Olink® platform). We generated protein abundance of the plasma samples
   which was quantified using the inflammatory panel of 92 proteins. The
   Olink data are reported in NPX values (Normalized Protein expression)
   which are on log 2 scale. Immunoassays utilized by Olink are based on
   the Proximity Extension Assay (PEA) technology [[140]41], which makes
   use of oligonucleotide-labelled antibodies binding to their respective
   protein. When the two antibodies are brought in proximity, a DNA
   polymerase target sequence is formed, which is subsequently quantified
   by quantitative real-time polymerase chain reaction (qPCR). Each plate
   included interplate controls which are used to adjust any potential
   plate difference. NPX values were intensity normalized with the plate
   median for each assay as the normalization factor (Intensity
   Normalization v.2).
Preprocessing / filtering of protein data and normalization
   Samples that did not pass Olink internal quality control or flagged
   “Warning” were excluded, so as proteins which failed to be quantified
   in at least 75% of the samples. The remaining 73 and 70 proteins in the
   500FG and 300BCG cohorts respectively, with NPX values below the
   protein-specific detection limit (LOD) were replaced with their
   corresponding LOD values as recommended by Olink. 66 proteins were
   common between both cohorts after filtering.
Genotyping and genetic data quality control
   We previously described details of the genotyping, imputation and all
   quality control procedures for the 500FG [[141]42] and 300BCG [[142]43]
   cohorts. For the 500FG cohort, DNA samples of approximately 500
   individuals were genotyped with the commercially available SNP chip,
   Illumina HumanOminiExpress-8 v1.0 and DNA samples of 325 individuals
   were genotyped for the 300BCG cohort using the Infinium Global
   Screening Array MD version 1.0 from Ilumina SNP chip. The genotype
   calling for both cohorts were performed using Optical 0.70 with the
   default settings [[143]44]. Standard pre-imputation quality filters
   such as excluding variants with call rate ≤ 0.99 Hardy–Weinberg
   equilibrium (HWE) ≤ 0.0001 and minor allele frequency (MAF) ≤ 0.001.
   Per sample quality control such as sex discrepancies, cryptic
   relatedness and population stratification to exclude genetic outliers
   (17 and 12 in the 500FG and 300BCG cohorts respectively) were applied.
   Genotyped samples were imputed with the Michigan Imputation server
   [[144]45], with the Genome of the Netherlands Consortium, (GoNL 2014)
   and the human reference consortium (HRC r1.1 2016) reference panels for
   the 500FG and 300BCG cohorts respectively.
   We filtered out genetic variants with imputation quality score
   (R^2) < 0.3 and MAF cut-off of 5%. A total of 4,296,841 and 4,358,039
   SNPs were available for the 300BCG and 500FG cohorts respectively.
Protein QTL mapping
   The association of protein-genotype analysis was performed in a
   sex-stratified manner for both cohorts separately as protein
   measurements were not performed at the same time. We used the linear
   model function in the Matrix-eQTL R package [[145]46] for association
   analysis. Age of the participants was included in the model which was
   fitted on the inverse ranked normalized protein concentration. In the
   300BCG cohort, a total of 306 (Males = 134 and Females = 172) samples
   were retained after quality control and with matched genotype–phenotype
   data. In the case of the 500FG, the total samples used for pQTL mapping
   was 371 consisting of 184 males and 187 females. In the males, the mean
   ages are 27.20 and 29.56, and the standard deviations are 12.30 and
   14.68 for the 300BCG and 500FG cohorts respectively. In the females
   also, the mean ages are 24.66 and 24.77 and the standard deviations are
   8.60 and 10.98 for the 300BCG and 500FG cohorts respectively. Wilcoxon
   test did not show any significant age difference between both cohorts
   in each sex (P > 0.05).
Meta-analysis of males and females pQTL results
   The meta-analysis was carried out using fixed effects sample size
   weighted analysis method implemented in METAL package [[146]47], based
   on pQTL summary statistics (p-values). Here, the analysis which was
   conducted for males and females from both cohorts separately, was
   confined to the 66 proteins and 4,028,465 SNPs common between the 500FG
   and 300BCG cohorts. We extracted SNPs with consistent effect size
   direction per protein for further analysis.
Estimation of study-wide significant threshold
   To account for the multiple testing burden, the significant cut-off is
   determined based on the ratio of the 5% significant level and the
   product of the number of proteins (66) and the number of tested SNPs
   (4,028,465). The resulting p-value was 1.89 × 10^–10. Given that few
   associations surpassed this stringent threshold, we considered the
   conventional genome-wide significant threshold (5.0 × 10^–8) together
   with colocalization analysis to identify key loci exhibiting
   sex-dependent effect. For sex-specific biological insights, we used
   associations with strong suggestive threshold of 5.0 × 10^–5 in one sex
   which failed to be replicated in the opposite sex at nominally
   significant threshold of 0.05.
Colocalization of pQTLs between sexes
   Colocalization of pQTLs identified in males and females was performed
   using Bayesian colocalization method which is implemented in the coloc
   package in R [[147]48]. The default prior probability and parameters
   that a random variant in the region is causal to both traits was
   applied. For each protein-SNP genome-wide significant pair considered
   for colocalization, SNPs within a window size of 500 kb around the lead
   SNP were tested. A posterior probability (PP4 >  = 0.75) is considered
   as strong evidence of colocalization or shared genomic region between
   the sexes [[148]48]. Other hypothesis tested are PP1 and PP2 which
   indicates either males or females have significant associations in the
   tested region and PP3, which indicates that both males and females have
   significantly unique causal variants. LocusCompareR, being an R package
   was used for visualizing the results [[149]49].
Conditional analysis on index sex-specific pQTL variants
   We performed conditional analysis using the GCTA-COJO software
   [[150]50]. As the software requires effective size and standard error
   statistic as input, the METAL Zcores from the meta-analysis results
   were used to estimate the effect size (Beta) and standard error (SE)
   values using the two equations suggested by previously [[151]51].
   [MATH: Beta=Zscore/sqrt2∗MAF∗
   1-MAF<
   mrow>∗N+Zscore<
   /mrow>2 :MATH]
   1
   [MATH: SE=1/sqrt2∗MAF∗
   1-MAF<
   mrow>∗N+Zscore<
   /mrow>2 :MATH]
   2
   The parameters in the equation are defined as: sample size (N), and
   minor allele frequency (MAF). The 500FG cohort was used as the LD
   reference panel. The analysis was confined to regions around 250 kb of
   the index pQTL variants. A conditional p-value (pC) threshold of
   5 × 10^–8 was used to identify secondary hits.
Functional annotation of sex-specific pQTLs
   We selected sex-specific pQTLs with strong suggestive association
   (5.0 × 10^–5) for functional annotation. The functional consequences of
   the pQTL variants on genes functions were explored using the ANNOVAR
   method in Functional Mapping and Annotation of GWAS (FUMA) [[152]52].
   The putative regulatory potential of the pQTLs were also accessed with
   RegulomeDB [[153]53] which incorporates high-throughput, experimental,
   different data sources and computational predictions to score genetic
   variants. RegulomeDB assigns scores ranging from 1 to 6 to help
   classify SNPs and to determine genetic variants with or with regulatory
   functionality. Genetic variants with lower scores indicate stronger
   evidence of residing in a regulatory region.
Mapping sex-specific pQTLs with transcription factor binding sites
   SNP2TFBS web-based annotation tool [[154]54], was applied to study
   functional effects of genetic variants within transcription factor
   binding sites (TFBS) of the human genome. Briefly, a SNP’s effect on
   transcription factor (TF) binding is estimated with position weight
   matrix (PWM) model for the binding specificity of the corresponding
   factor. This results in a list of SNPs that overlaps with predicted
   binding sites of a specific TF. The list of TFs being mapped by the
   SNPs are generated. The TF enrichment is done by computing the ratio of
   the observed SNP hits over the expected hits for each TF.
   P-value less than 0.05 was declared as statistically significant
   threshold for TF enrichment.
Pathway over-representation analysis
   The SNP2GENE function in FUMA [[155]52] was applied to annotate
   sex-specific pQTL variants to genes. Default settings in FUMA were used
   to identify independent lead associations with the following
   parameters: r^2 < 0.6 for independent significant SNPs, 2nd r^2 < 0.1
   for index SNPs and a window size of 250 kb to determine LD blocks. The
   independent lead SNPs are then mapped to genes based on positional and
   functional information of SNPs. The curated gene sets were used as
   input for both pathway enrichment analysis and Gene Ontology (GO) slim
   analysis. Gene Ontology slim analysis provides a higher-level summary
   or term of GO ontologies or provides a broader overview of the GO
   ontology terms.
   Enrichment of the candidate gene list and TFs were estimated using
   ClusterProfiler R package for hypergeometric test [[156]55]. Bonferroni
   multiple testing correction method with q-value < 0.05 was declared as
   significant pathways. For a broader overview of the annotated terms, GO
   slim analysis was performed with the WebGsalt tool [[157]56].
Integration of pQTL variants with other molecular traits: cross validation
   We looked up for a range of phenotypes or traits that are associated
   with the identified genome-wide significant independent pQTLs with
   Phenoscaner [[158]57], which is a curated database of publicly
   available summary statistics of large-scale genotype–phenotype
   associations, not limited to NHGRI-EBI GWAS catalog and dbGAP catalogs.
   Association results of traits such as gene expression, proteins,
   metabolites, epigenetics and diseases with our pQTL variants were
   downloaded and overlap analysis was conducted. A nominal threshold of
   1 × 10^–3 was applied to select SNP-phenotype associations from the
   database.
Statistical analyses
   All statistical methods and tools used are described under the
   appropriate sections in the methods. Analysis and visualization were
   performed in R version 4.10 unless otherwise stated. Two-way Analysis
   of Variance (ANOVA) was performed to examine the potential interaction
   between SNP genotypes and Sex. The aov () function in R was used for
   conducting the analysis and interaction effects were visualized using
   the interaction. plot () function.
Supplementary Information
   [159]12864_2024_10065_MOESM1_ESM.pdf^ (9.9MB, pdf)
   Additional file 1: Figure S1. Comparing the genetic effects of
   sex-specific pQTL in 500FG cohort. Figure S2. Visualization of
   Sex-by-SNP interaction in the 500FG cohort. Figure S3. Bar plots
   distribution of sex-specific pQTLs after meta-analysis. Figure S4.
   Graphical illustration of GO slim results with Transcription Factors
   (TFs). Figure S5. Graphical illustration of GO slim results with
   curated gene sets. Figure S6. Circular barplots summarizing traits
   associated with male-specific pQTLs. Figure S7A. Circular barplots
   summarizing traits associated with genome-wide significant pQTLs in
   females. Figure S7B. Circular barplots summarizing traits associated
   with genome-wide significant pQTLs in females.
   [160]12864_2024_10065_MOESM2_ESM.xlsx^ (5.7MB, xlsx)
   Additional file 2: Table S1. Summary statistics of pQTL mapping and
   meta-anlysis results in the males group. Table S2. Summary statistics
   of pQTL mapping and meta-anlysis results in the females group. Table
   S3. Summary statistics of genome-wide significant meta-anlysis and
   colocalization results in the males group. Table S4. Summary statistics
   of genome-wide significant meta-anlysis and colocalization results in
   the females group Table S5. Summary statistics of analysis of variance
   (ANOVA) results using the 500FG cohort. Table S6. Summary statistics of
   conditional analysis results of sex-specific loci. Table S7. List of
   TFs enriched or linked to male-specific pQTL variants. Table S8. List
   of TFs enriched or linked to female-specific pQTL variants. Table S9.
   Details of enriched pathways of TFs linked with male-specific pQTL
   variants. Table S10. Details of enriched pathways of TFs linked with
   female-specific pQTL variants. Table S11. Enriched pathways of genes
   mapped to male-specific pQTL variants. Table S12. Enriched pathways of
   genes mapped to female-specific pQTL variants.
Acknowledgements