Abstract

   Biological mechanisms underlying multimorbidity remain elusive. To
   dissect the polygenic heterogeneity of multimorbidity in twelve complex
   traits across populations, we leveraged biobank resources of
   genome-wide association studies (GWAS) for 232,987 East Asian
   individuals (the 1st and 2nd cohorts of BioBank Japan) and 751,051
   European individuals (UK Biobank and FinnGen). Cross-trait analyses of
   respiratory and cardiometabolic diseases, rheumatoid arthritis, and
   smoking identified negative genetic correlations between respiratory
   and cardiometabolic diseases in East Asian individuals, opposite to
   the positive associations in European individuals. Associating
   genome-wide polygenic risk scores (PRS) with 325 blood metabolome and
   2917 proteome biomarkers supported the negative cross-trait genetic
   correlations in East Asian individuals. Bayesian pathway PRS analysis
   revealed a negative association between asthma and dyslipidemia in a
   gene set of peroxisome proliferator-activated receptors. The pathway
   suggested heterogeneity of cell type specificity in the enrichment
   analysis of the lung single-cell RNA-sequencing dataset. Our study
   highlights the heterogeneous pleiotropy of immunometabolic dysfunction
   in multimorbidity.

   Subject terms: Genome-wide association studies, Type 2 diabetes,
   Dyslipidaemias, Chronic obstructive pulmonary disease, Asthma
     __________________________________________________________________

   Here, the authors perform cross-trait analyses of respiratory and
   cardiometabolic diseases, rheumatoid arthritis, and smoking and
   identify negative genetic correlations between respiratory and
   cardiometabolic diseases in individuals of East Asian ancestry, in
   contrast to the positive association in individuals of European
   ancestry.

Introduction

   Multimorbidity, two or more coexisting diseases in an individual,
   burdens individuals and societies globally^[64]1,[65]2. People with
   multimorbidity show impaired physical functions, frequent
   hospitalization, and high mortality^[66]3–[67]5. Multimorbidity
   consequently increases healthcare costs in individuals, for instance,
   those with asthma and chronic obstructive pulmonary disease
   (COPD)^[68]6,[69]7. Unveiling the biological mechanism of
   multimorbidity can provide a stepping stone for personalized medicine
   and eventually contribute to decreasing the disease-associated burden
   in terms of healthcare and socio-economy. However, the complex
   structure of diseases and their interactions has prevented a
   comprehensive understanding of multimorbidity. As a population-based
   study shows that the combination of traits comprising multimorbidity
   differs between populations^[70]8, we hypothesized that genetic
   analysis could disentangle the biological and epidemiological
   complexity of multimorbidity.

   For decades, genome-wide association studies (GWAS) have identified
   genetic risks of complex diseases^[71]9,[72]10. Recent studies
   generally identify positive genetic correlations among diseases in
   the same category^[73]11–[74]13, the classification criteria for which
   are based on increasing knowledge about organ functions and diagnostic
   tests^[75]14. On the other hand, diseases in different categories
   present complex genetic correlations. A preceding study dissected the
   genome into regions with high linkage disequilibrium (LD) and described
   the negative local genetic correlations in the major histocompatibility
   complex (MHC) regions among multicategorical traits, such as that
   between asthma and blood triglyceride levels^[76]15.

   The genetic heterogeneity of multimorbidity may exist among different
   populations but remains unclear because of the limited GWAS datasets in
   non-European (non-EUR) populations^[77]16. Therefore, assessing
   phenotypic correlations among diverse populations is effective in
   seeking clues for research investigating heterogeneous genetic
   correlations. In respiratory and cardiometabolic diseases, some studies
   reported the phenotypic correlations of lipid metabolism-related traits
   with asthma and COPD^[78]17–[79]20. Obesity occurs less frequently with
   severe asthma in the East Asian (EAS) population than in the EUR
   population^[80]21.
   Furthermore, individuals with COPD are more underweight than healthy
   individuals in EAS^[81]22,[82]23 but not in EUR^[83]24. In contrast, a
   prospective observational study for EAS population showed that the
   prevalence of dyslipidemia is associated positively with the severity
   of asthma and the frequency of asthma exacerbation^[84]25. Comparing
   the genetic correlations between populations can offer novel insights
   into the biological mechanism underlying the heterogeneous phenotypic
   correlations. As respiratory, autoimmune, and cardiometabolic diseases
   are related to the immune system and inflammation^[85]26–[86]28, the
   dissimilar phenotypic correlations may be derived from the immune
   response network associated with lipid metabolism. The development of
   GWAS downstream analysis will contribute to the precise understanding
   of the genetic heterogeneity of multimorbidity.

   Recently developed pathway polygenic risk score (PRS) analysis enables
   researchers to analyze the direction of functional associations between
   traits^[87]29. Pathway PRS accounts for genomic substructure and
   reflects disease heterogeneity. Instead of aggregating the estimated
   effects of risk alleles across the entire genome (genome-wide PRS),
   pathway PRS aggregates risk alleles per pathway. Although other
   post-GWAS analyses do not account for cross-trait associations per
   function, pathway PRS provides a detailed insight into the genetics
   underlying the common and heterogeneous dysfunctions in multimorbidity.
   Recent developments in Bayesian methods for genome-wide
   PRS^[88]30–[89]32 may improve how well pathway PRS predicts genetic
   liability.
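
   The difference between genome-wide and pathway PRS described above can
   be sketched as follows. This is a minimal illustration only, not the
   PRS-CSx or PRSet implementation: the function `polygenic_score`, the
   variant identifiers, the effect-size weights, the dosages, and the
   gene-set mapping are all hypothetical.

```python
from typing import Dict, Iterable, Optional


def polygenic_score(dosages: Dict[str, float],
                    weights: Dict[str, float],
                    variants: Optional[Iterable[str]] = None) -> float:
    """Weighted sum of risk-allele dosages.

    If `variants` is given, only those variants contribute (pathway PRS);
    otherwise every weighted variant contributes (genome-wide PRS).
    """
    selected = weights.keys() if variants is None else variants
    return sum(weights[v] * dosages.get(v, 0.0) for v in selected if v in weights)


# Hypothetical per-variant effect sizes (not taken from the study).
weights = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.20, "rs4": 0.02}
# Hypothetical risk-allele dosages (0, 1, or 2) for one individual.
dosages = {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1}
# Hypothetical assignment of variants to a gene set.
pathway_variants = {"lipid_pathway": ["rs1", "rs2"]}

genome_wide_prs = polygenic_score(dosages, weights)
pathway_prs = {name: polygenic_score(dosages, weights, snps)
               for name, snps in pathway_variants.items()}
```

   The same weights feed both scores; only the set of contributing
   variants changes, which is what allows a cross-trait association to be
   attributed to a specific function.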

   Here, we performed GWAS for twelve complex traits relevant to the
   immune system and inflammation, namely respiratory and cardiometabolic
   diseases, rheumatoid arthritis (RA), and smoking, in EAS (n = 232,987)
   and EUR (n = 408,552) populations. For the multi-population GWAS, we
   leveraged data obtained from BioBank Japan (BBJ) and UK Biobank
   (UKB)^[90]33,[91]34. We merged summary statistics from FinnGen^[92]35
   (n = 342,499) in EUR population using a standard fixed-effect
   meta-analysis and compared the global and local genetic correlations in
   EAS and EUR populations. We constructed genome-wide PRS to assess the
   genetic correlation in the individuals with and without multimorbidity.
   Utilizing the blood metabolome and proteome datasets of the EAS (BBJ1)
   and EUR individuals (UKB), we assessed the associations of respiratory
   diseases with circulating lipid biomarkers. We then constructed
   Bayesian pathway PRS using PRS-CSx^[93]32 to identify pathways with
   cross-trait associations in the immune system and lipid metabolism.
   Finally, we applied scDRS^[94]36 to investigate phenotype-relevant
   cells based on the human lung single-cell RNA-sequencing (scRNA-seq).

Results

Characteristics of GWAS meta-analysis obtained from three biobank resources

   An overview of this study is shown in Fig. [95]1. To investigate the
   heterogeneity of genetic correlations in individuals with and without
   multimorbidity related to the immune system and inflammation, we
   leveraged the biobank resources of BBJ, UKB, and FinnGen. BBJ collected
   about 200,000 participants for its first cohort (BBJ1) and 67,000
   participants for its second cohort (BBJ2)^[96]33. This study enrolled
   individuals with asthma, COPD, interstitial lung disease (ILD), RA,
   smoking, obesity, dyslipidemia, type 2 diabetes (T2D), hypertension,
   coronary artery disease (CAD), heart failure (HF), or stroke. Cases
   were individuals with the target phenotypes, and controls were those
   without target or related phenotypes (Supplementary Tables [97]1 and
   [98]2). Briefly, we analyzed the samples of EAS (BBJ1: 801–49,217
   cases; and BBJ2: 962–17,342 cases) and EUR (UKB: 2199–126,436 cases;
   and FinnGen: 2922–98,683 cases). We conducted GWAS for the BBJ1, BBJ2,
   and UKB individuals and obtained FinnGen GWAS summary statistics. We
   then meta-analyzed the GWAS summary statistics for each population and
   phenotype using an inverse-variance-weighted fixed-effect method
   implemented in RE2C^[99]37.
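
   For reference, the inverse-variance-weighted fixed-effect estimate for
   a single variant can be sketched as below. This is the textbook
   estimator only and does not reproduce RE2C itself; the function name
   `ivw_fixed_effect` and the example inputs are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_fixed_effect(betas: Sequence[float],
                     ses: Sequence[float]) -> Tuple[float, float, float]:
    """Combine per-cohort effect sizes with inverse-variance weights.

    Returns the pooled effect, its standard error, and a two-sided P-value
    from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return beta, se, p_value


# Hypothetical effect sizes and standard errors for one variant in two cohorts.
pooled_beta, pooled_se, pooled_p = ivw_fixed_effect([0.10, 0.20], [0.05, 0.05])
```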

Fig. 1. The study overview.

   We performed a GWAS meta-analysis on twelve complex traits examining
   232,987 East Asian individuals (EAS) from BioBank Japan (BBJ) and
   751,051 European individuals (EUR) from UK Biobank and FinnGen. We
   estimated the heritability and genetic correlations among the complex
   traits and found significant negative genetic correlations between
   respiratory and cardiometabolic diseases in BBJ (bottom left corner).
   Association analyses for genome-wide polygenic risk scores (PRS) and
   nuclear magnetic resonance (NMR) metabolite and Olink protein
   biomarkers showed the negative associations between regression
   coefficients for dyslipidemia and respiratory diseases (bottom left).
   Cross-trait pathway association analysis using Bayesian pathway PRS
   detected five pathways with negative risk associations, the functions
   of which regulate lipid metabolism (bottom right). Further pathway
   enrichment analysis of cell types demonstrated the enrichment of the
   lipid pathway in T cells for asthma (bottom right corner).

Heterogeneity of cross-trait genetic correlations across populations

   We evaluated the genetic multimorbidity among respiratory and
   cardiometabolic diseases with multifaceted approaches. We applied
   LDSC^[102]38 to estimate the heritability of the twelve phenotypes and
   analyze the global genetic correlations for each population
   (Fig. [103]2 and Supplementary Data [104]1). The direction of
   phenotypic and genetic correlations (r[g]) was concordant in most trait
   pairs. The EAS analysis showed negative values of genetic correlations
   in most of the 28 pairs of cardiometabolic diseases × respiratory and
   autoimmune diseases, while there were positive values in the EUR
   analysis. We then identified the negative genetic correlations
   satisfying the significance level (P < 0.05/66) in the four disease
   pairs of EAS analysis (asthma−dyslipidemia: r[g] = −0.29,
   P = 7.5 × 10^−6; COPD−dyslipidemia: r[g] = −0.26, P = 6.0 × 10^−4;
   asthma−T2D: r[g] = −0.15, P = 6.0 × 10^−4; and RA−hypertension:
   r[g] = −0.25, P = 7.0 × 10^−4). The genetic correlations between
   respiratory and cardiometabolic diseases in EAS population remained
   negative among the independent datasets (Supplementary Fig. [105]1). To
   further investigate the negative genetic correlation between asthma and
   dyslipidemia, we analyzed additional EAS GWAS datasets (asthma from
   Tohoku Medical Megabank [TMM] and dyslipidemia from Korean Genome and
   Epidemiology Study [KoGES])^[106]39,[107]40. We observed that 9/12 of
   the pairs showed negative values of r[g] (Supplementary Fig. [108]2).
   Since these disease pairs displayed positive genetic correlations in
   the EUR analyses, our results highlighted the heterogeneity of genetic
   correlations between the two populations.
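
   The significance level used above follows from the number of unordered
   pairs among the twelve traits. The short sketch below reproduces that
   arithmetic and checks the four reported P-values against it; the
   variable names are illustrative.

```python
from math import comb

n_traits = 12
n_pairs = comb(n_traits, 2)      # unordered trait pairs: 66
alpha = 0.05 / n_pairs           # Bonferroni-corrected significance level

# P-values reported in the text for the EAS analysis.
reported = {
    "asthma-dyslipidemia": 7.5e-6,
    "COPD-dyslipidemia": 6.0e-4,
    "asthma-T2D": 6.0e-4,
    "RA-hypertension": 7.0e-4,
}
significant = {pair: p < alpha for pair, p in reported.items()}
```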

Fig. 2. Analysis of heritability, genetic correlations, and phenotypic
correlations in EAS and EUR populations.

   a A bar plot of the heritability for twelve complex traits in EAS and
   EUR populations. Trait labels are colored based on the disease
   categories. b A heatmap of genetic correlations in the twelve complex
   traits colored by LDSC genetic correlation estimates. P-values of the
   two-sided tests are adjusted using Bonferroni corrections. The upper
   and lower triangular matrices show EAS and EUR analyses, respectively.
   c A heatmap illustrating the phenotypic correlations in the twelve
   complex traits colored by the natural logarithm of odds ratio (OR).
   P-values are calculated from two-sided Fisher’s exact tests and
   adjusted using Bonferroni corrections. EUR populations included UKB
   samples. OR was calculated via Fisher’s exact tests. h^2: heritability.
   *: P[uncorrected] < 0.05/66. CAD: coronary artery disease; COPD:
   chronic obstructive pulmonary disease; HF: heart failure; ILD:
   interstitial lung disease; RA: rheumatoid arthritis; and T2D: type 2
   diabetes.

   To assess the influence of polygenicity on the heterogeneity of
   cross-trait genetic correlations, we conducted a local genetic
   correlation analysis. We partitioned the genome into blocks with high
   levels of LD using the LAVA supplementary program^[111]15 (EAS: 2,197
   blocks; and EUR: 2,495 blocks). We then applied SUPERGNOVA^[112]41 for
   a local genetic correlation analysis for each population (Supplementary
   Fig. [113]3). After calculating per-block local genetic correlations of
   all LD blocks for all 66 disease pairs, we compared the proportion of
   significant blocks with negative correlations between EAS and EUR
   populations. We observed the consistency of positive directions in
   phenotypic and genetic correlations between asthma and obesity for both
   populations^[114]19,[115]21 (38 and 3 positive blocks in EUR and EAS,
   respectively). Among the phenotype pairs of respiratory and
   cardiometabolic diseases, LD blocks with negative correlations were
   the majority in twelve EAS pairs and one EUR pair. While we confirmed
   that the genetic correlation between
   COPD and dyslipidemia supported the population difference in phenotypic
   correlations (3/5 blocks in EAS and 1/53 blocks in EUR with significant
   negative correlations), we saw several genetic correlations that
   disagreed with the known phenotypic correlations^[116]25,[117]42
   (asthma–dyslipidemia, asthma–T2D, and COPD–T2D in EAS analysis). The
   pair of asthma and dyslipidemia showed a remarkable difference in the
   proportion of blocks with negative correlations (11/12 blocks in EAS
   and 7/31 blocks in EUR). To further validate the negative genetic
   correlation between asthma and dyslipidemia, we analyzed asthma from
   TMM and dyslipidemia from KoGES^[118]39,[119]40. Among the pairs of
   asthma-dyslipidemia EAS GWAS datasets, we observed significant LD
   blocks in 10/12 pairs (Supplementary Fig. [120]4). LD blocks with
   significant negative correlations were the majority among the 9/10
   pairs, supporting the negative local genetic correlation between asthma
   and dyslipidemia in EAS population. The analyses of EAS population
   showed the negative genetic correlation between asthma and dyslipidemia
   at both global and local levels.

   We used Popcorn^[121]43 to analyze the heterogeneity of genetic
   correlations across the populations and to identify whether respiratory
   or cardiometabolic diseases have heterogeneity toward genetic
   correlations (Supplementary Fig. [122]5). As described in the LDSC
   within-population analysis, many pairs of respiratory and
   cardiometabolic diseases presented the opposite direction of genetic
   correlations in EAS population. We next focused on cross-population
   analysis between EAS and EUR populations. Although cardiometabolic
   diseases in EAS and respiratory diseases in EUR had positive
   correlations (for instance, EAS T2D and EUR COPD: r[g] = 0.19,
   P = 7.8 × 10^−5), respiratory diseases in EAS suggested the negative
   genetic correlations with cardiometabolic diseases in EUR (EAS COPD and
   EUR T2D: r[g] = −0.22, P = 0.014). Our results implied that EAS
   respiratory diseases had the opposite genetic risk components from
   cardiometabolic diseases.

Directions of cross-trait associations in genome-wide polygenic risk scores
analysis

   To validate the per-individual negative genetic correlations between
   respiratory and cardiometabolic diseases in the independent datasets,
   we constructed Bayesian PRS using PRS-CSx^[123]32 for each phenotype
   and population (Fig. [124]3a). For the PRS analyses, we assigned the
   BBJ1 (EAS) and FinnGen (EUR) to the training datasets and the BBJ2
   (EAS) and UKB (EUR) to the testing datasets. We showed the predictive
   performances of genome-wide PRS in Supplementary Fig. [125]6. The
   average proportion of heritability explained by genome-wide PRS was
   53.5% in EAS and 61.1% in EUR analyses, comparable to the preceding PRS
   study ([126]Supplementary Methods)^[127]44. The standardized regression
   coefficients (β) of genome-wide PRS showed consistency with the
   estimates of genetic correlations (Supplementary Fig. [128]7). Because
   the previous study of genetic correlations in UKB EUR individuals with
   multimorbidity showed the positive genetic correlations of asthma with
   dyslipidemia and cardiometabolic diseases^[129]45, we hypothesized that
   cross-trait genetic correlations between respiratory and
   cardiometabolic diseases might be in the same direction among the
   individuals with and without multimorbidity. For validation, we
   analyzed the individuals with and without multimorbidity separately and
   compared the directions of genetic correlations.

Fig. 3. Predictive performance and cross-trait associations of genome-wide
PRS.

   a Results from logistic regression analyses testing the associations
   between target phenotypes in testing datasets and genome-wide PRS
   calculated from the training datasets using PRS-CSx. The analyses used
   identical phenotypes in the training and testing datasets. In the
   forest plots, dots indicate standardized regression coefficients, and
   whiskers represent 95% confidence intervals. Disease labels are colored
   based on the disease categories. b Results from logistic regression
   analyses testing the cross-trait associations of base and target
   phenotypes. After excluding all individuals with overlapping base and
   target phenotypes from the testing datasets, we analyzed the
   associations between the target phenotype and the PRS generated from
   the training datasets. The heatmaps are colored based on standardized
   regression coefficients. P-values of the two-sided tests are adjusted
   using Bonferroni corrections. *: P[uncorrected] < 0.05/132. CAD:
   coronary artery disease; COPD: chronic obstructive pulmonary disease;
   HF: heart failure; ILD: interstitial lung disease; RA: rheumatoid
   arthritis; and T2D: type 2 diabetes.

   To analyze the individuals without multimorbidity, we excluded
   multimorbid individuals for the base and target phenotypes from the
   testing datasets in the analysis of each phenotype pair. We then tested
   the associations of binary target phenotypes with genome-wide PRS of
   base phenotypes (significance level: P < 0.05/132). We adjusted the
   analyses for age, sex, and the top ten genetic principal
   components (PCs). Concordant with the earlier genetic correlation
   analysis using the GWAS meta-analysis summary statistics, the
   genome-wide PRS analysis in EAS presented the negative genetic risk
   associations between respiratory and cardiometabolic diseases among the
   independent datasets (Fig. [132]3b). In the EAS analysis of asthma
   (base phenotype) and dyslipidemia (target phenotype), asthma showed a
   negative association with dyslipidemia (β = −0.039, P = 1.4 × 10^−4).
   Furthermore, we tested the cross-trait associations across independent
   EAS biobanks. We constructed genome-wide PRS using EAS GWAS
   meta-analysis (BBJ1 + BBJ2). We then tested the cross-trait
   associations on 99,561 individuals registered in TMM, one of the
   largest EAS biobanks in Japan. As shown in Supplementary Fig. [133]8
   and Supplementary Table [134]3, the negative associations between
   respiratory and cardiometabolic diseases were validated in the
   phenotype pairs (i.e., asthma–dyslipidemia and COPD–dyslipidemia).

   We then investigated genetic risk associations in the individuals with
   multimorbidity by linear regressions between genome-wide PRS for base
   and target phenotypes (Supplementary Fig. [135]9). The genetic
   associations between respiratory and cardiometabolic diseases presented
   similar results in the individuals with and without multimorbidity. For
   instance, the genome-wide PRS for respiratory and cardiometabolic
   diseases presented suggestive negative correlations in the individuals
   with multimorbidity (e.g., β = −0.077 and P = 0.050 for asthma and
   dyslipidemia). Our PRS analysis revealed the heterogeneity of genetic
   associations between respiratory and cardiometabolic diseases in the
   EAS individuals with and without multimorbidity.

   Motivated by the different directions between phenotypic and genetic
   correlations in asthma and smoking and sex-specific smoking behavior in
   EAS population^[136]46, we performed Fisher’s exact tests between
   smoking status and asthma stratified by sex (Supplementary
   Fig. [137]10). Consistent with the previous study from Japan^[138]46,
   there was a male-specific negative correlation between smoking status
   and asthma in EAS population (odds ratio = 0.84, P = 2.8 × 10^−4).
   Therefore, we conducted an interaction analysis of cross-trait PRS
   associations including age, sex, smoking amount (pack-years), sex ×
   smoking amount as the interaction term, and the top ten genetic PCs in the
   models (Supplementary Data [139]2). The interaction term presented
   significant associations in the analysis of asthma PRS and dyslipidemia
   in EAS population (EAS: P = 1.7 × 10^−5; and EUR: P = 0.0016). Even
   after accounting for the interaction, the analysis yielded similar
   results for the association of asthma with dyslipidemia in EAS
   population (β = −0.047, P = 1.7 × 10^−5; Supplementary Fig. [140]11).
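
   The Fisher's exact tests used for the phenotypic associations can be
   sketched in a few lines. The code below is a minimal illustration with
   hypothetical counts, not the authors' pipeline. It assumes the common
   two-sided convention of summing the probabilities of all tables that
   are no more likely than the observed one, and it reports the sample
   (cross-product) odds ratio, which can differ from the conditional
   estimate that some statistical packages return.

```python
from math import comb, log
from typing import Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the sample odds ratio and a two-sided P-value.
    """
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over tables at least as extreme; the small factor absorbs rounding.
    p_value = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


# Hypothetical counts: rows = smoker / non-smoker, columns = asthma / no asthma.
odds_ratio, p_value = fisher_exact_two_sided(3, 1, 1, 3)
log_odds_ratio = log(odds_ratio)  # natural logarithm of the odds ratio
```

   In a sex-stratified analysis, the same function would simply be applied
   to one table per sex.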

   Because sex differences affect asthma risk and serum lipid
   profiles^[141]47,[142]48, we hypothesized that there might be a sex
   difference in the negative genetic association between asthma and
   dyslipidemia. Accordingly, we performed sex-stratified cross-trait
   association analyses of genome-wide PRS for individuals without
   multimorbidity (Supplementary Fig. [143]12). The analysis identified
   the negative association of asthma with dyslipidemia in the EAS males
   (males: β = −0.051, P = 2.1 × 10^−4; and females: β = −0.051,
   P = 0.20). The multifaceted analyses for genetic and phenotypic
   associations supported the negative association between asthma and
   dyslipidemia, especially in the EAS males.

Associations with genome-wide PRS and circulating metabolites and proteins

   We investigated associations between genome-wide PRS and circulating
   lipid and metabolite biomarkers to detect shared risk biomarkers with
   heterogeneous associations between respiratory and cardiometabolic
   diseases. We utilized blood nuclear magnetic resonance (NMR) biomarker
   data (Nightingale Health Metabolic Biomarkers) from 51,612 EAS
   individuals registered in the BBJ1 and those from 245,349 EUR
   individuals from the UKB. After quality control and the removal of
   technical variation using the ukbnmr R package^[144]49 (Supplementary
   Data [145]3), we assessed 325 metabolites for the EAS and EUR samples
   (Fig. [146]4, Supplementary Fig. [147]13, and Supplementary
   Data [148]4). In the EAS respiratory and autoimmune diseases, the
   genome-wide PRS presented the opposite directions of association
   against dyslipidemia (positive correlations with VLDL-C and negative
   ones with HDL-C in EAS respiratory and autoimmune diseases).

Fig. 4. Results from genome-wide PRS association analysis for circulating NMR
lipid and metabolite markers.

   We assessed the associations between genome-wide PRS and NMR
   metabolome, adjusting for age, sex, and the top ten genetic PCs.
   Heatmaps are colored based on standardized regression coefficients (β)
   calculated from 325 biomarkers and genome-wide PRS association
   analysis. We categorized circulating lipid and metabolite markers based
   on the classification defined in the ukbnmr R package, a toolkit for
   quality control and removing technical variation for NMR metabolome
   data. As positive controls for the analysis, we found positive
   correlations of dyslipidemia PRS with VLDL-C-related markers and
   negative ones with HDL-C-related markers in both populations. CAD:
   coronary artery disease; COPD: chronic obstructive pulmonary disease;
   HF: heart failure; ILD: interstitial lung disease; RA: rheumatoid
   arthritis; and T2D: type 2 diabetes.

   Based on the consistent negative genetic association between asthma and
   dyslipidemia in the EAS individuals, we further evaluated correlations
   of β[_biomarker] of PRS obtained from asthma and COPD analyses with
   those from dyslipidemia analyses (Supplementary Fig. [151]14). The PRS
   β[_biomarker] for asthma and COPD negatively correlated with those for
   dyslipidemia only in the EAS individuals (asthma–dyslipidemia in EAS:
   Spearman’s rank correlation coefficient [r[S]] = −0.45, P = 3.6 × 10^−6;
   asthma–dyslipidemia in EUR: r[S] = 0.31, P = 1.9 × 10^−7;
   COPD–dyslipidemia in EAS: r[S] = −0.29, P = 0.0062; and
   COPD–dyslipidemia in EUR: r[S] = 0.41, P = 7.7 × 10^−12). Our
   cross-trait analysis for genome-wide PRS and NMR metabolome revealed
   the decreased genetic risks of dyslipidemia in the EAS individuals with
   respiratory diseases.
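
   The comparison of biomarker-wise regression coefficients between two
   PRS can be illustrated with a small Spearman's rank correlation
   routine. This is a sketch under stated assumptions: the coefficient
   vectors are hypothetical, ties receive average ranks, and no P-value
   is computed.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's r_S: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-biomarker coefficients for two PRS (same biomarker order).
beta_asthma = [0.02, -0.01, 0.03, -0.04]
beta_dyslipidemia = [-0.03, 0.02, -0.05, 0.06]
r_s = spearman(beta_asthma, beta_dyslipidemia)
```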

   We further analyzed the association of genome-wide PRS with circulating
   blood proteins to detect biomarkers affected by the genetic risks of
   respiratory diseases. We measured protein concentrations using the Olink
   platform from 2071 EAS individuals registered in the BBJ1 and those
   from 53,058 EUR individuals from the UKB. After bridging sample
   normalization and applying rank-inverse normal transformation, we
   assessed 2917 proteins for the EAS and EUR samples. We investigated the
   association of protein concentrations with genome-wide PRS for asthma,
   COPD, and dyslipidemia for all individuals with the proteomics data
   (Supplementary Fig. [152]15 and Supplementary Tables [153]4 and
   [154]5). We compared β[_protein] for the two respiratory disease PRSs
   with β[_protein] for the dyslipidemia PRS and identified negative
   associations between the two respiratory diseases and
   dyslipidemia. Among eleven (asthma–dyslipidemia) and four
   (COPD–dyslipidemia) proteins that satisfied P < 0.05 in both phenotypes
   and both populations, we identified proteins associated with airway
   hypersensitivity (STC1) and dysfunction (EZR) in
   asthma^[155]50,[156]51. Our comprehensive multi-omics analyses showed
   the negative genetic associations of asthma with dyslipidemia and
   pinpointed the biomarkers with heterogeneous variations in the EAS
   individuals.
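
   The rank-inverse normal transformation applied to the protein
   measurements can be sketched as follows. The offset c = 3/8 (the Blom
   variant) and the use of average ranks for ties are assumptions made
   for illustration, since the text does not state which variant was
   used.

```python
from statistics import NormalDist
from typing import List, Sequence


def rank_inverse_normal(values: Sequence[float], c: float = 3.0 / 8.0) -> List[float]:
    """Map values to standard-normal quantiles of their (offset) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank for ties
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Hypothetical protein measurements for three samples.
transformed = rank_inverse_normal([3.0, 1.0, 2.0])
```

   The transformation preserves the ordering of the samples while forcing
   an approximately normal distribution, which limits the influence of
   outlying measurements on the regression coefficients.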

Detecting biological processes responsible for the heterogeneity of
multimorbidity with pathway polygenic risk scores

   Since the dysregulated immune system and lipid metabolism influence the
   onset of respiratory and cardiometabolic diseases, we assessed biology
   underlying the global negative genetic correlations by aggregating the
   effect sizes of variants to the gene sets of immune and metabolic
   pathways. In detail, we analyzed 335 pathways related to the immune
   system and lipid metabolism registered in the Reactome subset of
   curated gene sets in MSigDB^[157]52. With PRS-CSx, we constructed
   Bayesian pathway PRS to improve the predictive performance. For
   cross-trait pathway enrichment analyses using Bayesian pathway PRS, we
   performed three quality control steps: (1) performing Bayesian
   (PRS-CSx) and clumping + thresholding (C + T) pathway PRS (PRSet)
   methods in parallel to confirm the consistency of associations, (2)
   pre-analysis filtering: exclusion of pathways without associations with
   any base phenotypes using MAGMA^[158]53 pathway enrichment analysis,
   and (3) post-analysis filtering: exclusion of pathways with false
   positive associations and those with small predictive performances by
   setting the thresholds of P-value and Nagelkerke’s R^2 for Bayesian
   pathway PRS analysis. We described the details of Bayesian pathway PRS
   in the [159]Supplementary Methods. Briefly, we confirmed that pathway
   PRS analysis using PRS-CSx showed predictive performances consistent
   with C + T pathway PRS in phenotypes with sufficient heritability
   (Supplementary Figs. [160]16–[161]18). For cross-trait Bayesian pathway
   PRS analyses targeting the immune system and lipid metabolism, we set
   the P-value threshold using the Bonferroni method and Nagelkerke’s R^2
   threshold to 0.001 (Supplementary Figs. [162]19 and [163]20). Lastly,
   we selected 18 pathways significantly associated with at least one
   phenotype in MAGMA pathway enrichment analyses (P < 0.05/335) from the
   335 immune or lipid pathways (Supplementary Data [164]5). For all
   combinations of the 18 pathways and 132 phenotype pairs (2376
   combinations), we conducted a cross-trait association analysis adjusted
   for the same covariates used in the primary analysis of genome-wide
   PRS.
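
   The number of tests and the post-analysis filter described above can
   be written out explicitly. The sketch below assumes that the 132
   phenotype pairs are the ordered base-target pairs of the twelve traits
   and that a pathway passes when its R^2 is at least the threshold; the
   generic Nagelkerke formula is shown for reference, and the choice of
   null model (for example, covariates only) is likewise an assumption.

```python
import math

n_traits = 12
n_pathways = 18
n_ordered_pairs = n_traits * (n_traits - 1)   # base-target pairs: 132
n_tests = n_pathways * n_ordered_pairs        # 2376 combinations
p_threshold = 0.05 / n_tests                  # about 2.1e-5
r2_threshold = 0.001


def nagelkerke_r2(loglik_null: float, loglik_full: float, n: int) -> float:
    """Nagelkerke's R^2 from the log-likelihoods of two nested logistic models."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_full) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


def passes_filters(p_value: float, r2: float) -> bool:
    """Keep associations that pass both the P-value and the R^2 thresholds."""
    return p_value < p_threshold and r2 >= r2_threshold
```

   For example, `passes_filters(5.5e-12, 0.0013)` evaluates the values
   reported for the PPARα pathway in Table 1 and returns `True`.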

   Applying a Bonferroni-corrected P-value threshold of 2.1 × 10^−5
   (0.05/2,376) and a PRS R^2 threshold of 0.001 to the results of Bayesian
   pathway PRS analysis, we identified five pathways with multicategorical
   associations (Table [165]1 and Supplementary Data [166]6) and eight
   with intracategorical associations (Supplementary Data [167]7). We
   observed the well-known associations of dyslipidemia with CAD in
   pathways regulating lipid metabolism. In the EAS analysis alone,
   pathways regulating lipid metabolism exhibited significant negative
   associations of dyslipidemia with asthma, COPD, and RA. To examine the
   heterogeneity in the directions of pathway associations, we further
   analyzed all 18 MAGMA-selected pathways for the two pairs of phenotypes
   (asthma–dyslipidemia and COPD–dyslipidemia). All five lipid
   metabolism pathways with significant cross-trait associations had
   negative β values in the asthma analysis of the EAS population.
   However, only one of the three significant lipid metabolism pathways
   showed negative β in the asthma analysis of EUR population
   (Fig. [168]5a and Supplementary Data [169]8). We examined the five
   pathways with multicategorical associations that passed the P-value and
   PRS R^2 thresholds to illustrate the genetic heterogeneity at the
   pathway level (Fig. [170]5b and Supplementary Figs. [171]21 and
   [172]22). In individuals both with and without multimorbidity, we
   identified a pathway regulating lipid metabolism by peroxisome
   proliferator-activated receptor α (PPARα) (Supplementary Data [173]9).
   PPARα is a fatty acid-activated transcription factor of the nuclear
   hormone receptor family that regulates thermogenesis by stimulating
   adipocytes and represses interferon-γ production in T
   cells^[174]54. Furthermore, sex-stratified
   analyses for the PPARα pathway (Supplementary Fig. [175]23) identified
   the negative association of asthma with dyslipidemia in the EAS males
   (males: β = −0.080, P = 4.3 × 10^−9; and females: β = −0.059,
   P = 1.3 × 10^−4). We demonstrated that the Bayesian cross-trait pathway
   PRS analysis workflow enables the functional annotations of the
   genetics underlying the heterogeneous cross-trait associations.

Table 1.

   Significant pathways in Bayesian pathway PRS analyses investigating the
   associations between immune-mediated and cardiometabolic diseases
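   Population | Parental pathway | Pathway | Base phenotype | Target phenotype | Standardized regression coefficient | SE | P-value | Nagelkerke’s R^2
   -----------|------------------|---------|----------------|------------------|-------------------------------------|----|---------|-----------------
   EAS | Metabolism | Regulation of lipid metabolism by PPARα | Asthma | Dyslipidemia | −0.070 | 0.010 | 5.5 × 10^−12 | 0.0013
   EAS | Transport of small molecules | Assembly of active LPL and LIPC lipase complexes | Asthma | Dyslipidemia | −0.088 | 0.010 | 4.2 × 10^−18 | 0.0021
   EAS | Transport of small molecules | Chylomicron remodeling | Asthma | Dyslipidemia | −0.097 | 0.010 | 2.2 × 10^−21 | 0.0025
   EAS | Transport of small molecules | Chylomicron remodeling | COPD | Dyslipidemia | −0.081 | 0.010 | 8.3 × 10^−16 | 0.0017
   EAS | Transport of small molecules | Plasma lipoprotein assembly, remodeling, and clearance | Asthma | Dyslipidemia | −0.079 | 0.010 | 1.1 × 10^−14 | 0.0016
   EAS | Transport of small molecules | Plasma lipoprotein remodeling | Asthma | Dyslipidemia | −0.078 | 0.010 | 2.6 × 10^−14 | 0.0016
   EAS | Transport of small molecules | Plasma lipoprotein remodeling | COPD | Dyslipidemia | −0.064 | 0.010 | 2.0 × 10^−10 | 0.0011
   EAS | Transport of small molecules | Plasma lipoprotein remodeling | RA | Dyslipidemia | −0.064 | 0.010 | 1.8 × 10^−10 | 0.0011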

   We performed logistic regression analyses between pathway PRS of base
   phenotypes and binary target phenotypes, adjusting for age, sex, and
   top 10 genetic principal components. P values in the table are
   uncorrected and two-sided.

   LPL lipoprotein lipase, LIPC hepatic triacylglycerol lipase, PPARα
   Peroxisome proliferator-activated receptor α.

Fig. 5. Association analyses of immune and lipid metabolism pathways in
asthma and COPD.

   a Results from logistic regression analyses investigating the pathway
   associations of dyslipidemia with asthma and COPD. P-values of the
   two-sided tests are adjusted using Bonferroni corrections. For the
   phenotype pairs of dyslipidemia−asthma and dyslipidemia−COPD, we
   analyzed all 18 pathways with significant associations in MAGMA
   gene-set analysis. The bar plots provide standardized regression
   coefficients of the analyzed pathways. Arrows on the bar plots indicate
   the PPARα pathway. *: P[uncorrected] < 0.05/2376. b Forest plots of
   logistic regression analyses that assess the pathway regulating lipid
   metabolism by PPARα. For this pathway, we analyzed cross-trait
   associations of asthma and COPD with other phenotypes. We generated
   pathway PRS of asthma and COPD from the
   training datasets to test the associations with target phenotypes in
   the testing datasets. In the forest plots, dots indicate standardized
   regression coefficients, and whiskers represent 95% confidence
   intervals. P-values of the two-sided tests are adjusted using
   Bonferroni corrections. ●: P[uncorrected] < 0.05/2376; ○:
   P[uncorrected] ≥ 0.05/2376.

Cell type specificity of traits and pathways in the lungs

   To acquire further biological insights into the genetic basis of
   multimorbidity in the lungs, we conducted cell type-specific analyses
   using the human lung scRNA-seq dataset derived from the Human Lung Cell
   Atlas (HLCA)^[179]55. The scRNA-seq dataset of the lungs consisted of
   50 cell types, classified broadly into endothelial, epithelial, immune,
   and stromal cells. After calculating gene-level Z-scores through MAGMA
   gene analyses for all combinations of phenotypes and populations based
   on the GWAS meta-analysis summary statistics, we selected the top 1000
   genes representing the polygenic risk in each phenotype-population
   combination. For cell type enrichment analysis, we calculated disease
   scores using scDRS^[180]36 based on the Z-scores of the selected genes.
   We performed cell type enrichment analyses with the default settings
   and investigated the differences in cell type enrichment between EAS
   and EUR populations^[181]56. Among the 600 pairs of 50 cell types × 12
   phenotypes, we focused on the combinations with prominent
   cross-population differences in scDRS disease scores between EAS and
   EUR populations (“Methods”).
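   The gene-selection step described above can be sketched as follows.
   This is a minimal illustration, assuming that gene-level Z-scores are
   already available as a mapping from gene symbol to Z-score and that
   larger Z-scores indicate stronger association. The function name
   select_top_genes and the toy values are hypothetical and are not part
   of MAGMA or scDRS.

```python
from typing import Dict, List, Tuple


def select_top_genes(gene_z: Dict[str, float], k: int = 1000) -> List[Tuple[str, float]]:
    """Return the k genes with the largest Z-scores, highest first."""
    ranked = sorted(gene_z.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Toy Z-scores (hypothetical values, for illustration only).
toy_z = {"GENE_A": 5.1, "GENE_B": 1.2, "GENE_C": 7.4, "GENE_D": -0.3}
top2 = select_top_genes(toy_z, k=2)

# 50 cell types x 12 phenotypes gives the 600 tested pairs.
n_pairs = 50 * 12
```

   In the analysis itself, k would be 1000 and the selection would be
   repeated for every phenotype-population combination.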

   An overview of the cross-population cell type enrichment analysis is
   shown in Fig. [182]6. To assess the cross-population differences in
   cell type enrichment, we first characterized the baseline cell type
   enrichment per population. Our results pointed to cell type
   specificity with known clinical relevance to each phenotype. For
   instance, we saw the enrichment of immune cells in RA and asthma and
   of endothelial cells and fibroblasts in CAD (Supplementary
   Fig. [183]24). Among the cell types with significant enrichment, we
   identified 15 population-specific enriched cell types (Fig. [184]7 and
   Supplementary Table [185]6). The remarkable enrichment of fibroblasts
   in COPD in the EAS population reflected the large proportion of
   emphysema, a COPD subtype characterized by alveolar destruction, in
   EAS individuals^[186]23,[187]57 (Fig. [188]7c). Fibroblasts
   repair alveolar damage caused by smoking, but their functional
   impairment leads to emphysema^[189]58. Additionally, we identified
   EAS-specific enrichment of dendritic cells in RA (Supplementary
   Fig. [190]25) and fibroblasts in dyslipidemia. In contrast, two cell
   types were enriched specifically in respiratory diseases in the EUR
   population: goblet
   cells in ILD (Fig. [191]7d) and B cells in asthma (Fig. [192]7e). Our
   findings suggest that analyzing the differences in cell type enrichment
   between populations can help explain the heterogeneity of diseases.

Fig. 6. Overview of cross-population cell type enrichment analyses.

   Disease associations and mean differences in scDRS disease
   scores at the human lung cell type level. To investigate the
   disease-associated cell types, we analyzed a scRNA-seq dataset with 50
   cell types obtained from the lungs. We applied scDRS with the default
   settings to calculate disease scores for the twelve complex traits
   using the GWAS summary statistics and to perform downstream analysis.
   The left and middle heatmaps show the cell type enrichment in EAS and
   EUR populations, respectively. Each tile in the left and middle
   heatmaps is colored based on the proportion of significant cells in the
   tested cell type. The right heatmap is colored based on the mean
   differences in scDRS disease scores between EAS and EUR populations. □:
   FDR < 0.05; ×: FDR of heterogeneity <0.05; +: FDR of t-tests assessing
   the significance of the scDRS disease score differences across all
   pairs of the cell types and phenotypes.

Fig. 7. Cell types with different enrichment between EAS and EUR populations.

   a A UMAP plot of the scRNA-seq dataset from the Human Lung Cell Atlas
   colored by coarsest annotations. b Mean differences in scDRS disease
   scores of population-specific cell types. Among all phenotype-cell type
   pairs identified as significant in the EAS or EUR cell type enrichment
   analysis, the plots show the pairs with prominent differences in
   the mean scDRS disease scores between EAS and EUR populations. The bar
   plots provide the differences in disease scores of EAS and EUR
   populations. c UMAP plots of fibroblasts colored by the enrichment in
   COPD. The leftmost UMAP plot is colored based on the class of tested
   cell types. The two UMAP plots in the middle are colored based on the
   disease scores for EAS (left) and EUR (right) populations. The
   rightmost UMAP plot is colored based on the differences in disease
   scores between EAS and EUR populations. d UMAP plots of goblet cells
   colored by the enrichment in ILD. e UMAP plots of B cells colored by
   the enrichment in asthma.

   We then analyzed pathway-level cell type enrichment for pathways with
   100–2000 genes and significant associations because the original paper
   on scDRS validated the statistical power of scDRS for gene sets within
   this range^[197]36. Because the PPARα pathway, which regulates both the
   immune system and lipid metabolism, showed concordance between the
   genetic and biological findings, we investigated its pathway-level
   cell type enrichment in immune cells of the lungs. We
   calculated pathway-level disease scores using Z-scores of genes in the
   PPARα pathway (Fig. [198]8). Despite the cross-population differences
   in MAGMA gene analyses between EAS and EUR populations (Fig. [199]8a),
   we found a similar suggestive cell type enrichment of T cells in asthma
   and that of macrophages in dyslipidemia (Fig. [200]8c and [201]d).
   CD4^+ (EAS: P = 0.035 and EUR: P = 0.096) and CD8^+ (EAS: P = 0.041 and
   EUR: P = 0.083) T cells were suggestively enriched in asthma in the EAS
   population. On the other hand, elicited macrophages (EAS: P = 0.10 and
   EUR: P = 0.0070) and non-classical monocytes (EAS: P = 0.22 and
   EUR: P = 0.039) exhibited suggestive enrichment in dyslipidemia in the
   EUR population. To detect genes contributing to the pathway-level
   enrichment, we investigated the expression of genes in the PPARα
   pathway in the immune cells. The expression of the top five genes
   associated with the traits in MAGMA gene analyses aligned with the
   cross-trait differences in the enrichment patterns of asthma and
   dyslipidemia (Supplementary Fig. [202]26). Among the top five genes
   significant in MAGMA gene analyses for asthma, RORA contributed to the
   pathway-level cell type enrichment in T cells (Supplementary
   Fig. [203]27). The pathway-level cell type enrichment analysis
   pinpointed cell types associated with the genetic risks of asthma and
   dyslipidemia and provided insights into the diverse biology of the
   PPARα pathway in the diseases.

Fig. 8. PPARα pathway enrichment analysis of immune cells for asthma and
dyslipidemia.

   a MAGMA gene analysis targeting the pathway regulating the lipid
   metabolism by PPARα. We plotted the logarithm of uncorrected P-values
   obtained from the gene analysis of asthma and dyslipidemia in EAS and
   EUR populations. The plots label the top five significant genes in EAS
   or EUR populations. P-values of the one-sided tests are adjusted using
   Bonferroni corrections. The corrected significance thresholds are shown
   as purple dashed lines. b Coarse annotations of immune cells in the
   lung tissue. c Enrichment analyses of the PPARα pathway in asthma.
   After selecting gene-level Z-scores included in the PPARα pathway, we
   calculated the pathway-level disease scores for eight cell types using
   scDRS. The UMAP plots are colored based on the disease scores of the
   PPARα pathway. d Enrichment analyses of the PPARα pathway in
   dyslipidemia.

Discussion

   In this study, we dissected the genetics underlying the biology of
   multimorbidity in twelve complex traits relevant to inflammation and
   the immune system using the three large-scale biobank resources. Global
   and local genetic correlation analyses revealed the potential effects
   of pleiotropy on the negative genetic correlations between respiratory
   and cardiometabolic diseases in EAS population. In the cross-trait
   genome-wide PRS analysis, the negative genetic association between
   asthma and dyslipidemia was consistent in the individuals with and
   without multimorbidity. We observed the negative association between
   asthma and dyslipidemia, especially in the EAS males. Genome-wide PRS
   and metabolome association analyses revealed the negative association
   of asthma PRS with circulating lipid and metabolite biomarkers,
   supporting the negative genetic correlation between asthma and
   dyslipidemia in the EAS individuals. We then constructed Bayesian
   pathway PRS to analyze the biology underlying multimorbidity in the
   complex traits. This analysis identified biological processes
   underlying the negative correlations between asthma and dyslipidemia,
   one of which regulates lipid metabolism via PPARα. Cell type
   specificity analyses of the traits and pathways using the lung
   scRNA-seq dataset highlighted the epidemiology and biology of the
   traits. The enrichment of fibroblasts in COPD corresponded to the
   predominance of emphysema in the EAS population, a phenotype
   characterized by dysfunctional alveolar repair. Our pathway enrichment
   analyses using the lung scRNA-seq dataset revealed the enrichment of
   the PPARα pathway in immune cells (T cells in asthma and macrophages in
   dyslipidemia), highlighting cell types associated with the biology of
   the diseases. Among the genes in the PPARα pathway, RORA contributed to
   the enrichment of T cells in asthma for the EAS population.

   Our results demonstrated the negative genetic risk association of the
   PPARα pathway between asthma and dyslipidemia. The heterogeneity of
   genetic correlations between asthma and lipid metabolism-related traits
   was implicated in preceding studies^[206]11,[207]15,[208]59. While
   dyslipidemia is associated with increased phenotypic and genetic risks
   of asthma^[209]15,[210]60, a preceding local genetic correlation
   analysis found LD blocks with negative genetic correlations of asthma
   with blood levels of triglycerides and cholesterol^[211]15.
   Another genome-wide PRS study suggested a negative association of
   pediatric asthma with dyslipidemia in a multi-population
   cohort^[212]61. As the former study focused on the EUR population and
   the latter analyzed mixed populations, the genetic and biological
   heterogeneity underlying the negative associations has remained unclear
   across populations. In the EAS population, our results consistently
   identified negative genetic correlations between asthma and
   dyslipidemia at the global (Fig. [213]2 and Supplementary
   Fig. [214]5), local (Supplementary Figs. [215]3 and [216]4), and
   pathway (Table [217]1) levels. Our study suggests that the different
   effect sizes of the cross-trait asthma–dyslipidemia association in the
   PPARα pathway may drive the heterogeneity of multimorbidity in EAS and
   EUR populations. As PPARα exerts not only anti- but also
   pro-inflammatory effects via multiple regulatory mechanisms^[218]62,
   the variation in the effect sizes of the PPARα pathway may be
   associated with a differing balance between anti- and pro-inflammatory
   states. For instance, the
   PPARα pathway represses interferon-γ production in human T cells in
   response to androgen^[219]63, potentially leading to the sex and
   population difference in the association of asthma with dyslipidemia
   (Supplementary Fig. [220]12). Because cross-population genetic analysis
   needs larger sample sizes to gain statistical power^[221]43, we
   anticipate that further research on large-scale non-EUR individuals
   will validate our findings and provide the basis for personalized
   medicine for multimorbidity in asthma and dyslipidemia.

   In EAS population, our analysis revealed the high enrichment of
   fibroblasts in COPD and obesity (Fig. [222]7). COPD has two main
   subtypes, one with emphysema (alveolar destruction) and loss of body
   weight and another with chronic bronchitis and obesity^[223]64. COPD in
   EAS population is characterized epidemiologically by the dominance of
   emphysema and weight loss^[224]22,[225]23, unlike in EUR
   population^[226]57,[227]65. As functionally impaired fibroblasts have a
   reduced capacity to repair alveolar destruction caused by
   smoking^[228]58, our results suggested that the significant enrichment
   of fibroblasts aligns with the high prevalence of emphysema in EAS
   population (Fig. [229]7). In addition, dysregulated lipolysis in white
   adipose tissue can lead to obesity, for example via resistance to
   fibroblast growth factors^[230]66. Therefore, our results imply that
   the complex interactions of fibroblasts and adipose tissues may
   introduce the heterogeneous phenotypic correlation of COPD and obesity,
   leading to the different prevalence of emphysema and weight change
   between EAS and EUR populations.

   This study has several potential limitations. First, we leveraged the
   five biobank resources to explore the population differences of
   multimorbidity. The differences in sample sizes can bias our study,
   especially increasing the false negative results in the traits with
   small sample sizes. Although we examined our findings in detail using
   independent datasets, differences in genotyping platforms, imputation
   procedures, phenotype definitions, and biobank-specific confounders
   between the cohorts can bias our study. In addition, the biobank-scale
   resources currently do not have detailed information about phenotypes,
   for example, the onset of asthma and the subtypes of asthma and COPD in
   the BBJ. Since dyslipidemia has phenotypic associations with COPD and
   asthma subtypes (allergic and non-allergic asthma)^[231]67, enlarging
   non-EUR sample sizes with subtype information may help illustrate the
   complex associations of multimorbidity in subtypes of respiratory
   diseases. Therefore, future studies using various pipelines and
   resources are needed to validate our findings and delve into the
   biology underlying multimorbidity. Finally, as pathways used in the PRS and
   cell type enrichment analyses are defined based on the known biology,
   genes and SNPs with unknown biology were beyond the scope of this
   study.

   Collectively, our study provides evidence that biobank-scale
   GWAS can highlight the heterogeneous polygenicity of multimorbidity
   across EAS and EUR populations, detect the biology driving the
   associations of multicategorical traits, and contribute to elucidating
   the global landscape of heritable multimorbidity risks. Our results
   demonstrate that exploring diverse populations promotes a better
   understanding of the genetic basis of multimorbidity.

Methods

East Asian samples in BioBank Japan

   EAS population analysis included samples from BBJ, a hospital-based
   registry with multi-omics data from genotypes to multitudes of
   phenotypes. All the participants in BBJ provided written, informed
   consent approved by ethics committees of the Institute of Medical
   Sciences, the University of Tokyo and RIKEN Center for Integrative
   Medical Sciences^[232]68. BBJ1 recruited approximately 200,000
   individuals with at least one of 47 target diseases from twelve
   Japanese medical institutions and collected DNA, serum samples, and
   clinical information between 2003 and 2007^[233]33,[234]69. BBJ2 is an
   additional cohort of ~67,000 individuals with at least one of 38
   target diseases, recruited independently of BBJ1 and registered
   between 2013 and 2018. For meta-analysis and PRS analysis,
   we included samples derived from both BBJ1 and BBJ2.

Tohoku Medical Megabank

   We used 99,561 TMM samples to validate the cross-trait PRS associations
   between respiratory and cardiometabolic diseases in the EAS population.
   TMM is
   a population-based prospective cohort that enrolled participants from
   Miyagi and Iwate Prefectures in the Tohoku region, in the northeastern
   part of Japan^[235]39. Genotyping was conducted using a custom SNP
   array for the Japanese population (Japonica Array v.2). Details of
   imputation and quality control criteria for samples and variants are
   described elsewhere^[236]70. After imputation, variants with INFO score
   of <0.3 or minor allele frequency <0.005 were excluded. Quality control
   of the study participants was performed with the following exclusion
   criteria: (1) outliers from East Asian ancestry clustering based on
   projection PCA with samples of 1KGP3 data; (2) one individual from
   each pair within the third degree of kinship; (3) genotype call rate
   of <95%; (4) missing phenotype or covariate information.

UK Biobank

   We obtained the genomic data of UKB, a population-based registry with
   approximately 500,000 individuals aged between 40 and 69 recruited in
   the UK^[237]34. The individual registration process is described
   elsewhere^[238]71. Briefly, the UKB individuals were genotyped using
   the Applied Biosystems UK BiLEVE Axiom Array or the Applied Biosystems
   UK Biobank Axiom Array. After quality control, genotype data were
   imputed with the Haplotype Reference Consortium data and the merged
   UK10K and 1000 Genomes Project (1KG) Phase 3 reference panels
   using IMPUTE4^[239]34. We analyzed EUR individuals tagged “Caucasian”
   in UKB Data-Field 22006 and confirmed that all individuals were
   classified into EUR ancestry using principal component analysis (PCA).

FinnGen

   FinnGen is a large public-private genome research project that collects
   and analyzes genome and health data from Finnish biobanks and digital
   health record data from Finnish health registries, with its original
   phenotypes defined mainly using International Classification of
   Diseases (ICD) and Anatomical Therapeutic Chemical classification
   codes^[240]35. We used the GWAS summary statistics of FinnGen Data
   Freeze 8 (released on December 1st, 2022), for which association tests
   were conducted using SAIGE (v.0.35.8.8). The datasets we used for the
   analyses are listed in Supplementary Table [241]2.

Phenotype definition

   According to ICD-10 codes, we defined cases and controls for twelve
   phenotypes of GWAS and downstream analyses in the BBJ and UKB. In
   brief, cases were individuals with the twelve phenotypes (asthma, COPD,
   ILD, RA, obesity, dyslipidemia, smoking, hypertension, T2D, CAD, HF,
   and stroke). To disentangle the genetic effects of multiple risk
   factors on cardiometabolic diseases, we analyzed both cardiometabolic
   disorders and the risk factors in parallel. We excluded individuals
   with target and related phenotypes from the controls (Supplementary
   Table [242]2). Because autoimmune disorders share immune genetic
   backgrounds with asthma^[243]13 and can bias the results of GWAS and
   post-GWAS analyses for asthma and COPD, we conservatively excluded
   individuals with autoimmune diseases from the controls for asthma and
   COPD to improve the robustness of the study. In addition, we included RA,
   an autoimmune disease available in the datasets, because our previous
   study showed the positive genetic correlation between asthma and RA in
   both populations^[244]13.

Genotyping and imputation of autosomal chromosomes in BioBank Japan

   We genotyped the Japanese samples in BBJ1 with the Illumina
   HumanOmniExpressExome BeadChip or a combination of the Illumina
   HumanOmniExpress and HumanExome BeadChips. Quality control of samples
   and genotypes was conducted as described elsewhere^[245]72. We included
   individuals identified as EAS ancestry based on PCA. We used Eagle
   (v.2.3) for haplotype phasing of the genotype data and imputed genotype
   dosages using Minimac3 with the combined reference panel of 1KG Phase 3
   version 5 genotype data (n = 2504) and Japanese whole-genome sequencing
   (WGS) data obtained from the BBJ1 (n = 1037).

   BBJ2 (~67,000) and a part of BBJ1 (~12,000) individuals were genotyped
   using Illumina Asian Screening Array-24 v1.0 BeadChip. Quality control
   and genotype data of the BBJ2 were described elsewhere^[246]68. Using
   Minimac4, we imputed genotype dosages with the combined reference panel
   of 1KG Project Phase 3 and Japanese WGS data.

GWAS and meta-analysis

   We conducted GWAS for each phenotype in a single population using a
   saddle point approximation implemented in Regenie (v.3.1.1)^[247]73 to
   adjust for case-control imbalance. As covariates for GWAS, we included
   age, sex, and top ten genetic PCs. We excluded variants with an
   imputation quality Rsq < 0.7 or minor allele frequency (MAF) < 0.005.
   For all downstream analyses, we excluded the MHC regions (chromosome 6:
   25–34 Mb) due to their complex and strong LD structure^[248]29. We
   applied UCSC liftOver^[249]74 to convert the genome builds of BBJ and
   UKB datasets from GRCh37/hg19 to GRCh38/hg38 and annotated variants
   using SnpSift^[250]75. For usage in downstream analyses, we filtered
   out variants with imputation quality Rsq < 0.9 or MAF < 0.01 and
   meta-analyzed each phenotype per population using a standard
   fixed-effect approach in RE2C^[251]37. We included 4,612,828–4,622,861
   SNPs in the EAS GWAS meta-analyses and 7,057,994–7,061,744 SNPs in the
   EUR GWAS meta-analyses.
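
   The per-SNP fixed-effect step above corresponds to the standard
   inverse-variance-weighted estimator, which can be sketched as follows.
   This is a minimal illustration and not the RE2C implementation; the
   function names and the per-cohort effect sizes are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float]:
    """Inverse-variance-weighted fixed-effect estimate, its SE, and Z-score."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


def two_sided_p(z: float) -> float:
    """Two-sided P-value of a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical effect sizes of one SNP in two cohorts (not real data).
beta, se, z = fixed_effect_meta([0.10, 0.20], [0.05, 0.05])
p = two_sided_p(z)
```

   With equal standard errors the pooled estimate is the simple mean of
   the two effects, and the pooled standard error shrinks by a factor of
   the square root of two.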

Global genetic correlation analysis

   Given the differences in LD structures of diverse populations, we used
   two software tools to estimate global genetic correlations that account for
   genome-wide SNPs and to avoid the influence of mismatched top variants
   in the GWAS between phenotypes and among populations. For a
   single-population analysis, we calculated heritability of each
   phenotype and genetic correlations among phenotype pairs using
   LDSC^[252]38 (v.1.0.1). We confirmed the concordance of heritability
   obtained from Popcorn^[253]43 (v.1.0) and performed a cross-population
   analysis. Based on the standard protocol of each software, we used 1KG
   Phase 3 reference panels for matched populations. The significance
   levels of genetic correlation analysis were adjusted by Bonferroni
   correction (LDSC: P < 0.05/66; and Popcorn: P < 0.05/276). We then
   performed local genetic correlation and genome-wide PRS association
   analyses to understand the negative genetic associations between
   respiratory and cardiometabolic diseases using different genetic
   analysis methods, as described in the following sections.
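
   The Bonferroni denominators above can be reproduced with simple
   counting. The sketch below shows one counting scheme that is consistent
   with the stated values; how the 276 Popcorn tests were enumerated is
   our assumption and is not specified in the text.

```python
import math

n_phenotypes = 12
n_populations = 2

# Within one population: unordered pairs of the 12 phenotypes.
ldsc_tests = math.comb(n_phenotypes, 2)

# Across populations (assumption): unordered pairs among the
# 12 x 2 phenotype-population combinations.
popcorn_tests = math.comb(n_phenotypes * n_populations, 2)

ldsc_threshold = 0.05 / ldsc_tests
popcorn_threshold = 0.05 / popcorn_tests
```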

Local genetic correlation analysis

   We applied SUPERGNOVA (v.1.0) to estimate local genetic correlations in
   the LD-independent segments^[254]41. First, we partitioned the genome
   into the LD-independent segments using LAVA^[255]15 supplementary
   program (v.1.0.0) to update the reference panels from 1KG Project Phase
   1 to 3. As reference panels for LD estimation, we used the preprocessed
   EAS (n = 504) and EUR (n = 503) subsets of 1KG
   ([256]https://ctg.cncr.nl/software/magma). We assessed the significance
   of the local genetic correlations based on the significance of local
   genetic covariances, as in the original SUPERGNOVA article. The
   significance level for local genetic correlation analysis was a
   false-discovery rate (FDR) < 0.05, adjusted by the Benjamini-Hochberg
   method. Because the LD structure differed between EAS and EUR
   populations, the positions of LD blocks were not aligned. Therefore, we
   conservatively avoided assessing the concordance of significant LD
   blocks between the populations. Instead, we compared the proportion of
   significant blocks with negative correlations between the two
   populations. For further validation of the local genetic correlation
   between asthma and dyslipidemia, we utilized KoGES dyslipidemia GWAS
   summary statistics to assess the local genetic correlation between
   asthma and dyslipidemia in EAS population^[257]40.
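
   The FDR adjustment named above can be sketched as follows. This is a
   minimal pure-Python version of the Benjamini-Hochberg step-up
   procedure, not the code used in the study; the function name and the
   four block-level P-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical local-covariance P-values for four LD blocks.
block_p = [0.001, 0.04, 0.03, 0.5]
block_fdr = benjamini_hochberg(block_p)
significant = [q < 0.05 for q in block_fdr]
```

   In this toy example only the first block passes FDR < 0.05, although
   three of the four raw P-values are below 0.05.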

Genome-wide PRS analysis

   To confirm that the results from cross-trait genetic correlation
   analyses were not derived from biases in genotyping, imputation, and
   phenotyping, we performed cross-trait PRS association analyses using
   the independent datasets generated from different genotyping and
   imputation platforms and phenotyping procedures. We used training GWAS
   datasets (BBJ1 for EAS and FinnGen for EUR) to construct PRS and
   applied the PRS weights to the testing datasets (BBJ2 for EAS and UKB
   for EUR) to perform association analyses. Based on the original
   paper^[258]32, we excluded variants with Rsq < 0.8 or MAF < 0.01. After
   excluding variants located in the MHC region, we ran PRS-CSx (released
   on July 29th, 2021) to calculate genome-wide PRS for each population.
   We set φ = 0.01 for analyzing highly polygenic traits and used the
   default settings for other parameters. We evaluated the predictive
   performance of the PRS for matched phenotype and population with
   logistic regression models, adjusting for age, sex, and top ten genetic
   PCs as covariates. Nagelkerke’s pseudo-R^2 was used to evaluate the
   predictive performance for all PRS analyses. In addition, we calculated
   liability-scale R^2 to assess the proportion of heritability
   explained by the genome-wide PRS based on the disease prevalence per
   population^[259]76. The methods and trait prevalences used to
   calculate liability-scale R^2 are shown in [260]Supplementary Methods
   and Supplementary Data [261]10.
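
   For reference, Nagelkerke’s pseudo-R^2 can be computed from the
   log-likelihoods of a fitted logistic model and a reference model, as
   sketched below. Which reference model was used (intercept-only or
   covariates-only) is not stated here, so the function takes it as an
   input; the function name and the example values are hypothetical.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's pseudo-R^2 from two log-likelihoods and the sample size."""
    # Cox-Snell R^2 compares the fitted model with the reference model.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Upper bound of the Cox-Snell R^2, reached when ll_model equals 0.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for 1000 samples (not real data).
n_samples = 1000
ll_reference = n_samples * math.log(0.5)  # balanced intercept-only model
r2 = nagelkerke_r2(ll_reference, -680.0, n_samples)
```

   The rescaling by the upper bound is what lets the statistic reach 1
   for a perfectly fitting model, unlike the Cox-Snell R^2 itself.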

   Because we hypothesized that cross-trait genetic correlations might be
   in the same direction in individuals with and without multimorbidity,
   we analyzed individuals with and without multimorbidity separately. For
   the analysis of the individuals without multimorbidity, we excluded
   samples overlapping base and target phenotypes from the testing
   datasets for each phenotype pair. We then investigated the cross-trait
   associations per population. The significance level for cross-trait
   analysis was set to P < 0.05/132, adjusted by Bonferroni correction.
   Next, to adjust for the interaction of smoking behavior and sex on the
   associations of phenotype pairs, we conducted an interaction analysis.
   In this analysis, we tested the cross-trait associations of genome-wide
   PRS, including age, sex, smoking amount (pack-years), top ten genetic
   PCs, and sex * smoking amount as the interaction term in the logistic
   regression models. We used lmtest R package (v0.9.40) to perform
   likelihood ratio tests and calculate the P-values of the interaction
   term. To evaluate the cross-trait associations of genome-wide PRS for
   males and females, we performed a sex-stratified PRS association
   analysis.
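
   The likelihood ratio test for the interaction term compares the
   logistic model with and without sex * smoking amount. The sketch below
   is a stdlib-only illustration rather than the lmtest implementation. It
   assumes one degree of freedom (binary sex multiplied by continuous
   pack-years); the function name and log-likelihoods are hypothetical.

```python
import math


def lrt_pvalue_1df(ll_reduced: float, ll_full: float) -> float:
    """P-value of a likelihood ratio test with one degree of freedom."""
    stat = max(2.0 * (ll_full - ll_reduced), 0.0)
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods of the models without and with the
# interaction term (not real data).
p_interaction = lrt_pvalue_1df(ll_reduced=-500.0, ll_full=-496.0)
```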

   Next, we assessed the associations of genetic risks between phenotypes
   in individuals with multimorbidity using genome-wide PRS. For
   multimorbid individuals with base and target phenotypes in the testing
   datasets, we tested the associations between the PRS for base and
   target phenotypes, adjusting for age, sex, and top ten genetic PCs
   based on the linear regression models.

   Finally, we assessed the cross-trait associations between respiratory
   and cardiometabolic diseases across EAS biobanks. We constructed
   genome-wide PRS using EAS GWAS meta-analysis (BBJ1 + BBJ2) as a
   training dataset. We then tested the cross-trait associations between
   respiratory and cardiometabolic diseases in the TMM as a testing
   dataset, adjusting for the same covariates used in the genome-wide PRS
   association analysis.

Genome-wide PRS association analysis for metabolome and proteome

   We used targeted high-throughput NMR metabolomics from Nightingale
   Health Ltd (biomarker quantification version 2020) to measure 249
   circulating lipid and metabolite biomarkers from 54,250 serum samples
   obtained from the BBJ1 participants. For PRS association analysis in
   EUR population, we obtained the UKB NMR metabolomics data comprising
   291,003 samples. We performed quality control for the two metabolome
   datasets separately to remove technical variation using ukbnmr R
   package (v.2.2). The details of technical variation adjustments using
   ukbnmr are described elsewhere^[262]49. We analyzed 325 markers in
   total generated from ukbnmr. After excluding duplicates, we used
   samples derived from EAS (BBJ1: 51,612 individuals) and EUR (UKB:
   245,349 individuals) populations based on genotype PCA criteria as the
   testing datasets. All metabolites were subject to inverse rank
   normalization transformations before association analysis. To remove
   the sample overlap between training and testing datasets, we performed
   GWAS after excluding samples with metabolome or proteome measurements
   from the BBJ1 and UKB (Supplementary Table [263]7). We meta-analyzed the GWAS
   summary statistics for each population (BBJ1 and BBJ2 in EAS and UKB
   and FinnGen in EUR) to construct single-population genome-wide PRS
   using PRS-CSx. We tested the association of each metabolite with the
   PRS by linear regression, adjusting for age, sex, and the top 10
   genetic PCs^[264]77. As the BBJ1 NMR metabolomics data were derived
   from two batches (47,355 and 4257 individuals) with different
   genotyping and imputation protocols, we first analyzed genome-wide
   PRS–metabolome associations per batch and then performed an
   inverse-variance fixed-effect meta-analysis using the metagen function
   implemented in the meta R package (v.7.0-0).
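
   The two numerical steps described above (rank-based inverse normal
   transformation, followed by inverse-variance fixed-effect pooling of
   per-batch estimates) can be sketched in Python as follows. This is a
   minimal illustration rather than the code used in this study, which
   relied on the meta R package; the function names, the rank offset of
   0.5, and the use of a normal approximation for the pooled P value are
   assumptions made for the example.

```python
import math
from statistics import NormalDist


def inverse_rank_normalize(values, offset=0.5):
    """Rank-based inverse normal transform; tied values share the average rank.

    The offset of 0.5 is an assumption for this sketch (other offsets exist).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect estimate, its SE, and a two-sided P."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(beta / se)))
    return beta, se, p_value
```

   With two hypothetical batch estimates of equal precision, for example
   fixed_effect_meta([0.10, 0.20], [0.10, 0.10]), the pooled effect is
   approximately 0.15, the simple average of the two inputs.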

   We measured the expression of circulating proteins in 2700 BBJ1
   individuals across three batches using the Olink Explore 3072
   platform. The expression levels on a normalized scale (Normalized
   Protein eXpression: NPX) were bridge-normalized using the OlinkAnalyze
   R package and subsequently rank-inverse normal transformed. We
   excluded proteins
   with missing data in >80% of the samples. For EUR analysis, we obtained
   bridge-normalized Olink Explore 3072 proteomics data from the UKB
   measured from 53,058 individuals^[265]78. We selected samples from 2071
   EAS individuals in the BBJ1 and 45,631 EUR individuals in the UKB based
   on the genotype PCA criteria. We assessed the association of
   genome-wide PRS with the processed expression levels of 2917 proteins
   measured in both the BBJ1 and UKB, adjusting for the same covariates
   used in the genome-wide PRS–metabolome association analysis.

MAGMA gene-set analysis

   As one of the three quality control steps for pathway PRS analysis, we
   investigated the enrichment of pathways related to the immune system
   and lipid metabolism using MAGMA (v.1.10)^[266]53. First, we selected
   335 pathways tagged “Immune system”, “Transport of small molecules”,
   and “Metabolism” from the curated gene sets of Reactome in MSigDB
   (released in January 2023)^[267]52. Then, we analyzed the pathway
   enrichment using the MAGMA gene-set analysis function with the
   population-matched 1KG Project Phase 3 reference data. We used the
   meta-analyzed GWAS summary statistics for the MAGMA gene-set analysis.
   The
   significance level for the analysis was defined as P < 0.05/335,
   adjusted by Bonferroni correction. In total, 18 pathways passed the
   significance level and were used for downstream analyses.
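
   For reference, the Bonferroni threshold stated above works out to
   0.05/335, or approximately 1.49 × 10^-4. A minimal Python sketch of
   this selection step follows; the function names and the pathway names
   and P values in the usage note are hypothetical and are not taken from
   the MAGMA output of this study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def select_gene_sets(p_values, n_tests, alpha=0.05):
    """Return the names of gene sets whose P value is below alpha / n_tests.

    p_values maps a gene-set name to its enrichment P value; n_tests is the
    total number of gene sets tested (335 in the text above).
    """
    threshold = bonferroni_threshold(alpha, n_tests)
    return [name for name, p in p_values.items() if p < threshold]


# Threshold corresponding to the 335 Reactome pathways described in the text.
MAGMA_THRESHOLD = bonferroni_threshold(0.05, 335)
```

   For instance, select_gene_sets({"PATHWAY_A": 1e-5, "PATHWAY_B": 1e-3},
   n_tests=335) keeps only the hypothetical PATHWAY_A.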

Pathway PRS analysis

   Because pathway PRS aggregates risk alleles per pathway, we performed
   cross-trait pathway PRS analysis to detect functions shared between
   respiratory and cardiometabolic diseases. We used training GWAS
   datasets (BBJ1 for EAS and FinnGen for EUR) to construct PRS and
   applied the PRS weights to the testing datasets (BBJ2 for EAS and UKB
   for EUR) to perform association analyses. The details of the pathway
   PRS analysis are described in [268]Supplementary Methods. To
   investigate the predictive performance and concordance of PRS-CSx with
   PRSet (implemented in PRSice v.2.3.5), we compared the statistics
   calculated from the curated gene sets of Reactome in
   MSigDB^[269]29,[270]32,[271]52. As PRSet targets pathways with 10–2000
   genes, we analyzed 1,319 pathways satisfying the requirement. We
   matched the gene boundaries in the PRS-CSx analyses to those used in
   the other MAGMA-based analyses. Specifically, we included SNPs located
   within ±10 kilobases (kb) of the coordinates of each gene to capture
   potential regulatory elements. For the PRSet analyses, we included
   SNPs located within the gene coordinates extended by 35 kb upstream
   and 10 kb downstream of each gene. We
   evaluated the predictive performance of the two PRS tools based on the
   single-trait analysis per population. The logistic regression analyses
   were adjusted for age, sex, and the top ten genetic PCs.
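
   The two gene-boundary definitions described above differ only in how
   far the gene coordinates are extended on each side. The following
   Python sketch illustrates the idea under stated assumptions: the Gene
   record, the function names, and the strand-aware handling of
   "upstream" and "downstream" are hypothetical choices for this example
   and do not reproduce the internal logic of PRS-CSx or PRSet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    chrom: str
    start: int   # 1-based start coordinate
    end: int     # 1-based end coordinate
    strand: str  # "+" or "-"


def window(gene, upstream_bp, downstream_bp):
    """Return (low, high) coordinates of the extended gene window.

    Assumption: upstream is the side before the start on the "+" strand
    and the side after the end on the "-" strand.
    """
    if gene.strand == "+":
        low, high = gene.start - upstream_bp, gene.end + downstream_bp
    else:
        low, high = gene.start - downstream_bp, gene.end + upstream_bp
    return max(low, 1), high


def snps_in_window(snps, gene, upstream_bp, downstream_bp):
    """Select SNP identifiers from (rsid, chrom, position) tuples in the window."""
    low, high = window(gene, upstream_bp, downstream_bp)
    return [
        rsid
        for rsid, chrom, position in snps
        if chrom == gene.chrom and low <= position <= high
    ]
```

   Under these assumptions, a symmetric window corresponds to
   window(gene, 10_000, 10_000) and the asymmetric one to
   window(gene, 35_000, 10_000).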

   Next, we investigated the pathway-level cross-trait associations for
   the 18 pathways that were significant in the MAGMA gene-set analyses.
   We excluded samples overlapping base and target phenotypes from the
   testing datasets and analyzed cross-trait associations for each
   population, adjusting for the same covariates as in the other PRS
   analyses. We defined the significance level as P < 0.05/2,376,
   adjusted by Bonferroni correction. To filter out likely false-positive
   associations and pathways with low predictive performance, we set the
   Nagelkerke’s R^2 threshold to 0.001 in the cross-trait Bayesian
   pathway PRS analysis.
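
   As a point of reference for the filter above, Nagelkerke’s R^2 can be
   computed from the log-likelihoods of a null and a full logistic
   regression model. The sketch below is illustrative only: the function
   names are hypothetical, and whether the threshold was applied as
   "greater than or equal to" 0.001, and to which model comparison, are
   assumptions not specified in this section.

```python
import math


def nagelkerke_r2(loglik_null, loglik_full, n_samples):
    """Nagelkerke's R^2 from the log-likelihoods of the null and full models."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_full) / n_samples)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n_samples)
    return cox_snell / max_cox_snell


def keep_pathway(p_value, r2, n_tests=2376, alpha=0.05, r2_min=0.001):
    """Apply the Bonferroni threshold and the minimum R^2 filter.

    Assumption: a pathway is retained when P < alpha / n_tests and
    R^2 >= r2_min.
    """
    return p_value < alpha / n_tests and r2 >= r2_min
```

   The Bonferroni threshold used here, 0.05/2,376, is approximately
   2.1 × 10^-5.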

Cell type enrichment analysis

   For cell type enrichment analysis, we focused on relevant cell types to
   avoid overly strict multiple-testing corrections derived from cell
   types irrelevant to or weakly correlated with respiratory diseases.
   Because the lungs are the organs responsible for asthma, we analyzed
   the human lung scRNA-seq dataset derived from the Human Lung Cell
   Atlas^[272]55, the 50 cell types of which included stromal, immune,
   epithelial, and endothelial cells. We calculated gene-level Z-scores
   from the GWAS meta-analysis summary statistics using MAGMA. For the
   cell type enrichment analysis, we selected the top 1000 genes based on
   the gene-level Z-scores as a set of putative disease genes. Using the
   “compute_score” function in scDRS (v1.0.2), we calculated a disease
   score for each cell in the scRNA-seq dataset by aggregating the
   expression of the putative disease genes and computed 1000 sets of
   control scores using random gene sets^[273]36. Then, we normalized the
   raw disease and control scores for each cell. We used the
   “compute_downstream” function in scDRS with the default settings to
   associate the putative disease gene sets with the cell types. The
   significance level for the cell type association and heterogeneity
   tests was FDR < 0.05, adjusted using the Benjamini-Hochberg method
   across all 600 pairs of cell types and phenotypes.
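
   The Benjamini-Hochberg adjustment referred to above can be sketched in
   a few lines of Python. This is a generic implementation of the
   standard step-up procedure with a hypothetical function name, not the
   code used in this study; in the present setting the input would be the
   600 P values for all cell type–phenotype pairs.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_pairs(p_by_pair, fdr_cutoff=0.05):
    """Return the keys whose adjusted P value is below the FDR cutoff."""
    keys = list(p_by_pair)
    adjusted = benjamini_hochberg([p_by_pair[k] for k in keys])
    return [k for k, q in zip(keys, adjusted) if q < fdr_cutoff]
```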

   For cross-population analysis of cell type enrichment, we compared the
   mean differences in scDRS disease scores between the EAS and EUR
   populations. We calculated a mean scDRS disease score per cell type for
   each individual in the scRNA-seq dataset. We then compared the mean
   differences in scDRS disease scores generated from the EAS and EUR GWAS
   using a two-sided paired t-test for all the combinations of 12
   phenotypes and 50 cell types (adjusted for multiple comparisons using
   the Benjamini-Hochberg method across all pairs of the cell types and
   phenotypes)^[274]56. Because many pairs of cell types and phenotypes
   satisfied the significance level, we focused on the cell
   type–phenotype pairs with prominent differences in the mean scDRS
   disease scores between the EAS and EUR populations. Among the 600
   pairs of
   50 cell types × 12 phenotypes, we defined the combinations satisfying
   three requirements as population-specific enriched cell types: (1) the
   combinations were significant in the EAS or EUR cell type enrichment
   analyses; (2) the mean differences in the scDRS disease scores
   generated from the EAS and EUR GWAS were significant (FDR < 0.05); and
   (3) the mean differences in scDRS disease scores generated from the EAS
   and EUR GWAS fell outside the 95% confidence interval of the
   distribution of all 600 mean differences (Supplementary Fig. [275]28).
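
   The three requirements above amount to a conjunction of filters over
   the 600 cell type–phenotype pairs, which the following Python sketch
   makes explicit. The record layout and function names are hypothetical,
   and the third requirement is implemented here as falling outside the
   empirical 2.5th to 97.5th percentile range of all mean differences;
   this interpretation is an assumption, as the exact interval
   construction is not specified in this section.

```python
from statistics import quantiles


def central_95_interval(values):
    """Empirical 2.5th and 97.5th percentiles of the supplied values."""
    cuts = quantiles(values, n=40, method="inclusive")  # steps of 2.5%
    return cuts[0], cuts[-1]


def population_specific(records, fdr_cutoff=0.05):
    """Return the pairs satisfying all three requirements.

    Each record is a dict with the keys:
      "pair"      - (cell type, phenotype) identifier
      "sig_eas"   - significant in the EAS enrichment analysis (bool)
      "sig_eur"   - significant in the EUR enrichment analysis (bool)
      "fdr_diff"  - FDR of the paired test on the mean difference
      "mean_diff" - mean difference in disease score between EAS and EUR
    """
    low, high = central_95_interval([r["mean_diff"] for r in records])
    return [
        r["pair"]
        for r in records
        if (r["sig_eas"] or r["sig_eur"])           # requirement (1)
        and r["fdr_diff"] < fdr_cutoff              # requirement (2)
        and not (low <= r["mean_diff"] <= high)     # requirement (3)
    ]
```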

   For pathway-level cell type enrichment analyses, we used the gene sets
   identified in the cross-trait pathway PRS analyses. Because
   the original paper on scDRS validated the statistical power of scDRS
   for gene sets with 100–2000 genes^[276]36, we conservatively analyzed
   pathways with more than 100 genes. Among the five pathways with
   significant multicategorical associations in respiratory and
   cardiometabolic diseases, the PPARα pathway was eligible for the
   analysis. As in the cross-population analysis, we used the default
   settings of scDRS. We focused on enrichment in immune cells to assess
   the association of lipid metabolism with the immune system. To further
   focus on genes contributing to the pathway-level enrichment identified
   in scDRS, we compared the expression of the top genes defined by MAGMA
   gene analysis with the pathway-level disease scores. Because 25–42
   genes in the PPARα pathway satisfied the suggestive threshold
   (P < 0.05) in MAGMA gene analysis for asthma and dyslipidemia, we
   calculated the average expression of the top 5, 10, 15, 20, and 25
   genes in MAGMA gene analysis using the “scanpy.get.aggregate” function
   implemented in scanpy (v.1.9.8)^[277]79. We found that the top five
   genes were representative of the pathway-level enrichment of the PPARα
   pathway in the immune cells, especially in the analysis of asthma
   (Supplementary Fig. [278]26). Thus, we assessed the expression of the
   top five genes in the pathway-level cell type enrichment analysis.

Reporting summary

   Further information on research design is available in the [279]Nature
   Portfolio Reporting Summary linked to this article.

Acknowledgements