Abstract

   Chickens are the most abundant agricultural animals globally, with
   controlling abdominal fat deposition being a key objective in poultry
   breeding. While GWAS can identify genetic variants associated with
   abdominal fat deposition, the precise roles and mechanisms of these
   variants remain largely unclear. Here, we use male chickens from two
   lines divergently selected for abdominal fat deposition as experimental
   models. Through the integration of genomic, epigenomic, 3D genomic, and
   transcriptomic data, we build a comprehensive chromatin 3D regulatory
   network map to identify the genetic regulatory mechanisms that
   influence abdominal fat deposition in chickens. Notably, we find that
   the rs734209466 variant functions as an allele-specific enhancer,
   remotely enhancing the transcription of IGFBP2 and IGFBP5 by the
   binding transcription factor IRF4. This interaction influences the
   differentiation and proliferation of preadipocytes, which ultimately
   affects phenotype. This work presents a detailed genetic regulatory map
   for chicken abdominal fat deposition, offering molecular targets for
   selective breeding.

   Subject terms: Animal breeding, Agricultural genetics, Gene regulation
     __________________________________________________________________

   Abdominal fat deposition is an important trait in chickens farmed for
   food. Here, the authors study this trait by integrating -omics data
   from several sources to construct a 3D chromatin map and validate key
   variants.

Introduction

   Chicken, being the most widely consumed meat globally, is valued for
   its high-quality protein content and cost-effectiveness^[44]1. However,
   traits related to abdominal fat deposition (AFD) significantly impact
   meat quality and production efficiency. Research indicates that
   excessive fat accumulation in chickens can notably decrease feed
   conversion efficiency^[45]2 and carcass lean meat rate^[46]3. Moreover,
   breeder chickens with excessive fat can negatively affect egg
   production rate, fertilization rate, and hatchability rate, potentially
   leading to increased mortality during the laying period^[47]4.
   Therefore, effectively controlling AFD in chickens is a crucial issue
   that requires urgent attention in chicken breeding worldwide.

   Furthermore, chickens are valuable model organisms in life sciences
   research, playing a significant role in the study of human obesity and
   related metabolic diseases^[48]5,[49]6. They exhibit characteristics
   similar to early stages of human type 2 diabetes such as high blood
   glucose and insulin resistance^[50]7, making them an ideal animal model
   for obesity research. For instance, the obesity-inducing virus SMAM-1,
   which leads to obesity in chickens, has been found to have similar
   effects in humans^[51]8. Moreover, several quantitative trait loci
   associated with obesity in chickens contain genes linked to human
   susceptibility to obesity^[52]9. Therefore, studying the regulatory
   mechanisms of AFD traits in chickens also offers valuable insights into
   understanding the molecular mechanisms of human obesity and related
   diseases.

   The AFD traits are a complex quantitative trait controlled by multiple
   genomic variants with intricate genetic regulatory mechanisms. Recent
   research has made significant progress in identifying genomic variants
   associated with AFD traits through genome-wide association studies
   (GWAS). Despite these advancements, the functional mechanisms of these
   variants, especially those located in non-coding regions, remain poorly
   understood, leading to a ‘black box’ scenario regarding their specific
   effects on phenotype. Traditional research methodologies, such as the
   ‘nearest gene model’ based on locational proximity, simply assign the
   nearest gene as the regulatory target of a genomic variant^[53]10.
   However, recent studies revealed that approximately 75–76% of genomic
   variants affect target genes through long–range interactions^[54]11,
   which challenges the effectiveness of the ‘nearest gene model’.

   To overcome these limitations, many researchers have turned to
   expression quantitative trait locus (eQTL) analysis, which aims to
   identify correlations between genomic variants and the expression
   levels of their corresponding target genes. International initiatives,
   such as the Farm Animal Genotype–Tissue Expression (FarmGTEx)
   project^[55]12–[56]14, have contributed to extensive eQTL datasets on
   adipose tissues from agricultural animals, enhancing our understanding
   of the genetic mechanisms underlying genomic variants associated with
   economic traits. However, eQTL analysis predominantly relies on
   statistical correlations and fails to directly elucidate the
   functionality of genomic variants. Emerging epigenetic techniques
   suggest that variants located in regulatory elements have functional
   implications^[57]15, but these methods are limited in explicating how
   functional variants exert their influence through specific gene
   regulation. In contrast, three–dimensional (3D) genomics approaches
   based on the frequency of physical chromatin interactions can predict
   target genes but often neglect the significant influence of epigenetic
   activity on variants^[58]11,[59]16,[60]17. Therefore, the integration
   of 3D genomics with multiple omics datasets is crucial for
   comprehensively understanding how noncoding variants influence
   transcriptional regulation^[61]18–[62]22.

   In this study, we used two Northeast Agricultural University Broiler
   Lines (NEAUHLF), which have been selected for divergent abdominal fat
   content since 1996, as our experimental populations. These two lines
   have proven to be ideal genetic models for investigating the underlying
   mechanisms of obesity^[63]23. Building upon this, we constructed an
   Integrative Multi–omics approach for Variant–Gene Interactions (IMVGI)
   that considers genomic, epigenomic, 3D genomic, and transcriptomic data
   and broadly analyzed how genomic variants systematically affect the
   transcription regulatory mechanisms of AFD traits. The strategy was to
   use whole–genome sequencing analysis of the NEAUHLF to identify genomic
   variants in the chicken genome, and selection signature analysis was
   used to identify variants associated with AFD traits. Subsequently, the
   functionality of these variants was annotated through epigenetic
   analysis, and their potential target genes were identified through 3D
   genomic analysis. By further employing weighted gene coexpression
   network analysis (WGCNA), we established correlations between these
   target genes and AFD traits. Finally, experimental validation revealed
   the role of crucial functional variants in AFD traits and their
   regulatory mechanisms. In summary, our research unveils a detailed
   framework that shows how genomic variants influence transcriptional
   regulation associated with AFD traits, providing insights into the
   genetic basis of these traits and identifying significant targets for
   the molecular breeding of chicken AFD traits.

Results

Analysis of genomic selection signatures uncovers candidate variants
associated with fat deposition

   In this research, we conducted a thorough investigation to identify
   candidate variants related to abdominal fat deposition (AFD) traits.
   Our research focused on a unique population of 330 chickens from the
   Northeast Agricultural University broiler lines that were divergently
   selected for abdominal fat content (NEAUHLF) for 19^th generations
   (years). Despite their similar body weights at 7 weeks of age (BW7),
   chickens from the fat (FL) and lean (LL) lines displayed a notable
   7.83–fold difference in abdominal fat weight (AFW) and a 7.51–fold
   difference in abdominal fat percentage (AFP = AFW/BW7) (Fig. [64]1a, b
   and Supplementary Fig. [65]1). This discrepancy underscores the
   significant phenotypic divergence between the FL and LL groups in terms
   of AFD traits.

Fig. 1. Genomic analysis identifying candidate variants for fat deposition in
broiler chicken lines.

   [66]Fig. 1
   [67]Open in a new tab

   a Comparative visualization of fat (FL) and lean (LL) lines,
   highlighting the marked difference in abdominal fat tissue. b Graphical
   representation of body weight at 7 weeks of age (BW7), abdominal fat
   weight (AFW), and abdominal fat percentage (AFP = AFW/BW7) in the FL
   and LL groups. Data were shown as the mean ± SD, n = 160 (FL), 170
   (LL), biologically independent animals. Statistical analysis was
   performed using an unpaired two-tailed Student’s t-test. The fold
   differences observed were as follows: BW7 in FL relative to LL is
   1.04-fold (p < 0.0001); AFW is 7.83-fold (p < 0.0001); and AFP shows a
   7.51-fold difference (p < 0.0001). c Circos plot displaying the
   distribution characteristics of genomic variants, including SNPs and
   InDels, across the whole genome. d Population structure analysis at
   K = 2, illustrating distinct genetic clusters. e Three–dimensional
   principal component analysis (PCA) plot. f Analysis of linkage
   disequilibrium (LD) decay distance, quantified by R^2 values. g
   Genome–wide F[ST] bar plot, with a red dashed line indicating the
   cutoff threshold (F[ST] > 0.2), highlighting regions of high genetic
   differentiation. h Genome–wide θπ ratio bar plot. The red dashed line
   indicates the cutoff thresholds for log2 (θπ ratios) > 0.3 or < −0.3,
   identifying areas under selection or with high genetic variation. i
   Scatter plot showing 2301 selected regions associated with abdominal
   fat deposition AFD traits identified through joint F[ST] and θπ ratios.
   j Venn diagram illustrating the overlap between the 2301
   adipose–related selected regions and known adipose–related QTLs from
   the Chicken QTLdb. k Manhattan plot highlighting significant changes in
   allele frequency (AF) between the FL and LL groups, with regions under
   selection marked. l Pie chart categorizing the 311,947 candidate
   variants linked to AFD traits. Source data are provided as a Source
   Data file.

   Population structure analysis was subsequently conducted to estimate
   the genomic relationship between these two groups. Whole–genome
   sequencing (WGS) of these 330 individuals revealed 4,677,252
   high–quality, filtered genomic variants (4,140,284 single nucleotide
   polymorphisms (SNPs) and 536,968 insertions/deletions (InDels)).
   Intriguingly, 20.01% of these variants were novel and were not recorded
   in the dbSNP database
   ([68]https://www.ncbi.nlm.nih.gov/projects/SNP/snp_summary.cgi, last
   accessed on August 22, 2021) (Fig. [69]1c). Admixture analysis
   partitioned the population into FL and LL subpopulations at K = 2
   (Fig. [70]1d), which was supported by principal component analysis
   (PCA) (Fig. [71]1e). Linkage disequilibrium (LD) analysis indicated
   rapid degradation, particularly in the LL subpopulation, suggesting
   intense selection pressure (Fig. [72]1f). The significant genetic
   differences between the FL and LL groups make them ideal subjects for
   studying traits influenced by artificial selection, such as AFD traits.

   To identify candidate variants associated with AFD traits, we employed
   selection signature analysis. We calculated the fixation indices
   (F[ST]) and θπ ratios across the genome, leading to the identification
   of 2301 regions selected for fat traits (Fig. [73]1g–i). Remarkably,
   72.23% of these regions overlapped with known quantitative trait loci
   (QTLs) associated with the fat traits from the Chicken QTLdb^[74]24
   (Fig. [75]1j), suggesting their potential involvement in AFD traits. By
   focusing on these regions, we identified variants that showed
   significant changes in allele frequency (|ΔAF | ≥ 0.5) between FL and
   LL as candidate variants linked to AFD traits. This process allowed us
   to identify 311,947 candidate variants associated with AFD traits,
   predominantly located in noncoding intergenic/intronic regions
   (94.42%), posing challenges in identifying the true functional variants
   and their target genes (Fig. [76]1k, l).

Identification of potentially functional variants regulating fat deposition
using epigenomic techniques

   To identify the potentially functional variants among the 311,947
   variants associated with AFD traits, we employed a comprehensive
   annotation strategy. This methodology entailed the synthesis of data
   derived from transposase–accessible chromatin sequencing (ATAC–seq) and
   chromatin immunoprecipitation sequencing (ChIP–seq) of chicken
   abdominal fat tissues to increase the precision of our identification
   process.

   In our analysis of open chromatin regions (OCRs) using ATAC–seq data,
   we focused on identifying differentially open chromatin regions (DOCRs)
   in abdominal fat tissues between the FL and LL groups, as those showing
   statistically significant changes (p < 0.05, |log2 (fold change)| > 1).
   This approach led us to identify 128,090 OCRs, 4028 of which met the
   criteria for classification as DOCRs (Fig. [77]2a, b). These DOCRs
   exhibited distinct length characteristics and were predominantly
   enriched in regions flanking transcription start sites (TSSs)
   (Fig. [78]2c, d). We further classified these DOCRs into distinct
   regulatory elements (REs) based on their genomic features using
   ChIP-seq data. Specifically, we identified 218 promoters (TSS −4
   kb/+2 kb, H3K4me3, H3K27ac), 1106 enhancers (H3K4me1, H3K27ac), 312
   silencers (H3K27me3), 2337 CTCF-enriched regions (marked by CTCF
   peaks), and 1264 low-signal regions (regions lacking significant RE
   signals) (Fig. [79]2e, f). Notably, some DOCRs may exhibit multiple
   regulatory functions, leading to their classification as various types
   of REs (Supplementary Fig. [80]2a). These REs showed higher fold
   changes in activity signals between FL and LL compared to low-signal
   regions (Fig. [81]2g), and these low signal regions are highly
   conserved across all biological replicate samples (Supplementary
   Fig. [82]2b), thereby enhancing the credibility of our annotations.

Fig. 2. Epigenomic screening of functional variants influencing fat
deposition.

   [83]Fig. 2
   [84]Open in a new tab

   a Circos plot showing the distribution and frequency of open chromatin
   regions (OCRs) across chromosomes identified by ATAC–seq, with each
   ring representing different sample data. b Volcano plot of 4028
   differentially expressed OCRs between FL and LL, with significance
   thresholds. c Graphical representation of the size distribution of
   differential OCRs. d Enrichment analysis of ATAC–seq signals near
   transcription start sites (TSSs). e Annotation of DOCRs with regulatory
   elements (REs) using ChIP–seq data of histone modifications. f Number
   of annotated REs in DOCRs. g Comparative analysis of the activity
   signals of different REs. Violin plots provide a density estimation of
   the data distribution, and box plots display the median (central line),
   interquartile range (IQR, bounds), and whiskers representing the
   minimum and maximum values. Sample sizes are n = 1264 (Low signal), 218
   (Promoter), 1106 (Enhancer), 312 (Silencer), and 2337 (CTCF),
   biologically independent samples. P values were obtained using one-way
   ANOVA followed by Dunnett’s multiple comparisons test. h Circular plot
   depicting the identification of superenhancers within DOCRs. i Circular
   plot showing the identification of supersilencers within DOCRs. j, k
   Comparative analysis of chromatin marks in enhancers and silencers,
   with box plots showing the median, interquartile range, and whiskers
   representing the range of data, and violin plots providing a density
   estimation of the data distribution. Data are plotted in box plots
   showing the median (central line) and IQR (bounds) with whiskers
   extending to the minimum and maximum values. n = 1106 (Typical
   enhancer), 280 (Super enhancer), 312 (Typical silencer) and 41 (Super
   silencer), biologically independent samples. P values were obtained
   using unpaired two-tailed Student’s t-test. l Bar and pie charts
   showing the distribution and proportion of functional variants across
   different REs. Source data are provided as a Source Data file.

   Intriguingly, in addition to typical REs, we also identified a distinct
   class of enhancers and silencers, termed superenhancers (SEs) and
   supersilencers (SSs), respectively. SEs were defined by applying the
   ranking of the ROSE pipeline^[85]25 to ATAC–seq and H3K4me1 signals,
   which revealed 280 SEs in DOCRs (Fig. [86]2h). The authors of a recent
   study referred to H3K27me3–rich genomic regions as SSs that promote
   gene repression through chromatin interactions^[87]26. Similar to the
   identification of SEs, we defined SSs by utilizing the ROSE algorithm
   to rank ATAC–seq signals and H3K27me3 signals. This method allowed for
   the recognition of 41 SSs in DOCRs (Fig. [88]2i). These SEs/SSs showed
   stronger activity signals than regions with typical REs (Fig. [89]2j,
   k), enhancing the credibility of our annotations.

   After applying the comprehensive annotation strategy using ATAC–seq and
   ChIP–seq data, we identified 2162 potentially functional variants
   embedded within annotated REs. The functional relevance of these
   variants is underscored by their distribution: approximately half were
   significantly enriched at CTCF–binding sites (50.66%), and the rest
   (45.24%) were nested within distal regulatory elements such as
   enhancers, SEs, silencers, and SSs (Fig. [90]2l). The localization of
   these 2162 variants within REs has potential functional significance,
   as they may influence the activity of these elements and consequently
   modify the transcriptional regulation of target genes.

Identification of variant–gene interactions using three–dimensional genomes

   To investigate how 2162 potentially functional variants influence gene
   regulatory mechanisms, we utilized Hi–C technology to perform
   high–resolution genome–wide chromatin interaction mapping of the
   abdominal fat tissues of the FL and LL groups (Supplementary
   Table [91]1). Through this chromatin interaction mapping, we
   constructed chromatin interaction frequency plots, revealing complex
   chromatin structures (A/B compartments, topologically associating
   domains [TADs], and loops) in both the FL and LL genomes (Fig. [92]3a).
   Furthermore, we crafted individualized three–dimensional (3D) chromatin
   models for FL and LL, revealing marked structural differences between
   the two groups. Notably, the LL group exhibited a weaker chromatin
   conformation, as indicated by increased Von Neumann entropy^[93]27 and
   decreased interaction frequencies (Fig. [94]3b–d, Supplementary
   Figs. [95]3 and [96]4).

Fig. 3. 3D genome mapping of variant–gene interactions in fat and lean
broiler chicken lines.

   [97]Fig. 3
   [98]Open in a new tab

   a Hi–C contact matrices at varying resolutions for chromosomes in the
   FL and LL genomes, with darker shades indicating higher chromatin
   interaction frequencies. b Three-dimensional average models of FL and
   LL chromatin structures, color-coded to represent different chromosomes
   at a 500 kb resolution, illustrating average inter-chromosomal spatial
   relationships rather than specific diploid configurations. c Box plots
   of Von Neumann entropy comparing chromatin conformational looseness in
   the FL and LL genomes. Boxplot center line, median; box limits, upper
   and lower quartiles; whiskers, 1.5× interquartile range and data beyond
   that threshold indicated as outliers. P values were obtained using
   unpaired two-tailed Student’s t-test, n = 40 biologically independent
   samples. d Log–log plot of contact frequency versus distance,
   highlighting structural differences between the FL and LL genomic
   architectures. e Chromatin A/B compartment distribution maps for
   specific chromosomal regions in the FL and LL genomes at 100 kb
   resolution. f Frequency charts of A/B compartment transitions, showing
   the proportion of transitions between A/B compartments in the FL and LL
   genomes. g Aggregate peak analysis (APA) of all topologically
   associating domains (TADs) within the FL and LL genomes at 10 kb
   resolution. h Statistical summary of specific and shared TAD
   boundaries, including counts and percentages. i APA for a comparative
   analysis of all chromatin loops within the FL and LL genomes at
   resolutions of 1, 5, 10, and 25 kb. j Distribution charts summarizing
   the number and percentage of specific and shared chromatin loops. k
   Schematic diagrams illustrating how functional variants influence gene
   interactions through changes in proximal and distal RE activity and in
   TAD and loop structures. l Table detailing 2559 variant–gene
   interactions identified, categorized by interaction type, variant
   number, REs, and target genes involved. Source data are provided as a
   Source Data file.

   We conducted a comparative analysis of A/B compartments, TADs, and loop
   structures between the FL and LL groups. Our analysis at a 100 kb
   resolution revealed that only 6.16% of the total genomic regions showed
   transitions between the A and B compartments in the FL and LL groups,
   suggesting high conservation in these compartment regions (Fig. [99]3e,
   f). Furthermore, a more refined 10 kb resolution allowed us to identify
   a total of 3829 TAD boundaries, of which 2898 (75.69%) were shared by
   the two groups. Additionally, there were 483 (12.61%) FL–specific TAD
   boundaries and 448 (11.70%) LL–specific TAD boundaries (Fig. [100]3g,
   h). Further examination of loop structures at resolutions of 1, 5, 10,
   and 25 kb yielded a total of 4246 loops, with only 1243 (29.27%) loops
   being shared by the two groups. Additionally, there were 2100 (49.46%)
   FL–specific loops and 903 (21.27%) LL–specific loops, which was further
   confirmed by CTCF enrichment at loop anchors (Fig. [101]3i, j and
   Supplementary Fig. [102]4d). These findings underscored a notable
   disparity in TADs (24.31%) and loops (70.73%) structures between FL and
   LL, suggesting their potential regulatory roles.

   We annotated the 2162 potentially functional variants associated with
   AFD traits on the high–order structure of the 3D genome, focusing on
   TAD boundaries and loops, to identify target genes regulated by these
   variants. This process revealed four primary transcriptional regulatory
   mechanisms mediated by functional variants: (1) alterations in proximal
   RE activity, in which 552 functional variants modulate the activity of
   168 proximal REs (promoters, nearby enhancers, and nearby silencers) to
   regulate the transcription of 154 target genes; (2) modifications in
   distal RE activity, in which 619 functional variants exert
   transcriptional control over 261 target genes by altering the activity
   of 160 REs (enhancers and silencers); (3) reconfiguration of the TAD
   structure, in which 40 functional variants impact the transcription of
   18 target genes by modifying 8 TAD structures, particularly at CTCF
   binding sites located at TAD boundaries; and (4) loop structure
   alterations, in which 666 functional variants regulate 267 distant
   target genes by influencing 163 loop structures, especially at CTCF
   binding sites within loop anchor points (Fig. [103]3k, l). In total,
   our analysis identified a compendium of 1169 functional variants that
   regulate the expression of 500 target genes by influencing chromatin
   structures such as proximal and distal regulatory elements, TADs, and
   loop structures, resulting in a complex network of 2559 variant–gene
   interactions.

Weighted gene coexpression network analysis reveals key variant–gene
interactions related to fat deposition

   After identifying the target genes of potentially functional variants,
   we employed weighted gene coexpression network analysis (WGCNA) to
   explore the associations between the expression of target genes and fat
   traits, thereby establishing comprehensive variant–gene networks
   related to AFD traits. Upon detailed analytical consideration, a power
   beta value of 11 was chosen for WGCNA to ensure a scale–free network
   (Fig. [104]4a, b). Using this power beta value, 439 of the above 500
   target genes were successfully grouped into four distinct functional
   modules (Fig. [105]4c). Specifically, the blue module encompassed 133
   genes, the turquoise module contained 172 genes, the brown module
   comprised 86 genes, and the yellow module included 48 genes. The
   remaining 61 genes, which did not exhibit a sufficiently strong
   coexpression relationship with the others, have been allocated to the
   gray group in Fig. [106]4c.

Fig. 4. Weighted gene coexpression network analysis (WGCNA) and functional
enrichment analysis.

   [107]Fig. 4
   [108]Open in a new tab

   a Scale independence (left) and mean connectivity (right) analysis for
   choosing the soft threshold in establishing the WGCNA network. b Gene
   clustering and identification of gene modules using WGCNA. c Heatmap
   illustrating the relationships between gene modules and traits. The
   module–trait relationships were assessed by calculating the Pearson
   correlation coefficients (r) between the module eigengenes and trait
   data (AFW and AFP). P values were calculated based on the Pearson
   correlation test, reflecting the statistical significance of the
   correlations. d Network diagram of adipose–related variant–gene
   interactions. e Gene Ontology (GO) term enrichment analysis for target
   genes. f Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway
   enrichment analysis of target genes, providing insight into biological
   pathways. KEGG pathway enrichment analysis was performed using a
   two-sided Fisher’s exact test. Multiple comparisons were adjusted using
   the Benjamini-Hochberg (BH) method to control the false discovery rate
   (FDR). Source data are provided as a Source Data file.

   The association analysis between the gene modules and phenotypes
   revealed a significant relationship between 439 target genes and AFD
   traits among the 1134 functional variants. Specifically, 305 target
   genes of 900 functional variants were associated to fat deposition.
   This finding is supported by the positive correlations observed in the
   blue module, which exhibited a strong positive correlation with AFW
   (r = 0.90, p = 4e − 04) and AFP (r = 0.87, p = 1e − 04), and in the
   turquoise module, which showed a similarly strong positive correlation
   with AFW (r = 0.98, p = 8e − 07) and AFP (r = 0.99, p = 1e − 07)
   (Fig. [109]4c). Conversely, another set of 134 target genes of 518
   functional variants appeared to inhibit fat deposition. This
   observation was supported by the negative correlations in the brown
   module with AFW (r = −0.64, p = 0.01) and AFP (r = −0.65, p = 0.01) and
   in the yellow module with AFW (r = −0.53, p = 0.04) and AFP (r = −0.54,
   p = 0.04) (Fig. [110]4c). Notably, there is an overlap of 284 variants
   between the two sets, resulting in a total of 1134 unique variants
   (900 + 518 - 284) regulating the 439 genes associated with AFD traits.
   Additionally, 61 target genes of 35 functional variants, encompassing
   304 variant–gene interactions, were identified within the grey module
   and were not associated with AFD traits.

   In total, our analysis successfully highlighted 439 significant
   adipose–related target genes influenced by 1134 functional variants,
   encompassing 2255 variant–gene interactions (Fig. [111]4d). Gene
   Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG)
   enrichment analyses further confirmed that these genes strongly
   correlated with lipid metabolism–related pathways, primarily signal
   transduction, metabolic processes, and cell growth and death
   (Fig. [112]4e, f and Supplementary Fig. [113]5d–f).

Functional characterization of variant–gene interactions in relation to fat
deposition

   In this study, we constructed an integrated framework, as described
   above, that combines genomic, epigenomic, 3D genomic, and
   transcriptomic data to unravel the mechanisms of transcriptional
   regulation involved in the modulation of AFD traits by functional
   variants; this framework is termed Integrative Multiomics for
   Variant–Gene Interactions (IMVGI). This holistic approach allowed a
   systematic analysis of functional variants and their impact on
   transcriptional regulatory mechanisms associated with AFD traits.
   (Fig. [114]5a). Using IMVGI, we identified 2255 adipose–related
   variant–gene interactions, including 1134 functional variants (90.48%
   SNPs and 9.52% InDels) and 439 target genes. These functional variants
   were predominantly localized in enhancer–associated regions (45.67%)
   and CTCF–associated regions (40.06%) (Fig. [115]5b and Supplementary
   Data [116]1–[117]3). On average, each functional variant was predicted
   to regulate 1.94 target genes, whereas each target gene was influenced
   by 4.23 functional variants. The average genomic distance of these
   interactions was approximately 201 kb, with about 71.00% of variants
   skipping over the nearest gene to regulate their target genes through
   long-range interactions (Fig. [118]5c–f and Supplementary
   Fig. [119]5a–c), highlighting the importance of integrating 3D genomic
   interactions in this study.

Fig. 5. Functional analysis of variant–gene interactions in relation to AFD
traits.

   [120]Fig. 5
   [121]Open in a new tab

   a Flowchart model for Integrative Multi–omics for Variant–Gene
   Interactions (IMVGI) identification of functional variants and target
   genes related to AFD traits. b Functional annotations for IMVGI
   variants. c Cumulative fractions of target genes per IMVGI variant
   (mean = 1.94). d Cumulative fractions of IMVGI variants per target gene
   (mean = 4.23). e Cumulative fractions of genomic distances for
   identified variant–gene connections (mean = 201 kb). f The bar chart
   indicates the number of genes skipped by variants before interacting
   with their target genes. The x-axis represents the number of genes
   skipped, and the y-axis represents the number of variants. g
   Comparative analysis of the enrichment of non–IMVGI and IMVGI variants
   in active chromatin regions, including ATAC–seq, H3K27ac, H3K4me3,
   H3K4me1, and CTCF markers. Violin plots provide a density estimation of
   the data distribution, and box plots display the median (central line),
   interquartile range (IQR, bounds), and whiskers representing the
   minimum and maximum values. Sample sizes are n = 1134 biologically
   independent samples (unpaired two-tailed Student’s t-test). h Heatmaps
   showing transcription factor binding preferences for non–IMVGI vs.