Abstract
Breeding practices adopted at different farms are aimed at maximizing
the profitability of pig farming. In this work, we have analyzed the
genetic diversity of Large White pigs in Russia. We compared genomes of
historic and modern Large White Russian breeds using 271 pig samples.
We have identified 120 candidate regions associated with the
differentiation of modern and historic pigs and analyzed genomic
differences between the modern farms. The identified genes were
associated with height, fitness, conformation, reproductive
performance, and meat quality.
Keywords: Artificial selection, Breeding, Selection signals, Farming
practices
Introduction
Human consumption drives the artificial selection of farm animals.
Understanding how selection creates genetic differences between
populations of different farms is essential for effective livestock
development.
In the last two centuries, a common strategy was to maximize the
profitability of pig farming by relying on highly productive commercial
breeds (such as Large White, Landrace, and Duroc) with high growth
rates, good feed conversion, and increased lean meat yield ([36]Wang et al., 2018). As a
result, these breeds became popular worldwide, including in the Russian
Federation ([37]Traspov et al., 2016c; [38]Traspov et al., 2016a;
[39]Traspov et al., 2016b; [40]Čandek Potokar et al., 2019; [41]Čandek
Potokar & Nieto, 2019).
Considering that Yorkshire pigs in North America are direct
descendants of the European Large White lineage ([42]Amer et al.,
2020), Large White is the most common commercial breed group. Countries
that develop production usually import the breeding stock of Large
White pigs since these pigs have a flexible genetic structure adapted
to selection pressure ([43]Getmantseva et al., 2020). This flexibility
and genetic variation of the breed make it an exciting object for
scientists striving to find the genomic regions and genes responsible
for the variation.
The initial livestock of Large White pigs (approximately 100 animals)
was brought to the Soviet Union from England in 1923. As a result of
continuous breeding efforts, a new regional population of the Large
White breed was created in the USSR during the second half of the 20th
century ([44]Traspov et al., 2016a; [45]Getmantseva et al., 2020). The
fall of the Soviet Union caused another period of hardship for Russian
pig farming ([46]Smith, 2014). The breeding programs were nearly
stopped, farming practices deteriorated, pigs were massively affected
by diseases, and were culled in huge numbers. After the USSR’s
collapse, the Soviet livestock was almost entirely replaced by imported
pigs from the leading breeding centers of Denmark, France, England,
Holland, Ireland, etc. Mitochondrial DNA analysis of pigs from various
European breeding centers shows significant genetic differences
([47]Getmantseva et al., 2020). In this work, we compare the Large
White pigs of Soviet breeding with the modern commercial pigs. We have
also analyzed the DNA structure of contemporary Large White pigs within
and between the breeding farms in Russia. We have identified selection
signatures attributed to the socio-economic conditions and breeding
centers’ practices.
Materials and Methods
Animals and sample collection
Specialists at the participating holdings collected tissue samples
according to standard monitoring procedures and guidelines, following
the ethical protocols outlined in Directive 2010/63/EU (2010). The
pig ear samples (ear pluck) were obtained as a general breeding
monitoring procedure or during the slaughter. The collection of ear
samples is a standard practice in pig breeding ([48]Kunhareang, Zhou &
Hickford, 2010). Previously collected historic tissue samples of the
Soviet-bred pigs were obtained from breeding farms in Russia between
2006 and 2010.
We have assembled a pool of 271 pig samples: 99 historical samples of
the Large White pigs from the Soviet breeding program (LW_Old, samples
collected from four breeding farms between 2006 and 2010); 106 samples
of Large White pigs of modern breeding from four Russian farms (LW_New:
LW_1 = 28; LW_2 = 31; LW_3 = 26; LW_4 = 21, all samples collected
between 2018 and 2020). The Landrace (L = 23) and Duroc (D = 43)
samples were collected between 2018 and 2020. Genomic DNA was extracted
from ear samples using a DNA-Extran-2 reagent kit (OOO NPF Sintol,
Russia) following the manufacturer’s protocol. The quantity, quality,
and integrity of DNA were assessed using a Qubit 2.0 fluorometer
(Invitrogen/Life Technologies, USA) and a NanoDrop8000
spectrophotometer (Thermo Fisher Scientific, USA).
Genotyping
The samples were genotyped using the GeneSeek® GGP Porcine HD Genomic
Profiler v1 (Illumina Inc, USA), which includes 68,516 SNPs evenly
distributed with an average spacing of 25 kb. Genotype quality control
and data filtering were performed using PLINK 1.9, as recommended by
[49]Marees et al. (2018). The total genotyping rate was 0.999307; 41,262
variants and 271 pigs passed the QC filters and were retained for
further analysis.
Data availability
The dataset can be accessed at
[50]http://www.compubioverne.group/data/PIG/.
Population structure analysis
To study population structure, we performed a singular value
decomposition (SVD) of the genomic relationship matrix (GRM) using the
svd function in R ([51]Barker et al., 2001; [52]Van Raden, 2008). The R
package AdmixTools was used to compute F[2] statistics for all pairs of
populations and outgroup F[3] statistics estimating the
relative divergence time for pairs of populations, using the Duroc pigs
as an outgroup ([53]Lazaridis et al., 2014; [54]Patterson et al.,
2012). AdmixTools was also used to plot the trees ([55]Patterson et al.,
2012; [56]Liu et al., 2019). Using the find_graphs routine, we have
generated and evaluated admixture graphs to find the best-fitting
arrangements. Although F[ST] and F[2] statistics also estimate genetic
distance or divergence time, they may be influenced by population
sizes. The statistic F[3](outgroup; A, B) estimates the genetic distance
between the outgroup and branching point between populations A and B
([57]Maier & Patterson, 2020).
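The definitions of F[2] and outgroup F[3] can be illustrated with a minimal Python sketch. It uses naive allele-frequency estimators without the finite-sample bias correction that a dedicated package would apply, so it is not a substitute for AdmixTools. The function names and the toy allele frequencies are hypothetical and are not taken from the study data.

```python
def f2(freq_a, freq_b):
    """Naive F2: mean squared allele-frequency difference between A and B."""
    return sum((a - b) ** 2 for a, b in zip(freq_a, freq_b)) / len(freq_a)


def outgroup_f3(freq_out, freq_a, freq_b):
    """Naive F3(outgroup; A, B): mean of (o - a) * (o - b) over markers.

    Larger values mean that A and B share more drift after splitting
    from the outgroup, i.e. their branching point lies further from it.
    """
    products = [(o - a) * (o - b) for o, a, b in zip(freq_out, freq_a, freq_b)]
    return sum(products) / len(products)


# Hypothetical allele frequencies at four markers (illustration only).
outgroup = [0.1, 0.5, 0.9, 0.3]
pop_a = [0.4, 0.6, 0.7, 0.2]
pop_b = [0.5, 0.6, 0.6, 0.1]

shared_drift = outgroup_f3(outgroup, pop_a, pop_b)
# F3 can also be written in terms of pairwise F2 values.
from_f2 = (f2(outgroup, pop_a) + f2(outgroup, pop_b) - f2(pop_a, pop_b)) / 2
```

The last two lines show why F[3] isolates the branch leading from the outgroup to the split: the drift specific to A and to B cancels when the pairwise F[2] values are combined.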
To study the genetic structure, we used the VanRaden genomic
relationship matrix (GRM) ([58]Van Raden, 2008). This matrix is
constructed from the SNP matrix Z, where rows correspond to individuals
and columns to markers, as
$G = \frac{ZZ'}{k}$, where the denominator $k$ is calculated using the
allele frequencies of genotyped individuals: $k = 2\sum_i p_i(1 - p_i)$.
The denominator attains its maximum when all allele frequencies are
equal to $\frac{1}{2}$.
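A minimal Python sketch of this construction is given below. It assumes genotypes coded as 0/1/2 copies of the counted allele and centres each marker by twice its allele frequency to obtain Z, as in VanRaden's first method; the text above does not spell out the centring, so treat it as an assumption. The function name and the toy genotypes are hypothetical.

```python
def vanraden_grm(genotypes):
    """Return G = Z Z' / k for rows of 0/1/2 genotypes (one row per animal)."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency p_i of the counted allele at each marker.
    p = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Assumed centring: subtract the expected genotype 2 * p_i to obtain Z.
    z = [[row[j] - 2.0 * p[j] for j in range(m)] for row in genotypes]
    # Denominator k = 2 * sum_i p_i * (1 - p_i).
    k = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    if k == 0:
        raise ValueError("all markers are monomorphic; k would be zero")
    return [
        [sum(z[a][j] * z[b][j] for j in range(m)) / k for b in range(n)]
        for a in range(n)
    ]


# Hypothetical genotypes: four animals (rows) by four markers (columns).
toy_genotypes = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [2, 1, 0, 1],
    [2, 1, 0, 2],
]
G = vanraden_grm(toy_genotypes)
```

In this toy example the first two animals and the last two form two groups, which appears in G as positive values within each group and negative values between them.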
We performed the SVD of the GRM using the svd function in R
([60]Golub & Reinsch, 1971) to assess the genetic structure of the
studied populations of Large White pigs in Russia. SVD is a valuable
tool for characterizing population genetic structure because it can
detect and extract small signals even if the data are noisy ([59]Berrar,
Dubitzky & Granzow, 2007). Plots of the first and second SVD components
and a heat map were generated to visualize the SVD results.
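To show what the first SVD component captures, the sketch below applies power iteration to a small symmetric GRM. This is a stand-in for a full SVD, not the R routine used in the study: for a symmetric positive semi-definite matrix such as a GRM, the leading singular vector coincides with the leading eigenvector. The function name and the 4 x 4 toy matrix are hypothetical.

```python
import math


def leading_component(matrix, iterations=200):
    """Approximate the largest singular value and vector of a symmetric
    positive semi-definite matrix by power iteration."""
    n = len(matrix)
    # Deterministic, non-constant start vector.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("start vector lies in the null space")
        v = [x / norm for x in w]
    # Rayleigh quotient v' M v gives the corresponding singular value.
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    value = sum(v[i] * mv[i] for i in range(n))
    return value, v


# Hypothetical GRM for four animals forming two groups of two.
toy_grm = [
    [1.5, 1.0, -1.0, -1.5],
    [1.0, 1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [-1.5, -1.0, 1.0, 1.5],
]
top_value, top_vector = leading_component(toy_grm)
```

The entries of `top_vector` have one sign for the first two animals and the opposite sign for the last two, which is the kind of separation a plot of the first component would display. For real data a library SVD routine would be the natural choice.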
To visualize the relationships between the studied populations of pigs,
we built a heatmap from the GRM using the graphics package in R (R: A
Language and Environment for Statistical Computing,
[61]http://www.r-project.org). The heatmap separated the pigs by breed.
Detection of selection signatures
We used two statistics that can be calculated for unphased genotypic
data: F[ST] and F[LK]. Fixation index F[ST] is a measure of population
differentiation due to genetic structure. It is frequently estimated
from genetic polymorphism data, such as single-nucleotide polymorphisms
(SNP) or microsatellites. F[ST] value of a locus is calculated as a
ratio of the variance of allele frequencies between the populations and
the sum of the variances within and between populations. Positive
selection is indicated by high F[ST] values relative to their
heterozygosity ([62]Weigand & Leese, 2018). Smoothing of F[ST] is used
to identify contiguous genomic regions under selection. The smoothed
F[ST] method is based on the pure drift model of [63]Nicholson et al.
(2002). According to this model, individual SNPs are grouped into
genomic windows, and their average smoothed F[ST] values are calculated.
Smoothed F[ST] is useful for analyzing distantly related populations and
reveals subtle differences between them ([64]Porto-Neto et al., 2013).
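The two steps described above can be sketched in Python as follows. The per-SNP function implements the variance-ratio definition of F[ST] for two populations of equal weight, and the window function takes a plain mean over non-overlapping windows. This is a deliberate simplification and not the smoothing procedure of the cited method. The function names, the window size, and the toy values are hypothetical.

```python
def fst_per_snp(p1, p2):
    """F_ST at one SNP: between-population variance of the allele frequency
    divided by the sum of the between- and within-population variances."""
    p_bar = (p1 + p2) / 2.0
    between = ((p1 - p_bar) ** 2 + (p2 - p_bar) ** 2) / 2.0
    within = (p1 * (1.0 - p1) + p2 * (1.0 - p2)) / 2.0
    total = between + within
    return between / total if total > 0 else 0.0


def window_means(positions, values, window_bp):
    """Average per-SNP values in non-overlapping windows of window_bp bases.

    Returns a dict mapping each window start position to its mean value.
    """
    sums = {}
    for pos, val in zip(positions, values):
        idx = pos // window_bp
        running_sum, count = sums.get(idx, (0.0, 0))
        sums[idx] = (running_sum + val, count + 1)
    return {idx * window_bp: s / c for idx, (s, c) in sorted(sums.items())}


# Hypothetical allele frequencies in two populations at three SNPs.
freq_old = [0.5, 0.2, 0.0]
freq_new = [0.5, 0.8, 1.0]
positions = [10, 20, 1010]  # base-pair coordinates on one chromosome

per_snp = [fst_per_snp(a, b) for a, b in zip(freq_old, freq_new)]
smoothed = window_means(positions, per_snp, window_bp=1000)
```

Windows whose mean stands out from the genome-wide distribution would then be the candidates for regions under selection.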
We compared the LW_Old and LW_New groups using the F[ST] analysis to find
genomic traces of recent selection resulting from different
socio-economic conditions. We compared pigs from different farms to
analyze how the selection centers’ preferences and breeding practices