Abstract

   Breeding practices adopted at different farms are aimed at maximizing
   the profitability of pig farming. In this work, we have analyzed the
   genetic diversity of Large White pigs in Russia. We compared genomes of
   historic and modern Large White Russian breeds using 271 pig samples.
   We have identified 120 candidate regions associated with the
   differentiation of modern and historic pigs and analyzed genomic
   differences between the modern farms. The identified genes were
   associated with height, fitness, conformation, reproductive
   performance, and meat quality.

   Keywords: Artificial selection, Breeding, Selection signals, Farming
   practices

Introduction

   Human consumption drives the artificial selection of farm animals.
   Understanding how selection creates genetic differences between
   populations of different farms is essential for effective livestock
   development.

   Over the last two centuries, a common strategy for maximizing the
   profitability of pig farming has been to rely on highly productive
   commercial breeds (such as Large White, Landrace, and Duroc) with high
   growth rates, good feed conversion, and increased lean meat yield
   ([36]Wang et al., 2018). As a
   result, these breeds became popular worldwide, including in the Russian
   Federation ([37]Traspov et al., 2016c; [38]Traspov et al., 2016a;
   [39]Traspov et al., 2016b; [40]Čandek Potokar et al., 2019; [41]Čandek
   Potokar & Nieto, 2019).

   Considering that Yorkshire pigs in Northern America are direct
   descendants of the European Large White lineage ([42]Amer et al.,
   2020), Large White is the most common commercial breed group. Countries
   developing their pig production usually import Large White breeding
   stock, since these pigs have a flexible genetic structure adapted to
   selection pressure ([43]Getmantseva et al., 2020). This flexibility
   and genetic variation make the breed an attractive subject for
   scientists seeking the genomic regions and genes responsible for the
   variation.

   The initial livestock of Large White pigs (approximately 100 animals)
   was brought to the Soviet Union from England in 1923. As a result of
   continuous breeding efforts, a new regional population of the Large
   White breed was created in the USSR during the second half of the 20th
   century ([44]Traspov et al., 2016a; [45]Getmantseva et al., 2020). The
   fall of the Soviet Union caused a period of hardship for Russian pig
   farming ([46]Smith, 2014). Breeding programs nearly stopped, farming
   practices deteriorated, and pigs were widely affected by disease and
   culled in large numbers. After the USSR’s
   collapse, the Soviet livestock was almost entirely replaced by imported
   pigs from the leading breeding centers of Denmark, France, England,
   Holland, Ireland, etc. Mitochondrial DNA analysis of pigs from various
   European breeding centers shows significant genetic differences
   ([47]Getmantseva et al., 2020). In this work, we compare the Large
   White pigs of Soviet breeding with the modern commercial pigs. We have
   also analyzed the DNA structure of contemporary Large White pigs within
   and between the breeding farms in Russia. We have identified selection
   signatures attributed to the socio-economic conditions and breeding
   centers’ practices.

Materials and Methods

Animals and sample collection

   Specialists at the participating holdings collected tissue samples
   according to standard monitoring procedures and guidelines, following
   the ethical protocols outlined in Directive 2010/63/EU (2010). The pig
   ear samples (ear plucks) were obtained as part of routine breeding
   monitoring or at slaughter. The collection of ear
   samples is a standard practice in pig breeding ([48]Kunhareang, Zhou &
   Hickford, 2010). Previously collected historic tissue samples of the
   Soviet-bred pigs were obtained from breeding farms in Russia between
   2006 and 2010.

   We assembled a pool of 271 pig samples: 99 historical samples of Large
   White pigs from the Soviet breeding program (LW_Old, collected from
   four breeding farms between 2006 and 2010); 106 samples of Large White
   pigs of modern breeding from four Russian farms (LW_New: LW_1 = 28;
   LW_2 = 31; LW_3 = 26; LW_4 = 21, all collected between 2018 and 2020);
   and Landrace (L = 23) and Duroc (D = 43) samples collected between
   2018 and 2020. Genomic DNA was extracted
   from ear samples using a DNA-Extran-2 reagent kit (OOO NPF Sintol,
   Russia) following the manufacturer’s protocol. The quantity, quality,
   and integrity of DNA were assessed using a Qubit 2.0 fluorometer
   (Invitrogen/Life Technologies, USA) and a NanoDrop 8000
   spectrophotometer (Thermo Fisher Scientific, USA).

Genotyping

   The samples were genotyped using the GeneSeek® GGP Porcine HD Genomic
   Profiler v1 (Illumina Inc, USA), which includes 68,516 SNPs evenly
   distributed with an average spacing of 25 kb. Genotype quality control
   and data filtering were performed using PLINK 1.9, as recommended by
   [49]Marees et al. (2018). The total genotyping rate was 0.999307; 41,262
   variants and 271 pigs passed the QC filters and were retained for
   further analysis.
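
   The logic of such a QC step can be illustrated with a minimal Python
   sketch. It is not PLINK's implementation: the function name qc_filter,
   the 0/1/2 genotype coding with NaN for missing calls, and both
   thresholds are assumptions made for illustration only, since the
   thresholds applied in the study are not restated here.

```python
import numpy as np

def qc_filter(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Return a boolean mask of SNP columns passing call-rate and MAF filters.

    genotypes: (individuals x SNPs) array of 0/1/2 allele counts, NaN = missing.
    Both thresholds are illustrative assumptions, not the study's settings.
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)
    n_called = called.sum(axis=0)
    call_rate = n_called / g.shape[0]
    # Allele frequency from non-missing calls only; SNPs with no calls get 0.
    allele_sum = np.where(called, g, 0.0).sum(axis=0)
    freq = np.divide(allele_sum, 2.0 * n_called,
                     out=np.zeros(g.shape[1]), where=n_called > 0)
    maf = np.minimum(freq, 1.0 - freq)
    return (call_rate >= min_call_rate) & (maf >= min_maf)

# Toy data: 4 individuals x 4 SNPs.
toy = np.array([
    [0, 1, 2, np.nan],
    [0, 1, 2, np.nan],
    [0, 2, 1, 1],
    [0, 0, 2, np.nan],
])
mask = qc_filter(toy)
```

   In this toy example the first SNP is monomorphic and the last one is
   mostly missing, so only the two middle SNPs would be retained.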

Data availability

   The dataset can be accessed at
   [50]http://www.compubioverne.group/data/PIG/.

Population structure analysis

   To study population structure, we performed a singular value
   decomposition (SVD) of the genomic relationship matrix (GRM) using the
   svd function in R ([51]Barker et al., 2001; [52]Van Raden, 2008). The R
   package AdmixTools was used to compute F[2] statistics for all pairs of
   populations and outgroup F[3] statistics, which estimate the relative
   divergence time for pairs of populations, using the Duroc pigs as an
   outgroup ([53]Lazaridis et al., 2014; [54]Patterson et al., 2012).
   AdmixTools was also used to plot the trees ([55]Patterson et al., 2012;
   [56]Liu et al., 2019). Using the find_graphs routine, we generated and
   evaluated admixture graphs to find the best-fitting arrangements.
   Although F[ST] and F[2] statistics also estimate genetic distance or
   divergence time, they may be influenced by population sizes. The
   statistic F[3](outgroup; A, B) estimates the genetic distance between
   the outgroup and the branching point between populations A and B
   ([57]Maier & Patterson, 2020).
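
   As a rough illustration of what these statistics measure, the Python
   sketch below computes naive F[2] and outgroup F[3] values from
   population allele frequencies. It is a simplified stand-in, not the
   AdmixTools implementation: it omits the finite-sample bias corrections
   and block-jackknife standard errors, and the function names and toy
   frequencies are hypothetical.

```python
import numpy as np

def f2(p_a, p_b):
    """Naive F2: mean squared allele-frequency difference between A and B."""
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    return float(np.mean((p_a - p_b) ** 2))

def outgroup_f3(p_o, p_a, p_b):
    """Naive F3(O; A, B): shared drift of A and B measured from outgroup O."""
    p_o = np.asarray(p_o, dtype=float)
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    return float(np.mean((p_o - p_a) * (p_o - p_b)))

# Hypothetical allele frequencies at three SNPs.
p_out = [0.1, 0.9, 0.5]   # outgroup (the role Duroc plays in the text)
p_a = [0.5, 0.5, 0.5]     # population A
p_b = [0.6, 0.4, 0.5]     # population B

f3_value = outgroup_f3(p_out, p_a, p_b)
# F3 can also be written in terms of F2 values:
# F3(O; A, B) = (F2(O, A) + F2(O, B) - F2(A, B)) / 2
f3_from_f2 = (f2(p_out, p_a) + f2(p_out, p_b) - f2(p_a, p_b)) / 2
```

   A larger outgroup F[3] value means that A and B share a longer branch
   after splitting from the outgroup, that is, they are more closely
   related to each other.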

   To study the genetic structure, we used the VanRaden genomic
   relationship matrix (GRM) ([58]Van Raden, 2008). This matrix is
   constructed from the SNP matrix Z, where rows correspond to individuals
   and columns to markers, as
   \[ G = \frac{Z Z'}{k}, \]
   where denominator k is calculated using the allele frequencies of
   genotyped individuals: \( k = 2 \sum_i p_i (1 - p_i) \). The denominator
   attains maximum when all allele frequencies are equal to
   \( \frac{1}{2} \).
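
   A minimal Python sketch of this construction is given below. It
   assumes, following VanRaden's first method, that Z is the 0/1/2 allele
   count matrix centered by twice the allele frequency of each marker;
   the centering step, the function name vanraden_grm, and the toy
   genotypes are assumptions for illustration.

```python
import numpy as np

def vanraden_grm(genotypes):
    """Genomic relationship matrix G = Z Z' / k.

    genotypes: (individuals x markers) array of 0/1/2 allele counts,
    with no missing values (assumed to be handled during QC).
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0          # allele frequency of each marker
    z = m - 2.0 * p                   # center each marker column
    k = 2.0 * np.sum(p * (1.0 - p))   # k = 2 * sum_i p_i (1 - p_i)
    return z @ z.T / k

# Toy data: 4 individuals x 3 markers, every allele frequency equal to 1/2,
# so k reaches its maximum for three markers (3 * 2 * 0.25 = 1.5).
toy = np.array([
    [0, 1, 2],
    [1, 1, 0],
    [2, 1, 1],
    [1, 1, 1],
])
grm = vanraden_grm(toy)
```

   The resulting matrix is symmetric, with one row and column per
   individual; because the markers are centered, each row of G sums to
   zero.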

   We used the SVD approach ([60]Golub & Reinsch, 1971) to assess the
   genetic structure of the studied populations of Large White pigs in
   Russia, performing the decomposition of the GRM with the svd function
   in R. SVD is a valuable tool for characterizing population genetic
   structure because it can detect and extract small signals even from
   noisy data ([59]Berrar, Dubitzky & Granzow, 2007). Plots of the first
   and second SVD components and a heat map based on the GRM were
   generated with the graphics package in R to visualize the
   relationships between the studied populations of pigs.
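
   The decomposition step can be sketched in Python with NumPy as an
   analogue of the R call. The function name svd_components, the scaling
   of the components by the square root of the singular values, and the
   toy matrix (two groups of two closely related individuals) are
   assumptions for illustration.

```python
import numpy as np

def svd_components(grm, n_components=2):
    """Return component coordinates and all singular values of a GRM."""
    u, s, _ = np.linalg.svd(np.asarray(grm, dtype=float))
    # Scale the left singular vectors so that axes reflect their weight.
    coords = u[:, :n_components] * np.sqrt(s[:n_components])
    return coords, s

# Toy relationship matrix: individuals 0-1 form one group, 2-3 another.
toy_grm = np.array([
    [ 1.0,  0.8, -0.9, -0.9],
    [ 0.8,  1.0, -0.9, -0.9],
    [-0.9, -0.9,  1.0,  0.8],
    [-0.9, -0.9,  0.8,  1.0],
])
coords, singular_values = svd_components(toy_grm)
first_component = coords[:, 0]
```

   In this toy matrix the first component takes one sign for individuals
   0 and 1 and the opposite sign for individuals 2 and 3, which is the
   kind of separation a plot of the first two components makes visible.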

   A heat map built from the GRM with the graphics package in R separated
   the pigs by breed (R: A Language and Environment for Statistical
   Computing, [61]http://www.r-project.org).

Detection of selection signatures

   We used two statistics that can be calculated for unphased genotypic
   data: F[ST] and F[LK]. The fixation index F[ST] is a measure of
   population differentiation due to genetic structure. It is frequently
   estimated from genetic polymorphism data, such as single-nucleotide
   polymorphisms (SNPs) or microsatellites. The F[ST] value of a locus is
   calculated as the ratio of the variance of allele frequencies between
   populations to the sum of the variances within and between
   populations. Positive selection is indicated by high F[ST] values
   relative to their heterozygosity ([62]Weigand & Leese, 2018). Smoothing
   of F[ST] is used to identify contiguous genomic regions under
   selection. The smoothed F[ST] method is based on the pure drift model
   of [63]Nicholson et al. (2002). According to this model, individual
   SNPs are grouped into genomic windows, and their average smoothed
   F[ST] values are calculated. Smoothed F[ST] is useful for analyzing
   distantly related populations and reveals subtle differences between
   them ([64]Porto-Neto et al., 2013).
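
   The per-locus ratio and the windowing idea can be sketched in Python as
   follows. This is a simplified illustration under stated assumptions:
   two populations of equal weight, F[ST] computed directly from
   population allele frequencies, and a plain moving average over
   consecutive SNPs standing in for the smoothing procedure of the cited
   method. The function names, the window size, and the toy frequencies
   are hypothetical.

```python
import numpy as np

def fst_per_snp(p1, p2):
    """Per-SNP FST for two equally weighted populations.

    FST = between-population variance of allele frequencies divided by
    the total variance p_bar * (1 - p_bar), which equals the sum of the
    within- and between-population variances.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2.0
    var_between = ((p1 - p_bar) ** 2 + (p2 - p_bar) ** 2) / 2.0
    total = p_bar * (1.0 - p_bar)
    # Monomorphic SNPs (total variance 0) are assigned FST = 0.
    return np.divide(var_between, total,
                     out=np.zeros_like(total), where=total > 0)

def moving_average(values, window=2):
    """Average consecutive SNP values in sliding windows of fixed size."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(values, dtype=float), kernel, mode="valid")

# Hypothetical allele frequencies at four consecutive SNPs.
p_old = [0.5, 0.0, 0.2, 0.0]
p_new = [0.5, 1.0, 0.8, 0.0]

fst = fst_per_snp(p_old, p_new)
fst_smoothed = moving_average(fst, window=2)
```

   Windows with consistently high averaged values point to contiguous
   regions of differentiation rather than to isolated outlier SNPs.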

   We compared the LW_Old and LW_New groups using the F[ST] analysis to find
   genomic traces of recent selection resulting from different
   socio-economic conditions. We compared pigs from different farms to
   analyze how the selection centers’ preferences and breeding practices