Abstract

Breeding practices adopted at different farms are aimed at maximizing the profitability of pig farming. In this work, we analyzed the genetic diversity of Large White pigs in Russia. We compared the genomes of historic and modern Russian Large White pigs using 271 pig samples. We identified 120 candidate regions associated with the differentiation of modern and historic pigs and analyzed genomic differences between modern farms. The identified genes were associated with height, fitness, conformation, reproductive performance, and meat quality.

Keywords: Artificial selection, Breeding, Selection signals, Farming practices

Introduction

Human consumption drives the artificial selection of farm animals. Understanding how selection creates genetic differences between the populations of different farms is essential for effective livestock development. In the last two centuries, a common strategy has been to maximize the profitability of pig farming by developing highly productive commercial breeds (such as Large White, Landrace, and Duroc) with high growth rates, good feed conversion, and increased lean meat yield (Wang et al., 2018). As a result, these breeds became popular worldwide, including in the Russian Federation (Traspov et al., 2016c; Traspov et al., 2016a; Traspov et al., 2016b; Čandek Potokar et al., 2019; Čandek Potokar & Nieto, 2019). Considering that Yorkshire pigs in North America are direct descendants of the European Large White lineage (Amer et al., 2020), Large White is the most common commercial breed group. Countries developing their pig production usually import Large White breeding stock because these pigs have a flexible genetic structure that adapts to selection pressure (Getmantseva et al., 2020). This flexibility and the genetic variation of the breed make it an attractive subject for researchers seeking the genomic regions and genes responsible for that variation.
The initial stock of Large White pigs (approximately 100 animals) was brought to the Soviet Union from England in 1923. As a result of continuous breeding efforts, a new regional population of the Large White breed was created in the USSR during the second half of the 20th century (Traspov et al., 2016a; Getmantseva et al., 2020). The fall of the Soviet Union caused a period of hardship for Russian pig farming (Smith, 2014). Breeding programs nearly stopped, farming practices deteriorated, and pigs were massively affected by diseases and culled in huge numbers. After the USSR's collapse, the Soviet livestock was almost entirely replaced by pigs imported from the leading breeding centers of Denmark, France, England, Holland, Ireland, and other countries. Mitochondrial DNA analysis of pigs from various European breeding centers shows significant genetic differences among them (Getmantseva et al., 2020). In this work, we compare Large White pigs of Soviet breeding with modern commercial pigs. We also analyzed the genomic structure of contemporary Large White pigs within and between breeding farms in Russia. We identified selection signatures attributed to socio-economic conditions and to the practices of the breeding centers.

Materials and Methods

Animals and sample collection

Specialists at the participating holdings collected tissue samples according to standard monitoring procedures and guidelines, following the ethical protocols outlined in Directive 2010/63/EU (2010). The pig ear samples (ear plucks) were obtained as part of a general breeding monitoring procedure or during slaughter. The collection of ear samples is a standard practice in pig breeding (Kunhareang, Zhou & Hickford, 2010). Previously collected historic tissue samples of the Soviet-bred pigs were obtained from breeding farms in Russia between 2006 and 2010.
We assembled a pool of 271 pig samples: 99 historical samples of Large White pigs from the Soviet breeding program (LW_Old, collected from four breeding farms between 2006 and 2010); 106 samples of Large White pigs of modern breeding from four Russian farms (LW_New: LW_1 = 28; LW_2 = 31; LW_3 = 26; LW_4 = 21, all collected between 2018 and 2020); and Landrace (L = 23) and Duroc (D = 43) samples collected between 2018 and 2020. Genomic DNA was extracted from ear samples using a DNA-Extran-2 reagent kit (OOO NPF Sintol, Russia) following the manufacturer's protocol. The quantity, quality, and integrity of the DNA were assessed using a Qubit 2.0 fluorometer (Invitrogen/Life Technologies, USA) and a NanoDrop 8000 spectrophotometer (Thermo Fisher Scientific, USA).

Genotyping

The samples were genotyped using the GeneSeek® GGP Porcine HD Genomic Profiler v1 (Illumina Inc, USA), which includes 68,516 SNPs evenly distributed with an average spacing of 25 kb. Genotype quality control and data filtering were performed using PLINK 1.9, as recommended by Marees et al. (2018). The total genotyping rate was 0.999307; 41,262 variants and 271 pigs passed the QC filters and were retained for further analysis.

Data availability

The dataset can be accessed at http://www.compubioverne.group/data/PIG/.

Population structure analysis

To study population structure, we performed a singular value decomposition (SVD) of the genomic relationship matrix (GRM) using the svd function in R (Barker et al., 2001; Van Raden, 2008). The R package AdmixTools was used to compute $F_2$ statistics for all pairs of populations and outgroup $F_3$ statistics, which estimate the relative divergence time for pairs of populations, using the Duroc pigs as an outgroup (Lazaridis et al., 2014; Patterson et al., 2012). AdmixTools was also used to plot the trees (Patterson et al., 2012; Liu et al., 2019).
Using the find_graphs routine, we generated and evaluated admixture graphs to find the best-fitting arrangements. Although the $F_{ST}$ and $F_2$ statistics also measure genetic distance or divergence time, they may be influenced by population sizes. The statistic $F_3$(outgroup; A, B) estimates the genetic distance between the outgroup and the branching point of populations A and B (Maier & Patterson, 2020).

To study the genetic structure, we used the VanRaden genomic relationship matrix (GRM) (Van Raden, 2008). This matrix is constructed from the SNP matrix $Z$, where rows correspond to individuals and columns to markers, as

$$G = \frac{ZZ^{T}}{k},$$

where the denominator $k$ is calculated using the allele frequencies $p_i$ of the genotyped individuals:

$$k = 2\sum_{i} p_i (1 - p_i).$$

The denominator attains its maximum when all allele frequencies are equal to $\frac{1}{2}$. We performed the SVD of the GRM using the svd function in R (Golub & Reinsch, 1971) to assess the genetic structure of the studied populations of Large White pigs in Russia. SVD is a valuable tool for characterizing population genetic structure because it can detect and extract small signals even when the data are noisy (Berrar, Dubitzky & Granzow, 2007). Plots of the first and second SVD components were generated to visualize the SVD results. To visualize the relationships between the studied populations of pigs, we also used the graphics package in R (R: A Language and Environment for Statistical Computing, http://www.r-project.org) to build a heatmap from the GRM, which separated the pigs by breed.

Detection of selection signatures

We used two statistics that can be calculated for unphased genotypic data: $F_{ST}$ and $F_{LK}$.
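As a rough illustration of how a per-locus $F_{ST}$ can be obtained from unphased genotypes, the sketch below computes the ratio of the between-population variance of allele frequencies to the sum of the within- and between-population variances for a single biallelic SNP. It is not the pipeline used in this study: the function names are hypothetical, genotypes are assumed to be coded as 0, 1, or 2 copies of the counted allele (None for a missing call), populations are weighted equally, and no sample-size correction is applied.

```python
from statistics import fmean


def allele_frequency(genotypes):
    """Frequency of the counted allele from unphased genotypes coded 0, 1, 2.

    Assumption: None marks a missing call and is skipped.
    """
    called = [g for g in genotypes if g is not None]
    return sum(called) / (2 * len(called))


def fst_per_locus(pop_genotypes):
    """Wright-style F_ST for one biallelic SNP across several populations.

    Assumption: equal population weights, no sample-size correction.
    """
    freqs = [allele_frequency(g) for g in pop_genotypes]
    p_bar = fmean(freqs)
    # Variance of allele frequencies between populations.
    between = fmean([(p - p_bar) ** 2 for p in freqs])
    # Average binomial variance within populations.
    within = fmean([p * (1 - p) for p in freqs])
    total = between + within  # algebraically equal to p_bar * (1 - p_bar)
    # A locus that is monomorphic in every population carries no signal.
    return 0.0 if total == 0 else between / total


if __name__ == "__main__":
    # Toy genotypes for one SNP in two hypothetical populations.
    old_pop = [0, 0, 1, 1]   # allele frequency 0.25
    new_pop = [2, 2, 1, 1]   # allele frequency 0.75
    print(fst_per_locus([old_pop, new_pop]))
```

With equal weights, the sum of the two variance components equals $\bar{p}(1 - \bar{p})$, so the value ranges from 0 (identical allele frequencies) to 1 (alternative alleles fixed in different populations).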
The fixation index $F_{ST}$ is a measure of population differentiation due to genetic structure. It is frequently estimated from genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites. The $F_{ST}$ value of a locus is calculated as the ratio of the variance of allele frequencies between populations to the sum of the variances within and between populations. Positive selection is indicated by $F_{ST}$ values that are high relative to the heterozygosity of the locus (Weigand & Leese, 2018). Smoothing of $F_{ST}$ is used to identify contiguous genomic regions under selection. The smoothed $F_{ST}$ method is based on the pure drift model of Nicholson et al. (2002). According to this model, individual SNPs are grouped into genomic windows, and their average smoothed $F_{ST}$ values are calculated. Smoothed $F_{ST}$ is useful for analyzing distantly related populations and reveals subtle differences between them (Porto-Neto et al., 2013). We compared the LW_Old and LW_New groups using the $F_{ST}$ analysis to find genomic traces of recent selection resulting from different socio-economic conditions. We compared pigs from different farms to analyze how the selection centers' preferences and breeding practices