Abstract Simple Summary Cashmere goats are valuable genetic resources which are famous for their high-quality fiber. Runs of homozygosity (ROHs) refer to uninterrupted sequences of homozygous genotypes present within the DNA sequence of an individual. To enhance the preservation, improvement and enduring utilization of this valuable genetic resource, we performed a comprehensive study on the genetic variance, degree of inbreeding, patterns of ROHs and selected genes situated in ROH islands which are associated with economic attributes in goats via whole-genome sequencing analysis. Our research showed that the Inner Mongolia cashmere goat possesses a notably high level of genetic diversity and presented the lowest inbreeding coefficient. On the contrary, the highest [MATH: Fro h :MATH] and lowest levels of diversity were observed in wild goats compared to domesticated goats. The analysis of selection signal through ROHs identified some genes related to meat, fiber and milk production; fertility, disease and cold resistance and adaptability; and body size and growth. Abstract Cashmere goats are valuable genetic resources which are famous worldwide for their high-quality fiber. Runs of homozygosity (ROHs) have been identified as an efficient tool to assess inbreeding level and identify related genes under selection. However, there is limited research on ROHs in cashmere goats. Therefore, we investigated the ROH pattern, assessed genomic inbreeding levels and examined the candidate genes associated with the cashmere trait using whole-genome resequencing data from 123 goats. Herein, the Inner Mongolia cashmere goat presented the lowest inbreeding coefficient of 0.0263. In total, we identified 57,224 ROHs. Seventy-four ROH islands containing 50 genes were detected. Certain identified genes were related to meat, fiber and milk production (FGF1, PTPRM, RERE, GRID2, RARA); fertility (BIRC6, ECE2, CDH23, PAK1); disease or cold resistance and adaptability (PDCD1LG2, SVIL, PRDM16, RFX4, SH3BP2); and body size and growth (TMEM63C, SYN3, SDC1, STRBP, SMG6). 135 consensus ROHs were identified, and we found candidate genes (FGF5, DVL3, NRAS, KIT) were associated with fiber length or color. These findings enhance our comprehension of inbreeding levels in cashmere goats and the genetic foundations of traits influenced by selective breeding. This research contributes significantly to the future breeding, reservation and use of cashmere goats and other goat breeds. Keywords: cashmere goat, runs of homozygosity, inbreeding coefficient, genetic diversity, selection signatures 1. Introduction Runs of homozygosity (ROHs) refer to uninterrupted sequences of homozygous genotypes present within the DNA sequence of an individual. These segments are identical because they have been inherited from both parents and their ancestors sharing a common genetic background [[46]1]. In general, a shorter ROH indicates common ancestors in the descent, while a longer ROH suggests that recent inbreeding has occurred, reflecting more immediate common ancestry and genetic homogeneity due to mating between close relatives [[47]2]. ROHs are useful for understanding the genetic connections among individuals and aiding in the reduction in inbreeding rates and the discovery of harmful genetic variations [[48]1]. Analyzing the ROH pattern in the genome, it is possible to explore the inbreeding level, genetic diversity and genetic defects in livestock and poultry. Utilizing ROHs to calculate the inbreeding coefficient ( [MATH: FRO H) :MATH] can precisely assess autozygosity and is more effective in identifying both historical and recent inbreeding than pedigree-based estimates ( [MATH: FPe d :MATH] ) [[49]3]. Increasing evidence indicates [MATH: FRO H :MATH] can be used as an indicator for a population’s inbreeding levels, especially in situations where pedigree data are not available [[50]4,[51]5,[52]6,[53]7]. In recent years, due to the extensive utilization of SNP microarray gene chip and whole-genome resequencing technologies [[54]8,[55]9,[56]10,[57]11], research into ROHs based on livestock and poultry genomic information has been increasing. This includes research on various species, such as goats [[58]12,[59]13,[60]14], sheep [[61]15,[62]16,[63]17], cattle [[64]7,[65]18,[66]19], pigs [[67]3,[68]20,[69]21], chickens [[70]22,[71]23,[72]24] and so on. Genomic regions that were abundant with ROHs were defined as ROH islands, potentially signaling selection sweeps. For instance, in cattle, pigs and sheep, some ROH islands have been discovered to correlate with genes related to reproduction, immune function and local adaptability [[73]13,[74]16,[75]18,[76]20]. ROHs also can be utilized as a useful tool to evaluate genetic diversity and inbreeding, providing insights into both recent and ancient inbreeding through the analysis of short and long ROH segments, respectively [[77]25]. Inner Mongolia cashmere goats and Hanshan White cashmere goats are outstanding Chinese cashmere goat breeds that adapt well to temperate continental semi-arid and cold climates. These breeds are known for their thin fiber diameter and abundant cashmere production, which play important roles in the global high-quality production of cashmere for the wool market. Current research primarily focuses on differential gene expression, key genes related to cashmere traits and molecular regulatory mechanisms of hair follicle development during the cashmere growth cycle [[78]26,[79]27,[80]28]. Few studies suggest that inbreeding has a significant impact on the cashmere traits of Inner Mongolia cashmere goats [[81]29,[82]30]. However, information about ROH patterns and inbreeding levels in Inner Mongolia cashmere goats and Hanshan White cashmere goats is scarce. In our study, we aimed to (1) characterize genome-wide ROH patterns and inbreeding levels; and (2) identify ROH islands and putative candidate genes associated with the cashmere trait in cashmere goats based on resequencing data from Inner Mongolia cashmere goats, Hanshan White cashmere goats and other non-cashmere goats. This study is of significant importance for the future breeding, conservation and utilization of goat breeds such as the Inner Mongolia cashmere goat and Hanshan White cashmere goat. 2. Materials and Methods 2.1. Experimental Population, DNA Extraction and Whole-Genome Resequencing In this research, a total of 123 goats (including 113 domestic goat individuals and 10 wild goat individuals) were analyzed, wherein the ear tissue of 55 Inner Mongolia Erlangshan cashmere goat (IMC) and Hanshan White cashmere goat (HSC) individuals was collected and conserved in 75% ethyl alcohol for DNA extraction. Additionally, we downloaded 68 individual goats’ genome resequencing data from the database of NCBI [83]https://www.ncbi.nlm.nih.gov (accessed on 23 March 2024). The sample populations included Inner Mongolia cashmere goats (IMC, n = 51), Hanshan White cashmere goats (HSC, n = 15), Alpine goats (ALG, n = 10), Boer goats (BOE, n = 12), Jining Gray goats (JNG, n = 11), Saanen Dairy goats (SAA, n = 14) and a wild goat population (Siberian Ibex) (IBE, n = 10). Details on sample names, data sources, abbreviations, geographical distributions, and numbers are provided in [84]Table 1 and [85]Figure S1. Table 1. Information about the goat populations in this study. Code Breed/Population Sample Size Location Use IMC Erlangshan cashmere goat 40 Bayannur City, Inner Mongolia, China Cashmere and meat IMC Albas cashmere goat 4 Ordos City, Inner Mongolia, China Cashmere and meat IMC Alxa cashmere goat 7 Alxa League, Inner Mongolia, China Cashmere and meat HSC Hanshan White cashmere goat 15 Chifeng City, Inner Mongolia, China Cashmere and meat JNG Jining Gray goat 11 Jining City and Heze City, Shandong Province, China Lambskin and meat ALG Alpine goat 10 France Milk SAA Saanen Dairy goat 8 South Korea Milk SAA Saanen Dairy goat 6 The United Republic of Tanzania Milk BOE Boer goat 3 South Korea Meat BOE Boer goat 3 New Zealand Meat BOE Boer goat 6 Australia Meat IBE Siberian Ibex 10 Switzerland [86]Open in a new tab The Inner Mongolia cashmere goat is divided into three types based on central production: the Albas type, the Erlangshan type and the Alxa type. Since the 1960s, selective breeding and cooperative breeding have been carried out for this breed. The Hanshan White cashmere goat is a breed developed from local goats and Liaoning cashmere goats. The center of the production of Hanshan White cashmere goats is located in Balin Right Banner, Chifeng City, Inner Mongolia. Genomic DNA was extracted by employing a Wizard^® Genomic DNA Purification Kit (Promega, Madison, WI, USA), adhering to the guidelines provided by the manufacturer. DNA concentration was determined by using a Nanodrop1000 spectrophotometer (Thermo Fischer Scientific, Wilmington, DE, USA). DNA exhibiting a purity ratio (A260/280) of 1.8 to 2.0 and ≥20 ng/μL was utilized. Following the guidelines provided by the manufacturer (Illumina Lnc., San Diego, CA, USA), sequencing libraries were constructed and sequenced using the DNBSEQ-T7 platform (Shenzhen MGI Co., Ltd., Shenzhen, China) with PE150. 2.2. The Processing of Whole-Genome Sequencing Data Raw data were trimmed for quality using fastp (version 0.20.1) ([87]https://github.com/OpenGene/fastp, accessed on 23 March 2024), and the quality of the trimmed reads was assessed with FastQC ([88]http://www.bioinformatics.babraham.ac.uk/projects/fastqc, accessed on 23 March 2024). After filtering, the remaining reads were aligned to the reference goat genome (ARS1 1.2, GCA_001704415.1) using the Burrows–Wheeler Aligner (BWA, version 0.7.17) [[89]31]. Next, bam files were sorted and indexed utilizing samtools (Samtools-1.7) [[90]32]. Picard software ([91]https://broadinstitute.github.io/picard/, accessed on 23 March 2024) was utilized to remove PCR repeats. After, Genome Analysis Toolkit software (GATK, version 4.1.8.0) ([92]https://software.broadinstitute.org/gatk, accessed on 23 March 2024) was utilized. Haplotype Caller was employed to generate GVCF files, and multiple gVCF files were combined into one VCF file using GATK GenomicsDBImport and GenotypeGVCFs. Then, we used GATK Variantfiltration to remove low-quality variant sites. Next, PLINK v1.9 ([93]https://www.cog-genomics.org/plink, accessed on 23 March 2024) [[94]33] was used, excluding SNPs with a missingness rate greater than 0.1 (–geno 0.1) with a minor allele frequency (MAF) less than 5% (–maf 0.05) that departed from the Hardy–Weinberg equilibrium at p  <  10^−3 (–hwe 0.001). Finally, only autosomal SNPs were retained for additional analysis. Individuals presenting over 10% missing genotypes (–mind 0.1) were excluded. For the analysis of ROH, the files were quality-controlled without setting the minimum allele frequency (–maf) parameter [[95]34]. The remaining SNPs were annotated according to their positions using SnpEff v4.35 [[96]35]. 2.3. The Analysis of Population Structure and Linkage Disequilibrium Principal component analysis (PCA) was conducted using GCTA software (version 1.25.3) [[97]36] (–make-grm, –grm) to assess the genetic structure. We used PLINK v1.9 to calculate the genetic distance matrix (–distance matrix) and applied the a‘pe’ package in R to construct a neighbor-joining (NJ) tree, which was visualized on the iTOL website. Model-based clustering to refine the population structure was fulfilled using the Admixture (v1.3.0, [98]http://dalexander.github.io/admixture/download.html, accessed on 23 March 2024). For K = 1 to 7, the number of clusters (K values) for all samples was determined. The best numbers of K clusters were determined by the minimum cross-validation (CV) error rate. In this study, linkage disequilibrium (LD) was measured with PLINK, and the correlation coefficients (r²) of alleles were calculated by PopLDdecay software ([99]https://github.com/BGI-shenzhen/PopLDdecay, accessed on 23 March 2024) [[100]37]. 2.4. Identification of ROH We used PLINK software (version 1.90) with the “–homozyg” command to identify ROHs. The main parameter settings were as follows: each ROH segment needed to contain at least 30 SNPs and have a minimum length of 200 kb, a density standard of one SNP per 30 kb and a maximal interval of 500 kb between two consecutive SNPs within an ROH; the sliding window size was 50 SNPs, permitting up to one heterozygous site and five missing sites within each window; and the sliding window threshold was set to 0.05 (–homozyg-window-snp 50, –homozyg-snp 30, –homozyg-kb 200, –homozyg-density 30, –homozyg-gap 500, –homozyg-window-missing 5, –homozyg-window-threshold 0.05, –homozyg-window-het 1). 2.5. ROH Classification and Assessment of Inbreeding Coefficients Based on ROH segment length, we classified ROHs into three groups: small (<300 kb), medium (300 kb–1.5 Mb) and large (>1.5 Mb). The count of ROH in each length category and the ROH number of each chromosome was determined across seven goat populations. The total number and length of ROHs for each animal was also assessed. [MATH: FRO H :MATH] for each individual were determined using the equation proposed by McQuillan et al. [[101]38]: [MATH: =Σ :MATH] [MATH: Lroh/ Lauto :MATH] . Here, [MATH: Lro h :MATH] represents the cumulative length of all ROHs in an individual genome, and [MATH: Lau to :MATH] is the total length of the autosomes covered by SNPs, which was 2466.19 Mb in our study. 2.6. Identification of Genes within ROH Islands of Domestic Goat Populations In our study, the incidence of ROH was calculated as the ratio of animals within domestic goat populations that possess an SNP located in an ROH segment. This analysis was visualized through Manhattan plots, generated utilizing the qqman package (version 0.1.9). We established the top 0.1% of SNPs within ROHs as the threshold for identifying the genomic areas most frequently related to ROHs across each population. A contiguous string of neighboring SNPs exceeding this threshold constituted what we have termed an ROH island, delineating genomic regions significantly impacted by ROHs. The specific selection method for the top 0.1% of the SNPs is based on the research content of Gorssen [[102]39]. The genes of the ROH islands were annotated using SnpEff v4.35. David ([103]https://david.ncifcrf.gov, accessed on 23 March 2024) was utilized to achieve Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. Finally, through an extensive and exact literature search, the biological function of each annotated gene on the island of ROH was determined. 2.7. Identification of Consensus Selection Signature Regions in Cashmere Goat Populations The parameter –homozyg-group was utilized to identify overlapping ROHs (pools) in cashmere goat populations [[104]40,[105]41,[106]42,[107]43]. The output file detailed each ROH of each animal, including their consensus sequences (CONs); their respective union (UNION); and the genome position, size, and number of SNPs. Consensus sequences (CONs) represented common ROH segments shared by at least two animals. In our study, ROH consensus regions with selection signatures in cashmere goats were considered to be homozygous regions (CONs) that overlapped greater than 0.1 Mb and were present in greater than 30% of individuals analyzed in IMC and HSC breeds. Finally, functional annotation and gene clustering were performed using David and KOBAS ([108]http://www.genome.jp/kegg, accessed on 23 March 2024). GO and KEGG analysis were performed for genes of consensus sequences. The count of significant genes corresponding to each term was identified using a threshold of p value ≤ 0.05. 3. Results 3.1. Population Structure Analyses and Linkage Disequilibrium In our study, whole-genome resequencing was performed on 55 samples of cash-mere goats from Inner Mongolia and downloaded genome resequencing data for 68 individual goats from the database of NCBI. A total of 123 individual goats from seven populations were included. All 1.66 Tb of raw data were generated, with an averaging depth of 10× for each individual. The principal component analysis (PCA) displayed that the wild goats were separated from the domestic goats ([109]Figure 1A and [110]Figure S2). The zoomed insets in [111]Figure 1A show the PCA for six domestic goat breeds and highlight that these populations were separated based on their utility traits. The plot shows four distinct clusters that are grouped according to their economic traits such as the meat, milk, skin and fiber production of goat populations. The phylogenetic tree (neighbor-joining tree, NJ tree) was similar to the PCA results, and the NJ tree showed all the domestic goat individuals were clearly separated into four categories as wild goats were used as an outgroup ([112]Figure 1B). The two cashmere goat breeds, IMC and HSC, clustered together on the same branch. Figure 1. [113]Figure 1 [114]Open in a new tab Population genetics structures analyses and linkage disequilibrium. (A) Principle component analysis (PCA); each point represents a single individual. The colors in B and D represent the same groups as indication in the legend of A. (B) Phylogenetic tree constructed by the neighbor-joining (NJ) method. (C) Population structure plot of seven goat populations at K = 2–5, (D) LD decay map measured by r² over distance between SNPs in seven populations. The ADMIXTURE analysis indicated K = 5 (cross-validation error = 0.4728) represents the optimal number of discrete genetic populations within the 123 samples ([115]Figure 1C and [116]Figure S3). JNG had the most mixed ancestry of all populations, while HSC had the least. At K  =  5, a majority of IMC individuals separated from HSC individuals. To further understand the linkage disequilibrium (LD) in domestic and wild goat populations, the LD coefficient (r²) was calculated for all individuals within each population. ([117]Figure 1D). Among the seven populations, the LD plot shows that the IMC population displayed the lowest level of LD, followed by SAA. IBE, while wild goats had the highest level of LD. On the whole, the average r² of IMC declined more rapidly than that observed in the other breeds. 3.2. Distribution of Runs of Homozygosity In total, we identified 57,224 autosomal regions with ROHs. The longest segment (8.5 Mb), containing 305,648 SNPs, was located on chromosome 9, while the shortest segment (0.2 Mb) with 7394 SNPs was identified on chromosome 6 ([118]Supplementary Table S1). The largest number of ROHs was detected on chromosome 1; the smallest count was found on 27, and there was an overall decreasing trend in the number of ROHs as the chromosome number increased. In addition, small ROHs were the most common on all chromosomes, followed by medium-sized ROHs, while large ROHs were relatively rare ([119]Figure 2A). IMC had fewer ROHs and a shorter cumulative length of ROHs. Individuals in the IBE population tended to present a higher number of ROHs and greater total lengths of ROHs compared to other populations ([120]Figure 2B). The distribution of the total ROH length across different populations is shown in [121]Figure 2C. JNG and IMC showed a distribution with lower overall lengths of ROHs compared to the IBE population. But HSC displayed a longer total length than IMC. The ALG population had a relatively high median total length of ROHs. This suggested that this population contains a larger number of individuals with longer segments of ROHs. The BOE population with a wide distribution exhibited significant variation in the ROH lengths among individuals within this population. It was evident that the small and medium ROH length class had a relatively high mean number of ROHs across all populations ([122]Figure 2D). The large ROH length class showed a markedly reduced mean number of ROHs. IMC showed the lowest mean numbers of ROHs across all length classes. In contrast, the IBE population had the largest mean number of ROHs in all classes, followed by HSC, indicating a possible high level of genomic homozygosity. [123]Table 2 and [124]Table S2 summarize the fundamental statistics of the three ROH categories. The analysis of the different ROH length classes for each population showed that ROH segments ranging from 0.3 to 1.5 Mb were the most common, while the long ROH segments (>1.5 Mb) accounted for just 1.55% of all three classes. This indicated that medium–long segments covered the highest proportion of the genome (66.59%). Figure 2. [125]Figure 2 [126]Open in a new tab The distribution of ROHs identified in different populations across autosomes. (A) This graph is a bar chart showing the number of ROHs on different chromosomes. ROHs are divided into three length classes: small (0–0.3 Mb), medium (0.3–1.5 Mb) and large (>1.5 Mb). (B) Relationship between the number of runs of homozygosity (ROH) per individual and the total length of the genome covered by them. The x-axis shows the sum total length of ROHs (Mb), and the y-axis shows the total number of ROHs. Every circle represents a different individual within a population, with the groups labeled as ALG, HSC, IMC and so on. (C) Violin plots and boxplots of sum total length of ROHs (SROHs). The form of each violin reflects the distribution density of cohort, with wider sections of the violin plot indicating higher frequency of observation at that y -value. Inside each violin, there is a boxplot that shows the median, the interquartile range (the length of the box) and the outliers (the points outside the whiskers). (D) The descriptive statistics for the ROHs, categorized by ROH length class across different populations, are presented as mean counts of ROHs (Y-axis) by class of ROH length. Table 2. Summary statistics for the numbers and lengths (in Mb) of ROHs based on different ROH length classes (0–0.3 Mb, 0.3–1.5 Mb and >1.5 Mb). ROH Length (Mb) ROH Number Number Percentage (%) Total_Length (Mb) Mean ± SD (Mb) Length Percentage (%) 0–0.3 25,933 45.32 6280.59 0.24 ± 0.028 25.64 0.3–1.5 30,406 53.14 16,313.46 0.54 ± 0.24 66.59 >1.5 885 1.55 1905.19 2.15 ± 0.81 7.78 [127]Open in a new tab 3.3. Inbreeding Coefficients To evaluate the inbreeding extent for each goat population, the population inbreeding coefficients and the individual inbreeding coefficients were calculated ([128]Table 3 and [129]Table S3). [MATH: Fro h :MATH] in the seven populations ranged from 0.0263 to 0.4780. Compared to other domestic goat populations, the IMC population exhibited the lowest inbreeding coefficient (0.0263), which is similar to the previous study [[130]29]. Contrastingly, the IBE population, a wild goat population, had the highest inbreeding coefficient (0.4780) and showed an exceptionally high total ROH length (11,788.44 Mb), indicating a higher level of inbreeding within this population. Thid was consistent with the findings of previous studies, where the modern wild bezoar from Iran displayed the highest extreme [MATH: Fro h :MATH] [[131]44,[132]45]. HSC had a relatively higher inbreeding coefficient compared to other domestic breeds. Other populations, such as JNG, SAA, ALG and BOE, exhibited varying degrees of inbreeding coefficients, ranging from 0.0446 to 0.0708. Table 3. Population name, sample size (n, the number of samples with a count of ROH greater than zero), the length of ROHs and inbreeding coefficient based on ROHs ( [MATH: Fro h :MATH] ). Population n Total ROH Length (Mb) Inbreeding Coefficient IMC 47 3045.20 0.0263 HSC 15 2619.12 0.0708 JNG 11 1209.02 0.0446 SAA 14 2326.06 0.0674 ALG 10 1672.71 0.0678 BOE 12 1838.68 0.0621 IBE 10 11,788.44 0.4780 [133]Open in a new tab 3.4. ROH Islands of Domestic Goat Populations ROH islands refer to regions of runs of homozygosity that are shared among multiple individuals within a population. These regions may be preserved or enhanced in the population due to factors such as genetic drift, natural selection or artificial selection. Seventy-four ROH islands were identified with 10,839,351 SNPs across the 29 autosomes based on the top 0.1% of SNPs for ROH incidence. Among them, 50 genes were detected as putative candidate genes ([134]Table 4, [135]Tables S4 and S5). The Manhattan plot showed the incidence of SNPs in ROHs per chromosome in HSC and BOE ([136]Figure 3) and other populations ([137]Figure S4). The HSC goat population displayed higher ROH incidence levels (>30%), with numerous ROH islands exceeding 80% incidence. In contrast, the BOE goat population showed generally low ROH incidence, with seven notable ROH islands on chromosomes 8, 11, 13, 16 and 19, respectively. Table 4. Candidate genes located in genomic regions with a high frequency of ROHs. Population Chromosome Position (Mb) Gene Name Gene Function ALG 6 32.73~59.46 GRID2 Udder development, growth, fertility, temperament traits 7 56.81~57.00 FGF1 Adipocyte differentiation 10 14.63~30.13 TMEM63C Body size and development MNAT1 Cell cycle and DNA repair BOE 8 38.47~39.00 PDCD1LG2 Disease resistance 11 15.04~21.07 BIRC6 Follicular development, fertility ARHGEF33 Retinal development 11 36.48~37.00 ACYP2 Growth NEBL Puberty 13 34.05~35.34 SVIL Disease resistance 16 47.76~48.17 PRDM16 Formation of brown fat cells, cold resistance 16 49.30~50.00 CFAP74 Meat production, staple length 19 22.96~23.00 SMG6 Growth 22 37.21~38.13 PRICKLE2 Postpartum dysgalactia syndrome HSC 5 18.57~68.73 RFX4 Adaptability, body size and development, fertility, udder development and milk production 6 32.54~95.45 GRID2 Udder development, growth, fertility, temperament traits 11 14.88~15.00 BIRC6 Follicular development, fertility 15 31.98~32.19 STIM1 Pulmonary circulation, body size and development, meat production, neural function or behavior 16 41.63~43.22 RERE Staple length 24 41.14~42.58 PTPRM Interaction between keratinocytes, meat production, immune PIEZO2 Meat production IMC 1 82.57~108.78 ECE2 Fertility, embryo development 6 95.44~116.63 FGF5 Hair growth and length SH3BP2 Immune 11 14.98~15.04 BIRC6 Follicular development, fertility 15 31.97~32.19 STIM1 Pulmonary circulation, body size and development, meat production, neural function or behavior 19 SMG6 Growth JNG 1 109.15~132.19 STAG1 Embryonic development SAA 5 59.27~69.98 SYN3 Body size and development, meat production 5 70.00~98.96 SYN3 Body size and development, meat production 7 27.59~57.47 XRCC4 Embryonic development 7 58.31~59.89 MATR3 Corpus luteum 8 74.99~75.00 NOL6 Skin color 11 78.35~93.90 SDC1 Body size and development, meat production, cashmere fineness STRBP Body size 19 24.51~40.23 RARA Milk production, udder development, hair follicle morphogenesis 28 CDH23 Fertility 29 17.94~18.00 PAK1 Fertility, puberty [138]Open in a new tab Figure 3. [139]Figure 3 [140]Open in a new tab Manhattan plot of SNPs in ROHs for HSC (A) and BOE (B). The x-axis exhibits positions along each chromosome. KEGG and GO enrichment analysis was conducted for 50 genes identified as potential candidate genes ([141]Supplementary Table S6). The GO analysis revealed seven GO entries were significantly enriched, such as the positive regulation of cell proliferation (GO:0008284), the negative regulation of granulocyte differentiation (GO:0030853) and nuclei (GO:0005634). Additionally, the enrichment analysis of KEGG just identified two pathways, including the Ras signaling pathway and cell-adhesion molecules. Notably, based on previous studies, several genes may be associated with the economic traits of goats. For instance, they were related to the meat trait (FGF1, CFAP74) fiber (RERE, PTPRM, FGF5) and milk (GRID2, RARA) production, fertility (BIRC6, ECE2, CDH23, PAK1), disease or cold resistance and adaptability (PDCD1LG2, SVIL, PRDM16, RFX4, SH3BP2, PTPRM) and body size and growth (TMEM63C, SYN3, SDC1, STRBP, SMG6, ACYP2, RFX4) ([142]Table 4). 3.5. The Consensus ROH of IMC and HSC Breeds To identify genes related to cashmere traits, ROH consensus sequences for the IMC and HSC breeds were explored. A total of 135 consensus ROH pools were identified ([143]Supplementary Table S7). The gene annotation of consensus ROHs revealed 191 protein-coding genes ([144]Supplementary Table S8). Notably, a specific consensus ROH (115.36 kb: 95,438,689–95,554,048 bp) contained the FGF5 gene on chromosome 6, which was shared by 19 individuals from the IMC breed. For the HSC population, a consensus ROH (106.374 Kb: 14,880,801–14,987,174 bp) was identified on chromosome 11, which was common for 13 individuals. Enrichment analysis revealed that genes within consensus ROH regions significantly contributed to 39 GO terms and 60 KEGG pathways ([145]Figure 4, [146]Supplementary Table S9). Notably, the GO terms included cell division (GO:0051301), nucleoplasm (GO:0005654), nucleus (GO:0005634), centrosome (GO:0005813), actin cytoskeleton (GO:0015629) and protein binding (GO:0005515), which were involved in cell division and cell metabolism. We deliberated whether these GO terms may be related to the morphogenesis of hair follicles. KEGG pathway analysis revealed that 60 pathways were significantly enriched, such as MAPK signaling, Ras signaling, PI3K-Akt signaling, Rap1 signaling, JAK-STAT signaling, Notch signaling, metabolic pathways, melanogenesis pathways, focal adhesion and signaling pathways regulating the pluripotency of stem cells, crucial for the regulation of hair follicle development. In the above GO terms and pathways, it had been confirmed that the included genes, FGF5, KIT [[147]46], DVL3 and NRAS, were crucial for cashmere fibers of cashmere goats. Figure 4. [148]Figure 4 [149]Open in a new tab (A) A Gene Ontology (GO) functional enrichment analysis. (B) KEGG enrichment pathways of gene annotation in the consensus ROHs of IMC and HSC breeds. 4. Discussion 4.1. Population Structure and Linkage Disequilibrium In our study, we conducted a population structure analysis and LD on a total of 123 individual goats from seven populations. The population structure analysis indicated a consistent clustering pattern based on PCA, NJ phylogenetic trees and admixture analyses. It has been noted that in the experimental samples, apart from the wild goat population, IBE, the other six domestic goat populations were clustered into four groups, which is generally in agreement with their economic traits [[150]47]. LD, a significant genetic concept, refers to the association of alleles at distinct loci that are not random. Insights into the history of a population and evolution can be observed from diminishments in LD between genetic markers. The LD decay pattern among genetic markers offers insightful observations on the history of a population and evolution. A faster rate of LD decay indicates higher genetic diversity, correlating with phenomena like genetic drift, migration and selection within a population. Overall, IMC exhibited higher genetic diversity, indicating the conservation measures employed on farms for IMC were successful. Likewise, the IBE population exhibited extensive LD, suggesting the possibility of inbreeding within this population. It is significant to enhance the preservation of these populations’ genetic resources to avoid diminishing genetic diversity. Thus, it is necessary to relentlessly explore genetic variation and structure. This is key to averting a swift reduction in diversity, which is vital for sustainable improvements in the livestock and poultry industry. 4.2. Patterns of Runs of Homozygosity (ROHs) The analysis of ROHs throughout seven goat populations unveiled ROH patterns ([151]Figure 2A,B). Some goat breeds, like IMC and JNG, characterized by shorter and fewer ROHs, indicate a relatively higher genetic diversity. In contrast, long ROH segments were observed in certain populations, particularly in the IBE and HSC populations, suggesting historical inbreeding or that a reduced effective population size might have occurred. Longer ROH segments suggested recent inbreeding, whereas shorter ROH segments implied more distant inbreeding events [[152]1,[153]48]. In our research, the IMC population mainly showed short ROH segments, whereas the proportion of long segments was very small, particularly those exceeding 1.5 Mb. This pattern indicated that the IMC population had a low level of inbreeding, with ancient ancestors being the primary group affected by inbreeding events. Therefore, compared to wild goats and other domestic goat breeds, the IMC population may display reduced inbreeding occurrences and enhanced genetic diversity, which were similar with the result of genetic diversity and the LD analysis. Predominantly, ROH segments shorter than 1.5 Mb were the most common, aligning with findings from preceding research on livestock and poultry [[154]41,[155]42,[156]45,[157]49,[158]50,[159]51]. 4.3. Inbreeding Levels within Populations The assessment of runs of homozygosity (ROHs) and inbreeding coefficients across the studied goat populations has revealed notable differences in inbreeding levels, which are crucial for understanding the genetic diversity of these populations. Inbreeding coefficients estimated by ROHs ranged from 0.0263 to 0.4780. The inbreeding coefficients across different populations could reflect the breeding practices and historical management of these breeds. The IMC population exhibited the lowest inbreeding coefficient of 0.0263, which was generally consistent with previous research [[160]29]. This could be attributed to diverse breeding practices or a larger effective population size, which is essential for the long-term sustainability of the breed. In contrast, the IBE population showed an inbreeding coefficient of 0.4780. This was similar to previous studies that reported the highest rates of extreme [MATH: Fro h :MATH] in modern wild bezoar from Iran [[161]44,[162]45] and diminished diversity levels relative to domestic goats [[163]52]. HSC had a relatively higher inbreeding coefficient compared to other domestic goat populations. This high level of inbreeding is a cause for concern, as it indicates a limited genetic pool and potential risks of inbreeding. The government should develop and implement a variety of management plans, as well as selective breeding programs, to detect and manage the genetic diversity of populations. Other populations, including SAA, JNG and ALG, displayed varying degrees of inbreeding coefficients, ranging from 0.0446 to 0.0708. These values suggest moderate levels of inbreeding, which could be reflective of controlled breeding strategies aimed at maintaining specific traits. 4.4. Functional Enrichment Analysis In the ROH islands of domestic goat populations, several genes were identified, such as FGF1, RERE, PTPRM, FGF5, GRID2, RARA and BIRC6 ([164]Table 4). Seven GO terms and two KEGG pathways of these genes were significantly enriched. For instance, GO terms included the positive regulation of cell proliferation (GO:0008284), the negative regulation of granulocyte differentiation (GO:0030853) and nuclei (GO:0005634). KEGG pathways were enriched, including the Ras signaling pathway and cell-adhesion molecules. Through an extensive and precise literature search, we determined the function of each annotated gene on the ROH island. In previous research, these genes were mainly related to meat, milk and fiber production; reproductive capacity; and disease resistance. The genes that we found in HSC and IMC, cashmere goat breeds, were related to fiber traits, for instance, RERE, PTPRM and FGF5. RERE was related to the wool staple length in Akkaraman lambs [[165]53]. PTPRM is known as a hair-related gene that regulates cell–cell communication in keratinocytes [[166]54,[167]55]. But it is also associated with brucellosis resistance in sheep and muscling in pig [[168]56,[169]57,[170]58]. In addition, we also found some genes were related to adaptability (RFX4) [[171]59] and immunity (SH3BP2) [[172]60] in cashmere goats. These genetic factors may potentially improve the adaptability of cashmere goats in cold environments as well as their resistance to pathogens. ALG is a dairy goat, and we have identified GRID2 as a gene related to milk production. GRID2 (glutamate ionotropic receptor delta-type subunit 2) is linked to the central suspensory ligament in Chinese Holstein cows [[173]61]. It is strongly related to dry matter intake, average daily gain, birth weight and milk fat yield in cattle [[174]62]. However, earlier research indicated that GRID2 plays a role in the sexual maturity of Simmental cattle, the temperament trait of sheep and the litter size of goat [[175]63,[176]64,[177]65]. On BOE, key genes associated with traits such as fertility (BIRC6), disease resistance (PDCD1LG2, SVIL), cold tolerance (PRDM16), bone development (ACYP2), meat production (CFAP74) and growth performance (SMG6) were discovered. It was also linked to the introduction and improvement of Boer goats. BIRC6 (baculoviral inhibitors of apoptosis repeat-containing 6) has been confirmed to be involved in the development of follicles in broilers and early embryonic development and fertility in Bos indicus [[178]66,[179]67,[180]68]. PDCD1LG2 (PDCD1 ligand 2) and SVIL are immunostimulatory genes. It has been reported that PDCD1LG2 in Bohuai goats and SVIL in Bos indicus are related to immunity [[181]69,[182]70]. PRDM16 is key in regulating the formation of brown fat cells and the production of brown fat [[183]71,[184]72,[185]73]. PRDM16 is also related to cold tolerance. For example, in sheep, hypothermia due to cold exposure at birth was prevented by the speedy activation of non-shivering thermogenesis in brown adipose tissue [[186]74]. In addition, it has been discovered that CFAP74 is associated with meat production in goats and yearling fiber diameter in sheep [[187]53,[188]75]. In SAA, apart from the genes related to milk production traits (RARA), the majority of the annotated genes are associated with embryonic development or litter size, such as XRCC4 [[189]76], CDH23 [[190]77] and PAK1 [[191]78,[192]79]. Consistent with previous studies, RARA was found on a region of chromosome 19 that displayed a negative pleiotropic impact on milk production traits (milk, fat yield and protein yield) as well as udder development (udder floor position and rear udder attachment) and was also linked to responses to intramammary infections [[193]80]. The functional enrichment analysis of genes identified in consensus ROH regions provided valuable insights into the biological processes and pathways potentially influencing economically important traits in cashmere goats. In our research, the analysis of KEGG pathways indicated that several significant pathways were enriched in relation to fiber traits. These pathways included focal adhesion, JAK-STAT signaling, Ras signaling, PI3K-Akt signaling, Rap1 signaling, Notch signaling, metabolic pathways, melanogenesis pathways, MAPK signaling and signaling pathways regulating the pluripotency of stem cells. These pathways are crucial in regulating the development of hair follicles. The MAPK signaling pathway was recognized for promoting the growth and differentiation of hair follicle cells, fostering their regular developmental cycle and promoting skin-derived precursors by enhancing both the proliferation and hair-inducing capacity [[194]81]. Notch signaling, Ras signaling and Rap1 signaling are important to the in vitro proliferation of hair follicle stem cells [[195]82]. PI3K-Akt signaling plays a necessary role in the new formation of hair follicles [[196]83]. The JAK-STAT signaling pathway is known to be related to the regulation of hair follicle growth, and it has been targeted for the purposes of alopecia areata in the clinical setting [[197]84]. Focal adhesions have been detected during the vital transition from the telogen to the anagen phase, potentially acting as key biomarkers for the regeneration process between these hair growth stages [[198]85,[199]86]. Signaling pathways regulating the pluripotency of stem cells play an important part in determining cell fate by controlling self-renewal and diversification into various cell types [[200]87]. Melanogenesis is an important biochemical process in which melanocytes produce melanin, a crucial element involved in the formation of coat color in mammals [[201]88,[202]89]. In our study, DVL3, NRAS and KIT in the melanogenesis pathway were identified as potential candidate genes influencing cashmere goat fiber color. Previous studies have reported on the involvement of NRAS and KIT in this process [[203]90]. Mutations or deletions of KIT can result in specific hair and skin colors among mammals [[204]91,[205]92,[206]93]. In the Rap 1 and MAPK pathways, we identified the FGF5 gene. FGF5 (fibroblast growth factor 5) is notable as it inhibits hair elongation and is related to hair growth and length in mammals [[207]94,[208]95,[209]96]. These findings not only enhance our understanding of the genetic basis of these characteristics but also pave the way for targeted genetic improvement. 5. Conclusions In conclusion, our study explored ROHs through seven goat populations’ genomes and counted the inbreeding coefficient to assess inbreeding levels. Additionally, we pinpointed ROH islands that harbor genes associated with economically vital traits. Our findings indicated that historical inbreeding has affected the IMC population, which exhibits a relatively low inbreeding level. Through analyzing regions identified in ROH islands and consensus, we discovered certain genes associated with economically crucial traits, including meat, fiber and milk production; fertility; growth; as well as the resistance and adaptability of goats. Our study contributes to a more comprehensive understanding of genetic diversity, the level of inbreeding and essential genes potentially under selection. It also offers a valuable perspective on future conservation measures and the use of cashmere goats. Acknowledgments