Abstract Wool fiber diameter (WFD) is the most important economic trait of wool. However, the genes specifically controlling WFD remain elusive. In this study, the expression profiles of skin from two groups of Gansu Alpine merino sheep with different WFD (a super-fine wool group [FD = 18.0 ± 0.5 μm, n= 3] and a fine wool group [FD=23.0±0.5μm, n=3]) were analyzed using next-generation sequencing–based digital gene expression profiling. A total of 40 significant differentially expressed genes (DEGs) were detected, including 9 up-regulated genes and 31 down-regulated genes. Further expression profile analysis of natural antisense transcripts (NATs) showed that more than 30% of the genes presented in sheep skin expression profiles had NATs. A total of 7 NATs with significant differential expression were detected, and all were down-regulated. Among of 40 DEGs, 3 DEGs (AQP8, Bos d2, and SPRR) had significant NATs which were all significantly down-regulated in the super-fine wool group. In total of DEGs and NATs were summarized as 3 main GO categories and 38 subcategories. Among the molecular functions, cellular components and biological processes categories, binding, cell part and metabolic process were the most dominant subcategories, respectively. However, no significant enrichment of GO terms was found (corrected P-value >0.05). The pathways that were significantly enriched with significant DEGs and NATs were mainly the lipoic acid metabolism, bile secretion, salivary secretion and ribosome and phenylalanine metabolism pathways (P < 0.05). The results indicated that expression of NATs and gene transcripts were correlated, suggesting a role in gene regulation. The discovery of these DEGs and NATs could facilitate enhanced selection for super-fine wool sheep through gene-assisted selection or targeted gene manipulation in the future. Introduction Fine wool sheep, also called Merino, is a world famous sheep breed that is known to produce high-quality fine wool. Fine wool sheep are distributed primarily in Australia, China, New Zealand, South Africa, Uruguay, Argentina and other countries[[39]1]. Wool fiber diameter(WFD) is the most important economic trait of merino sheep and determines 75% of the value of wool fibers. The WFD variation-induced profit accounted for 61% of the total profits of wool [[40]2, [41]3]. Wool is formed from keratinocytes derived from a progenitor population at the base of the hair follicle(HF)[[42]4]. The morphogenesis and growth of HF in sheep has been extensively studied since the 1950’s and the developmental processes at the cellular level are reasonably well understood [[43]5–[44]7]. Dermal papilla (DP) cells are a population of mesenchymal cells at the base of the HF[[45]8], and provide signals that contribute to specifying the size, shape and pigmentation of the wool[[46]9]. It is well known that WFD is significantly associated with size of DP and matrix in mammals [[47]4, [48]6, [49]10–[50]17], and is largely specified post-initiation, during the period of HFs growth and morphogenesis[[51]18]. To illustrate the molecular mechanisms of controlling WFD, the expression profiles of different stage of fetal and adult sheep skin have also been generated by sequencing of expressed sequence tags (ESTs) and cDNA microarray [[52]19, [53]20]. However, the size of DP and matrix in mammals is also markedly influenced by genetic[[54]10, [55]11, [56]21], physiological[[57]13], nutrition[[58]22], hormones[[59]12] during the anagen phase of the hair cycle. Up to now, there are no studies on the molecular mechanisms of controlling WFD during the anagen phase, and the genes specifically controlling WFD remain elusive [[60]23]. Knowledge of the genes controlling development of DP and matrix come from studies of the morphogenesis and cycle of HF of mice and human[[61]24–[62]28], It involves a series of signaling between the matrix and the dermal papill, such as Wnt/beta-catenin, EDA/EDAR/NF-κB, Noggin/Lef-1, Ctgf/Ccn2, Shh, BMP-2/4/7, Dkk1/Dkk4 and EGF[[63]4, [64]29]. The mutation, epigenetic modification and post-translational modification of any ligand or receptors in these pathways maybe affect WFD [[65]30, [66]31]. Therefore, in addition to the current efforts on protein-encoding genes, the attention should also be paid to a novel regulatory factor, non-coding RNA (ncRNA), such as micro RNA, long non-coding RNA and natural antisense transcripts (NATs)[[67]32]. Among these ncRNAs, NATs are not only large in quantity but also play important roles in gene expression regulation in organisms [[68]33]. NATs refer to a class of non-coding RNAs that are produced inside organisms under natural conditions and are expressed in many species [[69]34–[70]36]. NATs play an important role in transcript regulation at the mRNA and/or protein levels and in regulating various physiological and pathological processes, such as organ formation, cell differentiation and disease [[71]37, [72]38]. However, the roles of NATs in controlling WFD have not been described. The next-generation sequencing (NGS)-based digital gene expression (DGE) profiling technologies developed in recent years constitute a revolutionary change in traditional transcriptome technology. Compared with ESTs and cDNA microarray, the strength of DGE profiling is that it is an "open system" and has better capability to discover and search for new information, providing a new way to identify novel genes and NATs that specifically control WFD [[73]39]. However, the success or effectiveness of the search is largely dependent on the completeness and stitching quality of the genome sequences of the species studied [[74]40]. It is exciting that the International Sheep Genomics Consortium (ISGC) has achieved initial success in assembling the reference sheep genome. The total length of the assembled genome, Sheep Genome v3.1, has reached 2.64 Gb, with only 6.9% gaps [[75]41]. Needless to say, the release of a high-quality sheep genome reference sequence provides important resources for using NGS to study the skin transcriptome of sheep with different WFD and find genes and NATs that specifically control WFD [[76]42, [77]43]. In this study, NGS-based DGE profiling was conducted to analyze the differentially expressed genes (DEGs) and NATs in the skin of Gansu Alpine fine wool sheep with different WFD. A total of 47 significant DEGs and NATs were detected, including 9 up-regulated genes, 31 down-regulated genes and 7 down-regulated NATs. These DEGs and NATs may be useful in further study on molecular markers of controlling WFD in fine wool sheep. Results Sequencing and assembly NGS was performed on 6 individuals of the super fine wool group (sample nos.65505,65530,65540;n = 3) and the fine wool group (sample nos.5Y127,5Y212,5Y339;n = 3), and raw data greater than 5 Mb were obtained for all individuals ([78]Table 1), submitted to the NCBI BioProject database, with the accession number PRJNA274817. There was a 3' adaptor sequence in the raw data, along with small amounts of low-quality sequences and various impurities. Impurity data were removed from the raw data, and clean tags greater than 4.7 Mb were obtained. The distinct tag number of each individual was greater than 0.11 Mb ([79]Table 1). Table 1. Summary of tag numbers based on the DGE data from Gansu Alpine fine wool sheep skin with different WFD. Sample ID Raw Data Clean Tag Total number Distinct Tag number Total number Distinct Tag number 5Y127 5259561 355014 4917435 119886 5Y212 5364832 350472 5033144 117880 5Y339 5145452 336695 4857460 120525 65505 5244848 314710 4930422 110879 65530 5485701 343795 5184375 117642 65540 5282704 373209 4941669 143800 [80]Open in a new tab The sheep genome reference sequence database includes 19,346 gene sequences, including 18,940 genes with "CATG" loci and accounting for 97.90% of the total number of genes. The total number of reference tags in the reference tag database is 173,207, including 169,843 unambiguous reference tags, accounting for 98.06% of the total number of tags. All clean tags were compared with reference genes and reference genomes, and the results indicated that 87.45%, 87.32%, 88.63%, 88.97%, 88.72% and 87.38% of the total numbers of clean tags of the 6 individuals from two groups (sample nos. 5Y127, 5Y212, 5Y339, 65505, 65530,and 65540, the same hereinafter), respectively, could be matched to the reference tags; of these, 49.05%, 50.26%, 51.86%, 52.41%, 51.48% and 50.37% of the clean tags could be uniquely located in the reference sequences (sense and antisense), and the ratios of distinct tag were 36.00%, 36.63%, 37.64%, 36.61%, 37.33% and 33.22% ([81]Table 2), respectively; the numbers of unambiguous tag-mapped genes were 10,391, 10,209, 10,666, 10,434, 10,137, 10,298,10763 and 9,965, respectively. ([82]Table 2) These uniquely located sequences indicated that the key genes that regulate the WFD may exist in genes expressed in the individuals with different WFD. Table 2. Summary of unambiguous tag mapping to gene and unambiguous tag-mapped genes (sense & anti-sense). Sample ID Unambiguous Tag Mapping to Gene Unambiguous Tag-mapped Genes Total number Total % of clean tag Distinct Tag number Distinct Tag % of clean tag number % of ref genes 5Y127 2412224 49.05% 43158 36.00% 10391 53.71% 5Y212 2529818 50.26% 43175 36.63% 10209 52.77% 5Y339 2519260 51.86% 45369 37.64% 10666 55.13% 65505 2583990 52.41% 40595 36.61% 10434 53.93% 65530 2668978 51.48% 43910 37.33% 10137 52.40% 65540 2489004 50.37% 47775 33.22% 10298 53.23% [83]Open in a new tab There were a large number of possible unknown tags that are in the gene expression profile libraries of these six individuals. Accounting for 12.55% (617,198), 12.68% (638,292), 11.37% (552,279), 11.03% (543,900), 11.28% (585,017) and 12.62% (623,705) of the total number of clean tags were unannotated, respectively; the proportions of distinct tag number of unknown tag were 18.77% (22,503),18.79% (22,152),16.17% (19,493),18.02% (19,975),17.25% (20,292) and 20.43% (29,382), respectively. These results suggested that there were many unknown genes in sheep skin tissue that may also play important roles in the regulation of HF development and wool growth. Analysis of the expression profiling of NATs Sense-antisense regulation is an important method for controlling gene expression. If the clean tags can be matched to the antisense strand of the gene, then it suggests that there are also transcripts for the antisense strand of this gene and that this gene may be subjected to sense-antisense regulation [[84]44]. In the present study, we found that in the gene expression profiling libraries of the 6 individuals of the two groups, the percentages of genes with antisense transcripts were, respectively, 5.83%, 6.01%, 5.91%, 5.84%, 6.00% and 6.20% of the total numbers of the genes in the library, including 247,816 (5.04%), 262,353 (5.21%), 249,595 (5.14%), 247,940 (5.03%), 266,926 (5.15%) and 265,599 (5.37%) tags that could be exactly matched, respectively ([85]Table 3); the number of tag species accounted for, respectively, 13.31%, 13.31%, 13.46%, 13.20%, 13.65% and 11.94% of the total numbers of tag species in the expression profile libraries of the 6 individuals from the 2 groups, including 15,192 (12.67%), 14,966 (12.70%), 15,483 (12.85%), 13,957 (12.59%), 15,289 (13.00%) and 16,346 (11.37%) tag species that could be exactly matched, respectively([86]Table 3); there were 6472 (33.45%), 6398 (33.07%), 6585 (34.04%), 6227 (32.19%), 6346 (32.80%) and 6758 (34.93%) genes with NATs in the gene expression profiles of the 6 individuals of the 2 groups ([87]Table 3). The sense-antisense transcript ratios of all 6 expression profile libraries exhibited similar trends. The ratios between the tag number, distinct tag number and tag-mapped genes of sense and antisense transcripts were 10:1, 2:1 and 5:3, respectively; the spearman r between the sense and antisense transcripts expression were 0.533 ~ 0.557. Table 3. Summary of unambiguous tag mapping to anti-sense genes and unambiguous tag-mapped anti-sense genes. Sample ID Unambiguous tag mapping to anti-sense genes Unambiguous tag-mapped anti-sense genes Total number Total % of clean tag Distinct Tag number Distinct Tag % of clean tag number % of ref genes 5Y127 247816 5.04% 15192 12.67% 6472 33.45% 5Y212 262353 5.21% 14966 12.70% 6398 33.07% 5Y339 249595 5.14% 15483 12.85% 6585 34.04% 65505 247940 5.03% 13957 12.59% 6227 32.19% 65530 266926 5.15% 15289 13.00% 6346 32.80% 65540 265599 5.37% 16346 11.37% 6758 34.93% [88]Open in a new tab Detection of DEGs, NATs and validation The clean tags of all six individuals were aligned with the reference tag database, allowing a maximum of one base mismatch. Unambiguous tags were annotated, and the number of raw clean tags that corresponded to the same gene was counted and then standardized to obtain the standardized expression level of each gene in the skin transcriptome of the 6 individuals. Noiseq software was used to select DEGs, NATs that exhibited different expression levels between S and F group. However, there were only 47 significant DEGs and NATs (log2Ratio (S/F)≥1, q-value≥0.8).9 genes were up-regulated, 31 genes and all 7 NATs were down-regulated ([89]Table 4, [90]Fig 1). Among the down-regulated genes, NT5C3L (Gene ID 101109197) showed the greatest expression difference (log2Ratio(S/F) = –10.59,q value = 0.89); among the up-regulated genes, CCNA2 (Gene ID 100144758) showed the greatest expression difference (log2Ratio(S/F) = 6.04, q value = 0.83). 38 significant expression genes had NATs, and only 2 had no found antisense transcripts: CCNA2 (Gene ID: 100144758) and prolactin-inducible protein homolog (Gene ID: LOC101114011).3 significant DEGs (AQP8, Gene ID: 101108013;Bos d2,Gene ID:101116281, and SPRR,Gene ID: 443313) had significant NATs. All 3 of these genes and their NATs were significantly down-regulated in the super-fine wool group ([91]Table 4).10 DEGs, NATs were used to validate selected differentially expressed transcripts identified from DGE profiling by Real-time PCR. The results from the real-time PCR confirmed the expression pattern of DGEs and NATs at two different groups in Gansu Alpine fine wool sheep. Table 4. Summary of DEGs and NATs between two groups. Kinds of Transcriptes GeneID log2Ratio(S/F) Probability Symbol DEGs 101109197 -10.5912106 0.886583566 NT5C3L 101116281 -7.40151954 0.969520529 LOC101116281 101120858 -4.57602606 0.868500902 LOC101120858 443313 -4.45466762 0.848327046 Small proline-rich 101109718 -4.16402747 0.835055492 CA4 101108013 -4.13509115 0.838991854 AQP8 101118004 -3.99924015 0.86634137 LOC101118004 101115395 -3.90149163 0.829820677 PGLYRP1 101114011 -3.8813167 0.855789733 LOC101114011 101107368 -3.8266707 0.827278443 SLC25A35 101104026 -3.72227757 0.855311355 S100A8 101115563 -3.4495271 0.829328632 ABCC11 101109387 -3.38125687 0.87068777 GLYATL2 101106121 -3.3527916 0.870715106 LOC101106121 101115336 -3.29504885 0.839237877 DNASE1L2 101116409 -3.16407151 0.86568531 LOC101116409 101113168 -3.1562348 0.826690722 LOC101113168 100137068 -2.97890634 0.857115521 LOC100137068 101116799 -2.80442661 0.859794434 LOC101116799 [92]NM_001009395.1 -2.67922661 0.847588978 - 101102714 -2.61084761 0.822795364 LOC101102714 101102540 -2.3882578 0.861489257 RPL39 101102697 -2.14152458 0.81083593 DNAJC12 101108654 -2.10679611 0.847930676 LOC101108654 101116537 -1.89220493 0.803933629 LOC101116537 101121216 -1.86897593 0.842094473 LOC101121216 101112716 -1.85095925 0.834194413 KRT7 101121307 -1.5666126 0.800489312 ACSM3 101111121 -1.46170294 0.809141108 CD82 101114256 -1.40678444 0.802238806 ACTG2 101109430 -1.32738037 0.80115904 KRT1 101105583 1.377957974 0.802293478 GSDMA 101110063 1.491887466 0.816426111 HSPA2 100135694 1.587840025 0.830968782 RPS27A 443218 1.679295263 0.826485703 FOS 101113964 1.819875496 0.838185446 PDCD6IP 101119862 2.058276167 0.826212345 LIPK 101120443 2.31282643 0.827415122 LOC101120443 101105188 4.991770667 0.839825597 DAP 100144758 6.03994716 0.834235416 CCNA2 NATs 101116281 -9.15903036 0.941608145 Bos d2 443313 -7.587465008 0.819808271 SPRR 101108013 -7.807354922 0.844108795 AQP8 101120353 -5.685396543 0.88975588 major allergen I polypeptide chain 2-like 101120550 -6.896227669 0.975067811 SMC1A 101113086 -9.556506055 0.957344034 primary amine oxidase 101113693 -8.661778098 0.914297923 ABP [93]Open in a new tab Fig 1. The numbers of DEGs between two groups. [94]Fig 1 [95]Open in a new tab Between two groups, there were 9 upregulated genes and 31 downregulated genes and 7 NATs. Gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and other databases were used for the functional analysis and signaling pathway annotations of these DGEs and NATs. Among all of the significant DEGs and NATs, 44 genes were annotated, and 3 genes located in the genome could not be annotated effectively. In total of 47 DEGs and NATs were summarized as three main GO categories and 38 subcategories ([96]Fig 2). Among the molecular functions category, the top three were involved in binding and catalytic activity. Regarding cellular components, cell part, cell, organelle, membrane, organelle parts were the dominant groups. Within biological processes category, metabolic process and cellular process were the most dominant group. However, no significant enrichment of GO terms was found (corrected P-value >0.05). The pathways that were significantly enriched with significant DEGs and NATs were mainly the lipoic acid metabolism, bile secretion, salivary secretion and ribosome and phenylalanine metabolism pathways (P < 0.05). Fig 2. GO functional analysis of DEGs and NATs. Fig 2 [97]Open in a new tab The results were summarized in three main categories: biological process, cellular component and molecular function. Among these groups, the terms binding, cell part and metabolic process were dominant in each of the three main categories, respectively. Discussion Wool is produced via synthetic processes by HFs, which are embedded in the skin of sheep [[98]45]. There are two types of HFs, named primary HF and secondary HF, which are different in appearance and function. Secondary HFs of the fine wool sheep are the main hair follicle and are critical determinants of mean fiber diameter and other wool characteristics. WFD is highly correlated with the size of DP of HF [[99]17], whose origin can be traced to the dermal condensate, one of the earliest features of the developing HF[[100]19]. The charactering of the molecular controls of HF initiation, morphogenesis, branching and growth can facilitate enhanced selection for new sheep breeds with lower WFD [[101]20]. In order to understand the molecular mechanisms of controlling WFD, Adelson et al. (2004) constructed three cDNA libraries from fetal and adult sheep skin, obtained 2,345 noredundant EST sequences and identified 61 ESTs expressed in the adult HF, which constituted a high priority candidate gene subset for further work aimed at identifying genes useful as selection markers or as targets for genetic engineering[[102]19]. Norris et al. (2005) found 50 up- and 82 down-regulated genes with increased fetal and inguinal expression relative to adult by compared skin gene expression profiles between fetal day 82, day 105, day 120 and adult HF anagen stage using a combined ovine–bovine skin cDNA microarray[[103]20]. However, comparing to sequencing of expressed sequence tags (ESTs) and cDNA microarray, DGE profiling has many unique advantages [[104]46, [105]47]. It is highly accurate and has a very low detection limit, giving it a very wide range of applications[[106]48],such as detection of new transcripts [[107]49],functional research of non-coding RNA [[108]50, [109]51]. In this study, six DGE profiling were conducted to analyze the DEGs and NATs in the skin of Gansu Alpine fine wool sheep with different WFD at the same age, gender, and nutrition level during the anagen. More than 4.7 Mb of clean tags were obtained for each sheep, with more than 0.11 Mb of distinct tags. A total of 47 significant DEGs and NATs were detected, including 9 up-regulated genes, 31 down-regulated genes and 7 down-regulated NATs. The HF comprises several concentric epithelial structures[[110]29]. Wool fiber is enveloped by two epithelial sheaths, known as the inner root sheath (IRS) and the outer root sheath (ORS). At the distal end of the HF, IRS cells undergo apoptosis and liberate the wool fiber [[111]29, [112]52]. ORS cells contain proliferating cells derived from stem cells in the bulge that feed into the matrix compartment of the bulb[[113]53]. The matrix is a proliferative zone located at the proximal end of the HF surrounding the DP [[114]52]. During the anagen phase, the DP cells are thought to direct the matrix cells to proliferate and differentiate into the multiple cell types that form the wool fiber and its channel, the IRS [[115]9]. During postnatal life, hair follicles show patterns of cyclic activity with periods of active growth and hair production (anagen), apoptosis-driven involution (catagen), and relative resting (telogen) [[116]53–[117]55]. However, Merino HFs are predominantly in anagen throughout growth, different to many animals such as the mouse, rabbit, and guinea-pig [[118]30]. It is also well known that WFD is significantly associated with sizes of DP and matrix in mammals [[119]4, [120]6, [121]10–[122]17],which are markedly influenced by genetic[[123]10, [124]11, [125]21], physiological[[126]13],nutrition[[127]22], hormones[[128]12] during the anagen phase of the hair cycle. However, the genes specifically controlling the size of DP and matrix remain elusive. In the present study, the up-regulated genes in the super fine wool group included 5 genes associated with the cell cycle and apoptosis: GSDMA3 [[129]56], HSPA2 [[130]57], RPS27A [[131]58], PDCD6IP [[132]59], DAP[[133]60], CCNA2 [[134]61] and FOS[[135]62]; the down-regulated genes included 7 genes associated with promoting follicle cell proliferation and differentiation: KRT1 [[136]63, [137]64], KRT7 [[138]65, [139]66], HSD11B1 [[140]67, [141]68], S100A8 [[142]41, [143]69, [144]70], NT5C3 L[[145]71] and DNAJC12 [[146]72]. These genes may through promoting HF cell apoptosis, inhibiting follicle cell proliferation and differentiation, thereby reduce the WFD. White adiposities, dermal fibroblasts, smooth muscle and endothelial cells of the vasculature, neurons and resident immune cells also surround the HFs in the skin. Recent studies show that the growth of wool fiber is not only affected by the proliferation and differentiation of follicular cells but is also affected by other cells around the follicle [[147]73, [148]74]. The layer of intradermal adipocytes expands as the HF enters its anagen stage and then thins during telogen [[149]75, [150]76]. FATP4^-/-, Dgat1^-/-, or Dgat2^-/- Early B cell factor 1 (Ebf1^-/-) mice have decreased intradermal adipose tissue due to defects in lipid accumulation in mature adipocytes [[151]77–[152]79]. Interestingly, these mice also display abnormalities in skin structure and function, such as hair loss and epidermal hyperplasia. In the present study, in the super fine wool group, it was also determined that the genes related to fat synthesis (ABCC11[[153]80], GLYATL2 [[154]81], and ACSM3 [[155]82]) were significantly down-regulated and that LIPK, which is involved in lipolysis, was significantly up-regulated [[156]83]. Genes with NATs are very common in animal and plant genomes, accounting for 7% to 9% of transcripts in plants and 5% to 30% of transcripts in animals, except for nematodes [[157]84]. The present study indicated that more than 30% of the genes had NATs, which is consistent with the findings of the above mentioned studies. In the present study, correlation analysis was conducted on the sense and antisense transcripts in the DGEs of skin, and the results showed that the correlation coefficients between the sense and antisense transcripts were between 0.533 and 0.557, indicating that the WFD maybe were regulated by both sense and antisense transcripts. NATs regulate sense transcripts through positive or negative feedback by forming a sense-antisense bidirectional structure [[158]85]. In this study, 3 out of 40 significantly differentially expressed genes had significantly different NATs: AQP8, Bos d2, and SPRR (type II small proline-rich protein). AQP8 belongs to the aquaporin subfamily of the AQP family [[159]86]; it is located mainly in a variety of tissues and on the surfaces of acinar cells in various glands and plays important roles in gland secretion and the transportation of water, urea, ammonia and hydrogen peroxide [[160]87–[161]90]. Bos d 2 belongs to the lipocalin family of proteins[[162]91]. In the skin sections, Bos d2 was found in the secretory cells of apocrine sweat glands and the basement membranes of the epithelium and HF[[163]92]. It is assumed that Bos d2 is a pheromone carrier.[[164]92]. The SPRR proteins comprise a subclass of specific cornified envelope precursors encoded by a multigene family clustered within the epidermal differentiation complex region [[165]93]. Two SPRR1, seven SPRR2, one SPRR3 and one SPRR4 genes are located within a 300-kb area of the EDC[[166]94, [167]95] and are expressed in the epidermis, HFs and capillaries [[168]96, [169]97]. Several studies have suggested that the SPRRs are related to increased epithelial proliferation and to malignant processes[[170]98]. Other than functioning as structural proteins, SPRRs also regulate gene expression levels, inhibit cell proliferation and promote differentiation [[171]99]. The present study also found that AQP8, Bos d2, SPRR and their antisense transcripts were significantly down-regulated in the super fine wool group, suggesting that AQP8, Bos d2 and SPRR and their antisense transcripts were regulated by positive feedback. However, the mechanism by which AQP8, Bos d2 and SPRR regulate the WFD needs to be further studied. Materials and Methods Sheep skin sampling Gansu Alpine fine wool sheep were bred in the Huang Cheng District of Gansu Province, China, by cross breeding Mongolian or Tibetan sheep with Xinjiang Fine Wool sheep and then with some fine wool sheep breeds from the Union of Soviet Socialist Republics, such as Caucasian sheep and Salsk sheep. The breed was approved by the Gansu provincial government in 1980. Gansu Alpine fine wool sheep were obtained from a sheep stud farm located in Zhangye city, Gansu Province. All experimental and surgical procedures were approved by the Institutional Animal Care and Use Committee, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Peoples Republic of China. Six unrelated 3 years old ewes at different WFD, and also as different DP size were selected and divided into super fine wool group (S)(WFD = 18.0±0.5μm; Diameter of secondary DP size = 3.2±0.2μm) and fine wool group (F)(WFD = 23.0±0.5μm; Diameter of secondary DP size = 4.1±0.2μm) ([172]Fig 3). A piece of midside skin (2 mm in diameter) was collected via punch skin biopsy under local anesthesia using 1% procaine hydrochloride immediately placed in liquid nitrogen and stored at -80°C for subsequent analysis. Fig 3. The size of secondary DP cells between two groups. Fig 3 [173]Open in a new tab The diameter of super fine wool sheep secondary DP cells = 3.2±0.2μm (A), the diameter of fine wool sheep secondary DP cells = 4.1±0.2μm (B). Total RNA extraction, library construction and deep sequencing Total RNA was isolated from the tissues using the RNeasy Maxi Kit (Qiagen, Hilden, GER) according to the manufacturer instructions. RNA quality was verified using a 2100 Bioanalyzer RNA Nanochip (Agilent, Santa Clara, CA, USA), and the RNA Integrity Number (RIN) value was >8.5. Then, the RNA was quantified using the Nano Drop ND-2000 Spectrophotometer (Nano-Drop, Wilmington, DE, USA). Sequence tags were prepared using the Illumina Digital Gene Expression Tag Profiling Kit, according to the manufacturer’s protocol. Sequencing libraries were prepared from 1μg of total RNA using reagents from the NlaIII Digital Gene Expression Tag Profiling kit (Illumina Inc., San Diego, CA, USA). mRNA was captured on magnetic oligo(dT) beads and reverse transcribed into double-stranded cDNA (SuperScript II, Invitrogen, Carlsbad, CA, USA). The cDNA was cleaved using the restriction enzyme NlaIII. An adapter sequence containing the recognition sequence for the restriction enzyme MmeI was ligated to the NlaIII cleavage sites. The adapter-ligated cDNA was digested with MmeI to release the cDNA from the magnetic bead, while leaving 17 bp of sequence in the fragment. The fragments were dephosphorylated and purified by phenol–chloroform. A second adapter was ligated at the MmeI cleavage sites. Adapter-ligated cDNA fragments were amplified by PCR, and after 15 cycles of linear PCR amplification, 95 bp fragments were purified by 6% TBE PAGE gel electrophoresis, denatured and the single-chain molecules were fixed onto the Illumina Sequencing Chip. A single-molecule cluster sequencing template was created through in situ amplification. Nucleotides labeled with different colors were used to perform sequencing by the sequencing by synthesis method; each tunnel can generate millions of raw reads with a sequencing length of 35 bp. Determination of gene expression levels and detection of DEGs and NATs Sequencing-received raw image data were transformed by base culling into sequence data, which was called raw data. Raw sequences were transformed into clean tags after removed all low quality tags such as short tags (< 21 nt), empty reads, and singletons (tags that occurred only once). Clean tags were classified according to their copy number (as a percentage of the total number of clean tags) and the saturation of the library was analyzed. All clean tags were mapped to sheep reference sequences(version:Oarv3.1) by SOAP2(version:2.21) with default parameters, and allowing one mismatches. To monitor mapping events on both strands, both sense and complementary antisense sequences were included [[174]100]. A preprocessed database of all possible 17 base-long sequences of Oarv3.1 located next to the NlaIII restriction site was created as the reference tags, and only one mismatch was allowed. Following the common practice, only clean reads that can be uniquely mapped back to the reference tags were considered ("-r 0" option), and those mapped more than one location were discarded. Remainder clean tags were designed as unambiguous clean tags. When multiple types of remainder clean tags were aligned to the different positions of the same gene, the gene expression levels were represented by the summation of all. The number of unambiguous clean tags for each gene was calculated and then normalized to TPM (number of transcripts per million clean tags) = clean tag number corresponding to each gene number of total clean labels in the sample × 1,000,000 [[175]101, [176]102]. All TPM of 3 samples each group were integrated, and NOISeq (version: 2.8.0) [[177]103]with default parameters, which is a novel nonparametric approach for the identification of DEGs and NATs, was used to detect the differentially expressed transcripts between super fine wool group (S) and fine wool group (F). To measure expression level changes between two groups, NOISeq takes into consideration two statistics: M(the log[2]-ratio of the two conditions) and D(the absolute value of the difference between conditions). The probability thresholds were P ≥ 0.8 and the TMM value in the lower expressed sample was ≥1. The higher the probability, the greater the change in expression between two groups. Using a probability threshold of 0.8 means that the gene is 4 times more likely to be differentially expressed than non-differentially expressed[[178]104]. Strand-specific real-time quantitative RT-PCR To confirm the differentially expressed sense and antisense transcripts between super fine wool group and fine wool group, ten genes were randomly selected to verify their expression levels of gene and NATs transcripts in skin by strand-specific qRT-PCR according to the protocol described in Haddad et al. (2007)[[179]105]. Primers for real-time PCR were designed with Primer Express 3.0 (Applied Biosystems) ([180]Table 5). GAPDH was used as a reference control. Real time PCR was performed using SYBR Green master mix (TianGen) on the CFX96 Real-Time System (BioRAD, USA). The reaction was performed using the following conditions: denaturation at 95°C for 3 min, followed by 40 cycles of amplification (95°C for 30s, 60°C for 30s, and 72°C for 30s). Relative expression was calculated using the delta-delta-Ct method. Table 5. Relevant information of gene and primer sequences for strand-specific RT-PCR. Genes name Primer sequences (5′→3′) GenBank accession No Produce size (bp) Bos d2 ACAGTAGTGGCACAGAGAGC [181]XM_004021893.1 136 TTGTGGTCTTCTCGCCATCA SPRR CAAAAATGCCCTCCTGTGCC [182]NM_001009773.1 91 CCTGACCAGATAAAAGCTGATGC AQP8 TCATCCTGACGACACTGCTG [183]XM_004020851.1 146 ATTCATGCACGCTCCAGACA KRT1 CCCTGGATGTGGAGATTGCC [184]XM_004006320.1 119 ACTGATGCTGGTGTGACTGG KRT7 ACATCAAGCTGGCTCTGGAC [185]XM_004006333.1 108 CCACAGAGATGTTCACGGCT S100A8 GCTGACGGATCTGGAGAGTG [186]XM_004002523.1 163 GAACCAAGTGTCCGCATCCT GAPDH GCTGAGTACGTGGTGGAGTC [187]NM_001190390.1 136 GGTTCACGCCCATCACAAAC AQP8-NAT CCTTGGGAAATCCTTGCAGC 137 GAAAAGCCCGTTGCAGACTC Bos d2-NAT TTGGCGGAAGTATGTCGCAG 112 CAGCTCAGCCAGAATCGTCA SPRR-NAT CCATGACCCGCTTTGAAGA 132 CAATGCCAGCAGAAGTGCC [188]Open in a new tab GO and KEGG enrichment analysis of differentially expressed transcripts Gene ontology (GO), an international standardized gene functional classification system, offers a dynamic-updated controlled vocabulary and strictly defined concept to comprehensively describe the properties of genes and their products in any organism[[189]106]. Kyoto Encyclopedia of Genes and Genomes (KEGG) database is a database resource for understanding functions and utilities of the biological system, such as the cell, the organism and the ecosystem from molecular information, especially for large-scale molecular datasets generated by genome sequencing and other high throughput experimental technologies ([190]http://www.genome.jp/kegg/) [[191]107]. all DEGs and NATs were mapped to GO-terms in GO database, looking for significantly enriched GO terms in DEGs comparing to the genome background GOseq R package(version:1.18.0)[[192]108],while significantly enriched metabolic pathways or signal transduction pathways in DEGs and NATs were identified via pathway enrichment analysis using KEGG, public pathway-related database, and comparing with the whole genome background by KOBAS(version: 2.0)[[193]109]. In all tests, P-values were calculated using Benjamini-corrected modified Fisher’s exact test and ≤0.05 was taken as a threshold of significance. The calculating formula is: [MATH: P=1-i=0m-< mn>1MiN-Mn-iNn :MATH] Where N is the number of all genes with GO or KEGG annotation; n is the number of DEGs or NATs in N; M is the number of all genes that are annotated to the certain GO terms or specific pathways; m is the number of DEGs or NATs in M. Supporting Information S1 Table. Summary of tags mapping to gene & genome and unknown tag. (DOCX) [194]Click here for additional data file.^ (17.3KB, docx) S2 Table. Summary of tags mapping to anti-sense genes and tag-mapped anti-sense genes. (DOCX) [195]Click here for additional data file.^ (15KB, docx) Acknowledgments