Abstract Simple Summary In this study, we investigated the brain, gill, liver, and muscle transcriptomic responses of Taiwan tilapia towards cold stress. Some key genes and molecular markers involved in cold biological pathways were screened through differential expression. Among them, energy-related metabolic pathways and nucleotide genotypes were highly correlated with cold tolerance traits. This suggested that single nucleotide polymorphism (SNP) genetic variation can be used as a molecular marker to assist the selection and verification of cold-tolerant populations. Our study results will accelerate the understanding of different farmed tilapia tolerance mechanisms to environmental temperature changes and provide insights for the molecular breeding of cold-tolerant Taiwan tilapia species. Abstract Taiwan tilapia is one of the primary species used in aquaculture practices in Taiwan. However, as a tropical fish, it is sensitive to cold temperatures that can lead to high mortality rates during winter months. Genetic and broodstock management strategies using marker-assisted selection and breeding are the best tools currently available to improve seed varieties for tilapia species. The purpose of this study was to develop molecular markers for cold stress-related genes using digital gene expression analysis of next-generation transcriptome sequencing in Taiwan tilapia (Oreochromis spp.). We constructed and sequenced cDNA libraries from the brain, gill, liver, and muscle tissues of cold-tolerance (CT) and cold-sensitivity (CS) strains. Approximately 35,214,833,100 nucleotides of raw sequencing reads were generated, and these were assembled into 128,147 unigenes possessing a total length of 185,382,926 bp and an average length of 1446 bp. A total of 25,844 unigenes were annotated using five protein databases and Venny analysis, and 38,377 simple sequence repeats (SSRs) and 65,527 single nucleotide polymorphisms (SNPs) were identified. Furthermore, from the 38-cold tolerance-related genes that were identified using differential gene expression analysis in the four tissues, 13 microsatellites and 37 single nucleotide polymorphism markers were identified. The results of the genotype analysis revealed that the selected markers could be used for population genetics. In addition to the diversity assessment, one of the SNP markers was determined to be significantly related to cold-tolerance traits and could be used as a molecular marker to assist in the selection and verification of cold-tolerant populations. The specific genetic markers explored in this study can be used for the identification of genetic polymorphisms and cold tolerance traits in Taiwan tilapia, and they can also be used to further explore the physiological and biochemical molecular regulation pathways of fish that are involved in their tolerance to environmental temperature stress. Keywords: Taiwan tilapia, cold-tolerance, RNAseq, microsatellite, single nucleotide polymorphism 1. Introduction Due to the extreme climate and environmental changes caused by global warming that include unusual snowfall and frequent winter cold currents, all biological communities and ecosystems have been impacted [[36]1]. The sustainable development of agricultural production stability and safety is a major challenge facing all countries today [[37]2,[38]3]. Taiwan is located in a tropical/subtropical region. The majority of aquaculture production is dominated by subtropical fish species. High temperatures in summer and typhoons, heavy rainfall, low temperature, and cold currents in winter along with other extreme abnormal weather have caused serious damage to aquaculture fisheries and economic losses [[39]4]. Tilapiine fishes of the cichlid family are native to tropical and subtropical regions of Africa and eventually reached and settled in the Middle East through the Great Rift Valley [[40]5]. Given the tropical origin of these fish, 20–30 °C is the optimal growth temperature for most tilapia. When the temperature is lower than 20 °C, feeding and reproduction are typically inhibited. Currently, in addition to many genetic studies examining low temperature tolerance in different tilapia species, a number of countries have also developed cultivated tilapia strains exhibiting different degrees of cold-tolerance through selection and hybridization [[41]6,[42]7]. In Taiwan, after years of research, genetic improvement, and certification, tilapia fish have gradually become a unique and high-quality strain. Twenty years ago, the industry termed these fish “Taiwan tilapia” [[43]8]. Thermal tolerance is a quantitative trait that is biologically significant in several cultured fish species. This body phenotypic variation is attributed to acclimation, physiological stage, genetic effects, and environmental factors [[44]9,[45]10]. To survive, fish can also respond to environmental temperature fluctuations [[46]11]. When fish are stimulated by the environment, they activate the two major physiological regulation nerves and endocrine systems in the body to maintain cellular homeostasis by regulating metabolism, osmotic pressure, and the endocrine system [[47]12,[48]13,[49]14]. If the body functions of these fish are damaged, the resistance of these fish will decrease, and this can result in disease or even death. At the molecular level, low temperatures reduce the enzymatic reaction rates and the diffusion and transport of biomolecules and can slow protein denaturation and disaggregation [[50]15]. Additionally, low temperatures can also inhibit transcription and translation, destroy cellular cytoskeletal elements, alter membrane permeability, and affect energy production in cells [[51]16]. Furthermore, previous genetic studies have used microsatellite markers to facilitate genetic linkage map quantitative trait loci (QTL) mapping [[52]17] and cold-tolerance marker development [[53]7,[54]18,[55]19]. Several studies have attempted to further explore the genetic model and the genetic basis of cold-tolerance in fish [[56]20,[57]21,[58]22,[59]23,[60]24]. However, the mechanisms and pathways underlying the observed variations in cold-tolerance among tilapia fish species remain unknown. In this study, genetic variation within species was characterized by selecting and breeding families possessing different low-temperature tolerances and sensitivities. Furthermore, transcriptome analysis was used to compare the transcriptome responses in brain, gill, liver, and muscle tissues between cold-tolerant and sensitive Taiwan tilapia strains under low-temperature challenges. In addition to characterizing the differential gene expression patterns between cold-tolerant and cold-sensitive fish, it is also possible to develop and verify the low-temperature test of the functional genes involved in the regulation pathway. Molecular genetic DNA markers for microsatellites and single nucleotide polymorphisms assist in broodstock breeding. In the future, this can accelerate the molecular selection of low-temperature-tolerant strains of Taiwan tilapia to mitigate the losses of the aquaculture fish industry due to cold damage and increase the economic value of the aquaculture industry. 2. Materials and Methods 2.1. Animals and Experimental Conditions Three tested Taiwan tilapia strains were collected from the wild and from private markets by the National Taiwan Ocean University (NTOU), and breeding programs were performed at the core breeding greenhouse and Aquatic Organism Research and Conservation Center in Gongliao. The Rui-Fang (RF) strain is a wild species of Taiwan tilapia that was originally collected from the Rui-Fang field in northern Taiwan. The YiHua (YH) strain is a breeding species of Taiwan tilapia that was originally collected from the private commercial seed breeding farm in southern Taiwan. The NTOU (NT) strain is a pure strain of Nile tilapia, and this strain has been maintained for a long time period in the NTOU [[61]25,[62]26]. Spawns were obtained by mating in pairs to obtain 22 families ([63]Supplementary Figure S1). The offspring of each family were kept in separate fish tanks until they were 3 to 4 months old and weighed approximately 30–40 g or more. Subdermally injected dyes were used to identify families. To determine the cold-tolerance value of each family, each one contributed fry for cooling tests. The operation method was referred to and modified from the description published by Nitzan et al. [[64]9]. 2.2. Cooling Test The selection criteria for cold-tolerance traits as assessed using the cooling test involved the use of F1 (n = 15) and F2 (n = 15) test fish from the offspring of full-sib and half-sib families that were raised in a 90-L orange bucket that was equipped with an upper filter and an air pump (TURBO 600 pumping motor, SanYu Aquarium, Kaohsiung, Taiwan). The upper portion was covered with a plastic fence to prevent the fish from jumping out of the tank. The fish were allowed to adapt to a water temperature of 25 °C for one week, and the photoperiod was set to 12 h light/12 h darkness. The upper portion of the filter was cleaned, the water was changed, and the fish were fed daily. The offspring fish of each family were then moved into a 600-L water tank that was equipped with a temperature-controlled system (Firstek Scientific, New Taipei City, Taiwan) that was used to slowly lower the water temperature at a rate of 1 °C per day. The water temperature was recorded and observed after each cooling. The swimming behavior and feeding behavior of Taiwan tilapia were observed, and a dissolved oxygen saturation of greater than 90% was maintained. Finally, the minimum tolerable temperature (°C), full length (cm), body length (cm) and mass (g) of each flipping fish was recorded, and tail fin tissue samples were collected to extract genomic DNA for subsequent analysis. The offspring of each Taiwan tilapia family were tested for low-temperature tolerance using a cooling and cold attack test. The first three families (RF strain) considered most tolerant to low temperature, and the last three families (NT strain) considered least tolerant were selected and defined as the cold-tolerance (CT) and cold-sensitivity (CS) groups. 2.3. Total RNA Extraction The cold-tolerance (CT, n = 6) and cold-sensitivity (CS, n = 6) fish groups were subjected to cold stimulation at 10 °C, and the brain, gill, liver, and muscle tissues were collected and placed into a 1.5 mL centrifuge tube for isolation (Roche Applied Science, Mannheim, Germany). The EasyPure Total RNA Spin Kit (Bioman, Taipei, Taiwan) was used for total RNA extraction. Three stainless steel beads (3 mm) and one (5 mm) steel ball (LabTurbo^®, Taigen Bioscience, Taipei City, Taiwan) were placed into a 1.5 mL centrifuge tube containing Trizol, and they were homogenized in the SpeedMin PLUS (Analytik Jena AG, Biometra GmbH, Göttingen, Germany) and maintained at room temperature for 5 min. The mixture was poured into a 2 mL filter column and centrifuged (4 °C, 10,000× g) for 2 min. The supernatant was then transferred to a new centrifuge tube, and 400 µL of 70% ethanol was added. The tube contents were shaken, mixed immediately, and poured into the RB column (Bioman, Taipei, Taiwan). This was centrifuged (4 °C, 10,000× g) for 2 min, and the lower layer of liquid was discarded and the column was placed into a 2 mL collection tube. Next, 400 µL of W1 buffer was added to the RB column that was subsequently centrifuged (4 °C, 10,000× g) for 1 min. The lower layer of liquid was discarded and returned to the 2 mL collection tube. Then, 600 µL of wash buffer was added to the RB column that was subsequently centrifuged (4 °C, 10,000× g) for 1 min. The lower layer of liquid was poured into a 2 mL collection tube and then centrifuged (4 °C, 10,000× g) for 3 min. The RB column was placed into a new microcentrifuge tube, and 50 µL of RNase-free water was added to absorb RNA. The mixture was allowed to stand for 5 min and then centrifuged (4 °C, 10,000× g) for 2 min to obtain purified RNA. Subsequently, a NanoDrop (MaestroGen, Hsinchu City, Taiwan) was used for analysis. The ratio of OD[260] to OD[280] was measured using a spectrophotometer to assess the concentration of purified RNA. The ratio of 1.9–2.0 indicates high purity RNA. The RNA extracts were stored at −80 °C for later use. 2.4. Transcriptome High-Throughput Next-Generation Sequencing A 10 µg total RNA sample was sent to a sequencing service company for RNA sample quality testing to confirm the integrity of the RNA sample. The use of contaminated or degraded samples was avoided. Only high-quality total RNA was selected, and these RNA samples were divided into two groups that include CT and CS. Tissues (brain, gill, liver, and muscle) were each added to one specific tube, and total RNA extracts from CT (n = 6) and CS (n = 6) brain, gill, liver, and muscle tissue samples were used for transcriptome gene library construction according to high-throughput next generation sequencing (NGS) platform. The transcriptome assembly of eight libraries was submitted to the NCBI short read archive database (accession number: SUB 9446960). High-quality read sequence data for unigene assembly were used, and they were screened for differentially expressed genes and utilized for Gene Ontology (GO) annotation and pathway enrichment analysis. Additionally, Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups (COG) annotation and prediction analyses of encoded proteins were performed using gene function annotation. MicroSAtellite (MISA) was used to search for simple sequence repeat (SSR) or microsatellite DNA markers on the reference sequence of the transcript. Primer3 (v2.3.7) [[65]27] was used to analyze the genes for the detected SSR markers and for the design of the primer pair for the locus. HISAT (v0.1.6) software [[66]28] was used to compare clean reads for genes, and GATK (v3.4-0) [[67]29] was used to detect single nucleotide polymorphisms (SNPs) and to filter low-quality SNPs. 2.5. Transcript Database Gene Differential Expression From the Nile tilapia transcript gene library, the maximum and minimum log2 ratio (CT/CS) 2 of the expression of the cold-tolerance group (CT) and the cold-sensitivity group (CS) was selected. Digitized gene expression (DGE) analysis and selection of genes exhibiting differential expression were conducted. Gene differential performance analysis was used to identify genes possessing different expression levels in different samples, and their expression levels according to FPKM were expressed. Fragments per kilobase of exon per million fragments mapped (FPKM) was calculated according to the following formula: [MATH: FPKM=total exo n readsmapped reads(millions)×ex on leng< mi>th(KB) :MATH] Total fragments represent the number of fragments that are uniquely aligned to the gene, mapped reads represent the total number of fragments that are uniquely aligned to all genes, and exon length is the number of bases within the gene [[68]30]. 2.6. Reverse Transcription Polymerase Chain Reaction A high-capacity cDNA reverse transcription kit (Applied Biosystems—Life Technologies, Carlsbad, CA, USA) was used for reverse transcription. The total reaction volume was 20 µL, and the components were 1 µg of RNA, 2 µL of 10× RT buffer, 0.8 µL of 25× dNTP mix (100 mM), 2 µL of 10× RT Random primer, 1 µL of MultScribe^TM reverse transcriptase (50 U/µL), and 4.2 µL of Nuclease-free water. Reaction was performed using Veriti^® thermal cycler (Applied Biosystem—Life Technologies), and the reaction conditions were 25 °C for 10 min, 37 °C for 120 min, and 85 °C for 5 min. After the reaction, the sample was diluted 100-fold and stored at −20 °C for subsequent use. 2.7. Real-Time Quantitative Polymerase Chain Reaction To verify the credibility of the RNAseq results, six functional genes (CL10781_10, CL1487_25, CL5212_1, CL5902_1, Unigene196, and Unigene7071) were used for qPCR expression analysis and screened based on: (1) expression ability in four tissues, including brain, gill, liver, and muscle; (2) differences in expression between the cold-tolerance group (CT) and the sensitive group (CS); (3) the presence of SSR or SNP molecular markers. The tested samples of cold-tolerance (CT, n = 6) and cold-sensitivity (CS, n = 6) previously used for RNAseq were used for qPCR analysis. The total reaction volume was 20 µL, and the components were 10 µL of 2× Power SYBR^® Green PCR Master Mix, 5 µL of cDNA (diluted 100-fold), 1.2 µL of forward primer (0.5 µM), 1.2 µL of reverse primer (0.5 µM) and 2.6 µL of RNase-free water. A Roche LightCycler^® 480 Real-Time PCR System (Roche Applied Science, Mannheim, Germany) was used for the reaction. The reaction conditions were 50 °C for 2 min in the first stage, 95 °C for 10 min in the second stage, and 40 cycles at 95 °C for 15 s and 60 °C for 30 s in the third stage. The obtained Ct value was used as the mean and standard deviation (SD), and the data from the control group (18S rRNA and β-actin) were deducted as a relative quantification expression graph. The primer pairs of candidate target and internal reference genes employed are listed and shown in [69]Supplementary Table S1. 2.8. Genomic DNA Extraction Approximately 100 mg of the tail fin of the test fish was cut and placed into a 1.5 mL microcentrifuge tube containing 800 µL of 70% ethanol solution. Information regarding the source of the sample that included the sample number, sample name, and sampling date was attached to each centrifuge tube. After computer labeling, the samples were stored at −20 °C and frozen for genomic DNA extraction. The MasterPure^TM DNA Purification Kit (Epicenter, Madison, WI, USA) was used to extract genomic DNA from Taiwan tilapia. The fin sample was placed into a new 1.5 mL microcentrifuge tube, and 800 µL of tissue and cell lysis solution (Epicenter) and 2 µL of proteinase K (Epicenter) were both added. The samples were mixed well and placed into a 55 °C oven for 6–12 h. After the tissue was completely dissolved, 400 µL was pipetted into a new centrifuge tube, and 250 µL of MPC protein precipitation reagent (Epicenter) was added and mixed thoroughly. The mixture was centrifuged at 10,000× g and 4 °C for 10 min. Thereafter, 500 µL of supernatant was pipetted into a new 1.5 mL microcentrifuge tube, and 500 µL of isopropanol (Sigma-Aldrich, Saint Louis, MO, USA) was added and mixed several times to obtain dehydrated DNA pellets. The pellets were centrifuged to the bottom of the tube using a microcentrifuge (Spin mini), and the supernatant was poured out. Next, 100 µL of 70% alcohol was added to the 1.5 mL microcentrifuge tube for two washes. After centrifugation to remove residual alcohol, the tube was placed in an oven at 55 °C for 10 min to completely dry the alcohol. Subsequently, 200 µL of ddH[2]O was added, and the tube was placed into a dry bath at 37 °C for 20 min to dissolve the DNA. A Maestro Nano Spectrophotometer (Maestrogen, Las Vegas, NV, USA) was used to measure the OD[260] and OD[280] absorbance values and to calculate the DNA concentration. Then, 0.8% agarose gel electrophoresis was used for electrophoresis separation. GelRed Nucleic Acid Gel Stain (Biotium, Inc., Fremont, CA, USA) was used after staining for 30 min, and the Slite140 Compact Gel Documentation System (Avegene Life Science, New Taipei City, Taiwan) colloid camera system was used to assess the quality of genomic DNA. After measuring the computer label of the sample source information that includes tissue number, sample name, extraction date, and other information, the samples were stored in a frozen state at −20 °C for subsequent testing. 2.9. Microsatellite Marker DNA Genotyping The microsatellite markers that were developed using two low-temperature tolerance-related microsatellite markers and 13 transcript functional databases targeted 269 low-temperature-tested Taiwan tilapia (RF, n = 120; YH, n = 102; NT, n = 47) for genotyping analysis. Based on the multiple fluorescent labeling method, the first PCR amplification used the forward primer containing the adaptor to bind gDNA fragments. The total volume of the reaction solution was 10 µL, and the composition was 5 µL of 2× Taq DNA polymerase Mastermix (Bioman, Taipei, Taiwan), 0.3 µL of forward primer, 0.3 µL of reverse primer, 2 µL of template DNA, and 2.4 µL of ddH[2]O. PCR was performed using a 96-well polymerase chain reactor (Veriti^® thermal cycler; Applied Biosystems—Life Technologies). The reaction conditions were 94 °C for 3 min, 94 °C for 45 s, annealing temperature (T[a]) °C for 30 s, 72 °C for 30 s, and 72 °C for 7 min. The four steps were cycled 30 times. When the second PCR amplification was performed, the forward primer was changed to fluorescent primers that included JOE (green), FAM (blue), ROX (red), NED (yellow), and four different fluorescent primers. After labeling, the total volume of the reaction solution was 10 µL, and the components were 5 µL of 2× Taq DNA polymerase Mastermix (Bioman), 0.3 µL of fluorescent primer, 0.3 µL of reverse primer, 2 µL of the first PCR product diluted 10-fold as template DNA and 2.4 µL of ddH[2]O. A 96-well polymerase chain reactor (Veriti^® thermal cycler; Applied Biosystems—Life Technologies) was used for PCR, and the reaction conditions were the same as those used for the first PCR amplification. After the PCR products were separated using electrophoresis with a 2% agarose colloid, they were stained with GelRed Nucleic Acid Gel Stain (Bioman) for 30 min, photographed, and assessed using the Slite 140 Compact Gel Documentation System (Average Life Science, New Taipei City, Taiwan) colloid camera system. The probability of allele identification errors was reduced, and each sample was subjected to a second PCR analysis. If the same result was obtained twice, the microsatellite genotype of this sample was determined. After mixing four different fluorescently labeled PCR products, 5 µL of these products were further mixed with 0.2 µL of GeneScan^TM-500 LIZ^TM size standard (Applied Biosystems, Foster City, CA, USA) and 10.8 µL HiDi^TM Deionized formamide (Applied Biosystems, Foster City, CA, USA) and centrifuged. After heating at 94 °C for 3 min, the samples were quickly placed on ice for 5 min and then placed on an ABI PRISM^® 3730xl DNA Analyzer (Applied Biosystems, Foster City, CA, USA) automatic DNA analyzer for short tandem repeat (STR) capillary electrophoresis separation. The obtained microsatellite marker data were represented by the analogy of A, B, C, and so forth according to the size of the allele fragments, using Geneious software for the interpretation and analysis of multiple fluorescent PCR polymorphic marker genotypes. 2.10. Single Nucleotide Polymorphism Markers Genotyping The principle of typing is to perform a single-base extension reaction at the SNP site and then use mass spectrometry to analyze the molecular weight of the product. As a single base extends four bases (A, T, C, and G), the molecular weights are all different (ddATP: 475.18 g/mol, ddCTP: 451.16 g/mol, ddGTP: 491.18 g/mol, ddTTP: 466.17 g/mol), and SNP genotypes can be analyzed based on this feature. SNP stereotype ratio, HM call, and LM call respectively represent the ratio of homozygous high-molecular-weight base type to low molecular weight base type. For example, when assessing at SNP of the mutual conversion of cytosine (C) and thymine (T), the molecular weight of the T base is greater than is the molecular weight of the C base. Based on this, the T base will be set as high mass (HM), and the C base will be set as low mass (LM). Thirty-seven single-nucleotide polymorphism gene-based markers were selected from the transcript database for use in the targeting of 190 Taiwan tilapia populations (RF = 72, YH = 96, NT = 22) that were tested under low temperature using the Sequenom MassARRAY platform and iPLEX gold chemistry (Sequenom, San Diego, CA, USA) for genotyping analysis. Design-specific PCR primers and extension primers created using the Assay Designer software package (v.4.0) (Premier Biosoft International, Palo Alto, CA, USA) were used to configure a total volume of 5 µL of the reaction solution for multiplex PCR reactions. The composition was 1 µL of genomic DNA (10 ng/μL), 0.2 units of Taq polymerase, 2.5 pmol of PCR primer, and 25 mM of dNTP (Sequenom, San Diego, CA, USA). PCR reaction conditions were 94 °C for 4 min, 94 °C for 20 s, 56 °C for 30 s, 72 °C for 1 min, and 72 °C for 3 min. Steps 2–4 were cycled 45 times, and 0.3 U of shrimp alkaline phosphatase (SAP) was added to inactivate the excess dNTPs. The iPLEX enzyme, terminator mix, and extension primer (Sequenom, San Diego, CA, USA) were used for the single-base extension reaction. The PCR reaction conditions were 94 °C for 30 s, 94 °C for 5 s, 56 °C for 5 s, 80 °C for 5 s, and 72 °C for 3 min. Steps 3–4 were cycled five times, and then the program transitioned back to Step 2 for 40 cycles. A cation exchange resin was added to remove the residual salts. Then, 7 nL of the product was placed on the 384-well SpectroCHIP (Sequenom, San Diego, CA, USA), and a MassARRAY Analyzer 4 was used for analysis. Then, TYPER 4.0 analysis software ([70]http://sequenom.com/Genetic-Analysis/Applications/iPLEX-Genotyping /iPLEX-Literature, accessed on 15 October 2021) was used to read the mass spectrometry results for data analysis and SNP genotype result interpretation. The genotype call score parameter analysis result determines if it can be interpreted as a reliable result. The genotype call score calculation method was as follows: Genotype call score = P[MA] × P[YLD] × P[SKW] The genotype call score was calculated using three parameters that include the signal intensity parameter of the PYLD-peak, the molecular weight accuracy and signal sharpness of PMA-pea, and the signal intensity ratio of the two genotypes of PSKW-SNP. The software determines if each peak in the mass spectrum is a credible result and performs SNP interpretation based on these results. When the genotype call score is lower than the cut-off value that the software can trust, no genotype interpretation will be provided. If the genotype call score is above the cut-off value, the software can perform genotype interpretation. According to the genotype call score from low to high, the software will be aggressive, moderate, or conservative in the final analysis result as a reference. 2.11. Statistical Analysis The obtained genotype data were imported into PopGene32 software [[71]31] to statistically analyze the number of alleles (Na) and allele frequency (Ne) of each microsatellite locus, and the various evaluating coefficients of genetic diversity and miscellaneous data were observed. Observed heterozygosity (H[o]), expected heterozygosity (H[e]), polymorphism information content (PIC), and fixation index (F[IS] and F[ST]) were calculated as follows. Observed heterozygosity (H[o]): H[o] = N[het] ÷ (N[hom] + N[het]) where N[het] is the number of heterozygous individuals, and N[hom] is the number of homozygous individuals. Expected heterozygosity (H[e]): [MATH: He=1−< /mo>jnPi2 :MATH] where n is the number of alleles at each locus, and Pi is the frequency of the ith allele [[72]32]. Polymorphism information content (PIC): [MATH: PIC=1i=1< /mn>k pi 2i=1< /mn>k1j=i< /mi>+1k2p j2pj2 :MATH] where k is the number of alleles, and pi and pj are the gene frequencies of the ith and jth alleles [[73]33]. Fixed index (F[IS]): F[IS] = 1 − H[o] ÷ H[e] where H[o] is the observed heterozygosity, and H[e] is the expected heterozygosity [[74]34]. The phenotypic allele effect of SSR/SNP which was associated with cold-tolerance trait was estimated through comparison between the average phenotypic values over accessions with the specific allele. The IBM SPSS Statistics version 22.0 program (SPSS Inc., Chicago, IL, USA) was used for statistical analysis of the association between the phenotypes and the markers. According to ANOVA and Scheffe’s 95% confidence level (post-hoc test), genotype and body phenotype differences were significant. 3. Results 3.1. Phenotypic Differences in Cold-Tolerance as Assessed by the Loss of Balance Behavior in Response to Cooling Stress The results from the reduction test indicated that RF fish only reduced their food intake when the water temperature reached 18.3 °C, and only a small number of these fish would eat food at a water temperature of 12.3 °C. The fish began to roll over at 10.5 °C. At 8.6 °C, the rollover rate reached 55.5%, and at 7.7 °C, the rollover rate became 100%. YH fish only consumed less food when the water temperature reached 19.0 °C, and only a small number would eat when the temperature reached 13.3 °C. At 11.2 °C, the fish began to roll over. At 9.3 °C, the rollover rate reached 51.9 %, and at 8.0 °C, the rollover rate became 100%. The NT fish only consumed less food when the water temperature reached 18.7 °C, and only a small number of fish would eat at 12.2 °C. The fish began to roll at 11.5 °C. At 10.3 °C, the rollover rate reached 50.1%, and at 9.3 °C, the rollover rate became 100%. Among the three low-temperature test groups, the RF Taiwan tilapia group exhibited the best cold-tolerance with an average temperature tolerance of 8.74 ± 0.55 °C, and this was followed by the YH Taiwan tilapia group with an average temperature tolerance of 9.36 ± 0.72 °C. The NT Taiwan tilapia population possessed the worst low-temperature tolerance with an average tolerance temperature of 10.16 ± 0.45 °C ([75]Figure 1). Figure 1. [76]Figure 1 [77]Open in a new tab The phenotypic distribution of the measured values under cold conditions based on the cold-tolerance temperature of 22 in the Taiwan tilapia family RF (blue), YH (grey) and NT (brown) populations. The top 3 families (Family 1, 2, and 8) and the last 3 families (Family 19, 21, and 22) were sorted according to cold-tolerance temperature and defined as cold-tolerance (CT) and cold-sensitivity (CS), respectively. 3.2. Transcriptome Sequencing Analysis Overview 3.2.1. RNAseq Retrieval, Pre-Processing, Assembly, and Annotation of the Unigenes In this study, the cold-tolerant fish (those that can tolerate 8.74 ± 0.55 °C) exhibit lower temperature tolerance than do the cold-sensitive group of fish (that can tolerate 10.16 ± 0.45 °C). The collection of tissue samples used for transcriptome sequencing was based on the lowest tolerable temperature (10 °C) of the cold-sensitive group. Concurrently, cold-tolerant group fish samples were collected. The differences in measured gene expression were indeed caused by the differences in tolerance. Using the Illumina Hiseq 2000 next-generation platform, the transcriptome gene library was constructed for the total RNA of the brain, gill, liver, and muscle tissues of the two groups of cold-tolerance (CT) and cold-sensitivity (CS) fish. After quality trimming and filtering of low-quality reads, 234,765,554 high-quality paired-end (PE) reads were generated (each with a length of 150 bp), and a total of 35,214,833,100 nucleotides of data ([78]Table 1) were generated for assembly from scratch. Table 1. Statistical summary of sequencing reads after filtering in Taiwan tilapia (Oreochromis spp.). Sample ^1 Clean Reads ^2 Clean Bases ^3 Q20 (%) ^4 Q30 (%) ^5 GC (%) ^6 RL (bp) ^7 CT-B 29,473,122 4,420,968,300 98.88 96.28 49.36 150 CT-G 29,944,552 4,491,682,800 98.69 95.63 51.22 150 CT-L 29,536,538 4,430,480,700 98.76 95.83 50.54 150 CT-M 29,268,400 4,390,260,000 98.9 96.27 52.25 150 CS-B 29,547,582 4,432,137,300 98.72 95.85 49.57 150 CS-G 28,697,764 4,304,664,600 98.8 96.04 49.92 150 CS-L 29,117,880 4,367,682,000 99.05 96.72 50.07 150 CS-M 29,179,716 4,376,957,400 98.81 96.02 52.11 150 [79]Open in a new tab ^1 CT-B, brain tissue of the cold-tolerance group; CT-G, gill tissue of the cold-tolerance group; CT-L, liver tissue of the cold-tolerance group; CT-M, muscle tissue of the cold-tolerance group; CS-B, brain tissue of the cold-sensitive group; CS-G, gill tissue of the cold-sensitive group; CS-L, liver tissue of the cold-sensitive group; CS-M, muscle tissue of the cold-sensitive group. ^2 Total number of paired-end reads after filtering the sequencing reads which containing low-quality, adaptor-polluted and high content of unknown base (N) reads. ^3 Total clean bases (nt) = Total clean reads × Read length. ^4 The rate of bases which quality is greater than 20. ^5 The rate of bases which quality is greater than 30. ^6 The percentage of G and C bases in all unigenes. ^7 RL, read length. The length distribution statistics of the unigene transcripts after assembly are presented in [80]Supplementary Figure S2. A total of 128,147 unigenes were obtained, and these possessed a total length of 185,382,926 nt, an average length of 1446 nt, an N50 of 3157 nt, and a GC content of 47.46% ([81]Supplementary Table S2). Unigene was used to compare the distribution of genes based on blastx NR annotation, and the species comparisons utilized Oreochromis niloticus, Haplochromis burtoni, Neolamprologus brichardi, Maylandia zebra, and other species ([82]Figure 2). The analysis results revealed that 48,521 single genes exhibited higher homology and the highest similarity to Oreochromis niloticus, where they accounted for 64.99% of the total. By comparing unigenes to the nucleic acid database Nt (p-value < 0.00001) through the use of blastn, the protein possessing the highest sequence similarity to that of the unigenes can be determined, and information regarding the protein function of the unigene can be obtained. A total of 71,890 predicted CDS were compared to listings in the protein database, and 1389 predicted CDS were identified for a total of 73,279. Figure 2. [83]Figure 2 [84]Open in a new tab Distribution of annotated species. The distribution of annotated species is characterized statistically using NR annotation. Oreochromis niloticus: 64.99%; Neolamprologus brichardi: 6.11%; Haplochromis burtoni: 6.16%; Maylandia zebra: 4.6%; other: 18.14%. In terms of function annotation, unigenes included protein function annotation and COG function annotation. Unigenes were annotated to the NR, NT, Swiss-Prot, COG, GO, and KEGG databases ([85]Supplementary Figures S3–S5), and we counted the number of unigenes annotated to each database and also the total number of annotations ([86]Table 2). The cross-comparison analysis of the Venn diagrams of the four major protein databases yielded 25,844 co-annotated unigenes ([87]Figure 3). Table 2. Functional annotation information of the Taiwan tilapia transcriptome dataset. Transcriptome Dataset Unigene Number Percentage (%) NR ^1 74,656 58.26 NT ^2 97,575 76.14 Swiss-Prot ^3 61,098 47.68 COG ^4 26,342 20.56 CO ^5 7154 5.58 KEGG ^6 59,699 46.59 Overall (total annotation) ^7 100,108 78.12 Total 128,147 100 [88]Open in a new tab ^1 NR: Unigenes with NCBI non-redundant protein. ^2 NT: Nucleotide Database. ^3 Swiss-Prot: A curated protein sequence database which strives to provide a high level of annotation. ^4 COG: Clusters of Orthologous Groups. ^5 CO: Gene Ontology. ^6 KEGG: Kyoto Encyclopedia of Genes and Genomes. ^7 The number of unigenes which were annotated in at least one functional database. Figure 3. [89]Figure 3 [90]Open in a new tab Annotation of assembled Taiwan tilapia unigenes. The unigenes were annotated according to different protein databases that included NR, KEGG, COG and Swiss-Prot and are presented in a Venn diagram. NR: Unigenes with NCBI non-redundant protein. KEGG: Kyoto Encyclopedia of Genes and Genomes. COG: Clusters of Orthologous Groups. Swiss-Prot: A curated protein sequence database which strives to provide a high level of annotation. Based on the COG database, a large proportion of sequences within the transcriptome of Taiwan tilapia fish participate in the functional classification of “general function prediction only”. Additionally, “replication, recombination, repair”, “transcription”, “signal transduction mechanisms”, “cell cycle control, cell division, chromosome partitioning” and “translation, ribosomal structure and biogenesis” pathways were identified ([91]Supplementary Figure S3). The functional classification of these transcripts according to the GO database indicates that “cellular process”, “single-organism process” and “metabolic process” are the primary involved functions. Biological processes such as “cell”, “cell part”, “membrane” and “membrane part” are the primary cellular components involved, and “binding” and “catalytic activity” are the primary molecular functions involved ([92]Supplementary Figure S4). 3.2.2. Detection of Microsatellites and Single Nucleotide Polymorphism Markers Using the assembled unigene as the reference sequence, the MicroSAtellite (MISA) software was used to search for a total of 38,377 microsatellite markers containing one (single), two (double), three, four, five, and six base repeats. The analysis identified 8745 single-base repeats, 17,610 double-base repeats, 10,045 three-base repeats, 1182 four-base repeats, 650 five-base repeats, and 145 six-base repeats ([93]Figure 4A). Figure 4. [94]Figure 4 [95]Open in a new tab Identification of simple sequence repeats (SSRs) and single nucleotide polymorphism (SNPs) quantity statistics in the transcriptome of Taiwan tilapia. (A) The histogram presents the distribution of the total number of SSRs within different classes (repeated nucleotide types). The X-axis is the specific repeated nucleotide types, which were defined as mone for one-, di for two-, tri for three-, quad for four-, penta for five-, and hexa for six-nucleotides and the Y-axis indicates the SSR motif numbers. (B) The histogram presents the distribution of the total number of SNPs within different classes (transition and transversion). The X-axis is the SNP variation type, and the Y-axis indicates the SNP numbers. Additionally, after comparing clean reads to genes using the HISAT (v0.1.6) software, GATK (v3.4-0) software was used to search for single nucleotide polymorphisms (SNPs) and to filter low-quality SNPs. There were 28,093 and 27,746 identified transitions for adenine (A)/guanine (G) and cytosine (C)/thymine (T), respectively. There were also 5783, 5437, 6444, and 5822 identified transversions of adenine (A)/cytosine (C), adenine (A)/thymine (T), cytosine (C)/guanine (G), and guanine (G)/thymine (T), respectively ([96]Figure 4B). 3.3. Transcriptome Responses to Temperature Decreases 3.3.1. Differential Gene Expression between Cold-Tolerant and Sensitive Fish To compare the cold-tolerance (CT) and cold-sensitivity (CS) groups of Taiwan tilapia in the context of the brain, gill, liver, and muscle (CT-B vs. CS-B, CT-G vs. CS-G, CT-L vs. CS-L, and CT-M vs. CS-M), differential expression of transcripts was assessed. Parameters that included a false discovery rate (FDR) < 0.05, log[2] fold change > 1, or log[2] fold change < − 1 were used to screen for differentially expressed genes (DEGs) from the DE-seq analysis. All unigenes represent the results for up-regulation and down-regulation in regard to differential expression distribution based on quantitative analysis ([97]Figure 5). Totals of 4191, 4235, 3164, and 2439 differentially expressed genes were detected in the brain, gill, liver, and muscle tissues, respectively. The numbers of regulatory genes within the four tissues of the cold-tolerance (CT) group were 2081, 2261, 1824, and 1200, respectively. There were 2110, 1974, 1340, and 1239 downregulated genes, respectively ([98]Figure 5A,B). The results revealed that the low-temperature treatment process resulted in significant differences in gene expression between cold-tolerance (CT) and cold-sensitivity (CS) fish. Figure 5. [99]Figure 5 [100]Open in a new tab Statistics for the differentially expressed genes (DEGs). (A) MA plot of the DEGs. The X-axis represents value A (log[2] transformed mean expression level). The Y-axis represents value M (log[2] transformed fold change). Red, blue, and black points represent up-, down- and non-regulated DEGs, respectively. “CS-B, CS-G, CS-L, and CS-M” were the controls and “CT-B, CT-G, CT-L, and CT-M” were experimental groups in the “CS-vs-CT” paired comparison. (B) The numbers of differentially expressed genes (DEGs) between the two comparison groups in brain, gill, liver, and muscle tissues were determined. All DEGs were determined based on the results from the statistical analysis (FDR < 0.05). The X-axis represents the comparison of samples. The Y-axis represents the DEG numbers. The red color represents up regulated DEGs, and the blue color represents down-regulated DEGs. (C) Venn diagram of the expressed unigenes in brain, gill, liver, and muscle tissues. A total of 10,380 unigenes were expressed, and of these, 156 genes were commonly expressed in all the four tissues. Unigenes exhibiting |log[2] fold change| ≧ 1 were considered to be differentially expressed in each tissue. From a total of 10,380 differentially expressed genes, the Venny online software (2.1) ([101]https://bioinfogp.cnb.csic.es/tools/venny/, accessed on 15 October 2021) was used to cross-check CT-B vs. CS-B, CT-G vs. CS-G, CT-L vs. CS-L, CT-M vs. CS-M, and other. Venn diagram analysis of upregulated and downregulated genes in the four tissues revealed that 156 (1.5%) genes could overlap among the four tissues ([102]Figure 5C). 3.3.2. Differential Expression of Functional Genes Containing SSRs and SNPs Among the 156 genes exhibiting differential performance in the four tissues described above and combined with the microsatellite and single nucleotide polymorphic marker database previously explored from transcripts, from the 38-cold tolerance-related functional genes that were identified using differential gene expression analysis in the four tissues, 13 genes with 13 microsatellites ([103]Table 3) and 25 genes with 37 single nucleotide polymorphism markers ([104]Table 4) were identified. Of these, six unigenes contained two to five SNP markers. Each unigene containing SSR and SNP markers corresponded to the observed expression level ([105]Figure 6). Table 3. Overall information of simple sequence repeat (SSR) marker genes in Taiwan tilapia transcriptome database. No. Unigene ID SSR Length Position LG ^1 Location Gene Annotation ^2 1 CL138_1 (A)[n] 2492 1902 LG7 3′-UTR CTD small phosphatase-like protein 2-A 2 CL1487_25 (GT)[n] 8664 1862 LG16 5′-UTR Nuclear pore complex protein Nup98-Nup96 isoform X6 3 CL1876_16 (TTC)[n] 7751 7251 LG4 3′-UTR Myosin-10 isoform X3 4 CL5902_1 (G)[n] 2572 569 LG12 3′-UTR Ubiquitin-conjugating enzyme E2 G1 5 CL9541_2 (CAG)[n] 5825 2164 LG17 Exon AT-rich interactive domain-containing protein 2 6 CL10781_10 (A)[n] 9102 8403 LG8 3′-UTR Fatty acid synthase isoform X1 7 CL279_7 (GCA)[n] 4772 2266 LG23 Exon R3H domain-containing protein 1 8 CL2262_1 (T)[n] 3105 3083 LG2 3′-UTR TNFAIP3-interacting protein 1 isoform X1 9 Unigene7071 (GT)[n] 2729 1722 LG14 Intron Protein IWS1 homolog isoform X3 10 Unigene196 (CA)[n] 5125 5073 LG14 3′-UTR Serine/threonine-protein kinase SIK3 isoform X3 11 CL2061_8 (TC)[n] 1393 102 LG7 5′-UTR Glucose-6-phosphate isomerase-like 12 CL9318_1 (ATT)[n] 706 530 LG12 3′-UTR Bifunctional methylenetetrahydrofolate dehydrogenase/cyclohydrolase, mitochondrial-like 13 CL241_10 (GAG)[n] 5371 3079 LG8 Exon Rho GTPase-activating protein 17 isoform X1 [106]Open in a new tab ^1 Linkage group. ^2 Assembled unigenes were aligned against the NCBI non-redundant nucleotide sequence database (Nt) by BLASTn with an E-value cut off of 10−5. Then they were searched in public databases including NR, COG, GO, and KEGG through BLASTx under the same criterion as BLASTn. Table 4. Overall information of single nucleotide polymorphism (SNP) marker genes in Taiwan tilapia transcriptome database. SNP Unigene ID Length LG ^1 Allele Location Change ^2 Gene Annotation ^3 1 CL4970 3236 2 T/C UTR - pre-mRNA-splicing factor RBM22 2 CL6296_2 3447 23 d/A UTR - lipoamide acyltransferase component of branched-chain alpha-keto acid dehydrogenase complex, mitochondrial 3 CL2980_2 1540 3 G/C Exon P/P NADH dehydrogenase 4 CL6447_4 3573 ND C/T Intron - WD repeat domain phosphoinositide-interacting protein 2 isoform X1 5 CL4350_5 3286 23 G/A UTR - protein Jade-3 6 CL4943_1 2166 2 A/G UTR - NEDD4 family-interacting protein 1 isoform X2 7 CL179_6 5803 5 G/A Exon V/I period circadian protein homolog 3-like isoform X4 8 CL4788_1 1945 4 d/G UTR - hsc70-interacting protein isoform X2 9 CL4788_1 1945 4 d/A UTR - 10 CL5212_1 1774 2 d/A UTR - clathrin light chain B-like isoform X4 11 CL11264_1 2075 2 G/A UTR - cyclin-G1-like 12 CL11264_1 2075 2 C/T UTR - 13 CL11264_1 2075 2 C/A UTR - 14 CL11264_1 2075 2 T/C UTR - 15 CL11264_1 2075 2 G/A Exon I/V 16 CL3637_1 10,003 23 T/A UTR - protein PRRC2C 17 CL715_12 5227 13 d/A Exon S/K adipocyte plasma membrane-associated protein isoform X2 18 CL715_12 5227 13 d/C UTR - 19 CL715_12 5227 13 d/C UTR - 20 CL715_12 5227 13 d/G UTR - 21 CL8349_2 3478 7 T/C UTR - ATP-dependent Clp protease ATP-binding subunit clpX-like, mitochondrial isoform X1 22 CL4663_1 3831 ND A/T UTR - bone morphogenetic protein receptor type-1A 23 CL11264_2 2074 2 A/G UTR - cyclin-G1-like 24 CL11264_2 2074 2 C/T UTR - 25 CL11264_2 2074 2 G/T UTR - 26 CL4356_2 1809 ND d/G UTR - bone morphogenetic protein receptor type-1A 27 CL1248_2 3831 2 C/T Intron - dynactin subunit 4 isoform X2 28 CL2394_12 8460 13 G/A Exon R/H spectrin beta chain, non-erythrocytic 1 29 CL2394_12 8460 13 C/T Exon V/I 30 CL4788_2 2157 4 A/G Exon A/A hsc70-interacting protein isoform X2 31 CL144_1 1772 13 d/G UTR - E3 ubiquitin-protein ligase UBR2 isoform X3 32 CL4659_4 3929 13 d/T Exon F/F echinoderm microtubule-associated protein-like 4 isoform X3 33 CL179_13 1906 5 d/T UTR - period circadian protein homolog 3-like isoform X4 34 CL569_1 3529 13 d/A Exon P/P cullin-9 isoform X1 35 CL2216_1 1929 8 T/C Exon S/T dynamin-binding protein isoform X3 36 CL4659_13 4077 13 d/T Exon D/D echinoderm microtubule-associated protein-like 4 isoform X3 37 CL4659_13 4077 13 d/A Exon T/N [107]Open in a new tab ^1 Linkage group. ^2 Amino acid change. ^3 Assembled unigenes were aligned against the NCBI non-redundant nucleotide sequence database (NT) by BLASTn with an E-value cut off of 10^−5. Then they were searched in public databases including NR, COG, GO, and KEGG through BLASTx under the same criterion as BLASTn. Figure 6. [108]Figure 6 [109]Open in a new tab Expression analysis of Taiwan tilapia transcripts containing (A) microsatellites (SSRs) and (B) single nucleotide polymorphisms (SNPs) loci from brain, gill, liver, and muscle tissues between the cold-tolerance (CT) and cold-sensitivity (CS) groups based on their relative FPKM values. The transcripts were hierarchically clustered based on correlation distance and average linkage method. Blue indicates the lowest level of expression, white indicates an intermediate level of expression, and red indicates the highest level of expression. FPKM, fragments per kilobase of transcript per million mapped reads. MicroSAtellite (MISA) and GATK (v3.4-0) software programs were used to label the microsatellite (SSR) and single nucleotide polymorphism (SNP) variant sequences and to determine other biological information regarding the 156 differentially expressed genes selected above. The analysis revealed that there were 13 and 25 genes possessing SSR and SNP sequences, respectively. The online software ClustVis was used to perform pattern clustering analysis of the FPKM value of the marker gene and to create a heat map to indicate the gene difference performance ([110]Figure 6A,B). Additionally, based on the results of the gene annotation database and the NCBI BLAST tool used to compare DNA sequence data, Sequence Viewer was used to annotate genes to the RefSeq database, where 13 candidate SSRs ([111]Figure 6A) and 37 candidate SNPs were marked ([112]Figure 6B). Gene function annotations located in gene exons, introns, 3′- and 5′-untranslated regions (UTR), and other positions are listed in [113]Table 3 and [114]Table 4, respectively. 3.3.3. Validation of the Transcriptome Sequencing Results Using Real-Time qPCR To verify the results obtained by RNAseq, six genes were randomly selected for qPCR analysis. The results revealed that the expression and transcription levels of RNAseq and qPCR genes in the brain, gill, liver, and muscle tissues of the cold-tolerance (CT) and cold-sensitivity (CS) strains were similar for each gene, and the two data sets were highly correlated (R^2 = 0.9794, p < 0.001, [115]Figure 7). This result indicates that the RNAseq results were reliable. Figure 7. [116]Figure 7 [117]Open in a new tab Validation of the FPKM value for the CT and CS of comparative RNAseq transcriptome data in brain (A), gill (B), liver (C), and muscle (D) tissues using real-time qPCR analysis for relative gene expression. The mRNA expression levels of six unigenes exhibiting different expressions in our RNAseq results were confirmed by qPCR analysis-based comparisons of the CT and CS (control) groups. The data are expressed as log fold changes and are represented as standard deviations of the means from three biological replicates. In this experiment, the results of target gene sequence amplification revealed that the number of molecules can be doubled in each replication cycle and that this process exhibits excellent amplification efficiency, thus indicating that the primer design of this experiment is appropriate and of high quality and that there are no problems with secondary structures. The optimal qPCR reagent concentration and reaction conditions yielded ideal results. 3.4. Correlation between the Genotypes of the Polymorphic RNAseq Markers and Cold-Tolerance with Significant Genetic Variation in Taiwan Tilapia 3.4.1. Identification of Candidate SSR Markers Involved in Cold-Tolerance Using 13 microsatellite markers to analyze the diversity of the three Taiwan tilapia populations (RF, YH, and NT), it was determined that the Unigene196 marker can produce the largest number of alleles (16) and genotype combinations (44), while CL1876_16 and CL279_7 yielded only two allele numbers and two genotype combinations ([118]Table 5). Previous studies have developed two cold-tolerance markers that are identified as UNH916 and UNH999 [[119]19]. According to the test results of this experiment, it was determined that UNH916 and UNH999 can produce 12 and 14 alleles in the Taiwan tilapia test population, respectively. Among them, the RF population possessed the two markers with the most alleles (9 and 12, respectively), and the NT population possessed the least alleles (2 and 3). The E, H, J, and L alleles marked by UNH916 were only observed in the RF population, and allele I was only observed in the YH population. The I and N alleles marked by UNH999 were only observed in the RF and YH populations, respectively ([120]Supplementary Table S3). Table 5. Genomic SSR markers used for genetic variation study in RF, YH, and NT populations of Taiwan tilapia. Locus ^1 Ho He PIC F[IS] p Value ^2 RF YH NT RF YH NT RF YH NT RF YH NT UNH916 0.722 0.480 0.532 0.775 0.485 0.395 0.769 0.483 0.390 0.061 0.004 −0.362 0.345 UNH999 0.778 0.686 0.447 0.767 0.671 0.395 0.761 0.668 0.390 −0.022 −0.028 −0.144 0.103 CL1487_25 0.472 0.726 0.000 0.624 0.676 0.000 0.620 0.672 0.000 0.239 −0.079 — 0.537 CL1876_16 0.181 0.000 0.000 0.289 0.000 0.000 0.287 0.000 0.000 0.371 — — 0.004 ** CL5902_1 0.903 0.441 0.000 0.753 0.658 0.000 0.747 0.655 0.000 −0.208 0.326 — 0.456 CL9541_2 0.472 0.598 0.447 0.523 0.472 0.351 0.520 0.470 0.347 0.091 −0.273 −0.288 0.217 CL10781_10 0.278 0.559 0.489 0.412 0.592 0.599 0.410 0.589 0.593 0.322 0.051 0.175 0.178 CL279_7 0.000 0.000 0.000 0.130 0.000 0.000 0.129 0.000 0.000 1.000 — — 0.017 * CL2262_1 0.722 0.784 0.894 0.798 0.728 0.678 0.793 0.725 0.671 0.089 −0.082 −0.332 0.257 Unigene7071 0.694 0.196 0.575 0.749 0.201 0.431 0.744 0.200 0.427 0.066 0.021 −0.346 0.011 * Unigene196 0.750 0.686 0.553 0.776 0.710 0.454 0.771 0.706 0.449 0.027 0.028 −0.232 0.098 CL9318_1 0.667 0.284 0.000 0.759 0.283 0.000 0.754 0.282 0.000 0.116 −0.010 — 0.010 * CL241_10 0.319 0.382 0.575 0.383 0.367 0.505 0.381 0.365 0.499 0.161 −0.048 −0.151 0.255 Mean 0.535 0.448 0.347 0.595 0.449 0.293 0.591 0.447 0.290 0.178 −0.008 −0.210 SD 0.272 0.264 0.306 0.225 0.260 0.256 0.223 0.259 0.253 0.277 0.135 0.165 [121]Open in a new tab ^1 Locus symbol is derived from reference or abbreviated according to the unigene ID form transcriptome database. ^2 The p value is mean level of significance. Data sets that are significant at different levels: * p < 0.05, ** p < 0.01. The number of alleles (N[a]) was 2–13, 1–12, and 1–4, and the number of genotypes (N[g]) was 2–24, 1–20, and 1–5. The average observed heterozygosity (H[o]) values were 0.535 ± 0.27, 0.448 ± 0.26, and 0.347 ± 0.31, and the average expected heterozygosity (H[e]) values were 0.595 ± 0.23, 0.449 ± 0.26, and 0.293 ± 0.26, respectively. The average polymorphism average information contents (PIC) were 0.591 ± 0.22, 0.447 ± 0.26, and 0.290 ± 0.25, respectively. The average fixed index (F[IS]) values were 0.178 ± 0.28, −0.008 ± 0.14, and −0.210 ± 0.17 ([122]Table 5). IBM SPSS Statistics v22.0.0 and the Scheffe method (SPSS Inc., Chicago, IL, USA) both were used to analyze the genotypes and to verify the correlation with the minimum temperature tolerance of 13 microsatellite markers in the three Taiwan tilapia populations (RF, YH, and NT). The results revealed that myosin-10 isoform X3 (CL1876_16), R3H domain-containing protein 1 (CL279_7), protein IWS1 homolog isoform X3 (Unigene7071), bifunctional methylenetetrahydrofolate dehydrogenase/cyclohydrolase, mitochondrial-like (CL9318_1), and other functional gene markers were significantly associated with the cold-tolerance traits of Taiwan tilapia (p < 0.05) ([123]Table 5). 3.4.2. Genotype of the Gene-Based SNP Marker That Was Significantly Correlated with Cold-Tolerance The results of the SNP assay analysis of 37 candidate single nucleotide polymorphism markers related to low temperature tolerance traits revealed that the success rate of SNP marker gene typing was as high as 99.30%, and the reliability (conservative calls) averaged 87.98% ([124]Table 6). The average proportions of homozygous and heterozygous genotypes were 83.8% and 13.5%, respectively. The AutoCluster model was used to classify homozygote and heterozygote according to the population characteristics of the samples acquired from different Taiwan tilapia populations. The peak signal was used as the coordinate axis to draw a two-dimensional graph, and the genotype was then judged based on the clustering results ([125]Figure 8). Among the 37 single nucleotide polymorphism markers, 26, 20, and four markers in the RF, YH, and NT Taiwan tilapia populations, respectively, were polymorphic, and a total of 27 markers were polymorphic among the three test populations ([126]Table 6). Table 6. The number of genotype, the frequency of homozygous and heterozygous, and the overall data of cold-tolerance related tests were analyzed by using RNAseq single nucleotide polymorphism (SNP) marker loci in RF, YH and NT Taiwan tilapia populations. Unigene SNP High Mass Allele Calls Heterozygous Calls Low Mass Allele Calls Homozygous freq Heterozygous freq HW ChiSquare HW p-Value Trait p-Value RF YH NT RF YH NT RF YH NT RF YH NT RF YH NT CL4970 C/T 0 0 0 0 2 0 72 94 22 1.000 0.979 1.000 0.000 0.021 0.000 0.01 0.940 0.879 CL6296_2 A/d 0 0 0 0 0 0 72 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL2980_2 C/G 13 13 0 31 35 0 27 45 22 0.563 0.624 1.000 0.437 0.376 0.000 6.08 0.010 0.148 CL6447_4 C/T 25 49 0 28 36 22 15 10 0 0.588 0.621 0.000 0.412 0.379 1.000 0.00 1.000 0.056 CL4350_5 A/G 11 0 0 23 0 13 38 96 9 0.681 1.000 0.409 0.319 0.000 0.591 13.60 0.000 0.589 CL4943_1 A/G 37 73 22 28 20 0 2 3 0 0.582 0.792 1.000 0.418 0.208 0.000 0.06 0.800 0.919 CL179_6 G/A 58 89 22 13 7 0 1 0 0 0.819 0.927 1.000 0.181 0.073 0.000 0.23 0.630 0.591 CL4788_1_1 G/d 16 10 0 29 37 22 27 48 0 0.597 0.611 0.000 0.403 0.389 1.000 0.00 0.980 0.094 CL4788_1_2 A/d 0 0 0 0 0 0 72 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL5212_1 A/d 11 0 0 30 13 0 31 83 22 0.583 0.865 1.000 0.417 0.135 0.000 7.75 0.010 0.001 ** CL11264_1_1 G/A 38 95 22 28 1 0 6 0 0 0.611 0.990 1.000 0.389 0.010 0.000 8.15 0.000 0.378 CL11264_1_2 C/T 38 95 22 27 1 0 7 0 0 0.625 0.990 1.000 0.375 0.010 0.000 11.92 0.000 0.602 CL11264_1_3 C/A 37 95 22 27 1 0 7 0 0 0.620 0.990 1.000 0.380 0.010 0.000 11.81 0.000 0.599 CL11264_1_4 T/C 1 0 0 25 1 0 46 95 22 0.653 0.990 1.000 0.347 0.010 0.000 0.00 0.970 0.650 CL11264_1_5 G/A 7 0 0 26 1 0 38 95 22 0.634 0.990 1.000 0.366 0.010 0.000 12.91 0.000 0.490 CL3637_1 T/A 38 96 22 25 0 0 9 0 0 0.653 1.000 1.000 0.347 0.000 0.000 22.54 0.000 0.473 CL715_12_1 A/d 0 0 0 0 0 0 72 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL715_12_2 C/d 28 7 0 35 43 22 9 46 0 0.514 0.552 0.000 0.486 0.448 1.000 242.63 0.000 0.099 CL715_12_3 C/d 9 0 0 24 0 22 38 96 0 0.662 1.000 0.000 0.338 0.000 1.000 200.20 0.000 0.488 CL715_12_4 G/d 0 0 0 0 0 0 67 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL8349_2 C/T 0 28 12 6 50 10 66 18 0 0.917 0.479 0.545 0.083 0.521 0.455 13.43 0.000 0.366 CL4663_1 T/A 4 0 0 21 0 22 38 95 0 0.667 1.000 0.000 0.333 0.000 1.000 0.06 0.810 0.769 CL11264_2_1 G/A 2 0 0 27 1 0 43 95 22 0.625 0.990 1.000 0.375 0.010 0.000 0.38 0.540 0.650 CL11264_2_2 C/T 46 95 22 25 1 0 1 0 0 0.653 0.990 1.000 0.347 0.010 0.000 0.00 0.970 0.650 CL11264_2_3 G/T 41 95 22 25 1 0 1 0 0 0.627 0.990 1.000 0.373 0.010 0.000 0.00 0.950 0.650 CL4356_2 G/d 0 0 0 0 0 0 72 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL1248_2 T/C 4 0 0 27 3 0 40 93 22 0.620 0.969 1.000 0.380 0.031 0.000 2.83 0.090 0.581 CL2394_12_1 T/C 4 0 0 24 9 22 44 87 0 0.667 0.906 0.000 0.333 0.094 1.000 0.41 0.520 0.888 CL2394_12_2 A/G 4 0 0 24 0 22 44 96 0 0.667 1.000 0.000 0.333 0.000 1.000 0.01 0.920 0.888 CL4788_2 G/A 23 80 14 35 15 8 14 1 0 0.514 0.844 0.636 0.486 0.156 0.364 3.85 0.050 0.104 CL144_1 G/d 0 0 0 0 0 0 70 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL4659_4 T/d 4 0 0 23 0 22 44 96 0 0.676 1.000 0.000 0.324 0.000 1.000 199.71 0.000 0.926 CL179_13 T/d 0 0 0 0 0 0 72 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL569_1 A/d 0 0 0 0 0 0 72 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL2216_1 T/C 70 96 12 1 0 10 0 0 0 0.986 1.000 0.545 0.014 0.000 0.455 0.17 0.680 0.478 CL4659_13_1 T/d 0 0 0 0 0 0 72 96 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - CL4659_13_2 A/d 0 0 0 0 0 0 66 95 22 1.000 1.000 1.000 0.000 0.000 0.000 - - - [127]Open in a new tab The p value is mean level of significance. Data sets that are significant at different levels: ** p < 0.01. Figure 8. [128]Figure 8 [129]Open in a new tab Analysis of SNP markers. (A) The frequency of polymorphic and monomorphic SNPs genotyping results derived from 37 genes in Rui-Fang (RF), YiHua (YH), and NTOU (NT) Taiwan tilapia populations. HM call: High mass allele genotype calling; Hetero: Heterozygous genotype calling; LM call: Low mass allele genotype calling; No Calls: Failed for genotyping. The candidate gene marker (CL5212) related to cold-tolerance indicated by red triangles arrow. (B) AutoCluster analysis. The call cluster plot displays the single-nucleotide polymorphisms (SNPs) genotype data of 37 targeted unigenes assayed using the Sequenom MassARRAY^® platform from 72, 96, and 22 samples in the RF, YH, and NT populations, respectively. One spot corresponds to one sample. Homozygous samples possessing genotypes AA and BB are orange and blue, respectively. Heterozygous samples exhibiting genotype AB are shown in green; samples with missing genotypes are red. IBM SPSS Statistics v22.0.0 and the Scheffe test method (SPSS Inc., Chicago, IL, USA) were both used to analyze the correlation between the genotypes of 37 single nucleotide markers and the minimum tolerable temperature of the three Taiwan tilapia populations (RF, YH, and NT). The results revealed the SNPs in the transcript functional database. The gene marker CL5212_1 was significantly correlated with the low-temperature tolerance of the YH Taiwan tilapia population (p < 0.01) ([130]Table 6). 3.5. Verification of the SNP Marker Assisted Selection for Cold-Tolerant Strains in Taiwan Tilapia CL5212_1 marker from CT and CS species were used to select parental individuals possessing AA and dd genotypes, respectively, and to perform CS × CS, CS × CT, CT × CS, CT × CT positive and negative hybridization pairings to produce SC1 (n = 87), SC2 (n = 148), SC3 (n = 172), and SC4 (n = 110) ([131]Figure 9). The progeny fish of the four families were subjected to low-temperature tolerance test, and the minimum temperature tolerance of each fish was recorded. Genomic DNA was extracted from the collected fin samples to confirm the genotype. Significant differences in cold-tolerance were observed in the four tested hybrid families (p < 0.01) that originated from the SC1 family with homozygous A/A cold-tolerant genotypes, and the lowest temperature tolerance was 7.3 ± 0.23 °C. For SC2 and SC3 family fish possessing the heterozygous A/d genotype, the minimum tolerable temperatures were 8.35 ± 0.49 °C and 8.47 ± 0.31 °C, respectively. For the SC4 family fish possessing the homozygous d/d cold-sensitive genotype, the minimum tolerable temperature was 10.47 ± 0.70 °C. This study demonstrates that the CL5212_1 marker is a selection marker that can be used for a marker-assisted breeding platform. Figure 9. [132]Figure 9 [133]Open in a new tab Illustration of the SNP marker-assisted crossbreeding scheme used to obtain the four reciprocal crosses (SC1 [CT × CT], SC2 [CT × CS], SC3 [CS × CT], and SC4 [CS × CS] families [marked as dam × sire]) of CT and CS strains in Taiwan tilapia that were compared for their cold-tolerance in this study. The means (±SD) of the lowest temperature tolerance of the progeny of each hybrid family is presented as a box-and-whisker diagram, and significant differences (p < 0.05) between crosses are marked with different letters. 4. Discussion Tilapia is an important farmed fish species in tropical and subtropical regions. It is often exposed to prolonged or repeated cold stress at low temperatures during winter, and this stress can result in a reduction or a complete halt of food intake. Physiological responses such as stunting and death can seriously affect yield and quality [[134]7,[135]18,[136]35,[137]36]. Sensitivity to low temperatures is the primary limiting factor that hinders the introduction and distribution of tilapia in temperate and high-elevation areas. Since the late 1990s, several Nile tilapia strains exhibiting low temperature adaptability have been domesticated and cultivated [[138]37,[139]38]. Understanding the genetic basis for intra-species phenotypic variation is a long-standing goal in biological breeding [[140]39,[141]40]. In addition to providing abundant genetic resources, body phenotypic variation that has undergone long-term evolution or multi-generational improvement also plays a key role in the contribution of gene expression pattern variation [[142]41]. Research comparing the cold-tolerance characteristics among the different Nile tilapia species located in different geographic locations such as Egypt, Ghana, and the Ivory Coast revealed that Nile tilapia have survived through natural selection for many years, and the physiologies and lethal temperature ranges reflect different cold-tolerant phenotypic characteristics acquired from their ancestral populations [[143]42,[144]43]. Although it is well known that there are intraspecies phenotypic differences in temperature tolerance [[145]44,[146]45,[147]46,[148]47], there is still a lack of information regarding this genetic variation. The physiological basis of this study was the long-term collection of populations possessing different cold-tolerant phenotypes. After multiple generations of low-temperature pressing and parental family selection, cold-tolerant and cold-sensitive strains were established. Since 2007, the Taiwan tilapia research team has been cooperating with the Fisheries Research Institute of the Executive Yuan Agriculture Committee to perform genetic improvement studies for the selection of growth traits for Nile tilapia broodstock [[149]25,[150]26]. In addition to the pure Nile strain (code name NT), samples of a variety of Taiwan tilapia species were continuously collected from northern Taiwan (Rui-Fang wild field, code RF) and southern Taiwan (private commercial seed breeding farm, code YH). These fish are maintained in the core breeding greenhouse of the Department of Aquaculture at National Taiwan Ocean University and the Aquatic Organism Research and Conservation Center in Gongliao. In addition to establishing the family code of each full-sib family through the use of the Avid Identification System, each strain also tracks the performance of each family in F1 and F2 for different generations of cold tolerance trait selection. These characteristics were measured from individual fish that survived in the F1 and F2 generations. According to the previous results involving cold tolerance data, the cold tolerance of the two generations from the same sibling family exhibits hereditary performance. A number of studies have used high-throughput next-generation sequencing technology to examine Atlantic salmon [[151]14], channel catfish (Ictalurus punctatus) [[152]20], zebrafish (Danio rerio) [[153]21,[154]23,[155]48], Nile tilapia (Oreochromis niloticus) [[156]24,[157]49,[158]50,[159]51], Blue Tilapia (Oreochromis aureus) [[160]9], Japanese flounder (Paralichthys olivaceus) [[161]52], carp (Cyprinus carpio haematopterus) [[162]22], milkfish (Chanos chanos) [[163]53], rainbow trout [[164]54], and other tropical and temperate aquaculture models of fish for low-temperature domestication-related regulatory mechanism research [[165]48,[166]55]. However, apart from systematically comparing gene expression patterns of specific tissue transcripts of the same fish species possessing different low-temperature tolerance genetic mutations, there is limited information regarding functional genes that can allow for molecular marker-assisted breeding. In the present study, transcriptomic gene libraries of the cold-tolerant (CT) and cold-sensitive (CS) groups were constructed using high-throughput next generation sequencing (NGS). A total of 48 total RNA samples (2 groups × 6 fish/group × 4 tissues/fish) were extracted from four tissues that included the brain, gill, liver and muscle. Biological analysis met the requirements of the test sample repetition. Additionally, research reports by Assefa et al. [[167]56] and Zhao et al. [[168]57] noted that pooling of RNA samples can not only reduce the cost of sequencing (optimize the cost) but also maintain high-throughput sequencing and high quality (maintaining the power). In this study, based on the consideration of sequencing cost, we mixed high-quality total RNA samples for the same tissue in the same group of six samples and then sequenced them, and this produced high-quality sequencing results. This study proposes for the first time that the Illumina HiSeq 2000 platform can be used to observe the transcriptome of the brain, gill, liver, and muscle tissues of the two strains of Taiwan tilapia under low temperature stimulation in the context of genetic research. Among the identified unigenes, 100,108 (78.12%) were successfully annotated in the public NR, GO, COG, KOG, and KEGG databases through the use of a BLAST search. GO and COG analyses determined the distribution of functional genes in tilapia from Taiwan. The KEGG database search successfully revealed the functions of cellular process genes and the gene products of metabolic processes. Low temperature and cold adaptation culture model fish species such as Japanese flounder (Paralichthys olivaceus) [[169]52] and common carp (Cyprinus carpio haematopterus) [[170]22] have been previously studied, and the transcriptome KEGG analyses revealed that the differentially expressed genes primarily participate in cell pathways (intercellular communication, cell movement, membrane transport, and catabolism), signal transmission (signal molecular interaction, genetic information processing, folding, classification, degradation, transcription and translation), metabolism (amino acid metabolism, lipid metabolism, energy metabolism), and the biological system (endocrine, circulation, digestion, nerve, sensory, excretion and immune system). Additionally, transcript sequencing analysis examining milkfish (Chanos chanos) used transcript sequencing and determined that this low-temperature sensitive fish species can produce enough energy to resist cold under low temperature stress, and the glycogen and fatty acid degradation- and synthesis-related functional genes appear to be up- and down-regulated, respectively [[171]53]. The KEGG analysis in this study revealed that the metabolic pathways exhibit common enrichment in four tissues, including the brain, gill, liver, and muscle. Among these, tight junctions are the most enriched pathways in the brain and gills, and these structures support the continuous intercellular barrier between epithelial cells that separates interstitial spaces and regulates the selective passage of solutes through epithelial cells [[172]58]. In liver tissues, protein degradation (ubiquitin-mediated proteolysis) that is primarily guided by ubiquitin, is the most enriched pathway. It is extensively involved in the regulation of the cell cycle, immune and inflammatory responses, signal control transmission pathways, and development and differentiation [[173]59]. The most enriched pathway in muscle tissue is the MAPK signaling pathway that uses protein networks to transmit extracellular signals to the cell to regulate cellular processes. It involves the activation of cell membrane signaling molecules and protein kinases [[174]60]. It is noteworthy that this reflects the interactions between the cell and the surrounding environment, adjacent cells, and ECM, and this pathway participates in intracellular processes, such as protein processing and RNA transport and degradation. These findings are consistent with the results of previous transcriptomic analyses [[175]9,[176]22,[177]48,[178]52]. The current study used the qPCR test and the 18S rRNA and β-actin as the internal references genes, which were quite stable. Based on