Abstract Gibel carp (Carassius gibelio) is a cyprinid fish that originated in eastern Eurasia and is considered as invasive in European freshwater ecosystems. The populations of gibel carp in Europe are mostly composed of asexually reproducing triploid females (i.e., reproducing by gynogenesis) and sexually reproducing diploid females and males. Although some cases of coexisting sexual and asexual reproductive forms are known in vertebrates, the molecular mechanisms maintaining such coexistence are still in question. Both reproduction modes are supposed to exhibit evolutionary and ecological advantages and disadvantages. To better understand the coexistence of these two reproduction strategies, we performed transcriptome profile analysis of gonad tissues (ovaries) and studied the differentially expressed reproduction-associated genes in sexual and asexual females. We used high-throughput RNA sequencing to generate transcriptomic profiles of gonadal tissues of triploid asexual females and males, diploid sexual males and females of gibel carp, as well as diploid individuals from two closely-related species, C. auratus and Cyprinus carpio. Using SNP clustering, we showed the close similarity of C. gibelio and C. auratus with a basal position of C. carpio to both Carassius species. Using transcriptome profile analyses, we showed that many genes and pathways are involved in both gynogenetic and sexual reproduction in C. gibelio; however, we also found that 1500 genes, including 100 genes involved in cell cycle control, meiosis, oogenesis, embryogenesis, fertilization, steroid hormone signaling, and biosynthesis were differently expressed in the ovaries of asexual and sexual females. We suggest that the overall downregulation of reproduction-associated pathways in asexual females, and their maintenance in sexual ones, allows the populations of C. gibelio to combine the evolutionary and ecological advantages of the two reproductive strategies. However, we showed that many sexual-reproduction-related genes are maintained and expressed in asexual females, suggesting that gynogenetic gibel carp retains the genetic toolkits for meiosis and sexual reproduction. These findings shed new light on the evolution of this asexual and sexual complex. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-024-10462-4. Keywords: Carassius gibelio, Reproduction, Gynogenesis, Asexual reproduction, Evolution of sexual reproduction, Meiosis, Differential expression analysis, Oogenesis, Transcriptomics Introduction The establishment of sexual reproduction has been a major event in the evolution of eukaryotes [[43]1]. However, asexual reproduction has evolved independently in dozens of eukaryotic lineages, and is documented in approximately 80 vertebrate species, all representing reptiles, amphibians [[44]2], and teleost fish [[45]3]. Asexual species often originate from hybridization events and/or ploidy alteration [[46]4–[47]7]. These processes usually affect meiosis and generate new species with asexual females only [[48]8–[49]11]. Both sexual and asexual reproduction exhibit various evolutionary and ecological advantages and disadvantages. The main disadvantage of sexual reproduction is the two-fold cost of meiosis and the production of male offspring [[50]12]. Consequently, sexual individuals can be outnumbered by parthenogenetic females that exhibit twice the egg production rate. On the other hand, parthenogenetic forms suffer from the accumulation of deleterious mutations and reduced adaptive abilities, including lower ecological tolerance and higher susceptibility to parasites, following the principle of Muller`s ratchet [[51]13]. Hence, asexually reproducing species are usually considered a short-term evolutionary dead-end, and this explains the maintenance of sexual reproduction in the vast majority of eukaryotic lineages [[52]14]. Still, asexual reproduction persists in nature, and for some vertebrates, sexual and asexual complexes of closely-related species often coexist with sexual forms in the same habitats (e.g., the teleosts Poecilia and Cobitis, and the lizard Aspidoscelis) [[53]6, [54]15, [55]16]. Interspecific hybridization played an important role in the formation of polyploid asexual species. In amphibians and teleosts, all-female asexual species reproduce by gynogenesis [[56]17], a process where females use the sperm from males of the same species or a closely-related species to induce embryogenesis, without the contribution of paternal genetic material to the offspring. Regarding fish, several asexual-sexual complexes have been reported. The asexual North American leuciscid Phoxinus eos-neogaeus is the result of interspecific hybridization between the sexual species P. eos and P. neogaeus [[57]18]. In the European Cobitis complex, hybridization between sexual species generated sterile males and asexual triploid females that produce eggs through premeiotic endoreplication [[58]6, [59]19]. The asexual Poecilia formosa from the Amazon basin, which forms eggs through achiasmatic meiosis without recombination [[60]15], results from hybridization between two sexual species, P. mexicana and P. latipinna [[61]20]. The Iberian minnow Leuciscus alburnoides represents another case of a species resulting from hybridization, this with a complex genetic constitution and exhibiting the coexistence of diploid and triploid forms, as well as gynogenesis and sexual reproduction [[62]21, [63]22]. The gibel carp (Carassius gibelio), also known as Prussian carp, considered as a member of the C. auratus complex or with a species status [[64]23, [65]24], is a cyprinid fish originating from eastern Eurasia that became invasive in European freshwater ecosystems during the 20th century, due to its high ecological tolerance and adaptive abilities [[66]25, [67]26]. Gibel carp exhibits a dual mode of reproduction - sexual reproduction and gynogenesis [[68]23, [69]27–[70]29]. The emergence of asexual reproduction in this species is concomitant with a triploidization event [[71]30]. The first populations invading the freshwaters of the Czech Republic around 1975 [[72]31] included only triploid asexual females. Fifteen years later, mixed populations composed of triploid asexual females and diploid females and males reproducing sexually appeared. A low proportion of triploid and tetraploid males was also reported [[73]31, [74]32]. In Asian populations of C. auratus gibelio (following the taxonomy used by Asian authors), this phenomenon was explained by allogynogenesis, where heterologous sperm sometimes contribute to the phenotype of the offspring [[75]33]. Zhou et al. (2000) [[76]32] even reported molecular evidence of sexual reproduction in the asexual females of Chinese populations of C. auratus gibelio. They suggest that homologous sperm insemination of the eggs of asexual females is similar to classical sexual reproduction (the fused nucleus of the zygote undergoes recombination and removes extra maternal chromosomes). However, there is no empirical evidence of the capacity of sexual reproduction in the asexual form of C. gibelio distributed across Europe. The coexistence of the two reproduction forms in C. gibelio might be a unique case of the switch from a unisexual species to a partly sexual species. Several mechanisms have been proposed to explain the coexistence of asexual and sexual individuals (summarized by Knytl et al. [[77]23]). The Red Queen hypothesis predicts evolution towards equilibrium in the populations of sexual and asexual forms coexisting together and co-evolving with parasites. As asexual reproduction is associated with reduced genetic diversity, parasitism is supposed to play an important role in the maintenance of sexual reproduction [[78]28, [79]34]. Clonally reproducing females of C. gibelio suffer from higher parasite loads when compared to the genetically variable sexual form which is expected to escape the parasite infection. Sexual selection also increases the variability of immune genes, therefore sexual diploids show higher genetic diversity in immune genes than asexual triploids, in accordance with the Red Queen hypothesis [[80]28]. The coexistence of the two reproduction forms in fish may also be facilitated by other ecological processes, such as male discrimination against asexual females [[81]35], the generation of sexual individuals from asexual females [[82]36], the differential competitive abilities of asexuals and sexuals [[83]37], and the occupation of different ecological niches [[84]38]. While asexual reproduction allows for a quick clonal multiplication of individuals in stable environments [[85]39], sexual reproduction favors genetic diversity [[86]40], heterozygosity, and DNA repair, and hence adaptation to changing environments. Moreover, the necessity of asexual forms to coexist with sexual forms is directly related to gynogenesis, which requires males of conspecifics or close species in the same habitats for egg activation. Carassius gibelio represents a unique example of a species where sexual and asexual forms coexist [[87]34]. Hence, this species constitutes an object of study to elucidate the evolution of sexuality and asexuality in animals, and the mechanisms responsible for the stable coexistence of sexual and asexual individuals. Furthermore, the origin of C. gibelio is still in question. The widely accepted hypothesis is that C. gibelio arose from autotriploidization within the evolutionary branch of the C. auratus complex, leading to triploid asexual females [[88]41–[89]44]. However, Yuan et al. (2010) [[90]45], focusing on hox genes, suggested the potential hybrid origin of triploid asexual C. gibelio from C. auratus and C. carpio. Understanding the role of polyploidization in the origin of C. gibelio, and the extent of the genomic contribution of C. carpio and C. auratus to C. gibelio, could provide a better understanding of the evolution of asexual and sexual reproduction in C. gibelio. Here, the molecular mechanisms associated with reproduction in C. gibelio were analysed to study the coexistence of asexual and sexual forms. In particular, the expression of reproduction-related genes was expected to differ between asexual and sexual females, since meiosis-related genes are not important for asexually reproducing individuals. To test this hypothesis, transcriptome profile analyses of gonadal tissues (ovaries) from asexual females and sexual females of C. gibelio were performed. In addition, the transcriptomes of the closely-associated species C. carpio and C. auratus were also analysed, with a particular emphasis on the genes contributing to sexual reproduction. Materials and methods Fish tissue sampling Asexual and sexual C. gibelio were obtained from artificial breeding of the parental fish collected in their natural habitats. Parental C. gibelio were sampled in the locality (Dyje River, Czech Republic) and genotyped following the approach of Pakosta et al. [[91]46], Papoušek et al. [[92]47] and Šimková et al. [[93]48]. Asexual female offspring was obtained by induced embryogenesis using sperm of C. carpio. The sexual offspring was obtained from the artificial interbreeding of sexual specimens. The ploidy of parental asexual females used for gynogenesis (i.e. induced embryogenesis by C. carpio) and parental sexual specimens used for interbreeding was analysed by flow cytometry following Šimková et al. (2015). From each fish, fin clip about 1 cm^2 was sampled for ploidy detection. Before analysis this tissue was homogenised with scissors on Petri dish in 2 ml solution of CyStain DNA 1 step PARTEC and relative DNA content was estimated using Partec CCA I flow cytometer (Partec GmbH; [94]www.sysmex-par tec.com). Diploid C. auratus was used as a reference standard. All parental C. gibelio were also genotyped for mtDNA (D-loop) and microsatellite loci were amplified following Papoušek et al. (2008), Šimková et al. (2013) and Pakosta et al. (2018). The fish offspring was reared in aquarium conditions until the age of four years and subsequently their gonadal tissues were sampled (the age of the examined fish corresponded to 4+). Cyprinus carpio and Carassius auratus were obtained from external breeding facilities. Fish were euthanized using physical stunning through a blow to the skull with a blunt wooden instrument immediately followed by exsanguination. Four or five biological samples per fish group from a total of 8 fish groups were analysed (females and males of C. gibelio resulting from sexual reproduction, females and temperature-induced males of C. gibelio resulting from gynogenesis, females and males of sexual C. auratus, and females and males of sexual C. carpio). Gonadal tissues of each fish specimen were individually submerged in Ambion RNAlater stabilization solution (Thermo Fisher Scientific, Waltham, MA, USA). Tubes with tissues were stored at -80ºC until the isolation of total RNA. Prior to sampling, ploidy of each C. gibelio specimen was checked using the same methodology as described above. RNA extraction and library preparation Total RNA was isolated from the gonad tissue of each fish specimen. For extraction, PureLink® RNA Mini Kit (Ambion) with Trizol reagent (Thermo Fisher Scientific) and on-column PureLink DNase treatment were used according to the manufacturer´s protocol. Reagent and buffer volumes were adjusted according to the weight of tissue entering the isolation process (30 mg on average). The final elution was performed using 100 µl of RNAse-free water in the first step and the primal eluate in the second step. The yield and concentration of RNA isolates were checked using a QubitTM 4 fluorometer (Invitrogen by Thermo Fisher Scientific) and Qubit RNA HS Assay Kit (Thermo Fisher Scientific). The quality and integrity of RNA were analysed using RNA 6000 Nano Kit on a 2100 Bioanalyser instrument (Agilent Technologies, Santa Clara, CA, USA). All RNA isolates were normalized by dilution at a uniform concentration of 20 ng/µl with RNase-free water. They served as templates for DNA library preparation in twice the reaction volume recommended by the manufacturer. All samples (RNA integrity number – RIN > 7) were used for DNA library preparation. 500ng of total RNA was used for mRNA enrichment using the Poly(A) mRNA Magnetic Isolation Module (New England Biolabs, Ipswich, MA, USA). Subsequently, NEBNext® Ultra™ Directional RNA Library Prep Kit for Illumina®, and NEBNext® Multiplex Oligos for Illumina® (Dual Index Primers Set 2, New England Biolabs) were used for library preparation, with 11 PCR cycles utilized for PCR enrichment. RNA fragmentation (13 min at 94 °C) and the size selection conditions (a bead volume of 30 µl and 15 µl for the first and second bead selections, respectively) were further modified in the protocol. The quantification of DNA libraries was performed on a QubitTM 4 fluorometer (Invitrogen by Thermo Fisher Scientific) using Qubit dsDNA HS Assay Kit, and quality and size control were performed on a 2100 Bioanalyser with DNA 1000 Kit (Agilent Technologies). Finally, amplicons were pooled in equimolar amounts. The final concentration of each library in the pool was 10 nM in the pool. Subsequently, NEBNext® Ultra™ Directional RNA Library Prep Kit for Illumina® and NEBNext® Multiplex Oligos for Illumina® (Dual Index Primers Set 2, New England Biolabs) together with spike-in RNA were used for cDNA library preparation from total RNA. The quality of prepared cDNA libraries was evaluated using a Qubit fluorometer (Thermo Fisher Scientific). The quality of cDNA libraries was visualized by a 2100 Bioanalyser (Agilent Technologies), and the libraries were finally sequenced by Macrogen Korea on Illumina HiSeq X (one lane) in a paired-end configuration producing 150 bp long reads. Quality and quantity control steps were carried out by a service company. NGS data analyses A quality check of raw paired-end fastq reads was carried out by FastQC [[95]49] and their origin was categorized using BioBloomTools v2.3.4 [[96]15]. The Illumina adapters clipping and quality trimming of raw fastq reads were performed using Trimmomatic v0.39 [[97]50] with settings CROP:250 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:5 MINLEN:35. Trimmed RNAseq reads were mapped to the Carassius auratus genome (ASM336829v1) with Ensembl annotation (release 104) using STAR v2.7.3a [[98]51] as a splice-aware short read aligner and default parameters except for --outFilterMismatchNoverLmax 0.1 and --twopassMode Basic. Quality control after alignment concerning the number and percentage of uniquely- and multi-mapped reads, rRNA contamination, mapped regions, read coverage distribution, strand specificity, gene biotypes, and PCR duplication was performed using several tools – namely, RSeQC v4.0.0 [[99]52], Picard toolkit v2.25.6 [[100]53], and Qualimap v.2.2.2 [[101]54]. All statistics were processed by MultiQC v1.10.1 [[102]55]. SNP clustering analysis The genomic sequences of all collected samples were aligned to the Carassius auratus reference genome (ASM336829v1-104) utilizing the Burrows-Wheeler Aligner (BWA) software [[103]56]. Post alignment and germline variants were called using Strelka2 variant calling software [[104]57], generating variant calls in VCF format which were further filtered to retain only high-confidence variants. These variants were then annotated using the reference Gene Transfer Format (GTF) file for Carassius auratus (ASM336829v1-104). Subsequent data processing was carried out in R, where the variant tables were further refined and merged with sample information. A series of filtering steps were performed to ensure only variants with sufficient coverage and sample counts were retained for analysis. The filtered variant table was then reorganized and formatted for subsequent comparative analyses. Variants located on sex chromosomes were excluded for certain analyses to ensure accurate cross-species comparisons. The data were then restructured to compare SNP identity across species, generating similarity matrices and Venn diagrams to visualize the overlap of SNPs by species and ploidy levels. Differential expression analysis and pathway enrichment analysis Appropriate bioinformatics tools were used for the processing of raw sequencing data. The genome of C. auratus was used as reference. The differential gene expression was calculated on the basis of the gene counts produced using featureCounts from the Subread package v2.0 [[105]58] and further analysed by Bioconductor package DESeq2 v1.34.0 [[106]59]. Data generated by DESeq2 with independent filtering were selected for differential gene expression analysis to avoid potential false positive results. Differences in gene expression were considered significant on the basis of a cut-off of the adjusted p-value ≤ 0.05. GO term enrichment was analysed using David [[107]60] to retrieve Gene Ontology terms in the Biological process, Cellular Component and Molecular function categories, as well as KEGG pathways [[108]61, [109]62]. Graphical representations of the GO enrichment were realized using R [[110]63] and Revigo [[111]64]. Reproduction-associated candidate genes were retrieved using the BlastKoala tool of KEGG [[112]61], the BioMart tool of Ensembl [[113]65], and published studies [[114]20, [115]66–[116]68]. GO terms enrichment was tested using Fisher’s exact test (α = 0.05) with false discovery rate (FDR) correction of the p-value. To interpret the biological functions of the DEGs, their mapping to the Gene Ontology (GO) [[117]62] and KEGG [[118]61] databases was performed to analyse pathway enrichment. In each of six fish groups associated with sexual reproduction and asexual males, significantly differently-expressed genes (DEGs) compared to the triploid asexual females of C. gibelio were selected on the basis of the following criteria: Basemean > 10, and a padj value < 0.05. For KEGG pathway analysis, no filtering based on log2 fold change was applied. Gene functions were investigated using the biological databases Uniprot [[119]69], KEGG [[120]61], Zfin [[121]70] and GeneCards [[122]71]. Principal component analysis (PCA) was performed using the DESeq2 R package [[123]59]. For PCA based on reproduction-associated genes, a set of 208 reproduction genes was selected using the BioMart tool of Ensembl [[124]65]. Gene selection and real-time quantitative PCR Based on the results of an NGS approach and published studies [[125]20, [126]66–[127]68], as well as the presence of appropriate GO and KEGG terms, candidate reproduction-associated genes were selected for the further analyses of gene expression. A-tubulin (A-tub) was used as a housekeeping gene to normalize variation in the gene expression, this gene was previously reported to be stable in fish ovary [[128]72]. The Reference Gene Selection Tool from Bio-Rad CFX Maestro software (Bio-Rad), based on geNorm software principles [[129]73] with an algorithm to normalize the Cq of each gene against the Cq values of the reference gene, was used. A total of 20 biologically-relevant genes were selected from transcriptomic outputs using published studies, and the expressions of 17 of them were validated by real-time quantitative PCR (qPCR). Three genes were excluded because of the amplification of unspecific products. Primers were designed using Primer Blast [[130]74] at the exon-exon junction. A summary of the genes analysed, and their primer sequences are presented in Table [131]1. Table 1. List of the target genes selected from RNA seq and the housekeeping gene analysed using RT-qPCR, and their respective primer sequences Gene name Gene description Forward/reverse primers (5’->3’) Amplicon size A-TUB Alpha-Tubulin TGCCAACTACGCCCG AGAGGTGAAACCAGAGCC PIWIL2 Piwi-Like Protein 2 TGACACCAACGGTTGCCA 81 CCCCCGTCCAAGAGGT ZPE3L2 Zona Pellucida Sperm-Binding Protein 3-Like TTCTTTGCCAATGGGTGGCT 92 TCCCACTGAAAACACCTTCCT RASA1B Ras Gtpase-Activating Protein 1 GGTTGTGGGTGACGAATGTC 97 CCATGAAACCAGGCTTTCCC HRASAL Gtpase Hras TCCGGGGAATCAGAGGTTGA 136 GGGGTCGTATTCGTCCACAA ZP3EL1 Zona Pellucida Sperm-Binding Protein 3-Like TCTCTGCTAATGGTTGGGTGT 129 CTGGTCACTTCCTCTTCGGT SPO11 SPO11, Initiator of Meiotic Double Stranded Breaks AGTACGGCTCACGGTCTCTG 117 TAAGCGTTTCCTCTGGGACTC SYCE1 Synaptonemal Complex Central Element Protein 1 CCCTACAGTTGGAGGGTACA 107 GTTCTGCTCAAGCTGCCTTTG C1ORF146 Chromosome 31 C1orf146 Homolog CAAGCCCCAGTCTACGGAAA 141 GGTTTACTTGTGGCCTTCGC SPINBZL Spindlin-Z-Like AAGAGCTCTCACAAGCACAAA 136 CTTGGACTAGTACGGTCCCC CAMSAP2A Calmodulin-Regulated Spectrin-Associated Protein 2 CCCAGACACCCGAAAAACAC 137 TCTTCTGGAACACTGTCTGTACC DMRT2A Doublesex- And Mab-3-Related Transcription Factor 2 AGCAAGCGACAGAGGACAAA 91 GTTGATGGACGAATGTGCCG NCOA2L Nuclear Receptor Coactivator 2 TTGCTGCTGAGTAATAACGACTG 141 TTTCCCCGACAGCACTCATC RNF212 Ring Finger Protein 212 CTTCGTGTCTCCTGGTCCTG 115 CAGACACCCTGTTTTCCTCTCT SOX8L Transcription Factor SOX-8 CAACAGCTCCACGGTGCTCA 112 TGGTGTTATCCGATGCACGC ALDH1A3 Aldehyde Dehydrogenase 1 A3 GAAAACCATGCCAGTCGATGA 141 GTGTTCCCGCAGGCCAAA CALM3A Calmodulin 3a TAGACACGTTTATCGCACGGG 83 AACGCCTCCTTGAACTCAGC BUC Bucky Ball GGACCTCAGGATCAAGGGAG 106 CTTCGTGGCCTTTGTTGGTG [132]Open in a new tab Reverse transcription following total RNA extraction from preserved samples of gonadal tissues stored in RNAlater was performed using High-Capacity RNA-to-cDNA Kit (Applied Biosystems by Thermo Fisher Scientific) according to the manufacturer’s instructions. The suitability of primers, their optimal annealing temperatures and amplicon lengths, and the specificity of the amplification of all selected genes were verified by classical PCR for representative samples of all fish groups. The PCR reaction mix (10 ul) contained 5 µl of prepared cDNA, 1 x Taq Buffer with (NH4)[2]SO[4], 1.5 mM MgCl[2], 200 µM of each dNTP, 0.4 µM of forward and reverse primers (Table [133]1), 1 U of Taq DNA polymerase (Thermo Fisher Scientific), and nuclease-free water. PCR was run under the following conditions: initial denaturation at 95˚C for 4 min; 30 cycles of 95˚C for 30 s, an optimization gradient of 40–65˚C for 30 s, 72˚C for 45 s; and a final amplification at 72˚C for 10 min. At least 5 samples from each fish group were used for the test. Three replicates for each sample were included in the qPCR analysis. Real-time qPCR was performed using the LightCycler 480 II Real-Time PCR System (Roche Diagnostics) and LightCycler 480 SYBR Green I Master chemistry (Roche). The reaction mixture (final volume 20 µl) consisted of 10 µl of 2x SYBR Green I Master, 1 µl of each primer, 3 µl of dd H[2]O, and 5 µl of cDNA template. To test the reaction efficiency and to obtain the standard amplification curve, templates were prepared by means of six serial decimal dilutions of the cDNA of representatives of each fish group. Reactions were run on a LightCycler 480 Instrument II under the following conditions: 95˚C for 5 min; 45 cycles of 95˚C for 10 s, 55˚C for 10 s, and 72˚C for 10 s; melt curve 55˚C → 95˚C (increment 0.5˚C)/5 s. In each run plate, together with samples run in triplicates, one negative control, in which RNase/DNase-free water was used instead of the cDNA and A-tub as the reference gene, was analysed. LightCycler 480 software 1.5.1 was used for analyses of qPCR outputs. The relative expression value of the differentially expressed target gene – the normalized expression – was computed using the ΔΔCq method. Differences in gene expression between sexual and asexual females were statistically evaluated. The sequences of the primers used in this analysis are listed in Table [134]1. Results Next generation sequencing and assembly and SNPs analysis of C. gibelio The sequencing of four to five diploid males and females from C. gibelio, C. auratus and C. carpio, and triploid females and males of C. gibelio yielded from 8 M to 17 M raw reads per individual (Additional file 1). The number of mapped reads varied between 5 M and 12 M. Across individual samples, from 51 to 83% of reads were uniquely mapped, and from 12% to 22% of reads were multimapped. A total of 857,874 SNPs were identified in the transcriptomes of the eight fish groups (males and females of the three species including both triploid and diploid forms of gibel carp). Clustering analysis based on SNP numbers showed that C. gibelio and C. auratus are closely related and that asexual C. gibelio and sexual C. gibelio are conspecific (Fig. [135]1A). Specifically, the proportion of SNPs shared between C. gibelio and C. auratus was 2.35 times higher than the proportion of SNPs shared by C. gibelio and C. carpio (Fig. [136]1B). However, C. carpio and C. auratus shared only 3555 SNPs. The sexual diploid and asexual triploid individuals of C. gibelio were more similar to each other than to C. auratus or C. carpio and both forms shared a similar number of SNPs with C. auratus (Fig. [137]1C). Fig. 1. [138]Fig. 1 [139]Open in a new tab (A) Dendrogram of the hierarchical clustering of different lineages based on the degree of SNP similarity. Venn diagrams of the numbers of SNPs shared (B) between C. gibelio, C. auratus and C. carpio, and (C) between C. auratus, C. carpio, and diploid and triploid females of C. gibelio Differential gene expression analysis The transcriptome profiles of the females and males of C. gibelio, C. auratus and C. carpio were analysed (Fig. [140]2). Both reproductive forms – asexual and sexual – were included for C. gibelio. In all cases, the biological replicates of same sex, ploidy level, and species tend to be more similar to each other. PCA based on transcriptome-wide gene expression (Fig. [141]2A) showed differences in transcriptome profiles between sexes of the same species, these separated by PC1, and a similarity between the transcriptome profiles of the asexual females of C. gibelio and the sexual females of C. gibelio and C. auratus. However, even the females of C. auratus were separated from C. gibelio by PC1. Likewise, the transcriptomes of the diploid and triploid males of C. gibelio and C. auratus also tended to be similar to each other. According to the transcriptome profiles, the males and females of C. carpio were separated from the other fish groups by PC2. To compare the expression levels of reproduction-related genes among fish groups, a total of 208 genes related to reproduction were selected. This set of reproductive genes led to a similar grouping of species and sexes, as revealed by all of the transcriptomic data; however, the asexual triploid females C. gibelio were more separated from the sexual ones of by PC2 (Fig. [142]2B). Fig. 2. [143]Fig. 2 [144]Open in a new tab Principal component analysis (PCA) of normalized RNAseq read counts between the diploid males and females of C. gibelio, C. auratus and C. carpio and the triploid females and males of C. gibelio for all genes (A) and a set of 208 randomly selected reproductive genes (B), on the first two principal components The numbers of non-differentially and differentially expressed genes are shown in Table [145]2. For all comparisons, the number of upregulated genes in C. gibelio asexual females was higher than the number of downregulated genes or similar to the number of downregulated genes. Comparison of the asexual and sexual females of C. gibelio revealed 1728 differentially expressed genes (DEGs). The numbers of upregulated and downregulated genes are shown in Table [146]2. The number of DEGs in asexual C. gibelio females was lower compared to sexual females in every species than compared to males of the same species. The number of DEGs between asexual C. gibelio females and C. auratus females and males and the number of DEGs between asexual C. gibelio females and C. carpio females and males was higher when compared to the number of DEGs between asexual C. gibelio females and sexual C. gibelio females and males (Table [147]2). Table 2. Number of non-differentially expressed genes and differentially expressed genes (down- and upregulated) in the triploid asexual females of C. gibelio compared to each of the diploid sexual males and females of C. gibelio, C. auratus and C. carpio C. gibelio 2n females C. gibelio 2n males C. gibelio 3n males C. auratus 2n females C. auratus 2n males C. carpio 2n females C. carpio 2n males Non-differentially expressed gens 46,058 43,659 22,363 47,174 46,089 42,435 42,751 Downregulated genes in asexual females 782 3836 13,603 4214 8298 6773 9185 Upregulated genes in asexual females 946 10,944 7634 4704 11,841 6733 10,818 [148]Open in a new tab GO enrichment analysis The full transcriptomes of the three species were functionally annotated to 3747 GO terms for females and 3755 GO terms for males. A total of 3635 were shared by all female lines, and 3721 were shared by all male lines. 30 GO terms identified in asexual females of C. gibelio were not identified in the sexual females of C. gibelio, and 30 GO terms identified in the sexual females were not present in the asexual females of C. gibelio. Furthermore, 3 GO terms were identified in diploid males of C. gibelio but not in triploid males, and 3 GO terms were identified in triploid males of C. gibelio but not in diploid males (Fig. [149]3). Fig. 3. [150]Fig. 3 [151]Open in a new tab Venn diagram of Gene Ontology terms for the females (A) and males (B) of C. gibelio, C. auratus and C. carpio, including the triploid asexual females of C. gibelio and the triploid males of C. gibelio. Total numbers of unique and shared identified GO terms are indicated Transcriptomes of sexual and asexual females were compared and investigated for pathway enrichment using overrepresentation analysis. Of the total of 1728 DEGs, 1471 were successfully annotated to the Gene Ontology (GO) and KEGG databases. A total of 809 were upregulated in asexual females in comparison to sexual females, and 662 downregulated. The significantly enriched GO terms are presented in Fig. [152]4. In the biological process category, we identified GO terms associated with gametogenesis and cell cycle control, including egg coat formation (GO:0035803), the binding of sperm to zona pellucida (GO:0007339), the positive regulation of acrosome reaction (GO:2,000,344), synaptonemal complex assembly (GO:0007130), the negative regulation of nuclear division (GO:0051784), and the negative regulation of cell cycle process (GO:0010948). In the cellular component category, the most enriched terms included egg coat (GO:0035805). In the molecular function category, they included the structural constituent of egg coat (GO:0035804) and calcium ion binding (GO:0005509). The significantly enriched KEGG pathways included oocyte meiosis (caua04114), and cell cycle (caua04110). Fig. 4. [153]Fig. 4 [154]Open in a new tab Dot plot of GO terms enrichment analysis in the biological process (A), cellular component (B), molecular function (C), and KEGG pathway (D) categories. The x-axis represents the fold enrichment (the number of DEGs in the GO term / the number of all DEGs)/(the number of genes annotated in this pathway/ the number of the genes annotated in all pathways). The y-axis corresponds to the enriched GO terms. The magnitude of dots represents the number of DEGs in the GO term, and the color corresponds to the -log10 of the p-value Meiosis-associated genes To determine whether meiotic pathways are disrupted in asexual females of C. gibelio, we first analysed the differences in expression levels of the meiosis-associated genes between sexual and asexual females following refs [[155]66–[156]68, [157]75, [158]76]. Of the set of 40 meiosis-associated genes, almost all were detected in both asexual and sexual females; however, pms1 was not detected in most sexual and asexual females, and hormad2 was not detected in any sexual or asexual individual. Hence, the meiotic pathways did not appear to be disrupted in asexual females. Seven genes were significantly differently regulated. Spo11, msh2, pds5b and stag1a displayed higher expression levels in sexual females when compared to asexual females, as well as rec114, which was close to significance (padj = 0.07). In contrast, rad1, one rad51b homologue and slc39a1 were significantly more expressed in asexual females. The other meiosis-associated genes, including meiotic nuclear division 1 (mnd1), dmc1, the double strand break repair rad1 and several rad51 homologues, did not show significant gene expression differences (Table [159]3). Table 3. List of meiosis-associated genes with their expression levels in sexual and asexual females of C. gibelio Ensembl ID Gene name Gene description L2fc padj ENSCARG00000016377* spo11 SPO11 initiator of meiotic double stranded breaks -1.75 5.36e-6 ENSCARG00000024429 hormad1 HORMA domain containing 1 0.30 0.22 ENSCARG00000034047 mnd1 Meiotic nuclear division 1 -0.4 0.24 ENSCARG00000050983 mlh1 MutL homolog 1 -0.6 0.61 ENSCARG00000069004 mlh3 MutL homolog 3 -0.1 0.87 ENSCARG00000021963 pms1 PMS homolog 1 -0.33 NA ENSCARG00000010121 pms2 PMS homolog 2 0.02 0.97 ENSCARG00000007723 dmc1 DNA meiotic recombinase 1 -0.38 0.55 ENSCARG00000038661* msh2 MutS homolog 2 -0.85 5.83e-6 ENSCARG00000047192 msh4 MutS homolog 4 0.79 0.52 ENSCARG00000015097 msh5 MutS homolog 5 -0.01 0.99 ENSCARG00000011896 msh6 MutS homolog 6 -0.36 0.18 ENSCARG00000011987* rad1 Rad1 cohesin complex component 1.43 8.11e-6 ENSCARG00000026371 rad21 Rad21 cohesin complex component 0.75 0.37 ENSCARG00000022693 rad50 Rad50 double strand repair protein -0.25 0.70 ENSCARG00000002053 rad51c Rad51 recombinase 0.30 0.85 ENSCARG00000010638* rad51b Rad51 recombinase 0.81 0.03 ENSCARG00000018365 rad51d RAD51 recombinase 0.29 0.35 ENSCARG00000027817 rad51 RAD51 recombinase -0.71 0.55 ENSCARG00000056842 rad51 RAD51 recombinase 0.69 0.63 ENSCARG00000064885 rad51b RAD51 recombinase 0.31 0.69 ENSCARG00000047813 rad51 RAD51 recombinase 0.14 0.92 ENSCARG00000004144 rad51 RAD51 recombinase -0.02 0.96 ENSCARG00000045888 rad52 Rad52 DNA repair protein 0.01 0.96 ENSCARG00000039864 rec8 Rec8 meiotic recombination protein 1.27 0.59 ENSCARG00000032822 rec114 Rec114 meiotic recombination protein -1.4 0.07 ENSCARG00000057676 smc1b Structural maintenance of chromosome 1b 1.02 0.13 ENSCARG00000039472 smc1a Structural maintenance of chromosome 1a -0.11 0.74 ENSCARG00000005430 smc2 Structural maintenance of chromosome 2 -0.24 0.4 ENSCARG00000055515 smc3 Structural maintenance of chromosome 3 -0.21 0.51 ENSCARG00000010356 smc4 Structural maintenance of chromosome 4 0.13 0.90 ENSCARG00000042929 smc5 Structural maintenance of chromosome 5 -0.13 0.75 ENSCARG00000021147 pds5a PDS5 cohesin associated factor B -0.17 0.75 ENSCARG00000017951* pds5b PDS5 cohesin associated factor B -0.98 4.7e-6 ENSCARG00000001906* stag1a Cohesin subunit SA 1 A -0.96 0.02 ENSCARG00000022475 stag1b Cohesin subunit SA 1B -0.24 0.47 ENSCARG00000052961 mre11 Double strand break repair nuclease -0.04 0.92 ENSCARG00000018605 hfm1 (mer3) Helicase for meiosis 1 -0.27 0.81 ENSCARG00000053878* slc39a1 Solute carrier family 39A1 0.65 0.04 ENSCARG00000062463 mus81 Crossover junction endonuclease MUS81 0.10 0.77 [160]Open in a new tab A positive log2 fold change (l2fc) indicates transcripts that were more abundant in asexual females when compared to sexual females. A negative log2 fold change indicates transcripts that were more abundant in sexual females when compared to asexual females. Asterisks indicate significant difference in expression levels between sexual and asexual females (padj < 0.05) Identification of differentially expressed genes in sexual and asexual females ofC. gibelio Among the 1728 differentially expressed genes revealed by transcriptome profile analysis, we specifically focussed on the genes related to reproduction pathways revealed by GO and KEGG enrichment analyses and published studies [[161]20, [162]66, [163]68]. We identified genes that were involved in reproduction pathways including cell cycle control, oocyte meiosis and maturation, and signalling pathways related to reproduction and sex differentiation (Fig. [164]5, see Table [165]4 for the list of the genes and their biological function). Fig. 5. [166]Fig. 5 [167]Open in a new tab Summary of the number of genes upregulated in asexual females or in sexual females of C. gibelio in reproduction-associated pathways Table 4. List of selected differently-expressed genes potentially involved in the reproduction of C. gibelio, including the description of gene function according to the biological databases Uniprot, KEGG, Zfin and GeneCards unless other references are mentioned