Abstract

Trillions of microbes colonize the ungulate gastrointestinal tract, playing a pivotal role in enhancing host nutrient utilization by breaking down cellulose and hemicellulose present in plants. Here, through large-scale metagenomic assembly, we established a catalog of 131,416 metagenome-assembled genomes (MAGs) and 11,175 high-quality species-level genome bins (SGBs) from 17 species of ungulates in China. Our study revealed the convergent evolution of high relative abundances of carbohydrate-active enzymes (CAZymes) in the gut microbiomes of plateau-dwelling ungulates. Notably, two significant factors contribute to this phenotype: structural variations in their gut microbiome genomes, which contain more CAZymes, and the presence of novel gut microbiota species, particularly those in the genus Cryptobacteroides, which are undergoing independent rapid evolution and speciation and have higher gene densities of CAZymes. Furthermore, the CAZymes enriched in these gut microbiomes are highly enriched in known metabolic pathways for short-chain fatty acid (SCFA) production. Our findings not only provide a valuable genomic resource for understanding the gut microbiomes of ungulates but also offer fresh insights into the interaction between gut microbiomes and their hosts, as well as the co-adaptation of hosts and their gut microbiomes to their environments.

Subject terms: Microbial genetics, Metagenomics, Molecular evolution

Introduction

The gut microbiota plays a crucial role in host energy metabolism and immune homeostasis, and significantly influences host behavior, health, and disease by regulating the gut-organ axis^1–8. Notably, ungulates such as horses, wild asses, camels, cattle, and sheep primarily subsist on vegetation with limited nutritional content, particularly grass^9,10.
The dominant components of grass, such as (hemi)cellulose, are difficult to digest, compelling these ungulates to rely predominantly on metabolites like short-chain fatty acids (SCFAs) and volatile fatty acids (VFAs) synthesized through gut microbiome fermentation for their survival and reproduction^9,11. According to previous studies, SCFAs/VFAs supply up to 70% of ruminants' energy needs^12.

The Qinghai-Tibet Plateau (QTP), characterized by low temperature, hypoxia, and intense ultraviolet radiation, exhibits sparse vegetation compared to plain areas in China^13,14. The harsh climate and limited food resources pose significant challenges to the survival of local ungulates. In this context, the gut microbiota plays an indispensable role for plateau ungulates, enabling them to maintain the highly efficient energy metabolism essential for survival in the harsh high-altitude environment^15.

The carbohydrate-active enzymes (CAZymes) present in rumen gut microbiomes play a crucial role in various metabolic pathways involved in carbohydrate degradation^12. They employ several mechanisms, including the biosynthesis and hydrolysis of glycosidic bonds, beta-elimination, and the removal of ester bonds. CAZymes convert cellulose, hemicellulose, and lignin into monosaccharides, such as glucose, galactose, and mannose, significantly contributing to the production of SCFAs/VFAs^16,17. Both monosaccharides and SCFAs/VFAs serve as essential energy sources for the survival and reproduction of rumen microorganisms, while also supporting their health and environmental adaptation^15–18. CAZymes are mainly divided into six major groups: glycoside hydrolases, glycosyltransferases, polysaccharide lyases, carbohydrate esterases, auxiliary activities, and carbohydrate-binding modules^12.
Rumen gut microbiomes convert cellulose, hemicellulose, and lignin into monosaccharides through the cooperation of CAZymes, which are involved in most metabolic pathways for SCFA and VFA synthesis^9,11,12. Therefore, the efficiency of SCFA and VFA production relies on the composition and abundance of CAZymes and gut microbiomes at the molecular and cellular levels, respectively. Notably, previous research has demonstrated that Tibetan ungulates exhibit higher efficiency in SCFA and VFA production than their lowland relatives, which may be attributed to their unique gut microbiome community compositions^15,19–23. Additionally, high CAZyme abundances have also been found in other Tibetan plateau mammals such as Tibetan pigs and horses^15,20,23–25. However, little is known about the gut microbiome compositions, genomic characteristics, and evolutionary patterns underlying this phenotype in these ungulates.

Previous studies have shown that frequent genetic variations such as structural variations (SVs) and copy number variations (CNVs) in the genomes of gut microbiomes, mainly caused by widespread horizontal gene transfer (HGT) events and high mutation rates, are highly associated with the gain and loss of CAZymes and the overall rate of SCFA production in gut microbiomes^26–30. On the other hand, hosts and their gut microbiomes form closely related systems that interact tightly in response to changing environments^31,32. Recent studies on apes and hominids and their gut microbiomes suggest a co-evolution and co-speciation relationship between gut microbiomes and their hosts. Based on these findings, hosts and their gut microbiomes constitute a stable system adapted to environments, regardless of their internal interactions^33–35.
However, some studies have also observed widespread host selection, host swapping, and extinction in the evolution and speciation of gut microbiomes with their hosts, suggesting that co-evolution and co-speciation are not always dominant^36–40. In other words, special evolution and speciation patterns of gut microbiomes may occur in the dynamics of their hosts and external environments.

Consequently, we propose that the high SCFA/VFA production efficiency phenotypes observed in the gut microbiomes of Tibetan ungulates may be related to the convergent evolution of higher abundances of CAZymes in some of their core SCFA/VFA-producing gut microbiome species compared to closely related species in lowland hosts. Specifically, this may be associated with three potential factors: first, the higher abundances of some core gut microbiomes with higher CAZyme activities in plateau hosts; second, the adaptive genomic structural variations carrying more CAZyme-coding genes in some core gut microbiomes in plateau hosts; and third, the establishment of unique gut microbiomes in plateau hosts, through other evolutionary and speciation patterns, that have more CAZyme-coding genes.

Unlike traditional metagenomic strategies that focus on gene sets within assembled contigs/scaffolds, the metagenomic binning strategy groups these structures into clusters representing the whole genome of an organism, defined as a metagenome-assembled genome (MAG)^41–43. This approach facilitates evolutionary and functional analysis of microbiota (including numerous novel microbiomes) at the species or strain level based on comparative genomes. In our study, we sequenced and assembled MAGs from the gut microbiota of 1167 individuals across 17 plateau and lowland ungulate species. Using these MAGs, we conducted comparative genomic and phylogenetic analyses to explore factors associated with the high SCFA/VFA production efficiency phenotypes in plateau ungulates.
We further elucidated the relationship among functional genes, genomic structural variations, and evolution patterns of gut microbiomes in response to their energy metabolic phenotypes, shedding more light on the co-adaptation mechanisms of plateau ungulates and their gut microbiomes to harsh plateau environments.

Results

Distribution and survival status of 17 species of common ungulates

Among the 17 common ungulates, two species in one family are perissodactyls, and 15 species in two families (four subfamilies) are artiodactyls (Supplementary Data 1). Noteworthy species include the kiang (Equus kiang), white-lipped deer (Cervus albirostris), Chinese red deer (Cervus elaphus), domestic yak (Bos grunniens), wild yak (Bos mutus), goitered gazelle (Gazella subgutturosa), Tibetan gazelle (Procapra picticaudata), Przewalski’s gazelle (Procapra przewalskii), Tibetan antelope (Pantholops hodgsonii), bharal (Pseudois nayaur), and argali (Ovis ammon), which inhabit the Qinghai-Tibet Plateau (QTP) at altitudes exceeding 3000 m. Conversely, the horse (Equus caballus), sika deer (Cervus nippon), eastern roe deer (Capreolus pygargus), cattle (Bos taurus), goat (Capra hircus), and sheep (Ovis aries) dwell in vast plains in China at altitudes below 1000 m (Fig. 1). Specifically, the kiang, white-lipped deer, wild yak, goitered gazelle, Tibetan gazelle, Przewalski’s gazelle, and Tibetan antelope are endemic to the Tibetan plateau; the sika deer and eastern roe deer are common wild ungulates, while the horse, domestic yak, cattle, goat, and sheep are common livestock. According to the IUCN Red List of Threatened Species (https://www.iucnredlist.org/en), Przewalski’s gazelle is endangered (EN); wild yak and goitered gazelle are vulnerable (VU); Tibetan gazelle, Tibetan antelope, and argali are near threatened (NT); sika deer, eastern roe deer, and bharal are least concern (LC) (Supplementary Data 1).

Fig. 1.
The distribution of 17 common ungulates in our study. The left tree was constructed based on the geographic distance of fecal samples from the 17 ungulate species. the Qinghai-Tibet Plateau (QTP) at altitudes exceeding 3000 m, while brown branches represent ungulates inhabiting the vast plains in China at altitudes below 1000 m. The right map is color-coded by altitude, and the silhouettes of ungulates depict the locations of their fecal samples.

Metagenome-assembled genomes of ungulates’ gut microbiota

Following Illumina NovaSeq 6000 sequencing, we generated 9.15 Tb of high-quality metagenomic sequencing data from fecal samples of 10 ungulate species. Additionally, 9.19 Tb of metagenomic sequencing data from the gut microbiota of seven ungulate species were downloaded from NCBI (Supplementary Data 2). In total, we accumulated 18.34 Tb of sequencing data encompassing 17 ungulate species (1167 individuals; 15.72 ± 2.35 Gb per sample). After quality control, we removed 101.17 Gb of reads originating from the hosts’ genomes, retaining 18.01 Tb of host-free reads (99.45%; averaging 15.43 Gb per sample) for further analysis (Supplementary Data 3). All host-free reads underwent assembly and classification, resulting in the generation of 131,416 metagenome-assembled genomes (MAGs). This set included 88,343 low-quality MAGs (completeness < 50% or contamination ≥ 5%), 20,075 medium-quality MAGs (completeness ≥ 50% and contamination < 5%), and 22,998 high-quality MAGs (completeness ≥ 80% and contamination < 5%) (Fig. 2a and Supplementary Data 4). Among the high-quality MAGs, the average completeness and contamination were 90.51% and 1.44%, respectively (Supplementary Data 5). Notably, 83.71% of high-quality MAGs exhibited an assembly N50 length > 10 kb (Supplementary Data 5). Subsequently, we assessed the correlation (Spearman’s r) between relative abundances and completeness of MAGs.
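As a rough illustration of the quality tiers and the rank correlation described in this paragraph, the following Python sketch applies the stated completeness and contamination thresholds and computes Spearman's r as a Pearson correlation on ranks. The function names and the toy values are hypothetical and are not taken from the study's pipeline or data; the sketch also assumes the tiers are disjoint, so that a MAG meeting the high-quality threshold is not also counted as medium quality.

```python
def quality_tier(completeness, contamination):
    """Assign a MAG to a quality tier; both inputs are percentages."""
    if completeness < 50 or contamination >= 5:
        return "low"
    if completeness >= 80:
        return "high"
    # Assumption: medium quality means 50% <= completeness < 80%.
    return "medium"


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's r: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy values only (hypothetical): completeness %, contamination %,
# and relative abundance for five imaginary MAGs.
completeness = [95.2, 62.0, 41.3, 88.7, 83.4]
contamination = [1.1, 2.5, 0.9, 6.3, 0.4]
abundance = [3.4, 0.8, 0.2, 1.5, 0.6]

tiers = [quality_tier(c, k) for c, k in zip(completeness, contamination)]
r = spearman_r(abundance, completeness)
print(tiers)
print(round(r, 3))
```

Because the correlation is computed on ranks, it captures any monotonic association between relative abundance and completeness and is insensitive to the scale on which abundance is expressed.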
The results revealed a positive relationship between relative abundances and completeness of MAGs (r = 0.23, p < 0.01 for high-quality MAGs; r = 0.25, p < 0.01 for all MAGs) (Fig. 2c and Supplementary Fig. 1). Remarkably, owing to the high sequencing depth (15.72 ± 2.35 Gb per sample), many high-quality MAGs (n = 14,867) with a low relative abundance (log10(relative abundance) < 1) were completely assembled (Fig. 2c and Supplementary Data 4, 5). Together, these results highlight the overall high assembly quality of these MAGs.

Fig. 2. Metagenome-assembled genomes in the gut microbiota of the 17 ungulate species. a The number of high-quality, medium-quality, and low-quality MAGs. b The number of species-level genome bins (SGBs) in different hosts. c Correlation between the relative abundances and completeness of MAGs. The correlation (Spearman’s r) was measured for the high-quality MAGs. d The phylogenetic tree of 8489 nonredundant SGBs, including 8387 bacteria and 102 archaea, with taxonomies assigned by GTDB-Tk and colored with circles at the phylum level. The legend is arranged in decreasing order (top to bottom) of the number of bacteria and archaea detected in the corresponding phyla, respectively.

For redundancy removal, high-quality MAGs in each ungulate gut microbiota were dereplicated separately at the strain and species levels by dRep^44,45. In total, this process yielded 16,638 strain-level MAGs (ANI ≤ 99%) and 11,175 species-level genome bins (SGBs; ANI ≤ 95%). Based on GTDB-Tk assignments, 4084 SGBs had references in the Genome Taxonomy Database (GTDB) and were defined as