Abstract

Background & Aims

Limited understanding exists regarding the characteristics and biological significance of the salivary microbiome in healthy individuals experiencing physiological fatigue. This study aimed to delineate the structural and functional alterations in the salivary microbiome of healthy individuals undergoing physiological fatigue compared to energetic controls, and to explore its potential as a biomarker for fatigue status.

Methods

A cohort of 7 healthy individuals experiencing acute physiological fatigue (induced by prolonged study and confirmed via electroencephalography; Fatigue group, FTG) and 63 energetic healthy controls (Energetic group, ENG) were enrolled. Saliva samples were collected, from which microbial DNA was extracted. The V3–V4 hypervariable region of the 16S rRNA gene was subsequently sequenced using high-throughput technology. Bioinformatics analyses encompassed assessment of alpha and beta diversity, identification of differential taxa using Linear Discriminant Analysis Effect Size (LEfSe) with multi-method cross-validation, construction of microbial co-occurrence networks, and screening of fatigue-associated biomarker genera via the Boruta-SHAP algorithm. Microbial community phenotypes and potential functional pathways were predicted using BugBase and PICRUSt2, respectively.

Results

The FTG group exhibited significantly diminished alpha diversity (Simpson index, p=0.01071) relative to the ENG group. Beta diversity analysis demonstrated significant dissimilarities in microbial community structure between the groups (p<0.05). Taxonomic profiling revealed a significant enrichment in the relative abundance of potential periodontopathogenic genera, including Streptococcus and Filifactor, within the FTG group, concomitantly with a significant depletion of health-associated genera such as Rothia and Neisseria.
A predictive model constructed using the Boruta-SHAP algorithm, based on 15 key genera, effectively discriminated between fatigue and non-fatigue states, achieving an area under the receiver operating characteristic curve (AUC) of 0.948. Phenotypic predictions indicated a significant increase in the proportion of bacteria harboring Mobile Genetic Elements (MGEs) (p=0.048), alongside significant reductions in the proportion of aerobic bacteria (p=0.006) and biofilm-forming capacity (p=0.002) in the FTG group. Functional pathway analysis (PICRUSt2) revealed an enrichment of pathways such as "Neuroactive ligand-receptor interaction" in the FTG group, whereas pathways pertinent to energy metabolism (e.g., Citrate cycle (TCA cycle), Oxidative phosphorylation) and amino acid metabolism (e.g., Phenylalanine metabolism, Histidine metabolism) were significantly enriched in the ENG group.

Conclusion

This study provides novel evidence that physiological fatigue induces significant structural and functional alterations in the salivary microbiome of healthy individuals. These perturbations include diminished microbial diversity, disrupted community architecture, enrichment of potential opportunistic pathogens, and marked shifts in key metabolic pathways, particularly those governing neuroactivity and energy metabolism. These findings suggest that the salivary microbiome may be implicated in the physiological regulation of fatigue, potentially via an "oral-microbiome-brain axis," and underscore its potential as a source of non-invasive biomarkers for assessing fatigue status. Further mechanistic investigations are warranted to elucidate these interactions.

Keywords: fatigue, healthy individuals, salivary microbiome, 16S rRNA sequencing, biomarkers

1.
Introduction

Fatigue is defined as a self-reported functional impairment symptom characterized by limitations in physical and cognitive functions, and it typically involves complex mechanisms such as immune dysfunction, metabolic disorders, and regulation of the microbiota-gut-brain axis (Raizen et al., 2023). Fatigue can be classified as physiological or pathological. Physiological fatigue, which develops from physical activity, mental exertion, sleep deprivation, or infections, is relieved by rest and/or sleep. Pathological fatigue, resulting from conditions such as myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), rheumatic diseases, multiple sclerosis, Parkinson’s disease, and long COVID (postacute sequelae of SARS-CoV-2 infection, PASC), is only partially relieved by rest (Raizen et al., 2023). Recently, a systematic review and meta-analysis of the global prevalence of fatigue reported an average prevalence of 7.7% for chronic fatigue (pathological fatigue) and 24.2% for generalized fatigue (physiological fatigue) (Yoon et al., 2023). Fatigue has a substantial economic impact on society. For instance, it is estimated to cost employers over $136 billion annually in the United States due to the loss of productivity (Ricci et al., 2007). However, this estimate does not account for additional losses due to accidents related to fatigued driving (Zhang et al., 2022) and negative health outcomes associated with fatigue (Knoop et al., 2021). The impact of fatigue on healthcare is severely underestimated. Despite the high economic and social costs of fatigue, the mechanisms and biomarkers of fatigue under different health conditions remain unclear. Over the past two decades, remarkable advances have been made in microbiome research (Fremont et al., 2013; Nagy-Szakal et al., 2017; Guo et al., 2023; Xiong et al., 2023).
Several studies have investigated the relationship between the gut microbiome and fatigue (Fremont et al., 2013; Nagy-Szakal et al., 2017; Guo et al., 2023; Xiong et al., 2023). Previous research indicates that the gut microbiome composition in ME/CFS patients is altered, with a reduction in biodiversity; however, the precise relationship between bacterial composition and ME/CFS pathogenesis remains unclear. Dysbiosis may affect ME/CFS through the microbiota-gut-brain axis in several potential ways. These include (1) inflammation and immune activation: Dysbiosis can lead to increased intestinal permeability, commonly known as “leaky gut,” which allows bacteria or bacterial metabolites from the gut to enter the bloodstream. This can trigger immune responses and systemic inflammation, thereby affecting the brain and contributing to ME/CFS symptoms (Clapp et al., 2017); (2) neurotransmitter signaling: The gut microbiome plays a role in producing and regulating neurotransmitters. Dysbiosis can disrupt the production and balance of neurotransmitters such as serotonin (5-HT) and γ-aminobutyric acid (GABA), which are crucial for mood, cognition, and other brain functions. Alterations in neurotransmitter production and signaling may contribute to fatigue symptoms in ME/CFS patients (Loebel et al., 2016); (3) metabolite production: The gut microbiome produces various metabolites, including short-chain fatty acids (SCFAs), which can influence brain function and behavior. Dysbiosis may alter the production and availability of these metabolites, thereby potentially affecting gut-brain communication and leading to ME/CFS symptoms; and (4) activation of the immune-brain axis: Dysbiosis can activate the immune system, leading to the release of proinflammatory cytokines and other immune molecules.
These immune molecules can communicate with the brain through pathways such as the vagus nerve and immune cell transport, potentially affecting brain function and contributing to ME/CFS symptoms (Holzer et al., 2017; Arzani et al., 2020). While research on the microbiota-gut-brain axis has substantially enhanced our understanding of the interactions between the microbiota and fatigue, it has predominantly focused on the lower gastrointestinal tract, often overlooking another crucial environment: the oral microbiome. The oral cavity is the entry point for all substances (microorganisms and other substances) into the body and serves as the starting point of the digestive system. Similar to research on the gut microbiome, oral microbiome research is shifting toward a comprehensive understanding of its functions and interactions with the body (Baker et al., 2024). Recent findings indicate that the oral microbiome is not only a marker of oral health issues, such as dental caries and periodontal disease, but also a key player in systemic conditions, including obesity, diabetes, and neurological and psychiatric disorders (Wu et al., 2018; Cunha et al., 2019; Lin et al., 2019; Xue et al., 2020; Yang et al., 2021; Ahrens et al., 2022). Indeed, similar to the gut microbiome, the oral microbiome may also engage in complex bidirectional interactions with the brain and the central nervous system (CNS). The cascading effects of the oral microbiota and metabolites escaping into the brain can directly lead to the development of various diseases. For instance, in mice, the oral pathogen Streptococcus mutans can enter the bloodstream from the oral cavity and induce cerebral hemorrhage by disrupting the blood-brain barrier through its collagen-binding activity (Watanabe et al., 2016).
Similarly, Porphyromonas gingivalis, a bacterial species present in many individuals with poor oral health, may play a pivotal role in the development and progression of periodontal disease. Notably, P. gingivalis can enter the bloodstream, colonize the brain, and release neurotoxic proteases known as gingipains, which are implicated in Alzheimer’s disease progression (Lassalle et al., 2018). Recent studies have revealed how the oral microbiome negatively affects neurological processes and influences cognition and behavior. Analysis of oral microbiome metabolic pathways in smokers showed enrichment of neurotransmitter-related pathways, including tyrosine metabolism, glutamine-glutamate production, and the glutamatergic synapse. Smoking stimulates neurotransmitter production through the glutamine and glutamate pathways, thereby influencing reward circuitry in the brain. Thus, the oral microbiome can directly affect the reward pathways associated with smoking behavior and dependence, altering the typical interactions between the oral microbiome and the brain’s functional connectivity (Lin et al., 2019). In light of this research background, to determine the relationship between the oral microbiome and fatigue, we performed 16S rRNA high-throughput sequencing to analyze the oral microbiome composition in a cohort of fatigued healthy individuals. We further predicted the phenotypes and metabolic pathways of the oral microbiome in these fatigued subjects. By evaluating the effect of fatigue on the oral microbiome, we aimed to infer the potential implications of these changes for oral health, the CNS, and overall systemic health.

2. Materials and methods

2.1. Study design, participants, and assessment procedures

This study employed a prospective observational design, with the detailed protocol previously described (Xu et al., 2018; Xu et al., 2020). Seventy healthy university students, aged 18–50 years, were recruited.
Baseline assessments confirmed that all participants had sufficient sleep, exhibited normal awake electroencephalogram (EEG) patterns (absence of fatigue characteristics), and did not meet criteria for ME/CFS. Stringent inclusion and exclusion criteria (adapted from Breithaupt-Groegler et al., 2017, covering chronic fatigue history, recent infections, specific symptoms, medication use, smoking, and oral health) were rigorously applied. Ethical approval was obtained from the Ethics Committee of the Affiliated Hospital of Hebei University of Engineering (Handan, China; March 12, 2014; Clinical Trial Registration: ChiCTR-DCD-14005746), and written informed consent was secured from all participants. Subsequently, all 70 eligible participants underwent a standardized physiological fatigue induction protocol, involving continuous high-intensity cognitive tasks (“continuous study work”) in a quiet setting for at least 18 hours (actual range: 18–24 hours), with minimal necessary breaks. Immediately following the cognitive tasks, comprehensive subjective and objective fatigue assessments were conducted on all participants. Subjective fatigue was evaluated using a revised Piper Fatigue Scale (PFS; based on Piper et al., 1998), with item scores from 0–10; the overall average score categorized fatigue as mild (1–3.3), moderate (3.4–6.7), or severe (6.8–10), serving as a preliminary reference for grouping. Objective fatigue state was determined via EEG monitoring (SOLAR-RTA/BFM system), based on characteristic waveform changes compared to baseline (significant increase in slow-wave and/or decrease in fast-wave activity). Upon completion of these assessments, saliva samples were immediately collected from all 70 participants for subsequent microbiome analysis.
The final group assignment for the comparative analyses herein was strictly determined based on the objective EEG assessment results, although PFS scores provided valuable subjective context. The Fatigue Group (FTG; n=7) comprised participants whose post-protocol EEG clearly met the predefined objective fatigue criteria (typically corresponding with higher PFS scores, e.g., >6.7). Conversely, the Energetic Group (ENG; n=63) consisted of participants who, despite completing the same protocol, did not meet the objective EEG fatigue criteria (typically corresponding with lower PFS scores, e.g., ≤3.3).

2.2. Saliva sample collection

Saliva samples were collected from the subjects after 18 h of continuous study work. The subjects rinsed their mouths three times with sterile saline. Prior to sample collection, all subjects rinsed their mouths three times (1 min each time) with 30 mL of distilled water to remove food debris. After rinsing, each subject sat straight in a seat for 5 min, with their head tilted slightly forward and their eyes open. The subjects then chewed to stimulate salivary secretion. Once a sufficient amount of saliva had accumulated in the lower jaw, they placed their tongue against the palate and opened their mouth to allow the saliva to flow naturally into a 2 mL sterile centrifuge tube. Subsequently, the saliva samples from each participant were collected and immediately transported to the National Key Laboratory of Institute of Infectious Diseases Prevention and Control of Chinese Center for Disease Control and Prevention by a cold-chain shipping company and stored at -80°C.

2.3. DNA extraction and 16S rRNA gene sequencing

An aliquot of 500 μL saliva was centrifuged at 13,200 rpm for 10 min. The supernatant was discarded, and the precipitate was retained. Total saliva microbial DNA was extracted using the QIAamp® DNA Mini Kit (Qiagen, Germany) following the manufacturer’s instructions.
The quantity and quality of extracted DNA were assessed using a NanoDrop 2000 spectrophotometer and agarose gel electrophoresis, respectively. The extracted DNA was used as a template for the PCR amplification of the V3–V4 region of bacterial 16S rRNA genes with specific primers containing barcodes. The primer sequences were 341F (5′-CCTAYGGGRBGCASCAG-3′) and 806R (5′-GGACTACNNGGGGTATCTAAT-3′) (Youngseob et al., 2005). The amplicons were purified, quantified, and prepared for library construction, with all amplicons mixed in equal amounts. Library quality was assessed on the Qubit® 2.0 fluorometer (Thermo Scientific) and the Agilent Bioanalyzer 2100 system. The library was sequenced on an Illumina HiSeq 2000 platform (250 bp paired-end reads) at Novogene Bioinformatics Technology (Beijing, China).

2.4. Sequencing data processing and bioinformatics analysis

Raw sequencing reads were processed using the USEARCH pipeline (v11.0.667) for amplicon sequence analysis (Edgar, 2010). Key processing steps included demultiplexing reads based on barcodes and performing quality filtering. Subsequently, the UNOISE3 algorithm was employed for sequence denoising, merging of paired-end reads, and generation of non-chimeric Amplicon Sequence Variants (ASVs) (Edgar, 2016a). The UCHIME process inherently includes chimera detection and removal (Edgar, 2016b). Taxonomic assignment of the resulting representative ASV sequences was performed using the RDP Classifier (v2.13) against a relevant reference database (Cole et al., 2014). Alpha diversity indices (Shannon, Simpson, Chao1, and ACE) were calculated to assess within-sample microbial diversity after rarefying each sample to a depth of 10,000 reads. To evaluate beta diversity (between-sample community structure differences), the robust Aitchison distance, which is suitable for compositional microbiome data, was calculated using the vegan package (v2.6-6.1) in R (v4.2.3) (Clarke, 1993; Oksanen, 2024).
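As an illustration of the alpha diversity indices named above, the following is a minimal sketch of how Shannon, Simpson, and Chao1 values can be computed from one sample's ASV counts. It assumes the Simpson index is reported in its 1 − D form and that Chao1 uses the bias-corrected estimator; the paper does not state which variants were used, and the function names and toy counts are hypothetical.

```python
import math

def shannon_index(counts):
    """Shannon entropy H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_index(counts):
    """Simpson diversity in its 1 - sum(p_i^2) form (assumed variant)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

# Hypothetical toy samples: one even community, one dominated by a single taxon.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]

for name, sample in (("even", even_sample), ("dominated", dominated_sample)):
    print(name, shannon_index(sample), simpson_index(sample), chao1(sample))
```

In this toy case the dominated sample has a much lower Simpson value than the even one even though both contain four observed taxa, which is the kind of dominance effect the Simpson index is sensitive to.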
Non-metric Multidimensional Scaling (NMDS) ordination based on the Aitchison distance matrix was used to visualize community structures, and the Wilcoxon rank-sum test was applied to assess significant differences between groups in the NMDS ordination space. Additionally, Permutational Multivariate Analysis of Variance (PERMANOVA) and Multi-Response Permutation Procedures (MRPP) were conducted to formally test for significant differences in overall community composition between groups. PERMANOVA and MRPP were performed using the corresponding functions of the vegan package for R. Linear Discriminant Analysis Effect Size (LEfSe) (v1.1.2) was utilized to identify statistically significant differentially abundant taxa between groups, applying a Linear Discriminant Analysis (LDA) score threshold of > 3.0 (Segata et al., 2011). To ensure the robustness of the identified differential taxa, the results were cross-validated using multiple alternative methods, including ALDEx2 (Gloor, 2015), ANCOM-II (Mandal et al., 2015), MaAsLin3 (Nickols et al., 2024), PROC-GLM (Sunwoo et al., 2020), and ZicoSeq (Yang and Chen, 2023). Microbial co-occurrence networks were constructed using the R package ggClusterNet (v0.1.0). Network edges were determined based on SparCC correlations (|r| > 0.3, p < 0.05) between taxa (Wen et al., 2022). The functional potential of the microbial communities was predicted by inferring KEGG Orthology (KO) profiles from ASV data using PICRUSt2 (v1.7.2) (Douglas et al., 2020). Differential abundance analysis of predicted functional pathways between groups was performed using the LinDA method implemented within the ggpicrust2 package (v1.7.2), which also facilitated visualization (Yang et al., 2023).
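The edge rule stated above for the co-occurrence networks (|r| > 0.3, p < 0.05) can be sketched as a simple filter over a correlation matrix and its matching p-value matrix. This sketch does not estimate SparCC correlations itself; it assumes they have already been computed, and the taxon labels, matrices, and function names are hypothetical placeholders rather than values from the study.

```python
def filter_edges(taxa, corr, pvals, r_min=0.3, alpha=0.05):
    """Keep taxon pairs whose correlation passes both thresholds."""
    edges = []
    n = len(taxa)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only: undirected network
            r, p = corr[i][j], pvals[i][j]
            if abs(r) > r_min and p < alpha:
                sign = "positive" if r > 0 else "negative"
                edges.append((taxa[i], taxa[j], sign))
    return edges

def connectance(n_vertices, n_edges):
    """Edge density: realized edges divided by all possible undirected pairs."""
    possible = n_vertices * (n_vertices - 1) / 2
    return n_edges / possible if possible else 0.0

# Hypothetical precomputed correlations and p-values for four taxa.
taxa = ["taxon_a", "taxon_b", "taxon_c", "taxon_d"]
corr = [
    [1.00, 0.62, -0.45, 0.35],
    [0.62, 1.00, 0.10, -0.31],
    [-0.45, 0.10, 1.00, 0.29],
    [0.35, -0.31, 0.29, 1.00],
]
pvals = [
    [0.0, 0.001, 0.02, 0.20],
    [0.001, 0.0, 0.50, 0.04],
    [0.02, 0.50, 0.0, 0.01],
    [0.20, 0.04, 0.01, 0.0],
]

edges = filter_edges(taxa, corr, pvals)
density = connectance(len(taxa), len(edges))
print(edges, density)
```

Because connectance divides by the number of possible pairs, a network can gain edges and vertices while its density falls, provided the vertex count grows faster than the edge count.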
Furthermore, key phenotypic traits of the microbial communities, such as Gram staining properties, oxygen tolerance, and biofilm formation capacity, were inferred using BugBase (https://github.com/knights-lab/BugBase; Ward et al., 2017). To identify potential taxonomic biomarkers associated with fatigue status, a feature selection approach using the Boruta algorithm (implemented with LightGBM) was employed (Kursa and Rudnicki, 2010). SHAP (SHapley Additive exPlanations) analysis was subsequently applied to interpret the contribution of the selected features to the model, thereby enhancing model interpretability (Lundberg and Lee, 2017).

2.5. Statistical analysis

Baseline demographic characteristics were compared between the Fatigue Group (FTG) and the Energetic Group (ENG). Specifically, continuous variables (e.g., age) were compared between groups using PROC GLM (implemented via the sasLM package v0.10.3 in R v4.2.3). Post hoc analysis using Tukey’s Honestly Significant Difference (HSD) test was performed if required for multiple comparisons. Categorical variables (e.g., gender distribution) were compared using the Wilcoxon rank-sum test (or Chi-squared test, as appropriate for the data). Statistical methods for group comparisons related to bioinformatics analyses (e.g., comparisons of alpha diversity indices, PERMANOVA/MRPP tests for beta diversity, differential abundance analysis of taxa and functions) are detailed within their respective descriptions in Section 2.4. All statistical tests were two-sided where applicable. A p-value < 0.05 was considered statistically significant. Significance levels in the results are denoted as *p < 0.05, **p < 0.01, and ***p < 0.001.

3. Results

3.1.
Overview of the study cohort and sequencing data

A total of 70 healthy individuals aged 18–50 years were enrolled according to strict inclusion and exclusion criteria (Figure 1), including 7 individuals experiencing fatigue (FTG) and 63 individuals in an energized state (ENG). No significant differences in gender or age were observed between the two groups (Supplementary Table S1). Saliva samples were systematically collected from all participants and subjected to 16S rRNA amplicon sequencing on the Illumina HiSeq 2000 platform. Following rigorous data preprocessing and quality control procedures (including splicing, filtering, and chimera removal), 4,164,405 valid sequences were obtained in total, with each sample yielding an average of 59,492 ± 5,101 sequences. The average read length of the valid sequences was 425 base pairs. Subsequent taxonomic assignment revealed 1,204 ASVs, providing a comprehensive overview of the microbial profiles in these samples.

Figure 1. Schematic overview of the study workflow. Healthy volunteers (n=70) were recruited and assessed (including EEG), leading to categorization (energized=63, fatigue=7). Saliva samples underwent 16S rRNA gene sequencing. Bioinformatics analysis characterized the saliva microbiota to identify structural differences, fatigue-related features, and biomarkers, which were correlated with microbiota composition, predicted functions, and phenotypes. Key phases included Recruitment, Sampling and Sequencing, and Bioinformatics Analysis and Statistics.

3.2. Intra-variations in salivary microbial diversity between the FTG and ENG groups

To validate sequencing depth adequacy for microbial diversity assessment, rarefaction curves were initially constructed to confirm data saturation (Supplementary Figure S1).
Comparative analysis of α-diversity indices revealed a significant reduction in overall microbial diversity within the FTG group compared to controls (Supplementary Table S2). Regarding indices predominantly reflecting species richness, elevated values were observed in the FTG group for both the ACE (439.22 ± 59.84, p = 0.1868, Figure 2A) and Chao1 (438.67 ± 61.66, p = 0.7858, Figure 2B) estimators, although neither difference reached statistical significance. Interestingly, while the Simpson index (0.80 ± 0.08, p = 0.01071, Figure 2C) and Shannon index (2.85 ± 0.40, p = 0.1152, Figure 2D) integrate both species richness and evenness, only the Simpson index demonstrated statistically significant intergroup differences. This discrepancy suggests potential dominance of specific microbial taxa in the community structure of the FTG group.

Figure 2. Comparison of α-diversity indices of the oral microbiota between the FTG and ENG groups. (A) ACE index. (B) Chao1 index. (C) Shannon index. (D) Simpson index. Statistical comparisons were performed using the PROC GLM test, with significance levels denoted as follows: P < 0.05 (*) and P > 0.05 (ns).

3.3. Inter-variations in salivary microbial diversity between the FTG and ENG groups

To evaluate the similarity of salivary microbial communities, β-diversity analysis at the amplicon sequence variant (ASV) level was performed on 16S rRNA amplicon sequencing data using the vegan package (v2.6.4). A robust Aitchison distance matrix, optimized for compositional data analysis, was constructed to characterize microbial community structures. Non-metric multidimensional scaling (NMDS) was employed to visualize inter-group differences between FTG and ENG cohorts (Figure 3A).
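To make the compositional distance used above more concrete, here is a minimal sketch of the plain Aitchison distance: the Euclidean distance between centered log-ratio (CLR) transformed count vectors. This is a simplification and not the robust variant reported in the study, which treats zero counts differently; the pseudocount of 0.5 used here to avoid log(0) is an assumption, and the function names are hypothetical.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform: log counts minus their mean log."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(sample_a, sample_b, pseudocount=0.5):
    """Euclidean distance between the CLR-transformed samples."""
    clr_a = clr(sample_a, pseudocount)
    clr_b = clr(sample_b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(clr_a, clr_b)))

# Hypothetical count vectors over the same three taxa.
shallow = [10, 20, 30]
deep = [100, 200, 300]    # same proportions, ten times the sequencing depth
shifted = [30, 20, 10]    # different proportions

print(aitchison_distance(shallow, deep, pseudocount=0.0))
print(aitchison_distance(shallow, shifted, pseudocount=0.0))
```

With no zeros and no pseudocount, two samples that differ only in sequencing depth have a distance of essentially zero, which is the scale invariance that makes log-ratio distances suitable for compositional data.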
While the NMDS plot demonstrated partial overlap between groups, suggesting subtle overall differences in salivary microbiota, statistically significant distinctions were confirmed by the Wilcoxon rank-sum test (Figure 3B, p=5.29E-3), ANOSIM (permutations=999, p=0.029), and MRPP (permutations=9999, p=0.0419).

Figure 3. Comparison of β-diversity indices of the oral microbiota between the FTG and ENG groups. (A) Non-metric multidimensional scaling (NMDS) ordination derived from robust Aitchison dissimilarity distances for oral microbiota community (stress value = 0.099, k = 2). Colored ellipses indicate 95% confidence intervals for each group. (B) The boxplot illustrates the distribution of robust Aitchison distances for pairwise comparisons between samples from the FTG and ENG groups. Individual data points represent specific pairwise comparisons, with asterisks (**) denoting statistically significant differences between groups as determined by Wilcoxon rank-sum test (p < 0.01).

3.4. Taxonomic differences in the salivary microbiome of the FTG and ENG groups

To elucidate compositional differences in the salivary microbiota between energized (ENG) and fatigued (FTG) individuals at the phylum and genus levels, a systematic comparative analysis was conducted. First, Venn diagrams were employed to quantitatively assess shared and unique microbial taxonomic units between the groups (Figures 4A, B). Phylum-level analysis (Figure 4A) identified 14 phyla in total, with 12 shared by both groups. Two phyla (Chloroflexi and Elusimicrobia) were unique to the ENG group, resulting in a high shared proportion (85.71%) and indicating substantial similarity in core phylum composition. Genus-level analysis (Figure 4B), however, revealed that among the 157 identified genera, only 111 were shared.
The ENG group harbored substantially more unique genera (n=45) than the FTG group (n=1), with the shared proportion decreasing to 70.70%. This highlights increased inter-group divergence and greater microbial uniqueness within the ENG cohort at the genus level.

Figure 4. Comparison of salivary microbial community composition between ENG and FTG groups. (A, B) Venn diagrams assessing the number of shared and unique taxa at the phylum (A) and genus (B) levels. Numbers are counts; percentages indicate shared proportion. (C, D) Circos plots illustrating relative abundance and inter-group associations of major taxa at the top 10 phylum (C) and top 15 genus (D) levels in ENG (left) and FTG (right) groups. Outer arc length corresponds to relative abundance; inner ribbon width reflects association strength.

Second, Circos plots were utilized to visualize the relative abundance and inter-group associations of major taxa (Figures 4C, D). The phylum-level Circos plot (Figure 4C) displayed the top 10 most abundant phyla (cumulatively >99.9% relative abundance), confirming Firmicutes, Proteobacteria, Bacteroidetes, and Actinobacteria as the primary dominant phyla in both groups. The genus-level Circos plot (Figure 4D) focused on the top 15 most abundant genera (each >1% relative abundance, collectively >60% total relative abundance), clearly illustrating significant inter-group differentiation: Streptococcus exhibited markedly higher relative abundance in the FTG group, whereas Rothia and Neisseria were notably enriched in the ENG group. Furthermore, abundance differences for other major genera, including Gemella, Granulicatella, and Prevotella, were also depicted.

3.5.
Oral microbiome network in healthy individuals in fatigue state

To elucidate the interactions within the oral microbiome, we identified interactive networks within the groups (Figures 5A, B) and delineated differences in microbiome interactions between the FTG and ENG groups (Figure 5C). We found interactive networks within the oral microbiome of the ENG group (Figure 5A). This was confirmed by multiple network topological indices, whose values were greater than zero in the saliva samples of the ENG group, including the number of clusters (No. Clusters), number of edges (Num. Edges), number of positive edges (Num. Pos. Edges), number of negative edges (Num. Neg. Edges), number of vertices, diameter, average path length, and centralization betweenness (Supplementary Table S3); this finding indicated the existence of a complex network of microbiota in the ENG group (Figure 5A). The interacting microbiota within the networks of the ENG group was predominantly distributed across 6 phyla and 13 genera. These genera, ranked in descending order of interaction frequency, were as follows: Prevotella (40/112, 35.71%), Streptococcus (26/112, 23.21%), Lancefieldella (11/112, 9.82%), Leptotrichia (7/112, 6.25%), Porphyromonas (6/112, 5.36%), Lachnoanaerobaculum (5/112, 4.46%), Veillonella (5/112, 4.46%), Eubacterium (4/112, 3.57%), Saccharibacteria_genera_incertae_sedis (2/112, 1.79%), Unassigned (2/112, 1.79%), Aggregatibacter (1/112, 0.89%), Granulicatella (1/112, 0.89%), Neisseria (1/112, 0.89%), and Schaalia (1/112, 0.89%).

Figure 5. Salivary microbial network of the ENG and FTG groups. (A) Visualization of the salivary microbial network in the ENG group. (B) Visualization of the salivary microbial network in the FTG group. (C) Changes in network topology, including number of clusters (No. Clusters), number of edges (Num. Edges), average path length, and diameter between the two groups.
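The interaction-frequency percentages listed above for the ENG network follow directly from the per-genus counts and the total of 112. A short sketch of that arithmetic is given below; the counts are copied from the text, while the variable names are hypothetical.

```python
# Per-genus interaction counts for the ENG network, as listed in the text.
eng_edge_counts = {
    "Prevotella": 40,
    "Streptococcus": 26,
    "Lancefieldella": 11,
    "Leptotrichia": 7,
    "Porphyromonas": 6,
    "Lachnoanaerobaculum": 5,
    "Veillonella": 5,
    "Eubacterium": 4,
    "Saccharibacteria_genera_incertae_sedis": 2,
    "Unassigned": 2,
    "Aggregatibacter": 1,
    "Granulicatella": 1,
    "Neisseria": 1,
    "Schaalia": 1,
}

# The denominator is the sum of all listed counts.
total = sum(eng_edge_counts.values())

# Share of each genus as a percentage, rounded to two decimals as in the text.
shares = {genus: round(100 * count / total, 2)
          for genus, count in eng_edge_counts.items()}

for genus, share in shares.items():
    print(f"{genus}: {eng_edge_counts[genus]}/{total} = {share}%")
```

Running the same calculation on any listed count/total pair is a quick way to check that a reported percentage matches its fraction.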
Concurrently, we observed an increase in the network topological complexity in the FTG group, as indicated by metrics such as Num. Edges, Num. Pos. Edges, Num. Neg. Edges, number of vertices, and diameter (Figures 5B, C; Supplementary Table S3). The dashed line plot (Figure 5C) indicates a significant increase in both the number of edges and vertices in the network topology of the FTG group. The numbers of both positively and negatively correlated edges were significantly higher in the FTG group than in the ENG group. However, the FTG group showed a decrease in network topological indices such as connectance (edge density) and mean clustering coefficient (average CC), which suggested a reduction in network cohesiveness in this group. Specifically, the FTG group network included interacting taxa from 11 phyla and 33 genera. The top 15 genera based on interaction frequency were Streptococcus (123/710, 17.32%), Neisseria (54/710, 7.61%), Rothia (47/710, 6.62%), Schaalia (46/710, 6.48%), Prevotella (38/710, 5.35%), Veillonella (33/710, 4.65%), Leptotrichia (32/710, 4.51%), Porphyromonas (30/710, 4.23%), Gemella (28/710, 3.94%), Eubacterium (26/710, 3.66%), Oribacterium (26/710, 3.66%), Stomatobaculum (26/710, 3.66%), Lautropia (20/710, 2.82%), Granulicatella (19/710, 2.68%), and Haemophilus (18/710, 2.54%). These findings revealed that under fatigue conditions, the number of interacting taxa and the frequency of their interactions in the FTG group network significantly increased, along with a notable increase in both cooperative and competitive interactions among the taxa. However, despite the increased complexity of the network, its overall density decreased.

3.6.
Identification and validation of salivary microbiota biomarkers for fatigue status

To elucidate the distinctive salivary microbiome profiles associated with the fatigue state (FTG), we initially employed Linear Discriminant Analysis Effect Size (LEfSe) to compare microbial taxa with significantly different abundances between the FTG and ENG groups. The resulting cladogram illustrates the differentially abundant taxa hierarchically from phylum to genus level. Nineteen taxa were identified with significant differential abundance (LDA score > 2, P < 0.05) between the groups (Figure 6A). Specifically, the FTG group exhibited significant enrichment of Firmicutes (phylum), Bacilli (class), Streptococcaceae (family), Peptostreptococcaceae (family), Streptococcus (genus), Filifactor (genus), and Peptostreptococcaceae incertae sedis (unclassified genus). Conversely, the ENG group demonstrated significantly higher abundances of Actinobacteria (phylum), Actinobacteria (class), Micrococcales (order), Micrococcaceae (family), Proteobacteria (phylum), Betaproteobacteria (class), Neisseriales (order), Neisseriaceae (family), Rothia (genus), Neisseria (genus), Megasphaera (genus), and Flavobacteriaceae_Unassigned (unclassified genus).

Figure 6. Biomarker taxa of the salivary microbiota in the FTG and ENG groups. (A) Intergroup microbial community markers of the oral microbiota in the ENG and FTG groups based on the LEfSe analysis. (B) SHAP summary plots according to the Boruta-SHAP algorithm. Bar plot showing global SHAP values for feature importance, and beeswarm plot for the local SHAP values, showing the contribution of each genus to the fatigue predictions of the model. Features in both bar plot and beeswarm plot were ranked by mean absolute SHAP value, hence their rankings are identical. In the SHAP beeswarm plot, each point represents an individual in the training data.
The x-axis corresponds to the SHAP value, with vertical jitter indicating a high density of points. The color scale indicates the relative magnitude of each feature, with yellow indicating high values and purple indicating low values. The ROC curve was obtained for the 15 feature genera from the model based on the Boruta-SHAP algorithm. (C) Venn diagram and heatmap comparing genus-level taxa detected by seven differential analysis methods. The Venn diagram illustrates the overlap of genus-level taxa detected by each method. The heatmap displays the distribution patterns of the detected genus-level taxa across methods, allowing a visual comparison of their performance.

To investigate the predictive utility of the salivary microbiome in discriminating individual fatigue status, this study constructed a machine learning model based on genus-level taxonomy. The model used the Boruta algorithm for feature selection and SHAP (SHapley Additive exPlanations) analysis for model interpretation and key feature identification; its performance and robustness were assessed via cross-validation. The model’s predictive performance, evaluated using the receiver operating characteristic (ROC) curve (Figure 6B), yielded an area under the curve (AUC) of 0.948 (95% CI: 0.919–0.974), indicating excellent and statistically significant discrimination between fatigued and non-fatigued (energetic) individuals. Furthermore, a resampling-based assessment of model stability showed robust performance across key metrics (Supplementary Figure S2), with high consistency for specificity (mean ~0.95), negative predictive value (mean ~0.94), and AUC (mean ~0.95). Accuracy (mean ~0.90) and positive predictive value (mean ~0.84) were also good.
Although sensitivity (mean ~0.75) and F1 score (mean ~0.79) were comparatively lower and more widely distributed, suggesting room for improvement in identifying fatigued (positive) samples, the Matthews correlation coefficient (MCC, mean ~0.71), a balanced metric, nonetheless indicated reasonably good overall predictive capability.

To gain deeper insight into the model’s decision-making, SHAP analysis was used to visualize the contributions of the top 15 feature genera (Figure 6B). The SHAP summary plot (comprising a beeswarm plot and a bar plot) shows: (1) the feature importance ranking based on mean absolute SHAP values; (2) the direction of the effect of feature abundance (color: yellow = high, purple = low) on the predictive contribution (sign of the SHAP value); and (3) the pattern of the relationship between feature abundance and predictive impact (distribution of points). For instance, for Rothia, the most important feature, higher abundance was associated with a reduced prediction of fatigue (predominantly negative SHAP values), and its relatively symmetrical SHAP value distribution suggested an approximately linear relationship between its abundance and predicted fatigue risk. In summary, this study developed and validated a high-performance classifier based on the salivary microbiome that discriminated between fatigue statuses (AUC = 0.948). The model performed robustly under resampling, and SHAP analysis clarified how key microbial taxa (e.g., Rothia) and their abundances influenced predictions, enhancing model interpretability. These findings indicate that the identified key salivary microbial taxa hold potential value as non-invasive biomarkers for clinical fatigue risk assessment.
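The performance metrics reported above (sensitivity, specificity, positive and negative predictive value, accuracy, F1 score, and MCC) can all be derived from the four cells of a confusion matrix. The following is a minimal sketch of those definitions, not the authors' pipeline: the function names `safe_div` and `binary_metrics` are hypothetical, the positive class is assumed to be "fatigued", and the example counts are illustrative values rather than data from this study.

```python
import math


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute binary-classification metrics from confusion-matrix counts.

    Assumption: the positive class is 'fatigued' (FTG) and the
    negative class is 'energetic' (ENG).
    """
    # MCC denominator: geometric mean of the four marginal totals.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),   # true positive rate
        "specificity": safe_div(tn, tn + fp),   # true negative rate
        "ppv": safe_div(tp, tp + fp),           # positive predictive value
        "npv": safe_div(tn, tn + fn),           # negative predictive value
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for a single resampling fold; NOT data from the study.
example = binary_metrics(tp=6, fp=1, tn=19, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells, it stays informative when one class is much smaller than the other, which is why it serves as a balanced summary when sensitivity lags behind specificity.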
To validate the bacterial genera significantly associated with fatigue, we used a cross-method validation strategy based on seven distinct differential abundance analysis methods: LEfSe, ALDEx2, ANCOM-II, ZicoSeq, MaAsLin3, PROC-GLM, and the fatigue-associated Boruta-SHAP model described above. Venn diagrams and a heatmap were used to assess the consensus in genus-level taxa identification across these methods (Figure 6C). The Venn diagram illustrates the overlap of genera detected by each method, highlighting methodological concordance. The heatmap depicts the distribution patterns (e.g., detection status or statistical significance/effect size) of the consensus genera across the analytical approaches, enabling a visual comparison of their performance. Thirteen genera were detected by at least two methods. Notably, Rothia and Filifactor were each identified by six methods, followed by Neisseria and Streptococcus, each detected by five. Peptostreptococcaceae incertae sedis was detected by three methods. Eight additional genera (Megasphaera, Mycoplasma, Pyramidobacter, Treponema, Necropsobacter, Pseudoramibacter, Alloprevotella, and Flavobacteriaceae_Unassigned) were each detected by two methods. This multi-method validation strengthens the reliability of these microbial signatures as fatigue-associated biomarkers.

Furthermore, to visually compare the abundance distributions of potential microbial biomarkers in saliva samples from the ENG and FTG groups, box plots were generated (Figure 7). These plots show the relative abundances (Y-axis, log10 scale) of major microbial taxa at the phylum (left panel) and genus (right panel) levels in the two groups. At the phylum level, Firmicutes and Proteobacteria were the most dominant phyla in both groups.
Compared with the FTG group, the ENG group showed a trend toward slightly higher relative abundances of Actinobacteria and Proteobacteria, whereas the relative abundance of Firmicutes appeared slightly elevated in the FTG group. Spirochaetes and Tenericutes had low relative abundances in both cohorts. The genus-level comparison (right panel) revealed more pronounced inter-group differences. Specifically, the relative abundances of Rothia and Neisseria were markedly higher in the ENG group than in the FTG group. Conversely, the FTG group exhibited significantly higher relative abundances of Filifactor and Streptococcus. The distributions of numerous other genera, including Megasphaera and Peptostreptococcaceae incertae sedis, also differed between the groups to varying degrees. Although many genera were present at low overall abundance, their differential presence may still be biologically significant.

Figure 7. Comparison of salivary microbial community composition between the ENG and FTG groups at the phylum and genus levels. Box plots show the relative abundance (%) distribution (Y-axis, log10 scale) of major taxa at the phylum (left panel) and genus (right panel) levels. Orange boxes represent the ENG group, and blue boxes represent the FTG group. Each box indicates the interquartile range (IQR), the horizontal line within the box represents the median, and the whiskers extend to the furthest data points within 1.5 times the IQR from the box edge. Individual dots represent the relative abundance of the respective taxon in each sample.

Following the identification of key bacterial genera exhibiting significant abundance differences between the FTG and ENG groups (Figures 5, 6), we sought to gain deeper insight into the potential functional ramifications of these taxonomic shifts.
Consequently, we performed comprehensive functional annotation of these differential genera, with detailed results compiled in Supplementary Table S2. This annotation integrates multi-dimensional information, including the potential pro- or anti-inflammatory properties of each genus, their capacity for gamma-aminobutyric acid (GABA) metabolism, notable metabolite production (particularly short-chain fatty acids, SCFAs), ecological and adaptive traits (e.g., carriage of mobile genetic elements (MGEs), biofilm formation capabilities), and potential clinical relevance (e.g., associations with periodontal pathogenesis or systemic diseases). The information presented in Supplementary Table S1 was systematically curated from published literature and public databases (references in table footnotes). Overall, this table highlights that