Abstract

Background & Aims

   Limited understanding exists regarding the characteristics and
   biological significance of the salivary microbiome in healthy
   individuals experiencing physiological fatigue. This study aimed to
   delineate the structural and functional alterations in the salivary
   microbiome of healthy individuals undergoing physiological fatigue
   compared to energetic controls, and to explore its potential as a
   biomarker for fatigue status.

Methods

   A cohort of 7 healthy individuals experiencing acute physiological
   fatigue (induced by prolonged study and confirmed via
   electroencephalography; Fatigue group, FTG) and 63 energetic healthy
   controls (Energetic group, ENG) were enrolled. Saliva samples were
   collected, from which microbial DNA was extracted. The V3–V4
   hypervariable region of the 16S rRNA gene was subsequently sequenced
   using high-throughput technology. Bioinformatics analyses encompassed
   assessment of alpha and beta diversity, identification of differential
   taxa using Linear discriminant analysis Effect Size (LEfSe) with
   multi-method cross-validation, construction of microbial co-occurrence
   networks, and screening of fatigue-associated biomarker genera via the
   Boruta-SHAP algorithm. Microbial community phenotypes and potential
   functional pathways were predicted using BugBase and PICRUSt2,
   respectively.

Results

   The FTG group exhibited significantly diminished alpha diversity
   (Simpson index, p=0.01071) relative to the ENG group. Beta diversity
   analysis demonstrated significant dissimilarities in microbial
   community structure between the groups (p<0.05). Taxonomic profiling
   revealed a significant enrichment in the relative abundance of
   potential periodontopathogenic genera, including Streptococcus and
   Filifactor, within the FTG group, concomitantly with a significant
   depletion of health-associated genera such as Rothia and Neisseria. A
   predictive model constructed using the Boruta-SHAP algorithm, based on
   15 key genera, effectively discriminated between fatigue and
   non-fatigue states, achieving an area under the receiver operating
   characteristic curve (AUC) of 0.948. Phenotypic predictions indicated a
   significant increase in the proportion of bacteria harboring Mobile
   Genetic Elements (MGEs) (p=0.048), alongside significant reductions in
   the proportion of aerobic bacteria (p=0.006) and biofilm-forming
   capacity (p=0.002) in the FTG group. Functional pathway analysis
   (PICRUSt2) revealed an enrichment of pathways such as "Neuroactive
   ligand-receptor interaction" in the FTG group, whereas pathways
   pertinent to energy metabolism (e.g., Citrate cycle (TCA cycle),
   Oxidative phosphorylation) and amino acid metabolism (e.g.,
   Phenylalanine metabolism, Histidine metabolism) were significantly
   enriched in the ENG group.

Conclusion

   This study provides novel evidence that physiological fatigue induces
   significant structural and functional alterations in the salivary
   microbiome of healthy individuals. These perturbations include
   diminished microbial diversity, disrupted community architecture,
   enrichment of potential opportunistic pathogens, and marked shifts in
   key metabolic pathways, particularly those governing neuroactivity and
   energy metabolism. These findings suggest that the salivary microbiome
   may be implicated in the physiological regulation of fatigue,
   potentially via an "oral-microbiome-brain axis," and underscore its
   potential as a source of non-invasive biomarkers for assessing fatigue
   status. Further mechanistic investigations are warranted to elucidate
   these interactions.

   Keywords: fatigue, healthy individuals, salivary microbiome, 16S rRNA
   sequencing, biomarkers

1. Introduction

   Fatigue is defined as a self-reported functional impairment symptom
   characterized by limitations in physical and cognitive functions, and
   it typically involves complex mechanisms such as immune dysfunction,
   metabolic disorders, and regulation of the microbiota-gut-brain axis
   ([44]Raizen et al., 2023). Fatigue can be classified as physiological
   and pathological fatigue. Physiological fatigue, which develops from
   physical activity, mental exertion, sleep deprivation, or infections,
   is relieved by rest and/or sleep. Pathological fatigue, resulting from
   conditions such as myalgic encephalomyelitis/chronic fatigue syndrome
   (ME/CFS), rheumatic diseases, multiple sclerosis, Parkinson’s disease,
   and long COVID (postacute sequelae of SARS-CoV-2 infection, PASC), is
   only partially relieved by rest ([45]Raizen et al., 2023). Recently, a
   systematic review and meta-analysis of the global prevalence of fatigue
   reported an average prevalence of 7.7% for chronic fatigue
   (pathological fatigue) and 24.2% for generalized fatigue (physiological
   fatigue) ([46]Yoon et al., 2023). Fatigue has a substantial economic
   impact on society. For instance, it is estimated to cost employers over
   $136 billion annually in the United States due to the loss of
   productivity ([47]Ricci et al., 2007). However, this estimate does not
   account for additional losses due to accidents related to fatigued
   driving ([48]Zhang et al., 2022) and negative health outcomes
   associated with fatigue ([49]Knoop et al., 2021). The impact of fatigue
   on healthcare is severely underestimated. Despite the high economic and
   social costs of fatigue, the mechanisms and biomarkers of fatigue under
   different health conditions remain unclear.

   Over the past two decades, remarkable advances have been made in
   microbiome research ([50]Fremont et al., 2013; [51]Nagy-Szakal et al.,
   2017; [52]Guo et al., 2023; [53]Xiong et al., 2023). Several studies
   have investigated the relationship between the gut microbiome and
   fatigue ([54]Fremont et al., 2013; [55]Nagy-Szakal et al., 2017;
   [56]Guo et al., 2023; [57]Xiong et al., 2023). Previous research
   indicates that the gut microbiome composition in ME/CFS patients is
   altered, with a reduction in biodiversity; however, the precise
   relationship between bacterial composition and ME/CFS pathogenesis
   remains unclear. Dysbiosis may affect ME/CFS through the
   microbiota-gut-brain axis in several potential ways. These include (1)
   inflammation and immune activation: Dysbiosis can lead to increased
   intestinal permeability, commonly known as “leaky gut,” which allows
   bacteria or bacterial metabolites from the gut to enter the
   bloodstream. This can trigger immune responses and systemic
   inflammation, thereby affecting the brain and contributing to ME/CFS
   symptoms ([58]Clapp et al., 2017); (2) neurotransmitter signaling: The
   gut microbiome plays a role in producing and regulating
   neurotransmitters. Dysbiosis can disrupt the production and balance of
   neurotransmitters such as serotonin (5-HT) and γ-aminobutyric acid
   (GABA), which are crucial for mood, cognition, and other brain
   functions. Alterations in neurotransmitter production and signaling may
   contribute to fatigue symptoms in ME/CFS patients ([59]Loebel et al.,
   2016); (3) metabolite production: The gut microbiome produces various
   metabolites, including short-chain fatty acids (SCFAs), which can
   influence brain function and behavior. Dysbiosis may alter the
   production and availability of these metabolites, thereby potentially
   affecting gut-brain communication and leading to ME/CFS symptoms; and
   (4) activation of the immune-brain axis: Dysbiosis can activate the
   immune system, leading to the release of proinflammatory cytokines and
   other immune molecules. These immune molecules can communicate with the
   brain through pathways such as the vagus nerve and immune cell
   transport, potentially affecting brain function and contributing to
   ME/CFS symptoms ([60]Holzer et al., 2017; [61]Arzani et al., 2020).

   While research on the microbiota-gut-brain axis has substantially
   enhanced our understanding of the interactions between the microbiota
   and fatigue, it has predominantly focused on the lower gastrointestinal
   tract, often overlooking another crucial environment: the oral
   microbiome. The oral cavity is the entry point for all substances
   (microorganisms and other substances) into the body and serves as the
   starting point of the digestive system. Similar to research on the gut
   microbiome, oral microbiome research is shifting toward a comprehensive
   understanding of its functions and interactions with the body
   ([62]Baker et al., 2024). Recent findings indicate that the oral
   microbiome is not only a marker of oral health issues, such as dental
   caries and periodontal disease, but also a key player in systemic
   conditions, including obesity, diabetes, and neurological and
   psychiatric disorders ([63]Wu et al., 2018; [64]Cunha et al., 2019;
   [65]Lin et al., 2019; [66]Xue et al., 2020; [67]Yang et al., 2021;
   [68]Ahrens et al., 2022). Indeed, similar to the gut microbiome, the
   oral microbiome may also engage in complex bidirectional interactions
   with the brain and the central nervous system (CNS). The cascading
   effects of the oral microbiota and metabolites escaping into the brain
   can directly lead to the development of various diseases. For instance, in
   mice, the oral pathogen Streptococcus mutans can enter the bloodstream
   from the oral cavity and induce cerebral hemorrhage by disrupting the
   blood-brain barrier through its collagen-binding activity ([69]Watanabe
   et al., 2016). Similarly, Porphyromonas gingivalis, a bacterial species
   present in many individuals with poor oral health, may play a pivotal
   role in the development and progression of periodontal disease.
   Notably, P. gingivalis can enter the bloodstream, colonize the brain,
   and release neurotoxic proteases known as gingipains, which are
   implicated in Alzheimer’s disease progression ([70]Lassalle et al.,
   2018). Recent studies have revealed how the oral microbiome negatively
   affects neurological processes and influences cognition and behavior.
   The analysis of the oral microbiome metabolic pathways in smokers
   showed enrichment of the neurotransmitter-related pathways. These
   pathways include tyrosine metabolism and the production of
   glutamine-glutamate and glutamatergic synapses. Smoking stimulates
   neurotransmitter production through the glutamine and glutamate
   pathways, thereby influencing reward circuitry in the brain. Thus, the
   oral microbiome can directly affect the reward pathways associated with
   smoking behavior and dependence, altering the typical interactions
   between the oral microbiome and the brain’s functional connectivity
   ([71]Lin et al., 2019).

   In light of this research background, to determine the relationship
   between the oral microbiome and fatigue, we performed 16S rRNA
   high-throughput sequencing to analyze the oral microbiome composition
   in a cohort of fatigued healthy individuals. We further predicted the
   phenotypes and metabolic pathways of the oral microbiome in these
   fatigued subjects. By evaluating the effect of fatigue on the oral
   microbiome, we aimed to infer the potential implications of these
   changes for oral health, the CNS, and overall systemic health.

2. Materials and methods

2.1. Study design, participants, and assessment procedures

   This study employed a prospective observational design, with the
   detailed protocol previously described ([72]Xu et al., 2018; [73]Xu
   et al., 2020). Seventy healthy university students, aged 18–50 years,
   were recruited. Baseline assessments confirmed all participants had
   sufficient sleep, exhibited normal awake electroencephalogram (EEG)
   patterns (absence of fatigue characteristics), and did not meet
   criteria for ME/CFS. Stringent inclusion and exclusion criteria
   (adapted from Breithaupt-Groegler et al ([74]Breithaupt-Groegler
   et al., 2017), covering chronic fatigue history, recent infections,
   specific symptoms, medication use, smoking, and oral health) were
   rigorously applied. Ethical approval was obtained from the Ethics
   Committee of the Affiliated Hospital of Hebei University of Engineering
   (Handan, China; March 12, 2014; Clinical Trial Registration:
   ChiCTR-DCD-14005746), and written informed consent was secured from all
   participants. Subsequently, all 70 eligible participants underwent a
   standardized physiological fatigue induction protocol, involving
   continuous high-intensity cognitive tasks (“continuous study work”) in
   a quiet setting for at least 18 hours (actual range: 18–24 hours), with
   minimal necessary breaks.

   Immediately following the cognitive tasks, comprehensive subjective and
   objective fatigue assessments were conducted on all participants.
   Subjective fatigue was evaluated using a revised Piper Fatigue Scale
   (PFS; based on Piper et al ([75]Piper et al., 1998)), with item scores
   from 0-10; the overall average score categorized fatigue as mild
   (1-3.3), moderate (3.4-6.7), or severe (6.8-10), serving as a
   preliminary reference for grouping. Objective fatigue state was
   determined via EEG monitoring (SOLAR-RTA/BFM system), based on
   characteristic waveform changes compared to baseline (significant
   increase in slow-wave and/or decrease in fast-wave activity). Upon
   completion of these assessments, saliva samples were immediately
   collected from all 70 participants for subsequent microbiome analysis.
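
   The PFS grading rule described above can be sketched in Python as
   follows. This is an illustrative sketch, not the study's scoring
   procedure: the function name pfs_category, the rounding of the mean to
   one decimal place (so that the published bands leave no gaps), and the
   label "none" for means below 1 are all assumptions.

```python
def pfs_category(item_scores):
    """Classify overall fatigue from revised PFS item scores (each 0-10)."""
    if not item_scores:
        raise ValueError("at least one item score is required")
    if any(s < 0 or s > 10 for s in item_scores):
        raise ValueError("item scores must lie between 0 and 10")
    # Overall score is the mean of the item scores. Rounding to one decimal
    # is an assumption so the bands 1-3.3, 3.4-6.7, and 6.8-10 tile the range.
    mean_score = round(sum(item_scores) / len(item_scores), 1)
    if mean_score < 1.0:
        label = "none"  # assumption: the text defines no band below 1
    elif mean_score <= 3.3:
        label = "mild"
    elif mean_score <= 6.7:
        label = "moderate"
    else:
        label = "severe"
    return mean_score, label
```

   In this study the PFS category served only as a preliminary reference;
   the final group assignment relied on the EEG criteria.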

   The final group assignment for the comparative analyses herein was
   strictly determined based on the objective EEG assessment results,
   although PFS scores provided valuable subjective context. The Fatigue
   Group (FTG; n=7) comprised participants whose post-protocol EEG clearly
   met the predefined objective fatigue criteria (typically corresponding
   with higher PFS scores, e.g., >6.7). Conversely, the Energetic Group
   (ENG; n=63) consisted of participants who, despite completing the same
   protocol, did not meet the objective EEG fatigue criteria (typically
   corresponding with lower PFS scores, e.g., ≤3.3).

2.2. Saliva sample collection

   Saliva samples were collected from the subjects after 18 h of
   continuous study work. The subjects rinsed their mouths three times
   with sterile saline. Prior to sample collection, all subjects rinsed
   their mouth three times (1 min each time) with 30 mL of distilled water
   to remove food debris. After rinsing, each subject sat straight in a
   seat for 5 min, with their head tilted slightly forward and their eyes
   open. The subjects then chewed to stimulate salivary secretion. Once a
   sufficient amount of saliva had accumulated in the lower jaw, they
   placed their tongue against the palate and opened their mouth to allow
   the saliva to flow naturally into a 2 mL sterile centrifuge tube.
   Subsequently, the saliva samples from each participant were collected
   and immediately transported to the National Key Laboratory of Institute
   of Infectious Diseases Prevention and Control of Chinese Center for
   Disease Control and Prevention by a cold-chain shipping company and
   stored at -80°C.

2.3. DNA extraction and 16S rRNA gene sequencing

   An aliquot of 500 μL saliva was centrifuged at 13,200 rpm for 10 min.
   The supernatant was discarded, and the precipitate was retained. Total
   saliva microbial DNA was extracted using the QIAamp® DNA Mini Kit
   (Qiagen, Germany) following the manufacturer’s instructions. The
   quantity and quality of extracted DNA were assessed using a NanoDrop
   2000 spectrophotometer and agarose gel electrophoresis, respectively.
   The extracted DNA was used as a template for the PCR amplification of
   bacterial 16S rRNA genes of the V3–V4 region with specific primers
   containing barcodes. The primer sequences were 341F
   (5′-CCTAYGGGRBGCASCAG-3′) and 806R (5′-GGACTACNNGGGGTATCTAAT-3′)
   ([76]Youngseob et al., 2005). The amplicons were purified, quantified,
   and prepared for library construction, with all amplicons mixed in
   equal amounts. Library quality was assessed on the Qubit® 2.0
   fluorometer (Thermo Scientific) and the Agilent Bioanalyzer 2100
   system. The library was sequenced on an Illumina HiSeq 2000 platform
   (250 bp paired-end reads) at Novogene Bioinformatics Technology
   (Beijing, China).

2.4. Sequencing data processing and bioinformatics analysis

   Raw sequencing reads were processed using the USEARCH pipeline
   (v11.0.667) for amplicon sequence analysis ([77]Edgar, 2010). Key
   processing steps included demultiplexing reads based on barcodes and
   performing quality filtering. Subsequently, the UNOISE3 algorithm was
   employed for sequence denoising, merging of paired-end reads and
   generation of non-chimeric Amplicon Sequence Variants (ASVs)
   ([78]Edgar, 2016a). The UCHIME process inherently includes chimera
   detection and removal ([79]Edgar, 2016b). Taxonomic assignment of the
   resulting representative ASV sequences was performed using the RDP
   Classifier (v2.13) against a relevant reference database ([80]Cole
   et al., 2014).

   Alpha diversity indices (Shannon, Simpson, Chao1, and ACE) were
   calculated to assess within-sample microbial diversity after rarefying
   each sample to a depth of 10,000 reads. To evaluate beta diversity
   (between-sample community structure differences), the robust Aitchison
   distance, which is suitable for compositional microbiome data, was
   calculated using the vegan package (v2.6-6.1) in R (v4.2.3)
   ([81]Clarke, 1993; [82]Oksanen, 2024). Non-metric Multidimensional
   Scaling (NMDS) ordination based on the Aitchison distance matrix was
   used to visualize community structures, and the Wilcoxon rank-sum test
   was applied to assess significant differences between groups in the
   NMDS ordination space. Additionally, Permutational Multivariate
   Analysis of Variance (PERMANOVA) and Multi-Response Permutation
   Procedures (MRPP) were conducted to formally test for significant
   differences in overall community composition between groups. Both
   tests were run with the corresponding functions of the vegan package
   in R. Linear Discriminant Analysis Effect Size (LEfSe) (v1.1.2) was
   utilized to identify statistically significant differentially abundant
   taxa between groups, applying a Linear Discriminant Analysis (LDA)
   score threshold of > 3.0 ([83]Segata et al., 2011). To ensure the
   robustness of the identified differential taxa, the results were
   cross-validated using multiple alternative methods, including ALDEx2
   ([84]Gloor, 2015), ANCOM-II ([85]Mandal et al., 2015), MaAsLin3
   ([86]Nickols et al., 2024), PROC-GLM ([87]Sunwoo et al., 2020) and
   ZicoSeq ([88]Yang and Chen, 2023).
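
   Rarefying to a fixed depth, as applied before the alpha diversity
   calculation, amounts to drawing reads from each sample without
   replacement. The sketch below illustrates the idea under stated
   assumptions: the function name rarefy, the seed parameter, and the
   example read counts are hypothetical and are not taken from the study's
   pipeline.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: read_count} profile to `depth` reads without replacement."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    # Expand the profile into one label per read, then draw `depth` of them.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return dict(Counter(rng.sample(reads, depth)))

# Hypothetical sample with 50,000 reads, rarefied to the depth used in the text.
sample = {"Streptococcus": 30000, "Rothia": 15000, "Neisseria": 5000}
rarefied = rarefy(sample, depth=10000)
```

   Samples with fewer reads than the chosen depth cannot be rarefied and
   raise an error here; how such samples are handled in practice is a
   separate design decision.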

   Microbial co-occurrence networks were constructed using the R package
   ggClusterNet (v0.1.0). Network edges were determined based on SparCC
   correlations (|r| > 0.3, p < 0.05) between taxa ([89]Wen et al., 2022).
   The functional potential of the microbial communities was predicted by
   inferring KEGG Orthology (KO) profiles from ASV data using PICRUSt2
   (v1.7.2) ([90]Douglas et al., 2020). Differential abundance analysis of
   predicted functional pathways between groups was performed using the
   LinDA method implemented within the ggpicrust2 package (v1.7.2), which
   also facilitated visualization ([91]Yang et al., 2023). Furthermore,
   key phenotypic traits of the microbial communities, such as Gram
   staining properties, oxygen tolerance, and biofilm formation capacity,
   were inferred using BugBase
   ([92]https://github.com/knights-lab/BugBase) ([93]Ward et al., 2017).

   To identify potential taxonomic biomarkers associated with fatigue
   status, a feature selection approach using the Boruta algorithm
   (implemented with LightGBM) was employed ([95]Kursa and Rudnicki,
   2010). SHAP (SHapley Additive exPlanations) analysis was subsequently
   applied to interpret the contribution of the selected features to the
   model, thereby enhancing model interpretability ([94]Lundberg and Lee,
   2017).

2.5. Statistical analysis

   Baseline demographic characteristics were compared between the Fatigue
   Group (FTG) and the Energetic Group (ENG). Specifically, continuous
   variables (e.g., age) were compared between groups using PROC GLM
   (implemented via the sasLM package v0.10.3 in R v4.2.3). Post hoc
   analysis using Tukey’s Honestly Significant Difference (HSD) test was
   performed if required for multiple comparisons. Categorical variables
   (e.g., gender distribution) were compared using the Wilcoxon rank-sum
   test (or Chi-squared test, as appropriate for the data).

   Statistical methods for group comparisons related to bioinformatics
   analyses (e.g., comparisons of alpha diversity indices, PERMANOVA/MRPP
   tests for beta diversity, differential abundance analysis of taxa and
   functions) are detailed within their respective descriptions in Section
   2.4.

   All statistical tests were two-sided where applicable. A p-value < 0.05
   was considered statistically significant. Significance levels in the
   results are denoted as *p < 0.05, **p < 0.01, and ***p < 0.001.

3. Results

3.1. Overview of the study cohort and sequencing data

   A total of 70 healthy individuals aged 18–50 years were enrolled
   according to strict inclusion and exclusion criteria ([96]Figure 1),
   including 7 individuals experiencing fatigue (FTG) and 63 individuals
   in an energized state (ENG). No significant differences in gender or
   age were observed between the two groups ([97]Supplementary Table S1).
   Saliva samples were systematically collected from all participants
   and subjected to 16S rRNA amplicon sequencing on the Illumina HiSeq
   2000 platform. Following rigorous data preprocessing and quality
   control procedures—including splicing, filtering, and chimera
   removal—4,164,405 valid sequences were obtained in total, with each
   sample yielding an average of 59,492 ± 5,101 sequences. The average
   read length of the valid sequences was 425 base pairs. Subsequent
   taxonomic assignment revealed 1,204 ASVs, providing a comprehensive
   overview of the microbial profiles in these samples.

Figure 1.

   Schematic overview of the study workflow. Healthy volunteers (n=70)
   were recruited and assessed (including EEG), leading to categorization
   (energized=63, fatigue=7). Saliva samples underwent 16S rRNA gene
   sequencing. Bioinformatics analysis characterized the saliva microbiota
   to identify structural differences, fatigue-related features, and
   biomarkers, which were correlated with microbiota composition,
   predicted functions, and phenotypes. Key phases included Recruitment,
   Sampling and Sequencing, and Bioinformatics Analysis and Statistics.

3.2. Intra-variations in salivary microbial diversity between the FTG and ENG
groups

   To validate sequencing depth adequacy for microbial diversity
   assessment, rarefaction curves were initially constructed to confirm
   data saturation ([100]Supplementary Figure S1). Comparative analysis
   of α-diversity indices revealed a significant reduction in overall
   microbial diversity within the FTG group compared to controls
   ([101]Supplementary Table S2). Regarding indices predominantly
   reflecting species richness, the FTG group showed numerically higher
   but not significantly different values for both the ACE (439.22 ±
   59.84, p = 0.1868, [102]Figure 2A) and Chao1 (438.67 ± 61.66, p =
   0.7858, [103]Figure 2B) estimators. Interestingly, while the Simpson
   index (0.80 ± 0.08, p = 0.01071, [104]Figure 2C) and Shannon index
   (2.85 ± 0.40, p = 0.1152, [105]Figure 2D) integrate both species
   richness and evenness, only the
   Simpson index demonstrated statistically significant intergroup
   differences. This discrepancy suggests potential dominance of specific
   microbial taxa in community structure of the FTG group.
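
   To illustrate how dominance by a few taxa lowers evenness-sensitive
   indices while richness stays the same, the sketch below computes the
   Shannon and Simpson indices for two hypothetical communities. It
   assumes the Simpson index is reported in its 1 - D (Gini-Simpson) form,
   which is consistent with lower values indicating lower diversity; the
   counts are invented for illustration and are not study data.

```python
import math

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson(counts):
    """Gini-Simpson index 1 - sum(p_i^2); higher values mean a more even community."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Two hypothetical communities with identical richness (5 taxa each).
even = [20, 20, 20, 20, 20]
dominated = [80, 5, 5, 5, 5]

for name, community in (("even", even), ("dominated", dominated)):
    print(name, round(shannon(community), 3), round(simpson(community), 3))
```

   Because the Simpson index weights each taxon by its squared
   proportion, it responds strongly to the most abundant taxa, which is
   one way a Simpson difference can arise without a matching difference
   in richness estimators.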

Figure 2.

   Comparison of α-diversity indices of the oral microbiota between the
   FTG and ENG groups. (A) ACE index. (B) Chao1 index. (C) Shannon index.
   (D) Simpson index. Statistical comparisons were performed using the
   PROC GLM test, with significance levels denoted as follows: P < 0.05
   (*) and P > 0.05 (ns).

3.3. Inter-variations in salivary microbial diversity between the FTG and ENG
groups

   To evaluate the similarity of salivary microbial communities,
   β-diversity analysis at the amplicon sequence variant (ASV) level was
   performed on 16S rRNA amplicon sequencing data using the vegan package
   (v2.6.4). A Robust Aitchison distance matrix, optimized for
   compositional data analysis, was constructed to characterize microbial
   community structures. Non-metric multidimensional scaling (NMDS) was
   employed to visualize inter-group differences between FTG and ENG
   cohorts ([108]Figure 3A). While the NMDS plot showed partial
   overlap between groups, suggesting subtle overall differences in
   salivary microbiota, statistically significant distinctions were
   confirmed by the Wilcoxon rank-sum test ([109]Figure 3B, p=5.29E-3),
   ANOSIM (permutations=999, p=0.029), and MRPP (permutations=9999,
   p=0.0419).
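
   The idea behind a robust Aitchison distance can be sketched as a robust
   centered log-ratio (rclr) transform followed by a Euclidean distance.
   The version below is a simplified illustration, not the vegan
   implementation: leaving unobserved taxa at zero after the transform is
   an assumption of this sketch, and actual implementations may treat
   zeros differently (for example through matrix completion). The
   function names and counts are hypothetical.

```python
import math

def rclr(counts):
    """Robust CLR: log of each nonzero count minus the mean log of nonzero counts.

    Zero counts are left at 0.0 after the transform (an assumption of this sketch).
    """
    logs = [math.log(c) for c in counts if c > 0]
    if not logs:
        raise ValueError("sample has no observed taxa")
    center = sum(logs) / len(logs)
    return [math.log(c) - center if c > 0 else 0.0 for c in counts]

def robust_aitchison(sample_a, sample_b):
    """Euclidean distance between the rclr-transformed samples."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same taxa in the same order")
    a, b = rclr(sample_a), rclr(sample_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical counts: the second sample is the first sequenced 10x deeper.
shallow = [10, 20, 0, 40]
deep = [100, 200, 0, 400]
shifted = [40, 20, 0, 10]
```

   Because the transform works on log-ratios, samples that differ only in
   sequencing depth (shallow and deep above) are at distance zero up to
   floating-point error, whereas a change in relative composition
   (shifted) gives a positive distance.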

Figure 3.

   Comparison of β-diversity indices of the oral microbiota between the
   FTG and ENG groups. (A) Non-metric multidimensional scaling (NMDS)
   ordination derived from robust Aitchison dissimilarity distances for
   oral microbiota community (stress value = 0.099, k = 2). Colored
   ellipses indicate 95% confidence intervals for each group. (B) The
   boxplot illustrates the distribution of robust Aitchison distances for
   pairwise comparisons between samples from the FTG and ENG groups.
   Individual data points represent specific pairwise comparisons, with
   asterisks (**) denoting statistically significant differences between
   groups as determined by Wilcoxon rank-sum test (p < 0.01).

3.4. Taxonomic differences in the salivary microbiome of the FTG and ENG
groups

   To elucidate compositional differences in the salivary microbiota
   between energized (ENG) and fatigued (FTG) individuals at the phylum
   and genus levels, a systematic comparative analysis was conducted.
   First, Venn diagrams were employed to quantitatively assess shared and
   unique microbial taxonomic units between the groups ([112]Figures 4A,
   B). Phylum-level analysis ([113]Figure 4A) identified 14 phyla in
   total, with 12 shared by both groups. Two phyla (Chloroflexi and
   Elusimicrobia) were unique to the ENG group, resulting in a high shared
   proportion (85.71%) and indicating substantial similarity in core
   phylum composition. Genus-level analysis ([114]Figure 4B), however,
   revealed that among the 157 identified genera, only 111 were shared.
   The ENG group harbored considerably more unique genera (n=45) compared
   to the FTG group (n=1), with the shared proportion decreasing to
   70.70%. This highlights increased inter-group divergence and greater
   microbial uniqueness within the ENG cohort at the genus level.
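
   The shared proportions reported above follow from dividing the number
   of shared taxa by the total number of taxa observed in either group.
   The sketch below reproduces that arithmetic with Python sets; the
   function name shared_proportion and the placeholder labels are
   hypothetical, and only the counts come from the text.

```python
def shared_proportion(taxa_a, taxa_b):
    """Return (n_shared, n_only_a, n_only_b, percent_shared_of_union)."""
    a, b = set(taxa_a), set(taxa_b)
    union = a | b
    if not union:
        raise ValueError("both groups are empty")
    shared = a & b
    pct = round(100.0 * len(shared) / len(union), 2)
    return len(shared), len(a - b), len(b - a), pct

# Placeholder labels that reproduce the reported genus counts
# (111 shared, 45 ENG-only, 1 FTG-only); these are not real taxon names.
shared_genera = {f"shared_{i}" for i in range(111)}
eng_genera = shared_genera | {f"eng_only_{i}" for i in range(45)}
ftg_genera = shared_genera | {"ftg_only_0"}

print(shared_proportion(eng_genera, ftg_genera))
```

   The same calculation with 12 shared phyla and 2 ENG-only phyla gives
   the phylum-level proportion. Note that the ENG group contains nine
   times as many participants as the FTG group, so a larger number of
   group-specific genera in ENG is expected from sample size alone.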

Figure 4.

   Comparison of salivary microbial community composition between ENG and
   FTG groups. (A, B) Venn diagrams assessing the number of shared and
   unique taxa at the phylum (A) and genus (B) levels. Numbers are counts;
   percentages indicate shared proportion. (C, D) Circos plots
   illustrating relative abundance and inter-group associations of major
   taxa at the top 10 phylum (C) and top 15 genus (D) levels in ENG (left)
   and FTG (right) groups. Outer arc length corresponds to relative
   abundance; inner ribbon width reflects association strength.

   Second, Circos plots were utilized to visualize the relative abundance
   and inter-group associations of major taxa ([117]Figures 4C, D). The
   phylum-level Circos plot ([118]Figure 4C) displayed the top 10 most
   abundant phyla (cumulatively >99.9% relative abundance), confirming
   Firmicutes, Proteobacteria, Bacteroidetes, and Actinobacteria as the
   primary dominant phyla in both groups. The genus-level Circos plot
   ([119]Figure 4D) focused on the top 15 most abundant genera (each >1%
   relative abundance, collectively >60% total relative abundance),
   clearly illustrating significant inter-group differentiation:
   Streptococcus exhibited markedly higher relative abundance in the FTG
   group, whereas Rothia and Neisseria were notably enriched in the ENG
   group. Furthermore, abundance differences for other major genera,
   including Gemella, Granulicatella, and Prevotella, were also depicted.

3.5. Oral microbiome network in healthy individuals in fatigue state

   To elucidate the interactions within the oral microbiome, we identified
   interactive networks within the groups ([120]Figures 5A, B) and
   delineated differences in microbiome interactions between the FTG and
   ENG groups ([121]Figure 5C). We found interactive networks within the
   oral microbiome of the ENG group ([122]Figure 5A). This was confirmed
   by multiple network topological indices, whose values were greater than
   zero in the saliva samples of the ENG group, including the number of
   clusters (No. Clusters), number of edges (Num. Edges), number of
   positive edges (Num. Pos. Edges), number of negative edges (Num. Neg.
   Edges), number of vertices, diameter, average path length, and
   centralization betweenness ([123]Supplementary Table S3); this
   finding indicated the existence of a complex network of microbiota in
   the ENG group ([124]Figure 5A). The interacting microbiota within the
   networks of the ENG group was predominantly distributed across 6 phyla
   and 13 genera. These genera, ranked in descending order of interaction
   frequency, were as follows: Prevotella (40/112, 35.71%), Streptococcus
   (26/112, 23.21%), Lancefieldella (11/112, 9.82%), Leptotrichia (7/112,
   6.25%), Porphyromonas (6/112, 5.36%), Lachnoanaerobaculum (5/112,
   4.46%), Veillonella (5/112, 4.46%), Eubacterium (4/112, 3.57%),
   Saccharibacteria_genera_incertae_sedis (2/112, 1.79%), Unassigned
   (2/112, 1.79%), Aggregatibacter (1/112, 0.89%), Granulicatella (1/112,
   0.89%), Neisseria (1/112, 0.89%), and Schaalia (1/112, 0.89%).

Figure 5.

   Salivary microbial network of the ENG and FTG groups. (A) Visualization
   of the salivary microbial network in the ENG group. (B) Visualization
   of the salivary microbial network in the FTG group. (C) Changes in
   network topology, including number of clusters (No. Clusters), number
   of edges (Num. Edges), average path length, and diameter between the
   two groups.

   Concurrently, we observed an increase in the network topological
   complexity in the FTG group, as indicated by metrics such as Num.
   Edges, Num. Pos. Edges, Num. Neg. Edges, number of vertices, and
   diameter ([127]Figures 5B, C; [128]Supplementary Table S3). The
   dashed line plot ([129]Figure 5C) indicates a significant increase in
   both the number of edges and vertices in the network topology of the
   FTG group. The numbers of both positively and negatively correlated
   edges were significantly higher in the FTG group than in the ENG group.
   However, the FTG group showed a decrease in network topological indices
   such as connectance (edge density) and mean clustering coefficient
   (average CC), which suggested a reduction in network cohesiveness in
   this group.
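
   A network can gain edges and vertices and still lose connectance,
   because the number of possible edges grows roughly with the square of
   the number of vertices. The sketch below shows this with the standard
   density formula for an undirected simple graph; the formula choice and
   the network sizes are assumptions made for illustration and are not
   the values in Supplementary Table S3.

```python
def connectance(num_vertices, num_edges):
    """Edge density of an undirected simple graph: E / (V * (V - 1) / 2)."""
    if num_vertices < 2:
        raise ValueError("need at least two vertices")
    possible = num_vertices * (num_vertices - 1) / 2
    if num_edges > possible:
        raise ValueError("more edges than vertex pairs")
    return num_edges / possible

# Hypothetical sizes chosen for illustration only.
small = connectance(num_vertices=20, num_edges=40)    # 40 of 190 possible edges
large = connectance(num_vertices=100, num_edges=300)  # 300 of 4950 possible edges
```

   Here the larger network has more than seven times as many edges as
   the smaller one, yet its connectance is lower.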

   Specifically, the FTG group network included interacting species from
   11 phyla and 33 genera. The top 15 genera based on interaction
   frequency were Streptococcus (123/710, 17.32%), Neisseria (54/710,
   7.61%), Rothia (47/710, 6.62%), Schaalia (46/710, 6.48%), Prevotella
   (38/710, 5.35%), Veillonella (33/710, 4.65%), Leptotrichia (32/710,
   4.51%), Porphyromonas (30/710, 4.23%), Gemella (28/710, 3.94%),
   Eubacterium (26/710, 3.66%), Oribacterium (26/710, 3.66%),
   Stomatobaculum (26/710, 3.66%), Lautropia (20/710, 2.82%),
   Granulicatella (19/710, 2.68%), and Haemophilus (18/710, 2.56%). These
   findings indicate that under fatigue conditions, both the number of
   interacting genera and the frequency of their interactions in the FTG
   network increased significantly, with a notable increase in both
   positive (putatively cooperative) and negative (putatively
   competitive) associations among taxa. However, despite the greater
   complexity of the network, its overall density decreased.

3.6. Identification and validation of salivary microbiota biomarkers for
fatigue status

   To elucidate the distinctive salivary microbiome profiles associated
   with the fatigue state (FTG), we initially employed Linear Discriminant
   Analysis Effect Size (LEfSe) to compare microbial taxa with
   significantly different abundances between the FTG and ENG groups. The
   resulting cladogram illustrates the differentially abundant taxa
   hierarchically from phylum to genus level. Nineteen taxa were
   identified with significant differential abundance (LDA score > 2, P <
   0.05) between the groups ([130]Figure 6A). Specifically, the FTG
   group exhibited significant enrichment of Firmicutes (phylum),
   Bacilli (class), Streptococcaceae (family), Peptostreptococcaceae
   (family), Streptococcus (genus), Filifactor (genus), and
   Peptostreptococcaceae incertae sedis (unclassified genus).
   Conversely, the ENG group demonstrated significantly higher abundances
   of Actinobacteria (phylum), Actinobacteria (class), Micrococcales
   (order), Micrococcaceae (family), Proteobacteria (phylum),
   Betaproteobacteria (class), Neisseriales (order), Neisseriaceae
   (family), Rothia (genus), Neisseria (genus), Megasphaera (genus), and
   Flavobacteriaceae_Unassigned (unclassified genus).

Figure 6.


   Biomarker taxa of the salivary microbiota in the FTG and ENG groups.
   (A) Intergroup microbial community markers of the oral microbiota in
   the ENG and FTG groups based on the LEfSe analysis. (B) SHAP summary
   plots according to the Boruta-SHAP algorithm. Bar plot showing global
   SHAP values for feature importance, and beeswarm plot for the local
   SHAP values, showing the contribution of each genus to the fatigue
   predictions of the model. Features in both bar plot and beeswarm plot
   were ranked by mean absolute SHAP value, hence their rankings are
   identical. In the SHAP beeswarm plot, each point represents an
   individual in the training data. The x-axis corresponds to the SHAP
   value, with vertical jitter indicating a high density of points. The
   color scale indicates the relative magnitude of each feature, with
   yellow indicating high values and purple low values. The ROC curve
   was obtained for the 15 feature genera from the model based on the
   Boruta-SHAP algorithm. (C) Venn diagram and heatmap visualizations
   comparing genus-level taxa detected by seven differential analysis
   methods. The Venn diagram illustrates the overlap of genus-level taxa
   detected by each method. The heatmap displays the distribution patterns
   of the detected genera across the different methods, allowing for a
   visual comparison of their performance.
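
   The caption states that features are ranked by mean absolute SHAP
   value. A minimal sketch of that ranking step follows. The per-sample
   SHAP values are invented for illustration and are not output from
   the study's model, and rank_by_mean_abs_shap is a hypothetical helper
   rather than a function of the SHAP library.

```python
def rank_by_mean_abs_shap(shap_values):
    """Rank features by mean |SHAP|.

    shap_values maps a feature (genus) name to its per-sample SHAP values.
    Returns (genus, mean_abs_shap) pairs, most important first.
    """
    importance = {
        genus: sum(abs(v) for v in values) / len(values)
        for genus, values in shap_values.items()
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-sample SHAP values for three genera (four samples each).
toy_shap = {
    "Rothia": [-0.30, -0.25, 0.20, 0.35],
    "Filifactor": [0.10, -0.05, 0.15, -0.10],
    "Neisseria": [-0.20, 0.10, -0.15, 0.05],
}

ranking = rank_by_mean_abs_shap(toy_shap)
for genus, score in ranking:
    print(f"{genus}: mean |SHAP| = {score:.3f}")
```

   Because the sign is discarded before averaging, a genus whose SHAP
   values are large in both directions still ranks highly; the
   direction of its effect is read from the beeswarm plot instead.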

   To investigate the predictive utility of the salivary microbiome in
   discriminating individual fatigue status, we constructed a machine
   learning model based on genus-level taxonomy. The model employed the
   Boruta algorithm for feature selection and integrated SHAP (SHapley
   Additive exPlanations) analysis for model interpretation and key
   feature identification, and its performance and robustness were
   assessed via cross-validation. The model’s predictive efficacy,
   evaluated using the Receiver Operating Characteristic (ROC) curve
   ([133]Figure 6B), yielded an Area Under the Curve (AUC) of 0.948
   (95% CI: 0.919-0.974), indicating excellent discrimination between
   fatigued and energetic individuals.

   Assessment of model stability through resampling revealed robust
   performance across key metrics ([134]Supplementary Figure S2).
   Consistency was highest for Specificity (mean ~0.95), Negative
   Predictive Value (mean ~0.94), and AUC (mean ~0.95). Accuracy (mean
   ~0.90) and Positive Predictive Value (mean ~0.84) also showed good
   performance. Sensitivity (mean ~0.75) and F1 Score (mean ~0.79) were
   comparatively lower, with slightly wider distributions, suggesting
   room for improvement in identifying fatigued (positive) samples.
   Nonetheless, the Matthews Correlation Coefficient (MCC, mean ~0.71),
   a balanced metric, confirmed the model’s reasonably good overall
   predictive capability.

   To gain deeper insight into the model’s decision-making, SHAP
   analysis was used to visualize the contributions of the top 15
   feature genera ([135]Figure 6B). The SHAP summary plot (comprising a
   beeswarm plot and a bar plot) shows: (1) the feature importance
   ranking based on mean absolute SHAP values; (2) the direction of the
   effect of feature abundance (color: yellow=high, purple=low) on the
   prediction (sign of the SHAP value); and (3) the pattern of the
   relationship between feature abundance and predictive impact
   (distribution of points). For instance, for Rothia, the most
   important feature, higher abundance was associated with a reduced
   prediction of fatigue (predominantly negative SHAP values), and its
   relatively symmetrical SHAP value distribution suggested an
   approximately linear relationship between its abundance and
   predicted fatigue risk.

   In summary, we developed and cross-validated a high-performance
   classifier based on the salivary microbiome that discriminated
   between fatigue statuses (AUC=0.948). The model demonstrated robust
   performance, and SHAP analysis clarified how key microbial taxa
   (e.g., Rothia) and their abundances affected predictions, enhancing
   model interpretability. These findings indicate that the identified
   key salivary microbial taxa hold potential value as non-invasive
   biomarkers for clinical fatigue risk assessment.
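
   The performance metrics reported above are all functions of a single
   confusion matrix. The sketch below shows the standard definitions,
   with fatigued samples as the positive class. The counts are
   hypothetical, chosen only so the outputs fall in a similar range to
   the reported means; they are not the study's confusion matrix. The
   function assumes every row and column of the matrix is non-empty, so
   no denominator is zero.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # recall for the positive class
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Hypothetical counts (NOT from the study): 20 fatigued, 60 energetic samples.
metrics = classification_metrics(tp=15, fp=3, tn=57, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

   With imbalanced classes, accuracy and specificity can remain high
   while sensitivity lags, which is why a balanced summary such as MCC
   is informative alongside them.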

   To robustly validate the bacterial genera significantly associated with
   fatigue, we employed a multi-method cross-validation strategy using
   seven distinct differential abundance analysis methods: LEfSe, ALDEx2,
   ANCOM-II, ZicoSeq, MaAsLin3, PROC-GLM, and the fatigue-associated
   Boruta-SHAP model described above. Comparative
   visualization via Venn diagrams and a heatmap facilitated the
   assessment of consensus in genus-level taxa identification across these
   methodologies ([136]Figure 6C). The Venn diagram illustrates the
   overlap of genera detected by each method, highlighting methodological
   concordance. The heatmap depicts the distribution patterns (e.g.,
   detection status or statistical significance/effect size) of the
   consensus genera across the analytical approaches, enabling a visual
   comparison of their performance. Thirteen genera exhibited consistent
   detection by at least two methods. Notably, Rothia and Filifactor were
   concurrently identified by six methodologies, followed by Neisseria and
   Streptococcus, detected by five approaches. Peptostreptococcaceae
   incertae sedis demonstrated consensus across three methods. Eight
   additional genera (Megasphaera, Mycoplasma, Pyramidobacter, Treponema,
   Necropsobacter, Pseudoramibacter, Alloprevotella, and
   Flavobacteriaceae_Unassigned) showed agreement between two independent
   analytical frameworks. This comprehensive multi-method validation
   strategy substantially strengthens the reliability of these microbial
   signatures as fatigue-associated biomarkers.
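
   The consensus step described above amounts to counting, for each
   genus, how many methods detected it and keeping those detected by at
   least two. A minimal sketch follows; the method labels and detection
   sets are placeholders invented for illustration and do not reproduce
   the results in Figure 6C.

```python
from collections import Counter


def consensus_genera(detections, min_methods=2):
    """Count detections per genus and keep genera found by >= min_methods.

    detections maps a method name to the collection of genera it flagged.
    """
    counts = Counter(
        genus for genera in detections.values() for genus in set(genera)
    )
    return {genus: n for genus, n in counts.items() if n >= min_methods}


# Placeholder detection sets (hypothetical, for illustration only).
toy_detections = {
    "method_A": {"Rothia", "Filifactor", "Neisseria"},
    "method_B": {"Rothia", "Filifactor"},
    "method_C": {"Rothia", "Treponema"},
}

consensus = consensus_genera(toy_detections)
for genus, n in sorted(consensus.items(), key=lambda item: -item[1]):
    print(f"{genus}: detected by {n} methods")
```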

   Furthermore, to visually compare the abundance distributions of
   potential microbial biomarkers within saliva samples from the ENG and
   FTG groups, boxplots were generated ([137]Figure 7). These plots
   illustrate the relative abundances (Y-axis, log10 scale) of major
   microbial taxa at the phylum (left panel) and genus (right panel)
   levels between the two groups. At the phylum level, Firmicutes and
   Proteobacteria were identified as the most dominant phyla in both
   groups. Compared to the FTG group, the ENG group showed a trend toward
   slightly higher relative abundances of Actinobacteria and
   Proteobacteria, whereas the relative abundance of Firmicutes appeared
   slightly elevated in the FTG group. Spirochaetes and Tenericutes
   displayed consistently low relative abundances in both cohorts. The
   genus-level comparison
   (right panel) revealed more pronounced inter-group differences.
   Specifically, the relative abundances of Rothia and Neisseria were
   markedly higher in the ENG group than in the FTG group. Conversely, the
   FTG group exhibited significantly higher relative abundances of
   Filifactor and Streptococcus. The distributions of numerous other
   genera, including Megasphaera and Peptostreptococcaceae incertae sedis,
   also showed varying degrees of difference between the groups. Although
   many genera were present at low overall abundance, their differential
   presence may still possess biological significance.

Figure 7.


   Comparison of salivary microbial community composition between the ENG
   and FTG groups at the phylum and genus levels. Box plots show the
   relative abundance (%) distribution (Y-axis, log10 scale) of major taxa
   at the phylum (left panel) and genus (right panel) levels. Orange boxes
   represent the ENG group, and blue boxes represent the FTG group. Each
   box indicates the interquartile range (IQR), the horizontal line within
   the box represents the median, and the whiskers extend to the furthest
   data points within 1.5 times the IQR from the box edge. Individual dots
   represent the actual relative abundance values for each sample
   corresponding to the respective taxon.
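
   The box and whisker elements described in the caption can be
   computed as sketched below. The abundance values are hypothetical,
   and the quartile convention (linear interpolation, "inclusive"
   method) is an assumption; plotting packages differ slightly in how
   they estimate quartiles, and the sketch ignores the log10 axis
   scaling.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles, 1.5 * IQR whiskers, and outliers for one box in a boxplot."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the furthest data points still inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical relative abundances (%) for one taxon in one group.
toy_abundance = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
stats = box_stats(toy_abundance)
print(stats)
```

   Note that the whisker ends at an observed data point rather than at
   the fence itself, matching the description in the caption.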

   Following the identification of key bacterial genera exhibiting
   significant abundance differences between the FTG and ENG groups
   ([140]Figures 5, [141]6), we sought to gain deeper insights into the
   potential functional ramifications of these taxonomic shifts.
   Consequently, we performed comprehensive functional annotation of these
   differential genera, with detailed results compiled in
   [142]Supplementary Table S2. This annotation integrates
   multi-dimensional information, including the potential pro- or
   anti-inflammatory properties of each genus, their capacity for
   gamma-aminobutyric acid (GABA) metabolism, notable metabolite
   production (particularly short-chain fatty acids, SCFAs), ecological
   and adaptive traits (e.g., carriage of mobile genetic elements (MGEs),
   biofilm formation capabilities), and potential clinical relevance
   (e.g., associations with periodontal pathogenesis or systemic
   diseases). The information presented in [143]Supplementary Table S1 was
   systematically curated from published literature and public databases
   (references in table footnotes). Overall, this table highlights that