Abstract

Background

Diagnosis of the cause of cerebral thrombi is vital for recurrence prevention but also challenging. The presence of the microbiome has recently been confirmed in thrombus, suggesting a novel approach to distinguish cerebral thrombi of different origins. However, little is known about whether there is heterogeneity in microbiological colonization of cerebral thrombi of different sources.

Methods and Results

Forty patients experiencing acute ischemic stroke were included and clinical data were collected. Metagenomic next‐generation sequencing was adopted to detect bacterial and genomic signatures of human cerebral thrombi samples. We found similar species diversity between the large‐artery atherosclerosis thrombi and cardioembolic thrombi but different species composition and distribution of cerebral thrombus microbiota. Compared with the group with cardioembolism, the group with large‐artery atherosclerosis showed a significantly higher relative abundance of Ralstonia insidiosa among the top 10 bacterial species in cerebral thrombi. Twenty operational taxonomic units were correlated with 11 clinical indicators of ischemic stroke. The Gene Ontology enrichment analysis revealed 9 different enriched biological processes (translation and carbohydrate metabolic process, etc). The enriched Kyoto Encyclopedia of Genes and Genomes pathways included ribosome, butanoate metabolism, and sulfur metabolism.

Conclusions

This study, based on the approach of metagenomic next‐generation sequencing, provides a diagnostic microbiological method to discriminate individuals with cardioembolic thrombi from those with large‐artery atherosclerosis thrombi with human cerebral thrombi samples. Our findings provide a fresh perspective on microbial heterogeneity of cerebral thrombi and demonstrate biological processes and pathway features of cerebral thrombi.
Keywords: cerebral thrombus, metagenomic next‐generation sequencing, microbiome, thrombectomy

Subject Categories: Ischemic Stroke, Embolism, Thrombosis

Nonstandard Abbreviations and Acronyms

AIS: acute ischemic stroke
KEGG: Kyoto Encyclopedia of Genes and Genomes
LAA: large‐artery atherosclerosis
mNGS: metagenomic next‐generation sequencing
mRS: modified Rankin Scale
NIHSS: National Institutes of Health Stroke Scale

Clinical Perspective

What Is New?

* Metagenomic next‐generation sequencing is used to identify the microbial distribution and genome information of human cerebral thrombus.
* Microbial heterogeneity of cerebral thrombi is associated with different origins.
* Microbial genes further reveal biological processes and pathway features of cerebral thrombi.

What Are the Clinical Implications?

* Our study provides a metagenomic next‐generation sequencing‐based diagnostic microbiology method to distinguish atherosclerotic thrombus and cardiogenic thrombus.
* Our study also reveals the close relevance of the thrombus microbiome and clinical indicators of patients with stroke, indicating the potential effects of the thrombus microbiota on host physiological function.
* These findings facilitate a new direction to develop metagenomic next‐generation sequencing‐based treatment strategies for thrombolysis and thrombosis prevention.

Stroke is the second leading cause of death and the third leading cause of disability in the world.[44] ^1 Acute ischemic stroke (AIS) is the most common subtype of stroke. Thrombi‐based cerebral blood vessel occlusion can directly lead to ischemic stroke onset.
The pathogenetic diagnosis of AIS subtypes is crucial for determining secondary prevention therapy against stroke recurrence, which occurs in approximately 25% of individuals with stroke within 5 years after the initial occurrence.[45] ^2 Currently, the pathogenetic diagnosis of stroke is a comprehensive diagnosis dependent on atrial fibrillation rates, risk factors, postoperative angiography, and thrombus cellular component identification.[46] ^3 Several recent studies have provided novel evidence to help distinguish cerebral thrombi origin, such as molecular characteristics of thrombus,[47] ^4 thrombus‐related imaging sign,[48] ^5 and morphological signs of intravital clot contraction.[49] ^6 We have previously found that thrombotic component analysis has important implications for stroke therapeutic effect, secondary prevention, and the search for new targets for antithrombotic therapy.[50] ^7 However, our understanding of thrombus heterogeneity is far from sufficient. The microbiota–gut–brain axis is well known to play a vital role in the reciprocal interactions between the gut and the brain during acute cerebral ischemia, mainly mediated by microbiota‐related metabolites, bacterial components (eg, lipopolysaccharide) and the neuroimmune system.[51] ^8 In addition, gut microbe‐derived metabolites, such as trimethylamine‐N‐oxide and short‐chain fatty acids, can promote thrombosis potential[52] ^9 and affect atherosclerotic plaque stability.[53] ^10 Moreover, thrombus‐related microbiota is highly reported in coronary thrombus,[54] ^11 where high levels of microbiota and microbe‐derived metabolites are associated with the survival rate of patients with myocardial infarction.[55] ^12 Recently, studies have proven microbiota colonization in cerebral thrombus,[56] ^13 , [57]^14 which mainly originated from the plasma[58] ^15 of patients with AIS. 
Thrombi from cardioembolic and large‐artery atherosclerotic strokes had distinct bacterial concentrations, dominant species, and distribution patterns.[59] ^15 Nevertheless, alternative approaches for analyzing complex microbial communities, such as metagenomic next‐generation sequencing (mNGS), may provide more robust information on detailed microbial identification, gene diversity, and functional profiles.[60] ^16 In the present study, we aimed to analyze mNGS data of 40 thrombi obtained from patients with AIS undergoing endovascular thrombectomy and to discriminate cardioembolic and atherothrombotic thrombi origins according to their microbiota population and gene diversity. We sought to establish a more accurate pathogenetic diagnosis to provide more information for secondary stroke prevention therapy and to correlate microbial features with clinical characteristics. Based on mNGS data, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology analyses were conducted to offer additional details about the potential impacts of microbial gene enrichment, which may help to provide more insight into the thrombotic microenvironment and host function.

METHODS

The data related to this study are available from the corresponding author upon reasonable request. The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Study Patients and Sample Preparation

In the present study, we consecutively investigated thrombi (n=40) from patients with AIS who underwent endovascular thrombectomy in the First Affiliated Hospital of Army Medical University between November 2020 and February 2022.
Our methods and experimental protocols were conducted according to the guidelines of the Institutional Review Board of the First Affiliated Hospital of Army Medical University and were approved by the Ethics Committee of the First Affiliated Hospital of Army Medical University [(A)KY2021023]. All patients or their legal representatives signed informed consent for medical research of their images and specimens. Baseline clinical data and laboratory examination indices were recorded immediately when the patient with AIS was admitted to the hospital. Stroke severity in patients with AIS was evaluated with the National Institutes of Health Stroke Scale (NIHSS) score[61] ^17 on the day of admission, at 24 hours after endovascular thrombectomy, and at discharge. The ischemic stroke subtypes were assessed using the TOAST (Trial of Org 10172 in Acute Stroke Treatment) criteria.[62] ^18 The functional outcomes were assessed with a 90‐day modified Rankin Scale (mRS) score,[63] ^19 and independent functional outcome was defined as mRS score 0 to 2. Vascular recanalization was evaluated by the modified thrombolysis in cerebral infarction scoring standard.[64] ^20 (Grades 0–2a were defined as nonrecanalized blood vessels, and grades 2b–3 were defined as successful blood vessel recanalization.)

Thrombi from 40 patients with AIS who met the following criteria were consecutively included: (1) age over 18 years; and (2) clinical indication for mechanical recanalization of a proximal intracranial occlusion of the anterior or posterior circulation, with at least 1 thrombus fragment collected during the procedure. The exclusion criteria were as follows: (1) current infection, (2) autoimmune disease, (3) severe impairment of liver or renal function, and (4) active or recurrent cancer.
Thrombi were retrieved using a stent retriever or aspiration device according to the judgment of the 2 treating neurological intervention physicians. Thrombus material collected from a patient over all passes was pooled and considered 1 thrombus. Of note, because thrombi often adhere to the device, the device had to be rinsed with sterile saline and the thrombus gently manipulated for transfer to a sterile collection container. The collected thrombus was placed into a preprepared sterile collection container with PBS. The thrombus was then frozen in liquid nitrogen and stored at −80 °C until mNGS analysis.

Nucleic Acid Extraction, Library Preparation, and Sequencing

All thrombus samples were collected following standard aseptic processing procedures. The thrombus tissue specimen was cut into small pieces and then subjected to cell wall disruption. The specific wall‐disruption method was as follows: (1) 1 g of 0.5 mm diameter glass beads was added to the wall‐disruption tube, (2) 0.6 mL of treated specimen was added, (3) the tube was shaken at high speed at 2800 to 3200 r/min for 30 minutes, (4) nucleic acid extraction was performed in 300 μL according to the TIANamp Micro DNA Kit (DP316, China Tiangen Biochemical Technology Co., Ltd.), and (5) after extraction, DNA was sonicated to fragments of approximately 200 to 300 bp. DNA libraries were constructed by DNA fragmentation, end repair, adapter ligation, and polymerase chain reaction amplification. Pooled libraries were sequenced on an Illumina NextSeq 550 system (Illumina, San Diego, CA) using a 75‐bp, single‐end sequencing kit. Qualified results had a Q30 score of at least 90% and at least 15 million reads per sample. Negative controls were run, and environmental contaminants were excluded via a background pathogen detection method in the bioinformatics analysis.
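As a rough illustration of the run‐level acceptance criteria stated above (a Q30 score of at least 90% and at least 15 million reads per sample), the following Python sketch computes the Q30 fraction from FASTQ quality strings and applies both thresholds. The function names and the Phred+33 quality encoding are assumptions made for illustration; the study's actual quality‐control pipeline is not described at this level of detail.

```python
def q30_fraction(quality_strings, phred_offset=33):
    """Return the fraction of base calls with Phred quality >= 30.

    Assumes Phred+33 ASCII encoding (an assumption, not stated in the text).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - phred_offset >= 30:
                passing += 1
    return passing / total if total else 0.0


def passes_run_qc(q30, n_reads, min_q30=0.90, min_reads=15_000_000):
    """Apply the two thresholds quoted in the text to one sample."""
    return q30 >= min_q30 and n_reads >= min_reads


# Toy example: 'I' encodes Q40, '?' encodes Q30, '5' encodes Q20.
toy_q30 = q30_fraction(["IIII?", "5IIII"])  # 9 of 10 bases reach Q30
sample_ok = passes_run_qc(toy_q30, n_reads=15_000_000)
```

In practice the read count and quality strings would come from the sequencer output for each sample; the toy strings here only demonstrate the arithmetic.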
Bioinformatics Analyses

After removing low‐quality and short read sequences (length<35 bp), 75 bp single‐end reads from the Illumina NextSeq 550 were analyzed with IDseq software developed in house to determine the abundance of each microorganism. High‐quality reads were aligned to the human reference genome sequence, and reads that aligned were classified as human host sequences and removed.[65] ^21 The remaining read sequences were aligned using a pathogen‐specific microbial database containing 6350 bacteria, 1064 fungi, 4945 viruses, and 234 parasites. The number of reads matching the various pathogen‐specific sequences was obtained by the alignment algorithm, and the possible pathogens were judged according to the number of sequences and other clinical tests.

Bacterial Diversity and Taxonomic Analysis

Bacterial diversity was evaluated using a sampling‐based operational taxonomic unit analysis, with the Shannon index, Chao index, and Abundance‐Based Coverage Estimator index serving as indicators. Calculations were facilitated by the R program package “vegan.” Additionally, principal component analysis and principal coordinate analysis were employed through the R package ([66]http://www.R‐project.org/) to visualize the microbiome space among the group samples. A heatmap, generated using the heatmap builder, was used to identify key variables. Taxonomic analyses of bacteria and comparisons between the 2 groups were conducted using the Wilcoxon rank‐sum test at both the phylum and genus levels. Furthermore, the linear discriminant analysis effect size method ([67]http://huttenhower.sph.harvard.edu/lefse/) was applied to analyze the microbial characterization of thrombi between the groups with large‐artery atherosclerosis (LAA) and with cardioembolism, based on the normalized relative abundance matrix.
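To make the α‐diversity indicators named in the diversity analysis above more concrete, here is a minimal Python sketch of the Shannon index and a bias‐corrected form of the Chao1 richness estimator, computed from a vector of per‐taxon read counts. The study itself used the R package “vegan”; this standalone version, its function names, and the toy counts are assumptions for illustration only.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa observed exactly once and twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical per-taxon read counts for one thrombus sample.
toy_counts = [10, 5, 1, 1, 2, 0]
toy_shannon = shannon_index(toy_counts)
toy_chao1 = chao1(toy_counts)  # 5 observed taxa, F1 = 2, F2 = 1
```

Per‐sample values computed this way would then be compared between groups with a rank‐based test, as the statistical analysis describes.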
The linear discriminant analysis effect size approach first identified features with significant differential abundance using the Kruskal–Wallis rank‐sum test (P<0.05) and then assessed the effect size of each feature through linear discriminant analysis (LDA), with an LDA score (log10) threshold of 2.

Random Forest Model Prediction

Taking into account the robustness of the algorithm, we used the random forest function from the R package “randomForest” to construct a prediction model aimed at pinpointing potential diagnostic biomarkers. Abundance profiles were gathered, and the entire sample set was used for model training. Core genera specific to the groups with LAA or cardioembolism were selected as predictive input variables. To identify the genera that significantly contributed to the predictions, we employed a nested 10‐fold cross‐validation procedure. Briefly, the top 30 species were selected according to the feature importance evaluated by the random forest model. Then, a feature selection algorithm was applied to these 30 species to obtain the final species set for model construction. Furthermore, receiver operating characteristic analysis and the area under the curve were used to assess the efficacy of potential cutoff values for the tests. A 10‐fold cross‐validation was used to evaluate the area under the curve, sensitivity, and specificity of the model.

Statistical Analysis

IBM SPSS Statistics 22 (SPSS, Chicago, IL) was used to analyze differences between the 2 groups' clinical indices. A Mann–Whitney U test was used to determine whether there was statistical significance between the 2 groups. The statistical significance was set at P < 0.05. Analysis of all sequenced data was conducted using R (R Foundation for Statistical Computing, Vienna, Austria).
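The evaluation metrics mentioned above (area under the receiver operating characteristic curve, sensitivity, and specificity) can be sketched in a few lines of Python. The area under the curve is computed here through its rank interpretation, which is closely related to the Mann–Whitney U statistic: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The function names, the choice of which group counts as "positive," and the toy scores are all hypothetical; this is not the study's randomForest pipeline.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(score_pos > score_neg), counting ties as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Classify score >= threshold as positive and return (sens, spec)."""
    true_pos = sum(1 for s in scores_pos if s >= threshold)
    true_neg = sum(1 for s in scores_neg if s < threshold)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


# Hypothetical out-of-fold classifier scores for the two groups.
toy_pos = [0.9, 0.8, 0.6]
toy_neg = [0.7, 0.4, 0.3, 0.2]
toy_auc = roc_auc(toy_pos, toy_neg)  # 11 of 12 pairs are ordered correctly
toy_sens, toy_spec = sensitivity_specificity(toy_pos, toy_neg, threshold=0.5)
```

With cross‐validation, the scores would be the held‐out predictions from each fold, so that no sample is scored by a model that was trained on it.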
The Kruskal–Wallis test was used for α‐diversity between groups and for species abundance differences between groups, and the Spearman correlation test was used for species abundance. The nonparametric factorial Kruskal–Wallis rank‐sum test was used to select metagenomic biomarker species (P<0.05) for LDA effect size analysis. The influence of differential species (LDA score) on biomarker species was then evaluated by LDA. Species with an LDA score >2.0 were considered statistically significant.

RESULTS

Characteristics of the Study Participants

The baseline characteristics of the patients are shown in the [68]Table. In total, 40 patients were included in this study, all of whom underwent endovascular thrombectomy; 24 were men, and the mean age of all patients was 70.55±11.01 years. Overall, 15 patients (37.5%) were classified into the group with LAA, and 25 (62.5%) were classified into the group with cardioembolism based on the TOAST classification. In the group with cardioembolism, 15 patients (60%) had atrial fibrillation and 11 patients (44%) had >2 stent passages, proportions that were significantly higher than those in the group with LAA. The rates of having a smoking history and using rescue therapy in the group with LAA were significantly higher than those in the group with cardioembolism. In the laboratory tests, there were no significant differences in white blood cells, neutrophils, monocytes, lymphocytes, platelets, fibrinogen, prothrombin time, and activated partial thromboplastin time between the 2 groups.

Table.
Baseline Characteristics (entire population, n=40)

Variables | Group with LAA (n=15) | Group with cardioembolism (n=25) | P value

Demographics
Male, n (%) | 10 (66.7) | 14 (56.0) | 0.505
Age, y, median (IQR) | 72.0 (68.3–78.8) | 74.0 (58.0–80.0) | 0.720

Vascular risk factors
Hypertension, n (%) | 10 (66.7) | 13 (52.0) | 0.364
Coronary artery disease, n (%) | 2 (13.3) | 5 (20.0) | 0.591
Atrial fibrillation, n (%) | 1 (6.7) | 15 (60.0) | 0.001*
Hyperlipidemia, n (%) | 2 (13.3) | 3 (12.0) | 0.902
Diabetes, n (%) | 2 (13.3) | 3 (12.0) | 0.902
Smoking, n (%) | 9 (60.0) | 7 (28.0) | 0.046*
Drinking, n (%) | 5 (33.3) | 6 (24.0) | 0.522

Clinical characteristics
Admission National Institutes of Health Stroke Scale score, median (IQR) | 18 (14–24) | 17 (11–18) | 0.088
Admission modified Rankin Scale score, median (IQR) | 4 (4–5) | 4 (3–4) | 0.080
Stroke onset to thrombectomy, min, median (IQR) | 310 (212–600) | 285 (220–453) | 0.439
Thrombolysis, n (%) | 4 (26.7) | 3 (12.0) | 0.452

Vascular occlusion site
Internal carotid artery, n (%) | 6 (40.0) | 11 (44.0) | 0.804
Middle cerebral artery, n (%) | 8 (53.3) | 12 (48.0) | 0.744
Basilar artery, n (%) | 1 (6.7) | 2 (8.0) | 0.877

Procedure details
Number of stent passage >2, n (%) | 2 (13.3) | 11 (44.0) | 0.045*
Rescue therapy, n (%) | 4 (26.7) | 1 (4.0) | 0.036*
Modified thrombolysis in cerebral infarction 2b/3, n (%) | 13 (86.7) | 23 (92.0) | 0.586

Laboratory tests
White blood cells, ×10^9/L, median (IQR) | 9.2 (7.6–11.7) | 7.5 (6.2–9.2) | 0.112
Neutrophils, ×10^9/L, median (IQR) | 7.0 (6.0–9.1) | 5.9 (4.3–7.0) | 0.074
Monocytes, ×10^9/L, median (IQR) | 0.5 (0.3–0.6) | 0.5 (0.4–0.7) | 0.524
Lymphocytes, ×10^9/L, median (IQR) | 1.0 (0.9–1.7) | 1.1 (0.9–1.7) | 0.720
Platelets, ×10^9/L, median (IQR) | 152.5 (138.3–215.0) | 171.0 (151.0–229.0) | 0.699
Fibrinogen, g/L, median (IQR) | 2.7 (2.2–3.6) | 2.5 (2.1–3.5) | 0.761
Prothrombin time, s, median (IQR) | 11.4 (10.8–12.3) | 12.1 (11.0–12.9) | 0.192
Activated partial thromboplastin time, s, median (IQR) | 25.8 (25.3–28.6) | 27.3 (25.6–28.7) | 0.361

IQR indicates interquartile range; and LAA, large‐artery
atherosclerosis. * P<0.05.

Thrombus Microbiota Profile and Thrombus Microbial Diversity

In this study, the microbial composition of thrombi was identified via mNGS. The hierarchical clustering results of all 40 samples comparing data in phylum, genus, and species classification are shown in Figure [74]1, including the top 10 compositions and relative abundance of the bacterial community in both groups.

Figure 1. Species composition and distribution of cerebral thrombi from different sources at the phylum, genus, and species levels. The top 10 compositions by abundance in each cerebral thrombus sample (total=40) at the phylum (A), genus (C), and species (E) levels are shown in the heat map. The top 10 relative abundance species compositions at the phylum (B), genus (D), and species (F) levels for all samples (group with LAA vs group with cardioembolism) are displayed in a bar chart. **P<0.01. LAA indicates large‐artery atherosclerosis.

The top 10 most abundant phyla in the thrombus were Proteobacteria, Actinobacteria, Firmicutes, Ascomycota, Bacteroidetes, Basidiomycota, Deinococcus‐Thermus, Discosea, Cyanobacteria, and Fusobacteria (Figure [76]1A). The top 10 most abundant genera of bacteria in the thrombus were Acinetobacter, Ralstonia, Corynebacterium, Staphylococcus, Lactobacillus, Pseudomonas, Mycobacterium, Sphingomonas, Aquincola, and Bacteroides (Figure [77]1C). The top 10 most abundant species in the thrombus were Acinetobacter_parvus, Ralstonia_insidiosa, Lactobacillus_iners, Aquincola_tertiaricarbonis, Ralstonia_pickettii, Mycobacterium_iranicum, Bradyrhizobium_oligotrophicum, Bacteroides_graminisolvens, Sphingomonas_echinoides, and Candida_parapsilosis (Figure [78]1E).
Based on further analysis of the top 10 microbiota composition of thrombus at the phylum, genus, and species levels, the dominant phyla did not differ between the group with LAA and the group with cardioembolism (P>0.05, Figure [79]1B). However, among the top 10 genera in thrombi, the abundance of Ralstonia was significantly higher in the group with LAA than in the group with cardioembolism (P=0.00999, Figure [80]1D). Further analysis showed that, among the top 10 species in thrombi, the abundance of Ralstonia_insidiosa was significantly higher in the group with LAA than in the group with cardioembolism (P=0.00791, Figure [81]1F). More data on differential abundance at the phylum, class, order, family, genus, and species levels between the 2 groups are presented in Figure [82]S1.

We further assessed the thrombus‐related changes in bacterial alpha diversity. However, there were no significant differences between the group with LAA and the group with cardioembolism across the 5 indices used to estimate alpha diversity (P>0.05, Figure [83]2A through [84]2E). Moreover, in the beta‐diversity analysis, the principal coordinate analysis and the principal component analysis based on genus profiles showed no separation between the group with LAA and the group with cardioembolism (P>0.05, Figure [85]2F through [86]2G).

Figure 2. Species diversity of cerebral thrombi from different origins. The calculated α‐diversity indices of cerebral thrombi: A, the richness index; B, the ACE index; C, the Chao1 index; D, the Shannon index; and E, the Simpson index. The calculated β‐diversity indices of cerebral thrombi: F, PCoA of the samples in a 2‐dimensional graph (PCoA1=13.18%, PCoA2=9.12%); G, PCA of the samples in a 2‐dimensional graph (PCA1=18.12%, PCA2=12.98%).
ACE indicates Abundance‐Based Coverage Estimator; LAA, large‐artery atherosclerosis; PCA, principal component analysis; and PCoA, principal coordinate analysis.

Differences in the Microbial Composition of Thrombi Between the Groups With LAA and Cardioembolism

To identify the specific bacterial taxa and predominant bacteria distinguishing LAA and cardioembolic emboli, linear discriminant analysis effect size was used to assess the maximum difference in the microbial structures in patients with LAA stroke versus those in patients with cardioembolic stroke. The contributions of the thrombus microbial composition to the differences between the 2 groups were evaluated using the LDA score. As shown in Figure [88]3, 77 genera were significantly enriched, and 19 genera were significantly depleted in the group with LAA compared with the group with cardioembolism (all P<0.05).

Figure 3. Biomarker identification by LEfSe analysis. A, LEfSe analysis (taxa with LDA score>2); B, cladogram of the main taxa of microbiota that were different between the group with LAA and the group with cardioembolism on the basis of LEfSe analysis. (Color code: Yellow indicates no significant difference in taxa; Green indicates significantly different taxa, with their highest relative abundance in the group with LAA; Red indicates significantly different taxa, with their highest relative abundance in the group with cardioembolism). LAA indicates large‐artery atherosclerosis; and LEfSe, linear discriminant analysis effect size.

Distinguishing Potential of Cerebral Thrombus Origin Based on Multiple Gene Markers

In this study, we further constructed a random forest classifier model between 15 patients with LAA stroke and 25 patients with cardioembolic stroke to assess the potential of multiple gene markers as a noninvasive distinguishing tool for cerebral thrombus origin.
A total of 9 operational taxonomic units were selected as the optimal marker set of cerebral thrombus origin by a random forest model with 10‐fold cross‐validation repeated 5 times (Figure [90]4A). The list of the top 9 bacterial taxa at the species level is shown in Figure [91]4B. In addition, according to the results of the receiver operating characteristic analysis (Figure [92]4C), the marker set showed good discrimination of cerebral thrombus origin, with an area under the receiver operating characteristic curve of 0.903 (95% CI, 0.808–0.998; P<0.001). The sensitivity and specificity were 100% and 85.7%, respectively.

Figure 4. Multiple gene markers for distinguishing cerebral thrombi origins. A, Nine microbial OTUs were found to be the best set of markers by a random forest model; B, Chart of the variable importance ranking of the 9 microbial OTUs based on mean decrease accuracy; C, ROC curve analysis of the predictive value of the 9 microbial gene markers for distinguishing cerebral thrombi origins. AUC indicates area under the curve; FPR, false positive rate; OTU, operational taxonomic unit; ROC, receiver operating characteristics; and TPR, true positive rate.

Correlation Between the Thrombus Microbiome and Clinical Indicators of Patients With LAA or Cardioembolic Stroke

In this study, we also analyzed the correlations between the thrombus microbiome and clinical indicators of cerebral thrombus. Of note, 9 clinical indicators, including neutrophil counts, lymphocyte counts, lymphocyte to monocyte ratio, neutrophil to lymphocyte ratio, ejection fraction, NIHSS score (preadmission), mRS score (preadmission), NIHSS score (postoperation), and mRS score (postoperation), were closely related to the microbiome of cerebral thrombi (as shown in Figure [94]5). Neutrophil counts were positively correlated with Elizabethkingia meningoseptica.
Lymphocyte counts were positively correlated with Schizophyllum commune but negatively associated with Massilia putida. Lymphocyte to monocyte ratio was positively correlated with Lactococcus raffinolactis. The neutrophil to lymphocyte ratio was positively correlated with Massilia putida. Ejection fraction (%) was negatively correlated with Pannonibacter phragmitetus. NIHSS score (preadmission) was positively correlated with Acinetobacter ursingii, Corynebacterium jeikeium, and Staphylococcus arlettae but negatively correlated with Aspergillus chevalieri and Serinicoccus chungangensis. mRS score (preadmission) was positively correlated with Acinetobacter ursingii, Alishewanella agri, Corynebacterium jeikeium, and Schizophyllum commune but negatively correlated with Aspergillus chevalieri. NIHSS score (postoperation) was positively correlated with Bacillus thermoamylovorans, Castellaniella caeni, Corynebacterium jeikeium, and Providencia stuartii but negatively correlated with Aspergillus chevalieri and Prevotella loescheii. mRS score (postoperation) was positively correlated with Bacillus thermoamylovorans, Castellaniella caeni, Corynebacterium glutamicum, Corynebacterium jeikeium, Providencia stuartii, and Staphylococcus arlettae but negatively correlated with Aspergillus chevalieri.

Figure 5. Correlation between different species and clinical data. Heat map illustrating the partial Spearman's correlation coefficients among 20 OTUs and 11 clinical indicators of acute ischemic stroke (N=40). Red and blue denote positive and negative correlations, respectively. ρ, Spearman's correlation coefficient. *P<0.05. EF indicates ejection fraction; LMR, lymphocyte to monocyte ratio; mRS, modified Rankin Scale score; NIHSS, National Institutes of Health Stroke Scale; NLR, neutrophil to lymphocyte ratio; pre, preadmission; post, postoperation; and WBC, white blood cell.
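For readers less familiar with rank correlation, the following Python sketch shows how a Spearman coefficient between a taxon's abundance and a clinical indicator can be computed: both variables are converted to ranks (ties receive their average rank), and the Pearson correlation of the ranks is taken. Note that Figure 5 reports partial Spearman coefficients, which additionally adjust for covariates; this sketch covers only the plain, unadjusted coefficient. The function names and toy values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ValueError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical relative abundances and admission scores for 5 patients.
toy_abundance = [0.01, 0.02, 0.03, 0.04, 0.05]
toy_score = [2, 4, 5, 9, 20]
toy_rho = spearman_rho(toy_abundance, toy_score)
```

Because only the ordering of values matters, the coefficient is insensitive to the skewed, zero‐heavy distributions typical of abundance data.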
Functional Analysis of Genes Associated With the Cerebral Thrombus Microbiota

Furthermore, we performed both Gene Ontology annotation and KEGG pathway enrichment analysis to obtain deeper insight into the biological functions of microbial genes associated with the thrombus microbiota. Here, we identified a total of 608 microbial genes. The Gene Ontology enrichment analysis results are shown in Figure [96]6A. In the biological process category, differentially expressed genes associated with the thrombus microbiota were involved in translation, carbohydrate metabolic process, and phosphoenolpyruvate‐dependent sugar phosphotransferase system. The cellular component analysis revealed that differentially expressed genes were significantly enriched in the ribosome. For molecular function analysis, the top 6 markedly enriched terms were structural constituent of ribosome, RNA binding, GTP binding, GTPase activity, transferase activity transferring glycosyl groups, and protein‐N(PI)‐phosphohistidine‐sugar phosphotransferase activity. In addition, the enriched KEGG pathways presented in Figure [97]6B included ribosome, butanoate metabolism, and sulfur metabolism.

Figure 6. Functional analysis of cerebral thrombus using GO and KEGG databases. A, Bubble plot of GO enrichment analysis of DEGs including BPs, CCs, and MFs between the group with LAA and the group with cardioembolism; B, Bubble plot of KEGG enrichment analysis of DEGs between the group with LAA and the group with cardioembolism. The x axis indicates the proportion of the number of genes that differ from the total gene count in a certain pathway. The size and color of the dots indicate the extent of gene enrichment and the significance of the data, respectively. BP indicates biological processes; CC, cellular components; DEGs, differentially expressed genes; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; LAA, large‐artery atherosclerosis; and MF, molecular functions.
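The quantities plotted in an enrichment bubble plot of this kind can be illustrated with a small Python sketch. The gene ratio follows the x‐axis definition in the Figure 6 legend (differential genes in a term divided by all genes in that term). The significance calculation uses a one‐sided hypergeometric test, which is a common choice for over‐representation analysis; the text does not state which test the authors used, so this is an assumption, as are the function names and all numbers in the example.

```python
from math import comb


def gene_ratio(hits, term_size):
    """Differential genes in a term divided by all genes in that term."""
    return hits / term_size


def hypergeom_enrichment_p(total_genes, term_size, selected, hits):
    """One-sided hypergeometric P(X >= hits).

    total_genes: annotated background genes
    term_size:   background genes annotated to the term or pathway
    selected:    differential genes drawn from the background
    hits:        differential genes that fall in the term
    """
    upper = min(term_size, selected)
    denom = comb(total_genes, selected)
    tail = sum(
        comb(term_size, k) * comb(total_genes - term_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / denom


# Hypothetical toy numbers, not taken from the study.
toy_ratio = gene_ratio(hits=2, term_size=4)
toy_p = hypergeom_enrichment_p(total_genes=10, term_size=4, selected=3, hits=2)
```

In a full analysis, one such P value would be computed per term and then adjusted for multiple testing before plotting.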
DISCUSSION

Our study presented the microbiome distribution and genome information of cerebral thrombi with the use of mNGS to identify stroke cause. The relative abundance of Ralstonia insidiosa may differ between the groups with LAA and cardioembolism. Microbial gene markers were identified and combined in a predictive model to classify stroke cause. Furthermore, this study demonstrated the close relevance of the thrombus microbiome and clinical indicators of patients with stroke, suggesting the potential effects of the thrombus microbiota on host physiological function. Based on Gene Ontology analysis, microbial genes enriched in thrombi participated in multiple biological processes (translation, carbohydrate metabolic process, phosphoenolpyruvate‐dependent sugar phosphotransferase system), cellular components (ribosome), and molecular functions (structural constituent of ribosome, RNA binding, GTP binding, GTPase activity, transferase activity transferring glycosyl groups, and protein‐N(PI)‐phosphohistidine‐sugar phosphotransferase activity), whereas KEGG analysis showed pathways relevant to ribosome, butanoate metabolism, and sulfur metabolism. Our findings provided a novel strategy to identify the different sources of cerebral thrombi with microbial heterogeneity and suggested the effect of microbial colonization on translational process and cellular metabolism.

Among all of the pathogens detected by mNGS, the relative abundance of Ralstonia insidiosa seems to differ between the group with LAA and the group with cardioembolism. A previous study demonstrated that Ralstonia can release lipopolysaccharide (LPS),[99] ^22 which contributes to the neuroinflammatory response.[100] ^23 LPS may play important roles in different causes of stroke.
It has been found that LPS is present in human carotid atherosclerotic plaques,[101] ^24 and experimentally LPS aggravates atherosclerosis.[102] ^25 Animal studies have shown that LPS may trigger dysfunction of serum lipids, which further accumulate on the aortic walls to form atherosclerotic plaques in apolipoprotein E−/− mice.[103] ^26 Therefore, our results preliminarily indicate the possible high accumulation of Ralstonia insidiosa in human atherosclerotic thrombus samples, which still needs to be further confirmed by more samples. Previous studies have confirmed the correlation of thrombus composition with functional parameters such as stroke severity (NIHSS) and clinical outcome (mRS).[104] ^27 In our study, we reported that NIHSS and mRS scores were positively related to Corynebacterium jeikeium in thrombi but negatively related to Aspergillus chevalieri in thrombi. Corynebacterium jeikeium has recently been found to be the most common species in clinical specimen cultures, particularly blood, pus, urine, and pleural effusion. Corynebacterium jeikeium has become more frequently identified as a severe cause of serious bloodstream infections, infective endocarditis, pneumonia, meningitis, and skin infections.[105] ^28 Thus, the level of Corynebacterium jeikeium in thrombi might be an indicator of severity and clinical outcomes. Although clots form in blood, the microenvironments of arterial thrombosis and venous thrombosis differ. As our results showed, the diversity of the microbiome in LAA and cardioembolic‐derived thrombi may be attributed to the diverse blood microenvironment of thrombosis. 
As reported, circulating microbiota and microbiota‐related metabolites can play a prothrombotic role, which enhances thrombin generation, endothelial dysfunction, venous stasis, platelet activation, and eventually hypercoagulation.[106] ^29 Therefore, microbiota and microbiota‐related metabolites may contribute to thromboinflammation, which results in cerebral artery blockage, inflammatory responses, and severe neuronal damage following ischemic events. Further study is needed to determine the potential differences in microbiota and microbiota‐related metabolite distributions in the arterial system and venous system.

mNGS has been widely used in multiple infectious diseases and performs massive sequencing of DNA, allowing the simultaneous detection of multiple genes even at very low mutational levels. Because microbiology identification can guide diagnosis, management, and treatment strategies for patients with infection,[107] ^30 mNGS application in cerebral thrombus samples allows for a novel understanding of microbial compositions and microbe‐phenotype associations. Moreover, microbial genes of thrombi were mainly mapped to ribosomes. In line with this finding, disturbances of ribosome proteins such as S6 kinase have been prominent in platelet activation and aggregation.[108] ^31

Regarding the limitations of this study, a larger sample size is needed to further clarify whether and how the thrombus microbiome can reveal stroke cause and guide secondary prevention strategies using anticoagulant therapy or antiplatelet therapy. Additionally, due to the limited size of individual thrombus samples, we did not analyze microbial metabolites. Whether targeting specific bacteria or their metabolites is effective for thrombolysis and recurrence prevention remains to be discussed.
CONCLUSIONS

Together, this mNGS‐based approach has presented evidence that microbial diversity in thrombus ecology provides a diagnostic microbiology method to distinguish atherosclerotic thrombus and cardiogenic thrombus. These findings facilitate new directions to uncover thrombus microbiome disturbances in depth and develop treatment strategies for thrombolysis and thrombosis prevention.

Sources of Funding

This work was supported by the National Natural Science Foundation of China (Nos. 81971130 and 8220052770), the technical innovation project in major clinical fields of Third Military Medical University (CX2019LC103), and Chongqing Technology Innovation and Application Development Program (no. cstc2019jscx‐gksbX0064).

Disclosures

None.

Supporting information

Figures S1–S7

Acknowledgments