Abstract The metabolic versatility of Bacillus subtilis makes it useful for a wide range of applications in biotechnology, from bioremediation to industrially important metabolite production. Understanding the molecular attributes of the biocontrol characteristics of B. subtilis is necessary for its tailored use in the environment and industry. Therefore, the present study aimed to conduct phenotypic characterization and whole genome analysis of the B. subtilis BDSA1 isolated from polluted river water from Dhaka, Bangladesh to explore its biotechnological potential. The chromium reduction capacity at 100 ppm Cr (VI) showed that B. subtilis BDSA1 reduced 40 % of Cr (VI) within 24hrs at 37 °C. Exposure of this bacterium to 200 ppm cadmium resulted in 43 % adsorption following one week of incubation at 37 °C. Molecular detection of chrA and czcC gene confirmed chromium and cadmium resistance characteristics of BDSA1. The size of the genome of the B. subtilis BDSA1 was 4.2 Mb with 43.4 % GC content. Genome annotation detected the presence of numerous genes involved in the degradation of xenobiotics, resistance to abiotic stress, production of lytic enzymes, siderophore formation, and plant growth promotion. The assembled genome also carried chromium, cadmium, copper, and arsenic resistance-related genes, notably cadA, czcD, czrA, arsB etc. Genome mining revealed six biosynthetic gene clusters for bacillaene, bacillibacin, bacilysin, subtilosin, fengycin and surfactin. Importantly, BDSA1 was predicted to be non-pathogenic to humans and had only two acquired antimicrobial resistance genes. The pan-genome analysis showed the openness of the B. subtilis pan-genome. Our findings suggested that B. subtilis BDSA1 might be a promising candidate for diverse biotechnological uses. Keywords: Bacillus subtilis, Bioremediation, Genome mining, Heavy metal, Secondary metabolite Highlights * • In vitro experimentation and in-depth genome analysis characterized the biocontrol potential of B. subtilis BDSA1. * • The strain harbors genes for heavy metal bioremediation, xenobiotics degradation, plant growth promotion, and others. * • The strain BDSA1 also carries 6 gene clusters for the biosynthesis of antimicrobial compounds. * • The lack of virulence genes and non-pathogenic characteristics makes it safe for industrial uses. 1. Introduction The widespread presence of heavy metals such as cadmium, chromium, nickel, arsenic, lead, and others is causing significant environmental pollution, especially with the advancement of industrialization [[31]1]. Bangladesh is in the most precarious position due to the abundance of tanneries, pharmaceutical, textile, fertilizer and other industries around Dhaka City, that release untreated wastewater into the Buriganga, Turag, Shitalakhya, and Balu rivers, causing heavy metal pollution of the aquatic environment [[32]2]. In addition to being harmful for humans and animals, such heavy metal contamination promotes the emergence of antibiotic and metal resistance through co-resistance and cross-resistance [[33]3]. Therefore, effective strategies are obvious to control heavy metal pollution and curb the occurrence of antimicrobial resistant bacteria in the environment. Traditional methods for the detoxification and degradation of metals such as adsorption, chemical oxidation or reduction reactions, chemical precipitation, and electrochemical process are often costly and inefficient, especially for low metal concentrations [[34]4]. Moreover, these methods lack specificity, require substantial space, and often demand high energy consumption [[35]5]. Thus, there is a need for bioremediation using microorganisms with potential for remediation of polluted environments and production of eco-friendly secondary metabolites. Microorganisms have developed various mechanisms, including biosorption, biotransformation, bioaccumulation, and biomineralization to detoxify heavy metals [[36]6]. Bioremediation of heavy metals offers an alternative and effective means of decontaminating metal-polluted environments [[37]7]. The use of bacteria, belonging to the Bacillus genus, might be one way of bioremediating pollutants [[38]8]. B. subtilis is one of the well-described species in this genus that has enormous potential in biotechnology because of its direct industrial application and satisfactory bioremediation potential [[39]9,[40]10]. Its ability to survive in harsh environmental conditions for extended times and rapid multiplication rate have made it a popular biocontrol agent [[41]11]. A growing body of literature reported the potential use of B. subtilis in the bioremediation of chromium, cadmium, nickel, mercury, and copper [[42][12], [43][13], [44][14]]. Apart from its bioremediation potential, B. subtilis is often regarded as a cell factory for microbial production of industrially important chemicals, enzymes and antimicrobial compounds because of its sophisticated protein secretion system and remarkable adaptability [[45]15]. These beneficial compounds include volatile organic or inorganic compounds, antimicrobial peptides, non-ribosomally synthesized peptides (NRP), polyketides, and others [[46]11]. Usually, the secondary metabolite-encoding genes are typically found in clusters and encode multifunctional enzyme complexes [[47]16]. Some notable secondary metabolites produced by B. subtilis include bacilysin, surfactin, fengycin, and others [[48]17]. Furthermore, it also facilitates pathogen inhibition and plant growth promotion (PGP) by producing phytohormones, siderophores, lipopeptides, and phytases [[49]18]. Mechanism of bioremediation, metabolite production, and PGP activity vary between strains; thus, individual strains specificities may influence species ratios and metabolic conditions, which in turn may affect hosts and microbial communities interactions [[50]11]. Bioinformatics-assisted recognition of specific functional genes or gene clusters in genome sequences, popularly known as genome mining, is faster and more effective in the detection of bacterial metabolites than conventional screening strategies [[51]17]. It also provides a more comprehensive evaluation of the safety of using B. subtilis, expanding from traditional phenotypic examination to molecular mechanism and genetic features [[52]19]. Several previous studies conducted whole genome analysis of B. subtilis strain EB1, XF-1, MBI-600, Bbv57 to evaluate their ability to serve numerous biotechnological purposes [[53][20], [54][21], [55][22], [56][23]]. In pursuing sustainable solutions to heavy metal pollution and harnessing the untapped potential of microbial metabolites, we explained the bioremediation and secondary metabolite production potential of indigenous B. subtilis BDSA1. The main objective of the study was to assess the biotechnological potential of BDSA1 through the characterization of resistance to heavy metals, and whole-genome analysis. 2. Materials and methods 2.1. Isolation and presumptive identification of Bacillus spp. B. subtilis BDSA1 was isolated from polluted water samples collected from the Sitalakshya river, Dhaka, Bangladesh (23°43′18.48"N; 90°30′1.8"E) [[57]24]. Briefly, collected samples were diluted (10^−01 to 10^−08) using normal saline. Each dilution was spread on Luria-Bertani agar (LB) supplemented with 50 ppm chromium (K[2]Cr[2]O[7]) and incubated at 37 °C for 24hrs. The glycerol stock (25 %) of the pure culture of this bacterium was stored at −80 °C for long-term maintenance. The isolate was retrieved from stock by inoculating a loopful of inoculum on LB agar followed by incubation at 37 °C for 24–48hrs. Then, several biochemical tests, including catalase, oxidase, IMViC, carbohydrate fermentation, gelatin, starch hydrolysis and hemolysis tests were done for presumptive identification with the help of Bergey's Manual of Systematic Bacteriology [[58]25]. 2.2. Tolerance to heavy metal The extent of metal tolerance of BDSA1 was determined according to the procedure described earlier [[59]26]. A single colony of freshly grown bacteria from LB agar (without metal supplementation) was inoculated in 10 ml LB broth supplemented with varying concentrations of chromium or cadmium (100–700 ppm) and incubated overnight at 37 °C. LB broth with no heavy metal was used as a control. Bacterial growth was determined by measuring OD at 600 nm. The minimum concentration of the heavy metals at which no turbidity was observed was considered as the Minimum Inhibitory Concentration (MIC). 2.3. Reduction of hexavalent chromium The bacterial ability to reduce hexavalent chromium was estimated using 1,5-diphenylcarbazide (DPC) method [[60]27]. The isolate was grown in 50 ml LB broth amended with 100 ppm Cr for 24hrs and cultures were harvested by centrifugation at 6000×g for 10 min. After adding DPC reagent to the collected supernatant from control and test cultures, the absorbance of the color produced by Cr^6+-DPC complex was measured spectrophotometrically at 540 nm. A standard curve was prepared to estimate the concentration of Cr (VI) in the samples. 2.4. Biosorption of cadmium Cadmium biosorption potential of BDSA1 was determined by following the protocol as described by Syed et al., 2015 [[61]28]. The tested isolate was inoculated on LB broth containing 200 ppm cadmium and kept at 37 °C for 7 days. Then, the culture was centrifuged at 8000×g for 10 min and the concentration of cadmium in the supernatant was measured by Atomic absorption spectrometry (AAS). Finally, the biosorption efficiency of the bacteria was calculated using the following formula. [MATH: (CiCf)/Ci×100(%) :MATH] where C[i] is the initial concentration, and C[f] is the final concentration of cadmium. 2.5. Detection of Cr and Cd resistant genes Genomic DNA from B. subtilis BDSA1 was extracted using the Monarch Genomic DNA purification kit (New England Biolabs) according to the manufacturer's instructions. The concentration and purity of genomic DNA were measured using Nanodrop 2000 spectrophotometer (Thermo scientific, USA). The extracted DNA was then subjected to polymerase chain reaction using gene (chrA and czcC) specific primer. The details of primer sequences, PCR conditions, and amplicon size are available in [62]Supplementary Table 1. The total volume of each PCR reaction was 25 μL, consisting of 12.5 μL 2X mastermix, 1 μL forward primer, 1 μL reverse primer, 8.0 μL nuclease-free water and 2.5 μL template DNA. The amplified products were separated and visualized on 1.5 % agarose gel electrophoresis. 2.6. Antibiotic susceptibility pattern The susceptibility of B. subtilis BDSA1 to a wide array of antibiotics was assessed by following the Kirby-Bauer disc diffusion method [[63]29]. The used discs were gentamycin (CN10), streptomycin (S10), imipenem (IMI10), meropenem (MRP10), vancomycin (VA30), ampicillin (AMP10), chloramphenicol (C10), doripenem (DOR10), tetracycline (TE30), colistin (CT10), cefotaxime (CTX30), and ceftriaxone (CRO30). Freshly grown bacteria standardized to 0.5 McFarland was evenly spread on Mueller-Hinton agar using sterile cotton swab. Antibiotic discs were placed on the surface of the bacterial lawn and incubated at 37 °C for 16–18hr. Then, the diameter of the zone of inhibition was measured and classified as sensitive (S), intermediate (I), and resistant (R) in accordance with the guidelines of Clinical and Laboratory Standards Institute (CLSI) [[64]30]. 2.7. Bacterial genome sequencing and assembly Bacterial DNA was sent to Macrogen Genome Center, South Korea for paired-end whole genome sequencing using Illumina NextSeq 500 platform. The quality parameters of raw sequences were checked using FastQC. Following adapter trimming by Trimmomatic v0.36 [[65]31], quality-filtered trimmed reads were assembled using SPAdes genome assembler v3.15.2. using –isolate parameters for high coverage isolate and multi-cell Illumina data. Finally, the quality of the assembled genome was checked with QUAST v5.2.0 [[66]32]. Contigs size <200bp were removed and the draft genome was submitted to NCBI GenBank. 2.8. Phylogenetic analysis and functional annotation The assembled genome was uploaded to the Type (Strain) Genome Server (TYGS) [[67]33] and The Genome Taxonomy Database (GTDB)-Tk v2.3.2 [[68]34] for whole genome-based taxonomic identification and classification of bacteria. TYGS also constructed a whole-genome phylogenetic tree based on the Genome Blast Distance Phylogeny (GBDP) approach. Further phylogenetic analysis was conducted using The Bacterial Genome Tree Service integrated in BV-BRC, using codon tree method [[69]35] and visualized using iTOL [[70]36]. PubMLST (Public databases for molecular typing and microbial genome diversity) [[71]37] and K-merFinder v3.2 [[72]38] tools were employed to identify and determine the sequence type (ST) of the isolate. Genome annotation and identification of target genes were accomplished using RAST (Rapid Annotation using Subsystem Technology) webserver [[73]39] and PROKKA (Galaxy Version 1.14.6) [[74]40] integrated in Galaxy with default parameters. EggNOG-mapper v2 [[75]41] was utilized for genome-wide functional annotation and determining the composition of the cluster of orthologous genes (COG). Moreover, we also used KEGG annotations in the KEGG Automatic Annotation Server (KAAS) v2.1 [[76]42] with the Bidirectional Best Hits (BBH-method) to study the metabolic pathways in the assembled genome. 2.9. Detection of antimicrobial- and metal resistance associated genes Contigs in the assembled genome were screened for determining antimicrobial resistance genes using ABRicate with different databases including CARD [[77]43] and ResFinder [[78]44] (Minimum DNA identity and coverage = 80 %). Genes associated with metal resistance were screened by performing blastP using the BacMet (version 2.0) database [[79]45]. 2.10. Characterization of virulence and other properties The Virulence Factor Database (VFDB) [[80]46], PathogenFinder v1.1 [[81]47], PHASTER [[82]48], CRISPRFinder [[83]49] and PlasmidFinder [[84]50] were utilized to detect bacterial virulence factors, pathogenic capability, prophage, CRISPR and plasmid sequences, respectively in the draft genome. The genome mining for the identification of secondary metabolites biosynthetic gene cluster and bacteriocins were performed with the help of AntiSMASH (bacterial version) [[85]51] and BAGEL4 [[86]52]. 2.11. Pan-genome analysis The genome sequences of 36 different strains of B. subtilis were downloaded from NCBI Genbank database ([87]Supplementary Table 2). The sequences were subjected to pan-genome analysis using IPGA v1.09 [[88]53] and PanExplorer [[89]54]. Roary [[90]55] was selected for genome clustering and pan-genome analysis procedures. 3. Results and discussion Heavy metal pollution substantially threatens human health, food production, and the natural environment globally. The search for novel microbial strains by exploring indigenous sources is essential for enabling innovative measures to counter environmental pollution. Using beneficial microorganisms for bioremediation, enhancement of plant growth, and production of valuable metabolites can facilitate the development of sustainable strategies for the environment, agriculture, and industry. Therefore, a heavy metal-resistant B. subtilis was characterized through laboratory experiments and in silico genome exploration that provide adequate genomic insights into different biological characteristics of the strain BDSA1. [91]Fig. 1 illustrates the workflow of this study. Figure 1. [92]Figure 1 [93]Open in a new tab Graphical representation of the entire work 3.1. Phenotypic characterization of metal- and antibiotic resistance of BDSA1 The B. subtilis BDSA1 could tolerate up to 600 ppm of cadmium and chromium, though the growth tend to decrease with the increasing metal concentration ([94]Supplementary Fig. 1). The finding was in agreement with previous studies that reported the tolerance range of chromium reducing microorganisms between 100 ppm and 4000 ppm [[95]56,[96]57]. Such broad variations in the tolerance level might be due to chemical composition of the media and its consistency (solid or liquid), type of microorganism, microbial adaptation and resistance mechanisms, environmental factors and others [[97]58]. For example, the toxicity of metals is greater in liquid than solid media due to more free availability of metals in liquid cultures [[98]59]. The isolate reduced 40 % of 100 ppm hexavalent Chromium to non-toxic trivalent form within 24hrs at 37 °C at pH = 7.0. The chromium reduction ability of B. subtilis is well-documented. Earlier studies reported the influence of numerous factors such as temperature, pH of the media, initial chromium concentration, presence of other heavy metals, exposure time, and others on the extent of chromium reduction by different strains of B. subtilis. For example, B. subtilis MNU16 reduced 75 % of 50 ppm of Cr(VI) at 30 °C within 72hrs [[99]60]. B. subtilis CRB-1 achieved a 100 % reduction of 50 ppm Cr (VI) at pH range 7–9.0, temperature 42 °C with aerobic condition [[100]61]. Alkaliphilic B. subtilis also caused complete chromium reduction at pH = 9.0 [[101]57]. On the other hand, B. subtilis QH-1 reduced only 23.22 % of 100 ppm initial concentration of Cr (VI) at pH = 7.0 within 24hrs [[102]62]. Overall, a comparison of this evidence indicated that optimization of media pH, incubation time and temperature, and initial Cr (VI) concentration might result in greater chromium reduction by BDSA1. The B. subtilis BDSA1 also caused adsorption of 43 % of 200 ppm cadmium after 7 days of incubation at 37 °C, similar to B. subtilis strain NSPA13 and KC6, though the percentage of adsorption varied depending on experimental conditions and initial cadmium concentration [[103]13,[104]63]. We confirmed the presence of chrA and czcC gene in bacterial genomic DNA, as evident from distinct band at 214 and 232 bp, respectively in agarose gel ([105]Supplementary Fig. 2). The occurrence of chrA gene reaffirmed the chromate resistance characteristic of BDSA1. Zhu et al., 2019 demonstrated the involvement of chrA (chromate efflux pump) in bacterial chromate resistance by heterologous expression [[106]61]. The bacterium exhibited resistance to the action of streptomycin, ceftriaxone, cefotaxime and sensitivity to gentamicin, ampicilin, imipenem, meropenem and trimethoprim. Intermediate resistance was observed to tetracycline, chloramphenicol and vancomycin ([107]Supplementary Table 3). [108]Table 1 provides a comparison of the antibiotic resistance profiles and hemolytic characteristics on blood agar of different reported strains of B. subtilis. Table 1. Comparison of antibiotic resistance and hemolytic activity in various B. subtilis strains B. subtilis strains Antibiotic resistance Hemolytic activity Reference BDSA1 Streptomycin, Ceftriaxone, Cefotaxime Non-hemolytic This study RZS-01 Nor-floxacin and Amoxicillin na [109][64] G8 Tiamulin, Oxacillin, Lincomycin na [110][65] MBTDCMFRI Ba37 Bacitracin, Ampicillin and Colistin Non-hemolytic [111][66] CU1 No resistance Non-hemolytic [112][67] VKPM B2335 (BS3) Chloramphenicol, Oxacillin, Cefotaxime, Ceftriaxone Non-hemolytic [113][68] IDCC1101 Streptomycin Non-hemolytic [114][69] [115]Open in a new tab 3.2. Genomic identification and characterization of BDSA1 Whole genome sequencing (WGS) has emerged as a powerful tool for accurate characterization and functional analysis of bacterial strains at the genomic level [[116]70]. The total size of the assembled genome of BDSA1 was 4.2 Mb, distributed into 47 contigs with 43.4 % GC content ([117]Table 2). Whole genome taxonomic identification using TYGS and GTDB revealed the isolate as Bacillus subtilis. Codon tree-based phylogenetic analysis suggested that B. subtilis BDSA1 was similar to B. subtilis strain 168 ([118]Fig. 2A), whereas GBDP based whole genome phylogenetic tree identified B. subtilis NCIB 3610 as the closest neighbor of BDSA1 ([119]Fig. 2B). Analysis of sequence data using pubMLST (Locus: ilvD, tpiA, pycA, rpoD, glpF, pta, purH) and K-merFinder also confirmed the identification of Bacillus subtilis. As MLST-2.0 predicted the housekeeping genes pycA and purH as novel allele, the isolate might be designated as a novel sequence type (ST). Nowadays, ANI and dDDH are used to identify bacterial species in which an ANI value of ≥96 %, and a in silico ddH value of ≥70 % are considered cut-off for species delineation. B. subtilis BDSA1 had dDDH value = 88.9 % and ANI value = 98.47 % with B. subtilis NCIB 3610, validating the whole genome identification ([120]Fig. 3A) [[121]71]. Table 2. Different characteristics of B. subtilis BDSA1 assembled genome Characteristics __________________________________________________________________ Terms __________________________________________________________________ Taxonomy Firmicutes>Bacilli>Bacillales>Bacillaceae>Bacillus>Bacillus subtilis Genome statistics Number of contigs 47 Size 4,221,385 bp GC content (%) 43.4 Contig N50 value 502218 Contig L50 value 3 Genomic features CDS 4438 tRNA 37 rRNA 9 Genome quality Completeness 100% Contamination 0.4% Overall remark Good Annotation features Transporter (TCDB) 326 Drug target (Drug Bank) 71 Antibiotic resistance (PATRICK) 44 Protein features Proteins with functional assignments 3658 Proteins with EC 1077 Proteins with GO assignments 902 Proteins with Pathway assignments 794 Hypothetical proteins 780 Genome availability Bio Project [122]PRJNA898679 Bio Sample [123]SAMN31620504 GenBank accession [124]JAPFBY000000000.1 [125]Open in a new tab Figure 2. [126]Figure 2 [127]Open in a new tab aPhylogenetic analysis of B. subtilis BDSA1 based on codon tree in PATRIC Figure 2. [128]Figure 2 [129]Open in a new tab bPhylogenetic analysis of B. subtilis BDSA1 based on Genome Blast Distance Phylogeny (GBDP) approach in TYGS Figure 3. [130]Figure 3 [131]Open in a new tab aWhole genome comparison of BDSA1 with closely related strains According to PathogenFinder, the probability of the B. subtilis BDSA1 being a human pathogen is 0.296, in a scale of 1 as the highest value. The PHASTER server predicted genome regions that contained one intact, four incomplete, and one questionable prophage sequences ([132]Supplementary Table 4). PlasmidFinder identified two plasmid sequences in BDSA1 genome that did not have any heavy metal or antibiotic resistant genes. As evident from the MGEFinder, two copies of insertion sequences, ISBpu1 were found in node_3 and node_6. Both sequences were transposases that originated from B. pumilus and carried the aadK gene and mphK gene. Presence of two more insertion sequences ISSep2, from Staphylococcus epidermidis and Proteus vulguris and ISSau6 from Staphylococcus aureus were also detected in node 13. The genome also had three confirmed CRISPR regions. VFDB analysis resulted in detection of only one virulence factor, bslA (hydrophobin). The genes encoding virulence factors such as hemolysin, enterotoxin, cytotoxin and other toxins were not found in the genome. 3.3. Genome annotation of BDSA1 Genome annotation using RAST predicted 4438 protein coding sequences and 37 tRNA in the draft assembly, whereas PROKKA-based annotation detected 4227 CDS, 6 rRNA and 43 tRNA in the draft genome ([133]Fig. 3B). The discrepancy in the number of predicted coding sequences might be due to differences in their underlying algorithms, reference databases, and criteria for gene calling [[134]39,[135]40]. RAST-annotation revealed a total of 330 subsystems including carbohydrates, protein metabolism, amino acids and derivatives, cofactors, vitamins and others ([136]Fig. 4A). Figure 3. [137]Figure 3 [138]Open in a new tab bCircular genome map of B. subtilis BDSA1 with different annotation characteristics of the isolate Figure 4. [139]Figure 4 [140]Open in a new tab aFunctional annotation of B. subtilis BDSA1: Distribution of subsystems by RAST Based on EggNOG mapping, 16.21 % of proteins were involved in information storage and processing activities including replication, transcription, translation, ribosomal biogenesis, recombination and repair. 14.85 % of proteins were related to cellular processes and signaling, including activities such as cell wall biogenesis, signal transduction, cell cycle regulation, intracellular trafficking, and post-translational modification. Additionally, about 29.82 % of proteins were associated with metabolic functions (amino acid transport, carbohydrate metabolism, energy production, conversion, and inorganic ion transport) ([141]Fig. 4B). 26.01 % of proteins were poorly characterized. KEGG pathway enrichment analysis suggested that most of the pathways were associated with metabolism of carbohydrate, amino acid, lipid, energy and biosynthesis of other compounds ([142]Fig. 4C). In addition, BDSA1 harbored 39 genes that were linked to xenobiotics degradation and metabolism pathways including, benzoate, aminobenzoate, xylene, naphthalene, styrene, atrazine, and DDT degradation. Similar to our findings, Li et al., 2019 and Kumar et al., 2023 identified 37 and 34 genes in B. subtilis DM2 and B. subtilis EB-1 respectively that might be responsible for the degradation of petroleum hydrocarbon and xenobiotics [[143]20,[144]72]. Figure 4. [145]Figure 4 [146]Open in a new tab bFunctional annotation of B. subtilis BDSA1: Pathway enrichment analysis Figure 4. [147]Figure 4 [148]Open in a new tab cFunctional annotation of B. subtilis BDSA1: COG annotations Genome annotation of BDSA1 also showed the presence of different genes associated with resistance to oxidative and osmotic stress, metabolic regulation, biofilm formation, and production of degradative enzymes ([149]Table 3). For example, BDSA1 had opuA operon (multicomponent glycine betaine transport system), consisting of three structural genes that allow bacteria to uptake osmoprotectant [[150]73]. Biofilms of B. subtilis have been associated with probiotic and biocontrol activities, as well as potential industrial applications [[151]74]. tasA and bslA, encoding amyloid fiber protein and self-assembling bacteria hydrophobin, are required for biofilm formation in B. subtilis and their presence in BDSA1 indicated its biofilm-forming ability [[152]75]. The strain also carried genes for production of degradative enzymes such as endoglucanase, phytase, amylase, xylanase that have potential applications in lignocellulosic biomass degradation, poultry production and others [[153]76]. Table 3. Representative genes related to biocontrol properties of B. subtilis BDSA1 Gene Product/Pathway involvement Description Opu operon (opuAA, opuAB, opuAC) High affinity osmopro- tectant glycine betaine uptake system Resistance to osmotic, oxidative and others abiotic stress yfkM, yugL General stress protein nsrR, Nitrosative resistance gene perR Peroxide operon regulator nasABCDE Nitrate assimilation operon Nitrogen metabolism pstA, pstB, pstS Phosphate transport Phosphate solubilization dhbABCD Bacillibactin biosynthesis Siderophore formation yclNOPQ Ferric-Petrobactin uptake operon eglS Endoglucanase Lytic/Degradative enzyme xynABC, xynD Xylanase csn Chitosanase phy 3-phytase tasA, Biofilm matrix component Biofilm formation bslA, bslB Biofilm surface layer protein alsS. alsD Acetoin biosynthesis Plant growth promotion and regulation treA, treP, treR Trehalose biosynthesis trpABCD, trpF Tryptophan biosynthesis and regulation of auxin biosynthesis gabD, gabP Biosynthesis of gamma-aminobutyric acid (GABA) pelC Pectate lyase paiA, bltD Spermine/spermidine acetyltransferase [154]Open in a new tab B. subtilis BDSA1 harbored numerous genes involved in plant growth promoting mechanisms including regulation of auxin production, GABA (Gamma-aminobutyric acid) formation, spermidine acetyltransferase, siderophore production. Prior studies utilized genome analysis for determining the biocontrol potential of different strains of B. subtilis, notably Bbv57, MBI600, CTXW 7-6-2 against phytopathogens [[155]22,[156]23,[157]77]. Phosphate solubilization-associated genes pstA, pstB, pstS were identified in the BDSA1 genome, which together account for the phosphate transport system. Based on the occurrence of these genes, other studies also reported B. subtilis RS10, B. subtilis EA-CB0575, and B. subtilis QH-21 as phosphate solubilizers [[158]18,[159]62,[160]78]. Iqbal et al., 2021 identified the phosphate transport system in B. subtilis RS10 genome and their role in plant growth was experimentally validated through in vitro experiments [[161]78]. Siderophore-producing bacteria increase the availability of iron for plant development [[162]79]. Siderophore-formation genes in the BDSA1 genome suggested its ability to sequester rhizospheric iron for plant metabolism. Annotated genome also found to carry genes for tryptophan synthase, anthranilate phosphoribosyltransferase, and phosphoribosylanthranilate isomerase that function in tryptophan synthesis and regulation of auxin biosynthesis. Taghavi et al., 2010 demonstrated the association between tryptophan-related genes and auxin formation [[163]80]. In addition, spermidine production by B. subtilis inhibited ethylene production, affecting plants-microbe interactions [[164]81]. Overall, the genomic characteristics of BDSA1 suggested its potential in plant growth promotion which needs further experimentation to elaborate the extent of PGPR activities. 3.4. Genomic analysis of AMR and HMR properties The search for AMR genes in BDSA1 genome against CARD database yielded no perfect hit and 16 strict hits for AMR genes, whereas, according to ResFinder analysis, only 2 acquired resistance genes, namely aadK (aminoglycoside 6-adenylyltransferase) and mphK (macrolide phosphotransferase) were detected. The variations in the number of AMR gene predictions could occur because CARD encompasses both intrinsic and acquired resistance genes, leading to a higher count, whereas ResFinder targets mainly acquired resistance genes, contributing to the observed discrepancies [[165]82]. However, phenotypic resistance to ceftriaxone, cefotaxime and aztreonam aligned with the CARD-based AMR analysis. The acquired resistance gene aadK encodes a streptomycin-modifying enzyme and thus confers streptomycin resistance [[166]83]. Overall, the results of antibiotic susceptibility testing were consistent with in silico AMR analysis. Adaptation and resistance to heavy metals is well-recognized in B. subtilis. A number of metal resistance associated genes, such as chrA, czcD, arsB, cadA were found that were responsible for bacterial resistance to chromium (Cr), cadmium (Cd), arsenic (As), Copper (Cu), manganese (Mn), Nickel (Ni) etc ([167]Table 4). Among these genes, cadA encodes cadmium-translocating ATPase, which is thought to be a multifunctional metal-exporting pump for the extrusion of Zn^2+, Cd^2+, Ag^2+, and Pb^2+ [[168]84]. Copper-translocating P-type ATPase (copA), Cobalt-zinc-cadmium resistance protein (czcD), Arsenical pump membrane protein (arsB), were also present in tested genome. The three copper resistance-associated genes in BDSA1 were arranged (csoR-copZ-copA) in the same way as the copper resistance system seen in B. subtilis and B. megaterium [[169]85]. Besides, czrA regulon responds to different mono and divalent cations that consist of cadA and czcD, a determinant of resistance to Zn(II), Cu, Ni(II), and Co(II) [[170]86]. We also observed manganese transport systems encoded by mntABCD and mntH along with their regulator mntR and manganese efflux mneS, mneP in different contigs of the genome. Table 4. Some notable antimicrobial and metal-resistance genes and their mechanism of actions Gene Location __________________________________________________________________ Function Resistance Mechanism of resistance/Pathway involvement Start (bp) End (bp) Strand Node Antimicrobial resistance genes ykkC 41106 41444 + 1 Small multidrug resistance efflux protein Aminoglycoside Phenicol Tetracycline [171]Antibiotic efflux ykkD 41444 41761 + rphB 773517 776113 - Rifampin phosphotransferase protein Rifamycin [172]Antibiotic inactivation bmr 1110143 1111312 + MFS efflux pump Fluoroquinolone phenicol [173]Antibiotic efflux blt 39890 41092 + Fluoroquinolone aadK 59216 60070 - 3 Aminoglycoside nucleotidyltransferase Aminoglycoside [174]Antibiotic inactivation mphK 77240 78160 + Macrolide phosphotransferase Macrolide lmrB 90052 91485 - 6 chromosomally-encoded efflux pump Lincosamide [175]Antibiotic efflux tmrB 140609 141202 - ATP-binding tunicamycin resistance protein Nucleoside [176]Reduced permeability mprF 27270 29840 - Integral membrane protein that modifies phosphatidylglycerol Peptide antibiotic [177]Antibiotic target alteration Metal resistance associated genes chrA 589,219 589,755 + 2 Chromate transport protein Chromium Responsible for induction of resistance copA 307,753 310,167 - 2 Copper-translocating P-type ATPase Copper Activation of copZA operon by CueR and repressed by YfmP czcD 45,711 46,748 - 3 Cobalt-zinc-cadmium resistance protein CzcD Cadmium, Zinc, Cobalt Efflux of metal ion mntR 1,158,929 1,159,357 - 1 Transcriptional regulator Manganese, Magnesium Represses the expression of Mn(II) uptake czrA 794,932 795,255 - 1 A repressor for the czr operon Zinc, Cadmium Part of the czrSRCBA resistance operon arsB 580,023 581,237 + 2 Arsenical pump membrane protein Arsenic Arsenite extrusion pump cadA 305,484 307,583 - 2 Cadmium efflux ATPase Cadmium, Zinc Chromosomal determinant to cadmium resistance csoR 310,531 310,836 - Copper-sensing transcriptional repressor Copper Negatively regulates expression of the copZA operon corA 1,175,964 1,176,917 - Cobalt/magnesium transport protein Magnesium, Cobalt, Nickel, Manganese Mediates magnesium ions influx and efflux as well as cobalt and nickel uptake [178]Open in a new tab 3.5. Secondary metabolites analysis B. subtilis produces a wide range of chemically diverse secondary metabolites exerting diverse biological activities [[179]87]. The BDSA1 genome was found to carry six biosynthetic gene clusters encoding bacillaene, bacilysin, fengycin, bacilibactin, subtilosin and surfactin. Occurrence of these gene clusters is often reported in different strains of B. subtilis and in others species of Bacillus genus, though little information is available regarding their detection, quantification and antimicrobial use [[180]23,[181]77]. These gene clusters were composed of core and additional biosynthetic genes, transport-related genes, regulatory genes and others ([182]Fig. 5). Fengycin and surfactin are lipopeptides produced by B. subtilis strains that are known to have a broad range of antifungal and antibacterial activities, respectively [[183]88,[184]89]. Expression of the genes for biosynthesis of a catechol siderophore, named bacillibactin, allows B. subtilis to compete with other microbes for iron acquisition in iron-deficient conditions [[185]90]. Bacillaene is a polyketide that inhibits protein synthesis and, thus, exerts antibacterial activity, whereas bacilysin show antimicrobial effect by disrupting the cell membrane integrity [[186]91,[187]92]. In addition, BAGEL4 predicted 5 gene clusters of interest linked to the production of secondary metabolites such as UviB, comX4, subtilosin, Lanthipeptide_class_IV and sactipeptides. Subtilosin A, a ribosomally synthesized and post-translationally modified peptide, proved to interfere with the phospholipid bilayer and thus showed a broad spectrum of antibacterial activity [[188]93]. Figure 5. [189]Figure 5 [190]Open in a new tab Schematic diagram of secondary metabolites biosynthetic gene clusters in B. subtilis BDSA1 3.6. Pan-genome analysis Roary-based pan-genome analysis showed 7348 clusters, including 1574 strain-specific genes (21.4 %), 2225 core genes (30.3 %) and 3549 dispensable genes (48.3 %) ([191]Fig. 6A). In the core-pan rarefaction curve, it was evident that the number of the pan genome genes increased gradually with the addition of new strains, whereas the core genome exhibited the opposite trend ([192]Fig. 6B). No distinct plateau in core/pan-genome ratio indicated the open state of the B. subtilis pan-genome. The finding is consistent with earlier studies [[193]94]. According to ANI-based pan-genome analysis, B. subtilis BDSA1 was found to be closely related to other tested isolates with ANI values of 98 % or higher except strain CU1 and CW14 (92.6 %) ([194]Fig. 6C). Functional classification of Pan-genome by COG's distribution indicated that majority of gene clusters were associated with Inorganic ion transport and metabolism (P), Amino acid transport and metabolism (E), Signal transduction mechanisms (T), Energy production and conversion (C) ([195]Fig. 6D). Moreover, phylogenetic tree generated by hierarchical clustering from binary matrix (presence/absence) of accessory genes suggested that BDSA1 occurred in the same clade with strains SG6, PMB102, HY2, GOT9, FY-1 etc ([196]Fig. 6E). Figure 6. [197]Figure 6 [198]Open in a new tab aPan-genome analysis of B. subtilis strains: Distribution of core and accessory genes, Figure 6. [199]Figure 6 [200]Open in a new tab bPan-genome analysis of B. subtilis strains: Pan-genome size rarefaction curve Figure 6. [201]Figure 6 [202]Open in a new tab cPan-genome analysis of B. subtilis strains: Comparison of the ANI values between strains Figure 6. [203]Figure 6 [204]Open in a new tab dPan-genome analysis of B. subtilis strains: COG’s distribution Figure 6. [205]Figure 6 [206]Open in a new tab ePan-genome analysis of B. subtilis strains: Pan-genome phylogenetic tree *na=not available 4. Conclusion Our study performed a comprehensive genomic analysis of native B. subtilis BDSA1 to uncover the potential biocontrol properties and to assess its safety. The whole genome analysis reveals the presence of an array of genes associated with bioremediation, plant growth promotion, secondary metabolites formation and indicates the diverse biotechnological applications that this isolate might possess. Phenotypic data of chromium reduction and cadmium absorption also supports the potential, which needs further evaluation of optimum experimental parameters. However, future works on experimental validation of in silico identified characteristics, production of secondary metabolites, application for plant growth development, and others can determine the suitability of this strain for industrial-scale use. Declarations The authors declared no conflict of interests. Data availability statement The genome sequence of Bacillus subtilis BDSA1, as discussed in the manuscript, has been deposited in the NCBI GenBank under the accession number [207]JAPFBY000000000.1. CRediT authorship contribution statement Tanvir Ahmed Saikat: Writing – original draft, Methodology, Formal analysis, Data curation. Md Abu Sayem Khan: Writing – original draft, Methodology, Formal analysis, Data curation. Md Saiful Islam: Formal analysis. Zarin Tasnim: Formal analysis. Sangita Ahmed: Writing – review & editing, Supervision, Methodology, Investigation, Funding acquisition, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgement