Abstract Alzheimer’s disease (AD) poses a major challenge due to its impact on the elderly population and the lack of effective early diagnosis and treatment options. In an effort to address this issue, a study focused on identifying potential biomarkers and therapeutic agents for AD was carried out. Using RNA-Seq data from AD patients and healthy individuals, 12 differentially expressed genes (DEGs) were identified, with 9 expressing upregulation (ISG15, HRNR, MTATP8P1, MTCO3P12, DTHD1, DCX, ST8SIA2, NNAT, and PCDH11Y) and 3 expressing downregulation (LTF, XIST, and TTR). Among them, TTR exhibited the lowest gene expression profile. Interestingly, functional analysis tied TTR to amyloid fiber formation and neutrophil degranulation through enrichment analysis. These findings suggested the potential of TTR as a diagnostic biomarker for AD. Additionally, druggability analysis revealed that the FDA-approved drug Levothyroxine might be effective against the Transthyretin protein encoded by the TTR gene. Molecular docking and dynamics simulation studies of Levothyroxine and Transthyretin suggested that this drug could be repurposed to treat AD. However, additional studies using in vitro and in vivo models are necessary before these findings can be applied in clinical applications. Supplementary Information The online version contains supplementary material available at 10.1038/s41598-024-75431-z. Keywords: RNA-Seq, Alzheimer’s disease, Biomarker, Drug discovery Subject terms: Drug discovery, Computational biology and bioinformatics, Gene ontology, Transcriptomics Introduction Alzheimer’s disease (AD) is a degenerative neurological condition characterized by a progressive and irreversible decline in cognitive and functional abilities, resulting in significant impairment in routine tasks and social interactions^[41]1. Various environmental and genetic risk factors have been linked with its onset^[42]2,[43]3. The pathological mechanisms of the disease are mostly distinguished by the formation of amyloid plaques (Aβ plaques), neurofibrillary tangles, and neuronal degeneration in the brain^[44]4. Furthermore, Alzheimer’s disease (AD) is a matter of significant public health concern on a global scale, both in the United States and in several other countries around the world^[45]5,[46]6. Furthermore, it is noteworthy that this condition ranks as the fifth most prevalent cause of death in the elderly population of the United States. Additionally, it is estimated that approximately 35 million individuals globally are impacted by this disease, with estimations indicating that this figure will increase to 65 million by the year 2030^[47]7. Additionally, the number of cases of Alzheimer’s disease in Asia, in developed as well as developing countries, is affected by age, gender, and cultural factors^[48]8–[49]10. In Bangladesh, there is a lack of precise epidemiological data on the number of AD patients, and awareness of the disease is still in the stages of development. The lack of awareness has resulted in ongoing difficulties for patients and their families. Moreover, limited funding for AD research makes it difficult for a lower-middle-income country like Bangladesh to effectively manage the disease. Therefore, it is time to consider the disease and its management proactively and take the necessary measures^[50]11. Several studies have been conducted in recent times regarding the etiology of AD, with the majority of the research indicating that the cause of AD was genetic in nature^[51]12. Despite the increasing number of genes that have been suggested to impact vulnerability to Alzheimer’s disease (AD), the comprehension of their mechanisms and the enhancement of disease management are still restricted by challenges in understanding the functional implications of genetic associations^[52]13,[53]14. Recent research has demonstrated that various mechanisms of gene expression regulation, including interactions between mRNA and transcription factors, non-coding RNAs, alternative splicing, and variants, may have an impact on the process of neurodegeneration and an increasing number of developments are being observed in the concurrent analysis of transcriptome data, to investigate the consequences of recently identified genetic risk factors on the transcriptome level^[54]15. The area of molecular biology and genomics has grown tremendously in the previous decade, with the most recent development content including “omics” technologies such as genomics, proteomics, transcriptomics, and metabolomics^[55]16. The transcriptomic concept refers to the complete set of transcripts found in various cell types, tissues, or organs. This encompasses both coding and noncoding RNA molecules that are responsible for encoding proteins^[56]17. Several studies have utilized the transcriptomics approach to identify biomarkers that differentiate individuals with Alzheimer’s disease from those without Alzheimer’s. The identification of differentially expressed genes (DEGs) through RNA-Seq analysis is an essential part of the study of biological pathways implicated in various neurological disorders. The purpose of conducting Differential Expression Gene (DEG) analysis is to identify genes that exhibit potential overexpression or underexpression in the context of a disease state, relative to a control group that remains unaffected^[57]18. Dysregulation of gene expression, whether it be overexpression or underexpression, can lead to disruptions in various biological pathways such as metabolic and immune pathways, which eventually result in the development of diseases^[58]19. Differentially expressed genes (DEGs) may exert an impact on the initiation of neurodegenerative disorders like Alzheimer’s disease. Furthermore, it is plausible that there may exist divergences in the gene expression profiles across distinct regions of the brain^[59]20. Determining whether the transcriptional alterations may yield cumulative impacts on established disease susceptibility factors and disease-associated pathways is of paramount significance^[60]21. The incidence of Alzheimer’s disease (AD) may exhibit an upward trend with advancing age, and anomalous transcriptional alterations give rise to pathogenic mechanisms associated with the disease^[61]22. The application of differentially expressed genes (DEGs) in systemic biology studies can identify significant functional components and central hub genes that are linked to the development of various diseases. This approach may involve the use of various tools such as Gene Ontology (GO)^[62]23 and Kyoto Encyclopedia of Genes and Genomes (KEGG)^[63]24 biological pathways. The goal of the study was to identify differentially expressed genes from bulk RNA-Seq data and suggest potential biomarkers as well as find any repurposable drug candidates, if available. For this purpose, a number of analyses such as differential gene expression, druggability, molecular docking and dynamics simulation were to be carried out. Methods Retrieval of NGS data The RNA-Seq datasets of 221 patients with Alzheimer’s (AD = 132) and non-Alzheimer’s (control = 89) were obtained in FASTQ format from the public database Gene Expression Omnibus (GEO) ([64]https://www.ncbi.nlm.nih.gov/geo/) (Project ID: PRJNA675355, PRJNA767074, PRJNA796229, PRJNA232669, PRJNA377568, PRJNA413568) in the National Center of Biotechnology Information (NCBI) website were collected to identify potential biomarkers and therapeutic targets. Dataset of PRJNA683625 were used as independent dataset for validation. The criteria used for the selection of datasets have been depicted in Fig. [65]1. Supplementary File 1 contains information regarding the selected datasets (project ID, sample size, number of Alzheimer’s subjects, number of healthy subjects, gender, age range, brain region, stage, reference). These samples were subsequently processed using various computational tools. These steps have been presented through a flowchart (Fig. [66]2). Fig. 1. [67]Fig. 1 [68]Open in a new tab The criteria used for the selection of datasets. Fig. 2. [69]Fig. 2 [70]Open in a new tab Overall workflow of the study. Processing and alignment of the reads HISAT2 was used to align all the raw reads to the reference genome of Homo sapiens GRCh38.p13 (GCA_000001405.28) from Ensembl^[71]25. Afterwards, the mapped reads were then assigned to Ensembl genomic features defined in GRCm38.69 ([72]http://primerseq.sourceforge.net/gtf.html). The number of reads per gene were quantified with the use of FeatureCounts ([73]http://subread.sourceforge.net/). Differential gene expression analysis The quantification of RNA-Seq relies on read counts that are assigned to genes in a probabilistic manner. For the minimization of batch effect, ComBat-seq function of sva package was used for each project^[74]26. In order to compute differential expression, the statistical approach DESeq2 was used to predict DEGs, and false discovery rate (FDR) was used to correct p-values and identify true DEGs. The fold change (FC) of each gene was calculated between control and AD groups, and genes with p-adjusted value < 0.05 and |Log2FC| > 1.45 were considered significant DEGs^[75]27. Functional and pathway enrichment analysis of DEGs To understand the biological functions and pathways associated with the differentially expressed genes (DEGs), enrichment analyses of Gene Ontology (GO) and KEGG pathways were performed. Enrichment analysis was carried out using Enrichr ([76]https://maayanlab.cloud/Enrichr/), which is a web-based tool for functional enrichment analysis. GO terms with a corrected p-value ≤ 0.05 were considered significantly enriched. For pathway enrichment analysis, the KEGG database was used in the Enrichr web server and pathways with a p-value ≤ 0.05 were considered significantly enriched. PPI, hub, MicroRNA, TF, and drug-gene interactions network analysis with validation on independent dataset Protein-protein interactions networks for the upregulated and downregulated genes in Alzheimer’s disease were constructed and visualized using STRING and Cytoscape respectively^[77]28,[78]29. MicroRNAs and their interacting genes were identified using the miRTarBase database^[79]30. Similarly, the TRRUST database was used for identifying the transcription factors and their target genes^[80]31. Subsequently, Cytoscape was used for visualizing the interactions as networks. The hub genes within the network of upregulated and downregulated genes were detected using the CytoNCA plug-in in Cytoscape^[81]32. The top 5 genes with the highest degree of connectivity were selected as hub genes. The hub genes were tested for their expression levels on an independent dataset PRJNA683625 in order to validate the findings. In order to explore drug-gene interactions, the Drug-Gene Interaction Database (DGIdb) was used with hub proteins as input^[82]33. Druggability of DEG encoded proteins The protein sequences of DEGs were retrieved from UniProtKB ([83]https://www.uniprot.org/help/uniprotkb) and queried in the DrugBank database ([84]https://go.drugbank.com/) as drug targets. Proteins were considered potential druggable targets if their sequences had a high degree of similarity (E value- <10^–100, bit score > 100) to those in the DrugBank database. Those that had no hits were regarded as novel targets^[85]34. Molecular docking Chemical structure of Levothyroxine was obtained from DrugBank. It was docked against the target downregulated transthyretin protein encoded by TTR gene using Webina 1.0.3 web server ([86]https://durrantlab.pitt.edu/webina/). Following successful molecular docking, the protein-ligand interactions were visualized using Pymol ([87]https://pymol.org/2/) and Poseview ([88]https://proteins.plus/). Molecular dynamic simulation 100 ns Molecular dynamics (MD) simulation was carried out using the GROningen MAchine for Chemical Simulations aka GROMACS (version 2020.6) for the apo TTR and Levothyroxine-TTR complex^[89]35. The proteins were embedded in the TIP3 water model^[90]36,[91]37. The whole system was energetically minimized using CHARMM36m force-field^[92]38. The systems were neutralized using K^+ and Cl^- ions. Following energy minimization, isothermal-isochoric (NVT) equilibration, and Isobaric (NPT) equilibration of the system were executed. Afterward, 100 ns production MD simulation was run. Using MD simulation data, root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), and solvent accessible surface area (SASA), and Hydrogen bond analysis were conducted. The ggplot2 package ([93]https://ggplot2.tidyverse.org/) in RStudio was utilized for generating the graphs for each of these analyses. All MD simulations were performed in the high-performance simulation stations running on Ubuntu 20.04.4 LTS operating system located at the Bioinformatics Division, National Institute of Biotechnology, Bangladesh. Results Quantification of the high quality raw reads FastQC v0.11.5 was utilized to check the quality of the raw sequencing data obtained from NCBI. All of the raw sequences were evaluated and were found to be of high quality. Following the alignment of the reads to the human reference genome, a total of 62,702 genes were identified which were forwarded to DEG analysis in the quantification step. Differential gene expression analysis revealed significantly upregulated and downregulated genes A total of 10,730 differentially expressed genes (DEGs) were identified in AD patient samples, with 7814 genes being upregulated and 2916 genes being downregulated. These DEGs were visualized through volcano plots using Rstudio. In Fig. [94]3, the volcano plot displayed the DEGs, with the upregulated DEGs located on the right and the downregulated DEGs on the left. The significant DEGs (p value < 0.05) are indicated by the yellow color while the rest was shown in red. Fig. 3. [95]Fig. 3 [96]Open in a new tab Volcano plot of differentially expressed genes. The upregulated and downregulated genes were depicted on the right and left respectively. The genes were indicated using yellow dots. Significant functional and pathway enrichment analysis of DEGs Table [97]1 demonstrated that out of 10,730 DEGs, a total 12 genes passed the cutoff value (|Log2FC| > 1.45 ) for significance. Among these 12 genes, 9 DEGs were upregulated and 3 DEGs were downregulated. This figure also depicted that in Alzheimer’s disease, PCDH11Y was the most upregulated (log2foldchange value = 1.889662998) and TTR was the most downregulated (log2foldchange value = – 2.361971992) DEGs. The GO functional and KEGG pathway enrichment analysis of these 12 DEGs showed that 11 GO-Biological Process (GO-BP) terms, 9 GO-Cellular Component (GO-CC) terms, 6 GO-Molecular Function (GO-MF) terms, 1 KEGG pathway and 5 Reactome pathways were associated with the 2 downregulated genes, respectively (Supplementary File 2). The downregulated gene-associated KEGG pathway was thyroid hormone synthesis and Reactome pathways were Amyloid fiber formation, metal sequestration by antimicrobial proteins, neutrophil degranulation and innate immune systems (Fig. [98]4a). Moreover, 4 GO-Biological Process (GO-BP) terms, 2 GO-Cellular Component (GO-CC) terms, 2 GO-Molecular Function (GO-MF) terms, 1 KEGG pathway were associated with the 9 upregulated genes. Among them, one upregulated gene ISG15 was found to be involved in RIG-I-like receptor signaling pathway (Fig. [99]4b). Table 1. Significant upregulated and downregulated genes. Gene ID Gene Name log2FoldChange p-value padj ENSG00000187608 ISG15 1.76E + 00 2.66E–09 1.51E–06 ENSG00000197915 HRNR 1.497234321 9.75E–06 0.0003905954358 ENSG00000240409 MTATP8P1 1.46E + 00 3.01E–03 0.0188562683 ENSG00000198744 MTCO3P12 1.69E + 00 2.35E–07 3.16E–05 ENSG00000012223 LTF – 2.014635682 3.47E–11 6.81E–08 ENSG00000197057 DTHD1 1.82E + 00 7.47E–11 1.15E–07 ENSG00000077279 DCX 1.46E + 00 6.49E–10 5.82E–07 ENSG00000229807 XIST – 1.645681137 0.002128052759 0.01491435091 ENSG00000140557 ST8SIA2 1.593330969 2.06E–06 0.0001395682111 ENSG00000118271 TTR – 2.361971992 7.45E–08 1.57E–05 ENSG00000053438 NNAT 1.523563947 7.98E–13 4.30E–09 ENSG00000099715 PCDH11Y 1.889662998 8.60E–07 7.54E–05 [100]Open in a new tab Fig. 4. [101]Fig. 4 [102]Open in a new tab a Significantly enriched pathway of the downregulated genes by using KEGG and Reactome database; b Significantly enriched pathway of the upregulated genes by using KEGG database. Network analysis revealed crucial interactions between hubs, microRNAs, TFs, drugs and genes Protein-protein interactions networks for the upregulated and downregulated genes in Alzheimer’s disease have been depicted in Fig. [103]5. The network of upregulated genes contained 4 connected components while the network of downregulated genes contained only one connected component. For the network of upregulated genes, the number of nodes and edges were 67 and 31 respectively whereas for the network of downregulated genes, the number of nodes and edges were 24 and 15 respectively. MicroRNA-gene interactions analysis based on the interactions information provided by miRTarBase database revealed that the microRNAs interact more with the downregulated genes compared to the upregulated ones. Figure [104]6 shows the microRNAs and their interacting genes within the network of upregulated and downregulated genes. The complete list of microRNAs in the upregulated and downregulated network have been provided in Supplementary File 3 and Supplementary File 4 respectively. Among the upregulated genes, DTHD1, PCDH11Y, ST8SIA2, and ISG15 were found to be the targets of microRNAs. In the case of downregulated genes, CACNA2D1, CXCR1, CXCR4, LCN10, LTF, LYZ, MMP8, MMP10, RGS1, RTKN2, SETD5, S100A4, TTR, VNN1, VNN2, and XIST were the targets. Transcription factors and their target genes within the network of upregulated and downregulated genes as identified by the TRRUST database have been presented in Fig. [105]7. Supplementary File 5 and Supplementary File 6 contains the list of transcription factors and their target genes within the network of upregulated and downregulated genes in Alzheimer’s disease respectively. Transcription factors targeted upregulated genes such as GZMB, IFNG, IFNL1, IDO1, and SELE. Several downregulated genes namely CXCR1, CXCR4, GDF3, IL1R2, LTF, LYZ, MMP10, and S100A4 were also the targets of transcription factors. Hub gene analysis revealed the genes CXCL11, GZMB, IFNG, IFNL1, and ISG15 as hub genes in the upregulated network. On the other hand, in the downregulated network, the genes CXCR4, IL1R2, LTF, MMP8, and TTR were identified as hub genes (Fig. [106]8). Drug-gene interactions analysis presented a number of drugs that can interact with hub proteins in the networks of upregulated and downregulated genes (Fig. [107]9). Fig. 5. [108]Fig. 5 [109]Open in a new tab Protein-protein interactions network for the (a) upregulated and (b) downregulated genes in Alzheimer’s disease. Fig. 6. [110]Fig. 6 [111]Open in a new tab MicroRNA interacting genes within the network of (a) upregulated and (b) downregulated genes in Alzheimer’s disease. Fig. 7. [112]Fig. 7 [113]Open in a new tab Transcription factors and their target genes within the network of (a) upregulated and (b) downregulated genes in Alzheimer’s disease. Fig. 8. [114]Fig. 8 [115]Open in a new tab Hub genes within the network of (a) upregulated and (b) downregulated genes in Alzheimer’s disease. Fig. 9. [116]Fig. 9 [117]Open in a new tab Drug-gene interactions with the hub proteins in the network of (a) upregulated and (b) downregulated genes in Alzheimer’s disease. Levothyroxine was identified as a potential drug againstTTRgene product Transthyretin The DrugBank webserver was used to find potential drugs that might target the 4 downregulated genes. It revealed that only one gene (TTR) had a corresponding FDA-approved drug called Levothyroxine. The remaining genes did not match to any drugs in the DrugBank database, suggesting that they can be considered as potential novel therapeutic targets. Molecular docking analysis revealed molecular interactions between the drug and target protein Utilizing Webina 1.0.3 web server, a molecular docking analysis had been carried out between the FDA-approved drug Levothyroxine and the predicted structures of the TTR encoded protein called Transthyretin. The molecular interactions between the ligand Levothyroxine and Transthyretin indicated a significant binding energy value of -5.1 kcal/mol. According to docking analysis, TTR gene interacted with Levothyroxine through Arg103A, Asp99A, Thr119A, Ala120A, Ser100A. (Fig. [118]10) Fig. 10. [119]Fig. 10 [120]Open in a new tab Molecular Docking between the TTR and Levothyroxine. TTR gene interacts with Levothyroxine through Arg103A, Asp99A, Thr119A, Ala120A, Ser100A. MD simulation analysis confirmed the stability of drug-receptor complex Root Mean Square Deviation (RMSD) calculation was carried out in order to assess stability of the systems. Change in RMSD value corresponds to conformational changes of the protein as a result of ligand binding. The purple line represents the RMSD profile of apo receptor whereas the green line depicts drug-receptor complex (Fig. [121]11). After 50ns, the RMSD value of the drug-receptor complex did not increase beyond ~ 2.5 nm whereas the apo receptor RMSD value gradually increased up to ~ 4.0 nm. Root Mean Square Fluctuation (RMSF) analysis was utilized to determine the regional flexibility of the protein. The higher the RMSF, the higher is the flexibility of a given amino acid position. Figure [122]12 demonstrates the RMSF profile of apo receptor and drug-receptor complex. The major RMSF peaks were observed at around the 20th residue and the 85th residue. In the peak near the 85th residue, the apo receptor showed higher mobility. The radius of gyration is a measure to determine its degree of compactness. A relatively steady value of radius of gyration means stable folding of a protein. Fluctuation of radius of gyration implies the unfolding of the protein. According to Fig. [123]13, the Levothyroxine-receptor complex went under less folding according to the Rg (nm) values. Solvent Accessible Surface Area (SASA) is used in MD simulations to predict the hydrophobic core stability of proteins. The higher the SASA value, the higher the chance of destabilization of the protein due to solvent accessibility. The SASA value of apo receptor and drug-receptor was ~ 65 nm^2 initially. However, after 90 ns, the values of both proteins overlapped (Fig. [124]14). Fig. 11. [125]Fig. 11 [126]Open in a new tab RMSD profile of apo receptor (Purple) and Levothyroxine-receptor complex (Green). Fig. 12. [127]Fig. 12 [128]Open in a new tab RMSF profile of apo receptor (Purple) and Levothyroxine-receptor complex (Green). Fig. 13. [129]Fig. 13 [130]Open in a new tab Rg profile of apo receptor (Purple) and Levothyroxine-receptor complex (Green). Fig. 14. [131]Fig. 14 [132]Open in a new tab SASA profile of apo receptor (Purple) and Levothyroxine-receptor complex (Green). Discussion AD is a progressive and intricate condition that affects multiple brain functions. Besides the causative genes, various risk factors can also contribute to the progression of the disease. Therefore, using transcriptomics analysis is essential in comprehending the underlying mechanisms of the disease and identifying potential targets for treatment. In this study, 221 RNA-Seq samples were collected from the GEO database and these samples were found to have high-quality scores. This score is an indication of the accuracy of the base call, and a higher score is generally preferable. However, it is common for the quality of reads to decrease towards the 3’ end, and if the quality of certain bases falls below a certain threshold, they need to be removed. For this reason, the trimming process is necessary to eliminate poor-quality bases, trim adaptor sequences, and ensure high-quality reads. Here, the trimming step was not required since the sample reads were already of high quality. Differentially expressed genes (DEGs) play a crucial role in identifying the biological pathways involved in various diseases including NDs. The main objective of DEG analysis is to identify genes that are either upregulated or downregulated in a disease state compared to healthy controls. Because the upregulation or downregulation of specific genes can cause disruptions in metabolic, immune, and other pathways, potentially leading to disease development^[133]39. Therefore, identifying DEGs can help us understand the mechanisms underlying disease and develop targeted therapies to treat them. Here, ISG15, HRNR, MTATP8P1, MTCO3P12, DTHD1, DCX, ST8SIA2, NNAT, PCDH11Y were found to be most significantly upregulated while the most significantly downregulated genes included LTF, XIST, and TTR. GO functional analysis revealed that the Alzheimer’s-causing genes were significantly enriched with positive regulation of osteoblast proliferation and development, bone morphogenesis, purine-containing compound metabolic process, regulation of Tumor Necrosis Factor superfamily cytokine production, positive regulation of Toll-Like Receptor 4 signaling pathway, negative regulation of ATP-dependent activity which causes bone loss, inflammation, damage to nerve cells and surrounding brain tissue which were previously reported by other studies as well^[134]40–[135]45. One CC term phagocytic vesicle was significantly associated with AD. This was observed to hinder phagocytic process and can cause reduced clearance of accumulated dystrophic neurites^[136]46. In accordance with other studies, we have also found MF terms such as Cysteine-Type endopeptidase inhibitor activity, Iron ion binding, protein Serine/Threonine kinase activator activity, protein kinase activator activity to be enriched in downregulated genes^[137]47–[138]49. Finally, one upregulated gene named ISG15 was predicted to be involved in RIG-I-like receptor signaling pathway which may lead to the inflammatory response through the activation of type I IFN production in AD^[139]50. Additionally, two downregulated genes such as TTR and LTF in AD were predicted to be involved in thyroid hormone synthesis, amyloid fiber formation, diseases associated with visual transduction, neutrophil degranulation, and canonical retinoid cycle in rods (Twilight Vision) pathways. The downregulation of these two genes may lead to the cognitive impairment, amyloid-beta accumulation and retinal dysfunction and visual impairments due to the disruption in thyroid hormone signaling^[140]51–[141]53. The hub genes in the upregulated and downregulated network reveal important insights into AD etiology and progression mechanisms particularly emphasizing the roles of neuroinflammation, immune dysregulation, and impaired protein homeostasis. In the upregulated network, CXCL11, GZMB, IFNG, IFNL1, and ISG15 emerge as central players in the inflammatory cascade associated with AD. CXCL11, a chemokine involved in T-cell recruitment, is a chemokine that has been reported to play a role in the chronic neuroinflammation such as those observed in AD patients^[142]54. Higher neuronal apoptosis and extracellular matrix degradation, hinging on higher synaptic loss and neuronal death, might be mediated through the overexpression of GZMB^[143]55. Upregulation of IFNG reduces the clearance of amyloid-beta plaques, which in turn contributes directly to the etiology of AD^[144]56. There has been limited research on the significance of IFNL1 and ISG15 in AD, but its overexpression suggests that interferon-mediated immune responses are persistently activated, similar to the chronic immunological activation observed in AD brains^[145]57,[146]58. Decreased expression of hub genes- CXCR4, IL1R2, LTF, MMP8, and TTR indicates disturbances in protein homeostasis and compromised neuroprotective mechanisms. Neuronal survival and amyloid-beta clearance is hampered as a result of CXCR4 downregulation, which in turn aids the progression of AD^[147]59. Decreased expression of IL1R2 and LTF promotes neuroinflammatory reactions and oxidative stress which are important elements in AD progression^[148]60,[149]61. It has been suggested that a buildup of pathogenic protein aggregates might occur as a result of reduced levels of MMP8^[150]62. Downregulation of TTR is known to impede the clearance of amyloid-beta aggregation resulting in plaque development in AD^[151]63. TTR gene product Transthyretin has been reported to inhibit Amyloid-β accumulation and thus prevent the spread of AD^[152]64,[153]65. Hyperactivation of neutrophils is a common feature associated with the progression of AD. Our analysis uncovered a relationship between neutrophil degranulation and the TTR gene as well^[154]66,[155]67. Since we have also identified TTR as the most downregulated gene in AD patients, it can be predicted as a potential biomarker for AD. Previous studies have also suggested that this gene can be targeted for treating AD^[156]68. However, the novelty of our study lies in the fact that following whole transcriptome analysis, we looked for the druggability of the targets and discovered that the FDA-approved drug, Levothyroxine, might be an effective repurposable drug against the Transthyretin protein encoded by the TTR gene. In this study, Transthyretin protein and Levothyroxine drug showed effective interaction with a high binding energy (– 5.1 kcal/mol) which indicated that Levothyroxine can potentially activate the proteins activity. The stability of a Levothyroxine-receptor complex was investigated in this study using molecular dynamics simulations. In order to assess the conformational changes in the protein structure upon drug binding, Root Mean Square Deviation (RMSD) analysis was performed. The drug’s steady binding to the receptor was shown by the gradual increase in the apo receptor’s RMSD profile around 4.0 nm and the drug-receptor complex’s RMSD value being stable at 2.5 nm after 50 ns. The regional flexibility of the protein was ascertained using the Root Mean Square Fluctuation (RMSF) technique. The ~ 20th and ~ 85th residues were where the RMSF peaks were found. When compared to the drug-receptor complex, the apo receptor had more mobility at the peak near the 85th residue. According to this finding, the drug binding may help to stabilize certain areas of the protein structure. The degree of protein compactness was looked at using the radius of gyration (Rg) study. Less folding was evident in the Rg (nm) values of the Levothyroxine-receptor complex compared to the apo receptor, suggesting that the compactness of the protein may have been impacted by the drug’s interaction. Lastly, Solvent Accessible Surface Area (SASA) analysis was used to forecast the stability of proteins’ hydrophobic cores. Initial SASA values for the apo receptor and drug-receptor complex were both 65 nm^2, but after 90 ns, the values converged, showing that the drug’s interaction did not disrupt the protein’s hydrophobic core. Overall, the study’s findings point to the stability of the Levothyroxine-receptor complex and the possibility that the drug’s binding may have had an impact on the protein’s stability and structure, particularly in areas with high mobility. The development of more effective and targeted medications that target this receptor may be affected by these discoveries. Conclusion In this study, 7 DEGs of AD were predicted and they could be targeted as potential biomarkers for the diagnosis of AD. These genes could be used to monitor disease progression and treatment response, leading to more personalized treatment options. One repurposable drug candidate, Levothyroxine was identified whose interaction with the target Transthyretin was confirmed through molecular docking and dynamics simulation analysis. Overall, the identification of potential biomarkers and therapeutics for AD could have significant implications for the diagnosis, treatment, and management of this incapacitating condition. However, in vitro and in vivo studies are necessary for further validation of our findings. Electronic supplementary material Below is the link to the electronic supplementary material. [157]Supplementary Material 1^ (9KB, xlsx) [158]Supplementary Material 2^ (7.8KB, xlsx) [159]Supplementary Material 3^ (7.2KB, xlsx) [160]Supplementary Material 4^ (7.2KB, xlsx) [161]Supplementary Material 5^ (46.4KB, xlsx) [162]Supplementary Material 6^ (21.9KB, xlsx) Author contributions A.B.L, I.A., and A.B. conducted the analysis. A.B.L, I.A. wrote the original draft of the manuscript. M.U.H., A.I., Z.M.C., and K.C.D. reviewed the manuscript. M.S. and C.A.K. supervised the research project. Data availability The datasets analysed during the current study are available in the NCBI Gene Expression Omnibus (GEO) ([163]https://www.ncbi.nlm.nih.gov/geo/) repository under the following accession IDs: PRJNA675355, PRJNA767074, PRJNA796229, PRJNA232669, PRJNA377568, PRJNA413568. Declarations Competing interests The authors declare no competing interests. Footnotes The original online version of this Article was revised: In the original version of the Article, Figure 9 was a duplication of Figure 14. The Figure legends were correct at the time of publication. Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Anika Bushra Lamisa, Ishtiaque Ahammad and Arittra Bhattacharjee contributed equally to this work. Change history 2/11/2025 A Correction to this paper has been published: 10.1038/s41598-025-87907-7 References