Abstract Cognitive dysfunction caused by diabetes has become a serious global medical issue. Diabetic kidney disease (DKD) exacerbates cognitive dysfunction in patients, although the precise mechanism behind this remains unclear. Here, we conducted an investigation using RNA sequencing data from the Gene Expression Omnibus (GEO) database. We analyzed the differentially expressed genes in DKD and three types of neurons in the temporal cortex (TC) of diabetic patients with cognitive dysfunction. Through our analysis, we identified a total of 133 differentially expressed genes (DEGs) shared between DKD and TC neurons (62 up-regulated and 71 down-regulated). To identify potential common biomarkers, we employed machine learning algorithms (LASSO and SVM-RFE) and Venn diagram analysis. Ultimately, we identified 8 overlapping marker genes (ZNF564, VPS11, YPEL4, VWA5B1, A2ML1, KRT6A, SEC14L1P1, SH3RF1) as potential biomarkers, which exhibited high sensitivity and specificity in ROC curve analysis. Functional analysis using Gene Ontology (GO) revealed that these genes were primarily enriched in autophagy, ubiquitin/ubiquitin-like protein ligase activity, MAP-kinase scaffold activity, and syntaxin binding. Further enrichment analysis using Gene Set Enrichment Analysis (GSEA) and Gene Set Variation Analysis (GSVA) indicates that these biomarkers may play a crucial role in the development of cognitive dysfunction and diabetic nephropathy. Building upon these biomarkers, we developed a diagnostic model with a reliable predictive ability for DKD complicated by cognitive dysfunction. To validate the 8 biomarkers, we conducted RT-PCR analysis in the cortex, hippocampus and kidney of animal models. The results demonstrated the up-regulation of SH3RF1 in the cortex, hippocampus and kidney of mice, which was further confirmed by immunofluorescence and Western blot validation. Notably, SH3RF1 is a scaffold protein involved in cell survival in the JNK signaling pathway. 
Based on these findings, we propose that SH3RF1 may be a common gene expression feature that influences DKD and cognitive dysfunction through the apoptotic pathway.

Keywords: Diabetic kidney disease, Cognitive dysfunction, Machine learning algorithms, RNA sequencing, SH3RF1

Subject terms: Biological techniques, Computational biology and bioinformatics, Neuroscience, Biomarkers, Nephrology

Introduction

Cognitive dysfunction, diabetes, and diabetic kidney disease (DKD) are three of the most pressing epidemics of our time, with considerable overlap in risk factors, comorbidities, and putative pathophysiological mechanisms. Cognitive dysfunction is a complication of many diseases, and diabetes and DKD are independent risk factors for it. Diabetes is a common chronic metabolic disease that affects multiple systems and is projected to affect 693 million adults by 2045^[40]1; 90% to 95% of cases are type 2 diabetes mellitus (T2DM)^[41]2,[42]3. In T2DM, the prevalence of mild cognitive impairment (MCI) is 20% to 30%, while dementia affects about 17.3% of patients. A 2014 study reported that cognitive decline in middle-aged patients with diabetes may be as much as 19% greater over the following 20 years, and that a longer duration of diabetes is also associated with greater late-life cognitive decline^[43]4. Another study found that the risk of all-cause death increased by 33% and 50% in patients with diabetes with mild and severe cognitive dysfunction, respectively^[44]5. At present, the specific mechanism of diabetes-related cognitive dysfunction is not fully understood.
In addition to chronic hyperglycemia, which produces more advanced glycation end products and reactive oxygen species (ROS) in the central nervous system (CNS) that are potentially toxic to neurons^[45]6, insulin resistance (IR)^[46]7, amyloid deposition, chronic inflammation in the brain^[47]8,[48]9, mitochondrial damage^[49]10,[50]11, oxidative stress^[51]11,[52]12, increased blood–brain barrier (BBB) permeability^[53]13,[54]14, iron metabolism disorder^[55]15, intestinal flora metabolism disorder and other factors may also participate in the process of cognitive dysfunction^[56]16,[57]17. DKD is the leading cause of chronic kidney disease^[58]18, with approximately 43.5% of T2DM patients developing clinically significant DKD, characterized by excessive proteinuria and/or decreased renal function (lower estimated glomerular filtration rate [eGFR])^[59]19. Whether DKD aggravates cognitive dysfunction beyond the effect of diabetes itself is unknown, but in DKD the influence of kidney injury on cognitive function should be considered in addition to diabetes-related factors. Evidence shows that the prevalence of cognitive dysfunction in patients with chronic kidney disease is three times higher than in the general population. With the progression of dialysis treatment, the risk of cognitive dysfunction increases significantly and its severity worsens^[60]20. The pathogenesis is thought to involve decreased glomerular filtration rate, increased albuminuria^[61]21, accumulation of toxins^[62]22 (creatinine, urea nitrogen, 4-hydroxyphenylacetate, uric acid, parathyroid hormone, etc.), water and electrolyte disorders (hyperkalemia, hyperphosphatemia, hypermagnesemia, etc.), cerebrovascular damage caused by renal anemia and renal hypertension, and activation of the renin-angiotensin system^[63]23–[64]26. In end-stage chronic kidney disease, organ transplantation is the only remaining option.
Before a suitable donor is obtained, the patient's cognitive state is also one of the indicators of whether the operation can be performed in the perioperative period. Therefore, there is an urgent need to explore new targets to protect cognitive function and reduce neurological damage in DKD patients. Numerous studies confirm that the brain regions related to learning and memory are mainly the prefrontal cortex, hippocampus, and medial temporal lobe^[65]27,[66]28. Memory information flows and is integrated between the cortex, hippocampus, and medial temporal lobe^[67]29,[68]30. We therefore selected temporal cortex data from diabetic patients and kidney data from DKD patients to conduct an omics-based comorbidity analysis and to better understand the relationship between the two diseases. The genes differentially expressed in both diseases were screened by machine learning, and the relative mRNA expression levels of the up-regulated marker genes were measured in the cortex, hippocampus and kidney of DKD mice by real-time quantitative polymerase chain reaction (RT-PCR). Based on the RT-PCR results, the final common biomarkers were further validated in laboratory animals. Our comprehensive analysis showed that the SH3RF1 gene was significantly up-regulated in the cortex, hippocampus and kidney of DKD mice with cognitive dysfunction. SH3RF1 encodes an important scaffold protein in the JNK apoptosis pathway. Therefore, targeting this pathway is a potential therapeutic option for treating DKD and cognitive dysfunction.

Materials and methods

Data collection and differential gene expression analyses

Microarray series from the Gene Expression Omnibus (GEO) database were searched using the following keywords: DKD, diabetes, cognition, Alzheimer's disease.
Two datasets ([69]GSE199838 and [70]GSE161355) met the inclusion criteria: (1) the experimental data type was microarray or high-throughput sequencing; (2) the study samples were from humans.

Data processing

The platform for [71]GSE199838 was [72]GPL21290 (Illumina HiSeq 3000); the dataset contains kidney tissues from renal biopsies of 3 DKD patients and normal tissues from 3 non-diabetic renal cancer patients undergoing surgical resection. The platform for [73]GSE161355 was GPL570 ([HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array); the dataset contains human temporal cortex neuron, astrocyte, and endothelial cell-enriched RNA from 6 cases with self-reported T2DM in the Cognitive Function and Ageing Study neuropathology cohort and 5 age- and sex-matched controls. We downloaded the normalized expression matrix of each dataset and re-annotated the probes using the dataset's annotation file. During re-annotation, we performed ID conversion, and when a gene was represented by multiple probes, the average expression value was taken as the gene expression value. Summarized information on the analyzed data is provided in Table [74]1. We used the limma package to filter, log2 transform, and normalize the samples in these datasets. The two datasets were processed separately^[75]31. Genes with |logFold Change| > 2 and P < 0.05 were defined as differentially expressed genes (DEGs). The genes shared by the two DEG lists were then obtained by Venn diagram analysis.

Table 1. Summary of diabetic kidney disease and diabetes with cognitive dysfunction associated microarray datasets from the Gene Expression Omnibus database.

Series accession    Platform        Sample size
[76]GSE199838       [77]GPL21290    DKD: 3, NC: 3
[78]GSE161355       [79]GPL570      AST-DM: 6, AST-NC: 5, BV-DM: 6, BV-NC: 5, NEU-DM: 6, NEU-NC: 5

AST astrocytes, BV brain neurovascular endothelial cells, NEU cortical neurons.
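The probe averaging, DEG thresholding and intersection steps described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' pipeline (which used limma in R): the function names, the toy probe IDs and gene names, and the reading of the fold-change cutoff as an absolute value are all hypothetical choices made for the example.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Average the expression of all probes that map to the same gene."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene:  # probes without a gene annotation are dropped
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


def split_degs(stats, lfc_cutoff=2.0, p_cutoff=0.05):
    """Return (up, down) gene sets from {gene: (log fold change, P value)}."""
    up = {g for g, (lfc, p) in stats.items() if lfc > lfc_cutoff and p < p_cutoff}
    down = {g for g, (lfc, p) in stats.items() if lfc < -lfc_cutoff and p < p_cutoff}
    return up, down


def shared_degs(stats_a, stats_b):
    """Intersect two datasets' DEGs, keeping the direction of change."""
    up_a, down_a = split_degs(stats_a)
    up_b, down_b = split_degs(stats_b)
    return up_a & up_b, down_a & down_b


# Hypothetical toy data (not taken from GSE199838 or GSE161355).
toy_probes = {"p1": 6.0, "p2": 8.0, "p3": 5.0, "p4": 1.0}
toy_map = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
gene_expression = collapse_probes(toy_probes, toy_map)

kidney = {"GENE_A": (2.5, 0.01), "GENE_B": (-3.0, 0.02), "GENE_C": (2.2, 0.20)}
cortex = {"GENE_A": (3.1, 0.03), "GENE_B": (-2.4, 0.04), "GENE_C": (2.8, 0.01)}
shared_up, shared_down = shared_degs(kidney, cortex)
print(gene_expression, shared_up, shared_down)
```

Keeping the up- and down-regulated sets separate before intersecting means a gene counts as shared only when it changes in the same direction in both datasets, which matches the way the overlap is reported as up-regulated and down-regulated groups.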
Enrichment analyses for DEGs

Enrichment analysis of the DEGs shared by the two datasets was carried out with the clusterProfiler (version 4.6.2) and org.Hs.eg.db (version 3.17.0) packages and the KOBAS website, which were used to perform Gene Ontology (GO)^[81]32 and Kyoto Encyclopedia of Genes and Genomes (KEGG)^[82]33 pathway enrichment analysis with a screening criterion of P < 0.05.

LASSO and SVM-based machine learning algorithm for common biomarkers and validation

The Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression^[83]34 and Support Vector Machine-Recursive Feature Elimination (SVM-RFE)^[84]35 algorithms were employed independently to screen common biomarkers in the DEGs from the selected modules. Genes selected by both machine learning algorithms were regarded as common biomarkers. The pROC package in R was used to plot receiver operating characteristic (ROC) curves to verify the validity and predictive accuracy of the diagnostic biomarkers. Diagnostic biomarkers with an area under the curve (AUC) > 0.7 were considered useful for disease diagnosis^[85]34. Gene Set Enrichment Analysis (GSEA) and Gene Set Variation Analysis (GSVA) were then employed to clarify the enriched KEGG pathways, with the MSigDB gene set “c2.cp.kegg.v7.4.symbols.gmt” used as a reference.

Animal experiment

A total of 12 male mice (14 weeks old), 6 C57BL/6 mice and 6 db/db mice, were purchased from the Chongqing Medical University Animal Experiment Center (Chongqing, China). Both groups of mice were tested in the Morris water maze for a week to evaluate their cognitive function. The experimental protocol was approved by the Research Ethics Committee of Guizhou Provincial People's Hospital (Animal Ethics Approval Number (2022) 257). All animal experiments were carried out following the Guide for the Care and Use of Laboratory Animals.
Mice were housed in standard polypropylene transparent cages under environmentally controlled conditions (temperature, 24 °C; relative humidity, 55 ± 10%; 12 h light:12 h dark cycle). The mice were fed a standard pellet diet (Amrut mice feed, Chongqing, China) and had free access to water.

Biochemical index detection

Blood glucose levels were measured using the Roche Dynamic Blood Glucose Monitoring System (Roche, Mannheim, Germany) by blood sampling from the tail vein. Blood and urine were collected from the 6 healthy C57BL/6 mice and the 6 db/db mice. Serum and urine creatinine were determined by the sarcosine oxidase method according to the kit instructions (C011-2-1, Creatinine Assay Kit, Nanjing), blood urea nitrogen (BUN) by the urease method (C013-2-1, Urea Assay Kit, Nanjing), and urine albumin by the CBB method (C035-2-1, Urine albumin test kit, Nanjing). All results were quantified by absorbance using a microplate spectrophotometer (Epoch, BioTek, US).

Immunohistochemistry and immunofluorescence

Paraffin-embedded kidney and cerebral tissues were sectioned, dewaxed, and rehydrated, and endogenous peroxidase was blocked for 10 min. Antigen retrieval was performed under high pressure in Tris–EDTA (pH 9.0) for 2 min (C1034, Solarbio, China). Sections were blocked with serum for 30 min. The tissue slices were then incubated with the primary antibodies of interest (α-SMA [1:1000 dilution, 14395-1-AP, Proteintech, China], SH3RF1 [1:200 dilution, 14649-1-AP, Proteintech, China]) overnight at 4 °C. The secondary antibody (HRP-labeled goat anti-rabbit IgG, 1:200, ab6721, Abcam, UK) was incubated at 37 °C for 30 min. For immunohistochemistry, DAB was used for colour development, and the slices were dehydrated and sealed with neutral resin. For immunofluorescence, slices were incubated with a fluorescent secondary antibody (Alexa Fluor® 488, 1:200, Abcam, UK) at room temperature for 2 h. Nuclei were stained with DAPI for 10 min.
The stained slices were visualized and imaged with a confocal microscope (Olympus, Tokyo, Japan). Image-Pro Plus software (version 6.0) was used to quantify the integrated density, which represents the sum of pixel values in an image.

TUNEL staining

TUNEL staining was used to detect cortical and hippocampal apoptosis in mice with an apoptosis kit (E-CK-A320, Elabscience, China) according to the manufacturer's instructions.

Extraction of hippocampus and cortex tissue from mouse brain

After the mice were anesthetized, they were quickly euthanized and decapitated. The fur was cut from the center of the head to expose the parietal bone. A cut was then made on the left and right at the break, and the occipital bone was lifted. The parietal bone was cut along the sagittal suture. The left and right parietal bones were gently lifted with scissors to expose the brain tissue. The brain was lifted from its base and pulled outwards until it detached from the skull. The cerebellum and brainstem were separated along the lambdoid suture. The left and right halves of the cerebral cortex were slowly separated along the midline. Both sides of the cerebral cortex were gently turned outwards to reveal the crescent-shaped hippocampus; the hippocampus on each side was gently separated, and the cortex overlying the hippocampus on each side was then separated. All tissues were stored in a freezer at −80 °C.

qPCR and western blotting

TRIzol reagent (Invitrogen) was used to extract total RNA from cortex, hippocampus and kidney, followed by reverse transcription of mRNA to cDNA. Power SYBR Green PCR Master Mix (Takara Biotechnology, Dalian, China) was used to perform RT-PCR on the cDNA. Finally, the 2^−ΔΔCT method was used to evaluate the RT-PCR results, with the GAPDH gene as a control. All common biomarkers were tested by RT-PCR, and differential genes associated with DKD and cognitive dysfunction were screened for further detection. The primer sequences are shown in Table [86]2.

Table 2. Summary of the primer sequences of the diagnostic biomarkers.

Primer name    Forward sequence                 Reverse sequence
A2ML1          5′-AGTTGGACCGCTCTGGATTG-3′       5′-CTACAGGTTTGCCGTCGAGT-3′
KRT6A          5′-GTGGCCTCAGCTCTTCTACC-3′       5′-TCTGAGCACGGGATTCTGC-3′
SH3RF1         5′-AGGAGTGGACACGGCAGAATG-3′      5′-GGTCGCTTCCTGGTGTTCTTCTTG-3′
VWA5B1         5′-GCCCTGCCTCTCATGGAATC-3′       5′-GGTCGCTTCCTGGTGTTCTTCTTG-3′
YPEL4          5′-CTGGAGCAGACCTCAAGGTGA-3′      5′-GCTAGAGCCGTGTTTCAGGA-3′
VPS11          5′-AGCTCCCTGACACGGCAGAATG-3′     5′-TCGTCAAGTGGGCTGTTAGG-3′

The differences found by RT-PCR were further verified at the protein level by Western blotting of cortex, hippocampus and kidney. Proteins were separated using sodium dodecyl sulfate/polyacrylamide gel electrophoresis (SDS–PAGE), transferred onto a polyvinylidene fluoride (PVDF) membrane, and subjected to immunoblot analysis. Blotting was performed using antibodies against the selected biomarkers. After rinsing with Tris-buffered saline supplemented with 0.1% Tween 20 (TBST), the membranes were incubated with a horseradish peroxidase (HRP)-conjugated anti-rabbit antibody for 2 h at 4 °C. Specific signals were detected using enhanced chemiluminescence (Millipore, Merck, USA) on the ChemiDoc™ Touch Imaging System (Bio-Rad, Hercules, CA, USA). Quantification was performed by measuring band intensity using ImageJ.

Statistical analysis

Bioinformatic analyses and visualization were performed in R software (version 4.1.3). All t-tests in this study used P < 0.05 and |logFold Change| > 2 as the criteria for statistical significance. The data from the animal experiments are expressed as means ± SDs. A minimum of three biological replicates were included for all mouse experiments, including histopathology, immunofluorescence, TUNEL, RT-PCR and western blotting; representative experiments are shown. Statistical analyses of the animal experiment data were performed using GraphPad Prism, version 8 (GraphPad Software, San Diego, CA, USA).
A P < 0.05 was considered statistically significant (*P < 0.05; **P < 0.01; ***P < 0.001; and ****P < 0.0001).

Results

Identification of common DEGs between DKD and TC

The raw data and platform information were downloaded from the GEO database. After re-annotation, 19,027 genes in [88]GSE199838 ([89]GPL21290) and 20,824 genes in GSE161355 ([90]GPL570) were obtained. To increase the signal and reduce the fraction of false-positive findings, the two datasets were batch normalized to reduce variability. The raw and normalized data are visualized in Fig. [91]1.

Fig. 1. Normalization of raw data in Gene Expression Omnibus database. The left represents raw data and the right represents data after normalization.

A total of 3408 differentially expressed genes (DEGs) were identified in the DKD dataset using the limma package, of which 1574 genes were up-regulated and 1834 genes were down-regulated, and 3612 DEGs were obtained in the TC dataset, of which 1799 genes were up-regulated and 1813 genes were down-regulated. The volcano plots of dysregulated genes and heatmaps of the top 50 dysregulated genes in the DKD and TC samples are shown in Fig. [94]2A–D. After taking the intersection of the DEGs in the DKD and TC datasets, a total of 133 overlapping DEGs were identified, including 62 commonly up-regulated genes and 71 commonly down-regulated genes (Fig. [95]3A and supplementary Table [96]S1).

Fig. 2. Volcano plot of DEGs and heatmap of DEGs. (A,B) Volcano plot of dysregulated genes and heatmap of the top 50 dysregulated genes in the DKD samples. (C,D) Volcano plot of dysregulated genes and heatmap of the top 50 dysregulated genes in the TC samples. DEGs were identified with limma at P < 0.05 and |logFold Change| ≥ 2. DKD, diabetic kidney disease; TC, temporal cortex.

Fig. 3. Venn diagram of common DEGs and functional enrichment analysis of the common DEGs.
(A) A total of 62 common up-regulated genes and 71 common down-regulated genes were identified in DKD and TC. DEGs, differentially expressed genes; OB, Obesity; PD, Periodontitis. (B) KEGG pathway analysis of the DEGs. (C) GO enrichment results of the DEGs for the categories of biological process, cellular component and molecular function. The size of the bubble represents the number of genes enriched in a particular gene set, and the color of the bubble represents the P value. An adjusted P < 0.05 was considered statistically significant. DKD, diabetic kidney disease; TC, temporal cortex; KEGG, Kyoto Encyclopedia of Genes and Genomes; GO, Gene Ontology.

GO and KEGG enrichment analysis of DEGs

To investigate the potential biological processes and pathways of the DEGs, we performed GO and KEGG enrichment analyses separately using the clusterProfiler package and the KOBAS website. The KEGG enrichment analysis showed that these genes were mainly enriched in “Parkinson disease” and “Non-alcoholic fatty liver disease” (Fig. [101]3B). The GO analysis of up-regulated genes showed that terms related to cellular aerobic respiration, electron transport and integrity of the respiratory chain, mitochondrial membrane function, lysosomal membrane function and oxidoreductase activity, among others, were significantly enriched across the biological process, cellular component and molecular function categories. The GO analysis of down-regulated genes showed that terms related to neuronal migration, cortical GABAergic neuronal differentiation, glucose 6-phosphate metabolism, synaptic growth regulation, postsynaptic membrane function, ion channel activity and glutamate receptor binding, among others, were significantly enriched across the same three categories (Fig. [102]3C). The analysis results are shown in supplementary Tables [103]S2, [104]S3.
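Over-representation tools such as clusterProfiler typically score each GO term or KEGG pathway with a one-sided hypergeometric test. A minimal Python sketch of that test is given below for orientation; it is not the authors' code, the function name is hypothetical, and the background size, term size and overlap are illustrative values. Only the query size of 133 shared DEGs comes from the text.

```python
from math import comb


def enrichment_p_value(background, term_size, query_size, overlap):
    """One-sided hypergeometric P value for over-representation.

    background: number of annotated genes in the universe
    term_size:  number of background genes annotated to the term
    query_size: number of genes in the query list (e.g. the shared DEGs)
    overlap:    number of query genes annotated to the term
    Returns P(X >= overlap) when query_size genes are drawn at random.
    """
    total = comb(background, query_size)
    upper = min(term_size, query_size)
    tail = sum(
        comb(term_size, i) * comb(background - term_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 133 shared DEGs, a term with 200 annotated genes,
# a background of 20,000 genes and 8 DEGs falling in the term.
p_raw = enrichment_p_value(20000, 200, 133, 8)
print(f"raw P value: {p_raw:.3g}")
```

The value returned is a raw P value for a single term. When many terms are tested, a multiple-testing correction (for example Benjamini-Hochberg) is normally applied before a term is called significant.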
The common biomarker identification and verification

The DEGs were screened with machine learning algorithms. All genes were ordered by “average decreased accuracy” and “average decreased Gini coefficient”. The larger the two values, the more closely the gene is related to DKD and cognitive dysfunction. A total of 13 genes (ZNF564, VPS11, GNAL, NIF3L1, YPEL4, VWA5B1, A2ML1, LOC101928851, KRT6A, ANKRD53, SEC14L1P1, SH3RF1, DEFB132) from the selected modules were identified as potential diagnostic biomarkers through the LASSO regression algorithm (Fig. [105]4A,B). With the SVM-RFE algorithm, another 13 genes (VPS11, ZNF564, ARL6IP1, A2ML1, KRT6A, SH3RF1, YPEL4, SMIM13, SEC14L1P1, RAB43, LMBRD1, NDEL1, VWA5B1) were extracted from these modules as candidate biomarkers (Fig. [106]4C). Eight genes (A2ML1, KRT6A, SEC14L1P1, SH3RF1, VPS11, VWA5B1, YPEL4, ZNF564) were shared between the two lists, as shown in a Venn diagram, and were taken as common biomarkers of the two diseases (Fig. [107]4D). To estimate their predictive utility, ROC curve analysis was performed, and all eight genes showed significant discriminating ability in the DKD and TC datasets. The AUC values of the 8 biomarkers were all greater than 0.7, indicating good predictive ability (Fig. [108]4E,F).

Fig. 4. Identification of the common biomarkers by machine learning and validation of the diagnostic biomarkers. (A,B) LASSO regression analysis. (C) SVM-RFE algorithm. (D) Venn plot showing the biomarkers shared between LASSO and SVM-RFE. (E,F) ROC curves for evaluating the diagnostic ability in the DKD and TC datasets.

GO enrichment analysis of common biomarkers

The results of GO analysis revealed that biological processes (BP) such as “negative regulation of endopeptidase, peptidase, proteolysis and hydrolase activity”, “regulation of killing of cells of other organism” and “positive regulation of cytoplasmic transport” were significantly enriched.
In terms of cellular component (CC), terms such as “vesicle tethering complex”, “keratin filament” and “autophagosome” were significantly enriched. In terms of molecular function (MF), terms such as “ubiquitin protein ligase activity”, “molecular adaptor activity” and “ubiquitin-protein transferase activity” were significantly enriched (Fig. [111]5A). In our analysis, no term had an FDR value below 0.05. We therefore relaxed the threshold and present the results based on P values (supplementary Table [112]S4).

Fig. 5. GO enrichment analysis of common biomarkers. GO enrichment results of the DEGs for the categories of biological process, cellular component and molecular function. The size of the bubble represents the number of genes enriched in a particular gene set, and the color of the bubble represents the P value. An adjusted P < 0.05 was considered statistically significant.

Gene set enrichment analysis and gene set variation analysis

The 8 diagnostic biomarkers were each analyzed by GSEA and GSVA. The GSEA results suggest that up-regulation or down-regulation of these 8 diagnostic biomarkers may be involved in the occurrence of diabetes, Alzheimer's disease, Parkinson's disease and Huntington's disease, and may affect the development of cognitive dysfunction in DKD through the hematopoietic cell lineage, cell activation receptor interactions, oxidative phosphorylation, neuroactive ligand receptor interactions, autophagy, ECM receptor interactions, focal adhesion, complement and coagulation cascades, ubiquitin-mediated proteolysis and intestinal immune network for IgA production pathways (Fig. [115]6A). The GSVA results displayed the differences in biological behavior and pathways between the clusters defined by the 8 diagnostic biomarkers.
The high A2ML1 expression phenotype was mainly enriched in the glycosaminoglycan biosynthesis heparan sulfate, alanine aspartate and glutamate metabolism, steroid biosynthesis, oxidative phosphorylation, proteasome, terpenoid backbone biosynthesis, aminoacyl tRNA biosynthesis and Parkinson's disease pathways. The low A2ML1 expression phenotype was predominantly enriched in the complement and coagulation cascades and cytokine receptor interaction pathways. The low SH3RF1, VPS11 and ZNF564 expression phenotypes were predominantly enriched in the glycosaminoglycan biosynthesis heparan sulfate, glyoxylate and dicarboxylate metabolism, riboflavin metabolism and regulation of autophagy pathways. The high VWA5B1 expression phenotype was mainly enriched in the RNA degradation and glycosaminoglycan biosynthesis heparan sulfate pathways. The low VWA5B1 expression phenotype was predominantly enriched in the basal cell carcinoma, hematopoietic cell lineage, cytokine receptor interaction, complement and coagulation cascades, hedgehog signaling, allograft rejection, cell adhesion molecules (CAMs), asthma, intestinal immune network for IgA production, ECM receptor interaction, primary immunodeficiency and graft versus host disease pathways. The high YPEL4 expression phenotype was mainly enriched in the intestinal immune network for IgA production, viral myocarditis and hedgehog signaling pathways. The low YPEL4 expression phenotype was predominantly enriched in the regulation of autophagy pathway (Fig. [116]6B).

Fig. 6. Gene set enrichment analysis and gene set variation analysis. (A,B) Merged enrichment plots of A2ML1, KRT6A, SEC14L1P1, SH3RF1, VPS11, VWA5B1, YPEL4 and ZNF564 from gene set enrichment analysis and gene set variation analysis of the [119]GSE161355 dataset. The threshold for GSEA results was set as NES > 1.0, P < 0.05 and FDR < 0.25.
GSEA, gene set enrichment analysis; NES, normalized enrichment score; FDR, false discovery rate.

Validation in animal models

To understand how the eight common biomarkers are expressed in the brain and kidney, we established an animal model of DKD combined with cognitive dysfunction; the treatment scheme is shown in Fig. [120]7A. The levels of blood glucose, serum creatinine, BUN and urine albumin were significantly elevated, and urine creatinine was significantly decreased, in the db/db mice compared with the C57BL/6 mice (P < 0.05, Fig. [121]7B). The 12 mice were subjected to Morris water maze experiments after 1 week of adaptive feeding. In the place navigation test, the escape latency and total distance of both groups gradually shortened as training time increased, but the escape latency and total distance of C57BL/6 mice were significantly lower than those of db/db mice, and in the spatial probe test C57BL/6 mice crossed the platform location significantly more often than db/db mice (Fig. [122]7C), indicating that db/db mice had poorer learning and memory ability. As shown in Fig. [123]7D, HE and PAS staining of renal tissue from db/db mice showed glomerular hypertrophy, proliferation of glomerular mesangial cells, expansion of the mesangial matrix, mild vacuolation of the renal tubular epithelium, irregular thickening of the glomerular and tubular basement membranes, and glycogen deposition in the glomeruli and tubules. Oil Red O staining showed an increased number of lipid droplets, with more obvious lipid accumulation in the tubulointerstitium than in the glomerulus. Sirius red staining revealed red-stained extracellular collagen, mostly in the glomerular tissue. Staining for the renal fibrosis biomarker α-SMA confirmed that, compared with the control group, db/db mice had a small amount of fibrosis in the kidneys.
Thus, 14-week-old db/db mice were confirmed to have developed DKD with cognitive dysfunction.

Fig. 7. Experimental procedure of animal model and validation of diabetic nephropathy model. (A) The treatment protocol for the mouse model. (B) Blood glucose, serum creatinine, BUN, urine creatinine and 24 h urine albumin levels. (C) Behavioral results of the Morris water maze. (D) Renal pathological sections stained with H&E, PAS, Oil Red O and Sirius red, and immunofluorescence staining of α-SMA in renal tissue. ****P < 0.0001, **P < 0.01, *P < 0.05 vs. control mice. ns, not significant.

To verify the expression of the 8 common biomarkers in the cortex, hippocampus and kidney of mice with DKD, qPCR was performed. Because two of these biomarkers (SEC14L1P1, ZNF564) are not expressed in mice, the remaining 6 biomarkers (A2ML1, KRT6A, SH3RF1, VPS11, VWA5B1, YPEL4) were measured by qPCR. The results confirmed the up-regulation of SH3RF1, VWA5B1, YPEL4 and VPS11 in the cortex, of SH3RF1 and VWA5B1 in the hippocampus, and of KRT6A, SH3RF1, VPS11 and YPEL4 in the kidney (P < 0.05, Fig. [126]8A–C). SH3RF1 was thus up-regulated in both the brain and the kidney of DKD mice. After reviewing the relevant literature, we found that up-regulation of the SH3RF1 gene may play a role in the process of DKD and cognitive dysfunction. Therefore, we further verified it at the protein level. The western blot results showed that SH3RF1 was significantly increased in the cortex, hippocampus and kidney of DKD mice (P < 0.05, Fig. [127]8D).

Fig. 8. Validation of common biomarkers in the cortex, hippocampus and kidney of DKD mice. (A) The mRNA levels of genes in the cortex of DKD and control mice (RT-PCR, n = 4). (B) The mRNA levels of genes in the hippocampus of DKD and control mice (RT-PCR, n = 4). (C) The mRNA levels of genes in the kidney of DKD and control mice (RT-PCR, n = 4). (D) The protein levels of SH3RF1 in the cortex, hippocampus and kidney.
The final optical density (OD) values were averaged from 3 independent experiments. The data are expressed as the means ± SDs. ****P < 0.0001, ***P < 0.001, **P < 0.01, *P < 0.05 vs. control mice. ns, not significant.

Considering that SH3RF1 is a gene related to the apoptosis pathway, we further performed immunofluorescence and TUNEL staining to examine the expression of SH3RF1 and the apoptosis of cells in the mouse cerebral cortex and hippocampus. Immunofluorescence showed that SH3RF1 was mainly expressed in the cytoplasm of cortical neurons and of hippocampal DG and CA3 neurons, and that the expression of SH3RF1 was significantly increased in DKD mice (P < 0.05, Fig. [130]9A). Meanwhile, the number of apoptotic cells in the cortex and hippocampus of DKD mice was significantly increased compared with the control group (P < 0.05, Fig. [131]9B).

Fig. 9. Expression of SH3RF1 and apoptosis in the cerebral cortex and hippocampus of mice. (A) Confocal photomicrographs showing the immunoreactivity of SH3RF1 (red) in the cortex and DG region of the hippocampus in the different experimental mice. (B) TUNEL staining showing cell apoptosis (green) in the cortex and DG region of the hippocampus in the different experimental mice. The data are expressed as the means ± SDs of 3 mice per group and are representative of 3 independent experiments. ****P < 0.0001, ***P < 0.001, **P < 0.01, *P < 0.05 vs. control mice. ns, not significant.

Discussion

Diabetes, DKD, and cognitive dysfunction are chronic diseases that often co-exist in people over 65 years of age^[134]36. However, the control of blood sugar does not reduce the increased risk of cognitive impairment, because patients with cognitive impairment usually have mixed pathologies other than diabetes, such as AD, cerebrovascular injury and DKD^[135]37.
Type 2 diabetes may affect vascular degeneration and neurodegeneration through insulin changes, advanced glycosylation, chronic inflammation, and hyperglycemia, and the extent of its pathology depends on age and increased vascular risk^38. DKD shares risk factors with cognitive impairment, such as hypertension, diabetes, and hyperlipidemia^39, while the link between DKD and cognitive impairment is attributed primarily to vascular factors or the direct neurotoxic effects of uremia^22,40. It has been reported that people with chronic renal failure and albuminuria have a 35% increased risk of cognitive impairment^41, and urinary protein is the main early clinical manifestation of DKD^21,42.

Cognitive dysfunction caused by DKD results from the interaction of multiple causes, genes, and mechanisms. However, its potential mechanisms remain unclear. Recently, a large number of studies have focused on screening for related biomarkers. By transcriptome sequencing of the brains of cognitively impaired db/db mice and wild-type mice, Song et al. found that 80 genes were differentially expressed (48 up-regulated and 32 down-regulated). After further joint analysis with the metabolome and intestinal flora, 33 key genes were identified, mainly related to mitochondrial respiration, glycolysis and inflammation^43. Further single-cell sequencing demonstrated neurovascular unit (NVU) dysfunction and cell population changes in neurons, glial cells and microglia in db/db mice, among which Rps29, Tspan7, Mt3, Apoe, Ptgds, Rpsa, Actb, Mbp, Meg3, Gria1, Gad2, Rps21, Egr1 and Hba-a2 were the differentially expressed genes associated with cognition^44. This work indicated that changes in key regulated genes might contribute to neuronal maturation, metabolic changes, inflammation and other mechanisms of cognitive dysfunction in diabetes.
However, although many efforts have been made to explore novel targets for cognitive impairment caused by diabetes, present knowledge about the genes involved in both DKD and cognitive dysfunction appears insufficient, and potential biomarkers with high specificity and sensitivity are still urgently required.

In this study, 8 common biomarkers (A2ML1, KRT6A, SEC14L1P1, SH3RF1, VPS11, VWA5B1, YPEL4 and ZNF564) with diagnostic value (AUC greater than 0.7) were selected from the DEGs of the two diseases through machine learning. However, SEC14L1P1 and ZNF564 are not expressed in mice, so we used RT-PCR to verify the remaining genes in the cortex, hippocampus and kidney of DKD and control mice. KRT6A, VWA5B1, YPEL4, VPS11 and SH3RF1 were upregulated in the brain and/or kidney of DKD mice, but only SH3RF1 was highly expressed in the cortex, hippocampus and kidney alike. Among these genes, KRT6A is a member of a family of genes that code for keratin, and it has been studied in cancer^45 and hair follicles^46. VWA5B1 is expressed only at the RNA level in the brain. The function of the YPEL4 gene is unclear, although studies have shown that it is involved in maintaining the integrity of red blood cell membranes^47. The protein encoded by the VPS11 gene is mainly associated with the late endosome/lysosome and may be involved in the autophagy pathway^48. SH3RF1 is a gene involved in apoptosis. Therefore, we selected SH3RF1 as the gene for further verification. In our study, we used different methods to verify the up-regulation of SH3RF1 in the cortex, hippocampus and kidney. At the same time, TUNEL staining was used to observe the apoptosis of brain neurons in each group. Immunofluorescence showed that SH3RF1 was mainly distributed in the cortex and in the DG and CA3 regions of the hippocampus. Western blot results showed that SH3RF1 was significantly increased in the brain and kidney of DKD mice.
TUNEL staining of brain sections from each group likewise showed that apoptotic cells in the cortex and hippocampus of DKD mice were significantly increased. Our study confirms that SH3RF1 plays a role in the pathogenesis of DKD and cognitive impairment. However, whether the apoptosis of neurons is caused by SH3RF1 needs further experimental verification.

SH3RF1, also known as POSH (plenty of SH3), contains four SH3 domains and is a scaffold for signaling proteins regulating cell survival^49. Specifically, SH3RF1 promotes assembly of a complex including Rac GTPase, mixed lineage kinase (MLK), MKK7, and Jun kinase (JNK)^50. It is a scaffold protein first discovered by Tapon et al. that binds to Rac1 and activates the JNK signaling pathway^51,52. SH3RF1 is also a regulator of synaptic growth responses: it is a novel nexus linking TGF-β and JNK signaling via TAK1 (TGF-β–activated kinase 1), a JNK kinase kinase (JNKKK) essential for synaptic overgrowth in Rab8 mutants^53, which may be one of the causes of frontotemporal dementia. Other studies have observed that SH3RF1 is an intracellular signal transducer for the axon outgrowth inhibitor Nogo66 and is involved in inhibiting axon growth downstream of Nogo66/PirB^54. After SH3RF1 was knocked out, mice exhibited severe autism-like behavior as well as learning and memory deficits; abnormal dendritic spine development and synaptic transmission were detected in their hippocampal neurons^55, and the normal synaptic clustering of the NMDAR/PSD-95/SHANK complex was disrupted. Functionally, SH3RF1 regulates several cell activities, including apoptosis, synaptic growth, and neuronal migration^49,53,56.
It has been reported that SH3RF1 is an E3 ubiquitin ligase that, through its N-terminal RING finger domain, stimulates the ubiquitination of Kir1.1^57 (ROMK channels in the apical membrane of epithelial cells in the ascending limb and cortical collecting ducts of the kidney) and Expanded^58 (an upstream component of the Hippo pathway), and is thereby involved in clathrin-independent endocytosis, organ growth, tissue homeostasis, and tumorigenesis. In this study, the GO, GSEA and GSVA analyses all emphasized that ubiquitin protein ligase activity may play an important role in DKD-induced cognitive dysfunction. Although the specific mechanism is still unclear, this may become a target for further studies.

To sum up, using LASSO, SVM-RFE, GSEA and GSVA, SH3RF1 was identified as a potential common biomarker of DKD and cognitive impairment. The animal model validation confirmed that SH3RF1 is highly expressed in the cortex, hippocampus and kidney of DKD mice. The mechanism may be that SH3RF1, as a scaffold protein, participates in abnormal apoptosis of cells through the JNK pathway. In addition, SH3RF1, as an E3 ubiquitin ligase, may regulate the Hippo pathway and participate in abnormal synapse growth, which can cause neurodegeneration. Therefore, the findings may shed light on the management and treatment of DKD patients with cognitive impairment.

Supplementary Information

Supplementary Figures (3.2 MB, docx)
Supplementary Table 1 (28.8 KB, docx)
Supplementary Table 2 (44.2 KB, docx)
Supplementary Table 3 (55.9 KB, docx)
Supplementary Table 4 (31.5 KB, docx)

Author contributions

J.P., S.Y. and J.Q.Z. conceived the study, developed and designed the methodology, and drafted and wrote the manuscript. C.M.Z. and C.G.Q. analyzed and interpreted the data. K.Y.F., Y.T. and J.J.D. supervised and performed the experiments. Y.Z. contributed reagents and materials. All authors contributed to manuscript revision, and read and approved the submitted version.
Funding

Jing Peng was supported by the Science and Technology Fund project of Guizhou Provincial Health Commission [gzwkj2022-126]. Yan Zha was supported by the National Natural Science Foundation of China (82360148) and Guizhou Science and Technology Department (QKHPTRC2018-5636-2; QKHCG2023-ZD010). Kaiyun Fang was supported by Guizhou Science and Technology Department ([2019]2815).

Data availability

The authors confirm that the data supporting the bioinformatics analysis in this study are openly available in the Gene Expression Omnibus database (GSE199838 and GSE161355). All data generated or analysed during this study are included in this published article and its supplementary information files.

Competing interests

The authors declare no competing interests.

Ethical statement

The authors confirm that all procedures performed in studies involving animals were in accordance with the ARRIVE guidelines and were approved by the Research Ethics Committee of Guizhou Provincial People's Hospital (Animal Ethics Approval Number (2022) 257). There are no human subjects in this article and informed consent is not applicable.

Footnotes

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Jing Peng and Sha Yang.

Contributor Information

Jingjing Da, Email: dajingjing0927@126.com.
Jiqing Zhang, Email: zt1724@yeah.net.
Yan Zha, Email: zhayan72@126.com.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-024-72327-w.

References