Abstract

Vitamin D deficiency poses a widespread health challenge, shaped by environmental and genetic determinants. A recent discovery identified a genetic regulator, rs11542462, in the SDR42E1 gene, though its biological implications remain largely unexplored. Our bioinformatic assessments revealed pronounced SDR42E1 expression in skin keratinocytes and the analogous HaCaT human keratinocyte cell line, prompting us to select the latter as an experimental model. Employing CRISPR/Cas9 gene-editing technology and a multi-omics approach, we discovered that depleting SDR42E1 caused a 1.6-fold disruption in the steroid biosynthesis pathway (P-value = 0.03), considerably affecting crucial vitamin D biosynthesis regulators. Notably, SERPINB2 (P-value = 2.17 × 10^−103), EBP (P-value = 2.46 × 10^−13), and DHCR7 (P-value = 8.03 × 10^−09) were elevated by ∼2–3 fold, while ALPP (P-value <2.2 × 10^−308), SLC7A5 (P-value = 1.96 × 10^−215), and CYP26A1 (P-value = 1.06 × 10^−08) were downregulated by ∼1.5–3 fold. These alterations resulted in accumulation of the 7-dehydrocholesterol precursor and reduction of vitamin D3 production, as evidenced by the drug enrichment (P-value = 4.39 × 10^−06) and total vitamin D quantification (R^2 = 0.935, P-value = 0.0016) analyses. Our investigation unveils SDR42E1's significance in vitamin D homeostasis, emphasizing the potential of precision medicine in addressing vitamin D deficiency through understanding its genetic basis.

Keywords: SDR42E1, Vitamin D biosynthesis, Steroidogenesis, CRISPR/Cas9, Multi-omics, HaCaT

Highlights

• SDR42E1 knockout alters key vitamin D synthesis regulators: EBP, DHCR7, ALPP, and CYP26A1.
• Multi-omics reveal SDR42E1's broad role in steroid synthesis and lipid metabolism.
• SDR42E1 knockout leads to accumulation of 7-dehydrocholesterol, hindering vitamin D production.
• SDR42E1 variant presents a promising target for addressing vitamin D deficiency.

1.
Introduction

Vitamin D deficiency, characterized by suboptimal 25-hydroxyvitamin D (25(OH)D) levels below 20 ng/mL (50 nmol/L), represents a widespread nutritional deficiency linked to critical health conditions, including osteoporosis and rickets. The intricate interplay of genetic determinants substantially influences serum 25(OH)D concentrations, with twin and family studies revealing notable variations ranging from 23 % to 90 % [1]. The prevalence of vitamin D deficiency in regions with abundant sunlight, such as the Middle East and North Africa (MENA) region, emphasizes the relevance of genetic determinants [2].

Vitamin D synthesis begins in the skin, where ultraviolet B (UVB) radiation converts 7-dehydrocholesterol (7-DHC) to vitamin D3 (cholecalciferol); 7-DHC can alternatively be supplied by the conversion of 8-dehydrocholesterol (8-DHC) via sterol-8,7-isomerase (EBP) [3]. Vitamin D3 is subsequently hydroxylated by hepatic cytochrome P450 enzymes, including CYP2R1, CYP27A1, and CYP11A1, and by renal CYP27B1 to form the biologically active 1,25-dihydroxyvitamin D3 (calcitriol). Active vitamin D acts through various receptors, such as the vitamin D receptor (VDR) and the retinoic acid receptor-related orphan receptor alpha (RORA), and is inactivated by hepatic CYP24A1 [4]. Despite this knowledge, the complex pathways and involvement of multiple genes in vitamin D synthesis hinder a comprehensive understanding of the genetic contributions to 25(OH)D variation.

Recent genome-wide association studies (GWAS) have provided crucial insights into the genetic architecture of 25(OH)D, identifying single-nucleotide polymorphisms (SNPs) statistically linked to 25(OH)D levels [5]. One notable variant lies in exon 3 of the novel, uncharacterized human short-chain dehydrogenase/reductase family 42E, member 1 (SDR42E1) gene on chromosome 16q23, a genomic locus that has received limited attention.
This variant, identified as rs11542462 in the SNP database, introduces a premature stop codon that replaces the glutamine at position 30 of the protein with a termination signal (p.Q30* GLN>*TER). This mutation potentially leads to a non-functional SDR42E1 enzyme [6], [7], [8]. SDR42E1 is a member of the extended short-chain dehydrogenase/reductase superfamily with a broad substrate specificity and potential involvement in lipid metabolism [9]. Although its precise function remains uncertain, the protein is implicated in regulating cellular processes and may impact steroid synthesis through proposed roles as an oxidoreductase and steroid delta-isomerase, potentially utilizing nicotinamide adenine dinucleotide phosphate (NAD(P)(H)) [10,11]. Interestingly, numerous genetic investigations highlight a close interrelation between SDR42E1 and the regulation of steroid hormone biosynthesis [12,13]. Given the potential importance of this gene, further investigation is crucial to elucidate the structure, biological function, and impact of SDR42E1 mutations on human health.

The nonsense variant in SDR42E1 has been associated with serum concentrations of the vitamin D precursor 8-DHC [3]. This steroidal compound is closely related to 7-DHC, a crucial precursor in vitamin D synthesis, and both accumulate in patients affected by Smith-Lemli-Opitz syndrome [14,15]. Our previous in silico characterization of SDR42E1 identified vitamin D3, 8-DHC, 7-DHC, and 25(OH)D as potential substrates of the protein [10].
In this study, we extend previous findings into a comprehensive profiling of SDR42E1 function using a multi-disciplinary approach, integrating bioinformatic analysis with clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein 9 (Cas9) gene-editing technology to mimic the SDR42E1 nonsense variant in the selected human keratinocyte HaCaT cell line.

2. Methods

2.1. In silico bioinformatics analyses

The mRNA expression and expression quantitative trait loci (eQTLs) of SDR42E1 in normal human tissues were explored through the Genotype-Tissue Expression portal (GTEx, http://commonfund.nih.gov/GTEx/). The All RNA-sequencing and Combining Chromatin Immunoprecipitation-sequencing (ChIP-seq) Sample Search Space (ARCHS4) database, which contains data on 187,964 human and mouse samples, was also used to assess the expression map of SDR42E1 in human cells and tissues, as previously described [16].

2.2. Cell culture

The human immortalized skin keratinocyte cell line, HaCaT (CRL-2309, Research Resource Identifier (RRID): CVCL_0038), and the human embryonic kidney cell line, HEK293T (CRL-3216, RRID: CVCL_0063), were obtained from the American Type Culture Collection (ATCC, Manassas, VA, United States) or generously provided by collaborators. Cells were seeded in T75 tissue culture flasks (Sigma-Aldrich, United States) and cultured in Dulbecco's modified Eagle's medium (DMEM) with (4.5 g/L) high D-Glucose, (2 mM) Glutamine, and (1 mM) Sodium Pyruvate, supplemented with 10 % fetal bovine serum (FBS) and (1×) antibiotic-antimycotic (Thermo Fisher Scientific Gibco, United States). To prevent contamination, the medium was changed every other day. At 70 %–80 % confluency, all cells were harvested through treatment with 0.25 % trypsin-ethylenediaminetetraacetic acid (EDTA, Sigma-Aldrich).
The cell pellets were subsequently rinsed once with sterile (1×) Dulbecco's phosphate-buffered saline (DPBS; Gibco), centrifuged at 900 revolutions per minute (rpm) for 3 min, and passaged at a 1:6 ratio. All cell lines were grown under sterile conditions in a monolayer culture at 37 °C, with a 5 % CO2 atmosphere and 95 % air humidity.

2.3. Plasmid construction and single guide RNA cloning

To generate a single-cell-derived SDR42E1 knockout through CRISPR/Cas9 technology, GenScript (United States) designed single guide RNAs (sgRNAs) targeting exon 3 of the human SDR42E1 gene, proximate to the p.Q30* GLN>*TER premature stop codon mutation, as illustrated in Figure S1a. These sgRNAs were synthesized by Integrated DNA Technologies (IDT, United States) with overhangs complementary to the BsmBI-digested plasmid (sgRNAs-SDR42E1, Table S2). After the BsmBI digestion and filler fragment excision using the GeneJET Gel Extraction Kit (Thermo Scientific, United States), the sgRNA guide sequences were integrated into the BsmBI site of the LentiCRISPR v2-Puro-U6 vector (a gift from Feng Zhang, RRID: Addgene_52961; http://n2t.net/addgene:52961, Addgene plasmid, United States) according to the corresponding Addgene protocol with minor modifications [17]. Briefly, oligonucleotides for the sgRNA-SDR42E1 were annealed by heating to 95 °C for 5 min and gradual cooling to 25 °C. Subsequently, they were phosphorylated using T4 Polynucleotide Kinase (New England Biolabs, United States) at 37 °C for 30 min, and the enzyme was inactivated at 70 °C for 10 min. The annealed oligonucleotides were cloned into the digested lentiCRISPR v2 backbone using T4 DNA ligase (Invitrogen, United States) at 16 °C overnight. The three obtained CRISPR/Cas9 constructs were amplified in Stbl3 chemically competent Escherichia coli (RRID: C737303, Invitrogen) through heat shock transformation and cultured in Super Optimal Broth Culturing medium (SOC, Invitrogen) for 1 h.
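The oligonucleotide design behind this cloning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the source does not give the exact overhangs, so the sketch assumes the CACCG / AAAC...C scheme commonly used for BsmBI-digested lentiCRISPR v2, and the guide sequence and function names are hypothetical placeholders rather than entries from Table S2.

```python
# Sketch: build the annealing oligo pair for a 20-nt guide.
# ASSUMPTIONS: the CACCG / AAAC...C overhang scheme and the guide below
# are illustrative; neither is taken from the source document.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def bsmbi_oligos(guide):
    """Return (forward, reverse) oligos with assumed BsmBI-compatible ends."""
    guide = guide.upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("expected a 20-nt guide made of A/C/G/T")
    forward = "CACCG" + guide                          # 5' overhang + extra G
    reverse = "AAAC" + reverse_complement(guide) + "C"  # pairs with the extra G
    return forward, reverse


# Hypothetical guide sequence, for demonstration only.
hypothetical_guide = "GACCTTGACGATCCGTTAGC"
fwd, rev = bsmbi_oligos(hypothetical_guide)
print("forward oligo:", fwd)
print("reverse oligo:", rev)
```

After annealing, the two oligos would form a duplex whose single-stranded ends match the digested backbone; the actual sequences should always be taken from the supplier's design and the cited protocol.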
Plasmids from positive colonies grown in ampicillin-supplemented (100 μg/mL) LB medium were purified using QIAprep Spin Miniprep (Qiagen, Germany). The insertion of the sgRNA cassette was confirmed for several colonies through colony polymerase chain reaction (PCR) validation and DNA Sanger sequencing using the human U6 Forward primer (U6 for Lenti-CRISPRv2, Table S2). The most efficient sgRNAs were selected for the SDR42E1 knockout experiments, namely sgRNA2-SDR42E1 and sgRNA3-SDR42E1 (Figure S1b). Non-targeting lentiCRISPR v2 plasmid (an empty vector) served as a negative control.

2.4. Lentiviral production

HEK293T cells, leveraging the crucial Simian Virus 40 (SV40) large T-antigen for viral vector production, were transiently transfected with non-targeting and SDR42E1-targeting CRISPR/Cas9 plasmids to produce lentiviral particles (LVPs). Briefly, HEK293T cells were seeded in a 10-cm Petri dish (Sigma-Aldrich) for 20–24 h. Cells were then supplemented with DMEM containing 10 % bovine serum (Gibco) and treated with (25 mM) chloroquine (Cayman, United States) to enhance LVP stability and transfection efficiency. At 65–70 % confluency, co-transfection of (12 μg) lentiviral DNA constructs with lentivirus packaging mix (Dharmacon, TLP4606, United States) was executed using calcium phosphate precipitation as described previously [18]. After 6 h of incubation at 37 °C, the culture medium was changed to standard culture medium, and cells were incubated for at least 72 h to achieve high-titer virus production. Afterward, the viral supernatant was harvested, filtered through 0.45-μm sterile low protein binding filters (Millipore, Sigma-Aldrich), and frozen at −80 °C in small aliquots, which were freshly thawed for each infection cycle.

2.5.
Generation of CRISPR/Cas9-mediated SDR42E1-edited cells

The targeted HaCaT cells were infected with viruses generated from each CRISPR/Cas9-sgRNA construct at a multiplicity of infection (MOI) of 5 in 6-well plates (Sigma-Aldrich), in the presence of (10 μg/mL) polybrene to induce sgRNA expression. After 24 h of transduction at 37 °C, puromycin (A1113803, Thermo Fisher Scientific) was applied at a concentration of 2–5 μg/mL for approximately 7 days to eliminate non-transduced cells. Cells were then harvested at 80 % confluency for DNA and mRNA extractions to verify the gene editing efficiency. The array dilution method, a robust isolation technique, was utilized to generate a monoclonal homozygous SDR42E1 gene-edited cell line from a polyclonal pool of heterozygous cells. Infected cells were isolated and sorted into single clones in a 96-well plate through a 2-fold serial dilution, first vertically and then horizontally. Three to four weeks post-infection, individual cell clones were verified for genome editing efficiency and genotyped using the T7 endonuclease 1 (T7E1) assay, reverse transcription-quantitative polymerase chain reaction (RT-qPCR), Sanger sequencing, and Western blotting. The top gene-edited clone, exhibiting homozygous knockout, was selected for further investigations into gene function and expression.

2.6. T7 endonuclease 1 mismatch assay

The efficiency of CRISPR/Cas9-mediated gene editing by the designed sgRNAs was validated through a T7E1 assay, in which the enzyme recognizes and cleaves imperfectly matched DNA, following the manufacturer's instructions. Genomic DNA was extracted from infected HaCaT cells using Quick Extract Genomic DNA buffer (A560001, AMPLIQON, Denmark), following the manufacturer's protocol.
After DNA quantification, the sgRNA genomic target site in SDR42E1 exon 3 was PCR-amplified with high-fidelity AccuPrime Taq DNA polymerase (NEB 2 U/μL, Thermo Scientific) using primers flanking the target site (T7-SDR42E1, Table S2) and purified with a GeneJET PCR purification kit (Thermo Scientific). The PCR amplicons (200 ng) were denatured and reannealed to form heteroduplex DNA, followed by digestion with T7E1 enzyme (M0302L, New England Biolabs) at 37 °C for 30 min. The digested products were run on a 2 % agarose gel at 90 V for 40 min to verify the size and specificity of the products. Control samples, including empty lentiCRISPR vector and non-T7-digested genomic samples, were used for result validation.

2.7. DNA sequencing

To determine the exact genotype of the CRISPR/Cas9-mediated SDR42E1 gene editing, Sanger sequencing of the purified PCR product of genomic DNA was conducted by Macrogen Inc. (http://macrogen.com, Korea), using the forward T7-SDR42E1 primer (T7-SDR42E1, Table S2). For mutation identification, the obtained sequences were aligned to the human genomic reference sequence of SDR42E1 from Ensembl (ENST00000328945) and compared to a non-targeting lentiCRISPR vector control using Snapgene version 5.3.2 (GSL Biotech LLC; Chicago, IL, United States).

2.8. Cell lysis and Western blot of immunoprecipitation

To confirm the loss of protein expression in SDR42E1 gene-edited cells, washed cells were lysed using ice-cold radioimmunoprecipitation assay (RIPA) buffer, supplemented with (1 mM) phenylmethylsulfonyl fluoride (PMSF), (1 M) dithiothreitol (DTT), and a protease inhibitor cocktail (UD282713, Thermo Fisher Scientific). After a 30-min incubation at 4 °C with rotation, the lysates were centrifuged at 10,000×g for 15 min at 4 °C. The supernatants were stored at −80 °C until analysis. Protein concentration was determined photometrically using a bicinchoninic acid (BCA) protein assay kit (Pierce, Thermo Fisher Scientific).
Whole-cell extracts were pre-incubated with the desired antibody, anti-SDR42E1 rabbit monoclonal antibody (Thermo Fisher Scientific Cat# PA5-53156, RRID: AB_2647060, 1:500 dilution) or rabbit αHA-tag polyclonal antibody (Thermo Fisher Scientific Cat# 71-5500, RRID: AB_2533988, 1:100 dilution), overnight with gentle rotation at 4 °C to generate specific immunocomplexes. Protein samples were then incubated with protein A/G magnetic beads (Pierce, 88802, Thermo Scientific) for 2 h with gentle rotation at 4 °C. After a brief centrifugation at 3000 rpm for 30 s to pellet the magnetic beads with bound immunocomplexes, they were placed in a magnetic separation rack and the supernatant was discarded. The pellet underwent three washes with ice-cold RIPA buffer and was finally denatured in 5× SDS-sample buffer at 37 °C for 30 min. An equal quantity of protein lysates was electrophoresed on a precast Novex NuPAGE 4–12 % Bis-Tris SDS-polyacrylamide gel using the XCell SureLock electrophoresis system (Invitrogen). Resolved proteins were wet-transferred to a nitrocellulose membrane (Millipore) using a Trans-Blot Turbo transfer system (BioRad). After 1-h blocking with 5 % bovine serum albumin (BSA, Tocris, United Kingdom) buffer at room temperature, the membrane was immunoblotted using an anti-SDR42E1-tag rabbit monoclonal antibody (1:500 dilution, WG3329739B, Invitrogen) for 3 days at 4 °C in 5 % BSA buffer. Beta-actin levels were determined as loading controls using a monoclonal anti-beta-actin mouse antibody (1:5000 dilution, Sigma-Aldrich Cat# A5441, RRID: AB_476744).
Following six washes with 1× Tris Buffered Saline with Tween 20 (TBST), membranes were visualized under chemiluminescence using peroxidase IgG fraction monoclonal anti-mouse IgG (H + L) secondary antibody HRP conjugate (1:5000 dilution, Thermo Fisher Scientific Cat# 62-6520, RRID: AB_2533947) or anti-rabbit IgG light chain specific (1:5000 dilution, Jackson ImmunoResearch Labs Cat# 211-032-171, RRID: AB_2339149, United Kingdom) with Pierce ECL Western blotting substrate (32106, Thermo Scientific).

2.9. RNA extraction and RT-qPCR

To assess the mRNA expression in the SDR42E1-edited cells, total RNA was isolated from cell lines using 1 mL TRIzol reagent (Invitrogen) according to the manufacturer's guidelines. RNA quantity and integrity were evaluated spectrophotometrically with a NanoDrop 8000 (ND-8000-GL, Thermo Scientific) and by agarose gel electrophoresis. Subsequently, 2 μg of total RNA were reverse transcribed into complementary DNA (cDNA) using the High-capacity cDNA reverse transcriptase kit (Applied Biosystems, United States) following the manufacturer's recommendations. For human gene expression analysis, each 20 μL RT-qPCR reaction utilized 5 μL of the 1:5 diluted cDNA on a QuantStudio 6 Flex System (Thermo Fisher Scientific) with PowerUp SYBR Green Master Mix (Applied Biosystems). Primers designed to amplify target genes are presented in Table S2. The thermal cycling conditions included a denaturation step at 95 °C for 10 min, followed by 50 cycles at 95 °C for 15 s and 60 °C for 1 min, concluding with a melt curve analysis. The housekeeping gene human beta-ACTIN served as an internal control to normalize variations in total RNA expression levels across samples. Relative mRNA expression levels were assessed through comparative threshold cycle (CT) analysis, and the fold changes were calculated by the 2^−ΔΔCT (delta-delta cycle threshold) method, referencing the average ΔCT value of wild-type controls and reference genes [19].

2.10.
RNA sequencing

To identify differentially expressed genes in the relevant pathway, equivalent amounts of high-quality RNA pools (100–200 ng) were precipitated in a 75 % ethanol solution and shipped to Macrogen Inc. (http://macrogen.com, Korea) for total RNA-sequencing library construction and next-generation sequencing. In summary, the RNA sequencing libraries were prepared with the TruSeq Stranded Total RNA Library Prep Kit (1000000040499, Illumina), following the manufacturer's instructions. Following PCR enrichment, the libraries were quality-checked and quantified before sequencing on the Illumina NovaSeq 6000 platform, yielding 101 bp paired-end reads. Raw sequence reads were trimmed to eliminate contaminant DNA, PCR duplicates, and adaptor sequences. Reads with a quality below Q20 were filtered using CLC Genomics Workbench version 22.0.2 software. The same software was employed for the assembly and alignment of all paired-end reads against the latest version of the human genome (Homo sapiens, GRCh38). Genes with average raw counts below 10 were excluded. Differentially expressed genes (DEG) between control and knockout samples were defined as significant if the expression change was ≥ 2-fold, accompanied by an adjusted P-value <0.05. This assessment was conducted and visualized using the R package DESeq2 version 1.34 with default parameters [20]. For gene expression heatmaps based on RNA-sequencing data, log2 of the fold-change maximum likelihood estimate (lfcMLE) values and −log10 of false discovery rate (FDR, an adjusted P-value) values were used and visualized with the R package pheatmap version 1.0.2. The Pathview R package was used to visualize Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of related genes [21].
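The significance rule stated above (at least a 2-fold change, i.e. |log2 fold change| ≥ 1, together with an adjusted P-value below 0.05) can be illustrated with a small Python sketch. It is not the DESeq2 pipeline used in the study: it only applies a plain Benjamini–Hochberg adjustment to raw P-values, whereas DESeq2 additionally performs steps such as independent filtering, so its adjusted values can differ. The gene names, fold changes, and P-values are invented toy data, and the function names are hypothetical.

```python
# Sketch: Benjamini-Hochberg adjustment followed by the DEG cutoffs
# described in the text. All inputs below are made-up toy values.

def benjamini_hochberg(pvalues):
    """Return BH-adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_degs(genes, log2fc, pvalues, min_abs_log2fc=1.0, alpha=0.05):
    """Keep genes with |log2FC| >= threshold and adjusted P-value < alpha."""
    padj = benjamini_hochberg(pvalues)
    return [
        (gene, lfc, q)
        for gene, lfc, q in zip(genes, log2fc, padj)
        if abs(lfc) >= min_abs_log2fc and q < alpha
    ]


# Toy example (hypothetical values, not study data).
toy_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
toy_log2fc = [2.0, -1.5, 0.4, 3.0, 1.2]
toy_pvalues = [0.001, 0.004, 0.03, 0.5, 0.02]

hits = significant_degs(toy_genes, toy_log2fc, toy_pvalues)
for gene, lfc, q in hits:
    print(f"{gene}\tlog2FC={lfc:+.2f}\tpadj={q:.4f}")
```

In this toy set, GENE_C is dropped for its small fold change and GENE_D for its large adjusted P-value, which shows that both criteria must hold at once.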
We conducted Gene Set Enrichment (GSE) analysis as previously described [22], against Homo sapiens gene sets from the Enrichment map, with gene lists in descending order based on lfcMLE, the unshrunk log2 fold change generated by DESeq2. Functional annotation analyses were performed with default settings using the R package ClusterProfiler version 4.2.2, an enrichment analysis tool that offers detailed visual insights into the collective functions of the input genes, encompassing Gene Ontology (GO), KEGG, and enrichment analyses. The Drug Signatures Database for Gene Set Analysis (DSigDB, http://biotechlab.fudan.edu.cn/database/drugsig/) [23] was employed to identify candidate drugs associated with the DEG. Access to DSigDB was facilitated via ClusterProfiler [24]. To validate the RNA-sequencing findings, RT-qPCR was performed on the same RNA extracts, targeting five representative genes: SDR42E1, alkaline phosphatase placental (ALPP), solute carrier family 7A5 (SLC7A5), serine protease inhibitor B2 (SERPINB2), and 7-dehydrocholesterol reductase (DHCR7) (Table S2). Furthermore, we conducted a comparative analysis between the RNA-sequencing results from SDR42E1 knockout HaCaT cells and common genes associated with vitamin D traits in the NHGRI-EBI GWAS Catalog (EFO_0004631) with a statistical significance threshold of P-value <5.0 × 10^−8, as released on January 09, 2024, and accessed on January 27, 2024 [25]. To enhance the robustness of our analysis, we also examined genetic markers located within a 250-kilobase region upstream and downstream of the reported GWAS signals, thereby enabling a comprehensive identification of significant variants associated with vitamin D.

2.11. Proteomics

Following the whole-cell extraction procedure mentioned earlier, protein samples were prepared by incubation at 37 °C for 30 min.
Subsequently, 50 μg of each protein sample, encompassing six samples from three biological replicates of homozygous SDR42E1 knockout and HaCaT controls, was loaded onto a precast Novex NuPAGE 4–12 % Bis-Tris SDS-polyacrylamide gel and run at 100 V for approximately 60 min. For protein band visualization, PageBlue Protein Staining Solution was applied overnight at 4 °C, followed by a 10-min wash with sterilized distilled water. Each gel lane was individually cut, placed in tubes, and kept at 4 °C for further mass spectrometry analysis, as detailed previously [26]. Briefly, proteins were reduced with dithioerythritol and S-carbamidomethylated with iodoacetamide before in-gel tryptic digestion. Following gel piece washing, sequencing-grade modified porcine trypsin was added for rehydration and incubated overnight at 37 °C. Extracted and desalted peptides were loaded onto a nanoflow UPLC system and separated using a gradient elution solvent. The resulting peptides were then analyzed using an Orbitrap Fusion Tribrid mass spectrometer (Thermo Scientific). Relative protein abundance was calculated using precursor ion areas from non-conflicting unique peptides. Data analysis was then conducted using Progenesis QI (version 2.2, Waters), Mascot Daemon (version 2.6.0, Matrix Science), and the R package Limma (Linear Models for Microarray Data) version 3.56.2. The Benjamini–Hochberg approach was used to convert Student's t-test-derived P-values to multiple test-corrected q-values, with a cut-off of <0.05. Enrichment analysis and functional annotations were performed for the ranked genes based on Limma-generated P-values against the Homo sapiens protein set using ClusterProfiler with standard settings.

2.12.
Vitamin D quantification

Vitamin D concentrations in the cell lysates of SDR42E1 knockouts and wild-type HaCaT cells were measured using a Human Vitamin D Enzyme-linked Immunosorbent Assay (ELISA) kit (MY BIOSOURCE, Cat# MBS735897, USA) following the manufacturer's instructions. Cell lysates were prepared by trypsinization, three rounds of ultrasonication, and centrifugation at 1000×g for 15 min to remove cellular debris. The vitamin D ELISA assay employs a competitive immunosorbent technique on a microtiter plate pre-coated with a polyclonal anti-vitamin D antibody, together with a vitamin D-HRP conjugate, to induce a color change inversely correlated with total vitamin D levels, with a sensitivity of 0.1 ng/mL. Vitamin D measurements were made spectrophotometrically at 450 nm in a microplate reader, with triplicate biological samples and a standard curve for interpolation.

2.13. Immunofluorescence

Human HaCaT cells were plated on poly-L-lysine-coated coverslips in a 12-well plate. At 40–60 % confluency, cells were transiently transfected with 5 μg of SDR42E1-HA tag plasmids (RRID: Addgene_55182, Addgene) using Lipofectamine 3000 (2293283, Thermo Fisher Scientific) following the manufacturer's protocol for 24 h. After fixation with 4 % paraformaldehyde, cells were blocked with 5 % BSA in DPBS with 0.1 % Tween-20 for 60 min at room temperature. Following three DPBS washes, cells were probed with primary antibody overnight at 4 °C, washed three times with DPBS, and then incubated with secondary and tertiary antibodies for 60 min each at room temperature.
Primary antibodies were diluted at 1:100, including rabbit αHA-tag polyclonal antibody (SG77, Thermo Fisher Scientific Cat# 71-5500, RRID: AB_2533988), mouse monoclonal anti-golgi 58K antibody (Sigma-Aldrich Cat# G2404, RRID: AB_477002), HSP60 (Heat Shock Protein 60) recombinant rabbit monoclonal antibody-mitochondrial marker (HSPD1-2206R, Thermo Scientific), and rabbit calreticulin-ER marker IgG polyclonal antibody (Thermo Fisher Scientific Cat# PA5-80402, RRID: AB_2787722). For secondary antibodies, biotinylated goat anti-rabbit IgG (H + L) antibody (65-6140, Thermo Scientific) and goat anti-rabbit IgG antibody (H + L), biotinylated (Vector Laboratories Cat# BA-1000, RRID: AB_2313606) were used at a 1:250 dilution. The tertiary antibody used was Streptavidin Alexa Fluor 488 conjugate at a 1:250 dilution (Molecular Probes Cat# S32354, RRID: AB_2315383). Nuclei were stained with Hoechst (33258, Invitrogen) at a concentration of 1 μg/mL, and slides were mounted with ProLong™ Antifade Histomount (Thermo Fisher Scientific).

2.14. Confocal microscopy

Confocal microscopy was performed using a Nikon A1R confocal fluorescence microscope with a 100× oil immersion objective. Specific parameters included laser excitations at 405 nm for Hoechst, 488 nm for Alexa Fluor 488, and 561 nm for GFP, with detector settings optimized for each fluorophore. Images were captured with consistent gain, offset, and pinhole settings across all samples. Motor arm movements of 60 μm, 0.3 mm, and 6.0 mm were utilized to ensure consistent field selection. Image reconstruction and analysis were performed using Fiji-ImageJ software [27], ensuring uniform treatment of all images to prevent bias.

2.15. Statistical analyses

Statistical analyses were performed using GraphPad Prism version 9.
An unpaired or paired two-tailed Student's t-test assessed significance between two groups, while Kruskal-Wallis one-way analysis of variance (ANOVA) with Dunn's post hoc test was used for three or more groups. A P-value cutoff of 0.05 determined significance. Results were presented as mean ± standard deviation or mean ± standard error of the mean (SEM). Experiments in each figure utilized at least three independent biological replicates for robust and reproducible data.

3. Results

3.1. Bioinformatic screening reveals high SDR42E1 expression in skin and intestinal epithelial cells

The GTEx database was utilized as an initial resource for analyzing the mRNA expression of SDR42E1 across different tissues under normal physiological conditions. Our analysis revealed the highest expression of SDR42E1 in sun-exposed and non-exposed skin, followed by the esophagus, with mean transcripts per million (TPM) values of 11.93, 11.62, and 5.08, respectively (Figure S2a). To explore the potential functional role of SDR42E1 in human tissues and cell lines, we evaluated its gene expression using the public RNA-sequencing resource ARCHS4. Our analysis showed that SDR42E1 exhibits the highest expression in intestinal epithelial cells (TPM = 8.7) and skin keratinocytes (TPM = 8.4) compared to other cell types present in intestinal and skin tissues (Figure S2b). Additionally, we observed a significantly higher expression of SDR42E1 in HaCaT cells (TPM = 11.2), a spontaneously transformed aneuploid keratinocyte line from adult human skin biopsies, and in the HCT116 cell line (TPM = 10), a human colorectal carcinoma cell line with a Kirsten rat sarcoma (KRAS) mutation derived from an adult male (Figure S2c).

3.2.
Transcriptomic profiling identifies extensive alterations in gene expressions and pathways in the SDR42E1 knockout model

To investigate the function of SDR42E1, we targeted the surrounding region of the nonsense variant with 3 different guide RNAs in HaCaT cells using the LentiCRISPR v2-sgRNA system. Successful gene editing at the SDR42E1 locus is illustrated in Fig. 2a and b, affirming the efficacy of the sgRNAs. Employing the T7E1 mismatch cleavage assay, we confirmed multiple homozygous and heterozygous gene edits through the detection of undigested and endonuclease-digested bands (Figure S1c). Subsequently, 45 gene-edited clones were subjected to genotype validation through Sanger sequencing to ascertain the specificity of the induced modifications. Interestingly, DNA sequencing of Clone 32 revealed a frame-shifting insertion of a (T) nucleotide at the Cas9 cleavage site, introducing a premature stop codon 30 amino acids downstream of the p.Q30* GLN>*TER mutation (Figure S1d).

Fig. 2. Extensive alterations in gene expressions in the SDR42E1 knockout model. a, A PCA plot demonstrates the clustering of three biological replicates of wild-type HaCaT controls (C; in rose) and SDR42E1 homozygous knockouts (Hom, in blue), through the major principal components of the regularized log-transformed counts. b, An MDS plot illustrates the correlation between log2 fold change and the mean of the normalized counts in SDR42E1 knockouts, with significant DEG highlighted in blue (adjusted P-value <0.05). c, A volcano plot depicts significant gene expression changes in SDR42E1 knockouts compared to wild-type controls. The X-axis displays the log2 fold change (FC), with upregulated genes to the right and downregulated to the left, while the Y-axis represents the false discovery rate (FDR).
Points represent individual genes with detectable expression changes, meeting the criteria of an adjusted P-value <0.05 and a log2FC > 1, with the top 20 most significantly altered genes labeled. d, A cluster heatmap shows the Z-scores of regularized log-transformed counts for the top 100 DEG, with blue indicating lower and red indicating higher expression, highlighting the distinct separation of sample conditions. The X- and Y-axes are labeled with sample names and DEGs, respectively. All analyses were conducted and visualized using R/DESeq2 and Pheatmap.

In parallel, to establish a comparative baseline model for validating gene-edited cells and to characterize the expression of SDR42E1, HaCaT cells were transiently transfected with an in-house constructed plasmid carrying the entire coding sequence of wild-type SDR42E1 to amplify its protein expression. This augmentation was essential given the minimal endogenous expression of SDR42E1. Wild-type SDR42E1 protein was detected at the anticipated molecular weight of approximately 44 kiloDalton (kDa) via Western blotting (Fig. 1a and Figure S3a) and was predominantly localized to the cytoplasm and cellular membrane in HaCaT cells (Fig. 1b), but not to the mitochondria, Golgi apparatus, or endoplasmic reticulum (data not shown). Subsequent Western blot analysis of immunoprecipitated SDR42E1 from Clone 32 confirmed complete loss of its expression (Fig. 1c and Figure S3b and c), in agreement with the RT-qPCR results (Fig. 1d). Consequently, this clone was identified as an SDR42E1 homozygous knockout and selected for further validation and functional investigations.

Fig. 1. Transient protein and transcript expression of SDR42E1 in gene-edited and wild-type HaCaT cells.
a, Immunoblotting of protein extracts from wild-type HaCaT cells (lane 1) and cells with transient overexpression of wild-type SDR42E1-HA (44 kDa; lane 2, indicated by a red arrow) was performed using rabbit SDR42E1-tag polyclonal antibody at a 1:1000 dilution. Ponceau-S Red staining (PonS) served as a loading control. b, HaCaT cells with wild-type SDR42E1 expression (green) were stained overnight with rabbit αHA-tag polyclonal antibody at a 1:100 dilution, with nuclei revealed by Hoechst (blue). Cells were transiently transfected with 5 μg of SDR42E1-HA tagged plasmid for 24 h. c, WB analysis of immunoprecipitated SDR42E1 in whole protein lysate derived from the gene-edited HaCaT cells of clone 32 (SDR42E1-KO-32) revealed the absence of a 44 kDa band corresponding to SDR42E1 (highlighted by a red arrow) using rabbit SDR42E1 polyclonal antibody (PA5-53156, Invitrogen). Before the IP experiments, one-tenth of total lysates were subjected to the respective WB as input controls using an anti-beta-actin mouse antibody (A5441, Sigma, 1:5000 dilution). Ctrl-Cas9 is an untargeted sgRNA-Cas9 vector in HaCaT as a negative control. d, RT-qPCR analysis revealed significantly decreased SDR42E1 transcript expression in SDR42E1-KO-32 compared to controls (Ctrl-Cas9). The relative expression level of SDR42E1 was normalized to the internal control β-actin. Data represent the mean ± standard deviation of three replicates with similar results; significant differences relative to the Cas9 control were analyzed by t-test (****, p < 0.0001).

The transcriptomic impact of the homozygous SDR42E1 knockout in HaCaT cells was first investigated by RNA-sequencing. Using the R package DESeq2 for differential expression analysis, 5,449 DEG were identified among 14,025 genes with read counts in the homozygous knockout cells, at an FDR below 0.05 compared to the wild-type controls ([96]Fig. 2).
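The FDR cutoff used to call DEG can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment, which the study names as the adjustment applied in DESeq2. The function name `benjamini_hochberg` and the five raw P-values below are hypothetical and serve only as an illustration; they are not study data, and DESeq2 additionally applies independent filtering, which this sketch omits.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for five genes (illustration only, not study data).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(raw_p)

# A gene is called differentially expressed when its adjusted P-value is below 0.05.
significant = [q < 0.05 for q in fdr]
print(list(zip(raw_p, fdr, significant)))
```

In this toy input, the third and fourth genes have raw P-values below 0.05 but adjusted values above it, which is why the adjusted value rather than the raw P-value decides DEG status.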
The principal component analysis (PCA) and multidimensional scaling (MDS) plots revealed a distinct separation between the SDR42E1 knockouts and the wild-type HaCaT cells ([97]Fig. 2a and b). This distinction was also evident in the heatmap displaying the expression levels of the top 100 DEG ([98]Fig. 2d), with the most statistically significant mRNA observed to be downregulated, as depicted in the volcano plot ([99]Fig. 2c). Among the significant DEG resulting from the inactivation of SDR42E1, 458 were significantly upregulated, with a log[2] fold change (FC) ≥ 0.3, including laminin subunit gamma 2 (LAMC2; FC = 1.60, FDR = 4.37 × 10^−262), pleckstrin homology like domain family A1 (PHLDA1; FC = 1.93, FDR = 1.31 × 10^−228), keratin 6A (KRT6A; FC = 1.26, FDR = 8.32 × 10^−197), keratin 18 (KRT18; FC = 1.36, FDR = 2.75 × 10^−171), SERPINB2 (FC = 3.17, FDR = 2.17 × 10^−103), and endothelial lipase G (LIPG; FC = 0.75, FDR = 7.31 × 10^−68) ([100]Fig. 2c and d). Conversely, 1,058 were downregulated with a log[2] fold change ≤ −0.3, including ALPP (FC = −2.96, FDR <2.2 × 10^−308), keratin 19 (KRT19; FC = −1.60, FDR <2.2 × 10^−308), DNA topoisomerase II alpha (TOP2A; FC = −1.29, FDR = 2.4 × 10^−284), SLC7A5 (FC = −1.54, FDR = 1.96 × 10^−215), and solute carrier family 3A2 (SLC3A2; FC = −1.09, FDR = 1.17 × 10^−171) ([101]Fig. 2c and d). Summary statistics for top RNA-sequencing data are presented in [102]Table S3.

To gain deeper insights into the biological functions of DEG from the SDR42E1 knockout model, an enrichment analysis was conducted using the ClusterProfiler R package.
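As a small illustration of the thresholds used for the up- and downregulated gene lists above (FDR < 0.05 and |log[2] FC| ≥ 0.3), the following sketch classifies a few of the reported genes and converts their log[2] fold changes to linear ratios. The helper names `classify` and `linear_fold` are hypothetical, and the sketch is not the authors' pipeline; the gene values are copied from the text.

```python
from typing import List, Tuple

FDR_CUTOFF = 0.05  # adjusted P-value threshold stated in the text
LFC_CUTOFF = 0.3   # absolute log2 fold-change threshold stated in the text

# (gene, log2 fold change, FDR) as reported in the text for the knockout model.
genes: List[Tuple[str, float, float]] = [
    ("LAMC2", 1.60, 4.37e-262),
    ("SERPINB2", 3.17, 2.17e-103),
    ("LIPG", 0.75, 7.31e-68),
    ("TOP2A", -1.29, 2.4e-284),
    ("SLC7A5", -1.54, 1.96e-215),
]


def classify(log2_fc: float, fdr: float) -> str:
    """Label a gene as 'up', 'down', or 'unchanged' under the two cutoffs."""
    if fdr >= FDR_CUTOFF or abs(log2_fc) < LFC_CUTOFF:
        return "unchanged"
    return "up" if log2_fc > 0 else "down"


def linear_fold(log2_fc: float) -> float:
    """Convert a log2 fold change to a linear knockout/control ratio."""
    return 2.0 ** log2_fc


for name, lfc, fdr in genes:
    print(f"{name}: {classify(lfc, fdr)}, {linear_fold(lfc):.2f}x of control")
```

Under this conversion, the 0.3 cutoff corresponds to roughly a 1.23-fold change, and a log[2] FC of −1.54 corresponds to about 0.34 of the control level.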
The KEGG analysis showed that the activated pathways are predominantly involved in ribosome biogenesis (FC = 2.15, adjusted P-value = 0.0008), interleukin 17 (IL-17) signaling (FC = 1.6, adjusted P-value = 0.0008), immune disorders (FC = 1.6, adjusted P-value = 0.002), cellular senescence (FC = 1.5, adjusted P-value = 0.004), steroid biosynthesis (FC = 1.6, adjusted P-value = 0.03), and the lipid metabolism and atherosclerosis process (FC = 1.3, adjusted P-value = 0.04) ([103]Fig. 3a and c). On the other hand, deactivated pathways are linked to glycosphingolipid biosynthesis (FC = −1.6, adjusted P-value = 0.004), cardiac muscle contraction (FC = −1.6, adjusted P-value = 0.008), vitamin metabolism (FC = −1.6, adjusted P-value = 0.009), and ABC transporters (FC = −1.5, adjusted P-value = 0.04) ([104]Fig. 3a).

Fig. 3. Extensive alterations in gene pathways in the SDR42E1 knockout model. a, A dot plot illustrates enriched KEGG pathways for DEG. The Y-axis represents the KEGG pathways; the X-axis represents the ratio of the genes enriched in the KEGG pathway. A KEGG pathway diagram, enhanced via the R/Pathview package, shows the expression profiles of genes involved in b, steroid biosynthesis and c, steroid hormone biosynthesis. Red indicates genes upregulated, while green denotes genes downregulated by the SDR42E1 knockout. d, A bar plot of signature drugs associated with SDR42E1 knockout through DSigDB. The color and size of dots and bars reflect the significance and count of DEG linked to KEGG pathways and drugs, respectively. An adjusted P-value (p. adjust) < 0.05 was used as the threshold to select KEGG terms.

Interestingly, analysis of KEGG pathways revealed significant gene expression alterations in the pathways of steroid and steroid hormone biosynthesis ([107]Fig. 3b and c).
This was characterized by the upregulation of EBP (FC = 0.50, FDR = 2.46 × 10^−13), DHCR7 (FC = 0.42, FDR = 8.03 × 10^−09), CYP51A1 (FC = 0.63, FDR = 1.16 × 10^−27) and CYP27B1 (FC = 1.08, FDR = 1.11 × 10^−06) ([108]Table S3). Conversely, several CYP family genes were downregulated, including CYP26A1 (FC = −1.55, FDR = 1.06 × 10^−08) and CYP24A1 (FC = −0.26, FDR = 3.8 × 10^−02), as well as lamin B receptor (LBR, FC = −0.58, FDR = 5.14 × 10^−20) and catechol-O-methyltransferase (COMT, FC = −0.23, FDR = 8.80 × 10^−04) ([109]Table S3). Drug prediction enrichment analysis of the DEG using the DSigDB database revealed therapeutic agents potentially regulated by SDR42E1, including dinoprostone (P-value = 7.37 × 10^−10), mifepristone (P-value = 1.12 × 10^−09), 17-Ethynyl estradiol (P-value = 1.67 × 10^−07), vitamin D3 (P-value = 4.39 × 10^−06), and medroxyprogesterone acetate (P-value = 1.56 × 10^−05) ([110]Fig. 3d and [111]Table S4).

To corroborate findings derived from the RNA-sequencing analysis of SDR42E1 knockout HaCaT cells, RT-qPCR analysis was performed on a selected set of five representative genes using the identical RNA extracts. The RT-qPCR outcomes for ALPP, SDR42E1, SLC7A5, DHCR7, and SERPINB2 exhibited a robust concordance with the RNA-sequencing data, with a substantial Spearman correlation coefficient (R^2) of approximately 0.93 when compared to wild-type HaCaT cells ([112]Table 1). These findings collectively affirm the robustness and reliability of the RNA-sequencing analysis conducted in this study.

Table 1. Correlation between RNA-sequencing and RT-qPCR findings in the SDR42E1 knockout model.

Genes | RT-qPCR | RNA-Sequencing
SLC7A5 | −1.7 | −1.54
ALPP | −3.59 | −2.96
DHCR7 | 0.27 | 0.42
SERPINB2 | 1.24 | 3.19
SDR42E1 | −2.71 | −1.15
Spearman R^2 | 0.93

Spearman correlation coefficient (R^2) reveals strong correlations on the log[2] (fold-change) data.
Beta-actin gene was used as an internal control for expression normalization by geometric mean. HaCaT wild-type cells served as a negative control. Data present the log[2] (fold-change) of three replicates, demonstrating consistent outcomes. Abbreviations: RT-qPCR, reverse transcription-quantitative polymerase chain reaction.

Furthermore, we assessed gene replication by comparing the DEG in the SDR42E1 knockout HaCaT RNA-sequencing data with vitamin D-related genes reported in the GWAS Catalog [[114]25]. We replicated a considerable number of the common genes in our SDR42E1 knockout dataset, identifying 65 out of 248 genes. Notable instances include the low-density lipoprotein receptor (LDLR), endothelial lipase (LIPG), and involucrin (IVL), which exhibited significant associations in the SDR42E1 knockout data and the GWAS Catalog, with P-values ranging from 9 × 10^−2305 to 5 × 10^−08 ([115]Table 2).

Table 2. Replication of DEG reported in the GWAS catalog for vitamin D in the SDR42E1 knockout model.

Gene | log[2] FC | P-value | FDR | Gene Description
LDLR | 0.979 | 2.297E−73 | 4.536E−71 | Low-Density Lipoprotein Receptor; is involved in the regulation of cholesterol levels in the blood.
LIPG | 0.749 | 4.168E−70 | 7.305E−68 | Endothelial Lipase; plays a role in lipid metabolism.
IVL | −1.937 | 1.471E−50 | 1.517E−48 | Involucrin; regulates vitamin D receptor in the epidermis.
FGFBP1 | 0.763 | 1.040E−46 | 9.656E−45 | Fibroblast Growth Factor Binding Protein 1; modulates fibroblast growth factor activity.
CXCL8 | 1.306 | 4.189E−42 | 3.228E−40 | Chemokine (C-X-C motif) Ligand 8; a pro-inflammatory cytokine involved in immune response.
HERPUD1 | −1.177 | 6.225E−37 | 3.950E−35 | Homocysteine-Inducible, Endoplasmic Reticulum Stress-Inducible, Ubiquitin-Like Domain 1; is involved in the unfolded protein response.
RETREG3 | −0.867 | 2.697E−35 | 1.637E−33 | Reticulophagy Regulator 3; possibly involved in autophagy or cell survival.
PRXL2A | −0.585 | 7.438E−32 | 3.890E−30 | Peroxiredoxin-Like 2A; is involved in oxidative stress response.
SDR42E1 | −1.137 | 1.981E−29 | 9.168E−28 | Short Chain Dehydrogenase/Reductase Family 42E, 1; potential role in steroid metabolism.
ADAR | −0.467 | 3.532E−27 | 1.448E−25 | Adenosine Deaminase Acting on RNA; is involved in RNA editing.
ZPR1 | 0.634 | 1.190E−25 | 4.498E−24 | ZPR1 Zinc Finger; essential for cell viability and may play a role in cell proliferation.
ARNT | −0.734 | 2.807E−19 | 7.197E−18 | Aryl Hydrocarbon Receptor Nuclear Translocator; is involved in response to environmental toxins.
BCL11A | −1.658 | 5.743E−17 | 1.276E−15 | BAF Chromatin Remodeling Complex Subunit BCL11A; a transcription factor involved in hematopoietic development.
TRPS1 | −3.028 | 1.862E−15 | 3.668E−14 | TRPS1 Transcription Repressor; is involved in skeletal development.
KIF20B | −0.403 | 5.962E−15 | 1.134E−13 | Kinesin Family 20B; is involved in mitosis and cell division.
RABGAP1 | −0.417 | 1.898E−10 | 2.341E−09 | RAB GTPase Activating Protein 1; is involved in intracellular membrane trafficking.
GNAQ | −0.395 | 7.397E−10 | 8.351E−09 | G Protein Subunit Alpha Q; a component of a signaling pathway involved in various cell processes.
CYP26A1 | −1.55 | 9.489E−10 | 1.054E−08 | Cytochrome P450 Family 26 Subfamily A1; is involved in vitamin metabolism.
CARMIL1 | −0.390 | 2.820E−09 | 2.936E−08 | Capping Protein Regulator and Myosin 1 Linker 1; is involved in cell migration and adhesion.
ATP1B3 | −0.312 | 3.532E−09 | 3.618E−08 | ATPase Na+/K+ Transporting Subunit Beta 3; is involved in ion transport.
DHCR7 | 0.406 | 8.034E−09 | 7.869E−08 | 7-Dehydrocholesterol Reductase; is involved in cholesterol and steroid biosynthesis.
ZNF587B | 0.456 | 1.458E−08 | 1.369E−07 | Zinc Finger Protein 587B; likely a transcription factor.
ASH1L | −0.337 | 1.510E−08 | 1.415E−07 | ASH1 Like Histone Lysine Methyltransferase; is involved in chromatin modification.
DOCK8 | −3.042 | 2.127E−07 | 1.687E−06 | Dedicator Of Cytokinesis 8; is involved in immune cell signaling and function.
GALNT2 | −0.307 | 2.218E−07 | 1.754E−06 | Polypeptide N-Acetylgalactosaminyltransferase 2; is involved in glycosylation.
SMYD3 | −0.755 | 5.670E−07 | 4.180E−06 | SET And MYND Domain Containing 3; a histone methyltransferase involved in chromatin regulation.
KLK10 | −0.394 | 2.198E−06 | 1.45E−05 | Kallikrein Related Peptidase 10; is involved in proteolysis and various physiological processes.
CELSR2 | −0.421 | 2.214E−06 | 1.46E−05 | Cadherin EGF LAG Seven-Pass G-Type Receptor 2; is involved in cell adhesion and signaling.
ZNF680 | −0.820 | 5.506E−06 | 3.37E−05 | Zinc Finger Protein 680; likely functions as a transcription factor.
USP3 | 0.330 | 6.418E−06 | 3.88E−05 | Ubiquitin Specific Peptidase 3; is involved in DNA damage response and repair.
FTO | −0.344 | 8.016E−06 | 4.77E−05 | FTO Alpha-Ketoglutarate Dependent Dioxygenase; associated with body mass and obesity.
MCUB | −0.599 | 9.294E−06 | 5.46E−05 | Mitochondrial Calcium Uniporter, MCUb Subunit; is involved in mitochondrial calcium uptake.
MAN2A1 | −0.236 | 1.51E−05 | 8.52E−05 | Mannosidase Alpha Class 2A 1; is involved in glycoprotein processing.
TRMT61A | 0.419 | 1.88E−05 | 0.0001046 | tRNA Methyltransferase 61A; is involved in tRNA modification.
BTBD10 | 0.296 | 3.22E−05 | 0.0001702 | BTB Domain Containing 10; potentially involved in neuronal survival and apoptosis.

Listed genes are associated with vitamin D at P-values < 5E−05 and are reported in the GWAS Catalog for vitamin D with P-values < 5E−08. Abbreviations: Log[2] FC, Log[2] Fold Change estimate; FDR, false discovery rate, representing the P-value adjusted using the Benjamini-Hochberg method in DESeq2.

3.3. SDR42E1 knockout profoundly alters protein expressions and pathways involved in vitamin D regulation

Using a label-free LC-MS/MS shotgun proteomics approach, we identified a range of differentially expressed proteins in the HaCaT cells with homozygous SDR42E1 knockout, totaling 138 out of 1,320 proteins ([117]Fig. 4 and [118]Table S5).
In comparison to wild-type HaCaT cells, the SDR42E1 knockout cells exhibited 101 downregulated proteins and 37 upregulated proteins ([119]Fig. 4a and b). Notable increases were observed in SERPINB2 (molecular mass of 46851 Da), keratin 17 (KRT17) (molecular mass of 48361 Da), and SERPINB1 (molecular mass of 42742 Da), with adjusted P-values (q-values) of 1.90 × 10^−08, 9.91 × 10^−05, and 2.38 × 10^−02, and log[2] fold changes of 3.36, 1.37, and 1.23, respectively ([120]Fig. 4c). Simultaneously, there were significant downregulations of SLC3A2 (molecular mass = 68180 Da), SLC7A5 (molecular mass = 55659 Da), and LIM And SH3 Protein 1 (LASP1, molecular mass = 29717 Da), with q-values of 8.41 × 10^−07, 1.88 × 10^−05, and 2.38 × 10^−02, and log[2] fold changes of −1.54, −1.77, and −1.20, respectively ([121]Fig. 4c). These findings were corroborated by RNA-sequencing analysis results.

Fig. 4. Extensive alterations in protein expressions and pathways in the SDR42E1 knockout model. a, A PCA plot reveals the clustering of three biological replicates of wild-type HaCaT controls (C; in rose) and SDR42E1 homozygous knockouts (Hom, in blue), based on the major components of regularized log-transformed counts. b, A heatmap displays the Z-scores of log-transformed counts for the top 100 differentially expressed proteins, using blue to indicate lower and red for higher expression, highlighting clear separation between samples. Axes are labeled with sample names and proteins. c, A volcano plot compares the proteomic data of homozygous SDR42E1 knockout cells to wild-type HaCaT controls. The X-axis shows log[2] fold change (FC), with significant upregulation to the right (greater than 1) and downregulation to the left (less than −1). The Y-axis shows the −log of the false discovery rate (FDR), marking significance below 0.05. The top 20 significantly altered proteins are highlighted.
d, A GSE dot plot presents pathway enrichment analysis of the differentially expressed proteins in SDR42E1 homozygous knockouts. These analyses were conducted and visualized using R/Limma, Pheatmap, and ClusterProfiler.

The GSE analysis linked a wide range of differentially expressed proteins to the activation of pathways involved in the regulation of epithelial cell apoptosis (GO:1904019, FC = 1.8, adjusted P-value = 0.002), development of skin epidermis (GO:0098773, FC = 1.7, adjusted P-value = 0.008), and various processes of wound healing (GO:0042060 and GO:0009611, FC = 1.7, adjusted P-value = 0.02) ([124]Fig. 4d). Moreover, numerous proteins contributed to the suppression of pathways related to the formation of melanosome and pigment granules (GO:0042470 and GO:0048770), cellular response to heat (GO:0034605), and various immune-related disorders (including GO:0002250, GO:0002460, and GO:0002699), with FC of −1.7 and adjusted P-value of 0.016 ([125]Fig. 4d).

3.4. SDR42E1 knockout decreases vitamin D production in HaCaT cells

To evaluate the direct impact of the SDR42E1 knockout on the levels of vitamin D in cell lysates, a vitamin D ELISA was conducted. These tests demonstrated a significant reduction in total vitamin D levels to 0.52-fold (R^2 = 0.935) in the SDR42E1 knockout samples compared with the wild-type controls, dropping from 78.18 ng/mL to 40.60 ng/mL, yielding a P-value of 0.0016 ([126]Fig. 5).

Fig. 5. Decreased vitamin D levels in the SDR42E1 knockout model. Vitamin D levels in the SDR42E1 knockout model were measured with a vitamin D ELISA assay. Compared to wild-type HaCaT cells, the knockout model showed significantly reduced vitamin D levels across three replicated experiments (**; P-value <0.01).

4. Discussion

The widespread prevalence of vitamin D deficiency poses a significant public health concern, associated with various serious illnesses.
The intricate regulation of vitamin D levels is influenced by a multifaceted interplay of genetic and environmental factors, such as limited sunlight exposure and malnutrition [[129]2]. Recent GWAS research has linked a new nonsense variant, rs11542462, in the uncharacterized SDR42E1 to serum 25(OH)D levels [[130]6,[131]7,[132]8]. The novel SDR42E1 is believed to be engaged in multiple metabolic processes that could impact lipid metabolism and steroid hormone biosynthesis [[133]10,[134]11]. Our study comprehensively characterizes, for the first time, the structure and function of the SDR42E1 gene and its variant, exploring the potential impact on biological processes related to vitamin D.

Preliminary examination of available mRNA data for SDR42E1 from healthy subjects revealed elevated expression levels in essential tissues involved in vitamin D synthesis and metabolism [[135]28], notably in skin keratinocytes and the corresponding HaCaT cell line. This evidence suggests that SDR42E1 could be instrumental in the regulation of vitamin D biosynthesis in the skin. Such insights are valuable for understanding the function of SDR42E1 and for guiding the development of suitable models for further investigation. Consequently, we selected the human keratinocyte HaCaT cell line as our model system, reflecting the primary site of vitamin D[3] biosynthesis. This choice is vital for understanding the proposed involvement of SDR42E1 in vitamin D regulation, providing a relevant and accurate context for our investigations.

In a characterization study, examining the subcellular localization of uncharacterized proteins provides significant clues about their metabolic functions. Previous research has reported that the subcellular distribution of proteins in the SDR family largely depends on their enzymatic activity, with proteins predominantly localized to mitochondria, cytoplasm, plasma membrane, and endoplasmic reticulum [[136]10,[137]29].
In the current research, we specifically targeted the overexpression of SDR42E1 in enriched human HaCaT cell lines to delineate its subcellular distribution. Employing targeted antibodies against SDR42E1 and staining for different cellular components, we revealed a prominent localization of SDR42E1 to the plasma membrane and cytoplasm. The plasma membrane and cytosol are critical platforms for lipid and steroid metabolic processes [[138]30]. Our observation expands our comprehension of the roles of SDR42E1 and strongly suggests its potential involvement in regulating lipid metabolism within these critical cellular compartments.

Employing CRISPR/Cas9 technology in HaCaT cells, we successfully introduced a targeted and efficient modification to SDR42E1, replicating the p.Q30* GLN>*TER nonsense mutation and mimicking a functional knockout of the gene. We conducted extensive transcriptomic and proteomic profiling in the homozygous SDR42E1 knockout and wild-type HaCaT cells, uncovering numerous differentially expressed proteins and gene enrichments in various downstream and steroid-related pathways.

Key observations were the substantial downregulation of ALPP and KRT19 gene expressions in the SDR42E1 knockout model. Alkaline phosphatase (ALP) encompasses a set of enzymes, encoded by four genes, with three being tissue-specific and one ubiquitous across various body tissues. ALP plays a crucial role in the synthesis of cellular membrane phospholipids and in the mineralization of new bone, particularly via ALPP [[139]31]. Elevated serum ALP levels are indicative of osteomalacia, often stemming from vitamin D deficiency [[140]32]. Kover and colleagues initially highlighted the role of ALP as a marker for vitamin D deficiency in premature infants with rickets [[141]33].
Subsequent studies have consistently shown an inverse relationship between ALP and serum 25(OH)D levels [[142]34], leading to the use of increased serum ALP as a diagnostic marker for vitamin D deficiency [[143]35]. However, there is no evidence of a functional role for ALPP in the skin, despite its significant expression.

The depletion of SDR42E1 from HaCaT cells resulted in notable alterations in numerous genes previously reported in the GWAS catalog for vitamin D levels [[144]36]. Among these DEG were key players, including LDLR, LIPG, IVL, CYP26A1, and DHCR7, known for their involvement in diverse biological processes related to vitamin D regulation, including synthesis, transport, and degradation. Furthermore, we noticed that several important genes, including SERPINB2, SLC7A5, CYP3A5, and LBR, underwent significant changes in the homozygous SDR42E1 knockout model. These observations are consistent with previous studies, emphasizing the potential role of SDR42E1 in modulating lipid and steroid metabolism, including vitamin D in skin keratinocytes [[145]37,[146]38].

Interestingly, our analysis of the enriched pathways has revealed a pronounced impact on the steroid biosynthesis pathway following the inactivation of the SDR42E1 gene in HaCaT cells. Notably, the upregulation of key genes involved in vitamin D synthesis in the skin, EBP and DHCR7, leads to enhanced production of 7-DHC and a consequent obstruction in its conversion to vitamin D3 in the absence of SDR42E1 ([147]Fig. 6). This discovery aligns with our earlier research, which demonstrated the robust substrate affinity of vitamin D3 and its precursors, 7-DHC and 8-DHC, towards SDR42E1 through in silico docking studies [[148]10]. This affinity is further supported by the drug enrichment analysis of our SDR42E1 knockout model, which shows a significant association with vitamin D3 therapy.
Moreover, other genes encoding key enzymes implicated in vitamin D synthesis and absorption, notably CYP27B1, CYP24A1, COMT, and ABCB1, are also significantly regulated in the absence of SDR42E1, further confirming a role of SDR42E1 in maintaining vitamin D homeostasis.

Fig. 6. Potential role of SDR42E1 in vitamin D biosynthesis and regulation. The pathway illustrates the influence of SDR42E1 absence in vitamin D skin synthesis from 7-DHC upon solar ultraviolet B (UVB) exposure. To boost vitamin D levels, the body increases the conversion of 8-DHC or cholesterol to 7-DHC by upregulating enzymes EBP or DHCR7. Intestinal absorption of vitamin D is also improved by the upregulation of ABCB1. The liver then enhances the conversion to 25(OH)D by upregulating CYP27A1 or CYP3A4, and the kidneys increase activation to 1,25-dihydroxyvitamin D via CYP27B1 upregulation, regulating related-gene expressions via vitamin D receptor (VDR)/retinoid-X receptor (RXR) complex. The inactivation and secretion process, usually facilitated by the CYP24A1 enzyme, is also diminished. Red indicates proteins upregulated, while green denotes proteins downregulated by the SDR42E1 absence. (?) indicates our proposed SDR42E1 involvement in vitamin D biosynthesis. 2D chemical structures obtained from PubChem: [151]https://pubchem.ncbi.nlm.nih.gov. Generated with [152]BioRender.com.

Our proteomic analysis, consistent with RNA-sequencing, demonstrated marked increases in SERPINB2, KRT17, and SERPINB1, and decreases in SLC3A2, SLC7A5, and LASP1 protein levels in the SDR42E1 knockout model. SERPINB2, encoding the plasminogen activator inhibitor-2 (PAI-2) protein, is key in regulating blood clot breakdown while SERPINB1 is an intracellular protein that shields cells from stress-induced cytoplasmic proteases [[153]39].
A previous study showed that vitamin D regulates the production of antithrombin by affecting the SERPIN proteins, suggesting a new pathway for antithrombotic therapy development [[154]40]. Furthermore, vitamin D influences keratinocyte behavior and gene expression, notably SERPIN genes, which play a role in skin differentiation and the management of skin conditions [[155]41]. KRT17 is also connected to various cell functions, skin disorders, and bone irregularities [[156]42,[157]43]. The SLC7A5 encodes the large neutral amino acid transporter 1 (LAT1), vital for transporting large neutral amino acids across cell membranes [[158]44]. Recent studies emphasize the critical role of LAT1 in bone homeostasis, regulated by the vitamin D receptor [[159]45,[160]46]. Additionally, vitamin D regulates LAT1 expression in the placenta, possibly aiding fetal growth in vitamin D-deficient preeclampsia through vitamin D receptor and the mTOR pathways [[161]47]. SLC3A2, a component of CD98 glycoprotein, is essential for skin health and osteoclast formation by interacting with LAT1 and vitamin D [[162]48]. SLC3A2 is upregulated by estrogen, promoting colon health and vitamin D synthesis, thereby preventing colorectal cancer through vitamin D receptor activation [[163]49]. These discoveries underscore the intricate role of SDR42E1 in managing vitamin D-related pathways and health outcomes, highlighting the need for further research into its specific genetic mechanisms. While our research provides considerable new insights into the functional role of SDR42E1 in vitamin D biosynthesis, several areas warrant further exploration. The functions and substrates of SDR42E1 require additional characterization to fully elucidate its involvement in various biological processes, particularly in cellular senescence and DNA repair. Moreover, our study did not investigate the potential interactions between SDR42E1, environmental factors, and human diseases. 
This aspect could benefit from further investigation using diverse cellular and animal models, as well as clinical samples. These additional studies would validate our in vitro findings and enhance the understanding of the physiological and pathological implications of SDR42E1 in human health.

In conclusion, our research provides a pioneering characterization of the SDR42E1 gene, elucidating its relationship with key genes involved in steroid and vitamin D biosynthesis. This study also highlights the role of SDR42E1 in lipid metabolism, cellular aging, and DNA repair, thereby expanding our understanding of its previously uncharacterized cellular functions.

Data availability statement

All data generated throughout the study have been thoroughly reviewed and included in this published article or documented in the referenced data repositories. The RNA-sequencing data is accessible on the GEO database portal ([164]https://www.ncbi.nlm.nih.gov/geo/) under the series accession number [165]GSE262704. Bioinformatic and statistical analyses employed publicly accessible software tools, as detailed in the main text and Methods sections. For further information about resources or reagents, and access to raw data and code, please contact the lead author, Georges Nemer (gnemer@hbku.edu.qa), upon reasonable request.

Funding

Open access funding provided by the Qatar National Library.

CRediT authorship contribution statement

Nagham Nafiz Hendi: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. Maria Teresa Bengoechea-Alonso: Resources. Johan Ericsson: Resources, Writing – review & editing. Georges Nemer: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – review & editing.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement