Abstract

Despite past research linking HLF mutations to cancer development, no pan-cancer analysis of HLF has been published. We therefore used multiple databases to illustrate the potential roles of HLF in diverse cancer types. Several databases were used to assess HLF expression in TCGA cancer samples. Additional analyses investigated the relationship between HLF and overall survival, immune cell infiltration, genetic alterations, promoter methylation, and protein-protein interactions. HLF's putative roles and the relationship between HLF expression and drug sensitivity were also investigated. HLF expression was lower in tumor tissues than in normal tissues across a variety of malignancies. HLF expression was significantly associated with patient survival, genetic mutations, and immune infiltration. HLF influenced the apoptosis, cell cycle, EMT, and PI3K/AKT signaling pathways. Abnormal expression of HLF lowered sensitivity to numerous anti-tumor drugs and small compounds. According to our findings, reduced HLF expression drives cancer growth, and HLF has the potential to serve as a vital biomarker for prognosis, immunotherapy, and targeted treatment of a range of malignancies.

Keywords: Bioinformatics, Biomarkers, Tumor-infiltrating immune cells, Prognosis, Target therapy, Cancer genetics

Highlights

• HLF expression was lower in tumor tissues than in normal tissues.
• HLF expression was linked to patient survival, mutations, and immune infiltration.
• HLF influenced the apoptosis, cell cycle, EMT, and PI3K/AKT signaling pathways.
• Abnormal expression of HLF lowered sensitivity to numerous anti-tumor drugs and small compounds.
• HLF has the potential to be a biomarker for prognosis, immunotherapy, and targeted treatment.

1. Introduction

Cancer is a serious hazard to public health owing to the substantial rise in global warming and unhealthy lifestyles [1]. The pathophysiology of cancer is exceedingly complicated, making early detection challenging. Furthermore, established diagnostic and therapeutic procedures are ineffective in the early stages of diagnosis and treatment [2,3,4]. Developments in sequencing technology and online datasets have made it possible to understand the roles of some genes in human cancer [5,6]. For example, the TCGA project holds information on 33 malignancies and over 10,000 individuals [7], which allows a better understanding of how particular genes function in human cancer. Recent research has associated circadian disruption with elevated susceptibility to malignancies such as breast, prostate, colorectal, and liver cancer and non-Hodgkin's lymphoma [8,9,10,11,12]. Indeed, there is substantial evidence that shift work, which disrupts natural circadian rhythms, is detrimental to human health. A complex auto-regulatory network of 'clock' genes orchestrates circadian rhythms. These genes govern physiological and behavioral functions in response to periodic changes in the environment. Hepatic leukemia factor (HLF) is a clock-dependent transcription factor that plays an important role in the circadian adjustment of a number of different activities [13,14,15]. It belongs to the proline- and acidic amino acid-rich basic leucine zipper protein family and was initially found in patients with early B-lineage acute leukemia who had aberrant expression of the E2A–HLF fusion gene. Subsequently, it was discovered to be expressed in liver and kidney cells [16,17].
Furthermore, the HLF gene has been found to have a crucial regulatory function in the development of several malignancies, including lung, renal, liver, and breast cancers and glioma [18,19,20,21,22,23]. These findings suggest that the HLF gene plays diverse roles during tumorigenesis depending on the tissue and setting. However, it is still unclear whether the HLF gene is implicated in the etiology of cancers and whether it could be regarded as a possible target and a crucial mediator in a number of human cancers via a shared signaling mechanism. In this study, we used data from the TCGA dataset to perform a pan-cancer analysis of the HLF gene. To unravel the molecular processes of the HLF gene in cancer, we explored the association of HLF expression with prognosis, genetic changes, the immunological microenvironment, gene function, and drug sensitivity in cancer patients.

2. Materials and methods

2.1. Gene expression analysis

We used the “TCGA” option of the UALCAN platform to compare the levels of HLF expression in tumor tissues and normal samples across 24 human cancers in the TCGA project. UALCAN (http://ualcan.path.uab.edu/analysis.html) is a web tool that provides access to available cancer omics data [24,25]. For cancers with too few normal samples (N < 10) or without available data in the UALCAN database, we used the “box plot” tab of the “Expression DIY” module of the GEPIA2 web server to assess the expression difference between tumor tissues of the TCGA cohort and the related normal tissues of the TCGA and GTEx projects, with predefined settings (q-value cutoff = 0.01, log2FC (fold change) cutoff = 1). GEPIA2 (http://gepia2.cancer-pku.cn/#analysis) is an online TCGA data analysis tool that enables researchers to perform various analyses, such as expression analysis and survival analysis, for a given gene [26].
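As a rough illustration of the thresholds quoted above, the following Python sketch classifies a tumor-versus-normal comparison using the stated cutoffs. It is not GEPIA2's implementation: the function names are hypothetical, the fold change is assumed to be the difference of median log2(TPM + 1) values, the q-value is assumed to come from an upstream test, and treating the cutoffs as strict inequalities is also an assumption.

```python
import math
from statistics import median

LOG2FC_CUTOFF = 1.0    # |log2FC| cutoff stated in the methods
Q_VALUE_CUTOFF = 0.01  # q-value cutoff stated in the methods


def log2_fold_change(tumor_tpm, normal_tpm):
    """Difference of median log2(TPM + 1) between tumor and normal (assumed definition)."""
    tumor_median = median(math.log2(x + 1) for x in tumor_tpm)
    normal_median = median(math.log2(x + 1) for x in normal_tpm)
    return tumor_median - normal_median


def classify_expression(tumor_tpm, normal_tpm, q_value):
    """Label a comparison as upregulated, downregulated, or not significant.

    q_value is assumed to be computed elsewhere; this function only applies
    the two cutoffs.
    """
    fold_change = log2_fold_change(tumor_tpm, normal_tpm)
    if q_value >= Q_VALUE_CUTOFF or abs(fold_change) <= LOG2FC_CUTOFF:
        return "not significant"
    return "upregulated" if fold_change > 0 else "downregulated"
```

With both cutoffs applied, a gene is reported as changed only when the difference is both statistically supported and at least a two-fold shift in median expression.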
Cholangiocarcinoma, Sarcoma, Pheochromocytoma and Paraganglioma, Mesothelioma, and Uveal Melanoma were excluded from our analysis because of limited normal tissues (N < 10) in both databases. Furthermore, we analyzed the difference in HLF gene expression between pathological stages of human cancers using the GSCA database (http://bioinfo.life.hust.edu.cn/GSCA/#/). The GSCA platform is a web tool that combines omics data retrieved from the TCGA database. It enables researchers to perform several analyses, including expression analysis, pathway activity, and drug sensitivity [27]. Table S1 shows the abbreviation and sample size for the 33 cancer types deposited in the TCGA and GTEx datasets according to the GEPIA2 database.

2.2. Survival prognosis analysis

We employed the “Survival Map” module of the GEPIA2 database and the “pan-cancer” option of the Kaplan-Meier plotter (KM-plotter) database to investigate the overall survival (OS) of cancer patients with aberrant expression of the HLF gene. We also evaluated the association between dysregulation of the HLF gene and OS outcome across different tumors through the GSCA and KM-plotter databases. The Kaplan-Meier plotter (https://kmplot.com/analysis/) is an online resource for exploring the influence of 54,000 genes on survival outcomes in 21 carcinoma types [28]. These assessments were performed under the following conditions: based on the median expression level of the HLF gene, cancer patients were divided into high-expression (cutoff-high (50 %)) and low-expression (cutoff-low (50 %)) groups. A log-rank p-value <0.05 was considered statistically significant.

2.3. Genetic alteration analysis

The cBioPortal database (https://www.cbioportal.org/) is an open resource for cancer genomics, providing easy access to data including copy number changes, aberrations in mRNA expression, DNA methylation, and protein expression for more than 5000 tumor samples among more than 20 carcinoma studies [29].
On the cBioPortal website, we used the “TCGA Pan-Cancer Atlas Studies” option to explore genetic aberrations of the HLF gene. The “Mutations” module was also used to investigate mutation site information for the HLF gene.

2.4. DNA methylation analysis

Using data from the DNMIVD (http://119.3.41.228/dnmivd/) database [30], we determined the methylation status of the HLF gene in all accessible TCGA cohorts. To compare methylation levels between cancer and normal samples, an independent Student's t-test was conducted, and cancers with |beta difference| > 0.1 and an adjusted p-value <0.05 were considered tumors with significant changes in the promoter of the HLF gene. In addition, whenever the promoter of the HLF gene was abnormally methylated, we also explored the association between gene expression and promoter methylation in primary tissues using Pearson and Spearman correlation analysis with predefined criteria (rho value < -1 and p-value <0.05).

2.5. Protein-protein interaction (PPI) analysis

We used the GeneMANIA (http://www.genemania.org) web tool [31] to construct a PPI network, including physical interaction, co-localization, prediction, co-expression, shared protein domains, and genetic interaction connections between the HLF gene and related genes.

2.6. Pathway enrichment analysis

Using the GSCA database, we investigated the association between the expression of the HLF gene and pathway activity across all TCGA tumors. The pathways covered by GSCA are TSC/mTOR, Receptor Tyrosine Kinase (RTK), RAS/MAPK, PI3K/AKT, Hormone ER, Hormone AR, EMT, DNA Damage Response, Cell Cycle, and Apoptosis, all of which are recognized as well-known cancer-related pathways. In this analysis, samples were separated into two groups (High and Low) on the basis of the median HLF gene expression. The difference in pathway activity score (PAS) between the groups was assessed with Student's t-test.
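The grouping step described here, together with the FDR adjustment that the methods apply to the resulting p-values, could be sketched in Python as follows. This is only an illustration under stated assumptions, not GSCA's code: the Benjamini-Hochberg procedure is assumed because the text says only "FDR method", samples exactly at the median are assigned to the Low group by assumption, the t-test itself is left to a statistics library, and all function names are hypothetical.

```python
from statistics import median


def split_by_median(expression, scores):
    """Split pathway activity scores into High/Low groups by median expression.

    Samples with expression exactly at the median go to the Low group
    (an assumption; the source does not specify tie handling).
    """
    cutoff = median(expression)
    high = [s for e, s in zip(expression, scores) if e > cutoff]
    low = [s for e, s in zip(expression, scores) if e <= cutoff]
    return high, low


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumed FDR procedure; one p-value per pathway/cancer comparison.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In practice the two groups returned by `split_by_median` would be passed to a two-sample t-test, and the p-values collected across pathways and cancer types would then be adjusted together.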
The p-values were then adjusted by the FDR method; FDR ≤ 0.05 was considered statistically significant. When the PAS of samples with High expression of the HLF gene was greater than the PAS of samples with Low expression, we inferred that the HLF gene might promote pathway activity; otherwise, we inferred that it might suppress it.

2.7. Immune infiltration analysis

We used the “Immune-Gene” module of the TIMER2 (http://timer.comp-genomics.org/) database [32] to examine the correlation between HLF gene expression and immune infiltrates among all available TCGA cohorts. The CIBERSORT-ABS method was used for this assessment, and the p-values and partial correlation (cor) values were obtained from Spearman's rank correlation test corrected for tumor purity. A p-value <0.05 was considered statistically significant. Table S2 lists the immune cells selected for our analysis.

2.8. Drug sensitivity analysis

We used the GSCA database to conduct a drug sensitivity analysis to determine whether abnormal expression of the HLF gene affects cancer patients' clinical response and targeted therapy. This platform has gathered the IC50 values of 265 small molecules in 860 cell lines, along with the corresponding mRNA expression, from the GDSC. It uses Spearman's correlation analysis to assess the correlation between the expression of a given gene and drug sensitivity. A positive correlation suggests that high expression of a gene may be related to drug resistance, and vice versa.

3. Results

3.1. Expression analysis

According to the UALCAN database, the expression level of the HLF gene was lower in tumor tissues of BLCA, BRCA, COAD, HNSC, KIRC, KICH, LUAD, LUSC, PRAD, READ, STAD, THCA, and UCEC than in matching control tissues (Fig. 1A).
The GEPIA2 database also revealed that, whereas expression of the HLF gene was markedly downregulated in malignant tissues of ACC, CESC, GBM, OV, SKCM, and UCS, it was significantly upregulated in tumor tissues of patients with THYM compared to normal samples (Fig. 1B). Overall, expression of this gene was lower in ACC, BLCA, BRCA, CESC, COAD, GBM, HNSC, KIRC, KICH, LUAD, LUSC, PRAD, READ, SKCM, STAD, THCA, UCEC, and UCS tumor tissues and higher in THYM tumor tissues. When we compared HLF gene expression across pathological stages of TCGA tumors, we found a substantial difference between pathological stages of KIRC, THCA, and BLCA, with a considerable decrease observed at advanced stages (Fig. 1C).

Fig. 1. The expression analysis of the HLF gene and its correlation with pathological stages. Analysis of HLF expression across TCGA cancers in the UALCAN database; tumors showing significant reductions are marked by green rectangles (A). Analysis of HLF expression in ACC, CESC, GBM, DLBC, LAML, LGG, OV, PAAD, SKCM, TGCT, THYM, and UCS between TCGA tumor tissues and related normal tissues in the TCGA and GTEx databases using the GEPIA2 database; cancers with marked decreases are indicated by green rectangles, whereas those with substantial increases are labeled with red rectangles (B). Analysis of HLF gene expression variation among pathological stages of TCGA tumors via the GSCA database; cancers with notable changes across pathological stages are indicated with blue rectangles (C). (For interpretation of the references to color in this