Abstract

Background

The WASF3 gene promotes metastasis in breast cancer (BC) cells, and its low expression reduces invasive potential. Circular RNAs (circRNAs) function as microRNA (miRNA) modulators and are involved in cancer progression, but the relationship between circRNAs, miRNAs, and WASF3 remains unclear.

Methods

This study used a bioinformatics approach to investigate the role of circRNAs and miRNAs in the context of WASF3 overexpression. Differentially expressed mRNAs, circRNAs, and miRNAs were identified using Gene Expression Omnibus (GEO) datasets. A competing endogenous RNA (ceRNA) network was constructed from circRNA-miRNA pairs and miRNA-mRNA pairs. Functional and pathway enrichment analyses were performed on the resulting circRNA-miRNA-mRNA network.

Results

RNA expression patterns differed significantly between normal and tumor samples. A total of 190 circRNAs, 76 miRNAs, and 678 mRNAs were differentially expressed. Analysis of the circRNA-miRNA-mRNA regulatory network revealed interactions of hsa-circ-0100153, hsa-miR-31, hsa-miR-767-3p, and hsa-miR-935 with WASF3 in cancer. These interactions function primarily in DNA replication and the cell cycle.

Conclusions

This study reveals a mechanism by which WASF3 overexpression affects the expression of the circRNA hsa-circ-0100153, which promotes BC progression by sponging hsa-miR-31/hsa-miR-767-3p/hsa-miR-935. This mechanism may increase the invasive potential of cancers, in addition to other reported molecular mechanisms involving the WASF3 gene.

Keywords: Cancer, Oncology, Gene expression, circRNA, miRNA, Bioinformatics

1. Introduction

Cancer remains one of the leading causes of death globally, with an estimated 10 million deaths each year [1]. Among the various types of cancer, breast cancer (BC) is the most commonly diagnosed in women.
Unfortunately, the primary cause of death in BC patients is metastasis, the spread of cancer cells from the primary tumor to other parts of the body [2]. A gene with a pivotal role in promoting invasion and metastasis is WASF3, also known as WAVE3. It is a member of the Wiskott-Aldrich syndrome (WAS) family of genes, which encode proteins that contain several highly conserved motifs [3]. These motifs include a WASP-homology-2 domain (WH2/V), a cofilin-homology domain (C), and an acidic (A) domain, all located at the C-terminal end of the protein. Together they form the verprolin central acidic (VCA) domain, which is particularly important because it coordinates the recruitment of monomeric actin and the ARP2/3 complex of proteins, which facilitate actin polymerization [4,5]. Actin polymerization is essential for cell movement and invasion, which are key processes in the development of metastatic cancer [6,7]. Research has shown that WASF3 forms a complex with several other proteins, including the p85 component of PI3K and ABL kinase [8,9]. Additionally, other members of the WASF regulatory complex, such as Abi1/2, CYFIP1/2, NCKAP1, and HSPC300, keep the protein in an inactive state. On activation, the inhibitory complex is released and the VCA domain becomes exposed, allowing the ARP2/3 complex to bind and initiate actin polymerization, which facilitates cell movement and metastasis [10,11,12,13]. Previous research [3,11] has greatly improved our understanding of the function and role of WASF3 in promoting metastasis. Identifying the interplay between the biological molecules involved in this process should allow more effective strategies to target this pathway and ultimately improve outcomes for cancer patients.
Recent studies have identified a class of non-coding RNAs called covalently closed circular RNAs (circRNAs), which evade exonuclease-mediated degradation and are more stable in blood or plasma than linear RNAs [14,15]. As a result, circRNAs are considered promising candidates for new diagnostic or prognostic biomarkers in cancers such as BC [16,17]. CircRNAs have been shown to play a role in various BC hallmarks, including proliferation, apoptosis, and the activation of invasion and metastasis [18,19]. miRNAs are another type of non-coding RNA; they regulate gene expression at the post-transcriptional level. CircRNAs have recently been found to regulate miRNA function by sponging miRNAs, which can either increase or decrease miRNA expression levels [20,21]. For example, the oncogenic circRNA hsa-circ-0052112 enhances tumor cell invasion and migration by sponging miR-125a-5p, a tumor suppressor that inhibits the BAP1 oncogene. miRNAs play essential roles in tumorigenesis, cancer invasion, metastasis, relapse, and drug resistance, making them attractive targets for cancer research [22].

In this study, we used a bioinformatics approach to identify additional functional pathways of the WASF3 gene and to highlight its role in the regulation of circRNAs and miRNAs in cancer progression. We collected expression profiles of circRNAs, miRNAs, and mRNAs in BC and normal breast tissues from Gene Expression Omnibus (GEO) datasets and The Cancer Genome Atlas (TCGA) database. We identified differentially expressed mRNAs, circRNAs, and miRNAs using R software and constructed a circRNA-miRNA-mRNA regulatory network. We then predicted miRNA sponging by circRNAs and miRNA target genes, and assessed the competing endogenous RNA (ceRNA) network using gene ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses to determine the main functional pathways in BC. The graphical abstract (Fig.
1) shows the flowchart of the study procedure.

Fig. 1. Graphical abstract. The flowchart illustrates the study procedure.

The results of this study provide insight into the functional pathways regulated by the WASF3 gene and its role in BC progression. Additionally, the identified circRNA-miRNA-mRNA regulatory network and ceRNA network can facilitate the development of new diagnostic and therapeutic strategies for BC. This study contributes to our understanding of the genetic and epigenetic changes that drive BC progression and provides a foundation for further research in this field.

2. Materials and Methods

2.1. Raw data

To investigate the regulatory interactions between circRNAs, miRNAs, and mRNAs concurrently, we used three datasets from the GEO database (GSE113230, GSE56614, and GSE173661). These datasets comprised 100 samples of breast cancer and normal breast tissue, all sequenced on Illumina HiSeq 2000, 3000, 4000, and 5000 systems. To ensure data consistency, 47 samples were excluded due to factors such as drug interference or sequencing system incompatibility. Consequently, 49 breast cancer tissue samples and four normal breast tissue samples were retained, giving a total of 53 samples for analysis. Resources such as these in the GEO database allow researchers to gain insight into the molecular underpinnings of breast cancer and ultimately to improve diagnosis and treatment options for patients.

2.2. Data pre-processing

Trimmomatic-0.4 was used to filter the raw RNA sequencing data by removing adapter sequences, low-quality reads, and invalid reads [23,24]. The resulting clean reads were further processed to eliminate rRNA residues by comparison against known rRNA sequences in the RNAcentral database.
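The quality-filtering idea behind this step can be illustrated with a minimal Python sketch. It is not the Trimmomatic implementation and does not reproduce the settings used in this study, which are not reported here. It assumes Phred+33 quality encoding, and the function names, the window size of 4, the mean-quality threshold of 20, and the minimum read length of 36 are all illustrative assumptions.

```python
def phred33_scores(quality_string):
    """Convert a FASTQ quality string (Phred+33 encoding assumed) to integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(sequence, quality_string, window=4, min_mean_quality=20):
    """Cut the read at the first window whose mean quality is below the threshold.

    Simplified illustration: the window size and threshold are assumed values,
    not parameters taken from the study.
    """
    scores = phred33_scores(quality_string)
    keep = len(scores)  # default: keep the whole read
    for start in range(0, len(scores) - window + 1):
        window_scores = scores[start:start + window]
        if sum(window_scores) / window < min_mean_quality:
            keep = start  # drop this window and everything after it
            break
    return sequence[:keep], quality_string[:keep]


def filter_reads(reads, min_length=36, **trim_kwargs):
    """Trim each (sequence, quality) pair and keep only reads that remain long enough."""
    kept = []
    for sequence, quality_string in reads:
        trimmed_seq, trimmed_qual = sliding_window_trim(
            sequence, quality_string, **trim_kwargs
        )
        if len(trimmed_seq) >= min_length:
            kept.append((trimmed_seq, trimmed_qual))
    return kept


if __name__ == "__main__":
    # Toy read: six high-quality bases ('I' = Q40) followed by four poor ones ('#' = Q2).
    toy_reads = [("ACGTACGTAC", "IIIIII####")]
    print(filter_reads(toy_reads, min_length=5))
```

In practice the trimming tool also handles adapter removal and paired-end bookkeeping, which this sketch omits; it only shows how a read is shortened once base quality degrades and then discarded if too little of it remains.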
Subsequently, these clean reads were aligned to the human reference genome (hg38) available in the UCSC genome database. External factors that are not biologically relevant, such as sample preparation or the sequencing process, may affect the measured expression in individual samples, so all samples must have a similar range and distribution of expression values. Normalization achieves this, and the R package “edgeR” (version 3.50.3) was used for normalization of the raw data and subsequent data processing. The package ensures that the expression distributions of the samples are consistent throughout the experiment. Accurate normalization of RNA sequencing data is crucial for reliable downstream analysis and ultimately leads to a better understanding of gene expression in various biological contexts [25].

2.3. Analysis of mRNAs differential expression (DE)

To preprocess the RNA sequencing data, Trimmomatic-0.36 was used to filter out low-quality reads, adapter sequences, and undetermined bases. Next, rRNA residues were removed to ensure high-quality clean reads. The clean reads were aligned to the human reference genome (hg38) using TopHat version 2.1.1, with annotation references selected from