Abstract Background/Objectives: The dysregulation of miRNA expression in samples from autistic individuals indicates that they are involved in autism. The participation of miRNAs in paternal epigenetic inheritance has also been reported. This study used bioinformatics tools to analyze the literature and genetic databases to search for miRNAs associated with autism, aiming to explore their suitability to investigate paternal epigenetic inheritance. Methods: Autism-related miRNAs were searched in public databases using bioinformatic tools (miRNA-to-genes analysis). The genes targeted by these autism-related miRNAs, which are common to neurons, sperm, and PBMCs, were identified. Enrichment analyses were performed to identify the biological processes regulated by the candidate miRNAs. Autism-related miRNAs were also identified by an inverse analysis (genes-to-miRNA analysis), starting from autism-related genes. Results: In the miRNA-to-gene analysis, 416 miRNAs involved in autism were found, of which 77 were expressed in sperm, PBMCs, and neurons. From these, 18 were differentially expressed in the brain and in at least one peripheral sample (saliva or blood), suggesting that they might be suitable to be used in the investigation of autism biomarkers and inheritance. In the genes-to-miRNA analysis, 36 miRNAs were identified, from which 9 coincided with the results of direct analysis. Conclusions: Although there is no consensus about miRNAs related to autism, there are candidate miRNAs that show clear potential to be explored as biomarkers. The coincidence in the expression of miRNAs in sperm, neurons, and PBMCs indicates that they are valuable biological samples to study the role of miRNAs in the paternal epigenetic inheritance of autism. Keywords: microRNA, autism, epigenetic inheritance 1. Introduction Autism, referred to in diagnostic manuals as autism spectrum disorder (ASD), is a neurodevelopmental condition characterized by difficulties in social interaction and social communication, as well as by restricted and repetitive patterns of behaviors, interests, or activities [[26]1]. Autism manifestation varies greatly between individuals, as does the level of support required, and may also be accompanied by symptoms typical of other conditions, including intellectual disability, attention deficit hyperactivity disorder (ADHD), and anxiety disorder. Other characteristics may also be present, including language impairment, epilepsy, gastrointestinal problems, motor deficits, and sleep disorders [[27]1,[28]2,[29]3,[30]4,[31]5,[32]6], leading to great complexity. Autism is a multifactorial condition that involves genetic factors, environmental influences, and epigenetic mechanisms [[33]7]. The epigenetic mechanisms include DNA methylation, histone modification, and the activity of small non-coding RNAs. MiRNAs, the focus of this study, are sncRNAs with approximately 22 nucleotides that regulate gene expression through interaction with mRNA [[34]8]. During neurodevelopment, interactions between miRNAs and their target genes regulate important processes, such as neurogenesis, gliogenesis, neuronal migration, neuronal polarization, and dendrite and synapse formation [[35]9]. Dysregulated patterns of miRNA expression in autistic individuals have already been suggested by studies using different types of samples, including post-mortem brain, saliva, blood plasma and serum, peripheral blood monocytes, olfactory mucosa stem cells, lymphoblastoid cell lines, and neurons derived from induced pluripotent stem cells [[36]10,[37]11]. Ozkul and colleagues [[38]12] analyzed the expression of miRNAs in serum samples obtained from 45 autistic children and found 6 miRNAs with low or very low expression compared to the control participants. One of the methods used to study cellular and molecular mechanisms involved in autism and possible biomarker exploration is the collection of peripheral mononuclear blood cells (PBMCs) and its reprogramming into induced pluripotent stem cells [[39]13,[40]14,[41]15,[42]16]. In addition, PBMCs are considered a good peripheral sample to investigate systemic alterations related to autism [[43]17,[44]18,[45]19,[46]20,[47]21,[48]22], including miRNA dysregulation [[49]23,[50]24,[51]25,[52]26,[53]27,[54]28,[55]29]. Despite the large number of studies seeking to comprehend autism etiology, this condition remains obscure, although its high heritability seems clear. Large-scale genetic studies indicate that both rare and common variants are related to the etiology of autism, although common genetic variants are responsible for most cases and are more frequently involved in the heritability of autism [[56]30]. The participation of miRNAs in epigenetic inheritance has also been reported. Animal studies have suggested that miRNAs present in sperm can contribute to paternal epigenetic inheritance of behaviors across generations [[57]31,[58]32,[59]33,[60]34]. Regarding autism paternal inheritance, Ozkul and colleagues [[61]12] compared the expression of the six miRNAs between sperm samples from the father of one of the autistic participants and samples from the control men; they observed that five of them showed reduced expression. Dysregulation of the miRNA expression in sperm can be caused by different environmental factors, such as smoking [[62]35] and early life stress [[63]36], and may have repercussions for the descendants. However, the mechanisms through which this type of inheritance occurs and how environmental factors can influence autism and are transmitted to subsequent generations are still poorly understood. Although data regarding the participation of miRNAs in autism etiology and inheritance is promising and bulk data have been produced, it is still controversial, with studies reporting conflicting information about the role of specific miRNAs. Considering that the studies about the epigenetic inheritance of autism involve a comparison of paternal and child biological samples, and that one of the most used samples are PBMCs, it is important to know the miRNA profile of these samples and how they reflect miRNA signatures of the autistic brain. To address this issue, this study used bioinformatics tools to analyze the literature and different genetic databases to search for miRNAs associated with autism, as well as to explore their expression in PBMCs and sperm, aiming to infer the suitability of these cell types to investigate paternal epigenetic inheritance via miRNAs. 2. Materials and Methods [64]Figure 1 shows a workflow of the methods used in the current study. Figure 1. [65]Figure 1 [66]Open in a new tab Workflow of the methods used in the current study [[67]37,[68]38,[69]39,[70]40,[71]41,[72]42,[73]43,[74]44,[75]45,[76]46, [77]47,[78]48,[79]49,[80]50,[81]51,[82]52,[83]53,[84]54]. 2.1. Compilation of Autism-Related miRNAs Autism-related miRNAs were obtained from miRNet version 2.0 [[85]37] and RNADisease version 4.0 [[86]38], and the PubMed, Web of Science, and Embase databases. Medical Subjective Heading terms related to autism and general terms referring to autism, such as “Autism spectrum disorder”, “Autistic disorder”, and “autism”, were used in combination with the terms “miRNA”, “microRNA”, and “sncRNA” ([87]Table S1). The searches were carried out from December 2022 to February 2023. Non-experimental studies that analyzed previously published databases were also included. The following data were extracted from the articles: DOI, PMID, name of the authors, year of publication, type of biological sample used, statistical test used, characteristics of the participants (number of volunteers in the ASD and control groups; age; inclusion and exclusion criteria), method used to identify miRNAs (RNA sequencing, microarray, PCR, NanoString nCounter, TaqMan low-density array), and differentially expressed miRNAs in samples of autistic individuals compared to the control participants or those suggested as potential biomarkers for this disorder. The names of the mature miRNAs that resulted from these searches were manually standardized according to the classification adopted in miRBase [[88]55] version 22.1; the miRBase Converter package version 1.30 [[89]56] in R (version 4.2.2) and RStudio (version 2023.12.0+369) was also used to convert the names of the miRNAs. The miRNAs absent in miRBase were not included in the subsequent analyses. 2.2. Identification of miRNAs Expressed in Peripheral Blood Mononuclear Cells (PBMCs), Sperm, and Neurons One desirable characteristic for the identification of suitable biomarkers for autism and its paternal inheritance is the coincidence of these miRNAs in peripheral samples, in the central nervous system, and in sperm. The peripheral samples chosen for this analysis were peripheral blood mononuclear cells (PBMCs), since they are naturally obtained when blood is collected to perform general laboratorial and/or genetic analyses of autistic individuals. The identification of miRNAs expressed in each of these three cell types was performed using miTED [[90]39] for PBMC analysis, miRmine [[91]40] and SpermBase [[92]41] for sperm analysis, and data provided by McCall and collaborators [[93]42] for neuron analysis. For all three cell types, only miRNAs with an expression ≥100 RPM were considered expressed. All miRNAs available in SpermBase were considered as expressed in sperm. In addition to the coincident expression in these three cell types, another criterion used to identify possibly relevant miRNA was its expression in the post-mortem brain and in one type of peripheral sample (blood, plasma, serum, PBMCs, or saliva) of autistic individuals. The expression of the miRNA provided by the databases mentioned above was compared between the three cell types to identify miRNAs whose expression coincides between them. This analysis was performed using the R program (version 4.3.2) and RStudio [[94]57], as well as for the intersection of the sets of miRNAs. The packages used were: ”readxl” [[95]58], “dplyr” [[96]59], “tidyr” [[97]60], “DescTools” [[98]61], and “ggvenn” [[99]62]. 2.3. Analysis of Genes Targeted by the Autism-Related miRNAs To confirm whether the compiled miRNAs target genes were relevant to autism, two tools were used: miRDB [[100]43], mirDIP version 5.3.0.1 [[101]44], and miRTarBase version 9.0 [[102]45]. A default score threshold of 50 was adopted to select miRDB target genes. For targets predicted by mirDIP, the minimum score class “high” was selected. After retrieving the target genes, the gene symbols were converted to Ensembl ID using R (version 4.3.2), RStudio, and the package “gprofiler2” (version 0.2.2) [[103]63]. The conversion of the gene symbols to the Ensembl ID was carried out due to the existence of distinct symbols to represent the same gene and the existence of different genes with coincident symbols. For subsequent analysis, the target genes were separated into three groups: (a) all genes retrieved; (b) genes expressed in PBMCs, neurons, and sperm; (c) genes expressed only in neurons and sperm. 2.4. Pathway Enrichment Analysis 2.4.1. Metascape To analyze the biological pathways and diseases in which the genes targeted by the autism-related miRNAs are involved, the Metascape portal, release 3.5, was used [[104]46]. Metascape uses a hypergeometric test for the enrichment analysis. This tool clusters enriched terms into non-redundant groups using the Kappa-test score, and the term with the lowest p-value within each cluster is chosen to represent the cluster. Process enrichment was performed using Gene Ontology (Biological Process), Wikipathways, KEGG, and Reactome ontologies. Disease enrichment was also performed using the DisGeNET database. Three different sets of gene lists were used for enrichment analyses using these tools, as follows: (a) all the genes targeted by each miRNA; (b) target genes expressed in common in sperm and neurons; (c) target genes expressed in common in PBMCs, sperm, and neurons. Thus, a previous analysis of the genes that are specifically expressed in PBMCs, neurons, and sperm was necessary. For this, RNA expression data available in the Human Protein Atlas [[105]47] were obtained for total PBMCs, for inhibitory neurons, and excitatory neurons. Gene expression data in sperm were obtained from SpermBase. For Human Protein Atlas data, genes with an expression ≥1 nTPM (normalized transcripts per million) in inhibitory or excitatory neurons and in PBMCs were considered as expressed; and for SpermBase data, all coding genes were considered as expressed in sperm. The gene symbols in the SpermBase data table were converted to Ensembl ID using R (version 4.3.2), RStudio [[106]57], and the package “gprofiler2” (version 0.2.2) [[107]63]. For the enrichment analysis (a), p-value = 0.01, minimum number of genes in the term = 3, enrichment score = 1.5, and Kappa score = 0.3; standard values defined by Metascape release 3.5 were adopted. The default p-value of 0.01 was adopted because it is a p-value rigorous enough to provide coherent, legitimate, and enriched terms. For the other analyses from (b) to (c), p-value = 10^−4, enrichment score = 1.5, minimum number of genes in the term = 3, and Kappa score = 0.3 were adopted. A stricter p-value was chosen to avoid interpreting enrichment analysis results with marginal p-values, such as 10^−3 or 10^−4 [[108]64]. Within Metascape release 3.5, the “multi-gene list meta-analysis” tool was used to compare process enrichment between the lists of analyses (b) and (c); however, it was not used for analysis (a), since performing enrichment analysis for target genes of all miRNAs would exceed the limit imposed by Metascape release 3.5 (3000 genes). For the analysis (a), the entire genome was used as the background. To avoid results that included processes unrelated to the cells of interest, restricted background lists were adopted in the analyses considering these specific cell types. Therefore, for analysis (b), only genes expressed in common in neurons and sperm were considered for the enrichment background, and for analysis (c), only genes expressed in common in PBMCs, neurons, and sperm were considered as background. 2.4.2. gProfiler In addition to Metascape release 3.5, enrichment analysis was also performed using gProfiler [[109]48], including the databases Gene Ontology (Biological Process), Reactome, WikiPathways, and Human Phenotype Ontology. As pathways with few genes can be difficult to interpret results and present redundancy with larger pathways, terms with fewer than 10 genes were excluded. Furthermore, terms from the biological pathway databases (Gene Ontology, Reactome, and WikiPathways) with more than 500 genes and Human Phenotype Ontology terms with more than 1000 genes were also excluded, since terms containing many genes are overly general and do not contribute significantly to the interpretation of results. As performed for Metascape, the three gene groups (a, b, and c) previously described were analyzed in gProfiler. To correct the p-value in gProfiler, gSCS was used, and a corrected p-value less than 0.05 was adopted. For analysis (a), only annotated genes were included as the background, and for the other analyses, the customized backgrounds described in the previous section were used, although they were customized over annotated genes. To visualize the results of this enrichment analysis, enrichment maps were created using the EnrichmentMap library [[110]49] in Cytoscape software, version 3.10.1 [[111]50]. Maps for biological pathways and maps for phenotypes were performed separately. The Gene Matrix Transposed file was obtained from gProfiler on 22 February 2024. The creation of the enrichment maps was based on the “Pathway enrichment analysis and visualization of omics data using g:Profiler, Gene Set Enrichment Analysis (GSEA), Cytoscape, and EnrichmentMap” protocol [[112]65]. The clustermaker2 app, version 2.3.4, was applied to define clusters using the Markov Clustering Algorithm, and the number of clusters was limited to a maximum of 20. For analysis (a), i.e., for all genes targeted by all candidate miRNAs (described in the previous section), the clusters were defined separately for each miRNA to identify relevant biological pathways regulated by each one individually, and clusters were also defined using the results of all enrichment analyses to compare the miRNA target gene lists. In the clusters defined individually for each miRNA, the term with the lowest adjusted p-value was selected to represent each cluster. In other analyses, the AutoAnnotate app, which uses the WordCloud app to name the clusters, was used, and, when necessary, the name of the cluster was changed to better describe the terms included. The parameters adopted in the creation of the enrichment maps are shown in [113]Table 1. Table 1. Parameters adopted in the EnrichmentMap app using Cytoscape. Enrichment Analysis Biological Pathways Enrichment Map Human Phenotypes Ontology Enrichment Map * (a) all identified target genes of each miRNA OC = 0.5 JC = 0.3 * (b) target genes expressed in common in sperm and neurons OC = 0.5 OC = 0.5 * (c) target genes expressed in common in PBMCs, sperm, and neurons OC = 0.5 OC = 0.5 [114]Open in a new tab OC: overlap coefficient; JC: Jaccard coefficient. 2.5. miRNAs Enrichment Analysis To validate autism-related miRNAs that resulted from the active search, an inverse analysis was carried out, starting with the genes suggested to be associated with autism to then find the miRNAs that target them. Autism-related genes were searched in SFARI [[115]51], in the literature using PubMed, and in the GWAS Catalog [[116]52]. Although GWAS can consider different types of genetic variations, the single nucleotide polymorphisms (SNPs) are the most studied [[117]66]; thus, only GWAS and GWAS meta-analyses that analyzed SNPs were considered. In addition, only associations with a p-value < 5 × 10^−8 were included. The searches were carried out from April 2023 to June 2023. From the list of SNPs found, two criteria were adopted to identify the genes influenced by these autism-related SNPs: the proximity of the SNP and expression quantitative trait loci (eQTL). For proximity analysis, the Variant Effect Predictor [[118]53] was used. In the Variant Effect Predictor tool, genes that contained at least one of the variants in intron regions, exon regions, the 3′ UTR region, or with variants up to 5000 base pairs away from the gene were included. For eQTL analysis, the GTEx (version 8—[119]https://gtexportal.org/home/ (accessed on 29 April 2024)) was accessed to determine whether any of the significant genome variants corresponded to eQTL. Data from all regions of the brain with eQTL data available in GTEX were used: amygdala, anterior cingulate cortex (BA24), caudate nucleus (basal ganglia), cerebellum, cerebellar hemisphere, frontal cortex (BA9), cortex, hippocampus, hypothalamus, nucleus accumbens (basal ganglia), putamen (basal ganglia), and substantia nigra. GTEx and Variant Effect Predictor were accessed on 29 April 2024. Genes available in the SFARI Gene database were downloaded on 2 May 2024. The symbols of the genes from SFARI Gene were converted to Ensembl ID using R (version 4.3.2), RStudio [[120]57], and the “gprofiler2” package(version 0.2.2) [[121]63]. Neurodevelopment-related genes were also obtained from the Gene Ontology ([122]https://geneontology.org/ (accessed on 9 April 2024)), WikiPathways ([123]https://www.wikipathways.org/ (accessed on 9 April 2024)), or Reactome version 88 ([124]https://reactome.org/ (accessed on 9 April 2024)) portals, considering 11 biological pathways (8 biological processes from Gene Ontology, 2 biological pathways from WikiPathways, and 1 biological pathway from Reactome—[125]Tables S2 and S3). The downloads were performed on 9 April 2024. The biological pathways were selected according to their relationship with brain development, neurogenesis, and synapse function. Pathway gene symbols were converted to Ensembl ID using R (version 4.3.2), RStudio, and the “gprofiler2” package (version 0.2.2) [[126]63]. Enrichment analysis was performed using the “enrichMiR” package version 0.99.32 [[127]54] in R (version 4.3.2) and RStudio. All miRNA-gene interactions experimentally validated for humans and available in miRTarBase 9.0 were downloaded [[128]45]. The background gene list included all human protein-coding genes (GRCh38.p14) from Ensembl 111 [[129]67] and was obtained through BioMart. In the “enrichMiR” package, the overlap test (Fisher test) was performed; only genes present in common in the background and in the miRTarBase were considered (14,479 genes) and only miRNAs with at least five experimentally validated interactions were included (2602 miRNAs). An FDR = 0.05 was adopted. The R script used is available in [130]Additional File S1. 3. Results 3.1. Autism-Related miRNAs: miRNA-to-Gene Analysis A total of 749 miRNAs were identified in these studies. These miRNAs were identified in different types of samples, such as serum, plasma, PBMCs, monocytes, lymphoblastoid cells lines, post-mortem brain, cerebellar cortex, saliva, olfactory mucosal stem cells, fibroblasts, neural stem cells, neural progenitor cells, astrocytes, and neurons derived from induced pluripotent stem cells and pineal gland samples. From these 749 miRNAs, 56 had no record in miRBase and were, therefore, not included in the subsequent analyses. After the update according to miRBase and the exclusion of duplicates (the same miRNAs identified in more than one study), 416 different mature miRNAs were obtained ([131]Table S4). The most cited miRNAs were hsa-miR-23a-3p, hsa-miR-451a, and hsa-miR-7-5p. 3.2. miRNAs Expressed in PBMCs, Neurons, and Sperm Considering the criteria adopted, from the 416 autism-related miRNAs identified, 77 were identified as common to PBMCs, neurons, and sperm ([132]Table S5); 17 are expressed in PBMCs and sperm, but not in neurons; 15 are expressed in neurons and sperm, but not in PBMC; and 13 are expressed in PBMCs and neurons but not in sperm; and finally, 214 miRNAs were not identified as expressed in any of the three cell types ([133]Figure 2a). Figure 2. [134]Figure 2 [135]Open in a new tab Number of miRNAs expressed in sperm, neurons, and PBMCs. (a) Total number of miRNAs expressed in common in sperm (miRmine), neurons, and PBMCs, regardless of the type of tissue used in the studies included; (b) number of miRNAs expressed in common in sperm (miRmine), neurons, and PBMCs reported as dysregulated in post-mortem brain or cerebellum and in peripheral samples (blood, plasma, serum, PBMCs, or saliva); (c) number of miRNAs expressed in common in sperm neurons and PBMCs in post-mortem brain or cerebellum samples and in peripheral samples (blood, plasma, serum, PBMCs, or saliva). As described in the Methods section, in addition to expression in the aforementioned cell types, another criterion was adopted for the selection of the most relevant miRNAs as biomarkers for autism and its inheritance: its dysregulation in at least one study that used post-mortem brain or cerebellum and in one study that used viable peripheral samples (blood, plasma, serum, PBMCs, or saliva), i.e., that can be collected in a low or non-invasive way. A total of 51 miRNAs met these criteria, from which 3 are expressed only in neurons ([136]Table 2, [137]Figure 2b), 3 only in sperm ([138]Table 2, [139]Figure 2b), 4 only in PBMCs ([140]Table 2, [141]Figure 2b), 2 in neurons and PBMCs ([142]Table 3, [143]Figure 2b), 10 in PBMCs and sperm ([144]Table 3, [145]Figure 2b), 3 in neuron and sperm ([146]Table 3, [147]Figure 2b), and 8 are not expressed in any of the three cell types (hsa-miR-1277-3p, hsa-miR-33b-5p, hsa-miR-219a-5p, hsa-miR-940, hsa-miR-4742-3p, hsa-miR-4454, hsa-miR-206, and hsa-miR-1248). Table 2. Autism-related miRNAs only expressed in one cell type (PBMCs, neurons, or sperm). Sperm Neurons PBMCs hsa-miR-424-5p hsa-miR-432-5p hsa-miR-451a hsa-miR-619-5p hsa-miR-196a-5p hsa-miR-320a-3p hsa-miR-193b-3p hsa-miR-494-3p hsa-miR-144-3p hsa-miR-664a-3p [148]Open in a new tab Table 3. Autism-related miRNAs expressed in common in two cell types. Sperm and Neurons Sperm and PBMCs Neurons and PBMCs hsa-miR-106a-5p hsa-miR-223-3p hsa-miR-874-3p hsa-miR-379-5p hsa-miR-142-3p hsa-miR-199a-5p hsa-miR-148a-5p hsa-miR-142-5p hsa-miR-146a-5p hsa-miR-15a-5p hsa-miR-29a-3p hsa-miR-29b-3p hsa- miR-29c-3p, hsa-miR-19a-3p hsa-miR-155-5p [149]Open in a new tab According to SpermBase, from the 51 autism-related miRNAs found, 36 are expressed in sperm ([150]Figure 2c). However, a difference in sperm-expressed miRNA was found between Spermbase and miRmine: has-miR-4454, has-miR196a-5p, has-miR-494-3p, has-miR-541a, has-miR-664a-3p, and has-miR-199a-5p were found in SpermBase, but not in miRmine. On the other hand, four miRNAs found in miRmine (hsa-miR-193b-3p, hsa-miR-379-5p, hsa-miR-148a-5p, and hsa-miR-142-5p) were not found in SpermBase. However, 30 miRNAs were found in both databases ([151]Table 4). Table 4. Autism-related miRNAs expressed in sperm according to SpermBase and miRmine. miRNA hsa-miR-106a-5p hsa-miR-7-5p hsa-miR-223-3p hsa-let-7a-5p hsa-miR-142-3p hsa-miR-128-3p hsa-miR-146a-5p hsa-miR-99a-5p hsa-miR-155-5p hsa-miR-15b-5p hsa-miR-15a-5p hsa-miR-484 hsa-miR-29b-3p hsa-miR-19b-3p hsa-miR-29c-3p hsa-miR-106b-5p hsa-miR-19a-3p hsa-miR-221-3p hsa-miR-29a-3p hsa-miR-574-3p hsa-miR-424-5p hsa-miR-93-5p hsa-miR-619-5p hsa-miR-27a-3p hsa-miR-143-3p hsa-miR-335-3p hsa-miR-140-5p hsa-miR-130a-3p hsa-miR-23a-3p hsa-miR-146b-5p [152]Open in a new tab When miRmine and Spermbase data were combined, 18 miRNAs were found expressed in common in the three cell types ([153]Table 5, [154]Figure 2b). Table 5. Autism-related miRNAs included in the pathway enrichment analysis. miRNA References