Abstract

Background: Gastric cancer (GC) is the fifth-leading cause of cancer-related mortality, with a 5-year survival rate of less than 20%. It develops from preneoplastic lesions to adenocarcinoma, but the early genetic alterations involved remain poorly understood. Therefore, we aimed to identify early genetic drivers underlying the development of preneoplastic lesions and the initiation of gastric carcinogenesis.

Methods: We characterized preneoplastic lesions and early gastric adenocarcinoma using 48 samples from 16 patients from Guatemala, a country with a high incidence of GC. We sequenced a panel of 127 genes to identify early genetic drivers and possible actionable targets.

Results: We identified extensive genetic heterogeneity, including single nucleotide and copy number variations. After comparing our data with other studies, we identified TP53 and APC as the most mutated genes in preneoplastic lesions and early GC. The mean tumor mutational burden was higher in diffuse (0.017 mutations/Mb) and intestinal adenocarcinomas (0.015) than in chronic gastritis (0.005), and analysis of the mutational signatures revealed several processes acting at different stages of the disease. Signatures S15 (DNA mismatch repair deficiency) and S03 (homologous recombination deficiency) were more frequent in early adenocarcinoma than in chronic gastritis, intestinal metaplasia, necrosis, tubular adenoma, and atrophy. Notably, 10 of 16 patients (62.5%) had at least one actionable mutation in their preneoplastic lesions or gastric adenocarcinomas.

Conclusions: We show that at the preneoplastic and earliest stages, GC is genetically heterogeneous and presents key cancer-driving mutations that may participate in neoplastic transformation and progression, with 62.5% of patients carrying at least one potentially treatable alteration. This study expands the limited research on early GC and highlights key opportunities for precision medicine in populations with high GC incidence.
Keywords: early cancer drivers, preneoplastic lesions, chronic gastritis, gastric adenocarcinoma, somatic variants, germline variants

Introduction

Gastric cancer (GC) is the fifth-leading cause of cancer-related mortality worldwide [1], and adenocarcinoma accounts for 90% of cases, which are classified as either intestinal or diffuse type according to the Lauren classification system [2]. The disease has four molecular subtypes: tumors positive for Epstein-Barr virus, microsatellite-unstable tumors, genomically stable tumors, and chromosomally unstable tumors, which are used for patient stratification and clinical trials of targeted therapies [3]. Despite significant advances in clinical and pharmacological research, the overall 5-year survival rate is still less than 20%, with even lower rates in developing countries [4].

GC is associated with a variety of environmental and genetic factors. The majority of cases are due to infectious agents such as Helicobacter pylori (H. pylori) and Epstein-Barr virus [5]. Fewer than 3% of cases are hereditary, caused mainly by inactivating germline mutations in the E-cadherin gene (CDH1) [6]. Other models, such as Correa’s cascade, describe the development of GC from precursor lesions, starting with chronic gastritis. The progression continues with atrophic gastritis, followed by intestinal metaplasia, and finally ends in adenocarcinoma [7]. In locally advanced GC, mutations accumulate progressively in genes such as TP53, APC, PIK3CA, and KRAS, and amplification events have been detected in JAK2, CD274 (PD-L1), and PDCD1LG2 (PD-L2), among others [3].

In Central and South America, the mortality rate of GC is one of the highest in the world [8]. Guatemala is among the nations with a notably high incidence of GC (age-standardized rate of 12.2 per 100,000) and high GC mortality (age-standardized rate of 10.7 per 100,000), ranking fifth and fourth in the world, respectively [1].
GC is frequently detected at an advanced stage, and its prevalence, histological subtypes, and treatment response differ among ethnic groups. For instance, diffuse GC is more prevalent in the Mayan than in the Mestizo population [9]. Guatemalan patients with locally advanced GC treated with chemotherapy have shown a median disease-free survival of 5.2 months and a median overall survival of 15.3 months [10]. This has fueled a government effort in Guatemala to implement cancer prevention and treatment policies based on early detection and on collaboration with research efforts [11], a decision that highlights the critical need for enhanced prevention strategies, diagnosis, and disease monitoring with early molecular markers to reduce the impact of GC in this population.

Most GC studies have focused on describing the molecular events of the disease in its advanced stages, with limited information on early molecular events, defined as the first genetic alterations driving the development of cancer, including single nucleotide variations (SNVs), indels, and copy number variations (CNVs). To address this research gap, we aimed to identify early genetic drivers in preneoplastic lesions and early GC in a high-incidence population by sequencing a panel of 127 genes significantly mutated in cancer with next-generation sequencing. We assessed mutations and CNVs, tumor mutational burden (TMB), and mutational signatures and identified potential actionable mutations in 16 patients with multiple samples recruited at the Unit of Gastroenterology at the Department of Internal Medicine at Roosevelt Hospital in Guatemala. In addition, to enhance the significance of our findings, we compared the mutational composition of preneoplastic lesions and early GC between our present study and previous studies to provide a broader perspective on early molecular driving events.
The evidence reported in this paper on the early drivers in preneoplastic lesions and early GC may become critical for defining future therapeutic options.

Materials and methods

Sample information

A total of 48 samples from 16 patients with preneoplastic lesions and gastric adenocarcinoma were selected from the Gastroenterology Unit of the Internal Medicine Department of Roosevelt Hospital in Guatemala (Figure 1). All patients were diagnosed through endoscopic biopsy. Tissue samples were formalin-fixed and paraffin-embedded for histopathological analysis with hematoxylin and eosin staining by pathology specialists, and analytical biopsies for sequencing were preserved in RNAlater and fresh frozen. The Sydney classification system was used to characterize and grade the histopathological features of gastritis, and the Lauren classification was used to categorize gastric adenocarcinoma from biopsies [2, 12]. In addition, H. pylori was evaluated by using Giemsa staining. The inclusion criteria were as follows: (i) patients over 18 years old and diagnosed with preneoplastic lesions and early-stage gastric adenocarcinoma; (ii) patients with primary treatment-naïve tumors; (iii) tumor tissues with cellularity ≥70%; and (iv) patients with complete clinical history. Each patient had at least three samples from chronic gastritis, necrosis, atrophy, intestinal metaplasia, tubular adenoma, or diffuse or intestinal adenocarcinoma (Supplementary Table S1). The study was approved by the Research and Ethics Committees of the National Hospital Carlos N. Arana of Chiquimula, Guatemala (protocol 8311647) and conducted in accordance with the Declaration of Helsinki.

Figure 1. Experimental design. Panel 1: research design. Patient recruitment and approval of the protocol by the ethics committee for the study of preneoplastic lesions and gastric adenocarcinoma. Panel 2: sample preparation.
DNA extraction and library preparation with exome enrichment using a commercial kit based on 127 mutated genes. Panel 3: sequencing of the libraries and identification of SNVs, CNVs, mutational signatures, and MEIs. SNVs of preneoplastic lesions and early GC from other studies were integrated into our analysis [21–24]. Statistical analyses and visualization were carried out using packages in R and Python. SNVs = single nucleotide variants; CNVs = copy number variants; MEIs = mobile element insertions. Created in BioRender.com.

DNA extraction and library preparation

DNA was extracted from fresh frozen tissues using 50 to 200 mg of sample with the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. DNA quantification was performed by fluorometry (Qubit, Invitrogen, USA), DNA integrity was assessed on 0.8% agarose gels, and DNA purity was estimated by spectrophotometry (Implen, Germany). All samples had an A260/280 ratio ≥1.8 and were used for subsequent experiments. Libraries were prepared with an xGen Pan-Cancer Hybridization Panel kit (IDT, Coralville, USA), which targets the 127 most significantly mutated genes in the 12 most prevalent cancer types according to The Cancer Genome Atlas study [13] and covers 800 kb of protein-coding bases. Size distribution of the final libraries was evaluated on the Bioanalyzer by using a High Sensitivity DNA kit (Agilent, USA), and paired-end sequencing was performed on an Illumina HiSeq 2500 for 2 × 150 cycles with a mean depth of 119.9X (± 38.4).

Data preprocessing and identification of somatic mutations

Sequencing reads were aligned to the human genome reference hg19 with BWA-MEM [14], and GATK tools were used for data preprocessing [15]. Single nucleotide variants were called with Mutect2 [16] and annotated with ANNOVAR [17].
Variants were retained if they met all of the following criteria: (i) passed the quality filters (“PASS” flag); (ii) Phred score ≥ 30; (iii) mapping quality ≥ 60; (iv) allelic frequency less than 0.001 in the 1000 Genomes, ESP6500, and gnomAD databases; (v) mutant allelic fraction > 0.03; and (vi) evidence on both strands, with a minimum of two reads on each strand. Driver mutations were defined according to well-known hotspot mutations in COSMIC [18] and cBioPortal [19], and potential driver mutations were defined as those predicted to be deleterious by two of three algorithms (SIFT, PolyPhen2, and MutationTaster). All driver mutations were manually curated by inspection of the BAM files using Integrative Genomics Viewer (IGV) software [20]. The number of SNVs and indels in each sample was normalized by the size of the analyzed region to evaluate the TMB (mutations/Mb).

In addition, to provide a wider molecular perspective on the somatic mutations of early GC, we included mutational data from previous studies. We searched the PubMed database with the terms “gastric cancer,” “stomach cancer,” “early cancer,” “preneoplastic lesion,” and “early adenocarcinoma” for next-generation sequencing GC studies and identified four studies with 15, 43, 8, and 17 samples from the USA, Japan, and Korea [21–24]. Samples of advanced-stage GC in these studies were excluded from the analysis. We then compared the mutational composition described in these reports with our data, focusing on driver genes associated with SNVs and CNVs, as described above.

CNV identification

CNVs were identified by using CNVkit [25] in tumor-only mode under the default parameters.
Variants were filtered based on depth, the 95% confidence interval, and directionality criteria that consider the biological activity of the affected genes, as follows: (i) CNVs with copy number (CN) = 2, the normal genome state, were excluded; (ii) for oncogenes, calls with CN ≤ 2 were excluded and calls with CN > 2 were kept; (iii) for tumor suppressor genes, calls with CN ≥ 2 were excluded and calls with CN < 2 were kept.

Mobile element insertions identification

Mobile element insertions (MEIs) were analyzed by using MELT [26]. Variants were filtered as follows: (i) variants marked as “PASS” with evidence of MEIs in the 5’ and 3’ regions; (ii) variants with ASSESS = 5, evidence of target site duplication (TSD), split reads (SRs), discordant reads, breakpoint site, and insertional mutagenesis; and (iii) variants with coincident SRs in an alternative chromosomal location were kept. The selected variants were visualized and manually curated in IGV with a BED file of RepeatMasker version hg19 obtained from the UCSC Table Browser [27].

Mutational signatures

The filtered SNVs (described above) with an allelic fraction > 0.03 were used for the analysis of mutational signatures. Given the low number of mutations in individual samples, we consolidated the samples into five histopathological groups: chronic gastritis, atrophy, intestinal metaplasia, intestinal adenocarcinoma, and diffuse adenocarcinoma. Necrosis and tubular adenoma were excluded because each consisted of a single sample. Mutational signatures were then evaluated in the context of 96 trinucleotides by using the R package deconstructSigs [28]. To evaluate the distribution of the samples with mutational signatures, an unsupervised hierarchical clustering analysis was performed by calculating Euclidean distances as previously described [29].

Pathway enrichment analysis

Pathway and network enrichment analysis was performed with the DAVID Functional Annotation Tool 6.8 [30] using driver genes and the KEGG pathway database.
Uninformative pathways were eliminated, including general cancer and noncancer diseases such as addictions, development processes, and metabolic, psychiatric, and parasitic diseases. Signaling pathways were then evaluated using Fisher’s exact test with Bonferroni correction for multiple testing. Pathways with P values less than 0.01 after Bonferroni correction were included in the study.

Global actionable alterations

Treatments approved by the Food and Drug Administration (FDA) for actionable genes in all types of cancers were analyzed in the OncoKB database [31] and classified into Tiers according to the ASCO-CAP classification [32]. Only variants that were classified as (i) Tier I, FDA-approved therapy or included in professional guidelines or (ii) Tier II, FDA-approved therapies for different tumor types or investigational therapies were included.

Statistical analysis

Significant differences in the number of mutations and TMB, as well as in the mutational signatures across disease stages, were assessed using the Kruskal–Wallis rank sum test, followed by pairwise comparisons with the Wilcoxon rank sum test and Bonferroni correction for multiple comparisons. P values less than 0.05 were considered statistically significant.

Results

Clinical characteristics

A total of 48 samples were selected from 16 patients with preneoplastic lesions and GC at the Gastroenterology Unit of the Internal Medicine Department of Roosevelt Hospital in Guatemala. Each patient contributed three samples, from chronic gastritis, necrosis, atrophy, intestinal metaplasia, tubular adenoma, or diffuse or intestinal adenocarcinoma (Supplementary Table S1). Most patients with preneoplastic lesions and gastric adenocarcinoma were aged 50 years or older (56.3%), were male (62.5%), and had no family history of cancer (75%). One patient had H. pylori infection (Table 1).

Table 1.
Epidemiological characteristics of patients with preneoplastic lesions and early GC

Characteristic                 Mean ± SD      n (%)
Age
  < 50 years                   41.3 ± 4.8     7 (43.7)
  ≥ 50 years                   68 ± 9.4       9 (56.3)
BMI
  ≤ 18.4 (underweight)         16.5 ± 0       2 (12.5)
  18.5–24.9 (normal)           22.1 ± 2.3     10 (62.5)
  25–29.9 (overweight)         27.2 ± 1.7     4 (25)
H. pylori infection
  Yes                          –              1 (6.3)
  No                           –              15 (93.8)
Sex
  Female                       –              6 (37.5)
  Male                         –              10 (62.5)
FHC
  Yes                          –              4 (25.0)
  No                           –              12 (75.0)

BMI = body mass index, SD = standard deviation, H. pylori = Helicobacter pylori, FHC = family history of cancer.

TMB and mutational signature

Overall, we identified 477 SNVs with a mean of 9.9 per sample, which were categorized into five histopathological groups: diffuse adenocarcinoma with a mean of 14.2 SNVs (range, 8–29), intestinal adenocarcinoma 12.18 (range, 5–22), intestinal metaplasia 10.6 (range, 4–21), atrophy 8.35 (range, 3–20), and chronic gastritis 4.71 (range, 1–13) (Figure 2A). Necrosis and tubular adenoma samples were excluded from this analysis because they represented only single cases.

Figure 2. Tumor mutational burden and mutational signatures in preneoplastic lesions and early gastric adenocarcinomas. (A) Number of mutations; (B) TMB; and (C) mutational signatures of the preneoplastic lesions and early GC segmented by histopathological type. Histopathological types with at least three samples were included in the analysis of number of mutations in (A). TMB = tumor mutational burden; MMR = mismatch repair; HR = homologous recombination.

The mean TMB was 0.012 (mutations/Mb; range, 0.001–0.036), with differences between the histopathological subtypes. Diffuse adenocarcinoma had a mean TMB of 0.017 (range, 0.010–0.036), intestinal adenocarcinoma of 0.015 (range, 0.006–0.027), intestinal metaplasia of 0.013 (range, 0.005–0.026), atrophy of 0.010 (range, 0.003–0.025), and chronic gastritis of 0.005 (range, 0.001–0.016) (Figure 2B).
No statistically significant differences were observed in the number of mutations or the TMB among the histopathological subtypes.

An analysis of mutational signatures was performed to identify possible molecular etiologies of preneoplastic lesions and gastric adenocarcinomas. We computed the mutational signatures by categorizing the samples into their histopathological subtypes and identified five mutational signatures, four of which (S01, S03, S15, and S17) have been previously described in gastric adenocarcinoma (Figure 2C). Mutational signature S01, associated with the aging process, was predominant in almost all samples except in chronic gastritis. Signature S06 (DNA mismatch repair, MMR) was detected in chronic gastritis and tubular adenoma, and signature S17 was found in chronic gastritis and atrophy. S15, associated with DNA MMR, was found in diffuse and intestinal adenocarcinoma, and S03 (homologous recombination deficiency, HRD) was found only in diffuse adenocarcinoma.

An unsupervised hierarchical clustering analysis was conducted to identify the mutational signature programs operating in early gastric carcinogenesis, and this yielded three different groups of mutational signature patterns: Group 1, comprising chronic gastritis, was characterized by signatures S06 and S17 and the lack of S01; Group 2, including intestinal metaplasia, necrosis, tubular adenoma, and atrophy, was defined by a predominant proportion of signature S01; and Group 3 was characterized by a lower proportion of S01 and the presence of S15 in intestinal and diffuse adenocarcinoma, with S03 additionally detected in intestinal adenocarcinoma. After statistical testing with Bonferroni correction, no significant differences were found between the mutational signatures in preneoplastic lesions and early GC.
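The SNV counts, TMB values, and signature inputs reported in this section all rest on the six variant filters listed in the Methods. As a rough illustration only, those criteria can be sketched as a single predicate. The record layout and field names below are hypothetical and are not taken from the authors' pipeline, and the example calls are invented to exercise each rule.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Minimal record for one called variant (hypothetical field names)."""
    filter_flag: str   # caller FILTER value, e.g. "PASS"
    phred: float       # Phred-scaled quality score
    mapq: float        # mapping quality
    max_pop_af: float  # highest allele frequency across 1000 Genomes, ESP6500, gnomAD
    vaf: float         # mutant allelic fraction in the sample
    alt_fwd: int       # supporting reads on the forward strand
    alt_rev: int       # supporting reads on the reverse strand


def passes_filters(v: Variant) -> bool:
    """Apply the six criteria from the Methods; all must hold."""
    return (
        v.filter_flag == "PASS"       # (i) passed the caller quality filters
        and v.phred >= 30             # (ii) Phred score >= 30
        and v.mapq >= 60              # (iii) mapping quality >= 60
        and v.max_pop_af < 0.001      # (iv) rare in population databases
        and v.vaf > 0.03              # (v) mutant allelic fraction > 0.03
        and v.alt_fwd >= 2            # (vi) at least two reads on each strand
        and v.alt_rev >= 2
    )


# Invented example calls, one per failure mode plus one that passes.
calls = [
    Variant("PASS", 45.0, 60.0, 0.0, 0.12, 6, 5),           # meets every criterion
    Variant("PASS", 45.0, 60.0, 0.0, 0.02, 6, 5),           # allelic fraction too low
    Variant("PASS", 45.0, 60.0, 0.0, 0.12, 9, 1),           # only one read on reverse strand
    Variant("weak_evidence", 45.0, 60.0, 0.0, 0.12, 6, 5),  # did not pass caller filters
]
kept = [v for v in calls if passes_filters(v)]
print(len(kept))
```

Expressing the criteria as one conjunction makes it explicit that a variant must satisfy every rule, which is how the Methods list reads; a real pipeline would extract these fields from the annotated VCF rather than from hand-built records.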
Patients with preneoplastic lesions and GC showed genetic heterogeneity

Extensive genetic heterogeneity was detected in preneoplastic lesions and gastric adenocarcinoma in terms of SNVs and CNVs. We identified nine driver mutations in four genes and 49 potential driver mutations in 29 genes (Figure 3, Supplementary Figure S1, and Supplementary Table S2). Driver mutations in SMAD4 were found in the atrophy and tubular adenoma of the same patient (p4, Supplementary Figure S1), in TP53 in intestinal adenocarcinoma (p21, p22, p37) and diffuse adenocarcinoma (p23), in NRAS in intestinal adenocarcinoma (p14), and in ATM in atrophy (p6). One patient had a potential germline mutation in ATM, c.6680G>A (p.Arg2227His), that was validated by Sanger sequencing (p45, Supplementary Figures S1 and S2).

Figure 3. Driver mutations in preneoplastic lesions and in early and advanced GC. The distribution and composition of SNVs identified in this work and in other studies of early lesions and early GC are shown. CNVs were included only for our study. Legends for type of alteration, tissue sample type, TNM grading, sequencing method, and author are shown on the right of the figure. *For the early GC study [24], we included the 20 genes with the highest mutation prevalence. GC = gastric cancer; SNVs = single nucleotide variants; CNVs = copy number variants.

We also compared these results with other data on preneoplastic lesions and early GC to identify early driver mutations that may participate in cancer progression. In gastric hyperplastic polyp (GHP) with high-grade dysplasia, TP53 mutations were found in 25.0% (2/8) of cases, whereas PIK3CA mutations were found in only 12.5% (1/8), in pyloric-type dysplasia cases (Figure 3) [23].
In pyloric gland adenomas (PGA), 10 out of 15 samples with low-grade dysplasia had mutations in the APC, KRAS, and GNAS genes, whereas 5 samples with high-grade dysplasia exhibited mutations in several genes, including APC, CTNNB1, KRAS, GNAS, TP53, CDKN2A, PIK3CA, and EPHA5, but did not exhibit mutations in the triad of APC, KRAS, and GNAS genes [21]. In the premalignant lesion of dysplasia/intraepithelial neoplasia (D/IEN), two genes were frequently mutated, APC in 76% (19/25) of cases and TP53 in 48% (11/25) [22]. In other studies of early GC drivers, mutations were identified in TP53, APC, PIK3CA, ARID1A, and KRAS [24]. Overall, the data showed that TP53 and APC are the most mutated genes in preneoplastic lesions and early GC.

Additionally, we identified 22 CNVs in 13 genes in the samples from this study (Figure 3, Supplementary Figure S1, and Supplementary Table S3). Deletion of VHL was the most common alteration and was detected in five samples (four patients), followed by deletion of ATM and amplification of NOTCH1 in three patients. Interestingly, the sample EDM_0009 (p23, diffuse adenocarcinoma) had a considerable number of CNVs, consisting of seven amplifications and one deletion in VHL. In our study, we classified driver mutations and potential drivers into 6 signaling pathways for CNVs and 10 for SNVs (Supplementary Figure S1). The RAS/MAPK and transcription factor pathways were the most common in CNVs, and cell cycle, chromatin, and transcription were the most prevalent in SNVs.

Furthermore, we evaluated mobile element insertions as an alternative mechanism of molecular pathogenesis in early GC lesions. Potential candidates were manually visualized in IGV using a BED file of RepeatMasker version hg19. However, they were discarded due to lack of evidence of TSD, SRs, discordant reads, or insertional mutagenesis.
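The CNV calls reported above were retained according to the directionality rule given in the Methods: gains are kept only in oncogenes and losses only in tumor suppressor genes. A minimal sketch of that rule follows. The function name, the role labels, and the example calls are hypothetical, the depth and confidence-interval filters are not modelled, and dropping genes without an assigned role is an assumption rather than something the text states.

```python
def keep_cnv(role: str, copy_number: int) -> bool:
    """Directionality filter: keep gains in oncogenes and losses in tumor suppressors."""
    if copy_number == 2:
        # (i) CN = 2 is the normal diploid state and is always excluded
        return False
    if role == "oncogene":
        # (ii) oncogenes: keep only CN > 2
        return copy_number > 2
    if role == "tumor_suppressor":
        # (iii) tumor suppressor genes: keep only CN < 2
        return copy_number < 2
    # Assumption: genes with no assigned role are not retained
    return False


# Invented calls with placeholder gene names: (gene, role, copy number)
calls = [
    ("GENE_A", "oncogene", 5),          # amplified oncogene: kept
    ("GENE_B", "oncogene", 1),          # oncogene loss: excluded
    ("GENE_C", "tumor_suppressor", 1),  # tumor suppressor loss: kept
    ("GENE_D", "tumor_suppressor", 3),  # tumor suppressor gain: excluded
    ("GENE_E", "oncogene", 2),          # normal copy number: excluded
]
kept = [gene for gene, role, cn in calls if keep_cnv(role, cn)]
print(kept)
```

Under this rule a gene's annotated role decides which direction of change is considered biologically meaningful, so the result depends on the role assignments used, which the text does not enumerate.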
Possible therapeutic targets for GC

We performed a specific search in the OncoKB database to identify mutations associated with treatment of GC following the ASCO-CAP guidelines for mutation classification [32]. The search showed that 62.5% (10/16) of patients had at least one actionable somatic variant in preneoplastic lesions or gastric adenocarcinoma, in six genes (Figure 4A and B). Variants in ATM, BRAF, and FBXW7 were associated with sensitivity to olaparib, vemurafenib, and sirolimus, respectively, and variants in TP53, NRAS, and SMAD4 were associated with resistance to venetoclax and EGFR inhibitors (Figure 4C). Three main signaling pathways were detected in these genes: cell cycle (50%), RAS/MAPK (33.3%), and transcription factor (16.7%) (Figure 4C).

Figure 4. Patients with potentially actionable alterations in preneoplastic lesions and gastric adenocarcinoma. (A) Percentage of patients with actionable alterations; (B) percentage of actionable genes; (C) actionable genes and possible treatments according to variants. Pathways and drugs are shown in the left panel, and preneoplastic lesions and GC in the bottom panel.

Discussion

Most studies have focused on the advanced stages of GC, and there is limited research on the earliest genetic alterations that drive the transition from preneoplastic lesions to malignancy, particularly in populations from high-incidence regions in developing countries. Our study provides the first genomic and molecular analysis of preneoplastic lesions and early GC in a Guatemalan population, a high-incidence region with limited prior research.

H. pylori was detected in only 6% of our cases. Consistent with our findings, one large molecular study showed a very low prevalence of H. pylori, but other works have shown higher numbers [3, 5]. In developing countries, GC often presents at a younger age, with early-life H. pylori infection and other factors contributing to this accelerated onset [33].
In Guatemala, the seroprevalence of H. pylori has been reported to reach approximately 70%, although molecular studies on GC are lacking [34]. The detection rate in our study may be influenced by the patients’ ages at diagnosis, with the majority being over 50 years old, as well as by the diagnostic methods employed.

We evaluated the mean TMB to estimate mutagenic exposure and DNA repair defects. Diffuse adenocarcinoma (0.017 mutations/Mb) and intestinal adenocarcinoma (0.015 mutations/Mb) had higher rates than intestinal metaplasia (0.013 mutations/Mb), atrophy (0.010 mutations/Mb), and chronic gastritis (0.005 mutations/Mb). Although a higher TMB was identified in the advanced stages of GC, no statistically significant differences were found between these lesions. In precursor lesions of GC, such as chronic gastritis, the mutation rate was low, whereas advanced stages showed an increased mutation rate and a high TMB, which can be attributed to disease progression [35].

We identified five mutational signatures in preneoplastic and adenocarcinoma samples, four of which (S01, S03, S15, S17) have GC associations [36]. The aging-related S01 signature, caused by 5-methylcytosine deamination at CpG sites, dominated most samples except chronic gastritis. In these inflammatory lesions, the S01 contribution was attenuated by more active processes, including S17 (15% median contribution, linked to oxidative damage) [36] and S06 (85%, MMR deficiency). This likely reflects the inflammatory microenvironment of gastritis, where oxidative stress and epithelial damage promote these signatures over the clock-like accumulation of S01. Furthermore, the progression from chronic inflammation to intestinal metaplasia and neoplasia likely involves a shift from exogenous or inflammatory mutagenesis to endogenous, age-related processes, which may explain the increased prominence of S01 in more advanced lesions.
The MMR-deficient signature S15 was specifically detected in adenocarcinomas, while the oxidative damage-associated S17 appeared in atrophy and chronic gastritis. Intriguingly, while MMR deficiency signatures (S06 and S15) were present in chronic gastritis, tubular adenoma, and diffuse and intestinal adenocarcinoma, we found no underlying MMR gene mutations. This suggests that alternative mechanisms such as MLH1 promoter methylation (reported in 50% of sporadic and 23% of familial GC cases) [37] may drive these mutational patterns. The exclusive presence of S06 in chronic gastritis and tubular adenoma, previously unreported in GC, highlights DNA MMR dysfunction as a potential early event in gastric carcinogenesis, though studies of preneoplastic lesions remain limited. Notably, we detected HRD-associated S03 in intestinal adenocarcinoma, a signature more common in breast and ovarian cancers but previously reported in GC [38]. While one atrophy case had a BRCA1 mutation, the S03 HRD signature appeared in a separate intestinal adenocarcinoma, implying that other HR pathway genes (not covered by our panel) or epigenetic silencing (e.g. of BRCA1 and RAD51C, as in breast cancer [39]) could contribute to this signature.

Unsupervised hierarchical clustering analysis identified three distinct sample groups, indicating that mutagenic processes may operate either independently or in combination at different stages of disease progression. Notably, advanced stages in our study were characterized by the presence of the S15 mutational signature, suggesting that it could be the driving process of carcinogenesis at this stage (Group 3).

Our study also revealed extensive genetic heterogeneity, with 31 genes harboring driver or potential driver mutations and 13 genes with CNVs, especially in SMAD4, TP53, NRAS, and ATM. TP53 and NRAS mutations were found in intestinal and diffuse adenocarcinoma.
In addition, several studies have reported mutations of TP53 in more than 50% of cases of GC [3, 40, 41] and of genes involved in the RAS/MAPK signaling pathway, including KRAS, BRAF, PIK3CA, and NRAS [42–44]. This is consistent with our present work, in which we found four patients with TP53 mutations, which may indicate a chromosomal instability molecular phenotype [3].

Comparative analysis with other studies of preneoplastic lesions and early GC supports our findings. TP53 (25%) and PIK3CA (12.5%) mutations were identified in gastric hyperplastic polyps, suggesting their role in malignant transformation [23]. In addition, pyloric gland adenomas with low-grade dysplasia presented APC, KRAS, and GNAS mutations, while advanced lesions acquired additional alterations in TP53, PIK3CA, and other genes [21]. Other studies have similarly identified APC and TP53 mutations as early events in gastric tumorigenesis, particularly in intestinal-type cancers [22, 24].

TP53 and APC emerge as the most consistently mutated genes across the gastric carcinogenesis spectrum. As a critical tumor suppressor, TP53 maintains genomic stability through cell cycle regulation, DNA repair, and apoptosis induction. Early TP53 mutations may confer a selective advantage in the harsh gastric environment, facilitating proliferation despite genotoxic stress, as evidenced by the S17 mutational signature. This pattern mirrors observations in other malignancies, where TP53 defects frequently initiate preneoplastic progression [45]. Clinically, TP53 mutations correlate with poorer outcomes in GC patients [46, 47]. While APC mutations are best characterized in colorectal cancer (occurring in 80% of cases), they also significantly contribute to gastric tumorigenesis through Wnt pathway activation [48, 49]. These findings underscore the complex molecular landscape underlying GC development and progression.
Hereditary GC represents less than 3% of cases and is primarily driven by germline mutations in CDH1 [6]. In our cohort, we identified one patient (6.3%) with a likely pathogenic germline mutation in ATM (p.R2227H). However, this patient did not have a family history of cancer. Germline mutations in this gene have been reported in other GC studies [50–52], suggesting a possible association with the disease, but the current evidence of increased risk is inconclusive [53].

In terms of CNVs, VHL, ATM, and NOTCH1 were the most frequently affected genes, and the alterations were mainly detected in adenocarcinomas. VHL has been extensively described in clear cell renal carcinoma and hemangioblastomas, associated with hypoxic and angiogenic tumors [54], and proposed as a therapeutic target [55]. The most frequently affected signaling pathways were RAS/MAPK, cell cycle, chromatin remodeling, and transcription factor. In line with this work, these pathways have been described in several studies of locally advanced GC [3]. We evaluated MEIs as an additional mechanism of mutational pathogenesis, but we did not identify any robust evidence of transposon insertion in the target genes covered.

Our findings carry important therapeutic implications. MMR-deficient GCs have elevated TMB, which enhances tumor immunogenicity and correlates with improved response to immunotherapy and better prognosis [56]. Similarly, HRD serves as a predictive biomarker for sensitivity to both platinum-based chemotherapy and immunotherapy. Metastatic GC patients with HRD, including those with diffuse adenocarcinoma, have significantly improved overall survival when treated with platinum agents [57]. The favorable outcomes in HRD cases may be associated with molecular features including high TMB, enhanced immune activity, and microsatellite instability [58].
In addition, we identified at least one potentially actionable genetic variant in 10 of 16 patients with preneoplastic lesions or GC. Variants found in ATM, BRAF, and FBXW7 have been associated with sensitivity to targeted therapy, and variants in TP53, NRAS, and SMAD4 with resistance to it. ATM deficiency has shown sensitivity to olaparib in some GC cell lines [59], with one multicenter study (NCT03829345) in oesophagogastric cancer and gastric adenocarcinoma showing promising results [60]. BRAF-mutated melanoma and other solid tumors (other than GC) are treated with vemurafenib [61, 62]. FBXW7 is associated with sensitivity to sirolimus in ovarian and endometrial cancer. However, TP53 mutations confer poor treatment outcomes and resistance to therapy with venetoclax in patients with acute myeloid leukemia [63], and NRAS and SMAD4 showed resistance to EGFR inhibitors in lung and colorectal cancers [64–66]. Although these findings highlight the potential value of these genes in identifying and developing possible targeted treatments in early GC, caution must be taken in the interpretation of their potential clinical translation.

Limitations of this study include but are not limited to the following. First, the sample size was insufficient to establish statistical associations with epidemiological variables. Due to the limited number of genes analyzed, it was not possible to comprehensively assess the number of mutations, TMB, and mutational signatures in each patient, leading to the grouping of samples by histopathological type only. Second, the lack of molecular testing to identify H. pylori infection and virulence factors prevented a more detailed understanding of the epidemiological role of this pathogen. CNV events were detected by robust bioinformatic methods but were not validated.
Other molecular mechanisms need to be evaluated, for example through epigenetic analysis, which may reveal alternative pathogenic routes in preneoplastic lesions and early GC. Finally, proposed therapy based on mutational data should be treated with caution and requires further independent replication and clinical validation. Despite these limitations, we identified several key early genetic alterations acting in preneoplastic lesions and GC that may promote cancer progression and may thus be used as possible therapeutic targets, as well as for follow-up after surgery, as has been shown in liquid biopsy studies [67, 68].

Conclusions

This work provides a comprehensive genomic characterization of the earliest alterations of preneoplastic and malignant gastric lesions in a high-incidence Latin American population. Our findings reveal that at the preneoplastic and earliest stages, GC is genetically heterogeneous and presents key cancer-driving mutations that may participate in neoplastic transformation and progression, with 62.5% of patients having at least one targetable mutation, which presents a significant therapeutic opportunity. However, the presence of considerable genetic heterogeneity underscores the need for continued molecular characterization efforts. Furthermore, our research highlights the potential for early detection and disease monitoring using techniques such as liquid biopsy analysis. These insights improve our understanding of early gastric carcinogenesis and emphasize the critical importance of precision medicine approaches in combating GC.

Supplementary Material

goaf089_Supplementary_Data (goaf089_supplementary_data.zip, 5.2 MB)

Acknowledgements