Abstract

Background
Gastric cancer (GC) is among the most common forms of cancer affecting the digestive system. This study sought to identify hub genes regulating early GC (EGC) in order to explore their potential for early diagnosis and prognosis of patients.

Methods
We utilized a publicly available dataset from the Gene Expression Omnibus database (GSE55696). Differences in gene expression between EGC and low-grade intraepithelial neoplasia (LGIN) were compared using the limma software. Identified differentially expressed genes (DEGs) were subjected to gene ontology (GO) and pathway enrichment analyses with the DAVID application, and the STRING website and Cytoscape software were used to construct a protein-protein interaction (PPI) network incorporating these DEGs. This network was in turn used to identify hub genes among selected DEGs, which were analyzed with the Kaplan-Meier Plotter database. In addition, Western blotting, qRT-PCR, immunohistochemistry, and UALCAN were all employed to validate the relationship between the expression of these genes and GC patient prognosis.

Results
A total of 482 DEGs were identified, with GO analyses indicating an increase in the expression of genes linked with the development of cancer. Pathway analyses also indicated that these genes play a role in certain cancer-related pathways. The PPI network highlighted four potential hub genes, of which only ICAM1 was linked to a poor GC patient prognosis. This link between ICAM1 and GC patient outcomes was confirmed via UALCAN, Western blotting, immunohistochemistry, and qRT-PCR.

Conclusion
ICAM1 may therefore modulate tumor progression in GC, thus potentially representing a valuable prognostic and diagnostic biomarker of EGC.

Keywords: bioinformatics analysis, gastric cancer, early diagnosis, ICAM1, biomarker

Introduction
Gastric cancer (GC) remains one of the most common forms of digestive system tumors, and the third leading cancer-associated cause of death according to GLOBOCAN 2018.
More than 1,000,000 new GC cases were diagnosed in 2018, and approximately 783,000 people died of the disease, which occurs twice as often in men as it does in women.^1 GC is the third most common form of cancer in China and the second most prominent cause of cancer-associated mortality.^2 This mortality often results from a failure to detect early GC (EGC), as current diagnostic strategies primarily depend upon endoscopic examination, imaging, and serology,^3–5 with most patients already being in the advanced stages of disease when subjected to these analyses. In China, GC has a 5-year survival rate of 35.9%, which is lower than rates in Japan and South Korea (60.3% and 68.9%, respectively). This is largely explained by the much higher rates of EGC diagnosis and detection in Korea and Japan.^6,7 As such, there is a clear need for the more reliable detection of EGC through the use of novel biomarkers of this disease.

A common current strategy for analyzing tumor-associated gene expression depends upon the use of microarray and bioinformatics analytical approaches. Through such strategies, Li et al identified COL4A1 as a potential biomarker of recurrent GC.^8 Yan et al also found COL1A1, MMP2, FN1, TIMP1, SPARC, COL4A1, and ITGA5 to all represent potential GC biomarkers.^9 These biomarker identification strategies, however, largely depend upon comparisons between normal tissue and advanced GC tissue samples, making them of limited utility when guiding EGC diagnosis.
The WHO reclassified gastric cancers in 2010 into low-grade intraepithelial neoplasia (LGIN), high-grade intraepithelial neoplasia (HGIN), EGC, and GC categories, although controversy regarding the definition of these different disease states remains.^10 For example, researchers in Japan posit that as HGIN tumors exhibit dysplasia, they are better classified as instances of EGC.^11 Such discrepancies may further explain the increased rates of EGC detection in Japan. In the present study, to better identify genes associated with the earliest stages of GC differentiation and progression, we classified HGIN as a form of EGC in line with these Japanese criteria, and we then compared gene expression between EGC and LGIN in a publicly available dataset in an effort to detect differentially expressed genes (DEGs) linked to GC patient outcomes.

For this study, we utilized the GSE55696^12 dataset deposited in the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/) database. This dataset incorporated gene expression results from endoscopic biopsy tissue samples from patients diagnosed with LGIN, HGIN, or EGC. After pooling HGIN and EGC data, we used the limma package as a means of identifying DEGs between the EGC and LGIN groups. The resultant DEGs were then subjected to gene ontology (GO) and pathway enrichment analyses to better explore their biological roles. We further generated a protein-protein interaction (PPI) network for these genes to highlight central hub genes. We then used the Kaplan-Meier Plotter (http://kmplot.com/analysis/) database to assess how these hub genes were linked to GC patient outcomes. We additionally assessed the clinical relevance of these hub genes using an online database in an effort to identify potentially novel biomarkers of EGC that may permit earlier patient diagnosis and prognostic planning.
Materials and Methods

Microarray Data
GSE55696 data were downloaded from GEO, which compiles large amounts of publicly available gene expression data, including high-throughput microarray data.^13 The chosen study had employed the GPL6480 Agilent-014850 Whole Human Genome Microarray 4x44K G4112F for its analyses, and included a total of 19 LGIN, 20 HGIN, 19 EGC, and 19 chronic gastritis tissue samples.

DEG Identification
After downloading the GSE55696 series matrix file, we omitted chronic gastritis samples from further analyses owing to their unclear definition, and we combined EGC and HGIN samples prior to comparing this aggregate EGC group to the LGIN group using the limma, impute, and heat map R packages obtained from Bioconductor (http://bioconductor.org/biocLite.R). In cases where there were multiple probes for a single gene, mean values were used. DEGs were those with P < 0.05 in a t-test and |logFC| > 1.

Functional and Pathway Enrichment Analyses
GO analyses allow for exploration of the functional roles of sets of genes,^14 while KEGG analyses allow for exploration of the pathways in which such genes may function.^15 We conducted these two forms of analysis on our identified DEGs using the DAVID (https://david.ncifcrf.gov/) tool, which allowed for comprehensive functional annotation.^16 Enrichment was considered significant when P < 0.05.

Hub Gene Identification
The PPI network was constructed based on predicted interactions in the online STRING database (https://string-db.org/).^17 We uploaded all 482 DEGs identified in the present study to yield an initial PPI network, and then visualized this network using Cytoscape Version 3.7.1. Next, cytoHubba was used to score the network, with the top 10 genes ranked by Degree, Closeness, and Betweenness scores taken as candidate hub genes.
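The hub-gene selection step described above can be illustrated with a short Python sketch that keeps only genes appearing in the top ten of all three rankings. The gene lists are copied from the top-ten rankings reported in Table 3; the helper name `candidate_hub_genes` is hypothetical and is not part of cytoHubba or Cytoscape, and the study itself used an online Venn tool rather than code for this step.

```python
# Top-ten node lists as reported in Table 3 (cytoHubba rankings).
TOP10 = {
    "Degree": ["TLR8", "PTPRC", "CXCR4", "C3", "MMP9",
               "CCL4", "CXCL1", "FPR2", "SAA1", "ICAM1"],
    "Closeness": ["TLR8", "CXCR4", "CXCL1", "MMP9", "PTPRC",
                  "C3", "CCL4", "ICAM1", "PPBP", "CCL20"],
    "Betweenness": ["SPP1", "SST", "DRD2", "PTPRC", "MMP9",
                    "ICAM1", "TLR8", "IL13RA2", "APOE", "MAGEA1"],
}


def candidate_hub_genes(rankings):
    """Return the genes present in every ranking (hypothetical helper)."""
    gene_sets = [set(genes) for genes in rankings.values()]
    return set.intersection(*gene_sets)


hubs = candidate_hub_genes(TOP10)
print(sorted(hubs))  # prints ['ICAM1', 'MMP9', 'PTPRC', 'TLR8']
```

The intersection reproduces the four candidate hub genes named in the Results; a union or a rank-sum rule would give a different, larger candidate set.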
Hub Gene Survival Analyses
To analyze the relevance of identified hub genes to GC patient survival, we separated patients into hub gene-high and -low groups based on median expression levels, and then used the online Kaplan-Meier Plotter database to compare GC patient outcomes.^18 Differences in survival based on hub gene expression were assessed via the log-rank test, with P < 0.05 as the significance threshold. Using this approach we were able to identify those genes associated with a poorer GC prognosis.

ICAM1 Hub Gene Validation in GC Using UALCAN
To confirm the relevance of the identified hub gene ICAM1 in GC, we employed the online UALCAN (http://ualcan.path.uab.edu/) tool, which allows for comparisons of gene expression data and clinical data across 31 forms of cancer.^19 Differences in gene expression were compared between groups via t-tests, with P < 0.05 as the significance threshold, and we explored how ICAM1 expression related to GC patient clinical findings.

qRT-PCR
For qRT-PCR, 30 paired GC and adjacent normal tissue samples surgically collected between 2013 and 2014 at Guangxi Medical University Cancer Hospital were used. RNAiso Plus (9108, TaKaRa, USA) was used for RNA extraction, followed by use of a cDNA reverse transcription kit (RR047A; TaKaRa, USA). Primers used were: ICAM1 forward, 5ʹ-CAGGAGCAACTTCTCCTGC-3ʹ; ICAM1 reverse, 5ʹ-ACCGGAATGACAATGTCCAGGATA-3ʹ.^20 A SYBR Green kit (RR820; TaKaRa, USA) was used for qRT-PCR on an ABI7500 device. Cycle settings were: 30 s at 95°C, then 40 cycles of 15 s at 95°C and 34 s at 60°C. Triplicate samples were used and averaged, with β-actin as a reference control.

Western Blotting
Tissue samples were homogenized in RIPA buffer (p00136, Beyotime Biotechnology, China) containing PMSF (ST506, Beyotime Biotechnology).
Samples were then boiled in loading buffer (5×) (P0015L, Beyotime Biotechnology), after which they were electrophoretically separated on SDS-PAGE 4–10% Bis-Tris gels prior to transfer to PVDF membranes (IPVH00011, Solarbio, China). Next, 5% skim milk powder was used to block membranes, followed by overnight incubation with rabbit monoclonal anti-ICAM1 (1:1000, ab53013; Abcam, Cambridge, UK) or anti-β-actin (1:1000, Sigma, China). Then, a goat anti-rabbit IgG (H + L) antibody (1:1000, A0208, Beyotime Biotechnology) was used to detect primary antibodies, with a ChemiDoc MP system (Bio Rad Laboratories, Inc.) used for chemiluminescent protein visualization.

Immunohistochemistry
Tissue samples from chronic superficial gastritis, LGIN, HGIN, EGC, and GC patients collected between 2018 and 2019 and held in the Guangxi Medical University Cancer Hospital specimen library were used for IHC experiments. The same ICAM1 antibody used for Western blotting was used for IHC at a 1:50 dilution, with a biotin donkey anti-rabbit antibody (AP182B) (1:500; Millipore, Billerica, MA, USA) used for secondary detection. The IHC staining procedure was as follows: samples were warmed at 60°C for 4 h, dewaxed with xylene, rehydrated using an ethanol gradient, treated with EDTA (50×) for antigen repair, subjected to an endogenous peroxidase blocker (PV-6000, BAOXIN BIO, China), and then incubated at 37°C for 10 min prior to rinsing with PBS. Samples were then probed overnight with primary antibodies at 4°C, rewarmed at 37°C for 15 min, rinsed in PBS, and stained with secondary antibody for 20 min at 37°C. DAB was used for development of staining, and samples were then counterstained with hematoxylin, differentiated with hydrochloric acid alcohol, dehydrated, dried, and sealed before analysis.
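For the qRT-PCR readout, the Figure 5 legend defines ΔCt as the ICAM1 Ct value minus the β-actin Ct value, with triplicate wells averaged. A minimal Python sketch of that arithmetic follows; the function name `delta_ct` and all Ct values are hypothetical and are not study data.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference Ct across technical replicates."""
    return mean(target_cts) - mean(reference_cts)


# Hypothetical triplicate Ct values, for illustration only (not study data).
tumor_dct = delta_ct([24.1, 24.3, 24.2], [17.0, 17.1, 16.9])    # ICAM1 vs beta-actin
normal_dct = delta_ct([26.8, 27.0, 26.9], [17.1, 17.0, 17.2])   # ICAM1 vs beta-actin

# A smaller delta-Ct means higher expression, so this comparison asks
# whether ICAM1 is more abundant in the tumor sample than in the normal one.
print(tumor_dct < normal_dct)
```

Because Ct is inversely related to template abundance, the sample with the lower ΔCt is the one with higher ICAM1 expression relative to β-actin.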
Results

DEG Identification
To screen for meaningful biomarkers in EGC, we applied the R limma package with cut-off criteria of p < 0.05 and |logFC| > 1, and detected a total of 482 DEGs when comparing the EGC and LGIN groups (Figure 1A). Of these genes, 270 were upregulated and 212 were downregulated (Table 1). We further generated a heat map of the top 50 DEGs using an appropriate R package (Figure 1B).

Figure 1. DEG selection and hierarchical clustering analysis. (A) DEGs are arranged in a volcano plot, with the vertical and horizontal axes corresponding to logFC (fold change) and -log10 (p value). Green and red dots correspond to DEGs, whereas genes that were not DEGs are represented by black dots. (B) The top 50 DEGs were arranged in a heat map, with genes on the horizontal axis and samples along the vertical axis. DEGs could be divided into cancer and non-cancer groups. Up- and down-regulated DEGs are shown in red and green, respectively.

Table 1.
A Total of 482 DEGs Were Identified from the [73]GSE55696 Dataset, Including 270 Up-Regulated DEGs and 212 Down-Regulated DEGs in EGC Tissues, Compared to LGIN Tissues DEGs Gene Symbol^# Up-Regulated S100A12, S100A8, S100A9, LST1, FPR1, FCGR3A, MNDA, NFE2, FCGR2A, CLEC4E,SELM, CMTM2, ADGRE3, AQP9, CCL4, FCN1, TNIP3, FMNL1, TRIB3, VNN2, TYROBP, BEST1, LRRK2, CCL3, CARD9, CLEC4A, FGR, LILRB3, FAM49A, CFP, CXCR1, G0S2, FAM65B, LILRB2, CMTM7, LINC-PINT, RNASE2, PYGL, FCGR1B, NCF2, CXCR2P1, NCF1, ITIH4, GBP5, IL7R, LY96, SPI1, IL21R, OSM, CSF3R, PPBP, IGSF6, LGALS2, SOD2, CXCR4, DEFA3, ACSS3, ADAM8, SNHG6, TREM1, ADGRE2, CAMP, TNFAIP2, CD28, UAP1L1, APOE,RGS18, HBD, DUOXA1, HBA2, IFITM3, CLEC4D, FADS1, SLC2A6, FSTL3, MMP9, GLT1D1, CXCR2, S1PR4, CXCL2, ICAM1, LINC01410, SIGLEC10, KLHL17, TLR8, PLEK, CTHRC1, CSF2RA, HBA1, PTPRC, RAB42, BCAS4, MMP25, LINC01094, TNFAIP6, HLA-DPB1, PLTP, PAQR5, CCL3L3, MILR1, FKBP10, C5AR1, MFAP2, LOC100507460, MCEMP1, FAM198A, NEU1, RGS1, LOC107987020, TNFRSF10C, ADAMDEC1, PABPC1L, SPP1, TPM2, LAMP3, PTGS2, LY6E, RELL2, ZBTB32, REC8, PTGDS, AIM2, PDPN, SPRR1B,NNMT, EGFL6, SRGN, CXCL1, ITGA4, PDZK1IP1, APOC1, LCN2, HOXC10, HBB, EGR3, LOC284454, HLA-DRB1, BAAT, CCR7, MUCL1, LRGUK, FIGF,LRRC39, PARVB, LILRA2, C3, DPYSL4, CXCL3, KRT23, HOXC13, FPR2, MIR503HG, CCL20, C4B, ALAS2, PP7080, TRAT1, ZBP1, BMP7, LINC01296, PTCHD1, P2RX5, IRX2, CYB5R1, CATSPERB, LOC155060, CD70, UCHL1, LOC339803, SAA1, KRT6B, CTSW, BIRC3, HLA-DRB3, SPNS1, IFI6, SLC22A31, CARD14, ABCA12, KLHL6, ATP6V0A4, KCNIP3, FBXO2, INHBA,TMEM213, CLDN6, VEPH1, PLA2G7, ZNF556, TSPOAP1-AS1, CD72, CYR61, DUOX1, CP, IGHG4, HOXC11, BNIP3, ZBED2, CDIPT-AS1, LYPD2, CD79B,CYP2A7P1, CKLF-CMTM1, CHI3L2, CXCL9, LINC00886, IDO1, SAA4, CHI3L1, ATP6V1C2, PF4, KRT6C, UBD, IFI44L, SYNDIG1, PI3, COL8A1, CLECL1, CHRDL2, CETP, TNFRSF6B, PRAME, HBG1, IGF2BP3, DUOX2, FOXC1, POTEG, FGG, HOXC-AS3, POTED, CXCL13, SMKR1, LGSN, POTEB3, IL19, IL13RA2, HS3ST2, PCOLCE2, APOBEC3A, FXYD4, ACTL8, GTSF1, CD19, 
DSCR8, DDX43, KRT6A, SERPINA5, LY6K, GABRB3, KIAA1324L, CT45A5, ST8SIA6-AS1, SIX1, FOSB, MAGEA9, SERPINA3, OSR2, FOLR1, CSAG1, CTAG2, MAGEA1, DKK1, BPIFB1, OGDHL, SCRG1, BEX2, CRABP2, FAM25A, PDILT, TNNC1

Down-Regulated
CAPNS2, PFN1P2, HNRNPCL1, NACAP1, CHRNB3, SUMO1P3, LOC100507351, MORF4, CSNK1A1L, ACTG1P4, PARP4, ZNF729, NLRP5, ICMT, HSDL2, NUDT12, MYH2, PABPC3, ANP32C, PGAM1P4, SIM1, SPACA1, CAPZA2, PDZD4, NUBPL, GABRR3, KRT35, GPR12, PRSS56, KRT8P41, LOC148709, ODF4, SCRT2, ATP8B5P, EEF1DP4, CXXC4, TAGLN3, CBWD6, UGT2A3, ZNF367, ANXA2P3, CXADRP2, CUBN, CTAGE3P, GALC, KLKB1, PPIAP30, CCDC68, CTAGE6, KRT19P2, MYT1, DRD2, CRYBB3, ITGA2, RTN4RL2, RPS2P45, DOPEY2, LOC100128164, SVOP, ADAMTS19, PCSK1N, MKRN9P, MS4A8, GRXCR1, ZNF259P1, BSN, PCSK1, FAM197Y2P, NMUR2, LOC105370109, FABP5, MMD2, RNF113B, RTP5, OTC, OR2C1, LOC643549, NUDT16P1, RPL23AP32, TRIM49, HTR3D, NOL4, DAGLA, TACR1, ELOVL4, CORIN, SCG5, SCG2, SLC7A8, LOC101927000, MAGEE2, LINC00326, SPAG1, TMPRSS6, CTAGE10P, HNRNPA1P27, GCNT2, RFX6, ADH1A, LOC729652, LYPD4, SCGB1D2, MYSM1, EDN3, INSM1, KCNMB2, CACNG2, KCNH6, FEV, SUCNR1, DSCR4, IL3, CTSG, NHLH2, HMP19, CNTF, ARX, SYT4, ANO8, COL17A1, ARVCF, TMEM155, ADAM7, PROKR2, SCG3, UGT2B10, SCGN, LCE3D, C11orf40, RPRML, LOC729080, SST, OVCH1-AS1, VIL1, ANXA2, OR10H1, MLN, PAX6, ABO, RGS4, GLDN, OR1N2, SHANK1, ALDOC, LRTM2, ABCC11, CHGB, TENM2, UGT2B11, PROX1, GPRC6A, NCKAP5, OR8H1, NPY6R, NR2E1, GSTA3, PLCXD3, TUBB2B, GNG3, HRG, SCGB1D1, HEPACAM2, KCNJ6, CA9, WSCD2, ADH6, GUSBP10, CNDP1, TINAG, CEACAM3, NKX6-3, TM6SF2, FBP2, CASR, SYNPR, KRT20, TMPRSS15, ODF3L1, TEKT4P2, CCL13, TM4SF4, CCL25, LYVE1, AK5, CLDN10, PTPN20, GAST, BTNL8, ZSCAN4, TONSL, TRIM23, C20orf85, FAM189A1, SLC38A11, DACH1, SLC18A1, GCG, NAT2, GPA33, GSTA1, ALPI, A1CF, LHFPL3-AS2, CARTPT, CEACAM6, CHST5, SLC10A2, UPK1B, KLK12, GALNT8

Note: ^#Cut-off criteria were p < 0.05 and |logFC| > 1.0.
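The selection rule behind Table 1 (average multiple probes per gene, then keep genes with p < 0.05 and |logFC| > 1) can be sketched in a few lines of Python. This is an illustration only: the study performed these steps in R with limma, the function names `collapse_probes` and `select_degs` are hypothetical, and the example rows are invented rather than taken from GSE55696.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values):
    """Average expression across probes that map to the same gene."""
    grouped = defaultdict(list)
    for gene, value in probe_values:
        grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


def select_degs(stats, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split (gene, logFC, p_value) rows into up- and down-regulated DEGs."""
    up, down = [], []
    for gene, log_fc, p_value in stats:
        if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return up, down


# Hypothetical rows for illustration only; not values from GSE55696.
example = [
    ("GENE_A", 2.3, 0.001),    # passes both cut-offs, up-regulated
    ("GENE_B", -1.6, 0.01),    # passes both cut-offs, down-regulated
    ("GENE_C", 0.4, 0.0001),   # significant, but |logFC| is too small
    ("GENE_D", 1.9, 0.20),     # large change, but not significant
]
up, down = select_degs(example)
print(up, down)  # prints ['GENE_A'] ['GENE_B']
```

Applying both thresholds together is what separates the red and green points from the black points in a volcano plot such as Figure 1A.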
GO and KEGG Analyses
To clarify the role of these DEGs in the progression of GC, we next predicted their functional roles by conducting GO and KEGG pathway analyses on the 482 DEGs. GO analyses revealed upregulated DEGs to be enriched for terms including “inflammatory response”, “extracellular region”, “immune response”, “extracellular space”, and “neutrophil chemotaxis” (P < 0.05) (Table 2). In contrast, downregulated DEGs were enriched for terms including “hormone activity”, “peptide hormone processing”, “secretory granule”, “fibrinolysis”, and “neuropeptide signaling pathway” (P < 0.05) (Table 2). In the pathway enrichment analyses, upregulated DEGs were shown to be enriched for pathway terms including “cytokine-cytokine receptor interaction”, “chemokine signaling pathway”, and “Staphylococcus aureus infection” (P < 0.05) (Figure 2A), whereas downregulated DEGs were enriched for pathway terms including “chemical carcinogenesis”, “metabolism of xenobiotics by cytochrome P450”, and “drug metabolism - cytochrome P450” (P < 0.05) (Figure 2B).

Table 2.
The Top 15 Significantly Enriched GO Terms of Up/Down-Regulated DEGs

Category  Term                                                        Count  P value
Up-Regulated
BP        Inflammatory response                                       39     1.46E-22
CC        Extracellular region                                        70     8.42E-20
BP        Immune response                                             37     4.18E-19
CC        Extracellular space                                         58     6.39E-16
BP        Neutrophil chemotaxis                                       14     3.47E-12
BP        Chemokine-mediated signaling pathway                        14     9.28E-12
BP        Chemotaxis                                                  16     8.76E-11
MF        Chemokine activity                                          11     5.31E-10
BP        Response to lipopolysaccharide                              16     5.73E-09
BP        Cellular defense response                                   11     8.94E-09
CC        Plasma membrane                                             91     4.69E-08
BP        Innate immune response                                      23     8.17E-08
BP        Cell surface receptor signaling pathway                     17     9.24E-07
BP        Cell chemotaxis                                             9      2.40E-06
BP        Defense response to bacterium                               12     4.11E-06
Down-Regulated
MF        Hormone activity                                            7      2.12E-04
BP        Peptide hormone processing                                  4      3.81E-04
CC        Secretory granule                                           6      5.89E-04
BP        Fibrinolysis                                                4      8.74E-04
BP        Neuropeptide signaling pathway                              6      0.002309
BP        Cell-cell signaling                                         9      0.002309
BP        Type B pancreatic cell differentiation                      3      0.002847
BP        Metabolic process                                           7      0.004416
BP        Feeding behavior                                            4      0.004954
CC        Secretory granule lumen                                     3      0.005187
CC        Extracellular space                                         23     0.005855
BP        Chemical synaptic transmission                              8      0.006493
BP        G-protein coupled receptor signaling pathway                17     0.00795
MF        Serine-type endopeptidase activity                          8      0.008681
BP        Positive regulation of neural precursor cell proliferation  3      0.009102

Figure 2. KEGG pathways enriched for DEGs. P-values are represented by the coloration of individual points in the scatterplot, with point size corresponding to the number of counts. (A) KEGG pathways associated with up-regulated DEGs. (B) KEGG pathways associated with down-regulated DEGs. Scatterplots were constructed using the online website (http://www.ehbio.com/ImageGP/index.php/Home/Index/index.html).

PPI Network Analysis
It is well known that a gene can directly or indirectly affect another gene to exert a biological role.
Having characterized the functions of these DEGs, we next focused on those playing a pivotal role. We used the STRING and Cytoscape tools to construct and visualize a PPI network for these DEGs. The final network was made up of 316 nodes and 1816 edges following irrelevant node deletion (Figure 3A), with 166 DEGs not falling within this network. The cytoHubba plugin was next used to score the entire network, with the top 10 genes based on Degree, Closeness, and Betweenness identified as potential hub genes (Table 3). Based on the overlap between these three rankings (Figure 3B), we identified 4 overlapping potential hub genes: MMP9 (matrix metallopeptidase 9), ICAM1 (intercellular adhesion molecule 1), TLR8 (toll-like receptor 8), and PTPRC (protein tyrosine phosphatase receptor type C).

Figure 3. PPI network and candidate hub genes. (A) PPI network of identified DEGs. Up- and down-regulated DEGs are represented by red and blue nodes, respectively, with node size corresponding to logFC. Red and blue lines correspond to positive and negative correlations, respectively. (B) Venn diagram of four candidate hub genes from the top ten as determined according to Degree, Betweenness, and Closeness parameters. An online tool was used for Venn diagram construction (http://bioinformatics.psb.ugent.be/cgi-bin/Liste/Venn/calculate_venn.htpl).

Table 3.
The Top Ten Genes in Each of the Three Main Scores

Degree               Closeness                Betweenness
Node Name  Score     Node Name  Score         Node Name  Score
TLR8       62        TLR8       162           SPP1       9024.657
PTPRC      55        CXCR4      159.36667     SST        7393.812
CXCR4      53        CXCL1      159.11667     DRD2       6805.501
C3         52        MMP9       158.95        PTPRC      6554.313
MMP9       52        PTPRC      158.63333     MMP9       6510.665
CCL4       51        C3         157.35        ICAM1      6220.507
CXCL1      50        CCL4       157.08333     TLR8       5657.862
FPR2       48        ICAM1      156.2         IL13RA2    4956.119
SAA1       47        PPBP       153.31667     APOE       4947.338
ICAM1      46        CCL20      153.30952     MAGEA1     4312.797

Hub Gene Survival Analyses
For clinical relevance, we next sought hub genes that may promote GC and could be used for survival prediction. We used the Kaplan-Meier Plotter database to explore how these hub genes related to GC patient survival. We found all 4 genes to be significantly differentially expressed (p<0.05), but only elevated ICAM1 expression was linked to a poorer GC patient prognosis (HR=1.51, 95% CI: 1.26–1.81, P=9.6e-06) (Figure 4).

Figure 4. GC patient survival analyses for the four candidate hub genes. Red and black lines represent patients with high and low expression levels of the indicated hub gene, respectively. (A) ICAM1 survival analysis in patients with GC. (B) PTPRC survival analysis in patients with GC. (C) MMP9 survival analysis in patients with GC. (D) TLR8 survival analysis in patients with GC.

The Association Between Elevated ICAM1 and GC Patient Clinicopathological Features
Although the bioinformatics analyses pointed to ICAM1, verification in patient samples was still needed to guide clinical research. Using IHC, we found ICAM1 expression in GC tissues to be primarily localized to cell membranes (Figure 5A and B). We further found ICAM1 to be highly expressed in GC tissue samples (Figure 5B and C), consistent with TCGA findings (p<0.05; Figure 5D).
To determine whether ICAM1 expression was linked to the clinical features of GC, we next explored this relationship in the online UALCAN database. We found ICAM1 expression to be unrelated to gender (p=7.740800E-01; Figure 5E) or race (p=6.050600E-01, p=4.357800E-01, p=9.113600E-01) (Figure 5F).

Figure 5. ICAM1 expression in GC patients. (A) Immunohistochemical localization of ICAM1 in GC. (B) Verification of ICAM1 protein expression in GC by Western blot. (C) ICAM1 levels were detected in 30 pairs of GC tissues by qRT-PCR, revealing significantly higher ICAM1 expression in GC tissues relative to paracancerous tissues. ΔCt values were determined by subtracting the β-actin Ct value from the ICAM1 Ct value. A smaller ΔCt value indicates higher expression. (D) ICAM1 expression in primary gastric tumor tissue compared to normal tissues. (E) ICAM1 expression in male and female GC patients. (F) ICAM1 expression in different races. STAD: stomach adenocarcinoma. * means p<0.05. *** means p<0.0001.

Discussion
GC remains one of the most common types of cancer in China,^21 and yet owing to its lack of early-stage symptoms patients are often not diagnosed until the disease is already significantly advanced, leading to a poor prognosis.^22 In a retrospective analysis of GC patients in Japan, a 5-year survival rate of 71.1% was detected among 118,367 GC patients following surgical resection, with respective 5-year survival rates for those with pathological IA, IB, II, IIIA, IIIB, and IV GC of 91.5%, 83.6%, 70.6%, 53.6%, 34.8%, and 16.4%.^23 This suggests that the best means of improving GC patient outcomes is by detecting GC while it is in its early stages. By better exploring the molecular mechanisms governing the development of GC, it will be possible to better detect and diagnose EGC.
Previous research has highlighted a number of genes associated with GC.^24–26 The specific molecular mechanisms governing this disease, however, are complex, and as such further data mining efforts are needed to identify relevant candidate biomarkers for GC diagnosis. In this study, we used the GSE55696 dataset to explore gene expression in EGC, comparing differences between EGC (pooled with HGIN) and LGIN and thereby identifying 270 and 212 up- and down-regulated genes, respectively. We conducted GO analyses to gain better functional insights into the roles of the DEGs detected through this approach. This strategy revealed upregulated DEGs to be enriched for GO terms including “inflammatory response”, “extracellular region”, “immune response”, “extracellular space”, and “neutrophil chemotaxis”. Inflammation is a very important factor linked with GC progression and prognosis.^27,28 The immune response is also associated with GC to some degree.^29 We further found downregulated DEGs to be enriched for GO terms such as “hormone activity”, “peptide hormone processing”, “secretory granule”, “fibrinolysis”, and “neuropeptide signaling pathway”. Neuropeptides are known to drive oncogenesis in response to inflammation through enhanced proliferation of epithelial cells.^30 Fibrinolysis is associated with D-dimer production, with Lan et al having observed higher D-dimer levels in the plasma of GC patients relative to healthy controls in a manner correlated with depth of invasion, tumor size, lymph node metastasis, and TNM stage.^31 In our pathway enrichment analysis, we found upregulated DEGs to be involved in “cytokine-cytokine receptor interaction”, “chemokine signaling pathway”, and “Staphylococcus aureus infection”.
Chemokines are known to control cell migration in the context of inflammation, and prolonged inflammation can create a microenvironment conducive to the growth of tumors.^32,33 DEGs downregulated in this study were linked to “chemical carcinogenesis”, “metabolism of xenobiotics by cytochrome P450”, and “drug metabolism - cytochrome P450”. This is consistent with previous research.^34–36 These functional enrichment analyses clarified the potential functions of these DEGs in the development of GC and underscored the value of identifying hub genes among them.

Using a PPI network, we identified 4 candidate hub genes among these DEGs: MMP9, ICAM1, TLR8, and PTPRC. These genes were more highly expressed in EGC tissue samples than in LGIN samples. Of these genes, only ICAM1 was significantly associated with a poor GC patient prognosis, suggesting it may serve an oncogenic role. This finding suggests that ICAM1 may regulate the occurrence of GC, directly or indirectly, and may be usable as a biomarker for the diagnosis of EGC, although verification in a larger sample is needed. Given this potential role of ICAM1 in the occurrence of GC, we were also interested in whether ICAM1 could be used as a marker for evaluating the prognosis of patients with EGC.

ICAM1, also known as CD54,^37 is an immunoglobulin superfamily (IGSF) member containing immunoglobulin-like domains of 90–100 amino acids in length.^38 ICAM1 is a key adhesion molecule that can vary substantially in size according to its glycosylation status.^39,40 It is also a key regulator of tumor progression, and it has been found to be upregulated in many tumor types including breast, kidney, and pancreatic cancers.^41–43 Consistent with this previous work, our findings showed significantly increased ICAM1 levels in EGC relative to LGIN.
ICAM1 plays tissue type-specific roles, with positive ICAM1 expression in breast cancer being negatively correlated with tumor size, lymph node metastasis, and tumor invasion.^44 In contrast, Huang et al found IL-35 to drive PDAC metastasis via promoting ICAM1 overexpression,^45 and Di et al determined that inhibiting ICAM1 significantly impaired breast cancer cell metastatic activity.^42,43 In this study, we found that high ICAM1 expression was evident in GC patient tissues, and was linked with a poorer GC patient prognosis. This expression was not associated with patient gender or ethnicity. This suggests ICAM1 has the potential to be used as an independent prognostic biomarker of EGC, in addition to playing a role in disease progression. Confirming this, however, will require diagnostic testing in a large number of samples.

In this report, we identified 482 genes differentially expressed between EGC and LGIN tumor samples, among which ICAM1 was prominent. Enrichment analyses revealed the identified DEGs to be associated with pathways known to be relevant in cancer. Using a PPI network, we identified ICAM1 as a gene associated with GC patient prognosis and survival, with elevated expression of ICAM1 being present in GC patients regardless of race or gender. This suggests that ICAM1 may play a role in GC progression, and may be a valuable early biomarker with diagnostic and prognostic relevance.

Study Limitations
This study has several limitations, including the limited number of samples analyzed and the lack of detail regarding the mechanisms underlying ICAM1 expression in GC. Future studies focused on these underlying mechanisms in a larger cohort of samples are thus needed.

Acknowledgment