Abstract

Background

   Gastric cancer (GC) is among the most common forms of cancer affecting
   the digestive system. This study sought to identify hub genes
   regulating early GC (EGC) in order to explore their potential for early
   diagnosis and prognosis of patients.

Methods

   We utilized a publicly available dataset from the Gene Expression
   Omnibus database ([38]GSE55696). Gene expression in EGC and low-grade
   intraepithelial neoplasia (LGIN) was compared using the limma software.
   Identified differentially expressed genes (DEGs) were subjected to gene
   ontology (GO) and pathway enrichment analyses with the DAVID
   application, and the STRING website and Cytoscape software were used to
   construct a protein-protein interaction (PPI) network incorporating
   these DEGs. This network was in turn used to identify hub genes among
   selected DEGs, which were analyzed with the Kaplan-Meier Plotter
   database. In addition, Western blotting, qRT-PCR, immunohistochemistry,
   and UALCAN were all employed to validate the relationship between the
   expression of these genes and GC patient prognosis.

Results

   A total of 482 DEGs were identified, with GO analyses indicating an
   increase in the expression of genes linked with the development of
   cancer. Pathway analyses also indicated that these genes play a role in
   certain cancer-related pathways. The PPI network highlighted four
   potential hub genes, of which only ICAM1 was linked to a poor GC
   patient prognosis. This link between ICAM1 and GC patient outcomes was
   confirmed via UALCAN, Western blotting, immunohistochemistry, and
   qRT-PCR.

Conclusion

   ICAM1 may therefore modulate tumor progression in GC, thus potentially
   representing a valuable prognostic and diagnostic biomarker of EGC.

   Keywords: bioinformatics analysis, gastric cancer, early diagnosis,
   ICAM1, biomarker

Introduction

   Gastric cancer (GC) remains one of the most common forms of digestive
   system tumors, and was the third leading cause of cancer-associated
   death according to GLOBOCAN 2018. More than 1,000,000 new GC cases were
   diagnosed in 2018, with 783,000 people having died of the disease,
   which occurs twice as often in men as it does in women.[39]^1 GC is the
   third most common form of cancer in China and the second most prominent
   cause of cancer-associated mortality.[40]^2 This mortality often
   results from a failure to detect early GC (EGC), as current diagnostic
   strategies primarily depend upon endoscopic examination, imaging, and
   serology,[41]^3^–[42]^5 with most analyzed patients already being in
   the advanced stages of disease when subjected to these analyses. In
   China, GC has a 5-year survival rate of 35.9%, which is lower than
   rates in Japan and South Korea (60.3% and 68.9%, respectively). This is
   largely explained by the much higher rates of EGC diagnosis and
   detection in Korea and Japan.[43]^6^,[44]^7 As such, there is a clear
   need for more reliable detection of EGC through the use of novel
   biomarkers of this disease.

   A common current strategy for analyzing tumor-associated gene
   expression depends upon the use of microarray and bioinformatics
   analytical approaches. Through such strategies, Li et al identified
   COL4A1 as a potential biomarker of recurrent GC.[45]^8 Yan et al also
   found COL1A1, MMP2, FN1, TIMP1, SPARC, COL4A1, and ITGA5 to all
   represent potential GC biomarkers.[46]^9 These biomarker identification
   strategies, however, largely depend upon comparisons between normal
   tissue and advanced GC tissue samples, making them of limited utility
   when guiding EGC diagnosis. The WHO reclassified gastric cancers in
   2010 into low-grade intraepithelial neoplasia (LGIN), high-grade
   intraepithelial neoplasia (HGIN), EGC, and GC categories, although
   controversy regarding the definition of these different disease states
   remains.[47]^10 For example, researchers in Japan posit that as HGIN
   tumors exhibit dysplasia, they are better classified as instances of
   EGC.[48]^11 Such discrepancies may further explain the increased rates
   of EGC detection in Japan. In the present study, to better identify
   genes associated with the earliest stages of GC differentiation and
   progression, we classified HGIN as a form of EGC in line with these
   Japanese criteria, and we then compared gene expression between EGC and
   LGIN in a publicly available dataset in an effort to detect
   differentially expressed genes (DEGs) linked to GC patient outcomes.

   For this study, we utilized the available [49]GSE55696[50]^12 dataset
   uploaded in the Gene Expression Omnibus (GEO,
   [51]https://www.ncbi.nlm.nih.gov/) database. This dataset incorporated
   gene expression results from endoscopic biopsy tissue samples from
   patients diagnosed with LGIN, HGIN, or EGC. After pooling HGIN and EGC
   data, we used the limma package to identify DEGs between the
   EGC and LGIN groups. The resultant DEGs were then subjected to gene
   ontology (GO) and pathway enrichment analyses to better explore their
   biological roles. We further generated a protein-protein interaction
   (PPI) network for these genes to highlight central hub genes. We then
   used the Kaplan-Meier Plotter ([52]http://kmplot.com/analysis/)
   database to assess how these hub genes were linked to GC patient
   outcomes. We additionally assessed the clinical relevance of these
   hub genes using an online database in an effort to identify
   potentially novel biomarkers of EGC that may permit earlier patient
   diagnosis and prognostic planning.

Materials and Methods

Microarray Data

   [53]GSE55696 data were downloaded from GEO, which compiles large
   amounts of publicly available gene expression data, including
   high-throughput microarray data.[54]^13 The chosen study had employed
   the [55]GPL6480 Agilent-014850 Whole Human Genome Microarray 4x44K
   G4112F for their analyses, and included a total of 19 LGIN, 20 HGIN, 19
   EGC, and 19 chronic gastritis tissue samples.

DEG Identification

   After downloading the [56]GSE55696 series matrix file, we omitted
   chronic gastritis samples from further analyses owing to their unclear
   definition, and we combined EGC and HGIN samples prior to comparing
   this aggregate EGC group to the LGIN group using the limma, impute, and
   heat map R packages obtained from Bioconductor
   ([57]http://bioconductor.org/biocLite.R). Where multiple probes mapped
   to a single gene, their mean value was used. DEGs were defined as
   genes with P < 0.05 in a t-test and |logFC| > 1.
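
   As an illustration only, the probe-averaging step and the DEG cut-offs
   described above can be sketched in Python. This is not the authors'
   code (the analysis itself was run with R packages); the function names
   and the toy values below are hypothetical and invented for the example.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_rows):
    """Average expression values across probes that map to the same gene.

    probe_rows: iterable of (gene_symbol, [value per sample]).
    Returns {gene_symbol: [mean value per sample]}.
    """
    grouped = defaultdict(list)
    for gene, values in probe_rows:
        grouped[gene].append(values)
    return {
        gene: [mean(column) for column in zip(*rows)]
        for gene, rows in grouped.items()
    }


def select_degs(stats, p_cutoff=0.05, logfc_cutoff=1.0):
    """Split genes into up- and down-regulated DEGs.

    stats: {gene_symbol: (logFC, p_value)}.
    A gene is kept when p < p_cutoff and |logFC| > logfc_cutoff.
    """
    up, down = [], []
    for gene, (logfc, p_value) in stats.items():
        if p_value < p_cutoff and abs(logfc) > logfc_cutoff:
            (up if logfc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Hypothetical toy input (not data from GSE55696).
toy_probes = [
    ("GENE_A", [2.0, 4.0]),
    ("GENE_A", [4.0, 6.0]),
    ("GENE_B", [1.0, 1.0]),
]
toy_stats = {
    "GENE_A": (1.8, 0.001),   # passes both cut-offs, up-regulated
    "GENE_B": (-1.2, 0.03),   # passes both cut-offs, down-regulated
    "GENE_C": (0.4, 0.001),   # fails the |logFC| cut-off
    "GENE_D": (2.5, 0.20),    # fails the P value cut-off
}
collapsed = collapse_probes(toy_probes)
up_genes, down_genes = select_degs(toy_stats)
```

   In this toy input, only GENE_A and GENE_B satisfy both criteria; the
   other two genes each fail one threshold.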

Functional and Pathway Enrichment Analyses

   GO analyses allow for exploration of the functional roles of sets of
   genes,[58]^14 while KEGG analyses allow for exploration of the pathways
   in which such genes may function.[59]^15 We conducted these two forms
   of analyses on our identified DEGs using the DAVID
   ([60]https://david.ncifcrf.gov/) tool, which allowed for comprehensive
   functional annotation.[61]^16 Significant enrichment was said to be
   evident when P < 0.05.
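
   The idea behind term enrichment can be illustrated with a plain
   hypergeometric tail probability. This is a simplified stand-in rather
   than the DAVID procedure itself (DAVID reports a modified Fisher's
   exact P value, the EASE score); the function name, parameter names, and
   counts below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, list_size, hits):
    """Probability of seeing at least `hits` annotated genes in a gene list.

    population: number of background genes
    annotated:  background genes carrying the term
    list_size:  number of genes in the submitted list (e.g. the DEGs)
    hits:       genes in the list carrying the term
    """
    upper = min(annotated, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical toy counts: 20 background genes, 5 carry the term,
# and 3 of the 4 genes in the list carry it.
p_toy = hypergeom_enrichment_p(population=20, annotated=5, list_size=4, hits=3)
is_enriched = p_toy < 0.05  # the significance threshold used in this study
```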

Hub Gene Identification

   The PPI network was constructed based on predicted interactions in the
   online STRING database ([62]https://string-db.org/).[63]^17 We uploaded
   all 482 DEGs in the present study to yield an initial PPI, and then
   visualized this network using Cytoscape Version 3.7.1. Next, cytoHubba
   was used to rate the network, with the top 10 genes rated according to
   their Degree, Closeness, and Betweenness scores being the candidate hub
   genes.
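
   To make the three network scores concrete, the following minimal sketch
   computes them on a small, invented graph. It is not the cytoHubba
   implementation: closeness is assumed here to be the sum of reciprocal
   shortest-path distances, betweenness is computed with Brandes'
   algorithm, and all node names are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_scores(graph):
    """Number of direct interaction partners per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_scores(graph):
    """Sum of 1/distance to every other reachable node (assumed definition)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        scores[source] = sum(1.0 / d for d in dist.values() if d > 0)
    return scores


def betweenness_scores(graph):
    """Shortest-path betweenness (Brandes) for an unweighted, undirected graph."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted from both directions.
    return {v: value / 2 for v, value in score.items()}


def top_n(scores, n):
    """Names of the n highest-scoring nodes (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return {node for node, _ in ranked[:n]}


# Hypothetical toy network, not the STRING network from this study.
toy_graph = build_graph([("HUB", "A"), ("HUB", "B"), ("HUB", "C"), ("C", "D")])
degree = degree_scores(toy_graph)
closeness = closeness_scores(toy_graph)
betweenness = betweenness_scores(toy_graph)
candidates = top_n(degree, 2) & top_n(closeness, 2) & top_n(betweenness, 2)
```

   The study applied the same idea with the top 10 genes per score; the
   toy example uses the top 2 only because the graph has five nodes.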

Hub Gene Survival Analyses

   To analyze the relevance of identified hub genes to GC patient
   survival, we separated patients into hub gene-high and -low groups
   based on median expression levels, and then used the online
   Kaplan-Meier Plotter database to compare GC patient outcomes.[64]^18
   Differences in survival based on hub gene expression were assessed via
   the log-rank test, with P<0.05 as the significance threshold. Using this
   approach we were able to identify those genes associated with a poorer
   GC prognosis.
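
   The median split and log-rank comparison described above can be
   sketched as follows. This is a minimal illustration, not the
   Kaplan-Meier Plotter implementation; the helper names and the toy
   cohort are hypothetical, and patients whose expression equals the
   median are assumed to fall in the low group.

```python
from math import erfc, sqrt
from statistics import median


def median_split(expression):
    """Label each patient 'high' if expression exceeds the cohort median."""
    cut = median(expression.values())
    return {
        pid: ("high" if value > cut else "low")
        for pid, value in expression.items()
    }


def logrank_test(records):
    """Two-group log-rank test.

    records: list of (time, event, group), with event = 1 for death and
    0 for censoring, and group in {"high", "low"}.
    Returns (chi_square, p_value) for 1 degree of freedom.
    """
    event_times = sorted({t for t, event, _ in records if event == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for time, _, _ in records if time >= t)
        n_high = sum(1 for time, _, g in records if time >= t and g == "high")
        d = sum(1 for time, e, _ in records if time == t and e == 1)
        d_high = sum(
            1 for time, e, g in records if time == t and e == 1 and g == "high"
        )
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi_square = observed_minus_expected ** 2 / variance
    p_value = erfc(sqrt(chi_square / 2))  # chi-square upper tail, 1 df
    return chi_square, p_value


# Hypothetical toy cohort: expression values and (months, event) follow-up.
expression = {"p1": 9.0, "p2": 8.0, "p3": 7.0, "p4": 3.0, "p5": 2.0, "p6": 1.0}
follow_up = {
    "p1": (1, 1), "p2": (2, 1), "p3": (3, 1),
    "p4": (4, 1), "p5": (5, 1), "p6": (6, 1),
}
groups = median_split(expression)
records = [(t, e, groups[pid]) for pid, (t, e) in follow_up.items()]
chi_square, p_value = logrank_test(records)
```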

ICAM1 Hub Gene Validation in GC Using UALCAN

   To confirm the relevance of the identified hub gene ICAM1 in GC, we
   employed the online UALCAN ([65]http://ualcan.path.uab.edu/) tool that
   allows for comparisons of gene expression data and clinical data across
   31 forms of cancer.[66]^19 Differences in gene expression were compared
   between groups via t-tests, with P<0.05 as the significance threshold,
   and we explored how ICAM1 expression related to GC patient clinical
   findings.

qRT-PCR

   For qRT-PCR, 30 paired GC and adjacent normal tissue samples surgically
   collected from 2013–2014 at Guangxi Medical University Cancer Hospital
   were used. RNAiso Plus (9108; TaKaRa, USA) was used for RNA extraction,
   followed by use of a cDNA reverse transcription kit (RR047A; TaKaRa,
   USA). Primers used were: ICAM1 forward, 5ʹ‐CAGGAGCAACTTCTCCTGC‐3ʹ;
   ICAM1 reverse, 5ʹ‐ACCGGAATGACAATGTCCAGGATA‐3ʹ.[67]^20 A SYBR Green kit
   (RR820; TaKaRa, USA) was used for qRT-PCR on an ABI7500 device. Cycle
   settings were: 30 s at 95°C, then 40 cycles of 15 s at 95°C and 34 s at
   60°C. Triplicate samples were used and averaged, with β-actin as a
   reference control.
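
   The normalization to β-actin can be illustrated with a short sketch.
   The ΔCt definition follows the Figure 5 legend (ICAM1 Ct minus β-actin
   Ct, with a smaller ΔCt indicating higher expression). The 2^-ΔΔCt fold
   change is the commonly used Livak calculation and is an assumption
   here, as the text does not state how fold changes were derived; all Ct
   values below are invented.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference Ct across technical replicates."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(delta_ct_sample, delta_ct_control):
    """Relative expression by the 2^-ddCt method (assumed, not from the source)."""
    return 2 ** -(delta_ct_sample - delta_ct_control)


# Hypothetical triplicate Ct values for one tumor/normal pair.
tumor_dct = delta_ct([24.0, 24.5, 23.5], [18.0, 18.25, 17.75])
normal_dct = delta_ct([26.0, 26.5, 25.5], [18.0, 18.0, 18.0])
relative_icam1 = fold_change(tumor_dct, normal_dct)
```

   Here the tumor sample has the smaller ΔCt, which corresponds to higher
   ICAM1 expression relative to the paired normal tissue.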

Western Blotting

   Tissue samples were homogenized in RIPA buffer (p00136, Beyotime
   Biotechnology, China) containing PMSF (ST506, Beyotime Biotechnology).
   Samples were then boiled in loading buffer (5×) (P0015L, Beyotime
   Biotechnology), after which samples were electrophoretically separated
   on SDS-PAGE 4–10% Bis-Tris gels prior to transfer to PVDF membranes
   (IPVH00011, Solarbio, China). Next, 5% skim milk powder was used to
   block membranes, followed by overnight incubation with rabbit
   monoclonal anti-ICAM1 (1:1000, ab53013; Abcam, Cambridge, UK) or
   anti-β-actin (1:1000, Sigma, China). Then, a goat anti-rabbit IgG
   (H + L) antibody (1:1000, A0208, Beyotime Biotechnology) was used to
   detect primary antibodies, with a ChemiDoc MP system (Bio Rad
   Laboratories, Inc.) used for chemiluminescent protein visualization.

Immunohistochemistry

   Tissue samples from chronic superficial gastritis, LGIN, HGIN, EGC, and
   GC patients collected between 2018 and 2019 and held in the Guangxi
   Medical University Cancer Hospital specimen library were used for IHC
   experiments. The same ICAM1 antibody used for Western blotting was used
   for IHC at a 1:50 dilution, with a biotin donkey-anti-rabbit antibody
   (AP182B) (1:500; Millipore, Billerica, MA, USA) used for secondary
   detection. The IHC staining procedure was as follows: samples were
   warmed at 60°C for 4 h, dewaxed with xylene, rehydrated using an ethanol
   gradient, treated with EDTA (50×) for antigen retrieval, subjected to an
   endogenous peroxidase blocker (PV-6000, BAOXIN BIO, China), and then
   warmed for 10 min at 37°C prior to rinsing with PBS. Samples were then
   probed overnight with primary antibodies at 4°C, warmed for 15 min at
   37°C, rinsed in PBS, and stained with secondary antibody for 20 min at
   37°C. DAB was used for development of staining, and samples were then
   counterstained with hematoxylin, differentiated with hydrochloric acid
   alcohol, dehydrated, dried, and sealed before analysis.

Results

DEG Identification

   To screen for meaningful biomarkers in EGC, we used the R limma
   package with cut-off criteria of p < 0.05 and |logFC| > 1, and detected
   a total of 482 DEGs when comparing the EGC and LGIN datasets
   ([68]Figure 1A). Of these genes, 270 were upregulated and 212 were
   downregulated ([69]Table 1). We further generated a heat map of the top
   50 DEGs using an appropriate R package ([70]Figure 1B).

Figure 1.


   DEG selection and hierarchical clustering analysis. (A) DEGs are
   arranged in a volcano plot, with the vertical and horizontal axes
   corresponding to logFC (fold change) and -log10 (p value). Green and
   red dots correspond to DEGs, whereas genes that were not DEGs are
   represented by black dots. (B) The top 50 DEGs were arranged in a heat
   map, with genes on the horizontal axis and samples along the vertical
   axis. DEGs could be divided into cancer and non-cancer groups. Up- and
   down-regulated DEGs are shown in red and green, respectively.

Table 1.

   A Total of 482 DEGs Were Identified from the [73]GSE55696 Dataset,
   Including 270 Up-Regulated DEGs and 212 Down-Regulated DEGs in EGC
   Tissues, Compared to LGIN Tissues
   DEGs: Gene Symbol^#
   Up-Regulated:
   S100A12, S100A8, S100A9, LST1, FPR1, FCGR3A, MNDA, NFE2, FCGR2A,
   CLEC4E, SELM, CMTM2, ADGRE3, AQP9, CCL4, FCN1, TNIP3, FMNL1, TRIB3,
   VNN2, TYROBP, BEST1, LRRK2, CCL3, CARD9, CLEC4A, FGR, LILRB3, FAM49A,
   CFP, CXCR1, G0S2, FAM65B, LILRB2, CMTM7, LINC-PINT, RNASE2, PYGL,
   FCGR1B, NCF2, CXCR2P1, NCF1, ITIH4, GBP5, IL7R, LY96, SPI1, IL21R, OSM,
   CSF3R, PPBP, IGSF6, LGALS2, SOD2, CXCR4, DEFA3, ACSS3, ADAM8, SNHG6,
   TREM1, ADGRE2, CAMP, TNFAIP2, CD28, UAP1L1, APOE, RGS18, HBD, DUOXA1,
   HBA2, IFITM3, CLEC4D, FADS1, SLC2A6, FSTL3, MMP9, GLT1D1, CXCR2, S1PR4,
   CXCL2, ICAM1, LINC01410, SIGLEC10, KLHL17, TLR8, PLEK, CTHRC1, CSF2RA,
   HBA1, PTPRC, RAB42, BCAS4, MMP25, LINC01094, TNFAIP6, HLA-DPB1, PLTP,
   PAQR5, CCL3L3, MILR1, FKBP10, C5AR1, MFAP2, LOC100507460, MCEMP1,
   FAM198A, NEU1, RGS1, LOC107987020, TNFRSF10C, ADAMDEC1, PABPC1L, SPP1,
   TPM2, LAMP3, PTGS2, LY6E, RELL2, ZBTB32, REC8, PTGDS, AIM2, PDPN,
   SPRR1B, NNMT, EGFL6, SRGN, CXCL1, ITGA4, PDZK1IP1, APOC1, LCN2, HOXC10,
   HBB, EGR3, LOC284454, HLA-DRB1, BAAT, CCR7, MUCL1, LRGUK, FIGF, LRRC39,
   PARVB, LILRA2, C3, DPYSL4, CXCL3, KRT23, HOXC13, FPR2, MIR503HG, CCL20,
   C4B, ALAS2, PP7080, TRAT1, ZBP1, BMP7, LINC01296, PTCHD1, P2RX5, IRX2,
   CYB5R1, CATSPERB, LOC155060, CD70, UCHL1, LOC339803, SAA1, KRT6B, CTSW,
   BIRC3, HLA-DRB3, SPNS1, IFI6, SLC22A31, CARD14, ABCA12, KLHL6,
   ATP6V0A4, KCNIP3, FBXO2, INHBA, TMEM213, CLDN6, VEPH1, PLA2G7, ZNF556,
   TSPOAP1-AS1, CD72, CYR61, DUOX1, CP, IGHG4, HOXC11, BNIP3, ZBED2,
   CDIPT-AS1, LYPD2, CD79B, CYP2A7P1, CKLF-CMTM1, CHI3L2, CXCL9, LINC00886,
   IDO1, SAA4, CHI3L1, ATP6V1C2, PF4, KRT6C, UBD, IFI44L, SYNDIG1, PI3,
   COL8A1, CLECL1, CHRDL2, CETP, TNFRSF6B, PRAME, HBG1, IGF2BP3, DUOX2,
   FOXC1, POTEG, FGG, HOXC-AS3, POTED, CXCL13, SMKR1, LGSN, POTEB3, IL19,
   IL13RA2, HS3ST2, PCOLCE2, APOBEC3A, FXYD4, ACTL8, GTSF1, CD19, DSCR8,
   DDX43, KRT6A, SERPINA5, LY6K, GABRB3, KIAA1324L, CT45A5, ST8SIA6-AS1,
   SIX1, FOSB, MAGEA9, SERPINA3, OSR2, FOLR1, CSAG1, CTAG2, MAGEA1, DKK1,
   BPIFB1, OGDHL, SCRG1, BEX2, CRABP2, FAM25A, PDILT, TNNC1
   Down-Regulated:
   CAPNS2, PFN1P2, HNRNPCL1, NACAP1, CHRNB3, SUMO1P3, LOC100507351,
   MORF4, CSNK1A1L, ACTG1P4, PARP4, ZNF729, NLRP5, ICMT, HSDL2, NUDT12,
   MYH2, PABPC3, ANP32C, PGAM1P4, SIM1, SPACA1, CAPZA2, PDZD4, NUBPL,
   GABRR3, KRT35, GPR12, PRSS56, KRT8P41, LOC148709, ODF4, SCRT2, ATP8B5P,
   EEF1DP4, CXXC4, TAGLN3, CBWD6, UGT2A3, ZNF367, ANXA2P3, CXADRP2, CUBN,
   CTAGE3P, GALC, KLKB1, PPIAP30, CCDC68, CTAGE6, KRT19P2, MYT1, DRD2,
   CRYBB3, ITGA2, RTN4RL2, RPS2P45, DOPEY2, LOC100128164, SVOP, ADAMTS19,
   PCSK1N, MKRN9P, MS4A8, GRXCR1, ZNF259P1, BSN, PCSK1, FAM197Y2P, NMUR2,
   LOC105370109, FABP5, MMD2, RNF113B, RTP5, OTC, OR2C1, LOC643549,
   NUDT16P1, RPL23AP32, TRIM49, HTR3D, NOL4, DAGLA, TACR1, ELOVL4, CORIN,
   SCG5, SCG2, SLC7A8, LOC101927000, MAGEE2, LINC00326, SPAG1,
   TMPRSS6, CTAGE10P, HNRNPA1P27, GCNT2, RFX6, ADH1A, LOC729652,
   LYPD4, SCGB1D2, MYSM1, EDN3, INSM1, KCNMB2, CACNG2, KCNH6, FEV, SUCNR1,
   DSCR4, IL3, CTSG, NHLH2, HMP19, CNTF, ARX, SYT4, ANO8, COL17A1,
   ARVCF, TMEM155, ADAM7, PROKR2, SCG3, UGT2B10, SCGN, LCE3D, C11orf40,
   RPRML, LOC729080, SST, OVCH1-AS1, VIL1, ANXA2, OR10H1, MLN, PAX6, ABO,
   RGS4, GLDN, OR1N2, SHANK1, ALDOC, LRTM2, ABCC11, CHGB,
   TENM2, UGT2B11, PROX1, GPRC6A, NCKAP5, OR8H1, NPY6R, NR2E1, GSTA3,
   PLCXD3, TUBB2B, GNG3, HRG, SCGB1D1, HEPACAM2, KCNJ6, CA9, WSCD2, ADH6,
   GUSBP10, CNDP1, TINAG, CEACAM3, NKX6-3, TM6SF2, FBP2, CASR, SYNPR,
   KRT20, TMPRSS15, ODF3L1, TEKT4P2, CCL13, TM4SF4, CCL25, LYVE1, AK5,
   CLDN10, PTPN20, GAST, BTNL8, ZSCAN4, TONSL, TRIM23, C20orf85, FAM189A1,
   SLC38A11, DACH1, SLC18A1, GCG, NAT2, GPA33, GSTA1, ALPI, A1CF,
   LHFPL3-AS2, CARTPT, CEACAM6, CHST5, SLC10A2, UPK1B, KLK12, GALNT8

   Note: ^#Cut-off criteria were set at p<0.05 and |logFC| >1.0.

GO and KEGG Analyses

   To clarify the role of these DEGs in the progression of GC, we first
   sought to predict their functions by conducting GO and KEGG pathway
   analyses on all 482 DEGs. GO
   analyses revealed upregulated DEGs to be enriched for terms including
   “inflammatory response”, “extracellular region”, “immune response”,
   “extracellular space”, and “neutrophil chemotaxis” (P < 0.05)
   ([75]Table 2). In contrast, downregulated DEGs were enriched for terms
   including “hormone activity”, “peptide hormone processing”, “secretory
   granule”, “fibrinolysis” and “neuropeptide signaling pathway” (P <
   0.05) ([76]Table 2). In the pathway enrichment analyses, upregulated
   DEGs were shown to be enriched for pathway terms including
   “cytokine−cytokine receptor interaction”, “chemokine signaling
   pathway”, and “Staphylococcus aureus infection” (P < 0.05) ([77]Figure
   2A), whereas downregulated DEGs were enriched for pathway terms
   including “chemical carcinogenesis”, “metabolism of xenobiotics by
   cytochrome P450”, and “drug metabolism − cytochrome P450” (P < 0.05)
   ([78]Figure 2B).

Table 2.

   The Top 15 Significantly Enriched GO Terms of Up/Down-Regulated DEGs
   Category                  Term                   Count P value
   Up-Regulated
   BP       Inflammatory response                   39    1.46E-22
   CC       Extracellular region                    70    8.42E-20
   BP       Immune response                         37    4.18E-19
   CC       Extracellular space                     58    6.39E-16
   BP       Neutrophil chemotaxis                   14    3.47E-12
   BP       Chemokine-mediated signaling pathway    14    9.28E-12
   BP       Chemotaxis                              16    8.76E-11
   MF       Chemokine activity                      11    5.31E-10
   BP       Response to lipopolysaccharide          16    5.73E-09
   BP       Cellular defense response               11    8.94E-09
   CC       Plasma membrane                         91    4.69E-08
   BP       Innate immune response                  23    8.17E-08
   BP       Cell surface receptor signaling pathway 17    9.24E-07
   BP       Cell chemotaxis                         9     2.40E-06
   BP       Defense response to bacterium           12    4.11E-06
   Down-regulated
   MF       Hormone activity                        7     2.12E-04
   BP       Peptide hormone processing              4     3.81E-04
   CC       Secretory granule                       6     5.89E-04
   BP       Fibrinolysis                            4     8.74E-04
   BP       Neuropeptide signaling pathway          6     0.002309
   BP       Cell-cell signaling                     9     0.002309
   BP       Type B pancreatic cell differentiation  3     0.002847
   BP       Metabolic process                       7     0.004416
   BP       Feeding behavior                        4     0.004954
   CC       Secretory granule lumen                 3     0.005187
   CC       Extracellular space                     23    0.005855
   BP       Chemical synaptic transmission          8     0.006493
   BP       G-protein coupled receptor signaling    17    0.00795
            pathway
   MF       Serine-type endopeptidase activity      8     0.008681
   BP       Positive regulation of neural precursor 3     0.009102
            cell proliferation

Figure 2.


   KEGG pathways enriched for DEGs. P-values are represented by the
   coloration of individual points in the scatterplot, with point size
   corresponding to the number of counts. (A) KEGG pathways associated
   with up-regulated DEGs. (B) KEGG pathways associated with
   down-regulated DEGs. Scatterplots were constructed using an online tool
   ([82]http://www.ehbio.com/ImageGP/index.php/Home/Index/index.html).

PPI Network Analysis

   Because genes can act on one another directly or indirectly to exert
   their biological roles, we next sought to identify the genes that play
   pivotal roles among these DEGs. We used the STRING and Cytoscape tools
   to construct and visualize a PPI network for these DEGs. The final
   network was made up of 316 nodes and 1816 edges after removal of
   irrelevant nodes ([83]Figure 3A), with 166 DEGs not falling within this
   network. The
   cytoHubba plugin was next used to rate the entire network, with the top
   10 genes based on Degree, Closeness and Betweenness identified as
   potential hub genes ([84]Table 3). Based on the overlap between these
   three profiles ([85]Figure 3B), we identified 4 overlapping potential
   hub genes: MMP9 (matrix metallopeptidase 9), ICAM1 (intercellular
   adhesion molecule 1), TLR8 (toll-like receptor 8), and PTPRC (protein
   tyrosine phosphatase receptor type C).
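
   The overlap step can be reproduced directly from the top-ten lists
   reported in Table 3. The short sketch below is an illustration (the
   variable names are ours; the authors used an online Venn diagram tool)
   that intersects the three lists as Python sets.

```python
# Top-ten node names per score, transcribed from Table 3.
degree_top10 = {
    "TLR8", "PTPRC", "CXCR4", "C3", "MMP9",
    "CCL4", "CXCL1", "FPR2", "SAA1", "ICAM1",
}
closeness_top10 = {
    "TLR8", "CXCR4", "CXCL1", "MMP9", "PTPRC",
    "C3", "CCL4", "ICAM1", "PPBP", "CCL20",
}
betweenness_top10 = {
    "SPP1", "SST", "DRD2", "PTPRC", "MMP9",
    "ICAM1", "TLR8", "IL13RA2", "APOE", "MAGEA1",
}

# Genes ranked in the top ten by all three scores.
hub_candidates = degree_top10 & closeness_top10 & betweenness_top10
print(sorted(hub_candidates))  # ['ICAM1', 'MMP9', 'PTPRC', 'TLR8']
```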

Figure 3.


   PPI network and candidate hub genes. (A) PPI network of identified
   DEGs. Up- and down-regulated DEGs are represented by red and blue
   nodes, respectively, with node size corresponding to logFC. Red and
   blue lines correspond to positive and negative correlations,
   respectively. (B) Venn diagram of four candidate hub genes from the top
   ten as determined according to Degree, Betweenness, and Closeness
   parameters. An online tool was used for Venn diagram construction
   ([88]http://bioinformatics.psb.ugent.be/cgi-bin/Liste/Venn/calculate_venn.htpl).

Table 3.

   The Top Ten Genes in Each of the Three Main Scores
   Degree              Closeness               Betweenness
   Node Name   Score   Node Name   Score       Node Name   Score
   TLR8        62      TLR8        162         SPP1        9024.657
   PTPRC       55      CXCR4       159.36667   SST         7393.812
   CXCR4       53      CXCL1       159.11667   DRD2        6805.501
   C3          52      MMP9        158.95      PTPRC       6554.313
   MMP9        52      PTPRC       158.63333   MMP9        6510.665
   CCL4        51      C3          157.35      ICAM1       6220.507
   CXCL1       50      CCL4        157.08333   TLR8        5657.862
   FPR2        48      ICAM1       156.2       IL13RA2     4956.119
   SAA1        47      PPBP        153.31667   APOE        4947.338
   ICAM1       46      CCL20       153.30952   MAGEA1      4312.797

Hub Gene Survival Analyses

   For clinical guidance, we next sought to determine which of these hub
   genes might promote GC and could be used for survival prediction. We
   used the Kaplan-Meier Plotter database to explore how these hub genes
   related to GC patient survival. Of these genes, we found all 4 to be
   significantly differentially expressed (p<0.05), but only elevated
   ICAM1 expression was linked to a poorer GC patient prognosis (HR=1.51,
   95% CI: 1.26–1.81, P=9.6e-06) ([90]Figure 4).

Figure 4.


   GC patient survival analyses for the four candidate hub genes. Red and
   black lines represent patients with high and low expression levels of
   the indicated hub gene, respectively. (A) ICAM1 survival analysis in
   patients with GC. (B) PTPRC survival analysis in patients with GC. (C)
   MMP9 survival analysis in patients with GC. (D) TLR8 survival analysis
   in patients with GC.

The Association Between Elevated ICAM1 and GC Patient Clinicopathological
Features

   Although the bioinformatics analyses provided direction, validation in
   patient samples was still needed to guide clinical research. Using IHC,
   we found ICAM1 expression in GC tissues to be primarily localized to
   cell membranes ([93]Figure 5A and [94]B).
   We further found ICAM1 to be highly expressed in GC tissue samples
   ([95]Figure 5B and [96]C), consistent with TCGA findings (p<0.05;
   [97]Figure 5D). To determine whether ICAM1 expression was linked to the
   clinical features of GC, we next explored this relationship in the
   online UALCAN database. We found ICAM1 expression to be unrelated to
   gender (p=7.740800E-01; [98]Figure 5E) or race (p=6.050600E-01,
   p=4.357800E-01, p=9.113600E-01) ([99]Figure 5F).

Figure 5.


   ICAM1 expression in GC patients. (A) Immunohistochemical localization
   of ICAM1 in GC. (B) Verification of ICAM1 protein expression in GC by
   Western blot. (C) ICAM1 levels were detected in 30 pairs of GC tissues
   by qRT-PCR, revealing significantly higher ICAM1 expression in GC
   tissues relative to paracancerous tissues. ΔCt values were determined
   by subtracting the β-actin Ct value from the ICAM1 Ct value. A smaller
   ΔCt value indicates higher expression. (D) ICAM1 expression in primary
   gastric tumor tissue compared to normal tissues; (E) ICAM1 expression
   in male and female GC patients; (F) ICAM1 expression in different
   races. STAD: stomach adenocarcinoma. * denotes p<0.05; *** denotes
   p<0.0001.

Discussion

   GC remains one of the most common types of cancer in China,[102]^21 and
   yet owing to its lack of early-stage symptoms patients are often not
   diagnosed until the disease is already significantly advanced, leading
   to a poor prognosis.[103]^22 In a retrospective analysis of GC patients
   in Japan, a 5-year survival rate of 71.1% was detected among 118,367 GC
   patients following surgical resection, with respective 5-year survival
   rates for those with pathological IA, IB, II, IIIA, IIIB, and IV GC of
   91.5%, 83.6%, 70.6%, 53.6%, 34.8%, and 16.4%.[104]^23 This suggests
   that the best means of improving GC patient outcomes is by detecting GC
   while it is in its early stages. By better exploring the molecular
   mechanisms governing the development of GC, it will be possible to
   better detect and diagnose EGC. Previous research has highlighted a
   number of genes associated with GC.[105]^24^–[106]^26 The specific
   molecular mechanisms governing this disease, however, are complex, and
   as such further data mining efforts are needed to identify relevant
   candidate biomarkers for GC diagnosis.

   In this study, we used the [107]GSE55696 dataset to explore gene
   expression in EGC, comparing differences between EGC (pooled with HGIN)
   and LGIN and thereby identifying 270 and 212 up- and down-regulated
   genes, respectively.

   We conducted GO analyses to gain better functional insights into the
   roles of the DEGs detected through this approach. This strategy
   revealed upregulated DEGs to be enriched for GO terms including
   “inflammatory response”, “extracellular region”, “immune response”,
   “extracellular space”, and “neutrophil chemotaxis”. Inflammation is a
   very important factor linked with GC progression and
   prognosis.[108]^27^,[109]^28 The immune response is also associated
   with GC to some degree.[110]^29 We further found downregulated DEGs to
   be enriched for GO terms such as “hormone activity”, “peptide hormone
   processing”, “secretory granule”, “fibrinolysis”, and “neuropeptide
   signaling pathway”. Neuropeptides are known to drive oncogenesis in
   response to inflammation through enhanced proliferation of epithelial
   cells.[111]^30 Fibrinolysis is associated with D-dimer production, with
   Lan et al having observed higher D-dimer levels in the plasma of GC
   patients relative to healthy controls in a manner correlated with depth
   of invasion, tumor size, lymph node metastasis, and TNM stage.[112]^31

   In our pathway enrichment analysis, we found upregulated DEGs to be
   involved in “cytokine−cytokine receptor interaction”, “chemokine
   signaling pathway” and “Staphylococcus aureus infection”. Chemokines
   are known to control cell migration in the context of inflammation, and
   prolonged inflammation can create a microenvironment conducive to the
   growth of tumors.[113]^32^,[114]^33 DEGs downregulated in this study
   were linked to “chemical carcinogenesis”, “metabolism of xenobiotics by
   cytochrome P450”, and “drug metabolism − cytochrome P450”. This is
   consistent with previous research.[115]^34^–[116]^36

   These functional enrichment analyses gave us a clear understanding of
   the potential functions of these DEGs in the development of GC, which
   also suggested the importance of finding hub genes among these genes.

   Using a PPI network, we identified 4 candidate hub genes among these
   DEGs: MMP9, ICAM1, TLR8, and PTPRC. These genes were upregulated in EGC
   tissue samples relative to LGIN samples. Of these genes,
   only ICAM1 was significantly associated with a poor GC patient
   prognosis, suggesting it may serve an oncogenic role. This finding
   suggests that ICAM1 may regulate the occurrence of GC directly or
   indirectly, and may be useful as a biomarker for the diagnosis of EGC,
   although verification in a larger sample is needed.

   Because ICAM1 may play a biological role in the occurrence of gastric
   cancer, we were also interested in whether it could be used as a marker
   for evaluating the prognosis of patients with EGC.

   ICAM1, also known as CD54,[117]^37 is an immunoglobulin superfamily
   (IGSF) member containing immunoglobulin-like domains of 90–100 amino
   acids in length.[118]^38 ICAM1 is a key adhesion molecule that can vary
   substantially in size according to its glycosylation
   status.[119]^39^,[120]^40 It is also a key regulator of tumor
   progression, and it has been found to be upregulated in many tumor
   types including breast, kidney, and pancreatic
   cancers.[121]^41^–[122]^43 In line with this previous work, we
   observed significantly increased ICAM1 levels in EGC relative to
   LGIN. ICAM1 plays tissue type-specific roles, with positive ICAM1
   expression in breast cancer being negatively correlated with tumor
   size, lymph node metastasis, and tumor invasion.[123]^44 In contrast,
   Huang et al found IL-35 to drive PDAC metastasis via promoting ICAM1
   overexpression,[124]^45 and Di et al determined that inhibiting ICAM1
   significantly impaired breast cancer cell metastatic
   activity.[125]^42^,[126]^43 In this study, we found that high ICAM1
   expression was evident in GC patient tissues, and was linked with a
   poorer GC patient prognosis. This expression was not associated with
   patient gender or ethnicity. This suggests ICAM1 has the potential to
   be used as an independent prognostic biomarker of EGC, in addition to
   playing a role in disease progression. However, confirming this will
   require diagnostic testing in a large number of samples.

   In this report, we identified 482 genes differentially expressed
   between EGC and LGIN tumor samples, among which ICAM1 was prominent.
   Enrichment analyses revealed the identified DEGs to be associated with
   pathways known to be relevant in cancer. Using a PPI network, we
   identified ICAM1 as a gene associated with GC patient prognosis and
   survival, with elevated expression of ICAM1 being present in GC
   patients regardless of race or gender. This suggests that ICAM1 may
   play a role in GC progression, and may be a valuable early biomarker
   with diagnostic and prognostic relevance.

Study Limitations

   This study has several limitations, including the limited number of
   samples analyzed and the lack of detail regarding the mechanisms by
   which ICAM1 is expressed in GC. Future studies focused on these
   underlying mechanisms in a larger cohort of samples are thus needed.

Acknowledgment