Abstract Background Neutrophil extracellular trap (NET) is associated with host response, tumorigenesis, and immune dysfunction. However, the link between NET and the tumor microenvironment (TME) of gastric cancer (GC) remains unclear. Our study aims to characterize the expression patterns of NET-related genes and their relationships with clinicopathological characteristics, prognosis, TME features, and immunotherapy efficacy in GC cohorts. Methods Transcriptomic and single-cell RNA sequencing profiles of GC with annotated clinicopathological data were obtained from TCGA-STAD (n = 415), [35]GSE62254 (n = 300), [36]GSE15459 (n = 192), and [37]GSE183904 (n = 26). The consensus cluster algorithm was used to classify tumor samples into different NET-related clusters. A NET-related signature was constructed using LASSO regression and verified in four immunotherapy cohorts. ROC and Kaplan-Meier analyses were conducted to evaluate the predictive and prognostic value of the model for immunotherapy efficacy. Results This study identified two NET-related clusters with distinct clinicopathological features, prognosis, and TME landscapes. The high NET-related cluster, characterized by increased NET-related gene expression, exhibited more aggressive behavior and a worse prognosis (HR = 1.63, P = 0.004) than the low NET-related cluster. DEGs were primarily involved in the chemokine/cytokine-associated pathways. Moreover, the high NET-related cluster had significantly higher levels of TME scores, immune infiltration, and immune effectors (all P < 0.001). The NET-related signature displayed a high predictive accuracy for immunotherapy response (AUC = 0.939, P < 0.001). Furthermore, patients with high NET-related scores consistently harbored a more favorable prognosis in different immunotherapy cohorts (all P < 0.05). Conclusions This study identified the NET-related signature as a robust model for predicting immunotherapy response in GC, which can help clinicians make appropriate immunotherapy decisions. Keywords: Neutrophil extracellular trap, Gastric cancer, Immunotherapy, Response Ethics statement The authors declare human ethics approval was not applicable for the research. Funding This research was funded by the Jinhua Science and Technology Research Project (No. 2021-4-165) and the Key Research and Development Program of the Science and Technology Department of Zhejiang Province (No. 2018C03022). 1. Introduction Gastric cancer (GC) is a significant global health burden, ranking fifth in incidence and fourth in cancer-related mortality in 2020 [[38]1]. Despite the advances in treatment strategies including surgical resection, radiotherapy, targeted therapy, and immunotherapy, patients with GC continue to experience high morbidity, recurrence, and mortality rates [[39]2]. High molecular heterogeneity poses a major obstacle limiting the therapeutic benefits in GC. Accordingly, molecular biomarkers are recognized as important tools for accurately predicting prognosis and providing better-individualized management. The advent of immune checkpoint inhibitors (ICIs) has marked a new era of precision medicine [[40]3]. Nevertheless, widely varying clinical benefits were observed across different patients treated with ICIs in clinical practice, owing to the high heterogeneity of GC [[41]4]. Recent clinical trials have accelerated the need to identify novel molecular biomarkers for predicting immunotherapy response [[42]5,[43]6]. Developing effective biomarkers can help inform patient stratification and guide clinicians in making immunotherapy decisions. Classical biomarkers, including programmed cell death ligand 1 (PD-L1) [[44]7], Epstein-Barr virus (EBV) infection [[45]8], and microsatellite instability (MSI) [[46]9], were proposed to screen the subpopulation of GC that potentially benefit from immunotherapy; however, the predictive capacity and robustness of these biomarkers were not satisfactory enough in clinical practice [[47]10]. This study aimed to develop a novel molecular biomarker to stratify GC patients into cohorts with different responsiveness and prognosis, which is necessary to maximize the benefits of immunotherapy while minimizing the adverse effects. Neutrophils, a type of abundant immune effector cell in humans, can respond to exogenous pathogens by releasing neutrophil extracellular trap (NET) which contained nuclear DNA fibers and proteins in a web-like structure. Traditional studies consider that NET is primarily involved in infectious diseases, such as microbial infection and pathogen clearance [[48]11]. Currently, increasing evidence has linked NET to tumorigenesis and metastatic spread in cancers. Neutrophils isolated from the peripheral blood released NET more strongly in cancer mice than in healthy mice [[49]12]. Abnormal production of NET promoted liver metastasis of intrahepatic cholangiocarcinoma [[50]13]. Nevertheless, the association between NET and the tumor microenvironment (TME) and its clinical significance in GC remains unclear. This study comprehensively analyzed the transcriptomic and single-cell RNA sequencing (scRNA-seq) profiles to depict the molecular patterns of NET in the TME and to develop a NET-related model for predicting immunotherapy efficacy in GC patients. It can assist clinicians designing precise treatment strategies for this highly heterogeneous disease. 2. Materials and methods 2.1. Data acquisitions The RNA sequencing (RNA-seq) profiles with annotated clinical information of 415 GC cases (TCGA-STAD) were acquired from the cBioportal database ([51]www.cbioportal.org). RPKM-mapped reads recorded in the gene expression matrix were transformed into TPM-mapped reads for transcriptomic analysis. Clinical information included age, gender, tumor site, grade, TNM stage, Lauren classification, TCGA subtypes, and overall survival. A systematic search was performed in the GEO database ([52]www.ncbi.nlm.nih.gov/geo) to acquire the large-scale GC datasets containing transcriptional data and clinicopathological information ([53]Fig. S1A). The filter terms were as follows: “gastric cancer” [all fields] AND “homo sapiens” [organism] AND (“expression profiling by array” [all fields] OR “expression profiling by high throughput sequencing” [all fields]) AND (“2000/01/01” [PDAT]: “2022/12/31” [PDAT]) AND “attribute name tissue” [Filter]). The Datasets with small sample sizes, duplicate samples, missing clinical data and irrelevant topics were excluded. The sample size thresholds for microarray and single-cell sequencing datasets were set at 190 and 25, respectively. Ultimately, two microarray datasets ([54]GSE62254, n = 300; [55]GSE15459, n = 192) and a single-cell RNA-seq (scRNA-seq) dataset ([56]GSE183904, n = 26) were enrolled in this study. The [57]GSE62254 and [58]GSE15459 datasets were used to verify the robustness of the consensus clustering algorithm. The [59]GSE183904 dataset was used to explore the associations between NET and TME characteristics in GC. Besides, a comprehensive literature search in PubMed, Embase, Web of Science, and EMBASE up to December 31, 2022 was conducted to obtain public immunotherapy cohorts ([60]Fig. S1B). The search terms and their combinations were used as follows: (“cancer” OR “neoplasm” OR “malignancy” OR “tumor”) AND (“immune checkpoint inhibitor” OR “immune checkpoint blocker” OR “ICI” OR “ICB” OR “PD-1 inhibitor” OR “PD-L1 inhibitor” OR “CTLA-4 inhibitor” OR “anti-PD-1” OR “anti-PD-L1” OR “anti-CTLA-4” OR “pembrolizumab” OR “nivolumab” OR “sintilimab” OR “durvalumab” OR “tislelizumab” OR “atezolizumab” OR “cemiplimab” OR “camrelizumab” OR “Ipilimumab” OR “avelumab” OR “tremelimumab” OR “toripalimab”) AND (“RNA sequencing” OR “Whole-transcriptome sequencing” OR “RNA-seq” OR “transcriptomic”). Eligible studies met all the following criteria for inclusion: (1) patients diagnosed with solid tumors and treated with immune checkpoint inhibitors; (2) detailed records reporting therapeutic response or survival outcomes; (3) RNA sequencing on primary tumor samples before treatment; (4) open data access; (5) articles published in English. Ultimately, four immunotherapy datasets, including 45 GC patients treated with pembrolizumab (Kim cohort) [[61]14], 73 melanoma patients treated with nivolumab or pembrolizumab (Gide cohort) [[62]15], 121 melanoma patients treated with nivolumab or pembrolizumab (Liu cohort) [[63]16], and 298 patients with urothelial carcinoma treated with atezolizumab (Mariathasan cohort) [[64]17], were included in this study. The details of the clinical cohorts are shown in [65]Table S1. The microarray data preprocess included probe transformation, quality control, background correction, and normalization. First, the probes matched to multiple genes were removed from the gene list. When multiple probes were matched to the same gene, the expression levels of these probes were averaged. Second, each probe was transformed into the corresponding gene symbol according to the manufacturer-provided annotation file. Subsequently, quality control analysis was performed by assessing the quality of RNA samples using the R package “Simpleaffy” [[66]18]. Affymetrix Expression Console and robust multiarray average (RMA) algorithms were applied to filter out unwanted noise by data normalization and background correction. Z score was adopted as the relative expression level in the gene matrix. 2.2. Genetic and transcriptional analyses First, a gene set containing 170 NET-related genes was extracted from a previous study [[67]19]. Then, the gene set was matched to the TCGA-STAD cohort according to the Entrez ID of each gene. As a result, 163 genes were successfully matched whereas 7 genes were not found in the cohort. Finally, the 163 NET-related genes were included in this study ([68]Table S2). The somatic mutational landscape of NET-related genes in all TCGA-STAD patients was analyzed using the R package “maftools” [[69]20]. The Metascape database was used to establish the protein-protein interaction (PPI) network of NET-related genes and perform pathway enrichment for each module in the PPI network ([70]https://metascape.org/) [[71]21]. The expression matrix of the NET-related genes was visualized using the R package “ComplexHeatmap” [[72]22]. Then, DEGs between tumor and normal tissues were identified using the R package “limma” [[73]23]. The genes with adjusted P value < 0.05 and |fold change (FC)| > 1.5 were defined as DEGs. The distribution of DEGs was visualized as a volcano plot using the R package “ggplot2” [[74]24]. 2.3. Cluster analysis and functional annotation The consensus clustering algorithm is a widely used tool in the field of cancer molecular classification [[75]25]. By repeated subsampling and clustering, the algorithm can identify clusters that share similar gene expression patterns. It provides a quantitative basis for user-specified clustering algorithms (e.g. agglomerative hierarchical clustering, k-means, or a custom clustering algorithm) to determine the optimal cluster number during sample clustering. In this study, the unsupervised cluster analysis was conducted based on the expression level of 163 NET-related genes using the R package “ConsensusClusterPlus” [[76]25]. To be specific, the “K-Means” clustering algorithm with an “Euclidean” distance was applied to measure similarity in hierarchical clustering and classify tumor samples into different NET-related clusters. It was repeated 1000 times on 80 % of the samples for stability. The consensus clustering matrix, cumulative distribution function (CDF), and Delta area plots were utilized to determine the optimal k value. With the increase of the k value from 2 to 10, the relative change in area under the CDF curve decreases in the delta area plot. The selection of the k value is dependent on the slope of the delta area curve and the stability of the consensus clustering matrix. Ultimately, k = 2 was selected as the cluster number. The DEGs between the NET-related clusters were identified using the R package “limma” [[77]23] and displayed using the R package “ComplexHeatmap” [[78]22] and “ggplot2” [[79]24]. After transforming the gene symbol to an Entrez ID, the R package “clusterProfiler” [[80]26] was used to perform gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, and gene set enrichment analysis (GSEA) on the DEGs. The pathway enrichment results were visualized using the R package “ggplot2” [[81]24]. The gene sets were obtained from the Molecular Signatures Database (MsigDB). The adjusted P value < 0.05 and false discovery rate (FDR) < 0.25 were adopted as the threshold for statistical significance. 2.4. TME analysis Based on bulk RNA-seq data of the TCGA-STAD cohort, TME scores, including the stromal, immune, and ESTIMATE scores, were determined using the R package “ESTIMATE” [[82]27], which represented stromal content, immune infiltration, and tumor purity, respectively. Furthermore, the R package “CIBERSORT” [[83]28] algorithm was employed to estimate the infiltration level of 22 immune cell subsets in each GC sample. Immune molecule markers were collected from previous publications, including activated dendritic cell markers, genes involved in antigen processing signaling, CD8^+ T markers, IFN-γ downstream signaling, molecular markers of natural killer (NK) cells, cytotoxic activity, and immune checkpoints [[84][29], [85][30], [86][31], [87][32], [88][33], [89][34], [90][35]]. Immune signatures proposed by previous researchers, such as immune cytolytic activity (CYT) [[91]34], IFN-γ signature [[92]29], and gene expression profile (GEP) associated with T cell-inflamed phenotype [[93]36], were identified as effective transcriptional biomarkers for predicting immunotherapy efficacy. CYT levels were quantified according to the GZMA and PRF1 transcript levels, indicating the immune effector activity of NK and cytotoxic T cells [[94]34]. The IFN-γ signaling composed of IFNG, HLA-DRA, IDO1, CXCL9, CXCL10, and STAT1 was recognized as a robust signature for responsiveness to PD-1 blockade in different tumor types [[95]29]. T cell-inflamed GEP consisted of 18 marker genes, including CXCR6, CD27, HLA-DQA1, CCL5, HLA-DRB1, CXCL9, HLA-E, CD8A, CD276, IDO1, NKG7, LAG3, CMKLR1, PDCD1LG2, CD274, STAT1, TIGIT, and PSMB10 [[96]36]. This study compared the TME scores, immune cell infiltrations, immune molecule markers, and immunotherapy biomarkers between two NET-related clusters. 2.5. scRNA-seq analysis The scRNA-seq dataset containing 26 primary tumor samples of patients with GC was obtained from the GEO database ([97]GSE183904). The first step was to filter and standardize the scRNA-seq data using the R package “Seurat” [[98]37]. Cells with more than 5 % mitochondrial or less than 100 genes were filtered. Normalized genes with low variance were excluded. Subsequently, gene dimensionality was reduced using principal component analysis (PCA). The UMAP algorithm was employed to sort cells into different clusters. The “FindAllMarkers” function was applied to screen the marker genes in each cluster using Wilcoxon-Mann-Whitney tests. The top 10 marker genes among the DEGs were identified by the rank of the P value and log2FC. The cell annotation of each cluster was performed using the CellMarker online database based on the identified marker genes. The cell AUC for NET-related genes was calculated and matched to each cell using the R package “AUCell” [[99]38]. 2.6. Model construction The Kim cohort, which contained 45 GC patients treated with PD-1 blockers, was used as the training cohort. The Gide, Liu, and Mariathasan cohorts, which contained 492 patients treated with PD-1/CTLA-4 blockers, were used as the validating cohorts. The least absolute shrinkage and selection operator (LASSO) method is a compression estimation that generates a penalty function by compressing certain coefficients and zeroing out other coefficients. The prominent advantage of LASSO regression is to retain several critical genes by shrinking the coefficient of other collinear genes to zero during dimensionality reduction. Therefore, it has been widely utilized to construct clinically actionable models based on transcriptome data [[100]39,[101]40]. In this study, the LASSO regression analysis was applied to identify potential markers for predicting immunotherapy efficacy in the training cohort using the R package “glmnet” [[102]41] with a selection filter of P < 0.05. The number of candidate variables was determined by the optimal penalty (Lambda). Subsequently, these candidate variables were included in the multivariate analysis to establish a NET-related model. The NET-related score was ultimately determined by multiplying the NET gene expression level by the coefficients (coef) using multivariate logistic regression. The predictive capability of the NET-related model for response to ICIs was evaluated by receiver operating characteristic (ROC) curve analysis in the Kim, Gide, Liu, and Mariathasan cohorts. The leave-one-out cross-validation (LOOCV) method maximizes the use of available data by using each observation for validation exactly once, while the bootstrapping algorithm involves repeated sampling to create multiple training sets and provide robust performance estimates. These cross-validation methods were performed using the R package “caret” [[103]42] to evaluate the predictive performance of the NET-related model in the training cohort. Furthermore, several immunotherapy biomarkers including microsatellite instability (MSI) [[104]10], Epstein-Barr virus (EBV) [[105]10], GEP [[106]36], combined positive score (CPS) [[107]43], and TME score [[108]44] were also enrolled in the ROC analysis for comparison with the NET-related score. The AUC value was determined using the R package “pROC” [[109]45]. Furthermore, Kaplan-Meier analysis was conducted to investigate the association between the NET-related model and prognosis in the Gide, Liu, and Mariathasan cohorts. 2.7. Statistical analysis R 4.2.1, GraphPad Prism 9.0, and SPSS 26.0 were employed for statistical analysis. Kaplan-Meier analyses and log-rank tests were adopted to compare the differences in the survival outcomes between different groups within each cohort. Multivariate Cox regression analysis was performed to assess the independent prognostic significance of the NET-related cluster after adjusting for other clinical variables (age, gender, tumor site, and TNM stage) in the TCGA-STAD cohort. Comparisons of the gene expressions, cells, and, scores between different groups were determined by the Mann-Whitney test. Chi-squared tests were applied for comparing the clinicopathological characteristics between two-NET clusters in the TCGA-STAD cohort. A P < 0.05 was considered statistically significant. 3. Results 3.1. Genetic and transcriptional characteristics of NET-related genes in GC A flowchart of this study is depicted in [110]Fig. S2. To draw the genetic map of NET-related genes in GC, the whole exome sequencing data of 415 TCGA-STAD samples were obtained from the cBioportal database and a summary analysis of somatic mutations was performed. Among the 415 patients in the TCGA-STAD cohort, 257 (62.2 %) had NET-related gene mutations, with missense mutations accounting for the majority. [111]Fig. 1A shows the somatic mutation landscape of the top 10 frequently mutated NET-related genes. Except for PI3K3CA (15 %), the other genes, including VWF, MTOR, C3, HDAC4, TLR4, CR1, PLCG1, CLCN3, and ITGAL, exhibited a low incidence of somatic mutations (<10 %). At the transcriptional level, it was observed that most NET-related genes had higher expression in tumors than those in normal tissues ([112]Fig. 1B). Among 163 NET-related genes, a total of 67 (41.1 %) up-regulated DEGs and eight (4.9 %) down-regulated DEGs were identified in the tumor samples ([113]Fig. 1C–[114]Table S3). Protein-Protein Interaction (PPI) network analysis identified seven molecular complex detections (MCODEs) among the NET-related genes ([115]Fig. 1D). As expected, the MCODE1-3 were all enriched in NET formation ([116]Table S4). The genes in MCODE4-7 were mainly involved in the VEGFA-VEGFR2 signaling, TNF-related weak inducer of the apoptosis signaling pathway, cross-presentation of particulate exogenous antigens (phagosomes), and sensory processing of sound by inner hair cells of the cochlea, respectively ([117]Table S4). Fig. 1. [118]Fig. 1 [119]Open in a new tab Genetic and transcriptional characteristics of NET-related genes in gastric cancer. (A) Somatic mutation landscape of the top 10 NET-related genes in TCGA-STAD. (B) Heatmap showing gene expression profiles of 163 NET-related genes between tumors and normal tissues in TCGA-STAD. (C) Volcano plot displaying differentially expressed NET-related genes between tumors and normal tissues. (D) PPI network of NET-related genes generated by Metascape database. NET, neutrophil extracellular trap; PPI, Protein-Protein Interaction. 3.2. Identification of NET-related clusters in GC To better understand the NET-related gene expression patterns in GC, consensus clustering analysis was performed on the expression matrix of 163 NET-related genes in training dataset (TCGA-STAD) and validating datasets ([120]GSE62254 and [121]GSE15459), respectively. According to the slopes of the CDF and Delta area curves, k = 2 was determined as the ideal cluster number value in the TCGA-STAD dataset ([122]Fig. 2A–C). The PCA analysis demonstrated the significant differences in gene distribution between two NET-related clusters ([123]Fig. 2D). Fig. 2. [124]Fig. 2 [125]Open in a new tab Identification of NET-related gene-cluster subtypes in gastric cancer. (A) Consensus matrix heatmap of TCGA-STAD cohort samples defining two NET-related clusters (k = 2). (B) CDF curve showing the association of CDF value with the consensus index from k = 2 to 10. (C) CDF Delta area curve of consensus clustering. (D) Heatmap showing distinct clinicopathological features between two NET-related clusters. (E) PCA analysis of two NET-related clusters. (F) Kaplan-Meier curves showing the comparison of OS between two NET-related clusters in the TCGA-STAD cohort. CDF, cumulative distribution function; HR, hazard ratio; NA, not available; OS, overall survival; PCA, principal component analysis. 3.3. Clinical relevance of NET-related clusters The two NET-related clusters exhibited distinct clinicopathological characteristics ([126]Fig. 2E–[127]Table 1). Specifically, patients in the high NET-related cluster were significantly younger than those in the low NET-related cluster (40.8 % versus 27.1 %, P = 0.004). There were no differences in gender or tumor site between the two clusters. Pathologically, the high NET-related cluster had more aggressive features than the low NET-related cluster, such as advanced grade (G3: 80.3 % versus 49.4 %, P < 0.001) and diffuse type (51.1 % versus 20.2 %, P < 0.001). Patients in the high NET-related cluster exhibited a more advanced TNM stage than those in the low NET-related cluster (III/IV stage: 60.4 % versus 50.2 %, P = 0.052). Besides, the low NET-related cluster had a higher proportion of CIN types than the high NET-related cluster (68.8 % versus 46.5 %, P < 0.052). Kaplan-Meier analysis revealed that the high NET-related cluster harbored a more unfavorable OS than the low NET-related cluster (HR = 1.63, P = 0.004, [128]Fig. 2F). These findings were also verified using the [129]GSE62254 and [130]GSE15459 datasets ([131]Fig. S3). Furthermore, after adjusting for several clinical variables, including age, gender, tumor site, and TNM stage, the high NET-related cluster was found to be an independent risk factor for patients with GC (HR = 1.47, P = 0.026, [132]Fig. S4). Taken together, this study established a NET-based clustering classification for stratifying GC patients with different clinical characteristics and prognosis. The high NET-related cluster, characterized by increased NET-related gene expression, exhibited more aggressive behavior and a worse prognosis than the low NET-related cluster. Previous studies reported that high NET-related gene expression correlated with high-metastasis potential and poor prognosis in GC [[133]46,[134]47], which was consistent with our findings. Therefore, detecting the NET-related genes in the high NET-related cluster might assist clinicians in stratifying GC patients with different outcomes. Table 1. Comparison of clinicopathological features between two NET-related clusters in the TCGA-STAD cohort. NET-related cluster __________________________________________________________________ P Low (N = 263) High (N = 152) Age - years 0.004 ≤60 70(27.1) 62(40.8) >60 188(72.9) 90(59.2) NA 5 0 Gender - no. (%) 0.376 Female 89(33.8) 58(38.2) Male 174(66.2) 94(61.8) Tumor site - no. (%) 0.217 Antrum 94(37.6) 61(42.1) Cardia 36(14.4) 18(12.4) Fundus/body 87(34.8) 56(38.6) GEJ 33(13.2) 10(6.90) NA 12 7 Grade - no. (%) <0.001 G1 7(2.70) 5(3.40) G2 124(47.9) 24(16.3) G3 128(49.4) 118(80.3) NA 4 5 TNM stage - no. (%) 0.052 I/II 125(49.8) 55(39.6) III/IV 126 (50.2) 84(60.4) NA 12 13 Lauren subtype - no. (%) <0.001 Intestinal 130(79.8) 46(48.9) Diffuse 33(20.2) 48(51.1) NA 100 58 TCGA subtype - no. (%) <0.001 CIN 143(68.8) 60(46.5) GS 8(3.80) 36(27.9) MSI 43(20.7) 20(15.5) EBV 14(6.70) 13(10.1) NA 55 23 Overall survival time - months 0.080 Median 15.7 14.3 Range (0.1, 122.2) (0.26, 116.3) NA 18 10 Overall survival status - no. (%) 0.010 Alive 172(65.4) 80(52.6) Dead 91(34.6) 72(47.4) [135]Open in a new tab CIN, chromosomal instability; EBV, Epstein-Barr virus; GEJ, gastroesophageal junction; GS, genomically stable; MSI, microsatellite instability; NA, not available. 3.4. Identification of DEGs between NET-related clusters and pathway enrichment To investigate the biological features of the NET-related clusters, 4935 DEGs were identified using the R package “limma”, including 3656 up-regulated and 1279 down-regulated genes ([136]Fig. 3A, [137]Table S5). Subsequently, the functional annotation was conducted using pathway enrichment analyses, including GO, KEGG, and GSEA. The KEGG enrichment results revealed that the high NET-related cluster was significantly linked with the chemokine signaling pathway and cytokine-cytokine receptor interaction ([138]Fig. 3B). GO-BP analysis suggested that the DEGs were involved in cell adhesion-related processes, such as the positive regulation of cell adhesion, cell-cell adhesion regulation, and molecule leukocyte migration ([139]Fig. 3C). According to GO-CC analysis, the DEGs were located in the extracellular zone, such as the collagen-containing extracellular matrix, the external side of the plasma membrane, and the cell-cell junction ([140]Fig. 3C). The GO-MF results revealed that the molecular functions of the DEGs contained extracellular matrix structural constituents, glycosaminoglycan binding, and immune receptor activity ([141]Fig. 3C). GSEA revealed that the high NET-related cluster was significantly associated with neutrophil degranulation, cell adhesion molecules, and cytokine-cytokine interactions ([142]Fig. 3D). Fig. 3. [143]Fig. 3 [144]Open in a new tab Function annotation and pathway enrichment analysis between two NET-related clusters in gastric cancer. (A) Volcano plot showing DEGs between two NET-related clusters in TCGA-STAD cohort. (B) KEGG enrichment analysis of DEGs between two NET-related clusters. (C) GO enrichment analysis for biological process, cellular component, and molecular function. (D) Gene set enrichment analysis of the differentially expressed genes between two NET-related clusters. DEGs, differentially expressed genes. 3.5. Characterization of the TME and immune infiltration in NET-related clusters The ESTIMATE method was used to estimate stromal content, immune infiltration, and tumor purity. As a result, TME scores, including stromal, immune, and ESTIMATE scores, were significantly elevated in the high NET-related cluster ([145]Fig. 4A). Second, the CIBERSORT algorithm was applied to infer the infiltration level of 22 immune cell types in each tumor sample. As illustrated in [146]Fig. 4B, the high NET-related cluster exhibited higher immune infiltrations with immune-inflamed features, such as CD8^+ T cells, M1 macrophages, activated NK cells, and neutrophils. In contrast, the low NET-related cluster had more immunosuppressive cells, including resting NK cells and activated mast cells. Moreover, the high NET-related cluster overexpressed a variety of immune molecule markers participating in multiple immunological processes, including DC activation, antigen presentation, cytotoxic CD8^+ T cell markers, IFN-γ signaling, NK cell activation, cytolytic activity, and immune checkpoint inhibitors ([147]Fig. 4C–[148]Table S6). Furthermore, the association between NET-related clusters and the three immunotherapy biomarkers was investigated. As expected, the transcriptional expression levels of CYT, GEP, and IFN-γ signaling were significantly increased in the high NET-related cluster ([149]Fig. 4D). In summary, these results indicated that patients in the high NET-related cluster tend to develop an immune-inflamed TME phenotype, which may render them more sensitive to immunotherapy. Fig. 4. [150]Fig. 4 [151]Open in a new tab Association of NET-related clusters with tumor immune microenvironment in TCGA-STAD. (A) Box plots showing the comparison of the stromal, immune, and ESTIMATE scores between two NET-related clusters. The stromal, immune, and ESTIMATE scores represent stromal content, immune infiltration, and tumor purity in the tumor microenvironment, respectively. (B) Box plots showing the comparison of immune infiltrations between two NET-related clusters. The infiltration level of 22 immune cell types was inferred by the CIBERSORT algorithm. (C) Heatmap showing the gene expression of various immune molecule markers in two NET-related clusters, including activated dendritic cell markers, genes involved in antigen processing machinery, CD8^+ T markers, IFN-γ signaling, natural killer cell markers, cytolytic activity, and immune checkpoints. (D) Box plots showing the comparison of three immune signatures (CYT, IFN-γ signature, and T cell-inflamed GEP) between two NET-related clusters. ***P < 0.001; ns, no significance. 3.6. Exploration of a NET-related model at the single-cell level In total, 48,228 cells from 26 GC samples were included in the scRNA-seq analysis. After quality control filtering, 41,135 cells were included in the dimensional reduction analysis. The top 10 PCs were identified using PCA and were incorporated into the UMAP analysis. Finally, a total of 10 cell clusters were recognized and annotated, including T cells, plasma cells, epithelial cells, macrophages, B cells, DCs, fibroblasts, mast cells, endothelial cells, and erythrocytes ([152]Fig. 5A). After removing the non-immune cells and re-conducting UMAP analysis for immune cells, 10 immune cell clusters were identified, including CD4^+ T cell, C1QC^+ macrophage, SPP1^+ macrophage, memory CD8^+ T cell, plasma cells, B cell, regulatory T cell, DC, effector CD8^+ T cell, and mast cell ([153]Fig. 5B). Subsequently, the pathway activity of NET-related genes in each cell was quantified using the R package “AUcell” ([154]Fig. 5C). It was observed that the group with high NET-AUC had more effector CD8^+ T cells, C1QC^+ macrophages, and SPP1^+ macrophages than the group with low NET-AUC ([155]Fig. 5D). These findings linked elevated NET signaling activity with immune-inflamed TME characteristics in GC. Fig. 5. [156]Fig. 5 [157]Open in a new tab Exploration of NET-related genes in immune infiltration by single-cell RNA-seq. (A) cell annotation of all subsets clustered by UMAP. (B) Cell annotation of immune cell subsets by UMAP. (C) cluster distributions of NET-AUC score. NET-AUC score was estimated by the gene expression ranking, representing the NET activity in each cell. (D) Association between NET-AUC score and the distribution of immune cells. AUC, area under the curve. 3.7. Establishment of a NET-related model for predicting immunotherapy response in patients with GC Our previous findings depicted a distinct immunological landscape between the groups with different expression levels of NET-related genes. To construct a clinically actionable model for immunotherapy in GC, the transcriptional data and clinicopathological information of 45 GC patients treated with ICIs were acquired from Kim et al.’s study [[158]14]. Initially, the expression matrix of NET-related genes was extracted from the Kim cohort. Then, ROC analysis was conducted to preliminarily identify candidate genes for model construction. A total of 37 NET-related genes were screened according to the rank of AUCs (AUC >0.65, [159]Table S7). Subsequently, LASSO regression analysis was performed to remove collinear variables from the candidate NET-related genes. N = 5 was set as the optimal parameter for variable selection ([160]Fig. 6A and B). Finally, five candidate genes were enrolled in the NET-related model and the NET-related score was determined using the gene expression and coefficients ([161]Fig. 6C). Responders in the Kim cohort had significantly higher expression of AKT1 (P < 0.01), HIST2H2AC (P < 0.05), PRKCG (P < 0.01), and lower expression of ACTA2 (P < 0.01) and ITGB3 (P < 0.05) than non-responders ([162]Fig. 6D). The NET-related score was also remarkably elevated among responders (P < 0.001, [163]Fig. 6E), and the group with high NET-related scores had more responders (P < 0.001, [164]Fig. 6F). The ROC analyses demonstrated the high predictive performance of the NET-related model (AUC = 0.939, 95%CI = [0.870, 1], P < 0.001, [165]Fig. 6G). Furthermore, except for CPS (AUC = 0.956, 95%CI = [0.900, 1], P < 0.001), the predictive biomarkers such as MSI/EBV (AUC = 0.902, 95%CI = [0.787, 1], P < 0.001), GEP (AUC = 0.836, 95%CI = [0.675, 0.997], P = 0.001), and TME score (AUC = 0.891, 95%CI = [0.746, 1], P < 0.001) were weaker than those of the NET-related model ([166]Fig. 6G). Besides, leave-one-out cross-validation (LOOCV) and bootstrapping algorithms validated that the NET-related model had a robust performance for predicting immunotherapy response (LOOCV: AUC = 0.785, 95%CI = [0.630, 0.941], P = 0.004; bootstrapping: AUC = 0.819, 95%CI = [0.746, 0.901], P < 0.001; [167]Fig. S5). Fig. 6. [168]Fig. 6 [169]Open in a new tab Construction of a NET-related model for predicting response to ICIs in gastric cancer. (A) Association between the coefficients and Lambda value using the LASSO regression analysis. (B) The best Lambda value was selected according to the variation of binomial deviance with the increasing Lambda value. (C) Five candidate variables were identified, including ACTA2, AKT1, HIST2H2AC, ITGB3, and PRKCG. The NET-related model, named NET-related score, was established based on the sum of gene expression multiplied by coefficients. (D) Violin plot showing the comparison of the gene expression of ACTA2, AKT1, HIST2H2AC, ITGB3, and PRKCG between responders and non-responders in the Kim cohort. (E) Violin plot showing the comparison of NET-related score between responders and non-responders in the Kim cohort. (F) Histogram showing the association between NET-related score and response rate of ICIs therapy in the Kim cohort. (G) ROC curve showing the predictive capability of response to ICIs by NET-related score, MSI/EBV, GEP, CPS, and TME score in the Kim cohort. *P < 0.05; **P < 0.01; ***P < 0.001. 3.8. Validation of the NET-related model for predicting response and prognosis in patients treated with ICIs To verify the predictive performance of the NET-related model for ICIs therapy, three ICI-treated cohorts were acquired from previous studies [[170][15], [171][16], [172][17]]. Responders in the Gide, Liu, and Mariathasan cohorts consistently had higher NET-related scores than non-responders (all P < 0.001, [173]Fig. 7A), and the group with higher NET-related scores also had more responders than the group with lower NET-related scores (all P < 0.001, [174]Fig. 7B). ROC analyses demonstrated a robust and efficient predictive performance of the NET-related model in the Gide (AUC = 0.767, 95%CI = [0.657, 0.876], P < 0.001), Liu (AUC = 0.704, 95%CI = [0.609, 0.799], P < 0.001), and Mariathasan (AUC = 0.682, 95%CI = [0.612, 0.752], P < 0.001) cohorts ([175]Fig. 7C). Kaplan-Meier analyses suggested that patients with high NET-related scores had better OS than those with low NET-related score in the Gide (HR = 0.77, P = 0.002), Liu (HR = 0.44, P = 0.001), and Mariathasan (HR = 0.74, P = 0.043) cohorts ([176]Fig. 7D). Similar findings were observed in the Gide (HR = 0.33, P < 0.001) and Liu (HR = 0.40, P < 0.001) cohorts for PFS ([177]Fig. 7E). Taken together, patients with high NET-related scores had significant immunotherapeutic benefits and favorable survival outcomes. These findings supported the NET-related model as a promising biomarker to predict immunotherapy efficacy in GC, which can help clinicians make appropriate immunotherapy decisions. Fig. 7. [178]Fig. 7 [179]Open in a new tab Validation of the NET-related model for predicting response and prognosis in patients treated with ICIs. (A) Violin plot showing the comparison of NET-related score between the responder and non-responder group in the Gide, Liu, and Mariathasan cohorts. (B) Histogram showing the association between NET-related score and response rate of ICIs therapy in the Gide, Liu, and Mariathasan cohorts. (C) ROC curve showing the predictive value of the NET-related model for response to ICIs in the Gide, Liu, and Mariathasan cohorts. (D–E) Kaplan-Meier curves showing the comparison of OS (D) and PFS (E) between the groups with low and high NET-related scores in the Gide, Liu, and Mariathasan cohorts. ***P < 0.001. 4. Discussion NET is a web-like chromatin structure containing DNA fibers and proteins from activated neutrophils, involved in the biological processes of inflammation, immune defense, and autoimmune disorders [[180]11]. NET has recently emerged as a hotspot subject for oncology research because of its multiple roles in cancer progression and metastatic dissemination [[181]48]. DNA released by NET has been observed to promote tumor cell migration and metastases in several mouse models [[182]49]. NET induced by sustained inflammation activates silent tumor cells and promotes relapse and metastasis [[183]50]. NET accumulation accelerates postoperative tumor progression and recurrence [[184]51]. Additionally, several NET-related markers have been proposed to predict survival outcomes in multiple cancers, such as renal cell carcinoma [[185]52], breast cancer [[186]19], and lung adenocarcinoma [[187]53]. Nonetheless, limited research has focused on NET regulation in the TME and its clinical significance in GC immunotherapy. To address this issue, the role of NET in GC was explored and a novel NET-related model was developed to predict immunotherapy efficacy. This study began by depicting the genomic and transcriptional landscape of NET-related genes in GC. It was observed that PIK3CA mutations occurred most frequently, whereas the other genes had a low mutation rate. Frequent mutations in PIK3CA were also observed in breast cancer [[188]19]. Nevertheless, the prognostic significance was inconsistent in previous studies. In the present study, the association between PIK3CA mutations and prognosis in GC was not observed. The biological significance of PIK3CA mutations in GC warrants further experimental investigations. At the transcriptional level, it was revealed that most NET-related genes were up-regulated in tumors and enriched in signaling pathways, such as NET formation and ECM-related pathways. Mechanically, it may be stated that NETs act as suitable harbors for cell adhesion, thereby promoting the progression and spread of tumor cells [[189]54]. Significant diversity in the compositions and locations of the immune and non-immune components across different tumors determines the status of clinical benefits from immunotherapy [[190]2]. Activated CD8^+ T and NK cells are the classical effectors of anti-tumor immune responses [[191]55]. In this study, clustering the transcriptional matrix of NET-related genes revealed that the high NET-related cluster had more abundant infiltrations of anti-tumor immune cells and exhibited more immune effector molecules than the low NET-related cluster. One limitation of bulk RNA-seq analysis is that it cannot assign the dysregulated genes to specific cell types, due to the numerical superposition of signaling from different cell subsets [[192]56]. To counteract this bias, scRNA-seq analysis was conducted to provide a detailed understanding of how NET-related genes regulate immune cells in the TME. Unbalanced distributions of immune cell subsets existed in the tumors with different NET activity. The abundance of effector CD8^+ T cells, C1QC^+ macrophages, and SPP1^+ macrophages was increased in the group with a high AUC, adding to the knowledge about the association between NET-related clusters and TME in GC tumors. Previous studies have revealed multiple immunosuppressive roles of NET in TME. For instance, Kaltenmeier et al. demonstrated that NET caused exhaustion and dysfunction of CD8^+ T cells [[193]57]. Teijeira et al. observed that NET-coated tumor spheroids protected cancer cells from the cytotoxicity of NK cells [[194]58]. Donis-Maturano et al. found that macrophages and DCs underwent apoptosis after prolonged incubation with NET [[195]59]. Besides, Wang et al. revealed that NET could activate TLR4 on the surface of naïve CD4^+ T cells to regulate Treg differentiation [[196]60]. Therefore, the interactions between NETs and various immune cells contribute to immune suppression in TME. ICI therapy has been recognized as the most revolutionary cancer treatment in the era of precision medicine. However, it remains plagued by unsatisfactory efficacy, particularly in highly heterogeneous tumors such as GC [[197]4]. Thus, developing predictive models for immunotherapy is urgently needed [[198]61]. This study successfully established a predictive signature of the NET-related genes, namely the NET-related score, to predict response to immunotherapy. ROC analyses revealed that the predictive capability of the NET-related score was more effective than that in previous immunotherapy biomarkers, such as MSI, EBV, GEP, and TME score. Furthermore, the predictive and prognostic values of the NET-related score were verified using the independent immunotherapy datasets. This is the first study to explore the predictive performance of the NET-related model for GC immunotherapy based on multiple independent cohorts. In clinical practice, PD-L1 expression is the most important biomarker for immunotherapy [[199]2]. However, a considerable number of patients with high PD-L1 expression still cannot benefit from immunotherapy, whereas some patients with low PD-L1 expression respond to immunotherapy [[200]62]. It suggests that a single biomarker is not sufficient enough to predict immunotherapy response accurately. Combining the NET-related model with PD-L1 expression will capture more immune-inflamed features to improve the predictive capability. Therefore, it may become a promising strategy to better predict immunotherapy response in the future. More importantly, given its advantages of low cost, high efficacy, and strong robustness, the NET-related model may be clinically actionable to screen the suitable population for immunotherapy. Nonetheless, there are some limitations to our research. First, due to the lack of public immunotherapy cohorts in GC, the NET-related model for predicting immunotherapy was validated in patients with metastatic melanoma and urothelial carcinoma. Larger cohorts of patients with GC are required to validate the robustness of this model. Second, despite the excellent performance of the model, heterogeneity in different datasets may have introduced potential bias in this study, such as an inconsistent sequencing platform, selective bias in datasets, and incomplete follow-up data. More cross-validation techniques are needed to confirm the stability and generalizability of this model. Last but not least, the NET-related model should be validated further by performing both in vitro and in vivo experiments to better understand the association between NET-related score and tumor immune response. The in-depth molecular mechanisms require further investigation in the future. In conclusion, this study established a NET-based classification for prognostic stratification and constructed a robust model to predict immunotherapy response in GC patients. These findings provide new insights into NET-related genes in GC and can guide clinicians in making immunotherapy decisions. Developing novel predictive strategies will be a promising area in cancer immunotherapy in the future. Data availability The TCGA-STAD dataset was downloaded from the cBioportal database ([201]https://www.cbioportal.org/). The microarray datasets ([202]GSE62254, [203]GSE15459) and the scRNA-seq dataset ([204]GSE183904) were downloaded from Gene Expression Omnibus (GEO) database ([205]http://www.ncbi.nlm.nih.gov/geo/). The data of the Kim, Gide, Liu, and Mariathasan cohorts were acquired from the corresponding publications and article supplementary materials. All the codes and datasets can be obtained from the supplementary materials and the Github website ([206]https://github.com/jiangjunjie0415/Neutrophil-Extracellular-Trap) . CRediT authorship contribution statement Ningjie Sun: Writing – original draft, Data curation. Junjie Jiang: Writing – original draft, Formal analysis, Data curation. Biying Chen: Resources, Methodology, Investigation. Yiran Chen: Visualization, Software. Haiming Wu: Validation, Project administration, Formal analysis. Haiyong Wang: Writing – review & editing, Supervision, Funding acquisition, Conceptualization. Jianfeng Chen: Writing – review & editing, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgments