Abstract

Inflammatory cell infiltration within the tumor microenvironment compromises the integrity of the basement membrane, contributing to chemotherapy resistance and distant metastasis. This study investigates the interplay between inflammatory responses and the basement membrane through basement membrane-inflammatory response related genes (BMIRGs) in lung adenocarcinoma (LUAD). Using datasets from the TCGA and GEO databases, a BMIRG risk-score model was developed based on univariate Cox regression and LASSO-Cox regression. Nine genes were identified as comprising the BMIRG signature. Using this signature, we developed predictive models, including the BMIRG risk score, a nomogram and Cox regression models. Stratifying patients into high- and low-risk groups based on BMIRG risk scores revealed a more favorable prognosis in the low-risk group. Additionally, the signature correlated with drug sensitivity, immunotherapy response, and the infiltration of multiple immune cell types and inflammatory factors in the microenvironment. The role of one signature gene, CCL20, was further validated using in vitro experiments and tissue samples from Chinese patients with LUAD. Functional experiments revealed that inhibiting CCL20 attenuated cell migration in both A549 and H1299 cells. Immunohistochemistry and Western blot analyses demonstrated that CCL20 protein levels were significantly elevated in LUAD tissues compared to adjacent non-tumor tissues and were associated with TNM stage. High CCL20 expression was also linked to shorter overall survival. In conclusion, the BMIRG signature is a robust predictor of survival and offers a valuable clinical tool for LUAD management. Moreover, CCL20 emerges as a potential diagnostic and therapeutic target, given its high expression in LUAD patients and its association with poorer prognosis.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-05582-0.
Keywords: Lung adenocarcinoma, Basement membrane, Inflammatory response, Prognosis, Immune microenvironment

Subject terms: Cancer, Immunology, Biomarkers, Diseases, Oncology

Introduction

Lung cancer remains the leading cause of cancer incidence and mortality worldwide [1,2], with an overall poor prognosis due to its atypical symptoms, propensity for distant metastasis and rapid progression. Most patients are diagnosed at an advanced stage, with a dismal 5-year survival rate of less than 20% [3]. LUAD, the most prevalent subtype of non-small cell lung cancer (NSCLC), accounts for approximately 40% of all cases [4]. Current standard treatments for NSCLC primarily include surgical resection, radiotherapy and chemotherapy [5]. Recent advancements in genetic testing have identified key mutations in KRAS and EGFR, enabling the development of targeted therapies and immunotherapies tailored to individual patients [6]. However, drug resistance remains a major challenge, often leading to recurrence and distant metastasis in a substantial subset of patients [7]. This highlights the crucial need to identify novel biomarkers for early diagnosis and therapeutic target discovery.

Within lung cancer tissues, cancer cells constitute only a small fraction of the tumor, with the majority comprising extracellular matrix, various immune cells, inflammatory cells, chemokines and cancer-associated fibroblasts [8–10]. Furthermore, lung cancer is characterized by an extensive tumor immunosuppressive microenvironment (TIME), which promotes cancer cell proliferation through direct suppression of anti-tumor immunity and/or evasion of immune surveillance mechanisms [11,12]. Emerging research underscores the potential of disrupting this immunosuppressive network to enhance the tumor-killing activity of immune effector cells, offering new avenues for improving patient outcomes.
Immunotherapy and targeted therapy have made significant strides in clinical practice, notably through the use of immune checkpoint blockers [13] and tyrosine kinase inhibitors (TKIs) [14]. However, the overwhelming majority of patients with advanced lung cancer become resistant to current treatments, resulting in disease progression. Therefore, a comprehensive understanding of the mechanisms underlying lung cancer within the TIME is essential to developing new therapeutic strategies for these patients.

Metastasis is the primary cause of mortality in patients with cancer, including those with LUAD [15]. Infiltration of and breakthrough across the basement membrane by malignant cells is a hallmark of metastasis [16]. The basement membrane, a specialized component of the extracellular matrix (ECM), plays an indispensable role in organizing epithelial tissues by separating their epithelial and stromal compartments, acting as a crucial barrier against cancer cell invasion and metastasis [17,18]. The inflammatory response within the TIME exerts both protective and deleterious effects. On one hand, it can activate anti-tumor immune cells, which eliminate cancer cells and preserve the integrity of the basement membrane, thereby preventing metastasis [19–21]. Conversely, chronic inflammation and the infiltration of inflammatory cells can compromise the basement membrane, enabling the invasion of cancer cells into adjacent tissues and distant metastasis [22,23]. Hence, identifying adverse inflammatory markers within the microenvironment and devising targeted therapeutic interventions is essential to mitigate metastasis and enhance therapeutic efficacy.

In this paper, we systematically analyzed BMIRGs reported in recent years. The expression profiles of these genes were drawn from multiple databases to explore a gene signature.
Initially, we explored the enrichment pathways of various BMIRG patterns and established a BMIRG-based risk score model to predict patient prognosis using data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. The risk scores were then utilized to develop nomograms and Cox regression models, which effectively predicted patient survival. Furthermore, the association between BMIRG risk scores and patients’ responsiveness to drugs was evaluated, providing evidence-based insights into LUAD treatment strategies. Notably, BMIRG risk scores were correlated with a tumor suppressive immune microenvironment, suggesting their potential utility in predicting immunotherapy responses in patients with LUAD. Finally, the findings were validated through in vitro experiments and analysis of LUAD tissue samples from Chinese patients, reinforcing the clinical relevance of CCL20.

Materials and methods

Data collection

The mRNA expression matrix data and corresponding clinical information for LUAD were downloaded from the TCGA database (https://portal.gdc.cancer.gov/), and the GSE31210 dataset was obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). Basement membrane-inflammatory response related gene sets were adopted from previously published literature [24] and the Molecular Signature Database [25]. The combined cases were randomly divided into a training cohort and a validation cohort at a 1:1 ratio. Table 1 shows the clinical and pathological characteristics of LUAD patients in the TCGA and GEO cohorts, while Fig. 1 visually represents the entire research procedure.

Table 1. Clinical and pathological characteristics of the LUAD patients retrieved from the TCGA and GEO databases.
Covariates   Type      Total   Test   Train   p-value
Age          ≤ 65      415     214    201     0.5438
             > 65      308     154    157
             Unknown   10      1      9
Gender       Female    393     201    192     0.5272
             Male      340     165    175
Stage        I         440     209    231     0.1545
             II        178     101    77
             III       81      43     38
             IV        26      11     15
             Unknown   8       2      6

Fig. 1. Study workflow diagram. Abbreviations: BMIRGs, basement membrane-inflammatory response related genes; DEGs, differentially expressed genes; GEO, Gene Expression Omnibus; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; TCGA, The Cancer Genome Atlas; WB, Western blot; IHC, immunohistochemistry.

Identification and characterization of molecular subtypes based on BMIRGs

The expression levels of BMIRGs were visualized using a volcano plot constructed in R. Differentially expressed genes (DEGs) associated with basement membrane and inflammatory response were identified based on the criteria of FDR < 0.05 and |log2 FC| ≥ 1, and their expression patterns were illustrated on a heatmap. Prognostic BMIRGs in LUAD were identified through univariate Cox regression analysis with a significance threshold of p < 0.05. These prognostic DEGs were subsequently divided into two clusters via consensus clustering. Gene expression levels in each cluster were analyzed in relation to clinical characteristics. To elucidate the potential biological functions of the BMIRG patterns, functional enrichment analysis, including gene set enrichment analysis (GSEA) and gene set variation analysis (GSVA), was performed using the “limma”, “GSEABase” and “GSVA” packages.

Development and validation of a prognostic model based on BMIRGs

A prognostic scoring system for LUAD based on BMIRGs was constructed through a LASSO-Cox regression analysis in the training group with the “glmnet” and “survival” R packages.
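A LASSO-Cox signature of this kind is applied to each patient as a weighted sum of gene expression values. The following is a minimal Python sketch of that scoring step and of a median split into risk groups, not the study's own code (the analysis was run in R). The gene names are taken from the reported signature, but the coefficients, expression values, patient IDs and function names are invented for illustration; the fitted coefficients are reported in the supplementary tables.

```python
from statistics import median

# Hypothetical coefficients for three signature genes (illustration only;
# these are NOT the fitted values from the study).
COEFFICIENTS = {"CCL20": 0.12, "LAD1": 0.08, "IL7R": -0.10}


def risk_score(expression, coefficients):
    """Return the weighted sum of expression values over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Invented expression values for three hypothetical patients.
patients = {
    "P1": {"CCL20": 5.0, "LAD1": 3.0, "IL7R": 1.0},
    "P2": {"CCL20": 1.0, "LAD1": 2.0, "IL7R": 6.0},
    "P3": {"CCL20": 3.0, "LAD1": 3.0, "IL7R": 3.0},
}

scores = {pid: risk_score(expr, COEFFICIENTS) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

In this sketch a patient whose score equals the median is assigned to the low-risk group; how ties are handled is an assumption, since the text does not specify it.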
Risk scores were calculated as the sum, over the identified risk genes, of each gene's expression value multiplied by its corresponding regression coefficient, with coefficients determined at the optimal penalty parameter λ. The accuracy and robustness of the scoring system were evaluated using receiver operating characteristic (ROC) curves generated with the “timeROC” R package, assessing its sensitivity and specificity. Kaplan–Meier analysis was employed to compare overall survival (OS) between high-risk and low-risk groups. To enhance the predictive capability, a nomogram model incorporating the BMIRG-based risk score and clinical parameters was generated using the “survival” and “rms” packages. Calibration curves at 1, 3, and 5 years were created to assess the model’s accuracy. A cumulative hazard curve was constructed using the “survminer” package, and decision curve analysis was conducted using the “survival” and “survminer” packages to evaluate the nomogram’s clinical utility. Validation of the scoring system was performed in the total and validation groups.

Relationship between BMIRG risk score and drug sensitivity

Therapeutic strategies for LUAD typically include chemotherapy, small-molecule inhibitors, and immunotherapy. Differential sensitivities to these therapies between high- and low-risk groups were analyzed using the “oncoPredict” R package. The “limma” and “ggpubr” packages were utilized to assess differential responses to immunotherapy. Tumor Immune Dysfunction and Exclusion (TIDE) scores were obtained from the TIDE website (http://tide.dfci.harvard.edu/, accessed on 20 January 2023) and analyzed to evaluate the potential for tumor immune escape based on gene expression data.

Relationship between BMIRG risk score and the tumor immune microenvironment

The assessment of tumor immunity involves evaluating key factors, such as stromal scores, tumor immune scores, and the abundance of various infiltrated immune cell populations within the tumor microenvironment.
Moreover, it involves considering the expression of immune checkpoints and biomarkers of immune cell activation. Associations between the distribution of immune cells and individual risk scores were also systematically analyzed.

BMIRG signature functional enrichment analysis

To functionally characterize the differentially expressed genes (DEGs), we performed enrichment analysis using the “clusterProfiler” R package, encompassing both Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses [26,27]. The GO enrichment analysis included biological process (BP), cellular component (CC) and molecular function (MF) terms; P < 0.05 was considered statistically significant.

Survival analysis and clinical relevance analysis for CCL20

The patient cohorts were divided into two subgroups based on gene expression levels using the R packages “survival” and “survminer”. The associations between clinical characteristics and target gene expression levels were subsequently analyzed and visualized as box plots using the R package “ggpubr”.

Single-cell transcriptome analysis

The single-cell RNA-sequencing (scRNA-seq) data for lung cancer were obtained from the GSE123902 dataset, available in the GEO database and corresponding to a published article [28]. For this analysis, raw data from seventeen samples were utilized, including eight primary tumor tissues, four normal tissues and five metastatic tumor tissues. The GSE123902 dataset was employed to investigate the expression patterns of CCL20 in lung cancer cells. Data preprocessing and rigorous quality control were performed using the “Seurat” R package.
The filtering criteria were defined as follows: (a) each case was required to comprise a minimum of 300 cells; (b) each gene was required to be expressed in at least 3 cells, and each cell was required to express a minimum of 300 genes; (c) mitochondrial gene content and sequencing depth were evaluated to assess cell viability and technical quality, with mitochondrial RNA content restricted to a maximum of 20% per cell [29]. Following quality control, a filtered gene expression matrix comprising 23,278 genes across 42,847 cells was obtained for subsequent analyses.

Cell culture

The LUAD cell lines A549 and H1299 and the HEK293T cell line were provided by the NHC Key Laboratory of Cancer Proteomics of Central South University. A549 and H1299 cells were cultured in RPMI1640 (MeisenCTCC, Zhejiang, China) supplemented with 10% fetal bovine serum (FBS, NEWZERUM, New Zealand) plus 1% penicillin/streptomycin at 37 °C under 5% CO2. HEK293T cells were maintained in DMEM (MeisenCTCC, Zhejiang, China) with 10% FBS and 1% penicillin/streptomycin.

Plasmid construction and lentivirus infection

The shRNAs targeting CCL20 were cloned into the pLKO.1-puro vector (Addgene). The pLKO.1-shCCL20 or pLKO.1 scramble control vector was co-transfected with lentiviral packaging plasmids (psPAX2 and pMD2.G, Addgene) into HEK293T cells, and the virus-containing supernatant was collected at 48 h or 72 h post-transfection. The collected viral particles were used to infect A549 and H1299 cells, which were then selected with 2 μg/mL puromycin. The shRNA sequences for CCL20 were as follows:

shCCL20#1: GGAATTGGACATAGCCCAAGA
shCCL20#2: GACCGTATTCTTCATCCTAAA

Real-time PCR analysis

Total RNA was extracted from cell lines using TRIzol (Invitrogen). Total RNA (1 μg) was reverse-transcribed into cDNA using the HiScript II Q RT SuperMix for qPCR (Vazyme).
qRT-PCR was performed using the ChamQ Universal SYBR qPCR Master Mix (Vazyme) on an LC480 instrument (Roche). The endogenous control for RNA quantification was ACTB. The sequences of the primer pairs are as follows:

CCL20: Forward: AGCACTCCCAAAGAACTGGG, Reverse: AGGTTCTTTCTGTTCTTGGGCT
ACTB: Forward: GCTCCGGCATGTGCAAGG, Reverse: GGCCTCGTCGCCCACATA

Western blot

Samples were lysed in RIPA buffer (Beyotime, P0013B) containing protease (AbMole, M5239) and phosphatase (AbMole, M7528) inhibitors. Lysates were centrifuged at 13,000 × g at 4 °C for 20 min, and the supernatants were denatured with SDS buffer at 100 °C for 10 min. The final protein samples were loaded onto 4–20% SDS-PAGE gels, transferred to PVDF membrane (Millipore), blocked with 5% BSA, and immunoblotted with the indicated antibodies. The primary antibodies used in the study were anti-CCL20 (Abcam, ab9829, 0.2 μg/mL) and anti-α-Tubulin (Proteintech, 11-224-1-AP, 1:5000). The signals were detected with HRP-conjugated secondary antibody (Abbkine, A21020, 1:5000).

Wound healing assay

The cells were seeded into a 6-well plate 24 h prior to the experiment. Upon reaching 100% confluency, a 10 µl sterile pipette tip was used to scratch vertical lines, followed by PBS washing to remove cell debris. Cells were cultured in serum-free medium. Photographs were captured at 0, 12, and 24 h and analyzed using ImageJ software to assess the cell migration ratio (%).

Transwell assay

Transwell migration experiments were conducted using Transwell chambers (Corning, USA). Cells were resuspended at a density of 2 × 10^4 per well in 200 μl of serum-free medium and seeded into the upper chambers, while the lower chambers were filled with 700 μl of medium containing 10% FBS. After 36 h of incubation, migrating cells in the lower compartment were fixed in 4% paraformaldehyde for 30 min, followed by staining with 0.1% crystal violet for 15 min at room temperature.
Subsequently, cells were photographed and counted under a microscope.

Immunohistochemistry (IHC)

A human lung tissue array (AF-LucSur2202), containing 80 lung adenocarcinoma tissues and paired adjacent normal tissues, was obtained from AiFang Biologic (Changsha, China). The IHC staining of CCL20 (1:150) was performed following the manufacturer’s protocol. Staining was scored by two independent experienced pathologists. The staining intensity score was defined as follows: 0 points (negative); 1 point (light brown); 2 points (brown); 3 points (dark brown). The score for the proportion of positively stained cells was defined as follows: 0 points (0%); 1 point (10%–25%); 2 points (26%–50%); 3 points (51%–75%); 4 points (76%–100%). The final score was the intensity score multiplied by the percentage score. We divided the tissues into low (score ≤ 6) and high (score > 6) groups according to the final score.

Statistical analysis

Baseline patient characteristics were summarized using descriptive statistics. Data were analyzed by one-way ANOVA followed by Dunnett’s multiple comparisons test. Kaplan–Meier survival analysis combined with log-rank tests was employed to compare survival outcomes between patients in the high-risk and low-risk groups. Quantitative image analysis was performed using ImageJ software (v1.53t), while statistical analysis and data visualization were conducted using GraphPad Prism (v9.0) and R software (v4.2.3), respectively. All statistical tests were two-sided, and p-values less than 0.05 were considered statistically significant.

Results

Identification of prognosis-related BMIRGs

The overall research process is visually summarized in Fig. 1. To investigate the expression and distribution characteristics of BMIRGs in LUAD, data from 507 LUAD cases in the TCGA database and 226 LUAD patients in the GSE31210 dataset were analyzed. Clinical and pathological characteristics of LUAD patients in the TCGA and GEO cohorts are listed in Table 1.
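The DEG criteria given in the Methods (FDR < 0.05 and |log2 FC| ≥ 1) reduce to a two-condition filter per gene. The following minimal Python sketch shows that filter; the function name, gene labels and numeric values are invented placeholders, and the study itself performed this step in R.

```python
def classify_deg(log2_fc, fdr, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Label one gene as 'up', 'down' or 'ns' (not significant)."""
    if fdr < fdr_cutoff and abs(log2_fc) >= fc_cutoff:
        return "up" if log2_fc > 0 else "down"
    return "ns"


# Invented (log2 FC, FDR) pairs for placeholder gene labels.
example = {
    "GENE_A": (2.3, 0.001),   # large positive change, significant
    "GENE_B": (-1.0, 0.04),   # exactly at the fold-change cutoff, significant
    "GENE_C": (1.8, 0.20),    # large change but FDR too high
    "GENE_D": (0.4, 0.0001),  # significant but change too small
}

labels = {gene: classify_deg(fc, fdr) for gene, (fc, fdr) in example.items()}
```

Only genes passing both conditions are counted as up- or downregulated; a gene with a very small FDR but a fold change below the cutoff (GENE_D here) is still excluded.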
Analysis of mRNA expression for 422 BMIRGs in patients with LUAD identified 51 upregulated and 82 downregulated DEGs (Fig. 2A). The heatmap illustrates the top 30 DEGs (Fig. 2B). Univariate analysis revealed 47 BMIRGs significantly associated with OS (Fig. 2C and Supplementary Table S1). The relationship between the expression levels of these BMIRGs and survival outcomes is detailed in Supplementary Figure S1. Copy number variation (CNV) analysis of these 47 DEGs showed that ECM1, BCAN, and ITGB4 predominantly exhibited CNV gains, whereas CD69 and POSTN displayed CNV losses (Fig. 2D). The chromosomal distribution of these CNVs is depicted in Fig. 2E. Additionally, the associations among these 47 BMIRGs are represented in a prognosis network diagram (Supplementary Figure S2). These findings highlight the differential expression of BMIRGs in LUAD tissues compared to adjacent normal tissues and their significant correlation with patient prognosis.

Fig. 2. Identifying prognostic BMIRGs. (A) The DEGs among BMIRGs were identified from the TCGA LUAD dataset and the GSE31210 dataset. Upregulated genes are indicated in red, while downregulated genes are indicated in green. (B) Heatmap showing the expression levels of the top 30 DEGs. (C) Prognostic forest plot for 47 BMIRGs. Risk genes with a hazard ratio greater than 1 are colored red, while protective genes are displayed in blue. (D) Copy number variation of these BMIRGs. (E) Chromosomal locations of these BMIRGs.

Consensus clustering and pathway enrichment analysis among 47 prognostically relevant BMIRGs

To further investigate BMIRG expression patterns in LUAD, a consensus clustering algorithm was applied and identified two distinct gene expression patterns among patients with LUAD (Fig. 3A–C). Principal component analysis (PCA) further confirmed the significant differences in gene expression profiles between these two clusters (Fig. 3D,E). The Kaplan–Meier survival analysis revealed that patients in Cluster A had significantly shorter OS compared to those in Cluster B (Fig. 3F). A heatmap was used to illustrate the associations between BMIRG clusters, expression levels and clinical characteristics (Fig. 3G). Despite the differences in BMIRG expression, clinical characteristics, including age, sex, tumor stage and database origin, did not differ between clusters.

Fig. 3. Identification of the BMIRG clusters in lung adenocarcinoma patients. (A–C) Consensus matrix for k = 2 was used to categorize patients into high- and low-risk clusters. (D) Principal component analysis of patients with LUAD. (E) Expression levels of 47 BMIRGs in the two clusters. (F) Overall survival of the two groups of patients based on Kaplan–Meier curves. (G) Heatmap showing the relationship between clinicopathologic features and expression profiles of 47 BMIRGs in both clusters.

To elucidate the biological functions of the DEGs in these two clusters, GSVA enrichment analysis was conducted. It revealed that Cluster A was mainly enriched in energy metabolism pathways, such as fatty acid metabolism and propanoate metabolism, while Cluster B was enriched in pathways related to cancer, base excision repair, cell cycle and ECM receptor interaction (Fig. 4A). GSEA further supported these findings, showing that Cluster A was positively enriched in the ECM receptor interaction and cytokine-cytokine receptor interaction pathways but negatively enriched in DNA replication, cell cycle and the P53 pathway (Fig. 4B). In contrast, Cluster B exhibited the opposite enrichment patterns (Fig. 4C). Briefly, the clustering of the identified BMIRGs revealed two distinct LUAD patterns with differential gene expression profiles and potential functional roles.

Fig. 4. Functional analysis of the two clusters.
(A) Heatmap showing GSVA functional enrichment of 47 BMIRGs in the two clusters. (B, C) GSEA of pathway enrichment in the two BMIRG clusters.

Construction of BMIRG scores for prognosis prediction

To construct a quantitative risk-score model incorporating BMIRGs, all patients were randomly divided into a training group and a validation group. Utilizing the optimal value of λ (Fig. 5A) and the lowest partial likelihood deviance (Fig. 5B), a LASSO-Cox regression model comprising nine BMIRGs was constructed. The genes and their corresponding coefficients are listed in Supplementary Table S3. Based on the risk scores, patients were categorized into high-risk and low-risk groups. Kaplan–Meier survival analysis indicated that patients in the high-risk group had significantly lower survival probabilities than those in the low-risk group, a trend consistent across the overall LUAD patient cohort (Fig. 5C), the training group (Supplementary Figure S3A), and the validation group (Supplementary Figure S3B). To validate the robustness of the BMIRG signature, we included two independent GEO cohorts in this study. We used the same formula to calculate the risk score of each patient in the two GEO cohorts. Patients were sorted into high-risk and low-risk groups in each cohort by the median risk score. Kaplan–Meier analysis demonstrated that the high-risk group had a worse prognosis than the low-risk group in both GEO cohorts, namely GSE13213 (Supplementary Figure S4A, p = 0.002) and GSE50081 (Supplementary Figure S4B, p < 0.001). ROC curves demonstrated the model’s predictive accuracy for 1-, 3-, and 5-year survival in all patients (Fig. 5D), in the training group (Supplementary Figure S3C), and in the validation group (Supplementary Figure S3D). An alluvial diagram depicted the relationships among risk scores, expression clusters and survival outcomes (Fig. 5E).
Furthermore, Cluster A had lower BMIRG risk scores compared to Cluster B (Fig. 5F), with additional details provided in Supplementary Table S2. Collectively, the BMIRG signature effectively distinguished patient survival outcomes.

Fig. 5. Construction and validation of the BMIRG signature in LUAD patients. (A, B) LASSO-Cox regression analysis yielding the nine-gene BMIRG prognostic signature. (C) Kaplan–Meier curve analysis for overall survival between the two risk groups. (D) Time-dependent ROC analysis of the TCGA LUAD cohort demonstrating the predictive efficiency of the prognostic score. (E) An alluvial diagram depicting the BMIRG prognostic genes’ expression in the two risk groups. (F) The BMIRG risk scores in different groups. (G) Heatmap of the expression of the nine genes in the BMIRG prognostic signature. (H) Multivariate Cox analysis of BMIRG risk score and clinicopathological parameters, including gender and stage. (I) Development of a nomogram to predict overall survival probabilities. (J) Cumulative hazard curve used to evaluate survival probability between the two groups. (K) Calibration curves of the model showing its predictive efficacy for overall survival of the LUAD patients.

Analysis of BMIRG expression profiles between the two risk groups revealed that LAD1, CCL20, ACAN, PCDH7 and TPBG were significantly upregulated in the high-risk group, while HMCN2, CX3CL1, IL7R and LAMP3 were expressed at lower levels (Fig. 5G). Compared to normal tissues, CX3CL1, IL7R and LAMP3 were downregulated in LUAD tissues, while the other six genes were upregulated (Supplementary Figure S5A). Correlations among the expression levels of the nine BMIRGs are shown in Supplementary Figure S5B. Given the influence of clinicopathological parameters and tumor grade on overall survival, a Cox regression model and a nomogram model were constructed to predict OS at 1, 3, and 5 years.
The forest plot identified sex, pathological stage, and BMIRG risk score as significant factors in the Cox regression model (Fig. 5H), consistent with findings from the nomogram (Fig. 5I). The cumulative hazard curves (Fig. 5J) demonstrated a significant increase in hazard over time for the high-risk group compared to the low-risk group. Calibration curves show the predictive efficacy of the nomogram model at 1, 3, and 5 years (Fig. 5K). Decision curve analysis demonstrated that the nomogram provided the highest clinical benefit compared to the BMIRG risk model and other clinical parameters across the training, validation, and total cohorts (Supplementary Figure S3E–G). These findings collectively highlight the robust predictive capacity of BMIRG-related models for survival outcomes in patients with LUAD.

Functional enrichment analysis of the BMIRG signature

To gain a comprehensive understanding of the biological roles and processes underlying the risk score, GO and KEGG annotation analyses were performed on the genes in the model. GO molecular function analysis revealed enrichment in several inflammatory response-related terms, including CCR chemokine receptor binding, chemokine activity, cytokine activity, cytokine receptor binding, and CXCR chemokine receptor binding (Supplementary Figure S6A–B). Similarly, KEGG pathway analysis highlighted associations with inflammation-associated pathways, such as cytokine-cytokine receptor interaction, the TNF signaling pathway, the chemokine signaling pathway, and the IL-17 signaling pathway (Supplementary Figure S6C–D). Collectively, these findings suggest that the BMIRG signature predominantly functions in inflammatory processes, which may underlie its prognostic capacity.

Drug sensitivity and immunotherapy prediction

Drug therapy remains a cornerstone in NSCLC treatment, playing a crucial role in improving patient outcomes and extending survival.
To evaluate the therapeutic implications of the BMIRG risk score, we examined the efficacy of immune checkpoint inhibitors using TIDE scores and assessed sensitivity to 198 drugs via the “oncoPredict” R package within the TCGA cohort. The analysis showed that high-risk patients had lower IC50 values for docetaxel, ERK_6604, gefitinib, luminespib, sapitinib, SCH772984, and WIKI4 (Fig. 6A–G), indicating greater predicted sensitivity and potential benefit from these agents. Additionally, individuals in the high-risk group displayed increased sensitivity to PD-1 immune checkpoint blockade (Fig. 6H,I). These results underscore the potential utility of the BMIRG model in guiding drug selection and immunotherapy strategies for LUAD.

Fig. 6. Evaluation of treatment responsiveness for patients in the high-risk group. Compared to the low-risk group, patients in the high-risk group showed significantly lower IC50 values for docetaxel (A), ERK_6604 (B), gefitinib (C), luminespib (D), sapitinib (E), SCH772984 (F), and WIKI4 (G). (H, I) The high-risk group was more sensitive to immune checkpoint inhibitors targeting PD-1.

TME and immune infiltration analysis of BMIRGs

The TME, comprising cancer cells, stromal cells, fibroblasts and various inflammatory cells, plays a critical role in the occurrence, development and progression of tumors, as an expanding body of evidence shows [30,31]. To investigate the relationships among the BMIRG signature, the tumor microenvironment, and immunotherapy response in LUAD, we analyzed the estimated immune scores, the abundance of various immune cells, and key immune molecules within the stroma of LUAD patients.
The CIBERSORT algorithm revealed that the high-BMIRG score group exhibited higher immune scores than the low-BMIRG score group, while there was no significant difference in stromal score or ESTIMATE score between the high- and low-risk groups (Fig. 7A). The immune cell infiltration profiles for each case are shown in Fig. 7B, while differences in immune cell abundance between the two risk groups are shown in Fig. 7C. Notably, the low-risk group demonstrated a higher abundance of plasma cells, activated and resting memory CD4 T cells, and infiltrating M1 macrophages than the high-risk group. A heatmap illustrates correlations among the infiltrating immune cells (Fig. 7D), and the correlations among model genes, risk scores, and immune cells are shown in Fig. 7E.

Fig. 7. Immune infiltration patterns of the BMIRG signature. (A) Tumor microenvironment scores of the high- and low-risk groups. (B) Immune infiltration components in each patient. (C) Immune infiltration components in the two risk groups of LUAD. (D) Correlations among infiltrated immune cells. (E) Correlations between immune cells and the BMIRG signature.

Bioinformatic analysis and experimental validation of CCL20 in LUAD progression

CCL20, a key differentially expressed BMIRG within the gene signature, was identified through comprehensive bioinformatics analysis and experimental validation. By calculating the correlation between the expression level of each gene in the model and the risk score, we found that CCL20 exhibited the strongest positive correlation with the risk score (correlation coefficient = 0.5) among all genes in the model (Supplementary Figure S7A–I). Due to its significant association with the risk score, CCL20 was selected as a key target for further experimental validation. First, the GSE75037 dataset showed upregulation of CCL20 in LUAD tissues (Fig. 8A).
We then analyzed CCL20 expression across LUAD stages. CCL20 expression was significantly higher in stages II, III, and IV than in stage I (Fig. [130]8C, P < 0.05). When patients were stratified into high-risk and low-risk groups based on the median expression level of CCL20, high CCL20 expression correlated with poorer prognosis (Fig. [131]8B). To validate these findings, we further analyzed a tissue microarray cohort of 80 LUAD samples and 8 pairs of lung cancer tissues and corresponding adjacent non-tumor tissues. IHC and WB results showed that CCL20 protein expression was upregulated in tumor tissues (Fig. [132]8D–F). Moreover, CCL20 expression was positively associated with lymph node metastasis (P = 0.01) and clinical stage (P = 0.02) (Table [133]2). Based on the complete clinical information of the LUAD samples, we also analyzed the correlation between CCL20 expression and overall survival. As displayed in Fig. [134]8G, patients with higher CCL20 expression had a markedly poorer prognosis, consistent with our previous analysis. Subsequently, we performed in vitro experiments, including scratch assays and transwell migration assays, to examine the pro-carcinogenic effects of CCL20. qRT-PCR confirmed the knockdown efficiency of CCL20 in A549 and H1299 cells (Fig. [135]8H,I). Wound healing (Fig. [136]8J,H) and transwell migration assays (Fig. [137]8K,L) showed that CCL20 knockdown reduced cell migration. Together, these results demonstrated that CCL20 was significantly upregulated in lung cancer tissues and correlated with poor patient prognosis, and that silencing CCL20 effectively inhibited the migratory capacity of lung cancer cells.
Fig. 8. Evaluation of CCL20 expression and prognostic value in clinical LUAD samples and cells. (A) Differential expression analysis of CCL20 in [140]GSE75037. (B) Association between CCL20 expression levels and overall survival.
(C) CCL20 transcript levels across tumor stages in LUAD. (D) Immunohistochemical staining of CCL20 protein in LUAD tissue arrays, with representative images of CCL20 protein in tumor and adjacent tissues. (E) Western blot detection of CCL20 expression in 8 matched lung cancer tissues. (F) Quantification of CCL20 protein levels in LUAD and matched adjacent normal tissues. (G) Kaplan–Meier curve showing the overall survival of LUAD patients with high and low CCL20 expression. (H, I) qRT-PCR was used to test the knockdown efficiency in A549 and H1299 cells. (J, H) Cell migration was determined by wound healing assay. (K, L) Transwell assays were used to assess the migration ability of A549 and H1299 cells. N, non-tumor; T, tumor. Data were analyzed by one-way ANOVA followed by Dunnett’s multiple comparisons test. Statistical analyses were performed using GraphPad Prism 9.0. **p < 0.01, ***p < 0.001, ****p < 0.0001.
Table 2. CCL20 expression and clinicopathological features in LUAD.
Variables      n     CCL20 low expression    CCL20 high expression    p-value*
Gender                                                                0.81
  Male         46    17                      29
  Female       34    14                      20
Age                                                                   0.17
  ≥ 55         43    14                      29
  < 55         37    18                      19
T                                                                     0.97
  T0           2     1                       1
  T1           18    6                       12
  T2           44    18                      26
  T3           10    4                       6
  T4           6     2                       4
N                                                                     0.01
  N0           29    16                      13
  N1 + N4      51    13                      38
Stage                                                                 0.02
  I            33    15                      18
  III + IV     46    11                      37
Correlation of CCL20 expression and clinicopathological features in 80 patients with lung adenocarcinoma. TNM stage information was missing for one patient.
Interestingly, the analysis of immune cell infiltration also revealed that high CCL20 expression was associated with increased infiltration of plasma cells, activated memory CD4 T cells and neutrophils, while low CCL20 expression correlated with the presence of M2 macrophages and other cell types (Supplementary Figure S9).
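As a minimal sketch of how the N-stage association in Table 2 could be checked, the Python snippet below runs a Pearson chi-square test on the 2×2 counts (N0 versus node-positive, low versus high CCL20 expression). The choice of test is an assumption, since the test behind the reported p-values is not stated here; the function name `chi_square_2x2` is hypothetical, and only the standard library is used.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Hypothetical helper for illustration; not the authors' analysis code.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square upper tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: N0, node-positive (labeled "N1 + N4" in Table 2).
# Columns: low, high CCL20 expression. Counts are taken from Table 2.
n_stage_counts = [[16, 13], [13, 38]]
chi2, p_value = chi_square_2x2(n_stage_counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumptions the statistic is about 7.05 with a p-value of roughly 0.008, in line with the P = 0.01 reported in Table 2; a continuity-corrected or exact test would give a somewhat different value.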
Additionally, single-cell analysis of LUAD revealed that CCL20 was predominantly clustered in the macrophage module (Supplementary Figure S6E-H), consistent with the localization of macrophage markers reported in previous studies^[142]32. Based on CCL20 expression levels in LUAD patients from the TCGA database, we stratified samples into high- and low-expression groups and performed differential expression analysis. GO enrichment analysis indicated that CCL20-related differentially expressed genes are primarily involved in regulating humoral immune responses, mediating the migration of inflammatory cells (e.g., granulocytes and neutrophils), and modulating acute inflammatory responses (Supplementary Figure S8A-B). KEGG analysis further revealed significant associations with key signaling pathways, including the IL-17 signaling pathway, ligand-receptor interactions, and cytokine-cytokine receptor interactions (Supplementary Figure S8C-D). These results suggest that CCL20 plays a critical role in LUAD progression and could serve as a valuable diagnostic biomarker and therapeutic target for lung cancer patients.
Discussion
The tumor microenvironment plays an indispensable role in cancer development and progression^[143]33. As a key component of the TME, the basement membrane acts as a structural barrier that prevents early metastasis of cancer cells^[144]34,[145]35. However, the infiltration of various inflammatory cells compromises the integrity of the basement membrane, thereby facilitating cancer cell invasion and distant metastasis^[146]36. The interplay between the basement membrane and inflammatory responses creates a complex immune microenvironment that favors tumor cell growth and metastasis.
Although gene signatures associated with the basement membrane or the inflammatory response have increasingly been used for prognostic classification of LUAD^[147]37,[148]38, the precise mechanisms underlying these interactions in tumor recurrence and metastasis remain poorly understood, underscoring the need for systematic research in this field. Lung cancer is a highly malignant and heterogeneous disease^[149]39,[150]40. While advances in diagnosis and treatment have been achieved, the lack of effective diagnostic and prognostic markers continues to present significant challenges in the global fight against lung cancer. This unmet need highlights the importance of further investigations into the molecular mechanisms of lung cancer progression. In this study, we identified nine BMIRGs and constructed a robust prognostic model for LUAD. This gene signature effectively stratified patients with LUAD into two groups, accurately predicting clinical outcomes, treatment responses, TME characteristics and immune cell infiltration. Importantly, the prognostic value and oncogenic role of CCL20 in LUAD were comprehensively analyzed through bioinformatics methods and experimental validation. The proposed BMIRG scoring system offers the potential for predicting survival outcomes and evaluating responses to targeted therapies and immunotherapy in LUAD.
To accurately assess patients’ risk, the prognostic model incorporated nine BMIRGs: CX3CL1, IL7R, LAMP3, PCDH7, TPBG, ACAN, HMCN2, LAD1 and CCL20. CX3CL1, a chemokine expressed in various tissues, primarily binds to its receptor CX3CR1 under inflammatory stimuli^[151]41. Trinh et al. have shown that reduced activity of the CX3CL1-CX3CR1 axis limits cytotoxic cell infiltration into solid tumors, enabling tumor immune evasion^[152]42. Conversely, overexpression of CX3CL1 in tumor cells has been shown to significantly inhibit tumor growth in vivo while increasing CD8 + T cell infiltration within the tumor microenvironment^[153]43.
Our findings demonstrate a positive correlation between CX3CL1 expression and the low-risk group, indicating its potential role in promoting immune cell infiltration and subsequent tumor cell elimination. Furthermore, elevated IL7R expression has been consistently associated with enhanced anti-tumor immunity, reinforcing its potential as a favorable prognostic marker^[154]44. Lysosome-associated membrane protein 3 (LAMP3) is a member of the lysosome-associated membrane protein family (LAMPs)^[155]45. Recent studies have shown that LAMP3 has an important role in promoting tumor metastasis^[156]46. Protocadherin-7 (PCDH7) functions as a proto-oncogene that promotes cell proliferation and chemoresistance^[157]47. TPBG is an oncofetal protein; Ping He et al. reported that TPBG is highly expressed in pancreatic cancer tissues and enhances cancer cell migration and invasion by regulating the Wnt/PCP signaling pathway^[158]48. Similarly, for ACAN, another risk factor in this prognostic model, Yang S et al. identified elevated expression in lung cancer tissues using a proteomic approach, suggesting that it may promote the malignant biological behavior of lung cancer cells^[159]49. Higher LAD1 expression in tumor tissue correlates with a worse prognosis in lung cancer, and knockdown of LAD1 in H1299 and PC-9 cells inhibited cell proliferation and migration^[160]50. CCL20, primarily produced by epidermal cells, is currently the only known high-affinity ligand of CCR6, and its expression is dramatically increased under inflammatory conditions^[161]51. CCL20 plays an important role in tumor cell metastasis and in the response to chemotherapy^[162]52. Tao Fan et al. showed that CCL20 promotes lung adenocarcinoma progression by enhancing epithelial-mesenchymal transition^[163]53. Considering the pivotal role of CCL20 in tumor progression, we used bioinformatics analysis and experimental validation to explore its functions.
The results revealed that CCL20 is highly expressed in lung cancer tissues and correlates with poor patient prognosis. Moreover, CCL20 knockdown significantly inhibited the migratory capability of lung cancer cells. Our study represents the first integrated analysis of BMIRGs in the context of LUAD, providing valuable insights into their predictive potential for clinical outcomes and early diagnosis strategies.
Multigene prognostic models can better reflect tumor heterogeneity. Based on the results of such a model, patients can be stratified into different risk categories, enabling individualized clinical management and treatment guidance tailored to their specific risk profiles. Leveraging whole transcriptome expression profiling, Jiang et al. independently developed a prognostic prediction model for triple-negative breast cancer (TNBC) comprising five RNAs^[164]54. This model accurately classifies TNBC patients into high-risk and low-risk recurrence groups, marking it as the first internationally recognized prognostic prediction model specifically designed for TNBC. Guided by the multigene model test results, the research team categorized enrolled patients into high-risk and low-risk groups and administered distinct treatment regimens accordingly. This precision chemotherapy approach led to a remarkable improvement in survival rates, with high-risk patients experiencing an increase of over 10%, while low-risk patients achieved a nearly 10% improvement^[165]55. These findings underscore the significant potential of multigene prognostic models in advancing clinical practice. To further validate the clinical efficacy of our model, we plan to expand the sample size and enhance sample diversity. Using the multigene test results, we aim to stratify individuals undergoing health check-ups into high- and low-risk groups.
High-risk individuals will undergo more comprehensive screening in accordance with existing clinical guidelines, thereby improving the early detection rate of cancer. Timely intervention and treatment for early-stage patients are expected to significantly enhance the overall survival rate of LUAD patients. This prospective study design not only aims to validate the clinical utility of the model but also seeks to establish new strategies and frameworks for the early screening and precision treatment of tumors, ultimately contributing to improved patient outcomes. Despite the contributions of our study, several limitations should be acknowledged and addressed in future research. Firstly, although bioinformatics analyses using public transcriptomics databases provide valuable insights, they are limited by relatively small sample sizes and insufficient diversity in sample sources, which may affect the generalizability of our findings. Secondly, considering the complexity of tumor heterogeneity and evolution, incorporating advanced approaches such as single-cell sequencing and spatiotemporal genomic analyses would enable a more comprehensive understanding of dynamic tumor changes at higher resolution. Thirdly, to further validate the biological significance of our results, additional in vitro and in vivo experiments are required to systematically investigate the functional roles and molecular mechanisms of other key genes involved in LUAD progression. In conclusion, while our study provides a foundational framework for understanding the role of BMIRGs in LUAD, future investigations should focus on (1) expanding sample diversity and cohort sizes, (2) integrating multi-omics approaches, (3) implementing single-cell and spatiotemporal resolution analyses, and (4) conducting systematic functional validation experiments to address current limitations and advance the field. 
Conclusion
In summary, the BMIRG signature effectively stratifies patients with LUAD into distinct risk groups and predicts their survival outcomes. The signature also correlates with drug responsiveness and the tumor immune microenvironment, providing valuable insights for guiding personalized treatment strategies in LUAD. Among the identified genes, CCL20 emerged as a critical biomarker, showing strong associations with tumor stage and patients’ OS. This study confirmed that CCL20 knockdown effectively inhibited the migration capability of lung cancer cells, highlighting its potential as a biomarker and therapeutic target for preventing distant metastasis. Nevertheless, additional research is required to fully elucidate the specific molecular mechanisms and clinical significance of BMIRGs in LUAD.
Electronic supplementary material
Below is the link to the electronic supplementary material. [166]Supplementary Material 1 (23.3MB, docx) [167]Supplementary Material 2 (12.8KB, xlsx) [168]Supplementary Material 3 (97.1KB, xlsx)
Acknowledgements