Abstract Cancer-associated fibroblasts (CAFs) are significantly involved in determining the patient's prognosis and response to bladder cancer (BLCA) therapy. CAFs can induce epithelial-mesenchymal transformation (EMT) as well as complex interaction with immune cells. Hence, it is imperative to identify potential markers for enhancing our understanding of CAFs in BLCA progression and immune regulation. A variety of algorithms and analyses were employed in the study, leading to the development of a novel prognostic feature for CAFs-Stromal-EMT (CSE)-prognostic feature. This feature was constructed based on the genes MFAP5, PCOLCE2, and JUN. Furthermore, we revealed that patients with higher CSE risk scores responded to immunotherapy better compared to those with lower. Finally, we verified two CSE-related genes using in vitro experiments. Our results suggested that the CSE-prognostic feature could predict the prognosis and evaluate the response of patients to immune and chemotherapies. This would aid clinicians in designing treatment strategies for patients with BLCA. Keywords: Cancer-associated fibroblasts, Epithelial-mesenchymal transformation, Bladder cancer, Immunotherapy, Migration, Invasion 1. Introduction Globally, an estimated 500,000 new bladder cancer (BLCA) cases and 200,000 related mortalities occur yearly. In the USA, 80,000 new BLCA cases and 17,000 related mortalities occur every 25 years [[31]1,[32]2]. Patients with BLCA present a broad spectrum of disease phenotypes ranging from recurrent non-invasive tumors that require long-term treatment to invasive tumors that require multiple forms of intervention and surgical treatment [[33]3]. The advent of high-throughput sequencing techniques has successfully enhanced our understanding of the underlying mechanism of BLCA. Furthermore, numerous potential biomarkers for predicting BLCA progression, prognosis, and molecular subtypes have been investigated [[34]4,[35]5]. In addition, it has aided in improving the diagnosis of patients and designing treatment strategies, specifically immune and targeted therapy [[36]6]. Solid tumors contain tumor cells and non-tumor components, constituting the tumor microenvironment (TME) [[37]7]. TME affects cancer cell growth and metastasis as well as the therapeutic efficacy of drugs; it is widely studied by various researchers [[38]8]. One of the most important components of TME is the stromal cells. Fibroblasts are the most abundant and important type of stromal cells due to their close association with cancer cells [[39]9]. Chronic activation of fibroblasts can promote tumorigenesis, and fibroblasts involved in various cancer-related processes are termed cancer-associated fibroblasts (CAFs). CAFs participate in tumorigenesis by secreting factors like chemokines, collagen, and matrix metalloproteinases (MMPs) [[40]10]. A study performed immunohistochemistry on primary tumors of patients with BLCA revealed that high CAFs levels were positively correlated with BLCA invasiveness compared to normal bladder tissue [[41]11]. Epithelial-mesenchymal transformation (EMT) refers to a phenotypic and pleiotropic change in cells, wherein the epithelial cells gain characteristics and properties of mesenchymal cells. Studies have shown significant involvement of EMT in tumorigenesis, invasion and metastasis of cancer cells, tumor stemness, and chemoresistance [[42]12]. In TME, EMT mediates immunosuppression, thereby inducing resistance to immunotherapy, including immune checkpoint inhibitors [[43]13]. Post-EMT, the epithelial cells are one of the sources of CAFs; hence, a close relationship exists between CAFs and EMT. Ozdemir et al. have shown that CAFs promote the progression of highly aggressive and undifferentiated tumors, induce hypoxia and EMT, and enrich cancer stem cells [[44]14]. In cancer cells, a strong correlation was observed between CAF and EMT marker expression. In vitro stimulation of fibroblasts induces EMT of cancer cells [[45]15]. Studies have shown a close correlation between CAFs and EMT, and significant involvement of CAFs in the onset and progression of BLCA, leading to the formation of the malignant phenotype. In this study, the RNA-sequencing (RNA-seq) data of patients with BLCA were retrieved from publicly available databases like “Gene Expression Omnibus (GEO)” and “the Cancer Genome Atlas (TCGA).” For the first time, CAFs and EMT-related genes were simultaneously identified using the bioinformatics algorithm like “Weighted gene co-expression network analysis (WGCNA)” from this data. First, the highest correlation between the module and the CAF infiltration-stromal scores was observed. Hub genes are yielded by the convergence of the most pertinent modules and genes associated with EMT. Next, a series of algorithms were used to construct CAF-Stromal score-EMT-related genes (CSE-related genes). These CSE-related genes could predict the prognosis and response of patients to BLCA therapy. More importantly, we overexpressed two of CSE-related genes in BLCA cells and revealed that these genes promoted tumor cell migration and invasion using cell-based experiments ([46]Supplementary Fig. 1). Our results suggest that the CSE-prognostic feature may be a new way to predict the progression and efficacy of BLCA. 2. Materials and methods 2.1. Acquisition and processing of raw data The RNA-seq data of 424 patients with BLCA in fragments per kilobase of transcript per million mapped reads (FPKM) format and their clinical characteristics were downloaded from “The Cancer Genome Atlas (TCGA, [47]https://portal.gdc.cancer.gov/)” database. The average value was used if several Ensembl IDs were mapped to the same gene [[48]16]. Next, “Gene Expression Omnibus (GEO, [49]https://[50]www.ncbi.nlm.nih.gov/geo/)” database was searched to retrieve the data on normalized gene expression and the clinical characteristics of patients with BLCA from the [51]GSE13507 dataset [[52]17]. Next, the logarithmic quadratic transformation was performed on the sequence data obtained from GEO. Further, they were subjected to background adjustment, and the data was normalized. The patients without survival information were excluded from the study. Finally, we obtained 407 and 166 patients from TCGA-BLCA and [53]GSE13507 cohorts, respectively, for subsequent analysis. The clinicopathological characteristics of the patients in two cohorts were listed in [54]Supplementary Table 1. 2.2. Infiltration of CAFs and calculation of stromal scores The “Estimate the Proportion of Immune and Cancer cells (EPIC)” algorithm, which leverages expression data to analyze the infiltration proportion of immune cells, was used to determine CAFs infiltration [[55]18]. Additionally, the “xCell” algorithm, employing a gene signature enrichment-based approach, was applied to identify potential immune cell subtypes and calculate their relative abundance within cancer, thereby computing the stromal scores [[56]19]. These analyses were performed using the “IOBR (v0.99.9)” R package. 2.3. Weighted correlation network analysis WGCNA is a system biology technique used for describing the association patterns of genes in various samples, identifying synergistic gene sets, and mining the co-expression pattern of coding genes and modules [[57]20]. First, the data on the expression of protein-coding genes in patients were retrieved from TCGA-BLCA and [58]GSE13507 cohorts. Next, the adjacent 73 matrices were clustered based on the topological overlap measure (TOM) and dissimilarity (1-TOM) among genes. Furthermore, “the Pearson correlation coefficient” was used to determine the correlation between the modules, CAF infiltration, and stromal scores. Finally, the modules exhibiting the highest correlation were selected. Hub genes were identified by intersecting the module bases of TCGA-BLCA and [59]GSE13507 cohorts. 2.4. Functional enrichment analysis The “Gene Ontology (GO)” and “Kyoto Encyclopedia of Genes and Genomes (KEGG)” pathway enrichment analyses was performed using the “clusterProfiler (version 4.4.4)” R package to identify biological processes (BPs), molecular functions (MFs), and cellular components (CCs) and pathways enriched by hub genes [[60]21]. “Gene set enrichment analysis (GSEA)” was performed using enrichment plot and the “clusterProfiler” R package. The KEGG pathway-related gene sets (c2. cp.kegg.v7.4) and biological feature gene sets (h.all.v7.4) were obtained from the “Molecular Signatures Database (MSigDB)” [[61]22]. P < 0.05 was considered statistically significant. 2.5. Identifying prognosis-related genes and constructing the prognostic feature First, the “linear models for microarray data (Limma)" R package was used to identify the differentially expressed genes between tumor and adjacent tumor tissues in patients from TCGA-BLCA cohort [[62]23]. Next, the “univariate COX” and “Kaplan-Meier (KM) survival analyses” package was used for identifying genes significantly associated with the patient's survival. CSE-prognostic features contained three genes were identified using the “Least Absolute Shrinkage and Selection Operator (LASSO) -Cox regression” algorithm and “glmnet" R package for 1000 iterations [[63]24]. The coefficients derived were used for calculating the CSE-risk score (RS). Finally, the median RS was set as the cut-off value to categorize the patients with BLCA into high- and low-CSE-RS groups. The difference in the overall survival (OS) of patients between the two groups was determined using the log-rank test. Next, the KM survival curve was constructed. The feature was verified using the same formula for calculating CSE-RS, the cut-off value, and the analysis method in patients with BLCA from [64]GSE13507. 2.6. Correlation analysis of the patient's clinical characteristics, the infiltration of CAFs, stromal scores, immune functions, and CSE-RS The correlation between the patient's clinical characteristics (age, sex, tumor grade, and TNM staging) and CSE-RS using the “ComplexHeatmap" R package. Next, the “GGally" package was used to determine the Spearman correlation between CSE-RS, multiple CAF infiltration (EPIC, xCell, MCP-counter) and stromal score. Finally, the expression data of patients from TGCA-BLCA cohort were analyzed by “Single Sample Gene Set Enrichment Analysis (ssGSEA)” using the “Gene Set Variation Analysis" package and the " Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE)" package. The differences in immune cells, immune function, and TME in patients in high- and low-CSE-RS groups were compared. 2.7. Analyzing sensitivity to immunotherapy and chemotherapeutic drugs The immunophenoscore of patients with BLCA was downloaded from “the cancer immune group atlas (TCIA; [65]https://tcia.at/home)” database. Simultaneously, the “Tumor Immune Dysfunction and Exclusion (TIDE; [66]http://tide.dfci.harvard.edu)” algorithm was used to predicting immune checkpoint blockade (ICB) response and evaluated its ability as a new antigen [[67]25]. The “Genomics of Drug Sensitivity in Cancer (GDSC; [68]https://[69]www.cancerrxge-ne.org)” database was used for predicting the patient's response to chemotherapy drugs. The half-maximal inhibitory concentration (IC[50]) value was calculated using the “oncoPredict” R package. 2.8. Cell culture and transfection T24, BLCA cells were purchased from the Chinese Academy of Sciences (Shanghai, China) and cultured in McCoy's 5A medium (BasalMedia, Shanghai, China), supplemented with 10 % fetal bovine serum (FBS, Excell bio, Shanghai, China), and 100 U/mL penicillin/streptomycin. The cells were incubated in a humidified incubator at 37 °C and 5 % CO2. MFAP5 and PCOLCE2 overexpressing plasmids (TK-PCDH vector, NheI/BamHI enzyme cleavage) were designed by Qingke Biotechnology Co., Ltd. (Beijing, China). Western blotting was performed to determine the expression and transduction efficiency. 2.9. Wound healing assay First, 3 × 10^5 T24 cells were seeded and cultured in a six-well plate with medium containing FBS for 24 h. Next, an aseptic 200 μl pipette tip was used to scratch the cell monolayer physically. Then, the cells were washed with phosphate-buffered saline (PBS) and cultured in a serum-free medium. The wound closure at specified positions was monitored at 0 and 48 h using an optical microscope and analyzed using the “ImageJ” software. The experiment was performed in three independent replicates. 2.10. Transwell invasion assay To analyze invasion, chamber membranes were coated with Matrigel (BD Bioscience, Franklin Lakes, NJ, USA) at a dilution of 3.3 ng/mL in medium without serum. 5 × 10^4 T24 cells were seeded in 200 μL serum-free medium in the upper chamber for 24 h. Next, the lower chamber was filled with the medium containing 10 % FBS, which was used as a chemotherapy inducer. After 24 h, the upper chamber was cleaned, and the cells were fixed using 4 % PFA. Finally, Wright-Giemsa Staining Solution (Beyotime, China) was used for staining the fixed cells. 2.11. Western blotting assay Protein samples were extracted from cells using a RIPA lysis buffer (Beyotime, Wuhan, China) and separated through electrophoresis on a 10 % SDS-PAGE gel (Beyotime, Shanghai, China). The resulting proteins were then transferred to PVDF membranes with a pore size of 0.22 μm. The membranes were blocked with 5 % nonfat milk for 1 h at room temperature before being incubated with primary antibodies overnight at 4 °C. Afterward, the membranes were washed three times with PBST and further incu-bated with secondary antibodies (Proteintech, Wuhan, China) for 1 h. Next, a gel image processing system (ChemiDoc XRS+) was used to identify target/GAPDH bands. Finally, we calculated the relative protein levels by ImageJ software. The primary an-tibodies were as follows: anti-PCOLCE2 (Invitrogen:PA5-112,643), anti-MFAP5 (Invi-trogen:PA5-52706), anti-E-cadherin (Abcam:ab231303), anti-N-cadherin (Abcam:ab76011), anti-Vimentin (Abcam:ab20346), and anti-GAPDH (Proteintech: 10494-1-AP). GAPDH was set as an internal control. 2.12. Statistical analysis The experiment was repeated 2–3 times. Unpaired t-test and One-way analysis of variance were used for comparison between two groups and among multiple groups, respectively. The differences among groups were similar. The correlation between the two variables was determined by Pearson correlation analysis. Next, the statistical significance of in vivo studies was calculated using Fisher's exact test. Univariate survival and survival analyses were performed using the “KM survival analysis” and the log-rank test. The “COX regression model” was used for univariate and multivariate survival analysis. We performed statistical analysis using software like “GraphPad Prism 9.4” and “R language package ([70]https://[71]www.r-project).” Statistical significance was set two-tail p-value <0.05. 3. Results 3.1. The OS of patients with high CAF infiltration and stromal scores was low First, the “EPIC” and “xCell” algorithms were used for calculating CAFs infiltration and stromal scores of patients from TCGA-BLCA and [72]GSE13507 cohorts. The log-rank test was performed evaluating the prognostic value of OS, and the results were represented using the KM survival curve. The results showed higher CAFs infiltration and stromal scores were significantly and positively associated with poorer OS in TCGA-BLCA ([73]Fig. 1A and B) and [74]GSE13507 ([75]Fig. 1C and D). Fig. 1. [76]Fig. 1 [77]Open in a new tab The Kaplan-Meier curves indicated that patients with increased CAF infiltrations and stromal scores exhibited poorer overall survival in both TCGA-BLCA cohorts and [78]GSE13507. n = 407 in TCGA-BLCA cohort (A, B) and n = 166 in [79]GSE13507 cohort (C, D). Log-rank test. 3.2. WGCNA and identification of the functional module The expression profile of mRNA in patients from TCGA-BLCA and [80]GSE13507 was analyzed using WGCNA. First, a scale-free topology network was constructed. Next, 4 with a scale-free index of 0.9 in TCGA-BLCA ([81]Fig. 2A) and 10 with a scale-free index of 0.85 in [82]GSE13507 ([83]Fig. 2B) were identified as the soft threshold power (β) with relatively favorable mean connectivity. Finally, 60 genes in each network module were set as the minimum number of genes based on mixed dynamic shear trees criteria. After the gene module was determined using the dynamic shearing method, the characteristic genes of each module were identified. Then, the modules were clustered, and the modules close to each other were merged into new modules. Finally, the hierarchical cluster tree showed 12 co-expression modules in TCGA-BLCA ([84]Fig. 2C). The results revealed that the blue module was positively correlated with stromal score (Cor = 0.65, P = 2e-49) and the CAFs infiltration scores (Cor = 0.86, P = 8e-122, [85]Fig. 2E). A total of 22 co-expression modules were identified in [86]GSE13507 ([87]Fig. 2D). Furthermore, a positive correlation was observed between the brown module and the stromal score (Cor = 0.83, P = 3e-48), as well as the CAFs infiltration scores (Cor = 0.83, P = 1e-48; [88]Fig. 2F). Collectively, 1818 and 1696 genes were obtained in blue and brown modules, signifying their significant relevance to CAFs and stromal score. Fig. 2. [89]Fig. 2 [90]Open in a new tab Identification of CAFs-Stromal score co-expression hub module. (A–B) Analysis of the scale‐free fit index for various soft‐threshold powers and the mean connectivity in TCGA-BLCA and [91]GSE13507 cohorts. (C–D). Clustering dendrograms showing genes with similar expression patterns were clustered into co-expression modules in TCGA-BLCA and [92]GSE13507 cohort. (E–F) Correlations between each gene module and phenotype in TCGA-BLCA and [93]GSE13507 cohorts. 3.3. Identification of hub genes and function analyses A total 543 common genes were identified by intersecting gene sets of the two modules. Next, “the GO and KEGG pathway enrichment analysis” was performed on 533 genes. These genes were enriched in GO-BP, MF, and CC terms like extracellular stromal organization, extracellular stromal structural constituents, and collagen-containing extracellular, respectively. These genes were enriched in KEGG pathways including the focal adhesion and ECM-receptor interaction signaling pathways ([94]Supplementary Fig. 2a). Since these processes and pathways were closely related to EMT, the “HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION” gene set from the MSigDB was intersected with 533 genes. Taken together, 75 CAF-stromal -EMT (CSE)-related genes ([95]Supplementary Table 2) were identified. 3.4. Construction of CSE-prognostic feature First, 407 patients from TCGA-BLCA in the training group and 165 patients from [96]GSE13507 were included in the validation group. Next, “univariate cox regression” and differential expression analyses were performed on 75 CSE-related genes based on the following criteria: “|log fold change (FC)| > 1” and “adj.P < 0.05” to increase the significance of the follow-up and narrowing the range of CSE-related genes. Finally, 21 genes were selected for subsequent analysis. The prognostic features were constructed using “LASSO Cox regression analysis” and cross-validation. The trajectory and log lambda of each independent variable was shown ([97]Fig. 3A and B). Finally, three genes were identified for constructing the CSE-prognostic features. The formula for calculating the CSE-RS was as follows: [MATH: CSERS=expMFAP5*0.25+expJUN*0.021+expPCOLCE2*0.17. :MATH] Fig. 3. [98]Fig. 3 [99]Open in a new tab Evaluation and validation of the CSE-prognostic feature. (A, B) cvfit and lambda curves showing the LASSO regression was performed with the minimum criteria. (C–D) Kaplan–Meier analysis according the calculated CSE-RS identified BLCA patients in the high CSE-RS group exhibited worse overall survival in both TCGA-BLCA (n = 200) and [100]GSE13507 cohorts (n = 77). “number at risk" refers to the current situation where patients in both groups have not yet reached the endpoint event (death). Log-rank test. The low-risk group exhibits a considerably higher survival rate compared to the high-risk group (P < 0.001). (E–F) The time-dependent ROC curves of the CSE-RS for 1-, 3-, and 5-year overall survival. Next, CSE-RS t was calculated to categorized the patients from the training and test groups into high-CSE-RS and low-CSE-RS groups. The KM survival curve demonstrated that in the high-CSE-RS group, the OS of patients was lower compared to the low-CSE-RS group ([101]Fig. 3C and D). Finally, the receiver operating characteristic (ROC) curve was used for validating the accuracy of the CSE-RS in predicting the patient's prognosis. The Area Under Curve (AUC) values of 1-, 3-, and 5-year OS rates of patients in the training group were 0.630, 0.655 and 0.695, respectively. Consistently, AUC values in the test group were 0.728, 0.656, and 0.647 ([102]Fig. 3E and F), respectively. These results suggest that CSE-related genes (MFAP5, JUN, and PCOLCE2) were important markers for predicting the prognosis of patients with BLCA. 4. CSE-RS were highly correlated with the patient's clinical characteristics and CAFs infiltration To enhance the clinical significance of the CSE-related prognostic feature, the differences in the clinical characteristics of patients were compared between both CSE-RS groups from TCGA-BLCA and [103]GSE13507 cohorts ([104]Fig. 4A and B). The heatmap showed differences in the pathological grade, T stage, tumor stage in two groups, and a majority of patients with higher degree of malignancy were in the high-CSE-RS group. Next, the ability of the CSE-prognostic feature in predicting the CAFs infiltration was verified by analyzing the Spearman correlation between CSE-RS, stromal score, and the CAFs infiltration using algorithms like “EPIC,” “xCell,” and “MCP-counter.” The results revealed a strong positive correlation between CSE-RS, stromal scores, and the infiltration of CAFs in patients from TCGA-BLCA ([105]Fig. 4C) and [106]GSE13507 ([107]Fig. 4D) cohorts. Fig. 4. [108]Fig. 4 [109]Open in a new tab The relevance between CSE-RS with clinical characteristics and multi-estimated CAF infiltrations. (A–B) The heatmap indicated that the group with high CSE-RS had a greater concentration of patients with a higher degree of malignance in TCGA-BLCA (n = 400) and [110]GSE13507 cohorts (n = 165). (C–D) Spearman's correlation analyses revealing the CAF risk score was strongly and positively correlated with stromal scores and multi-estimated CAF infiltrations in TCGA-BLCA (n = 405) and [111]GSE13507 cohorts (n = 165). *P < 0.05, **P < 0.01, ***P < 0.001. 4.1. Functional enrichment and characteristic differences of CSE-RS group To determine the functional enrichment of CSE-related genes, GSEA was performed on patients in both CSE-RS groups from TCGA-BLCA cohort. The primary GOBP signaling pathways enriched in by CSE-related genes from the patients the high-CSE-RS group were related to immune responses like the “GOBP_ACTIVATION_OF_IMMUNE_RESPONSE,” “GOBP_ADAPTIVE_IMMUNE_RESPONSE,” etc. ([112]Fig. 5A). Additionally, “KEGG_CHEMOKINE_SIGNALING_PATHWAY,” “KEGG_ECM_RECEPTOR_INTERACTION,” “KEGG_FOCAL_ADHESION,” were the primary KEGG pathways enriched by CSE-related genes from the patients in high-CSE-RS group ([113]Fig. 5B). In addition, the hallmark gene sets enriched by high-CSE- RS were primarily “HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION,” “HALLMARK_INFLAMMATORY_RESPONSE,” and “HALLMARK_TNFA_SIGNALING_VIA_NFKB8.” ([114]Supplementary Fig. 2b). Further ssGSEA showed that the patients in the high-CSE-RS group had higher infiltration of immune cells ([115]Fig. 5C) and highly active immune functions ([116]Supplementary Fig. 2c). In addition, the stromal and immune scores of all patients were calculated using the “ESTIMATE” algorithm. The results showed significantly high immune and stromal scores of patients in the high-CSE-RS group compared to the low-CSE-RS group ([117]Fig. 5D). Finally, the differences in the expression of immune checkpoint were compared, and the results revealed higher expression of most immune checkpoints in patients in the high-CSE-RS group ([118]Fig. 5E). Together, these results suggest that CSE-related genes were not solely related to immune responses but may also be intricately connected to the EMT pathway. Fig. 5. [119]Fig. 5 [120]Open in a new tab Immune responses related pathways, immune cells and functions were enriched in the high-CSE-RS group (n = 200). (A, B) Gene set enrichment analysis (GSEA) of GOBP and KEGG gene sets in high-CSE-RS group. (C) The proportions of infiltration of immune cells in different CSE-RS group. (D) TME score in different CSE-RS group calculated by ESTIMATE shown in Violin plot. (E) The comparisons of expression of immune checkpoint in different CSE-RS group. *P < 0.05, **P < 0.01, ***P < 0.001. 4.2. Comparing the sensitivity of patients in both CSE-RS groups to chemo- and immunotherapies The patients in the high-CSE-RS group had highly active immune cells and immune functions. Further, the expression of immune checkpoints was also high in these patients. Therefore, these patients would may respond to immunotherapy better. First, the immunogenicity of patients in both groups was estimated using the IPS analysis from TCIA. The results revealed higher IPS-PD1, IPS-CTLA4, and IPS-PD1-CTLA4 scores in patients in the high-CSE-RS group compared to the low-CSE-RS group ([121]Supplementary Fig. 2 d-g). Next, the “TIDE” analysis was used to evaluate if CSE-RS could predict the patient's response to immunotherapy. The patients in the responder subgroup (n = 77) had significantly higher CSE-RS compared to the no-responder (n = 323) subgroup (P < 0.001; [122]Fig. 6A). The patients with BLCA in the high-CSE-RS group (42/158) were more sensitive to immunotherapy compared to patients (35/165) in low-CSE-RS group (P < 0.001; [123]Fig. 6B). These results suggested that the response of patients in the high-CSE-RS group to immunotherapy was better. Simultaneously, GDSC was used to predict the response of patients in both CSE-RS groups to common chemotherapeutic drugs. The results revealed higher sensitivity of patients in the low-CSE-RS group to chemotherapeutic drugs such as Camptothecin, Cisplatin, Docetaxel, and Sunitinib ([124]Fig. 6C–F). Collectively, we found that high-CSE-RS group may exhibit better responses to immunotherapy and chemotherapy. Fig. 6. [125]Fig. 6 [126]Open in a new tab Sensitivity analysis for immunotherapy and chemotherapy in both CSE-RS group (n = 400). (A) The TIDE-predicted immunotherapy-responders showed higher CSE-RS in the TCGA-BLCA dataset. (B) The percentage of responders (in orange) versus non-responders (in blue) within in high and low CSE-RS groups. (C–F) IC[50] box diagram of the four common chemotherapeutics drugs in high and low CSE-RS groups. 4.3. Determining the functions of CSE-related genes in vitro Various studies have shown the role of JUN in BLCA [[127]26,[128]27]; hence, MFAP5 and PCOLCE2 were selected from the feature for in vitro experiments. To further investigate the biological function of MFAP5 and PCOLCE2, the expression levels of MFAP5 and PCOLCE2 were analyzed in 5 BLCA cell lines. As shown in ([129]Fig. 7A), the protein level of MFAP5 and PCOLCE2 was significantly decreased in T24 cells compared to other cells. First, MFAP5 and PCOLCE2 were overexpressed in T24 cells by virus transfection and performed wound healing and transwell invasion assays. The results showed that MFAP5 and PCOLCE2 overexpression promotes T24 cell migration and invasion. ([130]Fig. 7B and C). Western blotting results showed that MFAP5 and PCOLCE2 overexpression could significantly decrease E-cadherin expression and increase N-cadherin and Vimentin expression ([131]Fig. 7D). Finally, we found two CSE-related genes (MFAP5 and PCOLCE2) have been observed to stimulate the invasion and migration of BLCA cells, suggesting a potential association with the EMT pathway. Fig. 7. [132]Fig. 7 [133]Open in a new tab Overexpression of MFAP5 and PCOLCE2 promoted the migration and invasion of BLCA cells. (A) Protein expression of MFAP5 and PCOLCE2 was significantly downregulated in T24 cells. The experiments were performed in triplicate. One-way ANOVA, *P < 0.05, ***P < 0.001.(B) Overexpression of MFAP5 and PCOLCE2 contributed to T24 cell migration in wound-healing assay. The percentage of wound closure is shown as the mean ± SD. t-test, *P < 0.05, **P < 0.01. (C) In the transwell assay, the invasiveness of T24 cells was enhanced in the overexpression group. The number of invasion cell is shown as the mean ± SD. t-test, **P < 0.01, ***P < 0.001. (D) Western blotting was performed to detect expression of EMT-related markers (E-cadherin, N-cadherin, vimentin) in MFAP5 and PCOLCE2-overexpression T24 cells. The experiments were performed in triplicate. t-test, *P < 0.05, **P < 0.01, ***P < 0.001. 5. Discussion Recent studies on BLCA have primarily focused on deciphering the mechanism underlying the invasive phenotype of tumors by analyzing genetic variations. The advent of immunotherapy, including ICB, has enhanced our understanding of the invasiveness behavior of cancer cells [[134]28,[135]29]. However, the interaction between stromal and tumor cells has not received much attention. CAFs are among the most abundant components of most solid tumors. The presence of CAFs and the proliferation of connective tissues are the most prominent features of invasive tumors of BLCA [[136]11]. Some studies have shown that fibroblast-derived growth factors mediate protumor effects like promoting EMT and cell invasion [[137]30]. Furthermore, these CAFs help tumor cells evade immune surveillance and contribute to BLCA development, drug resistance, and immunosuppression [[138]31]. Our results showed an association between high CAFs infiltration as well as stromal scores, and poor OS of patients with BLCA. Furthermore, a close correlation between CAFs and EMT was also observed. Therefore, it is important to study CAFs and EMT-related molecular targets in BLCA. However, only few studies have analyzed the correlation between study CAFs and EMT-related molecular targets in BLCA. The stromal components within the TME have fundamental and interconnected roles in driving tumourigenesis, CAFs are among the most significant and dynamic constituents within the TME [[139]32,[140]33], Given the intimate interplay between stromal components and CAFs, integrating them would more effectively illustrate their functional roles. In this study, for the first time, WGCNA and multiple algorithms were used to mine the co-expression network between CAFs and stromal cells in patients from TCGA-BLCA and [141]GSE13507 cohorts. The enrichment analysis showed that these genes were enriched in the EMT-related pathways. By using a univariate Cox and LASSO regression algorithm after the intersection with EMT-related genes, a CSE-prognostic feature consisting of three genes (MFAP5, JUN, and PCOLCE2) was constructed and verified. Previous studies have demonstrated a few functions of these three genes, including promoting tumorigenesis. These genes are closely related to CAFs and EMT. Microfibrillar-associated protein 5 (MFAP5) are secreted by CAFs and critically involved in integrating elastic microfibers, regulating the behavior of endothelial cells, and promoting basal breast cancer progression by activating the EMT pathway [[142]34]. MFAP5 is involved in activation of CAFs in pancreatic cancer [[143]35]. JUN is a protumor transcription factor that regulates major biological events. Several studies have shown the effect of JUN on tumors. Additionally, the downstream positive regulator of Wnt/β-catenin is a key transcription factor regulating various processes such as the proliferation, metastasis, and death of mammalian cells [[144]36,[145]37]. Procollagen C-endopeptidase enhancer 2 (PCOLCE2) is considered to be the intermediary of protein stromal, which can stimulate bone morphogenetic protein 1 (BMP1) activity on fibrinogen collagen [[146]38]. A study has shown the involvement of BMP1 in gastric and non-small cell lung cancer metastases [[147]39]. Furthermore, PCOLCE plays a key role in promoting lung metastasis of osteosarcoma [[148]40] and participates in EMT [[149]41]. A recent study has shown that PCOLCE2 stimulates the production of reactive oxygen species in neutrophils [[150]42]. Moreover, a study has shown a correlation between PCOLCE2 and ferroptosis in colon adenocarcinoma [[151]43]. However, few studies are available on relationship between PCOLCE2 and BLCA. Based on the constructed CSE-prognostic feature, patients with a high degree of malignancy were divided into the high-CSE-RS group. Furthermore, the results showed a positive correlation between CSE-RS and the CAFs infiltration as well as stromal scores predicted by three different methods. Interestingly, GSEA showed that “GOBP_ACTIVATION_OF_IMMUNE_RESPONSE,” “GOBP_ADAPTIVE_IMMUNE_RESPONSE,” etc., were primarily enriched BP pathways in the high-CSE-RS groups. These results are plausible since CAFs are stromal cells are the most basic component of TME. A study has shown that interactions between tumor immune microenvironment and CAFs play a key role in enhancing cancer progression [[152]44]. Furthermore, CAFs create immunosuppressive TME by secreting several growth factors, cytokines, chemokines, exocrine, and other effector molecules. These molecules interact with infiltrating immune cells and other immune components of tumors over time, which allows cancer cells to escape immune surveillance [[153]45]. This is similar to the fact that our high CSE score genes were mainly enriched in the KEGG enrichment pathway. In this study, a correlation analysis was performed on the immune score of the prognostic feature, and the results revealed high infiltration of immune cells in patients in the high-CSE-RS group, specifically CD8^+ T cells. Additionally, ssGSEA showed highly active immune function. CD8^+ T cells are critically involved in promoting anti-tumor immune response by releasing cytokines and inducing apoptosis of cancer cells [[154]46]. The presence of tumor edge and intra-tumor CD8^+ T cells will produce a stronger response to immune checkpoint inhibitor therapy [[155]47]. Additionally, TCIA was analyzed using the “TIDE” algorithm. The results showed that the response of patients in the high-CSE-RS group to immunotherapy was better. Moreover, GDSC was used to predict the responses of patients in both CSE-RS groups to common chemotherapeutic drugs. Our results revealed higher sensitivity of patients in the low-CSE-RS group to chemotherapeutic drugs like Camptothecin, Cisplatin, Docetaxel, and Sunitinib. Finally, in vitro analyzed were performed to verify the functions of CSE-related genes. As expected, MFAP5 and PCOLCE2 overexpression could significantly enhance the invasion and migration of BLCA cells, consistent with previous studies. Shi showed that MFAP5 promotes BLCA progression and metastasis of cells by triggering the PI3K-AKT signaling pathway to activate the DLL4/NOTCH2 pathway axis [[156]48]. Chen et al. showed good performance of PCOLCE2 in predicting the OS of patients with colorectal cancer [[157]49]. Currently, few studies have determined the involvement of PCOLCE2 in BLCA. Our team will continue to study the mechanism of BLCA. However, our study has a few shortcomings. First, this is a retrospective study based on data obtained from two publicly available databases. The prognosis of patients and the therapeutic significance of the model should be validated at multiple centers and using data from several databases. Second, only in vitro experiments were performed. The results should be verified using animal-based experimental models to explore its mechanism and provide new ideas for treating patients with BLCA. 6. Conclusion In summary, the correlation between CAFs and stromal cell invasion in TME and poor prognosis of patients with BLCA was validated. Furthermore, the co-expression network of the CAF infiltration and stromal scores were combined with EMT-related genes to construct and determine a prognostic feature contained three genes for BLCA. Additionally, the CSE-prognostic feature was verified to have high accuracy in predicting the prognosis, the CAFs infiltration, and the responses of patients to immunotherapy and chemotherapy. Moreover, in vitro experiments were performed, confirming that these CSE-prognostic feature genes could promote the migration, invasion, and EMT phenotypes of BLCA cells. These results may provide new insights and aid clinicians in designing new and accurate treatment strategies for patients with BLCA. Funding This research was funded by Changsha Municipal Natural Science Foundation (Grant No. kq2208468, to H.Z.), and Project of the Scientific Research of Hunan Provincial Health Commission (Grant No. D202304056751, to H.Z.). Data availability statement The data presented in this study are contained within the article and Supplementary Materials. CRediT authorship contribution statement Huidong Zhou: Writing – original draft, Visualization, Validation, Investigation, Funding acquisition, Formal analysis, Data curation. Ruqi Li: Software, Methodology, Investigation, Formal analysis, Data curation. Jinghong Liu: Methodology, Formal analysis, Data curation. Jianhua Long: Writing – review & editing, Validation, Supervision, Resources, Conceptualization. Tao Chen: Writing – review & editing, Validation, Supervision, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgments