Abstract

   Cancer-associated fibroblasts (CAFs) are significantly involved in
   determining the patient's prognosis and response to bladder cancer
   (BLCA) therapy. CAFs can induce epithelial-mesenchymal transformation
   (EMT) as well as complex interaction with immune cells. Hence, it is
   imperative to identify potential markers for enhancing our
   understanding of CAFs in BLCA progression and immune regulation. A
   variety of algorithms and analyses were employed in the study, leading
   to the development of a novel prognostic feature for CAFs-Stromal-EMT
   (CSE)-prognostic feature. This feature was constructed based on the
   genes MFAP5, PCOLCE2, and JUN. Furthermore, we revealed that patients
   with higher CSE risk scores responded to immunotherapy better compared
   to those with lower. Finally, we verified two CSE-related genes using
   in vitro experiments. Our results suggested that the CSE-prognostic
   feature could predict the prognosis and evaluate the response of
   patients to immune and chemotherapies. This would aid clinicians in
   designing treatment strategies for patients with BLCA.

   Keywords: Cancer-associated fibroblasts, Epithelial-mesenchymal
   transformation, Bladder cancer, Immunotherapy, Migration, Invasion

1. Introduction

   Globally, an estimated 500,000 new bladder cancer (BLCA) cases and
   200,000 related mortalities occur yearly. In the USA, 80,000 new BLCA
   cases and 17,000 related mortalities occur every 25 years
   [[31]1,[32]2]. Patients with BLCA present a broad spectrum of disease
   phenotypes ranging from recurrent non-invasive tumors that require
   long-term treatment to invasive tumors that require multiple forms of
   intervention and surgical treatment [[33]3]. The advent of
   high-throughput sequencing techniques has successfully enhanced our
   understanding of the underlying mechanism of BLCA. Furthermore,
   numerous potential biomarkers for predicting BLCA progression,
   prognosis, and molecular subtypes have been investigated [[34]4,[35]5].
   In addition, it has aided in improving the diagnosis of patients and
   designing treatment strategies, specifically immune and targeted
   therapy [[36]6].

   Solid tumors contain tumor cells and non-tumor components, constituting
   the tumor microenvironment (TME) [[37]7]. TME affects cancer cell
   growth and metastasis as well as the therapeutic efficacy of drugs; it
   is widely studied by various researchers [[38]8]. One of the most
   important components of TME is the stromal cells. Fibroblasts are the
   most abundant and important type of stromal cells due to their close
   association with cancer cells [[39]9]. Chronic activation of
   fibroblasts can promote tumorigenesis, and fibroblasts involved in
   various cancer-related processes are termed cancer-associated
   fibroblasts (CAFs). CAFs participate in tumorigenesis by secreting
   factors like chemokines, collagen, and matrix metalloproteinases (MMPs)
   [[40]10]. A study performed immunohistochemistry on primary tumors of
   patients with BLCA revealed that high CAFs levels were positively
   correlated with BLCA invasiveness compared to normal bladder tissue
   [[41]11].

   Epithelial-mesenchymal transformation (EMT) refers to a phenotypic and
   pleiotropic change in cells, wherein the epithelial cells gain
   characteristics and properties of mesenchymal cells. Studies have shown
   significant involvement of EMT in tumorigenesis, invasion and
   metastasis of cancer cells, tumor stemness, and chemoresistance
   [[42]12]. In TME, EMT mediates immunosuppression, thereby inducing
   resistance to immunotherapy, including immune checkpoint inhibitors
   [[43]13]. Post-EMT, the epithelial cells are one of the sources of
   CAFs; hence, a close relationship exists between CAFs and EMT. Ozdemir
   et al. have shown that CAFs promote the progression of highly
   aggressive and undifferentiated tumors, induce hypoxia and EMT, and
   enrich cancer stem cells [[44]14]. In cancer cells, a strong
   correlation was observed between CAF and EMT marker expression. In
   vitro stimulation of fibroblasts induces EMT of cancer cells [[45]15].
   Studies have shown a close correlation between CAFs and EMT, and
   significant involvement of CAFs in the onset and progression of BLCA,
   leading to the formation of the malignant phenotype.

   In this study, the RNA-sequencing (RNA-seq) data of patients with BLCA
   were retrieved from publicly available databases like “Gene Expression
   Omnibus (GEO)” and “the Cancer Genome Atlas (TCGA).” For the first
   time, CAFs and EMT-related genes were simultaneously identified using
   the bioinformatics algorithm like “Weighted gene co-expression network
   analysis (WGCNA)” from this data. First, the highest correlation
   between the module and the CAF infiltration-stromal scores was
   observed. Hub genes are yielded by the convergence of the most
   pertinent modules and genes associated with EMT. Next, a series of
   algorithms were used to construct CAF-Stromal score-EMT-related genes
   (CSE-related genes). These CSE-related genes could predict the
   prognosis and response of patients to BLCA therapy. More importantly,
   we overexpressed two of CSE-related genes in BLCA cells and revealed
   that these genes promoted tumor cell migration and invasion using
   cell-based experiments ([46]Supplementary Fig. 1). Our results suggest
   that the CSE-prognostic feature may be a new way to predict the
   progression and efficacy of BLCA.

2. Materials and methods

2.1. Acquisition and processing of raw data

   The RNA-seq data of 424 patients with BLCA in fragments per kilobase of
   transcript per million mapped reads (FPKM) format and their clinical
   characteristics were downloaded from “The Cancer Genome Atlas (TCGA,
   [47]https://portal.gdc.cancer.gov/)” database. The average value was
   used if several Ensembl IDs were mapped to the same gene [[48]16].
   Next, “Gene Expression Omnibus (GEO,
   [49]https://[50]www.ncbi.nlm.nih.gov/geo/)” database was searched to
   retrieve the data on normalized gene expression and the clinical
   characteristics of patients with BLCA from the [51]GSE13507 dataset
   [[52]17]. Next, the logarithmic quadratic transformation was performed
   on the sequence data obtained from GEO. Further, they were subjected to
   background adjustment, and the data was normalized. The patients
   without survival information were excluded from the study. Finally, we
   obtained 407 and 166 patients from TCGA-BLCA and [53]GSE13507 cohorts,
   respectively, for subsequent analysis. The clinicopathological
   characteristics of the patients in two cohorts were listed in
   [54]Supplementary Table 1.

2.2. Infiltration of CAFs and calculation of stromal scores

   The “Estimate the Proportion of Immune and Cancer cells (EPIC)”
   algorithm, which leverages expression data to analyze the infiltration
   proportion of immune cells, was used to determine CAFs infiltration
   [[55]18]. Additionally, the “xCell” algorithm, employing a gene
   signature enrichment-based approach, was applied to identify potential
   immune cell subtypes and calculate their relative abundance within
   cancer, thereby computing the stromal scores [[56]19]. These analyses
   were performed using the “IOBR (v0.99.9)” R package.

2.3. Weighted correlation network analysis

   WGCNA is a system biology technique used for describing the association
   patterns of genes in various samples, identifying synergistic gene
   sets, and mining the co-expression pattern of coding genes and modules
   [[57]20]. First, the data on the expression of protein-coding genes in
   patients were retrieved from TCGA-BLCA and [58]GSE13507 cohorts. Next,
   the adjacent 73 matrices were clustered based on the topological
   overlap measure (TOM) and dissimilarity (1-TOM) among genes.
   Furthermore, “the Pearson correlation coefficient” was used to
   determine the correlation between the modules, CAF infiltration, and
   stromal scores. Finally, the modules exhibiting the highest correlation
   were selected. Hub genes were identified by intersecting the module
   bases of TCGA-BLCA and [59]GSE13507 cohorts.

2.4. Functional enrichment analysis

   The “Gene Ontology (GO)” and “Kyoto Encyclopedia of Genes and Genomes
   (KEGG)” pathway enrichment analyses was performed using the
   “clusterProfiler (version 4.4.4)” R package to identify biological
   processes (BPs), molecular functions (MFs), and cellular components
   (CCs) and pathways enriched by hub genes [[60]21]. “Gene set enrichment
   analysis (GSEA)” was performed using enrichment plot and the
   “clusterProfiler” R package. The KEGG pathway-related gene sets (c2.
   cp.kegg.v7.4) and biological feature gene sets (h.all.v7.4) were
   obtained from the “Molecular Signatures Database (MSigDB)” [[61]22].
   P < 0.05 was considered statistically significant.

2.5. Identifying prognosis-related genes and constructing the prognostic
feature

   First, the “linear models for microarray data (Limma)" R package was
   used to identify the differentially expressed genes between tumor and
   adjacent tumor tissues in patients from TCGA-BLCA cohort [[62]23].
   Next, the “univariate COX” and “Kaplan-Meier (KM) survival analyses”
   package was used for identifying genes significantly associated with
   the patient's survival. CSE-prognostic features contained three genes
   were identified using the “Least Absolute Shrinkage and Selection
   Operator (LASSO) -Cox regression” algorithm and “glmnet" R package for
   1000 iterations [[63]24]. The coefficients derived were used for
   calculating the CSE-risk score (RS). Finally, the median RS was set as
   the cut-off value to categorize the patients with BLCA into high- and
   low-CSE-RS groups. The difference in the overall survival (OS) of
   patients between the two groups was determined using the log-rank test.
   Next, the KM survival curve was constructed. The feature was verified
   using the same formula for calculating CSE-RS, the cut-off value, and
   the analysis method in patients with BLCA from [64]GSE13507.

2.6. Correlation analysis of the patient's clinical characteristics, the
infiltration of CAFs, stromal scores, immune functions, and CSE-RS

   The correlation between the patient's clinical characteristics (age,
   sex, tumor grade, and TNM staging) and CSE-RS using the
   “ComplexHeatmap" R package. Next, the “GGally" package was used to
   determine the Spearman correlation between CSE-RS, multiple CAF
   infiltration (EPIC, xCell, MCP-counter) and stromal score. Finally, the
   expression data of patients from TGCA-BLCA cohort were analyzed by
   “Single Sample Gene Set Enrichment Analysis (ssGSEA)” using the “Gene
   Set Variation Analysis" package and the " Estimation of STromal and
   Immune cells in MAlignant Tumor tissues using Expression data
   (ESTIMATE)" package. The differences in immune cells, immune function,
   and TME in patients in high- and low-CSE-RS groups were compared.

2.7. Analyzing sensitivity to immunotherapy and chemotherapeutic drugs

   The immunophenoscore of patients with BLCA was downloaded from “the
   cancer immune group atlas (TCIA; [65]https://tcia.at/home)” database.
   Simultaneously, the “Tumor Immune Dysfunction and Exclusion (TIDE;
   [66]http://tide.dfci.harvard.edu)” algorithm was used to predicting
   immune checkpoint blockade (ICB) response and evaluated its ability as
   a new antigen [[67]25]. The “Genomics of Drug Sensitivity in Cancer
   (GDSC; [68]https://[69]www.cancerrxge-ne.org)” database was used for
   predicting the patient's response to chemotherapy drugs. The
   half-maximal inhibitory concentration (IC[50]) value was calculated
   using the “oncoPredict” R package.

2.8. Cell culture and transfection

   T24, BLCA cells were purchased from the Chinese Academy of Sciences
   (Shanghai, China) and cultured in McCoy's 5A medium (BasalMedia,
   Shanghai, China), supplemented with 10 % fetal bovine serum (FBS,
   Excell bio, Shanghai, China), and 100 U/mL penicillin/streptomycin. The
   cells were incubated in a humidified incubator at 37 °C and 5 % CO2.
   MFAP5 and PCOLCE2 overexpressing plasmids (TK-PCDH vector, NheI/BamHI
   enzyme cleavage) were designed by Qingke Biotechnology Co., Ltd.
   (Beijing, China). Western blotting was performed to determine the
   expression and transduction efficiency.

2.9. Wound healing assay

   First, 3 × 10^5 T24 cells were seeded and cultured in a six-well plate
   with medium containing FBS for 24 h. Next, an aseptic 200 μl pipette
   tip was used to scratch the cell monolayer physically. Then, the cells
   were washed with phosphate-buffered saline (PBS) and cultured in a
   serum-free medium. The wound closure at specified positions was
   monitored at 0 and 48 h using an optical microscope and analyzed using
   the “ImageJ” software. The experiment was performed in three
   independent replicates.

2.10. Transwell invasion assay

   To analyze invasion, chamber membranes were coated with Matrigel (BD
   Bioscience, Franklin Lakes, NJ, USA) at a dilution of 3.3 ng/mL in
   medium without serum. 5 × 10^4 T24 cells were seeded in 200 μL
   serum-free medium in the upper chamber for 24 h. Next, the lower
   chamber was filled with the medium containing 10 % FBS, which was used
   as a chemotherapy inducer. After 24 h, the upper chamber was cleaned,
   and the cells were fixed using 4 % PFA. Finally, Wright-Giemsa Staining
   Solution (Beyotime, China) was used for staining the fixed cells.

2.11. Western blotting assay

   Protein samples were extracted from cells using a RIPA lysis buffer
   (Beyotime, Wuhan, China) and separated through electrophoresis on a
   10 % SDS-PAGE gel (Beyotime, Shanghai, China). The resulting proteins
   were then transferred to PVDF membranes with a pore size of 0.22 μm.
   The membranes were blocked with 5 % nonfat milk for 1 h at room
   temperature before being incubated with primary antibodies overnight at
   4 °C. Afterward, the membranes were washed three times with PBST and
   further incu-bated with secondary antibodies (Proteintech, Wuhan,
   China) for 1 h. Next, a gel image processing system (ChemiDoc XRS+) was
   used to identify target/GAPDH bands. Finally, we calculated the
   relative protein levels by ImageJ software. The primary an-tibodies
   were as follows: anti-PCOLCE2 (Invitrogen:PA5-112,643), anti-MFAP5
   (Invi-trogen:PA5-52706), anti-E-cadherin (Abcam:ab231303),
   anti-N-cadherin (Abcam:ab76011), anti-Vimentin (Abcam:ab20346), and
   anti-GAPDH (Proteintech: 10494-1-AP). GAPDH was set as an internal
   control.

2.12. Statistical analysis

   The experiment was repeated 2–3 times. Unpaired t-test and One-way
   analysis of variance were used for comparison between two groups and
   among multiple groups, respectively. The differences among groups were
   similar. The correlation between the two variables was determined by
   Pearson correlation analysis. Next, the statistical significance of in
   vivo studies was calculated using Fisher's exact test. Univariate
   survival and survival analyses were performed using the “KM survival
   analysis” and the log-rank test. The “COX regression model” was used
   for univariate and multivariate survival analysis. We performed
   statistical analysis using software like “GraphPad Prism 9.4” and “R
   language package ([70]https://[71]www.r-project).” Statistical
   significance was set two-tail p-value <0.05.

3. Results

3.1. The OS of patients with high CAF infiltration and stromal scores was low

   First, the “EPIC” and “xCell” algorithms were used for calculating CAFs
   infiltration and stromal scores of patients from TCGA-BLCA and
   [72]GSE13507 cohorts. The log-rank test was performed evaluating the
   prognostic value of OS, and the results were represented using the KM
   survival curve. The results showed higher CAFs infiltration and stromal
   scores were significantly and positively associated with poorer OS in
   TCGA-BLCA ([73]Fig. 1A and B) and [74]GSE13507 ([75]Fig. 1C and D).

Fig. 1.

   [76]Fig. 1
   [77]Open in a new tab

   The Kaplan-Meier curves indicated that patients with increased CAF
   infiltrations and stromal scores exhibited poorer overall survival in
   both TCGA-BLCA cohorts and [78]GSE13507. n = 407 in TCGA-BLCA cohort
   (A, B) and n = 166 in [79]GSE13507 cohort (C, D). Log-rank test.

3.2. WGCNA and identification of the functional module

   The expression profile of mRNA in patients from TCGA-BLCA and
   [80]GSE13507 was analyzed using WGCNA. First, a scale-free topology
   network was constructed. Next, 4 with a scale-free index of 0.9 in
   TCGA-BLCA ([81]Fig. 2A) and 10 with a scale-free index of 0.85 in
   [82]GSE13507 ([83]Fig. 2B) were identified as the soft threshold power
   (β) with relatively favorable mean connectivity. Finally, 60 genes in
   each network module were set as the minimum number of genes based on
   mixed dynamic shear trees criteria. After the gene module was
   determined using the dynamic shearing method, the characteristic genes
   of each module were identified. Then, the modules were clustered, and
   the modules close to each other were merged into new modules. Finally,
   the hierarchical cluster tree showed 12 co-expression modules in
   TCGA-BLCA ([84]Fig. 2C). The results revealed that the blue module was
   positively correlated with stromal score (Cor = 0.65, P = 2e-49) and
   the CAFs infiltration scores (Cor = 0.86, P = 8e-122, [85]Fig. 2E). A
   total of 22 co-expression modules were identified in [86]GSE13507
   ([87]Fig. 2D). Furthermore, a positive correlation was observed between
   the brown module and the stromal score (Cor = 0.83, P = 3e-48), as well
   as the CAFs infiltration scores (Cor = 0.83, P = 1e-48; [88]Fig. 2F).
   Collectively, 1818 and 1696 genes were obtained in blue and brown
   modules, signifying their significant relevance to CAFs and stromal
   score.

Fig. 2.

   [89]Fig. 2
   [90]Open in a new tab

   Identification of CAFs-Stromal score co-expression hub module. (A–B)
   Analysis of the scale‐free fit index for various soft‐threshold powers
   and the mean connectivity in TCGA-BLCA and [91]GSE13507 cohorts. (C–D).
   Clustering dendrograms showing genes with similar expression patterns
   were clustered into co-expression modules in TCGA-BLCA and [92]GSE13507
   cohort. (E–F) Correlations between each gene module and phenotype in
   TCGA-BLCA and [93]GSE13507 cohorts.

3.3. Identification of hub genes and function analyses

   A total 543 common genes were identified by intersecting gene sets of
   the two modules. Next, “the GO and KEGG pathway enrichment analysis”
   was performed on 533 genes. These genes were enriched in GO-BP, MF, and
   CC terms like extracellular stromal organization, extracellular stromal
   structural constituents, and collagen-containing extracellular,
   respectively. These genes were enriched in KEGG pathways including the
   focal adhesion and ECM-receptor interaction signaling pathways
   ([94]Supplementary Fig. 2a). Since these processes and pathways were
   closely related to EMT, the
   “HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION” gene set from the MSigDB
   was intersected with 533 genes. Taken together, 75 CAF-stromal -EMT
   (CSE)-related genes ([95]Supplementary Table 2) were identified.

3.4. Construction of CSE-prognostic feature

   First, 407 patients from TCGA-BLCA in the training group and 165
   patients from [96]GSE13507 were included in the validation group. Next,
   “univariate cox regression” and differential expression analyses were
   performed on 75 CSE-related genes based on the following criteria:
   “|log fold change (FC)| > 1” and “adj.P < 0.05” to increase the
   significance of the follow-up and narrowing the range of CSE-related
   genes. Finally, 21 genes were selected for subsequent analysis. The
   prognostic features were constructed using “LASSO Cox regression
   analysis” and cross-validation. The trajectory and log lambda of each
   independent variable was shown ([97]Fig. 3A and B). Finally, three
   genes were identified for constructing the CSE-prognostic features. The
   formula for calculating the CSE-RS was as follows:
   [MATH: <mrow><mtext>CSE</mtext><mo>‐</mo><mtext>RS</mtext><mo
   linebreak="badbreak">=</mo><mtext>expMFAP</mtext><mo>5</mo><mo>*</mo><m
   space width="0.25em"></mspace><mn>0.25</mn><mo
   linebreak="goodbreak">+</mo><mtext>expJUN</mtext><mo>*</mo><mspace
   width="0.25em"></mspace><mn>0.021</mn><mo
   linebreak="goodbreak">+</mo><mtext>expPCOLCE</mtext><mo>2</mo><mo>*</mo
   ><mspace width="0.25em"></mspace><mn>0.17.</mn></mrow> :MATH]

Fig. 3.

   [98]Fig. 3
   [99]Open in a new tab

   Evaluation and validation of the CSE-prognostic feature. (A, B) cvfit
   and lambda curves showing the LASSO regression was performed with the
   minimum criteria. (C–D) Kaplan–Meier analysis according the calculated
   CSE-RS identified BLCA patients in the high CSE-RS group exhibited
   worse overall survival in both TCGA-BLCA (n = 200) and [100]GSE13507
   cohorts (n = 77). “number at risk" refers to the current situation
   where patients in both groups have not yet reached the endpoint event
   (death). Log-rank test. The low-risk group exhibits a considerably
   higher survival rate compared to the high-risk group (P < 0.001). (E–F)
   The time-dependent ROC curves of the CSE-RS for 1-, 3-, and 5-year
   overall survival.

   Next, CSE-RS t was calculated to categorized the patients from the
   training and test groups into high-CSE-RS and low-CSE-RS groups. The KM
   survival curve demonstrated that in the high-CSE-RS group, the OS of
   patients was lower compared to the low-CSE-RS group ([101]Fig. 3C and
   D). Finally, the receiver operating characteristic (ROC) curve was used
   for validating the accuracy of the CSE-RS in predicting the patient's
   prognosis. The Area Under Curve (AUC) values of 1-, 3-, and 5-year OS
   rates of patients in the training group were 0.630, 0.655 and 0.695,
   respectively. Consistently, AUC values in the test group were 0.728,
   0.656, and 0.647 ([102]Fig. 3E and F), respectively. These results
   suggest that CSE-related genes (MFAP5, JUN, and PCOLCE2) were important
   markers for predicting the prognosis of patients with BLCA.

4. CSE-RS were highly correlated with the patient's clinical characteristics
and CAFs infiltration

   To enhance the clinical significance of the CSE-related prognostic
   feature, the differences in the clinical characteristics of patients
   were compared between both CSE-RS groups from TCGA-BLCA and
   [103]GSE13507 cohorts ([104]Fig. 4A and B). The heatmap showed
   differences in the pathological grade, T stage, tumor stage in two
   groups, and a majority of patients with higher degree of malignancy
   were in the high-CSE-RS group. Next, the ability of the CSE-prognostic
   feature in predicting the CAFs infiltration was verified by analyzing
   the Spearman correlation between CSE-RS, stromal score, and the CAFs
   infiltration using algorithms like “EPIC,” “xCell,” and “MCP-counter.”
   The results revealed a strong positive correlation between CSE-RS,
   stromal scores, and the infiltration of CAFs in patients from TCGA-BLCA
   ([105]Fig. 4C) and [106]GSE13507 ([107]Fig. 4D) cohorts.

Fig. 4.

   [108]Fig. 4
   [109]Open in a new tab

   The relevance between CSE-RS with clinical characteristics and
   multi-estimated CAF infiltrations. (A–B) The heatmap indicated that the
   group with high CSE-RS had a greater concentration of patients with a
   higher degree of malignance in TCGA-BLCA (n = 400) and [110]GSE13507
   cohorts (n = 165). (C–D) Spearman's correlation analyses revealing the
   CAF risk score was strongly and positively correlated with stromal
   scores and multi-estimated CAF infiltrations in TCGA-BLCA (n = 405) and
   [111]GSE13507 cohorts (n = 165). *P < 0.05, **P < 0.01, ***P < 0.001.

4.1. Functional enrichment and characteristic differences of CSE-RS group

   To determine the functional enrichment of CSE-related genes, GSEA was
   performed on patients in both CSE-RS groups from TCGA-BLCA cohort. The
   primary GOBP signaling pathways enriched in by CSE-related genes from
   the patients the high-CSE-RS group were related to immune responses
   like the “GOBP_ACTIVATION_OF_IMMUNE_RESPONSE,”
   “GOBP_ADAPTIVE_IMMUNE_RESPONSE,” etc. ([112]Fig. 5A). Additionally,
   “KEGG_CHEMOKINE_SIGNALING_PATHWAY,” “KEGG_ECM_RECEPTOR_INTERACTION,”
   “KEGG_FOCAL_ADHESION,” were the primary KEGG pathways enriched by
   CSE-related genes from the patients in high-CSE-RS group ([113]Fig.
   5B). In addition, the hallmark gene sets enriched by high-CSE- RS were
   primarily “HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION,”
   “HALLMARK_INFLAMMATORY_RESPONSE,” and
   “HALLMARK_TNFA_SIGNALING_VIA_NFKB8.” ([114]Supplementary Fig. 2b).
   Further ssGSEA showed that the patients in the high-CSE-RS group had
   higher infiltration of immune cells ([115]Fig. 5C) and highly active
   immune functions ([116]Supplementary Fig. 2c). In addition, the stromal
   and immune scores of all patients were calculated using the “ESTIMATE”
   algorithm. The results showed significantly high immune and stromal
   scores of patients in the high-CSE-RS group compared to the low-CSE-RS
   group ([117]Fig. 5D). Finally, the differences in the expression of
   immune checkpoint were compared, and the results revealed higher
   expression of most immune checkpoints in patients in the high-CSE-RS
   group ([118]Fig. 5E). Together, these results suggest that CSE-related
   genes were not solely related to immune responses but may also be
   intricately connected to the EMT pathway.

Fig. 5.

   [119]Fig. 5
   [120]Open in a new tab

   Immune responses related pathways, immune cells and functions were
   enriched in the high-CSE-RS group (n = 200). (A, B) Gene set enrichment
   analysis (GSEA) of GOBP and KEGG gene sets in high-CSE-RS group. (C)
   The proportions of infiltration of immune cells in different CSE-RS
   group. (D) TME score in different CSE-RS group calculated by ESTIMATE
   shown in Violin plot. (E) The comparisons of expression of immune
   checkpoint in different CSE-RS group. *P < 0.05, **P < 0.01,
   ***P < 0.001.

4.2. Comparing the sensitivity of patients in both CSE-RS groups to chemo-
and immunotherapies

   The patients in the high-CSE-RS group had highly active immune cells
   and immune functions. Further, the expression of immune checkpoints was
   also high in these patients. Therefore, these patients would may
   respond to immunotherapy better. First, the immunogenicity of patients
   in both groups was estimated using the IPS analysis from TCIA. The
   results revealed higher IPS-PD1, IPS-CTLA4, and IPS-PD1-CTLA4 scores in
   patients in the high-CSE-RS group compared to the low-CSE-RS group
   ([121]Supplementary Fig. 2 d-g). Next, the “TIDE” analysis was used to
   evaluate if CSE-RS could predict the patient's response to
   immunotherapy. The patients in the responder subgroup (n = 77) had
   significantly higher CSE-RS compared to the no-responder (n = 323)
   subgroup (P < 0.001; [122]Fig. 6A). The patients with BLCA in the
   high-CSE-RS group (42/158) were more sensitive to immunotherapy
   compared to patients (35/165) in low-CSE-RS group (P < 0.001; [123]Fig.
   6B). These results suggested that the response of patients in the
   high-CSE-RS group to immunotherapy was better. Simultaneously, GDSC was
   used to predict the response of patients in both CSE-RS groups to
   common chemotherapeutic drugs. The results revealed higher sensitivity
   of patients in the low-CSE-RS group to chemotherapeutic drugs such as
   Camptothecin, Cisplatin, Docetaxel, and Sunitinib ([124]Fig. 6C–F).
   Collectively, we found that high-CSE-RS group may exhibit better
   responses to immunotherapy and chemotherapy.

Fig. 6.

   [125]Fig. 6
   [126]Open in a new tab

   Sensitivity analysis for immunotherapy and chemotherapy in both CSE-RS
   group (n = 400). (A) The TIDE-predicted immunotherapy-responders showed
   higher CSE-RS in the TCGA-BLCA dataset. (B) The percentage of
   responders (in orange) versus non-responders (in blue) within in high
   and low CSE-RS groups. (C–F) IC[50] box diagram of the four common
   chemotherapeutics drugs in high and low CSE-RS groups.

4.3. Determining the functions of CSE-related genes in vitro

   Various studies have shown the role of JUN in BLCA [[127]26,[128]27];
   hence, MFAP5 and PCOLCE2 were selected from the feature for in vitro
   experiments. To further investigate the biological function of MFAP5
   and PCOLCE2, the expression levels of MFAP5 and PCOLCE2 were analyzed
   in 5 BLCA cell lines. As shown in ([129]Fig. 7A), the protein level of
   MFAP5 and PCOLCE2 was significantly decreased in T24 cells compared to
   other cells. First, MFAP5 and PCOLCE2 were overexpressed in T24 cells
   by virus transfection and performed wound healing and transwell
   invasion assays. The results showed that MFAP5 and PCOLCE2
   overexpression promotes T24 cell migration and invasion. ([130]Fig. 7B
   and C). Western blotting results showed that MFAP5 and PCOLCE2
   overexpression could significantly decrease E-cadherin expression and
   increase N-cadherin and Vimentin expression ([131]Fig. 7D). Finally, we
   found two CSE-related genes (MFAP5 and PCOLCE2) have been observed to
   stimulate the invasion and migration of BLCA cells, suggesting a
   potential association with the EMT pathway.

Fig. 7.

   [132]Fig. 7
   [133]Open in a new tab

   Overexpression of MFAP5 and PCOLCE2 promoted the migration and invasion
   of BLCA cells. (A) Protein expression of MFAP5 and PCOLCE2 was
   significantly downregulated in T24 cells. The experiments were
   performed in triplicate. One-way ANOVA, *P < 0.05, ***P < 0.001.(B)
   Overexpression of MFAP5 and PCOLCE2 contributed to T24 cell migration
   in wound-healing assay. The percentage of wound closure is shown as the
   mean ± SD. t-test, *P < 0.05, **P < 0.01. (C) In the transwell assay,
   the invasiveness of T24 cells was enhanced in the overexpression group.
   The number of invasion cell is shown as the mean ± SD. t-test, **P <
   0.01, ***P < 0.001. (D) Western blotting was performed to detect
   expression of EMT-related markers (E-cadherin, N-cadherin, vimentin) in
   MFAP5 and PCOLCE2-overexpression T24 cells. The experiments were
   performed in triplicate. t-test, *P < 0.05, **P < 0.01, ***P < 0.001.

5. Discussion

   Recent studies on BLCA have primarily focused on deciphering the
   mechanism underlying the invasive phenotype of tumors by analyzing
   genetic variations. The advent of immunotherapy, including ICB, has
   enhanced our understanding of the invasiveness behavior of cancer cells
   [[134]28,[135]29]. However, the interaction between stromal and tumor
   cells has not received much attention. CAFs are among the most abundant
   components of most solid tumors. The presence of CAFs and the
   proliferation of connective tissues are the most prominent features of
   invasive tumors of BLCA [[136]11]. Some studies have shown that
   fibroblast-derived growth factors mediate protumor effects like
   promoting EMT and cell invasion [[137]30]. Furthermore, these CAFs help
   tumor cells evade immune surveillance and contribute to BLCA
   development, drug resistance, and immunosuppression [[138]31].

   Our results showed an association between high CAFs infiltration as
   well as stromal scores, and poor OS of patients with BLCA. Furthermore,
   a close correlation between CAFs and EMT was also observed. Therefore,
   it is important to study CAFs and EMT-related molecular targets in
   BLCA. However, only few studies have analyzed the correlation between
   study CAFs and EMT-related molecular targets in BLCA. The stromal
   components within the TME have fundamental and interconnected roles in
   driving tumourigenesis, CAFs are among the most significant and dynamic
   constituents within the TME [[139]32,[140]33], Given the intimate
   interplay between stromal components and CAFs, integrating them would
   more effectively illustrate their functional roles. In this study, for
   the first time, WGCNA and multiple algorithms were used to mine the
   co-expression network between CAFs and stromal cells in patients from
   TCGA-BLCA and [141]GSE13507 cohorts. The enrichment analysis showed
   that these genes were enriched in the EMT-related pathways. By using a
   univariate Cox and LASSO regression algorithm after the intersection
   with EMT-related genes, a CSE-prognostic feature consisting of three
   genes (MFAP5, JUN, and PCOLCE2) was constructed and verified.

   Previous studies have demonstrated a few functions of these three
   genes, including promoting tumorigenesis. These genes are closely
   related to CAFs and EMT. Microfibrillar-associated protein 5 (MFAP5)
   are secreted by CAFs and critically involved in integrating elastic
   microfibers, regulating the behavior of endothelial cells, and
   promoting basal breast cancer progression by activating the EMT pathway
   [[142]34]. MFAP5 is involved in activation of CAFs in pancreatic cancer
   [[143]35]. JUN is a protumor transcription factor that regulates major
   biological events. Several studies have shown the effect of JUN on
   tumors. Additionally, the downstream positive regulator of
   Wnt/β-catenin is a key transcription factor regulating various
   processes such as the proliferation, metastasis, and death of mammalian
   cells [[144]36,[145]37]. Procollagen C-endopeptidase enhancer 2
   (PCOLCE2) is considered to be the intermediary of protein stromal,
   which can stimulate bone morphogenetic protein 1 (BMP1) activity on
   fibrinogen collagen [[146]38]. A study has shown the involvement of
   BMP1 in gastric and non-small cell lung cancer metastases [[147]39].
   Furthermore, PCOLCE plays a key role in promoting lung metastasis of
   osteosarcoma [[148]40] and participates in EMT [[149]41]. A recent
   study has shown that PCOLCE2 stimulates the production of reactive
   oxygen species in neutrophils [[150]42]. Moreover, a study has shown a
   correlation between PCOLCE2 and ferroptosis in colon adenocarcinoma
   [[151]43]. However, few studies are available on relationship between
   PCOLCE2 and BLCA.

   Based on the constructed CSE-prognostic feature, patients with a high
   degree of malignancy were divided into the high-CSE-RS group.
   Furthermore, the results showed a positive correlation between CSE-RS
   and the CAFs infiltration as well as stromal scores predicted by three
   different methods. Interestingly, GSEA showed that
   “GOBP_ACTIVATION_OF_IMMUNE_RESPONSE,” “GOBP_ADAPTIVE_IMMUNE_RESPONSE,”
   etc., were primarily enriched BP pathways in the high-CSE-RS groups.
   These results are plausible since CAFs are stromal cells are the most
   basic component of TME. A study has shown that interactions between
   tumor immune microenvironment and CAFs play a key role in enhancing
   cancer progression [[152]44]. Furthermore, CAFs create
   immunosuppressive TME by secreting several growth factors, cytokines,
   chemokines, exocrine, and other effector molecules. These molecules
   interact with infiltrating immune cells and other immune components of
   tumors over time, which allows cancer cells to escape immune
   surveillance [[153]45]. This is similar to the fact that our high CSE
   score genes were mainly enriched in the KEGG enrichment pathway.

   In this study, a correlation analysis was performed on the immune score
   of the prognostic feature, and the results revealed high infiltration
   of immune cells in patients in the high-CSE-RS group, specifically
   CD8^+ T cells. Additionally, ssGSEA showed highly active immune
   function. CD8^+ T cells are critically involved in promoting anti-tumor
   immune response by releasing cytokines and inducing apoptosis of cancer
   cells [[154]46]. The presence of tumor edge and intra-tumor CD8^+ T
   cells will produce a stronger response to immune checkpoint inhibitor
   therapy [[155]47]. Additionally, TCIA was analyzed using the “TIDE”
   algorithm. The results showed that the response of patients in the
   high-CSE-RS group to immunotherapy was better. Moreover, GDSC was used
   to predict the responses of patients in both CSE-RS groups to common
   chemotherapeutic drugs. Our results revealed higher sensitivity of
   patients in the low-CSE-RS group to chemotherapeutic drugs like
   Camptothecin, Cisplatin, Docetaxel, and Sunitinib.

   Finally, in vitro analyzed were performed to verify the functions of
   CSE-related genes. As expected, MFAP5 and PCOLCE2 overexpression could
   significantly enhance the invasion and migration of BLCA cells,
   consistent with previous studies. Shi showed that MFAP5 promotes BLCA
   progression and metastasis of cells by triggering the PI3K-AKT
   signaling pathway to activate the DLL4/NOTCH2 pathway axis [[156]48].
   Chen et al. showed good performance of PCOLCE2 in predicting the OS of
   patients with colorectal cancer [[157]49]. Currently, few studies have
   determined the involvement of PCOLCE2 in BLCA. Our team will continue
   to study the mechanism of BLCA.

   However, our study has a few shortcomings. First, this is a
   retrospective study based on data obtained from two publicly available
   databases. The prognosis of patients and the therapeutic significance
   of the model should be validated at multiple centers and using data
   from several databases. Second, only in vitro experiments were
   performed. The results should be verified using animal-based
   experimental models to explore its mechanism and provide new ideas for
   treating patients with BLCA.

6. Conclusion

   In summary, the correlation between CAFs and stromal cell invasion in
   TME and poor prognosis of patients with BLCA was validated.
   Furthermore, the co-expression network of the CAF infiltration and
   stromal scores were combined with EMT-related genes to construct and
   determine a prognostic feature contained three genes for BLCA.
   Additionally, the CSE-prognostic feature was verified to have high
   accuracy in predicting the prognosis, the CAFs infiltration, and the
   responses of patients to immunotherapy and chemotherapy. Moreover, in
   vitro experiments were performed, confirming that these CSE-prognostic
   feature genes could promote the migration, invasion, and EMT phenotypes
   of BLCA cells. These results may provide new insights and aid
   clinicians in designing new and accurate treatment strategies for
   patients with BLCA.

Funding

   This research was funded by Changsha Municipal Natural Science
   Foundation (Grant No. kq2208468, to H.Z.), and Project of the
   Scientific Research of Hunan Provincial Health Commission (Grant No.
   D202304056751, to H.Z.).

Data availability statement

   The data presented in this study are contained within the article and
   Supplementary Materials.

CRediT authorship contribution statement

   Huidong Zhou: Writing – original draft, Visualization, Validation,
   Investigation, Funding acquisition, Formal analysis, Data curation.
   Ruqi Li: Software, Methodology, Investigation, Formal analysis, Data
   curation. Jinghong Liu: Methodology, Formal analysis, Data curation.
   Jianhua Long: Writing – review & editing, Validation, Supervision,
   Resources, Conceptualization. Tao Chen: Writing – review & editing,
   Validation, Supervision, Conceptualization.

Declaration of competing interest

   The authors declare that they have no known competing financial
   interests or personal relationships that could have appeared to
   influence the work reported in this paper.

Acknowledgments