Abstract
Purpose: Bladder cancer (BCa) is one of the most prevalent and deadly cancers worldwide. Patients with muscle-invasive bladder cancer (MIBC) have a dismal prognosis, while those with non-muscle-invasive bladder cancer (NMIBC) generally have a favorable outcome after local treatment. However, some NMIBCs relapse and progress to MIBC through a mechanism that remains unclear. Hence, insight into the genetic drivers of BCa progression could benefit precision therapeutics, risk stratification, and molecular diagnosis.
Methods: In this study, three cohort expression profile datasets ([34]GSE13507, [35]GSE32548, and [36]GSE89) consisting of NMIBC and MIBC samples were integrated to identify the differentially expressed genes (DEGs). Subsequently, protein-protein interaction (PPI) network and pathway enrichment analyses of the DEGs were performed.
Results: Six collagen members (COL1A1, COL1A2, COL5A2, COL6A1, COL6A2, and COL6A3) were up-regulated and clustered in the ECM-receptor interaction signal pathway identified by KEGG pathway analysis and GSEA. Evidence derived from the Oncomine and TCGA databases indicated that the 6 collagen genes promote the progression of BCa and are negatively associated with patient prognosis. Moreover, COL1A1 was selected for further study; the results showed that COL1A1 was up-regulated in MIBC and that its knockdown significantly inhibited the proliferation, migration, and invasion of 5637 and T24 cells by inhibiting the epithelial-mesenchymal transition (EMT) process and the TGF-β signaling pathway.
Conclusion: With integrated bioinformatic analysis and cell experiments, we showed that 6 collagen family members are high progression risk factors and that they may serve as independent, effective diagnostic and prognostic biomarkers for BCa.
Keywords: genomic analysis, bladder cancer, invasion, diagnostic marker, prognostic markers Introduction BCa is a common malignant urologic cancer worldwide, with approximately 76960 new cases and 16390 deaths in 2016 due to its poor clinical outcome.[37]^1^,[38]^2 Approximately 80% of BCa patients have non-muscle invasive tumors (Ta, T1, and Tis).[39]^3 Unfortunately, 10 to 30% of NMIBCs progress to MIBCs(T2~T4).[40]^4 On account of the involvement of multistep accumulation of genetic and epigenetic factors and external elements, such as tobacco smoking, neutrophil-to-lymphocyte ratio, body mass index, residual T1 high-grade (HG)/G3 tumors, or other carcinogens,[41]^5^–[42]^8 BCa can progress from a retrievable form of NMIBC into MIBC, an advanced crippling morphology compared to the former.[43]^9 In order to prevent postoperative bladder cancer recurrence and progression, some adjuvant means are applied such as chemotherapy, Chinese medicine, vaccine, etc.[44]^10^–[45]^13 Many resources and tools have empowered in-depth studies on the mechanisms accounting for BCa progression, which provide more possibilities in providing ideal and reliable methods to identify important genetic or epigenetic modifications in cancer progression. At present, novel abnormally expressed genes and key signaling pathways can be addressed by advanced bioinformatics analysis, rendering many clinical evaluations for BCa studies.[46]^14 The field of identifying novel cancer susceptibility loci and some individual SNPs that account for the heritability of BCa has been advanced by GWAS (genome-wide association studies).[47]^15 Recently, using TCGA analysis, researchers found that MIBC has high overall mutation rates and some recurrent alterations in genes including TP53, PIK3CA, and FGFR3 genes, and also somatic TERT promoter mutations that present in the early process of BCa.[48]^16 BCa entails a complex process, through which a primary tumor progresses to a disseminated metastatic disease. 
The non-cellular surroundings of the tumor, the extracellular matrix (ECM), interact with cancer cells at each step of the metastatic process.[49]^17 During cancer progression, the ECM regulates numerous cell functions, including proliferation, migration, invasion and protein synthesis.[50]^18 The ECM comprises approximately 300 proteins, of which collagen, elastin, and fibronectin are common.[51]^19 The ECM interacts with cells, and these interactions are mediated by transmembrane receptors, such as integrins, syndecans, CD44, discoidin domain receptor, and dystroglycan.[52]^20^,[53]^21 In breast cancer, ECM proteins appear to be involved in the maintenance of tumor cell shape, migration and invasion by regulating the expression of the CD44 protein, a known tumor prognostic factor, thus acting on tumor progression and metastasis.[54]^22 However, although the ECM is closely related to tumor metastasis, the role of ECM proteins, especially collagens, in the progression of non-invasive BCa into invasive cancer has not been extensively studied. Although many studies have been carried out, BCa progression is still poorly understood, and several proposed mechanisms remain unexplored. In this study, to identify potential markers and risk factors of BCa, the expression profiles of MIBC tissue and NMIBC tissue obtained from three GEO datasets were analyzed using the limma package. Through KEGG pathway analysis and GSEA, the ECM–receptor interaction signaling pathway was identified. Further analysis of the Oncomine database showed that 6 collagen family members located in the ECM–receptor interaction signaling pathway were positively correlated with BCa progression. Analysis of the Oncomine and TCGA databases indicated that the 6 collagen genes overexpressed in MIBC are significantly correlated with BCa progression, overall survival, and recurrence-free survival in patients with BCa.
The pivotal protein COL1A1 was then silenced to determine its roles in tumor cell growth, proliferation, invasion and migration in BCa. The results revealed that the 6 collagen family members and the ECM-receptor interaction signaling pathway play a significant role and that the 6 collagen family members may be effective, independent prognostic biomarkers of BCa progression.

Materials and methods

Microarray data information and DEGs identification
NMIBC and MIBC tissue gene expression profiles of [55]GSE13507, [56]GSE32548 and [57]GSE89 were all obtained from NCBI-GEO ([58]https://www.ncbi.nlm.nih.gov/geo/). The array data for [59]GSE13507 consisted of 103 NMIBC tissue samples and 61 MIBC tissue samples.[60]^23 The array data for [61]GSE32548 contained 92 NMIBC tissue samples and 38 MIBC tissue samples.[62]^24 The [63]GSE89 dataset contained 30 NMIBC tissue samples and 10 MIBC tissue samples.[64]^25 The DEGs were then identified with the independent t-test in the limma package, with p<0.05 and |logFC|>0.75 as the cut-off criteria. The robust multi-array average (RMA) method in the limma package ([65]http://www.bioconductor.org/packages/2.9/bioc/html/limma.html) was used to preprocess the raw CEL data and to perform quantile normalization.

Construction of protein-protein interaction (PPI) network
PPI analysis was used to search for core genes and gene modules related to carcinogenesis.
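As a minimal sketch of the DEG selection rule stated in the microarray subsection above (p<0.05 and an absolute logFC above 0.75 in each dataset, followed by taking the genes common to all datasets), the Python fragment below applies the two thresholds and intersects the results. The study itself used limma in R; the function names and the toy logFC and p-values here are hypothetical and only illustrate the filtering logic.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-dataset results: gene -> (logFC, p-value).
# These numbers are invented for illustration; they are not from the study.
Result = Dict[str, Tuple[float, float]]


def select_degs(results: Result, p_cut: float = 0.05, lfc_cut: float = 0.75) -> Set[str]:
    """Keep genes with p < p_cut and |logFC| > lfc_cut."""
    return {
        gene
        for gene, (log_fc, p_value) in results.items()
        if p_value < p_cut and abs(log_fc) > lfc_cut
    }


def common_degs(*datasets: Result) -> Set[str]:
    """Genes that pass the cut-off in every dataset."""
    selected = [select_degs(d) for d in datasets]
    return set.intersection(*selected) if selected else set()


toy_a = {"COL1A1": (1.40, 0.001), "SPP1": (0.90, 0.010),
         "FOXA1": (-1.10, 0.002), "GAPDH": (0.05, 0.800)}
toy_b = {"COL1A1": (1.10, 0.004), "SPP1": (0.60, 0.020), "FOXA1": (-0.95, 0.010)}
toy_c = {"COL1A1": (0.80, 0.030), "SPP1": (1.20, 0.001), "FOXA1": (-0.80, 0.060)}

shared = common_degs(toy_a, toy_b, toy_c)
print(sorted(shared))
```

In this toy input only one gene passes both thresholds in all three tables, which mirrors how the overlap of the three Venn sets is obtained.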
DEGs were submitted to STRING version 10.5 (Search Tool for the Retrieval of Interacting Genes; string.embl.de) to identify interactions among the encoded proteins, with a confidence score >0.4 used as the threshold criterion.[66]^26 Then, Cytoscape software (cytoscape.org) was used to construct a PPI network of the DEGs and to analyze the interaction relationships of the DEG-encoded proteins in BCa.[67]^27

Functional and pathway enrichment analysis
Gene ontology (GO) analysis is a common method for analyzing the functions of many genes.[68]^28 The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a major recognized pathway-related database resource for the biological interpretation of high-throughput data. GO and KEGG analyses are available in the DAVID database ([69]https://david.ncifcrf.gov/). The cut-off value was a p-value <0.05.

Gene set enrichment analysis (GSEA)
GSEA is a method to identify classes of genes within a large gene list that may correlate with disease phenotypes. These genes are grouped together by their involvement in the same biological pathway or by proximal location on a chromosome. First, the enrichment score (ES) is calculated; it indicates the degree to which the genes of a set are over-represented at either the top or bottom of the ranked list, which corresponds to the largest differences in gene expression between the MIBC and NMIBC tissues. Then, the statistical significance of the ES is estimated, the enrichment scores for each set are normalized, and a false discovery rate is calculated.[70]^29 The normalized enrichment score (NES) and nominal p-value were applied to rank the pathways enriched in each phenotype.

Oncomine and TCGA analysis
The Oncomine database ([71]http://www.oncomine.org) was used to explore the mRNA expression differences of the 7 DEGs in the ECM pathway between MIBC and NMIBC tissues in BCa.
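The enrichment score described in the GSEA subsection above can be illustrated with a short sketch. It implements the commonly used weighted running-sum statistic: walking down the ranked gene list, the sum increases at genes in the set (in proportion to their ranking score) and decreases at genes outside it, and the ES is the maximum deviation from zero. The gene names, scores, and the function name are hypothetical, and the permutation-based significance and NES steps are not shown.

```python
from typing import Sequence, Set


def enrichment_score(ranked_genes: Sequence[str],
                     scores: Sequence[float],
                     gene_set: Set[str],
                     weight: float = 1.0) -> float:
    """Running-sum enrichment score over a list ranked from most up to most down."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")
    # Normalizer so that all hit increments sum to 1.
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_norm == 0:
        raise ValueError("all hit scores are zero")
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_norm
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


# Toy ranked list (hypothetical scores, highest in MIBC first).
genes = ["COL1A1", "COL6A3", "G3", "G4", "G5", "G6"]
stat = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top = enrichment_score(genes, stat, {"COL1A1", "COL6A3"})
es_bottom = enrichment_score(genes, stat, {"G5", "G6"})
print(es_top, es_bottom)
```

A positive ES means the set is concentrated at the top of the list (here, the MIBC side), and a negative ES means it is concentrated at the bottom.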
BCa gene expression data (mRNA, normalized RNAseq FPKM-UQ) were obtained from the TCGA database (provisional) using cBioPortal ([72]http://www.cbioportal.org/). For the survival analyses, BCa patients with high expression levels of the 7 DEGs in the ECM pathway (the upper 25%, n=102) were compared with those with low expression levels of the genes (the lower 25%, n=102).

Tissue samples and immunohistochemistry (IHC) staining
In this study, a total of 30 tissue samples (15 NMIBC and 15 MIBC) were collected from BCa patients who underwent surgery at the Third Affiliated Hospital, Guangzhou Medical University. All patients provided written informed consent for the use of their tissue for research. All patients were confirmed to have BCa via histopathological evaluation. None of the patients received treatment before surgery. This study was approved by the Internal Research Ethics Board at the Third Affiliated Hospital of Guangzhou Medical University and was conducted in accordance with the Declaration of Helsinki. Paraffin-embedded sections (3 µm) from the 30 BCa samples were deparaffinized, rehydrated and blocked as previously described.[73]^30 After being incubated with a goat anti-human monoclonal COL1A1 antibody (1:200; ab138492, Abcam, Cambridge, UK), sections were incubated with a biotin-labeled rabbit anti-goat antibody. Sections were visualized with streptavidin-conjugated horseradish peroxidase and 3,3′-diaminobenzidine (DAB). Each sample was scored according to the number of cells stained (≤5% positive cells, 0; 6–25% positive cells, 1; 25–50% positive cells, 2; 50–80% positive cells, 3; >80% positive cells, 4) and the staining intensity (no staining, 0; mild staining, 1; moderate staining, 2; severe staining, 3). The final score was the sum of the positive cell percentage score and the staining intensity score.

Cell culture and RNAi
5637 and T24 BCa cells were purchased from the Institute of Cell Research, Chinese Academy of Sciences, Shanghai, China.
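The composite IHC score described in the immunohistochemistry subsection above (percentage-of-positive-cells score plus staining-intensity score) can be written as a small function. This is only a sketch: the function and dictionary names are hypothetical, and because the stated percentage bins share their boundary values, the code assumes that each bin includes its upper bound.

```python
# Intensity categories as named in the text, mapped to their scores.
INTENSITY_SCORE = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}


def percent_score(percent_positive: float) -> int:
    """Score the percentage of positive cells (0 to 4).

    Assumption: each bin includes its upper bound (5, 25, 50, 80).
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive <= 5:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 80:
        return 3
    return 4


def ihc_score(percent_positive: float, intensity: str) -> int:
    """Final score = percentage score + intensity score (range 0 to 7)."""
    return percent_score(percent_positive) + INTENSITY_SCORE[intensity]


print(ihc_score(60, "moderate"))
```

With this rule the final score ranges from 0 (no positive cells, no staining) to 7 (more than 80% positive cells with severe staining).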
The 5637 and T24 cells were cultured in RPMI-1640 medium (Invitrogen, Carlsbad, CA, USA) with 10% fetal bovine serum. Cells were maintained at 37 °C in a humidified incubator with 5% CO2. siRNAs against the COL1A1 gene were transfected into 5637 and T24 cells with RNAiMAX (Invitrogen) for 48 h. The following siRNA sequences were used: COL1A1#NC, sense: 5′-ACUGCCGUUGUUAAGGUGUTT-3′, antisense: 5′-ACACCUUAACAACGGCAGUTT-3′; COL1A1#1, sense: 5′-GUACGUCCGGUUGUAUGUATT-3′, antisense: 5′-UACAUACAACCGGACGUACTT-3′; COL1A1#2, sense: 5′-GGUGUUGUGCGAUGACGUGTT-3′, antisense: 5′-CACGUCAUCGCACAACACCTT-3′.

qRT-PCR
Total RNA was extracted from treated 5637 and T24 cells using the Trizol lysis method. cDNA synthesis was performed using the PrimeScript™ Reverse Transcription System (Takara). RNA levels of COL1A1 were detected using qRT-PCR according to the manufacturer’s protocol. COL1A1 expression, normalized to GAPDH, was calculated using the 2^−ΔΔCt method. The following primer sequences were used: COL1A1, forward: GAGACCTGCGTGTACCCCACT, reverse: GTCATGCTCTCGCCGAACCAG; GAPDH, forward: CCCACTCCTCCACCTTTGAC, reverse: TCTTCCTCTTGTGCTCTTGC.

Western blotting
Total cell lysates were separated by 8% or 10% SDS-PAGE and then electroblotted onto PVDF membranes.[74]^31 After incubation with primary antibodies specific for COL1A1 (1:1000; sc-293182, Santa Cruz, TX, USA), CD44 (1:2000; #37259, CST, Boston, USA), E-cadherin (1:2000; #14472, CST, Boston, USA), vimentin (1:2000; #5741, CST, Boston, USA), TGFBI (1:1000; 10188–1-AP, Proteintech, IL, USA), MMP9 (1:2000; #15561, CST, Boston, USA) and β-actin (1:4000; sc-81178, Santa Cruz, TX, USA) at 4 °C overnight, the blots were incubated with the corresponding secondary antibodies at room temperature, and the proteins were detected with an ECL kit.

Cell proliferation assay
5637 and T24 BCa cells were transfected with the indicated siRNAs for 48 h; 1×10^4 cells were then seeded into 24-well plates.
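For the qRT-PCR quantification described above, the relative-expression calculation can be sketched as follows. The function name and the Ct values are hypothetical; the formula is the standard comparative Ct calculation with GAPDH as the reference gene and the negative-control siRNA sample as the calibrator.

```python
def relative_expression(ct_target_treated: float,
                        ct_ref_treated: float,
                        ct_target_control: float,
                        ct_ref_control: float) -> float:
    """Fold change of the target gene by the comparative Ct method.

    dCt  = Ct(target) - Ct(reference), computed per sample
    ddCt = dCt(treated) - dCt(control)
    fold = 2 ** (-ddCt)
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: COL1A1 and GAPDH in siRNA-treated vs control cells.
fold = relative_expression(ct_target_treated=26.0, ct_ref_treated=18.0,
                           ct_target_control=24.0, ct_ref_control=18.0)
print(fold)
```

In this made-up example the target Ct rises by two cycles after knockdown while the reference is unchanged, which corresponds to a four-fold reduction in relative expression.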
The cells were counted for the number of viable cells at 24, 48, 72, 96, and 120 h. After being transfected with the indicated siRNAs for 48 h, the cells were analyzed with an EdU Kit (C10310-1, RiboBio) according to the manufacturer’s protocol. The results were detected with an inverted fluorescence microscope (Nikon). Colony formation assays For colony formation assays, 500 5637 or T24 cells were seeded and cultured in RPMI 1640 medium containing 10% FBS for 12 or 7 days, respectively. Clones were fixed with methanol and stained with 0.5% crystal violet. Migration and invasion assays For transwell migration assays, 1×10^5 5637 or T24 cells were plated in the top transwell chamber (8-µm pore size, BD, NY, USA) and were cultured in RPMI 1640 medium containing 0.1% FBS for 36 or 24 h, respectively. The chambers were placed in RPMI 1640 medium containing 10% FBS. Then, the chambers were fixed with methanol and stained with 0.5% crystal violet. The results were detected with an inverted microscope (Nikon). For invasion assays, chamber inserts were coated with 100 μl/ml of Matrigel (BD, Biosciences, MA, USA) with RPMI 1640 media and dried for 1 h at 37°C. Then, 20×10^4 5637 or T24 cells were plated in the top transwell chamber and were cultured in RPMI 1640 medium containing 0.1% FBS for 96 or 48 h, respectively. The same staining and detecting method for the migration assay was used. Statistics Statistical analysis was performed using GraphPad Prism 5. The survival curve between the groups was described by the Kaplan-Meier plot and was evaluated using the log-rank test. Differences in quantitative data were compared between the two groups using Student’s t-test. All experiments were performed three times independently. *p<0.05, **p<0.01, or ***p<0.001 was considered to be significant. 
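The survival comparison described in the methods (patients in the upper versus lower expression quartile, compared with the log-rank test) can be sketched in plain Python. The study used GraphPad Prism; the code below is an independent illustration, with hypothetical function names and toy data, of how the quartile groups are formed and how the two-group log-rank chi-square statistic and its p-value (1 degree of freedom) are computed.

```python
import math
from typing import List, Sequence, Tuple


def quartile_groups(expr: Sequence[float]) -> Tuple[List[int], List[int]]:
    """Return sample indices of the lowest 25% and highest 25% by expression."""
    order = sorted(range(len(expr)), key=lambda i: expr[i])
    k = len(expr) // 4
    return order[:k], order[len(expr) - k:]


def logrank(times_a: Sequence[float], events_a: Sequence[int],
            times_b: Sequence[float], events_b: Sequence[int]) -> Tuple[float, float]:
    """Two-group log-rank test; events are 1 (death/relapse) or 0 (censored)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("not enough events to compute the test")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 df
    return chi2, p_value


# Toy data: group A has early events, group B late events.
chi2, p = logrank([1, 2, 3, 4], [1, 1, 1, 1], [10, 11, 12, 13], [1, 1, 1, 1])
low_idx, high_idx = quartile_groups([5, 1, 9, 3, 7, 2, 8, 4])
print(round(chi2, 3), round(p, 4), low_idx, high_idx)
```

With a cohort of 408 patients, `len(expr) // 4` would give the 102 patients per subgroup reported for the TCGA analysis; that cohort size is an inference from the subgroup size, not a figure stated in the text.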
Results

The ECM-receptor interaction signal pathway may promote BCa progression
Gene expression microarray analysis identified DEGs in each dataset (491 in [75]GSE13507, 418 in [76]GSE32548, and 529 in [77]GSE89), and 45 genes were shared by all three datasets ([78]Figure 1A). These 45 genes, listed in [79]Table 1, comprised 30 up-regulated genes and 15 down-regulated genes in the MIBC tissues compared with the NMIBC tissues. A PPI network was constructed using the STRING database, and the results are presented in [81]Figure 1B. Analysis of the PPI sub-network demonstrated that COL1A1, COL1A2, COL5A2, COL6A1, COL6A2, COL6A3, MMP9, THY1, TAGLN, VIM, CTGF, PDGFRB, ACTA2, SPP1, and TPM2 interacted closely with each other ([82]Figure 1B).

Figure 1. The ECM-receptor interaction signal pathway may promote BCa progression. (A) Venn diagram of DEGs from the three mRNA expression profiling datasets [85]GSE89, [86]GSE13507, and [87]GSE32548. (B) The protein–protein interaction (PPI) network of DEGs was constructed using the STRING website (available online: [88]www.string-db.org) and Cytoscape software (red represents upregulated genes, and blue represents downregulated genes). (C) KEGG pathway enrichment analysis of DEGs. (D) Venn diagram of the top 20 gene sets enriched by GSEA in the MIBC phenotype compared to NMIBC. (E) Venn diagram of the intersection of GSEA and KEGG analysis. (F) GSEA results showing the ECM-receptor interaction pathway.

Table 1.
Forty-five DEGs were identified from the [89]GSE89, [90]GSE13507, and [91]GSE32548 expression profile datasets
DEGs | Gene names
Up-regulated | ACTA2, ACTN1, AIF1, C1QB, CD14, CDC20, COL1A1, COL1A2, COL5A2, COL6A1, COL6A2, COL6A3, CRIP1, CTGF, DPYSL2, LGALS1, MGP, MMP9, MT2A, PDGFRB, PLEK, RBP1, RGS1, SLC2A3, SPP1, TAGLN, TGFBI, THY1, TPM2, VIM
Down-regulated | CTSE, CYP2J2, CYP4B1, DHRS2, FABP6, FOXA1, GPX2, HMGCS2, HSD17B2, ID3, PTK6, SLC14A1, SMAD6, SORL1, SPINK1

To further analyze the functions and pathway enrichment of the candidate DEGs, both GO and KEGG analyses were conducted using DAVID. All DEGs were classified into three groups, cellular component (CC), biological process (BP) and molecular function (MF), and the results are shown in [93]Table 2. Regarding cellular component, the upregulated DEGs were enriched in proteinaceous extracellular matrix, extracellular exosomes, and collagen. For biological process, the upregulated DEGs were mainly involved in extracellular matrix organization, cell adhesion, and skeletal system development. For molecular function, the upregulated DEGs were mainly enriched in extracellular matrix structural constituent and platelet-derived growth factor binding. In addition, the downregulated DEGs were mainly enriched in steroid metabolic processes and electron carrier activity. Furthermore, KEGG pathway analysis revealed that the upregulated DEGs were mainly involved in focal adhesion and ECM-receptor interaction ([94]Figure 1C), and these two signaling pathways shared 7 genes: COL6A1, COL6A2, COL6A3, COL5A2, COL1A1, COL1A2, and SPP1 (osteopontin) ([95]Table 3). The downregulated DEGs, however, were not significantly enriched in any KEGG pathway.

Table 2.
The significant GO terms analysis of DEGs
Category | Term | Count | P-value
Up-regulated (Top 20)
GOTERM_CC_FAT | GO:0005578~proteinaceous extracellular matrix | 11 | 1.97E-10
GOTERM_CC_FAT | GO:0031012~extracellular matrix | 11 | 4.11E-10
GOTERM_CC_FAT | GO:0044421~extracellular region part | 13 | 6.26E-08
GOTERM_CC_FAT | GO:0005581~collagen | 5 | 5.72E-07
GOTERM_CC_FAT | GO:0005576~extracellular region | 16 | 5.73E-07
GOTERM_MF_FAT | GO:0048407~platelet-derived growth factor binding | 4 | 1.63E-06
GOTERM_BP_FAT | GO:0030198~extracellular matrix organization | 6 | 2.08E-06
GOTERM_BP_FAT | GO:0001501~skeletal system development | 8 | 2.94E-06
GOTERM_BP_FAT | GO:0007155~cell adhesion | 10 | 7.17E-06
GOTERM_BP_FAT | GO:0022610~biological adhesion | 10 | 7.26E-06
GOTERM_BP_FAT | GO:0043062~extracellular structure organization | 6 | 1.88E-05
GOTERM_MF_FAT | GO:0005201~extracellular matrix structural constituent | 5 | 3.76E-05
GOTERM_CC_FAT | GO:0044420~extracellular matrix part | 5 | 7.27E-05
GOTERM_MF_FAT | GO:0019838~growth factor binding | 5 | 8.21E-05
GOTERM_BP_FAT | GO:0001503~ossification | 5 | 8.67E-05
GOTERM_BP_FAT | GO:0060348~bone development | 5 | 1.13E-04
GOTERM_CC_FAT | GO:0005583~fibrillar collagen | 3 | 2.40E-04
GOTERM_CC_FAT | GO:0005615~extracellular space | 8 | 2.53E-04
GOTERM_MF_FAT | GO:0005178~integrin binding | 4 | 2.99E-04
GOTERM_BP_FAT | GO:0048705~skeletal system morphogenesis | 4 | 1.56E-03
Down-regulated
GOTERM_BP_FAT | GO:0008202~steroid metabolic process | 5 | 3.10474E-05
GOTERM_BP_FAT | GO:0055114~oxidation reduction | 5 | 0.002505005
GOTERM_MF_FAT | GO:0009055~electron carrier activity | 3 | 0.022942517
GOTERM_MF_FAT | GO:0070330~aromatase activity | 2 | 0.026636667
GOTERM_MF_FAT | GO:0016712~oxidoreductase activity | 2 | 0.031884301

Table 3.
The significant signaling pathway enrichment analysis of DEGs
Pathway name | Gene count | p-value | Genes
Focal adhesion | 9 | 8.9E-07 | COL6A3, COL6A2, COL1A2, COL6A1, ACTN1, PDGFRB, COL1A1, COL5A2, SPP1
PI3K-Akt signaling pathway | 8 | 2.6E-04 | COL6A3, COL6A2, COL1A2, COL6A1, PDGFRB, COL1A1, COL5A2, SPP1
ECM-receptor interaction | 7 | 8.9E-07 | COL6A3, COL6A2, COL1A2, COL6A1, COL1A1, COL5A2, SPP1
Protein digestion and absorption | 6 | 1.7E-05 | COL6A3, COL6A2, COL1A2, COL6A1, COL1A1, COL5A2
Amoebiasis | 5 | 7.5E-04 | COL1A2, ACTN1, COL1A1, COL5A2, CD14

To further identify signaling pathways that are activated in MIBC compared to NMIBC, we performed GSEA on the three expression datasets ([98]GSE13507, [99]GSE32548, and [100]GSE89). For each dataset, the top 20 gene signatures enriched in the MIBC phenotype relative to the NMIBC phenotype were selected by NES. Among them, 4 gene signatures were found in all three datasets: ECM-receptor interaction, cytokine-receptor interaction, complement and coagulation cascades, and the NOD-like receptor signaling pathway ([101]Figure 1D). Of these, only the ECM-receptor interaction signaling pathway was also identified by the KEGG analysis ([102]Figure 1E). The ECM-receptor interaction signaling pathway was differentially enriched in the MIBC phenotype in the [103]GSE13507, [104]GSE32548, and [105]GSE89 datasets ([106]Figure 1F and [107]Figure S1).

Figure S1. Blue-pink O’gram of the ECM-receptor interaction pathway in the three datasets [110]GSE89, [111]GSE13507, and [112]GSE32548.

Over-expression of 6 collagen genes in muscle-invasive BCa
Based on the GSEA and KEGG analyses, we hypothesized that up-regulation of genes located in the ECM-receptor interaction signal pathway may be important in the progression of NMIBC into MIBC.
To ascertain the roles of the collagen genes and to investigate whether the genes belonging to the ECM-receptor interaction signal pathway are overexpressed in MIBC, mRNA levels in both types of BCa were analyzed using the Oncomine database. Significantly higher mRNA levels of all 6 genes were found in MIBC tissues compared to NMIBC tissues ([113]Figure 2A). Similarly, higher mRNA levels were also found in MIBC tissues in the Sanchez-Carbayo bladder database ([114]Figure 2B). These observations indicate that the COL6A3, COL6A2, COL6A1, COL5A2, COL1A2, and COL1A1 genes are overexpressed in MIBC. Collectively, these results suggest that the collagen genes of the ECM-receptor interaction signaling pathway may be critical for BCa progression and may represent potential therapeutic targets.

Figure 2. Over-expression of COL6A3, COL6A2, COL6A1, COL5A2, COL1A2, and COL1A1 genes in MIBC tissues compared to NMIBC tissues. (A) COL6A3, COL6A2, COL6A1, COL5A2, COL1A2, and COL1A1 mRNA levels were up-regulated in MIBC compared to NMIBC based on the Dyrskjot bladder database from Oncomine. (B) COL6A3, COL6A2, COL6A1, COL5A2, COL1A2, and COL1A1 mRNA levels were increased in MIBC compared to NMIBC in the Sanchez-Carbayo bladder database.

Higher expression of 6 collagen genes involved in the ECM–receptor interaction signaling pathway promoted mortality in BCa patients
When the expression levels of the 6 collagen genes were compared between the MIBC and NMIBC tissues, all six genes showed the same pattern, with higher mRNA levels in MIBC than in NMIBC. This indicated that higher expression levels of the genes located in the ECM-receptor interaction signal pathway might increase the likelihood of NMIBC progressing into MIBC. However, the correlation of the upregulation of these 6 genes with recurrence-free survival and overall survival in BCa patients remained unclear.
Survival analysis showed that the overall survival rates of BCa patients were negatively correlated with the levels of all 6 collagen genes ([116]Figure 3A), and the same was observed for recurrence-free survival ([117]Figure 3B). Thus, the 6 collagen genes were negatively correlated with the prognosis of BCa patients and may serve as independent prognostic markers.

Figure 3. Increased COL6A3, COL6A2, COL6A1, COL5A2, COL1A2 and COL1A1 mRNA levels correlate with poor survival rates in BCa patients. (A) Kaplan-Meier analysis of overall survival for BCa patients with high (the upper 25%) or low (the lower 25%) COL6A3, COL6A2, COL6A1, COL5A2, COL1A2 and COL1A1 expression (102 patients per subgroup). (B) Kaplan-Meier analysis of recurrence-free survival for BCa patients with high (the upper 25%) or low (the lower 25%) COL6A3, COL6A2, COL6A1, COL5A2, COL1A2 and COL1A1 expression (102 patients per subgroup).

Upregulation of COL1A1 protein is positively correlated with the progression of NMIBC to MIBC
We evaluated the expression levels of the 6 collagen genes involved in the ECM–receptor interaction signaling pathway, with GAPDH as the internal control, in the [120]GSE13507, [121]GSE32548, and [122]GSE89 datasets. The mRNA level of COL1A1 in MIBC was the highest among the six collagen genes, leading us to speculate that COL1A1 plays a key role in the progression of BCa ([123]Figure S2). Therefore, we selected COL1A1 for further investigation. To examine the role of COL1A1 in the progression of BCa, COL1A1 levels in 15 NMIBC tissues and 15 MIBC tissues were analyzed with IHC staining ([124]Figure 4A and [125]B).
We found that 73.33% (11/15) of the MIBC tissues stained positively for COL1A1, including 2 cases of severely positive samples, 2 cases of moderately positive samples and 7 cases of highly positive samples, whereas only 40.0% (6/15) of the NMIBC tissues were positive ([126]Figure 4B).

Figure S2. Analysis of the three datasets ([129]GSE89, [130]GSE13507, and [131]GSE32548) for mRNA expression levels of COL6A3, COL6A2, COL6A1, COL5A2, COL1A2 and COL1A1 in the MIBC and NMIBC samples.

Figure 4. Compared with NMIBC tissue samples, the COL1A1 protein was upregulated in MIBC tissue samples. (A) Representative IHC images of COL1A1 expression in the NMIBC and MIBC tissue samples. (B) Differences in COL1A1 expression scores between NMIBC tissues (n=15) and MIBC tissues (n=15) are shown as a scatter plot.

Silencing of COL1A1 inhibits both migration and invasion of tumors in vitro
Tumor progression is generally characterized by dysregulated cell proliferation, invasiveness, and metastasis.[134]^32^,[135]^33 To further investigate the role of COL1A1 in the progression of BCa, COL1A1 was targeted with RNA interference in 5637 and T24 cells. The results showed that both the mRNA and protein expression levels of COL1A1 in 5637 and T24 cells were significantly reduced compared to the control groups ([136]Figure 5A and [137]B). We also found that silencing of COL1A1 downregulated the expression level of CD44 in both cell lines ([138]Figure 5B). Epithelial-mesenchymal transition (EMT) is a key process during tumor metastasis, and E-cadherin and vimentin are regarded as important EMT markers.[139]^34 We found that silencing COL1A1 upregulated the expression of E-cadherin and downregulated the expression of vimentin.
Binding of collagen to integrin is mediated by TGFBI.[140]^35 In addition, TGFBI can activate matrix metalloproteinase (MMP) secretion and promote invasion.[141]^36 We found that silencing COL1A1 down-regulated the expression levels of TGFBI and MMP9. Silencing COL1A1 in 5637 and T24 cells also inhibited cell growth ([142]Figure 5C and [143]D) and colony formation ([144]Figure 5E), and reduced, to varying degrees, the invasion and migration of both cell lines ([145]Figure 5F). Taken together, these results show that COL1A1 promotes the malignant phenotypes of BCa cells, which is consistent with the poor prognosis of BCa patients with high COL1A1 expression.

Figure 5. COL1A1 is required for tumorigenesis and metastasis of BCa cells. (A) The mRNA levels of COL1A1 in COL1A1-silenced 5637 and T24 cells. (B) 5637 and T24 cells were transfected with 100 nM anti-COL1A1 siRNA, and the protein levels of COL1A1, CD44, E-cadherin, vimentin, Snail and TGFBI were analyzed by western blotting. (C) The effects of COL1A1 silencing on 5637 and T24 cell growth after transfection with the indicated siRNAs for 48 h. (D) Cell proliferation was measured using EdU immunofluorescence assays in COL1A1-silenced 5637 and T24 cells. (E) Colony formation assays of COL1A1-silenced 5637 and T24 cells. (F) Migration and invasion of COL1A1-silenced 5637 and T24 cells.

Discussion
Many studies have sought to clarify the mechanisms of BCa formation and progression, yet the incidence and mortality of BCa remain high. Because most efforts have concentrated on individual genes, little attention has been paid to integrated approaches that combine systematic bioinformatics methods with molecular biology techniques. In the present study, gene expression profiling was used to investigate the molecular mechanisms underlying BCa progression.
A set of 30 up-regulated and 15 down-regulated DEGs was identified between the NMIBC samples and the MIBC samples. Furthermore, the results of the KEGG analysis and GSEA both pointed to the ECM–receptor interaction signaling pathway. The ECM, a key component in the modulation of cancer cell invasion, is considered necessary to maintain the integrity of the impermeable bladder surface.[147]^37 The interaction between membrane receptors of tumor cells and ECM proteins plays a crucial role in tumor invasion and metastasis.[148]^38 Collagens, major components of the ECM, are involved in regulating tumor cell proliferation, migration, and invasion. COL1A1, a major component of collagen type I, has been identified as an oncogene in the progression of colon cancer.[149]^39 In addition, high COL1A1 and COL1A2 mRNA expression levels are significantly associated with survival in patients with NMIBC.[150]^40 Moreover, researchers have discovered that COL3A1 is involved in many clinical diseases and that it could function as a translocation partner gene in lipomatous tumors and influence cell migration and invasion in nasopharyngeal carcinoma progression.[151]^41^,[152]^42 It could also interfere with malignant development in renal cell carcinoma and act as one of the etiologically linked genes in aortic dissected aneurysm.[153]^43 COL5A2 is also widely studied, and previous evidence indicated that COL5A2 might be associated with the pathological processes of several human cancers.
For example, the expression of COL5A2 was found to be up-regulated in invasive breast cancer compared with ductal carcinoma in situ, indicating that COL5A2 is related to tumor progression in breast cancer.[154]^44 COL5A2 is also considered a diagnostic marker for osteosarcoma and might affect the growth of BC cells by ECM remodeling.[155]^45 COL6A1, an oncogene in the progression of cervical cancer, is highly correlated with poor prognosis in cervical cancer patients.[156]^46 In this study, we demonstrated upregulation of the ECM–receptor interaction signaling pathway through 6 collagen genes, COL1A1, COL1A2, COL5A2, COL6A1, COL6A2, and COL6A3, favoring NMIBC progression to MIBC. Our data reveal that the expression of the 6 collagen genes was highly correlated with high pathological stages, poor recurrence-free survival rates, and poor overall survival rates. These findings indicate potentially important roles of the 6 collagen genes in the diagnosis and prognosis of BCa patients. It has been reported that constitutive expression of SPP1 is involved in tumorigenesis and metastasis of BCa.[157]^47 We also found that SPP1, which is involved in the ECM–receptor interaction signaling pathway, was more highly expressed in MIBC and that its higher expression was correlated with advanced pathological stages of BCa and poor survival rates of patients with BCa ([158]Figure S3).

Figure S3. SPP1 was highly correlated with poor pathological stages of BCa and poor survival rates. (A and B) The mRNA levels of SPP1 were upregulated in MIBC compared to NMIBC in the Dyrskjot and the Sanchez-Carbayo bladder databases from Oncomine. (C and D) Kaplan-Meier analysis of overall survival and recurrence-free survival for BCa patients with high (the upper 25%) or low (the lower 25%) SPP1 expression (102 patients per subgroup).

To further explore the function and mechanism of collagen genes in promoting bladder cancer progression, we selected COL1A1 for further study.
The results showed that the suppression of COL1A1 decreased the growth, migration, and invasion of BCa cells. CD44 plays an important role in the ECM-receptor interaction signal pathway and is markedly increased in tumor cells cultured in 3D collagen scaffolds.[161]^48^,[162]^49 Our results showed that COL1A1 knockdown inhibited the expression of CD44. COL1A1 is closely related to EMT and the transforming growth factor β (TGF-β, a common stimulator of EMT) pathway. COL1A1 directly stimulates EMT of pancreatic,[163]^50 lung,[164]^51 and breast[165]^52 carcinoma cells via TGF-β signaling. TGFBI, which is induced by the TGF-β1 protein, is an ECM protein that activates matrix metalloproteinase (MMP) secretion. Logan A. et al showed that inhibiting the expression of COL1A1 with siRNA inhibited EMT and cell migration induced by TGF-β1 in HK-2 cells. EMT is characterized by changes in the expression of biochemical markers: a decrease in epithelial markers such as E-cadherin and an increase in mesenchymal markers such as vimentin and N-cadherin.[166]^34 First, we discovered that COL1A1 is a key effector of CD44 in the regulation of BCa progression. Further, we found that inhibiting the expression of COL1A1 activated the transcription of E-cadherin and inhibited the expression of vimentin. Finally, we determined that inhibiting the expression of COL1A1 also inhibited the expression of TGFBI and MMP9. Therefore, we believe that COL1A1 is an important EMT mediator and an effector of TGF-β-induced expression via interaction with CD44 in BCa progression. Interestingly, vimentin, TGFBI, and MMP9 are also among the significantly upregulated DEGs, which is consistent with the results of our cellular experiments. Meanwhile, relevant studies have shown that other collagen genes, such as COL1A2, COL5A2, and COL6A3, are important EMT mediators.[167]^53^–[168]^55 Our findings provide novel insights into NMIBC progression to MIBC.
In summary, the present study determined gene expression profiles and identified 6 collagen proteins (COL1A1, COL1A2, COL5A2, COL6A1, COL6A2, and COL6A3) involved in the ECM–receptor interaction signaling pathway in BCa progression. The 6 collagen genes may serve as independent, effective prognostic biomarkers for BCa. In addition, the 6 collagen genes may be independent potential indicators of whether NMIBC patients need further treatment after surgery to prevent the recurrence and progression of the disease.

Acknowledgments