Abstract Introduction Only a proportion of triple-negative breast cancer (TNBC) is immunotherapy-responsive. We hypothesized that the tumor microenvironment (TME) influences the outcomes of TNBC and investigated the relevant signaling pathways. Materials and Methods Immune score (IS) and stromal score (SS) were calculated using the ESTIMATE and correlated with the overall survival (OS) in TNBC. RNA-seq data from 115 TNBC samples and 112 normal adjacent tissues were retrieved. Validations in the methylation levels in 10 TNBC and five non-TNBC cell lines were obtained. Cox model overall survival (OS) validated the derived transcription factor (TF) genes in cBioPortal breast cancer patients. Results SS-low predicts a higher OS compared with SS-high patients (P = 0.0081 IS-high/SS-low patients had better OS (P = 0.045) than IS-low/SS-high patients. More macrophages were polarized to the M2 state in patients with IS-low/SS-high patients (P < 0.001). Moreover, CIBERSORTx showed more CD8+ cytotoxic T-cells in IS-high/SS-low patients (p = 0.0286) and more resting NK cells in the IS-low/SS-high TME (P = 0.0108). KEGG pathway analysis revealed that overexpressed genes were enriched in the IL-17 and cytokine-cytokine receptor interaction pathways. The lncRNA DRAIC, a tumor suppressor, was consistently deactivated in the 10 TNBC cell lines. On the cBioPortal platform, we validated that 13% of ER-negative, HER2-unamplified BC harbored IL17RA deep deletion and 25% harbored TRAF3IP2 amplification. On cBioPortal datasets, the nine altered TF genes derived from the X2K analysis showed significantly worse relapse-free survival in 2377 patients and OS in 4819 invasive BC patients than in the unaltered cohort. Conclusion Of note, the results of this integrated in silico study can only be generalized to approximately 17% of patients with TNBC, in which infiltrating stromal cells and immune cells play a determinant prognostic role. Keywords: triple-negative breast cancer, immune cells, tumor microenvironment, IL-17 signaling pathway, immune evasion, in silico study Introduction In 2020, the global incidence of breast cancer was 2,261,419 diagnosed across 185 countries, accounting for 11.7% of all cancer types.[32]^1 Triple-negative breast cancer (TNBC) is characterized as estrogen receptor-negative, progesterone receptor-negative, and a lack of HER2 expression or amplification and accounts for approximately 15–20% of all breast cancers.[33]^2^,[34]^3 TNBCs are heterogeneous (in terms of genomics, transcriptomics, and histopathology) and demonstrate the heterogeneity of response to anti-programmed death-1/ligand-1 (anti-PD-1/PD-L1) checkpoint inhibition immunotherapy.[35]^3–5 In fact, recent studies by Schmid et al revealed that PD-1 or PD-L1 checkpoint blockade immunotherapy combined with neoadjuvant chemotherapy (NAC) improved the pathological complete response (pCR) rate in the neoadjuvant setting; therefore, they proposed such an approach would probably improve the progression-free survival (PFS) of patients with advanced or metastatic TNBC staining positive for PD-L1 in tumor-infiltrating immune cells, if used as the first-line therapy.[36]^6–8 In contrast, other tumors are weakly immunogenic depending on the composition of the infiltrating immune cell populations and extrinsic factors (eg different metabolites or specific cytokines) enriched in the immune tumor microenvironment (TME). Of note, recent bioinformatics research, based on bulk tumor gene expression data demonstrated that a low abundance of regulatory CD4+ T cells (Treg) was significantly associated with an increased pCR rate in TNBC patients after NAC.[37]^9 In a TNBC surgical specimen, untreated tumor cells typically represent approximately 60% of the cellular component, lymphoid and immune cells account for 20%, and stromal cells such as fibroblasts, histiocytes, endothelial cells, myofibroblasts, and adipocytes represent the remaining 20%.[38]^10 The primary function of stromal cells is to establish an immune response. The TME creates a chemokine-rich milieu inside, promoting the encounter between the boundary tumor cells and a variety of surrounding infiltrating immune cells in addition to cancer-associated fibroblasts, neo-vessels, neo-neurites, and other supportive tissues.[39]^11 Prior research has demonstrated that the prognosis of TNBC in terms of disease-free survival and disease-specific survival is worse in the basal-like immunosuppressed subtype and fare better in the basal-like immune-activated subtype, indicating that the immune TME plays a crucial role in the formation of either a tumor-permissive or tumor-expulsive milieu[40]^4 A clinicopathological study also demonstrated that TNBC with a high number of tumor-infiltrating CD56-positive natural killer (NK) cells was associated with a more favorable disease-free survival.[41]^12 Additionally, research focusing on cancer-associated chemokines revealed that NK cells expressing abundant CXCR3 (also known as GPR9 and CD183) molecules on the cell surface are recruited by the chemokines CXCL9, CXCL10, or CXCL11 secreted by immune cells or stromal cells in the TME.[42]^13 However, it is still unclear what driving force explains the immunogenic or immunosuppressive phenotype in the TME in patients with TNBC. Recently, there have been marked improvements in the research efficiency using cross-platform in-silico bioinformatics analyses of multi-omics, multi-layer cancer data.[43]^4^,[44]^10^,[45]^14–17 Hence, here, we leveraged multi-platform in-silico analyses of genomic data to investigate which infiltrating immune cell populations would be associated with the immune phenotype, and which signaling pathways could determine a subgroup of TNBC that is strongly immunogenic, thus, having a better prognosis. Lastly, we also aimed to determine whether transcription factor signatures derived from the RNA-seq data have prognostic value. Materials and Methods Ethics Statement and Study Design This human data-based research study leveraged multiple publicly accessible RNA-seq datasets containing only mature, anonymous, and de-identified genetic and demographic data. The Institute Review Board of Kuang Tien General Hospital approved the study with a Certificate of Approval numbered KTGH-10458. This study was performed in accordance with the Declaration of Helsinki. The schematic diagram ([46]Figure 1) shows the study design and methodology adopted in this study. First, we downloaded from TCGA, a breast cancer gene expression RNA-seq dataset (TCGA-BRCA). Using this dataset, we sorted 115 patients with non-metaplastic TNBC, characterized by the lack of estrogen receptors, progesterone receptors, and HER2 non-overexpression or HER2-FISH un-amplification. Figure 1. [47]Figure 1 [48]Open in a new tab Study flow diagram. The schematic diagram represents the study design and methodology adopted. Estimation of the Immune/Stromal Scores As mentioned above, the gene expression data and corresponding clinical information from a total of 1222 cases were retrieved from a publicly available dataset (TCGA-BRCA).[49]^18 After extracting the ER, PgR, and HER2 information of each sample, a total of 115 TNBC cases, 112 normal cases, and 994 non-TNBC cases were identified. To predict TNBC purity, we applied the Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE) algorithm to the normalized expression matrix to determine the immune/stromal scores of each TNBC patients. The immune score (IS; derived from the immune signature of 141 genes) and stromal score (SS; derived from the stromal signature of 141 genes) were calculated using the r-ESTIMATE package in R.[50]^19 [51]Supplementary Table S1 presents a gene list of these immune-signature and stromal-signature genes. This algorithm rank-normalizes and rank-orders a set of gene expression values in a given sample and calculates the empirical cumulative distribution function from the signature gene set of the remaining genes; therefore, ESTIMATE can calculate the SS and IS from the RNA-seq data. The algorithm also allows the combination of these scores into an estimate score, used to infer DNA copy number-based tumor purity, as per the following formula: Fraction of tumor cells in a clinical sample (namely, transcriptomic-based tumor purity) = cos (0.6049872018 + 0.0001467884 × [combined stromal-immune score]). Correlation Analyses The overall survival (primary prognosis endpoint-based) was estimated using Kaplan-Meier survival analysis. Log rank test was used to compare the survivals stratified by stage. Based on the calculated SS and IS, the corresponding patients were classified into two groups, and their prognoses were individually examined. A previously published method (maximally selected rank statistics), was employed for optimal cutoff identification using the survival in R to explore the relationship between the overall survival and the two groups of TNBC cases. CIBERSORT Analysis We used CIBERSORT X ([52]https://cibersort.stanford.edu) to compare the proportion of infiltrating immune cells in the TME of IS-low/SS-high and IS-high/SS-low TNBC patients.[53]^20^,[54]^21 The CIBERSORT accurately allows the evaluation of the relative levels of 22 immune cell phenotypes; such and analysis was performed using the immunedeconv package in R.[55]^22 The immune cell fraction level divided by the cutoff value was 0 or 1 in the subsequent scoring formula. Differential Expression Analysis According to the ESTIMATE results, the intersection genes were selected based on the stromal/immune scores. The limma package in R was used to screen DEGs between normal and TNBC samples.[56]^23 Genes with a p-value ≤ 0.0001 and absolute log2 fold-change ≥ 4 were considered to be differentially expressed and extracted further for network construction. Heatmaps were generated using the pheatmap package in R.[57]^24 The generated heatmaps and volcano plots show the differentially expressed genes in either the IS-high/SS-low or IS-low/SS-high TNBC groups. Gene Ontology and KEGG Pathway Enrichment Analyses Using these DEGs defined as above for the two TNBC groups, we investigated the enriched signaling pathways based on GO terms and KEGG pathways. Functional enrichment analyses of GO terms, including cellular components, molecular functions, and biological processes, and KEGG pathways were performed using the clusterProfiler package in R.[58]^25 The functions among genes of interest were adjusted with a cutoff criterion of p < 0.05. eXpression2Kinases Analysis We utilized eXpression2Kinases (X2K), as reported elsewhere, to disclose the potential enrichment of transcription factors.[59]^26^,[60]^27 Gene Set Variation Analysis Gene set variation analysis (GSVA) is a bioinformatics framework that organizes gene expression data in the form of a pathway or signature summary.[61]^28 GSVA is a popular pathway-related immune infiltration and tumor mutational burden immune-related analysis. Here, we leveraged GSVA to provide an accurate definition of pathway enrichment between samples from different groups. We used the GSVA package in R to evaluate the t score and assigned the pathway activity conditions. We then used the pheatmap package in R to display the distinctions in pathway activation between normal tissues and those of IS-high/SS-low and IS-low/SS-high patients. Validation in TNBC Cancer Cell Lines and Non-TNBC Cancer Cell Lines In the Depmap platform ([62]https://depmap.org/portal/), we used the methylation dataset (1 kb upstream transcription start sites) to compare the lncRNA methylation between the TNBC cell lines and non-TNBC cell lines (ER+/HER2+ or HER2-negative). We defined fractional methylation > 0.6 indicates “methylated” and < 0.6 “unmethylated”. Ten TNBC cell lines used were HCC2157, BT20, MDAMB231, HCC1395, HCC1937, HCC1599, MDAMB436, MDAMB468, HCC1806, and HCC1143. Five non-TNBC cell lines were BT474, UACC812, EFM192A, EFM19, and ZR751. Validation in cBioPortal for Cancer Genomics Finally, we validated our findings in a different, more extensive dataset of breast cancer via cBioPortal analysis.[63]^29^,[64]^30 Statistical Analyses We used the R package statistical software to implement the survival analysis with relevant R functions as survfit(), survdiff(), coxfit1. In addition, we used the StatsDirect version 3.3 to compute the Mann–Whitney U-test and the grouped linear regression. Finally, the built-in default statistical software performed all other statistical analyses on each platform. Results From the TCGA Breast Cancer dataset, gene expression data for 115 patients with triple-negative breast cancer and 112 normal adjacent tissues were retrieved. [65]Figure 1 demonstrates the study design, flow, and methodology used in the current study. In this cohort of 115 TNBC patients, favorable overall survival as per the Kaplan-Meier curves was significantly correlated with an earlier disease stage (P = 0.0012; [66]Figure 2). Figure 2. [67]Figure 2 [68]Open in a new tab Overall survival of 115 patients with triple-negative breast cancer stratified by cancer stage at diagnosis. Kaplan-Meier curves referring to different overall TNBC stages. Log rank test was used to compare the survivals stratified by stage. The higher stage correlates significantly with poorer survival (P = 0.0012). The ESTIMATE algorithm computationally estimates the fraction of stromal and immune cells in the TME of all the 115 TNBC patients. The SS was high (SS-high) in 10 patients, and the IS was high (IS-high) in the ten other patients. There were no significant correlations between the SS or IS and the cancer stage in all the 115 patients ([69]Supplementary Figure S1). However, as per the Kaplan-Meier analyses, SS-low patients showed a higher overall survival (OS) than that of SS-high patients (P = 0.0081; [70]Supplementary Figure S2), while IS-high patients showed a higher OS than that of IS-low patients (P = 0.2; too few cases in the IS-high; [71]Supplementary Figure S3). Of note, a strong correlation between both the SS and IS and the patients’ overall prognosis was observed. Expectedly, when compared with IS-low/SS-high patients, IS-high/SS-low patients showed a better OS (P = 0.045) ([72]Figure 3). Cytoscape revealed the immune cell infiltration levels between samples grouped by IS-high/SS-low or IS-low/SS-high ([73]Supplementary Figure S4). Interestingly, grouped linear regression showed a statistically significant increase in M2 macrophages in TNBC patients with the IS-high/SS-low phenotype. In line with these results, in the tumor microenvironment (TME), a higher proportion of M2 was also found in IS-low/SS-high patients, as estimated by CYBERSORTx (P < 0.001) ([74]Supplementary Figure S5). Additionally, significantly more CD8+ cytotoxic T-cells, memory B cells (P = 0.0304), activated CD4+ memory T cells (P = 0.0056), follicular helper T cells (P = 0.0044), and activated NK cells (P = 0.0511) were also observed in IS-high/SS-low patients (14% vs 5.3%, p = 0.0143) ([75]Table 1). On the other hand, more resting NK cells were observed in IS-low/SS-high patients (P = 0.0108). Figure 3. [76]Figure 3 [77]Open in a new tab Overall survival of 115 patients with triple-negative breast cancer stratified by the combination of the Stromal Score and Immune Score. Both scores were inferred by the ESTIMATE gene expression signatures. SS-low/IS-high patients were associated with excellent overall survival, whereas SS-high/IS-low patients showed the worst overall survival. Table 1. Comparison of the Immune Cell Profiles in the Tumor Microenvironment (TME) of IS-High/SS-Low and IS-Low/SS-High Triple-Negative Breast Cancer Patients, Estimated Using CYBERSORTx Immune Cell Type in the TME Immune Score-High/Stromal Score-Low TNBC (n = 10) Immune Score-Low/Stromal Score-High TNBC (n = 10) P-value* Naive B cells 0.0363 ± 0.0532 0.0258 ± 0.0263 0.9288 Memory B cells 0.0168 ± 0.0121 0.0061 ± 0.0153 0.0304 Plasma cells 0.0162 ± 0.0376 0.0148 ± 0.0397 0.7599 CD8+ T cells 0.1403 ± 0.0934 0.0531 ± 0.0616 0.0143 Naive CD4+ T cells 0 0 - Resting CD4+ memory T cells 0.0956 ± 0.0557 0.1604 ± 0.1036 0.2176 Activated CD4+ memory T cells 0.0441 ± 0.0316 0.0176 ± 0.0159 0.0056 Follicular helper T cells 0.0700 ± 0.0235 0.0379 ± 0.0305 0.0044 Regulatory T cells 0.0517 ± 0.0409 0.0271 ± 0.0219 0.8767 Gamma delta T cells 0.0421 ± 0.0361 0.0259 ± 0.0268 0.2454 Resting NK cells 0 0.0199 ± 0.0395 0.0108 Activated NK cells 0.0360 ± 0.0257 0.0178 ± 0.0256 0.0511 Monocytes 0.0009 ± 0.0030 0.0164 ± 0.0428 0.2105 M0 macrophages 0.1196 ± 0.0540 0.1979 ± 0.1377 0.2799 M1 macrophages 0.1497 ± 0.0789 0.1357 ± 0.0670 0.8534 M2 macrophages 0.1043 ± 0.0628 0.1534 ± 0.0939 0.1431 Resting dendritic cells 0.0395 ± 0.0663 0.0211 ± 0.0178 0.5889 Activated dendritic cells 0.0001 ± 0.0004 0.0008 ± 0.0026 0.5 Resting mast cells 0.0344 ± 0.0214 0.0681 ± 0.0466 0.123 Activated mast cells 0 0 - Eosinophils 0.0008 ± 0.0028 0 > 0.9999 Neutrophils 0.0013 ± 0.0032 0.0002 ± 0.0006 0.582 [78]Open in a new tab Notes: Bold text denotes statistical significance. (%) Mean±SD; *Using Mann–Whitney U-test. CYBERSORTx: [79]https://cibersortx.stanford.edu/index.php. Moreover, DEG analysis showed 651 DEGs (284 upregulated, and 367 downregulated) in IS-high/SS-low patients, and 370 DEGs (187 upregulated, and 183 downregulated) in IS-low/SS-high patients ([80]Supplementary Figure S6). Heatmaps and volcano plots of the DEGs in the context of these two groups are shown in [81]Figure 4A–D. DESeq2 uses the Wald test to identify differentially expressed genes by comparing the immunogenic or immunotolerant classes and the normal adjacent tissue. To put things in context, we displayed the top 30 DEGs ranked by the Wald statistic for the four categories by the combined immune and stromal scores. In IS-high/SS-low TNBC tumors, the top 30 upregulated DEGs were LAG3, CCNE1, GBP5, CDCA8, ETV7, CDCA3, CXCL10, KCNJ10, IDO1, PTTG1, SKA1, CXCL11, IFNG, LOC105373098, TPX2, PLK1, WFDC21P, A2ML1, KIF2C, MMP11, IL21R, NDC80, IL4I1, AIM2, CXCL9, CDC20, CCL25, RTP3, NUF2, and CENPA; in contrast, the downregulated DEGs were MAMDC2, FAXDC2, VEGFD, TGFBR3, NOVA1, ADAMTS5, ECRG4, LMOD1, CAVIN2, ADAM33, TMTC1, DACH1, RERGL, MAB21L1, SLC7A2, AGTR1, PGR, SCN2B, PAMR1, MYH11, ATP1A2, ADAMTS9-AS2, CD300LG, CA4, CHL1, GPC3, ARHGAP20, CCL14, ABCA8, and ADGRL3. In IS-low/SS-high TNBC tumors, the top 30 upregulated DEGs were MMP11, CEMIP, COL11A1, COL10A1, CA9, LOC101929128, LOC157273, INHBA, CCL11, OLAH, CILP2, LYPD1, MMP1, HAPLN1, HOXB9, LINC02487, EPYC, GBP5, TDO2, CCKBR, CCN4, CXCL11, LIPG, LINC01615, ANLN, GJB2, LINC00511, LINC00673, CXCL10, and SHISAL1; whereas, the downregulated DEGs were ABCA10, CAVIN2, ECRG4, SCN2B, ANGPTL7, VEGFD, MYH11, CD300LG, SCARA5, HPSE2, MYOC, HEPACAM2, LINC00993, HIF3A, SLC16A12, ATP1A2, PGM5-AS1, BTNL9, TNXB, ANKRD30A, GDF10, PGR, PTPRQ, B3GALT1, NPY2R, PLPPR1, HSD17B13, DEFB132, LEP, and GLYAT. Figure 4. [82]Figure 4 [83]Open in a new tab Heatmaps and volcano plots of the differentially expressed genes in SS-low/IS-high (A and B) and SS-high/IS-low (C and D) TNBC patients, and the corresponding KEGG pathway analysis (E). We labeled the DEGs satisfying the conditions of having an adjusted P-value < 0.0001 and |log2FoldChange| ≥ 5 in panels (B and D). The overexpressed DEGs in the context of both phenotypes are enriched in the IL-17 and cytokine-cytokine receptor interactions’ signaling pathways. KEGG pathway enrichment analysis showed that the overexpressed DEGs from both phenotypes were enriched in the IL-17 signaling pathway and viral protein interaction cytokine and cytokine receptor genes. Notably, the overexpressed DEGs in the context of the SS-low/IS-high phenotype were also enriched in other two cytokine-related pathways: cytokine-cytokine receptor interactions pathway and chemokine signaling pathway ([84]Figure 4E). Using X2K, we also inferred the transcription factors associated with the two immune phenotypes. In SS-high/IS-low TNBC patients, the inferred TFs were PPARG, HNF4A (also known as farnesoid X receptor, FXR), NR1H4, NR0B2, and MLXIPL, whereas the TF signature of SS-low/IS-high TNBC patients was composed of the PPARG, CEBPA, and MLXIPL TFs ([85]Figure 5). Additionally, we performed a literature search on PubMed and discovered nine TF genes linked to IL-17-mediated signaling, including PPARG, CEBPA, MEOX1, KLF15, CD36, ZNF750, EZH2, HNF4A, and NR0B2 ([86]Table 2). Figure 5. [87]Figure 5 [88]Open in a new tab Transcription factors inferred from the X2K in the context of SS-high/IS-low or SS-low/IS-high TNBC patients. Table 2. Selection of the Transcription Factors Linked to Interleukin-17-Mediated Signaling Transcription Factor (TF) Investigated Relationship References