Abstract Breast cancer is a heterogeneous disease whose subtypes represent different histological origins, prognoses, and therapeutic sensitivity. But there remains a strong need for more specific biomarkers and broader alternatives for personalized treatment. Our study classified breast cancer samples from The Cancer Genome Atlas (TCGA) into three groups based on glycosylation-associated genes and then identified differentially expressed genes under different glycosylation patterns to construct a prognostic model. The final prognostic model containing 23 key molecules achieved exciting performance both in the TCGA training set and testing set [32]GSE42568 and [33]GSE58812. The risk score also showed a significant difference in predicting overall clinical survival and immune infiltration analysis. This work helped us to understand the heterogeneity of breast cancer from another perspective and indicated that the identification of risk scores based on glycosylation patterns has potential clinical implications and immune-related value for breast cancer. Keywords: breast cancer, glycosylation, prognosis, subtype, biomarkers, immune Introduction Breast cancer has reached the highest incidence in women’s cancer types, and its lethality has reached second place, followed by lung cancer ([34]Sung et al., 2021). As a heterogeneous disease, breast cancer’s multiple subtypes represent different histological origins, prognoses, and therapeutic sensitivity ([35]Perou et al., 2000; [36]Cancer Genome Atlas Network, 2012; [37]Curtis et al., 2012; [38]Marusyk et al., 2012). The pathological markers estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor 2 (HER2) stratified patients with various treatment selecting, such as hormonal therapy (e.g., Tamoxifen) and HER2-targeted therapy (e.g., Trastuzumab) ([39]Goldhirsch et al., 2013). Of note, HER2 is characterized by poor prognosis and has multiple sites of N-glycosylation, whose presence is linked with function ([40]Peiris et al., 2017). Subsequently, intrinsic molecular subtyping based on expression profile highlights the intricate complexity of this cancer type and the importance of genomic/transcriptomic analyses for prognostic prediction. PAM50 utilizes a 50 genes system that classifies breast cancer into luminal A, luminal B, HER2-enriched, and basal-like subtype that involves not only diverse biological processes but also has prognostic significance ([41]Prat et al., 2012; [42]Prat et al., 2015). The highly heterogeneous of breast cancer requires a strong need for more specific biomarkers and broader alternatives for personalized treatment. Meanwhile, efforts to classify established histological subtypes have been carried out, which identified at least four distinct subtypes of ER-negative and six triple-negative subtypes ([43]Teschendorff et al., 2007; [44]Lehmann et al., 2011). According to recent reports, researchers are seeking a multi-angle classification approach to identify diversified functional clustering and signatures, such as glycolysis ([45]Zhang et al., 2020a; [46]Jiang et al., 2021), autophagy ([47]Zhang et al., 2020b; [48]Jiang et al., 2022), ferroptosis ([49]Wang et al., 2021), stemness ([50]Li et al., 2020), and immune microenvironment ([51]Shen et al., 2020). All these attempts allow us to make more defined and precise characterizations based on new parameters to drive the heterogeneity landscape of breast cancer and put forward new ideas in prognostic prediction and treatment in the future. Glycosylation is defined as a biosynthetic enzymatic process characterized by the covalent attachment of single sugar or glycans to a wide range of target proteins ([52]Pinho and Reis, 2015; [53]Eichler, 2019). As a post-translational modification, they play an essential role in almost all aspects of the life processes of cells, such as cell cycle, proliferation, and aging ([54]Mallard and Tiralongo, 2017; [55]Gudelj et al., 2018; [56]Gao et al., 2021). The glycosylation pattern is profoundly altered during tumorigenesis. Among them, O-glycan truncation, sialylation, fucosylation, and N-glycan branching are common types of glycosylation in cancer ([57]Drake et al., 2015; [58]Kölbl et al., 2015; [59]Kudelka et al., 2015; [60]Taniguchi and Kizuka, 2015), leading to the occurrence of malignant phenotypes such as cell adhesion, metastasis, epithelial–mesenchymal transitioning, and even the shifting of the tumor microenvironment ([61]Günthert et al., 1991; [62]Rabinovich and Toscano, 2009; [63]Pinho et al., 2011; [64]Paredes et al., 2012; [65]Pinho et al., 2013). Researchers have also identified glycosylation-related molecules as biomarkers for cancer diagnosis and prognostics evaluation. For instance, prostate-specific antigen (PSA) in prostate cancer ([66]Gilgunn et al., 2013), carcinoma antigen 125 (CA125/MUC16) in ovarian cancer ([67]Zurawski et al., 1988), CA19-9 and carcinoembryonic antigen (CEA) in colon cancer ([68]Goldstein and Mitchell, 2005), and aberrantly glycosylated MUC1 (also known as CA15-3) in breast cancer ([69]Kumpulainen et al., 2002). More recent studies have mapped the histopathological orientation and tissue distribution of N-linked glycans in clinical breast cancer tissues ([70]Scott et al., 2019a; [71]Scott et al., 2019b), which deepen the understanding of the heterogeneity of breast cancer from the perspective of glycosylation. Our study classified breast cancer samples from The Cancer Genome Atlas (TCGA) into three groups based on glycosylation-associated genes and then identified differentially expressed genes under different glycosylation patterns to construct a prognostic model. Finally, a model containing 23 risk signatures was built and performed favorable predicting efficacy in training and testing cohorts, and the evaluation of immune infiltration and immunotherapy response were analyzed as well. Results Classification of BRCA based on the glycosylation-related gene sets [72]Figure 1 shows the workflow of our study. The TCGA column of [73]Table 1. We classified TCGA-BRCA samples (n = 1,104) based on 179 glycosylation-related genes (GRGs) performed by consensus clustering analysis. Related clustering parameters are shown in [74]Figures 2A–C, [75]Supplementary Figure S1A, and [76]Supplementary Figure S2A. Considering the complexity of grouping and the feasibility of subsequent analysis, we choose the optimal grouping when k = 3. Thus, we obtain three glycosylation-based clusters. We used t-SNE ([77]Figure 2D) and PCA ([78]Supplementary Figure S2B) dimensional reduction methods to observe that the samples had favorable overall differences under this grouping. Cluster 3 exhibited shorter overall survival (OS), indicating a poorer prognosis compared with clusters 1 and 2. (p < 0.05) ([79]Figure 2E). In brief, this grouping method based on intracellular glycosylation status has specific differences in breast cancer samples and has substantial clinical value. FIGURE 1. [80]FIGURE 1 [81]Open in a new tab Workflow of our study design. TABLE 1. Clinical information of TCGA, [82]GSE42586, [83]GSE58812. TCGA [84]GSE42568 PMID: 23740839 [85]GSE58812 PMID: 25887482 Sample Tumor 1,109 104 107 Normal 113 17 0 Survival Dead 144 35 29 Alive 933 69 78 Age <60 575 59 64 ≥60 502 45 43 Grade I — 11 — II — 40 — III — 53 — Stage I 179 11 — II 609 40 — III 246 53 — IV 19 0 — Unknown 24 0 — Subtype Luminal A 497 — — Luminal B 197 — — Basal 171 — — Her2 77 — — Unknown 135 — — ER expression Positive — 67 — Negative — 34 — [86]Open in a new tab Her2, human epidermal growth factor receptor 2; ER, estrogen receptor. FIGURE 2. [87]FIGURE 2 [88]Open in a new tab Consensus clustering classification of BRCA based on glycosylation-associated genes. (A) Optimal cluster distinction by consensus matrix (k = 3). (B) Empirical cumulative distribution function (CDF) plot displayed consensus distributions for each k. (C) Delta area plot. (D) T-SNE clustering of sample distributions based on glycosylation-related genes. (E) KM survival analysis of three glycosylation-based groups. Screening of differentially expressed genes We classified BRCA tumor samples into three clusters based on glycosylation patterns. Next, we screened the DEGs of these three clusters using the “Deseq2” R package. [89]Supplementary Figures S2C–E show the PCA map and DEGs heatmap between the three clusters. [90]Figure 4 shows the differential analysis volcano plot of group 1 to group 2 ([91]Figure 4A), group 2 to group 3 ([92]Figure 4B), and group 1 to group 3 ([93]Figure 4C). We made a Venn diagram for the three groups of differential genes to show their overlap ([94]Figure 4D). The genes contained in each unit are shown in [95]Supplementary Table S1, and the genes that show differences under one grouping are included in the next analysis. Finally, 1915 DEGs ([96]Supplementary Table S1) were obtained and used to construct a prognostic risk-scoring model. FIGURE 4. [97]FIGURE 4 [98]Open in a new tab Construction of lasso regression model. Volcano plot of differentially expressed genes between cluster 1 vs. cluster 2 (A), cluster 2 vs. cluster 3 (B), and cluster 1 vs. cluster 3 (C) in BRCA. (D). Venn diagram of differentially expressed genes between glycosylation-based groups. (E). Cross-validation plot for the penalty term λ based on differentially expressed genes. Vertical bars represent acceptable maximum and minimum λ values with corresponding mean-squared error and the number of covariates. (F) Plots for lasso regression coefficients over different values of the penalty parameter λ. Immune characteristics of glycosylation-related groups To explore the correlation between glycosylation patterns and immune characteristics, we analyzed the immune correlates of the three clusters. [99]Figures 3B and C show significant differences in the immune score, stomal score, and immune cell infiltration. Cluster 3 demonstrated the lowest immune and stomal score and the poorest immune cell infiltration. Cluster 2 had the highest immune score and modest stomal score, and the immune cell infiltration was also the most abundant. Cluster 1 had the mediocre immune score and highest stomal score, and the immune cell infiltration was modest. FIGURE 3. [100]FIGURE 3 [101]Open in a new tab Differences in immune characteristics of glycosylation-based groups. (A) PCA clustering of sample distributions in immune signatures between three glycosylation-based groups. (B) Stomal score and the immune score of glycosylation-based groups (ESTIMATE algorithm). (C) Differences in 24 TME infiltration cells between glycosylation-based groups (ssGSEA) (****p < 0.0001). Construction and efficacy of risk-scoring model To further construct a prognostic risk-scoring model without redundant genes, we used lasso regression to narrow down the range of candidate genes. According to mean-square error ([102]Figure 4E) and coefficients ([103]Figure 4F), we opted for the former λ as it results in a better prediction efficiency than the latter λ. Then, we fitted a multivariate Cox proportional hazard model to develop more valuable integrated molecules in the training set. Patients’ age, stage, and 23 genes were included in this model, with a concordance index of 0.87 (Log-rank P: 4.48e-43) ([104]Figure 5A). [105]Figure 5B arranged the sample from low to high according to the risk score. The proportion of deaths increased as risk scores rose. The 23 key molecule expression is also shown at the bottom. Its area under the ROC curve (AUC) in 1, 3, and 5 years prior to death was 0.89, 0.90, and 0.89, respectively ([106]Figure 5C). Kaplan–Meier (KM) analysis showed a significant difference in overall survival (p < 0.0001) ([107]Figure 5D). FIGURE 5. [108]FIGURE 5 [109]Open in a new tab Construction of multivariate Cox regression. (A) Multivariate Cox proportional hazard model based on lasso de redundant gene set in the TCGA training set. (B) Proportion of deaths in the training set in high- and low-risk groups as risk score values increased. Top: red, high-risk; blue, low-risk. Middle: red, death; blue, alive. Bottom: hierarchical clustering of 14 key molecules between low- and high-risk groups. (C) COX risk score’s time-dependent ROC curves for 1, 3, and 5 years before death in the TCGA training set. In the training set, (D) Kaplan–Meier survival analyses for COX low- and high-risk groups. (p < 0.0001, log-rank test). Validating of risk-scoring model predicting efficacy We choose two breast cancer cohorts from GEO to validate the efficacy of this protistic model. The [110]GSE42582 column of [111]Table 1. In [112]GSE42568 cohort, AUC in 1, 3, and 5 years prior to death was 0.73, 0.82, and 0.88, respectively ([113]Figure 6A), and KM analysis presents a significant difference (p < 0.0001) ([114]Figure 6B). The [115]GSE58812 column of [116]Table 1. In [117]GSE58812 cohort, AUC in 1, 3, and 5 years prior to death was 0.95, 0.77, and 0.79, respectively ([118]Figure 6C), and KM analysis presents a significant difference (p = 6e-04) ([119]Figure 6D). FIGURE 6. [120]FIGURE 6 [121]Open in a new tab Predicting the efficacy of constructed multivariate Cox regression in the testing set. (A) COX risk score’s time-dependent ROC curves for 1, 3, and 5 years before death in testing cohort [122]GSE42568. (B) Kaplan–Meier survival analyses for COX low- and high-risk groups in testing cohort [123]GSE42568 (p < 0.0001, log-rank test). (C) COX risk score’s time-dependent ROC curves for 1, 3, and 5 years before death in testing cohort [124]GSE58812. (D) Kaplan–Meier survival analyses for COX low- and high-risk groups in testing cohort [125]GSE58812. (p < 0.0001, log-rank test). Risk score related immune infiltration and immunotherapy evaluation We calculated a risk score for each sample according to the expression levels and regression coefficients and divided the BRCA cohort into low- and high-risk groups by median. To better investigate the interactions between the risk score and the immune microenvironment, we performed the ESTIMATE algorithm and ssGSEA to evaluate the correlation between the prognostic model and immune infiltrating in BRCA patients. [126]Supplementary figure S3A shows PCA clustering of immune signatures. The low-risk group demonstrates a higher immune score but no difference in the stomal score ([127]Supplementary figure S3B). In terms of immune cell infiltration ([128]Figures 7B, [129]8C), the risk score was slightly negatively correlated with immune cell level. The low-risk group represents a more significant fraction of activated B cells, eosinophils, mast cells, activated CD8^+ T cells, natural killer cells, and effector memory CD8^+ T cells but no difference in neutrophils, T follicular helper T cells, type 2 T helper cells, and type 17 T helper cells. Then, we used TIDE, an online tool, to evaluate immune checkpoint blockade (ICB) response for our screened signatures based on the TCGA and PRECOG cohorts. According to [130]Figure 9, the gene set we input obtained almost equivalent area under the curve (AUC) as other predicting scores, especially CD274, CD8, IFNG, and Merck 18. FIGURE 7. [131]FIGURE 7 [132]Open in a new tab Immune characteristics in high- and low-risk groups. (A) Risk signatures expression in high- and low-risk groups. (B) Differences in 24 TME infiltration cells between high- and low-risk groups (ssGSEA) (*p < 0.05; **p < 0.01; ***p < 0.001, ****p < 0.0001). FIGURE 8. [133]FIGURE 8 [134]Open in a new tab Immune infiltration status of prognostic signature. (A) Risk genes level in high- and low-risk groups. (B) Correlation between risk genes and checkpoint molecule expression. (C) Correlation between riskscore and immune infiltration. FIGURE 9. [135]FIGURE 9 [136]Open in a new tab Biomarker evaluation from TIDE (Tumor Immune Dysfunction and Exclusion). 23 Gene signatures investigation We further investigated the correlation between 23 gene signatures and immune cell infiltration. Compared with the low-risk group, the high-risk group harbors a low level of SPPL2C, IGKV2D-24, IGLC2, QRFPR, LINC01871, FABP7, [137]AP000851.2, CLIC6, ILOVL2, FYB2, CDHR4, GNG4, TBR1, [138]AC015910.1, and UPK1B and a high level of PXDNL. ([139]Figure 7A). LINC01871, IGLC2, IGKV2D-24, MLIP, LINC01235, and [140]AP000851.2 positively correlated with immune cell infiltration, and GNG4, PXDNL, KCNK3, ELOVL2, FYB2, SPPL2C, CLIC6 negatively correlated with immune cell levels. The main types of immune cells with different infiltrating were activated CD4^+ T cells, activated CD4^+ T cells, natural killer T cells, activated B cells, activated dendritic cells, and MDSC. ([141]Figure 8A). In addition, LINC01871 and IGLC2 positively correlated with immune checkpoint molecules such as PD-1, PDL1, CTLA4, TIGIT, LAG3, and BTLA and negatively correlated with HAVCR2. FYB2, SPPL2C, ELOVL2, CLIC6, IGKV2D-24, L1CAM, and [142]AP000851.2 ([143]Figure 8B). Materials and methods Data collection The Breast Cancer (BRCA) data from The Cancer Genome Atlas Program (TCGA) was accessed via UCSC Xena ([144]http://xena.ucsc.edu/). A total of 179 genes encoding glycosylation enzymes, targets, and regulators were obtained from previous literature ([145]Krushkal et al., 2017) and are listed in [146]Supplementary Table S1. Consensus clustering analysis based on glycosylation-related genes BRCA samples from TCGA were grouped into three clusters using the “ConsensusClusterPlus” (version1.60.0) R package ([147]Wilkerson and Hayes, 2010) based on glycosylation-related genes (GRGs) (maxK = 4, innerLinkage = “complete”). “Fpkm” format was used for clustering analysis and “count” for difference analysis. Principal component analysis (PCA) and t-SNE were applied to assess sample clustering using the “FactoMineR” (version2.4) and Rtsne (version0.16) packages. “DESeq2” (version1.36.0) R package was used for screen differentially expressed genes (DEGs) between different clusters (|logFC| > 2, FDR <0.05). The prognostic risk-scoring model constructed through GRGs-based clusters First, the most minor absolute shrinkage and selection operator (LASSO) removed redundant genes achieved using the “glmnet” (version 4.1-4) R package. Ten-fold cross-validation was used to select the penalty term, λ. The mean-squared error was computed for the test data to measure the fitted models’ predictive performance. Then, 38 genes ([148]Supplementary Table S1) were obtained for prognostic Cox regression construction using the “My.stepwise” (version 0.1.0) package to establish the optimal model. Finally, the 23 retained genes were used for calculating risk scores according to the following formula: [MATH: RiskScore=i=0n(Coefix< mi>i), :MATH] (1) where [MATH: Coefi :MATH] is the coefficient, and [MATH: xi :MATH] is the z-score-transformed relative expression value of each selected gene. The time-dependent receiver operating characteristic (ROC) curve evaluated each model’s sensitivity and specificity. The “survival” (version 3.3-1) R package was used, and the Kaplan–Meier (KM) overall survival curves between different clusters and risks were performed using the “survival” R package. Immune infiltrates analysis The single-sample gene-set enrichment analysis (ssGSEA) was used to establish the relative abundance of 24 cell infiltration, which was analyzed using the “GSVA” (version 1.44.2) package. The ESTIMATE algorithm calculated stomal scores and immune scores of high- versus low-risk groups and different GRGs-based clusters. Immune checkpoint blockade (ICB) predicting evaluation performed by biomarker evaluation module from TIDE (Tumor Immune Dysfunction and Exclusion: harvard.edu)">[149]http://tide.dfci.harvard.edu/) (harvard.edu)), a computational method to model tumor immune evasion and ICB response and resistance regulators. Hub-genes analysis Immune Infiltrates differences of prognostic hub-genes were performed using ssGSEA, as mentioned earlier. Checkpoints correlation was analyzed using the ‘Hmisc’ (version 4.7-0) package. All the statistical significance sets as p < 0.05 with two-side. Data processing and visualization were performed using R version 4.1.2. Discussion The role of glycocalyx–the extracellular carbohydrate coat, has been proposed in breast cancer occurrence and development since the 1950s ([150]Aub et al., 1963). Then, it was noteworthy that plant lectin and carbohydrate motif binding proteins showed a higher affinity for malignant cells than normal cells in the 1960s ([151]Remmele et al., 1986). By the 1980s, biochemists found that the enzyme-linked lectin binding assay could be used to predict tumor differentiation and therapeutic reactivity ([152]Parodi et al., 1982). Shortly afterward, it was widely accepted that glycosylation status alteration could be used as biomarkers for breast cancer prognosis and tumor burden ([153]Springer, 1997; [154]Lin et al., 2002; [155]Duffy et al., 2010). Given the heterogeneity of breast cancer, more recent studies have mapped the histopathological orientation and tissue distribution of glycosylated modifications in clinical breast cancer samples. So far, the influentially changed landscape of glycosylation processes in breast cancer is vividly portrayed. We obtained a set of glycosylation-related genes containing 181 molecules from previous pieces of literature, including glycosylation pathways, genes encoding glycosylation targets or regulators, and members of cancer pathways affected by glycosylation ([156]Supplementary Table S1) ([157]Krushkal et al., 2017). In our study, TCGA-BRCA tumor samples were divided into three groups. We can consider three different glycosylated states based on these glycosylation-related genes by using consensus clustering analysis. There were significant differences in the expression patterns of glycosylated genes between them, and the survival analysis also reflected the difference in survival time under different glycosylated states ([158]Figures 2D and E). It is well-documented that an altered “glycan coat” is a distinct hallmark of cancer. Given that immune cells express a large variety of lectin (glycan-binding receptors), they recognize glycans on the tumor cell. Those immune cells can sense and respond to changes in the glycan signature of their environment. This often leads to tumor immune escape and immunomodulation. Therefore, the glycosylation-related signatures could affect tumor-immune cells’ connections within the tumor microenvironment ([159]Rodríguez et al., 2018; [160]Lopes et al., 2021). In addition, a variety of recruited stomal components–transformed parenchyma and the associated stroma–are involved in tumor progression and response to treatment ([161]Arneth, 2019; [162]Hanahan, 2022). We further analyzed the immune characteristics of glycosylation-based groups. According to our results, group 3 demonstrated the lowest immune and stomal score and the poorest immune cell infiltration; group 2 had the highest immune score and modest stomal score, and the immune cell infiltration was also the most abundant. This indicates that group 2 tends to the glycosylation pattern of immune cells, group 1 of stromal cells, and group 3 of malignant cells ([163]Figures 3A–C). In combination with the survival analysis of [164]Figure 2E, we were surprised to find that in terms of glycosylation pattern, the glycosylation mode of tumor cells and immune cells did not show any difference in patient survival, while the glycosylation of stromal cells may have a significant impact on patients’ survival. In future explorations of tumor microenvironment glycosylation, focusing on stromal cells may be a more effective research direction. These results prove that the classification based on glycosylation is meaningful and effective, which helps us to understand the heterogeneity of breast cancer from another perspective. However, at present, the classification samples are limited. Increasing the sample size will help formulate a more stable grouping method and hopefully be applied to clinical prognosis and prediction. The change of glycosylation pattern in tumor cells and immune microenvironment will affect the expression of other critical genes and make their corresponding bioprocesses abnormal, thus, inducing the transformation of malignant phenotypes, such as proliferation, epithelial–mesenchymal transition, and apoptosis resistance. To identify the prognostic genes influenced by glycosylation processes, we screened the DEGs of these three groups and constructed a predictive risk model through lasso and Cox regression calculation. The final prognostic model containing 23 key molecules achieved exciting performance both in the TCGA training set and testing set [165]GSE42568 and [166]GSE58812 ([167]Figures 5C and D, [168]Figure 6). Using the model algorithm, we calculated a risk score and divided the sample into high- and low-risk groups by the median. This risk score also showed a significant difference in predicting overall clinical survival and immune infiltration ([169]Figures 7B, [170]8C). Great achievement has been obtained in ICB-based immunotherapies ([171]Chen et al., 2020). In order to obtain better clinical remission and fewer immune-related adverse events, researchers are committed to developing biomarkers to screen an effective population accurately. The reported measures that can be used to predict the efficacy of ICI therapy include immune cell infiltration ([172]Cogdill et al., 2017), protein expressions such as PD-L1 ([173]Teng et al., 2015), mutations and neoantigens ([174]Mcgranahan et al., 2016), and genetic and epigenetic characteristics ([175]Ascierto et al., 2012). On the TIDE prediction website, our gene set shows a favorable performance compared with the existing evaluation methods ([176]Figure 9), which proves that our model has practical proficiency and value for further exploration and improvement in immunotherapy prediction. Then, we move on to several single prognostic genes. LINC01871 significantly lower expression in the high-risk group and positively correlated with most of the immune cell infiltration ([177]Figure 8). This suggests that LINC01871 may play a protective role in breast cancer. According to a recent review of the literature, LINC01871 has been identified by several studies in breast cancer through bioinformatic measurement involving the cellular phenotype of autophagy ([178]Li et al., 2021; [179]Wu et al., 2021; [180]Jiang et al., 2022; [181]Luo et al., 2022), stemness ([182]Li et al., 2020), immune response ([183]Ma et al., 2020; [184]Mathias et al., 2021), ferroptosis ([185]Xu et al., 2021), and lipid metabolism ([186]Shi et al., 2022). IGLC2 has a similar expression and functional pattern to LINC01871 in our study ([187]Figure 8). [188]Chang et al. (2021) found in a study of triple-negative breast cancer (TNBC) cohort that a high expression of IGLC2 was related to a favorable prognosis for TNBC patients, which is consistent with our results. In addition, IGLC2 is linked with the proliferation, migration, and invasion of MDA-MB-231 cells. Pathway enrichment analysis showed that IGLC2 is related to the extracellular matrix–receptor interaction ([189]Chang et al., 2021). All these features make IGLC2 have the potential to be a biomarker to predict prognosis, even for identifying breast cancer patients who can benefit the most from immune checkpoint blockade treatment. ELOVL2 is another prognostic signature in our results. Studies have shown that long noncoding RNA on its antisense chain (ELOVL2-AS1) correlates with breast cancer prognosis. The predictive efficacy of ELOVL2 needs to be verified in a larger sample size, and its mediated cell function also needs to be further explored. Acknowledgments