Abstract
   Long non-coding RNAs (lncRNAs) play critical roles in tumor immunity;
   however, the functional roles of immune-related lncRNAs in breast
   cancer (BC) remain elusive. To further explore the immune−related
   lncRNAs in BC, whole genomic expression data and corresponding clinical
   information were obtained from multiple BC datasets. Based on
   correlation with the immune-related genes within the training set, we
   screened out the most promising immune-related lncRNAs. Subsequently,
   Lasso penalized Cox regression analysis followed by stepwise
   multivariate Cox regression analysis identified six survival-related
   lncRNAs ([29]AC116366.1, [30]AC244502.1, [31]AC100810.1, MIAT,
   [32]AC093297.2, and [33]AL356417.2) and constructed a prognostic
   signature. The cohorts in the high−risk group had significantly poor
   survival time compared to those in the low−risk group. In addition, a
   nomogram integrated with clinical features and the prognostic signature
   was developed on the basis of the training set. Importantly, all the
   findings had a similar performance in three validated datasets. In the
   following studies, our integrative analyses indicated that the
   infiltration of CD8-positive (CD8) T cells associated with a good
   prognosis was strikingly activated in the low−risk group. To further
   provide an interpretation of biological mechanisms for the prognostic
   signature, we performed weighted gene co−expression network analysis
   (WGCNA) followed by KEGG pathway-enrichment analysis. Our results
   showed that the antigen presentation pathway involved in protein
   processing in endoplasmic reticulum and antigen processing and
   presentation was markedly altered in the high-risk group, which might
   promote tumor immune evasion and associate with poor clinical outcomes
   in BC patients with high risk scores. In conclusion, we aimed to take
   advantage of bioinformatics analyses to explore immune−related lncRNAs,
   which could function as prognostic indicators and promising therapeutic
   targets for BC patients.
   Keywords: breast cancer, immune infiltration, long non-coding RNA, gene
   expression omnibus, the cancer genome atlas, prognostic signature
Introduction
   Breast cancer (BC) is one of the most frequent malignant tumors and a
   leading cause of cancer death among females ([34]Siegel et al., 2019).
   Although comprehensive treatments including surgery, radiotherapy, and
   chemotherapy have been well supplied, the prognosis of BC patients is
   not satisfactory ([35]Schwartz and Erban, 2017; [36]Emens, 2018).
   Therefore, it is urgent to have an insight into novel prognostic and
   diagnostic markers to further develop novel therapeutic approaches for
   BC patients.
   The tumor microenvironment (TME) is a complicated system, consisting of
   not only tumor cells but also tumor-associated normal epithelial and
   stromal cells, immune cells, and vascular cells ([37]Junttila and de
   Sauvage, 2013). Emerging evidence has identified that dysfunctional
   immune status in TME is a hallmark of cancers ([38]Yu et al., 2018).
   Infiltrating immune cells in TME play critical roles in the initiation
   and progression of cancer, and positively or negatively influence
   patient prognosis, which is primarily due to tumor heterogeneity and
   distinct populations of tumor-infiltrating immune cells ([39]Talmadge
   et al., 2007). The infiltration of regulatory T (Treg) cells inhibits
   endogenous immune responses against tumors and correlates with poor
   prognosis ([40]Joshi et al., 2015), whereas dendritic cell (DC)
   infiltration is typically associated with a positive clinical outcome
   associated with their capacity to initiate and regulate both innate and
   adaptive immunity ([41]Saxena and Bhardwaj, 2018). Myeloid-derived
   suppressor cells (MDSCs) could prevent cytotoxic T lymphocytes (CTLs)
   from binding to the peptide–MHC complex and therefore inhibit antitumor
   activity ([42]Nagaraj et al., 2007). In contrast, infiltration of
   CD8-positive cytotoxic T lymphocytes (CD8 T cells) into solid tumors is
   associated with good prognosis in various types of cancer, including BC
   ([43]Jiang and Shapiro, 2014; [44]Li et al., 2019). CD4-positive T
   cells, namely, T follicular helper (TFH) cells, have two predominant
   cell subtypes, Th1 and Th2. In general, infiltration of TH1 cells
   predicts a positive outcome, whereas TH2 cells predict a negative
   outcome ([45]Giraldo et al., 2014; [46]Pradhan et al., 2014). Natural
   killer (NK) cells, lymphocytes of the innate immune system, are
   essential for defense against infectious pathogens and nascent
   malignant cells. Classically, the CD56dim NK cell subset is considered
   to mediate antitumor responses, whereas the CD56bright NK cell subset
   is involved in immunomodulation. Nevertheless, current studies
   demonstrated that brief priming with IL-15 strikingly enhanced the
   antitumor response of CD56 bright NK cells ([47]Wagner et al., 2017).
   Given that clinical outcomes of cancer patients were frequently
   associated with the infiltration of immune cells in TME, current
   diagnostic and therapeutic strategies are mainly focused on the
   analysis and manipulation of infiltrating immune cellular populations
   ([48]Talmadge et al., 2007; [49]Tamborero et al., 2018).
   Long non-coding RNAs (lncRNAs), longer than 200 nucleotides, play
   crucial roles in the occurrence and progression of cancers ([50]Iyer et
   al., 2015). Previous studies have shown that lncRNAs could interact
   with other biomolecules, such as proteins, regulatory DNA regions, and
   miRNAs, and therefore regulate gene expression at multiple levels,
   including epigenetic, transcriptional, and post-transcriptional ([51]Wu
   et al., 2018). Recently, mounting studies found that the lncRNAs also
   play a pivotal role in cancer immunity ([52]Hu et al., 2013; [53]Heward
   and Lindsay, 2014). For instance, lincRNA-Cox2, proximal to the
   prostaglandin-endoperoxide synthase 2 (Ptgs2/Cox2) gene, regulated both
   the activation and repression of diverse types of immune genes
   ([54]Carpenter et al., 2013; [55]Hu et al., 2016). LncRNA FENDRR had an
   impact on the immune escape of hepatocellular carcinoma cells through
   interaction with miR-423-5p and GADD45B ([56]Yu et al., 2019). Another
   research revealed that lncRNA NKILA overexpression in tumor-specific
   CTLs and TH1 promoted their apoptosis and contributed to shorter
   survival time in patients with BC or lung cancer ([57]Huang et al.,
   2018). However, the immune−related lncRNAs associated with BC
   progression remain poorly understood.
   In this study, we integrated bioinformatics analyses and took
   advantages of lncRNA expression profiles to explore novel prognostic
   marks and therapeutic targets for BC patients.
Materials and Methods
Sample Datasets and Data Processing
   The [58]GSE20685 dataset was downloaded from the GENE EXPRESSION
   OMNIBUS (GEO) which contains 327 BC cases. This dataset was based on
   the [59]GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array).
   Meanwhile, patients’ clinical information was also obtained from the
   GEO database, including age, overall survival time (OS), survival
   status, and clinical stage. We re-annotated probe sets of Affymetrix
   Hg-U133 Plus 2.0 array by mapping all probes to the human genome
   (GRCh37) using SeqMap ([60]Jiang and Wong, 2008). In the analysis, we
   ensure that those probes mapped to the genome were unique with no
   mismatch, and all probes mapped to pseudogene transcripts were removed.
   The analysis procedure was conducted by the previous study ([61]Du et
   al., 2013).
Construction and Validation of Immune-Related LncRNAs Prognostic Signature
   Firstly, the immune−related genes were extracted from immune system
   process and immune response gene sets sourced from the Molecular
   Signatures Database (MSigDB,^[62]1) ([63]Liberzon et al., 2015). Then,
   a cohort of immune−related lncRNAs was identified according to Pearson
   correlation analysis between immune genes and lncRNA expression level
   in BC samples (|r| > 0.5, p < 0.001). Next, the univariate Cox
   regression analysis and Kaplan–Meier method were separately performed
   to sort out survival−related lncRNAs with significant prognostic value
   (p < 0.05) on the basis of the training set. Following these two
   independent analyses, the overlapping lncRNAs with significant
   difference (p < 0.05) were sent for further analyses. Finally, Lasso
   penalized Cox regression analysis followed by stepwise multivariate Cox
   regression analysis identified the most promising survival-related
   lncRNAs and constructed a prognostic signature.
   The enrichment scores (ES) of immune genes associated with the
   prognostic signature were calculated using the Single-sample Gene Set
   Enrichment Analysis (ssGSEA) in Gene Set Variation Analysis (GSVA)
   package ([64]Hanzelmann et al., 2013), and then the immune signature
   based on enrichment scores was developed. The Pearson correlation
   analysis was employed to evaluate the relationship between the
   prognostic signature and the immune signature. Subsequently, random
   forest (RF) and logistic regression (LR), two well machine learning
   algorithms, were applied to evaluate the performance of the six-lncRNA
   signature for stratifying the immune status based on the median
   enrichment score of immune signature according to the area under the
   ROC curve (AUC) through five-fold cross validation ([65]Han et al.,
   2014).
Risk Score Construction
   To confirm the reliability of the detected immune−related lncRNAs, the
   risk score was designed to establish a unitive signature for the
   prognostic analysis, and the coefficient for each lncRNA was obtained
   through the multivariate cox regression following bidirectional
   stepwise selection. A risk score formula was established as follows:
   [MATH: ∑iCoefficient(IncRNAi)×Expression(IncRNAi) :MATH]
   Risk score = (−0.468 × expression of [66]AC116366.1) + (−0.260 ×
   expression of [67]AC244502.1) + (−0.279 × expression of [68]AC100810.1)
   + (−0.162 × expression of MIAT) + (−0.312 × expression of
   [69]AC093297.2) + (0.173 × expression of [70]AL356417.2).
Nomogram Construction
   A nomogram integrated clinical features and the six-lncRNA signature
   was constructed using the “rms” R package. Calibration curve were
   performed to assess the predictive accuracy of the nomogram. The
   predicted outcomes and actual observed outcomes of the nomogram were
   presented in the calibrate curve, and the 45° line represents the best
   prediction.
Gene Set Enrichment Analysis
   To further explore the relationship between immune-related gene sets
   and the six-lncRNA signature, GSEA was performed with the JAVA program.
   Genes were ranked on the basis of differential significance between the
   high- and low- risk groups, which were stratified via the median risk
   score in BC patients. After performing 1000 permutations, gene sets
   enrichment with nominal p < 0.05 and FDR < 0.25 were considered
   significant.
Weighted Gene Co-expression Network Analysis (WGCNA)
   The WGCNA package ([71]Langfelder and Horvath, 2008) was applied for
   further analysis on the basis of the training set, and the analysis
   procedure was conducted as previously described ([72]Giulietti et al.,
   2016). In brief, two weighted gene co-expression networks were
   constructed via the WGCNA package according to quartile of risk scores,
   with the bottom group serving as the reference network and the top
   group serving as the test network. The conservation of gene pairs
   between two networks was evaluated by module preservation function from
   WGCNA. In order to make analysis more accurate, 1000 times of
   permutation were performed. Subsequently, median rank was used to
   confirm non-preservative modules and Z-summary score (Z-score) was
   applied to evaluate the significance of non-preservative modules. Note
   that Z-scores greater than 10 indicate a high preservation and Z-scores
   less than 10 indicate a weak preservation. Finally, the genes in
   non-preservation modules were enrolled to send for DAVID tool (version
   6.8) ([73]Huang et al., 2007) to map these hub genes to KEGG pathways.
   Enriched KEGG pathways with p-values < 0.05 were extracted, and then
   the enrichment results were visualized by ggraph package.
External Data Validation
   To further validate predictive capability of the six-lncRNA signature,
   we enrolled three independent datasets, including [74]GSE21653,
   [75]GSE88770, and TCGA with a total of 245, 117, and 888 cases,
   respectively. [76]GSE21653 and [77]GSE88770 were based on the platform
   [78]GPL570 (Affymetrix Human Genome U133 Plus 2.0 Array). The TCGA
   dataset (level3, raw count) was obtained from The Cancer Genome Atlas
   (TCGA) via the TCGAbiolinks package. Patients without complete
   information in training and validation sets had been excluded from this
   study. The inclusion patient ids are listed in [79]Supplementary Table
   S1.
Statistical Analysis
   In Kaplan–Meier survival analysis, the “surv_cutpoint” function in
   survminer package was applied to determine the optimal cutoff value of
   risk scores. Note that the amount of samples in any group could not be
   less than 30% of the total samples. According to the optimal cutoff
   value, BC patients were divided into high− and low−risk groups to
   perform survival analysis. The time−dependent receiver operating
   characteristic (ROC) analysis was performed to investigate the
   prognostic accuracy of the six−lncRNA signature using the survivalROC
   package. After the ssGSEA method calculated the enrichment scores of 24
   immune cell types ([80]Bindea et al., 2013), Pearson correlation
   analysis was performed to evaluate the relationship between risk scores
   and infiltrating immune cells. Logistic regression analysis was carried
   out to evaluate whether combination of the six lncRNAs have higher
   predictive accuracy for infiltrating status of CD8 T-cells than each
   predictor alone. Differences with p < 0.05 were considered significant.
Results
Construction of Immune-Related LncRNAs Prognostic Signature
   A flow chart described our analysis procedure was presented in
   [81]Figure 1A. In our study, 327 samples sourced from [82]GSE20685
   served as the training set. A total of 313 immune−related genes were
   obtained from the Molecular Signatures Database, which were associated
   with immune response and immune system process ([83]Supplementary Table
   S2). Next, 82 immune−related lncRNAs were extracted using Pearson
   correlation analysis (|r| > 0.5, p < 0.001) ([84]Supplementary Table
   S3). Univariate Cox regression analysis and Kaplan–Meier method was
   separately performed to identify survival-related lncRNAs (p < 0.05),
   of which 13 were overlapped ([85]Figure 1B and [86]Supplementary Table
   S4). After primary filtration, Lasso penalized Cox regression analysis
   following stepwise multivariate Cox regression analysis was carried out
   to sort out six survival-related lncRNAs and developed a prognostic
   signature ([87]Figures 1C,D; [88]Supplementary Table S5). Moreover, a
   regulatory network associated with 6 lncRNAs and 86 signature-related
   immune genes was established and visualized using Cytoscape3.7.1
   ([89]Supplementary Figure S1).
FIGURE 1.
   [90]FIGURE 1
   [91]Open in a new tab
   Construction of Immune–related LncRNAs Prognostic Signature (A) Flow
   chart indicating the analysis procedure. (B) Survival analysis was
   separately performed by univariate Cox regression analysis and
   Kaplan–Meier method, of which 13 survival-related lncRNAs were
   overlapped. (C) The selection of the tuning parameter via 10 times
   cross–validation in the Lasso model. (D) Lasso coefficient profiles of
   13 lncRNAs. (E) The predictive efficacy of the six-lncRNA signature in
   classification of immune signature. RF: random forest; LR: logistic
   regression. (F) The correlation identified by Pearson correlation
   analysis among immune signature, six-lncRNA signature, and six lncRNAs.
   Based on 86 model-related immune genes ([92]Supplementary Table S6), we
   constructed an immune signature via the ssGSEA method. The enrichment
   scores of the immune signature are listed in [93]Supplementary Table
   S7. Subsequently, random forest (RF) and logistic regression (LR)
   revealed that the prognostic signature could significantly stratify two
   immune subgroups on the basis of the median enrichment score (LR, AUC
   score = 0.919, RF, AUC score = 0.898) ([94]Figure 1E). Moreover, the
   interplay among immune signature, prognostic signature, and six lncRNAs
   was evaluated by Pearson correlation analysis ([95]Figure 1F;
   [96]Supplementary Table S8). We observed that the six-lncRNA signature
   was negatively associated with the immune signature (r = -0.342, p =
   1.99E-10). In the prognostic signature, [97]AC116366.1, [98]AC244502.1,
   [99]AC100810.1, MIAT, and [100]AC093297.2 acted as the protective
   factors with an HR value less than 1, and [101]AL356417.2 acted as the
   deleterious factor with an HR value greater than 1 ([102]Table 1).
TABLE 1.
   Six immune-related lncRNAs of the prognostic signature.
   Pearson correlation analysis
     __________________________________________________________________
   Univariate Cox regression analysis
     __________________________________________________________________
   lncRNAs Ensemble_ID R p-value HR HR.95L HR.95H p-value
   [103]AC116366.1 ENSG00000234290.2 0.462 1.12E−18 0.427 0.280 0.649
   6.76E−5
   [104]AC244502.1 ENSG00000251002.6 0.786 6.59E−70 0.777 0.660 0.915
   0.0025
   [105]AC100810.1 ENSG00000253982.1 –0.195 0.0004 0.705 0.563 0.882
   0.0023
   MIAT ENSG00000225783.5 0.540 3.54E−26 0.853 0.746 0.975 0.0202
   [106]AC093297.2 ENSG00000272335.1 –0.446 2.04E−17 0.757 0.628 0.912
   0.0034
   [107]AL356417.2 ENSG00000233682.3 0.140 0.0112 1.319 1.079 1.613 0.0070
   [108]Open in a new tab
   R: Correlation of six lncRNAs and the immune signature.
Primary Evaluation of the Six-LncRNA Signature in the Training Set
   Overall survival curves of six immune-related lncRNAs are shown in
   [109]Figure 2A. Risk scores, survival status, OS period, and six-lncRNA
   expression level of each patient are presented in [110]Figure 2B.
   Moreover, patients in the low−risk group showed increased survival time
   in contrast to those in the high−risk group ([111]Figure 2C).
   Univariate following multivariate Cox regression analysis was conducted
   to identify that N-stage and the six-lncRNA signature functioned as
   independent prognostic factors ([112]Supplementary Figures S2,
   [113]S3). The predictive accuracy of the six-lncRNA signature was
   assessed by time−dependent ROC curves at 1, 3, and 5 years on the basis
   of AUC scores. The AUC scores at 1, 3, and 5 years were 0.88, 0.758,
   and 0.725, respectively, which indicated that the prognostic signature
   had a moderate sensitivity and specificity in the training set
   ([114]Figure 2D).
FIGURE 2.
   [115]FIGURE 2
   [116]Open in a new tab
   Primary evaluation of the six-LncRNA Signature in the Training Set. (A)
   Overall survival curves of six lncRNAs. (B) From top to bottom are the
   risk score, survival status distribution, and six-lncRNA expression
   level of each patient. (C) Overall survival curves of the prognostic
   signature, in which the blue line represents the high-risk subgroup and
   the yellow line represents the low-risk subgroup. (D) The 1-, 3-, and
   5-year survival receiver operating characteristic curves.
Further Evaluation of the Six-LncRNA Signature in Validation Sets
   To further evaluate the predictive efficacy of the six−lncRNA
   signature, three independent external datasets ([117]GSE21653,
   [118]GSE88770, and TCGA) were used to validate our findings.
   Kaplan–Meier analysis showed that patients in the low-risk group had
   evidently longer survival time than those in the high-risk group in
   three validation sets ([119]Figures 3A–C), which was consistent with
   the results from the training set. The ROC curves at 1, 3, and 5 years
   indicated that the signature also had a good predictive performance in
   validation sets ([120]Figures 3D–F).
FIGURE 3.
   [121]FIGURE 3
   [122]Open in a new tab
   Further evaluation of the six-LncRNA Signature in Validation Sets.
   (A–C) Survival curves of the prognostic signature in three validation
   sets. [123]GSE21653: Disease-free survival (DFS) curve. [124]GSE88770
   and TCGA: Overall survival (OS) curve. (D–F) Survival receiver
   operating characteristic curves in three validation sets.
Construction and Evaluation of Nomogram
   To enhance the predictive accuracy, four clinical features and the
   six-lncRNA signature were integrated and a nomogram was constructed on
   the basis of the training set ([125]Figure 4A). The calibration curves
   presented a good agreement between prediction by nomogram and actual
   observation for predicting survival probability at 3 and 5 years in
   training and validation datasets ([126]Figures 4B–E).
FIGURE 4.
   [127]FIGURE 4
   [128]Open in a new tab
   Construction and Evaluation of Nomogram. (A) Nomogram to predict
   survival probability. (B–E) Three- and 5-year nomogram calibration
   curves in training and validation sets. The training set:
   [129]GSE20685. The validation sets: [130]GSE21653, [131]GSE88770, and
   TCGA.
Immune Status Analysis for High- and Low-Risk Groups
   In the following study, GSEA was applied to show that immune system
   process and immune response gene sets were significantly enriched in
   the low−risk group ([132]Figures 5A,B). The ssGSEA followed by Pearson
   correlation analysis was conducted to validate the relationships
   between risk scores and infiltration immune cells. As presented in
   [133]Figure 5C, risk scores had the strongest correlation with CD8 T
   cells compared to other types of immune cells. Based on the median risk
   score, BC patients in the training set were stratified into two
   subgroups, high- and low-risk groups. High-risk cohorts displayed lower
   enriched levels of the immune signature and CD8 T cells compared to
   low-risk cohorts ([134]Figure 5D). Meanwhile, the immune signature was
   strikingly positively related with CD8 T cells on the basis of Pearson
   correlation analysis (r = 0.72, p = 1.54E-53; [135]Supplementary Figure
   S4). Survival curve revealed that infiltration of CD8 T cells was
   associated with a favorable prognosis in BC patients ([136]Figure 5E).
FIGURE 5.
   [137]FIGURE 5
   [138]Open in a new tab
   Immune status analysis for high– and low–risk groups. (A,B) Gene set
   enrichment analysis (GSEA) indicated a significant enrichment of
   immune–related phenotype in the low–risk group. (C) Pearson correlation
   analysis between risk scores of the six-lncRNA signature and enrichment
   scores of the immune cells. The results with p-value < 0.05 were
   extracted. (D) Enrichment scores of immune signature and CD8 T cells in
   high- and low-risk cohorts. The patients were divided into two
   subgroups on the basis of the median risk score. The enrichment score
   was scaled to be between 0 and 1 in the plot. Orange: immune signature;
   blue: CD8 T cells; red: high-risk patients; green: low-risk patients.
   (E) Kaplan–Meier curve associated with CD8 T cells.
   In training and validation sets, risk scores were negatively related
   with enrichment scores of CD8 T cells on the basis of Pearson
   correlation analysis ([139]Figures 6A,D). To further identify
   predictive efficacy of the signature for infiltration of CD8 T cells,
   the patients of training and validation sets were divided into high-
   and low-infiltration groups according to the median enrichment score of
   CD8 T cells. Logistic regression analysis showed that the combination
   of six lncRNAs had higher predictive efficacy than each predictor alone
   ([140]Figures 7A–D).
FIGURE 6.
   [141]FIGURE 6
   [142]Open in a new tab
   The correlation of the six-lncRNA signature and CD8 T cells. (A–D)
   Pearson correlation analysis was conducted to identify the relationship
   between risk scores of the prognostic signature and enrichment scores
   of CD8 T cells in training and validation sets.
FIGURE 7.
   [143]FIGURE 7
   [144]Open in a new tab
   The predictive efficacy of the six-lncRNA signature for infiltration of
   CD8 T cells. (A–D) Logistic regression was conducted to evaluate the
   predictive efficacy of the combination of the six lncRNAs and each
   lncRNA alone on the basis of the AUC value in training and validation
   sets.
Construction of Weighted Gene Co-expression Network and Identification of Key
Modules
   In the following study, we applied the WGCNA method to identify
   non-preservation modules, whose characteristics could distinguish the
   test network from the reference network. Firstly, we confirmed a soft
   threshold power value of 16 as best optimum index in the high-risk
   group ([145]Supplementary Figure S5) and 2 as the best optimum index in
   the low-risk group ([146]Supplementary Figure S6). Next, we identified
   6 modules in the high-risk group and 22 modules in the low-risk group
   except for the gray module via hierarchical clustering. Finally, the
   module preservation function in the WGCNA package was applied to
   identify three non-preservation modules, the light green module
   (Z-score = 8.2), the midnight blue module (Z-score = 9.8) and the tan
   module (Z-score = 6.0) ([147]Figure 8A).
FIGURE 8.
   [148]FIGURE 8
   [149]Open in a new tab
   Construction of weighted co–expression network and identification of
   key modules. (A) Each module is represented by its color-code and name.
   Left plot shows the preservation median rank, which tends to be
   independent from module size with high median ranks indicating low
   preservation. Right plot shows Z-score value. The green dotted line
   indicates that Z-score is equal to 10. Z-score < 10 represents weak
   preservation. (B) Enriched KEGG pathways with p-values < 0.05 were
   presented. KME value (eigengene connectivity) of module internal hub
   genes was indicated by dots’ size, respectively.
   To further investigate the biological mechanisms of non-preservation
   modules, we mapped internal genes of non-preservation modules to KEGG
   pathways using the DAVID tool (version 6.8) ([150]Huang et al., 2007).
   Enriched KEGG pathways with p values < 0.05 were extracted. KEGG
   pathway-enrichment analysis demonstrated that the midnight blue module
   was mainly involved in protein processing in endoplasmic reticulum and
   antigen processing and presentation; the light green module was mainly
   involved in sphingolipid signaling pathway, hippo signaling pathway,
   and tight junction; and the tan module was mainly involved in circadian
   rhythm ([151]Figure 8B). In summary, the midnight blue module could be
   a crucial module, whose disorganized functions affected immune cell
   infiltration and contribute to shorter survival time in patients with
   high risk scores.
Discussion
   Infiltration of immune cells in TME has dual roles on cancer
   progression and patient prognosis, and deciphering their complex
   interplay is of seminal importance for exploring novel prognostic
   markers and therapeutic targets in cancer patients ([152]Fridman et
   al., 2011; [153]Klemm and Joyce, 2015). LncRNAs have been reported to
   function as key regulators in immune response, inflammation, and
   distinct immune cell types ([154]Das et al., 2018; [155]Li et al.,
   2018; [156]Wang et al., 2018; [157]Xu and Cao, 2019), and play crucial
   roles in cancer malignant progression ([158]Bartonicek et al., 2016;
   [159]Jiang et al., 2016; [160]Schmitt and Chang, 2016). Therefore,
   insight into immune-related lncRNAs could bring profound effects on
   cancer-related therapeutics and survival prognosis prediction
   ([161]Chen et al., 2017).
   Since the lncRNA expression level and biological functions were
   markedly associated with immune-related genes ([162]Roux et al., 2017),
   we employed immune-associated gene sets to screen out highly
   immune-related lncRNAs by co-expression analysis. Subsequently, a
   series of bioinformatics analyses were integrated to identify a
   six-immune-related lncRNA signature, which could well stratify patients
   into favorable or unfavorable prognoses. Moreover, the six-lncRNA
   signature was considered as an independent risk factor by multivariate
   Cox regression analysis. To further improve the predictive accuracy, a
   nomogram consisting of age, TMN stage, and the six-lncRNA signature was
   constructed. A good performance of the nomogram was identified by the
   calibration curve. Meanwhile, our findings were validated in multiple
   BC datasets. Additionally, functional annotation conducted by GSEA
   revealed that the immune-related gene sets, the immune system process,
   and immune response were significantly activated in the low-risk group.
   Previous studies have shown that different types of infiltrating immune
   cells have distinct effects on tumor progression ([163]Fridman et al.,
   2012; [164]Bindea et al., 2013). In order to investigate the
   association between infiltration immune cells and the lncRNAs
   signature, the ssGSEA method was used to calculate enrichment scores of
   24 immune cell types ([165]Bindea et al., 2013), and then Pearson
   correlation analysis was conducted to evaluate their correlations. Our
   integrated analyses indicated that risk scores were significantly
   associated with the enrichment level of CD8 T cells. In addition, the
   combination of six lncRNAs had higher prognostic accuracy for CD8 T
   cell infiltrating status than each predictor alone, indicating that the
   signature could provide promising immunotherapeutic targets for the
   treatment of BC. CD8 T cells function as a tumor suppressor and play
   crucial roles in cancer progression ([166]Barry and Bleackley, 2002;
   [167]Vesely et al., 2011; [168]Jansen et al., 2019). In BC, previous
   studies have demonstrated that CD8 T cells are the primary effector
   immune cells and associated with favorable clinical outcomes of BC
   patients ([169]Mahmoud et al., 2011). Taken together, dysregulation of
   CD8 T cells might be a key mechanism associated with a higher mortality
   of patients in the high-risk group.
   To further provide an interpretation of biological mechanisms for the
   six-lncRNA signature, we performed WGCNA analysis to explore
   non-preservative modules in the high-risk group compared with the
   low-risk group. Subsequent KEGG pathway-enrichment analysis
   demonstrated that the midnight blue module was strongly associated with
   immune-related pathways mainly involved in protein processing in
   endoplasmic reticulum and antigen processing and presentation. In the
   immune system, the antigen presentation pathway plays central roles for
   the immune surveillance and elimination of malignant transformations
   ([170]Hu et al., 2019). Nascent malignant cells may reduce antigenicity
   so that anti-tumor lymphocytes fail to detect transformed cells to
   escape immune surveillance ([171]Hu et al., 2019). Endoplasmic
   reticulum (ER) could process and present mutation-derived tumor-related
   nascent antigens to CD8 T cells, which contribute to tumor suppression
   ([172]Johnsen et al., 1999; [173]Hu et al., 2019). In our study, the
   antigen presentation pathway involved in protein processing in
   endoplasmic reticulum and antigen processing and presentation was
   disorganized in the high-risk group, which might promote tumor immune
   evasion and associate with poor clinical outcomes in the patients with
   high risk scores.
   In the study, we integrated bioinformatics analyses to provide a solid
   theoretical basis for potential value of the six-lncRNA signature.
   However, in the process of public database mining, some unknown factors
   are usually difficult to extrapolate. In subsequent studies, we will
   further demonstrate and improve our findings by more rigorous
   experimental evidence in immune-related lncRNAs associated with BC.
Conclusion
   In conclusion, we integrated informatics analyses to identify a six
   immune-related lncRNAs signature. Moreover, CD8 T cell infiltration was
   significantly activated in the low-risk group, which could contribute
   to a better prognosis of BC patients. Our findings aim to help improve
   clinical outcome prediction and provide promising immunotherapeutic
   targets for BC patients.
Data Availability Statement
   Publicly available datasets were analyzed in this study, these can be
   found in The Cancer Genome Atlas ([174]http://cancergenome.nih.gov/)
   and the NCBI Gene Expression Omnibus
   ([175]https://www.ncbi.nlm.nih.gov/geo/) ([176]GSE20685, [177]GSE21653,
   and [178]GSE88770).
Author Contributions
   QY conceived and directed the project. ZL designed the study and
   analyzed the data. ZL and YL wrote the manuscript. XW reviewed the
   data. All authors have read and approved the final manuscript for
   publication.
Conflict of Interest
   The authors declare that the research was conducted in the absence of
   any commercial or financial relationships that could be construed as a
   potential conflict of interest.
Abbreviations
   AUC
          the area under the ROC curve
   BC
          breast cancer
   CD8 T cells
          CD8-positive cytotoxic T lymphocytes
   DFS
          disease-free survival
   FDR
          false discovery rate
   GEO
          the Gene Expression Omnibus
   GSEA
          gene set enrichment analysis
   GSVA
          gene set variation analysis
   HR
          hazard ratio
   KEGG
          Kyoto Encyclopedia of Genes and Genomes
   lncRNAs
          long non-coding RNAs
   LR
          logistic regression analysis
   MSigDB
          the Molecular Signatures Database
   OS
          overall survival
   RF
          random forest
   ROC curve
          receiver operating characteristic curve
   ssGSEA
          single sample gene set enrichment analysis
   TCGA
          The Cancer Genome Atlas
   TME
          the tumor microenvironment
   WGCNA
          weighted gene co-expression network analysis.
   Funding. This work was supported by the National Natural Science
   Foundation of China (Nos. 81672613 and 81874119), the Special
   Foundation for Taishan Scholars (No. ts20190971), the Special Support
   Plan for National High Level Talents (Ten Thousand Talents Program
   W01020103), the National Key Research and Development Program (No.
   2018YFC0114705), and the Qilu Hospital Clinical New Technology
   Developing Foundation (Nos. 2018-7 and 2019-3).
   ^1
   [179]https://www.gsea-msigdb.org/gsea/msigdb/
Supplementary Material
   The Supplementary Material for this article can be found online at:
   [180]https://www.frontiersin.org/articles/10.3389/fgene.2020.00680/full
   #supplementary-material
   [181]Click here for additional data file.^ (884.3KB, JPEG)
   [182]Click here for additional data file.^ (719.8KB, JPEG)
   [183]Click here for additional data file.^ (718.1KB, JPEG)
   [184]Click here for additional data file.^ (599.7KB, JPEG)
   [185]Click here for additional data file.^ (371.1KB, JPEG)
   [186]Click here for additional data file.^ (370.8KB, JPEG)
   [187]Click here for additional data file.^ (34.5KB, XLSX)
   [188]Click here for additional data file.^ (15.2KB, XLSX)
   [189]Click here for additional data file.^ (12.1KB, XLSX)
   [190]Click here for additional data file.^ (9.6KB, XLSX)
   [191]Click here for additional data file.^ (9.2KB, XLSX)
   [192]Click here for additional data file.^ (10.7KB, XLSX)
   [193]Click here for additional data file.^ (60.5KB, XLSX)
   [194]Click here for additional data file.^ (11.2KB, XLSX)
References