Abstract Background Ovarian cancer (OV) is the most lethal gynecological cancer in women. We aim to develop a generalized, individualized immune prognostic signature that can stratify and predict overall survival for ovarian cancer. Methods The gene expression profiles of ovarian cancer tumor tissue samples were collected from 17 public cohorts, including 2777 cases totally. Single sample gene set enrichment (ssGSEA) analysis was used for the immune genes from ImmPort database to develop an immune-based prognostic score for OV (IPSOV). The signature was trained and validated in six independent datasets (n = 519, 409, 606, 634, 415, 194). Findings The IPSOV significantly stratified patients into low- and high-immune risk groups in the training set and in the 5 validation sets (HR range: 1.71 [95%CI: 1.32–2.19; P = 4.04 × 10^−5] to 2.86 [95%CI: 1.72–4.74; P = 4.89 × 10^−5]). Further, we compared IPSOV with nine reported ovarian cancer prognostic signatures as well as the clinical characteristics including stage, grade and debulking status. The IPSOV achieved the highest mean C-index (0.625) compared with the other signatures (0.516 to 0.602) and clinical characteristics (0.555 to 0.583). Further, we integrated IPSOV with stage, grade and debulking, which showed improved prognostic accuracy than clinical characteristics only. Interpretation The proposed clinical-immune signature is a promising biomarker for estimating overall survival in ovarian cancer. Prospective studies are needed to further validate its analytical accuracy and test the clinical utility. Fund This work was supported by National Key Research and Development Program of China, National Natural Science Foundation of China and Natural Science Foundation of the Jiangsu Higher Education Institutions of China. Keywords: Ovarian cancer, Immune, Gene expression, Prognostic signature List of Abbreviations OV ovarian cancer IPSOV immune-based prognostic signature for OV HR hazard ratio CI confidence interval GEO Gene Expression Omnibus TCGA The Cancer Genome Atlas ovarian cancer FIGO International Federation of Gynecology and Obstetrics ssGSEA single sample gene set enrichment analysis KEGG Kyoto Encyclopedia of Genes and Genomes GO Gene Ontology BP biological process MF molecular function CC cellular component FDR False-Discovery Rate C-index concordance statistic RMS restricted mean survival [35]Open in a new tab 1. Introduction Ovarian cancer (OV) is the most lethal gynecological cancer among women, with >14,000 estimated new deaths in United States, 2018 [[36]1]. Biomarkers especially gene expression in tumor tissues are reliably related to cancer prognosis and survival [[37][2], [38][3], [39][4]]. Thus, identification of the subset of patients with worse survival and higher mortality is needed for additional clinical therapy. The availability of large-scale public cohorts with gene expression data and the well-developed biological database bring the opportunity to identify a more generalized prognostic signature with biological background for ovarian cancer. Immune system has been shown to be a determining factor during cancer initiation and progression [[40]5,[41]6]. Various evidence have proved that ovarian cancer are immunogenic tumors [[42][7], [43][8], [44][9]] and immunotherapy is strongly pursued through targeting on the immune checkpoints [[45]10,[46]11]. In addition, previous study has preliminary confirmed the prognostic value of immune system in ovarian cancer [[47]12]. Thus, the immune based prognostic signature remains a potential to be applied in ovarian cancer. In this study, we integrated multiple cohorts with gene expression which contained 2777 cases totally to develop and validate an individualized gene-set based prognostic signature for ovarian cancer from immune-related genes. With sufficient validation of 5 independent datasets, we proved the model stability and reliability. Further, to leverage the complementary value of clinical characteristics, we performed an integrated analysis to improve the predicted accuracy for overall survival. 2. Methods 2.1. Study population and eligibility criteria We retrospectively collected the ovarian cancer gene expression profiles of frozen primary tumor tissue samples from 17 public microarray datasets, including 16 cohorts from the Gene Expression Omnibus (GEO) repository and 1 from The Cancer Genome Atlas ovarian cancer (TCGA-OV) cohort. All patients had received primary debulking surgery. Only patients with available follow-up time, status and gene expression data were included. The main outcome of our study was overall survival. Staging was assessed in accordance to the International Federation of Gynecology and Obstetrics (FIGO) stage system. Optimal debulking was defined as ≤1 cm of gross residual disease while sub-optimal debulking was defined as >1 cm of residual disease. We did not limit histology type, FIGO stage, grade, debulking and neoadjuvant chemotherapy status for sample collection. The sample quality control details were described in Table S1. The study characteristics of the 17 public ovarian cancer datasets were described in Table S1. Finally, 2777 ovarian cancer cases were included in our study. The study design and workflow were provided in [48]Fig. 1. Fig. 1. [49]Fig. 1 [50]Open in a new tab Flowchart of the study. 17 public ovarian cancer datasets containing 2777 cases were included and categorized into 6 independent datasets according to the microarray platform. We developed the IPSOV in the training set and validated in the other 5 datasets. Further, we integrated IPSOV with stage, grade and debulking status to improve the prognostic value. 2.2. Gene expression data preprocessing For GEO datasets, all microarray data and clinical information were download from GEO repository ([51]https://www.ncbi.nlm.nih.gov/geo/). To increase the statistical power and leverage cohorts with small sample size sufficiently, we combined the cohorts with the same microarray platform. Only the platforms with sample size ≥ 150 were included in the study. For TCGA cohort, due to the samples with available RNA-Sequencing gene expression quantification were relatively small (n = 374), we collected Affymetrix Human Genome U133A Array microarray data instead (n = 519). All level-2 data were download from the Genomic Data Commons Data Portal ([52]https://portal.gdc.cancer.gov/). Finally, six independent datasets were included according to the different platforms from GEO and TCGA ([53]Table 1). Entrez IDs were used to represent genes across different platform. In the independent validation phase, if multiple probe sets correspond to the same Entrez ID in the validation sets, the one with the highest mean signal was selected as the expression level of the corresponding gene [[54]13]. Table 1. Summary of the eight datasets included in the study. Dataset Platform Sample size Included cohorts[55]^a Training set Affymetrix Human Genome U133A Array 519 TCGA Validation set 1 Affymetrix Human Genome U133A Array 409 [56]GSE14764, [57]GSE23554, [58]GSE26712, [59]GSE3149 Validation set 2 Affymetrix Human Genome U133 Plus 2.0 Array 606 [60]GSE18520, [61]GSE19829, [62]GSE26193, [63]GSE30161, [64]GSE63885, [65]GSE9891 Validation set 3 Agilent-014850 Whole Human Genome Microarray 4x44K G4112F 634 [66]GSE17260, [67]GSE32062, [68]GSE53963, [69]GSE73614 Validation set 4 Operon human v3 ~35 K 70-mer two-color oligonucleotide microarrays 415 [70]GSE13876 Validation set 5 ABI Human Genome Survey Microarray Version 2 194 [71]GSE49997 [72]Open in a new tab ^a We used ComBat to adjust the batch effects between different cohorts within the same platform. Gene expression values of all probes were adjusted in each dataset, respectively. The batch effects between different datasets within the same platform were adjusted by ComBat method [[73]14]. Gene expression values of all probes were adjusted in each dataset, respectively. We also adjusted the batch effects within the TCGA cohort according to the tissue source site code. All the gene expression levels had been logarithmic transformed before batch effects adjustment. 2.3. Immune-related genes definition We constructed a prognostic signature from the immune-related genes, which were downloaded from the ImmPort database ([74]http://www.immport.org) [[75]15]. It includes 17 immune categories according to different molecular function, such as antimicrobials, cytokine, interleukins, T-cell receptor signaling pathway, B-cell receptor signaling pathway, TNF family receptors. 2.4. Development of the immune-based prognostic signature for OV (IPSOV) We performed a three-stage strategy to develop the signature. First, we used the TCGA cohort as the training set to screen the survival related probes. 1905 microarray probes for 1109 immune genes were included totally. The Cox proportional-hazards model was used to assess their association with overall survival with adjustment for age, stage, grade and debulking status. We excluded the probes with adjusted P values larger than 0.05. To estimate the population specific immune infiltration, we used single sample gene set enrichment analysis (ssGSEA) that define a enrichment score to represent the degree of absolute enrichment of a gene set in each sample within a given dataset [[76]16]. Normalized enrichment scores could be calculated for each immune category. The ssGSEA analysis were performed in R package GSVA. In the last step, we develop a prognostic signature named the immune-based prognostic signature for OV (IPSOV) to combine the effects of each immune category in the training set. The coefficients of each category were determined by multivariable Cox regression model: [MATH: IPSOV=i=1 kβiSi :MATH] , where S[i] is the ssGSEA score for ith immune category. 2.5. Validation of the IPSOV To get a uniform cutoff value to categorize the patients into low-risk and high-risk group, we performed a normalization for the gene expression values in each dataset with mean value = 0 and standard deviation (SD) = 1. After developing the signature, the prognostic score of IPSOV was further calculated in the 5 validation datasets, respectively. In the multivariable Cox regression, age, FIGO stage, grade, debulking status and study site were included as covariates. 2.6. Pathway enrichment analysis for the molecular function To further understand the gene function in IPSOV, we performed a pathway enrichment analysis for the genes based on Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) database including biological process (BP), molecular function (MF) and cellular component (CC). The P values were adjusted by False-Discovery Rate (FDR) method for multiple comparison. All the analysis were performed by R package clusterProfiler [[77]17]. 2.7. Existing prognostic signatures for comparison To evaluate the survival classification and prediction ability of IPSOV, we retrospectively collected 9 published prognostic signatures for comparison, which included from 7 genes to 300 genes (Table S7). Continuous prognostic scores were calculated for each signature. The P values of continuous scores in univariable Cox model and the overall concordance statistic (C-index) were compared in the six datasets. 2.8. Statistical analysis Continuous variables were summarized as mean ± SD, and categorized variables were described by frequency (n) and proportion (%). In the survival analysis, associations between characteristics and overall survival were evaluated by Cox proportional hazard models. Kaplan-Meier survival curves were drawn and compared among subgroups using log-rank tests. The C-index and restricted mean survival (RMS) curve were estimated using R package survival and survRM2, while C-index was compared by R package compareC [[78]18]. Statistical analyses were performed using R version 3.4.0 (The R Foundation). P values were two-sided, and P < 0.05 was considered statistically significant. 3. Results 3.1. Development and definition of the IPSOV A total of 2777 patients with ovarian cancer (mean age ± SD, 59.7 ± 11.8; 12.8% early-stage [I, II], 87.2% late-stage [III, IV]; 43.5% optimal debulking, 56.5% sub-optimal debulking) from 6 independent datasets were included in the analysis ([79]Table 2). In the training set, 156 probes of 129 genes from 15 immune categories were associated with overall survival among the 1905 immune-related microarray probes. Further, we used these probes to perform the ssGSEA analysis for each patient to calculate the immune enrichment score of each immune category. The IPSOV was defined as the combined effect of scores in different categories using the coefficients generated from multivariable Cox regression model (Table S3). The median score in the training set (−0.23) was used as the cutoff to stratify patients into low- and high-immune risk group across all the datasets. Table 2. Demographic and clinic characteristic descriptions for ovarian cancer patients in different datasets. Characteristics[80]^a Training set Validation set 1 Validation set 2 Validation set 3 Validation set 4 Validation set 5 Number of samples 519 409 606 634 415 194 Median survival time (month) (95% CI) 44.5 (41–48.3) 51 (45–65) 41.9 (37.8–47.8) 53 (49–62) 24 (21−30) >49[81]^b Number of Death (%) 284 (54.9) 232 (56.7) 346 (57.1) 364 (57.1) 302 (72.7) 57 (29.3) Age (years) 59.8 ± 11.4 – 62.9 ± 10.6 62.6 ± 11.2 57.9 ± 12.2 57.6 ± 11.8 

 Histology type (%) Serous 520 (100) 96 (88.8) 516 (85.5) 534 (83.8) 415 (100) 171 (88.1) Endometroid 0 6 (5.5) 41 (6.8) 66 (10.3) 0 0 Clear cell 0 2 (1.8) 20 (3.3) 37 (5.8) 0 0 Mucinous 0 0 9 (1.4) 0 0 0 Other 0 4 (3.7) 17 (2.8) 0 0 23 (11.8) 

 FIGO stage (%) I 15 (2.9) 8 (4.1) 41 (7.4) 32 (5.0) – 0 II 25 (4.8) 2 (1.0) 111 (20.0) 23 (3.6) – 9 (4.6) III 396 (76.7) 164 (84.1) 364 (65.8) 462 (72.6) – 154 (79.3) IV 80 (15.5) 21 (10.7) 37 (6.6) 119 (18.7) – 31 (15.9) 

 Grade (%) 1 5 (0.9) 9 (4.0) 32 (5.8) 26 (4.0) – 0 2 64 (12.5) 86 (38.9) 80 (14.6) 204 (32.0) – 0 3 439 (86.2) 125 (56.5) 394 (72.0) 324 (50.8) – 143 (100) 4 1 (0.1) 1 (0.2) 41 (7.4) 83 (13.0) – 0 

 Debulking status (%) Optimal 337 (73.1) 193 (54.0) 209 (66.7) 271 (51.4) – – Sub-optimal 124 (26.9) 164 (45.9) 104 (33.2) 256 (48.5) – – [82]Open in a new tab ^a Sum of frequency numbers may not equal to the total sample size due to missing values. ^b The median survival time is incalculable due to the mortality at the last follow-up time is <50%. 3.2. Validation of the IPSOV The IPSOV significantly stratified patients into low- and high-immune risk groups in the training set (hazard ratio [HR] = 1.82; 95% confidence interval [CI]: 1.43–2.30; P = 9.19 × 10^−7) and in the 5 validation sets (HR range: 1.66 [95%CI: 1.27–2.15; P = 1.84 × 10^−4] to 2.36 [95%CI: 1.38–4.00; P = 0.001]) (Table S4). In addition, it remained as an independent prognostic factor in the multivariable Cox model, after adjusting for age, stage, grade and debulking status (HR range: 1.71 [95%CI: 1.32–2.19; P = 4.04 × 10^−5] to 2.86 [95%CI: 1.72–4.74; P = 4.89 × 10^−5]) ([83]Fig. 2). In the meta-analysis for all datasets, the high-immune risk group showed a 1.96-fold higher risk than the low-immune risk group (HR = 1.96; 95%CI: 1.73–2.23; P = 2.37 × 10^−25) (Fig. S1). The IPSOV distribution with survival status in the combined dataset was shown in [84]Fig. 3. Significant RMS time ratio (1.24 to 1.88) was observed in the 6 datasets, respectively ([85]Table 3). Fig. 2. [86]Fig. 2 [87]Open in a new tab Kaplan-Meier survival analyses of IPSOV. (a) Patients are stratified into low- (red) and high-immune (blue) risk groups with a cutoff of the median value in the training set. (b-f) Further, the prognostic signature IPSOV is validated in five independent validation sets. (For interpretation of the references to color in this figure legend, the