Abstract
Purpose
High-grade serous ovarian cancer (HGSOC) is the leading cause of death among gynecological malignancies, mainly because of its high rate of chemoresistance. To date, few studies have investigated the molecular mechanisms underlying this resistance to treatment in ovarian cancer patients. In this study, we aimed to explore these molecular mechanisms using bioinformatics analysis.
Methods
We analyzed microarray data set GSE51373, which included 16 platinum-sensitive and 12 platinum-resistant HGSOC samples. Differentially expressed genes (DEGs) were identified using RStudio. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed using DAVID, and a DEG-associated protein–protein interaction (PPI) network was constructed using STRING. Hub genes in the PPI network were identified, and the prognostic value of the top ten hub genes was evaluated. MGP, one of the hub genes, was verified by immunohistochemistry.
Results
All samples were confirmed to be of high quality. A total of 109 DEGs were identified, and the top ten enriched GO terms and four KEGG pathways were obtained. Specifically, the PI3K-AKT signaling pathway and the Rap1 signaling pathway were identified as having significant roles in chemoresistance in HGSOC. Furthermore, based on the PPI network, KIT, FOXM1, FGF2, HIST1H4D, ZFPM2, IFIT2, CCNO, MGP, RHOBTB3, and CDC7 were identified as hub genes. Five of these hub genes could predict the prognosis of HGSOC patients. Positive immunostaining signals for MGP were observed in the chemoresistant samples.
Conclusion
Taken together, the findings of this study may provide novel insights into HGSOC chemoresistance and identify important therapeutic targets.
Keywords: high-grade serous ovarian cancer, chemoresistance, gene expression profiling, bioinformatics analysis
Introduction
Epithelial ovarian cancer has the highest mortality rate of any gynecological cancer, and patients with high-grade serous ovarian cancer (HGSOC) have particularly poor outcomes.^1,2^ Typical treatment for HGSOC consists of surgical resection combined with postoperative chemotherapy using cisplatin and paclitaxel.^3^ However, the vast majority of patients with advanced disease relapse within 5 years, often owing to metastasis and drug resistance of ovarian cancer cells.^4–7^ Therefore, there is a need to find more efficient therapeutic targets, including key genes and signaling pathways that drive therapy resistance. Some progress has been made in determining the mechanisms of chemoresistance in HGSOC.^5,8^ For example, Zhang et al^9^ isolated CD44+/CD117+ ovarian cancer stem cells (CSCs) and found that they exhibited enhanced resistance to the ovarian cancer chemotherapeutics cisplatin and paclitaxel. Liu et al^10^ found that C/EBPβ-mediated reprogramming of gene expression triggered a broad signaling network that synergistically promoted cisplatin resistance in HGSOC. Luo et al^11^ showed that loss of ARID1A in HGSOC led to multiple drug resistance through the upregulation of MRP2. Recently, tumor metabolism has also been implicated in chemoresistance; combining chemotherapeutic drugs with metabolic targeting appears to be a promising approach to overcoming chemoresistance.^12,13^ Nevertheless, these studies are only the tip of the iceberg regarding the mechanisms of platinum resistance in HGSOC. Stronger links between molecular profiles and drug resistance are needed.
Bioinformatics analysis is an effective and practical method to predict key genes and pathways in tumorigenesis or other pathological processes.^14^ In this study, we analyzed microarray profiles of 12 platinum-resistant HGSOC samples and 16 platinum-sensitive control samples from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were screened, and gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted. We also constructed a protein–protein interaction (PPI) network and screened out the hub genes involved in the development of platinum-based chemotherapy resistance in HGSOC patients. Finally, we evaluated the prognostic significance of these hub genes and chose one marker for validation of its expression. This study may provide novel insights into HGSOC chemoresistance and identify potentially important therapeutic targets.
Materials and Methods
Data Sets
Microarray gene expression profiles from the GSE51373 data set^15^ were downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo/). This data set is based on the GPL570 Affymetrix Human Genome U133 Plus 2.0 Array platform (Affymetrix Inc., Santa Clara, CA, USA) and contains samples from 28 patients, who were divided into two groups (chemoresistant and chemosensitive). The inclusion criterion was as follows: patients with HGSOC treated with the same standard platinum-based chemotherapy (carboplatin/paclitaxel). Twelve patients demonstrating relative resistance to platinum chemotherapy, corresponding to shorter progression-free survival (PFS < 8 months), were compared with 16 platinum-sensitive patients (PFS > 18 months).
Data Pre-processing and Differential Expression Analysis
RStudio software (version 1.1.447) and various R packages were used to analyze the original array data.
In short, we first assessed the quality of the raw data by plotting the normalized unscaled standard error (NUSE) boxplot and plot of residuals. Then, background correction and quantile normalization were performed on the raw data using the robust multi-array average algorithm in the R affy package. Subsequently, the DEGs between the chemosensitive and chemoresistant group samples were sorted by paired t-tests using the limma package in R. Multiple comparisons were corrected by the Benjamini–Hochberg method to obtain adjusted P-values. Finally, genes with adjusted P<0.05 and |log2 fold change (FC)|>1 were considered to be significant.
Functional Enrichment Analysis
GO enrichment and KEGG pathway analyses were performed for gene annotation and functional enrichment using the online tool Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov/).^16^ The resulting GO terms and KEGG pathways with P<0.05 were considered to be significantly enriched in the obtained DEGs.
Construction of PPI Network
To evaluate the interactive relationships among DEGs, we mapped the DEGs to the STRING database (http://string-db.org). Only the interactions with a combined score >0.15 were considered significant. Hub genes were then selected from the PPI network and a score was calculated for each gene (the number of genes directly interacting with it). The top ten hub genes in the network were identified based on their scores.
Evaluation of the Prognostic Significance of Top Ten Hub Genes
Kaplan–Meier Plotter, an online survival analysis tool (http://kmplot.com/analysis/),^17,18^ was used to evaluate the prognostic significance of the top ten hub genes in HGSOC. According to their median expression value, the patient samples were divided into high- and low-expression groups. Hazard ratios with 95% confidence intervals and log-rank P-values were calculated.
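The two selection rules described in the Methods, namely the DEG filter (Benjamini–Hochberg adjusted P<0.05 and |log2 FC|>1) and the degree-based hub score over interactions with a combined score >0.15, can be illustrated with a minimal Python sketch. This is not the pipeline used in the study, which relied on limma in R, STRING, and cytoHubba. The function names, the toy gene labels, and the input layout are hypothetical, and the raw P-values are assumed to have been computed beforehand.

```python
from collections import defaultdict


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values stay monotone (each is capped by the one ranked above it).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values, alpha=0.05, min_abs_log2_fc=1.0):
    """Keep genes with adjusted P < alpha and |log2 FC| > min_abs_log2_fc."""
    adjusted = benjamini_hochberg(p_values)
    return [
        gene
        for gene, fc, q in zip(genes, log2_fc, adjusted)
        if q < alpha and abs(fc) > min_abs_log2_fc
    ]


def degree_scores(edges, min_combined_score=0.15):
    """Score each gene by its number of distinct direct interaction partners.

    `edges` is an iterable of (gene_a, gene_b, combined_score) tuples
    (an assumed layout, not the STRING export format).
    """
    neighbours = defaultdict(set)
    for gene_a, gene_b, score in edges:
        if score > min_combined_score and gene_a != gene_b:
            neighbours[gene_a].add(gene_b)
            neighbours[gene_b].add(gene_a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


def top_hubs(edges, n=10, min_combined_score=0.15):
    """Return the n highest-degree genes as (gene, score), ties by name."""
    scores = degree_scores(edges, min_combined_score)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Toy data with hypothetical gene labels, for illustration only.
toy_genes = ["A", "B", "C", "D"]
toy_log2_fc = [1.5, 2.0, -0.5, 3.0]
toy_p = [0.01, 0.04, 0.03, 0.5]
toy_edges = [("A", "B", 0.9), ("A", "C", 0.4), ("B", "C", 0.1), ("A", "D", 0.2)]

print(select_degs(toy_genes, toy_log2_fc, toy_p))  # ['A']
print(top_hubs(toy_edges, n=2))                    # [('A', 3), ('B', 1)]
```

In the toy example, gene B passes the fold-change cut-off but its adjusted P-value (about 0.053) exceeds 0.05, so only A is retained; the B–C interaction is dropped because its score does not exceed 0.15. The degree count corresponds to the score defined in the Methods (the number of genes directly interacting with a given gene).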
Immunohistochemistry
Immunohistochemical analyses were performed using the Envision System with diaminobenzidine (Gene Tech Co., Ltd., Shanghai) according to the manufacturer’s protocol. Chemoresistant and chemosensitive samples were collected from Fudan University Shanghai Cancer Center. In brief, specimens were incubated first with an anti-MGP antibody (10734-1-AP; 1:400, Proteintech, China) overnight at 4°C and then with a biotinylated secondary antibody (1:100, goat anti-rabbit IgG) for 30 min at 37°C.
Ethics Statement
The tissue samples from patients in this study were used with approval from the Ethics Committee of Fudan University Shanghai Cancer Center and with informed consent from all patients. All procedures were performed in accordance with the Declaration of Helsinki and relevant policies in China.
Results
Quality Control of Data Sets
Good quality control is essential for successful microarray data analysis. In the present study, we used NUSE boxplots and residual plots to assess the quality of the data set. As described in the literature, if a microarray data set is very reliable, its NUSE values will be very close to one. Our results (Figure 1A) show that this was the case for our data set. The residuals were distributed uniformly (shown in red and blue in Figure 1B), which further confirmed the quality of the data set.
Figure 1. Evaluation of the quality of the microarray data set. (A) Normalized unscaled standard error (NUSE) boxplot for all samples. (B) Plot of residuals, which also confirmed the quality of the microarray data set.
Screening of DEGs
After quality assessment, 20,460 genes were mapped to probes for each CEL file. We then identified 109 DEGs between chemoresistant and chemosensitive samples, of which 64 (58.7%) were upregulated and 45 (41.3%) were downregulated (Table S1). Genes with |logFC| >1 and adjusted P<0.05 are marked in red in the volcano plot in Figure 2A.
A heat map of these DEGs is presented in Figure 2B, and the top ten DEGs are listed in Table 1.
Figure 2. Expression levels of genes and distributions in microarray data set. (A) Volcano plots of microarray data. Y-axis represents log2 FC. X-axis represents adjusted P-value (chemosensitive vs chemoresistant samples). Red dots indicate DEGs. (B) Heatmap of DEG clustering. Green represents downregulation and red represents upregulation.
Table 1. Top Ten Differentially Expressed Genes Between Chemosensitive and Chemoresistant Samples
Gene            Log(FC)     Adjusted P-value    P-value
PRAMEF12        1.207071    7.54E-11            3.69E-15
HIST1H4D        1.424006    2.14E-08            1.46E-11
DIO3OS          −1.07802    2.58E-08            2.50E-11
LINC00669       1.009978    2.31E-06            9.81E-09
RAB43           1.398519    3.31E-06            1.70E-08
RAB3A           1.048865    6.36E-06            4.04E-08
LOC101559451    1.489303    1.04E-05            7.36E-08
HIST1H2BJ       1.296716    1.79E-05            1.59E-07
MIR205          1.289967    1.82E-05            1.63E-07
CYP27B1         1.384973    1.88E-05            1.71E-07
Significant Functions and Pathway Enrichment Analysis
Potential biological functions associated with these DEGs may imply intrinsic chemoresistance mechanisms. We used the online software DAVID to identify the representative GO categories and KEGG pathways. The top ten most enriched GO terms according to P-values and numbers of enriched pathways are shown in Figure 3 and Table 2. These GO terms included response to stimulus, cell communication, positive regulation of biological process, and cellular component organization or biogenesis. Many of these terms are closely related to chemoresistance and tumorigenesis. The four KEGG pathways that were enriched are presented in Figure 3 and Table 2. These included the PI3K-AKT signaling pathway and the Rap1 signaling pathway, which may have important roles in chemoresistance in HGSOC patients.
Figure 3. Representative GO categories and KEGG pathways obtained using DAVID.
Table 2. Enriched GO Terms and KEGG Pathways for the Identified DEGs
Category         Term                                                          P-value
GOTERM_BP_ALL    GO:0048518: positive regulation of biological process         0.001648
GOTERM_BP_ALL    GO:0051716: cellular response to stimulus                     0.008628
GOTERM_BP_ALL    GO:0016043: cellular component organization                   0.014114
GOTERM_BP_ALL    GO:0050896: response to stimulus                              0.014516
GOTERM_BP_ALL    GO:0044700: single organism signaling                         0.018674
GOTERM_BP_ALL    GO:0044699: single-organism process                           0.018953
GOTERM_BP_ALL    GO:0023052: signaling                                         0.021613
GOTERM_BP_ALL    GO:0071840: cellular component organization or biogenesis     0.021909
GOTERM_BP_ALL    GO:0007154: cell communication                                0.022512
GOTERM_BP_ALL    GO:0050789: regulation of biological process                  0.045651
KEGG_PATHWAY     hsa04261: adrenergic signaling in cardiomyocytes              0.016833
KEGG_PATHWAY     hsa04151: PI3K-AKT signaling pathway                          0.03826
KEGG_PATHWAY     hsa05203: viral carcinogenesis                                0.040515
KEGG_PATHWAY     hsa04015: Rap1 signaling pathway                              0.043034
PPI Network Analysis and Hub Genes Screening
To determine the interactions among the identified genes, we constructed a PPI network using the online tool STRING. Only interactions with a combined score >0.15 were included in the network. As shown in Figure 3A, we obtained a network containing a total of 213 PPI relationships and 97 nodes after removing the effects of free protein pairs, accounting for 89% of all DEGs. In PPI networks, hub genes, which interact with many other genes, can exert large effects on tumorigenesis. Owing to their key positions, hub genes are potential drivers of disease. To identify the key genes in chemoresistance of HGSOC, the cytoHubba plugin for Cytoscape was used to screen the hub genes.
As shown in Figure 3B and Table 3, we obtained ten hub genes (nodes colored red or orange in the figure) according to their scores; these were proto-oncogene c-Kit (KIT), forkhead box protein M1 (FOXM1), fibroblast growth factor 2 (FGF2), H4 clustered histone 4 (HIST1H4D), zinc finger protein FOG family member 2 (ZFPM2), IFN-induced protein with tetratricopeptide repeats 2 (IFIT2), cyclin O (CCNO), matrix Gla protein (MGP), rho-related BTB domain-containing 3 (RHOBTB3), and cell division cycle 7 (CDC7). The hub genes identified in the PPI network analysis may be key players in the development of cancer chemoresistance. Table 4 provides a brief overview of the functions of these hub genes. In addition, we tested the expression of hub genes using a data set from The Cancer Genome Atlas (Figure S1); however, the expression of some hub genes in this data set, including KIT, did not match the results of the current bioinformatics analyses. This discrepancy was probably due to the small size of the data set or to changes in the genome during the development of platinum resistance in ovarian cancer patients.
Table 3. Identified Hub Genes in the PPI Network
Rank    Gene        Score
1       KIT         28
2       FOXM1       22
3       FGF2        18
4       HIST1H4D    16
5       ZFPM2       15
6       IFIT2       14
7       CCNO        13
7       MGP         13
9       RHOBTB3     11
10      CDC7        10
Table 4. Overview of the Functions of Top Ten Hub Genes
Gene    Mechanism of Action and Function    References