Abstract Simple Summary A mass spectrometry (MS) proteomics and molecular pathway study was applied to serum samples of patients with ovarian serous carcinoma administered the FOLFOX-4 drug combination protocol, before the second cycle of therapy. This exploratory study aimed at identifying a protein panel that could be significantly modulated during two different collection time intervals and associated with patient response to therapy. The label-free differential MS proteomic analysis of 14 serum samples was conducted and identified 291 shared expressed proteins; 12 proteins resulted in being significantly associated with response to treatment and time of sample collection. The network enrichment analysis performed through STRING and other bioinformatic tools provided a metadata validation of the panel in which the identified proteins were related to resistant ovarian cancers at the molecular level. We concluded that the discovered protein panel that guided the identification of the associated molecular pathways could be further explored in a higher number of patients. Considering the lack of biomarkers that can guide the selection of further therapeutic approaches after drug resistance appearance, our study may suggest a new direction in the discovery and validation of a protein panel as biomarkers for future clinical application. Abstract Ovarian cancer is a highly lethal gynecological malignancy. Drug resistance rapidly occurs, and different therapeutic approaches are needed. So far, no biomarkers have been discovered to predict early response to therapies in the case of multi-treated ovarian cancer patients. The aim of our investigation was to identify a protein panel and the molecular pathways involved in chemotherapy response through a combination of studying proteomics and network enrichment analysis by considering a subset of samples from a clinical setting. Differential mass spectrometry studies were performed on 14 serum samples from patients with heavily pretreated platinum-resistant ovarian cancer who received the FOLFOX-4 regimen as a salvage therapy. The serum was analyzed at baseline time (T0) before FOLFOX-4 treatment, and before the second cycle of treatment (T1), with the aim of understanding if it was possible, after a first treatment cycle, to detect significant proteome changes that could be associated with patients responses to therapy. A total of 291 shared expressed proteins was identified and 12 proteins were finally selected between patients who attained partial response or no-response to chemotherapy when both response to therapy and time dependence (T0, T1) were considered in the statistical analysis. The protein panel included APOL1, GSN, GFI1, LCATL, MNA, LYVE1, ROR1, SHBG, SOD3, TEC, VPS18, and ZNF573. Using a bioinformatics network enrichment approach and metanalysis study, relationships between serum and cellular proteins were identified. An analysis of protein networks was conducted and identified at least three biological processes with functional and therapeutic significance in ovarian cancer, including lipoproteins metabolic process, structural component modulation in relation to cellular apoptosis and autophagy, and cellular oxidative stress response. Five proteins were almost independent from the network (LYVE1, ROR1, TEC, GFI1, and ZNF573). All proteins were associated with response to drug-resistant ovarian cancer resistant and were mechanistically connected to the pathways associated with cancer arrest. These results can be the basis for extending a biomarker discovery process to a clinical trial, as an early predictive tool of chemo-response to FOLFOX-4 of heavily treated ovarian cancer patients and for supporting the oncologist to continue or to interrupt the therapy. Keywords: ovarian cancer, FOLFOX-4, mass spectrometry proteomics, serum samples, time lapse detection, network enrichment analysis, cancer molecular pathways, protein panel 1. Introduction Ovarian cancer is the most lethal gynecological malignancy, and despite the relatively high response rate to first-line standard chemotherapy, most patients develop recurrent disease [[58]1]. Treatment is performed with carboplatin in combination with paclitaxel, and then other chemotherapeutic drugs are used such as topotecan and gemcitabine as second line therapy, as well as pegylated liposomal doxorubicin hydrochloride (PLDH), alone or alongside platinum drugs [[59]2]. More recently, antiangiogenetic drugs or PARP inhibitors have entered into clinical trials and therapy [[60]3]. Patients who progress or relapse within 6 months after the end of the treatment have been reported to have the worst outcomes [[61]4]. Therefore, these patients often undergo multiple chemotherapy courses to try to achieve long-term remission and an acceptable quality of life. This approach raises the risk of cumulative toxicity and/or absence of response. For this reason, new, effective, and less toxic therapies for patients with recurrent and persistent disease are needed. Some in vitro studies have indicated a potential synergy between oxaliplatin and 5-fluorouracil (5-FU)/leucovorin (FOLFOX-4) [[62]5,[63]6]. While oxaliplatin is an alkylating agent, 5-FU is the prodrug of the widely used inhibitor of thymidylate synthase (TS), namely 5-fluoro-deoxyuridine-5′-monophosphate, (FdUMP) and it is incorporated in mRNA, thus, affecting DNA and protein synthesis [[64]7]. Leucovorin, i.e., folinic acid, has been reported to favor TS inhibition and subsequent DNA-directed effects, because this folate analog is readily converted into 5,10-methylenetetrahydrofolate, the TS cofactor needed for ternary-complex formation [[65]8,[66]9]. This combination represents a standard regimen in the management of some advanced tumors such as colorectal (CRC), gastric, and breast cancers [[67]10,[68]11,[69]12,[70]13], and more recently, as salvage treatment in refractory or resistant ovarian cancer [[71]14]. Circulating biomarkers detected in the serum of ovarian cancer patients are involved in either the cause of a malignancy or a systemic response to a malignancy. These factors may originate from several sources including the tumor itself, the surrounding stroma, or systemic tissues involved in the host response. It is unclear how circulating biomolecules are biochemically related to their respective producing sources and how their expression change is associated with cancers. In general, the overlap between the expression of proteins in cancer cells and in serum is limited. In the case of ovarian cancer, a tissue-derived effect in serum has been suggested [[72]15]. Investigations of chemotherapy mechanisms and the discovery of the factors responsible for chemotherapy response and resistance may provide a selection of alternative therapeutic options or chemo-sensitizing agents [[73]16,[74]17]. Several techniques, including MALDI-TOF MS, ESI-TOF MS, and associated bioinformatics analysis, have been proposed to investigate proteome changes during cancer treatment [[75]18,[76]19,[77]20,[78]21,[79]22,[80]23,[81]24]. Almost all the available studies have been performed on samples collected before treatments, and only a few investigations have been run on samples collected and analyzed sometime after treatment and compared with data before treatment to correlate the changes in protein levels with patients responses [[82]18,[83]19]. We wonder if we could apply a similar approach to a difficult case, such as in serum samples derived from heavily pretreated ovarian serous carcinomas patients observed in a routine clinical practice [[84]14]. In this study, the chosen patients were resistant to platinum drugs and had undergone chemotherapy with different drug combinations, and then the FOLFOX-4 regimen as a salvage therapy. According to our methodology, we started with an experimental approach for a serum sample MS analysis by considering the differentially expressed proteins (DEP) between T0 and T1 in two groups of patients, i.e., nonresponders (NR) and partial responders (PR), and an association of the protein changes with patients’ responses was found. Finally, we compared the experimental results with the results of a metadata analysis, which supported the biological significance of the experimental finding. We considered that the protein panel identified was a useful suggestion for performing a biomarker discovery on a larger number of patients. The overall approach is described in [85]Figure 1. Figure 1. [86]Figure 1 [87]Open in a new tab Workflow of the protein identification process in ovarian cancer serum samples performed through label free differential mass spectrometry analysis and integration with the bioinformatic analysis. 2. Materials and Methods 2.1. Experimental Design The current study is a retrospective translational analysis of serum samples from patients who received FOLFOX-4 as salvage chemotherapy for the treatment of recurrent ovarian carcinoma (OC). The medical scientific committee of the IRST Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori approved the study (protocol # 5108/V.3) [[88]14]. All the patients shared the following characteristics from baseline serum collection (T0) to T1 collection: (i) no progression of the disease after the first cycle of FOLFOX-4 administration (T1); (ii) no adverse effect reported after T1; (iii) distinction between PR and NR was assessed according to the guidelines [[89]25,[90]26], with a follow-up evaluation 6 months after the FOLFOX-4 cycle. The details are reported in [91]Supplementary Materials S1 and Table S1. The MS translational study consisted of collecting two serum samples from the patients, in which the baseline sample was before treatment at T0 and the second sample was before the second cycle of treatment (T1). We selected 7 patients who shared a similar pharmacological treatment and, overall, evaluated 14 serum samples. According to our methodology, we started with the experimental approach to serum sample MS analysis, and then we statistically reduced the number of proteins that resulted in being significantly differentially expressed to a panel of 12 proteins. On the resulting protein panel, we performed a bioinformatic analysis through network enrichment proteomic tools (see [92]Section 2.5 below). 2.2. Serum Immunodepletion of Albumin and IgG’s A ProteoPrep Immunoaffinity Albumin and IgG Depletion Kit (Sigma-Aldrich) was employed to remove albumin and IgG from the collected serum samples and to enrich the samples with low abundant proteins (LAPs). First, the storage solution was removed, and the columns were equilibrated, as reported by the vendor. Then, 50 µL of serum samples were diluted with equilibration buffer and loaded onto the wet columns with 10 min incubation to promote binding between albumin/IgG and the resin. To maximize the recovery of the unbound protein fractions (i.e., LAPs), the columns were washed with 125 µL of equilibration buffer and centrifuged for 60 s. The obtained aliquot contained the majority (>95%) of the LAPs (checked by SDS-page). To collect the remaining trace amounts of unbound proteins from the column, 0.4 mL of ProteoPrep immunoaffinity equilibration buffer were added and centrifuged at 8000 × g for 60 s. The obtained LAP fractions were quantified with a Bradford assay on 96-wells plate. 2.3. Protein Digestion via Filter-Aided Sample Preparation (FASP) First, 0.2 mg of LAPs were incubated with 20 µL of 0.1 M dithiothreitol (DTT, Sigma-Aldrich, St Louis, MO, USA), 0.2% Protease Max trypsin enhancer (Promega, Madison, WI, USA) in 50 mM ammonium bicarbonate (ABC, Sigma-Aldrich) on Microcon YM-30 (Millipore, Burlington, MA, USA) filters at 55 °C for 30 min, and centrifuged at 14,000× g for 15 min. Then, 200 µL of UA buffer (8 M urea in 0.1 M Tris/HCl, pH 8.5) and 100 µL of 50 µM iodoacetamide IAA (Sigma-Aldrich, St Louis, MO, USA) were added, in two steps, to reduce disulfides and alkylate cysteines, followed by centrifugation at 14,000× g for 15 min. The samples were digested overnight with 4 µL of 1 mg/mL trypsin solution (Promega Corporation, Madison, WI, USA) in ABC at 37 °C, 250 rpm. After adding 40 µL ABC, FASP filters were centrifuged again at 14,000× g for 30 min to collect hydrolyzed peptides. Finally, the samples were acidified with CF[3]COOH, desalted with C18 SPE cartridge (7 mm × 3 mL, 3M Empore, Maplewood, MN, USA), and dried down (Speedvac, Eppendorf, Hamburg, Germany). The samples were stored at −80 °C until MS analysis. 2.4. LC-MS/MS Label-Free Quantification The MS analysis was performed with an nLC coupled online with Impact HD™ URH-TOF (Bruker Daltonics GmbH, Bemen, Germany). The samples were resuspended in mobile phase, quantified by NanoDrop assay, and analyzed using an ultra-high resolution (UHR) Impact HD^TM QTOF mass spectrometer, at the University of Milano Bicocca, Monza. Each sample was injected three times and separated with a Dionex nRSLC (Rapid Separation LC nano, ThermoScientific, Sunnyvale, California, USA) before on-line desalting. A multistep gradient ranging from 4 to 98% acetonitrile in 240 min at a flow rate of 300 nL/min was applied [[93]24]. Peptides were analyzed in data-dependent acquisition (DDA) mode and both an internal mass calibration segment (Na Formate, 15 min length, before the beginning of the gradient and the injection of the actual samples) and a 1221.9906 m/z lock mass, during the proper sample analysis, were employed in each run to retain mass accuracy. Raw spectral data from DataAnalysisTM v.4.0 Sp4 (Bruker Daltonics, Germany, GmbH, Bemen, Germany) were processed with the Mascot search engine, through Mascot Daemon. Database matching was restricted to human SwissProt (October 2019, 561,356 sequences and 201,858,328 residues) [[94]27]. The searching parameters were set as follows: trypsin as enzyme, carbamidomethyl (C) as fixed modifications, oxidation (M) as variable modification, 20 ppm mass tolerance for MS^1 and 0.05 Da for fragments. An automatic decoy database search for FDR calculation and a built-in percolator algorithm for rescoring peptide-spectrum matches were applied [[95]28]. The Progenesis QI for proteomics v. 2.0 (Nonlinear Dynamics, Newcastle, England) platform was used for label-free data elaboration and the determination of the normalized relative abundances of identified proteins and peptides [[96]29]. Briefly, the alignment process was conducted based on the ion intensity maps of all imported runs. To compensate for between-run variation in LC separation, and to maximize the overlay across the data, alignment scores above 60% were accepted per each run. All matched proteins were used for total ion current (TIC) normalization. The Mascot software search engine was used for protein identification, setting the “search parameters” software option as described above. Only non-conflicting peptides were selected for determining the fold change to prevent the overlapping of trends derived from different proteins that share the same peptides. The results were validated with an UHPLC coupled to an Orbitrap Q-Exactive equipped with a micro-ESI (Thermo Fisher Scientific, Waltham, MA, USA). Two samples (F10 and F12) at T0 were processed as described in [97]Section 2.3 and [98]Section 2.4 and analyzed in replicate. Inclusion lists of 3–4 unique peptides per each DEP were generated with PeptideAtlas [[99]30] and included in a target list of the software (XCalibur, Thermo Fisher). Peptides were separated with 180 min gradient in a C18 Hypersil Gold column, 100 × 2.1 mm, 1.9 µm column (Thermo Fisher Scientific, Waltham, MA, USA) in ddMS^2 acquisition mode (Top8). Peak lists were analyzed with Mascot Matrix for protein matching and Progenesis QI Proteomics (Nonlinear Dynamics, Newcastle, UK) [[100]31,[101]32]. One-way ANOVA was employed to perform the differential analysis. The MS parameters and statistical analysis of peptides/proteins used for the semitargeted validation experiment are described in [102]Supplementary Materials, Tables S2 and S3. 2.5. Proteins Network Analysis The STRING ([103]www.string.org) [[104]33] (accessed on 2 October 2021) protein–protein interaction networks functional enrichment analysis was performed as follows: the input proteins were the 12 selected proteins plus thymidylate synthase (TYMS) and dihydrofolate reductase (DHFR). The addition of TYMS and DHFR is justified by the fact that 5-FU nucleotide targets TYMS and the TS cycle, and therefore, also affects DHFR concentrations. Leucovorin counteracts DHFR activity of the dihydrofolic acid reduction effect. The analysis settings adopted were the following: multiple proteins mode; interaction score in the range 0.400–0.700 (medium to high confidence), depending on the experiment’s objectives; all selection criteria checked, excluding neighborhood; 1st shell max 10 interactions; 2nd shell max 50 interactions; confidence mode represented by the edge thickness related to the strength of the confidence (the highest, the larger). Other specific parameters included: number of nodes, 73; number of edges, 409; average node degree, 9.86; average local clustering coefficient, 0.663: expected number of edges, 174; PPI enrichment p-value <1.0 × 10^−16. We observed that by changing the parameter settings, the global network did not change regarding the main biological pathways present during the STRING enrichment process. The SOD-GPX redox biological process was always present together with the vesicles and trafficking system and the lipoprotein network. 2.5.1. Proteins Localization Gene localization was analyzed according to UniProtKB [[105]34] and the Compartment Subcellular localization database, as well as cellular component ontologies visualized by the Gene Ontology (GO) Consortium (GeneCards, [106]https://www.genecards.org, accessed on 2 October 2021) [[107]35]. Subcellular localizations from compartment localization data were integrated from the literature manual curation, high-throughput microscopy-based screens, predictions from primary sequence, and automatic text mining [[108]36]. Unified confidence scores of the localization evidence were assigned based on evidence type and source and ranged from 1 (low confidence) to 5 (high confidence) [[109]37]. The assumption was made that the proteins in the extracellular space could be present in the serum [[110]38]. Original content and additional information can be found at the Human Protein Atlas available at [111]www.proteinatlas.org [[112]39] and GeneCards [[113]35]. 2.5.2. Local Networks Local network interactions generated for each of the 12 proteins of the panel were obtained through STRING upon elaboration of data from GeneCards databases, Reactome ([114]https://reactome.org, accessed on 2 October 2021) [[115]39], and Panther ([116]http://www.pantherdb.org, accessed on 2 October 2021) [[117]40]. For each protein, the minimal number of interactions were considered that could explain the connections among the serum protein, the membrane protein, and at least one intracellular protein. Relevant interconnections are based on a confidence score >0.700 and p-value <1 × 10^−16. 2.6. Statistical Methods The sample size was chosen according to the availability of heavily treated OC patients who agreed to participate in the study. Despite the low numerosity of the sample to statistically validate a single biomarker (i.e., this study is intended to provide only a first step into drug resistance circulating markers); the statistical approach adopted during MS analysis and metadata integration have been shown to provide robust outcomes [[118]21]. All the serum samples were treated in duplicate and analyzed in triplicate in LC-MS to eliminate both biochemical and instrumental biases. Statistical analyses were conducted using R language and environment for statistical computing (R Foundation for Statistical Computing, Vienna, Austria). A type I error of 5% was taken as the limit for two-sided p-value statistical significance and all confidence intervals (CI) were reported as 95% CI. Differential protein abundance intended as fold change (FC) over time between patient response stratification was analyzed by mean of a paired t-test. A volcano plot was adopted to show the results of the FC analysis and to highlight which proteins had a statistically significant behavior change. Response to treatment contribution over time and its interaction with the different timepoints was investigated by mean of mixed-effect analysis of variance (ANOVA). 3. Results 3.1. Study Design First, we developed the MS study to identify the detectable proteins, their characterization, and their chemical properties such as MW and abundance. Then, statistical analyses were performed to identify the DEPs and selection criteria were applied to pass from about 12,000 proteins detected overall during the MS experiments (observations) down to 291. After subsequent selection steps that included time response (T0–T1) and PR or NR group response, 12 proteins (protein panel) were finally identified and their biological roles were described in the context of cancers/ovarian cancer pathology. Then, their profiles were studied in terms of time and response and reported in an expression profile representation. The second level of investigation was based on bioinformatic analysis of the metabolic network of the 12 identified proteins to characterize their interconnections in human cells and their localizations in cells, on cellular membranes and outside cells, by database analysis (STRING, GeneCards, Reactome and Panther). The process, first, allowed the description of a global network, then a local network around each protein of the panel was identified, and their intra-/extracellular localizations were characterized. The third level was the annotation of the biological role of each of the selected proteins in the cell response to FOLFOX-4, through metadata analysis and their connection with ovarian cancer. The workflow design is reported in [119]Figure 1. 3.2. Protein Identification and Characterization (First Analysis Level) 3.2.1. Label-Free MS Proteomic Approach Normalized protein abundance was the main parameter on which we analyzed 291 DEPs detected over two serum samples collected at T0 and T1 for each of seven patients (total 14 samples analyzed) and processed in three technical replicates, which resulted in a total of 12,222 observations. Then, these proteins were analyzed to identify DEPs between response groups. Firstly, we examined the differential protein abundance as fold change (FC) over time between PR and NR and a paired t-test was applied considering an alfa error <0.05 for statistical significance. The results from this analysis were reported by means of a volcano plot ([120]Figure 2a), and only a small subset of proteins (13/291) showed reproducible statistically significant FC between the PR group versus NR group. The LCAT protein, in [121]Figure 2a, has no label because its FC is under lower threshold (FC = 0.9982). LCAT (phosphatidylcholine-sterol acyltransferase) is indicated with the blue dot, which exceeds the limit of p-value, but not that of FC. Ten out of fourteen DEPs showed a reduction of log[2]FC(T1/T0) in the PR group with respect to NR, confirming that downregulated proteins outnumbered upregulated proteins ([122]Table 1, right column). As expected, a confirmation of DEPs identified ([123]Table 1) that reported a statistically significant differential expression between PR and NR groups at time T0 and T1 was obtained also with the ANOVA test ([124]Table 2). Figure 2. [125]Figure 2 [126]Open in a new tab (a) Volcano plot showing -log[10](p-value) versus log[2](FC[R]/FC[NR]). Horizontal lines indicate 0.05 (blue) and 0.01 (red) p-values. Proteins statistically significant (p < 0.05) and with a FC > 1 were reported alongside with their names. Protein in red dot fits the FC and statistical significance criteria, blue dot fits only the statistical criteria, the green dot fits only the FC criteria, and the grey dot does not fit either criteria. Proteins over the blue dashed line showing p < 0.05 are reported in [127]Table 1. Data for each protein were taken from the protein identification table of the MS analysis elaboration. ([128]Supplementary Materials doi.org/10.15490/fairdomhub.1.datafile.4074.1); (b) log[2] protein abundance expression profile between T0 and T1 relative to the 10 proteins reported in [129]Table 2. SOD3 and VPS18, taken from [130]Table S5a, are added as an example of non-intersecting proteins. The red color is related to NR patients, while the black color is related to PR. When the red and black lines intersect, it is intended that the contribution of the time in response status is relevant and, consequently, the interaction between response and treatment as well. Data are elaborated from the file named “Report progenesis _all rows_all data for biostatistics.doi.org/10.15490/fairdomhub.1. datafile.4074.1.” Table 1. Statistically significant DEPs between PR versus NR group at time T0 and T1. Up- or downregulation protein abundance was established by mean of a paired t-test. Statistically significant p-values are reported together with the relative FC (proteins reported in [131]Figure 2a showing p < 0.05). Protein PR Group log2(T1/T0) Mean FC NR Group log2(T1/T0) Mean FC t Test p-Value PR/NR Regulation APOL1 −1.5447 0.5949 0.0156 Downregulation CO8A −0.0363 −2.1393 0.0328 Upregulation FA12 −0.2436 0.9692 0.028 Downregulation FCGBP −0.79 1.3221 0.0023 Downregulation GELS −0.5374 0.8473 0.0212 Downregulation HABP2 −0.6816 1.2037 0.0244 Downregulation IGHG1 −1.2833 0.8845 0.0105 Downregulation IGM 0.6886 −0.504 0.0345 Upregulation LCAT 0.8256 −0.1726 0.0345 Upregulation LMNA −0.509 0.6646 0.0465 Downregulation LYVE1 −1.8428 1.1198 0.0051 Downregulation PGRP2 0.5958 −0.902 0.0079 Upregulation ROR1 −0.5573 1.0462 0.0361 Downregulation ZN573 −0.684 1.0665 0.0389 Downregulation [132]Open in a new tab Table 2. Proteins selected from those differentially expressed in [133]Table S4 and ANOVA statistic for time and response. Interaction Response to Treatment Timepoint Response × Timepoint Proteins F Value p-Value F Value p-Value F Value p-Value APOL1 0.06 0.8230 1.2 0.3241 12.92 0.0156 GELS 4.47 0.0881 1.5 0.2746 10.96 0.0212 GFI1 4.06 0.1001 0.27 0.6267 8.6 0.0325 LCAT 0.17 0.6939 2.22 0.1967 8.3 0.0345 LMNA 2.57 0.1699 0.54 0.4969 6.92 0.0465 LYVE1 0.07 0.8054 0.24 0.6471 22.65 0.0051 ROR1 0.3 0.6056 2.15 0.2024 10.51 0.0229 SHBG 0.01 0.9347 0.16 0.7064 6.74 0.0485 TEC 0.1 0.7751 9.81 0.0520 25.57 0.0149 ZNF573 0.05 0.8327 1.12 0.3380 8.41 0.0338 [134]Open in a new tab To confirm the identified protein list reported in [135]Table 2, we applied a semitargeted approach in which the proteotypic peptides of samples F10 and F12 at T0 were analyzed ([136]Supplementary Materials S2 and Table S2, Table S3). The samples were selected because they showed the largest differences in the protein profiles (raw AUC quantification of non-conflicting peptides), as represented in the raw data report (doi.org/10.15490/fairdomhub.1.datafile.4074.1). The sample analysis revealed the presence of the 12 proteins selected from this study ([137]Table 2). Data elaboration was robust and consistent with the primary label-free MS investigation reported in the manuscript. The results suggest that single proteotypic peptides in the LC-MS/MS inclusion list can be used as a means to estimate the abundance of the corresponding entire protein in serum samples (semitargeted approach). The description of the work is reported in the in the raw data report (doi.org/10.15490/fairdomhub.1.datafile.4074.1) associated with this manuscript (see data availability statement). 3.2.2. Protein Selection at T0–T1 Timepoints Initially, we proceeded by statistically analyzing the DEP by means of a linear mixed-effect analysis of variance (ANOVA) that allowed us to evaluate the independent contribution of time (T0–T1, timepoint) and response or their iterative effect on protein expression ([138]Supplementary Materials, Tables S4 and S5). Protein abundance in log[2] scale was chosen as the dependent variable, while time and response to therapy were included as independent covariates with a fixed effect. Lastly, the same study was conducted considering a random effect. The results were interpreted as statistically significant when beta error was ≤0.20 and alfa error was <0.05. Response to treatment has a statistically significant contribution on differential expression of five proteins ([139]Table S5a), otherwise protein differential expression of a different set of 27 proteins was identified as mainly driven independently by timepoint ([140]Table S5b). Lastly, 20 proteins demonstrated an iterative effect of treatment response and timepoint on their differential expression ([141]Table S5c). [142]Table S5a shows proteins that are different from those present in [143]Table S5c. If we consider the baseline sample, the proteins associated with patient outcome, and therefore, those which are potentially able to predict which patients can respond to therapy or not, are different from those suggesting that the patients are responding to therapy after the beginning of the same. Thus, the two protein panels have a different meaning, with the latter being of more clinical interest. A time-dependent ANOVA between NR and PR samples (T0 vs. T1 timelapses) evidenced a list of significant proteins ([144]Table S5c), which was submitted to further analysis. For the composition of the final panel, we progressed with the selection based on the biological significance of the 20 proteins and their clinical relevance in cancer processes. Then, during the network enrichment analysis process we added two proteins, SODE (SOD3) and VPS18, as they were differentially expressed based on response to treatment ([145]Table S5a), and they played important roles in cellular pathways related to cancer protection from reactive oxygen intermediates (SOD3) and autophagy (VPS18), respectively, ([146]Table 3) [[147]41]. 3.2.3. Role of the Selected Proteins in Cancer Processes A metadata investigation was performed to validate the proteins in the panel. The up- or downregulation trend for each of the 12 proteins of the final panel at T0 and T1, is reported in [148]Figure 2b. These trends are statistically significant and specifically influenced by one or both variables (time and response) or by their interaction, which is the specific case when they intersect. An intersection of the trends indicates that one of the two variables influences the state of the other. In our case, the state of time (T0 and T1) influences the state of the answer. Indeed, the group that had an overexpression at T0 decreased at T1, and vice versa. Only GSN, SOD3, and VPS18 proteins over 12 did not intersect. The biological properties of the 12 proteins of the panel are summarized in [149]Table 3 and [150]Table S6. The biological roles of the 12 proteins and the variations of their expression with respect to patient outcome and their roles in ovarian cancer are given below. The lymphatic vessel endothelial hyaluronan receptor-1 (LYVE-1) is a hyaluronic acid receptor, which is selectively expressed in the endothelium of lymphatic capillaries [[151]42]. Serum low LYVE-1 levels have been significantly associated with the occurrence of distant metastases in some cancers [[152]43]. LCAT and sex hormone-binding globulin (SHBG) have been recorded as differentially expressed between PR and NR patients. SHBG is present in serum and plasma (GeneCards). Although SHBG was not associated with overall risk of ovarian cancer in one recent study [[153]44], both LCAT and SHBG downregulation have been reported to provide important information on the aggressiveness of the ovarian cancer [[154]45]. This trend was also observed in our studies ([155]Figure 2b), where both proteins decreased in NR and increased in PR. It is worth noting that deregulated lamin-A/C (LMNA) expression has been associated with nuclear shape, mechanical stability, and migration ability of cells in ovarian cancer [[156]46,[157]47]. In our experiments, LMNA increased between T0 and T1 in NR patients and decreased in PR patients ([158]Figure 2b), therefore, the trend agreed with these studies. Two members of the tyrosine kinase family, non-receptor tyrosine-protein kinase Tec (TEC) and receptor tyrosine kinase-like orphan receptor 1 (ROR1), were found differentially expressed in PR with respect to NR, in baseline samples. TEC kinase, together with other proteins, play a role in the intracellular signaling of both B and T lymphocytes, relevant cells that contribute to the tumor microenvironment which is increasingly interested in controlling cancer growth [[159]48]. ROR1 overexpression has been associated with a poor prognosis in several solid and hematological malignancies, including ovarian cancer [[160]49,[161]50] and other malignancies [[162]51]. The same trend was also observed for ROR1 that showed higher expression at T1 in NR, while its expression was lower in PR ([163]Figure 2b). We observed lower gelsolin (GSN) levels in sensitive (PR) patients as compared with their chemo-resistant counterparts (NR). GSN is considered to be one important determinant for chemo-resistance, probably due to inhibition of the apoptosis pathways [[164]52,[165]53]. SOD3 (extracellular superoxide dismutase [Cu-Zn]) is the predominant antioxidant enzyme secreted into the extracellular space, that affects drug delivery, and chemotherapeutic effect on tumors [[166]41,[167]54]. It was selected because it significantly correlated with response to treatment ([168]Table S5b), but its level change was not correlated with time T0–T1. We observed higher SOD3 levels in PR patients with respect to NR patients in samples before treatment ([169]Figure 2b), in agreement with the reported findings in the literature [[170]54]. In our analysis, two zinc finger proteins were found to be differentially expressed between PR and NR patients: zinc finger protein (GFI1) and zinc finger protein 573 (ZNF573). In both cases, the two proteins demonstrated a significant statistical effect of treatment response and timepoint ([171]Table S5c) on their differential expression. The protein encoded by ZNF573 decreased in PR, whereas increased in NR, at T1 time, suggesting a possible role of this protein in the response to treatment. The function of ZNF573 is still undetermined, and only a recent study has suggested that ZNF573 may be involved in cervical cancer activating the cancer progression through the regulation of transcription or structural integrity of cells [[172]55]. We observed a similar trend for GFI1. Two recent studies reported that GFI1 has been shown to favor the survival of myeloid cells in myeloproliferative disease [[173]56] and tumor maintenance in medulloblastoma [[174]57]. Vacuolar protein sorting-associated protein 18 homolog (VPS18) was selected because it was significantly correlated with response to treatment ([175]Table S5b). It was found to be overexpressed in PR with respect to NR in patients’ samples before treatment and was slightly increased between T0 and T1 in both cases ([176]Figure 2b). It is well known that VPS18 plays a key role in vesicle-mediated protein trafficking to lysosome including the endocytic membrane transport and autophagic pathways [[177]58,[178]59]. The role of apolipoproteins in cancer has not been explored deeply yet. In our analysis, we found only apolipoprotein L1 (APOL1) differentially expressed between PR and NR patients. APOL1 is a secreted high-density lipoprotein, which binds to apolipoprotein A-I. It has been characterized as a novel Bcl-2 homology domain 3 (BH3)-only lipid binding protein [[179]60,[180]61]. In our studies, APOL1 decreased at T1 in PR with respect to NR ([181]Figure 2b). Summarizing, the 12 proteins of the selected panel, considered to be relevant in the statistical analysis, are also confirmed to be relevant by metadata analysis. In fact, experiments in the literature have suggested that the proteins of the panel and their trends have also been similarly found in other ovarian cancer studies. In the next steps, we analyze how the 12 proteins are connected and which biological processes are involved using the enrichment network analysis approach. An RI cluster analysis with a zero inflated model was applied to the 12 DEPs to analyze their behaviors both at T0 only and with a differential model between T0 and T1. The most significant clusters identified were APOL1 + GFI1 + LYVE1 (zI = 4.20) and LCAT + LMNA (zI = 3.25) at baseline time only, whereas the cluster GFI1 + LCAT + LMNA was identified with a T0–T1 analysis with a zI = 3.67. If a Q-test was applied to discard the outlier data from proteomic triplicates (confidence interval >95%), the cluster APOL1 + GFI1 + LYVE1 emerged with a zI = 3.16. The results of the analysis are reported in [182]Supplementary Material S7 and Table S7. 3.3. Enrichment of the Cellular Network of the Selected Protein Panel (Second Analysis Level) Global Network Analysis The serum proteins relate to membrane proteins, and intracellular proteins are connected to each other through intracellular physical and functional networks. Therefore, it is relevant to identify the meaning of these connections. As a starting point, we characterized the panel of 12 proteins considering the overall metabolic network independently on the proteins’ localization, (cytosol, membrane, or serum) applying a global network analysis through STRING and GeneCards. This approach allowed the extraction of the necessary information for further explanation of metabolic changes in response to treatment. The first level of enrichment was the addition of thymidylate synthase (TYMS gene encoding for TS protein) and dihydrofolate reductase (DHFR). TS and DHFR represent the main proteins of the TS cycle, and therefore, it is expected that 5-FU (FOLFOX-4) modulates both [[183]9,[184]10,[185]62]. Our MS experimental conditions did not allow identification of either TS and DHFR, as they are difficult to detect in differential proteomic experiments on tissue cancer samples due to their nuclear compartmentalization and low physiological concentrations, despite their recognized relevant role in cancer and drug resistance [[186]62,[187]63]. TYMS has also been considered a potential prognostic biomarker of overall survival in patients with CRC, where high TYMS levels predict for low overall survival [[188]10]. The 12 selected proteins plus TS and DHFR were processed using STRING with their annotation to highlight any common biological processes in which they are involved and to identify their interconnections ([189]Figure S1). The UniProt entry names were used for the statistical over-representation test in STRING [[190]64]. An enrichment analysis was performed and resulted in up to 84 total proteins divided in the first shell (12 proteins submitted plus TS and DHFR, colored spheres in [191]Figure S1) and 69 extra proteins almost all belonging to the second shell (white color in [192]Figure S1). Despite different attempts to connect all the 12 proteins plus TS and DHFR of the panel during the enrichment process, a few of them remained unconnected, specifically, GFI1 and ZNF573, at the level of enrichment selected. Three additional proteins, ROR1, LYVE, and TEC, showed a very limited connection with only one protein of the global network. The other proteins very well interconnected. Then, we studied the biological processes using the gene ontology (GO) [[193]37] analysis of the pathways and biological processes and revealed that modulation of the cellular metabolism by FOLFOX-4 results from the combination of multiple layers of regulation. The overall network is dominated by the cellular organization biological process ([194]Figure 3, red spheres). Figure 3. [195]Figure 3 [196]Open in a new tab Global network visualization based on STRING pathway enrichment analysis of the 12 DEP proteins + TYMS and DHFR showing the most extended biological process containing the protein panel. Details are reported in the main text. Red spheres represent the cellular metabolism organization biological process with the following STRING features: GO:0016043 and FDR 51/5163. A detailed GO analysis showed that the four most relevant biological processes involving the protein panel are related to vesicle trafficking process, lipoproteins associated metabolic process, structural component modulation in relation to cellular apoptosis and autophagy, and cellular oxidative stress response ([197]Figure 4). These principal biological processes were well connected to the purine metabolism and apoptotic process generated by STRING around the 5-FU and leucovorin targets, i.e., TYMS and the TS cycle protein, DHFR. Figure 4. [198]Figure 4 [199]Open in a new tab Global network visualization based on the STRING pathway enrichment analysis of the 14 selected proteins (12 + TYMS or TS and DHFR). The network shows the most relevant biological process containing the protein panel. Details are reported in the main text. The STRING features are the following: yellow spheres, actin-filament based movement and regulation (GO:0030048, FDR 1.31 × 10^−7, 9/105); green spheres, cholesterol metabolic process (GO:FDR 2.3 × 10^−3); violet spheres, endosome to lysosome transport (GO:0008333, FDR 1.78 × 10^−8, 8/49); pink spheres, cellular response to oxidative stress (GO:0034599, FDR 5.36 × 10^−10, 14/22); red spheres, pteridine-containing compounds biological process (GO:0042558, FDR 9.37 × 10^−14, 11/33). A detailed GO analysis shows that the four most relevant biological processes involving the protein panel are related to vesicle trafficking process, lipoproteins associated metabolic process, structural component modulation in relation to cellular apoptosis and autophagy, and cellular oxidative stress response. The principal biological processes are well connected to the purine metabolism and apoptotic process generated by STRING around the 5-FU and leucovorin targets, i.e., TYMS and the TS cycle protein, DHFR. Our analysis was based on protein modulations detected in the serum samples and related to intracellular biological processes. With the aim to understand the interconnections between serum and intracellular networks and how this is rationally associated with cancer biology, we conducted a local network analysis (LnA), and then a localization characterization of the proteins of the panel was performed, stemming from the global network analysis of [200]Figure 4. LnA was performed through the identification of the minimal number of interactions that each protein of the panel ([201]Table 3) could establish with other interconnected proteins previously identified in the global network. A total of nine networks reported in [202]Figure 5 describe the serum-membrane-intracellular connections for all proteins of the panel. A few local networks (1, 2, and 4) involve more than one protein of the panel. The function of ZNF573 was not well known and its network (number 6) was identified using the co-expression STRING analysis, and then only a few proteins were found connected through co-expression and experimental analysis to TRIM28 (tripartite motif containing 28, a protein coding gene), and then to CDC5L (cell division cycle 5-like, a DNA-binding protein involved in cell cycle control) ([203]Figure 5). The detailed local specific protein-centered connections information is reported in [204]Table S4. Some proteins are recurrently present in the networks such as AKT1, which is present in four of nine networks reported in [205]Figure 5 (3, 5, 7, 8), while FOXO3, ACTA1, and APOA1 are present in two of nine networks. It is worth noting that AKT1, FOXO3, and APOA1 were not detected in the MS study, but are relevant in ovarian cancer development ([206]Table S8). The local network identified, reported in [207]Figure 5 are also found in the biological processes identified in the global network such as cholesterol metabolic process (LCAT and APOL1), actin-filament based movement and regulation (GSN, LMNA, and LCAT), endosome to lysosome transport and trafficking (VPS18), and cellular response to oxidative stress (SOD3) ([208]Figure 4). Figure 5. [209]Figure 5 [210]Open in a new tab Local network interaction generated for each of the 12 proteins of the panel obtained through STRING upon elaboration of data from [211]Figure 3 and GeneCards databases. The most relevant interconnections based on confidence feature (value of confidence >0.700 and p < 1.0 × 10^−16) are also visualized in [212]Figure 6. Each local network (1–9) shows the relevant protein connections written in the bottom, and proteins of the MS panel are reported in bold. SOD3 and VPS18, taken from [213]Table S5a, are added as internal control proteins. Localization of the proteins was achieved by GeneCards that adopted the Genome Atlas information [[214]65]. [215]Figure 6 shows the results of the localization network analysis. The protein connections are established between intracellular environment, plasmatic membrane, and extracellular space on the basis of metadata analysis through the different tools and database adopted. Some proteins are usually found in the serum such as SOD3, GSN (not shown), APOL1, LYVE1, and SHGB ([216]Table 3). Some of them are related to their membrane protein form (ROR1, APOL1, and LYVE1) or to different proteins connected with the local and global networks. The localization metadata agree with the features of the protein of the selected protein panel. Table 3. References of biological properties and main local interaction