Graphical abstract graphic file with name ga1.jpg [37]Open in a new tab Keywords: Guizhi Fuling Decoction, Breast cancer, Systems pharmacology, HTS^2 assay, Bioinformatics Abstract As one of the classical traditional Chinese medicine (TCM) prescriptions in treating gynecological tumors, Guizhi Fuling Decoction (GFD) has been used to treat breast cancer (BRCA). Nonetheless, the potential molecular mechanism remains unclear so far. Therefore, systems pharmacology was used in combination with high throughput sequencing-based high throughput screening (HTS^2) assay and bioinformatic technologies in this study to investigate the molecular mechanisms of GFD in treating BRCA. By computationally analyzing 76 active ingredients in GFD, 38 potential therapeutic targets were predicted and significantly enriched in the “pathways in cancer”. Meanwhile, experimental analysis was carried out to examine changes in the expression levels of 308 genes involved in the “pathways in cancer” in BRCA cells treated by five herbs of GFD utilizing HTS^2 platform, and 5 key therapeutic targets, including HRAS, EGFR, PTK2, SOS1, and ITGB1, were identified. The binding mode of active compounds to these five targets was analyzed by molecular docking and molecular dynamics simulation. It was found after integrating the computational and experimental data that, GFD possessed the anti-proliferation, pro-apoptosis, and anti-angiogenesis activities mainly through regulating the PI3K and the MAPK signaling pathways to inhibit BRCA. Besides, consistent with the TCM theory about the synergy of Cinnamomi Ramulus (Guizhi) by Cortex Moutan (Mudanpi) in GFD, both of these two herbs acted on the same targets and pathways. Taken together, the combined application of computational systems pharmacology techniques and experimental HTS^2 platform provides a practical research strategy to investigate the functional and biological mechanisms of the complicated TCM prescriptions. 1. Introduction According to the Global Cancer Report released at the end of 2018, breast cancer (BRCA) is the most common malignancy in female patients, and it ranks second place in terms of morbidity among all malignancies [38][1]. Apart from chemotherapy and radiotherapy, complementary and alternative therapy (CAM) has gradually become a new therapeutic option. As an indispensable component of CAM, traditional Chinese medicine (TCM) has been increasingly applied in cancer prevention and treatment over the last few decades [39][2]. Compared with chemotherapy which has a series of adverse reactions, TCM therapy has certain advantages in treating breast cancer, including less adverse reactions, safer property and higher patient compliance [40][3]. Guizhi Fuling Decoction (GFD) is originated from a classic medical book named the Synopsis of the Golden Chamber (Jingui Yaolue) written by the famous Chinese physician Zhang Zhongjing at about 1800 years ago. It is composed of 5 herbs, including Cinnamomi Ramulus (Guizhi), Poria (Fuling), Radix Paeoniae Rubra (Chishao), Cortex Moutan (Mudanpi), and Persicae Semen (Taoren). In ancient clinical practice, Chinese physicians have used GFD to prevent and treat gynecological malignant tumors included mammary tumors, which provide theoretical support for the treatment of breast cancer [41][4]. In addition, numerous experimental researches have fully proved GFD could inhibit the proliferation and induce the apoptosis of breast cancer cells [42][5], [43][6]. However, the underlying molecular mechanisms remain largely unknown for the time being. Systems pharmacology, an emerging discipline that emphasizes integrity and systematization, is a more appropriate approach to investigate the complicated mechanisms of TCM, which commonly functions in a multi-component and multi-target manner [44][7]. Systems pharmacology has been successfully adopted in previous studies to predict the primary bioactive substances and to elucidate the potential therapeutic mechanisms of TCM prescriptions, such as Huanglian Jiedu Decoction [45][8]. Nonetheless, the suitable verification method is lacking to verify the results predicted by systems pharmacology, which leads to the uncertainty of research. Therefore, this study applied a novel and powerful research technique called high throughput sequencing-based high throughput screening (HTS^2) [46][9], [47][10]. HTS^2, a high-throughput screening platform based on gene expression signature, is mainly composed of the RASL strategy and the next-generation sequencing technology. This platform allows us to simultaneously examine the expression of thousands of genes in human cells treated with thousands of herbs; importantly, it provides the large-scale herb-cell-gene datasets for effectively validating the predicted results of systems pharmacology, and brings new research inspiration. Moreover, it can quantitatively analyze gene matrices related to a specific phenotype or focusing on a specific pathway. Compared with genome sequencing, the HTS^2 technique is more targeted and cost-effective. In this study, systems pharmacology was employed to construct an active ingredient-BRCA target network, so as to screen the most significantly enriched anti-BRCA signaling pathway of GFD. Thereafter, HTS^2 assay was carried out to verify the effect of drug intervention on gene expression involved in this pathway. Finally, bioinformatics analysis was performed to identify the key targets of GFD and to explain its anti-BRCA mechanism. [48]Fig. 1 presents the research flowchart in this study. Fig. 1. [49]Fig. 1 [50]Open in a new tab Experimental technical roadmap. 2. Materials and methods 2.1. Construction of the chemical component database All the chemical components of GFD were collected from the Traditional Chinese Medicine Systems Pharmacology Database (TCMSP) ([51]http://lsp.nwu.edu.cn/, version 2.3, updated on May 31st, 2014) [52][11] and the NCBI PubChem Database ([53]https://pubchem.ncbi.nlm.nih.gov/, updated on May 8th, 2019) [54][12]. 2.2. Screening the active ingredients The chemical components of GFD were filtered by Oral bioavailability (OB) and Drug-likeness (DL). Of them, OB in vivo (%F), which represents the unchanged fraction of the orally administered dose that achieves systemic circulation, is one of the most commonly pharmacokinetic parameters used in drug screening. In this study, a robust calculative system, OBioavail 1.1, was employed to predict the compound OB [55][13]. Then, compounds with OB ≥30% were selected as the active ingredients in this study. Besides, DL is defined as a delicate balance among the various molecular properties that affect pharmacodynamics and pharmacokinetics, which ultimately affects its absorption, distribution, metabolism, and excretion (ADME) in the human body like a drug. In this study, the DL index of a new compound was calculated according to the Tanimoto similarity [56][14]. [MATH: fA,B=A.B A2+ B2- A.B :MATH] Where A represents the new compound and B stands for the average DL index of all the 6511 molecules in the DrugBank database based on the Dragon soft descriptors. Afterwards, compounds with DL <0.18 were removed. Finally, compounds with both OB ≥30% and DL ≥0.18 were considered as the active ingredients, as suggested by the TCMSP database. 2.3. Prediction of the potential targets of active ingredients SysDT, the drug-target prediction model, was adopted in this study to predict the targets of active ingredients. Notably, SysDT is based on 6511 drugs and 3987 targets of the DrugBank database, as well as the mutual correlation degree [57][15]. It is established according to the support vector machine (SVM) algorithm, with the consistency, sensitivity, and specificity of 82.83%, 81.33% and 93.62%, respectively. Using this model, targets with SVM >0.7 were predicted as the potential targets of active ingredients. Additionally, the target information from the STITCH ([58]http://stitch.embl.de/, version 5.0) [59][16], the Therapeutic Target Database (TTD) ([60]http://bidd.nus.edu.sg/group/cjttd/, updated on September 15th, 2017) [61][17], and the Uniprot ([62]https://www.uniprot.org/, updated on July 31st, 2019) [63][18] databases, was integrated to supplement this prediction model. 2.4. Collection of BRCA therapeutic targets and network mapping All BRCA therapeutic targets were collected from the OMIM ([64]https://omim.org/, updated on June 28th, 2019) [65][19], the DrugBank ([66]https://www.drugbank.ca/, version 5.1.4, updated on July 2nd, 2019) [67][20], and the TTD ([68]http://bidd.nus.edu.sg/group/cjttd/, updated on September 15th, 2017) [69][17] databases. Thereafter, the BRCA target network was established using Cytoscape v3.7.1 [70][21], which was then mapped with the compound-target network to obtain the active ingredient-BRCA target network of GFD, including all the GFD-related targets for BRCA treatment. Moreover, the Venn diagram was acquired through the Venn diagram web tool ([71]http://bioinformatics.psb.ugent.be/webtools/Venn/) to visualize the overlapping targets. 2.5. Functional enrichment analysis The functional enrichment analysis of the overlapping targets was performed by DAVID ([72]https://david.ncifcrf.gov/, version 6.8) [73][22]. Later, the gene ontology (GO) biological process (BP) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were enriched based on the adjusted P-value of <0.05. Thereafter, the top 10 signaling pathways were drawn into a bubble chart using the R software [74][23]. 2.6. Genetic alteration and survival analysis cBioPortal ([75]https://www.cbioportal.org/, updated on August 13th, 2019) [76][24], a web-based integrated data mining system, was utilized to examine the genetic alterations and to perform survival analysis on GFD-related targets for BRCA treatment. 2.7. The HTS^2 assay HTS^2 is a high-throughput screening platform based on the gene expression signature that quantitatively analyzes cell transcriptional profiles at a large scale [77][9], [78][25]. In this study, the HTS^2 assay was carried out to detect the regulation of GFD on the most significantly enriched signaling pathway in BRCA cells. 2.7.1. Cell culture and materials MDA-MB-231 cells were obtained from the China Infrastructure of Cell Line Resources (Beijing, China), and cultured in DMEM medium (Gibco) supplemented with 10% fetal bovine serum (FBS, Gemini), 100 U/mL streptomycin and 100 U/mL penicillin (Gibco) under the humidified atmosphere of 95% air and 5% CO[2] at 37℃. Meanwhile, the extracts of five herbs contained in GFD were provided by CapitalBio Corporation (Beijing, China). All probes were purchased from Invitrogen (Shanghai, China). 2.7.2. Preparation of herb extracts Firstly, the five herbs, namely, Cinnamomi Ramulus, Poria, Radix Paeoniae Rubra, Cortex Moutan, and Persicae Semen, were pulverized into powders using a grinder. Secondly, the powders were extracted with the 90% ethanol solvent in a Soxhlet extractor. Then, the solvent was dried to extractum by the drying oven at 45 °C. Finally, a certain amount of extractum was weighed and diluted with DMSO to 100 μg/mL, and preserved in a refrigerator at −80 °C for further studies. 2.7.3. Probe design Afterwards, genes involved in the most significant KEGG pathway in GFD were selected as the probes [79][26]. Typically, three pairs of probes were designed for each gene according to exon-exon junctions near the 3′-terminal of target mRNA. Moreover, the Tm range of those designed probes was required to limit G/C content in probe design, and all probes must be tested to prevent the high-affinity probes from excessively occupying the sequencing space. Then, one pair of probes was selected to represent a transcript. 2.7.4. HTS^2 screening Gene signature represents a phenotype of interests, which is used to search for herbs that affect phenotype in the presence or absence of a validated drug target. This study attempted to explore the synergetic mechanisms of the five herbs in GFD in MDA-MB-231 cells. To this end, about 5000 MDA-MB-231 cells were grown in the 384-well plates for 24 h, treated with herb extracts for 24 h, lysed with lysis buffer, and incubated at room temperature for 10 min. Then, the lysis was preserved at −80 °C for subsequent automated HTS^2 assay. Later, the sample obtained from the HTS^2 assay was sequenced using the Illumina HiSeq X Ten sequenator. 2.8. Data processing First of all, all reads were mapped to the probe sequences, which allowed for three mismatches and were normalized relative to the expression of 18 stable housekeeping genes. Subsequently, to evaluate the reliability and repeatability of the transcriptional profile, the Pearson correlation coefficients among normalized transcriptional data were calculated by R software after treatments with 16 replicates of DMSO and 3 replicates of the five herbs in GFD. Notably, the correlation coefficients of >0.9 indicated that the HTS^2 assay results were reliable and repeatable. Then, the FC and P-value of gene expression were calculated via R package DESeq2 [80][27], and genes with |FC| > 2 and P-value <0.05 were considered as the differentially expressed genes (DEGs). Later, the heatmap and volcano plots of DEGs were drawn using the R package pheatmap and ggplot2, respectively [81][28], [82][29]. 2.9. Interaction network construction and analysis Based on the KEGG enrichment analysis results on DEGs, the herb-target-pathway interaction network was constructed [83][26]. Besides, information related to the phenotypes associated with the KEGG pathway was also collected. Subsequently, the herb-pathway-phenotype interaction network was constructed based on the links, so as to clarify the synergistic effects of herbs. 2.10. Construction of the protein–protein interaction (PPI) network and screening of key targets To screen the key targets among the DEGs verified by the HTS^2 assay, all DEGs were mapped to the STRING database ([84]https://string-db.org/, version 11.0, updated on January 19th, 2019), a database of known and predicted PPIs [85][30]. Afterwards, the PPI network with a combined score of >0.9 was constructed. Subsequently, topological analysis was carried out using the Network Analyzer plug-in contained in Cytoscape, and the main topological parameters of the PPI network were obtained. In this study, the degree centrality was adopted as the major parameter, which was used in combination with the closeness centrality and the betweenness centrality to identify the key anti-BRCA targets in GFD. 2.11. Survival analysis The Kaplan-Meier Plotter ([86]http://kmplot.com/) was adopted for analyzing the association of key anti-BRCA target expression in GFD with BRCA patient survival rate (n = 3951) [87][31]. 2.12. Molecular docking The crystal structures of key anti-BRCA targets in GFD were obtained from the RCSB Protein Data Bank (PDB) [88][32]. Then, the crystal structure of each protein was selected based on the optimal available resolution. Moreover, the protein preparation wizard module of Schrodinger's Maestro Molecular modeling suit (Schrödinger Release, 2019–1) was utilized to prepare the protein crystallographic structures [89][33]. In addition, the LigPrep module of Schrodinger's Maestro Molecular modeling suit was employed to obtain the 3D structures and energy minimization of the active ingredients in GFD. Based on the specific known active sites of the protein targets, Glide was adopted for all molecular docking simulations and calculations [90][34]. Moreover, the ligand-target interactions were visualized by the ligand interaction diagram module. 2.13. Molecular dynamics (MD) simulation To further optimize the docking results, MD simulation was performed by GROMACS 2019 with all-atomic force field and SPC water model [91][35]. The simulation temperature was set to 300 K, and simulation system formed by the water molecules added around the protein was served as the periodic boundary. In the simulation process, the particle-mesh Ewald (PME) algorithm was used to calculate the long-range electrostatic interactions, and 2 fs integral time step was applied. Under the NVT (keeping the atomic number, volume and temperature constant) ensemble, the system was balanced and the water was optimized for 500 ps. Afterwards, another 500 ps equilibration under NPT ensemble conditions (keeping atomic number, pressure and temperature constant) was conducted, which was followed by a final production run of 100 ns. The binding energy of the compound and the protein was calculated using MM-PBSA method. And the images were drawn by pymol 2.3 and Maestro 11.9 [92][33], [93][36]. 2.14. Plotting of the GFD-related mechanism diagram According to the results obtained from HTS^2 assay, KEGG pathway enrichment analysis, topological analysis, and molecular docking, a specific mechanism diagram related to those key anti-BRCA targets in GFD was plotted to visualize the significant molecular mechanisms of GFD in treating BRCA. 3. Results and discussion 3.1. Screening of active ingredients All chemical components of the five herbs contained in GFD, including Guizhi (2 2 0), Fuling (34), Chishao (1 1 9), Mudanpi (55), and Taoren (66), were collected through TCMSP and PubChem. Afterwards, altogether 76 compounds with OB ≥30% and DL ≥0.18 were identified as the active ingredients ([94]Table S1). Among all active ingredients contained in Guizhi (7), Fuling (15), Chishao (29), Mudanpi (11), and Taoren (23), 6 were shared by 2 or 3 herbs. For example, beta-sitosterol was shared by Guizhi, Chishao, and Taoren. 3.2. Potential targets of GFD The SysDT model was utilized to predict the potential targets for all active ingredients in GFD, and a total of 211 potential protein targets were finally acquired. The detailed information about these potential targets is provided in [95]Table S2. 3.3. Known therapeutic targets for BRCA To acquire all the therapeutic targets for BRCA, they were collected manually from the TTD, OMIM and DrugBank databases, respectively. In total, 58, 21 and 118 known therapeutic protein targets for BRCA were acquired from the TTD, OMIM and DrugBank databases, respectively. Ultimately, 182 known BRCA targets were collected after eliminating the redundancy. The detailed information is presented in [96]Table S3. 3.4. Mining of overlapping targets and enrichment analysis To obtain the anti-BRCA targets in GFD, a comparative analysis was carried out for the potential targets in GFD and the therapeutic targets for BRCA. The results showed that, 38 proteins overlapped between the potential targets in GFD and the known therapeutic targets for BRCA ([97]Table S4), which were the anti-BRCA targets in GFD with high confidence. To further explore the biological mechanisms underlying these 38 anti-BRCA targets in GFD, the GO/KEGG pathway enrichment analysis was conducted. The top 10 most significant GO/KEGG pathways linked to these 38 targets were selected, as displayed in [98]Fig. 2. Moreover, results of GO enrichment analysis indicated that, those targets were involved in multiple biological processes (BPs), including response to drug, negative regulation of apoptotic process, oxidation–reduction process, positive regulation of gene expression, xenobiotic metabolic process, response to estradiol, positive regulation of nitric oxide biosynthetic process, response to toxic substance, response to ethanol, and positive regulation of protein phosphorylation. Typically, response to drug, which is also referred to as drug susceptibility/resistance, is a key to the success or failure in cancer treatment [99][37]. Besides, the negative regulation of apoptotic process allows for the infinite cell growth and division, which is one of the fundamental factors in carcinogenesis [100][38]. Some evidence suggests that, the oxidation–reduction process has significant regulatory effects on tumor plasticity [101][39]. Fig. 2. [102]Fig. 2 [103]Open in a new tab Functional enrichment analysis. (A) The Venn diagram of the potential targets in GFD and the therapeutic targets for BRCA. (B) GO enrichment analysis on 38 anti-BRCA targets in GFD. (C) KEGG pathway enrichment analysis on 38 anti-BRCA targets in GFD. Further, results of KEGG enrichment analysis of the 38 targets revealed that, the anti-BRCA effect of GFD showed the highest correlation with “pathways in cancer”, followed by “microRNAs in cancer” and “proteoglycans in cancer”. Specifically, the “Pathways in cancer” involves multiple signaling pathways and it is tightly associated with cancer. Besides, cancer is also related to microRNAs (miRNAs), which have been reported to regulate the expression of various oncogenes or tumor suppressor genes and may serve as the diagnostic and prognostic biomarkers [104][40]. Proteoglycans exert multiple functions in cancer and angiogenesis, which is ascribed to their polyhedric nature and their capacity to interact with both ligands and receptors that regulate neoplastic growth and neovascularization [105][41], [106][42]. 3.5. Mining of genetic alterations and survival analysis To further explore the molecular characteristics of 38 targets among different cohorts of BRCA patients, genetic alteration mining and survival analysis were performed by cBioPortal. According to the results, there were about 25%-80% genetic alterations among the 11 datasets of BRCA patients. To be specific, the genetic alterations of those 38 anti-BRCA targets in GFD included mutation, fusion, amplification, deep deletion, and multiple alterations, with mutation being the most frequently seen alteration ([107]Fig. 3A). Notably, cases with genetic alterations were linked with poor survival compared with those without alterations ([108]Fig. 3B). Such results suggested that, these 38 targets were closely correlated with the prognosis for BRCA, which supported the clinical application of GFD in treating BRCA. Fig. 3. [109]Fig. 3 [110]Open in a new tab Genetic alteration mining and survival analysis of 38 anti-BRCA targets in GFD based on BRCA studies in cBioPortal. (A) Overview of alterations in the 38 targets among genomic datasets available in 11 different BRCA studies. (B) Kaplan-Meier survival curve between groups with and without alterations. P-value < 0.05 indicated statistical significance. 3.6. Identification of DEGs by the HTS^2 assay Due to the incompleteness of databases and the shortcomings of the systems pharmacology method, the predicted results might not entirely reflect the actual mechanism of GFD in treating BRCA. By means of HTS^2 assay, the above deficiencies might be partially corrected, and new research clues might be provided. As found in the studies mentioned above, the “pathways in cancer” was the most significant anti-BRCA signaling pathway in GFD. Thus, a total of 308 genes involved in this pathway were collected ([111]Table S5), and HTS^2 assay was carried out to detect the expression changes in these 308 genes within the BRCA MDA-MB-231 cells treated with the 5 herbs of GFD, respectively ([112]Table S6). In the heatmap ([113]Fig. 4A), almost all modules were blue, indicating that each of these five herbs suppressed the expression of most genes, and such finding was consistent with the anti-BRCA effect of GFD. In addition, the cut-off criteria were set as |FC| >2 and P-value <0.05 in this study to screen out DEGs. In the volcano plot ([114]Fig. 4B), altogether 70 DEGs, including 32 up-regulated and 38 down-regulated one, were identified in Guizhi; while 8 DEGs, including 5 up-regulated and 3 down-regulated ones, were found in Fuling. Besides, 33 DEGs were screened in Mudanpi, including 23 up-regulated and 10 down-regulated ones; whereas 8 DEGs were discovered in Chishao, including 5 up-regulated and 3 down-regulated ones. Not surprisingly, there were least DEGs in Taoren, which only included 2 down-regulated genes, with no up-regulated one. In conclusion, altogether 121 DEGs reached the threshold. Fig. 4. [115]Fig. 4 [116]Open in a new tab The heatmap and volcano plot of DEGs verified by HTS^2 assay. (A) The heatmap of DEGs. (B)The volcano plot of DEGs. The cut-off criteria were set at |Foldchange| > 2 and P-value < 0.05. The blue dots represented DEGs that reached the threshold, and the red dots stood for DEGs that did not reach the threshold. (For interpretation of the references to