Abstract

   In this study, we analyzed the chemical composition of Alangium
   platanifolium (Sieb. et Zucc.) Harms (AP) using ultra-performance
   liquid chromatography-quadrupole time-of-flight mass spectrometry
   (UPLC-Q-TOF-MS)-based non-targeted plant metabolomics integrated with
   the MolNetEnhancer strategy. A total of 75 compounds, including
   flavonoids, alkaloids, terpenes, and C21 steroids, among others, were
   identified by comparing accurate mass-to-charge ratios, MS^2 cleavage
   fragments, retention times, and MolNetEnhancer-integrated analytical
   data, and the cleavage rules of the characteristic compounds were
   analyzed. A total
   of 125 potential cervical cancer (CC) therapeutic targets were obtained
   through Gene Expression Omnibus (GEO) data mining, differential
   analysis, and database screening. Hub targets were obtained by
   constructing protein-protein interaction (PPI) networks and CytoNCA
   topology analysis, including SRC, STAT3, TP53, PIK3R1, MAPK3, and
   PIK3CA. According to Gene Ontology (GO) analysis, AP acted against CC
   primarily by influencing gland development, oxidative stress
   processes, and serine/threonine kinase and tyrosine kinase activity.
   Enrichment analysis of the Kyoto Encyclopedia of Genes and Genomes
   (KEGG) indicated that the PI3K/AKT and MAPK signaling pathways play a
   crucial role in AP treatment for CC. The compound-target-pathway
   (C-T-P) network revealed that quercetin, methylprednisolone, and
   caudatin may play key roles in the treatment of CC. Molecular docking
   indicated that the core compounds could bind effectively to the core
   targets. In this study, the compounds in AP were systematically
   characterized qualitatively, and the core components, core targets,
   and mechanisms of action of AP in the treatment of CC were screened
   using a combination of network pharmacology tools, providing a
   scientific reference for the therapeutic material basis and quality
   control of AP.

   Keywords: Alangium platanifolium, Cervical cancer, Molecular docking,
   Molecular networking, Network pharmacology, UPLC-Q-TOF-MS

1. Introduction

   Cervical cancer (CC) is the fourth most frequent malignancy and the
   fourth leading cause of cancer-related mortality in women worldwide
   [1]. According to statistics, there were 604,127 new cases of CC and
   341,831 deaths from CC globally in 2020 [2]. Current treatments for CC
   mainly include radiotherapy, chemotherapy, and/or surgical resection.
   However, these modalities can cause various adverse effects and have
   limited efficacy in advanced CC [3]. Therefore, identifying a safer
   and less toxic strategy for the effective treatment of CC is crucial.
   Natural products have received widespread attention as a potential
   source of highly effective and low-toxicity antitumor drugs. There is
   increasing evidence that plant-derived natural products can exert
   anti-CC effects through various mechanisms, including inhibition of CC
   cell proliferation, induction of CC cell apoptosis, reduction of
   telomerase activity, and inhibition of angiogenesis [4,5]. According
   to previous literature,
   Alangium platanifolium (Sieb. et Zucc.) Harms (AP), which belongs to
   the family Alangiaceae, is an important medicinal resource. The
   "Quality Standards for Traditional Chinese Medicinal Materials and
   Ethnic Medicinal Materials in Guizhou Province (2003 Edition)" of
   China explicitly states that Bajiaofeng refers to the dried fine
   fibrous roots or dried branches of Alangium chinense (Lour.) Harms
   and Alangium platanifolium (Sieb. et Zucc.) Harms. Recent studies have
   shown that monomeric compounds from Alangium chinense exhibit good
   inhibitory effects on HeLa cells [6]. Current research has focused
   largely on Alangium chinense, whereas AP has received insufficient
   attention. While a few glycosides have been isolated from its leaves,
   there is a dearth of information regarding the components of its
   root. To expand the potential applications of AP in medicinal
   contexts, it was selected as the primary focus of this study.

   Non-targeted plant metabolomics based on liquid chromatography-mass
   spectrometry (LC-MS) is an indispensable method for decoding and
   comprehensively characterizing phytochemical components. However,
   given the large number and chemical complexity of primary and
   secondary metabolites in plants, LC-MS-based non-targeted plant
   metabolomics methods alone are not ideal for large-scale structural
   identification of the various compounds in samples. To address this
   challenge, several advanced informatics tools have been developed in
   recent years to improve the annotation of compounds [7]. The Global
   Natural Products Social Molecular Networking (GNPS) platform is
   designed to analyze and archive tandem mass spectrometry data; it
   groups nodes (compounds) with similar secondary mass spectrometry
   (MS^2) cleavage fragments into molecular networks based on the
   MS-Cluster algorithm. Comparing the node information in the network
   helps to identify similar compounds and discover novel compounds [8].
   In addition, MolNetEnhancer is a workflow on the GNPS molecular
   networking analysis platform that integrates results from GNPS
   library matching, MS2LDA (Mass2Motifs), Network Annotation
   Propagation (NAP), and DEREPLICATOR+, together with automated
   chemical classification via ClassyFire. This in silico mining tool
   significantly enhances the scope and efficiency of compound
   identification in complex natural products [9]. MolNetEnhancer has
   already been used to improve the annotation of compounds in
   metabolites and to characterize the chemical space in plants
   [10,11].

   Several effective phytochemical components may theoretically act on
   multiple targets and pathways. Network pharmacology is suited to
   revealing the overall mode of action of multi-component, multi-target,
   and multi-pathway natural products, and may help to address the poor
   efficacy and drug resistance associated with single-target drugs
   [12]. Molecular docking is an important tool in computer-aided drug
   design; it can predict the optimal conformation and binding sites
   between small molecules and their targets, characterize interactions
   in enzyme reactions, and help to predict and explain biological
   reaction mechanisms [13].

   Herein, LC-MS-based non-targeted plant metabolomics coupled with the
   MolNetEnhancer strategy was applied to qualitatively analyze the
   components of methanol extracts of AP roots. In addition, network
   pharmacology combined with bioinformatics analysis was applied to
   identify the key active ingredients, hub targets, and potential
   molecular mechanisms of AP against CC. Finally, the binding of active
   compounds to hub targets was verified through molecular docking.

2. Materials and methods

2.1. Materials and reagents

   HPLC-grade acetonitrile was obtained from Honeywell Burdick & Jackson
   (New Jersey, USA). LC/MS-grade acetonitrile was obtained from
   Mallinckrodt Baker, Inc. (New Jersey, USA). HPLC-grade methanol was
   obtained from Sinopharm Chemical Reagent Co., Ltd. (Shanghai, China).
   Glacial acetic acid (GAA) was obtained from TEDIA (Ohio, USA). The
   experimental water was purified using a Millipore water purification
   system (Massachusetts, USA).

   In August 2021, AP roots, aged between 8 and 10 years, were freshly
   collected from a humid valley at an altitude of 1400 to 2000 m in
   Qujing, Yunnan Province, China. The sample was identified by Professor
   Jian'an Wang (School of Pharmacy, Jining Medical University). The
   voucher sample (GMG-20210815) was stored at 4 °C at Jining Medical
   University.

2.2. Sample preparation

   The AP roots were left in a cool, well-ventilated area for 30 days to
   allow natural evaporation of water. Methanol is the preferred solvent
   for extracting polar and moderately polar compounds, which were the
   compounds targeted here [14]. Therefore, the crushed root powder of
   AP (20.00 kg) was exhaustively extracted three times with methanol by
   cold soaking, for 7 days each time. After vacuum concentration, a
   crude methanol extract was obtained.

   A 200 mg portion of the AP methanol extract was precisely weighed,
   dissolved in 5 ml of methanol, sonicated for 30 min (40 kHz, 200 W),
   and cooled to room temperature. The test solution was filtered once
   through a 0.22 μm microporous membrane before analysis.

2.3. UPLC-Q-TOF-MS analysis

   Samples were analyzed on an Agilent 6530B Accurate Mass Q-TOF LC/MS
   System equipped with an electrospray ionization (ESI) source. A TOSOH
   TSK gel ODS-100V column (4.6 × 150 mm) was used for chromatographic
   separation. The column oven temperature was maintained at 30 °C. The
   mobile phase flow rate was 1.0 ml/min, and the injection volume was
   5 μl.

   The conditions for UPLC and MS were optimized. Mobile phase A was a
   0.5 % aqueous GAA solution (pH = 3) and mobile phase B was
   acetonitrile. The gradient program was as follows: 10 % B, 0–15 min;
   10–20 % B, 15–30 min; 20–30 % B, 30–40 min; 30–60 % B, 40–50 min;
   60–80 % B, 50–60 min.
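
   The gradient program can be written as a breakpoint table. The
   sketch below is illustrative only: it assumes linear ramps between
   the stated breakpoints (the usual convention, but not stated in the
   text), and the function name percent_b is hypothetical.

```python
from bisect import bisect_right

# (time in min, % B) breakpoints transcribed from the gradient program above.
GRADIENT = [(0, 10), (15, 10), (30, 20), (40, 30), (50, 60), (60, 80)]


def percent_b(t_min: float) -> float:
    """Return % mobile phase B at t_min, assuming linear ramps (an assumption)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-60 min gradient program")
    times = [t for t, _ in GRADIENT]
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

   For example, percent_b(45) falls halfway along the 30–60 % B ramp.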

   For qualitative ESI-MS analysis, full scans in both positive and
   negative ion modes were used to obtain more detailed ion information.
   The scanned mass-to-charge ratio (m/z) range was 100 to 1200. The
   nebulizer pressure was set to 45 psi and the capillary voltage to
   ±3000 V. The drying gas flow rate was 5 l/min and the drying gas
   temperature was 350 °C. In auto MS/MS mode, MS/MS spectra were
   acquired at a scan speed of 2 spectra per second.

   The acquired data were preprocessed using Agilent MassHunter
   Qualitative Analysis B.06.00 software, including integration,
   smoothing, deconvolution, and isotope filtering.

2.4. Molecular networking and compound identification

   The processed MS^2 data were converted to mzML format using
   ProteoWizard MSConvert and then uploaded to the UCSD GNPS mass
   spectrometry platform. The MS^2 data were processed and filtered
   again when the molecular network was created with the
   METABOLOMICS-SNETS-V2 workflow. The precursor ion mass tolerance
   (PIMT) was set to 2.0 Da and the fragment ion mass tolerance (FIMT)
   to 0.5 Da. The remaining parameters were left at their default
   values.
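
   To illustrate how a fragment ion mass tolerance enters spectral
   comparison, the sketch below scores two MS^2 spectra with a plain
   cosine similarity, pairing peaks greedily when their m/z values
   differ by no more than the tolerance. This is a simplified stand-in
   and not the GNPS implementation (GNPS uses a modified cosine that
   also considers the precursor mass difference); the function name
   and the spectrum representation are assumptions.

```python
import math


def cosine_score(spec_a, spec_b, fragment_tol=0.5):
    """Plain cosine similarity between two spectra given as [(mz, intensity), ...].

    Peaks are paired one-to-one, greedily by intensity product, when their
    m/z values differ by at most fragment_tol (Da).
    """
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # All candidate peak pairs within tolerance, strongest products first.
    pairs = sorted(
        (
            (ia * ib, x, y)
            for x, (mza, ia) in enumerate(spec_a)
            for y, (mzb, ib) in enumerate(spec_b)
            if abs(mza - mzb) <= fragment_tol
        ),
        reverse=True,
    )
    used_a, used_b, dot = set(), set(), 0.0
    for product, x, y in pairs:
        if x not in used_a and y not in used_b:
            used_a.add(x)
            used_b.add(y)
            dot += product
    return dot / (norm_a * norm_b)
```

   With a 0.5 Da tolerance, two spectra whose fragments are offset by
   0.3 Da still score as identical, which shows how permissive this
   setting is compared with the instrument's mass accuracy.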

   MS/MS data were decomposed into annotated Mass2Motifs using the MS2LDA
   tool [15]. Dereplication was performed using the DEREPLICATOR+
   algorithm to identify more compounds [16]. NAP was used to improve
   the annotation of small molecules by combining the molecular network,
   which is built on spectral similarity, with in silico fragmentation
   predictions [17]. The completed GNPS molecular network, MS2LDA, and
   DEREPLICATOR+ outputs were integrated using MolNetEnhancer, analyzed
   online on the GNPS platform, and visualized using Cytoscape 3.9.1.

2.5. Screening of compounds in AP by ADME druggability evaluation

   Compounds in AP were pharmacokinetically screened using SwissADME
   [18]. SwissADME is widely used for evaluating the druggability of
   small-molecule compounds in natural products [19,20]. The screening
   criteria for potential active compounds were: (1) gastrointestinal
   (GI) absorption was ‘High’; (2) two or more of the five
   drug-likeness filters (Lipinski, Ghose, Veber, Egan, Muegge) returned
   ‘yes’.
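
   The two screening criteria can be expressed as a simple predicate.
   The sketch below is a minimal illustration; the record layout
   (AdmeRecord) and the function name are hypothetical and do not
   reflect the SwissADME export format.

```python
from dataclasses import dataclass, field

# The five drug-likeness filters named in the screening criteria.
RULES = ("Lipinski", "Ghose", "Veber", "Egan", "Muegge")


@dataclass
class AdmeRecord:
    """Hypothetical container for one compound's screening results."""
    name: str
    gi_absorption: str                              # "High" or "Low"
    rule_pass: dict = field(default_factory=dict)   # rule name -> bool


def is_candidate(rec: AdmeRecord, min_rules: int = 2) -> bool:
    """Apply criteria (1) GI absorption 'High' and (2) >= min_rules filters passed."""
    passed = sum(bool(rec.rule_pass.get(rule, False)) for rule in RULES)
    return rec.gi_absorption == "High" and passed >= min_rules
```

   A compound with high GI absorption that passes only one filter is
   rejected, as is a compound that passes all five filters but has low
   GI absorption.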

2.6. Screening of targets of active compounds in AP

   Predicted targets of the active compounds were obtained from three
   online databases: SwissTargetPrediction, DrugBank, and BATMAN-TCM.
   The collected targets were normalized using the UniProt database,
   the species was limited to ‘Homo sapiens’, and duplicate entries
   were removed.

2.7. Screening of CC-related targets

   Three CC-related microarray datasets were downloaded from the GEO
   database: GSE26342 (NIAID GPL1052 Platform-NIAID-Hsaa, GPL1053
   Platform-NIAID-Hsab), GSE63514 (Affymetrix GPL570
   Platform-HG-U133_Plus_2), and GSE173097 (Agilent GPL16956
   Platform-Agilent-045997 Arraystar human lncRNA microarray V3).
   Details of all GSE datasets are shown in Table 1.

Table 1.

   Information on the three GSE datasets.

   GEO        Platform          Samples (number)      Experiment  Attribute  Author/reference
                                Total  CC   Control   type
   GSE26342   GPL1052, GPL1053  54     34   20        Array       Test       Mine, K.L [21]
   GSE63514   GPL570            128    28   24        Array       Test       den Boon, J.A [22]
   GSE173097  GPL16956          11     5    6         Array       Test       Shang, C.L [23]

   R (version 4.3.0) and the R package ‘limma’ were used to identify
   differentially expressed genes (DEGs) in each GSE dataset. A p-value
   < 0.05 and |log (fold change)| > 1 were used as the criteria for
   selecting DEGs. The R packages ‘ggplot2’ and ‘pheatmap’ were used to
   construct volcano plots and heatmaps of the DEGs, respectively.
   Robust Rank Aggregation (RRA) is a commonly used bioinformatics tool
   that comprehensively ranks genes from different data sources [24].
   The RRA algorithm was employed to integrate and analyze the three
   datasets to obtain significant DEGs as potential targets for CC.
   CC-associated targets were also acquired from the DrugBank, OMIM,
   GeneCards, PharmGKB, and TTD databases.
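
   The DEG selection criteria amount to a two-condition filter on each
   gene's statistics. The following sketch is illustrative only and is
   written in Python rather than R; the input layout (gene, log fold
   change, p-value) and the function name are assumptions, and the
   base of the logarithm is taken as given by the upstream analysis.

```python
def select_degs(rows, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split genes into up- and down-regulated DEGs.

    rows: iterable of (gene, log_fc, p_value) tuples, e.g. exported from
    a differential expression table (hypothetical layout).
    A gene is kept when p_value < p_cutoff and |log_fc| > lfc_cutoff.
    """
    up, down = [], []
    for gene, log_fc, p_value in rows:
        if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return up, down


# Placeholder rows for illustration; these are not data from the study.
example = [
    ("GENE_A", 2.3, 0.001),    # kept, up-regulated
    ("GENE_B", -1.8, 0.020),   # kept, down-regulated
    ("GENE_C", 0.4, 0.001),    # dropped: fold change too small
    ("GENE_D", 3.0, 0.200),    # dropped: not significant
]
up_genes, down_genes = select_degs(example)
```

   Both thresholds are strict inequalities, so a gene with
   |log (fold change)| exactly equal to 1 is excluded.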

   All obtained CC-related targets were standardized using the UniProt
   database. Finally, the R package ‘VennDiagram’ was applied to identify
   the targets shared by AP and CC as the potential protein targets of
   AP against CC.

2.8. Protein-protein interaction (PPI) networks

   Potential interactions among the candidate therapeutic targets were
   explored using the STRING database. Parameters were set as follows:
   only targets belonging to the species "Homo sapiens" were considered,
   the minimum interaction score was set to greater than 0.9, and nodes
   not connected to the network were removed. Cytoscape 3.9.1 was used
   to analyze and visualize the resulting PPI networks. Node topology
   parameters were calculated using CytoNCA, and crucial targets were
   identified accordingly.
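
   The filtering and ranking steps can be sketched as follows. This is
   a simplified illustration that ranks nodes by degree only, whereas
   CytoNCA reports several centrality measures; the edge-list layout
   and the function name are assumptions.

```python
from collections import defaultdict


def hub_ranking(edges, min_score=0.9):
    """Rank nodes by degree after dropping low-confidence interactions.

    edges: iterable of (protein_a, protein_b, combined_score) tuples with
    scores on a 0-1 scale (hypothetical layout). Only edges with
    score > min_score are kept; nodes left without any edge never enter
    the result, which mirrors the removal of unconnected nodes.
    """
    neighbours = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Highest degree first; ties broken alphabetically for reproducibility.
    return sorted(
        ((node, len(adj)) for node, adj in neighbours.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Placeholder edges for illustration; these are not data from the study.
example_edges = [
    ("T1", "T2", 0.95),
    ("T1", "T3", 0.99),
    ("T2", "T3", 0.50),
    ("T4", "T1", 0.92),
    ("T5", "T6", 0.40),
]
ranking = hub_ranking(example_edges)
```

   In this toy network T1 ranks first with three retained partners,
   while T5 and T6 drop out because their only interaction falls below
   the threshold.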

2.9. GO and KEGG pathway enrichment analyses

   The GO and KEGG enrichment analyses were conducted in R using the
   packages “clusterProfiler”, “stringr”, “AnnotationDbi”,
   “org.Hs.eg.db”, “DOSE”, “ggplot2”, and “ggrepel”. The pvalueCutoff
   was 0.05 and the qvalueCutoff was 0.2, and only GO and KEGG results
   with a p-value < 0.05 were retained.

2.10. The construction of a C-T-P network

   The C-T-P network was constructed by importing the potential
   therapeutic targets, active compounds, and top 30 signaling pathways
   into Cytoscape 3.9.1. Core compounds were screened based on their
   degree values in the network.

2.11. Molecular docking

   The 6 target proteins with the highest CytoNCA scores in the core PPI
   network were selected as receptors, and compounds that passed the
   pharmacokinetic screening were selected as ligands. The
   three-dimensional (3D) structures of the core targets were obtained
   from the Protein Data Bank (PDB). Non-protein molecules in the core
   targets were removed using PyMOL 2.5. Hydrogen atoms were added to
   the core targets using AutoDock Tools 1.5.7, and the structures were
   converted to pdbqt format for use as docking receptors. The
   two-dimensional (2D) structures of the active compounds were drawn in
   ChemDraw 2019, energy-minimized in Chem3D 2019, and saved in PDB
   format. The active compounds were then converted to pdbqt format
   using AutoDock Tools 1.5.7 for use as docking ligands.

   Molecular docking was run with AutoDock Vina, and the results were
   visualized using Discovery Studio 2021 Client. Heatmaps of the
   molecular docking data were drawn using R and the R package
   ‘pheatmap’.
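
   A docking run of this kind is typically driven by a plain-text
   configuration file. The sketch below generates such a file for
   AutoDock Vina; the file names, grid-box centre, and grid-box size
   are hypothetical placeholders, since the text does not report them,
   and the helper name vina_config is an assumption.

```python
def vina_config(receptor, ligand, center, size, out, exhaustiveness=8):
    """Return the text of an AutoDock Vina configuration file.

    center and size are (x, y, z) tuples in Angstrom describing the
    search box; all values passed in here are placeholders chosen by
    the caller, not parameters taken from the study.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"out = {out}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical receptor-ligand pair and search box, for illustration only.
config_text = vina_config(
    receptor="receptor.pdbqt",
    ligand="ligand.pdbqt",
    center=(0.0, 0.0, 0.0),
    size=(20.0, 20.0, 20.0),
    out="docked.pdbqt",
)
```

   The resulting text can be saved to a file and passed to the vina
   executable through its --config option, once per receptor-ligand
   pair.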

3. Results

3.1. Sample preparation

   The methanol extract of AP roots was concentrated to about 1.79 kg;
   the estimated dry paste amount was 1.19 kg, and the dry paste
   extraction rate was 5.97 %.

3.2. UPLC-Q-TOF-MS analysis

   High-resolution mass spectrometry data of the methanol extracts of AP
   roots in positive and negative ion modes were acquired on the
   UPLC-Q-TOF-MS system, and the total ion chromatograms (TICs) are
   shown in Fig. 1. The methanol extract of AP was rich in
   chromatographic peaks in both positive and negative ionization modes,
   making it suitable for comprehensive characterization of its
   composition.

Fig. 1.

   TICs of methanol extracts of AP roots based on UPLC-Q-TOF-MS analysis.
   (A) TIC in the negative ESI mode. (B) TIC in the positive ESI mode.

3.3. Molecular networking and compound identification

   Families of structurally related molecules in AP were revealed by
   MolNetEnhancer. As shown in Fig. 2A, approximately 34.1 % of the
   MS^2 spectral matching nodes were classified as organoheterocyclic
   compounds, 7.5 % as lipids, and 5.2 % as phenylpropanoids and
   polyketides, among others.

Fig. 2.

   Chemical classification and structural identification of methanol
   extracts of AP roots. (A) Node annotation using MolNetEnhancer
   combined with GNPS molecular network, MS2LDA, NAP, and DEREPLICATOR+
   output. Nodes of the same color represent the same family of
   molecules. (B) Compound characterization using GNPS molecular
   network and NAP, combined with Mass2Motifs generated by MS2LDA. The
   same color represents the same Mass2Motifs. (For interpretation of the
   references to color in this figure legend, the reader is referred to