Abstract

In this study, we analyzed the chemical composition of Alangium platanifolium (Sieb. et Zucc.) Harms (AP) using ultraperformance liquid chromatography-quadrupole time-of-flight mass spectrometry (UPLC-Q-TOF-MS)-based non-targeted plant metabolomics integrated with the MolNetEnhancer strategy. A total of 75 compounds, including flavonoids, alkaloids, terpenes, and C21 steroids, among others, were identified by comparing accurate mass-to-charge ratios, MS^2 cleavage fragments, retention times, and MolNetEnhancer-integrated analytical data, and the cleavage rules of the characteristic compounds were analyzed. A total of 125 potential cervical cancer (CC) therapeutic targets were obtained through Gene Expression Omnibus (GEO) data mining, differential analysis, and database screening. Hub targets, including SRC, STAT3, TP53, PIK3R1, MAPK3, and PIK3CA, were obtained by constructing protein-protein interaction (PPI) networks and performing CytoNCA topology analysis. According to Gene Ontology (GO) analysis, AP acts against CC primarily by influencing gland development, oxidative stress processes, and serine/threonine kinase and tyrosine kinase activity. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis indicated that the PI3K/AKT and MAPK signaling pathways play a crucial role in AP treatment of CC. The compound-target-pathway (C-T-P) network revealed that quercetin, methylprednisolone, and caudatin may play key roles in the treatment of CC. Molecular docking indicated that the core compounds could bind strongly to the core targets. Overall, the compounds in AP were systematically and qualitatively analyzed, and the core components, core targets, and mechanisms of action of AP in the treatment of CC were screened through a combination of network pharmacology tools, providing a scientific reference for the therapeutic material basis and quality control of AP.
Keywords: Alangium platanifolium, Cervical cancer, Molecular docking, Molecular networking, Network pharmacology, UPLC-Q-TOF-MS

1. Introduction

Cervical cancer (CC) is the fourth most frequent malignancy and the fourth leading cause of cancer-related mortality in women worldwide [1]. According to statistics, there were 604,127 new cases of CC and 341,831 deaths from CC globally in 2020 [2]. Current treatments for CC mainly include radiotherapy, chemotherapy, and/or surgical resection. However, these modalities can cause various adverse effects and have limited efficacy in advanced CC [3]. Therefore, identifying a safer and less toxic strategy for the effective treatment of CC is crucial. Natural products have received widespread attention as a potential source of highly effective and low-toxicity antitumor drugs. There is increasing evidence that plant-derived natural products can achieve anti-CC effects through various mechanisms, including inhibition of CC cell proliferation, induction of CC cell apoptosis, reduction of telomerase activity, and inhibition of angiogenesis [4,5].

According to the literature, Alangium platanifolium (Sieb. et Zucc.) Harms (AP), a member of the family Alangiaceae, is an important medicinal resource. The "Quality Standards for Traditional Chinese Medicinal Materials and Ethnic Medicinal Materials in Guizhou Province (2003 Edition)" of China explicitly states that Bajiaofeng refers to the dried fine fibrous roots or dry branches of Alangium chinense (Lour.) Harms and Alangium platanifolium (Sieb. et Zucc.) Harms. Recent studies have shown that monomeric compounds from Alangium chinense exhibit good inhibitory effects on HeLa cells [6]. Current research has focused largely on Alangium chinense, whereas AP has received little attention. While a few glycosides have been extracted from its leaves, there is a dearth of information regarding the components of its root.
To expand the potential applications of AP in medicinal contexts, it was selected as the primary focus of this study.

Non-targeted plant metabolomics based on liquid chromatography-mass spectrometry (LC-MS) is an indispensable method for decoding and comprehensively characterizing phytochemical components. However, given the large number and chemical complexity of primary and secondary metabolites in plants, LC-MS-based non-targeted plant metabolomics methods alone are not ideal for large-scale structural identification of the compounds in a sample. To address this challenge, several advanced informatics tools have been developed in recent years to improve the annotation of compounds [7]. Global Natural Products Social Molecular Networking (GNPS) is a platform designed to analyze and archive tandem mass spectrometry data; it groups nodes (compounds) with similar secondary mass spectrometry (MS^2) fragmentation spectra into molecular networks based on the MS-Cluster algorithm. Comparing the node information in the network helps to identify similar compounds and discover novel compounds [8]. In addition, MolNetEnhancer is a workflow in the GNPS molecular networking analysis platform that integrates results from GNPS library matching, Mass2Motif discovery by MS2LDA, Network Annotation Propagation (NAP), and DEREPLICATOR+, together with automated chemical classification via ClassyFire. This in silico mining tool significantly enhances the scope and efficiency of compound identification in complex natural products [9]. MolNetEnhancer has already been used to improve the annotation of compounds in metabolites and to characterize the chemical space in plants [10,11].

Several effective phytochemical components may theoretically act on multiple targets and pathways.
Network pharmacology is suitable for revealing the overall mode of action of multi-component, multi-target, and multi-pathway natural products, and may help overcome the poor efficacy and drug resistance associated with single-target drugs [12]. Molecular docking is an important tool in computer-aided drug design; it can predict the optimal conformation and binding sites between small molecules and their targets, characterize interactions in enzyme reactions, and predict and explain biological reaction mechanisms [13].

Herein, LC-MS-based non-targeted plant metabolomics coupled with the MolNetEnhancer strategy was applied to qualitatively analyze the components of methanol extracts of AP roots. In addition, network pharmacology combined with bioinformatics analysis was applied to identify the key active ingredients, hub targets, and potential molecular mechanisms of AP against CC, and the binding of active compounds to hub targets was finally verified by molecular docking.

2. Materials and methods

2.1. Materials and reagents

HPLC-grade acetonitrile was obtained from Honeywell Burdick & Jackson (New Jersey, USA). LC/MS-grade acetonitrile was obtained from Mallinckrodt Baker, Inc. (New Jersey, USA). HPLC-grade methanol was obtained from Sinopharm Chemical Reagent Co., Ltd. (Shanghai, China). Glacial acetic acid (GAA) was obtained from TEDIA (Ohio, USA). Experimental water was purified with a Millipore water purification system (Massachusetts, USA). In August 2021, AP roots, aged between 8 and 10 years, were freshly collected from a humid valley at an altitude of 1400 to 2000 m in Qujing, Yunnan Province, China. The sample was identified by Professor Jian'an Wang (School of Pharmacy, Jining Medical University). The voucher sample (GMG-20210815) was stored at 4 °C at Jining Medical University.

2.2. Sample preparation

The AP roots were left in a cool, well-ventilated area for 30 days to allow natural evaporation of water.
Methanol is the preferred solvent for extracting polar and moderately polar compounds, which are the compounds targeted in this study [14]. Therefore, the crushed root powder of AP (20.00 kg) was exhaustively extracted with cold methanol three times, for 7 days each time. After vacuum concentration, a crude methanol extract was obtained. An accurately weighed 200 mg portion of the methanol extract of AP was dissolved in 5 ml of methanol, sonicated for 30 min (40 kHz, 200 W), and cooled to room temperature. The test solution was filtered once through a 0.22 μm microporous membrane for subsequent analysis.

2.3. UPLC-Q-TOF-MS analysis

Samples were analyzed on an Agilent 6530B Accurate Mass Q-TOF LC/MS System equipped with an electrospray ionization (ESI) source. A 4.6 × 150 mm TOSOH TSK gel ODS-100V column was used for chromatographic separation. The column oven was maintained at 30 °C. The mobile phase flow rate was 1.0 ml/min, and the injection volume was 5 μl. The UPLC and MS conditions were optimized. Mobile phase A was 0.5 % aqueous GAA (pH = 3) and mobile phase B was acetonitrile. The gradient was set as follows: 10 % B, 0–15 min; 10–20 % B, 15–30 min; 20–30 % B, 30–40 min; 30–60 % B, 40–50 min; 60–80 % B, 50–60 min. For qualitative ESI-MS analysis, full scans in both positive and negative ion modes were used to obtain more detailed ion information. The scanned mass-to-charge ratio (m/z) range was 100 to 1200. The nebulizer pressure was set to 45 psi and the capillary voltage to ±3000 V. The drying gas flow rate was 5 l/min and the drying gas temperature was 350 °C. In auto MS/MS mode, MS/MS spectra were acquired at a scan speed of 2 spectra per second.
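The gradient program given above can be read as a piecewise function of run time. The following is a minimal Python sketch of that reading, not the instrument method itself: it assumes each ramp between the stated breakpoints is linear (the text gives only the start and end composition of each segment), and the names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % mobile phase B) breakpoints taken from the gradient program
# in the text; linear ramps between breakpoints are an assumption.
GRADIENT = [(0, 10), (15, 10), (30, 20), (40, 30), (50, 60), (60, 80)]


def percent_b(t_min: float) -> float:
    """Return the % B at run time t_min (minutes) by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-60 min gradient program")
    times = [t for t, _ in GRADIENT]
    # Index of the breakpoint that ends the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 10, 22.5, 45, 60):
        print(f"{t:>5} min: {percent_b(t):.1f} % B")
```

Under this reading the first 15 min are isocratic at 10 % B, and the steepest ramp (3 % B per minute) falls between 40 and 50 min.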
The acquired data were preprocessed using Agilent MassHunter Qualitative Analysis B.06.00 software, including integration, smoothing, deconvolution, and isotope filtering.

2.4. Molecular networking and compound identification

The processed MS^2 data were converted to mzML format with ProteoWizard MSConvert and then uploaded to the UCSD GNPS mass spectrometry platform. The MS^2 data were processed and filtered again when the molecular network was created with the METABOLOMICS-SNETS-V2 workflow. The precursor ion mass tolerance (PIMT) was set to 2.0 Da and the fragment ion mass tolerance (FIMT) to 0.5 Da. The remaining parameters were left at their default values. MS/MS data were decomposed into annotated Mass2Motifs using the MS2LDA tool [15]. Dereplication was performed using the DEREPLICATOR+ algorithm to identify more compounds [16]. NAP propagates annotations through molecular networks built on spectral similarity and improves the annotation of small molecules using in silico fragmentation predictions [17]. The outputs of the GNPS molecular network, MS2LDA, and DEREPLICATOR+ were integrated using MolNetEnhancer, analyzed online on the GNPS platform, and visualized using Cytoscape 3.9.1.

2.5. Screening of compounds in AP by ADME druggability evaluation

Compounds in AP were pharmacokinetically screened using SwissADME [18]. SwissADME is widely used for evaluating the druggability of small-molecule compounds in natural products [19,20]. The screening criteria for potential active compounds were: (1) gastrointestinal (GI) absorption was ‘High’; (2) at least two of the five drug-likeness filters (Lipinski, Ghose, Veber, Egan, Muegge) returned ‘Yes’.

2.6. Screening of targets of active compounds in AP

Predicted targets of active compounds were acquired from three online databases: SwissTargetPrediction, DrugBank, and BATMAN-TCM.
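The two screening criteria stated in Section 2.5 amount to a simple Boolean rule per compound. The following is a minimal Python sketch of that rule, not the authors' code. The record layout and key names (`"GI absorption"`, `"Lipinski"`, and so on) are assumptions made for illustration and are not SwissADME's actual export headers; the example compounds are placeholders.

```python
# The five drug-likeness filters named in Section 2.5.
FILTERS = ("Lipinski", "Ghose", "Veber", "Egan", "Muegge")


def passes_adme_screen(record: dict) -> bool:
    """Apply both screening criteria to one compound record.

    `record` is a hypothetical dict such as
    {"GI absorption": "High", "Lipinski": "Yes", "Veber": "No", ...}.
    Missing keys are treated as not satisfied.
    """
    high_gi = record.get("GI absorption", "").strip().lower() == "high"
    n_yes = sum(record.get(f, "").strip().lower() == "yes" for f in FILTERS)
    return high_gi and n_yes >= 2


if __name__ == "__main__":
    # Placeholder records, not measured or predicted values.
    compounds = {
        "compound_A": {"GI absorption": "High", "Lipinski": "Yes", "Veber": "Yes"},
        "compound_B": {"GI absorption": "Low", "Lipinski": "Yes", "Ghose": "Yes"},
        "compound_C": {"GI absorption": "High", "Egan": "Yes"},
    }
    kept = [name for name, rec in compounds.items() if passes_adme_screen(rec)]
    print(kept)
```

Only `compound_A` satisfies both criteria here: `compound_B` fails on GI absorption and `compound_C` passes only one of the five filters.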
The collected targets were normalized using the UniProt database, the species was limited to ‘Homo sapiens’, and duplicates were removed.

2.7. Screening of CC-related targets

Three CC-related microarray datasets were downloaded from the GEO database: GSE26342 (NIAID platforms GPL1052, NIAID-Hsaa, and GPL1053, NIAID-Hsab), GSE63514 (Affymetrix platform GPL570, HG-U133_Plus_2), and GSE173097 (Agilent platform GPL16956, Agilent-045997 Arraystar human lncRNA microarray V3). Details of all GSE datasets are shown in Table 1.

Table 1. Information of the three GSE datasets.

GEO | Platform | Samples, total | Samples, CC | Samples, control | Experiment type | Attribute | Author/reference
GSE26342 | GPL1052, GPL1053 | 54 | 34 | 20 | Array | Test | Mine, K.L [21]
GSE63514 | GPL570 | 128 | 28 | 24 | Array | Test | den Boon, J.A [22]
GSE173097 | GPL16956 | 11 | 5 | 6 | Array | Test | Shang, C.L [23]

R 4.3.0 and the R package ‘limma’ were used to identify differentially expressed genes (DEGs) in each GSE dataset. A p-value < 0.05 and |log(fold change)| > 1 were used as the criteria for selecting DEGs. The R packages ‘ggplot2’ and ‘pheatmap’ were used to construct volcano plots and heatmaps of the DEGs, respectively. Robust Rank Aggregation (RRA) is a commonly used bioinformatics tool that comprehensively ranks genes from different data sources [24]. The RRA algorithm was employed to integrate and analyze the three datasets to obtain significant DEGs as potential targets for CC. CC-associated targets were also acquired from the DrugBank, OMIM, GeneCards, PharmGKB, and TTD databases. All CC-related targets were standardized using the UniProt database. Finally, the R package ‘VennDiagram’ was used to identify the potential protein targets of AP against CC.

2.8. Protein-protein interaction (PPI) networks

Potential interactions among the candidate therapeutic targets were explored using the STRING database. The parameters were set as follows: only targets belonging to the species ‘Homo sapiens’ were considered, the minimum interaction score was set to greater than 0.9, and nodes not connected to the network were removed. Cytoscape 3.9.1 was used to analyze and visualize the resulting PPI networks. Node topology parameters were calculated with CytoNCA, and crucial targets were identified.

2.9. GO and KEGG pathway enrichment analyses

The GO and KEGG enrichment analyses were conducted in R with the packages ‘clusterProfiler’, ‘stringr’, ‘AnnotationDbi’, ‘org.Hs.eg.db’, ‘DOSE’, ‘ggplot2’, and ‘ggrepel’. The pvalueCutoff was 0.05 and the qvalueCutoff was 0.2, and only GO and KEGG results with a p-value < 0.05 were retained.

2.10. The construction of a C-T-P network

The C-T-P network was constructed by importing the potential therapeutic targets, active compounds, and top 30 signaling pathways into Cytoscape 3.9.1. Core compounds were screened based on their Degree values in the network.

2.11. Molecular docking

The 6 target proteins with the highest CytoNCA scores in the core PPI network were selected as receptors, and the compounds that passed pharmacokinetic screening were selected as ligands. Three-dimensional (3D) structures of the core targets were obtained from the Protein Data Bank (PDB). Non-protein molecules were removed from the core targets with PyMOL 2.5. Hydrogens were added to the core targets using AutoDock Tools 1.5.7, and the structures were converted to pdbqt format for use as docking receptors. The two-dimensional (2D) structures of the active compounds were drawn in ChemDraw 2019, energy-minimized in Chem3D 2019, and saved in PDB format. The active compounds were then converted to pdbqt format with AutoDock Tools 1.5.7 for use as docking ligands.
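The receptor and ligand selection described at the start of this section defines a grid of docking runs, one per ligand-receptor pair. The following is a minimal Python sketch of that bookkeeping only; it does not call AutoDock Vina. The function names, the target labels `T1` to `T8`, the ligand labels, and all scores are hypothetical placeholders, not values from this study.

```python
from itertools import product


def top_receptors(scores: dict[str, float], n: int = 6) -> list[str]:
    """Return the n targets with the highest score (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


def docking_pairs(ligands: list[str], receptors: list[str]) -> list[tuple[str, str]]:
    """Enumerate every (ligand, receptor) combination to be docked."""
    return list(product(ligands, receptors))


if __name__ == "__main__":
    # Placeholder centrality scores for eight hypothetical targets.
    scores = {"T1": 12.0, "T2": 30.5, "T3": 8.0, "T4": 25.0,
              "T5": 19.5, "T6": 3.0, "T7": 27.0, "T8": 15.0}
    receptors = top_receptors(scores)      # six highest-scoring targets
    ligands = ["ligand_1", "ligand_2"]     # compounds that passed screening
    pairs = docking_pairs(ligands, receptors)
    print(receptors)
    print(len(pairs), "docking runs")
```

With two ligands and six receptors this yields twelve runs, which is also the shape of the ligand-by-receptor matrix that a docking heatmap would display.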
Molecular docking was run with AutoDock Vina, the results were visualized using Discovery Studio 2021 Client, and heatmaps of the molecular docking data were drawn in R with the package ‘pheatmap’.

3. Results

3.1. Sample preparation

The methanol extracts of AP roots were concentrated to about 1.79 kg, the estimated dry paste amount was 1.19 kg, and the dry paste extraction rate was 5.97 %.

3.2. UPLC-Q-TOF-MS analysis

High-resolution mass spectrometry data of the methanol extracts of AP roots were acquired in positive and negative ion modes with the UPLC-Q-TOF-MS system, and the total ion chromatograms (TICs) are shown in Fig. 1. The methanol extract of AP was rich in chromatographic peaks in both positive and negative ionization modes, making the data well suited to comprehensive characterization of its composition.

Fig. 1. TICs of methanol extracts of AP roots based on UPLC-Q-TOF-MS analysis. (A) TIC in the negative ESI mode. (B) TIC in the positive ESI mode.

3.3. Molecular networking and compound identification

Families of structurally related molecules in AP were revealed by MolNetEnhancer. As shown in Fig. 2A, approximately 34.1 % of the MS^2 spectral matching nodes were classified as organoheterocyclic compounds, 7.5 % as lipids, and 5.2 % as phenylpropanoids and polyketides, among others.

Fig. 2. Chemical classification and structural identification of methanol extracts of AP roots. (A) Node annotation using MolNetEnhancer combined with GNPS molecular network, MS2LDA, NAP, and DEREPLICATOR+ output. Nodes of the same color represent the same family of molecules. (B) Compound characterization using GNPS molecular network and NAP, combined with Mass2Motifs generated by MS2LDA. The same color represents the same Mass2Motif. (For interpretation of the references to color in this figure legend, the reader is referred to