Abstract Background Colorectal cancer (CRC) remains one of the leading causes of cancer-related death worldwide. Gelsemium elegans Benth (GEB) is a traditional Chinese medicine commonly used for treatment for gastrointestinal cancer, including CRC. However, the underlying active ingredients and mechanism remain unknown. This study aims to explore the active components and the functional mechanisms of GEB in treating CRC by network pharmacology-based approaches. Methods Candidate compounds of GEB were collected from the Traditional Chinese Medicine@Taiwan, Traditional Chinese Medicines Integrated Database, Bioinformatics Analysis Tool for Molecular mechanism of Traditional Chinese Medicine, and published literature. Potentially active targets of compounds in GEB were retrieved from SwissTargetPrediction databases. Keywords “colorectal cancer”, “rectal cancer” and “colon cancer” were used as keywords to search for related targets of CRC from the GeneCards database, then the overlapped targets of compounds and CRC were further intersected with CRC related genes from the TCGA database. The Cytoscape was applied to construct a graph of visualized compound-target and pathway networks. Protein-protein interaction networks were constructed by using STRING database. The DAVID tool was applied to carry out Gene Ontology and Kyoto Encyclopedia of Genes and Genome pathway enrichment analysis of final targets. Molecular docking was employed to validate the interaction between compounds and targets. AutoDockTools was used to construct docking grid box for each target. Docking and molecular dynamics simulation were performed by Autodock Vina and Gromacs software, respectively. Results Fifty-three bioactive compounds were successfully identified, corresponding to 136 targets that were screened out for the treatment of CRC. Functional enrichment analysis suggested that GEB exerted its pharmacological effects against CRC via modulating multiple pathways, such as pathways in cancer, cell cycle, and colorectal cancer. Molecular docking analysis showed that the representative compounds had good affinity with the key targets. Molecular dynamics simulation indicated that the best hit molecules formed a stable protein-ligand complex. Conclusion This network pharmacology study revealed the multiple ingredients, targets, and pathways synergistically involved in the anti-CRC effect of GEB, which will enhance our understanding of the potential molecular mechanism of GEB in treatment for CRC and lay a foundation for further experimental research. Supplementary Information The online version contains supplementary material available at 10.1186/s12906-021-03273-7. Keywords: Gelsemium elegans Benth, Colorectal cancer, Network pharmacology, Mechanism Background Colorectal cancer (CRC) continues to be one of the leading causes of mortality and morbidity worldwide despite the availability of reliable screening tools and effective therapies. It is the second most common cause of cancer death in the United States when men and women are combined [[39]1]. According to the American Cancer Society’s and GLOBOCAN estimates, it will be 147,950 and 1,931,590 new cases of CRC in the United States and the world for 2020, respectively [[40]2, [41]3]. The incidence and mortality of CRC are rapidly increasing particularly in developing countries, and it is estimated that the global burden of CRC increases by 60% over 2.2 million new cases and 1.1 million cancer death by 2030 [[42]4]. Effective treatments used for CRC may include some combination of surgery, radiation therapy, chemotherapy, immunotherapy and targeted therapy [[43]5–[44]7]. However, the mortality is still relatively high because of delayed diagnosis, metastasis, and frequent recurrence. The 5-year survival rate is less than14% and unfortunately, more than 50% of CRC patients were diagnosed at an advanced stage [[45]8]. Furthermore, using the most prevalent chemotherapy regimens has shown the limitations, a series of side effects commonly accompany patients, such as gastrointestinal reaction, bone marrow suppression, neurotoxicity, and abnormal liver or kidney function. It is of great significance to search for more effective alternative agents with low toxicity for patients. Medicinal herbs are an important, yet often overlooked, a source for novel antineoplastic drugs. Indeed, many chemotherapeutics derived from plants, such as paclitaxel, vinblastine, and vincristine, have proven effective against different tumors. Sometimes as a complementary therapy, medicinal plants are widely used to treat several types of cancers, including CRC, with relatively fewer and milder side effects [[46]9, [47]10]. As an important source of alternative and complementary medicines, traditional Chinese medicine (TCM) has been widely reported to treat cancer [[48]11–[49]13]. In recent years, more and more herbs originating from TCM have attracted considerable attention as anti-CRC agents because of their therapeutic value and low toxicity [[50]9, [51]14]. Gelsemium, as a genus of the Gelsemiaceae family, consists of three well-known species: Gelsemium elegans Benth. (GEB) (Fig. [52]1), native to Southeast Asia and China, and Gelsemium rankinii Small and Gelsemium sempervirens Ait, native to North America [[53]15]. Although GEB is a toxic plant, it has long been used in Chinese folk medicine to treat many diseases, including pain, inflammation, and cancer [[54]16, [55]17]. Alkaloids, isolated and purified from GEB, constitute the main active molecules of GEB and were profoundly studied for their biological activities in several pharmaceutical areas, including anti-inflammatory, antirheumatic, analgesic, immunomodulatory, and anti-tumor activities [[56]16]. GEB has shown potential as a promising anti-tumor agent in clinical practice. Patients with severe primary liver cancer survived after treated with powder of GEB (150 mg,Bid), resuting in tumor shrinkage and pain relief [[57]18]. Notably, GEB has long been used as Chinese folk medicine in the treatment of CRC in Southern China [[58]19]. In vitro study, cytotoxic effects on the tumor cells of digestive system was observed in alkaloidal compounds from GEB [[59]20]. Alkaloids of GEB could inhibit the proliferation and induced the apoptosis of the human colonic carcinoma cells [[60]21, [61]22]. However, the potential active compounds and underlying molecular mechanisms of the anti-CRC effect of GEB remain unknown. Fig. 1. Fig. 1 [62]Open in a new tab Flowers and leaves from Gelsemium elegans Benth With the current merging of bioinformatics, network pharmacology-based analysis has become a robust method to systematically reveal the biological mechanisms of complex diseases and drug effects at the molecular level [[63]23]. It integrates information science, systematic medicine, and is evolving as a promising strategy for the next-generation mode of drug discovery and development for traditional medicine. Compared with traditionally experimental pharmacology research, network pharmacology focuses on analyzing multiple target regulation of multiple chemical components, so it is particularly suitable for the interpretation of the mechanism of TCM [[64]24]. Hence, the present study aimed to reveal the potentially active ingredients against CRC and predict the underlying action mechanism of GEB by employing a network pharmacological method. Methods Potentially active compounds in GEB The information about compounds of GEB was obtained from the published literature [[65]16, [66]25] and the following online databases: (1) Traditional Chinese Medicines Integrated Database (TCMID) ([67]http://119.3.41.228:8000/tcmid/), which is a comprehensive database to provide information and bridge the gap between TCM and modern life sciences [[68]26]. (2) Bioinformatics Analysis Tool for Molecular mechanism of Traditional Chinese Medicine (BATMAN-TCM) ([69]http://bionet.ncpsb.org/batman-tcm/), which is the first online bioinformatics analysis tool specially designed for the research of molecular mechanism of TCM [[70]27]. (3) Traditional Chinese Medicine Database@ Taiwan ([71]http://tcm.cmu.edu.tw/zh-tw/), the world’s largest TCM database for drug screening in silico [[72]28]. Then all compounds of the herbal medicine were determined by removing the duplicate compounds. The candidate active compounds were further filtered by meeting at least two of five features of drug-likeness (Lipinski, Ghose, Veber, Egan, and Muegge) and combining bioavailability score ≥ 30% as suggested by the [73]http://www.swissadme.chwebsite, which allows to compute physicochemical descriptors as well as to predict ADME parameters, pharmacokinetic properties, druglike nature and medicinal chemistry friendliness of one or multiple small molecules to support drug discovery [[74]29]. Targets prediction of compounds in GEB As a popular online server, SwissTargetPrediction ([75]http://www.swisstargetprediction.ch) provides information on chemical substances, biological activities, and allows to estimate the most probable macromolecular targets of a small molecule [[76]30]. 3D molecular structure files of each ingredient that could be retrieved from PubChem ([77]https://pubchem.ncbi.nlm.nih.gov/) were imported into SwissTargetPrediction for identification of potential drug target in humans. The targets of ingredients acquired from SwissTargetPrediction with probability ≥0.1 were chosen as potential targets in this study after removing the repeated targets. Those compounds without target information were excluded. Targets of CRC Information on CRC-associated target genes was collected from the following resources. The different genes involved in CRC were gathered from GeneCards ([78]https://www.genecards.org/), which is a searchable, integrative database that provides comprehensive, user-friendly information on all annotated and predicted genes involved in human diseases [[79]31]. Keywords “colorectal cancer”, “rectal cancer”, and “colon cancer”, were used to search through the database, which identified 33,505 genes with a disease relevance score ≥ 10. Then, the putative target genes of GEB were mapped to the CRC-associated target genes. The candidate anti-CRC targets of GEB were visualized by overlapping the above targets with a Venn diagram. Validation of candidate targets in the cancer genome atlas (TCGA) database CRC related genes from The Cancer Genome Atlas (TCGA) database of ONCOMINE ([80]https://www.oncomine.org/) were used to validate candidate targets of GEB against CRC. Firstly, the top 10% over-expression and under-expression mRNA genes form CRC samples (Cecum Adenocarcinoma vs. Normal, Colon Adenocarcinoma vs. Normal, Colon Mucinous Adenocarcinoma vs. Normal, Rectal Adenocarcinoma vs. Normal, Rectal Mucinous Adenocarcinoma vs. Normal, and Rectosigmoid Adenocarcinoma vs. Normal) were downloaded from the TCGA database of ONCOMINE. Subsequently, a Venn diagram was used to intersect the candidate targets of GEB against CRC genes obtained from the TCGA database. Protein-protein interaction (PPI) data Core regulatory genes can be identified by exploring the protein-protein interaction (PPI). PPI information can be obtained from the STRING database ([81]https://string-db.org/), which covers abundant information regarding known and predicted protein-protein interactions of different species [[82]32]. In this study, high confidence score > 0.7 were reserved and the species was only limited to “Homo sapiens”, then the validated targets were submitted to STRING. Finally, PPI data were extracted. The top 20 proteins with a higher level of degrees were considered as the center targets for GEB in the treatment for CRC. Cluster analysis There are some closely connected regions of molecular complexes in large PPI networks, which are named topology modules or clusters. Cluster analysis is a classification method that involves interconnected regions showing the inherent laws in the network. In this study, significant cluster modules from the constructed PPI network were selected using the Molecular Complex Detection (MCODE), a plug-in of Cytoscape, which was used to detect densely connected regions and cluster analysis in the PPI network [[83]33]. The criteria settings were set as follows: node score cutoff = 0.2; K-core = 2; and degree of cutoff = 2 [[84]34]. GO and KEGG pathway enrichment analysis To explore the gene functions, the Database for Annotation, Visualization and Integrated Discovery (DAVID, [85]https://david.ncifcrf.gov/.ver.6.8) which provides a systematic and comprehensive set of functional annotation tools for investigators to understand the biological meanings behind a large list of genes [[86]35], was applied to perform Gene Ontology (GO) and Kyoto encyclopedia of genes and genome (KEGG) pathway enrichment of proteins in the PPI network analyses.The species was only limited to “Homo sapiens”. Those GO and KEGG pathway terms with only False Discovery Rate (FDR) < 0.01 were considered to be significantly enriched. As for enrichment analysis, the results of enriched GO terms of biological process (BP), cellular component (CC), and molecular function (MF) were visualized by the R software package (3.5.2), as well as the bubble chart of KEGG pathway enrichment. Network construction Four networks were constructed as follows: (1) Compounds-compound targets network of GEB was constructed by connecting chemical compounds with corresponding targets; (2) Potential compounds-targets network of GEB against CRC; (3) Potential compounds-targets-pathways network of GEB against CRC; (4) PPI network of the potential targets of GEB against CRC. The PPI network was completed directly on STRING. The other 3 networks were constructed using the network visualization software Cytoscape ([87]http://cytoscape.org/.ver.3.7.2), which is an open-source software platform suitable for visualizing intermolecular interactions networks and biological pathways [[88]36]. Furthermore, Cytoscape can be used to integrate and analyze these networks with annotations, gene expression profiles, and other complicated data. Three parameters can be calculated to evaluate the topological coefficients of each node. “Degree” represents the number of edges connected to a node; “Betweenness” is defined as the number of times a node act as a bridge along the shortest paths between pairs of other nodes; “Closeness” is the inverse of the sum of the shortest paths from a node to other nodes in the network. Active compounds-targets docking Ten compounds were selected from the core compounds of GEB and docked with six proteins selected from the center targets to verify the accuracy of the main compounds and their corresponding predicted targets. The candidate compound and the crystal structure of the target protein were downloaded from the PubChem database and RCSB protein data ([89]http://www.rcsb.org), respectively. The latter preferably selects a model with ligand binding smaller than 3 Å, and then dehydration, hydrogenation, and separation of ligands were carried out by importing the crystal structure into the Pymol 2.4.1 Software ([90]https://pymol.org/2/); then AutoDockTools 1.5.6 was used to construct the docking grid box of crystal structure for each target [[91]37]. Docking was done by Autodock Vina 1.1.2 software, and the molecules with the lowest binding energy in the docking conformation were selected to observe the binding effect by comparing with the original ligands and intermolecular interactions (such as hydrophobicity, cation-π, anion-π, π-π stacking, hydrogen bonding, etc.). The proteins with the original ligands were specified docking at a domain of the protein, and amino acid residues in the domain were targeted for evaluating the interaction. The number of grid points in the three dimensions (NPTS) used in this study were 40 40 40 0.375. Since RCSB did not find the effective crystal structure of the CCND1 binding ligand, direct docking was performed with grid center 24.683 13.205 61.426. Molecular dynamics simulation In order to analyze the binding affinities of the best hit molecules (gelsesyringalidine and CDK2) after docking, a 10 ns atomistic molecular dynamics (MD) simulation of selected protein-ligand complex was conducted. In the present study the NVIDIA RTX 1060 GPU accelerated GROMACS 2021 software, running over Linux ubuntu 20.04 operating system supported by AMD R5 3600 processor was used. The Charmm36 force field was used to generate protein topology. The ligand topology and parameters required for MD simulation were generated by using CGenFF server. The TIP3P water model was used for solvating each systems followed by neutralization with appropriate numbers of Na^+ and Cl^−. Then energy of each system was minimized by using the steepest descent minimization algorithm with maximum 50,000 steps and < 10.0 kJ/mol force. Position restrains have been applied to receptor and ligand of the each systems for 100 ps throughout heating (300 K) utilizing NVT (No. of atoms, Volume, Temperature) ensemble with leap-frog integrator, a time step of 2 fs and LINCS holonomic constraints.NPT (No. of atoms, Pressure, Temperature) ensemble has been applied at temperature (300 K) for 100 ps using a time step of 2 fs for NPT equilibration phase. After the energy minimization and equilibration of all systems, MD production run has been executed without any restrain for 10 ns with a time step of 2 fs, and after every 10 ps coordinates of the structure have been saved. After the completion of 10 ns MD simulation, the trajectories have been used for various dynamics analysis such as root mean square deviation (RMSD) and root mean square fluctuation (RMSF). These were compared with the primitive ligand complex. Statistics Benjamini–Hochberg correction was performed for multiple testing, and adjusted value <0.05 was set as the threshold. False Discovery Rate (FDR) < 0.01 was deemed as significant enriched in GO and KEGG analysis. Results Potentially active compounds and targets in GEB Using the BATMAN-TCM and TCM@Taiwan databases, we collected a total of 97 compounds in GEB. Eventually, based on the filtering rules, (OB ≥30% and the features of drug-likeness), 56 potentially active compounds were identified from a total compound in GEB. Details of the 56 potentially bioactive compounds are provided in Table [92]1. By using SwissTargetPrediction for target prediction, 729 potential targets were found for GEB (Table S1). The compounds-compound targets network as shown in Fig. [93]2, included 785 nodes: 56 active compound nodes and 729 target nodes, and 3747 edges. The top 5 compound nodes with the greatest number of edges included Humantenine, n-desmethoxtrankinidine, 19-(Z)-Akuammidine, 19 - Z- akuammidine, and N- Desmethoxyhumantenine. Three topological features of these compounds exhibited mean values of degree, node betweenness, and closeness were 111.2, 0.04769 and 0.3723, respectively. The top 5 target nodes with highest degree were LRRK2, JAK1, DRD3, EGFR, and IGF1R. nidine, 19-(Z)-Akuammidine, 19 - Z- akuammidine, and N- Desmethoxyhumantenine. Three topological features of these targets exhibited mean values of degree, node betweenness, and closeness were 29, 0.01554 and 0.4025, respectively. Table 1. Information of the active compounds in GEB for network analysis NO Name Compound CID MW MF Source 1 N- Desmethoxyrankinidine 5,316,594 310.4 C[19]H[22]N[2]O[2] TCMID, References [[94]16, [95]24]