Abstract

Enhancing the production of economically and medically important plant metabolites through genetic and metabolic manipulation is a lucrative approach for improving crop quality. Nevertheless, identifying suitable biosynthetic pathways related to specific bioactivities has proven challenging because the major metabolic and biochemical processes in commercially important plants are intricately interconnected. Plants belonging to the genus Aloe are commercially significant because their medicinal properties have led to extensive use across several industries, including cosmetics, pharmaceuticals, and wellness products. In the present study, we used a reverse association approach to identify potential target metabolic pathways for enhancing the production of commercially important metabolites of Aloe spp., based on their metabolic pathway enrichment profile. The leaves of five highly utilized Aloe spp. were subjected to untargeted gas chromatography-mass spectrometry analysis, followed by testing of free-radical scavenging effects against components of the Fenton and Haber-Weiss reactions. Using appropriate bioinformatics tools, we identified distinct phytochemical classes, determined the enrichment of their corresponding biosynthetic pathways, associated the pathways with bioactivities, and identified the inter-relation between the commonly enriched pathways. The strong association between metabolic pathways and antioxidant potential suggests that distinct but closely related metabolic pathways need to be enhanced together in order to improve the quality of Aloe spp. and maximize their antioxidant effects for commercial exploitation in the cosmetic industry.

Keywords: Aloe, Metabolite, Phytochemical, Antioxidant, Metabolic pathway, Metabolomics

Highlights
• Plants belonging to the genus Aloe are extensively used in the cosmetic industry.
• Identification of the key pathways could guide metabolic engineering strategies.
• Major metabolite classes were identified as relevant to cosmetic products.
• Highly enriched common metabolic and biosynthetic pathways were identified.
• Target pathways pertaining to commercially relevant bioactivities were recognized.

1. Introduction

The genus Aloe comprises the largest group of plants (over 650 spp.) in the family Xanthorrhoeaceae [1]. Aloe spp. contain over 75 micronutrient components and over 250 different phytochemicals, including flavonoids, saponins, anthraquinones, and lignin, that exhibit various bioactivities, including antiviral, antibacterial, antifungal, anticancer, anti-inflammatory, moisturizing, anti-aging, immunostimulatory, anti-radiation, and wound healing properties [2,3]. Currently, these plants are considered highly significant in terms of their economic value in medicine and pharmacology. They are frequently used in primary healthcare and traditional medicinal practice to treat a wide range of disorders by modifying biochemical and molecular pathways [4,5]. In recent years, the use of A. vera in the cosmetic industry has increased considerably, and consumers have widely accepted Aloe-based products because of their low dermatological sensitivity and their skin-smoothing and moisturizing properties [2,6]. Aloe-based formulations are often included in commercial products, especially because of their potent antioxidant effects, an essential prerequisite for cosmetic products [7,8]. For instance, various Aloe compositions have been demonstrated to possess anti-aging effects [9], UV-protective effects [10], and anti-proliferative effects [11] through potent free-radical scavenging and antioxidant properties.
The amount of Aloe plant components in cosmetic products can vary significantly, ranging from less than 1 % to as high as 20 % [12]. Because Aloe gel contains a large variety of non-polar constituents, its components are claimed, upon topical application, to penetrate the deeper layers of the skin, facilitating the restoration of lost moisture and replenishment of the fatty layer while neutralizing free radicals [13]. The composition of non-polar secondary metabolites, such as essential oils, found in medicinal plants is closely linked to their therapeutic effectiveness and can be targeted for enhancing crop quality. Identifying the genes and enzymes involved in the biosynthetic pathways of these substances provides an understanding of how they are regulated metabolically. Research on economically significant phytometabolites is accelerating as researchers aim to understand the pathways responsible for the biosynthesis of key metabolites and to manipulate the phytochemical composition to match commercial demands [14]. Plant species belonging to the genus Aloe are commonly considered safe and serve as valuable resources for traditional remedies, pharmaceuticals, nutraceuticals, aromatherapy, preservatives, beverages, scents, cosmetics, and botanical pesticides [1]. The quantity and diversity of the metabolites are influenced by their biosynthetic pathways. To improve the production of industrially relevant phytochemicals, metabolic and genetic engineering techniques have been employed. For instance, by exploring full-scale functional genomics, one study generated a whole transcriptome sequence database with a focus on the metabolic specificity of A. vera, to identify the pathways related to secondary metabolite formation, metabolic regulation, and signal transduction that contribute to the growth and physiology of the plant [15].
However, for the practical application of these strategies, which are critical for crop improvement, it is necessary to first identify the specific biochemical pathways associated with secondary metabolite formation. This is challenging because many of the plant's biosynthetic pathways are interconnected through a shared carbon pool [16,17]. Instead of focusing on individual genes responsible for a particular industrially significant phytochemical, it is more effective to enhance the complete biosynthetic route or multiple co-occurring biosynthetic pathways. This approach is a more efficient strategy for developing economically valuable phytochemicals [14,18]. To achieve this objective, it is necessary to first identify the co-occurring plant biosynthetic pathways responsible for producing specific metabolites associated with distinct bioactivities. Therefore, in the present study, we aimed to identify the predominant and inter-correlated metabolic and biochemical pathways that commonly dictate the formation of the volatile secondary metabolites in Aloe spp. This was based on the hypothesis that metabolome-based reverse association strategies for identifying the predominant biosynthetic pathways could lead to crop improvement through metabolic engineering approaches. For this purpose, we selected Aloe vera (L.) Burm.f., Aristaloe aristata (Haw.) Boatwr. & J.C.Manning (syn. Aloe aristata), Aloe jucunda Reynolds, Aloe aspera Haw., and Aloe albiflora Guillaumin, which are among the most highly utilized plants of the genus Aloe in the cosmetic industry [19]. The volatile and non-polar metabolome of these plants was fingerprinted in an untargeted manner, followed by chemical-class enrichment and pathway analysis. The bioactivity potential was evaluated by testing the free-radical scavenging effects against intracellular free-radical formation associated with the Fenton and Haber-Weiss reactions.
Collectively, several clusters of enriched co-occurring metabolic pathways were detected, which could potentially be targeted using metabolic engineering strategies to enhance the antioxidant properties of plants belonging to the genus Aloe.

2. Methodologies

2.1. Plant material collection

The plant materials were collected from the medicinal garden of North Bengal University and processed as described previously [5]. In brief, disease-free leaves of A. vera, A. aristata, A. jucunda, A. aspera, and A. albiflora were hand-picked. The plant materials were identified and authenticated with accession numbers 09781, 09766, 09767, 09753, and 09779. The voucher specimens of the leaves were stored in the herbarium of North Bengal University.

2.2. Sample preparation and derivatization

The collected whole leaves were rinsed three times with double-distilled water, chopped into 3–5 mm pieces with a sterile scalpel, and placed in a dehydrating incubator (37 °C) for 7 d until completely dried. Dried leaves (100 mg) were separately mixed with 3 mL n-hexane in recti vials and incubated in the dark for 24 h at 25 °C and 120 rpm. Next, 40 μL N,O-bis(trimethylsilyl)trifluoroacetamide with trimethylchlorosilane (BSTFA + TMCS) was added to the mixture, which was incubated under the same conditions for 6 h. The mixture was then passed through activated charcoal and Na2SO4 (1:2, w/w) in a mini-column. The filtrate was centrifuged at 15,000 rpm for 20 min at 25 °C and filtered through a 0.2 μm membrane syringe filter, and the resulting filtrate was used for GC-MS analysis.

2.3. Gas chromatography-mass spectrometry

The processed and derivatized samples were analyzed on a Trace 1300 gas chromatograph coupled to an ISQ QD single quadrupole mass spectrometer (Thermo-Scientific) as per a previously standardized protocol [14,20]. The separation column was a TG-5MS with dimensions of 30 m × 0.25 mm × 0.25 μm.
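As a consistency check on the oven program used for this separation (60 °C start, 5 °C/min ramp to 290 °C, 4 min hold at each end), the reported total run time can be recomputed from the stated parameters. The sketch below is illustrative only: the function name `gc_run_time_min` is hypothetical, and it assumes a single linear ramp with the 4 min hold applied at both the initial and final temperatures.

```python
def gc_run_time_min(initial_temp_c, final_temp_c, ramp_c_per_min,
                    initial_hold_min=0.0, final_hold_min=0.0):
    """Total run time (min) of a single-ramp GC oven program.

    Assumes one linear ramp from the initial to the final temperature,
    bracketed by optional isothermal holds (an illustrative simplification).
    """
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    if final_temp_c < initial_temp_c:
        raise ValueError("final temperature must not be below initial temperature")
    # Time spent ramping = temperature span / ramp rate.
    ramp_time_min = (final_temp_c - initial_temp_c) / ramp_c_per_min
    return initial_hold_min + ramp_time_min + final_hold_min


# Parameters as stated in the method: 60 -> 290 degC at 5 degC/min,
# with 4 min holds at the start and at the end.
total = gc_run_time_min(60, 290, 5, initial_hold_min=4, final_hold_min=4)
print(f"Total run time: {total:g} min")  # prints "Total run time: 54 min"
```

The 46 min ramp plus the two 4 min holds reproduces the 54 min run time stated for the method.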
Samples (1 μL) were injected in splitless mode using an AI-1310 auto-sampler (Thermo-Scientific). The inlet port temperature was set at 250 °C, the initial column temperature was 60 °C (4 min hold) with a solvent delay of 5 min, and the final temperature was 290 °C with a 4 min hold at the end. The temperature ramp was set at 5 °C/min, giving a total run time of 54 min. The carrier gas was 99.99 % helium at a flow of 1 mL/min, passed through hydrocarbon and dehydrating columns. The transfer line to the mass spectrometer was set at 290 °C and the ion source temperature was kept at 230 °C (electron ionization mode). The MS analyzer range was 50–650 amu, and the samples were analyzed at an electron energy of 70 eV (vacuum pressure of 2.21 × 10^−5 Torr). MS data for the identified peaks were analyzed using AMDIS (V2.7), where the major peaks were identified based on the base peak and molecular peak patterns of the library references using MS Interpreter