Abstract

Seeds play a crucial role in plant reproduction, making it essential to identify genes that affect seed development. In this study, we focused on UDP-glucosyltransferase 71C4 (UGT71C4) in cotton, a member of the glycosyltransferase family that shapes seed width and length, thereby influencing seed index and seed cotton yield. Overexpression of UGT71C4 results in seed enlargement owing to its glycosyltransferase activity on flavonoids, which redirects metabolic flux from lignin to flavonoid metabolism. This shift promotes cell proliferation in the ovule via accumulation of flavonoid glycosides, significantly enhancing seed cotton yield and increasing the seed index from 10.66 g to 11.91 g. By contrast, knockout of UGT71C4 leads to smaller seeds through activation of the lignin metabolism pathway and redirection of metabolic flux back to lignin synthesis. This redirection leads to increased ectopic lignin deposition in the ovule, inhibiting ovule growth and development, and alters yield components, increasing the lint percentage from 41.42% to 43.40% and reducing the seed index from 10.66 g to 8.60 g. Our research sheds new light on seed size development and reveals potential pathways for enhancing seed yield.

Key words: UDP-glucosyltransferase, seed development, phenylpropanoid metabolism, gene editing

This study reports the identification of a UDP-glucosyltransferase, UGT71C4, that influences seed development by shaping seed length and width, thereby influencing seed cotton yield. UGT71C4 activity redirects metabolic flux from lignin to flavonoid metabolism to promote ovule cell proliferation via accumulation of flavonoid glycosides.

Introduction

Seeds are the unique reproductive bodies of gymnosperms and angiosperms; as such, they play central roles in the continuation of a species and serve as the foundation of agricultural production.
Most seeds are responsible for species reproduction and dispersal, with crop seeds also providing food and necessary nutrients for humans and livestock. Mature seeds are rich in various nutrients, including carbohydrates, storage proteins, and lipid storage compounds; ultimately, seeds provide approximately 70% of the energy intake for organisms on Earth (Li et al., 2019), with rice in particular being the main source of carbohydrates for half of the world’s population (Song et al., 2007).

The diversity and complexity of plant secondary metabolites are a consequence of many chemical modifications such as methylation, glycosylation, hydroxylation, and acylation (De Bruyn et al., 2015; Tiwari et al., 2016; Cardenas et al., 2019; Louveau and Osbourn, 2019; Ye et al., 2022). Glycosylation is typically the last modification made to plant natural products, and glycosyltransferases (GTs) are therefore central to plant growth and metabolism. Glycosylation is a core modification for regulation of biological activities, leading to the development of glycodiversity. Notably, the cell walls of plant seeds contain abundant carbohydrate polymers, and changes in sugar metabolism and abundance can alter the accumulation of substances during seed development (Gomez et al., 2009). The plant secondary product glycosyltransferase consensus sequence (called the PSPG box) is the critical uridine diphosphate (UDP)-sugar binding domain of UDP-glucosyltransferases (UGTs). These enzymes transfer a glucosyl group from UDP-glucose, the sugar donor, to proteins, lipids, and small-molecule secondary metabolites, thereby influencing their stability, solubility, and bioactivity (Mackenzie et al., 1997; Le Roy et al., 2016). The PSPG box consists of 44 amino acids and is located near the C terminus of UGTs; within the box, the two short sequences WAPQV and HCGWNS are highly conserved, being present in 95% of GTs.
Notably, GTs lose their catalytic activity if the conserved histidine and glutamic acid residues are mutated (Jones, 2000). In upland cotton, 274 glycosyltransferase genes have been identified, distributed across 18 chromosomes (Xiao et al., 2019). Based on the model plant Arabidopsis, GTs can be divided into 115 families (Coutinho et al., 2003; Gachon et al., 2005), of which GT1 comprises 107 glycosyltransferases and is further divided into 16 subgroups (groups A–P). Members of a subgroup share greater than 60% similarity and have similar substrates and donors (Brazier-Hicks et al., 2018). With the evolution of higher plants, the E group has become the largest GT subgroup; it includes UGT71, UGT72, and UGT88, among many others. E group members UGT71B6 and UGT71C5 are able to recognize and glycosylate abscisic acid (ABA) and ABA-related metabolites (Priest et al., 2006; Liu et al., 2015). Overexpression of UGT72E1, UGT72E2, and UGT72E3 accelerates the accumulation of coniferyl alcohol 4-O-glucoside in phenylpropanoid metabolism (Lanot et al., 2006). In addition, OsGSA1 encodes a UDP-glucosyltransferase that influences grain size by regulating cell proliferation and expansion (Dong et al., 2020).

More than 100,000 secondary metabolites have been found across plant species in past decades (Gachon et al., 2005). These metabolites evolved over a long period of time and play important roles in responses to cellular damage, stress tolerance, and protection from insect invasion (Goodman et al., 2004; Iven et al., 2012; Hu et al., 2013). Plant secondary metabolites are extensively utilized as signal molecules, for example to mediate auxin movement and plant development (Peer and Murphy, 2007; Hu et al., 2021; Mateo-Bonmati et al., 2021).
Multiple metabolic pathways, including lipid (Niu et al., 2009; Yang et al., 2017), sugar (Gomez et al., 2009; Wang et al., 2021a), and phenylpropanoid pathways (Besseau et al., 2007), are involved in seed formation and development. Phenylpropane metabolism is one of the main sources of secondary metabolites, as it includes the flavonoid and lignin pathways. Flavonoids are involved in many aspects of plant growth and development, such as pathogen resistance, UV damage repair, pollen growth, and seed coat development (Mo et al., 1992; Taylor and Grotewold, 2005; Stracke et al., 2010). Arabidopsis mutants with flavonol synthesis defects show aberrant stomatal morphology in cotyledons and defective trichome formation (Ringli et al., 2008). Similarly, flavonoid content in sweet potato (Ipomoea batatas) leaves is reportedly directly proportional to leaf size and epidermal cell number, suggesting that greater flavonoid content can promote leaf development (Gao et al., 2023).

Lignin is a composite phenolic polymer deposited in the secondary cell walls of all vascular plants (Wilkerson et al., 2014). Lignin monomer biosynthesis involves a series of hydroxylation, methylation, and other modification reactions to form basic lignin complexes. Natural lignin polymers are formed from p-coumaryl, coniferyl, and sinapyl alcohol, which respectively produce p-hydroxyphenyl (H), guaiacyl (G), and syringyl (S) units (Ralph et al., 1997; Voxeur et al., 2015). Inhibition of lignin synthesis and deposition in anthers and grains can promote their development (Yoon et al., 2014; Jiang et al., 2019). PtMYB4 and PtMYB8 from Pinus taeda are activators of secondary cell wall deposition in conifers (Bomal et al., 2008).
Overexpression of AtMYB46 has been shown to activate the expression of genes involved in the biosynthesis of xylan, cellulose, and lignin, indicating that AtMYB46 participates in the network regulating Arabidopsis secondary wall synthesis (Zhong et al., 2007). The NAC family proteins NST1, NST2, and NST3 directly target AtMYB46 and its homologs and act as redundant regulators of stem and anther secondary wall synthesis in Arabidopsis. Mtnst1 mutants exhibit secondary wall synthesis defects in the stem and anthers (Zhao et al., 2010).

The yield trait lint percentage (LP) is the proportion of fiber weight in the seed cotton weight and is an important indicator of cotton fiber yield. Seed size significantly impacts LP. Seed development in general has been well studied, but the mechanisms of action of glycosyltransferases in cotton seed development are still unclear. Here we report a gene encoding a UDP-glucosyltransferase (UGT71C4) that is involved in cotton seed development. Overexpression or knockout of UGT71C4 leads to changes in the flavonoid and lignin content of seeds due to altered expression of genes in the flavonoid metabolism and lignin synthesis pathways, ultimately affecting seed size and LP. These findings can lay a solid foundation for further research on seed formation and development in plants and may have benefits for crop yield improvement.

Results

Isolation and characterization of UGT71C4, a gene predominantly expressed during ovule development

We identified 276 UDP-glucosyltransferases in G. hirsutum acc. TM-1 (Hu et al., 2019), 131 in the A subgenome and 145 in the D subgenome. Among these UDP-glucosyltransferase genes, 256 were expressed in ovules and fibers, whereas the remaining 20 were not expressed (Supplemental Figure S1A).
UDP-glucosyltransferase 71C4 (UGT71C4, GH_D05G0690), a gene that encodes a protein of 486 amino acids, was predominantly expressed in ovule tissues at 10, 15, and 20 days post anthesis (DPA), suggesting that it acts in ovule development (Figure 1A). To explore the role of this gene, we first performed a phylogenetic analysis with the flavonoid UGTs of Arabidopsis; UGT71C4 largely resembled members of the UGT71 group, which are known to act as flavonoid catalytic enzymes (Supplemental Figure S1B and Table S1). A conserved region named the PSPG box, which is widely associated with the synthesis of plant secondary metabolites (Liu et al., 2002; Jones et al., 2003; Lim et al., 2003; Dai et al., 2015) and the binding of UDP-sugar donors (Kurosawa et al., 2002; Yonekura-Sakakibara et al., 2008; Ono et al., 2010), was found in the C-terminal amino acid sequence of UGT71C4, and its sequence similarity to functional PSPG boxes ranged from 83% to 85% (Figures 1B and 1C). Protein structure prediction showed that UGT71C4 also contains a pair of tightly linked and face-to-face β/α/β Rossmann-like domains, termed a GT-B fold (Figure 1D), which is responsible for recognizing and binding donors and acceptors through the resulting loose cleft (Lairson et al., 2008). These findings suggest that UGT71C4, a member of Glycosyltransferase family 1 that contains a GT-B fold and a PSPG box, may be responsible for catalyzing the glycosylation of terpenes, flavonoids, and anthocyanins during ovule development.

Figure 1. Phylogenetic tree, protein structure, and characterization of UGT71C4. (A) UGT71C4 transcript levels (FPKM) from −3 to 25 DPA in ovules and at 5, 10, and 20 DPA in fibers. (B) Protein sequences of the conserved plant secondary product glycosyltransferase (PSPG) motif in all UGTs, which contains a conserved domain of 44 amino acids and two short highly conserved sequences indicated by a red box.
(C) Multiple alignment of cotton UGT71C4 and Arabidopsis UGTs that are its near neighbors in the phylogenetic tree. The conserved domain of the PSPG box is indicated by a red box, and ∗ indicates conserved amino acids. (D) GT-B enzyme structure showing the two β/α/β Rossmann-like domains, which are oriented face-to-face and are not very closely associated. The active site is located in the gap between them.

UGT71C4 pleiotropically influences seed and lint development

To gain detailed insights into the functions of UGT71C4 in cotton, we used clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated nuclease 9 (Cas9)-mediated targeted mutagenesis to produce knockout lines (UGT71C4-KO). After producing T1 homozygous lines, we obtained five independent lines with different mutations, named UGT71C4-KO-1 to UGT71C4-KO-5. The mutations were as follows: UGT71C4-KO-1 contained a 1-base pair (bp) adenine (A) deletion, UGT71C4-KO-2 contained a 5-bp TTCAG deletion, and UGT71C4-KO-3, UGT71C4-KO-4, and UGT71C4-KO-5 harbored a 1-bp thymine (T) deletion, a 1-bp T insertion, and a 1-bp A insertion, respectively (Supplemental Figure S2A). UGT71C4-KO-3 and UGT71C4-KO-5 were found to be Cas9-free lines, so they were used for further experiments (Figure 2A, Supplemental Figure S2B). Both lines exhibited a reduction in seed size (Figures 2B and 2C): mature seeds were significantly narrower (UGT71C4-KO-3 4.57 ± 0.98 mm and UGT71C4-KO-5 4.36 ± 1.02 mm vs. W0 4.86 ± 0.90 mm) (Figure 2D) and substantially shorter (UGT71C4-KO-3 8.13 ± 1.42 mm and UGT71C4-KO-5 8.57 ± 0.64 mm vs. W0 10.34 ± 0.51 mm) (Figure 2E). As expected, UGT71C4-KO-3 and UGT71C4-KO-5 also exhibited reductions in seed index of 15.97% and 19.32%, respectively, compared with W0 (Figure 2F).
However, LP was significantly higher by 4.77% for UGT71C4-KO-5 and 3.32% for UGT71C4-KO-3 compared with the receptor W0 (Figure 2G).

Figure 2. UGT71C4 affects seed size and weight. (A) CRISPR-Cas9-mediated mutagenesis of UGT71C4. Top: Schematic diagram of UGT71C4; the CRISPR-Cas9 target site is indicated by an arrow. Exons and introns are represented by black boxes and black lines, respectively. Bottom: Alignment of receptor W0 and mutant sequences containing the target site. In the receptor sequence, the target sequence adjacent to the underlined PAM is indicated in red. The newly generated UGT71C4-KO-3 and UGT71C4-KO-5 mutants harbored a 1-bp T deletion and a 1-bp A insertion, respectively. (B), (D) Seed width and (C), (E) seed length in transgenic plants compared with receptor plants. UGT71C4-OE-1, UGT71C4-OE-2, and UGT71C4-OE-3 are overexpression lines; UGT71C4-KO-3 and UGT71C4-KO-5 are knockout lines; W0 is the receptor. Scale bar represents 1 cm. (F) Seed indexes of the various UGT71C4 lines. (G) Lint percentages of the various UGT71C4 lines. (H) Relative expression of UGT71C4 in ovules at 5, 10, 15, and 20 DPA. ∗p < 0.05 and ∗∗p < 0.01 by two-tailed Student’s t-test (seed width and length, n = 50; others, n = 3).

Next, the full-length coding sequence of UGT71C4 was cloned into a vector with the CaMV 35S promoter and used to generate overexpression lines. More than 11 independent overexpression lines were identified among the T1 homozygotes, of which the three lines with the highest expression were further investigated; these are referred to as UGT71C4-OE-1, UGT71C4-OE-2, and UGT71C4-OE-3 (Supplemental Figure S2C). Expression during ovule development was quantified by qRT–PCR, which showed significantly increased UGT71C4 abundance in 5, 10, and 15 DPA ovules of overexpression lines compared with W0 (Figure 2H).
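The percentage changes in seed index and LP reported in this section are relative changes against the receptor W0. As a minimal sketch of that arithmetic, the snippet below uses the seed index and LP values quoted in the abstract; the function and constant names are illustrative and not part of the study. Because LP is itself a percentage, the sketch reports both the relative change and the difference in percentage points, which are easy to confuse.

```python
def relative_change_percent(reference: float, value: float) -> float:
    """Change of `value` relative to `reference`, expressed in percent."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / reference * 100.0


# Values quoted in the abstract (illustrative constant names).
SEED_INDEX_W0_G = 10.66   # receptor W0
SEED_INDEX_OE_G = 11.91   # overexpression
SEED_INDEX_KO_G = 8.60    # knockout
LP_W0_PCT = 41.42         # lint percentage, receptor W0
LP_KO_PCT = 43.40         # lint percentage, knockout

if __name__ == "__main__":
    rows = [
        ("Seed index, OE vs. W0", SEED_INDEX_W0_G, SEED_INDEX_OE_G),
        ("Seed index, KO vs. W0", SEED_INDEX_W0_G, SEED_INDEX_KO_G),
        ("Lint percentage, KO vs. W0", LP_W0_PCT, LP_KO_PCT),
    ]
    for label, ref, val in rows:
        print(f"{label}: {relative_change_percent(ref, val):+.2f}% (relative)")
    # For LP, the absolute difference is in percentage points, not percent.
    print(f"Lint percentage, KO vs. W0: {LP_KO_PCT - LP_W0_PCT:+.2f} percentage points")
```

Small differences between values computed this way and the percentages given in the text are expected, since the published means are rounded to two decimals.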
Seeds of the overexpression lines were significantly (p < 0.01) wider and longer (UGT71C4-OE-1 w: 5.02 ± 0.78 mm, l: 10.93 ± 0.63 mm; UGT71C4-OE-2 w: 5.20 ± 0.92 mm, l: 10.69 ± 0.50 mm; and UGT71C4-OE-3 w: 5.22 ± 0.83 mm, l: 9.81 ± 0.93 mm vs. W0 w: 4.86 ± 0.90 mm, l: 10.34 ± 0.51 mm) (Figures 2D and 2E). Larger grain and seed size contributes to higher crop yield, to a certain extent. The overexpression line UGT71C4-OE-1 showed the largest increase in seed index, at 11.78% more than that of W0; it was followed by UGT71C4-OE-3 at 5.56% and then UGT71C4-OE-2 at 4.47% (Figure 2F). However, LP was decreased by 9.35%, 7.36%, and 4.18% in UGT71C4-OE-1, UGT71C4-OE-2, and UGT71C4-OE-3, respectively (Figure 2G). There were no stable and significant changes in fiber length, elongation, strength, or micronaire value in either the overexpression or knockout lines (Supplemental Table S2). All told, these results indicate that UGT71C4 pleiotropically influences seed size by affecting seed width and length along with lint development, without affecting fiber quality.

UGT71C4 influences seed development by controlling cell proliferation

To understand the effects of UGT71C4 overexpression and depletion during seed development, scanning electron microscopy (SEM) was performed on outer epidermal cells of ovules at −1 DPA. This revealed a certain degree of correlation between epidermal cell number and ovule size. Compared with W0, more epidermal cells were observed in UGT71C4-OE and fewer in UGT71C4-KO (Figure 3A and 3B). We then tracked changes in ovule development from 5 DPA to 20 DPA, which revealed large differences in seed size during the period of 15 DPA to 20 DPA (Figure 3C). UGT71C4-OE produced a greater number of ovule inner seed coat cells compared with W0, resulting in smaller cells that were tightly arranged with small cell gaps (Figure 3D and 3E).
By contrast, UGT71C4-KO exhibited inner seed coat cells that were larger in cross-sectional area and more loosely arranged.

Figure 3. UGT71C4 shapes seed size by regulating seed cell proliferation. (A) Scanning electron micrographs of ovule outer epidermal cells at −1 DPA. Scale bar represents 10 μm. (B) Number of outer epidermal cells (n = 20–25) at 10 DPA. (C) Ovule size in transgenic lines from 5 to 20 DPA. Scale bar represents 1 cm. (D) Images of parenchymal cells of the ovule inner integument from UGT71C4-OE-1, UGT71C4-OE-2, UGT71C4-OE-3, UGT71C4-KO-3, UGT71C4-KO-5, and W0. Scale bar represents 30 μm. (E) Number of inner integument cells in the same area. ∗p < 0.05 and ∗∗p < 0.01 by two-tailed Student’s t-test (n = 6–10).

Ultimately, the UGT71C4-OE line possessed more inner seed coat cells and larger seeds, whereas UGT71C4-KO had smaller seeds with fewer inner seed coat cells that were sparsely arranged. Therefore, we inferred that UGT71C4 affects seed development by influencing cell proliferation and expansion.

UGT71C4 activates the flavonoid metabolism pathway and thereby influences seed development

Taking into account factors such as yield traits, gene expression levels, and stable genetics, we selected UGT71C4-OE-1, UGT71C4-KO-5, and the receptor W0 for a widely targeted metabolomics assay and RNA-sequencing (RNA-seq) analysis of 15-DPA ovules. These lines are henceforth referred to as UGT71C4-OE, UGT71C4-KO, and W0, respectively. Through ultraperformance liquid chromatography–tandem mass spectrometry (UPLC–MS/MS) detection and database analysis, we identified 97 differential metabolites in ovules of UGT71C4-OE compared with W0; 60 were downregulated, including ferulic acid, sinapic acid, cordycepin, and isoscopoletin, and 37 were upregulated, including naringenin, dihydrokaempferol, dihydroquercetin, and cinnamic acid.
In UGT71C4-KO, 112 differential metabolites were detected, 23 of which were downregulated, including dicaffeoylquinic acid-O-glucoside and formononetin-7-O-glycoside, and 89 of which were upregulated, such as eriodictyol, p-coumaryl alcohol, and phenyl acetate (Supplemental Table S3–S6). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the differential metabolites identified enrichment of the “Phenylpropanoid biosynthesis,” “Flavonoid biosynthesis,” “Flavone and flavonol biosynthesis,” and “Biosynthesis of secondary metabolites” pathways (Supplemental Figure S3).

The phenylpropane metabolic pathway produces a large number of secondary metabolites, mainly flavonoids and monolignols. Compared with those of W0 and UGT71C4-KO, the 15-DPA ovules of UGT71C4-OE clearly had higher flavonoid levels (Figure 4A). In particular, chalcone content was significantly increased by 3.29-fold in UGT71C4-OE but was not detected in UGT71C4-KO. Naringenin, the downstream product of chalcone, also exhibited 4.89-fold greater accumulation in ovules of the overexpression line. Similar trends were observed for dihydroquercetin and dihydrokaempferol, downstream products of the flavonoid pathway; accumulation of these molecules was increased by 4.24-fold and 8.99-fold, respectively, in ovules of the overexpression line compared with the receptor W0 (Figure 4B).

Figure 4. Flavonoid metabolism alteration in seeds of UGT71C4 transgenic plants. (A) Heatmap of the relative flavonoid and lignin contents in ovules at 15 DPA for UGT71C4-OE, UGT71C4-KO, and W0 (n = 3) based on widely targeted metabolomic data. (B) Differential metabolites in the flavonoid metabolic pathway. ∗p < 0.05 and ∗∗p < 0.01 by two-tailed Student’s t-test (n = 3). (C) Mapping of flavonoid metabolism pathways to visualize overall differences in 15-DPA ovules of UGT71C4-OE, UGT71C4-KO, and W0. Relative expression of key catalytic enzymes is shown in the box.
(D) Relative expression levels of flavonoid-biosynthesis-related genes in UGT71C4-OE, UGT71C4-KO, and W0. ∗p < 0.05 and ∗∗p < 0.01 by two-tailed Student’s t-test (n = 3).

Phenylalanine ammonia lyase (PAL) acts as an entry enzyme for the phenylpropane metabolic pathway, directing more reduced carbon into basal phenylpropane metabolism, and 4-coumarate-CoA ligase (4CL) is an enzyme shared by the flavonoid and lignin branches and is therefore involved in the synthesis of flavonoids and cell-wall-bound phenolics. Both PAL and 4CL showed significantly higher expression in UGT71C4-OE but significantly lower expression in UGT71C4-KO (Figure 4C). RNA-seq data confirmed significantly higher expression of the four major flavonoid synthesis genes in UGT71C4-OE and significantly lower expression in UGT71C4-KO; these genes are chalcone synthase (CHS: GH_A02G0270, GH_D02G0295), which can direct metabolic fluxes toward flavonoid biosynthesis, chalcone isomerase (CHI: GH_A13G0216, GH_D04G0140), flavanone 3-hydroxylase (F3H: GH_A12G0596), and flavonoid 3′-monooxygenase (F3′H: GH_A12G2012) (Figure 4D).

Taken together, these results support the hypothesis that UGT71C4 actively participates in the flavonoid branch of the phenylpropanoid metabolism pathway. Relative to the UGT71C4 loss-of-function line, UGT71C4-OE plants preferentially direct the flux of phenylpropane metabolism products into the flavonoid pathway rather than the lignin pathway, causing accumulation of flavonoids in the 15-DPA ovule, which in turn promotes increased expression of key flavonoid entry enzymes and related synthases and thereby promotes further metabolic flux into the flavonoid branch.

UGT71C4 also influences monolignol synthesis

Considering that overexpression of UGT71C4 triggers changes in metabolic fluxes in phenylpropane metabolism, we next explored the changes brought about by loss of its function.
Of the three main lignin monomers, p-coumaryl alcohol (H unit) exhibited 2.40-fold higher content in ovules of UGT71C4-KO compared with those of W0, but it was reduced to only 0.78 times the W0 level in UGT71C4-OE; there was no significant change in sinapyl alcohol (S unit) content in UGT71C4-KO ovules, whereas UGT71C4-OE ovules contained only 0.09 times as much S monomer as W0. Similar results were observed for the precursors of sinapyl alcohol, ferulic acid and sinapic acid; namely, ferulic acid accumulated in UGT71C4-KO ovules to 1.58 times the level in the receptor W0, and sinapic acid accumulated to 1.69 times the receptor level (Figure 5A). There was no difference in the relative content of guaiacyl (G unit) between UGT71C4-OE and UGT71C4-KO ovules. These three monolignols undergo a complex polymerization process to produce lignin, which is deposited in plant secondary walls and constitutes one of the main components of the cell wall. Here, significant accumulation of lignin was observed in mature seeds of UGT71C4-KO, but seeds of UGT71C4-OE demonstrated no significant difference compared with the receptor W0 (Figure 5B).

Figure 5. Lignin metabolism alteration in seeds of UGT71C4 transgenic plants. (A) Differential metabolites in the lignin metabolic pathway. ∗p < 0.05 and ∗∗p < 0.01 by two-tailed Student’s t-test (n = 3). (B) Mature seed lignin content in UGT71C4-OE, UGT71C4-KO, and W0 (n = 3). (C) Relative expression levels of lignin-biosynthesis-related genes. ∗p < 0.05 and ∗∗p < 0.01 by two-tailed Student’s t-test (n = 3). (D), (E) Mapping of lignin metabolism pathways to visualize overall differences in 15-DPA ovules of UGT71C4-OE, UGT71C4-KO, and W0 plants. The relative expression of key catalytic enzymes is shown in the box. ∗p < 0.05 and ∗∗p < 0.01 by two-tailed Student’s t-test (n = 3).
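The relative monolignol and precursor contents quoted above are ratios to the W0 level, which makes increases and decreases hard to compare directly (0.09 times is a much larger shift than 1.58 times). One common way to put them on a symmetric scale is the log2 fold change. The following is a minimal sketch of that conversion, assuming each quoted value is read as a plain ratio to W0 (W0 = 1.0); the dictionary layout and function name are illustrative only.

```python
import math

# Ratios to the W0 level as quoted in the text (assumed reading: W0 = 1.0).
RATIO_TO_W0 = {
    ("p-coumaryl alcohol", "UGT71C4-KO"): 2.40,
    ("p-coumaryl alcohol", "UGT71C4-OE"): 0.78,
    ("sinapyl alcohol", "UGT71C4-OE"): 0.09,
    ("ferulic acid", "UGT71C4-KO"): 1.58,
    ("sinapic acid", "UGT71C4-KO"): 1.69,
}


def log2_fold_change(ratio: float) -> float:
    """Convert a ratio to W0 into a log2 fold change (0 means no change)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.log2(ratio)


if __name__ == "__main__":
    for (metabolite, line), ratio in RATIO_TO_W0.items():
        print(f"{metabolite:20s} {line:12s} ratio={ratio:4.2f} "
              f"log2FC={log2_fold_change(ratio):+.2f}")
```

On this scale, positive values indicate accumulation relative to W0 and negative values indicate depletion, with equal magnitudes representing equal fold shifts in either direction.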
In light of the altered monolignol content in ovules and lignin content in mature seeds, we endeavored to further elucidate the molecular mechanism by which UGT71C4 influences seed development by performing transcriptome sequencing (RNA-seq) of 15-DPA ovules from UGT71C4-OE, UGT71C4-KO, and their receptor W0. RNA-seq analysis identified 4359 differentially expressed genes (DEGs) in UGT71C4-OE compared with the receptor, including 2200 upregulated genes and 2159 downregulated genes; UGT71C4-KO exhibited only 279 DEGs, with 72 upregulated and 207 downregulated (Supplemental Figure S4A–S4C). Gene Ontology (GO) enrichment analysis identified terms such as “cell wall,” “cell wall organization,” “cell wall biogenesis,” and “peroxidase activity” as enriched among DEGs in both the overexpression and knockout lines (Supplemental Figure S4D and S4E). Typical activators of lignin synthesis, the NAC family genes NST1/NST2, showed decreased transcript levels in the ovules of UGT71C4-OE, whereas their expression in UGT71C4-KO was 1.21 times that in the receptor W0. In addition to NST1/NST2, two other activators of lignin synthesis, MYB46 and MYB83, were predominantly expressed in UGT71C4-KO, at levels 1.67 and 2.11 times those in the receptor W0, respectively. By contrast, the lignin synthesis inhibitor MYB8 was predominantly expressed in UGT71C4-OE, at 1.62 times the level in the receptor W0 (Figure 5C).

Monolignol synthesis occurs primarily through the lignin branch of phenylpropane metabolism. We observed that transcript levels of catalytic enzymes in this pathway also changed upon perturbation of UGT71C4 expression. A particularly notable example is hydroxycinnamoyl-CoA shikimate hydroxycinnamoyl transferase (HCT: GH_D05G1187), which can direct metabolic flux from the general phenylpropane pathway toward monolignol biosynthesis. HCT expression was increased by 13% in UGT71C4-KO ovules but significantly decreased by 56.1% in UGT71C4-OE ovules.
The expression of caffeic acid O-methyltransferase (COMT: GH_A12G2656) was also significantly decreased by 58.20% in UGT71C4-OE but significantly increased by 8.12% in UGT71C4-KO. Another catalytic enzyme, cinnamoyl-CoA reductase (CCR), also showed a 7% increase in expression in UGT71C4-KO but a 47.84% decrease in UGT71C4-OE (Figure 5D and 5E).

Taken together, these results show that when UGT71C4 is knocked out, the balance of the phenylpropanoid metabolism pathway is altered, metabolic flow is preferentially directed toward lignin metabolism rather than the flavonoid pathway, and more lignin than flavonoids accumulates in the ovule. In addition to significant lignin deposits, UGT71C4-KO ovules exhibited greater activity of genes and enzymes related to lignin synthesis; conversely, such activity was significantly suppressed in seeds of UGT71C4-OE.

UGT71C4 exhibits specialized glucosyltransferase activity

Our findings demonstrated that major metabolites and key enzymes of the phenylpropane pathway accumulate to significantly different levels in UGT71C4-OE and UGT71C4-KO ovules. As UGT71C4 belongs to the UDP-glycosyltransferase family and sequence analysis suggested that it has a functional PSPG box for sugar-donor binding, we next investigated whether glycoside contents were altered in 15-DPA ovules of the overexpression and knockout lines. A widely targeted metabolomics assay showed that UGT71C4-OE ovules accumulated greater amounts of flavonoid glycosides such as naringenin-4′-O-glucoside, kaempferol-4′-O-glucoside, kaempferol-3-O-arabinoside, scopoletin-7-O-glucoside, and kaempferol-3′-O-glucoside; no significant accumulation was seen in ovules from the receptor W0 or UGT71C4-KO (Figure 6A).

Figure 6. Glucosyltransferase activity of UGT71C4 toward naringenin. (A) Flavonoid glycoside profiles of ovules from UGT71C4 transgenic lines. (B) Western blot detection of UGT71C4 protein. M: marker.
Red arrow indicates UGT71C4 protein. (C) HPLC analysis of the negative control (the reaction system without UGT71C4 protein). (D) The glycosyltransferase activity of UGT71C4 toward naringenin was determined by HPLC analyses, revealing two peaks at Rt = 6.872 min and Rt = 10.330 min. The former peak represents newly generated product. (E) HPLC analysis of the authentic naringenin standard. (F) HPLC analysis of the authentic naringenin-7-O-glycoside (N7G) standard. (G) Mass spectrum of the negative control reaction system without UGT71C4 protein. (H) MS analysis of the glucoside products produced by UGT71C4. The mass spectrum shows the parent molecule ion with m/z 271.0000 [M-H]^− and N7G with m/z 433.0000 [M-H]^−. (I), (J) MS analyses of the authentic naringenin and N7G standards. (K) UGT71C4 glycosylation reaction process.

To explore the ability of UGT71C4 protein to glycosylate flavonoids and monolignols, recombinant UGT71C4 protein was purified and in vitro activity assays were performed (Figure 6B). Metabolites whose levels differed significantly among the UGT71C4-OE, UGT71C4-KO, and receptor W0 lines were selected as substrates, including naringenin, dihydrokaempferol, dihydroquercetin, p-coumaryl alcohol, and sinapic acid, with UDP-glucose as the sugar donor. Upon HPLC/UPLC‒ESI–MS analysis, the negative control without UGT71C4 protein exhibited only a single naringenin peak (Rt = 10.321 min) (Figure 6C). By contrast, the reaction system containing UGT71C4 showed both the naringenin peak (Rt = 10.330 min) and an additional peak at Rt = 6.872 min (Figure 6D). The retention time of the naringenin standard was 10.297 min (Figure 6E). Comparison with standards revealed that the novel peak had a retention time similar to that of naringenin-7-O-glycoside (N7G) (Figure 6F).
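The [M-H]^− m/z values given for naringenin and N7G in the Figure 6 legend can be cross-checked from elemental composition. The following is a minimal sketch, not the authors' analysis pipeline. It assumes the standard molecular formulas C15H12O5 for naringenin and C21H22O10 for its 7-O-glucoside (neither formula is stated in the text), takes N7G to be the glucoside formed from UDP-glucose, and uses standard monoisotopic masses; all names in the code are illustrative.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON_MASS = 0.00054858

# Elemental compositions: assumed here, not given in the text.
NARINGENIN = {"C": 15, "H": 12, "O": 5}
N7G = {"C": 21, "H": 22, "O": 10}  # naringenin-7-O-glucoside


def neutral_mass(formula: dict) -> float:
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())


def mz_deprotonated(formula: dict) -> float:
    """m/z of the singly charged [M-H]- ion (one proton removed)."""
    proton_mass = MONOISOTOPIC_MASS["H"] - ELECTRON_MASS
    return neutral_mass(formula) - proton_mass


def glucosylation_shift() -> float:
    """Mass added by attaching one glucose with loss of water (C6H10O5)."""
    return neutral_mass(N7G) - neutral_mass(NARINGENIN)


if __name__ == "__main__":
    print(f"naringenin [M-H]-: {mz_deprotonated(NARINGENIN):.4f}")
    print(f"N7G        [M-H]-: {mz_deprotonated(N7G):.4f}")
    print(f"shift on glucosylation: {glucosylation_shift():.4f}")
```

Under these assumptions, the two ions differ by the mass of one anhydroglucose unit, which is the 162-unit gap between the nominal m/z values of 271 and 433 in the legend.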
The identification of the new peak as N7G was confirmed by mass spectrometry, which showed that the reaction products consisted of N7G at an m/z of 433.0 and naringenin at an m/z of 271.0 (Figure 6H), values that corresponded well to those of the naringenin (Figure 6I) and N7G (Figure 6J) standards. The no-UGT71C4 negative control only showed naringenin at an m/z of 271.1 (Figure 6G). On the basis of these results, UGT71C4 possesses the ability to glycosylate naringenin to form N7G (Figure 6K), but under the conditions tested, it did not show glucosyltransferase activity toward dihydrokaempferol, dihydroquercetin, p-coumaryl alcohol, or sinapic acid (Supplemental Figure S5–S8). Collectively, our findings demonstrate that UGT71C4 has glucosyltransferase activity toward naringenin, consistent with the accumulation of flavonoids and flavonoid glycosides observed in UGT71C4-OE.

UGT71C4 affects cell proliferation and ROS content

GO-term enrichment analysis of RNA-seq data revealed that the term “peroxidase activity,” which is associated with cell proliferation, was enriched only in DEGs of UGT71C4-OE (Supplemental Figure S4D and S4E). The DEGs annotated with this term included PRX72_A and PRX52_A, which are homologous to AtPRX72 and AtPRX52, sharing 88% and 79% amino acid similarity, respectively (Supplemental Figure S9). As illustrated in Supplemental Figure S10A, PRX72_A, PRX72_D, PRX52_A, and PRX52_D are transcribed at a significantly higher level in UGT71C4-OE and a significantly lower level in UGT71C4-KO. Nitroblue tetrazolium (NBT), which is reduced to a dark blue precipitate by superoxide anions, was used to detect oxygen radicals in the leaves of UGT71C4-KO, UGT71C4-OE, and W0. This staining revealed substantial accumulation of superoxide anions in UGT71C4-OE leaves but little in UGT71C4-KO leaves (Supplemental Figure S10B and S10C).

We also observed differential expression of many genes related to cell growth and development.
One such is OsTGW6 (GH_A01G2007), which has been shown to control the cellularization stage of endosperm to increase grain weight in rice ([161]Ishimaru et al., 2013). In soybeans, GmSWEET10a and CYP450 family members ([162]Yang et al., 2013; [163]Wang et al., 2015; [164]Tang et al., 2016; [165]Zhao et al., 2016; [166]Liu et al., 2020) have been shown to affect seed size and other traits ([167]Duan et al., 2022). Cotton genes of interest include the gibberellic acid signaling pathway gene GH_A05G2747 (GL10) ([168]Zhan et al., 2022), sucrose synthesis pathway genes like GH_A05G0363 (GhSus4) ([169]Ruan et al., 2003; [170]Jiang et al., 2012; [171]Liu et al., 2020), the phytosterol biosynthetic gene GH_A08G0610 (GhSMT1) ([172]Suo et al., 2021), and the UDP-glucosyltransferase gene GH_D05G3357 (GSA1), which is known to control grain size and abiotic stress tolerance ([173]Dong et al., 2020). More complicated cases include DA1 and its homolog GW2, key genes that positively regulate cell proliferation in Arabidopsis thaliana and maize but negatively regulate grain weight and grain width in rice ([174]Li et al., 2008; [175]Xie et al., 2018; [176]Achary and Reddy, 2021). Other seed developmental genes that demonstrated remarkable increases or decreases are presented in [177]Supplemental Figure S10D. For validation by qRT–PCR, we selected the HECT E3 ligase UPL3, which has proven involvement in seed development, and the transcription factor PGL1 (PRE1), a member of the basic helix-loop-helix (bHLH) family that has roles in cell division and cotton plant development ([178]Zhang et al., 2009). The results confirmed that the expression of these two genes was altered by UGT71C4 perturbation, being significantly increased in UGT71C4-OE and decreased in UGT71C4-KO ovules at 5 DPA, 10 DPA, and especially 15 and 20 DPA ([179]Supplemental Figure S11). 
All told, our RNA-seq analysis confirmed that overexpression and knockout of UGT71C4 affect the expression of genes that control cell proliferation, resulting in altered seed size. Discussion The gene UGT71C4 has intriguing potential for improvement of cotton production. With its knockout in UGT71C4-KO, LP was increased from 41.42% to 43.40%. With its overexpression in UGT71C4-OE, seed index was increased by 11.78%, ovule inner seed coat cells were more closely arranged and more numerous, and mature seeds were longer and wider ([180]Figure 2). UGT71C4 has glucosyltransferase activity toward naringenin and was able to generate N7G in vitro; via this activity, it could regulate the distribution of metabolic flux in the phenylpropanoid metabolism pathway between flavonoid synthesis and monolignol synthesis, thereby affecting cotton ovule development. As improvements in SI or LP would benefit cotton production, UGT71C4 could be a candidate gene for future studies aimed at improving yield parameters or seed composition. A widely targeted metabolomics assay revealed that seeds of UGT71C4-OE accumulated more flavonoids in their ovules, including more chalcone, naringenin, dihydroquercetin, and dihydrokaempferol ([181]Figure 4). Researchers have intensively studied the glycosylation substrates of Arabidopsis thaliana UGT71 family members, which exhibit high substrate specificity for low-molecular-weight compounds such as flavonoids ([182]Lim et al., 2001; [183]Lim, 2002). UGT71C1, in particular, shows catalytic activity toward the 3-OH of hydroxycoumarins, hydroxycinnamates, and flavonoids ([184]Lim et al., 2003, [185]2008). Our results showed that UGT71C4 possesses glucosyltransferase activity for flavonoids, glycosylating naringenin to form N7G ([186]Figure 6C–6K), which has higher stability and solubility. 
Such conversions of metabolites enable plants to respond to environmental changes and maintain normal function ([187]Jiang et al., 2016); in addition, flavonols can directly promote cell growth and support more energy storage, with moderate accumulation having been shown to contribute to cell proliferation ([188]Broun, 2005; [189]Taylor and Grotewold, 2005; [190]Yang et al., 2021). Thus, accumulation of flavonoids in ovules tends to promote seed development. Metabolic flux redirection (MFR) can occur in various branches of phenylpropane metabolism. UGT83A1 influences MFR between lignin and flavonoid glycosides during abiotic stress, thereby protecting the plant from damage ([191]Dong et al., 2020). In Arabidopsis thaliana, the miPEP858a mutation results in a redirection of metabolic flux from lignin toward flavonoids, including anthocyanins ([192]Sharma et al., 2020). Here, we found that UGT71C4 likewise influenced MFR between lignin and flavonoid glycosides during cotton ovule development. We further observed that this MFR affected the expression of key enzymes in each branch, namely CHI, CHS, F3H, and F3′H in the flavonoid metabolic pathway and PAL, C4H, and 4CL in phenylpropane metabolism ([193]Figure 4C and 4D). As the core enzymes of phenylpropane metabolism, PAL, C4H, and 4CL ([194]Zhang and Liu, 2015) finely regulate the action of MFR on phenylpropane metabolism at multiple levels, which means that the action of MFR on these key synthases facilitates feedback on the phenylpropane pathway, thereby regulating plant growth and development. In previous studies, mutation of chalcone synthase in Zea mays and Petunia hybrida was found to block the first step of the flavonoid pathway, resulting in pollen tube growth defects and hindering root hair development ([195]Buer and Muday, 2004; [196]Taylor and Grotewold, 2005). CHS catalyzes the first step in the flavonoid pathway, which ultimately produces a series of flavonoid products. 
Silencing of CHS in soybeans has been shown to result in a lack of flavonoids, which inhibits nodule formation ([197]Wasson et al., 2006). Because the cotton seed is also a major site of nutrient storage, we infer that upon overexpression of UGT71C4, a greater metabolic flux flows into flavonoid metabolism, particularly the more stable flavonoid glycosides. Because this enzyme catalyzes glycoside formation, our findings provide new insights into the roles of flavonoids. Research in rapeseed has shown that phenolic compounds can affect seed weight and size, as well as alter the internal structure of the seed and the overall metabolic level of the plant ([198]Clauss et al., 2011). With repression of the lignin biosynthetic gene HCT in Arabidopsis, the metabolic flux from lignin synthesis is redirected toward flavonoid accumulation, resulting in a dwarf phenotype without a floral stem ([199]Hoffmann et al., 2004; [200]Besseau et al., 2007). In the current work, loss of UGT71C4 resulted in MFR in the phenylpropane metabolic pathway shifting to the lignin synthesis pathway, with significant accumulation in ovules of the lignin monomers p-coumaryl alcohol and sinapyl alcohol and their precursors, ferulic acid and sinapic acid. In addition, several related synthases (HCT, COMT, and CCR) were transcribed at significantly higher levels in UGT71C4-KO ovules ([201]Figure 5E). Accumulation of such key metabolites and enzymes leads to deposition of lignin in the ovules, which affects cell surface tension and may be part of the reason why UGT71C4-KO ovule enlargement is constrained. Lignin biosynthesis is also regulated by MYB and NAC transcription factors. NST1 and NST2 are two typical factors that promote secondary wall thickening; overexpression of either leads to ectopic deposition of a secondary cell wall in normal parenchyma cells ([202]Mitsuda et al., 2005; [203]Zhao et al., 2010). 
Hyperactivation of MYB activators leads to substantial transcript accumulation of genes in the lignin synthesis pathway, ectopic lignin deposition in plant secondary walls, small curled leaves, and smaller floral organs ([204]Patzlaff et al., 2003; [205]Zhong et al., 2007). With further research on the MYB family, factors with inhibitory functions have also been discovered ([206]Bomal et al., 2008). Notably, some NAC and MYB genes that promote lignin synthesis were significantly upregulated in UGT71C4-KO and significantly repressed in UGT71C4-OE, whereas the opposite trends were observed for a lignin synthesis repressor ([207]Figure 5C). The spatial and temporal control of lignification is extremely important for plants, as lignification is a metabolically costly process that requires a large carbon skeleton, and plants do not have mechanisms for lignin degradation. Plants must therefore balance the need for lignification with the resources available for synthesis of lignin polymers. In addition, because lignin limits cell wall expansion, lignification must follow cell division and expansion growth; excessive deposition of lignin in tissues undergoing active division or elongation could lead to decreased cell-wall tension, which limits cell division and elongation and ultimately has adverse effects on plant growth and development ([208]Lucas et al., 2013; [209]Zhao, 2016). On the basis of existing evidence, we inferred that the smaller seeds of UGT71C4-KO plants may be attributed to altered transcription of lignin-synthesis-related genes, enhanced lignin deposition, erroneous ectopic expression, and more metabolic flux in phenylpropane metabolism flowing into the lignin metabolic pathway rather than flavonoid metabolism. Ectopic deposition of lignin in the ovule led to decreased ductility of the secondary wall, which limited the expansion of epidermal cells, ultimately decreasing individual cell volume and reducing seed size. 
It is well established that peroxidase family genes play vital roles throughout the plant life cycle ([210]Passardi et al., 2004b). Studies have shown that class III peroxidases (PRXs) are involved in various plant biological functions, including defense against abiotic stress, lignification, and auxin degradation ([211]Cosio et al., 2009). In addition, peroxidases regulate ROS activity, which influences cell growth and expansion; specifically, the presence of the hydroxyl (⋅OH) radical in the cell increases cell extensibility and therefore stimulates cell growth. Elongation of cotton fibers is dependent on ROS to regulate cell-wall relaxation and retard the increase in cell-wall rigidity ([212]Ruan et al., 2001; [213]Gapper and Dolan, 2006). Moreover, through this loosening mechanism, ROS affect cell proliferation and division to some extent ([214]Blee et al., 2003; [215]Li et al., 2003; [216]Passardi et al., 2004a, [217]2004b; [218]Shigeto et al., 2013; [219]Lu et al., 2014; [220]Wang et al., 2021b), and reducing ROS level too greatly might inhibit cell proliferation, affecting normal differentiation and defense ([221]Schieber and Chandel, 2014). The results of this work showed that peroxidase family genes had significantly altered expression in UGT71C4-OE and UGT71C4-KO ([222]Supplemental Figure S10A). On the basis of the characteristics of UDP-glucosyltransferase family enzymes, we speculate that UGT71C4 indirectly influences cell proliferation by regulating ROS levels mediated by peroxidase family proteins. Specifically, in UGT71C4-KO plants, ⋅OH radicals are less active in tissues with more lignin deposition, whereas in UGT71C4-OE plants, ⋅OH radicals accumulate significantly in tissues with less lignin deposition, where they increase cellular extensibility, promote cell division, and participate in coordinating cellular activities ([223]Supplemental Figure S10B and S10C). 
Consistent with previous studies, we found that UGT71C4 was involved in phenylpropane metabolism in addition to energy flow; furthermore, we observed that it indirectly influenced genes involved in grain-size regulation. We propose a working model ([224]Figure 7) in which metabolic changes in upland cotton seeds are influenced by UGT71C4 glycosyltransferase, thereby affecting seed size. When UGT71C4 is overexpressed, it exerts glucosyltransferase activity toward flavonoids to produce flavonoid glycosides; the phenylpropanoid metabolism pathway is induced in seeds; substances such as naringenin, dihydrokaempferol, and dihydroquercetin are appropriately accumulated in the ovule; and metabolic flux is redirected from lignin metabolism to flavonoid metabolism ([225]Figure 7A). Hence, flavonoid metabolism is active, and key catalytic enzymes in the flavonoid pathway such as CHS, CHI, and F3′H are significantly upregulated at the transcriptional level. Genes related to cell proliferation are also significantly upregulated, further promoting ovule cell expansion ([226]Figure 7C). By contrast, when UGT71C4 is knocked out, the loss of UGT71C4 glucosyltransferase activity means that metabolic flux fails to flow to the flavonoid metabolic pathway and is redirected to lignin metabolism ([227]Figure 7B); the lignin synthesis pathway is instead induced; lignin monomers and intermediate products are overaccumulated in the ovule; and lignin polymerization ability is significantly enhanced. Enzymes related to the flavonoid pathway and its products are significantly inhibited; ectopic deposition of lignin in the ovule reduces the ductility of the cell wall and limits seed growth and development; and genes relating to secondary wall synthesis are significantly induced, resulting in the formation of smaller grains ([228]Figure 7C). Figure 7. Working model of seed-size regulation by UGT71C4.
(A) When UGT71C4 is overexpressed, it exhibits glucosyltransferase activity that transforms flavonoids into flavonoid glycosides, and metabolic flux is directed toward the flavonoid metabolic pathway. (B) When UGT71C4 function is lost, metabolic flux is directed away from flavonoid metabolism and toward lignin metabolism. (C) When UGT71C4-OE increases the metabolic flux from basic phenylpropane metabolism to flavonoid metabolism, the expression of core flavonoid synthesis enzymes increases significantly; appropriate flavonoids accumulate in ovules to promote cell growth and development; and the expression of genes related to cell differentiation and proliferation is advantageously regulated. These changes ultimately increase seed length and width and hence the yield. In UGT71C4-KO ovules, greater metabolic flux is directed from basal phenylpropane metabolism to lignin metabolism, which results in greater synthesis of lignin monomers and intermediates; significantly increased expression of related core enzymes; induction of genes related to lignin synthesis; and significant accumulation of lignin in the ovules. This in turn results in lignification of the cell wall, reduced cell ductility, and hence constrained seed development. Materials and methods Vector construction, plant genetic transformation, and plant culture The full-length coding sequence of UGT71C4 was cloned into a vector containing the CaMV 35S promoter. The CRISPR-Cas9 system was used to edit genomic material, with single-guide RNAs (sgRNAs) targeting UGT71C4 designed using the CRISPR-P 2.0 website ([231]http://crispr.hzau.edu.cn/CRISPR2/). The sgRNAs consisted of 20-bp sequences, and mutations were induced in an ∼20-bp region upstream of a protospacer adjacent motif (PAM) in the upland cotton receptor line (Gossypium hirsutum acc. W0). 
Vectors containing the cloned inserts were transferred into the receptor by Agrobacterium tumefaciens–mediated transformation ([232]Liu et al., 2002) using the strain LBA4404 (CAT#: AC1030). Genomic DNA inserts from overexpression lines were amplified by PCR to confirm transformation, and knockout lines were confirmed by Hi-TOM ([233]Supplementary Table S7). Confirmed overexpression, knockout, and receptor plants were cultivated in fields in Hainan and Yongan, China. Seeds from homozygous plants were collected after maturity for subsequent experiments. The seed index (weight of 100 seeds) was measured for more than 300 seeds from 25 bolls, and LP was measured from 25 bolls (n = 3). Phylogenetic analysis and sequence alignment UGT amino acid sequences were obtained from previous reports on Arabidopsis thaliana ([234]Brazier-Hicks et al., 2018) ([235]Supplementary Table S1). Multiple sequence alignment of UGTs was performed using Clustal W. MEGA X was used to construct the neighbor-joining tree with 1000 bootstrap replicates. Multiple sequence alignment between UGT71C4 and other known UGT proteins was performed using BioXM ([236]http://202.195.246.60/BioXM/). RNA extraction and quantitative reverse transcription polymerase chain reaction Total RNA was extracted from different cotton tissues with the RNAprep Pure Plant Kit (DP432), then converted into cDNA using HiScript II Q RT SuperMix for quantitative PCR (+gDNA wiper) (Vazyme: R223-01). A LightCycler 96 instrument (Roche) and ChamQ Universal SYBR qPCR Master Mix (Vazyme #Q711) were used to quantify gene expression by the 2^−ΔΔCt method. The reaction solution was as follows: 10 μL 2× ChamQ Universal SYBR qPCR Master Mix, 0.2 μM primer 1, 0.2 μM primer 2, 1 μL of 100–150 ng/μL template cDNA, and ddH[2]O to 20 μL. Sequences of the qRT–PCR primers are given in [237]Supplementary Table S7. Transcriptome sequencing Transcriptome sequencing was performed by Novogene (Beijing, China). 
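The 2^−ΔΔCt calculation used for the qRT–PCR quantification above can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function name and all Ct values are hypothetical, and the reference gene is left unnamed.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Each dCt normalizes the target gene to a reference gene within one
    sample; ddCt then compares the test sample with the control sample.
    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: a transgenic line versus the receptor line.
fold_change = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,    # dCt = 4.0
    ct_target_control=24.0, ct_ref_control=18.0,  # dCt = 6.0
)
print(fold_change)  # ddCt = -2.0, so the fold change is 4.0
```

A value above 1 indicates higher expression in the test sample than in the control, and a value below 1 indicates lower expression.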
First, total RNA sequences were assessed using the Fragment Analyzer 5400 (Agilent Technologies, CA, USA). Next, the transcriptome sequencing library was prepared and sequencing libraries were generated using the NEBNext UltraTM RNA Library Prep Kit for Illumina (NEB, USA). All samples were clustered using the TruSeq PE Cluster Kit v3 cBot HS (Illumina) on the cBot Cluster Generation System. The library formulation was sequenced using the Illumina NovaSeq 6000 platform to generate paired-end reads of 150 bp. Raw data converted from the Illumina platform were subjected to quality control before subsequent analysis. The transcriptome analysis method was as described previously ([238]He et al., 2022). Widely targeted metabolomics methods The widely targeted metabolomics assay was performed by Metware (Wuhan, China). Flowers were labeled on the day of flowering, and ovules from the various lines were harvested at 15 DPA. The ovule samples were freeze-dried using a vacuum freeze-drying machine (Scientz-100F), then crushed using a mixer mill (MM 400, Retsch) with zirconia beads at 30 Hz for 1.5 min. The resulting freeze-dried powder (100 mg) was dissolved in 1.2 ml of 70% methanol solution, incubated at room temperature for 3 h with vortexing for 30 s every 30 min (a total of six times), and placed overnight in a refrigerator at 4°C. Prior to UPLC–MS/MS analysis, the extract was centrifuged at 12 000 rpm for 10 min and filtered (SCAA-104, pore size 0.22 μm; ANPEL, Shanghai, China, [239]http://www.anpel.com.cn/). A UPLC–ESI–MS/MS system was used to analyze the sample extracts. Analytical conditions for UPLC were as follows. The column was an Agilent SB-C18 (1.8 μm, 2.1 mm × 100 mm). The mobile phase consisted of solvent A (pure water with 0.1% formic acid) and solvent B (acetonitrile with 0.1% formic acid). 
The gradient program started at 95% A and 5% B, changed linearly to 5% A and 95% B within 9 min, was held at 5% A and 95% B for 1 min, returned to 95% A and 5% B within 1.1 min, and was then held at that composition for 2.9 min. The flow rate was set to 0.35 mL/min, the column oven to 40°C, and the injection volume to 4 μL. The column effluent was alternately connected to an ESI triple quadrupole linear ion trap (QTRAP) mass spectrometer. Linear ion trap (LIT) and triple quadrupole (QQQ) scans were acquired on a triple quadrupole linear ion trap mass spectrometer (Q TRAP), the AB4500 Q TRAP UPLC/MS/MS System. This system was equipped with an ESI Turbo ion spray interface, which operates in both positive and negative ion modes and is controlled by Analyst 1.6.3 software (AB Sciex). The operation parameters of the ESI source were: ion source, turbo spray; source temperature, 550°C; ion spray voltage (IS), 5500 V (positive ion mode)/−4500 V (negative ion mode); ion source gas I (GSI), gas II (GSII), and curtain gas (CUR) pressures of 50, 60, and 25.0 psi, respectively; and collision-activated dissociation (CAD), high. The instrument was tuned and calibrated with 10 and 100 μM polypropylene glycol solutions in the QQQ and LIT modes, respectively. QQQ scans were acquired as MRM experiments with the collision gas (nitrogen) set to medium. The declustering potential (DP) and collision energy (CE) were further optimized for individual MRM transitions. A specific set of MRM transitions was monitored for each period on the basis of the metabolites eluted during that period. Histochemical analysis and SEM Ovules (15 DPA) were fixed overnight in 50% ethanol, 5% glacial acetic acid, and 5% formaldehyde at 4°C, then dehydrated in an ethanol series. For histochemical analysis, after fixation with xylene, the ovules were embedded in an extracellular matrix (Sigma Aldrich) and sliced into 8-μm sections using a rotary slicing machine (Leica).
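The UPLC gradient program described for the metabolomics assay can be written as a piecewise-linear function of time, which makes the segment boundaries and the 14-min total run time explicit. The sketch below assumes that the stated segments follow one another without gaps; the table and function names are hypothetical.

```python
# (time in min, % solvent B) breakpoints derived from the stated program:
# 5% B at start, linear to 95% B by 9 min, hold for 1 min,
# return to 5% B within 1.1 min, hold for 2.9 min.
GRADIENT = [(0.0, 5.0), (9.0, 95.0), (10.0, 95.0), (11.1, 5.0), (14.0, 5.0)]


def percent_b(t_min):
    """Percentage of solvent B at time t_min by linear interpolation."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("no matching gradient segment")  # defensive; not reached


def percent_a(t_min):
    """Percentage of solvent A; A and B always sum to 100%."""
    return 100.0 - percent_b(t_min)


for t in (0.0, 4.5, 9.5, 14.0):
    print(f"t = {t:4.1f} min: {percent_a(t):5.1f}% A, {percent_b(t):5.1f}% B")
```

At the 0.35 mL/min flow rate stated in the text, one full program of this length would deliver roughly 4.9 mL of mobile phase per injection.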
The prepared tissue sections were stained with toluidine blue and observed under a Carl Zeiss light microscope. Subsequently, critical point drying (Leica) of the spikelet shell, gold sputter coating, and observation under SEM (Hitachi) were performed. Cell size and number were measured using ImageJ, and images were collected using the Leica application suite (V 4.2). Lignin content determination Samples (0.05 g) were ground into powder, combined with 5 ml of 1% acetic acid, and mixed well. The mixture was centrifuged and the sediment retained. An ethanol–ether mixture (3 ml) was added to the precipitate, mixed thoroughly, incubated for 3 min, and centrifuged again, retaining the precipitate. After drying, 1.2 ml of 72% sulfuric acid was added to the precipitate, mixed well, and left to stand at room temperature for 16 h (overnight) to dissolve all the cellulose. The next day, 4 ml of distilled water was added to the centrifuge tube, mixed well, and incubated in a boiling water bath for 5 min. Afterward, 2 ml of distilled water and 0.2 ml of 10% barium chloride solution were added; the solution was shaken well, then centrifuged, retaining the precipitate. The precipitate was washed with 5 ml distilled water, then resuspended in 4 ml of 10% sulfuric acid and 4 ml of 0.025 M potassium dichromate solution. After mixing, the suspension was heated in a boiling water bath for 15 min while shaking. After cooling, the entire contents of the centrifuge tube were transferred to a conical flask. Residual material in the centrifuge tube was washed out with 15–20 ml water, and this wash was also added to the flask. For titration, 2 ml of 20% KI solution was added to the conical flask, and the solution was titrated with sodium thiosulfate. Next, 0.4 ml of 1% starch solution was added, and titration with sodium thiosulfate was continued until the solution turned from blue to bright green.
At that end point, the titration volume was recorded; the blank volume V0 was calculated using the same method, and the lignin content was determined. Protein expression and purification in E. coli and UGT71C4 enzyme assays The full-length coding sequence of UGT71C4 was cloned into the pET30a vector. The recombinant plasmid was transferred into the E. coli Rosetta strain (DE3), and positive cells were cultured in LB medium at 37°C and 220 rpm until the OD[600] reached 0.6–0.8. Expression of the UGT71C4 protein was induced by addition of 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and incubation at 37°C for 4 h, after which the bacteria were harvested by centrifugation at 8000 rpm for 15 min at 4°C. The cells were resuspended in 1× PBS, broken by sonication in a homogenizer at 30% power for 20–30 min, and centrifuged at 8000 rpm for 20 min at 4°C. The cell pellets were washed three to five times with protein wash buffer (20 mM Na[3]PO[4], 0.5 M NaCl, and 10 mM imidazole). Finally, the recombinant protein was collected by adding protein elution buffer (20 mM Na[3]PO[4], 0.5 M NaCl, and 10 mM imidazole) to an affinity chromatography column. The UGT71C4 protein was detected using anti-His-tag pAb (MBL) and subsequently kept at −80°C. For UGT71C4 protein enzyme activity assays, each 500-μl reaction mixture contained 1 M Tris–HCl (pH 8.0), 0.1 M UDP-glucose as the donor, 100 mM of prospective acceptor, 2 μg of UGT71C4 protein, and 10% β-mercaptoethanol. Reactions were carried out at 30°C for 2 h, after which HPLC/UPLC–ESI–MS analysis was performed. Funding This study was financially supported by grants from the Fundamental Research Funds for the Central Universities (226-2022-00100), the NSFC (32130075), Xinjiang Production and Construction Corps (2023AA008), and Research Startup Funding from Hainan Institute of Zhejiang University (0202-6602-A12201). Acknowledgments