Abstract Improving the baking quality is a primary challenge in the wheat flour production value chain, as baking quality represents a crucial factor in determining its overall value. In the present study, we conducted a comparative RNA-Seq analysis on the high baking quality mutant “O-64.1.10” genotype and its low baking quality wild type "Omid" cultivar to recognize potential genes associated with bread quality. The cDNA libraries were constructed from immature grains that were 15 days post-anthesis, with an average of 16.24 and 18.97 million paired-end short-read sequences in the mutant and wild-type, respectively. A total number of 733 transcripts with differential expression were identified, 585 genes up-regulated and 188 genes down-regulated in the “O-64.1.10” genotype compared to the “Omid”. In addition, the families of HSF, bZIP, C2C2-Dof, B3-ARF, BES1, C3H, GRF, HB-HD-ZIP, PLATZ, MADS-MIKC, GARP-G2-like, NAC, OFP and TUB were appeared as the key transcription factors with specific expression in the “O-64.1.10” genotype. At the same time, pathways related to baking quality were identified through Kyoto Encyclopedia of Genes and Genomes. Collectively, we found that the endoplasmic network, metabolic pathways, secondary metabolite biosynthesis, hormone signaling pathway, B group vitamins, protein pathways, pathways associated with carbohydrate and fat metabolism, as well as the biosynthesis and metabolism of various amino acids, have a great deal of potential to play a significant role in the baking quality. Ultimately, the RNA-seq results were confirmed using quantitative Reverse Transcription PCR for some hub genes such as alpha-gliadin, low molecular weight glutenin subunit and terpene synthase (gibberellin) and as a resource for future study, 127 EST-SSR primers were generated using RNA-seq data. Keywords: RNA-Seq, qRT-PCR, Transcriptome, Gene ontology, Glutenin subunit Subject terms: Biotechnology, Plant biotechnology Introduction Wheat (Triticum aestivum L., 2n = 6x = 42) is an important food crop in the world and possesses unique flour quality that can be used to make various food products^[32]1. However, current wheat varieties need improvement in processing quality to meet the increasing demand for better quality food products^[33]2. Processing quality of wheat flour is determined by grain protein concentration and its composition which confer wheat dough with unique rheological properties, making it possible to produce a series of quality foods for human consumption^[34]3–[35]5. Baking quality is one of the most important parameters throughout the value chain of wheat flour^[36]6. Seed is the primary storage organ in plants for storing nutrients such as starch, lipids, and proteins^[37]7 and the baking quality is largely influenced by these nutrients. The content of gliadins and glutenins^[38]6, optimal water absorption, rheological parameters^[39]8 and grain hardness^[40]9 are correlated to high baking quality. Determination of flour water absorption, as well as the results of the farinographic analysis of the flour play a key role in the assessment of wheat flour baking quality ^[41]10. However, improving baking quality is a challenge for wheat breeders due to the time-consuming and costly testing forcing breeders to postpone sophisticated quality tests to the very last phases of variety development^[42]11. The genome of a plant is the most critical factor to control baking quality trait in wheat^[43]12. For example,^[44]2 processing quality related key genes such as glutenin and gliadins, puroindolines, grain softness protein, alpha and beta amylases, proteases, were identified, and many other candidate genes related to cellular and molecular functions. Therefore, an understanding of the molecular basis of this trait would be a major advantage in wheat breeding^[45]13. Advances in molecular techniques offer new opportunities to identify the genetic and molecular basis of bread quality in wheat^[46]1,[47]2,[48]14. RNA-Seq analysis of various wheat genotypes has opened up new avenues for research on grain development and its composition, which ultimately determines the nutritional and functional quality of wheat^[49]15,[50]16. The gene expression during wheat grain development plays a crucial role in determining the yield and nutritional properties of the crop. Identifying genes expressed during the grain filling stage can be useful for improving yield and grain quality^[51]17,[52]18. Several studies have applied transcriptomics approaches to investigate the gene expression during grain development stages in wheat^[53]2,[54]15–[55]17,[56]19–[57]21. The grain development process of wheat can be categorized into three stages, including cell division and expansion (0 ~ 14 DPA (days post-anthesis)), effective grain filling (14 ~ 28 DPA), and maturation and desiccation (28 DPA to maturity)^[58]22. Understanding these stages and the associated gene expression can lead to the development of new and improved wheat varieties with superior nutritional and functional properties^[59]17,[60]18. Early grain development, specifically the first 14 days after pollination, is critical for the final yield and quality of wheat. During this time, rapid grain expansion occurs through continuous cell division and endosperm filling with starch and gluten^[61]16,[62]23. At 30 DPA, a diverse range of genes are expressed at low levels with a predominance of genes associated with seed defense and stress tolerance^[63]13. Wheat flour quality is highly influenced by protein content and composition, with glutens being the most abundant storage proteins, comprising about 80% of total grain proteins. Glutenins consist of high and low-molecular-weight glutenin subunits (HMW-GS and LMW-GS). Gene expression from these protein families occurs early in grain development^[64]16,[65]23. Zamani et al.^[66]24 and Izadi-Darbandi et al.^[67]25 reported that mutant, “O-64.1.10” and its parent, “Omid” wheat cultivar have two HMW-GS genes, Dx2 + Dy12 and Bx7 + By8 but the improvement of baking quality in the mutant, “O-64.1.10” was unknown. A negative effect of Dx2 + Dy12 was published previously by several studies^[68]26–[69]31. The allelic pair Bx7 + By8* (Glu-B1a1) strongly associated with dough strength^[70]32. The complex processing quality of wheat is controlled by many genes, which have not been completely explored^[71]2. For example, phytohormones are the predominant biochemical basis for grain morphogenesis^[72]33 and reported to be important signals in controlling seed development, maturation, and nutrients accumulation^[73]34. Gene ontology (GO) enrichment and pathway enrichment analysis showed that many transcription products and transcription factors (TFs) associated with carbohydrate and protein metabolism were abundantly expressed in the grain^[74]35. TFs regulate target genes to ensure tightly regulated developmental process^[75]15. bZIP TFs are regulators of important plant processes such as seed storage protein gene regulation^[76]36, energy metabolism^[77]37, unfolded protein response^[78]36, and hormone and sugar signalling^[79]38. NAC TF play a significant role in regulating the accumulation of glutenin and starch in wheat endosperm, which are crucial for the grains quality^[80]39. Rahemi et al.^[81]40 reported that water absorption percentage, valorimeter value, farinograph quality number, zeleny number, hardness, wet gluten, protein content in high baking quality mutant “O-64.1.10” exhibited greater than “Omid”. We compared RNA-Seq data from immature grains of a “O-64.1.10” wheat genotype and its wild type cultivar, "Omid," which have significantly different baking qualities. The purpose of our study was to identify genes that are primarily expressed in wheat grain and are related to baking quality. Through this study, we gained new insights into the molecular mechanisms that underlie seed quality and identified potential candidate genes related to wheat grain quality. Our findings provide a solid foundation for future research aimed at improving bread-making quality. Material and methods At all stages, the research complied with relevant institutional, national, and international guidelines and legislation. Plant materials and sampling To investigate the transcriptome of immature seeds of Triticum aestivum L, mutant “O-64.1.10” genotype and its low baking quality wild type "Omid" cultivar were obtained from the GenBank of Nuclear Science and Technology Research Institute in Karaj, Iran. We used two genotypes of Iranian bread wheat, namely “O-64.1.10” (with high baking quality) and its parent cultivar “Omid” (with poor baking quality), to conduct a comparative analysis of their baking quality (Table [82]1). Mutant genotype “O-64.1.10” was produced by the gamma irradiation approach and evaluated for baking quality through rheological and proteomics trials^[83]25,[84]40. The “Omid” and its mutant “O-64–1-10”, were cultivated at the research field of Gorgan University of Agricultural and Natural Resources Gorgan, Iran. The main stem heads were marked with the anthesis date. The grains from the middle ear were collected at 5, 10, 15, 20, and 30 DPA between 9:00 and 10:00 am. For dynamic comparison of developing grains three biological replications of samples were used for the five stages. Also for RNA-seq samples, the seeds of “Omid” and “O-64.1.10” were used at 15 DPA. For any genotypes the seeds of 9 spikes (10 grains per spike) harvested from 9 different plants (three plants per three replications) and pooled. The samples were promptly frozen in liquid nitrogen and stored at − 80 °C for subsequent total RNA extraction. Table 1. Comparison the baking quality traits in mutant genotype “O-64.1.10” and its wild type “Omid” cultivar^[85]12. Genotype Protein content Wet gluten Hardness Zeleny number Valorimeter value Water absorption percentage Farinograph quality number Bread volume Omid 11.5^a 20.5^b 51^b 31^b 41^b 57.65^b 41.25^a 568. 5^a O-64.1.10 13^a 36^a 54.5^a 36.5^a 54^a 66.75^a 75^a 505. 5^b [86]Open in a new tab ^a, bMean values marked with different letters are significantly different (p ≤ 0.05) by LSD test. RNA isolation and cDNA library construction We utilized 100 mg seed samples at 15 DPA and p-Biozol Buffer (BioFlux, Japan) to extract total RNA, following the manufacturer's recommendations. Additionally, 1% (w/v) RNase-free agarose gel electrophoresis and a Nanophotometer (Implen, Germany) were used to evaluate the amount and quality of RNA samples. The RNA samples of 15 DPA, passing the quality and quantity control, were sent to Beijing Genomics Institute (BGI), Hong Kong, China for cDNA library construction and sequencing on the Illumina sequencing platform (Illumina HiSeq™ 2500). A comprehensive quantitative evaluation of each RNA samples was performed using a Nanodrop 8000 Spectrophotometer (ThermoScientific, USA) and an Agilent 2100 Bioanalyzer System (Agilent Technologies, USA) to produce information on RNA concentration. The high-quality RNA (OD 260/280 = 2.06–2.08; OD 260/230 = 1.9–2.09; RIN value 7.5) was further processed for cDNA library creation using the Illumina TrueSeq RNA Sample Prep kit. To produce 150 bp paired-end reads, NGS sequencing of cDNA from seed tissue was performed. Data processing and analysis The quality of paired-end reads from raw sequencing data was assessed both before and after trimming. The sequencing quality of the raw reads from each sequenced sample was assessed using FASTQC software ([87]https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) accessed on January 8th, 2019)) and CLC Genomic Workbench 7.5.1 (CLC Bio-Qiagen, Denmark). Trimmomatic software (version 0.36) was used to remove adapter sequences, low-quality nucleotides/sequences, and reads shorter than 150 bp. The remaining high-quality reads with a Phred score of ≥ 30 was used for downstream analyses. These reads were aligned against the wheat reference genome (version IWGSC RefSeq v1.0, [88]http://plants.ensembl.org/Triticum_aestivum/) using Hisat2 software (version 2.2.1.0)^[89]41. The mapped reads from each sample were assembled using Cufflinks v2.0.2 and htseq. These Cufflink assemblies were merged using Cuffmerge. The outputs were then used for differential expression analysis by Cuffdiff and EdgeR packages^[90]42. The normalization of gene expression values of two samples was estimated as fragments per kilobase of transcript per million fragments mapped (FKPM) and fold change (FC). Data analysis with an adjusted p-value threshold of 0.001 and Log2 Fold Change (log2FC) of ≥ 2 were assigned as differentially expressed genes (DEGs). GO enrichment analysis for DEGs was performed using goseq and AgriGO version 2.0. The Benjamini and Hochberg approach for controlling the false discovery rate (FDR) was used to adjust P-values. The threshold of 0.01 was set for the FDR value. The GO analysis was carried out to functionally categorize the DEGs from 15 DPA into three major aspects: "cellular component (CC)," "molecular function (MF)," and "biological process (BP)". Pathway enrichment analysis of DEGs was performed using the KEGG (Kyoto Encyclopedia of genes and genomes) database ([91]http://www.genome.ad.jp/kegg/). To identify TFs, transcriptional regulators (TRs), and protein kinases (PKs) encoding genes, The DEGs were screened using the Plant Transcription Factor & Protein Kinase Identifier and Classifier database (iTAK v1.6). Simple Sequence Repeats (SSRs) were identified from RNA-seq data using the Perl script of MISA (MIcroSAtellite identification tool)^[92]43 with default parameters^[93]44. Validation of RNA-seq data by real-time PCR The RNA-Seq results were validated through the use of quantitative Reverse Transcription PCR (qRT-PCR). The qRT-PCR was performed on three genes that were chosen from the DEG analysis. The primers were designed based on the 3́-UTR region of the sequence (Table [94]2) using the Primer3 online software ([95]http://www.embnet.sk/cgi-bin/primer3_www.cgi)^[96]45. Table 2. List of primers used in the experiment for quantitative Reverse Transcription PCR amplification. Primer name Gene ID Primer sequence Tm GC% Product size (bp) LMW-GS TraesCS1B01G011600.1 Forward 5ʹ- ACAACAGGTTCAGGGTTCCA-3ʹ 59.08 50 158 Reverse 5ʹ- CTATCTGGTGTGGCTGCAAA-3ʹ 58.17 50 Alpha-gliadin TraesCSU01G160200.1 Forward 5ʹ- ATGTTGTCAGCAGTTGTGGC-3ʹ 59.33 50 154 Reverse 5ʹ- TTACTGAGGCTGCTGGTAGG-3ʹ 58.80 55 Gibberellin TraesCSU01G099900.1 Forward 5ʹ- TGGACGAGAGAATTGAGGCA-3ʹ 58.73 50 152 Reverse 5ʹ- ATCGAACACATGGGGAGGG-3ʹ 59.08 57.9 GAPDH [97]AK359500.1 Forward 5'-GTTGGCAAGGTGCTCCCAGA-3' 62.70 60 121 Reverse 5'-GCTCATAGGTGGCTGGCTTG-3' 61.10 60 [98]Open in a new tab Before first-strand cDNA synthesis, possible genomic DNA contamination was removed by RNase-free DNaseI (Thermo Scientific, USA) at a ratio of 1 U DNaseI to 2 µg RNA, 1X DNaseI buffer, 10 U Ribolock RNase inhibitor (Thermo Scientific, USA), and DEPC water up to 9 µl, followed by incubation at 37 °C for 30 min. To finalize DNaseI activity, 25 mM EDTA buffer was added to the reaction and the RNA was then heat-denatured at 65 °C for 10 min. First-strand cDNA synthesis was carried out using 1 µg of total RNA in the reaction mixture containing 0.5 µg Oligo (dT) primer and DEPC water (nuclease-free) up to 11 µl. To eliminate secondary structures, the liquid was gently mixed, quickly centrifuged, then incubated at 70 °C for 5 min. Then, 4 µl 5X cDNA reaction buffer, 10 mM dNTP mix, and 20 U Ribolock RNase inhibitor were added to the reaction mixture. The final volume was adjusted to 19 µl with DEPC-treated water, and the mixture was incubated for 5 min at 37 °C. 200 U Revert Aid enzyme (Thermo Scientific, USA) was added to the reaction mixture and incubated at 42 °C for 1 h. A qRT-PCR was performed using an iCycler thermal cycler (Bio-Rad, iQ5, USA) with a reaction volume containing 3 µl of diluted cDNA, 10 µl of 2X SYBR Bio Pars PCR Master Mix (Gorgan University of Agricultural Sciences and Natural Resources, Iran), and 1 µl of each gene-specific primer (10 pmol) in a final volume 20 µl with double distilled water. The time course of qRT-PCR were done at the following conditions: 3 min at 95 °C for 1 cycle; 10 s at 95 °C, 10 s at 62 °C and 10 s at 72 °C for 35 cycles, and 2 min at 72 °C for 1 cycle. Upon that, PCR was done on each sample in three technical and biological replications. The fluorescence signal was detected at 72 °C. The REST software was used for the gene expression analysis^[99]46, and the relative expression was computed utilizing the comparative Ct (2-∆∆^Ct) method^[100]47 and compared with the expression levels of RNA-Seq. For qRT-PCR normalization, glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was utilized as the reference gene. Gene ontology analysis The major biological activities of DEGs were determined using a GO enrichment analysis. DEGs from the 15 DPA stage were annotated and grouped into three major groups based on their functional characteristics: CC, MF, and BP categories. Result Grain morphology changes during seed development The dynamic comparison of developing grains was examined at 5, 10, 15, 20, and 30 DPA stages between the “O-64.1.10” genotype and its “Omid” counterpart (Fig. [101]1a). The dry weight of the seeds showed a continuous increase from 5 to 20 DPA and reached its highest point at 20 DPA in both the mutant genotype “O-64.1.10” and its parent genotype “Omid” (Fig. [102]1b). Our results showed that mutant “O-64.1.10” exhibited greater grain weight growth than “Omid” at15, 20 and 30 DPA. Also highest rate of dry weight accumulation per day occurred between 10 and 15 DPA. The 15 DPA stage are beginning of effective grain filling stage in this stage the grain dry weight increases by about two-fold. The 15 DPA stage for the RNA-seq analysis. This finding is in agreement with earlier study conducted by Shewry et al.^[103]12. Figure 1. [104]Figure 1 [105]Open in a new tab Dynamic comparison of grains at different developmental stages between in high baking quality mutant “O-64.1.10” and its wild type genotypes “Omid”. (a) Changes in grain morphology at 5, 10, 15, 20 and 30 DPA of “O-64.1.10” and “Omid”. (b) Changes in dry grain weight during the development process. Three biological replications and t-test were done to obtain the curve. *—significant at P < 0.05 and **—significant P < 0.01. I: standard error. Identification of differentially expressed genes The quality of RNA and quality of high-throughput sequencing data showed in supplementary data ([106]S1 Figure). A total of 18.97 and 16.24 million paired-end short-read sequences with high-quality (Q30 > 99%) obtained by Illumina sequencing of the “Omid” and “O-64.1.10” cDNA libraries, respectively. The GC (guanine-cytosine) content for the clean data was 51% and 52% for the “Omid” and “O-64.1.10”, respectively (Table [107]3). DEGs were found in 773 genes between “O-64.1.10” and “Omid” (supplemental data [108]1). 585 genes up-regulated and 188 genes down-regulated in the “O-64.1.10” genotype compared to the “Omid”. Our study showed that most of the DEGs (339 genes, 43.85%) were anchored on B-genome (Fig. [109]2d, supplemental data [110]2). In addition, we found that up-regulated genes in “O-64.1.10” were enriched on chromosomes 2A (5.95%) and 3A (6.33%) and chromosomes 2B (6.59%) and 3B (6.2%), whereas down-regulated genes in “O-64.1.10” were overrepresented on chromosome 1B (14.5%) (Fig. [111]2). Table 3. Summary of high-throughput sequencing data in “Omid” and mutant at 15 DPA. Sample Sequence length (base pairs) Number of sequences (total reads) Total nucleotides in data set (base) % sequence duplication levels % GC-content ^aQ30 (%) Omid 150 18,976,039 2,846,405,850 0.5335 51 99.78 O-64-1-10 150 16,246,625 2,436,993,750 0.672 52 99.73 [112]Open in a new tab ^apaired-end short-read sequences with high-quality (Q30≻ 99%). Figure 2. [113]Figure 2 [114]Open in a new tab The numbers of DEGs on the A, B and D genomes in the “O-64.1.10” genotypes relative to the “Omid” cultivar. UN: unassigned chromosome. Functional classification and gene ontology analysis GO enrichment analysis was carried out to characterize the main biological functions of DEGs in seeds 15 DPA. The DEGs from 15 DPA stage were functionally annotated and classified ontologically into three principal categories: CC, MF, and BP. Furthermore, 62 GO terms were enriched in the DEG from O-64.1.10 and Omid, including 19 CC, 15 MF, and 28 BP. A total of 733 DEGs were assigned to three main GO functional categories and then divided into 21 sub-categories. Metabolic process, cellular process, developmental process, and multicellular organismal process were found to be dominant in the BP category. The MF category was primarily dominated by catalytic activity and binding. Cell, cell part, and organelle were foremost in the cellular CC category (Fig. [115]3). The “O-64.1.10” genotype had the highest number of genes involved in specific BP, such as cellular protein catabolic process, diterpenoid biosynthetic process, gibberellin metabolic process, cellular nitrogen compound biosynthetic process, cellular amide metabolic process, lipid biosynthetic process and response to abiotic stimulus. Overall, these results suggest that the genes associated with wheat early grain development play crucial roles in encoding diverse regulators and proteins. Figure 3. [116]Figure 3 [117]Open in a new tab Histogram of Gene Ontology (GO) classifications of DEGs between “Omid” and “O-64.1.10” genotypes in 15 DAP. The results are summarized in three principal GO functional categories: biological processes, cellular component, and molecular function. At the same time pathway enrichment analysis using KEGG database was performed to reveal the active biological pathways involved in baking quality of immature wheat grains. In this study, 68 and 34 pathways were identified in the “O-64.1.10” and the wild-type, respectively. Among all the pathways, the largest pathways were metabolic pathway, protein processing in endoplasmic reticulum and biosynthesis of secondary. An analysis using the KEGG database on biological pathways in the “O-64.1.10” genotype showed that a total of 187 annotated DEGs were assigned to 68 pathways. Among the 68 pathways, the largest pathway was metabolic pathways, which contained 32 genes, followed by protein processing in endoplasmic reticulum (19 genes), biosynthesis of secondary metabolites (16 genes), spliceosome (9 genes), ribosome (8 genes), plant hormone signal transduction (4 genes), and biosynthesis of amino acids (4 genes). The KEGG analysis in the “O-64.1.10” genotype indicated that 39 DEGs were involved in protein processing in endoplasmic reticulum (19 genes), biosynthesis of different amino acids pathways (14 genes), and different vitamins metabolism pathways (6 genes), which might participate in regulating baking quality (Fig. [118]6). Furthermore, the carbohydrate metabolism pathways (5 genes) in the wild-type genotype might participate in regulating baking quality. Pathway enrichment analysis of “Omid” genotype displayed the total of 68 annotated DEGs were assigned to 34 pathways. The top three enriched pathways include metabolic pathways (12 genes), biosynthesis of secondary metabolites (9 genes), and ribosome (4 genes) were largest pathways (Fig. [119]4, Supplemental data [120]4). Figure 6. [121]Figure 6 [122]Open in a new tab Distribution of transcription factor (TF) genes of 22 TF families (a), protein kinases (PK) genes of seven PK families (b), transcriptional regulators (TR) genes of eight families (c), and plant hormones (d) in comparison between Omid and O-64-1-10 genotypes with different baking quality at 15 DPA. Figure 4. [123]Figure 4 [124]Open in a new tab Bar diagrams displayed the KEGG significantly metabolic pathways in the “O-64.1.10” and “Omid” at 15 DPA. Functional annotations of these DEGs were achieved by using Mercator tool (Mercator V3.6). Results of differential expression showed that out of 773 differentially expressed genes, 109 (13.05%) BINs assigned to protein process. The number of genes were assigned to the functional category “protein” was highest (103), followed by “RNA” (74), “stress” (52) and “misc” (42) (Fig. [125]5). Lastly, In the “O-64.1.10”, 82 out of 103 DEGs involved in protein processing (Fig. [126]5), were up-regulated compared to the wild-type (Table [127]4). Elongation factor 1-alpha (EF1α) play a crucial role in the protein synthesis machinery of cells, and its function extends to various physiological processes in plants, including wheat^[128]48. Our results indicated that this gene were highly expressed in “O-64.1.10” genotype. Figure 5. [129]Figure 5 [130]Open in a new tab The amount of hub genes expression in 35 segregate biological groups (35 main BINs) using the Mercator web server. Table 4. Categorization of differentially expressed hub genes (up/down-regulated) in “O-64.1.10” at 15 DPA into protein process; BIN code 29. Code BIN Class No. BIN No. Unigene Up-regulated in O-64-1-10 down-regulated in O-64-1-10 29.1 Protein.aa activation 3 3 3 0 29.2 Protein.synthesis 26 23 18 5 29.3 Protein.targeting 9 9 7 2 29.4 Protein.postranslational modification 25 23 17 6 29.5 Protein.degradation 41 40 33 7 29.6 Protein.folding 4 4 3 1 29.7 Protein.glycosylation 1 1 1 0 Total – 109 103 82 21 [131]Open in a new tab Transcription factor, protein kinases and transcriptional regulators analysis From DEGs, it was found that out of 22 TF families that the highest expression occurred for heat shock factor (HSF), AP2/ERF-ERF and basic leucine zipper (bZIP). Additionally, the HSF, bZIP, C2C2-Dof, B3-ARF, BES1, C3H, GRF, HB-HD-ZIP, PLATZ, MADS-MIKC, GARP-G2-like, NAC, OFP, and TUB TF families were exclusively expressed in the "O-64.1.10" genotype at the 15 DPA stage (Fig. [132]6a, supplemental data [133]3). Our findings revealed that, the bZIP TFs was related to storage protein gene regulation. TFs of the bZIP family are regulators of many central developmental and physiological processes including storage protein gene regulation^[134]49, energy metabolism^[135]37, light responses and oxidative stress signaling^[136]50. The regulations of the grain-filling process may be regulated by several TF families, such as bZIP and HSF TFs. The HSF TFs are regulators in response to stress^[137]51, early grain filling^[138]16, seed maturation^[139]52 and pollen development^[140]53. The HSF family is a transcriptional activator of heat shock protein (HSP) genes^[141]54. The HSP family is an important factor ensuring correct protein folding that plays a significant role in degradation pathways, such as endoplasmic reticulum-associated degradation^[142]55. Our results showed that LMW-GS genes and ERF TF had a higher expression in “Omid” and “O-64.1.10”, respectively. Hasrak et al.^[143]56 reported there is a negative correlation between ERF TFs and lower expression of LMW-GS. Our data indicated that the NAC TF (TraesCS2B01G359200.1) were found from highly expressed in “O-64.1.10” genotype. The NAC019 TF regulates glutenin and starch accumulation and its elite allele improves wheat grain quality^[144]39. The TKL, STE, CAMK, and Others of PKs were found from highly expressed in the “O-64.1.10” genotypes, whereas the AGC of PK families was expressed only in the “Omid” genotypes (Fig. [145]6b, supplemental data [146]3). The PKs may cooperate with the transcriptional networks to refine the regulation of genes during seed development^[147]34. In this study, among all genes with differential expression, seven important families of TRs such as HMG, MBF1, MED, mTERF, SET, TAZ, and others genes were found from highly expressed in the “O-64.1.10” genotype. In contrast, one family of TRs (GNAT) only was found from highly expressed in the wild-type genotype (Fig. [148]6c, supplemental data [149]3). Plant hormones and transposable elements (TEs) analysis Among of the five hormone-related genes that were up-regulated in each of genotypes, three genes associated with gibberellic acid, brassinosteroid, and auxin were expressed only in “O-64.1.10” genotypes (Fig. [150]6d, supplemental data [151]3). The plant hormones regulate multiple biological processes in early grain development of wheat and the comprehensive expression profiling provides useful information for hormone regulatory mechanisms^[152]16. The hormone response genes affect the wheat grain quality through indirect pathways^[153]57. Our research displayed eleven TEs that were expressed only in “O-64.1.10” genotypes. These genes related to transposon may be responsible for causing the high baking quality in the mutant genotype “O-64.1.10”. So, further research is needed to find mutations governing genes related baking quality. TEs are mobile genetic elements in the eukaryotic genome which alter the expression of neighboring genes via insertion into promoter regions, or disrupt the function of protein-coding genes when inserted into the genes, or even change gene structure by altering its splicing and polyadenylation patterns^[154]58–[155]60. Identification of SSRs In the context of differential gene expression between the “O-64.1.10” and wild-type genotypes, 127 SSRs were identified. The repeat pattern of the identified SSRs indicated a prevalence of trinucleotide repeats (122) were the most abundant, followed by di- (3) and hexa- (2) nucleotide repeats (Table [156]5, supplemental data [157]3). Within the trinucleotide repeats, CCG/CGG (54) was the most frequent motif, followed by AGG/CCT (26) and AGG/CCT (22). The majority of motifs (96.06%) consisted of 5–7 repeats, while motifs with 8–10 repeats were rare (3.94%). For instance, gamma-gliadin contained AAC motifs with 10 repeats. A total of 109 sequences containing SSRs, 17 sequences containing more than one SSR, and 8 sequences containing SSRs in compound formation were identified. For instance, alpha-gliadin contained (ACA)9 and (CAA)8 motifs. These genic SSRs/RNAseq-SSR markers were developed from transcriptome sequences and can be used for marker-assisted selection (MAS). Table 5. Details of SSR markers obtained from DEGs of “Omid” and “O-64.1.10”. Omid O-64.1.10 Total Total number of sequences examined 189 584 773 Total number of identified SSRs 27 100 127 Number of SSR containing sequences 27 82 109 Number of sequences containing more than 1 SSR 0 17 17 Number of SSRs present in compound formation 0 8 8 Mono 0 0 0 Di 0 3 3 Tri 26 96 122 Tetra 0 0 0 Penta 0 0 0 Hexa 1 1 2 [158]Open in a new tab Quantitative Reverse Transcription PCR Validation qRT-PCR was used to validate the RNA-seq results. Therefore, out of DEGs, three important genes involved in baking quality were selected from the DEGs, including LMW-GS, Alpha-gliadin and Gibberellin. The LMW-GS and gliadins (alfa/beta, omega, and gamma) are the main storage proteins and the major components of the gluten polymer. The accumulation of these storage proteins affects wheat quality formation^[159]17. The relative fold changes in gene expression measured by qRT-PCR were found to be consistent with their expression levels determined by RNA-Seq. Both techniques yielded similar patterns of changes, validating the results obtained from RNA-Seq (Fig. [160]7). Figure 7. [161]Figure 7 [162]Open in a new tab Quantitative Reverse Transcription PCR validation of RNA-seq results for selected DEGs. Conclusions The transcriptomics analysis is a powerful tool to study molecular processes taking place during grain development. In this study, differential expressed genes of two breed wheat genotypes “O-64.1.10” and “Omid” were investigated at 15 DPA by transcriptomics approaches. a total number of 733 DEGs were identified between each genotype, as well as the DEGs analysis suggested that up-regulated genes in “O-64.1.10” were enriched on A-genome, whereas down-regulated genes in “O-64.1.10” were overexpression on B-genome. Our recent studies have identified several crucial factors that regulate the baking quality of wheat, including grain storage proteins (gluten), TFs, TRs, PKs, plant hormones, carbohydrates (starch and sucrose), lipids (unsaturated fatty acids), and vitamins. Using the KEGG pathway enrichment analysis of DEGs comparing “O-64.1.10” with “Omid” showed that various pathways of metabolic pathways, protein processing in endoplasmic reticulum, biosynthesis of secondary metabolites, spliceosome, ribosome, plant hormone signal transduction, and biosynthesis of amino acids were enriched in the “O-64.1.10” genotype with the highest number of the genes, whereas the pathways associated with metabolic pathways, biosynthesis of secondary metabolites, and ribosome were frequent in the “Omid” genotype. Yu et al.^[163]17 reported that metabolic pathway network analysis that major and minormetabolic pathways regulate one another to ensure regular seed development and nutritive reserve accumulation. Yan et al.^[164]61 showed that carbohydrate and hormone metabolism has play important roles in the grain size and weight in wheat. Our results indicated that TraesCS2A01G483600 (encodes Elongation factor 1-alpha) expressed only in “O-64.1.10” genotypes. Paul et al.^[165]48 suggested that Elongation factor 1-alpha gene play an important role in cell expansion in early developmental wheat grain. The validation of transcripts via qRT-PCR presented in this study holds significant academic value. These transcripts have been found to have a positive correlation with the enhancement of the baking quality of wheat grains. Lastly, in addition, we also identified 127 RNA-seq SSR markers among all genes with differential expression in “O-64.1.10” and its wild type genotype. The repeat pattern of identified SSRs indicated an abundance of dinucleotides followed by tri, and hexanucleotide repeats. In conclusion, results from our study provides valuable shed light into the genetic mechanisms underlying baking quality in wheat flour production, and could have significant implications for the baking industry. Supplementary Information [166]Supplementary Information.^ (312.7KB, xlsx) Acknowledgements