Abstract Gene expression and post-transcriptional regulation are key mechanisms affecting lactation performance in dairy animals. However, the difficulty of obtaining mammary gland tissue samples from lactating animals has significantly impeded lactation research. Milk fat globules may be a non-invasive way to obtain mammary transcripts. Here, we aimed to reveal the universal rule of the transcript profiles of the milk fat globules and mammary glands from buffaloes and goats by RNA-sequencing analysis. Results showed that, in buffalo, 97 % of mRNAs were expressed in both milk fat globules and mammary glands, with 45 % showing differential expression. Among 6086 lncRNAs and 7010 miRNAs identified, 35 % and 50 % were differentially expressed, respectively. Of 11,631 circRNAs, only 618 showed significant differences. In goat, more than 99 % of mRNAs and 87 % of ncRNAs (including lncRNAs, circRNAs, and miRNAs) were expressed in both milk fat globules and mammary glands, and over 91 % of mRNAs, 96 % of lncRNAs, 98 % of circRNAs, and 86 % of miRNAs showed no significant differences, suggests that the transcripts in milk fat globules exactly reflect that in the mammary gland. This study suggests that milk fat globules are an effective candidate for non-invasive acquisition of mammary gland transcripts, but their applicability needs further study. Keywords: Lactation, Transcript profiles, Non-invasive acquisition, ncRNAs Highlights * • Expression of transcripts in milk fat globule was similar to mammary gland. * • Milk fat globules provided non-invasive mammary transcripts for lactation studies. * • Feasibility of using milk fat globule transcripts was species specific. 1. Introduction This study aims to evaluate the feasibility of using milk fat globules as a non-invasive substitute for mammary gland tissues in transcriptional studies. Milk secretion encompasses the synthesis of fatty acids, protein, lactose, and other nutrients, and this process is predominantly regulated by gene expression and post-transcriptional modifications ([45]Du et al., 2022; [46]Osorio et al., 2016). Functional analysis of transcripts in the mammary gland has provided essential information on the regulatory mechanisms of lactation and promote milk production. The simplest and most direct way to obtain mammary gland tissue transcripts is to collect mammary gland tissue from lactating animals directly through surgery. However, this method is usually invasive and does not allow direct and repeated sampling without causing tissue damage. It may also pose a significant risk of mastitis and subsequent complications in dairy animals ([47]Bionaz & Loor, 2008). In addition, it may significantly reduce the production performance of the animals, including the milk yield and quality, and limit the economic efficiency of the farm. Another sampling method is to obtain mammary gland tissues from slaughterhouses ([48]Cui et al., 2014). Although researchers do not need to consider the impact of sample collection on the production performance of slaughtered animals, researchers also cannot access genetic information and historical production data of these particular animals. Additionally, dairy animals for slaughter are often obsoleted for ineffective milk production, and their mammary gland tissue cannot exhibit the normal lactation regulatory mechanisms of mammary gland tissues from superior dairy animals. In addition, it is worth noting that a region of the tissue cannot reflect the molecular expression patterns of the entire mammary gland ([49]Brenaut et al., 2012). Therefore, finding a representative experimental material that can be noninvasively collected to substitute for the use of mammary gland tissue is of great significance to the study of lactation physiology in dairy animals. Mammary epithelial cells (MEC) in milk, milk-derived somatic cells (MSC), and milk fat globules are primary candidate materials to obtain transcript profiles of lactation traits. MEC in milk generally refers to the cells gradually shed from the mammary gland epithelium during lactation. It has been reported that even when MEC in dairy animals are separated from the mammary gland epithelial environment, they are not entirely dead and can still be cultivated ([50]Ben Chedly et al., 2010; [51]Sorg et al., 2012). At the same time, the level of MEC shedding in the mammary gland can also indicate the regulation of the number of mammary gland cells and the influence on milk production ([52]Herve et al., 2016). Milk-derived MEC seems an effective non-invasive method to study transcriptional regulation in the mammary glands of dairy animals ([53]Boutinaud et al., 2015). However, the precise separation of MEC from milk is not easy for the small proportion in the milk. In addition, it is also a challenge to ensure that the isolated MEC are effective living cells rather than apoptotic fragments ([54]Krappmann et al., 2012). Using MSC in milk seems to be a better material that can reduce the experimental steps and save the experimental time. Studies have shown that MSC can be used to study the gene expression of casein and alpha-lactalbumin in the mammary glands of lactating goats ([55]Boutinaud et al., 2002). The problem is that MSC contains various kinds of cells, most of which are inflammatory cells, such as lymphocytes, macrophages, and neutrophils, especially when the animal suffers from mastitis or stress. At this time, the gene expression and molecular regulation of the MSC cells had little relationship with the lactation traits. Milk fat globules are droplets of different sizes secreted by the MEC, which contain over 99 % of the total lipids within the milk. RNA obtained from milk fat globules has been a valuable and non-invasive source of mammary gland messenger RNA (mRNA) to study diverse factors that impact milk yield and composition ([56]Chen et al., 2016; [57]Lemay, Hovey, et al., 2013; [58]Maningat et al., 2007). However, the publications have only focused on the expression of highly expressed genes or specific functional genes in milk fat globules, rather than comparing all transcripts of milk fat globules and mammary gland, which do not fully reflect the similarities and differences across transcripts between milk fat globules and mammary gland. In addition, previous studies are mainly focused on materials from humans and mice, little is performed on domestic animals. It is unknown if the expression pattern of milk fat globules and mammary glands between different species has the same rule. Moreover, non-coding RNAs (ncRNAs) such as long non-coding RNAs (lncRNAs,) circular RNAs (circRNAs), and microRNAs (miRNAs) have also been found to play critical roles in transcriptional regulation during lactation. However, few studies have characterized and compared the ncRNAs between milk fat globules and mammary glands, especially in dairy animals. Therefore, systematically comparing the transcript profiles, especially the ncRNAs, in the milk fat globules and mammary glands is essential to reveal the universal rule of transcription patterns between milk fat globules and mammary glands. In this study, RNA sequencing was performed to characterize and compare the transcription profiles between milk fat globules and mammary glands in buffalo and goat. The transcripts, including mRNA, lncRNA, circRNA, and microRNA, were systematically characterized and analyzed to evaluate the reliability of using milk fat globules as a non-invasive alternative to mammary gland tissue for studying lactation traits in lactating buffalo and goats. 2. Materials and methods 2.1. Sample collection All animals were treated humanely as outlined in the Guide for the Care and Use of Experimental Animals of the National Institutes of Health. The gathering of animal samples was approved by the Animal Experiment Ethics Review Committee of Guangxi University, Nanning, China (Approval number: GXU-2022-192). Our experiment included 6 independent biological animals (3 buffaloes, and 3 goats). For each animal, we synchronously (one-on-one correspondence) collected the mammary gland sample (serve as invasive samples) and milk fat globule sample (serve as non-invasive samples) to obtain the transcripts. The buffaloes were kept in the Buffalo Breeding Farm of the Buffalo Research Institute, Chinese Academy of Agricultural Sciences, Nanning, Guangxi, China. Three healthy Murrah buffaloes in the second or third parity with ages between 5 and 8 years were selected. Nubian goats were kept in a local goat breeding farm in Mashan, Guangxi Province. Three healthy Nubian goats that were in the second or third parity with ages about 3 years were selected. Milk samples (50 mL) were collected from buffaloes or goats at the mid-lactation stage using centrifuge tubes (Vazyme, Shanghai, China), and were immediately frozen in liquid nitrogen for further use. To collect the mammary gland tissues, the animals were anesthetized with xylazine hydrochloride (Huamu Animal Health Products Co., Ltd., Jilin, China) by subcutaneous injection (0.02 mL/10 kg for buffalo and 0.002 mL/10 kg for goat). After the animals were completely anesthetized, the mammary gland of buffaloes and goats was collected aseptically and stored in liquid nitrogen. The wounds were sutured with absorbable catgut aseptically (Shandong Haidike Medical Products Co., Ltd., Heze, China) and antibiotics (Qilu Animal Health Products Co., Ltd., Jinan, China) were used to prevent infection. All of the samples were transported to the laboratory within one hour. 2.2. RNA extraction and quality control The total RNA of milk fat globules and mammary glands was extracted by the TRIzol method. Specifically, milk samples were centrifuged at 7000 ×g for 10 min at 4 °C to separate the supernatant fat layer. About 500 μL fat was harvested and mixed thoroughly with 1 mL of TRIzol™ LS solution (Invitrogen Life Technologies Inc., Carlsbad, CA, USA) using a vortex (Thermo Fisher Scientific, Waltham, MA, USA). The lipids were removed from the top layer after the centrifugation. Chloroform (Sigma-Aldrich, St. Louis, MO, USA) was used to separate the aqueous phase, while isopropanol (Sigma-Aldrich, St. Louis, MO, USA) was employed to precipitate the RNA. The RNA was then washed twice with 75 % ethanol (Fuyu Fine Chemical Co., Ltd., Tianjin, China) and dissolved in DEPC-treated water (Beyotime, Shanghai, China). Approximately 0.1 g of mammary gland tissue was ground into powder under liquid nitrogen and mixed with 1 mL of TRIzol™ LS solution. The total RNA from the mammary gland was then extracted as described in the methods outlined above, except for the lipid removal step. RNA concentration and integrity were detected using a NanoDrop 2000 (Thermo Fisher Scientific, Waltham, MA, USA) and agarose gel electrophoresis. RNA integrity number (RIN) values of the total RNA were determined using an Agilent Bioanalyzer 2100 (Agilent Technologies, Inc., Santa Clara, CA, USA). RIN values greater than 7.0 were selected for further library construction and sequencing. 2.3. RNA library construction and sequencing Total RNA samples were submitted to BGI Genomics (Huada Gene, Shenzhen, China) for library construction and sequencing. Two separate RNA libraries were prepared to comprehensively characterize the variable transcripts in milk fat globules and mammary gland tissues. One was an rRNA-depleted strand-specific library designed for the analysis of lncRNA, circRNA, and mRNA, while the other was specifically for miRNA analysis. rRNAs were removed from the total RNA using the Ribo-Zero™ kit (Illumina, Madison, WI, USA). RNA samples were enriched using magnetic beads with Oligo dT, followed by cDNA synthesis using primers and buffers. The resulting double-stranded cDNA underwent end repair, PolyA tail addition, and sequencing adapter ligation. The cDNA library was then constructed via PCR amplification and purified using AMPure XP beads. The effective concentration of the constructed library was assessed to ensure it met the required standard before sequencing was performed. Sequencing was carried out on the DNBSEQ platform using the PE 100 paired-end sequencing protocol. For miRNA analysis, library preparation followed BGI's standard protocol. First, 18–30 nt small RNA fragments were recovered through polyacrylamide gel electrophoresis, after which 3′ and 5′ adapters were ligated. Subsequently, RT-PCR amplification was performed, and the PCR products were recovered via electrophoresis. Following library construction, the effective concentration was verified to ensure it met quality requirements. PCR products were then cyclized to create single-stranded circular molecules, with any remaining linear DNA digested prior to sequencing. Sequencing for the miRNA library was performed on the DNBSEQ platform using the SE 50 single-end sequencing protocol. The datasets generated in this study have been deposited in the NCBI Sequence Read Archive (SRA) repository under accession number PRJNA1000197 and aligned to the buffalo (Bubalus bubalis, GCA_003121395.1) and goat (Capra hircus, ARS1.Capra_hircus) genomes. 2.4. Analysis of mRNA biological information Using Fastp software (v0.23.4, [59]https://github.com/OpenGene/fastp, accessed on December 2022), the raw sequencing data were filtered to remove low-quality reads, and GC content was assessed, resulting in high-quality clean data. The clean reads were then mapped to the buffalo reference genome and the goat reference genome from the Ensembl database ([60]http://asia.ensembl.org/index.html, accessed on December 2022) using Hisat2 software ([61]http://daehwankimlab.github.io/hisat2/, accessed on December 2022). Based on the alignment results, transcripts were assembled and quantified using StringTie software, and transcript expression levels were calculated as FPKM (Fragments Per Kilobase of transcript sequence per million mapped reads). Differential expression analysis between mammary gland and milk fat globule samples was performed using DESeq2 (v1.6.3). The significance threshold was set to an adjusted P-value (False Discovery Rate, FDR) < 0.05 and |log2(fold change)| ≥ 1. Genes meeting these criteria were classified as significantly differentially expressed. To further investigate the biological functions and pathways associated with differentially expressed genes (DEGs) and non-differentially expressed genes (NDEGs) between groups, functional enrichment analyses were conducted using Gene Ontology (GO, [62]http://www.geneontology.org/, accessed on January 2023) and Kyoto Encyclopedia of Genes and Genomes (KEGG, [63]http://www.genome.jp/kegg/, accessed on January 2023). These analyses were implemented using the ClusterProfiler R package (v4.7.1). Protein-protein interaction (PPI) network analysis was performed using the STRING database (v12.0, [64]https://string-db.org/, accessed on January 2023) with a confidence score threshold of 0.7. Based on the PPI results, key protein subnetworks and interaction relationships were identified to gain insights into critical biological processes. 2.5. Identification of lncRNA and prediction of target genes The transcriptome was assembled using StringTie (v1.3.1) based on the reads aligned to the reference genome. The assembled transcripts were annotated using the gffcompare program (v0.12.6). Transcripts with class codes ‘i’, ‘x’, ‘u’, ‘o’, and ‘e’ and FPKM values ≥0.1 were retained, while transcripts shorter than 200 bp or containing fewer than two exons were excluded. The identification of lncRNA transcripts was performed using multiple tools, including the Coding Potential Calculator (CPC2, v1.0.1), Coding-Noncoding Index (CNCI, v2.0), Coding Potential Assessment Tool (CPAT, v3.0.4), and the Pfam database ([65]http://pfam.xfam.org/v1.3, accessed on February 2023). The expression abundance of the identified lncRNA transcripts was quantified using FPKM. Differential expression analysis between the mammary gland and milk fat globules was conducted using DESeq2, with thresholds set at FDR < 0.05 and |log2(fold change)| ≥ 1. cis-Target genes of lncRNAs were identified using a Perl script to locate adjacent genes within 100 kb upstream or downstream of each lncRNA. Pearson correlation coefficient method was used to analyze the correlation between lncRNA and mRNA among samples, genes with the absolute value of correlation greater than 0.9 and significant P-value less than 0.01 were taken as trans-target genes of lncRNA. Additionally, GO functional annotation and KEGG pathway enrichment analysis were performed for non-differential lncRNA target genes using the ClusterProfiler R package. 2.6. Identification of circRNA and prediction of target genes The quality of the sequencing data was assessed using FastQC (v0.12.1), followed by processing of the high-throughput sequencing reads with Fastp (v0.23.4) to remove adapter sequences, primers, poly-A tails, and other unwanted elements. The filtered high-quality reads were then aligned to the reference genomes of buffalo and goats using Burrows-Wheeler-Alignment (BWA) software. Circular RNAs (circRNAs) were identified using CIRI2 software (v2.0.6), while splicing and quantification of the circRNAs were conducted with CIRIquant (v2.0.6, [66]https://sourceforge.net/projects/ciri/files/CIRIquant/, accessed on April 2023). Lowly expressed circRNAs were filtered out, retaining only those expressed in at least two samples for further analysis. Differential expression analysis was performed using the edgeR R package (v3.42.4), with thresholds set at FDR < 0.05 and |log(fold change)| ≥ 1. Target genes of circRNAs were also predicted using CIRIquant. Finally, GO functional annotation and KEGG pathway enrichment analyses of the non-differential circRNA target genes were performed using the ClusterProfiler R package. 2.7. Identification of miRNA and prediction of target genes The raw data were processed using a Perl script to remove low-quality sequences, reads with more than 10 % unknown bases (N), those missing a 3′ junction sequence, and sequences shorter than 18 nucleotides or longer than 30 nucleotides. The cleaned reads were then used for downstream analysis, including alignment to the Silva, GtRNAdb, Rfam, and Repbase databases using Bowtie software (v1.0.0). By filtering out ncRNAs such as ribosomal RNA (rRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), and repetitive sequences, unannotated reads containing miRNAs were obtained. The unannotated reads were then aligned to the buffalo or goat reference genome using Bowtie software to determine their genomic locations, referred to as Mapped reads. These Mapped reads were compared to the mature miRNA sequences in the miRBase database (v22.1), including a range of 2 nucleotides upstream and 5 nucleotides downstream. A maximum of one mismatch was allowed, and reads that matched were classified as known miRNAs. Unknown miRNAs were predicted using miRDeep2 software (v2.0.0.5). Statistical analysis of miRNA expression was performed using DESeq2, with FDR < 0.05 and |log2(fold change)| ≥ 1 as the criteria for identifying significant differences between groups. The expression levels were normalized using the TPM algorithm. miRNA target genes were predicted using TargetScan ([67]https://www.targetscan.org/vert_80/, accessed on May 2023) and miRDB ([68]https://mirdb.org/, accessed on May 2023). Finally, GO functional annotation and KEGG pathway enrichment analyses of non-differential miRNA target genes were conducted using the ClusterProfiler R package. 2.8. Reverse transcription real-time quantitative polymerase chain reaction (RT-qPCR) analysis The RNA extracted from the mammary gland and milk fat globules was reverse transcribed into cDNA using an RT-PCR kit (Vazyme, Nanjing, China). To ensure the accuracy of the data, 29 DEGs and NDEGs related to lactation traits were randomly selected from buffalo and goat mRNA for Quantitative polymerase chain reaction (qPCR) validation. Primers were designed using Primer Premier software (v5.0; [69]https://www.premierbiosoft.com/primerdesign/, accessed on July 2023) and synthesized by Sangon Biotech Co., Ltd. (Shanghai, China) (Table S1). qPCR was performed using the SYBR qPCR Master Mix kit (Vazyme, Shanghai, China) on the LightCycler® 96 system (Roche, Basel, Switzerland). glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as the reference gene, with the mammary gland serving as the control group. The reaction system consisted of 10 μL 2× FastStart Universal SYBR Green Master, 0.5 μL (10 μM) each of forward and reverse primers, 1 μL cDNA, and 8 μL ddH₂O, resulting in a total reaction volume of 20 μL. The optimized reaction conditions were as follows: initial denaturation at 95 °C for 5 min, followed by 40 cycles of 95 °C for 15 s, 60 °C for 30 s, and 72 °C for 1 min. Gene expression levels were calculated using the 2^−ΔΔCT method, with each experiment repeated more than three times. Bar graphs illustrating the results were generated using GraphPad Prism software (v8.0, [70]https://www.graphpad.com/features, accessed on July 2023). 2.9. Statistical analysis Statistical analysis was performed using SPSS software (v26.0, [71]https://www.ibm.com/spss, accessed on July 2023). The Student's t-test was applied to compare transcript expression levels between milk fat globules and mammary glands. Each group was analyzed in triplicate, and the results are presented as mean ± SEM. Statistical significance was defined as p < 0.05, and highly significant differences were defined as p < 0.01. Prior to performing the t-test, data were tested for normality to ensure the validity of the analysis. 3. Results 3.1. Quality control and routine data Milk fat globules and mammary gland tissues from lactating buffalo (referred to as B-MFG and B-MG, respectively) and lactating goat (referred to as G-MFG and G-MG, respectively) were collected, and total RNA was extracted for RNA sequencing ([72]Fig. 1A, B). Agarose gel electrophoresis confirmed that the extracted RNA samples were of high integrity and suitable for sequencing ([73]Fig. 1A, B). Following sequencing quality control, the clean data from buffalo and goat samples reached 9.07 Gb and 15.71 Gb, respectively, with Q30 values exceeding 85.87 % and 85.72 % (Table S2). In the buffalo mRNA dataset, 14,266 genes were identified, including 401 genes specifically expressed in the mammary gland, 7 genes specifically expressed in milk fat globules, and 13,858 genes expressed between the mammary gland and milk fat globules ([74]Fig. 1C). In the goat mRNA dataset, 12,006 genes were identified, of which 72 genes were specifically expressed in the mammary gland, while 11,934 genes were expressed in both the mammary gland and milk fat globules ([75]Fig. 1D). FPKM analysis revealed that gene expression levels in the mammary gland were comparable to those in milk fat globules ([76]Fig. 1E, F). Fig. 1. [77]Fig. 1 [78]Open in a new tab Basic layout of transcriptome data. (A) Technical roadmap of buffalo data. (B) Technical roadmap of goat data. (C) Overlap of mammary gland and milk fat globules gene expression in buffalo (D). Overlap of mammary gland and milk fat globules gene expression in goat. (E) Buffalo sample FPKM distribution box plot (The abscissa represents different samples, and the ordinate represents the logarithm of the sample expression FPKM). (F) Goat sample FPKM distribution box plot. 3.2. Analysis of Buffalo mRNA profiles In the identified genes from buffalo samples, 6431 DEGs and 7835 NDEGs were found between milk fat globules and mammary gland ([79]Fig. 2A). Compared to the mRNA in mammary gland samples, 2680 DEGs were up-regulated, while 3751 DEGs were down-regulated in the milk fat globules samples ([80]Fig. 2B). KEGG analysis showed that the DEGs are mainly enriched in the JAK-STAT signaling pathway, PPAR signaling pathway, prolactin signaling pathway, and galactose metabolism pathway ([81]Fig. 2C). While the NDEGs were mainly enriched in the MAPK signaling pathway, fatty acid metabolism, glycerophospholipid metabolism, and glycerolipid metabolism pathway ([82]Fig. 2D). GO analysis showed that the DEGs are mainly enriched in cellular processes, metabolic processes, organic metabolic processes, and biological regulation ([83]Fig. 2E). Contrastingly, the NDEGs were mainly enriched in terms linked to cellular processes, response to stimuli, metabolic processes, and regulation of organic metabolic processes ([84]Fig. 2F). Additionally, genes associated with lactation were screened from DEGs and NDEGs to assess protein interaction networks. According to the protein interaction networks, we found that TP53, EGFR, PPARA, DGAT1, PLIN2, and FABP3 of the DEGs have the most interaction with other proteins ([85]Fig. 2G), while CASP3, IL1B, NFKB1, PDCD11, and PPARG of the NDEGs have the highest interactions with other proteins ([86]Fig. 2H). We randomly selected several genes related to lactation from the DEGs and NDEGs and detected their expression levels through qPCR. Results showed that the expression patterns of the selected genes were consistent with those in RNA-seq ([87]Fig. 2I, J), suggesting that the data and analysis of RNA-seq are credible. Fig. 2. [88]Fig. 2 [89]Open in a new tab Analysis of buffalo mRNA. (A) Total mRNA heatmap. (B) Volcano plot (P-value <0.05 and | log2 (Fold Change) | ≥ 1) (C) KEGG enrichment analysis of DEGs. (D) KEGG enrichment analysis of NDEGs. (E) GO enrichment analysis of DEGs. (F) GO enrichment analysis of NDEGs. (G) Protein interaction networks of DEGs linked to lactation traits. (H) Protein interaction networks of NDEGs linked to lactation traits. (I) qPCR of DEGs corresponding to lactation traits. (J) qPCR of NDEGs corresponding to lactation traits. * and ** represent significant difference (p < 0.05 or p < 0.01), ns represents no difference (p > 0.05). 3.3. Analysis of goat mRNA profiles In goat transcript profiles, we identified a total of 12,006 genes, including 1053 DEGs and 10,953 NDEGs between milk fat globules and mammary gland ([90]Fig. 3A). Compared to the mammary gland samples, 230 up-regulated genes and 823 down-regulated genes were identified in the milk fat globules samples ([91]Fig. 3B). KEGG analysis showed that the DEGs are mainly enriched in the PI3K-Akt signaling pathway, MAPK signaling pathway, PPAR signaling pathway, and fatty acid metabolic pathway ([92]Fig. 3C). In addition, the NDEGs were mainly enriched in the MAPK signaling pathway, FoxO signaling pathway, Wnt signaling pathway, fatty acid metabolism, and PPAR signaling pathway ([93]Fig. 3D). It is notable that both the DEGs and NDEGs were enriched in terms linked to the cellular lipid metabolism process and lipid biosynthesis process through GO cluster analysis ([94]Fig. 3E, F). Genes related to lactation traits from DEGs and NDEGs were screened and performed protein interaction network analysis. Results showed that TRP53, TLR4, PPARA, CASP3, FASN, and CPT1A of the NDEGs had the highest interaction with other proteins ([95]Fig. 3G). As the DEGs between G-MFG and G-MG were limited, no protein interaction network could be mapped to the DEGs, indicating that the mRNAs from milk fat globules are highly consistent with the mammary gland. qPCR analysis confirmed the credibility of the data for the expression of selected genes was consistent with RNA-seq ([96]Fig. 3H, I). Fig. 3. [97]Fig. 3 [98]Open in a new tab Assessment of goat mRNA. (A) Total mRNA heatmap. (B) Volcano plot (P-value <0.05 and | log2 (Fold Change) | ≥ 1). (C) KEGG enrichment analysis of DEGs. (D) KEGG enrichment analysis of NDEGs. (E) GO enrichment analysis of DEGs. (F) GO enrichment analysis of NDEGs. (G) Protein interaction networks of DEGs linked to lactation. (H) qPCR of DEGs associated with lactation traits. (I) qPCR of NDEGs associated with lactation traits. * and ** represent significant difference (p < 0.05 or p < 0.01), ns represents no difference (p > 0.05). 3.4. ncRNA analysis of Buffalo The ncRNAs, including lncRNAs, circRNAs, and miRNAs, were analyzed in the transcript profiles. A total of 6086 lncRNA transcripts were identified ([99]Fig. 4 A), comprising 231 mammary gland-specific lncRNAs, 1791 milk fat globule-specific lncRNAs, and 4064 lncRNAs expressed in both mammary glands and milk fat globules ([100]Fig. 4B). Among these, 3993 lncRNAs were non-differentially expressed, while 2093 were differentially expressed, with 527 upregulated and 1566 downregulated in the milk fat globule samples ([101]Fig. 4C, N). Further analysis revealed that most lactation-related lncRNAs were differentially expressed between the mammary gland and milk fat globules, including MSTRG.100749.3, MSTRG.13060.1, MSTRG.25944.1, MSTRG.26125.1, and MSTRG.9054.2, which target SCD, NCKAP5, MGST1, ABCC9, and BTN1A1, respectively. KEGG and GO analysis showed that most of the lncRNAs in lactation-related pathways ([102]Fig. 4D, E), such as GnRH secretion, GnRH signaling pathway, Oxytocin signaling pathway, Wnt signaling pathway, Phosphatidylinositol signaling system, lipid metabolic process are differentially expressed. Fig. 4. [103]Fig. 4 [104]Open in a new tab ncRNA analysis of buffalo. (A) lncRNA prediction Venn diagram. (B) lncRNA specificity and co-expression Venn diagram. (C) Heatmap of all lncRNA transcripts. (D) KEGG analysis of lncRNA transcripts related to lactation traits. (E) GO analysis of lncRNA transcripts related to lactation traits. (F) circRNA specificity and co-expression Venn diagram. (G) Heatmap of circRNA transcripts. (H) KEGG analysis of circRNA transcripts linked to lactation traits. (I) GO analysis of circRNA transcripts linked to lactation traits. (J) miRNA specificity and co-expression Venn diagram. (K) Heatmap of miRNA transcripts. (L) KEGG analysis of miRNA transcripts associated with lactation traits. (M) GO analysis of miRNA transcripts linked to lactation traits. (N) Volcano plot of differential transcripts of ncRNA. (O) Functional classification of non-differentially expressed transcripts linked to lactation traits. (P) Targeting relationship of ncRNA non-differentially expressed transcripts. (Red circles represent target genes, and the larger the number of interacting protein nodes, the larger the circles. The more proximal the protein correlation, the thicker the line segment). We further predicted the circRNAs in the transcripts, and 11,631 circRNA transcripts were identified, encompassing 3210 mammary gland-specific circRNAs, 1180 milk fat globules-specific circRNAs, and 7241 expressed in both mammary gland and milk fat globules circRNAs ([105]Fig. 4F). Among the predicted 11,631 circRNAs, only 618 were found significantly differentially expressed, with 244 up-regulated and 367 down-regulated in the milk fat globules samples ([106]Fig. 4G, [107]Fig. 4 N), suggesting that the circRNA transcripts in milk fat globules are quite similar to those in mammary gland. We found that many of the lactation-related circRNAs are expressed both in the mammary gland and milk fat globules samples without significant difference, such as chr8:84187508|84,304,002, chr3:153425609|153,435,325, chr3:48542316|48,566,573, chr4:15797092|15,802,037, and chr23: 20512605|20,513,448, which target CSN2, LPL, ACACA, PPARA, and SCD, respectively. KEGG and GO analyses demonstrated that most of the circRNAs act in lactation-related pathways, such as the PPAR signaling pathway, sphingolipid signaling pathway, MAPK signaling pathway, fatty acid metabolism, adipocyte signaling pathway, are non-differently expressed in the mammary gland and milk fat globules ([108]Fig. 4H, I). The miRNA transcripts were predicted and 7010 miRNAs were identified, including 5897 known miRNAs and 1113 newly predicted miRNAs. Among the miRNAs, 1287 were identified as specifically expressed in the mammary gland, 622 were found to be specifically expressed in milk fat globules, and 5101 miRNAs were expressed both in the mammary gland and milk fat globules ([109]Fig. 4 J). We found that 3465 miRNAs were non-significantly differentially expressed in the mammary gland and milk fat globules, and 3545 were differentially expressed, including 1534 up-regulated and 2011 down-regulated from the milk fat globules samples ([110]Fig. 4 K, N). Further, GO and KEGG analysis demonstrated that most of the miRNAs in the lactation-related pathways are differently expressed throughout the mammary gland and milk fat globules samples ([111]Fig. 4 L, M). Synthesizing the above data, we can assert that most of the ncRNAs are expressed both in the mammary gland and milk fat globules samples, though some are expressed with significant differences. Additionally, we analyzed the functions and relationships of target genes of ncRNA transcripts to better understand the regulatory network between ncRNAs and their target genes, as well as interactions among ncRNAs. Results showed multiple associations between target genes and ncRNAs related to key lactation traits, such as milk yield, synthesis of fatty acids and triglycerides, somatic cell score, and lactation persistence ([112]Fig. 4O). PPI analysis further revealed potential regulatory relationships between ncRNA transcripts and their target genes, highlighting their roles in lactation processes ([113]Fig. 4P). 3.5. ncRNA analysis of goat samples We identified a total of 7584 lncRNA transcripts ([114]Fig. 5 A), including 543 specifically expressed in the mammary gland, 408 specific to milk fat globules, and 6633 expressed in both mammary gland and milk fat globules ([115]Fig. 5B). Among these, 7285 lncRNAs showed no significant differential expression, while 299 were differentially expressed, with 50 upregulated and 249 downregulated in the milk fat globule samples ([116]Fig. 5C, N). In contrast to the buffalo group, a large proportion of lncRNA transcripts in this group were non-differentially expressed. These include MSTRG.1267.1, MSTRG.20228.6, MSTRG.2343.5, MSTRG.25221.2, MSTRG.26390.4, and MSTRG.26401.1, which target key genes such as ALCAM, LALBA, BCL6, ABCG2, CSN1S1 and CSN1S2, respectively. KEGG and GO analyses demonstrated that most lncRNAs in lactation-related pathways were enriched in the PI3K-Akt signaling pathway, Sphingolipid signaling pathway, JAK-STAT signaling pathway, oxytocin signaling pathway, Wnt signaling pathway, MAPK signaling pathway, and galactose metabolic pathway ([117]Fig. 5D, E). Fig. 5. [118]Fig. 5 [119]Open in a new tab ncRNA analysis of goat. (A) lncRNA prediction Venn diagram. (B) lncRNA specificity and co-expression Venn diagram. (C) Heatmap of all lncRNA transcripts. (D) KEGG analysis of lncRNA transcripts related to lactation traits. (E) GO analysis of lncRNA transcripts related to lactation traits. (F) circRNA specificity and co-expression Venn diagram. (G) Heatmap of circRNA transcripts. (H) KEGG analysis of circRNA transcripts linked to lactation traits. (I) GO analysis of circRNA transcripts linked to lactation traits. (J) miRNA specificity and co-expression Venn diagram. (K) Heatmap of miRNA transcripts. (L) KEGG analysis of miRNA transcripts associated with lactation traits. (M) GO analysis of miRNA transcripts linked to lactation traits. (N) Volcano plot of differential transcripts of ncRNA. (O) Functional classification of non-differentially expressed transcripts linked to lactation traits. (P) Targeting relationship of ncRNA non-differentially expressed transcripts. (Purple circles represent target genes, the larger the number of interacting protein nodes, the larger the circles. The closer the protein correlation, the thicker the line segment). We further predicted the circRNAs in the transcripts and 7040 circRNA transcripts were identified, including 249 mammary gland-specific circRNAs, 633 milk fat globules-specific, and 6158 mammary gland and milk fat globules expressed ([120]Fig. 5F). Among the predicted 7040 circRNAs, only 118 were identified to be significantly differentially expressed, including 29 up-regulated and 89 down-regulated from the milk fat globules samples ([121]Fig. 5G, N), indicating that the circRNA transcripts in milk fat globules were quite similar to mammary gland. We found that most of the lactation-related circRNAs are expressed both in the mammary gland and milk fat globules samples without significant differences, including chr11:14031625| 14,040,821, chr13:63768085|63,774,646, chr17:6697222|6,724,992, chr 17:7224976|7,235,869, and chr19:50155979|50,162,612, which target XDH, ACSS2, SART3, ACACB and FASN, respectively. KEGG and GO analysis indicated that most of the circRNAs in lactation-related pathways, such as the MAPK signaling pathway, PI3K-Akt signaling pathway, PPAR signaling pathway, sphingolipid signaling pathway, Wnt signaling pathway, and GnRH signaling pathway showed no difference in expression across mammary gland and milk fat globules ([122]Fig. 5H, I). The miRNA transcripts were analyzed, and a total of 7184 miRNAs were identified, including 5809 known miRNAs and 1375 newly predicted ones. Among these, 752 miRNAs were specifically expressed in the mammary gland, 226 were specific to milk fat globules, and 6206 were expressed in both mammary gland and milk fat globules ([123]Fig. 5 J). Of these, 6294 miRNAs showed no significant differential expression between mammary gland and milk fat globules, while 890 miRNAs were differentially expressed, including 182 upregulated and 708 downregulated in the milk fat globule samples ([124]Fig. 5 K, N). Further, GO and KEGG analyses indicated that most miRNAs involved in lactation-related pathways were expressed similarly in the mammary gland and milk fat globule samples ([125]Fig. 5 L, M). These results suggest that the majority of ncRNAs are consistently expressed across mammary gland and milk fat globules, with minimal differences. Additionally, similar to buffalo, the goat transcript data revealed multiple associations between target genes and ncRNAs linked to lactation traits ([126]Fig. 5O, P). The targeting relationship and interaction network of ncRNAs may allow for a new understanding for further study of lactation. 4. Discussion Functional analysis of specific molecules in the mammary gland can provide a critical understanding of the Gene expression and post-transcriptional regulation that affects the lactation performance of dairy animals ([127]Arora et al., 2019; [128]Wu et al., 2023; [129]Xuan et al., 2022). However, the difficulty and impracticality of obtaining mammary gland samples from lactating animals has significantly impeded the study of animal lactation. Milk fat globules, abundant in milk fat, are the most direct participants in milk fat synthesis and metabolism. It is secreted in MEC and is readily available and allows repetitive sampling during the lactation of animals ([130]Mather & Keenan, 1998). Milk fat globules originate in the endoplasmic reticulum, where fat droplets are formed and are gradually enveloped by the apical plasma membrane before being extruded into the acinus lumen as part of milk secretion ([131]Heid & Keenan, 2005; [132]Thum et al., 2022). During the formation of milk fat globules, some cytoplasm may be retained around the fat droplets and membrane, with a shape similar to a crescent, called a crescent body. Studies have found that 7.2 % of milk fat globules contain crescent bodies ([133]Bourlieu & Michalski, 2015), and the transcript profile of milk fat globules was found similar to the mammary gland in humans and mice ([134]Maningat et al., 2009). Using microarray assays, a previous study found that RNA isolated from milk fat globules represents a unique source of MEC RNA to noninvasively sample sufficient quantities of high-quality RNA to assess the dynamics of MEC gene expression in vivo ([135]Brenaut et al., 2012). Using qPCR, another study uncovered that the expression of milk fat and protein synthesis-related genes in milk fat globules were more elevated than that in mammary gland and MSC ([136]Chen et al., 2016), suggesting that milk fat globules may be a promising technique for studies on the function of MEC. However, the previous studies are only focused on the comparison of highly expressed genes or specific functional genes between milk fat globules and mammary gland, which cannot fully reflect the similarities and differences of the transcripts between milk fat globules and mammary gland. For example, in this study, we also found that most lactation-related transcripts are expressed both in the milk fat globules and mammary gland without significant differences, which is consistent with the above reports. But the fact is that, near 45 % of the whole mRNAs were found differentially expressed between milk fat globules and mammary gland in buffalo samples. Therefore, it is of significant to systemically compare the transcript profiles of milk fat globules and mammary gland using whole RNA sequencing, which is a more comprehensive comparison method. A previous report has systemically compared gene expression of RNA obtained from mammary gland tissue (MGT), MSC, laser micro-dissected mammary epithelial cells (LCMEC), milk fat globules, and antibody-captured milk mammary epithelial cells (mMEC) from bovine animals using RNA-sequencing, and found that the MSC and milk fat globules transcriptomes are representative of MGT and LCMEC and can be used as effective and alternative samples for the study of gene expression in the MGT ([137]Cánovas et al., 2014). While in this study, we found that the feasibility of using transcripts from milk fat globules to replace that of mammary glands seems to be different for different species. In buffaloes, though more than 97 % of the mRNAs were found expressed both in the milk fat globules and mammary gland, nearly 45 % of the mRNAs were found differentially expressed between milk fat globules and mammary gland, suggesting that the transcripts from milk fat globules are not fully representative of the mammary gland. While in goats, we found more than 99 % of the mRNAs were expressed both in the milk fat globules and mammary gland, and over 91 % of them were expressed without significant difference, suggesting the mRNAs in milk fat globules exactly reflect that in mammary glands. Combined with the reports and our results, the transcripts from milk fat globules cannot always reflect that of the mammary gland. Therefore, it is important for the researchers to make it clear whether the milk fat globules are a feasible source of mammary gland transcripts before regulatory and functional analysis. It is worth noting that, though 45 % of the mRNAs were found differentially expressed between milk fat globules and mammary gland in buffalo samples, further analysis showed that most lactation-related transcripts are expressed both in the milk fat globules and mammary gland without significant differences, including the PPARG, NMI, EXT2, and ACLY that related to fat synthesis, the RPL23A, ATP6AP2, BCL6, VPS13B, and IRF6 that related to milk yield, and the CSN2, MFNG, MFSD14A, and AGL that related to milk protein synthesis. β-casein (CSN2) and alpha-lactalbumin (LALBA) have been found to account for 45 % of the total mRNA library in the transcripts of the fat layer of human milk ([138]Lemay, Ballard, et al., 2013), and casein (CSN2, CSN3, CSN1S1, and CSN1S2), whey proteins (BLG and LALBA), and GLYCAM1 were the most expressed genes in the transcripts of MSC ([139]Wickramasinghe et al., 2012). In addition, multiple genes, such as genes related to milk protein and lipid synthesis, were found expressing both in the milk fat globules and mammary gland alveolar parenchyma in goats using microarray analysis ([140]Brenaut et al., 2012). Coincidentally, in this study, the genes mentioned above were also found to be highly expressed both in the milk fat globules and mammary glands without significant differences, which suggests the uniformity of the transcripts in milk fat globules and mammary glands. However, we also found the expression of some genes in milk fat globules was significantly higher than in mammary glands in buffalo data. A previous study showed that the expression of genes related to the synthesis of milk protein (LALBA, BLG, and CSN2) and fat (ACC, BTN1A1, FABP3, and FAS) in milk fat globules was higher than that in the mammary gland and MSC ([141]Chen et al., 2016), which is consistent with our study. Therefore, in buffalo, though the mRNAs in milk fat globules did not exactly reflect that in mammary gland tissues, most of the expression of the lactation-related genes showed a high degree of consistency, suggesting that the milk fat globules are still a non-invasive candidate source to the lactation-related studies. ncRNAs have been firmly established as key regulators of gene expression in myriad processes ranging from embryonic development to innate immunity, including the processes of physiology and pathology. Recently, many studies have found that the ncRNA family, including lncRNA, circRNA, and microRNA, plays crucial roles in lactation by regulating gene transcription and post-transcriptional translation ([142]Feng et al., 2022; [143]Ji et al., 2020; [144]Standaert et al., 2014), little study has systemically compared the ncRNA transcripts between the milk fat globules and mammary gland. lncRNAs are found as essential regulatory elements, playing diverse roles in many biological processes including mammary gland development. 45 differentially expressed lncRNAs were identified between the Jersey and Kashmiri cattle, and were found co-expressed with genes involved in the milk synthesis processes, indicating their potential regulatory effects on milk quality ([145]Mumtaz et al., 2022). A previous study identified 2381 lncRNAs in the mammary gland tissues of Laoshan dairy goats (Capra hircus) from different lactation periods at the whole-genome level and found the lncRNAs are closely related to mammary gland development and lactation. In the present study, we identified 6086 and 7584 lncRNA transcripts in mammary gland and milk fat globules from buffalo and goat, respectively. Further analysis showed that 35 % of the lncRNAs are differentially expressed between the milk fat globules and mammary glands from buffalo. While in the milk fat globules and mammary gland from goats, more than 96 % of the lncRNAs are non-differentially expressed, indicating the milk fat globules-lncRNAs exactly reflect the mammary gland-lncRNAs. Using RNA-seq, a previous study identified 309 circRNAs that are differentially expressed between the MEC of cows with high and low milk fat percentages and constructed the lipid metabolism-related competing endogenous RNA (ceRNA) interactive regulatory network ([146]Feng et al., 2022; [147]Feng et al., 2023). Another study identified 4906 circRNAs in mammary gland parenchyma at peak lactation from Small-Tailed Han sheep and Gansu Alpine Merino sheep with phenotypic differences in milk yield and components ([148]Hao et al., 2019). circ_015343 has been found that can regulate the expression of functional genes related to the development of mammary glands by binding some miRNAs ([149]Wu et al., 2022). circRNAs were also found in response to heat stress by circRNA-miRNA-mRNA networks. However, no study has identified or characterized the circRNAs between mammary glands and milk fat globules. In the present study, we identified 11,631 and 7040 circRNAs, and only 618 and 118 circRNAs were found significantly differentially expressed in mammary gland and milk fat globules from buffalo and goat, respectively. Further analysis showed that most of the lactation-related circRNAs and pathways are expressed both in the mammary gland and milk fat globules samples without significant differences, suggesting that the circRNA transcripts in milk fat globules were quite similar to the mammary gland. miRNAs are small noncoding RNAs present in milk-derived extracellular vesicles and milk fat globules. Studies have found that the mammary gland cells express their own miRNAs rather than originate from the blood ([150]Kosaka et al., 2010). Several studies have identified the miRNAs from the mammary gland tissue of cattle and goats and found miRNAs play important roles in mammary gland development and lactation ([151]Dysin et al., 2021). In addition, a study has found that more than 90 % of the miRNAs in milk-derived extracellular vesicles are consistent with the mammary gland tissue in bovine ([152]Li et al., 2022). However, another study suggested that the miRNAs in the milk fat globules do not reflect exactly the mammary gland miRNAs by detection of several miRNAs using qRT-PCR ([153]Leroux et al., 2022; [154]Pawlowski et al., 2020). In this study, we found that in buffalo, more than 72 % of the miRNAs were expressed both in the milk fat globules and mammary gland, and only about 50 % of them were expressed without significant difference, which agrees with the point that, milk fat globules can be used as a non-invasive source of microRNA but do not reflect exactly the mammary gland miRNome. While in the goat, more than 86 % of the miRNAs were found non-significantly differentially expressed in mammary glands and milk fat globules, suggesting the miRNAs in the milk fat globules were consistent with mammary glands in the goat. This result again indicated that the feasibility of using transcripts from milk fat globules to replace that of mammary glands is different for different species. 5. Conclusion This study systematically characterized and compared the expression of mRNAs, lnRNAs, circRNAs, and miRNAs from milk fat globules and mammary gland samples of buffalo and goats. The results strongly suggest that the feasibility of using transcripts from milk fat globules to replace that of mammary glands is different for different species. In goats, milk fat globules represent a valuable, non-invasive source of mammary gland transcripts and a possible alternative to mammary glands for molecular expression and regulation analysis. While in buffaloes, the transcripts from milk fat globules cannot fully represent mammary transcription as nearly half of the transcripts from milk fat globules are significantly different from those of mammary glands. However, most of the lactation-related transcripts are consistent between the milk fat globules and mammary gland samples, suggesting that the milk fat globules are still a non-invasive candidate source for lactation-related studies. This study provides deeper insights into the transcriptional characteristics of milk fat globules and their potential as a non-invasive alternative to mammary gland tissue in lactation studies. CRediT authorship contribution statement Hancai Jiang: Writing – original draft, Visualization, Validation, Formal analysis. Xiaoxian Xu: Visualization, Software, Formal analysis, Data curation. Shuwan Wang: Validation, Resources. Xinhui Song: Software, Investigation. Ling Li: Resources. Qingyou Liu: Writing – review & editing, Supervision, Funding acquisition. Kuiqing Cui: Writing – review & editing, Supervision. Deshun Shi: Writing – review & editing, Supervision. Jian Wang: Writing – review & editing. Hui Li: Writing – review & editing. Jieping Huang: Writing – review & editing, Supervision. Zhipeng Li: Writing – review & editing, Supervision, Funding acquisition, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgments