Abstract Background Cadmium (Cd)-containing chemicals can cause serious damage to biological systems. In animals and plants, Cd exposure can lead to metabolic disorders or death. However, for the most part the effects of Cd on specific biological processes are not known. DNA methylation is an important mechanism for the regulation of gene expression. In this study we examined the effects of Cd exposure on global DNA methylation in a living organism by whole-genome bisulfite sequencing (WGBS) using Drosophila melanogaster as model. Results A total of 71 differentially methylated regions and 63 differentially methylated genes (DMGs) were identified by WGBS. A total of 39 genes were demethylated in the Cd treatment group but not in the control group, whereas 24 showed increased methylation in the former relative to the latter. In most cases, demethylation activated gene expression: genes such as Cdc42 and Mekk1 were upregulated as a result of demethylation. There were 37 DMGs that overlapped with differentially expressed genes from the digital expression library including baz, Act5C, and ss, which are associated with development, reproduction, and energy metabolism. Conclusions DNA methylation actively regulates the physiological response to heavy metal stress in Drosophila in part via activation of apoptosis. Electronic supplementary material The online version of this article (10.1186/s12864-019-5688-z) contains supplementary material, which is available to authorized users. Background Cadmium (Cd)-based chemicals are essential in many industries, including plastics and battery manufacturing and non-ferrous metallurgy [[37]1]. As a result of their widespread use, large amounts of Cd have been released into the environment over many decades, causing pollution that threatens global ecosystems as well as human health [[38]2, [39]3]. Through the food chain, these chemicals can accumulate in organisms inhabiting contaminated environments [[40]4], resulting in genetic damage, reduced reproductive capacity, growth inhibition, and even death [[41]5, [42]6]. Given their ubiquitous presence, there is an urgent need to better understand the biochemical impacts of Cd-based chemicals and develop effective detoxification mechanisms [[43]7]. Many studies have addressed not only the repair of genetic damage caused by Cd but also apoptosis and oxidative stress [[44]8, [45]9]. However, there is little known about how Cd affects DNA methylation, a type of epigenetic modification that is important for gene regulation [[46]10–[47]12]. Drosophila melanogaster is considered a suitable model species for investigating biological responses to toxic chemicals [[48]13]. Genes in D. melanogaster have many homologs in mammals including humans, with many genes being structurally and functionally conserved; however, Drosophila has the advantage of a simpler genome that makes it more amenable to studies of complex biological mechanisms [[49]14–[50]16]. Although global DNA methylation level is lower overall in the genome of Drosophila as compared to mammals, there are also fewer methylases. DNA methylation is an important epigenetic mechanism for the regulation of gene expression in development, reproduction, and stress resistance [[51]17–[52]20]. Although it is presumed that DNA methylation is involved in the response to Cd stress in Drosophila, there have been no detailed surveys of DNA methylation profiles following exposure to heavy metal stress and many questions remain unanswered, including the number and identity of methylated genes and how methylation affects gene expression. To address these points, in this study we used whole-genome bisulfite sequencing (WGBS) to evaluate genome-wide DNA methylation changes in D. melanogaster subjected to Cd stress. We identified many differentially methylated genes (DMGs) and demonstrated their relationship to gene expression. Our results provide evidence for the broad involvement of DNA methylation in the response to heavy metal stress in animals. Results DNA methylation state of the Drosophila genome WGBS yielded 35.5 Gb of raw data from six different samples (three repeats for each of the two groups) comprising about 38.2 billion nucleotides, all with Q20 values above 95% (Table [53]1). The raw reads numbered more than 37.6 million among the six samples, and after removing those of low quality (i.e., those with a high number of ‘N’, poly-A contamination, and contamination by adaptor sequences), at least 98% of the reads were retained and were taken as the high-quality (HQ) clean reads. Given the number of retained HQ reads, we expected an average genome coverage of about 30×. For all samples, between 63.56 and 74.60% of the HQ reads mapped uniquely to the genome, giving an average genome coverage between 27.28× and 35.67× (Table [54]1). Table 1. Summary of genome-wide bisulfite sequencing data for six Drosophila melanogaster samples Sample Clean data (bp) HQ clean data (bp) No. of clean read No. of HQ clean reads (%) Q20 (%) Q30 (%) GC (%) N (%) HQ clean data / clean data (%) CK-1 6,342,804,900 6,265,486,794 42,285,366 41,829,038 (98.92%) 6,127,259,145 (97.79%) 5,943,780,984 (94.87%) 1,182,385,554 (18.87%) 987,865 (0.02%) 98.78% CK-2 5,651,708,700 5,543,296,390 37,678,058 36,998,402 (98.20%) 5,328,517,738 (96.13%) 5,021,173,126 (90.58%) 1,040,523,808 (18.77%) 593,380 (0.01%) 98.08% CK-3 6,667,975,500 6,566,612,690 44,453,170 43,845,552 (98.63%) 6,346,014,536 (96.64%) 6,013,608,683 (91.58%) 1,291,540,988 (19.67%) 715,638 (0.01%) 98.48% s52–1 6,380,107,200 6,289,532,315 42,534,048 41,983,774 (98.71%) 6,135,594,846 (97.55%) 5,933,616,057 (94.34%) 1,179,393,128 (18.75%) 976,796 (0.02%) 98.58% s52–2 6,989,049,900 6,860,524,362 46,593,666 45,809,874 (98.32%) 6,685,079,771 (97.44%) 6,464,628,728 (94.23%) 1,303,391,516 (19.00%) 1,073,466 (0.02%) 98.16% s52–3 6,172,341,000 6,053,568,099 41,148,940 40,413,472 (98.21%) 5,787,545,449 (95.61%) 5,425,219,375 (89.62%) 1,168,998,770 (19.31%) 567,912 (0.01%) 98.08% [55]Open in a new tab HQ high quality The average number of methylated cytosines detected in the Cd treatment and control groups was about 0.1% of all cytosines in the Drosophila genome. There were 12,397 methylated cytosines for CG, 9880 for CHG, and 30,678 for CHH (where H represents A, C, or T) in the treatment group (Fig. [56]1a and Table [57]2), which was significantly lower (P < 0.05, Fisher’s exact test) than in the control group (15,854, 12,243, and 37,246, respectively, Fig. [58]1b and Fig. [59]1c), indicating that Cd treatment reduced global methylation levels. Fig. 1. Fig. 1 [60]Open in a new tab Distribution of mC in CG, CHG, and CHH in the (a) treatment group, (b) control group and (c) all six different samples Table 2. Methylated CG, CHG, and CHH sites in Cd treatment and control groups (CK) as the total number and percentage of whole genome Group Type Number Percent (%) CK-1 CG 15,854 24.26 CK-2 CHG 12,243 18.74 CK-3 CHH 37,246 57.00 Cd treatment 1 CG 12,397 23.41 Cd treatment 2 CHG 9880 18.66 Cd treatment 3 CHH 30,678 57.93 [61]Open in a new tab Preferred sequences flanking the methylation site We analyzed the relationship between the type of methylation and surrounding sequences by identifying the features of the 9-mer sequence around the methylation site (Fig. [62]2a and b). For CHH, the Cd treatment and control groups showed identical sequence enrichment at each genomic region, with “TTG” and “TTT” being the preferred sequences around the methylation site. In the CG and CHG environment, sequences around methylation were slightly different. At the CG locus, with “TTT” and “AAAA” being the preferred sequences of the treatment group and “TTT” and “AATT” being those of the control group. Judging by such pattern, there seem to be almost equal preference for A, T, C, and G around all types of methylation sites. Thus, there doesn’t seem to be any significantly enriched motifs in any of the treatments. Methylation occurred at similar sequence environments. Fig. 2. [63]Fig. 2 [64]Open in a new tab Sequence preferences of methylation site domains in CG, CHG, CHH in (a)