Abstract

   Serum metabolite profiling in Duchenne muscular dystrophy (DMD) may
   enable discovery of valuable molecular markers for disease progression
   and treatment response. Serum samples from 51 DMD patients from a
   natural history study and 22 age-matched healthy volunteers were
   profiled using liquid chromatography coupled to mass spectrometry
   (LC-MS) for discovery of novel circulating serum metabolites associated
   with DMD. Fourteen metabolites were found significantly altered (1%
   false discovery rate) in their levels between DMD patients and healthy
   controls while adjusting for age and study site and allowing for an
   interaction between disease status and age. Increased metabolites
   included arginine, creatine and unknown compounds at m/z of 357 and 312
   while decreased metabolites included creatinine, androgen derivatives
   and other unknown yet to be identified compounds. Furthermore, the
   creatine to creatinine ratio is significantly associated with disease
   progression in DMD patients. This ratio sharply increased with age in
   DMD patients while it decreased with age in healthy controls. Overall,
   this study yielded promising metabolic signatures that could prove
   useful to monitor DMD disease progression and response to therapies in
   the future.

Introduction

   Duchenne muscular dystrophy is an X-linked genetic disease and the most
   common childhood neuromuscular disorder, with an incidence of about 1
   in 5,000 newborn males [[53]1–[54]3]. The disease is characterized by a
   natural history consisting of progressive muscle degeneration, loss of
   ambulation by the age of 12, and eventually death by late 20s to early
   30s [[55]4, [56]5]. DMD is caused by mutations in the dystrophin gene
   resulting in complete loss of the dystrophin protein that maintains
   muscle fiber integrity and function [[57]6]. Absence of dystrophin
   causes muscle fragility and loss of muscle fibers which are replaced by
   fat and connective tissue. While understanding of the disease and
   clinical management–including the use of glucocorticoid treatment and
   assisted ventilation–has continued to improve over the last few decades
   [[58]7, [59]8], development of novel therapies and testing of novel
   drugs is still hindered by the small numbers of patients available for
   clinical trials and the difficulties in selecting appropriate outcome
   measures [[60]9]. As a result, regulatory agencies such as the Food and
   Drug Administration (FDA) and European Medicines Agency (EMA) are more
   willing to consider surrogate biomarkers as endpoint measures in Phase
   II dose-ranging studies. Thus it is important to find and identify
   reliable biomarkers associated with disease progression and response to
   therapies for DMD patients.

   The best-known serum molecular biomarker for DMD is muscle-derived
   creatine kinase (CK) [[61]10], which is typically measured by enzymatic
   activity. Serum elevations of CK are generally considered a diagnostic
   biomarker of DMD, even in presymptomatic infants [[62]11]. However, the
   large inter- and intra-person variability in CK levels, partly caused
   by the impact of the child’s age and sensitivity to physical activity,
   makes it a poor candidate for use as a surrogate biomarker.
   Furthermore, CK levels have been shown to be inversely correlated with
   disease progression and severity due to the loss of muscle tissue for
   more advanced patients, which causes less CK to leak into the blood
   stream and be detected [[63]12].

   In the last few years, there has been renewed interest in defining
   biochemical biomarkers for DMD. These include miRNAs [[64]13–[65]15]
   and proteins [[66]12, [67]16–[68]20], but little research has been
   dedicated to identifying metabolic biomarkers for DMD [[69]21].

   Metabolites are small molecular mass components or intermediate
   products of metabolism that can be detected in biofluids and tissues;
   they regulate and maintain physiology homeostasis and have various
   biological functions. They can be influenced by genetics, as well as by
   environmental factors [[70]22]. Metabolites are easily measurable using
   high throughput technologies and can be transitioned into clinical
   assays. They are widely used in other diseases (e.g. serum creatinine
   is a marker of liver function) but had very limited implementation in
   DMD. “Metabolomics” refers to the systematic assessment of metabolites.
   Metabolic profiles have been used to predict disease risk, to diagnose
   disease, or as biomarkers of disease in a variety of disorders,
   including diabetes, prostate cancer, and Crohn’s disease
   [[71]23–[72]26]. Untargeted metabolomics approaches in DMD can add to
   the growing catalogue of protein and miRNA biomarkers which may be used
   to monitor disease progression and, as a result, to evaluate
   therapeutic response. The present study proposes several such putative
   metabolic biomarkers.

Materials and Methods

Study participants and samples

   We analysed the circulating serum metabolites of 51 DMD patients and 22
   healthy volunteers from 5 different study sites enrolled through the
   Cooperative International Neuromuscular Research Group (CINRG) Duchenne
   Natural History Study. All the study participants were male. The cases
   and controls were age-matched, with some enrichment for older cases.
   Summary characteristics of the study participants are provided in
   [73]Table 1 and detailed in [74]S1 Table. The DMD patients had a
   minimum age of 4, a maximum age of 28.7, and a median age of 11.4
   years. The healthy controls had a minimum age of 6, a maximum age of
   17.8, and a median age of 13.7 years. The study protocol was approved
   by Institutional Review Boards at all participating institutions, and
   informed written consent was obtained from participants or their parent
   or legal guardian. The Institutional Review Boards were: Conjoint
   Health Research Ethics Board (CHREB) (for Alberta Children’s Hospital,
   Calgary), UC Davis Institutional Review Board, University of Pittsburgh
   Institutional Review Board, Children’s National Medical Center
   Institutional Review Board (CNMC IRB) and Executive Committee of the
   Sydney Children’s Hospitals Network (SCHN) Human Research Ethics
   Committee (HREC) (for Children’s Hospital at Westmead in Australia).

Table 1. Summary characteristics of study participants.

   Study site DMD patients Healthy controls
   Alberta Children’s Hospital (Calgary) 19 16
   University of California, Davis (UC Davis) 26 0
   University of Pittsburgh / Children's Hospital of Pittsburgh of UPMC (U
   of Pittsburgh) 5 2
   Children’s National Medical Center (CNMC) 0 4
   Children’s Hospital at Westmead (Australia) 1 0
   Median age in years (minimum, maximum) 11.4 (4, 28.7) 13.7 (6, 17.8)
   Total by age category
   4–7 years 15 2
   > 7–11 years 8 3
   > 11–18 years 17 17
   > 18–29 years 11 0
   Total 51 22
   [75]Open in a new tab

   For both DMD patients and healthy controls, 15 mL of blood was taken
   according to the study protocol. The blood was separated into 3 red top
   tubes, which were then mixed by 5 complete inversions. Samples were
   allowed to clot for 30 minutes at room temperature in a vertical
   position. After clotting, the tubes were centrifuged for 10 minutes to
   separate the serum, which was carefully collected by avoiding contact
   with the blood clot. The serum was then aliquoted into four
   polypropylene Cryogenic vials from Thermo Scientific Nalgene. Each
   transfer tube of 500 μL was frozen in a secure -80°C freezer until
   shipment. Samples were not thawed and refrozen until analysis.

Liquid chromatography-mass spectrometry (LC-MS) analysis

   For metabolite extraction, a single technical replicate for each serum
   sample was processed by adding 175 μl extraction buffer (A solution of
   40% Acetonitrile, 25% Methanol and 35% Water containing internal
   standards [10μl of 1mg/ml debrisoquine and 50μl of 1mg/ml 4-
   nitrobenzoic acid]) to 25 μl of each serum sample. The samples were
   incubated on ice for 10 minutes and centrifuged at 13,000 rpm for 20
   minutes at 4°C. The supernatant was transferred to a fresh Eppendorf
   vial and dried under vacuum. The dried samples were reconstituted in
   200 μl of 5% Methanol, 1% Acetonitrile and 94% water solution. The
   samples were re-centrifuged at 13,000 rpm for 20 minutes at 4°C and
   supernatant transferred to fresh vials for UPLC- QTof analysis. 2μl of
   each sample was injected onto a Waters Acquity CSH C18 1.7 μm, 2.1 ×
   100mm column using an Acquity UPLC system by Waters Corporation,
   Milford, MA. The gradient mobile phase consisted of Solvent A- 100%
   water with 0.1% formic acid, Solvent B- 100% acetonitrile with 0.1%
   Formic acid and Solvent D- 9:1 ratio of Isopropanol to Acetonitrile
   with 0.1% Formic acid and 10mM Ammonium Formate. To reduce the chance
   of possible batch effects, the cases and controls were randomized. Each
   sample was run onto the column for 13 minutes at a flow rate of 0.4 ml
   per minute. The column temperature was set to 60° C. The gradient
   consisted of 97% Solvent A for 3 minutes and then at a ramp of curve 6
   to 60% Solvent B from 0.5 to 4 minutes. From 4.0 to 8.0 minutes at a
   ramp of curve 6, the gradient moved to 98% of solvent B and at 9
   minutes shifting to 5% Solvent B and 95% solvent D for10 minutes. At 11
   minutes, the gradient was 25% Solvent A, 25% solvent B and 50% solvent
   D at a curve of ramp 6 and then back to initial conditions of 97%
   solvent A and 3% solvent B at 13 minutes. The elution from the column
   was introduced to Quadrupole Time of flight Mass spectrometer (Waters
   G2- Qtof) by electrospray Ionization in both positive and negative mode
   at a capillary voltage of 3.5 kV and sampling cone voltage of 35 V. The
   source temperature was set to 120° C and Desolvation temperature to 350
   °C. The cone gas flow was maintained at 25L/hr and Desolvation gas flow
   at 750L/hr. Leucine- Encephalin solution in 50% acetonitrile was used a
   reference mass ([M+H]^+ = 556.2771 and [M-H]^− = 554.2615) to maintain
   accurate mass. The data was acquired in centroid mode from 50 to 1200
   mass range with the software Mass lynx (Waters Corporation). The column
   was initially conditioned by multiple injections of pooled quality
   controls and then every ten injections subsequently. The mass accuracy
   was monitored by injecting a mixture of standard compounds at the
   beginning and at the end of the batch.

Data processing

   The resulting CDF files were processed together using the XCMS method
   [[76]27] to align the peaks and estimate the metabolite intensities for
   each sample. The exact steps performed were: feature detection,
   retention time correction, alignment of peaks into peak groups, and
   imputation of the values for the missing peaks based on the raw files.
   Thus, each peak group may have slightly different m/z values in
   different samples. The groups which had fewer than 19 peaks detected
   across all samples prior to imputation were removed. Henceforth, we
   will refer to “peak groups” as simply “peaks.” Prediction of isotopes
   was then performed using the CAMERA package [[77]28] and higher-weight
   isotopes were removed. This resulted in 313 peaks in the negative mode
   and 1892 peaks in the positive mode, including nitrobenzoic acid and
   debrisoquine, respectively. Four of the estimated peak intensities,
   corresponding to two peaks, were equal to zero; we imputed these as the
   minimum non-zero value for that peak across all samples. The
   intensities were then normalized to the intensities of the two internal
   standards, quantile normalized (separately for the positive and
   negative modes), then log transformed (we used a log[2] basis, but the
   results do not depend on the basis). Thus, a total of 2203 peaks were
   considered in downstream analyses. Individual-level internal standard
   normalized, quantile normalized, and log2-transformed peak intensity
   values are given in [78]S1 Table.

Statistical and bioinformatic analysis

Variables considered

   The primary variable of interest for all the analyses considered below
   was DMD disease status. Additional variables considered were: age,
   age-by-DMD status multiplicative interaction, and study site. We
   included the effect of age, as metabolites related to hormonal and
   other physical changes in individuals between childhood and adulthood
   are expected to be reflected in metabolomics measurements, even in
   healthy controls. Other sources of variation not related to the disease
   process might exist. For example, many metabolites show associations
   with diet [[79]29], some of which might also change with age. Since DMD
   is a progressive disorder, the age trends may be different in cases and
   controls, hence the use of the age-by-DMD status interaction term. We
   considered the effect of the study site, since demographic
   characteristics and environment factors including altitude or latitude
   might be different in the populations served by the different centers.
   Furthermore, we wished to account for any effects due to technical
   variables which were not controlled for between sites and which might
   have an impact on the results. For example, issues like personnel
   differences could play an important role and can never be fully
   eliminated [[80]30].

Global exploratory analyses

   A principal components analysis (PCA) was also performed as an
   exploratory analysis to check for possible technical artifacts and any
   other unusual patterns. After the PCA, 9 linear regression models were
   considered for the first principal component (PC1) to perform an
   assessment of variables which may have a “global” impact on the
   metabolite profiles. These models included as explanatory variables all
   possible combinations of DMD status, age, and site, as well as the
   possible age-by-DMD status interaction, only considered in models where
   both age and DMD status were also included separately. The best-fitting
   model in terms of the trade-off between model fit and number of
   parameters were chosen by the Akaike Information Criterion (AIC).

Linear regression analyses

   The goal of the main statistical analysis was to find metabolites that
   are significantly different between DMD patients and healthy controls.
   The following linear model was fit to the transformed and normalized
   intensities:
   [MATH: <mi mathvariant="normal">M</mi><mi
   mathvariant="normal">e</mi><mi
   mathvariant="normal">t</mi><mo>=</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>0</mn></mrow></msub><mo>+</
   mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>1</mn></mrow></msub><mi
   mathvariant="normal">x</mi><mn>1</mn><mo>(</mo><mi
   mathvariant="normal">S</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">a</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">u</mi><mi mathvariant="normal">s</mi><mo>=</mo><mi
   mathvariant="normal">D</mi><mi mathvariant="normal">M</mi><mi
   mathvariant="normal">D</mi><mo>)</mo><mo>+</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>2</mn></mrow></msub><mi
   mathvariant="normal">x</mi><mi mathvariant="normal">A</mi><mi
   mathvariant="normal">g</mi><mi
   mathvariant="normal">e</mi><mo>+</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>3</mn></mrow></msub><mi
   mathvariant="normal">x</mi><mn>1</mn><mo>(</mo><mi
   mathvariant="normal">S</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">a</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">u</mi><mi mathvariant="normal">s</mi><mo>=</mo><mi
   mathvariant="normal">D</mi><mi mathvariant="normal">M</mi><mi
   mathvariant="normal">D</mi><mo>)</mo><mspace
   width="0.25em"></mspace><mi mathvariant="normal">x</mi><mi
   mathvariant="normal">A</mi><mi mathvariant="normal">g</mi><mi
   mathvariant="normal">e</mi><mo>+</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mi>s</mi><mi>i</mi><mi>t</mi><
   mi>e</mi></mrow></msub><mo>+</mo><mi mathvariant="normal">N</mi><mi
   mathvariant="normal">o</mi><mi mathvariant="normal">i</mi><mi
   mathvariant="normal">s</mi><mi mathvariant="normal">e</mi><mo>,</mo>
   :MATH]

   where β[site] is an intercept for each site with the exception of
   Australia, which serves as a baseline.

   Equivalently, this means that for DMD patients, the model was:
   [MATH: <mi mathvariant="normal">M</mi><mi
   mathvariant="normal">e</mi><mi
   mathvariant="normal">t</mi><mo>=</mo><mo>(</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>0</mn></mrow></msub><mo>+</
   mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>)</
   mo><mo>+</mo><mo>(</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>2</mn></mrow></msub><mo>+</
   mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>3</mn></mrow></msub><mo>)</
   mo><mi mathvariant="normal">x</mi><mi mathvariant="normal">A</mi><mi
   mathvariant="normal">g</mi><mi
   mathvariant="normal">e</mi><mo>+</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mi>s</mi><mi>i</mi><mi>t</mi><
   mi>e</mi></mrow></msub><mo>+</mo><mi mathvariant="normal">N</mi><mi
   mathvariant="normal">o</mi><mi mathvariant="normal">i</mi><mi
   mathvariant="normal">s</mi><mi mathvariant="normal">e</mi><mo>,</mo>
   :MATH]

   while for healthy controls, it was:
   [MATH: <mi mathvariant="normal">M</mi><mi
   mathvariant="normal">e</mi><mi
   mathvariant="normal">t</mi><mo>=</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>0</mn></mrow></msub><mo>+</
   mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>2</mn></mrow></msub><mi
   mathvariant="normal">x</mi><mi mathvariant="normal">A</mi><mi
   mathvariant="normal">g</mi><mi
   mathvariant="normal">e</mi><mo>+</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mi>s</mi><mi>i</mi><mi>t</mi><
   mi>e</mi></mrow></msub><mo>+</mo><mi mathvariant="normal">N</mi><mi
   mathvariant="normal">o</mi><mi mathvariant="normal">i</mi><mi
   mathvariant="normal">s</mi><mi mathvariant="normal">e</mi><mo>.</mo>
   :MATH]

   As a result, testing the null of no effect of the DMD disease on the
   metabolite levels is equivalent, in this framework, to testing whether
   both the intercept and the slope for age are the same in the two
   groups:
   [MATH:
   <msub><mrow><msub><mrow><mi>H</mi></mrow><mrow><mn>0</mn></mrow></msub>
   <mo>:</mo><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>=</
   mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>3</mn></mrow></msub><mo>=</
   mo><mn>0</mn><mo>.</mo> :MATH]

   We perform this F-test for each of the 2203 metabolite peaks, adjusting
   for multiple testing via the Bonferroni threshold, to ensure control of
   the familywise error rate (FWER) at a significance level of 0.05 and
   via the Benjamini-Hochberg (BH) correction [[81]31]–which estimates
   conservative q-values [[82]32]–to ensure control of the false discovery
   rate (FDR) at a significance level of 0.01. The significant peaks from
   the FDR-adjusted analysis were annotated via the HMDB [[83]22] and
   Metlin [[84]33] databases and manually inspected. These peaks were also
   considered in receiver operator characteristic (ROC) curve analyses,
   which fit a logistic regression model with case/control status as
   outcome and age and peak intensity as explanatory variables.

   Two further analyses were considered. One subgroup analysis considered
   the same linear model, but only on the 62 study participants with ages
   4–18 years. This is because the DMD group contained adult patients,
   whereas the control group did not. A subgroup analysis was also
   performed for the study participants from Calgary, using the model:
   [MATH: <mi mathvariant="normal">M</mi><mi
   mathvariant="normal">e</mi><mi
   mathvariant="normal">t</mi><mo>=</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>0</mn></mrow></msub><mo>+</
   mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>1</mn></mrow></msub><mi
   mathvariant="normal">x</mi><mn>1</mn><mo>(</mo><mi
   mathvariant="normal">S</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">a</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">u</mi><mi mathvariant="normal">s</mi><mo>=</mo><mi
   mathvariant="normal">D</mi><mi mathvariant="normal">M</mi><mi
   mathvariant="normal">D</mi><mo>)</mo><mo>+</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>2</mn></mrow></msub><mi
   mathvariant="normal">x</mi><mi mathvariant="normal">A</mi><mi
   mathvariant="normal">g</mi><mi
   mathvariant="normal">e</mi><mo>+</mo><msub><mrow><mi
   mathvariant="normal">β</mi></mrow><mrow><mn>3</mn></mrow></msub><mi
   mathvariant="normal">x</mi><mn>1</mn><mo>(</mo><mi
   mathvariant="normal">S</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">a</mi><mi mathvariant="normal">t</mi><mi
   mathvariant="normal">u</mi><mi mathvariant="normal">s</mi><mo>=</mo><mi
   mathvariant="normal">D</mi><mi mathvariant="normal">M</mi><mi
   mathvariant="normal">D</mi><mo>)</mo><mspace
   width="0.25em"></mspace><mi mathvariant="normal">x</mi><mi
   mathvariant="normal">A</mi><mi mathvariant="normal">g</mi><mi
   mathvariant="normal">e</mi><mo>+</mo><mi mathvariant="normal">N</mi><mi
   mathvariant="normal">o</mi><mi mathvariant="normal">i</mi><mi
   mathvariant="normal">s</mi><mi mathvariant="normal">e</mi> :MATH]

   and performing the analogous F-test. The purpose of this analysis was
   to see whether the most significant peaks are also found in a subset
   which is expected to be more homogeneous, in terms of both demographics
   and sample processing. Note that the Calgary subgroup had the largest
   sample size and was also somewhat balanced in terms of cases and
   controls ([85]Table 1).

Validation of significant metabolites

   For validation studies, eight serum samples were processed and run for
   tandem mass spectrometry (MS/MS) on Waters Quadruple time of flight
   Mass Spectrometer with ultra-performance liquid chromatography (Waters
   G2- Qtof). Samples were processed in the similar manner as in the
   profiling experiment and the chromatographic conditions used for data
   acquisition remained the same. The MS/MS spectra obtained for the
   significant m/z was then used to match with available spectra in online
   databases for identification of the biomarkers.

Pathway analyses

   We used the web-based resource ConsensusPathDB [[86]34] to perform
   pathway over-representation analysis. ConsensusPathDB contains a total
   of 4349 pathways from 12 different pathway databases such as Reactome,
   Kegg, Signalink, Biocarta, Pharmgkb, Netpath, Smpdb, Inoh,
   Wikipathways, Pid, Ehmn and Humancyc. HMDB IDs of the peaks validated
   by MS/MS were used to conduct pathway enrichment analysis. For each
   pathway, the hypergeometric distribution was used to test for
   over-representation of the metabolites in the input list of validated
   peaks among the metabolites in the pathway set. A multiple testing
   correction was performed using the Benjamini-Hochberg approach to
   control the FDR for the pathways that overlapped with at least one of
   the input HMDB IDs.

Bayesian network analyses

   A Bayesian network analysis was conducted to develop a classifier for
   the participants in the study using a minimal set of metabolites, for
   both the entire dataset and the Calgary subset. The goal of this
   analysis was to identify the minimum set of peaks that classify samples
   as either DMD or control. The full data analysis considered age,
   normalized peak intensities, and site. Age and the peak intensity
   values were discretized using density approximation with 3 bins. The
   Markov Blanket network learning algorithm [[87]35] was then applied to
   create the network. The Calgary-only analysis considered age and
   normalized peak intensities.

   We then performed five-fold cross validation using the Markov Blanket
   algorithm. The data was split randomly into 5 approximately equal
   subsets, with the learning being performed on 80% of the data and the
   remaining 20% being used for validation for each of the 5 combinations
   of learning/validation sets.

Results

Global exploratory analysis

   The top two principal components are shown in [88]Fig 1(A) and 1(B).
   The samples in [89]Fig 1 are shape-coded by DMD disease status; in
   [90]Fig 1(A) they are also color-coded by age category, whereas in
   [91]Fig 1(B) they are color-coded by study site. 23% of the total
   variance is explained by the first principal component (PC1) and 6% by
   the second principal component (PC2). No clear clusters are observed.
   The regression models considered with PC1 as the outcome, along with
   the corresponding AIC and relative probability that they minimize the
   information loss [[92]36] are presented in [93]S2 Table. The top model
   selected via AIC for PC1 had age and study site, with study site being
   significant (p = 0.008 from F-test comparing it to model with only age)
   and age showing a borderline significant association (p = 0.066). Thus,
   both age and study site appear to have an impact on global metabolite
   profiles.

Fig 1. PCA plots of the internal standard normalized, quantile normalized,
and log2-transformed intensity data.

   [94]Fig 1
   [95]Open in a new tab

   PC2 is plotted against PC1. Solid circles are DMD patients, empty
   circles are healthy controls. In panel a), individuals are color-coded
   by age category and in panel b) by study site.

Serum metabolome signature for DMD

   Based on the F-tests performed for each of the 2203 peaks, testing for
   an effect of disease status in the presence of age and study site and
   allowing for an interaction between disease status and age, we find 14
   metabolites to be statistically significant using the
   Benjamini-Hochberg procedure with q-value (adjusted p-value) ≤ 0.01,
   thus controlling the FDR at the 0.01 level. This means that we would
   expect the average fraction of false discoveries to be no more than 1%
   when repeatedly using this procedure. Of these 14 metabolites, 8 also
   showed statistically significant associations at a Bonferroni threshold
   of 0.05/2203, therefore controlling the FWER at 0.05. This means that
   using a p-value cut-off of 0.05/2203 for rejection for each peak, we
   would expect to obtain at least one false discovery (a peak which is
   not truly different between cases and controls but for which the null
   is rejected) no more than 5% of the time when repeatedly using this
   procedure on individuals sampled in the same way. Significant
   differences indicate that the fitted age trends for the DMD and control
   groups differed in slope and/or intercept, i.e. either there is a
   difference between intensity levels in normal and DMD individuals at
   baseline, which is maintained at different ages (difference in
   intercept); or there is no difference at baseline, but the values
   change according to age (difference in slope); or there is a difference
   at baseline and the values change according to age as well (difference
   in intercept and slope.) We focused the metabolite annotation and
   validation efforts on these 14 peaks. Each peak was annotated using the
   HMDB and Metlin databases (see [96]S3 Table for the positive mode and
   [97]S4 Table for the negative mode). Following MS/MS validation ([98]S1
   Fig), the identities of 4 of these peaks were determined very likely to
   be: 5a-DHT; creatinine; either epitestosterone sulfate,
   dehydroepiandrosterone sulfate, or testosterone sulfate; and creatine,
   corresponding to adducts with median m/z values of 369.17, 114.07,
   367.16, and 132.08 in the LC-MS experiment. The MS/MS spectra for these
   peaks were all in accordance with the most likely database annotations.
   The identity of a fifth peak was determined to likely be L-arginine
   upon manual inspection; the adduct for this peak has an m/z value of
   174.15, which was close to the monoisotopic mass of 174.11 for
   L-arginine, however it was not identified in the database annotation.

   A subgroup analysis was conducted for the participants aged 4–18 years.
   This retained 6 significant peaks (m/z values of 357.25, 449.15,
   369.17, 114.07, 132.08, 312.01) at the Bonferroni threshold of
   0.05/2203, with no additional peaks significant at the q-value
   threshold of 0.01. Five of these peaks (all except the peak at m/z of
   449.15) were among the top 14 peaks in the full data analysis. Three of
   these peaks were validated as being creatinine, creatine, and 5a-DHT.
   The remaining peak was ranked 16^th (q-value = 0.013) in the full data
   analysis.

   A subgroup analysis was also conducted for the Calgary dataset. This
   resulted in 8 significant peaks at the Bonferroni threshold of
   0.05/2203, with no additional peaks significant at the q-value
   threshold of 0.01. Six (m/z values of 357.25, 369.17, 114.07, 367.16,
   451.17, and 209.12) of these 8 peaks were among the top 14 peaks in the
   full data analysis, three of them corresponding to the validated
   metabolites creatinine, 5a-DHT, and either testosterone sulfate,
   epitestosterone sulfate, or dehydroepiandrosterone sulfate. The
   remaining 2 peaks at m/z of 223.13 and 175.12 respectively were ranked
   15^th (q-value = 0.0101), and 612^th (q-value > 0.99) in the full data
   analysis. We note that creatine was only slightly outside of the
   threshold deemed significant in the Calgary-only analysis (q-value =
   0.016).

   [99]Fig 2 plots the normalized intensity values versus age for the top
   14 peaks, with the regression lines shown for the Calgary and UC Davis
   sites, which had the largest numbers of participants. It is clearly
   shown that levels of some metabolites (e.g. creatinine at m/z 114.06)
   and the unknown compound at m/z = 312) are not substantially different
   between DMD and healthy volunteers at young ages, then their levels
   substantially diverge at around 15 years of age. Creatinine was lower
   in DMD group relative to control group at all ages with the most
   substantial difference seen at an older age (> 15 years old). [100]Fig
   3 displays boxplots for these same metabolites for the two groups and
   clearly shows substantial differences between DMD and healthy
   volunteers at older ages (blue dots).

Fig 2. Plots of the intensity versus age for the top 14 metabolites
associated with DMD status.

   [101]Fig 2
   [102]Open in a new tab

   The intensity levels have been internal standard normalized, quantile
   normalized, and log2-transformed. The likely annotations are given for
   the validated peaks and the m/z value of the adduct in the LC-MS
   experiment is given for the other peaks; the peak labelled testosterone
   sulfate may be dehydroepiandrosterone sulfate or epitesterone sulfate.
   The points are color- and shape-coded by disease status. The regression
   lines obtained from the interaction model are shown for the Calgary
   site for DMD cases and controls (solid lines) and for the UC Davis site
   for DMD cases (broken line).

Fig 3. Boxplots of the intensity versus age for the top 14 metabolites
associated with DMD status.

   [103]Fig 3
   [104]Open in a new tab

   The intensity levels have been internal standard normalized, quantile
   normalized, and log2-transformed. The likely annotations are given for
   the validated peaks and the m/z value of the adduct in the LC-MS
   experiment is given for the other peaks; the peak labelled testosterone
   sulfate may be dehydroepiandrosterone sulfate or epitesterone sulfate.
   The points are separated by DMD status and color-coded by age category.

   The top 14 metabolites include multiple pairs which have strong
   positive or negative correlations ([105]Fig 4). In particular, we note
   that creatine and creatinine are negatively correlated (Pearson
   correlation = -0.48) while the two testosterone metabolites, 5a-DHT and
   testosterone sulfate (or epitestosterone sulfate or
   dehydroepiandrosterone sulfate), are strongly positively correlated
   (Pearson correlation = 0.92).

Fig 4. Correlation plot of the top 14 metabolites associated with DMD status.

   [106]Fig 4
   [107]Open in a new tab

   The intensity levels have been internal standard normalized, quantile
   normalized, and log2-transformed. The likely annotations are given for
   the validated peaks and the m/z value of the adduct in the LC-MS
   experiment is given for the other peaks; the peak labelled testosterone
   sulfate may be dehydroepiandrosterone sulfate or epitesterone sulfate.

   We further considered a similar linear regression analysis for the
   creatine/creatinine ratio (on the log[2] scale). Once again an F-test
   was performed, looking for an effect of disease status in the presence
   of age and study site and allowing for an interaction between disease
   status and age; the result was highly significant (p = 4 x 10^−13).
   [108]Fig 5 shows the plot of the ratio on the log[2] scale against
   participants’ ages. We also compared this ratio to the values of serum
   CK muscle type (CKM) measured in a recent proteomics study [[109]12].
   All 51 of the DMD cases in the current study were also considered in
   the proteomics study. Overall, there was a negative correlation between
   CKM level and creatine/creatinine in the cases, reflecting a decrease
   in CKM activity that is most likely due to loss of muscle mass with age
   ([110]S2 Fig).

Fig 5. Plot of the creatine/creatinine ratio of intensities on the log[2]
scale versus age.

   [111]Fig 5
   [112]Open in a new tab

   The intensity levels have been internal standard normalized and
   quantile normalized. The points are color- and shape-coded by disease
   status. The regression lines obtained from the interaction model are
   shown for the Calgary site for DMD cases and controls (solid lines) and
   for the UC Davis site for DMD cases (broken line).

   The top 14 metabolites and the creatine/creatinine ratio all had areas
   under the curve (AUC) values greater than 0.5 in the ROC analysis
   ([113]S3 Fig) 8 of these 15 biomarkers had AUC greater than 0.8,
   including creatine, creatinine, creatine/creatinine ratio, and the two
   testosterone metabolites.

Pathway analyses for metabolites significantly different between DMD patients
and healthy controls

   The pathways in which at least one of the 5 validated peaks,
   corresponding to 7 HMDB IDs–due to the uncertainty for the peak with
   m/z 367.16 –were considered. Two IDs, corresponding to epitestosterone
   sulfate and 5-DHT were not found in the ConsensusPathDB database. The
   pathways in which at least one of the metabolites associated with the
   remaining 5 HMDB IDs is present are given in [114]S5 Table. Creatine
   metabolism is the most significant pathway followed by urea cycle and
   metabolism of amino groups, creatine biosynthesis and arginine and
   proline metabolism. The validated metabolites present in these pathways
   are creatine, creatinine, and L-arginine. In addition to the amino acid
   metabolism pathways, hormonal pathways such as 17-beta hydroxysteroid
   dehydrogenase III deficiency and androgen and estrogen metabolism were
   also significant. These pathways contain testosterone sulfate and
   dehydroepiandrosterone sulfate.

Bayesian network analyses

   Six peaks were identified as important for classification for the
   full-data analysis, two of them being validated as creatinine and
   creatine; the remaining 4 peaks were ranked 27^th, 58^th, 88^th, and
   1479^th using the F-test. Using only these 6 peaks, the participants in
   the study can be classified at 97% accuracy (in-sample classification).
   Four of these peaks were identified in at least 4 of the 5 combinations
   in the five-fold cross-validation, corresponding to creatine (5 times),
   creatinine (5 times), and the peaks ranked 27^th (q-value = 0.04) and
   58^th (q-value = 0.10), which have m/z values of 154.06 and 176.04,
   respectively (4 times). The peak with m/z = 154.06 is likely to be
   annotated to a +Na adduct of creatine or beta-guanidinopropionic acid
   based on the HMDB database.

   The average classification precision for the five-fold validation was
   88%.

   The Bayesian network analysis applied to the Calgary subset selected
   only two peaks, corresponding to creatine and creatinine. Using only
   these two peaks, the samples were classified at 89% accuracy. Five-fold
   cross validation identified the creatine and creatinine peaks as the
   only peaks being selected in 3 of the 5 combinations. None of the peaks
   appeared in more than 3 of the combinations. The total classification
   precision for the 5 fold subgroup analysis was 71%. The lower
   classification accuracy of the subgroup analysis may be due to the
   smaller learning set size.

Discussion

   To the best of our knowledge, this is the first serum metabolomics
   study of DMD. We considered serum from 51 DMD patients and 22
   age-matched controls for discovery of novel metabolic biomarkers for
   DMD using LC-MS profiling. After data processing, linear regression
   models were fit for each metabolite, considering the associations with
   DMD disease status, age, study site, as well as the age-by-DMD status
   interaction. We found 14 putative molecular biomarkers that differ in
   their levels between DMD patients and healthy controls when age is
   taken into consideration. Essentially, as in the DMD proteomics study,
   we considered increasing age as a surrogate for disease progression
   among the DMD patients [[115]12]. These metabolites included creatine,
   creatinine, and testosterone-related steroids, which were confirmed by
   MS/MS analysis. Creatinine and creatine were, as expected, inversely
   correlated, with creatinine being generally lower and creatine
   generally higher in the DMD patients. Creatinine tends to increase with
   age in the controls and decrease in the DMD patients, while creatine
   tends to decrease in the controls but stay relatively stable in the DMD
   patients. Creatinine is a breakdown product from the high energy
   metabolite phosphocreatine in muscle. With loss of muscle mass in DMD
   there is a decrease in production of creatinine while creatine that is
   synthesized by the liver stays at a steady level. These results are
   also in agreement with a recent study which considers serum creatinine
   in DMD and the less severe Becker muscular dystrophy phenotype, showing
   an inverse correlation with a number of measures of disease severity
   [[116]37], given that in our study creatinine is higher in controls
   compared to cases and shows a slight decrease in cases with age, which
   is correlated with disease severity.

   We note that the ratio between creatine and creatinine may also serve
   as a possible biomarker, as it generally appears to increase in DMD
   patients and decrease in controls ([117]Fig 5) and is more
   significantly associated with DMD status than either creatine or
   creatinine individually. It is apparent that DMD patients are not
   metabolizing creatine as they age, most likely due to loss of muscle
   mass. These results are also in agreement with the values of serum CKM,
   which were measured on all the DMD cases in a recent proteomics study
   [[118]12]. While CKM decreases in DMD cases with age, thus leading to
   levels more similar to those in healthy controls, which makes it not an
   ideal marker of disease progression, the creatine to creatinine ratio
   appears to increase in cases and decrease in controls.

   Other interesting biomarkers are testosterone-related steroids that
   showed an increase with age in both DMD patients and controls. However,
   they remain lower in the case group compared to the control group. This
   is most probably due to glucocorticoid use, which is known to cause a
   decrease in levels of endogenous testosterone and derivatives
   [[119]38]. We note that we detected two testosterone-related
   derivatives, 5a-DHT and testosterone sulfate (or epitestosterone
   sulfate or dehydroepiandrosterone sulfate), which are positively
   correlated with each other. Our subgroup analyses–one restricted to
   participants aged 4–18 years and one restricted to the Calgary
   participants–both confirmed several of the significant peaks from the
   all-data results. The Bayesian network analysis confirms the importance
   of creatine and creatinine, although it does not select the
   testosterone metabolites. We note that our study did not include
   reliable recording of glucocorticoid use, because this is a natural
   history study and it has been shown that DMD patients may take either
   deflazacort or prednisone/prednisolone and that many different dosing
   regimens exist [[120]39]. This is certainly an important aspect which
   should be investigated in future work. Additional variables such as
   muscle mass, dietary intake, disease conditions, and drugs or
   supplements taken by the study participants at the time of the blood
   draw are also known to impact metabolism and could not be controlled in
   the present DMD natural history study, therefore additional studies
   considering these variable are desirable and could be achieved with a
   well-controlled cohort in the future.

   A major challenge we faced and a common one for this type of study
   involves separating metabolites related to the disease process from
   those that are a result of drugs or dietary supplements used by the
   patients. Several compounds were found to have altered levels in DMD
   patients compared to controls, but are yet to be identified.

   Potential differences between study sites could also be a concern.
   These were significant for the first principal component, which is not
   surprising, due to possible differences in demographics and
   environmental exposures which could not be controlled, although the
   sample collection, processing and storage protocols were consistent at
   each site. We note as a limitation of this work the fact that the
   number of patients in 3 of the five sites was less than 10 and that the
   second largest site, UC Davis, only provided cases. We adjusted for
   study site in the linear model and performed the Calgary-only subgroup
   analysis, which resulted in very similar top hits, to partially address
   this issue but are aware that in the future more sites should be
   considered, with a larger number of participants per site and more
   detailed individual-level demographic and exposure information. A
   further limitation is due to the retrospective nature of the
   study–meaning that we cannot tell if the change in a metabolite’s
   intensity is related to a process which impacts disease progression or
   is impacted by disease progression. We are presently collecting
   longitudinal samples within a well-controlled cohort in order to answer
   these questions and to validate the biomarkers proposed in our current,
   natural history, study. Our future work will also be powered to detect
   correlations between clinical outcomes which measure disease severity
   and metabolic markers. As creatine and creatinine are well-studied
   metabolites, the use of more targeted methods to measure them as well
   as their ratio in future studies may also provide additional validation
   for our results.

Conclusions

   We described herein the first comprehensive metabolomic study for DMD.
   We believe that this preliminary study represents one of the first
   steps towards finding metabolic surrogate biomarkers of disease
   progression in DMD patients. Given that the present work considered a
   natural history study, important variables could not be controlled. We
   are hopeful that future work will include prospective longitudinal
   studies with a larger number of variables collected, as well as more
   targeted assays, which will provide better insights into temporal
   trends and causal mechanisms related to DMD progression. While further
   validation of the results from this initial discovery study is expected
   to follow from the longitudinal cohort, our current work should provide
   valuable insights as the first metabolomic signature of DMD and might
   prove to be a useful reference for future metabolomics studies.
   Integration with additional types of data–such as genomics, proteomics,
   and clinical outcomes–is a further avenue of interest.

Supporting Information

   S1 Fig. MS/MS spectra for the top peaks.

   (PDF)
   [121]Click here for additional data file.^ (236.3KB, pdf)
   S2 Fig. Plot of CKM versus the creatine/creatinine ratio in DMD cases,
   both on the log[2] scale.

   The intensity levels for creatine and creatinine have been internal
   standard normalized and quantile normalized. The points are color- and
   shape-coded by age category. ρ represents the correlation on the log
   scale. Both the overall correlation and the correlations within age
   categories are given.

   (PDF)
   [122]Click here for additional data file.^ (7.1KB, pdf)
   S3 Fig. ROC curves for the top 14 peaks from the overall analysis and
   for the creatine/creatinine ratio.

   A logistic model was fit with case/control status as outcome and age
   and peak intensity as explanatory variables. The AUC for each possible
   biomarker is given in the title; tpr = true positive rate, fpr = false
   positive rate.

   (PDF)
   [123]Click here for additional data file.^ (15KB, pdf)
   S1 Table. Individual characteristics and peak intensities of the study
   participants.

   Each peak is labelled according to its m/z value, retention time in
   seconds, and mode (‘p’ = positive, ‘n’ = negative); for example, peak
   M132T37p has an m/z value of 132, a retention time of 37 seconds, and
   was detected in the positive mode. The log[2]-transformed CKM values
   from the [[124]12] study are also given for the common samples.

   (XLSX)
   [125]Click here for additional data file.^ (2.2MB, xlsx)
   S2 Table. Models considered for PC1.

   The 9 linear models considered for PC1. The variables and Akaike
   Information Criteria (AIC) are given. Smaller AIC values indicate a
   better model fit, while also adjusting for the number of parameters,
   with the best model according to the AIC being shown in bold. The
   relative probabilities representing the strength of evidence for each
   model j compared to the model deemed to be the best are calculated by
   exp(-(AIC[j]-AIC[min])/2); for instance, the model including only site
   has 0.43 times as much evidence than the model including both age and
   site.

   (XLSX)
   [126]Click here for additional data file.^ (8.1KB, xlsx)
   S3 Table. Annotations for the top peaks detected in positive mode from
   HMDB and Metlin.

   Drugs and non-human metabolites are shown in red. The most likely
   annotations are highlighted.

   (XLSX)
   [127]Click here for additional data file.^ (17KB, xlsx)
   S4 Table. Annotations for the top peaks detected in negative mode from
   HMDB and Metlin.

   Drugs and non-human metabolites are shown in red. The most likely
   annotations are highlighted.

   (XLSX)
   [128]Click here for additional data file.^ (14.7KB, xlsx)
   S5 Table. Pathways from ConsensusPathDB in which at least one of the 5
   validated peaks is present.

   The ‘Size’ column represents the absolute size of the pathway, the
   ‘Effective size’ column represents the number of set members that are
   annotated with an ID of the user-specified ID type (HMDB ID in this
   case); the ‘Input overlap’ column represents number of entities from
   the user input list that overlap with entities in the pathway based
   set. The p-values are obtained from a hypergeometric test of
   over-representation. The q-values are calculated using the
   Benjamini-Hochberg approach considering just the tests for the pathways
   which include at least one validated peak.

   (XLSX)
   [129]Click here for additional data file.^ (21.7KB, xlsx)

Acknowledgments