Abstract
Type 2 diabetes mellitus (T2DM) is a challenging and progressive
metabolic disease caused by insulin resistance. Skeletal muscle is the
major insulin-sensitive tissue that plays a pivotal role in blood sugar
homeostasis. Dysfunction of muscle metabolism is implicated in the
disturbance of glucose homeostasis, the development of insulin
resistance, and T2DM. Understanding metabolism reprogramming in newly
diagnosed patients provides opportunities for early diagnosis and
treatment of T2DM as a challenging disease to manage. Here, we applied
a system biology approach to investigate metabolic dysregulations
associated with the early stage of T2DM. We first reconstructed a human
muscle-specific metabolic model. The model was applied for personalized
metabolic modeling and analyses in newly diagnosed patients. We found
that several pathways and metabolites, mainly implicating in amino
acids and lipids metabolisms, were dysregulated. Our results indicated
the significance of perturbation of pathways implicated in building
membrane and extracellular matrix (ECM). Dysfunctional metabolism in
these pathways possibly interrupts the signaling process and develops
insulin resistance. We also applied a machine learning method to
predict potential metabolite markers of insulin resistance in skeletal
muscle. 13 exchange metabolites were predicted as the potential
markers. The efficiency of these markers in discriminating
insulin-resistant muscle was successfully validated.
Introduction
Type 2 diabetes mellitus is a complex progressive metabolic disease
with a heterogeneous etiology. The disease has a high global prevalence
and is among the top ten causes of death worldwide. According to the
latest international diabetes federation report, the global prevalence
of diabetes in 2019 is estimated to be 463 million people. In addition,
374 million people are at greater risk of developing T2DM [[30]1].
Therefore, this disease has become a major concern in global health.
Insulin resistance is the main pathophysiologic feature in T2DM. The
key insulin-sensitive tissues include skeletal muscle, liver, and
adipose tissues. Among these, skeletal muscle is responsible for the
clearance of over 75% of glucose from the bloodstream. Thus this tissue
plays a great role in lowering the blood glucose level [[31]2].
Dysfunction of muscle metabolism is associated with disturbance in
glucose hemostasis and implicated in obesity, insulin resistance, and
T2DM [[32]3]. Therefore, the study of the molecular mechanisms
underlying insulin resistance in skeletal muscle will improve our
understanding and treatments for T2DM.
In complex and multifactorial diseases such as diabetes, systems
biology provides a holistic view to study dysregulated subsystems
associated with the disease in a cell. Meanwhile, genome-scale
metabolic modeling serves as a context for in silico exploration of
metabolism through a systems biology approach. Since there is no
one-to-one relationship between transcription and translation, the
study of gene expression data alone does not provide an accurate
understanding of cellular metabolism; while, metabolic network
simulation gives an in-depth insight into the molecular mechanisms
involved in cellular processes. Generic metabolic networks are
reconstructed based on the information encoded in the genome. This
information includes all possible reactions that can occur in all types
of human cells. By having gene expression data from a particular cell
or disease state and integrating it into the generic model, active
reactions are identified and a context-specific model can be
reconstructed. Some efforts have been made to study the molecular
mechanisms contributing to muscle insulin resistance using genome-scale
metabolic models (GEMs). Bordbar et.al. have reconstructed a
multi-tissue type GEMs and integrated gene expression data of obese and
type 2 obese gastric bypass patients to study metabolic activity
between these states [[33]4]. Nogiec et.al. have applied flux balance
analysis to model muscle insulin resistance metabolism in fasted to fed
states. In addition, they tried to identify reactions that reproduce
key features of insulin resistance by perturbing the muscle metabolic
network. Nevertheless, they did not apply the integration of
high-throughput data to the model [[34]5]. Varemo et.al. have
reconstructed a myocyte metabolic network and have utilized it to apply
a gene-set analysis. They have found reporter metabolites and the
metabolic pathways affected by differentially expressed genes in T2DM
[[35]6]. Previous studies were either small scale or no comprehensive
simulation-based metabolic analysis was employed. Also, there is not
any study that demonstrated metabolic disturbance at the early stages
of diabetes. Since T2DM is a progressive disease, identification of
disorders in newly diagnosed patients can provide significant
information about the disease. It may bring us closer to identifying
the main causes triggering the onset of disease progression.
In this study, the system-level metabolic analysis on the most
comprehensive data from early-stage T2DM patients was performed. The
samples include muscle gene expression data from newly diagnosed
diabetic patients [[36]7]. We attempted to identify early metabolic
alterations associated with the disease. We first reconstructed the
functional muscle-specific metabolic model. The model was employed for
metabolic analyses in T2DM. We applied personalized metabolic modeling
along with the constraint-based approach to study metabolic
reprogramming in diabetic muscles. Moreover, potential metabolite
markers related to muscle insulin resistance were predicted using a
machine learning approach. Finally, we validated these markers with
gene expression data from a separate study [[37]8]. The overall
workflow of this study is shown in [38]Fig 1.
Fig 1. The workflow of study design.
[39]Fig 1
[40]Open in a new tab
(A) The muscle-specific metabolic model was reconstructed and was
employed for our following analyses: (B) Study of metabolism
reprogramming in newly diagnosed T2DM patients using the personalized
metabolic modeling with the constraint-based approach. (C) Prediction
of potential metabolite markers of insulin resistance in muscle by
applying a machine-learning technique.
Materials and methods
Ethics statement
In this study, we utilized transcriptome data from the dbGaP database
with the accession code phs001068.v1.p1, which was originally published
in reference [[41]7]. The authors of reference [[42]7] obtained written
informed consent from all subjects and obtained approval from the
coordinating ethics committee of the Hospital District of Helsinki and
Uusimaa. For validation of metabolite markers, we used muscle gene
expression data of healthy and diabetic subjects from the GEO
repository with the accession numbers [43]GSE63887 and [44]GSE81965,
which were obtained from reference 8. The authors of reference [[45]8]
stated that ethical approval for the human studies, from which
satellite cells were obtained, was provided by the Ethics Committee of
Copenhagen and Frederiksberg Council, Denmark (KF 01-141/04). The human
volunteers were given oral and written information about the
experimental procedures, and all subjects provided written consent
before participation. The study was carried out in accordance with the
principles of the Declaration of Helsinki as revised in 2000.
As we did not directly perform any experiments on human samples and
solely used publicly available data, we deemed it unnecessary to seek
additional ethical approval for our study.
Data
We searched transcriptome databases for skeletal muscle data from
healthy and T2DM individuals. We found the most comprehensive dataset
from the dbGaP database with the accession code phs001068.v1.p1. The
gene expression data were obtained from vastus lateralis muscle using
high-throughput RNA-Seq technology. The data comprise 91 healthy and 63
newly diagnosed subjects [[46]7]. The transcriptome data were
downloaded and were used for our analyses.
For validation of metabolite markers, the muscle gene expression data
of healthy and diabetic subjects were taken from the GEO repository
with the accession numbers [47]GSE63887 and [48]GSE81965 [[49]8]. This
data contains RNA-Seq samples from 24 individuals in four equal
following groups: normal glucose tolerant (NGT)/non-obese, NGT/obese,
T2DM/non-obese, and T2DM/obese.
Muscle-specific metabolic model reconstruction
The muscle-specific metabolic model was reconstructed by integration of
gene expression data to the generic model. Here, we used the HMR2 model
as the generic model [[50]9, [51]10]. The gene-protein-reaction (GPR)
relationships in HMR2 were corrected using the Metabolic Atlas database
[[52]11] and BiGG model [[53]12]. Moreover, the myocyte biomass
reaction from the Bordbar muscle metabolic model was added to our model
[[54]4]. Muscle gene expression profiles of healthy individuals [[55]7]
were used. For the preprocessing, low expressed genes (< 5 read counts
in less than 25 percentages of samples) were filtered and
between-sample normalization according to the DESeq2 normalization
method with gene length adjustment and log2 (gene expression+1)
transformation were applied [[56]13]. E-Flux algorithm [[57]14] was
employed to integrate gene-expression data to the HMR2 model and to
reconstruct the context-specific metabolic model. To build a fully
functional model we removed dead-end reactions.
Constraint-based personalized metabolic modeling
Personalized GEMs were reconstructed by integrating gene expression
data from each individual into the muscle-specific metabolic model.
Pre-processed gene expression data and E-Flux algorithm were used here.
Media conditions were set by body fluid metabolites [[58]15]. We used
maximizing flux through the production of mitochondrial ATP as the
objective function. In addition, to ensure that the simulated cell is
biologically relevant and functional,we set the lower bound of biomass
reaction to 80% of the maximum biomass production by the healthy model
[[59]16]. For personalized metabolism simulation, flux variability
analysis was applied. FVA is a common constraint-based and linear
programming method to find the possible range of flux through each
reaction while a set of constraints are satisfied [[60]17]. The minimum
(min) and maximum (max) possible fluxes through each reaction in each
personalized model were obtained using FVA in Cobra Toolbox [[61]18].
The glpk has been used for solving the linear programming problem. To
find perturbed reactions in the T2DM state, the two-sample t-test was
applied on the min and max fluxes between healthy and T2DM groups. The
Benjamini–Hochberg procedure was used to apply multiple testing
corrections and reactions with false discovery rates less than 0.1 were
regarded as significant perturbed reactions.
Enrichment analyses
To looks for significantly dysregulated pathways and metabolites,
enrichment analyses were performed. Significant perturbed reactions
were used as the reaction set to be enriched. Pathway-reaction set and
metabolite-reaction sets were created using our reconstructed muscle
metabolic model. Metabolites were assigned to the reaction if it
participated in the corresponding reaction as either substrate or
products. For enrichment analysis, hypergeometric 1-sided test with FDR
correction for multiple testing correction was used.
Metabolite-centric network
Metabolite-centric is one of the most common representations of the
metabolic network. Metabolites are considered as the nodes of the
network. Then two nodes are linked if both of them participate in the
reaction, one as the substrate and the other as the product. We
reconstructed the metabolite-centric network from perturbed reactions.
For simplicity, transport and exchange reactions were excluded and
unique metabolites instead of the compartmentalized ones were used. We
also excluded the currency metabolites. To find the significant
connected components of dysregulated metabolites, metabolites with an
adjusted p-value greater than 0.05 were filtered. Adjusted p-values
were obtained from metabolite enrichment analysis. The
metabolite-centric subnetwork was visualized using Cytoscape version
3.8.2 [[62]19].
Classification and feature selection
An SVM classifier with a polynomial kernel of order 2 was used for
classification. The classifier was evaluated by 10-fold stratified
cross-validation and analysis of accuracy, sensitivity, and specificity
were reported.
[MATH:
Accurac<
mi>y=TP+TNTP+TN
mi>+FP+FN<
/mfrac> :MATH]
[MATH:
Sensiti<
mi>vity=TPTP+FN
mi> :MATH]
[MATH:
Specifi<
mi>city=TNTN+FP
mi> :MATH]
True Positives (TP) refers to the T2DM individuals that are correctly
classified as patients. False Positives (FP) refers to the healthy
individuals that are incorrectly classified as patients. True Negatives
(TN) is the number of healthy individuals that correctly classified as
healthy, and False Negatives (FN) is the number of T2DM individuals
that are incorrectly classified as healthy.
For feature selection, a feature selection method, based on a hybrid of
genetic algorithm (GA) and support vector machine was employed. In the
GA, the feature subset is encoded as the candidate solution on a
chromosome-like structure. The population is formed from a set of
chromosomes that mutation and crossing over in it occur to generate the
next generation. A fitness score is calculated for each chromosome
representing adaptation of it to the environment and better feature
subsets have more chance of reproducing the next generation. Creating
the next generations will be continued until a stopping criterion is
satisfied.
Here, a binary GA was employed that the binary values of 1 and 0
represent the presence or absence of a specific feature at the
particular chromosome, respectively. The chromosome length was set to
the number of features. The population size and the maximal number of
generations were set to 300 chromosomes and 100 generations,
respectively. We used the accuracy of the SVM classifier as the fitness
score. The GA was stopped if the fitness score was reached an accuracy
higher than 90 percent or the maximum number of generations was
reproduced.
Results
Comprehensive high-throughput RNA-Seq data from skeletal muscle
We looked for the most comprehensive skeletal muscle gene expression
profiles. We queried the transcriptome repositories for individuals’
expression profiles with the following criteria: 1) newly diagnosed
with T2DM. 2) Taking no medication for diabetes. 3) Do not have any
other disease affecting the analysis. This resulted in data with the
accession code phs001068.v1.p1 in the dbGaP database. Diabetic patients
in this sample have been newly diagnosed as T2DM patients, and in most
of them, the fasting blood glucose concentration was around seven
mmol/l. [63]Table 1 shows the characteristics of the subjects of the
samples used in this study. Investigation of metabolism reprogramming
in these patients who are at the early stages of T2DM has significant
value in identifying the underlying disorders in T2DM. We used gene
expression data of healthy individuals in this sample to reconstruct
the muscle-specific metabolic model.
Table 1. Subjects characteristics of the individuals in the dataset [[64]7].
Means ± standard deviation values for fasting plasma glucose, fasting
serum insulin, body mass index (BMI), and waist/hip ratio (WHR) in each
diabetic and healthy group are shown.
Sample Glucose (mmol/L) Insulin (mu/l) BMI WHR
T2DM 7.17 ± 0.66 10.68 ± 6.94 29.37 ± 5.04 0.99 ± 0.07
Healthy 5.62 ± 0.33 6.87 ± 3.34 26.35 ± 3.47 0.92 ± 0.08
[65]Open in a new tab
Human muscle-specific metabolic model reconstruction
Reconstruction of the muscle-specific metabolic model requires the
human generic metabolic model as the template. Here, the Human
Metabolic Reaction 2 (HMR2) [[66]10] was employed as the template
metabolic model. The gene expression data are integrated into the model
using GPR relationships. Proteins/protein complexes are linked to the
corresponding genes by GPRs that are typically described with Boolean
rules. An ’AND’ logic operator between two genes implies the required
expression of both of them for a protein to have a function; an ‘OR’
operator denotes any of the related genes that can produce protein. In
the HMR2 model, however, each reaction is assigned to the corresponding
genes but the Boolean operators were all OR annotated. Since the right
GPR rules are crucial for metabolic simulation, we manually curated
them using the Metabolic Atlas database [[67]11] and BiGG models
[[68]12]. The updated HMR2 model was used for subsequent steps.
The biomass reaction is used in metabolic models to represent all
metabolites required for cell growth and maintenance. We included the
biomass function in the template model to ensure that the simulated
fluxes also assert muscle maintenance and viability. The
muscle-specific biomass maintenance function was built using the
information obtained from the experimental measurements and literature
[[69]4]. The process of formulating muscle-specific biomass functions
has been described in their publication [[70]4]. The biomass reaction
was constructed and added to the template model.
The process of reconstructing the muscle-specific metabolic model
requires the expression data to infer active reactions from the
template model. The preprocessed transcriptome data from the healthy
group were used to integrate into the model. E-flux algorithm [[71]14]
was applied to build the muscle-specific metabolic model. Reactions
with dead-end metabolites were detected and removed from the model to
have a fully functional model. Dead-end metabolites are those that are
only consumed or produced within the network. Reactions with dead-end
metabolites can not carry flux and are blocked. The final model
comprises 3560 metabolites, 5714 reactions and, 2704 genes. [72]Table 2
shows the characteristics of our muscle-specific metabolic model in
comparison with HMR2. The reconstructed muscle metabolic model in this
study is freely available at
[73]https://github.com/Maryamkhn/Muscle_MetNet_CBB.
Table 2. Comparison of HMR2 metabolic model with reconstructed
muscle-specific metabolic model in the present study.
HMR2 model Muscle model
Reactions 8181 5714
Metabolites 6006 3560
Genes 3765 2704
[74]Open in a new tab
For quality controls and validation, we first checked for biomass
production. This asserts that all the required reactions to having an
alive model are satisfied. Then, 56 metabolic tasks known to occur in
all types of human cells were checked; Important pathways including
glycolysis, TCA cycle, pentose phosphate, cellular respiration,
synthesis of nucleotides and lipids, oxidation of fatty acids were
checked. The complete list of these metabolic tasks can be found in
[[75]20]. Moreover, for muscle-specific validation, same as Bordbar
et.al. [[76]4], we tested the ability of the model to produce ATP from
several different sources like glucose, fatty acids, glycogen,
branched-chain amino acids, and ketone bodies. The model successfully
passed all these tests. The reconstructed muscle-specific metabolic
model was employed for further analyses.
Metabolic reprogramming in the muscle at the early stage of T2DM
The reconstructed muscle metabolic model can be employed as a context
to investigate metabolic reprogramming in diabetic muscles. Integrating
gene expression data with the metabolic model can unravel pathways
implicated in the disease. Here, we applied constraint-based
personalized metabolic modeling. We reconstructed personalized
metabolic models for healthy and T2DM individuals. Then, using a
constraint-based approach, metabolism was simulated at the personalized
level. In this approach, the metabolic model is converted to the
mathematical format, and several constraints (e.g. adjusting lower and
upper bound of fluxes in each reaction) were imposed on the model.
Combining high-throughput transcriptional data and constraint-based
models provides an opportunity to infer cellular metabolism at the
relevant physiological state [[77]21]. Flux variability analysis (FVA)
is a well-known constraint-based approach, which can provide allowable
flux ranges in the metabolic model through linear programming based
strategy [[78]17].
This analysis resulted in perturbed reactions involving in several
pathways. The perturbations in the glucose to glucose-6-phosphate
conversion reaction and the glucose exchange reaction were also
observed. Many reactions contributing to fatty acids and lipid
metabolism were dysregulated. Also, changes in fluxes of several
reactions implicated in inositol phosphate, chondroitin, and heparin
sulfate metabolism were observed. To find statistically significant
enriched pathways, pathway enrichment analysis was employed. [79]Fig 2
shows enrichment scores of these significant pathways. This score is
calculated as the percentages of the perturbed reactions in each
pathway divided by the total number of reactions in the corresponding
pathway. For simplicity, carnitine shuttles, transport, and exchange
reactions are not shown. Keratan sulfate, chondroitin/heparan sulfate
biosynthesis, and glycerolipid metabolism have the most enrichment
scores. Moreover, the distribution of perturbed reactions in these
significant pathways is shown in [80]Fig 3. Formation and hydrolysis of
cholesterol esters, glycerolipid metabolism, fatty acid activation,
keratan sulfate biosynthesis, chondroitin/heparan sulfate biosynthesis,
glycerophospholipid metabolism, and inositol phosphate metabolism are
the top enriched pathways in this figure.
Fig 2. Pathway enrichment analysis.
[81]Fig 2
[82]Open in a new tab
Perturbed reactions obtained from metabolic analysis in T2DM were used
as reaction set to be enriched. The X-axis indicates the enrichment
score. This score is calculated as the percentages of the perturbed
reactions in each pathway divided by the total number of reactions in
the corresponding pathway. Colors also show adjusted p-value obtained
from pathway enrichment analysis.
Fig 3. The distribution of perturbed reactions in enriched pathways.
[83]Fig 3
[84]Open in a new tab
These perturbed reactions obtained from metabolic analysis in T2DM.
Numbers indicate the percentage of reactions that belong to each
pathway.
To reveal metabolic alterations at the subcellular compartment level,
we classified each perturbed reaction based on the compartment the
reaction occurs inside it (([85]Fig 4A). As is shown, cytosol,
lysosome, and endoplasmic reticulum are the top dysregulated
compartments. Cytosol due to the high number of cytosolic reactions has
the largest amount of perturbed reaction. To understand which
compartment is the most dysregulated ones we adjusted perturbed
reactions in each compartment by the total number of reactions in the
corresponding compartment (transport reactions were excluded). [86]Fig
4B shows the result of this analysis. This figure demonstrates that
Golgi, lysosome, and endoplasmic reticulum were mostly affected. Golgi
has mainly encountered dysmetabolism in chondroitin, heparin, and
keratan biosynthesis. Perturb reactions implicated in the formation and
hydrolysis of cholesterol esters, heparan sulfate degradation, and
chondroitin sulfate degradation comprise the major dysmetabolism in the
lysosome. In addition, the endoplasmic reticulum is mostly affected by
perturbation in the carnitine shuttle subsystem.
Fig 4. Dysmetabolism at the subcellular level.
[87]Fig 4
[88]Open in a new tab
A) Distribution of perturbed reactions based on the subcellular
compartments. B) The percentage of perturbed reactions in each
compartment.
We also provided enrichment analysis of significant perturbed reactions
in the context of metabolites. Several important metabolites mainly
implicated in amino acids, fatty acids, and lipid metabolism were
enriched; Metabolites including glucose, mitochondrial ATP, CoA, fatty
acids, activated fatty acids, cholesterol, amino acids, pyruvate, AKG,
succinate, homoserine, NADH, FADH, glutathione, Acetyl-CoA,
CMP-N-acetylneuraminate, THF, 5,10-methenyl-THF, and 10-formyl-THF.
[89]Fig 5 shows the top 40 significant enriched metabolites with
corresponding enrichment scores (currency metabolites are excluded).
The complete list of the enriched metabolites in unique and
compartmentalized form can be found in Additional File 1, S1 and S2
Tables in [90]S1 File.
Fig 5. Metabolites enrichment analysis.
[91]Fig 5
[92]Open in a new tab
The X-axis indicates the enrichment score. This score is calculated as
the percentage of perturbed reactions to the total number of reactions
in which the corresponding metabolite involved. Colors also show
adjusted p-values obtained from enrichment analysis.
We constructed a metabolite-centric network from the perturbed
metabolic network to find the connected component of significantly
dysregulated metabolites. Enriched metabolites with adjusted p-value
lower than 0.05 were preserved and other metabolites were removed from
the metabolite-centric network. We used Cytoscape biological network
tool [[93]19] to visualize the subnetwork. [94]Fig 6 shows the two
biggest connected components as significantly dysregulated ones; One is
related to the glycerolipid metabolism and the other related to the
metabolism of the amino acids. We also found tetrahydrofolate (THF) and
glutathione (GSH) connected to the amino acids related component.
Fig 6. Connected components of significantly dysregulated metabolites.
[95]Fig 6
[96]Open in a new tab
The metabolite-centric network was reconstructed from the perturbed
reactions in T2DM. Insignificant enriched metabolites were removed from
the network. The two biggest connected components were found. The right
component mainly consists of amino acids. The left component comprises
metabolites involved in the glycerolipid metabolism.
Potential metabolite markers prediction
The genome‐scale model of human metabolism provides an opportunity to
predict potential biofluid markers associated with the diseases. The
overall approach involves identifying exchange metabolites that their
flux range in disease state is shifted compared to healthy ones. The
successful performance of this method has already been evaluated and
validated [[97]22]. Here, we applied this method to find metabolite
markers of insulin resistance in muscle. Two generic metabolic models
of healthy and diabetic muscles were reconstructed using average gene
expression data of each group and the E-Flux method; Then, FVA analysis
was applied to them. The exchange reactions in which the flux range was
shifted relative to the normal state were selected. This approach
resulted in 32 exchange reactions. In order to evaluate the
discriminative capability of these reactions at the individual level, a
machine learning approach was employed. To do this, the generated
minimum and maximum fluxes of these reactions obtained from
personalized FVA analysis were used as the features for classification
with SVM. This evaluation resulted in 72.22% accuracy, 81.11%
specificity, and 60.32% sensitivity. We found that these exchange
metabolites altered at the individuals’ level.
To achieve a fewer and better feature subset with improvement in
classification accuracy, a wrapper feature selection method was
employed that comprises a combination of GA and SVM as the feature
selector and classifier, respectively. Several subsets of features with
which the SVM classifier can discriminate diabetic patients from
healthy ones with approximately 90 percentages accuracy were found.
This GA-SVM procedure was iterated 100 times resulting in 100 feature
subsets; then, the observation frequency of each feature in these
iterations was calculated. These frequencies were considered as
indicative of the importance of features and features with at least 80%
frequency regarded as top-ranked features. The top-ranked features were
related to 13 reactions exchanging glucose, alanine, aspartate, sodium,
hyaluronate, galactose, GQ1b, acetaldehyde, deoxyribose,
4-OH-Estradiol, hypoxanthine, retinoate, and methylglyoxal.
Subsequently, the SVM classification performance with the top-ranked
features was assessed. Our analysis revealed that using these
top-ranked features improved classification accuracy to 81.04% with
73.02% sensitivity and 86.67% specificity. Since the classifier
performance will vary depending on which samples are assigned to the
training set and which ones to the test set, we repeated the 10-fold
cross-validation 100 times and evaluated the SVM performance. The
boxplot of this evaluation is shown in [98]Fig 7.
Fig 7. Boxplot of SVM evaluation using 13 metabolite markers as features.
[99]Fig 7
[100]Open in a new tab
10-fold cross-validation was repeated 100 times and the SVM performance
was evaluated.
For further validation, we assessed the performance of these top-ranked
features with data obtained from another study [[101]8]. This data
comprises RNA-Seq samples from the vastus lateralis muscle of 24
participants, divided into four following groups: normal glucose
tolerant (NGT)/non-obese, NGT/obese, T2DM/non-obese, and T2DM/obese.
After preprocessing of the read counts, personalized metabolic models
were reconstructed and FVA analysis was performed. Associated fluxes of
top-ranked features were obtained and were used as the features for the
SVM classifier.
We divided these people into NGT and diabetic groups regardless of the
obesity state and the generated fluxes of marker reactions in these 24
participants were used for validation with SVM. This analysis resulted
in 49.46% accuracy, 42.25% specificity, and 56.67% sensitivity, which
is remarkably low, probably due to the presence of normoglycemic obese
people in the healthy group. Although the blood glucose level in these
individuals is in the normal range, high levels of fasting blood
insulin level (60 ± 33 pmol/l in NGT/obese vs 25 ±7 pmol/l in
NGT/non-obese) indicates insulin resistance in this group, which is
compensated by more insulin secretion. Thus, we decided to remove data
from NGT/obese individuals and re-evaluated markers in this set. The
result significantly altered with 78.50% accuracy, 70.17% specificity,
and 82.67% sensitivity. Therefore, the metabolites markers are related
to insulin-resistant metabolism and can successfully discriminate
insulin-resistant individuals from healthy ones.
Discussions
Here, we aimed to provide a holistic view of the metabolism remodeling
in skeletal muscle of newly diagnosed T2DM patients. Since diabetes is
a progressive metabolic disease, our goal was to identify the metabolic
alterations at the early stages of the disease. We used the gene
expression data from the most comprehensive human skeletal muscle
transcriptome study to date. The diabetic sample consists of
participants who have newly been diagnosed with diabetes (with an
average blood glucose concentration of ~ 7.2 mmol/L) and had not taken
any medications. For understanding metabolism alterations in T2DM, we
first reconstructed our functional muscle-specific metabolic model. We
could not utilize Bordbar [[102]4] or Nogiec [[103]5] muscle metabolic
models due to them being small-scale. Varemo model [[104]6] was a
complete muscle metabolic model, but because of the wrong GPRs, it was
not suitable for simulation. Thus, we reconstructed our comprehensive
skeletal muscle metabolic model. The model quality was successfully
validated. This model is freely available and can be used for other
muscle metabolic researches.
We employed the reconstructed model for system-level investigation of
dysmetabolism in T2DM. Personalized metabolic models with
constraint-based simulations were applied to quantitatively compare
metabolic capabilities between healthy and newly diagnosed diabetic
patients. Perturbed reactions and subsequently affected pathways and
metabolites were identified. We found that significant metabolic
alterations occurred in several pathways. Pathways implicated in
membrane and ECM biosynthesis were dysregulated. Perturbations in the
metabolism of keratin sulfate, chondroitin sulfate, and heparan
sulfate, inositol phosphate, glycerolipid, glycerophospholipid, and
sphingolipid metabolism were observed. Chondroitin sulfate, keratin
sulfate, heparan sulfate, and hyaluronic acid are glycosaminoglycans
(GAGs). GAGs play a vital role in cell physiology including cell
signaling, proliferation, and cell adhesion. The insulin-sensitizing
and anti-diabetic impacts of some GAGs have been reported [[105]23,
[106]24]. Perturbations in GAGs related pathways and sphingolipid
metabolism imply the role of the ECM in insulin resistance, which is
involved in the regulation of insulin action. Inositol mediates insulin
signal transduction, is associated with glucose uptake, and plays a
vital role in oxidative stress and inflammation. Administration of
inositol supplements improves glucose metabolism and insulin resistance
[[107]25].
Significant alterations in lipid metabolism including glycerolipid,
glycerophospholipid, sphingolipid, and glycosphingolipid metabolism
were found in our analyses. These pathways mediate the production of
major constituents of the plasma membrane. These metabolites act as
structural components of membranes and as signaling molecules.
Dysfunctional lipid metabolism can lead to membrane rearrangement,
changing its integrity, and interrupt cell signaling. Perturbation in
these pathways also triggers feedback loops resulting in mitochondrial
dysfunction, reactive oxygen species production, and inflammation.
Studies have found several intermediates in these pathways implicated
in insulin resistance [[108]26].
Enrichment analysis also revealed perturbation in amino acids. Previous
studies have demonstrated associations between amino acids and T2DM. In
particular, the various pieces of evidence have implied the implication
of branched-chain amino acids (BCAAs) in diabetes. Studies have shown
that dysregulation of BCAAs metabolism results in the serine
phosphorylation of insulin receptor substrates and subsequent
uncoupling of insulin signaling [[109]27]. Here, in this analysis BCAAs
and all other amino acids were affected by the metabolic reprogramming
in T2DM. Since amino acids also circulate in blood plasma,
dysmetabolism can possibly be reflected in serum. In recent years,
amino acids have been considered as a source of prognostic markers in
T2DM [[110]28]. Here, dysregulation of amino acids in the early stages
of T2DM was observed which can support the significance of these
metabolites in the early diagnosis of the disease. Also,
tetrahydrofolic acid is connected to the amino acids in the
metabolite-centric network. THF is a cofactor in the anabolism of amino
acids. THF and its derivatives also contribute to purine, pyrimidine
and nucleotide synthesis, and DNA methylation [[111]29]. Alteration in
folate metabolism can disrupt downstream processes. Limited research
regarding the role of folate and its supplementation has been
performed. Folate intakes have shown an inverse association with the
incidence of T2DM [[112]30–[113]33]. It improves insulin resistance by
modulating DNA methylation of genes associated with obesity and insulin
resistance [[114]34]. The role of folate metabolism in muscle insulin
resistance could be the subject of further empirical researches.
Analysis of metabolic alteration at the sub-cellular level revealed
that Golgi is mainly affected by metabolic remodeling in T2DM. This
compartment implicates in building ECM metabolites. Many studies have
been performed on the role of mitochondria and endoplasmic reticulum in
the development of insulin resistance [[115]35]; whereas, our knowledge
about the role of Golgi is very limited. Based on the observations in
the present study, we suppose the role of Golgi in insulin resistance
should be further investigated.
Taken together, the main dysmetabolism was observed in ECM components,
lipids, and amino acids related pathways. Muscle dysmetabolism changes
the abundance of metabolites involved in the membrane and ECM. This
metabolic reprogramming can interrupt the process of sensing insulin
and transmitting signals into the cell. Decreased insulin sensitivity
results in lower expression of insulin-responsive genes, reduced
glucose uptake, and consequently changes in energy, glucose, fatty
acids, and amino acid metabolisms.
As the final analysis, potential metabolite markers were predicted. Our
approach led to the identification of 13 exchange metabolites that
could discriminate healthy individuals from T2DM patients with 81%
accuracy. We validated these markers using separate gene expression
data from the Varemo et.al. study [[116]8]. They have shown that muscle
transcriptional reprogramming in obesity is similar to that which occur
in T2DM [[117]8]. Here, we found that using all data from this study,
including normoglycemic obese individuals for validation, the accuracy
of our proposed marker was notably low (~ 50%). We thought this low
accuracy may be due to the metabolic similarities between obese and
diabetic individuals. In fact, as noted in the Varemo et.al. study,
obese and diabetic individuals have shown similar gene expression
patterns [[118]8] that can lead to similar metabolisms. Moreover, obese
individuals had high levels of fasting blood insulin, which demonstrate
insulin resistance in this group. To test this issue, we removed obese
healthy individuals from the validation set. This improved accuracy to
78.50%. Therefore, this analysis confirmed both the Varemo et.al. claim
about the similarity of the gene expression pattern between obese and
diabetic individuals and the appropriate efficiency of our proposed
markers in identifying insulin-resistant individuals. Although it is
reasonable to assume that our model predicts insulin resistance
biomarkers rather than obesity-related metabolic changes based on the
fact that the diabetic samples in the test group included both
non-obese and obese individuals, further research is needed to confirm
that these biomarkers do not overlap with metabolic changes related to
obesity. To achieve this, muscle gene expression data is required from
individuals with normal blood sugar and insulin levels in both obese
and non-obese groups, without drug use or confounding factors.
Important metabolites such as methylglyoxal, hyaluronan, retinoic acid
(vitamin A), sodium, alanine, and aspartate are present in the
predicted markers. Notably, this method also successfully identified
glucose as one of the markers. Methylglyoxal, a glycolytic by-product,
is a toxic and highly reactive compound involved in cellular
dysregulations. This compound modifies nucleotides, proteins, and
lipids producing advanced glycation end products, contributing to
diabetes complications. Methylglyoxal is associated with oxidative
stress, cellular inflammation, and age-related disease such as diabetes
[[119]36]. Several studies have revealed the impact of methylglyoxal on
insulin signaling pathways and insulin resistance [[120]37, [121]38]
and recently this metabolite has been introduced as an emerging marker
for T2DM diagnosis [[122]36]. Hyaluronan is an anionic GAG metabolite
implicated in several functions like cell signaling, proliferation and
migration, and angiogenesis. This metabolite also contributes to the
inflammation and pathogenesis of T2DM [[123]39, [124]40]. Studies have
demonstrated that hyaluronan increases in the serum and skeletal muscle
of T2DM subjects [[125]41, [126]42]. Several analyses have shown the
possible roles of vitamin A in glucose metabolism, and the progression
of insulin resistance [[127]43–[128]45]. Association of purine
metabolites such as xanthine and hypoxanthine with the risk of T2DM
incidence and complications has been reported [[129]46, [130]47]. Also,
change in serum concentration of amino acids [[131]48], and sodium
[[132]49] is associated with obesity and T2DM. In addition, the
implication of gangliosides in insulin resistance has been shown
[[133]50–[134]52]. In our study, GQ1b ganglioside was predicted in the
top-ranked metabolites list that can be considered for future analysis.
We also checked metabolomics-based studies of T2DM and the Human
Metabolome Database for these metabolites [[135]53]. We found that
glucose, hypoxanthine, alanine, aspartate, galactose, hyaluronate, and
methylglyoxal levels have been reported to be associated with T2DM
[[136]46, [137]53, [138]54]. This evidence confirms the efficiency of
our method for identifying metabolite markers. These metabolite markers
can be used for a further empirical investigation to verify their
prognostic and diagnostic values in insulin resistance.
Supporting information
S1 File. The complete list of the enriched metabolites in the unique
and compartmentalized forms obtained from metabolite enrichment
analysis of perturbed reactions is provided in S1 and S2 Tables,
respectively.
(XLSX)
[139]Click here for additional data file.^ (74.9KB, XLSX)
Acknowledgments