Abstract
Insulin resistance is considered to be a pathogenetic mechanism in
several and diverse diseases (e.g. type 2 diabetes, atherosclerosis)
often antedating them in apparently healthy subjects. The aim of this
study is to investigate with a microarray based approach whether IR per
se is characterized by a specific pattern of gene expression. For this
purpose we analyzed the transcriptomic profile of peripheral blood
mononuclear cells in two groups (10 subjects each) of healthy
individuals, with extreme insulin resistance or sensitivity, matched
for BMI, age and gender, selected within the MultiKnowledge Study
cohort (n = 148). Data were analyzed with an ad-hoc rank-based
classification method. 321 genes composed the gene set distinguishing
the insulin resistant and sensitive groups, within which the
“Adrenergic signaling in cardiomyocytes” KEGG pathway was significantly
represented, suggesting a pattern of increased intracellular cAMP and
Ca2+, and apoptosis in the IR group. The same pathway allowed to
discriminate between insulin resistance and insulin sensitive subjects
with BMI >25, supporting his role as a biomarker of IR. Moreover, ASCM
pathway harbored biomarkers able to distinguish healthy and diseased
subjects (from publicly available data sets) in IR-related diseases
involving excitable cells: type 2 diabetes, chronic heart failure, and
Alzheimer’s disease. The altered gene expression profile of the ASCM
pathway is an early molecular signature of IR and could provide a
common molecular pathogenetic platform for IR-related disorders,
possibly representing an important aid in the efforts aiming at
preventing, early detecting and optimally treating IR-related diseases.
Introduction
Insulin resistance (IR), operationally defined as a reduction in
insulin-mediated glucose metabolism [[43]1], is commonly associated
with obesity [[44]2], ageing [[45]3], male gender [[46]4] and physical
inactivity [[47]5], and precedes and predicts Type 2 Diabetes Mellitus
(T2DM) [[48]6]. In the early stage, the euglycemic state is maintained
by compensatory hyperinsulinemia [[49]7], so that early alterations in
insulin signaling at the cellular level may occur also in apparently
healthy subjects [[50]8]. The consequences of chronic
IR/hyperinsulinemia are believed to be far reaching, and not limited to
T2DM pathogenesis [[51]6], but also to other conditions, including
atherogenic dyslipidemia [[52]9], hypertension [[53]10] and
atherosclerosis itself [[54]11, [55]12]. This has been encapsulated in
the concept of metabolic syndrome [[56]13, [57]14], or cardiometabolic
risk [[58]15], a clinically recognizable high risk state for T2DM and
cardiovascular disease, which are believed to be pathogenically rooted
in the common soil of IR [[59]14, [60]15]. The list of other conditions
associated with IR is impressively long and diverse, and includes,
without being limited to, Alzheimer’s disease (AD) [[61]16] and chronic
heart failure (CHF) [[62]17, [63]18].
The early molecular alterations leading to IR in humans are known only
in part. One well grounded hypothesis holds that increased
diacylglycerol and activation of several protein kinase Cs are central
to the development of IR in a number of conditions such as obesity,
T2DM, non alcoholic fatty liver disease, and have been detected also in
lean old individuals and lean first degree relatives of patients with
T2DM [[64]19]. However, this scenario might be unable to explain why,
for instance, in some individuals, pancreatic beta cells fail to
compensate IR and T2DM develops [[65]20], i.e. it may need a second
“hit” to account for the dire consequences of IR. Another hypothesis,
supported by a wealth of convincing experimental evidences, puts FOXO
proteins, in particular FOXO1, at the heart of the pathogenesis of both
liver/hypothalamus IR and beta cell demise [[66]21, [67]22]. However,
FOXO1 emerges as a key transducer to most, but perhaps not all
[[68]23], detrimental consequences of IR, leaving unanswered the
question of what causes IR in the first place. Of course the
combination of the two hypotheses would be endowed with fully
explanatory power, according to a “plural pathway” mechanism.
Both the above mentioned hypotheses are extensions of working
hypotheses [[69]24, [70]25] aiming at explaining findings in a single,
specific organ known to be involved in the pathogenesis of IR and T2DM.
However, in the case of heterogeneity/multiplicity of the mechanisms of
IR, by focusing at the beginning on a single tissue there might be an
inadvertent selection bias in favor of pathogenetic mechanisms
preferentially and/or primarily at work in that tissue and against
possible generalized, widespread mechanism(s).
Peripheral blood mononuclear cells (PBMCs) express approximately 80% of
the genes encoded by the human genome [[71]26]. Furthermore, the
continuous contact of PBMCs with both endogenous and exogenous cues
puts them in a unique position to recapitulate the overall genomic
response of the individual to the environment. Thus, PBMC gene
expression profile may be a useful, accessible window on the
pathophysiology of processes occurring in hardly accessible organs and
tissues [[72]27]. A few studies of PBMC gene expression analysis have
been performed to detect cross-sectionally the associations with
dietary patterns in healthy individuals or to monitor the changes
induced by dietary intervention in obese, insulin-resistant individuals
[[73]28–[74]31].
We hypothesized that, if IR induces ubiquitous changes in gene
expression patterns, circulating mononuclear cells may be reporters of
this phenomenon. The aim of our study, therefore, was to assess whether
“primary” (i.e. not associated to obesity and/or metabolic syndrome) IR
is characterized by a specific pattern of gene expression profile in
PBMCs. To reach this goal, we analyzed 40 healthy subjects in two
different comparisons. Initially, we selected 20 lean healthy
individuals from the extremes of the distribution of HOMA-IR
(Homeostasis Model Assessment-estimated Insulin Resistance), a
recognized and widely used index of IR [[75]32], within a cohort of
carefully phenotyped healthy subjects, after matching for BMI (Body
Mass Index), age and sex. In these two polar groups, with a hugely
different degree of insulin sensitivity, we explored, with a microarray
based approach, the gene expression profile of PBMC by an innovative
rank-based classification method [[76]33, [77]34] and we discovered a
biomarker of “primary” IR. After identifying a specific pathway as the
putative originator of the biomarker of “primary” IR, we tested further
whether alterations in gene expression of this pathway could be
considered as a biomarker of IR. As a first step, we tested whether the
ASCM pathway tracks with insulin resistance also in 20 otherwise
healthy overweight/obese people. Secondly, we investigated whether the
same pathway could represent a biomarker of organ damage related to IR,
by testing its expression pattern in clinical IR-related disorders,
such as T2DM [[78]35], CHF [[79]36, [80]37] and AD [[81]38] (from
publicly available data sets).
Results
Characteristics of the study population
The main features of the whole study cohort are reported in [82]S1
Table. All subjects had normal glucose regulation, with fasting plasma
glucose <100mg/dl and 2h plasma glucose after OGTT < 140mg/dl (ADA
guidelines). The insulin-sensitive (IS) and the insulin-resistant (IR)
groups were selected from the entire cohort as described in the Methods
section. The main characteristics of the IS (n = 10) and of the IR (n =
10) groups are reported in [83]Table 1. No statistically significant
differences in the variables composing the metabolic
syndrome/cardiometabolic risk (glucose, waist, blood pressure,
triglycerides and HDL-cholesterol) were detected between the IS and the
IR groups. The IR group, however, had a mean homeostatic model
assessment (HOMA)-IR value 8-fold higher than the IS group (p<0.001),
to highlight a huge difference in insulin sensitivity.
Table 1. Anagraphic, anthropometric and clinical characteristics of the low
BMI insulin-sensitive (Low HOMA-IR) and insulin-resistant (High HOMA-IR)
groups.
N.s.: non-significant.
Low HOMA-IR High HOMA-IR p-value
Gender (M/F) (n) 7/3 7/3 n.s.
Age (years) 40.2 ± 11.3 33.9 ± 6.1 n.s.
BMI (Kg/m2) 22.5 ± 2.2 22.2 ± 1.6 n.s.
HOMA-IR (pure number) 0.5 ± 0.1 4.0 ± 2.8 p<0.001
Systolic blood pressure (mmHg) 115.4± 11.9 111.7 ± 15.0 n.s.
Diastolic blood pressure (mmHg) 75.4±9.2 75.7± 8.2 n.s.
Total Cholesterol (mg/dl) 208.3±38.3 199.2±32.5 n.s.
HDLc (mg/dl) 65±11.2 57.7±16.9 n.s.
LDLc (mg/dl) 129.44±38.0 124.8±31.6 n.s.
Triglycerides (mg/dl) 69.3±45.8 83.5±54.5 n.s.
hsPCR (mg/l) 0.86±1.0 0.58±0.3 n.s.
Waist (cm) 84.0±7.0 84.4±8.5 n.s.
[84]Open in a new tab
Gene expression microarray analysis: identification of a biomarker of insulin
sensitivity/resistance
The biomarker computation was carried out by an enhanced version of the
rank-based classification method introduced in [[85]32, [86]33] as
described in the Methods section. The IS and IR groups of subjects
could be distinguished by a biomarker of 512 probes with an accuracy of
100%, according to a 5-fold cross-validation approach and a permutation
test p-value of 0.005 ([87]S2A Table). [88]Fig 1 shows the heat map for
the 512 probe expressions in the 20 subjects. Of these 512 probes, 321
could be mapped to genes ([89]S2B Table reports the 321 genes and the
associated fold change and p-value). KEGG (Kyoto Encyclopedia of Genes
and Genomes) pathway enrichment analysis for the 321 genes showed
significant enrichment (Bonferroni adjusted p-value 0.0187) of the
Adrenergic Signaling in CardioMyocytes (ASCM) KEGG pathway
[[90]39–[91]41], of which 9 genes ([92]Table 2), with significantly
altered expression in the comparison between the IS and the IR groups,
are represented in the list of 321. Other 36 KEGG pathways were
identified after functional analysis, among which insulin secretion and
T2DM, but they did not have significant p-value after adjustment.
[93]Fig 2 shows the bar charts for the KEGG pathway annotation,
representing the number of genes found in each pathway and the
corrected p-value associated with them.
Fig 1. Heatmap of the 512 probes distinguishing IR and IS subjects.
[94]Fig 1
[95]Open in a new tab
Rank values of the probes are represented and colors indicate the
position of the probe expression level in the ranking of each sample
(ascending order, red for over expressed probes, blue for under
expressed probes). Each column reports the data of a single subject: 1
to 10 are the insulin resistant individuals, 11 to 20 are the insulin
sensitive subjects. See also [96]S2A Table.
Table 2. Differentially expressed genes associated to the ASCM pathway
included in the biomarker.
The table provides the list of gene symbols, the log2 fold-change of
gene expression levels between IS and IR subjects and the associated
Wilcoxon-test p-values.
Gene symbol Log2FC [IS/IR] p-value direction
ADCY9 -0.295 0.0499 Up regulated in IR
TNNI3 0.213 0.0499 Up regulated in IS
RAPGEF3 -0.085 0.0281 Up regulated in IR
CACNA1S 0.068 0.0104 Up regulated in IS
CACNG3 -0.090 0.0379 Up regulated in IR
ADRA1B 0.095 0.0499 Up regulated in IS
CAMK2D -0.200 0.0499 Up regulated in IR
PPP2R3C 0.174 0.00466 Up regulated in IS
PPP2CA -0.180 0.00466 Up regulated in IR
[97]Open in a new tab
Fig 2. Bar chart of gene enrichment.
[98]Fig 2
[99]Open in a new tab
The bars represent the annotated KEGG pathways within the genes
distinguishing between IR and IS subjects. The Bonferroni corrected
p-value is reported on each bar, while the bar color represents the
–log10 (p-value) for each term. The bars’ length shows the number of
genes found in each term. The dendrogram on the left represents the
term similarity, i.e. terms are connected depending on the number of
genes they share.
Increasing evidences indicate that in obese subjects mononuclear cells
can induce chronic low-grade inflammation, which develops into insulin
resistance and metabolic syndrome [[100]42]. Then we asked whether the
alteration of the ASCM pathway is detected also in overweight subjects
with high HOMA-IR index. For this reason, from the MultiKnowledge
cohort, we selected other two groups of 10 subjects with different
values of HOMA-IR index but with BMI>25.0 (namely IS-BMI and IR-BMI),
and matched for age, sex and BMI, as described in the methods section
([101]Table 3). Then, we tested whether the expression of some of the
genes in the ASCM pathway could provide a biomarker able to
discriminate between the two high BMI groups.
Table 3. Anagraphic, anthropometric and clinical characteristics of the high
(BMI>25.0) BMI insulin-sensitive (Low HOMA-IR) and insulin-resistant (High
HOMA-IR) groups.
N.s.: non-significant.
IS-BMI IR-BMI p-value
Gender (M/F) (n) 6/4 7/3 n.s.
Age (year) 36.70± 4.7 38.10± 6.9 n.s.
BMI (Kg/m2) 27.7± 2.5 27.9 ± 2.7 n.s.
HOMA-IR (pure number) 0.8 ± 0.3 3.1 ± 0.8 p<0.001
Systolic blood pressure (mmHg) 124.0±11.0 122.6 ± 13.7 n.s.
Diastolic blood pressure (mmHg) 77.2±5.5 82.9± 7.0 n.s.
Total Cholesterol (mg/dl) 200.5±38.3 211.5±37.2 n.s.
HDLc (mg/dl) 52.2±8.0 47.3±12.3 n.s.
LDLc (mg/dl) 134.6±31.8 145.1±43.8 n.s.
Triglycerides (mg/dl) 68.4±30.2 119.0±48.7 p<0.05
hsPCR (mg/l) 1.35±0.99 3.0±4.2 n.s.
Waist (cm) 94.40±7.3 98.1±5.0 n.s.
[102]Open in a new tab
Of the 96 probes within the ASCM pathway, 16 probes, corresponding to
15 genes, ([103]S3 Table), constituted a biomarker that distinguished
IS-BMI group from IR-BMI group with 100% discriminant accuracy and a
permutation test p-value of 0.0001 ([104]Fig 3).
Fig 3. Network of subject groups in the IS-BMI and IR-BMI study of the ASCM
pathway.
[105]Fig 3
[106]Open in a new tab
Networks represent subjects’ classification in the validation study of
the ASCM pathway in groups with high BMI (IS-BMI and IR-BMI). Nodes
represent subjects (green ones are IS-BMI subjects, red ones are
IR-BMI). The edge length is proportional to the difference between the
subject signatures (short edges indicate high similarities, long edges
indicate low similarities, no edge indicates a negligible similarity).
In each case subjects naturally cluster together according to the class
they belong to. See also [107]S3 Table.
Deepening in the gene expression dysregulations of the ASCM pathway,
our working hypothesis was the net functional balance of the
differentially expressed genes in this pathway would predict a cluster
of alterations in the IR group, including increased intracellular cAMP
(due to overexpression of ADCY9) and Ca^2+ (due to an alteration of the
expression of PP2A subunits PPP2R3C and PPP2CA) levels and accelerated
apoptosis (due to an overexpression of RAPGEF3). These alterations
would seem to depict a scenario in which individuals with “primary”
(i.e. not associated to obesity and/or metabolic syndrome) IR are
burdened with “overwork” (increased intra-cellular cAMP and Ca^2+) in
excitable cells, together with accelerated cellular demise. Since IR is
a risk factor for T2DM, CHF and AD, diseases in which excitable cells
(beta cells, cardiomyocytes and neurons, respectively) play major
roles, we hypothesized that, if an alteration of the gene expression
pattern in the ASCM pathway is associated with the development of IR,
within its genes we should find biomarkers able to distinguish healthy
from diseased also in these conditions.
To test this hypothesis, we investigated whether the genes in the ASCM
pathway could be used to find a biomarker that would distinguish
healthy controls from patients with T2DM, or CHF or AD. As to T2DM, we
tested gene expression data of pancreatic beta cells of 10 healthy
donors and 10 T2DM donors from a publicly available dataset
([108]GSE20966 [[109]35]). Of the 418 probes within the ASCM pathway,
52 probes, corresponding to 43 genes ([110]S4A Table), constituted a
biomarker that distinguished controls from patients with 100%
discriminant accuracy according to a 10-fold cross-validation approach
and a permutation test p-value of 0.0001 ([111]Fig 4A).
Fig 4. Networks of subject groups in each replication study of the ASCM
pathway.
[112]Fig 4
[113]Open in a new tab
Networks represent subjects’ classification in the four replication
studies of the ASCM pathway as a provider of classification effective
biomarkers of IR-related disorders. Nodes represent subjects (green
ones are healthy subjects, red ones are diseased). The edge length is
proportional to the difference between the subject signatures (short
edges indicate high similarities, long edges indicate low similarities,
no edge indicates a negligible similarity). In each case subjects
naturally cluster together according to the class they belong to. A:
classification of 10 healthy donors and 10 T2DM donors [[114]35]. B:
classification of 3 healthy controls versus 4 NICM patients [[115]36].
C: classification of 9 healthy controls versus 9 CHF patients
[[116]37], in this case a diseased subject is miss-classified by the
algorithm. D: classification of 8 control donors and 7 AD patients’
autopsies [[117]38]. Networks were drawn through Cytoscape [[118]40].
See also [119]S4 Table.
As to CHF (of which IR is a known risk factor), we used two publicly
available gene expression data sets ([120]GSE9128 [[121]36] and
[122]GSE21125 [[123]37]) of PBMCs and white blood cells, respectively.
In this case we used two datasets because the number of subjects of the
first dataset was too small to allow the cross-validation of results.
From [[124]36] we compared 3 healthy controls versus 4 non-ischemic
cardiomyopathy patients. In the gene expression data, 261 probes
corresponded to genes in the ASCM pathway. Among these, 41 probes,
corresponding to 37 genes ([125]S4B Table), constituted a biomarker
that distinguished healthy controls from patients with 100%
discriminant accuracy and a permutation test p-value of 0.026 ([126]Fig
4B). From [[127]37] we compared 9 healthy controls versus 9 CHF
patients. In the gene expression data 146 probes corresponded to genes
in the ASCM pathway, within which 22 genes ([128]S4C Table) constituted
a biomarker that distinguished controls from patients with 91.7%
discriminant accuracy according to a 9-fold cross-validation approach
and a permutation test p-value of 0.0015 ([129]Fig 4C).
As to AD, we tested gene expression data of hippocampal cells from a
publicly available dataset ([130]GSE28146, [[131]38]) taken from 8
control donors and 7 AD patients’ autopsies. In this case, within the
261 probes of the ASCM pathway, 15 genes ([132]S4D Table) constituted a
biomarker that distinguished controls form patients with 100%
discriminant accuracy according to a 5-fold cross-validation approach
and a permutation test p-value of 0.0002 ([133]Fig 4D).
The key findings are summarized in [134]Fig 5.
Fig 5. Block diagram reassuming the results of the paper.
[135]Fig 5
[136]Open in a new tab
Starting from the MultiKnowledge population we identified a cohort of
10+10 subjects with normal BMI and opposite HOMA-IR value. We computed
a biomarker able to discriminate between the two groups and we
identified the ASCM pathway as the most significant pathway in the
biomarker. We then validated this result in another cohort from the
MultiKnowledge study of 10+10 subjects with BMI>25 and opposite HOMA-IR
value. Finally, we identified genes in the ASCM able to discriminate
between healthy controls and patients with T2DM, or CHF, or AD.
Discussion
One main finding of this study is the identification of a PBMC
transcriptomic signature exclusively associated to IR, independently of
age, sex, BMI, waist, blood pressure and lipids. [137]Fig 5 is a
concept map, which highlights the principal findings of the paper.
These results were obtained by means of an innovative ranked-based
classification method. The transcriptomic signature of IR included 321
annotated genes, which identified IR and IS subjects with 100%
accuracy. The most representative pathway, and the only one endowed
with full statistical significance, was the "Adrenergic signaling in
cardiomyocytes" (ASCM) KEGG pathway, therefore the best candidate for
having a role within IR development. Since alterations in the ASCM
pathway were observed also in overweight/obese subjects with polar
opposite level of HOMA-IR index, gene expressions in this pathway
should be directly related to insulin resistance and not a feature
related to BMI. At a closer scrutiny, the high IR-BMI group
down-regulates genes involved in the betaA2R signaling (GNAI1 and
PI3KCG), decreasing the PI3K activity [[138]43]. This observation is in
line with the insulin sensitizing activity of PI3K and strongly
suggests that the modification of the expression of genes related to
this pathway could be a biomarker of IR. If this hypothesis holds true,
the ASCM pathway should harbor biomarkers distinguishing between
healthy from diseased subjects also in other cases where IR is known to
develop.
Specifically, the differentially expressed genes of the ASCM pathway
are involved in the modulation of the calcium flux across the membranes
(CACNA1S, CACNG3, CAMK2D), and in cell growth and apoptosis (RAPGEF3,
PPP2R3C, PPP2CA, TNNI3). After scrutinizing the alterations in the
expression of genes involved in this pathway, we put forward the
hypothesis that they should lead to an increase in intracellular cAMP
and calcium and to accelerated apoptosis and impaired cell growth in
the IR group. These processes are extremely relevant in disorders of
excitable cells, among which IR is believed to play a significant role
in T2DM (beta cells), CHF (cardiomyocytes) and AD (neurons).
Thus, we investigated whether the ASCM pathway gene expression profile
could distinguish affected individuals from healthy controls, i.e. we
tried to replicate the role of the pathway investigating whether it
would contain biomarkers of IR-related disorders. For this purpose we
applied a ranked-based classification method to validate our findings
and found that the ASCM gene expression profile could provide molecular
biomarkers of all the above mentioned conditions ([139]Fig 5).
With hindsight, these results are not surprising. For instance, as to
T2DM, it is well known that calcium plays a key role in the process of
glucose-induced insulin release of the pancreatic beta cell [[140]44].
Mice with chronic activation of CAMK2 in beta cells developed glucose
intolerance and abnormal insulin secretion through a ryanodine receptor
2 (Ryr2) dependent mechanism. Chronic activation of Ryr2 was also
implicated in beta cell apoptosis [[141]45]. In the T2DM biomarker
herein reported, CALML3, CALML5 and CAMK2B genes are up regulated in
the patients, possibly leading to an aberrant activation of RyRs.
Studies in diabetic rat and human samples have shown that RyRs activity
can be regulated during acute hyperglycemia in cardiomyocytes
[[142]46]. Moreover, it was demonstrated that sarco/endoplasmic
reticulum Ca^2+-ATPase (SERCA) activity and calcium content are reduced
in myocytes of a mouse model of diabetes (db/db mice), and similarly in
glucose intolerant rats [[143]47, [144]48].
Also in the case of CHF, a biomarker derived from the ASCM pathway was
able to discriminate between patients and controls, in both CHF gene
expression data sets. According to our results in IR vs IS, the
adrenergic receptor signaling via cAMP-dependent and independent
pathways may be overactive and it is known that high intracellular
calcium levels, one of the main features of CHF, result into increased
apoptosis. In different rat models of IR, the decreased SERCA activity
is caused by a slowed cytosolic Ca2+ removal [[145]49], which leads to
alterations of the excitation–contraction coupling process. Finally, IR
is present in non-diabetic CHF patients with both reduced and preserved
ejection fraction [[146]50].
Also in the AD case a number of differentially expressed genes of the
ASCM pathway could distinguish patients from controls. Aberrant
intracellular calcium signaling occurs in the early stage of AD, and
elevated cytosolic calcium concentrations correlate with amyloid
precursor protein production [[147]51]. Recent studies showed that IR
and the associated key risk factors (such as obesity and physical
inactivity) are associated with both brain changes and an increased
risk of developing dementia or AD [[148]52]. Moreover, in mouse models
of AD the expression of RyRs is up-regulated in neurons [[149]53].
Thus, these replication studies confirm that the ASCM pathway gene
expression profile provides biomarkers of IR and IR-related conditions
and that ASCM plays a role in the pathogenic processes put in motion by
“primary” IR. To this regard, it is to be highlighted that in our study
the ASCM pathway was selected in otherwise healthy subjects, with
normal BMI and clinical parameters, and in cells (PBMC) that are not
classical targets of insulin action, are exposed to all cues
circulating in the blood stream and express about 80% of the overall
genome. Thus, our findings strongly suggest a very early and widespread
involvement of the ASCM pathway in the pathogenesis of IR and related
conditions. The putative functional alterations brought about by the IR
related changes in the ASCM pathway can be predicted to be particularly
relevant in excitable cells, such as pancreatic beta cells, myocytes
and neurons. The ASCM pathway, therefore, could be (one of) the
molecular basis of vulnerability of excitable cells secondary to, or
concomitant with, the onset of IR.
It could be argued that within the list of genes distinguishing IR and
IS subjects, the canonical candidates that are usually altered in IR
are missing, such as some inflammatory genes, insulin receptors or
glucose channels. On one side we can say that there are a few genes
within the list that are indeed ascribable to IR, such as RIMS2 which
has a role in insulin secretion [[150]54], SLC2A2 (GLUT2, a glucose
channel [[151]55], up-regulated in IS), IRS4 (insulin receptor
substrate 4 [[152]56], up-regulated in IR) and IL4, (up-regulated in
IS). On the other hand, it must be kept in mind that the analyzed
subjects are healthy and young individuals, it is therefore not
surprising that a massive alteration is not detected in the gene
expression profile, but rather a somewhat softened signal which is to
be attributed to mechanisms generally diffused in the organism and
early in the history of IR development.
It could be argued that a comparison between two groups of 10 subjects
each cannot reach the proper statistical power. However, this is due to
the fact that we made a very closed match between the subjects in each
group in order to avoid confounding effects of the principal features
involved in the onset of IR. This type of approach is consistent with
other previous papers [[153]57, [154]58]. To overcome the problem, when
possible, classification accuracy of computed biomarkers has been
evaluated according to a state-of-the-art cross-validation scheme and
by computing permutation test p-values that in all cases confirmed the
statistical significance of the analysis (see [155]Results and
[156]Methods sections). Moreover, we also confirmed our results in
another well matched population of 20 subjects, thus increasing the
effectiveness of the overall analysis, which finally considers 40
subjects in total.
One main limitation of this study is the lack of a post-hypothesis
specific experimental investigation. Although our results derive from
solid microarray data, the connection to specific protein activity
cannot be abstracted, for instance because of post-translational
modification. Therefore the herein generated hypothesis would need
further support from targeted proteomics examination in healthy and
diseased states.
We conclude that modifications in the gene expression profile of the
ASCM pathway are a very early molecular signature of “primary” IR and
that the gene expression profile of this same pathway provides
biomarkers of several IR-related conditions. Corollaries of the
findings herein reported are that the ASCM pathway could provide a
common molecular pathogenetic platform for IR-related disorders as
diverse as T2DM, CHF and AD and that the early molecular signature of
IR could be an important aid in the efforts aiming at preventing, early
detecting and optimally treating IR-related diseases.
Methods
Study subjects
The study population included 148 apparently healthy volunteers
selected among the offspring of the Barilla study population [[157]12,
[158]59]. The Barilla study is a longitudinal observational survey
performed in 732 factory workers of the Barilla factory since 1981 with
a follow up of 15 years. In 2007–2008 we launched another observational
study with the preferential recruitment of the offspring of the
participants of the Barilla Study, naming it Multiknowledge study. The
general aim of this study was to select and to investigate novel
potential biomarkers of the cardiometabolic risk, with specific
attention devoted to the phenotype of hyperinsulinemia/insulin
resistance. Exclusion and inclusion criteria of the Multiknowledge
study were extensively described in previous reports [[159]60,
[160]61]. Briefly, Caucasian subjects from both genders between 18 and
60 years of age were enrolled. They underwent medical history
collection and physical examination, including smoking habits, blood
pressure and anthropometric measurements. After an overnight fast, a
venous blood sample was drawn for complete blood cell count, standard
clinical chemistry and other measurements for companion experiments,
including 50 ml of blood in an EDTA BD Vacutainer tube (Becton
Dickinson and Company, Franklin Lakes, NJ, US) for isolation of
circulating PBMC (see below for further details). Subsequently, a
standard oral glucose tolerance test (OGTT) (75 g of glucose
administered orally) was performed in all subjects and plasma glucose
and serum insulin levels were measured to assess glucose tolerance.
Complete data for this report were available in 148 subjects.
This study was reviewed and approved by the Institutional Ethical
Committee "Comitato Etico per Parma". All subjects were informed about
the aims and the potential risks of the study and signed a written,
informed consent. The main demographic, anthropometric and humoral
features of the study cohort are reported in [161]S1 Table.
Since the Multiknowledge Study is a purely observational survey with a
non-trial design, according to the policy of the International
Committee of Medical Journal Editors (ICMJE), registration of the
Multiknowledge Study in a public trials registry is not required
([162]http://www.icmje.org/recommendations/browse/publishing-and-editor
ial-issues/clinical-trial-registration.html).
Assessment of insulin sensitivity/resistance and selection of
insulin-sensitive and insulin-resistant groups
HOMA-IR was calculated using the formula:
[MATH:
HOMAIR=<
mfrac>fastinggluco
mi>se*fastinginsul
mi>in22.5 :MATH]
(1)
in which glucose units are mmol/l and insulin units are mU/l.
After excluding all subjects with BMI≥25.0, we selected 10 subjects
from the bottom range (IS group) and 10 subjects from the top range (IR
group) of the HOMA-IR distribution, with the additional goal of
matching the two groups also for anthropometry, sex, blood pressure,
triglycerides and HDL-cholesterol, i.e. the phenotypes of the metabolic
syndrome. Selected features of the IS and IR groups are reported in
[163]Table 1.
A second selection was made in order to validate the results found in
the first two groups (IS and IR). We selected from the same cohort all
subjects with BMI>25.0 and, similarly to the first selection, we choose
10 subjects with the lowest HOMA-IR index (IS-BMI) and 10 subjects with
the highest HOMA-IR index (IR-BMI). Both groups are matched for the
principal anthropometry and clinical variables. Selected features of
the IS-BMI and IR-BMI groups are reported in [164]Table 3.
Gene expression profiling of circulating lympho-monocytes
The PBMC transcriptome profile was assessed in IR and IS groups. PBMCs
were immediately isolated by density gradient centrifugation
(Lymphoprep, Sentinel Diagnostic, Milan, Italy), washed three times
with PBS 1% FBS and then stored as a dry pellet at—80°C. Total RNA was
isolated using column separation (RNeasy Mini Kit, Qiagen, Valencia,
CA, US), followed by DNase treatment to avoid DNA contamination,
according to the manufacturer’s instructions. RNA was then eluted with
RNase-free water and stored at -80°C until hybridization.
Quality control testing of extracted RNA was performed before
hybridization. Total RNA yield was checked using a NanoDrop ND-1000
UV-Vis Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA).
RNA quality was assessed measuring the 28S/18S rRNA and the RNA
Integrity Number (RIN) with a Bioanalyzer 2100, using RNA 6000 Nano
Chips (Agilent Technologies, Santa Clara, CA, USA).
The gene expression microarray protocol was as previously described
[[165]62]. Briefly, a total of 500 ng of each RNA sample was reversely
transcribed into cDNA and then amplified and labelled with Cy5 dye. The
same amount of RNA from a commercially-available pool of human
leucocyte total RNA (Clontech, Mountain View, CA, USA) was used as
reference. Reference and sample RNA were hybridized together on 4X44
Whole Human Genome Agilent Microarray slides (Agilent Technologies,
ref. G4112F). Dye-normalized, background-subtracted log-ratios of
sample to reference expression were calculated using Agilent’s Feature
Extraction Software v8.0. Hybridization quality was checked using the
software’s quality report. We validated transcriptomic analysis by
these methods in a previous paper [[166]62].
Statistical and bioinformatics analyses
All descriptive statistics are reported as mean ± SD, unless otherwise
specified. The distributions of anthropometric and humoral parameters
were tested by the Kolmogorov-Smirnov test to detect significant
deviations from Gaussian distribution. If needed, log-transformation
was applied to reduce/correct non-Gaussian distributions. Between-group
comparisons were performed on normal variables using Student’s t-test.
Statistical analysis for these clinical parameters was performed using
software package SPSS 15.0 for Windows (SPSS Inc., Chicago, IL, US),
except for the gene expression data which were read and cleaned by
means of the R Bioconductor package Limma [[167]63–[168]65]. The
functions readTargets and read.maimages were used to read the files.
The backgroundCorrect function was used for background correction using
the method “normexp”. Normalization within arrays and between arrays
was performed with the functions normalizeWithinArrays (method “loess”)
and normalizeBetweenArrays (method “quant”). The function avereps was
used to average replicate probes, bringing the expression data set from
45015 to 41001 probes. Probe to gene mapping was achieved through
Agilent annotation supporting files, BioMart [[169]66], Gene cards
[[170]67], Aceview [[171]68] and BLAST [[172]69]. Data can be accessed
at GEO (Gene Expression Ominbus) under the accession number
[173]GSE87005.
Biomarkers were identified by means of an enhanced version of the
rank-based classification method proposed in [[174]33, [175]34]. The
method, extended with a global optimizer to compute the signatures
providing the best sample classification according to genetic
optimization, has been already successfully used [[176]70, [177]71] and
demonstrated to be a very efficacious tool during the SBV IMPROVER
Diagnostic Signature Challenge, an international competition designed
to assess and verify computational approaches for classifying clinical
samples based on gene expression [[178]72]. The method summarizes the
characteristics of each sample through a rank-based signature of probes
and then performs an all-to-all signature comparison by means of a
distance metric based on weighted enrichment score (ES) [[179]73,
[180]74]. The results of the comparison are then collected in a
distance matrix, which is used to classify samples into different
groups by assigning each sample to the class whose elements have the
lowest averaged distance. Moreover, to assess the reliability of the
analysis, classification accuracy is computed, when possible, according
to a cross-validation approach and its statistical significance is
evaluated by a permutation test comparing the obtained classification
accuracy with the accuracy of an empirical distribution obtained by
10000 random permutations of the probe labels. Finally, a biomarker is
extracted considering the union of the probes included in at least one
sample signature.
Cytoscape and the ClueGO plug in were used [[181]40, [182]75] for gene
annotation/gene enrichment analysis of KEGG pathways [[183]41]. A
Bonferroni step-down corrected p-value threshold of 0.05 was employed
to declare the statistical significance of gene enrichment.
The above procedure was applied in a “reverse” fashion in the
replication studies, the goal of which was to assess whether the ASCM
pathway could provide statistically significant, classification
effective biomarkers also in cells/organs/tissues taken from
individuals with insulin-resistant conditions. After selecting all
genes of the ASCM pathway, the algorithm was applied to test whether
the pathway was significantly perturbed, more specifically whether some
of its genes could form a biomarker, which would correctly classify
patients and controls. For the second selection (BMI>25.0) and for each
of the 4 datasets available in the scientific literature, a
classification effective biomarker was found entirely made of genes
belonging to the ASCM pathway. The statistical significance of the
performance of these biomarkers was assessed by permutation tests
([184]S3 and [185]S4 Tables).
Cytoscape [[186]40] was also used to draw the networks reported in Figs
[187]3 and [188]4. In particular, the Edge-Weighted Force-Directed
network layout (BioLayout) was used to arrange the placement of nodes
according to subject distances. The R package ggplot2 was used for
[189]Fig 2.
Supporting information
S1 Table. General characteristics of the study cohort.
The table represents the main characteristics of the subjects of
Multi-Knowledge study. Data are reported as mean±standard deviation.
(PDF)
[190]Click here for additional data file.^ (85.5KB, pdf)
S2 Table
A, B. Features of the biomarker with statistically significant
separation property between insulin-resistant and insulin-sensitive
healthy individuals. S2A Table: for each one of the 512 probes included
in the biomarker, the probe ID, the associated gene symbol (when
available, see [191]S1B Table), the log2 fold-change of the gene
expression level between IS and IR individuals and the associated
Wilcoxon-test p-value are reported; S2B Table: the 321 biomarker
features out of 512 for which a probe ID to gene symbol mapping is
available. The list provides probe IDs, gene symbols, log2 fold-change
of gene expression levels and associated Wilcoxon-test p-values between
IS and IR.
(XLSX)
[192]Click here for additional data file.^ (43.5KB, xlsx)
S3 Table. Validation biomarker extracted from the ASCM pathway that can
discriminate between IS-BMI and IR-BMI groups.
The list provides probe IDs, gene symbols, log2 fold-change of gene
expression levels between IS-BMI and IR-BMI individuals and their
associated Wilcoxon-test p-values.
(XLSX)
[193]Click here for additional data file.^ (9.4KB, xlsx)
S4 Table
A, B, C, D. Validation biomarkers extracted from the ASCM pathway that
can discriminate healthy individuals from patients with IR-related
conditions. S4A Table: list of probe IDs, gene symbols, log2
fold-change of gene expression levels and associated t-test p-values in
discriminating beta cells of patients with type 2 diabetes from healthy
non diabetic individuals; S4B Table: list of probe IDs, gene symbols,
log2 fold-change of gene expression levels and associated t-test
p-values in discriminating PBMC of patients with non-ischemic
cardiomyopathy from healthy non diabetic individuals; S4C Table: list
of probe IDs, gene symbols, log2 fold-change of gene expression levels
and associated t-test p-values in discriminating white blood cells of
patients with chronic heart failure from healthy non diabetic
individuals; S4D Table: list of probe IDs, gene symbols, log2
fold-change of gene expression levels and associated t-test p-values in
discriminating hippocampal cells of patients with Alzheimer disease
from healthy non diabetic individuals.
(XLSX)
[194]Click here for additional data file.^ (22KB, xlsx)
Data Availability
Microarray data are available from GEO database
([195]https://www.ncbi.nlm.nih.gov/geo/) under the accession number
GSE87005.
Funding Statement
This study was supported by the EU commission as part of the
“Multi-Knowledge project” (reference: FP6-IST–2004-027106) to IZ and by
grants of University of Parma to IZ, ADC and RCB. The funders had no
role in study design, data collection and analysis, decision to
publish, or preparation of the manuscript.
References