Abstract
Little is known about the mutational processes that shape the genetic
landscape of gliomas. Numerous mutational processes leave marks on the
genome in the form of mutations, copy number alterations,
rearrangements or their combinations. To explore gliomagenesis, we
hypothesized that gliomas with different underlying oncogenic
mechanisms would have differences in the burden of various forms of
these genomic alterations. This was an analysis on adult diffuse
gliomas, but IDH-mutant gliomas as well as diffuse midline gliomas
H3-K27M were excluded to search for the possible presence of new
entities among the very heterogenous group of IDH-WT glioblastomas. The
cohort was divided into two molecular subsets: (1) Molecularly-defined
GBM (mGBM) as those that carried molecular features of glioblastomas
(including TERT promoter mutations, 7/10 pattern, or
EGFR-amplification), and (2) those who did not (others). Whole exome
sequencing was performed for 37 primary tumors and matched blood
samples as well as 8 recurrences. Single nucleotide variations (SNV),
short insertion or deletions (indels) and copy number alterations (CNA)
were quantified using 5 quantitative metrics (SNV burden, indel burden,
copy number alteration frequency-wGII, chromosomal arm event
ratio-CAER, copy number amplitude) as well as 4 parameters that
explored underlying oncogenic mechanisms (chromothripsis, double
minutes, microsatellite instability and mutational signatures).
Findings were validated in the TCGA pan-glioma cohort. mGBM and
“Others” differed significantly in their SNV (only in the TCGA cohort)
and CNA metrics but not indel burden. SNV burden increased with
increasing age at diagnosis and at recurrences and was driven by
mismatch repair deficiency. On the contrary, indel and CNA metrics
remained stable over increasing age at diagnosis and with recurrences.
Copy number alteration frequency (wGII) correlated significantly with
chromothripsis while CAER and CN amplitude correlated significantly
with the presence of double minutes, suggesting separate underlying
mechanisms for different forms of CNA.
Keywords: glioma, mutational signatures, DNA repair, exome sequencing
1. Introduction
Gliomas vary considerably in their phenotype, biology, clinical
behavior and response to treatment [[50]1]. Various distinct tumor
entities, including astrocytoma, oligodendroglioma and glioblastoma
(GBM), were initially defined by morphological criteria and further
characterized by large-scale molecular and genetic studies
[[51]2,[52]3,[53]4,[54]5]. Each tumor type has a specific molecular
landscape with distinctive methylation profiles indicating their cell
of origin and genomic alterations defining oncogenic programs
[[55]6,[56]7]. Such divergent molecular-genetic landscapes imply that
the causative mechanisms may be different, but little is known on the
subject [[57]8]. Large-scale genomics analyses exploring mutational
signatures indicated that clock-like mutational processes (spontaneous
deamination of methylcytosine) and temozolomide-related signatures were
the predominant mechanisms in gliomas [[58]9]. Other studies have also
provided evidence that DNA repair deficiency was a central theme in
gliomagenesis [[59]10,[60]11].
Variations in the nucleotide sequence are not the only form of genetic
alterations. Other forms like alteration in chromosome number
(aneuoploidy) and structure (copy-number alterations, inversions and
re-arrangements) also play major roles in shaping the cancer genome.
These alterations are caused by a different spectrum of mechanisms
acting at different stages of gliomagenesis in distinct glioma entities
[[61]12,[62]13,[63]14,[64]15].
Isocitrate dehydrogenase (IDH) enzymes participate in a variety of
metabolic mechanisms, such as Krebs cycle, glutamine metabolism,
lipogenesis, redox regulation, and cellular homeostasis, by catalyzing
the oxidative decarboxylation of isocitrate. Previous studies revealed
that mutations of IDH genes are frequently observed in several human
malignancies, including gliomas, and that they play a potential role in
oncogenesis [[65]16]. IDH mutations are recognized in the majority of
the lower-grade gliomas and they are associated with more favorable
outcome than IDH wild-type. H3-K27M mutations were first recognized in
pediatric diffuse intrinsic pontine gliomas, but thereafter, they have
been observed in midline gliomas in adults [[66]17]. Studies on
H3-K27M-mutant tumors indicate that H3-K27M mutant gliomas were
diagnosed at an earlier age and have poor prognosis
[[67]18,[68]19,[69]20].
In this study, we chose to analyze the heterogenous group of IDH-WT
diffuse gliomas, which are probably made up of many different entities.
Other well-characterized diffuse gliomas such as “IDH-mutant gliomas”
(astrocytomas and oligodendrogliomas) and “diffuse midline gliomas
H3-K27M mutant” were deliberately excluded [[70]1]. We studied the
oncogenic processes of the cohort indirectly by quantifying the
corresponding genetic alterations that they cause and subsequently
correlating these findings with direct measurements of several
oncogenic processes. The aim of this study was to analyze the burden of
genetic alterations in different IDH-WT entities. Our hypothesis was
that the burden of various genetic alterations in these different
IDH-WT glioma entities would reflect variations in driving genomic
alterations which acted upon them.
2. Materials and Methods
2.1. Patients and Tumor Samples
Thirty-nine adult patients (24 male and 15 female, median age = 51
(range = 28–76)) who were operated on or underwent stereotactic biopsy
for diffuse gliomas were included. IDH-mutant gliomas (astrocytomas and
oligodendrogliomas) as well as diffuse midline gliomas H3-K27M mutant
were excluded. 45 tumor samples from 39 patients were studied,
including 37 primary tumors and 8 recurrences after radiochemotherapy
(radiotherapy and temozolomide). Characteristics of patients and tumors
are presented in [71]Table 1. All patients were informed about whole
exome sequencing (WES) testing and provided written consent. The study
was approved by Acıbadem Mehmet Ali Aydınlar University institutional
review board (ATADEK-2018/7, 17.05.2018).
Table 1.
Characteristics of the patients and tumors analyzed in the study.
“Primary status” indicates whether the tumor is a “primary” tumor or a
“recurrent” tumor. “Gender” indicates the gender of the patient: “M”
for male and “F” for female”. “Sample type” indicates whether the tumor
sample is a fresh frozen tissue sample (LiN2) or Formalin-Fixed
Paraffin-Embedded (FFPE) sample. “ATRX” indicates the presence of a
somatic mutation in the gene ATRX, “WT” indicates wild-type, whereas
“MUT” indicates mutated ATRX. “TERT” indicates the TERT promoter
mutation status, “WT” indicates wild-type, “C228” and “C250” indicate
somatic mutations at the given genomic positions. “H3” indicates any
somatic mutations in the gene H3F3A, “WT” indicates wild-type,
otherwise the protein alteration is presented. “EGFR amplification”
indicates whether the gene EGFR is amplified (“amplification”) or not
(“copy neutral”). “7+/10−” indicates whether both whole chromosome 7
amplification and whole chromosome 10 deletion is observed (TRUE) or
not (FALSE). “Molecular Subset” indicates the molecular subset of the
tumor: either molecularly-defined Glioblastoma (“mGBM”) or “Others”.
Patient ID Analysis ID Primary Status Gender Age at Initial
Presentation Predominant Localization Sample Type Pathological
Diagnosis Grade ATRX TERT H3 EGFR Amplification 7+/10− Other Putative
Drivers Molecular Subset
NOT-0046 NOT-0046_TA primary M 31 thalamic FFPE Glioblastoma, IDH
wild-type IV MUT WT WT amplification FALSE mGBM
NOT-0046_TB recurrent M 31 thalamic LiN2 Glioblastoma, IDH wild-type IV
MUT WT WT amplification TRUE mGBM
NOT-0047 NOT-0047 recurrent M 49 parietal LiN2 Glioblastoma, IDH
wild-type IV WT WT WT amplification TRUE mGBM
NOT-0048 NOT-0048 primary F 45 frontal LiN2 Glioblastoma, IDH wild-type
IV WT WT WT amplification TRUE mGBM
NOT-0051 NOT-0051 primary M 48 temporal FFPE Glioblastoma, IDH
wild-type IV WT C228 WT amplification TRUE mGBM
NOT-0052 NOT-0052 primary F 65 hippocampus FFPE Anaplastic astrocytoma,
IDH wild-type III WT C228 WT amplification TRUE mGBM
NOT-0054 NOT-0054 recurrent F 40 frontal LiN2 Glioblastoma, IDH
wild-type IV WT C250 WT amplification TRUE mGBM
NOT-0056 NOT-0056 primary F 67 hippocampus LiN2 Glioblastoma, IDH
wild-type IV WT C228 WT amplification TRUE mGBM
NOT-0057 NOT-0057 primary F 62 temporal LiN2 Glioblastoma, IDH
wild-type IV WT C228 WT amplification TRUE mGBM
NOT-0058 NOT-0058 primary F 71 frontal FFPE Glioblastoma, IDH wild-type
IV WT C228 WT amplification TRUE mGBM
NOT-0060 NOT-0060 primary M 48 hippocampus FFPE Glioblastoma, IDH
wild-type IV WT C250 WT amplification TRUE mGBM
NOT-0061 NOT-0061 primary M 55 occipital FFPE Glioblastoma, IDH
wild-type IV WT C250 WT copy neutral FALSE mGBM
NOT-0062 NOT-0062 primary M 66 frontal FFPE Glioblastoma, IDH wild-type
IV WT C250 WT amplification TRUE mGBM
NOT-0064 NOT-0064 primary F 48 frontal LiN2 Glioblastoma, IDH wild-type
IV WT C228 WT amplification TRUE mGBM
NOT-0065 NOT-0065 primary M 59 occipital LiN2 Glioblastoma, IDH
wild-type IV WT C228 WT amplification FALSE mGBM
NOT-0066 NOT-0066 primary F 69 frontal LiN2 Glioblastoma, IDH wild-type
IV WT C228 WT amplification TRUE mGBM
NOT-0067 NOT-0067 primary F 51 parietal LiN2 Glioblastoma, IDH
wild-type IV WT WT WT amplification TRUE mGBM
NOT-0069 NOT-0069_TA primary M 46 parietal FFPE Glioblastoma, IDH
wild-type IV WT C228 WT amplification TRUE mGBM
NOT-0069_TB recurrent M 46 parietal LiN2 Glioblastoma, IDH wild-type IV
WT C228 WT amplification TRUE mGBM
NOT-0070 NOT-0070 primary M 46 temporal FFPE Glioblastoma, IDH
wild-type IV WT C250 WT copy neutral TRUE mGBM
NOT-0071 NOT-0071 primary M 47 multifocal LiN2 Glioblastoma, IDH
wild-type IV WT WT WT amplification FALSE mGBM
NOT-0073 NOT-0073 primary M 51 frontal LiN2 Glioblastoma, IDH wild-type
IV WT C228 WT amplification TRUE mGBM
NOT-0076 NOT-0076 primary M 51 thalamus LiN2 Glioblastoma, IDH
wild-type IV WT C228 WT amplification TRUE mGBM
NOT-0078 NOT-0078 primary F 52 frontal LiN2 Diffuse astrocytoma, WHO
grade II, IDH wild-type II MUT WT WT amplification FALSE mGBM
NOT-0079 NOT-0079 primary M 40 parietal LiN2 Glioblastoma, IDH
wild-type IV WT C228 WT amplification TRUE mGBM
NOT-0082 NOT-0082 primary M 68 gliomatosis FFPE Anaplastic astrocytoma,
WHO grade III, IDH wild-type III WT C228 WT amplification FALSE mGBM
NOT-0084 NOT-0084 primary M 54 parietal LiN2 Glioblastoma, IDH
wild-type IV WT WT WT amplification FALSE mGBM
NOT-0085 NOT-0085 primary M 59 frontal LiN2 Glioblastoma, IDH wild-type
IV WT C250 WT amplification TRUE mGBM
NOT-0086 NOT-0086 primary M 53 temporal FFPE Glioblastoma, IDH
wild-type IV WT C250 WT amplification TRUE mGBM
NOT-0087 NOT-0087 primary M 63 frontal FFPE Glioblastoma, IDH wild-type
IV WT C250 WT amplification TRUE mGBM
NOT-0089 NOT-0089 primary M 62 parietal LiN2 Glioblastoma, IDH
wild-type IV WT C228 WT copy neutral FALSE mGBM
NOT-0091 NOT-0091 primary F 48 frontal FFPE Glioblastoma, IDH wild-type
IV WT C228 WT amplification TRUE mGBM
NOT-0092 NOT-0092_TA primary M 51 frontal LiN2 Glioblastoma, IDH
wild-type IV WT WT WT amplification FALSE mGBM
NOT-0092_TB recurrent M 51 frontal FFPE Glioblastoma, IDH wild-type IV
WT WT WT amplification FALSE mGBM
NOT-0094 NOT-0094 primary M 64 frontal FFPE Glioblastoma, IDH wild-type
IV WT C228 WT amplification TRUE mGBM
NOT-0075 NOT-0075_TA primary F 49 cerebellar LiN2 Anaplastic
astrocytoma, IDH wild-type III MUT WT WT amplification TRUE mGBM
NOT-0075_TB recurrent F 49 cerebellar LiN2 Glioblastoma, IDH wild-type
IV MUT WT WT copy neutral FALSE mGBM
NOT-0059 NOT-0059 primary F 76 frontal FFPE Glioblastoma, IDH wild-type
IV WT WT WT copy neutral FALSE SETD2 (Y2523) Others
NOT-0063 NOT-0063 primary M 37 gliomatosis LiN2 Glioblastoma, IDH
wild-type IV MUT WT G34R copy neutral FALSE Others
NOT-0068 NOT-0068 primary M 64 gliomatosis LiN2 Diffuse astrocytoma,
IDH wild-type II MUT WT WT copy neutral FALSE SETD2 (A2553T) Others
NOT-0083 NOT-0083 primary F 41 corpus callosum FFPE Glioblastoma, IDH
wild-type IV MUT WT WT copy neutral FALSE SETD2 (R2040*; R2510H) Others
NOT-0088 NOT-0088_TA primary F 34 cerebellar LiN2 Anaplastic
astrocytoma with piloid features, IDH wild-type* III WT WT WT copy
neutral TRUE FGFR1 (K567E; V742M) Others
NOT-0088_TB recurrent F 34 cerebellar LiN2 Glioblastoma, IDH wild-type
IV WT WT WT copy neutral FALSE FGFR1 (K567E; V742M) Others
NOT-0090 NOT-0090_TA primary M 28 frontal LiN2 Glioblastoma, IDH
wild-type IV MUT WT WT copy neutral FALSE SETD2 (KQ583fs) Others
NOT-0090_TB recurrent M 28 frontal LiN2 Glioblastoma, IDH wild-type IV
MUT WT WT copy neutral FALSE Others
[72]Open in a new tab
2.2. Pathology and Molecular Subsets
All pathological specimens were retrospectively reviewed by a single
neuropathologist (A.E.D.). Molecular markers (including TERT promoter
mutations) were determined using WES and/or Sanger sequencing and/or
fluorescent in situ hybridization. Molecular subsets were determined as
follows: IDH-wild-type gliomas with “TERT promoter mutations” and/or
“EGFR amplifications” and/or “chromosome 7 amplifications and
chromosome 10 loss” were classified as “molecularly-defined
glioblastoma (mGBM)” [[73]21,[74]22]. Other less common and less
well-defined IDH-WT gliomas were grouped as “other diffuse gliomas”
(“Others”), including 4 hemispheric high-grade gliomas which were
SETD2-mutant, 1 diffuse glioma which was H3-G34-mutant and 1 anaplastic
astrocytoma with piloid features (as confirmed by methylation
profiling, Data not provided) ([75]Table 1) [[76]23].
IDH-WT gliomas are a very heterogeneous group of tumors, likely
containing entities which remain to be identified. Therefore, we
classified IDH-WT gliomas which carried the generally accepted
molecular markers of glioblastoma as mGBM [[77]21] and the remaining as
“Others”. This grouping of IDH-WT gliomas was performed as these are
different entities with different molecular features as well as
different clinical characteristics. Because we only included IDH-WT
gliomas in this study, we do not report any comparison between primary
versus secondary GBM.
2.3. Whole Exome Sequencing, Pre-Processing and Variant Calling
DNA was extracted from snap-frozen tumor and peripheral venous blood
samples using the DNeasy Blood and Tissue Kit (QIAGEN, Hilden,
Germany). Sequencing of the libraries were performed on Illumina (San
Diego, California, USA) HiSeq instruments using paired-end reads. FASTQ
data are available under the European Genome-Phenome Archive
([78]https://ega-archive.org) accession EGAD00001004144. We achieved
mean target coverage of 207.27 and 126.25, for tumors and matching
blood samples, respectively. Detailed sequencing quality information,
including exome capture kit information, is provided in
[79]Supplementary Table S1.
The reads were aligned to the reference genome (UCSC hg19 assembly)
using BWA-MEM (version 0.7.17-r1188) [[80]24]. The mapped reads were
cleaned with Picard-CleanSam (Picard version 2.21.6-SNAPSHOT
[81]http://broadinstitute.github.io/picard/; Cambridge, Massachusetts,
USA). Cleaned reads were sorted and mate information was fixed using
Picard-FixMateInformation. PCR-Duplicates were marked using
Picard-MarkDuplicates. Base quality scores were recalibrated using the
Genome Analysis Toolkit (GATK, version 4.1.4.0; Cambridge,
Massachusetts, USA). Somatic Single Nucleotide Variation (SNV) and
insertion/deletion (indel) calling was performed using GATK-MuTect2.
Somatic copy number alterations (SCNAs) were identified using ExomeCNV
[[82]25]. Somatic structural variations were detected using DELLY
[[83]26].
2.4. Metrics
A summary of the 5 quantitative metrics and 4 parameters that explored
underlying oncogenic mechanisms are presented in [84]Table 2. Analyses
were performed using R ([85]https://www.R-project.org/, Vienna,
Austria).
Table 2.
Characteristics of the metrics analyzed in this study. “CNS” indicates
central nervous system. “CNS_“ “A”, “B”, “C”, “D”, “E”, “F”, “G” and
“H” indicate different CNS-related mutational signatures.
Metric Assessed Genomic Alteration Possible Mechanisms
SNV Burden Frequency of single nucleotide variations (SNV) DNA damage
repair deficiency, Polymerase errors, APOBEC mutagenesis in
hypermutated/ultra-mutated tumors
Indel Burden Frequency of short insertion deletions (indels) Polymerase
slippage, Non-homologous End Joining (NHEJ), hairpin loops
Weighted Genome Instability Index (wGII) Frequency of copy number
variation (CNV) events Double stand breaks, NHEJ, Mitotic
nondisjunction, Chromosomal instability, Break-Fusion Bridge (BFB)
cycles, Chromothripsis
Chromosomal Arm Event Ratio (CAER) Number of chromosomal arm
amplifications or deletions (excludes X, Y): Aneuploidy Mitotic
nondisjunction, Chromothripsis
Copy number amplitude Maximum number of amplifications Extrachromosomal
minutes and chromothripsis in cases with high level amplification (>10)
Chromothripsis Massive, clustered, single chromosomal rearrangements
Mitotic nondisjunction
Double Minutes circularization of double-stranded DNA, resulting in
highly amplified genes Chromothripsis, gene amplification, NHEJ
Microsatellite instability (MSI) Length variation in microsatellite
repeats Mismatch Repair (MMR) deficiency
Mutational Signatures Mechanisms underlying single nucleotide
variations CNS_A: Temozolomide-associated
CNS_B: Clock-like mutagenesis
CNS_C: Unknown etiology
CNS_D: Medulloblastoma-associated
CNS_E: Mismatch-repair-associated
CNS_F: Neuroblastoma-associated
CNS_G: Pilocytic astrocytoma-associated
CNS_H: Homologous recombination-BRCA1/2-associated
[86]Open in a new tab
1. SNV burden was defined as the number of somatic SNVs in the coding
region per megabase. After (a) keeping variants with variant allele
frequency (VAF) > 5% and (b) keeping variants with a sequence depth >
20X in the tumor and > 10X in the normal sample, somatic SNV burden was
calculated as:
[MATH:
#
SNVsexome len
mi>gth(Mb<
/mi>) :MATH]
(1)
2. Indel burden was defined as the number of somatic indels in the
coding region per megabase. After filtering using the same criteria for
SNVs, somatic indel burden was calculated as:
[MATH:
#
indelsexome l
mi>ength(Mb)<
/mrow> :MATH]
(2)
3. The weighted Genome Instability Index (wGII) was used to determine
the fraction of the exome exhibiting copy number alterations [[87]15].
The fractions of altered (defined as |log[2] ratio| > 0.25) segments
over the total size of the regions captured by the exome kit were
calculated for each autosomal chromosome and aggregated via the overall
average to eliminate the bias induced by variation in chromosomal
sizes.
4. Chromosomal Arm Event Ratio (CAER) was used to determine
chromosomal-arm-level SCNAs. For each chromosomal arm, the weighted
arithmetic mean of the Tumor/Normal ratios of all segments within the
arm was calculated and log[2]-transformed. If an arm had a
|weighted-mean-log[2]-ratio| > 0.25, a chromosomal-arm-level SCNA was
determined ([88]Supplementary Figure S1). CAER was determined as the
ratio:
[MATH:
#
arms wit
h SCNA# all
autosoma
l arms :MATH]
(3)
5. Copy-number amplitude was defined as the highest copy-number
observed [[89]27].
6. Chromothripsis events were determined using CTLPScanner which
detects the copy-number change clusters via sliding windows,
calculating a likelihood ratio for each window [[90]28]. Only autosomal
chromosomes were used for assessment.
7. Double minutes (DMs) were detected as previously described [[91]29].
Firstly, high-level copy-number segments with Tumor/Normal ratio ≥ 5
were determined. Possible double minutes were determined if (a) the
sample contained multiple distinct high-level copy-number segments, at
least one of which was overlapping an oncogene or (b) there was one
distinct high-level copy-number segment containing ≥ 1 oncogene, with
length > 1 Mb and with an associated structural variation.
8. Microsatellite Instability (MSI) status of each tumor was predicted
using the tool MSIpred, which uses 22 somatic mutational features to
predict MSI via a support vector machine model [[92]30].
9. Brain tumor-specific mutational signatures within each tumor were
determined using a web-based tool
([93]https://signal.mutationalsignatures.com/) [[94]11]. For high
confidence, only signatures with contribution ≥ 10% were accepted.
There was no significant difference in any metrics between
Formalin-Fixed Paraffin-Embedded (FFPE) and fresh-frozen tumor samples
(LiN2) ([95]Supplementary Figure S2).
2.5. The Cancer Genome Atlas Pan-Glioma Data
The current cohort consisted of cases where WES was performed with
clinical intent, introducing a selection bias. Therefore, the findings
were validated using The Cancer Genome Atlas (TCGA) pan-glioma study
[[96]7]. Only cases with known TERT promoter mutation status were used
to yield comparable findings.
2.6. Association of Somatically Mutated Genes with Metrics and Pathway
Enrichment Analyses
An SNV/indel was defined as “high-impact” if its VAF > 5% and its
classification was one of: “Frame_Shift_Del”, “Frame_Shift_Ins”,
“Splice_Site”, “Translation_Start_Site”, “Nonsense_Mutation”,
“Nonstop_Mutation”, “In_Frame_Del”, “In_Frame_Ins”,
“Missense_Mutation”. For each gene with a high-impact somatic
SNV/indel, Wilcoxon rank-sum test was performed to detect any
difference of metrics between mutated and non-mutated tumors. Hence,
genes associated with each metric were obtained. Next, pathway
enrichment analyses of the associated genes were conducted using
pathfindR [[97]31].
3. Results
For analyses, 45 IDH-WT diffuse glioma tumor specimens from 39 patients
were used, classified as mGBM (n = 37, 82.22%) or “Others” (n = 8,
17.78%) ([98]Table 1). There were 37 (82.22%) primary and 8 (17.78%)
recurrent tumors. When only primary cases were considered (n = 37), 31
were mGBMs and 6 were “Others”.
For validation, we analyzed the TCGA pan-glioma cohort [[99]7]. The
IDH-WT diffuse gliomas in the TCGA pan-glioma cohort (with known TERT
promoter mutation status) consisted of 83 cases, 71 mGBMs (85.54%) and
12 others (14.46%), all primary tumors.
For primary gliomas, the median SNV burden was 3.38 (range =
0.48–55.53), the median indel burden was 0.38 (range = 0.09–2.55), the
median wGII, measuring SCNA frequency, was 0.28 (range = 0.02–0.88),
the median CAER, measuring aneuploidy degree, was 0.16 (range = 0–0.68)
and the median copy-number (CN) amplitude was 15 (range = 4–149).
Fifteen primary cases (40.54%) had CT events, and sixteen primary
samples contained putative DMs (43.24%). A total of 44 oncogenes were
detected in putative DMs across the 16 samples with at least one
oncogene identified in every DM. EGFR and SEC61G (both observed in n =
8, 50%) were the most frequent, followed by VOPP1 (n = 4, 25%), MDM2,
CDK4 and AGAP2 (each n = 3, 18.75%) ([100]Supplementary Table S2). No
kataegis event was observed in the current cohort nor the TCGA cohort.
The MSI prevalence was low (n = 4, 8.89% in the current and n = 2,
2.41% in the TCGA cohort).
3.1. Comparison of Metrics between Molecular Subsets
We compared the metrics between the molecular subsets in primary
tumors. In the current cohort, the median SNV burden values of mGBM
(3.38/Mb) and “Others” (3.59/Mb) were similar (p = 0.89, [101]Figure
1A). In the TCGA cohort, mGBM (1.34/Mb) had higher SNV burden compared
to “Others” (0.1/Mb, p < 0.001). There was no significant difference in
indel burden of different molecular subsets in neither the current nor
the TCGA cohort (p = 0.77 and p = 0.48 respectively, [102]Figure 1B).
In the current cohort, mGBM had higher median wGII (0.29) compared to
“Others” (0.13, p = 0.035, [103]Figure 1C). The same was observed in
the TCGA cohort: mGBM had higher median wGII (0.19) than “Others”
(0.01, p < 0.001). In the current cohort, there was no significant
difference in median CAER between the subsets (p = 0.086, [104]Figure
1D). In the TCGA cohort, mGBM had higher median CAER (0.41) compared to
“Others” (0.33, p = 0.011). In the current cohort, mGBM had higher
median CN amplitude (29) compared to “Others” (4, p = 0.0016,
[105]Figure 1E). In the TCGA cohort, again, mGBM had higher median CN
amplitude (17) compared to “Others” (5, p < 0.001).
Figure 1.
[106]Figure 1
[107]Open in a new tab
Molecular subsets of IDH-WT gliomas differ significantly in their SNV
burden, wGII (copy number alteration frequency), CAER (degree of
aneuploidy) and copy-number amplitude but not for indel burden. (A–E)
Distributions of the 5 quantitative metrics in the 2 molecular subsets.
The upper row displays findings in only primary cases of this cohort
and the lower row displays findings in the TCGA pan-glioma cohort. Bold
p values indicate statistical significance (p < 0.05).
Although it may be expected to observe higher SCNA-associated metric
levels in mGBMs (due to chr7 gains and chr10 losses), it is important
to keep in mind that these metrics are global, assessing all autosomal
SCNA events. Therefore, they are expected to be affected little by the
canonical chr7 gains and chr10 losses.
We next investigated the correlations between the metrics ([108]Figure
2). Hierarchical clustering based on the correlations yielded 2
clusters: (1) SNV burden and indel burden, associated with mutational
processes, i.e., metrics/processes associated with changes in
nucleotide sequence, and (2) CN amplitude, DM, CT, wGII and CAER,
associated with SCNA-related mechanisms, i.e., metrics/processes
associated with changes in chromosomal copy-number/structure.
Figure 2.
[109]Figure 2
[110]Open in a new tab
Metrics associated with mutational processes and copy-number-associated
processes for 2 distinct clusters. Correlogram of both quantitative and
qualitative metric, sizes indicate the |correlation coefficient|. The
hierarchical clustering dendrogram is displayed on the left, the
identified clusters are indicated by dashed rectangles.
3.2. Pathway Enrichment Analysis of Metric-Associated Somatic Variants
To investigate the possible mechanisms underlying each metric, we
firstly examined the genes associated with each metric. Through
Wilcoxon rank-sum tests, 1050, 886, 92, 36 and 19 genes with somatic
SNV/indels were found to be significantly associated with SNV burden,
indel burden, wGII, CAER and CN amplitude, respectively
([111]Supplementary Material 1). Next, pathway enrichment analyses were
performed using the associated genes for each metric. As a result, 72,
60, 7, 1 and 4 pathways were found to be enriched for SNV burden, indel
burden, wGII, CAER and CN amplitude, respectively. Pathways distinct to
each metric and those at each intersection of the enrichment results
are presented in [112]Figure 3. There were 19 enriched pathways
specific to SNV burden, 10 were specific to indel burden and 4 were
specific to wGII. SNV burden and indel burden shared 45 common enriched
pathways. SNV burden and wGII had 2 common enriched pathways. SNV
burden and CAER shared 1 pathway. SNV burden, indel burden and CN
amplitude shared 4 common enriched pathways. SNV burden, indel burden
and wGII had 1 common enriched pathway. Of note, the “Mismatch repair”
pathway was significantly enriched only for the SNV burden metric.
Figure 3.
[113]Figure 3
[114]Open in a new tab
“Mismatch repair” is significantly associated with SNV burden. The
UpSet plot displaying the sets of results of pathway enrichment
analyses on genes associated with each of the 5 quantitative metrics.
3.3. Correlation of Metrics with Chromothripsis and Double Minutes
In primary tumors, CT events were most frequently observed within chr17
(n = 11, 19.3% of total events), followed by chr1 (n = 8, 14.04%) and
chr16 (n = 6, 10.53%), while DM events were most frequent in chr7 (n =
9 DMs, 50% of total), chr12 (n = 3, 16.67%) and chr1 (n = 2, 11.11%)
([115]Figure 4A). CT was associated with higher wGII (p = 0.0014)
([116]Figure 4B). The proportion of cases with at least one putative
double minute chromosome was not different between cases harboring CT
and not harboring CT (p = 1). Harboring a DM was associated with higher
CAER (p = 0.021) and CN amplitude (p < 0.001) ([117]Figure 4C).
Figure 4.
[118]Figure 4
[119]Open in a new tab
Chromothripsis and double minute events are associated with copy-number
alteration. (A) A waterfall plot displaying the overview of
chromothripsis and double minute events in primary tumors. (B)
Comparison of metrics according to chromothripsis status. Cases with
chromothripsis had significantly higher frequency of copy-number
alterations (wGII). (C) Comparison of metrics according to double
minute status. Cases with double minutes had significantly higher
copy-number amplitudes and chromosomal arm event ratios. Bold p values
indicate statistical significance (p < 0.05).
3.4. Associations of Metrics with Age and with Recurrences
Next, correlation of each metric with age at diagnosis was
investigated. To remove any confounding effects of molecular class and
recurrence, only primary mGBMs were analyzed for both the current and
TCGA cohorts ([120]Figure 5). This analysis yielded that SNV burden was
positively correlated with age at diagnosis (R = 0.32, p = 0.079 for
the current cohort, R = 0.25, p = 0.037 for the TCGA cohort,
[121]Figure 5A). Indel burden, wGII, CAER and CN amplitude displayed no
significant correlation with age at diagnosis ([122]Figure 5B–E). In
both the current and TCGA cohorts, there was no significant difference
in age at diagnosis between cases with CT and without (p = 0.48 and p =
0.5 for the current and TCGA cohorts, respectively). There was no
difference in age at diagnosis between cases with DMs and without (p =
0.75).
Figure 5.
[123]Figure 5
[124]Open in a new tab
Only SNV burden correlates with age. (A–E) Scatter plots between age at
diagnosis and SNV burden (A), indel burden (B), wGII (C), CAER (D) and
CN amplitude (E) in the current cohort (upper) and the TCGA cohort
(lower). Bold p value indicates statistical significance (p < 0.05).
To evaluate how metrics differ in primary vs. recurrent tumors, first,
we compared primary tumors (n = 37) and recurrent tumors (n = 8). SNV
and indel burden were significantly higher in recurrent tumors (p =
0.0066 and p = 0.0057, respectively), whereas wGII, CAER and CN
amplitude displayed no significant difference ([125]Figure 6A). There
was no significant difference between the proportions of primary and
recurrent tumors with and without CT (χ^2 p = 0.67, [126]Figure 6B),
43.24% of primary tumors had DMs compared to 25% of recurrent tumors (χ
^2 p = 0.58, [127]Figure 6C) and 2 primary tumors (5.71%) and 2
recurrent tumors (25%) were predicted to have MSI (χ ^2 p = 0.28,
[128]Figure 5D).
Figure 6.
[129]Figure 6
[130]Open in a new tab
Only SNV burden and indel burden increase with recurrences. (A)
Comparison of quantitative metrics between all primary and all
recurrent cases. (B) Comparison of chromothripsis prevalence between
primary and recurrent cases. (C) Comparison of double minute prevalence
between primary and recurrent cases. (D) Comparison of MSI prevalence
between primary and recurrent cases. Bold p values indicate statistical
significance (p < 0.05).
3.5. Correlation of the Metrics with Mutational Signatures
The most frequent detected central nervous system (CNS)-associated
mutational signature in primary tumors was the clock-like signature
CNS_B, associated with the deamination of 5-methylcytosine to thymine
(n = 33, 88.19%, [131]Figure 7A). There was no association of any
signatures with molecular subsets or metrics ([132]Supplementary
Material 2). When signature contributions were compared between all
primary and recurrent tumors, CNS_B, the clock-like signature, was
found to be lower in recurrent tumors, whereas CNS_E, associated with
mismatch repair, was found to be significantly higher in recurrent
tumors (p < 0.001, [133]Figure 7B).
Figure 7.
[134]Figure 7
[135]Open in a new tab
Contributions of clock-like and mismatch repair deficiency-associated
signatures differ between primary and recurrent tumors. (A) A heatmap
of central nervous system (CNS)-associated mutational signatures in
primary tumors (mismatch-repair-associated signature CNS-E and
homologous-recombination-associated signature CNS-H are marked in
bold). Clock-like signature CNS-B was the most common signature. (B)
Comparison of signature contributions between primary and recurrent
tumors. Mismatch-repair-associated signature CNS-E was significantly
higher in recurrent tumors. Bold p values indicate statistical
significance (p < 0.05).
4. Discussion
4.1. Rationale for the Study
Gliomas are a heterogeneous tumor group, consisting of various entities
such as “IDH-mutant astrocytomas”, “IDH-mutant, 1p/19q-co-deleted
oligodendrogliomas”, “IDH-wild-type GBM” and “diffuse midline gliomas
H3 K27M mutant”, which are being characterized in ever increasing
detail [[136]1]. These entities differ in demographics, histopathology,
molecular markers, clinical behavior, treatment response and outcome
[[137]1,[138]7,[139]32,[140]33]. Genetic and epigenetic landscapes are
also divergent [[141]6,[142]7]. Even determinants of genetic
inheritance are dissimilar [[143]34,[144]35]. Therefore, it would not
be irrational to think that each tumor type is formed by different
oncogenic processes. Oncogenic processes leading to the observed
genetic alterations in gliomas are not well-characterized, but the
consequent alterations can be readily quantified. Various forms of
genetic alterations exist [[145]14,[146]15]: some affect the genetic
sequence (e.g., mutations), others affect the karyotype, with some
events resulting in abnormal number of chromosomes (aneuploidy) and
others changing the structure of chromosomes (e.g., copy-number
alterations, re-arrangements or loss of heterozygosity). These genetic
alterations differ in underlying mechanisms [[147]14]. To evaluate the
effect of mechanisms underlying molecular subsets of adult diffuse
gliomas, we quantified the burden of various genetic alterations.
Correlations among metrics indicated that SNV and indel burdens
clustered together, whereas metrics of chromosome number/structure
formed another cluster ([148]Figure 2). SNV burden increased
significantly with advancing age in both the current and TCGA cohorts
([149]Figure 5). No significant correlation with age was noted for the
chromosome number/structure-related metrics. Together, these findings
may indicate that mutations and alterations of chromosome
number/structure are caused by different mechanisms and have different
dynamics in gliomas.
4.2. Molecular Subsets of IDH-WT Glioblastomas Differ in Genomic Alteration
Burden
Molecularly-defined glioblastomas (mGBM) and “Others”, which consisted
of diffuse gliomas with no commonly accepted canonical markers, were
significantly different in all SNV (only in the TCGA validation cohort)
and CNA metrics but the indel burden was comparable ([150]Figure 1).
Some discrepancies between our cohort and the TCGA for some metrics may
have resulted from selection bias or the size of the cohort.
4.3. Chromothripsis and Double Minute Events May Be Drivers of Copy-Number
Alterations in IDH-WT Glioblastoma
Little is known about the mechanisms that alter chromosome
number/structure in gliomas. Some copy number gains (chr7, chr19,
chr20) are early, clonal events in GBM [[151]12]. In addition to the
canonical chr7 gains and chr10 losses, other chromosomal arm events
were observed scattered throughout the genome in GBMs
([152]Supplementary Figure S1).
In this study, chromothripsis was found to be a mechanism associated
with altered chromosome number and structure. It is characterized by
rapid and massive but localized accumulation of chromosomal
re-arrangements resulting from nondisjunction events during mitosis
[[153]36]. Previously, chromothripsis was reported in 84% of GBM cases
[[154]13]. In the current study, chromothripsis events were observed in
over one-third of primary tumors ([155]Figure 4). Copy-number
alteration frequency (wGII) was significantly higher in cases
exhibiting chromothripsis, but SNV or indel burden values were
comparable. This may indicate that chromothripsis is a driver of
structural variations but not mutations in gliomas. The chromothripsis
events were most commonly observed in chr17, chr1 and chr16 (but not in
chr7 or in chr10) ([156]Figure 4). These findings together hint that
there are multiple mechanisms leading to aneuploidy in GBM and that
chromothripsis is a late event.
Double minutes (DM) are small fragments of extrachromosomal DNA, formed
via the circularization of highly amplified, double-stranded DNA mostly
containing oncogenes [[157]37]. DM was observed in over one-third of
cases and the oncogenes contained herein (EGFR, MDM2, CDK4) were
consistent with previous reports [[158]29,[159]38]. Tumors with DM
displayed higher CAER and CN amplitude, hinting to a role in driving
structural variations ([160]Figure 4).
Other possible mechanisms driving structural variations may include
chromosomal instability (CIN), break-fusion-bridge (BFB) cycles or
kataegis. In this cohort, the frequency of copy-number alterations
(wGII), the degree of aneuploidy (CAER) and copy-number amplitude
remained fairly constant over increasing age at diagnosis and at
recurrences ([161]Figure 5 and [162]Figure 6), which is not consistent
with CIN, a continuous process that would result in ever-increasing
extent and complexity of copy-number events. Kataegis events, which are
associated with APOBEC mutagenesis, were not observed in our cohort nor
the TCGA. Another mechanism of interest in TERT promoter mutant gliomas
is occurrence of BFB cycles, which create chromosomal instability until
the acquisition of telomerase activity, however the current study was
limited due to lack of a reliable bioinformatics tool for its detection
from WES data [[163]32].
4.4. Mismatch Repair Deficiency Is Likely a Major Driver of Mutational Burden
in Gliomas
A pathway enrichment analysis based on the 5 quantitative genomic
alteration metrics indicated that genes associated with SNV burden were
enriched for mismatch repair (MMR) deficiency. MMR is a highly
conserved biological pathway that plays a key role in maintaining
genomic stability and MMR-deficiency is a known topic in gliomas. We
previously showed that diffuse gliomas had a high incidence of both
familial-inherited and somatically gained MMR-deficiency [[164]10].
Several studies also indicated that MMR-deficiency results in an
increase in mutational burden [[165]10,[166]39,[167]40,[168]41]. We
also observed substantial increases in SNV burden and indel burden at
recurrence after radiochemotherapy ([169]Figure 6). In parallel with
previous studies, this substantial increase in mutational burden was
associated with significantly higher weight of
MMR-deficiency-associated mutational signature CNS_E ([170]Figure 7)
and with a trend towards higher incidence of MSI ([171]Figure 6). Newly
acquired MMR gene mutations were shown to lead to temozolomide
resistance [[172]42,[173]43,[174]44]. Temozolomide-induced damage in
cells with MMR-deficiency were shown to be the mechanism leading to a
post-treatment hypermutated phenotype [[175]45]. In contrast, similar
levels of copy-number-related metrics and no signs of chromosomal
instability were noted at recurrence after radiochemotherapy
([176]Figure 6). The prevalence of chromothripsis or DM were also not
significantly different in post-treatment recurrences. These indicate
that MMR is a major driver of new mutations over time and with
recurrences after radiochemotherapy.
4.5. Limitations and Future Prospects
The cancer genome has a complex nature and the current findings
represent most likely only a detail in the gliomagenesis. Furthermore,
being a pure bioinformatic analysis, the current work points to
associations but does not provide mechanistic analysis of underlying
mechanisms. Also, as it is a whole-exome analysis-based work;
therefore, various other forms of genetic alterations including
rearrangements, intratumoral heterogeneity or epigenetic changes were
not addressed. The current analysis is also limited by small sample
sizes (resulting in some singular entities), decreasing statistical
power. More comprehensive analyses on larger cohorts can advance our
understanding of gliomagenesis and point to tumor vulnerabilities. The
associations identified in this study should be further investigated to
identify and experimentally prove any underlying cause–effect
relationship.
5. Conclusions
Taken together, these findings support the notion that single
nucleotide variations and copy number alterations are driven by
separate mechanisms and that the cancer genome in different molecular
subsets of IDH-WT glioblastomas diverge in the composition of distinct
genetic alterations. We hope that this work contributes to the deeper
understanding of the tumor biology underlying IDH-WT tumor entities and
will allow for better characterization of these tumors to better
understand clinical behavior and eventually for developing novel
treatment strategies.
Acknowledgments