Abstract
Impulsivity is a heritable, multifaceted construct with clinically relevant links to multiple psychopathologies. We assessed impulsivity in young adult participants (N~2100) in a longitudinal study, using self-report questionnaires and computer-based behavioral tasks. Analysis was restricted to the subset (N=426) who underwent genotyping. Multivariate association between impulsivity measures and single-nucleotide polymorphism data was implemented using parallel independent component analysis (Para-ICA). Pathways associated with multiple genes in components that correlated significantly with impulsivity phenotypes were then identified using a pathway enrichment analysis. Para-ICA revealed two significantly correlated genotype–phenotype component pairs. One impulsivity component included the reward responsiveness subscale and behavioral inhibition scale of the Behavioral-Inhibition System/Behavioral-Activation System scale, and the second impulsivity component included the non-planning subscale of the Barratt Impulsiveness Scale and the Experiential Discounting Task. Pathway analysis identified processes related to neurogenesis, nervous system signal generation/amplification, neurotransmission and immune response. We identified various genes and gene regulatory pathways associated with empirically derived impulsivity components. Our study suggests that gene networks implicated previously in brain development, neurotransmission and immune response are related to impulsive tendencies and behaviors.
Introduction
Impulsivity has been defined as ‘a predisposition toward rapid, unplanned reactions to internal or external stimuli with diminished regard to the negative consequences of these reactions to the impulsive individual or others’.^[46]1, [47]2, [48]3, [49]4 Impulsivity is a complex, multidimensional construct related to responses to rewards/punishments, attention and other cognitive processes.^[50]5 Impulsivity relates to multiple psychiatric disorders and abnormal behaviors, including attention-deficit hyperactivity disorder, suicide, aggression and addiction.^[51]5 The Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5)^[52]6 defines impulse-control features and/or impulsive symptoms as major factors in the diagnosis of bipolar, attention-deficit/hyperactivity, conduct, antisocial and borderline personality disorders, among others.^[53]4,[54]7 Impulsivity may predict suicidal behavior, psychopathy, conduct disorder, and drug and alcohol problems.^[55]8 Impulsivity is genetically influenced and heritable.^[56]5,[57]9 Offspring of parents with substance-use disorders have increased impulsivity,^[58]8 which may be transmitted as a general risk factor for substance abuse.^[59]10,[60]11 Some genes putatively related to impulsive behaviors have been identified.^[61]12 Prior studies also report genetic associations in other impulsivity-associated pathological conditions including behavioral addictions and eating disorders, which may share similar neurobiological risk factors.^[62]13, [63]14, [64]15 Quantifying the precise genetic underpinnings of impulsivity holds promise for intervention development for multiple psychiatric conditions. Like other complex, inherited behavioral phenotypes, analogous to complex medical disorders such as obesity^[65]16 and psychological phenotypes such as extraversion, impulsivity is clearly influenced by multiple genes and also by environmental factors and their interactions.
Various impulsivity-related single-nucleotide polymorphisms (SNPs) have been identified in previous genome-wide association studies, including those associated with dopaminergic and serotonergic genes.^[66]17,[67]18 Prior meta-analyses also link common variants in such genes to attention-deficit hyperactivity disorder and suicidal behaviors,^[68]19,[69]20 which are characteristically impulsive. Most genetic studies utilize a univariate approach (often genome-wide association studies); however, this method is hindered by a high statistical threshold owing to multiple testing corrections for SNP numbers, and it does not take into account the aggregate effects of genetic variants, such as those that might underlie epistasis and other types of interrelationships that likely underpin complex phenotypes. The role of any individual gene in impulsivity remains unclear, likely attributable to the common disease–common variant model alluded to above, for which univariate approaches are not optimal. Thus, alternate approaches that consider such genetic aggregates are important to pursue. Multivariate analyses such as parallel independent component analysis (Para-ICA) provide a sensitive and powerful alternative to traditional univariate analyses using single SNPs and single phenotypes.
Para-ICA is typically more powerful than univariate analyses because it examines clusters of related individual phenotypic measures in relation to clusters of related SNPs that can be linked via annotation pathways to known molecular biological processes.^[70]21 Para-ICA derives both these phenotypic and SNP clusters empirically from the data set, in a hypothesis-free manner, to reveal novel, biologically relevant associations that might otherwise not be detected.^[71]22 Prior studies have shown that Para-ICA yields robust results with practical sample sizes of patients with various psychiatric disorders such as Alzheimer's disease and schizophrenia.^[72]21,[73]23 Consequently, in the current study, we used Para-ICA^[74]22,[75]24 to examine aggregate effects of common SNP variants underlying impulsivity-related constructs. The main purpose of the current study was to uncover novel gene networks composed of interacting SNPs associated with various impulsivity-related measures in a sample of healthy young adults. In addition, we aimed to identify the underlying molecular and biological mechanisms associated with these gene networks that might promote understanding of the etiology of specific impulsivity-related behaviors and tendencies. Jupp and Dalley^[76]25 recently reviewed various neurotransmission systems (dopaminergic, serotonergic, noradrenergic, glutamatergic, GABAergic, opioidergic, cholinergic and cannabinoid) that have a putative role in impulsivity.
The importance of these neurotransmission systems may differ with respect to different aspects of impulsive behavior.^[77]25 In addition, brain organizational processes during specific neurodevelopmental stages (such as adolescence) might impact the brain's motivation and inhibition substrates, influencing impulsive choice, risky behaviors and addiction risk.^[78]26 We hypothesized that the biological processes identified by Para-ICA would contain genes identified previously as associated with brain development; impulsive traits and impulsivity-related behavioral problems such as externalizing behaviors, attention-deficit hyperactivity disorder, suicidal behavior and substance abuse; nervous system signal generation, amplification or transduction; and neurotransmitter function, for example, their associated receptors, reuptake sites and synthetic/degrading enzymes.

Materials and methods

Subjects
The study sample consisted of N=426 young adult freshman students who participated in the National Institute on Alcohol Abuse and Alcoholism-funded Brain and Alcohol Research with College Students longitudinal study.^[79]11 These participants were the subset of the larger sample (N~2100) who provided genotyping data. Demographic information is shown in [80]Table 1. All subjects provided written informed consent, approved by Hartford Hospital, Yale University, Trinity College and Central Connecticut State University. Exclusion criteria included current psychotic or bipolar disorder based on the Mini International Neuropsychiatric Interview,^[81]27 history of seizures, head injury with loss of consciousness >10 min, cerebral palsy, concussion in the last 30 days, positive urine toxicological screens for common drugs of abuse and pregnancy. Although we did not collect classical intelligence quotient measures, we recorded Scholastic Assessment Test scores from all our participants.
Prior studies have shown Scholastic Assessment Test scores to be a good predictor of intelligence quotient.^[82]28 Thus, intelligence quotient estimates were calculated using Scholastic Assessment Test scores as recommended by Frey and Detterman.^[83]28 Also, socio-economic status was calculated using the Hollingshead (1975) four factor index of social status.

Table 1. Demographic information.

Group            | Male (N) | Female (N)
-----------------|----------|-----------
Caucasian        | 137      | 172
African American | 17       | 30
Hispanic         | 13       | 21
Mixed/other      | 18       | 18

Age range (years): 17–24
Mean age (years; s.d.): 18.31 (0.77)

Impulsivity-related measures
Five different self-report questionnaires and three behavioral tasks were used to measure impulsivity and related constructs.
These measures were chosen to capture different facets of impulsivity and related constructs that had constituted separate factors in our prior research.^[85]3 Self-report measures were as follows: (i) Barratt Impulsiveness Scale (BIS-11),^[86]29 (ii) Behavioral-Inhibition System/Behavioral-Activation System scale (BIS/BAS),^[87]30 (iii) Sensitivity to Punishment and Reward Questionnaire (SPSRQ),^[88]31 (iv) Zuckerman Sensation Seeking Scale (SSS)^[89]32 and (v) Padua Inventory (PI).^[90]33 Computer-based behavioral tasks consisted of (i) two different versions of the Balloon Analog Risk Task (BART), the Java Neuropsychological Test (JANET) BART^[91]34 and conventional BART,^[92]35 and (ii) Experiential Discounting Task (EDT).^[93]36 Subscales used in our analysis included attention, motor and non-planning from BIS-11; drive, fun-seeking and reward responsiveness subscales from BAS; reward and punishment scales from SPSRQ; thrill and adventure seeking (ZTAS), experience seeking (ZES), disinhibition (ZDIS) and boredom susceptibility (ZBS) from SSS; total score from PI; total balloon pumps and pops from JANET BART; average adjusted pumps from conventional BART; and area under the curve from the EDT, yielding 18 total impulsivity scores and subscores that were included in the analysis. Missing impulsivity-related values (10.5–14.1%) were imputed by mean substitution using SPSS v19.0 ([94]www.ibm.com/software/analytics/spss/), and the scores were then normalized.

SNP data collection and preprocessing
Genomic DNA was extracted from saliva collected from each subject using Oragene collection kits.^[95]37 Genotyping was performed using the Illumina (Illumina, San Diego, CA, USA) HumanOmni1-Quad v1.0 BeadChip (~1 million target SNPs) for 237 subjects and the Illumina HumanOmni2.5-8v1 BeadChip (~2.5 million target SNPs) for 189 subjects. Both chips had identical allele coding. The SNP data from both chips were merged in PLINK software ([96]http://pngu.mgh.harvard.edu/~purcell/plink/).
SNPs common to the two chips (N=582 300) were considered for further processing. We followed quality control steps for the SNP data using PLINK software as reported elsewhere.^[97]38 [98]Figure 1 is a conceptual illustration of the preprocessing steps in quality control of SNP data. To increase independence between markers, SNPs in high linkage disequilibrium were removed (window size in SNPs=50, number of SNPs to shift the window at each step=5 and r^2>0.5). We performed principal component analysis using custom MATLAB scripts implementing an algorithm similar to EIGENSTRAT.^[99]39 To correct for stratification bias, data were corrected using the top two eigenvectors. Stratification bias was checked using a Q–Q plot based on the P-values from the association test. To further reduce the number of SNPs for optimal employment of Para-ICA,^[100]22 we queried the processed SNPs against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database ([101]www.genome.jp/kegg). Finally, 26 142 SNPs that were part of pathways in the KEGG database were considered for Para-ICA.

Figure 1. Illustration of quality control processing pipeline of single-nucleotide polymorphism (SNP) data. LD, linkage disequilibrium; KEGG, Kyoto Encyclopedia of Genes and Genomes.

Genetic-impulsivity association
To identify associations between genetic and impulsivity-related data, Para-ICA from the Fusion ICA Toolbox ([104]http://mialab.mrn.org/software/fit/) was used in MATLAB 7.7. Data were arranged as a 426 (subjects) × 18 (impulsivity-related measures) matrix and a 426 (subjects) × 26 142 (SNPs) matrix, which were then input to Para-ICA.^[105]22,[106]24 The number of independent components for the impulsivity-related and SNP data was estimated using minimum description length criteria;^[107]40 the estimated number of components was 6 for impulsivity-related measures and 17 for SNPs.
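The stratification correction described in the preprocessing steps above (removing the top two eigenvectors from the SNP data) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB script and is simpler than EIGENSTRAT itself: it only centers each SNP and subtracts its projection onto the leading subject-level principal components. The function names `remove_top_pcs` and `mean_abs_corr`, the simulated two-group ancestry and the allele frequencies are all hypothetical and chosen only for illustration.

```python
import numpy as np

def remove_top_pcs(genotypes, n_pcs=2):
    """Subtract the top principal components from a subjects x SNPs matrix.

    Simplified, assumed stand-in for an EIGENSTRAT-like correction.
    Returns the corrected matrix and the removed subject-level eigenvectors.
    """
    # Center each SNP column so the decomposition reflects covariance.
    centered = genotypes - genotypes.mean(axis=0)
    # Left singular vectors are the subject-level eigenvectors.
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    top = u[:, :n_pcs]
    # Remove each SNP's projection onto the top eigenvectors.
    corrected = centered - top @ (top.T @ centered)
    return corrected, top

def mean_abs_corr(matrix, covariate):
    """Mean absolute Pearson correlation between each column and a covariate."""
    m = matrix - matrix.mean(axis=0)
    c = covariate - covariate.mean()
    r = (c @ m) / (np.linalg.norm(c) * np.linalg.norm(m, axis=0))
    return float(np.abs(r).mean())

# Simulated data: two hypothetical ancestry groups with different allele frequencies.
rng = np.random.default_rng(0)
ancestry = rng.integers(0, 2, size=200).astype(float)
freqs = np.where(ancestry[:, None] == 1, 0.6, 0.2)
snps = rng.binomial(2, freqs, size=(200, 500)).astype(float)

corrected, pcs = remove_top_pcs(snps, n_pcs=2)
print("mean |r| with ancestry before:", mean_abs_corr(snps, ancestry))
print("mean |r| with ancestry after: ", mean_abs_corr(corrected, ancestry))
```

In this toy setting the first eigenvector tracks the simulated ancestry variable, so the corrected SNP columns are almost uncorrelated with it. Real genotype data would normally also be filtered and scaled before such a step.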
Correlations between modalities
Gene-impulsivity associations were established by examining correlations between the loading coefficients of the SNP and impulsivity-related components. To account for confounding factors, partial correlations between the loading coefficients of both modalities were computed using SPSS, controlling for calculated intelligence quotient scores, socio-economic status, age and sex. Only those component pairs surviving Bonferroni correction for multiple comparisons (P<0.05/(17 (SNP components) × 6 (impulsivity-related components))) were considered for further examination. A post hoc power calculation was performed using G*Power software ([108]http://www.gpower.hhu.de/) on the genotype–phenotype correlation pairs that survived the multiple comparison-corrected statistical threshold, to ensure that our sample adequately controlled the possibility of type II errors.

Pathway analysis
Genes corresponding to dominant SNPs from both genetic networks (GC1 and GC2) were selected using an arbitrary threshold of |z|>2.5. To correct for gene-size bias, a gene-based trait association value was calculated using VEGAS software.^[109]41 Genes with P<0.05 were input for enrichment analysis in the Metacore-based annotation software GeneGo ([110]https://portal.genego.com/) and ConsensusPathDB ([111]http://cpdb.molgen.mpg.de/). Both ConsensusPathDB enrichment analysis and GeneGo allowed examination of pathway and/or gene ontology categories corresponding to gene sets in each component. The quantitative enrichment scores were calculated using a hypergeometric approach to estimate the likelihood that significant genes were overrepresented in particular biological pathways. To correct for multiple comparisons, significance values were adjusted using the false-discovery rate.^[112]42

Results

Genetic-impulsivity associations
No significant inflation was noted in the association between loading coefficients and SNP data (see [113]Figure 2 for Q–Q plot).
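For readers who want to reproduce the logic of the partial-correlation step and its Bonferroni threshold (0.05 divided by 17 × 6 component pairs), a minimal Python sketch follows. The study itself used SPSS, so this is only an assumed equivalent: both variables are residualized on the covariates by least squares, the residuals are correlated, and the result is tested with a t distribution on n − 2 − k degrees of freedom (k covariates). The function names and the simulated data are hypothetical.

```python
import numpy as np
from scipy import stats

def p_from_r(r, dof):
    """Two-sided P-value for a (partial) correlation r with dof degrees of freedom."""
    t = r * np.sqrt(dof / (1.0 - r ** 2))
    return float(2.0 * stats.t.sf(abs(t), dof))

def partial_corr(x, y, covariates):
    """Correlation of x and y after regressing the covariates out of both."""
    design = np.column_stack([np.ones(len(x)), covariates])
    # Residualize each variable on the intercept plus covariates.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(np.corrcoef(res_x, res_y)[0, 1])
    dof = len(x) - 2 - covariates.shape[1]
    return r, p_from_r(r, dof)

# Bonferroni threshold for 17 SNP components x 6 impulsivity components.
alpha = 0.05 / (17 * 6)

# Simulated example: x and y are related only through the first covariate.
rng = np.random.default_rng(1)
covariates = rng.normal(size=(426, 4))  # stand-ins for IQ estimate, SES, age, sex
x = 2.0 * covariates[:, 0] + rng.normal(size=426)
y = 2.0 * covariates[:, 0] + rng.normal(size=426)

raw_r = float(np.corrcoef(x, y)[0, 1])
partial_r, partial_p = partial_corr(x, y, covariates)
print("Bonferroni alpha:", alpha)
print("raw r:", raw_r, "partial r:", partial_r, "partial P:", partial_p)
```

In the simulated data the raw correlation is large but the partial correlation is close to zero, which is the confounding pattern that controlling for covariates is meant to remove.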
Partial correlation controlling for calculated intelligence quotient, socio-economic status, age and sex revealed significant correlations between two independent impulsivity-related phenotypic components (IC1 and IC2) and two genetic components (GC1 and GC2). GC1 contained 618 SNPs from 304 genes and GC2 comprised 643 SNPs from 322 genes. The most significant impulsivity-related measures represented in IC1 were the reward responsiveness and Behavioral-Inhibition System scale scores of the BIS/BAS scale.^[114]30 The most significant impulsivity-related measures represented in IC2 were the non-planning subscale score of the BIS-11 (ref. [115]29) and the area under the curve score from the EDT.^[116]36 IC1 correlated negatively with GC1 (r=−0.19, P=0.00008) and IC2 correlated positively with GC2 (r=0.22, P=0.000002). Scatter plots of both component pairs are shown in [117]Figure 3. The top 20 most significant genes from each of the genetic components GC1 and GC2 are listed in [118]Tables 2 and [119]3, respectively. Post hoc power analysis revealed that the power attained for the IC1–GC1 and IC2–GC2 correlation pairs was 99.6% and 98.1%, respectively.

Figure 2. Quantile-Quantile (Q–Q) plot of P-values for (a) IC1 and (b) IC2.

Figure 3. (a) Scatter plots of loading coefficients of gene cluster GC1 and impulsivity component IC1; and (b) scatter plots of loading coefficients of gene cluster GC2 and impulsivity component IC2.

Table 2. List of the top 20 genes in GC1.

SNP | Gene | Name | CHR | ZS | RW | Function | Associated disease and/or behavior
----|------|------|-----|----|----|----------|-----------------------------------
rs2269426 | TNXB^a | Tenascin XB | 6p21.3 | −8.66 | 1.00 | Mediates interactions between cells and extracellular matrix | SZ
rs2734335 | C2 | Complement component 2 | 6p21.3 | 7.60 | 0.87 | Part of complement system | Autoimmune disease, obesity
rs2072633 | RDBP | Negative elongation factor complex member B | 6p21.3 | 7.44 | 0.85 | Regulates elongation of transcription by RNA polymerase | Unknown
rs2559639 | CHST11^a | Carbohydrate sulfotransferase 11 | 12q23.3 | 6.90 | 0.79 | Catalyzes transfer of sulfate | Marijuana abuse
rs9266231 | HLA-B^a | MHC class I, B | 6p21.3 | 6.79 | 0.78 | Immune system | MS, SZ, BP
rs2249742 | HLA-C^a | MHC class I, C | 6p21.3 | −6.54 | 0.75 | Immune system | Psoriasis, SZ, BP
rs4151657 | CFB | Complement factor B | 6p21.3 | −6.54 | 0.75 | Part of complement system | SZ
rs3134798 | NOTCH4^a | Notch4 | 6p21.3 | 6.38 | 0.73 | Cognition, brain development | SZ, AD, BP
rs6931646 | HLA-DRA^a | MHC class II, DR alpha | 6p21.3 | 6.12 | 0.70 | Immune system | AD, BP, PD, obesity
rs2844519 | MICA | MHC class I polypeptide-related sequence A | 6p21.33 | 5.43 | 0.62 | Antigen presentation | AD
rs151719 | HLA-DMB^a | MHC class II, DM beta | 6p21.3 | 5.27 | 0.60 | Peptide loading of MHC class II molecules by helping release the CLIP | SZ, MS, obesity
rs1787729 | DCC^a | Deleted in colorectal carcinoma | 18q21.3 | 5.20 | 0.60 | Axon and neuronal guidance | SZ, depression
rs2741566 | PIGT | Phosphatidylinositol glycan anchor biosynthesis, class T | 20q12–q13.12 | 5.01 | 0.57 | Component of GPI transamidase complex | Unknown
rs1511179 | CTNNA2^a | Catenin, alpha 2 | 2p12-p11.1 | −4.66 | 0.53 | Cell–cell adhesion and differentiation in nervous system | Excitement seeking/risk taking, AD, ADHD
rs2213565 | HLA-DQA2^a | MHC class II, DQ alpha 2 | 6p21.3 | 4.62 | 0.53 | Peptide loading of MHC class II beta chain | Obesity, BP, SZ
rs2544800 | SULT2B1 | Sulfotransferase family, cytosolic, 2B, member 1 | 19q13.3 | 4.61 | 0.53 | Catalyzes sulfate conjugation of many hormones, neurotransmitters, drugs and xenobiotic compounds | PD
rs1152663 | CTBP2^a | C-terminal binding protein 2 | 10q26.13 | 4.60 | 0.53 | Targets diverse transcription regulators | TBI
rs9664844 | PRKG1^a | Protein kinase, cGMP dependent, type I | 10q11.2 | 4.48 | 0.51 | Nitric oxide/cGMP signaling pathway | SZ, AD
rs3117578 | CSNK2B^a | Casein kinase 2, beta polypeptide | 6p21.3 | 4.46 | 0.51 | Wnt signaling pathway. Regulates basal catalytic activity of the alpha subunit | Unknown
rs7176717 | RORA^a | RAR-related orphan receptor A | 15q22.2 | −4.43 | 0.51 | DA/GLU signaling, circadian rhythms, learning | Autism, PTSD, Depression, BP, MDD

Abbreviations: AD, Alzheimer's disease; ADHD, attention-deficit hyperactivity disorder; BP, bipolar; CHR, chromosome; CLIP, class II-associated invariant chain peptide; MDD, major depressive disorder; MHC, major histocompatibility complex; MS, multiple sclerosis; PD, Parkinson's disease; PTSD, post-traumatic stress disorder; RW, rank weights; SNP, single-nucleotide polymorphism; SZ, schizophrenia; TBI, traumatic brain injury; ZS, Z-score.
^a Multiple SNP occurrence (>2) in gene network. Information provided was gathered from PubMed, genecards and gene associated databases. Refer to [139]Supplementary Table S4 for detailed references.