Abstract

One third of humans are infected lifelong with the brain-dwelling protozoan parasite Toxoplasma gondii. Approximately fifteen million of these have congenital toxoplasmosis. Although neurobehavioral disease is associated with seropositivity, causality is unproven. To better understand what this parasite does to human brains, we performed a comprehensive systems analysis of the infected brain. We identified susceptibility genes for congenital toxoplasmosis in our cohort of infected humans and found these genes are expressed in human brain. Transcriptomic and quantitative proteomic analyses of infected human, primary, neuronal stem and monocytic cells revealed effects on neurodevelopment and plasticity in neural, immune, and endocrine networks. These findings were supported by identification of protein and miRNA biomarkers in sera of ill children reflecting brain damage and T. gondii infection. These data were deconvoluted using three systems biology approaches: “Orbital-deconvolution” elucidated upstream, regulatory pathways interconnecting human susceptibility genes, biomarkers, proteomes, and transcriptomes. “Cluster-deconvolution” revealed visual protein-protein interaction clusters involved in processes affecting brain functions and circuitry, including lipid metabolism, leukocyte migration and olfaction. Finally, “disease-deconvolution” identified associations between the parasite-brain interactions and epilepsy, movement disorders, Alzheimer’s disease, and cancer. This “reconstruction-deconvolution” logic provides templates of progenitor cells’ potentiating effects, and components affecting human brain parasitism and diseases.

Subject terms: Disease genetics, RNA sequencing, Predictive markers, Microbiology

Introduction

The first half of the 20th century achieved remarkable advances in control of some communicable diseases, with development of immunizations, antimicrobial therapies, and increasing ability to identify new pathogenic organisms^1.
The second half shifted to understanding chronic degenerative diseases as prevailing causes of death in older populations. One primary challenge for contemporary medicine is to control non-communicable diseases with complex gene-environment etiologies and progression^2, 3, postulated to arise from complex interactive cascades of genetic and environmental factors. Historical efforts to find causes and cures for such illnesses, including brain diseases, have had a glaring omission of a significant environmental factor: over 2 billion humans are infected with the neurotropic parasite Toxoplasma gondii. Congenital and postnatal infections with T. gondii persist in all infected persons. The parasite interconverts between slow-growing, encysted bradyzoites and rapid-growing tachyzoites^4. In mice, T. gondii creates a chronic intra-neuronal infection and an inflammatory process^4. Mice with acute and chronic infection have alterations in neurotransmitters, memory, seizures, and neurobehaviors^5, 6. Some epidemiologic-serologic studies show associations between seropositivity for T. gondii and human neurologic diseases, for example, Parkinson’s and Alzheimer’s diseases^7, 8. Serologic studies of humans with diverse genetics are not optimal to detect strong associations or directionality. Epidemiologic associations also do not reveal parasite-modulated gene networks in human brain that could provide insights into how to cure and prevent resultant diseases. We need integrative approaches to examine relationships between brain parasitism and other brain diseases^9, 10, to provide a foundation to identify key pathways and molecules for drug and vaccine design.

To address these problems, we considered two central questions: (i) If chronic brain parasitism associates with other neurologic diseases, what are they?
and (ii) Which macromolecular networks in human brain are modulated by the parasite and lead to neuropathology, knowledge of which could underpin and facilitate design of treatments? We hypothesized that a systems approach integrating multiple levels of host-parasite interactions might resolve these questions. To gain insights into relationships between human T. gondii infections and brain disorders, we studied a unique cohort, the National Collaborative Chicago Based Congenital Toxoplasmosis Study (NCCCTS), and identified human susceptibility genes, as well as serologic biomarkers of active brain disease. The NCCCTS has diagnosed, treated and followed 246 congenitally infected persons and their families continuously beginning in 1981^11–39. Next in our study, we obtained new transcriptomic and proteomic data from infections of primary, human, neuronal stem cells, and monocytic cells that infiltrate brain, to determine whether there are different phenotypic effects of Type I, II, or III T. gondii tachyzoites. We used these four data sets to construct an integrated molecular model of the infected human brain. In a second phase, the model was further analyzed using systems biology approaches to provide insights into the molecular mechanisms by which T. gondii infection may cause disease. The broader goal was to provide a robust database and informatic analysis for the scientific community to use for their research. This report describes only a limited number of important observations. This effort to integrate multiple levels of intrinsic and extrinsic factors offers an original template to unravel pathogenesis of complex diseases in humans.

Our studies presented here were performed to examine genetic and parasite effects we hypothesized would influence outcomes (Fig. 1a). We utilized a novel systems approach expanding from gene-environment paradigms to include a third component of development (Fig. 1a). In Fig.
1a, there is intersection/overlap between the circles representing components of the host-parasite interactions. These circles include host-parasite genetics, toxoplasmosis susceptibility genes, and serum biomarkers in children in the NCCCTS [T. gondii infection, red circle], human neuronal stem cell functional assays including transcriptomics and proteomics [Brain pathology mechanisms, blue circle], and disease pathogenesis/pathology susceptibility genes for other diseases [others, green circle]. This overlap/intersection in the Venn diagram indicates the circumstances in which we hypothesize that manifestations of other diseases will occur. To test this hypothesis, we isolated the infected brain as a system. The sequence of the work presented herein, the structure of our studies, and the results are shown in a flow chart (Fig. 1b), a detailed outline (Fig. 1c), and a schematic diagram of the model created from this work (Fig. 1a and d). The overall plan was to gain access to the network interactome and biosignatures of T. gondii and the human brain by first reconstructing the T. gondii brain infection. Our first steps of reconstructing T. gondii brain infection included discovery, integration and systems analysis of our original data. These systematic analyses of our novel human data sets used empiric studies of human T. gondii infections of persons with toxoplasmosis and their families, and infectomes of primary, human brain stem and monocytic cells (Fig. 1b,c). In our work, the human infectome is defined as the human host and parasite molecules and pathways that are perturbed by the interaction of the human host with the parasite T. gondii, as others have characterized for other pathogens in earlier literature. An interactome is the whole set of molecular interactions such as protein-protein and genetic interactions.
This can provide a global “Omic” view of molecular effects as in the resource APID interactome (http://cicblade.dep.usal.es:8080/APID/init.action). To our knowledge, this is the only study in which a single human cohort, directly observed longitudinally in a variety of ways by a uniform group of examiners, is combined with Omics systems analysis of human, primary, neuronal stem cells for the T. gondii infectome and then interrogated for disease susceptibility genes/proteins. We obtained cellular data to test a few pathways relevant to pathogenesis. These were first considered individually and then together to determine whether we identified key biologic processes and biosignatures affected by T. gondii. Second, the integrated brain infectome became a global map that was deconvoluted to determine functional clusters and disease correlates (Fig. 1d). We termed this approach “Reconstruction and Deconvolution” (Fig. 1d).

Specifically, Reconstruction was based on our unique cohort of infected persons, most of whom showed neuropathologic symptoms^13, to identify novel human genetics of susceptibility to toxoplasmosis. Then, serologic biomarkers were studied for a limited number of infected humans as a readout of an infected brain. We selected neural stem cells to uncover potential developmental mechanisms because their multipotency is central to neurodevelopment and neuroplasticity. Most brain diseases result from abnormal neurodevelopment (e.g., epilepsy) and neuroplasticity (e.g., neurodegeneration). Hence, transcriptomic and proteomic infectomes of human neural stem cells were studied for parasite effects using primary cultures of cells from the hippocampus-temporal lobe. These datasets were integrated into a total brain infectome. They were then unraveled, or “deconvoluted”, to identify functional and disease correlates. This was accomplished by analyzing upstream regulators in all our datasets.
Thereby, we determined how different brain infectome components were interrelated. Cluster protein-protein interaction analysis revealed additional, functional correlates. Associations between the T. gondii brain infectome and other diseases that share these signature interactions were determined using our empirical, primary data.

Figure 1. Methodology and analyses for understanding interaction of Toxoplasma gondii with human brain. (a) Gene-environment-pathology paradigm. The Venn diagram shows model of pathogenesis with confluence of permissive host and parasite genetics, and exposure. (b) Flow diagram of empirical genetic and biomarker data from NCCCTS, transcriptomics and proteomics. (c) Structure of the manuscript. This includes original empiric data, methods for analyses, and contributions of components to analyses in each figure. *Empiric but not from NCCCTS cohort; **Cell culture, IFA, microarray gene expression, mRNAseq, miRseq, quantitative proteomics, miR qPCR. (d) Reconstruction and deconvolution analyses. Reconstruction is the discovery, integration and systems analysis of interrelatedness of four areas of primary, original data: genetics, transcriptomics and proteomics of infected cells and circulating serum biomarkers in ill persons. Deconvolution refers to the systems analysis that examines upstream regulatory genes, protein-protein cluster interactions and diseases with which biosignature pathways associate. These are the topics of the current work and are elaborated on throughout this manuscript. Image of family reproduced with their permission and also from “The Billion Brain Parasite”, Science Life (Easton, 2014).

Results

Reconstruction and deconvolution of Toxoplasma gondii brain infection

Reconstruction

1. Susceptibility genes expressed in human brain provide insight into signature pathways of T. gondii in human brain

Our unique NCCCTS cohort of persons with congenital toxoplasmosis and their families has been carefully characterized longitudinally^13, 40–44 (Fig. 1). Ongoing evaluations of this cohort contribute to the first phase of our present analyses, as shown in the flow diagram in Fig. 1b. Figure 1c presents an outline of the work in this study, including how the genetic and cohort analyses form a basis for the work. Figure 1d shows that these cohort and genetic analyses are integrated with other aspects of the current work in a model. Our NCCCTS cohort is the source of previously published analyses^11–39, which earlier provided a powerful tool to identify genes and pathways causing susceptibility to toxoplasmosis (Table 1^11–39, Figs 1 and 2). These previously identified susceptibility genes are considered along with susceptibility genes newly identified herein. All these genes are part of the further analyses in this manuscript. In our earlier work, the candidate genes with human susceptibility alleles identified in the NCCCTS were HLA Class I and II genes, ERAP1, COL2A1, ABCA4, P2RX7, ALOX12, NALP1, and IRAK4 (Table 1). Some of these gene/susceptibility associations were further confirmed using samples from the European Cohort study (EMSCOT)^12, 14. One association, with NOD2^39, was present in a Brazilian cohort with eye disease, but not found in the NCCCTS. Characterization of mechanisms associated with ERAP1 was extended in studies of cross presentation of antigen^45. Peptides of greater than the usual octamer/nonamer length that interact with MHC Class I gene products^45 also were identified, suggesting that T. gondii subverts its host’s immune defenses with aberrant splicing of polypeptides^46. These genetic data and their analysis are summarized in Table 1 and Fig. 2a and b.
The newly identified genes or phenotypes are labeled “AM” or “OD” in Table 1. These susceptibility alleles indicated that the candidate genes play a role in susceptibility to toxoplasmosis, and some have been studied for corresponding phenotypes to explain that susceptibility. The previously described genes are combined with newly identified genes for the analyses herein.

Table 1. Genes with Susceptibility/Resistance Alleles Defined with National Collaborative Toxoplasmosis Study and EMSCOT Cohorts.

| Gene | SNP (Allele) | P Value | Reason Candidate Gene | Replicate/Proof of Principle/Phenotype | Reference/Supporting Data |
|---|---|---|---|---|---|
| HLA Class II | DQ3 | <0.02 | MD | Hydrocephalus in children (DQ3), Fewer cysts in HLA transgenic mice (DQ1) | 11 |
| | DQ1 | <0.0005 | | | |
| COL2A1 | rs6823 (G) | <0.03 (brain) | ED | EMSCOT replicates, imprinted, brain and eye disease | 12, 13 |
| | rs2276455 (A) | <0.03 | | | |
| | rs2276455 (G)* | <0.0005 | | | |
| | rs1635544 (C) | <0.03 | | | |
| | rs2070739 (T) | <0.02 | | | |
| | rs2276454 (A) | <0.007 | | | |
| | rs3803183 (T) | <0.02 | | | |
| | rs3803183 (T)* | <0.003 | | | |
| ABCA4 | rs952499 (C) | <0.03 | HC | EMSCOT replicates, imprinted, localized in human brain | 12, 13, HC |
| | rs952499 (T)* | <0.005 | | | |
| | rs2297633 (G)* | <0.0003 | | | |
| | rs1761375 (G)* | <0.0001 | | | |
| | rs3112831 (C)* | <0.02 | | | |
| P2RX7 | rs1621388(C1772T) | <0.021 | OI | EMSCOT replicates for differing alleles; ATP mediated cell death, cytokine signaling, pro-inflammation | 14, 15, OD |
| | rs1718119(T1068C) | <0.015 | | | |
| HLA Class I | A | <0.01 | MD | Genotype association and phenotypes humans and mice. PBMC from cohort. Peptides for HLA A2, A11, B7 confer protection | 16–22, OD |
| | B | | | | |
| | C | | | | |
| ERAP1 | rs149173(T/C) | <0.0077 | LfL | Genotype association and phenotypes humans and mice | 16, OD |
| | rs17481856(C/T) | 0.0253 | | | |
| IRAK4 | rs1461567 | <0.023 | IOID | Genotype; phenotype, cell death, inflammation | 23 |
| | rs4251513 | <0.045 | | | |
| NALP1 | rs8081261 | <0.002 | MD TRNG | Genotype (MD region; human); phenotype, cell death, inflammation; MD | 24–30 |
| | rs11652907 | <0.02 | | | |
| | rs9902174 | <0.04 | | | |
| ALOX12 | rs6502997 | <0.0003 | MD TRNG | Genotype and phenotype, cell death proinflammation | 31 |
| | rs312462 (C) | <0.03 | | | |
| | rs6502998 (C) | <0.03 | | | |
| | rs434473 | <0.04 | | | |
| TLR9 | rs574386 (T1905C) | <0.008 | TLRs | Brazil and Poland replicates; phenotype, ligand | 32 |
| | rs352140 (C) | <0.0001 | | | AM |
| TIRAP | rs8177374(S180L) | <0.006 | IOID | Genotype; Phenotype TLR signaling and cytokines | AM |
| FOXQ1 | rs920209 | <0.02 | HC | Note NK cells mice | 33, OD |
| TREX1 | rs2242150(A/T) | 0.02 | SPD | Related clinical & Type 1 IFN phenotype, LfL | 34, 35, OD or AM |
| NFκB1 | rs997476(C/A) | <0.02 | CtoPiEA | Phenotype, nuclear localization, signaling pathway | 36, OD |
| TGFβ1 | rs10417924 (G overtranscribed) | 0.016 | CtoPiEA, MD | Phenotype transcriptomics, GRA1 | 37, 38 |
| NOD2 (Brazil) | rs3135499 (C/A)† | <0.04 | LfL | Brazil Eye Disease, IL17, CD4+ | 39 |

Abbreviations: Reason for selection of gene to test sequentially as single candidate gene in NCCCTS (1981–2016): MD, Murine model data; ED, Eye disease in humans caused by mutation of this gene; HC, Hydrocephalus caused in humans by gene mutation and adult macular degeneration associated with allelic variants; OI, Implicated alleles for other infections; LfL, Logical to test from literature and other findings, for example MHC Class I presentation of antigen for ERAP1; IOID, Gene in other diseases in the literature; MD TRNG, Toxo 1 region, not gene initially in humans based on rat Toxo 1 region; TLRs, Testing TLR genes now replicated by other cohorts and proven to be important in mice; SPD, Similar pattern of brain disease as AG brain disease due to DNA ligase mutations; CtoPiEA, Central to genetic pathway identified with original analysis that led to TDT analysis herein. Note: LfL Brazil, Minas Gerais, did not replicate in US full cohort; no *, significant in NCCCTS not EMSCOT; *, significant in EMSCOT cohort not NCCCTS. P values are nominal. Supporting Data [SD]: References (#) with narrative