Graphical abstract

   graphic file with name fx1.jpg
   [39]Open in a new tab

Highlights

     * •
       A high-quality molecular network for schizophrenia (SCZ Network)
     * •
       A landscape of molecular pathways linking immune activation and
       schizophrenia
     * •
       The spatiotemporal network dynamics revealing stages susceptible to
       immune activation
     * •
       Identification of the molecular pathways and regulators in the
       immune-activated brain
     __________________________________________________________________

   Clinical neuroscience; Immunology; Bioinformatics; Biocomputational
   method; Systems biology

Introduction

   Schizophrenia (SCZ) has a lifetime incidence rate around 1%, and is
   associated with patient disability and severe socioeconomic burdens for
   families and our entire society ([40]Knapp et al., 2004; [41]Saha
   et al., 2007). It is a complex polygenic disorder with a wide range of
   psychiatric symptoms that are generally categorized as positive,
   negative, or cognitive ([42]International Advisory Group for the
   Revision of ICD-10 Mental and Behavioural Disorders, 2011; [43]Janca
   et al., 1993). Schizophrenia has a heritability of about 80%
   ([44]Cannon et al., 1998; [45]Farmer et al., 1987) and its genetic risk
   factors show very high heterogeneity. Studies of functional genetics
   have identified several schizophrenia susceptibility genes, such as
   DISC1 ([46]Blackwood et al., 2001), PDE4B ([47]Millar et al., 2005),
   NPAS3 ([48]Yu et al., 2014), and NRG1 ([49]Harrison and Law, 2006).
   Many SCZ-associated loci have been identified through genome-wide
   association studies (GWASs) ([50]O'Donovan et al., 2008; [51]Polednak,
   1991; [52]Ripke et al., 2013; [53]Schizophrenia Psychiatric Genome-Wide
   Association Study, 2011). Recently, two large cohort GWASs have
   identified robust and replicable common alleles with significant
   associations to schizophrenia ([54]Pardinas et al., 2018;
   [55]Schizophrenia Working Group of the Psychiatric Genomics, 2014). The
   major histocompatibility complex (MHC) has been found to be associated
   with schizophrenia by several groups ([56]International Schizophrenia
   Consortium et al., 2009; [57]Pardinas et al., 2018; [58]Schizophrenia
   Working Group of the Psychiatric Genomics, 2014; [59]Shi et al., 2009;
   [60]Stefansson et al., 2009; [61]Walters et al., 2013). Some hotspots
   of genomic deletions or duplications have been identified at 1q21.1,
   15q11.2 and 15q13.3, 22q11.2, and 16p11.2 ([62]Rees et al., 2014;
   [63]Szatkiewicz et al., 2014). In addition, whole exome sequencing
   discovered de novo mutations, which are notably enriched in
   glutamatergic postsynaptic proteins comprising of activity-regulated
   cytoskeleton-associated protein (ARC) and N-methyl-D-aspartate receptor
   (NMDAR) complexes ([64]Fromer et al., 2014; [65]Xu et al., 2011),
   suggesting that a dynamic synapse regulation may play a role in the
   disease development. Recent transcriptomic studies on the postmortem
   brain samples of the patients have identified SCZ-dysregulated genes in
   the brain, which are enriched for synaptic genes and immune genes
   ([66]Bowen et al., 2019; [67]Fromer et al., 2016).

   A great challenge of studying the molecular mechanisms underlying
   schizophrenia and of identifying drug targets is that some loci are
   common variants, and each may only play a small role in the
   pathogenesis. Rare variants are found to significantly increase the
   disease burden ([68]Fromer et al., 2014; [69]Singh et al., 2017; [70]Xu
   et al., 2011), and they are suggested to have bigger effects and thus
   higher penetrance ([71]Singh et al., 2017). Another challenge is that
   “gene-environment interaction” and “environment-environment
   interaction” may play a critical role in schizophrenia ([72]Khan and
   Powell, 2018). The “Two-Hit Hypothesis” of schizophrenia, proposed in
   1999, suggests that schizophrenia needs two events (“hits”) for its
   development: the first event (i.e., basic requirement) represents
   genetic factors such as germline mutations; and the second consists of
   environmental factors such as viral infections and social stressors
   ([73]Bayer et al., 1999; [74]Feigenson et al., 2014). Later on, it has
   been suggested that in-utero or early life factors, such as maternal
   infection and stress, may also serve as “the first hit” ([75]Estes and
   McAllister, 2016).

   In any case, accumulating evidence points to the important role of
   abnormal immune activation in the etiology of schizophrenia.
   Proinflammatory cytokine levels have been consistently reported to be
   increased in the serum ([76]Goldsmith et al., 2016; [77]Miller et al.,
   2011) and also in the prefrontal cortex ([78]Fillman et al., 2013;
   [79]van Kesteren et al., 2017) of schizophrenia patients. Prenatal
   exposure to maternal immune activation increases the risk of
   schizophrenia later in adult life ([80]Brown, 2012; [81]Brown and
   Derkits, 2010; [82]Estes and McAllister, 2016); adolescent stress is
   thought to be another driving factor ([83]Gomes and Grace, 2017).
   Moreover, as previously mentioned, genome-wide association studies have
   identified loci covering genes involved in immune function, such as the
   major histocompatibility complex (MHC) ([84]International Schizophrenia
   Consortium et al., 2009; [85]Pardinas et al., 2018; [86]Schizophrenia
   Working Group of the Psychiatric Genomics, 2014; [87]Shi et al., 2009;
   [88]Stefansson et al., 2009; [89]Walters et al., 2013). Transcriptome
   sequencing of postmortem brain samples found transcriptional alteration
   associated with widespread increasing inflammation in the brain of
   schizophrenia patients ([90]Lanz et al., 2019). A recent study found
   increased microglia-mediated synaptic pruning in SCZ patient-derived
   neural culture ([91]Sellgren et al., 2019). All these lines of evidence
   point to the critical role of abnormal immune activation in
   schizophrenia. However, studying the molecular mechanism that links
   abnormal immune activation to schizophrenia remains a major challenge.

   Despite the large number of genes involved in SCZ, patients display
   similar core phenotypic features, suggesting that SCZ risk genes may
   share a common molecular network. Therefore, a high-quality molecular
   network underlying this disease is urgently needed for illustrating the
   molecular mechanisms on how genetic and environmental factors may lead
   to the development of schizophrenia. Proteome-scale studies of protein
   interactions have shed light on the etiology of diseases and putative
   therapeutic targets ([92]Cusick et al., 2005; [93]Gonzalez and Kann,
   2012). High-throughput mapping of protein interactions ([94]Havugimana
   et al., 2012; [95]Hein et al., 2015; [96]Rolland et al., 2014; [97]Yang
   et al., 2016) and literature curation of small-scale studies
   ([98]Cusick et al., 2009) have accumulated huge amounts of
   protein-protein interaction data. Recently, the interactions from
   different sources have been integrated into a comprehensive scored
   human interactome network containing about 600,000 interactions ([99]Li
   et al., 2017). This vast amount of protein interaction data provides a
   potential molecular network for illustrating pathways underlying
   complex disorders such as schizophrenia. “Random walk” was first
   developed to study the network topology ([100]Lovász, 1993), and later
   “random walk with restart” (RWR) was introduced for Internet search
   engines ([101]Page, 1998). More recently, RWR has been utilized to
   search for novel disease candidate genes, using known disease genes as
   seeds in a molecular network ([102]Kohler et al., 2008;
   [103]Valdeolivas et al., 2019).

   Here, we have developed an approach to identifying disease-associated
   neighborhoods across the human interactome network, which we termed as
   “neighborhood walk” (NW). Utilizing the NW approach in combination with
   RWR, we have constructed a high-quality protein interaction network for
   schizophrenia (SCZ Network). This network is enriched in signaling
   pathways regulating immune response as well as synapse formation and
   modulation. By integrating the spatiotemporal gene expression data of
   human brain with the SCZ Network, we found that the immune response
   genes in this network are enriched at two developmental stages (mid- to
   late fetal stage and puberty stage). Therefore, it is likely that at
   these two developmental stages, the SCZ Network is most sensitive to
   the perturbation of immune activation. To further verify that the
   potential schizophrenia signaling pathways are sensitive to the
   perturbation of immune activation during the two critical stages, we
   induced immune activation at these two stages and carried out
   transcriptome sequencing on immune-activated mouse brains. By
   integrating the transcriptome data from immune-activated mouse brains
   with the SCZ Network, we have identified potential molecular pathways
   and key regulators, which may illustrate a detailed view of the
   mechanisms on how maternal immune activation during fetal development
   may increase the risk for schizophrenia with a second immune activation
   at puberty.

Results

Constructing a protein-protein interaction network for schizophrenia using
the “neighborhood walk” and “random walk with restart” methods

   To collect high-confidence schizophrenia candidate genes, we used a
   recently updated comprehensive database for schizophrenia genetic
   research (SZDB) ([104]Wu et al., 2017, [105]2020). The SZDB (version
   2.0) collects a total of 7,903 schizophrenia candidate genes from
   literature, which were identified by 7 different methods in 87 studies
   including newly published genome scale studies ([106]Wu et al., 2017,
   [107]2020). We re-evaluated each source data and selected a core set of
   1,836 high-confidence SCZ candidate genes, including 223 genes from 4
   linkage studies, 529 genes from 2 large-cohort genome-wide association
   studies, 153 genes from 77 copy number variation studies, 543 genes
   from 1 transcriptomic study on postmortem brains, 63 genes from 1
   imputation study, 307 genes from 3 studies on differential methylation
   and 165 genes from 14 studies on exome sequencing ([108]Figures S1A and
   S1B). Of the 1,836 high-confidence SCZ candidate genes, 1,720 are
   protein-coding genes, which were used as seed genes to expand SCZ
   candidate genes. Of these 1,720 seed genes, 55.4% (953 of 1,720) are
   reported in two or more studies ([109]Figure 1A), while 92.3% (1,588 of
   1,720) are found by only one of the seven methods ([110]Figure 1B),
   suggesting that each method may have its bias. To test the functional
   coherence of the selected seed genes, we mapped these 1,720 seeds to
   human interactome to measure their interconnectivity. The same number
   of randomly chosen coding genes from 7,903 candidate genes was used as
   control. Compared with the candidate genes in the source data, selected
   seed genes have a significantly larger main interconnected network
   (empirical p = 0.001, [111]Figure 1C), suggesting that these seed genes
   are more functionally related.

Figure 1.

   [112]Figure 1
   [113]Open in a new tab

   Construction of SCZ Network from human interactome using Neighborhood
   walk (NW) and Random walk with Restart (RWR) methods (See also
   [114]Figure S1)

   (A) The distribution of the number of studies (PubMed IDs) reporting
   the SCZ association of each of the core set of 1,720 high-confidence
   SCZ candidate genes: 767 genes are supported by only one study, 708
   genes by two studies, and 245 genes by at least 3 studies.

   (B) Distribution of the methods for identifying the SCZ association of
   each of the core set of 1,720 high-confidence SCZ candidate genes:
   1,588 genes supported by one method, 119 genes supported by two methods
   and 13 genes supported by at least 3 methods. The methods include
   linkage analysis, genome-wide association studies (GWAS), copy number
   variations (CNV), differentially expressed genes (DEG), imputation
   analysis, differentially methylated genes (DMG) and whole exome
   sequencing (WES).

   (C) The interconnectivity of selected core set of SCZ candidate genes
   (seeds). The core set of 1,720 high-confidence SCZ candidate genes were
   mapped to human interactome and an interconnected network of 1,298 seed
   genes was extracted. The same number of randomly chosen coding genes
   from 7,903 SCZ candidate genes in the source data were used as control.
   Compared with the 1,000 random samplings of SCZ candidate genes in the
   source data, selected seed genes have a significantly larger
   interconnected network (empirical p = 0.001).

   (D)Mapping seeds onto the human interactome. The 1,720 high-confidence
   SCZ candidate genes (as seeds) were mapped onto the comprehensive human
   interactome network which contains 17,210 nodes (1,592 seeds, 15,618
   nonseeds) and 597,859 edges.

   (E) A schematic for RWR and NW. Top: the genes in the human interactome
   network were ranked based on their proximity with “seed genes” using
   “random walk with restart (RWR)” approach to expanding SCZ candidate
   genes. Bottom: The neighborhoods in the human interactome network were
   defined and ranked based on their “seed ratio” using “neighborhood walk
   (NW)” approach.

   (F) Extracting SCZ Network from human interactome. Comparison between
   RWR-extracted network and NW-extracted network is shown as the Venn
   diagram. RWR-extracted subnetwork (RWR-Subnet): the genes with RWR
   score more than 1e-05 were included as expanded SCZ candidate genes and
   a network containing the 1,592 seed genes and 5,920 expanded SCZ
   candidate genes were extracted from the human interactome. NW-extracted
   subnetwork (NW-Subnet): The neighborhoods with seed ratio ≥75% were
   extracted from the human interactome and these neighborhoods were
   further combined into a network containing 1,592 seed genes and 3,858
   expanded candidate genes. The SCZ Network: The intersection of
   RWR-extracted subnetwork and NW-extracted subnetwork was taken as the
   disease network for schizophrenia, containing 4,573 nodes (1,592 seeds;
   2,981 candidates) and 192,392 edges.

   From a network perspective, products of the SCZ disease genes should
   interconnect at the molecular level to determine the complex traits of
   this disease. To identify additional SCZ candidate genes that may
   function together with the known SCZ candidate genes, we mapped the
   1,720 high-confidence SCZ candidate genes onto a high-quality
   comprehensive human interactome network ([115]Figure 1D), which
   contains 17,210 protein-coding genes (nodes) and about 600,000
   interactions (edges) ([116]Table S2) ([117]Li et al., 2017). Of the
   1,720 high-confidence SCZ candidate genes, 1,592 were located in the
   human interactome network. These 1,592 seed genes were more connected
   than randomly picked nodes with the same degrees ([118]Figures S1C and
   S1D), suggesting that the seed genes were functionally related. Using
   these seed genes, we applied two strategies and combined the results to
   construct a protein interaction network for schizophrenia
   ([119]Figure 1E). Firstly, we carried out the “random walk with
   restart” (RWR) approach to rank the genes according to their similarity
   with these seed genes in the human interactome network. Secondly, we
   developed a method, “neighborhood walk” (NW), to rank the neighborhoods
   according to their seed ratios and p-value, which were considered as
   the levels of association to schizophrenia.

   RWR has been adapted for ranking novel candidate genes using known
   disease genes as seeds in molecular networks ([120]Kohler et al., 2008;
   [121]Valdeolivas et al., 2019). Here, we used RWR to expand SCZ
   candidate genes in the human interactome ([122]Figure 1E). We extracted
   an RWR-retrieved subnetwork containing 7,512 genes (including 1,592
   seed genes and 5,920 expanded SCZ candidate genes) using RWR score =
   1 × 10^−5 as the cutoff for choosing the expanded candidate genes
   ([123]Figure 1F and [124]Table S3).

   The neighborhood walk (NW) approach was based on the idea that
   functionally related genes are more likely to cluster together in the
   same neighborhood in a molecular network. Therefore, the neighborhoods
   with higher level of seed density are more likely to be associated with
   schizophrenia. Using NW approach to searching for neighborhoods in the
   human interactome network based on seed density ([125]Figure S1E, See
   [126]STAR Methods for details), we were able to identify the network
   neighborhoods that may be involved in schizophrenia. Briefly, we
   initially mapped the SCZ candidate genes as “seeds” onto the human
   interactome network and assigned a risk score (R-score) to each seed
   gene according to the number of methods used in identifying the gene as
   an SCZ candidate gene ([127]Table S3). For each nonseed gene, we
   calculated a guilt score (G-score = the average R-score of its
   interactors) ([128]Table S3). For each edge, we calculated an edge
   weight between the two interacting nodes (W = the sum of the R-score
   and/or G-score) and a weighted edge length (WEL = 1/W) ([129]Table S3),
   and for any two nodes in the network, we calculated a weighted shortest
   distance
   [MATH: <mrow><mrow><mo>(</mo><mrow><mi>W</mi><mi>S</mi><mi>D</mi><mo
   linebreak="badbreak">=</mo><mo>∑</mo><mrow><mo>(</mo><mrow><mn>1</mn><m
   o
   linebreak="badbreak">/</mo><mrow><mi>W</mi><mi>i</mi></mrow></mrow><mo>
   )</mo></mrow></mrow><mo>)</mo></mrow></mrow> :MATH]
   ([130]Figure S1E). To search for SCZ-associated neighborhoods, we
   started the “walking” from each seed and stopped at a potential
   neighborhood border at a certain walking distance (WSD) from the
   origin, i.e., if further “walking” to the next sphere made the seed
   ratio in the neighborhood drop significantly ([131]Figure S1E). We
   obtained 1,592 potential SCZ-associated neighborhoods (one for each
   seed) and ranked them according to the seed ratio and p-value
   ([132]Table S3). Three neighborhoods were chosen as examples to be
   visualized in the background of human interactome network using
   Cytoscape, and functional enrichment was performed to identify the main
   functional terms ([133]Figure S1F), showing topological and functional
   coherence. The neighborhoods with seed ratio ≥75% were selected and
   combined into a subnetwork (NW-based Subnetwork), which contains 1,592
   seed genes and 3,858 expanded SCZ candidate genes.

   The NW approach was designed based on the hypothesis that disease genes
   of a complex disease were functionally converged and thus clustered
   together in neighborhoods of the human interactome network. In order to
   test this hypothesis, we randomly picked 1,592 genes with exactly the
   same degrees as the 1,592 seeds and, starting with the
   degree-controlled fake seeds, we performed the same procedure using the
   same cutoff of 75% seed ratio for selecting “disease”-associated
   neighborhoods. The procedure was repeated for 100 times, and the best
   result showed only three very small retrieved neighborhoods (a total of
   eight genes, including 7 fake seeds and 1 “expanded candidate gene”)
   ([134]Figure S1G). This result demonstrates that NW exclusively
   retrieves disease-associated neighborhoods based on functional
   convergence of disease genes. We also performed RWR for 100 times
   starting with 1,592 fake seeds using the cutoff 10ˆ-5
   ([135]Figure S1H). Using fake seeds, RWR retrieved more “expanded
   candidate genes” than using real seeds, suggesting that RWR has no
   preference for seed genes with functional convergence.

   NW-retrieved subnetwork and RWR-retrieved subnetwork have 54.5%
   (4,573/8,389) nodes in common ([136]Figure 1F). We took the
   intersection of these two subnetworks as the final network for
   schizophrenia (SCZ Network) ([137]Figure 1F, [138]Table S4) for
   downstream analysis, which contains 1,592 seed genes and 2,981 expanded
   candidate genes. This network only has one major component and four
   isolated nodes.

Evaluation on the SCZ Network

   In order to evaluate the quality of the expanded candidate genes in the
   SCZ Network, we used independent datasets to compare SCZ relevance
   between the expanded candidate genes and the seed genes. We obtained
   the independent SCZ-associated gene sets from three recently published
   systematic studies ([139]Table S5), including: (1) genomic variants
   associated with schizophrenia ([140]Howrigan et al., 2020; [141]Rees
   et al., 2020); (2) dysregulated genes in the postmortem brain of
   schizophrenia patients (SCZ-dysregulated genes) ([142]Bowen et al.,
   2019). We calculated the recovery ratios of these two SCZ-associated
   gene sets by the seed genes, the expanded candidate genes in the RWR-
   or NW-extracted subnetworks, the intersection of these two subnetworks,
   and the nonseed genes in the background network (the human interactome
   network) respectively ([143]Figures 2A and 2B). Compared to the nonseed
   genes in the background network, both the seed genes and the expanded
   candidate genes in the RWR- or NW-extracted subnetworks show
   significant enrichment of SCZ-associated variants (Two-tailed Fisher’s
   exact tests, p = 6.17e−26 for seed genes, p = 1.37e−07 for the expanded
   candidates by NW, p = 3.27e−07 for the expanded candidates by RWR, p =
   1.43e-7 for the expanded candidate genes identified by both,
   [144]Figure 2A). In addition, compared to the nonseed genes in the
   background network, both the seed genes and the expanded candidate
   genes by RWR and NW show significant enrichment of SCZ-dysregulated
   genes (Two-tailed Fisher’s exact tests, p = 7.49e−3 for seed genes, p =
   3.23e−10 for the expanded candidates by NW, p = 1.03e−21 for the
   expanded candidates by RWR, p = 1.83e-13 for the expanded candidate
   genes identified by both, [145]Figure 2B). Furthermore, the seed genes
   are more enriched with SCZ-associated variants than the expanded
   candidates (Two-tailed Fisher’s exact test, p = 2.07e-11 for the
   expanded candidate genes by NW, p = 3.61e-15 for the expanded candidate
   genes by RWR, p =1.89e-9 for the expanded candidate genes identified by
   both, [146]Figure 2A), but show similar enrichment of SCZ-dysregulated
   genes compared with the expanded candidate genes (Two-tailed Fisher’s
   exact test, p = 0.64 for the expanded candidate genes by NW, p = 0.44
   for the expanded candidate genes by RWR, p = 0.14 for the expanded
   candidate genes identified by both, [147]Figure 2B). These results
   suggest that previous studies might be biased towards genetic variants,
   while the RWR or NW methods did not exhibit such bias. Using a core set
   of high-quality SCZ candidate genes as seed genes, RWR and NW performed
   equally well according to the enrichment of SCZ-variants in the
   expanded candidate genes (Two-tailed Fisher’s exact test, p = 0.32 for
   SCZ-variants, p = 0.72 for SCZ-DEGs). Furthermore, the expanded
   candidate genes by both RWR and NW show greater enrichment of the
   SCZ-associated genes than those expanded by RWR alone (Two-tailed
   Fisher’s exact test, p = 8.24e-3 for SCZ-associated variants,
   [148]Figure 2A) or NW alone (Two-tailed Fisher’s exact test, p =
   1.84e-4 for dysregulated genes) ([149]Figure 2B), showing that the
   combination of these two methods has better performance than either
   one. These results support our decision on choosing the intersection of
   RWR-retrieved subnetwork and NW-retrieved subnetwork as high-confidence
   SCZ Network.

Figure 2.

   [150]Figure 2
   [151]Open in a new tab

   Validation of the SCZ Network using independent data sets of variants
   and dysregulated genes associated with schizophrenia (See also
   [152]Figure S2)

   (A) Box and whisker graph for the fractions of SCZ-associated variants
   in the expanded candidate genes in SCZ Network (SCZ-Net), NW-extracted
   subnetwork (NW-Subnet) and RWR-extracted subnetwork (RWR-Subnet),
   compared with those in the seeds in human interactome (seeds-HI) and
   the nonseeds in human interactome (nonseeds-HI).

   (B) Box and whisker graph for the fractions of SCZ-dysregulated genes
   in the expanded candidate genes in SCZ Network (SCZ-Net), NW-extracted
   subnetwork (NW-Subnet) and RWR-extracted subnetwork (RWR-Subnet),
   compared with those in the seeds in human interactome (seeds-HI) and
   the nonseeds in human interactome (nonseeds-HI).

   (C) Box and whisker graph for the fractions of SCZ-associated variants
   in the SCZ Network (SCZ-Net), NW-extracted subnetwork (NW-Subnet) and
   RWR-extracted subnetwork (RWR-Subnet), compared with those in HI (human
   interactome).

   (D) Box and whisker graph for the fractions of SCZ-dysregulated genes
   in the SCZ Network (SCZ-Net), NW-extracted subnetwork (NW-Subnet) and
   RWR-extracted subnetwork (RWR-Subnet), compared with those in HI (human
   interactome). P-values were calculated using Two-sided Fisher’s exact
   test. P-values on lines above or below the boxes are calculated from
   comparisons between the two groups. Lines extending vertically from the
   boxes indicate variability outside the upper and lower quartiles,
   estimated using bootstrapping method with 1,000 resamplings.

   We further compared the recovery ratios of the SCZ-associated gene sets
   by the SCZ Network with those by RWR/NW-extracted subnetworks or the
   background network ([153]Figures 2C and 2D). Compared to the background
   network, the extracted subnetworks using RWR, NW or both (SCZ Network)
   show significant enrichment with SCZ-associated variants (p = 5.23e-18
   for RWR-extracted subnetwork; p = 4.52e-22 for NW-extracted subnetwork,
   p = 4.81e-24 for SCZ Network, [154]Figure 2C) and significant
   enrichment with SCZ-dysregulated genes (p = 4.07e-23 for RWR-extracted
   subnetwork; p = 1.02e-11 for NW-extracted subnetwork, p = 1.57e-14,
   [155]Figure 2D). Compared to RWR/NW-extracted subnetworks, SCZ Network
   showed both increased enrichment with SCZ-associated variants
   (Two-tailed Fisher’s exact tests, p = 8.78e-4 compared to NW-extracted
   network, p = 1.17e-08 compared to RWR-extracted subnetwork,
   [156]Figure 2C) and increased enrichment with SCZ-dysregulated genes
   (Two-tailed Fisher’s exact tests, p = 5.03e-3 compared to NW-based
   subnetwork, [157]Figure 2D). All these results demonstrate that both
   RWR and NW approaches have identified additional SCZ candidate genes,
   which show significant enrichment of SCZ-associated genes, and that the
   combination of RWR and NW had significantly better performance than RWR
   or NW alone, suggesting that the SCZ Network constructed using the
   combination of RWR and NW methods may be a highly SCZ-relevant
   molecular network.

   We further performed enrichment analysis on MetaCore disease biomarkers
   of mental disorders for the expanded candidate genes and seed genes in
   the SCZ Network. We found that biomarkers for several mental disorders,
   including schizophrenia, schizophrenia spectrum disorder, major
   depressive disorder, neurodevelopmental disorders, bipolar disorders,
   autism spectrum disorder, etc., are significantly enriched with both
   the seed genes and the expanded candidate genes ([158]Figure S2,
   [159]Table S5). These results further confirm the high quality of the
   SCZ Network, and also suggest a potential common molecular mechanism
   for the comorbidity of these mental disorders. The expanded candidate
   genes and seed genes are enriched with biomarkers for ‘mental
   disorders’ in equal magnitude, but the seed genes show greater
   enrichment for ‘schizophrenia’, probably due to the bias of
   schizophrenia biomarkers towards well known SCZ candidate genes.

SCZ network is enriched in synaptic and immune signaling pathways

   In order to explore the molecular mechanisms underlying schizophrenia,
   we performed pathway enrichment analysis using MetaCore and KEGG
   databases. MetaCore pathway database is a comprehensive pathway
   database, and therefore, the enriched MetaCore pathways may provide a
   list of potential SCZ-associated pathways in the SCZ Network
   ([160]Table S6-1), which may help further illustrate the molecular
   mechanisms of schizophrenia. We examined literature for the 10 most
   significantly enriched pathways, including ROS-induced cellular
   signaling, GPCR signaling, MET and RON signaling, cytokine/JAK/STAT
   signaling, GDNF signaling, EGFR signaling, etc., ([161]Figure S3A), and
   found that each of these pathways is well reported to associate with
   both schizophrenia and immunity by multiple studies. Interestingly, all
   these pathways are also reported to be involved in synaptogenesis or
   synapse modulation (See [162]Table S6-2 for literature information).
   These top enriched MetaCore pathways, each suggested by literature to
   be associated with immune activation, synaptic regulation, and
   schizophrenia, may provide us with a direction for further exploring
   the SCZ Network. The enriched KEGG pathways would provide functional
   information of the network ([163]Table S6-3). We compared the top
   enriched KEGG pathways between the seed genes, the expanded candidate
   genes and all genes ([164]Figure 3A). Among these pathways, some
   pathways such as ‘MAPK signaling’, ‘cAMP signaling’, ‘ErbB signaling’,
   ‘GABAergic synapse’, ‘dopaminergic synapse’, ‘Alzheimer's disease’, and
   ‘circadian entrainment’ are common between the seed genes and the
   expanded candidate genes ([165]Figure 3A); some pathways such as
   ‘calcium signaling’, ‘neuroactive ligand-receptor interaction’ and
   ‘nicotine addiction’ are only found for seed genes ([166]Figure 3A),
   while other pathways such as ‘viral infections’, ‘bacterial
   infections’, ‘viral carcinogenesis’, ‘pancreatic cancer’, ‘cellular
   senescence’ and ‘ubiquitin-mediated proteolysis’ are found for the
   expanded candidate genes ([167]Figure 3A). Interestingly, the pathways
   that are only enriched with the expanded candidate genes are mostly
   related to immune response and cancer, suggesting that the network
   approach may compensate the genetic approaches in identifying some SCZ
   candidate genes involved in immunity. All these top enriched KEGG
   pathways are documented to associate with schizophrenia or psychiatric
   disorders (See [168]Table S6-4 for literature information). The
   literature information on the top enriched MetaCore and KEGG pathways
   suggests a close link between immune activation, synapse function, and
   schizophrenia. It is consistent with previous studies which discovered
   the involvement of immune system ([169]Brown, 2012; [170]Brown and
   Derkits, 2010; [171]Estes and McAllister, 2016; [172]Fillman et al.,
   2013; [173]van Kesteren et al., 2017) and synaptic loss and impaired
   synaptic plasticity ([174]Boksa, 2012; [175]Forsyth and Lewis, 2017;
   [176]Keshavan et al., 1994; [177]Osimo et al., 2019; [178]Sellgren
   et al., 2019; [179]Stephan et al., 2006) in the development of
   schizophrenia.

Figure 3.

   [180]Figure 3
   [181]Open in a new tab

   SCZ network is enriched for synaptic and immune function (See also
   [182]Figure S3)

   (A) Functional enrichment analyses on the SCZ Network for the seed
   genes (known candidate genes) and the expanded candidate genes. The
   enriched KEGG pathways are those with overrepresentation of genes in
   the SCZ Network. Only top enriched KEGG pathways are shown. The sizes
   of the dots represent gene ratios and the colors represent the adjusted
   p-values. The pathways in immune response are in blue font, the
   synaptic signaling pathways (MAPK, cAMP or ErbB signaling pathways) are
   in green font; the synapse pathways are in cyan, other pathways related
   to cancer, Alzheimer's disease or circadian entrainment are in black.

   (B) Synaptic and immune genes in the SCZ Network. Left pie chart is
   split into 4 categories: (1) immune genes, (2) synaptic genes, (3) the
   genes annotated as both immune and synaptic genes, and (4) other genes.
   The bars on the right show the proportions of the seeds and the
   expanded candidate genes in categories (i), (ii) and (iii).

   (C) Enrichment of synaptic genes in the seed genes and the expanded SCZ
   candidate genes compared to the human interactome (p = 4.88e-47 for all
   genes in SCZ Network, p = 2.37e-27 for seed genes, p = 2.39e-15 for the
   expanded candidates). The seed genes are significantly more enriched
   with synaptic genes than the expanded candidates (p = 1.98e-4).

   (D) Enrichment of immune genes in the seed genes and the expanded
   candidate genes compared to the human interactome (p = 1.44e-42 for all
   genes in SCZ Network, p = 0.09 for seed genes, p = 4.22e-47 for the
   expanded candidates). The expanded candidate genes are significantly
   more enriched immune genes than seed genes (p = 1.37e-09).

   P-values were calculated using Two-sided Fisher’s exact test. P-values
   on lines above the top of boxes are from comparisons between two
   groups. Lines extending vertically from the boxes indicate variability
   outside the upper and lower quartiles, estimated using bootstrapping
   method with 1,000 resamplings.

   Consistent with the pathway enrichment results, the SCZ Network
   contains 10.4% (474/4,573) synaptic genes and 24.3% (1,109/4,573)
   immune genes ([183]Figure 3B, [184]Tables S6-5–S6-8), showing
   significant enrichment (Fisher’s exact tests, p = 4.88e-47 for synaptic
   genes, p =1.44e-42 for immune genes, [185]Figures 3C and 3D), compared
   to the background network which contains 4.2% (532/12,637) synaptic
   genes and 15.0% (1,901/12,637) immune genes. Compared to other genes in
   the background network (the human interactome), both the seed genes and
   the expanded candidate genes are significantly enriched with synaptic
   genes (Fisher’s exact tests, p = 2.37e-27 for seed genes; p = 2.39e-15
   for the expanded candidate genes, [186]Figure 3C), but the seed genes
   show stronger enrichment than the expanded candidate genes (Fisher’s
   exact test, p = 1.98e-4, [187]Figure 3C). However, only the expanded
   candidate genes are significantly enriched with immune genes (Fisher’s
   exact test, p = 0.09 for seed genes; p = 4.22e-47 for the expanded
   candidate genes, [188]Figure 3D). These results demonstrate that apart
   from synaptic genes our network approach has captured many additional
   candidate genes involving in SCZ-related immune response that have not
   been identified in previous genetic studies. The interconnection among
   the immune genes, synaptic genes and other genes in the SCZ Network
   ([189]Figure S3B) suggests a critical role of immune genes in the
   pathogenesis of schizophrenia, which provides the basis for systematic
   investigation on the molecular mechanisms linking immune activation to
   schizophrenia.

Spatiotemporal expression patterns of genes in SCZ network associates with
two susceptible developmental stages

   We are interested in understanding whether the spatiotemporal dynamics
   of the SCZ Network might reveal the stages and brain regions sensitive
   to immune activation and, more importantly, the underlying molecular
   pathways. We utilized the expression data from 4 regions of human
   brains at different developmental periods (periods 2-13, P2-P13)
   ([190]Kang et al., 2011; [191]Lin et al., 2015) to calculate the
   overrepresentation of the genes in the SCZ Network in different
   spatiotemporal phases ([192]Figures 4A and [193]S4; see [194]STAR
   Methods and [195]Table S7 for details). The p-values for the
   enrichments were calculated using one-side Fisher’s exact test. Region
   1 includes: posterior inferior parietal cortex, primary auditory (A1)
   cortex, primary visual (V1) cortex, superior temporal cortex, and
   inferior temporal cortex; region 2 includes: primary somatosensory
   cortex, primary motor cortex, orbital prefrontal cortex, dorsolateral
   prefrontal cortex, medial prefrontal cortex, and ventrolateral
   prefrontal cortex; region 3 includes: striatum, hippocampus, and
   amygdala; region 4 includes: mediodorsal nucleus of the thalamus, and
   cerebella cortex ([196]Figure 4A).

Figure 4.

   [197]Figure 4
   [198]Open in a new tab

   Spatiotemporal dynamic expression of the genes in the SCZ Network
   during brain development (See also [199]Figure S4)

   (A) A schematic of human spatiotemporal expression data source: 4 brain
   regions and 10 developmental periods. Region 1 includes: posterior
   inferior parietal cortex, primary auditory (A1) cortex, primary visual
   (V1) cortex, superior temporal cortex, inferior temporal cortex; region
   2 includes: primary somatosensory cortex, primary motor cortex, orbital
   prefrontal cortex, dorsolateral prefrontal cortex, medial prefrontal
   cortex and ventrolateral prefrontal cortex; region 3 includes:
   striatum, hippocampus and amygdala; region 4 includes: mediodorsal
   nucleus of the thalamus and cerebella cortex. Developmental periods
   including: early fetal development, early mid-fetal development, late
   mid-fetal development, late fetal development, neonatal and early
   infancy, late infancy, early childhood, middle and late childhood,
   adolescence and young adulthood (see [200]Figure S4).

   (B–E) Enrichment of different categories of genes of the SCZ Network in
   spatiotemporal phases. The heatmap shows the significance levels of
   overrepresentation of the genes of the SCZ Network in the
   spatiotemporal phases. One-sided Fisher’s exact test was used for the
   analyses of enrichments. The numbers in the squares represent the
   −log[10](p-value). (B) Enrichment of all genes of the SCZ Network in
   spatiotemporal phases; (C) Enrichment of the non-immune genes of the
   SCZ Network in spatiotemporal phases; (D) Enrichment of the synaptic
   genes of the SCZ Network in spatiotemporal phases; (E) Enrichment of
   the immune genes of the SCZ Network in spatiotemporal phases.

   The genes in the SCZ Network are significantly enriched in the regions
   1 and 2 (the cortex regions) during most of development periods
   (log[10]-transformed p-values shown in [201]Figure 4B). Stronger
   enrichment is found in region 1 during periods 3-6, 11-13 (P3-6: early
   fetal to late mid-fetal development, p = 1.66e-10 ∼ 6.60e-07; P11-13:
   middle childhood to young adulthood, p = 8.05e-06 ∼ 9.59e-11), in
   region 2 during periods 3-4, 5-7 (P3-4: early fetal to early mid-fetal
   development, p = 5.87e-06; P5-7: early mid- to late fetal development,
   p = 1.20e-07 ∼ 4.45e-06) and periods 12-13 (adolescence to young
   adulthood, p = 3.77e-11) and in region 3 during periods 12-13
   (adolescence to young adulthood, p = 4.85e-5) ([202]Figure 4B). The
   overrepresentation of the genes in the SCZ Network in the stages of
   early- to late fetal development and adolescence to young adult
   suggests that this network may be more susceptible to perturbation
   during fetal development and adolescence. Since 76% of the genes in the
   SCZ Network are nonimmune genes, the nonimmune genes show a very
   similar pattern as all genes in the network ([203]Figure 4C).

   Accumulating evidence suggests that synaptic plasticity impairment
   during development may play a role in the development of schizophrenia
   ([204]Boksa, 2012; [205]Forsyth and Lewis, 2017; [206]Keshavan et al.,
   1994; [207]Osimo et al., 2019; [208]Sellgren et al., 2019; [209]Stephan
   et al., 2006). We further explored the spatial temporal expression
   pattern of the synaptic genes of the SCZ Network ([210]Figure 4D).
   Synaptic genes are significantly enriched in region 1 during periods
   2-6 (P2-6: early fetal to late mid-fetal development, p = 7.6e-4 ∼
   0.01) and 10-13 (P10-13: early childhood to young adulthood, p =
   0.04–0.02); in region 2 during periods 5-6 (P5-6: early-to-late
   midfetal development, p = 0.03) and periods 12-13 (P12-13: adolescence
   to young adulthood, p = 0.03) ([211]Figure 4E). The overrepresentation
   of synaptic genes during periods from early fetal to late mid-fetal
   development and from early childhood to young adult suggests these
   stages are critical for synaptic development. In addition, these stages
   are coincident with the two critical stages for cortex development, in
   which cell differentiation and synapse remodeling may be influenced by
   the environment factors, respectively ([212]Selemon and Zecevic, 2015).

   To further identify the brain regions and developmental stages that are
   sensitive to immune activation, we further calculated the
   overrepresentation of the immune genes of the SCZ Network in the
   spatiotemporal phases. The immune genes are more significantly enriched
   in region 1 during periods 5-7, 11-13 (P5-7: early mid- to late fetal
   development p = 0.02–0.03; P11-13: middle childhood to young adulthood,
   p = 0.02–0.03); in region 2 during periods 6-7 (P6-7: late mid-to late
   fetal development, p = 0.007) ([213]Figure 4E). The overrepresentation
   of immune genes in this network during both early- to mid-fetal
   development and peripuberty suggests that it may be more easily
   affected by immune activation during these stages. These results are
   consistent previous findings that fetal and peripubertal developmental
   stages are the two key stages for schizophrenia: (1) maternal infection
   ([214]Brown, 2012; [215]Brown and Derkits, 2010; [216]Estes and
   McAllister, 2016), maternal stress ([217]Gomes et al., 2019) and
   placental diseases ([218]Kendell et al., 1996) during gestation
   increase the risk of schizophrenia in the offspring; (2) adolescent
   stress is also a risk factor for schizophrenia ([219]Gomes and Grace,
   2017). These results suggest that abnormal immune activation during
   these stages may perturb the molecular network in the brain,
   potentially leading to schizophrenia. On the other hand, many
   environmental factors, including infection and stress, induce
   peripheral and central immune activation ([220]Uhlhaas, 2011).

   Based on network topology, the immune genes, synaptic genes and other
   genes are interconnected with each other, and therefore immune
   activation may affect brain development and synapse remodeling. Recent
   studies show that immune genes in the brain may regulate neuronal
   synapse remodeling ([221]Boulanger, 2009; [222]Morimoto and Nakajima,
   2019). The complement component 4 (C4) participates in the synaptic
   pruning regulation ([223]Sekar et al., 2016). The immune activation
   increases synaptic pruning through microglia ([224]Sellgren et al.,
   2019). Our results are consistent with previous findings of the
   critical role of immune system in brain development ([225]Morimoto and
   Nakajima, 2019; [226]Schepanski et al., 2018). Overall, our findings
   illustrated the connection between immune activation and synapse
   plasticity during peripubertal developmental stages, further confirming
   that fetal and peripubertal developmental stages are critical for
   schizophrenia development ([227]Mottahedin et al., 2017). Since the
   typical onset time of schizophrenia is the transition period from
   adolescence to adulthood ([228]Uhlhaas, 2011), it may be possible that
   maternal immune activation during fetal development may prime the brain
   to be susceptible to a second immune activation at adolescence.

Immune activation during fetal development and puberty dysregulates signaling
pathways in SCZ network

   In order to further investigate how immune activation might be involved
   in the molecular mechanism underlying schizophrenia, we induced immune
   activation twice in mice: first during mid-late fetal development stage
   by intraperitoneal injection (IPI) of lipopolysaccharide (LPS) into
   pregnant mice at G16 (16^th day after gestation), and a second time
   during puberty by IPI of LPS into the male offspring at P30 (30^th day
   after birth) ([229]Figure 5A). We used saline as control, and designed
   the experiments in three combinations of saline and LPS treatments: (1)
   S-S (control group): saline at G16 + saline at P30; (2) L-S: LPS at
   G16 + saline at P30; (3) S-L: saline at G16 + LPS at P30; (4) L-L: LPS
   at G16 + LPS at P30 ([230]Figure 5A).

Figure 5.

   [231]Figure 5
   [232]Open in a new tab

   Immune activation during fetal development and puberty dysregulates
   signaling pathways in SCZ Network (See also [233]Figures S5–S7)

   (A) A schematic of experimental design. The first intraperitoneal
   injection (IPI) of LPS or saline to the pregnant mice was performed on
   G16 (Gestation Day 16) and the second IPI of LPS or saline to the
   offspring was performed on P30 (Postnatal Day 30). The combination of
   LPS and saline treatments: saline at G16 + saline at P30 (S-L); LPS at
   G16 + saline at P30 (L-S); saline at G16 + LPS at P30 (S-L); LPS at
   G16 + LPS at P30 (L-L).

   (B) The Venn diagram of the DEGs in the L-S group (yellow), S-L group
   (cyan) and L-L group (blue) compared with S-S group (p < 0.05 and
   |FC|≥1.3).

   (C–E) The threshold-free comparison of expression alteration between
   three treatment groups by rank-rank hypergeometric overlap (RRHO). The
   log[10]-transformed hypergeometric p-values are plotted in a heatmap as
   indicated by an accompanying color scale, mapping the degree of the
   statistical significance of overlaps between two differential
   transcriptomes (two ranked gene lists on the X-axis or Y-axis). Each
   pixel in a RRHO map represents the log[10]-transformed hypergeometric
   p-value for the overlap of subsections of two ranked lists from a point
   on the map (corresponding to a rank threshold pair: Rank X, Y) to
   either bottom left corner (Rank 1, 1) or top right corner (Rank N, N).
   The numbers on color scales indicate −log[10](p-value). The genes along
   each axis are sorted from most significantly up-regulated to most
   down-regulated from the lower left corners. The RRHO maps: L-S group
   versus L-L group (C); L-S group versus S-L group (D); S-L group versus
   L-L group (E).

   (F) Over-represented disease biomarkers of mental disorders in the DEGs
   of L-S, S-L and L-L groups identified via MetaCore “Diseases (by
   Biomarkers)”. The X-axis represents −log10 (FDR). Y-axis lists the
   enriched diseases. Vertical red dashed line shows the cutoff of FDR =
   0.05.

   (G) The node degree distribution of SCZ Network with top hub DEGs
   highlighted. Y-axis represents the gene count and X-axis represents the
   degree. Top 10% of genes have degrees ≥212 as indicated by the red
   line. The 10 DEGs among the top hub genes are marked (The degrees of
   all the DGEs in the SCZ Network are listed in [234]Table S9).

   (H) The MetaCore pathways enriched with DEGs. The Venn diagrams show
   the overlaps between the enriched pathways with DEGs and the enriched
   pathways with genes in SCZ Network: 18 of 20 (90%), 69 of 85 (81.2%)
   and 95 of 114 (83.3%). A total number of 100 enriched pathways in L-S,
   S-L or L-L groups are overlapped with enriched pathways of SCZ Network.
   These 100 enriched pathways are clustered into 18 groups by calculating
   the distance between two pathways based on shared genes within SCZ
   Network. The dot plot shows the enrichment of the pathways enriched
   with DEGs in L-S, S-L, L-L groups and with genes in SCZ Network. The
   red, yellow, cyan or blue dots show the −log[10] transformed FDR value
   of pathways in SCZ network, L-S, S-L, and L-L groups, respectively.
   Vertical purple dashed line shows the cutoff of FDR = 0.05. The 18
   representative pathways are listed on the bottom right corner. The
   information of all pathways is listed in [235]Table S9.

   We carried out poly-A selected RNA-seq sequencing on the whole brain at
   P32 to evaluate transcriptional changes and then performed the
   differential expression analysis using DeSeq2 ([236]Figures S5A and
   S5B, [237]STAR Methods). The differentially expressed genes (DEGs)
   identified were 170 from L-S group, 474 from S-L group and 927 from L-L
   group using S-S group as control (p< 0.05 and |Fold Change| ≥1.3,
   [238]Figure 5B, [239]Table S8), suggesting that L-L group have greater
   immune response than S-L group or L-S group. The overlap of DEGs is
   22.9% (39/170) between L-S and L-L group, while 52.3% (248/474) between
   S-L and L-L group ([240]Figure 5B). We then looked at the genome-wide
   comparison of differential expression patterns at threshold-free manner
   among these three groups using rank-rank hypergeometric pair-wise
   comparison ([241]Plaisier et al., 2010), which showed co-regulation of
   genes among the three groups ([242]Figures 5C–5E).

   We further performed MetaCore disease biomarker enrichment analysis in
   the DEGs of L-S, S-L and L-L groups (FDR <0.05). The top enriched
   mental diseases include ‘schizophrenia’, ‘schizophrenia spectrum and
   other psychotic disorders’, ‘bipolar’, ‘bipolar related disorder’,
   ‘major depressive disorder’ and ‘depressive disorder’ ([243]Figure 5F,
   [244]Table S9-1), suggesting that gene dysregulation by maternal and
   adolescent immune activation may underly multiple psychotic disorders.
   As described above ([245]Figure S2), the SCZ Network is also enriched
   for biomarkers of these same mental diseases. These results not only
   confirm the current perspective on the role of immune activation in the
   pathogenesis of psychiatric disorders, but also suggest potential
   common mechanisms for the comorbidity between these mental disorders
   reported in epidemiological studies ([246]Buckley et al., 2009).

   To evaluate the potential effects that the gene differential expression
   induced by immune activation may have on the SCZ Network, we mapped the
   DEGs on the SCZ Network and ranked the node degrees ([247]Figure 5G,
   [248]Table S9-2). Of 184 DEGs that are in the SCZ Network, 28 DEGs have
   a node degree above the average while 8 DEGs (ACTA2, ALDH1A2, PLK1,
   MYH11, CBL, MAP1LC3A, AURKB and RPL18A) are among the top 10% hub genes
   ([249]Figure 5G). These hub genes may play a key role in linking immune
   activation to schizophrenia. The literature mining for these top hub
   DEGs identified many studies about their association with
   schizophrenia, psychiatric disorders or nervous system related
   functions (See [250]Table S9-3 for literature information). For
   example, ACTA2 is significantly downregulated in the postmortem cortex
   of schizophrenia patients ([251]Mladinov et al., 2016), and
   downregulation of ACTA2 inhibits neural stem cell migration ([252]Zhang
   et al., 2020). The deletion of 15q11-13, which covers gene ALDH1A2, is
   suggested to be involved schizophrenia ([253]Reay and Cairns, 2020).
   ALDH1A2 is found to associate with schizophrenia in a large scale
   genome-wide association study ([254]Reay et al., 2020).

   In order to understand the molecular mechanisms on how the immune
   activation during gestation (the first hit) and during puberty (the
   second hit) may increase the risk for schizophrenia, we searched for
   MetaCore pathways enriched with DEGs of each group. Using a cutoff of
   FDR < 0.05, we identified 20 enriched pathways for DEGs in L-S group,
   85 for S-L group and 104 for L-L group. Of the enriched pathways, 90.0%
   (18/20) in L-S group, 81.2% (69/85) in S-L group and 83.3% (95/114) in
   L-L group are overlapping with the enriched pathways in SCZ Network
   ([255]Figures 5H and [256]S6, [257]Tables S9-4, S9-5, and S9-6). A
   total number of 121 enriched pathways are identified in the three
   groups of DEGs, and 100 are also enriched in SCZ Network ([258]Table
   S9-7). We also carried out pathway enrichment analysis on the seed
   genes and the expanded candidate genes of the SCZ Network separately,
   and found that the seed genes and the expanded candidate genes had
   similar proportion of enriched pathways overlapping with those of DEGs
   ([259]Figure S7). These results suggest that the majority of the
   dysregulated signaling pathways in the brain induced by immune
   activation during fetal development and/or during puberty may be
   involved in schizophrenia, consistent with the accumulating evidence
   that maternal and peripubertal immune activation increase the risk of
   schizophrenia ([260]Brown, 2012; [261]Brown and Derkits, 2010;
   [262]Estes and McAllister, 2016; [263]Gomes and Grace, 2017; [264]Gomes
   et al., 2019; [265]Kendell et al., 1996).

   The enrichment of these pathways was compared between the three groups
   and the SCZ Network ([266]Figures 5H and [267]S6). L-L group shows
   better enrichment in most of these pathways than L-S and S-L groups
   ([268]Figures 5H and [269]S6), suggesting that maternal immune
   activation during fetal development may increase the risk of
   schizophrenia after a second immune activation at puberty. In order to
   identify the main categories of pathways linking immune activation to
   schizophrenia, we clustered the enriched pathways based on shared genes
   within SCZ Network ([270]Figures 5H and [271]S6). Eighteen pathway
   clusters were identified, and one representative pathway was selected
   for each cluster. Some of the clusters, i.e., C2, C5, C17 and C18, are
   highly enriched with genes in SCZ Network ([272]Figure S6), suggesting
   that the disruption of pathway in these clusters may have more impact
   on the molecular network. The “first hit” and the “second hit” may have
   different effects on the pathways. The “first hit” alone affects only a
   small number of clusters, including the WNT signaling pathways in
   clusters C14 and C15 ([273]Figure S6), which are important for fetal
   brain development, while the “second hit” alone affects most of the
   clusters. On the other hand, the vast majority of clusters except a
   few, are more affected by ‘two hits’ (L-L group) than by the “second
   hit” alone. Examples of the exception include cluster C7 and C10, which
   are more affected by “second hit” (S-L group) than by ‘two hits’ (L-L
   group) ([274]Figure S6), suggesting that the “first hit” may have
   primed the brain to response differently to the “second hit”. All these
   18 representative pathways are well supported to associate with
   schizophrenia based on literature (See [275]Table S9-8 for literature
   information). For example, pathways for antigen presentation by MHC
   ([276]International Schizophrenia Consortium et al., 2009;
   [277]Pardinas et al., 2018; [278]Schizophrenia Working Group of the
   Psychiatric Genomics, 2014; [279]Shi et al., 2009; [280]Stefansson
   et al., 2009; [281]Walters et al., 2013), canonical (beta-catenin
   dependent) and noncanonical (beta-catenin independent) WNT signaling
   ([282]Hoseth et al., 2018; [283]Hussaini et al., 2014) are
   well-documented to be involved in schizophrenia.

Key regulators for immune activation in mouse brain

   The comparison of the brain transcriptome profiles among the L-S, S-L
   and L-L groups, suggest that injection of LPS to the mother during
   mid-late gestation stage may induce latent sensitization of the brain
   to a second injection of LPS to the offspring at adolescence
   ([284]Figures 5 and [285]S5). We assumed that the response patterns of
   the brain to immune activation is established and maintained by the
   gene expression programs through transcription factors, cofactors and
   chromatin regulators. In order to identify the major transcription
   factors (TFs) that may regulate the transcriptional alteration in these
   three groups, we searched for enriched TF-binding motifs in the
   promoter regions of the DEGs using HOMER ([286]Heinz et al., 2010). We
   found a total number of 24 TF-binding motifs corresponding to 26 TFs
   after removing the motifs whose corresponding TFs were not expressed in
   any of the transcriptomes in L-S, S-L, L-L or control groups
   ([287]Figures 6A and [288]S8A). These 24 motifs are enriched in
   up-regulated genes (ISRE, ZBTB12, MAFA and KLF9), down-regulated genes
   (NF1-halfsite, ARNT:AHR, NR2E1, SOX7, SOX17, SMAD4, SIX1, TWIST1, HIF1A
   and PAX3:FOXO1) or both (IRF2, IRF8, ZBTB1, PRDM1, SIX2, SPI1, IRF3,
   IRF1 and NR2F2): 6 for L-S group, 6 for S-L group and 16 for L-L group
   ([289]Figures 6A, [290]S8B, [291]Tables S10-1 and S10-2). There are 3
   common motifs (ISRE, ARNT:AHR and NF1-halfsite) between L-L and S-L
   group, and 1 common motif (IRF2) between L-L and L-S groups, but no
   common motifs between L-S and S-L groups ([292]Figures 6A and
   [293]S8A).

Figure 6.

   [294]Figure 6
   [295]Open in a new tab

   Key regulators for immune activation in mouse brain (See also
   [296]Figure S8)

   (A) The 24 binding motifs and corresponding transcription factors (TFs)
   predicted for DEGs of L-S, S-L and L-L groups using HOMER. The motif
   patterns are shown on the left. The colored bars on the right indicate
   the target DEGs in L-S, S-L and L-L groups.

   (B) The mRNA expression levels (detected in the bulk RNA-Seq) of the
   identified TFs. Dot size shows absolute value of log[2](fold change) in
   L-S, S-L and L-L groups. The colors represent the up-regulation (red)
   or down-regulation (blue). The asterisks indicate the p-values <0.5.

   (C) The 24 transcription factors (TFs) were clustered into 6 groups
   based on the correlation between their shared targets using Pearson
   correlation coefficient. Red and blue show the comparison pairs with a
   higher and lower correlation coefficient respectively. The colored bars
   on the top show the groups the DEGs belong to.

   (D) The Venn diagrams show the overlaps between the pathways enriched
   with the target DEGs of the TF groups and the pathways enriched with
   the genes in SCZ Network.

   (E) The comparison of the top enriched pathways among the 6 TF groups.
   The enrichment of the top 10 enriched pathways in each TF group was
   compared with enrichment of these pathways in other TF groups and SCZ
   Network. The sizes of the dots represent gene ratios and the colors
   represent FDR adjusted p-values.

   (F) Integrated protein-protein interactions and regulatory cascades of
   the identified TFs. Node size represents the degree. Red, yellow and
   grey node colors represent seeds, expanded candidates or other genes in
   SCZ Network, respectively. Cyan arrow edges and grey solid edges show
   the TF-to-target relationship and protein-protein interaction
   respectively.

   (G) The node degree distribution of SCZ Network with the identified TFs
   highlighted. The 11 of 26 identified TFs are in the SCZ Network.

   We checked the expression alteration of the 26 TFs corresponding to
   these 24 motifs by examining the transcriptomes of mouse brain of S-S,
   L-S, S-L, and LL groups. Of the 26 TF genes, 10 (Irf8, Spi1, Arnt, Nf1,
   Smad4, Six1, Six2, Twist1, Klf9 and Foxo1) in L-L group and 5 (Spi1,
   Nf1, Six2, Twist1 and Klf9) in S-L group showed significant expression
   alteration (Wald test, p < 0.05, [297]Figure 6B, [298]Table S10-3), but
   no TFs in L-S group showed significant expression change
   ([299]Figure 6B, [300]Table S10-3). The genes Irf8, Arnt, Smad4, Six1,
   and Foxo1 only showed significant expression change in L-L group, not
   in S-L group, suggesting that both “first hit” and “second hit” may be
   needed for expression alteration of these 5 TFs. We noticed that
   expression levels of some TFs were altered at an extent below the
   differential expression cutoff for TFs (p = 0.05) in L-S group. For
   example, Six1 showed 1.51-fold upregulation (Wald test, p = 0.13) in
   L-S group and a 1.14-fold change (p = 0.69, Wald test) in S-L group,
   but a fold change of 2.27 (Wald test, p = 0.001) in L-L group. These
   results demonstrated that, “first hit” or “second hit” alone was not
   able to significantly alter the expression level of some TFs while the
   combination of “two hits” was able to change their expression,
   suggesting that the “first hit” may prime the brain to make it more
   sensitive to a “second hit”. Such priming phenomenon has also been
   observed in early life stress in generating mouse model for depressive
   disorder, which makes the individuals more sensitive to social stress
   in adult life ([301]Pena et al., 2017).

   In order to find potential collaborative regulation among the 26 TFs
   during immune activation, we clustered the TF-binding motifs by
   calculating the correlation coefficient between two TF-binding motifs
   based on their shared target DEGs, and identified 6 groups including 5
   clusters and 1 singleton ([302]Figure 6C). We further performed
   MetaCore pathway enrichment analysis (FDR <0.05) for the TF targets in
   each group and compared the enriched pathways with those identified in
   SCZ Network ([303]Tables S10-4–S10-9). The majority of the enriched
   pathways were also enriched with SCZ Network: Group I: 88.9% (40/45);
   Group II: 98% (49/50); Group III: 86.8% (59/68); Group IV: 88.5%
   (46/52); Group V: 80% (20/25); Group VI: 75% (3/4). We compared the top
   10 enriched pathways from each group and found that each of these TF
   groups regulate both distinct and common pathways ([304]Figure 6E). For
   example, ‘Antigen presentation by MHC class I: cross-presentation’ is
   shared by Group 1 and Group 3, and ‘Immune response_IFN-gamma in
   macrophages activation’ is unique to Group 1. Most of these top
   enriched pathways (73%) are reported to be associated with
   schizophrenia or psychotic disorders (See [305]Table S10-10 for
   literature information). Therefore, together with the pathway clusters
   identified for the 100 SCZ-associated pathways ([306]Figures 5H and
   [307]S6; [308]Table S9), these 6 TF groups and their target pathways
   ([309]Figures 6C–6E; [310]Table S10) provide a framework for studying
   the immune activation-induced transcriptional regulation of the
   pathways involved in schizophrenia.

   In order to understand how the TFs may regulate each other to
   coordinate the immune responses after the “first hit” and/or the
   “second hit”, we mapped potential transcriptional regulation among
   these TFs using HOMER motif scan results for the promoter regions of
   the TF genes, and also integrated with the protein-protein interactions
   among these TFs using interaction data of the SCZ Network. This network
   contains 14 transcriptional regulatory relationships and 8
   protein-protein interactions among 15 TFs ([311]Figure 6F), where SIX1
   and SIX2 are TF-TF regulatory hubs, while SMAD4 is protein interaction
   hub, suggesting that SIX1, SIX2 and SMAD4 may play critical role in
   immune activation (see [312]Table S10-11 for literature information).
   SIX1 can be regulated by 5 TFs (SIX2, PRDM1, IRF8, IRF3 and SPI1) while
   SIX2 can regulate three genes (SIX1, TWIST1 and IRF8) and can be
   regulated by itself and two other TFs (ZBTB18 and NR2F2). As described
   above, SIX1 is only upregulated in mice with LPS treatments during both
   gestation and puberty ([313]Figure 6B), suggesting that the priming by
   “first hit” during embryonic development may be necessary for the
   transcriptional upregulation of this genes after the “second hit”
   during puberty. It is reported in a recent study that SIX1 and SIX2
   control the activity of inflammatory genes such as interferon (IFN) in
   response to noncanonical NF-κB pathway activation to protect cell from
   inflammatory damage ([314]Liu et al., 2019). Multiple studies also
   reported the dysregulation of NF-κB signaling pathway in cortical
   immune activation in schizophrenia ([315]Roussos et al., 2013;
   [316]Volk et al., 2019). As an interaction hub ([317]Figures 6F and
   6G), SMAD4 binds 4 TFs (IRF3, NR2F2, NF1 and FOXO1). IRF3 ([318]Xu
   et al., 2014), NR2F2 ([319]Qin et al., 2013), NF1 ([320]Kretova et al.,
   2014), and FOXO1 ([321]Ellis et al., 1989), each can bind to SMAD4 to
   suppress SMAD transcriptional complexes, which is at the core of the
   TGF-β signaling pathways ([322]Mullen et al., 2011). Genetic variants
   in TGF-β are found to associate with schizophrenia ([323]Jia et al.,
   2010) while TGF-β signaling promotes neuronal axon formation ([324]Yi
   et al., 2010) and dendritic growth ([325]Luo et al., 2016). All these
   lines of evidence support our prediction on the critical roles of SIX1,
   SIX2, and SMAD4 in the immune activation associated with schizophrenia.
   Systematic empirical investigation into the roles of these
   transcription factors and signaling pathways in an immune response may
   illustrate a more detailed molecular mechanisms on how immune
   activation in the brain may lead to schizophrenia.

Discussion

   The most significant advances in searching for risk factors of
   schizophrenia over the last decade include: first, a large number of
   schizophrenia-associated genes have been identified by genome wide
   studies ([326]Bowen et al., 2019; [327]Fromer et al., 2014;
   [328]Howrigan et al., 2020; [329]Pardinas et al., 2018; [330]Rees
   et al., 2020; [331]Schizophrenia Working Group of the Psychiatric
   Genomics, 2014; [332]Xu et al., 2011); and second, immune activation
   has been repeatedly proven to be associated with this disease
   ([333]Brown, 2012; [334]Brown and Derkits, 2010; [335]Fillman et al.,
   2013; [336]Goldsmith et al., 2016; [337]Miller et al., 2011; [338]van
   Kesteren et al., 2017). Proinflammatory cytokine levels have been
   consistently reported to be increased in the serum of SCZ patients
   ([339]Goldsmith et al., 2016; [340]Miller et al., 2011) and also in
   their prefrontal cortex ([341]Fillman et al., 2013). Maternal infection
   or stress during pregnancy greatly increases the risk of schizophrenia
   in the offspring ([342]Brown, 2012; [343]Brown and Derkits, 2010;
   [344]Estes and McAllister, 2016; [345]Gomes et al., 2019), while
   adolescent stress is also reportedly a risk factor ([346]Gomes and
   Grace, 2017; [347]Harrop and Trower, 2001; [348]Trotman et al., 2013).
   The “environment-environment interaction” model of the “two-hit”
   hypothesis is proposed explanation for the etiology ([349]Khan and
   Powell, 2018). What is needed is a systematic way to illustrate the
   underlying molecular mechanism, particularly how immune system
   contributes to the disease onset. Here we presented a systematic search
   for molecular pathways linking immune activation to schizophrenia by
   extracting a high-quality protein interaction network for schizophrenia
   (SCZ Network) using our in-house developed “neighborhood walk” (NW)
   approach in combination with “random walk with restart” (RWR).

   Starting from a core set of 1,720 high-confidence SCZ candidate genes,
   we extracted the SCZ Network from a high-quality comprehensive human
   interactome network ([350]Li et al., 2017) using NW and RWR
   ([351]Figure 1). This is the first time that NW approach is applied in
   searching for disease-associate neighborhoods in a molecular network.
   NW and RWR show comparable performance ([352]Figure 2). Although both
   RWR and NW start walking from seeds in the network, they retrieve
   disease candidate genes based on different principles. RWR walks from a
   seed to all directions with a restart probability, to examine the
   proximity between each individual seed and all the other nodes based on
   the topology of the network. Therefore, RWR does not depend on the
   functional convergence of the disease genes. Starting from randomly
   picked degree-controlled fake seeds, RWR retrieved more “disease”
   candidate genes than starting from real seeds ([353]Figure S1H),
   probably because the fake seeds were not functionally related and thus
   were topologically scattered in the network. However, NW walks from a
   seed to explore the disease association of the neighborhood. It depends
   on the functional convergence of the seed genes in neighborhoods of the
   network. Starting with randomly picked degree-controlled fake seeds, NW
   was not able to retrieve sizable “disease”-associated neighborhoods
   ([354]Figure S1G), because functionally unrelated genes were scattered
   in the network, not clustered in neighborhoods of the network. The
   common nodes of these two subnetworks show stronger enrichment with
   SCZ-associated genes than either subnetwork alone ([355]Figure 2).
   Therefore, we selected the common genes identified by these two
   methods, which generated a more reliable molecular network for
   schizophrenia than each method alone. The expanded candidate genes and
   the seed genes in the SCZ Network have enrichment of equal magnitude
   with the SCZ-dysregulated genes in the brain ([356]Figure 2B),
   confirming the high quality of the SCZ Network. MetaCore and KEGG
   pathway enrichment analyses identified underlying signaling pathways
   ([357]Tables S6-1 and S6-3). We examined the literature for the top ten
   enriched MetaCore pathways (see [358]Table S6-2 for literature
   information) and found that all these top pathways are involved in
   schizophrenia and immune regulation. Most interestingly, these same
   pathways are also reported to be involved in synaptogenesis and synapse
   modulation. The literature search for the top enriched KEGG pathways
   reached similar conclusion (see [359]Table S6-4 for literature
   information). These findings together suggest that the molecular
   signaling pathways in the SCZ Network that are involved in immune
   regulation may contribute to the development of schizophrenia through
   regulating the formation and modulation of synapses. These results not
   only further confirm the high quality of the SCZ Network, but also
   suggest that molecular processes of immune activation and synapse
   remodeling might be a fundamental mechanism for schizophrenia.

   The spatiotemporal expression pattern of the genes of the SCZ Network
   in the brain during development demonstrates that this molecular
   network is overrepresented in the expressed genes during both “early-
   to late fetal development” and “adolescence to early adulthood”
   ([360]Figure 4B), suggesting that this molecular network may be more
   susceptible to perturbation during these developmental stages.
   Additionally, immune genes are more likely to be expressed in cortex
   regions from mid- to late fetal development and from adolescence to
   early adulthood ([361]Figure 4E), further suggesting that immune
   activation during these two sensitive developmental stages may disturb
   the molecular network. The spatiotemporal expression pattern of the SCZ
   Network is coincident with the structural and functional abnormalities
   found in the cortex of the schizophrenia patients ([362]Karlsgodt
   et al., 2010), and with the developmental stages of cortex in animal
   models ([363]Selemon and Zecevic, 2015), suggesting that the normal
   function of this molecular network may be involved in the structural
   and functional development of cortex regions, and that immune
   activation at these critical stages may disturb the cortex development,
   leading to schizophrenia. The spatiotemporal expression pattern of the
   immune genes in the SCZ Network fits well with “environment-environment
   interaction” model of the “two-hit” hypothesis on the etiology
   ([364]Khan and Powell, 2018), and it is also consistent with the
   clinical finding on increased immune response in patients ([365]Fillman
   et al., 2013; [366]Goldsmith et al., 2016; [367]Miller et al., 2011),
   suggesting a “twice immune-activation” mechanism for this model.

   To explore “twice immune-activation” mechanism, we have built a
   “two-hit” mouse model using LPS to mimic infection during gestation and
   during puberty. We induced immune activation twice in mice, first
   during mid-late fetal development stage by intraperitoneal injection
   (IPI) of lipopolysaccharide (LPS) in to the pregnant mice at G16 (16^th
   day after gestation), and a second time during puberty by IPI of LPS
   into the male offspring at P30 (30^th day after birth)
   ([368]Figure 5A), and carried out transcriptome sequencing on the brain
   at P32 to explore the differentially expressed genes (DEGs)
   ([369]Figure 5B). The DEGs of L-S, S-L or L-L groups are enriched in
   biomarkers of ‘schizophrenia’ and other mental disorders such as
   ‘bipolar’, ‘bipolar related disorder’, ‘major depressive disorder’ and
   ‘depressive disorder’ ([370]Figure 5F), suggesting that immune
   activation may underlie multiple psychotic disorders. The
   differentially expressed genes in three treatment groups are enriched
   in a total of 121 pathways ([371]Tables S9-4–S9-6). Surprisingly, 82.6%
   (100/121) of the enriched pathways of the differentially expressed
   genes induced by immune activation are also identified in SCZ Network
   ([372]Table S9-7). These pathways are clustered into 18 groups
   according to their shared genes in SCZ Network ([373]Figures 5H and
   [374]S6). All the representative pathways from the 18 clusters are
   documented to associate with schizophrenia or other psychiatric
   disorders ([375]Table S9-8), providing a framework for investigating
   the mechanisms on how immune activation in the brain may lead to
   schizophrenia.

   The mice with “two hits” showed greater transcription alteration than
   the mice with the “first hit” or the “second hit” alone
   ([376]Figure 5B, [377]Table S8), suggesting that the “first hit” may
   have effects on the brain to make it more sensitive to the “second
   hit”. We searched for enriched binding motifs in the promoter regions
   of the differentially expressed genes and found 24 TF-binding motifs
   corresponding to 26 transcription factors ([378]Figure 6A). These
   transcription factors can be clustered into 6 groups, each regulating a
   group of pathways ([379]Figure 6C). Majority of pathways identified for
   each group of TFs overlap with the signaling pathways identified in the
   SCZ Network ([380]Figure 6D; [381]Tables S10-4–S10-9), further
   providing a framework for studying the key regulators of these
   immune-activated pathways linking to schizophrenia. The network of TFs
   with regulatory edges and protein interaction edges suggest that SIX1
   and SIX2 are regulatory hubs and SMAD4 is an interaction hub
   ([382]Figures 6F and 6G). SIX1 and SIX2 are reported to be involved in
   cortical immune activation through NF-κB pathway ([383]Liu et al.,
   2019; [384]Roussos et al., 2013; [385]Volk et al., 2019). SMAD4
   transcription factor is at the core of the TGF-β signaling pathways
   ([386]Mullen et al., 2011), regulating neuronal axon formation ([387]Yi
   et al., 2010) and dendritic growth ([388]Luo et al., 2016).

   In summary, in order to lay the foundation for studying molecular
   mechanism underlying schizophrenia, we consolidated a core set of
   high-confidence SCZ candidate genes and then constructed a high-quality
   SCZ Network from a comprehensive interactome dataset using
   “neighborhood walk” and “random walk with restart”. The dynamic
   spatiotemporal expression patterns of the genes of the SCZ Network in
   human brain revealed “mid- to late fetal development” and “adolescence
   to early adulthood” as critical stages when the molecular network is
   susceptible to environmental insults. The top enriched pathways
   suggested a tight link between immune activation, synapse remodeling
   and schizophrenia. The transcriptome sequencing on the brain of mice
   with immune activation at late fetal developmental stage or puberty
   revealed that over 80% of the dysregulated pathways by immune
   activation overlap with pathways in the SCZ Network. The dysregulated
   pathways further supported the molecular link between immune signaling,
   synaptic function, and schizophrenia. HOMER scan on the promoter
   regions of the dysregulated genes identified a small number of
   potential key regulators, such as SIX1, SIX2, and SMAD4, of the
   SCZ-associated immune activation in the brain. Our work provides a
   strategy for illustrating the molecular mechanism underlying complex
   psychiatric disorders and a rich resource for disease candidate genes
   as well as systematically evaluated signaling pathways to study the
   pathogenesis and therapeutics of schizophrenia.

Limitations of the study

   This study only provides a landscape of genes and pathways that may be
   involved in the development of schizophrenia, especially the molecular
   pathways and regulators that may link the immune activation to this
   disease. A limitation of our study is the lack of deeper understanding
   of the identified pathways and key regulators using cell models and
   mouse models. Another limitation is that our in-house developed NW
   method has not been tested in different datasets. In theory, NW
   retrieves disease candidate genes based on their convergence in the
   neighborhoods of the network, and therefore functionally unrelated
   genes may be left out of the selected disease-associated neighborhoods.
   Although we have demonstrated the validity of NW in retrieving SCZ
   candidate genes, this method needs to be further tested in the study of
   other complex diseases.

STAR★Methods

Key resources table

   REAGENT or RESOURCE SOURCE IDENTIFIER
   Chemicals, peptides, and recombinant proteins
     __________________________________________________________________

   Lipopolysaccharide Sigma-Aldrich L2630-10MG, lot: #099M4002V
     __________________________________________________________________

   Critical commercial assays
     __________________________________________________________________

   RNeasy Plus Universal Mini Kit Qiagen Cat.NO.73404
     __________________________________________________________________

   Deposited data
     __________________________________________________________________

   Raw and analyzed data This paper [389]GSE156568
   Mouse reference genome, GRCm38 Genome Reference Consortium
   [390]https://www.ncbi.nlm.nih.gov/grc/mouse
     __________________________________________________________________

   Experimental models: organisms/strains
     __________________________________________________________________

   Mouse: C57BL/6J Southern Medical University N/A
     __________________________________________________________________

   Software and algorithms
     __________________________________________________________________

   R (version 3.6.0) The R Project for Statistical Computing
   [391]https://cran.r-project.org/bin/windows/base/old/
   RandomWalkRestartMH (version 1.6.0) [392]Valdeolivas et al., 2019
   [393]http://bioconductor.org/packages/RandomWalkRestartMH/
   Neighborhood walk This paper [394]Supplemental information
   STAR (version 2.5.2b). [395]Dobin et al., 2013
   [396]https://github.com/alexdobin/STAR/releases
   HTSeq (version 0.11.2) [397]Anders et al., 2015
   [398]https://htseq.readthedocs.io/en/master/
   HOMER (version v4.11.1) [399]Heinz et al., 2010
   [400]http://homer.ucsd.edu/homer/
     __________________________________________________________________

   Other
     __________________________________________________________________

   Resource data for selecting core set of high-confidence SCZ candidate
   genes This paper [401]STAR Methods and [402]supplemental information
   Construction and validation of SCZ Network, related resources This
   paper [403]STAR Methods and [404]supplemental information
   SCZ Network analyses and related resources This paper [405]STAR Methods
   and [406]supplemental information
   RNA-seq data analyses and related resources This paper [407]STAR
   Methods and [408]supplemental information
   [409]Open in a new tab

Resource availability

Lead contact

   Further information and requests for resources and reagents should be
   directed to and will be fulfilled by the lead contact, Xinping Yang
   ([410]xpyang1@smu.edu.cn)

Materials availability

   This study did not generate new unique reagents. All renewable reagents
   and detailed protocols will be available upon request.

Experimental model and subject details

Animal

   The C57BL/6 mice were housed under standard conditions (12 h/12 h
   light/dark cycle, access to dry food and water ad libitum). All the
   experimental procedures involving mice were approved by the Committee
   on Animal Care and Use at the Southern Medical University.

Immune activation in mice during mid-fetal stage and adolescent stage

   Immune activation was induced twice in mice, first during mid-late
   fetal development stage by intraperitoneal injection (IPI) of
   lipopolysaccharide (LPS) (150ug/kg) in to the pregnant mice at G16
   (16th day after gestation) and a second time during puberty by IPI of
   LPS into the male offspring at P30 (30th day after birth). We then
   carried out poly-A selected RNA-seq sequencing on the brain at P32 to
   evaluate transcriptional changes. We used saline as control for LPS,
   and set the experiments in three combination of saline and LPS
   treatments: (1) S-S (control group): saline at G16 + saline at P30; (2)
   L-S: LPS at G16 + saline at P30; (3) S-L: saline at G16 + LPS at P30;
   (4) L-L: LPS at G16 + LPS at P30. Lipopolysaccharide (LPS) was
   purchased from Sigma, USA.

Method details

Collection of Schizophrenia candidate genes

   A total number of 1,720 schizophrenia candidate genes (protein coding
   genes) were collected from 7 methods in 87 studies pooled by SZDB: copy
   number variations (CNVs), differential expressed genes (DEGs),
   differential methylated genes (DMGs), whole exome sequencing (WES),
   genome-wide association studies (GWASs), imputation and linkage
   ([411]Table S1). For genes from CNVs, DMGs and WES, only genes
   supported by two or more publications are kept. For genes from DEGs,
   only genes with FDR ≤0.05 are kept.

Human interactome data

   The comprehensive human interactome data were downloaded from InBioMap
   ([412]https://www.intomics.com/inbio/map.html) ([413]Table S2).

Random walk with restart (RWR)

   RWR was performed using R package “RandomWalkRestartMH” (version 1.6.0)
   ([414]Valdeolivas et al., 2019). The cutoff score ≥ 1e-05 from RWRM
   results was used for identifying disease candidate genes.

Neighborhood walk (NW)

   NW was performed using in-house developed algorithm.
     * i.
       We first mapped the 1,720 high-confidence SCZ candidate genes as
       “seeds” onto the human interactome network ([415]Table S2).
     * ii.
       For each seed, we assigned a risk score (R-score = 1-1.4^−x, x is
       the number of methods used in identifying the gene as SCZ candidate
       gene in literature). We used “R-score = 1-1.4^−x” for scoring seeds
       because it was shown to give seeds proper weights ([416]Calderone
       et al., 2013). For each nonseed, we calculated a guilt score
       (G-score = the average R-score of its interactors) ([417]Table S3).
     * iii.
       For each edge, we calculated an edge weight (W = the sum of the
       R-score and/or G-score of the two interacting nodes) and a weighted
       edge length (WEL) (WEL = 1/W) ([418]Table S3).
     * iv.
       For any two nodes in the network, we calculated a weighted shortest
       distance (WSD):
       [MATH: <mrow><mi>W</mi><mi>S</mi><mi>D</mi><mo
       linebreak="goodbreak"
       linebreakstyle="after">=</mo><mo>∑</mo><mrow><mo>(</mo><mrow><mn>1<
       /mn><mo
       linebreak="badbreak">/</mo><mrow><mi>W</mi><mi>i</mi></mrow></mrow>
       <mo>)</mo></mrow></mrow> :MATH]
       (1/Wi is the weighted edge length of an edge in a given shortest
       path) ([419]Figure S1E).
     * v.
       For each seed as an origin, we started the “walking” from the
       origin and stopped at a potential neighborhood border (a sphere
       with certain walking distance) if further “walking” to next sphere
       made the seed ratio in the neighborhood drop significantly,
       (walking distance between two spheres is 1.0) ([420]Figure S1E);
     * vi.
       We obtained 1,592 ranked neighborhoods (one for each seed) based on
       seed ratio ([421]Table S3). The neighborhoods with seed ratio from
       75% to 100% were considered as SCZ-related network neighborhoods.
     * vii.
       We extracted an NW-retrieved Subnetwork by fusing the 1,043
       neighborhoods with the seed ratio ≥75%, which contains 1,592 seed
       genes (high-confidence candidate genes) and 3,858 additional SCZ
       candidate genes (nonseed genes in the network).

Independent datasets for validation of SCZ network

   In order to compare the SCZ relevance of the expanded candidate genes
   and seed genes in SCZ network, we calculated the enrichment of with
   dysregulated genes or variants identified in SCZ patients collected
   from recent literature. These data sources have not been used for
   collection of seed genes and therefore they contain SCZ-associated
   genes (dysregulated genes or genes harboring variants) unbiased towards
   seeds or nonseeds. A total number of 1,682 brain differentially
   expressed genes (DEGs) ([422]Table S5) between schizophrenia patients
   and normal subjects were collected from the study by [423]Bowen et al.
   (2019), and a total number of 4,217 exome variants ([424]Table S5)
   identified in the postmortem brains of SCZ patients were collected from
   the studies by [425]Howrigan et al. (2020) and [426]Rees et al. (2020).
   The three studies are listed below:
     * 1
       .Bowen, E. F. W. et al. DLPFC transcriptome defines two molecular
       subtypes of schizophrenia. Transl Psychiatry 9, 147,
       [427]https://doi.org/10.1038/s41398-019-0472-z (2019).
     * 2
       .Howrigan, D. P. et al. Exome sequencing in schizophrenia-affected
       parent-offspring trios reveals risk conferred by protein-coding de
       novo mutations. Nature neuroscience 23, 185-193,
       [428]https://doi.org/10.1038/s41593-019-0564-3 (2020).
     * 3
       .Rees, E. et al. De novo mutations identified by exome sequencing
       implicate rare missense variants in SLC6A1 in schizophrenia. Nature
       neuroscience 23, 179-184,
       [429]https://doi.org/10.1038/s41593-019-0565-2 (2020).

Tools and data sources for functional enrichment analysis

     * 1
       .KEGG pathway enrichment analysis were performed using R package
       “clusterProfiler” (version 3.14.3) ([430]Yu et al., 2012).
     * 2
       .Disease Enrichment was performed using MetaCore Disease (by
       biomarker) from Clarivate Analytics (version 20.2).
     * 3
       .Collection of immune genes: 3,904 immune genes ([431]Table S6)
       were collected from:
          + i.
            InnateDB (Innate Immunity Genes,
            [432]https://www.innatedb.com) ([433]Breuer et al., 2013);
          + ii.
            GO-immune-response
            ([434]http://amigo.geneontology.org/amigo/term/GO:0006955);
          + iii.
            ImmPort:
            ([435]http://www.immport.org/immport-open/public/reference/gen
            elists);
          + iv.
            Nanostring's Immune Profiling Panel Probe
            ([436]https://www.nanostring.com/products/gene-expression-pane
            ls/gene-expression-panels-overview/hallmarks-cancer-gene-expre
            ssion-panel-collection/pancancer-immune-profiling-panel?jumpto
            =SUPPORT);
          + v.
            IMGT/GENE-DB (the international ImMunoGeneTics information
            system): ([437]http://www.imgt.org/genedb/).
     * 4
       .Collection of synaptic transmission genes: 1,068 synaptic
       transmission genes ([438]Table S6) were collected from GO
       ([439]http://amigo.geneontology.org/amigo/landing) using key word
       “synaptic transmission”.

The spatiotemporal expression of genes in human brain

   The brain transcriptome data was generated across 13 dissection stages
   ([440]Kang et al., 2011). Spatiotemporal co-expression was calculated
   according to a previous study ([441]Lin et al., 2015), as described
   below. Since well-defined anatomical brain structures are limited
   during early embryonic development, the first period (4-8
   postconceptional weeks, PCW) was removed from further analysis. The
   dissection stages were merged into ten partially-overlapping
   developmental periods ranging from 8 PCW to 40 years of age
   ([442]Figures 4A and [443]S4). Brain regions were grouped into four
   clusters using hierarchical clustering based on brain transcriptional
   similarity to reflect actual topological proximity and functional
   segregation ([444]Willsey et al., 2013). As a result, 38 spatiotemporal
   regions were defined after eliminating one region from P2-P4 (P2-4, R4)
   due to lack of transcriptome data. Co-expression networks were used to
   identify phase-specific functional relationships between genes. A
   co-expression between two genes was defined as positive if the
   pair-wise Pearson Correlation Coefficient (PCC) value was > 0.8. Using
   this approach, 38 different spatiotemporal co-expression networks were
   generated and the co-expressed gene pairs were used for the enrichment
   analyses of SCZ network ([445]Table S7).

RNA preparation

   For each group, six pregnant mice were treated saline or LPS (maternal
   immune activation); and the male offspring were treated with saline or
   LPS (immune activation at puberty). For each group, four whole-brain
   RNA samples were prepared. Each RNA sample was extracted from the whole
   brain of one adult mouse according to the manufacturer's protocol
   (RNAeasy Mini Kit, Qiagen, Germantown, MD, USA). The quality and yield
   of the isolated RNAs was assessed using a NanoDrop Spectrophotometer
   (Thermo Fisher Scientific, Waltham, MA, USA) and Agilent 2100
   Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Only RNAs
   with a high RNA integrity number (RIN > 9) were selected and used for
   the subsequent sequencing.

RNA-Seq data analysis and differential motifs

   RNA sequencing was performed at Berry Genomics (Beijing, China) using
   Illumina NovaSeq. Sequenced reads were mapped to GRCm38 whole genome
   using STAR_2.5.2b. Raw counts are calculated by HTSeq-count version
   0.11.2. Pair-wised analyses are performed, genes with >3 raw counts in
   all four samples of case or control were considered as expressed genes
   and used for downstream analysis ([446]Table S8). The principal
   components analysis (PCA) is performed by plotPCA function using
   normalized count by variance stabilizing transformation function of
   DEseq2 (v1.20.0).

   Differential expression analysis on two groups was performed using the
   DESeq2. Differentially expressed genes (DEGs) were determined using a
   cutoff of p < 0.05 and |FC|≥1.3 for DESeq2 ([447]Table S8). Functional
   annotation for pathways was performed using MetaCore from Clarivate
   Analytics (version 20.2).

   The threshold-free comparison of expression alteration between
   different treatment groups was carried out using R package “RRHO”
   (version 1.26.0). Full threshold-free gene list of differential
   transcriptomes (L-S group, S-L group or L-L group compared to S-S
   group) were ranked by the −log[10](p-value) multiplied by the sign of
   the fold change from the DESeq2 analysis.

   The log[10]-transformed hypergeometric p-value are plotted to map the
   degree of the statistically significant overlap between two
   differential transcriptomes (two ranked gene lists). Each pixel in a
   RRHO map represents the signed log[10]-transformed hypergeometric
   p-value for the overlap of subsections of two ranked lists from a point
   on the map (corresponding to a rank threshold pair: Rank X, Y) to
   either bottom left corner (Rank 1, 1) or top right corner (Rank N, N).
   Genes along each axis are sorted from most significantly up-regulated
   to most down-regulated from the lower left corners.

   Pathways are clustered by calculating the distance between two pathways
   based on shared genes within SCZ Network using dist() function with
   method “euclidean” and hclust() function with method “ward.D” in R
   package “stats” (version 3.6.3). The dendrogram tree was plot using R
   packages dendextend (version 1.14.0) and ggdendro (version 0.1-20).

   HOMER (version v4.11.1) motif analysis was used to predict differential
   motifs (enriched transcription factor-binding sites) located within
   −2,000 to +1,000 bases of the TSS of DEGs.

   TF correlation was calculated by their shared target DEGs using cor()
   function with method “pearson” in R package “stats” (version 3.6.3).
   Then plot was performed by corrplot package (version 0.84) using
   hierarchical clustering order.

Quantification and statistical analysis

   The p-values in the enrichment analysis were calculated using Fisher's
   Exact Test. And the p-values in the differentially expressed analysis
   were calculated using Wald significance tests.

Acknowledgments