Extended Data Table 1. Sample and detected variant properties.
a. Numbers of individuals in the final dataset after individual-level QC. Finnish ancestry was inferred by multidimensional scaling. P-values are from Fisher’s exact test. b. Technical metrics for cases and controls (after individual-level QC); P-values are from two-sided t-tests of case/control differences. c. Properties of variants detected by exome sequencing: counts (N) and minor allele counts (MAC) for various classes of variant in the main exome dataset, following all QC. The missense deleteriousness prediction algorithms, and how they were combined, are described in SI section 4.
a
Case/control status by sex (P = 5e-11)

| Status | Male | Female | Female (%) |
|---|---|---|---|
| Control | 1291 | 1252 | 49% |
| Case | 1520 | 1016 | 40% |

Case/control status by ancestry (P = 2e-12)

| Ancestry | Control | Case | Case (%) |
|---|---|---|---|
| Swedish | 2356 | 2197 | 48% |
| Finnish | 139 | 274 | 66% |
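The percentages and P-values in panel a can be rechecked from the four counts in each sub-table. The following is a minimal sketch, not the authors' code: it assumes the conventional two-sided Fisher's exact test (summing the probabilities of all tables no more likely than the observed one), and the helper names `log_choose`, `fisher_exact_two_sided` and `percent` are illustrative.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x):
        # Log-probability that the top-left cell equals x, margins fixed.
        return (log_choose(row1, x) + log_choose(row2, col1 - x)
                - log_choose(total, col1))

    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_obs = log_prob(a)
    # Small tolerance so ties with the observed table are included.
    p = sum(exp(lp) for lp in map(log_prob, range(lo, hi + 1))
            if lp <= log_obs + 1e-7)
    return min(1.0, p)


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


# Counts copied from panel a.
sex = {"Control": (1291, 1252), "Case": (1520, 1016)}        # (male, female)
ancestry = {"Swedish": (2356, 2197), "Finnish": (139, 274)}  # (control, case)

p_sex = fisher_exact_two_sided(*sex["Control"], *sex["Case"])
p_ancestry = fisher_exact_two_sided(*ancestry["Swedish"], *ancestry["Finnish"])

# Last-column percentages: female share per status, case share per ancestry.
for label, (first, second) in {**sex, **ancestry}.items():
    print(f"{label}: {percent(second, first + second):.0f}%")
print(f"P (sex) = {p_sex:.1e}, P (ancestry) = {p_ancestry:.1e}")
```

Working in log space with `lgamma` avoids overflow in the binomial coefficients, which would otherwise be astronomically large for samples of about 5,000 individuals.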
b
| Sample and sequencing metrics | Cases | Controls | P (case vs. control) |
|---|---|---|---|
| N | 2536 | 2543 | - |
| N (pre-QC) | 2546 | 2545 | - |
| Total number of reads | 100,532,755 | 100,079,333 | 0.62 |
| Filtered, unique reads aligned | 68,940,753 | 68,339,964 | 0.26 |
| Filtered, unique bases aligned | 5,106,614,996 | 5,070,497,844 | 0.34 |
| Mean target coverage | 89.98 | 89.55 | 0.53 |
| Percentage of target bases covered > 10x | 92.83 | 92.85 | 0.55 |
| Percentage of target bases covered > 20x | 87.30 | 87.30 | 0.93 |
| Percentage of target bases covered > 30x | 81.13 | 81.07 | 0.63 |
| Percentage of targets w/out any bases covered at 2x | 1.72 | 1.72 | 0.60 |
| Mean number of non-reference genotypes per individual (unfiltered) | 18772.9 | 18786.6 | 0.13 |
| Mean number of on-target singletons per individual (unfiltered) | 49.6 | 49.0 | 0.38 |
| Mean dbSNP % per individual | 98.3970% | 98.3969% | 1.00 |
c
| Property | Variant type | N | Mean MAC | % singleton |
|---|---|---|---|---|
| All alternate alleles | | 635,944 | 103.37 | 56% |
| Functional class | | | | |
| | Noncoding | 61,416 | 142.03 | 53% |
| | Silent | 185,336 | 152.85 | 51% |
| | Missense | 342,561 | 69.52 | 58% |
| | Non-essential splice site | 25,450 | 127.04 | 54% |
| | Nonsense | 9,022 | 20.68 | 69% |
| | Essential splice-site | 4,394 | 16.18 | 70% |
| | Frameshifting indel | 3,461 | 9.46 | 79% |
| In silico annotation of missenses | | | | |
| | LRT | 168,437 | 34.55 | 62% |
| | Mutation Taster | 167,316 | 19.90 | 63% |
| | PolyPhen2 (HumDiv) | 130,719 | 28.84 | 62% |
| | PolyPhen2 (HumVar) | 91,156 | 24.74 | 64% |
| | SIFT | 140,345 | 43.85 | 61% |
| Primary variant groupings for analysis | | | | |
| Singletons | Gene disruptive | 12,047 | 1.00 | 100% |
| | Nonsyn (strict) | 36,542 | 1.00 | 100% |
| | Nonsyn (broad) | 160,229 | 1.00 | 100% |
| <0.1% MAF (1-10 alleles) | Gene disruptive | 15,972 | 1.56 | 75.4% |
| | Nonsyn (strict) | 50,369 | 1.65 | 72.5% |
| | Nonsyn (broad) | 233,575 | 1.78 | 68.6% |
| <0.5% MAF (1-50 alleles) | Gene disruptive | 16,523 | 2.24 | 72.9% |
| | Nonsyn (strict) | 52,545 | 2.51 | 69.5% |
| | Nonsyn (broad) | 248,217 | 3.04 | 64.6% |
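The allele-count ranges attached to the MAF bins in panel c, and the "% singleton" values in those bins, appear to follow from numbers given elsewhere in the table. The sketch below illustrates the arithmetic under two assumptions that the table does not state: every one of the 2536 + 2543 post-QC individuals contributes two chromosomes at each site (ignoring missing genotypes and sex chromosomes), and the singletons inside each frequency bin are the same variants counted in the "Singletons" rows. The function names are illustrative.

```python
from math import floor


def max_allele_count(n_individuals, maf_threshold):
    """Largest allele count whose frequency does not exceed the threshold.

    Assumes diploid genotypes, i.e. 2 * n_individuals chromosomes.
    """
    return floor(maf_threshold * 2 * n_individuals)


def percent_singleton(n_singletons, n_variants):
    """Singletons as a percentage of all variants in a bin, to one decimal."""
    return round(100.0 * n_singletons / n_variants, 1)


N_INDIVIDUALS = 2536 + 2543  # cases + controls after QC (panel b)

# Variant counts (N) copied from panel c, keyed by grouping.
singletons = {"Gene disruptive": 12047, "Nonsyn (strict)": 36542,
              "Nonsyn (broad)": 160229}
bins = {
    0.001: {"Gene disruptive": 15972, "Nonsyn (strict)": 50369,
            "Nonsyn (broad)": 233575},
    0.005: {"Gene disruptive": 16523, "Nonsyn (strict)": 52545,
            "Nonsyn (broad)": 248217},
}

for maf, counts in bins.items():
    print(f"MAF < {maf:.1%}: 1-{max_allele_count(N_INDIVIDUALS, maf)} alleles")
    for grouping, n_variants in counts.items():
        pct = percent_singleton(singletons[grouping], n_variants)
        print(f"  {grouping}: {pct}% singleton")
```

With 10,158 chromosomes, neither threshold corresponds to a whole number of alleles, so the strict "<" in the bin labels and the rounding-down used here select the same variants.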