Table 2.
Estimation of the number of off-target siRNAs that may arise from genome-wide RNAi screens.
In this analysis, we calculate the hit rate and validation rate in an siRNA screen of 20,000 genes using four siRNAs per gene. Genes chosen for validation include any gene for which at least 2 of 4 siRNAs produce a phenotype; we assume that at least 2 of the 4 siRNAs targeting each true (specific) gene involved in the pathway yield a phenotype.
Assuming that a given siRNA has on average 50 off-target mRNAs (each knocked down more than 2-fold) and that these off-targets are random, the probability that an siRNA has a seed match that results in off-targeting of any particular mRNA among 20,000 genes is 50/20,000 = 1/400. Let Y denote the number of genes, among the n = {1, 5, 10, 20, 50} genes required for the phenotype, that have a seed match to a given siRNA. If the n genes are independent of one another, then Y follows the binomial distribution B(n, 1/400). Therefore, the probability that a given siRNA contains no seed match to any of the n genes is:
$$P(Y=0) = \binom{n}{0}\left(\frac{1}{400}\right)^{0}\left(1-\frac{1}{400}\right)^{n-0} = \left(\frac{399}{400}\right)^{n}$$
The probability that a given siRNA has a seed match to at least one of the n genes is
p = 1 - P(Y=0)
The probability P(A) that exactly A = {1, 2, 3, 4} of the 4 independent siRNAs tested for a given gene give a phenotype as a result of an off-target effect follows the binomial distribution:
$$P(A) = \binom{4}{A}\, p^{A}\,(1-p)^{4-A}$$
The estimated number of genes with exactly A out of 4 siRNAs giving a phenotype as a result of an off-target effect was determined by multiplying the probability P(A) for each n by 20,000, the number of genes assumed to be tested in the siRNA screen.
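The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the names `p_offtarget_hit` and `expected_genes` are hypothetical, and the per-gene off-target probability of 1/400 and the screen size of 20,000 genes are taken from the legend.

```python
from math import comb

N_GENES = 20_000        # genes screened (from the legend)
Q = 50 / N_GENES        # chance one siRNA off-targets a given gene: 1/400
SIRNAS_PER_GENE = 4     # siRNAs tested per gene


def p_offtarget_hit(n: int) -> float:
    """Probability p that one siRNA seed-matches at least one of n phenotype genes."""
    return 1 - (1 - Q) ** n


def expected_genes(n: int) -> dict[int, float]:
    """Expected number of genes with exactly A of 4 siRNAs scoring off-target."""
    p = p_offtarget_hit(n)
    k = SIRNAS_PER_GENE
    return {
        a: N_GENES * comb(k, a) * p**a * (1 - p) ** (k - a)
        for a in range(1, k + 1)
    }


# Print rounded expected counts for A = 1..4 at each pathway size n.
for n in (1, 5, 10, 20, 50):
    counts = expected_genes(n)
    print(n, [round(counts[a]) for a in range(1, SIRNAS_PER_GENE + 1)])
```

Rounded to whole genes, the values for A = 1 to 4 should correspond to rows B1 to B4 of the table below.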
| Number of Genes in a Pathway that Yield Phenotype when Expression Reduced by 50% | 1 | 5 | 10 | 20 | 50 |
|---|---|---|---|---|---|
| A. Number of On-target Genes That Score | 1 | 5 | 10 | 20 | 50 |
| B. Number of Off-target Genes That Score | 200 | 976 | 1906 | 3630 | 7877 |
| B1. Number of genes with exactly 1/4 siRNAs that score | 199 | 958 | 1835 | 3362 | 6465 |
| B2. Number of genes with exactly 2/4 siRNAs that score | 1 | 18 | 70 | 259 | 1293 |
| B3. Number of genes with exactly 3/4 siRNAs that score | 0 | 0 | 1 | 9 | 115 |
| B4. Number of genes with exactly 4/4 siRNAs that score | 0 | 0 | 0 | 0 | 4 |
| Total Number of Genes Identified by Screen (A+B) | 201 | 981 | 1916 | 3650 | 7927 |
| Hit Rate ([A+B]/20,000) | 1% | 4.9% | 9.5% | 18% | 39% |
| Fraction of Genes that are On Target (A/[A+B]) | 0.5% | 0.5% | 0.5% | 0.5% | 0.6% |
| Validation Rate (A/[A+B2+B3+B4]) | 50% | 21.7% | 12.3% | 6.9% | 3.4% |
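
The three summary rows follow directly from rows A and B1 to B4. The sketch below recomputes them from the counts printed in the table; the `TABLE` dictionary and the `rates` helper are hypothetical names introduced only for illustration.

```python
N_GENES = 20_000  # genes screened (from the legend)

# Counts copied from rows A, B, B2, B3 and B4 of the table, keyed by n.
TABLE = {
    1:  {"A": 1,  "B": 200,  "B2": 1,    "B3": 0,   "B4": 0},
    5:  {"A": 5,  "B": 976,  "B2": 18,   "B3": 0,   "B4": 0},
    10: {"A": 10, "B": 1906, "B2": 70,   "B3": 1,   "B4": 0},
    20: {"A": 20, "B": 3630, "B2": 259,  "B3": 9,   "B4": 0},
    50: {"A": 50, "B": 7877, "B2": 1293, "B3": 115, "B4": 4},
}


def rates(row: dict[str, int], n_genes: int = N_GENES) -> dict[str, float]:
    """Hit rate, on-target fraction and validation rate for one table column."""
    total = row["A"] + row["B"]                              # all genes that score
    followed_up = row["A"] + row["B2"] + row["B3"] + row["B4"]  # >= 2/4 siRNAs score
    return {
        "hit_rate": total / n_genes,
        "on_target_fraction": row["A"] / total,
        "validation_rate": row["A"] / followed_up,
    }


for n, row in TABLE.items():
    r = rates(row)
    print(n, {name: f"{value:.1%}" for name, value in r.items()})
```

Computed this way, the hit rates for n = 10 and n = 50 come out at about 9.58% and 39.6%, so the 9.5% and 39% shown in the table appear to be truncated rather than rounded.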