Table 1.
The top-10 features ranked by F-score on our combined (training + test) data set (disordered model of split NESsential)
Rank | Feature description | F-score |
---|---|---|
1 | # of leucines among the three hydrophobic positions | 0.130 |
2 | Distance to previous match of Φx(2,3)ΦxΦ divided by the protein length | 0.042 |
3 | Whether a hydrophobic residue exists in the upstream -3 (for 7-mer match) or -4 position (for 6-mer match) | 0.035 |
4 | # of negatively charged residues in the upstream flank | 0.032 |
5* | # of prolines within the pre-filter match Φx(2,3)ΦxΦ | 0.026 |
6 | # of negatively charged residues within the pre-filter match Φx(2,3)ΦxΦ | 0.013 |
7* | # of polar residues in the downstream flank | 0.009 |
8* | Whether the first two residues are involved in a β-strand based on secondary structure prediction | 0.009 |
9* | Whether the first residue is involved in a β-strand based on secondary structure prediction | 0.008 |
10 | Avg. predicted solvent accessibility of downstream flank | 0.008 |
*Indicates features for which the mean value among spurious matches is greater than that among real NES sites.
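
The table does not restate how the F-score is defined. Assuming it is the discriminative F-score often used for feature ranking ahead of SVM training (between-class separation of the means divided by the summed within-class sample variances), a ranking like the one above could be sketched as follows. The function names and the input layout are hypothetical and chosen only for illustration; they are not taken from NESsential.

```python
from statistics import mean, variance


def f_score(real, spurious):
    """Discriminative F-score of a single feature (assumed definition).

    real:     feature values observed at real NES sites
    spurious: feature values observed at spurious pre-filter matches
    Each list needs at least two values for the sample variance.
    """
    overall = mean(list(real) + list(spurious))
    # Numerator: how far each class mean lies from the overall mean.
    between = (mean(real) - overall) ** 2 + (mean(spurious) - overall) ** 2
    # Denominator: sum of the per-class sample variances (n - 1 divisor).
    within = variance(real) + variance(spurious)
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


def spurious_mean_higher(real, spurious):
    """True when a feature would carry the asterisk used in the table."""
    return mean(spurious) > mean(real)


def rank_features(features):
    """Sort features by descending F-score.

    features: dict mapping a feature name to a (real, spurious) pair.
    Returns a list of (name, f_score, starred) tuples.
    """
    scored = [
        (name, f_score(real, spurious), spurious_mean_higher(real, spurious))
        for name, (real, spurious) in features.items()
    ]
    return sorted(scored, key=lambda row: row[1], reverse=True)
```

Under this definition a feature scores higher when its class means are far apart relative to the spread within each class. The score says nothing about direction, so the asterisk is computed separately by comparing the two class means.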