Sites per sequence | N sequences | % of sequences | Cum. % of sequences | N sites | % of sites | Cum. % of sites |
---|---|---|---|---|---|---|
1 | 88 | 61.5 | 61.5 | 88 | 36.8 | 36.8 |
2 | 30 | 21.0 | 82.5 | 60 | 25.1 | 61.9 |
3 | 15 | 10.5 | 93.0 | 45 | 18.8 | 80.7 |
4 | 7 | 4.9 | 97.9 | 28 | 11.7 | 92.4 |
5 | 1 | 0.7 | 98.6 | 5 | 2.1 | 94.5 |
6 | 1 | 0.7 | 99.3 | 6 | 2.5 | 97.0 |
7 | 1 | 0.7 | 100.0 | 7 | 3.0 | 100.0 |
Total | 143 | 100.0 | 100.0 | 239 | 100.0 | 100.0 |
- Positive examples in the dataset contain up to seven sites per sequence. Although about 62% (88) of all entries (143) contain only one phosphorylated serine or threonine, these entries account for just over one third (36.8%) of the included sites (239). Note that, for many proteins, additional phosphorylation sites may not have been detected yet.
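
The arithmetic behind the table can be retraced with a short Python sketch. It assumes only the per-row sequence counts printed above; the names `sequences_by_site_count` and `summarize` are illustrative and do not come from the original work. Recomputed percentages may differ from the printed ones in the last decimal, depending on how rounding was applied.

```python
# Number of sequences carrying k phosphorylation sites (k = 1..7),
# copied from the table. The variable and function names are hypothetical.
sequences_by_site_count = {1: 88, 2: 30, 3: 15, 4: 7, 5: 1, 6: 1, 7: 1}


def summarize(counts):
    """Return total sequences, total sites, and per-row shares in percent."""
    total_sequences = sum(counts.values())
    # A sequence with k sites contributes k sites to the overall total.
    total_sites = sum(k * n for k, n in counts.items())

    rows = []
    cum_sequences = 0
    cum_sites = 0
    for k in sorted(counts):
        n = counts[k]
        cum_sequences += n
        cum_sites += k * n
        rows.append({
            "sites_per_sequence": k,
            "n_sequences": n,
            "pct_sequences": 100 * n / total_sequences,
            "cum_pct_sequences": 100 * cum_sequences / total_sequences,
            "n_sites": k * n,
            "pct_sites": 100 * k * n / total_sites,
            "cum_pct_sites": 100 * cum_sites / total_sites,
        })
    return total_sequences, total_sites, rows


total_sequences, total_sites, rows = summarize(sequences_by_site_count)

for row in rows:
    print(
        f"{row['sites_per_sequence']} | {row['n_sequences']} | "
        f"{row['pct_sequences']:.1f} | {row['cum_pct_sequences']:.1f} | "
        f"{row['n_sites']} | {row['pct_sites']:.1f} | "
        f"{row['cum_pct_sites']:.1f}"
    )
print(f"Total | {total_sequences} | {total_sites}")
```

The first row makes the point of the note explicit: single-site sequences are the majority of entries, yet they contribute only a little over one third of all sites, because each multi-site sequence is counted once as an entry but several times as sites.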