Method (protein dataset) | RMSD | % | Outliers | Method (peptide dataset) | RMSD | % | Outliers |
---|---|---|---|---|---|---|---|
IPC_protein | 0.874 | 0 | 46 | IPC_peptide | 0.251 | 0 | 232 |
Toseland | 0.934 | 14.9 | 52 | Solomons | 0.255 | 0.9 | 235 |
Bjellqvist | 0.944 | 17.7 | 47 | Lehninger | 0.262 | 2.5 | 236 |
Dawson | 0.945 | 17.8 | 56 | EMBOSS | 0.325 | 18.5 | 372 |
Wikipedia | 0.955 | 20.5 | 55 | Wikipedia | 0.421 | 47.9 | 1467 |
Rodwell | 0.963 | 22.8 | 58 | Toseland | 0.425 | 49.1 | 990 |
ProMoST | 0.966 | 23.6 | 52 | Sillero | 0.428 | 50.3 | 1223 |
Grimsley | 0.968 | 24.2 | 60 | Dawson | 0.435 | 52.9 | 1432 |
Solomons | 0.970 | 24.8 | 58 | Thurlkill | 0.481 | 69.7 | 1361 |
Lehninger | 0.970 | 25.0 | 59 | Rodwell | 0.502 | 78.4 | 1359 |
pIR | 1.013 | 38.0 | 58 | DTASelect | 0.550 | 99.1 | 1714 |
Nozaki | 1.024 | 41.3 | 56 | Nozaki | 0.602 | 124.3 | 1368 |
Thurlkill | 1.030 | 43.4 | 61 | Grimsley | 0.616 | 131.4 | 1550 |
DTASelect | 1.032 | 44.1 | 58 | Bjellqvist | 0.669 | 161.5 | 1583 |
pIPredict | 1.048 | 49.4 | 56 | pIPredict | 1.024 | 493.6 | 2720 |
EMBOSS | 1.056 | 52.3 | 69 | ProMoST | 1.239 | 873.4 | 2649 |
Sillero | 1.059 | 53.2 | 63 | pIR | 1.881 | 4159.7 | 3358 |
Patrickios | 2.392 | 3201.8 | 227 | Patrickios | 1.998 | 5479.1 | 2739 |
Avg_pI^a | 0.960 | 22.1 | 53 | Avg_pI | 0.454 | 59.6 | 1571 |

- ^a Average over all pKa sets except Patrickios (a highly simplified pKa set) and the IPC sets. Note that the average pI is calculated at the level of the individual protein or peptide; it is therefore not the average of the per-method values presented in the table.
- %: The pH scale is logarithmic with base 10, so the percent difference corresponds to pow(10, x), where x is the difference between the RMSDs of two error estimates expressed in pH units. For example, the % difference between Toseland and IPC_protein corresponds to pow(10, (0.934-0.874)) ≈ 1.148, i.e. an error roughly 15% larger than that of IPC_protein.
- Protein dataset: IPC_protein was trained on 1,743 proteins with 10-fold cross-validation (data in Table 2) and tested on 581 proteins not used for training (data in the table above). Peptide dataset: IPC was trained on 12,662 peptides with 10-fold cross-validation (data in Table 2) and tested on 4,220 peptides not used for training (data in the table above). Outliers correspond to the number of predictions for which the difference between the experimental pI and the predicted pI was greater than the threshold of the mean standard error (MSE): 3 for the protein dataset and 0.25 for the peptide dataset.
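
The percent column can be reproduced approximately from the RMSD column. The following is a minimal sketch, assuming the tabulated percentage is (10^x - 1) * 100 with x the RMSD difference from the best method in each dataset; the function name `percent_difference` is hypothetical and is not part of IPC.

```python
def percent_difference(rmsd: float, rmsd_ref: float) -> float:
    """Percent by which an error of `rmsd` pH units exceeds `rmsd_ref`.

    Assumption: because pH is a log10 scale, a difference of x pH units
    is read as a factor of 10**x, reported as (10**x - 1) * 100 percent.
    """
    return (10 ** (rmsd - rmsd_ref) - 1) * 100


# RMSD values copied from the protein dataset columns of the table above.
ipc_protein = 0.874
toseland = 0.934

print(round(percent_difference(toseland, ipc_protein), 1))
```

With the rounded RMSDs this gives about 14.8%, close to the tabulated 14.9%; the small gap is presumably due to the table's percentages having been computed from unrounded RMSD values.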