Table 1 Prediction of isoelectric point on the 25 % testing datasets

From: IPC – Isoelectric Point Calculator

| Protein method | RMSD | % | Outliers | Peptide method | RMSD | % | Outliers |
|---|---|---|---|---|---|---|---|
| IPC_protein | 0.874 | 0 | 46 | IPC_peptide | 0.251 | 0 | 232 |
| Toseland | 0.934 | 14.9 | 52 | Solomons | 0.255 | 0.9 | 235 |
| Bjellqvist | 0.944 | 17.7 | 47 | Lehninger | 0.262 | 2.5 | 236 |
| Dawson | 0.945 | 17.8 | 56 | EMBOSS | 0.325 | 18.5 | 372 |
| Wikipedia | 0.955 | 20.5 | 55 | Wikipedia | 0.421 | 47.9 | 1467 |
| Rodwell | 0.963 | 22.8 | 58 | Toseland | 0.425 | 49.1 | 990 |
| ProMoST | 0.966 | 23.6 | 52 | Sillero | 0.428 | 50.3 | 1223 |
| Grimsley | 0.968 | 24.2 | 60 | Dawson | 0.435 | 52.9 | 1432 |
| Solomons | 0.970 | 24.8 | 58 | Thurlkill | 0.481 | 69.7 | 1361 |
| Lehninger | 0.970 | 25.0 | 59 | Rodwell | 0.502 | 78.4 | 1359 |
| pIR | 1.013 | 38.0 | 58 | DTASelect | 0.550 | 99.1 | 1714 |
| Nozaki | 1.024 | 41.3 | 56 | Nozaki | 0.602 | 124.3 | 1368 |
| Thurlkill | 1.030 | 43.4 | 61 | Grimsley | 0.616 | 131.4 | 1550 |
| DTASelect | 1.032 | 44.1 | 58 | Bjellqvist | 0.669 | 161.5 | 1583 |
| pIPredict | 1.048 | 49.4 | 56 | pIPredict | 1.024 | 493.6 | 2720 |
| EMBOSS | 1.056 | 52.3 | 69 | ProMoST | 1.239 | 873.4 | 2649 |
| Sillero | 1.059 | 53.2 | 63 | pIR | 1.881 | 4159.7 | 3358 |
| Patrickios | 2.392 | 3201.8 | 227 | Patrickios | 1.998 | 5479.1 | 2739 |
| Avg_pI^a | 0.960 | 22.1 | 53 | Avg_pI | 0.454 | 59.6 | 1571 |
  1. ^a Average over all pKa sets except Patrickios (a highly simplified pKa set) and the IPC sets. Note that the average pI is calculated at the level of the individual protein or peptide, so it is not the average of the per-method values shown in the table.
  2. %: The pH scale is logarithmic with base 10, so the percent difference is derived from pow(10, x), where x is the difference between the RMSDs of two methods, expressed in pH units. For example, the % difference between Toseland and IPC_protein comes from pow(10, (0.934-0.874)), which is about 1.15, i.e. an increase of roughly 15 % (14.9 in the table).
  3. Protein dataset: IPC_protein was trained on 1,743 proteins with 10-fold cross-validation (data in Table 2) and tested on 581 proteins not used for training (data in the table above). Peptide dataset: IPC was trained on 12,662 peptides with 10-fold cross-validation (data in Table 2) and tested on 4,220 peptides not used for training (data in the table above). Outliers are the predictions for which the difference between the experimental pI and the predicted pI exceeded the mean standard error (MSE) threshold: 3 for the protein dataset and 0.25 for the peptide dataset.
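
The quantities described in the footnotes can be sketched in a few lines of Python. This is only an illustration, not the IPC implementation: the function names are hypothetical, the percent column is assumed to be (10^x - 1) × 100 with x the RMSD gap, and the outlier rule is assumed to compare the squared error of each prediction against the threshold, which is one possible reading of the footnote.

```python
import math


def percent_difference(rmsd_method, rmsd_reference):
    """Percent increase implied by an RMSD gap on the log10 pH scale.

    Assumption: the table's % column is (10**x - 1) * 100, where x is the
    difference between the two RMSD values in pH units.
    """
    return (10 ** (rmsd_method - rmsd_reference) - 1.0) * 100.0


def rmsd(experimental, predicted):
    """Root-mean-square deviation between experimental and predicted pI."""
    if not experimental or len(experimental) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - p) ** 2 for e, p in zip(experimental, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def count_outliers(experimental, predicted, threshold):
    """Count predictions whose squared error exceeds the threshold.

    Assumption: the threshold (3 for proteins, 0.25 for peptides) applies
    to the squared error of a single prediction.
    """
    return sum(
        1 for e, p in zip(experimental, predicted) if (e - p) ** 2 > threshold
    )


# Toseland vs. IPC_protein, using the rounded RMSD values from the table.
toseland_vs_ipc = percent_difference(0.934, 0.874)
print(f"Toseland vs. IPC_protein: {toseland_vs_ipc:.1f} %")
```

With the rounded RMSD values the result lands close to, but not exactly on, the 14.9 reported in the table; the gap is presumably due to the table being computed from unrounded RMSDs.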