Table 3 Prediction of isoelectric points for SWISS-2DPAGE and PIP-DB databases

From: IPC – Isoelectric Point Calculator

| Method (SWISS-2DPAGE) | RMSD | % | Outliers | Method (PIP-DB) | RMSD | % | Outliers |
|---|---|---|---|---|---|---|---|
| IPC_protein | 0.476 | 0 | 10 | IPC_protein | 1.019 | 0 | 141 |
| Toseland | 0.521 | 10.9 | 18 | Toseland | 1.086 | 16.7 | 153 |
| Bjellqvist | 0.590 | 30.0 | 31 | Bjellqvist | 1.085 | 16.3 | 150 |
| ProMoST | 0.597 | 32.1 | 29 | Dawson | 1.081 | 15.3 | 161 |
| Dawson | 0.599 | 32.5 | 37 | Wikipedia | 1.087 | 16.9 | 163 |
| Wikipedia | 0.619 | 39.0 | 35 | Rodwell | 1.095 | 19.1 | 167 |
| Rodwell | 0.628 | 41.7 | 37 | Grimsley | 1.121 | 26.6 | 170 |
| Grimsley | 0.572 | 24.5 | 21 | Solomons | 1.103 | 21.4 | 159 |
| Solomons | 0.635 | 44.2 | 44 | Lehninger | 1.102 | 21.1 | 161 |
| Lehninger | 0.640 | 45.8 | 44 | ProMoST | 1.111 | 23.5 | 150 |
| Nozaki | 0.679 | 59.4 | 43 | pIR | 1.152 | 35.8 | 184 |
| Thurlkill | 0.691 | 63.9 | 39 | Nozaki | 1.165 | 39.9 | 170 |
| DTASelect | 0.677 | 58.8 | 35 | Thurlkill | 1.180 | 44.9 | 176 |
| EMBOSS | 0.724 | 76.9 | 49 | DTASelect | 1.186 | 47.1 | 173 |
| Sillero | 0.721 | 75.5 | 50 | pIPredict | 1.195 | 50.0 | 182 |
| pIR | 0.761 | 92.4 | 37 | EMBOSS | 1.198 | 51.2 | 191 |
| pIPredict | 0.768 | 95.9 | 33 | Sillero | 1.202 | 52.4 | 187 |
| Patrickios | 1.600 | 1227.9 | 243 | Patrickios | 2.623 | 3918 | 604 |
| Avg_pI^a | 0.614 | 37.1 | 32 | Avg_pI^a | 1.101 | 20.9 | 160 |

  1. ^a Average over all pKa sets except the Patrickios set (a highly simplified pKa set) and the IPC sets. Note that the average pI is calculated at the level of the individual protein or peptide.
  2. Both SWISS-2DPAGE and PIP-DB were cleaned of outliers (MSE > 3 between the experimental pI and the average predicted pI) and clustered with CD-HIT at a 99% sequence identity threshold, as described in the Materials and Methods (982 and 1,307 proteins, respectively), but they were not divided into training and testing datasets. Thus, the results for the IPC sets are slightly overestimated, although the effect is minor, as shown by the comparison of Tables 1 and 2.
  3. Outliers are the number of predictions for which the difference between the experimental pI and the predicted pI exceeded the threshold of an MSE of 3 for the protein dataset.
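
As a rough illustration of how the RMSD, outlier, and average-pI columns could be computed from paired experimental and predicted pI values, here is a minimal Python sketch. The function names and the sample values are hypothetical, and reading "an MSE of 3" as a squared error above 3 for a single prediction is an assumption rather than something the table notes spell out.

```python
import math

def rmsd(experimental, predicted):
    """Root-mean-square deviation between paired pI values."""
    squared = [(e - p) ** 2 for e, p in zip(experimental, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def count_outliers(experimental, predicted, threshold=3.0):
    """Count predictions whose squared error exceeds the threshold.

    Assumption: "MSE of 3" is interpreted as a per-prediction squared error > 3.
    """
    return sum((e - p) ** 2 > threshold for e, p in zip(experimental, predicted))

def average_pi(predictions_by_set):
    """Average the predicted pI per protein across several pKa sets.

    `predictions_by_set` maps a pKa set name to a list of predicted pI values,
    one per protein, all lists in the same protein order.
    """
    columns = zip(*predictions_by_set.values())
    return [sum(col) / len(col) for col in columns]

# Made-up values for illustration only; they are not taken from the table.
experimental = [5.0, 7.0, 9.0]
by_set = {"set_a": [5.5, 7.0, 6.0], "set_b": [4.5, 7.0, 8.0]}

avg = average_pi(by_set)                          # per-protein average pI
error = rmsd(experimental, avg)                   # RMSD of the averaged prediction
n_outliers = count_outliers(experimental, avg)    # predictions beyond the threshold
```

Averaging per protein before computing the RMSD mirrors the first table note, which states that the average pI is calculated at the level of the individual protein or peptide rather than by averaging the per-method RMSD values.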