Table 5 Detailed statistics for the available datasets

From: IPC – Isoelectric Point Calculator

| Dataset | Initial no. entries | No. entries with sequence and pI | No. entries after removing outliers | No. entries after removing duplicates |
| --- | --- | --- | --- | --- |
| Gauci et al. | 5,758 | 5,758 | NA | NA |
| PHENYX | 7,582 | 7,582 | NA | NA |
| SEQUEST | 7,629 | 7,629 | NA | NA |
| IPC_peptide | - | 20,969 | 20,969 | 16,882 [25] [75] |
| SWISS-2DPAGE | 2,530 | 1,054 | 1,029 | 982 |
| PIP-DB | 4,947 | 2,427 | 2,254 | 1,307 |
| IPC_protein | - | 3,481 | 3,283 | 2,324 [25] [75] |

  1. NA (not available) indicates that the given dataset was not created because a merged version was used instead
  2. Note: all datasets presented in the table are available as hyperlinks; the final datasets were randomly divided into 75% training and 25% testing subsets (denoted [75] and [25], respectively)
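
The random 75%/25% division described in the note can be illustrated with a short sketch. This is not the authors' procedure: the function name `split_dataset`, the fixed seed, the rounding rule, and the placeholder identifiers are all assumptions made for illustration. Only the 75/25 ratio and the IPC_protein size of 2,324 entries come from the table.

```python
import random


def split_dataset(entries, train_fraction=0.75, seed=0):
    """Shuffle a copy of `entries` and split it into (training, testing) lists.

    Hypothetical helper: the seed and the rounding of the cut point are
    assumptions, not details taken from the source.
    """
    shuffled = list(entries)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for the 2,324 IPC_protein entries.
entries = [f"protein_{i}" for i in range(2324)]
train, test = split_dataset(entries)
print(len(train), len(test))  # 1743 581
```

Shuffling a copy with a seeded generator keeps the split reproducible and guarantees that every entry lands in exactly one of the two subsets.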