| Dataset | Initial no. of entries | No. of entries with sequence and pI | No. of entries after removing outliers | No. of entries after removing duplicates |
|---|---|---|---|---|
| Gauci et al. | 5,758 | 5,758 | NA | NA |
| PHENYX | 7,582 | 7,582 | NA | NA |
| SEQUEST | 7,629 | 7,629 | NA | NA |
| IPC_peptide | - | 20,969 | 20,969 | 16,882 [25] [75] |
| SWISS-2DPAGE | 2,530 | 1,054 | 1,029 | 982 |
| PIP-DB | 4,947 | 2,427 | 2,254 | 1,307 |
| IPC_protein | - | 3,481 | 3,283 | 2,324 [25] [75] |
- NA (not available) indicates that the given dataset was not created at that stage because a merged version was used instead
- Note: all datasets presented in the table are available as hyperlinks; the final datasets were divided randomly into 75 % training and 25 % testing subsets (denoted as [75] and [25], respectively)
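
The random 75 % / 25 % division mentioned in the note can be illustrated with a minimal sketch. The function name `split_dataset`, the fixed seed, the truncating cut-off, and the placeholder records are all assumptions made for illustration; the table does not specify how the split was actually performed or what the exact subset sizes were.

```python
import random


def split_dataset(entries, train_fraction=0.75, seed=0):
    """Shuffle a copy of `entries` and split it into (train, test) lists.

    Hypothetical helper: the seed and the truncating cut-off are
    illustrative choices, not taken from the source.
    """
    shuffled = list(entries)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder records standing in for the 16,882 deduplicated IPC_peptide entries.
peptides = [f"peptide_{i}" for i in range(16882)]
train, test = split_dataset(peptides)
print(len(train), len(test))
```

Using a private `random.Random(seed)` instance keeps the split reproducible without altering the global random state, which is convenient when the same subsets must be regenerated later.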