Table 1 Descriptions of various datasets

From: A high-performance approach for predicting donor splice sites based on short window size and imbalanced large samples

Dataset            Number of true donor sites   Number of false donor sites
HS3Dall            2796                         271928
HS3DI              2796                         2796
HS3DII             2769                         5000
HS3DIII            2796                         10000
HS3DIV             2796                         15000
HS3D-test1:1       796                          796
HS3D-train1:1      2000                         2000
HS3D-train1:10     2000                         20000
HS3D-train1:20     2000                         40000
HS3D-train1:50     2000                         100000
HS3D-train1:135    2000                         271132
BG-570orig         2127                         149039
BG-570muta         2081                         149572
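
The degree of class imbalance in each dataset can be read off Table 1 by dividing the number of false donor sites by the number of true donor sites. The sketch below is only an illustration of that arithmetic: the counts are copied from the table as printed, while the names `TABLE_1` and `imbalance_ratio` are hypothetical and not taken from the paper.

```python
# Counts copied from Table 1 as printed: (true donor sites, false donor sites).
# TABLE_1 and imbalance_ratio are illustrative names, not from the paper.
TABLE_1 = {
    "HS3Dall": (2796, 271928),
    "HS3DI": (2796, 2796),
    "HS3DII": (2769, 5000),
    "HS3DIII": (2796, 10000),
    "HS3DIV": (2796, 15000),
    "HS3D-test1:1": (796, 796),
    "HS3D-train1:1": (2000, 2000),
    "HS3D-train1:10": (2000, 20000),
    "HS3D-train1:20": (2000, 40000),
    "HS3D-train1:50": (2000, 100000),
    "HS3D-train1:135": (2000, 271132),
    "BG-570orig": (2127, 149039),
    "BG-570muta": (2081, 149572),
}


def imbalance_ratio(true_sites: int, false_sites: int) -> float:
    """Return the number of false donor sites per true donor site."""
    return false_sites / true_sites


# Ratio of false to true donor sites for every dataset in the table.
ratios = {name: imbalance_ratio(t, f) for name, (t, f) in TABLE_1.items()}

for name, ratio in ratios.items():
    print(f"{name}: 1:{ratio:.1f}")
```

For the HS3D-train sets, the computed ratio matches the suffix in the dataset name (for example, 20000 / 2000 gives 1:10), and the 1:135 set corresponds to a ratio of roughly 135.6 false sites per true site.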