Table 3 Dataset’s properties

From: A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

| Dataset | Mark | GEO accession | Number of sequences | Shortest sequence (residues) | Longest sequence (residues) | Total length (residues) | Size (FASTA format) | Size (BED format) | Reference |
|---|---|---|---|---|---|---|---|---|---|
| DM230 | PolII (RNA polymerase II) | GSM722763 | 105 | 157 | 1728 | 47242 | 49 KB | 5 KB | [82] |
| DM05 | p300 (co-activator protein) | GSM722762 | 142 | 130 | 1214 | 50318 | 53 KB | 7 KB | [82] |
| DM254 | CTCF (insulator binding protein) | GSM722759 | 4009 | 94 | 2374 | 1518265 | 1604 KB | 181 KB | [82] |
| DM01 | H3K4me1 (histone H3 lysine 4 monomethylation) | GSM722760 | 2001 | 175 | 8520 | 1856431 | 1871 KB | 88 KB | [82] |
| DM721 | H3K27ac (H3 lysine 27 acetylation) | GSM851275 | 4005 | 255 | 16542 | 5429909 | 5423 KB | 180 KB | [82] |