
Table 6. Summary of the physical terms T_j in the scoring function of the pkaPS predictor.

From: pkaPS: prediction of protein kinase A phosphorylation sites with the simplified kinase-substrate binding model

| T_j | Property | Positions | α_ppt,j | Description |
|---|---|---|---|---|
| T_1 | (+) H, K, R | -3/-2 | 1.0 | Positive charge |
| T_2 | EISD860102 [29] | -3/-2 | 0.030 | Hydrophilic residues |
| T_3 | ZIMJ680104 [87] | -6 to -2 | 0.020 | Isoelectric point (positive charge), long range |
| T_4 | (+) H, K, R; (-) D, E | -6 to -2 | 0.48 | Total charge, long range |
| T_5 | GEIM800106 [34] | +1 | 0.070 | β-strand preference |
| T_6 | GEIM800107 [34] | +1/+4 | 0.040 | β-strand preference, compensated |
| T_7 | HAGECH94_V [88] | +2/+3 | 0.040 | Size restrictions |
| T_8 | KARP850101 [89] | +3 | 0.040 | Flexibility |
| T_9 | KARP850101 [89] | -9 to -4 | 0.040 | Minimal linker – flexibility |
| T_10 | EISD840101 [29] | -9 to -4 | 0.040 | Minimal linker – hydrophilicity |
| T_11 | EISD840101 [29] | +4 to +9 | 0.058 | Minimal linker – hydrophilicity |
| T_12 | KARP850101 [89] | +4 to +9 | 0.058 | Minimal linker – flexibility |
| T_13 | CIDH920105 [90] | -18 to -6, +6 to +23 | 0.040 | Avoid buried regions – hydrophilicity |
| T_14 | VINM940101 [30] | -18 to -6, +6 to +23 | 0.040 | Avoid buried regions – flexibility |

The table lists all physical property terms in the score function. The values t_j in equations 12 and 13 are set to zero for Gaussian-type terms and to 0.1 for the fixed penalties T_1 and T_4. The only adjustable parameter per term is the weight α_ppt,j (equation 10). These parameters were selected so that S_ppt is close to zero for most of the learning set examples. The values α_profile,j (equation 9) for the positions -6 to +6 are the following multiples of a normalization factor of 0.051: 7, 6, 3, 5, 6, 6, 6, 3, 2, 5, 3, 4, and 1. Initial guesses for the α_ppt,j and α_profile,j parameters were calculated with linear-kernel support vector machines as implemented in the LIBSVM library [91]. These weights were subsequently rounded to two significant digits and edited manually to avoid non-positive numbers and to achieve close-to-zero T_j values for the learning set sequences.
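The profile weights described in the note can be recovered with a short calculation. The sketch below multiplies each listed multiple by the normalization factor 0.051. It assumes that the thirteen multiples are given in ascending order of position, from -6 to +6; the note does not state the order explicitly. The names `NORMALIZATION`, `MULTIPLES`, `POSITIONS` and `profile_weights` are illustrative and do not come from the pkaPS software.

```python
# Normalization factor and multiples as quoted in the table note.
NORMALIZATION = 0.051
MULTIPLES = [7, 6, 3, 5, 6, 6, 6, 3, 2, 5, 3, 4, 1]

# Assumption: the multiples are listed for positions -6, -5, ..., +6 in that order.
POSITIONS = list(range(-6, 7))


def profile_weights(multiples=MULTIPLES, factor=NORMALIZATION):
    """Return a dict mapping sequence position -> profile weight (multiple * factor)."""
    if len(multiples) != len(POSITIONS):
        raise ValueError("expected one multiple per position from -6 to +6")
    # Round to three decimals to hide floating-point noise in the products.
    return {pos: round(m * factor, 3) for pos, m in zip(POSITIONS, multiples)}


weights = profile_weights()
for pos, w in weights.items():
    print(f"{pos:+d}: {w:.3f}")
```

Under this ordering assumption, the largest weight falls on position -6 and the smallest on position +6. If the original ordering differs, only the `POSITIONS` list needs to change.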