Figure 3. Effect of sequence homology on PDB-based PREDITOR predictions.


Figure 3 shows a scatter plot of the φ/ψ[30] accuracy of the PDB-derived predictions against percent sequence identity. A total of 31 query proteins were used, each with an average of 5.5 homologues in the PDB, yielding a plot with nearly 170 data points. The best-fit hyperbolic curve (y = A - B/x, where A = 103.8 and B = 1533.6; correlation coefficient = 0.75) closely follows what is found for homology modeling, with high sequence identities (>60%) yielding very high φ/ψ[30] values. As the sequence identity drops below 60%, the accuracy falls quickly, so that by 40% sequence identity the φ/ψ[30] values are no better than 40%. Curves drawn for the other accuracy cut-offs, φ/ψ[40] and φ/ψ[50], show similar trends, but with the upper plateau extending further to the left, toward lower sequence identities (data not shown). These data were used to derive the following scaling factors, by which the secondary structure-specific baseline errors were multiplied according to the sequence identity of the PDB homologues:

Sequence identity    Scaling factor
100-90%              1.00
90-80%               1.05
80-70%               1.1
70-60%               1.2
< 60%                1.3
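
The fitted curve and the scaling-factor lookup can be illustrated with a minimal Python sketch. It is not the PREDITOR implementation. The function names are hypothetical, the dash in front of each factor in the table is assumed to be a column separator rather than a minus sign, and each bin is assumed to include its lower bound (so exactly 90% identity takes the factor 1.00), since the table does not say how boundary values are assigned.

```python
# Constants of the best-fit hyperbolic curve y = A - B/x quoted in the text.
A = 103.8
B = 1533.6


def fitted_accuracy(identity_pct: float) -> float:
    """Fitted phi/psi[30] accuracy (%) at a given % sequence identity."""
    if identity_pct <= 0:
        raise ValueError("sequence identity must be a positive percentage")
    return A - B / identity_pct


# (lower bound of bin in %, scaling factor), taken from the table.
# Assumption: the lower bound of each bin is inclusive.
SCALING_BINS = [(90.0, 1.00), (80.0, 1.05), (70.0, 1.1), (60.0, 1.2)]
FACTOR_BELOW_60 = 1.3


def scaling_factor(identity_pct: float) -> float:
    """Look up the scaling factor for a homologue's % sequence identity."""
    for lower_bound, factor in SCALING_BINS:
        if identity_pct >= lower_bound:
            return factor
    return FACTOR_BELOW_60


def scaled_error(baseline_error: float, identity_pct: float) -> float:
    """Multiply a secondary structure-specific baseline error by the factor."""
    return baseline_error * scaling_factor(identity_pct)
```

For example, `scaled_error(20.0, 75.0)` multiplies a baseline error of 20 by the factor 1.1 for the 80-70% bin, and `fitted_accuracy(100.0)` evaluates the curve at full sequence identity, giving roughly 88.5.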
