Table 1. Optimized scaling factors ($K_{nm}$) used in Eq. 1, as determined using a simple grid search protocol. The index “m” runs from 1 to 7 and corresponds to SeqSim, 1Hα, 13Cα, 13Cβ, 13CO, 15N and 1HN, respectively. The index “n” corresponds to the residue position (-1, 0 or 1) in each amino acid triplet.

______________________________________________________
Residue   SeqSim   1Hα   13Cα   13Cβ   13CO   15N   1HN
______________________________________________________
n=-1      0.5      37    11     9      5      1     1
n=0       2.5      31    14     14     6      1.5   0.3
n=1       1.5      37    7      7      4      2     1.5
______________________________________________________

Explanation of how these factors are used to calculate the similarity score.

PREDITOR makes backbone torsion angle predictions using the chemical shifts of successive amino acid triplets from the query sequence, each of which is compared to all triplets contained in the PREDITOR database. For each query triplet “i” and each database triplet “j” (with the same central residue), the similarity score S(i,j) is calculated using Equation 1:

$$
\begin{aligned}
S(i,j) = \sum_{n=-1}^{1} \Big[\, & 0.5\,K_{n1}\,\mathrm{SeqSim}
 + K_{n2}\,\lvert \delta\Delta\mathrm{H}\alpha_{i+n} - \delta\Delta\mathrm{H}\alpha_{j+n} \rvert
 + K_{n3}\,\lvert \delta\Delta\mathrm{C}\alpha_{i+n} - \delta\Delta\mathrm{C}\alpha_{j+n} \rvert \\
 & + K_{n4}\,\lvert \delta\Delta\mathrm{C}\beta_{i+n} - \delta\Delta\mathrm{C}\beta_{j+n} \rvert
 + K_{n5}\,\lvert \delta\Delta\mathrm{CO}_{i+n} - \delta\Delta\mathrm{CO}_{j+n} \rvert \\
 & + K_{n6}\,\lvert \delta\Delta\mathrm{N}_{i+n} - \delta\Delta\mathrm{N}_{j+n} \rvert
 + K_{n7}\,\lvert \delta\Delta\mathrm{HN}_{i+n} - \delta\Delta\mathrm{HN}_{j+n} \rvert \,\Big]
 \qquad (1)
\end{aligned}
$$

where the sum runs over the triplet positions n = -1 to 1, $K_{nm}$ is the empirically determined weighting coefficient (see Table 1) for residue position “n” and term “m”, SeqSim is the sequence similarity between the two sequence triplets according to the SeqSee weight matrix, and $\delta\Delta X$ is the secondary chemical shift of nucleus X (i.e. the difference between the observed chemical shift and the random coil value of Wishart et al.). The combined similarity score S(i,j) between the query triplet and each database triplet is calculated in turn for all 3755 triplets in the database. The ten triplets with the lowest scores are then selected.
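The scoring and selection step can be sketched in Python as follows. This is a minimal illustration and not the PREDITOR source code: the function names, the dictionary layout for the secondary shifts, and the use of a single precomputed SeqSim value per triplet pair are all assumptions made for the example, and every shift is assumed to be available. The weights are those of Table 1, keyed by nucleus name so that each weight stays paired with its own term.

```python
import heapq

# Weights from Table 1, keyed by residue offset n and by term name.
WEIGHTS = {
    -1: {"SeqSim": 0.5, "HA": 37, "CA": 11, "CB": 9, "CO": 5, "N": 1, "HN": 1},
    0: {"SeqSim": 2.5, "HA": 31, "CA": 14, "CB": 14, "CO": 6, "N": 1.5, "HN": 0.3},
    1: {"SeqSim": 1.5, "HA": 37, "CA": 7, "CB": 7, "CO": 4, "N": 2, "HN": 1.5},
}
NUCLEI = ("HA", "CA", "CB", "CO", "N", "HN")


def similarity_score(query, candidate, seq_sim):
    """Evaluate Eq. 1 for one query/database triplet pair.

    query, candidate: dicts mapping offset n (-1, 0, 1) to a dict of
        secondary shifts per nucleus (hypothetical layout).
    seq_sim: sequence similarity of the two triplets, assumed to be
        precomputed elsewhere (e.g. from the SeqSee weight matrix).
    """
    score = 0.0
    for n in (-1, 0, 1):
        k = WEIGHTS[n]
        score += 0.5 * k["SeqSim"] * seq_sim
        for nucleus in NUCLEI:
            score += k[nucleus] * abs(query[n][nucleus] - candidate[n][nucleus])
    return score


def best_matches(query, database, seq_sims, count=10):
    """Return the `count` lowest (score, database index) pairs."""
    scored = (
        (similarity_score(query, cand, seq_sims[idx]), idx)
        for idx, cand in enumerate(database)
    )
    return heapq.nsmallest(count, scored)
```

Here `best_matches` scans the whole database once and keeps only the ten lowest-scoring entries, which mirrors the selection described above without sorting all 3755 scores.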
The torsion angles of the central residue of each of the ten lowest-scoring triplets are then extracted to estimate the most likely torsion angles for the central query residue. The ten sets of predicted torsion angles are clustered, using a simple hierarchical clustering algorithm, by evaluating the differences between them. Clusters are grouped if the difference of the φ or ψ angles is less than 15°. The cluster with the highest overall S(i,j) score is then selected, and the mean φ and ψ angles for that cluster are used as the predicted torsion angles for the central residue of the query triplet. To prevent spurious predictions, secondary structures are also predicted using chemical shift indices, and these predictions are used to confirm or help select the correct torsion angles for secondary structure elements.
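The clustering and averaging step might look like the sketch below. It is only an illustration under several assumptions that the text does not settle: single-linkage grouping is used as one simple form of hierarchical clustering, two matches are linked only when both their φ and their ψ differences are below the cutoff, the "overall" cluster score is taken to be the sum of the member scores, and a circular mean replaces the plain mean so that angles near ±180° average sensibly. All function names are hypothetical, and the secondary structure check is not included.

```python
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def circular_mean(angles):
    """Mean of angles in degrees that respects the wrap at +/-180."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))


def cluster_matches(matches, cutoff=15.0):
    """Single-linkage grouping of (score, phi, psi) tuples.

    Assumption: two matches are linked when BOTH angle differences
    are below `cutoff` degrees.
    """
    parent = list(range(len(matches)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(len(matches)):
        for b in range(a + 1, len(matches)):
            _, phi_a, psi_a = matches[a]
            _, phi_b, psi_b = matches[b]
            if (angle_diff(phi_a, phi_b) < cutoff
                    and angle_diff(psi_a, psi_b) < cutoff):
                parent[find(a)] = find(b)

    clusters = {}
    for i, match in enumerate(matches):
        clusters.setdefault(find(i), []).append(match)
    return list(clusters.values())


def predict_torsion(matches, cutoff=15.0):
    """Pick the cluster with the highest summed score; average its angles."""
    clusters = cluster_matches(matches, cutoff)
    best = max(clusters, key=lambda members: sum(m[0] for m in members))
    phi = circular_mean([m[1] for m in best])
    psi = circular_mean([m[2] for m in best])
    return phi, psi
```

For example, `predict_torsion([(5.0, -60, -45), (6.0, -65, -40), (3.0, -120, 130)])` groups the first two matches into one cluster and returns angles close to their average.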