Expected prediction accuracy of PHD
Contact: Burkhard Rost (rost@EMBL-Heidelberg.de)
Date: June, 1996
Figure captions:
- Dataset: Cross-validation experiment on 705 sequence-unique
protein chains (largest sequence-unique subset of PDB in May, 1996, i.e., no
two proteins in that set have more than 25% pairwise identical
residues).
- Fig. 1:
Distribution of per-residue prediction accuracy (percentage of residues
predicted correctly in any of the three states: helix, strand, rest).
- Fig. 1a:
Subset of proteins with fewer than 100 residues.
- Fig. 1b:
Subset of proteins with fewer than 200 residues.
- Fig. 1c:
Subset of proteins with more than 200 residues.
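The per-residue accuracy plotted in Fig. 1 can be sketched as follows. This is a minimal illustration, not the PHD code: the function name `q3` and the encoding of the three states as one-letter strings (H = helix, E = strand, L = rest) are assumptions made for this example, and the sequences are invented.

```python
def q3(observed: str, predicted: str) -> float:
    """Percentage of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("empty sequence")
    # Count positions where prediction and observation agree.
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Invented toy chain (hypothetical data, for illustration only).
obs = "HHHHLLEEEELL"
pred = "HHHLLLEEELLH"
print(q3(obs, pred))  # 9 of 12 residues agree -> 75.0
```

Computing this value once per protein chain and binning the results would give a distribution of the kind shown in Fig. 1.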
- Reliability of correctly predicting residues.
Shown is the expected per-residue accuracy for residues with a
reliability index (RI) above a given cutoff.
- Fig. 2a:
Overall per-residue accuracy (three states: H, E, L). For example,
PHDsec reaches a level of accuracy comparable to homology modelling for
48% of all residues (RI > 6).
- Fig. 2b:
Per-residue accuracy for correctly predicting helix, strand, and
other.
Percentage of observed: number of correctly predicted residues in each
state (H, E, L) / number of residues observed in that state (H, E, L).
- Fig. 2c:
Per-residue accuracy for correctly predicting helix, strand, and
other.
Percentage of predicted: number of correctly predicted residues in
each state (H, E, L) / number of residues predicted in that state (H, E, L).
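The two per-state ratios used in Figs. 2b and 2c differ only in their denominator. A minimal sketch, assuming one-letter state strings and a hypothetical helper name `per_state_accuracy` (neither is taken from PHD itself):

```python
def per_state_accuracy(observed: str, predicted: str, states: str = "HEL"):
    """Return {state: (pct_of_observed, pct_of_predicted)}.

    A ratio is None when its denominator is zero (state never observed
    or never predicted).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    result = {}
    for s in states:
        # Residues both observed and predicted in state s.
        correct = sum(o == s and p == s for o, p in zip(observed, predicted))
        n_obs = observed.count(s)    # denominator for "percentage of observed"
        n_pred = predicted.count(s)  # denominator for "percentage of predicted"
        result[s] = (
            100.0 * correct / n_obs if n_obs else None,
            100.0 * correct / n_pred if n_pred else None,
        )
    return result


# Invented toy chain (hypothetical data, for illustration only).
obs = "HHHHLLEEEELL"
pred = "HHHLLLEEELLH"
acc = per_state_accuracy(obs, pred)
print(acc["E"])  # (75.0, 100.0): 3 of 4 observed, 3 of 3 predicted
print(acc["L"])  # (75.0, 60.0): 3 of 4 observed, 3 of 5 predicted
```

In this toy case strand is under-predicted (every predicted strand residue is right, but one observed strand residue is missed), while "other" is over-predicted.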
- Reliability of correctly predicting accessibility.
- Fig. 2d:
Overall per-residue accuracy (two states: buried/exposed). For example,
PHDacc reaches a level of accuracy comparable to homology modelling for
50% of all residues (RI > 2).
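The reliability curves in Fig. 2 relate two quantities: the fraction of residues whose RI lies above a cutoff, and the accuracy within that fraction. A minimal sketch under stated assumptions: the helper name `accuracy_above_cutoff` is hypothetical, the cutoff is applied strictly (RI > cutoff, following the "RI > 6" notation in the captions), and the RI values are invented.

```python
def accuracy_above_cutoff(observed, predicted, ri, cutoff):
    """Return (coverage_pct, accuracy_pct) for residues with RI > cutoff.

    coverage_pct: percentage of all residues that pass the cutoff.
    accuracy_pct: percentage of those residues predicted correctly,
                  or None if no residue passes.
    """
    if not (len(observed) == len(predicted) == len(ri)):
        raise ValueError("all inputs must have the same length")
    if not ri:
        raise ValueError("empty input")
    # Keep only (observed, predicted) pairs whose reliability exceeds the cutoff.
    selected = [(o, p) for o, p, r in zip(observed, predicted, ri) if r > cutoff]
    if not selected:
        return 0.0, None
    coverage = 100.0 * len(selected) / len(ri)
    accuracy = 100.0 * sum(o == p for o, p in selected) / len(selected)
    return coverage, accuracy


# Invented toy chain with invented reliability indices (illustration only).
obs = "HHHHLLEEEELL"
pred = "HHHLLLEEELLH"
ri = [9, 8, 7, 2, 5, 4, 9, 8, 7, 1, 6, 3]

print(accuracy_above_cutoff(obs, pred, ri, 0))  # (100.0, 75.0)
print(accuracy_above_cutoff(obs, pred, ri, 6))  # (50.0, 100.0)
```

Raising the cutoff trades coverage for accuracy, which is the relationship the captions summarize with statements such as "48% of all residues (RI > 6)".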
- Fig. 3:
Distribution of per-segment prediction accuracy (percentage of
overlapping segments, as defined in Rost et al., 1994).
- Fig. 4:
Distribution of 'bad' predictions, i.e., residues observed in helix but
predicted as strand, or vice versa.
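The 'bad' predictions of Fig. 4 can be counted as sketched below. The caption does not state the denominator, so this example assumes the count is expressed as a percentage of all residues in the chain; the function name `bad_prediction_percentage` and the sequences are likewise hypothetical.

```python
# Helix/strand confusions, in (observed, predicted) order.
BAD_PAIRS = {("H", "E"), ("E", "H")}


def bad_prediction_percentage(observed: str, predicted: str) -> float:
    """Percentage of residues observed in helix but predicted as strand, or vice versa.

    Assumption: normalised by the total chain length.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("empty sequence")
    bad = sum((o, p) in BAD_PAIRS for o, p in zip(observed, predicted))
    return 100.0 * bad / len(observed)


# Invented toy chain (hypothetical data, for illustration only).
obs = "HHHHEEEELL"
pred = "HHHEEEEHLL"
print(bad_prediction_percentage(obs, pred))  # 2 of 10 residues -> 20.0
```

Confusions with the third state (for example helix predicted as "other") are not counted here, matching the caption's definition.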