Table of Contents
Evolution teaches to predict protein structure and function
Evolution teaches prediction
http://cubic.bioc.columbia.edu/
CUBIC http://cubic.bioc.columbia.edu
The Data Deluge
Data Deluge: what do we want?
Data Deluge: numbers
Data Deluge: what CAN we do?
Data Deluge: what CAN we do?
Evolution teaches prediction
Dynamic programming: optimal alignment
BLAST: fast matching of single 'words'
Profile-based comparison
Zones
Sequence -> Structure
Sequence -> Structure
Sequence -> Structure
Twilight zone = false positives explode
Significant sequence identity
Evolution did it !
Similar sequence -> similar structure?
Detecting true hits in Twilight zone
Finding similar structures in Twilight zone
'Secure' thresholds for BLAST
Accuracy vs. coverage
BLAST is not enough ...
Sequence Space Hopping
Success through sequence space hopping
Zones
Profile-based database search
Profile-based database search
Profile-based database search
Profile-based database search
Profile-based database search
Profile-based database search
Zones
Hypothetical distribution of similar structures
PPT Slide
Midnight zone: real - random
Evolution into the Midnight zone
Protein structures evolved at random - almost
Structure space
Gold-mine out of reach!
Conservation of function
Conservation of EC number
Conservation of EC number 2
Conservation of EC number: BLAST
Conservation in detail
Accuracy vs. coverage: EC number
Conservation of EC numbers
Evolution teaches prediction
Notation: protein structure 1D, 2D, 3D
PPT Slide
PPT Slide
Goal of structure prediction
Protein structure prediction in reality
PPT Slide
Homology modelling/comparative modelling
Protein structure prediction in reality
Protein structure prediction in reality
Structure prediction for protein universe
Improving prediction by waiting it out ...
Evolution teaches prediction
Evolution did it !
PPT Slide
PPT Slide
PPT Slide
Evolution teaches prediction
Membrane prediction
HTM prediction waiting for database growth ...
Topology for membrane helical proteins
PHDsec success on Poly-Valine
PPT Slide
Refine by dynamic programming on NN 'energy'
PHDhtm refine topology prediction
PHDhtm on Poly-Valine
Example IS representative
To be or not to be (HTM)
False positives: globular proteins
Details PHDsec: Wrong alignment
Details PHDhtm: wrong for 'safe' alignment
Details PHDhtm: correct for accurate alignment
Evolution teaches prediction
Defining residue solvent accessibility
PPT Slide
Evolution for accessibility prediction
PHDacc: the un-g(l)ory details
Evolution teaches prediction
Evolution teaches prediction
PPT Slide
PPT Slide
PPT Slide
PPT Slide
PPT Slide
Shuttle into the nucleus
How many NLS motifs in databases?
Experimental NLS: positive charges
Experimental NLS: more complicated
In silico mutagenesis
Increasing accuracy and coverage
Increasing accuracy and coverage
Increasing accuracy and coverage
Increasing accuracy and coverage
Increasing accuracy and coverage
Nuclear protein in proteomes
Un-annotated nuclear proteins with NLS
Using NLS to bind DNA
DNA-binding predictions in proteomes
Rotation @ CUBIC.bioc.columbia.edu
Significant motifs
Rotation @ CUBIC.bioc.columbia.edu
Finding unique subsets of proteins
Similar sequence -> similar structure?
Rotation @ CUBIC.bioc.columbia.edu
Retention signals in ER and Golgi
Evolution teaches prediction
PPT Slide
Family size
Structure prediction for protein universe
Do we aim at getting one structure per fold?
Similar amino acid composition
Inventory of life: membrane proteins
Number of membrane helices -> complexity?
Membrane proteins: kingdoms invented different tricks
The membrane LEGO
Length of globular regions in membrane proteins
Inventory of life: coiled-coil proteins
Coiled-coil proteins: details
Inventory of life: compartments
Protein structure universe
Distribution of protein length
Bottleneck 5: money ...
What will we get?
Recipe to determine targets
Alternative recipe to determine targets
Reality check: the invaluable contribution of bioinformatics to target selection
Target selection
Priority classes
Target selection machinery
Conclusions: Structural Genomics
Evolution teaches prediction
Midnight zone STRONGLY populated
What we are threading for
Goals of fold recognition, threading, remote homology modelling
Two paths to fold recognition
TOPITS
Prediction-based threading
Example of remote sequence identity
30% correct first, better if stronger
Other threading methods
Evolution teaches prediction
Long floppy regions
Floppy loops between domains
Floppy ends
Floppy-wrap
Weirdoes
Weirdoes are not alone !
10% of biomass weird !
Length distribution of floppy regions
Weirdoes functional !
Yeast-2-hybrid interactions
Evolution teaches prediction
Conclusions
Thanksgiving
Availability of methods