| Title: | Static benchmarking of membrane helix predictions |
| Author: | Andrew Kernytsky & Burkhard Rost |
| Quote: | Nucl Acids Res, 2003, volxx, pages_xx |
Static benchmarking of membrane helix predictions
| 1 | CUBIC, Dept. of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA |
| 2 | Columbia University Center for Computational Biology and Bioinformatics (C2B2), Russ Berrie Pavilion, 1150 St. Nicholas Avenue, New York, NY 10032, USA |
| 3 | North East Structural Genomics Consortium (NESG), Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA |
| * | Corresponding author: email = amk2002@columbia.edu URL http://cubic.bioc.columbia.edu/ Tel: +1-212-305-4018, fax: +1-212-305-7932 |
This article is published in (Nucleic Acids Research, issue, date and pages) © copyright Oxford University Press (2003). OUP is the only authorised source. All copying of this article including placing on another website requires the written permission of the copyright owner.
Prediction of trans-membrane helices continues to be a difficult task: a few prediction methods clearly take the lead, but none of these is best on all accounts. Recently, we carefully set up protocols for benchmarking the most relevant aspects of prediction accuracy and applied them to over 30 prediction methods. Here, we present the extension of that analysis to an automatic web server that evaluates new methods (http://cubic.bioc.columbia.edu/tmh_benchmark/). The most important features of the tool are: (1) Any new method is compared to a battery of well-established tools. (2) The battery of measures explored allows spotting strengths in methods that may not be 'best' overall. In particular, we report per-residue and per-segment scores for accuracy, and the error rates for confusing membrane helices with globular proteins or signal peptides. An additional feature is that developers can directly investigate any hydrophobicity scale for its potential in predicting membrane helices.
Key words: genome sequence analysis, predicting globularity, protein domains, protein structure prediction, solvent accessibility, multiple alignments, transmembrane helices, bioinformatics.
Membrane-spanning proteins are vital for cells to function [1, 2] . However, it is very difficult to determine high-resolution three-dimensional structures for these proteins experimentally: fewer than 50 structures are currently deposited in PDB [3, 4] . Data from C-terminal fusions with indicator proteins [5, 6] and from antibody-binding studies [7, 8] reveal the location of the helices and their orientation with respect to the protein termini. We refer to these data - slightly incorrectly - as 'low-resolution structures'. Möller, Apweiler, and colleagues at the EBI have carefully hand-selected the results from low-resolution experiments for about 500 proteins [9] . We took the high-resolution set from PDB, added the low-resolution set from the EBI, and filtered out the redundancy among very similar proteins by creating the largest possible subset in which structural similarity between any pair of proteins cannot be inferred from sequence alone ('sequence-unique' subset [10, 11] ). Our previous work gives a comprehensive evaluation of transmembrane helix prediction methods based on these sets [11, 12] . Our automatic benchmarking server makes the same evaluation available for new methods: it applies several different criteria to measure the accuracy of a prediction method on proteins of known high- and low-resolution structure. False positives are estimated by applying the method to signal peptides and to proteins without trans-membrane helices.
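The false-positive estimate described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that predictions are stored as per-residue strings in which 'H' marks a predicted membrane helix and 'L' anything else, and that a protein counts as a false positive as soon as a single helix residue is predicted. Both the alphabet and this counting rule are assumptions; the criteria actually used by the server are given in the original publications [11, 12].

```python
def false_positive_rate(predictions):
    """Fraction of non-membrane proteins predicted to contain a helix.

    predictions: dict mapping a protein identifier to a per-residue string
    ('H' = predicted membrane helix, 'L' = other; hypothetical alphabet).
    All proteins in the dict are assumed to have NO membrane helix, so any
    protein with at least one 'H' is counted as a false positive.
    """
    if not predictions:
        raise ValueError("need at least one protein")
    flagged = sum(1 for states in predictions.values() if "H" in states)
    return flagged / len(predictions)


# Hypothetical predictions for four globular proteins.
globular = {
    "prot_a": "LLLLLLLL",
    "prot_b": "LLHHHHLL",  # wrongly predicted helix
    "prot_c": "LLLL",
    "prot_d": "LLLLLL",
}
rate = false_positive_rate(globular)
```

The same function could be applied to a set of signal peptides to obtain the second error rate.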
Input. The server accepts two types of input from users: (1) Simple scales reflecting the propensity of residues to form membrane helices, e.g. hydrophobicity scales. Such scales, with one value for each amino acid, are either uploaded in text format or entered into a form. (2) Results from novel prediction methods. Developers can use the benchmark server by following these steps: (i) download the data sets from our web site (about 2200 proteins, some of which contain membrane helices, some of which do not), (ii) run the method on all proteins, (iii) upload the predictions to our server in either of two commonly accepted formats. The upload is checked for possible problems, which are immediately communicated to the developer. For example, if the predictions contain three states (helix / non-helix / possible-helix) rather than two states (helix / non-helix), the user is given a choice to abort the operation, to convert all possible-helix residues to non-helix, or to convert all possible-helix residues to helix.
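The three-state to two-state conversion offered at upload can be sketched as follows. The one-letter alphabet ('H' = helix, 'L' = non-helix, '?' = possible-helix) and the function name are hypothetical; the two accepted upload formats are documented on the server's web site and are not reproduced here.

```python
def to_two_state(prediction, policy="non-helix"):
    """Collapse a three-state per-residue string into two states.

    Hypothetical alphabet: 'H' = helix, 'L' = non-helix, '?' = possible-helix.
    policy selects what every '?' residue becomes: "helix" or "non-helix".
    """
    if policy not in ("helix", "non-helix"):
        raise ValueError("policy must be 'helix' or 'non-helix'")
    unexpected = set(prediction) - set("HL?")
    if unexpected:
        # Mirrors the idea of reporting upload problems to the developer.
        raise ValueError(f"unexpected state symbols: {sorted(unexpected)}")
    replacement = "H" if policy == "helix" else "L"
    return prediction.replace("?", replacement)


three_state = "LLH??HLL"
as_helix = to_two_state(three_state, policy="helix")
as_non_helix = to_two_state(three_state, policy="non-helix")
```

The third option, aborting the upload, corresponds to not calling the function at all.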
Algorithm. When we test hydrophobicity scales, we simply apply the Wimley-White algorithm to turn such scales into predictions of membrane helices [13] . New prediction methods (or predictions) are evaluated directly from the data uploaded by the users. A detailed description of the particular scores and schemes explored to measure performance is beyond the scope of this manuscript; they are available in our original publications or on our web site [11, 12] .
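To illustrate how a per-residue scale can be turned into a helix prediction at all, the sketch below uses a generic sliding-window average. This is not the Wimley-White algorithm applied by the server [13]; it is a simpler stand-in. The scale values are those of Kyte and Doolittle; the window of 19 residues, the threshold of 1.6, and the choice to mark every residue of a qualifying window are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy values, one per amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def predict_helices(sequence, scale=KYTE_DOOLITTLE, window=19, threshold=1.6):
    """Return a per-residue string: 'H' = predicted helix, 'L' = other.

    Every residue inside a window whose mean scale value reaches the
    threshold is marked 'H'. Residues missing from the scale raise KeyError.
    """
    n = len(sequence)
    states = ["L"] * n
    if n < window:
        return "".join(states)
    values = [scale[aa] for aa in sequence]
    window_sum = sum(values[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one residue.
            window_sum += values[start + window - 1] - values[start - 1]
        if window_sum / window >= threshold:
            for i in range(start, start + window):
                states[i] = "H"
    return "".join(states)


# Toy sequence: a poly-leucine stretch flanked by aspartate.
toy = "D" * 10 + "L" * 19 + "D" * 10
toy_prediction = predict_helices(toy)
```

Because whole windows are marked, the predicted segment extends a few residues into the polar flanks; marking only the central residue of each window would be an equally simple alternative with the opposite bias.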
Output. Submissions are tracked through identifiers (IDs) that are shown on all result pages. After the request is queued, the user can either refresh the results page to check the status of the request, or follow the link that is e-mailed to the provided address when the request is completed. When the results are ready (after approximately 3-5 minutes), the user is presented with several tables showing how well the tested method/scale performs in comparison to established methods (Table 1). Results are given separately for (1) high- and (2) low-resolution membrane proteins, as well as for the discrimination against (3) globular proteins and (4) signal peptides (Fig. 1). In all four resulting tables, several columns show different measures for prediction accuracy and discrimination. Clicking on a column header re-sorts the table by that metric. Clicking on the question mark (?) in a column header gives a description of the metric. Clicking on a prediction server in the row header gives the full name of the server, the citation for the source, and a web link to the server if available. Although the primary format is the interactive web document described here, the results can also be obtained in non-interactive format. If desired, the results are e-mailed in text format along with the link to the interactive results. Additionally, a permanent, non-interactive web document can be generated on the server by clicking a link on the interactive web page. A link to that document is then provided, which the user can use to reference the server's results.
Fig. 1: Sample of server output. Example of one of the four output tables from the server, showing where the tested method falls in the ranking of existing methods. The table shown is for the accuracy of predicting helices as tested against known high-resolution structures. The method named YOU is highlighted in the output and shows the result for the tested method. Hyperlinks re-sort by the given column and also lead to descriptions of the columns (scoring metrics) and rows (prediction methods).
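As a rough illustration of what per-residue and per-segment measures look like, the sketch below computes a two-state per-residue accuracy and the fraction of observed helices that are overlapped by a predicted helix. These are simplified stand-ins, not the server's definitions: the 'H'/'L' alphabet, the function names, and the minimum overlap of three residues are assumptions, and the scores actually reported are defined in the original publications [11, 12].

```python
from itertools import groupby


def per_residue_accuracy(observed, predicted):
    """Percentage of residues whose two-state label ('H'/'L') is correct."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * matches / len(observed)


def helix_segments(states):
    """Return (start, end) index pairs, end exclusive, for each run of 'H'."""
    segments, position = [], 0
    for state, run in groupby(states):
        length = len(list(run))
        if state == "H":
            segments.append((position, position + length))
        position += length
    return segments


def fraction_helices_found(observed, predicted, min_overlap=3):
    """Fraction of observed helices overlapped by a predicted helix.

    min_overlap (assumed value) is the number of shared residues required.
    """
    obs, pred = helix_segments(observed), helix_segments(predicted)
    if not obs:
        raise ValueError("no observed helices")
    found = sum(
        1
        for o_start, o_end in obs
        if any(min(o_end, p_end) - max(o_start, p_start) >= min_overlap
               for p_start, p_end in pred)
    )
    return found / len(obs)


observed = "LLHHHHHLLLHHHHHLL"
predicted = "LLLHHHHHLLLLLLLLL"
accuracy = per_residue_accuracy(observed, predicted)
found = fraction_helices_found(observed, predicted)
```

In this toy case one of the two observed helices is missed entirely, which the per-segment measure exposes more directly than the per-residue percentage.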
>>>Table 1<<<
Standard point of reference. The primary goal of this server is to provide users, developers, and referees with a standard benchmark evaluation for helical trans-membrane prediction methods in a format that is publicly available and as convenient as possible. The tool may help all of them to avoid over-estimating performance and to spot strengths and weaknesses of particular methods. The battery of performance measures that we use encompasses almost all the scores that have been applied in the literature. With the web server, any new algorithm can be tested instantly and seamlessly.
Downside of static benchmark: over-fitting to do well on this set only. A possible problem with such an easily available and comprehensive method is that someone with enough time on their hands could write a program that searches the space of all possible hydrophobicity-like scales in order to optimise the performance on our sets. More generally, developers may over-fit their models. In fact, to some extent, this is a principal problem of any standard data set accepted in the community. However, we contend that if a scale/method really does consistently better than all other methods with respect to all scores, it may indeed capture important aspects of helical membrane proteins. Perhaps more probable is that developers accidentally over-fit to the benchmark by testing against it several times during development. Developers can at least reduce the risk of fooling themselves by first testing their final or nearly final method on their own data sets and by then investigating to what extent their results are confirmed by ours.
Ultimate solution: go dynamic. Nevertheless, there is only one way to solve the problem completely, namely to test on proteins that could not have been used to develop the methods because their experimental structures became available only after the methods were developed. This is the concept that we explore through our EVA server, which evaluates the performance of structure prediction for globular proteins [14, 15, 16] . However, EVA relies on the fact that for globular proteins tens of new structures appear in PDB every week. Although this will not become reality for membrane proteins in the foreseeable future, we are currently investigating ways of embedding some dynamic system for the evaluation of membrane predictions into EVA.
Thanks to Jinfeng Liu and Megan Restuccia (Columbia) for computer assistance; to Chien Peter Chen (Columbia) for his in-depth analysis of membrane helix predictions. Particular thanks to Volker Eyrich for his crucial help with setting up the META-PP and EVA servers, without which most of the results presented here would not exist. Thanks to the anonymous reviewer for very detailed, constructive comments. This work was supported by the grants RO1-GM63029-01 from the National Institutes of Health (NIH) and 1-R01-LM07329-01 from the National Library of Medicine (NLM). Last but not least, thanks to Amos Bairoch (SIB, Geneva), Rolf Apweiler (EBI, Hinxton), Phil Bourne (San Diego Univ.), and their crews for maintaining excellent databases, and to all experimentalists who enabled this tool by making their data publicly available.
| 1. | Truscott, K. N. & Pfanner, N. (1999). Import of carrier proteins into mitochondria. Biol Chem, 380, 1151-6. |
| 2. | Thanassi, D. G. & Hultgren, S. J. (2000). Multiple pathways allow protein secretion across the bacterial outer membrane. Curr Opin Cell Biol, 12, 420-30. |
| 3. | Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N. et al. (2000). The Protein Data Bank. Nucleic Acids Res, 28, 235-42. |
| 4. | Jayasinghe, S., Hristova, K. & White, S. H. (2001). MPtopo: A database of membrane protein topology. Protein Science, 10, 455-458. |
| 5. | McGovern, K., Ehrmann, M. & Beckwith, J. (1991). Decoding signals for membrane proteins using alkaline phosphatase fusions. EMBO Journal, 10, 2773-2782. |
| 6. | van Geest, M. & Lolkema, J. S. (2000). Membrane topology and insertion of membrane proteins: search for topogenic signals. Microbiol. Mol. Biol. Rev., 64, 13-33. |
| 7. | Traxler, B., Boyd, D. & Beckwith, J. (1993). The topological analysis of integral membrane proteins. Journal of Membrane Biology, 132, 1-11. |
| 8. | Amstutz, P., Forrer, P., Zahnd, C. & Pluckthun, A. (2001). In vitro display technologies: novel developments and applications. Curr Opin Biotechnol, 12, 400-405. |
| 9. | Moller, S., Kriventseva, E. V. & Apweiler, R. (2000). A collection of well characterised integral membrane proteins. Bioinformatics, 16, 1159-1160. |
| 10. | Rost, B. (1999). Twilight zone of protein sequence alignments. Protein Engineering, 12, 85-94. |
| 11. | Chen, C. P., Kernytsky, A. & Rost, B. (2002). Transmembrane helix predictions revisited. Protein Science, 11, 2774-2791. |
| 12. | Chen, C. P. & Rost, B. (2002). Long membrane helices and short loops predicted less accurately. Protein Science, 2766-2773. |
| 13. | Jayasinghe, S., Hristova, K. & White, S. H. (2001). Energetics, stability, and prediction of transmembrane helices. J Mol Biol, 312, 927-34. |
| 14. | Eyrich, V. A., Marti-Renom, M. A., Przybylski, D., Madhusudhan, M. S., Fiser, A. et al. (2001). EVA: continuous automatic evaluation of protein structure prediction servers. Bioinformatics, 17, 1242-3. |
| 15. | Marti-Renom, M. A., Madhusudhan, M. S., Fiser, A., Rost, B. & Sali, A. (2002). Reliability of assessment of protein structure prediction methods. Structure, 10, 435-440. |
| 16. | Koh, I., Eyrich, V. A., Marti-Renom, M. A., Przybylski, D., Madhusudhan, M. S. et al. (2003). EVA: evaluation of protein structure prediction servers. Nucleic Acids Research, . |
| 17. | Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637. |
| 18. | Tusnady, G. E. & Simon, I. (1998). Principles governing amino acid composition of integral membrane proteins: application to topology prediction. J. Mol. Biol., 283, 489-506. |
| 19. | Cserzö, M., Wallin, E., Simon, I., von Heijne, G. & Elofsson, A. (1997). Prediction of transmembrane α-helices in prokaryotic membrane proteins: the dense alignment surface method. Prot. Engin., 10, 673-676. |
| 20. | Hirokawa, T., Boon-Chieng, S. & Mitaku, S. (1998). SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics, 14, 378-379. |
| 21. | Sonnhammer, E. L. L., von Heijne, G. & Krogh, A. (1998). A hidden Markov model for predicting transmembrane helices in protein sequences. In Sixth International Conference on Intelligent Systems for Molecular Biology (ISMB98) (Glasgow, J., Littlejohn, T., Major, F., Lathrop, R., Sankoff, D. et al., eds.), pp. 175-182, AAAI Press, Montreal, Canada. |
| 22. | Pasquier, C., Promponas, V. J., Palaios, G. A., Hamodrakas, J. S. & Hamodrakas, S. J. (1999). A novel method for predicting transmembrane segments in proteins based on a statistical analysis of the SwissProt database: the PRED-TMR algorithm. Prot. Engin., 12, 381-385. |
| 23. | Rost, B., Casadio, R., Fariselli, P. & Sander, C. (1995). Prediction of helical transmembrane segments at 95% accuracy. Prot. Sci., 4, 521-533. |
| 24. | Rost, B., Casadio, R. & Fariselli, P. (1996). Topology prediction for helical transmembrane proteins at 86% accuracy. Prot. Sci., 5, 1704-1718. |
| 25. | Engelman, D. M., Steitz, T. A. & Goldman, A. (1986). Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Annu. Rev. Biophys. Biophys. Chem., 15, 321-353. |
| 26. | Kessel, A. & Ben-Tal, N. (2002). Free energy determinants of peptide association with lipid bilayers. In Peptide-lipid interactions (Simon, S. & McIntosh, T., eds.), pp. in press, Academic Press, San Diego. |
| 27. | Kyte, J. & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol., 157, 105-132. |
| 28. | Wolfenden, R., Andersson, L., Cullis, P. M. & Southgate, C. C. B. (1981). Affinities of amino acid side chains for solvent water. Biochemistry, 20, 849-855. |
| Contact: rost@columbia.edu | Version: Mar 20, 2003 |