Journal articles

Percolation of annotation errors through hierarchically structured protein sequence databases.

Abstract: Databases of protein sequences have grown rapidly in recent years as a result of genome sequencing projects. Annotating protein sequences with descriptions of their biological function ideally requires careful experimentation, but this work lags far behind. Instead, biological function is often imputed by copying annotations from similar protein sequences. This gives rise to annotation errors and, more seriously, to chains of misannotation. An earlier paper, "Percolation of annotation errors in a database of protein sequences" (2002), developed a probabilistic framework for exploring the consequences of this percolation of errors through protein databases and applied the theory to a simple database model. Here we apply the theory to hierarchically structured protein sequence databases, and draw conclusions about database quality at different levels of the hierarchy.

Contributor: Benjamin Audit
Submitted on: Saturday, September 29, 2007 - 12:54:42 PM
Last modification on: Monday, November 29, 2021 - 1:10:04 PM

Walter R Gilks, Benjamin Audit, Daniela de Angelis, Sophia Tsoka, Christos A Ouzounis. Percolation of annotation errors through hierarchically structured protein sequence databases. Mathematical Biosciences, Elsevier, 2005, 193 (2), pp.223-234. ⟨10.1016/j.mbs.2004.08.001⟩. ⟨ensl-00175660⟩


