CATH - A hierarchic classification of protein domain structures


This article is free to access.

Abstract

Background: Protein evolution gives rise to families of structurally related proteins, within which sequence identities can be extremely low. As a result, structure-based classifications can be effective at identifying unanticipated relationships in known structures, and in optimal cases function can also be assigned. The ever-increasing number of known protein structures is too large to classify all proteins manually; therefore, automatic methods are needed for fast evaluation of protein structures.

Results: We present a semi-automatic procedure for deriving a novel hierarchical classification of protein domain structures (CATH). The four main levels of our classification are protein class (C), architecture (A), topology (T) and homologous superfamily (H). Class is the simplest level, and it essentially describes the secondary structure composition of each domain. In contrast, architecture summarises the shape revealed by the orientations of the secondary structure units, such as barrels and sandwiches. At the topology level, sequential connectivity is considered, such that members of the same architecture might have quite different topologies. When structures belonging to the same T-level have suitably high similarities combined with similar functions, the proteins are assumed to be evolutionarily related and put into the same homologous superfamily.

Conclusions: Analysis of the structural families generated by CATH reveals the prominent features of protein structure space. We find that nearly a third of the homologous superfamilies (H-levels) belong to ten major T-levels, which we call superfolds, and furthermore that nearly two-thirds of these H-levels cluster into nine simple architectures. A database of well-characterised protein structure families, such as CATH, will facilitate the assignment of structure-function/evolution relationships to both known and newly determined protein structures.
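The four-level hierarchy described in the abstract can be pictured as a path of labels, one per level, attached to each domain. The following is a minimal sketch of that idea only, not the authors' procedure: the integer labels, the domain identifiers (`domA` to `domD`) and the helper names are all invented for illustration, and no structure comparison is performed.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four levels named in the abstract, from broadest to most specific.
LEVELS = ("class", "architecture", "topology", "homologous_superfamily")


@dataclass(frozen=True)
class DomainAssignment:
    """A domain with one label per level. All values used here are hypothetical."""
    domain_id: str
    path: tuple  # (C, A, T, H)


def deepest_shared_level(a, b):
    """Return the most specific level at which two domains share a label, or None."""
    depth = 0
    for label_a, label_b in zip(a.path, b.path):
        if label_a != label_b:
            break
        depth += 1
    return LEVELS[depth - 1] if depth else None


def group_by_level(domains, level):
    """Group domain ids by their path truncated at the given level."""
    depth = LEVELS.index(level) + 1
    groups = defaultdict(list)
    for domain in domains:
        groups[domain.path[:depth]].append(domain.domain_id)
    return dict(groups)


# Made-up assignments, chosen only to exercise each level of the hierarchy.
domains = [
    DomainAssignment("domA", (1, 10, 20, 1)),
    DomainAssignment("domB", (1, 10, 20, 2)),
    DomainAssignment("domC", (1, 10, 30, 1)),
    DomainAssignment("domD", (2, 40, 50, 1)),
]

if __name__ == "__main__":
    print(deepest_shared_level(domains[0], domains[1]))
    print(group_by_level(domains, "architecture"))
```

In this toy data, `domA` and `domB` share a topology but sit in different homologous superfamilies, while `domA` and `domC` share only an architecture, mirroring the statement that members of the same architecture can have quite different topologies.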


Citation

Orengo, C. A., Michie, A. D., Jones, S., Jones, D. T., Swindells, M. B., & Thornton, J. M. (1997). CATH - A hierarchic classification of protein domain structures. Structure, 5(8), 1093–1109. https://doi.org/10.1016/s0969-2126(97)00260-8
