Similarity between semantic description sets: Addressing needs beyond data integration

ISSN: 16130073

Abstract

Descriptive information is easy to understand and communicate in natural language. Examples in the biological realm include the cellular functions of proteins and the phenotypes exhibited by organisms. Large latent stores of such descriptive data are held in databases that can be mined, but even more still reside only in the scientific literature. Although such information has traditionally been opaque to computers, in recent years significant efforts have gone into exposing descriptive information to computation through the development of ontologies and associated tools. A host of software applications now employ simple reasoning over Gene Ontology annotated data to help interpret experimental findings in genomics in terms of protein function. In the domain of biological phenotypes, the combination of entity terms from taxon-specific anatomy ontologies with quality terms from generic ontologies such as PATO has been used to construct semantically precise and contextualized descriptions. It is natural for multiple semantic descriptions to pertain to single instances in the real world, as in the case of both protein functions and organismal phenotypes. However, applications for ontology-based annotations that go beyond simple knowledge organization, and that exploit sets of semantic descriptions, are puzzlingly rare. In particular, we argue that there is wide applicability, and a sore need, for tools that can satisfy the simple, common use case of identifying statistically improbable similarity between sets of semantic descriptions. Several metrics have been proposed for this task in the literature, but they have not yet been fully evaluated, explored, or adopted. The requirements for semantic similarity tools tailored to sets of semantic descriptions would include speed, scalability to large numbers of sets, demonstrated statistical and biological validity, and ease of use.
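To make the use case in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a hypothetical toy ontology (`PARENTS`), scores two description sets by the Jaccard index of their ancestor-expanded term sets, and estimates how improbable that score is with a simple permutation test. The term names, the choice of Jaccard, and the null model (random term sets of equal size drawn uniformly from the ontology) are all illustrative assumptions; the abstract does not name a specific metric.

```python
import random

# Hypothetical toy ontology: term -> set of direct parent terms (assumption).
PARENTS = {
    "fin": {"appendage"},
    "limb": {"appendage"},
    "appendage": {"anatomical_structure"},
    "eye": {"sense_organ"},
    "sense_organ": {"anatomical_structure"},
    "anatomical_structure": set(),
}


def closure(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def expand(terms):
    """Union of the ancestor closures of every term in the set."""
    expanded = set()
    for term in terms:
        expanded |= closure(term)
    return expanded


def set_similarity(a, b):
    """Jaccard index of the two ancestor-expanded description sets."""
    ea, eb = expand(a), expand(b)
    union = ea | eb
    return len(ea & eb) / len(union) if union else 0.0


def permutation_p_value(a, b, n_iter=1000, seed=0):
    """Estimate how often a random set the size of b is at least as similar to a."""
    rng = random.Random(seed)
    observed = set_similarity(a, b)
    terms = sorted(PARENTS)
    hits = 0
    for _ in range(n_iter):
        random_b = rng.sample(terms, len(b))
        if set_similarity(a, random_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)
```

Under these assumptions, `set_similarity({"fin"}, {"limb"})` is higher than `set_similarity({"fin"}, {"eye"})` because the first pair shares the ancestor `appendage`. A realistic null model would more plausibly sample from the observed annotation corpus than uniformly from the ontology.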


Vision, T., Blake, J., Lapp, H., Mabee, P., & Westerfield, M. (2011). Similarity between semantic description sets: Addressing needs beyond data integration. In CEUR Workshop Proceedings (Vol. 783). CEUR-WS.
