Analysing structured scholarly data embedded in web pages

Pracheta Sahoo; Ujwal Gadiraju; Ran Yu; Sriparna Saha; Stefan Dietze

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 9792 LNCS 90-100

DOI: 10.1007/978-3-319-53637-8_10

2 Citations

12 Readers

Abstract

Web pages increasingly embed structured data in the form of microdata, microformats and RDFa. Through efforts such as schema.org, such embedded markup has become prevalent, with current studies estimating adoption by about 26% of all web pages. Similar to the early adoption of Linked Data principles by publishers, libraries and other providers of bibliographic data, such organisations have also been among the early adopters of embedded markup, providing an unprecedented source of structured data about scholarly works. Such data, however, is fundamentally different from traditional Linked Data, being very sparsely linked and containing a large number of coreferences and redundant statements. So far, the scale and nature of embedded scholarly data on the Web have not been investigated. In this work, we provide a study of embedded scholarly data to answer research questions about the depth, syntactic and semantic characteristics and distribution of extracted data, thereby investigating challenges and opportunities for using embedded data as a structured knowledge graph of scholarly information.

Cite

Sahoo, P., Gadiraju, U., Yu, R., Saha, S., & Dietze, S. (2016). Analysing structured scholarly data embedded in web pages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9792 LNCS, pp. 90–100). Springer Verlag. https://doi.org/10.1007/978-3-319-53637-8_10
