Regular expression pattern matching for XML

Abstract

We propose regular expression pattern matching as a core feature of programming languages for manipulating XML. We extend conventional pattern-matching facilities (as in ML) with regular expression operators such as repetition (*), alternation (|), etc., that can match arbitrarily long sequences of subtrees, allowing a compact pattern to extract data from the middle of a complex sequence. We then show how to check standard notions of exhaustiveness and redundancy for these patterns. Regular expression patterns are intended to be used in languages with type systems based on regular expression types. To avoid excessive type annotations, we develop a type inference scheme that propagates type constraints to pattern variables from the type of input values. The type inference algorithm translates types and patterns into regular tree automata, and then works in terms of standard closure operations (union, intersection, and difference) on tree automata. The main technical challenge is dealing with the interaction of repetition and alternation patterns with the first-match policy, which gives rise to subtleties concerning both the termination and precision of the analysis. We address these issues by introducing a data structure representing these closure operations lazily.
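
To make the matching behaviour described in the abstract more concrete, the following is a minimal sketch of a backtracking matcher with a first-match policy, written in Python. It is not the authors' algorithm: it works on flat sequences of element labels rather than sequences of trees, it does no type inference, and it assumes that repetition is greedy (the abstract states only the first-match policy). All names (`Lbl`, `Seq`, `Alt`, `Star`, `Var`, `matches`, `first_match`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


# Pattern constructors over a flat sequence of element labels.
# Simplification: real XML values are sequences of trees.
@dataclass(frozen=True)
class Lbl:
    name: str          # matches exactly one element with this label


@dataclass(frozen=True)
class Seq:
    first: object      # match `first`, then `second`
    second: object


@dataclass(frozen=True)
class Alt:
    left: object       # alternation: `left` has priority over `right`
    right: object


@dataclass(frozen=True)
class Star:
    body: object       # repetition, assumed greedy in this sketch


@dataclass(frozen=True)
class Var:
    name: str          # binds the subsequence matched by `body`
    body: object


def matches(pat, xs, i, env):
    """Yield (end, env) for each way `pat` matches xs[i:end], best first."""
    if isinstance(pat, Lbl):
        if i < len(xs) and xs[i] == pat.name:
            yield i + 1, env
    elif isinstance(pat, Seq):
        for j, env1 in matches(pat.first, xs, i, env):
            yield from matches(pat.second, xs, j, env1)
    elif isinstance(pat, Alt):
        # First-match policy: every match of the left branch is tried
        # before any match of the right branch.
        yield from matches(pat.left, xs, i, env)
        yield from matches(pat.right, xs, i, env)
    elif isinstance(pat, Star):
        # Prefer one more iteration; fall back to stopping here.
        for j, env1 in matches(pat.body, xs, i, env):
            if j > i:  # skip empty iterations so the search terminates
                yield from matches(pat, xs, j, env1)
        yield i, env
    elif isinstance(pat, Var):
        for j, env1 in matches(pat.body, xs, i, env):
            yield j, {**env1, pat.name: xs[i:j]}
    else:
        raise TypeError(f"unknown pattern: {pat!r}")


def first_match(pat, xs):
    """Return the bindings of the highest-priority match of all of xs, or None."""
    for end, env in matches(pat, xs, 0, {}):
        if end == len(xs):
            return env
    return None


# name*, (tel as t), (email | tel)*  -- pulls one element out of the middle.
middle = Seq(Star(Lbl("name")),
             Seq(Var("t", Lbl("tel")),
                 Star(Alt(Lbl("email"), Lbl("tel")))))

# Both branches match the same input; the left one wins under first-match.
ambiguous = Alt(Var("a", Star(Lbl("x"))), Var("b", Star(Lbl("x"))))
```

With these definitions, `first_match(middle, ["name", "tel", "tel", "email"])` binds `t` to the first `tel` element, and `first_match(ambiguous, ["x", "x"])` binds only `a`, because the left branch of the alternation takes priority. The sketch enumerates matches at run time; the static checks the abstract describes (exhaustiveness, redundancy, type inference via tree automata) are outside its scope.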

Citation (APA)

Hosoya, H., & Pierce, B. C. (2003). Regular expression pattern matching for XML. Journal of Functional Programming, 13(6), 961–1004. https://doi.org/10.1017/S0956796802004410