Abstract
We present a new sound and complete axiomatization of regular
expression containment. It consists of the conventional axiomatization
of concatenation, alternation, empty set and (the singleton set
containing) the empty string as an idempotent semiring, the fixedpoint
rule E = 1 + E E for Kleene-star, and a general coinduction
rule as the only additional rule.
Our axiomatization gives rise to a natural computational interpretation
of regular expressions as simple types that represent parse
trees, and of containment proofs as coercions. This gives the axiomatization
a Curry-Howard-style constructive interpretation: Containment
proofs do not only certify a language-theoretic containment,
but, under our computational interpretation, constructively
transform a membership proof of a string in one regular expression
into a membership proof of the same string in another regular
expression.
We show how to encode regular expression equivalence proofs
in Salomaa’s, Kozen’s and Grabmayer’s axiomatizations into our
containment system, which equips their axiomatizations with a
computational interpretation and implies completeness of our axiomatization.
To ensure its soundness, we require that the computational
interpretation of the coinduction rule be a hereditarily total
function. Hereditary totality can be considered the mother of syntactic
side conditions: it “explains” their soundness, yet cannot be
used as a conventional side condition in its own right since it turns
out to be undecidable.
We discuss application of regular expressions as types to bit
coding of strings and hint at other applications to the wide-spread
use of regular expressions for substring matching, where classical
automata-theoretic techniques are a priori inapplicable.
Neither regular expressions as types nor subtyping interpreted
coercively are novel per se. Somewhat surprisingly, this seems to
be the first investigation of a general proof-theoretic framework for
the latter in the context of the former, however.
expression containment. It consists of the conventional axiomatization
of concatenation, alternation, empty set and (the singleton set
containing) the empty string as an idempotent semiring, the fixedpoint
rule E = 1 + E E for Kleene-star, and a general coinduction
rule as the only additional rule.
Our axiomatization gives rise to a natural computational interpretation
of regular expressions as simple types that represent parse
trees, and of containment proofs as coercions. This gives the axiomatization
a Curry-Howard-style constructive interpretation: Containment
proofs do not only certify a language-theoretic containment,
but, under our computational interpretation, constructively
transform a membership proof of a string in one regular expression
into a membership proof of the same string in another regular
expression.
We show how to encode regular expression equivalence proofs
in Salomaa’s, Kozen’s and Grabmayer’s axiomatizations into our
containment system, which equips their axiomatizations with a
computational interpretation and implies completeness of our axiomatization.
To ensure its soundness, we require that the computational
interpretation of the coinduction rule be a hereditarily total
function. Hereditary totality can be considered the mother of syntactic
side conditions: it “explains” their soundness, yet cannot be
used as a conventional side condition in its own right since it turns
out to be undecidable.
We discuss application of regular expressions as types to bit
coding of strings and hint at other applications to the wide-spread
use of regular expressions for substring matching, where classical
automata-theoretic techniques are a priori inapplicable.
Neither regular expressions as types nor subtyping interpreted
coercively are novel per se. Somewhat surprisingly, this seems to
be the first investigation of a general proof-theoretic framework for
the latter in the context of the former, however.
Originalsprog | Engelsk |
---|---|
Tidsskrift | A C M / S I G P L A N Notices |
Vol/bind | 46 |
Udgave nummer | 1 |
Sider (fra-til) | 385-398 |
Antal sider | 14 |
ISSN | 1523-2867 |
DOI | |
Status | Udgivet - jan. 2011 |
Begivenhed | 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages - Austin, USA Varighed: 26 jan. 2011 → 28 jan. 2011 Konferencens nummer: 38 |
Konference
Konference | 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages |
---|---|
Nummer | 38 |
Land/Område | USA |
By | Austin |
Periode | 26/01/2011 → 28/01/2011 |