Non-erasing Chomsky-Schützenberger theorem with grammar-independent alphabet

Bibliographic details
Published in: Information and Computation, 2019-12, Vol. 269, p. 104442, Article 104442
Main authors: Crespi Reghizzi, Stefano; San Pietro, Pierluigi
Format: Article
Language: English
Description
Abstract: The famous theorem by Chomsky and Schützenberger (CST) says that every context-free language L over an alphabet Σ is representable as h(D∩R), where D is a Dyck language over a set Ω of brackets, R is a local language and h is an alphabetic homomorphism that erases unboundedly many symbols. Berstel found that the number of erasures can be linearly limited if the grammar is in Greibach normal form; Berstel and Boasson (and later, independently, Okhotin) proved a non-erasing variant of CST for grammars in Double Greibach Normal Form. In all these CST statements, however, the size of the Dyck alphabet Ω depends on the grammar size for L. In the Stanley variant of the CST, |Ω| only depends on |Σ| and not on the grammar, but the homomorphism erases many more symbols than in the other versions of CST; also, the regular language R is strictly locally testable but not local. We prove a new version of CST which combines both features of being non-erasing and of using a grammar-independent alphabet. In our construction, |Ω| is polynomial in |Σ|, namely O(|Σ|^46), and the regular language R is strictly locally testable. Using a recent generalization of Medvedev's homomorphic characterization of regular languages, we prove that the degree in the polynomial dependence of |Ω| on |Σ| may be reduced to just 2 in the case of linear grammars in Double Greibach Normal Form.
ISSN: 0890-5401, 1090-2651
DOI: 10.1016/j.ic.2019.104442
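
As a concrete reading of the representation h(D∩R) described in the abstract, the following minimal sketch works through a toy instance for the language { a^n b^n : n ≥ 1 }. It is an illustration only and is not the construction from the paper: the single bracket pair, the local language R (first symbol "(", last symbol ")", no adjacent pair ")("), and the function names `in_dyck`, `in_local`, `h`, and `image_up_to` are all assumptions chosen for this example.

```python
from itertools import product

OPEN, CLOSE = "(", ")"  # assumed one-pair Dyck alphabet for this toy example


def in_dyck(w: str) -> bool:
    """Membership in the Dyck language D over one bracket pair."""
    depth = 0
    for c in w:
        depth += 1 if c == OPEN else -1
        if depth < 0:  # a closing bracket with no matching opener
            return False
    return depth == 0


def in_local(w: str) -> bool:
    """Membership in a local language R: constraints only on the first
    symbol, the last symbol, and the allowed adjacent pairs."""
    if not w or w[0] != OPEN or w[-1] != CLOSE:
        return False
    # The only forbidden adjacent pair is ")(".
    return all(w[i:i + 2] != CLOSE + OPEN for i in range(len(w) - 1))


def h(w: str) -> str:
    """Alphabetic homomorphism; here it is non-erasing (each bracket maps to one letter)."""
    return "".join({OPEN: "a", CLOSE: "b"}[c] for c in w)


def image_up_to(max_len: int) -> set:
    """Enumerate h(D ∩ R) restricted to bracket words of length <= max_len."""
    words = set()
    for n in range(1, max_len + 1):
        for letters in product(OPEN + CLOSE, repeat=n):
            w = "".join(letters)
            if in_dyck(w) and in_local(w):
                words.add(h(w))
    return words


if __name__ == "__main__":
    print(sorted(image_up_to(6), key=len))
```

In this toy case the word "()()" is well nested but is rejected by R because it contains the pair ")(", so only fully nested words survive the intersection and their images have the form a^n b^n.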