Non-regular Maximal Prefix-Free Subsets of Regular Languages

Jirásek, Jozef

doi:10.1007/978-3-662-53132-7_19

Jozef Jirásek Jr.

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9840)

Included in the following conference series:

International Conference on Developments in Language Theory

Abstract

We investigate non-regular maximal prefix-free subsets (MPFS) of regular languages. We give a method to decide whether or not a regular language has any non-regular MPFS.

Next, we prove that if a regular language has any non-regular MPFS, then it also has a MPFS which is context-sensitive but not context-free, it has a MPFS which is recursive but not context-sensitive, and it has a MPFS which is not recursively enumerable.

We show that no regular language has a MPFS which is recursively enumerable but not recursive. Finally, for any regular language we can decide whether or not it has a context-free non-regular MPFS.

Notes

1. In the special case of $L = \Sigma ^*$, this corresponds to Lemma 10 of [1].
2. $\mathrm {AC}$ stands for acyclic: the computation does not contain a cycle.

References

1. Calude, C.S., Staiger, L.: On universal computably enumerable prefix codes. Math. Struct. Comput. Sci. 19(1), 45–57 (2009)
2. Han, Y.-S., Salomaa, K., Wood, D.: Nondeterministic state complexity of basic operations for prefix-free regular languages. Fundam. Inform. 90(1–2), 93–106 (2009)
3. Han, Y.-S., Salomaa, K., Wood, D.: Operational state complexity of prefix-free regular languages. In: Automata, Formal Languages, and Related Topics - Dedicated to Ferenc Gécseg on the Occasion of his 70th Birthday, pp. 99–115 (2009)
4. Han, Y.-S., Salomaa, K., Yu, S.: State complexity of combined operations for prefix-free regular languages. In: Dediu, A.H., Ionescu, A.M., Martín-Vide, C. (eds.) LATA 2009. LNCS, vol. 5457, pp. 398–409. Springer, Heidelberg (2009)
5. Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading (1979)
6. Jirásek, J., Jirásková, G.: Cyclic shift on prefix-free languages. In: Bulatov, A.A., Shur, A.M. (eds.) CSR 2013. LNCS, vol. 7913, pp. 246–257. Springer, Heidelberg (2013)
7. Jirásek, J.Š., Šebej, J.: Prefix-free subsets of regular languages and descriptional complexity. In: Shallit, J., Okhotin, A. (eds.) DCFS 2015. LNCS, vol. 9118, pp. 129–140. Springer, Heidelberg (2015)
8. Jirásková, G., Krausová, M.: Complexity in prefix-free regular languages. In: Proceedings Twelfth Annual Workshop on Descriptional Complexity of Formal Systems, DCFS 2010, Saskatoon, Canada, 8–10 August 2010, pp. 197–204
9. Krausová, M.: Prefix-free regular languages: closure properties, difference, and left quotient. In: Kotásek, Z., Bouda, J., Černá, I., Sekanina, L., Vojnar, T., Antoš, D. (eds.) MEMICS 2011. LNCS, vol. 7119, pp. 114–122. Springer, Heidelberg (2012)
10. Palmovský, M., Šebej, J.: Star-complement-star on prefix-free languages. In: Shallit, J., Okhotin, A. (eds.) DCFS 2015. LNCS, vol. 9118, pp. 231–242. Springer, Heidelberg (2015)
11. Sipser, M.: Introduction to the Theory of Computation. PWS Publishing Company, Boston (1997)

Author information

Authors and Affiliations

Kuzmányho 27, 04001, Košice, Slovakia
Jozef Jirásek Jr.

Authors

Jozef Jirásek Jr.

Corresponding author

Correspondence to Jozef Jirásek Jr.

Editor information

Editors and Affiliations

Université du Québec à Montréal, Montreal, Québec, Canada
Srečko Brlek
Dept Mathematiques, Univ du Quebec Montreal, Montreal, Québec, Canada
Christophe Reutenauer

Appendix

Lemma 14. Let L be a regular language with the property $P_2$. Then L has a maximal prefix-free subset which is context-free, but not regular.

Proof

Here we show that the set C obtained as described in Lemma 14 is a MPFS of L. The proof of the lemma in the article shows that $C \in \mathrm {CF}\setminus \mathrm {Reg}$.

Let $D_1 = \{ xu^i \mid i \ge 0 \} \cdot C_1$. In Case 1, let $D_2 = \{ xu^iyv^i \mid i \ge 0 \}$, $D_3 = \{ xu^iyv^j \mid i \ge 1, 0 \le j < i \} \cdot C_2$. In Case 2, let $D_2 = \{ xu^iyv^i \mid i \ge 0 \} \cdot C_2$, $D_3 = \{ xu^iyv^j \mid i, j \ge 0; i \ne j \} \cdot C_3$.

(1) $C \subseteq L$: $C_0 \subseteq L$. We have $s \cdot xu^i = p$, and $C_1 \subseteq L_p$. Next, $s \cdot xu^iyv^j = q$ for $i, j \ge 0$. In Case 1, $q \in F$ and $C_2 \subseteq L_q$. In Case 2, $C_2, C_3 \subseteq L_q$. Therefore $C \subseteq L$.

(2) C is prefix-free: $C_0$ is prefix-free. $C_0$ does not contain any string in [x], therefore strings in $C_0$ are incomparable with any string in $D_1 \cup D_2 \cup D_3$.

Since $C_1$ is prefix-free and does not contain any strings in [u], $D_1$ is prefix-free as well. $C_1$ does not contain any string in [u] or [y], so strings in $D_1$ are incomparable with any string in $D_2 \cup D_3$.

Since $C_2$ and $C_3$ are prefix-free and do not contain any string in [v], for a given $i, j \ge 0$ the languages $xu^iyv^j \cdot C_2$ and $xu^iyv^j \cdot C_3$ are also prefix-free. Let $i_1 < i_2$. Then $xu^{i_1}yv$ and $xu^{i_2}yv$ are incomparable, since u and y differ in the first symbol. Let $j_1 < j_2$. Then any string in $xu^iyv^{j_1} \cdot C_2$, resp. $C_3$ is incomparable with any string in $xu^iyv^{j_2} \cdot C_2$, resp. $C_3$, since no string in $C_2$, resp. $C_3$ is in [v].

(3) C is maximal. Let $w \in L$. Consider the following cases:

$w \notin [x]$. Then w is comparable to a string in $C_0$.
$w \le _p x$. Then in Case 1 $w \le _p xy \in D_2$, in Case 2 $w \le _p xyz \in D_2$.
$w = xu^iw_1, i \ge 0, w_1 \notin [u] \cup [y]$. Then $w_1$ is comparable to a string $w_1'$ in $C_1$ and w is comparable to $xu^iw_1' \in D_1$.
$w = xu^iw_1, i \ge 0, w_1 \le _p u$. Then in Case 1 $w \le _p xu^{i+1}yv^{i+1} \in D_2$, in Case 2 $w \le _p xu^{i+1}yv^{i+1}z \in D_2$.
$w = xu^iw_1, i \ge 0, w_1 \le _p y$. Then in Case 1 $w \le _p xu^iyv^i \in D_2$, in Case 2 $w \le _p xu^iyv^iz \in D_2$.
$w = xu^iyv^jw_2, i, j \ge 0, w_2 \notin [v]$.
- Case 1: If $j \ge i$, then $w \ge _p xu^iyv^i \in D_2$. Otherwise $w_2$ is comparable to a $w_2' \in C_2$ and w is comparable to $xu^iyv^jw_2' \in D_3$.
- Case 2: If $j = i$, then $w_2$ is comparable to a $w_2' \in C_2$ and w is comparable to $xu^iyv^iw_2' \in D_2$. Otherwise $w_2$ is comparable to a $w_2'' \in C_3$ and w is comparable to $xu^iyv^jw_2'' \in D_3$.
$w = xu^iyv^jw_2, i, j \ge 0, w_2 \le _p v$.
- Case 1: If $j \ge i$, then $w \ge _p xu^iyv^i \in D_2$. Otherwise $w \le _p xu^iyv^i \in D_2$.
- Case 2: $w \le _p xu^iyv^{i+j+1}z' \in D_3$.

Therefore for any $w \in L$ there is a $w' \in C$ such that w is comparable to $w'$. Then C is a MPFS. $\square $
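The three properties verified above (subset, prefix-freeness, maximality) can be tested mechanically on a bounded-length fragment of a language. The following Python sketch does this for an assumed toy instance, which is not the set C constructed in Lemma 14: the regular language $L = a^*b^*$ and the context-free, non-regular set $\{ a^ib^{i+1} \mid i \ge 0 \}$. The helper names `comparable`, `is_prefix_free` and `covers` are hypothetical, and a check up to a length bound is only an illustration, not a proof.

```python
def comparable(u, v):
    """Two strings are comparable if one is a prefix of the other."""
    return u.startswith(v) or v.startswith(u)

def is_prefix_free(C):
    """No two distinct strings of C are comparable."""
    return all(not comparable(u, v) for u in C for v in C if u != v)

def covers(C, words):
    """Every word is comparable to some string of C (maximality test)."""
    return all(any(comparable(w, c) for c in C) for w in words)

N = 6  # length bound for the fragment of L that is checked

# All strings a^i b^j of L = a*b* with i + j <= N.
L_words = ["a" * i + "b" * j for i in range(N + 1) for j in range(N + 1 - i)]

# Candidate subset: a^i b^(i+1) for i <= N, enough to cover every word above.
C = {"a" * i + "b" * (i + 1) for i in range(N + 1)}

print(is_prefix_free(C), covers(C, L_words))
```

Removing any single string from the candidate set, for example "b", leaves the word "b" of L uncovered, which is what maximality requires.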

In the following, we use the notation introduced in Lemma 15.

Lemma 16. For every state $q_i, 0 \le i \le k$, there is at most one non-empty string $s_{q_i}$ such that $q_i \cdot s_{q_i} = q_i$, and the computation $q_i \xrightarrow []{s_{q_i}} q_i$ does not pass through $q_i$ except for the first and last state.

Proof

For a contradiction, let $q_i \cdot s_1 = q_i$ and $q_i \cdot s_2 = q_i$ for non-empty $s_1 \ne s_2$, where the computations on $s_1$ and $s_2$ do not pass through $q_i$. Then $s_1$ and $s_2$ are not comparable, and we have $s_1 = s'as_1'$ and $s_2 = s'bs_2'$ for $s', s_1', s_2' \in \Sigma ^*$ and $a, b \in \Sigma , a \ne b$. Let $q_0 \cdot x' = q_i$ and $q_i \cdot z' = q_k$.

Let $p = q = q_i \cdot s'$, $x = x's'$, $u = v = as_1's'$, $y = bs_2's'$. Then L has the property $P_2$ since either $q = q_i$ is a final state if $i = k$ (Case 1), or $L_q \setminus [v]$ is not prefix-free, since $q \cdot bs_2's'z' = q_f$ and $q_f$ is a final non-$\varepsilon $ state (Case 2).

Lemma 17. Let $\mathrm {AC}(q_k) = \{ w' \in \Sigma ^* \mid s \cdot w' = q_k$, and the computation $s \xrightarrow []{w'} q_k$ does not contain any state more than once$\}$. Let $\ell $ be the first index such that $q_\ell $ occurs in the computation on w more than once, if such a state exists. Then $w = rs_{q_\ell }^i t$, where $r, t \in \Sigma ^*$, $i \ge 0$, and $rt \in \mathrm {AC}(q_k)$.

Proof

Let $q_\ell $ be the first and $q_{\ell '}$ be the last occurrence of the state $q_\ell $ in the computation on w. By Lemma 16, the only possible string that can be read between two consecutive passes through $q_\ell $ must be $s_{q_\ell }$. The computation therefore looks like this:

$$ q_0 \xrightarrow []{a_1} q_1 \xrightarrow []{a_2} \cdots \xrightarrow []{a_\ell } q_\ell \xrightarrow []{s_{q_\ell }} q_\ell \xrightarrow []{s_{q_\ell }} \cdots \xrightarrow []{s_{q_\ell }} q_{\ell '} \xrightarrow []{a_{\ell '+1}} q_{\ell '+1} \xrightarrow []{a_{\ell '+2}} \cdots \xrightarrow []{a_k} q_k $$

Let $r = a_1a_2 \cdots a_\ell $ and $t = a_{\ell '+1}a_{\ell '+2} \cdots a_k$. Let us show that $rt \in \mathrm {AC}(q_k)$; that is, the states $q_0, q_1, \dots , q_{\ell -1}, q_{\ell '+1}, \dots , q_k$ are all distinct.

We know that $q_\ell $ is the first state which occurs in the computation more than once. Therefore, if two states $q_i$ and $q_j$ with $i < j$ among the above are equal, it must be that $\ell '< i < j \le k$.

Thus the computation goes through a cycle $q_\ell \xrightarrow []{s_{q_\ell }} q_\ell $, potentially several times. After that, the computation goes through the cycle $q_i \xrightarrow []{s_{q_i}} q_i$. This cycle does not contain $q_\ell $, since $q_{\ell '}$ is the last occurrence of this state and $\ell ' < i$. The computation must therefore “leave” the $q_\ell $ cycle at some point. Let $q_p$ be the last state in the computation that follows this cycle. Without further technical details, let us observe that we have $p = q_p$, $q = q_i$, $x = a_1a_2 \cdots a_p$, $u = s_{q_p}$, $y = a_{p+1}a_{p+2} \cdots a_i$, $v = s_{q_i}$ and L has the property P since either $q = q_i = q_k$ is final, or the final non-$\varepsilon $ state $q_k$ is reachable from $q_i$ and thus $L_{q_i}$ is not prefix-free.

This is a contradiction with the initial assumption that the language L does not have the property P. $\square $
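The decomposition $w = rs_{q_\ell }^i t$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the transition table `delta`, the start state `0` and the input word are a made-up example in which every state lies on at most one simple cycle, and the function names `run` and `decompose` are hypothetical, not taken from the paper.

```python
def run(delta, start, w):
    """Return the sequence of states visited while reading w."""
    states = [start]
    for ch in w:
        states.append(delta[(states[-1], ch)])
    return states

def decompose(delta, start, w):
    """Split w as r + s*i + t around the first state that is visited twice."""
    states = run(delta, start, w)
    for q in states:
        occ = [k for k, p in enumerate(states) if p == q]
        if len(occ) > 1:
            first, second, last = occ[0], occ[1], occ[-1]
            s = w[first:second]            # string read on one pass of the cycle
            r, mid, t = w[:first], w[first:last], w[last:]
            i = len(mid) // len(s)
            if s * i != mid:               # would indicate two different cycles
                raise ValueError("middle part is not a power of the cycle string")
            return r, s, i, t
    return w, "", 0, ""                    # computation is already acyclic

# Assumed example automaton: 0 -a-> 1, 1 -b-> 2, 2 -c-> 1, 1 -d-> 3.
delta = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 1, (1, "d"): 3}

r, s, i, t = decompose(delta, 0, "abcbcbcd")
print(r, s, i, t)
```

For this input the computation visits state 1 four times, so the word splits into r = "a", three copies of s = "bc", and t = "d"; the computation on rt = "ad" visits each state at most once.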

Lemma 18. Let R, S, and T be finite languages. Let $L \subseteq \{ rs^it \mid r \in R, s \in S, t \in T, i \ge 0 \}$ be a context-free language. Then L is regular.

Proof

It holds that:

$$ L = \bigcup _{\begin{array}{c} r \in R\\ s \in S\\ t \in T \end{array}} \{ rs^it \mid rs^it \in L \}, $$

that is, L is a union of finitely many languages of the form $\{ rs^it \mid rs^it \in L \}$ for some specific r, s, t. For each of these languages we have $\{ rs^it \mid rs^it \in L \} = r \cdot \{s^i \mid rs^it \in L \} \cdot t = r \cdot \big ( ( r \backslash L / t ) \cap s^* \big ) \cdot t$, where $\backslash $ and / are the left and right quotient operation, respectively. This set is context-free, since L is context-free and $\mathrm {CF}$ is closed under concatenation, intersection with regular languages, and left and right quotients by regular languages.

It follows that $r \backslash \{ rs^it \mid rs^it \in L \} / t = \{ s^i \mid rs^it \in L \}$ is also context-free, and since CF is closed under inverse homomorphism, the set $\{a^i \mid rs^it \in L \}$, obtained for a single symbol a under the homomorphism $a \mapsto s$, is context-free as well. However, the latter language is unary, and every unary context-free language is also regular. Applying the homomorphism $a \mapsto s$ and concatenating with r and t preserves regularity, therefore every language $\{ rs^it \mid rs^it \in L \} = r \cdot \{s^i \mid rs^it \in L \} \cdot t$ is regular as well, and L is a union of finitely many regular languages. Hence L is regular. $\square $
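The finite-union decomposition used in this proof can be illustrated on a small assumed example. The Python sketch below computes, up to a bound, the exponent sets $\{ i \mid rs^it \in L \}$ for each triple (r, s, t) and rebuilds the corresponding fragment of L from them. The sets R, S, T, the membership test `in_L` (strings $ab^nc$ with n even) and the function names are hypothetical choices made only for this illustration.

```python
from itertools import product

def exponent_sets(R, S, T, in_L, bound):
    """For each (r, s, t), collect the exponents i <= bound with r s^i t in L."""
    return {
        (r, s, t): {i for i in range(bound + 1) if in_L(r + s * i + t)}
        for r, s, t in product(R, S, T)
    }

def rebuild(pieces):
    """Union of the languages { r s^i t | i in E } over all triples."""
    return {r + s * i + t for (r, s, t), E in pieces.items() for i in E}

def in_L(w):
    """Assumed example language: a b^n c with n even."""
    return (
        len(w) >= 2
        and w[0] == "a"
        and w[-1] == "c"
        and set(w[1:-1]) <= {"b"}
        and (len(w) - 2) % 2 == 0
    )

pieces = exponent_sets({"a"}, {"b", "bb"}, {"c"}, in_L, bound=6)
print(sorted(pieces[("a", "b", "c")]), sorted(pieces[("a", "bb", "c")]))
```

In this example the exponent set for s = "b" consists of the even numbers up to the bound, and the one for s = "bb" contains every exponent; both are ultimately periodic, which is what regularity of a unary language amounts to.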

Cite this paper

Jirásek, J. (2016). Non-regular Maximal Prefix-Free Subsets of Regular Languages. In: Brlek, S., Reutenauer, C. (eds) Developments in Language Theory. DLT 2016. Lecture Notes in Computer Science, vol 9840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53132-7_19

DOI: https://doi.org/10.1007/978-3-662-53132-7_19
Published: 21 July 2016
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53131-0
Online ISBN: 978-3-662-53132-7
eBook Packages: Computer Science, Computer Science (R0)
