Non-regular Maximal Prefix-Free Subsets of Regular Languages

Developments in Language Theory (DLT 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9840)

Abstract

We investigate non-regular maximal prefix-free subsets (MPFS) of regular languages. We give a method to decide whether or not a regular language has any non-regular MPFS.

Next, we prove that if a regular language has any non-regular MPFS, then it also has a MPFS which is context-sensitive but not context-free, it has a MPFS which is recursive but not context-sensitive, and it has a MPFS which is not recursively enumerable.

We show that no regular language has a MPFS which is recursively enumerable but not recursive. Finally, for any regular language we can decide whether or not it has a context-free non-regular MPFS.

Notes

  1. In the special case of \(L = \Sigma ^*\), this corresponds to Lemma 10 of [1].

  2. \(\mathrm {AC}\) stands for acyclic: the computation does not contain a cycle.

References

  1. Calude, C.S., Staiger, L.: On universal computably enumerable prefix codes. Math. Struct. Comput. Sci. 19(1), 45–57 (2009)

  2. Han, Y.-S., Salomaa, K., Wood, D.: Nondeterministic state complexity of basic operations for prefix-free regular languages. Fundam. Inform. 90(1–2), 93–106 (2009)

  3. Han, Y.-S., Salomaa, K., Wood, D.: Operational state complexity of prefix-free regular languages. In: Automata, Formal Languages, and Related Topics - Dedicated to Ferenc Gécseg on the Occasion of his 70th Birthday, pp. 99–115 (2009)

  4. Han, Y.-S., Salomaa, K., Yu, S.: State complexity of combined operations for prefix-free regular languages. In: Dediu, A.H., Ionescu, A.M., Martín-Vide, C. (eds.) LATA 2009. LNCS, vol. 5457, pp. 398–409. Springer, Heidelberg (2009)

  5. Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading (1979)

  6. Jirásek, J., Jirásková, G.: Cyclic shift on prefix-free languages. In: Bulatov, A.A., Shur, A.M. (eds.) CSR 2013. LNCS, vol. 7913, pp. 246–257. Springer, Heidelberg (2013)

  7. Jirásek, J.Š., Šebej, J.: Prefix-free subsets of regular languages and descriptional complexity. In: Shallit, J., Okhotin, A. (eds.) DCFS 2015. LNCS, vol. 9118, pp. 129–140. Springer, Heidelberg (2015)

  8. Jirásková, G., Krausová, M.: Complexity in prefix-free regular languages. In: Proceedings of the Twelfth Annual Workshop on Descriptional Complexity of Formal Systems, DCFS 2010, Saskatoon, Canada, 8–10 August 2010, pp. 197–204 (2010)

  9. Krausová, M.: Prefix-free regular languages: closure properties, difference, and left quotient. In: Kotásek, Z., Bouda, J., Černá, I., Sekanina, L., Vojnar, T., Antoš, D. (eds.) MEMICS 2011. LNCS, vol. 7119, pp. 114–122. Springer, Heidelberg (2012)

  10. Palmovský, M., Šebej, J.: Star-complement-star on prefix-free languages. In: Shallit, J., Okhotin, A. (eds.) DCFS 2015. LNCS, vol. 9118, pp. 231–242. Springer, Heidelberg (2015)

  11. Sipser, M.: Introduction to the Theory of Computation. PWS Publishing Company, Boston (1997)

Correspondence to Jozef Jirásek Jr.

Appendix

Lemma 14. Let L be a regular language with the property \(P_2\). Then L has a maximal prefix-free subset which is context-free, but not regular.

Proof

Here we show that the set C obtained as described in Lemma 14 is a MPFS of L. The proof of the lemma in the article shows that \(C \in \mathrm {CF}\setminus \mathrm {Reg}\).

Let \(D_1 = \{ xu^i \mid i \ge 0 \} \cdot C_1\). In Case 1, let \(D_2 = \{ xu^iyv^i \mid i \ge 0 \}\), \(D_3 = \{ xu^iyv^j \mid i \ge 1, 0 \le j < i \} \cdot C_2\). In Case 2, let \(D_2 = \{ xu^iyv^i \mid i \ge 0 \} \cdot C_2\), \(D_3 = \{ xu^iyv^j \mid i, j \ge 0; i \ne j \} \cdot C_3\).

(1) \(C \subseteq L\): \(C_0 \subseteq L\). We have \(s \cdot xu^i = p\), and \(C_1 \subseteq L_p\). Next, \(s \cdot xu^iyv^j = q\) for \(i, j \ge 0\). In Case 1, \(q \in F\) and \(C_2 \subseteq L_q\). In Case 2, \(C_2, C_3 \subseteq L_q\). Therefore \(C \subseteq L\).

(2) C is prefix-free: \(C_0\) is prefix-free. \(C_0\) does not contain any string in [x], therefore strings in \(C_0\) are incomparable with any string in \(D_1 \cup D_2 \cup D_3\).

Since \(C_1\) is prefix-free and does not contain any strings in [u], \(D_1\) is prefix-free as well. \(C_1\) does not contain any string in [u] or [y], so strings in \(D_1\) are incomparable with any string in \(D_2 \cup D_3\).

Since \(C_2\) and \(C_3\) are prefix-free and do not contain any string in [v], for given \(i, j \ge 0\) the languages \(xu^iyv^j \cdot C_2\) and \(xu^iyv^j \cdot C_3\) are also prefix-free. Let \(i_1 < i_2\). Then \(xu^{i_1}yv\) and \(xu^{i_2}yv\) are incomparable, since u and y differ in the first symbol. Let \(j_1 < j_2\). Then any string in \(xu^iyv^{j_1} \cdot C_2\), resp. \(C_3\), is incomparable with any string in \(xu^iyv^{j_2} \cdot C_2\), resp. \(C_3\), since no string in \(C_2\), resp. \(C_3\), is in [v].

(3) C is maximal. Let \(w \in L\). Consider the following cases:

  • \(w \notin [x]\). Then w is comparable to a string in \(C_0\).

  • \(w \le _p x\). Then in Case 1 \(w \le _p xy \in D_2\), in Case 2 \(w \le _p xyz \in D_2\).

  • \(w = xu^iw_1, i \ge 0, w_1 \notin [u] \cup [y]\). Then \(w_1\) is comparable to a string \(w_1'\) in \(C_1\) and w is comparable to \(xu^iw_1' \in D_1\).

  • \(w = xu^iw_1, i \ge 0, w_1 \le _p u\). Then in Case 1 \(w \le _p xu^{i+1}yv^{i+1} \in D_2\), in Case 2 \(w \le _p xu^{i+1}yv^{i+1}z \in D_2\).

  • \(w = xu^iw_1, i \ge 0, w_1 \le _p y\). Then in Case 1 \(w \le _p xu^iyv^i \in D_2\), in Case 2 \(w \le _p xu^iyv^iz \in D_2\).

  • \(w = xu^iyv^jw_2, i, j \ge 0, w_2 \notin [v]\).

    • Case 1: If \(j \ge i\), then \(w \ge _p xu^iyv^i \in D_2\). Otherwise \(w_2\) is comparable to a \(w_2' \in C_2\) and w is comparable to \(xu^iyv^jw_2' \in D_3\).

    • Case 2: If \(j = i\), then \(w_2\) is comparable to a \(w_2' \in C_2\) and w is comparable to \(xu^iyv^iw_2' \in D_2\). Otherwise \(w_2\) is comparable to a \(w_2'' \in C_3\) and w is comparable to \(xu^iyv^jw_2'' \in D_3\).

  • \(w = xu^iyv^jw_2, i, j \ge 0, w_2 \le _p v\).

    • Case 1: If \(j \ge i\), then \(w \ge _p xu^iyv^i \in D_2\). Otherwise \(w \le _p xu^iyv^i \in D_2\).

    • Case 2: \(w \le _p xu^iyv^{i+j+1}z' \in D_3\).

Therefore for any \(w \in L\) there is a \(w' \in C\) such that w is comparable to \(w'\). Then C is a MPFS. \(\square \)
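The three parts of this proof (subset, prefix-free, maximal) can be checked by brute force on a small instance. The following is a minimal sketch under assumptions that are not taken from the paper: the language \(L = a^*bc^*\), the choice \(x = \varepsilon\), \(u = a\), \(y = b\), \(v = c\), and the candidate set \(C = \{ a^iba^0c^i \} = \{ a^ibc^i \mid i \ge 0 \}\), which has the shape of \(D_2\) in Case 1. The helper names `comparable`, `language`, `candidate`, and `check` are hypothetical. A bounded check is only evidence, not a proof.

```python
import re


def comparable(a: str, b: str) -> bool:
    """True if one of the two strings is a prefix of the other."""
    return a.startswith(b) or b.startswith(a)


def language(bound: int) -> list:
    """All strings a^i b c^j of length at most `bound` (assumed L = a*bc*)."""
    return ["a" * i + "b" + "c" * j
            for i in range(bound) for j in range(bound - i)]


def candidate(bound: int) -> list:
    """The strings a^i b c^i for i < bound (assumed C, shaped like D_2)."""
    return ["a" * i + "b" + "c" * i for i in range(bound)]


def check(bound: int) -> bool:
    """Check the three MPFS conditions on all strings of L up to `bound`."""
    L, C = language(bound), candidate(bound)
    # (1) C is a subset of L.
    subset = all(re.fullmatch(r"a*bc*", c) is not None for c in C)
    # (2) No two distinct strings of C are comparable.
    prefix_free = all(not comparable(c1, c2)
                      for c1 in C for c2 in C if c1 != c2)
    # (3) Every string of L is comparable to some string of C.
    #     A string a^i b c^j with i < bound always has its witness
    #     a^i b c^i inside candidate(bound).
    maximal = all(any(comparable(w, c) for c in C) for w in L)
    return subset and prefix_free and maximal
```

In this instance a string \(a^ibc^j\) with \(j \ge i\) has \(a^ibc^i\) as a prefix, and with \(j < i\) it is itself a prefix of \(a^ibc^i\), which mirrors the case split on \(j \ge i\) in part (3).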

In the following, we use the notation introduced in Lemma 15.

Lemma 16. For every state \(q_i, 0 \le i \le k\), there is at most one non-empty string \(s_{q_i}\) such that \(q_i \cdot s_{q_i} = q_i\), and the computation \(q_i \xrightarrow []{s_{q_i}} q_i\) does not pass through \(q_i\) except for the first and last state.

Proof

For a contradiction, let \(q_i \cdot s_1 = q_i\) and \(q_i \cdot s_2 = q_i\) for non-empty \(s_1 \ne s_2\), where the computations on \(s_1\) and \(s_2\) do not pass through \(q_i\). Then \(s_1\) and \(s_2\) are not comparable, and we have \(s_1 = s'as_1'\) and \(s_2 = s'bs_2'\) for \(s', s_1', s_2' \in \Sigma ^*\) and \(a, b \in \Sigma , a \ne b\). Let \(q_0 \cdot x' = q_i\) and \(q_i \cdot z' = q_k\).

Let \(p = q = q_i \cdot s'\), \(x = x's'\), \(u = v = as_1's'\), \(y = bs_2's'\). Then L has the property \(P_2\) since either \(q = q_i\) is a final state if \(i = k\) (Case 1), or \(L_q \setminus [v]\) is not prefix-free, since \(q \cdot bs_2's'z' = q_f\) and \(q_f\) is a final non-\(\varepsilon \) state (Case 2).
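The strings considered in Lemma 16, those that lead from a state back to itself without passing through it in between, can be enumerated up to a length bound. The sketch below assumes a partial deterministic automaton encoded as a dictionary from (state, symbol) pairs to states. The function name `first_return_words` and both toy automata are hypothetical and serve only as illustration.

```python
def first_return_words(delta, q, max_len):
    """Non-empty words w with q·w = q, |w| <= max_len, whose computation
    does not pass through q except at the first and last state."""
    found = []
    stack = [(q, "")]          # (current state, word read so far)
    while stack:
        state, word = stack.pop()
        if len(word) == max_len:
            continue
        for (src, sym), dst in delta.items():
            if src != state:
                continue
            if dst == q:
                found.append(word + sym)      # returned to q: record, stop here
            else:
                stack.append((dst, word + sym))
    return sorted(found)


# Toy automaton for a(bc)*d: state 1 has exactly one first-return word.
DELTA_ONE_CYCLE = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 1, (1, "d"): 3}

# Toy automaton in which state 0 has two distinct first-return words.
DELTA_TWO_CYCLES = {(0, "a"): 0, (0, "b"): 1, (1, "c"): 0}
```

In the second automaton the two words `a` and `bc` differ in their first symbol, which is the configuration from which the proof extracts \(u\) and \(y\).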

Lemma 17. Let \(\mathrm {AC}(q_k) = \{ w' \in \Sigma ^* \mid s \cdot w' = q_k\), and the computation \(s \xrightarrow []{w'} q_k\) does not contain any state more than once\(\}\). Let \(\ell \) be the first index such that \(q_\ell \) occurs in the computation on w more than once, if such a state exists. Then \(w = rs_{q_\ell }^i t\), where \(r, t \in \Sigma ^*\), \(i \ge 0\), and \(rt \in \mathrm {AC}(q_k)\).

Proof

Let \(q_\ell \) be the first and \(q_{\ell '}\) be the last occurrence of the state \(q_\ell \) in the computation on w. By Lemma 16, the only possible string that can be read between two consecutive passes through \(q_\ell \) must be \(s_{q_\ell }\). The computation therefore looks like this:

$$ q_0 \xrightarrow []{a_1} q_1 \xrightarrow []{a_2} \cdots \xrightarrow []{a_\ell } q_\ell \xrightarrow []{s_{q_\ell }} q_\ell \xrightarrow []{s_{q_\ell }} \cdots \xrightarrow []{s_{q_\ell }} q_{\ell '} \xrightarrow []{a_{\ell '+1}} q_{\ell '+1} \xrightarrow []{a_{\ell '+2}} \cdots \xrightarrow []{a_k} q_k $$

Let \(r = a_1a_2 \cdots a_\ell \) and \(t = a_{\ell '+1}a_{\ell '+2} \cdots a_k\). Let us show that \(rt \in \mathrm {AC}(q_k)\); that is, the states \(q_0, q_1, \dots , q_{\ell -1}, q_{\ell '+1}, \dots , q_k\) are all distinct.

We know that \(q_\ell \) is the first state which occurs in the computation more than once, therefore if two states \(q_i\) and \(q_j\) for \(i < j\) among the above are equal, it must be that \(\ell '< i < j \le k\).

Thus the computation goes through a cycle \(q_\ell \xrightarrow []{s_{q_\ell }} q_\ell \), potentially several times. After that, the computation goes through the cycle \(q_i \xrightarrow []{s_{q_i}} q_i\). This cycle does not contain \(q_\ell \), since \(q_{\ell '}\) is the last occurrence of this state and \(\ell ' < i\). The computation must therefore “leave” the \(q_\ell \) cycle at some point. Let \(q_p\) be the last state in the computation that follows this cycle. Without further technical details, let us observe that we have \(p = q_p\), \(q = q_i\), \(x = a_1a_2 \cdots a_p\), \(u = s_{q_p}\), \(y = a_{p+1}a_{p+2} \cdots a_i\), \(v = s_{q_i}\) and L has the property P since either \(q = q_i = q_k\) is final, or the final non-\(\varepsilon \) state \(q_k\) is reachable from \(q_i\) and thus \(L_{q_i}\) is not prefix-free.

This is a contradiction with the initial assumption that the language L does not have the property P. \(\square \)
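The factorization \(w = rs_{q_\ell }^i t\) can be read off a computation directly. The sketch below assumes the same dictionary encoding of a partial deterministic automaton as one might use for small experiments, and a word that the automaton can read completely. The names `run` and `factor` and the toy automaton for \(a(bc)^*d\) are hypothetical.

```python
def run(delta, start, w):
    """Sequence of states visited while reading w from `start`."""
    states = [start]
    for sym in w:
        states.append(delta[(states[-1], sym)])
    return states


def factor(delta, start, w):
    """Split w as (r, s, i, t) with w == r + s * i + t, where the loop is
    taken at the first state that the computation visits more than once."""
    states = run(delta, start, w)
    for l, q in enumerate(states):
        last = len(states) - 1 - states[::-1].index(q)   # last visit of q
        if last > l:
            nxt = states.index(q, l + 1)                 # next visit of q
            s, loop = w[l:nxt], w[l:last]
            i, rem = divmod(len(loop), len(s))
            if rem != 0 or s * i != loop:
                raise ValueError("more than one return word at this state")
            return w[:l], s, i, w[last:]
    return w, "", 0, ""                                  # no repeated state


# Toy automaton for a(bc)*d with start state 0 and final state 3.
DELTA = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 1, (1, "d"): 3}
```

For the word `abcbcd` the computation visits state 1 three times, so `factor(DELTA, 0, "abcbcd")` gives `("a", "bc", 2, "d")`, and the shortened word `ad` visits each state at most once, as the lemma requires of \(rt\).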

Lemma 18. Let R, S, and T be finite languages. Let \(L \subseteq \{ rs^it \mid r \in R, s \in S, t \in T, i \ge 0 \}\) be a context-free language. Then L is regular.

Proof

It holds that:

$$ L = \bigcup _{\begin{array}{c} r \in R\\ s \in S\\ t \in T \end{array}} \{ rs^it \mid rs^it \in L \}, $$

that is, L is a union of finitely many languages of the form \(\{ rs^it \mid rs^it \in L \}\) for some specific r, s, and t. For each of these languages we have \(\{ rs^it \mid rs^it \in L \} = r \cdot \{s^i \mid rs^it \in L \} \cdot t = r \cdot ( r \backslash L / t ) \cdot t\), where \(\backslash \) and / are the left and right quotient operations, respectively. This set is context-free, since L is context-free and \(\mathrm {CF}\) is closed under concatenation and under left and right quotients by regular languages.

It follows that \(r \backslash \{ rs^it \mid rs^it \in L \} / t = \{ s^i \mid rs^it \in L \}\) is also context-free, and since \(\mathrm {CF}\) is closed under inverse homomorphism (take the homomorphism \(h\) with \(h(a) = s\)), the set \(\{a^i \mid rs^it \in L \}\) is context-free as well. However, the latter language is unary, and every unary context-free language is also regular. Therefore every language \(\{ rs^it \mid rs^it \in L \} = r \cdot \{s^i \mid rs^it \in L \} \cdot t\) is regular as well, and L is a union of finitely many regular languages. Hence L is regular. \(\square \)
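The reduction to unary languages in this proof can be made concrete by listing, for each triple \((r, s, t)\), the exponents \(i\) with \(rs^it \in L\). The sketch below assumes a membership test for L given as a Python function. The name `exponent_sets` and the sample language \(a(baba)^*c\) are hypothetical and chosen only for illustration.

```python
import re
from itertools import product


def exponent_sets(in_L, R, S, T, bound):
    """For each (r, s, t), the exponents i <= bound with r s^i t in L."""
    return {(r, s, t): [i for i in range(bound + 1) if in_L(r + s * i + t)]
            for r, s, t in product(R, S, T)}


def in_sample_language(w):
    """Membership in the assumed sample language a(baba)*c."""
    return re.fullmatch(r"a(baba)*c", w) is not None


SETS = exponent_sets(in_sample_language, ["a"], ["ba"], ["c"], 6)
```

Here `SETS[("a", "ba", "c")]` is `[0, 2, 4, 6]`: the exponent set is periodic, as expected for a unary regular language, which is always ultimately periodic.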

© 2016 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jirásek, J. (2016). Non-regular Maximal Prefix-Free Subsets of Regular Languages. In: Brlek, S., Reutenauer, C. (eds) Developments in Language Theory. DLT 2016. Lecture Notes in Computer Science, vol 9840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53132-7_19

  • DOI: https://doi.org/10.1007/978-3-662-53132-7_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-53131-0

  • Online ISBN: 978-3-662-53132-7

  • eBook Packages: Computer Science, Computer Science (R0)
