Keywords

1 Introduction

We study the ability of the jumping mode of tape heads to strengthen accepting power of automata. Automata have several characteristics; determinism of transition functions, ability of rewriting or erasing input words, memory devices like stacks and directions of tape heads to read inputs. Recently a mode of tape head movement has been introduced and examined with respect to how the class of languages accepted is affected ([1, 3,4,5,6, 8, 9, 11,12,13]). We study the abilities and limitations of the new mode of tape head move by comparing several machine models.

Jumping finite automata (JFA) were introduced as automata with a new mode of tape head in [12]. In the new mode, the tape head is allowed to jump - either left or right - over a part of the input word after reading a letter and continue processing from there. Once a letter in the input word is read, it cannot be reread again later. This implies that once a letter is read, it is erased. The tape head starts anywhere in the input word and it can move to either right or left side. It was shown that a language is accepted by jumping finite automata if and only if it is commutative and semilinear in [4, 5].

One-way jumping (deterministic) finite automata (OWJFA), a variant of jumping finite automata, were introduced and analyzed in [4]. They have another mode of tape head; the head moves in one direction only and starts at the beginning of the input word. It moves from left to right (and jumps over parts of the input it cannot read) and when the tape head reaches the end of the input, it is returned to the beginning of the input and continues the computation until all the letters are read or the automaton is stuck in the sense that it can no longer read any letter of the remaining input. Several properties and characterization results were provided in [1]. The majority of decidability questions have been answered in [2], with the most notable exception being, when is the language accepted regular? We attempt to get closer to the answer by looking at the complements of OWJFA.

In this paper we consider deterministic or nondeterministic finite automata and pushdown automata in a uniform manner. However, we restrict our study to automata that do not rewrite the input letters and so we exclude linear bounded automata and Turing machines from consideration.

We recall notations of automata (see [7, 15]). We denote \(\varSigma _{\epsilon } = \varSigma \cup \{ \epsilon \}\), where \(\epsilon \) is the empty word and P(Q) stands for the power set of Q. A nondeterministic finite automaton (denoted by NFA) M is a 5-tuple \((Q,\varSigma ,\delta ,q_0,F)\), where Q is set of states, \(\varSigma \) is finite set, \(\delta : Q \times \varSigma \rightarrow P(Q)\) is a transition relation, \(q_0\) is the initial state, F is set of accept state. In Sect. 3 we deviate from this definition by allowing the NFA to have multiple initial states, a change that is known not to affect the class of accepted languages in the classical case, but makes a difference in the alternative tape head modes. If \(\delta \) is a mapping \(Q \times \varSigma _{\epsilon } \rightarrow P(Q)\), then M is called \(\epsilon -\)NFA. M is called deterministic (denoted by a DFA) if (1) it is an \(\varepsilon \)-free NFA and (2) for \(\forall p \in Q\) and \(\forall a \in \varSigma \), there is no more than one \(q \in Q\) such that \(\delta (p,a) = q\).

A nondeterministic pushdown automaton (denoted by NPDA) M is a 6-tuple \((Q,\varSigma ,\varGamma ,\delta ,q_0,F)\), where Q is finite set of states, \(\varSigma \) is finite set, \(\varGamma \) is a finite stack alphabet, \(\delta :Q \times \varSigma _{ \epsilon } \times \varGamma _{\epsilon } \rightarrow P(Q \times \varGamma _{\epsilon })\) is the transition function, \(q_0\) is the initial state, F is set of accept states, and \(\$\) is a bottom marker of stack. M is called deterministic (denoted by DPDA) if it satisfies (1) For \(\forall q \in Q, \forall a \in \varSigma \cup \{ \epsilon \}, \forall b \in \varGamma \) we have \(| \delta (q,a,b) | \le 1\) and (2) For \(\forall q \in Q, \forall a \in \varSigma , \forall b \in \varGamma \), we have \(\delta (q,a,b)=\emptyset \) if \(\delta (q,\epsilon ,b) \ne \emptyset \). The language accepted by DPDA is defined as the set of inputs on which the automaton ends up in a final state (not the empty stack acceptance condition) after reading the whole input.

2 Modes of Tape Head Move

First we define modes of movement of the tape head to capture characteristics of both JFA and (R)OWJFA. Transitions between configurations of automata are considered to be rewriting of strings on state and input alphabets. We study three ways of rewriting configurations of automata; the standard mode, the nondeterministic jumping mode and the one-way jumping mode. The first one is the traditional way to rewrite configurations of automata as defined in [7] and [15]. The second and third one are introduced by [12] and [4], respectively. The three modes can be applied to any automata with deterministic/nondeterministic transition functions, with/without stacks, and with/without rewriting and erasing a letter in an input. Our objective with this paper is to extend the examination of how the three modes relate to each other with respect to the accepting power of automata, in particular, nondeterministic finite and pushdown automata.

2.1 Tape Head Modes

Suppose M is a (deterministic or nondeterministic) finite automaton.

Standard Mode

A configuration of M is any string in \(Q \times \varSigma ^{*}\). A transition from configuration \(q_1aw\) to configuration \(q_2w\), written as \(q_1aw \rightarrow q_2w\), is possible when \(q_2 \in \delta (q_1,a)\). In the standard manner, we extend \(\rightarrow \) to \(\rightarrow ^{m}\), where \(m \ge 0\). Let \(\rightarrow ^{+}\) and \(\rightarrow ^{*}\) denote the transitive closure and the transitive-reflexive closure of \(\rightarrow \), respectively.

We define a (deterministic or nondeterministic) finite automaton with standard mode to be a rewriting system \((M,\rightarrow )\) based on \(\rightarrow ^{*}\). The language accepted by \((M,\rightarrow )\) (denoted by \(L(M,\rightarrow )\)) is defined to be \(\{ w \mid w \in \varSigma ^{*}, sw \rightarrow ^{*} f, f \in F \}\).

Nondeterministic Jumping Mode

A configuration of M is any string in \(\varSigma ^{*} \times Q \times \varSigma ^{*}\), representing the part of the input to the left from the reading head, the state and the input to the right from the head. The binary jumping relation, symbolically denoted by \(\curvearrowright \), over \(\varSigma ^{*} \times Q \times \varSigma ^{*}\), is defined as follows. Let x, z, \(x'\), \(z'\) be strings in \(\varSigma ^{*}\) such that \(xz = x'z'\) and \(q \in \delta (p,y)\); then, M makes a jump from xpyz to \(x'qz'\), symbolically written as \(xpyz \curvearrowright x'qz'\). In the standard manner, we extend \(\curvearrowright \) to \(\curvearrowright ^{m}\), where \(m \ge 0\). Let \(\curvearrowright ^{+}\) and \(\curvearrowright ^{*}\) denote the transitive closure of \(\curvearrowright \) and the transitive-reflexive closure of \(\curvearrowright \), respectively.

We define a (deterministic or nondeterministic) finite automaton with nondeterministic jumping mode to be a rewriting system \((M,\curvearrowright )\) based on \(\curvearrowright ^{*}\). (see Fig. 1). The language accepted by \((M,\curvearrowright )\) (denoted by \(L(M,\curvearrowright )\)) is defined to be \(\{ uv \mid u, v \in \varSigma ^{*}, usv \curvearrowright ^{*} f, f \in F \}\). This jumping mode was introduced as a new automaton model called a JFA in [12].

One-way Jumping Mode

The right one-way jumping relation (denoted by \(\circlearrowright \)) between configurations from \(Q \varSigma ^{*}\), was defined in [4] as follows. Suppose that x and y belong to \(\varSigma ^{*}\), a belongs to \(\varSigma \), p and q are states in Q and \(q \in \delta (p,a)\). Then the right one-way jumping automaton M makes a jump from the configuration pxay to the configuration qyx, symbolically written as \(pxay \circlearrowright qyx\) if x belongs to \(\{ \varSigma \setminus \varSigma _p \}^{*}\) where \(\varSigma _p = \{b\in \varSigma \mid \exists q\in Q\ s.t.\ q \in \delta (p,b) \}\) (see Fig. 2). In the standard manner, we extend \(\circlearrowright \) to \(\circlearrowright ^{m}\), where \(m \ge 0\). Let \(\circlearrowright ^{*}\) denote the transitive-reflexive closure of \(\circlearrowright \).

We define a (deterministic or nondeterministic) finite automaton with one-way jumping mode of tape head to be a rewriting system \((M,\circlearrowright )\) based on \(\circlearrowright ^{*}\). The language accepted by \((M,\circlearrowright )\) (denoted by \(L(M,\circlearrowright )\)) is defined to be \(\{ w \mid w \in \varSigma ^{*}, sw \circlearrowright f, f \in F \}\).

Fig. 1.
figure 1

Nondeterministic jumping mode of tape head move.

Fig. 2.
figure 2

One-way jumping mode of tape head move.

In a similar manner we define a (deterministic or nondeterministic) pushdown automaton with standard mode, nondeterministic jumping mode and one-way jumping mode, respectively, to be the rewriting systems \((M,\rightarrow )\), \((M,\curvearrowright )\) and \((M,\circlearrowright )\), respectively, where M is a (deterministic or nondeterministic) pushdown automaton.

2.2 Language Classes

We consider deterministic and nondeterministic finite automata and deterministic and nondeterministic pushdown automata, denoted by DFA, NFA, DPDA, NPDA, respectively. Then we classify these automata with three modes of tape head from the standpoint of languages accepted. We denote the language classes accepted by DFA, NFA, DPDA, NPDA with three modes \(\rightarrow , \curvearrowright , \circlearrowright \) by (\(\rightarrow , \curvearrowright , \circlearrowright \))-DFA, (\(\rightarrow , \curvearrowright , \circlearrowright \))-NFA, (\(\rightarrow , \curvearrowright , \circlearrowright \))-DPDA, (\(\rightarrow , \curvearrowright , \circlearrowright \))-NPDA, respectively, in this paper. For example, \(\rightarrow \)DFA coincides with \(\rightarrow \)NFA and they comprise the class of regular languages, and \(\rightarrow \)NPDA is the class of context-free languages.

2.3 Differences of Modes of Tape Head Move

The next proposition shows the basic relationship between the processing of inputs by the same machine M in different tape head modes. Versions of the statement regarding the acceptance of inputs by M in the different modes have been shown in [13, Ch.17] and [4], for \(\curvearrowright \) and \(\circlearrowright \), respectively.

Proposition 1

Let M be an (deterministic or non-deterministic) automaton. Suppose \(w_1, w_2, w_3\) are words over \(\varSigma \).

(1) If \(q_1w_1 \rightarrow ^* q_2w_2\) then \(q_1w_1 \curvearrowright ^{*} q_2w_2\).

(2) If \(q_1w_1 \rightarrow ^* q_2w_2\) then \(q_1w_1 \circlearrowright ^* q_2w_2\).

(3) If \(q_1w_1 \curvearrowright ^{*} q_2w_2\) then there exists a permutation \(\phi \) such that \(q_1\phi (w_1) \rightarrow ^* q_2w_2\).

(4) If \(q_1w_1 \circlearrowright ^* q_2w_2\) then there exists a permutation \(\phi \) such that \(q_1\phi (w_1) \rightarrow ^* q_2w_2\).

Proof

(1) Let \(q_1w_1 \rightarrow ^{n} q_2w_2\) and let \(m=|w_1|\). We prove that \(q_1w_1 \curvearrowright ^{n} q_2w_2\) holds, by induction on n. If \(n=0\), then \(q_1=q_2\) and \(w_1=w_2\). Thus, \(q_1w_1 \curvearrowright ^{0} q_2w_2\) holds. Suppose that \(n = k \le m-1\) holds. If n = k + 1, then there exist q and \(a \in \varSigma \) such that \(q_1w_1 \rightarrow ^{k} qaw_2 \rightarrow q_2w_2\). By \(q_2\in \delta (q,a)\), we have \(q_1w_1 \curvearrowright ^{k} qaw_2 \curvearrowright q_2w_2 =q_1w_1 \curvearrowright ^{k+1} q_2w_2\). Therefore, if \(q_1w_1 \rightarrow ^* q_2w_2\), then \(q_1w_1 \curvearrowright ^{*} q_2w_2\). We can prove (2) in a similar manner.

(3) Let \(q_1w_1 \curvearrowright ^{n} q_2w_2\) and let \(m=|w_1|\). We prove that there exists a permutation \(\phi \) such that \(q_1\phi (w_1) \rightarrow ^{n} q_2w_2\) holds, by induction on n. If \(n=0\), then there exists a permutation \(\phi \) such that \(\phi (w_1)=w_1\). Thus, \( q_1\phi (w_1) \rightarrow ^{0} q_1w_1 = q_2w_2\) holds. Suppose that \(n = k \le m-1\) holds. i.e. there exists \(\sigma = \bigl ({\begin{matrix} 1 &{} 2 &{} 3 &{} \cdots &{} k-1 &{} k \\ \sigma (1) &{} \sigma (2) &{} \sigma (3) &{} \cdots &{} \sigma (k-1) &{} \sigma (k) \end{matrix}}\bigr )\) such that \(q_1\sigma (w_1) \rightarrow ^k q_2w_2\). If n = k + 1, then there exist q and \(a \in \varSigma \) such that \(q_1w_1 \curvearrowright ^{k} quav \curvearrowright q_2w_2\) where \(uv=w_2\). From before, \(q_1\sigma (w_1) \rightarrow ^k quav \curvearrowright q_2w_2\) holds and let the permutation \(\sigma '\) be such that \(\sigma '(xyaz)\,=\,xayz\) for \(a\in \varSigma \) and \(xyz\in \varSigma ^*\) with \(|x|=k\). Choosing \(\phi =\sigma '\circ \sigma \) gives \(q_1\phi (w_1) \rightarrow ^k qauv \rightarrow q_2w_2\). Thus, \(q_1\phi (w_1) \rightarrow ^{k+1} q_2w_2\) holds.

We can prove (4) in a similar manner.   \(\square \)

Remark 1

It is easy to see that if \(q_1w_1w_2 \rightarrow ^* q_3\) and \(q_1w_1 \rightarrow ^* q_2\) then \(q_2w_2 \rightarrow ^* q_3\). However, \(q_1w_1w_2 \curvearrowright ^{*} q_3\) and \(q_1w_1 \curvearrowright ^{*} q_2\) does not imply \(q_2w_2 \curvearrowright ^{*} q_3\). Similarly, \(q_1w_1w_2 \circlearrowright ^* q_3\) and \(q_1w_1 \circlearrowright ^* q_2\) does not imply \(q_2w_2 \circlearrowright ^* q_3\).

In [1](equation (1)) the authors summarize the relationship between the languages accepted by DFA M in the various tape head modes, as:

$$L(M,\rightarrow )\subseteq L(M,\circlearrowright )\subseteq L(M,\curvearrowright ) = Perm(L(M,\rightarrow )).$$

Regarding the \(\rightarrow \) and \(\curvearrowright \) modes, it makes no difference in the proofs whether the machine is deterministic. This means that we have the following relationship between acceptance in \(\rightarrow \) and \(\curvearrowright \) modes.

Proposition 2

Let M be any automaton. A word w is accepted by \((M, \curvearrowright )\) if and only if there exists a permutation \(\phi _w\) depending on w such that \(\phi _w(w)\) is accepted by \((M, \rightarrow )\).

We note that the permutation \(\phi _w\) depends on the word w, however, it is not necessarily unique. Therefore, if a language L is accepted by a JFA \((M, \curvearrowright )\), we have \(L = Perm(L)\).

Corollary 1

A word w is accepted by a rewriting system \((M,\curvearrowright )\) if and only if there exists a permutation \(\phi \) such that \(\phi (w)\) is accepted by a rewriting system \((M,\rightarrow )\).

The following corollary has been stated for JFA (\(\curvearrowright \)NFA) in [12] and for deterministic ROWJFA (\(\circlearrowright \)DFA) in [4]. They also follow, just as the nondeterministic case, from Proposition 1. A similar argument can easily be made for DPDA and NPDA.

Corollary 2

Let M be a deterministic or nondeterministic finite automaton or pushdown automaton. Any word w, which is accepted by \((M, \rightarrow )\), is also accepted by \((M,\curvearrowright )\) and \((M,\circlearrowright )\).

This means, generalizing the inclusions from before, that for any DFA, NFA, DPDA or NPDA M:

$$\begin{aligned} L(M,\rightarrow )\subseteq L(M,\circlearrowright )\subseteq L(M,\curvearrowright ) = Perm(L(M,\rightarrow )). \end{aligned}$$
(1)

Parikh’s well-known paper [14] shows that all context-free languages have semilinear Parikh-image. Furthermore, all semilinear sets have a regular preimage, and this, combined with Eq. (1) tell us that the class of permutation closures of context-free languages is the same as the class of permutation closures of regular languages, accepted by DFA/NFA in \(\curvearrowright \) mode ([5]). From here, by Corollary 1, we can deduce that in \(\curvearrowright \) tape head mode, adding a stack to a finite automaton does not change the class of accepted languages, i.e.,

Equation (1) also allows us to extend all the well-known pumping lemmas trivially to the language classes defined through the new execution modes.

Remark 2

Let M be an NFA, DFA, DPDA or NPDA, and \(N_M>0\) a constant which depends on M. If a pumping lemma holds for all words \(w\in L(M,\rightarrow )\) with \(|w|\ge N_M\), then for all words \(w\in L(M,\curvearrowright )\cup L(M,\circlearrowright )\) with \(|w|\ge N_M\), there exists a permutation \(\phi \) such that the lemma holds for \(\phi (w)\).

To close this section, we briefly discuss the special cases of unary and binary alphabets for the new tape head modes.

Theorem 1

Let M be a deterministic or nondeterministic finite automaton or pushdown automaton with \(| \varSigma | = 1\), then \(L(M,\curvearrowright )=L(M, \circlearrowright )=L(M, \rightarrow )\).

Proof

Straightforward, as the languages above are commutative and semilinear.

   \(\square \)

It follows that there is no \({\circlearrowright }\)NPDA that accepts \(\{a^p \mid \text {p is a prime number}\}\), providing separation of \({\circlearrowright }\)NPDA from the class of context-sensitive languages. Binary alphabets are also special cases for the \(\curvearrowright \) mode. As shown in [10], over binary alphabets all commutative semilinear languages are context-free, so we have . This is in contrast with larger alphabets, where \({\curvearrowright }\)NFA and (\(\rightarrow \))-NPDA are incomparable.

3 One-Way Jumping Nondeterministic Finite Automata (\({\circlearrowright }\)NFA)

In previous papers [1, 2, 4] the class \({\circlearrowright }\)DFA of languages accepted by one-way jumping deterministic finite automata has been investigated. In [1] it was shown that permutation closed languages in \({\circlearrowright }\)DFA are characterized by having finitely many positive Myhill-Nerode equivalence classes. This nice characterization gave the corollary that a permutation closed language in \({\circlearrowright }\)DFA is regular if and only if its complement is in \({\circlearrowright }\)DFA. However, not much more has been known about the complements of \({\circlearrowright }\)DFA languages. To tackle this, we look at the class \({\circlearrowright }\)NFA of languages accepted by one-way jumping nondeterministic finite automata (\({\circlearrowright }\)NFA), that is, machines \((M,\circlearrowright )\), where M is an NFA without \(\epsilon \)-moves, but with possibly multiple initial states. Then, we show that \({\circlearrowright }\)NFA strictly includes \({\curvearrowright }\)NFA, the class of permutation closed semilinear languages and, as a corollary, the complements of permutation closed \({\circlearrowright }\)DFA languages. Finally, we go further by showing that \({\circlearrowright }\)NFA contains the complements of all \({\circlearrowright }\)DFA languages, but \({\circlearrowright }\)NFA itself is not closed under complementation.

Definition 1

(\({\circlearrowright }\)NFA) A (right) one-way jumping nondeterministic finite automaton is an \(\epsilon \)-free NFA with multiple initial states in \(\circlearrowright \) execution mode.

Example 1

The language \(K = \{w \in \{a,b\}^*: |w|_b = 0\ or\ |w|_a = |w|_b\}\) is accepted by \({\circlearrowright }\)NFA \(M=(\{q_0,q_1,q_2,q_3,q_4,q_5\},\{a,b\},\delta ,\{q_0\},\{q_1,q_3\})\), where \(\delta (q_0,a)=\{q_1,q_3\},\delta (q_1,a)=\{q_2\},\delta (q_2,b)=\{q_1\},\delta (q_3,a)=\{q_3\}\).

In [1] it is shown that K ; this establishes . To move on to the inclusions in \({\circlearrowright }\)NFA of the classes mentioned above, first we need to state that, as expected, \({\circlearrowright }\)NFA is closed under union, by a construction similar to the case of classical NFA.

Proposition 3

The class \({\circlearrowright }\)NFA is closed under union.

Proof

Let \(M_1 = (Q_1,\varSigma _1,\delta _1,S_1,F_1)\) and \(M_2 = (Q_2,\varSigma _2,\delta _2,S_2,F_2)\) be two NFA such that \(Q_1\cap Q_2=\emptyset \) (if this does not hold, we can simply rename the states). It is straightforward to see that \(M_3 = (Q_1\cup Q_2, \varSigma _1\cup \varSigma _2, \delta _1\cup \delta _2, S_1\cup S_2, F_1\cup F_2)\) accepts the union of the languages accepted by \(M_1\) and \(M_2\) in \(\circlearrowright \) execution mode, that is, \(L(M_3, \circlearrowright ) = L(M_1, \circlearrowright ) \cup L(M_2, \circlearrowright )\).   \(\square \)

Theorem 2

For any semilinear set \(S \in \mathbb {N}^{|\varSigma |}\) there exists an NFA M, such that \(\varPsi ^{-1}_\varSigma (S) = L(M,\circlearrowright )\), where \(\varPsi _\varSigma \) is the Parikh mapping for alphabet \(\varSigma \).

Proof

S is semilinear, so we can write it as the finite union of linear sets. We give the construction for a linear set. The statement then follows from Proposition 3, by constructing disjoint NFA for all linear sets and taking their union to be M. Let \(\varSigma =\{a_1,\dots , a_k\}\) and \(Lin\subseteq \mathbb {N}^k\) be a linear set of vectors over non-negative integers, i.e., there exist \(v_0, \dots , v_{n}\in \mathbb {N}^k\) such that

$$Lin = \{v_0 + c_1v_1+\cdots + c_n v_{n} \mid c_1,\dots , c_n\in \mathbb {N} \}.$$

First we define the set of “starting vectors” as \(Lin_0 = \{v_0, v_0+v_1, \dots , v_0+v_n \}\), the minimal subset of Lin such that all letters which occur in Lin, also occur in \(Lin_0\). The set \(W(Lin_0)\) of representative words for a set of vectors \(Lin_0\) will be a subset of the preimage of \(Lin_0\) under the Parikh mapping. For each \((m_1,\dots ,m_k)\in Lin_0\) we take k words such that all possible starting letters are represented (there may be less than k if some \(m_i=0\)). Several choices for \(W:\mathbb {N}^{|\varSigma |}\rightarrow \varSigma ^*\) are possible. We set W such that

$$W(Lin_0) = \bigcup _{(m_1,\dots ,m_k) \in Lin_0} \{ a_1^{m_1} a_2^{m_2} \cdots a_k^{m_k}, a_2^{m_2} \cdots a_k^{m_k} a_1^{m_1}, \dots , a_k^{m_k}a_1^{m_1}\cdots a_{k-1}^{m_{k-1}}\}$$

Now we construct M as follows. Let the initial state of M be s and for each \(w_i\in W(Lin_0)\) add a new path from s to a new final state \(w_i\) labeled by letters of \(w_i\) (see Fig. 3). To each of these new final states \(w_i\) we add all possible loops labeled by words in \(W(\{v_1,\dots , v_n\})\) (see Fig. 4). Here we removed \(v_0\), because it is only added once to each vector in Lin, according to the definition. It is clear from the construction that for each vector \(v\in Lin\) there is at least a path in M labeled by a word in the preimage of v. Conversely, the labels of each path form a word whose Parikh mapping is some \(v\in Lin\). Even though not every preimage of v forms a path in M, the \(\circlearrowright \) mode of execution allows M to read all the letters.   \(\square \)

Fig. 3.
figure 3

For each \(w_i\in Lin_0\), add a path to a final state, labeled by the letters of \(w_i\).

Fig. 4.
figure 4

For each \(u_j\in W(\{v_1,\dots ,v_n\})\) add a loop to each \(w_i\), labeled by the letters of \(u_j\).

At first sight it may seem that adding multiple paths for each vector in a linear set to the NFA in the previous proof is an overkill. While in some cases this could be avoided, as the next example shows, it is necessary in general.

Example 2

Consider the semilinear set, which is the union of linear sets \(Lin_1 = \{(1,0,0,0) + c_1\cdot (0,2,2,0) + c_2\cdot (0,0,2,2) +c_3\cdot (0,2,0,2) \mid c_1, c_2, c_3\in \mathbb {N} \}\) and \(Lin_2 = \{(0,1,0,0) + c_1\cdot (2,0,0,0) + c_2\cdot (2,2,0,0) \mid c_1, c_2\in \mathbb {N} \}\) over alphabet \(\{a,b,c, d\}\). If in the first phase of the construction only \(v_0\) was to be added as a path from s for each linear set, then for input bbcca, the machine would have to choose the branch of \(Lin_2\), because the first letter to the right for which there is a transition from s is b. This would mean bbcca is rejected, because \((1,2,2,0)\notin Lin_2\), even though \((1,2,2,0)\in Lin_1\) and thus \(bbcca\in \varPsi ^{-1}(Lin_1\cup Lin_2)\). If in the second phase of the construction only one path per vector \(v_i\) were to be added to the states \(w_i\), then one of the inputs abbcc, accdd or addbb would be rejected even though they are in the language.

Corollary 3

.

Proof

As it was shown in [4], \({\curvearrowright }\)NFA is the class of permutation closed semilinear languages, that is, the class of languages which are preimages of semilinear sets under the Parikh mapping.   \(\square \)

Theorem 3

For any DFA \(M=(Q,\varSigma ,\delta ,s,F)\), we can construct an NFA \(M'\) such that \(\varSigma ^*\setminus L(M,\circlearrowright )=L(M',\circlearrowright )\).

Proof

As we mentioned before, \(\varSigma _q\) denotes the set of letters for which there is an outgoing transition from \(q\in Q\). We construct the NFA \(M'=(Q', \varSigma , \delta ',\{s,s'\}, F')\) by the following steps:

  1. 1.

    switch the accepting states and non-accepting states of M;

  2. 2.

    \(\forall q\in Q\setminus F\): if \(\varSigma _q\ne \varSigma \), then add new state \(q'\in F'\), and the transitions:

    • \(\delta '(p,a)=q'\) for all \(p\in Q, a\in \varSigma \) such that \(q\in \delta (p,a)\)

    • \(\delta '(q',b)=q'\), for each \(b\in \varSigma \setminus \varSigma _q\);

  3. 3.

    \(\forall q\in F\): if \(\varSigma _q\ne \varSigma \), then add new states \(q'\in F'\) and \(q''\notin F'\), and the transitions:

    • \(\delta '(p,a)=q''\) for all \(p\in Q, a\in \varSigma \) such that \(q\in \delta (p,a)\)

    • \(\delta '(q'',b)=q'\) and \(\delta '(q',b)=q'\), for each \(b\in \varSigma \setminus \varSigma _q\);

  4. 4.

    if \(\varSigma _s\ne \varSigma \) then if \(s\in F\), let \(s''\) be a new initial state, whereas if \(s\notin F\), let \(s'\) be a new initial state.

For each \(w\in \varSigma ^*\), the machine \(M'\) will read all of the input along some nondeterministically chosen path. Some path will finish in an accepting state \(q\in F'\cap Q\) if and only if \((M,\circlearrowright )\) read the whole input and stopped in a non-accepting state. Some path will finish in an accepting state \(q'\in F'\setminus Q\) if and only if \((M,\circlearrowright )\) could not read the input and got stuck in state q, because the remaining letters were all elements of \(\varSigma _q\). In more detail, for each input w one of the following happens:

  • \(w\in L(M,\circlearrowright )\): this means M reads all the input and finishes in a state from \(q\in F\). The only branches of \(M'\) which read the whole input finish in q or \(q''\) (if it exists), but since \(q, q''\notin F'\), the input is rejected by \(M'\);

  • \(w\notin L(M,\circlearrowright )\) and M reads the whole input finishing in some state \(q\notin F\). In this case, the same path in \(M'\) reads the whole input and finishes in \(q\in F'\), so \(M'\) accepts the input;

  • \(w\notin L(M, \circlearrowright )\) and M cannot read the whole input, so M gets stuck in a state q, with the remaining input being in \((\varSigma \setminus \varSigma _q)^*\). Depending on whether q is final or not, we distinguish two subcases:

    • \(q\in F\): from the states preceding q in the path, the branch that goes to q gets stuck. The branch that goes to \(q''\) reads the remaining input after transitioning to \(q'\in F'\), so \(M'\) accepts;

    • \(q\notin F\): from the states preceding q in the path, the branch that goes to q gets stuck. The branch that goes to \(q'\in F'\) reads the remaining input, so \(M'\) accepts.    \(\square \)

Example 3

To illustrate how the construction works (Fig. 5), consider the non-regular \({\circlearrowright }\)DFA language \(L_{ab}=\{w\in \{a,b\}^* \mid |w|_a = |w|_b \}\), which can be accepted by the \({\circlearrowright }\)DFA \(M=(\{s,q\}, \{a,b\}, \delta , s, \{s\})\), where \(\delta (s,a)=q\) and \(\delta (q,b)=s\).

Fig. 5.
figure 5

Left: \({\circlearrowright }\)DFA for \(L_{ab} = \{w\in \{a,b\}^* \mid |w|_a = |w|_b \}\). Right: \({\circlearrowright }\)NFA for \(\varSigma ^*\setminus L_{ab}\).

4 One-Way Jumping Pushdown Automaton: (\(\circlearrowright \))-NPDA, (\(\circlearrowright \))-DPDA

In this section we initiate the study of how the \(\circlearrowright \) tape head mode affects the computational power of PDA. In particular, we exhibit certain languages which provide separation between the classes of languages accepted by DPDA and NPDA in the \(\circlearrowright \) mode and their classical counterparts, as well as, finite automata in \(\circlearrowright \) mode. We also present languages which show that several common closure properties do not apply for \({\circlearrowright }\)NPDA. The proofs for the results that follow are based on extended versions of pumping lemmas for deterministic context-free languages ([16]) and context-free languages (Bar-Hillel).

First, we state two extended pumping lemmas for \(\circlearrowright \) mode. The proofs of these lemmas are trivial based on Remark 2.

Corollary 4

(Bar-Hillel lemma for \(\circlearrowright \)-NPDA). For any language L accepted by a \({\circlearrowright }\)NPDA there exists a constant n, such that for every string \(w \in L\) with \(|w| > n\), there exists a permutation \(w_{\sigma }\), which can be written as \(w_{\sigma } = uvxyz\), satisfying (1) \(|vy| \ge 1\), (2) \(|vxy| \le n\) and (3) \(uv^{i}xy^{i}z \in L\) for every \(i \ge 0\).

Corollary 5

(Pumping Lemma for \({\circlearrowright }\)DPDA, original version in [16]). Suppose L is accepted by a \({\circlearrowright }\)DPDA M. Then there exists a constant n for L such that for any pair of words \(w,w' \in L\) if

  1. (1)

    \(s=xy\) and \(s'=xz\), \(|x|>n\), and

  2. (2)

    (first symbol of y) = (first symbol of z),

where s and \(s'\) are permutations of w and \(w'\), such that \(s,s'\in L(M,\rightarrow )\), then either (3) or (4) holds:

  1. (3)

    there is a factorization \(x=x_{1}x_{2}x_{3}x_{4}x_{5}, |x_{2}x_{4}| \ge 1\) and \(|x_{2}x_{3}x_{4}| \le n\), such that for all \(i \ge 0, x_{1}{x_{2}}^{i}x_{3}{x_{4}}^{i}x_{5}y\) and \(x_{1}{x_{2}}^{i}x_{3}{x_{4}}^{i}x_{5}z\) are in L;

  2. (4)

    there exist factorizations \(x=x_{1}x_{2}x_{3}, y=y_{1}y_{2}y_{3}\) and \(z=z_{1}z_{2}z_{3}, |x_{2}| \ge 1\) and \(|x_{2}x_{3}| \le n, \) such that for all \(i \ge 0, x_{1}{x_{2}}^{i}x_{3}y_{1}{y_{2}}^{i}y_{3}\) and \(x_{1}{x_{2}}^{i}x_{3}z_{1}{z_{2}}^{i}z_{3}\) are in L.

Let \(L_{\mathrm {ppal}} = \{ w\#\phi (w) \mid w \in \{ a,b \}^*, \phi \in S_{ |w| } \}\). Note that , by a simple application of the Bar-Hillel lemma. By a construction mimicking the DPDA accepting palindromes \(w\#w^R\), it is easy to show that .

The following propositions provide us with the separation results which, added to the previously known relationships, add up to Fig. 6.

Fig. 6.
figure 6

Relationship between REG\(=\rightarrow \)DFA\(=\rightarrow \)NFA, CF\(=\rightarrow \)NPDA, and one-way jumping classes.

Proposition 4

\(L_{\mathrm {ppal}}\,{\notin }{{\circlearrowright }{} \mathbf{DFA}}\).

Proof

Suppose that \(M=(Q,\varSigma ,\delta ,q_0,F)\) is a DFA, such that \(L(M,\circlearrowright ) = L_{\mathrm {ppal}}\). Consider the word \(w=a^{p} \# a^{p} \in L_{\mathrm {ppal}}\), with \(p=|Q|+1\). By Proposition 1 there exists a permutation P such that \(P(w) \in L(M,\rightarrow )\subseteq L_{\mathrm {ppal}}\) and the pumping lemma for regular languages says that it can be written as \(P(w) = xyz\), where \(y \ne \epsilon ,|xy| \le |Q|\) and \(xy^{i}z \in L_{\mathrm {ppal}},\forall i \ge 0\). Since \(L_{\mathrm {ppal}}\cap (a+\#)^*\) has only one word of each length, we have \(P(w)=a^p\#a^p\). From \(|xy| \le |Q|\), we get a contradiction when \(i=2\) as \(xy^{i}z \notin L_{\mathrm {ppal}}\).   \(\square \)

Proposition 5

.

Proof

Assume there exists DPDA M = \((Q,\varSigma ,\varGamma ,\delta ,q_0,F)\) such that \(L(M,\circlearrowright )=L_{\mathrm {ambig}}\) and let C be the constant for \(L_{\mathrm {ambig}}\) in Corollary 5.

Choose \(w=a^{n}b^{n}\) and \(w'=a^{n}b^{2n}\) for some integer \(n > C\), then there exist permutations \(\sigma \) and \({\sigma }'\) such that \(w_{\sigma }, w'_{{\sigma }'}\in L(M,\rightarrow )\). By \(w_{\sigma },w'_{{\sigma }'} \in L(M,\rightarrow ) \subseteq L(M,\circlearrowright )=L_{\mathrm {ambig}}\), we get that \(w_{\sigma }=a^{n}b^{n}\) and \(w'_{{\sigma }'}=a^{n}b^{2n}\). Let \(x=a^{n}b^{n-1}\), \(y=b\), and \(z=b^{2n-1}\). The choice of \(w_{\sigma }=xy\) and \(w'_{{\sigma }'}=xz\) satisfies (1) and (2) of Corollary 5. According to Corollary 5, either (3) or (4) should hold.

Let us consider (3) first. The only possible factorization \(x=x_1x_2x_3x_4x_5\) such that \(|x_2x_4|>0\) and for all i, \(x_1{x_2}^{i}x_3{x_4}^{i}x_5y \in L_{\mathrm {ambig}}\) must satisfy the condition \(x_2=a^k\) and \(x_4=b^k\) for some \(k>0\). But then \(x_1{x_2}^{0}x_3{x_4}^{0}x_5z = x_1x_3x_5z = a^{n-k}b^{2n-k} \notin L_{\mathrm {ambig}}\). Therefore (3) does not hold.

Now, we consider (4). Any factorization \(x=x_1x_2x_3\) such that \(|x_2| > 0\) and \(|x_2x_3| \le C < n\) will result in \(x_2 \in b^+\) and \(y_2 \in b^{*}\), so \(x_1x_3y_1y_3=a^nb^{n-|x_2|-|y_2|} \notin L_{\mathrm {ambig}}\). So (4) does not hold either.

This contradicts the \(\circlearrowright \)-DPDA pumping lemma, so \(L_{\mathrm {ambig}}\,{\notin }{{\circlearrowright }{} \mathbf{DPDA}}\).   \(\square \)

The language \(L_{\mathrm {ambig}}\) is a classic example of nondeterministic context-free language. At the same time, as mentioned in [4],

figure a

It is also straightforward that

figure b

and \(L_{\mathrm {lin}}\notin \) \({\circlearrowright }\)DFA, by [4, Cor.12]. Since \(\rightarrow \) NPDA is trivially included in \({\circlearrowright }\)NPDA, by the same arguments as above, the union \(L_{cd}\cup L_{\mathrm {ambig}}=L_{cd+\mathrm {ambig}}\), where \(L_{cd} = \{ w\in \{c,d\}^* \mid |w|_c=|w|_d \}\), is in \({\circlearrowright }\)NPDA. At the same time, it is easy to see that . This completes the separation examples for Fig. 6. Finally, let us say a few words about closure properties of \({\circlearrowright }\)NPDA.

Proposition 6

.

Proof

Suppose that M = \((Q,\{a,b,c\},\varGamma ,\delta ,q_0,F)\) and \(L(M, \circlearrowright ) = L_{3}\). Consider the word \(w = a^{n}b^{n}c^{n} \in L_1\). By Corollary 4 there exists a permutation \(\sigma \) such that \(w_{\sigma } \in L_1\) can be written as \(w_{\sigma } = uvxyz\), where \(|vy| \ge 1\), \(|vxy| \le n\), \(uv^{i}xy^{i}z \in L_1\), for all \(i \ge 0\), where n is the contant from the Bar-Hillel lemma for \({\circlearrowright }\)NPDA. Depending on the decomposition uvxyz, we have two cases

  1. 1.

    If vxy is generated by one symbol, then \(uv^{2}xy^{2}z\) does not include the same number of a, b, c. This contradicts \(uv^{i}xy^{i}z \in L_1\), for \(i=2\).

  2. 2.

    If vxy contains two kinds of symbols, then \(uv^{2}xy^{2}z\notin L_1\), because the number of copies of the third letter (the one not in vxy) does not match the other two. This contradicts \(uv^{i}xy^{i}z \in L_1\), when \(i =2\).

Therefore, no \({\circlearrowright }\)NPDA accepts \(L_1\).   \(\square \)

Since both and , from Proposition 6 we get that \({\circlearrowright }\)NPDA is not closed under intersection. Together with the fact that it is closed under union, we get that it is not closed under complementation, either.

Theorem 4

.

Proof

Suppose that there exist a \({\circlearrowright }\)NPDA \(M=(Q,\{ a,b,c \},\varGamma ,\delta ,q_0,F)\), which accepts \(L_2\). Consider the words \(w_1=a^mb^mc^ma \in L_2\) and \(w_2 = a^{m+1}b^mc^m \notin L_2\), where m is the constant from Corollary 4. By \(w_1=a^mb^mc^ma \in L_2\), there exists permutation P such that \(P(w_1)\in L(M,\rightarrow )\). Let \(P(w_1)=x_1x_2...x_{3m+1}\), then \(x_{3m+1}=a\) by \(P(w_1) \in L_2\). We will use the fact that \(w_2\) is the cyclic shift of \(w_1\). Let us look at the computation performed by the automaton on reading \(w_1\). Before reading b or c, the automaton will read \(a^k\) for some \(k \ge 0\), that is:

$$\begin{aligned} (q_0,a^mb^mc^ma,\$) \circlearrowright (q_1,a^{m-1}b^mc^ma,z_1) \circlearrowright .... \circlearrowright (q_k,a^{m-k}b^mc^ma,z_k) \end{aligned}$$

for some \(\{q_1,q_2...q_k\} \subset Q\) and \(\{z_1,z_2....z_k\} \subset \varGamma ^*\) ($ is the bottom marker for the pushdown). Since \(w \in L\), there is an accepting computation from the last configuration \( (q_k,a^{m-k}b^mc^ma,z_k)\) in that sequence. Depending on k, we have two cases.

(CASE 1) if \(k<m\): we get that there is no transition defined from state \(q_k\) on reading a, therefore \(a^{m-k},a^{m-k+1} \in (\varSigma -\varSigma _{(q_k,z_k)})^{*}\), where \(\varSigma _{(q_k,z_k)}=\{d \in \varSigma \mid \delta (q_k,d,z_k) \ne \emptyset \}\). We get that

$$(q_k,uv,z_k)=(q_k,a^{m-k+1}b^{m}c^{m},z_k) \circlearrowright ^* (q,\epsilon ,z)$$

if and only if

$$(q_k,vu,z_k)=(q_k,b^{m}c^{m}a^{m-K+1},z_k) \circlearrowright ^* (q,\epsilon ,z).$$

This means \((q_0,a^{m+1}b^mc^m,\$) \circlearrowright ^* (q_k,a^{m-k+1}b^mc^m,z_k) \circlearrowright ^* (q,\epsilon ,z)\), which contradicts the initial assumption \(a^{m+1}b^{m}c^{m} \notin L_2\).

(CASE 2) if \(k=m\): This case says that \((q_0,a^mb^mc^ma,\$) \circlearrowright ^k (q_m,b^mc^ma,z_m)\).

Let \({\circlearrowright }\)NPDA M then read b. Let us look at the computation performed by the automaton on reading \(b^mc^ma\). Before reading c, the automaton will read \(b^l\) for some \(l \ge 0\), that is:

$$\begin{aligned} (q_m,b^mc^ma,z_m) \circlearrowright (q_{m+1},b^{m-1}c^ma,z_{m+1}) \circlearrowright .... \circlearrowright (q_{m+l},b^{m-l}c^ma,z_{m+l}) \end{aligned}$$

for some \(\{q_{m+1},q_{m+2}...q_{m+l}\} \subset Q\) and \(\{z_{m+1},z_{m+2}....z_{m+k}\} \subset \varGamma ^*\). There is an accepting computation from the last configuration \( (q_{m+l},b^{m-l}c^ma,z_{m+l})\) in that sequence. Depending on l, we have two cases.

(CASE 2-1) if \(l<m\): This case is the same as (CASE 1).

(CASE 2-2) if \(l=m\): This case is the same as (CASE 2). \({\circlearrowright }\)NPDA M must read c. Let us look at the computation performed by the automaton on reading \(c^ma\). Before reading the last letter, a, the automaton will read \(c^n\) for some \(n \ge 0\):

$$\begin{aligned} (q_{2m},c^ma,z_{2m}) \circlearrowright ^{n} (q_{2m+n},c^{m-n}a,z_{2m+n}) \end{aligned}$$

for some \(q_{2m+n} \in Q\) and \(z_{2m+n} \in \varGamma ^*\). There is an accepting computation from the last configuration \((q_{2m+n},c^{m-n}a,z_{2m+n})\) in that sequence. Depending on n, we have two cases.

(CASE 2-2-1) if \(n<m\): This case is the same as (CASE 1).

(CASE 2-2-2) if \(n=m\): This case says that

$$(q_{2m},c^ma,z_{2m}) \circlearrowright ^{m} (q_{3m},a,z_{3m}) \circlearrowright (q_{3m+1},\epsilon ,z_{3m+1})$$

for \(\{q_{3m},q_{3m+1}\} \subseteq Q\) and \(\{z_{3m},z_{3m+1}\} \subseteq \varGamma ^*\). In this case, \(P(w_1)=a^mb^mc^ma\in L(M,\rightarrow )\), so we can write \(a^mb^mc^ma=uvxyz\), where \(|vy| \ge 1\), \(|vxy| \le m\), \(uv^{i}xy^{i}z \in L_2\), for all \(i \ge 0\). The same argument as in the proof of Proposition 6 can be applied to reach a contradiction.

Therefore, no \({\circlearrowright }\)NPDA accepts \(L_2\).    \(\square \)

Since and , from Theorem 4 we get that the class \({\circlearrowright }\)NPDA is not closed under concatenation. However, the reversal of \(L_2\), that is, \(\{aw \mid |w|_{a}=|w|_{b}=|w|_{c}\}\), is in \({\circlearrowright }\)NPDA, in fact, it is in \({\circlearrowright }\)DFA, so we get that \({\circlearrowright }\)NPDA is not closed under reversal, either.

5 Summary

We discussed three modes of tape head, however, other established modes can be considered, e.g., a two-way jumping mode. In such a mode, the tape head does not erase the letters it read. The modes can be applied to linear bounded automata, but one can show that none of these modes, including the two-way jumping mode adds to the power of linear bounded automata. Therefore, the language accepted by linear bounded automata with the mentioned alternative modes of tape head coincides with the class of context sensitive languages. Several questions remain open with respect to the jumping modes, particularly so in the case of \(\circlearrowright \). An unanswered decidability problem, which so far resisted attempts, is whether there exists an algorithm which decides \(L(M,\circlearrowright )\in \)REG for a given DFA/NFA M. Here, the technique of completing \({\circlearrowright }\)DFA  presented in Sect. 3 might help, but the problem seems rather difficult, because of the unusual languages which these classes of machines can accept.