1 Introduction

Limit-deterministic Büchi automata (LDBA, also known as semi-deterministic Büchi automata) were introduced by Courcoubetis and Yannakakis (based on previous work by Vardi) to solve the qualitative probabilistic model-checking problem: Decide if the executions of a Markov chain or Markov decision process satisfy a given LTL formula with probability 1 [4, 39, 40]. The problem faced by these authors was that fully nondeterministic Büchi automata (NBAs), which can capture all LTL-recognisable languages, cannot be used for probabilistic model checking, and deterministic Büchi automata (DBA), which could be used for probabilistic model checking, cannot capture all LTL-recognisable languages. The solution was to introduce LDBAs as a model in-between: as expressive as NBAs, but deterministic enough.

After these papers, LDBAs received little attention. The alternative path of translating the LTL formula into an equivalent fully deterministic Rabin automaton using Safra’s construction [32] was considered a better option, mostly because it also solves the quantitative probabilistic model-checking problem (computing the probability of the executions that satisfy a formula). However, recent papers have shown that LDBAs were unjustly forgotten. Blahoudek et al. have shown that LDBAs are easy to complement [2]. Kini and Viswanathan have given a single exponential translation of LTL\(_{\setminus {\mathbf {G}}{\mathbf {U}}}\) to LDBA [18]. Finally, Sickert et al. describe in [9, 34, 36] two double exponential translations for full LTL that can also be applied to the quantitative case, and tend to behave better than Safra’s construction in practice.

In this paper, we add to this trend by showing that LDBAs are also attractive for synthesis. The classical approach to the synthesis problem with LTL objectives involves a translation of NBAs to DPAs with the help of the Safra-Piterman construction [30] or other recent determinisation constructions, such as [13, 17, 25]. Although limit-determinism by itself is not ‘deterministic enough’ for the synthesis problem, we introduce a conceptually simple and worst-case optimal translation LDBA\(\rightarrow \)DPA.

The presented translation bears some similarities with that of [12] where, however, a Muller acceptance condition is used. This condition can also be phrased as a Rabin condition, but not as a parity condition. Moreover, the way of tracking all possible states and finite runs differs. Furthermore, readers familiar with [13, 17] might notice that our construction tries to identify an (infinite) left-path, which is by definition accepting, in the reduced split-tree. If we restrict ourselves to LDBAs, identifying such paths becomes considerably simpler compared to the cited approaches. Hence our approach uses similar ideas, but is streamlined and simpler.

Together with the translation LTL\(\rightarrow \)LDBA of [9, 34, 36], our construction provides a ‘Safraless’ procedure to obtain a DPA from an LTL formula. However, the direct concatenation of the two constructions does not yield an algorithm of optimal complexity: the LTL \(\rightarrow \) LDBA translation is double exponential (and there is a double exponential lower bound), and so for the LTL\(\rightarrow \)DPA translation we only obtain a triple exponential bound. We solve this problem by showing that these LDBAs derived from LTL formulas possess semantic state annotations that can be used to reduce the amount of tracked information in the constructed DPA. We then prove that in this setting the concatenation of the two constructions remains double exponential.

With the availability of efficient translations from LTL formulas into DPAs, several tools emerged following the classical approach to synthesis with LTL objectives. First, there is ltlsynt, which is part of Spot [5] and uses an NBA\(\rightarrow \)DPA translation. Second, there is Strix [27, 28], which relies on the translation presented in this paper and which recently won all LTL tracks of the synthesis competition SyntComp [16]. Moreover, the preserved semantic labelling of the states of the automata allows not only for heuristics guiding the exploration of the on-the-fly generated automaton [27], but also for an efficient deployment of learning-based algorithms and lifelong-learning paradigms in LTL synthesis [19]. Such efforts have a great impact on the practical performance of solutions to this 2-EXPTIME-complete problem. For the exact implementation details of Strix we refer the reader to [27].

In the third and final part, we report on an experimental evaluation of our LTL\(\rightarrow \)LDBA\(\rightarrow \)DPA construction, and compare it with other constructions that translate LTL to DPAs.

Structure of the Paper. Section 2 introduces the necessary preliminaries about automata. Section 3 defines the translation LDBA\(\rightarrow \)DPA. Section 4 shows how to compose this translation with a translation from LTL to LDBAs in such a way that the resulting DPA is at most doubly exponential in the size of the LTL formula. Section 5 reports on the experimental evaluation of this worst-case optimal translation, and Sect. 6 contains our conclusions.

Editorial Note. This is an extended journal version of our previously published conference paper [8], including full proofs, more examples, and an extensive evaluation on classical and new, parametrised benchmarks.

2 Preliminaries

Büchi automata. A (nondeterministic) word automaton A with Büchi acceptance condition (NBA) is a tuple \((Q,q_0,\Sigma ,\delta ,\alpha )\) where Q is a finite set of states, \(q_0 \in Q\) is the initial state, \(\Sigma \) is a finite alphabet, \(\delta \subseteq Q \times \Sigma \times Q\) is the transition relation, and \(\alpha \subseteq \delta \) is the set of accepting transitions. A is deterministic if for all \(q \in Q\) and all \(\sigma \in \Sigma \) there exists at most one \(q' \in Q\) such that \((q,\sigma ,q') \in \delta \). Given \(S \subseteq Q\) and \(\sigma \in \Sigma \), let \(\mathsf{post}^{\sigma }_{\delta }(S)=\{ q' \mid \exists q \in S \cdot (q,\sigma ,q') \in \delta \}\). Further, we use \(q \rightarrow ^\sigma p\) as a shorthand for \((q, \sigma , p) \in \delta \) if \(\delta \) is clear from the context.
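For concreteness, the following Python sketch (our own illustration, not part of the formal development; the names NBA and post are assumptions of the sketch) shows one way to represent such an automaton with transition-based acceptance, together with the post operator used throughout the paper.

from dataclasses import dataclass, field

@dataclass
class NBA:
    states: set       # Q
    initial: object   # q_0
    alphabet: set     # Sigma
    delta: set        # transition relation: set of triples (q, sigma, q2)
    alpha: set = field(default_factory=set)  # accepting transitions, a subset of delta

    def post(self, S, sigma):
        """post^sigma_delta(S): states reachable from S by one sigma-transition."""
        return {q2 for (q, s, q2) in self.delta if q in S and s == sigma}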

A run of A on an \(\omega \)-word \(w : {\mathbb {N}}\rightarrow \Sigma \) is an \(\omega \)-sequence of states \(\rho : {\mathbb {N}}\rightarrow Q\) such that \(\rho (0)=q_0\) and for all positions \(i \in {\mathbb {N}}\), we have that \((\rho (i),w(i),\rho (i+1)) \in \delta \). A run \(\rho \) is accepting if there are infinitely many positions \(i \in {\mathbb {N}}\) such that \((\rho (i),w(i),\rho (i+1)) \in \alpha \). The language defined by A, denoted by \(\mathsf {L}(A)\), is the set of \(\omega \)-words w for which A has an accepting run.

A limit-deterministic Büchi automaton (LDBA) is a Büchi automaton \(A=(Q,q_0,\Sigma ,\delta ,\alpha )\) such that there exists a subset \(Q_d \subseteq Q\) satisfying the following three properties:

  1. \(\alpha \subseteq Q_d \times \Sigma \times Q_d\), i.e. all accepting transitions are transitions within \(Q_d\);

  2. \(\forall q \in Q_d \cdot \forall \sigma \in \Sigma \cdot \forall q_1,q_2 \in Q \cdot (q,\sigma ,q_1) \in \delta \wedge (q,\sigma ,q_2) \in \delta \rightarrow q_1=q_2\), i.e. the transition relation \(\delta \) is deterministic within \(Q_d\);

  3. \(\forall q \in Q_d \cdot \forall \sigma \in \Sigma \cdot \forall q' \in Q \cdot (q,\sigma ,q') \in \delta \rightarrow q' \in Q_d\), i.e. \(Q_d\) is a trap (when \(Q_d\) is entered it is never left).

Without loss of generality, we assume that \(q_0 \in Q \setminus Q_d\), and we denote \(Q \setminus Q_d\) by \(\overline{Q_d}\). Courcoubetis and Yannakakis show that for every \(\omega \)-regular language \({{\mathcal {L}}}\), there exists an LDBA A such that \(\mathsf {L}(A)={{\mathcal {L}}}\) [4]. That is, LDBAs are as expressive as NBAs. An example of an LDBA is given in Fig. 1. Note that the language accepted by this LDBA cannot be recognised by a deterministic Büchi automaton.
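The three defining properties can be checked directly on the transition relation. The following sketch (assuming the NBA representation above; is_limit_deterministic is a hypothetical helper) illustrates this.

def is_limit_deterministic(A, Q_d):
    # 1. all accepting transitions are transitions within Q_d
    if any(q not in Q_d or q2 not in Q_d for (q, s, q2) in A.alpha):
        return False
    # 2. the transition relation is deterministic within Q_d
    for q in Q_d:
        for sigma in A.alphabet:
            if len(A.post({q}, sigma)) > 1:
                return False
    # 3. Q_d is a trap: no transition leaves Q_d
    for (q, s, q2) in A.delta:
        if q in Q_d and q2 not in Q_d:
            return False
    return True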

Fig. 1 An LDBA A for the LTL language \({\mathbf {F}}{\mathbf {G}}a \vee {\mathbf {F}}{\mathbf {G}}b\). The behaviour of A is deterministic within the subset of states \(Q_d=\{2,3,4\}\), which is a trap; the accepting transitions are depicted in bold face and are defined only between states of \(Q_d\). We simplify the figure by using the alphabet \(\{a, b\}\) instead of \(2^{Ap}\)

Parity automata. A deterministic word automaton A with parity acceptance condition (DPA) is a tuple \((Q,q_0,\Sigma ,\delta ,p)\), defined as for deterministic Büchi automata with the exception of the acceptance condition p, which is now a function assigning an integer in \(\{1, 2, \dots , d\}\), called a colour, to each transition in the automaton. Colours are naturally ordered by the order on integers.

Given a run \(\rho \) over a word w, the infinite sequence of colours traversed by the run \(\rho \) is denoted \(p(\rho )\) and is equal to \(p(\rho (0),w(0),\rho (1)) \dots p(\rho (n),w(n),\rho (n+1)) \dots \). A run \(\rho \) is accepting if the minimal colour that appears infinitely often along \(p(\rho )\) is even. The language defined by A, denoted by \(\mathsf {L}(A)\), is the set of \(\omega \)-words w for which A has an accepting run.
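For an ultimately periodic run the acceptance test reduces to a check on the cycle: the minimal colour occurring infinitely often is simply the minimal colour on the cycle. A minimal sketch (our own illustration; the colour sequence is assumed to be given as a finite prefix followed by a cycle that repeats forever):

def parity_accepting(prefix_colours, cycle_colours):
    # only the colours on the cycle occur infinitely often
    assert cycle_colours, "the cycle must be non-empty"
    return min(cycle_colours) % 2 == 0

# Example: the colour sequence 7 7 (4 3)^omega has summary 3 and is rejecting.
assert parity_accepting([7, 7], [4, 3]) is False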

While deterministic Büchi automata are not expressively complete for the class of \(\omega \)-regular languages, DPAs are complete for \(\omega \)-regular languages: for every \(\omega \)-regular language \({{\mathcal {L}}}\) there exists a DPA A such that \(\mathsf {L}(A)={{\mathcal {L}}}\), see e.g. [30].

Linear Temporal Logic. We introduce linear temporal logic (LTL), as most authors do, with the following reduced syntax:

$$\begin{aligned} \varphi {:=}\; {\mathbf {tt}}\mid a \mid \lnot \varphi \mid \varphi \wedge \varphi \mid {\mathbf {X}}\varphi \mid \; \varphi {\mathbf {U}}\varphi \quad \text { with } a \in Ap \end{aligned}$$

Let w be a word over the alphabet \(2^{Ap}\) and let \(\varphi \) be a formula. Let \(w_i = w(i) w(i+1) \dots \) denote the suffix of w at position i. The satisfaction relation \(w \models \varphi \) is inductively defined as follows:

$$\begin{aligned}{}\begin{array}{lcl} w \models {\mathbf {tt}}\\ w \models a &{} \text{ iff } &{} a \in w(0) \\ w \models \lnot \varphi &{} \text{ iff } &{} w \not \models \varphi \\ w \models {\mathbf {X}}\varphi &{} \text{ iff } &{} w_1 \models \varphi \\ w \models \varphi {\mathbf {U}}\psi &{} \text{ iff } &{} \exists k \cdot \, w_k \models \psi \text { and } \forall j < k \cdot \, w_j \models \varphi \end{array} \end{aligned}$$

We denote by \(\mathsf {L}(\varphi ) {:=}\{ w \in (2^{Ap})^\omega \mid w \models \varphi \}\) the language of \(\varphi \). Left-out, but often used LTL operators are then added as abbreviations: \({\mathbf {F}}\varphi {:=}{\mathbf {tt}}\,{\mathbf {U}}\varphi \) (eventually) and \({\mathbf {G}}\varphi {:=}\lnot {\mathbf {F}}\lnot \varphi \) (always).
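The reduced syntax and the two abbreviations are easily mirrored in code. The following small sketch (our own helper types, not the paper's tooling) represents formulas as a Python AST and defines F and G exactly as above.

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LTL:
    op: str                          # one of 'tt', 'ap', 'not', 'and', 'X', 'U'
    left: Optional['LTL'] = None
    right: Optional['LTL'] = None
    ap: Optional[str] = None         # atomic proposition name, if op == 'ap'

def ap(a):     return LTL('ap', ap=a)
def Not(phi):  return LTL('not', left=phi)
def And(p, q): return LTL('and', left=p, right=q)
def X(phi):    return LTL('X', left=phi)
def U(p, q):   return LTL('U', left=p, right=q)
def F(phi):    return U(LTL('tt'), phi)    # F phi := tt U phi
def G(phi):    return Not(F(Not(phi)))     # G phi := not F not phi

# The language of Fig. 1, FG a or FG b, written with the reduced syntax only:
fg_a_or_fg_b = Not(And(Not(F(G(ap('a')))), Not(F(G(ap('b'))))))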

3 From LDBA to DPA

3.1 Run DAGs and their colouring

Run DAG. A nondeterministic automaton A may have several (even an infinite number of) runs on a given \(\omega \)-word w. As in [23], we represent this set of runs by means of a directed acyclic graph structure called the run DAG of A on w. Given an LDBA \(A=(Q,Q_d,q_0,\Sigma ,\delta ,\alpha )\), this graph \(G_w=(V,E)\) has a set of vertices \(V \subseteq Q \times {\mathbb {N}}\) and edges \(E \subseteq V \times V\) defined as follows:

  • \(V = \bigcup _{i \in {\mathbb {N}}} V_i\), where the sets \(V_i\) are defined inductively:

    • \(V_0=\{ (q_0,0) \}\), and for all \(i \ge 1\),

    • \(V_i = \{ (q,i) \mid \exists (q',i-1) \in V_{i-1} \cdot q' \rightarrow ^{w(i-1)} q \}\);

  • \(E = \{ ((q,i),(q',i+1)) \in V_i \times V_{i+1} \mid q \rightarrow ^{w(i)} q' \}\).

We denote by \(V^d_{i}\) the set \(V_i \cap (Q_d \times \{i\})\) that contains the subset of vertices of layer i that are associated with states in \(Q_d\).

Observe that all the infinite paths of \(G_w\) that start from \((q_0,0)\) are runs of A on w, and, conversely, each run \(\rho \) of A on w corresponds exactly to one path in \(G_w\) that starts from \((q_0,0)\). Accordingly, we also call the infinite paths in the run DAG \(G_w\) runs. In particular, we say that an infinite path \(v_0 v_1 \dots v_n \dots \) of \(G_w\) is an accepting run if there are infinitely many positions \(i \in {\mathbb {N}}\) such that \(v_i=(q,i)\), \(v_{i+1}=(q',i+1)\), and \((q,w(i),q') \in \alpha \). Clearly, w is accepted by A if and only if there is an accepting run in \(G_w\). We denote by \(\rho (0..n)=v_0 v_1 \dots v_n\) the prefix of length \(n+1\) of the run \(\rho \).
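Since every level is obtained from the previous one by the post operator, the first levels of the run DAG can be computed directly from a finite prefix of w. A sketch (reusing the NBA representation above; we drop the level index and represent each level as a set of states):

def run_dag_levels(A, word_prefix):
    levels = [{A.initial}]                        # V_0 contains only q_0
    for sigma in word_prefix:                     # V_i = post of V_{i-1} under w(i-1)
        levels.append(A.post(levels[-1], sigma))
    return levels

def level_in_Qd(level, Q_d):
    """The part V^d_i of a level that lies inside Q_d."""
    return {q for q in level if q in Q_d}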

Ordering of runs. A function \(\mathsf{Ord}: Q \rightarrow \{1,2,\dots ,|Q_d|,+\infty \}\) is called an ordering of the states of A w.r.t. \(Q_d\) if \(\mathsf{Ord}\) defines a strict total order on the states of \(Q_d\) and maps each state \(q \in \overline{Q_d}\) to \(+ \infty \), i.e.:

  • for all \(q \in \overline{Q_d}\), \(\mathsf{Ord}(q)=+\infty \),

  • for all \(q \in Q_d\), \(\mathsf{Ord}(q)\not =+\infty \), and

  • for all \(q,q' \in Q_d\), \(\mathsf{Ord}(q)=\mathsf{Ord}(q')\) implies \(q=q'\).

We extend \(\mathsf{Ord}\) to vertices in \(G_w\) as follows: \(\mathsf{Ord}((q,i))=\mathsf{Ord}(q)\).

Starting from \(\mathsf{Ord}\), we define the following pre-order on the set of run prefixes of the run DAG \(G_w\). Let \(\rho (0..n)=v_0 v_1 \dots v_n\) and \(\rho '(0..n)=v'_0 v'_1 \dots v'_n\) be two run prefixes of length \(n+1\). We write \(\rho (0..n) \sqsubseteq \rho '(0..n)\), and say that \(\rho (0..n)\) is smaller than \(\rho '(0..n)\), if:

  • for all i, \(0 \le i \le n\), \(\mathsf{Ord}(\rho (i))=\mathsf{Ord}(\rho '(i))\), or

  • there exists i, \(0 \le i \le n\), such that:

    • \(\mathsf{Ord}(\rho (i)) < \mathsf{Ord}(\rho '(i))\), and

    • for all j, \(0 \le j < i\), \(\mathsf{Ord}(\rho (j))=\mathsf{Ord}(\rho '(j))\).

This is extended to (infinite) runs as follows: \(\rho \sqsubseteq \rho '\) iff for all \(i \ge 0\), \(\rho (0..i) \sqsubseteq \rho '(0..i)\).

Remark 1

If A accepts a word w, then A has a \(\sqsubseteq \)-smallest accepting run for w.

We use the \(\sqsubseteq \)-relation on run prefixes to order the vertices of \(V_i\) that belong to \(Q_d\): for two different vertices \(v=(q,i) \in V_i\) and \(v'=(q',i) \in V_i\), v is \(\sqsubset _i\)-smaller than \(v'\) if there is a run prefix of \(G_w\) that ends up in v and is \(\sqsubseteq \)-smaller than all the run prefixes that end up in \(v'\). This induces a total order among the vertices of \(V^d_i\), because the states in \(Q_d\) are totally ordered by the function \(\mathsf{Ord}\).

Lemma 1

For all \(i \ge 0\) and all pairs of different vertices \(v=(q,i),v'=(q',i) \in V^d_i\), either \(v \sqsubset _i v'\) or \(v' \sqsubset _i v\), i.e., \(\sqsubset _i\) is a total order on \(V^d_i\).

Indexing vertices. The index of a vertex \(v=(q,i) \in V_i\) such that \(q \in Q_d\), denoted by \(\mathsf{Ind}_i(v)\), is a value in \(\{1,2,\dots ,|Q_d|\}\) that denotes its order in \(V^d_i\) according to \(\sqsubset _i\) (the \(\sqsubset _i\)-smallest element has index 1). For \(i \ge 0\), we identify two important sets of vertices:

  • \(\mathsf{Dec}(V^d_{i})\) is the set of vertices \(v \in V^d_{i}\) such that

    • either there does not exist \(v' \in V^d_{i+1}: (v,v') \in E\), i.e. v has no successor in \(V^d_{i+1}\), meaning that the sequence of states monitored so far aborts and does not lead to an infinite run;

    • or there exists a vertex \(v' \in V^d_{i+1}\): \((v,v') \in E\) and \(\mathsf{Ind}_{i+1}(v') < \mathsf{Ind}_{i}(v)\), i.e. the set of vertices in \(V^d_{i}\) whose (unique) successor in \(V^d_{i+1}\) has a smaller index value.

  • \(\mathsf{Acc}(V^d_{i})\) is the set of vertices \(v=(q,i) \in V^d_{i}\) such that there exists \(v'=(q',i+1) \in V^d_{i+1}\): \((v,v') \in E\) and \((q,w(i),q') \in \alpha \), i.e. the set of vertices in \(V^d_{i}\) that are the source of an accepting transition on w(i).

Remark 2

Along an (infinite) run, the index of vertices can only decrease. As the function \(\mathsf{Ind}(\cdot )\) has a finite range, the index along a run eventually stabilises.

Assigning colours. The set of colours that are used for colouring the levels of the run DAG \(G_w\) is \(\{1, 2, \dots , 2\cdot |Q_d|+1\}\). We associate a colour with each transition from level i to level \(i+1\) according to the following set of cases:

  1. if \(\mathsf{Dec}(V^d_{i})=\emptyset \) and \(\mathsf{Acc}(V^d_i)\not =\emptyset \), the colour is \(2 \cdot \min _{v \in \mathsf{Acc}(V^d_{i})} \mathsf{Ind}_{i}(v)\).

  2. if \(\mathsf{Dec}(V^d_{i})\not =\emptyset \) and \(\mathsf{Acc}(V^d_i)=\emptyset \), the colour is \(2 \cdot \min _{v \in \mathsf{Dec}(V^d_{i})} \mathsf{Ind}_{i}(v)-1\).

  3. if \(\mathsf{Dec}(V^d_{i})\not =\emptyset \) and \(\mathsf{Acc}(V^d_i)\not =\emptyset \), the colour is the minimum of \(c_{\mathsf{odd}}=2 \cdot \min _{v \in \mathsf{Dec}(V^d_{i})} \mathsf{Ind}_{i}(v)-1\) and \(c_{\mathsf{even}}=2 \cdot \min _{v \in \mathsf{Acc}(V^d_{i})} \mathsf{Ind}_{i}(v)\).

  4. if \(\mathsf{Dec}(V^d_{i})=\mathsf{Acc}(V^d_i)=\emptyset \), the colour is \(2 \cdot |Q_d|+1\).

The intuition behind this colouring is as follows. The colouring tracks the (potentially infinite) runs in \(Q_d\), as \(\alpha \subseteq Q_d \times \Sigma \times Q_d\), and tries to produce an even colour that corresponds to the smallest index of an accepting run. If in level i the run DAG has an outgoing transition that is accepting, this is a positive event: the colour emitted is even and is a function of the smallest index of a vertex that is the source of an accepting transition from \(V_{i}\) to \(V_{i+1}\). Runs in \(Q_d\) are deterministic, but they can merge with smaller runs or they may abort. This is a negative event, because the even colours emitted so far by the run that merges or aborts should no longer be taken into account. As a consequence, an odd colour is emitted in order to cancel all the (good) even colours generated by that run; this odd colour is a function of the smallest index of a vertex in \(V_{i}\) whose run merges or aborts. These two situations are handled by cases 1 and 2 of the case distinction above. When both happen at the same time, the colour is the minimum of the two colours assigned to the positive and the negative event; this is case 3. Finally, when there is no accepting transition from \(V_{i}\) to \(V_{i+1}\) and no merge or abort, the largest odd colour is emitted, as indicated by case 4.

According to this intuition, we define the colour summary of the run DAG \(G_w\) as the minimal colour that appears infinitely often along the transitions between its levels. Because of the deterministic behaviour of the automaton in \(Q_d\), each run can merge at most \(| Q_d |-1\) times with a smaller one (the size of the range of the function \(\mathsf{Ind}(\cdot )\) minus one), and as a consequence of the above colouring we know that, on a word accepted by A, the smallest accepting run eventually generates infinitely many (good) even colours that are never trumped by smaller odd colours.
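The case distinction above translates directly into a small colour-assignment function. A sketch (our own illustration; dec_indices and acc_indices are the indices, within \(V^d_i\), of the vertices in \(\mathsf{Dec}(V^d_{i})\) and \(\mathsf{Acc}(V^d_{i})\), and n_d is \(|Q_d|\)):

def level_colour(dec_indices, acc_indices, n_d):
    c_odd = 2 * min(dec_indices) - 1 if dec_indices else None
    c_even = 2 * min(acc_indices) if acc_indices else None
    if c_odd is None and c_even is None:     # case 4: no accepting edge, no merge/abort
        return 2 * n_d + 1
    if c_odd is None:                        # case 1: only accepting edges
        return c_even
    if c_even is None:                       # case 2: only merges/aborts
        return c_odd
    return min(c_odd, c_even)                # case 3: both happen, take the minimum

# Matching Example 1 below (|Q_d| = 3): an accepting edge leaving the vertex of
# index 2 gives colour 4, a merge of that vertex gives colour 3, nothing gives 7.
assert level_colour([], [2], 3) == 4
assert level_colour([2], [], 3) == 3
assert level_colour([], [], 3) == 7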

Fig. 2 The run DAGs of the automaton of Fig. 1 on the word \(w=abb(ab)^{\omega }\), given on the left, and on the word \(w=aab^{\omega }\), given on the right, together with their colourings

Example 1

The left part of Fig. 2 depicts the run DAG of the limit-deterministic automaton of Fig. 1 on the word \(w=abb(ab)^{\omega }\). Each path in this graph represents a run of the automaton on this word. The colouring of the run DAG follows the colouring rules defined above. Between level 0 and level 1, the colour is equal to \(7= 2|Q_d| + 1\), as no accepting edge is taken from level 0 to level 1 and no run merges (within \(Q_d\)). The colour 7 is also emitted from level 1 to level 2 for the same reason. The colour 4 is emitted from level 2 to level 3 because the accepting edge (3, b, 3) is taken and the index of state 3 in level 2 is equal to 2 (state 4 has index 1 as it is the end point of the smallest run prefix within \(Q_d\)). The colour 3 is emitted from level 3 to level 4 because the run that goes from 3 to 4 merges with the smaller run that goes from 4 to 4. In order to cancel the even colours emitted by the run that goes from 3 to 4, colour 3 is emitted; it cancels the even colour 4 emitted before by this run. Afterwards, colour 3 is emitted forever. The colour summary is 3, showing that there is no accepting run in the run DAG.

The right part of Fig. 2 depicts the run DAG of the limit-deterministic automaton of Fig. 1 on the word \(w=aab^{\omega }\). The colouring of the run DAG follows the colouring rules defined above. Between levels 0 and 1, colour 7 is emitted because no accepting edge is crossed. To the next level, we see the accepting edge (2, a, 2) and colour \(2\cdot 1=2\) is emitted. Upon reading the first b, we see again 7, since no accepting edge is seen and no merging takes place. Afterwards, each b causes an accepting edge (3, b, 3) to be taken. While the smallest run, which visits 4 forever, is not accepting, the second smallest run, which visits 3 forever, is accepting. As 3 has index 2 in all the levels below level 3, the colour is forever equal to 4. The colour summary of the run DAG is thus equal to \(2\cdot 2=4\), which shows that the word \(w=aab^{\omega }\) is accepted by our limit-deterministic automaton of Fig. 1.

The following theorem tells us that the colour summary (the minimal colour that appears infinitely often) can be used to identify run DAGs that contain accepting runs.

Theorem 1

The colour summary of the run DAG \(G_w\) is even if and only if there is an accepting run in \(G_w\).

Proof

(\(\Rightarrow \)): Assume that the colour summary of \(G_w\) is even and equal to c. Then there exists a level \(i \ge 0\) such that the colour after level i is always larger than or equal to c, and infinitely many times equal to c. W.l.o.g. assume that in level i, there exists a vertex \(v=(q,i) \in \mathsf{Acc}(V^d_i)\) and \(c=2 \cdot \mathsf{Ind}(v)\). Take the smallest run prefix that ends up in v. This run prefix will never merge with a smaller run prefix, and all smaller run prefixes that are active in level i will not merge or abort, as otherwise there would exist a position \(j \ge i\) where the index of the run that passes through \((q,i)\) would decrease, contradicting the fact that for all \(j \ge i\), all the colours that are emitted are larger than or equal to c. Let us now consider the suffix of the run that passes through \(v=(q,i)\). As the even colour c is emitted infinitely many times after level i, we know that this run suffix crosses \(\alpha \) infinitely many times. So this run is accepting, and it is the smallest such run.

(\(\Leftarrow \)): (Step 1): Now, let us consider the other direction. Assume that there exists an accepting run of A on a word w. We first establish the existence of a run \(\rho \) which is accepting and for which there exists a position \(k \ge 0\) from which \(\rho \) does not merge with any smaller run, and all smaller runs are non-accepting. We identify \(\rho \) and k as follows. Among the accepting runs, we select one that enters the set of states \(Q_d\) first, say at level \(i \ge 0\). There can be several of them, but we take one that enters \(Q_d\) via a state q of minimal index for \(\mathsf{Ord}\). Let \(V_i^d\) be the active states at level i that are in \(Q_d\). The way we have chosen q makes sure that all the states in \(V^d_i\) with a smaller index than q are the origin of non-accepting runs, and clearly, as \(\rho \) is accepting, it cannot merge with one of those smaller runs. Now, some of those smaller runs may merge or abort in the future, and each time they merge or abort, the index of \(\rho \) decreases. But this can happen only a number of times bounded by \(|Q_d|\).

(Step 2): Let k be the position when the last merge or abort of a smaller run prefix happens.

(Step 3): Let us now show that the existence of \(\rho \) and this position k allow us to prove that the colour summary is even. After position k, only odd colours with values larger than or equal to \(2 \cdot \mathsf{Ind}(\rho (k))+1\) are emitted, because we know that neither \(\rho \) nor any smaller run merges or aborts in the future. Also, as \(\rho \) is accepting, there will be an infinite number of positions \(l \ge k\) where the even colour is equal to \(2 \cdot \mathsf{Ind}(\rho (k))\), and only finitely many positions after k may have an even colour which is less than this value, as all runs that are smaller than \(\rho \) are not accepting. So the colour summary is even and equal to \(2 \cdot \mathsf{Ind}(\rho (k))\). \(\square \)

3.2 Construction of the DPA

From an LDBA \(A=(Q,Q_d,q_0,\Sigma ,\delta ,\alpha )\) and an ordering function \(\mathsf{Ord}: Q \rightarrow \{1,2,\dots ,|Q_d|,+\infty \}\) compatible with \(Q_d\), we construct a deterministic parity automaton \(B=(Q^B,q_0^B,\Sigma ,\delta ^B,p)\) that, on a word w, constructs the levels of the run DAG \(G_w\) and the colouring of the previous section. Theorem 1 tells us that such an automaton accepts the same language as A.

First, we need some notation. Given a finite set S, we denote by \({{\mathcal {P}}}(S)\) the set of its subsets, and by \(\mathcal{OP}(S)\) the set of its totally ordered subsets. So if \((s,<) \in \mathcal{OP}(S)\) then \(s \subseteq S\) and \(\mathord {<} \subseteq s \times s\) is a total strict order on s. For \(e \in s\), we denote by \(\mathsf{Ind}_{(s,<)}(e)\) the position of \(e \in s\) among the elements in s for the total strict order <, with the convention that the index of the <-minimum element is equal to 1. The deterministic parity automaton \(B=(Q^B,q_0^B,\Sigma ,\delta ^B,p)\) is defined as follows.

States and initial state. The set of states is \(Q^B = {{\mathcal {P}}}(\overline{Q_d}) \times \mathcal{OP}(Q_d)\), i.e. a state of B is a pair \((s,(t,<))\) where s is a set of states outside \(Q_d\), and t is an ordered subset of \(Q_d\). The ordering reflects the relative index of each state within t. The initial state is \(q^B_0=(\{q_0\},(\{\},\{\}))\).

Transition function. Let \((s_1,(t_1,<_1))\) be a state in \(Q^B\), and \(\sigma \in \Sigma \), and let us assume that there is a state \(q \in s_1 \cup t_1\) and a state \(q' \in Q\) such that \((q,\sigma ,q')\in \delta \) (otherwise \(\delta ^B\) is not defined in \((s_1,(t_1,<_1))\) for \(\sigma \)). Then \(\delta ^B((s_1,(t_1,<_1)),\sigma )=(s_2,(t_2,<_2))\) where:

  • \(s_2 = \mathsf{post}^{\sigma }_{\delta }(s_1) \cap \overline{Q_d}\);

  • \(t_2 = \mathsf{post}^{\sigma }_{\delta }(s_1 \cup t_1) \cap Q_d\);

  • \(<_2\) is defined from \(<_1\) and \(\mathsf{Ord}\) as follows: \(\forall q_1,q_2 \in t_2\): \(q_1 <_2 q_2\) iff:

    1. either \(\lnot \exists q'_1 \in t_1:q_1=\delta (q'_1,\sigma )\), and \(\lnot \exists q'_2 \in t_1:q_2=\delta (q_2',\sigma )\), and \(\mathsf{Ord}(q_1) < \mathsf{Ord}(q_2)\), i.e. neither has a predecessor in \(Q_d\), in which case they are ordered using \(\mathsf{Ord}\);

    2. or \(\exists q_1' \in t_1: q_1=\delta (q_1',\sigma )\), and \(\lnot \exists q'_2 \in t_1:q_2=\delta (q_2',\sigma )\), i.e. \(q_1\) has a \(\sigma \)-predecessor in \(Q_d\) and \(q_2\) does not;

    3. or \(\exists q'_1 \in t_1:q_1=\delta (q'_1,\sigma )\), and \(\exists q'_2 \in t_1:q_2=\delta (q_2',\sigma )\), and \(\min _{<_1} \{ q'_1 \in t_1 \mid q_1=\delta (q'_1,\sigma )\}< \min _{<_1} \{ q'_2 \in t_1 \mid q_2=\delta (q'_2,\sigma ) \}\), i.e. both have a predecessor in \(Q_d\), and they are ordered according to the order of their minimal parents.

Colouring. To define the colouring of edges in the deterministic automaton, we need to identify the states \(q \in t_1\) in a transition \((s_1,(t_1,<_1)) {\mathop {\rightarrow }\limits ^{\sigma }} (s_2,(t_2,<_2))\) whose indices decrease when going from \(t_1\) to \(t_2\) or that abort because they have no \(\delta \)-successor for \(\sigma \). Those are defined as follows:

$$\begin{aligned} \mathsf{Dec}(t_1)= \left\{ q_1 \in t_1 \mid \begin{array}{l} \mathsf{Ind}_{(t_2,<_2)}(\delta (q_1,\sigma ))< \mathsf{Ind}_{(t_1,<_1)}(q_1) \\ \vee \lnot \exists (q_1,\sigma ,q) \in \delta \end{array} \right\} . \end{aligned}$$

Additionally, let \(\mathsf{Acc}(t_1)=\{ q \in t_1 \mid \exists q' \in t_2 : (q,\sigma ,q') \in \alpha \}\) denote the subset of states in \(t_1\) that are the source of an accepting transition.

We assign a colour to each transition \((s_1,(t_1,<_1)) \rightarrow ^{\sigma } (s_2,(t_2,<_2))\) as follows (a code sketch of the resulting step function is given after this case distinction):

  1. if \(\mathsf{Dec}(t_1)=\emptyset \) and \(\mathsf{Acc}(t_1)\not =\emptyset \), the colour is \(2 \cdot \min _{q \in \mathsf{Acc}(t_1)} \mathsf{Ind}_{(t_1,<_1)}(q)\).

  2. if \(\mathsf{Dec}(t_1)\not =\emptyset \) and \(\mathsf{Acc}(t_1)=\emptyset \), the colour is \(2 \cdot \min _{q \in \mathsf{Dec}(t_1)} \mathsf{Ind}_{(t_1,<_1)}(q)-1\).

  3. if \(\mathsf{Dec}(t_1)\not =\emptyset \) and \(\mathsf{Acc}(t_1)\not =\emptyset \), the colour is the minimum of \(c_{\mathsf{odd}}=2 \cdot \min _{q \in \mathsf{Dec}(t_1)} \mathsf{Ind}_{(t_1,<_1)}(q)-1\) and \(c_{\mathsf{even}}=2 \cdot \min _{q \in \mathsf{Acc}(t_1)} \mathsf{Ind}_{(t_1,<_1)}(q)\).

  4. if \(\mathsf{Dec}(t_1)=\mathsf{Acc}(t_1)=\emptyset \), the colour is \(2 \cdot |Q_d|+1\).
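Putting the pieces together, one step of B can be sketched as follows (our own illustration, reusing the NBA and level_colour sketches above; a macro-state is represented as a pair (s1, t1) where s1 is a set of states of \(\overline{Q_d}\) and t1 is a list of states of \(Q_d\) sorted by increasing index, and Ord is a dictionary realising the ordering function):

def dpa_step(A, Q_d, Ord, s1, t1, sigma):
    det = {(q, s): q2 for (q, s, q2) in A.delta if q in Q_d}  # delta restricted to Q_d
    s2 = {q for q in A.post(s1, sigma) if q not in Q_d}
    # successors of t1 keep the order of their minimal predecessor ...
    t2, seen = [], set()
    for q in t1:                                   # t1 is ordered by increasing index
        q2 = det.get((q, sigma))
        if q2 is not None and q2 not in seen:
            t2.append(q2)
            seen.add(q2)
    # ... states newly entering Q_d from s1 are appended, ordered by Ord
    entering = {q for q in A.post(s1, sigma) if q in Q_d and q not in seen}
    t2 += sorted(entering, key=lambda q: Ord[q])
    # Dec(t1) and Acc(t1), given as 1-based indices within t1
    dec = [i + 1 for i, q in enumerate(t1)
           if det.get((q, sigma)) is None
           or t2.index(det[(q, sigma)]) + 1 < i + 1]
    acc = [i + 1 for i, q in enumerate(t1)
           if (q, sigma, det.get((q, sigma))) in A.alpha]
    return s2, t2, level_colour(dec, acc, len(Q_d))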

Fig. 3 Upper: a DPA that accepts the LTL language \({\mathbf {F}}{\mathbf {G}}a \vee {\mathbf {F}}{\mathbf {G}}b\); each edge is decorated with a natural number that specifies its colour. Lower: a reduced DPA

Example 2

The DPA of Fig. 3 is the automaton that is obtained by applying the construction LDBA\(\rightarrow \)DPA defined above to the LDBA of Fig. 1 that recognises the LTL language \({\mathbf {F}}{\mathbf {G}}a \vee {\mathbf {F}}{\mathbf {G}}b\). The figure only shows the reachable states of this construction. As specified in the construction above, states of the DPA are labelled with a subset of \(\overline{Q_d}\) and an ordered subset of \(Q_d\) of the original LDBA. As an illustration of the definitions above, let us explain the colour of the edge from state \((\{1\},[4,3])\) to itself on letter b. When the LDBA is in state 1, 3 or 4 and letter b is read, then the next state of the automaton is again 1, 3 or 4. Note also that no runs merge in that case. As a consequence, the colour that is emitted is even and equal to twice the index of the smallest state that is the source of an accepting transition. In this case, this is state 3 and its index is 2, which justifies the colour 4 on this edge. On the other hand, if letter a is read from state \((\{1\},[4,3])\), then the automaton moves to state \((\{1\},[4,2])\). State 3 is mapped to state 4, so a run merges, which implies that the colour emitted is odd and equal to 3. This 3 trumps all the 4’s that were possibly emitted from state \((\{1\},[4,3])\) before.

Theorem 2

The language defined by the deterministic parity automaton B is equal to the language defined by the limit deterministic automaton A, i.e. \(\mathsf {L}(A)=\mathsf {L}(B)\).

Proof

Let \(w \in \Sigma ^{\omega }\) and \(G_w\) be the run DAG of A on w. It is easy to show by induction that the sequence of colours that occurs along \(G_w\) is equal to the sequence of colours defined by the run of the automaton B on w. By Theorem 1, the language of automaton B is thus equal to the language of automaton A. \(\square \)

3.3 Complexity analysis

3.3.1 Upper bound

Let \(n = |Q|\) be the size of the LDBA and let \(n_d = |Q_d|\) be the size of the accepting component. We can bound the number of different orderings using the series of reciprocals of factorials (with e being Euler’s number):

$$\begin{aligned}\begin{array}{ll} |\mathcal{OP}(Q_d)| & = \sum _{i=0}^{n_d}\frac{n_d!}{(n_d-i)!} \\ & \le n_d \cdot n_d! \cdot \sum _{i=0}^{\infty }\frac{1}{i!} \\ & = e \cdot n_d \cdot n_d! \in \mathcal O(2^{n \log n}) \end{array}\end{aligned}$$

Thus the obtained DPA has \({{\mathcal {O}}}(2^n\cdot 2^{n \log n}) \subseteq 2^{\mathcal O(n \log n)}\) states and \(2n_d + 1 \in \mathcal {O}(n)\) colours.
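The bound on \(|\mathcal{OP}(Q_d)|\) is easy to validate numerically. A small sanity check (our own illustration):

import math

def ordered_subsets(n_d):
    # sum over the subset sizes i of the number of ordered subsets of size i
    return sum(math.perm(n_d, i) for i in range(n_d + 1))

for n_d in range(1, 8):
    assert ordered_subsets(n_d) <= math.e * n_d * math.factorial(n_d)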

3.3.2 Lower bound

We obtain a matching lower bound by strengthening Theorem 8 from [24]:

Lemma 2

There exists a family \((L_n)_{n \ge 2}\) of languages (\(L_n\) over an alphabet of n letters) such that for every n the language \(L_n\) can be recognised by a limit-deterministic Büchi automaton with \(3n + 2\) states but cannot be recognised by a deterministic parity automaton with fewer than n! states.

Proof

The proof of Theorem 8 from [24] constructs a nondeterministic Büchi automaton of exactly this size, which is in fact limit-deterministic.

Assume there exists a deterministic parity automaton for \(L_n\) with \(m < n!\) states. Since parity automata are closed under complementation, we can obtain a parity automaton, and hence also a Rabin automaton, of size m for \(\overline{L_n}\), and thus a Streett automaton of size m for \(L_n\), a contradiction to Theorem 8 of [24].\(\square \)

Corollary 1

Every translation from limit-deterministic Büchi automata of size n to deterministic parity yields automata with \(2^{\Omega (n \log n)}\) states in the worst case.

4 From LTL to DPA in \(2^{2^{\mathcal O(n)}}\)

In [34, 36], we present two different LTL\(\rightarrow \)LDBA translations. Given a formula \(\varphi \) of size n, both translations produce an asymptotically optimal LDBA with \(2^{2^{\mathcal O(n)}}\) states. The straightforward composition of these translations with the single exponential LDBA\(\rightarrow \)DPA translation of the previous section is only guaranteed to be triple exponential, while the Safra–Piterman and Muller–Schupp constructions produce DPAs of at most doubly exponential size, if applied to NBAs constructed from LTL formulas.

In this section, we describe two modifications of our simple approach relying on additional semantic information that yield DPAs with \(2^{2^{\mathcal O(n)}}\) states. The approach taken by both modifications is the following: We can view the second component of the states produced by our construction as a sequence of states of the LDBA, ordered by their indices. Since there are \(2^{2^{\mathcal O(n)}}\) states in the LDBA for an LTL formula of length n, the number of such sequences is:

$$\begin{aligned} 2^{2^{2^{\mathcal O(n)}}} \end{aligned}$$

If only the length of the sequences (the maximum index) were bounded by \(2^n\), the number of such sequences would be bounded by the number of functions \(2^n\rightarrow 2^{2^{\mathcal O(n)}}\) which is:

$$\begin{aligned} (2^{2^{{\mathcal {O}}(n)}})^{2^n}=2^{2^{{\mathcal {O}}(n)}\cdot 2^n}=2^{2^{{\mathcal {O}}(n)}} \end{aligned}$$

Both modifications prune these state sequences and guarantee that their length stays below a suitable threshold, such that the resulting DPAs are asymptotically optimal.

4.1 Pruning by language decomposition

We introduce the main ideas of this approach using the LDBA depicted in Fig. 4 and the corresponding DPA depicted in Fig. 5 (upper part) obtained by the construction from the previous section. First, let us examine the LDBA: the state \(q_4\) accepts a superset of the language accepted by \(q_3\), and state \(q_2\) allows failed runs to be ‘restarted’ in \(q_4\). Second, let us examine the DPA: the state \(\langle \{2\}, [3 < 4] \rangle \) encodes that in the corresponding run DAG the run in \(q_3\) has entered \(Q_d\) before the run in \(q_4\). One also immediately sees that the states \(\langle \{2\}, [3] \rangle \) and \(\langle \{2\}, [3 < 4] \rangle \) are bisimilar and that they can be collapsed to a single state (lower part). However, inspecting this example more closely, we can find a different explanation for this phenomenon. One sees that the states \(q_3\) and \(q_4\) in the original LDBA accept if c appears infinitely often and only differ in the treatment of a. In fact this can be captured by two classic notions about languages:

  • A language \(S \subseteq \Sigma ^\omega \) is a safety language if there exists a set of bad prefixes \(B \subseteq \Sigma ^*\) such that \(S = \Sigma ^\omega - B\Sigma ^\omega \). Thus for all words outside the language there exists a finite witness that the word does not belong to the language. We then denote the set of all safety languages by \(\mathcal {S}\).

  • A language \(C \subseteq \Sigma ^\omega \) is a suffix-closed language if \(C \subseteq \Sigma C\). Thus all suffixes of a word in the language are also in the language. We then denote the set of all suffix-closed languages by \(\mathcal {C}\).

Fig. 4 An LDBA A for the LTL language \(a {\mathbf {W}}b \wedge {\mathbf {G}}{\mathbf {F}}c\). The behaviour of A is deterministic within the subset of states \(Q_d=\{3,4\}\), which is a trap and contains all accepting transitions, which are depicted in bold face

Fig. 5 Upper: the DPA constructed for the LDBA from Fig. 4; each edge is decorated with a natural number that specifies its colour. Lower: a reduced DPA obtained through the construction relying on Proposition 1

The languages of \(q_3\) and \(q_4\) are in fact intersections of languages from \(\mathcal {S}\) and \(\mathcal {C}\). To be more concrete, we have \(\mathsf {L}(q_3) = C \cap S_1\) and \(\mathsf {L}(q_4) = C \cap S_2\) with \(C = \mathsf {L}({\mathbf {G}}{\mathbf {F}}c)\), \(S_1 = \mathsf {L}({\mathbf {G}}a)\), \(S_2 = \Sigma ^\omega \). We now make use of this to remove nodes from the run DAG and thus also explain the removal of \(\langle \{2\}, [3 < 4] \rangle \).

Assume we have the situation \(V_i = \{q_2, q_3\}\), \(V^d_i = \{q_3\}\), and \(V^d_{i+1} = \{q_3,q_4\}\). We argue that we can keep only \(q_3\), redefine \(V^d_{i+1} {:=}\{q_3\}\), and still capture all relevant information to decide acceptance. We focus on the difficult case where the subtree of \(q_3 \in V^d_{i+1}\) is rejecting and the subtree of \(q_4 \in V^d_{i+1}\) is accepting. Then, since \(q_4\) is accepting, the suffix \(w_{i+1}\) is in C, and, since \(q_3\) is rejecting, \(w_{i+1}\) cannot be in \(S_1\), which is the safety condition for \(q_3\). Since \(S_1\) is a safety language, we will detect this after a finite prefix and can discard that particular branch of the run DAG. Thus for some \(j > i\) we have \(V^d_j = \{\}\). Since we have \(V_j = \{q_2\}\) due to the self-loop on \(q_2\), we get \(V^d_{j+1} = \{q_4\}\) and this subtree is going to be accepting, since \(w_{i+1} \in C\) and thus also \(w_{j+1} \in C\) due to the suffix-closure of C.

Let us now generalise these insights: We call an LDBA decomposable if \(\delta \cap (\overline{Q_d} \times \Sigma \times \overline{Q_d})\) is deterministic, and there exists a partition of the states \(Q_d\) into sets \(Q_d^1\), \(Q_d^2\), ...\(Q_d^n\) such that for each \(Q_d^i\) the following holds:

  1. \(\exists C \in \mathcal {C} \cdot \forall q \in Q_d^i \cdot \exists S \in \mathcal {S} \cdot \mathsf {L}(q) = S \cap C\), i.e., all states in the component \(Q_d^i\) can be represented by an intersection of a safety language and a suffix-closed language,

  2. \(\forall q \in Q^i_d \cdot \forall \sigma \in \Sigma \cdot \forall q' \in Q \cdot (q,\sigma ,q') \in \delta \rightarrow q' \in Q_d^i\), i.e. \(Q_d^i\) is a trap (when \(Q_d^i\) is entered it is never left),

  3. if \(q \rightarrow ^\sigma p \rightarrow ^{\sigma '} r\) for states \(q \in \overline{Q_d}\), \(p, r \in Q_d^i\) and letters \(\sigma , \sigma ' \in \Sigma \), then there exists \(p' \in \overline{Q_d}\) and \(r' \in Q_d^i\) such that \(q \rightarrow ^\sigma p' \rightarrow ^{\sigma '} r'\) and \(\mathsf {L}(r) \subseteq \mathsf {L}(r')\), i.e. moving to the partition \(Q_d^i\) can be delayed, and

  4. \(\forall q \in \overline{Q_d} \cdot \forall \sigma \in \Sigma \cdot |\delta (q, \sigma ) \cap Q_d^i| \le 1\), i.e. for each state \(q \in \overline{Q_d}\) and letter \(\sigma \) we have at most one transition to \(Q_d^i\).

In fact, by repeated application of assumption (3) we obtain a generalisation to arbitrary finite words:

Lemma 3

Assume \(q \rightarrow ^\sigma p \rightarrow ^w r \rightarrow ^{\sigma '} s\) for states \(q \in \overline{Q_d}\), p, r, \(s \in Q_d^i\), letters \(\sigma , \sigma '\) and a (finite) word \(w \in \Sigma ^*\). Then there exist \(p',r' \in \overline{Q_d}\) and \(s' \in Q_d^i\) such that \(q \rightarrow ^\sigma p' \rightarrow ^{w} r' \rightarrow ^{\sigma '} s'\) and \(\mathsf {L}(s) \subseteq \mathsf {L}(s')\).

Proof

We proceed by induction on w. In the case \(w = \epsilon \) we can immediately apply assumption (3).

Case \(w = w'\sigma ''\): Consider Fig. 6. In terms of this picture we need to prove that there exists some \(t'' \in Q^i_d\) such that \(\mathsf {L}(t) \subseteq \mathsf {L}(t'')\). We obtain the first part (the solid lines) by applying the induction hypothesis, and we have \(\mathsf {L}(s) \subseteq \mathsf {L}(s')\) such that \(s,s' \in Q_d^i\). Since the transition relation within \(Q_d^i\) is deterministic, we also obtain \(\mathsf {L}(t) \subseteq \mathsf {L}(t')\) (the dotted lines). Now we apply assumption (3) to \(r'\), \(s'\), and \(t'\) (the dashed lines) to obtain \(s''\) and \(t''\) such that \(\mathsf {L}(t') \subseteq \mathsf {L}(t'')\). Then by transitivity we get \(\mathsf {L}(t) \subseteq \mathsf {L}(t'')\). \(\square \)

Fig. 6 Structure of the induction step in Lemma 3

We claim that for each block of the partition \(Q^i_d\) at most one state needs to be tracked by a run DAG in order to decide acceptance. Without loss of generality let us assume that LDBAs only contain states that are reachable from the initial state and that can reach an accepting transition. Thus for any state \(q \in Q\) we have \(\mathsf {L}(q) \ne \emptyset \). Given a decomposable LDBA with a set of states Q and a suitable partition \(Q_d^1\), \(Q_d^2\), ..., \(Q_d^n\), we define the reduced run DAG \(G_w^*\). This graph \(G^*_w=(V,E)\) has a set of vertices \(V \subseteq Q \times {\mathbb {N}}\) and edges \(E \subseteq V \times V\) defined as follows (a code sketch of the level construction is given after the definition):

  • \(V = \bigcup _{i \in {\mathbb {N}}} V_i\) and \(V_i = \bigcup _{j = 0}^n V_{i,j}\), where \(V_{i,0}\) contains the nodes representing runs that are at level i in \(\overline{Q_d}\) and \(V_{i,j}\) contains the nodes that correspond to \(Q^j_d\). Formally, the sets \(V_{i,j}\) are defined inductively as:

    $$\begin{aligned}\begin{array}{ll} V_{0,0} = &{} \{(q_0,0)\} \\ V_{0,j} = &{} \emptyset \\ V_{i,0} = &{} \mathsf{post}_{\delta }^{w(i-1)}(V_{i-1,0}) \cap (\overline{Q_d} \times \{i\}) \\ V_{i,j} = &{} {\left\{ \begin{array}{ll} \mathsf{post}_{\delta }^{w(i-1)}(V_{i-1,0}) \cap (Q_d^j \times \{i\}) &{} \hbox { if}\ V_{i-1,j} = \emptyset \\ \mathsf{post}_{\delta }^{w(i-1)}(V_{i-1,j}) &{} \text {otherwise.} \\ \end{array}\right. }\\ \end{array}\end{aligned}$$

    for all \(i \ge 1\) and all \(1 \le j \le n\) and where we use

    $$\begin{aligned} \mathsf{post}_{\delta }^\sigma (V_{i,j}) = \{q \in Q \mid \exists (q',i) \in V_{i,j} \cdot (q',\sigma ,q) \in \delta \} \end{aligned}$$

    to denote the successors of a level in the underlying automaton.

  • \(E = \{((q,i),(q',i+1)) \in V_i \times V_{i+1} \mid q \rightarrow ^{w(i)} q' \}\).
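One level step of this definition can be sketched as follows (our own illustration, reusing the NBA sketch above; a level is represented as a list of sets of states, where the first entry collects the states outside \(Q_d\) and entry j tracks the block \(Q_d^j\)):

def reduced_initial(A, Q_d_parts):
    # level 0: only q_0 in the initial component, all blocks empty
    return [{A.initial}] + [set() for _ in Q_d_parts]

def reduced_step(A, Q_d_parts, level, sigma):
    Q_d = set().union(*Q_d_parts) if Q_d_parts else set()
    from_initial = A.post(level[0], sigma)
    next_level = [{q for q in from_initial if q not in Q_d}]
    for j, block in enumerate(Q_d_parts, start=1):
        if not level[j]:
            # the block was empty: (re)enter it from the initial component
            next_level.append({q for q in from_initial if q in block})
        else:
            # the block was non-empty: follow its (deterministic) successors
            next_level.append(A.post(level[j], sigma))
    return next_level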

Let us now reconsider the LDBA from Fig. 4. This LDBA is decomposable with \(Q_d = Q_d^1\). Property (1) follows from our previous analysis, (2) is a direct consequence of the LDBA definition, since we only have a single partition, and (4) follows from the fact that we have at most one transition from \(q_1\) and \(q_2\) to \(Q_d\) under each letter. For (3), observe that \(\mathsf {L}(q_3) \subseteq \mathsf {L}(q_4)\) and that we have the ‘restarting’ loop in \(q_2\). Let us now see why this pruning is correct:

Proposition 1

There is an accepting run in the reduced run DAG \(G^*_w\) if and only if there is an accepting run in the run DAG \(G_w\).

Proof

(\(\Rightarrow \)) Observe that \(G^*_w\) is a subgraph of \(G_w\), since we obtain \(G^*_w\) from \(G_w\) by removing nodes and edges. Thus every accepting run on \(G^*_w\) is also an accepting run on \(G_w\).

(\(\Leftarrow \)) Assume that \(G_w = (V,E)\) has an accepting run. Then an accepting run eventually transitions from some \(q \in \overline{Q_d}\) by reading the letter \(w(i-1)\) to some \(p \in Q_d^j\). Then \(p \in V^d_i\) and the (deterministic) run starting in \((p,i)\) is accepting. Let us now see what happens in \(G^*_w = (V',E')\). We proceed by a case distinction.

Assume \(V'_{i,j} = \{p\}\). Then by definition of \(G^*_w\) we also have a (deterministic) run starting in \((p,i)\) that is identical to the one in \(G_w\) starting in \((p,i)\), which is accepting.

Assume \(V'_{i,j} = \emptyset \) and let \(r = \delta (p, w(i))\). Then by assumptions (3) and (4) there exist unique \(p' \in \overline{Q_d}\) and \(r' \in Q^j_d\) such that \(q \rightarrow ^{w(i-1)} p' \rightarrow ^{w(i)} r'\) and \(\mathsf {L}(r) \subseteq \mathsf {L}(r')\). Since the (deterministic) run starting in \((r,i+1)\) (in \(G_w\)) is accepting, the (deterministic) run starting in \((r',i+1)\) (in \(G^*_w\)) is also accepting.

It remains to consider the case where \(V'_{i,j}\) is neither empty nor simply \(\{p\}\). By the definition of \(G^*_w\) there exists a unique state \(p' \in V'_{i,j}\). If \(w_i \in \mathsf {L}(p')\), then we are also done, since \(G^*_w\) then has an accepting run. Thus assume \(w_i \notin \mathsf {L}(p')\). From \(w_i \in \mathsf {L}(p)\) we derive that \(w_{i} \in C\) for the suffix-closed language C associated with \(Q_d^j\). This follows from assumption (1). Moreover, assumption (1) tells us that \(w_i \notin S\) for all \(S \in \mathcal {S}\) with \(\mathsf {L}(p') = C \cap S\). Finally, since the LDBA does not contain states q with \(\mathsf {L}(q) = \emptyset \), there must be a level \(i' > i\) such that \(V'_{i',j}\) is empty. We then proceed analogously to the case \(V'_{i,j} = \emptyset \), but make use of Lemma 3 to bridge the longer distance. To be more precise, let \(q \rightarrow ^{w(i-1)} p \rightarrow ^{w(i)} \dots \rightarrow ^{w(i'-2)} r \rightarrow ^{w(i'-1)} s\) be the sequence of states in the original run DAG. We then apply Lemma 3 to obtain \(s' \in Q_d^j\) with \(\mathsf {L}(s) \subseteq \mathsf {L}(s')\) that is in \(V'_{i'+1,j}\). Due to the language inclusion, this run is then accepting. \(\square \)

We can now apply the construction from Sect. 3.2 to obtain a DPA tracking the reduced run DAG \(G^*_w\). Assume that the LDBA has m states and n partitions. Then each \(V_i\) has cardinality at most \(n+1\) and thus the resulting DPA has at most \((m+1)^{n+1}\) states.

A suitable LTL \(\rightarrow \) LDBA translation. We now show that decomposable LDBAs exist and that they are in fact produced by the recent LTL\(\rightarrow \)LDBA translation defined in [34, Theorem 6.2].

Proposition 2

For all LTL formulas \(\varphi \), the procedure of [34, Theorem 6.2] produces an LDBA which is decomposable and has at most \(2^n\) partitions, where n is the length of \(\varphi \).

Fig. 7 Schematic overview of an LDBA obtained in [34, Theorem 6.2] for a formula \(\varphi \). \(\overline{Q_d}\) is on the left and \(Q_d\) is on the right-hand side

Proof

Let us first sketch the structure of the resulting LDBA. A schema of the structure can be found in Fig. 7. The states of the LDBA are a disjoint union of the set \({ Reach}(\varphi )\), forming the states of the initial component, and the sets \(Q_{X_i,Y_i}\), forming the states of the accepting component. The latter are parametrised by sets \(X_i\) and \(Y_i\) which depend on the formula \(\varphi \). Within the initial component, we have a deterministic transition relation, named \({ af}\), and states that can be identified with LTL formulas. The accepting component is constructed as follows: for fixed sets X and Y, let \(A_{X,Y} = (Q_{X,Y}, q_{0,X,Y}, \Sigma , \delta _{X,Y}, \alpha )\) be the intersection of the following deterministic Büchi automata (DBA):

  • \(A^1_{\varphi ,X}\) accepts the language of a syntactic safe formula obtained from \(\varphi \) and X.

  • \(A^2_{X,Y}\) accepts the language of \(\bigwedge _i {\mathbf {G}}{\mathbf {F}}\psi _i\), where \(\psi _i\) is derived from X and Y.

  • \(A^3_{X,Y}\) accepts the language of \(\bigwedge _j {\mathbf {G}}\psi _j\), where \(\psi _j\) is a syntactic safe formula derived from X and Y.

The overall transition relation \(\delta \) for the LDBA A is the union of \({ af}\), \(\delta _{X,Y}\), and the yet-to-be-defined \(\delta _\curvearrowright \). \(\delta _\curvearrowright \) connects the initial component with the accepting component. It contains exactly one edge for each state in \({ Reach}(\varphi )\) to a state in \(Q_{X,Y}\).

We now claim that the partition \(\overline{Q_d} = { Reach}(\varphi )\), \(Q_d^1 = Q_{X_1, Y_1}\), \(Q_d^2 = Q_{X_2, Y_2}\), ..., \(Q_d^m = Q_{X_m, Y_m}\) is a suitable partition to show that the LDBA A is decomposable. For this, observe that \({ af}\) and the \(\delta _{X,Y}\) are deterministic transition relations. Lastly, we need to verify that for each \(Q_{X,Y}\) the assumptions (1)–(4) hold:

  1. Since, by construction, \(A_{X,Y}\) is derived by an intersection, every state \(q \in Q_{X,Y}\) recognises the intersection of \(\mathsf {L}(\bigwedge _i {\mathbf {G}}{\mathbf {F}}\psi _i) \in \mathcal {C}\) and a safety language.

  2. By construction, Q is a disjoint union and no transitions leaving \(Q_{X,Y}\) have been added, thus \(Q_{X,Y}\) is a trap.

  3. This assumption is proven by the technical result [34, Lemma 4.20] relating \({ af}\) and \(\cdot [\cdot ]_\nu \), which is the essential component of \(\delta _\curvearrowright \).

  4. Finally, it can easily be verified by looking at the definition of \(\delta _\curvearrowright \) in [34, Theorem 6.2] that there exists at most one transition from the initial component to each of the partitions.

Lastly, we claimed that there are at most \(2^n\) partitions, where n is the size of the formula. For this, observe that \(X_i\) and \(Y_i\) are, according to [34, Theorem 6.2], subsets of two disjoint sets containing only subformulas of \(\varphi \), and thus there exist at most \(2^n\) possible choices for \(X_i\) and \(Y_i\). \(\square \)

4.2 Pruning by language subsumption

Fix an LDBA with a set of states Q. Assume the existence of an oracle: a list of statements of the form \(\mathsf {L}(q) \subseteq \bigcup _{q' \in Q_q} \mathsf {L}(q')\) where \(q \in Q\) and \(Q_q \subseteq Q\). We use the oracle to define a mapping that associates to each run DAG \(G_w\) a ‘reduced DAG’ \(G_w^*\), defined as the result of iteratively performing the following operation:

  • Find the first \(V_i\) in the current DAG such that the sequence \((v_1,i)\sqsubset (v_2,i)\sqsubset \cdots \sqsubset (v_{n_i},i)\) of vertices of \(V_i^d\) contains a vertex \((v_k,i)\) for which the oracle ensures

    $$\begin{aligned} \mathsf {L}(v_k)\subseteq \bigcup _{j<k}\mathsf {L}(v_j) \end{aligned}$$
    (*)

    We call \((v_k, i)\) a redundant vertex.

  • Remove \((v_k, i)\) from the sequence, and otherwise keep the ordering \(\sqsubseteq _i\) unchanged (thus decreasing the index of the vertices \((v_\ell ,i)\) with \(\ell >k\)).

  • Remove any vertices (if any) that are no longer reachable from vertices of \(V_1\).

We define the colour summary of \(G_w^*\) in exactly the same way as the colour summary of \(G_w\). The DAG \(G_w^*\) satisfies the following crucial property:
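The per-level effect of this reduction can be sketched as follows (our own illustration; ordered_level is the sequence of \(V^d_i\)-vertices sorted by increasing index, and oracle(v, smaller) is an assumed callable answering whether \(\mathsf {L}(v)\) is contained in the union of the languages of the smaller vertices):

def prune_level(ordered_level, oracle):
    kept = []
    for v in ordered_level:
        # keep the smallest vertex unconditionally; drop later redundant vertices
        if not kept or not oracle(v, list(kept)):
            kept.append(v)
    return kept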

Proposition 3

The colour summary of the run DAG \(G_w^*\) is even if and only if there is an accepting run in \(G_w\).

Proof

(\(\Rightarrow \)): The ‘only-if’ direction can be proven as in Theorem 1 verbatim, only replacing \(G_w\) by \(G_w^*\). The reason why the argumentation is still correct is that the discussed “smallest run prefix that ends up in v” (now in \(G_w^*\)) is actually a real run prefix (in \(G_w\)), since it never secondarily merged. Indeed, runs only merge into smaller ones.

(\(\Leftarrow \)): (Step 1): For the ‘if’ direction, we first use the proof of Theorem 1, Step 1, verbatim, obtaining the smallest accepting run \(\rho \) in \(G_w\).

Additionally, we prove that this (smallest) constructed run \(\rho \) is actually a run in \(G_w^*\). For a contradiction, assume that this is not the case and \(\rho =\rho _1 (v_k,i) \rho _2\) where \((v_k,i)\) is the first vertex on \(\rho \) that secondarily merged. Then there is \((v_j,i)\in V_i\cap (Q_d\times \{i\})\) with \(v_j\sqsubset v_k\) and \(\mathsf {L}(v_j)\) contains the label of the run \((v_k,i) \rho _2\), accepted by some run \((v_j,i)\rho _2'\) in \(G_w\). Since \((v_j,i)\sqsubset _i(v_k,i)\), we also have a run prefix \(\rho _1'(v_j,i)\sqsubset \rho _1(v_k,i)\), and thus an accepting run \(\rho _1'(v_j,i)\rho _2'\) in \(G_w\) such that \(\rho _1'(v_j,i)\rho _2'\sqsubset \rho _1 (v_k,i) \rho _2=\rho \), a contradiction with minimality of \(\rho \).

(Step 2): Let k be the position when the last merge of a smaller run prefix happens in \(G_w^*\) (not \(G_w\)).

(Step 3): We use the proof of Theorem 1, Step 3, verbatim, proving that the colour summary is even. \(\square \)

The mapping on DAGs induces a reduced DPA as follows. The states are the pairs \((s, (t, <))\) such that \((t, <)\) does not contain redundant vertices. There is a transition \((s_1, (t_1,<)) {\mathop {\rightarrow }\limits ^{a}} (s_2, (t_2, <))\) with colour c iff there is a word w and an index i such that \((s_1, (t_1, <))\) and \((s_2, (t_2, <))\) correspond to the i-th and \((i+1)\)-th levels of \(G_w^*\), and a and c are the letter and colour of the step between these levels in \(G_w^*\). Observe that the set of transitions is independent of the words chosen to define them.

The equivalence between the initial DPA \(\mathcal A\) and the reduced DPA \(\mathcal A_r\) follows immediately from Proposition 3: \(\mathcal A\) accepts w iff \(G_w\) contains an accepting run iff the colour summary of \(G_w^*\) is even iff \(\mathcal A_r\) accepts w.

Example 3

Consider the LDBA of Fig. 1 and an oracle given by \(\mathsf {L}(4)=\emptyset \), ensuring \(\mathsf {L}(4)\subseteq \bigcup _{i\in I}\mathsf {L}(i)\) for any \(I\subseteq Q\). Then 4 is always redundant and merged, removing the two rightmost states of the DPA of Fig. 3 (upper part) and resulting in the reduced DPA of Fig. 3 (lower part). However, for the sake of technical convenience, we shall refrain from removing a redundant vertex when it is the smallest one (with index 1).

Since the construction of the reduced DPA is parametrised by an oracle, the obvious question is how to obtain an oracle that does not involve applying an expensive language inclusion test. Let us give a first example in which an oracle can be easily obtained:

Example 4

Consider an LDBA where each state \(v=\{s_1,\ldots ,s_k\}\) arose from some powerset construction on an NBA in such a way that \(\mathsf {L}(\{s_1,\ldots ,s_k\})=\mathsf {L}(s_1)\cup \cdots \cup \mathsf {L}(s_k)\). An oracle can, for instance, allow us to merge whenever \(v_k\subseteq \bigcup _{j<k}v_j\), which is a sound syntactic approximation of language inclusion. This motivates the following formal generalisation.

Let \(\mathcal L_B=\{L_i\mid i\in B\}\) be a finite set of languages, called base languages. We call \(\mathcal L_C:=\{\bigcup \mathcal L\mid \mathcal L\subseteq \mathcal L_B\}\) the join-semilattice of composed languages. We shall assume an LDBA with some \(\mathcal L_B\) such that \(\mathsf {L}(q)\in \mathcal L_C\) for every state q. We say that such an LDBA has a base \(\mathcal L_B\). In other words, every state recognises a union of some base languages. (Note that every automaton has a base of at most linear size.) Whenever we have states \(v_j\) recognising \(\bigcup _{i\in I_j}L_i\) with \(I_j\subseteq B\) for every j, the oracle allows us to merge vertices \(v_k\) satisfying \(I_k\subseteq \bigcup _{j<k}I_j\). Intuitively, the oracle declares a vertex redundant whenever the simple syntactic check on the indices allows for that.
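This syntactic check can serve as the oracle for the pruning sketch above. A minimal sketch (our own illustration; index_set_of is an assumed mapping from a vertex to the set \(I\subseteq B\) of indices of its base languages):

def syntactic_oracle(index_set_of):
    def oracle(v, smaller):
        covered = set().union(*(index_set_of[u] for u in smaller)) if smaller else set()
        # v is redundant when its index set is covered by the smaller vertices
        return index_set_of[v] <= covered
    return oracle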

Let \(V_1=\bigcup _{i\in I_1}L_i, \dots , V_j=\bigcup _{i\in I_j}L_i\) be a sequence of languages of \(\mathcal L_C\) to which the reduction has been applied and in which there are no more redundant vertices. The maximum length of such a sequence is determined already by the base \(\mathcal L_B\), and we denote it by \( width (\mathcal L_B)\).

Lemma 4

For any \(\mathcal L_B\), we have \( width (\mathcal L_B)\le |\mathcal L_B|+1\).

Proof

We provide an injective mapping of the languages in the sequence (except for \(V_1\)) into \(B\). Since \(I_2\not \subseteq I_1\), there is some \(i\in I_2\setminus I_1\) and we map \(V_2\) to this i. In general, since \(I_k\not \subseteq \bigcup _{j=1}^{k-1}I_j\), there is some \(i\in I_k\setminus \bigcup _{j=1}^{k-1}I_j\) and we map \(V_k\) to this i. The mapping is injective because the index chosen for \(V_k\) does not occur in any \(I_j\) with \(j<k\), in particular not among the previously chosen indices. \(\square \)
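The mapping of the proof can also be read as a small algorithm on the index sets: the following sketch (hypothetical representation) picks a fresh witness index for every \(V_k\) with \(k\ge 2\), which also makes the bound \( width (\mathcal L_B)\le |\mathcal L_B|+1\) evident.

```python
from typing import Dict, FrozenSet, Sequence, Set

def witness_map(index_sets: Sequence[FrozenSet[int]]) -> Dict[int, int]:
    """For an irredundant sequence I_1, ..., I_n (no I_k contained in the union
    of the previous ones), map each k >= 2 to some i in I_k that does not occur
    in I_1, ..., I_{k-1}.  The witnesses are pairwise distinct, hence n <= |B| + 1."""
    seen: Set[int] = set()
    witness: Dict[int, int] = {}
    for k, I in enumerate(index_sets, start=1):
        if k >= 2:
            fresh = I - seen
            assert fresh, "sequence is not irredundant"
            witness[k] = min(fresh)      # any element of I_k \\ seen works
        seen |= I
    return witness

# Example: an irredundant sequence over B = {0, 1, 2}.
print(witness_map([frozenset({0}), frozenset({1}), frozenset({0, 2})]))
# -> {2: 1, 3: 2}
```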

On the one hand, the transformation of an LDBA to a DPA without the reduction yields \(2^{\mathcal O(|Q|\cdot \log |Q|)}\) states. On the other hand, we can now show that for an LDBA with a base, the second component of the states of the reduced DPA can be exponentially smaller. Further, let us assume that the LDBA is initial-deterministic, meaning that \(\delta \cap (\overline{Q_d}\times \Sigma \times \overline{Q_d})\) is deterministic, so that the first component does not cause a blowup either.

Corollary 2

For every initial-deterministic LDBA with base of size m, there is an equivalent DPA with \(2^{\mathcal O(m^2)}\) states.

Proof

There are at most \(2^{m}\) composed languages, i.e. \(|\mathcal L_C|\le 2^{m}\). Therefore, the LDBA has at most \(2^m\) (nonequivalent) states. Hence the construction produces at most

$$\begin{aligned} |\mathcal L_C|\cdot |\mathcal L_C|^{\mathcal O( width (\mathcal L_B))}=2^m\cdot (2^m)^{\mathcal O(m)}=2^{\mathcal O(m^2)} \end{aligned}$$

states since the LDBA is initial-deterministic, causing no blowup in the first component. \(\square \)

4.3 Bases for LDBAs obtained from LTL formulas

We prove that the width for LDBAs arising from the LTL translation is only singly exponential in the size of the formula. To this end, we need to recall a property of the LTL\(\rightarrow \)LDBA translation of [36]. Since partial evaluation of formulas plays a major role in the translation, we introduce the following definition. Given an LTL formula \(\varphi \) and sets T and F of LTL formulas, let \(\varphi [T,F]\) denote the result of substituting \({\mathbf {tt}}\) (true) for each occurrence of a formula of T in \(\varphi \), and similarly \({\mathbf {ff}}\) (false) for formulas of F. The following property of the translation is proven in “Appendix A”.
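For illustration, \(\varphi [T,F]\) is a straightforward recursion over the syntax tree. The sketch below uses a hypothetical tuple encoding of LTL formulas (operator name followed by operands) and performs no simplification; it is only meant to make the definition concrete.

```python
# Hypothetical LTL syntax: 'tt', 'ff', ('ap', 'a'), ('not', f), ('and', f, g),
# ('or', f, g), ('X', f), ('F', f), ('G', f), ('U', f, g), ...

TT, FF = 'tt', 'ff'

def substitute(phi, T, F):
    """phi[T, F]: replace every occurrence of a formula of T by tt and of F by ff.

    T and F are sets of formulas in the same tuple representation."""
    if phi in T:
        return TT
    if phi in F:
        return FF
    if isinstance(phi, tuple) and phi[0] != 'ap':
        op, *args = phi
        return (op, *(substitute(arg, T, F) for arg in args))
    return phi  # tt, ff, or an atomic proposition not mentioned in T or F

# Example: (G F a) and (b U c), with T = {F a} and F = {c}.
phi = ('and', ('G', ('F', ('ap', 'a'))), ('U', ('ap', 'b'), ('ap', 'c')))
print(substitute(phi, T={('F', ('ap', 'a'))}, F={('ap', 'c')}))
# -> ('and', ('G', 'tt'), ('U', ('ap', 'b'), 'ff'))
```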

Proposition 4

For every LTL formula \(\varphi \), every state s of the LDBA of [36] is labelled by an LTL formula \( label (s)\) such that (i) \(\mathsf {L}(s)=\mathsf {L}( label (s))\) and (ii) \( label (s)\) is a Boolean combination of subformulas of \(\varphi [T_s, F_s]\) for some \(T_s\) and \(F_s\). Moreover, the LDBA is initial-deterministic.

As a consequence, we can bound the corresponding base:

Corollary 3

For every LTL formula \(\varphi \), the LDBA of [36] for \(\varphi \) has a base of size \(2^{\mathcal O{(|\varphi |)}}\).

Proof

Firstly, we focus on states using the same \(\varphi [T_s, F_s]\). The language of each such state can be defined by a Boolean formula over \(\mathcal O (|\varphi |)\) atoms. Since every Boolean formula can be expressed in disjunctive normal form, its language is the union of the languages of its conjuncts. The languages of the conjunctions thus form a base for these states. There are only exponentially many different conjunctions in the number of atoms, hence this base is of singly exponential size \(2^{\mathcal O(|\varphi |)}\) as well.

Secondly, observe that there are only \(2^{\mathcal O(|\varphi |)}\) different formulas \(\varphi [T_s, F_s]\) and thus only \(2^{\mathcal O(|\varphi |)}\) different sets of atoms. Altogether, the size of the base is bounded by \(2^{\mathcal O(|\varphi |)}\cdot 2^{\mathcal O (|\varphi |)}= 2^{\mathcal O (|\varphi |)}\). \(\square \)
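A base as in the first part of the proof can be obtained, for instance, by enumerating the truth assignments of the atoms (the subformulas of \(\varphi [T_s,F_s]\), treated as opaque objects) and collecting the satisfying minterms; each minterm is one conjunction of the disjunctive normal form. The following sketch reuses the hypothetical tuple encoding from above and is purely illustrative:

```python
from itertools import product

def evaluate(phi, assignment):
    """Evaluate a Boolean combination over opaque atoms under a dict atom -> bool."""
    if phi == 'tt':
        return True
    if phi == 'ff':
        return False
    op = phi[0]
    if op == 'not':
        return not evaluate(phi[1], assignment)
    if op == 'and':
        return all(evaluate(arg, assignment) for arg in phi[1:])
    if op == 'or':
        return any(evaluate(arg, assignment) for arg in phi[1:])
    return assignment[phi]  # an opaque atom, e.g. a subformula of phi[T_s, F_s]

def minterm_base(label, atoms):
    """Return the satisfying minterms of `label`: each is a tuple of
    (atom, polarity) pairs, i.e. one conjunction of the DNF.
    There are at most 2^len(atoms) of them."""
    base = []
    for values in product([False, True], repeat=len(atoms)):
        assignment = dict(zip(atoms, values))
        if evaluate(label, assignment):
            base.append(tuple(sorted(assignment.items())))
    return base

# Example: label = a or (not b), over the atoms a, b.
label = ('or', 'a', ('not', 'b'))
for conj in minterm_base(label, ['a', 'b']):
    print(conj)
# -> (('a', False), ('b', False)), (('a', True), ('b', False)), (('a', True), ('b', True))
```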

Theorem 3

For every LTL formula \(\varphi \), there is a DPA for \(\varphi \) with \(2^{2^{\mathcal O(|\varphi |)}}\) states.

Proof

The LDBA for \(\varphi \) has a base of singly exponential size \(2^{\mathcal O(|\varphi |)}\) by Corollary 3 and is initial-deterministic by Proposition 4. Therefore, by Corollary 2, the size of the DPA is doubly exponential, namely \(2^{{(2^{\mathcal O(|\varphi |)})}^2}=2^{2^{\mathcal O(|\varphi |)}}\). \(\square \)

This matches the lower bound \(2^{2^{\Omega (n)}}\) of [22] as well as the upper bound obtained via the Safra-Piterman approach. Finally, note that while the breakpoint construction in [36] is analogous to Safra’s vertical merging, the merging introduced here is analogous to Safra’s horizontal merging.

4.4 Comparing the two pruning methods

We have presented two different pruning techniques to achieve an asymptotically optimal LTL\(\rightarrow \)DPA translation. The natural question is how they compare on a conceptual level: is one stronger than the other? No, in fact they are incomparable. Consider Fig. 5: here, the first construction removes the ranking \([3 < 4]\), but this cannot be achieved by the second construction. On the other hand, it is clear that the language-based pruning technique can be applied to any LDBA (by using language inclusion checks) and is thus applicable to a larger class of LDBAs.

5 Experimental evaluation

We have shown that our determinisation construction, which is considerably simpler than other constructions, can be combined with semantic pruning of the tracked states, yielding an asymptotically optimal construction. The simplicity is achieved by a detour over LDBAs, and one might expect that this incurs inefficiencies in practice. In this section, we provide experimental evidence that this is not the case.

5.1 Method

Metric. We compare the size (number of states, number of colours) of the produced automata, since this is a good indicator of the size of the arena in the automata-theoretic approach to synthesis. In contrast, we do not include any resource-consumption analysis, i.e. measurements of computation time or allocated memory. Not only are these values highly dependent on implementation details but, foremost, as shown by [27], on-the-fly computation of the state space and additional compositional constructions are highly relevant for the overall computation time in the context of synthesis. For each input formula \(\varphi _i\), we compare the sizes \(a_i\) and \(b_i\) achieved by different methods and are interested in the improvement factor, i.e. their ratio \(a_i/b_i\). In order to aggregate the factors into an average one, given that the involved numbers can differ by orders of magnitude, it is appropriate to use the geometric average of the ratios:

$$\begin{aligned} \root n \of {\prod _{i=1}^n \frac{a_i}{b_i}} = \frac{\root n \of {\prod _{i=1}^n {a_i}}}{\root n \of {\prod _{i=1}^n {b_i}}} \end{aligned}$$

Consequently, for each approach and set of inputs, we display the geometric average of the sizes, since for each pair of approaches the ratio of these averages immediately yields the desired comparison.
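For completeness, the aggregation can be computed as follows; the sizes are placeholder values, and the identity checked by the assertion is the reason why reporting per-approach geometric averages suffices.

```python
from math import exp, log
from statistics import geometric_mean

# Hypothetical automaton sizes produced by two approaches on the same formulas.
sizes_a = [3, 12, 7, 250, 5]
sizes_b = [3, 9, 8, 40, 5]

# Geometric average of the improvement factors a_i / b_i ...
ratios = [a / b for a, b in zip(sizes_a, sizes_b)]
factor = geometric_mean(ratios)

# ... equals the ratio of the geometric averages of the sizes, which is why the
# tables only need to report the per-approach geometric averages.
assert abs(factor - geometric_mean(sizes_a) / geometric_mean(sizes_b)) < 1e-9

# Equivalent log-space formulation, robust against overflow for long products:
factor_log = exp(sum(log(r) for r in ratios) / len(ratios))
print(round(factor, 3), round(factor_log, 3))
```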

Competing Translations. We compare seven configurations from three different groups of translations that yield DPAs (with the acceptance condition defined on transitions):

  • via NBAs:

    \(\textsf {N}_1\):

    ltl2tgbaFootnote 4 ([5], 2.8.5): The tool implements a portfolio approach for constructing small automata, in our configuration DPAs. If the formula is not covered by one of the specialised constructions, a version of the Safra determinisation procedure [31] with several optimisations is used. Thus one can argue that it is the state-of-the-art portfolio translator.

    \(\textsf {N}_2\):

    nbadetFootnote 5 ([26]Footnote 6): The tool implements the construction presented in [25] with additional optimisations [26]. We use ltl2tgba -B to translate LTL formulas into NBAs with the acceptance condition defined on states, since the tested version of the tool does not support acceptance conditions on transitions. Note that this detour causes a blowup in the intermediate NBA.

  • via DRAs (deterministic Rabin automata):

    \(\textsf {D}_1\):

    ltl2dgra (asymmetric), dgra2dra, dra2dpaFootnote 7: This approach uses the direct translation to deterministic generalised Rabin automata (DGRA) that has been described in [7] and revised and corrected in [10]. This configuration uses all available optimisations, including the usual reduction rules for Rabin pairs, e.g. [10]. This approach treats the least- (\({\mathbf {F}}\), \({\mathbf {U}}\), \({\mathbf {M}}\)) and greatest-fixed-point operators (\({\mathbf {G}}\), \({\mathbf {W}}\), \({\mathbf {R}}\)) ‘asymmetrically’, and has only a proven triple exponential upper bound. We combine it with the IAR (index-appearance record) construction as improved in [21].

    \(\textsf {D}_2\):

    ltl2dra (symmetric), dra2dpa\(^{7}\): This approach uses the direct translation to DRAs based on the ‘Master-Theorem’ [9, 34] and uses optimisations described in detail in [34]. As in the previous case, we combine it with the IAR construction of [21].

  • via LDBAs:

    \(\textsf {LD}_1\):

    ltl2dpa (asymmetric)\(^{7}\): This translation combines the construction presented here with the ‘asymmetric’ translation LTL\(\rightarrow \)LDBA presented in [36], which treats least- (\({\mathbf {F}}\), \({\mathbf {U}}\), \({\mathbf {M}}\)) and greatest-fixed-point operators (\({\mathbf {G}}\), \({\mathbf {W}}\), \({\mathbf {R}}\)) differently.

    \(\textsf {LD}_2\):

    ltl2dpa (symmetric)\(^{7}\): This translation combines the construction presented here with the ‘symmetric’ translation LTL\(\rightarrow \)LDBA based on the ‘Master-Theorem’ presented in [9, 34], which treats least- (\({\mathbf {F}}\), \({\mathbf {U}}\), \({\mathbf {M}}\)) and greatest-fixed-point operators (\({\mathbf {G}}\), \({\mathbf {W}}\), \({\mathbf {R}}\)) symmetrically.

    \(\textsf {LD}_p\):

    ltl2dpa (portfolio)\(^{7}\): Here, we combine the previous translations with a portfolio of translations for fragments that directly yield DPAs [9, 34, 35]. This portfolio approach is important for the comparison with configuration \(\textsf {N}_1\), where similar steps are taken. Moreover, since complementation of DPAs is trivial, this configuration translates both the formula and its negation to DPAs \(A_\varphi \), \(A_{\lnot \varphi }\), constructs the complement \(\overline{A_{\lnot \varphi }}\), and picks the smaller of \(A_\varphi \) and \(\overline{A_{\lnot \varphi }}\).

Table 1 Parametrised formulas set

Input Formula Sets. We base the evaluation on two sets of formulas: the first set consists of the well-known ‘Dwyer’ patterns [6], a collection of 55 LTL formulas specifying common properties; the second set is obtained by instantiating the 11 parametrised formulas from Table 1. These families are partly taken from [14, 29, 38] or are simple combinations of \({\mathbf {U}}\), \({\mathbf {G}}{\mathbf {F}}\), and \({\mathbf {F}}{\mathbf {G}}\) formulasFootnote 8. The second set of formulas is useful to isolate and analyse strong and weak points of the compared translations. Furthermore, we abstained from using randomly generated formulas: formulas from real-world examples usually have a high degree of structure, and in our experience it is unclear what results on randomly generated formulas imply for practice.

Table 2 This table displays the results for the ‘Dwyer’-patterns set. The table lists the number of states, followed by the number of colours (if larger than 1), and is sorted in descending order of the difference in orders of magnitude between the largest and the smallest automaton, as explained in the text. The results for the remaining 88 formulas are located in Appendix B. We write \(\frac{1}{n}\Sigma \), \(\sigma \), and med. for the average, the standard deviation, and the median of the number of states over the whole data set, respectively; the geometric average is reported as well

The formula sets are obtained by executing genltlFootnote 9 with the corresponding parameters. Each formula and its negation are then added to the set of formulas. We take the following steps to reduce the influence of specific simplification rules and to remove (close to) duplicate entries: first, we bring formulas into negation normal form; second, we apply a standard set of LTL simplification rules [1, 11, 29, 33, 37] with the goal of neutralising the effect of the LTL simplifier in the evaluation; third, we normalise the atomic propositions and remove formulas that are equal modulo renaming of atomic propositions.
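A minimal sketch of the third step, reusing the hypothetical tuple encoding from above: atomic propositions are renamed in order of first occurrence, so that formulas equal modulo renaming obtain the same canonical form and only one representative is kept.

```python
def canonical(phi, mapping=None):
    """Rename atomic propositions to p0, p1, ... in order of first occurrence."""
    if mapping is None:
        mapping = {}
    if isinstance(phi, tuple) and phi[0] == 'ap':
        name = phi[1]
        if name not in mapping:
            mapping[name] = f"p{len(mapping)}"
        return ('ap', mapping[name])
    if isinstance(phi, tuple):
        op, *args = phi
        return (op, *(canonical(arg, mapping) for arg in args))
    return phi  # 'tt' / 'ff'

def deduplicate(formulas):
    """Keep one representative per equivalence class modulo renaming of APs."""
    seen, kept = set(), []
    for phi in formulas:
        key = canonical(phi)
        if key not in seen:
            seen.add(key)
            kept.append(phi)
    return kept

# G a and G b are equal modulo renaming, so only the first is kept.
print(deduplicate([('G', ('ap', 'a')), ('G', ('ap', 'b')), ('F', ('ap', 'a'))]))
# -> [('G', ('ap', 'a')), ('F', ('ap', 'a'))]
```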

As a consequence, the number of formulas we consider is less than the number of formulas of the corresponding original publication. For example, [6] lists 55 formulas, but we remove six entries: e.g., only one of \({\mathbf {G}}a\), \({\mathbf {G}}\lnot a\), and \({\mathbf {F}}a\) is added to the formula set. Note that we always evaluate the translation also on the negation of each formula. However, we do not remove duplicates across the two formula sets.

5.2 Results

The measured automata sizes for the LTL formulas are listed in Tables 2 and 4. We refer by \(\varphi \) to formulas of the pattern set, and by \(\chi \) to formulas of the parametrised set. Further, we write \({\overline{\varphi }}\) instead of \(\lnot \varphi \). We sort the rows of the table by the difference in the orders of magnitude of sizes yielded by the considered configurations. More precisely, we compute \(\frac{ max }{ min }\) for each row, where min refers to the number of states of the smallest automaton and max refers to the number of states of the largest automaton, and sort in descending order. In the main body of the paper, we list only the top 10 rows according to this order to highlight the most interesting differences. The remaining results are located in Appendix B.

5.3 Discussion

Table 2 and Table 4, which contain the formulas with the largest differences in size, suggest the following conclusions.

Variants of the LDBA approach. There are several cases where \(\textsf {LD}_1\) produces dramatically smaller automata than \(\textsf {LD}_2\), e.g., the first four rows of Table 2. Nevertheless, there are also many cases where \(\textsf {LD}_2\) produces slightly smaller automata than \(\textsf {LD}_1\), e.g., the following rows in that table and most of the formulas of Table 4. Thus both techniques have their merit, with their ratio close to 1 and the asymmetric one being ‘safer’ if only one is to be used.

Table 3 Excerpt of Table 2 highlighting the effect of negation and complementation used in the portfolio approach
Table 4 This table displays the results for the parametrised set (Table 1). The table is structured as Table 2 and the results for the remaining 56 formulas are located in Appendix B

Observe that the same behaviour occurs with the pair \(\textsf {D}_1\) and \(\textsf {D}_2\), reflecting the fact that this difference stems from the difference between the asymmetric and symmetric approaches. This pattern is already noticeable for the intermediate constructions, i.e., the sizes of the constructed LDBAs and DRAs, respectively. The geometric averages for this intermediate step are 6.16, 5.68, 4.75, and 4.80 on the pattern set for \(\textsf {LD}_1\), \(\textsf {LD}_2\), \(\textsf {D}_1\), and \(\textsf {D}_2\), respectively, and 7.47, 7.94, 4.85, and 5.51 on the parametrised set. The complete results are located in “Appendix B”.

The portfolio approach \(\textsf {LD}_p\) not only yields the smaller of the two results, but its dedicated constructions tailored to special fragments produce yet smaller LDBAs, and correspondingly smaller DPAs, for several formulas. Overall, and not surprisingly, a portfolio approach entails a considerable advantage.

Safra versus IAR versus LDBA determinisation. Our LDBA determinisation in a portfolio configuration (\(\textsf {LD}_p\)) is on a par with \(\textsf {N}_1\) on the pattern set, the average ratio being \(103\%\), and takes the lead on the parametrised set, the average ratio being \(70\%\) there.

For \(\textsf {LD}_1\) and \(\textsf {LD}_2\), without post-processing and portfolio techniques, the difference to \(\textsf {N}_1\) grows (with ratios of \(124\%\) and \(133\%\), respectively). Yet, on the parametrised set they are still better than the portfolio \(\textsf {N}_1\) (with \(90\%\) and \(87\%\), respectively).

One of the reasons for the discrepancy between comparisons on the pattern set and on the parametrised set is that the parametrised set contains several ‘simpler’ formulas that are recognisable by a deterministic Büchi or deterministic co-Büchi automaton, as indicated by \(\textsf {N}_1\) only needing a single colour. In these ‘simple’ cases, several techniques are known, and implemented in \(\textsf {N}_1\), to reduce the number of states in the automata.

The other Safra-based approach \(\textsf {N}_2\) tends to yield larger automata than \(\textsf {N}_1\).

Comparing the IAR and LDBA-determinisation approaches, interestingly, there are significant differences in both directions on many formulas; yet the ratios are close to 1 on the pattern set, showing that the two approaches are largely incomparable. However, the latter takes the lead on the parametrised set.

Summary. The portfolio approaches are the clear winners. While \(\textsf {N}_1\) may produce slightly smaller automata in many cases, \(\textsf {LD}_p\) produces significantly smaller automata in several cases. Both approaches are used in practice for LTL synthesis: \(\textsf {N}_1\) is used in ltlsynt [15] and a variant of \(\textsf {LD}_p\) is used in Strix [27]. We leave open the question of whether implementing a portfolio for the IAR approach would also yield a competitive configuration.

Finally, having pointed out the differences, it is important to keep in mind that, for a substantial part of the formulas, all participating tools yield small automata, not differing dramatically, as one can see by inspecting “Appendix B”.

6 Conclusion

We have presented a simple, ‘Safraless’, and asymptotically optimal translation from LTL and from LDBA to DPA. Furthermore, the translation is suitable for an on-the-fly implementation and for deployment in LTL synthesis, as has been successfully demonstrated by Strix [27], the winner of the LTL-synthesis track of SyntComp 2018 [16] and 2019.Footnote 10