1 Introduction

Motivation. Embedded systems are key components of cyberphysical systems and are often subject to stringent safety goals. Among the current approaches to the modeling and analysis of timed systems, the approach of timed automata [5] stands out as being a very successful approach with well-developed tool support – in particular the UPPAAL suite [28] of tools. As cyberphysical systems become increasingly distributed and interconnected through wireless communication links it becomes even more important to ensure that they meet suitable security goals.

In this paper, we are motivated by an example of a smart power grid system. In its very basic form, a smart grid system consists of a meter that measures the electricity consumption in a customer’s (C) house and then sends this data to the utility company (UC). The detailed measurements of the meter provide more accurate billings for UC, while C receives energy management plans that optimize his energy consumption. Although this setting seems to be beneficial for both UC and C, it has been shown that high-frequent monitoring of the power flow poses a major threat to the privacy of C [14, 23, 27]. To deal with this problem many smart grid systems introduce a trusted third-party (TTP), on which both UC and C agree [27]. The data of the meter now is collected by the TTP and by the end of each month the TTP charges C depending on the tariff prices defined by UC. In this protocol, UC trusts TTP for the accurate billing of C, while C trusts TTP with its sensitive data. However, in some cases, C may desire an energy management plan by UC, and consequently he makes a clear statement to TTP that allows the latter to release the private data of C to UC. Therefore, it is challenging to formally prove that our trusted smart grid system leaks information only under \(C'\)s decision.

Information Flow Control. [10, 26, 29] is a key approach to ensuring that software systems maintain the confidentiality and/or integrity of their data. Policies for secure information flow are usually formalized as non-interference [29] properties and systems that adhere to the stated policy are guaranteed to admit no flow of information that violates it. However, in many applications information is leaked by intention as in our smart grid example. To deal with such systems, information flow control approaches are usually extended with mechanisms that permit controlled information leakage. The major difficulty imposed by this extension is to formalize notions of security that are able to differentiate between the intentional and the unintentional information leakages in a system.

Contribution. It is therefore natural to extend the enforcement of safety properties of timed automata with the enforcement of appropriate Information Flow policies. It is immediate that the treatment of clocks, the non-determinism, and the unstructured control flow inherent in automata will pose a challenge. More fundamentally there is the challenge that timed automata is an automata-based formalism whereas most approaches to Information Flow take a language-based approach by developing type systems for programming languages with structured control flow or process calculi.

We start by giving the semantics of timed automata (Sect. 2) based on the ones used in UPPAAL [28]. Next, we formalize the security of a timed automaton using a bisimulation relation (Sect. 3). This notion describes the observations of a passive attacker and formally describes where an observation is allowed to leak information and where it is not. To deal with implicit flows we define a general notion of the post-dominator relation [18] (Sect. 4). We then develop a sound algorithm (Sect. 5) that imposes information flow constraints on the clocks and the variables of a timed automaton. We finish with our conclusions (Sect. 6) and the proofs of our main results (Appendix).

Related Work. There are other papers dealing with Information Flow using language based techniques for programs with a notion of time [2, 9, 16, 22] or programs that leak information intentionally [6, 13, 19,20,21, 24]. Our contribution focuses on the challenges of continuous time and the guarded actions of timed automata.

The work of [7, 8] define a notion of non-interference for timed automata with high-level (secret) and low-level (public) actions. Their notion of security is expressed as a non-interference property and it depends on a natural number m, representing a minimum delay between high-level actions such that the low-level behaviors are not affected by the high-level ones. The authors of [17] define a notion of timed non-interference based on bisimulations for probabilistic timed automata which again have high-level (secret) and low-level (public) actions. A somewhat different approach is taken in [12] that studies the synthesis of controllers. None of those approaches considers timed automata that have data variables, nor is their notion of security able to accommodate systems that leak information intentionally.

The authors of [25] take a language-based approach and they define a type-system for programs written in the language Timed Commands. A program in their language gives rise to a timed automaton, and type-checked programs adhere to a non-interference like security property. However, their approach is limited only to automata that can be described by their language and they do not consider information release.

2 Timed Automata

A timed automaton [1, 5] \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\) consists of a set of nodes \(\mathsf{Q}\), a set of annotated edges \(\mathsf{E}\), and a labelling function \(\mathsf{I}\) on nodes. A node \(q_\circ \in \mathsf{Q}\) will be the initial node and the mapping \(\mathsf{I}\) maps each node in \(\mathsf{Q}\) to a condition (to be introduced below) that will be imposed as an invariant at the node.

The edges are annotated with actions and take the form where \(q_s\in \mathsf{Q}\) is the source node and \(q_t\in \mathsf{Q}\) is the target node. The action consists of a guard g that has to be satisfied in order for the multiple assignments to be performed and the clock variables \({{\varvec{r}}}\) to be reset. We shall assume that the sequences \({{\varvec{x}}}\) and \({{\varvec{a}}}\) of program variables and expressions, respectively, have the same length and that \({{\varvec{x}}}\) does not contain any repetitions. To cater for special cases we shall allow to write for the assignments of when \({{\varvec{x}}}\) (and hence \({{\varvec{a}}}\)) is empty; also we shall allow to omit the guard g when it equals \(\mathsf{tt}\) and to omit the clock resets when \({{\varvec{r}}}\) is empty.

It has already emerged that we distinguish between (program) variables x and clock variables (or simply clocks) r. The arithmetic expressions a, guards g and conditions c are defined as follows using boolean tests b:

The arithmetic operators \(\mathsf{{op}}_{a}\) and the relational operators \(\mathsf{{op}}_{r}\) are as usual. For comparisons of clocks we use the operators \(\mathsf{{op}}_{c}\in \{<, \le , =, \ge ,>\}\) in guards and the less permissive set of operators \(\mathsf{{op}}_{d}\in \{<, \le , =\}\) in conditions.

To specify the semantics of timed automata let \(\sigma \) be a state mapping variables to values (which we take to be integers) and let \(\delta \) be a clock assignment mapping clocks to non-negative reals. We then have total semantic functions \([\![{\cdot }]\!]\) for evaluating the arithmetic expressions, boolean tests, guards and conditions; the values of the arithmetic expressions and boolean expressions only depend on the states whereas that of guards and conditions also depend on the clock assignments.

The configurations of the timed automata have the form \(\langle {q},{\sigma },{\delta }\rangle \in \mathbf{Config }\) where \([\![{\mathsf{I}(q)}]\!](\sigma ,\delta )\) is true, and the transitions are described by an initial delay (possibly none) that increases the values of all the clocks followed by an action. Therefore, whenever is in \(\mathsf{E}\) we have the rule:

where d corresponds to the initial delay. The rule ensures that after the initial delay the invariant and the guard are satisfied in the starting configuration and updates the mappings \(\sigma \) and \(\delta \) where \(\delta + d\) abbreviates \(\lambda r.\, \delta (r) + d\). Finally, it ensures that the invariant is satisfied in the resulting configuration. Initial configurations assume that all clocks are initialized to 0 and have the form \(\langle {q_\circ },{\sigma },{\lambda r.0}\rangle \).

Traces. We define a trace from \(\langle {q_s},{\sigma },{\delta }\rangle \) to \(q_t\) in a timed automaton \(\mathsf{TA}\) to have one of three forms. It may be a finite “successful” sequence

in which case at least one step is performed. It may be a finite “unsuccessful” sequence

where \(\langle {q'_n},{\sigma '_n},{\delta '_n}\rangle \) is stuck when there is no action starting from \(\langle {q'_n},{\sigma '_n},{\delta '_n}\rangle \). Finally, it may be an infinite “unsuccessful” sequence

$$\begin{aligned} \begin{array}{ll} &{} \langle {q_s},{\sigma },{\delta }\rangle = \langle {q'_0},{\sigma '_0},{\delta '_0}\rangle \overset{d_1}{\longrightarrow } \cdots \overset{d_n}{\longrightarrow } \langle {q'_n},{\sigma '_n},{\delta '_n}\rangle \overset{d_{n+1}}{\longrightarrow } \cdots \\ &{} \,\, \hbox {such that}\,\, q_t\not \in \{q'_1,\cdots ,q'_n,\cdots \}. \end{array} \end{aligned}$$

We shall write \([\![{\mathsf{TA}}:{q_s}\mapsto {q_t}]\!](\sigma ,\delta )\) for the set of traces from \(\langle {q_s},{\sigma },{\delta }\rangle \) to \(q_t\). We then have the following proposition

Proposition 1

[15]. For a pair \((\sigma ,\delta )\) whenever \([\![{\mathsf{TA}}:{q_s}\mapsto {q_t}]\!](\sigma ,\delta )\) contains only successful traces, then there exists a trace \(t\in [\![{\mathsf{TA}}:{q_s}\mapsto {q_t}]\!](\sigma ,\delta )\) with maximal length.

We also define the delay of a trace t from \(\langle {q_s},{\sigma },{\delta }\rangle \) to \(q_t\) and we have that if t is a successful trace

$$\begin{aligned} \langle {q_s},{\sigma },{\delta }\rangle = \langle {q'_0},{\sigma '_0},{\delta '_0}\rangle \overset{d_1}{\longrightarrow } \cdots \overset{d_n}{\longrightarrow } \langle {q'_n},{\sigma '_n},{\delta '_n}\rangle =\langle {q_t},{\sigma '},{\delta '}\rangle \end{aligned}$$

then

$$\begin{aligned} \begin{array}{l} \varDelta (t)=\mathop {\sum }\nolimits _{i=1}^{n} d_i \end{array} \end{aligned}$$

In the case of t being an unsuccessful (finite or infinite) trace we have that

$$\begin{aligned} \varDelta (t)=\infty \end{aligned}$$

Finally for \((\sigma _1,\delta _1)\), \((\sigma _2,\delta _2)\) whenever for all \(t_1\in [\![{\mathsf{TA}}:{q_s}\mapsto {q_t}]\!](\sigma _1,\delta _1)\) and \(t_2\in [\![{\mathsf{TA}}:{q_s}\mapsto {q_t}]\!](\sigma _2,\delta _2)\) we have that \(\varDelta (t_1)=\varDelta (t_2)\), we will say that \((\sigma _1,\delta _1)\) and \((\sigma _2,\delta _2)\) have the same termination behaviour with respect to \(q_s\) and \(q_t\). Note that it is not necessarily the case that a pair \((\sigma ,\delta )\) has the same termination behaviour as itself.

Fig. 1.
figure 1

The timed automaton SG (and the abbreviations used).

Example 1

To illustrate our development we shall consider an example automaton of a smart grid system as the one described in Sect. 1. The timed automaton SG is given in Fig. 1 and it uses the clocks \(\mathsf{t}\) and \(\mathsf{T}\) to model the time elapse of a day and a month respectively. Between midnight and noon, the electricity data \(\mathsf{ed}\) is aggregated in the variable \(\mathsf{e_1}\), while from noon to midnight the measurements are saved in the variable \(\mathsf{e_2}\). The clock \(\mathsf{r}\) is used to regulate the frequency of the measurements, by allowing one measurement every full hour. At the end of a day (midnight) the last measurement is calculated and the clock \(\mathsf{t}\) is being reset to 0 indicating the start of a new day. At the end of each month (\(\mathsf{T=720}\)) the trusted party TTP collects the data \(\mathsf{e_1}\) and \(\mathsf{e_2}\) of the meter and stores it in the collectors \(\mathsf{c_1}\) and \(\mathsf{c_2}\) respectively. At the same time, the customer C sends a service request \(\mathsf{s}\) to TTP in case he desires to get some analytics regarding his energy consumption. The TTP then requests from the UC the prices \(\mathsf{p_1}\), \(\mathsf{p_2}\) of the electricity tariffs for the two time periods of interest and in case that C has made a request for his data to be analysed (\(\mathsf{s=1}\) otherwise \(\mathsf{s=0}\)), TTP also reveals the collected data \(\mathsf{c_1}\) and \(\mathsf{c_2}\) to the UC where the latter stores them in the variables \(\mathsf{y_1}\) and \(\mathsf{y_2}\) respectively. The UC then responds back to the TTP by sending the values \(\mathsf{v_1}\) and \(\mathsf{v_2}\) of the electricity tariffs and also the result \(\mathsf{z}\) of \(C'\)s data analytics in case C made a request for that, otherwise it sends the value 0. Once the TTP receives everything \((\mathsf{f}=1)\) he calculates the bill \(\mathsf{b}\) for C, sends it to him together with the analysis result \(\mathsf{a}\) (C stores it in \(\mathsf{x}\)), the clocks and the variables of the meter are being reset to 0 and a new month starts. For simplicity here we assume that all the calculations done by the TTP and the UC by the end of the month are being completed in zero time.

3 Information Flow

We envisage that there is a security lattice expressing the permissible flows [10]. Formally this is a complete lattice and the permitted flows go in the direction of the partial order. In our development, it will contain just two elements, L (for low) and H (for high), and we set \(L \sqsubseteq H\) so that only the flow from H to L is disallowed. For confidentiality, one would take L to mean public and H to mean private and for integrity one would take L to mean trusted and H to mean dubious.

A security policy is then expressed by a mapping \(\mathcal{L}\) that assigns an element of the security lattice to each program variable and clock variable. An entity is called high if it is mapped to H by \(\mathcal{L}\), and it is said to be low if it is mapped to L by \(\mathcal{L}\). To express adherence to the security policy we use the binary operation \(\rightsquigarrow \) defined on sets \(\chi \) and \(\chi '\) (of variables and clocks):

$$\begin{aligned} \chi \rightsquigarrow \chi ' \Leftrightarrow \forall u\in \chi : \forall u'\in \chi ': \mathcal{L}(u) \sqsubseteq \mathcal{L}(u') \end{aligned}$$

This expresses that all the entities of \(\chi \) may flow into those of \(\chi '\); note that if one of the entities of \(\chi \) has a high security level then it must be the case that all the entities of \(\chi '\) have high security level.

Example 2

Returning to Example 1 of our smart grid system, we have that \(\mathcal{L}\) maps the program variable \(\mathsf{ed}\) of the electricity data, the variables \(\mathsf{e_1},\mathsf{e_2}\) that store this data, the collectors \(\mathsf{c_1},\mathsf{c_2}\) and the bill \(\mathsf{b}\) to the security level H, while the rest of the program variables and clocks are mapped to L.

Information flow control enforces a security policy by imposing constraints of the form \(\{y\}\rightsquigarrow \{x\}\) whenever the value of y may somehow influence (or flow into) that of x. Traditionally we distinguish between explicit and implicit flows as explained below. As an example of an explicit flow consider a simple assignment of the form . This gives rise to a condition \(\mathsf{fv}({a}) \rightsquigarrow \{x\}\) so as to indicate that the explicit flow from the variables of a to the variable x must adhere to the security policy: if a contains a variable with high security level then x also must have high security level. For an example of an implicit flow consider a conditional assignment where x is assigned the constant value 0 in case g evaluates to true. This gives rise to a condition \(\mathsf{fv}({g}) \rightsquigarrow \{x\}\) so as to indicate that the implicit flow from the variables of g to the variable x must adhere to the security policy: if g contains a variable with high security level then x also must have high security level.

As has already been explained, many applications as our smart grid example inevitably leak some information. In this paper we develop an approach to ensure that the security policy is adhered to by the timed automaton of interest, however in certain conditions it can be bypassed. Thus, for a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\), we shall assume that there exists a set of observable nodes \(Y\subseteq \mathsf{Q}\), that are the nodes where the values of program variables and clocks with low security are observable by an attacker. The observable nodes will be described by the union of two disjoint sets \(Y_s\) and \(Y_w\), where a node q in \(Y_s\) (\(Y_w\) resp.) will be called strongly observable (weakly observable resp.). The key idea is to ensure that \(\{x\}\rightsquigarrow \{y\}\) whenever there is an explicit flow of information from x to y (as illustrated above) or an implicit flow from x to y in computations that lead to strongly observable nodes, while computations that lead to weakly observable nodes are allowed to bypass the security policy \(\mathcal{L}\).

To overcome the vagueness of this explanation we need to define a semantic condition that encompasses our notion of permissible information flow, where information leakage occurs only at specific places in our automaton.

Observable Steps. Since the values of low program variables and clocks are only observable at the nodes in Y, we collapse the transitions of the automaton that lead to non-observable nodes into one. Thus we have an observable successful step

$$\begin{aligned} \langle {q_s},{\sigma },{\delta }\rangle \overset{D}{\Longrightarrow }_{{Y}}\langle {q_t},{\sigma '},{\delta '}\rangle \end{aligned}$$

whenever there exists a successful trace t

$$ \begin{array}{l l} \langle {q_s},{\sigma },{\delta }\rangle = \langle {q_0},{\sigma _0},{\delta _0}\rangle \overset{d_1}{\longrightarrow } \cdots \overset{d_n}{\longrightarrow } \langle {q_n},{\sigma _n},{\delta _n}\rangle =\langle {q_t},{\sigma '},{\delta '}\rangle&\quad (n>0) \end{array} $$

from \(\langle {q_s},{\sigma },{\delta }\rangle \) to \(q_t\) in \(\mathsf{TA}\) and \(q_t\in Y\), \(D=\varDelta (t)\) and \(\forall i\in \{1,...,n-1\}:q_i\not \in Y\).

And we have an observable unsuccessful trace

$$\begin{aligned} \langle {q_s},{\sigma },{\delta }\rangle \overset{\infty }{\Longrightarrow }_{{Y}}\bot \end{aligned}$$

whenever there exists an unsuccessful finite trace

$$ \begin{array}{l l} \langle {q_s},{\sigma },{\delta }\rangle = \langle {q_0},{\sigma _0},{\delta _0}\rangle \overset{d_1}{\longrightarrow } \cdots \overset{d_n}{\longrightarrow } \langle {q_n},{\sigma _n},{\delta _n}\rangle&\quad (n\ge 0) \end{array} $$

or an unsuccessful infinite trace

$$ \langle {q_s},{\sigma },{\delta }\rangle = \langle {q_0},{\sigma _0},{\delta _0}\rangle \overset{d_1}{\longrightarrow } \cdots \overset{d_n}{\longrightarrow } \langle {q_n},{\sigma _n},{\delta _n}\rangle \overset{d_{n+1}}{\longrightarrow } \cdots $$

from \(\langle {q_s},{\sigma },{\delta }\rangle \) to any of the nodes in Y and \(\forall i>0:q_i\not \in Y\). From now on it should be clear that a configuration \(\gamma \) will range over \(\mathbf{Config } \cup \{\bot \}\).

We write \((\sigma ,\delta )\equiv (\sigma ' ,\delta ')\) to indicate that the two pairs are equal on low variables and low clocks:

It is immediate that this definition of \(\equiv \) gives rise to an equivalence relation. Intuitively \(\equiv \) represents the view of a passive attacker as defined in [24], a principal that is able to observe the computations of a timed automaton and deduce information.

We will now define our security notion with the use of a bisimulation relation. Our notion shares some ideas from [19, 21], where a bisimulation-based security is defined for a programming language with threads. In their approach, the bypassing of the security policy is localized on the actions, and that is because their attacker model is able to observe the low variables of a program at any of its computation steps (e.g. in a timed-automaton all of the nodes would have been observable). In contrast to [19, 21], we localize bypassing of policies at the level of the nodes, while we also define a more flexible notion of security with respect to the attacker’s observability.

Definition 1

(\(Y-\)Bisimulation). For a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\) and a set of nodes \(Y=Y_s\cup Y_w\), a relation \(\simeq _Y \subseteq (\mathbf{Config } \cup \{\bot \}) \times (\mathbf{Config } \cup \{\bot \})\) will be called a \(Y-\)bisimulation relation if \(\simeq _Y\) is symmetric and we have that if \(\gamma _1=\langle {q_1},{\sigma _1},{\delta _1}\rangle \simeq _Y \langle {q_2},{\sigma _2},{\delta _2}\rangle =\gamma _2\) then

where node(\(\langle {q},{\sigma },{\delta }\rangle )=q\), pair(\(\langle {q},{\sigma },{\delta }\rangle )=({\sigma },{\delta })\), and if \(\gamma _1 \simeq _Y \gamma _2\) then

$$\begin{aligned}{}\begin{array}[t]{l} (\gamma _1=\bot \Leftrightarrow \gamma _2=\bot ) \end{array} \end{aligned}$$

We write \(\sim _Y\) for the union of all the \(Y-\)bisimulations and it is immediate that this definition of \(\sim _Y\) is both a \(Y-\)bisimulation and an equivalence relation. Intuitively, when two configurations are related in \(\sim _Y\), and they are low equivalent then they produce distinguishable pairs of states only at the weakly observable nodes. Otherwise, observations made at strongly observable nodes should be still indistinguishable. In both cases, the resulting configurations of two \(Y-\)bisimilar configurations should also be \(Y-\)bisimilar. We are now ready to define our security notion.

Definition 2

(Security of Timed Automata). For a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\) and a set \(Y=Y_s\cup Y_w\) of observable nodes, we will say that \(\mathsf{TA}\) satisfies the information security policy \(\mathcal {L}\) whenever:

Whenever \(Y_w=\emptyset \) our notion of security coincides with standard definitions of non-interference [29], where an automaton that satisfies the information security policy \(\mathcal {L}\) does not leak any information about its high variables.

Example 3

For the smart grid automaton \(\mathsf{SG}\) of the Example 1, we have the set of observable nodes \(Y=\{2,3,4\}\), where the strongly observable ones are the nodes 2 and 4 (\(Y_s=\{2,4\}\)), and the weakly one is the node 3 (\(Y_w=\{3\}\)), where the TTP is allowed to release the secret information of C.

4 Post-dominators

For the implicit flows arising from conditions, we are interested in finding their end points (nodes) that are the points where the control flow is not dependent on the conditions anymore. For that, we define a generalized version of the post-dominator relation and the immediate post-dominator relation [18].

Paths. A path \(\pi \) in a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\) is a finite (\(n\ge 0\)) or infinite sequence of nodes and actions such that \(\forall i>0: (q_{i-1},act_{i},q_{i})\in \mathsf{E}\). We say that a path is trivial if \(\pi =q_0\) and we say that a node q belongs to the path \(\pi \), or \(\pi \) contains q, and we will write \(q\in \pi \), if there exists some i such that \(q_i=q\). For a finite path we write (\(i\le n)\) for the suffix of \(\pi \) that starts at the i-th position and we usually refer to it as the i-th suffix of \(\pi \). Finally, for a node q and a set of nodes \(Y\subseteq \mathsf{Q}\) we write

for the set of all the non-trivial finite paths that start at q, end at a node y in Y and all the intermediate nodes of the path do not belong in Y.

Definition 3

(Post-dominators). For a node q and a set of nodes \(Y\subseteq \mathsf{Q}\) we define the set

$$\begin{aligned} pdom_{Y}{} { (q)} = \{q' \, | \, \forall \pi \in \varPi _{(q,Y)}:q'\in \pi (1)\} \end{aligned}$$

and whenever \(q'\in \) pdom\(_{Y}\)(q), we will say that \(q'\) is a Y post-dominator of q.

Intuitively whenever a node \(q'\) is a Y post-dominator of a node q it means that every non-trivial path that starts at q has to visit \(q'\) before it visits one of the nodes in Y. We write pdom\(_{y}\)(q) whenever \(Y=\{y\}\) is a singleton and we have the following facts

Fact 1

For a set of nodes \(Y\subseteq \mathsf{Q}\) and for a node q we have that

Fact 2

The post-dominator set for a singleton set \(\{y\}\) can be computed by finding the greatest solution of the following data-flow equations:

figure a

For a node q, we are interested in finding the Y post-dominator “closest” to it.

Definition 4

For a node q and a set of nodes Y we definite the set

$$\begin{aligned} \begin{array}{l l l} \text {ipdom}_Y(q)=\{q'\in pdom_Y(q)| &{} pdom_Y(q)=\{q'\}\vee &{} \\ &{} q'\not \in Y\wedge (\forall q''\in pdom_Y(q):&{} q''\ne q' \Rightarrow \\ &{} &{} q''\in pdom_Y(q'))\} \end{array} \end{aligned}$$

and a node \(q'\in \text {ipdom}_Y(q)\) will be called an immediate Y post-dominator of q.

The following fact gives us a unique immediate Y post-dominator for the nodes that can reach Y (\(\varPi _{(q,Y)}\ne \emptyset \)). Intuitively this unique immediate Y post-dominator of a node q is the node that is the “closest” Y post-dominator of q, meaning that in any non-trivial path starting from q and ending in Y, the Y immediate post-dominator of q will always be visited first before any other Y post-dominator of q.

Fact 3

For a set of nodes Y and a node q, whenever \(\varPi _{(q,Y)}\ne \emptyset \) and pdom\(_Y\)(q\(\ne \emptyset \) then there exists node \(q'\) such that ipdom\(_Y(q)=\{q'\}\).

For simplicity, whenever a node \(q'\) is the unique immediate Y post-dominator of a node q and \(\varPi _{(q,Y)}\ne \emptyset \) we shall write ipd\(_{{Y}}({q})\) for \(q'\) and we will say that the unique immediate Y post-dominator of q is defined. For any other case where q can either not reach Y (\(\varPi _{(q,Y)}= \emptyset \)) or pdom\(_Y(q)=\emptyset \) we will say that the unique immediate post-dominator of q is not defined.

Example 4

For the timed automaton \(\mathsf{SG}\) and for the set of observable nodes \(Y=\{2,3,4\}\), we have that for q being 1, 3 and 4 while . Therefore for the nodes 1,3 and 4 their unique immediate Y post-dominator is defined and it is the node 2, while the unique immediate Y post-dominator of the node 2 is not defined.

5 Algorithm for Secure Information Flow

We develop an algorithm (Fig. 2) that traverses the graph of a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\) and imposes information flow constraints on the program variables and clocks of the automaton with respect to a security policy \(\mathcal L\) and a Y post-dominator relation, where \(Y=Y_s\cup Y_w\) is the set of observable nodes. Before we explain the algorithm we start by defining some auxiliary operators.

Auxiliary Operators. For an edge we define the auxiliary operator \(\mathsf{ass}({.})\), \(\mathsf{expr}({.})\) and \(\mathsf{con}({.})\) as

where \(\mathsf{ass}({.})\) gives the modified variables and clocks of the assignment performed by \(\mathsf{TA}\) using that edge, \(\mathsf{expr}({.})\) gives the expressions used for the assignment, and the operator \(\mathsf{con}({.})\) returns the condition that has to hold in order for the assignment to be performed. We finally lift the \(\mathsf{ass}({.})\) operator to finite paths and thus for a finite path \(\pi =q_0act_1q_1...q_{n-1}act_nq_n\) we define the auxiliary operators \(\mathsf{Ass}({.})\) as

$$\begin{aligned} \begin{array}{rcl} \mathsf{Ass}({q_0act_1q_1...q_{n-1}act_nq_n})= & {} \mathop {\bigcup }\nolimits _{i=1}^n \mathsf{ass}({(q_{i-1},act_i,q_i)}) \end{array} \end{aligned}$$

We write

for the set of nodes, where whenever the automaton performs a successful observable step starting from a node and ending in an observable node \(q'\in Y\), then it is always the case that \(q'\) is weakly observable.

Fig. 2.
figure 2

Security of \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\) with respect to \(\mathcal L\) and the Y post-dominator relation

. We start by looking at the nodes in . According to our security notion (Definition 2), for two low equivalent configurations at a node q, whenever the first one performs a successful (or unsuccessful) observable step that ends at a weakly observable node, then also the second should be able to perform an observable step that ends at a weakly observable node (or an unsuccessful one resp.). For that, the condition C1 (a) first requires that the conditions of the outgoing edges in \(\mathsf{E}_q\) where \(\mathsf{E}_q=\{(q,act,q')|(q,act,q')\in \mathsf{E}\}\) contain only low variables. However, this is not enough.

Fig. 3.
figure 3

Example automata (a) (top) and (b) (bottom)

To explain the rest of the constraints imposed by the condition C1 (a) consider the automaton (a) of Fig. 3, where the node 3 is weakly observable, \(\mathsf{h}\) and \(\mathsf{l}\) is a high and a low variable respectively, and all the invariants of the nodes are set to \(\mathsf{tt}\). This automaton is not secure with respect to Definition 2. To see this, we have \(([\mathsf{l}\mapsto 0,\mathsf{h}\mapsto 1],\delta )\equiv ([\mathsf{l}\mapsto 0,h\mapsto 0],\delta )\) (for some clock state \(\delta \)) but the pair \(([\mathsf{l}\mapsto 0,\mathsf{h}\mapsto 1],\delta )\) always produces \(\bot \) since we will have an infinite loop at the node 2, while \(([\mathsf{l}\mapsto 0,\mathsf{h}\mapsto 0],\delta )\) always terminates at the node 3. That is because even if both edges of the node 2 contain only the low variable \(\mathsf{l}\) in their condition, the assignment bypasses the policy \(\mathcal{L}\) and thus, right after it, the two pairs stop being low equivalent.

As another example, consider the automaton (b) of Fig. 3. Here the node 4 is weakly observable, \(\mathsf{h}\) is a high variable, \(\mathsf{l}\), \(\mathsf{l}'\) are two low variables and all the invariants of nodes are set to \(\mathsf{tt}\) again. We have \(([\mathsf{l}\mapsto 0,\mathsf{l}'\mapsto 0,\mathsf{h}\mapsto 1],\delta )\equiv ([\mathsf{l}\mapsto 0,\mathsf{l}'\mapsto 0,\mathsf{h}\mapsto 0],\delta )\) (for some clock state \(\delta \)) and again the first pair produces \(\bot \) by looping at the node 3, whereas the second pair always terminates. Here even if the variable \(\mathsf{l}\) is not used in any condition after the assignment , it influences the value of \(\mathsf{l}'\) and consequently, since \(\mathsf{l}'\) appears on the condition of the edges of the node 3 we get this behavior.

To cater for such cases, for an edge we first define the predicate

$$\begin{aligned} A_e=\bigwedge _{i} \mathsf{fv}({a_i})\rightsquigarrow \{x_i\} \end{aligned}$$

that takes care of the explicit flows arising from the assignments. We then define

$$\begin{aligned} \varPi _{(e,Y)}=\{\pi |e=(q_0,act_1,q_1): \pi =q_0act_1q_1...q_{n-1}act_nq_n\in \varPi _{(q_0,Y)}\} \end{aligned}$$

to be set of paths (the ones defined in Sect. 4) that start with e and end in Y, and all the intermediate nodes do not belong to Y. Finally, whenever an assignment bypasses the security policy \(\mathcal{L}\) due to an explicit flow and thus \(A_e\) is false, we then impose the predicate

$$\begin{aligned} \begin{array}{l l l} \varPsi _e&{}=&{} \forall \pi \in \varPi _{(e,Y)}: \forall q' \in \pi (1):\\ &{} &{} q'\not \in Y\Rightarrow (\forall e' \in E_{q'}: (\mathsf{ass}({e})\setminus R) \cap (\mathsf{fv}({\mathsf{con}({e'})})\cup \mathsf{fv}({\mathsf{expr}({e'})}))=\emptyset ) \end{array} \end{aligned}$$

The predicate \(\varPsi _e\) demands that the assigned program variables of \(e=(q_s,act,q_t)\) cannot be used in any expression or condition that appears in a path that starts with \(q_t\) and goes to an observable node. Note here that even if \(\varPsi _e\) quantifies over a possibly infinite set of paths (\(\varPi _{(e,Y)}\)), it can be computed in finite time by only looking at the paths where each cycle occurs at most once.

We will now look at the nodes where the automaton may perform a successful observable step that ends in a strongly observable node. Those nodes are described by the set , that is the complement of .

. For a node q in , whose immediate Y post-dominator is defined, condition C2 (a) takes care of the explicit and the implicit flows generated by the assignment and the control dependencies respectively, arising from the edges of q. Note here that we do not propagate the implicit flows any further after ipd\(_{{Y}}({q})\). This is because ipd\(_{{Y}}({q})\) is the point where all the branches of q are joining and any further computation is not control-dependent on them anymore. Those constraints are along the line of Denning’s approach [10] of the so-called block-labels.

Fig. 4.
figure 4

Example automaton (c)

To understand condition C2 (b) consider the automaton (c) of Fig. 4, where \(\mathsf{h}\) and \(\mathsf{l}\) is a high and a low variable respectively, the node 2 is strongly observable, and both nodes 1 and 2 have their invariant set to \(\mathsf{tt}\). Next take \(([\mathsf{l}\mapsto 0,\mathsf{h}\mapsto 1],\delta )\equiv ([\mathsf{l}\mapsto 0,\mathsf{h}\mapsto 0],\delta )\) (for some clock state \(\delta \)) and note that the first pair can result in a configuration in 2 with \(([l\mapsto 0,h\mapsto 1],\delta )\) (taking the top branch) while the second pair always ends in 2 with \([l\mapsto 1,h\mapsto 0]\). Therefore this automaton is not secure with respect to our Definition 2.

To take care of such behaviours we write \(\underline{{\text {sat}}}(\cdots )\) to express the satisfiability of the \(\cdots \) formula. Whenever there are two branches (induced by the edges e and \(e'\) both leaving q) that are not mutually exclusive (that is, where \(\underline{{\text {sat}}}(\mathsf{con}({e}) \wedge \mathsf{con}({e'}))\) we make sure to record the information flow arising from bypassing the branch that would otherwise perform an assignment. This is essential for dealing with non-determinism.

Fact 4

For a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\), we have that if

$$\begin{aligned} \langle {q},{\sigma },{\delta }\rangle \overset{D}{\Longrightarrow }_{{\{q'\}}}\langle {q'},{\sigma '},{\delta '}\rangle \end{aligned}$$

then

where e corresponds to the initial edge of this observable step.

Condition C2 (c) takes care of cases where a timing/termination side channel [2] could have occurred.

Fig. 5.
figure 5

Example automaton (d)

As an example of such a case consider the automaton (d) of Fig. 5, where \(\mathsf{h}\) and \(\mathsf{t}\) is a high program variable and a low clock respectively, node 2 is strongly observable and both 1 and 2 have their invariant set to \(\mathsf{tt}\). Next, for \(([\mathsf{h}\mapsto 1],[\mathsf{t}\mapsto 0])\equiv ([\mathsf{h}\mapsto 0],[\mathsf{t}\mapsto 0])\) we have that the first pair always delays at least 30 units and ends in 2 with a clock state that has \(t>30\), whereas the second pair can go to 2, taking the lower branch immediately without any delay, and thus the resulting pairs will not be low equivalent. To take care of such behaviours, we stipulate a predicate \(\varPhi _q\) such that

Using this predicate we demand that whenever the \(\mathsf{TA}\) does not have a “constant” termination behavior from the node q to the node ipd\(_{{Y}}({q})\), then variables that influence the termination behavior should not be of high security level.

. We are now left with the nodes in , whose immediate Y post-dominator is not defined. Since for such a node q, we cannot find a point (the unique immediate Y post-dominator) where the control dependencies from the branches of q end, condition C3 (a) requires that the conditions of the edges of q should not be dependent on high security variables.

Condition C3 (b) caters for the explicit flows, of an edge e using the predicate \(A_e\). However we are allowed to dispense \(A_e\), whenever further computations after taking the edge e may lead only to weakly observable nodes and \(\varPsi _e\) holds. To express this for an edge we write

whenever .

Example 5

Consider now the automaton \(\mathsf{SG}\) of Example 1, and the Y post-dominator relation of Example 4.

We have that the nodes 1, 3 and 4 are in and also that their immediate unique Y post-dominator is defined. Condition C2 (a) and C2 (b) impose the following constraints

$$\begin{aligned} \begin{array}{l} \{\mathsf{T},\mathsf{t},\mathsf{r}\}\rightsquigarrow \{\mathsf{ed},\mathsf{e_i},\mathsf{c_i},\mathsf{s},\mathsf{t},\mathsf{r}\}, \{\mathsf{ed},\mathsf{e_i}\}\rightsquigarrow \{\mathsf{e_i}\}, \{\mathsf{e_i}\}\rightsquigarrow \{\mathsf{c_i}\}, \{\mathsf{v_i}\}\rightsquigarrow \{\mathsf{p_i}\}\quad (i=1,2) \\ \{\mathsf{T}\}\rightsquigarrow \{\mathsf{p_1},\mathsf{p_2},\mathsf{a},\mathsf{f}\}, \{\mathsf{z}\}\rightsquigarrow \{\mathsf{a}\}, \{\}\rightsquigarrow \{\mathsf{s},\mathsf{f},\mathsf{a}\}\\ \end{array} \end{aligned}$$

Finally, for the node 1, because \(\varPhi _1\) (C2 (c)) all the clocks need to be of low security level.

Next, the node 2 is in and since its unique immediate Y post-dominator is not defined, condition C3 (b) impose the constraints

$$\begin{aligned} \begin{array}{l} \{\mathsf{p_1},\mathsf{p_2},\mathsf{c_1},\mathsf{c_2}\}\rightsquigarrow \{\mathsf{b}\}, \{\mathsf{a}\}\rightsquigarrow \{\mathsf{x}\}, \{\}\rightsquigarrow \{\mathsf{e_1},\mathsf{e_2},\mathsf{f}\} \end{array} \end{aligned}$$

and condition C3 (a) imposes that \(\mathsf{T}\), \(\mathsf{s}\) and \(\mathsf{f}\) should be of low security level. Notice here that since for the edge that releases the sensitive information of C we have that , we are not imposing the constraint \(\{\mathsf{c}_\mathsf{i}\}\rightsquigarrow \{\mathsf{y}_\mathsf{i}\}\) (\(i=(1,2)\)). Those constraints are easy to verify for the security assignment of Example 2.

Now if we were to change the node 3 from being a weakly observable to a strongly observable node, the automaton \(\mathsf{SG}\) will not be secure with respect to Definition 2. In that case our algorithm will reject it, since for the edge e we would have that and the predicate \(A_e\) would have resulted in false.

Finally, we shall write \(\text {sec}_{Y,\mathcal {L}}(\mathsf{TA})\) whenever the constraints arising from our algorithm (Fig. 2) are satisfied and thus we have the following lemmas

Lemma 1

For a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\), if \(\text {sec}_{Y,\mathcal {L}}(\mathsf{TA})\) then for \((\sigma _1,\delta _1)\), \((\sigma _2,\delta _2)\) such that \([\![{\mathsf{I}(q)}]\!](\sigma _1,\delta _1)\) and \([\![{\mathsf{I}(q)}]\!](\sigma _2,\delta _2)\) and \((\sigma _1,\delta _1)\equiv (\sigma _2,\delta _2)\) we have that

Lemma 2

For a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\), if \(\text {sec}_{Y,\mathcal {L}}(\mathsf{TA})\) then for \((\sigma _1,\delta _1)\), \((\sigma _2,\delta _2)\) such that \([\![{\mathsf{I}(q)}]\!](\sigma _1,\delta _1)\) and \([\![{\mathsf{I}(q)}]\!](\sigma _2,\delta _2)\) and \((\sigma _1,\delta _1)\equiv (\sigma _2,\delta _2)\) we have that

$$\begin{aligned} \text {if } \langle {q},{\sigma _1},{\delta _1}\rangle \overset{\infty }{\Longrightarrow }_{{Y}}\bot \text { then also } \langle {q},{\sigma _2},{\delta _2}\rangle \overset{\infty }{\Longrightarrow }_{{Y}}\bot \end{aligned}$$

The following theorem concludes the two lemmas from above to establish the soundness of our algorithm with respect to the notion of security of Definition 2.

Theorem 1

For a timed automaton \(\mathsf{TA}=(\mathsf{Q},\mathsf{E},\mathsf{I},q_\circ )\), if \(\text {sec}_{Y,\mathcal {L}}(\mathsf{TA})\) then \(\mathsf{TA}\) satisfies the information security policy \(\mathcal {L}\).

6 Conclusion

We have shown how to successfully enforce Information Flow Control policies on timed automata. This has facilitated developing an algorithm that prevents unnecessary label creep and that deals with non-determinism, non-termination, and continuous real-time. The algorithm has been proved sound by means of a bisimulation result, that allows controlled information leakage.

We are exploring how to automate the analysis and in particular how to implement (a sound approximation of) the \(\varPhi _q\) predicate. There has been a lot of research [3, 4] done for determining the maximum \((max_t)\) or minimum \((min_t)\) execution time that an automaton needs to move from a location \(q_s\) to a location \(q_t\). One possibility is to make use of this work [3, 4] and thus the predicate \(\varPhi _q\) would amount to checking if the execution time between the two nodes of interest (q and ipd\(_{{Y}}({q})\)) is constant (e.g. \(max_t=min_t)\).

A longer-term goal is to allow policies to simultaneously deal with safety and security properties of cyberphysical systems.