1 Introduction

The privacy of GSM cellular telephony is protected by the A5 family of cryptosystems. The first two members of this family, the stream ciphers A5/1 (developed primarily for European markets) and A5/2 (developed primarily for export markets), were designed in the late 1980s in an opaque process and were kept secret until they were reverse engineered in 1999 from actual handsets [12]. Once they were published, it became clear that A5/2 provided almost no security, and that A5/1 could be attacked with practical complexity by a variety of techniques (e.g., [2, 3, 10, 13]). In particular, in December 2009 a team of cryptographers led by Karsten Nohl published a 2-TByte rainbow table for A5/1, which makes it easy to derive the session key of any particular conversation with minimal delay and hardware support [1].

In response to these developments, the GSM Association decided to design a new block cipher with 128-bit keys called KASUMI [24], and to use it for both secrecy and authentication purposes, deploying newly developed modes of operation. This time, the process was significantly more open, and resulted in two ways to deploy KASUMI: A5/3 (using a simplified 64-bit key version of KASUMI) which is mandatory in all new handsets, and A5/4 (using the full 128-bit key version of KASUMI) which is optional and does not seem to be in use by any operator. In UMTS (3G) cellular networks, there are two possible encryption algorithms which are both mandatory on all handsets: UEA1 which is based on 128-bit KASUMI, and UEA2 which is based on 128-bit SNOW 3G. A5/3 and UEA1 are already implemented in a majority of the five billion available handsets, and thus KASUMI had become one of the most widely deployed cryptosystems in the world, and its security had become one of the most important practical issues in cryptography.

The KASUMI block cipher is based on the MISTY block cipher which was published at FSE 1997 by Matsui [20]. It has 64-bit blocks, 128-bit keys, and a complex recursive Feistel structure with 8 rounds, each one of which consists of 3 rounds, each one of which has 3 rounds of nonlinear S-box operations. MISTY withstood 15 years of cryptanalytic efforts, and only recently has a first attack faster than exhaustive search on its full version appeared, with a completely impractical complexity of \(2^{125}\) [16]. However, the designers of A5/3 decided to make MISTY faster and more hardware-friendly by simplifying its key schedule and modifying some of its components. In [25], the designers provide a rationale for each one of these changes, and in particular they analyze the resistance of KASUMI against related-key attacks [4] by stating that “removing all the FI functions in the key scheduling part makes the hardware smaller and/or reduces the key set-up time. We expect that related-key attacks do not work for this structure.” The best attack found by the designers and external evaluators of KASUMI is described as follows: “There are chosen plaintext and/or related-key attacks against KASUMI reduced to 5 rounds. We believe that with further analysis it might be possible to extend some attacks to 6 rounds, but not to the full 8-round KASUMI.”

The existence of better related-key attacks on the full KASUMI was already shown in [7]. The attack of [7] had a data complexity of \(2^{54.6}\) and a time complexity of \(2^{76.1}\), which are impractical but better than exhaustive search. In this paper we develop a new attack, which requires only 4 related keys, \(2^{26}\) data, \(2^{30}\) bytes of memory, and \(2^{32}\) time. Since these complexities are so low, we verified our attack experimentally, and our unoptimized implementation on a single core of an old PC recovered about 96 key bits in a few minutes, and the complete 128-bit key in less than two hours. Careful analysis of our attack technique indicates that it cannot be applied against the original MISTY, since it exploits a sequence of coincidences and lucky strikes which were created when MISTY was changed to KASUMI by ETSI’s SAGE task force working for the GSM Association. This calls into question the design of KASUMI, and especially its simplified key schedule.

In this paper, we develop a new type of attack which is an improved version of the boomerang attack introduced in [26]. We call it a “sandwich attack,” since it uses a distinguisher which is divided into three parts: A thick slice (“bread”) at the top, a thin slice (“meat”) in the middle, and a thick slice (“bread”) at the bottom. The top and bottom parts are assumed to have high probability differential characteristics, which can be combined into a quartet by the standard boomerang technique. However, in our case they are separated by an additional middle slice, which can significantly reduce the probability of the resulting boomerang structure. Nevertheless, as we show in this paper, careful analysis of the dependence between the top and bottom differentials allows us in some cases to combine the two properties above and below the middle slice with an enhanced probability. In particular, we show that in the case of KASUMI we can use top and bottom 3-round differential characteristics with an extremely high probability of \(2^{-2}\) each, and combine them via a middle 1-round slice in such a way that the “cost in probability” of the combination is \(2^{-6}\), instead of the \(2^{-32}\) we would expect from a naive analysis. This increases the probability of our 7-round distinguisher from \(2^{-40}\) to \(2^{-14}\), and significantly reduces the data and time complexities of the attack. Such a three-level structure was used in several previous attacks such as [8, 9] (where it was called the “Feistel switch” or the “middle-round S-box trick”), but to the best of our knowledge it was always used in the past in simpler situations in which the transition probability through the middle layer (in at least one direction) was 1 due to the structural properties of a single Feistel round, or due to the particular construction of a given S-box.
Our sandwich attack is the first non-trivial application of such a structure, and the delicacy of the required probabilistic analysis is demonstrated by the fact that a tiny change in the key schedule of KASUMI or in the differentials (neither of which has any effect on the differential probabilities of the top and bottom layers) can change the probability of the combined distinguisher from the surprisingly high value of \(2^{-14}\) to 0.

We note that after the sandwich technique was presented in the Crypto 2010 version of our paper, it was successfully applied to attack the MMB block cipher in [15]. We expect that other uses of this technique will be found in the future.

This paper is organized as follows: Section 2 describes the new sandwich attack, along with a chosen-plaintext variant which we call “rectangle-like sandwich attack,” and discusses the transition between the top and bottom parts of the cipher through the middle slice of the sandwich. Section 3 describes the KASUMI block cipher. Section 4 describes our new 7-round distinguisher for KASUMI which has a probability of \(2^{-14}\), and demonstrates its extreme sensitivity to tiny structural modifications. In Sect. 5 we use the new distinguisher to develop a practical-time key recovery attack on the full KASUMI cryptosystem. Finally, Section 6 concludes the paper.

2 Sandwich Attacks

In this section we describe the technique used in our attacks on KASUMI. We start with a description of the basic (related-key) boomerang attack, and then describe a new framework, which we call a (related-key) sandwich attack, that exploits the dependence between the underlying differentials to obtain a more accurate estimation of the probability of the distinguisher. Finally, we describe the chosen plaintext variant of the attack, which we call (related-key) rectangle-like sandwich attack. We note that the idea of using dependence between the differentials in order to improve the boomerang distinguisher was implicitly proposed by Wagner [26], and was also used in some simple scenarios in [8, 9]. Therefore, our framework can be considered as a formal treatment and generalization of the ideas proposed in [8, 9, 26].

2.1 The Basic Related-Key Boomerang Attack

The related-key boomerang attack was introduced by Kim et al. [14, 18], and independently by Biham et al. [6], as a transformation of the boomerang attack [26] to the related-key differential settings [17]. In this attack, the cipher is treated as a cascade of two sub-ciphers \(E = E_1 \circ E_0\), and related-key differentials of \(E_0\) and \(E_1\) are combined into an adaptive chosen plaintext and ciphertext distinguisher for E.

Let us assume that there exists a related-key differential \(\alpha \rightarrow \beta\) for \(E_0\) under key difference \(\Delta K_{ab}\) with probability p (i.e., \({\bf Pr}[E_{0(K)}(P) \oplus E_{0(K \oplus \Delta K_{ab})}(P \oplus \alpha) = \beta] = p\), where \(E_{0(K)}\) denotes encryption through \(E_0\) under the key K and the probability is taken over all possible plaintexts and keys). Similarly, we assume that there exists a related-key differential \(\gamma \rightarrow \delta\) for \(E_1\) under key difference \(\Delta K_{ac}\) with probability q. The related-key boomerang distinguisher requires encryption/decryption under the secret key \(K_a\), and under the related keys \(K_b = K_a \oplus \Delta K_{ab}\), \(K_c = K_a \oplus \Delta K_{ac}\), and \(K_d = K_c \oplus \Delta K_{ab} = K_b \oplus \Delta K_{ac}\).

The attack is based on the following process:

  1. Pick a random plaintext \(P_a\), and let \(P_b = P_a \oplus \alpha\).

  2. Ask for the ciphertexts \(C_{a} = E_{K_{a}} (P_{a})\) and \(C_{b} = E_{K_{b}} (P_{b})\). Denote \(C_c = C_a \oplus \delta\) and \(C_d = C_b \oplus \delta\).

  3. Ask for the plaintexts \(P_{c} = E^{-1}_{K_{c}} (C_{c})\) and \(P_{d} = E^{-1}_{K_{d}} (C_{d})\).

  4. Check whether \(P_c \oplus P_d = \alpha\).

The probability that the pair \((P_a, P_b)\) is a right pair with respect to the first differential (i.e., the probability that the intermediate difference after \(E_0\) equals β, as predicted by the differential) is p. Assuming independence, the probability that both pairs \((C_a, C_c)\) and \((C_b, C_d)\) are right pairs with respect to the second differential is \(q^2\). If all these are right pairs, then we have

$$(X_a \oplus X_b=\beta) \wedge (X_a \oplus X_c = \gamma) \wedge (X_b \oplus X_d = \gamma), $$

where \(X_i\) is the intermediate encryption value of \(P_i\). Thus,

$$X_c \oplus X_d = (X_c \oplus X_a) \oplus (X_a \oplus X_b) \oplus (X_b \oplus X_d) = \beta \oplus \gamma \oplus \gamma = \beta $$

(see left side of Fig. 1). This, in turn, implies that with probability p, \(P_c \oplus P_d = \alpha\). Hence, the total probability of this quartet of plaintexts and ciphertexts to satisfy the condition \(P_c \oplus P_d = \alpha\) is at least \((pq)^2\). For a random permutation the probability that the last condition is satisfied is \(2^{-n}\), where n is the block size. Therefore, if \(pq \gg 2^{-n/2}\), it is possible to distinguish E from a random permutation given \(O((pq)^{-2})\) adaptively chosen plaintexts and ciphertexts. The algorithm of the distinguisher is as follows:

  1. Choose M plaintexts at random, and initialize a counter C to zero. For each plaintext \(P_a\), perform the following:

    (a) Ask for the ciphertexts \(C_{a} = E_{K_{a}} (P_{a})\) and \(C_{b} = E_{K_{b}} (P_{b})\) where \(P_b = P_a \oplus \alpha\).

    (b) Ask for the plaintexts \(P_{c} = E^{-1}_{K_{c}} (C_{c})\) and \(P_{d} = E^{-1}_{K_{d}} (C_{d})\) where \(C_c = C_a \oplus \delta\) and \(C_d = C_b \oplus \delta\).

    (c) If \(P_c \oplus P_d = \alpha\), increment the counter C by 1.

  2. If C > Threshold, output “E.” Otherwise, output “Random Permutation.”
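The counting loop of the distinguisher can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_encrypt` and `toy_decrypt` are a hypothetical 16-bit stand-in cipher (key XOR followed by a rotation), not KASUMI, and all names and parameter values are chosen for the example. Because the toy cipher is linear, every differential holds with probability 1, so every quartet should return.

```python
import random

N_BITS = 16
MASK = (1 << N_BITS) - 1

def rotl(x, r):
    return ((x << r) | (x >> (N_BITS - r))) & MASK

def rotr(x, r):
    return ((x >> r) | (x << (N_BITS - r))) & MASK

# Hypothetical toy cipher (NOT KASUMI): XOR the key, then rotate by 3 bits.
def toy_encrypt(key, p):
    return rotl(p ^ key, 3)

def toy_decrypt(key, c):
    return rotr(c, 3) ^ key

def boomerang_count(encrypt, decrypt, k_a, dk_ab, dk_ac, alpha, delta, m, rng):
    """Run the boomerang process on m random plaintexts and count returns."""
    k_b = k_a ^ dk_ab
    k_c = k_a ^ dk_ac
    k_d = k_c ^ dk_ab            # equals k_b ^ dk_ac
    counter = 0
    for _ in range(m):
        p_a = rng.getrandbits(N_BITS)
        p_b = p_a ^ alpha
        c_a = encrypt(k_a, p_a)              # step (a)
        c_b = encrypt(k_b, p_b)
        p_c = decrypt(k_c, c_a ^ delta)      # step (b)
        p_d = decrypt(k_d, c_b ^ delta)
        if p_c ^ p_d == alpha:               # step (c)
            counter += 1
    return counter

def decide(counter, threshold):
    return "E" if counter > threshold else "Random Permutation"

hits = boomerang_count(toy_encrypt, toy_decrypt, k_a=0x1234,
                       dk_ab=0x0080, dk_ac=0x0800,
                       alpha=0x0001, delta=0x0100,
                       m=1000, rng=random.Random(1))
```

For a real cipher the counter would be compared against a threshold chosen from the expected number of right quartets, which is roughly \(M (pq)^2\).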

Fig. 1. Related-key boomerang and sandwich quartets.

The distinguisher can be improved by considering multiple differentials of the form \(\alpha \rightarrow \beta'\) and \(\gamma' \rightarrow \delta\) (for the same α and δ). We omit this improvement here since it is not used in our attack on KASUMI, and refer the reader to [6]. For a rigorous treatment of the related-key boomerang attack, including a discussion of the independence assumptions the attack relies upon, we refer the interested reader to [19, 21].

The way to transform a related-key boomerang distinguisher into a key-recovery attack is rather standard, and thus we do not present it here and rely on the detailed description of such a transformation in our attack on KASUMI presented in Sect. 5.

2.2 The Related-Key Sandwich Attack

In this framework we consider the cipher as a cascade of three sub-ciphers: \(E = E_1 \circ M \circ E_0\). Our assumptions are the same as in the basic boomerang attack: We assume that there exists a related-key differential \(\alpha \rightarrow \beta\) for \(E_0\) under key difference \(\Delta K_{ab}\) with probability p, and a related-key differential \(\gamma \rightarrow \delta\) for \(E_1\) under key difference \(\Delta K_{ac}\) with probability q. The attack algorithm is also exactly the same as in the basic attack (ignoring the middle sub-cipher M). However, the analysis is more delicate and requires great care in analyzing the dependence between the various distributions.

The main idea behind the sandwich attack is the transition in the middle. In the basic boomerang attack, if the pair \((P_a, P_b)\) is a right pair with respect to the first differential, and both pairs \((C_a, C_c)\) and \((C_b, C_d)\) are right pairs with respect to the second differential, then we have

$$ (X_a \oplus X_b=\beta) \wedge (X_a \oplus X_c = \gamma) \wedge (X_b \oplus X_d = \gamma), $$
(1)

where \(X_i\) is the intermediate encryption value of \(P_i\), and thus

$$ X_c \oplus X_d = (X_c \oplus X_a) \oplus (X_a \oplus X_b) \oplus (X_b \oplus X_d) = \beta \oplus \gamma \oplus \gamma = \beta, $$
(2)

resulting in \(P_c \oplus P_d = \alpha\) with probability p (see left side of Fig. 1).

In the new sandwich framework, instead of condition (1), we get

$$ (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma), $$
(3)

where \(X_i\) is the partial encryption of \(P_i\) under \(E_0\) (and the respective key) and \(Y_i\) is the partial decryption of \(C_i\) under \(E_1\) (see right side of Fig. 1). Therefore, the probability of the three-layer related-key boomerang distinguisher is \(p^2 q^2 r\), where

$$ r = {\bf Pr}\bigl[ (X_c \oplus X_d = \beta) \mid (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma) \bigr]. $$
(4)

Without further assumptions on M, r is expected to be very low (close to \(2^{-n}\) for an n-bit block), and thus the distinguisher is expected to fail. However, as observed in [8, 9, 26], in some cases the differentials in \(E_0\) and \(E_1\) can be chosen such that the probability penalty r in going through M (in at least one direction) is 1, which is much higher than expected.

An example of this phenomenon, introduced in [26] and described in [9] under the name “Feistel switch,” is the following. Let E be a Feistel cipher, decomposed as \(E = E_1 \circ M \circ E_0\), where M consists of one Feistel round (see Fig. 2). Assume that the differentials \(\alpha \rightarrow \beta\) (for \(E_0\)) and \(\gamma \rightarrow \delta\) (for \(E_1\)) have no key difference (i.e., \(\Delta K_{ab} = \Delta K_{ac} = 0\)), and satisfy \(\beta^L = \gamma^R\) (i.e., the left half of β, which is the difference in the state \(X^L\), equals the right half of γ, which is the difference in the state \(Y^R\)). We would like to compute the value of r.

Fig. 2. A Feistel construction. M is the second round.

Assume that condition (3) holds. In this case, since \(Y_{i}^{R}=X_{i}^{L}\) for all i by the Feistel construction, we have

$$ X_a^L \oplus X_b^L = \beta^L=\gamma^R = X_a^L \oplus X_c^L = X_b^L \oplus X_d^L, $$
(5)

and thus,

$$ \bigl(X_a^L=X_d^L\bigr) \quad \mbox{and} \quad \bigl(X_b^L=X_c^L \bigr). $$
(6)

Therefore, the output values of the F-function in the Feistel round represented by M, denoted in Fig. 2 by \((O_a, O_b, O_c, O_d)\), satisfy

$$(O_a=O_d) \quad \mbox{and} \quad (O_b=O_c). $$

Since by the Feistel construction, \(X_{i}^{R} = Y_{i}^{L} \oplus O_{i}\), and by condition (3), \(Y_a \oplus Y_b \oplus Y_c \oplus Y_d = 0\), it follows that

$$X_a \oplus X_b \oplus X_c \oplus X_d =0, $$

which by condition (3) implies \(X_c \oplus X_d = \beta\). Thus, in this case we get that

$$r = {\bf Pr}\bigl[ (X_c \oplus X_d = \beta) \mid (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma) \bigr]=1, $$

independently of the choice of the F-function used.
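The Feistel switch is easy to check empirically. The sketch below is an illustration under assumptions of our own: a 16-bit block split into two 8-bit halves, a randomly chosen F-function table, and arbitrary difference values with \(\beta^L = \gamma^R\). It builds quartets that satisfy condition (3) by construction and measures how often \(X_c \oplus X_d = \beta\) holds.

```python
import random

HALF = 8  # toy half-block size in bits (an assumption for the example)

def feistel_round(f, x):
    """One Feistel round M: (X^L, X^R) -> (Y^L, Y^R) with Y^R = X^L."""
    xl, xr = x
    return (xr ^ f[xl], xl)

def feistel_round_inv(f, y):
    """Inverse round: X^L = Y^R and X^R = Y^L xor F(X^L)."""
    yl, yr = y
    return (yr, yl ^ f[yr])

def xor_pair(u, v):
    return (u[0] ^ v[0], u[1] ^ v[1])

def estimate_r(f, beta, gamma, trials, rng):
    """Fraction of quartets satisfying condition (3) that also satisfy X_c xor X_d = beta."""
    good = 0
    for _ in range(trials):
        x_a = (rng.getrandbits(HALF), rng.getrandbits(HALF))
        x_b = xor_pair(x_a, beta)                 # X_a xor X_b = beta
        y_a = feistel_round(f, x_a)
        y_b = feistel_round(f, x_b)
        y_c = xor_pair(y_a, gamma)                # Y_a xor Y_c = gamma
        y_d = xor_pair(y_b, gamma)                # Y_b xor Y_d = gamma
        x_c = feistel_round_inv(f, y_c)
        x_d = feistel_round_inv(f, y_d)
        if xor_pair(x_c, x_d) == beta:
            good += 1
    return good / trials

rng = random.Random(7)
f_table = [rng.getrandbits(HALF) for _ in range(1 << HALF)]  # arbitrary F-function
beta = (0x20, 0x01)    # (beta^L, beta^R)
gamma = (0x13, 0x20)   # (gamma^L, gamma^R), chosen so that gamma^R == beta^L
r_switch = estimate_r(f_table, beta, gamma, 2000, rng)
```

With \(\beta^L = \gamma^R\) and no key difference the estimate is exactly 1, whatever table is used for F, in line with the argument above.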

Other examples of the same phenomenon are considered in [8] (under the name “middle-round S-box trick”), and in [9] (under the names “ladder switch” and “S-box switch”).

Our attack on KASUMI is the first non-trivial example of this phenomenon in which a careful analysis shows that r is smaller than 1, but much larger than its expected value under the standard independence assumptions. In our attack, the cipher E (7-round KASUMI) is a Feistel construction, M consists of a single round, and \(\beta^L = \gamma^R\). However, the argument presented above cannot be applied directly since there is a non-zero key difference in M, and thus a zero input difference to the F-function does not imply a zero output difference. Instead, we analyze the F-function thoroughly and show that in this case, \(r = 2^{-6}\) (instead of \(2^{-32}\), which is the expected value for a random Feistel round in a 64-bit block cipher).

Remark 1

We note that our treatment of the sandwich distinguisher allows us to specify the precise independence assumptions we rely upon. Since r is defined as a conditional probability, the only independence assumptions we use are between the differentials of \(E_0\) and \(E_1\), and thus the formula \(p^2 q^2 r\) relies on exactly the same assumptions as the ordinary boomerang attack. In [8, 9, 26], this situation was treated as a “trick” that increases the probability of the distinguisher, or in other words, as a failure of the formula \(p^2 q^2\) in favor of the adversary. This approach is problematic since once we claim that the entire formula does not hold due to dependencies, we cannot rely on independence assumptions in other places where such dependencies could be found.

2.3 The Rectangle-Like Sandwich Attack

The transformation of the (related-key) boomerang distinguisher into a chosen plaintext rectangle attack relies on standard birthday-paradox arguments. The division into sub-ciphers and the assumptions are the same as in the (related-key) boomerang distinguisher. The key idea behind the transformation is to encrypt many plaintext pairs with input difference α, and to look for quartets that happen to conform to the requirements of the boomerang process. In other words, the adversary considers quartets of plaintexts of the form \(((P_a, P_b = P_a \oplus \alpha), (P_c, P_d = P_c \oplus \alpha))\) encrypted under the related keys \(K_a, K_b, K_c\), and \(K_d\), respectively, and a quartet is called a “right quartet” if the following conditions are satisfied:

  1. \(E_{0(K_{a})} (P_{a}) \oplus E_{0(K_{b})} (P_{b}) = \beta = E_{0(K_{c})} (P_{c}) \oplus E_{0(K_{d})} (P_{d})\) (i.e., \(X_a \oplus X_b = \beta = X_c \oplus X_d\)).

  2. \(E_{0(K_{a})} (P_{a}) \oplus E_{0(K_{c})} (P_{c}) = X_{a} \oplus X_{c} = \gamma\) (which leads to \(E_{0(K_{b})} (P_{b}) \oplus E_{0(K_{d})} (P_{d}) = X_{b} \oplus X_{d} = \gamma\) if this condition holds along with the previous one).

  3. \(C_a \oplus C_c = \delta = C_b \oplus C_d\).

The probability of a quartet to be a right quartet is a lower bound on the probability of the event

$$ C_a \oplus C_c = \delta = C_b \oplus C_d. $$
(7)

The usual assumption is that each of the above conditions is independent of the rest, and hence the probability that a given quartet \(((P_a, P_b), (P_c, P_d))\) is a right quartet is \(p^2 \cdot 2^{-n} \cdot q^2\). Since for a random permutation, the probability of condition (7) is \(2^{-2n}\), the rectangle process can be used to distinguish E from a random permutation if \(pq \gg 2^{-n/2}\) (the same condition as in the standard boomerang distinguisher).

However, the data complexity of the distinguisher is \(O(2^{n/2} (pq)^{-1})\), which is much higher than the complexity of the boomerang distinguisher. The higher data complexity follows from the fact that the event \(E_{0(K_{a})}(P_{a}) \oplus E_{0(K_{c})} (P_{c}) = \gamma\) occurs with a “random” probability of \(2^{-n}\) (in fact, this is the birthday-paradox argument behind the construction). The identification of right quartets is also more complicated than in the boomerang case, as instead of checking a condition on pairs, the adversary has to go over all the possible quartets. At the same time, the chosen plaintext nature allows using stronger key recovery techniques. An optimized method of finding the right rectangle quartets is presented in [5].
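To get a feeling for the gap between the two data complexities, the following sketch evaluates both orders of magnitude on a logarithmic scale. The function names and the sample values of p, q, and n are illustrative assumptions, and the constants hidden in the O-notation are ignored.

```python
import math

def log2_boomerang_data(p, q):
    """log2 of (pq)^-2, the order of the boomerang data complexity."""
    return -2 * math.log2(p * q)

def log2_rectangle_data(p, q, n):
    """log2 of 2^(n/2) * (pq)^-1, the order of the rectangle data complexity."""
    return n / 2 - math.log2(p * q)

# Illustrative values only: p = q = 1/4 on a 64-bit block.
boomerang_bits = log2_boomerang_data(0.25, 0.25)
rectangle_bits = log2_rectangle_data(0.25, 0.25, 64)
```

For these sample values the boomerang distinguisher needs on the order of \(2^{8}\) adaptively chosen texts, while the rectangle variant needs on the order of \(2^{36}\) chosen plaintexts.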

The transformation of the (related-key) sandwich framework into the (related-key) rectangle-like sandwich framework is performed similarly. The way in which the distinguisher is deployed remains the same, and the probability of a quartet to be a right quartet is \(p^2 \cdot 2^{-n} \cdot r' \cdot q^2\), where

$$ r' = {\bf Pr}\bigl[ (Y_b \oplus Y_d = \gamma) \mid (X_a \oplus X_b=\beta) \wedge (X_c \oplus X_d = \beta) \wedge (Y_a \oplus Y_c = \gamma) \bigr]. $$
(8)

It follows from symmetry arguments that in the case where E is a Feistel cipher, M consists of a single round, and \(\beta^L = \gamma^R\), we have \(r' = r\) (even if there is a non-zero key difference in M). Thus, in our attack on KASUMI we are able to use the computation of r in the sandwich framework to find also the probability of the corresponding related-key rectangle-like sandwich distinguisher.

3 The KASUMI Block Cipher

KASUMI [24] is a 64-bit block cipher with 128-bit keys. It has a recursive Feistel structure, following its ancestor MISTY. The cipher has eight Feistel rounds, where each round is composed of two functions: the FO function which is in itself a 3-round 32-bit Feistel construction, and the FL function that mixes a 32-bit subkey with the data in a linear way. The order of the two functions depends on the round number: in the even rounds the FO function is applied first, and in the odd rounds the FL function is applied first.

The FO function also has a recursive structure: its F-function, called FI, is a four-round Feistel construction. The FI function uses two nonlinear S-boxes S7 and S9 (where S7 is a 7-bit to 7-bit permutation and S9 is a 9-bit to 9-bit permutation), and accepts an additional 16-bit subkey, which is mixed with the data. In total, a 96-bit subkey enters FO in each round—48 subkey bits are used in the FI functions and 48 subkey bits are used in the key mixing stages.

The FL function accepts a 32-bit input and two 16-bit subkey words. One subkey word affects the data using the OR operation, while the second one affects the data using the AND operation. We outline the structure of KASUMI and its components in Fig. 3.

Fig. 3. Outline of KASUMI.

The key schedule of KASUMI is much simpler than the original key schedule of MISTY, and the subkeys are linearly derived from the key. The 128-bit key K is divided into eight 16-bit words: \(K_1, K_2, \ldots, K_8\). Each \(K_i\) is used to compute \(K_{i}' = K_{i} \oplus C_{i}\), where the \(C_i\)’s are fixed constants (we omit these from the paper, and refer the intrigued reader to [24]). In each round, eight words are used as the round subkey (up to some in-word rotations). Hence, each 128-bit round subkey is a linearly modified version of the secret key. We summarize the details of the key schedule of KASUMI in Table 1.

Table 1. KASUMI’s key schedule algorithm.
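Since every subkey is a rotated or constant-masked copy of a single key word, it is enough to track word indices in order to see which subkeys a key-word difference reaches. The sketch below does exactly that. The index offsets are an assumption on our part: they are stated as we recall them from the KASUMI specification [24], the names `subkey_words` and `affected_subkeys` are hypothetical, and the mapping should be checked against Table 1 before being relied upon.

```python
def wrap(i):
    """Reduce a key-word index into the range 1..8."""
    return (i - 1) % 8 + 1

def subkey_words(rnd):
    """Map each subkey of round rnd (1..8) to the index of its source key word.

    Offsets as recalled from the KASUMI specification (an assumption here).
    Rotations and the constants C_i do not change which word is used.
    """
    return {
        "KL1": wrap(rnd),      # K_i rotated left by 1
        "KL2": wrap(rnd + 2),  # K'_{i+2}
        "KO1": wrap(rnd + 1),  # K_{i+1} rotated left by 5
        "KO2": wrap(rnd + 5),  # K_{i+5} rotated left by 8
        "KO3": wrap(rnd + 6),  # K_{i+6} rotated left by 13
        "KI1": wrap(rnd + 4),  # K'_{i+4}
        "KI2": wrap(rnd + 3),  # K'_{i+3}
        "KI3": wrap(rnd + 7),  # K'_{i+7}
    }

def affected_subkeys(rnd, word):
    """Names of the round-rnd subkeys that depend on key word K_word."""
    return sorted(name for name, w in subkey_words(rnd).items() if w == word)

round4_from_k3 = affected_subkeys(4, 3)
round4_from_k7 = affected_subkeys(4, 7)
```

Under this mapping, a difference in the third key word reaches only the third FI subkey of round 4, and a difference in the seventh key word reaches only the second FI subkey of round 4, while the FL and KO subkeys of that round are untouched.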

4 A Related-Key Sandwich Distinguisher for 7-Round KASUMI

4.1 The New Distinguisher

In our distinguisher, we treat rounds 1–7 of KASUMI as a cascade \(E = E_1 \circ M \circ E_0\), where \(E_0\) consists of rounds 1–3, M consists of round 4, and \(E_1\) consists of rounds 5–7. The related-key differential we use for \(E_0\) is a slight modification of the differential characteristic presented in [11], in which

$$\alpha=(0_x,0010~0000_x) \rightarrow (0010~0000_x,0_x) = \beta. $$

The corresponding key difference is \(\Delta K_{ab} = \Delta K_{cd} = (0,0,8000_x,0,0,0,0,0)\), i.e., only the third key word has a single bit difference \(\Delta K_3 = 8000_x\). This related-key differential is depicted in Fig. 4. The related-key differential we use for \(E_1\) is the same differential shifted by four rounds, in which the data differences are

$$\gamma=(0_x,0010~0000_x) \rightarrow (0010~0000_x,0_x) = \delta, $$

and the key difference is \(\Delta K_{ac} = \Delta K_{bd} = (0,0,0,0,0,0,8000_x,0)\) (to handle the different subkeys used in these rounds).

Fig. 4. 3-Round related-key differential characteristic of KASUMI.

As shown in [11], the probability of each one of these 3-round differential characteristics is 1/4. In order to find the probability of the related-key sandwich distinguisher, we need to compute the probability

$$ {\bf Pr}\bigl[ (X_c \oplus X_d = \beta) \mid (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma) \bigr], $$
(9)

where \((X_a, X_b, X_c, X_d)\) and \((Y_a, Y_b, Y_c, Y_d)\) are the intermediate values before and after the middle slice of the sandwich during the encryption/decryption of the quartet \((P_a, P_b, P_c, P_d)\) (see the right side of Fig. 1). This computation, which is somewhat involved, spans the rest of this subsection.

Consider a quartet \((P_a, P_b, P_c, P_d)\) for which the condition

$$ (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma) $$
(10)

is satisfied. Note that for our differentials, we have \(\beta^L = \gamma^R\), as illustrated in Fig. 5. Hence, we can apply the argument of Sect. 2.2, since M is a single Feistel round. In particular, we obtain

$$ \bigl(X_a^L=X_d^L \bigr) \wedge \bigl(X_b^L=X_c^L \bigr), $$
(11)

where \(X_{i}^{L}\) denotes the left half of \(X_i\), which enters the function FO4 (see left part of Fig. 5). Moreover, as the right half of \(\beta^L = \gamma^R\) is zero, we have

$$ X_a^{LR}=X_b^{LR}=X_c^{LR}=X_d^{LR}, $$
(12)

where \(X_{i}^{LR}\) denotes the right half (i.e., the 16 rightmost bits) of \(X_{i}^{L}\) (see central part of Fig. 5).

Fig. 5. Rounds 3–5 of the sandwich distinguisher and the notations used in the attack description.

The following transitions are illustrated in Fig. 6 and the notations we use in their description are shown in Fig. 5. The function FO4 is a 3-round Feistel construction whose 32-bit values after round j are denoted by \((X_{a}^{j},X_{b}^{j},X_{c}^{j},X_{d}^{j})\). The functions \(\mathit{FI}_{4,1}\), \(\mathit{FI}_{4,2}\), and \(\mathit{FI}_{4,3}\) are 4-round Feistel constructions, and the 16-bit outputs of \(\mathit{FI}_{4,j}\) are denoted by \((I_{a}^{j},I_{b}^{j},I_{c}^{j},I_{d}^{j})\). Note that the key differences \(\Delta K_{ab}\) and \(\Delta K_{ac}\) affect in round 4 the subkeys \(\mathit{KI}_{4,3}\) and \(\mathit{KI}_{4,2}\), respectively, and in particular, there is no key difference in the first round of FO4. As a result, Eq. (11) implies that

$$ \bigl(X_a^1=X_d^1 \bigr) \wedge \bigl(X_b^1=X_c^1 \bigr). $$
(13)

Furthermore, there is no key difference in the pairs corresponding to \((P_a, P_b)\) and \((P_c, P_d)\) in the second round of FO4, and thus Eq. (12) implies that:

$$ \bigl(I_a^2=I_b^2 \bigr) \wedge \bigl(I_c^2=I_d^2 \bigr). $$
(14)

Combining Eqs. (13) and (14), we get the following relation in the right half of the intermediate values after round 3 of FO4:

$$ X_a^{3R} \oplus X_b^{3R} \oplus X_c^{3R} \oplus X_d^{3R} = 0. $$
(15)

In the F-function of round 3 of FO4 we consider the pairs corresponding to \((P_a, P_d)\) and \((P_b, P_c)\). Since the key difference in these pairs (which equals \(\Delta K_{ab} \oplus \Delta K_{ac}\)) affects only the subkey \(\mathit{KI}_{4,3,1}\), Eq. (13) implies that

$$ I_a^{3R} \oplus I_b^{3R} \oplus I_c^{3R} \oplus I_d^{3R}=0 $$
(16)

in the 9 bits that compose the right-hand side of the output. In the left-hand side of the output, the XOR of the four values is not necessarily equal to zero, due to the subkey difference that affects the inputs to the second S7 in \(\mathit{FI}_{4,3}\). However, if these 7-bit inputs, denoted by \((J_a, J_b, J_c, J_d)\), satisfy one of the conditions,

$$ \bigl((J_a = J_b) \wedge (J_c =J_d) \bigr) \quad \mbox{or} \quad \bigl((J_a = J_c) \wedge (J_b=J_d) \bigr), $$
(17)

then Eq. (16) implies

$$ I_a^{3L} \oplus I_b^{3L} \oplus I_c^{3L} \oplus I_d^{3L}=0. $$
(18)

Since we have \(J_a \oplus J_d = J_b \oplus J_c\) (both are equal to the subkey difference in \(\mathit{KI}_{4,3,1}\)), each one of the two conditions in Eq. (17) is expected to hold with probability \(2^{-7}\). Therefore, combining Eqs. (15), (16), and (18) we get that the condition

$$ X_a^{3} \oplus X_b^{3} \oplus X_c^{3} \oplus X_d^{3} = 0 $$
(19)

holds with probability \(2^{-6}\).
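The step from the two 7-bit conditions to this probability can be spelled out as follows, under the same randomness heuristic as above. Here Δ is our own shorthand for the non-zero 7-bit subkey difference, so that \(J_d = J_a \oplus \Delta\) and \(J_c = J_b \oplus \Delta\). Each condition in Eq. (17) then reduces to a single 7-bit equality, and the two equalities cannot hold together, since \(J_a = J_b = J_c\) would contradict \(J_b \oplus J_c = \Delta \neq 0\):

$$\begin{aligned} (J_a = J_b) \wedge (J_c = J_d) &\iff J_a \oplus J_b = 0, \\ (J_a = J_c) \wedge (J_b = J_d) &\iff J_a \oplus J_b = \Delta, \\ {\bf Pr}\bigl[ J_a \oplus J_b \in \{0, \Delta\} \bigr] &= 2^{-7} + 2^{-7} = 2^{-6}. \end{aligned}$$

The last line assumes that \(J_a \oplus J_b\) is uniformly distributed over the \(2^7\) possible values.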

Fig. 6. The development of differences in FO4 and in \(\mathit{FI}_{4,3}\).

Finally, since the FL function is linear for a given key and there is no key difference in FL4, we can conclude that whenever Eq. (19) holds, the outputs of the F-function in round 4 (denoted in Fig. 5 by \((O_{a}^{4},O_{b}^{4},O_{c}^{4},O_{d}^{4})\)) satisfy

$$ O_a^{4} \oplus O_b^{4} \oplus O_c^{4} \oplus O_d^{4} = 0. $$
(20)

Since by condition (10),

$$Y_a^L \oplus Y_b^L \oplus Y_c^L \oplus Y_d^L =0, $$

it follows that

$$ X_a^{R} \oplus X_b^{R} \oplus X_c^{R} \oplus X_d^{R} = 0 $$
(21)

also holds with probability \(2^{-6}\). Combining this with Eq. (11) yields

$$ {\bf Pr}\bigl[ (X_c \oplus X_d = \beta) \mid (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma) \bigr] = 2^{-6}. $$
(22)

Therefore, the overall probability of the related-key sandwich distinguisher is

$$ (1/4)^2 \cdot (1/4)^2 \cdot 2^{-6} = 2^{-14}, $$
(23)

which is much higher than the probability of \((1/4)^2 \cdot (1/4)^2 \cdot 2^{-32} = 2^{-40}\) that is expected from the naive analysis of the sandwich structure.

4.2 Experimental Verification

To verify the properties of the new distinguisher, we used the official code available as an appendix in [24]. The verification experiment was set up as follows: In each test we randomly chose a key quartet satisfying the required key differences. We then generated \(2^{16}\) quartets by following the boomerang procedure described above. We utilized a slight improvement of the first differential suggested in [11] that increases its probability in the encryption direction by a factor of 2 by fixing the value of two plaintext bits. Hence, the number of right quartets in each test was expected to follow a Poisson distribution with a mean value of \(2^{16} \cdot 2^{-14} \cdot 2 = 8\). We repeated the test 100,000 times, and obtained a distribution which is extremely close to the expected distribution. The full results are summarized in Table 2.
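The theoretical side of this comparison is straightforward to reproduce. The sketch below computes the Poisson mean stated above and the expected number of tests, out of 100,000, that would yield each count of right quartets. It covers only the predicted distribution; the function and variable names are illustrative, and no experimental values are implied.

```python
import math

def poisson_pmf(k, mean):
    """Probability of observing exactly k events for a Poisson variable."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

NUM_TESTS = 100_000
mean = 2 ** 16 * 2 ** -14 * 2      # quartets per test times success probability

# Expected number of tests (out of NUM_TESTS) with exactly k right quartets.
expected_counts = {k: NUM_TESTS * poisson_pmf(k, mean) for k in range(26)}

for k in sorted(expected_counts):
    print(f"{k:2d} right quartets: {expected_counts[k]:10.1f} tests expected")
```

Comparing such a column against the observed frequencies, for instance with a chi-squared test, is one way to quantify how closely the experiment follows the prediction.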

Table 2. The number of right quartets in 100,000 experiments.

4.3 A Tale of Two Sandwiches

In this subsection we present two examples which demonstrate the extremely delicate nature of the probability estimations used in the sandwich attack, and the “lucky strikes” which made our attack on KASUMI possible. These two examples, along with a detailed analysis of various related-key boomerang distinguishers of a similar nature presented in [19], illustrate the thorough analysis of the structure of M which must be performed in each specific case in order to compute the probability r analytically. Another possibility is to give up the rigorous theoretical analysis and sample the probability r experimentally instead.

In the first example we present, we make a tiny change in the key schedule of KASUMI, which does not seem to have any effect on the differential probabilities of any one of its three sub-ciphers. However, for this example, the probability of the distinguisher is zero! In the second example, we use the original KASUMI key schedule, and slightly alter the differentials, such that the differential probabilities in the top and bottom sub-ciphers are not changed. As in the first example, it turns out that the probability of the distinguisher becomes zero.

4.3.1 A Slight Change in the KASUMI Key Schedule

The only change we make in KASUMI is the order of the subkeys. We take the original key schedule of KASUMI, and swap the roles of \(\mathit{KI}_{i,1}\) and \(\mathit{KI}_{i,3}\). Namely, the word used in KASUMI as \(\mathit{KI}_{i,1}\) is used in this variant as \(\mathit{KI}_{i,3}\) and vice versa. For example, in our variant \(\mathit{KI}_{1,3}=K_{5}'\), \(\mathit{KI}_{2,1} = K'_{1}\), and \(\mathit{KI}_{3,3} = K'_{7}\).

Since our change affects only the subkeys used as \(\mathit{KI}_{i,1}\) and \(\mathit{KI}_{i,3}\) in each round, the differentials used in our distinguisher on KASUMI remain exactly the same for the new variant (with the same input/output differences, the same key differences and the same probabilities). However, we claim that in this case,

$$ r = {\bf Pr}\bigl[ (X_c \oplus X_d = \beta) \mid (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma) \bigr] = 0, $$
(24)

and thus the probability of the distinguisher is zero. In all the computations below, the notations are the same as in the original distinguisher above. The impossible transition is depicted in Fig. 7.

Fig. 7. The development of differences in FO4 and in \(\mathit{FI}_{4,1}\) in the modified KASUMI.

Since the differentials are the same as in the original distinguisher, we have

$$ \bigl(X_a^L=X_d^L \bigr) \wedge \bigl(X_b^L=X_c^L \bigr) $$
(25)

and

$$ X_a^{LR}=X_b^{LR}=X_c^{LR}=X_d^{LR}. $$
(26)

Also, since the second round of FO4 is unchanged, we have

$$ \bigl(I_a^2=I_b^2 \bigr) \wedge \bigl(I_c^2=I_d^2 \bigr). $$
(27)

Therefore,

$$ X_a^{3L} \oplus X_b^{3L} \oplus X_c^{3L} \oplus X_d^{3L} = I_a^1 \oplus I_b^1 \oplus I_c^1 \oplus I_d^1. $$
(28)

In the first round of FO4 we have a difference between the modified variant and the original KASUMI, as in the modified variant there is a subkey difference in the pairs corresponding to \((P_a,P_b)\) and to \((P_c,P_d)\), in the MSB of the subkey \(\mathit{KI}_{4,1,1}\). Let us analyze the function \(\mathit{FI}_{4,1}\).

By the structure of the differential, the inputs of \(\mathit{FI}_{4,1}\) are of the form

$$\begin{aligned} & \bigl(X_a^{LL} \oplus \mathit{KO}_{4,1},X_b^{LL} \oplus \mathit{KO}_{4,1},X_c^{LL} \oplus \mathit{KO}_{4,1},X_d^{LL} \oplus \mathit{KO}_{4,1}\bigr)\\ &\quad= (t,t \oplus 0010_x, t \oplus 0010_x, t), \end{aligned}$$

for some 16-bit value \(t\). After the application of the first S9 of \(\mathit{FI}_{4,1}\) the values remain in the form \((t',t' \oplus 0010_x,t' \oplus 0010_x,t')\), for some 16-bit value \(t'\), since all four inputs to the S-box S9 are equal. After the XOR and the swap, the values are of the form \((t'',t'' \oplus 2010_x,t'' \oplus 2010_x,t'')\). Hence, the inputs to the first S7 are of the form \((x,y,y,x)\), and the outputs of that S7, after the XOR with the truncated 9 bits, are of the form \((u,v,v,u)\) (for some 7-bit values \(u,v\)). We claim that \(u \neq v\) and \(u \oplus v \neq 40_x\). Indeed, if we had \(u \oplus v = 40_x\), then the 7-bit outputs of the S-box S7 had to be of the form \((u',u' \oplus 40_x \oplus 10_x,u' \oplus 40_x \oplus 10_x,u')\). However, the differential \((10_x \rightarrow 50_x)\) is impossible for S7, and thus this event cannot occur. Similarly, \(u=v\) cannot occur since the differential \((10_x \rightarrow 10_x)\) is impossible for S7.

We now claim that the four 7-bit intermediate values after the XOR with the subkey \(\mathit{KI}_{4,1,1}\) are different. Indeed, these values are of the form \((u \oplus k,v \oplus k \oplus 40_x,v \oplus k,u \oplus k \oplus 40_x)\), and these are all different since \(u \neq v\) and \(u \neq v \oplus 40_x\).

Finally, we consider the S-box S7 in the fourth round of \(\mathit{FI}_{4,1}\). Its four inputs are all different, and can be divided into two pairs \((u \oplus k,v \oplus k \oplus 40_x)\) and \((v \oplus k,u \oplus k \oplus 40_x)\) with the same difference. Since S7 is an almost perfect nonlinear permutation (Footnote 4), this implies that the two corresponding pairs of outputs have distinct differences, and thus, the XOR of the four output values is necessarily non-zero. Since the XOR of the output values in the right half is zero, we have

$$I_a^1 \oplus I_b^1 \oplus I_c^1 \oplus I_d^1 \neq 0, $$

and hence,

$$X_a^{3L} \oplus X_b^{3L} \oplus X_c^{3L} \oplus X_d^{3L} \neq 0. $$

Therefore, the XOR of the four outputs of FO4 is non-zero with probability 1, and since FL is linear and invertible, this implies that the XOR of the four outputs of FL is non-zero with probability 1. This proves that

$${\bf Pr}\bigl[ (X_c \oplus X_d = \beta) \mid (X_a \oplus X_b=\beta) \wedge (Y_a \oplus Y_c = \gamma) \wedge (Y_b \oplus Y_d = \gamma) \bigr] = 0, $$

and thus the distinguisher fails in this variant of KASUMI, as asserted.
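The final step of this argument uses only the almost perfect nonlinear (APN) property: two distinct input pairs with the same difference cannot produce the same output difference. The sketch below illustrates this on a stand-in S-box, namely cubing in GF(2^7), which is a standard example of an APN permutation on 7 bits. It is not the KASUMI S7 table, and the reduction polynomial \(x^7+x+1\) is an arbitrary choice made for the illustration.

```python
from collections import Counter

N = 7          # work in GF(2^7), the same width as S7
POLY = 0x83    # x^7 + x + 1, irreducible over GF(2) (choice made for this sketch)


def gf_mul(a, b):
    """Multiply two elements of GF(2^7), reducing modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def cube(x):
    return gf_mul(gf_mul(x, x), x)


# Stand-in APN permutation; NOT the real S7 of KASUMI.
SBOX = [cube(x) for x in range(1 << N)]


def max_ddt_entry(sbox):
    """Largest entry of the difference distribution table over non-zero input differences."""
    worst = 0
    for a in range(1, len(sbox)):
        counts = Counter(sbox[x] ^ sbox[x ^ a] for x in range(len(sbox)))
        worst = max(worst, max(counts.values()))
    return worst


def always_unbalanced(sbox, a):
    """True if S(x)^S(x^a)^S(y)^S(y^a) != 0 for all x, y with four distinct inputs."""
    n = len(sbox)
    for x in range(n):
        for y in range(n):
            if y in (x, x ^ a):
                continue  # the two pairs coincide, skip
            if sbox[x] ^ sbox[x ^ a] ^ sbox[y] ^ sbox[y ^ a] == 0:
                return False
    return True
```

A maximal table entry of 2 means that each output difference is reached by exactly one unordered input pair, which is why the XOR of the four outputs in the argument above cannot vanish.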

For the sake of completeness, we implemented this variant of KASUMI, and verified experimentally that the number of right quartets with the desired sandwich property was always zero (Footnote 5).

4.3.2 A Slight Change in the Differential

In this example we do not alter the original key schedule of KASUMI, but rather slightly change one of the differentials. Since the considerations we use are similar to the previous example, we present them briefly.

The differential for \(E_0\) remains

$$\alpha=(0_x,0010~0000_x) \rightarrow (0010~0000_x,0_x) = \beta, $$

with key difference \(\Delta K_{ab}=(0,0,8000_x,0,0,0,0,0)\). The differential we use for \(E_1\) is slightly changed to

$$\gamma=(0_x,0100~0000_x) \rightarrow (0100~0000_x,0_x) = \delta, $$

with key difference \(\Delta K_{ac}=(0,0,0,0,0,0,0008_x,0)\). It is easy to see that the probabilities of the differentials in \(E_0\) and \(E_1\) remain \(2^{-2}\), as for the original differentials. Also, Eqs. (26), (27), and (28) hold as in the previous example, and hence, in order to show that the probability of the distinguisher is zero, it is sufficient to show that

$$ I_a^1 \oplus I_b^1 \oplus I_c^1 \oplus I_d^1 \neq 0. $$
(29)

Consider the function \(\mathit{FI}_{4,1}\). By the structure of the differentials, its inputs are of the form

$$(t,t \oplus 0010_x, t \oplus 0100_x, t \oplus 0110_x). $$

(Note that unlike the previous example, the four inputs are distinct.) It follows that the inputs to the S-box S9 in the first round of \(\mathit{FI}_{4,1}\) are of the form \((x,x,y,y)\) (where \(x \oplus y=2_x\)) and the inputs to the S-box S7 in the second round of \(\mathit{FI}_{4,1}\) are of the form \((z,w,z,w)\) (where \(z \oplus w=10_x\)). Hence, the corresponding outputs are of the forms \((x',x',y',y')\) and \((z',w',z',w')\), respectively. Since both these quadruples are balanced (i.e., sum up to zero), and there is no key difference in \(\mathit{FI}_{4,1}\) (again, unlike the previous example), this implies that in both halves of the intermediate value after the key addition, the quadruples are balanced. Therefore, due to the 4-round Feistel structure, if we show that the outputs of the S-box S9 in the third round of \(\mathit{FI}_{4,1}\) are unbalanced, this will imply that the right half of the output of \(\mathit{FI}_{4,1}\) is unbalanced, thus proving that inequality (29) holds.

Consider the four inputs to the S-box S9 in the third round of \(\mathit{FI}_{4,1}\). By the Feistel structure, they are of the form \((x',x',y',y') \oplus (z,w,z,w) \oplus (\mathit{KI}_{4,1,2},\mathit{KI}_{4,1,2},\mathit{KI}_{4,1,2},\mathit{KI}_{4,1,2})\), and hence, they are balanced. Furthermore, they are distinct, since \(z \oplus w=10_x\), while \(x' \oplus y' \neq 10_x\) (since the differential \(000000010_2 \rightarrow 000010000_2\) is impossible for the S-box S9). Since S9 is an almost perfect nonlinear permutation, this implies that the four outputs are necessarily unbalanced, concluding the proof.
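The difference bookkeeping used in both examples can be reproduced mechanically. The sketch below assumes the usual FI layout, in which the 16-bit input is split into its 9 most significant bits (entering S9) and its 7 least significant bits (entering S7), and in which the first round places the untouched 7-bit half on top of the 9-bit half after the swap. The helper names are hypothetical.

```python
def split_fi_input(diff16):
    """Split a 16-bit FI input difference into (9-bit left part, 7-bit right part)."""
    return diff16 >> 7, diff16 & 0x7F


def diff_after_first_round(s9_out_diff, right7_diff):
    """16-bit difference after the first FI round and the swap.

    Assumption: the 7-bit half becomes the upper part, and the lower 9-bit part
    is the S9 output difference XORed with the zero-extended 7-bit difference.
    """
    return (right7_diff << 9) | (s9_out_diff ^ right7_diff)


# First example: input difference 0010_x lies entirely in the 7-bit half,
# so the S9 inputs are equal and the S9 output difference is zero.
left9, right7 = split_fi_input(0x0010)
print(hex(left9), hex(right7), hex(diff_after_first_round(0, right7)))

# Second example: 0100_x touches only the S9 input, 0110_x touches both halves.
print([tuple(map(hex, split_fi_input(d))) for d in (0x0010, 0x0100, 0x0110)])
```

Under these assumptions the first example gives the difference \(2010_x\) after the first round, and the second example gives the S9 input difference \(2_x\) and the S7 input difference \(10_x\) used in the text.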

We note that a similar argument holds for almost all choices of modified differentials for \(E_0\) and \(E_1\) in which for one of the differentials the non-zero difference enters the S-box S9, and for the other one the non-zero difference enters the S-box S7, and shows that the distinguisher must fail. The only two exceptions are:

$$\alpha = (0_x,0001~0000_x), \qquad \gamma = (0_x,0400~0000_x), $$

and:

$$\alpha = (0_x,0040~0000_x), \qquad \gamma = (0_x,0080~0000_x), $$

with appropriately chosen key differences. For these exceptions, the probability \(r\) of transition through the middle layer \(M\) is close to \(2^{-32}\) (which is the expected probability for a “random” single Feistel round with 64-bit block). For a detailed and experimental analysis of these examples, we refer the reader to [19].

5 Related-Key Attacks on the Full KASUMI

In this section we use the 7-round distinguisher presented in Sect. 4 to devise related-key attacks on the full 8-round KASUMI. Our first attack is a related-key sandwich attack, which requires \(2^{26}\) adaptively chosen plaintexts and ciphertexts encrypted under one of four related keys, and has a time complexity of \(2^{32}\) encryptions. This attack was fully verified experimentally, as described in Sect. 5.1.1. Our second attack is a related-key rectangle-like sandwich attack, which requires \(2^{41}\) chosen plaintexts encrypted under one of four related keys, and has a time complexity of \(2^{41}\) encryptions. Although its complexity is higher than that of the first attack, it has the advantage of working in the more conservative chosen plaintext model (rather than the adaptively chosen plaintext/ciphertext model of the first attack).

5.1 Related-Key Sandwich Attack on the Full KASUMI

Our attack on the full KASUMI applies the distinguisher presented in Sect. 4 to rounds 1–7 (see Fig. 8), and retrieves subkey material in round 8. Let \(\Delta K_{ab}=(0,0,8000_x,0,0,0,0,0)\) and \(\Delta K_{ac}=(0,0,0,0,0,0,8000_x,0)\), and let \(K_a\), \(K_b=K_a \oplus \Delta K_{ab}\), \(K_c=K_a \oplus \Delta K_{ac}\), and \(K_d=K_c \oplus \Delta K_{ab}\) be the unknown related keys we wish to retrieve.

Fig. 8. The 7-round related-key sandwich distinguisher of KASUMI.

The attack algorithm is as follows:

  1. Data Collection Phase:

    (a) Choose a structure of \(2^{24}\) ciphertexts of the form \(C_a=(X_a,A)\) (Footnote 6), where \(A\) is a fixed 32-bit value and \(X_a\) assumes \(2^{24}\) arbitrary different 32-bit values. Ask for the decryption of all the ciphertexts under the key \(K_a\) and denote the plaintext corresponding to \(C_a\) by \(P_a\). For each \(P_a\), ask for the encryption of \(P_b=P_a \oplus (0_x,0010~0000_x)\) under the key \(K_b\) and denote the resulting ciphertext by \(C_b\). Store the pairs \((C_a,C_b)\) in a hash table indexed by the 32-bit value \(C_{b}^{R}\) (i.e., the right half of \(C_b\)).

    (b) Choose a structure of \(2^{24}\) ciphertexts of the form \(C_c=(Y_c,A \oplus 0010~0000_x)\), where \(A\) is the same constant as before, and \(Y_c\) assumes \(2^{24}\) arbitrary different values. Ask for the decryption of the ciphertexts under the key \(K_c\) and denote the plaintext corresponding to \(C_c\) by \(P_c\). For each \(P_c\), ask for the encryption of \(P_d=P_c \oplus (0_x,0010~0000_x)\) under the key \(K_d\) and denote the resulting ciphertext by \(C_d\). Then, access the hash table in the entry corresponding to the 32-bit value \(C_{d}^{R} \oplus 0010~0000_{x}\), and for each pair \((C_a,C_b)\) found in this entry, apply Step 2 on the quartet \((C_a,C_b,C_c,C_d)\).

In the first step described above, the \((2^{24})^2=2^{48}\) possible quartets are filtered according to a condition on the 32 difference bits which are known (due to the output difference \(\delta\) of the distinguisher), which leaves about \(2^{16}\) quartets with the required differences.

In Step 2 we can identify the right quartets instantly using an extremely lucky property of the KASUMI structure. We note that a pair \((C_a,C_c)\) can belong to a right quartet only if

$$ C_a^L \oplus \mathit{FL}8\bigl(\mathit{FO}8 \bigl(C_a^R\bigr)\bigr) = C_c^L \oplus \mathit{FL}8\bigl(\mathit{FO}8\bigl(C_c^R\bigr)\bigr), $$
(30)

since by the Feistel structure, this is the only case in which the difference after round 7 is the output difference of the sandwich distinguisher (i.e., \(\delta=(0010~0000_x,0_x)\)). However, the values \(C_{a}^{R}\) and \(C_{c}^{R}\) are fixed for all the considered ciphertexts, and hence Eq. (30) yields

$$ C_a^L \oplus C_c^L = \mathit{FL}8 \bigl(\mathit{FO}8(A)\bigr) \oplus \mathit{FL}8\bigl(\mathit{FO}8\bigl(A \oplus 0010~0000_x \bigr)\bigr) = \mathrm{const}. $$
(31)

Thus, the value \(C_{a}^{L} \oplus C_{c}^{L}\) is equal for all the right quartets. This allows us to perform the following simple filtering:

  2. Identifying the Right Quartets: Insert the approximately \(2^{16}\) remaining quartets \((C_a,C_b,C_c,C_d)\) into a hash table indexed by the 32-bit value \(C_{a}^{L} \oplus C_{c}^{L}\), and apply Step 3 only to bins which contain at least three quartets.

Since the probability of a 3-collision in a list of \(2^{16}\) random 32-bit values is \({{2^{16}}\choose{3}} \cdot 2^{-64} < 2^{-18}\), with very high probability only the right quartets remain after this filtering. The expected number of such quartets is \(2^{16} \cdot 2^{-14}=4\).
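The filtering of Step 2 amounts to bucketing the candidate quartets by a 32-bit value and discarding small buckets. A minimal sketch follows, assuming that ciphertexts are represented as 64-bit Python integers whose upper 32 bits form the left half; this representation and the function names are choices made for the illustration.

```python
from collections import defaultdict


def left_half(c):
    """Left 32 bits of a 64-bit ciphertext given as an integer."""
    return c >> 32


def filter_quartets(quartets, min_bin=3):
    """Keep the quartets whose value C_a^L xor C_c^L is shared by at least min_bin quartets."""
    bins = defaultdict(list)
    for quartet in quartets:
        c_a, _c_b, c_c, _c_d = quartet
        bins[left_half(c_a) ^ left_half(c_c)].append(quartet)
    return [q for bucket in bins.values() if len(bucket) >= min_bin for q in bucket]


# Synthetic data: three quartets share the same XOR of left halves, two do not.
const = 0xDEADBEEF
shared = [((i << 32) | 1, 0, ((i ^ const) << 32) | 2, 0) for i in (5, 6, 7)]
stray = [(1 << 32, 0, 2 << 32, 0), (3 << 32, 0, 9 << 32, 0)]
print(len(filter_quartets(shared + stray)))
```

The synthetic quartets only exercise the bucketing logic; they are not KASUMI ciphertexts.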

In the following step, we treat all the remaining quartets as right quartets. Under this assumption, we know not only the actual inputs to the F-function of round 8, but also the differences between its outputs.

  3. Analyzing Right Quartets:

    (a) For each remaining quartet \((C_a,C_b,C_c,C_d)\), guess the 32-bit value of \(\mathit{KO}_{8,1}\) and \(\mathit{KI}_{8,1}\). For the two pairs \((C_a,C_c)\) and \((C_b,C_d)\) use the value of the guessed key to compute the input and output differences of the OR operation in FL8 of both pairs (Footnote 7). For each bit of this 16-bit OR operation, the possible values of the corresponding bit of \(\mathit{KL}_{8,2}\) (given the input and output difference of OR in that bit) are given in Table 3. On average, \((8/16)^{16}=2^{-16}\) values of \(\mathit{KL}_{8,2}\) are suggested by each quartet and guess of \(\mathit{KO}_{8,1}\) and \(\mathit{KI}_{8,1}\) (Footnote 8). Since all the right quartets suggest the same key, all the wrong keys are discarded with overwhelming probability, and the adversary obtains the correct value of \((\mathit{KO}_{8,1},\mathit{KI}_{8,1},\mathit{KL}_{8,2})\).

      Table 3. Possible values of \(\mathit{KL}_{8,2}\) and \(\mathit{KL}_{8,1}\).

    (b) Guess the 32-bit value of \(\mathit{KO}_{8,3}\) and \(\mathit{KI}_{8,3}\), and use this information to compute the input and output differences of the AND operation in both pairs of each quartet. For each bit of the 16-bit AND operation of FL8, the possible values of the corresponding bit of \(\mathit{KL}_{8,1}\) are given in Table 3. On average, \((8/16)^{16}=2^{-16}\) values of \(\mathit{KL}_{8,1}\) are suggested by each quartet and guess of \(\mathit{KO}_{8,3}\), \(\mathit{KI}_{8,3}\), and thus the adversary obtains the correct value of \((\mathit{KO}_{8,3},\mathit{KI}_{8,3},\mathit{KL}_{8,1})\).

  4. Finding the Right Key: For each value of the 96 bits of \((\mathit{KO}_{8,1}, \mathit{KI}_{8,1}, \mathit{KO}_{8,3}, \mathit{KI}_{8,3}, \mathit{KL}_{8,1}, \mathit{KL}_{8,2})\) suggested in Step 3, guess the remaining 32 bits of the key, and perform a trial encryption.
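The per-bit deductions used in Step 3 follow directly from how a single key bit interacts with bitwise OR and AND. The sketch below does not reproduce Table 3; it re-derives the candidate key bits by enumeration from the definitions of the two operations, and checks the 8/16 average per bit. All names are illustrative.

```python
from itertools import product


def bit_or(x, k):
    return x | k


def bit_and(x, k):
    return x & k


def key_bits(op, d_in, d_out):
    """Key bits k for which some pair (x, x ^ d_in) yields output difference d_out."""
    return {k for k, x in product((0, 1), repeat=2)
            if op(x, k) ^ op(x ^ d_in, k) == d_out}


def quartet_table(op):
    """Candidate key bits given (d_in, d_out) for both pairs of a quartet."""
    table = {}
    for d1, o1, d2, o2 in product((0, 1), repeat=4):
        table[(d1, o1, d2, o2)] = key_bits(op, d1, o1) & key_bits(op, d2, o2)
    return table


for name, op in (("OR", bit_or), ("AND", bit_and)):
    table = quartet_table(op)
    total = sum(len(candidates) for candidates in table.values())
    print(name, "candidates summed over the 16 cases:", total)
```

For a single pair, a non-zero input difference determines the key bit uniquely, a zero input difference with zero output difference gives no information, and a zero input difference with non-zero output difference is impossible.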

The data complexity of the attack is \(2^{25}\) chosen ciphertexts and \(2^{25}\) adaptively chosen plaintexts encrypted/decrypted under one of four keys. The time complexity is dominated by the trial encryptions performed in Step 4 to find the last 32 bits of the key, and thus it is approximately equal to \(2^{32}\) encryptions. The probability of success is approximately 76 % (this is the probability of having at least three right quartets in the data pool).

The memory complexity of the attack is also very moderate. We just need to store \(2^{26}\) plaintext/ciphertext pairs, where each pair takes 16 bytes. Hence, the total amount of memory used in the attack is \(2^{30}\) bytes, i.e., 1 GByte of memory.
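The figures quoted for the sandwich attack can be re-derived with a few lines of arithmetic. In the sketch below, the Poisson model for the number of right quartets is an assumption made for this illustration (the text only states the resulting probability); the remaining quantities are taken directly from the attack description.

```python
from fractions import Fraction
from math import comb, exp

structure = 2**24                                    # ciphertexts per structure
candidate_quartets = structure**2                    # 2^48 combinations
after_filter = candidate_quartets // 2**32           # 32-bit condition leaves about 2^16

# Probability bound for a 3-collision among 2^16 random 32-bit values.
three_collision = Fraction(comb(after_filter, 3), 2**64)

# Expected number of right quartets: 2^16 * 2^-14.
expected_right = Fraction(after_filter, 2**14)

# Assumption: the number of right quartets is roughly Poisson with this mean.
mean = float(expected_right)
p_at_least_three = 1 - exp(-mean) * (1 + mean + mean**2 / 2)

data = 4 * structure                                 # texts under K_a, K_b, K_c, K_d
memory_bytes = data * 16                             # 16 bytes per plaintext/ciphertext pair

print(float(three_collision), expected_right, round(p_at_least_three, 3), memory_bytes)
```

Under the Poisson assumption the success probability comes out at roughly 0.76, in line with the figure stated above.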

5.1.1 Experimental Verification

We performed two types of experiments to verify our attack. In the first experiment, we just generated the required data, and located the right quartets (thus verifying the correctness of our randomness assumptions). The second experiment was the application of the full attack (both with and without the final exhaustive search over the remaining 32 key bits). All our experiments were carried out on an Intel Core 2 Duo machine with a T7200 CPU (2 GHz, 4 MB L2 cache, 2 GBytes RAM), running a Linux 2.6.27 kernel, with the code compiled by gcc 4.3.2 using standard optimization flags (-O3, -fomit-frame-pointer, -funroll-loops) and run on a single core in a single thread. We note that the experiments used the official reference implementation of KASUMI from [25], which is not optimized for performance (and thus not for exhaustive search).

The first experiment was conducted 1000 times. In each test, we generated the data and found candidate quartets according to Steps 1 and 2 of the attack algorithm. Once these were found, we partially decrypted the quartets, and checked how many quartets were right ones. Table 4 details the outcome of these experiments, which follows the expected distribution.

Table 4. The number of identified right quartets in 1,000 tests.

The second experiment simulated the full attack. We repeated it 100 times, and counted in each case how many times the final exhaustive search over \(2^{32}\) possible keys would have been invoked (Footnote 9). In 78 out of these 100 experiments, 3 or more quartets were identified to be right ones (the expected number was 76.1), and then the key was found.

About 50 % of the tests were able to identify the right key by invoking either 2 or 4 exhaustive searches. As the first part of the attack (which identifies candidate quartets) takes about 8 minutes, and each exhaustive search (using the official KASUMI source code) takes about 26 minutes, we could find the full 128-bit key in about 50 % of our tests in less than 112 minutes (using a single core). It is important to note that by increasing the running time, one can increase the success rate of the attack without increasing its data requirements. The full distribution of the experiments is given in Table 5.

Table 5. The number of exhaustive searches as a function of the number of right quartets (100 experiments).

5.2 Related-Key Rectangle-Like Sandwich Attack on the Full KASUMI

The related-key sandwich distinguisher of 7-round KASUMI presented in Sect. 4 can be transformed in a standard way to a related-key rectangle-like sandwich distinguisher in which the probability of a quartet to be a right quartet is \((1/4)^2 \cdot (1/4)^2 \cdot 2^{-64} \cdot 2^{-6}=2^{-78}\). It is worth noting that in a chosen plaintext manner, one can ensure that the first round of the differential characteristic for \(E_0\) is followed with probability 1 rather than 1/2, and hence the overall probability can be increased to \(2^{-76}\). This distinguisher can be used to mount a related-key rectangle-like sandwich attack on the full KASUMI. The attack is very similar to the attack presented in detail in [7], and hence we omit the full description here, and just mention the changes.

Instead of starting with \(2^{51}\) plaintexts in each structure, the adversary can take \(2^{39}\) plaintexts. These plaintexts contain \(2^{78}\) possible quartets, and after the first filtering step only \(2^{14}\) quartets remain. Then in Step 2(a) of the attack, the adversary gets \(2^{30}\) suggestions for 48 key bits (instead of \(2^{54}\) as in [7]), and thus all the wrong suggestions can be discarded (since the right pairs suggest the same value). As a result, the time complexity of the following steps becomes negligible, and the overall time complexity is dominated by the time required to encrypt the \(2^{41}\) chosen plaintexts. This is worse than the \(2^{32}\) time complexity of our sandwich attack but is still practical, and applies in the more realistic chosen plaintext attack model.

As in the ordinary sandwich attack, the memory used during the rectangle-like sandwich attack is dominated by the storage of the plaintexts and the ciphertexts. By first encrypting the data under keys \(K_a\) and \(K_b\), storing it in a sorted table, and then encrypting the data under \(K_c\) and \(K_d\) in a pair-by-pair manner, we have to store only \(2^{40}\) plaintext/ciphertext pairs. Hence, the total memory complexity of the attack is about 16 TBytes (\(2^{44}\) bytes). Fortunately, this memory is accessed sequentially and can be relatively slow, so only a few hard disks are needed to store this data.
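The exponents quoted for the rectangle-like variant can be checked with exact rational arithmetic. The sketch below only combines numbers stated in this subsection; the expected number of right quartets is derived here from those numbers and is not a figure quoted in the text.

```python
from fractions import Fraction

# Probability that a quartet is a right quartet in the rectangle-like setting.
p_quartet = (Fraction(1, 4) ** 2) * (Fraction(1, 4) ** 2) * Fraction(1, 2**64) * Fraction(1, 2**6)

# First round of the E_0 characteristic holds with probability 1 instead of 1/2
# in each of the two pairs, giving a factor of 4.
p_improved = p_quartet * 4

plaintexts = 2**39
possible_quartets = plaintexts**2                  # 2^78
expected_right = possible_quartets * p_improved    # derived here, not quoted in the text

stored_pairs = 2**40
memory_bytes = stored_pairs * 16                   # 16 bytes per plaintext/ciphertext pair

print(p_quartet == Fraction(1, 2**78), p_improved == Fraction(1, 2**76))
print(expected_right, memory_bytes == 2**44)
```

The resulting expectation of 4 right quartets matches the one obtained for the adaptive sandwich attack of Sect. 5.1.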

6 Summary

In this paper we developed a new sandwich attack on iterated block ciphers, and used it to reduce the time complexity of the best known attack on the full KASUMI from an impractical \(2^{76}\) to the very practical \(2^{32}\). However, the new attack uses both related keys and chosen messages, and thus it is not clear how to apply it in practice to break the specific way in which KASUMI is used in GSM and UMTS (3G) telephony. Our main point was to show that contrary to the assurances of its designers, the transition from MISTY to KASUMI led to a much weaker cryptosystem, which should be avoided in any application in which related-key attacks can be mounted.

6.1 Future Work

A drawback of the generic sandwich technique presented in this paper is the lack of rigorous analysis. While in the specific case of KASUMI, we performed a rigorous analysis of the transition probability at the middle slice \(M\) and further validated our results with experimental verifications, we were not able to provide such a rigorous analysis for the general case. In particular, we cannot give necessary and sufficient conditions on the cipher structure that ensure that the sandwich attack is applicable, nor can we give explicit conditions under which the independence assumptions the technique relies on are satisfied.

As for conditions that allow mounting a sandwich attack, formulating the exact conditions seems impossible, since such conditions should depend heavily on the exact structure of the cipher. What seems possible is to find other generic structures in which the sandwich attack is applicable (such as the Feistel construction presented in Sect. 2).

As for the independence assumptions, the same problem exists even in the much simpler case of differential cryptanalysis, where one cannot verify whether the independence assumption (known as the hypothesis of stochastic equivalence) holds without a large computational effort. In the case of (related-key) boomerang attacks, where the assumptions are close to the assumptions behind the sandwich attack, such conditions were analyzed in [19, 21], and the conclusion was that the assumptions must be checked in each particular case separately. It is likely that the same holds also for the sandwich attack, but any further results regarding the correctness of the independence assumptions in various cases will be interesting.

The last direction for further research refers to the specific case of KASUMI. While the related-key sandwich attack we presented in the paper was accompanied by a full experimental verification, we did not verify experimentally the rectangle-like sandwich attack. An experimental verification of this attack would be interesting, as it would be the first full implementation of a rectangle attack (on any block cipher).

We conclude this paper with a formal statement of the directions for further research raised above.

Problem 1

Find other generic structures in which the sandwich attack is applicable.

Problem 2

Find necessary and sufficient conditions under which the independence assumptions used in the sandwich attack are satisfied.

Problem 3

Verify the validity of the rectangle-like sandwich attack on full KASUMI presented in Sect. 5.2.