
1 Introduction

Most symmetric key primitives, like block ciphers, stream ciphers or hash functions, are based on components that provide confusion and diffusion. Both concepts are very important for the overall security and efficiency of the cryptographic scheme, and extensive studies have been conducted to find the best possible building blocks. The goal of diffusion is to spread the internal dependencies as much as possible. Several designs use a weak yet fast diffusion layer based on simple XOR, addition and shifting operations, but another trend is to rely on strong linear diffusion matrices, like Maximum Distance Separable (MDS) matrices. A typical example is the AES cipher [17], which uses a \(4 \times 4\) matrix over \(\mathrm {GF}(2^8)\) to provide diffusion among a vector of 4 bytes. These mathematical objects guarantee the designers perfect diffusion (the underlying linear code meets the Singleton bound), but can be quite heavy to implement. Software performance is usually not much impacted, as memory is not really constrained and table-based implementations directly incorporate the field multiplications into the stored values. However, hardware implementations will usually suffer from a significant area requirement due to the Galois field multiplications. The impact is also visible in the efficiency of software bitslice implementations, which basically mimic the hardware computation flow.

Good hardware efficiency has become a major design trend in cryptography, due to the increasing importance of ubiquitous computing. Many lightweight algorithms have recently been proposed, notably block ciphers [9, 12, 14, 19] and hash functions [4, 11, 18]. The choice of MDS matrices has played an important role in reducing the area required to provide a certain amount of security. Along with the PHOTON hash function [18], a new type of MDS matrix was proposed that can be computed in a serial or recursive manner. This construction greatly reduces the temporary memory (and thus the hardware area) usually required for the computation of the matrix. Such matrices were later used in the LED block cipher [19] and the PRIMATEs authenticated encryption scheme [1], and were further studied and generalized in subsequent articles [2, 3, 10, 28, 33]. Even though these serial matrices provide a good way to save area, this naturally comes at the expense of an increased number of cycles to apply the matrix. In general, they are not well suited for round-based or low-latency implementations.

Another interesting property that saves area is for the MDS matrix to be involutory. Indeed, in most use cases, both encryption and decryption implementations are required, so the inverse of the MDS matrix has to be implemented as well (except for constructions like Feistel networks, where the inverse of the internal function is not needed for decryption). For example, the MDS matrix of AES is quite lightweight for encryption, but not really for decryption. More generally, it is a valuable advantage to be able to use exactly the same diffusion matrix for encryption and decryption. Some ciphers like ANUBIS [5], KHAZAD [6], ICEBERG [32] or PRINCE [13] even pushed the involution idea a bit further by defining a round function that is almost entirely composed of involution operations, and where the non-involution property of the cipher is mainly provided by the key schedule.

There are several ways to build an MDS matrix [15, 20, 24, 27, 29, 34], a common method being to use a circulant construction, as in the AES block cipher [17] or the WHIRLPOOL hash function [8]. The obvious benefit of a circulant matrix for hardware implementations is that all of its rows are similar (up to a right shift), so one can trivially reuse the multiplication circuit to save implementation costs. However, it has been proven in [23] that circulant matrices of order 4 cannot be simultaneously MDS and involutory, and very recently Gupta et al. [21] proved that circulant involutory MDS matrices do not exist at all. Finding lightweight matrices that are both MDS and involutory is not an easy task, and this topic has attracted attention recently. In [29], the authors consider Vandermonde or Hadamard matrices, while in [15, 20, 34] Cauchy matrices are used. Even though these constructions can build involutory MDS matrices of large dimension, it is difficult to find the most lightweight candidates, as the search space becomes very large.

Our Contributions. In this article, we propose a new method to search for lightweight MDS matrices, with an important focus on involutory ones. After recalling the formula to compute the XOR count, i.e. the number of XORs required to evaluate one row of the matrix, we show in Sect. 2 that the choice of the irreducible polynomial is important and can have a significant impact on efficiency, as remarked in [26]. In particular, we show that the best choice is not necessarily a low Hamming weight polynomial, as widely believed, but instead one whose XOR count distribution has a high standard deviation. Then, in Sect. 3, we recall some constructions to obtain (involutory) MDS matrices: circulant, Hadamard, Cauchy and Hadamard-Cauchy. In particular, we prove new properties for some of these constructions, which will later help us to find good matrices. In Sect. 4, we prove the existence of equivalence classes for Hadamard matrices and involutory Hadamard-Cauchy matrices, and we use these observations to design improved search algorithms for lightweight (involutory) MDS matrices. Our methods can also be relaxed and applied to the search for lightweight non-involutory MDS matrices. In Sect. 5, we quickly describe these new algorithms, providing all the details for lightweight involutory and non-involutory MDS matrices in the full version of this article [31]. These algorithms are significant because they make an exhaustive search feasible, whereas the search space of the algorithms described in [15, 20] is too big to be exhausted. Our algorithms guarantee that the matrices found are the lightest according to our metric.

Overall, using the Hadamard or Hadamard-Cauchy constructions, we provide the smallest known (involutory or non-involutory) MDS matrices for the classical dimensions \(4 \times 4\), \(8 \times 8\), \(16 \times 16\) and \(32 \times 32\) in \(\mathrm {GF}(2^4)\) and \(\mathrm {GF}(2^8)\). The designers of Joltik [22], one of the CAESAR competition candidates, used one of the matrices we found to build their primitive. All our results are summarized and discussed in Sect. 6. Surprisingly, it seems that involutory MDS matrices are not much more expensive than non-involutory ones, while the former provide the great advantage of a free inverse implementation. We recall that in this article we do not consider serial matrices, as their evaluation requires either many clock cycles (for serial implementations) or a significant area (for round-based implementations).

Due to space constraints, all proofs are given in the full version of this article [31].

Notations and Preliminaries. We denote by \(\mathrm {GF}(2^r)\) the finite field with \(2^r\) elements, \(r\ge 1\). This field is isomorphic to polynomials in \(\mathrm {GF}(2)[X]\) modulo an irreducible polynomial p(X) of degree r, meaning that every field element can be seen as a polynomial \(\alpha (X)\) with coefficients in \(\mathrm {GF}(2)\) and of degree at most \(r-1\): \(\alpha (X)=\sum ^{r-1}_{i=0}b_i X^i\), \(b_i \in \mathrm {GF}(2)\), \(0 \le i \le r-1\). The polynomial \(\alpha (X)\) can also naturally be viewed as the r-bit string \((b_{r-1}, b_{r-2}, ..., b_0)\). In the rest of the article, an element \(\alpha \) in \(\mathrm {GF}(2^r)\) will be seen either as the polynomial \(\alpha (X)\) or as the r-bit string written in hexadecimal, prefixed with \(\mathsf {0x}\). For example, in \(\mathrm {GF}(2^8)\), the 8-bit string 00101010 corresponds to the polynomial \(X^5+X^3+X\), written \(\mathsf {0x2a}\) in hexadecimal.

The addition operation on \(\mathrm {GF}(2^r)\) is simply defined as a bitwise XOR on the coefficients of the polynomial representation of the elements, and does not depend on the choice of the irreducible polynomial p(X). However, for multiplication, one needs to specify the irreducible polynomial p(X) of degree r. We denote this field as \(\mathrm {GF}(2^r)/p(X)\), where p(X) can be given in hexadecimal representation. The multiplication of two elements is then the modulo p(X) reduction of the product of the polynomial representations of the two elements.
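To make the arithmetic concrete, here is a minimal Python sketch of multiplication in \(\mathrm {GF}(2^r)/p(X)\). The function name `gf_mul` and its structure are our own illustration, not part of the article.

```python
def gf_mul(a, b, p, r):
    """Multiply a and b as polynomials over GF(2), then reduce mod p (degree r)."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a          # add (XOR) a shifted copy of a for each set bit of b
        a <<= 1
        b >>= 1
    for d in range(prod.bit_length() - 1, r - 1, -1):
        if (prod >> d) & 1:    # eliminate the degree-d term using p
            prod ^= p << (d - r)
    return prod

# In GF(2^8)/0x11b (the AES field): X^7 * X = X^8 = X^4 + X^3 + X + 1 = 0x1b
assert gf_mul(0x80, 0x02, 0x11b, 8) == 0x1b
# Addition is polynomial-independent: it is just a bitwise XOR
assert 0x2a ^ 0x0f == 0x25
```

Note that the reduction loop is exactly where the choice of p(X) enters: the same operands reduce to different results under different irreducible polynomials.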

Finally, we denote by M[i, j] the (i, j) entry of the matrix M; we start counting from 0, that is, M[0, 0] is the entry in the first row and first column.

2 Analyzing XOR Count According to Different Finite Fields

In this section, we explain the XOR count, which we will use as a measure to evaluate the lightweightness of a given matrix. Then, we analyze the XOR count distribution depending on the finite field and irreducible polynomial considered. Although it is known that finite fields of the same size are isomorphic to each other, and it is believed that the security of MDS matrices is not impacted by this choice, the XOR count is an aspect of finite fields that has remained unexplored in cryptography.

2.1 The XOR Count

Note that the XOR count is a simplified, easy-to-manipulate metric; MDS coefficients have often been chosen to lower the XOR count, e.g. by having low Hamming weight. As shown in [26], a low XOR count is strongly correlated with the minimization of hardware area.

Later in this article, we will study the hardware efficiency of some diffusion matrices and we will search among huge sets of candidates. One of the goals will therefore be to minimize the area required to implement these lightweight matrices, and since they will be implemented with XOR gates (the diffusion layer is linear), we need a way to easily evaluate how many XORs will be required to implement them. We explain our method in this subsection.

In general, it is known that coefficients of low Hamming weight require fewer hardware resources, and this is the usual criterion for picking a matrix. For example, the coefficients of the AES MDS matrix are 1, 2 and 3, in the hope that this ensures a lightweight implementation. However, it was shown in [26] that while this heuristic is true in general, it is not always the case. Due to some reduction effects, and depending on the irreducible polynomial defining the computation field, some coefficients with not-so-low Hamming weight might be implemented with very few XORs.

Definition 1

The XOR count of an element \(\alpha \) in the field \(\mathrm {GF}(2^r)/p(X)\) is the number of XORs required to implement the multiplication of \(\alpha \) with an arbitrary \(\beta \) over \(\mathrm {GF}(2^r)/p(X)\).

For example, let us explain how we compute the XOR count of \(\alpha =3\) over \(\mathrm {GF}(2^4)/\mathsf {0x13}\) and \(\mathrm {GF}(2^4)/\mathsf {0x19}\). Let \((b_3,b_2,b_1,b_0)\) be the binary representation of an arbitrary element \(\beta \) in the field. For \(\mathrm {GF}(2^4)/\mathsf {0x13}\), we have:

$$\begin{aligned} (0,0,1,1)\,\cdot \,(b_3,b_2,b_1,b_0)&= (b_2,b_1,b_0 \oplus b_3 ,b_3) \oplus (b_3,b_2,b_1,b_0) \\&= (b_2 \oplus b_3, b_1 \oplus b_2,b_0 \oplus b_1 \oplus b_3, b_0 \oplus b_3), \end{aligned}$$

which corresponds to 5 XORs. For \(\mathrm {GF}(2^4)/\mathsf {0x19}\), we have:

$$\begin{aligned} (0,0,1,1)\,\cdot \,(b_3,b_2,b_1,b_0)&= (b_2 \oplus b_3,b_1,b_0,b_3) \oplus (b_3,b_2,b_1,b_0) \\&= (b_2, b_1 \oplus b_2, b_0 \oplus b_1, b_0 \oplus b_3), \end{aligned}$$

which corresponds to 3 XORs. One can observe that the XOR count differs depending on the irreducible polynomial defining the finite field.
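The computation above can be automated. The sketch below (our own code, with our own function names) builds the \(r \times r\) binary multiplication matrix of \(\alpha \) column by column and counts the XORs exactly as in the worked example: an output bit fed by t input bits costs \(t-1\) XOR gates, so the total is the number of ones in the matrix minus r (every row is nonzero for \(\alpha \ne 0\)).

```python
def gf_mul(a, b, p, r):
    """Multiply a and b as polynomials over GF(2), then reduce mod p (degree r)."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for d in range(prod.bit_length() - 1, r - 1, -1):
        if (prod >> d) & 1:
            prod ^= p << (d - r)
    return prod

def xor_count(alpha, p, r):
    """XOR count of multiplication by alpha in GF(2^r)/p.

    Column i of the GF(2)-linear multiplication matrix is alpha * X^i mod p;
    the XOR count is (total ones in the matrix) - r, since each output bit
    fed by t inputs costs t - 1 XOR gates.
    """
    ones = sum(bin(gf_mul(alpha, 1 << i, p, r)).count("1") for i in range(r))
    return ones - r

assert xor_count(3, 0x13, 4) == 5   # matches the worked example above
assert xor_count(3, 0x19, 4) == 3
```

Multiplication by 1 is a plain wiring (identity matrix), so `xor_count(1, p, r)` is 0 for any irreducible polynomial.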

In order to calculate the number of XORs required to implement an entire row of a matrix, we can use the following formula given in [26]:

$$\begin{aligned} {\text {XOR count for one row of}}\ M = (\gamma _1 + \gamma _2 + \cdots + \gamma _k)+(n-1)\cdot r, \end{aligned}$$
(1)

where \(\gamma _i\) is the XOR count of the i-th entry in the row of M, n is the number of nonzero elements in the row, and r is the dimension of the finite field.

For example, the first row of the AES diffusion matrix is (1, 1, 2, 3) over the field \(\mathrm {GF}(2^8)/\mathsf {0x11b}\), so the XOR count for the first row is \((0+0+3+11)+3 \times 8 = 38\) XORs (the matrix being circulant, all rows are equivalent in terms of XOR count).
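Formula (1) is simple enough to check directly. The helper below is our own naming; the per-entry XOR counts 0, 0, 3 and 11 for the AES row are the values quoted in the text above.

```python
def row_xor_count(gammas, r):
    """XOR count of one matrix row, per the formula above.

    gammas: XOR counts of the (nonzero) entries in the row,
    r: field dimension.  Summing the n resulting r-bit words
    costs an extra (n - 1) * r XORs.
    """
    n = len(gammas)
    return sum(gammas) + (n - 1) * r

# First row (1, 1, 2, 3) of the AES matrix over GF(2^8)/0x11b:
# entry XOR counts 0, 0, 3, 11 (values taken from the text)
assert row_xor_count([0, 0, 3, 11], 8) == 38
```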

2.2 XOR Count for Different Finite Fields

We programmed a tool that computes the XOR count of every nonzero element of \(\mathrm {GF}(2^r)\) for \(r=2,\ldots ,8\) and for all possible irreducible polynomials (all the tables will be given in [31]; we provide an extract in Appendix B). By analyzing the outputs of this tool, we made two observations that are important to understand how the choice of the irreducible polynomial affects the XOR count. Before presenting our observations, we recall some terminology and properties related to reciprocal polynomials.

Definition 2

A reciprocal polynomial \(\frac{1}{p}(X)\) of a polynomial p(X) over \(\mathrm {GF}(2^r)\), is a polynomial expressed as \(\frac{1}{p}(X) = X^{r} \cdot p(X^{-1})\). A reciprocal finite field, \(\mathbf {K} = \mathrm {GF}(2^r)/\frac{1}{p}(X)\), is a finite field defined by the reciprocal polynomial which defines \(\mathbf {F} = \mathrm {GF}(2^r)/p(X)\).

In other words, a reciprocal polynomial is a polynomial with the order of its coefficients reversed. For example, the reciprocal polynomial of \(p(X)=\mathsf {0x11b}\) in \(\mathrm {GF}(2^8)\) is \(\frac{1}{p}(X)=\mathsf {0x\frac{1}{11b}}=\mathsf {0x1b1}\). Note also that the reciprocal polynomial of an irreducible polynomial is also irreducible.
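Since the coefficients are simply reversed, the reciprocal of a degree-r polynomial can be computed by reversing its \((r+1)\)-bit representation. A small sketch (our own helper name):

```python
def reciprocal(p, r):
    """Reciprocal of a degree-r polynomial p over GF(2): X^r * p(1/X),
    i.e. the (r+1)-bit representation of p reversed."""
    return int(format(p, "0{}b".format(r + 1))[::-1], 2)

assert reciprocal(0x11b, 8) == 0x1b1   # 100011011 reversed is 110110001
assert reciprocal(0x13, 4) == 0x19     # 10011 reversed is 11001
assert reciprocal(0x1f, 4) == 0x1f     # 0x1f is self-reciprocal
```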

The Total XOR Count. Our first new observation is that even though the choice of the irreducible polynomial has an impact on the XOR count of an individual field element, the total XOR count summed over all elements of the field is independent of this choice. We state this observation in the following theorem.

Theorem 1

The total XOR count for a field \(\mathrm {GF}(2^r)\) is \(r\sum ^{r}_{i=2}2^{i-2}(i-1)\), where \(r \ge 2\).

From Theorem 1, there is no clear implication that one irreducible polynomial is strictly better than another, as the mean XOR count is the same for any irreducible polynomial. However, different irreducible polynomials give different distributions of the XOR count among the field elements, which is quantified by the standard deviation. A high standard deviation implies that the distribution of the XOR count deviates strongly from the mean, so there will be more elements with relatively low (and high) XOR count. In general, the order of the finite field is much larger than the order of the MDS matrix, and since only a few elements of the field will be used in the matrix, a higher standard deviation gives a better chance of finding an MDS matrix with a lower XOR count.

Hence, our recommendation is to choose the irreducible polynomial with the highest standard deviation of the XOR count distribution. Continuing the previous example in \(\mathrm {GF}(2^4)\) (where the mean XOR count equals 4.25), the irreducible polynomials \(\mathsf {0x13}\) and \(\mathsf {0x19}\) lead to a standard deviation of 2.68, while \(\mathsf {0x1f}\) leads to a standard deviation of 1.7075. Therefore, the first two polynomials seem to be a better choice. This observation will allow us to choose the best irreducible polynomial to start with during the searches. We refer to [31] for the standard deviations of all irreducible polynomials; an extract is provided in Appendix B.
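Both observations can be reproduced numerically. The sketch below (our own code and function names) computes the XOR count distribution over the nonzero elements of \(\mathrm {GF}(2^4)\) for the three irreducible polynomials, checks that the total matches Theorem 1, and compares the spreads; the exact standard deviation values depend on normalization conventions, so we only assert the ordering.

```python
from statistics import pstdev

def gf_mul(a, b, p, r):
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for d in range(prod.bit_length() - 1, r - 1, -1):
        if (prod >> d) & 1:
            prod ^= p << (d - r)
    return prod

def xor_count(alpha, p, r):
    # ones in the r x r GF(2) multiplication matrix of alpha, minus r
    return sum(bin(gf_mul(alpha, 1 << i, p, r)).count("1")
               for i in range(r)) - r

def distribution(p, r):
    return [xor_count(a, p, r) for a in range(1, 2 ** r)]

def total_by_theorem_1(r):
    return r * sum(2 ** (i - 2) * (i - 1) for i in range(2, r + 1))

# Theorem 1: the total XOR count is the same for every irreducible polynomial
for p in (0x13, 0x19, 0x1f):
    assert sum(distribution(p, 4)) == total_by_theorem_1(4) == 68

# ...but the spread differs: the reciprocal pair 0x13 / 0x19 shares one
# distribution, with a higher standard deviation than self-reciprocal 0x1f
assert sorted(distribution(0x13, 4)) == sorted(distribution(0x19, 4))
assert pstdev(distribution(0x13, 4)) > pstdev(distribution(0x1f, 4))
```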

We note that the folklore belief was that in order to get lightweight implementations, one should use a low Hamming weight irreducible polynomial. The underlying idea is that with such a polynomial, fewer XORs might be needed when the modular reduction has to be applied during a field multiplication. However, we have shown that this is not necessarily true. Yet, looking at the data in Appendix B, we remark that low Hamming weight irreducible polynomials usually have a high standard deviation, which actually validates the folklore belief. We conjecture that this heuristic will become less and less accurate for fields of higher and higher order.

Matching XOR Count. Our second new observation is that the XOR count distribution implied by a polynomial will be the same compared to the distribution of its reciprocal counterpart. We state this observation in the following theorem.

Theorem 2

There exists an isomorphic mapping from a primitive element \(\alpha \in \mathrm {GF}(2^r)/p(X)\) to another primitive element \(\beta \in \mathrm {GF}(2^r)/\frac{1}{p}(X)\) such that the XOR counts of \(\alpha ^i\) and \(\beta ^i\) are equal for each \(i \in \{1,2,...,2^r-1\}\).

In Table 4 of Appendix A, we list all the primitive mappings from a finite field to its reciprocal finite field, for all fields \(\mathrm {GF}(2^r)\) with \(r=2,\ldots ,8\) and for all possible irreducible polynomials. We give an example to illustrate our theorem. For \(\mathrm {GF}(2^4)\), there are three irreducible polynomials, \(\mathsf {0x13}\), \(\mathsf {0x19}\) and \(\mathsf {0x1f}\), and the XOR counts of their elements are shown in Appendix B. From the binary representation we see that \(\mathsf {0x\frac{1}{13}}=\mathsf {0x19}\). Consider an isomorphic mapping \(\phi : \mathrm {GF}(2^4)/\mathsf {0x13} \rightarrow \mathrm {GF}(2^4)/\mathsf {0x19}\) defined by \(\phi (2)=12\), where 2 and 12 are primitive elements of the respective finite fields. Table 3 of Appendix A shows that the order of the XOR count is the same.

We remark that for a self-reciprocal irreducible polynomial, for instance \(\mathsf {0x1f}\) in \(\mathrm {GF}(2^4)\), there also exists an automorphism mapping one primitive element to another with the same order of XOR count (see Appendix A).

Theorem 2 shows that we do not need to consider \(\mathrm {GF}(2^r)/\frac{1}{p}(X)\) when searching for lightweight matrices. As there exists an isomorphic mapping preserving the order of the XOR count, any MDS matrix over \(\mathrm {GF}(2^r)/\frac{1}{p}(X)\) can be mapped to an MDS matrix over \(\mathrm {GF}(2^r)/p(X)\) with the same XOR count. Therefore, it is redundant to search for lightweight MDS matrices over \(\mathrm {GF}(2^r)/\frac{1}{p}(X)\), as the lightest MDS matrix can also be found over \(\mathrm {GF}(2^r)/p(X)\). This makes our algorithms much more efficient: when exhaustively searching for low XOR count MDS matrices over the finite fields defined by the various irreducible polynomials, one can reduce the search space by almost a factor of 2, as the reciprocal polynomials are redundant.

3 Types of MDS Matrices and Properties

In this section, we first recall a few properties of MDS matrices and we then explain various constructions of (involutory) MDS matrices that were used to generate lightweight candidates. Namely, we will study 4 types of diffusion matrices: circulant, Hadamard, Cauchy, and Hadamard-Cauchy. We recall that we do not consider serially computable matrices in this article, like the ones described in [2, 3, 18, 19, 28, 33], since they are not adapted to round-based implementations. As MDS matrices are widely studied and their properties are commonly known, their definition and properties are given in [31].

3.1 Circulant Matrices

A common way to build an MDS matrix is to start from a circulant matrix, the reason being that the probability of finding an MDS matrix is then higher than for a generic square matrix [16].

Definition 3

A \(k \times k\) matrix C is circulant when each row vector is rotated to the right relative to the preceding row vector by one element. The matrix is then fully defined by its first row.

An interesting property of circulant matrices is that since each row differs from the previous one by a right shift, one can implement just one row of the matrix multiplication in hardware and reuse the multiplication circuit for subsequent rows by simply shifting the input. However, we show in [31] that these matrices are not the best choice.

3.2 Hadamard Matrices

Definition 4

([20]). A finite field Hadamard (or simply called Hadamard) matrix H is a \(k \times k\) matrix, with \(k = 2^s\), that can be represented by two other submatrices \(H_1\) and \(H_2\) which are also Hadamard matrices:

$$ H= \begin{pmatrix} H_1 &{} H_2 \\ H_2 &{} H_1 \\ \end{pmatrix}. $$

Similarly to [20], in order to represent a Hadamard matrix we use the notation \(had(h_0, h_1, ...,h_{k-1})\) (with \(h_i=H[0,i]\) standing for the entries of the first row of the matrix), where \(H[i,j] = h_{i \oplus j}\) and \(k = 2^s\). It is clear that a Hadamard matrix is bisymmetric. Indeed, if we define the left and right diagonal reflection transformations as \(H_L = T_L(H)\) and \(H_R = T_R(H)\) respectively, we have \(H_L[i,j] = H[j,i] = H[i,j]\) and \(H_R[i,j] = H[k-1-i,k-1-j] = H[i,j]\) (the binary representation of \(k-1 = 2^s-1\) is all ones, hence \(k-1-i= (k-1) \oplus i\)).

Moreover, by carrying out the multiplication directly, one can check that if \(H=had(h_0, h_1, ...,h_{k-1})\) is a Hadamard matrix, then \(H \times H = c^2 \cdot I\), with \(c^2 = h_0^2 + h_1^2 + h_2^2 + ... + h_{k-1}^2\). In other words, the product of a Hadamard matrix with itself is a multiple of the identity matrix, where the multiple \(c^2\) is the sum of the squares of the elements of the first row.

A direct and crucial corollary of this fact is that a Hadamard matrix over \(\mathrm {GF}(2^r)\) is an involution if the sum of the elements of its first row is equal to 1 (in characteristic 2, the cross terms vanish and \(c^2 = (h_0 + h_1 + ... + h_{k-1})^2\)). Moreover, if one deals with a Hadamard matrix for which the sum of the first row over \(\mathrm {GF}(2^r)\) is nonzero, we can very simply make it involutory by dividing it by the sum of its first row.
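As a numeric sanity check of the involution condition, the sketch below (our own code; the entries 1, 2, 4, 6 are chosen arbitrarily to make the first row sum to 1, with no claim of MDS-ness) builds \(had(1,2,4,6)\) over \(\mathrm {GF}(2^4)/\mathsf {0x13}\) and verifies that \(H \times H\) is the identity.

```python
def gf_mul(a, b, p, r):
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for d in range(prod.bit_length() - 1, r - 1, -1):
        if (prod >> d) & 1:
            prod ^= p << (d - r)
    return prod

def had(first_row):
    """Hadamard matrix H[i][j] = h_{i XOR j} built from its first row."""
    k = len(first_row)
    return [[first_row[i ^ j] for j in range(k)] for i in range(k)]

def mat_mul(A, B, p, r):
    """Matrix product over GF(2^r)/p; addition of field elements is XOR."""
    k = len(A)
    C = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            acc = 0
            for t in range(k):
                acc ^= gf_mul(A[i][t], B[t][j], p, r)
            C[i][j] = acc
    return C

# First row sums to 1 ^ 2 ^ 4 ^ 6 = 1, so c^2 = 1 and H * H is the identity
H = had([0x1, 0x2, 0x4, 0x6])
I = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
assert mat_mul(H, H, 0x13, 4) == I
```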

We will use these considerations to generate low-dimension diffusion matrices (order 4 and 8) with an innovative exhaustive search over all possible Hadamard matrices. We note that, similarly to a circulant matrix, a Hadamard matrix has the interesting property that each row is a permutation of the first row, allowing one to reuse the multiplication circuit to save implementation costs.

3.3 Cauchy Matrices

Definition 5

A square Cauchy matrix, C, is a \(k \times k\) matrix constructed with two disjoint sets of elements from \(\mathrm {GF}(2^r)\), \(\{\alpha _0, \alpha _1, ..., \alpha _{k-1}\}\) and \(\{\beta _0, \beta _1, ..., \beta _{k-1}\}\) such that \(C[i,j] = \frac{1}{\alpha _{i} + \beta _{j}}\).

It is known that the determinant of a square Cauchy matrix, C, is given as

$$\begin{aligned} \det (C) = \frac{\prod _{0 \le i < j \le k-1}(\alpha _j - \alpha _i)(\beta _j - \beta _i)}{\prod _{0 \le i, j \le k-1}(\alpha _i + \beta _j)}. \end{aligned}$$

Since \(\alpha _i \ne \alpha _j\) and \(\beta _i \ne \beta _j\) for all distinct \(i,j \in \{0,1,...,k-1\}\), a Cauchy matrix is nonsingular. Note that for a Cauchy matrix over \(\mathrm {GF}(2^r)\), subtraction is equivalent to addition as the finite field has characteristic 2. As the sets are disjoint, we have \(\alpha _i \ne \beta _j\), so all entries are well-defined and nonzero. In addition, any square submatrix of a Cauchy matrix is also a Cauchy matrix, as it is equivalent to constructing a smaller Cauchy matrix from subsets of the two disjoint sets. Therefore, a Cauchy matrix is an MDS matrix.
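The argument can be checked exhaustively for a small instance. The sketch below (our own code; the two disjoint sets are arbitrary) builds a \(4 \times 4\) Cauchy matrix over \(\mathrm {GF}(2^4)/\mathsf {0x13}\) and verifies the MDS property by testing that every square submatrix is nonsingular.

```python
from itertools import combinations

def gf_mul(a, b, p, r):
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for d in range(prod.bit_length() - 1, r - 1, -1):
        if (prod >> d) & 1:
            prod ^= p << (d - r)
    return prod

def gf_inv(a, p, r):
    """Inverse in GF(2^r)* as a^(2^r - 2), by square-and-multiply."""
    result, e = 1, 2 ** r - 2
    while e:
        if e & 1:
            result = gf_mul(result, a, p, r)
        a = gf_mul(a, a, p, r)
        e >>= 1
    return result

def cauchy(alphas, betas, p, r):
    """C[i][j] = 1 / (alpha_i + beta_j); addition in GF(2^r) is XOR."""
    return [[gf_inv(a ^ b, p, r) for b in betas] for a in alphas]

def det(M, p, r):
    """Determinant over GF(2^r) by cofactor expansion; the alternating
    signs vanish in characteristic 2, so cofactors are simply XORed."""
    if len(M) == 1:
        return M[0][0]
    d = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        d ^= gf_mul(M[0][j], det(minor, p, r), p, r)
    return d

def is_mds(M, p, r):
    """A matrix is MDS iff every square submatrix is nonsingular."""
    k = len(M)
    return all(
        det([[M[i][j] for j in cols] for i in rows], p, r) != 0
        for s in range(1, k + 1)
        for rows in combinations(range(k), s)
        for cols in combinations(range(k), s))

# Two disjoint sets of GF(2^4) elements give an MDS matrix by construction
C = cauchy([0x0, 0x1, 0x2, 0x3], [0x4, 0x5, 0x6, 0x7], 0x13, 4)
assert all(e != 0 for row in C for e in row)
assert is_mds(C, 0x13, 4)
```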

3.4 Hadamard-Cauchy Matrices

The innovative exhaustive search over Hadamard matrices is sufficient to generate low-dimension diffusion matrices (order 4 and 8). However, the cost of verifying the MDS property and the size of the exhaustive search space grow exponentially, and it eventually becomes impractical to search for higher-dimension Hadamard matrices (order 16 or more). Therefore, we use the Hadamard-Cauchy matrix construction, proposed in [20] as an evolution of the involutory MDS Vandermonde matrices [28], which guarantees an involutory MDS matrix.

In [20], the authors proposed a \(2^s \times 2^s\) matrix construction that combines the characteristics of Hadamard and Cauchy matrices. Because it is a Cauchy matrix, a Hadamard-Cauchy matrix is MDS. And because it is a Hadamard matrix, it is involutory when \(c^2\) is equal to 1. Therefore, we can construct a Hadamard-Cauchy matrix, check whether the sum of its first row is equal to 1 and, if so, we have an involutory MDS matrix. A detailed discussion of Hadamard-Cauchy matrices is given in [31].

4 Equivalence Classes of Hadamard-Based Matrices

Our methodology for finding lightweight MDS matrices is to perform an innovative exhaustive search and then pick the matrix with the lowest XOR count. Naturally, the main problem to tackle is the huge search space. By exploiting the properties of Hadamard matrices, we found ways to group them into equivalence classes and significantly reduce the search space. In this section, we introduce the equivalence classes of Hadamard matrices and the equivalence classes of involutory Hadamard-Cauchy matrices. It is important to note that these two kinds of equivalence classes are rather different, as they are defined by very different relations.

4.1 Equivalence Classes of Hadamard Matrices

It is known that a Hadamard matrix is defined by its first row, and different permutations of the first row result in different Hadamard matrices with possibly different branch numbers. In order to find a lightweight MDS involution matrix, it is necessary to have a set of k elements with relatively low XOR count that sums to 1 (to guarantee involution). Moreover, we need all coefficients in the first row to be distinct. Indeed, if the first row of a Hadamard matrix contains the same element twice, say \(H[0,i] = H[0,j]\) for distinct \(i,j \in \{0,1,...,k-1\}\), then in another row we have \(H[i \oplus j, i]=H[i \oplus j, j]\). These 4 equal entries form a singular \(2 \times 2\) submatrix, and thus H is not MDS.
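The argument is easy to visualize with symbolic entries. A small sketch (our own illustration, using strings in place of field elements):

```python
def had(first_row):
    """Hadamard matrix H[i][j] = h_{i XOR j} built from its first row."""
    k = len(first_row)
    return [[first_row[i ^ j] for j in range(k)] for i in range(k)]

# Suppose the first row repeats an element: H[0,1] == H[0,2] == 'b'.
# Then rows {0, 1 XOR 2 = 3} and columns {1, 2} pick out a singular
# 2x2 submatrix, so the matrix cannot be MDS.
H = had(["a", "b", "b", "c"])
i, j = 1, 2
sub = [[H[0][i], H[0][j]], [H[i ^ j][i], H[i ^ j][j]]]
assert sub == [["b", "b"], ["b", "b"]]
```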

By permuting the entries, we hope to find an MDS involution matrix. However, given k distinct nonzero elements, there are k! ways to permute the first row of the Hadamard matrix, which quickly becomes intractable. Therefore, we introduce a relation that groups permutations leading to the same branch number.

Definition 6

Let H and \(H^{(\sigma )}\) be two Hadamard matrices with the same set of entries up to some permutation \(\sigma \). We say that they are related, \(H \sim H^{(\sigma )}\), if every pair of input vectors \((v, v^{(\sigma )})\), related by the same permutation \(\sigma \), yields output vectors of H and \(H^{(\sigma )}\) with the same set of elements.

For example, let us consider the following three Hadamard matrices

$$ H = \begin{pmatrix} w &{} x &{} y &{} z\\ x &{} w &{} z &{} y\\ y &{} z &{} w &{} x\\ z &{} y &{} x &{} w\\ \end{pmatrix}, \quad \quad H^{(\sigma _1)} = \begin{pmatrix} y &{} z &{} w &{} x\\ z &{} y &{} x &{} w\\ w &{} x &{} y &{} z\\ x &{} w &{} z &{} y\\ \end{pmatrix}, \quad \quad H^{(\sigma _2)} = \begin{pmatrix} w &{} x &{} z &{} y\\ x &{} w &{} y &{} z\\ z &{} y &{} w &{} x\\ y &{} z &{} x &{} w\\ \end{pmatrix}. $$

One can see that \(H^{(\sigma _1)}\) is defined by the third row of H, i.e. the rows are shifted by two positions, with \(\sigma _1=\{ 2,3,0,1 \}\). Let us consider an arbitrary input vector for H, say \(v = (a,b,c,d)\). Then, applying the permutation to v, we obtain \(v^{(\sigma _1)} = (c,d,a,b)\). We can observe that:

$$\begin{aligned} v \cdot H = (aw+bx+cy+dz, ax+bw+cz+dy, ay+bz+cw+dx, az+by+cx+dw), \end{aligned}$$
$$\begin{aligned} v^{(\sigma _1)} \cdot H^{(\sigma _1)}\,{=}\,(cy+dz+aw+bx, cz+dy+ax+bw, cw+dx+ay+bz, cx+dw+az+by), \end{aligned}$$

It is now easy to see that \(v \cdot H = v^{(\sigma _1)} \cdot H^{(\sigma _1)}\). Hence, we say that \(H \sim H^{(\sigma _1)}\). Similarly, with \(\sigma _2=\{ 0,1,3,2 \}\), we have \(v^{(\sigma _2)} = (a,b,d,c)\) and:

$$\begin{aligned} v \cdot H = (aw+bx+cy+dz, ax+bw+cz+dy, ay+bz+cw+dx, az+by+cx+dw), \end{aligned}$$
$$\begin{aligned} v^{(\sigma _2)} \cdot H^{(\sigma _2)} \,{=}\, (aw+bx+dz+cy, ax+bw+dy+cz, az+by+dw+cx, ay+bz+dx+cw), \end{aligned}$$

and since \(v \cdot H\) and \(v^{(\sigma _2)} \cdot H^{(\sigma _2)}\) are the same up to the permutation \(\sigma _2\), we can say that \(H \sim H^{(\sigma _2)}\).
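Both relations can be checked numerically over a concrete field. The sketch below (our own code; the first row and input vector are arbitrarily chosen elements of \(\mathrm {GF}(2^4)/\mathsf {0x13}\)) verifies the \(\sigma _1\) and \(\sigma _2\) examples above.

```python
def gf_mul(a, b, p, r):
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for d in range(prod.bit_length() - 1, r - 1, -1):
        if (prod >> d) & 1:
            prod ^= p << (d - r)
    return prod

def had(first_row):
    k = len(first_row)
    return [[first_row[i ^ j] for j in range(k)] for i in range(k)]

def apply(v, H, p, r):
    """Vector-matrix product v . H over GF(2^r)/p (addition is XOR)."""
    k = len(v)
    out = []
    for j in range(k):
        acc = 0
        for i in range(k):
            acc ^= gf_mul(v[i], H[i][j], p, r)
        out.append(acc)
    return out

P, R = 0x13, 4
H = had([0x1, 0x2, 0x4, 0x6])   # entries playing the roles of (w, x, y, z)
v = [0x9, 0x3, 0x5, 0xe]        # an arbitrary input vector

# sigma_1 = {2,3,0,1}: H^(sigma_1) is defined by the third row of H,
# and the permuted input yields exactly the same output vector
s1 = [2, 3, 0, 1]
H1 = had([H[0][s1[j]] for j in range(4)])
v1 = [v[s1[i]] for i in range(4)]
assert apply(v1, H1, P, R) == apply(v, H, P, R)

# sigma_2 = {0,1,3,2} is linear w.r.t. XOR: outputs agree up to sigma_2
s2 = [0, 1, 3, 2]
H2 = had([H[0][s2[j]] for j in range(4)])
v2 = [v[s2[i]] for i in range(4)]
u = apply(v, H, P, R)
assert apply(v2, H2, P, R) == [u[s2[j]] for j in range(4)]
```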

Definition 7

An equivalence class of Hadamard matrices is a set of Hadamard matrices satisfying the equivalence relation \(\sim \).

Proposition 1

Hadamard matrices in the same equivalence class have the same branch number.

When searching for an MDS matrix, we can make use of this property to greatly reduce the search space: if one Hadamard matrix in an equivalence class is not MDS, then no other Hadamard matrix in the same equivalence class is MDS either. Therefore, it all boils down to analyzing how many and which permutations of the Hadamard matrices belong to the same equivalence class. Using the two previous examples \(\sigma _1\) and \(\sigma _2\) as building blocks, we generalize them in two lemmas.

Lemma 1

Given a Hadamard matrix H, any Hadamard matrix \(H^{(\alpha )}\) defined by the \((\alpha +1)\)-th row of H, with \(\alpha \in \{0,1,2,...,k-1\}\), is equivalent to H.

Next, let us consider the other type of permutation. We saw in the example with \(\sigma _2\) that the input and output vectors are the same up to the permutation applied to the Hadamard matrix. Let \(H^{(\sigma )}\), \(v^{(\sigma )}\) and \(u^{(\sigma )}\) denote the permuted Hadamard matrix, the permuted input vector and the corresponding output vector. We want the permutation to satisfy \(u_{\sigma (j)} = u^{(\sigma )}_j\) for \(j \in \{0,1,...,k-1 \}\); that is, the permutation of the output vector of H is the same as the output vector of \(H^{(\sigma )}\). Using the definition of a Hadamard matrix, we can rewrite this as

$$ \bigoplus _{i=0}^{k-1}v_i h_{i \oplus \sigma (j)}= \bigoplus _{i=0}^{k-1}v^{(\sigma )}_i H^{(\sigma )}[i,j]. $$

Using the definition of the permutation, and since it is a one-to-one mapping, we can rearrange the XOR order of the terms on the left-hand side and obtain

$$ \bigoplus _{i=0}^{k-1}v_{\sigma (i)} h_{\sigma (i) \oplus \sigma (j)}= \bigoplus _{i=0}^{k-1}v_{\sigma (i)} h_{\sigma (i \oplus j)}. $$

Therefore, we need the permutation to be linear with respect to XOR: \(\sigma (i \oplus j) = \sigma (i) \oplus \sigma (j)\). This proves our next lemma.

Lemma 2

For any linear permutation \(\sigma \) (w.r.t. XOR), the two Hadamard matrices H and \(H^{(\sigma )}\) are equivalent.

We note that the permutations in Lemmas 1 and 2 are disjoint, except for the identity permutation. This is because a linear permutation \(\sigma \) always maps the identity to itself: \(\sigma (0) = 0\). Thus, for any linear permutation, the first entry remains unchanged, while choosing another row of H as the first row always changes the first entry.

With these two lemmas, we can now partition the family of Hadamard matrices into equivalence classes. For Lemma 1, we can easily see that the number of permutations is equal to the order of the Hadamard matrix. For Lemma 2, counting is less trivial, hence the following lemma.

Lemma 3

Given a set of \(2^s\) nonzero elements, \(S = \{ \alpha _0, \alpha _1, ..., \alpha _{2^s-1} \}\), there are \(\prod ^{s-1}_{i=0} (2^s - 2^i)\) linear permutations w.r.t. XOR operation.

Theorem 3

Given a set of \(2^s\) nonzero elements, \(S = \{ \alpha _0, \alpha _1, ..., \alpha _{2^s-1} \}\), there are \(\frac{(2^s-1)!}{\prod ^{s-1}_{i=0} (2^s - 2^i)}\) equivalence classes of Hadamard matrices of order \(2^s\) defined by the set of elements S.

For convenience, we call the permutations in Lemmas 1 and 2 the \(\mathcal {H}\)-permutations. The \(\mathcal {H}\)-permutations can be described as a sequence of the following two types of permutations on the indices of the entries:

  1. choose \(\alpha \in \{0,1,...,2^s-1\}\) and define \(\sigma (i)=i \oplus \alpha \) for all \(i=0,1,...,2^s-1\), and

  2. fix \(\sigma (0)=0\); then, in ascending order of the index i, choose \(\sigma (i)\) freely whenever i is a power of 2, otherwise \(\sigma (i)\) is determined by linearity w.r.t. XOR: \(\sigma (i \oplus j)=\sigma (i) \oplus \sigma (j)\).

We remark that, given a set of 4 nonzero elements, Theorem 3 shows that there is only 1 equivalence class of Hadamard matrices. This implies that there is no point in permuting the entries of a \(4 \times 4\) Hadamard matrix in the hope of finding an MDS matrix: if one permutation is not MDS, none of them is.

With the knowledge of equivalence classes of Hadamard matrices, what we need is an algorithm that picks one representative from each equivalence class and checks if it is MDS. The idea is to exhaust all non-\(\mathcal {H}\)-permutations by selecting the entries in ascending index order. Since the entries in the first column of a Hadamard matrix are distinct (otherwise the matrix is not MDS), it is sufficient to check the matrices whose first entry (index 0) is the smallest element. This is because any other matrix with the first entry set to some other element is in the same equivalence class as some matrix \(H^{(\alpha )}\) where the first entry of the \((\alpha + 1)\)-th row is the smallest element. For indexes that are powers of 2, we select the smallest element from the remaining set, while for the other entries we can pick any element from the remaining set.

For \(8 \times 8\) Hadamard matrices, for example, the first three entries \(\alpha _0\), \(\alpha _1\) and \(\alpha _2\) are fixed to be the three smallest elements in ascending order. Next, by Lemma 2, \(\alpha _3\) would be defined by \(\alpha _1\) and \(\alpha _2\) in order to preserve the linear property; thus, to “destroy” the linear property and obtain matrices from different equivalence classes, we pick an element from the remaining set in ascending order as the fourth entry \(\alpha _3\). After that, \(\alpha _4\) is selected to be the smallest element among the remaining 4 elements, and we permute the remaining 3 elements to be \(\alpha _5\), \(\alpha _6\) and \(\alpha _7\) respectively. For each of these arrangements of entries, we check if the matrix is MDS (the algorithm can be found in [31]). We terminate the algorithm prematurely once an MDS matrix is found; otherwise, we conclude that the given set of elements does not generate an MDS matrix.

It is clear that arranging the entries in this manner will not yield two Hadamard matrices from the same equivalence class. But one may wonder if it actually does exhaust all the equivalence classes. The answer is yes: Theorem 3 shows that there is a total of 30 equivalence classes for \(8 \times 8\) Hadamard matrices. On the other hand, in the algorithm described above, we have 5 choices for \(\alpha _3\) and \(3! = 6\) permutations of the remaining 3 elements for \(\alpha _5\), \(\alpha _6\) and \(\alpha _7\). Thus, there are exactly \(5 \times 6 = 30\) Hadamard matrices that we have to check.
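The selection procedure above can be sketched as follows (our own sketch; the element set is illustrative, and over a real field the \(\alpha _i\) would be field elements):

```python
# Enumerate one candidate first row per equivalence class for 8x8
# Hadamard matrices: fix the three smallest elements, pick alpha_3
# freely among the rest, force alpha_4 to be the smallest remaining
# element, and permute the last three.
from itertools import permutations

def representatives(elems):
    e = sorted(elems)
    a0, a1, a2 = e[0], e[1], e[2]
    for a3 in e[3:]:
        rem = [x for x in e[3:] if x != a3]
        a4 = rem[0]                      # smallest remaining element
        for tail in permutations(rem[1:]):
            yield (a0, a1, a2, a3, a4) + tail

rows = list(representatives([1, 2, 3, 4, 5, 6, 7, 8]))
assert len(rows) == 30                   # matches Theorem 3: 7!/168 = 30
assert len(set(rows)) == 30              # all candidates are distinct
```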

4.2 Equivalence Classes of Involutory Hadamard-Cauchy Matrices

Despite having a new technique to reduce the search space, the computational cost of checking the MDS property is still too large when the order of the Hadamard matrix is greater than 8. Therefore, we use the Hadamard-Cauchy construction for orders 16 and 32. Thanks to the Cauchy property, we are ensured that the matrix will be MDS. Hence, the only problem that remains is the huge search space of possible Hadamard-Cauchy matrices. To prevent confusion with Hadamard matrices, we denote Hadamard-Cauchy matrices by K.

First, we restate in Algorithm 1 the technique from [20] to build involutory MDS matrices, with some modifications to the notation for the variables. Although it is not explicitly stated, we can infer from Lemmas 6 and 7 and Theorem 4 of [20] that all Hadamard-Cauchy matrices can be expressed as an output of Algorithm 1.

[Algorithm 1]

Similarly to Hadamard matrices, we denote a Hadamard-Cauchy matrix by its first row of elements as \(hc(h_0,h_1,...,h_{2^s-1})\), with \(h_i=K[0,i]\). To summarize the construction of a Hadamard-Cauchy matrix of order \(2^s\) given in Algorithm 1: we pick an ordered set of \(s+1\) linearly independent elements, which we call the basis. We use the first s elements to span an ordered set S of \(2^s\) elements, and add the last element z to all the elements of S. Next, we take the inverse of each of the elements in this new set and obtain the first row of the Hadamard-Cauchy matrix. Lastly, we generate the matrix from the first row in the same manner as for a Hadamard matrix.

For example, for an \(8 \times 8\) Hadamard-Cauchy matrix over \(\mathrm {GF}(2^4)/\mathsf {0x13}\), say we choose \(x_1 = 1, x_2 = 2, x_4 = 4\); we generate the set \(S = \{ 0, 1, 2, 3, 4, 5, 6, 7 \}\). Choosing \(z = 8\) and taking the inverses of the elements in the new set, we get the Hadamard-Cauchy matrix \(K = hc(15, 2, 12, 5, 10, 4, 3, 8)\). To make it an involution, we would multiply each element by the inverse of the sum of the elements. However, in this instance the sum is already 1, hence K is already an involutory MDS matrix.
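This example can be reproduced in a few lines of code (a sketch under our own helper functions gmul and ginv); it also verifies that K squares to the identity, i.e. that K is indeed an involution:

```python
# Reproduce the construction over GF(2^4)/0x13 with basis
# x1 = 1, x2 = 2, x4 = 4 and z = 8.

def gmul(a, b, poly=0x13, n=4):
    """Multiply a and b in GF(2^n) modulo the given polynomial."""
    r = 0
    for _ in range(n):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << n):
            a ^= poly
    return r

def ginv(a, poly=0x13, n=4):
    """Brute-force inverse in GF(2^n)."""
    return next(b for b in range(1, 1 << n) if gmul(a, b, poly, n) == 1)

basis, z = [1, 2, 4], 8
S = [0] * 8                       # span of the first s basis elements
for i in range(8):
    for bit in range(3):
        if i >> bit & 1:
            S[i] ^= basis[bit]
first_row = [ginv(s ^ z) for s in S]
assert first_row == [15, 2, 12, 5, 10, 4, 3, 8]

total = 0
for h in first_row:
    total ^= h
assert total == 1                 # the sum is 1, so K is an involution

# double-check: the full matrix K[i][j] = h_{i XOR j} squares to I
K = [[first_row[i ^ j] for j in range(8)] for i in range(8)]
for i in range(8):
    for j in range(8):
        acc = 0
        for m in range(8):
            acc ^= gmul(K[i][m], K[m][j])
        assert acc == (1 if i == j else 0)
```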

One of the main differences between the Hadamard and Hadamard-Cauchy matrices is the choice of entries. While we can choose all the entries of a Hadamard matrix to be lightweight and permute them in search of an MDS candidate, the construction of a Hadamard-Cauchy matrix makes it nontrivial to control its entries efficiently. Although in [20] the authors proposed a backward reconstruction algorithm that finds a Hadamard-Cauchy matrix with some pre-decided lightweight entries, the number of entries that can be decided beforehand is very limited. For example, for a Hadamard-Cauchy matrix of order 16, the algorithm can only choose 5 lightweight entries; the weight of the other 11 entries is not controlled. The most direct way to find a lightweight Hadamard-Cauchy matrix is to apply Algorithm 1 repeatedly for all possible bases. We now introduce new equivalence classes that will help us to exhaust all possible Hadamard-Cauchy matrices with much less memory and far fewer iterations.

Definition 8

Let \(K_1\) and \(K_2\) be two Hadamard-Cauchy matrices. We say they are related, \(K_1 \sim _{HC} K_2\), if one can be transformed into the other by one or both of the following operations on the first row of entries:

  1. multiplication by a nonzero scalar, and

  2. an \(\mathcal {H}\)-permutation of the entries.

The crucial property of the construction is the linear independence of the elements in the basis, which is not affected by multiplication by a nonzero scalar. Hence, we can convert any Hadamard-Cauchy matrix into an involutory Hadamard-Cauchy matrix by multiplying it by the inverse of the sum of the first row, and vice versa. However, permuting the positions of the entries is the tricky part. Indeed, for Hadamard-Cauchy matrices of order 8 or higher, some permutations destroy the Cauchy property, causing the matrix to be non-MDS. Using our previous \(8 \times 8\) example, suppose we swap the first two entries to obtain \(K' = hc(2, 15, 12, 5, 10, 4, 3, 8)\); it can be verified that \(K'\) is not MDS. To understand why, we work backwards to find the basis corresponding to \(K'\). Taking the inverses of the entries, we have \(\{ 9, 8, 10, 11, 12, 13, 14, 15 \}\). However, there is no basis that satisfies the 8 linear equations for the entries, so it is an invalid construction of a Hadamard-Cauchy matrix. Therefore, we consider applying the \(\mathcal {H}\)-permutations to Hadamard-Cauchy matrices. Since a Hadamard-Cauchy matrix is also a Hadamard matrix, an \(\mathcal {H}\)-permutation preserves its branch number, so it remains MDS. We are thus left to show that a Hadamard-Cauchy matrix that undergoes an \(\mathcal {H}\)-permutation is still a Hadamard-Cauchy matrix.
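One concrete way to see the failure (our own reformulation of the basis condition, not the paper's algorithm): in a valid construction, \(inv(h_i) = z \oplus S[i]\) with \(S[i]\) linear in the bits of i, so the map \(i \mapsto inv(h_i) \oplus inv(h_0)\) must be XOR-linear in the index. It is for K, but not for \(K'\):

```python
# Check whether a first row can come from Algorithm 1: the index map
# g(i) = inv(h_i) XOR inv(h_0) must be XOR-linear in i.

def gmul(a, b, poly=0x13, n=4):
    r = 0
    for _ in range(n):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << n):
            a ^= poly
    return r

def ginv(a, poly=0x13, n=4):
    return next(b for b in range(1, 1 << n) if gmul(a, b, poly, n) == 1)

def index_map_is_linear(row):
    g = [ginv(h) ^ ginv(row[0]) for h in row]
    k = len(row)
    return all(g[i ^ j] == g[i] ^ g[j] for i in range(k) for j in range(k))

assert index_map_is_linear([15, 2, 12, 5, 10, 4, 3, 8])        # K: valid
assert not index_map_is_linear([2, 15, 12, 5, 10, 4, 3, 8])    # K': invalid
```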

Lemma 4

Given a \(2^s \times 2^s\) involutory Hadamard-Cauchy matrix K, there are \(2^s \cdot \prod _{i=0}^{s-1}(2^s-2^i)\) involutory Hadamard-Cauchy matrices that are related to K by the \(\mathcal {H}\)-permutations of the entries of the first row.

With that, we can define our equivalence classes of involutory Hadamard-Cauchy matrices.

Definition 9

An equivalence class of involutory Hadamard-Cauchy matrices is a set of Hadamard-Cauchy matrices satisfying the equivalence relation \(\sim _{HC}\).

In order to count the number of equivalence classes of involutory Hadamard-Cauchy matrices, we use the same technique as for proving Theorem 3. To do that, we need to know the total number of Hadamard-Cauchy matrices that can be constructed by Algorithm 1 for a given finite field.

Lemma 5

Given two natural numbers s and r, based on Algorithm 1, there are \(\prod _{i=0}^{s}(2^r-2^i)\) Hadamard-Cauchy matrices of size \(2^s \times 2^s\) over \(\mathrm {GF}(2^r)\).

Theorem 4

Given two positive integers s and r, there are \(\prod _{i=0}^{s-1} \frac{2^{r-1}-2^i}{2^s-2^i}\) equivalence classes of involutory Hadamard-Cauchy matrices of order \(2^s\) over \(\mathrm {GF}(2^r)\).
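As a consistency check (ours, not from the paper), Theorem 4 agrees with Lemmas 4 and 5 if each equivalence class contains the \(2^r - 1\) scalar multiples of each of the \(2^s \cdot \prod _{i=0}^{s-1}(2^s-2^i)\) \(\mathcal {H}\)-permuted involutory matrices:

```python
# Verify: (total matrices, Lemma 5) = (classes, Theorem 4) x (class size).

def lemma5(s, r):                      # total Hadamard-Cauchy matrices
    out = 1
    for i in range(s + 1):
        out *= 2 ** r - 2 ** i
    return out

def class_size(s, r):
    out = (2 ** r - 1) * 2 ** s        # scalars times the 2^s index shifts
    for i in range(s):
        out *= 2 ** s - 2 ** i         # linear permutations (Lemma 3)
    return out

def theorem4(s, r):                    # number of equivalence classes
    num = den = 1
    for i in range(s):
        num *= 2 ** (r - 1) - 2 ** i
        den *= 2 ** s - 2 ** i
    return num // den

for s, r in [(2, 4), (3, 4), (3, 8), (4, 8), (5, 8)]:
    assert lemma5(s, r) == theorem4(s, r) * class_size(s, r)
assert theorem4(3, 4) == 1             # one class of order 8 over GF(2^4)
```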

In [15], the authors introduced the notion of compact Cauchy matrices which are defined as Cauchy matrices with exactly \(2^s\) distinct elements. These matrices seem to include Cauchy matrices beyond the class of Hadamard-Cauchy matrices. However, it turns out that the equivalence classes of involutory Hadamard-Cauchy matrices can be extended to compact Cauchy matrices.

Corollary 1

Any compact Cauchy matrix can be generated from some equivalence class of involutory Hadamard-Cauchy matrices.

Note that since a permutation of the elements in S and \(z+S\) only results in a rearrangement of the entries of the compact Cauchy matrix, its XOR count is the same as that of the Hadamard-Cauchy matrix with the same set of entries.

5 Searching for Involutory MDS and Non-involutory MDS Matrices

Due to space constraints, we have put the new methods we have designed to look for the lightest possible involutory MDS and non-involutory MDS matrices in [31].

More precisely, regarding involutory MDS matrices, using the properties and equivalence classes given in Sects. 3 and 4 for several matrix constructions, we have derived algorithms to search for the most lightweight candidates. First, we point out that the circulant construction cannot lead to involutory MDS matrices; then we focus on the case of matrices of small dimension using the Hadamard construction. For bigger dimensions, we add the Cauchy property to the Hadamard one in order to guarantee that the matrix will be MDS. We recall that, similarly to a circulant matrix, a Hadamard matrix has the interesting property that each row is a permutation of the first row, which allows the multiplication circuit to be reused to save implementation costs.

Regarding non-involutory MDS matrices, we have extended the involutory MDS matrix search to include non-involutory candidates. For the Hadamard construction, we removed the constraint that the sum of the first row elements must be equal to 1. For the Hadamard-Cauchy construction, we multiply each equivalence class by a nonzero scalar value. We note that the disadvantage of non-involutory MDS matrices is that their inverse may have a high computation cost. But if the inverse is not required (for example in popular constructions such as a Feistel network or the CTR encryption mode), non-involutory matrices might be lighter than involutory ones.

6 Results

We first emphasize that although in [15, 20] the authors proposed methods to construct lightweight matrices, the choice of the entries is limited, as mentioned in Sect. 4.2. This is due to the nature of the Cauchy matrices, where the inverses of the elements are used during the construction, which makes it non-trivial to search for lightweight Cauchy matrices. However, using the concept of equivalence classes, we can exhaust all the matrices and pick the lightest one.

We applied the algorithms of Sect. 5 to construct lightweight MDS involutions over \(\mathrm {GF}(2^8)\). We list them in the upper half of Table 2, and we can see that they are much lighter than known MDS involutions like the KHAZAD and ANUBIS matrices, previous Hadamard-Cauchy matrices [6, 20] and compact Cauchy matrices [15]. In the lower half of Table 2, we list the \(\mathrm {GF}(2^8)\) MDS matrices we found using our algorithms and show that they are lighter than known MDS matrices like the AES, WHIRLPOOL and WHIRLWIND matrices [7, 8, 17]. We also compare with the 14 lightweight candidate matrices \(C_0\) to \(C_{13}\) for the WHIRLPOOL hash function suggested during the NESSIE workshop [30, Sect. 6]. Table 2 compares our matrices with the ones explicitly provided in the previous articles. Recently, Gupta et al. [21] constructed some circulant matrices that are lightweight both for themselves and for their inverses. However, we do not compare with them in our table because their approach minimizes the number of XORs, look-up tables and temporary variables, which might be optimal for software but not for hardware implementations based purely on XOR count.

By Theorem 2 in Sect. 2, we only need to apply the algorithms from Sect. 5 for half the representations of \(\mathrm {GF}(2^8)\) when searching for optimal lightweight matrices. And as predicted by the discussion after Theorem 1, the lightweight matrices we found in Table 2 do come from \(\mathrm {GF}(2^8)\) representations with higher standard deviations.

We provide in the first column of Table 2 the type of the matrices. They can be circulant, Hadamard or Hadamard-Cauchy. The subfield-Hadamard construction is based on the method of [26, Sect. 7.2], which we explain here. Consider the MDS involution \(M = had(\mathsf {0x1},\mathsf {0x4},\mathsf {0x9},\mathsf {0xd})\) over \(\mathrm {GF}(2^4)/\mathsf {0x13}\) in the first row of Table 2. Using the method of [26, Sect. 7.2], we can extend it to an MDS involution over \(\mathrm {GF}(2^8)\) by using two parallel copies of M. The matrix is formed by writing each input byte \(x_j\) as a concatenation of two nibbles \(x_j=(x_j^L||x_j^R)\). Then the MDS multiplication is computed on each half, \((y_1^L,y_2^L,y_3^L,y_4^L) = M \cdot (x_1^L,x_2^L,x_3^L,x_4^L)\) and \((y_1^R,y_2^R,y_3^R,y_4^R) = M \cdot (x_1^R,x_2^R,x_3^R,x_4^R)\), over \(\mathrm {GF}(2^4)\). The results are concatenated to form four output bytes \((y_1,y_2,y_3,y_4)\), where \(y_j = (y_j^L||y_j^R)\).
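The subfield construction is easy to express in code. The sketch below (our own illustration) builds \(M = had(\mathsf {0x1},\mathsf {0x4},\mathsf {0x9},\mathsf {0xd})\) over \(\mathrm {GF}(2^4)/\mathsf {0x13}\), applies it nibble-wise to four bytes, and checks that doing so twice returns the input, i.e. the byte-level map is an involution:

```python
# Subfield construction: split each byte into two nibbles, apply the
# GF(2^4) matrix to the left and right halves independently, then
# concatenate the results.

def gmul(a, b, poly=0x13, n=4):
    r = 0
    for _ in range(n):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << n):
            a ^= poly
    return r

ROW = [0x1, 0x4, 0x9, 0xd]
M = [[ROW[i ^ j] for j in range(4)] for i in range(4)]  # Hadamard matrix

def mat_apply(M, x):
    out = []
    for row in M:
        acc = 0
        for m, v in zip(row, x):
            acc ^= gmul(m, v)
        out.append(acc)
    return out

def subfield_apply(M, xs):
    hi = mat_apply(M, [x >> 4 for x in xs])       # left nibbles
    lo = mat_apply(M, [x & 0xF for x in xs])      # right nibbles
    return [h << 4 | l for h, l in zip(hi, lo)]

xs = [0x12, 0xAB, 0x34, 0xCD]
assert subfield_apply(M, subfield_apply(M, xs)) == xs   # involution on bytes
```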

We could have concatenated different submatrices, and this is done in the WHIRLWIND hash function [7], where the authors concatenated four MDS submatrices over \(\mathrm {GF}(2^4)\) to form \((M_0 | M_1 | M_1 | M_0)\), an MDS matrix over \(\mathrm {GF}(2^{16})\). The submatrices are the non-involutory Hadamard matrices \(M_0 = had(\mathsf {0x5}, \mathsf {0x4}, \mathsf {0xa}, \mathsf {0x6}, \mathsf {0x2}, \mathsf {0xd}, \mathsf {0x8}, \mathsf {0x3})\) and \(M_1 = had(\mathsf {0x5}, \mathsf {0xe}, \mathsf {0x4}, \mathsf {0x7}, \mathsf {0x1}, \mathsf {0x3}, \mathsf {0xf}, \mathsf {0x8})\) defined over \(\mathrm {GF}(2^4)/\mathsf {0x13}\). For a fair comparison with our \(\mathrm {GF}(2^8)\) matrices in Table 2, we consider the corresponding WHIRLWIND-like matrix \((M_0 | M_1)\) over \(\mathrm {GF}(2^8)\), which takes half the resources of the original WHIRLWIND matrix and is also MDS.

The second column of the result tables gives the finite field over which the matrix is defined, while the third column displays the first row of the matrix, where the entries are bytes written in hexadecimal notation. The fourth column gives the XOR count to implement the first row of the \(n \times n\) matrix. Because all subsequent rows are just permutations of the first row, the XOR count to implement the whole matrix is just n times this number. For example, to compute the XOR count for implementing \(had(\mathsf {0x1}, \mathsf {0x4}, \mathsf {0x9}, \mathsf {0xd})\) over \(\mathrm {GF}(2^4)/\mathsf {0x13}\), we consider the expression for the first row of the matrix multiplication, \(\mathsf {0x1} \cdot x_1 \oplus \mathsf {0x4} \cdot x_2 \oplus \mathsf {0x9} \cdot x_3 \oplus \mathsf {0xd} \cdot x_4\). From Table 5 of Appendix B, the XOR counts of multiplication by \(\mathsf {0x1}, \mathsf {0x4}, \mathsf {0x9}\) and \(\mathsf {0xd}\) are 0, 2, 1 and 3 respectively, which gives us a cost of \((0+2+1+3) + 3 \times 4 = 18\) XORs to implement one row of the matrix (the summand \(3 \times 4\) accounts for the three XORs summing the four nibbles). For the subfield construction over \(\mathrm {GF}(2^8)\), we need two copies of the matrix, giving a cost of \(18 \times 2 = 36\) XORs to implement one row.
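The per-constant XOR counts can be recomputed from scratch as the number of ones in the binary multiplication matrix minus one per output bit (our sketch of the metric, not the paper's code; it reproduces the counts 0, 2, 1, 3 and the row cost of 18):

```python
# XOR count of multiplication by a constant c in GF(2^n): build the
# n x n binary multiplication matrix column by column (column i is
# c * x^i) and count the ones per output bit, minus one each.

def gmul(a, b, poly=0x13, n=4):
    r = 0
    for _ in range(n):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << n):
            a ^= poly
    return r

def xor_count(c, poly=0x13, n=4):
    cols = [gmul(c, 1 << i, poly, n) for i in range(n)]
    total = 0
    for bit in range(n):
        ones = sum(col >> bit & 1 for col in cols)
        total += ones - 1
    return total

row = [0x1, 0x4, 0x9, 0xd]
assert [xor_count(c) for c in row] == [0, 2, 1, 3]
row_cost = sum(xor_count(c) for c in row) + 3 * 4  # plus 3 XORs of 4-bit words
assert row_cost == 18
```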

We also applied the algorithms that can be found in [31] to find lightweight MDS involution and non-involution matrices of order 4 and 8 over \(\mathrm {GF}(2^4)\); these matrices are listed in Table 1. By the structure of a Hadamard matrix, the entries of the first row of an MDS Hadamard matrix must be pairwise distinct. Therefore, there does not exist an MDS Hadamard matrix of order larger than 8 over \(\mathrm {GF}(2^4)\). Due to the smaller dimension of the finite field, the XOR counts of the matrices over \(\mathrm {GF}(2^4)\) are approximately half of those over \(\mathrm {GF}(2^8)\).

The application of our work has already been demonstrated in Joltik, a lightweight and hardware-oriented authenticated encryption scheme that uses our lightweight MDS involution matrix of order 4 over \(\mathrm {GF}(2^4)\), with an XOR count as low as 18. On the other hand, the diffusion matrix from Prøst [25] was designed with the goal of minimising the number of XOR operations required to implement it. By Theorem 2 in Sect. 2, we observe that these two matrices are in fact counterparts of each other in their respective finite fields. Thus, they are essentially the same lightest matrix according to our metric.

Table 1. Comparison of MDS (involution) matrices over \(\mathrm {GF}(2^4)\)
Table 2. Comparison of MDS Matrices over \(\mathrm {GF}(2^8)\). The upper table compares the involutory MDS matrices, while the lower table compares the non-involutory MDS matrices (the factor 2 appearing in some of the XOR counts is due to the fact that we have to implement two copies of the matrices)


With our work, we can now see that one can use involutory MDS matrices for almost the same price as non-involutory ones. For example, in the upper half of Table 2, the previous \(4 \times 4\) MDS involution from [20] is about 3 times heavier than the AES matrix; but in this paper, we have used an improved search technique to find an MDS involution lighter than the AES and ANUBIS matrices. Similarly, we have found \(8 \times 8\) MDS involutions which are much lighter than the KHAZAD involution matrix, and even lighter than lightweight non-involutory MDS matrices like the WHIRLPOOL matrix. Thus, our method will be useful for future constructions of lightweight ciphers based on involutory components, like the ANUBIS, KHAZAD, ICEBERG and PRINCE ciphers.