Wednesday, July 20, 2016

Full RNS Implementation of Fan Vercauteren scheme

In a paper (https://eprint.iacr.org/2016/510), Jean-Claude Bajard, Julien Eynard, Anwar Hasan and Vincent Zucca explain how to perform the division and rounding steps occurring during decryption and homomorphic multiplication procedures of scale invariant schemes such as FV and YASHE without using any multi-precision library. 

Such schemes, based on RLWE, usually lie in a power-of-two cyclotomic ring $\mathcal{R} = \mathbb{Z}[X]/(X^n+1)$ for efficiency reasons. In FV scheme, a ciphertext $\texttt{c} = (\texttt{c}_0, \texttt{c}_1)$ is a polynomial of degree 1 with coefficients in $\mathcal{R}_q = \mathbb{Z}_q[X]/(X^n+1)$. Each $\texttt{c}_i$ is then represented as a polynomial of degree $n-1$ with coefficients in $\mathbb{Z}_q$, $q$ is usually chosen as a product of several small prime moduli $q_i$. This allows to represent the integer coefficients of the $\texttt{c}_i$'s, through the Chinese Reminder Theorem (CRT), by their residues modulo the smaller $q_i$'s. However the use of the Residue Number System (a.k.a non-positional CRT representation) remains hardly compatible with procedures involving coefficient-wise division and rounding which require costly comparisons to be performed. Thus, until now, one had to invert the CRT representation and then use multi-precision arithmetic to perform these steps. 

The techniques presented in this paper allow to perform these steps directly in RNS permiting to avoid a costly conversion between RNS and positional system and the use of multi-precision library. Compared with the FV-NFLlib implementation (https://github.com/CryptoExperts/FV-NFLlib) it enables speed-ups from a factor 5 to 20 for decryption and from 2 to 4 for homomorphic multiplication. It may also lead to future implementations on highly parallel accelerator cards such as GPU. This paper will be presented at SAC (http://www.engr.mun.ca/~sac2016/), in St John's Canada, in August, this blog post briefly presents it.


Decryption
A plaintext $\texttt{m}$ is an element of $\mathcal{R}_t = \mathbb{Z}_t[X]/(X^n+1)$, after being encrypted it becomes a ciphertext $\texttt{c} = (\texttt{c}_0,\texttt{c}_1)\in (\mathcal{R}_q)^2$  with $t < q$. To recover the message the decryption procedure consists in computing $[\lfloor t/q(\texttt{c}_0+\texttt{c}_1 \texttt{s}) \rceil]_t$ with $\texttt{s}$ the secret key.

There are two difficulties to perform this computation in RNS. First $\texttt{c}'=t(\texttt{c}_0+\texttt{c}_1 \texttt{s})$ lies in $\mathcal{R}_q$ and is represented in the "native" RNS base $\mathcal{B}_q = \{q_1,....,q_k\}$ with $q = q_1...q_k$. Thus in base $\mathcal{B}_q$, it is impossible to perform the division by $q$. Second computing a rounding is unnatural in a non-positional systeme like RNS.

To overcome the first difficulty we need to convert the elements in base $\mathcal{B}_q$ to another base coprime to $\mathcal{B}_q$.  As we only need the result modulo $t$ we can use the single moduli base $\{ t \}$, as long as $t$ and $q$ are coprime (usually $t$ is a power of two while $q$ is odd). To achieve this conversion we use a fast base conversion which adds a multiple of $q$ called $q$-overflow. After the division by $q$ in the new base, $q$ is canceled and it only remains an error term $\tau$.

To solve the second difficulty, we choose to compute a flooring, using the equality $\lfloor \frac{x}{q}\rfloor = \frac{x-|x|_q}{q}$ instead of the rounding. The computation of the residue modulo $q$ is achieved by using a Montgomery reduction, which fits well in RNS. Of course, as flooring and rounding are not the same, an error may occur. In such case we simply consider this error as part of the previous one $\tau$.

At this step we need to cancel, the error $\tau$ due to the conversion and the flooring but also the inherent noise in the ciphertext to recover the message. To do this, we add an other moduli $\gamma$, coprime to $q$ to the base $\{t\}$, we multiply $\texttt{c}'$ by $\gamma$ in base $\mathcal{B}_q$ and we run the computations explained above in base $\{\gamma, t\}$ instead of $\{t\}$. Then, for a well-suited $\gamma$, we are able to cancel the noise in the residue modulo $t$ by subtracting the centered residue modulo $\gamma$ to it. Finally we are able to recover $[\texttt{m}]_t$.

Homomorphic multiplication

To perform an homomorphic multiplication of two ciphertexts $\texttt{c}^{(1)}$ and $\texttt{c}^{(2)}$, you first have to scale the product $\texttt{c}^{(1)}\times\texttt{c}^{(2)}$ (which is a degree two polynomial with coefficients in $\mathcal{R}_q$) by $t/q$ and to round the result coefficient-wise as for the decryption. Then one has to relinearize this degree two polynomial to get back a degree one polynomial. But, to limit the noise growth during this step, it requires to decompose the coefficients of an element of $\mathcal{R}_q$ in some radix base $\omega$. Sadly, this is precisely the kind of thing we cannot do in a non-positional number representation.

One should understand that the product $\texttt{c}^{(1)}\times\texttt{c}^{(2)}$ should be lifted in $\mathbb{Z}$ and because we scale down by $t/q$ the coefficients of the product fits back in $\mathcal{R}_q$. Instead of making the product computation in $\mathbb{Z}$, we use a second RNS base $\mathcal{B}'$ coprime to $\mathcal{B}_q$ of same size to not loose any information.

Then we use a fast conversion to convert $\texttt{c}^{(1)}$ and $\texttt{c}^{(2)}$ from $\mathcal{B}_q$ to $\mathcal{B}'$. We manage to reduce the error in base $\mathcal{B}'$ (due to the $q$-overflow) in order to reduce its growth during the product and then we perform the product in both bases.

Next, we use a fast conversion to convert the residues of the product in base $\mathcal{B}_q$ to base $\mathcal{B}'$. Then, as for decryption, we perform a flooring using Montgomery reduction instead of a rounding and we get $\lfloor t/q (\texttt{c}^{(1)}\times\texttt{c}^{(2)})\rfloor$ in base $\mathcal{B}'$. We consider the error due to the different conversions as part of the noise.

For the next step, we convert the result back in base $\mathcal{B}_q$, but this time using an exact base conversion (more costly), because we cannot tolerate overflows from the base $\mathcal{B}'$. To overcome the last difficulty we have managed to decompose the elements belonging to $\mathcal{R}_q$ in a very similar way from the original, however we can perform this one in RNS. The end of the homomorphic multiplication procedure remains the same, we just replace the original decomposition function by our own.  

Impact on the noise
Of course, using all this tricks add some errors which become part of the noise and thus make it bigger. One could wonder if, and how, our multiplicative depth has been affected. Indeed our methods modify the different bounds on the noise for the homomorphic multiplication and also the decryption bound. Nevertheless, by running an analysis with our new bounds, we have shown that our multiplicative depth remains unchanged from the one of the original scheme, for all the sets of parameters but one.

Thursday, July 7, 2016

WHEAT 2016: The Pre-history of Lattice-based Cryptanalysis

On the second day of the Workshop HEAT, Antoine Joux gave an invited talk on the history of cryptanalytic applications of  lattice-basis reduction algorithms.

Lattice is a discrete subgroup of $\mathbb{R}^n$. Equivalently, it is a span of integer-linear combinations of $\mathbb{R}$-linearly independent vectors in $\mathbb{R}^n$. Finding a short basis in $\mathbb{R}^2$ is relatively easy and Gauss' lattice-basis reduction algorithm is effective for this purpose. As the speaker emphasised, things become more complicated when we move to higher dimensions. He then went on to discuss the details of the famous LLL algorithm, invented by Lenstra, Lenstra and Lovász in 1982, for lattice-basis reduction in arbitrary dimensions. LLL is a polynomial-time algorithm but returns a short vector within an exponential approximation factor of the length of the shortest vector in the lattice. This algorithm, which performs well in practice, may be viewed as a combination of the Gauss lattice reduction and the Gram-Schmidt orthogonalisation methods.

An early application of lattice-basis reduction is the cryptanalysis of knapsack-based cryptosystems. The knapsack problem, a.k.a. the subset-sum problem, is given integers $a_1, \ldots, a_n$, and $S$, find $\epsilon_1,\ldots,\epsilon_n \in \{0,1\}$ such that 
\[S = \overset{n}{\underset{i=1}{\sum}} \epsilon_i \cdot a_ i.\]
This problem is NP-hard though some cases are easy, and it served as a basis for cryptosystems such as Merkle-Hellman cryptosystem. The basic idea is to hide an easy knapsack in a hard-looking one. However, at CRYPTO 1982, Shamir broke this cryptosystem using lattice-basis reduction. This attacks works for super-increasing knapsacks, where we have $a_i > \sum_{j=1}^{i-1} a_j$.  Other works starting with that of Lagarias-Odlyzko in 1985 lead to improved attacks on low-density knapsacks.

The speaker also briefly spoke about the application of lattie-basis reduction to the problems of

  •   reconstructing small linear dependencies: let  $(x_1,\ldots,x_n)$ be a sequence of real numbers, we need to find small coefficients $v_i$ such that $\sum v_i\cdot x_i =0$,
  • constructing polynomials for use in the Number Field Sieve for solving the Discrete Logarithm Problem, and
  • finding small roots of univariate and bivariate integer polynomials modulo composite numbers: Coppersmith's attack.

Tuesday, July 5, 2016

WHEAT Workshop 2016, Paris

The second HEAT workshop on Fully Homomorphic Encryption (otherwise known as WHEAT) kicked off this morning in (disappointingly cloudy) Paris.

The first talk of the day was Martin Albrecht's invited talk on solving short secret LWE instances. This is joint work with Rachel Player and Sam Scott. He provided a clear summary of the state of the art in terms of algorithms available and thus set the grounds for the rest of the day. Attacks using parameter choices inspired by HElib and the LP model were discussed and where possible, estimate running times were provided. A partial conclusion was the latter should not be assumed. For a more detailed read, see https://eprint.iacr.org/2015/046.pdf.

Also discussed was RLWE security in the FHE case, a talk given by Guillaume Bonnoron. This is joint work with Caroline Fontaine. They present an attack on specific FHE parameter choices and use the FV scheme for the reason that they wish to avoid doing a Modulus Switch Operation (see for example https://eprint.iacr.org/2015/889.pdf for a description). The conclusion of this talk is that their attack does work, but FHE and RLWE are not broken, because it does not scale up. A bigger parameter $n$ leads to bigger errors. Whilst they took one day to complete the attack versus a previous 3.5 days (https://eprint.iacr.org/2015/176.pdf), this talk's final conclusion was that more cryptanalysis is needed.

The second invited talk in the afternoon was Leo Ducas', joint work with Martin Albrecht and Shi Bai. This discussed the subfield lattice attack on "over stretched NTRU assumptions"; for the full technical read, see https://eprint.iacr.org/2016/127.pdf. This attack is based on using a subfield of the field we are working in - they assume the latter is a power of two cyclotomic, but claim it can be done using any field which is Galois. We map down our RLWE instances using the (partial) norm map in order to solve a (hopefully) easier problem in the subfield. Given the right conditions, we can then map the solution we find back up and recover a short enough solution to carry through with the attack. The practicality grows with the modulus $q$, hence the "over stretched" condition. Although this does not seem (yet?) to affect the NTRU schemes used in practice, this talk's conclusion was to drop the NTRU assumption altogether.

Monday, July 4, 2016

Fixed Point Arithmetic

In a paper (http://eprint.iacr.org/2016/250) to be presented at SAC (http://www.engr.mun.ca/~sac2016/), in St Johns Canada, in August HEAT researchers Ana Costache, Nigel P. Smart, Srinivas Vivek and Adrian Waller describe how to perform homomorphic operations on fixed point numbers. The work builds upon earlier work of Dowlin et al (http://research.microsoft.com/pubs/258435/ManualHEv2.pdf). In this blog post we present a quick overview of the technique.

To understand the technique you need to appreciate that all known efficient Somewhat Homomorphic Encryption (SHE) schemes are homomorphic over a given ring of polynomials. In particular the set of polymomials modulo both a prime $p$ (the so-called plaintext modulus) and a polynomial $F(X)$. Given such a plaintext space we can embed integers upto a bounded size within the plaintext space in a redundant manner as follows.

Suppose we have two integers, say $x=127$ and $y=78$, we write these as polynomials in {\em balanced base} $B$, for $B-3$. This means we write the integers in base $B$, but we allow plus or minus signs in the representation and the coefficients lie in $[0,\ldots,(B-1)/2]$. Thus we have
$$  127 \equiv 3^5-3^4 -3^3 -3^2+1 =X^5-X^4-X^3-X^2+1$$
and
$$  78 \equiv 3^4-3 = X^4-X$$.
When we add and multiply two such polynomials, and then substitute back in $X=3$, we will obtain the same result as multiplying and adding the original integers.  So for example
\begin{eqnarray*}  (X^5-X^4-X^3&-&X^2+1)
     \cdot
    (X^4-X)  \\
    &=& X^9-X^8-X^7-2 \cdot X^6+X^5+2 \cdot X^4+X^3-X.
\end{eqnarray*}
Notice how both the degree and the coefficient sizes have increased. If the resulting polynomial can still be represented in the plaintext space, i.e. the prime $p$ and the degree of $F(X)$ are large enough, then we can map the polynomial operation to the homomorphic operation. Thus assuming $p$ and $F(X)$ are {\em big enough} we can perform operations on integers; even though the homomorphic encryption scheme only supports operations on polynomials modulo $(p,F(X))$.

To encode a fixed point number we then simply encode it as the ratio of an integer numerator and a power of $B$ denominator; so for example $4.7037 \approx 127/3^3$. We can then apply operations on these fractional numbers, as long as we keep track of which denominator we are working with, by simply operating on the numberators. Thus each encrypted fixed point number is represented by an encrypted integer (the numerator) and a public power of three (the denominator).

Whilst this representation has been known before, and was given in the paper of Dowlin, a key issue was that one needed to know for a given homomorphic operation what the requisite sizes of $p$ and $F(X)$ should be to ensure correctness. It is this latter problem which our paper addresses; we show how to obtian lower bounds on the plaintext size which are needed to support a given homomorphic calculation.  In addition we show that a technique of Dowlin et al, which appears at first sight to be more efficient, turns out to be actually just the same as the above encoding method for fixed point numbers.

Fourier Transforms in the Encrypted Domain


Thales UK has identified Synthetic Aperture Radar (SAR) as a promising candidate for the application of Homomorphic Encryption (HE). It is proposed that the data collected from SAR can be protected throughout its processing up to the formation of image products. SAR data can be processed in many ways to produce different types of images; the simplest of which is the Single Look Complex Image. The main challenge of implementing the Single Look Complex Image algorithm in the encrypted domain is that of implementing the Fourier Transform.

Discrete Fourier Transform

Consider the Discrete Fourier Transform (DFT) for a sequence of complex numbers $x_0, x_1, \ldots, x_{N-1}$:
 
$X_k = \sum_{n=0}^{N-1} x_n \cdot e^{ -\frac{2\pi i k n} {N} } , \quad k=0,\ldots, N-1$
 
This is a linear transform; it requires only the summation of sequence samples and multiplication by scalar values $e^{-\frac{2\pi i k n}{N} }$ (sometimes called twiddle factors). We usually implement this via a Fast Fourier Transform (FFT) which reduces the computational complexity by the use of a few mathematical tricks.
 

Paillier Cryptosystem

The Paillier cryptosystem [1] encrypts plaintext messages $m\in \mathbb{Z}_n$, where $n$ is the product of two large prime numbers, to a ciphertext $c\in \mathbb{Z}^*_{n^2}$. Let $\text{Enc}$ and $\text{Dec}$ denote the encryption and decryption processes respectively. It can be shown that the Paillier cryptosystem satisfies the two following properties:
1)      For two plaintext messages $m_1, m_2\in \mathbb{Z}_n$
                  $\text{Dec}( \text{Enc}(m_1) \cdot \text{Enc}(m_2) \text{ mod } n^2)$ $ = m_1 + m_2 \text{ mod } n$.
2)      For a plaintext $m$ and any constant integer $k$
$\text{Dec} ( \text{Enc}(m)^k \text{ mod } n ) = km \text{ mod } n^2$.

Properties 1 and 2 allow us to add any two plaintexts and multiply any plaintext by an integer constant in the encrypted domain; these operations are sufficient to implement the Fourier Transform (DFT/FFT).
 

The Fourier Transform under Paillier Encryption

There are two major challenges in the implementation of a Fourier Transform under Paillier encryption:   
1)      Paillier encrypts and allows operations only on integer values modulo $n$. We must convert our sequence samples $x_0, x_1, \ldots, x_{N-1}$, and twiddle factors $e^{-\frac{2\pi ikn}{N} }$, into integers modulo $n$. We achieve this by multiplying by two appropriately chosen scale factors; one for the sequence samples and one for the twiddle factors. These scale factors must be dealt with throughout the Fourier Transform.
2)      The homomorphic addition and scalar multiplication operations result in plaintext values modulo $n$. That is, if we wish to sum two plaintext values $m_1, m_2$ in the encrypted domain we must ensure the resulting plaintext is no greater than $n-1$. Otherwise, we can say nothing about the magnitude of $m_1 + m_2$. Since we wish to perform many additions and scalar multiplications throughout the Fourier Transform, we require $n$ to be large enough to cope with the growth from all of these operations.

These two challenges have been discussed in [2]. Challenge 1 means that we need to deal with the quantisation scale factors when performing operations. The analysis of challenge 2 in [2] results in the following lower bound for the word length $n_p$ of the modulus $n$ after radix-2 FFT operations:
$n_p \geq v + (v-2) n_2 + n_1 + 3$

Where:
$v = \log_2 (N)$
$n_1 = b_1-1$, where $b_1$ is the number of bits of precision in the sequence samples
$n_2 = \left\lceil b_2 - \dfrac{v}{2} + \dfrac{1}{2} \right\rceil$ where $b_2$ is the hardware register size in an equivalent plaintext implementation of the Fourier Transform. Equivalently, $n_2$ is the number of bits of precision required in the twiddle factors to ensure the same precision in the result of the Fourier Transform as a standard (in clear) hardware implementation. 

Typical SAR applications perform multiple FFT operations, as well as other multiplications. For example, the Single Look Complex Image processing employs four 2-dimensional FFTs and 3 Hadamard products. Using the inequality above for one (1-dimensional) FFT operation, our analysis of the Single Look Complex Image processing results in a lower bound: 
 
$n_p \geq 8v + 8(v-2) n_2 + n_1 + 3(n_3-1) + 24$
 
for the modulus word length, where $n_3$ is the word length associated with each of the inputs to the Hadamard products. We expect values of the order $v=13$, $n_1 = 7$, $n_2=10$, $n_3=16$, giving a lower bound of $n_p\geq 1060$. It is likely that the modulus word length is required to be larger than this bound for security reasons so there is plenty of scope to perform Single Look Complex Image processing.
 

References

[1]  Paillier, "Public-Key Cryptosystems Based on Composite Degree Residuosity Classes", EUROCRYPT, Springer. pp. 223–238 (1999).
[2]  Bianchi, T., Piva, A., Barni, M, "On the Implementation of the Discrete Fourier Transform in the Encrypted Domain", IEE Transaction on Information and Security , 86-97 (2009).