Wednesday, April 29, 2015

Homomorphic Encryption At EUROCRYPT 2015

This is a blog post from the Eurocrypt 2015 conference held in Sofia, Bulgaria, this week. There were a number of talks which were related to homomorphic encryption; which we discuss below.

Kristen Lauter: Practical Applications of Homomorphic Encryption

Kristen gave one of the three invited talks, and she focused on topics which are closely related to the HEAT project. She first described the Secure Genome Analysis competition run last month, which was sponsored by the United States' NIH funding agency. The idea behind the competition was to encourage research on practical computations on encrypted genomic data. The competition involved both FHE and MPC technology, and attracted eleven submissions. The submissions related to FHE from Kristen's group at Microsoft as well as submissions from IBM, Stanford/MIT, UCI and Tsukuba.

The competition raised a lot of excitement in the bioinformatics community and in popular science journals. The reason for this excitement is the confluence of interest in both privact protection and genome research data, which on its own poses a significant risk in relation to personal data. The competition also encouraged interdisciplinary work between Computer Science, Mathematics,
Bioinformatics and Policy.

The spur in research in the area has resulted from the drop in the cost of sequencing a human genome. This was estimated to cost 3 billion US dollars in 2000 per person, but now costs around 1000 USD per person. This drop in cost opens up new applications in personalized medicine, forensic work, ancestry, anthropology and other areas. Yet, it brings to the fore the need for  LARGE
SCALE PRIVACY PRESERVING analysis. Since the privacy risks are not only for the people being sequenced but also their families.

Kristen went onto discuss the various Secure GWAS (Genome Wide Association Scenarios) that the competition covered. The two main areas were where there was a single data owner (e.g. one patient and one hospital) versus multiple data owners (which are mututally distrusting).  The talk focused on the single data owner case, as the other case is really more suited to MPC than FHE.

There were two challenges in the single data owner case. The first challenge was related to large scale statistical studies. Two hundred cases were taken from the PGP (Personal Genome Project) from Harvard Medical School, as well as 200 randomized controls and various statistical operations were performed. It turned out that the required computation could be done by only the additively homomorphic Paillier system. However, the somewhat homomorphic LWE based system had faster evaluation time than the Paillier scheme, and was hence relatively competitive.

In the second challenge two individuals were randomly selected from the PGP and 5000, 10000 and 100000 locations in the genome sequence were also selected. The competitors task was to compute the edit distance and Hamming Weight on the two sequences at the given locations. Despite the complexity of the task (the edit distance has a relatively deep circuit), the processing times for the LWE based schemes was measured in seconds as opposed to minutes.

It turned out that much of the work in these challenges was not directly cryptographically related. There were issues related to cleaning of the data, how data was encoded, time vs memory trade-offs in the algorithms used, as well as parameter selection (related to function and data size as opposed to
security).

Kristen then discussed another example, which was demonstrated at a recent American Assoication for the Advancement of Science (AAAS) meeting. In this application encrypted measurements of an individuals weight, age, height, blood pressure etc were transmitted to the cloud, and then the cloud produced an encryption of the likelihood of the individual having a heart attack. The overall computation took less than 0.2 of a second on the cloud, however the cost of sending the ciphertext to the cloud was far more due to the large ciphertext sizes.  In some sense this was a silly example as the function could be computed by the individual themselves. However there are scenarios where the function is private, thus this example demonstrates that something can be done in such situations.

Kristen ended with a number of research open problems which included that of reduced the storage and transmission costs, which often dominated the computation in the examples considered in the talk, the problem of analyzing the new hard problems such as RLWE upon which most schemes depend, as well as moving the computations to those of real life in scale by (for example) considering larger amounts of data or deeper circuits.

Ciphers for MPC and FHE:  Martin Albrecht, Christian Rechberger, Thomas Schneider, Tyge Tiessen and Michael Zohner

Earlier in the conference this paper was presented which tries to reduce the storage requirements by using a hybrid FHE scheme $HE_{pk}(k), Enc_k(m)$. The key problem being to define a symmetric encryption scheme $Enc_k$ which is easy to homomorphically evaluate using a homomorphic encryption scheme. This is a new design requirement for such symmetric primitives. The main difference between standard ciphers is that in the homomorphic environment XOR gates are cheaper than AND gates (a situation which holds also in MPC applications as well as FHE ones). Usually cipher designers assume the two costs are equivalent. Thus the question arises as to what would an efficient cipher look like if such linear operations are for free.

The paper considered a number of possible metrics for such ciphers for example i) the number of ANDs per bit of encrypted text, ii) the multiplicative depth, or iii) the total number of ANDs per encryption. Existing schemes such as Noekeon, Fantomas, and Robin meet these design criteria; but they were introduced for side-channel resistance. For FHE applications we need something more extreme.

The authors decided to pick an SPN design with very small S-Boxes. However, not all S-Boxes where used in each substitution later. The parameters of the scheme are the block size n, the number of S-Boxes used in the substitution layer m, the key size k, and the data complexity d (effort required to break the cipher). From these values they estimate the required number of rounds r.

For the S-Boxes the authors selected a 3-bit to 3-bit S-box consisting of three AND gates and having multiplicative depth one. The affine layers are selected by choosing a uniform random binary matrix and vector and computing
$$Ax+b.$$
To produce a NUMS (Nothing Up My Sleeve) matrix the Grain LFSR is used to generate the matrix.

The talk went on to discuss security against differential, algebraic and other forms of cryptanalysis. Coming up with example parameters of $n=256$, $m=63$, $r=12$ and $d=2^{128}$ with a total AND depth of twelve (compared to $40$ for AES).

Fully Homomorphic Encryption over the Integers Revisited: Jung He Cheon and Damien Stehle

This paper examined the security of integer based FHE/SHE schemes. These are derived from the original DGHV scheme and their security is based on the AGCD problem. There are two key issues with schemes based on AGCD; first there is no known reduction to a classical lattice problem (unlike the LWE based schemes) and the schemes need an additional sparse subset sum problem for
bootstrapping.

This talk solved both of these problems. It showed a reduction from LWE to AGCD, and hence AGCD is no easier than worst case lattice problem. It then showed that the cost estimate of various previous orthogonal lattice attacks on these schemes was overly pessimistic. Finally it gave an integer based scheme which was as secure as LWE, had smaller ciphertexts than previous schemes and
which did not require an additional assumption for bootstrapping.

The new scheme was similar to previous schemes in that encryption involved sampling a subset of encryptions of zero and then adding these onto the message. The message itself was embedded in the upper bits of the ciphertext (as in usual scale invariant schemes). Finally multiplication was performed using the usual trick of BitDecomp and PowerOfTwo used in other schemes.

Leo Ducas and Daniele Micciancio: FHEW: Bootstrapping in less than a second.

Whilst HELib (see below) can do bootstrapping in about 6 mins, and this is amortized over many ciphertexts using SIMD operations, this work refreshes one ciphertext, with no amortization, in 0.61 seconds on a 3GHz machine.

The key idea is to provide a new cheap NAND gate and then a simpler refreshing procedure using the ring structure. The cost of the Homomorphic NAND operation is relatively very small.  This was then combined with the general bootstrapping  approach using the ring variant of GSW scheme and using the approach of Alperin-Sheriff and Peikert from last year.  Implementing mod q arithmetic "in
the exponent".   The decryption circuit itself is implemented as an accumulator; which is very well suited to the GSW scheme.  To do reduction mod q we embed the group $(Z_q,+)$ into the $q$th roots of unity.

For sample parameters the authors have a bootstrapping key size of 1.3 GB. The main issue is computing fast fourier transforms (for which they used the FFTW library).

Bootstrapping for HELib: Shai Halevi and Victor Shoup.

The talk started by discussing the three generations of FHE schemes (as typified by the original Gentry scheme, the BGV scheme and the GSW scheme). The slow noise growth in third generation schemes such as GSW is however incompatible  with ciphertext packing.  Thus in terms of efficiency there is a fork in current schemes. One either uses packing and large noise (as in BGV style schemes) or small noise and no packing (as in GSW style schemes). The purpose of this talk was to focus on schemes which support packing.

There are very few bootstrapping implementations; and none (up to now) utilize packing. The paper starts by utilizing the techniques of GHS12 (The "Better Bootstrapping" paper), which focuses on $p=2$ and $q=2^e+1$. The technique requires one to encrypt the secret key modulo the plaintext
space $2^{e+1}$. Then we need to extract the most significant bit; this is the non-linear part of the procedure.

The problem in GHS12 was applying the procedure to packed ciphertexts. This is a linear operation. Then we do the above non-linear step. Then reverse the linear step. Alperin-Sheriff and Peikert showed how to perform these linear steps more efficiently.  This paper generalises GHS12 technique to arbitrary $p$, and not just $p=2$ and then specializes the AP linear techniques to make them more efficient.

Computationally the hard bit is the non-linear step, the hardest to implement is however the linear parts.  The authors presented a technique to implement matrix-vector multiplicatin in a SIMD manner. This method uses a arotation methodology (slow but adds little noise) and multiplication by constants (fast but adds lot of noise). To get the algebra to work they needed to impose constraints on m and p (dimension and plaintext modulus).

For small $p$ bootstrapping takes 12 levels, for larger $p$ can go to 24 (or even 48 levels). Can put extension field elements in the slots, and not just integers.  Single thread performance between 5 mins and 5 hours. For AES style parameters have a depth 24 evaluation of which 11 are for bootstrapping,
allowing enough for a round of AES to be done.  Now AES with bootstrapping allows much smaller parameters, without need bigger parameters. The new implementation without bootstrapping can evaluate 120 blocks in 245 secs, whilst the method with bootstrapping can evaluate 180 blocks in 1050 seconds