Circuit Minimization Work

We plan to post in this page results obtained by the "circuit minimization team" (CMT). This is an informal group of friends and colleagues that are working on the problem of finding "good" circuits over GF2 (alternatively, circuits using only AND, XOR, and XNOR gates). XNOR gates are represented by the operation t = a # b , meaning t = (a + b + 1) modulo 2. The definition of "good" varies: small, low-depth, few AND gates, and so on.

  1. Multiplication in GF(2^6) using irreducible polynomial x^6 + x^3 + 1 . Here is the straight-line program for a circuit with 57 gates and depth 5.
    Here
    is a pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.
    The Ai's and Bi's are the coefficients of polynomials A, B. The Ci's are the coefficients of C = A*B.
    The best circuit we can find in the literature for this problem, here, has depth 5, size 61.  

  2. Multiplication in GF(2^7). 
    1. Using irreducible polynomial x^7 + x^4 + 1 .
      The best published circuit we can find for this has depth 7 and 100 gates.
      Here
      is the straight-line program for a circuit with 84 gates and depth 5.
      Here
      is a pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates. 
    2. Using irreducible polynomial x^7 + x^3 + 1 .
      The best published circuit we can find for this has depth 7 and 100 gates.
      Here
      is the straight-line program for a circuit with 85 gates and depth 5.
      Here
      is a pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates. 

  3. Multiplication in GF(2^8) using the AES polynomial x^8 + x^4 + x^3 + x + 1 . The best published circuit we can find for this has 135 gates and depth 7 ( link ).
    Here
    is the straight-line program for a circuit with 117 gates and depth 6.
    The Ai's and Bi's are the coefficients of polynomials A, B. The Ci's are the coefficients of C = A*B.
  4. Multiplication in GF(2^8) : custom-built field representation. The next circuit was built using a tower field representation. It has 106 gates and is of depth 6. It probably beats any previous (as of Sept. 20, 2017) result for GF(2^8) multiplication. Here is the straight-line program.
    Send me email for details of the construction and representation. The inputs to the circuit are the Ai's and Bi's. The outputs are the Ci's.
  5. Multiplication in GF(2^16) : custom-built field representation. The next circuit was built using a tower field representation. It has 374 gates and is of depth 8. It probably beats any previous (as of Sept. 20, 2017) result for GF(2^16) multiplication. Here is the straight-line program.
    Send me email for details of the construction and representation. The inputs to the circuit are the Ai's and Bi's. The outputs are the Ci's.
  6. A 16-bit Sbox. The paper "Customizable Sponge-Based Authenticated Encryption Using 16-bit S-boxes" (in Military Communications Conference, MILCOM 2015 - 2015 IEEE) presents a 16-bit Sbox based on GF(2^16) inversion. It quotes a size of 1382 gates (1238 XOR gates and 144 AND gates, no depth is given). The next circuit for the same Sbox has 462 gates and depth 35 (349 XOR gates and 113 AND gates). Here is the straight-line program.
    Send me email for details of the construction and representation.

  7. Binary Multiplication.
    This is an old and much-studied problem. It can also be viewed as that of multiplication of polynomials of degree n over GF(2).
    Dan Bernstein's results at High-speed cryptography in characteristic 2 have been recently improved by Murat Cenk and M. Anwar Hasan (see Some New Results on Binary Polynomial Multiplication.) Further improvements in size, depth, or both are shown below.

    Here, M(n) is the minimum number of bit operations (ANDs and XORs) needed to multiply two polynomials of degree n-1 .
    It is not hard to show that M(n) \leq M(n-1) + 4(n-1).
    Below are a few circuits synthesized in the last few years.
    Our software needs to be rewritten in order to synthesize circuits for polynomials of degree larger than 20 or so. 
    1. M(10) \leq 154.
      Here
      is the straight-line program for a circuit with 154 gates and depth 7. 
      Here is the straight-line program for a circuit with 155 gates and depth 6.  
    2. M(11) \leq 186.
      Here
      is the straight-line program for a circuit with 186 gates and depth 7. 
      Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.  
    3. M(12) \leq 207.
      Here
      is the straight-line program for a circuit with 207 gates and depth 7. 
      Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates.  
    4. M(13) \leq 255.
      Here
      is the straight-line program for a circuit with 255 gates and depth 8.  
    5. M(15) \leq 312.
      This illustrates why I think we don't really know the values of M(n).
      Bernstein's 2009 circuit has 329 gates. Cenk and Hasan improved this to 317 gates in 2015.
      Here
      is an optimization of the latter circuit with 312 gates and depth 9 (send me email for lower-depth circuits).  
    6. M(16) \leq 349 (and therefore M(17) \leq 413).
      Here
      is the straight-line program for a circuit with 349 gates and depth 8. 
      Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates. 
    7. M(18) \leq 454.
      Here
      is the straight-line program for a circuit with 454 gates and depth 8. 
      Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates. 
    8. M(19) \leq 498.
      Here
      is the straight-line program for a circuit with 498 gates and depth 8. 
      Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates. 
    9. M(20) \leq 527.
      Here
      is the straight-line program for a circuit with 527 gates and depth 8. 
      Here is the pdf rendering of the circuit. The red nodes are AND gates. The yellow nodes are XOR gates. 

  8. The S-Box of AES.
    Here is a circuit for the S-Box of AES. The circuit is described as a straight-line program using ANDs, XORs, XNORs (denoted by #). It contains 113 gates (32 AND gates and 81 XOR/XNOR gates) and has depth 28. 
    Here is a circuit for the reverse direction of the S-Box. The circuit has size 121 and depth 21. We haven't worked much on this.
    The S-Box can be reduced to depth 16 and size 125 in the forward direction and 127 in the reverse direction.
    Here is a circuit for the forward directions of the S-Box. Here is a circuit for the reverse direction of the S-Box.

  9. All predicates on 4 bits.
    The multiplicative complexity of a function is the number of AND gates needed to implement it over the basis AND,XOR,1 (the 1 is for negation).
    The algebraic normal form of a function does not say much about this metric.
    For example, it is not easy to build a circuit with minimum number of AND gates for a function such as

    g(a,b,c,d) = abcd + abc + abd + acb + acd + ab + ac + ad + bc + bd + cd.

    The following facts have been verified:
    1) Every predicate on four bits can be implemented with at most 3 AND gates and at most 7 XOR gates; (for functions that have the constant 1 in their algebraic normal form, the last gate would have to be negated; there are only two functions - modulo permutation of inputs - which appear to require 7 XOR gates)
    2) Every predicate on four bits, but of degree 3, can be implemented with 2 AND gates.


    The following (large) file contains straight-line programs (SLPs) for all predicates f(x1,x2,x3,x4) with f(0,0,0,0) = 0. The SLPs are indexed by fn, where n is the decimal number corresponding to the truth table of the function. For example the function x1*x2*x3*x4 has truth table 0000 0000 0000 0001 = 1. So you can find it under "f1". The circuits were independently verified by Chris Wood (many thanks!).
    If you denote by C_n[A,X] the set of functions with multiplicative complexity A for which an AND-optimal circuit requires exactly X XOR gates, then C_n[A,X] is closed under permutation of inputs (but not under arbitrary affine transformations). This property allows for detection of sub-optimal circuits. I have not coded this yet, so many of the circuits below are likely to be suboptimal.
    There are the two functions (up to permutation of inputs) that use 7 XOR gates and 3 AND gates. These functions do not appear to be special in any way. Magnus Find used a SAT solver to look into this. The SAT solver "proved" that neither function has a circuit with 6 XOR gates. Given other functions, the SAT does find circuits with 6 XOR gates. So it would appear that these two functions are special in some way.

    Predicates on four bits.


  10. All predicates on 5 bits. Meltem Turan, Cagdas Calik, and I have shown that all predicates on 5 bits have multiplicative complexity at most 4. I lost a bet about this with Cagdas (I thought it would be 5 or more). Multiplicative complexity grows exponentially with the number of input bits, so one of these days I will win a bet.

Currently, CMT is Joan Boyar, Morris Dworkin, Rene Peralta, Meltem Turan, Cagdas Calik, Luis Brandao. Past collaborators include: Michael Bartock, Ramon Collazo, Magnus Find, Michael Fischer, Cagdas Calik, Christopher Wood, Andrea Visconti, Chiara Schiavo, Holman Gao, Bruce Strackbein, Larry Bassham .