Harvey’s SAT encoding for lexicographic ordering

In my research I often work with satisfiability (SAT) solvers—programs that solve the Boolean satisfiability problem. Modern SAT solvers require the problem instance to be specified in a format known as conjunctive normal form. This essentially means that you need to specify all constraints of your problem in the form

\[ x_1 \lor x_2 \lor \dotsb \lor x_n \]

where $\lor$ denotes the logical OR operator and each $x_i$ is a Boolean variable like $p$ or the negation of a Boolean variable like $\lnot p$ (alternatively denoted $\bar p$). In Boolean logic, an expression like this is known as a clause. Because of De Morgan’s law, the clause $\bar x\lor\bar y\lor u\lor v$ may also be written in the form $\lnot(x\land y)\lor u\lor v$, or as a convenient shorthand we may also use the notation $(x\land y)\rightarrow(u\lor v)$.
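
As a concrete illustration of how clauses are handed to a solver (and since the short code sketches later in this post represent clauses this way), the standard DIMACS convention numbers each variable with a positive integer and writes a negated literal as a negative number; the particular variable numbering below is just an arbitrary choice for illustration.

```python
# DIMACS-style representation: variables are positive integers and a negated
# literal is the negative of its variable number.  With x=1, y=2, u=3, v=4
# the clause "not x or not y or u or v" becomes:
clause = [-1, -2, 3, 4]

# A CNF formula is simply a list of such clauses, all of which must hold.
formula = [clause]
```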

On the positive side, clauses are versatile: they are flexible enough to encode a large number of practical and theoretical problems. On the negative side, they aren’t always that convenient: there are many other kinds of constraints that arise in practice that one might want to encode, such as a cardinality constraint saying at least $k$ of the variables $x_1$, $\dotsc$, $x_n$ are true (for some given $0\leq k\leq n$). There are other kinds of more expressive solvers (like pseudo-Boolean solvers and SMT solvers) that natively support such constraints, but, perhaps surprisingly, a pure SAT approach can sometimes be the most effective solution. This is because modern SAT solvers have been very well optimized over decades of work, and it can pay off to take advantage of their efficiency even when it increases the complexity of your problem’s encoding. In this sense, SAT is a bit like the “assembly language” of combinatorial search. It may not be easy to express a problem in SAT, but doing so can be more efficient when compared with other more expressive encodings of the problem.

If you want to use a pure SAT approach but your problem has some non-clausal constraints, a workaround is to find a way of expressing those constraints in the form of clauses. In this post I want to discuss a particular kind of constraint that I’ve used frequently in my research, including in my papers on Lam’s problem and the Kochen–Specker problem. In particular, I want to discuss how to encode a lexicographic ordering constraint between two Boolean vectors in SAT. That is, say we have the $2n$ Boolean variables $x_1$, $\dotsc$, $x_n$, $y_1$, $\dotsc$, $y_n$ and we want to express that the Boolean vector $X=(x_1,\dotsc,x_n)$ is lexicographically smaller than or equal to the Boolean vector $Y=(y_1,\dotsc,y_n)$, where Boolean variables are taken to have value either $0$ (false) or $1$ (true). We’ll write this constraint as $X\leq Y$. But how can we express this in a format that a SAT solver can understand?
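
Before deriving an encoding it helps to have a reference specification of the constraint itself. Python’s built-in tuple comparison is exactly lexicographic comparison, so a brute-force specification is a one-liner, and the sketches later in this post can be checked against it:

```python
def lex_leq(X, Y):
    """X <= Y in the lexicographic order, with X and Y given as 0/1 tuples."""
    return tuple(X) <= tuple(Y)

# The vectors first differ in position 1, where X has 0 and Y has 1:
assert lex_leq((0, 1, 1), (1, 0, 0))
assert not lex_leq((1, 0, 0), (0, 1, 1))
```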

If you weren’t convinced it can make sense to go to the trouble of devising a SAT encoding for a combinatorial search problem, Donald Knuth (one of my longtime heroes) devoted the entire second half of the 714-page Volume 4B of his magnum opus The Art of Computer Programming to just the effectiveness of modern SAT solving techniques. I can’t think of a better endorsement that SAT solving is more than just a novelty, as might otherwise be assumed—it’s a powerful technology that in Knuth’s words “is key to the solution of so many other problems”.

Knuth discusses a huge number of SAT encodings in his book, including an encoding of $X\leq Y$ in paragraph 3 on page 285 of TAOCP 4B. Surprisingly, Knuth does not provide a citation for the original source of the encoding and only provides a single brief sentence about how the encoding can be derived. He says the encoding arises by considering the carries that occur when $\bar X=(\bar x_1,\dotsc,\bar x_n)$ (as a bitvector representing an unsigned integer) is added to the bitvector $Y$, a remark that I found a bit cryptic. I was curious, so I did some research and in this blog post I’ll provide some more details about this encoding.

The reason that Knuth doesn’t state the original source may simply be because there is no original paper to cite. According to a 2006 paper of Frisch et al., the encoding was originally proposed in a 2002 email by Warwick Harvey. The recipients of the email were not specified explicitly, but the authors imply that the email was sent to them after the publication of their 2002 paper Global Constraints for Lexicographic Orderings. Harvey obtained a PhD from the University of Melbourne in 1998, and in 2002, when the email in question was sent, he was a Research Fellow at the Centre for Planning and Resource Control (IC-Parc) at Imperial College London.

Ok, so Harvey’s encoding is now over 20 years old and has undoubtedly been used many times since then. How does it work? Harvey cleverly observes that the constraint $(x_1,\dotsc,x_n)\leq(y_1,\dotsc,y_n)$ can equivalently be rewritten as

\[ x_1 < y_1 + [X_{2..n}\leq Y_{2..n}] \tag{$*$} \]

where $X_{2..n}$ denotes the bitvector $(x_2,\dotsc,x_n)$ and the square brackets denote Iverson notation. This definition also relies on $()\leq()$ being considered as vacuously true (since by definition the empty vector is lexicographically equal to itself).

It is pretty straightforward to see the equivalence just by examining all possible cases. Firstly, if $x_1=0$ and $y_1=1$ then $X\leq Y$ and $(*)$ are both true. Secondly, if $x_1=1$ and $y_1=0$ then $X\leq Y$ and $(*)$ are both false. Lastly, if $x_1=y_1$ then $X\leq Y$ is equivalent to $X_{2..n}\leq Y_{2..n}$, and in this case $(*)$ reduces to $0<[X_{2..n}\leq Y_{2..n}]$, which says the same thing.
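
If you prefer to see this equivalence checked mechanically, here is a small Python sketch that evaluates $(*)$ recursively and compares it against built-in lexicographic tuple comparison over all pairs of short Boolean vectors:

```python
from itertools import product

def lex_leq_rec(X, Y):
    """Evaluate X <= Y via (*): x_1 < y_1 + [X_{2..n} <= Y_{2..n}],
    with () <= () taken to be vacuously true."""
    if len(X) == 0:
        return True
    return X[0] < Y[0] + int(lex_leq_rec(X[1:], Y[1:]))

# Exhaustive check for all pairs of Boolean vectors of length 4.
for X in product((0, 1), repeat=4):
    for Y in product((0, 1), repeat=4):
        assert lex_leq_rec(X, Y) == (X <= Y)
```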

Now, define $a_i$ (for $0\leq i\leq n$) to be a new Boolean variable representing the truth value of $X_{i+1..n}\leq Y_{i+1..n}$, so that $(*)$ becomes

\[ x_1 < y_1 + a_1 \]

and in general $a_i$ denotes the truth value of $x_{i+1} < y_{i+1} + a_{i+1}$. We also take $a_n$ to be vacuously true and will assert that $a_0$ is true, since $a_0$ denotes $X\leq Y$ (the constraint that we want to assert). In a SAT encoding it is easy to specify these variables true simply by asserting the “unit” clauses $a_0$ and $a_n$. SAT solvers will automatically use any unit clauses to simplify the encoding: intuitively, $a_0$ and $a_n$ are constants, not variables, meaning that the encoding can be rewritten without reference to either $a_0$ or $a_n$. For example, if the literal $\bar a_0$ appears in a clause it can be removed from that clause, because $\bar a_0$ is false.
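
To make the chain of variables concrete, the following sketch computes the $a_i$ values from the recurrence (working backwards from $a_n=1$) and confirms that each $a_i$ equals the truth value of $X_{i+1..n}\leq Y_{i+1..n}$ that it is meant to represent:

```python
from itertools import product

def carry_chain(X, Y):
    """Compute a_n, ..., a_0 via the recurrence a_{i-1} = [x_i < y_i + a_i],
    starting from a_n = 1 (vacuously true)."""
    n = len(X)
    a = [None] * (n + 1)
    a[n] = 1
    for i in range(n, 0, -1):
        a[i - 1] = int(X[i - 1] < Y[i - 1] + a[i])
    return a

# Check that a_i agrees with [X_{i+1..n} <= Y_{i+1..n}] for every i.
n = 3
for X in product((0, 1), repeat=n):
    for Y in product((0, 1), repeat=n):
        a = carry_chain(X, Y)
        for i in range(n + 1):
            assert a[i] == int(X[i:] <= Y[i:])
```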

So far so good, but we still need to find a way to express the inequality defining the truth value of the variable $a_{i-1}$ (i.e., $[x_i < y_i + a_i]$) for $1\leq i\leq n$ in clauses, because SAT solvers do not natively support inequalities like this. We can rewrite $x_i < y_i + a_i$ as $1<\bar x_i+y_i+a_i$ (since $\bar x_i=1-x_i$), which is equivalent to $2\leq \bar x_i+y_i+a_i$ (since all quantities here have integer values). In other words, we want to add the constraints

\[ a_{i-1} \leftrightarrow [2\leq \bar x_i+y_i+a_i] \quad\text{for}\quad 1\leq i\leq n . \]

Intuitively, as soon as any two of the values $\bar x_i$, $y_i$, $a_i$ are known to agree, the value of $a_{i-1}$ is completely determined. The variable $a_{i-1}$ can be considered the “carry bit” of the integer sum $\bar x_i+y_i+a_i$; when at least two of the summands are true then $a_{i-1}$ is true, and when at least two of the summands are false then $a_{i-1}$ is false.
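
This carry-bit characterization is easy to confirm exhaustively over the eight possible assignments:

```python
from itertools import product

# x < y + a holds exactly when at least two of the summands (1-x), y, a are
# true, i.e., when the carry bit of (1-x) + y + a is set.
for x, y, a in product((0, 1), repeat=3):
    assert (x < y + a) == ((1 - x) + y + a >= 2)
```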

In clausal form, we can write this via the following clauses for $1\leq i\leq n$:

\begin{align*}
(\bar a_i \land \bar y_i) &\rightarrow \bar a_{i-1} & (a_i \land y_i) &\rightarrow a_{i-1} \\
(\bar a_i \land x_i) &\rightarrow \bar a_{i-1} & (a_i \land \bar x_i) &\rightarrow a_{i-1} \\
(\bar y_i \land x_i) &\rightarrow \bar a_{i-1} & (y_i \land \bar x_i) &\rightarrow a_{i-1}
\end{align*}
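
As a sanity check, here is a short brute-force sketch confirming that these six clauses say exactly the same thing as the biconditional above; the variable numbering is arbitrary and only for the purposes of this check:

```python
from itertools import product

# Variable numbering for this check: x_i = 1, y_i = 2, a_i = 3, a_{i-1} = 4.
# Each clause is a list of literals; a negative literal means negation.
LEFT  = [[3, 2, -4],   # (not a_i and not y_i) -> not a_{i-1}
         [3, -1, -4],  # (not a_i and x_i)     -> not a_{i-1}
         [2, -1, -4]]  # (not y_i and x_i)     -> not a_{i-1}
RIGHT = [[-3, -2, 4],  # (a_i and y_i)         -> a_{i-1}
         [-3, 1, 4],   # (a_i and not x_i)     -> a_{i-1}
         [-2, 1, 4]]   # (y_i and not x_i)     -> a_{i-1}

def satisfied(clause, assignment):
    """A clause holds when at least one of its literals is true."""
    return any(assignment[abs(lit)] == (1 if lit > 0 else 0) for lit in clause)

# The six clauses hold exactly when a_{i-1} <-> [2 <= (1-x_i) + y_i + a_i].
for x, y, a, a_prev in product((0, 1), repeat=4):
    assignment = {1: x, 2: y, 3: a, 4: a_prev}
    clauses_hold = all(satisfied(c, assignment) for c in LEFT + RIGHT)
    assert clauses_hold == (a_prev == int((1 - x) + y + a >= 2))
```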

It turns out that the three clauses on the left are the important ones, and the three clauses on the right can be dropped. By doing this we do lose the nice intuitive equivalence that $a_i$ is true if and only if $X_{i+1..n}\leq Y_{i+1..n}$. Instead, we only know that if $a_i$ is true then $X_{i+1..n}\leq Y_{i+1..n}$, but it is possible that $a_i$ is false while $X_{i+1..n}\leq Y_{i+1..n}$ is still true. However, this latter case is acceptable so long as our only purpose is to encode that $X\leq Y$. For example, if $x_1=0$ and $y_1=1$ then we know that $X\leq Y$ already, and we don’t care whether $a_1$ takes on the same truth value as $X_{2..n}\leq Y_{2..n}$ or not; intuitively, the solver has the freedom to set $a_1$ arbitrarily. On the other hand, if $x_1=y_1$ then the clauses on the left (together with the asserted unit clause $a_0$) imply that $a_1$ is true, which in turn implies $X_{2..n}\leq Y_{2..n}$ (which in this case we do need to hold). Logically speaking, it doesn’t hurt to also include the clauses on the right, but in my experience they slow down the solver somewhat in practice, as there is a cost associated with storing those clauses in memory.
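
The claim that the left clauses suffice can also be checked by brute force for small $n$: with $a_0$ and $a_n$ fixed to true, the left clauses alone are satisfiable for some choice of $a_1,\dotsc,a_{n-1}$ exactly when $X\leq Y$. A quick sketch:

```python
from itertools import product

def left_clauses_hold(X, Y, a):
    """Check the three 'left' clauses for every i, written as implications:
    a_{i-1} -> (a_i or y_i), a_{i-1} -> (a_i or not x_i), a_{i-1} -> (y_i or not x_i).
    Here a = (a_0, ..., a_n)."""
    n = len(X)
    for i in range(1, n + 1):
        x, y = X[i - 1], Y[i - 1]
        if a[i - 1] and not ((a[i] or y) and (a[i] or not x) and (y or not x)):
            return False
    return True

# With a_0 = a_n = 1, the left clauses are satisfiable iff X <= Y.
n = 4
for X in product((0, 1), repeat=n):
    for Y in product((0, 1), repeat=n):
        sat = any(left_clauses_hold(X, Y, (1,) + mid + (1,))
                  for mid in product((0, 1), repeat=n - 1))
        assert sat == (X <= Y)
```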

Considering the simplifications that can be made from knowing that $a_0=a_n=1$, the Harvey lex encoding uses the $n-1$ new variables $a_1$, $\dotsc$, $a_{n-1}$ and $3n-2$ clauses (as two clauses with $i=n$ are trivially satisfied).
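
Putting the pieces together, here is a minimal sketch (my own illustration, not code from Harvey or Knuth) of how one might generate the encoding as DIMACS-style integer clauses, with the unit clauses $a_0$ and $a_n$ already propagated away:

```python
def encode_lex_leq(xs, ys, next_var):
    """Encode X <= Y using the 'left' clauses only, with a_0 = a_n = 1
    already simplified away.

    xs, ys   -- lists of n DIMACS variable numbers for x_1..x_n and y_1..y_n
    next_var -- first unused variable number (the auxiliaries a_1..a_{n-1}
                are allocated starting here)

    Returns (clauses, next_var) where clauses is a list of integer-literal lists.
    """
    n = len(xs)
    a = [None] + [next_var + i for i in range(n - 1)]  # a[1..n-1]
    next_var += n - 1
    clauses = []
    for i in range(1, n + 1):
        x, y = xs[i - 1], ys[i - 1]
        prev = [-a[i - 1]] if i >= 2 else []   # a_0 is true: drop its literal
        if i <= n - 1:                         # a_n is true: two i = n clauses are trivial
            clauses.append([a[i], y] + prev)   # a_i  or  y_i      or  not a_{i-1}
            clauses.append([a[i], -x] + prev)  # a_i  or  not x_i  or  not a_{i-1}
        clauses.append([y, -x] + prev)         # y_i  or  not x_i  or  not a_{i-1}
    return clauses, next_var

# Example with n = 3: x variables 1..3, y variables 4..6, auxiliaries from 7.
clauses, top = encode_lex_leq([1, 2, 3], [4, 5, 6], next_var=7)
assert len(clauses) == 3 * 3 - 2   # 3n - 2 clauses
assert top - 7 == 3 - 1            # n - 1 auxiliary variables
```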

Interestingly, almost the exact same encoding works for the strict lexicographic ordering $X<Y$. The only difference is that because $()<()$ is false we need to set $a_n$ false instead of true. In this case, the clauses with $i=n$ become $\bar y_n\rightarrow\bar a_{n-1}$, $x_n\rightarrow\bar a_{n-1}$, and $(\bar y_n\land x_n)\rightarrow \bar a_{n-1}$. The last clause is strictly weaker than each of the first two, so it need not be included. Thus, Harvey’s strict lex encoding uses $n-1$ new variables and $3n-1$ clauses.
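
The strict variant only differs at $i=n$; under the same illustrative conventions as the sketch above, it might look like this:

```python
def encode_lex_less(xs, ys, next_var):
    """Encode the strict ordering X < Y: same as encode_lex_leq except
    that a_n is false, which changes the clauses for i = n."""
    n = len(xs)
    a = [None] + [next_var + i for i in range(n - 1)]  # a[1..n-1]
    next_var += n - 1
    clauses = []
    for i in range(1, n):                       # i = 1 .. n-1: as before
        x, y = xs[i - 1], ys[i - 1]
        prev = [-a[i - 1]] if i >= 2 else []
        clauses.append([a[i], y] + prev)
        clauses.append([a[i], -x] + prev)
        clauses.append([y, -x] + prev)
    x, y = xs[-1], ys[-1]                       # i = n, where a_n is false
    prev = [-a[n - 1]] if n >= 2 else []
    clauses.append([y] + prev)                  # not y_n -> not a_{n-1}
    clauses.append([-x] + prev)                 # x_n     -> not a_{n-1}
    return clauses, next_var

clauses, top = encode_lex_less([1, 2, 3], [4, 5, 6], next_var=7)
assert len(clauses) == 3 * 3 - 1                # 3n - 1 clauses
```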