Most software developers have never heard about SAT solvers, but they are amazingly powerful and the algorithms behind them are actually very easy to understand. The whole concept blew my mind when I learned about how they worked. This will be the first post in a series about SAT solving where we’ll end up building a simple solver in Java later on.
In this first part of the series we’ll cover:
- Look at what SAT is
- Explain the SAT DIMACS format
- Solve a puzzle: Peaceable Queens
What is SAT?
SAT is short for ‘satisfiability’, what people mean when they say SAT is: Boolean satisfiability problem.
It turns out you can write a lot of ‘problems’ in a specific way as a special boolean formula. These problems use boolean
variables that can be either
false. When a variable is assigned true or false, it is called a
An example of a boolean expression:
(A or B or ~C) and (B or C) and (~B) and (~A or C)
Conjunctive normal form (CNF)
The statement above is written in a particular way, it is called the
conjunctive normal form (or CNF). All boolean statements can be rewritten/transformed into this form.
First we’ll need to learn some more terminology. The variables have a label here, for example
A. There is also the option to have a NOT A, this is written as
The line above can be broken up into four pieces:
Each line is called a ‘clause’. Those clauses are connected together using
AND statements. A
clause can only contain variables (or negated variables) joined together by
Because each clause will have
OR between the
variables and the end of the clause is always an
AND statement, those things can be removed from the statement, turning it into:
If we do one more step you’ll end up with a format called
DIMACS, which is supported by almost all SAT solvers. The last step is to replace the name of variable
2 etc. The
NOT statement is just writing the number negative, so
Now add a
0 to the end of each line (as separator) and add a header which explains the amount of variables (= 3) and the amount of clauses (= 4) and you get:
This is a valid input file for almost all SAT solvers. You can run this for example in minisat:
And print the output:
Now we can see what a SAT solver actually does. It takes a boolean expression and tries to find a ‘solution’. It tries to find a particular combination of
literals so all the clauses are
true. If it is able to find one, it stops and returns
SAT. When it is unable to find a valid assignment it returns
UNSAT, the problem can’t be solved.
Most modern SAT solvers will not only return
UNSAT, they will also return “proof” of the UNSAT, a listing of all the clauses that conflict with eachother and cause the UNSAT. This is very powerful and can be used to verify the correctness of the solvers.
In our case we have a valid
1 -2 -3.
When we assign
A=true, B=false, C=false we satisfy our entire boolean expression.
How can we use this?
The example above is very abstract with A’s and B’s. Lets look at some other problems that can be solved by SAT solvers.
A lot of ‘problems’ can be turned into big
CNF statements which means a lot of problems can be solved by a general SAT solver! Which is kind of amazing to me, translate a hard problem into SAT/CNF and a (rather simple) general purpose solver can solve it!
Calling SAT solvers ‘rather simple’ isn’t a mistake. Sure, some implementations are very complicated pieces of finely tuned optimized software. But the algorithms and general ideas behind SAT solving are surprisingly simple. In the next coming blogpost(s), once I write them, I’ll show you how to make one in Java.
Most tutorials use a SAT solver to solve Sudoku puzzles. It is an easy problem to understand and it shows the power of SAT. It involves translating a Sudoku puzzle and all its rules into one big CNF file. Next the SAT solver will calculate a solution.
Some other (random) things SAT solvers can be used for:
- Hardware/software verification
- Scheduling (sport events)
- Routing/planning problems
Peaceable queens problem
Instead of doing what most tutorials do (writing a boolean expression for Sudoku’s) I want to do something different.
The challenge: Find solutions for a puzzle from a recent Numberphile video: Peaceable queens.
The problem is:
Given a N x N chess board, what is the highest amount of black and white queens you can place so they don’t attack each other?
How can we encode such a problem into one big
A chess board
First we need a way to describe our board. It is a chess board and can hold queens either white or black. So I decided to encode this using a variable for each square, first NxN white queens, next NxN black queens.
We don’t have to put anything in our
DIMACS file yet, but the SAT
output will contain these variables set to TRUE or FALSE.
White or black pieces
Now it is time to encode all the ‘rules’ of this puzzle. First a very simple rule: If a square has a white queen it CAN’T have a black queen.
Let’s take the first square, top left. If
TRUE we have a white queen, if
TRUE we have a black queen there. So we want either 1 or 10, not both. How do we encode a rule for this in our SAT input?
How does this work? For this
clause to be
TRUE we must have ‘NOT 1’ or ‘NOT 10’, so either one of them or both must be
Rows, columns, diagonals
To have a valid solution we must add a lot more rules, for example a row can only have either a white queen or black queen or none.
How do we do this? Let’s look at row
10,11,12 (the top row).
First we’re going to introduce two new
variables, I’ll just continue using free numbers,
21. Those will represent the top row for white and black.
Let’s analyse this. The first
clause says we must have
NOT 1 or the white row
TRUE. Next we state the same for
NOT 2 and
NOT 3. So if we have a white piece on row
1 2 3 means
20 will be
TRUE. The same is done for black, if there is a black piece on the top row
21 will be
Now we can encode the final rule for the top row just like we did with the white/black queen on the same square:
This can be done for each row, column and diagonal.
As you can imagine, this will quickly become too large to write manually. So to enumerate all possibilities I wrote a piece of Java code to generate a
DIMACS file for me instead.
This happens a lot when using SAT solvers, most of the time you’ll want to create a piece of software to output all the rules in
DIMACS for a solver to solve.
At least N-queens
If we run the SAT solver with the generated file above we encounter a small problem. It will solve very quickly by placing NO queens at all. Very ‘peaceable’ indeed, no queens are attacking.
We must tell the solver that we want to have a certain number of queens set to true. This turned out to be harder than I thought. There are algorithms online that do
at most N booleans. I decided to implement the LTseq method from this scientific paper.
Having rules like this quickly explodes the amount of clauses needed. But
LTseq seemed to be the best one.
This is what I’ve ended up with:
How can we use
at most N for this puzzle? Well in case of N=10 we want to have both white and black to have
at least 14 queens.
So I switched
at least 14 of white and black squares
TRUE to be
at most (N*N - 14) of white and black squares
This is still not very good because with (NxN - 14) we have a very large
k, thus making a LOT of clauses. I have the feeling this can be done much better, but up to now I haven’t found a proper way. Perhaps forcing the sequential counter to overflow (instead of protecting it)? This didn’t seem to work for me. If you have SAT experience and know more about encoding problems, let me know!
I ended up doing the thing that works:
And it worked like a charm. I’m now able to ‘solve’ N=10 in under two seconds using modern SAT solver glucose. When I generate my ‘rules’ it comes down to about a staggering
17336 variables in
34518 clauses (far from optimal?).
But glucose is able to find a valid solution for N=10, k=14 in 1.2 seconds.
Below is the output where I’ve omitted most variables, we are only interested in the first 200 (10 x 10 x 2) that represent the board:
And if you change this back into a chess board we can see valid solution for N=10 with 14 queens each!
I’ve uploaded all the code for this series to GitHub and called it: Boolshit
You can find the code to generate the Peaceable Queens puzzle here.
I hope this gives you an idea on how powerful SAT solvers are and how you can encode a problem into
In the next blogpost we’ll write an actual SAT solver from scratch in Java! Although it won’t be a very fast one…