Generating de Bruijn sequences and Lyndon words

Generating de Bruijn sequences and Lyndon words

Not so long ago I encountered something called the de Bruijn sequence. For now I’ll only use this for an alphabet of (0,1), binary. But everything said here could also be applied to other alphabets. In this post I’ll describe what this sequence is, and how you can generate them, using Lyndon words.

What is a de Bruijn sequence?

Well, it is a sequence (again, in this case binary) which contains all combinations/permutations of a specific length. And it does this only once.
For example: B(2,4)

  • 000100110101111 (000)

This sequence contains all possible permutations you can make in binary of length 4. The last part (000) is optional, and needed it you do not want the sequence looped. As you can see, the length of this sequence is 16. All sequences will have length 2^N.

How do you construct these sequences?

The next thing I wanted to know is how to construct these sequences. The first hit I got was from the MathWorld website. It states:
Surprisingly, it turns out that the lexicographic sequence of Lyndon words of lengths divisible by N gives the lexicographically smallest de Bruijn sequence (Ruskey).

Lyndon word..?

It seems that the next step is generating Lyndon words. Lyndon words are the lexographically smallest rotation of a word. I haven’t yet found a proper way to do this (I know there are) so here is what I do:


lastLyndon = -1
for all possible N's {
    currentSmallestLyndon = smallestLyndonRotation(N)
    if( currentSmallestLyndon > lastLyndon ) {
        lastLyndon = currentSmallestLyndon
        print currentSmallestLyndon

And here is what I used to generate the smallestLyndonRotation (Java):

private BigInteger smallestLyndonRotation(BigInteger input, int size) {
	BigInteger lowestForm = input;
	BigInteger mask = BigInteger.ONE;
	for(int i = 1;i<size;i++) {
		BigInteger possibleLowerForm = (input.and(mask).shiftLeft(size-i)).or(input.and(mask.negate()).shiftRight(i));
		mask = mask.or(mask.shiftLeft(1));
		if(possibleLowerForm.compareTo(lowestForm) == -1) {
			lowestForm = possibleLowerForm;
	return lowestForm;

How does it work? Well, for example the input is: 0010.
It loops over all rotations (using some bit-masking):

  • 0010
  • 0100
  • 1000
  • 0001

And finds the smallest lexographically (0001) and returns this.

The above pseudo code will spit out the following numbers:

  • 0000
  • 0001
  • 0011
  • 0101
  • 0111
  • 1111

These are all the Lyndon words for N=4.

Splitting, reducing the Lyndon words

Now if we put the Lyndon words together we get:

  • 000000010011010101111111 (our sequence)
  • 0000100110101111 (de Bruijn sequence)

We aren’t quite there yet… but there is some resemblance. There is one more step left, we have to reduce/split the Lyndon words into its smallest unique part.

Lets take the Lyndon words again:

  • 0000 -> 0 (4x)
  • 0001 -> 0001
  • 0011 -> 0011
  • 0101 -> 01 (2x)
  • 0111 -> 0111
  • 1111 -> 1 (4x)

Results in: 0000100110101111, the sequence!

And that is it! If we’ve created the smallest possible de Bruijn sequence for B(2,4).

Here are some more sequences:

  1. 01
  2. 0011
  3. 00010111
  4. 0000100110101111
  5. 00000100011001010011101011011111
  6. 0000001000011000101000111001001011001101001111010101110110111111
  7. 00000001000001100001010000111000100100010110001101000111100100110010101001011100110110011101001111101010110101111011011101111111
  8. 00000000100000011000001010000011100001001000010110000110100001111000100010011000101010001011100011001000110110001110100011111001
  9. 00000000010000000110000001010000001110000010010000010110000011010000011110000100010000100110000101010000101110000110010000110110