Find lowest set bit

So today I was working with Vulkan and needed to somehow stringify the state of a VkDebugReportFlagsEXT. I was too lazy to concatenate strings so I decided to only take into account the first bit, and ignore the rest (this is for internal debugging purposes so… meh). So I needed a way to find the lowest set bit in an unsigned integer. Here’s what I came up with:

uint find_lowest_set_bit(uint n)
    return (~(n - 1)) & n;

Now for a step-by-step explanation of what’s going on, with A = 01001100. First we subtract 1 which sets the right-trailing zeros to 1 and the first set bit (which we are looking for) to 0. Essentially subtracting 1 inverts the lowest bits up to and including the first set bit. This comes up pretty often in bit twiddling.

- 00000001
  01001011 = B

Since the only bits that changed are the one we are interested in and the ones below it, we look for the bit that was 1 and is now 0, the bits we don’t care about went from 0 to 1. This means we can invert the previous result and AND it with the original number.

If the reasons for this are a bit obscure to you, just imagine two boolean flags representing previous and current states. We need to find when previous was true and current false. This translates in most programming languages to previous && !current. The booleans && and ! operators correspond to the & and ~ bits operators, so in our case it translates to A & (~B). I like to think of bits operators as boolean SIMD instructions. :)

~ 01001011
& 01001100

And there we go. We can verify that this still works on edge-cases :

  11111111                     00000000
- 00000001                   - 00000001
----------                   ----------
~ 11111110                   ~ 11111111
----------                   ----------
  00000001                     00000000
& 11111111                   & 00000000
----------                   ----------
  00000001                     00000000

For reference, this compiles to 3 instructions: (with clang)

  mov rax, rdi
  neg rax
  and rax, rdi

I’m probably not the first one to come up with this, but I like figuring stuff out by myself and sharing it. Also I want to start posting on this blog.

Bonus: this is the first version I came up with : (((n - 1) ^ n) + 1) >> 1, which compiles in one more instruction.

  lea rax, [rdi - 1]
  xor rax, rdi
  inc rax
  shr rax