Suppose we are trying to remove the trailing zeroes from some unsigned variable.
uint64_t a = ... uint64_t last_bit = a & -a; // Two's complement trick: last_bit holds the trailing bit of a a /= last_bit; // Removing all trailing zeroes from a.
I noticed that it’s faster to manually count the bits and shift. (MSVC compiler with optimizations on)
uint64_t a = ... uint64_t last_bit = a & -a; size_t last_bit_index = _BitScanForward64( last_bit ); a >>= last_bit_index
Are there any further quick tricks that would make this even faster, assuming that the compiler intrinsic
_BitScanForward64 is faster than any of the alternatives?
_tzcnt_u64 is a faster alterative of
_BitScanForward64, if it is available (it is available with BMI instruction set).
Also, you can directly use that on the input, you don’t need to isolate lowest bit set, as pointed out by @AlanBirtles in a comment.
Other than that, noting can be done for a single variable. For an array of them, there may be a SIMD solution.
Answered By – Alex Guteniev
Answer Checked By – David Marino (BugsFixing Volunteer)