Integer Log Base 2

From WikiCoder

Find the log base 2 of an integer with the MSB N set in O(N) operations (the obvious way)

 unsigned int v; // 32-bit word to find the log base 2 of
 unsigned int r = 0; // r will be lg(v)
 while (v >>= 1) // unroll for more speed...
 {
  r++;
 }

The log base 2 of an integer is the same as the position of the highest bit set (or most significant bit set, MSB). The following log base 2 methods are faster than this one.

Find the integer log base 2 of an integer with an 64-bit IEEE float

 int v; // 32-bit integer to find the log base 2 of
 int r; // result of log_2(v) goes here
 union { unsigned int u[2]; double d; } t; // temp
 t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] = 0x43300000;
 t.u[__FLOAT_WORD_ORDER!=LITTLE_ENDIAN] = v;
 t.d -= 4503599627370496.0;
 r = (t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] >> 20) - 0x3FF;

The code above loads a 64-bit (IEEE-754 floating-point) double with a 32-bit integer (with no paddding bits) by storing the integer in the mantissa while the exponent is set to 252. From this newly minted double, 252 (expressed as a double) is subtracted, which sets the resulting exponent to the log base 2 of the input value, v. All that is left is shifting the exponent bits into position (20 bits right) and subtracting the bias, 0x3FF (which is 1023 decimal). This technique only takes 5 operations, but many CPUs are slow at manipulating doubles, and the endianess of the architecture must be accommodated.

Find the log base 2 of an integer with a lookup table

 static const char LogTable256[256] = 
 {
 #define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
    -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
    LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
    LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
 };
 unsigned int v; // 32-bit word to find the log of
 unsigned r;     // r will be lg(v)
 unsigned int t, tt; // temporaries
 if (tt = v >> 16)
 {
  r = (t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt];
 }
 else 
 {
  r = (t = v >> 8) ? 8 + LogTable256[t] : LogTable256[v];
 }

The lookup table method takes only about 7 operations to find the log of a 32-bit value. If extended for 64-bit quantities, it would take roughly 9 operations. Another operation can be trimmed off by using four tables, with the possible additions incorporated into each. Using int table elements may be faster, depending on your architecture. The code above is tuned to uniformly distributed output values. If your inputs are evenly distributed across all 32-bit values, then consider using the following:

 if (tt = v >> 24) 
 {
  r = 24 + LogTable256[tt];
 } 
 else if (tt = v >> 16) 
 {
  r = 16 + LogTable256[tt];
 } 
 else if (tt = v >> 8) 
 {
  r = 8 + LogTable256[tt];
 } 
 else 
 {
  r = LogTable256[v];
 }

To initially generate the log table algorithmically:

 LogTable256[0] = LogTable256[1] = 0;
 for (int i = 2; i < 256; i++) 
 {
  LogTable256[i] = 1 + LogTable256[i / 2];
 }
 LogTable256[0] = -1; // if you want log(0) to return -1

Find the log base 2 of an N-bit integer in O(lg(N)) operations

 unsigned int v;  // 32-bit value to find the log2 of 
 const unsigned int b[] = {0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000};
 const unsigned int S[] = {1, 2, 4, 8, 16};
 int i;
 unsigned int r = 0; // result of log2(v) will go here
 for (i = 4; i >= 0; i--) // unroll for speed...
 {
  if (v & b[i])
  {
    v >>= S[i];
    r |= S[i];
  } 
 }

OR (IF YOUR CPU BRANCHES SLOWLY):

 unsigned int v;	         // 32-bit value to find the log2 of 
 register unsigned int r; // result of log2(v) will go here
 register unsigned int shift;
 r =     (v > 0xFFFF) << 4; v >>= r;
 shift = (v > 0xFF  ) << 3; v >>= shift; r |= shift;
 shift = (v > 0xF   ) << 2; v >>= shift; r |= shift;
 shift = (v > 0x3   ) << 1; v >>= shift; r |= shift;
                                         r |= (v >> 1);

OR (IF YOU KNOW v IS A POWER OF 2):

 unsigned int v;  // 32-bit value to find the log2 of 
 static const unsigned int b[] = {0xAAAAAAAA, 0xCCCCCCCC, 0xF0F0F0F0, 
                                  0xFF00FF00, 0xFFFF0000};
 unsigned int r = (v & b[0]) != 0;
 for (i = 4; i > 0; i--) // unroll for speed...
 {
  r |= ((v & b[i]) != 0) << i;
 }

Of course, to extend the code to find the log of a 33- to 64-bit number, we would append another element, 0xFFFFFFFF00000000, to b, append 32 to S, and loop from 5 to 0. This method is much slower than the earlier table-lookup version, but if you don't want big table or your architecture is slow to access memory, it's a good choice. The second variation involves slightly more operations, but it may be faster on machines with high branch costs (e.g. PowerPC).

Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup

 uint32_t v; // find the log base 2 of 32-bit v
 int r;      // result goes here
 static const int MultiplyDeBruijnBitPosition[32] = 
 {
  0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
  8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
 };
 v |= v >> 1; // first round down to one less than a power of 2 
 v |= v >> 2;
 v |= v >> 4;
 v |= v >> 8;
 v |= v >> 16;
 r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> 27];

The code above computes the log base 2 of a 32-bit integer with a small table lookup and multiply. It requires only 13 operations, compared to (up to) 20 for the previous method. The purely table-based method requires the fewest operations, but this offers a reasonable compromise between table size and speed. If you know that v is a power of 2, then you only need the following:

 static const int MultiplyDeBruijnBitPosition2[32] = 
 {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
 };
 r = MultiplyDeBruijnBitPosition2[(uint32_t)(v * 0x077CB531U) >> 27];

Compiler Intrinsics

Often the CPU you are computing on already has an instruction to count leading zeros. This instruction can be used to compute the highest bit set via (for 32-bit) 32 - CountLeadingZero(val).

On MSVC, non-AVX2:

 unsigned highest_bit_set; 
 _BitScanReverse(&highest_bit_set, val);

On MSVC, with AVX2:

 #include <immintrin.h>
 unsigned highest_bit_set = 32 - _lzcnt_u32(val); 
 // Or for 64-bit
 unsigned highest_bit_set = 64 - _lzcnt_u64(val);

On GCC/Clang there is a handy built-in for this:

 unsigned highest_bit_set = 32 - __buildin_clz(val);
 // Or for 64-bit
 unsigned highest_bit_set = 64 - __buildin_clzll(val);

Handy wrapper handling both and zero:

int IntLog2(uint64_t value) {
#ifdef _MSC_VER
	unsigned long result;
	if (!_BitScanReverse64(&result, value))
		return -1;

	return (int)result;
#else
	if (!value)
		return -1;

	return 63 - __builtin_clzll(value);
#endif
}

References