Given an integer array of length N (an arbitrarily large number), how do we count the number of set bits in the array?

The simple approach is to write an efficient routine that counts the set bits in a word (the most prominent size, usually equal to the bit width of the processor) and then add up the counts of the individual array elements.

Various methods exist for counting the set bits of an integer. At best, these methods run in O(logN) time, where N is the number of bits in the integer. Note that on a given processor N is fixed, so on a 32-bit machine the count takes O(1) time irrespective of how many bits are set. Overall, the set bits in the array can be counted in O(n) time, where ‘n’ is the array size.
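As a rough sketch of this per-word approach, assuming 32-bit unsigned elements, the parallel count below combines neighbouring bit fields in log2(32) = 5 steps per word and then sums the results over the array. The names `popcount32` and `count_bits_wordwise` are illustrative and not part of the original program.

```c
#include <stddef.h>
#include <stdint.h>

/* Parallel bit count for one 32-bit word: five combining steps. */
unsigned popcount32(uint32_t v)
{
    v = (v & 0x55555555u) + ((v >> 1) & 0x55555555u);   /* sums of 2 bits  */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);   /* sums of 4 bits  */
    v = (v & 0x0F0F0F0Fu) + ((v >> 4) & 0x0F0F0F0Fu);   /* sums of 8 bits  */
    v = (v & 0x00FF00FFu) + ((v >> 8) & 0x00FF00FFu);   /* sums of 16 bits */
    v = (v & 0x0000FFFFu) + (v >> 16);                  /* sum of 32 bits  */
    return (unsigned)v;
}

/* O(n) over the array: one constant-time count per element. */
size_t count_bits_wordwise(const uint32_t *a, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += popcount32(a[i]);
    return total;
}
```

For instance, calling `count_bits_wordwise` on the four values 0, 1, 10 and 0xFFFFFFFF adds 0 + 1 + 2 + 32 set bits.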

However, a table lookup is a more efficient method when the array is large. Storing a lookup table that covers all 2^32 possible 32-bit integers would be impractical.
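To put numbers on this, assuming one byte per entry, a table indexed by a full 32-bit value would need 2^32 bytes (4 GiB), whereas a table indexed by a single byte needs only 256 bytes. The sketch below shows one possible way to use such a byte-wide table. It fills the table at run time rather than with the preprocessor, it assumes 32-bit unsigned elements, and the names `byte_bits`, `build_byte_table` and `count_bits_lookup` are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* 256 one-byte entries: byte_bits[i] holds the set-bit count of i. */
static unsigned char byte_bits[256];

/* Fill the table using bits(i) = (i & 1) + bits(i / 2). */
void build_byte_table(void)
{
    byte_bits[0] = 0;
    for (int i = 1; i < 256; i++)
        byte_bits[i] = (unsigned char)((i & 1) + byte_bits[i >> 1]);
}

/* Split each element into four bytes and add the four table entries.
   build_byte_table() must have been called first. */
size_t count_bits_lookup(const uint32_t *a, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t v = a[i];
        total += byte_bits[v & 0xFFu];
        total += byte_bits[(v >> 8) & 0xFFu];
        total += byte_bits[(v >> 16) & 0xFFu];
        total += byte_bits[v >> 24];
    }
    return total;
}
```

Extracting bytes with shifts, as done here, avoids any dependence on byte order or on pointer casts, at the cost of a few extra operations per element.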

The following code illustrates a simple program that counts the set bits in a randomly generated array of 64 K integers. The idea is to generate a lookup table for the first 256 numbers (one byte) and to split every array element at byte boundaries. A metaprogram written with the C/C++ preprocessor generates the lookup table for counting the set bits in a byte.

The mathematical derivation behind the metaprogram is evident from the following table. Add the row and column indices to get the number, then look up the corresponding entry to get the number of set bits in that number. For example, the set-bit count of 10 is found in the row labeled 8 and the column labeled 2:

      0, 1, 2, 3
 0 -  0, 1, 1, 2 -------- GROUP_A(0)
 4 -  1, 2, 2, 3 -------- GROUP_A(1)
 8 -  1, 2, 2, 3 -------- GROUP_A(1)
12 -  2, 3, 3, 4 -------- GROUP_A(2)
16 -  1, 2, 2, 3 -------- GROUP_A(1)
20 -  2, 3, 3, 4 -------- GROUP_A(2)
24 -  2, 3, 3, 4 -------- GROUP_A(2)
28 -  3, 4, 4, 5 -------- GROUP_A(3) ... and so on

From the table, a pattern emerges in multiples of 4, both in the table entries and in the group parameter. The sequence can be generalized as shown in the code.
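The pattern in the first four rows can be checked with a small sketch. Here a `GROUP_A` macro expands to the four entries of one row, and the 16 generated entries are compared with a straightforward reference count. The names `first16`, `naive_bits` and `table_matches` are illustrative and not part of the original program.

```c
/* One row of the table: counts for x0, x0 + 1, x0 + 2, x0 + 3,
   where x is the set-bit count of the row's base index divided by 4. */
#define GROUP_A(x) (x), (x) + 1, (x) + 1, (x) + 2

/* Rows 0, 4, 8 and 12 of the table: 16 entries in total. */
const unsigned char first16[16] = {
    GROUP_A(0), GROUP_A(1), GROUP_A(1), GROUP_A(2)
};

/* Reference count: clear the lowest set bit until none remain. */
unsigned naive_bits(unsigned v)
{
    unsigned count = 0;
    while (v) {
        v &= v - 1;
        count++;
    }
    return count;
}

/* Returns 1 if every generated entry agrees with the reference count. */
int table_matches(void)
{
    for (unsigned i = 0; i < 16; i++)
        if (first16[i] != naive_bits(i))
            return 0;
    return 1;
}
```

The row arguments 0, 1, 1, 2 follow the same sequence as the entries within a row, which is what allows the macro to be nested to reach 64 and then 256 entries.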

Complexity:

All operations take O(1) time except iterating over the array, so the time complexity is O(n), where ‘n’ is the size of the array. Space complexity depends on the metaprogram that generates the lookup table; the byte-wide table used here has 256 one-byte entries.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Size of array 64 K */
#define SIZE (1 << 16)

/* Meta program that generates set bit count
   array of first 256 integers */

/* GROUP_A - When combined with META_LOOK_UP
   generates count for 4x4 elements */

#define GROUP_A(x) x, x + 1, x + 1, x + 2

/* GROUP_B - When combined with META_LOOK_UP
   generates count for 4x4x4 elements */

#define GROUP_B(x) GROUP_A(x), GROUP_A(x+1), GROUP_A(x+1), GROUP_A(x+2)

/* GROUP_C - When combined with META_LOOK_UP
   generates count for 4x4x4x4 elements */

#define GROUP_C(x) GROUP_B(x), GROUP_B(x+1), GROUP_B(x+1), GROUP_B(x+2)

/* Provide appropriate letter to generate the table */

#define META_LOOK_UP(PARAMETER) \
   GROUP_##PARAMETER(0),  \
   GROUP_##PARAMETER(1),  \
   GROUP_##PARAMETER(1),  \
   GROUP_##PARAMETER(2)

int countSetBits(int array[], size_t array_size)
{
   int count = 0;

   /* META_LOOK_UP(C) - generates a table of 256 integers whose
      sequence will be number of bits in i-th position
      where 0 <= i < 256
   */

   /* A static table will be much faster to access */
   static unsigned char const look_up[] = { META_LOOK_UP(C) };

   /* No shifting funda (for better readability) */
   unsigned char *pData = NULL;

   for (size_t index = 0; index < array_size; index++)
   {
      /* It is fine, bypass the type system */
      pData = (unsigned char *)&array[index];

      /* Count set bits in individual bytes; looping over sizeof(int)
         avoids assuming that an int is exactly 4 bytes wide */
      for (size_t byte = 0; byte < sizeof(int); byte++)
      {
         count += look_up[pData[byte]];
      }
   }

   return count;
}

/* Driver program, generates table of random 64 K numbers */
int main(void)
{
   int index;

   /* static keeps the 64 K element array off the stack */
   static int random[SIZE];

   /* Seed to the random-number generator */
   srand((unsigned)time(0));

   /* Generate random numbers. */
   for (index = 0; index < SIZE; index++)
   {
      random[index] = rand();
   }

   printf("Total number of bits = %d\n", countSetBits(random, SIZE));
   return 0;
}
```