When faced with a function, and you're not sure what it's doing, try to break it into separate parts, and look at the data. This can be kinda fun if you have the right mindset about it.
First, you see a bunch of shifting (<<) and bitwise-ANDing (&).
So, something is being done at the bit level, 1s and 0s. Let's represent our data in binary with a helper function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
|
#include <iostream>
#include <string>
std::string binary_str(int n, int num_bits)
{
std::string output(num_bits, '0');
for (int i = 0; i < num_bits; i++)
{
char bit = (n % 2) + '0';
output[i] = bit;
n = n / 2;
}
return std::string(output.rbegin(), output.rend()); // reversed
}
int main()
{
int x = 0b1010;
std::cout << binary_str(x, 4);
}
| |
This will print 1010.
Let's look at just the first actual logical line: 5.
It deals with two variables, x and size.
Let's get rid of the unnecessary parentheses...
x = x & (1 << size) - 1;
(1 << size) is left-shifting the number 1
size places.
So (1 << 1) means 1 shifted left 1 place, which becomes binary 10.
(1 << 2) means 1 shifted left 2 places, which becomes binary 100.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
|
#include <iostream>
#include <string>
int part_A(int size)
{
return 1 << size;
}
std::string binary_str(int n, int num_bits)
{
std::string output(num_bits, '0');
for (int i = 0; i < num_bits; i++)
{
char bit = (n % 2) + '0';
output[i] = bit;
n = n / 2;
}
return std::string(output.rbegin(), output.rend()); // reversed
}
int main()
{
using std::string;
int num_bits = 4;
for (int x = 0; x < 16; x++)
{
// print [input] [output]
string input = binary_str(x, 20);
string output = binary_str(part_A(num_bits), 20);
std::cout << input << " " << output << '\n';
}
}
| |
Effectively, this means that it's making the first
size bits 0, and then 1 after that.
e.g. (1 << 5) ---> 0b100000.
std::cout << binary_str(1 << 5, 6) << '\n';
But then, you have the result - 1. If you're familiar with how binary numbers work, if you have a power of 2 (e.g. 1000) and subtract 1, you get one less than a power of 2, which will be all 1s.
So,
0b1000 - 1 == 0b111.
0b10000 - 1 == 0b1111.
In short,
(1 << size) -1
is producing a number that
size 1s in it.
Then, ANDing the original x with it makes it so that all binary digits beyond the first
size digits become 0, so we don't have to worry about them.
In other words line 5 makes it so we can assume we're dealing with a number that is only size-bits wide. Any larger bits are discarded (become 0).
_________________________________________________________________
Now, we have the for loop. This part is easier to reason about.
(x & 0x1):
When you AND a number with 1, you're basically asking "is the least-significant digit 1"?
(binary) 10 would produce false, 11 would be true, 10101 would be true, 10010 would be false.
Then, it shifted right 1 place. This is the same thing as integer dividing by 2,
so 10010 becomes 1001, for example. This lets us look at the next bit to check its evenness.
So, the for loop is going through each digit, and checking if the digit is a 1, and if so, adding to a counter.
Or, simply put, it's counting the number of binary 1s in the number.
We can test this hypothesis:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
|
int part_B(int x, int size)
{
int p = 0;
x = (x & ((1<<size)-1));
for (int i=0; i<size; i++)
{
if (x & 0x1) p++;
x = x >> 1;
}
return p;
}
#include <string>
std::string binary_str(int n, int num_bits)
{
std::string output(num_bits, '0');
for (int i = 0; i < num_bits; i++)
{
char bit = (n % 2) + '0';
output[i] = bit;
n = n / 2;
}
return std::string(output.rbegin(), output.rend()); // reversed
}
#include <iostream>
int main()
{
using std::string;
int num_bits = 4;
for (int x = 0; x < 16; x++)
{
// print [input] [output]
string input = binary_str(x, num_bits);
int output = part_B(x, num_bits);
std::cout << input << " " << output << '\n';
}
}
| |
0000 0
0001 1
0010 1
0011 2
0100 1
0101 2
0110 2
0111 3
1000 1
1001 2
1010 2
1011 3
1100 2
1101 3
1110 3
1111 4 |
See? The output is the number of 1s in the input.
_________________________________________________________________
Finally, we take this result (p), and do (p & 1). This is the same thing as before, but in the context, it means (p & 1) means "is p odd".
This is compared to 0,
i.e. if there are an even amount of 1s in the original number, the function returns true.
If there is an odd amount of 1s in the original number, the function returns false.
Putting it all together...
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
|
bool parity(int x, int size)
{
int i;
int p = 0;
x = (x & ((1<<size)-1));
for (i=0; i<size; i++)
{
if (x & 0x1) p++;
x = x >> 1;
}
return (0 == (p & 0x1));
}
#include <string>
std::string binary_str(int n, int num_bits)
{
std::string output(num_bits, '0');
for (int i = 0; i < num_bits; i++)
{
char bit = (n % 2) + '0';
output[i] = bit;
n = n / 2;
}
return std::string(output.rbegin(), output.rend()); // reversed
}
#include <iostream>
int main()
{
using std::string;
int num_bits = 4;
for (int x = 0; x < 16; x++)
{
// print [input] [output]
string input = binary_str(x, num_bits);
int output = parity(x, num_bits);
std::cout << input << " " << output << '\n';
}
}
| |
0000 1
0001 0
0010 0
0011 1
0100 0
0101 1
0110 1
0111 0
1000 0
1001 1
1010 1
1011 0
1100 1
1101 0
1110 0
1111 1 |
It returns whether or not the input number has an even number of 1s.
______________________________________
Afterthoughts:
I don't know assembly very well, but you can play around with websites like godbolt.org to see what the C++ compiler is producing.
For example, your function creates the following assembly code on -O3 optimization
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
|
parity(int, int):
mov ecx, esi
mov eax, 1
mov r8d, edi
sal eax, cl
lea edi, [rax-1]
and edi, r8d
test esi, esi
jle .L5
xor eax, eax
xor edx, edx
.L4:
mov r8d, edi
and r8d, 1
cmp r8d, 1
sbb eax, -1
add edx, 1
sar edi
cmp ecx, edx
jne .L4
not eax
and eax, 1
ret
.L5:
mov eax, 1
ret
| |
If we look at the pattern I posted in the output above the assembly, you might notice that the output is, sequentially, is 1001 0110 0110 1001. You can kinda see a recursive, fractally thing going on.
But let's search for 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1 on OEIS.org
You'll find that this is the Thue-Morse sequence, described as,
"
a(n) = "parity sequence" = parity of number of 1's in binary representation of n.", or
"
a(n) = S2(n) mod 2, where S2(n) = sum of digits of n, n in base-2 notation. "
If you search for this, you can find some interesting fractals, like the Koch snowflake.
I guess for whatever reasons, the compiler isn't smart enough to pick on some optimizations. We can help the compiler out a bit by removing the if-branch.
1 2 3 4 5
|
for (i=0; i<size; i++)
{
p += (x & 0x1);
x = x >> 1;
}
| |
In this case, this only eliminates, I think, one or two instructions? So not much difference, but it is interesting to note that even with today's advanced compilers, there are places where if you try hard to not use if-else statements, the compiler can optimize more.
But if we go back to the OEIS description, note that the second description is "sum of digits, mod 2". In mod 2, addition is equivalent to xor. So... we should be able to change our for loop.
1 2 3 4 5 6 7 8 9 10 11
|
int parity_v3(int x, int size)
{
x = x & (1 << size) - 1;
int ret = 1;
for (int i = 0; i < size; i++)
{
ret = ret ^ (x & 0x1);
x = x >> 1;
}
return ret;
}
| |
I believe this produces the same result. I don't think its any faster than the above method (at least, not in any significant way), but I just thought it was interesting.