I have loaded a binary file into memory (using ios::binary). I need to search for 10101 (or 0x15), which I did by searching each value returned by file.get(). What I have just realised is that 10101 may be part in the first 8 bits of file.get and part in the next 8 bits of file.get, ofcourse I will miss this by my search method. Does anyone know of a better search method?
You'll have to do some fancy bit-shifting.
You'll need an unsigned short (two bytes) so that you can check for matches that cross byte boundries:
[byte 1][byte 0] (see 4 below) 1100010101001101 10101 match
Then check for each possible bitmask by bitshifting, &&ing, and comparing. Once you've checked the first byte through, move byte 1 to byte 0 and read the next byte into byte 1, rinse and repeat.
A couple notes:
1. You don't need to check any patterns that fall completely in byte 1's bits -- as you'll do that on the next loop.
2. Be careful of the termination condition. I suggest that when you hit EOF you just put a zero in byte 1, do your tests, then quit.
3. If you've found a match, the next possible match is two-bits away: 1010100110110101 match
10101001101 10101 next possible match
4. Remember that the most-significant bit of byte 0 should be adjacent to the least-significant bit of byte 1. (Hence the 'backwards' representation of the bytes above.)
move byte 1 to byte 0 and read the next byte into byte 1
Wait. What if he needs to use big endian? In that case he should do
1 2
number<<=8;
number|=nextbyte;
Be careful if the bytes are signed chars, though. If they are and you find a negative value, C++ will convert it to short repeating the 8th bit on bits 8-15 (so 10000001 becomes 1111111110000001, instead of 0000000010000001, which is what you need). If that happens, just and it with 0x00FF.