Reading 17 bits...

I've just discovered that one type of record in my old DOS file stores two variables in 32 bits. I currently read the four bytes of data into a four-byte buffer, but am not sure what to do from there. The first 17 bits are an unsigned int and the next 15 bits are a signed short int. How can I go about reading this data? Mind you, it's currently stored in a "signed char[4]" buffer.

jmc (137)

It is the same as in a previous post.
The easiest way (you don't have to do much yourself) is to use a struct with bit-fields.
If you want to do it yourself for unsigned:

- create an unsigned variable type with at least ceil(bits / 8.0) bytes.
  If you know there are 15 or 17 bits, use an unsigned int.
- point the variable at the byte where your bits start
  (if the field does not start on a byte boundary, i.e. start_bit % 8 != 0, use bit shifting)
- &= the variable with a mask that has every bit you want to read set (for example, for the first 17 bits: &= 0x1FFFF)

If you want to do it yourself for signed:

- create a signed variable type with at least ceil(bits / 8.0) bytes.
  If you know there are 15 or 17 bits, use a signed int.
- point the variable at the byte where your bits start
  (if the field does not start on a byte boundary, i.e. start_bit % 8 != 0, use bit shifting)
- &= the variable with 0x1FFFF // 17 bits
  // test bit 16 (the sign bit) if the signed number is 17 bits long
- if(variable & 0x10000){goto a;}else{goto b;}
  // xor with the full 17-bit mask to flip every bit, which gives the absolute value minus 1
  // *-1 to make it negative
  // -1 since the flipped value is one less than the absolute value
- a: negative: variable = (variable ^ 0x1FFFF) * (-1) - 1;
- b: positive: variable already contains the result

PS: For big endian you have to do it slightly differently.

[edit]corrected to big endian... my mistake. The above is for little endians...[/edit]
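
The manual steps above can be sketched as follows. This is only an illustration, assuming two's-complement values and a field narrower than 32 bits that has already been assembled into one 32-bit word. The helper names extract_unsigned and sign_extend are hypothetical, and the XOR-and-subtract trick replaces the branch in the list above; it is not the poster's own code.

```cpp
#include <cstdint>

// Hypothetical helper: return `bits` bits of `word`, starting at bit `shift`,
// as an unsigned value. Assumes 1 <= bits <= 31.
constexpr std::uint32_t extract_unsigned(std::uint32_t word, unsigned shift, unsigned bits)
{
    return (word >> shift) & ((std::uint32_t{1} << bits) - 1u);
}

// Hypothetical helper: interpret the low `bits` bits of `raw` as a
// two's-complement number. Assumes 1 <= bits <= 31.
constexpr std::int32_t sign_extend(std::uint32_t raw, unsigned bits)
{
    const std::uint32_t sign  = std::uint32_t{1} << (bits - 1);  // the field's sign bit
    const std::uint32_t mask  = (sign << 1) - 1u;                // all bits of the field
    const std::uint32_t value = raw & mask;
    // Flipping the sign bit and then subtracting its weight maps
    // 0 .. 2^bits - 1 onto -2^(bits-1) .. 2^(bits-1) - 1 without a branch.
    return static_cast<std::int32_t>(value ^ sign) - static_cast<std::int32_t>(sign);
}

// Compile-time checks of the 17-bit case discussed above.
static_assert(sign_extend(0x1FFFFu, 17) == -1, "all ones is -1");
static_assert(sign_extend(0x10000u, 17) == -65536, "only the sign bit set");
static_assert(sign_extend(0x0FFFFu, 17) == 65535, "largest positive value");
```

For an unsigned field only extract_unsigned is needed; for a signed field, pass its result through sign_extend with the same bit count.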

Duthomhas (13290)

Bit-fields are bad. Don't use them... (The standard does not specify how bit-fields are ordered or packed in storage; that is implementation-defined.)

Also, DOS runs on x86 hardware, which is little endian.

Just bit-shift the values into place.

How exactly are the bits stored?
(Is the 17th bit in the second or third byte? Is it in the MSB or LSB of the byte?)

Sephiroth (48)

As far as my documentation goes, it is fairly simple. The first 17 bits are one variable and the next 15 bits are the second variable. I have not messed with individual bits before so I don't know how to do this at all. I guess they needed to save memory back in DOS so this was what they did. I have never been that tight on memory so this is all new to me.

Oh, and this program will not run in XP, which is why I am writing a new one for XP.

Duthomhas (13290)

Well, I'll take the documentation at face value.

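// Assumes little-endian layout: the 17-bit value occupies bits 0..16 of the
// 32-bit word and the 15-bit value occupies bits 17..31.
// Note: b15 comes out as the raw 15-bit pattern (0..32767); it is not sign-extended.
void decode_17_15( unsigned char b[ 4 ], int& b17, int& b15 )
  {
  b17 = b15 = 0;
  b17 |= b[ 0 ];               // bits 0..7 of the 17-bit value
  b17 |= b[ 1 ] << 8;          // bits 8..15
  b17 |= (b[ 2 ] & 1) << 16;   // bit 16: the lowest bit of the third byte
  b15 |= b[ 2 ] >> 1;          // low 7 bits of the 15-bit value
  b15 |= b[ 3 ] << 7;          // high 8 bits of the 15-bit value
  }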

I don't know why you have them as an array of signed char... but the unsignedness of the first argument there is important.
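
Since the original question describes the 15-bit value as signed, here is a hedged variant that also sign-extends it. It is a sketch under the same layout assumption as the function above (17-bit value in the low bits, bytes stored least significant first) plus the assumption that the 15-bit field is two's complement. The names Fields17_15, decode_17_15_signed, and sample_record are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical result type for the two packed fields.
struct Fields17_15 {
    std::uint32_t low17;   // unsigned, 0 .. 131071
    std::int32_t  high15;  // signed, -16384 .. 16383
};

// Decode four little-endian bytes into the 17-bit and 15-bit fields.
constexpr Fields17_15 decode_17_15_signed(const unsigned char* b)
{
    // Assemble the bytes, least significant first, into one 32-bit word.
    const std::uint32_t word =
          static_cast<std::uint32_t>(b[0])
        | static_cast<std::uint32_t>(b[1]) << 8
        | static_cast<std::uint32_t>(b[2]) << 16
        | static_cast<std::uint32_t>(b[3]) << 24;

    const std::uint32_t low17 = word & 0x1FFFFu;  // keep bits 0..16
    const std::uint32_t raw15 = word >> 17;       // bits 17..31, now 0 .. 0x7FFF

    // Bit 14 is the sign bit of the 15-bit field: flip it, then subtract
    // its weight to get the two's-complement value.
    const std::int32_t high15 = static_cast<std::int32_t>(raw15 ^ 0x4000u) - 0x4000;

    return Fields17_15{low17, high15};
}

// Worked example: the word is 0xFFFF0005, so the low 17 bits are 0x10005
// and the high 15 bits are all ones, which is -1 as a signed value.
constexpr unsigned char sample_record[4] = {0x05, 0x00, 0xFF, 0xFF};
static_assert(decode_17_15_signed(sample_record).low17 == 0x10005u, "low field");
static_assert(decode_17_15_signed(sample_record).high15 == -1, "high field");
```

Whether the real file uses this exact layout still has to be checked against known records.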

Hope this helps.

Sephiroth (48)

I think I follow that code and will apply it to my existing variables, but could you explain what is going on after you initialize b17 and b15 to zero? If you don't have much time, could you at least point me towards information on this subject? Thanks for your help though, I appreciate it.

Duthomhas (13290)

http://stackoverflow.com/questions/141525/absolute-beginners-guide-to-bit-shifting

Concerning C and C++ and arithmetic and logical shifts:
http://www.cplusplus.com/forum/beginner/12683/

Hope this helps.

Sephiroth (48)

Thanks for the links. Would you mind double-checking this since it is my first go at it? The "uiXPosition" variable is a "signed int", and the "sLocationType" variable is a "signed short int". I believe I followed your example well enough to get it right, but I'd appreciate the reassurance. Oh, and the buffer is of type "unsigned char[4]" as per your advice.

    //Jump to the combined data and read it into a buffer
    iJump += 5;
    pBuffer = *((unsigned char*)(pData + iJump));

    //Get our two variables out of the buffer
    this->pRecords[iLoop].uiXPosition |= pBuffer[0];
    this->pRecords[iLoop].uiXPosition |= pBuffer[1] << 8;
    this->pRecords[iLoop].uiXPosition |= (pBuffer[2] & 1) << 16;
    this->pRecords[iLoop].sLocationType |= pBuffer[2] >> 1;
    this->pRecords[iLoop].sLocationType |= pBuffer[3] << 7;

Duthomhas (13290)

Looks good. Just watch how you do line 3 (the pBuffer assignment) -- don't dereference the pointer too early:

    pBuffer = (unsigned char*)(pData + iJump);

(I assume that pData and pBuffer are both pointers...)

Glad to be of help.

Sephiroth (48)

Yes, both are pointers. I prefix the names of basic variables with their data type so that later changes to the code are easier on me. I start all pointers with "p", all chars with "c", shorts with "s", etc. The only ones I don't name that way are custom classes and such, unless they're a pointer.

*EDIT*

Actually upon compiling the code, I do have a problem with that four-byte character array.

1>MapTable.cpp
1>.\Source\MapTable.cpp(58) : error C2440: '=' : cannot convert from 'signed char *' to 'signed char [4]'
1>        There are no conversions to array types, although there are conversions to references or pointers to arrays

I also wanted to ask if I should be using the pointer to the data as an unsigned char instead of a signed char. You see, I load a big section of a file into memory, and this data contains all kinds of 16-bit variables. I get 32-byte arrays of names, long and unsigned long, short and unsigned short, and more out of the data. I have always used a regular signed char for pointers to data like this before, but if it makes a difference, can you explain it?

Duthomhas (13290)

The pBuffer should be declared as

const unsigned char* pBuffer;

The signedness has to do with integer promotion. If you treat all bytes as signed, any byte with its high bit set is promoted to a negative int when you work with it. This isn't what you want for raw data, and it causes problems when ORing values together, because the sign-extension bits overwrite the higher bytes. That's all.
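
A small sketch of the problem, assuming a two's-complement machine with 32-bit int. The function names or_in_signed and or_in_unsigned are hypothetical; each ORs a low byte into a value whose higher byte is already in place.

```cpp
// Buggy version: `low` is promoted to int with sign extension, so a byte
// whose top bit is set becomes a negative int full of 1-bits.
constexpr int or_in_signed(int high_part, signed char low)
{
    return high_part | low;
}

// Correct version: an unsigned char is promoted to an int in 0 .. 255,
// so only the low eight bits are affected.
constexpr int or_in_unsigned(int high_part, unsigned char low)
{
    return high_part | low;
}

// The byte pattern 0x80 is -128 as a signed char and 128 as an unsigned char.
// The signed version loses the 0x0100 already stored in the high part.
static_assert(or_in_signed(0x0100, static_cast<signed char>(-128)) == -128,
              "sign-extension bits swamp the high byte");
static_assert(or_in_unsigned(0x0100, 0x80) == 0x0180,
              "unsigned byte lands in the low eight bits only");
```

The same effect applies to every shift-and-OR line in the decoding code, which is why the buffer is read through unsigned char.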

Good luck!

Sephiroth (48)

OK, I think I partially follow you. However, the data in that buffer contains both signed and unsigned values. Will declaring the buffer as "unsigned char" present problems when dealing with signed data types?
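
The thread ends without a reply to this question. As a hedged illustration only: reading through unsigned char need not get in the way of signed values, because the raw bytes can be assembled first and the sign applied afterwards. The sketch below assumes little-endian, two's-complement 16-bit fields; read_u16_le, read_i16_le, and sample_bytes are hypothetical names.

```cpp
#include <cstdint>

// Read an unsigned 16-bit little-endian value from two raw bytes.
constexpr std::uint16_t read_u16_le(const unsigned char* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Read a signed 16-bit little-endian value from the same kind of buffer.
constexpr std::int16_t read_i16_le(const unsigned char* p)
{
    const int raw = p[0] | (p[1] << 8);  // 0 .. 65535, never negative
    // Flip the sign bit (bit 15), then subtract its weight to get the
    // two's-complement value in -32768 .. 32767.
    return static_cast<std::int16_t>((raw ^ 0x8000) - 0x8000);
}

// The bytes FE FF are 65534 when read as unsigned and -2 when read as signed.
constexpr unsigned char sample_bytes[2] = {0xFE, 0xFF};
static_assert(read_u16_le(sample_bytes) == 65534, "unsigned view");
static_assert(read_i16_le(sample_bytes) == -2, "signed view");
```

The buffer type only controls how individual bytes are promoted; the signedness of each field is decided when the assembled value is converted.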
