I am developing a memory manager that will replace ALL new and delete requests globally (at the program level). It segments most memory based on the size of the request but there is a significant space for dynamic memory allocation such as strings and other arrays.
When an array segment is returned to the available memory pool, I want to check to see if its neighboring segments are used or still allocated so that on the off chance, consecutive blocks are available for use, they can be recombined. Can I do this using a unique sequence of bytes which embeds the size of the segment/block at the beginning/ending of each segment? If so, what would be the best approach for coming up with a series of values?
A classic technique for doing this is to use boundary tags.
Each block of memory is preceded and followed by a boundary tag.
A boundary tag is simply the number of bytes indicating the size of the memory block.
A negative boundary tag indicates the block is in use.
A positive boundary tag indicates the block is free.
Coalescing memory blocks is a simple matter of looking at the boundary tags for adjacent blocks to determine if the blocks are free. If they are, then the blocks are coalesced.
Do NOT assume that your run time implementation uses this technique. There are many techniques for managing memory pools.
You will need to allocate a large area of memory, then set up your boundary tags on the large memory block. You can walk from the beginning to the end of the pool by using the boundary tag to find the next boundary tag. This is a good way to ensure your pool is intact. If the beginning and ending boundary tag for a block do not agree, your pool has probably been trashed.
It's tedious and time consuming to walk the entire chain of boundary tags to find a free block, especially after many allocations have been done. It's common to maintain a series of buckets (linked lists) pointing to blocks of various sizes. Since the data area of a free block is not used, you can put the fwd/back pointers at the start of the data area.
Right. That is what I am talking about... a boundary tag at the end. To me however, it seems like you are suggesting a trailer along with a header. Currently, when I get an array request such as:
new [](size_t size)
I actually make use of size + 8 bytes for a header describing the content's size. If I fill the rest with data, the end boundary tag could coincidentally have the same value as data. So are you suggesting that I make room for a trailer? If not, then I need to come up with a boundary tag that is guaranteed to be unique; which brings me back to my original question.
Yes, boundary tags go before and after each block of memory. You don't have to worry about boundary tags being unique. Using the absolute value of a boundary tag, you can easily find the next boundary tag. Also when the user returns a block of data, you can find that block's preceding boundary tag by subtracting the size of a boundary tag from the address of the block. The ending boundary tag can be found by adding the absolute value of the boundary tag to the address of the preceding boundary tag. You can find the boundary tag of the next block by adding the size of a boundary tag to the address of the ending boundary tag. Likewise, you can find the ending boundary tag of the previous block given the starting boundary tag of the current block and backing up by the size of a boundary tag. Given the ending boundary tag of a block, you can find it's staring boundary tag by subtracting the absolute size of the ending boundary tag.
You will also need a unique value to use as a pool tag (e.g. -3). Pool tags delimit the memory pool. When you initially allocate your memory pool it should look like this: |PT|+BT|free memory|+BT|PT|
PT = Pool tag
BT = Boundary tag
When you allocate your first block, you split the free memory into two blocks. One you return to the user, the other remains free. |PT|-BT1|allocated block|-BT1|+BT2|free memory|+BT2|PT|
When the user releases that block, you need to back up to first BT1, Use first BT1 to find second BT1. From that you find BT2. Because BT2 is positive, you need to combine the two blocks.