I'm sure each function or symbol has an offset as to where it begins in that object file, so how is this offset translated into a physical (well, virtual) memory address?
The linker first needs to decide where to place it in the executable file, then it can figure out how it will map regions of the file to memory.
The simplest case (only usable for regular programs) is where the linker just sets up the binary so the OS will load the pages at fixed locations in virtual memory. For example, the process' entry point might always be at 0x4000000, which would be specified somewhere in the executable format. Obviously, since all the code is at fixed locations, once the code has been copied or mapped to memory there's nothing else to do, because all memory references, jumps, etc. were generated by the linker knowing where everything was going to be at run-time.
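To make the fixed-location case concrete, here's a minimal sketch of the arithmetic the linker can do ahead of time. The `SymbolPlacement` struct and its field names are made up for illustration, and the numbers are arbitrary; real executable formats store this information in their own headers and section tables.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record of where the linker placed one symbol.
// None of these names come from a real executable format.
struct SymbolPlacement {
    std::uint64_t image_base;        // fixed virtual address chosen at link time
    std::uint64_t section_offset;    // where the section sits relative to the base
    std::uint64_t offset_in_section; // where the symbol sits inside that section
};

// With a fixed base, the final virtual address is fully known at link time,
// so the linker can write it directly into every jump and memory reference.
constexpr std::uint64_t virtual_address(const SymbolPlacement &s) {
    return s.image_base + s.section_offset + s.offset_in_section;
}
```

For example, a symbol 0x20 bytes into a section placed 0x1000 bytes past a base of 0x4000000 would end up at 0x4001020, and that value never changes between runs.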
It's also possible to generate relocatable code, which is what's used for dynamic libraries. Dynamic libraries must be relocatable, because there's no guarantee that any particular page will be unoccupied by some other piece of code or data when the process attempts to load the library.
Relocatable code works quite simply. Following the idea that "all problems in computer science can be solved by adding another level of indirection", instead of generating direct references to memory addresses, the linker generates references to a table of memory addresses. When the executable or dynamic library is loaded by the OS, the OS's load routine must decide where to load it and then fill the relocation table with the correct addresses, according to the rules established by the binary format.
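The indirection idea can be sketched with a toy model. This is not how any particular binary format lays things out; `ToyImage`, `fill_address_table` and `resolve` are hypothetical names used only to show the loader filling a table that the code then goes through.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy "image": the code refers to table slots, never to absolute addresses.
struct ToyImage {
    std::vector<std::uint64_t> target_offsets; // offsets of the targets from the image start
    std::vector<std::uint64_t> address_table;  // left empty by the linker, filled by the loader
};

// Toy loader step: once a load address has been chosen, compute every
// absolute address and store it in the table.
void fill_address_table(ToyImage &image, std::uint64_t load_base) {
    image.address_table.clear();
    for (std::uint64_t offset : image.target_offsets)
        image.address_table.push_back(load_base + offset);
}

// What an indirect reference amounts to at run time: a table lookup.
std::uint64_t resolve(const ToyImage &image, std::size_t slot) {
    return image.address_table.at(slot);
}
```

Loading the same `ToyImage` at a different base only changes the contents of the table; the "code" that calls `resolve(image, slot)` is identical in both cases, which is the whole point.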
I've actually had to implement the relocation routine when I needed to load a DLL without passing through Windows' LdrLoadDll(), because the process was intercepting those calls. This is what that looks like:
__declspec(dllexport) std::uint32_t __stdcall load_dll(void *p){
    auto manual_inject = (MANUAL_INJECT *)p;
    auto pIBR = manual_inject->base_relocation;
    //Difference between where the image actually is and where it was linked to be.
    auto delta = (uintptr_t)manual_inject->image_base - (uintptr_t)manual_inject->nt_headers->OptionalHeader.ImageBase;

    //Relocate the image. Walk the blocks until one with a zero VirtualAddress.
    while (pIBR->VirtualAddress){
        if (pIBR->SizeOfBlock >= sizeof(IMAGE_BASE_RELOCATION)){
            //The block header is followed by a list of 16-bit entries.
            auto count = (pIBR->SizeOfBlock - sizeof(IMAGE_BASE_RELOCATION)) / sizeof(WORD);
            auto list = (WORD *)(pIBR + 1);
            for (decltype(count) i = 0; i < count; i++){
                //Zero entries are padding. The low 12 bits are the offset within the page;
                //the high 4 bits (the relocation type) are not checked here.
                if (list[i]){
                    auto ptr = (uintptr_t *)((char *)manual_inject->image_base + ((uintptr_t)pIBR->VirtualAddress + (list[i] & 0xFFF)));
                    *ptr += delta;
                }
            }
        }
        //Advance to the next block.
        pIBR = (IMAGE_BASE_RELOCATION *)((char *)pIBR + pIBR->SizeOfBlock);
    }
    //...
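For anyone who wants to experiment with the same walk outside of Windows, here's a rough, portable sketch that works on plain byte buffers. `BaseRelocationBlock` is a stand-in I'm assuming has the same layout as `IMAGE_BASE_RELOCATION` (two 32-bit fields), `REL_DIR64` mirrors the value of `IMAGE_REL_BASED_DIR64` (10), and `apply_relocations` is a hypothetical helper, not part of any API. Unlike the routine above, it checks the relocation type, only patches 64-bit slots, and stops at the end of the buffer instead of at a zero `VirtualAddress`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for IMAGE_BASE_RELOCATION so the sketch compiles anywhere.
struct BaseRelocationBlock {
    std::uint32_t VirtualAddress; // RVA of the page this block covers
    std::uint32_t SizeOfBlock;    // header plus all 16-bit entries, in bytes
};
constexpr unsigned REL_DIR64 = 10; // same value as IMAGE_REL_BASED_DIR64

// Adds delta to every 64-bit slot named by a DIR64 entry.
// Returns how many slots were patched; other entry types are skipped.
std::size_t apply_relocations(std::vector<std::uint8_t> &image,
                              const std::vector<std::uint8_t> &reloc_data,
                              std::uint64_t delta) {
    std::size_t patched = 0;
    std::size_t pos = 0;
    while (pos + sizeof(BaseRelocationBlock) <= reloc_data.size()) {
        BaseRelocationBlock block;
        std::memcpy(&block, reloc_data.data() + pos, sizeof block);
        // Stop on a block that is too small or runs past the buffer.
        if (block.SizeOfBlock < sizeof block ||
            block.SizeOfBlock > reloc_data.size() - pos)
            break;
        std::size_t count = (block.SizeOfBlock - sizeof block) / sizeof(std::uint16_t);
        const std::uint8_t *entries = reloc_data.data() + pos + sizeof block;
        for (std::size_t i = 0; i < count; i++) {
            std::uint16_t entry;
            std::memcpy(&entry, entries + i * sizeof entry, sizeof entry);
            unsigned type = entry >> 12;                  // high 4 bits
            std::size_t target = block.VirtualAddress + (entry & 0xFFFu); // low 12 bits
            if (type != REL_DIR64 || target + sizeof(std::uint64_t) > image.size())
                continue; // padding, unsupported type, or out of range
            std::uint64_t value;
            std::memcpy(&value, image.data() + target, sizeof value);
            value += delta;
            std::memcpy(image.data() + target, &value, sizeof value);
            patched++;
        }
        pos += block.SizeOfBlock;
    }
    return patched;
}
```

The `memcpy` calls are there to avoid unaligned or type-punned accesses on the raw buffers; a loader working on a properly mapped image can use direct pointer writes, as the routine above does.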