Searching for class objects in memory.

Is it possible to search for instances of a class in memory by comparing the bytes that contain the class's functions?

I tried this with the string class, since each instance always seemed to end with the three bytes "37620". That worked for finding strings in memory... until the number of strings in memory reached 10, at which point the number changed.

Is the changing number due to the fact that this is impossible? Or is it because I was using a very clumsy way of reading the data? I used:

void * voidpt;
voidpt = &string1;

for (int i = 0; i < sizeof(string); i++)
{
    cout << "!" << i << endl;
    cout << (unsigned long int)*(char*)((long int)voidpt + i) << endl << endl;
}


I cast the pointer to a long int so I could add i to it, then cast it back to a char pointer so it would only give me the data at one memory address. Then I cast the result once more so that the data would be displayed as a number rather than a character.

Are class functions even stored in memory? Or are they something handled by the compiler?

No, you can't really do what you are wanting to do.

First of all, the compiled code that makes up the string class (or any class) will reside in read-only code pages
somewhere in memory, but actual instances of the string class will reside in read-write pages somewhere else.

You can't really "find" string objects by searching for instances, because the internal instance data will change
from class to class and instance to instance. You can find the compiled code for non-template, non-inline,
non-virtual functions simply by taking their address. [Note: you cannot take the address of constructors.]

No. You can't find objects like that.

Yes, code and functions are stored in memory.
So if I understand what you said jsmith... The compiled code that makes up the functions in a class is stored in only one place, and the instances merely look at that memory location when calling a certain function?

I don't think I completely understand. If it is set up that way, shouldn't there be a part of each string instance that shows the CPU where the code is that needs to be executed for each function? Or is that all handled invisibly by the OS/compiler?

I also have a question about directly accessing memory in general. Is there a program that lets me enter a memory location and displays the hex data at that location? Something that lets me enter how many bytes I want to view?

I have been working with pointers and memory locations a lot recently, and something like this would be much more efficient than my current method of using a for loop and an overly cast void pointer.
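For looking at memory from inside your own program, a small helper can stand in for an external tool. This is only a minimal sketch: hex_dump is a made-up name, not a library function, and it assumes you only point it at memory your program actually owns.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format `count` bytes starting at `addr` as
// two-digit hex values separated by spaces.
// Any object may be read through unsigned char*, so one cast replaces
// the void* -> long int -> char* -> unsigned long int chain.
std::string hex_dump(const void* addr, std::size_t count)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(addr);
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < count; ++i) {
        if (i != 0)
            out << ' ';
        // Widen to unsigned so the byte prints as a number, not a character.
        out << std::setw(2) << static_cast<unsigned>(bytes[i]);
    }
    return out.str();
}
```

Something like cout << hex_dump(&string1, sizeof string1) << endl; would then print every byte of the instance on one line.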
Class functions sort of work like this:

If you have some function like this:
int something::foo() {
    return this->var;
}


It sort of looks like this in the code:
int _mangled_func_name_foo(something* this) {
    return this->var;
}


This function lives at one address in memory and is called basically like any other global function, except that when you write some_class->foo(); you are passing some_class as the this pointer.

*I think
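To make that concrete, here is a rough sketch in compilable C++. The names Something, foo_lowered and call_via_member_pointer are invented for illustration, and the "lowered" function is only an approximation of what a compiler emits, not its actual output.

```cpp
#include <cstddef>

struct Something {
    int var;
    int foo() const { return var; }   // 'this' is passed implicitly
};

// Hypothetical hand-written equivalent: an ordinary function that
// receives the object as an explicit argument instead of 'this'.
int foo_lowered(const Something* self) { return self->var; }

// Call foo() through a pointer-to-member. The code exists once;
// the object it works on is supplied at the call site.
int call_via_member_pointer(const Something& obj)
{
    int (Something::*pmf)() const = &Something::foo;
    return (obj.*pmf)();
}

// Non-virtual member functions add nothing to the size of an instance,
// so there is no per-object function data to search for.
constexpr std::size_t instance_size = sizeof(Something);
```

On typical compilers instance_size comes out equal to sizeof(int), which is why the bytes of an instance cannot tell you where its functions live.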
I'm asking because I have some code that returns the size of a dynamically allocated array of strings. Right now I have a check that compares the second byte of the string instance. I found what looks like a pattern that lets me scan through the memory, and it works. The problem is that with the way I am doing it there is still a 0.7% chance of my program mistaking some random bytes in memory for a string. If it does, the program crashes upon calling string functions (which makes sense, since that part of memory won't really be a string).

So I am trying to reduce that 0.7% chance to 0% or something very, very close to it. If I could compare the bytes that contain the functions (or where the bytes point to), I think I could reduce the chance of crashing to almost 0%.
I used a static address. Don't do it...
@Harbing: If you are dynamically making an array of strings, wouldn't you know the size because you passed it to new and/or malloc()?
Yes, that is the one thing that makes this somewhat useless... The idea is that with this code a pointer to a string array can be passed around different functions/classes/files without having to include its size.

That and I kinda wanted to prove it could be done. I managed to narrow down the error chance to 0.003%, which is close enough to 0% I guess.

I was hoping that by only needing to pass a pointer, rather than an entire string array as well as the array's size, it would, I don't know, reduce memory usage and increase access speed or something.

This was really just to test to see if it could be done without having to use any OS specific commands.

EDIT:

That and I suppose this could be used to access the values inside a string array (or even just a normal string) that was not created by your program.

EDIT 2:

Actually I do have one minor problem with my code. I have it returning a static int; the problem is that when I call the function again at another point in the code, the static int starts back at whatever it last returned. Is there a way to return the int and then set it to zero?
#include <iostream>

class MyClass
{
public:
	unsigned int _header_;

	MyClass(int i) : _header_(0xDEADBEEF), SecretNumber(i){}

	void DisplaySecretNumber() {
		std::cout << SecretNumber << std::endl;
	}

private:
	int SecretNumber;
};

void CreateInstance(int i)
{
	MyClass* a = new MyClass(i);
}

void PrintInstance()
{
	/*  This is a really stupid way of doing it.
		Instead, parse the PE file to find the
		base address. Next you should call 
		VirtualQuery() to determine how the memory
		is set. You can save the state and then call
		VirtualProtect to change the memory pages
		access.
	*/
	unsigned int Address = 0x00345000;
	MyClass* pMyClass;

	int count = 0;
	while ( count != 1 )
	{
		pMyClass = (MyClass*)Address++;

		if ( pMyClass->_header_ == 0xDEADBEEF )
			++count;
	}

	pMyClass->DisplaySecretNumber();
}
int main()
{
	CreateInstance(1337);
	PrintInstance();

	std::cin.ignore();
	return 0;
}


Edit: Oops, there wasn't even any need for me to include windows.h, since I didn't use any Windows-specific functions.
I see how you used a header as a sort of "signature" for your class; however, I don't understand this part:

MyClass(int i) : _header_(0xDEADBEEF), SecretNumber(i){}

Is this a constructor that makes MyClass inherit _header_ and SecretNumber? If so, how does that work? The _header_(0xDEADBEEF) also confuses me. Are you assigning 0xDEADBEEF to _header_?

Another question: why do you start at 0x00345000? Is that the start of the RAM used by the OS?

EDIT: Oh, I forgot to ask, what is the PE file? Is it some kind of memory index?

I also tried to compile my code with Visual Studio and Dev-C++, but in both cases the code did not work. It only seems to work when I compile using the latest release of Code::Blocks and the GNU C++ compiler.
_header_(0xDEADBEEF) means initialize _header_ with the value 0xDEADBEEF. It is a member initializer list and has nothing to do with inheritance.
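As a small illustration (Tagged and TaggedAssign are made-up types, not part of the code above), the two constructors below leave the object in the same state. The first uses a member initializer list, the second assigns inside the constructor body.

```cpp
struct Tagged {
    unsigned int header;
    int value;

    // Member initializer list: header and value are initialized directly
    // with these values before the constructor body runs.
    explicit Tagged(int v) : header(0xDEADBEEF), value(v) {}
};

struct TaggedAssign {
    unsigned int header;
    int value;

    // Same end result, written as plain assignments in the body.
    explicit TaggedAssign(int v)
    {
        header = 0xDEADBEEF;
        value = v;
    }
};
```

For simple types like these the difference is mostly style; for const members, references and members without a default constructor, the initializer list is the only form that compiles.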

Yeah, don't use 0x00345000 as the starting address xD I was being lazy and didn't add the code to find the base address manually. The reason it crashed for you when compiled with Visual Studio or Dev-C++ is that nothing existed at that memory location, so the read in pMyClass->_header_ == 0xDEADBEEF failed because there was nothing there to read from.

PE just stands for Portable Executable. It's the executable format that Windows uses. The header in each of your .exe files contains information on where the file is going to be loaded into memory. The correct way would be to generate unsigned int Address based on that base address.

Then you would want to use the header of the PE file again to calculate where the file ends in memory. You would then compute EndAddress - BaseAddress to get the exact number of bytes that everything occupies in memory. You could then use that in a for loop instead of my while loop.
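A bounded version of that loop could look roughly like this. It is a sketch only: find_signature is a made-up helper, and base and end are assumed to delimit memory that is already known to be readable (for example a buffer you own). Getting real bounds from the PE header is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: scan the byte range [base, end) for a 32-bit
// signature and return a pointer to the first match, or nullptr.
// The caller must guarantee that the whole range is readable.
const unsigned char* find_signature(const unsigned char* base,
                                    const unsigned char* end,
                                    std::uint32_t signature)
{
    if (end < base || static_cast<std::size_t>(end - base) < sizeof signature)
        return nullptr;                      // range too small to hold a match
    const unsigned char* last = end - sizeof signature;
    for (const unsigned char* p = base; p <= last; ++p) {
        std::uint32_t value;
        std::memcpy(&value, p, sizeof value);  // safe for unaligned positions
        if (value == signature)
            return p;
    }
    return nullptr;
}
```

Unlike the while loop, this stops at end instead of running off into unmapped memory. A match still only means that four bytes happen to hold that value, not that a MyClass instance really lives there.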
I was referring to my code, which I probably should have posted in the first post but... Better late than never I suppose.

#include <iostream>
#include <string>


using namespace std;

unsigned long int findsize(string *);

int main()
{
    int stringsize;
    string * testpt;

    cin >> stringsize;

    testpt = new string [stringsize];
    cout << endl << findsize(testpt) << endl << endl << endl;
    delete [] testpt;


    return 0;
}

unsigned long int findsize(string *stringpt)
{
    static char * characterpt;
    static int size = 0;
    if ( *((char *)(stringpt) + 3) == 0 && (*((char *)(stringpt) + 2) == 62 || *((char *)(stringpt) + 2) == 68) ) //Is it a full string or empty string? Both are valid.
    {
        characterpt = (char*)stringpt->c_str(); //Get first character of the current strings cstring

        for (unsigned short int i = 0; i <= stringpt->size(); i++) //A new for loop must be created for each string, hence why recursion is needed.
        {

            if (*characterpt == '\0') // end of string
            {
                stringpt++; // Move onto next possible string
                size++;
                findsize(stringpt); //Call itself with the new string location being pointed to
                break;
            }

            else
            {
                characterpt++; // Move onto the next character in the string.
            }
        }
    }
    return size;


}


THAT code won't work correctly when compiled with Visual Studio or Dev-C++. I have to compile it with Code::Blocks to get the result I am looking for. I still have my problem with the

static int size = 0;

It works just fine the first time the function is called, however upon calling the code a second time the value of size is not reset to zero.

Anyone know anything about why the code would compile fine on one compiler, but not on the others?
Jesus motherfucking Christ. Isn't it easier to just pass goddamn array size instead of all this faffing around? Will you do all this work for every type you need to pass pointers of? I'm curious because you're going to be passing a lot of pointers over the years.
Considering that I do this as a hobby, for fun, and so that I can become better at programming languages and concepts? Yes. Not only did writing this program and getting it working teach me things about pointer usage and function pointers that I did not know beforehand, but it was also quite fun to get running :D
Just remove the static from the two variables. Neither needs to be static.
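One caveat: in a recursive function that currently throws away the result of the recursive call, removing static also means that result has to be used, otherwise the count stops at the first level. A minimal sketch of the difference, with made-up function names and a zero-terminated int array standing in for the string scan:

```cpp
#include <cstddef>

// Static local: initialized once, keeps its value between calls.
int count_with_static()
{
    static int size = 0;
    return ++size;          // 1 on the first call, 2 on the second, ...
}

// No static: each call starts fresh, and the recursion returns its
// partial result instead of accumulating it in shared state.
std::size_t count_until_zero(const int* p)
{
    if (*p == 0)
        return 0;
    return 1 + count_until_zero(p + 1);
}
```

Calling count_until_zero twice on the same array gives the same answer both times, which is the behaviour the static version cannot provide.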

But I have to agree -- what you are doing is not good. It is not foolproof and I can assure you it never will be. Your code breaks if the string class ever changes and might even break from platform to platform or compiler to compiler (or compiler version to compiler version).

The bottom line is that an algorithm with a non-zero failure rate is not useful, no matter how small the failure rate is. When I call the function, I want the right answer all the time.

In C++, the programmer has a couple of options:

#include <cstddef>

template< typename T, std::size_t N >
std::size_t ArraySize( T (&)[ N ] )
{ return N; }


Works for any fixed-length array whose type is still visible at the call site (it will not compile for a pointer, which is what new[] gives you) and is as fast as it gets, since once inlined it will ultimately generate no code.

For arrays allocated on the heap, the best approach is to use an STL container such as vector. A typical vector implementation computes the size of the "array" with a simple subtraction, which will still be faster than calling your code above.
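As a rough sketch of the vector approach (count_strings and total_length are invented names for illustration), the container is passed by const reference, so nothing is copied and the size travels with it:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The vector knows its own length, so one argument is enough.
std::size_t count_strings(const std::vector<std::string>& strings)
{
    return strings.size();
}

// Hypothetical example of a function that needs both the elements
// and the element count, without a separate size parameter.
std::size_t total_length(const std::vector<std::string>& strings)
{
    std::size_t total = 0;
    for (const std::string& s : strings)
        total += s.size();
    return total;
}
```

A caller would write std::vector<std::string> v(stringsize); in place of new string[stringsize], and there is no matching delete [] to remember.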