Hardcore C++: why "this" sometimes doesn't equal "this"
I usually try to write these blogposts in a way that is readable for most game developers and enthusiasts, but today for a change I'd like to dive deep into a detail of C++: why sometimes the this pointer can differ even though it is being used within the same object.
This is a problem that one can spend a lot of time debugging on before finding out what happens. I encountered it last week, and the only reason it didn't cost me several days of debugging to figure it out, is because I ran into the exact same problem during a project at University years ago.
Let me first sketch an example of a situation in which this might occur. In some cases a unique identifier for an object is needed, but we don't actually need to do anything with that object, so it doesn't matter what type it is. In such cases, an obvious and easy solution is to simply use the address of the object itself and store it as a void*. This way we can, for example, register whether a call to a function is from an object that already called that function before.
Now the question is: does that always work?
Since I am writing this blogpost, the answer of course is "no". Here's why:
Let's start by looking at this simple case of inheritance:
This code prints the this-pointer twice, but from two different classes. Since this is from within the same object, basic intuition tells me it will print the same address twice. Here's what it prints:
0018FD8C
0018FD8C
Indeed, the same address twice! But what happens if we add multiple inheritance? Let's complicate the example by adding one more class and inheriting from both at the same time:
This looks like we are doing pretty much the same thing as above, so the question is again: do all these calls print the same pointer? Let's have a look at what this prints:
0018FD8C
0018FD90
0018FD8C
As you can see, surprisingly, and totally against my own intuition, they do not print the same address! The call to printB() prints a different, slightly higher address!
Now why is that? Surely there is something broken in C++? Or at least, that is what I thought when I first encountered this situation. There is, however, a totally reasonable explanation for this behaviour.
In memory, when using inheritance, the hierarchy of objects is simply put after each other. So these three objects look something like this:
Now if we we have an object of type C and then on it a function from B, the this pointer needs to point at where B actually starts in memory. It cannot point to where our C started, because the functions of B cannot have any knowledge of that they are inside a bigger object. So this is where all the pointers go:
The difference between 0018FD8C and 0018FD90 is exactly four bytes, which makes sense, since A only contains one integer number, which uses four bytes of memory.
So there you have it: this does not always equal this, even when used from inside the same object!
This curiosity in C++ is another good reason to think again on whether code structures that use void* are a good idea. Bugs caused by something like this are easy to overlook and terribly difficult to find. In fact, I know several good C++ programmers who were not even aware of this phenomenon. When I first encountered this at university years ago, it took me a lot of debugging and asking friends for help before I learned this was happening.
And this is not the only problem with void*: using it almost always means unsafe casting is somewhere near. Void* has its uses once in a while, but it is often avoidable. So whenever you are using a void* in your code, stop for a short moment to think about whether there is another solution that avoids it!
This is a problem that one can spend a lot of time debugging on before finding out what happens. I encountered it last week, and the only reason it didn't cost me several days of debugging to figure it out, is because I ran into the exact same problem during a project at University years ago.
Let me first sketch an example of a situation in which this might occur. In some cases a unique identifier for an object is needed, but we don't actually need to do anything with that object, so it doesn't matter what type it is. In such cases, an obvious and easy solution is to simply use the address of the object itself and store it as a void*. This way we can, for example, register whether a call to a function is from an object that already called that function before.
Now the question is: does that always work?
Since I am writing this blogpost, the answer of course is "no". Here's why:
Let's start by looking at this simple case of inheritance:
void printPointer(void* pointer) |
This code prints the this-pointer twice, but from two different classes. Since this is from within the same object, basic intuition tells me it will print the same address twice. Here's what it prints:
0018FD8C
0018FD8C
Indeed, the same address twice! But what happens if we add multiple inheritance? Let's complicate the example by adding one more class and inheriting from both at the same time:
void printPointer(void* pointer) |
This looks like we are doing pretty much the same thing as above, so the question is again: do all these calls print the same pointer? Let's have a look at what this prints:
0018FD8C
0018FD90
0018FD8C
As you can see, surprisingly, and totally against my own intuition, they do not print the same address! The call to printB() prints a different, slightly higher address!
Now why is that? Surely there is something broken in C++? Or at least, that is what I thought when I first encountered this situation. There is, however, a totally reasonable explanation for this behaviour.
In memory, when using inheritance, the hierarchy of objects is simply put after each other. So these three objects look something like this:
Now if we we have an object of type C and then on it a function from B, the this pointer needs to point at where B actually starts in memory. It cannot point to where our C started, because the functions of B cannot have any knowledge of that they are inside a bigger object. So this is where all the pointers go:
The difference between 0018FD8C and 0018FD90 is exactly four bytes, which makes sense, since A only contains one integer number, which uses four bytes of memory.
So there you have it: this does not always equal this, even when used from inside the same object!
This curiosity in C++ is another good reason to think again on whether code structures that use void* are a good idea. Bugs caused by something like this are easy to overlook and terribly difficult to find. In fact, I know several good C++ programmers who were not even aware of this phenomenon. When I first encountered this at university years ago, it took me a lot of debugging and asking friends for help before I learned this was happening.
And this is not the only problem with void*: using it almost always means unsafe casting is somewhere near. Void* has its uses once in a while, but it is often avoidable. So whenever you are using a void* in your code, stop for a short moment to think about whether there is another solution that avoids it!
Comments
Post a Comment