Virtual methods in C++ are familiar and widely used, but their machinery is hidden from the everyday developer. Most of us write virtual methods with the documentation in mind and without knowing the internals, which is fine until you get curious and want some fun. So here it is.
The tools used:
- Ubuntu clang version 18.1.3 to compile the C++ sources, with the compile options --std=c++17 -g -O0 -fomit-frame-pointer -fno-pic -static
- GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) to analyze the assembler code and process memory
Some basics
A regular method call works simply: the compiler knows the object's type at compile time and can use the direct static address of the method for the call. Assume we have the following code:
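A minimal sketch of what such code could look like. The names Base and Sum come from the text; the method body and the helper CallRegularSum are assumptions made for illustration:

```cpp
#include <cassert>

// Hypothetical class: only the names Base and Sum are taken from the text.
class Base {
public:
    // Non-virtual method: the call target is fixed at compile time.
    int Sum(int a, int b) { return a + b; }
};

// Hypothetical helper mirroring the call discussed in the text:
// an object of a known type and a direct call with the arguments 1 and 2.
int CallRegularSum() {
    Base b;
    int result = b.Sum(1, 2);
    assert(result == 3);
    return result;
}
```

Calling CallRegularSum() from main returns 3, and the generated call to Sum is a plain call to a fixed address.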
After compilation, we can find the call to the regular method Sum in the assembler code of main:
The actual call instruction is call 0x401130. Its target address is static and is encoded in the instruction itself: this instruction always calls the same method (same address), and that can change only when the binary is recompiled. The arguments 0x01 and 0x02 are passed via the %esi and %edx registers. There is one additional argument passed to the call via %rdi: the address of the object, which the method uses as this to access the object's non-static data. This is hidden from us for convenience when we write C++ code, but under the hood this is passed as a regular argument. We will need this later. Another thing to mention is the size of the objects of class Base:
This is regular for an empty object.
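The hidden this argument can be made visible with a small sketch. The class Counter, the free function Counter_Add and the class Empty below are hypothetical names introduced only for this illustration; the free function shows roughly what the compiler generates for the member function:

```cpp
#include <cassert>

// Hypothetical class with one data member, so `this` has something to reach.
class Counter {
public:
    int value = 10;
    // Member function: `this` is passed implicitly.
    int Add(int a, int b) { return value + a + b; }
};

// Roughly the shape of the generated code: a plain function whose
// first parameter is the object address.
int Counter_Add(Counter* self, int a, int b) { return self->value + a + b; }

// An empty class still occupies storage, so distinct objects get
// distinct addresses (1 byte with the toolchain used in the text).
class Empty {};
static_assert(sizeof(Empty) >= 1, "an object is never zero-sized");

int DemoHiddenThis() {
    Counter c;
    int viaMember = c.Add(1, 2);
    int viaFree = Counter_Add(&c, 1, 2);
    assert(viaMember == viaFree);  // same computation, `this` made explicit
    return viaMember;
}
```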
Virtual method calls work a bit differently. Let's gradually dive into this magic.
Let's modify our Base, making Sum virtual so that derived classes can override the logic of this method:
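A sketch of such a modification, under the assumption that Derived overrides Sum with some different logic (the doubled sum below is made up so that the two implementations can be told apart; CallOnKnownType is a hypothetical helper):

```cpp
#include <cassert>

class Base {
public:
    // Now virtual: derived classes may replace the implementation.
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    // Assumed override; the real body in the article is not known.
    int Sum(int a, int b) override { return 2 * (a + b); }
};

// The static type of `d` is Derived, so the compiler knows the target
// at compile time and is free to emit a direct call.
int CallOnKnownType() {
    Derived d;
    int result = d.Sum(1, 2);
    assert(result == 6);
    return result;
}
```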
The disassembled main changes a bit, but not much:
The call itself still uses an absolute address, so nothing has changed here. This is a bit unexpected, but the explanation is simple: the compiler can still compute the call address at compile time, and if it can use it, why not? In fact, we create an instance Derived d and use this object (of the known type Derived) in the call d.Sum(1, 2). There is no reason for the compiler to do anything more than a direct call. So it does.
But one change can still be noted even in this trivial example: a new call, call 0x401140 <_ZN7DerivedC2Ev>, which is a call to the default Derived constructor. For some reason the compiler decided to insert a constructor call for a class with no data fields! It must be initializing something. Let's check the size of Derived:
There is something 8 bytes long in there (the object is not empty anymore, as empty objects have a size of 1 byte). This is what gets initialized, and that is why the constructor is called. Let's step into its assembler code:
We see the base constructor call, call 0x401190 <_ZN4BaseC2Ev>, which is expected, but also the address of the _ZTV7Derived symbol being stored somewhere with lea 0x2c5c(%rip),%rcx # 0x403dc0 <_ZTV7Derived>. Tracking down the path of this symbol we find that:
- its address is stored in the %rcx register via the lea instruction;
- the value of %rcx (the address of _ZTV7Derived) is increased by 0x10 (16 bytes, or two 8-byte pointers);
- the value of %rcx is stored at the memory location pointed to by %rax using mov %rcx,(%rax);
- in its turn, %rax holds the value loaded from memory at 0x8(%rsp), that is, from a variable of the current stack frame (%rsp is the stack pointer register) at offset 0x8
...and that is where the value from %rdi is stored at the beginning of the constructor by the instruction mov %rdi,0x8(%rsp). As shown above, %rdi is used to pass the actual object address, this!
So, the strange 8-byte hidden field of the object is initialized with the address of _ZTV7Derived plus a constant offset of 0x10. We can get some info about the symbol using its address:
It is getting clearer now: the address loaded into the hidden field inside the Derived instance is the address of a record in the VTable (at a small offset from its start) that the compiler created for our class. Every object of class Derived has a pointer to this record.
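The same observation can be made from inside a program, without a debugger. The sketch below is implementation-specific: it assumes the hidden pointer occupies the first pointer-sized bytes of the object, which holds for the Itanium C++ ABI used by clang and gcc on Linux but is not promised by the C++ standard. HiddenPointerOf and SameClassSharesTable are hypothetical helper names, and the Derived::Sum body is an assumption:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

class Base {
public:
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    int Sum(int a, int b) override { return 2 * (a + b); }  // assumed body
};

// With no data members the object consists of the hidden pointer only
// (an ABI property, not a language guarantee).
static_assert(sizeof(Derived) == sizeof(void*), "expected a single hidden pointer");

// Implementation-specific peek: copy the first pointer-sized bytes of the object.
template <typename T>
std::uintptr_t HiddenPointerOf(const T& object) {
    std::uintptr_t value = 0;
    std::memcpy(&value, reinterpret_cast<const unsigned char*>(&object), sizeof(value));
    return value;
}

// All objects of one class carry the same table address;
// objects of another class carry a different one.
bool SameClassSharesTable() {
    Derived d1, d2;
    Base b;
    bool sameForDerived = HiddenPointerOf(d1) == HiddenPointerOf(d2);
    bool differsFromBase = HiddenPointerOf(d1) != HiddenPointerOf(b);
    assert(sameForDerived);
    return sameForDerived && differsFromBase;
}
```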
The table itself has the following data inside:
The value actually stored in the hidden field of the Derived instance is 0x403dd0 (0x403dc0 plus the 16-byte offset), which is easy to check after the constructor exits:
So the hidden field is a VMT (virtual method table) pointer, and the record it points to holds 0x401170, the very same address that was used as the static address for the direct call to Derived::Sum (call 0x401170 <_ZN7Derived3SumEii>). This means that the first record in the VMT is the address of the first virtual method, Sum.
The call
The VMT info seems redundant: why would the compiler save the address of the method (and, moreover, increase the size of every instance) if it can call the method directly and actually does so? To find the answer, we need to modify the code a bit, changing the way we refer to the object and its methods:
The only change is that a reference of type Base& is now used to call the Sum method, trying to fool the compiler so that it doesn't know the object type. And according to the assembler code, the attempt succeeded:
There is no direct call to Sum anymore; it is replaced with the indirect call call *(%rax). That is, it calls whatever address is stored at the location %rax points to at the moment the instruction executes. And this is the key principle of virtual method dispatch: the code actually executed as a result of such a call is determined dynamically. The content of %rax at the moment of the call is defined by the following operations:
- %rdi (the this register) is initialized at the beginning of main via lea 0x8(%rsp),%rdi and retains its value
- %rax is loaded a few lines before the call by mov (%rdi),%rax with the dereferenced value of %rdi, which means the value stored at this: the VMT pointer, which points to the first record in the VMT for class Derived
So in fact Derived::Sum is called via the VMT record even though it's called via a reference to the base class Base; that is what the virtual machinery exists for. The effective behavior of the object is defined by its actual type and not by the way it is referred to. Any reference or pointer to any base sub-object works. Even in the case of so-called diamond inheritance the mechanism will find a way to get the proper method address and call it.
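In source form, the situation can be sketched like this. CallThroughBase and DemoDynamicDispatch are hypothetical helpers and the Derived::Sum body is an assumption; the point is that the callee sees only Base& and the target is picked at run time:

```cpp
#include <cassert>

class Base {
public:
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    int Sum(int a, int b) override { return 2 * (a + b); }  // assumed body
};

// Only the static type Base& is visible here, so in general the call
// has to go through the table of whatever object is passed in.
int CallThroughBase(Base& object) { return object.Sum(1, 2); }

int DemoDynamicDispatch() {
    Base b;
    Derived d;
    assert(CallThroughBase(b) == 3);  // Base::Sum
    return CallThroughBase(d);        // Derived::Sum, chosen by the object's type
}
```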
To summarize:
- each instance of a class with virtual methods has a hidden VMT pointer to the corresponding record in the VMT created by the compiler at compile time
- the VMT has a record for each virtual method of the class, holding the address of the actual method implementation. Overriding a method actually means storing its address in the class's VMT and in the VMTs of all descendants that do not override it again
- calls to virtual methods are indirect, via the VMT records of the class
- the compiler can still use direct calls for virtual methods when possible, as an optimization
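The points above can be modelled by hand in plain C++ without the virtual keyword. This is a simplified model of the mechanism, not what clang literally emits; every name in it (MethodTable, Object, VirtualSum and so on) is hypothetical:

```cpp
#include <cassert>

struct Object;  // forward declaration for the function pointer type

// One table per "class": a record with an address for each virtual method.
struct MethodTable {
    int (*sum)(const Object* self, int a, int b);
};

// Every instance carries a hidden pointer to the table of its class.
struct Object {
    const MethodTable* table;
};

int BaseSum(const Object*, int a, int b) { return a + b; }
int DerivedSum(const Object*, int a, int b) { return 2 * (a + b); }

const MethodTable kBaseTable{&BaseSum};
const MethodTable kDerivedTable{&DerivedSum};  // "override": another address in the slot

// "Constructors" store the table address into the object.
Object MakeBase() { return Object{&kBaseTable}; }
Object MakeDerived() { return Object{&kDerivedTable}; }

// Indirect call: load the table pointer, load the slot, call it with `self`.
int VirtualSum(const Object& object, int a, int b) {
    return object.table->sum(&object, a, b);
}

int DemoHandRolledTable() {
    Object base = MakeBase();
    Object derived = MakeDerived();
    assert(VirtualSum(base, 1, 2) == 3);
    return VirtualSum(derived, 1, 2);
}
```

The two loads in VirtualSum correspond to the mov (%rdi),%rax and call *(%rax) pair seen in the disassembly.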
Construction and destruction
Previously, in the assembler code for the Derived constructor, the Base constructor was called. What is important is that this call is made before the Derived class sets up its VMT pointer. This is because Base wants to set up its own VMT pointer to its own VMT, so that Base class instances (created directly using the Base constructor) use Base virtual methods:
The Base constructor uses the same mechanism Derived does (as shown before) to set up its VMT pointer. What about destruction? Same, but in the opposite direction. To illustrate this, the code needs to be modified a bit to force the compiler to call the destructors. Additionally, a call to the virtual Sum is added to the destructors to force the compiler to emit some virtual-related code there:
Compacted asm code for the Derived destructor:
For the Base class:
A common pattern here, similar to what we have seen in constructors: each destructor reloads the object's VMT pointer according to its own class. This leads us to an important conclusion: construction and destruction code is always executed in the context of the current class's VMT (and not that of the most derived class). A kind of lifeline of the object's VMT pointer can be drawn like this:
This may seem a bit strange at first glance, yet it makes sense: imagine Base::~Base() using the VMT pointer of Derived. Calls to virtual methods during destruction would call the overridden implementations (Derived::Sum(), for example). But the overridden methods of Derived were written on the assumption that all the Derived resources (variables, dynamically allocated memory or whatever was created in the Derived constructor) are available. This is not true while the Base destructor executes, as it runs after the Derived destructor (which could have destroyed something required for the normal execution of Derived methods)! So the only safe way for the Base destructor to execute virtual methods is to execute its own implementations (or those of any of its parents, since their destruction code has not been executed yet). And that is what is achieved via the VMT pointer manipulations. The same logic applies to the constructors, but in the opposite direction: the Base constructor just can't safely call Derived implementations, as the Derived data may not be ready by that time. More explanation here.
By the way, this doesn't completely disable the virtual mechanism during construction or destruction: in the code above the calls to Sum() are still indirect, using the value from the VMT. If we had a virtual method in Base that was not overridden in Derived, the Derived destructor would call the Base implementation of the method via the VMT. This guarantees that the most appropriate and at the same time safe overridden version is called. We can say that during these special periods of the object's lifetime we use limited virtual calls, going no deeper than the class of the sub-object currently being constructed or destroyed.
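A sketch of these limited virtual calls with a three-level hierarchy. The methods Who and Kind and the helpers Trace and TraceOfDestruction are hypothetical; Kind is deliberately not overridden in Derived:

```cpp
#include <cassert>
#include <string>

std::string& Trace() {
    static std::string trace;
    return trace;
}

class Base {
public:
    virtual const char* Who() const { return "Base"; }
    virtual const char* Kind() const { return "Base"; }
};

class Derived : public Base {
public:
    ~Derived() {
        // While this destructor runs the object behaves as a Derived,
        // even if it was created as a MoreDerived.
        Trace() += std::string(Who()) + "/" + Kind();
    }
    const char* Who() const override { return "Derived"; }
    // Kind() is not overridden: the Derived table keeps Base::Kind.
};

class MoreDerived : public Derived {
public:
    const char* Who() const override { return "MoreDerived"; }
    const char* Kind() const override { return "MoreDerived"; }
};

std::string TraceOfDestruction() {
    Trace().clear();
    {
        MoreDerived object;
        Base& ref = object;
        assert(std::string(ref.Who()) == "MoreDerived");  // full dispatch while alive
    }
    return Trace();
}
```

TraceOfDestruction() returns "Derived/Base": the call to Who stops at Derived, and the call to Kind falls back to the Base implementation.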
Not only about call address
OK, now we understand a bit more about how the appropriate code is selected for execution on a virtual call. But what about this for that call? Does the pointer to the object need to be adjusted somehow? Let's take a look at the following code example:
Here is what the memory layout of MoreDerived looks like:
After an instance of MoreDerived is created, the pointer this points to the beginning of the object's memory layout. This pointer is suitable for all methods declared in MoreDerived, because the compiler generated the MoreDerived methods with the offsets of A, B and C in mind, and these never change for objects of type MoreDerived. The same this value is suitable for all methods declared in Derived, because the Derived layout is also known at compile time and is part of the MoreDerived layout (it is a sub-object of the most derived object of class MoreDerived). If we decided to create an instance of Derived, we would get the following layout:
Obviously, the offsets of the Base::A and Derived::B fields are the same in both layouts mentioned above. This is a result of the strict layout rules for non-static data fields inside objects and sub-objects under single inheritance: for every given class (MoreDerived), the fields of all parent classes (Base, Derived) are located strictly before the class's own fields (&Base::A < &MoreDerived::C, &Derived::B < &MoreDerived::C, &Base::A < &Derived::B). So wherever a sub-object of class MoreDerived is met in the memory layout of any descendant of MoreDerived, its fields are guaranteed to be preceded by the parents' data fields, at the same offsets and in the same order. In this case the same this is suitable for calling any method of the class and of its base classes. Trivial, but it must be mentioned.
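These layout properties can be checked at run time. The sketch assumes classes shaped like the ones in the text (fields A, B, C; the virtual method Get is made up) and relies on the ABI behavior described above, not on a guarantee of the C++ standard. OffsetOf and LayoutIsPrefixCompatible are hypothetical helpers:

```cpp
#include <cassert>
#include <cstddef>

class Base {
public:
    int A = 1;
    virtual int Get() const { return A; }
};

class Derived : public Base {
public:
    int B = 2;
    int Get() const override { return A + B; }
};

class MoreDerived : public Derived {
public:
    int C = 3;
    int Get() const override { return A + B + C; }
};

// Distance in bytes from the start of `object` to one of its fields.
template <typename Object, typename Field>
std::ptrdiff_t OffsetOf(const Object& object, const Field& field) {
    return reinterpret_cast<const char*>(&field) -
           reinterpret_cast<const char*>(&object);
}

bool LayoutIsPrefixCompatible() {
    Derived d;
    MoreDerived md;
    // Upcasts do not move the pointer under single inheritance.
    const void* asMoreDerived = &md;
    const void* asDerived = static_cast<const Derived*>(&md);
    const void* asBase = static_cast<const Base*>(&md);
    assert(asMoreDerived == asDerived && asDerived == asBase);
    // Inherited fields sit at the same offsets in both layouts,
    // and the class's own field comes after them.
    return OffsetOf(d, d.A) == OffsetOf(md, md.A) &&
           OffsetOf(d, d.B) == OffsetOf(md, md.B) &&
           OffsetOf(md, md.B) < OffsetOf(md, md.C);
}
```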
The picture changes significantly when we have multiple inheritance. Assume we have the following code:
Applying the above to the layout of class Derived is not possible anymore; the this pointer is no longer common to all sub-objects. Let's review the memory layouts of the classes:
Here the field Base2::B2 has offset 16 in the Derived layout, while the same field has offset 8 in the Base2 layout, which means that casting Derived to Base2 has to change this in order to call Base2 methods properly. And it does change:
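A sketch of that adjustment in plain C++. The class and field names follow the text (Base1, Base2, Derived, B2, Sum2) and the field values match the ones visible in the memory dump discussed further on; Sum1, the method bodies and the helpers Base2Offset and DemoAdjustedCall are assumptions:

```cpp
#include <cassert>
#include <cstddef>

class Base1 {
public:
    int B1 = 0xb1;
    virtual int Sum1(int x) { return B1 + x; }
};

class Base2 {
public:
    int B2 = 0xb2;
    virtual int Sum2(int x) { return B2 + x; }
};

class Derived : public Base1, public Base2 {
public:
    int D = 0xd0;
    int Sum2(int x) override { return D + x; }  // assumed body
};

// Byte distance between the complete object and its Base2 sub-object.
std::ptrdiff_t Base2Offset(Derived& d) {
    Base2* asBase2 = &d;  // the implicit conversion adjusts the address
    return reinterpret_cast<char*>(asBase2) - reinterpret_cast<char*>(&d);
}

int DemoAdjustedCall() {
    Derived d;
    Base2& db = d;
    assert(Base2Offset(d) > 0);     // Base2 does not start where Derived starts
    return d.Sum2(1) + db.Sum2(2);  // both calls reach Derived::Sum2
}
```

DemoAdjustedCall() returns 419 (0xd0 + 1 plus 0xd0 + 2), so both calls read the field D of the same Derived object although they start from different addresses.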
According to what we know so far about virtual calls, the code above cannot work properly. The line return d.Sum2(1) + db.Sum2(2) calls the same implementation of Sum2 (overridden in Derived) both times, but with a different this.
Let's figure this out. The memory layout of d:
Here the data fields with the values 0xb1, 0xb2 and 0xd0 are in their expected places and the VMT pointer 0x0000000000403d68 is at the beginning of the layout, as expected, but there is also an unexpected value, 0x0000000000403d88, that precedes the Base2 field 0xb2:
This unexpected value is also a vtable pointer, pointing somewhere into the Derived vtable. And this is the beginning of the Base2 memory layout inside the Derived memory layout (see the address):
So the value stored at 0x7fffffffdcc8 will be used as the VMT pointer when calling virtual methods of Base2, redirecting the calls to the real method implementation in Derived. Very similar to what we have seen before, but... the Base2 sub-object has its own VMT pointer!
Moving along, examining the pointer:
This is a so-called thunk, a piece of code that adjusts the this pointer to the proper value for the virtual call:
Since the address of this piece of code is stored in the VMT, it is called instead of the actual method. After the adjustment is done, the overridden Derived method Sum2 is invoked using jmp 0x4011d0 <_ZN7Derived4Sum2Ei>.
The adjustment is done by add $0xfffffffffffffff0,%rdi, which means that %rdi (this) is shifted by -16 bytes:
And that is exactly the distance between the this of the actual Derived object and the Base2 sub-object (see the memory layout above)! If we take a closer look at the Derived VMT, we can notice that the same method Sum2 actually has 2 records there:
One of the records points directly to the Derived::Sum2 implementation at 0x00000000004011d0; the other points to the thunk at 0x0000000000401270, which adjusts this and then jumps to the same Derived::Sum2 implementation at the same address (jmp 0x4011d0 <_ZN7Derived4Sum2Ei>). The same code at the same address is reached both ways: once directly, and once after some preparation.
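What the thunk does can be written by hand in C++. The sketch below is a source-level analogue, not the generated code: Sum2Thunk, the lastThis field and BothPathsSeeSameThis are hypothetical names added to observe which this the implementation receives, and the method bodies are assumptions:

```cpp
#include <cassert>

class Base1 {
public:
    int B1 = 0xb1;
    virtual int Sum1(int x) { return B1 + x; }
};

class Base2 {
public:
    int B2 = 0xb2;
    virtual int Sum2(int x) { return B2 + x; }
};

class Derived : public Base1, public Base2 {
public:
    int D = 0xd0;
    const Derived* lastThis = nullptr;  // hypothetical field recording `this`
    int Sum2(int x) override {
        lastThis = this;
        return D + x;
    }
};

// Hand-written counterpart of the thunk: it receives the address of the
// Base2 sub-object, shifts it back to the enclosing Derived and forwards the call.
int Sum2Thunk(Base2* self, int x) {
    Derived* adjusted = static_cast<Derived*>(self);  // subtracts the sub-object offset
    return adjusted->Derived::Sum2(x);                // qualified name: a direct call
}

bool BothPathsSeeSameThis() {
    Derived d;
    Base2* asBase2 = &d;
    d.Sum2(1);                       // direct path
    const Derived* direct = d.lastThis;
    asBase2->Sum2(2);                // path through the compiler-generated adjustment
    const Derived* viaBase2 = d.lastThis;
    int viaManualThunk = Sum2Thunk(asBase2, 2);
    assert(viaManualThunk == 0xd0 + 2);
    return direct == &d && viaBase2 == &d && d.lastThis == &d;
}
```

Whichever way the call arrives, Derived::Sum2 observes the address of the complete Derived object.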
This way the C++ compiler prepares for a virtual call and guarantees that the implementation being called receives the proper this value.
Conclusion
Virtuals are fun... and more is to come!