Virtual methods in C++

By Sergey Malahov

Virtual methods in C++ are familiar and widely used, but their machinery is hidden from the everyday developer. Most of us write virtual methods with only the documentation in mind, without knowing the internals, which is fine until you get curious and want some fun. So here it is.

The aim of this article is to explore the internals of the C++ dynamic dispatch mechanism. The purpose, use cases and syntax details of virtual methods are out of scope. The code examples are artificial and use simplified syntax.

The tools used:

Some basics

A regular method call is simple: the compiler knows the object's type at compile time and can call the method directly at its static address. Assume we have the following code:
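A minimal sketch of such code, assuming an empty class Base with a non-virtual Sum(int, int). The helper name CallRegular and the method body are assumptions made for illustration:

```cpp
#include <cassert>

// Assumed shape of the class: no data members, one non-virtual method.
class Base {
public:
    int Sum(int a, int b) { return a + b; }
};

// The static type of `b` is known here, so the compiler can bind the call
// to Base::Sum at compile time and emit a direct call (or inline it).
int CallRegular() {
    Base b;
    int result = b.Sum(1, 2);  // &b travels along as the hidden `this` argument
    assert(result == 3);
    return result;
}
```

Calling CallRegular() from main and disassembling an unoptimized build should show a call with a fixed target, although the concrete addresses will differ from build to build.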

After compilation, we can find the call to the regular method Sum in the assembly code of main:

The actual call instruction is call 0x401130. Its target is static and encoded in the instruction itself, so this instruction always calls the same method (the same address), and the target can change only when the binary is rebuilt. The arguments 0x01 and 0x02 are passed in the %esi and %edx registers. One additional argument is passed in %rdi: the address of the object, which the method uses as this to access the object's non-static data. C++ hides this from us for convenience, but under the hood this is passed as a regular argument. We will need this fact later. Another thing worth noting is the size of objects of class Base:

The size is 1 byte, which is usual for an object with no data members.
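The size of an empty object can be checked with a short sketch. The class name Empty and the helper CheckEmptyObjects are hypothetical. The language only requires a complete object to occupy at least one byte so that distinct objects get distinct addresses; the exact value of 1 is what mainstream compilers produce, not a guarantee:

```cpp
#include <cassert>
#include <cstddef>

// A class with only a non-virtual method carries no per-object data.
class Empty {
public:
    int Sum(int a, int b) { return a + b; }
};

static_assert(sizeof(Empty) >= 1, "a complete object occupies at least one byte");

// Returns the size of the empty class after checking that two separate
// instances do not share an address.
std::size_t CheckEmptyObjects() {
    Empty first, second;
    assert(&first != &second);
    return sizeof(Empty);
}
```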

Virtual method calls work a bit differently. Let's gradually dive into this magic.

Let's modify Base by making Sum virtual, so that derived classes can override the logic of this method:
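A sketch of what the modified code might look like. The body of Derived::Sum (adding 100 so the two implementations can be told apart) and the helper CallOnKnownType are assumptions:

```cpp
#include <cassert>

class Base {
public:
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    // Assumed override; the +100 only makes the result recognizable.
    int Sum(int a, int b) override { return a + b + 100; }
};

// The object is used by value, so its exact type (Derived) is visible
// to the compiler at the call site.
int CallOnKnownType() {
    Derived d;
    int result = d.Sum(1, 2);
    assert(result == 103);
    return result;
}
```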

The disassembled main changes, but not by much:

The call itself still uses an absolute address, so nothing has changed here. This is a bit unexpected, but the explanation is simple: the compiler can still compute the call address at compile time, and if it can use it, why not? We create an instance d of Derived and use this object, whose type is known to be Derived, in the call d.Sum(1, 2). There is no reason for the compiler to do anything more than a direct call, so that is what it does.

One change is still visible even in this trivial example: a new call, call 0x401140 <_ZN7DerivedC2Ev>, which invokes the default constructor of Derived. For some reason the compiler decided to insert a constructor call for a class with no data fields! It must be initializing something. Let's check the size of Derived:
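The size difference can be observed with a sketch like the following. Plain, Poly and HiddenBytes are hypothetical names, and the expectation that a class with one virtual method is exactly pointer-sized is an assumption that holds for common ABIs (Itanium C++ ABI, MSVC), not something the C++ standard promises:

```cpp
#include <cassert>
#include <cstddef>

class Plain {
public:
    int Sum(int a, int b) { return a + b; }
};

class Poly {
public:
    virtual int Sum(int a, int b) { return a + b; }
};

// Returns the size of the polymorphic class; the extra bytes compared with
// Plain are the hidden field that the constructor has to initialize.
std::size_t HiddenBytes() {
    assert(sizeof(Poly) > sizeof(Plain));
    return sizeof(Poly);
}
```

On a typical 64-bit target HiddenBytes() is expected to return 8, matching the size discussed in the article.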

There is something 8 bytes long in there (the object is no longer empty, since empty objects have a size of 1 byte). Evidently this is what gets initialized, and that is why the constructor is called. Let's step into its assembly code:

We see the base constructor call, call 0x401190 <_ZN4BaseC2Ev>, which is expected, but also the address of the _ZTV7Derived symbol being stored somewhere with lea 0x2c5c(%rip),%rcx # 0x403dc0 <_ZTV7Derived>. Following this value through the code, we find that:

...and that is where the value from %rdi is stored at the beginning of the constructor call using instruction mov %rdi,0x8(%rsp). As shown above, %rdi is used to pass the actual object address this!

So the strange 8-byte hidden field of the object is initialized with the address of _ZTV7Derived plus a constant offset of 0x10. We can get some information about the symbol from its address:

It is getting clearer now: the address loaded into the hidden field of the Derived instance is the address of a record in the VTable (at a small offset from its start) that the compiler created for our class. Every object of class Derived holds a pointer to this record.

The table itself has the following data inside:

The value actually stored in the hidden field of the Derived instance is 0x403dd0 (the table address 0x403dc0 plus the offset 16), which is easy to check after the constructor exits:

So the hidden field is a VMT (virtual method table) pointer. The record it points to holds 0x401170, the very same address that was used as the static target of the direct call to Derived::Sum (call 0x401170 <_ZN7Derived3SumEii>). This means that the first record in the VMT is the address of the first virtual method, Sum.
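The hidden field can also be observed from C++ itself, without a debugger. The sketch below is deliberately non-portable: it assumes the VMT pointer is the first pointer-sized field of the object, which is true for the Itanium C++ ABI used by GCC and Clang but is not guaranteed by the language. PeekVptr and SameClassSharesTable are hypothetical helpers:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

class Base {
public:
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    int Sum(int a, int b) override { return a + b + 100; }
};

// ASSUMPTION: the first pointer-sized bytes of a polymorphic object hold
// its VMT pointer. This copies those raw bytes out for inspection only.
std::uintptr_t PeekVptr(const Base& obj) {
    std::uintptr_t vptr = 0;
    std::memcpy(&vptr, &obj, sizeof vptr);
    return vptr;
}

// Objects of one class share a table; objects of different classes do not.
bool SameClassSharesTable() {
    Derived d1, d2;
    Base b;
    assert(PeekVptr(d1) != 0);
    return PeekVptr(d1) == PeekVptr(d2) && PeekVptr(d1) != PeekVptr(b);
}
```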

Another pointer in the VTable is worth mentioning: the TI pointer (see the picture above), which stands for Type Information and points to the RTTI (Run-Time Type Information) for the class.
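The effect of this type information is visible through the standard typeid operator, which reports the dynamic type of a polymorphic object even when it is accessed through a base reference. A small sketch, assuming RTTI is enabled (the default for mainstream compilers); DynamicTypeIsDerived is a hypothetical helper:

```cpp
#include <cassert>
#include <typeinfo>

class Base {
public:
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    int Sum(int a, int b) override { return a + b + 100; }
};

// For a polymorphic type, typeid(ref) is evaluated at run time and
// yields the type of the most derived object.
bool DynamicTypeIsDerived(const Base& ref) {
    return typeid(ref) == typeid(Derived);
}

bool CheckTypeInfo() {
    Derived d;
    Base b;
    assert(DynamicTypeIsDerived(d));
    return !DynamicTypeIsDerived(b);
}
```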

The call

The VMT information seems redundant: why would the compiler save the address of the method (and, moreover, increase the size of every instance) if it can call the method directly, and actually does so? To find the answer, we need to modify the code a bit, changing the way we refer to the object and its methods:
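A sketch of such a modification, where the object is reached only through a base-class reference. The class bodies and the helper CallViaReference are assumptions:

```cpp
#include <cassert>

class Base {
public:
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    int Sum(int a, int b) override { return a + b + 100; }
};

// The static type of `b` is Base, while the object it refers to is a Derived.
int CallViaReference() {
    Derived d;
    Base& b = d;
    int result = b.Sum(1, 2);  // resolved at run time to Derived::Sum
    assert(result == 103);
    return result;
}
```

An optimizing compiler may still see through the reference and devirtualize the call, so an unoptimized build is the better place to look for the indirect call.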

The only change is that a reference of type Base& is now used to call Sum, in an attempt to fool the compiler so that it does not know the object's type. According to the assembly code, the attempt succeeded:

There is no direct call to Sum anymore; it has been replaced with the indirect call call *(%rax). That is, it calls whatever address is stored in the memory that %rax points to at the moment the instruction executes. This is the key principle of virtual method dispatch: the code actually executed by such a call is determined dynamically. The content of %rax at the moment of the call is determined by the following operations:

So Derived::Sum is in fact called via its VMT record even when it is called through a reference to the base class Base; this is what virtual methods exist for. The effective behavior of the object is defined by its actual type and not by the way it is referred to. Any reference or pointer to any base sub-object works. Even in the case of so-called diamond inheritance, the mechanism finds the proper method address and calls it.
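The dispatch steps can be modelled in portable C++ with an explicit table of function pointers. This is a hand-written model and not what the compiler literally emits; every name in it (Object, VTable, CallSum and so on) is hypothetical:

```cpp
#include <cassert>

struct Object;
using SumFn = int (*)(Object* self, int a, int b);

// One table per "class"; slot 0 plays the role of the first VMT record.
struct VTable {
    SumFn sum;
};

// The object carries a pointer to its table, like the hidden 8-byte field.
struct Object {
    const VTable* vptr;
};

int BaseSum(Object*, int a, int b) { return a + b; }
int DerivedSum(Object*, int a, int b) { return a + b + 100; }

const VTable kBaseVTable{&BaseSum};
const VTable kDerivedVTable{&DerivedSum};

// Equivalent of the indirect call: load the table pointer from the object,
// load the slot, then call it, passing the object as `this`.
int CallSum(Object& obj, int a, int b) {
    const VTable* table = obj.vptr;
    SumFn target = table->sum;
    return target(&obj, a, b);
}

bool CheckManualDispatch() {
    Object base{&kBaseVTable};
    Object derived{&kDerivedVTable};
    assert(CallSum(base, 1, 2) == 3);
    return CallSum(derived, 1, 2) == 103;
}
```

The call site in CallSum is identical for both objects; only the table each object points to decides which function runs.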

To summarize:

Construction and destruction

Earlier, in the assembly code of the Derived constructor, we saw a call to the Base constructor. What is important is that this call happens before Derived sets up its VMT pointer. This is because Base needs to set up a VMT pointer to its own VMT, so that Base instances (created directly with the Base constructor) use Base virtual methods:

The Base constructor uses the same mechanism as Derived (shown before) to set up its VMT pointer. What about destruction? It works the same way but in the opposite direction. To illustrate this, the code needs to be modified a bit to force the compiler to call the destructors. Additionally, a call to the virtual Sum is added to the destructors to force the compiler to emit some virtual-related code there:
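The observable side of this behavior can be sketched by recording which Sum implementation runs in each constructor and destructor. The global g_trace, the helper TraceLifetime and the method bodies are assumptions for illustration; the constructors are instrumented as well as the destructors to show both directions:

```cpp
#include <cassert>
#include <vector>

std::vector<int> g_trace;  // hypothetical log of Sum(1, 2) results

class Base {
public:
    Base() { g_trace.push_back(Sum(1, 2)); }
    ~Base() { g_trace.push_back(Sum(1, 2)); }
    virtual int Sum(int a, int b) { return a + b; }
};

class Derived : public Base {
public:
    Derived() { g_trace.push_back(Sum(1, 2)); }
    ~Derived() { g_trace.push_back(Sum(1, 2)); }
    int Sum(int a, int b) override { return a + b + 100; }
};

// Creates and destroys one Derived object and returns what was logged.
std::vector<int> TraceLifetime() {
    g_trace.clear();
    {
        Derived d;
    }
    return g_trace;
}

bool CheckLifetime() {
    // Base ctor, Derived ctor, Derived dtor, Base dtor.
    const std::vector<int> expected{3, 103, 103, 3};
    assert(TraceLifetime().size() == 4);
    return TraceLifetime() == expected;
}
```

The first and last entries come from Base::Sum even though the complete object is a Derived, which is the language-level rule that the VMT pointer reloading implements.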

Compacted asm code for Derived destructor:

For Base class:

A common pattern appears here, similar to what we have seen in the constructors: each destructor reloads the object's VMT pointer according to its own class. This leads to an important conclusion: construction and destruction code always executes in the context of the current class's VMT (and not that of the most derived class). The lifeline of the object's VMT pointer can be drawn like this:

This may seem a bit strange at first glance, but it makes sense: imagine Base::~Base() using the VMT pointer of Derived. Calls to virtual methods during destruction would then reach the overridden implementations (Derived::Sum(), for example). But the overridden methods of Derived were written on the assumption that all Derived resources (variables, dynamically allocated memory, or whatever else was created in the Derived constructor) are available. This is not true while the Base destructor runs, because it executes after the Derived destructor, which may already have destroyed something that Derived methods need! So the only safe way for the Base destructor to execute virtual methods is to execute its own implementations (or those of its parents, whose destruction code has not run yet). That is what the VMT pointer manipulation achieves. The same logic applies to constructors, but in the opposite direction: the Base constructor cannot safely call Derived implementations because Derived data may not be ready yet.

Btw, this doesn't completely disable the virtual mechanism during construction or destruction: in the code above, the calls to Sum() are still indirect and use the value from the VMT. If Base had a virtual method that was not overridden in Derived, the Derived destructor would call the Base implementation of that method via the VMT. This guarantees that the most appropriate and, at the same time, safe overridden version is called. We can say that during these special periods of the object's lifetime we use limited virtual calls, which go no deeper than the class of the sub-object currently being constructed or destroyed.

This VMT manipulation may result in a call to an abstract (pure virtual) method, if there is one, during construction or destruction. Oops...

Not only about call address

OK, now we understand a bit more about how the appropriate code is selected for a virtual call. But what about this for that call? Does the pointer to the object need to be adjusted somehow? Let's take a look at the following code example:

Here is what the memory layout of MoreDerived looks like:

After an instance of MoreDerived is created, this points to the beginning of the object's memory layout. This pointer is suitable for all methods declared in MoreDerived, because the compiler generated them with the offsets of A, B and C in mind, and these offsets are the same for all objects of type MoreDerived. The same this value is suitable for all methods declared in Derived, because the Derived layout is also known at compile time and is part of the MoreDerived layout (it is a sub-object of the most derived object of class MoreDerived). If we created an instance of Derived, we would get the following layout:

Obviously, the offsets of the Base::A and Derived::B fields are the same in both layouts above. This is a result of the strict layout rules for non-static data fields inside objects and sub-objects: for any given class (MoreDerived), the fields of all its parent classes (Base, Derived) are located strictly before the class's own fields (&Base::A < &MoreDerived::C, &Derived::B < &MoreDerived::C, &Base::A < &Derived::B). So wherever a sub-object of class MoreDerived appears in the memory layout of any descendant of MoreDerived, its fields are guaranteed to be preceded by the parents' data fields, at the same offsets and in the same order. In this case the same this fits a call to any method of the class and its base classes. Trivial, but it must be mentioned.
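Both observations can be checked with a sketch of a single-inheritance chain. The field types, the Total method and the helper names are assumptions, and the results reflect how common ABIs lay out such classes, not a guarantee of the C++ standard:

```cpp
#include <cassert>
#include <cstdint>

class Base {
public:
    int A = 1;
    virtual int Total() { return A; }
};

class Derived : public Base {
public:
    int B = 2;
    int Total() override { return A + B; }
};

class MoreDerived : public Derived {
public:
    int C = 3;
    int Total() override { return A + B + C; }
};

// With single inheritance every base sub-object starts where the whole
// object starts, so one `this` value serves all three classes.
bool SharedThis() {
    MoreDerived md;
    void* whole = &md;
    void* asDerived = static_cast<Derived*>(&md);
    void* asBase = static_cast<Base*>(&md);
    return whole == asDerived && whole == asBase;
}

// Parent fields come before the fields of the more derived classes.
bool ParentsFirst() {
    MoreDerived md;
    auto a = reinterpret_cast<std::uintptr_t>(&md.A);
    auto b = reinterpret_cast<std::uintptr_t>(&md.B);
    auto c = reinterpret_cast<std::uintptr_t>(&md.C);
    assert(md.Total() == 6);
    return a < b && b < c;
}
```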

The picture changes significantly when we have multiple inheritance. Assume we have the following code:

What was described above no longer applies to the layout of class Derived; there is no single common this pointer in this case. Let's review the memory layouts of the classes:

Here the field Base2::B2 in Derived layout has offset 16 while the same field in Base2 layout has offset 8, which means that casting Derived to Base2 will have to change this to be able to call Base2 methods properly. And it does change:

According to what we know so far about virtual calls, the code above cannot work properly. The line return d.Sum2(1) + db.Sum2(2) calls the same implementation of Sum2 (overridden in Derived) but uses a different this.
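The situation can be reproduced with a sketch like the one below. The field values 0xb1, 0xb2 and 0xd0 and the expression d.Sum2(1) + db.Sum2(2) follow the article; the method bodies, Sum1 and the helper names are assumptions. The exact offset of the Base2 sub-object is ABI-specific, so the sketch only checks that it is non-zero:

```cpp
#include <cassert>
#include <cstddef>

class Base1 {
public:
    int B1 = 0xb1;
    virtual int Sum1(int x) { return B1 + x; }
};

class Base2 {
public:
    int B2 = 0xb2;
    virtual int Sum2(int x) { return B2 + x; }
};

class Derived : public Base1, public Base2 {
public:
    int D = 0xd0;
    // Assumed override that touches fields of every sub-object, so a wrong
    // `this` would produce a wrong result.
    int Sum2(int x) override { return B1 + B2 + D + x; }
};

// Distance in bytes between the Derived object and its Base2 sub-object.
std::ptrdiff_t Base2Offset(Derived& d) {
    Base2& db = d;  // the conversion adjusts the address
    return reinterpret_cast<char*>(&db) - reinterpret_cast<char*>(&d);
}

int CallBothWays() {
    Derived d;
    Base2& db = d;
    assert(Base2Offset(d) > 0);
    return d.Sum2(1) + db.Sum2(2);
}
```

Both calls reach Derived::Sum2 and read the right fields: with the values above the result is (563 + 1) + (563 + 2) = 1129, even though db refers to an address inside d and not to its start.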

Let's figure this out. d memory layout:

Here the data fields with values 0xb1, 0xb2 and 0xd0 are in their expected places, and the VMT pointer 0x0000000000403d68 is at the beginning of the layout as expected, but there is also an unexpected value, 0x0000000000403d88, that precedes the Base2 field 0xb2:

This unexpected value is also a vtable pointer, pointing somewhere into the Derived vtable. And this is the beginning of the Base2 memory layout inside the Derived memory layout (see the address):

So the value stored at 0x7fffffffdcc8 will be used as the VMT pointer when calling virtual methods of Base2, redirecting the calls to the real method implementation in Derived. This is very similar to what we have seen before, but... the Base2 sub-object has its own VMT pointer!

Moving along, examining the pointer:

This is the so-called thunk, a piece of code that adjusts this pointer to a proper value for the virtual call:

Since the address of this piece of code is stored in the VMT, it is called instead of the actual method. After the adjustment is done, the overridden Derived method Sum2 is reached via jmp 0x4011d0 <_ZN7Derived4Sum2Ei>.

The adjustment is done by add $0xfffffffffffffff0,%rdi, which shifts %rdi (this) by -16 bytes:

And that is exactly the distance between the Base2 sub-object and the this of the enclosing Derived object (see the memory layout above)! If we take a closer look at the Derived VMT, we notice that the same method Sum2 actually has 2 records there:

One of the records points directly to the Derived::Sum2 implementation at 0x00000000004011d0; the other points to the thunk at 0x0000000000401270, which, after adjusting this, jumps to the same Derived::Sum2 implementation at the same address (jmp 0x4011d0 <_ZN7Derived4Sum2Ei>). The same code at the same address is reached in both cases, once directly and once after some preparation.
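What the thunk does can be imitated in portable C++: static_cast from Base2* to Derived* performs the same address adjustment, and a qualified call then reaches the implementation without another dispatch. This is only a model of the compiler-generated thunk; Sum2Thunk, ThunkMatchesDirectCall and the class bodies are hypothetical:

```cpp
#include <cassert>

class Base1 {
public:
    int B1 = 0xb1;
    virtual int Sum1(int x) { return B1 + x; }
};

class Base2 {
public:
    int B2 = 0xb2;
    virtual int Sum2(int x) { return B2 + x; }
};

class Derived : public Base1, public Base2 {
public:
    int D = 0xd0;
    int Sum2(int x) override { return B1 + B2 + D + x; }
};

// Stand-in for the thunk: takes the Base2 view of the object, shifts it
// back to the enclosing Derived, then forwards to the real implementation.
int Sum2Thunk(Base2* self, int x) {
    Derived* adjusted = static_cast<Derived*>(self);  // subtracts the sub-object offset
    return adjusted->Derived::Sum2(x);                // qualified call, no dispatch
}

bool ThunkMatchesDirectCall() {
    Derived d;
    Base2* view = &d;
    // On common ABIs the Base2 view is not at the start of the object.
    assert(static_cast<void*>(view) != static_cast<void*>(&d));
    assert(static_cast<Derived*>(view) == &d);
    return Sum2Thunk(view, 1) == d.Sum2(1);
}
```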

This way the C++ compiler prepares for a virtual call and guarantees that:

It doesn't matter where in the class hierarchy the implementation of a virtual method is located or how the object is referred to: the call will always receive the correct this value.

Conclusion

Virtuals are fun... and more is to come!