Virtual Function Mechanism

Static Binding

The default binding of functions in C++ is static binding, also known as early binding and compile-time binding.

Static binding completes the process of function lookup and association during compile time, which can improve the runtime performance of the program.

Dynamic Binding

#include <iostream>

class Base {
public:
    void foo() {
        std::cout << "Base::foo" << std::endl;
    }
};

class Derive1 : public Base {
public:
    void foo() {
        std::cout << "Derive1::foo" << std::endl;
    }
};

class Derive2 : public Base {
public:
    void foo() {
        std::cout << "Derive2::foo" << std::endl;
    }
};

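int main() {
    // Stack objects avoid leaking memory and avoid deleting a derived
    // object through a Base pointer that has no virtual destructor.
    Derive1 d1;
    Derive2 d2;

    Base *base = &d1;
    base->foo();

    base = &d2;
    base->foo();

    return 0;
}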

Output:

Base::foo
Base::foo

Here, the derived class functions are not called because foo is not virtual, so the compiler uses static binding. At compile time the compiler knows only the static type of the pointer (Base *), not the type of the object it will point to at runtime, so it binds each call to Base::foo.

To call the function that matches the actual object type, function binding must be deferred from compile time to runtime. This is called dynamic binding, also known as late binding or runtime binding.

To achieve dynamic binding, declare the base class member functions that should be bound at runtime as virtual functions.

#include <iostream>

class Base {
public:
    virtual void foo() {
        std::cout << "Base::foo" << std::endl;
    }
    
    virtual ~Base() {}
};

class Derive1 : public Base {
public:
    void foo() override {
        std::cout << "Derive1::foo" << std::endl;
    }
};

class Derive2 : public Base {
public:
    void foo() override {
        std::cout << "Derive2::foo" << std::endl;
    }
};

int main() {
    Base *base = new Derive1();
    base->foo();
    delete base; // Free the first object before reusing the pointer

    base = new Derive2();
    base->foo();

    delete base;
    return 0;
}

Output:

Derive1::foo
Derive2::foo

Virtual Function Table

Dynamic binding of virtual functions is typically implemented with a virtual function table (vftable). The C++ standard does not prescribe this mechanism, but in mainstream compilers, when a class has virtual functions, the compiler inserts a hidden pointer (vfptr), usually at the start of each object, which points to an array holding the addresses of the class's virtual functions.

The compiler generates a virtual function table for each class that contains virtual functions. All objects of this class share the same virtual function table. Consider the following code:

#include <iostream>

class Base {
public:
    virtual void foo() {}
};

int main() {
    Base b1;
    // Implementation-specific trick: treat the first pointer-sized word of the object as its vfptr
    void **b1_vfptr = *(void ***)&b1;

    Base b2;
    void **b2_vfptr = *(void ***)&b2;

    std::cout << (b1_vfptr == b2_vfptr) << std::endl;

    return 0;
}

Output:

1

The virtual function table of a class contains one entry for each virtual function the class inherits or declares. Consider the following code:

#include <iostream>

class Base {
public:
    virtual void foo() {
        std::cout << "Base::foo" << std::endl;
    }

    virtual ~Base() {}
};

class Derive : public Base {
public:
    void foo() override {
        std::cout << "Derive::foo" << std::endl;
    }

    virtual void foo2() {
        std::cout << "Derive::foo2" << std::endl;
    }
};

int main() {
    Base *base = new Derive;

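    // Implementation-specific: read the vfptr stored at the start of the object
    void **vfptr = *(void ***)base;

    // Typical layout with g++ (Itanium C++ ABI); other compilers differ:
    // [0] &Derive::foo      (replaces the Base::foo slot)
    // [1] &Derive::~Derive  (complete object destructor)
    // [2] &Derive::~Derive  (deleting destructor)
    // [3] &Derive::foo2
    // Calling a member function through a plain function pointer is not
    // portable; it works here only because foo2 does not use `this`.
    void (*vf)() = (void(*)())vfptr[3];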

    vf();

    delete base;
    return 0;
}

Output:

Derive::foo2

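The compiler generates the virtual function table itself at compile time, and the table exists for the lifetime of the program. What happens at runtime is that each constructor stores the address of its class's table in the object's vfptr, and each destructor resets the vfptr to its own class's table before its body runs.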

With multiple inheritance, a class may have several virtual function tables, and each object then typically holds one vfptr per base class that has virtual functions.

#include <iostream>

class Base1 {
public:
    virtual void foo() {} 
};

class Base2 {
public:
    virtual void foo() {}
};

class Derive : public Base1, public Base2 {
public:
    void foo() override {}
};

int main() {
    std::cout << sizeof(Derive) << std::endl;
    return 0;
}

Output on a 64-bit OS (typical result, one 8-byte vfptr per base class):

16

When the derived class does not override a virtual function of the base class, the corresponding entry in the derived class's virtual function table holds the address of the base class function.

When the derived class overrides a virtual function of the base class, the corresponding entry is replaced with the address of the derived class function.

#include <iostream>

class Base {
public:
    virtual void foo() {
        std::cout << "Base::foo" << std::endl;
    }

    virtual ~Base() {}
};

class Derive1 : public Base {
public:
    
};

class Derive2 : public Base {
public:
    virtual void foo() override {
        std::cout << "Derive2::foo" << std::endl;
    }
};

int main() {
    Base *base = new Derive1();
    base->foo();
    delete base; // Free the first object before reusing the pointer

    base = new Derive2();
    base->foo();

    delete base;
    return 0;
}

Output:

Base::foo
Derive2::foo

Run-Time Type Identification

C++ is a statically typed language: the type of every expression is fixed at compile time. With polymorphism, however, a base class pointer or reference may refer to an object of a derived class, and the actual (dynamic) type of that object is known only at runtime.

Run-time type identification (RTTI) is the mechanism used to determine the data type during the runtime phase.

typeid

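The typeid operator returns a std::type_info object describing the type of an expression. For most expressions the result is determined at compile time from the static type. When the operand is a glvalue of a polymorphic class type (for example, a dereferenced base class pointer), the result is determined at runtime from the actual object.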

Determining the type of a variable during the runtime phase:

#include <iostream>
#include <typeinfo>

class Base {
public:
    virtual ~Base() {}
};

class Derive : public Base {
public:
    
};

int main() {
    Base *base = new Derive;
    std::cout << typeid(*base).name() << std::endl;
    delete base;
    return 0;
}

Output with g++:

6Derive

dynamic_cast

dynamic_cast converts pointers or references between classes in an inheritance hierarchy and checks whether the conversion is safe. An upcast (derived to base) is resolved at compile time. A downcast (base to derived) is checked at runtime and requires the base class to be polymorphic.

Type conversion during the compilation phase:

#include <iostream>

class Base {};

class Derive : public Base {};

int main() {
    Derive *derive = new Derive();
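    Base *base = dynamic_cast<Base *>(derive); // Upcast, resolved at compile time

    // Base has no virtual destructor, so delete through the Derive pointer
    delete derive;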
    return 0;
}

An upcast is always safe: every Derive object contains a complete Base subobject, so the Base pointer never addresses memory outside the object.

Type conversion during the runtime phase:

#include <iostream>

class Base {
public:
    virtual ~Base() {}
};

class Derive : public Base {};

int main() {
    Base *base1 = new Base(); // base1 points to a Base object
    Derive *derive1 = dynamic_cast<Derive *>(base1); // Downcast: the object is not a Derive
    std::cout << derive1 << std::endl; // 0: conversion failed, derive1 is nullptr

    Base *base2 = new Derive(); // base2 points to a Derive object
    Derive *derive2 = dynamic_cast<Derive *>(base2); // Downcast: the object really is a Derive
    std::cout << derive2 << std::endl; // Non-zero: conversion succeeded

    delete base1; // derive1 is nullptr, so free the object through base1
    delete derive2;
    return 0;
}

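A downcast is only valid if the object really is of the derived type. Otherwise, accessing derived class members through the converted pointer would read or write memory outside the object, which is why dynamic_cast verifies the actual object type at runtime.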

Runtime type identification with typeid and dynamic_cast relies on the virtual function mechanism.

With polymorphism, a parent class pointer may point to a child class object, so the pointer's static type alone does not identify the object.

The virtual function table usually also holds a pointer to the std::type_info object of its class, so the object's type information (&class_meta) can be reached through the vfptr.

Pros and Cons of the Virtual Function Mechanism

Advantages:

Dynamic Polymorphism: Allows calling overridden virtual functions in derived classes through a base class pointer or reference.
Code Reusability: The base class defines a common interface, and derived classes can override virtual functions to implement different functionalities.
Extensibility: New derived classes can be added and virtual functions can be overridden without modifying the base class code.
Decoupling: The virtual function mechanism helps decouple code, making it easier to maintain and modify.

Disadvantages:

Memory Overhead: Each class that contains virtual functions has a virtual function table (an array of function pointers shared by all objects of the class), and each object stores a vfptr, which takes 8 bytes on a typical 64-bit system.
Call Overhead: A virtual call loads the function address from the virtual function table through the vfptr before calling it, which is slightly slower than a direct call to a known address.
Potential Impact on Optimization: With static binding, the compiler knows the target function at compile time and can inline it. With dynamic binding, the compiler usually cannot determine the target at compile time, so it must go through the virtual function table at runtime and generally cannot inline the call.