C++ For C# Developers: Part 21 – Casting and RTTI
Now that we’ve seen how types are implicitly converted in C++, we can see how they’re explicitly converted by casting. C++ offers a lot more kinds of casts than C# to control the conversion process. One of them, dynamic_cast, introduces the concept of Run-Time Type Information or RTTI, so we’ll go into that today as well.
const_cast
C# has only one kind of cast: (DestinationType)sourceType. Since it’s the only option, it must be capable of handling every possible reason for casting. C++ takes a different tack. It provides a suite of casts from which we may choose to suit the intention of particular casting operations.
The first cast in this suite is one of the simplest: const_cast. We use this when we simply want to treat a const pointer or reference as non-const:
```cpp
// Remove const from a pointer
int x = 123;
int const * constPtr = &x;
int* nonConstPtr = const_cast<int*>(constPtr);
*nonConstPtr = 456;
DebugLog(x); // 456

// Remove const from a reference
int const & constRef = x;
int& nonConstRef = const_cast<int&>(constRef);
nonConstRef = 789;
DebugLog(x); // 789

// It's OK to cast null
constPtr = nullptr;
int* nullPtr = const_cast<int*>(constPtr);
DebugLog(nullPtr); // null
```
Note that function pointers and member function pointers can’t be const_cast.
In the first two examples, we used const_cast to get a non-const pointer and reference and then modified the value they referred to: x. That’s perfectly OK because x wasn’t actually const. However, it’s undefined behavior if we modify a variable that’s actually const:
```cpp
int const x = 123;
int const & constRef = x;
int& nonConstRef = const_cast<int&>(constRef);
nonConstRef = 789; // Undefined behavior: modifying const variable x
DebugLog(x); // Could be anything!
```
So what does const_cast actually do? It’s really just a compile-time operation to reclassify an expression as non-const. No CPU instructions are generated because the CPU has no concept of const. In this sense, const_cast is “free” from a performance perspective.
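One place this compile-time reclassification is commonly put to work is sharing an implementation between the const and non-const overloads of a member function. The following is only a sketch of that idiom, not something from this series: Inventory, Slot, and WriteThenRead are hypothetical names. It relies on the rule above: the cast is only OK because the non-const overload can only be called on an object that isn’t actually const.

```cpp
#include <cassert>

// Hypothetical type used only for illustration
struct Inventory
{
    int Slots[4]{10, 20, 30, 40};

    // The "real" implementation lives in the const overload
    int const & Slot(int index) const
    {
        assert(index >= 0 && index < 4);
        return Slots[index];
    }

    // The non-const overload reuses it. Casting away const is OK here
    // because *this is known to be non-const in this overload.
    int& Slot(int index)
    {
        Inventory const & self = *this;
        return const_cast<int&>(self.Slot(index));
    }
};

// Writes through the non-const overload, reads back through the const one
int WriteThenRead(int index, int value)
{
    Inventory inv;
    inv.Slot(index) = value;
    Inventory const & constInv = inv;
    return constInv.Slot(index);
}
```

Calling WriteThenRead(2, 99) exercises both overloads and returns the value that was just written.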
reinterpret_cast
The next kind of cast is reinterpret_cast. This is another “free” cast that generates no CPU instructions. Instead, it just tells the compiler to “reinterpret” the type of one expression as another type.
We can only use this in particular situations. First, a pointer can be converted to and from an integer as long as that integer is large enough to hold all possible pointer values:
```cpp
// Pointer -> Integer
int x = 123;
int* p = &x;
uint64_t i = reinterpret_cast<uint64_t>(p);
DebugLog(i); // memory address of x

// Integer -> Pointer
int* p2 = reinterpret_cast<int*>(i);
*p2 = 456;
DebugLog(x); // 456
```
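Choosing uint64_t assumes pointers are no wider than 64 bits. A more portable sketch, assuming the platform provides the optional std::uintptr_t type from <cstdint>, uses that type instead since it exists specifically to hold pointer values. RoundTrip is a hypothetical function name used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// uintptr_t is expected to be at least as wide as a pointer
static_assert(sizeof(std::uintptr_t) >= sizeof(void*), "uintptr_t too small");

// Converts a pointer to an integer and back, then writes through the result
int RoundTrip(int value)
{
    int x = 0;

    // Pointer -> Integer
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(&x);

    // Integer -> Pointer: converting back yields the original pointer value
    int* p = reinterpret_cast<int*>(address);
    assert(p == &x);

    *p = value; // Writes to x
    return x;
}
```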
We can also reinterpret nullptr as the integer 0:
```cpp
uint64_t i = reinterpret_cast<uint64_t>(nullptr);
DebugLog(i); // 0
```
More commonly, we can reinterpret one kind of pointer as another kind of pointer:
```cpp
struct Vector2
{
    float X;
    float Y;
};

struct Point2
{
    float X;
    float Y;
};

Point2 point{2, 4};
Point2* pPoint = &point;
Vector2* pVector = reinterpret_cast<Vector2*>(pPoint);
DebugLog(pVector->X, pVector->Y); // 2, 4
```
We have to take some precautions in order to use the resulting pointer safely. First, CPUs have alignment requirements on various data types such as float. The destination type’s alignment requirements can’t be stricter than the source type’s alignment requirements. It’s up to us as programmers to know the alignment requirements for our target CPU architectures and ensure that we’re using reinterpret_cast responsibly.
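One way to make the alignment precaution explicit is to check it at compile time with alignof. This is a minimal sketch and not a complete safety check: it compares alignment requirements only and says nothing about the other rules for using the result. CanReinterpretAligned is a hypothetical helper name, and Vector2 and Point2 repeat the definitions from the example above so the block stands alone.

```cpp
#include <cassert>

struct Vector2 { float X; float Y; };
struct Point2 { float X; float Y; };

// True if the destination type's alignment is no stricter than the source's
template <typename TDest, typename TSrc>
constexpr bool CanReinterpretAligned()
{
    return alignof(TDest) <= alignof(TSrc);
}

// Two structs of floats have the same alignment requirement
static_assert(CanReinterpretAligned<Vector2, Point2>(), "same alignment");

// char has the least strict alignment requirement of any type
static_assert(CanReinterpretAligned<char, double>(), "char is least strict");

// On typical platforms double is stricter than char, so this direction fails
static_assert(!CanReinterpretAligned<double, char>(), "double is stricter");
```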
The C++ Standard says using the result of a reinterpret_cast is undefined behavior except in certain particular cases. The first is if the source and destination types are “similar.” That’s defined as being the same type, pointers to the same type, pointers to members of the same class and those members are similar, or arrays of the same size (or one has unknown size) with similar elements. Here are some examples:
```cpp
// Similar: pointer to same type
int x = 123;
int* p = reinterpret_cast<int*>(&x); // int* -> int*
DebugLog(*p); // 123

// Similar: array with same dimensions and same type of elements
int a1[3]{1, 2, 3};
int (&a)[3] = reinterpret_cast<int(&)[3]>(a1);
DebugLog(a[0], a[1], a[2]); // 1, 2, 3

// Not similar: different types of pointers
float* pFloat = reinterpret_cast<float*>(p);
DebugLog(*pFloat); // Undefined behavior
```
Because undefined behavior may be implemented by a compiler however it chooses, many compilers are less strict than the C++ Standard requires. The last example of an int* to float* conversion in particular is commonly allowed by compilers as a kind of “type punning” similar to what we saw with unions.
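For code that needs this kind of type punning without relying on compiler leniency, a commonly used alternative is to copy the bytes with std::memcpy rather than read through a reinterpreted pointer. This isn’t one of the article’s examples. It’s a sketch that assumes float is a 32-bit IEEE 754 type, and FloatBits is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float and uint32_t have the same size on this platform
static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Returns the bit pattern of a float without reading through a uint32_t*
std::uint32_t FloatBits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits)); // Copy the object representation
    return bits;
}
```

With IEEE 754 floats, FloatBits(1.0f) is 0x3F800000. Compilers typically optimize the copy away, so this is usually just as “free” as the cast. C++20 also provides std::bit_cast in the <bit> header for the same purpose.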
If the types aren’t “similar” then we have two more chances to avoid undefined behavior. First, if one type is the signed or unsigned version of the same type:
```cpp
// int* -> unsigned int*
int x = 123;
unsigned int* p = reinterpret_cast<unsigned int*>(&x);
DebugLog(*p); // 123
```
And second, if we’re reinterpreting as a char, unsigned char or, in C++17 and later, std::byte. These are specifically allowed so we can look at the byte representation of objects, such as for serialization to disk or a network:
```cpp
// Print the bytes of a Vector2
Vector2 vec{2, 4};
char* p = reinterpret_cast<char*>(&vec);
DebugLog(p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7]);
```
static_cast
Next up we have our first cast that can generate CPU instructions: static_cast. The compiler checks a series of conditions to decide what a static_cast should do. The first check is to see if there’s an implicit conversion sequence from the source type to the destination type or if the destination type can be direct-initialized from the source type. If either is the case, the static_cast behaves like we wrote DestType tempVariable(sourceType):
```cpp
struct File
{
    FILE* handle;

    File(const char* path, const char* mode)
    {
        handle = fopen(path, mode);
    }

    ~File()
    {
        fclose(handle);
    }

    // User-defined conversion operator used by the cast below
    operator FILE*()
    {
        return handle;
    }
};

File reader{"/path/to/file", "r"};
FILE* handle = static_cast<FILE*>(reader); // Implicit conversion
```
Next, it checks if we’re downcasting from a pointer or reference to a base class to a pointer or reference to a (non-virtually) derived class:
```cpp
struct Vector2
{
    float X;
    float Y;
};

struct Vector3 : Vector2
{
    float Z;
};

Vector3 vec;
vec.X = 2;
vec.Y = 4;
vec.Z = 6;
Vector2& refVec2 = vec; // Implicit conversion from Vector3& to Vector2&
Vector3& refVec3 = static_cast<Vector3&>(refVec2); // Downcast
DebugLog(refVec3.X, refVec3.Y, refVec3.Z); // 2, 4, 6
```
We can also static_cast to void to explicitly discard a value. This is sometimes used to silence an “unused variable” compiler warning:
```cpp
Vector2 vec{2, 4};
static_cast<void>(vec); // Discard the result of the cast
```
If there’s a standard conversion from the destination type to the source type, static_cast will reverse it. It won’t reverse any lvalue-to-rvalue conversions, array-to-pointer decay, function-to-pointer conversions, function pointer conversions, or bool conversions though.
```cpp
float f = 123;
int i = static_cast<int>(f); // Undo standard conversion: int -> float
DebugLog(i); // 123
```
We can also explicitly perform some implicit conversions with static_cast: lvalue-to-rvalue, array-to-pointer decay, and function-to-pointer:
```cpp
void SayHello()
{
    DebugLog("hello");
}

// lvalue to rvalue conversion
int i = 123;
int i2 = static_cast<int&&>(i);
DebugLog(i2); // 123

// Array to pointer decay
int a[3]{1, 2, 3};
int* p = static_cast<int*>(a);
DebugLog(p[0], p[1], p[2]); // 1, 2, 3

// Function to pointer conversion
void (*pFunc)() = static_cast<void(*)()>(SayHello);
pFunc(); // hello
```
Scoped enumerations can be converted to integer or floating point types using static_cast. Since C++20, this works like an implicit conversion from the enum’s underlying type to the destination type. Before that, casting to bool was treated differently since only 0 would become false and everything else would become true.
```cpp
enum class Color
{
    Red,
    Green,
    Blue
};

Color green{Color::Green};

// Scoped enum -> int
int i = static_cast<int>(green);
DebugLog(i); // 1

// Scoped enum -> float
float f = static_cast<float>(green);
DebugLog(f); // 1
```
We can go the other way, too: integers and floating point types can be static_cast to scoped or unscoped enumerations. We can also cast between enumeration types:
```cpp
// Enum definitions assumed for this example
enum class FgColor { Red, Green, Blue };
enum class BgColor { Red, Green, Blue };

// Integer -> enum
int i = 1;
FgColor g1 = static_cast<FgColor>(i);
DebugLog(g1); // Green

// Floating point -> enum
float f = 1;
FgColor g2 = static_cast<FgColor>(f);
DebugLog(g2); // Green

// Cast between enum types
FgColor g3{FgColor::Green};
BgColor g4 = static_cast<BgColor>(g3);
DebugLog(g4); // Green
```
It’s undefined behavior if the underlying type of the enum isn’t fixed and the value being cast to the enum is out of its range. If it is fixed, the result is just like converting to the underlying type. Floating point values are first converted to the underlying type.
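To see the fixed-underlying-type rule in action, here’s a small sketch using a hypothetical Small enum and CastAndReadBack function, neither of which appears elsewhere in this article. Because the underlying type is fixed as std::uint8_t, an out-of-range integer is first converted to that type, wrapping around modulo 256, rather than causing undefined behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum with a fixed underlying type
enum class Small : std::uint8_t
{
    A,
    B
};

// Casts an int to Small and back to show which value the enum ends up with
int CastAndReadBack(int value)
{
    // value is first converted to uint8_t, then to Small
    Small s = static_cast<Small>(value);
    return static_cast<int>(s);
}
```

CastAndReadBack(1) returns 1, CastAndReadBack(200) returns 200 even though no enumerator has that value, and CastAndReadBack(300) returns 44 because 300 doesn’t fit in a std::uint8_t.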
We can also use static_cast to upcast from a pointer to a member in a derived class to a pointer to a member in the base class:
```cpp
float Vector3::* p1 = &Vector3::X;
float Vector2::* p2 = static_cast<float Vector2::*>(p1);
Vector3 vec;
vec.X = 2;
vec.Y = 4;
vec.Z = 6;
DebugLog(vec.*p1, vec.*p2); // 2, 2
```
And finally, static_cast can be used like reinterpret_cast to convert a void* to any other pointer type. The same caveats about alignment and type similarity apply.
```cpp
int i = 123;
void* pv = &i;
int* pi = static_cast<int*>(pv);
DebugLog(*pi); // 123
```
C-Style Cast and Function-Style Cast
A “C-style” cast looks like a cast in C as well as C#: (DestinationType)sourceType. It behaves quite differently in C++ compared to C#. In C++, it’s mostly a shorthand for the first “named” cast whose prerequisites are met in this order:
1. const_cast<DestinationType>(sourceType)
2. static_cast<DestinationType>(sourceType) with more leniency: pointers and references to or from derived classes or members of derived classes can be cast to pointers or references to base classes or members of base classes
3. static_cast (with more leniency) then const_cast
4. reinterpret_cast<DestinationType>(sourceType)
5. reinterpret_cast then const_cast
```cpp
// Uses const_cast (#1)
// const_cast only applies to pointers and references, so cast a pointer
int const i1 = 123;
int* pi1 = (int*)&i1;
DebugLog(*pi1); // 123

// Uses static_cast (#2)
Vector2 vec{2, 4};
Vector3* pVec = (Vector3*)&vec;
DebugLog(pVec->X, pVec->Y); // 2, 4 (undefined behavior to use Z!)

// Uses static_cast then const_cast (#3)
Vector2 const * pConstVec = &vec;
Vector3* pVec3 = (Vector3*)pConstVec;
DebugLog(pVec3->X, pVec3->Y); // 2, 4 (undefined behavior to use Z!)

// Uses reinterpret_cast (#4)
int i2 = 123;
float* f1 = (float*)&i2;
DebugLog(*f1); // 1.7236e-43

// Uses reinterpret_cast then const_cast (#5)
float* f2 = (float*)&i1;
DebugLog(*f2); // 1.7236e-43
```
A “function-style” cast works just like a C-style cast. It looks like a function call and even requires the type to have only one word: int not unsigned int. Be careful not to mistake it for a function call or class initialization.
```cpp
int i = 123;
float f = float(i);
DebugLog(f); // 123
```
dynamic_cast
All of the casts we’ve seen so far are “static.” That means the way they operate is determined at compile time and doesn’t depend on the run-time value of the expression being cast. For example, consider this downcast:
```cpp
void PrintZ(Vector2& vec)
{
    // Downcast
    Vector3& refVec3 = reinterpret_cast<Vector3&>(vec);

    // Undefined behavior if vec isn't really a Vector3
    DebugLog(refVec3.X, refVec3.Y, refVec3.Z);
}
```
Remember that reinterpret_cast generates no CPU instructions. This means the compiler isn’t generating any CPU instructions that would check if vec is really a Vector3. If it is, this code works just fine. If it’s not, reading Z will read the four bytes that come after wherever the Vector2 is in memory. That’s almost certainly not a valid Z value and will cause severe errors in our program logic when we try to use it that way. It’s also undefined behavior, so the compiler might generate surprising CPU instructions such as just skipping reading and printing Z altogether.
To address this issue, C++ has a “safe” cast called dynamic_cast. It works very similarly to C#’s only cast.
Like static_cast, a sequence of checks is performed to decide what the CPU should do. First, we can cast to the same type or to add const:
```cpp
// Cast to same type
Vector2 v{2, 4};
Vector2& r1 = v;
Vector2& r2 = dynamic_cast<Vector2&>(r1);
DebugLog(r2.X, r2.Y); // 2, 4

// Cast to add const
Vector2 const & r3 = dynamic_cast<Vector2 const &>(r1);
DebugLog(r3.X, r3.Y); // 2, 4
```
Second, if the value is null then the result is null:
```cpp
Vector2* p1 = nullptr;
Vector2* p2 = dynamic_cast<Vector2*>(p1);
DebugLog(p2); // 0
```
Third, we can upcast from a pointer or reference to a derived class to a pointer or reference to a base class:
```cpp
Vector3 vec;
vec.X = 2;
vec.Y = 4;
vec.Z = 6;
Vector3& r3 = vec;
Vector2& r2 = dynamic_cast<Vector2&>(r3);
DebugLog(r2.X, r2.Y); // 2, 4
```
Fourth, we can cast pointers to classes that have at least one virtual function to void* and we’ll get a pointer to the most-derived object that pointer points to:
```cpp
struct Combatant
{
    virtual ~Combatant()
    {
    }
};

struct Player : Combatant
{
    int32_t Id;
};

Player player;
player.Id = 123;
Combatant* p = &player;
void* pv = dynamic_cast<void*>(p); // Downcast to most-derived class: Player*
Player* p2 = reinterpret_cast<Player*>(pv);
DebugLog(p2->Id); // 123
```
Finally, we have the primary use case of dynamic_cast: a downcast from a pointer or reference to a base class to a pointer or reference to a derived class. This generates CPU instructions that examine the object being pointed to or referenced by the expression to cast. If that object really is an instance of the destination type, or of a class derived from it, and it contains only one sub-object of the base class, which may not be the case with non-virtual inheritance, then the cast succeeds with a pointer or reference to the derived class:
```cpp
Player player;
player.Id = 123;
Combatant* p = &player;
Player* p2 = dynamic_cast<Player*>(p); // Downcast
DebugLog(p2->Id); // 123
```
This can also be used to perform a “sidecast” from one base class to another base class:
```cpp
struct RangedWeapon
{
    float Range;

    virtual ~RangedWeapon()
    {
    }
};

struct MagicWeapon
{
    enum { FireType, WaterType, ArcaneType } Type;
};

struct Staff : RangedWeapon, MagicWeapon
{
    const char* Name;
};

Staff staff;
staff.Name = "Staff of Freezing";
staff.Range = 10.0f;
staff.Type = MagicWeapon::WaterType;

Staff& staffRef = staff;
RangedWeapon& rangedRef = staffRef; // Implicit conversion upcasts
MagicWeapon& magicRef = dynamic_cast<MagicWeapon&>(rangedRef); // Sidecast
DebugLog(magicRef.Type); // 1
```
If neither the downcast nor the sidecast succeeds, the cast fails. When pointers are being cast, the cast evaluates to a null pointer of the destination type. If references are being cast, a std::bad_cast exception is thrown:
```cpp
struct Combatant
{
    virtual ~Combatant()
    {
    }
};

struct Player : Combatant
{
    int32_t Id;
};

struct Enemy : Combatant
{
    int32_t Id;
};

// Make a Combatant: the base class
Combatant combatant;
Combatant* pc = &combatant;
Combatant& rc = combatant;

// Cast fails. Combatant object isn't a Player. Null returned.
Player* pp = dynamic_cast<Player*>(pc);
DebugLog(pp); // 0

try
{
    // Cast fails. Combatant object isn't a Player. std::bad_cast thrown.
    Player& rp = dynamic_cast<Player&>(rc);
    DebugLog(rp.Id); // Never called
}
catch (std::bad_cast const &)
{
    DebugLog("cast failed"); // Gets printed
}
```
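Because a failed pointer cast evaluates to null rather than throwing, pointer casts are often combined with an if check, much like C#’s as operator followed by a null check. Here’s a minimal sketch of that pattern. It re-declares the types above so it stands alone, and GetPlayerIdOrDefault is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

struct Combatant { virtual ~Combatant() { } };
struct Player : Combatant { std::int32_t Id = 0; };
struct Enemy : Combatant { std::int32_t Id = 0; };

// Returns the Player's Id, or defaultId if combatant isn't really a Player
std::int32_t GetPlayerIdOrDefault(Combatant* combatant, std::int32_t defaultId)
{
    // p is null if combatant is null or the run-time type check fails
    if (Player* p = dynamic_cast<Player*>(combatant))
    {
        return p->Id;
    }
    return defaultId;
}
```

Passing a pointer to a Player returns that player’s Id, while passing a pointer to an Enemy or a null pointer returns defaultId without any exception being thrown.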
Note that using dynamic_cast on this during a constructor is undefined behavior unless the destination type is the same class type or a base class type. We’ll see why in the next section.
Run-Time Type Information
In order to implement dynamic_cast, the compiler must generate what’s known as Run-Time Type Information or RTTI. The exact format of this information is compiler-specific, but the compiler will generate data to be used at runtime by dynamic_cast in order to determine the type of a particular object.
Since dynamic_cast only works on types with at least one virtual function, it can take advantage of the object’s virtual function table or “vtable.” This is a compiler-generated array of function pointers for all the virtual functions of a class. One table will be generated for each class in the inheritance hierarchy. A pointer to the table, known as a “virtual table pointer” or “vpointer,” will be added as a data member of all classes in the hierarchy and initialized during construction.
This virtual table pointer can therefore also be used to identify the class of an object since there is one virtual function table per class. The inheritance hierarchy is then conceptually expressed as a tree of virtual table pointers with implementation details varying by compiler.
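Whether a class has at least one virtual function, and so can be examined at run time this way, can be checked at compile time. This sketch uses std::is_polymorphic from the Standard Library’s <type_traits> header along with a template, neither of which this series has covered yet, so treat it as a preview. PlainVector, Shape, Circle, and HasVirtualFunctions are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

struct PlainVector { float X; float Y; };    // No virtual functions
struct Shape { virtual ~Shape() { } };       // Declares a virtual function
struct Circle : Shape { float Radius = 0; }; // Inherits a virtual function

// True if T declares or inherits at least one virtual function
template <typename T>
constexpr bool HasVirtualFunctions()
{
    return std::is_polymorphic<T>::value;
}

static_assert(!HasVirtualFunctions<PlainVector>(), "no virtual functions");
static_assert(HasVirtualFunctions<Shape>(), "declares a virtual function");
static_assert(HasVirtualFunctions<Circle>(), "inherits a virtual function");
```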
Because all this RTTI data adds to the executable size, many compilers allow it to be disabled. That also disables dynamic_cast as it depends on RTTI.
typeid
There is one other use of RTTI: the typeid operator. It’s used to get information about a type, similar to typeof or GetType in C#. The operand can either be named statically like typeof in C# or dynamically like GetType in C# to look up the type based on an object’s value. The C++ Standard Library’s <typeinfo> header is required to use this.
```cpp
// Static usage based on type
std::type_info const & ti1{typeid(Combatant)};

// Dynamic usage based on variable with a virtual function
Enemy enemy;
std::type_info const & ti2{typeid(enemy)};

// Dynamic usage based on variable with no virtual function
// Equivalent to static usage: typeid(int)
int i = 123;
std::type_info const & ti3{typeid(i)};
```
It evaluates to a const std::type_info which has only a few useful members:
- operator==(const std::type_info&) and operator!=(const std::type_info&) to compare types
- std::size_t hash_code() to get an integer that’s always the same for a given type
- const char* name() to get the type’s string name
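As a rough sketch of how these members might be used, the following compares the run-time types of two objects seen through base class references. The types are re-declared so the block stands alone, and SameDynamicType and SameHashCode are hypothetical helper names.

```cpp
#include <cassert>
#include <typeinfo>

struct Combatant { virtual ~Combatant() { } };
struct Player : Combatant { };
struct Enemy : Combatant { };

// Compares the run-time types of two objects using operator==
bool SameDynamicType(Combatant const & a, Combatant const & b)
{
    // Combatant has a virtual function, so typeid looks at the actual objects
    return typeid(a) == typeid(b);
}

// Compares using hash_code: the same type always gives the same hash code
bool SameHashCode(Combatant const & a, Combatant const & b)
{
    return typeid(a).hash_code() == typeid(b).hash_code();
}
```

Two Player objects compare as the same type even when both are passed as Combatant references, while a Player and an Enemy don’t.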
When typeid is applied to a dereferenced null pointer to a class with at least one virtual function, a std::bad_typeid is thrown:
```cpp
Enemy* pe = nullptr;
try
{
    // Doesn't dereference null
    // Instead, attempts to get the type_info for what pe points to
    std::type_info const & ti{typeid(*pe)};

    // Not printed
    DebugLog(ti.name());
}
catch (std::bad_typeid const &)
{
    DebugLog("bad typeid call"); // Is printed
}
```
One common surprise with typeid is that it ignores const:
```cpp
DebugLog(typeid(int) == typeid(const int)); // true
```
Another is that the name member function doesn’t return any specific string. That string is also usually some compiler-specific code that may or may not have the name of the type from the source code:
```cpp
// All of these will vary from compiler to compiler
DebugLog(typeid(int).name()); // i
DebugLog(typeid(long).name()); // l
DebugLog(typeid(Enemy).name()); // 5Enemy
```
One more is that the std::type_info for one call might not be the same object as the std::type_info for another call, even if they’re the same type. Types should be compared with operator== or the hash_code member function rather than by address:
```cpp
DebugLog(&typeid(int) == &typeid(int)); // Maybe false
DebugLog(typeid(int).hash_code() == typeid(int).hash_code()); // Always true
```
Conclusion
As is often the case when comparing the two languages, C++ provides many options when C# provides only a few. In the case of casting, C++ provides a wide variety of named, C-style, and function-style casts for specific purposes while C# essentially only provides dynamic_cast.
When used appropriately, this can make many casts “free” as no CPU instructions will be generated and no size will be added to the executable. When used inappropriately, undefined behavior may cause severe errors such as crashes and data corruption. It’s up to us as programmers to know the rules of casting and to judiciously choose the appropriate cast for our task. The consequences, even with thrown exceptions in C#, of careless casting really demand that we exercise caution regardless of language and cast type.
#1 by Roman on February 16th, 2021 ·
In the first code example you have `const x`, but in the text block after code you write “that is ok, since X was not a const”
I guess it’s a typo and X shouldn’t be a const?
#2 by jackson on February 20th, 2021 ·
Thanks for pointing this out! I’ve updated the article with a fix to make x non-const.
#3 by TonyH on February 24th, 2022 ·
In static_cast downcasting example you have used reinterpret_cast instead.
#4 by jackson on February 24th, 2022 ·
Thanks for letting me know about this. I’ve fixed it in the article.