With constructors under our belts, we can now talk about initialization of structs and other types. This is a far more complex topic than in C#. Read on to learn a lot of nitty-gritty details!

Explicit Constructors

Before we get to initialization, we need to talk a little more about how struct objects are created. First, all of the constructors we write may be optionally declared as explicit:

struct Vector2
{
    float X;
    float Y;
 
    explicit Vector2(const Vector2& other)
    {
        X = other.X;
        Y = other.Y;
    }
};

In C++20, this can be conditional on a compile-time constant expression put into parentheses after the explicit keyword:

struct Vector2
{
    float X;
    float Y;
 
    explicit (2 > 1) Vector2(const Vector2& other)
    {
        X = other.X;
        Y = other.Y;
    }
};

When a constructor is explicit, it’s no longer considered a “converting constructor.” As we’ll see below, some forms of initialization will no longer allow the constructor to be called implicitly.
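As a minimal sketch of that difference, consider two hypothetical types, Meters and Feet, that are not part of the running Vector2 example. They differ only in whether their single-argument constructor is explicit. The commented-out line is the one the compiler would be expected to reject:

```cpp
#include <cassert>

// Hypothetical types for illustration only
struct Meters
{
    float Value;

    // explicit: not a converting constructor
    explicit Meters(float value)
        : Value(value)
    {
    }
};

struct Feet
{
    float Value;

    // Not explicit: a converting constructor
    Feet(float value)
        : Value(value)
    {
    }
};

float TotalFeet(Feet a, Feet b)
{
    return a.Value + b.Value;
}

float ExplicitDemo()
{
    Meters m(2.0f);       // OK: direct initialization names the constructor
    // Meters bad = 2.0f; // Compiler error: constructor is explicit
    Feet f = 3.0f;        // OK: float is implicitly converted to Feet

    // 4.0f is implicitly converted to a Feet argument
    float total = m.Value + TotalFeet(f, 4.0f);
    assert(total == 9.0f);
    return total;
}
```

Marking single-argument constructors explicit is a common way to keep a plain float from silently turning into a struct at a call site.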

User-Defined Conversion Operators

As with C#, we can write our own conversion operators from a struct to any other type:

struct Vector2
{
    float X;
    float Y;
 
    operator bool()
    {
        return X != 0 || Y != 0;
    }
};

Also as in C#, these can be explicit.

struct Vector2
{
    float X;
    float Y;
 
    explicit operator bool()
    {
        return X != 0 || Y != 0;
    }
};

They can also be conditionally explicit as of C++20:

struct Vector2
{
    float X;
    float Y;
 
    explicit (2 > 1) operator bool()
    {
        return X != 0 || Y != 0;
    }
};

There’s no implicit keyword like we have in C#. To make one implicit, just don’t add explicit.

In C#, user-defined conversion operators are static and take an argument of the same type as the struct they’re defined in. In C++, they’re non-static and this is used implicitly or explicitly instead of the argument.

Like other overloaded operators, they may be called explicitly. It’s rare to see this, but it’s allowed:

Vector2 v1;
v1.X = 2;
v1.Y = 4;
bool b = v1.operator bool();

Initialization Types

C++ classifies initialization into the following types:

  • Default
  • Aggregate
  • Constant
  • Copy
  • Direct
  • List
  • Reference
  • Value
  • Zero

The rules for one initialization type frequently defer to the rules for another. This is similar to a function calling another function. It creates a dependency of one type on another type. These dependencies frequently form cycles in the graph, which looks roughly like this:

[Figure: Initialization Type Dependencies]

This means that as we go through the initialization types we’re going to refer to other initialization types that we haven’t seen yet. Feel free to jump ahead to the referenced type or come back to revisit a type after reading about its references later on in the article.

As for terminology, we often say that a variable is “X-initialized” to mean that it is initialized using the rules of the “X” initialization type. For example, “MyVar is direct-initialized” means “MyVar is initialized according to the rules of the direct initialization type.”

Default Initialization

Default initialization happens when a variable is declared with no initializer:

T object;

It also happens when calling a constructor that doesn’t mention a data member:

struct HasDataMember
{
    T Object;
    int X;
 
    HasDataMember()
        : X(123) // No mention of Object
    {
    }
};

If the type (T) is a struct, its default constructor is called. If it’s an array, every element of the array is default-initialized:

struct ConstructorLogs
{
    ConstructorLogs()
    {
        DebugLog("default");
    }
};
 
ConstructorLogs single; // Prints "default"
ConstructorLogs array[3]; // Prints "default", "default", "default"
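The data member case can be sketched the same way. Here Inner and Outer are hypothetical names standing in for T and HasDataMember. Because Outer's constructor never mentions Object, Object is default-initialized, which calls Inner's default constructor:

```cpp
#include <cassert>

// Hypothetical types for illustration only
struct Inner
{
    int Value;

    Inner()
    {
        Value = 7;
    }
};

struct Outer
{
    Inner Object;
    int X;

    Outer()
        : X(123) // No mention of Object, so it's default-initialized
    {
    }
};

int SumAfterConstruction()
{
    Outer o; // Default-initialized: calls Outer's default constructor
    assert(o.Object.Value == 7); // Set by Inner's default constructor
    return o.Object.Value + o.X;
}
```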

For all other types, nothing happens. This includes primitives, enums, and pointers. Reading one of these objects before a value is assigned to it is undefined behavior and may cause severe errors since the compiler can generate any code it wants to.

float f;
DebugLog(f); // Undefined behavior!

Default initialization isn’t allowed for these kinds of variables if they’re const since there would be no way to initialize them later:

void Foo()
{
    const float f; // Compiler error: default initializer does nothing
}

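Static variables, including both static data members of structs and global variables, are an exception to the “nothing happens” rule. They are zero-initialized first, so reading them is well-defined. The const restriction still applies to them, though:

float f; // OK: this is a global variable, so it's zero-initialized
const float g; // Compiler error: const still requires an initializer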
 
struct HasStatic
{
    static float X;
};
float HasStatic::X; // OK: this is a static data member
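A small sketch of the zero-initialization guarantee for statics, using hypothetical names (gSpeed, gTarget, and Stats are not from the examples above):

```cpp
#include <cassert>

float gSpeed; // Global: zero-initialized
int* gTarget; // Global pointer: zero-initialized to null

struct Stats
{
    static int Kills;
};
int Stats::Kills; // Static data member: zero-initialized

bool StaticsStartAtZero()
{
    // Safe to read: none of these are indeterminate
    return gSpeed == 0.0f && gTarget == nullptr && Stats::Kills == 0;
}
```

The same declarations as local variables inside a function would hold indeterminate values instead.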

Default initialization of a const variable is allowed if there’s a default constructor to call because that initializes the variable:

const HasDataMember single; // OK: calls default constructor
 
struct NoDefaultConstructor
{
    NoDefaultConstructor() = delete;
};
 
const NoDefaultConstructor ndc; // Compiler error: no default constructor

References, both lvalue and rvalue, are never default-initialized. They have their own initialization type which we’ll cover below: reference initialization.

Copy Initialization

Copy initialization has several forms:

// Assignment style
T object = other;
 
// Function call
func(other)
 
// Return value
return other;
 
// Array assigned to curly braces
T array[N] = {other};

For the first three forms, only one object is involved. That object’s copy constructor is called with other being passed in as the argument:

struct Logs
{
    Logs() = default;
 
    Logs(const Logs& logs)
    {
        DebugLog("copy");
    }
};
 
Logs Foo(Logs a)
{
    Logs b = a; // "copy" for assignment style
    return a; // "copy" for return value
}
 
Logs x;
Foo(x); // "copy" for function call

This is no longer allowed if the copy constructor is explicit:

struct Logs
{
    Logs() = default;
 
    explicit Logs(const Logs& logs)
    {
        DebugLog("copy");
    }
};
 
Logs Foo(Logs a)
{
    Logs b = a; // Compiler error: copy constructor is explicit
    return a; // Compiler error: copy constructor is explicit
}
 
Logs x;
Foo(x); // Compiler error: copy constructor is explicit

User-defined conversion operators can also be called by the same three forms of copy initialization:

struct ConvertLogs
{
    ConvertLogs() = default;
 
    operator bool()
    {
        DebugLog("convert");
        return true;
    }
};
 
bool Foo(bool b)
{
    ConvertLogs x;
    return x; // "convert" for return value
}
 
ConvertLogs x;
bool b = x; // "convert" for assignment style
 
Foo(x); // "convert" for function call

The return value of the user-defined conversion operator, a bool in this example, is then used to direct-initialize the variable.

As with the copy constructor, making the user-defined conversion operator explicit disables copy initialization and makes all of these “convert” lines generate compiler errors just like when we made the copy constructor explicit.
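Here is a minimal sketch of what still works once the operator is explicit. Health is a hypothetical type. Direct initialization and an explicit cast are both expected to compile, while the commented-out copy initialization is not:

```cpp
#include <cassert>

// Hypothetical type for illustration only
struct Health
{
    int Points;

    explicit operator bool() const
    {
        return Points > 0;
    }
};

bool IsAlive(const Health& h)
{
    // bool alive = h;                // Compiler error: operator is explicit
    bool direct(h);                   // OK: direct initialization
    bool cast = static_cast<bool>(h); // OK: explicit cast
    assert(direct == cast);
    return direct;
}
```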

For non-struct types like primitives, enums, and pointers, the value is simply copied:

int x = y;

The last form deals with arrays. This happens during aggregate initialization.

Aggregate Initialization

Aggregate initialization has the following forms:

// Assign curly braces
T object = { val1, val2 };
 
// No-assign curly braces
T object{ val1, val2 };
 
// Assign curly braces with "designators" (data member names)
T object = { .designator=val1, .designator=val2 };
 
// No-assign curly braces with "designators" (data member names)
T object{ .designator=val1, .designator=val2 };
 
// Parentheses
T object(val1, val2);

All of these forms work on types (T) that are considered “aggregates.” That includes arrays and structs that don’t declare any constructors. Before C++20, structs whose only constructors were declared with = default also counted as aggregates.

The elements of these arrays and data members of these structs are copy-initialized with the given values: val1, val2, etc. This is done in index order starting at the first element for arrays. With structs, this is done in the order that data members are declared, just like a constructor’s initializer list does.

Designators are available as of C++20, as is the parentheses form. Designators are similar to C#’s “object initializers”: Vector2 vec = new Vector2 { X=2, Y=4 };. They must be in the same order as the struct’s data members and can’t be mixed with values that have no designator.

struct Vector2
{
    float X;
    float Y;
};
 
Vector2 v1 = { 2, 4 };
DebugLog(v1.X, v1.Y); // 2, 4
 
Vector2 v2{2, 4};
DebugLog(v2.X, v2.Y); // 2, 4
 
Vector2 v3 = { .X=2, .Y=4 };
DebugLog(v3.X, v3.Y); // 2, 4
 
Vector2 v4{ .X=2, .Y=4 };
DebugLog(v4.X, v4.Y); // 2, 4
 
Vector2 v5(2, 4);
DebugLog(v5.X, v5.Y); // 2, 4

It’s a compiler error to pass more values than there are data members or array elements:

Vector2 v6 = {2, 4, 6}; // Compiler error: more values than data members
float a1[2] = {2, 4, 6}; // Compiler error: too many array elements

We can, however, pass fewer values than data members or array elements. The remaining data members are initialized with their default member initializers. If they don’t have default member initializers, they’re copy-initialized from an empty list ({}).

struct DefaultedVector2
{
    float X = 1;
    float Y;
};
 
DefaultedVector2 dv1 = {2};
DebugLog(dv1.X, dv1.Y); // 2, 0
 
float a2[2] = {2};
DebugLog(a2[0], a2[1]); // 2, 0
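The fallback order can be sketched with a slightly larger struct. WindowSettings is a hypothetical type: one value is passed, one member falls back to its default member initializer, and one has no default and so is initialized from an empty list:

```cpp
#include <cassert>

// Hypothetical type for illustration only
struct WindowSettings
{
    int Width = 640;
    int Height = 480;
    bool Fullscreen; // No default member initializer
};

WindowSettings MakeSettings()
{
    // Only Width is passed. Height uses its default member initializer.
    // Fullscreen is copy-initialized from {} and becomes false.
    WindowSettings s = {800};
    return s;
}

int SumOfPartialArray()
{
    // The last two elements are copy-initialized from {} and become 0
    int values[4] = {1, 2};
    assert(values[2] == 0 && values[3] == 0);
    return values[0] + values[1] + values[2] + values[3];
}
```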

If a data member is an lvalue or rvalue reference, not passing it is a compiler error because it could never be initialized later on due to how references work.

struct HasRef
{
    int X;
    int& R;
};
 
HasRef hr = {123}; // Compiler error: reference data member not initialized

There are special rules for aggregate-initializing arrays from string literals:

// a1 has length 4 and contains: 'a', 'b', 'c', 0
char a1[4] = "abc";
 
// Length is optional. This is the same as a1.
char a2[] = "abc";
 
// Curly braces are optional. This is the same as a1.
char a3[] = {"abc"};
 
// Compiler error: array too small to fit the string literal's contents
char a4[1] = "abc";
 
// Extra array elements are zero-initialized
// a5 has length 6 and contains: 'a', 'b', 'c', 0, 0, 0
char a5[6] = "abc";
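These string literal rules can be checked with sizeof and a few element reads. The function names in this sketch are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>

std::size_t DeducedLength()
{
    // Length is deduced: three characters plus the terminating 0
    char text[] = "abc";
    return sizeof(text);
}

bool TailIsZeroed()
{
    // Elements past the terminating 0 are zero-initialized too
    char padded[6] = "abc";
    assert(padded[0] == 'a' && padded[2] == 'c');
    return padded[3] == 0 && padded[4] == 0 && padded[5] == 0;
}
```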

List Initialization

There are two sub-types of list initialization. First, “direct list initialization” has these forms:

// Named variable
T object{val1, val2};
 
// Unnamed temporary variable
T{val1, val2}
 
struct MyStruct
{
    // Data member
    T member{val1, val2};
};
 
MyStruct::MyStruct()
    // Initializer list entry
    : member{val1, val2}
{
}

Second, there’s “copy list initialization” with these forms:

// Named variable
T object = {val1, val2};
 
// Function call
func({val1, val2})
 
// Return value
return {val1, val2};
 
// Overloaded subscript operator call
object[{val1, val2}]
 
// Assignment
object = {val1, val2}
 
struct MyStruct
{
    // Data member
    T member = {val1, val2};
};

The compiler chooses what to do by essentially using a pretty long series of if-else decisions.

First, if there’s a single value of the same type then it copy-initializes for copy list initialization and direct-initializes for direct list initialization:

Vector2 vec;
vec.X = 2;
vec.Y = 4;
 
// Direct list initialization direct-initializes vecA with vec
Vector2 vecA{vec};
DebugLog(vecA.X, vecA.Y); // 2, 4
 
// Copy list initialization copy-initializes vecB with vec
Vector2 vecB = {vec};
DebugLog(vecB.X, vecB.Y); // 2, 4

Second, if the variable is a character array and there’s a single value that’s a string literal of the same character type then the array is initialized from the string literal as described in aggregate initialization:

char array[2] = {"x"}; // Initialized from the string literal
DebugLog(array[0]); // x

Third, if the variable to initialize is an aggregate type then it’s aggregate-initialized:

Vector2 vec = {2, 4}; // Aggregate-initialized
DebugLog(vec.X, vec.Y); // 2, 4

Fourth, if no values are passed in the curly braces and the variable to initialize is a struct with a default constructor then it’s value-initialized:

struct NonAggregateVec2
{
    float X;
    float Y;
 
    NonAggregateVec2()
    {
        X = 2;
        Y = 4;
    }
};
 
NonAggregateVec2 vec = {}; // Value-initialized
DebugLog(vec.X, vec.Y); // 2, 4

Fifth, if the variable has a constructor that takes only the Standard Library’s std::initializer_list type then that constructor is called. We haven’t covered any of the Standard Library yet, but the details of this type aren’t really important at this point. Suffice to say that this is the C++ equivalent to initializing collections in C#: List<int> list = new List<int> { 2, 4 };.

struct InitListVec2
{
    float X;
    float Y;
 
    InitListVec2(std::initializer_list<float> vals)
    {
        X = *vals.begin();
        Y = *(vals.begin() + 1);
    }
};
 
InitListVec2 vec = {2, 4};
DebugLog(vec.X, vec.Y); // 2, 4

Sixth, if any constructor matches the passed values then the one that matches best is called:

struct MultiConstructorVec2
{
    float X;
    float Y;
 
    MultiConstructorVec2(float x, float y)
    {
        X = x;
        Y = y;
    }
 
    MultiConstructorVec2(double x, double y)
    {
        X = x;
        Y = y;
    }
};
 
MultiConstructorVec2 vec1 = {2.0f, 4.0f}; // Call (float, float) version
DebugLog(vec1.X, vec1.Y); // 2, 4
 
MultiConstructorVec2 vec2 = {2.0, 4.0}; // Call (double, double) version
DebugLog(vec2.X, vec2.Y); // 2, 4
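The ordering of the fifth and sixth steps matters when a struct has both kinds of constructor. In this sketch, Samples is a hypothetical type whose two constructors record which one ran. Curly braces are expected to pick the std::initializer_list constructor, while parentheses (direct initialization) pick the two-int constructor:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical type for illustration only
struct Samples
{
    std::size_t Count;
    bool UsedInitList;

    Samples(int, int)
        : Count(2), UsedInitList(false)
    {
    }

    Samples(std::initializer_list<int> vals)
        : Count(vals.size()), UsedInitList(true)
    {
    }
};

bool BracesPickInitList()
{
    Samples s = {2, 4}; // List initialization: initializer_list constructor
    assert(s.Count == 2);
    return s.UsedInitList;
}

bool ParensPickTwoInts()
{
    Samples s(2, 4); // Direct initialization: (int, int) constructor
    return s.UsedInitList;
}
```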

Seventh, if the variable is an enumeration with a fixed underlying type, a single value that converts to that underlying type is passed, and direct list initialization is used, then as of C++17 the variable is initialized with that value converted to the enumeration type:

enum struct Color : uint32_t
{
    Blue = 0x0000ff
};
 
Color c{0x0000ff}; // Direct list initialization from an integer value
DebugLog(c); // 255

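Eighth, if the variable isn’t a struct or a reference and only one value is passed, then the variable is initialized with that value. It’s direct-initialized for direct list initialization or, as in this example, copy-initialized for copy list initialization: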

float f = {3.14f};
DebugLog(f); // 3.14

Ninth, if the variable isn’t a struct, the curly braces have only one value, and the variable isn’t a reference or is a reference to the type of the single value, then the variable is direct-initialized for direct list initialization or copy-initialized for copy list initialization with the value:

float f = 3.14f;
 
float& r1{f}; // Direct list initialization direct-initializes
DebugLog(r1); // 3.14
 
float& r2 = {f}; // Copy list initialization copy-initializes
DebugLog(r2); // 3.14
 
float r3{f}; // Direct list initialization direct-initializes
DebugLog(r3); // 3.14
 
float r4 = {f}; // Copy list initialization copy-initializes
DebugLog(r4); // 3.14

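Tenth, if the variable is a reference to a different type than the one value passed then a temporary of the reference’s type is created, list-initialized from the value, and bound to the reference. The reference must be const (or an rvalue reference) for this to work. List initialization also rejects narrowing conversions such as float to int32_t, so this example widens an integer instead:

int32_t i = 3;
 
const int64_t& r1 = {i}; // Binds to a temporary int64_t
DebugLog(r1); // 3
 
int64_t& r2 = {i}; // Compiler error: not const
DebugLog(r2);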

Eleventh, and lastly, if no values are passed then the variable is value-initialized:

float f = {};
DebugLog(f); // 0

One final detail to note is that the values passed in the curly braces are evaluated in order. This is unlike the arguments passed to a function, which are evaluated in an order determined by the compiler.
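A small sketch of that ordering guarantee, using a hypothetical counter function with a side effect. Because the values in the curly braces are evaluated left to right, the first data member always receives the smaller ID:

```cpp
#include <cassert>

// Hypothetical helpers for illustration only
int gNextId = 0;

int NextId()
{
    return ++gNextId;
}

struct IdPair
{
    int First;
    int Second;
};

IdPair MakeIdPair()
{
    gNextId = 0;

    // Evaluated in order: First gets 1, Second gets 2
    IdPair ids{NextId(), NextId()};
    assert(ids.First < ids.Second);
    return ids;
}
```

Had these two calls been passed as arguments to a function instead, either call could run first.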

Reference Initialization

As mentioned above, references have their own type of initialization. Here are the forms it takes:

// lvalue reference variables
T& ref = object;
T& ref = {val1, val2};
T& ref(object);
T& ref{val1, val2};
 
// rvalue reference variables
T&& ref = object;
T&& ref = {val1, val2};
T&& ref(object);
T&& ref{val1, val2};
 
// Function calls
/* Assume */ void func(T& val); /* or */ void func(T&& val);
func(object)
func({val1, val2})
 
// Return values
T& func() { T t; return t; }
T&& func() { return T(); }
 
// Constructor initializer lists
MyStruct::MyStruct()
    : lvalueRef(object)
    , rvalueRef(object)
{
}

If curly braces are provided, the reference is list-initialized:

float&& f = {3.14f};
DebugLog(f); // 3.14

Otherwise, the reference follows reference initialization rules. These are effectively another series of if-else decisions, but a much shorter series than with list initialization.

First, for lvalue references of the same type the reference simply binds to the passed object:

float f = 3.14f;
float& r = f;
DebugLog(r); // 3.14

When the variable is an lvalue reference but it has a different type than the passed object, if there’s a user-defined conversion function then it’s called and the variable is bound to the return value:

float pi = 3.14f;
 
struct ConvertsToPi
{
    operator float&()
    {
        return pi;
    }
};
 
ConvertsToPi ctp;
float& r = ctp; // User-defined conversion operator called
DebugLog(r); // 3.14

In all other cases the passed expression is evaluated into a temporary variable and the reference is bound to that:

float Add(float a, float b)
{
    return a + b;
}
 
// Call function, store return value in temporary, bind reference to temporary
float&& sum = Add(2, 4);
DebugLog(sum); // 6

Temporary variables created by reference initialization have their lifetimes extended to match the lifetime of the reference. There are a few exceptions. First, a temporary bound to a returned reference is always “dangling” because the temporary ends its lifetime when the function exits. Second, and similarly, function arguments end their lifetime when the function exits, so returned references to them dangle too.

float&& Dangling1()
{
    return 3.14f; // Returned temporary ends its lifetime here
}
 
float& Dangling2(float x)
{
    return x; // Returned argument ends its lifetime here
}
 
DebugLog(Dangling1()); // Undefined behavior
DebugLog(Dangling2(3.14f)); // Undefined behavior

Third, the reference data members or elements of an aggregate only have their lifetime extended when curly braces, not parentheses, are used:

struct HasRvalueRef
{
    float&& Ref;
};
 
// Curly braces used. Lifetime of float with 3.14f value extended.
HasRvalueRef hrr1{3.14f};
DebugLog(hrr1.Ref); // 3.14
 
// Parentheses used. Lifetime of float with 3.14f value NOT extended.
HasRvalueRef hrr2(3.14f);
DebugLog(hrr2.Ref); // Undefined behavior. Ref has ended its lifetime.
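The basic lifetime extension rule can be observed by counting live objects. This is a rough sketch with a hypothetical Tracked type whose constructors and destructor adjust a global counter:

```cpp
#include <cassert>

int gAlive = 0; // Number of Tracked objects currently alive

// Hypothetical type for illustration only
struct Tracked
{
    Tracked() { ++gAlive; }
    Tracked(const Tracked&) { ++gAlive; }
    ~Tracked() { --gAlive; }
};

Tracked Make()
{
    return Tracked();
}

int AliveWhileBound()
{
    // The temporary's lifetime is extended to match ref
    const Tracked& ref = Make();
    (void)ref;
    return gAlive; // ref is still in scope here
}

int AliveAfterDiscard()
{
    Make(); // No reference bound: destroyed at the end of this statement
    return gAlive;
}
```

The counter reads 1 while the reference is in scope and drops back to 0 once the function returns.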

Value Initialization

Value initialization can look like this:

// Variable
T object{};
 
// Temporary variable (i.e. it has no name)
T()
T{}
 
// Initialize a data member in an initializer list
MyStruct::MyStruct()
    : member1() // Parentheses version
    , member2{} // Curly braces version
{
}

Value initialization always defers to another type of initialization. Here’s how it decides which type to use:

If curly braces are used and the variable is an aggregate, it’s aggregate-initialized.

Vector2 vec{}; // Aggregate initialization with no values
DebugLog(vec.X, vec.Y); // 0, 0

If the variable is a struct that doesn’t have a default constructor but it does have a constructor that takes only a std::initializer_list, the variable is list-initialized with an empty list (i.e. {}).

struct InitListVec2
{
    float X;
    float Y;
 
    InitListVec2(std::initializer_list<float> vals)
    {
        int index = 0;
        float x = 0;
        float y = 0;
        for (float cur : vals)
        {
            switch (index)
            {
                case 0: x = cur; break;
                case 1: y = cur; break;
            }
            index++; // Advance to the next data member
        }
        X = x;
        Y = y;
    }
};
 
InitListVec2 vec{}; // List initialization (passes empty list)
DebugLog(vec.X, vec.Y); // 0, 0

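If the variable is a struct whose default constructor we wrote ourselves rather than letting the compiler generate it, the variable is default-initialized, which calls that constructor. If the default constructor is missing or, as of C++20, deleted, this is a compiler error.

struct Vector2
{
    float X;
    float Y;
 
    Vector2()
    {
        X = 2;
        Y = 4;
    }
};
 
Vector2 vec{}; // Default-initialized: calls the default constructor
DebugLog(vec.X, vec.Y); // 2, 4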

If the default constructor was generated by the compiler, the variable is zero-initialized and then default-initialized, which applies any default member initializers (i.e. float X = 0;). This example uses parentheses because curly braces on an aggregate like this one would be handled by the aggregate rule above.

struct Vector2
{
    float X = 2;
    float Y = 4;
};
 
Vector2 vec = Vector2(); // Zero initialization then default initialization
DebugLog(vec.X, vec.Y); // 2, 4

If the variable is an array, each element is value-initialized.

float arr[2]{}; // Elements value-initialized
DebugLog(arr[0], arr[1]); // 0, 0

If none of the above apply, the variable is zero-initialized.

float x{}; // Zero-initialized
DebugLog(x); // 0
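The zeroing cases can be gathered into one sketch. Plain is a hypothetical struct with a compiler-generated default constructor and no default member initializers, so value initialization leaves every member at zero:

```cpp
#include <cassert>

// Hypothetical type for illustration only
struct Plain
{
    int A;
    float B;
};

bool ValueInitZeroes()
{
    int i = int();     // Primitive: zero-initialized
    float* p{};        // Pointer: zero-initialized to null
    Plain s = Plain(); // Compiler-generated constructor: zero-initialized
    int arr[3]{};      // Array: each element value-initialized

    assert(p == nullptr);
    return i == 0 && s.A == 0 && s.B == 0.0f
        && arr[0] == 0 && arr[1] == 0 && arr[2] == 0;
}
```

Dropping the parentheses or curly braces from any of these local variables would make it default-initialized instead, leaving its value indeterminate.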

Direct Initialization

Here are the forms direct initialization can take:

// Parentheses with single value
T object(val);
 
// Parentheses with multiple values
T object(val1, val2);
 
// Curly braces with single value
T object{val};
 
MyStruct::MyStruct()
    // Parentheses in initializer list
    : member(val1, val2)
{
}

All of these look for a constructor matching the passed values. If one is found, the one that matches best is called to initialize the variable.

struct MultiConstructorVec2
{
    float X;
    float Y;
 
    MultiConstructorVec2(float x, float y)
    {
        X = x;
        Y = y;
    }
 
    MultiConstructorVec2(double x, double y)
    {
        X = x;
        Y = y;
    }
};
 
MultiConstructorVec2 vec1{2.0f, 4.0f}; // Call (float, float) version
DebugLog(vec1.X, vec1.Y); // 2, 4
 
MultiConstructorVec2 vec2{2.0, 4.0}; // Call (double, double) version
DebugLog(vec2.X, vec2.Y); // 2, 4

If no constructor matches but the variable is an aggregate, whether a struct or an array, the variable is aggregate-initialized.

struct Vector2
{
    float X;
    float Y;
};
 
// No constructor matches, but Vector2 is an aggregate
Vector2 vec{2, 4}; // Aggregate initialization
DebugLog(vec.X, vec.Y); // 2, 4

As of C++20, the parentheses form also works when the variable is an array. In this case the rules of aggregate initialization apply. For example, passing too many values is a compiler error.

float a1[2](2, 4); // Aggregate initialization
DebugLog(a1[0], a1[1]); // 2, 4
 
float a2[2](2, 4, 6, 8); // Compiler error: too many values

There’s one type-specific exception to this. If the variable is a bool and the value is nullptr, the variable becomes false.

bool b{nullptr};
DebugLog(b); // false

One common mistake with the parentheses forms of direct initialization is to create ambiguity between initialization of a variable and a function declaration. Consider this code:

struct Enemy
{
    float X;
    float Y;
};
 
struct Vector2
{
    float X;
    float Y;
 
    Vector2() = default;
 
    Vector2(Enemy enemy)
    {
        X = enemy.X;
        Y = enemy.Y;
    }
};
 
Vector2 defaultEnemySpawnPoint(Enemy());

The last line is ambiguous. The naming makes us think it’s a variable with type Vector2 named defaultEnemySpawnPoint that’s being direct-initialized with a value-initialized temporary Enemy variable.

Another way to read that line is that it declares a function named defaultEnemySpawnPoint that returns a Vector2 and takes an unnamed pointer to a function that takes no arguments and returns an Enemy. In that alternate reading, we could write code like this:

// Definition of a function that satisfies the function pointer type
Enemy cb()
{
    return {};
}
 
// Definition of the above declaration, intentional or not
Vector2 defaultEnemySpawnPoint(Enemy())
{
    return {};
}
 
// It can be called with 'cb' as the function pointer argument
defaultEnemySpawnPoint(cb);

The compiler always chooses the function declaration when this ambiguity arises. That means the above code is valid and actually works, but we’ll get errors if we try to use defaultEnemySpawnPoint like a variable when it’s actually a function:

// Compiler error: defaultEnemySpawnPoint is a function
// Functions have no X or Y data members to get
DebugLog(defaultEnemySpawnPoint.X, defaultEnemySpawnPoint.Y);

Thankfully, it’s easy to resolve the ambiguity by simply using the curly braces form of direct-initialization because the function pointer syntax doesn’t use curly braces:

Vector2 defaultEnemySpawnPoint(Enemy{});
DebugLog(defaultEnemySpawnPoint.X, defaultEnemySpawnPoint.Y); // 0, 0
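Both readings can be checked at compile time. In the sketch below, looksLikeVariable is a hypothetical name, and std::is_function from the Standard Library's type_traits header reports that the parentheses-only line really declares a function. Wrapping the argument in an extra pair of parentheses is another way to force the variable reading:

```cpp
#include <cassert>
#include <type_traits>

struct Enemy
{
    float X;
    float Y;
};

struct Vector2
{
    float X;
    float Y;

    Vector2() = default;

    Vector2(Enemy enemy)
        : X(enemy.X), Y(enemy.Y)
    {
    }
};

// Parsed as a function declaration, not a variable
Vector2 looksLikeVariable(Enemy());
static_assert(
    std::is_function<decltype(looksLikeVariable)>::value,
    "declared as a function");

float SpawnPointSum()
{
    Vector2 braces(Enemy{});   // Variable: curly braces can't be a parameter
    Vector2 parens((Enemy())); // Variable: extra parentheses also work
    return braces.X + braces.Y + parens.X + parens.Y;
}
```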

Constant Initialization

Constant initialization has just two forms:

T& ref = constantExpression;
T object = constantExpression;

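Both of these only apply when the variable is static, such as global variables and static struct data members, and is initialized with a constant expression. Static variables that don’t qualify are zero-initialized instead, and any other initialization they need happens later at runtime.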

struct Player
{
    static const int32_t MaxHealth;
 
    int32_t Health;
};
 
// Constant-initialize a data member
const int32_t Player::MaxHealth = 100;
 
// Constant-initialize a global reference
const int32_t& defaultHealth = Player::MaxHealth;

This initialization happens before all other initialization, so it’s safe to read from these variables during other kinds of initialization. That’s even the case if that other initialization appears before the constant initialization:

struct Player
{
    static const int32_t MaxHealth;
 
    int32_t Health;
};
 
// 2) Aggregate initialization
Player localPlayer{Player::MaxHealth};
 
// 1) Constant initialization
const int32_t Player::MaxHealth = 100;
const int32_t& defaultHealth = Player::MaxHealth;
 
// 3) Normal code, not initialization
DebugLog(localPlayer.Health); // 100

Zero Initialization

Lastly, we have zero initialization. Unlike all the other types, it doesn’t have any explicit forms. Instead, as we’ve seen above, other types of initialization may result in zero initialization:

// Static variable that's not constant-initialized
// Zero initialization still happens before other types of initialization
static T object;
 
// During value initialization for non-struct types
// Includes struct data members and array elements
T();
T t = {};
T{};
 
// When initializing an array from a string literal that's too short
// Remaining elements are zero-initialized
char array[N] = "";

Zero initialization sets primitives, enums, and pointers to 0 (null for pointers). For structs, it zero-initializes every data member and sets all padding bits to 0. It doesn’t do anything to references.

Conclusion

As we’ve now seen, initialization is a far more complex topic in C++ than it is in C#. The main reason is that C++ provides far more features. Supporting default constructors, temporary variables, arrays, references, function pointers, const, string literals, and so forth requires a fair amount more syntax.

Still, quite a few details have been omitted here for language features we haven’t yet covered: inheritance, lambdas, etc. We’ll cover those as we go through the rest of the series.