JacksonDunstan.com

Today we’ll wrap up structs and classes by discussing a bunch of miscellaneous features: user-defined literals, local classes, copy and move assignment operators, unions, and pointers to members. C# doesn’t have any of these features, but it can emulate some of them. Read on to learn a bunch of new tricks!

User-Defined Literals

C++ supports creating our own literals, with some limitations. These are used to create instances of structs or other types in a similar manner to user-defined conversion operators. They’re just converting from literals rather than existing objects.

Here are the kinds of literals we can create:

Name	Example
Decimal literal	`123_suffix`
Octal literal	`0123_suffix`
Hexadecimal literal	`0x123_suffix`
Binary literal	`0b101_suffix`
Real literal	`0.123_suffix`
Character literal	`'c'_suffix`
String literal	`"c"_suffix`

The suffix can be any valid identifier. To implement the literal, we write an operator "" _suffix function that's not part of the struct:

Vector2 operator "" _v2(long double val)
{
    Vector2 vec;
    vec.X = val;
    vec.Y = val;
    return vec;
}

Then we call it like this:

Vector2 v1 = 2.0_v2;
DebugLog(v1.X, v1.Y); // 2, 2
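
The parameter list of the operator determines which kind of literal it accepts. As a rough sketch, assuming a hypothetical Meters type and made-up _km and _len suffixes (none of which appear elsewhere in this article), floating-point literals arrive as long double, integer literals as unsigned long long, and string literals as a pointer plus a length:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type for illustration only
struct Meters
{
    long double Value;
};

// Floating-point literals such as 1.5_km arrive as long double
constexpr Meters operator""_km(long double val)
{
    return Meters{val * 1000.0L};
}

// Integer literals such as 3_km arrive as unsigned long long
constexpr Meters operator""_km(unsigned long long val)
{
    return Meters{static_cast<long double>(val) * 1000.0L};
}

// String literals arrive as a pointer plus a length that
// excludes the null terminator
constexpr std::size_t operator""_len(const char*, std::size_t len)
{
    return len;
}

void UseLiterals()
{
    // Parentheses are needed: 3_km.Value would be lexed as a single token
    assert((3_km).Value == 3000.0L);
    assert((1.5_km).Value == 1500.0L);
    assert("hello"_len == 5);
}
```

Without the integer overload, 3_km would not compile: an integer literal doesn't fall back to the long double version.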

The C++ Standard Library reserves all suffixes that don't start with an _ for its own use:

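using namespace std::literals; // Makes the standard suffixes visible
string greeting = "hello"s;
auto halfHour = 0.5h; // Floating-point hours: not convertible to chrono::hours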
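
Here is a slightly fuller sketch of the Standard Library suffixes, assuming C++14 or later. The function names are made up for illustration. Two details are easy to miss: the suffixes need a using-directive, and a floating-point literal with the h suffix produces a duration with a floating-point representation, which is why auto is used here rather than naming std::chrono::hours:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// The standard suffixes live in inline namespaces, so a using-directive
// like this one is needed before they can be used
using namespace std::literals;

std::size_t GreetingLength()
{
    auto greeting = "hello"s; // std::string, not const char*
    return greeting.size();
}

long long HalfHourInMinutes()
{
    // 0.5h has a floating-point representation, so it can hold half an hour
    auto halfHour = 0.5h;

    // Converting to whole minutes requires an explicit duration_cast
    return std::chrono::duration_cast<std::chrono::minutes>(halfHour).count();
}

void UseStandardLiterals()
{
    assert(GreetingLength() == 5);
    assert(HalfHourInMinutes() == 30);
}
```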

Like with other forms of operator overloading, including user-defined conversion operators, it's important to strongly consider how understandable the resulting code will be given its terseness. Regular constructors and member functions may be more easily understood due to explicitly stating the type.

Still, there are situations where the brevity and expressiveness may come in handy. This is especially the case for codebases that make heavy use of auto:

// User-defined literals require _less_ typing with auto
auto a = Vector2{2.0f};
auto b = 2.0_v2;
 
// User-defined literals require _more_ typing without auto
Vector2 a{2.0f};
Vector2 b = 2.0_v2;

Local Classes

A local class (or struct) is one that is defined within the body of a function:

void Foo()
{
    struct Local
    {
        int32_t Val;
 
        Local(int32_t val)
            : Val(val)
        {
        }
    };
 
    Local ten{10};
    DebugLog(ten.Val); // 10
}

Local classes are regular classes in most ways, but have a few limitations. First, their member functions have to be defined within the class definition: we can't split the declaration and the definition.

void Foo()
{
    struct Local
    {
        int32_t Val;
 
        Local(int32_t val);
    };
 
    // Compiler error
    // Member function definition must be in the class definition
    Local::Local(int32_t val)
        : Val(val)
    {
    }
}

Second, they can't have static data members but they can have static member functions.

void Foo()
{
    struct Local
    {
        int32_t Val;
 
        // Compiler error
        // Local classes can't have static data members
        static int32_t Max = 100;
 
        // OK: local classes can have static member functions
        static int32_t GetMax()
        {
            return 100;
        }
    };
 
    DebugLog(Local::GetMax()); // 100
}

Third, and finally, they can have friends but they can't define friend functions inline:

class Classy
{
};
 
void Foo()
{
    struct Local
    {
        // Compiler error
        // Local classes can't define inline friend functions
        friend void InlineFriend()
        {
        }
 
        // OK: local classes can have normal friends
        friend class Classy;
    };
}
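
One more practical point, beyond the three limitations listed above: the member functions of a local class can't use the enclosing function's local variables or parameters directly. Any state they need has to be passed in, typically through data members. The following is a small sketch with made-up names (SumAbove and Accumulator are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

int32_t SumAbove(const int32_t* values, int32_t count, int32_t threshold)
{
    struct Accumulator
    {
        // A copy of the threshold: Add() can't refer to the enclosing
        // function's 'threshold' parameter directly
        int32_t Threshold;
        int32_t Sum;

        void Add(int32_t value)
        {
            if (value > Threshold)
            {
                Sum += value;
            }
        }
    };

    // Pass the state in explicitly
    Accumulator acc{threshold, 0};
    for (int32_t i = 0; i < count; ++i)
    {
        acc.Add(values[i]);
    }
    return acc.Sum;
}

void UseSumAbove()
{
    const int32_t values[3]{1, 5, 10};
    assert(SumAbove(values, 3, 4) == 15); // 5 + 10
}
```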

Like local functions in C#, local classes in C++ are typically used to reduce duplication of code inside the function but are placed inside the function because they wouldn't be useful to code outside the function. It's even common to see local classes without a name when only one instance of them is needed. For example, this local class de-duplicates code that's run on players, enemies, and NPCs without requiring polymorphism:

// Three unrelated types: no common base class
struct Player
{
    int32_t Health;
};
struct Enemy
{
    int32_t Health;
};
struct Npc
{
    int32_t Health;
};
 
int32_t HealToFullIfNotDead(
    Player* players, int32_t numPlayers,
    Enemy* enemies, int32_t numEnemies,
    Npc* npcs, int32_t numNpcs)
{
    // Anonymous local class
    // Avoids needing to pick a good name
    struct
    {
        // More than just a function wrapped in a class
        // Also has its own state to keep track of healing
        int32_t NumHealed = 0;
 
        // Overloaded function call operator
        // Avoids needing to pick a good name
        int32_t operator()(int32_t health)
        {
            // Dead or already at full. No heal.
            if (health <= 0 || health >= 100)
            {
                return health;
            }
 
            // Damaged. Heal.
            NumHealed++;
            return 100;
        }
    } healer;
 
    // The body of each loop reuses the heal code
    for (int32_t i = 0; i < numPlayers; ++i)
    {
        // Call the overloaded function call operator
        players[i].Health = healer(players[i].Health);
    }
    for (int32_t i = 0; i < numEnemies; ++i)
    {
        enemies[i].Health = healer(enemies[i].Health);
    }
    for (int32_t i = 0; i < numNpcs; ++i)
    {
        npcs[i].Health = healer(npcs[i].Health);
    }
 
    return healer.NumHealed;
}
 
// One dead, two damaged, one full health for each
const int32_t num = 4;
Player players[num]{{0}, {50}, {75}, {100}};
Enemy enemies[num]{{0}, {50}, {75}, {100}};
Npc npcs[num]{{0}, {50}, {75}, {100}};
int32_t numHealed = HealToFullIfNotDead(
    players, num,
    enemies, num,
    npcs, num);
 
DebugLog(numHealed); // 6
DebugLog(
    players[0].Health, players[1].Health,
    players[2].Health, players[3].Health); // 0, 100, 100, 100
DebugLog(
    enemies[0].Health, enemies[1].Health,
    enemies[2].Health, enemies[3].Health); // 0, 100, 100, 100
DebugLog(
    npcs[0].Health, npcs[1].Health,
    npcs[2].Health, npcs[3].Health); // 0, 100, 100, 100

Copy and Move Assignment Operators

Along with destructors and some constructors, the compiler will also generate copy and move assignment operators for us.

struct Vector2
{
    float X;
    float Y;
 
    // Compiler generates a copy assignment operator like this:
    // Vector2& operator=(const Vector2& other)
    // {
    //     X = other.X;
    //     Y = other.Y;
    //     return *this;
    // }
 
    // Compiler generates a move assignment operator like this:
    // Vector2& operator=(Vector2&& other)
    // {
    //     X = other.X;
    //     Y = other.Y;
    //     return *this;
    // }
};
 
void Foo()
{
    Vector2 a{2, 4};
    Vector2 b{0, 0};
    b = a; // Call the compiler-generated copy assignment operator
    DebugLog(b.X, b.Y); // 2, 4
}

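The compiler does this as long as we don't declare that assignment operator ourselves, every non-static data member and base class can itself be assigned, and none of the non-static data members are const or references. The move assignment operator is additionally only generated if we haven't declared a copy constructor, copy assignment operator, move constructor, or destructor.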
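
These conditions can be checked at compile time with the type traits in the Standard Library's type_traits header. This is only a sketch: the three struct names are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types for illustration
struct Plain
{
    float X;
    float Y;
};

struct HasConst
{
    const float X;
    float Y;
};

struct HasRef
{
    float& X;
};

// Ordinary data members: the compiler generates assignment operators
static_assert(std::is_copy_assignable<Plain>::value, "copy generated");
static_assert(std::is_move_assignable<Plain>::value, "move generated");

// A const or reference data member suppresses them
static_assert(!std::is_copy_assignable<HasConst>::value, "const member");
static_assert(!std::is_copy_assignable<HasRef>::value, "reference member");

void CopyPlain()
{
    Plain a{2, 4};
    Plain b{0, 0};
    b = a; // Uses the generated copy assignment operator
    assert(b.X == 2 && b.Y == 4);
}
```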

As with constructors and destructors, we can use = default and = delete to override the default behavior: = default asks the compiler to generate the operator and = delete prevents it from being generated.

struct Vector2
{
    float X;
    float Y;
 
    Vector2& operator=(const Vector2& other) = delete;
};
 
void Foo()
{
    Vector2 a{2, 4};
    Vector2 b{0, 0};
    b = a; // Compiler error: copy assignment operator is deleted
    DebugLog(b.X, b.Y); // Never reached: the line above doesn't compile
}
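
The two can be combined. A common pattern, sketched here with a hypothetical Handle type, is a "move-only" type that deletes the copy operations and defaults the move operations:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical move-only type for illustration
struct Handle
{
    int Id;

    explicit Handle(int id)
        : Id(id)
    {
    }

    // Copying is not allowed
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    // Declaring the copy operations would normally suppress the move
    // operations, so ask the compiler to generate them anyway
    Handle(Handle&&) = default;
    Handle& operator=(Handle&&) = default;
};

static_assert(!std::is_copy_assignable<Handle>::value, "copy is deleted");
static_assert(std::is_move_assignable<Handle>::value, "move is defaulted");

void MoveAssign()
{
    Handle a{1};
    Handle b{2};
    b = std::move(a); // Calls the defaulted move assignment operator
    assert(b.Id == 1);
}
```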

Unions

We've seen how the class keyword can be used instead of struct to change the default access level from public to private. Similarly, C++ provides the union keyword to change the data layout of the struct. Instead of making the struct big enough to fit all of the non-static data members, a union is just big enough to fit the largest non-static data member.

union FloatBytes
{
    float Val;
    uint8_t Bytes[4];
};
 
void Foo()
{
    FloatBytes fb;
    fb.Val = 3.14f;
    DebugLog(sizeof(fb)); // 4 (not 8)
 
    // 195, 245, 72, 64
    DebugLog(fb.Bytes[0], fb.Bytes[1], fb.Bytes[2], fb.Bytes[3]);
 
    fb.Bytes[0] = 0;
    fb.Bytes[1] = 0;
    fb.Bytes[2] = 0;
    fb.Bytes[3] = 0;
    DebugLog(fb.Val); // 0
}

Because the non-static data members of a union occupy the same memory space, writing to one writes to the other. In the above example, we can use this to get the bytes that make up a float or to manipulate the float using integer math on the byte array that it shares memory with.

Note that it is technically undefined behavior to read any non-static data member except the most recently written one. However, nearly all compilers support this as it is a common usage for unions so it is very likely to be safe.
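
When the undefined behavior is a concern, one well-defined alternative is to copy the bytes with std::memcpy instead of reading the inactive union member. This is a sketch, not a replacement for the union technique above. FloatBits is a made-up name, and the values in the usage function assume the common IEEE 754 representation of float. C++20 also offers std::bit_cast for the same purpose.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the bits of a float as an integer without using a union
uint32_t FloatBits(float val)
{
    static_assert(sizeof(uint32_t) == sizeof(float), "sizes must match");

    uint32_t bits;
    std::memcpy(&bits, &val, sizeof(bits)); // Copy the object representation
    return bits;
}

void UseFloatBits()
{
    // Assumes IEEE 754 single-precision floats
    assert(FloatBits(0.0f) == 0u);
    assert(FloatBits(1.0f) == 0x3F800000u);
}
```

Compilers typically optimize the memcpy call away, so this is usually no slower than the union.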

As with local classes, unions come with some restrictions. First, unions can't participate in inheritance. That means they can't have any base classes, be a base class themselves, or have any virtual member functions.

struct IGetHashCode
{
    virtual int32_t GetHashCode() = 0;
};
 
// Compiler error: unions can't derive
union Id : IGetHashCode
{
    int32_t Val;
    uint8_t Bytes[4];
 
    // Compiler error: unions can't have virtual member functions
    virtual int32_t GetHashCode() override
    {
        return Val;
    }
};
 
// Compiler error: can't derive from a union
struct Vec2Bytes : Id
{
};

Second, unions can't have non-static data members that are references:

union IntRefs
{
    // Compiler error: unions can't have lvalue references
    int32_t& Lvalue;
 
    // Compiler error: unions can't have rvalue references
    int32_t&& Rvalue;
};

Third, if any non-static data member of the union has a "non-trivial" copy or move constructor, copy or move assignment operator, or destructor, then the union's version of that function is deleted by default and needs to be explicitly written.

A struct has a "non-trivial" constructor if it's explicitly written, or if any of the non-static data members have default initializers, or if there are any virtual member functions or base classes, or if any non-static data member or base class has a non-trivial constructor.

A struct has a "non-trivial" destructor if it's explicitly written, virtual, or any non-static data member or base class has a non-trivial destructor.

A struct has a "non-trivial" assignment operator if it's explicitly written, if there are any virtual member functions or base classes, or any non-static data member or base class has a non-trivial assignment operator.

That's a lot of rules, but it's rather uncommon for unions to include types with these kinds of non-trivial functions. Typically they're used for simple primitives, structs, and arrays, like in the above examples. For more advanced usage, we need to keep the rules in mind:

// Note: "ctor" is a common abbreviation for "constructor"
//       Likewise, "dtor" is a common abbreviation for "destructor"
struct NonTrivialCtor
{
    int32_t Val;
 
    NonTrivialCtor()
    {
        Val = 100;
    }
 
    // Non-trivial copy constructor because it's explicitly written
    NonTrivialCtor(const NonTrivialCtor& other)
    {
        Val = other.Val;
    }
};
 
// Union with a non-static data member whose copy constructor is non-trivial
// The union's copy constructor is deleted by default
union HasNonTrivialCtor
{
    NonTrivialCtor Ntc;
};
 
// Union with a non-static data member whose copy constructor is non-trivial
// Its constructors are written explicitly to replace the deleted defaults
union HasNonTrivialCtor2
{
    NonTrivialCtor Ntc;
 
    HasNonTrivialCtor2()
        : Ntc{}
    {
    }
 
    // Explicitly write a copy constructor
    HasNonTrivialCtor2(const HasNonTrivialCtor2& other)
        : Ntc{other.Ntc}
    {
    }
};
 
HasNonTrivialCtor a{};
DebugLog(a.Ntc.Val);
 
// Compiler error
// Union has a non-static data member with a non-trivial copy constructor
// Its copy constructor must be written explicitly
HasNonTrivialCtor b{a};
DebugLog(b.Ntc.Val);
 
HasNonTrivialCtor2 c{};
 
// OK: copy constructor explicitly written
HasNonTrivialCtor2 d{c};
DebugLog(d.Ntc.Val); // 100
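
The same rules apply to destructors. As a sketch of what "explicitly written" involves, here is a union holding a std::string, whose constructors and destructor are all non-trivial. IntOrString and UseStringMember are made-up names. Since the union can't know which member is active, this version leaves it to the calling code to start and end the string's lifetime by hand:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

union IntOrString
{
    int Int;
    std::string Str;

    // std::string has a non-trivial default constructor and destructor,
    // so the union's versions are deleted unless we write them
    IntOrString()
        : Int(0)
    {
    }

    // Does nothing: the union doesn't know whether Str is active
    ~IntOrString()
    {
    }
};

std::size_t UseStringMember()
{
    IntOrString u; // Int is the active member

    // Switch to Str by constructing it in place with "placement new"
    new (&u.Str) std::string("hello");
    std::size_t len = u.Str.size();

    // Explicitly destroy Str before switching back to Int
    u.Str.~basic_string();
    u.Int = 5;

    return len;
}
```

Forgetting the explicit destructor call would leak the string's memory, which is one reason the tagged union and variant patterns are popular.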

Unions can also be "anonymous." Like structs, they can have no name. Unlike structs, they can also have no variable:

void Foo()
{
    union
    {
        int32_t Int;
        float Float;
    };
}

These are even more restricted than normal unions. They can't have any member functions or static data members and all their data members have to be public. Like unscoped enums, their members are added to whatever scope the union is in: Foo in the above example.

void Foo()
{
    union
    {
        int32_t Int;
        float Float;
    };
 
    // Int and Float are added to Foo, so they can be used directly
    Float = 3.14f;
    DebugLog(Int); // 1078523331
}

This feature is commonly used to create what's called a "tagged union" by wrapping the union and an enum in a struct:

struct IntOrFloat
{
    // The "tag" remembers the active member
    enum { Int, Float } Type;
 
    // Anonymous union
    union
    {
        int32_t IntVal;
        float FloatVal;
    };
};
 
IntOrFloat iof;
 
iof.FloatVal = 3.14f; // Set value
iof.Type = IntOrFloat::Float; // Set type
 
// Read value and type
DebugLog(iof.IntVal, iof.Type); // 1078523331, Float

This pattern is also called a "variant," typically when more protections are added to ensure the type and value are linked:

struct TypeException
{
};
 
class IntOrFloat
{
public:
 
    enum struct Type { Int, Float };
 
    Type GetType() const
    {
        return ActiveType;
    }
 
    void SetIntVal(int32_t val)
    {
        ActiveType = Type::Int;
        IntVal = val;
    }
 
    int32_t GetIntVal() const
    {
        if (ActiveType != Type::Int)
        {
            throw TypeException{};
        }
        return IntVal;
    }
 
    void SetFloatVal(float val)
    {
        ActiveType = Type::Float;
        FloatVal = val;
    }
 
    float GetFloatVal() const
    {
        if (ActiveType != Type::Float)
        {
            throw TypeException{};
        }
        return FloatVal;
    }
 
private:
 
    // Named differently from the Type enum: reusing the name for a
    // data member changes the meaning of Type and is rejected by GCC
    Type ActiveType;
 
    union
    {
        int32_t IntVal;
        float FloatVal;
    };
};
 
IntOrFloat iof;
iof.SetFloatVal(3.14f); // Set value to 3.14f and type to Float
DebugLog(iof.GetFloatVal()); // 3.14
DebugLog(iof.GetIntVal()); // Throws exception: type is not Int
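
The Standard Library ships a ready-made version of this pattern: std::variant, available since C++17. The sketch below mirrors the example above. IntOrFloatVariant and ReadingWrongTypeThrows are made-up names, and std::bad_variant_access plays the role of TypeException:

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical alias mirroring the IntOrFloat class
using IntOrFloatVariant = std::variant<int32_t, float>;

bool ReadingWrongTypeThrows()
{
    // Sets the value to 3.14f and the active type to float
    IntOrFloatVariant iof = 3.14f;
    assert(std::holds_alternative<float>(iof));
    assert(std::get<float>(iof) == 3.14f);

    try
    {
        // The active type is float, so asking for the int32_t throws
        (void)std::get<int32_t>(iof);
    }
    catch (const std::bad_variant_access&)
    {
        return true;
    }
    return false;
}
```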

Another common use of unions is to provide an alternative access mechanism without changing the type of the data. It's very common to see vectors, matrices, and quaternions that use unions to provide either named field access or array access to the components. Note that the unnamed struct used here is a widely supported compiler extension rather than standard C++:

union Vector2
{
    struct
    {
        float X;
        float Y;
    };
 
    float Components[2];
};
 
 
Vector2 v;
 
// Named field access
v.X = 2;
v.Y = 4;
 
// Array access: same values due to union
DebugLog(v.Components[0], v.Components[1]); // 2, 4
 
// Array access
v.Components[0] = 20;
v.Components[1] = 40;
 
// Named field access: same values due to union
DebugLog(v.X, v.Y); // 20, 40

Pointers to Members

Finally, let's look at how we create pointers to members of structs. To simply get a pointer to a specific struct instance's non-static data member, we can use the normal pointer syntax:

struct Vector2
{
    float X;
    float Y;
};
 
Vector2 v{2, 4};
float* p = &v.X; // p points to the X data member of v

However, we can also get a pointer to a non-static data member of any instance of the struct:

float Vector2::* p = &Vector2::X; // p points to the X data member of a Vector2

To dereference such a pointer, we need an instance of the struct whose data member it points at:

float Vector2::* p = &Vector2::X;
Vector2 v{2, 4};
 
// Dereference the pointer for a particular struct
DebugLog(v.*p); // 2
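
Because the pointer isn't tied to an instance, it can be passed around to choose which data member a piece of code operates on. Here is a small sketch, assuming a hypothetical SumComponent function:

```cpp
#include <cassert>
#include <cstdint>

struct Vector2
{
    float X;
    float Y;
};

// Sums whichever data member 'component' points to across all the vectors
float SumComponent(
    const Vector2* vecs, int32_t count, float Vector2::* component)
{
    float sum = 0;
    for (int32_t i = 0; i < count; ++i)
    {
        // Dereference the pointer to member for each instance
        sum += vecs[i].*component;
    }
    return sum;
}

void UseSumComponent()
{
    const Vector2 vecs[2]{{1, 2}, {3, 4}};
    assert(SumComponent(vecs, 2, &Vector2::X) == 4); // 1 + 3
    assert(SumComponent(vecs, 2, &Vector2::Y) == 6); // 2 + 4
}
```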

These pointers can't be converted to plain pointers or vice versa, but a pointer to a member of a base class can be converted to a pointer to a member of a derived class as long as the base class isn't virtual:

struct Vector2
{
    float X;
    float Y;
};
 
struct Vector3 : Vector2
{
    float Z;
};
 
float Vector2::* p = &Vector2::X;
Vector2 v{2, 4};
 
float* p2 = p; // Compiler error: not compatible
 
float f = 3.14f;
float Vector2::* pf = &f; // Compiler error: not compatible
 
float Vector3::* p3 = p; // OK: Vector3 derives from Vector2
Vector3 v3{}; // p3 can only be dereferenced with a Vector3
v3.X = 2;
DebugLog(v3.*p3); // 2

The syntax gets a little complicated when making a pointer to a member that is itself a pointer to a member. Thankfully, this is rarely seen:

struct Float
{
    float Val;
};
 
struct PtrToFloat
{
    float Float::* Ptr;
};
 
// Pointer to Val in a Float pointed to by Ptr in a PtrToFloat
float Float::* PtrToFloat::* p1 = &PtrToFloat::Ptr;
 
Float f{3.14f};
PtrToFloat ptf{&Float::Val};
 
float Float::* pf = ptf.*p1; // Dereference first level of indirection
float floatVal = f.*pf; // Dereference second level of indirection
DebugLog(floatVal); // 3.14
 
// Dereference both levels of indirection at once
DebugLog(f.*(ptf.*p1)); // 3.14

Pointers to member functions can also be taken. The syntax is like a combination of data member pointers and normal function pointers:

struct Player
{
    int32_t Health;
};
 
struct PlayerOps
{
    Player& Target;
 
    PlayerOps(Player& target)
        : Target(target)
    {
    }
 
    void Damage(int32_t amount)
    {
        Target.Health -= amount;
    }
 
    void Heal(int32_t amount)
    {
        Target.Health += amount;
    }
};
 
// Pointer to a non-static member function of PlayerOps that
// takes an int32_t and returns void
void (PlayerOps::* op)(int32_t) = &PlayerOps::Damage;
 
Player player{100};
PlayerOps ops(player);
 
// Call the Damage function via the pointer
(ops.*op)(20);
DebugLog(player.Health); // 80
 
// Re-assign to another compatible function
op = &PlayerOps::Heal;
 
// Call the Heal function via the pointer
(ops.*op)(10);
DebugLog(player.Health); // 90
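
When we have a pointer to the object rather than the object itself, the ->* operator takes the place of .* and a type alias helps keep the syntax readable. This is a sketch with made-up names (Counter, CounterOp, and Apply):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type for illustration
struct Counter
{
    int32_t Value;

    void Increment(int32_t amount)
    {
        Value += amount;
    }

    void Decrement(int32_t amount)
    {
        Value -= amount;
    }
};

// Alias for a pointer to a non-static member function of Counter that
// takes an int32_t and returns void
using CounterOp = void (Counter::*)(int32_t);

void Apply(Counter* counter, CounterOp op, int32_t amount)
{
    // ->* is the pointer counterpart of .*
    (counter->*op)(amount);
}

void UseApply()
{
    Counter counter{10};
    Apply(&counter, &Counter::Increment, 5);
    assert(counter.Value == 15);
    Apply(&counter, &Counter::Decrement, 3);
    assert(counter.Value == 12);
}
```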

Conclusion

Today we've seen a bunch of miscellaneous class functionality that isn't available in C#. User-defined literals can make code both more expressive and more terse at the same time. They're best used sparingly for very stable, core types like the Standard Library's string.

Local classes give a lot of the same benefits that local functions do in C#, but go a step further and allow nearly full class functionality including data members, constructors, destructors, and overloaded operators.

Copy and move assignment operators allow us to easily copy and move classes with the familiar x = y syntax rather than utility functions typically named Clone or Copy. The compiler will even generate them for us, saving a lot of boilerplate and potential for errors if that boilerplate gets out of sync with changes to the class.

Unions allow for memory savings, advanced manipulation of the bits and bytes behind types like float, and the convenience of alternative access styles. They can be partially emulated in C#, but native support in C++ is more convenient and offers more advanced functionality.

Pointers to members let us refer to a particular member of a class without tying that reference to any particular instance of the class. With support for both data members and member functions, we have a tool that enables runtime determination of the data to use or function to call without needing a heavyweight language feature like C#'s delegates. This can be used for setting modes (e.g. Damage mode versus Heal mode), for GUI callbacks (e.g. click handlers), or a variety of other situations.

With that, we've wrapped up classes and structs! These are bedrock functionality of C++, so we'll be making use of them and even building on top of them throughout the rest of the series. Stay tuned for more!

C++ For C# Developers: Part 16 – Struct and Class Wrapup