JacksonDunstan.com

Today we’ll continue the series with a look into pointers and two related concepts that work very differently than they do in C#: arrays and strings. We’ll also cover some interesting features along the way, such as function pointers.

Pointers

C# pointers are allowed as long as we configure the compiler to enable “unsafe” code. We may then use pointers only within an unsafe context, such as an unsafe method, unsafe class, or unsafe block within a function.

C++ has no concept of “safe” or “unsafe” code. There’s no such thing as an “unsafe” context, a “safe” context, or a compiler option to enable “unsafe” code. Pointers are allowed everywhere and are commonly used in many codebases. It turns out that their syntax works very similarly to the C# pointer syntax:

int x = 123;
 
// Declare a pointer type: int* is a "pointer to an int"
// Get the address of x with &x
int* p = &x;
 
// Dereference the pointer to get its value
DebugLog(*p); // 123
 
// Dereference and assign the pointer to set its value
*p = 456;
DebugLog(x); // 456
 
// p->m is a convenient shorthand for (*p).m
Player* pPlayer = &localPlayer;
pPlayer->Health = 100;

Multiple levels of indirection are also supported by adding more * characters to the type:

int x = 123;
int* p = &x;
int** pp = &p;
 
DebugLog(**pp); // 123
 
**pp = 456;
DebugLog(x); // 456
 
int y = 1000;
*pp = &y;
**pp = 2000;
DebugLog(x); // 456
DebugLog(y); // 2000
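One common use for a second level of indirection is letting a function change where a caller’s pointer points. As a minimal sketch, the hypothetical PointAtLarger function below (not part of the series) takes an int** so that it can overwrite the caller’s int*:

```cpp
// Hypothetical helper: make the caller's pointer point at whichever int is larger
void PointAtLarger(int** target, int* a, int* b)
{
    // *target is the caller's int*, so assigning to it re-points that pointer
    *target = (*a > *b) ? a : b;
}
```

Calling PointAtLarger(&p, &x, &y) with an int* p leaves p pointing at either x or y, so a later *p = 10 writes to whichever one was larger.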

We also have void*, which is a pointer to any type. A cast is required to dereference a void* since the compiler has no idea what type it should do the read or write on. As in C#, such a cast is not checked at runtime to ensure that the pointer really points to the type being cast to.

int x = 123;
 
// &x is an int*, but void* is compatible with all pointer types
void* pVoid = &x;
 
// Cast back to int* so we can dereference
int* pInt = (int*)pVoid;
DebugLog(*pInt); // 123
 
// Cast to float* so we can treat the memory as though it held another type
float* pFloat = (float*)pVoid;
*pFloat = 3.14f;
DebugLog(x); // 1078523331

The last line could be considered data corruption of an int since 3.14f is not a valid int, but it’s a valid way to get the bits of a float. This is part of the reason that these casts are unchecked.

Note that this is called “type punning” and it is technically undefined behavior, meaning the compiler is free to generate arbitrary machine code for this C++. In a simple case like this one, mainstream compilers typically generate the machine code that we’d expect, treating the same memory as though it held a different type, but that isn’t guaranteed.
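One way to get the bits of a float without relying on undefined behavior is to copy its bytes with std::memcpy. This is only a sketch: FloatBits is a hypothetical helper name, and it assumes that float is 32 bits wide, which is true on mainstream platforms but not guaranteed by the language.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the bit pattern of a float as an integer
std::uint32_t FloatBits(float f)
{
    // Assumption: float is 32 bits wide on this platform
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

    std::uint32_t bits;
    // Copying the bytes is well-defined, unlike dereferencing a casted pointer
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}
```

On a platform using IEEE 754 floats, FloatBits(3.14f) produces 1078523331, the same value the cast example printed.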

As in C#, pointers may be null. There are three main ways this is written in C++:

// nullptr is compatible with all pointer types, but not integer arithmetic
// This is generally the preferred way since C++11
int* p1 = nullptr;
 
// NULL is a macro commonly defined to be zero, so unlike nullptr it also works with integer arithmetic
int* p2 = NULL;
 
// The zero integer
int* p3 = 0;
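The difference between nullptr and the zero integer shows up when a function is overloaded for both integers and pointers. The overloads below are hypothetical and exist only to show which one each kind of null argument selects:

```cpp
// Hypothetical overloads: one takes an integer, the other takes a pointer
bool PicksPointerOverload(int)  { return false; }
bool PicksPointerOverload(int*) { return true; }

// The literal 0 is an int first, so the int overload is chosen
bool ZeroPicksPointer()    { return PicksPointerOverload(0); }

// nullptr can only convert to a pointer, so the int* overload is chosen
bool NullptrPicksPointer() { return PicksPointerOverload(nullptr); }
```

This is one reason nullptr is generally preferred: it can never be mistaken for an integer.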

Arrays

It may seem strange to see arrays lumped into the same article as pointers, but they’re very similar in C++. Unlike in C#, arrays are not an object that’s “managed” and subject to garbage collection. They are instead simply a fixed-size contiguous allocation of the same type of data:

// Declare an array of 3 int elements
// The elements of the array are uninitialized
int a[3];
 
// Initialize the first element of the array by writing to it
a[0] = 123;
 
// Read the first element of the array
DebugLog(a[0]); // 123

When we create an array variable, it’s just like we individually created its elements via variables:

int a0;
int a1;
int a2;

This means that there is no overhead for an array. It is literally just its elements. It doesn’t even have an integer keeping track of its length like the Length field in C#. This means that the C# stackalloc keyword is unnecessary as C++ arrays are already allocated on the stack when declared as local variables. Likewise, the fixed keyword to create a fixed-size buffer as a struct or class field is unnecessary as a C++ array’s elements are already stored inside the struct or class.
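We can check the lack of overhead with sizeof, which reports the size of a type in bytes. In this sketch, Inventory and SlotCount are hypothetical names chosen for illustration:

```cpp
#include <cstddef>

// Hypothetical struct with an array field stored directly inside it
struct Inventory
{
    int Slots[4];
};

// An array occupies exactly the space of its elements
static_assert(sizeof(int[3]) == 3 * sizeof(int), "no hidden length field");
static_assert(sizeof(Inventory) == 4 * sizeof(int), "array lives inside the struct");

// No length is stored, so the element count is computed from sizes instead
// This only works on a real array, not on a pointer to its elements
constexpr std::size_t SlotCount = sizeof(Inventory::Slots) / sizeof(int);
```

Here SlotCount is 4, computed entirely at compile time.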

There is also no bounds-checking on indexes into the array, just like indexing into a pointer in C# or C++. It’s very important to be careful not to read beyond the beginning or end of the array as there’s usually no way to know what data will be read or overwritten.

The lines blur even more because we can implicitly convert arrays into pointers:

int a[3];
a[0] = 123;
 
// Implicitly convert the int[3] array to an int*
// We get a pointer to the first element
int* p = a;
DebugLog(*p); // 123
 
// Indexing into pointers works just like in C#
DebugLog(p[0]); // 123

The opposite does not work though: we can’t write int b[3] = p.

Short arrays are commonly initialized with curly braces:

int a[3] = { 123, 456, 789 };
DebugLog(a[0], a[1], a[2]); // 123, 456, 789

If we specify more elements than will fit in the array’s size, we get a compiler error:

int a[3] = { 123, 456, 789, 1000 }; // compiler error

If we specify fewer elements, the remaining elements are initialized to zero. Note that a trailing comma is allowed:

int a[3] = { 123, 456, };
DebugLog(a[0], a[1]); // 123, 456
DebugLog(a[2]); // 0

It’s common to omit the array size when using curly braces to initialize the array. This tells the compiler to count the number of elements in the curly braces and make the array that long.

int a[] = { 123, 456, 789 }; // The a array has 3 elements
DebugLog(a[0], a[1], a[2]); // 123, 456, 789

Finally, we have multi-dimensional arrays. These are arrays of arrays, both with fixed lengths. This means they are never “jagged” but always “rectangular.” Just as with one-dimensional arrays, we end up with a contiguous sequence of contiguous sequences of the same type of data. There’s still no overhead:

int a[2][3] = {{1, 2, 3}, {4, 5, 6}};
DebugLog(a[0][0], a[0][1], a[0][2]); // 1, 2, 3
DebugLog(a[1][0], a[1][1], a[1][2]); // 4, 5, 6

These are implicitly converted into a pointer to the first element of the outer array, which is itself an array:

int a[2][3] = {{1, 2, 3}, {4, 5, 6}};
 
// Implicitly convert to a pointer to an array of 3 int
// Read the type name as "p is a pointer to an array of 3 int elements"
int (*p)[3] = a;
 
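// Dereference that pointer to get the first row: an int[3]
// That row implicitly converts to a pointer to its first element
int* pp = *p;
// The rows are contiguous, so in practice we can keep reading into the second row
// Strictly speaking, indexing past the end of the first row is undefined behavior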
for (int i = 0; i < 6; ++i)
{
    DebugLog(pp[i]); // 1, 2, 3, 4, 5, 6
}

Indexing into a multi-dimensional array with fewer subscripts than its dimensions just yields the remaining dimensions of the array. We can capture this in a pointer using the same implicit conversion:

int a[2][3] = {{1, 2, 3}, {4, 5, 6}};
int* firstRow = a[0]; // Use 1 of the 2 subscripts to get the first row (an int[3]) as a pointer
DebugLog(firstRow[0], firstRow[1], firstRow[2]); // 1, 2, 3
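Because the rows are stored one after another, a two-dimensional index is really just arithmetic on a one-dimensional one. As a rough sketch of that layout, the hypothetical FlatAt function below reads from a flat array as though it were an int[2][3]. The Cols constant is an assumption matching the examples above.

```cpp
// Assumed row length for this sketch, matching int a[2][3]
constexpr int Cols = 3;

// Hypothetical helper: index a flat array as though it had rows of Cols elements
int FlatAt(const int* flat, int row, int col)
{
    // Each row is Cols elements long, so skip row * Cols elements first
    return flat[row * Cols + col];
}
```

Given int flat[6] = { 1, 2, 3, 4, 5, 6 }, FlatAt(flat, 1, 0) reads the same 4 that a[1][0] does in the two-dimensional version.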

Pointers to Arrays and Arrays of Pointers

Sometimes we want to have a pointer to an array. This is essentially what a C# array is since we only have a reference to it, not its actual contents. Here’s how we’d do that in C++:

int a[] = { 1, 2, 3 };
 
// Add a * to make this a pointer to an array instead of just an array
// This is similar to how int* is a pointer to an int
int (*p)[3] = &a;
 
// Dereference the pointer to get the array, which we can index into
DebugLog((*p)[0], (*p)[1], (*p)[2]); // 1, 2, 3

Pointers to arrays aren’t supported by C# since pointers can’t point to managed types like arrays.
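One practical benefit of a pointer to an array is that the length is part of its type. As a small sketch, the hypothetical Sum3 function below only accepts a pointer to an array of exactly 3 int elements:

```cpp
// Hypothetical function: p must point to an array of exactly 3 int
int Sum3(int (*p)[3])
{
    int sum = 0;
    for (int i = 0; i < 3; ++i)
    {
        // Dereference to get the array, then index into it
        sum += (*p)[i];
    }
    return sum;
}
```

Passing &a where a is an int[3] compiles, but passing the address of an int[4] is a compiler error because int (*)[4] is a different type.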

If we want an array of pointers, just add a * to the type of the array element:

int x = 1;
int y = 2;
int z = 3;
 
// Add a * to int to get int*: a pointer to an int
int* a[] = { &x, &y, &z };
 
// Index into the array to get the pointer then dereference it to get the int
DebugLog(*a[0], *a[1], *a[2]); // 1, 2, 3

Arrays of pointers are supported by C#, but the array is a managed object that we only have a reference to.

Strings

The difference with strings is similar to that of arrays. In C# we have managed System.String objects that are garbage-collected. In C++, we essentially have null-terminated arrays of characters:

// The string literal "hello" has type const char[6]
// Its contents are the characters 'h', 'e', 'l', 'l', 'o', 0
const char hello[] = "hello";
 
// Like any other array, it's implicitly converted to a pointer
const char* p = hello;
for (int i = 0; i < 6; ++i)
{
    DebugLog(p[i]); // h, e, l, l, o, <NUL>
}
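The terminating zero is how code finds the end of a string when all it has is a pointer. As a brief sketch, std::strlen from the C Standard Library counts characters up to that zero, while sizeof reports the whole array. HelloArraySize and HelloLength are hypothetical names used only here.

```cpp
#include <cstddef>
#include <cstring>

const char hello[] = "hello";

// sizeof counts every element of the array, including the terminating 0
constexpr std::size_t HelloArraySize = sizeof(hello);

// std::strlen walks the characters until it reaches the 0, which it does not count
std::size_t HelloLength()
{
    return std::strlen(hello);
}
```

Here HelloArraySize is 6 and HelloLength() returns 5.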

We’ll go into const more later, but for now it’s just important to know that the characters of the array can’t be changed. For instance, this would produce a compiler error:

p[0] = 'H';

In part 3, we saw that there are various kinds of character literals. The same is true for strings as each corresponds to the type of character elements in its array:

String Type	Syntax	Meaning
`char[]`	`"hello"`	ASCII string
`wchar_t[]`	`L"hello"`	“Wide character” string
`char8_t[]`	`u8"hello"`	UTF-8 string
`char16_t[]`	`u"hello"`	UTF-16 string
`char32_t[]`	`U"hello"`	UTF-32 string

Regardless of the character type, we can concatenate string literals just by placing them next to each other. Unlike in C#, no + operator is needed.

char msg[] = "Hello, " "world!";
DebugLog(msg); // Hello, world!

As long as just one of the string literals has an encoding prefix, the others will get it too:

const char16_t msg[] = "Hello, " u"world!";
DebugLog(msg); // Hello, world!

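Concatenating literals that have two different encoding prefixes, such as a wide literal next to a UTF-16 literal, is another matter: support has varied by compiler, and C++23 makes it an error.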

Plain character arrays like these are commonly used when literals suffice, such as for log message text. When more advanced functionality is desired, and it very commonly is, wrapper classes such as the C++ Standard Library’s string or Unreal’s FString are used instead. We’ll go into string later in the series.

Pointer Arithmetic

Like in C#, arithmetic may be performed on pointers:

int a[3] = { 0, 0, 0 };
 
int* p = a; // Make p point to the first element of a
*p = 1;
 
p += 2; // Make p point to the third element of a
*p = 3;
 
--p; // Make p point to the second element of a
*p = 2;
 
DebugLog(a[0], a[1], a[2]); // 1, 2, 3

Pointers may also be compared:

int a[3] = { 0, 0, 0 };
int* theStart = a;
int* theEnd = theStart + 3;
while (theStart < theEnd) // Compare pointers
{
    *theStart = 1;
    theStart++;
}
DebugLog(a[0], a[1], a[2]); // 1, 1, 1
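Pointers into the same array can also be subtracted. Like addition, the result is measured in elements rather than bytes. CountBetween is a hypothetical helper name for this sketch:

```cpp
#include <cstddef>

// Hypothetical helper: number of int elements between two pointers into one array
std::ptrdiff_t CountBetween(const int* first, const int* last)
{
    // Subtraction yields a count of elements, not a count of bytes
    return last - first;
}
```

For an int a[3], CountBetween(a, a + 3) returns 3 regardless of how many bytes an int occupies.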

Recall from part six that this satisfies the criteria for a range-based for loop:

int a[3] = { 1, 2, 3 };
for (int val : a)
{
    DebugLog(val); // 1, 2, 3
}

The compiler transforms this into a normal for loop:

{
    int (&range)[3] = a;
    int* cur = range;
    int* theEnd = range + 3;
    for ( ; cur != theEnd; ++cur)
    {
        int val = *cur;
        DebugLog(val);
    }
}

Note that the begin and end functions aren’t required in the special case of arrays because the compiler knows the beginning and ending pointers since the size of the array is fixed at compile time.
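If we want those beginning and ending pointers ourselves, the C++ Standard Library provides std::begin and std::end, which work on arrays. The function name below is hypothetical; this is just a sketch of writing the loop by hand:

```cpp
#include <iterator>

// Hypothetical function: sum an array by walking it with pointers
int SumWithBeginEnd()
{
    int a[3] = { 1, 2, 3 };
    int sum = 0;
    // std::begin(a) is a pointer to the first element
    // std::end(a) is a pointer to one past the last element
    for (int* cur = std::begin(a); cur != std::end(a); ++cur)
    {
        sum += *cur;
    }
    return sum;
}
```

SumWithBeginEnd() returns 6, the same total a range-based for loop over the array would produce.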

Function Pointers

C# only gained function pointers in version 9, and only within unsafe code. In C++, we have always been allowed to make pointers to functions:

int GetHealth(Player p)
{
    return p.Health;
}
 
// Get a pointer to GetHealth. Syntax in three parts:
// 1) Return type: int
// 2) Pointer name: (*p)
// 3) Parameter types: (Player)
int (*p)(Player) = GetHealth;
 
// Calling the function pointer calls the function
int health = p(localPlayer);
 
DebugLog(health);

There are two variants of this syntax that make no difference to the functionality:

// Assign the address of the function instead of just its name
int (*p)(Player) = &GetHealth;
 
// Dereference the function pointer before calling it
int health = (*p)(localPlayer);

Function pointers are commonly used like delegates in C#. They are an object that can be passed around that, when called, invokes a function. They are much more lightweight though as they are just a pointer. Delegates have much more functionality, such as the ability to add, remove, and invoke multiple functions and bind to functions of various types such as instance methods and lambdas. We’ll cover how to do that in C++ later on in the series.
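To illustrate the delegate-like usage, here is a minimal sketch of passing a function pointer as a parameter. The names Add, Multiply, BinaryOp, and Combine are all hypothetical, and the using alias is just a way to give the pointer type a shorter name:

```cpp
int Add(int a, int b) { return a + b; }
int Multiply(int a, int b) { return a * b; }

// Alias for "pointer to a function that takes two int and returns an int"
using BinaryOp = int (*)(int, int);

// Hypothetical function: calls whichever function the caller passes in
int Combine(BinaryOp op, int a, int b)
{
    return op(a, b);
}
```

Combine(Add, 2, 3) returns 5 and Combine(Multiply, 2, 3) returns 6. The caller chooses the behavior, just as it would by passing a delegate in C#.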

To make an array of function pointers, add the square brackets ([]) after its name like before:

int GetHealth(Player p)
{
    return p.Health;
}
 
int GetLives(Player p)
{
    return p.Lives;
}
 
// Array of pointers to functions that take a Player and return an int
int (*statFunctions[])(Player) = { GetHealth, GetLives };
 
// Index into the array like any other array
int health = statFunctions[0](localPlayer);
DebugLog(health);
int lives = statFunctions[1](localPlayer);
DebugLog(lives);

Arrays of function pointers are commonly used as jump tables, which replace a long chain of conditional logic with a single indexed read from an array.
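As a rough sketch of a jump table, the hypothetical Run function below picks an operation by index instead of using a chain of if statements. The operation functions and the OperationCount constant are assumptions made up for this example.

```cpp
int Negate(int x) { return -x; }
int Double(int x) { return x * 2; }
int Square(int x) { return x * x; }

// Number of entries in the table
constexpr int OperationCount = 3;

// The jump table: index 0 negates, 1 doubles, 2 squares
int (*operations[OperationCount])(int) = { Negate, Double, Square };

// Hypothetical function: run the operation selected by opCode
int Run(int opCode, int x)
{
    // Arrays have no bounds checking, so reject out-of-range codes ourselves
    if (opCode < 0 || opCode >= OperationCount)
    {
        return x;
    }
    return operations[opCode](x);
}
```

Run(1, 5) returns 10 and Run(2, 5) returns 25. Adding an operation means adding one function and one table entry rather than another branch.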

Conclusion

C++ pointer functionality includes everything C# pointers can do and adds the ability to create pointers to any type as well as pointers to functions, which C# lacked before version 9. Arrays and strings are closely related to pointers, unlike their managed C# counterparts. Combined, these give us much enhanced functionality such as arrays of function pointers to make jump tables, a lightweight replacement for delegates, and an alternative to stackalloc and fixed-size buffers that supports any type of elements.

Next week we’ll continue the series with a related topic: references. Like in C#, these are often more commonly used than pointers and take some of the sharp edges off.

#1 by kgame on August 24th, 2020

where is DebugLog function ?

#2 by jackson on August 24th, 2020

DebugLog isn’t actually written into any of the articles but you can think of it like Debug.Log in Unity: a function you can pass anything to and it prints all the arguments. Eventually we’ll get to the point in the series where we can actually implement DebugLog in a generic way.
- #3 by Jan Reitz on March 25th, 2021
  
  I think it would help to have that function available, that one can actually run these snippets locally or in compiler explorer, to be able to play around etc.
  - #4 by jackson on March 26th, 2021
    
    I agree. Unfortunately, it’s quite an advanced function to write with support for arbitrary numbers of arguments and arbitrary types. I plan to finally show it a few articles from now when covering the I/O library. In the meantime, feel free to approximate it with printf from the [C Standard Library](/articles/6359) article.

#5 by radwan on September 18th, 2021

Note that the C#’s stackalloc allows for dynamically allocated arrays, whereas in C++ the size of the array must be known at compile time.

#6 by Kamikaze on October 1st, 2021

It’s good to note that since version 9 C# supports function pointers.

C++ For C# Developers: Part 7 – Pointers, Arrays, and Strings