JacksonDunstan.com

C# and C++ have similar lists of preprocessor directives like #if, but their features and usage are very different. This is especially the case in C++ with support for “macros” that can replace code. Today we’ll look into everything we can use the preprocessor for in C++ and compare with C#’s preprocessor.

Conditionals

Just like in C#, the C++ preprocessor runs at an early stage of compilation. This is after the bytes of the file are interpreted as characters and comments are removed, but before the main compilation of language concepts like variables and functions. The preprocessor therefore has a very limited understanding of the source code.

It takes this limited understanding of the source code and makes textual substitutions to it. When it’s done, the resulting source code is compiled.

One common use of this in C# is the set of conditional “directives”: #if, #else, #elif, and #endif. These allow branching logic to take place during the preprocessing step of compilation. C# allows for logic on boolean preprocessor symbols:

// C#
static void Assert(bool condition)
{
    #if DEBUG && (ASSERTIONS_ENABLED == true)
        if (!condition)
        {
            throw new Exception("Assertion failed");
        }
    #endif
}

If the #if expression evaluates to false then the code between the #if and #endif is removed:

// C#
static void Assert(bool condition)
{
}

This helps us reduce the size of the generated executable and improve run-time performance by removing instructions and memory accesses.

One common mistake is to assume the preprocessor understands more about the structure of the source code than it really does. For example, we might assume that it understands what identifiers are:

// C#
void Foo()
{
    #if Foo
        DebugLog("Foo exists");
    #else
        DebugLog("Foo does not exist"); // Gets printed
    #endif
}

C++ has similar support for preprocessor conditionals. They’re even named #if, #else, #elif, and #endif. The preprocessor directives in the above C# examples are actually valid C++!

The two languages differ in a few minor ways. First, #if ABC in C++ checks the value of ABC, not just whether it’s defined.

// Assume the value of ZERO is 0
#if ZERO
    DebugLog("zero");
#else
    DebugLog("non-zero"); // Gets printed
#endif

There are a couple of ways to avoid this. First, we can use the preprocessor’s defined operator to check whether the symbol is defined instead of checking its value:

#if defined(ZERO) // evaluates to 1, which is true
    DebugLog("zero"); // Gets printed
#else
    DebugLog("non-zero");
#endif
 
// Alternate version without parentheses
#if defined ZERO // evaluates to 1, which is true
    DebugLog("zero"); // Gets printed
#else
    DebugLog("non-zero");
#endif

The other way is to use #ifdef and #ifndef instead of #if:

#ifdef ZERO // ZERO is defined. Its value is irrelevant.
    DebugLog("zero"); // Gets printed
#else
    DebugLog("non-zero");
#endif
 
#ifndef ZERO // Check if NOT defined
    DebugLog("zero");
#else
    DebugLog("non-zero"); // Gets printed
#endif
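
To put the three kinds of checks side by side, here's a small sketch. The function names and the NEVER_DEFINED symbol are made up for illustration. It also shows one detail that's easy to miss: inside #if, an identifier that isn't defined as a macro is treated as 0 rather than causing an error:

```cpp
#include <cassert>

#define ZERO 0
// NEVER_DEFINED is a hypothetical symbol that is intentionally not defined

// #if checks the value: ZERO is 0, so the #else branch is kept
bool IfTreatsZeroAsTrue()
{
#if ZERO
    return true;
#else
    return false;
#endif
}

// #ifdef only checks whether the symbol is defined, not its value
bool IfdefTreatsZeroAsTrue()
{
#ifdef ZERO
    return true;
#else
    return false;
#endif
}

// An identifier that isn't a macro is replaced by 0 inside #if
bool IfTreatsUndefinedAsTrue()
{
#if NEVER_DEFINED
    return true;
#else
    return false;
#endif
}

// defined() can be combined with a value check in a single expression
bool DefinedAndNonZero()
{
#if defined(ZERO) && ZERO != 0
    return true;
#else
    return false;
#endif
}

// Usage: document the expected results
void CheckConditionals()
{
    assert(!IfTreatsZeroAsTrue());
    assert(IfdefTreatsZeroAsTrue());
    assert(!IfTreatsUndefinedAsTrue());
    assert(!DefinedAndNonZero());
}
```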

These checks are commonly used to implement header guards. Since C++17, they can also be used with __has_include which evaluates to 1 if a header exists and 0 if it doesn’t. This is often used to check whether optional libraries are available or to choose from one of several equivalent libraries:

// If the system provides DebugLog via the debug_log.h header
#if __has_include(<debug_log.h>)
    // Use the system-provided DebugLog
    #include <debug_log.h>
// The system does not provide DebugLog
#else
    // Define our own version using puts from the C Standard Library
    #include <cstdio>
    void DebugLog(const char* message)
    {
        puts(message);
    }
#endif

This __has_include(<header_name>) check uses the same header file search that #include <header_name> would. To check the header file search of #include "header_name", we can use __has_include("header_name").
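
The header guards mentioned earlier can be sketched like this. The header name vector2.h and the guard symbol VECTOR2_H are assumptions for the example. To keep it in one file, the header's contents are pasted twice, which simulates the same header being pulled in by two #include directives:

```cpp
#include <cassert>

// ---- First inclusion of the hypothetical vector2.h ----
#ifndef VECTOR2_H   // Not defined yet, so the contents are kept
#define VECTOR2_H   // Mark the header as already included

struct Vector2
{
    float X;
    float Y;
};

#endif // VECTOR2_H

// ---- Second inclusion of the same header ----
#ifndef VECTOR2_H   // Already defined, so everything up to #endif is removed
#define VECTOR2_H

struct Vector2      // Without the guard, this would be a redefinition error
{
    float X;
    float Y;
};

#endif // VECTOR2_H

// Only one definition of Vector2 survives preprocessing
float SumComponents(const Vector2& vec)
{
    return vec.X + vec.Y;
}
```

The guard symbol just needs to be unique to the header, so it's usually derived from the file name.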

Macros

Both languages allow defining preprocessor symbols with #define and un-defining them with #undef:

// Define a preprocessor symbol
#define ENABLE_LOGGING
 
void LogError(const char* message)
{
    // Check if the preprocessing symbol is defined
    // It is, so the DebugLog remains
    #ifdef ENABLE_LOGGING
        DebugLog("ERROR", message);
    #endif
}
 
// Un-define the preprocessor symbol
#undef ENABLE_LOGGING
 
void LogTrace(const char* message)
{
    // Check if the preprocessing symbol is defined
    // It isn't, so the DebugLog is removed
    #ifdef ENABLE_LOGGING
        DebugLog("TRACE", message);
    #endif
}
 
void Foo()
{
    LogError("whoops"); // Prints "ERROR whoops"
    LogTrace("got here"); // Nothing printed
}

C# requires #define and #undef to appear only at the top of the file, but C++ allows them anywhere.

C++ also goes way beyond these simple preprocessor symbol definitions. It has a full “macro” system that allows for textual substitution. While this is generally discouraged in “Modern C++,” its use is still ubiquitous for certain tasks. Sometimes it’s used when the language doesn’t provide a viable alternative or at least didn’t when the code was written. Regardless, macros are widely used and it’s important to know how they work.

First, we can define an “object-like” macro by providing a value to the preprocessor symbol. Unlike C#, the value doesn’t have to be a boolean:

// Define object-like macros
#define LOG_LEVEL 1
#define LOG_LEVEL_ERROR 3
#define LOG_LEVEL_WARNING 2
#define LOG_LEVEL_DEBUG 1
 
void LogWarning(const char* message)
{
    // The preprocessor symbol can be used in #if expressions
    #if LOG_LEVEL <= LOG_LEVEL_WARNING
        // The preprocessor symbol will be replaced with its value
        DebugLog(LOG_LEVEL_WARNING, message);
 
        // After preprocessing, the previous line becomes:
        DebugLog(2, message);
    #endif
}

We can also define “function-like” macros that take parameters:

// Define a function-like macro
#define MADD(x, y, z) x*y + z
 
void Foo()
{
    int32_t x = 2;
    int32_t y = 3;
    int32_t z = 4;
 
    // Call the function-like macro
    int32_t result = MADD(x, y, z);
 
    // After preprocessing, the previous line becomes:
    int32_t result = x*y + z;
 
    DebugLog(result); // 10
}

Unlike a runtime function call, calling a function-like macro simply performs textual substitution. It’s easy to forget this, especially when the macro is named like a normal function. This can lead to bugs and performance problems because argument expressions aren’t evaluated before the macro is called:

// Function-like macro named like a normal function, not ALL_CAPS
#define square(x) x*x
 
int32_t SumOfRandomNumbers(int32_t n)
{
    int32_t sum = 0;
    for (int32_t i = 0; i < n; ++i)
    {
        sum += rand();
    }
    return sum;
}
 
void Foo()
{
    // Call a very expensive function
    int32_t result = square(SumOfRandomNumbers(1000000));
 
    // After preprocessing, the previous line becomes:
    int32_t result = SumOfRandomNumbers(1000000)*SumOfRandomNumbers(1000000);
 
    DebugLog(result); // {some random number}
}

With a normal function call, SumOfRandomNumbers(1000000) would be evaluated before the function is called. With macros, it’s just textually replaced so square ends up making two calls to it. The call is very expensive, so we have a performance problem. It’s also a bug because we’re no longer necessarily multiplying the same number by itself since the two calls may return different numbers.
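
One way to see the double evaluation directly is to count calls. This sketch uses made-up names (SQUARE_MACRO, SquareFunction, ExpensiveValue) and a fixed return value instead of rand() so the outcome is predictable:

```cpp
#include <cassert>

// Hypothetical macro and function versions of "square"
#define SQUARE_MACRO(x) ((x)*(x))

inline int SquareFunction(int x)
{
    return x * x;
}

// Stand-in for an expensive call that counts how often it runs
static int g_numCalls = 0;
int ExpensiveValue()
{
    ++g_numCalls;
    return 5;
}

// Returns how many times ExpensiveValue ran when passed to the macro
int CallsViaMacro()
{
    g_numCalls = 0;
    // Expands to: ((ExpensiveValue())*(ExpensiveValue()))
    int result = SQUARE_MACRO(ExpensiveValue());
    (void)result;
    return g_numCalls; // 2
}

// Returns how many times ExpensiveValue ran when passed to the function
int CallsViaFunction()
{
    g_numCalls = 0;
    // The argument is evaluated once, then its value is passed in
    int result = SquareFunction(ExpensiveValue());
    (void)result;
    return g_numCalls; // 1
}
```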

To see more clearly how bugs arise, consider this macro call:

void Foo()
{
    int32_t i = 1;
    int32_t result = square(++i);
 
    // After preprocessing, the previous line becomes:
    int32_t result = ++i*++i;
 
    DebugLog(result, i); // Undefined behavior: might print 6, 3 or 9, 3 or something else
}

Again, the argument (++i) isn’t evaluated before the macro call but rather just repeated every time the macro refers to the parameter. This means i is incremented twice in one expression. C++ doesn’t specify the order of those two increments relative to each other, so this is undefined behavior: one compiler might produce 2*3=6, another 3*3=9, and neither is guaranteed. If this were a function call, we’d get 2*2=4 and the value of i would be 2 afterward. These potential bugs are one reason why macros are discouraged.
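
Textual substitution also ignores operator precedence, which is a separate source of bugs. The following sketch compares the MADD macro as written earlier (renamed MADD_UNSAFE here) with a fully-parenthesized variant. Both names are made up for this example:

```cpp
#include <cassert>

// Same body as the MADD macro shown earlier
#define MADD_UNSAFE(x, y, z) x*y + z

// Every parameter and the whole expression are wrapped in parentheses
#define MADD_SAFE(x, y, z) (((x)*(y)) + (z))

int UnsafeResult()
{
    // Expands to: 2 * 1 + 1*3 + 4
    return 2 * MADD_UNSAFE(1 + 1, 3, 4); // 9
}

int SafeResult()
{
    // Expands to: 2 * (((1 + 1)*(3)) + (4))
    return 2 * MADD_SAFE(1 + 1, 3, 4); // 20
}
```

The parentheses only fix precedence. An argument with side effects is still repeated wherever the macro uses its parameter.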

Function-like macros have access to a couple of special operators: # and ##. The # operator wraps an argument in quotes to create a string literal:

// Wrap msg in quotes to create "msg"
#define LOG_TIMESTAMPED(msg) DebugLog(GetTimestamp(), #msg);
 
void Foo()
{
    // No need for quotes. hello becomes "hello".
    LOG_TIMESTAMPED(hello) // {timestamp} hello
 
    // Extra quotes are added and existing quotes are escaped: "\"hello\""
    LOG_TIMESTAMPED("hello") // {timestamp} "hello"
}
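
Here's a rough sketch of the # operator on its own, using hypothetical STRINGIFY macros. One detail worth knowing is that # is applied before its argument is macro-expanded, so a second level of macro is needed to stringify another macro's value:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper macros for illustration
#define STRINGIFY(x) #x
#define STRINGIFY_EXPANDED(x) STRINGIFY(x)

#define VERSION_MAJOR 3

std::string Plain()       { return STRINGIFY(hello); }                  // hello
std::string Quoted()      { return STRINGIFY("hello"); }                // "hello" (quotes included)
std::string Expression()  { return STRINGIFY(1 + 2); }                  // 1 + 2
std::string NotExpanded() { return STRINGIFY(VERSION_MAJOR); }          // VERSION_MAJOR
std::string Expanded()    { return STRINGIFY_EXPANDED(VERSION_MAJOR); } // 3
```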

The ## operator is used to concatenate two symbols, which may be arguments:

// Each line concatenates some literal text (e.g. m_) with the value of name
// Backslashes are used to make a multi-line macro
#define PROP(type, name) \
    private: type m_##name; \
    public: type Get##name() const { return m_##name; } \
    public: void Set##name(const type & val) { m_##name = val; }
 
struct Vector2
{
    PROP(float, X)
    PROP(float, Y)
 
    // These macro calls are replaced with:
 
    private: float m_X;
    public: float GetX() const { return m_X; }
    public: void SetX(const float & val) { m_X = val; }
    private: float m_Y;
    public: float GetY() const { return m_Y; }
    public: void SetY(const float & val) { m_Y = val; }
};
 
void Foo()
{
    Vector2 vec;
    vec.SetX(2);
    vec.SetY(4);
    DebugLog(vec.GetX(), vec.GetY()); // 2, 4
}

Macros may also take a variable number of parameters using ... similar to functions. __VA_ARGS__ is used to access the arguments:

#define LOG_TIMESTAMPED(level, ...) DebugLog(level, GetTimestamp(), __VA_ARGS__);
 
void Foo()
{
    LOG_TIMESTAMPED("DEBUG", "hello", "world") // DEBUG {timestamp} hello world
 
    // This macro call is replaced by:
 
    DebugLog("DEBUG", GetTimestamp(), "hello", "world");
}
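
Since DebugLog and GetTimestamp aren't defined in this article, here's a self-contained sketch of the same idea. SUM and SumAll are made-up names. The macro forwards however many arguments it receives into a braced list:

```cpp
#include <cassert>
#include <initializer_list>

// Adds up all the values in the list
inline int SumAll(std::initializer_list<int> values)
{
    int total = 0;
    for (int value : values)
    {
        total += value;
    }
    return total;
}

// __VA_ARGS__ is replaced by all the arguments, commas included
#define SUM(...) SumAll({ __VA_ARGS__ })

int SumOfThree()
{
    return SUM(1, 2, 3); // Expands to: SumAll({ 1, 2, 3 })
}

int SumOfOne()
{
    return SUM(10); // Expands to: SumAll({ 10 })
}
```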

In C++20, __VA_OPT__(x) is also available. If __VA_ARGS__ is empty, it’s replaced by nothing. If __VA_ARGS__ isn’t empty, it’s replaced by x. This can be used to make parameters in macros like LOG_TIMESTAMPED optional:

// __VA_OPT__(,) adds a comma only if __VA_ARGS__ isn't empty, meaning the
// caller passed some log messages
#define LOG_TIMESTAMPED(...) DebugLog(GetTimestamp() __VA_OPT__(,) __VA_ARGS__);
 
void Foo()
{
    LOG_TIMESTAMPED() // {timestamp}
    LOG_TIMESTAMPED("hello", "world") // {timestamp} hello world
 
    // These macro calls are replaced by:
 
    DebugLog(GetTimestamp()  );
 
    DebugLog(GetTimestamp() , "hello", "world");
}

Without __VA_OPT__, we wouldn’t know if the macro should put a , or not because we wouldn’t know if there are any arguments to pass after it.

Built-in Macros and Feature-Testing

Just like how C# pre-defines the DEBUG and TRACE preprocessor symbols, C++ pre-defines some object-like macros:

Name	Value	Meaning
`__cplusplus`	`199711L` (C++98 and C++03), `201103L` (C++11), `201402L` (C++14), `201703L` (C++17), `202002L` (C++20)	C++ language version
`__STDC_HOSTED__`	`1` or `0`	`1` if there is an OS (hosted), `0` if not (freestanding)
`__FILE__`	`"mycode.cpp"`	Name of the current file
`__LINE__`	`38`	Current line number
`__DATE__`	`"Oct 26 2020"`	Date the code was compiled
`__TIME__`	`"02:00:00"`	Time the code was compiled
`__STDCPP_DEFAULT_NEW_ALIGNMENT__`	`8`	Default alignment of `new`. Only in C++17 and up.
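
As a quick sketch of how a few of these get used, the following captures the current file and line and branches on the language version. SourceLocation, CURRENT_LOCATION, and the function names are assumptions for this example. Note that some compilers need a switch to report an accurate __cplusplus value. Visual Studio, for instance, reports 199711L unless /Zc:__cplusplus is passed:

```cpp
#include <cassert>

// Hypothetical type that records where a piece of code lives
struct SourceLocation
{
    const char* File;
    int Line;
};

// A macro is needed so __FILE__ and __LINE__ refer to the call site
#define CURRENT_LOCATION() SourceLocation{ __FILE__, __LINE__ }

// Pick a description based on the value of __cplusplus
const char* LanguageVersionName()
{
#if __cplusplus >= 202002L
    return "C++20 or newer";
#elif __cplusplus >= 201703L
    return "C++17";
#elif __cplusplus >= 201402L
    return "C++14";
#elif __cplusplus >= 201103L
    return "C++11";
#else
    return "older than C++11";
#endif
}

SourceLocation WhereAmI()
{
    return CURRENT_LOCATION(); // File and line of this statement
}
```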

Since C++20, there are a ton of "feature test" macros. The ones for language features, like __cpp_char8_t, are pre-defined by the compiler. The ones for Standard Library features, like __cpp_lib_byte, are available in the <version> header file. These are all object-like and their values are the year and month that the language or Standard Library feature was added to C++ or last significantly changed. A macro is only defined when the feature is supported, so the intention is to check whether it's defined, optionally comparing its value to the version of the feature we need. There are way too many to list here, but the following shows a couple in action:

#include <version> // Defines the __cpp_lib_* macros
 
void Foo()
{
#ifdef __cpp_char8_t
    DebugLog("char8_t is supported in the language");
#else
    DebugLog("char8_t is NOT supported in the language");
#endif
 
#if defined(__cpp_lib_byte) && __cpp_lib_byte >= 201603L
    DebugLog("std::byte is supported in the Standard Library");
#else
    DebugLog("std::byte is NOT supported in the Standard Library");
#endif
}

A complete list is available in the C++ Standard's definition of the <version> header file.

Miscellaneous Directives

The pre-defined __FILE__ and __LINE__ values can be overridden by another preprocessor directive: #line. This works just like in C# except that default and hidden aren't allowed:

void Foo()
{
    DebugLog(__FILE__, __LINE__); // main.cpp, 38
#line 100
    DebugLog(__FILE__, __LINE__); // main.cpp, 100
#line 200 "custom.cpp"
    DebugLog(__FILE__, __LINE__); // custom.cpp, 200
}

#error can be used to make the compiler produce an error:

#ifndef _MSC_VER
    #error Only Visual Studio is supported
#endif

#pragma is used to allow compilers to provide their own preprocessor directives, just like in C#:

// mathutils.h
 
// Compiler-specific alternative to header guards
#pragma once
 
float SqrMagnitude(const Vector2& vec)
{
    return vec.X*vec.X + vec.Y*vec.Y;
}

_Pragma("expr") can be used instead of #pragma expr. It has exactly the same effect:

_Pragma("once")

C#'s #region and #endregion aren't supported in C++, but compilers like Visual Studio allow it via #pragma:

#pragma region Math
 
float SqrMagnitude(const Vector2& vec);
float Dot(const Vector2& a, const Vector2& b);
 
#pragma endregion Math

Usage and Alternatives

Each new version of C++ makes usage of the preprocessor less necessary. For example, C++11 introduced constexpr variables which removed a lot of the reasons to use object-like macros:

// Before C++11
#define PI 3.14f
 
// After C++11
constexpr float PI = 3.14f;

This made PI an actual object, so it has a type (float), its address can be taken (&PI), and it can generally be used like other objects rather than as a textually-replaced float literal. The benefits become much greater with struct types, lambda classes, and other non-primitives where it's not really possible to make a macro for general use:

// Before C++11
// This isn't usable in many contexts like Foo(EXPONENTIAL_BACKOFF_TIMES)
#define EXPONENTIAL_BACKOFF_TIMES { 1000, 2000, 4000, 8000, 16000 }
 
// After C++11
// This works like any array object:
constexpr int32_t ExponentialBackoffTimes[] = { 1000, 2000, 4000, 8000, 16000 };
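
To sketch what "works like any array object" means in practice, the array can be measured with sizeof and passed to a function, including one that runs at compile time. TotalWait and NumBackoffTimes are made-up names for this example:

```cpp
#include <cassert>
#include <cstdint>

constexpr int32_t ExponentialBackoffTimes[] = { 1000, 2000, 4000, 8000, 16000 };

// The array has a real type and size, so sizeof works on it
constexpr int32_t NumBackoffTimes =
    sizeof(ExponentialBackoffTimes) / sizeof(ExponentialBackoffTimes[0]);

// Adds up all the wait times. Loops in constexpr functions require C++14.
constexpr int32_t TotalWait(const int32_t* times, int32_t count)
{
    int32_t total = 0;
    for (int32_t i = 0; i < count; ++i)
    {
        total += times[i];
    }
    return total;
}

// Both are computed at compile time
static_assert(NumBackoffTimes == 5, "five backoff times");
static_assert(
    TotalWait(ExponentialBackoffTimes, NumBackoffTimes) == 31000,
    "total wait is 31 seconds");
```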

Likewise, constexpr and consteval functions have removed a lot of the need for function-like macros:

constexpr int32_t Square(int32_t x)
{
    return x * x;
}
 
void Foo()
{
    int32_t i = 1;
    int32_t result = Square(++i);
    DebugLog(result); // 4
}

These behave like regular functions rather than textual substitution. We skip all the bugs and performance problems that macros might cause but keep the compile-time evaluation. We can even force compile-time evaluation in C++20 with consteval. We get strong typing, so Square("FOO") is an error. We can use the function at run-time, not just compile time. It behaves like any other function: we can take function pointers, we can create member functions, and so forth.
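
A few of those points can be sketched in code. SquareOfIncremented and ApplyTwice are hypothetical helpers added only to show the behavior:

```cpp
#include <cassert>
#include <cstdint>

constexpr int32_t Square(int32_t x)
{
    return x * x;
}

// Compile-time evaluation: this is checked while compiling
static_assert(Square(4) == 16, "Square works at compile time");

// The argument is evaluated exactly once before the call
int32_t SquareOfIncremented(int32_t i)
{
    return Square(++i);
}

// Square is a real function, so a pointer to it can be passed around
int32_t ApplyTwice(int32_t (*func)(int32_t), int32_t value)
{
    return func(func(value));
}
```

Here SquareOfIncremented(1) produces 4 and ApplyTwice(Square, 3) produces 81, neither of which would be reliable or even possible with the square macro.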

Still, macros provide a sort of escape hatch for when we simply can't express something without raw textual substitution. The PROP macro example above generates members with access specifiers. There's no way to do that otherwise. That example might not be the best idea, but others really are. A classic example is an assertion macro:

// When assertions are enabled, define ASSERT as a macro that tests a boolean
// and logs and terminates the program when it's false.
#ifdef ENABLE_ASSERTS
    #define ASSERT(x) \
        if (!(x)) \
        { \
            DebugLog("assertion failed"); \
            std::terminate(); \
        }
// When assertions are disabled, ASSERT does nothing
#else
    #define ASSERT(x)
#endif
 
bool IsSorted(const float* vals, int32_t length)
{
    for (int32_t i = 1; i < length; ++i)
    {
        if (vals[i] < vals[i-1])
        {
            return false;
        }
    }
    return true;
}
 
float GetMedian(const float* vals, int32_t length)
{
    ASSERT(vals != nullptr);
    ASSERT(length > 0);
    ASSERT(IsSorted(vals, length));
    if ((length & 1) == 1)
    {
        return vals[length / 2]; // odd
    }
    float a = vals[length / 2 - 1];
    float b = vals[length / 2];
    return (a + b) / 2;
}
 
void Foo()
{
    float oddVals[] = { 1, 3, 3, 6, 7, 8, 9 };
    DebugLog(GetMedian(oddVals, 7));
 
    float evenVals[] = { 1, 2, 3, 4, 5, 6, 8, 9 };
    DebugLog(GetMedian(evenVals, 8));
 
    DebugLog(GetMedian(nullptr, 1));
 
    float emptyVals[] = {};
    DebugLog(GetMedian(emptyVals, 0));
 
    float notSortedVals[] = { 3, 2, 1 };
    DebugLog(GetMedian(notSortedVals, 3));
}

Calling ASSERT with assertions enabled performs the following replacement:

ASSERT(IsSorted(vals, length));
 
// Becomes:
 
if (!(IsSorted(vals, length)))
{
    DebugLog("assertion failed");
    std::terminate();
}
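
One caveat with a macro that expands to a bare if statement: when it's used inside an unbraced if/else, the else either fails to compile or attaches to the macro's if instead of ours. A common workaround, sketched here as an alternative to the version above rather than a replacement the article prescribes, is to wrap the body in do { ... } while (false) so the macro call plus its semicolon form exactly one statement. The Sign function is made up to show the usage, and puts stands in for DebugLog:

```cpp
#include <cassert>
#include <cstdio>
#include <exception>

// The do/while (false) wrapper makes "ASSERT(x);" a single statement
#define ASSERT(x) \
    do \
    { \
        if (!(x)) \
        { \
            std::puts("assertion failed: " #x); \
            std::terminate(); \
        } \
    } while (false)

// Hypothetical function using ASSERT in an unbraced if/else
int Sign(int value)
{
    if (value >= 0)
        ASSERT(value < 1000000);
    else
        return -1;
    return 1;
}
```

The # operator is used here to include the failed expression's text in the message.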

When disabled, everything's removed including the expressions passed as arguments:

ASSERT(IsSorted(vals, length));
 
// Becomes (only the semicolon after the macro call is left):
 
;

Now imagine we had used a constexpr function instead of a macro:

#ifdef ENABLE_ASSERTS
    constexpr void ASSERT(bool x)
    {
        if (!x)
        {
            DebugLog("assertion failed");
            std::terminate();
        }
    }
#else
    constexpr void ASSERT(bool x)
    {
    }
#endif

When assertions are disabled, we get the empty constexpr function:

constexpr void ASSERT(bool x)
{
}

But when we call ASSERT the arguments still need to be evaluated even though the function itself does nothing:

ASSERT(IsSorted(vals, length));
 
// Is equivalent to:
 
bool x = IsSorted(vals, length);
ASSERT(x); // does nothing

The compiler might be able to determine that the call to IsSorted has no side effects and can be safely removed. In many cases, it won't be able to make this determination and an expensive call to IsSorted will still take place. We don't want this to happen, so we use a macro.
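
To make the difference observable, this sketch counts how many times an argument expression runs with each approach. The names are made up, and ExpensiveCheck has a deliberate side effect so the compiler isn't allowed to remove the call:

```cpp
#include <cassert>

// Stand-in for an expensive check that counts how often it runs
static int g_numChecks = 0;
bool ExpensiveCheck()
{
    ++g_numChecks;
    return true;
}

// Disabled assertion as a macro: the argument text is discarded
#define ASSERT_MACRO_DISABLED(x) do { } while (false)

// Disabled assertion as a function: the argument is still evaluated.
// A constexpr function returning void requires C++14.
constexpr void AssertFunctionDisabled(bool)
{
}

int ChecksRunByMacro()
{
    g_numChecks = 0;
    ASSERT_MACRO_DISABLED(ExpensiveCheck()); // Expands to: do { } while (false);
    return g_numChecks; // 0
}

int ChecksRunByFunction()
{
    g_numChecks = 0;
    AssertFunctionDisabled(ExpensiveCheck());
    return g_numChecks; // 1
}
```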

Macros can also be used to implement a primitive form of C# generics or C++ templates, which we'll cover soon in the series:

// "Generic"/"template" of a Vector2 class
#define DEFINE_VECTOR2(name, type) \
    struct name \
    { \
        type X; \
        type Y; \
    };
 
// Invoke the macro to generate Vector2 classes
DEFINE_VECTOR2(Vector2f, float);
DEFINE_VECTOR2(Vector2d, double);
 
// "Generic"/"template" of a function
#define DEFINE_MADD(type) \
    type Madd(type x, type y, type z) \
    { \
        return x*y + z; \
    }
 
// Invoke the macro to generate Madd functions
DEFINE_MADD(float);
DEFINE_MADD(int32_t);
 
void Foo()
{
    // Use the generated Vector2 classes
    // Use sizeof to show that they have different component sizes
    Vector2f v2f{2, 4};
    DebugLog(sizeof(v2f), v2f.X, v2f.Y); // 8, 2, 4
 
    Vector2d v2d{20, 40};
    DebugLog(sizeof(v2d), v2d.X, v2d.Y); // 16, 20, 40
 
    // Use the generated Madd functions
    // Use typeid on the return value to show that they're overloads
    float xf{2}, yf{3}, zf{4};
    auto maddf{Madd(xf, yf, zf)};
    DebugLog(typeid(maddf) == typeid(float)); // true
    DebugLog(typeid(maddf) == typeid(int32_t)); // false
 
    int32_t xi{2}, yi{3}, zi{4};
    auto maddi{Madd(xi, yi, zi)};
    DebugLog(typeid(maddi) == typeid(float)); // false
    DebugLog(typeid(maddi) == typeid(int32_t)); // true
}

This form of code generation is commonly used in C codebases that lack C++ templates. When templates are available, as they are in all versions of C++, they are the preferred option for many reasons. One reason is the ability to "overload" a class name so we just have Vector2 rather than coming up with awkward unique names like Vector2f and Vector2d.

Another is that there's no need for the usually large lists of DEFINE_X macro calls for every permutation of types needed in every class and function. This really gets out of control when there are several "type parameters." Instead, the compiler generates all the permutations of the class or function based on our usage of them so we don't need to explicitly maintain such lists.

There are many more reasons that we'll get into when we cover templates later in the series.
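
As a brief preview, and only as a minimal sketch since templates haven't been covered yet, the template versions of the macro-generated struct and function might look like this. MaddFloats and MaddInts are hypothetical helpers that show the compiler generating each version on demand:

```cpp
#include <cassert>
#include <cstdint>

// One Vector2 name for every component type
template <typename T>
struct Vector2
{
    T X;
    T Y;
};

// One Madd for every type that supports * and +
template <typename T>
T Madd(T x, T y, T z)
{
    return x*y + z;
}

float MaddFloats()
{
    // T is deduced as float from the arguments
    return Madd(2.0f, 3.0f, 4.0f);
}

int32_t MaddInts()
{
    // T can also be specified explicitly
    return Madd<int32_t>(2, 3, 4);
}
```

No list of DEFINE_VECTOR2 or DEFINE_MADD calls is needed: using Vector2<double> or calling Madd with int32_t arguments is enough for the compiler to generate them.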

Conclusion

The two languages have a lot of overlap in their use of the preprocessor. It runs at the same stage of compilation and features many identically-named directives with the same functionality.

The major points of divergence are in #include, an essential part of the build model before C++20, and in macros created by #define. Function-like macros represent another form of compile-time programming that runs during preprocessing as opposed to constexpr which runs during main compilation. They're also another form of generics or templates. While their necessity has diminished over time, they are still essential for some tasks and convenient for others.

#1 by Pratik Chowdhury on January 22nd, 2021

// After C++11

constexpr float 3.14f

Small typo here

I think you meant

constexpr float PI = 3.14f;

Nice article BTW and thanks for the series and I hope you continue writing it!!!

#2 by jackson on January 23rd, 2021

Thanks for pointing this out. I’ve updated the article with a fix for the typo.

#3 by typoman on March 14th, 2021

typo here: IsSorted(vals, length;

#4 by jackson on March 14th, 2021

Thanks for letting me know! I’ve updated the article to correct the typo.

C++ For C# Developers: Part 24 – Preprocessor