We’ve already seen C++’s traditional build model based on #include. Today we’ll look at the all-new build model introduced in C++20. This is built on “modules” and is much more analogous to the C# build model. Read on to learn how to use it by itself and in combination with #include!

Module Basics

The new build system is based on a new language concept called a “module.” This system promises to dramatically decrease compile times, both clean and incremental. It also promises to dramatically increase encapsulation by preventing leakage of preprocessor directives and implementation details. Finally, it fully removes the need to specify file system paths in source code like we do with #include and then use complex directory lookups to find the referenced files.

To convert a translation unit such as a .cpp file into a “module unit,” we use an export module statement:

///////////
// math.ixx
///////////
 
export module math;

We’ve done two things here. First, we’ve named the file with the .ixx extension. Module files can be named with any extension, or no extension at all, just like any other C++ source file. The .ixx extension is used here simply because it’s the preference of Microsoft Visual Studio 2019, one of the first toolchains to support modules.

Second, the line export module math; begins a module named math. Like the rest of C++, the source file is read from top to bottom. Everything after this statement is part of the math module, but everything before it is not.

Currently the module is empty since there’s nothing else in the source file. Let’s add some functions:

///////////
// math.ixx
///////////
 
// Normal function before the "export module" statement
float Average(float x, float y)
{
    return (x + y) / 2;
}
 
// Exported function before the "export module" statement
export float MagnitudeSquared(float x, float y)
{
    return x*x + y*y;
}
 
// The module begins here
export module math;
 
// Normal function after the "export module" statement
float Min(float x, float y)
{
    return x < y ? x : y;
}
 
// Exported function after the "export module" statement
export float Max(float x, float y)
{
    return x > y ? x : y;
}

There are a couple of things to notice here, too. First, we can add export before anything we want to be usable from outside the module. This includes functions like these, variables, types, using aliases, templates, and namespaces. It does not include preprocessor directives such as macros.

Modules can seem analogous to namespaces, but the two are quite distinct. A module can export a namespace and a module doesn’t imply a namespace. Modules aren’t meant to replace namespaces, but they may be used for similar purposes in grouping together related functionality.

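We can export anything that doesn’t have internal linkage. Entities get internal linkage by, for example, being declared static or being placed inside an unnamed namespace. An export must appear at namespace scope: either at the top level of the file, outside of any blocks, or directly inside of a namespace block. We can also export several declarations at once with an export block: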

// Everything in this block is exported
export
{
    float Min(float x, float y)
    {
        return x < y ? x : y;
    }
 
    // Redundant "export" has no effect
    export float Max(float x, float y)
    {
        return x > y ? x : y;
    }
}

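Second, two of these functions are before the export module math; statement. These are part of the “global module” rather than the math module, just like everything outside of a namespace is part of the “global namespace.” Note that the C++20 standard is stricter than this listing suggests: code may only appear before the module declaration inside a “global module fragment” that begins with module;, that fragment may only contain preprocessor directives such as #include, and export isn’t allowed there. Treat the listing as an illustration of where the module begins rather than as code that every compiler accepts.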

There can be only one module in a module unit source file. This isn’t allowed:

// First module: OK
export module math;
float Min(float x, float y)
{
    return x < y ? x : y;
}
 
// Second module: compiler error
export module util;
export bool IsNearlyZero(float val)
{
    return val > -0.0001f && val < 0.0001f;
}

Assuming we don’t do that, let’s now use this module from another file:

///////////
// main.cpp
///////////
 
// Import the module for usage
import math;
 
// OK: Max is found in the "math" module we imported
DebugLog(Max(2, 4)); // 4
 
// Compiler error: none of these are exported by the "math" module
DebugLog(Average(2, 4));
DebugLog(MagnitudeSquared(2, 4));
DebugLog(Min(2, 4));

We use import to name the module that we want to use. We get access to everything marked export in that module. Unlike with header files, we don’t specify the file name of the module unit. This is similar to the C# build system where we simply name a namespace: using System;.

Partitions and Fragments

We could put all of the code for a module in a single file, but this doesn’t scale well as we add more and more code. Imagine all of System.Collections.Generic in a single file! C# addresses this by putting one class (List<T>, Dictionary<K, V>, etc.) in each file. C++ addresses this in multiple ways. The first is called “module partitions” and they allow us to split code across multiple files while still being part of a single module:

///////////////
// geometry.ixx
///////////////
 
// Specify that this is the "geometry" partition of the "math" module
export module math:geometry;
 
export float MagnitudeSquared(float x, float y)
{
    return x * x + y * y;
}
 
////////////
// stats.ixx
////////////
 
// Specify that this is the "stats" partition of the "math" module
export module math:stats;
 
export float Min(float x, float y)
{
    return x < y ? x : y;
}
 
export float Max(float x, float y)
{
    return x > y ? x : y;
}
 
export float Average(float x, float y)
{
    return (x + y) / 2;
}
 
///////////
// math.ixx
///////////
 
// This is the primary "math" module
export module math;
 
// Import the "stats" partition and export it
export import :stats;
 
// Import the "geometry" partition and export it
export import :geometry;
 
///////////
// main.cpp
///////////
 
// Import the "math" module as normal
import math;
 
// Use its exported entities as normal
DebugLog(Min(2, 4)); // 2
DebugLog(Max(2, 4)); // 4
DebugLog(Average(2, 4)); // 3
DebugLog(MagnitudeSquared(2, 4)); // 20

We see here that partitions are specified with a :. The module partition names the primary module (math) and the name of its partition (stats). The primary module just uses the name of the partition (:stats) because its own name (math) has already been stated and doesn’t need to be repeated. The primary module must export import every partition that exports something so the compiler knows everything that’s available in the module when it’s used.

Unlike other identifiers, module names may include a . in them. This means we could instead use math.stats and math.geometry as our module names:

///////////////
// geometry.ixx
///////////////
 
// This is a primary "math.geometry" module
export module math.geometry;
 
export float MagnitudeSquared(float x, float y)
{
    return x * x + y * y;
}
 
////////////
// stats.ixx
////////////
 
// This is a primary "math.stats" module
export module math.stats;
 
export float Min(float x, float y)
{
    return x < y ? x : y;
}
 
export float Max(float x, float y)
{
    return x > y ? x : y;
}
 
export float Average(float x, float y)
{
    return (x + y) / 2;
}
 
///////////
// math.ixx
///////////
 
// This is the primary "math" module
export module math;
 
// Import the "math.stats" module and export it
export import math.stats;
 
// Import the "math.geometry" module and export it
export import math.geometry;
 
///////////
// main.cpp
///////////
 
// Import the "math" module as normal
import math;
 
// Use its exported entities as normal
DebugLog(Min(2, 4)); // 2
DebugLog(Max(2, 4)); // 4
DebugLog(Average(2, 4)); // 3
DebugLog(MagnitudeSquared(2, 4)); // 20

The difference here is that math.stats and math.geometry aren’t partitions. They’re primary modules in their own right, so any of them can be imported directly:

// Import the "math.stats" primary module
import math.stats;
 
// Use its exported entities as normal
DebugLog(Min(2, 4)); // 2
DebugLog(Max(2, 4)); // 4
DebugLog(Average(2, 4)); // 3

It’s important to note that math.stats and math.geometry aren’t “submodules” as far as the compiler is concerned. They just happened to be named in a way that makes them appear that way. This is largely the same as C# namespaces since there’s no special relationship between System, System.Collections, and System.Collections.Generic other than the naming.

Lastly, there is the “private module fragment.” It begins with module :private; and can hold only code that can’t possibly affect the module’s interface. This restriction allows compilers to avoid recompiling code that uses the module when only the private fragment changes. The private fragment is only available when the whole module lives in a single file:

// Primary module
export module math;
 
// Export some function declarations
export float Min(float x, float y);
export float Max(float x, float y);
 
// This begins the "private fragment"
module :private;
 
// Define the functions declared above
float Min(float x, float y)
{
    return x < y ? x : y;
}
float Max(float x, float y)
{
    return x > y ? x : y;
}

Module Implementation Units

So far all of our module files have been “module interface units” since their module declarations began with the export keyword. They’re interfaces to be used by code outside the module such as our main.cpp.

There’s another kind of module unit though: “module implementation units.” These are meant to contain implementation details of the module. They don’t use the export keyword, but contain internal code that’s accessible from within the module:

///////////////
// geometry.ixx
///////////////
 
// A non-exported module partition
module math:geometry;
 
// A non-exported function
float MagnitudeSquared(float x, float y)
{
    return x * x + y * y;
}
 
///////////
// math.ixx
///////////
 
// Primary module
export module math;
 
// Import the module implementation partition
import :geometry;
 
// MagnitudeSquared stays internal to the module: a function first declared
// without "export" can't be exported by redeclaring it with "export" later
 
// Export a function
export float Magnitude(float x, float y)
{
    // Call functions in the imported module implementation partition
    float magSq = MagnitudeSquared(x, y);
    return Sqrt(magSq); // TODO: write Sqrt()
}

This is similar to how we’d split code across header files (.hpp) and translation units (.cpp). In that traditional build system, we’d add declarations of functions in the header files and definitions of those functions in the translation units.

If we don’t need the partitions but still want to separate the interface from the implementation, we can drop the import and remove the partition name:

///////////////
// geometry.cpp
///////////////
 
// A module implementation unit of "math": no "export" and no partition name
module math;
 
// Definition of the function that math.ixx declares and exports
float MagnitudeSquared(float x, float y)
{
    return x * x + y * y;
}
 
///////////
// math.ixx
///////////
 
export module math;
 
// Note: no need to "import math;" since this is already the "math" module
 
export float MagnitudeSquared(float x, float y);
 
export float Magnitude(float x, float y)
{
    float magSq = MagnitudeSquared(x, y);
    return Sqrt(magSq); // TODO: write Sqrt()
}

Notice that we now have geometry.cpp, not geometry.ixx. It’s no longer an interface or a partition, so nothing can import it. Instead, it implicitly imports the math interface, and math.ixx only needs to declare the functions that it defines.

Module Linkage

In the traditional build model, there is “internal linkage” and “external linkage.” A name with internal linkage refers to the same entity only within one translation unit, while a name with external linkage refers to the same entity across translation units. With modules, there is now also “module linkage.” A name that’s declared in a module without export has module linkage: it refers to the same entity across all the units of that module, but it can’t be named from outside the module. Exported names have external linkage so that users of the module can name them:

///////////////////
// statsglobals.ixx
///////////////////
 
// A non-exported partition of the "stats" module
module stats:globals;
 
// Variable with "module linkage" because it isn't exported
int NumEnemiesKilled = 0;
 
////////////
// stats.ixx
////////////
 
export module stats;
 
import :globals;
 
export void CountEnemyKilled()
{
    // Refers to the same variable as in statsglobals.ixx
    NumEnemiesKilled++;
}
 
export int GetNumEnemiesKilled()
{
    // Refers to the same variable as in statsglobals.ixx
    return NumEnemiesKilled;
}
 
///////////
// main.cpp
///////////
 
import stats;
 
DebugLog(GetNumEnemiesKilled()); // 0
CountEnemyKilled();
DebugLog(GetNumEnemiesKilled()); // 1
 
// Compiler error: NumEnemiesKilled has module linkage, so it can't be
// named outside of the "stats" module
DebugLog(NumEnemiesKilled);

Compatibility

Given the 40+ year history of C++, the new build system must be compatible with the old build system. There are a ton of existing header files that we’ll want to use with modules. Thankfully, C++ provides a new preprocessor directive to do just that:

import "mylibrary.h";
// ...or...
import <mylibrary.h>;

Despite not starting with a # and requiring a ; at the end, this is really a preprocessor directive. It’s distinct from a regular module import because it either has double quotes ("mylibrary.h") or angle brackets (<mylibrary.h>) depending on the header search rules desired.

The effect of this directive is to make everything that’s exportable in the header file available to the importing file, just as if export had been added to its source code. The compiler treats the imported header as a “header unit.” We can re-export it with export import to wrap a header file in a module:

////////////////
// mylibrary.ixx
////////////////
 
// Module that wraps mylibrary.h
export module mylibrary;
 
// Import the header unit and re-export everything in it that can be exported
export import "mylibrary.h";

There are a couple of key differences between this import directive and both #include and a regular module import. First, unlike with #include, preprocessor symbols defined before the import directive are not visible to the imported header file. For comparison, here is how #include behaves:

//////////////
// mylibrary.h
//////////////
 
int ReadVersion()
{
    int version = ReadTextFileAsInteger("version.txt");
 
    #if ENABLE_LOGGING
        DebugLog("Version: ", version);
    #endif
 
    return version;
}
 
///////////
// main.cpp
///////////
 
#include "mylibrary.h"
int version = ReadVersion(); // Does not log
 
// ...equivalent to...
 
int ReadVersion()
{
    int version = ReadTextFileAsInteger("version.txt");
 
    #if ENABLE_LOGGING // Note: not defined
        DebugLog("Version: ", version);
    #endif
 
    return version;
}
int version = ReadVersion();
 
/////////////////
// mainlogged.cpp
/////////////////
 
// Define a preprocessor symbol before #include
#define ENABLE_LOGGING 1
 
#include "mylibrary.h"
int version = ReadVersion(); // Does log
 
// ...equivalent to...
 
#define ENABLE_LOGGING 1
 
int ReadVersion()
{
    int version = ReadTextFileAsInteger("version.txt");
 
    #if ENABLE_LOGGING // Note: is defined
        DebugLog("Version: ", version);
    #endif
 
    return version;
}
 
int version = ReadVersion();

C++ provides a facility to work around this limitation. We can use module; before our named module and put preprocessor directives between these two statements. Everything here will be part of the “global module” and accessible from inside the module:

///////////////
// metadata.ixx
///////////////
 
// No module name: this begins the global module fragment
module;
 
// Define a preprocessor symbol before #include
// Only preprocessor directives are allowed in this section
#define ENABLE_LOGGING 1
 
// Use #include instead of the import directive
#include "mylibrary.h"
 
// Our named module
export module metadata;
 
// Export a function from the header file with a using-declaration
export using ::ReadVersion;
 
///////////
// main.cpp
///////////
 
// Use the module as normal
import metadata;
DebugLog(ReadVersion()); // 6

The second difference between the import directive and import with a module is that preprocessor macros defined in the header file are made available to the file that imports it:

///////////////
// legacymath.h
///////////////
 
// Macro defined in the header file
#define PI 3.14
 
///////////
// math.ixx
///////////
 
export module math;
 
// Import directive exposes the PI macro
import "legacymath.h";
 
export double GetCircumference(double radius)
{
    // Macros from the import directive are usable
    return 2.0 * PI * radius;
}
 
///////////
// main.cpp
///////////
 
import math;
 
// OK
DebugLog(GetCircumference(10.0));
 
// Compiler error: macros from import directives are not exported
DebugLog(PI);

Notice how the PI macro is available for use in the module unit that used the import directive but not in users of that module. This prevents macros from transitively “leaking” throughout an entire program.

Conclusion

C++20’s new module build system is much more analogous to C# than its own legacy header files and #include. In C++ terms, C# mixes namespaces and modules together somewhat. We write the name of a namespace (using Math;) in order to gain access to its contents. C++ separates these two features. We can write import math; without math being a namespace. We can layer namespaces on top of modules and even export them.

C# provides support for splitting code across multiple files by adding one member of a namespace in each file. The same is possible in C++, but we can also go further by adding multiple members in a single file and splitting the interface from the implementation. Partitions and fragments are flexible tools that allow us to sub-divide large modules across many source files.

As a C++20 feature that was only standardized recently, modules are not commonly used as of this writing. However, they’re destined to eventually become the dominant build system and bring their many improvements over header files to the vast majority of codebases. In the meantime, we have tools such as the new import "header.h" directive and access to the global module to ease the transition. New code using modules can use these tools to package legacy code into modules, just as if it had been written that way from the start. Old code can simply continue to use the header files.