Next-Level Code Generation
About a year ago we saw how easy it is to use code generation to go beyond the limits of C# generics. The system we used simply replaced strings in a template file to generate a C# file. Today we’ll go way further and radically increase the power of the code generator by using some simple, off-the-shelf tools.
Previously
When we left off last time, we had a system that consisted of just two elements:
- Template files: C# code with keywords marked for replacement, similar to T in generics
- Code generator: A Unity editor C# script that loads template files, replaces keywords, and saves the result as a C# file
This was enough to allow us to escape some limitations of generics, such as the frequently-required virtual method calls due to the need to use interfaces and abstract classes. Still, this system was quite weak as it could only replace keyword strings in the template file.
Today we want to go way, way further. We’ll be going even beyond the power of C++ templates to a place of extreme flexibility.
Concept
The idea today is to replace the trivial code generator that only replaced keyword strings with an off-the-shelf templating tool. There are tons available with various feature sets. They’re widely used to generate HTML, but they can usually be used to generate any text. That means we can take a publicly-available templating engine and use its full feature set to generate C#.
This article isn’t meant to promote any particular templating engine, but one must be chosen to illustrate the concept. The criteria for the purposes of today’s article are that the engine is free, open source, stable, capable of generating C#, and available on all the major desktop platforms. These criteria will likely change from project to project. For example, if a game is exclusively developed on Windows then there is much less need for the templating engine to support macOS and Linux.
The chosen templating engine for this article is Cheetah3. It’s BSD-licensed, has been around since 2001, can easily generate C#, and works on all the major desktop platforms. It’s written in Python, so installing it is as simple as running pip install cheetah3.
Writing Templates
As before, templates are text files with a mixture of C# code and template code. Keywords are no longer used, but instead replaced by variables. For example, before we had keywords in ALLCAPS like this:
using System.Collections.Generic;
using TYPENAMESPACE;

namespace INTONAMESPACE
{
    public static class INTOTYPE
    {
        public static bool ContainsNoAlloc(this List<TYPE> list, TYPE element)
        {
            for (int i = 0, count = list.Count; i < count; ++i)
            {
                if (list[i] == element)
                {
                    return true;
                }
            }
            return false;
        }
    }
}
Now we’ll have variables named like this: ${VarName}. Here’s how it looks:
using System.Collections.Generic;
using ${TypeNamespace};

namespace ${IntoNamespace}
{
    public static class ${IntoType}
    {
        public static bool ContainsNoAlloc(this List<${Type}> list, ${Type} element)
        {
            for (int i = 0, count = list.Count; i < count; ++i)
            {
                if (list[i] == element)
                {
                    return true;
                }
            }
            return false;
        }
    }
}
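To get a feel for this substitution step without installing anything, here is a rough sketch using Python’s standard-library string.Template, which happens to accept the same ${Name} placeholder form. It is only a stand-in for illustration, not Cheetah3 itself, and the values passed in (Game.Types, Game.Extensions, ListExtensions, Enemy) are made up for the example.

```python
from string import Template

# The template from above. string.Template is a stand-in for Cheetah3 here;
# it only understands plain ${Name} placeholders, not loops or #set.
TEMPLATE = """using System.Collections.Generic;
using ${TypeNamespace};

namespace ${IntoNamespace}
{
    public static class ${IntoType}
    {
        public static bool ContainsNoAlloc(this List<${Type}> list, ${Type} element)
        {
            for (int i = 0, count = list.Count; i < count; ++i)
            {
                if (list[i] == element)
                {
                    return true;
                }
            }
            return false;
        }
    }
}
"""


def render(values):
    # substitute() raises KeyError when a placeholder has no value,
    # which catches a forgotten variable early.
    return Template(TEMPLATE).substitute(values)


# Hypothetical input values, chosen only for this example.
generated = render({
    "TypeNamespace": "Game.Types",
    "IntoNamespace": "Game.Extensions",
    "IntoType": "ListExtensions",
    "Type": "Enemy",
})
print(generated)
```

This covers only the simple substitution case. The loops and variables shown next are where a full templating engine earns its keep.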
But we can go way further than this simple variable substitution. Cheetah3 provides the ability to write arbitrary Python code in the template file, so we can use features like loops. Say we want to generate Matrix types with arbitrary numbers of columns and rows. We could do that simply with a nested loop using the #for…#end for syntax. Note that variables are referred to with just $VarName inside such code.
public struct Matrix${NumRows}x${NumCols}
{
    #for $row in range($NumRows)
    #for $col in range($NumCols)
    public float _${row}${col};
    #end for
    #end for
}
We can also define our own variables using the #set syntax. For example, we could make a Vector version of the Matrix like this:
#set $NumComponents = $NumRows * $NumCols
public struct Vector${NumComponents}
{
    #for $i in range($NumComponents)
    public float _${i};
    #end for
}
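To make the expansion concrete, here is roughly what the Matrix and Vector templates above compute, written as plain Python. This is a sketch of the logic only; the function names matrix_struct and vector_struct are invented for the example and are not part of Cheetah3 or the article’s generator.

```python
def matrix_struct(num_rows, num_cols):
    """Build the Matrix struct text with one float field per row/column pair."""
    lines = ["public struct Matrix{}x{}".format(num_rows, num_cols), "{"]
    for row in range(num_rows):          # corresponds to the outer #for
        for col in range(num_cols):      # corresponds to the inner #for
            lines.append("    public float _{}{};".format(row, col))
    lines.append("}")
    return "\n".join(lines)


def vector_struct(num_rows, num_cols):
    """Build the Vector struct text; num_components plays the role of #set."""
    num_components = num_rows * num_cols
    lines = ["public struct Vector{}".format(num_components), "{"]
    for i in range(num_components):
        lines.append("    public float _{};".format(i))
    lines.append("}")
    return "\n".join(lines)


print(matrix_struct(4, 3))
print(vector_struct(4, 3))
```

The template version keeps the C# text as the primary content with the loop directives woven in, which tends to be easier to read than string building once the generated code grows.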
This article won’t go in depth on all the syntax of Python, but suffice to say it’s a powerful scripting language that can be used in Cheetah3 to generate nearly any C# source code imaginable. We’ve already gone way beyond what’s possible with C# generics with just the trivial language features used in the Matrix and Vector examples above. With the addition of other common language features like if and else, lists and maps, and functional programming and lambdas, we have the power to generate whatever we want.
Running Templates
These templates require a tiny Python script to load up the Cheetah3 library and execute the template with the input variables. Here’s what one looks like:
import sys
from Cheetah.Template import Template

# print(...) with a single argument works on both Python 2 and Python 3
print(Template(
    sys.stdin.read(),
    searchList=[{
        'NumRows': int(sys.argv[1]),
        'NumCols': int(sys.argv[2])
    }]))
This script reads the template from standard input (via sys.stdin) and outputs the generated C# code to standard output (via print). It uses two command-line parameters for NumRows and NumCols. To run it, pass the NumRows and NumCols parameters, provide the template via standard input redirection (<Matrix.cs.cheetah), and redirect standard output to the file to generate (>Matrix4x3.cs):
python GenerateMatrix.py 4 3 <Matrix.cs.cheetah >Matrix4x3.cs
The resulting Matrix4x3.cs file will look like this:
public struct Matrix4x3
{
    public float _00;
    public float _01;
    public float _02;
    public float _10;
    public float _11;
    public float _12;
    public float _20;
    public float _21;
    public float _22;
    public float _30;
    public float _31;
    public float _32;
}
At this point we have a variety of ways that the Python script could be run. For example, the Unity editor script from last time could be modified to execute the same command line as above. This would provide Unity integration via GUI menus and when run on the command line via batchmode. Alternatively, the same command line could be placed into any build system, build server steps, or even simply run manually.
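As one possible way to wire this into a build step, the sketch below wraps the command line in a small Python helper that does the same stdin and stdout redirection as the shell. The helper name run_generator is invented for this example; it relies only on the standard-library subprocess module.

```python
import subprocess
import sys


def run_generator(command, template_path, output_path):
    """Run `command` with the template file on stdin and the C# file on stdout.

    Equivalent to: command <template_path >output_path
    Raises subprocess.CalledProcessError if the generator exits with an error.
    """
    with open(template_path, "rb") as template, open(output_path, "wb") as output:
        subprocess.run(command, stdin=template, stdout=output, check=True)


# Example call matching the command line above (not executed here):
# run_generator(
#     [sys.executable, "GenerateMatrix.py", "4", "3"],
#     "Matrix.cs.cheetah",
#     "Matrix4x3.cs")
```

Note that the output file is opened (and therefore truncated) before the generator runs, just as with shell redirection, so a failed run leaves an empty or partial file behind. A build step that cares about this could write to a temporary file and rename it on success.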
A Complex Example
Let’s end today by looking at a complex example of the kind of C# code that can be generated with our newfound power. In last week’s article, we broke down C# iterators and created struct-based replacements of our own. One downside to this was that we still had to box them to IEnumerator when passing them to StartCoroutine. What if we could instead have our own coroutine-runner struct that knew about our specific iterator-replacement struct types so no boxing was ever required? That would eliminate the garbage creation due to boxing and improve performance by eliminating interface function calls like MoveNext.
Implementing this with C# generics is simply impossible. There’s just no way that limited tools like where constraints can do the job. However, now that we have a templating engine in Cheetah3 we can easily write such a coroutine-runner. To start, let’s look at the Python generation script:
import sys
from Cheetah.Template import Template

# print(...) with a single argument works on both Python 2 and Python 3
print(Template(
    sys.stdin.read(),
    searchList=[{
        'runnerName': sys.argv[1],
        'types': sys.argv[2:]
    }]))
This script passes two parameter variables to the template: runnerName and types. The former is the name of the struct to generate and the latter is a list of iterator-replacement struct types like the PrintUrl and DoFourThings types we created last week.
Now let’s look at the template one part at a time:
#set $typeNames = list(map(lambda t: $t.split(".")[-1], $types))
#set $ctorParamNames = list(map(lambda t: $t[0].lower() + $t[1:] + "Capacity", $typeNames))
#set $coroutineTypeNames = list(map(lambda t: $t + "Coroutine", $typeNames))
#set $handleTypeNames = list(map(lambda t: $t + "Handle", $typeNames))
#set $countNames = list(map(lambda t: $t + "sCount", $typeNames))
#set $coroutineArrayNames = list(map(lambda t: $t + "Coroutines", $typeNames))
This section defines new variables using some functional programming and lambdas. typeNames is the name of each type, so System.DateTime becomes just DateTime. The others are derivatives of this that are commonly used throughout the rest of the template. Each map call is wrapped in list(...) so that the results can be indexed and reused on Python 3, where map returns a one-shot iterator rather than a list.
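As a quick sanity check of these naming rules, here is the same derivation written as ordinary Python outside the template. The function name derived_names and the Game namespace are made up for the example; the dictionary keys mirror the template’s variable names.

```python
def derived_names(types):
    """Compute the per-type names that the template derives with #set."""
    # Strip any namespace: "System.DateTime" -> "DateTime"
    type_names = [t.split(".")[-1] for t in types]
    return {
        "typeNames": type_names,
        # Lower-case the first letter: "PrintUrl" -> "printUrlCapacity"
        "ctorParamNames": [n[0].lower() + n[1:] + "Capacity" for n in type_names],
        "coroutineTypeNames": [n + "Coroutine" for n in type_names],
        "handleTypeNames": [n + "Handle" for n in type_names],
        "countNames": [n + "sCount" for n in type_names],
        "coroutineArrayNames": [n + "Coroutines" for n in type_names],
    }


# "Game.DoFourThings" uses a hypothetical namespace to show the stripping.
names = derived_names(["PrintUrl", "Game.DoFourThings"])
for key, value in names.items():
    print(key, value)
```

The naive pluralization in countNames is why a type such as DoFourThings ends up with a field called DoFourThingssCount. It is harmless in generated code, but it shows that naming rules are worth checking against real type names.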
Next up we have the file’s header:
////////////////////////////////////////////////////////////////////////////////
// This file is auto-generated.
// Do not hand modify this file.
// It will be overwritten next time the generator is run.
////////////////////////////////////////////////////////////////////////////////

public struct ${runnerName}
{
This is the simplest substitution as it just fills in the struct’s name with the given variable.
Now we’ll define a couple of nested structs:
    #for $i in range(len($types))
    private struct ${coroutineTypeNames[$i]}
    {
        public int Id;
        public ${types[$i]} Iterator;

        public ${coroutineTypeNames[$i]}(int id, ${types[$i]} iterator)
        {
            Id = id;
            Iterator = iterator;
        }
    }

    #end for
    #for $i in range(len($types))
    public struct ${handleTypeNames[$i]}
    {
        public int Id;

        public ${handleTypeNames[$i]}(int id)
        {
            Id = id;
        }
    }

    #end for
Each of these is a loop that creates one struct per iterator struct type. The first is to associate an ID with a running coroutine struct type. The second is to replace Unity’s Coroutine type with iterator-specific types like PrintUrlHandle. This provides strong compile-time checks to ensure that the proper Handle type is being used when calling StopCoroutine later on.
Now let’s look at the fields:
    #for $i in range(len($types))
    private ${coroutineTypeNames[$i]}[] ${coroutineArrayNames[$i]};
    private int ${countNames[$i]};
    #end for
    private int nextId;
These fields are again generated in a loop. We have one array and one count per type of iterator struct. We also have a single nextId that we’ll use to dole out IDs when creating Handle objects.
Next we have the constructor:
    public ${runnerName}(${", ".join(map(lambda n: "int " + $n, $ctorParamNames))})
    {
        #for $i in range(len($types))
        ${coroutineArrayNames[$i]} = new ${coroutineTypeNames[$i]}[${ctorParamNames[$i]}];
        ${countNames[$i]} = 0;
        #end for
        nextId = 1;
    }
The constructor uses an inline expression to generate the parameter list. Each parameter is an int used to set the initial capacity of the corresponding array. The body of the constructor is simply creating arrays, setting counts to zero, and starting out the IDs at 1.
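The inline join expression in the constructor can be tried on its own in plain Python. This is just the same string operation outside the template, using the parameter names that the PrintUrl and DoFourThings types produce later in the article.

```python
# Parameter names as derived for the PrintUrl and DoFourThings types.
ctor_param_names = ["printUrlCapacity", "doFourThingsCapacity"]

# Prefix each name with its C# type, then join with commas,
# mirroring the ${", ".join(map(...))} expression in the template.
param_list = ", ".join("int " + name for name in ctor_param_names)
signature = "public MyRunner(" + param_list + ")"
print(signature)
```

Using join rather than a #for loop avoids having to special-case the trailing comma after the last parameter.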
Next up is Add, an overload set of private functions to append to these arrays and resize if necessary. This is just like how List<T> works, but we can’t use it because we need more precise control over copying these struct types.
    #for $i in range(len($types))
    private void Add(ref ${coroutineTypeNames[$i]} it)
    {
        int oldLen = ${coroutineArrayNames[$i]}.Length;
        int endIndex = ${countNames[$i]};
        if (endIndex == oldLen)
        {
            ${coroutineTypeNames[$i]}[] newArray = new ${coroutineTypeNames[$i]}[oldLen * 2];
            for (int i = 0; i < oldLen; ++i)
            {
                newArray[i] = ${coroutineArrayNames[$i]}[i];
            }
            ${coroutineArrayNames[$i]} = newArray;
        }
        ${coroutineArrayNames[$i]}[endIndex] = it;
        ${countNames[$i]} = endIndex + 1;
    }

    #end for
Of course we also have RemoveAt functions for each type that shift all elements toward the front and clear the last one in case it has any managed references that should be released:
    #for $i in range(len($types))
    private void RemoveAt${typeNames[$i]}(int index)
    {
        int endIndex = ${countNames[$i]} - 1;
        while (index < endIndex)
        {
            ${coroutineArrayNames[$i]}[index] = ${coroutineArrayNames[$i]}[index + 1];
            index++;
        }
        ${coroutineArrayNames[$i]}[endIndex] = default(${coroutineTypeNames[$i]});
        ${countNames[$i]} = endIndex;
    }

    #end for
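For readers who want to poke at the Add and RemoveAt logic without generating any C#, here is a small plain-Python model of one array/count pair. The class name CoroutineSlots is invented for this sketch, and None stands in for C#’s default(...) value.

```python
class CoroutineSlots:
    """Plain-Python model of one generated array/count pair."""

    def __init__(self, capacity):
        self.items = [None] * capacity  # fixed-size backing array
        self.count = 0                  # number of slots in use

    def add(self, item):
        old_len = len(self.items)
        if self.count == old_len:
            # Full: allocate double the space and copy, as the template's Add does.
            # Like the template, this never grows from a capacity of 0.
            new_items = [None] * (old_len * 2)
            new_items[:old_len] = self.items
            self.items = new_items
        self.items[self.count] = item
        self.count += 1

    def remove_at(self, index):
        end_index = self.count - 1
        # Shift everything after `index` one slot toward the front.
        while index < end_index:
            self.items[index] = self.items[index + 1]
            index += 1
        # Clear the vacated last slot so it holds no stale reference.
        self.items[end_index] = None
        self.count = end_index


slots = CoroutineSlots(2)
for name in ("a", "b", "c"):
    slots.add(name)     # the third add doubles the array from 2 to 4
slots.remove_at(0)
print(slots.items, slots.count)
```

One thing the model makes visible: because growth is computed as the old length times two, an initial capacity of zero can never grow. Callers of the generated constructor would want to pass capacities of at least one.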
Now we get into the public API with a StartCoroutine for each iterator type:
    #for $i in range(len($types))
    public ${handleTypeNames[$i]} StartCoroutine(${types[$i]} it)
    {
        ${handleTypeNames[$i]} handle = new ${handleTypeNames[$i]}(nextId);
        ${coroutineTypeNames[$i]} coroutine = new ${coroutineTypeNames[$i]}(nextId, it);
        Add(ref coroutine);
        nextId++;
        return handle;
    }

    #end for
This creates the appropriate Handle and Coroutine types, adds the Coroutine, and returns the handle. Its opposite is StopCoroutine:
    #for $i in range(len($types))
    public bool StopCoroutine(${handleTypeNames[$i]} handle)
    {
        UnityEngine.Assertions.Assert.IsTrue(
            handle.Id > 0 && handle.Id < nextId,
            "Invalid handle: " + handle.Id);
        for (int i = 0, len = ${countNames[$i]}; i < len; ++i)
        {
            if (${coroutineArrayNames[$i]}[i].Id == handle.Id)
            {
                RemoveAt${typeNames[$i]}(i);
                return true;
            }
        }
        return false;
    }

    #end for
These functions include an assertion to make sure the given Handle is valid before searching for the given coroutine and stopping it.
Next is StopAllCoroutines:
    public void StopAllCoroutines()
    {
        #for $i in range(len($types))
        for (int i = 0, len = ${countNames[$i]}; i < len; ++i)
        {
            ${coroutineArrayNames[$i]}[i] = default(${coroutineTypeNames[$i]});
        }
        ${countNames[$i]} = 0;
        #end for
    }
This is just one function but it has a loop inside of it to set the counts to zero and clear all of the iterator objects in case they have any managed references that should be released.
Finally, we get to the point of all this code: running the iterator types!
    public void Update()
    {
        #for $i in range(len($types))
        for (int i = 0, len = ${countNames[$i]}; i < len; )
        {
            if (${coroutineArrayNames[$i]}[i].Iterator.MoveNext())
            {
                i++;
            }
            else
            {
                RemoveAt${typeNames[$i]}(i);
                len--;
            }
        }
        #end for
    }
}
Update loops over all the arrays calling MoveNext on each running iterator. Since these are specific struct types rather than IEnumerator, the compiler will not box them or perform virtual function calls via the interface. If MoveNext returns true, the coroutine stays in the array. If it returns false, it’s removed from the array.
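The loop in Update removes elements from the array it is iterating, which is easy to get wrong. The following plain-Python sketch shows the same pattern: the index only advances when the current element stays, and the length shrinks when one is removed. The Countdown class is a hypothetical stand-in for an iterator struct and is not from the article.

```python
class Countdown:
    """Hypothetical stand-in for an iterator struct.

    move_next() returns True `steps` times and False afterwards.
    """

    def __init__(self, steps):
        self.remaining = steps

    def move_next(self):
        if self.remaining == 0:
            return False
        self.remaining -= 1
        return True


def update(running):
    """Advance every running item once and drop the ones that have finished."""
    i = 0
    length = len(running)
    while i < length:
        if running[i].move_next():
            i += 1          # still running: move on to the next element
        else:
            del running[i]  # finished: the next element slides into slot i
            length -= 1     # so do not advance i, only shrink the length


running = [Countdown(1), Countdown(0), Countdown(2)]
lengths = []
for _ in range(3):
    update(running)
    lengths.append(len(running))
print(lengths)
```

Advancing the index after a removal would skip the element that just slid into the vacated slot, so that coroutine would miss a frame.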
Now let’s try this out using the following command line to generate the code:
python GenerateCoroutineRunner.py MyRunner PrintUrl DoFourThings <CoroutineRunner.cs.cheetah >MyRunner.cs
Here’s the output code:
////////////////////////////////////////////////////////////////////////////////
// This file is auto-generated.
// Do not hand modify this file.
// It will be overwritten next time the generator is run.
////////////////////////////////////////////////////////////////////////////////

public struct MyRunner
{
    private struct PrintUrlCoroutine
    {
        public int Id;
        public PrintUrl Iterator;

        public PrintUrlCoroutine(int id, PrintUrl iterator)
        {
            Id = id;
            Iterator = iterator;
        }
    }

    private struct DoFourThingsCoroutine
    {
        public int Id;
        public DoFourThings Iterator;

        public DoFourThingsCoroutine(int id, DoFourThings iterator)
        {
            Id = id;
            Iterator = iterator;
        }
    }

    public struct PrintUrlHandle
    {
        public int Id;

        public PrintUrlHandle(int id)
        {
            Id = id;
        }
    }

    public struct DoFourThingsHandle
    {
        public int Id;

        public DoFourThingsHandle(int id)
        {
            Id = id;
        }
    }

    private PrintUrlCoroutine[] PrintUrlCoroutines;
    private int PrintUrlsCount;
    private DoFourThingsCoroutine[] DoFourThingsCoroutines;
    private int DoFourThingssCount;
    private int nextId;

    public MyRunner(int printUrlCapacity, int doFourThingsCapacity)
    {
        PrintUrlCoroutines = new PrintUrlCoroutine[printUrlCapacity];
        PrintUrlsCount = 0;
        DoFourThingsCoroutines = new DoFourThingsCoroutine[doFourThingsCapacity];
        DoFourThingssCount = 0;
        nextId = 1;
    }

    private void Add(ref PrintUrlCoroutine it)
    {
        int oldLen = PrintUrlCoroutines.Length;
        int endIndex = PrintUrlsCount;
        if (endIndex == oldLen)
        {
            PrintUrlCoroutine[] newArray = new PrintUrlCoroutine[oldLen * 2];
            for (int i = 0; i < oldLen; ++i)
            {
                newArray[i] = PrintUrlCoroutines[i];
            }
            PrintUrlCoroutines = newArray;
        }
        PrintUrlCoroutines[endIndex] = it;
        PrintUrlsCount = endIndex + 1;
    }

    private void Add(ref DoFourThingsCoroutine it)
    {
        int oldLen = DoFourThingsCoroutines.Length;
        int endIndex = DoFourThingssCount;
        if (endIndex == oldLen)
        {
            DoFourThingsCoroutine[] newArray = new DoFourThingsCoroutine[oldLen * 2];
            for (int i = 0; i < oldLen; ++i)
            {
                newArray[i] = DoFourThingsCoroutines[i];
            }
            DoFourThingsCoroutines = newArray;
        }
        DoFourThingsCoroutines[endIndex] = it;
        DoFourThingssCount = endIndex + 1;
    }

    private void RemoveAtPrintUrl(int index)
    {
        int endIndex = PrintUrlsCount - 1;
        while (index < endIndex)
        {
            PrintUrlCoroutines[index] = PrintUrlCoroutines[index + 1];
            index++;
        }
        PrintUrlCoroutines[endIndex] = default(PrintUrlCoroutine);
        PrintUrlsCount = endIndex;
    }

    private void RemoveAtDoFourThings(int index)
    {
        int endIndex = DoFourThingssCount - 1;
        while (index < endIndex)
        {
            DoFourThingsCoroutines[index] = DoFourThingsCoroutines[index + 1];
            index++;
        }
        DoFourThingsCoroutines[endIndex] = default(DoFourThingsCoroutine);
        DoFourThingssCount = endIndex;
    }

    public PrintUrlHandle StartCoroutine(PrintUrl it)
    {
        PrintUrlHandle handle = new PrintUrlHandle(nextId);
        PrintUrlCoroutine coroutine = new PrintUrlCoroutine(nextId, it);
        Add(ref coroutine);
        nextId++;
        return handle;
    }

    public DoFourThingsHandle StartCoroutine(DoFourThings it)
    {
        DoFourThingsHandle handle = new DoFourThingsHandle(nextId);
        DoFourThingsCoroutine coroutine = new DoFourThingsCoroutine(nextId, it);
        Add(ref coroutine);
        nextId++;
        return handle;
    }

    public bool StopCoroutine(PrintUrlHandle handle)
    {
        UnityEngine.Assertions.Assert.IsTrue(
            handle.Id > 0 && handle.Id < nextId,
            "Invalid handle: " + handle.Id);
        for (int i = 0, len = PrintUrlsCount; i < len; ++i)
        {
            if (PrintUrlCoroutines[i].Id == handle.Id)
            {
                RemoveAtPrintUrl(i);
                return true;
            }
        }
        return false;
    }

    public bool StopCoroutine(DoFourThingsHandle handle)
    {
        UnityEngine.Assertions.Assert.IsTrue(
            handle.Id > 0 && handle.Id < nextId,
            "Invalid handle: " + handle.Id);
        for (int i = 0, len = DoFourThingssCount; i < len; ++i)
        {
            if (DoFourThingsCoroutines[i].Id == handle.Id)
            {
                RemoveAtDoFourThings(i);
                return true;
            }
        }
        return false;
    }

    public void StopAllCoroutines()
    {
        for (int i = 0, len = PrintUrlsCount; i < len; ++i)
        {
            PrintUrlCoroutines[i] = default(PrintUrlCoroutine);
        }
        PrintUrlsCount = 0;
        for (int i = 0, len = DoFourThingssCount; i < len; ++i)
        {
            DoFourThingsCoroutines[i] = default(DoFourThingsCoroutine);
        }
        DoFourThingssCount = 0;
    }

    public void Update()
    {
        for (int i = 0, len = PrintUrlsCount; i < len; )
        {
            if (PrintUrlCoroutines[i].Iterator.MoveNext())
            {
                i++;
            }
            else
            {
                RemoveAtPrintUrl(i);
                len--;
            }
        }
        for (int i = 0, len = DoFourThingssCount; i < len; )
        {
            if (DoFourThingsCoroutines[i].Iterator.MoveNext())
            {
                i++;
            }
            else
            {
                RemoveAtDoFourThings(i);
                len--;
            }
        }
    }
}
And now we can use it like this:
using UnityEngine;

public class TestScript : MonoBehaviour
{
    private MyRunner runner;

    void Awake()
    {
        // Create the coroutine-runner
        runner = new MyRunner(10, 10);

        // Start a coroutine and get a handle to it
        MyRunner.PrintUrlHandle printHandle = runner.StartCoroutine(new PrintUrl("https://example.com"));

        // Stop the coroutine using the handle we got for it
        //runner.StopCoroutine(printHandle);

        // Start another coroutine
        MyRunner.DoFourThingsHandle doHandle = runner.StartCoroutine(new DoFourThings());

        // Stop all coroutines
        //runner.StopAllCoroutines();
    }

    void Update()
    {
        // Update the coroutine-runner (and all coroutines)
        runner.Update();
    }
}
Running this prints the following:
Did A
Did B
<!doctype html>... {snip}
Did C
Did D
This shows that the coroutines are indeed running simultaneously, just as with IEnumerator-based Unity coroutines.
Conclusion
Today we’ve seen how to use an off-the-shelf templating engine as a code generator. In so doing we’ve radically increased the power of the system from lowly C# generics or even simple keyword substitution. We can now write arbitrarily-complex code to generate C# and achieve useful results like the above coroutine-runner. And all of this is done with relative ease since we’re just using common, free, well-supported, and portable tools like Python.
This clearly isn’t meant for every possible use case or to replace 100% of C# generics usage, but it’s a handy tool in your toolbox for when you need something more powerful.
If you’ve used a templating engine to generate code in the past, post your experiences in the comments and let me know how it worked out for you.
#1 by Todd Ogrin on November 12th, 2018 ·
I’ve done a fair bit of this sort of thing and I can offer a couple tips.
Don’t be afraid to just use {your favorite language here} to build your code generator. Once you get the hang of it, you can get a lot of mileage out of appending all your code together.
Also, if you use C# (for example) to build C# code, your generator can rely on C#-specific things, like reflection. Most template generator packages don’t offer that, at least, not without customization.
I spend a lot of time on a working, non-generated version of my code. Once I’m happy with how it works, I implement the generator that produces it. Basically, I do whatever I can so my code generator is targeting a final product that already works.
Finally, my first inclination was to use random field names in my generated code. I switched to using scoped counters. For example, instead of …
…I get…
This is more verbose but it minimizes diffs when checking in to source control after regenerating.
#2 by jackson on November 13th, 2018 ·
Reflection can indeed be a powerful tool while building a code generator. My own UnityNativeScripting is written in C# and heavily uses reflection to generate its bindings. That said, the reflection part and the generation part can be split into two steps. First, C# code can use reflection to gather the required information about .NET assemblies and output a file with the given information. Second, a code generator in any language (e.g. Python for Cheetah3 as in the article) can consume this file and use it to actually generate the code. This will be somewhat more complex, but also gives access to a lot of powerful off-the-shelf code generation tools.
I think your general approach of writing a non-generic version of the code first is a good one. I essentially started with the final output code for the coroutine-runner in the article before working backwards to substitute specifics for generalities. The result is a script that can generate not only what I started with but code that would have been much more tedious for me to write and maintain by hand.
Counters seem much better than random names. IL2CPP and UnityNativeScripting do a lot of this. Another approach, as taken in the article, is to use type names as differentiators. There’s no one right way, so it’s good to have both available to fit the more appropriate one to each situation.
#3 by ddalacu on November 13th, 2018 ·
Hello, I like your articles a lot. I have used code generation myself with CodeDOM too. I would like to make a suggestion: make a tutorial about Cecil, which is a super powerful tool, and maybe an implementation of the Either monad in Unity. Keep up the good work.
#4 by jackson on November 13th, 2018 ·
Thanks for the suggestions! I’ll look into an article on Cecil. As for Either, check out my two-part 2016 article.