While little utilized, C#’s struct type can come in really handy sometimes. Today’s article shows how to use it to get a lot more mileage out of modern CPUs’ caches to really boost your app’s performance. Read on for some quick tips!

The struct and class types seem really similar at first glance, but they’re represented very differently behind the scenes. One way this shows up is how arrays of each type are allocated in memory. With classes, an array is allocated with enough space to hold only pointers to the class instance objects. Those instances are allocated somewhere else. So on a 64-bit platform, an array of 10 class references takes about 80 bytes (8 bytes per pointer, plus a small array header) in one contiguous block of memory, while the class instances themselves live somewhere else in memory.
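Here is a minimal sketch of what "an array of pointers" means in practice, written as a plain console program rather than a Unity script. The names `PointClass` and `ReferenceArrayDemo` are only illustrative. It shows that a new class array starts out as nothing but null references, and that copying a slot copies the pointer rather than the object.

```csharp
using System;

// Hypothetical type for illustration only.
class PointClass
{
    public int X;
    public int Y;
}

static class ReferenceArrayDemo
{
    // Counts how many slots of the array do not yet point at an instance.
    public static int CountNulls(PointClass[] array)
    {
        var nulls = 0;
        foreach (var item in array)
        {
            if (item == null)
            {
                nulls++;
            }
        }
        return nulls;
    }

    static void Main()
    {
        // Allocates only the block of 10 references; no PointClass exists yet.
        var classes = new PointClass[10];
        Console.WriteLine(CountNulls(classes)); // prints 10

        // Each instance is a separate heap allocation.
        classes[0] = new PointClass();

        // Copying a slot copies the reference, not the object.
        classes[1] = classes[0];
        classes[1].X = 42;
        Console.WriteLine(classes[0].X); // prints 42: both slots point at one object
    }
}
```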

This means that as you loop over the array you are referencing the pointer in the array block of memory and then referencing the object somewhere else entirely. With each iteration you jump around arbitrarily in memory to find each object. While this is still much faster than, say, reading from a hard drive or SSD, it’s not nearly as fast as it can be. Enter struct.

With an array of structs, the array is allocated as a block big enough to hold all of the structs, not just pointers to them. As you loop over it, you read sequentially through this whole block of structs. A CPU loves this type of code because it pulls memory into its caches (e.g. L2) a whole cache line at a time, so each fetch brings in several neighboring structs and your loop keeps reading straight out of the cache. Reading from cache can be an order of magnitude faster than reading from RAM, but that’ll vary a lot by type of CPU.
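As a rough sketch of the inline layout, again as a console program with illustrative names (`PointStruct`, `ValueArrayDemo`), the snippet below shows that a struct array is usable the moment it is allocated and that its elements are independent values. It uses `Marshal.SizeOf` to estimate the element size. That call reports the marshaled size, which I'm assuming matches the in-memory size for a simple struct of two ints.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type for illustration only.
struct PointStruct
{
    public int X;
    public int Y;
}

static class ValueArrayDemo
{
    // Approximate payload size of a PointStruct array, ignoring the array header.
    public static int PayloadBytes(int length)
    {
        return length * Marshal.SizeOf(typeof(PointStruct));
    }

    static void Main()
    {
        // One allocation holds all 10 structs inline, zero-initialized and ready to use.
        var structs = new PointStruct[10];
        Console.WriteLine(structs[0].X); // prints 0

        // Two ints per element: 10 elements * 8 bytes.
        Console.WriteLine(PayloadBytes(structs.Length)); // prints 80

        // Copying an element copies the value, so the two slots stay independent.
        structs[0].X = 42;
        structs[1] = structs[0];
        structs[1].X = 7;
        Console.WriteLine(structs[0].X); // prints 42
    }
}
```

Here the 80 bytes are the data itself, all in one block, with no separate objects to chase.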

To demonstrate, here’s a quick script that uses two types: PointStruct and PointClass. Each just holds two ints: X and Y. The script makes a large array of each and loops over them adding up their X and Y fields. In the end, the loop times are displayed on screen.

using System.Diagnostics;
 
using UnityEngine;
 
struct PointStruct
{
	public int X;
	public int Y;
}
 
class PointClass
{
	public int X;
	public int Y;
}
 
public class MainScript : MonoBehaviour
{
	private string report;
 
	void Start()
	{
		const int size = 100000000;
		var structs = new PointStruct[size];
		var classes = new PointClass[size];
		for (var i = 0; i < size; ++i)
		{
			classes[i] = new PointClass();
		}
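		var sw = new Stopwatch();
		// Sum into a shared accumulator so both loops do the same work
		var accum = 0;
 
		// Time a sequential pass over the contiguous block of structs
		sw.Reset();
		sw.Start();
		for (var i = 0; i < size; ++i)
		{
			accum += structs[i].X;
			accum += structs[i].Y;
		}
		var structTime = sw.ElapsedMilliseconds;
 
		// Time the same pass over class instances, following one pointer per element
		sw.Reset();
		sw.Start();
		for (var i = 0; i < size; ++i)
		{
			accum += classes[i].X;
			accum += classes[i].Y;
		}
		var classTime = sw.ElapsedMilliseconds;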
 
		report = "Type,Time\n"
		         + "Struct," + structTime + "\n"
		         + "Class," + classTime;
	}
 
	void OnGUI()
	{
		GUI.TextArea(new Rect(0, 0, Screen.width, Screen.height), report);
	}
}

If you want to try out the test yourself, simply paste the above code into a MainScript.cs file (Unity requires the file name to match the MonoBehaviour class name) in your Unity project’s Assets directory and attach it to the main camera game object in a new, empty project. Then build in non-development mode for 64-bit processors and run it windowed at 640×480 with fastest graphics. I ran it that way on this machine:

  • 2.8 GHz Intel Core i7-4980HQ
  • Windows 10
  • Unity 5.2.4f1, Windows Standalone, x86_64, non-development
  • 640×480, Fastest, Windowed

And here are the results I got:

Type     Time (ms)
Struct   113
Class    241

[Figure: Array Element Types Performance Graph]

Clearly, the struct version was much quicker than the class version: 113 ms versus 241 ms, or a bit more than twice as fast. Again, this will vary a lot based on CPU and where .NET/Mono decides to put the class instance objects in memory. In any case, the class version is very unlikely to ever beat the struct version: it always pays for an extra pointer dereference per element, while the struct array is guaranteed to be contiguous and so always takes advantage of whatever CPU caching is available.
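If you want to explore how much object placement matters, here is a hedged sketch of one way to do it. It assumes (and this is only an assumption) that the runtime tends to place consecutively allocated objects near each other, so shuffling the references makes the loop visit the same objects in a scattered order. The names `TraversalOrderDemo`, `Create`, `Shuffle` and `Sum` are made up for this example. The code only checks that both passes compute the same total, so wrap each `Sum` call in a `Stopwatch` to compare timings on your own machine.

```csharp
using System;

// Hypothetical type for illustration only.
class PointClass
{
    public int X;
    public int Y;
}

static class TraversalOrderDemo
{
    // Allocates the instances one after another and fills in known values.
    public static PointClass[] Create(int size)
    {
        var items = new PointClass[size];
        for (var i = 0; i < size; ++i)
        {
            items[i] = new PointClass { X = i, Y = 1 };
        }
        return items;
    }

    // Fisher-Yates shuffle of the references; the objects themselves do not move.
    public static void Shuffle(PointClass[] items, int seed)
    {
        var rng = new Random(seed);
        for (var i = items.Length - 1; i > 0; --i)
        {
            var j = rng.Next(i + 1);
            var tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }

    // Same shape of loop as the benchmark: one dereference per element.
    public static long Sum(PointClass[] items)
    {
        long accum = 0;
        for (var i = 0; i < items.Length; ++i)
        {
            accum += items[i].X;
            accum += items[i].Y;
        }
        return accum;
    }

    static void Main()
    {
        var items = Create(1000000);
        var inOrder = Sum(items);   // visits objects in allocation order
        Shuffle(items, 12345);
        var shuffled = Sum(items);  // same objects, scattered visiting order
        Console.WriteLine(inOrder == shuffled); // prints True
    }
}
```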

Keep this in mind next time you need to do a bulk operation on a lot of data. Using a struct just might speed things up a lot! If you’ve done this before, let me know in the comments how it’s worked out for you.