JacksonDunstan.com

Every programmer has heard that global variables are bad practice and should be avoided in favor of other techniques. Yet you’d be surprised how often global and pseudo-global variables are used. Today’s article reveals some of these usages and presents some alternative ways to structure your code so it’s easier to read, write, and maintain. Read on to learn how!

The keyword static might as well be called global. A public static variable exists exactly once no matter how many times its class is instantiated, is accessible from anywhere, and is effectively a global variable. Consider this trivial class:

class Game
{
	public static int Difficulty;
}

No matter how many times you instantiate the Game class, there is only one Difficulty variable. It’s therefore a global variable since it’s not encapsulated in any data structure or a local variable of a function. Now let’s use it just a bit:

class Game
{
	public static int Difficulty;
 
	public IEnemy SpawnEnemy()
	{
		return new Enemy(Difficulty);
	}
}

The SpawnEnemy function creates an enemy and uses the global Difficulty variable to set the enemy’s difficulty at construction time. This raises the question: “When I call SpawnEnemy, what difficulty will the enemy I get have?” Answering it means searching through the source code rather than simply looking at the function signature of SpawnEnemy. If Game is in a DLL and you don’t have the source code, you might even end up disassembling the DLL just to find out how the enemy’s difficulty is set.

Just from this one global variable we can write some really confusing code:

void SpawnEnemies(Game game)
{
	Foo();
	var e1 = game.SpawnEnemy();
	Bar();
	var e2 = game.SpawnEnemy();
	Baz();
	var e3 = game.SpawnEnemy();
}
 
void Foo() { Game.Difficulty = 1; }
void Bar() { Game.Difficulty = 2; }
void Baz() { Game.Difficulty = 3; }

Looking at the SpawnEnemies function, it looks like we should get three of the same enemy. After all, we called SpawnEnemy three times with the same (no) parameters. Instead, we get enemies with difficulties 1, 2, and 3. Later on, perhaps during QA, we’ll wonder why these enemies are behaving slightly differently. We’ll look at SpawnEnemies, be confused, and need to drill into every function that SpawnEnemies calls, every function those call, and so forth. Other threads might even be writing to Difficulty!

Just because Difficulty was made static, our code is now extremely difficult to follow. Every function can read it and every function can write to it, so it could change the logic of any function in the whole app. The only way we’d know for sure is to find everywhere that it’s used and inspect each use by hand to make sure that there are no inter-function troubles like those in SpawnEnemies.

The problem can be slightly improved by converting from public to internal, protected, or private. This limits the amount of code we need to scan to just the assembly (e.g. DLL), class and derivatives, or the class, respectively. The problem is less severe, but the fundamental issue remains and it’s still easy to write reasonable-but-wrong code like SpawnEnemies. All you have to do is move Foo, Bar, and Baz into Game or its assembly, depending on which access specifier you choose.
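To make that concrete, here’s a rough sketch of the private case. It assumes Foo, Bar, and Baz have been moved into Game. The Enemy class, the SpawnThree function, and the Program class are stand-ins made up for this illustration, and Console.WriteLine replaces Debug.Log so it runs outside Unity.

```csharp
using System;

// Hypothetical stand-in for Enemy: it only records its difficulty.
class Enemy
{
	public readonly int Difficulty;
	public Enemy(int difficulty) { Difficulty = difficulty; }
}

class Game
{
	// Private now, but still static: every function in Game shares it.
	private static int Difficulty;

	public Enemy SpawnEnemy()
	{
		return new Enemy(Difficulty);
	}

	// Hypothetical name. It still looks like three identical calls.
	public Enemy[] SpawnThree()
	{
		Foo();
		var e1 = SpawnEnemy();
		Bar();
		var e2 = SpawnEnemy();
		Baz();
		var e3 = SpawnEnemy();
		return new[] { e1, e2, e3 };
	}

	static void Foo() { Difficulty = 1; }
	static void Bar() { Difficulty = 2; }
	static void Baz() { Difficulty = 3; }
}

static class Program
{
	static void Main()
	{
		var enemies = new Game().SpawnThree();
		Console.WriteLine(enemies[0].Difficulty); // prints 1
		Console.WriteLine(enemies[1].Difficulty); // prints 2
		Console.WriteLine(enemies[2].Difficulty); // prints 3
	}
}
```

The search is now confined to one class, but nothing in the signatures of Foo, Bar, or Baz hints that they change what SpawnEnemy returns.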

Even better would be to remove the static keyword and make Difficulty a field of Game instances, not the Game class. Look how much harder it is to write a bad version of SpawnEnemies now:

class Game
{
	public int Difficulty;
 
	public IEnemy SpawnEnemy()
	{
		return new Enemy(Difficulty);
	}
}
 
void SpawnEnemies(Game game)
{
	Foo(game);
	Goo();
	var e1 = game.SpawnEnemy();
	Bar(game);
	var e2 = game.SpawnEnemy();
	Baz(game);
	var e3 = game.SpawnEnemy();
}
 
void Foo(Game game) { game.Difficulty = 1; }
void Bar(Game game) { game.Difficulty = 2; }
void Baz(Game game) { game.Difficulty = 3; }
void Goo() { Debug.Log("hi from Goo"); }

Foo, Bar, and Baz are a lot more suspicious now that they’re taking in a Game parameter. We had to pass it explicitly to them so we suspect that they might be changing something about it. When we called Goo we didn’t pass the Game so we don’t need to search it to make sure it’s not changing our Game in any way. It’s simply impossible for Goo to affect our Game since it doesn’t have any reference to it.

But is that true? Goo could still have access to our Game since we can have multiple variables referencing it. Consider this example, which happens quite a lot:

class EnemySpawner
{
	private Game game;
 
	EnemySpawner(Game game)
	{
		this.game = game;
	}
 
	void SpawnEnemies(Game game)
	{
		Foo(game);
		Goo();
		var e1 = game.SpawnEnemy();
		Bar(game);
		var e2 = game.SpawnEnemy();
		Baz(game);
		var e3 = game.SpawnEnemy();
	}
 
	void Foo(Game game) { game.Difficulty = 1; }
	void Bar(Game game) { game.Difficulty = 2; }
	void Baz(Game game) { game.Difficulty = 3; }
	void Goo() { game.Difficulty = 4; }
}

Now that game is a field of the class, Goo has access to it. This is because every non-static function of the class has an implicit parameter that references the instance of the class. You can use it explicitly with the this keyword. Essentially, Goo looks like this behind the scenes:

void Goo(EnemySpawner thiz) { thiz.game.Difficulty = 4; }

So while you might think you’re not passing any parameters to it, Goo actually gets access to all of the fields of the class. Even worse, it gets access to all of the protected variables of the base class, its base class, and so forth all the way up to Object. An innocent-looking function that takes no parameters could actually have access to dozens of variables that you’d never expect it to be reading and writing!

So while static is pretty obviously a kind of global variable, it’s good to remember that instance fields of a class are a pseudo-global variable too. They aren’t truly accessible globally, but neither are private static fields. Instead, they’re accessible to the group of functions in the class and its derivatives without any explicit parameter passing, just like static variables.
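Here’s a small sketch of that reach through the inheritance chain. Unit, Soldier, Health, and Salute are hypothetical names invented for this illustration, and Console.WriteLine stands in for Debug.Log so it runs outside Unity.

```csharp
using System;

class Unit
{
	// Protected: visible to Unit and to every class derived from it.
	protected int Health = 100;

	public int GetHealth() { return Health; }
}

class Soldier : Unit
{
	// Takes no parameters, yet it reads and writes a field
	// that was declared in the base class.
	public void Salute()
	{
		Health -= 10;
	}
}

static class Program
{
	static void Main()
	{
		var soldier = new Soldier();
		Console.WriteLine(soldier.GetHealth()); // prints 100
		soldier.Salute();
		Console.WriteLine(soldier.GetHealth()); // prints 90
	}
}
```

Nothing about the call soldier.Salute() suggests that Health is involved. You only find out by reading Salute and then the base class.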

So how do we avoid these sorts of trouble? Our first preference should be to make each function’s logic depend only on its parameters and to output only via its return value and its out and ref parameters. Don’t base the logic on fields or write to them as the function’s output. Here’s a little example:

class Bad
{
	public int Val;
 
	// Uses the Val field as input and output
	// How would callers know this by looking at the function signature?
	public void Double()
	{
		Val += Val;
	}
}
 
class Good
{
	// Logic based on parameters only
	// Output via return value only
	public int Double(int val)
	{
		return val + val;
	}
}
 
void Test(Bad bad, Good good)
{
	// Users of Bad.Double have to know that it affects Val
	Debug.Log(bad.Val);
	bad.Double();
	Debug.Log(bad.Val);
 
	// It's obvious to users of Good.Double how it works
	var val = 2;
	Debug.Log(val);
	val = good.Double(val);
	Debug.Log(val);
}
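Applying the same rule to the earlier Game example might look like the sketch below. It assumes the caller passes the difficulty in. The IEnemy and Enemy definitions are minimal stand-ins, since they’re never shown above, and Console.WriteLine replaces Debug.Log so it runs outside Unity.

```csharp
using System;

// Minimal stand-ins for the IEnemy and Enemy types used earlier.
interface IEnemy
{
	int Difficulty { get; }
}

class Enemy : IEnemy
{
	public int Difficulty { get; private set; }
	public Enemy(int difficulty) { Difficulty = difficulty; }
}

class Game
{
	// The difficulty is part of the signature, so callers can see
	// exactly what determines the enemy they get back.
	public static IEnemy SpawnEnemy(int difficulty)
	{
		return new Enemy(difficulty);
	}
}

static class Program
{
	static void Main()
	{
		var e1 = Game.SpawnEnemy(1);
		var e2 = Game.SpawnEnemy(2);
		var e3 = Game.SpawnEnemy(3);
		Console.WriteLine(e1.Difficulty); // prints 1
		Console.WriteLine(e2.Difficulty); // prints 2
		Console.WriteLine(e3.Difficulty); // prints 3
	}
}
```

We still get three different enemies, but this time the reason is visible right at the call sites.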

One technique that can help enforce this is to make your functions static and your fields non-static. This keeps the fields from being truly global (i.e. existing only once in memory no matter what) and leaves your functions unable to use them as input or output. If you follow this, you’ll keep the “side effects” away from your functions and make them much easier to understand. To illustrate that, consider how much code you’d have to trawl through to find out what this function does:

class ThingDoer
{
	private static Stuff stuff;
	private static Thing thing;
 
	void DoStuff()
	{
		CreateStuff();
		SetStuffProperties();
		RemoveLastProperty();
		AddAlternateProperty();
		SetupLogging();
 
		stuff.Execute(thing);
	}
 
	void CreateStuff() { /* ... */ }
	void SetStuffProperties() { /* ... */ }
	void RemoveLastProperty() { /* ... */ }
	void AddAlternateProperty() { /* ... */ }
	void SetupLogging() { /* ... */ }
}

What did all those no-parameter functions do to the stuff and thing fields? Did each one change both? Did some only change one of them? Did any of them not change either one? To find out, you’d need to go to each of them and read through their code. If they called any other functions of the class, you’d have to go to them too.

Now let’s see how ThingDoer would look if we made the functions static and the fields non-static:

class ThingDoer
{
	private Thing thing;
 
	void DoStuff()
	{
		var stuff = CreateStuff();
		SetStuffProperties(stuff);
		RemoveLastProperty(thing);
		AddAlternateProperty(thing);
		SetupLogging();
 
		stuff.Execute(thing);
	}
 
	static Stuff CreateStuff() { /* ... */ }
	static void SetStuffProperties(Stuff stuff) { /* ... */ }
	static void RemoveLastProperty(Thing thing) { /* ... */ }
	static void AddAlternateProperty(Thing thing) { /* ... */ }
	static void SetupLogging() { /* ... */ }
}

Since stuff is created by the CreateStuff function, it isn’t even a field anymore. When it’s used by one of the other functions it’s passed as a parameter. In this case that’s just SetStuffProperties. We automatically know that all the other functions wouldn’t have had any impact on stuff since they don’t have any access to our local variable.

Likewise, thing is passed to only two functions so we know that none of the others are using it based on access to a parameter. Since all the functions are static, we know that they aren’t using it based on access to a field either. So if we’re interested in what’s going on with thing we can skip searching through three of the functions and focus on only two.

Finally, the SetupLogging function doesn’t take any parameters and is static so we know that it’s not going to be using either stuff or thing. Everything about this class is now very explicit and there’s lots of code we can skip reading through if we ever have questions about how DoStuff works. If we’re looking at one of the static functions, we can also skip looking at the rest of the class since there are no static variables that could possibly be used. We can consider the function in total isolation and just look at what it and the functions it calls do without worrying about the rest of the class.
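The compiler backs this up. In the sketch below, Scoreboard, total, AddPoints, and Score are hypothetical names. The point is only that a static function has no implicit this, so an instance field can reach it solely as a parameter. Console.WriteLine stands in for Debug.Log so it runs outside Unity.

```csharp
using System;

class Scoreboard
{
	private int total;

	// Static: there is no implicit 'this', so the input has to be
	// a parameter and the output has to be the return value.
	static int AddPoints(int current, int points)
	{
		// total += points; // would not compile: error CS0120, an object
		//                  // reference is required for the non-static field
		return current + points;
	}

	// The single instance function is the only place the field is touched.
	public int Score(int points)
	{
		total = AddPoints(total, points);
		return total;
	}
}

static class Program
{
	static void Main()
	{
		var board = new Scoreboard();
		Console.WriteLine(board.Score(5)); // prints 5
		Console.WriteLine(board.Score(3)); // prints 8
	}
}
```

If someone later tries to sneak a field access into AddPoints, the build fails instead of the behavior quietly changing.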

That wraps up today’s discussion of global and pseudo-global variables, their problems, and how to avoid them. Please feel free to use the comments section to share your thoughts about globals and pseudo-globals and how you prefer to deal with them in your code!

#1 by Walker on April 25th, 2016

I really like the approach of having static, very functional methods within classes. I used to use that pattern a lot, albeit without making the methods static. Unfortunately, I had a really hard time getting others on my team to do it, too. Making them static is a good trick, though, since others would at least be less likely to mess up the methods that are already doing this. I ended up taking the other route: try to avoid state wherever possible and make classes so small that it’s easy to reason about them. This article makes me want to try your pattern, though, and see how it goes.

#2 by jackson on April 25th, 2016

Thanks for sharing your experience. I’m curious about something you said:

I ended up taking the other route: try to avoid state wherever possible and make classes so small that it’s easy to reason about them.

How did the “avoid state wherever possible” end up looking in your code? Could you provide a little example?
- #3 by Walker on May 2nd, 2016
  
  I’ll see if I can dig up a good example, but the short version is never cache anything. I work pretty hard to refactor code so that I only store the minimum set of values I need to calculate everything. Often times I’ll hide those calculations behind a private property so that it’s easy to still treat them like members, but you just can’t write to them. I’ll see if I can find the code, but an example that comes to mind is a class that handles controller input and needs to remember which controller in an array to read from. Rather than caching the controller instance, I save a ref to the array of controllers, which hand this object represents, and a ref to the service that tells me the controller index based on which hand it is. Then I hide that look up behind a private, read-only property named Controller. This worked out well because it perfectly handled the case where a controller disconnects and then reconnects with a new index.
  - #4 by jackson on May 2nd, 2016
    
    That’s often a good idea. Any cache is essentially derived state that usually needs to be kept in sync. When it’s not, that’s often a bug. I’ve used the private getter approach before and found it works quite nice for creating pseudo-fields.

#5 by Behnam Rasooli on May 30th, 2020

I thought you were going to talk about global variables and you even started it that way but you ended up talking about fields/member variables. These are IMHO two different things.

Your approach looks really weird for an OO design. You’re reinventing Functional Programming. OOP at its core says a class is a bunch of data/variables and a bunch of functions operating on them. If this idea doesn’t work for you, there are other options such as Functional Programming or Data-Oriented Programming.

In this case, a better way would be using smaller classes/methods, more meaningful naming, and a bit of discipline. Of course, if I see a method called DoStuff(), I have to dig into it to find out what it does. But if the same method is called MoveToDestination(), then I know it’s changing the transform property.

The Global Variables Everybody Uses
