A Rant On Unity, Error Handling, And Async Code

I often run across two problems while using Unity. The first is that there are multiple incompatible ways to handle async[1] operations; the second is that there is no good way to perform error handling for fallible operations. In the past I've thought of these as separate concerns, but together they reveal a shortcoming in the design of the Unity engine.

Let's start with a vague example: I have an operation I want to perform, maybe a web request or some time-consuming computation. This operation may take multiple frames or even multiple seconds before it completes. It returns a result, but it might also fail and return an error. I want to make sure I handle errors when they occur, so ideally the API should force me to explicitly handle the error case in order to invoke the operation[2].

In my experience there are two ways to go about async operations in Unity C#:

  1. Callbacks: Have the operation take two callbacks: one that takes the result of the operation and is invoked when the operation completes successfully, and another that takes an error message and is invoked when the operation fails for any reason.
  2. Coroutines: Create a class representing the operation and implement IEnumerator for the class, then use yield return myOperation; within a coroutine.

Callbacks

This one is pretty simple. We define our operation as such:

public void DoOperation(Action<GoodResult> onSuccess, Action<BadResult> onError)
{
	// Do the thing Zhu Li!

	if (success)
	{
		onSuccess(result);
	}
	else
	{
		onError(error);
	}
}

And when we want to perform the operation:

DoOperation(
	(result) => { ... },
	(error) => { ... });

The biggest advantage of this design is that it makes error handling explicit because you always have to pass in something to handle the error case, even if it's an empty callback. This makes it less likely that the user will simply forget to handle the error case[3]. This design also works well because C# has closures, which makes callbacks incredibly flexible and easy to use.

Callbacks don't work as well when you need to compose or chain multiple async operations. Let's say I need to run three operations in order A -> B -> C, passing the result of each operation into the next:

OperationA(
	(resultA) => {
		OperationB(
			resultA,
			(resultB) => {
				OperationC(
					resultB,
					(resultC) => { ... },
					(errorC) => { ... });
			},
			(errorB) => { ... });
	},
	(errorA) => { ... });

Even in this small example things are getting hairy. JavaScript[4] developers call this callback hell. Even with only one callback per operation this results in deeply nested functions and convoluted control flow, and adding an error handler to each operation only makes things worse. Depending on how much you need to compose async operations this may or may not be a deal breaker, but I'm looking for a solution that doesn't break down in more complex cases.

Coroutines

Coroutines, built on C# iterators using yield, provide an alternative to callbacks and exist primarily to avoid "callback hell" by making async code look like synchronous code. Using a coroutine for our nondescript operation is easy to write and easy to read:

// Create object representing the operation.
var operation = new MyOperation();

// Pause execution until the operation completes.
yield return operation;

// Check if the operation succeeded.
if (operation.succeeded)
{
	DoSomething(operation.result);
}
else
{
	HandleError(operation.error);
}

In most cases you'll also write the operation as a coroutine, though it will depend on the particular operation. If the operation is asynchronous because it needs to happen over an extended period of time (e.g. tweening an object between two points) or the operation takes so long to compute that it needs to be spread over multiple frames to avoid slowing down the game (e.g. a pathfinding algorithm) then a coroutine will work just fine.

public IEnumerator Run(Transform transform, Vector3 from, Vector3 to, float duration)
{
	var offset = to - from;
	var velocity = offset / duration;

	while (duration > 0f)
	{
		transform.Translate(velocity * Time.deltaTime);
		duration -= Time.deltaTime;

		// Wait until next frame so the tween appears to animate.
		yield return null;
	}
}

And calling it looks almost the same:

yield return StartCoroutine(Run(transform, from, to, duration));

Things get trickier if you want to do something more complex (e.g. a coroutine for async network calls or file loading) but it's doable.
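As a rough illustration of the "more complex" case, here is a minimal sketch of an operation class that implements IEnumerator and does its work on a thread-pool thread. The names BackgroundOperation and CoroutineDriver are hypothetical and not part of Unity; the sketch assumes the Unity version in use keeps stepping a yielded IEnumerator until MoveNext returns false, and the driver class only stands in for Unity's scheduler so the code can run outside the engine.

```csharp
using System;
using System.Collections;
using System.Threading;

// Hypothetical operation type: runs `work` on a thread-pool thread and exposes
// the same succeeded/result/error shape used by the examples in this post.
public class BackgroundOperation<T> : IEnumerator
{
    // Volatile so the polling thread reliably sees the worker's final write.
    private volatile bool done;

    public bool succeeded { get; private set; }
    public T result { get; private set; }
    public string error { get; private set; }

    public BackgroundOperation(Func<T> work)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                result = work();
                succeeded = true;
            }
            catch (Exception exception)
            {
                // Turn the failure into data instead of letting it escape the worker.
                error = exception.Message;
            }
            finally
            {
                // Written last, so the fields above are set before the poller sees it.
                done = true;
            }
        });
    }

    // Returning true means "not finished yet, check again later".
    public bool MoveNext() { return !done; }
    public object Current { get { return null; } }
    public void Reset() { throw new NotSupportedException(); }
}

// Stand-in for Unity's coroutine scheduler (an assumption for this sketch):
// it simply polls the operation until it reports completion.
public static class CoroutineDriver
{
    public static void RunToCompletion(IEnumerator routine)
    {
        while (routine.MoveNext())
        {
            Thread.Sleep(1);
        }
    }
}
```

Inside an actual coroutine the driver would be replaced by yield return operation;. Note that nothing here stops a caller from reading result without first checking succeeded.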

To see how coroutines avoid callback hell let's rewrite the last callback example with coroutines.

var operationA = new OperationA();
yield return operationA;
if (!operationA.succeeded)
{
	...
	yield break;
}

var operationB = new OperationB(operationA.result);
yield return operationB;
if (!operationB.succeeded)
{
	...
	yield break;
}

var operationC = new OperationC(operationB.result);
yield return operationC;

Now it's much easier to tell that our operations happen in the order A -> B -> C, and the control flow overall is much cleaner.

The problem with coroutines, which is ultimately the problem with C#, is that there's no way to make error handling explicit. The most obvious example of this is Unity's WWW class:

var request = new WWW("http://my-site.com/game/assets.unity3d");

// Wait for the download to complete.
yield return request;

// Load an asset from the asset bundle.
// ERROR: `request.assetBundle` will be null if the download failed.
request.assetBundle.Load("image.png");

With WWW you have to remember to check whether the request succeeded, and if you forget even once you've injected madness into your codebase. Ultimately this failing isn't Unity's fault; the problem is that C# doesn't believe in error handling. In C# there's no good way to have a single object represent either a result value or an error value in a way that requires that you address both possibilities before getting either value. In Rust you'd use the Result<T, E> type:

let result = some_operation();
match result {
	// This branch is only run if the operation succeeded.
	Ok(success_value) => { do_something(success_value); },

	// This branch is only run if the operation failed.
	Err(error_info) => { handle_error(error_info); },
}

In Rust you match on the result and have to have a branch for both the success case and the error case[5]. This looks a lot like the callback version in C# but it works just fine with synchronous code and doesn't munge control flow the way callbacks do. Functionally this could be expressed with if blocks in C# as follows:

if (result.succeeded)
{
	DoSomething(result.successValue);
}
else
{
	HandleError(result.errorInfo);
}

The major difference being that in C# you could do something erroneous like:

// Try to get the success value even though the operation failed.
if (!result.succeeded)
{
	DoSomething(result.successValue);
}

or:

// Try to get the success value without checking if the operation succeeded.
DoSomething(result.successValue);

In those cases the C# compiler wouldn't so much as emit a warning, whereas in Rust these would both be compile errors.
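For comparison, here is a minimal sketch of how close a library type can get in C#. The Result class, its Match method, and the ParsePositive helper are all hypothetical names invented for this illustration. The idea is to keep both values private so the only way to reach either one is to supply a handler for each case.

```csharp
using System;

// Hypothetical Result type: the payload fields are private, so callers cannot
// read the success value or the error without going through Match.
public sealed class Result<T, E>
{
    private readonly bool succeeded;
    private readonly T successValue;
    private readonly E errorInfo;

    private Result(bool succeeded, T successValue, E errorInfo)
    {
        this.succeeded = succeeded;
        this.successValue = successValue;
        this.errorInfo = errorInfo;
    }

    public static Result<T, E> Ok(T value)
    {
        return new Result<T, E>(true, value, default(E));
    }

    public static Result<T, E> Err(E error)
    {
        return new Result<T, E>(false, default(T), error);
    }

    // Both handlers are required arguments; exactly one of them is invoked.
    public R Match<R>(Func<T, R> onSuccess, Func<E, R> onError)
    {
        return succeeded ? onSuccess(successValue) : onError(errorInfo);
    }
}

public static class ResultDemo
{
    // Hypothetical fallible operation used only to exercise the type.
    public static Result<int, string> ParsePositive(string text)
    {
        int value;
        if (!int.TryParse(text, out value))
        {
            return Result<int, string>.Err("not a number: " + text);
        }
        if (value <= 0)
        {
            return Result<int, string>.Err("not positive: " + text);
        }
        return Result<int, string>.Ok(value);
    }
}
```

A caller would write something like ResultDemo.ParsePositive(input).Match(value => value * 2, error => -1). This does make both cases mandatory, but only by passing two closures, which is the callback style again rather than anything resembling Rust's match.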

The Elephant In The Room: Exceptions

At this point it's worth discussing exceptions and try/catch. Exceptions are generally considered to be the "right" way to do error handling in C# and its family of languages. You write your code assuming all operations are infallible and wrap any error-causing code in a try block to catch thrown exceptions:

try
{
	var resultA = OperationA();
	var resultB = OperationB(resultA);
	var resultC = OperationC(resultB);
}
catch (Exception exception)
{
	...
}

If any of the operations fail they throw an exception, and then all error handling is performed within the catch block. While I could fill tomes with arguments against exceptions, the most relevant one is that exceptions are even less explicit. If having to do if (operation.succeeded) before trying to access operation.result is liable to be forgotten, then having the operation throw an exception when it fails is even worse. Exceptions tempt developers to say "I don't need to handle that error now, somewhere a few levels up the call stack someone will have a try/catch, and if they don't it's not my problem". They munge control flow and silently introduce madness into your codebase. When operation.succeeded is false you at least know what failed; if an exception is thrown in the example above then it may be impossible to know which of the three operations threw it. This makes error handling code harder to write and more likely to respond incorrectly to an error, and the only thing worse than buggy code is buggy error handling code.

The Solution

The solution to this problem, as I see it, is as follows:

  1. Use coroutines for all async operations.
  2. The yield statement takes an argument representing the operation to wait on and, once that operation completes, evaluates to the operation's result.
  3. Implement a Result<T, E> type that allows fallible operations to return a single result and still require explicit handling of both cases.

The 3rd point is effectively impossible because C# doesn't have the functionality to create such a type in an ergonomic way. The 1st is really just a matter of convention: if Unity used coroutines for everything, and encouraged pervasive usage of coroutines for all async operations, then coroutines would be the default for all Unity games. Interestingly, the 2nd point is already possible with C# today, but with await instead of yield. await takes an async operation as a parameter and returns the result of that operation once it completes, making it strictly more useful than yield. There may well be a good reason why Unity chose to go with yield and iterators instead of await and async, but my knowledge of C# isn't enough to say what that might be.
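To show what "await returns the result" means in practice, here is a minimal sketch in plain .NET, outside Unity. The AwaitDemo class and its three operations are placeholders invented for this example, and Task.Delay merely stands in for real asynchronous work.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Placeholder operations: each one waits briefly, then produces a value.
    static async Task<int> OperationA()
    {
        await Task.Delay(10);
        return 1;
    }

    static async Task<int> OperationB(int input)
    {
        await Task.Delay(10);
        return input + 1;
    }

    static async Task<int> OperationC(int input)
    {
        await Task.Delay(10);
        return input * 10;
    }

    // Each await suspends this method until the operation completes and then
    // hands back its result directly, with no separate result field to read.
    public static async Task<int> RunChain()
    {
        var resultA = await OperationA();
        var resultB = await OperationB(resultA);
        return await OperationC(resultB);
    }
}
```

The chain A -> B -> C reads top to bottom just like the coroutine version. A failed operation surfaces as an exception thrown at the await, though, so on its own this does nothing for explicit error handling.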

While it's not possible to have this kind of async with Unity and C# today, let's take a look at what it might look like if my suggested changes were made:

var resultA = await OperationA();
match resultA
{
	Success(value) => { ... },
	Error(error) => {
		await OperationC();
		return;
	},
}

var resultB = await OperationB();
match resultB
{
	Success() => { ... },
	Error(error) => { ... },
}

Now we have the best of both worlds: Sequential operations happen sequentially in code, but error handling is explicit. Branching async operations still lead to rightwards drift, but no worse than nested, branching if blocks do.

Bonus: Reactive Programming

If you've seen UniRx you may think you have a clever solution: functional reactive programming. The thing is, FRP is what you use when you don't have language-level support for coroutines or async operations. C# has both, so async + await is a strictly more expressive, user-friendly alternative to FRP; it's just that Unity chooses not to use it. Also, we still have the problem of implicit error handling: if you don't add a handler for errors then it silently fails (or, worse, throws an exception) and you've injected madness into your codebase.

Conclusion

As much as I love complaining about Unity, and I do love complaining about Unity, I still work with it every day and need a solution to this problem that I can use now. So what's the best way to handle fallible async operations in Unity today? At this point I'm leaning towards coroutines.

My suspicion is that coroutines are underused in general, and I see a lot of potential in using coroutines more pervasively when writing game code. While I'm unhappy about not being able to also have explicit error handling, that's something I've largely given up on for C#. I think C# is an interesting language and it makes a lot of smart design choices, but I've never been satisfied with its error handling story. Since callbacks are the only way to get the kind of error handling I want, I'd rather push for coroutines and more elegant code and use other methods like code reviews and testing to catch errors.



  1. In this case I'm using "async" to mean "non-blocking". While in some cases asynchronous code involves multi-threading and true parallelism, Unity prefers to have all game code on the main thread, which means that async operations in Unity are often only time-sharing the main thread. ↩︎

  2. If I can access the result value without having to specify what I do if an error occurred, then chances are I will forget to handle the error case. Even if you trust yourself to always remember to check errors (which you shouldn't), do you trust all of your coworkers? Do you trust everyone to know the Unity API well enough to even know which operations can fail? ↩︎

  3. Of course, that doesn't guarantee that they'll do anything reasonable with the error, or handle it in the right way, but if I could write an API that couldn't be misused then I'd have solved the mystery of the universe or something. ↩︎

  4. That's JavaScript, not UnityScript, don't confuse the two. ↩︎

  5. Swift and Haskell have equivalent types. That's because Rust, Swift, and Haskell all support sum types (sometimes called tagged unions) and pattern matching. C# doesn't support either of these, and neither do any of the other Unity scripting languages. Interestingly there is a .NET language that does have these features: F#. In theory you could use F# for your Unity development, but using an unsupported language is just trading one set of problems for another. ↩︎

David LeGare

David is a game developer and systems programmer. He likes cats, he tolerates dogs, and he often remembers to dress himself before leaving the house.
