Why Rust enums are so cool
OOP enums are pretty boring. They are just a way to give names to some integral constants. Rust enums are much more powerful. They solve some design problems much more elegantly than was possible in OOP languages. Read on to find out what Rust enums can do that OOP enums can't.
Let's say you have an enum in C#:
    enum Colour { Red, Green, Blue }
What can you do with it? You can compare it with other values of the same type:
    var colour1 = Colour.Red;
    var colour2 = Colour.Green;
    //Compare colour1 and colour2 and do something if equal
    if (colour1 == colour2) {
        //do something
    }
Or maybe you can convert them into an integer:
    var colourInt = (int)colour1;
And that's pretty much it.
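For comparison, Rust supports this plain style of enum too. The following is a minimal sketch, assuming a Colour enum that mirrors the C# one; the as_integer helper is a hypothetical name introduced only for illustration.

```rust
// A plain, C-like enum: the variants carry no data.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Colour {
    Red,
    Green,
    Blue,
}

// Hypothetical helper: converts a Colour into its integer discriminant.
fn as_integer(colour: Colour) -> i32 {
    colour as i32
}

fn main() {
    let colour1 = Colour::Red;
    let colour2 = Colour::Green;
    // Comparison works because of the derived PartialEq.
    if colour1 == colour2 {
        println!("same colour");
    }
    println!("{}", as_integer(Colour::Blue)); // prints 2
}
```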
Now, I'm not completely dismissing enums in OOP languages. They do improve type safety and limit the possible values that a variable can have. Not to mention readability over plain integers. And that makes the code cleaner and reduces chances of bugs. But OOP enums stop short of realizing their full potential.
The real power of enums starts showing when programming languages allow enum variants to carry data around. Let's take a look at the Option enum to understand what I mean.
Option in Rust is a container for a value. The container could be empty, or it could hold some value. This is how it is defined:

    enum Option<T> {
        None,
        Some(T),
    }

The None variant represents an empty state and the Some variant can carry data of type T. Why is that useful? Think about how you would implement something similar in C#. It would likely be a class with a flag indicating whether a value is present or not. And indeed, Nullable in C# is defined (almost) like this:

    class Nullable<T> {
        public bool hasValue;
        public T value;
    }
Now think about the usability of this class. How would you get the value out of a Nullable?

    var mightBeNull = new Nullable<string>();
    if (mightBeNull.hasValue) {
        //do something with mightBeNull.value
    }

The if check in the code above is critical. If you omit it, the mightBeNull.value.Length expression will throw a NullReferenceException:

    var mightBeNull = new Nullable<string>();
    //no compiler error but still a NullReferenceException
    var length = mightBeNull.value.Length;
In stark contrast, you can't directly access the value in Rust:
    let might_be_null: Option<String> = Option::None;
    //error[E0609]: no field `value` on type `Option<String>`
    let some_other_var = might_be_null.value;
Instead, the Rust compiler forces you to check that the might_be_null variable is the Some variant before you can get your hands on the value wrapped inside:

    let might_be_null: Option<String> = //get an Option<String> from somewhere
    if let Some(value) = might_be_null {
        //do something with value
    }
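To make the "from somewhere" part concrete, here is a small runnable sketch. The first_word and word_length functions are hypothetical helpers invented for this example; they only stand in for any code that produces and consumes an Option<String>.

```rust
/// Hypothetical helper: returns the first word of `text`, if there is one.
fn first_word(text: &str) -> Option<String> {
    text.split_whitespace().next().map(|word| word.to_string())
}

/// Returns the length of the first word, or 0 when there is none.
fn word_length(text: &str) -> usize {
    let might_be_null: Option<String> = first_word(text);
    // The value is only reachable inside the `Some` branch.
    if let Some(value) = might_be_null {
        value.len()
    } else {
        0
    }
}

fn main() {
    println!("{}", word_length("hello world")); // prints 5
    println!("{}", word_length("   ")); // prints 0
}
```

There is no way to reach value.len() without first going through the Some branch, so the empty case has to be handled somewhere.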
Pretty cool, isn't it? While you could easily shoot yourself in the foot in C#, Rust prevents you from committing such silly mistakes. Let's take a look at another example.
Result is the centerpiece of error handling in Rust. Any fallible function can either successfully return a value or fail with an error. This is how Result is defined:

    enum Result<T, E> {
        Ok(T),
        Err(E),
    }

Here both variants of Result carry some data. The Ok(T) variant carries the return value if a function succeeds. The Err(E) variant carries the error value if it fails.
If you have ever written some C, you must have seen a pattern of error handling in which a single return value doubles up as both the result on success and the error indicator on failure. For example, the atof function is declared like this in C:

    double atof(const char *str);

This function will try to parse a double from str. If the function succeeds, it returns the parsed value. But if it can't parse a value, it returns zero. Do you know what happens if the input string can be parsed into a zero (e.g. "0")? This will also return zero. It means that if atof returns zero, you can't tell whether that was because the input string was "0" or some unparseable gibberish.
In Rust, the same function would have a much cleaner return type:

    fn atof(s: &str) -> Result<f64, u8>

atof will return an Ok(f64) when it can successfully parse a number but will return an Err(u8) if it can't. There is no chance of using some valid value as an error code because the Ok and Err variants carry separate values.
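A possible body for such a function is sketched below. It assumes a string slice parameter, leans on the standard library's str::parse, and uses the error code 1 as a made-up placeholder, since the article does not say what the u8 should mean.

```rust
/// Sketch of a Rust `atof`. The error code 1 is a placeholder chosen
/// for this example, not a documented convention.
fn atof(s: &str) -> Result<f64, u8> {
    s.trim().parse::<f64>().map_err(|_| 1)
}

fn main() {
    // A real zero and a parse failure are now distinguishable.
    println!("{:?}", atof("0")); // prints Ok(0.0)
    println!("{:?}", atof("gibberish")); // prints Err(1)
}
```

Unlike the C version, the "0" input and the unparseable input produce different variants, so the caller never has to guess.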
As with an Option, you can't directly get the value of a Result:

    let result = atof("123.56");
    //error[E0609]: no field `value` on type `Result<f64, u8>`
    let value = result.value;
The most direct safe way to get the value out is to pattern match on atof's return value:

    match atof("123.56") {
        Ok(value) => { /* do something with value */ }
        Err(error) => { /* handle the error */ }
    }
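The same matching pattern works for any Result. As a sketch, the example below matches on the Result returned by the standard library's str::parse; the describe function is a hypothetical helper written only to show both arms doing real work.

```rust
use std::num::ParseFloatError;

/// Hypothetical helper: turns a parse attempt into a readable message.
fn describe(input: &str) -> String {
    let parsed: Result<f64, ParseFloatError> = input.parse::<f64>();
    match parsed {
        // Success: the parsed number is available as `value`.
        Ok(value) => format!("parsed {value}"),
        // Failure: the error is available as `error`, never a fake number.
        Err(error) => format!("could not parse {input:?}: {error}"),
    }
}

fn main() {
    println!("{}", describe("123.56")); // prints parsed 123.56
    println!("{}", describe("abc"));
}
```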
And lastly, there is one more safety feature enabled by Result. In C it is too easy to forget to check an error code. In Rust the compiler warns you if you don't use a Result:

    //a call which throws away a Result
    atof("123.56");

The above line will issue this warning:

    warning: unused `Result` that must be used
     --> src\main.rs:6:5
      |
    6 |     atof("123.56");
      |     ^^^^^^^^^^^^^^^
      |
That's all about Result for now. Next, let's talk about when you should write your own enums in Rust.
When to use Rust enums
The built-in Option and Result types are great, but how do you design your own enums? In general, whenever a variable can be in one of a few possible states, an enum might be a good fit. Consider the following example from the serde-yaml crate:
    pub enum Value {
        /// Represents a YAML null value.
        Null,
        /// Represents a YAML boolean.
        Bool(bool),
        /// Represents a YAML numerical value, whether integer or floating point.
        Number(Number),
        /// Represents a YAML string.
        String(String),
        /// Represents a YAML sequence in which the elements are
        /// `serde_yaml::Value`.
        Sequence(Sequence),
        /// Represents a YAML mapping in which the keys and values are both
        /// `serde_yaml::Value`.
        Mapping(Mapping),
    }
The Value enum represents a value in a YAML file. A YAML value can be either null, or a bool, or a number, and so on. Hence, this is a perfect place to use an enum.
Tackling the same problem in an OOP language leaves you with just two broad options. You can either try to shoehorn everything into a single Value class, or make one class for each type of value (Null, Bool, etc.) derived from a base Value class. The first option is just an ugly mishmash of unrelated member variables. The second is better but still requires a lot of boilerplate. Luckily, in Rust you don't have to make this tradeoff.
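To show what working with such an enum looks like, here is a minimal sketch. The Value type below is a simplified, hypothetical stand-in (it is not serde-yaml's actual definition), and describe is an illustrative helper.

```rust
/// A simplified, hypothetical value type (not serde-yaml's real `Value`).
#[derive(Debug, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Text(String),
    Sequence(Vec<Value>),
}

/// Describes a value. The match must cover every variant, so adding a
/// new variant later makes the compiler point at this function.
fn describe(value: &Value) -> String {
    match value {
        Value::Null => "null".to_string(),
        Value::Bool(b) => format!("bool: {b}"),
        Value::Number(n) => format!("number: {n}"),
        Value::Text(s) => format!("string: {s}"),
        Value::Sequence(items) => format!("sequence of {} items", items.len()),
    }
}

fn main() {
    let values = vec![
        Value::Null,
        Value::Bool(true),
        Value::Number(1.5),
        Value::Text("hello".to_string()),
        Value::Sequence(vec![Value::Null, Value::Bool(false)]),
    ];
    for value in &values {
        println!("{}", describe(value));
    }
}
```

Each variant carries only the data it needs, and there is no base class or unused field in sight.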
Before I wrap up, a small section on some terms from type theory.
Sum (and Product) Types
Rust enums are what type theory calls sum types. Why that name? Let's consider how many possible distinct values an Option<bool> can have. It is the number of distinct values for Option::None (1) plus the number of distinct values for Option::Some (2). Their sum is 3. Because we add up the possible values of each variant, enums are called sum types.
Now consider a user-defined type in C++, a Point struct:

    struct Point {
        int x;
        int y;
    };
How many distinct Points can be created? If an int is 32 bits wide, there can be 2^32 possible values for x and as many for y. So the total number of distinct Points is the number of distinct values for x multiplied by the number of distinct values for y, that is 2^32 × 2^32 = 2^64. That is why types like Point are called product types.
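The counting argument can be checked by brute force on types small enough to enumerate. This is only a sketch: the pair of bools below is a hypothetical miniature stand-in for Point, chosen because 2^64 points cannot be listed.

```rust
/// Every distinct value of Option<bool>: one for None plus two for Some.
fn option_bool_values() -> Vec<Option<bool>> {
    vec![None, Some(false), Some(true)]
}

/// Every distinct value of a tiny "point" whose two fields are bools.
fn small_point_values() -> Vec<(bool, bool)> {
    let mut points = Vec::new();
    for x in [false, true] {
        for y in [false, true] {
            points.push((x, y));
        }
    }
    points
}

fn main() {
    println!("{}", option_bool_values().len()); // sum type: 1 + 2 = 3
    println!("{}", small_point_values().len()); // product type: 2 * 2 = 4

    // For two 32-bit fields the same rule gives 2^32 * 2^32 = 2^64.
    let per_field: u128 = 1u128 << 32;
    println!("{}", per_field * per_field == 1u128 << 64); // prints true
}
```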
The last section above was there just to make you aware of the terms some people throw around when talking about types. In reality, the esoteric, mathy-sounding names are the least interesting aspect of enums in Rust. Enums are a tool that solves some design problems better than their OOP counterparts. And once you start using them, you will wish other languages had them too.