Memory safety in Rust - part 2

2020-08-03

Introduction

In part 1 of Memory safety in Rust, I discussed the concept of memory safety and the techniques different languages use to achieve it. Almost all languages fall on a spectrum with memory safety at one end and programmer control at the other. Rust is unique in that it doesn't make this trade-off: the programmer gets both memory safety and control.

Aliasing, mutation and safety

To safely free an object, there must be no references to it; otherwise you end up with a dangling pointer. Similarly, if a thread wants to send an object to another thread, no reference to it can remain on the sending thread. Two elements are in play here: aliasing and mutation. If the object were not being destroyed or sent to another thread, there would be nothing wrong with having references to it. It is only when the two are combined that you get into trouble.

In light of this observation, Rust's solution to memory safety is simply to never allow aliasing and mutation of the same object at the same time. Rust achieves this through ownership and borrowing.
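
As a preview of the thread case mentioned above, here is a minimal sketch. The helper name sum_on_other_thread is my own invention for illustration; thread::spawn and the move keyword are standard Rust. The point is only that once the vector is handed to the other thread, the sending side has nothing left that refers to it.

```rust
use std::thread;

// Hypothetical helper (not from part 1): sends `data` to a new thread
// and returns the sum computed there.
fn sum_on_other_thread(data: Vec<i32>) -> i32 {
    // `move` transfers ownership of `data` into the closure, so the
    // spawning thread keeps no alias to the vector.
    let handle = thread::spawn(move || data.iter().sum::<i32>());

    // `data` can no longer be used on this thread. Uncommenting the next
    // line is rejected by the compiler because the value has been moved.
    // println!("{}", data.len());

    handle.join().expect("worker thread panicked")
}
```

Calling sum_on_other_thread(vec![1, 2, 3]) hands the vector to the worker and returns its sum.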

Ownership

When you create a new object in Rust, the variable it is assigned to becomes the owner of the object. For example, in the following code the variable v owns the Vec instance:

let v: Vec<i32> = Vec::new();

and when v goes out of scope, the Vec is dropped. There can only be a single owner of an object at a time, which ensures that only the owner drops it. This avoids double-free bugs. If v is assigned to another variable, the ownership is transferred:

let v1 = v;//v1 is the new owner

Since v1 is now the owner, access is no longer allowed through v:

v.len();//error: borrow of moved value: `v`
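
To see the "only the owner drops it" rule in action, here is a small sketch that counts destructor calls. The Tracked type and the count_drops_after_move function are hypothetical names made up for this illustration; Drop and Cell come from the standard library. The sketch uses a shared reference to the counter, which is a borrow of the kind discussed in the next section.

```rust
use std::cell::Cell;

// Hypothetical type that records every time one of its values is dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Creates one value, moves it to a second variable, and reports how many
// times the destructor ran.
fn count_drops_after_move() -> u32 {
    let drops = Cell::new(0);
    {
        let a = Tracked { drops: &drops };
        let _b = a; // ownership moves from `a` to `_b`
    } // only `_b` is dropped here; `a` was moved, so it has no destructor call
    drops.get()
}
```

Even though two variables named the value over its life, the destructor runs once, which is what rules out a double free.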

The owner can of course mutate the object:

let mut v = Vec::new();//mut is needed to mutate the object
v.push(1);

Since nothing else aliases the object, this mutation is safe.

If all a programmer could do in Rust was own values and transfer them around, it would be quite a restrictive programming environment. Fortunately, Rust allows borrowing from the owner.

Borrowing

Borrowing introduces aliasing. A reference can be borrowed from the owner:

let v: Vec<i32> = Vec::new();
let v1 = &v;//v1 has borrowed from v
v.len();//fine
v1.len();//also fine

Unlike an owner, there can be multiple borrowed references at the same time:

let v: Vec<i32> = Vec::new();
let v1 = &v;//v1 has borrowed from v
let v2 = &v;//v2 has also borrowed from v
v.len();//allowed
v1.len();//also allowed
v2.len();//also allowed
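
In practice, shared borrows show up most often as function arguments. The sketch below assumes three hypothetical helpers (total, largest and describe) that only read the data, so they can borrow it while the owner keeps using it.

```rust
// Hypothetical read-only helpers: they borrow the data instead of owning it.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn largest(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}

fn describe() -> (i32, Option<i32>, usize) {
    let v = vec![3, 7, 2];
    let a = &v; // first shared borrow
    let b = &v; // second shared borrow, alive at the same time as `a`
    // The owner `v` is still readable while both borrows exist.
    (total(a), largest(b), v.len())
}
```

None of these functions can modify or free the vector, which is why any number of them may hold a reference at once.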

But a borrower cannot access the object after the owner has destroyed it, because that would be a use-after-free bug:

let v1: &Vec<i32>;
{
   let v = Vec::new();
   v1 = &v;
}//v is dropped here
v1.len();//error:borrowed value does not live long enough

So even though aliasing is possible, Rust ensures that no references outlive the object being referenced, again avoiding aliasing and mutation at the same time.
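
One way to make the rejected example compile, sketched here under the assumption that the outer scope really needs the data, is to move ownership out of the inner scope instead of borrowing. The function name len_after_inner_scope is hypothetical.

```rust
fn len_after_inner_scope() -> usize {
    let owner: Vec<i32>;
    {
        let v = vec![1, 2, 3];
        owner = v; // move ownership out instead of taking `&v`
    } // `v` was moved, so nothing is dropped at this brace
    owner.len() // the vector is still alive: `owner` owns it now
}
```

The vector now lives as long as owner does, so there is no reference left that could outlive it.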

So far, all the borrows have been immutable. It is possible to have mutable references, but as I will show next, Rust is smart enough to disallow aliasing when mutation is introduced.

Mutable borrowing

Although there can be multiple shared references, there can only be one mutable reference at one time:

let mut v:Vec<i32> = Vec::new();
let v1 = &mut v;//first mutable reference
let v2 = &mut v;//error:cannot borrow `v` as mutable more than once at a time
v1.push(1);//the first mutable reference is still in use here

As soon as mutation is allowed through a mutable reference, Rust takes away aliasing by disallowing other references (shared or mutable).
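
The restriction is on overlapping mutable references, not on how many you create in total. A minimal sketch, with a hypothetical function name push_twice, in which the first mutable borrow ends before the second begins:

```rust
fn push_twice() -> Vec<i32> {
    let mut v: Vec<i32> = Vec::new();
    {
        let v1 = &mut v; // first mutable reference
        v1.push(1);
    } // `v1` goes out of scope, so its borrow has ended
    let v2 = &mut v; // second mutable reference, no overlap with `v1`
    v2.push(2);
    v
}
```

The braces are there only to make the extent of the first borrow explicit.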

These borrowing rules prevent dangling pointers. If Rust allowed a mutable reference and an immutable reference at the same time, the memory could become invalid through the mutable reference while the immutable reference still pointed to it. For example, in the code below, v1 could access invalid memory if such code were allowed:

let mut v = vec![0, 1, 2, 3];
let v1 = &v[0];//an immutable reference to Vec's first element
v.push(4);//this can invalidate Vec's internal buffer
let v2 = *v1;//this could access invalid memory

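In comparison, similar C++ code (holding a reference to an element of a std::vector across a push_back) compiles, and it is undefined behavior if the push_back reallocates the buffer.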
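
A version of the rejected Rust example that the compiler accepts, sketched with a hypothetical function name first_then_push, copies the element out instead of holding a reference across the push. This works here because i32 is Copy; for a type that is not, you would have to clone it or read it again after the push.

```rust
fn first_then_push() -> (i32, Vec<i32>) {
    let mut v = vec![0, 1, 2, 3];
    let first = v[0]; // copies the i32; no reference into the buffer remains
    v.push(4); // may reallocate, but nothing points at the old buffer
    (first, v)
}
```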

Lifetimes

So far I have said that Rust disallows aliasing and mutation at the same time to prevent memory safety issues, but I hand-waved my way past how it enforces this at compile time. Rust does this by keeping track of lifetimes. Intuitively, the lifetime of a variable is tied to its scope:

let v1: &Vec<i32>;//-------------------------+
{//                                          |
   let v = Vec::new();//-----+               |v1's lifetime
   v1 = &v;//                | v's lifetime  |
}//<-------------------------+               |
v1.len();//<---------------------------------+

So the compiler compares the lifetimes of various variables to figure out if something fishy is going on. For example, in the code above, v1 outlives the owner v, which is not allowed. The lifetimes in the above example are called lexical lifetimes because they are inferred from the variable scopes. In reality, Rust has a more sophisticated implementation called non-lexical lifetimes, under which a borrow lasts, roughly, only until the last place the reference is used rather than until the end of its scope.
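
Here is a small sketch of what non-lexical lifetimes buy you. The function name nll_demo is hypothetical. Under purely lexical lifetimes the borrow held by r would last until the closing brace and the later uses of v would be rejected; with non-lexical lifetimes the borrow ends at the last use of r.

```rust
fn nll_demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3];
    let r = &mut v; // mutable borrow starts
    r.push(4); // last use of `r`, so the borrow ends here
    let len = v.len(); // fine: `r` is still in scope but never used again
    v.push(len as i32);
    v
}
```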

Lifetimes are a big topic, and I cannot cover everything in this post. You can learn more about lifetimes in the Rustonomicon.

Conclusion

In this post, I discussed the concepts of ownership and borrowing and how they help in achieving memory safety in Rust. Many memory safety issues boil down to the fact that languages like C++ allow both mutation and aliasing at the same time. Rust's ability to detect these memory safety problems at compile time makes it a strong contender as a systems programming language.