References in Rust

2020-12-03

Introduction

Rust references are very simple at runtime: they are plain memory addresses. At compile time, in contrast, references participate in more complex compiler analysis. For example, references help to prove the memory safety of a program. But in this post, I will not cover the safety aspects of references. Those are covered in the post Memory safety in Rust - part 2. In this post, you will learn the syntax and everyday usage of references. Let's start with the very basics.

Basics

Let's say you have a variable x, which owns a value:

let x: i32 = 42;

To create a reference to x, you'd use the & operator:

let r = &x;

And to get the value of the referent, you'd use the * operator:

let v: i32 = *r;

All the values and references created above were immutable, which is the default in Rust. If you want to change the value through a reference, create a mutable reference. The &mut operator creates a mutable reference:

let m = &mut x;

But this alone is not enough. You will realize this if you try to compile it:

error[E0596]: cannot borrow `x` as mutable, as it is not declared as mutable
  |
2 |     let x: i32 = 42;
  |         - help: consider changing this to be mutable: `mut x`
...
5 |     let m = &mut x;
  |             ^^^^^^ cannot borrow as mutable

The error makes sense because it avoids surprises for the programmer. If the compiler had allowed this, a mutable reference (m) could change an immutable value (x). Such changes in an immutable value would have been confusing. Let's try the fix suggested by the error message:

//make x mut
let mut x: i32 = 42;
let m = &mut x;

Now the code compiles fine. Finally, you can change the value through the reference:

*m = 100;
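
The snippets above can be combined into one small program. This is only a sketch: the helper name `set_through_reference` is made up for illustration and does not come from any library.

```rust
// Hypothetical helper: writes `new_value` into `x` through a mutable
// reference and returns the value that `x` holds afterwards.
fn set_through_reference(mut x: i32, new_value: i32) -> i32 {
    let m = &mut x; // mutable borrow of x
    *m = new_value; // write through the reference
    x // m is not used anymore, so x can be read again
}

fn main() {
    let x: i32 = 42;
    let r = &x; // shared (immutable) reference to x
    let v: i32 = *r; // read the referent through the reference
    println!("v is {}", v);
    println!("x is now {}", set_through_reference(x, 100));
}
```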

Shared and mutable references

Let's now try to create another mutable reference to x:

let mut x: i32 = 42;
let m: &mut i32 = &mut x;
let n: &mut i32 = &mut x;
println!("m is {}", m);
println!("n is {}", n);

But it doesn't work:

error[E0499]: cannot borrow `x` as mutable more than once at a time
  |
3 |     let m: &mut i32 = &mut x;
  |                       ------ first mutable borrow occurs here
4 |     let n: &mut i32 = &mut x;
  |                       ^^^^^^ second mutable borrow occurs here
5 |     println!("m is {}", m);
  |                         - first borrow later used here

Again, the error tells us exactly what went wrong: we cannot have two mutable references to a value at the same time. In fact, it goes even further. As long as a mutable reference is alive, it locks the original value. You can't read or change the value through the owner while the mutable reference is still in use:

let mut x: i32 = 42;
let m: &mut i32 = &mut x;
x = 100;
println!("m is {}", m);

This results in the following error:

error[E0506]: cannot assign to `x` because it is borrowed
  |
3 |     let m: &mut i32 = &mut x;
  |                       ------ borrow of `x` occurs here
4 |     x = 100;
  |     ^^^^^^^ assignment to borrowed `x` occurs here
5 |     println!("m is {}", m);
  |                         - borrow later used here

On the flip side, you can create many immutable references to a value at the same time:

let x: i32 = 42;
let m: &i32 = &x;
let n: &i32 = &x;
println!("m is {}", m);
println!("n is {}", n);

This is safe since there is no mutation. But what about both mutable and immutable references at the same time? Should the compiler allow that? The answer is no. Again, for safety reasons this is forbidden:

let mut x: i32 = 42;
let m: &i32 = &x;
let n: &mut i32 = &mut x;
println!("m is {}", m);
println!("n is {}", n);

You'll get the error below if you try:

error[E0502]: cannot borrow `x` as mutable because it is also borrowed as immutable
  |
3 |     let m: &i32 = &x;
  |                   -- immutable borrow occurs here
4 |     let n: &mut i32 = &mut x;
  |                       ^^^^^^ mutable borrow occurs here
5 |     println!("m is {}", m);
  |                         - immutable borrow later used here

To summarize, the usage rules for immutable and mutable references are as follows. For a value, there can be:

  1. Either many immutable references.
  2. Or one mutable reference.

But not both at the same time. For this reason, another term for immutable references is shared references. And a similar term for mutable references is exclusive references.
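
It may help to see what "at the same time" means in practice. With current compilers, a borrow ends at the last use of the reference, not at the end of the enclosing block. The sketch below relies on that; the function name `read_then_bump` is hypothetical.

```rust
// Hypothetical helper: reads `x` through two shared references, then
// mutates it through an exclusive one. Returns (sum of the reads, final x).
fn read_then_bump(mut x: i32) -> (i32, i32) {
    let m: &i32 = &x;
    let n: &i32 = &x;
    let sum = *m + *n; // last use of both shared references

    // The shared borrows are no longer in use, so an exclusive
    // borrow is accepted from here on.
    let w: &mut i32 = &mut x;
    *w += 1;

    (sum, x)
}

fn main() {
    let (sum, x) = read_then_bump(42);
    println!("sum is {}, x is {}", sum, x);
}
```

Using m or n after the line that creates w would bring back error E0502 shown earlier.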

So why do references have these rules? You might have guessed by now: to enforce memory safety. (It is in fact for safety reasons that Rustaceans call creating a reference borrowing.) To understand how these rules achieve memory safety, read Memory safety in Rust - part 2.

Even though that post covers memory safety, it still leaves one question unanswered. Why does Rust ban more than one mutable reference on the same thread? That shouldn't cause any harm, right? Turns out that it does. Read Manish's excellent post The Problem With Single-threaded Shared Mutability to learn how.

Do these rules remind you of something? To me they look exactly like a reader-writer lock. But that's where the similarities end. Unlike a reader-writer lock, the compiler enforces the reference rules at compile time. At runtime, a reference behaves exactly like a raw pointer: it is just a memory address. If you do want a construct with these same rules but at runtime, take a look at RefCell. If you are looking for an actual reader-writer lock, RwLock is your guy.
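
To make the runtime variant concrete, here is a minimal sketch using RefCell from the standard library. The two helper functions are hypothetical names chosen for this example. They use try_borrow and try_borrow_mut, which report a conflict as an Err instead of panicking.

```rust
use std::cell::RefCell;

// Hypothetical helper: returns true if a second mutable borrow is
// rejected while the first one is still alive.
fn second_mut_borrow_fails(cell: &RefCell<i32>) -> bool {
    let _first = cell.borrow_mut(); // exclusive borrow, checked at runtime
    let rejected = cell.try_borrow_mut().is_err();
    rejected
}

// Hypothetical helper: returns true if two shared borrows can coexist.
fn shared_borrows_coexist(cell: &RefCell<i32>) -> bool {
    let _a = cell.borrow(); // shared borrow
    let accepted = cell.try_borrow().is_ok();
    accepted
}

fn main() {
    let cell = RefCell::new(42);
    println!("second &mut rejected: {}", second_mut_borrow_fails(&cell));
    println!("shared borrows coexist: {}", shared_borrows_coexist(&cell));
}
```

The rules are the same as for plain references, but a violation shows up when the program runs rather than when it compiles.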

Implicit dereferencing

Let's say you have a struct instance s and a reference r to that instance:

struct Int {
    i: i32,
}

let s = Int { i: 10 };
let r = &s;

To get s's member i you'd write:

let j = s.i;

Similarly, if you were accessing the member through the reference, you'd write:

let j = r.i;

But wait a minute, shouldn't you dereference r before accessing the members? Like this:

let j = (*r).i;

You could, but it is not necessary because Rust has spared us needless verbosity. The explicit dereference is not needed because the . operator automatically dereferences the operand on its left. Not only that, the . operator also dereferences its left operand as many times as needed to reach the member:

let r = &s;
let rr = &&s;
let rrr = &&&s;

let j = r.i;
let j = rr.i;
let j = rrr.i;

Note

The . operator also follows any Deref trait implementations on the left operand:

use std::ops::Deref;

struct Inner {
    i: i32,
}

struct Outer {
    o: Inner
}

impl Deref for Outer {
    type Target = Inner;

    fn deref(&self) -> &Self::Target {
        &self.o
    }
}

let s = Outer { o: Inner { i: 10 }};
//s.i is transformed into s.deref().i by the compiler
let k = s.i;

Here the s struct doesn't have a member i. But this still compiles because the compiler transforms the expression s.i into s.deref().i.

One last trick up the . operator's sleeve is that it automatically takes its left operand's reference if needed:

struct Int {
    i: i32,
}

impl Int {
    fn print(&self) {
        println!("i is {}", self.i);
    }
}

let s = Int { i: 10 };
(&s).print();//calls the print method
s.print();//so does this

You might think that you should call the print method by creating a reference like this: (&s).print(). This sounds sensible because the print method takes a reference to Int (via the &self argument). But the shorter version s.print() works just as well because the compiler implicitly creates a reference for you. Rust goes to great lengths to avoid even such small paper cuts. This shows the attention to detail that goes into making Rust more ergonomic.
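
The same convenience applies to methods that take &mut self, and it combines with the automatic dereferencing described earlier. The sketch below assumes a variant of the Int struct with two extra methods, get and increment, which are made up for this example.

```rust
struct Int {
    i: i32,
}

impl Int {
    // Hypothetical method taking a shared reference.
    fn get(&self) -> i32 {
        self.i
    }

    // Hypothetical method taking a mutable reference.
    fn increment(&mut self) {
        self.i += 1;
    }
}

// Hypothetical helper: increments twice, then reads through a &&Int.
fn increment_twice(start: i32) -> i32 {
    let mut s = Int { i: start };
    s.increment(); // the compiler borrows s mutably for the call
    (&mut s).increment(); // the explicit form of the same call

    let rr = &&s;
    let value = rr.get(); // dereferenced as needed to reach get(&self)
    value
}

fn main() {
    println!("i is {}", increment_twice(10));
}
```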

Rust references vs C++ references

In some ways Rust's references resemble pointers in C++. For example, the & and * operators are very similar in the two languages. But these syntactical similarities are superficial. C++ pointers (even smart pointers) can't match Rust references' safety.

References in C++, in contrast, are quite dissimilar to Rust references, even syntactically. Creating and dereferencing a reference is implicit in C++:

int i = 42;
int &r = i;//no & operator before i
int j = r;//no * operator before r

While in Rust these are explicit:

let i = 42;
let r = &i;//note the & operator before i
let j = *r;//note the * operator before r

Another difference is that you can reassign a reference in Rust to another object:

let i = 42;
let mut r = &i;//r is a reference to i
let j = 84;
r = &j;//r is now a reference to j

But you can't reseat a C++ reference:

int i = 42;
int &r = i;
int j = 84;
//might look like r is reseated but
//r still refers to i whose value is
//updated to become 84
r = j;

Apart from those differences, the two languages are similar in one aspect: references can't be null in either language.
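
Since a Rust reference can't be null, a reference that may be absent is usually written as Option<&T>. The sketch below is one way to illustrate that; the function `find_first_even` is a hypothetical example, not something from the post's earlier code.

```rust
use std::mem::size_of;

// Hypothetical helper: returns a reference to the first even element,
// or None when there is no such element.
fn find_first_even(values: &[i32]) -> Option<&i32> {
    values.iter().find(|v| **v % 2 == 0)
}

fn main() {
    let values = [1, 4, 6];
    match find_first_even(&values) {
        Some(r) => println!("first even value is {}", r),
        None => println!("no even value"),
    }

    // Because a reference is never null, Option<&i32> can use the null
    // address to represent None and takes no extra space.
    println!(
        "sizes: {} and {}",
        size_of::<&i32>(),
        size_of::<Option<&i32>>()
    );
}
```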

And that concludes a whirlwind tour of references in Rust.

Conclusion

In this post, I covered the everyday usage of references in Rust. I showed how you can create shared and mutable references, and what rules govern their use. I also covered implicit dereferencing, which makes it easy to write clearer code. Finally, I compared Rust's references with pointers and references in C++: the syntax resembles C++ pointers only superficially, C++ references differ even syntactically, and Rust's references are unmatched in safety, even by smart pointers in C++.