Lifetimes in Rust

2020-08-19

Introduction

Lifetimes are a hard concept to grasp for many beginner Rustaceans. I too struggled with them for some time before I started to appreciate how vital they are for the Rust compiler to carry out its duties. Lifetimes are not inherently hard; they are simply a novel construct that most programmers have never seen in another language. What makes things worse is the overloaded use of the word lifetime to talk about several closely related ideas. In this article I will separate those ideas from each other, and in doing so give you tools to think clearly about lifetimes.

Purpose of lifetimes

Before discussing specifics, let's first understand why lifetimes exist. What purpose do they serve? Lifetimes help the compiler enforce one simple rule: no reference should outlive its referent. In other words, lifetimes help the compiler rule out dangling references. As you will see in the examples below, the compiler achieves this by analyzing the lifetimes of the variables involved. If the lifetime of a reference is enclosed within the lifetime of its referent, the code compiles; otherwise it doesn't.

Meaning of the word lifetime

Part of the reason lifetimes are so confusing is that in much of Rust writing the word lifetime is loosely used to refer to three different things: the actual lifetimes of variables, lifetime constraints, and lifetime annotations. Let's talk about them one by one.

Lifetimes of variables

This is straightforward. The lifetime of a variable is the span of time for which it is alive. This meaning is closest to the dictionary sense of the duration of a thing's existence or usefulness. For example, in the code below, x's lifetime extends until the end of the outer block, while y's lifetime ends at the end of the inner block.

{
    let x: Vec<i32> = Vec::new();//---------------------+
    {//                                                 |
        let y = String::from("Why");//---+              | x's lifetime
        //                               | y's lifetime |
    }// <--------------------------------+              |
}// <---------------------------------------------------+
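
One way to observe these lifetimes at run time is to record when each value is dropped. The following is only a sketch: the Noisy type, its log field, and the drop_order function are hypothetical names introduced for illustration and are not part of the example above.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its own destruction in a shared log.
struct Noisy {
    name: String,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        // Runs at the point where the variable's lifetime ends.
        self.log.borrow_mut().push(format!("{} dropped", self.name));
    }
}

// Hypothetical helper: returns the events in the order they happened.
fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _x = Noisy { name: "x".to_string(), log: Rc::clone(&log) };
        {
            let _y = Noisy { name: "y".to_string(), log: Rc::clone(&log) };
        } // _y's lifetime ends here
        log.borrow_mut().push("inner block ended".to_string());
    } // _x's lifetime ends here
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```

The log shows y being dropped before the inner block's closing brace is passed, and x only at the end of the outer block, which matches the diagram.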

Lifetime constraints

The way variables interact in the code puts some constraints on their lifetimes. For example, in the following code, the line x = &y; adds a constraint that x's lifetime should be enclosed within y's lifetime:

//error:`y` does not live long enough
{
    let x: &Vec<i32>;
    {
        let y = Vec::new();//----+
//                               | y's lifetime
//                               |
        x = &y;//----------------|--------------+
//                               |              |
    }// <------------------------+              | x's lifetime
    println!("x's length is {}", x.len());//    |
}// <-------------------------------------------+

If this constraint were not enforced, x could access invalid memory in the println! line, because x is a reference to y, which has already been destroyed at the end of the inner block on the previous line.

Note that a constraint does not change the actual lifetimes (x's lifetime, for example, still extends until the end of the outer block). Constraints are just a tool the compiler uses to disallow dangling references. In the above example, the actual lifetimes do not meet the constraint: x's lifetime has strayed beyond y's lifetime. Hence, this code fails to compile.
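
For comparison, here is a minimal sketch of a rearrangement in which the actual lifetimes do meet the constraint. It assumes y can simply be declared in the same scope as x; the helper name borrowed_len is made up for this illustration.

```rust
// Hypothetical helper, not part of the original example.
fn borrowed_len() -> usize {
    let y: Vec<i32> = Vec::new(); // y now lives until the end of the function
    let x: &Vec<i32> = &y;        // x's lifetime is enclosed within y's lifetime
    x.len()                       // last use of x, while y is still alive
}

fn main() {
    println!("x's length is {}", borrowed_len());
}
```

Because x is declared after y and never used once y is gone, the reference cannot dangle and the compiler accepts the code.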

Lifetime annotations

As seen in the last section, the compiler can often generate all the lifetime constraints by itself. But as code gets more complex, the compiler asks the programmer to add constraints manually. The programmer does this through lifetime annotations. For example, in the code snippet below, the compiler needs to know whether the reference returned from the print_ret function borrows from s1 or s2, so it asks the programmer to state this constraint explicitly:

//error:missing lifetime specifier
//this function's return type contains a borrowed value,
//but the signature does not say whether it is borrowed from `s1` or `s2`
fn print_ret(s1: &str, s2: &str) -> &str {
    println!("s1 is {}", s1);
    s2
}
fn main() {
    let some_str: String = "Some string".to_string();
    let other_str: String = "Other string".to_string();
    let s1 = print_ret(&some_str, &other_str);
}

The programmer then annotates both s2 and the returned reference with 'a, thus telling the compiler that the return value is borrowed from s2:

fn print_ret<'a>(s1: &str, s2: &'a str) -> &'a str {
    println!("s1 is {}", s1);
    s2
}
fn main() {
    let some_str: String = "Some string".to_string();
    let other_str: String = "Other string".to_string();
    let s1 = print_ret(&some_str, &other_str);
}

I want to emphasize one point: the fact that the annotation 'a appears on both the argument s2 and the returned reference does not mean that s2 and the returned reference have exactly the same lifetime. Instead, it should be read as: the returned reference with annotation 'a is borrowed from the argument with the same annotation.

And since s2 is further borrowed from other_str, the lifetime constraint is that the returned reference must not outlive other_str. The code compiles because the lifetime constraint is indeed met:

fn print_ret<'a>(s1: &str, s2: &'a str) -> &'a str {
    println!("s1 is {}", s1);
    s2
}
fn main() {
    let some_str: String = "Some string".to_string();
    let other_str: String = "Other string".to_string();//-------------+
    let ret = print_ret(&some_str, &other_str);//---+                 | other_str's lifetime
    //                                              | ret's lifetime  |
}// <-----------------------------------------------+-----------------+
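
The same annotation also tells the compiler what the returned reference does not borrow from. The sketch below reuses print_ret unchanged; only the arrangement of main is new and is meant as an illustration. Since s1 carries no 'a, the return value is allowed to outlive some_str.

```rust
fn print_ret<'a>(s1: &str, s2: &'a str) -> &'a str {
    println!("s1 is {}", s1);
    s2
}

fn main() {
    let other_str: String = "Other string".to_string();
    let ret: &str;
    {
        let some_str: String = "Some string".to_string();
        // ret is tied to other_str only, because only s2 is annotated with 'a.
        ret = print_ret(&some_str, &other_str);
    } // some_str is dropped here, but ret never borrowed from it
    println!("ret is {}", ret);
}
```

Had the signature used 'a on s1 as well, the compiler would have to assume that ret might borrow from some_str, and the final println! would be rejected.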

Before showing you more examples, let me briefly cover lifetime annotation syntax. To create a lifetime annotation, a lifetime parameter must first be declared. For example, <'a> is a lifetime declaration. Lifetime parameters are a kind of generic parameter and you can read <'a> as "for some lifetime 'a...". Once a lifetime parameter is declared, it can be used in references to create a lifetime constraint.

Remember that by annotating references with 'a, the programmer is just formulating some constraints; it is then the compiler's job to find a concrete lifetime for 'a that satisfies the imposed constraints.

More examples

Next, consider a function min which finds the minimum of two values:

fn min<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y {
        x
    } else {
        y
    }
}
fn main() {
    let p = 42;
    {
        let q = 10;
        let r = min(&p, &q);
        println!("Min is {}", r);
    }
}

Here, the 'a lifetime parameter annotates the arguments x and y as well as the return value. It means that the return value could borrow from either x or y. Since x and y in turn borrow from p and q respectively, the returned reference's lifetime must be enclosed within both p's and q's lifetimes. This code also compiles because the constraint is met:

fn min<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y {
        x
    } else {
        y
    }
}
fn main() {
    let p = 42;//-------------------------------------------------+
    {//                                                           |
        let q = 10;//------------------------------+              | p's lifetime
        let r = min(&p, &q);//------+              | q's lifetime |
        println!("Min is {}", r);// | r's lifetime |              |
    }// <---------------------------+--------------+              |
}// <-------------------------------------------------------------+

In general, when the same lifetime parameter annotates two or more arguments of a function, the returned reference must not outlive the shortest of the arguments' lifetimes.
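
To make the "shortest lifetime" rule concrete, here is a small sketch that reuses min as defined above. The variable names r and inner and the layout of main are chosen for illustration only.

```rust
fn min<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y {
        x
    } else {
        y
    }
}

fn main() {
    let p = 42;
    let r;
    {
        let q = 10;
        // Both arguments borrow from p, so r may be used for as long as p lives.
        r = min(&p, &p);
        // This result may borrow from q, so it has to stay inside this block.
        let inner = min(&p, &q);
        println!("Inner min is {}", inner);
        // Writing `r = min(&p, &q);` above instead would be rejected,
        // because q does not live long enough for the later use of r.
    }
    println!("Outer min is {}", r);
}
```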

One last example. Many new C++ programmers make the mistake of returning a pointer to a local variable. A similar attempt in Rust is not allowed:

//Error:cannot return reference to local variable `i`
fn get_int_ref<'a>() -> &'a i32 {
    let i: i32 = 42;
    &i
}
fn main() {
    let j = get_int_ref();
}

The get_int_ref function takes no reference arguments, so the returned reference has nothing in the caller to borrow from. Here it borrows from the local variable i, which is not allowed. The compiler rightly averts disaster, because the local variable will already have been cleaned up by the time the caller tries to use the returned reference:

fn get_int_ref<'a>() -> &'a i32 {
    let i: i32 = 42;//-------+
    &i//                     | i's lifetime
}// <------------------------+
fn main() {
    let j = get_int_ref();//-----+
//                               | j's lifetime
}// <----------------------------+
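
Two common ways out are sketched below. Both are illustrations under assumptions rather than the only fixes: the names get_int, ANSWER, and get_static_int_ref are hypothetical, and the second variant assumes the value can live in a static item, which exists for the entire run of the program.

```rust
// Option 1: return the value itself, so nothing is borrowed.
fn get_int() -> i32 {
    let i: i32 = 42;
    i // the i32 is copied out to the caller
}

// Option 2 (assumption: the value can be a static item).
// A static is never dropped, so a reference to it cannot dangle.
static ANSWER: i32 = 42;

fn get_static_int_ref() -> &'static i32 {
    &ANSWER
}

fn main() {
    let j = get_int();
    let k = get_static_int_ref();
    println!("j = {}, k = {}", j, k);
}
```

In both variants the caller never holds a reference to a variable that has already gone out of scope.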

Lifetime elision

When the compiler lets the programmer omit lifetime annotations, it is called lifetime elision. Again, the term lifetime elision is misleading: how could lifetimes be elided when they are inextricably linked with how variables come into and go out of existence? It is not the lifetimes that are being elided, but rather the lifetime annotations, and by extension the lifetime constraints. In early versions of the Rust compiler, no elision was allowed and every lifetime annotation was required. Over time the compiler team observed that the same patterns of lifetime annotations were being repeated, so the compiler was modified to infer them.

The programmer can omit the annotations in the following cases:

  1. When there is exactly one input reference. In this case the input's lifetime annotation is assigned to all the output references. For example: fn some_func(s: &str) -> &str is inferred as fn some_func<'a>(s: &'a str) -> &'a str.
  2. When there are multiple input references, but the first argument is &self or &mut self. In this case the lifetime annotation of self is assigned to all the output references. For example: fn some_method(&self) -> &str is equivalent to fn some_method<'a>(&'a self) -> &'a str.
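
The two cases can be sketched side by side as follows. The function first_word, the Person type, and its greet method are hypothetical names made up for this illustration; each elided signature is paired with the explicit form it stands for.

```rust
// Case 1: exactly one input reference.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same function with the annotations written out.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

// Hypothetical type used to illustrate case 2.
struct Person {
    name: String,
}

impl Person {
    // Case 2: several input references, one of which is &self.
    // The returned reference is assumed to borrow from self.
    fn greet(&self, greeting: &str) -> &str {
        println!("{}, {}!", greeting, self.name);
        &self.name
    }

    // The same method with the annotations written out.
    fn greet_explicit<'a, 'b>(&'a self, greeting: &'b str) -> &'a str {
        println!("{}, {}!", greeting, self.name);
        &self.name
    }
}

fn main() {
    println!("{}", first_word("hello world"));
    println!("{}", first_word_explicit("hello world"));
    let person = Person { name: "Ferris".to_string() };
    println!("{}", person.greet("Hello"));
    println!("{}", person.greet_explicit("Hi"));
}
```

Note that in case 2 the return value is tied to self, so a method that wants to return greeting instead would need explicit annotations.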

Lifetime elision reduces clutter in the code, and it is possible that in the future the compiler will infer lifetime constraints for even more patterns.

Conclusion

Many Rust newcomers find the subject of lifetimes hard to understand. But lifetimes, per se, are not to blame; rather, the problem is how the concept is presented in a lot of Rust writing. In this article I have tried to tease apart the shades of meaning hidden in the overloaded use of the word lifetime.

The lifetimes of variables have to meet certain constraints put on them by the compiler and the programmer before the compiler can ensure that the code is sound. Without the lifetime machinery, the compiler would not be able to guarantee the safety of most Rust programs.