Rust From the Ground Up - Part 6

Introduction

Hello and welcome back.

We will be talking a little more about traits. First we will see the Deref trait, one of the special traits in the language that allows us to do some magic. After that we will talk about trait objects, which is how we handle dynamic dispatch and polymorphism in the language.

Deref

Deref is one of the few “magical” things in Rust. It has special rules that apply only to this trait, but before getting into that, let’s see the basic usage.

Deref stands for something like “dereference”. Its signature is

pub trait Deref {
    type Target: ?Sized;
    fn deref(&self) -> &Self::Target;
}

Let’s go piece by piece

type Target: ?Sized;

Ignore the ?Sized detail for now. Just as a trait has methods to be defined when implementing it for some type, it might also have types associated with it that are defined in its implementation. This is called an associated type. We will talk more about this in our next post; for now you can think of it as a generic type of the trait.

fn deref(&self) -> &Self::Target;

The method we need to implement for our trait takes our type by immutable reference and returns a reference to the Target type defined above. A simple implementation of our trait can be

use std::ops::Deref; // import trait into scope

struct MyType(String);

impl Deref for MyType {
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

//usage
let x = MyType(String::from("Hello"));
println!("{}", x.deref()); //prints Hello

Cool. But for now that just looks like a normal, somewhat useless trait, right? The trick is that Deref has special rules: it participates in what is called Deref coercion. Details about it can be found in the documentation (doc.rust-lang.org/std/ops/trait.Deref.html), but it boils down to the following rules:

If T implements Deref<Target = U>, and x is a value of type T, then:

    * In immutable contexts, *x (where T is neither a reference nor a raw
    pointer) is equivalent to *Deref::deref(&x).
    * Values of type &T are coerced to values of type &U.
    * T implicitly implements all the (immutable) methods of the type U.

This means that on our prior code all of the following would work:

fn accepts_string_by_ref(x: &String) {…}

//usage
let x = MyType(String::from("Hello"));

accepts_string_by_ref(&x); // coerce &MyType into &String

println!("{}", x.len()); // len is a String method that works on immutable references
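
The three rules can be exercised together in one small program. This is only a sketch: MyType is the wrapper defined earlier, while the body of accepts_string_by_ref (it simply returns the length) is an assumption made up for illustration, since the post leaves that body elided.

```rust
use std::ops::Deref;

struct MyType(String);

impl Deref for MyType {
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// Hypothetical body: the post elides it, so here it just returns the length.
fn accepts_string_by_ref(x: &String) -> usize {
    x.len()
}

fn main() {
    let x = MyType(String::from("Hello"));

    // Rule 1: `*x` is sugar for `*Deref::deref(&x)`.
    let inner: &String = &*x;
    assert_eq!(inner.as_str(), "Hello");

    // Rule 2: `&MyType` is coerced to `&String` at the call site.
    assert_eq!(accepts_string_by_ref(&x), 5);

    // Rule 3: `String` methods that take `&self` can be called on `x` directly.
    assert_eq!(x.len(), 5);
}
```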

Through the Deref trait we can call methods of some other value seamlessly and interpret our structure as another type. This looks a lot like some of the characteristics of inheritance that we talked about in our last post! Indeed, to simulate some simple cases of inheritance we can use this trait as demonstrated. We are essentially using Deref to bridge the distinction between inheritance and composition. Instead of having a “parent” class we can have a field of the type we want to “inherit” from and implement Deref with that field as the target.

Notice that we can only implement Deref once per type, which is equivalent to single inheritance. Also, Deref coercion is not limited to one step: if our Target type itself implements Deref for a third type W, the compiler keeps dereferencing as many times as needed, so a &MyType can also be used where a &W is expected (for example &MyType to &String to &str).
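
It is worth checking how far coercion goes when the Target type has a Deref implementation of its own. The standard library’s String implements Deref with Target = str, so a short sketch can test it. The shout helper is a made-up name for illustration; the point is only that it accepts a &str, not a &String or a &MyType.

```rust
use std::ops::Deref;

struct MyType(String);

impl Deref for MyType {
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// Hypothetical helper that only accepts a string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let x = MyType(String::from("Hello"));

    // &MyType -> &String (our impl) -> &str (String's own Deref impl).
    assert_eq!(shout(&x), "HELLO");

    // Method lookup walks the same chain: `trim` is defined on `str`.
    assert_eq!(x.trim(), "Hello");
}
```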

The language also has another trait called DerefMut, which works exactly the same, but for mutable methods and references.
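
As a rough sketch of what that looks like for the same wrapper: DerefMut does not declare a Target of its own, it reuses the one from Deref, so both traits are implemented. The add_exclamation function is a hypothetical helper, used only to show the coercion of mutable references.

```rust
use std::ops::{Deref, DerefMut};

struct MyType(String);

impl Deref for MyType {
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// DerefMut reuses Deref::Target, so Deref has to be implemented as well.
impl DerefMut for MyType {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

// Hypothetical helper taking a mutable reference to the inner type.
fn add_exclamation(s: &mut String) {
    s.push('!');
}

fn main() {
    let mut x = MyType(String::from("Hello"));

    x.push_str(", world"); // a `&mut self` method of String
    add_exclamation(&mut x); // &mut MyType is coerced to &mut String

    assert_eq!(x.as_str(), "Hello, world!");
}
```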

I do not recommend this “hack” to be used frequently though. Don’t try to fit something the language is not meant for just because you are familiar with OOP. Take a second and think if it is actually necessary. The point of this trait is to work for smart pointer types, something that wraps another type. Common examples of the correct usage are the Box, Arc and Rc types which we will talk about in the future.

Another cool usage is when working with Domain-Driven Design patterns. For instance, let’s say you have a Name type for your user, which is defined as a sequence of letters and spaces, with no numbers or special symbols. We want Name to work essentially like a String, but with restricted creation. You can work with something like

pub struct Name(String);

fn valid_name(name: &str) -> bool {…}

impl Name {
    // The only way to build a Name, so every Name has passed validation.
    pub fn new(source: &str) -> Result<Name, String> {
        if valid_name(source) {
            Ok(Name(String::from(source)))
        } else {
            Err(String::from("Invalid name"))
        }
    }
}

impl Deref for Name {
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
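
Putting the pieces together, here is a runnable sketch of the pattern. The body of valid_name is an assumption for illustration only (the post leaves it elided): it accepts non-empty strings made of letters and spaces.

```rust
use std::ops::Deref;

pub struct Name(String);

// Assumed rule, for illustration only: non-empty, letters and spaces.
fn valid_name(name: &str) -> bool {
    !name.is_empty() && name.chars().all(|c| c.is_alphabetic() || c == ' ')
}

impl Name {
    pub fn new(source: &str) -> Result<Name, String> {
        if valid_name(source) {
            Ok(Name(String::from(source)))
        } else {
            Err(String::from("Invalid name"))
        }
    }
}

impl Deref for Name {
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main() {
    let name = Name::new("Ada Lovelace").expect("letters and spaces only");

    // Read-only String (and str) methods are available through Deref.
    assert_eq!(name.len(), 12);
    assert!(name.starts_with("Ada"));

    // Invalid input never becomes a Name.
    assert!(Name::new("R2-D2").is_err());
}
```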

Our type can now be handled exactly like an immutable String, but a value of type Name can only be created through the constructor provided in its implementation.

Trait Objects

Let’s begin with a classical example. We want to define some behavior to be implemented on a case-by-case basis

trait NoiseMaker {
    fn noise(&self) -> String;
}

And we want client code (possibly being our own code) to implement this trait

struct Dog;

impl NoiseMaker for Dog {
    fn noise(&self) -> String {
        String::from("Bark")
    }
}

struct Cat;

impl NoiseMaker for Cat {
    fn noise(&self) -> String {
        String::from("Meow")
    }
}

We want to print the noise each of these makes

fn print_noise(dog: &Dog) {
    println!("{}", dog.noise())
}

fn print_noise(cat: &Cat) {
    println!("{}", cat.noise())
}

Well that looks fine, let’s compile and…

error[E0428]: the name `print_noise` is defined multiple times
  --> src/lib.rs:27:1
   |
22 | fn print_noise(dog: &Dog) {
   | ------------------------- previous definition of the value `print_noise` here
...
27 | fn print_noise(cat: &Cat) {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^ `print_noise` redefined here
   |
   = note: `print_noise` must be defined only once in the value namespace of this module

Well, yeah, there is no function overloading in Rust. We can just rename the functions then

fn print_noise_for_dog(dog: &Dog) {
    println!("{}", dog.noise())
}
fn print_noise_for_cat(cat: &Cat) {
    println!("{}", cat.noise())
}

And it compiles!

But that doesn’t seem manageable. If nothing else, NoiseMaker might be implemented outside of our control by client code! We can’t implement a function for types we don’t yet know about. Or can we? If you are used to programming through interfaces you might do something like this:

fn print_noise(noise_maker: &NoiseMaker) {
    println!("{}", noise_maker.noise())
}

That’s a good intuition, but let’s see what the compiler tells us

warning: trait objects without an explicit `dyn` are deprecated
  --> src/lib.rs:21:30
   |
21 | fn print_noise(noise_maker: &NoiseMaker) {
   |                              ^^^^^^^^^^ help: use `dyn`: `dyn NoiseMaker`
   |
   = note: `#[warn(bare_trait_objects)]` on by default

Good, just a warning about a missing keyword! We are indeed on the right track. The dyn keyword is relatively new and marks the places where we are passing a trait object around instead of a concrete type. (Since the 2021 edition, leaving it out is an error rather than a warning.) The fix is simple.

fn print_noise(noise_maker: &dyn NoiseMaker) {
    println!("{}", noise_maker.noise())
}

let dog = Dog;
let cat = Cat;

print_noise(&dog);
print_noise(&cat);

This will work exactly as you expect. Bark will be printed and then Meow. But now we want to group together a bunch of NoiseMakers and call print_noise on each one. Again, if you are used to programming through interfaces you might do something like this:

let noise_makers: Vec<NoiseMaker> = vec![Dog, Dog, Cat];

You can ignore the syntax vec! for now, just think we are creating a vector from the contents between []

Let’s see what the compiler thinks of that

warning: trait objects without an explicit `dyn` are deprecated
  --> src/lib.rs:22:16
   |
22 |     let x: Vec<NoiseMaker> = vec!(Dog, Dog, Cat);
   |                ^^^^^^^^^^ help: use `dyn`: `dyn NoiseMaker`
   |
   = note: `#[warn(bare_trait_objects)]` on by default

error[E0277]: the size for values of type `dyn NoiseMaker` cannot be known at compilation time
  --> src/lib.rs:22:12
   |
22 |     let x: Vec<NoiseMaker> = vec!(Dog, Dog, Cat);
   |            ^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
   |
   = help: the trait `std::marker::Sized` is not implemented for `dyn NoiseMaker`
   = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
   = note: required by `std::vec::Vec`
--snip--

You will see the last error repeated a few times saying exactly the same thing, but let’s go one step at a time. The first message makes sense, it’s the same thing we’ve seen before. Even when the trait is used as the type argument of a generic type we need to put in the dyn keyword.

let noise_makers: Vec<dyn NoiseMaker> = <something>;

The remaining errors are a bit more complex. In essence what the error is saying is that at compile time we don’t know the size of the contents of our vector. Think about it, we want to put things in our vector, but what the thing is is only defined later, and each thing might be of a different size. More importantly, it might contain types of different sizes that are not even known yet!

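So how can we fix this? There are a couple of ways. First we could work with references to NoiseMakers. References are essentially pointers, and pointers have a known size. A reference to a trait object is a “fat” pointer made of two machine words, one pointing to the data and one pointing to the table of methods for the concrete type, so its size depends only on your architecture (two 64-bit or two 32-bit words) and not on the type behind it. So we could do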

let dog = Dog;
let cat = Cat;

let noise_makers: Vec<&dyn NoiseMaker> = vec![&dog, &cat];

for noise_maker in noise_makers {
    print_noise(noise_maker)
}
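
To make the size argument concrete, here is a small sketch using std::mem::size_of. Robot is a hypothetical implementor invented for this example, added only so that there are two concrete types with different sizes.

```rust
use std::mem::size_of;

trait NoiseMaker {
    fn noise(&self) -> String;
}

struct Dog; // no fields, so it occupies zero bytes

// Hypothetical implementor carrying some data, to get a different size.
struct Robot {
    serial: [u8; 16],
}

impl NoiseMaker for Dog {
    fn noise(&self) -> String {
        String::from("Bark")
    }
}

impl NoiseMaker for Robot {
    fn noise(&self) -> String {
        format!("Beep {}", self.serial[0])
    }
}

fn main() {
    // Concrete implementors have different sizes...
    assert_eq!(size_of::<Dog>(), 0);
    assert_eq!(size_of::<Robot>(), 16);

    // ...but a reference to a trait object always has the same size:
    // one pointer to the data and one to the table of methods.
    assert_eq!(size_of::<&dyn NoiseMaker>(), 2 * size_of::<usize>());

    let dog = Dog;
    let robot = Robot { serial: [7; 16] };
    let noise_makers: Vec<&dyn NoiseMaker> = vec![&dog, &robot];

    for noise_maker in noise_makers {
        println!("{}", noise_maker.noise());
    }
}
```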

It works now, zero issues. But sometimes this will not quite work for us. If we changed the code above just a little

fn get_noise_makers() -> Vec<&dyn NoiseMaker> {
    let dog = Dog;
    let cat = Cat;
    vec![&dog, &cat]
}

The compiler will be unhappy again

error[E0106]: missing lifetime specifier
  --> src/main.rs:25:30
   |
25 | fn get_noise_makers() -> Vec<&dyn NoiseMaker> {
   |                              ^ help: consider giving it a 'static lifetime: `&'static`
   |
   = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from

error[E0228]: the lifetime bound for this object type cannot be deduced from context; please supply an explicit bound
  --> src/main.rs:25:31
   |
25 | fn get_noise_makers() -> Vec<&dyn NoiseMaker> {
   |                               ^^^^^^^^^^^^^^

I won’t go into too much detail about what the compiler is saying here; this will be discussed in the future once we go more in depth about references in Rust. A simple justification for the error is that the vector contains references (i.e. pointers) to the local variables dog and cat. Once the function finishes, these variables are discarded, so the pointers are invalidated. Rust is able to detect this and stop you at compile time.

So how can we fix this? The issue currently is that our references point to something on the stack, which gets discarded when the function returns. We can solve this by moving these values to the heap. This is done using the Box type.

use std::boxed::Box;
fn get_noise_makers() -> Vec<Box<dyn NoiseMaker>> {
    let dog = Box::new(Dog);
    let cat = Box::new(Cat);

    vec![dog, cat]
}
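
A sketch of how the boxed version might be consumed on the caller’s side, reusing the Dog, Cat and print_noise definitions from earlier. The explicit type annotations inside get_noise_makers are an addition here, made only to show where each Box<Dog> or Box<Cat> becomes a Box<dyn NoiseMaker>.

```rust
trait NoiseMaker {
    fn noise(&self) -> String;
}

struct Dog;
struct Cat;

impl NoiseMaker for Dog {
    fn noise(&self) -> String {
        String::from("Bark")
    }
}

impl NoiseMaker for Cat {
    fn noise(&self) -> String {
        String::from("Meow")
    }
}

fn print_noise(noise_maker: &dyn NoiseMaker) {
    println!("{}", noise_maker.noise())
}

fn get_noise_makers() -> Vec<Box<dyn NoiseMaker>> {
    // The annotations make the conversion to a trait object explicit.
    let dog: Box<dyn NoiseMaker> = Box::new(Dog);
    let cat: Box<dyn NoiseMaker> = Box::new(Cat);

    vec![dog, cat]
}

fn main() {
    let noise_makers = get_noise_makers();

    for noise_maker in &noise_makers {
        // `noise_maker` is a `&Box<dyn NoiseMaker>`; `&**` reaches through
        // the reference and the Box to borrow the value as `&dyn NoiseMaker`.
        print_noise(&**noise_maker);
    }

    // Box implements Deref, so trait methods can also be called directly.
    let noises: Vec<String> = noise_makers.iter().map(|n| n.noise()).collect();
    assert_eq!(noises, ["Bark", "Meow"]);
}
```

The vector owns its boxes, so the values on the heap live for as long as the vector does, even after get_noise_makers has returned.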

Notice the change in the function’s return type. The compiler won’t complain anymore. A Box is also essentially a pointer, but is represented at compile time as its own type. We will talk more about smart pointer types in the future (yes, I know I say that a lot), but for now just notice that thanks to this type the compiler is able to understand that the vector you are returning owns values that live on the heap, therefore the pointers will not be invalidated.

So, finally, let’s define a trait object: a trait object is any value that we interact with through one of its traits rather than through its concrete type. This is done through an indirection such as a reference or the Box type, since we wouldn’t know what size to allocate for an unknown type.

There are restrictions on which traits can be used as trait objects. A trait can be used as a trait object if all of its methods follow these rules:

* The return type isn't Self
* There are no generic type parameters
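
One possible way to live with the first rule, sketched under assumptions: the offspring method below is hypothetical and not part of the NoiseMaker trait used earlier in this post. Instead of returning Self, it returns a boxed trait object, so the trait can still be used behind dyn.

```rust
trait NoiseMaker {
    fn noise(&self) -> String;

    // Hypothetical extra method. Returning `Self` here would break the
    // first rule; returning a boxed trait object does not.
    fn offspring(&self) -> Box<dyn NoiseMaker>;
}

struct Dog;

impl NoiseMaker for Dog {
    fn noise(&self) -> String {
        String::from("Bark")
    }

    fn offspring(&self) -> Box<dyn NoiseMaker> {
        Box::new(Dog)
    }
}

fn main() {
    // The trait is usable as a trait object...
    let parent: Box<dyn NoiseMaker> = Box::new(Dog);

    // ...and so is the value returned from the method.
    let puppy = parent.offspring();
    assert_eq!(puppy.noise(), "Bark");
}
```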

Traits which can be used as trait objects are called object-safe.

Conclusion

This post was an extension of our last post, covering some of the tools available to Rust developers who are used to working with OOP. We’ve seen how to use Deref to simulate single inheritance (with caveats) and how to use trait objects to abstract over concrete types.

In our next post we will go into generics, which will include a discussion about associated types vs generic types.

See you then!