Rust From the Ground Up — Part 4

Introduction

Welcome to another entry in this Introduction to Rust series!

To compensate for my laziness over the last couple of posts (both on this series and the JVM series) I’ll be doing a longer and more in-depth post today.

When writing the conclusion for the last post I totally forgot that I had cut the control flow topic from the post, so today we are going to do a deep dive into it, with a special focus on match expressions. This will bring together everything you need to write more imperative-style programs.

Enjoy the read!

Statements vs Expressions

In Rust most things are expressions. But what are expressions anyway? An expression is a piece of code that produces a value. Expressions might have sub-expressions. Statements do not produce values and are composed of expressions.

let x = 5 + 5; // 5 + 5 is an expression, but the whole line is a statement

let x = {
 let y = 5 + 5;
 y
}; // The whole block surrounded by {} is an expression created by sub expressions

During this post we will talk a lot about expressions and how having an expression-based language benefits the code we write.
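To make the distinction concrete, here is a minimal sketch (the function names are made up for illustration). The first block ends with an expression and therefore produces a value; the second ends with a statement, so all it can produce is the unit type ():

```rust
// A block is an expression: its value is its final expression,
// which is written without a trailing semicolon.
fn block_value() -> i32 {
    let x = {
        let y = 5 + 5;
        y
    };
    x
}

// Ending the block with a statement (note the semicolon) leaves it with
// nothing to produce, so it evaluates to the unit type `()`.
fn block_without_value() {
    let x: () = {
        let y = 5 + 5;
        println!("y is {}", y);
    };
    x
}

fn main() {
    println!("block_value() = {}", block_value());
    block_without_value();
}
```

The only difference between the two blocks is the semicolon on the last line, and that difference is enough to change the type of the whole block.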

Control Flow

Control flow defines which piece of code should be executed (and how many times) according to some condition. Some constructs in this category that you might be familiar with are if/else blocks, for blocks and while blocks. Let’s go through each of those.

if/else

if/else is a selection expression. According to some rule you select which block of code you want to execute:

if x > 10 {
    println!("I'm big")
} else {
    println!("I'm small!")
}

The block of code does exactly what it seems. If x is larger than 10 the first block is executed, if not the second one is. You can chain multiple cases using the else if pattern:

if x > 10 {
 // block 1
} else if x < 0 {
 // block 2
} else {
 // block 3
}

In the code above we add one new test. So if x is larger than 10, block 1 is executed. If not, the second test is evaluated, and if x is smaller than 0, block 2 is executed. Finally, if none of the tests are true, block 3 is executed. Unlike in some other languages, the test must be an expression that evaluates to a bool. That means that the following code is not valid and does not compile:

let x = 10;
if x - 10 { // evaluates to an i32, not a bool
 // does something.
}

But the following is fine:

let x = true;
if x {
 // …
}

Another difference from other languages is that the test does not need to be wrapped in parentheses. Indeed, if you do try to use parentheses the compiler will warn you.

Chances are you’ve written some code that initializes a variable according to some condition that looks like this:

let mut x;

if some_condition {
 x = …
} else {
 x = …
}

But in Rust we can go further! if/else blocks are expressions, therefore they return a value. So you can write the following:

let x = if some_condition {
 10
} else {
 42
};

Much better, right? Not only is the intent of the code clearer, we also don’t need to make our variable mut. Of course, this only works if all branches return the same type, and you must make sure all possible paths are covered by ending with a default branch (i.e. an else branch without any test). This last requirement makes sure our variable always gets initialized with something. To make things clearer, let’s see some code blocks that will not compile:

let x = if some_condition {
 10
}; // no default block

let x = if some_condition {
 10
} else {
 "some string" // the blocks do not return the same type
};
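For contrast, here is a small sketch of a version that does compile. The describe function and its labels are invented for this example; the point is that every branch produces the same type (a string slice) and the chain ends with a plain else:

```rust
// Every branch evaluates to a `&'static str`, and the final `else`
// guarantees that `label` is always initialized.
fn describe(x: i32) -> &'static str {
    let label = if x > 10 {
        "big"
    } else if x < 0 {
        "negative"
    } else {
        "small"
    };
    label
}

fn main() {
    for x in [42, -3, 7] {
        println!("{} is {}", x, describe(x));
    }
}
```

Note that label is never declared mut: it is assigned exactly once, by the if/else expression itself.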

Now let’s suppose we have an enum, and a variable initialized with one of its values:

enum MyEnum {
 A, 
 B
}

//…

let x: MyEnum = …

And we want to initialize another variable according to the value of x. Based on what was discussed above you might try something like this:

let y = if x == MyEnum::A {
 // block 1
} else if x == MyEnum::B {
 // block 2
};

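Indeed, you are on the right track. It even looks like we don’t need the default case, because we are sure x can only be A or B! But we have two problems with it. The first is that the compiler cannot tell that the two tests cover every case, so an if that is used as a value and has no final else is rejected. The second is that, if you try to compile the code, you will get an error like this: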

error[E0369]: binary operation `==` cannot be applied to type `MyEnum`
  --> src/main.rs:11:18
   |
11 | let y = if x == MyEnum::A {
   |            - ^^ --------- MyEnum
   |            |
   |            MyEnum
   |
   = note: an implementation of `std::cmp::PartialEq` might be missing for `MyEnum`

error[E0369]: binary operation `==` cannot be applied to type `MyEnum`
  --> src/main.rs:13:13
   |
13 | } else if x == MyEnum::B {
   |           - ^^ --------- MyEnum
   |           |
   |           MyEnum
   |
   = note: an implementation of `std::cmp::PartialEq` might be missing for `MyEnum`

This is because the concept of equality does not exist for the type MyEnum. We can teach this concept to it quite easily and we will learn how to do it in the future. But for now we will learn another conditional code execution primitive that Rust provides, called match expressions.

Match

Depending on your prior experiences with other programming languages you might have considered writing the solution for the code above using some kind of switch statement. Rust does not have such a thing, but has something more powerful!

match expressions are, roughly speaking, a generalization of if/else expressions. While if/else is only able to handle true/false tests, match expressions generalize the tests to almost anything! The syntax looks like the following:

let x = ….
match x {
 <case-1> => …
 <case-2> => …
 …
 <default-case> => …
}

Let’s begin slowly. If x is an i32, we might have the following:

let x: i32 = …
match x {
    10 => println!("I'm ten!"),
    42 => println!("I'm the meaning of life!"),
    y => println!("I'm something else: {}", y)
}

As you might imagine, if x is 10, I’m ten! will be printed; if it is 42, I’m the meaning of life! will be printed; and if x is something else the last line will be printed. In the last case the value of x is bound to y and we can use it in the expression that follows. This binding of the value to y only exists in the scope of the expression following the =>. But let’s say we just want to check for 10 and 42:

let x: i32 = …
match x {
    10 => println!("I'm ten!"),
    42 => println!("I'm the meaning of life!"),
}

If you write the above code the compiler will complain with the following explanation:

error[E0004]: non-exhaustive patterns: `std::i32::MIN..=9i32`, `11i32..=41i32` and `43i32..=std::i32::MAX` not covered
  --> src/main.rs:11:11
   |
11 | match x {
   |       ^ patterns `std::i32::MIN..=9i32`, `11i32..=41i32` and `43i32..=std::i32::MAX` not covered
   |
   = help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms

Take your time to understand what the compiler is telling you.

match expressions must check all possible values a variable might have. In the original match the compiler wouldn’t complain because we had a default case in our y binding case. How do we fix it if we want to only check for 10 and 42 then?

let x: i32 = …
match x {
    10 => println!("I'm ten!"),
    42 => println!("I'm the meaning of life!"),
    _ => ()
}

Here we introduce a default case similar to the one before, but instead of binding the value of x to a named variable we use the special name _. You can think of it as a special pattern that matches anything without creating a binding. That is, after the => the value is still only reachable through x itself. Finally, what is the () in the match? You might remember from when we discussed functions that () is the unit type: essentially a type that means “empty”, something like void in other languages (where it usually is not a real type). Because matches are expressions they return something, and this something must be of the same type for all arms of the match. The println! call returns “nothing”, which is represented by the () type. So for the default case we must also return a value of type (), which is done by constructing it in place, simply by writing (). Take some time to think about it.

We can also test for multiple values:

let x: i32 = …
match x {
    10 | 11 => println!("I'm ten or eleven!"),
    42 => println!("I'm the meaning of life!"),
    _ => ()
}

Or for ranges of values:

let x: i32 = …
match x {
    0..10 => println!("I'm between 0 and 9!"),
    42 => println!("I'm the meaning of life!"),
    _ => ()
}

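The syntax n..m creates a range of values from n to m-1. If you want to include m you can use the syntax n..=m. Keep in mind that the exclusive form n..m is only accepted inside patterns by fairly recent compilers (Rust 1.80 and later, as far as I know); on older versions write the arm above as 0..=9. We will go more in depth about ranges when we talk about for loops further down.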
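Putting literals, alternatives, ranges and the catch-all together, a minimal sketch could look like the following. The classify function and its labels are hypothetical, and the inclusive range form is used so that the example also builds on older compilers:

```rust
// Arms are tried from top to bottom; the first pattern that matches wins.
fn classify(x: i32) -> &'static str {
    match x {
        0..=9 => "single digit",      // inclusive range: 0 through 9
        10 | 11 => "ten or eleven",   // alternatives separated by `|`
        42 => "the meaning of life",  // a single literal
        _ => "something else",        // catch-all, no binding
    }
}

fn main() {
    for x in [3, 11, 42, -7] {
        println!("{} -> {}", x, classify(x));
    }
}
```

Removing the last arm brings back the non-exhaustive patterns error, because negative numbers and everything above 11 (other than 42) would be left uncovered.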

A final result of match being an expression is that, like if/else you can return a value:

let x: i32 = …

let text = match x {
    10 => "I'm ten!",
    42 => "I'm the meaning of life!",
    _ => "I'm something else"
};
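This is also enough to solve the MyEnum problem from the if/else section. The sketch below assumes we want an i32 out of each case (the values 10 and 42 and the function name are arbitrary choices for the example). No == is involved, so no notion of equality is needed, and because both variants are listed no default arm is required:

```rust
enum MyEnum {
    A,
    B,
}

// The compiler checks that every variant of `MyEnum` has an arm.
fn value_for(x: MyEnum) -> i32 {
    match x {
        MyEnum::A => 10,
        MyEnum::B => 42,
    }
}

fn main() {
    println!("A -> {}", value_for(MyEnum::A));
    println!("B -> {}", value_for(MyEnum::B));
}
```

If a third variant is added to MyEnum later, this match stops compiling until the new case is handled, which is exactly the kind of nagging we want.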

So what is all the fuss about matches? It seems neat but not too much better than traditional if/else.

Well, matches are way cooler because they perform pattern matching! Let’s do an example. Suppose we want to perform an operation that might fail, and we will represent its outcome by a type called Result. It can be either a successful operation returning an i32 or an error returning an explanation in a String. Things that belong to a limited set of possible values can be easily represented by an enum. In our topic about enums in the last post we saw that enums can carry values for each of their cases:

enum Result {
 Ok(i32),
 Err(String)
}

fn perform_failable_operation() -> Result {…}

In case of a successful result (represented by the Ok case) we want to call another function. If we get an error (represented by the Err case) we want to log the error. We can do the following:

fn another_operation(param: i32) {…}

match perform_failable_operation() {
    Result::Ok(result) => another_operation(result),
    Result::Err(reason) => println!("{}", reason)
}

Neat, right? Here we use pattern matching to extract the inner values of the Result type depending on which case it is. We can go even further with it. Let’s change the Err case a little:

enum Result {
    Ok(i32),
    Err {
        error_code: i32,
        reason: String
    }
}

Now the Err case is a struct that also has a code for the error. We can also perform pattern matching on structs!

match perform_failable_operation() {
    Result::Ok(result) => another_operation(result),
    Result::Err { error_code, reason } => println!("error_code: {}, reason: {}", error_code, reason)
}

We are also able to extract the values of the Err struct case (this is done by using the same field names). Not impressed yet? Now let’s say that if the error_code is 0 we want to terminate the application and ignore the reason; this is done by calling panic!:

match perform_failable_operation() {
    Result::Ok(result) => another_operation(result),
    Result::Err { error_code: 0, .. } => panic!(),
    Result::Err { error_code, reason } => println!("error_code: {}, reason: {}", error_code, reason)
}

Here we restrict the values of error_code that this arm accepts: panic! is only called if it is 0. Something important to notice is that the arms are tested in sequence, so the new arm needs to appear before the last one, since the last one matches any Err. Finally, the .. symbol means “ignore everything else”, so we don’t need to extract the reason field.

Now, let’s suppose that if the error_code is negative we are able to call a function to recover from the error, but we also want to log that the error happened. We also want to call special attention to this recoverable error_code by renaming the variable to something more significant. We can do this by creating a new match branch with a guard:

match perform_failable_operation() {
    Result::Ok(result) => another_operation(result),
    Result::Err { error_code: recoverable_error, reason } if recoverable_error < 0 => {
        println!("error_code: {}, reason: {}", recoverable_error, reason);
        recover();
    },
    Result::Err { error_code: 0, .. } => panic!(),
    Result::Err { error_code, reason } => println!("error_code: {}, reason: {}", error_code, reason)
}

Our second match branch now renames the field and also has a test (the guard) on its value. If the pattern matches and the guard evaluates to true, then that branch is executed. Finally, as you can see, we can introduce a new scope after the => to execute multiple operations.

Finally, let’s change our first match arm a little. Now we want to test if result is between 0 and 10 (inclusive), and if it is we call a special function, here’s how we can do it:

match perform_failable_operation() {
    Result::Ok(result @ 0..=10) => special_function(result),
    Result::Ok(result) => another_operation(result),
    Result::Err { error_code, reason } if error_code < 0 => {
        println!("error_code: {}, reason: {}", error_code, reason);
        recover();
    },
    Result::Err { error_code: 0, .. } => panic!(),
    Result::Err { error_code, reason } => println!("error_code: {}, reason: {}", error_code, reason)
}

Here we use the symbol @ to name a value (the inner value of the Ok case) and at the same time introduce a range test for it!
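The snippets in this section leave several functions undefined, so here is a self-contained sketch that exercises every arm. It rests on a few assumptions of mine: the enum is renamed to OpResult so that it does not shadow the standard library’s Result in a standalone file, and each arm returns a descriptive String instead of calling special_function, recover or panic!, which makes it easy to see which arm was chosen:

```rust
// Stand-in for the article's `Result` enum (renamed for this sketch).
enum OpResult {
    Ok(i32),
    Err { error_code: i32, reason: String },
}

// Hypothetical helper: reports which arm handled the value.
fn describe(outcome: OpResult) -> String {
    match outcome {
        // `@` binds the inner value while also testing it against a range.
        OpResult::Ok(result @ 0..=10) => format!("special: {}", result),
        OpResult::Ok(result) => format!("ordinary: {}", result),
        // Renamed field plus a guard: only taken for negative codes.
        OpResult::Err { error_code: recoverable_error, reason } if recoverable_error < 0 => {
            format!("recoverable {}: {}", recoverable_error, reason)
        }
        // Literal in field position; `..` ignores the remaining fields.
        OpResult::Err { error_code: 0, .. } => String::from("fatal"),
        OpResult::Err { error_code, reason } => format!("error {}: {}", error_code, reason),
    }
}

fn main() {
    let outcomes = vec![
        OpResult::Ok(7),
        OpResult::Ok(99),
        OpResult::Err { error_code: -2, reason: String::from("timeout") },
        OpResult::Err { error_code: 0, reason: String::from("corrupt state") },
        OpResult::Err { error_code: 5, reason: String::from("not found") },
    ];
    for outcome in outcomes {
        println!("{}", describe(outcome));
    }
}
```

Try moving the last arm to the top of the Err arms: the compiler will then warn that the arms after it are unreachable, which is the ordering issue described earlier.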

Now, let’s see how pattern matching works with tuples. Suppose we change perform_failable_operation a little:

fn perform_failable_operation() -> (Result, Result) {…}

Here perform_failable_operation is returning a pair of Results instead of a single one.

match perform_failable_operation() {
    (Result::Ok(a), Result::Ok(b)) => {
        another_operation(a);
        another_operation(b);
    },
    (Result::Err { .. }, Result::Err { .. }) => panic!()
}

matches are so powerful that we can keep extracting nested information! We can destructure the pair and then destructure the data inside the pair in one single operation! But aren’t we missing something? We are not covering every possible case! Indeed that’s what the compiler tells us.

error[E0004]: non-exhaustive patterns: `(Ok(_), Err { .. })` not covered
  --> src/main.rs:18:7
   |
18 | match perform_failable_operation() {
   |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pattern `(Ok(_), Err { .. })` not covered
   |
   = help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms

To fix it we can just use a catch all case to log that some error has happened:

match perform_failable_operation() {
    (Result::Ok(a), Result::Ok(b)) => {
        another_operation(a);
        another_operation(b);
    },
    (Result::Err { .. }, Result::Err { .. }) => panic!(),
    (_, _) => println!("Some error has happened!")
}
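A runnable variant of the tuple match might look like this. As assumptions for the sketch, the enum is called OpResult (so it does not clash with the standard library’s Result), and the arms build a String rather than calling another_operation or panic!:

```rust
enum OpResult {
    Ok(i32),
    Err { error_code: i32, reason: String },
}

// Destructures the pair and the values inside it in a single pattern.
fn summarize(pair: (OpResult, OpResult)) -> String {
    match pair {
        (OpResult::Ok(a), OpResult::Ok(b)) => format!("both ok: {}", a + b),
        (OpResult::Err { error_code, reason }, OpResult::Err { .. }) => {
            format!("both failed, first was {}: {}", error_code, reason)
        }
        // Covers the two mixed cases: (Ok, Err) and (Err, Ok).
        (_, _) => String::from("one failed"),
    }
}

fn main() {
    let failure = OpResult::Err { error_code: 1, reason: String::from("disk full") };
    println!("{}", summarize((OpResult::Ok(1), OpResult::Ok(2))));
    println!("{}", summarize((OpResult::Ok(1), failure)));
}
```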

If you are not impressed by now I give up.

The pattern we’ve developed here with the Result type is very important in Rust. Indeed, Rust’s standard library has its own Result generic type which is one of the staples of error handling in the language. We will cover it in detail in the future.

One final topic on match and if expressions is if let expressions. Let’s go back to the case where perform_failable_operation returns only one Result. We don’t care about errors, we just want to call another_operation if our result is Ok. Using a match expression we would need to do the following:

match perform_failable_operation() {
 Result::Ok(a) => another_operation(a),
 _ => ()
}

Well, that works but it ain’t pretty! if let expressions allow us to make the intent clearer once again:

if let Result::Ok(a) = perform_failable_operation() {
 another_operation(a);
}

I’ll be the first to admit that the syntax here is kind of counterintuitive. Even though there is a = symbol, no ordinary assignment is being done. To the left of the = symbol we have if let followed by a pattern, exactly like you would write in a match arm. To the right we have an expression of a matching type, followed by a new scope. In this new scope any variable extracted by the pattern (in this case a) is available to us. You can also use an else branch to handle the case where the pattern does not match:

if let Result::Ok(a) = perform_failable_operation() {
 another_operation(a);
} else {
    println!("Some error happened!");
}
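Since if let is an expression like everything else we have seen, it can also produce a value. Here is a small sketch under the same assumptions as before (a stand-in enum named OpResult with the simpler Err(String) case, and two made-up helper functions):

```rust
enum OpResult {
    Ok(i32),
    Err(String),
}

// Extracts the inner value when the pattern matches, otherwise falls back.
fn value_or(outcome: OpResult, fallback: i32) -> i32 {
    if let OpResult::Ok(a) = outcome {
        a
    } else {
        fallback
    }
}

// The same idea, this time interested only in the `Err` case.
fn error_message(outcome: OpResult) -> String {
    if let OpResult::Err(reason) = outcome {
        reason
    } else {
        String::from("no error")
    }
}

fn main() {
    println!("{}", value_or(OpResult::Ok(5), -1));
    println!("{}", value_or(OpResult::Err(String::from("boom")), -1));
    println!("{}", error_message(OpResult::Err(String::from("boom"))));
}
```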

match or if?

Earlier I said that match expressions are more general than if/else tests. This is because we can reproduce an if/else test using a match expression:

let x: bool = …
match x {
    true => println!("I'm true!"),
    false => println!("I'm false!")
}

So if match can do everything, why do we need if/else expressions? In general, it is a matter of readability. Sometimes you just want to do an operation if some condition is met, which can be written as an if without an else branch or as a match where one of the cases does nothing. I’d say in this case the if expression just looks better and is easier to read. On the other hand, if you are testing many possible values for a variable (say an i32 or an enum) you could write a sequence of if/else branches, but in general the intent is clearer using a match expression (possibly with guards). match also has the advantage of nagging you into handling every possible case, so no value is silently left uncovered. If you need to nest tests, the code is generally more readable using nested if/else expressions, but a good rule of thumb is that if you are nesting too much you probably need to rethink your code.

So try to use what makes your code easy to read and understand.

Repeating Yourself

Another control flow topic is repetition through while, for and loop blocks. If you have experience with C-based languages you are probably familiar with these, but let’s see how Rust handles them.

Repetition With while

while loops repeat some code while some condition holds true. For instance, if we want to print something while some value is positive we could do something like this:

let mut x = 10;

while x > 0 {
    println!("Hello!");
    x = x - 1;
}

The code above will print “Hello!” 10 times. Notice that because we need to update the value of x we need to say it is a mut value.

There’s not much more to say about while. You just need some boolean expression that will trigger the end of the repetition once it evaluates to false.
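To check the claim about the number of repetitions, here is a sketch that wraps the same loop in a function and counts how many times the body runs. The function name and the extra counter are additions made for this example:

```rust
// Runs the loop from the text and reports how many times the body executed.
fn count_hellos(start: i32) -> i32 {
    let mut x = start;
    let mut printed = 0;
    while x > 0 {
        println!("Hello!");
        printed += 1;
        x -= 1; // shorthand for `x = x - 1`
    }
    printed
}

fn main() {
    println!("printed {} times", count_hellos(10));
}
```

If the condition is already false on entry (for example with a start of 0 or a negative number) the body never runs at all.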

Iterating Over Collections With for

Our code sample for the while repetition had one problem. We needed a mut variable to keep track of our iteration. It would be nice to do it without needing mut variables, right? Well, take a look at this:

for _ in 0..10 {
 println!(“Hello!”);
}

As mentioned before, the syntax n..m creates a range from n to m - 1. So this code sample does exactly the same thing as the one before, with fewer lines of code and without mutable variables! Because we are not using the current value of the iteration for anything we can bind it to _ to just ignore it, similar to how we used it in pattern matching. If we also want to print the current value we can give it a name:

for i in 0..10 {
    println!("{}: Hello!", i);
}

But it is not only ranges that can be used with for! Any type that can be turned into an iterator (in Rust terms, any type that implements the IntoIterator trait) can be used. We will go into more detail about iterators and how to declare our own in the future, but for now suffice it to say that Vec is one such type, so we may write the following code:

let vec: Vec<i32> = vec!(1, 2, 15, 22, 42);
for i in vec {
    println!("{}", i);
}

Here we create a Vec with some data and use the for syntax to go through each item of the Vec. But let’s say that we don’t like the number 15 and want to skip it. For that we can use the continue keyword. When continue is evaluated, the current iteration ends and the next one starts:

let vec: Vec<i32> = vec!(1, 2, 15, 22, 42);
for i in vec {
    if i == 15 {
        continue;
    }
    println!("{}", i);
}

Now the number 15 won’t be printed, but all others will. Finally let’s say that we want to completely stop the iteration once we find the number 22. We can use the break keyword, that once evaluated will finish the whole iteration:

let vec: Vec<i32> = vec!(1, 2, 15, 22, 42);

for i in vec {
    if i == 22 {
        break;
    }
    println!("{}", i);
}

Now we will only print the numbers 1, 2 and 15. break and continue can also be used with while loops. Now, since we are talking about loops…
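Both keywords can of course be combined in the same loop. In the sketch below (the function name is made up, and the visited values are collected into a new Vec instead of printed so the outcome is easy to inspect) 15 is skipped and 22 ends the iteration:

```rust
// Collects the items the loop actually reaches.
fn visible_items(items: Vec<i32>) -> Vec<i32> {
    let mut kept = Vec::new();
    for i in items {
        if i == 15 {
            continue; // skip this item, move on to the next one
        }
        if i == 22 {
            break; // stop iterating altogether
        }
        kept.push(i);
    }
    kept
}

fn main() {
    let kept = visible_items(vec![1, 2, 15, 22, 42]);
    println!("{:?}", kept);
}
```

With the same data as above this keeps only 1 and 2: 15 is skipped by continue, and nothing at or after 22 is visited.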

The loop Expression

What if you want to loop indefinitely? We could use a while loop with a condition that is always true:

while true {…}

And it will work (although the compiler will warn you about it). From inside it we can eventually use a break to get out of the loop. But there is something else, which comes with some other goodies, in the form of the loop expression:

loop {…}

The loop call doesn’t require any tests and will keep on looping forever if you don’t put a break inside of it. So the following code

loop {
    println!("Nothing can stop me!");
}

will keep printing forever until you kill the application. But if you have been paying attention you noticed that I’m calling it an expression. So it should be able to return some value, right? Indeed it can!

let x = loop {
    let current_val = random_number();
    if current_val > 42 {
        break current_val;
    }
};

Here we are using some random-number-generating function; if the number is larger than 42 we return it. We do that using the break keyword, which has a special meaning inside loop blocks and is able to return a value. Fancy, huh?

You can still use a break without a value to just break out of the loop (and continue to skip the current iteration) as we did with for and while, but here we can also return a value! We are not able to do that inside while and for blocks, though.
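Because random_number is not defined anywhere in this post, here is a deterministic sketch of the same idea. Instead of drawing random numbers it doubles a candidate until it exceeds a limit; the function name and the doubling are my own stand-ins:

```rust
// Returns the first power of two that is strictly greater than `limit`.
// Intended for small limits; very large ones would overflow the u32.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut candidate = 1;
    loop {
        if candidate > limit {
            break candidate; // the value of the whole `loop` expression
        }
        candidate *= 2;
    }
}

fn main() {
    let x = first_power_of_two_above(42);
    println!("{}", x);
}
```

There is no return keyword and no variable declared outside the loop to carry the result: the loop itself is the last expression of the function, so the value handed to break becomes the function’s return value.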

Conclusion

That’s it! In the next post we will finally talk about OOP in Rust, how to associate data with behavior and what the limits are in the language. Finally we will close by talking about traits, Rust’s abstraction for generic type behavior, which is more or less similar to interfaces in other OOP languages.

But that will take some time. I’ll take a couple of weeks vacation for all my series (both this and the JVM series) because of the end of the year.

See you next year!