Pedro Jordão's Blog

Rust From the Ground Up — Part 5

Introduction

I’m back!

Hope you had good festivities. But now it’s time to go back to Rust with some philosophical debate. You do read these posts to talk about philosophy, right?

Binding Data With Behavior

Up to this point we have seen how to create structures that group data together and how to create functions that receive and/or return these structures. Now we will see how to bind our data with some behavior. Many languages have the concept of methods: functions that are associated with a type and operate on a specific instance of that type. Rust also has methods, but some things are a little bit different. To begin, we do not declare the methods of a type together with its data, but in a separate impl block:

struct Point {
 x: f64,
 y: f64
}

impl Point {
 // here we declare our methods
}

Rust has no special constructor syntax. The generally accepted pattern is to write ordinary associated functions that act as constructors for your data type. That is, functions that receive zero or more arguments and return a value of the type itself.

impl Point {
 fn new(x_arg: f64, y_arg: f64) -> Point {
   Point {
     x: x_arg,
     y: y_arg
   }
 }

 fn zero() -> Point {
   Point {
     x: 0.0,
     y: 0.0
   }
 }
}

We already have a few things to note here. First off, Rust does not have function or method overloading (that is, functions with the same name but different signatures). Yup, shocking, took me a while to accept that too. That means we cannot overload the new method with a parameterless version that creates a point at the origin. The good thing about that is that you end up being forced to give your methods clear names. Second, we can simplify things a little here like this:

// …
 fn new(x: f64, y: f64) -> Self {
   Point { x, y } // sugar for Point { x: x, y: y}
 }

 fn zero() -> Self {
   // …
 }
// …

So, here’s what has changed. First off, in the signatures of the constructor methods (or factory methods) we’ve replaced the type name with Self. This is sort of a meta type that means “the type context I’m in”. Because we are implementing methods for Point, the type Self in this context means Point. This meta type becomes really useful when we start talking about generics.

Secondly, the implementation of the method new was changed a little. When creating a structure, if we have variables with the same names as its fields and with compatible types, we can use the above syntactic sugar (known as field init shorthand).

Calling factory methods is also simple. We use the syntax <TypeName>::<method_name>(...) to call the function method_name associated with the type TypeName. The symbol :: distinguishes calls to functions associated with types from calls to functions associated with instances (which we will see shortly). For instance, creating a new Point using the methods we’ve declared is as simple as

let point = Point::new(1.0, 2.0);
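Putting the pieces so far together, a minimal runnable sketch could look like the following. The derive line is my addition (it is not part of the snippets above) and is only there so the points can be printed and compared.

```rust
// `Debug` and `PartialEq` are derived only for printing and comparing
// in this sketch; the original snippets do not include them.
#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // An ordinary associated function acting as a constructor.
    fn new(x: f64, y: f64) -> Self {
        Point { x, y }
    }

    // A second, clearly named constructor instead of an overload.
    fn zero() -> Self {
        Point { x: 0.0, y: 0.0 }
    }
}

fn main() {
    let point = Point::new(1.0, 2.0);
    let origin = Point::zero();
    println!("{:?} {:?}", point, origin);
}
```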

Keep in mind that these associated functions do not need to have any relationship with the type itself. We could do something like

impl Point {
 fn sum(a: u32, b: u32) -> u32 {
   a + b
 }
}

Though it would be a little bit silly and would not follow good practices.

Ok, we now know how to create functions associated with types, but we can also have functions which are associated with instances. This is more similar to how methods work in other languages, because now we need a specific instance of our structure and we have access to its fields:

impl Point {
 fn dividing_by(&self, d: f64) -> Self { 
   Point {
     x: self.x / d,
     y: self.y / d
   }
 }

 fn div_by_locally(&mut self, d: f64) {
   self.x /= d;
   self.y /= d;
 }

 fn div_by_globally(self, d: f64) -> Self {
   Point {
     x: self.x / d,
     y: self.y / d
   }
 }
}

Here we’ve implemented 3 different division methods. Calling a method on an instance is as simple as <instance_name>.<method_name>(...). Notice that here we are using a . instead of a ::. This is because :: will look for functions associated with a type, while the . will look for functions (or fields) associated with an instance. The self parameter is implicitly passed to the method when we call it.

What they all have in common is the first parameter, called self. Instance methods take this first (and always first) parameter to mean the instance the method is being called on. We have 4 ways of taking the self parameter:

  • self: will take the instance as a value. This means that the instance will not be available to be used after calling this method (it will be consumed/moved into the method);

  • mut self: exactly the same as self but we take it as a mutable value so we can change the instance’s data inside the method;

  • &self: we take an immutable reference to the instance, so inside the method we cannot mutate the contents of the instance, but we do not consume the instance afterwards;

  • &mut self: similar to &self but we can mutate the contents of the instance inside the method.

Note that for the mut variants we need to have a mut access to the instance in the first place:

let point = Point::new(0.0, 1.0);
let _ = point.dividing_by(1.0); // ok, we take it by an immutable 
                                // reference and the original 
                                // contents are not changed

point.div_by_locally(1.0); // not ok, we do not have mutable access
                           // to `point`

let mut mutable_point = Point::new(0.0, 1.0); // a mutable point
mutable_point.div_by_locally(1.0); // ok, we have mutable access to 
                                  // `mutable_point`

let p2 = point.div_by_globally(1.0); // ok, we can take point by 
                                     // value and move into the 
                                     // method
let p3 = point.div_by_globally(1.0); // not ok, `point` was moved in 
                                     // the last call and doesn’t   
                                     // exist anymore

Neat.
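For reference, here is a version of the same idea that compiles as a whole, with the failing call left as a comment. It is only a sketch: the derive line and the halved method are hypothetical additions of mine, the latter included to show the mut self form, which the snippets above do not use.

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    fn new(x: f64, y: f64) -> Self {
        Point { x, y }
    }

    // `&self`: read-only borrow, the caller keeps the instance.
    fn dividing_by(&self, d: f64) -> Self {
        Point { x: self.x / d, y: self.y / d }
    }

    // `&mut self`: mutable borrow, the instance is changed in place.
    fn div_by_locally(&mut self, d: f64) {
        self.x /= d;
        self.y /= d;
    }

    // `self`: the instance is moved into the method.
    fn div_by_globally(self, d: f64) -> Self {
        Point { x: self.x / d, y: self.y / d }
    }

    // `mut self` (hypothetical method): the instance is moved in and the
    // method may modify its own copy before returning it.
    fn halved(mut self) -> Self {
        self.x /= 2.0;
        self.y /= 2.0;
        self
    }
}

fn main() {
    let point = Point::new(2.0, 4.0);
    let half = point.dividing_by(2.0); // `point` is still usable afterwards

    let mut mutable_point = Point::new(2.0, 4.0);
    mutable_point.div_by_locally(2.0);

    let moved = point.div_by_globally(2.0); // `point` is moved here
    // point.dividing_by(1.0); // would not compile: `point` was moved

    let halved = moved.halved();
    println!("{:?} {:?} {:?}", half, mutable_point, halved);
}
```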

This control over mutability is one of the main perks and deepest topics in the language. I won’t go further into it right now since I want to dedicate a whole post (or maybe more than one) to it.

Traits

But what if we want to generalize about the capabilities types might have? For instance, our Point type might be able to be “drawn” onto some surface, so we could add a draw method to its implementation.

impl Point {
 // …
 fn draw(&self, surface: &mut Surface) {…}
}

That’s cool and it works, but not only points are drawable, so if we have a line type

struct Line {
 start: Point,
 end: Point
}

we would also need to implement a draw method for it… well, if you are familiar with the concept of interfaces you know where I’m going with this. Rust doesn’t have interfaces exactly, but it does have something similar called traits, which are way cooler. Let’s start from the beginning. A trait represents a capability of a type:

trait Drawable {
 fn draw(&self, surface: &mut Surface);
}

Here we are declaring that there is a trait called Drawable which has a method draw. We can implement it for some types as follows:

impl Drawable for Point {
 fn draw(&self, surface: &mut Surface) {…}
}

impl Drawable for Line {
 fn draw(&self, surface: &mut Surface) {…}
}
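To make this concrete, here is a small runnable sketch. The post never defines Surface, so the version below is a hypothetical stand-in that merely records textual draw commands; the method bodies are likewise my own placeholders.

```rust
// Hypothetical stand-in for `Surface`: it only records draw commands.
struct Surface {
    commands: Vec<String>,
}

struct Point {
    x: f64,
    y: f64,
}

struct Line {
    start: Point,
    end: Point,
}

trait Drawable {
    fn draw(&self, surface: &mut Surface);
}

impl Drawable for Point {
    fn draw(&self, surface: &mut Surface) {
        surface
            .commands
            .push(format!("point at ({}, {})", self.x, self.y));
    }
}

impl Drawable for Line {
    fn draw(&self, surface: &mut Surface) {
        surface.commands.push(format!(
            "line from ({}, {}) to ({}, {})",
            self.start.x, self.start.y, self.end.x, self.end.y
        ));
    }
}

fn main() {
    let mut surface = Surface { commands: Vec::new() };
    let line = Line {
        start: Point { x: 0.0, y: 0.0 },
        end: Point { x: 1.0, y: 2.0 },
    };

    // Both types expose the same capability through the trait.
    line.start.draw(&mut surface);
    line.draw(&mut surface);

    for command in &surface.commands {
        println!("{}", command);
    }
}
```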

That looks pretty similar to how you implement interfaces in other languages, with the exception that the implementations are made in separate implementation blocks. But this raises new possibilities. As of right now we are declaring our types and traits in the same place. But we could take advantage of separate implementation blocks to implement our new trait for a type we did not define:

impl Drawable for u32 {…} // don’t ask me how this would work

u32 is a primitive type that we did not define, but we can still extend its behavior through traits. The only restriction (known as the orphan rule) is that either the trait or the type (or both) must be defined in the current crate. That means we cannot implement a trait from crate A for a type from crate B while in crate C. This is to avoid conflicting implementations once we start working with dependencies. But for now we are keeping everything in the same crate.
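As a small illustration of that restriction, the sketch below implements a locally defined trait for u32. The Describe trait and its method are hypothetical names chosen for the example.

```rust
// A trait defined locally (hypothetical, for illustration only).
trait Describe {
    fn describe(&self) -> String;
}

// Allowed: the trait is ours, even though `u32` is not.
impl Describe for u32 {
    fn describe(&self) -> String {
        format!("the number {}", self)
    }
}

// Not allowed: neither the trait nor the type is ours.
// impl std::fmt::Display for u32 { /* ... */ }

fn main() {
    println!("{}", 7u32.describe());
}
```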

Another thing that makes Rust’s trait system more powerful than simple interfaces is that we can implement our traits for types we don’t even know!

trait Printable {
 fn print(&self);
}

impl<T> Printable for T {
 fn print(&self) {
   println!("Hello!")
 }
}

fn main() {
 let x = 0;
 x.print();
}

Here we are getting a little bit into the generic type system of Rust, which will be covered in depth in a later installment, but it’s relatively clear what this code means: implement Printable for every type T.

A small note here is that we could have simplified the implementation as

trait Printable {
 fn print(&self) {
   println!("Hello!")
 }
}

impl<T> Printable for T {}

We can give a default implementation for methods inside traits, and if we so wish we can override this implementation later.
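Overriding a default is easiest to see without the blanket implementation in the way. In the sketch below, the text helper method and the two unit structs Quiet and Loud are hypothetical additions of mine, used so the result can be inspected as a value.

```rust
trait Printable {
    // Default implementation: used unless an implementor overrides it.
    // `text` is a hypothetical helper added for this sketch.
    fn text(&self) -> String {
        String::from("Hello!")
    }

    // Also a default; it relies on whatever `text` returns.
    fn print(&self) {
        println!("{}", self.text());
    }
}

struct Quiet;
struct Loud;

// Accepts both defaults.
impl Printable for Quiet {}

// Overrides `text` only; `print` keeps its default body.
impl Printable for Loud {
    fn text(&self) -> String {
        String::from("HELLO!")
    }
}

fn main() {
    Quiet.print();
    Loud.print();
}
```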

Going back to implementing over generics (known as blanket implementations), they are a really powerful tool that should be used with care. In the example of Printable, if a user wanted to implement the Printable trait for their own type with a different implementation, they would not be able to, since that implementation would conflict with the blanket implementation. Another point is that this really broad blanket we are using (i.e. for every T) is almost useless. Inside of our implementation we have absolutely no information about our type T that might be used to do anything interesting. A more useful technique is to use blanket implementations with type bounds. That means we limit the set of types T our blanket implementation applies to:

trait Paintable {
 // takes &self so it can be called from draw, which only has &self
 fn paint(&self, color: Color);
}

impl<T: Paintable> Drawable for T {
 fn draw(&self, surface: &mut Surface) {
   self.paint(Color(0, 0, 0, 1));
   // do drawing logic here
 }
}
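Since Color and Surface are not defined in this post, here is a self-contained sketch of the same technique using two hypothetical traits, Named and Greet. Every type that implements Named gets Greet for free, and the bound is what lets the blanket implementation call name.

```rust
// Hypothetical traits for illustration.
trait Named {
    fn name(&self) -> String;
}

trait Greet {
    fn greet(&self) -> String;
}

// Blanket implementation with a bound: applies to every `T: Named`,
// so inside the body we know `self.name()` exists.
impl<T: Named> Greet for T {
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct Cat;
struct Dog;

impl Named for Cat {
    fn name(&self) -> String {
        String::from("cat")
    }
}

impl Named for Dog {
    fn name(&self) -> String {
        String::from("dog")
    }
}

fn main() {
    // Neither type implements `Greet` directly.
    println!("{}", Cat.greet());
    println!("{}", Dog.greet());
}
```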

Now, this is cool! But be aware that once again we have the issue that if user code wants to implement both Paintable and Drawable with its own implementation of draw, it won’t be able to, so use this strategy with care. A great example of this being used to great effect is the pair of From and Into traits from the standard library. These conversion traits have roughly the following signatures:

trait From<T> {
 fn from(value: T) -> Self;
}

trait Into<T> {
 fn into(self) -> T;
}

We are introducing some new syntax here with the <T> and, as you can imagine, this is related to generics. For now you just need to understand it as “some type T”. The From trait creates a new instance of your type from some type T, while Into takes your type and turns it into some type T. There’s a mirroring effect with these traits. It makes sense that having an implementation of From<T> for U is pretty much the same as having Into<U> for T (take your time to think about this looking at the methods above). Indeed, the standard library provides the following quite trivial blanket implementation:

impl<T, U: From<T>> Into<U> for T {
 fn into(self) -> U {
   U::from(self)
 }
}

Again, take your time to understand what’s being said. For every type U that implements From<T> we get the implementation of Into<U> for the respective T automatically.
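In practice this means we only ever write the From side. A minimal sketch, assuming we want to build a Point from a tuple (the derive line is added only so the two results can be compared):

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// We implement only `From`...
impl From<(f64, f64)> for Point {
    fn from(pair: (f64, f64)) -> Self {
        Point { x: pair.0, y: pair.1 }
    }
}

fn main() {
    let pair: (f64, f64) = (1.0, 2.0);

    let a = Point::from(pair);
    // ...and `into` comes from the standard library's blanket implementation.
    let b: Point = pair.into();

    assert_eq!(a, b);
    println!("{:?}", b);
}
```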

Another nifty blanket implementation for the From trait is quite trivial:

impl <T> From<T> for T {
 fn from(value: T) -> T {
   value
 }
}

Though it looks really silly it comes in handy in some generic contexts.
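One generic context where it helps is a function that accepts anything convertible into a given type. In the sketch below (Label is a hypothetical type), passing a &str goes through From<&str> for String, while passing a String only works because String converts into itself through the reflexive implementation.

```rust
// Hypothetical type for illustration.
struct Label {
    text: String,
}

impl Label {
    // Accepts anything that can be converted into a `String`.
    fn new(text: impl Into<String>) -> Self {
        Label { text: text.into() }
    }
}

fn main() {
    // Uses `From<&str> for String`.
    let a = Label::new("from a &str");
    // Uses the reflexive `From<T> for T`: `String` into `String`.
    let b = Label::new(String::from("from a String"));

    println!("{} / {}", a.text, b.text);
}
```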

OOP and Rust

Object Oriented Programming has been the biggest programming paradigm over the last couple of decades. How many interviews did you go through where you had to talk about inheritance? The difference between interfaces and abstract classes, what is polymorphism, blablabla.

Rust is not an OOP language, but many of its concepts can be mapped to OOP.

Does that mean that Rust is inferior to OOP languages because you can’t do everything?

Does that mean that it is superior because it doesn’t have the pitfalls of OOP languages (shout out to my FP folks!)?

Well, neither and both. On one hand, if you have only ever worked with OOP it will probably take you some time to adjust your mindset. On the other hand, if you are coming from something like C or Fortran you will feel right at home when laying out your structures, but it might take some time to use methods and traits to their full potential.

But what is OOP anyway? Wikipedia says

Object-oriented programming (OOP) is a programming paradigm based on the concept of “objects”, which can contain data, in the form of fields (often known as attributes or properties), and code, in the form of procedures (often known as methods). A feature of objects is an object’s procedures that can access and often modify the data fields of the object with which they are associated (objects have a notion of “this” or “self”). In OOP, computer programs are designed by making them out of objects that interact with one another. OOP languages are diverse, but the most popular ones are class-based, meaning that objects are instances of classes, which also determine their types.

This initial definition is perfectly valid for Rust. We have structures that contain data fields and we can associate methods to these structures. We just don’t call these structures objects traditionally. So far so good. Next up let’s go through what features OOP languages might have.

Class-based vs prototype-based In class-based languages the classes are defined beforehand and the objects are instantiated based on the classes. If two objects apple and orange are instantiated from the class Fruit, they are inherently fruits and it is guaranteed that you may handle them in the same way; e.g. a programmer can expect the existence of the same attributes such as color or sugar_content or is_ripe. In prototype-based languages the objects are the primary entities. No classes even exist. The prototype of an object is just another object to which the object is linked. Every object has one prototype link (and only one). New objects can be created based on already existing objects chosen as their prototype. You may call two different objects apple and orange a fruit, if the object fruit exists, and both apple and orange have fruit as their prototype. The idea of the fruit class doesn’t exist explicitly, but as the equivalence class of the objects sharing the same prototype. The attributes and methods of the prototype are delegated to all the objects of the equivalence class defined by this prototype. The attributes and methods owned individually by the object may not be shared by other objects of the same equivalence class; e.g. the attribute sugar_content may be unexpectedly not present in apple. Only single inheritance can be implemented through the prototype.

Of these, Rust’s approach more closely resembles class-based design. Again, even though we don’t call them objects, we do have some form of “is-a” relationship using traits.

Dynamic dispatch/message passing It is the responsibility of the object, not any external code, to select the procedural code to execute in response to a method call, typically by looking up the method at run time in a table associated with the object. This feature is known as dynamic dispatch, and distinguishes an object from an abstract data type (or module), which has a fixed (static) implementation of the operations for all instances. If the call variability relies on more than the single type of the object on which it is called (i.e. at least one other parameter object is involved in the method choice), one speaks of multiple dispatch.

A method call is also known as message passing. It is conceptualized as a message (the name of the method and its input parameters) being passed to the object for dispatch.

With traits and trait objects we also get (single) dynamic dispatch.
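Trait objects get their own post, but a brief sketch may help here. The Shape trait and the two structs are hypothetical; the point is that total_area does not know the concrete types, so each call to area is resolved at run time.

```rust
// Hypothetical trait and types for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// `dyn Shape` is a trait object: the concrete type behind each box is
// unknown here, so `area` is dispatched dynamically.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(1.0, 3.0))];
    println!("{}", total_area(&shapes));
}
```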

Encapsulation Encapsulation is an object-oriented programming concept that binds together the data and functions that manipulate the data, and that keeps both safe from outside interference and misuse. Data encapsulation led to the important OOP concept of data hiding. If a class does not allow calling code to access internal object data and permits access through methods only, this is a strong form of abstraction or information hiding known as encapsulation. Some languages (Java, for example) let classes enforce access restrictions explicitly, for example denoting internal data with the private keyword and designating methods intended for use by code outside the class with the public keyword. Methods may also be designed public, private, or intermediate levels such as protected (which allows access from the same class and its subclasses, but not objects of a different class). In other languages (like Python) this is enforced only by convention (for example, private methods may have names that start with an underscore). Encapsulation prevents external code from being concerned with the internal workings of an object. This facilitates code refactoring, for example allowing the author of the class to change how objects of that class represent their data internally without changing any external code (as long as “public” method calls work the same way). It also encourages programmers to put all the code that is concerned with a certain set of data in the same class, which organizes it for easy comprehension by other programmers. Encapsulation is a technique that encourages decoupling.

As we’ve seen, we can associate types (structures) with functions (methods), and we can set things up so that the data in a structure can only be modified through its methods by using visibility modifiers (something we will see in the future).
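Visibility is a topic for a later post, so take the following as a rough preview only. The account module and its contents are hypothetical; the idea is that a field without pub is private to its module, so outside code has to go through the methods.

```rust
mod account {
    pub struct Account {
        // No `pub`: code outside this module cannot touch the field.
        balance: i64,
    }

    impl Account {
        pub fn new() -> Self {
            Account { balance: 0 }
        }

        // The only way to change the balance; it can enforce rules.
        pub fn deposit(&mut self, amount: i64) {
            if amount > 0 {
                self.balance += amount;
            }
        }

        pub fn balance(&self) -> i64 {
            self.balance
        }
    }
}

fn main() {
    let mut acc = account::Account::new();
    acc.deposit(50);
    // acc.balance = 1_000; // would not compile: `balance` is private
    println!("{}", acc.balance());
}
```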

I personally would argue that this is one of the weak features of OOP, not because it is intrinsically bad, but because it is misused. The concept of encapsulation is taken to extremes in OOP languages (I’m looking at you two, Java and C++), where it is considered good practice (and many times required in Java if you want to deal with the beans ecosystem) to have private fields with getters and setters even for simple data classes.

Realistically, most of the types we end up writing are simple carrier types, used to group data together in a logical fashion. It is rather silly how much boilerplate we have to write in these languages to achieve something that should be simple, in the name of some sort of “defensive” programming. Indeed, other OOP languages such as C# wrap such concepts into properties, where the common case (a field of a data carrier should be accessible) is easily represented, with escape hatches for implementing getters and setters with the same field access syntax. Java itself is moving towards a simpler data type by introducing Records, which remove most of the boilerplate of creating simple types.

But more important than data hiding, in my view, is controlled mutability. Making a parallel once again with Java and C++, where we have two different approaches, Rust works in a similar fashion to C++. Rust, like C++, is able to restrict the level of mutability with which some “object” is passed to a function (or another object), but differently from C++ it is immutable by default (const by default, in C++ terms). This immutability also propagates to the inner data, so if we have an immutable “view” of an object, this immutability is recursive and we can’t access sub fields in a mutable fashion.
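A short sketch of that propagation, using two hypothetical structs. Through a shared reference to Car we cannot change the nested Engine either; we need a mutable reference to the outer value to reach the inner field mutably.

```rust
// Hypothetical types for illustration.
struct Engine {
    power: u32,
}

struct Car {
    engine: Engine,
}

// Immutable view: the nested `engine` is read-only as well.
fn inspect(car: &Car) -> u32 {
    // car.engine.power += 10; // would not compile: `car` is behind `&`
    car.engine.power
}

// Mutable view: nested fields can be changed.
fn tune(car: &mut Car) {
    car.engine.power += 10;
}

fn main() {
    let mut car = Car { engine: Engine { power: 100 } };
    tune(&mut car);
    println!("{}", inspect(&car));
}
```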

If you are coming from Java, this is a major distinction. The closest thing we have in Java is final fields, where we can’t change what that field references, but we have no way of limiting the mutability of what that reference points to. In Java land this is dealt with using encapsulation.

We will talk more about mutability and ownership in a future post.

Composition, inheritance, and delegation Objects can contain other objects in their instance variables; this is known as object composition. For example, an object in the Employee class might contain (either directly or through a pointer) an object in the Address class, in addition to its own instance variables like “first_name” and “position”. Object composition is used to represent “has-a” relationships: every employee has an address, so every Employee object has access to a place to store an Address object (either directly embedded within itself, or at a separate location addressed via a pointer). Languages that support classes almost always support inheritance. This allows classes to be arranged in a hierarchy that represents “is-a-type-of” relationships. For example, class Employee might inherit from class Person. All the data and methods available to the parent class also appear in the child class with the same names. For example, class Person might define variables “first_name” and “last_name” with method “make_full_name()”. These will also be available in class Employee, which might add the variables “position” and “salary”. This technique allows easy re-use of the same procedures and data definitions, in addition to potentially mirroring real-world relationships in an intuitive way. Rather than utilizing database tables and programming subroutines, the developer utilizes objects the user may be more familiar with: objects from their application domain.[9] Subclasses can override the methods defined by superclasses. Multiple inheritance is allowed in some languages, though this can make resolving overrides complicated. Some languages have special support for mixins, though in any language with multiple inheritance, a mixin is simply a class that does not represent an is-a-type-of relationship. Mixins are typically used to add the same methods to multiple classes. 
For example, class UnicodeConversionMixin might provide a method unicode_to_ascii() when included in class FileReader and class WebPageScraper, which don’t share a common parent. Abstract classes cannot be instantiated into objects; they exist only for the purpose of inheritance into other “concrete” classes which can be instantiated. In Java, the final keyword can be used to prevent a class from being subclassed. The doctrine of composition over inheritance advocates implementing has-a relationships using composition instead of inheritance. For example, instead of inheriting from class Person, class Employee could give each Employee object an internal Person object, which it then has the opportunity to hide from external code even if class Person has many public attributes or methods. Some languages, like Go do not support inheritance at all. The “open/closed principle” advocates that classes and functions “should be open for extension, but closed for modification”. Delegation is another language feature that can be used as an alternative to inheritance.

This was a long one. Here we have what most people think about when they think about what OOP is, and where Rust stops looking like an OOP language. Rust does not support data inheritance. It does provide the concept of “is-a-type-of” through traits, as we’ve seen before. It is important to note that a lot of the current best practices in OOP talk about interface based APIs and composition over inheritance.
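Borrowing the Person and Employee example from the quoted text, composition in Rust could be sketched as below. Field and method names follow the quote; the badge method and the sample values are hypothetical. Employee holds a Person and forwards the call by hand, which is the delegation the quote mentions.

```rust
struct Person {
    first_name: String,
    last_name: String,
}

impl Person {
    fn make_full_name(&self) -> String {
        format!("{} {}", self.first_name, self.last_name)
    }
}

// "Has-a" instead of "is-a": an `Employee` contains a `Person`.
struct Employee {
    person: Person,
    position: String,
}

impl Employee {
    // Manual delegation to the inner `Person`.
    fn make_full_name(&self) -> String {
        self.person.make_full_name()
    }

    // Hypothetical method combining both levels of data.
    fn badge(&self) -> String {
        format!("{} ({})", self.make_full_name(), self.position)
    }
}

fn main() {
    let employee = Employee {
        person: Person {
            first_name: String::from("Ana"),
            last_name: String::from("Silva"),
        },
        position: String::from("Engineer"),
    };
    println!("{}", employee.badge());
}
```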

That is not to say that common data inheritance is useless. It does solve a lot of issues elegantly. To date, some form of traditional inheritance is still one of the best ways to create GUIs.

Another point to note is that you can simulate the OOP behavior with a mix of composition and (ab)using the Deref trait. We will talk about that in our next post.

Polymorphism Subtyping — a form of polymorphism — is when calling code can be agnostic as to which class in the supported hierarchy it is operating on — the parent class or one of its descendants. Meanwhile, the same operation name among objects in an inheritance hierarchy may behave differently. For example, objects of type Circle and Square are derived from a common class called Shape. The Draw function for each type of Shape implements what is necessary to draw itself while calling code can remain indifferent to the particular type of Shape is being drawn. This is another type of abstraction which simplifies code external to the class hierarchy and enables strong separation of concerns.

Polymorphism and inheritance go hand in hand when we talk about OOP. Rust is able to handle polymorphism through its trait system and trait objects, which will be discussed in our next post. For now, suffice to say that it works much like virtual methods in C++. We can reason about pointers/references to traits and not care about what the actual structure behind the implementation is. At runtime the correct implementation of the trait’s methods will be called and work as expected.

So, is Rust an OOP language?

It depends on what in OOP you are looking for. Do you only care about data inheritance? Then no, Rust is not an OOP language.

Do you care about encapsulation and interfaces? Yeah, you can do that in Rust.

Now, should you write your Rust code as you would write code in other OOP languages? Probably not. Rust has other strengths that you should focus on, but that doesn’t mean you should never use dynamic dispatch. The tools are available to you; it is your responsibility to use them well.

Conclusion

Hope you enjoyed the discussion in this post. I don’t plan to do these kinds of comparisons with other languages and paradigms too often, but at this point in time it is almost impossible to talk about a programming language without talking about OOP.

In our next post we will talk about the Deref trait and how to use it to simulate some form of data inheritance, and about trait objects. After that we will talk about one of my favorite topics in any language: Generics!

See you soon.

 