Pedro Jordão
Pedro Jordão's Blog

Pedro Jordão's Blog

Rust and Pure Functional Programming: Square Peg, Round Hole

Rust and Pure Functional Programming: Square Peg, Round Hole

Subscribe to my newsletter and never miss my upcoming articles

Introduction


For the last few weeks since the announcement of Rats we've been fast at work exploring how to write pure FP in Rust. The original strategy based on this blog post from Michael Snoyman took us quite far. It works relatively well as far as you know which types you are working with. As soon as you try to write generic code that could work with, say, any Functor problems show.

In this post we will explore the problems that we've encountered with the current encoding of typeclasses, alternatives we've come up with and the final encoding we are going with (at least until more things break).

Motivation


Here is a snipped version of our typeclass hierarchy:

pub trait Functor {
    type Inner;
    type Outter<B>: Functor;

    fn fmap<...>(self, ...) -> Self::Outter<...>;
}

pub trait Apply: Functor {
    fn apply<...>(self, ...) -> Self::Outter<...>;
}

pub trait Applicative: Apply {
    fn pure<...>(...) -> Self::Outter<...>;
}

It looks nice at a first glance. And we can implement it quite easily for Option

impl<T> Functor for Option<T> {
    type Inner = T;
    type Outter<B> = Option<B>;

    fn fmap<...>(self, ...) -> Option<...> { ... }
}

impl <T> Apply for Option<T> {
    fn apply<...>(self, ...) -> Option<...> { ... }
}

impl <T> Applicative for Option<T> {
    fn pure<...>(...) -> Option<...> { ... }
}

That looks nice and clean, and usage is equally as simple.

let foo: Option<String> = Option::<i32>::pure(0).fmap(|i| i.to_string());

An important topic when talking about typeclasses are laws. The implementation of each typeclass should follow a set of laws for them to be properly formed. Something useful then would be to encode these laws in functions so we can reuse it for different implementations.

fn applicative_homomorphism<T, A, B, F>(a: A, mut f: T::Outter<F>)
where
     T: Applicative,
     F: Copy + FnMut(A) -> B,
         // other bounds
{
    assert_eq!(T::pure(a).apply(f), f.apply(T::pure(|mut fun: F| fun(a))));
}

Looks nice enough. Sadly it won't work.

Let's zoom in to the left side of the assert_eq! call:

T::pure(a).apply(f)

We know T: Applicative, so we are able to call pure. What do we know about the return type? Remember, this function doesn't know the concrete implementation of T, it should work for any T that implements our definition of Applicative:

fn pure<...>(...) -> Self::Outter<...>;

And where does Outter comes from?

pub trait Functor {
    type Outter<B>: Functor;
    ...
}

Finally, what is apply expecting?

fn apply<...>(self, Self::Outter<...>) -> Self::Outter<...> { ... }

You see the problem?

We aren't able to specify that Outter from Applicative should also be Apply, therefore apply can't be called on it. We didn't encounter this problem before because we knew we were working with Options. Now all we know is that we have a T that implements Applicative, and the call top pure returns something that is Functor, and that's it.

This limitation would make this API pretty much useless. If you know the concrete types you are working with upfront you might as well rely on the std methods, since most of the functional typeclass concepts are encoded there (even if with other names).

What we want is the power of implementing things in terms of typeclasses, and let the user decide which concrete type should be used.

But complaining is easy, now it is time to fix it.

Notice: Some of these issues were pointed out in the original post we based our implementation, but at the time it was a good starting point for the exploration of the design.

Exploring Solutions


Encoding per Typeclass

The obvious solution would be to reencode the the typeclasses as:

pub trait Functor {
    type Inner;
    type Outter<B>: Functor;
    ...
}

pub trait Apply {
    type Inner;
    type Outter<B>: Functor + Apply;
    ...
}

pub trait Applicative {
    type Inner;
    type Outter<B>: Functor + Apply + Applicative;
    ...
}

In this scheme the trait hierarchy is encoded in the associated type instead of the trait itself. Lets see how far it can get us:

fn applicative_homomorphism<T, A, B, F>(a: A, f: F)
    where
        T: Applicative + Apply + Functor,
        F: FnMut(A) -> B,
            // other bounds
    {
        assert_eq!(T::pure(a).apply(T::pure(f)), T::pure(f(a)));
    }

All that changed was the need to specify the full hierarchy on the T bounds.

That looks good. Again, this won't work, but it is closer to what we want. To understand why that is we need to look how would the apply and pure signatures look:

trait Apply {
    type Inner;
    type Outter<B>: Functor + Apply;

    fn apply<F: Fn(Self::Inner) -> T, T>(self, other: Self::Outter<F>) -> Self::Outter<T>;
}

trait Applicative {
    type Inner;
    type Outter<B>: Functor + Apply + Applicative;

    fn pure<A>(val: A) -> Self::Outter<A>;
}

There is no simple way to tell the compiler that the Self::Outter from Applicative is the same Self::Outter from Apply and therefore the call to apply on the left side of the assert_eq!

T::pure(a).apply(T::pure(f))

Fails because it wait a Apply::Outter but it is receiving an Applicative::Outter.

Kind Constructors

We need to carry forward the information about the compatibility of the types throughout the API.

First, let's define a way to indicate that a type is capable of being used as a functor:

pub trait FunctorTy {
    type Cons<T>;
}

Cons here stands for type CONStructor , this is essentialy encoding that a type that is Functor must be a type with one type parameter.

Next, we need to provide the implementation for fmap in a general manner:

pub trait FunctorInstance<T> {
    type Kind: FunctorTy<Cons<T> = Self>;

    fn fmap<B>(self, f: impl Fn(T) -> B) -> <Self::Kind as FunctorTy>::Cons<B>;
}

Not the prettiest, but functional. With this FunctorTy can enforce the that a type that is capable if being a functor must provide a FunctorInstance of the correct kind:

type Cons<T>: FunctorInstance<T, Kind = Self>;

Finally, we encode the annotation of Kind through an empty type that is only used as a "type carrier". For example, for Option we can do something like:

pub struct OptionKind;

impl FunctorTy for OptionKind {
    type Cons<T> = Option<T>;
}

impl<T> FunctorInstance<T> for Option<T> {
    type Kind = OptionKind;

    fn fmap<B>(self, f: impl Fn(T) -> B) -> Option<B> {
        self.map(f)
    }
}

Well, that's quite a lot of work for something that should be simpler. We will soon see the benefits (especially on the "Generic API Usage" topic).

Note: This is pretty much the idea of "type families" that motivates the usage for GAT as a HKT alternative: http://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/

Typeclass Hierarchy

But did all of this solve our initial problem? How would Apply look?

pub trait ApplyTy {
    type Cons<T>: ApplyInstance<T, Kind = Self> + FunctorInstance<T, Kind = Self>;
}

pub trait ApplyInstance<T> {
    type Kind: ApplyTy<Cons<T> = Self>;

    fn apply<B, F>(self, f: <Self::Kind as ApplyTy>::Cons<F>) -> <Self::Kind as ApplyTy>::Cons<B>
    where F: Fn(T) -> B;
}

While that does look really ugly, notice that at the implementation site we can substitute the ugliness for the concrete type:

impl ApplyTy for OptionKind {
    type Cons<T> = Option<T>;
}

impl<T> ApplyInstance<T> for Option<T> {
    type Kind = OptionKind;

    fn apply<B, F>(self, f: Option<F>) -> Option<B> 
    where F: Fn(T) -> B 
    { ... }
}

Applicative would be similar. With this we can implement applicative_homomorphism as

fn applicative_homomorphism<K, A, B, F>(a: A, f: F)
    where
        K: ApplicativeTy + ApplyTy + FunctorTy,
        F: FnMut(A) -> B,
            // other bounds
    {
        let other_a = <K as ApplicativeTy>::Cons::<A>::pure(a);
        let other_f = <K as ApplicativeTy>::Cons::<F>::pure(f);
        let foo = other_a.apply(other_f);

        let bar = <K as ApplicativeTy>::Cons::<B>::pure(f(a));

       assert_eq!(foo, bar);
    }

More verbose and ugly, but it works!

We will soon see how we can improve the API.

Notice that the encoding of the type in the Ty types (i.e. the types ended in Ty) needs to contain an instance of the full typeclass hierarchy plus itself.

Ensuring Consistency


What is more fun than building something? Breaking it!

Let's try to define an apply for a type that does not have an implementation of Functor

pub struct VecKind;
// Oops, I forgot to define Functor
impl ApplyTy for VecKind {
    type Cons<T> = Vec<T> 
}

impl<T> ApplyInstance<T> for Vec<T> { ... }

The compiler correctly points out that you do not have a FunctorInstance for Vec:

error[E0277]: the trait bound `VecKind: FunctorTy` is not satisfied
   --> src/lib.rs:45:5
    |
45  |     type Kind = VecKind;
    |     ^^^^^^^^^^^^^^^^^^^^ the trait `FunctorTy` is not implemented for `VecKind`
...
101 |     type Kind: ApplyTy + FunctorTy;
    |                         -------- required by this bound in `ApplyInstance::Kind`

What about passing the wrong kind to the Instance types?

pub struct VecKind;

impl FunctorTy for VecKind {
    type Cons<T> = Vec<T> 
}

impl<T> FunctorInstance<T> for Vec<T> { 
    type Kind = OptionKind; // oops

    //...
}

Once again, the compiler helps us:

error[E0271]: type mismatch resolving `<OptionKind as FunctorTy>::Cons<T> == Vec<T>`
  --> src/lib.rs:45:5
   |
21 |     type Kind: FunctorTy<Cons<T> = Self>;
   |                         ---------------- required by this bound in `FunctorInstance::Kind`
...
45 |     type Kind = OptionKind;
   |     ^^^^^^^^^^^^^^^^^^^^^^^ expected struct `Vec`, found enum `Option`
   |
   = note: expected struct `Vec<T>`
                found enum `Option<T>`

Notice that if you define Cons in FunctorTy with the wrong type, there is not that can be done (and that's a pro since that's how we encode the Id type).

Facilitating Generic API Usage


If you know the types you are working with beforehand the encoding presented (and in fact, the encoding we started out with) is simple to use:

Option::pure(0i32).fmap(|i| i + 1); // Some(1)

But as we've seen above, implementing the same thing generically looks complicated:

fn generic_function<K, A, B, F>(a: A, f: F)
    where
        K: ApplicativeTy + ApplyTy + FunctorTy,
        F: FnMut(A) -> B
    {
        let other_a = <K as ApplicativeTy>::Cons::<A>::pure(a);
        let foo = other_a.fmap(f);
    }

This is especially true for methods that do not have receives (i.e. self parameter) like pure.

At the call site this function does not look great either:

let foo = generic_function<OptionKind, _, _, _>(0i32, |i: i32| i + 1);

So we will help out the Rust compiler to identify what we want to do. For each typeclass we will add a module containing the set of functions available to the type:

pub mod functor {
        pub fn fmap<Kind: FunctorTy, A, B, F>(
        _: Kind,
        fa: impl FunctorInstance<A, Kind = Kind>,
        f: F,
    ) -> Kind::Cons<B>
    where
        F: FnOnce(A) -> B,
    {
        fa.fmap(f)
    }
}

pub mod applicative {
      pub fn pure<Kind: ApplicativeTy, A>(_: Kind, value: A) -> Kind::Cons<A> {
        Kind::Cons::<A>::pure(value)
    }
}

Notice that now we pass the Kind type as a parameter but never use it, and is only there to help out the type system. Since Kind is meant to only carry the associated type information and is indeed never bound we shouldn't get any runtime penalties (but we haven't checked).

This way, the applicative_homomorphism becomes

fn applicative_homomorphism<K, A, B, F>(kind: Kind, a: A, f: F)
    where
        K: ApplicativeTy + ApplyTy + FunctorTy,
        F: FnMut(A) -> B,
            // other bounds
    {
        let other_a = applicative::pure(kind, a);
        let other_f = applicative::pure(kind, f);
        let foo = apply::ap(kind, other_a, other_f);

        let bar = applicative::pure(kind, f(a));

       assert_eq!(foo, bar);
    }

Which is much, much cleaner.

At the call site it is as simple as

applicative_homomorphism(OptionKind, 0i32, |i: i32| i + 1);

Practical Considerations


The API presented above is (almost) exactly the one currently being implemented on the Rats library. The only distinction is what is the signature of the functions we accept for typeclasses like Apply and Functor. On our implementation the signature receives a reference to the contained type instead of passing by value (Fn(&A) → B).

We decided this approach was better than forcing types to be Clone in for types like Apply (think about how you would implement it for Vec for instance).

Limitations


But we are still not there. I'm sure there will still be a lot of corner cases that this API is not suited for. But at the time of this writing this are the two major issues we face:

Nightly Only

I don't see the GAT implementation getting into stable Rust any time soon. There is a fairly strong push to stabilize a version containing only lifetime parameters, but the general case is still up in the air.

Tooling

rustfmt doesn't like some of the new syntax introduced by GATs. In particular this:

type Kind: FunctorTy<Cons<T> = Self>;

Always gets overwritten to

type Kind: FunctorTy<Cons = Self>;

rust-analyzer also isn't ready to handle GATs and during development we get a quite limited usage of it, so we end up relying mostly on the write + cargo check loop.

Where to Go From Here


AT this point we have many of the main typeclasses implemented for the basic Rust std types, namely Vec, Option, and Result.

Now we are focusing on a implementation of functional data types based on these API. In you can find this project under the ratadata name in our Repos.

A first test of fire will be writing a proof of concept IO to wrap effectful operations, with a sample usable application (say, something akin to console4cats).

Conclusion


Rust's type system is fantastic, but it isn't quite as flexible as something like Scala or Haskell. The lack of encodings for HKT out of the box makes writing pure functional code cumbersome. Further, there is a reason why FP focused languages are garbage collected: to efficiently weave the data through these function calls it is much more efficient to pass pointers around and let the memory be managed by your runtime.

I believe that as soon as we try to implement any kind of bigger codebase based on this APIs we will need to rely extensively on Rcs and Arcs. This will probably make the API cumbersome to use, but we will get there when we get there.

Thank you @luizchagasjardim for reviewing this post.

Prior Work


https://github.com/kitsuneninetails/rust-effects

https://github.com/JasonShin/fp-core.rs

https://github.com/algermissen/rustz

https://github.com/TeaEntityLab/fpRust

https://github.com/j5ik2o/rust-fp

 
Share this