$$ % Typography and symbols \newcommand{\msf}[1]{\mathsf{#1}} \newcommand{\ctx}{\Gamma} \newcommand{\qamp}{&\quad} \newcommand{\qqamp}{&&\quad} \newcommand{\Coloneqq}{::=} \newcommand{\proves}{\vdash} \newcommand{\star}[1]{#1^{*}} \newcommand{\eps}{\varepsilon} \newcommand{\nul}{\varnothing} \newcommand{\brc}[1]{\{{#1}\}} \newcommand{\binopm}[2]{#1~\bar{\oplus}~#2} \newcommand{\mag}[1]{|{#1}|} \newcommand{\aequiv}{\equiv_\alpha} \newcommand{\semi}[2]{{#1};~{#2}} % Untyped lambda calculus \newcommand{\fun}[2]{\lambda ~ {#1} ~ . ~ {#2}} \newcommand{\app}[2]{#1 ~ #2} \newcommand{\fix}[3]{\msf{fix}~({#1} : {#2}) ~ . ~ #3 } \newcommand{\truet}{\msf{true}} \newcommand{\falset}{\msf{false}} \newcommand{\define}[2]{{#1} \triangleq {#2}} % Typed lambda calculus - expressions \newcommand{\funt}[3]{\lambda ~ \left(#1 : #2\right) ~ . ~ #3} \newcommand{\lett}[4]{\msf{let} ~ \hasType{#1}{#2} = #3 ~ \msf{in} ~ #4} \newcommand{\letrec}[4]{\msf{letrec} ~ \hasType{#1}{#2} = #3 ~ \msf{in} ~ #4}a \newcommand{\ift}[3]{\msf{if} ~ {#1} ~ \msf{then} ~ {#2} ~ \msf{else} ~ {#3}} \newcommand{\rec}[5]{\msf{rec}(#1; ~ #2.#3.#4)(#5)} \newcommand{\case}[5]{\msf{case} ~ {#1} ~ \{ L(#2) \to #3 \mid R(#4) \to #5 \}} \newcommand{\pair}[2]{\left({#1},~{#2}\right)} \newcommand{\proj}[2]{#1 . #2} \newcommand{\inj}[3]{\msf{inj} ~ #1 = #2 ~ \msf{as} ~ #3} \newcommand{\letv}[3]{\msf{let} ~ {#1} = {#2} ~ \msf{in} ~ {#3}} \newcommand{\fold}[2]{\msf{fold}~{#1}~\msf{as}~{#2}} \newcommand{\unfold}[1]{\msf{unfold}~{#1}} \newcommand{\poly}[2]{\Lambda~{#1}~.~ #2} \newcommand{\polyapp}[2]{{#1}~\left[{#2}\right]} \newcommand{\export}[3]{\msf{export}~ #1 ~\msf{without}~{#2}~\msf{as}~ #3} \newcommand{\import}[4]{\msf{import} ~ ({#1}, {#2}) = {#3} ~ \msf{in} ~ #4} % Typed lambda calculus - types \newcommand{\tnum}{\msf{num}} \newcommand{\tstr}{\msf{string}} \newcommand{\tint}{\msf{int}} \newcommand{\tbool}{\msf{bool}} \newcommand{\tfun}[2]{#1 \rightarrow #2} \newcommand{\tprod}[2]{#1 \times #2} \newcommand{\tsum}[2]{#1 + #2} \newcommand{\trec}[2]{\mu~{#1}~.~{#2}} \newcommand{\tvoid}{\msf{void}} \newcommand{\tunit}{\msf{unit}} \newcommand{\tpoly}[2]{\forall~{#1}~.~{#2}} \newcommand{\tmod}[2]{\exists ~ {#1} ~ . ~ #2} % WebAssembly \newcommand{\wconst}[1]{\msf{i32.const}~{#1}} \newcommand{\wbinop}[1]{\msf{i32}.{#1}} \newcommand{\wgetlocal}[1]{\msf{get\_local}~{#1}} \newcommand{\wsetlocal}[1]{\msf{set\_local}~{#1}} \newcommand{\wgetglobal}[1]{\msf{get\_global}~{#1}} \newcommand{\wsetglobal}[1]{\msf{set\_global}~{#1}} \newcommand{\wload}{\msf{i32.load}} \newcommand{\wstore}{\msf{i32.store}} \newcommand{\wsize}{\msf{memory.size}} \newcommand{\wgrow}{\msf{memory.grow}} \newcommand{\wunreachable}{\msf{unreachable}} \newcommand{\wblock}[1]{\msf{block}~{#1}} \newcommand{\wloop}[1]{\msf{loop}~{#1}} \newcommand{\wbr}[1]{\msf{br}~{#1}} \newcommand{\wbrif}[1]{\msf{br\_if}~{#1}} \newcommand{\wreturn}{\msf{return}} \newcommand{\wcall}[1]{\msf{call}~{#1}} \newcommand{\wlabel}[2]{\msf{label}~\{#1\}~{#2}} \newcommand{\wframe}[2]{\msf{frame}~({#1}, {#2})} \newcommand{\wtrapping}{\msf{trapping}} \newcommand{\wbreaking}[1]{\msf{breaking}~{#1}} \newcommand{\wreturning}[1]{\msf{returning}~{#1}} \newcommand{\wconfig}[5]{\{\msf{module}{:}~{#1};~\msf{mem}{:}~{#2};~\msf{locals}{:}~{#3};~\msf{stack}{:}~{#4};~\msf{instrs}{:}~{#5}\}} \newcommand{\wfunc}[4]{\{\msf{params}{:}~{#1};~\msf{locals}{:}~{#2};~\msf{return}~{#3};~\msf{body}{:}~{#4}\}} \newcommand{\wmodule}[1]{\{\msf{funcs}{:}~{#1}\}} \newcommand{\wcg}{\msf{globals}} \newcommand{\wcf}{\msf{funcs}} \newcommand{\wci}{\msf{instrs}} \newcommand{\wcs}{\msf{stack}} \newcommand{\wcl}{\msf{locals}} \newcommand{\wclab}{\msf{labels}} \newcommand{\wcm}{\msf{mem}} \newcommand{\wcmod}{\msf{module}} \newcommand{\wsteps}[2]{\steps{\brc{#1}}{\brc{#2}}} \newcommand{\with}{\underline{\msf{with}}} \newcommand{\wvalid}[2]{{#1} \vdash {#2}~\msf{valid}} \newcommand{\wif}[2]{\msf{if}~{#1}~{\msf{else}}~{#2}} \newcommand{\wfor}[4]{\msf{for}~(\msf{init}~{#1})~(\msf{cond}~{#2})~(\msf{post}~{#3})~{#4}} % assign4.3 custom \newcommand{\wtry}[2]{\msf{try}~{#1}~\msf{catch}~{#2}} \newcommand{\wraise}{\msf{raise}} \newcommand{\wraising}[1]{\msf{raising}~{#1}} \newcommand{\wconst}[1]{\msf{i32.const}~{#1}} \newcommand{\wbinop}[1]{\msf{i32}.{#1}} \newcommand{\wgetlocal}[1]{\msf{get\_local}~{#1}} \newcommand{\wsetlocal}[1]{\msf{set\_local}~{#1}} \newcommand{\wgetglobal}[1]{\msf{get\_global}~{#1}} \newcommand{\wsetglobal}[1]{\msf{set\_global}~{#1}} \newcommand{\wload}{\msf{i32.load}} \newcommand{\wstore}{\msf{i32.store}} \newcommand{\wsize}{\msf{memory.size}} \newcommand{\wgrow}{\msf{memory.grow}} \newcommand{\wunreachable}{\msf{unreachable}} \newcommand{\wblock}[1]{\msf{block}~{#1}} \newcommand{\wloop}[1]{\msf{loop}~{#1}} \newcommand{\wbr}[1]{\msf{br}~{#1}} \newcommand{\wbrif}[1]{\msf{br\_if}~{#1}} \newcommand{\wreturn}{\msf{return}} \newcommand{\wcall}[1]{\msf{call}~{#1}} \newcommand{\wlabel}[2]{\msf{label}~\{#1\}~{#2}} \newcommand{\wframe}[2]{\msf{frame}~({#1}, {#2})} \newcommand{\wtrapping}{\msf{trapping}} \newcommand{\wbreaking}[1]{\msf{breaking}~{#1}} \newcommand{\wreturning}[1]{\msf{returning}~{#1}} \newcommand{\wconfig}[5]{\{\msf{module}{:}~{#1};~\msf{mem}{:}~{#2};~\msf{locals}{:}~{#3};~\msf{stack}{:}~{#4};~\msf{instrs}{:}~{#5}\}} \newcommand{\wfunc}[4]{\{\msf{params}{:}~{#1};~\msf{locals}{:}~{#2};~\msf{return}~{#3};~\msf{body}{:}~{#4}\}} \newcommand{\wmodule}[1]{\{\msf{funcs}{:}~{#1}\}} \newcommand{\wcg}{\msf{globals}} \newcommand{\wcf}{\msf{funcs}} \newcommand{\wci}{\msf{instrs}} \newcommand{\wcs}{\msf{stack}} \newcommand{\wcl}{\msf{locals}} \newcommand{\wcm}{\msf{mem}} \newcommand{\wcmod}{\msf{module}} \newcommand{\wsteps}[2]{\steps{\brc{#1}}{\brc{#2}}} \newcommand{\with}{\underline{\msf{with}}} \newcommand{\wvalid}[2]{{#1} \vdash {#2}~\msf{valid}} % assign4.3 custom \newcommand{\wtry}[2]{\msf{try}~{#1}~\msf{catch}~{#2}} \newcommand{\wraise}{\msf{raise}} \newcommand{\wraising}[1]{\msf{raising}~{#1}} \newcommand{\wif}[2]{\msf{if}~{#1}~{\msf{else}}~{#2}} \newcommand{\wfor}[4]{\msf{for}~(\msf{init}~{#1})~(\msf{cond}~{#2})~(\msf{post}~{#3})~{#4}} \newcommand{\windirect}[1]{\msf{call\_indirect}~{#1}} % session types \newcommand{\ssend}[2]{\msf{send}~{#1};~{#2}} \newcommand{\srecv}[2]{\msf{recv}~{#1};~{#2}} \newcommand{\soffer}[4]{\msf{offer}~\{{#1}\colon({#2})\mid{#3}\colon({#4})\}} \newcommand{\schoose}[4]{\msf{choose}~\{{#1}\colon({#2})\mid{#3}\colon({#4})\}} \newcommand{\srec}[1]{\msf{label};~{#1}} \newcommand{\sgoto}[1]{\msf{goto}~{#1}} \newcommand{\dual}[1]{\overline{#1}} % Inference rules \newcommand{\inferrule}[3][]{\cfrac{#2}{#3}\;{#1}} \newcommand{\ir}[3]{\inferrule[\text{(#1)}]{#2}{#3}} \newcommand{\s}{\hspace{1em}} \newcommand{\nl}{\\[2em]} \newcommand{\evalto}{\boldsymbol{\overset{*}{\mapsto}}} \newcommand{\steps}[2]{#1 \boldsymbol{\mapsto} #2} \newcommand{\evals}[2]{#1 \evalto #2} \newcommand{\subst}[3]{[#1 \rightarrow #2] ~ #3} \newcommand{\dynJ}[2]{#1 \proves #2} \newcommand{\dynJC}[1]{\dynJ{\ctx}{#1}} \newcommand{\typeJ}[3]{#1 \proves \hasType{#2}{#3}} \newcommand{\typeJC}[2]{\typeJ{\ctx}{#1}{#2}} \newcommand{\hasType}[2]{#1 : #2} \newcommand{\val}[1]{#1~\msf{val}} \newcommand{\num}[1]{\msf{Int}(#1)} \newcommand{\err}[1]{#1~\msf{err}} \newcommand{\trans}[2]{#1 \leadsto #2} \newcommand{\size}[1]{\left|#1\right|} $$

&Notepad

Generic associated types encode higher-order functions on types

Will Crichton — January 4, 2021

GATs allow type parameters to associated types in traits. This feature enables total type-level functions to be associated to structs. I show how to use this pattern to implement higher-order type-level functions, and how to use specialization to make partial functions into total functions.

Part of an ongoing series about type-level programming in Rust. Read part one first!

With generic associated types landing recently in Rust nightly, I’ve been wondering: what expressive power does this feature add to type-level programming? The answer is higher-order functions on types, and in this post I’ll explain what that means and how it works.

A refresher on type-level programming

Using a pure functional programming style, we can define objects like a list of types. For example, using tyrade, my type-level programming language:

tyrade! {
  enum TList {
    TNil,
    TCons(Type, TList)
  }

  enum TOption {
    TNone,
    TSome(Type)
  }

  // Get the Nth item from the list, where Index is either Z or S<N>
  fn Nth<List, Index>() {
    match List {
      TNil => TNone,
      TCons(X, XS) => match Index {
        Z => TSome(X),
        S(IMinusOne) => Nth(XS, IMinusOne)
      }
    }
  }
}

fn main() {
  // checks that Nth([i32, f32], 1) == Some(f32)
  assert_type_eq::<
    Nth<TCons<i32, TCons<f32, TNil>>, S<Z>>,
    TSome<f32>
  >();
}

The tyrade! procedural macro compiles the pseudo-Rust notation into a series of structs, traits, and impls. For example:

pub struct TNil;
pub struct TCons<T0, T1>(...);

pub trait ComputeNth<Index> {
    type Output;
}
pub type Nth<List, Index> = <List as ComputeNth<Index>>::Output;

impl<Index> ComputeNth<Index> for TNil {
    type Output = TNone;
}
impl<X, XS> ComputeNth<Z> for TCons<X, XS>
where X: ComputeTSome {
    type Output = TSome<X>;
}
impl<IMinusOne, X, XS> ComputeNth<S<IMinusOne>> for TCons<X, XS>
where XS: ComputeNth<IMinusOne> {
    type Output = Nth<XS, IMinusOne>;
}

See my explainer on type-level programming if you are confused about the correspondence between these programs.

Higher-order functions on types

For me, Tyrade is a explicit representation of my mental model for type-level programming. Once I conceptually understood the correspondences between type-level enums and structs, or between type-level functions and traits, then I reified that understanding into the Tyrade compiler.

However, trait/function correspondence only worked when the arguments to type-level functions were types. To explain, we’ll use the running example of a list map function. The goal is to write it in Tyrade like this:

tyrade! {
  fn Map<List, Func>() {
    match List {
      TNil => TNil,
      TCons(X, XS) => TCons(Func(X), Map(XS, Func))
    }
  }
}

Then we could use the Map type function like this:

tyrade! {
  fn TIsZero<N>() {
    match N {
      Z => TTrue,
      S(N1) => TFalse
    }
  }
}

fn main() {
  assert_type_eq::<
    Map<
      TCons<Z, TCons<S<Z>, TNil>>,
      TIsZero
    >,
    TCons<TTrue, TCons<TFalse, TNil>>
  >();
}

However, the existing translation of Map doesn’t work. It would become:

pub trait ComputeMap<Func> {
  type Output;
}
pub type Map<List, Func> = <List as ComputeMap<Func>>::Output;

impl<Func> ComputeMap<Func> for TNil {
  type Output = TNil;
}
impl<X, XS, Func> ComputeMap<Func> for TCons<X, XS>
where XS: ComputeMap<Func> {
  type Output = TCons<Func<X>, Map<XS, Func>>;
}

And this code fails to compile because Func can’t be invoked with a parameter:

error[E0109]: type arguments are not allowed for this type
    |
    |     type Output = TCons<Func<X>, Map<XS, Func>>;
    |                              ^ type argument not allowed

Herein lies the crux of the issue: type variables (i.e. impl quantifiers) are only allowed to be of kind type, and not of kind type -> type. To get higher-order type functions, we need Rust to support higher-kinded types (HKT). While Rust doesn’t support HKT directly, the addition of generic associated types (GATs) enables a pseudo-HKT pattern. See Niko’s extended discussion for the gory details.

Implementing HOFs with HKTs with GATs

TIsZero cannot be passed directly into ComputeMap, so the key idea is to create a proxy object TIsZeroProxy which can be passed in. Using GATs, we associate the TIsZeroProxy back to TIsZero in a way that can be referenced within ComputeMap. First, the proxy:

pub trait FuncProxy {
  type Func<T>;
}

pub struct TIsZeroProxy;
impl FuncProxy for TIsZeroProxy {
  type Func<T> = TIsZero<T>;
}

Then the implementation of ComputeMap can be parameterized by any type implementing FuncProxy:

impl<X, XS, Proxy> ComputeMap<Proxy> for TCons<X, XS>
where
  Proxy: FuncProxy,
  XS: ComputeMap<Proxy>
{
  type Output = TCons<Proxy::Func<X>, Map<XS, Proxy>>;
}

However, this attempt still doesn’t quite work. We get an error in the implementation of FuncProxy for TIsZeroProxy:

error[E0277]: the trait bound `T: ComputeTIsZero` is not satisfied
    |
    |   type Func<T> = TIsZero<T>;
    |   ^^^^^^^^^^^^^^^^^^^^^^^^ the trait `ComputeTIsZero` is not implemented for `T`
    |

Why do we get this? Recall that TIsZero<T> is an alias for <T as ComputeTIsZero>::Output. This means that T must implement the ComputeTIsZero trait, which isn’t guaranteed by our general FuncProxy trait definition. We could theoretically change FuncProxy to include this bound, something like:

trait FuncProxy {
  type Func<T: ComputeTIsZero>;
}

However, our goal is for Map to take as input any type-level function. This definition of FuncProxy would restrict the implement to only functions mentioned in the trait bounds.

Dealing with partial functions

Let’s back up to understand the conceptual issue. In Rust, type-level functions are partial functions, meaning they may not be implemented for all types. For example, TIsZero is only implemented for the types Z and S<N>, but not e.g. for the type String. However, to define Map, we have to ensure that Proxy::Func<X> is defined for all X in a type list.

Previously, we could ensure this condition via a trait bound. For example, if Proxy::Func was ComputeTIsZero, then we could add X: ComputeTIsZero to the implementation. But for any generic Proxy::Func, there is no way to say X: Proxy::Func because Proxy::Func is a type, not a trait. Hypothetically, if Rust supported associated traits, we could do something like:

trait FuncProxy {
  trait Func { type Output; };
}

impl FuncProxy for TIsZeroProxy {
  trait Func = ComputeTIsZero;
}

type CallProxy<Proxy, T> = <T as Proxy::Func>::Output;

impl<X, XS, Proxy> ComputeMap<Proxy> for TCons<X, XS>
where
  Proxy: FuncProxy,
  XS: ComputeMap<Proxy>,
  X: Proxy::Func
{
  type Output = TCons<CallProxy<Proxy, X>, Map<XS, Proxy>>;
}

However, Rust doesn’t have such a feature. Instead, we can use specialization to make all type functions total. We can define a base case where a type function returns an error if it’s not implemented, but as a type rather than a compiler error. To compile TIsZero, this solution looks like:

pub struct Error;

pub trait FuncProxy {
  type Func<T>;
}

pub trait ComputeTIsZero {
  type Output;
}

type TIsZero<T> = <T as ComputeTIsZero>::Output;

impl ComputeTIsZero for Z {
  type Output = TTrue;
}

impl<N> ComputeTIsZero for S<N> {
  type Output = TFalse;
}

/* key addition */
impl<T> ComputeTIsZero for T {
  default type Output = Error;
}

struct TIsZeroProxy;
impl FuncProxy for TIsZeroProxy {
  type Func<T> = TIsZero<T>;
}

With this addition, the TIsZeroProxy implementation no longer errors, because ComputeTIsZero is guaranteed to be implemented for all types T. And now, at long last, our Map program will execute correctly if we replace TIsZero with TIsZeroProxy:

fn main() {
  assert_type_eq::<
    Map<
      TCons<Z, TCons<S<Z>, TNil>>,
      TIsZeroProxy
    >,
    TCons<TTrue, TCons<TFalse, TNil>>
  >();
}

Note: as of January 2021, this pattern is theoretically sound, but seems to have ongoing performance or correctness issues in the compiler. Specialization combined with recursive trait bounds will occassionally cause the compiler to stack overflow — see my Github issue.

Dynamically-kinded type-level programming

To add support for higher-order type functions, I had to remove support for type annotations (actually kind annotations) from Tyrade. Previously, you could write functions like this:

tyrade! {
  fn TIsZero(N: TNum) -> TBool {
    match N {
      Z => TTrue,
      S(N1 @ TNum) => TFalse
    }
  }
}

This program would compile into the trait definition:

trait ComputeTIsZero: TNum {
  type Output: TBool;
}

This ensures, for example, that a function’s return value matches its return kind. If you wrote a function with a mismatch:

tyrade! {
  fn TIsZero(N: TNum) -> TBool {
    Z
  }
}

Then the compiler raises an error at the point of definition for TIsZero rather than the point of use. Hence, this language is statically-kinded. However, to kind-check a higher-order function like Map, we need a polymorphic kind system. Ideally, we could write in Tyrade:

tyrade! {
  fn Map<A, B>(L: List<A>, F: A -> B) -> List<B> {
    ...
  }
}

I don’t believe it’s possible to encode this concept into Rust’s trait system. So to add higher-order functions, our type-level programming language had to become dynamically-kinded. A sad trade-off, but perhaps more acceptable for type-level programming than value-level. Although errors are caught by the users and not the definers, at least they’re still caught at compile-time!