Parallel Lists

Another common pattern in system design is to have two lists of pairwise-related (or "parallel") elements. For example, consider println:

println!("#{:03} {}: {}", id, user, message);

The first list is the format directives (e.g. {}) in the format string. The second list is the inputs (e.g. id) to println!. These lists must have the same length.

As another example, consider routing in a web server:

#[route("/user/:id/messages/:message_id")]
fn user_message(id: Id, message_id: MessageId) {
 /* .. /
}

The first list is the routing variables (e.g. :id) and the second is the function parameters (e.g. id: Id). These lists must have the same length and corresponding types.

I'll show you how to use heterogeneous lists (H-lists), or lists of values of different types, to implement APIs with parallel lists. Specifically, we will implement a type-safe printf. For a real-world example, you can check out how the warp framework uses H-lists to do type-safe web routing.

Core mechanism: HList

First, we need to understand H-lists. An H-list is a sequence of values of potentially different types. For example, [1, "a", true] is an H-list, but not a valid Rust vector. H-lists are implemented in Rust using a linked-list style:

struct HNil;
struct HCons<Head, Tail> {
  head: Head,
  tail: Tail
}

let example: HCons<i32, HCons<bool, HNil>> =
  HCons{head: 1, tail: HCons{head: true, tail: HNil}};

The key idea is that the type of an H-list changes every time you make a change to it. By contrast, if you push to a Vec<T>, the type of the vector stays the same.

Just like Rust has vec![], we can use the frunk crate to get an hlist! macro.

let example = hlist![1, true]; // same as above

Setting up printf

Let's go back to the ingredients of printf. We need a format string and an argument list. The key idea is to represent both with an H-list, and carefully use Rust's traits to ensure our desired property: the number of arguments should match the number of holes.

First, to represent format strings, we will have a sequence of structs that represent each part of the string.

pub struct FString(&'static str);
pub struct FVar;

// Assume that we compile "Hello {}! The first prime is {}" into this code.
// That would be a simple syntactic transformation.
let example = hlist![
  FString("Hello "), FVar, FString("! The first prime is "), FVar
];

To represent arguments, we will use a matching H-list of values. For example:

let args = hlist!["world", 2];

Then, our goal is to create a function format such that this is true:

assert_eq!(
  example.format(args),
  "Hello world! The first prime is 2"
);

And this should be a compile-time (NOT run-time) error:

example.format(hlist!["Only one arg"]);

The Format trait

First, we need to define the signature of our format function.

trait Format<ArgList> {
  fn format(&self, args: ArgList) -> String;
}

Here, self is the H-list of the format directives, and ArgList is the H-list of the variadic arguments. Format need to take ArgList as a type parameter, because its type will change as we remove elements from the ArgList list.

Now, we proceed to implement the Format trait by cases. First, the base case for reaching the end of the format list HNil:

impl Format<HNil> for HNil {
  fn format(&self, _args: HNil) -> String {
    "".to_string()
  }
}

This impl says that when we reach the end of a format list, just return the empty string. And the only argument we will accept is an empty argument list. Combined with the next impls, this inductively ensures that extra arguments are not accepted.

Next, we will implement FString. This implementation should use the string constant contained in the FString struct, and combine it recursively with the rest of the format list. We don't use variadic arguments for FString, so they get passed along. In Rust, this English specification becomes:

impl<ArgList, FmtList> Format<ArgList>
for HCons<FString, FmtList>
where FmtList: Format<ArgList>
{
  fn format(&self, args: ArgList) -> String {
    self.head.0.to_owned() + &self.tail.format(args)
  }
}

Note that we have to add FmtList: Format<ArgList> to ensure the recursive call to self.tail.format works. Also note that we aren't implementing Format directly on FString, but rather on an H-list containing FString.

Finally, the most complex case, FVar. We want this impl to take an argument from the ArgList, then format the remaining format list with the remaining arguments.

impl<T, ArgList, FmtList> Format<HCons<T, ArgList>>
for HCons<FVar, FmtList>
where
  FmtList: Format<ArgList>,
  T: ToString,
{
  fn format(&self, args: HCons<T, ArgList>) -> String {
    args.head.to_string() + &self.tail.format(args.tail)
  }
}

Be careful to observe which H-list is being accessed by head and tail. Here, the args H-list provides the data to fill the hole via args.head.

Checking our properties

With this implementation, our correct example successfully compiles and runs:

let example = hlist![
  FString("Hello "), FVar, FString("! The first prime is "), FVar
];
assert_eq!(
  example.format(hlist!["world", 2]),
  "Hello world! The first prime is 2"
);

What about our incorrect example? If we write this:

example.format(hlist!["just one arg"]);

This code fails to compile with the error:

error[E0308]: mismatched types
  --> src/printf.rs:48:18
   |
48 |   example.format(hlist!["just one arg"]);
   |                  ^^^^^^^^^^^^^^^^^^^^^^
   |                  expected struct `Cons`, found struct `HNil`
   |
   = note: expected struct `HCons<_, HNil>`
              found struct `HNil`

While the error is enigmatic, our mistake is at least correctly caught at compile-time. This is because Rust deduces that example.format() expects an H-list of the shape HCons<_, HCons<_, HNil>>, but it finds HNil too soon in our 1-element H-list. A similar error occurs when providing too many args.

Stupendous! We have successfully implemented a type-safe printf using H-lists and traits.

Extending our abstraction

Right now, our Format function just checks that the format list and argument list are the same length. We could extend our format structures, for example to ensure that an FVar must be a particular type, or must use Debug vs. Display. Here's the sketch of such a strategy:

use std::marker::PhantomData;

// Add flags for whether using Display or Debug
pub struct FDisplay;
pub struct FDebug;

// Use a type parameter with PhantomData to represent the intended type
pub struct FVar<T, Flag>(PhantomData<(T, Flag)>);

// Now, T has to be the same between the format list and arg list
// Also, FDisplay flag requires that `T: Display`
impl<T, ArgList, FmtList> Format<HCons<T, ArgList>>
for HCons<FVar<T, FDisplay>, FmtList>
where
  FmtList: Format<ArgList>,
  T: Display,
{
  fn format(&self, args: HCons<T, ArgList>) -> String {
    // using format! is cheating, but you get the idea
    format!("{}", args) + &self.tail.format(args.tail)
  }
}

// Similar impl for `T: Debug` when `FDebug` is used

With this approach, if our format list and arg list differ in type:

let fmt = hlist![FString("n: "), FVar::<i32, FDisplay>(PhantomData)];
fmt.format(hlist!["not a number"]);

Then the code will not compile with the error, &'static str is not i32.