Rewriting Rust

The Rust programming language feels like a first generation product.

You know what I mean. Like the first iPhone - which was amazing by the way. They made an entire operating system around multitouch. A smart phone with no keyboard. And a working web browser. Within a few months, we all realised what the iPhone really wanted to be. Only, the first generation iPhone wasn't quite there. It didn't have 3G internet. There was no GPS chip. And there was no app store. In the next few years, iPhones would get a lot better.

Rust feels a bit like that first iPhone.

I fell in love with Rust at the start. Algebraic types? Memory safety without compromising on performance? A modern package manager? Count me in. But now that I've been programming in rust for 4 years or so, it just feels like it's never quite there.

And I don't know if it will ever be there. Progress on the language has slowed so much. When I first started using it, every release seemed to add new, great features in stable rust. Now? Crickets. The rust "unstable book" lists 700 different unstable features - which presumably are all implemented, but which have yet to be enabled in stable rust. Most of them are changes to the standard library - but seriously. Holy cow.

How much of this stuff will ever make it into the language proper? The rust RFC process is a graveyard of good ideas.

Features like Coroutines. This RFC is 7 years old now. Make no mistake - coroutines are implemented in the compiler. They're just not available for us "stable rust" peasants to use. If coroutines were a child, they would be in grade school by now. At this point, the coroutines RFC has lasted longer than World War 1 or 2.

I suspect rust is calcifying because its consensus process just doesn't scale. Early on, rust had a small group of contributors who just decided things. The monsters. Now, there are issue threads like this, in which 25 smart, well-meaning people spent 2 years and over 200 comments trying to figure out how to improve Mutex. And as far as I can tell, in the end they more or less gave up.

Maybe this is by design. Good languages are stable languages. It might be time to think of rust as a fully baked language - warts and all. Python 2.7 for life.

But that doesn't change anything for me. I want a better rust, and I feel powerless to make that happen. Where are my coroutines? Even javascript has coroutines.

Fantasy language

Sometimes I lie awake at night fantasising about forking the compiler. I know how I'd do it. In my fork, I'd leave all the rust stuff alone but make my own "seph" edition of the rust language. Then I could add all sorts of breaking features to that edition. So long as my compiler still compiles mainline rust as well, I could keep using all the wonderful crates on Cargo.

I think about this a lot. If I did it, here's what I'd change:

Function traits (effects)

Rust has traits on structs. These are used in all sorts of ways. Some are markers. Some are understood by the compiler (like Copy). Some are user defined.

Rust should also define a bunch of traits for functions. In other languages, function traits are called "effects".

This sounds weird at first glance - but hear me out. See, there's lots of different "traits" that functions have. Things like:

  • Does the function ever panic?
  • Does the function have a fixed stack size?
  • Does the function run to the end, or does it yield / await?
  • If the function is a coroutine, what is the type of the continuation?
  • Is the function "pure" (i.e. the same input produces the same output, and it has no side effects)?
  • Does the function (directly or indirectly) run unsafe code in semi-trusted libraries?
  • Is the function guaranteed to terminate?

And so on.

A function's parameters and return type are just associated types on the function:

fn some_iter() -> impl Iterator<Item = usize> {  
    vec![1,2,3].into_iter()
}

fn main() {  
    // Why doesn't this work already via FnOnce?
    let x: some_iter::Output = some_iter();
}

TAIT eat your heart out.

Exposing these properties is super useful. For example, the linux kernel wants to guarantee (at compile time) that some block of code will never panic. This is impossible to do in rust today. But using function traits, we could explicitly mark a function as being able - or unable - to panic:

#[disallow(Panic)] // Syntax TBD.
fn some_fn() { ... }  

And if the function does anything which could panic (even recursively), the compiler would emit an error.
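Until something like that exists, the best stable rust can offer is convention: reach for the non-panicking variants of standard library calls and put the failure in the return type. Here's a rough sketch of the difference. The function names are made up for illustration, and nothing here is checked by the compiler the way a real NoPanic effect would be:

```rust
/// Nothing in this signature warns the caller that it panics when
/// `index` is out of bounds or `divisor` is zero.
fn ratio_panicky(values: &[u32], index: usize, divisor: u32) -> u32 {
    values[index] / divisor
}

/// Same logic using `get` and `checked_div`, which return `Option`
/// instead of panicking. "Doesn't panic" is still only a convention
/// here: the compiler does not verify it for us.
fn ratio_checked(values: &[u32], index: usize, divisor: u32) -> Option<u32> {
    values.get(index)?.checked_div(divisor)
}

fn main() {
    let values = [10, 20, 30];
    assert_eq!(ratio_panicky(&values, 1, 5), 4);
    assert_eq!(ratio_checked(&values, 1, 5), Some(4));
    assert_eq!(ratio_checked(&values, 9, 5), None); // index out of bounds
    assert_eq!(ratio_checked(&values, 1, 0), None); // division by zero
}
```

Both functions have the same "effect signature" as far as rust is concerned. That's the gap.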

The compiler already sort of implements traits on functions, like Fn, FnOnce and FnMut. But for some reason they're anemic. (Why??)

I want something like this:

/// Automatically implemented on all functions.
trait Function {  
  type Args;
  type Output;
  type Continuation; // Unit type () for normal functions
  // ... and so on.

  fn call_once(self, args: Self::Args) -> Self::Output;
}

trait NoPanic {} // Marker trait, implemented automatically by the compiler.

/// Automatically implemented on all functions which don't recurse.
trait KnownStackSize {  
  const STACK_SIZE: usize;
}

Then you could write code like this:

fn some_iter() -> impl Iterator<Item = usize> {  
  vec![1,2,3].into_iter()
}

struct SomeWrapperStruct {  
  iter: some_iter::Output, // In 2024 this is still impossible in stable rust.
}
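For comparison, here's roughly what you have to write in stable rust today to store that iterator. This is a sketch of the usual workaround, not a recommendation: the struct has to become generic, and the return type is only reachable through an FnOnce bound. The wrap helper is a name I made up for the example.

```rust
fn some_iter() -> impl Iterator<Item = usize> {
    vec![1, 2, 3].into_iter()
}

// The field type has to be a generic parameter, because the concrete
// type returned by `some_iter` has no name we can write down.
struct SomeWrapperStruct<I: Iterator<Item = usize>> {
    iter: I,
}

// Hypothetical helper: the function's output type `I` is only reachable
// through the `FnOnce() -> I` bound.
fn wrap<F, I>(make_iter: F) -> SomeWrapperStruct<I>
where
    F: FnOnce() -> I,
    I: Iterator<Item = usize>,
{
    SomeWrapperStruct { iter: make_iter() }
}

fn main() {
    let wrapper = wrap(some_iter);
    let total: usize = wrapper.iter.sum();
    assert_eq!(total, 6);
}
```

The generic parameter then leaks into every type that contains SomeWrapperStruct.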

Or with coroutines:

coroutine fn numbers() -> impl Iterator<Item = usize> {  
  yield 1;
  yield 2;
  yield 3;
}

coroutine fn double<I: Iterator<Item=usize>>(inner: I) -> impl Iterator<Item = usize> {  
  for x in inner {
    yield x * 2;
  }
}

struct SomeStruct {  
  // Suppose we want to store the iterator. We can name it directly:
  iterator: double<numbers>::Continuation,
}
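To make "the type of the continuation" concrete, here's a hand-written approximation of those two coroutines in stable rust. It's only a sketch of the idea: the struct names (Numbers, Double) are mine, and the real compiler-generated state machines look different. But the structs play the role the Continuation type would play, which is why they can be named in a field:

```rust
/// Hand-written stand-in for `coroutine fn numbers()`: the "continuation"
/// is just a struct remembering which value gets yielded next.
struct Numbers {
    upcoming: usize,
}

impl Iterator for Numbers {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.upcoming > 3 {
            return None;
        }
        let value = self.upcoming;
        self.upcoming += 1;
        Some(value)
    }
}

fn numbers() -> Numbers {
    Numbers { upcoming: 1 }
}

/// Hand-written stand-in for `coroutine fn double(inner)`.
struct Double<I> {
    inner: I,
}

impl<I: Iterator<Item = usize>> Iterator for Double<I> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        self.inner.next().map(|x| x * 2)
    }
}

fn double<I: Iterator<Item = usize>>(inner: I) -> Double<I> {
    Double { inner }
}

struct SomeStruct {
    // Nameable only because we wrote the state machines by hand.
    iterator: Double<Numbers>,
}

fn main() {
    let s = SomeStruct { iterator: double(numbers()) };
    let out: Vec<usize> = s.iterator.collect();
    assert_eq!(out, vec![2, 4, 6]);
}
```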

Or, say, take a function parameter but require that the parameter itself doesn't panic:

fn foo<F>(f: F)  
    where F: NoPanic + FnOnce() -> String
{ ... }

Yoshua Wuyts has an excellent talk & blog post going into way more detail about effects - why they're useful and how this could work.

Compile-time Capabilities

Most rust projects pull in an insane number of 3rd party crates. Most of these crates are small utility libraries - like the human-size crate which formats file sizes for human consumption. Great stuff! But unfortunately, all of these little crates add supply chain risk. Any of those authors could push out an update which contains malicious code - cryptolockering our computers, our servers or sneaking bad code into our binaries.

I think this problem is similar to the problem of memory safety. Sure - it's sometimes useful to write memory-unsafe code. The rust standard library is full of it. But rust's unsafe keyword lets authors opt in to potentially unsafe things. We only add unsafe blocks when it's necessary.

Let's do the same thing for privileged function calls - like reading and writing to and from the filesystem or the network. This is useful stuff, but it's potentially dangerous. Developers should actively whitelist code that is allowed to call these functions.

To implement this, first we want to add marker traits to all the security-sensitive functions in the standard library (opening a file from a string, exec, FFI, opening network connections, most unsafe functions that interact with raw pointers, and so on). So, for example, std::fs::write(path, contents) writes to an arbitrary path on disk with the credentials of the user. We add some #[cap(fs_write)] marker tag to the function itself, marking that this can only be called from code which is in some way trusted. The compiler automatically "taints" any other functions which call write in the entire call tree.

Suppose I call a function in a 3rd party crate which needs the fs_write capability. In order to call that function, I need to explicitly whitelist that call. (Either by adding the permission explicitly in my Cargo.toml or maybe with an annotation at the call site).

So, let's say the foo crate contains a function like this. The function will be marked (tainted) with the "writes to filesystem" tag:

// In crate `foo`.

// (this function is implicitly tagged with #[cap(fs_write)])
pub fn do_stuff() {  
  std::fs::write("blah.txt", "some text").unwrap();
}

When I try to run that function from my code:

fn main() {  
  foo::do_stuff();
}

The compiler can give me a nice rusty error, like this:

Error: foo::do_stuff() writes to the local filesystem, but the `foo` crate has not been trusted with this capability in Cargo.toml.

Tainted by this line in do_stuff:

  std::fs::write("blah.txt", "some text").unwrap();

Add this to your Cargo.toml to fix:

foo = { version = "1.0.0", allow_capabilities = ["fs_write"] }

Obviously, most uses of unsafe would also require explicit whitelisting.

Most crates I use - like human-size or serde - don't need any special capabilities to work. So we don't need to worry so much about their authors "turning evil" and adding malicious code to our software. Reducing the supply chain risk from the 100 or so crates I currently transitively depend on down to just a few would be massive.

This is a very simple, static way that capabilities could be introduced to Rust. But it might be possible & better to change privileged code to require an extra Capability parameter (some unit struct type). And heavily restrict how Capability objects can be instantiated. Eg:

struct FsWriteCapability;

impl FsWriteCapability {  
    fn new() -> Self { Self } // Only callable from the root crate
}

// Then change std::fs::write's signature to this:
pub fn write(path: &Path, contents: &[u8], cap: FsWriteCapability) { ... }

This requires more boilerplate, but it's much more flexible. (And obviously, we'd also need to, somehow, apply a similar treatment to build.rs scripts and unsafe blocks.)
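You can approximate the token-passing version in rust today, which is a nice way to get a feel for it. The sketch below is a toy under loud assumptions: the privileged module, new_for_root and the in-memory "disk" are all invented for the example, and nothing stops other code from calling new_for_root. Enforcing that is exactly the part that needs compiler support.

```rust
mod privileged {
    /// Hypothetical capability token. The private unit field stops code
    /// outside this module from writing `FsWriteCapability(())` itself.
    pub struct FsWriteCapability(());

    impl FsWriteCapability {
        /// Stand-in for "only callable from the root crate". Today's rust
        /// cannot enforce that restriction; this is by convention only.
        pub fn new_for_root() -> Self {
            FsWriteCapability(())
        }
    }

    /// Stand-in for a privileged `std::fs::write`. To keep the example free
    /// of side effects it records writes in memory instead of touching disk.
    pub fn write(
        disk: &mut Vec<(String, String)>,
        path: &str,
        contents: &str,
        _cap: &FsWriteCapability,
    ) {
        disk.push((path.to_string(), contents.to_string()));
    }
}

use privileged::FsWriteCapability;

// A "third-party" function can only write if its caller hands over the token.
fn do_stuff(disk: &mut Vec<(String, String)>, cap: &FsWriteCapability) {
    privileged::write(disk, "blah.txt", "some text", cap);
}

// A utility function with no token parameter has nothing to pass to
// `privileged::write`, so its signature tells us it can't write files.
fn format_size(bytes: u64) -> String {
    format!("{} KiB", bytes / 1024)
}

fn main() {
    let mut disk = Vec::new();
    let cap = FsWriteCapability::new_for_root();
    do_stuff(&mut disk, &cap);
    assert_eq!(disk.len(), 1);
    assert_eq!(format_size(2048), "2 KiB");
}
```

The useful property is that the capability shows up in the function signature, so auditing a dependency becomes a matter of reading types.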

The result of all of this is that utility crates become "incorruptible". Imagine if crates.io is hacked and serde is maliciously updated to include cryptolocker code. Today, that malicious code would be run automatically on millions of developer machines, and compiled into programs everywhere. With this change, you'd just get a compiler error.

This is huge, and singlehandedly this one feature is probably worth the cost of forking rust. At least, to someone. (Anyone want to sponsor this work?)

Pin, Move and Struct Borrows

Feel free to skip this section if Pin & the borrow checker give you a migraine.

Pin in rust is a weird, complicated hack to work around a hole in the borrow checker. It's a band-aid from the land of bizarro choices that only make sense when you need to maintain backwards compatibility at all costs.

  • It's the reverse of the trait you actually want. It would make way more sense to have a Move marker trait (like Copy) indicating objects which can move.
  • But Pin isn't an actual trait. There's only Unpin (double negative now) and !Unpin - which is not-not-not-Move. For example impl !Unpin for PhantomPinned. Is !Unpin the same as Pin? Uhhhh, ... No? Because .. reasons? I get an instant headache when I think about this stuff. Here's the documentation for Unpin if you want to try your luck.
  • Pin only applies to reference types. If you read through code which uses Pin a lot, you'll find unnecessary Box-ing of values everywhere. For example, in tokio, or helper libraries like ouroboros, async_trait and self_cell.
  • The pain spreads. Any function that takes a pinned value needs the value wrapped using some horrible abomination like Future::poll(self: Pin<&mut Self>, ..). And then you need to figure out how to read the actual values out using projections, which are so complicated there are multiple crates for dealing with them. The pain cannot be confined. It spreads outwards, forever, corrupting everything.
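If you haven't had the pleasure, here's about the smallest example I can come up with of what this looks like in practice: a hand-written future polled without a runtime. It's a sketch using only std. Countdown and NoopWaker are names invented for the example. Note that the field access inside poll is only this easy because Countdown happens to be Unpin:

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A tiny hand-written future that becomes ready after `remaining`
/// extra polls.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    // The receiver is `Pin<&mut Self>`, not `&mut self`.
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            return Poll::Ready("done");
        }
        // Mutating a field through the pin only works directly because
        // `Countdown` is `Unpin`; otherwise we would need a projection.
        self.remaining -= 1;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// A waker that does nothing, so we can poll without an async runtime.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a `Countdown` until it finishes and reports how many polls it took.
fn poll_to_completion(remaining: u32) -> (u32, &'static str) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // `pin!` pins the future to this stack frame, so no `Box` is needed.
    let mut fut = pin!(Countdown { remaining });
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(msg) = fut.as_mut().poll(&mut cx) {
            return (polls, msg);
        }
    }
}

fn main() {
    assert_eq!(poll_to_completion(2), (3, "done"));
}
```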

I swear, it took more effort to learn pinning in rust than it took me to learn the entire Go programming language. And I'm still not convinced I'm totally across it. And I'm not alone. I've heard the Fuchsia operating system project abandoned Rust for C++ in some parts because of how impossibly complex Pin makes everything.

Why is Pin needed, anyway?

We can write rust functions like this:

fn main() {  
    let x = vec![1,2,3];
    let y = &x;

    //drop(x); // error[E0505]: cannot move out of `x` because it is borrowed
    dbg!(y);
}

All variables in a rust function are actually, secretly in one of 3 different states:

  • Normal (owned)
  • Borrowed
  • Mutably borrowed

While a variable is borrowed (y = &x), you can't move, mutate or drop the variable. In this example, x is put into a special "borrowed" state throughout the lifetime of y. Variables in the "borrowed" state are pinned, immutable, and have a bunch of other constraints. This "borrowed state" is visible to the compiler, but it's completely invisible to the programmer. You can't tell that something is borrowed until you try to compile your program. (Aside: I wish Rust IDEs made this state visible while programming!)

But at least this program works.

Unfortunately, there's no equivalent to this for structs. Let's turn the function async:

async fn foo() {  
    let x = vec![1,2,3];
    let y = &x;

    some_future().await;

    dbg!(y);
}

When you compile this, the compiler creates a hidden struct for you, which stores the suspended state of this function. It looks something like this:

struct FooFuture {  
  x: Vec<usize>,
  y: &'_ Vec<usize>,
}

impl Future for FooFuture { ... }  

x is borrowed by y. So it needs to be placed under all the constraints of a borrowed variable:

  • It must not move in memory. (It needs to be Pinned)
  • It must be immutable
  • We can't take mutable references to x (because of the & xor &mut rule).
  • x must outlive y.

But there's no syntax for this. Rust doesn't have syntax to mark a struct field as being in a borrowed state. And we can't express the lifetime of y.

Remember: the rust compiler already generates and uses structs like this whenever you use async functions. The compiler just doesn't provide any way to write code like this ourselves. Let's just extend the borrow checker and fix that!

I don't know what the ideal syntax would be, but I'm sure we can come up with something. For example, maybe y gets declared as a "local borrow", written as y: &'Self::x Vec<usize>. The compiler uses that annotation to figure out that x is borrowed. And it puts it under the same set of constraints as a borrowed variable inside a function.

This would also let you work with self-referential structs, like an Abstract Syntax Tree (AST) in a compiler:

struct Ast {  
  source: String,
  ast_nodes: Vec<&'Self::source str>,
}
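For contrast, the usual workaround in stable rust is to give up on references and store byte ranges into the source string instead. Here's a minimal sketch, assuming a toy "parser" where every whitespace-separated word is a node (the parse and node methods are invented for the example):

```rust
use std::ops::Range;

struct Ast {
    source: String,
    // Byte ranges into `source`, standing in for the `&str` borrows
    // we can't express.
    ast_nodes: Vec<Range<usize>>,
}

impl Ast {
    /// Toy "parser": every whitespace-separated word becomes a node.
    fn parse(source: String) -> Ast {
        let mut ast_nodes = Vec::new();
        let mut start = None;
        for (i, ch) in source.char_indices() {
            match (ch.is_whitespace(), start) {
                (false, None) => start = Some(i),
                (true, Some(s)) => {
                    ast_nodes.push(s..i);
                    start = None;
                }
                _ => {}
            }
        }
        if let Some(s) = start {
            ast_nodes.push(s..source.len());
        }
        Ast { source, ast_nodes }
    }

    /// Re-borrows the node's text from `source` on every access.
    fn node(&self, index: usize) -> &str {
        &self.source[self.ast_nodes[index].clone()]
    }
}

fn main() {
    let ast = Ast::parse("let x = 1".to_string());
    assert_eq!(ast.node(0), "let");
    // The struct can be moved freely, because it holds no references.
    let moved = ast;
    assert_eq!(moved.node(3), "1");
}
```

It works, but every access pays for a bounds check and the compiler no longer knows that the ranges belong to this particular string.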

This syntax could also be adapted to support partial borrows:

impl Foo {  
  fn get_some_field<'a>(&'a self) -> &'a::some_field usize {
    &self.some_field
  }
}
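Today, a getter like that borrows all of self, so the caller can't touch any other field while the returned reference is alive. The common workaround is a method that hands out the disjoint borrows together. A rough sketch, with a made-up log field and split method:

```rust
struct Foo {
    some_field: usize,
    log: Vec<String>,
}

impl Foo {
    // Borrows the whole of `self`, not just `some_field`.
    fn get_some_field(&self) -> &usize {
        &self.some_field
    }

    /// Workaround: return borrows of two different fields from one call,
    /// so the borrow checker can see they don't overlap.
    fn split(&mut self) -> (&usize, &mut Vec<String>) {
        (&self.some_field, &mut self.log)
    }
}

fn record(foo: &mut Foo) -> usize {
    // Rejected by the borrow checker (E0502), because `field` keeps all
    // of `foo` borrowed while we try to mutate `foo.log`:
    //
    //     let field = foo.get_some_field();
    //     foo.log.push(String::from("saw it"));
    //     return *field;
    //
    let (field, log) = foo.split();
    log.push(format!("saw {}", field));
    *field
}

fn main() {
    let mut foo = Foo { some_field: 7, log: Vec::new() };
    assert_eq!(record(&mut foo), 7);
    assert_eq!(foo.log.len(), 1);
    assert_eq!(*foo.get_some_field(), 7);
}
```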

This isn't a complete solution.

We'd also need a Move marker trait, to replace Pin. Any struct with borrowed fields can't be Moved - so it wouldn't have impl Move. I'd also consider a Mover trait, which would allow structs to intelligently move themselves in memory. Eg:

trait Mover {  
  // Something like that.
  // (`move` is a reserved keyword, so the method needs some other name.)
  unsafe fn move_to(from: *mut Self, to: &mut MaybeUninit<Self>);
}

We'd also need a sane, safe way to construct structs like this in the first place. I'm sure we can do better than MaybeUninit.

Miguel Young de la Sota gave a fantastic talk a few years ago talking about Move in rust. But I think it would be much more "rusty" to lean on the borrow checker instead.

If you ask me, Pin is a dead end solution. Rust already has a borrow checker. Let's use it for structs.

Comptime

This is a hot opinion. I haven't spent a lot of time with Zig, but at least from a distance I adore comptime.

In the rust compiler we essentially implement two languages: Rust and the Rust Macro language. (Well, arguably there's 3 - because proc macros). The Rust programming language is lovely. But the rust macro languages are horrible.

But, if you already know rust, why not just use rust itself instead of sticking another language in there? This is the genius behind Zig's comptime. The compiler gets a little interpreter tacked on that can run parts of your code at compile time. Functions, parameters, if statements and loops can all be marked as compile-time code. Any non-comptime code in your block is emitted into the program itself.

I'm not going to explain the feature in full here. Instead, take in just how gorgeous this makes Zig's std print function.

It's entirely implemented using comptime. So when you write this in zig:

pub fn main() void {  
    print("here is a string: '{s}' here is a number: {}\n", .{ a_string, a_number });
}

print takes the format string as a comptime parameter, and parses it within a comptime loop. Aside from a couple keywords, the function is just regular zig code - familiar to anyone who knows the language. It just gets executed within the compiler. And the result? It emits this beauty:

pub fn print(self: *Writer, arg0: []const u8, arg1: i32) !void {  
    try self.write("here is a string: '");
    try self.printValue(arg0);
    try self.write("' here is a number: ");
    try self.printValue(arg1);
    try self.write("\n");
    try self.flush();
}

Read the full case study for more details.

In comparison, I tried to look up how rust's println!() macro is implemented. But println! calls some secret format_args_nl macro. I assume that macro is hardcoded in the rust compiler itself.

It's not a great look when even the rust compiler authors don't want to use rust's macro language.
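To be fair, stable rust does have a small slice of this already in const fn. Here's a sketch of how far it gets you: counting the {} placeholders in a format string at compile time. The count_placeholders function is my own toy, not anything from std, and it only shows evaluating values. Unlike comptime it can't generate code or inspect types.

```rust
/// Counts `{}` placeholders in a format string. Because this is a
/// `const fn`, the compiler can run it while building the program.
const fn count_placeholders(fmt: &str) -> usize {
    let bytes = fmt.as_bytes();
    let mut count = 0;
    let mut i = 0;
    // A `while` loop, since `for` loops aren't usable in `const fn`
    // on stable at the time of writing.
    while i + 1 < bytes.len() {
        if bytes[i] == b'{' && bytes[i + 1] == b'}' {
            count += 1;
            i += 2;
        } else {
            i += 1;
        }
    }
    count
}

const FORMAT: &str = "here is a string: '{}' here is a number: {}\n";
const PLACEHOLDERS: usize = count_placeholders(FORMAT);

// Checked by the compiler: the build fails if the count is wrong.
const _: () = assert!(PLACEHOLDERS == 2);

fn main() {
    // An array length must be a compile-time constant, so this line only
    // type-checks because the count was computed during compilation.
    let args: [&str; PLACEHOLDERS] = ["abc", "42"];
    assert_eq!(args.len(), 2);
    // The same function is still callable as ordinary runtime code.
    assert_eq!(count_placeholders("{} {} {}"), 3);
}
```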

Weird little fixes

Bonus round time. Here's some other little "nits" I'd love to fix while we're at it:

// Compile error! We can't have nice things.
if let Some(x) = some_var && some_expr { }  

You can sort of work around this problem today as below, but it's awkward to write, hard to read and the semantics are different from how normal if statements work because it lacks short-circuit evaluation.

// check_foo() will run even if some_var is None.
if let (Some(x), true) = (some_var, check_foo()) { ... }  

Full example here.
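If you do need the short-circuit behaviour, the two options I know of are nesting the if statements or using a match guard. The sketch below counts how often a hypothetical check_foo runs in each version, to show the difference:

```rust
use std::cell::Cell;

/// Hypothetical check that counts how often it runs.
fn check_foo(calls: &Cell<u32>) -> bool {
    calls.set(calls.get() + 1);
    true
}

// Tuple workaround: check_foo() runs even when some_var is None.
fn tuple_version(some_var: Option<i32>, calls: &Cell<u32>) -> bool {
    if let (Some(_x), true) = (some_var, check_foo(calls)) {
        true
    } else {
        false
    }
}

// Nested `if`: check_foo() only runs when the pattern matched.
fn nested_version(some_var: Option<i32>, calls: &Cell<u32>) -> bool {
    if let Some(_x) = some_var {
        if check_foo(calls) {
            return true;
        }
    }
    false
}

// Match guard: also short-circuits, at the cost of a `match`.
fn guard_version(some_var: Option<i32>, calls: &Cell<u32>) -> bool {
    match some_var {
        Some(_x) if check_foo(calls) => true,
        _ => false,
    }
}

fn main() {
    let calls = Cell::new(0);
    assert!(!tuple_version(None, &calls));
    assert_eq!(calls.get(), 1); // ran anyway

    calls.set(0);
    assert!(!nested_version(None, &calls));
    assert!(!guard_version(None, &calls));
    assert_eq!(calls.get(), 0); // never ran
}
```

Both of these work, but neither reads as nicely as the one-liner would.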

Rust's ergonomics for raw pointers are also uniquely horrible. When I work with unsafe code, my code should be as easy to read & write as humanly possible. But the rust compiler seems intent on punishing me for my sins. For example, if I have a reference to a struct in rust, I can write myref.x. But if I have a pointer, rust insists that I write (*myptr).x or, worse: (*(*myptr).p).y. Horrible. Horrible and entirely counterproductive. Unsafe code should be clear.
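Here's the same field access written both ways, as a small sketch (the Outer / Inner types are invented for the example). The reference version gets auto-dereferencing; the pointer version needs every hop spelled out:

```rust
struct Inner {
    y: i32,
}

// Pointer-based version.
struct Outer {
    x: i32,
    p: *const Inner,
}

// Reference-based version of the same shape.
struct OuterRef<'a> {
    x: i32,
    p: &'a Inner,
}

/// # Safety
/// `myptr` and `(*myptr).p` must both point to valid, live values.
unsafe fn read_both(myptr: *const Outer) -> (i32, i32) {
    // Every hop through a raw pointer needs an explicit `*` and parens.
    unsafe { ((*myptr).x, (*(*myptr).p).y) }
}

fn read_both_refs(myref: &OuterRef<'_>) -> (i32, i32) {
    // References auto-dereference, so plain field access works.
    (myref.x, myref.p.y)
}

fn main() {
    let inner = Inner { y: 2 };

    let outer = Outer { x: 1, p: &inner };
    // Safety: both pointers come from references that are still alive.
    let via_ptr = unsafe { read_both(&outer) };

    let outer_ref = OuterRef { x: 1, p: &inner };
    let via_ref = read_both_refs(&outer_ref);

    assert_eq!(via_ptr, via_ref);
}
```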

I'd also change all the built in collection types to take an Allocator as a constructor argument. I personally don't like Rust's decision to use a global allocator. Explicit is better than implicit.

Closing thoughts

That's all the ideas I have. I mean, async needs some love too. But there's so much to say on the topic that async deserves a post of its own.

Unfortunately, most of these changes would be incompatible with existing rust. Even adding security capabilities would require a new rust edition, since it introduces a new way that crates can break semver compatibility.

A few years ago I would have considered writing RFCs for all of these proposals. But I like programming more than I like dying slowly in the endless pit of github RFC comments. I don't want months of work to result in yet another idea in rust's landfill of unrealised dreams.

Maybe I should fork the compiler and do it myself. Urgh. So many projects. If I could live a million lifetimes, I'd devote one to working on compilers.