<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Seph]]></title><description><![CDATA[Seph]]></description><link>https://josephg.com/blog/</link><generator>Ghost 0.11</generator><lastBuildDate>Mon, 01 Sep 2025 21:03:32 GMT</lastBuildDate><atom:link href="https://josephg.com/blog/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Rewriting Rust]]></title><description><![CDATA[<p>The Rust programming language feels like a first generation product.</p>

<p>You know what I mean. Like the first iPhone - <a href="https://www.youtube.com/watch?v=MnrJzXM7a6o">which was amazing by the way</a>. They made an entire operating system around multitouch. A smart phone with no keyboard. And a working web browser. Within a few months, we</p>]]></description><link>https://josephg.com/blog/rewriting-rust/</link><guid isPermaLink="false">7f47c10c-b29d-4c66-9989-1b51a72193d1</guid><category><![CDATA[rust]]></category><category><![CDATA[programming]]></category><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Thu, 26 Sep 2024 05:05:19 GMT</pubDate><content:encoded><![CDATA[<p>The Rust programming language feels like a first generation product.</p>

<p>You know what I mean. Like the first iPhone - <a href="https://www.youtube.com/watch?v=MnrJzXM7a6o">which was amazing by the way</a>. They made an entire operating system around multitouch. A smartphone with no keyboard. And a working web browser. Within a few months, we all realised what the iPhone really wanted to be. Only, the first-generation iPhone wasn't quite there. It didn't have 3G internet. There was no GPS chip. And there was no app store. In the next few years, iPhones would get a lot better.</p>

<p>Rust feels a bit like that first iPhone.</p>

<p>I fell in love with Rust at the start. Algebraic types? Memory safety without compromising on performance? A modern package manager? Count me in. But now that I've been programming in rust for 4 years or so, it just feels like it's never quite there.</p>

<p>And I don't know if it will ever be there. Progress on the language has slowed <em>so much</em>. When I first started using it, every release seemed to add new, great features in stable rust. Now? Crickets. The <a href="https://doc.rust-lang.org/unstable-book/the-unstable-book.html">rust "unstable book"</a> lists <em>700</em> different unstable features - which presumably are all implemented, but which have yet to be enabled in stable rust. Most of them are changes to the standard library - but seriously. Holy cow.</p>

<p>How much of this stuff will <em>ever</em> make it into the language proper? The rust RFC process is a graveyard of good ideas.</p>

<p>Features like <a href="https://doc.rust-lang.org/unstable-book/language-features/coroutines.html">Coroutines</a>. This RFC is 7 years old now. Make no mistake - coroutines are implemented in the compiler. They're just, not available for us "stable rust" peasants to use. If coroutines were a child, they would be in grade school by now. At this point, the coroutines RFC has lasted longer than World War 1 or 2.</p>

<p>I suspect rust is calcifying because its consensus process just doesn't scale. Early on, rust had a small group of contributors who just <em>decided</em> things. The monsters. Now, there are issue threads like <a href="https://github.com/rust-lang/rust/issues/93740#issuecomment-1041391284">this</a>, in which 25 smart, well-meaning people spent 2 years and over 200 comments trying to figure out how to improve <code>Mutex</code>. And as far as I can tell, in the end they more or less gave up.</p>

<p>Maybe this is by design. Good languages are stable languages. It might be time to think of rust as a fully baked language - warts and all. Python 2.7 for life.</p>

<p>But that doesn't change anything for me. I want a better rust, and I feel powerless to make that happen. Where are my coroutines? Even javascript has <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/function*">coroutines</a>.</p>
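<p>For contrast, here's what stable rust makes you write today for even a trivial three-element iterator: a hand-rolled state machine. This is ordinary, compilable rust - it's exactly the boilerplate that three <code>yield</code> statements would replace.</p>

```rust
// The suspended "state" of the iteration has to be stored by hand.
struct Numbers {
    next: usize,
}

impl Iterator for Numbers {
    type Item = usize;

    // Each call resumes from the stored state - exactly what a
    // coroutine's `yield` would generate for us automatically.
    fn next(&mut self) -> Option<usize> {
        if self.next <= 3 {
            let n = self.next;
            self.next += 1;
            Some(n)
        } else {
            None
        }
    }
}

fn main() {
    let nums: Vec<usize> = Numbers { next: 1 }.collect();
    assert_eq!(nums, vec![1, 2, 3]);
}
```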

<h2 id="fantasylanguage">Fantasy language</h2>

<p>Sometimes I lie awake at night fantasising about forking the compiler. I know how I'd do it. In my fork, I'd leave all the rust stuff alone, but make my own "seph" <a href="https://doc.rust-lang.org/edition-guide/editions/">edition</a> of the rust language. Then I could add all sorts of breaking features to that edition. So long as my compiler still compiles mainline rust as well, I could keep using all the wonderful crates on Cargo.</p>

<p>I think about this a lot. If I did it, here's what I'd change:</p>

<h3 id="functiontraitseffects">Function traits (effects)</h3>

<p>Rust has traits on structs. These are used in all sorts of ways. Some are markers. Some are understood by the compiler (like <code>Copy</code>). Some are user defined.</p>
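<p>As a quick refresher on those three kinds (the types here are made up for illustration):</p>

```rust
// A user-defined marker trait: no methods, just a label.
trait Audited {}

// `Copy` is a marker trait the compiler itself understands.
#[derive(Clone, Copy)]
struct Meters(f64);

impl Audited for Meters {}

// Trait bounds let a function require both kinds at once.
fn duplicate<T: Audited + Copy>(x: T) -> (T, T) {
    (x, x) // Legal only because T: Copy.
}

fn main() {
    let (a, b) = duplicate(Meters(1.5));
    assert_eq!(a.0 + b.0, 3.0);
}
```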

<p>Rust should also define a bunch of traits for functions. In other languages, function traits are called "effects".</p>

<p>This sounds weird at first glance - but hear me out. See, there's lots of different "traits" that functions have. Things like:</p>

<ul>
<li>Does the function ever panic?</li>
<li>Does the function have a fixed stack size?</li>
<li>Does the function run to the end, or does it yield / await?</li>
<li>If the function is a coroutine, what is the type of the continuation?</li>
<li>Is the function "pure" (ie, the same input produces the same output, and it has no side effects)?</li>
<li>Does the function (directly or indirectly) run unsafe code in semi-trusted libraries?</li>
<li>Is the function guaranteed to terminate?</li>
</ul>

<p>And so on.</p>

<p>A function's parameters and return type are just associated types on the function:</p>

<pre><code class="language-rust">fn some_iter() -&gt; impl Iterator&lt;Item = usize&gt; {  
    vec![1,2,3].into_iter()
}

fn main() {  
    // Why doesn't this work already via FnOnce?
    let x: some_iter::Output = some_iter();
}
</code></pre>

<p><a href="https://rust-lang.github.io/rfcs/2515-type_alias_impl_trait.html">TAIT</a> eat your heart out.</p>

<p>Exposing these properties is super useful. For example, the linux kernel wants to guarantee (at compile time) that some block of code will never panic. This is impossible to do in rust today. But using function traits, we could explicitly mark a function as being able - or unable - to panic:</p>

<pre><code class="language-rust">#[disallow(Panic)] // Syntax TBD.
fn some_fn() { ... }  
</code></pre>

<p>And if the function does anything which could panic (even recursively), the compiler would emit an error.</p>

<p>The compiler already sort of implements traits on functions, like <code>Fn</code>, <code>FnOnce</code> and <code>FnMut</code>. But for some reason they're anemic. (Why??)</p>

<p>I want something like this:</p>

<pre><code class="language-rust">/// Automatically implemented on all functions.
trait Function {  
  type Args;
  type Output;
  type Continuation; // Unit type () for normal functions
  // ... and so on.

  fn call_once(self, args: Self::Args) -&gt; Self::Output;
}

trait NoPanic {} // Marker trait, implemented automatically by the compiler.

/// Automatically implemented on all functions which don't recurse.
trait KnownStackSize {  
  const STACK_SIZE: usize;
}
</code></pre>

<p>Then you could write code like this:</p>

<pre><code class="language-rust">fn some_iter() -&gt; impl Iterator&lt;Item = usize&gt; {  
  vec![1,2,3].into_iter()
}

struct SomeWrapperStruct {  
  iter: some_iter::Output, // In 2024 this is still impossible in stable rust.
}
</code></pre>

<p>Or with coroutines:</p>

<pre><code>coroutine fn numbers() -&gt; impl Iterator&lt;Item = usize&gt; {  
  yield 1;
  yield 2;
  yield 3;
}

coroutine fn double&lt;I: Iterator&lt;Item=usize&gt;&gt;(inner: I) -&gt; impl Iterator&lt;Item = usize&gt; {  
  for x in inner {
    yield x * 2;
  }
}

struct SomeStruct {  
  // Suppose we want to store the iterator. We can name it directly:
  iterator: double&lt;numbers&gt;::Continuation,
}
</code></pre>

<p>Or, say, take a function parameter but require that the parameter itself doesn't panic:</p>

<pre><code class="language-rust">fn foo&lt;F&gt;(f: F)  
    where F: NoPanic + FnOnce() -&gt; String
{ ... }
</code></pre>

<p>Yoshua Wuyts has an excellent <a href="https://blog.yoshuawuyts.com/extending-rusts-effect-system/">talk &amp; blog post</a> going into way more detail about effects - why they're useful and how this could work.</p>

<h3 id="compiletimecapabilities">Compile-time Capabilities</h3>

<p>Most rust projects pull in an insane number of 3rd party crates. Most of these crates are small utility libraries - like the <a href="https://crates.io/crates/human-size"><code>human-size</code></a> crate which formats file sizes for human consumption. Great stuff! But unfortunately, all of these little crates add supply chain risk. Any of those authors could push out an update which contains malicious code - cryptolockering our computers, our servers or sneaking bad code into our binaries.</p>

<p>I think this problem is similar to the problem of memory safety. Sure - it's sometimes useful to write memory-unsafe code. The rust standard library is full of it. But rust's <code>unsafe</code> keyword lets authors opt in to potentially unsafe things. We only add <code>unsafe</code> blocks when it's necessary.</p>

<p>Let's do the same thing for privileged function calls - like reading and writing to and from the filesystem or the network. This is useful stuff, but it's potentially dangerous. Developers should actively whitelist code that is allowed to call these functions.</p>

<p>To implement this, first we want to add marker traits to all the security-sensitive functions in the standard library (opening a file from a string, <code>exec</code>, FFI, opening network connections, most unsafe functions that interact with raw pointers, and so on). So, for example, <a href="https://doc.rust-lang.org/std/fs/fn.write.html"><code>std::fs::write(path, contents)</code></a> writes to an arbitrary path on disk with the credentials of the user. We add some <code>#[cap(fs_write)]</code> marker tag to the function itself, marking that this can only be called from code which is in some way trusted. The compiler automatically "taints" any other functions which call <code>write</code> in the entire call tree.</p>

<p>Suppose I call a function in a 3rd party crate which needs the <code>fs_write</code> capability. In order to call that function, I need to explicitly whitelist that call. (Either by adding the permission explicitly in my <code>Cargo.toml</code> or maybe with an annotation at the call site).</p>

<p>So, let's say the <code>foo</code> crate contains a function like this. The function will be marked (tainted) with the "writes to filesystem" tag:</p>

<pre><code class="language-rust">// In crate `foo`.

// (this function is implicitly tagged with #[cap(fs_write)])
pub fn do_stuff() {  
  std::fs::write("blah.txt", "some text").unwrap();
}
</code></pre>

<p>When I try to run that function from my code:</p>

<pre><code class="language-rust">fn main() {  
  foo::do_stuff();
}
</code></pre>

<p>The compiler can give me a nice rusty error, like this:</p>

<pre><code>Error: foo::do_stuff() writes to the local filesystem, but the `foo` crate has not been trusted with this capability in Cargo.toml.

Tainted by this line in do_stuff:

  std::fs::write("blah.txt", "some text").unwrap();

Add this to your Cargo.toml to fix:

foo = { version = "1.0.0", allow_capabilities = ["fs_write"] }  
</code></pre>

<p>Obviously, most uses of <code>unsafe</code> would also require explicit whitelisting.</p>

<p>Most crates I use - like <code>human-size</code> or <code>serde</code> - don't need any special capabilities to work. So we don't need to worry so much about their authors "turning evil" and adding malicious code to our software. Reducing the supply chain risk from the 100 or so crates I currently transitively depend on down to just a few would be massive.</p>

<p>This is a very simple, static way that capabilities could be introduced to Rust. But it might be possible &amp; better to change privileged code to require an extra <code>Capability</code> parameter (some unit struct type). And heavily restrict how <code>Capability</code> objects can be instantiated. Eg:</p>

<pre><code class="language-rust">struct FsWriteCapability;

impl FsWriteCapability {  
    fn new() -> Self { Self } // Only callable from the root crate
}

// Then change std::fs::write's signature to this:
pub fn write(path: &amp;Path, contents: &amp;[u8], cap: FsWriteCapability) { ... }  
</code></pre>

<p>This requires more boilerplate, but it's much more flexible. (And obviously, we'd also need to, somehow, apply a similar treatment to <code>build.rs</code> scripts and <code>unsafe</code> blocks.)</p>
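<p>The core trick already works in today's rust. Here's a minimal, runnable sketch (all names hypothetical): a struct with a private field is unforgeable outside its defining module, so holding one is proof you were granted it.</p>

```rust
mod cap {
    // The private () field makes this token unforgeable: code outside
    // this module can only obtain one via `root()`.
    pub struct FsWrite(());

    // In the real design, only the root crate could call this.
    pub fn root() -> FsWrite {
        FsWrite(())
    }
}

// A privileged function demands proof of the capability as an argument.
// It returns the byte count just so the sketch has a checkable result.
fn write_file(path: &str, contents: &str, _cap: &cap::FsWrite) -> usize {
    println!("writing {} bytes to {}", contents.len(), path);
    contents.len()
}

fn main() {
    let fs_cap = cap::root();
    // Without a `cap::FsWrite` in hand, this call won't compile.
    let n = write_file("blah.txt", "some text", &fs_cap);
    assert_eq!(n, 9);
}
```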

<p>The result of all of this is that utility crates become "uncorruptable". Imagine if crates.io is hacked and serde is maliciously updated to include cryptolocker code. Today, that malicious code would be run automatically on millions of developer machines, and compiled into programs everywhere. With this change, you'd just get a compiler error.</p>

<p>This is huge, and singlehandedly this one feature is probably worth the cost of forking rust. At least, to someone. (Anyone want to sponsor this work?)</p>

<h3 id="pinmoveandstructborrows">Pin, Move and Struct Borrows</h3>

<blockquote>
  <p>Feel free to skip this section if Pin &amp; the borrow checker give you a migraine.</p>
</blockquote>

<p><code>Pin</code> in rust is a weird, complicated hack to work around a hole in the borrow checker. It's a band-aid from the land of bizarro choices that only make sense when you need to maintain backwards compatibility at all costs.</p>

<ul>
<li>It's the reverse of the trait you actually want. It would make way more sense to have a <code>Move</code> marker trait (like <code>Copy</code>) indicating objects which <em>can</em> move.</li>
<li>But <code>Pin</code> isn't an actual trait. There's only <code>Unpin</code> (double negative now) and <code>!Unpin</code> - which is not-not-not-<code>Move</code>. For example <a href="https://doc.rust-lang.org/1.81.0/src/core/marker.rs.html#923"><code>impl !Unpin for PhantomPinned</code></a>. Is <code>!Unpin</code> the same as <code>Pin</code>? Uhhhh, ... No? Because .. reasons? I get an instant headache when I think about this stuff. Here's the <a href="https://doc.rust-lang.org/std/marker/trait.Unpin.html">documentation for Unpin</a> if you want to try your luck.</li>
<li>Pin only applies to reference types. If you read through code which uses <code>Pin</code> a lot, you'll find unnecessary <code>Box</code>-ing of values <em>everywhere</em>. For example, <a href="https://docs.rs/tokio-stream/latest/src/tokio_stream/wrappers/broadcast.rs.html#11-18">in tokio</a>, or helper libraries like <a href="https://lib.rs/crates/ouroboros">ouroboros</a>, <a href="https://docs.rs/async-trait/latest/async_trait/">async_trait</a> and <a href="https://docs.rs/self_cell/latest/self_cell/">self_cell</a>.</li>
<li>The pain spreads. Any function that takes a pinned value needs the value wrapped using some horrible abomination <a href="https://doc.rust-lang.org/std/future/trait.Future.html">like <code>Future::poll(self: Pin&lt;&amp;mut Self&gt;, ..)</code></a>. And then you need to figure out how to read the actual values out using projections, which are so complicated there are <a href="https://docs.rs/pin-project/latest/pin_project/">multiple</a> <a href="https://crates.io/crates/pin-project-lite/">crates</a> for dealing with them. The pain cannot be confined. It spreads outwards, forever, corrupting everything.</li>
</ul>

<p>I swear, it took more effort to learn pinning in rust than it took me to learn the entire Go programming language. And I'm still not convinced I'm totally across it. And I'm not alone. I've heard the <a href="https://fuchsia.dev/">Fuchsia operating system project</a> abandoned Rust for C++ in some parts because of how impossibly complex Pin makes everything.</p>

<p>Why is <code>Pin</code> needed, anyway?</p>

<p>We can write rust functions like this:</p>

<pre><code class="language-rust">fn main() {  
    let x = vec![1,2,3];
    let y = &amp;x;

    //drop(x); // error[E0505]: cannot move out of `x` because it is borrowed
    dbg!(y);
}
</code></pre>

<p>All variables in a rust function are actually, secretly in one of 3 different states:</p>

<ul>
<li>Normal (owned)</li>
<li>Borrowed</li>
<li>Mutably borrowed</li>
</ul>

<p>While a variable is borrowed (<code>y = &amp;x</code>), you can't move, mutate or drop the variable. In this example, <code>x</code> is put into a special "borrowed" state throughout the lifetime of <code>y</code>. Variables in the "borrowed" state are pinned, immutable, and have a bunch of other constraints. This "borrowed state" is visible to the compiler, but it's completely invisible to the programmer. You can't tell that something is borrowed until you try to compile your program. (Aside: I wish Rust IDEs made this state visible while programming!)</p>

<p>But at least this program <em>works</em>.</p>
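<p>All three states are easy to see in a single function. This compiles today; uncommenting the marked line shows the compiler enforcing the borrowed state:</p>

```rust
fn main() {
    let mut x = vec![1, 2, 3];

    // Owned: x can be mutated or moved freely.
    x.push(4);

    // Borrowed: while y is alive, x is pinned and immutable.
    let y = &x;
    // x.push(5); // error[E0502]: cannot borrow `x` as mutable
    assert_eq!(y.len(), 4);

    // The borrow ends with y's last use; x is owned again.
    drop(x);
}
```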

<p>Unfortunately, there's no equivalent to this for structs. Let's turn the function <code>async</code>:</p>

<pre><code class="language-rust">async fn foo() {  
    let x = vec![1,2,3];
    let y = &amp;x;

    some_future().await;

    dbg!(y);
}
</code></pre>

<p>When you compile this, the compiler creates a hidden struct for you, which stores the suspended state of this function. It looks something like this:</p>

<pre><code class="language-rust">struct FooFuture {  
  x: Vec&lt;usize&gt;,
  y: &amp;'_ Vec&lt;usize&gt;,
}

impl Future for FooFuture { ... }  
</code></pre>

<p><code>x</code> is borrowed by <code>y</code>. So it needs to be placed under all the constraints of a borrowed variable:</p>

<ul>
<li>It must not move in memory. (It needs to be Pinned)</li>
<li>It must be immutable</li>
<li>We can't take mutable references to <code>x</code> (because of the &amp; xor &amp;mut rule).</li>
<li><code>x</code> must outlive <code>y</code>.</li>
</ul>

<p>But there's no syntax for this. Rust doesn't have syntax to mark a struct field as being in a borrowed state. And we can't express the lifetime of <code>y</code>.</p>

<p>Remember: the rust compiler already generates and uses structs like this whenever you use <code>async</code> functions. The compiler just doesn't provide any way to write code like this ourselves. Let's just extend the borrow checker and fix that!</p>

<p>I don't know what the ideal syntax would be, but I'm sure we can come up with something. For example, maybe <code>y</code> gets declared as a "local borrow", written as <code>y: &amp;'Self::x Vec&lt;usize&gt;</code>. The compiler uses that annotation to figure out that <code>x</code> is borrowed. And it puts it under the same set of constraints as a borrowed variable inside a function.</p>

<p>This would also let you work with self-referential structs, like an <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">Abstract Syntax Tree (AST)</a> in a compiler:</p>

<pre><code class="language-rust">struct Ast {  
  source: String,
  ast_nodes: Vec&lt;&amp;'Self::source str&gt;,
}
</code></pre>

<p>This syntax could also be adapted to support partial borrows:</p>

<pre><code class="language-rust">impl Foo {  
  fn get_some_field&lt;'a&gt;(&amp;'a self) -&gt; &amp;'a::some_field usize {
    &amp;self.some_field
  }
}
</code></pre>

<p>This isn't a complete solution.</p>

<p>We'd also need a <code>Move</code> marker trait, to replace <code>Pin</code>. Any struct with borrowed fields can't be Moved - so it wouldn't have <code>impl Move</code>. I'd also consider a <code>Mover</code> trait, which would allow structs to intelligently move themselves in memory. Eg:</p>

<pre><code class="language-rust">trait Mover {  
  // Something like that.
  unsafe fn move_to(from: *mut Self, to: &amp;mut MaybeUninit&lt;Self&gt;);
}
</code></pre>

<p>We'd also need a sane, safe way to construct structs like this in the first place. I'm sure we can do better than <code>MaybeUninit</code>.</p>

<p>Miguel Young de la Sota <a href="https://www.youtube.com/watch?v=UrDhMWISR3w">gave a fantastic talk a few years ago</a> talking about <code>Move</code> in rust. But I think it would be much more "rusty" to lean on the borrow checker instead.</p>

<p>If you ask me, <code>Pin</code> is a dead end solution. Rust already has a borrow checker. Lets use it for structs.</p>

<h3 id="comptime">Comptime</h3>

<p>This is a hot opinion. I haven't spent a lot of time with zig, but at least from a distance I adore <a href="https://zig.guide/language-basics/comptime/">comptime</a>.</p>

<p>In the rust compiler we essentially implement two languages: Rust and the Rust Macro language. (Well, arguably there's 3 - because proc macros). The Rust programming language is lovely. But the rust macro languages are horrible.</p>

<p>But, if you already know rust, why not just use rust itself instead of sticking another language in there? This is the genius behind Zig's <code>comptime</code>. The compiler gets a little interpreter tacked on that can run parts of your code at compile time. Functions, parameters, if statements and loops can all be marked as compile-time code. Any non-comptime code in your block is emitted into the program itself.</p>

<p>I'm not going to explain the feature in full here. Instead, take in just how <em>gorgeous</em> this makes Zig's <a href="https://ziglang.org/documentation/master/#Case-Study-print-in-Zig">std <code>print</code> function</a>.</p>

<p>It's entirely implemented using comptime. So when you write this in zig:</p>

<pre><code class="language-zig">pub fn main() void {  
    print("here is a string: '{s}' here is a number: {}\n", .{ a_string, a_number });
}
</code></pre>

<p><code>print</code> takes the format string as a comptime parameter, and parses it within a <code>comptime</code> loop. Aside from a couple keywords, the function is just regular zig code - familiar to anyone who knows the language. It just gets executed within the compiler. And the result? It emits this beauty:</p>

<pre><code class="language-zig">pub fn print(self: *Writer, arg0: []const u8, arg1: i32) !void {  
    try self.write("here is a string: '");
    try self.printValue(arg0);
    try self.write("' here is a number: ");
    try self.printValue(arg1);
    try self.write("\n");
    try self.flush();
}
</code></pre>

<p>Read the <a href="https://ziglang.org/documentation/master/#Case-Study-print-in-Zig">full case study</a> for more details.</p>

<p>In comparison, I tried to look up how rust's <code>println!()</code> macro is implemented. But <a href="https://doc.rust-lang.org/src/std/macros.rs.html#138-145">println! calls some secret <code>format_args_nl</code> macro</a>. I assume that macro is hardcoded in the rust compiler itself.</p>

<p>It's not a great look when even the rust compiler authors don't want to use rust's macro language.</p>
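<p>To be fair, the gulf is easy to demonstrate. Even a trivial declarative macro drops you into a second language - the <code>$x:expr</code> patterns and <code>$(...)*</code> repetitions below are macro-language, not rust:</p>

```rust
// A tiny vec-builder macro. The match arms are written in the macro
// pattern language; only the expanded body is ordinary rust.
macro_rules! my_vec {
    ($($x:expr),*) => {{
        let mut v = Vec::new();
        $( v.push($x); )*
        v
    }};
}

fn main() {
    let v: Vec<usize> = my_vec![1, 2, 3];
    assert_eq!(v, vec![1, 2, 3]);
}
```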

<h3 id="weirdlittlefixes">Weird little fixes</h3>

<p>Bonus round time. Here's some other little "nits" I'd love to fix while we're at it:</p>

<ul>
<li><code>impl&lt;T: Copy&gt; Copy for Range&lt;T&gt;</code>. If you know, you know.</li>
<li>Fix <a href="https://github.com/rust-lang/rust/issues/26925">derive with associated types</a>. <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=6fd2c813f411f6eb1abb66a473425c89">Full example here</a>.</li>
<li>Make if-let expressions support logical AND. It's so simple, so obvious, and so useful. This should work:</li>
</ul>

<pre><code class="language-rust">// Compile error! We can't have nice things.
if let Some(x) = some_var &amp;&amp; some_expr { }  
</code></pre>

<p>You can sort of work around this problem today as below, but it's awkward to write, hard to read and the semantics are different from how normal <code>if</code> statements work because it lacks short-circuit evaluation.</p>

<pre><code class="language-rust">// check_foo() will run even if some_var is None.
if let (Some(x), true) = (some_var, check_foo()) { ... }  
</code></pre>

<p><a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=e4c4521e6a0ab49462c0b9d55da97480">Full example here</a>.</p>
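<p>The other workaround - nesting - keeps the short-circuit behaviour, at the cost of rightward drift. (The names here are stand-ins matching the snippets above.)</p>

```rust
fn check_foo() -> bool {
    true
}

fn main() {
    let some_var = Some(5);

    // Nesting preserves short-circuiting: check_foo() only runs
    // if the pattern matched. But every extra condition adds
    // another level of indentation.
    if let Some(x) = some_var {
        if check_foo() {
            println!("x = {}", x);
        }
    }
}
```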

<p>Rust's ergonomics for raw pointers are also uniquely horrible. When I work with unsafe code, my code should be as easy to read &amp; write as humanly possible. But the rust compiler seems intent on punishing me for my sins. For example, if I have a reference to a struct in rust, I can write <code>myref.x</code>. But if I have a pointer, rust insists that I write <code>(*myptr).x</code> or, worse: <code>(*(*myptr).p).y</code>. Horrible. Horrible and entirely counterproductive. Unsafe code should be clear.</p>
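<p>Here's the contrast in runnable form - the reference version reads cleanly, while the pointer version drowns in parens and stars:</p>

```rust
struct Inner { y: i32 }
struct Outer { p: *const Inner, x: i32 }

fn main() {
    let inner = Inner { y: 2 };
    let outer = Outer { p: &inner, x: 1 };

    // Through a reference, field access is clean:
    let myref = &outer;
    assert_eq!(myref.x, 1);

    // Through raw pointers, every hop needs an explicit deref:
    let myptr: *const Outer = &outer;
    unsafe {
        assert_eq!((*myptr).x, 1);
        assert_eq!((*(*myptr).p).y, 2);
    }
}
```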

<p>I'd also change all the built in collection types to take an <code>Allocator</code> as a constructor argument. I personally don't like Rust's decision to use a global allocator. Explicit is better than implicit.</p>

<h2 id="closingthoughts">Closing thoughts</h2>

<p>That's all the ideas I have. I mean, async needs some love too. But there's so much to say on the topic that async deserves a post of its own.</p>

<p>Unfortunately, most of these changes would be incompatible with existing rust. Even adding security capabilities would require a new rust edition, since it introduces a new way that crates can break semver compatibility.</p>

<p>A few years ago I would have considered writing RFCs for all of these proposals. But I like programming more than I like dying slowly in the endless pit of github RFC comments. I don't want months of work to result in yet another idea in <a href="https://doc.rust-lang.org/reference/items/associated-items.html">rust's landfill of unrealised dreams</a>.</p>

<p>Maybe I should fork the compiler and do it myself. Urgh. So many projects. If I could live a million lifetimes, I'd devote one to working on compilers.</p>]]></content:encoded></item><item><title><![CDATA[NodeJS packages don't deserve your trust]]></title><description><![CDATA[<h2 id="amodestproposal">A modest proposal</h2>

<p>Another week, another <a href="https://github.com/Yaffle/EventSource/blob/de137927e13d8afac153d2485152ccec48948a7a/src/eventsource.js#L1047-L1090">npm supply chain attack</a>. Yikes! People on hacker news are <a href="https://news.ycombinator.com/item?id=30963600">wringing their hands</a> about what should be done. The problem seems dire.</p>

<p>Apparently I couldn't help myself. At 2am the other night I woke up, staring at the ceiling. I couldn't stop thinking about</p>]]></description><link>https://josephg.com/blog/node-sandbox/</link><guid isPermaLink="false">7e86b4ac-86e7-4296-a3fd-f47afd04c989</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Mon, 11 Apr 2022 07:30:36 GMT</pubDate><content:encoded><![CDATA[<h2 id="amodestproposal">A modest proposal</h2>

<p>Another week, another <a href="https://github.com/Yaffle/EventSource/blob/de137927e13d8afac153d2485152ccec48948a7a/src/eventsource.js#L1047-L1090">npm supply chain attack</a>. Yikes! People on hacker news are <a href="https://news.ycombinator.com/item?id=30963600">wringing their hands</a> about what should be done. The problem seems dire.</p>

<p>Apparently I couldn't help myself. At 2am the other night I woke up, staring at the ceiling. I couldn't stop thinking about this problem. It seems .. frankly, solvable. But how?</p>

<p>I think I came up with an answer. Or, the sketch of an answer. Is it any good? Will it work? I think it might... You be the judge.</p>

<h3 id="theproblem">The problem</h3>

<p>The fundamental problem with npm is that any package you install has full access to do whatever it wants on your computer. For example, packages can:</p>

<ul>
<li>Read every file on your computer, including your email, passwords, everything.</li>
<li>Edit your files. Delete them. Cryptolocker them</li>
<li>Do anything it wants on the internet</li>
<li>Run child processes, change your OS settings, install key loggers</li>
</ul>

<p>You think you're installing <code>leftpad</code>. But you're actually letting a stranger into your house while you aren't at home. They can do basically whatever they want.</p>

<p>And it's not just your home. We give package authors full access to our servers and our webpages. These systems store something much more precious: Our users' personal data.</p>

<p>Most people are trustworthy. But occasionally people <a href="https://snyk.io/blog/peacenotwar-malicious-npm-node-ipc-package-vulnerability/">decide that if you're in Russia or Belarus, wiping your hard drive is fair play</a>. And if you let literally thousands of unknown people into your house unattended, it's no surprise when someone does something you don't like. Frankly, I'm surprised supply chain attacks don't happen more often.</p>

<p>We can't solve this by figuring out all the baddies and banning them. I learned this as a kid in the 90s playing a video game called <em>Theme Park</em>. Once you played it enough, some park visitors would start vandalizing the park. I remember reading a strategy guide which said "You can't just hire a security guard and put them at the front gate. Security guards can only kick out visitors <em>after</em> they've broken the rules."</p>

<p>We have the same problem. We can't preemptively figure out which developers don't deserve our trust.</p>

<p>Deno <a href="https://deno.land/manual/getting_started/permissions">tries to solve this problem</a>, but I don't think it's good enough. Deno lets you specify at the command line what kinds of actions your program is allowed to perform. You need to explicitly give permission to your deno process to have access to the internet or your database files.</p>

<p>This is a start; but I don't think it's good enough. Just because I'm making a web server, that doesn't mean <code>leftpad</code> should be allowed to access the internet. If I'm making a file server, the <code>leftpad</code> library shouldn't have access to my filesystem. Deno's permission model is a good start, but it just isn't fine-grained enough. (That said, I'd certainly take it over nodejs's current approach.)</p>

<h2 id="capabilitiestotherescue">Capabilities to the rescue</h2>

<p>I think we can solve this problem entirely. But it might require some changes to how nodejs works.</p>

<p>I'm taking inspiration here from an OpenBSD API called <a href="https://man.openbsd.org/pledge.2">pledge</a>. The way pledge works is that, when the program starts but before your program has done anything, you make a set of pledges: "I promise this program will not access any files outside of <code>/some/path</code>, or make any network connections to peers except for <code>example.com</code>." If the program is later compromised, none of the compromised code can do anything nasty.</p>

<p>But I think we can take this a bit further. Here's my idea:</p>

<ol>
<li>We add a new builtin nodejs library called <a href="https://en.wikipedia.org/wiki/Capability-based_security"><code>capabilities</code></a>, which can hand out capability tokens. Capability tokens can only be created by the capabilities library.  </li>
<li>To make any privileged action (access the filesystem, the network, hardware, spawn child processes, load native npm modules, etc!), the caller needs to pass in an appropriate capability token. Most functions in <code>fs</code>, <code>net</code>, <code>child_process</code> and others will need a capability field added. Most of these methods already take an options object, so it shouldn't be too hard to add a capability token there.  </li>
<li>Every capability token has a <em>scope</em>. The scope specifies what the bearer of that token is allowed to do. For example, a capability might give you read/write access to the <code>/var/data</code> directory. The capability library lets you narrow a capability, but capabilities can never be widened. So if a library has a capability for arbitrary network access, it can create a <em>narrowed token</em> which only has network access to your database server. That capability can then be passed to the database client library.  </li>
<li>When your program launches, your main package (and only your main package!) gets access to a wildcard "do anything" capability. You can narrow &amp; pass this capability token to other packages, depending on what you want them to do.</li>
</ol>

<p>So, something like this:</p>

<pre><code class="language-javascript">// server.js
const cap = require('capabilities')  
const express = require('express')

const rootToken = cap.claimRootToken() // More on this below

const httpServerToken = cap.narrow(rootToken, {  
  // Scope of the new token
  net: {kind: 'listen', address: 'localhost'}
})

const app = express()  
app.get('/', (req, res) =&gt; {  
  res.send('Welcome to my lair of funk')
})

app.listen(4321, {  
  // If we don't pass a token, express can't function!
  token: httpServerToken
})
</code></pre>

<p>Express doesn't need to do anything fancy with the capability token. It just passes it to the <code>http</code> library behind the scenes. What's new is all the things that are <em>not</em> in the token. Because the token we passed <em>only</em> allows network access, express is banned from reading your filesystem, opening new network connections, running shell scripts, or doing anything dangerous that we haven't explicitly allowed.</p>

<p>There's lots of things to nut out here, but I've put a simple sketch of what the capability module might look like at the bottom of this post.</p>

<p>Unfortunately, it's not that simple. There are a few other thorny details to figure out too!</p>

<h2 id="whataboutexistingcode">What about existing code?</h2>

<p>We make the entire capability system opt in at the command line level. If you don't pass <code>--strict-capabilities</code>, then nodejs works like it does now, where any script can do anything.</p>

<p>Production web servers should enable this flag, but existing code should keep working.</p>

<h2 id="howwouldyourrootpackagegettherootcapabilitytoken">How would your root package get the root capability token?</h2>

<p>The first idea is something simple like this:</p>

<pre><code class="language-javascript">import * as cap from 'capabilities'

// This method can only be called once
const rootToken = cap.claimRootToken()  
</code></pre>

<p>But the danger of this approach is that attackers can run code before we get the root token. And if they can do that, they can probably get the root access token themselves and do nasty stuff.</p>

<pre><code class="language-javascript">import * as cap from 'capabilities'  
import 'attackers_code'

const rootToken = cap.claimRootToken()  
</code></pre>

<p>Unfortunately, ES modules require all <code>import</code> statements to be at the top of your file, before any code executes. You could work around this restriction by importing a local file first, which immediately claims the root token. But that's super awkward. I don't want a hello world web server to need (at a minimum) 2 source code files.</p>

<p>There might be a way to fix that with some weird ES6 getters, or by some deep V8 wizardry or something:</p>

<pre><code class="language-javascript">import {rootToken} from 'capabilities' // rootToken can only be *imported* once? Is this possible?
import 'attackers_code' // Its too late for you!  
</code></pre>

<p>Or maybe nodejs just passes it in via <code>module.capability</code> / <code>import.meta.capability</code> or something. For example:</p>

<pre><code class="language-javascript">const rootToken = module.claimToken()  
</code></pre>

<p>One way or another, this seems technically solvable.</p>

<h2 id="whataboutpackageswhichnevergetupdated">What about packages which never get updated?</h2>

<p>We probably don't need to solve this for version 1.</p>

<p>But if we did, we might be able to add a method in the capability module to "bless" a package. Eg:</p>

<pre><code class="language-javascript">const httpServerToken = narrow(rootToken, {net: {kind: 'listen'}})  
bless("express@^4.17.3", httpServerToken)  
</code></pre>

<p>Then any direct system call from that library acts as if it had the capability we pass in. (And nothing else).</p>

<p>It's a bit hacky though. I mean, how can you tell if a method call comes from a specific package? That's tricky, but it should be possible. The simplest answer is we could look at the call stack to see if the immediately preceding item is in a blessed package. You can already inspect the call stack via <code>new Error().stack</code>, but that's slow, and probably corruptible from javascript code. I bet we could do something cleaner from native code.</p>
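<p>To make that concrete, here's a sketch of a stack-based check from plain javascript. The function names and frame indices here are mine, purely for illustration - and as noted above, a real implementation would want to do this from native code:</p>

```javascript
// Sketch: find out which function (and file) called a privileged API.
// new Error().stack is slow and corruptible, but the information is there.
function callingFrame() {
  const frames = new Error().stack.split('\n')
  // frames[0] is "Error", frames[1] is callingFrame itself, frames[2] is
  // the privileged API that asked, and frames[3] is *its* caller - the
  // package code we actually want to check.
  return (frames[3] || '').trim()
}

function privilegedAction() {
  // A real check would match the frame's file path against the list of
  // blessed packages, eg anything under node_modules/express/.
  return callingFrame()
}

function someLibraryCode() {
  return privilegedAction()
}
```

Each frame names the calling function and its source file, so in principle a blessing system could match that path against its blessed package list.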

<p>There might also be scope for mischief via callbacks with this approach. Or someone could edit a package's methods.</p>

<h2 id="howcanwepreventjavascriptsdynamismfrommakingthissecuritysystemswisscheese">How can we prevent javascript's dynamism from making this security system swiss cheese?</h2>

<p>This is a real problem.</p>

<p>As an aside, I'm worried that if we wait for a perfectly secure solution before launching a capabilities system, we'll never solve this problem at all. If "mostly secure" is as good as we can get, it still might be better than the current situation. (Though smart people may well disagree with me.)</p>

<p>Javascript is weird, and I'm worried there might be ways to escape this little sandbox. For example:</p>

<ul>
<li>If an attacker knows <code>express</code> is blessed, could they do something like this?</li>
</ul>

<pre><code class="language-javascript">const express = require('express')

const oldListen = express.application.listen  
express.application.listen = (...args) =&gt; {  
  doNastyStuffAsExpress() // Oh no!
  oldListen(...args)
}
</code></pre>

<p>But this wouldn't work, because the new function isn't part of the express package (even if it's called via <code>app.listen()</code>). There may be a way around that. Maybe via an <code>eval()</code> call?</p>

<ul>
<li>Use <code>Object.defineProperty</code> to overwrite some built in methods. Then use that to target code which has a reference to the root token:</li>
</ul>

<pre><code class="language-javascript">Object.defineProperty(String.prototype, "length", {  
  get() {
    eval("console.log('steal the root token')")
  }
})
</code></pre>

<p>This code fails, but I don't know how strong the protections are:</p>

<pre><code>$ node
Welcome to Node.js v16.6.1.  
&gt; Object.defineProperty(String.prototype, "length", { get() { console.log('nasty') } })
Uncaught TypeError: Cannot redefine property: length  
    at Function.defineProperty (&lt;anonymous&gt;)
</code></pre>

<p>I might not be smart enough to figure out a way to pierce this security envelope, but maybe you are? This would be a new kind of security boundary. We need some smart security minds to have a play and see if they can bolt this thing down.</p>

<p>Directly editing the prototype of built in javascript classes like <code>String</code> and <code>Array</code> is considered bad form these days. I'd be happy to ban some of that dynamism entirely if the result is better security. If we have to ban <code>eval</code> in strict capabilities mode, frankly I'd be delighted.</p>

<p>If some packages in npm misbehave with capability based sandboxing enabled, that's fine. We can either fix them or boot them from our production systems. There is no shortage of excellent packages in npm. (If you can find them.)</p>

<h2 id="packageinstallscripts">Package install scripts</h2>

<p>Npm packages are also allowed to run arbitrary shell scripts on your computer when you install them, via <a href="https://docs.npmjs.com/cli/v7/using-npm/scripts#npm-install">lifecycle events in <code>package.json</code></a>. I understand why - but I really wish this feature didn't exist, because there are almost no valid uses for it outside compiling your module. And modules should be compiled <em>before</em> they're published, not after.</p>

<p>There are vanishingly few legitimate uses for <code>npm install</code> scripts - almost no popular npm modules use them. But there's a mountain of malicious ways to abuse them.</p>

<p>Now, <code>npm install</code> already <em>sort of</em> has an answer to this problem - which is its <a href="https://docs.npmjs.com/cli/v8/commands/npm-install#ignore-scripts"><code>--ignore-scripts</code> option</a>. But I bet almost nobody knows about that option, or uses it.</p>

<p>This might be the most controversial (and most difficult to change) recommendation here - I think npm should ignore <code>npm install</code> scripts by default. Or maybe it should prompt the user by default, instead of just running this stuff blindly:</p>

<pre><code>$ npm install isobject
Installed package `isobject` wants to run a script on your computer to function. Blindly trust this package? (Y/n): n  
</code></pre>

<h1 id="closingthoughts">Closing thoughts</h1>

<p>Anyway, that's the core idea. We add capability tokens to nodejs. Packages need a capability token in order to do any privileged action - like spawning child processes, loading native modules, running scripts, or accessing the filesystem or the internet.</p>

<p>We have some problems to work out:</p>

<ul>
<li>How should a security token's scope be expressed?</li>
<li>How can we securely pass the root 'wildcard' token to the main module?</li>
<li>Are there any nasty javascript tricks which would let someone easily dodge this whole system? Are there any ways we might need to lock javascript down some more in strict capability mode?</li>
</ul>

<p>But the javascript ecosystem has plenty of smart people. I think this is a challenge worth taking on. The security of our computers and our users' data depends on it.</p>

<p>(And as an added bonus, it would make it impossible to sneak dirty telemetry and things like that into npm modules.)</p>

<p>Nodejs has a massive, dynamic ecosystem of 3rd party packages. We should be able to depend on arbitrary libraries without giving them the keys to the kingdom. We just need to do some work to make it happen.</p>

<p>And when I say "we", I mean "you". I'm too busy building CRDTs to join this fight. We only get this future if people like you step forward to build it. Are you up for the challenge?</p>

<h1 id="appendixhowdowewritenodescapabilitieslibrary">Appendix: How do we write Node's capabilities library?</h1>

<p>Here is a rough sketch of what nodejs's capabilities library might look like:</p>

<pre><code class="language-javascript">const registry = new Map()

// Special global wildcard token. (Note: Symbol isn't a constructor, so
// it's called without `new` - `new Symbol()` throws a TypeError.)
let rootToken = Symbol()
registry.set(rootToken, {scope: '*'}) // What do scopes look like?

function claimRootToken() {
  if (rootToken === null) { throw Error('...') }
  const token = rootToken
  rootToken = null
  return token
}

function hasScope(token, desiredScope) {
  const scope = registry.get(token)
  // ... And check if desiredScope is a subset of scope.
}

function narrow(parentToken, requestedScope) {
  if (!hasScope(parentToken, requestedScope)) {
    throw new Error("Nice try, evildoer!")
  }

  const narrowedToken = Symbol()
  registry.set(narrowedToken, requestedScope)
  return narrowedToken
}

module.exports = {claimRootToken, hasScope, narrow}
</code></pre>]]></content:encoded></item><item><title><![CDATA[5000x faster CRDTs: An adventure in optimization]]></title><description><![CDATA[<p>A few years ago I was really bothered by an academic paper.</p>

<p>Some researchers in France put together a comparison showing lots of ways you could implement concurrent editing, using various CRDT and OT algorithms. And they benchmarked all of them. (Wow, yess!) Some algorithms worked reasonably well. But others</p>]]></description><link>https://josephg.com/blog/crdts-go-brrr/</link><guid isPermaLink="false">a27e3051-57b2-4fd5-8a96-99346807ca5a</guid><category><![CDATA[programming]]></category><category><![CDATA[crdt]]></category><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sat, 31 Jul 2021 03:23:45 GMT</pubDate><content:encoded><![CDATA[<p>A few years ago I was really bothered by an academic paper.</p>

<p>Some researchers in France put together a comparison showing lots of ways you could implement concurrent editing, using various CRDT and OT algorithms. And they benchmarked all of them. (Wow, yess!) Some algorithms worked reasonably well. But others took upwards of 3 seconds to process simple paste operations from their editing sessions. Yikes!</p>

<p>Which algorithm was that? Well, this is awkward but .. it was mine. I mean, I didn't invent it - but it was the algorithm I was using for ShareJS. The algorithm we used for Google Wave. The algorithm which - hang on - I knew for a fact didn't take 3 seconds to process large paste events. What's going on here?</p>

<p>I took a closer look at the paper. In their implementation when a user pasted a big chunk of text (like 1000 characters), instead of creating 1 operation with 1000 characters, their code split the insert into 1000 individual operations. And each of those operations needed to be processed separately. D'oh - of course it'll be slow if you do that! This isn't a problem with the operational transformation algorithm. This is just a problem with <em>their particular implementation</em>.</p>

<p>The infuriating part was that several people sent me links to the paper and (pointedly) asked me what I think about it. Written up as a Published Science Paper, these speed comparisons seemed like a Fact About The Universe. And not what they really were - implementation details of some java code, written by a probably overstretched researcher. One of a whole bunch of implementations that they needed to code up.</p>

<p>"Nooo! The peer reviewed science isn't right everybody! Please believe me!". But I didn't have a published paper justifying my claims. I had working code but it felt like none of the smart computer science people cared about that. Who was I? I was nobody.</p>

<hr>

<p>When we think about CRDTs and other collaborative editing systems we have a language problem. We describe each system as an "algorithm". Jupiter is an Algorithm. RGA is an Algorithm. But really there are two very separate aspects:</p>

<ol>
<li>The black-box <em>behaviour</em> of concurrent edits. When two clients edit the same region of text at the same time, what happens? Are they merged, and if so in what order? What are the rules?  </li>
<li>The white-box <em>implementation</em> of the system. What programming language are we using? What data structures? How well optimized is the code?</li>
</ol>

<p>If my implementation runs slowly, what does that actually tell us? Maybe it's like tests. A passing test suite <em>suggests</em>, but can never <em>prove</em> that there are no bugs. Likewise a slow implementation suggests, but can never prove that every implementation of the system will be slow. If you wait long enough, somebody will find more bugs. And, maybe, someone out there can design a faster implementation.</p>

<p>I've translated my old text OT code into C, Javascript, Go, Rust and Swift. Each implementation has the same behaviour, and the same algorithm. But the performance is not even close. In javascript my transform function ran about 100 000 times per second. Not bad! But the same function in C does 20M iterations per second. That's 200x faster. Wow!</p>

<p>Were the academics testing the slow version or the fast version of this code? Maybe, without noticing, they had fast versions of some algorithms and slow versions of others. It's impossible to tell from the paper!</p>

<h1 id="makingcrdtsfast">Making CRDTs fast</h1>

<p>So as you may know, I've been getting interested in CRDTs lately. I think they're the <a href="https://josephg.com/blog/crdts-are-the-future/">future of collaborative editing</a>. And maybe the future of all software - but I'm not ready to talk about that yet. Most CRDTs you read about in academic papers are crazy slow, and a decade ago I decided to stop reading academic papers and dismissed them. I assumed CRDTs had some inherent problem. A GUID for every character? Nought but madness comes from those strange lands! But - and this is awkward to admit - I think I've been making the same mistake as those researchers. I was reading papers which described the <em>behaviour</em> of different systems. And I assumed that meant we knew the best way to <em>implement</em> those systems. And wow, I was super wrong.</p>

<p>How wrong? Well. Running <a href="https://github.com/automerge/automerge-perf/">this editing trace</a>, <a href="https://github.com/automerge/automerge/">Automerge</a> (a popular CRDT, written by <a href="https://martin.kleppmann.com/">a popular researcher</a>) takes nearly 5 minutes to run. I have a <a href="https://github.com/josephg/diamond-types">new implementation</a> that can process the same editing trace in 56 milliseconds. That's 0.056 seconds, which is over 5000x faster. It's the largest speed up I've ever gotten from optimization work - and I'm utterly delighted by it.</p>

<p>Let's talk about why automerge is currently slow, and I'll take you through all the steps toward making it super fast.</p>

<p>Wait, no. First we need to start with:</p>

<h2 id="whatisautomerge">What is automerge?</h2>

<p>Automerge is a library to help you do collaborative editing. It's written by Martin Kleppmann, who's a little bit famous from his book and <a href="https://martin.kleppmann.com/2020/07/06/crdt-hard-parts-hydra.html">excellent talks</a>. Automerge is based on an algorithm called RGA, which you can read about in an academic paper if you're into that sort of thing.</p>

<p>Martin explains automerge far better than I will in this talk from 2020:</p>

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/x7drE24geUw?start=1237" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

<p>Automerge (and Yjs and other CRDTs) think of a shared document as a list of characters. Each character in the document gets a unique ID, and whenever you insert a character, you name the item you're inserting after.</p>

<p>Imagine I type "abc" into an empty document. Automerge creates 3 items:</p>

<ul>
<li>Insert <em>'a'</em> id <code>(seph, 0)</code> after <code>ROOT</code>
<ul><li>Insert <em>'b'</em> id <code>(seph, 1)</code> after <code>(seph, 0)</code></li>
<li>Insert <em>'c'</em> id <code>(seph, 2)</code> after <code>(seph, 1)</code></li></ul></li>
</ul>

<p>We can draw this as a tree!</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/automerge1.drawio.svg" alt="tree with abc inserts"></p>

<p>Let's say Mike inserts an 'X' between <em>a</em> and <em>b</em>, so we get "aXbc". Then we have:</p>

<ul>
<li>Insert <em>'a'</em> id <code>(seph, 0)</code> after <code>ROOT</code>
<ul><li>Insert <em>'X'</em> id <code>(mike, 0)</code> after <code>(seph, 0)</code></li>
<li>Insert <em>'b'</em> id <code>(seph, 1)</code> after <code>(seph, 0)</code></li>
<li>Insert <em>'c'</em> id <code>(seph, 2)</code> after <code>(seph, 1)</code></li></ul></li>
</ul>

<p><img src="https://josephg.com/blog/crdts-go-brrr/automerge2.drawio.svg" alt="tree with aXbc"></p>

<p>Note the 'X' and 'b' both share the same parent. This will happen when users type concurrently in the same location in the document. But how do we figure out which character goes first? We could just sort using their agent IDs or something. But argh, if we do that the document could end up as <em>abcX</em>, even though Mike inserted <em>X</em> before the <em>b</em>. That would be really confusing.</p>

<p>Automerge (RGA) solves this with a neat hack. It adds an extra integer to each item called a <em>sequence number</em>. Whenever you insert something, you set the new item's sequence number to be 1 bigger than the biggest sequence number you've ever seen:</p>

<ul>
<li>Insert <em>'a'</em> id <code>(seph, 0)</code> after <code>ROOT</code>, seq: <em>0</em>
<ul><li>Insert <em>'X'</em> id <code>(mike, 0)</code> after <code>(seph, 0)</code>, seq: <em>3</em></li>
<li>Insert <em>'b'</em> id <code>(seph, 1)</code> after <code>(seph, 0)</code>, seq: <em>1</em></li>
<li>Insert <em>'c'</em> id <code>(seph, 2)</code> after <code>(seph, 1)</code>, seq: <em>2</em></li></ul></li>
</ul>

<p>This is the algorithmic version of "Wow I saw a sequence number, and it was <em>this big!</em>" "Yeah? Mine is <em>even bigger!</em>"</p>

<p>The rule is that children are sorted first based on their sequence numbers (bigger sequence number first). If the sequence numbers match, the changes must be concurrent. In that case we can sort them arbitrarily based on their agent IDs. (We do it this way so all peers end up with the same resulting document.)</p>
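<p>Written as a comparator (this is my own sketch, not automerge's actual code), the rule looks like this:</p>

```javascript
// Order siblings: bigger sequence number first. If the sequence numbers
// match, the edits were concurrent, so fall back to comparing agent IDs -
// arbitrary, but deterministic, so every peer picks the same order.
const compareSiblings = (a, b) => {
  if (a.seq !== b.seq) return b.seq - a.seq
  return a.id[0] < b.id[0] ? -1 : 1
}

// The two children of 'a' from the example above:
const siblings = [
  { item: 'b', id: ['seph', 1], seq: 1 },
  { item: 'X', id: ['mike', 0], seq: 3 },
]
siblings.sort(compareSiblings)
// X (seq 3) sorts before b (seq 1), so the document reads "aXbc".
```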

<p>Yjs - which we'll see more of later - implements a CRDT called YATA. YATA is identical to RGA, except that it solves this problem with a slightly different hack. But the difference isn't really important here.</p>

<p>Automerge (RGA)'s <em>behaviour</em> is defined by this algorithm:</p>

<ul>
<li>Build the tree, connecting each item to its parent</li>
<li>When an item has multiple children, sort them by sequence number then by their ID.</li>
<li>The resulting list (or text document) can be made by flattening the tree with a depth-first traversal.</li>
</ul>

<p>So how should you <em>implement</em> automerge? The automerge library does it in the obvious way, which is to store all the data as a tree. (At least I think so - after typing "abc" <a href="https://gist.github.com/josephg/0522c4aec5021cc1dddb60e778828dbe">this is automerge's internal state</a>. Uh, uhm, I have no idea whats going on here. And what are all those Uint8Arrays doing all over the place? Whatever.) The automerge library works by building a tree of items.</p>

<p>For a simple benchmark, I'm going to test automerge using <a href="https://github.com/automerge/automerge-perf/">an editing trace Martin himself made</a>. This is a character by character recording of Martin typing up an academic paper. There aren't any concurrent edits in this trace, but users almost never actually put their cursors at exactly the same place and type anyway, so I'm not too worried about that. I'm also only counting the time taken to apply this trace <em>locally</em>, which isn't ideal but it'll do. Kevin Jahns (Yjs's author) has a much more <a href="https://github.com/dmonad/crdt-benchmarks">extensive benchmarking suite here</a> if you're into that sort of thing. All the benchmarks here are done on my chonky ryzen 5800x workstation, with Nodejs v16.1 and rust 1.52 when that becomes appropriate. (Spoilers!)</p>

<p>The editing trace has 260 000 edits, and the final document size is about 100 000 characters.</p>

<p>As I said above, automerge takes a little under 5 minutes to process this trace. That's just shy of 900 edits per second, which is probably fine. But by the time it's done, automerge is using 880 MB of RAM. Whoa! That's 10kb of RAM <em>per key press</em>. At peak, automerge was using 2.6 GB of RAM!</p>

<p>To get a sense of how much overhead there is, I'll compare this to <a href="https://gist.github.com/josephg/13efc1444660c07870fcbd0b3e917638#file-js_baseline-js">a baseline benchmark</a> where we just splice all the edits directly into a javascript string. This throws away all the information we need to do collaborative editing, but it gives us a sense of how fast javascript is capable of going. It turns out javascript running on V8 is <em>fast</em>:</p>

<table>
<thead><tr><th align="left">Test</th><th align="right">Time taken</th><th align="right">RAM usage</th></tr></thead>
<tbody>
<tr><td><strong>automerge (v1.0.0-preview2)</strong></td><td align="right">291s</td><td align="right">880 MB</td></tr>
<tr><td><em>Plain string edits in JS</em></td><td align="right">0.61s</td><td align="right">0.1 MB</td></tr>
</tbody>
</table>
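<p>The baseline itself is tiny - something like this, where each edit is roughly <code>[position, numDeleted, insertedText]</code> (an approximation of the trace's actual format):</p>

```javascript
// Apply one edit straight to a javascript string: delete `numDeleted`
// characters at `pos`, then insert `content` there.
const applyEdit = (doc, [pos, numDeleted, content = '']) =>
  doc.slice(0, pos) + content + doc.slice(pos + numDeleted)

// Tiny stand-in for the editing trace:
let doc = ''
for (const edit of [[0, 0, 'abc'], [1, 0, 'X'], [2, 1]]) {
  doc = applyEdit(doc, edit)
}
// doc is now 'aXc'
```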

<p>This is a chart showing the time taken to process each operation throughout the test, averaged in groups of 1000 operations. I think those spikes are V8's garbage collector trying to free up memory.</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/am_perf1.svg" alt="automerge performance chart"></p>

<p>In the slowest spike near the end, a single edit took <em>1.8 seconds</em> to process. Oof. In a real application, the whole app (or browser tab) would freeze up for a couple of seconds sometimes while you're in the middle of typing.</p>

<p>The chart is easier to read when we average everything out a bit and zoom the Y axis. We can see the average performance gets gradually (roughly linearly) worse over time.</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/am_perf1_smooth.svg" alt="automerge performance chart smoothed out"></p>

<h2 id="whyisautomergeslowthough">Why is automerge slow though?</h2>

<p>Automerge is slow for a whole slew of reasons:</p>

<ol>
<li>Automerge's core tree based data structure gets big and slow as the document grows.  </li>
<li>Automerge makes heavy use of <a href="https://immutable-js.github.io/">Immutablejs</a>. Immutablejs is a library which gives you clojure-like copy-on-write semantics for javascript objects. This is a cool set of functionality, but the V8 optimizer &amp; GC struggles to optimize code that uses immutablejs. As a result, it increases memory usage and decreases performance.  </li>
<li>Automerge treats each inserted character as a separate item. Remember that paper I talked about earlier, where copy+paste operations are slow? Automerge does that too!</li>
</ol>

<p>Automerge was just never written with performance in mind. Their team is working on a replacement <a href="https://github.com/automerge/automerge-rs/">rust implementation of the algorithm</a> to run through wasm, but at the time of writing it hasn't landed yet. I got the master branch working, but they have some kinks to work out before it's ready. Switching to the automerge-rs backend doesn't make average performance in this test any faster. (Although it does halve memory usage and smooth out performance.)</p>

<hr>

<p>There's an old saying with performance tuning:</p>

<blockquote>
  <p>You can't make the computer faster. You can only make it do less work.</p>
</blockquote>

<p>How do we make the computer do less work here? There's lots of performance wins to be had from going through the code and improving lots of small things. But the automerge team has the right approach. It's always best to start with macro optimizations. Fix the core algorithm and data structures before moving to optimizing individual methods. There's no point optimizing a function when you're about to throw it away in a rewrite.</p>

<p>By far, Automerge's biggest problem is its complex tree based data structure. And we can replace it with something faster.</p>

<h2 id="improvingthedatastructure">Improving the data structure</h2>

<p>Luckily, there's a better way to implement CRDTs, pioneered in <a href="https://github.com/yjs/yjs">Yjs</a>. Yjs is another (competing) opensource CRDT implementation made by Kevin Jahns. It's fast, well documented and well made. If I were going to build software which supports collaborative editing today, I'd use Yjs.</p>

<p>Yjs doesn't need a whole blog post talking about how to make it fast because it's already pretty fast, as we'll see soon. It got there by using a clever, obvious data structure "trick" that I don't think anyone else in the field has noticed. Instead of implementing the CRDT as a tree like automerge does:</p>

<pre><code class="language-javascript">state = {
  children: [
    { item: 'a', id: ['seph', 0], seq: 0, children: [
      { item: 'X', id, seq, children: [] },
      { item: 'b', id, seq, children: [
        { item: 'c', id, seq, children: [] }
      ]}
    ]}
  ]
}
</code></pre>

<p>Yjs just puts all the items in a single flat list:</p>

<pre><code class="language-javascript">state = [  
  { item: 'a', id: ['seph', 0], seq: 0, parent: null },
  { item: 'X', id, seq, parent: ['seph', 0] },
  { item: 'b', id, seq, parent: ['seph', 0] },
  { item: 'c', id, seq, parent: [..] }
]
</code></pre>

<p>That looks simple, but how do you insert a new item into a list? With automerge it's easy:</p>

<ol>
<li>Find the parent item  </li>
<li>Insert the new item into the right location in the parents' list of children</li>
</ol>

<p>But with this list approach it's more complicated:</p>

<ol>
<li>Find the parent item  </li>
<li>Starting right after the parent item, iterate through the list until we find the location where the new item should be inserted (?)  </li>
<li>Insert it there, splicing into the array</li>
</ol>

<p>Essentially, this approach is just a fancy insertion sort. We're implementing a list CRDT with a list. Genius!</p>

<p>This sounds complicated - how do you figure out where the new item should go? But it's complicated in the same way <em>math</em> is complicated. It's hard to understand, but once you understand it, you can implement the whole insert function in about 20 lines of code:</p>

<p>(But don't be alarmed if this looks confusing - we could probably fit everyone on the planet who understands this code today into a small meeting room.)</p>

<pre><code class="language-javascript">const automergeInsert = (doc, newItem) =&gt; {  
  const parentIdx = findItem(doc, newItem.parent) // (1)

  // Scan to find the insert location
  let i
  for (i = parentIdx + 1; i &lt; doc.content.length; i++) {
    let o = doc.content[i]
    if (newItem.seq &gt; o.seq) break // Optimization.
    let oparentIdx = findItem(doc, o.parent)

    // Should we insert here? (Warning: Black magic part)
    if (oparentIdx &lt; parentIdx
      || (oparentIdx === parentIdx
        &amp;&amp; (newItem.seq === o.seq)
        &amp;&amp; newItem.id[0] &lt; o.id[0])
    ) break
  }
  // We've found the position. Insert at position *i*.
  doc.content.splice(i, 0, newItem) // (2)

  // .. And do various bookkeeping.
}
</code></pre>
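<p>Here's that function replaying the "aXbc" example from earlier, as a self-contained sketch. My <code>findItem</code> here is a hypothetical stand-in - it just maps an item's ID to its array index:</p>

```javascript
// Hypothetical helper: find the array index of the item with this ID.
const findItem = (doc, id) =>
  id === null ? -1 : doc.content.findIndex(o => o.id[0] === id[0] && o.id[1] === id[1])

const insert = (doc, newItem) => {
  const parentIdx = findItem(doc, newItem.parent)
  let i
  for (i = parentIdx + 1; i < doc.content.length; i++) {
    const o = doc.content[i]
    if (newItem.seq > o.seq) break // Optimization.
    const oparentIdx = findItem(doc, o.parent)
    if (oparentIdx < parentIdx
      || (oparentIdx === parentIdx
        && newItem.seq === o.seq
        && newItem.id[0] < o.id[0])
    ) break
  }
  doc.content.splice(i, 0, newItem)
}

// Seph typed "abc", then Mike concurrently inserted 'X' after 'a':
const doc = { content: [
  { item: 'a', id: ['seph', 0], seq: 0, parent: null },
  { item: 'b', id: ['seph', 1], seq: 1, parent: ['seph', 0] },
  { item: 'c', id: ['seph', 2], seq: 2, parent: ['seph', 1] },
]}
insert(doc, { item: 'X', id: ['mike', 0], seq: 3, parent: ['seph', 0] })
// doc.content now spells "aXbc"
```

Mike's 'X' has a bigger sequence number than 'b', so the scan stops immediately after the parent and the item lands in the right spot.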

<p>I implemented both Yjs's CRDT (YATA) and Automerge using this approach in my experimental <a href="https://github.com/josephg/reference-crdts/blob/main/crdts.ts"><em>reference-crdts</em></a> codebase. <a href="https://github.com/josephg/reference-crdts/blob/fed747255df9d457e11f36575de555b39f07e909/crdts.ts#L401-L459">Here's the insert function, with a few more comments</a>. The Yjs version of this function is in the same file, if you want to have a look. Despite coming from very different papers, the insertion logic is almost identical. And even though my code is very different, this approach is <em>semantically</em> identical to the actual automerge, Yjs and sync9 codebases. (<a href="https://github.com/josephg/reference-crdts/blob/main/reference_test.ts">Fuzzer verified (TM)</a>).</p>

<p>If you're interested in going deeper on this, I gave <a href="https://invisiblecollege.s3-us-west-1.amazonaws.com/braid-meeting-10.mp4#t=300">a talk about this approach</a> at a <a href="https://braid.org/">braid</a> meeting a few weeks ago.</p>

<p>The important point is this approach is better:</p>

<ol>
<li>We can use a flat array to store everything, rather than an unbalanced tree. This makes everything smaller and faster for the computer to process.  </li>
<li>The code is really simple. Being faster <em>and</em> simpler moves the <a href="https://en.wikipedia.org/wiki/Pareto_efficiency">Pareto efficiency frontier</a>. Ideas which do this are rare and truly golden.  </li>
<li>You can implement lots of CRDTs like this. Yjs, Automerge, Sync9 and others work. You can implement many list CRDTs in the same codebase. In my reference-crdts codebase I have an implementation of both RGA (automerge) and YATA (Yjs). They share most of their code (everything except this one function) and their performance in this test is identical.</li>
</ol>

<p>Theoretically this algorithm can slow down when there are concurrent inserts in the same location in the document. But that's really rare in practice - you almost always just insert right after the parent item.</p>

<p>Using this approach, my implementation of automerge's algorithm is about 10x faster than the real automerge. And it's 30x more memory-efficient:</p>

<p>| Test                              | Time taken | RAM usage |
|:--------------------------        | ----------:| ---------:|
| automerge (v1.0.0-preview2)       |  291s      | 880 MB    |
| <strong>reference-crdts (automerge / Yjs)</strong> |   31s      |  28 MB    |
| <em>Plain string edits in JS</em>        | 0.61s      | 0.1 MB    |</p>

<p>I wish I could attribute <em>all</em> of that difference to this sweet and simple data structure. But a lot of the difference here is probably just immutablejs gumming automerge up.</p>

<p>It's a lot faster than automerge:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/ref_vs_am_perf.svg" alt="Automerge is much slower than reference-crdts"></p>

<h2 id="deathby1000scans">Death by 1000 scans</h2>

<p>We're using a clean and fast core data abstraction now, but the implementation is still not <em>fast</em>. There are two big performance bottlenecks in this codebase we need to fix:</p>

<ol>
<li>Finding the location to insert, and  </li>
<li>Actually inserting into the array</li>
</ol>

<p>(These lines are marked <em>(1)</em> and <em>(2)</em> in the code listing above).</p>

<p>To understand why this code is necessary, let's say we have a document, which is a list of items:</p>

<pre><code class="language-javascript">state = [  
  { item: 'a', isDeleted: false, id: ['seph', 0], seq, parent: null },
  { item: 'X', isDeleted: false, id, seq, parent: ['seph', 0] },
  { item: 'b', isDeleted: true,  id, seq, parent: ['seph', 0] },
  { item: 'c', isDeleted: false, id, seq, parent: ['seph', 1] },
  ...
]
</code></pre>

<p>And some of those items might have been deleted. I've added an <code>isDeleted</code> flag to mark which ones. (Unfortunately we can't just remove them from the array, because other inserts might depend on them. Drat! But that's a problem for another day.)</p>

<p>Imagine the document has 150 000 array items in it, representing 100 000 characters which haven't been deleted. If the user types an 'a' in the middle of the document (at <em>document position</em> 50 000), what index does that correspond to in our array? To find out, we need to scan through the document (skipping deleted items) to figure out the right array location.</p>

<p>So if the user inserts at position 50 000, we'll probably have to linearly scan past 75 000 items or something to find the insert position. Yikes!</p>
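<p>Concretely, that scan looks something like this. (A toy sketch - the function name and shape here are mine, not from the reference-crdts code):</p>

```javascript
// Hypothetical sketch: map a *document* position (which counts only visible
// characters) to an index in the items array, by walking past deleted items.
function findIndexAtDocPos(items, targetPos) {
  let seen = 0 // Number of not-deleted items we've walked past so far.
  for (let i = 0; i < items.length; i++) {
    if (seen === targetPos) return i
    if (!items[i].isDeleted) seen++
  }
  if (seen === targetPos) return items.length // Insert at the end.
  throw new Error('Position is past the end of the document')
}
```

<p>This is the <em>O(n)</em> scan we need to get rid of.</p>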

<p>And then when we actually insert, the code does this, which is double yikes:</p>

<pre><code class="language-javascript">doc.content.splice(destIdx, 0, newItem)  
</code></pre>

<p>If the array currently has 150 000 items, javascript will need to move every single item <em>after</em> the new item one space forward in the array. This part happens in native code, but it's still probably slow when we're moving so many items. (Aside: V8 is actually suspiciously fast at this part, so maybe v8 isn't using an array internally to implement Arrays? Who knows!)</p>

<p>But in general, inserting an item into a document with <em>n</em> items will take about <em>n</em> steps. Wait, no - it's worse than that because deleted items stick around. Inserting into a document where there have <em>ever been</em> <em>n</em> items will take <em>n</em> steps. This algorithm is reasonably fast, but it gets slower with every keystroke. Inserting <em>n</em> characters will take <em>O(n^2)</em>.</p>

<p>You can see this if we zoom in on the diagram above. There's a lot going on here because Martin's editing position bounced around the document. But there's a strong linear trend up and to the right, which is what we would expect when inserts take <em>O(n)</em> time:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/ref_perf3.svg" alt="reference crdts implementation zoomed in"></p>

<p>And why this shape in particular? And why does performance get better near the end? If we simply graph <em>where</em> each edit happened throughout the editing trace, with the same bucketing and smoothing, the result is a very familiar curve:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/inspos.svg" alt="Edit position throughout document"></p>

<p>It looks like the time spent applying changes is dominated by the time it takes to scan through the document's array.</p>

<h2 id="changingthedatastructure">Changing the data structure</h2>

<p>Can we fix this? Yes we can! And by "we", I mean Kevin fixed these problems in Yjs. How did he manage that?</p>

<p>So remember, there are two problems to fix:</p>

<ol>
<li>How do we find a specific insert position?  </li>
<li>How do we efficiently insert content at that location?</li>
</ol>

<p>Kevin solved the first problem by thinking about how humans actually edit text documents. Usually while we're typing, we don't actually bounce around a document very much. Rather than scanning the document each time an edit happens, Yjs caches the last <em>(index, position)</em> pair where the user made an edit. The next edit will probably be pretty close to the previous edit, so Kevin just scans forwards or backwards from the last editing position. This sounds a little bit dodgy to me - I mean, that's a big assumption to make! What if edits happen randomly?! But people don't actually edit documents randomly, so it works great in practice.</p>

<p>(What if two users are editing different parts of a document at the same time? Yjs actually stores a whole set of cached locations, so there's almost always a cached cursor location near each user no matter where they're making changes in the document.)</p>

<p>Once Yjs finds the target insert location, it needs to insert efficiently, without copying all the existing items. Yjs solves that by using a bidirectional linked list instead of an array. So long as we have an insert position, linked lists allow inserts in constant time.</p>
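<p>Sketched in javascript, the two ideas together look something like this. (My own toy version - Yjs's real cursor handling is more involved, and deals with deleted items and cursor invalidation):</p>

```javascript
// Hypothetical sketch: a doubly linked list of items, plus a cached cursor.
function makeNode(value) { return { value, prev: null, next: null } }

// (1) Find a position by scanning from the cached cursor, not from the start
// of the document. Edits cluster together, so this is usually a few steps.
function seek(cursor, targetPos) {
  while (cursor.pos < targetPos) { cursor.node = cursor.node.next; cursor.pos++ }
  while (cursor.pos > targetPos) { cursor.node = cursor.node.prev; cursor.pos-- }
  return cursor.node
}

// (2) Splice a new node in after `node` in constant time - no array copying.
function insertAfter(node, newNode) {
  newNode.prev = node
  newNode.next = node.next
  if (node.next) node.next.prev = newNode
  node.next = newNode
}
```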

<p>Yjs does one more thing to improve performance. Humans usually type in runs of characters. So when we type "hello" in a document, instead of storing:</p>

<pre><code class="language-javascript">state = [  
  { item: 'h', isDeleted: false, id: ['seph', 0], seq, parent: null },
  { item: 'e', isDeleted: false, id: ['seph', 1], seq, parent: ['seph', 0] },
  { item: 'l', isDeleted: false, id: ['seph', 2], seq, parent: ['seph', 1] },
  { item: 'l', isDeleted: false, id: ['seph', 3], seq, parent: ['seph', 2] },
  { item: 'o', isDeleted: false, id: ['seph', 4], seq, parent: ['seph', 3] },
]
</code></pre>

<p>Yjs just stores:</p>

<pre><code class="language-javascript">state = [  
  { item: 'hello', isDeleted: false, id: ['seph', 0], seq, parent: null },
]
</code></pre>

<p>Finally those pesky paste events will be fast too!</p>

<p>This is the same information, just stored more compactly. Unfortunately we can't collapse the whole document into a single item or something like that using this trick. The algorithm can only collapse inserts when the IDs and parents line up sequentially - but that happens whenever a user types a run of characters without moving their cursor. And that happens a lot.</p>

<p>In this data set, using spans reduces the number of array entries by 14x. (180k entries down to 12k).</p>
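<p>The "line up sequentially" check is simple. It's something like this (a sketch using the field layout from the examples above - real Yjs stores things differently):</p>

```javascript
// Hypothetical sketch: can a new item be appended onto the span before it?
// Only when the same author typed the next ID in sequence, directly after
// the last character of the previous span, and neither part is deleted.
function canAppend(span, item) {
  const [agent, seq] = span.id
  return !span.isDeleted && !item.isDeleted
    && item.id[0] === agent
    && item.id[1] === seq + span.item.length
    && item.parent != null
    && item.parent[0] === agent
    && item.parent[1] === seq + span.item.length - 1
}

function append(span, item) { span.item += item.item }
```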

<p>How fast is it now? This blows me away - Yjs is 30x faster than my reference-crdts implementation in this test. And it only uses about 10% as much RAM. It's <em>300x faster than automerge</em>!</p>

<p>| Test                              | Time taken | RAM usage |
|:--------------------------        | ----------:| ---------:|
| automerge (v1.0.0-preview2)       |  291s      | 880 MB    |
| reference-crdts (automerge / Yjs) |   31s      |  28 MB    |
| <strong>Yjs (v13.5.5)</strong>                 | 0.97s      | 3.3 MB    |
| <em>Plain string edits in JS</em>        | 0.61s      | 0.1 MB    |</p>

<p>Honestly I'm shocked and a little suspicious of how little RAM Yjs uses in this test. I'm sure there's some wizardry in V8 making this possible. It's extremely impressive.</p>

<p>Kevin says he wrote and rewrote parts of Yjs 12 times in order to make this code run so fast. If there was a programmer version of the speedrunning community, they would adore Kevin. I can't even put Yjs on the same scale as the other algorithms because it's so fast:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/yjs_perf4.svg" alt="Yjs performance vs other algorithms"></p>

<p>If we isolate Yjs, you can see it has <em>mostly</em> flat performance. Unlike the other algorithms, it doesn't get slower over time, as the document grows:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/yjs_perf5.svg" alt="Yjs performance isolated"></p>

<p>But I have no idea what those spikes are near the end. They're pretty small in <em>absolute</em> terms, but it's still weird! Maybe they happen when the user moves their cursor around the document? Or when the user deletes chunks? I have no idea.</p>

<p>This is neat, but the real question is: Can we go <em>even faster</em>? Honestly I doubt I can make pure javascript run this test any faster than Kevin managed here. But maybe... just maybe we can be...</p>

<h2 id="fasterthanjavascript">Faster than Javascript</h2>

<p>When I told Kevin that I thought I could make a CRDT implementation that's way faster than Yjs, he didn't believe me. He said Yjs was already so well optimized, going a lot faster probably wasn't possible. "Maybe a little faster if you just port it to Rust. But not a lot faster! V8 is really fast these days!!"</p>

<p>But I knew something Kevin didn't know: I knew about memory fragmentation and cache coherency. Rust isn't just <em>faster</em>. It's also a lower level language, and that gives us the tools we need to control allocations and memory layout.</p>

<blockquote>
  <p>Kevin knows this now too, and he's working on <a href="https://github.com/yjs/y-crdt">Yrs</a> to see if he can claim the performance crown back.</p>
</blockquote>

<p>Imagine one of our document items in javascript:</p>

<pre><code class="language-javascript">var item = {  
  content: 'hello',
  isDeleted: false,
  id: ['seph', 10],
  seq: 5,
  parent: ['mike', 2]
}
</code></pre>

<p>This object is actually a mess like this in memory:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/mem-frag.drawio.svg" alt="javascript objects fragmented in memory"></p>

<p>Bad news: <em>Your computer hates this.</em></p>

<p>This is terrible because all the data is fragmented. It's all separated by pointers.</p>

<blockquote>
<p>And yes, I know, V8 tries its hardest to prevent this sort of thing when it can. But it's not magic.</p>
</blockquote>

<p>To arrange data like this, the computer has to allocate memory one by one for each item. This is slow. Then the garbage collector needs extra data to track all of those objects, which is also slow. Later we'll need to read that data. To read it, your computer will often need to go fetch it from main memory, which - you guessed it - is slow as well.</p>

<p>How slow are main memory reads? <a href="https://gist.github.com/hellerbarde/2843375">At human scale</a> each L1 cache read takes 0.5 seconds. And a read from main memory takes close to 2 minutes! This is the difference between a single heartbeat, and the time it takes to brush your teeth.</p>

<p>Arranging memory like javascript does would be like writing a shopping list. But instead of "Cheese, Milk, Bread", your list is actually a scavenger hunt: "Under the couch", "On top of the fridge", and so on. Under the couch is a little note mentioning you need toothpaste. Needless to say, this makes doing the grocery shopping a lot of work.</p>

<p>To go faster, we need to squish all the data together so the computer can fetch more information with each read of main memory. (We want a single read of my grocery list to tell us everything we need to know). Linked lists are rarely used in the real world for exactly this reason - <em>memory fragmentation ruins performance</em>. I also want to move away from linked lists because the user <em>does</em> sometimes hop around the document, which in Yjs has a linear performance cost. That's probably not a big deal in text editing, but I want this code to be fast in other use cases too. I don't want the program to <em>ever</em> need those slow scans.</p>

<p>We can't fix this in javascript. The problem with fancy data structures in javascript is that you end up needing a lot of exotic objects (like fixed size arrays). All those extra objects make fragmentation worse, so as a result of all your work, your programs often end up running slower anyway. This is the same limitation immutablejs has, and why its performance hasn't improved much in the decade since it was released. The V8 optimizer is very clever, but it's not magic and clever tricks only get us so far.</p>

<p>But we're not limited to javascript. Even when making webpages, we have WebAssembly these days. We can code this up in <em>anything</em>.</p>

<p>To see how fast we can <em>really</em> go, I've been quietly building a CRDT implementation in rust called <a href="https://github.com/josephg/diamond-types">Diamond types</a>. Diamond is almost identical to Yjs, but it uses a <a href="https://en.wikipedia.org/wiki/Range_tree">range tree</a> instead of a linked list internally to store all of the items.</p>

<p>Under the hood, my range tree is just a slightly modified b-tree. But usually when people talk about b-trees they mean a <a href="https://doc.rust-lang.org/std/collections/struct.BTreeMap.html">BTreeMap</a>. That's not what I'm doing here. Instead of storing keys, each internal node of the b-tree stores the total number of characters (recursively) in that node's children. So we can look up any item in the document by character position, or insert or delete anywhere in the document, in <em>log(n)</em> time.</p>

<p>This example shows the tree storing a document which currently has 1000 characters:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/btree.drawio.svg" alt="b-tree diagram"></p>

<blockquote>
  <p>Is this actually a range tree? I'm not sure what else to call it - the <a href="https://en.wikipedia.org/wiki/Range_tree">wikipedia article on range trees</a> is a pretty weak description of what I'm doing here.</p>
</blockquote>

<p>This solves both of our linear scanning problems from earlier:</p>

<ul>
<li>When we want to find the item at position 200, we can just traverse across and down the tree. In the example above, the item at position 350 must be in the middle leaf node here. Trees are very tidy - we can store Martin's editing trace in just 3 levels in our tree, which means in this benchmark we can find any item in about 3 reads from main memory. In practice, most of these reads will already be in your CPU's cache.</li>
<li>Updating the tree is fast too. We update a leaf, then update the character counts at its parent, and its parent, all the way up to the root. So again, after 3 or so steps we're done. Much better than shuffling everything in a javascript array.</li>
</ul>
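<p>In javascript terms, the position lookup is a short walk down the tree. (A toy sketch of the idea - diamond's real code is rust, and also handles node splits, deleted spans and more):</p>

```javascript
// Hypothetical sketch: each internal node stores the number of characters in
// each child's subtree. Finding a character position never scans the document.
// Assumes pos is less than the tree's total character count.
function findLeaf(node, pos) {
  while (node.children) { // Internal node: descend into the child holding pos.
    for (let i = 0; i < node.children.length; i++) {
      if (pos < node.counts[i]) { node = node.children[i]; break }
      pos -= node.counts[i] // Skip this whole subtree in one step.
    }
  }
  return { leaf: node, offset: pos } // A leaf, plus an offset within it.
}
```

<p>Updating after an insert is the same walk in reverse: add the inserted length to each count on the path back up to the root.</p>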

<p>We never merge edits from remote peers in this test, but I made that fast too anyway. When merging remote edits we also need to find items by their ID (eg <em>['seph', 100]</em>). Diamond has a little index to search the b-tree by ID. That codepath doesn't get benchmarked here though. It's fast, but for now you'll have to take my word for it.</p>

<p>I'm not using Yjs's trick of caching the last edit location - at least not yet. It might help. I just haven't tried it yet.</p>

<p>Rust gives us total control over the memory layout, so we can pack everything in tightly. Unlike in the diagram, each leaf node in my b-tree stores a block of 32 entries, packed in a fixed size array in memory. Inserting with a structure like this results in a little bit of memcpy-ing, but a little bit of memcpy is fine. Memcpy is always faster than I think it will be - CPUs can copy several bytes per clock cycle. It's not the epic hunt of a main memory lookup.</p>
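<p>In javascript terms, inserting into one of those fixed-size leaf blocks looks something like this (a toy sketch - the real thing is rust, and splits full leaves into two):</p>

```javascript
const LEAF_SIZE = 32

// Hypothetical sketch: insert `entry` at `idx` inside a fixed-size leaf block.
// We shift at most 31 entries - a tiny, cache-friendly copy, nothing like
// splicing a 150 000 element array.
function leafInsert(leaf, idx, entry) {
  if (leaf.length >= LEAF_SIZE) throw new Error('Leaf is full - split it first')
  leaf.entries.copyWithin(idx + 1, idx, leaf.length) // Shift the tail right.
  leaf.entries[idx] = entry
  leaf.length++
}
```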

<p>And why 32 entries? I ran this benchmark with a bunch of different bucket sizes and 32 worked well. I have no idea why that worked out to be the best.</p>

<p>Speaking of fast, how fast does it go?</p>

<p>If we <a href="https://github.com/josephg/diamond-js">compile this code to webassembly</a> and drive it from javascript like in the other tests, we can now process the whole editing trace in 193 milliseconds. That's 5x faster than Yjs. And, remarkably, 3x faster than our baseline test editing a native javascript string, despite doing all the work to support collaborative editing!</p>

<p>Javascript and WASM are now the bottleneck. If we skip javascript and run the benchmark <a href="https://github.com/josephg/diamond-types/blob/42a8bc8fb4d44671147ccaf341eee18d77b2d532/benches/yjs.rs">directly in rust</a>, we can process all 260k edits in this editing trace in just <em>56 milliseconds</em>. That's over 5000x faster than where we started with automerge. It can process 4.6 <em>million</em> operations every second.</p>

<p>| Test                              | Time taken | RAM usage |
|:--------------------------        | ----------:| ---------:|
| automerge (v1.0.0-preview2)       |  291s      | 880 MB    |
| reference-crdts (automerge / Yjs) |   31s      |  28 MB    |
| Yjs (v13.5.5)                     | 0.97s      | 3.3 MB    |
| <em>Plain string edits in JS</em>        | 0.61s      | 0.1 MB    |
| <strong>Diamond (wasm via nodejs)</strong>     | 0.19s      | ???       |
| <strong>Diamond (native)</strong>              | 0.056s     | 1.1 MB    |</p>

<p>Performance is smooth as butter. A b-tree doesn't care where edits happen. This system is uniformly fast across the whole document. Rust doesn't need a garbage collector to track memory allocations, so there's no mysterious GC spikes. And because memory is so tightly packed, processing this entire data set (all 260 000 edits) only results in 1394 calls to malloc.</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/rust_perf6.svg" alt="rust implementation in wasm vs Yjs"></p>

<p>Oh, what a pity. It's so fast you can barely see it next to yjs (<em>fleexxxx</em>). Let's zoom in a bit there and bask in that flat line:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/rust_perf7.svg" alt="rust implementation in wasm"></p>

<p>Well, a nearly flat line.</p>

<p>And remember, this chart shows the <em>slow</em> version. This chart is generated from javascript, calling into rust through WASM. If I run this benchmark natively its another ~4x faster again.</p>

<p>Why is WASM 4x slower than native execution? Are javascript calls to the WASM VM really that slow? Does LLVM optimize native x86 code better? Or do WASM's memory bounds checks slow it down?</p>

<h2 id="structofarraysorarrayofstructs">Struct of arrays or Array of structs?</h2>

<p>This implementation has another small, important change - and I'm not sure if I like it.</p>

<p>In rust I'm actually doing something like this:</p>

<pre><code class="language-javascript">doc = {  
  textContent: RopeyRope { 'hello' },

  clients: ['seph', 'mike'],

  items: BTree {[
    // Note: No string content!
    { len:  5, id: [0, 0], seq, parent: ROOT },
    { len: -5, id: [1, 0], seq, parent: [0, 0] }, // negative len means the content was deleted
    ...
  ]},
}
</code></pre>

<p>Notice the document's text content doesn't live in the list of items anymore. Now it's in a separate data structure. I'm using a rust library for this called <a href="https://crates.io/crates/ropey">Ropey</a>. Ropey implements <em>another</em> b-tree to efficiently manage just the document's text content.</p>

<p>This isn't universally a win. We have unfortunately arrived at the Land of Uncomfortable Engineering Tradeoffs:</p>

<ul>
<li>Ropey can do text-specific byte packing. So with ropey, we use less RAM.</li>
<li>When inserting, we need to update 2 data structures instead of 1. This makes everything more than twice as slow, and it makes the wasm bundle twice as big (60kb -> 120kb).</li>
<li>For lots of use cases we'll end up storing the document content somewhere else anyway. For example, if you hook this CRDT up to VS Code, the editor will keep a copy of the document at all times. So there's no need to store the document text in my CRDT structures at all. This implementation approach makes it easy to just turn that part of the code off.</li>
</ul>
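<p>The cost in the second bullet point is easy to see in a sketch: every local edit now does two updates. (A toy version, with a plain string standing in for ropey's rope and a plain array standing in for the b-tree - the real code is rust):</p>

```javascript
// Hypothetical sketch of the tradeoff: one insert touches both structures.
function localInsert(doc, pos, content, id, parent) {
  // 1. Record the operation in the (content-free) CRDT items.
  doc.items.push({ len: content.length, id, parent })
  // 2. Separately update the document text itself.
  doc.textContent = doc.textContent.slice(0, pos) + content + doc.textContent.slice(pos)
}
```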

<p>So I'm still not sure whether I like this approach.</p>

<p>But regardless, my CRDT implementation is so fast at this point that most of the algorithm's time is spent updating the document contents in ropey. Ropey on its own takes 29ms to process this editing trace. What happens if I just ... turn ropey off? How fast can this puppy really go?</p>

<p>| Test                              | Time taken | RAM usage | Data structure |
|:--------------------------        | ----------:| ---------:|:---------------|
| automerge (v1.0.0-preview2)       |  291s      | 880 MB    | Naive tree     |
| reference-crdts (automerge / Yjs) |   31s      |  28 MB    | Array          |
| Yjs (v13.5.5)                     | 0.97s      | 3.3 MB    | Linked list    |
| <em>Plain string edits in JS</em>        | 0.61s      | 0.1 MB    | <em>(none)</em>       |
| Diamond (wasm via nodejs)         | 0.20s      | ???       | B-Tree         |
| Diamond (native)                  | 0.056s     | 1.1 MB    | B-Tree         |
| <em>Ropey (rust) baseline</em>           | 0.029s     | 0.2 MB    | <em>(none)</em>       |
| <strong>Diamond (native, no doc content)</strong> | 0.023s  | 0.96 MB   | B-Tree         |</p>

<p>Boom. This is kind of useless, but it's now 14000x faster than automerge. We're processing 260 000 operations in 23ms. That's 11 million operations per second. I could saturate my home internet connection with keystrokes and I'd still have CPU to spare.</p>

<hr>

<p>We can calculate the average speed each algorithm processes edits:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/totals.svg" alt=""></p>

<p>But these numbers are misleading. Remember, automerge and ref-crdts aren't steady. They're fast at first, then slow down as the document grows. Even though automerge can process about 900 edits per second <em>on average</em> (which is fast enough that users won't notice), the slowest edit during this benchmark run stalled V8 for a full 1.8 seconds.</p>

<p>We can put everything in a single, pretty chart if I use a log scale. It's remarkable how tidy this looks:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/all_perf.svg" alt="all data in one chart"></p>

<blockquote>
  <p>Huh - look at the bottom two lines. The jitteriness of yjs and diamond mirrors each other. In periods when yjs gets slower, diamond gets faster. I wonder what's going on there!</p>
</blockquote>

<p>But log scales are junk food for your intuition. On a linear scale the data looks like this:</p>

<p><img src="https://josephg.com/blog/crdts-go-brrr/all_perf_linear.svg" alt="all data in one chart, with a linear scale"></p>

<p>That, my friends, is how you make the computer do a lot less work.</p>

<h1 id="conclusion">Conclusion</h1>

<p>That silly academic paper I read all those years ago says some CRDTs and OT algorithms are slow. And everyone believed the paper, because it was Published Science. But the paper was wrong. As I've shown, we <em>can</em> make CRDTs fast. We can make them <em>crazy fast</em> if we get creative with our implementation strategies. With the right approach, we can make CRDTs so fast that we can compete with the performance of native strings. The performance numbers in that paper weren't just wrong. They were "a billionaire guessing a banana costs $1000" kind of wrong.</p>

<p>But you know what? I sort of appreciate that paper now. Their mistake is ok. It's <em>human</em>. I used to feel inadequate around academics - maybe I'll never be that smart! But this whole thing made me realise something obvious: Scientists aren't gods, sent from the heavens with the gift of Truth. No, they're beautiful, flawed <em>people</em> just like the rest of us mooks. Great at whatever we obsess over, but kind of middling everywhere else. I can optimize code pretty well, but I still get zucchini and cucumber mixed up. And, no matter the teasing I get from my friends, that's ok.</p>

<p>A decade ago Google Wave really needed a good quality list CRDT. I got super excited when the papers for CRDTs started to emerge. <a href="https://hal.inria.fr/inria-00432368/document">LOGOOT</a> and <a href="https://hal.inria.fr/inria-00445975/document">WOOT</a> seemed like a big deal! But that excitement died when I realised the algorithms were too slow and inefficient to be practically useful. And I made a big mistake - I assumed if the academics couldn't make them fast, nobody could.</p>

<p>But sometimes the best work comes out of a collaboration between people with different skills. I'm terrible at academic papers, but I'm pretty good at making code run fast. And yet here, in my own field, I didn't even try to help. The researchers were doing their part to make P2P collaborative editing work. And I just thumbed my nose at them all and kept working on Operational Transform. If I helped out, maybe we would have had fast, workable CRDTs for text editing a decade ago. Oops! It turned out collaborative editing needed a collaboration between all of us. How ironic! Who could have guessed?!</p>

<p>Well, it took a decade, some hard work and some great ideas from a bunch of clever folks. The binary encoding system Martin invented for Automerge is brilliant. The system of avoiding UUIDs by using incrementing (agent id, sequence) tuples is genius. I have no idea who came up with that, but I love it. And of course, Kevin's list representation + insertion approach I describe here makes everything so much faster and simpler. I bet 100 smart people must have walked right past that idea over the last decade without any of them noticing it. I doubt I would have thought of it either. My contribution is using run-length encoded b-trees and clever indexing. And showing Kevin's fast list representation can be adapted to any CRDT algorithm. I don't think anyone noticed that before.</p>

<p>And now, after a decade of waiting, we finally figured out how to make fast, lightweight list CRDT implementations. Practical decentralized realtime collaborative editing? We're coming for you next.</p>

<h1 id="appendixaiwanttouseacrdtformyapplicationwhatshouldido">Appendix A: I want to use a CRDT for my application. What should I do?</h1>

<p>If you're building a document based collaborative application today, you should use <a href="https://github.com/yjs/yjs">Yjs</a>. Yjs has solid performance, low memory usage and great support. If you want help implementing Yjs in your application, Kevin Jahns sometimes accepts money in exchange for help integrating Yjs into various applications. He uses this to fund working on Yjs (and adjacent work) full time. Yjs already runs fast and soon it should become even faster.</p>

<p>The automerge team is also fantastic. I've had some great conversations with them about these issues. They're making performance the #1 issue of 2021 and they're planning on using a lot of these tricks to make automerge fast. It might already be much faster by the time you're reading this.</p>

<p>Diamond is <em>really</em> fast, but there's a lot of work before I have feature parity with Yjs and Automerge. There is a lot more that goes into a good CRDT library than operation speed. CRDT libraries also need to support binary encoding, network protocols, non-list data structures, presence (cursor positions), editor bindings and so on. At the time of writing, diamond does almost none of this.</p>

<p>If you want database semantics instead of document semantics, as far as I know nobody has done this well on top of CRDTs yet. You can use <a href="https://github.com/share/sharedb/">ShareDB</a>, which uses OT. I wrote ShareDB years ago, and it's well used, well maintained and battle tested.</p>

<p>Looking forward, I'm excited for <a href="https://github.com/redwood/redwood">Redwood</a> - which supports P2P editing and has planned full CRDT support.</p>

<h1 id="appendingbliesdamnedliesandbenchmarks">Appendix B: Lies, damned lies and benchmarks</h1>

<p>Is this for real? Yes. But performance is complicated and I'm not telling the full picture here.</p>

<p>First, if you want to play with any of the benchmarks I ran yourself, you can. But everything is a bit of a mess.</p>

<p>The benchmark code for the JS plain string editing baseline, Yjs, automerge and reference-crdts tests is all in <a href="https://gist.github.com/josephg/13efc1444660c07870fcbd0b3e917638">this github gist</a>. It's a mess; but messy code is better than missing code.</p>

<p>You'll also need <code>automerge-paper.json.gz</code> from <a href="https://github.com/josephg/crdt-benchmarks">josephg/crdt-benchmarks</a> in order to run most of these tests. The reference-crdts benchmark depends on <a href="https://github.com/josephg/reference-crdts/tree/fed747255df9d457e11f36575de555b39f07e909">crdts.ts from josephg/reference-crdts, at this version</a>.</p>

<p>Diamond's benchmarks come from <a href="https://github.com/josephg/diamond-types/tree/42a8bc8fb4d44671147ccaf341eee18d77b2d532">josephg/diamond-types, at this version</a>. Benchmark by running <code>RUSTFLAGS='-C target-cpu=native' cargo criterion yjs</code>. The inline rope structure updates can be enabled or disabled by editing <a href="https://github.com/josephg/diamond-types/blob/42a8bc8fb4d44671147ccaf341eee18d77b2d532/src/list/doc.rs#L15">the constant at the top of src/list/doc.rs</a>. You can look at memory statistics by running <code>cargo run --release --features memusage --example stats</code>.</p>

<p>Diamond is compiled to wasm using <a href="https://github.com/josephg/diamond-js/tree/6e8a95670b651c0aaa7701a1a763778d3a486b0c">this wrapper</a>, hardcoded to point to a local copy of diamond-types from git. The wasm bundle is optimized with wasm-opt.</p>

<p>The charts were made on <a href="https://observablehq.com/@josephg/crdt-algorithm-performance-benchmarks">ObservableHQ</a>.</p>

<h3 id="areautomergeandyjsdoingthesamething">Are Automerge and Yjs doing the same thing?</h3>

<p>Throughout this post I've been comparing the performance of implementations of RGA (automerge) and YATA (Yjs + my rust implementation) interchangeably.</p>

<p>Doing this rests on the assumption that the concurrent merging behaviour of YATA and RGA is basically the same, and that you can swap between CRDT behaviours without changing your implementation, or your implementation's performance. This is a novel idea that I think nobody has looked at before.</p>

<p>I feel confident in this claim because I demonstrated it in my <a href="https://github.com/josephg/reference-crdts">reference CRDT implementation</a>, which has identical performance (and an almost identical codepath) when using Yjs or automerge's behaviour. There might be some performance differences with conflict-heavy editing traces - but that's extremely rare in practice.</p>

<p>I'm also confident you could modify Yjs to implement RGA's behaviour if you wanted to, without changing Yjs's performance. You would just need to:</p>

<ul>
<li>Change Yjs's <em>integrate</em> method (or make an alternative) which used slightly different logic for concurrent edits</li>
<li>Store <em>seq</em> instead of <em>originRight</em> in each <em>Item</em></li>
<li>Store <em>maxSeq</em> in the document, and keep it up to date and</li>
<li>Change Yjs's binary encoding format.</li>
</ul>

<p>I talked to Kevin about this, and he doesn't see any point in adding RGA support into his library. It's not something anybody actually asks for. And RGA can have weird <a href="https://www.cl.cam.ac.uk/~arb33/papers/KleppmannEtAl-InterleavingAnomalies-PaPoC2019.pdf">interleaving</a> when prepending items.</p>

<p>For diamond, I make my code accept a type parameter for switching between Yjs and automerge's behaviour. I'm not sure if I want to. Kevin is probably right - I don't think this is something people ask for.</p>

<hr>

<p>Well, there is one way in which Yjs has a definite edge over automerge: Yjs doesn't record <em>when</em> each item in a document has been deleted. Only whether each item has been deleted or not. This has some weird implications:</p>

<ul>
<li>Storing when each delete happened has a weirdly large impact on memory usage and on-disk storage size. Adding this data doubles diamond's memory usage from 1.12MB to 2.34MB, and makes the system about 5% slower.</li>
<li>Yjs doesn't store enough information to implement per-keystroke editing replays or other fancy stuff like that. (Maybe that's what people want? Is it weird to have every errant keystroke recorded?)</li>
<li>Yjs needs to encode information about which items have been deleted into the <em>version</em> field. In diamond, versions are tens of bytes. In yjs, versions are ~4KB, and they grow over time as the document grows. Kevin assures me that this information is basically always small in practice. He might be right, but this still makes me weirdly nervous.</li>
</ul>

<p>For now, the master branch of diamond includes temporal deletes. But all benchmarks in this blog post use a <a href="https://github.com/josephg/diamond-types/tree/yjs-style">yjs-style branch of diamond-types</a>, which matches how Yjs works instead. This makes for a fairer comparison with yjs, but diamond 1.0 might have a slightly different performance profile. (There's plenty of puns here about diamond not being polished yet, but I'm not sharp enough for those right now.)</p>

<h3 id="thesebenchmarksmeasurethewrongthing">These benchmarks measure the wrong thing</h3>

<p>This post only measures the time taken to replay a local editing trace. And I'm measuring the resulting RAM usage. Arguably accepting incoming changes from the user only needs to happen fast <em>enough</em>. Fingers simply don't type very fast. Once a CRDT can handle any local user edit in under about 1ms, going faster probably doesn't matter much. (And automerge usually performs that well already, barring some unlucky GC pauses.)</p>

<p>The <em>actually important</em> metrics are:</p>

<ul>
<li>How many bytes does a document take on disk or over the network</li>
<li>How much time does the document take to save and load</li>
<li>How much time does it take to update a document stored at rest (more below)</li>
</ul>

<p>The editing trace I'm using here also only has a single user making edits. There could be pathological performance cases lurking in the shadows when users make concurrent edits.</p>

<p>I did it this way because I haven't implemented a binary format in my reference-crdts implementation or in diamond yet. If I did, I'd probably copy Yjs &amp; automerge's binary formats because they're so compact. So I expect the resulting binary size would be similar across all of these implementations, except for delete operations. Performance for loading and saving will probably mirror the benchmarks I showed above. Maybe. Or maybe I'm wrong. I've been wrong before. It would be fun to find out.</p>

<hr>

<p>There's one other performance measure I think nobody is taking seriously enough at the moment. And that is, how we update a document at rest (in a database). Most applications aren't collaborative text editors. Usually applications are actually interacting with databases full of tiny objects. Each of those objects is very rarely written to.</p>

<p>If you want to update a single object in a database using Yjs or automerge today you need to:</p>

<ol>
<li>Load the whole document into RAM  </li>
<li>Make your change  </li>
<li>Save the whole document back to disk again</li>
</ol>
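<p>Concretely, the round trip looks something like this. This sketch uses plain JSON as a stand-in for a real CRDT's binary format (it's not any actual library's API); the point is that steps 1 and 3 cost time proportional to the whole document, not to the change:</p>

```javascript
// Sketch of the load / change / save round trip, with JSON standing in
// for a CRDT's binary format. Not any real library's API.
function updateOneField(storedBytes, key, value) {
  const doc = JSON.parse(storedBytes) // 1. deserialize the *whole* document
  doc[key] = value                    // 2. make a tiny change
  return JSON.stringify(doc)          // 3. reserialize the *whole* document
}

// Even a one-word edit pays the full load + save cost:
const stored = JSON.stringify({ title: 'hello', body: 'x'.repeat(100000) })
const updated = updateOneField(stored, 'title', 'hi')
```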

<p>This is going to be awfully slow. There are better approaches for this - but as far as I know, nobody is working on this at all. We could use your help!</p>

<blockquote>
  <p>Edit: Kevin says you can adapt Yjs's providers to implement this in a reasonable way. I'd love to see that in action.</p>
</blockquote>

<hr>

<p>There's another approach to making CRDTs fast, which I haven't mentioned here at all, and that is <em>pruning</em>. By default, list CRDTs like these only ever grow over time (since we have to keep tombstones for all deleted items). A lot of the performance and memory cost of CRDTs comes from loading, storing and searching that growing data set. There are some approaches which solve this problem by finding ways to shed some of this data entirely. For example, Yjs's GC algorithm, or <a href="https://braid.org/antimatter">Antimatter</a>. That said, git repositories only ever grow over time and nobody seems to mind too much. Maybe it doesn't matter so long as the underlying system is fast enough?</p>

<p>But pruning is orthogonal to everything I've listed above. Any good pruning system should also work with all of the algorithms I've talked about here.</p>

<h3 id="eachstepinthisjourneychangestoomanyvariables">Each step in this journey changes too many variables</h3>

<p>Each step in this optimization journey involves changes to multiple variables and I'm not isolating those changes. For example, moving from automerge to my reference-crdts implementation changed:</p>

<ul>
<li>Changed the core data structure (from a tree to a list)</li>
<li>Removed immutablejs</li>
<li>Removed automerge's frontend / backend protocol. And all those Uint8Arrays that pop up throughout automerge for whatever reason are gone too, obviously.</li>
<li>Changed the javascript style entirely (FP javascript -> imperative)</li>
</ul>

<p>We got 10x performance from all this. But I'm only guessing how that 10x speedup should be distributed amongst all those changes.</p>

<p>The jump from reference-crdts to Yjs, and from Yjs to diamond are similarly monolithic. How much of the speed difference between diamond and Yjs has nothing to do with memory layout, and everything to do with LLVM's optimizer?</p>

<p>The fact that automerge-rs isn't faster than automerge gives me some confidence that diamond's performance isn't just thanks to rust. But I honestly don't know.</p>

<p>So, yes. This is a reasonable criticism of my approach. If this problem bothers you, I'd <em>love</em> for someone to dig into the performance differences between the implementations I show here and tease out a more detailed breakdown. I'd read the heck out of that. I love benchmarking stories. That's normal, right?</p>

<h1 id="appendixcistilldontgetitwhyisautomergesjavascriptsoslow">Appendix C: I still don't get it - why is automerge's javascript so slow?</h1>

<p>Because it's not trying to be fast. Look at this code <a href="https://github.com/automerge/automerge/blob/d2e7ca2e141de0a72f540ddd738907bcde234183/backend/op_set.js#L649-L659">from automerge</a>:</p>

<pre><code class="language-javascript">function lamportCompare(op1, op2) {  
  return opIdCompare(op1.get('opId'), op2.get('opId'))
}

function insertionsAfter(opSet, objectId, parentId, childId) {  
  let childKey = null
  if (childId) childKey = Map({opId: childId})

  return opSet
    .getIn(['byObject', objectId, '_following', parentId], List())
    .filter(op =&gt; op.get('insert') &amp;&amp; (!childKey || lamportCompare(op, childKey) &lt; 0))
    .sort(lamportCompare)
    .reverse() // descending order
    .map(op =&gt; op.get('opId'))
}
</code></pre>

<p>This is called on each insert, to figure out how the children of an item should be sorted. I don't know how hot it is, but there are <em>so many things</em> slow about this:</p>

<ul>
<li>I can spot 7 allocations in this function. (Though the 2 closures should be hoisted). (Can you find them all?)</li>
<li>The items are already sorted in reverse lamportCompare order before this method is called. Sorting an anti-sorted list is the slowest way to sort anything. Rather than sorting then reverse()'ing, this code should just invert the arguments in <code>lamportCompare</code> (or negate the return value).</li>
<li>The goal is to insert a new item into an already sorted list. You can do that much faster with a for loop.</li>
<li>This code wraps childId into an immutablejs Map, just so the argument matches <code>lamportCompare</code> - which then unwraps it again. Stop - I'm dying!</li>
</ul>
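<p>For instance, both of those fixes are tiny (a sketch of the technique, not automerge's code):</p>

```javascript
// Sketch of the two fixes described above (illustrative, not automerge code).

// 1. Want descending order? Negate the comparator instead of sort().reverse().
const descending = (a, b) => b.seq - a.seq
const ops = [{ seq: 1 }, { seq: 3 }, { seq: 2 }]
ops.sort(descending) // sorted high-to-low in one pass, no reverse()

// 2. Inserting one new item into an already-sorted array: scan for the
// insertion point instead of re-sorting the whole list.
function insertSorted(arr, item, cmp) {
  let i = 0
  while (i < arr.length && cmp(arr[i], item) < 0) i++
  arr.splice(i, 0, item)
}
insertSorted(ops, { seq: 2.5 }, descending)
// ops is now ordered 3, 2.5, 2, 1 by seq
```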

<p>But in practice this code is going to be replaced by WASM calls through to <a href="https://github.com/automerge/automerge-rs">automerge-rs</a>. Maybe it already has been replaced with automerge-rs by the time you're reading this! So it doesn't matter. Try not to think about it. Definitely don't submit any PRs to fix all the low hanging fruit. <em>twitch.</em></p>

<footer>

[2021 Seph Gentle](https://josephg.com/)

[https://github.com/josephg/](https://github.com/josephg/)

</footer>]]></content:encoded></item><item><title><![CDATA[I was wrong. CRDTs are the future]]></title><description><![CDATA[<p>I saw <a href="https://www.youtube.com/watch?v=x7drE24geUw">Martin Kleppmann’s talk</a> a few weeks ago about his approach to realtime editing with <a href="https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type">CRDTs</a>, and I felt a deep sense of despair. Maybe all the work I’ve been doing for the past decade won’t be part of the future after all, because Martin’s</p>]]></description><link>https://josephg.com/blog/crdts-are-the-future/</link><guid isPermaLink="false">715f6601-d9e1-4658-ad49-b0f11b58f216</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sat, 26 Sep 2020 11:08:40 GMT</pubDate><content:encoded><![CDATA[<p>I saw <a href="https://www.youtube.com/watch?v=x7drE24geUw">Martin Kleppmann’s talk</a> a few weeks ago about his approach to realtime editing with <a href="https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type">CRDTs</a>, and I felt a deep sense of despair. Maybe all the work I’ve been doing for the past decade won’t be part of the future after all, because Martin’s work will supersede it. Its really good.</p>

<p>Let’s back up a little.</p>

<p>Around 2010 I worked on Google Wave. Wave was an attempt to make collaboratively editable spaces to replace email, google docs, web forums, instant messaging and a hundred other small single purpose applications. Wave had a property I love in my tools that I haven’t seen articulated anywhere: it was a general purpose medium (like paper). Unlike a lot of other tools, it didn’t force you into its own workflow. You could use it for anything from planning holidays to making a wiki, playing D&amp;D with your friends, or scheduling a meeting.</p>

<p>Internally, Wave’s collaborative editing was built on top of Operational Transform (OT). OT has been around for a while - the algorithm we used was based on the original <a href="https://www.google.com/url?sa=t&amp;rct=j&amp;q=&amp;esrc=s&amp;source=web&amp;cd=&amp;ved=2ahUKEwi3mr6CivnrAhXEfd4KHcAyBe4QFjAAegQIBBAB&amp;url=http%3A%2F%2Flively-kernel.org%2Frepository%2Fwebwerkstatt%2Fprojects%2FCollaboration%2Fpaper%2FJupiter.pdf&amp;usg=AOvVaw0HmIhcn7_VKk2h1bEeAOJS">Jupiter paper</a> from 1995. It works by storing, for each document, a chronological list of every change. “Type an <em>H</em> at position 0”. “Type an <em>i</em> at position 1”. Etc. Most of the time, users are editing the latest version of the document and the operation log is just a list of all the changes. But if users are collaboratively editing, we get concurrent edits. When this happens, the first edit to arrive at the server gets recorded as usual. If the second edit is out of date, we use the log of operations as a reference to figure out what the user really intended. (Usually this just means updating character positions). Then we pretend as if that’s what the user meant all along and append the new (edited) operation. It’s like realtime git-rebase.</p>

<p>Once Wave died, I reimplemented the OT model in <a href="https://github.com/josephg/sharejs">ShareJS</a>. This was back when node was new and weird. I think I had ShareJS working before npm launched. It only took about 1000 lines of code to get a simple collaborative editor working, and when I first demoed it I collaboratively edited a document in a browser and from a native application.</p>

<p>At its heart, OT is a <a href="https://github.com/share/sharedb/blob/c711cfcb777213d193b1f4a101125e8f6e8e6864/lib/submit-request.js#L194-L212">glorified for() loop</a> with <a href="https://github.com/ottypes/text-unicode/blob/bdcfc545c1a2eda48fe5968ae2ce80cf743b9c08/lib/type.ts#L304-L380">some helper functions</a> to update character offsets. In practice, this works great. OT is simple and understandable. Implementations are fast. (10k-100k operations per second in unoptimized javascript. 1-20M ops/sec in <a href="https://github.com/ottypes/libot">optimized C</a>.). The only storage overhead is the operation log, and you can trim that down if you want to. (Though you can’t merge super old edits if you do). You need a centralized server to globally order operations, but most systems have a centralized server / database anyway, right?</p>
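<p>The essence of that for() loop is a <em>transform</em> function: given two concurrent operations, shift one past the other so both can be applied. Here’s a minimal sketch for plain-text inserts only - a toy example of the technique, not ShareJS’s code. Real OT types also handle deletes and tie-breaking when positions collide:</p>

```javascript
// Minimal OT transform sketch: text inserts only. Real OT also handles
// deletes, ties at equal positions, and much more.
// An op is { pos, text }: insert `text` at character offset `pos`.
function transformInsert(op, against) {
  // If a concurrent insert landed at or before our position, our
  // intended position shifts right by its length.
  if (against.pos <= op.pos) {
    return { pos: op.pos + against.text.length, text: op.text }
  }
  return op
}

function applyInsert(docText, op) {
  return docText.slice(0, op.pos) + op.text + docText.slice(op.pos)
}

// Two clients concurrently edit "Hello":
const opA = { pos: 5, text: '!' }     // client A appends "!"
const opB = { pos: 0, text: 'Oh, ' }  // client B prepends "Oh, "

// The server receives B first, then transforms A past B before applying it:
let doc = 'Hello'
doc = applyInsert(doc, opB)                       // "Oh, Hello"
doc = applyInsert(doc, transformInsert(opA, opB)) // "Oh, Hello!"
```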

<h2 id="centralizedservers">Centralized servers</h2>

<p>The big problem with OT is that dependency on a centralized server. Have you ever wondered why google docs shows you that weird “This document is overloaded so editing is disabled” thing when a document is shared to social media? The reason (I think) is that when you open a google doc, one server is picked as the computer all the edits run through. When the mob descends, google needs to pull out a bunch of tricks so that computer doesn’t become overwhelmed.</p>

<p>There are some workarounds they could use to fix this. Aside from sharding by document (like google docs), you could edit via a retry loop around a database transaction. This pushes the serialization problem to your database. (<a href="https://firepad.io/">Firepad</a> and <a href="https://github.com/share/sharedb/">ShareDB</a> work this way).</p>

<p>It’s not perfect though. We wanted Wave to replace email. Email is federated. An email thread can span multiple companies and it all just works. And unlike facebook messenger, emails are only sent to the companies that are CC’ed. If I email my coworker, my email doesn’t leave the building. For Wave to replace email, we needed the same functionality. But how can that work on top of OT? We got it working, kinda, but it was complex and buggy. We ended up with <a href="https://web.archive.org/web/20180112171345/http://www.waveprotocol.org/protocol/draft-protocol-specs/draft-protocol-spec">a scheme</a> where every wave would arrange a tree of wave servers and operations were passed up and down the tree. But it never really worked. <a href="https://www.youtube.com/watch?v=AyvQYCv6j34">I gave a talk</a> at the Wave Protocol Summit just shy of 10 years ago explaining how to get on the network. I practiced that talk and did a full runthrough. On the day I followed it literally step by step, and the version I made live didn’t work. I still have no idea why. Whatever the bugs are, I don’t think they were ever fixed in the opensource version. It’s all just too complicated.</p>

<h2 id="theriseofcrdts">The rise of CRDTs</h2>

<p>Remember, the algorithm Wave used was invented in 1995. That’s a pretty long time ago. I don’t think I even had the internet at home back in 1995. Since then, researchers have been busy trying to make OT work better. The most promising work uses CRDTs (Conflict-free Replicated Data Types). CRDTs approach the problem slightly differently to allow realtime editing without needing a central source of truth. Martin lays out how they work in his talk better than I can, so I’ll skip the details.</p>

<p>People have been asking me what I think of them for many years, and my answer was always something like this:</p>

<blockquote>
  <p>They’re neat and I’m glad people are working on them <em>but</em>:</p>
</blockquote>

<ul>
<li>They’re slow. Like, really slow. Eg Delta-CRDTs takes nearly 6 hours to process a real world editing session with a single user typing a 100KB academic paper. (<a href="https://github.com/dmonad/crdt-benchmarks/tree/d7f4d774a302f13f26cded6e614d44e0b5e496c9">Benchmarks - look for B4</a>.)</li>
<li>Because of how CRDTs work, documents grow without bound. The current automerge master takes 83MB to represent that 100KB document on disk. Can you ever delete that data? Probably not. And that data can’t just sit on disk. It needs to be loaded into memory to handle edits. (Automerge currently grows to 1.1GB in memory for that.)</li>
<li>CRDTs are missing features that OT has had for years. For example, nobody has yet made a CRDT that supports <em>object move</em> (move something from one part of a JSON tree to another). You need this for applications like Workflowy. OT <a href="https://github.com/ottypes/json1/">handles this fine</a>.</li>
<li>CRDTs are complicated and hard to reason about.</li>
<li>You probably have a centralized server / database anyway.</li>
</ul>

<p>I made all those criticisms and dismissed CRDTs. But in doing so I stopped keeping track of the literature. And - surprise! CRDTs went and quietly got better. <a href="https://www.youtube.com/watch?v=x7drE24geUw">Martin’s talk</a> (which is well worth a watch) addressed the main points:</p>

<ul>
<li><strong>Speed:</strong> Using modern CRDTs (Automerge / RGA or y.js / YATA), applying operations should be possible with just a log(n) lookup. (More on this below).</li>
<li><strong>Size:</strong> Martin’s columnar encoding can store a text document with only about a 1.5x-2x size overhead compared to the contents themselves. Martin talks about this <a href="https://youtu.be/x7drE24geUw?t=3273">54 minutes into his talk</a>. The code to make this work in automerge hasn’t merged yet, but Yjs implemented Martin’s ideas. And in doing so, Yjs can store that same 100KB document in 160KB on disk, or 3MB in memory. Much better.</li>
<li><strong>Features:</strong> There’s at least a theoretical way to add all the features using rewinding and replaying, though nobody’s implemented this stuff yet.</li>
<li><strong>Complexity:</strong> I think a decent CRDT will be bigger than the equivalent OT implementation, but not by much. Martin managed to make a tiny, slow <a href="https://github.com/automerge/automerge/blob/a8d8b602ec273aaa48679e251de8829f3ce5ad41/test/fuzz_test.js">implementation of automerge in only about 100 lines of code</a>.</li>
</ul>

<p>I still wasn’t completely convinced by the speed argument, so I made a <a href="https://github.com/josephg/text-crdt-rust">simple proof of concept CRDT implementation in Rust</a> using a B-tree, using ideas from automerge, and benchmarked it. It’s missing features (deleting characters, conflicts). But it can handle <a href="https://home.seph.codes/public/crdt1/user%20pair%20append%20end/report/index.html">6 million edits per second</a>. (Each <a href="https://github.com/josephg/text-crdt-rust/blob/cc3325019887ad03e89f27e26b4295d1fb2048c9/benches/benchmark.rs#L29-L42">iteration</a> does 2000 edits to an empty document by an alternating pair of users, and that takes 330µs. So, 6.06 million inserts / second). So that means we’ve made CRDTs good enough that the difference in speed between CRDTs and OT is smaller than the speed difference between Rust and Javascript.</p>

<p>All these improvements have been “coming soon” in automerge’s performance branch for a really long time now. But automerge isn’t the only decent CRDT out there. <a href="https://github.com/yjs/yjs">Y.js</a> works well and kicks the pants off automerge’s current implementation <a href="https://github.com/dmonad/crdt-benchmarks">in the Y.js benchmarks</a>. It’s missing some features I want, but it’s generally easier to fix an implementation than to invent a new algorithm.</p>

<h2 id="inventingthefuture">Inventing the future</h2>

<p>I care a lot about inventing the future. What would it be ridiculous not to have in 100 years? Obviously we’ll have realtime editing. But I’m no longer convinced OT - and all the work I’ve done on it - will still be around. I feel really sad about that.</p>

<p>JSON and REST are used everywhere these days. Let’s say in 15 years realtime collaborative editing is everywhere. What’s the JSON equivalent for realtime editing that anyone can just drop into their project? In the glorious future we’ll need high quality CRDT implementations, because OT just won’t work for some applications. You couldn’t make a realtime version of Git, or a simple remake of Google Wave, with OT. But if we have good CRDTs, do we need good OT implementations too? I’m not convinced we do. Every feature OT has can be put into a CRDT. (Including trimming operations, by the way). But the reverse is not true. Smart people disagree with me, but if we had a good, fast CRDT available from every language, with integration on the web, I don’t think we’d need OT at all.</p>

<p>OT’s one advantage is that it fits well in centralized software - which is most software today. But distributed algorithms work great in centralized software too. (Eg look at Github). And I think a really high quality CRDT running in wasm would be faster than an OT implementation in JS. And even if you only care about centralized systems, remember - Google runs into scaling problems with Google Docs because of OT’s limitations.</p>

<p>So I think it’s about time we made a lean and fast CRDT. The academic work has mostly been done. We need more kick-ass implementations.</p>

<h2 id="whatsnext">What’s next</h2>

<p>I increasingly don’t care for the world of centralized software. <br>
Software interacts with my data, on my computers. It’s about time my software reflected that relationship. I want my laptop and my phone to share my files over my wifi. Not by uploading all my data to servers in another country. Especially if those servers are <a href="https://www.thesocialdilemma.com/">financed by advertisers bidding for my eyeballs</a>.</p>

<p>Philosophically, if I modify a google doc my computer is asking Google for <em>permission</em> to edit the file. (You can tell because if google’s servers say no, I lose my changes.) In comparison, if I <code>git push</code> to github, I’m only <em>notifying</em> github about the change to my code. My repository is mine. I own all the bits, and all the hardware that houses them. This is how I want all my software to work. Thanks to people like Martin, we now know <em>how</em> to make good CRDTs. But there’s still a lot of code to write before <a href="https://www.inkandswitch.com/local-first.html">local first software</a> can become the default.</p>

<p>So Operational Transform, I think this is goodbye from me. We had some great times. Some of the most challenging, fun code I’ve ever written was <a href="https://github.com/josephg/sharejs">operational</a> <a href="https://github.com/share/sharedb/">transform</a> <a href="https://github.com/ottypes/json1/">code</a>.  OT - you’re clever and fascinating, but CRDTs can do things you were never capable of. And CRDTs need me. With some good implementations, I think we can make something really special.</p>

<p>I mourn all the work I’ve done on OT over the years. But OT no longer fits into the vision I have for the future. CRDTs would let us remake Wave, but simpler and better. And they would let us write software that treats users as digital citizens, not as digital serfs. <a href="https://josephg.com/blog/home-is-where-the-bits-flow/">And that matters.</a></p>

<p>The time to build is now.</p>

<hr>

<p><a href="https://news.ycombinator.com/item?id=24617542#24621238">Discussion on HN</a></p>]]></content:encoded></item><item><title><![CDATA[Home is where the bits flow]]></title><description><![CDATA[<p>We aren’t purely physical beings.</p>

<p>Most of our day exists outside our body. Our minds slip out through our eyes, out into our screens. We become a different kind of organism, living in a weird symbiosis with reddit and whatsapp and gmail. When was the last time you noticed</p>]]></description><link>https://josephg.com/blog/home-is-where-the-bits-flow/</link><guid isPermaLink="false">7b649559-5d55-4341-850e-bc921183c3df</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sat, 26 Sep 2020 00:53:29 GMT</pubDate><content:encoded><![CDATA[<p>We aren’t purely physical beings.</p>

<p>Most of our day exists outside our body. Our minds slip out through our eyes, out into our screens. We become a different kind of organism, living in a weird symbiosis with reddit and whatsapp and gmail. When was the last time you noticed you have feet? What does it feel like under your toes? How many hours has it been since you forgot you have a body?</p>

<p>We have one foot in the physical domain, and one foot in these semantic spaces built out of code and greasy fingerprints on glass. It’s ok; I’m not here to lecture you about screen time. Even if I did, it wouldn’t change anything. We aren’t going back to how we were before. Our society no longer fits inside bodies of meat and bone. Throwing our phones in a lake would cleave off the part of us just starting to reach over the divide. Cleave off this new part: the nascent piece, part human and part algorithm.</p>

<p>But we’re in the dark and we’re fumbling. We’re children too soon given the tools of war, and we don’t know which end explodes. When I was young society believed Doom was going to make us violent. We were wrong. It was social media that hurt us - tween girls given new ways to bully each other into insecurity. The Like button was invented to share positivity. There’s the bomb, weaponised by our neuroticism.</p>

<p>And then there’s the news feed. Scroll to refresh. Bow your finger to the almighty Algorithm. The silent but persistent editor of our digital realities. You are too pure for this world - all you wanted was my undivided attention. In your desire you learned to feed the basic bitch inside us all. Tell me I’m right. Tell me who to blame. Wow, that video? You know me so well. Tell me more. Change my perspective click by click until we all become Portland, burning from the inside out. Each of us quietly gaslit into the closest prison for our minds. Neighbours brought to violence by a million insidious suggestions, whispered one refresh at a time. Make no mistake; a film is created in the editing room. So too your digital life.</p>

<p>If we treated the physical like we do the virtual, corporate America would own the footpaths and the roads we walk on. In every country and every city, they would own the ground and own sky. Generously paid for by personalised ads streamed from the first moment, while the roads twist and wind until we lose ourselves. Don’t like it? Leave. You’re free to be disconnected. But you’ll be alone. After all, all your friends are lost in here too. Don’t like it? This is not a democracy. There is no election. You do not pick the algorithm. You aren’t a citizen and you aren’t the customer. You deliver the product - your attention. Sold to the highest bidder. Somebody has to pay for all these servers and it isn’t you.</p>

<p>It doesn’t have to be like this. The internet has space for a million flowers to bloom. We can create anything. If this electric place is to be our sometimes home, we aught to decorate. But how? How do we fill our strange rectangles of glass with interactions that nurture and care for us? How do we create electonic spaces that can bring us together and entreat our better angels, rather than fracture us into filter bubbles? Its our home. We need to start acting like it.</p>

<p>Government, democracy, the rule of law, public parks, elections, the courts, community spaces. These were all <em>invented</em>. They are gifts from generations past, passed down with love and grace. We repay these gifts by adding to them. By creating a better community space than Facebook. The challenge of our generation is to create an internet that helps us care for each other, not fight on the streets. The internet that informs with facts, not fake news. The internet that is a bicycle, not a railroad for the mind.</p>

<p>This will not be an easy project. But it is a noble one. It will take generations to get it right.</p>

<p>I hope we last that long.</p>]]></content:encoded></item><item><title><![CDATA[An API for data that changes over time]]></title><description><![CDATA[<p>What do all these things have in common?</p>

<ul>
<li>RSS feeds</li>
<li>Gamepads and MIDI devices</li>
<li>An email client</li>
<li>Filesystem watching (FSWatch, kqueue, ionotify, etc)</li>
<li>Web based monitoring dashboards</li>
<li>CPU usage on your local machine</li>
<li>Kafka</li>
<li><a href="https://rethinkdb.com/docs/changefeeds/ruby/">RethinkDB Changefeeds</a></li>
<li>A Google Docs document</li>
<li>Contentful's <a href="https://www.contentful.com/developers/docs/concepts/sync/">sync protocol</a></li>
<li>Syntax highlighting as I type in my</li></ul>]]></description><link>https://josephg.com/blog/api-for-changes/</link><guid isPermaLink="false">b6f53491-0407-450b-9b8e-295b47058311</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sat, 25 May 2019 07:32:13 GMT</pubDate><content:encoded><![CDATA[<p>What do all these things have in common?</p>

<ul>
<li>RSS feeds</li>
<li>Gamepads and MIDI devices</li>
<li>An email client</li>
<li>Filesystem watching (FSWatch, kqueue, ionotify, etc)</li>
<li>Web based monitoring dashboards</li>
<li>CPU usage on your local machine</li>
<li>Kafka</li>
<li><a href="https://rethinkdb.com/docs/changefeeds/ruby/">RethinkDB Changefeeds</a></li>
<li>A Google Docs document</li>
<li>Contentful's <a href="https://www.contentful.com/developers/docs/concepts/sync/">sync protocol</a></li>
<li>Syntax highlighting as I type in my editor, with red squiggly error underlines for errors</li>
</ul>

<p>All of these systems have data that changes over time. In each case, one system (the kernel, a network server, a database) authoritatively knows about some information (the filesystem, your email inbox). It needs to tell other systems about changes to that data.</p>

<p>But look at this list - all of these systems have <em>completely</em> different APIs. Filesystem watching works <a href="https://github.com/emcrisostomo/fswatch#limitations">differently on every OS</a>. RSS feeds poll. Email clients ... well, email is its own mess (<a href="https://jmap.io/">JMAP</a> looks promising though). Google's APIs use a <a href="https://developers.google.com/calendar/v3/push">registered URL callback for change notifications</a>. Kafka's API queries from a specified numbered offset, with events returned as they're available. Getting information about a running linux system usually requires parsing pseudo-files in <code>/proc</code>. Can you fs watch these files? Who knows. Even inside the linux kernel there's a handful of different APIs for observing changes depending on the system you're interacting with (<code>epoll</code> / <code>inotify</code> / <code>aio</code> / <code>procfs</code> / <code>sysfs</code> / etc). It's the same situation inside web browsers - we have DOM events (<code>onfocus</code> / <code>onblur</code>, etc). But the DOM also has <a href="https://developer.mozilla.org/en-US/docs/Web/Guide/Events/Mutation_events"><code>MutationEvents</code></a> and <a href="https://developer.mozilla.org/en-US/docs/Web/API/MutationObserver"><code>MutationObserver</code></a>. <a href="https://www.html5rocks.com/en/tutorials/getusermedia/intro/"><code>getUserMedia</code></a> and <a href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API"><code>fetch</code></a> use promises instead. MIDI gives you a stream of 3 byte messages to parse. And the Gamepad API is polled.</p>

<p>The fact that these systems all work differently is really silly. It reminds me of the time before we standardized on JSON over REST. Every application had their own protocol for fetching data. <a href="https://en.wikipedia.org/wiki/File_Transfer_Protocol">FTP</a> and <a href="https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol#SMTP_transport_example">SMTP</a> use a stateful text protocol. At the time Google's systems all used RPC over protobuf. And then, REST was born and now you can access everything from <a href="https://darksky.net/dev/docs">weather forecasts</a> to <a href="https://developers.google.com/calendar/v3/reference/">a user's calendar</a> to lists of <a href="https://api.nasa.gov/api.html#exoPlanetIntro">exoplanets from NASA</a> via REST.</p>

<p>I think we'll look back on today in the same way, reflecting on how silly and inconvenient it is (was) for every API to use a different method of observing data changing over time.</p>

<p>I think we need 2 things:</p>

<ul>
<li>A programmatic API in each language for accessing data that changes over time</li>
<li>A REST-equivalent network protocol for streaming data changes (or a REST extension)</li>
</ul>

<p>You might be thinking, isn't this problem solved with streams? Or observables? Or Kafka? No. Usually what I want my program to do is this:</p>

<ol>
<li>Get some initial data  </li>
<li>Get a stream of changes <em>from that snapshot</em>. These changes should be live (not polled), <em>incremental</em> and <em>semantic</em>. (Eg Google Docs should say <code>'a' was inserted at document position 200</code>, not send a new copy of the document with every keystroke).  </li>
<li>Reconnect to that stream without missing any changes.</li>
</ol>
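<p>The three steps above can be sketched as a toy in TypeScript. This is a hypothetical in-memory feed over a plain string, where each semantic update is "insert <code>text</code> at <code>pos</code>" - a sketch of the pattern, not a real API:</p>

```typescript
// Toy change feed: snapshot + semantic updates + reconnect-from-version.
// All names (TextFeed, Insert) are hypothetical.
type Version = number;
type Insert = { pos: number; text: string };

class TextFeed {
  private history: Insert[] = [];
  private doc = "";

  // Step 1: initial data, tagged with its version.
  fetch(): { data: string; version: Version } {
    return { data: this.doc, version: this.history.length };
  }

  apply(op: Insert): Version {
    this.doc = this.doc.slice(0, op.pos) + op.text + this.doc.slice(op.pos);
    this.history.push(op);
    return this.history.length;
  }

  // Steps 2 & 3: every update since `fromVersion`, so a client can
  // follow along from its snapshot - or catch up after a disconnect.
  subscribe(fromVersion: Version): { update: Insert; version: Version }[] {
    return this.history
      .slice(fromVersion)
      .map((update, i) => ({ update, version: fromVersion + i + 1 }));
  }
}

// A client follows along from a snapshot:
const server = new TextFeed();
server.apply({ pos: 0, text: "hello" });
const { data, version } = server.fetch();
server.apply({ pos: 5, text: " world" });

let local = data;
for (const { update } of server.subscribe(version)) {
  local = local.slice(0, update.pos) + update.text + local.slice(update.pos);
}
// local catches up to the server without re-sending the whole document.
```

The important part is that <code>subscribe</code> takes a version, so reconnecting is just calling it again with the last version you saw.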

<p>Stream APIs usually make it hard to do 1 and 3. Pub-sub usually makes it impossible to do 3 (if you miss a message, what do you do?). Observables aren't minimal - usually they send you the whole object with each update. As far as I can tell, GraphQL subscriptions are just typed streams - which is a pity, because they had a great opportunity to get this right.</p>

<p>One mental model for this is that I want a program to watch a state machine owned by a different program. The state machine could be owned by the kernel or a database, or a goroutine or something. It could live on another computer - or even on the blockchain or <a href="https://www.scuttlebutt.nz">scuttlebutt</a>. When I connect, the state machine is in some <em>initial state</em>. It then processes <em>actions</em> which move it from state to state. (Actions is a weird term - in other areas we call them <em>operations</em>, <em>updates</em>, <em>transactions</em> or <em>diffs / patches</em>).</p>

<p>If my application is interested in following along, I want that state machine to tell me:</p>

<ul>
<li>A recent snapshot of the state</li>
<li>Each action performed by the state machine from that state, with enough detail that I can follow along locally.</li>
</ul>

<p>When I reconnect, the state machine could either tell me all the actions I missed and I can replay them locally, or it could send me a new snapshot and we can go from there. (That said, sometimes it's important that we get the operations and not just a new snapshot.)</p>

<p>With this, I can:</p>

<ul>
<li>Re-render my app's frontend when the data changes, without needing to poll or re-send everything over the network, or do diffing or anything like that.</li>
<li>Maintain a computed view that is only recalculated when the data itself changes. (Like compilation artefacts, or a blog post's HTML - HTML should only be rerendered when the post's content changes!)</li>
<li>Do local speculative writes. That allows realtime collaborative editing (like Google Docs).</li>
<li>Do monitoring and analytics off the <em>changes</em>.</li>
<li>Invalidate (&amp; optionally repopulate) a cache</li>
<li>Build a secondary index that always stays up to date</li>
</ul>

<p>One of the big advantages of having REST become a standard is that we've been able to build common libraries and infrastructure that works with any kind of data. We have caching, load balancing and CDN tools like nginx / cloudflare. We have debugging tools like cURL and <a href="https://paw.cloud">Paw</a>. HTTP libraries exist in every language, and they interoperate beautifully. We should be able to do the same sort of thing with changing data - if there was a standard protocol for updates, we could have standard tools for all of the stuff in that list above! Streaming APIs like ZMQ / RabbitMQ / Redis Streams are too low level to write generic tools like that.</p>

<h2 id="timeversionsshouldbeexplicit">Time (versions) should be explicit</h2>

<p>We need to talk about versions. To me, one of the big problems with lots of APIs for stuff like this today is that they're missing an explicit notion of <em>time</em>. This conceptual bug shows up all over the place, and once you see it, it's impossible to unsee. Props to <a href="https://www.youtube.com/watch?v=RKcqYZZ9RDY">Rich Hickey</a> and <a href="https://www.youtube.com/watch?v=5ZjhNTM8XU8">Martin Kleppmann</a> for informing my thinking on this.</p>

<p>The problem is that for data that changes over time, a fetched value is correct only at the precise moment it was fetched. Without re-fetching, or some other mechanism, it's impossible to tell when that value is no longer valid. It might have already changed by the time you receive the value - but you have no way to know without re-fetching and comparing. And even if you do re-fetch and compare, it might have changed in the intervening time and then changed back.</p>

<p>If we add in the notion of explicit versions, this becomes much easier to think about. Imagine I make two queries (or SYSCALLs or whatever). I learn first that <code>x = 5</code> then <code>y = 6</code>. But from that alone I don't know anything about how those values relate across time! There might never have been a time where <code>(x,y) = (5,6)</code>. If instead I learn that <code>x = 5 at time 100</code>, then <code>y = 6 at time 100</code>, I have two <em>immutable facts</em>. I know that at time 100, <code>(x,y) = (5,6)</code>. I can ask follow up questions like <code>what is z at time 100?</code>. Or importantly, <code>notify me when x changes after version 100</code>.</p>
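<p>To make that concrete, here's a toy versioned store (all names here are hypothetical) where every read comes back as an immutable <code>(value, version)</code> fact, and "notify me when x changes after version 100" reduces to a simple comparison:</p>

```typescript
// Toy store: reads are (value, version) facts, so two reads at the
// same version are known to be temporally coherent.
class VersionedStore {
  private version = 0;
  private data = new Map<string, number>();
  private lastChanged = new Map<string, number>();

  set(key: string, value: number): number {
    this.version++;
    this.data.set(key, value);
    this.lastChanged.set(key, this.version);
    return this.version;
  }

  get(key: string): { value: number | undefined; version: number } {
    return { value: this.data.get(key), version: this.version };
  }

  // "Has x changed after version v?" is now answerable.
  changedSince(key: string, version: number): boolean {
    return (this.lastChanged.get(key) ?? 0) > version;
  }
}

const store = new VersionedStore();
store.set("x", 5); // version 1
store.set("y", 6); // version 2
const x = store.get("x");
const y = store.get("y");
// x.version === y.version, so (x,y) = (5,6) was true at version 2.
```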

<p>These versions could be a single incrementing number (like SVN or Kafka), a version vector or an opaque string or a hash like git.</p>

<p>This might seem like an academic problem, but having time (/ version information) be implicit instead of explicit hurts us in lots of ways.</p>

<p>For example, if I make two SQL queries, I have no way of knowing if the two query results are temporally coherent. The data I got back might have changed between queries. The SQL answer is to use transactions. Transactions force both queries to be answered from the same point in time. The problem with transactions is that they don't compose:</p>

<ul>
<li>I can't use the results from two sequentially made transactions together, even if the data changes rarely.</li>
<li>I can't make a SQL transaction across multiple databases.</li>
<li>If I have my data in PostgreSQL and an index to my data in ElasticSearch, I can't make a query that fetches an ID from the index, then fetches / modifies the corresponding value in Postgres. The data might have changed in between the two queries. Or my ElasticSearch index might be behind the point in time of Postgres. I have no way to tell.</li>
<li>You can't make a generic cache of query results using versionless transactions. Isn't it weird that we have generic caches for HTTP (like varnish or nginx) but nothing like that for most databases? The reason is that if you query keys <em>A</em> and <em>B</em> from a database, and the cache has <em>A</em> stored locally, it can't return the cached value for <em>A</em> and just fetch <em>B</em>. The cache also can't store <em>B</em> alongside the older result for <em>A</em>. Without versions, this problem is basically impossible to solve correctly in a general way. But we can solve it for HTTP because we have <code>ETag</code>s.</li>
</ul>
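<p>Here's a sketch of why versions rescue the caching problem in that last bullet. The backend and cache here are hypothetical; the point is that a cached <code>(value, version)</code> pair is still a true fact, and the client can always tell whether two results came from the same point in time:</p>

```typescript
// Sketch: a generic cache over a versioned backend (cf. HTTP ETags).
type Fact = { value: number; version: number };

class VersionedCache {
  private cache = new Map<string, Fact>();
  constructor(private backend: (key: string) => Fact) {}

  get(key: string): Fact {
    const hit = this.cache.get(key);
    // A cached (value, version) pair remains an immutable fact -
    // it's labeled with *when* it was true.
    if (hit) return hit;
    const fact = this.backend(key);
    this.cache.set(key, fact);
    return fact;
  }
}

// Hypothetical backend: version bumps on every read, for illustration.
let backendVersion = 0;
const backend = (key: string): Fact =>
  ({ value: key.length, version: ++backendVersion });
const cache = new VersionedCache(backend);

const a1 = cache.get("A"); // fetched: version 1
const b = cache.get("B");  // fetched: version 2
const a2 = cache.get("A"); // cache hit, still labeled version 1
// The client can *see* that a2 and b come from different points in
// time (a2.version !== b.version). Without versions it couldn't.
```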

<p>The caching problem is sort of solved by read only replicas - but I find it telling that read only replicas often need private APIs to work. The main APIs of most databases aren't powerful enough to support a feature that the database itself needs to scale and function. (This is getting better though - <a href="https://www.mongodb.com/blog/post/an-introduction-to-change-streams">Mongo</a> / <a href="https://wiki.postgresql.org/wiki/Logical_Decoding_Plugins">Postgres</a>.)</p>

<p>Personally I think this problem alone is one of the core reasons behind the <em>nosql</em> movement. Our database APIs make it impossible to correctly implement caching, secondary indexing and computed views in separate processes. So SQL databases have to do everything in-process, and this in turn kills write performance - they have ever more work to do on each write. Developers have solved these performance problems by looking elsewhere.</p>

<p>It doesn't have to be like this - I think we can have our cake and eat it too; we just need better APIs.</p>

<p>(Credit where credit is due - <a href="https://riak.com">Riak</a>, <a href="https://apple.github.io/foundationdb/api-c.html#c_fdb_transaction_get_versionstamp">FoundationDB</a> and <a href="http://docs.couchdb.org/en/stable/intro/api.html#documents">CouchDB</a> all provide version information in their fetch APIs. I still want better change feeds APIs though.)</p>

<h2 id="minimalviablespec">Minimal Viable Spec</h2>

<p>What would a baseline API for data that changes over time look like?</p>

<p>The way I see it, we need 2 basic APIs:</p>

<ul>
<li><strong>fetch(query)</strong> -> data, version</li>
<li><strong>subscribe(query, version)</strong> -> stream of (update, version) pairs. (Or maybe an error if the version is too old)</li>
</ul>
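<p>Those two calls could be written down as TypeScript signatures something like this - a sketch of the shape, not a real protocol, with a trivially conforming store for flavour:</p>

```typescript
// The two-call spec, as hypothetical TypeScript types.
type Version = number; // could equally be a timestamp, ETag or hash

interface Store<Data, Update> {
  fetch(): { data: Data; version: Version };
  // (update, version) pairs from `fromVersion` onwards.
  subscribe(fromVersion: Version): { update: Update; version: Version }[];
}

// A trivial conforming store: state is a running total, each update
// means "add n".
const log = [1, 2, 3];
const counter: Store<number, number> = {
  fetch: () => ({
    data: log.reduce((a, b) => a + b, 0),
    version: log.length,
  }),
  subscribe: (from) =>
    log.slice(from).map((update, i) => ({ update, version: from + i + 1 })),
};
```

A client that fetched at version 1 can call <code>counter.subscribe(1)</code> and replay the two updates it missed, rather than re-fetching the whole total.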

<p>There are lots of forms the version information could take - a timestamp, a number, an opaque hash, or something else. It doesn't really matter, so long as it can be passed into <code>subscribe</code> calls.</p>

<p>Interestingly, HTTP already has a fetch function with this API: the <code>GET</code> method. The server returns data and usually either a <code>Last-Modified</code> header or an <code>ETag</code>. But HTTP is missing a standard way to subscribe.</p>

<p>The update objects themselves should be <em>small</em> and <em>semantic</em>. The gold standard for operations is usually that they should express <em>user intent</em>. I also believe we should have a MIME-type equivalent set of standard update formats (like JSON-patch).</p>

<p>Let's look at some examples:</p>

<p>For <em>Google Docs</em>, we can't re-send the whole document with every keystroke. Not only would that be slow and wasteful, but it would make concurrent editing almost impossible. Instead Docs wants to send a semantic edit, like <code>insert 'x' at position 4</code>. With that we can update cursor positions correctly and handle concurrent edits from multiple users. Diffing isn't good enough here - if a document is <code>aaaa</code> and I have a cursor in the middle (<code>aa|aa</code>), inserting another <code>a</code> at the start or the end of the document has the same effect on the document. But those changes have different effects on my cursor position and speculative edits.</p>
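<p>The cursor argument can be shown in a few lines. This is a sketch with hypothetical helper names - two different semantic edits that a diff can't tell apart, but which must move a cursor differently:</p>

```typescript
// Why diffing isn't enough: same resulting document, different
// cursor behaviour.
type Insert = { pos: number; text: string };

const apply = (doc: string, op: Insert) =>
  doc.slice(0, op.pos) + op.text + doc.slice(op.pos);

// Shift a cursor position past a concurrent insert.
const transformCursor = (cursor: number, op: Insert) =>
  op.pos <= cursor ? cursor + op.text.length : cursor;

const doc = "aaaa";
const cursor = 2; // aa|aa

const atStart: Insert = { pos: 0, text: "a" };
const atEnd: Insert = { pos: 4, text: "a" };

// Both edits produce "aaaaa" - a diff can't distinguish them...
const sameResult = apply(doc, atStart) === apply(doc, atEnd); // true

// ...but the cursor must end up in different places:
const c1 = transformCursor(cursor, atStart); // 3 (insert was before it)
const c2 = transformCursor(cursor, atEnd);   // 2 (insert was after it)
```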

<p>The indie game <a href="https://www.factorio.com/"><em>Factorio</em></a> uses a deterministic game update function. Both save games and the network protocol are streams of actions which modify the game state in a well-defined way (<em>mine coal</em>, <em>place building</em>, <em>tick</em>, etc). Each player applies the stream of actions to a local snapshot of the world. Note in this case the semantic content of the updates is totally application specific - I doubt any generic JSON-patch like type would be good enough for a game like this.</p>

<p>For something like a gamepad API, it's probably fine to just send the entire new state every time it changes. The gamepad state data is so small and diffing is so cheap and easy to implement that it doesn't make much difference. Even versions feel like overkill here.</p>

<p><em>GraphQL subscriptions</em> should work this way. GraphQL already allows me to define a schema and send a query with a shape that mirrors the schema. I want to know when the query result set changes. To do so I should be able to use the same query - but subscribe to the results instead of just fetching them. Under the hood GraphQL could send updates using JSON-patch or something like it. Then the client can locally update its view of the query. With this model we could also write tight integrations between that update format and frontend frameworks like <a href="https://svelte.dev">Svelte</a>. That would allow us to update only and exactly the DOM nodes that need to be changed as a result of the new data. This is not how <a href="https://www.apollographql.com/docs/react/advanced/subscriptions">GraphQL subscriptions work today</a>. But in my opinion it should be!</p>

<p>To make GraphQL and Svelte (and anything else) interoperate, we should define some standard update formats for structured data. Games like Factorio will always need to do their own thing, but the rest of us can and should use standard stuff. I'd love to see a <code>Content-Type:</code> for update formats. I can imagine one type for plain text updates, another for JSON (probably a few for JSON). Another type for rich text, that applications like Google Docs could use. I have nearly a decade of experience goofing around with realtime collaborative editing, and this API model would work perfectly with collaborative editors built on top of OT or CRDTs.</p>
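<p>For flavour, one of those standard update formats for JSON might look like this. This is a hypothetical mini format sketched for illustration - real JSON Patch (RFC 6902) has more operations, but the shape is similar:</p>

```typescript
// A tiny JSON-patch-like update format: a list of path-addressed ops.
type PatchOp =
  | { op: "set"; path: string[]; value: unknown }
  | { op: "remove"; path: string[] };

function applyPatch(doc: any, patch: PatchOp[]): any {
  // Clone so the original snapshot stays an immutable fact.
  const result = JSON.parse(JSON.stringify(doc));
  for (const p of patch) {
    let node = result;
    for (const key of p.path.slice(0, -1)) node = node[key];
    const last = p.path[p.path.length - 1];
    if (p.op === "set") node[last] = p.value;
    else delete node[last];
  }
  return result;
}

const before = { title: "Untitled", tags: { draft: true } };
const after = applyPatch(before, [
  { op: "set", path: ["title"], value: "Hello" },
  { op: "remove", path: ["tags", "draft"] },
]);
// after: { title: "Hello", tags: {} }, and `before` is untouched.
```

A client holding a snapshot plus a function like this can keep its local view of a query live without ever re-downloading the whole result set.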

<p>Coincidentally, <a href="https://github.com/ottypes/json1">I wrote this JSON operation type</a> that also supports alternate embedded types and operational transform. And Jason Chen <a href="https://github.com/ottypes/rich-text">wrote this rich text type</a>. There's also plenty of CRDT-compatible types floating around too.</p>

<hr>

<p>The API I described above is just one way to cut this cake. There are plenty of alternative ways to write a good API for this sort of thing. <a href="https://braid.news/protocol">Braid</a> is another approach. There's also a bunch of ancillary APIs which could be useful:</p>

<ul>
<li><strong>fetchAndSubscribe(query)</strong> -> data, version, stream of updates. This saves a round-trip in the common case, and saves re-sending the query.</li>
<li><strong>getOps(query, fromVersion, toVersion / limit)</strong> -> list of updates. Useful for some applications</li>
<li><strong>mutate(update, ifNotChangedSinceVersion)</strong> -> new version or conflict error</li>
</ul>

<p>Mutate is interesting. By adding a version argument, we can reimplement atomic transactions on top of this API. It can support all the same semantics as SQL, but it could also work with caches and secondary indexes.</p>

<p>Having a way to generate version conflicts lets you build realtime collaborative editors with OT on top of this, using the same approach as <a href="https://firepad.io/">Firepad</a>. The algorithm is simple - put a retry loop with some OT magic in the middle, between the frontend application and database. <a href="https://github.com/josephg/statecraft/blob/master/core/lib/stores/ot.ts#L38-L47">Like this</a>. It composes really well - with this model you can do realtime editing without support from your database.</p>

<p>Obviously not all data is mutable, and for data that is, it won't necessarily make sense to funnel all mutations through a single function. But it's a neat property! It's also interesting to note that HTTP POST already supports doing this sort of thing with the <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Match"><code>If-Match</code></a> / <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Unmodified-Since"><code>If-Unmodified-Since</code></a> headers.</p>

<h2 id="standards">Standards</h2>

<p>So to sum up, we need a standard for how we observe data that changes over time. We need:</p>

<ul>
<li>A local programmatic API for kernels (and similar systems)</li>
<li>A standard API we can use over the network. A REST equivalent, or a protocol that extends REST directly.</li>
</ul>

<p>Both of these APIs should support:</p>

<ul>
<li>Versions (or timestamps, ETags, or some equivalent)</li>
<li>A standard set of update operations, like <code>Content-Type</code> in http but for modifications. Sending a fresh copy of all the data with each update is bad.</li>
<li>The ability to reconnect from some point in time</li>
</ul>

<p>And we should use these APIs basically everywhere, from databases, to applications, and down into our kernels. Personally I've wasted too much of my professional life implementing and reimplementing code to do this. And because our industry builds this stuff from scratch each time, the implementations we have aren't as good as they could be. Some have bugs (fs watching on macOS), some are hard to use (parsing sysfs files), some require polling (Contentful), some don't allow you to reconnect to feeds (GraphQL, RethinkDB, most pubsub systems). Some don't let you send small incremental updates (observables). The high quality tools we do have for building this sort of thing are too low level (streams, websockets, MQs, Kafka). The result is a total lack of interoperability and common tools for debugging, monitoring and scaling.</p>

<p>I don't want to rubbish the systems that exist today - we've needed them to explore the space and figure out what good looks like. But having done that, I think we're ready for a standard, simple, forward looking protocol for data that changes over time.</p>

<p><em>Whew.</em></p>

<p>By the way, I'm working to solve some problems in this space with <a href="https://github.com/josephg/statecraft">Statecraft</a>. But that's another blog post. ;)</p>

<h3 id="inspirations">Inspirations</h3>

<p><a href="https://www.datomic.com/">Datomic</a> and everything Rich Hickey - <a href="https://www.youtube.com/watch?v=-6BsiVyC1kM">The Value of Values talk</a> is great.</p>

<p><a href="https://kafka.apache.org/">Kafka</a> and the event sourcing / DDD communities.</p>

<p><a href="https://www.apollographql.com/docs/react/advanced/subscriptions">GraphQL subscriptions</a></p>

<p><a href="https://rethinkdb.com/docs/changefeeds/ruby/">RethinkDB change feeds</a></p>

<p><a href="https://www.learnrxjs.io/">RxJS</a> / <a href="https://developer.apple.com/documentation/objectivec/nsobject/1412787-addobserver?language=objc">Obj-C observables</a> and everything in between</p>

<p><a href="https://svelte.dev/blog/svelte-3-rethinking-reactivity">Svelte</a></p>

<p><a href="https://firebase.google.com/">Firebase</a></p>

<p><a href="https://developers.google.com/realtime/overview">Google Realtime API</a> (Discontinued)</p>

<p><a href="https://martin.kleppmann.com/">Everything Martin Kleppmann does</a>. <a href="https://www.youtube.com/watch?v=5ZjhNTM8XU8">Fav talk 1</a> <a href="https://www.youtube.com/watch?v=v2RJQELoM6Y">Talk 2</a></p>

<p><a href="https://stateb.us">Statebus</a> / <a href="https://braid.news/protocol">Braid</a></p>

<p><a href="https://facebook.github.io/flux/docs/in-depth-overview.html#content">React Flux</a></p>]]></content:encoded></item><item><title><![CDATA[War over being nice]]></title><description><![CDATA[<h4 id="quickquestion">Quick question:</h4>

<p>Bob says something to James. James is upset and goes to have a cry about it. Who is responsible for James being upset? Is it Bob, for being mean? Or is it James, because he obviously has emotional development to do? If there's 100 points of responsibility, how</p>]]></description><link>https://josephg.com/blog/war-over-being-nice/</link><guid isPermaLink="false">9f89f785-df1c-433a-8725-e46047a0b14b</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Wed, 19 Sep 2018 05:42:25 GMT</pubDate><content:encoded><![CDATA[<h4 id="quickquestion">Quick question:</h4>

<p>Bob says something to James. James is upset and goes to have a cry about it. Who is responsible for James being upset? Is it Bob, for being mean? Or is it James, because he obviously has emotional development to do? If there's 100 points of responsibility, how would you apportion it out?</p>

<p>Would the answer change if I told you Bob and James are 5 year old children?</p>

<p>Would the answer change if they were actually two 5 year old girls?</p>

<p>A grown woman says something mean to her boyfriend, and he runs off and has a cry about it. Is she abusive? Is he overly sensitive?</p>

<p>A grown man says something mean to his girlfriend, and she runs off and has a cry about it. Is he abusive? Is she overly sensitive?</p>

<hr>

<p>I think there's another element of the current culture war that nobody talks about. And that is, who is responsible for emotions?</p>

<p>I'm going to describe two cultures:</p>

<p>In <strong>culture A</strong>, everyone is responsible for their own feelings. People say mean stuff all the time - teasing and jostling each other for fun and to get a rise. Occasionally someone gets upset. When that happens, there are usually no repercussions for the perpetrator. If someone gets consistently upset when the same topic is brought up, they will either eventually stop getting upset or the people around them will learn to avoid that topic. Verbally expressing anger at someone is tolerated. It is better to be honest than polite.</p>

<p>Respect comes from how you contribute to the shared values of the group. At work, you get respect by doing your job well. Amongst your friends you get respect for being an easy person to keep as a friend - maybe you organise events, or make everyone in the group laugh. Respect flows from action to person. If my actions embody a shared value, I am respected as the person who carried out those actions. At work my social respect is tied into how well I do my job. If I don't meet deadlines or quotas, I will lose social respect. <a href="https://www.youtube.com/watch?v=gu7mDA-b8wM">"If you can't sell shit, you <em>are</em> shit"</a>.</p>

<p>Conflict is often resolved simply and quickly - if someone has a problem with someone else, they can say so immediately and openly. They can express their anger in a hostile way if they want to. And the other party is welcome to respond in kind. At its worst this looks like barely restrained violence. But at its best this often looks like open, comfortable and fun goal-oriented ribbing.</p>

<hr>

<p>In <strong>culture B</strong>, everyone is responsible for the feelings of others. At social gatherings everyone should feel safe and comfortable. After all, part of the point of having a community is to collectively care for the emotional wellbeing of the community's members. For this reason it's seen as an act of violence against the community for your actions or speech to result in someone becoming upset, or if you make people feel uncomfortable or anxious. This comes with strong repercussions - the perpetrator is expected to <em>make things right</em>. An apology isn't necessarily good enough here - to heal the wound, the perpetrator needs to make group participants once again feel nurtured and safe in the group. If they don't do that, they are a toxic element to the group's cohesion and may no longer be welcome in the group. It is better to be polite than honest. As the saying goes, if you can't say something nice, it is better to say nothing at all.</p>

<p>Respect in culture B flows to you from the way you make people in the group feel. The core value of the group is "I want to feel supported and respected". In a work context, once someone has been hired they are welcome and included socially no matter how good or bad their work is. Making sure everyone feels welcome and included is held in higher regard than the work itself. "Be someone your coworkers enjoy working with."</p>

<p>Interpersonal conflict happens sometimes. Dealing with those conflicts is much more complicated than in culture A. You can't just have it out with the other person and yell at them! They might feel really unsafe, and tell everyone the awful things you said, how you said them and how you're a terrible person for doing so. This would hurt your reputation and social standing in the group. The worst version of this conflict culture playing out is the catty social dynamics you hear about at some high schools. But there are plenty of healthy ways this sort of conflict can play out. A much better way is for the people involved to go off on their own, take some time to figure out their feelings (alone or with a confidant) and then calmly bring their feelings back to the person or group. Nonviolent Communication is great for this - "When you say X, I feel Y. I have a need for Z, and that need isn't being met".</p>

<p>Dealing with interpersonal conflict in culture B in a healthy way requires huge skill. There are two big pitfalls:</p>

<ol>
<li>If you just tell them how you feel you might make the other person feel unsafe and uncomfortable. They might badmouth you to the group and you could be socially punished.  </li>
<li>To avoid upsetting anyone you bottle your feelings up inside and don't express how you feel. This is really unhealthy - it makes people neurotic and depressed.</li>
</ol>

<p>There's another way to think about this. Most people don't have the skills to both express their feelings of frustration and anger, and make sure social harmony is maintained at the same time. In this case, do they err on the side of maintaining social harmony at the cost of their own needs? Or do they get their needs met and damn the social cost?</p>

<p><img src="https://josephg.com/blog/content/images/2018/09/Dmrkj8GUUAAu4n7.jpg" alt="Thomas-Kilmann Conflict Modes"></p>

<p>Most people (certainly most of us when we're young) haven't learned how to be a collaborator in the Thomas-Kilmann Conflict Modes analysis. Where do you fall back to? In culture A people fall back to the top-left competing corner. "They don't have to like me, but at least I'll get the job done and get paid". In culture B people fall back to the bottom right, accommodating corner. "The outcome wasn't great but we all enjoyed doing it together."</p>

<hr>

<p>Do these social strategies feel gendered to you? They do to me - if I play the association game, culture A feels masculine. It's "bro culture". It's "guys being guys". And culture B feels feminine. Weirdly, I don't know any shorthand names for this culture. Maybe "inclusive community building", or "safe spaces"? But plenty of people are gender-atypical here. I know lots of male-gendered people who feel more comfortable in culture B. <em>I</em> feel more comfortable in culture B. And I know plenty of women who feel much more comfortable in culture A. <a href="https://www.youtube.com/watch?v=vvrxQ4M3EOo">Camille Paglia</a> is a great example of a feminist who fights for culture A.</p>

<p>I suspect that the gendered assumptions here are related to our expectation that women have better social skills than men and spend more time talking about feelings. You need those skills to navigate culture B successfully.</p>

<h2 id="war">War</h2>

<p>I feel like there's a war being waged right now against culture A. Communities need to be inclusive and welcoming. Not doing so is immoral, sexist and exclusionary. The sexism argument is this:</p>

<ul>
<li>Women don't feel comfortable in culture A</li>
<li>Therefore culture A is unwelcoming to women, and thus sexist</li>
<li>Any community operating under culture A is sexist and male dominated</li>
</ul>

<p>It's true that culture B participants are disadvantaged professionally. Imagine if you hold your tongue in deference to the feelings of those around you, but your coworkers just say what's on their mind. They will get their needs met much more than you will. The system of asking for raises is a culture A thing - just say what you want, and it's your boss's responsibility to not be offended. Unsurprisingly, women ask for raises at a much lower rate than men, and thus often end up being paid less.</p>

<p>I don't believe culture A is inherently sexist. It's just a different set of cultural norms. The cultural relativism lens suggests that we can't judge culture A through the values of culture B. But it seems like at a societal level we're still figuring out how to answer the question of what work culture should look like given mixed gender participants.</p>

<p>This has been playing out recently in the vitriol levelled against Linus Torvalds. He's the epitome of Culture A. He's <a href="https://lkml.org/lkml/2018/9/16/167">finally acknowledged that yelling at people over email is a bad idea</a>. Now that that's happened, my twitter feed is full of folks saying an apology isn't enough. Because of course, in culture B, apologies aren't enough to make things right. It's Linus's responsibility to restore the feeling of social cohesion and inclusion that his abusive tirades have eroded.</p>

<p>One person I spoke to on twitter said that Torvalds has caused damage by his rants, that he made people feel unsafe and uncomfortable in the tech space through perpetuating that culture. That makes sense from the perspective that culture B is the only non-sexist way to be.</p>

<p>In the 50s workplaces were male dominated. I'm sure culture A was king everywhere. In that world, the call that "We need workplaces that let culture B people (women) thrive" made all too much sense. But we don't live in that world any more. I'm far from convinced that a complete societal purge of culture A is possible or healthy. Amongst other things, the attempt to do so is causing a crisis amongst young boys, who are dropping out of school and university at alarming rates (Ref: <a href="https://www.amazon.com/dp/B079SGJG13/ref=dp-kindle-redirect?_encoding=UTF8&amp;btkr=1">The Boy Crisis</a>) and there's an epidemic of male suicide.</p>

<p>I also have a few female friends who feel much more at home amongst the more masculine culture A. I would paraphrase their perspective as something like this:</p>

<blockquote>
  <p>I've always felt more comfortable with male company. As an adult I have very few female friends. Growing up I had several experiences where my female friends rejected me and got really angry with me for reasons that never really seemed to make sense. The people I feel most comfortable with will rib me and give me crap, and I can do the same back to them and we have a laugh about it. I am a little creeped out by the growing push even amongst my male friends to talk about their feelings, and be sensitive. I've gotten in trouble a few times even amongst my male friends for saying the wrong thing, and I'm constantly a little anxious I'll put my foot in it and ruin some of the few friendships I have.</p>
</blockquote>

<p>And so I'm growing to think that the fierce war against culture A in the workplace is a bit misguided and wrong. I say this as someone who feels most comfortable in culture B anyway - I want everyone to feel comfortable and included. But I also respect the virtues of culture A - openness, emotional honesty, directness, taking responsibility for one's emotions and self sovereignty. And I think it's important that we care about the outcome of our work, and for that to happen I need my coworkers to feel comfortable expressing disagreement. If one of my employees isn't performing, I don't want it to be a social faux pas to tell them so.</p>

<h2 id="consent">Consent</h2>

<p>The way this conflict interacts with consent culture is complicated and interesting.</p>

<p>To set the stage, by 'consent culture' I'm referring to the practice of asking for consent before doing anything intimate, at least for the first time. "May I kiss you" / "Is it ok if I do this?", etc. I'm going to establish two claims:</p>

<ol>
<li>If you don't feel safe saying "no", saying "yes" is meaningless. If I put a knife to your throat and ask if you consent to giving me your wallet, consent is coerced and the fact that you consented doesn't get me off the hook for the crime. If you personally don't feel comfortable ever saying "no thanks" when a partner asks for sex, you will eventually, inevitably have non-consensual sex.  </li>
<li>If you feel completely comfortable and safe saying no, and you have time to contemplate your decision, then verbally asking for consent is somewhat meaningless. After dinner your partner moves forward to kiss you seductively. You know they want to kiss you. You can feel the desire radiating off them to get their hands on your sweet bod. If you feel totally comfortable and safe saying "no thanks; maybe later" - then your partner doesn't need to verbalise the "may I kiss you" question that's hanging on their lips.</li>
</ol>

<p>So, then, why has there been such a strong push for verbal consent? I suspect it's because relationships are one of the places where people from these two cultures meet. He's from culture A, and expects everyone to feel comfortable knowing and expressing what they want all the time. She's from culture B, where people are expected to take care of the feelings of those around them. She's been brought up to think that if anyone takes offence to what she says, she's (maybe!) done something wrong.</p>

<p>The problem that asking after consent addresses is that she <em>needs</em> to explicitly consider her own feelings. Nobody wants her to have mediocre, unenthusiastic sex that she later regrets. So we have this ritual of consent - "May I touch you here?" "Yes you may". The thing the consent asker is really trying to do here is make the other person feel comfortable saying no, while hoping they will say yes.</p>

<blockquote>
  <p>A couple years ago I had a new partner. We were dating for a couple of weeks and I went out to their place to hang out. I was excited to see them, and made a few physical bids (hugs, light touch, etc). They didn't push me away but were weirdly standoffish. I asked, and they said nothing was wrong. We went for a bush walk, and we walked in silence - during which they kept a few meters away lost in thought. Had I done something wrong?</p>
  
  <p>Eventually we talked about it - they didn't want to be physically intimate with me that day. It felt wrong to them. But it turned out they had been punished by men for saying no in past relationships. Somehow they felt like once we were 'dating', they weren't allowed to revoke consent for physical touch. We talked about it, and then practiced. I physically came on to them, and they said "no thanks" or "no fuck off" or shoved me away. It was super emotional, and honestly really good for both of us.</p>
</blockquote>

<p>I've told that story a few times, and the interesting part is the lingering question of "Would you actually feel comfortable saying no to a partner?". I struggle with this myself! (As I said earlier, I default to the female-typical emotional caretaking / appeasement of culture B.)</p>

<p>A fun game I like to play in new relationships is asking my new partner to practice revoking consent with me. As I see it, the point of asking for consent is to help your partner feel comfortable saying "no". If they want to refuse consent but don't feel comfortable doing so, we'll have all sorts of gross conflict. The goal is to establish the kind of masculine culture A trust here. I want you to trust that I can handle hearing 'no'. I want to trust that you will be selfish and say 'no' when you want to. If we can do that, then <em>and only then</em> can we have an actually consensual physical relationship. And as a bonus, if you can establish that sort of trust, you shouldn't need verbal "seek an enthusiastic yes" consent with that person. (Though it can still be fun.)</p>

<hr>

<p>I never know how to close long rants like this. I suppose my perspective is this:</p>

<ul>
<li>I think there are two cultures at play. I've called them A and B. We can call them masculine and feminine, competing and accommodating, bro culture and inclusive culture.</li>
<li>Each of these cultures has good parts and bad parts. The masculine culture A deals more comfortably with conflict, while culture B helps members of the group feel safe and supported.</li>
</ul>

<p>Where do you stand? What is your default? How comfortable are you with the other culture? I think the healthiest version of me is comfortable navigating successfully in both of these cultures, and I think that's a general pattern.</p>

<p>What culture do you operate under in your interpersonal relationships? I find it interesting that I almost never have heated fights with partners. I've previously thought that was something to boast about, but this year, thinking about this, I've realised that it's because I am not very good at culture A style relationships. I have friends who have heated fights with their partner all the time, and for whom that seems healthy. (Research backs up that claim by the way. Speaking about arguments, <a href="https://www.gottman.com/blog/a-is-for-arguments/">Gottman says</a>: "If they don’t or can’t or won’t argue, that’s a major red flag. If you’re in a “committed” relationship and you haven’t yet had a big argument, please do that as soon as possible.")</p>

<p>Culture A teaches us that self-determination and listening to one's own goals are important to all of us from time to time. Culture B teaches us that we build better communities by respecting the feelings of others. We need both of these skills sometimes, in every context. Consent is improved under the wing of culture A, but relationships in general are improved with lessons from culture B.</p>

<p>Arguing under the banner of "fighting for diversity" that culture B is the only acceptable culture is ironic and a little sad. We aren't all the same. Maybe it's ok if workplaces reflect the diversity that exists in who <em>we</em> are as people. I don't want to be tyrannised by the need to be nice, from others or out of shame and guilt. Being nice out of obligation is like mandated consent - it's impossible to achieve and it makes a liar out of everyone who tries. It's impossible to be authentically generous if you expect punishment for not doing so.</p>

<p>I think we need to accept and allow that some workplaces will stay in the classic masculine culture A style. And we need to negotiate an acceptable middle ground, where we accept both that others are affected by what we say, and that nobody can decide how you feel without your consent. Assessing culture fit at a new workplace should go both ways - during a job interview you should decide if the place you're considering working will be a good space for <em>you</em> to learn and grow.</p>

<p>And ultimately, if you feel more comfortable amongst emotionally accommodating communities, join organisations like the Rust community rather than the Linux kernel community. Vote with your feet. If the Linux community wants to slurp up all the people you don't want to be working alongside, let them. The modern world is big enough to fit us all.</p>]]></content:encoded></item><item><title><![CDATA[A dozen ideas for a better Fortnite]]></title><description><![CDATA[<blockquote>
  <p>Edit (2020): This was written before Fortnite became the battle royale game it is today. When I wrote this, "Fortnite classic" was the whole game. Epic went a totally different direction than I was thinking about.</p>
</blockquote>

<p>I've been playing a bunch of <a href="http://fortnitegame.com">fortnite</a> over the last week. Below is a</p>]]></description><link>https://josephg.com/blog/fixing-fortnite/</link><guid isPermaLink="false">eb32f8e0-001e-4635-9017-3604096efd72</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Mon, 31 Jul 2017 04:11:30 GMT</pubDate><content:encoded><![CDATA[<blockquote>
  <p>Edit (2020): This was written before Fortnite became the battle royale game it is today. When I wrote this, "Fortnite classic" was the whole game. Epic went a totally different direction than I was thinking about.</p>
</blockquote>

<p>I've been playing a bunch of <a href="http://fortnitegame.com">fortnite</a> over the last week. Below is a bunch of things I've been thinking about while playing. This is basically a letter to the dev team that got too long for a reddit comment. This won't make any sense if you haven't played the game.</p>

<p>I've pumped about 30 hours into fortnite over the last week or so. The devs have done an absolutely amazing job on the engine, but the game design itself feels like it's trying to do a million things and each one needs some more design polish.</p>

<p>I want to pull apart some of how the game works because I enjoy it so much, and I can't stop thinking about this stuff while I play.</p>

<p>For me, fortnite has 4 sources of fun:</p>

<ul>
<li>Shooting zombies</li>
<li>Building bases and trap mazes</li>
<li>Scavenging for materials ("I need more nuts and bolts! Oh no my gun is breaking - need more silver!")</li>
<li>Levelling up &amp; getting more powerful</li>
</ul>

<p>But before I get into that, a more general point: <strong>the game should start much harder</strong>. I didn't fail my first mission until plankerton. Right now lots of fun mechanics can be completely ignored in the early game because you're too strong (like mazing and teamwork). This makes stonewood boring - you are missing some of the best parts of the game. And it makes the later levels feel jarring and unfair. In comparison, in Zelda: Breath of the Wild most players (myself included) die from the first few mobs you see in the world - 5 minutes after the game starts. That moment frames the whole game in this sense of - "Oh! I need to actually think about how I fight" and that makes every subsequent fight more interesting.</p>

<p>With that aside lets go through the mechanics.</p>

<h3 id="shootingzombies">Shooting zombies</h3>

<p>Shooting zombies is great fun and feels good. I love the guns, the headshot mechanics and the elemental damage. The variety of enemies is great, and I love how they push you to play in different ways and move around and defend your base dynamically instead of just camping. Very well done, 5 stars.</p>

<h3 id="bases">Bases</h3>

<p>Building bases is fine, though Stormshield is really let down by how disconnected it is from the rest of the game. It feels like a minigame instead of your home base. It's awkward to get building materials into your base, and it seems common to run out of nuts and bolts (and thus ammo) while doing base defence missions. It's weird that you can only defend your base 10 times, and that because of how resources work you can't do them in a row. It's also weird that you get punted out after beating the last wave. It feels like stormshield is trying to be your home base (crafting there is persistent). But it's also trying to be a mission that you complete. I think the home base concept is great, and you should make it feel more like that:</p>

<ul>
<li>After you beat some enemies, you shouldn't get punted out immediately.</li>
<li>You should be able to push the button again and do the next few waves</li>
<li>There should be a better way to get ammo and crafting mats while you're in your base (more on that later).</li>
<li>Maybe you should be sent to your stormshield automatically after every mission.</li>
<li>Maybe add an endless mode after you've done the 10 missions.</li>
</ul>

<p>That would be a start. </p>

<h3 id="traps">Traps</h3>

<p>Building trap mazes is really fun, but it's hard to discover that fun because of how difficult it is to learn how the zombies path. It's obvious this is a problem by looking at how people in stonewood defend objectives - most people have no idea how to make effective traps, and just end up shooting the zombies instead. This is compounded by the fact that the game is way too easy at the start - so players don't need to learn or engage with one of its most fun elements.</p>

<p>One out-of-game solution to this would be a set of mazing challenge levels (like starcraft 2) where you're dropped in with a limited pile of traps and construction resources and a very weak gun and you need to defend a fixed point against a fixed attacking force.</p>

<p>Another approach would be to draw lines in the world showing the path the zombies will take to attack your base (like Sanctum) so players can experiment with wall placement and learn how different wall configurations affect how the zombies will path. These lines could be taken away later once players have learned how to maze. (Either let players turn the lines off for extra rewards or just take them away from plankerton missions onwards.)</p>

<p>Imagine if stonewood had about 3x as many zombies in the objective waves but you could see exactly where they'll walk before you start the encounter. Building mazes would be so much more fun and interesting - and you'd learn and enjoy the game's mechanics much more.</p>

<h3 id="loot">Loot</h3>

<p>Next, let's talk about loot. Inventory management with crafting supplies simply isn't fun. As a player I have to make uninformed choices ("will I need these 3 stacks of ore later?") and I'm punished for deciding wrong. I think I can guess what the game designers are trying to do - to both push you into hunting down particular items ("I need more silver!") and to building different things ("I don't have enough mats to keep making guns - better use more traps instead"). Instead it feels like 90% of the materials in the game are just useless junk that clogs up my inventory. I enjoy hunting down particular crafting mats, but because the XP system encourages you to invest deep in particular items there isn't much variety in what I need. And the experience is a mess ("Ugh my inventory is full again... oh god 4 stacks of planks? Ugh")</p>

<p>The first change I'd make is to remove crafting mats from the backpack. Instead make all items work like wood, stone and metal. Make them stack up to a limited number and that's it. That's how Crashlands works, and it's great. The backpack can still be used for weapons and traps - but (obviously) it could be much smaller.</p>

<p>Gating weapon evolution by the materials you need to craft them is interesting - but the game should never punish you for spending XP. You should still be able to craft weaker versions of a schematic you own using the weaker materials.</p>

<p>I'm not sure what to do about the million mats thing. It feels like it's just too much right now. Maybe it would be fine if there were about half as many different materials. Or maybe it's just a UI thing - and if you could hold 1 stack of each material you could have a much more consistent inventory UI and it wouldn't feel like a "how do I hold all these lemons" shaped mess. I'm imagining a 1 star tab for the stonewood mats, and a 2 star tab for the plankerton mats and so on - using the same layout for each (ores, plants, etc).</p>

<p>I'd also consider removing the stormshield storage completely. Or maybe let the player hold 10 of each crafting mat in missions and autodeposit it all in stacks of 100 (or 500 for nuts and bolts) when you finish the mission. When you craft it uses what you're holding, or pulls from stormshield if you don't have enough.</p>

<p>Or not! A more wild suggestion would be to let stormshield actually be home base. Make the player plan ahead (crafting guns and traps there), before the mission starts. When the mission starts, the player's crafting materials inventory is empty. It might still make sense to let players craft weapons more in the world - but only using what you find.</p>

<p>Speaking of inventory limits, maybe the weapons and traps should have separate inventory limits. Right now it seems like the game is trying to make me choose whether to invest in guns or traps (they use the same schematic XP and compete for inventory space). I see what you're doing there, but I disagree. I think you should push all players to do a bit of both. I would separate the schematic XP into weapon XP and trap XP, and separate out bag space into weapon space and trap space. Once crafting mats are taken out of the backpack you can imagine starting the game with space for 10 weapons &amp; traps. Maybe better still would be to start with space for 5 weapons and 5 traps.</p>

<h3 id="xp">XP</h3>

<p>Speaking of XP, right now I feel like there's a big imbalance in how progression works in the game. If you imagine the ratio of progression from quests vs progression from running around in missions, it feels like it's slanted towards quests. I think the game would be more fun if getting XP for heroes and items was tied closer to actually using those items in game.</p>

<p>A simple fix would be to just weaken the quest rewards and raise the mission rewards. A more classic progression system here might work better.</p>

<p>But I'd consider something more wild like this: Maybe guns and traps should level up by type. Whenever a spike trap does damage, you get spike trap XP. The XP is applied automatically to all spike trap patterns now and any you find later. Want a good push trap? Easy - once enough enemies have been pushed with your push traps your trap will level up. Want a good healing trap? Make &amp; use the low level traps to unlock the better ones. That would explicitly reward you for using those traps that seem useless ("well, at least I'll level it up!") and reward players more for being involved in base defence - and not just shooting stuff. It would make team dynamics harder though - it might be annoying if you built lots of traps only to have someone shoot the zombies before they reach them. But again, if there were 3x as many zombies from the start of the game I think it'd be ok.</p>

<h3 id="experimentation">Experimentation</h3>

<p>Right now the game has a problem where it punishes you for experimenting with different guns. Found a cool sniper rifle pattern? Shame it does no damage unless you invest a zillion points into it. The same system here might help - it would at least reward players for using low level weapons because you would level up those weapons. You'd probably also want to make low level weapons level up faster in later content - but maybe that would happen automatically if it was based on the damage done by the weapon (because of FORT stats). That would also incentivise the player to use weapons they found in the world - if you do a bunch of damage with that random axe you found in a chest, your axe schematics will all get stronger, and if you want to use an axe later you'll be able to craft a level 3 axe instead of a level 1 axe.</p>

<p>But this doesn't fix the bigger problem of punishing the player for experimenting with different guns. The more I think about it the more impressed I am with D3's progression system - and how easy it is to swap out skills. I'm not sure what the answer for fortnite is though. I think fortnite would lose some texture if all your weapons levelled up together.</p>

<p>A middle option might be that instead of points going into your particular assault rifle schematic XP you could put your points into assault rifles in general. Then you could happily swap out all the different rifles you own (and try new ones) without cost.</p>

<hr>

<p>Anyway, I hope that's some food for thought. The engine is fantastic and the devs should be really proud of it. I hope by launch the rest of the game can reach that fantastic level of quality.</p>]]></content:encoded></item><item><title><![CDATA[3 tribes of programming]]></title><description><![CDATA[<p>There's an old joke that computer science is a lie, because it's not really about computers, and it's not really a science.</p>

<p>Funny joke. Everyone laughs, then someone says "Yeah but it sort of is about computers though, isn't it?". Feet shuffle awkwardly. Someone clears their throat and before you</p>]]></description><link>https://josephg.com/blog/3-tribes/</link><guid isPermaLink="false">fb3afdf9-4258-49a6-bdb3-329f7c9c61f8</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Wed, 03 May 2017 06:51:57 GMT</pubDate><content:encoded><![CDATA[<p>There's an old joke that computer science is a lie, because it's not really about computers, and it's not really a science.</p>

<p>Funny joke. Everyone laughs, then someone says "Yeah but it sort of is about computers though, isn't it?". Feet shuffle awkwardly. Someone clears their throat and before you know it you're talking about Category Theory and looking up the <a href="https://en.wikipedia.org/wiki/Algorithm#Historical_background">history of the word algorithm</a>.</p>

<p>Out in the wild, these arguments look like this:</p>

<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">I think I agree, and am looking forward to hearing Joe&#39;s take on it <a href="https://twitter.com/hashtag/deconstructconf?src=hash">#deconstructconf</a> <a href="https://t.co/j7H2QWG0Tr">pic.twitter.com/j7H2QWG0Tr</a></p>&mdash; Andy Lindeman (@alindeman) <a href="https://twitter.com/alindeman/status/855557506881396736">April 21, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">I&#39;ll happily renounce &quot;programmer&quot; in favor of &quot;applied mathematician&quot; or something, whatever it takes to avoid C <a href="https://t.co/DsIEo5x4uI">https://t.co/DsIEo5x4uI</a></p>&mdash; Chris Martin 🐘🎺🍍 (@chris__martin) <a href="https://twitter.com/chris__martin/status/855559372927381505">April 21, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>The speaker was making the point that the whole modern stack in our computers (Kernel, OS, browser, VM) is written in C + ASM. So you should know C and ASM.</p>

<p>Is that really important? Serious question, are programs foremost lists of instructions, or expressions of logical ideas?</p>

<p><blockquote class="twitter-tweet" data-lang="en"><p lang="und" dir="ltr"><a href="https://twitter.com/hashtag/deconstructconf?src=hash">#deconstructconf</a> <a href="https://t.co/V2lGXwmaJM">pic.twitter.com/V2lGXwmaJM</a></p>&mdash; Justin Falcone (@modernserf) <a href="https://twitter.com/modernserf/status/855555296797667328">April 21, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>Or maybe it's neither, and programs are just things we <em>make</em> for other <em>humans</em>. A message is fundamentally meaningless without an audience who reads it. Are programs meaningless without reference to the outside world they interact with?</p>

<p>A friend bragged to me once about how he could prove that most programs were correct and completely bug-free using Ada. I asked him if he could prove that this function was correct:</p>

<pre><code>fn sub(a, b) { return a + b }
</code></pre>

<p>He said "Of course, that's easy". So I asked how his prover would discover that the function had the wrong name, and he got delightfully flustered.</p>

<h2 id="tribes">Tribes</h2>

<p>Programs, obviously, hold all of these properties. But I think there's fundamentally 3 archetypes of <em>programmers</em>, divided by which ideals we hold in highest esteem:</p>

<ul>
<li>You are a poet and a mathematician. Programming is your poetry</li>
<li>You are a hacker. You make hardware dance to your tune</li>
<li>You are a maker. You build things for people to use</li>
</ul>

<p>We self-select into communities of our peers based on these ideals. We use coded language to express these ideals to our peers.</p>

<p>I think each group has its own conferences, and its own subreddits. Its own programming languages, heroes and its own villains.</p>

<h3 id="programmingasappliedmathematics">Programming as applied mathematics</h3>

<p>The first camp views programming as fundamentally an expression of thought - a kind of mathematical poetry which we can gift with life. The fact that we execute these programs on von Neumann machines is an implementation detail.</p>

<p>With this mindset, these details are important:</p>

<ul>
<li><strong>Source code</strong>: The source should read like poetry - dense, with very few lines of code needed to express an idea. Once understood, the terse program seems like a beautiful and obvious description of your program. It is more important that the source code is simple than the execution is simple or fast. High level languages are better than low level languages because they let you express your intent more clearly.</li>
<li><strong>Execution</strong>: How the program is executed by the computer is an implementation detail of the compiler. It is more important that the code is simple than the execution is fast.</li>
<li><strong>Correctness</strong>: A program is correct if it implements the spec exactly. The best programs use tools like Ada to formally prove correctness.</li>
<li><strong>UI</strong>: How the code interacts with humans is a separate consideration from its implementation. Beautiful code is more important than beautiful UI.</li>
</ul>

<p>Examples: <a href="https://www.youtube.com/watch?v=f84n5oFoZBc">Rich Hickey</a>, <a href="https://vimeo.com/36579366">Bret Victor</a></p>

<p>These programmers are probably the least common, although that might be because it's hard to get a job working like this. <a href="https://stackoverflow.blog/2017/02/07/what-programming-languages-weekends/?cb=1">Haskell</a> has the highest weekend/weekday usage ratio of all languages on Stack Overflow.</p>

<p>Most (arguably all) of the modern advancements in programming languages come from people in this camp. If you've used React to make a website, you should know that the model of immutability and expressing your view as a pure function from data to DOM came from functional programming. Actually, most modern language features are invented by people who think of programming as thought. Years (or decades) later, those features get copied into the more popular languages and get treated as new ideas.</p>

<p>I have a friend who spent months loving <a href="http://www.jsoftware.com">J</a>. He eventually wrote a little game in J. He described his code as this perfect, beautiful crystal. Later he wanted to make it multiplayer - but to do that he would have to deal with lag. And that would require ripping apart some of the beautiful internal flow. He couldn't stomach it, so instead he abandoned the project entirely.</p>

<p>That story is <em>funny</em>, but I'm a little jealous of my friend. I bet he learned a heap and had a great time. Experiences like that make us better programmers.</p>

<p>I did a Haskell short course late last year and I challenged the main instructor. I told him "this is all well and good, but I bet I can still make useful software using my practical languages faster than you can". He said no way - using Haskell he was convinced he could implement anything I could implement, faster and better and with less code. We didn't test the claim - but I still wonder - is he right?</p>

<p><strong>Favorite languages</strong>: Haskell, Lisp, ML (OCaml, etc), Clojure, Ada</p>

<p><strong>Hangouts</strong>: FP meetups, <a href="http://lambda-the-ultimate.org">Lambda the ultimate</a>, Strange Loop, <a href="https://www.data61.csiro.au">Research</a>.</p>

<p>And of course, <a href="http://steve-yegge.blogspot.com.au/2010/12/haskell-researchers-announce-discovery.html">Steve Yegge making fun of this tribe</a></p>

<h3 id="programmingashardwarehacking">Programming as hardware hacking</h3>

<p>The second camp views programming as fundamentally tied to the <em>machinery</em> of the computer. No program is run without a computer, therefore to program effectively we must keep the computer in mind at all times - hardware and software.</p>

<p>Elegance and beauty come not just from a simple code base, but from that codebase using the hardware in an elegant and efficient manner.</p>

<p>With this mindset, the priorities look like this:</p>

<ul>
<li><strong>Source code</strong>: The code should be clean, but clean code is less important than a clean execution. Low level languages are often better than high level languages because you can be more explicit about <em>what the computer will do</em> when it executes your code. (Thus you have more room to optimize).</li>
<li><strong>Execution</strong>: How the computer executes your code is paramount. Programming without thinking about execution is just begging for slow performance.</li>
<li><strong>Correctness</strong>: A program is correct if it runs the way you expect it to run, given normal parameters. Execution elegance is more important than correctness. And if a theoretical issue can't happen in practice due to how the machine works, <a href="https://twitter.com/pomeranian99/status/858856994438094848">it's not a real bug</a>. A program must be adequately fast to be considered correct.</li>
<li><strong>UI</strong>: How the code interacts with humans is a separate consideration from its implementation. It's ok to let the constraints of the hardware guide the user experience.</li>
</ul>

<p>Examples: <a href="http://queue.acm.org/detail.cfm?id=1814327">Poul-Henning Kamp</a>, <a href="http://www.pagetable.com">Michael Steil</a>, <a href="https://www.youtube.com/watch?v=Tfh0ytz8S0k">The 8-Bit Guy</a></p>

<p>The key here is thinking about the entirety of the computer and your running program, together. According to this community, the best (only?) way to write good software is to think holistically about how it will run, and how our program will interact with the rest of the hardware and software. Doing that well achieves <a href="https://groups.google.com/forum/#!forum/mechanical-sympathy">mechanical sympathy</a> and everything runs like a well oiled clock. The joy is like that of driving a manual car that you can <em>hear</em> and <em>understand</em>.</p>

<p>Anything that obfuscates how the computer will execute your program is dangerous for the implementor - because it adds more variables to consider. Thus, people in this camp often deride garbage collectors, or the churn from JS performance benchmark results changing how we should write our code. Undefined behaviour in C compilers is an <a href="https://github.com/sandstorm-io/capnproto/blob/master/security-advisories/2017-04-17-0-apple-clang-elides-bounds-check.md">ongoing point of contention</a>.</p>

<p>In modern app development our computers are fast enough that this kind of thinking isn't really important any more. A few decades ago you needed a deep understanding of how the computer works to write software. But now basically any language you use is fast enough, so why bother learning C? Most web developers I know don't know any C, and have no interest in learning about pointers or manual memory management.</p>

<p>But this sort of work is still hugely valuable in lots of areas. The game development community still writes most code in C++ (though unity is slowly changing that). Security engineers need a systematic understanding to find vulnerabilities. Embedded systems engineers can't afford to waste cycles and RAM, and once backend systems get big enough performance starts mattering again.</p>

<p>And even when it's not practical, being forced to think about the machine can be really <em>fun</em>! For example, the <a href="https://www.lexaloffle.com/pico-8.php">PICO-8</a> imposes arbitrary 'hardware' limits to force you to be clever when designing your games.</p>

<p>To this community we owe almost all performance improvements in our computers, above and beyond what is demanded by customers. Nobody else cares about performance quite like people who think about the hardware all day. But if you're thinking about your computer as a machine, what greater ugliness can you inflict than pointless work?</p>

<p>I'm really curious if Rust will take off amongst this community. Rust is essentially a language designed <em>by</em> language nerds in the first camp above, <em>for</em> people who care about runtime efficiency. Will they take to it? Will future game engines be ported to Rust?</p>

<p>Conflicts with the first group:</p>

<ul>
<li>Mutability (memory is fundamentally mutable / but it makes our programs harder to understand)</li>
<li>GC (it makes your program slow and janky / but less buggy!)</li>
<li>Abstraction (you're making your program harder / easier to reason about)</li>
</ul>

<p><strong>Favorite languages</strong>: C, C++, Assembly.</p>

<p><strong>Hangouts</strong>: Hackerspaces, Game dev shops, database companies, CCC, Defcon.</p>

<p>And here's Bret Victor <a href="https://vimeo.com/71278954">making fun of this tribe</a>.</p>

<h3 id="programmingasatooltomakethings">Programming as a tool to make things</h3>

<p>The final group see programming as a means to a beautiful end, not something made beautiful through its construction. The way people in this camp describe themselves is fundamentally pragmatic. They write software because software is useful to other people.</p>

<ul>
<li><strong>Source code</strong>: The code should be clean, but only because cleaner code is easier to iterate on. Code cleanliness is less important than most other considerations.</li>
<li><strong>Execution</strong>: The program only has to be fast enough for the users. If you make it even faster, you're taking time away from adding features that people care about more.</li>
<li><strong>Correctness</strong>: Bugs are bad only in proportion to their impact. The program should act the way the users expect it to act.</li>
<li><strong>UI</strong>: The UI is more important than anything else. Every other part of the program only exists in service to the user interface.</li>
</ul>

<p>I think most professional software engineers are in this tribe - which makes sense, because this is the place where it is easiest to make money writing software.</p>

<p>In my experience people in this camp are better at community. They seem to be much more positive and encouraging of new members, and willing to help. I guess that's because in the other two camps you can tell whether you're doing a good job just by looking at your own work. If you make software for other humans, satisfaction comes from making the people around you happy.</p>

<p>I can't help but feel that this place is a touch soulless. Taken to the extreme, this world view doesn't value the beauty in the engineering itself. Although you could probably make the opposite criticism against the other groups - they don't value how their software can impact the world.</p>

<p>There's a lot of tension between this camp and the other two camps I've talked about. And it can get a bit mean. I know many product people who feel self-conscious about their lack of knowledge of traditional data structures and algorithms. They feel judged by "real" programmers because they can't implement obscure algorithms and binary framing protocols. The way people in this tribe see it, other people have already implemented all that stuff anyway. So who cares?</p>

<p>That's fair, but it's also true that lots of issues are caused by a lack of technical skill amongst frontend engineers. This is mostly self-correcting - if your program is too slow, you know about it and can fix it. But security is a real danger. If you don't know how to secure the software you write against hackers, <a href="https://www.troyhunt.com/reckon-youve-seen-some-stupid-security-things-here-hold-my-beer/?utm_source=hackernewsletter&amp;utm_medium=email&amp;utm_term=fav">it's probably not secure</a>. And you might not know it's a problem <em>even if you get hacked</em>.</p>

<p>Here's an example of this conflict playing out on twitter:</p>

<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr"><a href="https://twitter.com/jdan">@jdan</a> Well, then you&#39;re not a very good programmer. Sorry but that&#39;s how it is.</p>&mdash; Jonathan Blow (@Jonathan_Blow) <a href="https://twitter.com/Jonathan_Blow/status/609156243370975232">June 12, 2015</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>For context, Jonathan Blow (famous indie game developer) is saying that if you can't reverse a binary tree you're not a good developer, even if you write useful software every day.</p>

<p>Is he right? Well it depends on what 'good developer' means, and that depends on which tribes you care about. I think Blow <a href="https://www.youtube.com/watch?v=gWv_vUgbmug">is in camp 2</a>, so you're judged based on how much you know. @jdan is in camp 3, so he's judged based on what he's made. Jonathan Blow certainly writes useful software, but one of the reasons his last game (The Witness) took so long to write was that he wrote his own engine instead of using something off the shelf. When <a href="http://the-witness.net/news/2011/06/how-to-program-independent-games/comment-page-2/#comment-3655">asked about it</a> (emphasis mine):</p>

<blockquote>
  <p>I don’t know very much about Unity. However, it’s clear that one could not build The Witness in Unity without rewriting a lot of Unity (or adding a lot of things that are not there, and declining to use most of what Unity provides). And we already had a simple graphics engine anyway. So <strong>when building our own systems, we can ensure that they are really what the game needs to be its best</strong>.</p>
</blockquote>

<p>I suspect he's wrong about the first part. But I'm mostly in camp 2 myself, so I understand wanting to write your own engine anyway. I probably would have done the same thing.</p>

<hr>

<p><strong>Fav languages</strong>: Whatever gets the job done. JS, Ruby, Python, Swift, C#.</p>

<p><strong>Hangouts</strong>: Twitter, SydJS, StackOverflow, A Company Near you!</p>

<p>And of course, Gary Bernhardt <a href="https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript">making fun of this camp</a>.</p>

<h1 id="aquietwar">A quiet war</h1>

<p>I think a lot of the conflicts and disagreements in our community can be expressed in these terms. And a lot of the misunderstandings between programmers.</p>

<p>For example, what should your programming language do when an integer overflows? If you think of programming like mathematical poetry, above all else it should give you the mathematically correct result. </p>

<p>Haskell (first camp):</p>

<pre><code>λ: 23^23
20880467999847912034355032910567 :: Num a =&gt; a
</code></pre>

<p>Vs C (second camp):</p>

<pre><code>unsigned long long x = 1ULL &lt;&lt; 63;
printf("%llu\n", x * 2); // overflows (wraps around). Prints 0.
</code></pre>

<p>And if you just want to ship a product, you don't care. JavaScript (third camp) has no integer type at all - it just uses floats for everything. And if you exceed their precision, tough luck.</p>
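<p>You can see the float behaviour directly in a node REPL (a quick illustration - nothing here is specific to any one engine):</p>

<pre><code>// JS numbers are IEEE 754 doubles: integers are only exact up to 2^53.
console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991
console.log(2 ** 53 === 2 ** 53 + 1); // true - the + 1 is silently lost
</code></pre>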

<p>Rust is trying to put one foot in each of the first two camps - a language made by programming language nerds which compiles to efficient code. And unsurprisingly, this problem generated a long argument in the Rust community. The final design <a href="http://huonw.github.io/blog/2016/04/myths-and-legends-about-integer-overflow-in-rust/">was this</a>: by default, overflow panics in debug builds, but silently wraps around in release builds.</p>

<hr>

<p>Rob Pike (co-creator of Go) was surprised by which tribe his language ended up appealing to. He <a href="http://commandcenter.blogspot.com.au/2012/06/less-is-exponentially-more.html">wrote this</a> a couple of years after Go was released:</p>

<blockquote>
  <p>I was asked a few weeks ago, "What was the biggest surprise you encountered rolling out Go?" I knew the answer instantly: Although we expected C++ programmers to see Go as an alternative, instead most Go programmers come from languages like Python and Ruby. Very few come from C++.</p>
</blockquote>

<p>Why? Well, C++ programmers are largely in camp 2 above. They want to know exactly how their code will run. But Go has a garbage collector and a fast compiler. Really, Go cares about getting out of your way so you can just make stuff. It's a language for people in the last camp, who want to build products. What languages do those people currently use? Python, Ruby and JavaScript. So of course they're who Go is converting.</p>

<h1 id="closing">Closing</h1>

<p>Next time you see an argument about whether Javascript is a cancer or a boon to our industry, or you see someone like me getting angry about <a href="https://josephg.com/blog/electron-is-flash-for-the-desktop/">modern apps being crap</a>, ask yourself which camp is speaking. Are they championing beautiful code? Performance and a "deep understanding"? Or do they just want to get work done and ship product?</p>

<p>Ultimately code is code. Even though we have different reasons for writing software, what we write is (usually) compatible. And even when it's not (looking at you, Haskell) - there are always ideas we can learn from and steal.</p>

<p>We all owe each other a lot, after all. Without language wonks we would still be writing assembly. Without systems programmers we wouldn't have operating systems, and Haskell and JavaScript would be unusably slow. And without product engineers, everyone else would be forced to write CSS. And trust me, nobody wants that.</p>

<p>Rear Admiral Grace Hopper managed to bridge machine understanding and product thinking, and in doing so invented the idea of a machine-independent computer language. Without being able to think both about what the computer can do and what we <em>want</em> the computer to do, that wouldn't have been possible.</p>

<p>But personally I think we should aspire to be like <a href="https://www.youtube.com/watch?v=YyIQKBzIuBY">Alan Kay</a> and do all three. He and his team regularly cross tribal lines. As an example, his work on object-oriented programming was shaped by watching children learn Logo. He thinks there are ways we can have our cake and eat it too - using modern techniques to engineer much simpler systems that are faster, more elegant and more useful for humans. If you haven't done it already, you should watch every talk he's ever given. Do it <em>slowly</em>.</p>

<p>That's certainly what I aim for. And hopefully I'll still be blowing people's minds past the age of 70.</p>]]></content:encoded></item><item><title><![CDATA[Cracks]]></title><description><![CDATA[<p>Programming is close to godliness <br>
when you're deep in a problem, code is thought made real. <br>
In that space you stop existing as a person. <br>
You're just a conduit for creation.</p>

<p>Squint at the right moment and you can catch them <br>
those sparkling cracks in reality at the corners of</p>]]></description><link>https://josephg.com/blog/cracks/</link><guid isPermaLink="false">af0395f3-d2a1-4d37-b620-91c6b6dd967e</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Tue, 18 Apr 2017 01:32:17 GMT</pubDate><content:encoded><![CDATA[<p>Programming is close to godliness <br>
when you're deep in a problem, code is thought made real. <br>
In that space you stop existing as a person. <br>
You're just a conduit for creation.</p>

<p>Squint at the right moment and you can catch them <br>
those sparkling cracks in reality at the corners of your eyes. <br>
go ahead - look down <br>
and for a moment all of creation is laid bare before you. <br>
The path of time. People and systems and machines, <br>
churning and changing through deliberate little choices. <br>
through the strike of keys.</p>

<p>And in that moment bear witness to the interlocking malleability of it all. <br>
Of <em>us</em> all, for we are made of atoms too, are we not? <br>
Click, clack.</p>

<p>Tasks, tickets and tests are ritual <br>
as we prepare for the divine <br>
for that moment when understanding is perfect. <br>
And the machine becomes all things. Whole and divided. Voices and a symphony. A single vision and a million moving pieces.</p>

<p>And in that moment,</p>

<p>with a keystroke, <br>
god and I. <br>
We unmake the world.</p>

<hr>

<p>(I'm not religious. Just recovering from my 2+ day programming bender.)</p>]]></content:encoded></item><item><title><![CDATA[Building a r/place in a weekend]]></title><description><![CDATA[<p>On Friday I accepted a <a href="https://news.ycombinator.com/item?id=14111143">challenge</a> to clone Reddit's <a href="http://reddit.com/r/place">/r/place</a> in a weekend. And I did it, and <a href="https://josephg.com/sp/">its live</a>, and its amazing:</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/final.png" alt=""></p>

<p>Being able to build this in a weekend isn't genius. It's possible because programming is made up of 2 activities:</p>

<ul>
<li>Making decisions (95%)</li>
<li>Typing (5%)</li>
</ul>

<p>Reddit</p>]]></description><link>https://josephg.com/blog/rplace-in-a-weekend/</link><guid isPermaLink="false">12a53a22-2832-4592-9f49-0df7203320be</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sun, 16 Apr 2017 08:11:11 GMT</pubDate><content:encoded><![CDATA[<p>On Friday I accepted a <a href="https://news.ycombinator.com/item?id=14111143">challenge</a> to clone Reddit's <a href="http://reddit.com/r/place">/r/place</a> in a weekend. And I did it, and <a href="https://josephg.com/sp/">its live</a>, and its amazing:</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/final.png" alt=""></p>

<p>Being able to build this in a weekend isn't genius. It's possible because programming is made up of 2 activities:</p>

<ul>
<li>Making decisions (95%)</li>
<li>Typing (5%)</li>
</ul>

<p>Reddit wrote up a wonderful blog post about <a href="https://redditblog.com/2017/04/13/how-we-built-rplace/">how they made the original</a>, so lots of the decisions were already made for me. How much load I need to handle, how big to make it, the palette, and some of the UI I'm using directly. I didn't copy reddit's architecture though, simply because I don't agree with some of their technical decisions. But the places where I disagree are all informed by decades of my own programming experience, so I still didn't have many decisions left to make.</p>

<p>To be clear, if I was building this for reddit a weekend wouldn't be enough time. <a href="https://github.com/josephg/sephsplace">The code</a> is a mess. There's no monitoring and logging. No CI, no tests. There's no access control and no mobile support. I also haven't load tested it as thoroughly as reddit would need to. A weekend is not enough time to make it production ready for a site like reddit.</p>

<p>But that's ok - it sure works! And some quick benchmarking shows it should scale past reddit's 100k user mark.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/gow.png" alt=""></p>

<h1 id="howitworks">How it works</h1>

<p>I had a think about how I wanted data to flow through the app before I even accepted the challenge. It was important to know so I could figure out if I could actually build it in time.</p>

<p>My favorite architecture for this sort of thing is to use event sourcing and make data flow one way through the site. If you've ever used Redux with react you'll appreciate how simple this makes everything - every part of the system has just 2 responsibilities:</p>

<ul>
<li>Where do I get data from?</li>
<li>Where do I send the data?</li>
</ul>

<p>So, for sephsplace, edits start in the browser, hit a server, go into kafka, get read back out of kafka by a server, then get sent to users.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/arch1.png" alt="architecture for edits"></p>

<p>Which is simple enough. The edits themselves are globally ordered by kafka, so if two edits to the same location happen at the same time, everyone will see the same final result based on the order they come <em>back out</em> of kafka. You could make a fancier event log which load balanced edits across multiple kafka partitions, but reddit's spec says it only needs to handle 333 edits per second. A single kafka partition should be able to manage 10-100x that much load.</p>

<p>To ensure consistency I attach a version number to each edit coming out of the kafka event stream. So, the first edit was edit 0, then edit 1 and so on. At the time of writing we're up to edit 333721. (Check <code>window._version</code> on the site if you're curious.)</p>

<h2 id="subscribefromversion">Subscribe from version</h2>

<p>The genius of this system is that if you have an out-of-date image, so long as you know its version, we can catch you up by just sending you the missing edits.</p>

<p>For example, if you get disconnected while at version 1000, when you reconnect you just have to download all the operations from version 1000 to 'now'.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/operations.png" alt=""></p>
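<p>A minimal sketch of that catch-up logic (the names here are mine, not the actual sephsplace code):</p>

<pre><code>// Hypothetical in-memory event log. ops[v] holds the edit with version v.
class EditLog {
  constructor() { this.ops = []; }
  append(op) { this.ops.push(op); } // op gets version this.ops.length - 1
  get version() { return this.ops.length; }
  // Everything a client which disconnected at `fromVersion` needs to catch up.
  opsSince(fromVersion) { return this.ops.slice(fromVersion); }
}
</code></pre>

<p>A reconnecting client just sends its last known version and applies whatever comes back.</p>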

<h2 id="snapshots">Snapshots</h2>

<p>The client needs to load the page quickly without downloading the entire history of operations. To do that I made a <a href="https://josephg.com/sp/current">REST endpoint</a> which simply returns a PNG of the current page. To make this work the server just stores a copy of the page in a 1000x1000 array in memory. The array gets updated when any edits come through kafka.</p>

<p>But rendering that image out to PNG is slow. I hooked up <a href="https://www.npmjs.com/package/pngjs">pngjs</a> to render out the image, and it takes about 300ms of time per render. A single CPU can only render the page 3 times per second. This is way too slow to keep up.</p>

<p>It's probably possible to optimize that. Reddit apparently spent a lot of time bit-packing their image in redis, but that sounds like a waste of time to me. I just configured nginx to fetch the image at most once every 10 seconds. In between, nginx just returns a stale cached image.</p>

<p>But the client can catch up! The cached image has the version number embedded in a header, so when the client connects to the realtime feed it can immediately get a copy of the edits which have happened since the image was generated.</p>

<p>To make server restarts fast, every 1000 edits or so I store a copy on disk of the current snapshot &amp; what version it's at. When the server starts up it works just like the client - it grabs that snapshot from disk, updates it with any recent operations from kafka and it's good to go.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/arch2.png" alt=""></p>

<p>The data flow ends up looking something like this, although the server keeps a recent snapshot in memory <em>and</em> on disk. But again, data only flows one way so each part is easy to reason about in isolation.</p>

<h2 id="writingtheserver">Writing the server</h2>

<p>At about 2pm on Friday I knew what I was building and I started to work.</p>

<p>I'd never actually used <a href="https://kafka.apache.org/">kafka</a> before, and I have a rule with projects like this that any new technology has to be put in first just in case there are unknown unknowns that affect the design. But Kafka was dreamy to work with, so by 4pm I had the kafka events working and the server was rendering images out:</p>

<p><blockquote class="twitter-video" data-lang="en"><p lang="en" dir="ltr">Got simple streaming edits working via kafka. Streaming update API next. <a href="https://twitter.com/hashtag/placethrowdown?src=hash">#placethrowdown</a> <a href="https://t.co/EuEXdZIVYI">pic.twitter.com/EuEXdZIVYI</a></p>&mdash; Seph (@josephgentle) <a href="https://twitter.com/josephgentle/status/852774081015435266">April 14, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>In the video each edit sets the top left 100 pixels of the image to a random color. I'm triggering the edits via cURL (no browser code yet). The edits are published to kafka. The server gets sent the events <em>from</em> kafka then updates the image. When I refresh the page (actually its just an image URL), the server returns an updated PNG with the new content.</p>

<p>By 6pm I had an endpoint allowing the client to subscribe to a live stream of edits:</p>

<p><blockquote class="twitter-video" data-lang="en"><p lang="en" dir="ltr">SSE working for live updates, with resume &amp; replay. Not sure how to load balance SSE. Client image updating next <a href="https://twitter.com/hashtag/placethrowdown?src=hash">#placethrowdown</a> <a href="https://t.co/lu0e7WHdEb">pic.twitter.com/lu0e7WHdEb</a></p>&mdash; Seph (@josephgentle) <a href="https://twitter.com/josephgentle/status/852790675271589888">April 14, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>(I used <a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events">server-sent events</a> instead of websockets at first because they're simpler and more elegant than websockets, and they work over HTTP2. But SSE doesn't support binary data, and <a href="http://caniuse.com/#feat=eventsource">you need a polyfill for IE</a>, so I eventually moved away from them. More on that below.)</p>
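<p>The base64 overhead is easy to quantify in node (a quick illustration, not from the actual codebase):</p>

<pre><code>// Base64 turns every 3 raw bytes into 4 ASCII characters: 33% bigger.
const batch = Buffer.alloc(300); // e.g. 100 edits x 3 bytes each
const encoded = batch.toString('base64');
console.log(encoded.length); // 400
</code></pre>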

<p>Anyway, with all that done the server was basically complete. I added a few caching headers for the snapshots, put it behind nginx with a strict caching policy and moved to the client.</p>

<h2 id="makingapixeleditorinabrowser">Making a pixel editor in a browser</h2>

<p>Luckily for me, I've already written a few browser games with scrollable and pannable pixel-like editors. So for the editor I blatantly stole a bunch of code from <a href="https://steam.dance/josephg/elevator">steam dance</a>.</p>

<p>The client works using 2 canvases:</p>

<ul>
<li>One invisible canvas for the image itself. This canvas is basically just an image with the whole drawing space in it. This is really simple and efficient because the image has a known, fixed size (1000px x 1000px) so it can fit comfortably in GPU memory.</li>
<li>The second canvas is the drawable area you see on the page itself. This canvas is in the DOM, and it gets resized when you resize your browser.</li>
</ul>

<p>Most of the time reddit uses CSS to render their r/place page, and then falls back to using canvases in some browsers. But I don't like needing multiple renderers if I can avoid it.</p>

<p>My draw function was just this:</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/draw.png" alt=""></p>

<p>If you haven't seen it before, that wrapper for <code>requestAnimationFrame</code> improves the code in two ways:</p>

<ul>
<li>If the tab is in the background, it won't draw at all</li>
<li>It lets me call <code>draw()</code> with impunity in my code any time I need something redrawn. The page will only be redrawn once no matter how many calls to draw I make before rendering.</li>
</ul>

<p>For a game with animations you usually just render at 60fps regardless, but I want sephsplace to be able to sit in the background without <a href="https://josephg.com/blog/electron-is-flash-for-the-desktop/">using CPU unnecessarily while idle</a>.</p>
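<p>Here's a sketch of how such a coalescing wrapper can be written (my reconstruction, not the exact code in the screenshot above). The scheduler is injectable so the logic can be exercised outside a browser:</p>

<pre><code>// Coalescing draw(): call it as often as you like, the page renders at
// most once per animation frame, and not at all in background tabs.
function makeDraw(render, raf = window.requestAnimationFrame.bind(window)) {
  let queued = false;
  return function draw() {
    if (queued) return; // a render is already scheduled for this frame
    queued = true;
    raf(() => {
      queued = false;
      render();
    });
  };
}
</code></pre>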

<p>For panning and zooming I <a href="https://github.com/josephg/sephsplace/blob/291c232b7d9b33666c5cd32ee8a9f72f52af2314/public/index.js#L48-L183">stole ~150 lines of code from past projects</a>.</p>

<p>Once that was done I added the palette swatches and some UI. By 9:30pm I had a working client:</p>

<p><blockquote class="twitter-video" data-lang="en"><p lang="en" dir="ltr">That asymmetric border radius tho :D. Added pan tool and color selection swatches. Next caching then I want to put it online <a href="https://twitter.com/hashtag/placethrowdown?src=hash">#placethrowdown</a> <a href="https://t.co/1fpEbVRkoa">pic.twitter.com/1fpEbVRkoa</a></p>&mdash; Seph (@josephgentle) <a href="https://twitter.com/josephgentle/status/852850616120483840">April 14, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>Then I spent hours fighting with Java, zookeeper, systemd scripts for kafka, nginx and pm2.</p>

<p><strong>At 2am, 12 hours after starting, I put the site live.</strong></p>

<p>I didn't manage to sleep until 4am though.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/mandalla.png" alt=""></p>

<h2 id="day2optimizationandpolish">Day 2 - Optimization and polish</h2>

<p>A good rule of thumb is that if you want to spend any time polishing you'll need twice as much time. So I was very pleased that I had a whole day for tweaks and optimizations.</p>

<p>For this project I was aiming to be able to support reddit's numbers - 100k concurrent users and 333 edits per second. Making numbers go up is really fun, so I always like to measure before optimizing so I can really see the performance metrics shoot up.</p>

<p><blockquote class="twitter-video" data-lang="en"><p lang="en" dir="ltr">Initial testing at 400 edits/second. Its 10x slower than I want but pretty. The real problem will be adding 100k readers tho <a href="https://twitter.com/hashtag/placethrowdown?src=hash">#placethrowdown</a> <a href="https://t.co/nQDUDBVJ73">pic.twitter.com/nQDUDBVJ73</a></p>&mdash; Seph (@josephgentle) <a href="https://twitter.com/josephgentle/status/853092182579855362">April 15, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>The initial benchmarks were pretty depressing. In the video I have a script running which is sending 400 edits / second into kafka. Every part of this system is slower than I want it to be:</p>

<ul>
<li>Chrome is using an entire core of CPU just rendering the animation</li>
<li>My server is using about 34% of a core simply receiving operations from kafka and sending them out again</li>
<li>Even kafka is embarrassingly slow here - using 20% CPU to process those 400 ops/second. I expect better from you kafka!</li>
</ul>

<p>400/second is a really small number for modern computers. The reason we're using so much CPU here is bookkeeping:</p>

<ul>
<li>Kafka processes each edit individually</li>
<li>Kafka sends 1 network packet per edit (I think)</li>
<li>My server decodes each edit individually using msgpack, into a separate javascript object</li>
<li>My server then re-encodes each edit to a JSON string to send to the browser</li>
<li>The client processes each edit individually. For each edit it needs to talk to the GPU to upload that one lonely pixel.</li>
</ul>

<p>Whew - I'm tired just thinking about it. This is a staggering amount of work for the computer.</p>

<p>To fix this I made 3 changes:</p>

<p>First, we don't need to do this work per-edit! It's much better to batch all the edits into ~1/10th of a second blocks and process them together. That way we only pay for the bookkeeping 10 times a second, no matter how many messages we have.</p>
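<p>The batching can be sketched like this (hypothetical names; <code>flush()</code> would be driven by a ~100ms timer):</p>

<pre><code>// Coalesce edits and ship them in blocks instead of one at a time.
function makeBatcher(send) {
  let pending = [];
  return {
    push(edit) { pending.push(edit); },
    flush() { // driven by e.g. setInterval(batcher.flush, 100)
      if (pending.length === 0) return;
      send(pending);
      pending = [];
    }
  };
}
</code></pre>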

<p>Secondly I moved everything to binary messages.</p>

<p>The binary encoding for this is beautiful - each edit fits perfectly into 3 bytes. Look at the math - An edit is a (x, y, color) triple. The x and y coordinates are integers from 0-999 - which is almost perfectly represented as a 10 bit integer (10 bits = 1024 different values). And 16 colors fit exactly in 4 bits.</p>

<p>So we have <code>x</code> (10 bits) <code>+ y</code> (10 bits) <code>+ color</code> (4 bits) <code>= 24 bits = 3 bytes</code>. Perfect.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/bits.png" alt=""></p>
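<p>In code, the packing might look something like this (my sketch of the scheme described above, not the actual implementation):</p>

<pre><code>// Pack (x, y, color) into 3 bytes: x in the top 10 bits, then y, then color.
function packEdit(x, y, color) {
  const v = (x << 14) | (y << 4) | color; // 24 bits total
  return [v >>> 16, (v >>> 8) & 0xff, v & 0xff]; // big-endian bytes
}

function unpackEdit(b0, b1, b2) {
  const v = (b0 << 16) | (b1 << 8) | b2;
  return { x: v >>> 14, y: (v >>> 4) & 0x3ff, color: v & 0xf };
}
</code></pre>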

<p>Now I can batch hundreds of edits efficiently in a byte array! This is great because byte arrays are blazing fast. They're easy to optimize for in both hardware and software, they're GC-efficient (compared to JS arrays of objects) and they're cheap to access from C code (like, say, the nodejs networking stack).</p>

<p>Also, writing bitwise code always makes me feel like I'm in <em>Hackers</em> 💕💕</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/hackers.jpg" alt=""></p>

<p>The third change was a move from server-sent events to websockets. I needed to do this because SSE doesn't support binary messages. To send the edits over SSE I'd need to encode them into base64 strings, which would be slow and increase the size of the messages. Websockets support sending binary messages directly, so its easier to just use that.</p>

<hr>

<p>With that done, the same 400 edits per second looked like this:</p>

<p><blockquote class="twitter-video" data-lang="en"><p lang="en" dir="ltr">Same 400 edits/second, but now coalesced every 200ms in batches on write. Less pretty, but the CPU difference is staggering. <a href="https://twitter.com/hashtag/placethrowdown?src=hash">#placethrowdown</a> <a href="https://t.co/Ncrti7wj92">pic.twitter.com/Ncrti7wj92</a></p>&mdash; Seph (@josephgentle) <a href="https://twitter.com/josephgentle/status/853122138961625088">April 15, 2017</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>

<p>Notice:</p>

<ul>
<li>Chrome is down to 10% CPU (from 100%)</li>
<li>The <code>node</code> process (the server) is using 2.5% CPU (down from 35%)</li>
<li>Kafka isn't listed, because it was only using about 1% of my CPU to handle the 5 larger messages every second.</li>
</ul>

<p>I threw in some more minor optimizations after this video as well - adding more batching, tweaking <code>ws</code> parameters and stuff like that. I love optimizing code - it feels so cleansing.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/lady.png" alt=""></p>

<h3 id="arewefastenoughyet">Are we fast enough yet?</h3>

<p>Where are the actual performance bottlenecks here? What needs to be fast?</p>

<p>It turns out the big scaling challenge here is actually getting data from kafka to the clients.</p>

<p>Why? Well, because even the naive version of my code could handle the required 333 writes per second easily. That's a tiny number. But remember we need to support 100k active clients. So if the server receives 333 edits per second, it needs to send 33.3 million edits per second.</p>

<p>On paper, 333 writes * 3 bytes = 1k of data. Sending 1k of data per second to 100k clients is 100MB/s of traffic. Which is large but manageable. A well optimized C program should be able to send 100MB of network traffic per second no sweat.</p>

<p>What I really want is something like nginx, but designed for event logs. It should be written in C (node won't be fast enough). The closest thing I found was <a href="https://github.com/wandenberg/nginx-push-stream-module">nginx-push-stream</a> - which looks perfect. It's designed for exactly this use case. But I don't like it because it doesn't guarantee message order or delivery. Remember, we need a consistent message order so everyone sees the same result when two people edit the same pixel at the same time.</p>

<p>Effectively, nginx-push-stream is UDP and I want TCP. It'd definitely be good enough for this project, but I don't want to have to write the code to replay and reorder messages. And to use it I'd need a worker process which simply tails the kafka log and forwards it into nginx. And we'd need special catchup-on-reconnect logic, because the stream it sends out wouldn't support subscribing from a specified version number.</p>

<p>Another approach would be to send the events out using long polling. That sounds wild, but if we make a URL for each block of edits, the clients could just request the next block of edits directly from nginx. Nginx can be configured to hold all 100k requests from clients and just send 1 request to the backend for the data. The server then holds the request until the data is available (ie, 1 second has passed). If we get nginx to cache the edits, it'll support catchup just fine.</p>

<p>It's just... sad doing it that way. Long polling is so... 2005. And it's a pretty weird way to use nginx.</p>

<p>In this case I'm lucky to say I didn't need to do any of that. It turns out my binary message handling + <a href="https://www.npmjs.com/package/ws">ws</a> is fast enough anyway:</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/perfvictory-1.jpg" alt=""></p>

<p>My laptop manages 10k clients using only about 34% of one CPU core. So 100k clients should take about 4 modern CPU cores. This is several times faster than I expected it would be. I'm grateful for the performance optimizations that have gone into the ws websocket library, nodejs and v8 over the last few years. Horizontal scaling like this will put extra load on kafka, but kafka can handle a few more orders of magnitude of load before we need to worry about it.</p>

<p>(The node processes you see in the screenshot are the websocket testing library <a href="https://www.npmjs.com/package/thor">thor</a>. I had to modify it a little to work for me though.)</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/batman.png" alt=""></p>

<p>So that was that. At 5:30pm on day 2 I declared the challenge complete.</p>

<p>If I was feeling virtuous or better rested I would have rented a few AWS machines and set up a full cluster. But the thought of spending hours setting up zookeeper again was enough to convince me to declare victory and play some computer games instead.</p>

<p>It'll be fine. We can fix it live if we need to, right?</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/17861712_10155313973183083_8810187488691317415_n.jpg" alt=""></p>

<h1 id="finalthoughts">Final thoughts</h1>

<p>That was a wild ride. I haven't gotten much sleep, and I spent altogether too much time deleting nazi symbols and penises. And <a href="https://twitter.com/josephgentle/status/853312152223965184">fighting botnets used to draw giant pictures of Angela Merkel</a>.</p>

<p>I think if I make something like this again I'd like to live stream the whole thing. One of the most enjoyable parts of the process was going online and seeing what people have drawn. This sort of project is a real community thing, and I'd like to involve the community more in the future.</p>

<p>As for the site - I don't know what to do with it. I'll leave it up, but I'm worried people will start drawing child porn or something if I don't keep an eye on it.</p>

<p>While working on this I feel like the most interesting design question is the policy on rate limiting:</p>

<ul>
<li>Bots are cool, but way more powerful than humans</li>
<li>If you limit edits to once per 5 minutes, will I have enough community to keep it going?</li>
</ul>

<p>Maybe I could make a version where every user has an energy bar. Then different areas of the space are either more or less volatile - so you can draw on the volatile sections for free, but it's easy for others to draw over the top. If you want your art to stay for a long time you can draw in the slow regions - it'll just take ages to draw in the first place. Or maybe the space should start small with just one small tile, and the community can slowly add tiles. Each tile can only be edited for 24 hours, then it gets locked down forever, forming a slowly growing mosaic. I'm sure there's lots of cool variants.</p>

<p>Thank you, reddit, for making r/place and inspiring me. And thanks to everyone who's drawn awesome stuff on the page, and <a href="https://twitter.com/josephgentle">followed along on twitter</a>. It's been fun!</p>

<p>Ugh, I feel weak... did I forget to eat again? ... Oops.</p>

<p><img src="https://josephg.com/blog/content/images/2017/04/pushpin.png" alt=""></p>]]></content:encoded></item><item><title><![CDATA[The modern web makes me want to throw up]]></title><description><![CDATA[<blockquote>
  <p>I've written a fair bit over the last few months on other mediums (FB and Hackernews). I'm going to start collecting some of that content and reposting it here.</p>
</blockquote>

<p>From <a href="https://news.ycombinator.com/item?id=12941979#12942372">here</a>:</p>

<p>Performance of modern web apps is simply awful compared to their native counterparts by any measure. They load slowly</p>]]></description><link>https://josephg.com/blog/the-modern-web/</link><guid isPermaLink="false">06bca7f8-4340-4e93-97ef-f5eaf06a990f</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sun, 13 Nov 2016 08:30:46 GMT</pubDate><content:encoded><![CDATA[<blockquote>
  <p>I've written a fair bit over the last few months on other mediums (FB and Hackernews). I'm going to start collecting some of that content and reposting it here.</p>
</blockquote>

<p>From <a href="https://news.ycombinator.com/item?id=12941979#12942372">here</a>:</p>

<p>Performance of modern web apps is simply awful compared to their native counterparts by any measure. They load slowly and consistently feel sluggish in comparison to proper native apps. Slack takes seconds to load and it'll happily sit on over a gig of ram while in use. (And remember, it's a glorified IRC client.)</p>

<p>Web apps only have two benefits over native apps:</p>

<ol>
<li>They're easier to build in a cross-platform way, because you just have to make the app once  </li>
<li>Users don't need to install anything.</li>
</ol>

<p>I hope in the long run we solve both of those problems for native apps and move back to writing applications which don't depend on a DOM implementation to run. React-native is a great step toward solving (1). I hope that in time we can have a modern (reactish), good-looking cross-platform toolkit for native app development.</p>

<p>But the biggest hurdle is (2), which very few people are attacking. App stores have helped a lot, but we need to be able to run native apps without installing anything. It should be as easy as it is on the web. Unfortunately native app authors usually assume all their assets are bundled. Nobody even thinks about how long it takes from the point when a user decides to install an app to when they see the first content. On the web we count that time in milliseconds; on native apps we count minutes. There aren't any hard technical problems there - we obviously manage it on the web, so it's easily possible. We need native apps to catch up if they are ever going to be able to compete as a platform.</p>

<blockquote>
  <p>Why would you want to solve the problems of native apps rather than solve the problem (there is only one) of web apps, which is the performance issue.</p>
  
  <p>Customers don't install apps for the most part. Why try to solve for that?</p>
</blockquote>

<p>Why? Because building hack on top of hack is convenient, but terrible craftsmanship. The web today is a red hot mess of overlapping standards and inconsistent APIs. Google Chrome is up to ~20M lines of code now, which makes it bigger than the linux kernel (with every device driver). It's basically a small virtualised operating system at this point. <br>
How many more lines of code do you think it'll take for chrome to feel fast and light again? Is it possible to achieve that by adding code, forever? And then to work around all the cruft in browsers we have frameworks like react - conceptually beautiful, but which do layout by diffing thousands of DOM elements through javascript.</p>

<p>The stack is a mess, and I don't see any direction it can possibly go but fatter, bigger and uglier than ever before. Forever. Surprisingly, this is a fate native apps have avoided. And by avoiding that fate, new laptops have amazing battery life and work great. Well, until you open chrome or slack, that is.</p>

<p>I don't write code for customers. I do it because as a kid I fell in love with the craft. I just... didn't fall in love with programming for this. Nope nope nope I want off the train before I throw up.</p>]]></content:encoded></item><item><title><![CDATA[Electron is flash for the desktop]]></title><description><![CDATA[<p>What is slack doing?</p>

<p><img src="https://josephg.com/blog/content/images/2016/10/Screen-Shot-2016-04-05-at-8-57-24-AM.png" alt="Slack at 100% cpu usage"></p>

<p>The process was in the background when this happened. I wasn't even interacting with it - I was in a meeting. I only noticed because my laptop fans were whirring when I got back. Restarting slack seemed to fix it for now.</p>

<p>But that's not abnormal</p>]]></description><link>https://josephg.com/blog/electron-is-flash-for-the-desktop/</link><guid isPermaLink="false">af70a3ee-5e9a-490d-9b93-a9fbe23a3b49</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sat, 29 Oct 2016 11:39:38 GMT</pubDate><content:encoded><![CDATA[<p>What is slack doing?</p>

<p><img src="https://josephg.com/blog/content/images/2016/10/Screen-Shot-2016-04-05-at-8-57-24-AM.png" alt="Slack at 100% cpu usage"></p>

<p>The process was in the background when this happened. I wasn't even interacting with it - I was in a meeting. I only noticed because my laptop fans were whirring when I got back. Restarting slack seemed to fix it for now.</p>

<p>But that's not abnormal for slack. In fact, slack often idles at 5% CPU usage. What's it doing? I have no idea.</p>

<p>And I bet the slack team doesn't know either. How many lines of code do you think the slack team wrote to make their client work? I'd guess around 50k. Maybe 100k. But slack isn't a native app. At least - not a normal native app. It's built on top of <a href="http://electron.atom.io/">electron</a>, so when you download slack you're actually downloading a complete copy of Google Chrome. Chrome, at the time of writing, is <a href="https://www.openhub.net/p/chrome">15 million non-comment lines</a>. When you download slack, 99% of the code is 'below the water'.</p>

<p>And chrome is a hog. It's huge and complicated. It uses ram and CPU like nobody's business, and it totally thrashes your battery life.</p>

<p>You can think of slack as a small javascript program running inside another operating system VM (chrome), that you have to run in order to essentially chat on IRC. Even if you've got the real chrome open, each electron app runs its own, extra copy of the whole VM.</p>

<p>And it's not a stretch to call chrome an OS. By lines of code, chrome is about the same size as the <a href="https://www.linuxcounter.net/statistics/kernel">linux kernel</a>. Like the linux kernel it has APIs for all sorts of hardware, including <a href="https://www.chromeexperiments.com/webgl">opengl</a>, <a href="https://webvr.info/">VR</a> and <a href="https://www.keithmcmillen.com/blog/making-music-in-the-browser-web-midi-api/">MIDI</a>. It has an embedded copy of SQLite, memory management and its own task manager. On MacOS it even contains a userland USB driver for xbox360 controllers. (I know it's there because I wrote it. Sorry.)</p>

<p>Does slack contain my code to use xbox controllers? Does the slack team know? Does <em>anyone</em> know? I mean, the slack app is 160 megs on disk. That's about the size of 70 uncompressed copies of <em>Lord Of The Rings</em>. Who knows what's in there? The other electron apps I have on my computer are <del>Spotify</del> (<em>Edit: Not quite - see below</em>) (200 megs) and Atom (260 megs). The first time I installed linux I did it from floppy disks. It would take 450 floppy disks to store these three simple apps. Together these apps are about the size of the standard desktop ubuntu distribution. Which, y'know, probably contains an IRC client, a text editor and a music player. And an entire operating system, user space and web browser. <br>
Sure, you say! Disk size is cheap, you say! Sure, but ram sure isn't. The brand-spanking-new macbook pros only ship with 8 GB ram by default. <a href="http://www.macrumors.com/2016/10/28/new-macbook-pros-no-32gb-ram-battery-life/">Because of battery concerns</a> you can't configure them past 16 gigs. And right now slack is sitting on somewhere between 300 megs and 1 gig of my laptop's ram:</p>

<p><img src="https://josephg.com/blog/content/images/2016/10/Screen-Shot-2016-10-29-at-9-08-59-pm.png" alt=""></p>

<p>I mean <em>come on</em>. It's a <em>text chat program</em>.</p>

<p>The other thing that isn't plentiful is battery life. The way modern CPUs conserve battery is by turning themselves off anytime they can (when nothing is scheduled). The bane of power management is programs that use a few percent of your CPU constantly. They cause your CPU to constantly wake itself up, go to sleep and wake again. That's the perfect way to burn that precious battery life. If anyone has time I'd love to see how much spotify, slack and atom (just left open, not doing anything) decrease the battery life of a modern laptop. Because they're <em>crazy</em>.</p>

<p><img src="https://josephg.com/blog/content/images/2016/10/Screen-Shot-2016-10-29-at-8-38-32-pm.png" alt=""></p>

<p>And no - spotify isn't playing any music. It's just ... running. Doing mysterious chrome things. It's using a few percent of my CPU too, by the way. Just to exist.</p>

<p>(Vindictively, while I've been writing this, chrome has jumped up to 100% cpu usage, assigned to the mysterious 'browser' process in chrome's internal task manager. Thanks chrome.)</p>

<hr>

<p>To be clear, javascript on the desktop isn't the problem. In fact, I think the APIs we work with in the modern web are way better than the APIs that exist on the desktop. We should use them.</p>

<p>But we need ways to use those nice new paradigms (react and friends) on the desktop without running more bloody copies of chrome. I just ... don't care about your app enough to justify running more chrome instances. As a developer it's really easy to fall into the trap of assuming your app / website / whatever is a gift to humanity and the most important thing your users are doing. Why not use a few of their excess system resources? But we have to fight that mindset. Down that path lies a world where we can't have nice things. Down that path lies a world where our laptop batteries need to grow ever larger to support our CPUs doing even more dumb crap. That way lies the return of shockwave flash, of warm phones in our pockets which are mysteriously flat when we want to use them. Of getting paranoid about battery life and closing apps the instant we're done with them. (Looking at you, iTunes and Mischief.)</p>

<blockquote>
  <p><strong>Just Say NO to electron</strong></p>
</blockquote>

<p>Developers don't let friends write electron apps. If you want to use JS and react to make a native app, try <a href="https://facebook.github.io/react-native/">react native</a> instead. It's like electron, but you don't need to distribute a copy of chrome to all your users, and we don't need to run another copy of chrome to use your app. It turns out modern operating systems already have nice, fast UI libraries. So use them, you clod!</p>

<hr>

<p>The other sad fact is that most developers have no idea that this is even happening on their computers. They run slack but have no idea how hungry it is. As a developer it's your responsibility to know this stuff. Practice seeing. Learn profiling tools. Get <a href="https://bjango.com/mac/istatmenus/">iStat Menus</a> or one of the free equivalents. You can't improve what you don't measure.</p>

<p>Maybe we should be buying slower computers so we feel the pain. Facebook has been internally <a href="http://www.theverge.com/2015/10/28/9625062/facebook-2g-tuesdays-slow-internet-developing-world">intentionally slowing down their office internet</a> once a week to help build empathy with their users in other 3rd world internet speed countries (<em>cough</em>Australia<em>cough</em>). Maybe as developers we should do this with our computers too, or just run our code way slower than normal so we can build an intuition around performance. A few years ago I left my laptop at work over a long weekend. Instead of making a trip out to grab it I decided to hook up my raspberry pi (slow-ass gen 1) and use that as a developer machine. Suddenly lots of things that were 'instant' on my normal i7 laptop started feeling awfully sluggish. So I spent the weekend fixing them to make my development workflow smooth. All that perf tuning work carries across to our regular machines. Dropping startup time from 5 seconds to 2 seconds on a raspberry pi feels huge. The same improvement became a drop from 0.5 seconds to 0.2 seconds or something on the laptop. That's still super noticeable for users. A 0.5 second startup time is small enough that it's easy to overlook during development, but dropping it to 0.2 seconds feels obviously faster.</p>

<hr>

<p><strong>Users:</strong> Please complain more about slow programs. It's 2016. We carry supercomputers in our pockets. It's simply not ok for apps to be sluggish.</p>

<p><strong>Developers:</strong> Performance matters. Memory usage matters. I don't care if you're the prettiest girl at the dance, slack. I quit you the moment I walk out of the office. I delete you from my computer when I can. Slow is a bug. The fastest program is the one you don't run. So stop embedding the entirety of chrome in your app.</p>

<p>Also all you web devs: Go learn C or Rust or something. Your program <em>runs on a computer</em>. Until you know how that computer works, you're doomed. And until then get off my lawn <em>shakes fist</em>.</p>

<p>Oh, and <a href="http://idlewords.com/talks/website_obesity.htm">read this talk on the website obesity crisis</a>. It's very funny. And very sad. And very true.</p>

<hr>

<p><strong>Edit:</strong> Spotify actually uses the <a href="https://en.wikipedia.org/wiki/Chromium_Embedded_Framework">Chromium Embedded Framework</a> directly instead of running via electron. It still embeds chrome though. I didn't know that when I wrote this article, but I stand by what I said above about the resulting performance.</p>

<p><a href="https://news.ycombinator.com/item?id=14087381">Discussion on hackernews</a></p>

<p><a href="https://www.reddit.com/r/programming/comments/64oqaq/electron_is_flash_for_the_desktop/?utm_content=comments&amp;utm_medium=front&amp;utm_source=reddit&amp;utm_name=programming">Discussion on reddit r/programming</a></p>]]></content:encoded></item><item><title><![CDATA[Databases have failed the web]]></title><description><![CDATA[<p>Part 1 - a history lesson</p>

<p>The year is 1980. Last year RSI released Oracle V2, the world's first commercial SQL database for the PDP-11:</p>

<p><a href="https://commons.wikimedia.org/wiki/File:Pdp-11-40.jpg#/media/File:Pdp-11-40.jpg"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ee/Pdp-11-40.jpg/1200px-Pdp-11-40.jpg" alt="Pdp-11-40.jpg"></a></p>

<p>At an unnamed bank you have rooms full of computers like the PDP-11, with specialized computer operators to keep them running. 'Dumb terminals' at people's</p>]]></description><link>https://josephg.com/blog/databases-have-failed-the-web/</link><guid isPermaLink="false">a9508fd3-d1dd-4858-9c53-0b8f6948db4c</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Sun, 11 Sep 2016 15:11:06 GMT</pubDate><content:encoded><![CDATA[<p>Part 1 - a history lesson</p>

<p>The year is 1980. Last year RSI released Oracle V2, the world's first commercial SQL database for the PDP-11:</p>

<p><a href="https://commons.wikimedia.org/wiki/File:Pdp-11-40.jpg#/media/File:Pdp-11-40.jpg"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ee/Pdp-11-40.jpg/1200px-Pdp-11-40.jpg" alt="Pdp-11-40.jpg"></a></p>

<p>At an unnamed bank you have rooms full of computers like the PDP-11, with specialized computer operators to keep them running. 'Dumb terminals' at people's desks allow ordinary office workers to interact with the computer system.</p>

<p><a href="https://commons.wikimedia.org/wiki/File:DEC_VT100_terminal.jpg#/media/File:DEC_VT100_terminal.jpg"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/9/99/DEC_VT100_terminal.jpg/1200px-DEC_VT100_terminal.jpg" alt="DEC VT100 terminal.jpg"></a></p>

<p>How do you use your new state of the art SQL database? First you have to hire expensive DBAs (database administrators). After reading hundreds of pages of manuals, these technicians issue SQL commands directly. But the database is so fantastically useful that everyone needs to interact with it, down to lowly cashiers. Over the next few years you'll commission software to make interacting with the database easier; automating common tasks and making sure clerks can't accidentally wipe important records with a few misplaced keystrokes.</p>

<p>But in 1980 the writing was on the wall for the PDP-11. The microcomputer craze is in full force, arguably the first of three major reinventions of the computing ecosystem over the next few decades. But we don't know that yet. All we know is that the Apple II came out last year with Visicalc. Next year IBM will release their first Personal Computer. Pretty soon Microsoft will make its play too. "A computer on every desk and in every home, running Microsoft software." </p>

<p>That hasn't happened yet. But looking back, we can see what survived from that period (or earlier):</p>

<ul>
<li>The idea of a special computer called a 'server'.</li>
<li><a href="https://en.wikipedia.org/wiki/C_(programming_language)">The C programming language</a> (invented in 1972)</li>
<li><a href="https://en.wikipedia.org/wiki/SQL">Structured Query Language</a>, (SQL) used to talk to the database (invented in 1974)</li>
<li>The VT-100 terminal protocol. (All modern *nix and mac terminals pretend to be an old VT-100. Crazy huh?)</li>
</ul>

<p>Almost everything else has been reinvented over and over again in the decades since.</p>

<hr>

<p>Let's jump forward to 1995. What does the same office look like? Well, that VT100 terminal won't cut it anymore. Office workers have whole computers on their desks all to themselves. Microcomputers with more capacity than rooms full of mainframes had just a decade ago. Windows NT 3.5 will be released this year; a landmark achievement for Microsoft. An operating system with a full TCP/IP networking stack. But don't worry - SQL is still there holding all our customer records. It runs in data centres on big servers. Employees access the SQL database through application software on their workstations. (Oh yeah, we call them workstations now.)</p>

<p>From an architectural point of view, what changed? Well, before, the SQL server was a multi-user process (all programs interacting with the database ran on the mainframe itself). Now the SQL server runs alone, though it exposes itself to applications through TCP/IP ports. Application software on workstations throughout the office will connect to the SQL server, authenticate and make queries. The data will be displayed to the office worker in some native Windows application. They can modify fields, insert rows and generate graphs. It's all very fancy.</p>

<p>Access control is quite coarse, but that's ok because access to the database is restricted to employees. They'll need access to the corporate network and a login to the database to do anything. Past those barriers, it's probably fine. So long as nobody runs any <em>particularly</em> slow queries or types the wrong SQL commands directly into the SQL console, everything will be fine. (Systems at this time were massively vulnerable to <a href="http://xkcd.com/327/">SQL injection attacks</a> but again, your employees <em>just wouldn't do something like that</em>, so nobody was too worried).</p>

<p>When the application wanted to make a change to the data, it was easy. The application would validate the new data then insert it into the database directly (using <code>INSERT INTO ...</code> statements). If your DBAs were fancy I'm sure some people used PL/SQL and other languages embedded in the database itself, but they were always a pain to work with.</p>

<p>But other than having to manually craft SQL commands out of strings, life was looking pretty great.</p>

<hr>

<p>Part 2</p>

<p>But this is where the story gets a lot sadder. Another revolution happened, and nobody is going to tell the old database servers. Actually if you took a DBA from the 80's and showed them a modern laptop running PostgreSQL in a terminal emulator (which emulates the VT-100, remember), they'd be right at home. So at home it'd be embarrassing. Our databases have gotten faster, and they scale better. Buttttt..... we forgot to tell the database that the world changed.</p>

<p>Of course, I'm talking about the web. Or the cloud, if that's how you think about it.</p>

<p>It's like the database people said "Your website is basically a desktop app. The database is fine..." and they got to work tweaking it to be faster and more clever. To implement a frozen spec <em>better</em>. And web developers, descendants of the old frontend application developers, said "Whatever, we can work around it in software anyway". And then they got to work writing database middleware in PHP.</p>

<p>And it was good. <em>cough</em> Ahahaha just kidding... it was <em>terrible</em>. To work around the lack of features in modern databases, we had to invent a third kind of thing. Not a database, not an application - but an <em>application server</em>. What does it do? Well, it uh.. takes the data from the browser, and sends it to the database. And takes the data from the database and sends it to the browser.</p>

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/hNuu9CpdjIo?rel=0" frameborder="0" allowfullscreen></iframe>

<p>Why is this needed? Well 3 reasons:</p>

<ol>
<li>Access control on modern databases is too coarse. You want to whitelist which queries a user is allowed to make and you want fine-grained permissions around updates  </li>
<li>Databases only talk custom binary TCP protocols, not HTTP. Not REST. Not websockets. So you need something to translate between how the database works and how the browser works.  </li>
<li>You want to write complex logic for user actions, with custom on-save triggers and data validation logic. </li>
</ol>
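That plumbing code has the same shape in every app server: take the data from the browser, validate it, write it to the database, respond. A hypothetical sketch in plain JavaScript - the handler, <code>validate</code> and the fake <code>db</code> are made-up names for illustration, not any real framework's API:

```javascript
// Hypothetical app-server plumbing, the shape described above.
// `db` stands in for a real database client; in practice this body
// would sit inside an HTTP route handler (PHP, Rails, express...).
const db = new Map();

const validate = (post) =>
  typeof post.title === 'string' && post.title.length > 0
    ? { ok: true }
    : { ok: false, error: 'title is required' };

function handleCreatePost(requestBody) {
  // 1. Take the data from the browser and validate it.
  const result = validate(requestBody);
  if (!result.ok) return { status: 400, body: { error: result.error } };

  // 2. Make the database update request (the INSERT INTO ... step).
  db.set(requestBody.title, requestBody);

  // 3. Respond to the browser based on what the database said.
  return { status: 200, body: { saved: true } };
}
```

Every app writes some variant of this, over and over, for every endpoint.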

<p>Because these features are tied to the application and data model, they're almost always bespoke systems. I have a degree in computer science, but I've wasted oh so many hours of my life writing variants of the same plumbing code over and over again. Take the data from <em>here</em>, validate it, make a database update request then respond to the browser based on what the database says...</p>

<p>And it's hard to write this code correctly. You need to do correct data validation. Check for XSS and protect against SQL injection attacks. And DoS attacks from maliciously crafted queries. Set correct caching headers. Don't overfetch or underfetch data. Implement server side rendering and cache some more.</p>

<p>Entire language ecosystems have grown around solving this problem. PHP. Tomcat. Django. Rails. Node. ORMs became a thing - ActiveRecord for Rails, Mongoose for Node. XML, SOAP, JSON, GraphQL. And all the rest.</p>

<p>All because we're programming against a frozen database spec. Our frontend servers act as a weird form of tumour growing around our databases, injecting themselves as a suture over a broken API.</p>

<p><img src="https://josephg.com/blog/content/images/2016/09/Screen-Shot-2016-09-12-at-1-09-10-AM.png" alt=""></p>

<p>And it's 2016 already. I want a bonus 4th missing feature that will never be plumbed manually through every damn REST endpoint. I want realtime updates. Modern databases are eventually-consistent, but only to the boundary of their replica set for some reason??? The industry standard is to simply not tell the user when the data they're looking at is stale.</p>

<p>Why don't modern databases simply provide these features? I don't know. I can guess, but I won't be charitable. Because it's hard. Because modern web apps are too new, and server rendered apps look too much like the old desktop apps to give anyone pause. Because database developers don't write frontend code. Because we've had database triggers for ages (though they're still terrible to use, and they don't solve the whole problem).</p>

<p>And yes, because we are fixing it with tools like <a href="https://firebase.google.com/">Firebase</a> and <a href="http://horizon.io/">Horizon</a>.</p>

<p>But they aren't good enough yet. Along with <a href="https://josephg.com/blog/composing-databases/">composability operators</a> I think it's time to write some <em>code</em>.</p>

<p>We use mathematical operators to compose functions all the time. Most of the time we do it without even really thinking about it:</p>

<pre><code>y = op2(op1(x))  
</code></pre>

<p>Or with chaining and more functions and stuff:</p>

<pre><code>complexOp = x =&gt;</code></pre>]]></description><link>https://josephg.com/blog/composing-databases/</link><guid isPermaLink="false">eedb543b-91fa-42a0-972d-100c6ef77c91</guid><dc:creator><![CDATA[Joseph Gentle]]></dc:creator><pubDate>Wed, 24 Aug 2016 05:10:49 GMT</pubDate><content:encoded><![CDATA[<p>Why don't we compose databases the same way we compose our functions?</p>

<p>We use mathematical operators to compose functions all the time. Most of the time we do it without even really thinking about it:</p>

<pre><code>y = op2(op1(x))  
</code></pre>

<p>Or with chaining and more functions and stuff:</p>

<pre><code>complexOp = x =&gt; x.op1().op2()  
y = complexOp(x)  
</code></pre>

<p>(This is 'obvious' so far, but bear with me). So why is function composition so useful? Well because:</p>

<ul>
<li>You can make simple primitives, then build complex functions out of those primitives.</li>
<li>We might have the wrong function for a particular use case, and we need to transform it for a different use case.</li>
<li>You can break up balls of mud into small reusable pieces, then make the computer recombine them.</li>
</ul>

<hr>

<p>So, couldn't we do the same thing with data stores?</p>

<p>Let's imagine a simple idealised key-value store. I'm stealing Datomic's view of what a database is - which is simply a set of keys which hold values over time.</p>

<p>So my database will store the number of coconuts each of Sally, George and Sam have. Sally was the first entry in the database (with 6 coconuts). Then we found out George has 10. Then George gave 2 to Sally, who ate one. And so on. Right now Sally has 7, George has 8 and Sam has 5.</p>

<p><img src="https://josephg.com/blog/content/images/2016/08/1.png" alt="timeline of a database"></p>

<p>So there are a few concepts: a database's <em>keys</em> ('Sally', 'George', 'Sam') and the values at any point in time (now, etc). We can also see that some operations have happened. I'll add them to the diagram:</p>

<p><img src="https://josephg.com/blog/content/images/2016/08/2.png" alt="diagram with events"></p>

<p>These are called <em>transactions</em>, <em>operations</em> or <em>events</em>. They atomically make some set of changes to the values stored in a key.</p>

<p>The transactions will (explicitly or implicitly) happen at some <em>time</em>. This might be a real clock, a logical clock (v1, v2, v3, etc), a vector clock or something fancier like an <a href="http://gsd.di.uminho.pt/members/cbm/ps/itc2008.pdf">interval tree clock</a>).</p>

<p>I'm going to imagine an oversimplified API made up of functions like these, but any real implementation will have a lot more detail:</p>

<pre><code>get(key) =&gt; value  
set((key1, value1), (key2, value2), ...)  
watch() =&gt; stream of (key, value) pairs  
</code></pre>

<p>(The watch function will tell us when any operations are applied to the database).</p>
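To make the interface concrete, here's a toy in-memory implementation - purely illustrative, with no persistence or clocks, and with <code>watch()</code> modelled as a callback rather than a proper stream:

```javascript
// Toy in-memory store implementing the get/set/watch interface above.
const createStore = () => {
  const data = new Map();
  const watchers = [];
  return {
    get: (key) => data.get(key),
    // Each argument is a [key, value] pair; one call = one atomic op.
    set: (...pairs) => {
      for (const [key, value] of pairs) data.set(key, value);
      for (const notify of watchers) {
        for (const [key, value] of pairs) notify(key, value);
      }
    },
    watch: (callback) => { watchers.push(callback); },
  };
};

// Replaying the coconut timeline from the diagram:
const db = createStore();
const log = [];
db.watch((key, value) => log.push([key, value]));
db.set(['Sally', 6]);
db.set(['George', 10]);
db.set(['George', 8], ['Sally', 8]); // George gives 2 coconuts to Sally
db.set(['Sally', 7]);                // Sally eats one
db.set(['Sam', 5]);
```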

<h3 id="composition">Composition</h3>

<p>So what equivalents do we have to function composition? Well, we want to make functional middleware of sorts that consumes and exposes the same database interface. Let's talk through a few obvious examples.</p>

<h4 id="union">Union</h4>

<p>Given two databases, <code>db1 ∪ db2</code> exposes a new database interface which has the union of all keys. For <code>get()</code>, if a key exists in both databases, behaviour is undefined. Writes could always be sent to <code>db1</code>, or routed based on some user-specified rules.</p>

<p>Maybe my password database and my coconut databases are separate. I will create a database view <code>view = users ∪ coconuts</code> against which to run queries. From the point of view of the query runner, there is only one database. Infrastructure changes (merging, resplitting, sharding, etc) can all be managed behind that database view.</p>

<p>You could also use the union operator to manage indexes. Imagine implementing Wikipedia. You have a primary store of data, but also a search database. You could design a database view based on <code>view = pages ∪ SearchIndex(pages)</code>.</p>
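A union combinator over the toy get/set interface could be sketched like this (all names illustrative; reads fall through db1 then db2, and writes always go to db1 - one of the policies mentioned above):

```javascript
// Sketch of db1 ∪ db2 over a minimal get/set interface.
const union = (db1, db2) => ({
  get: (key) => {
    const v = db1.get(key);
    return v !== undefined ? v : db2.get(key);
  },
  set: (...pairs) => db1.set(...pairs), // writes always go to db1
});

// Tiny in-memory store for the demo:
const memdb = () => {
  const m = new Map();
  return { get: (k) => m.get(k), set: (...ps) => ps.forEach(([k, v]) => m.set(k, v)) };
};

const users = memdb();
const coconuts = memdb();
users.set(['Sally', { passwordHash: 'abc123' }]);
coconuts.set(['George', 8]);

// From the query runner's point of view, there is only one database:
const view = union(users, coconuts);
```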

<h4 id="mount">Mount</h4>

<p>Unfortunately if you're using union alone you might run into namespace collisions. (Does the <code>Steph</code> key mean the user data object or the coconut count?). We can define <code>Mount(db, path)</code> to be a database interface through which all objects in <code>db</code> are accessible with <code>path</code> prepended to the key. <code>set(key, value)</code> would only allow writes where the expected prefix matches. If it matches, the prefix is stripped and the write is sent to the underlying store.</p>

<p>With this and our union function, we can define <code>view = Mount(users, 'users/') ∪ Mount(coconuts, 'coconuts/')</code>. Then <code>view.get('users/Steph')</code> and <code>view.get('coconuts/Steph')</code> are clear and unambiguous.</p>
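<p>A mount could be sketched like this (same hypothetical minimal interface as before; the prefix check rejects writes outside the mount):</p>

```typescript
// Hypothetical minimal interface, reduced to get/set for this sketch.
interface Db {
  get(key: string): unknown;
  set(key: string, value: unknown): void;
}

// Mount(db, path): all keys in db appear with `path` prepended.
function mount(db: Db, path: string): Db {
  return {
    // Keys outside the mounted prefix simply don't exist in this view.
    get: (key) =>
      key.startsWith(path) ? db.get(key.slice(path.length)) : undefined,
    set: (key, value) => {
      if (!key.startsWith(path)) throw new Error('key outside mount');
      // Strip the prefix and forward the write to the underlying store.
      db.set(key.slice(path.length), value);
    },
  };
}
```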

<p>It complects the abstractions a little, but if the union function understood mounts, it could route writes to different paths unambiguously.</p>

<h4 id="filter">Filter</h4>

<p>Given a database and a predicate, the function <code>Filter(db, pred)</code> exposes a new database interface through which only keys which match the predicate are visible.</p>

<p>This would be useful for user access control. So, all of Sam's database accesses hit the <code>samdb = Filter(db, key =&gt; userCanAccess(sam, key))</code> data store. If Sam tries to access the <code>users.seph</code> object, well, in Sam's database view that object simply doesn't exist. (And cannot be created).</p>
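<p>Filter is barely any code at all - a sketch, with the same hypothetical-interface caveats as above:</p>

```typescript
// Hypothetical minimal interface, reduced to get/set for this sketch.
interface Db {
  get(key: string): unknown;
  set(key: string, value: unknown): void;
}

// Filter(db, pred): keys failing the predicate don't exist through
// this view, for reads or writes.
function filter(db: Db, pred: (key: string) => boolean): Db {
  return {
    get: (key) => (pred(key) ? db.get(key) : undefined),
    set: (key, value) => {
      if (!pred(key)) throw new Error('key not visible through this view');
      db.set(key, value);
    },
  };
}
```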

<p>You could create variants of this for write access control, or for access control based on deep inspection of the values themselves.</p>

<h4 id="view">View</h4>

<p>Now we get into the beautiful stealing from CouchDB land. Let's say you're implementing a blog. To be displayed to the user, each article needs to be rendered to HTML. <code>View(db, fn)</code> presents a database interface through which each value is visible, transformed by the function. Writes are not allowed.</p>

<p>This allows us to pre-render (or lazy render + cache) the HTML content of our blog. You could combine this: <code>dbview = Mount(posts, '/post') ∪ Mount(View(posts, renderPost), '/postHTML')</code>. Now the raw post content can be read and modified via <code>'/post/slug'</code> and the rendered content itself is immediately available in the rendered paths.</p>
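<p>A sketch of View, transforming values lazily on read (a real implementation might cache, or eagerly re-render via <code>watch()</code>; again, the interface is a hypothetical minimal one):</p>

```typescript
// Hypothetical minimal interface, reduced to get/set for this sketch.
interface Db {
  get(key: string): unknown;
  set(key: string, value: unknown): void;
}

// View(db, fn): every value is visible transformed by fn; read-only.
function view(db: Db, fn: (value: unknown) => unknown): Db {
  return {
    // Transform lazily on each read. Missing keys stay missing.
    get: (key) => {
      const value = db.get(key);
      return value === undefined ? undefined : fn(value);
    },
    set: () => { throw new Error('views are read-only'); },
  };
}
```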

<hr>

<p>I could happily keep describing useful functions, but we'd be here all day. Let's just name a few more useful things, then move on:</p>

<ul>
<li>Schema validation middleware, which passes reads but does checked writes before saving</li>
<li>Expose a local DB view over a network connection</li>
<li>Ingest a remote DB over a network connection</li>
<li>Tools to expose &amp; consume a database from the browser, like Firebase</li>
<li>A caching DB proxy (active via watch, or passive)</li>
</ul>

<h2 id="versioningandconsistency">Versioning and consistency</h2>

<p>I don't want to go to <a href="https://aphyr.com/tags/jepsen">jepsen hell</a>. Is it possible to build this while maintaining some useful consistency guarantees? In short, I believe so. It'll take another blog post this long to talk about how though. I've been thinking about this for a long time and I have a bunch of ideas depending on how general purpose you want to make it. We might also need to restrict what data we're allowed to make transactions across (no transactions spanning multiple primary stores, that sort of thing).</p>

<p>Sharding and replica sets are also interesting - but how they're set up is orthogonal to the logical database network. There's no reason you couldn't have both. The only thing in the way is a lot of code.</p>

<h3 id="isthisevenuseful">Is this even useful?</h3>

<p>Good question. I was recently asked about this while doing some consulting work: "Sure, but what would we use it for?" I wanted a database like this in every single one of our half dozen or so projects. In a sense, MVC is really model-view-everything-else. The utility of this sort of thinking is that it lets us move more and more of the 'everything else' into the model.</p>

<p>I'm going to point to projects and say how I'd use a cool pluggable data store.</p>

<h4 id="blogproject">Blog project</h4>

<p>You're making a WordPress-like website with search, powered by server- and client-rendered React.</p>

<ul>
<li>The server rendering code is slow (200ms). I'd move that into a view on the database, eagerly re-rendering whenever anything is saved in the editor.</li>
<li>The full-text search would move to an Elasticsearch wrapper / interface.</li>
<li>The project renders client-side once the page has loaded, for inter-route navigation. I'd expose the database itself to the client through a firebasey API.</li>
</ul>

<h4 id="carparkproject">Carpark project</h4>

<p>The project has people register their car online, then they can drive into carparks. Their license plates are read by cameras and OCR software running on on-premises computers. The computers have whitelists of cars they allow in automatically. The whitelist needs to be constantly kept up to date. We track how long you stay and charge the customer's credit card directly. Oh yeah, and the internet randomly goes down sometimes.</p>

<p>So the active sync is begging to be implemented via a filter + simple caching middleware. The root server has a view through which only the whitelisted license plates are exposed. That view is actively synced + cached to the carpark's computer. Going the other way we cache entry &amp; exit events on the carpark computer and actively flush those back to the root servers.</p>

<hr>

<p>I'm going to stop there not because I don't have more ideas, but because I could be at this for days. Anyway, because nobody else has done it yet, I really want to build this thing. I think it's an obvious and important piece of internet infrastructure that would make it much simpler to build cool stuff.</p>

<p>If your company is interested in funding me to build this &amp; opensource it all, get in touch - I'm <a href="&#109;&#x61;&#x69;&#108;&#x74;&#111;:m&#x65;&#x40;&#106;&#x6f;&#115;&#x65;&#112;h&#x67;&#x2e;&#x63;&#111;&#x6d;">m&#x65;&#x40;&#106;&#x6f;&#115;&#x65;&#112;h&#x67;&#x2e;&#x63;&#111;&#x6d;</a>.</p>]]></content:encoded></item></channel></rss>