I'm working on ways to better convey the global structure of programs. The goal: use an open-source tool, get an idea for a simple tweak, fork the repo, orient yourself, and make the change you visualized -- all in a single afternoon. Understanding a strange codebase is hard; I can't change that, but I think I can make it easier for people to persevere. I want to carve steps into the wall of the learning process. I want to replace quantum leaps of understanding after weeks of effort with an hour of achievement for an honest hour (or three) of effort.
This focus on helping outsiders comprehend a project is unconventional. I'm less concerned about the readability of a codebase. I find the usual rhetoric around ‘readability’ tends to focus on helping authors merge contributions rather than helping others understand and appreciate their efforts. If you've ever seen an open source project whose CONTRIBUTING document consists of a nit-picky list of formatting rules and procedures for submitting patches, you know what I mean. There's a paucity of guidance earlier in the pipeline, when newcomers aren't thinking about sending a patch, just trying to understand the sea of abstractions, to keep their heads above water. I think improving this guidance might greatly increase the amount of citizen involvement in open source and the number of eyeballs reviewing code, so that people no longer simply use projects and treat their internals as externalities until the next serious security vulnerability. Our society is more anti-fragile when there's greater grassroots oversight of the software that is eating our world.
Not everyone has to understand every line of code that helps manage their lives, but all software should reward curiosity.
Changing society seems hard. Where to begin? One hint is the observation that early versions of most software are often surprisingly easy to understand. There's this pervasive tendency for software to accumulate accidental complexity over time, making it harder to understand, and also more brittle and ossified and harder to change. If we could resist this ossification we'd make strides in keeping the global structure of a program accessible. Such creeping accidental complexity has at least three causes:
A. Backwards compatibility considerations. Early mistakes in the design of an interface are often perpetuated indefinitely. Supporting them takes code. Projects that add many new features also accumulate many missteps. Over time the weight of these past adaptations starts to prevent future adaptation.
B. Churn in personnel. If a project lasts long enough, early contributors eventually leave and are replaced by new ones. The new ones have holes in their knowledge of the codebase, all the different facilities provided, the reasons why design decisions were made just so. Peter Naur pointed out back in 1985 the odd fact that no matter how much documentation we write, we can't seem to help newcomers understand our programs without talking to the original authors. In-person interactive conversations tend to be a precious resource; there are only so many of them newcomers can have before they need to start contributing to a project, and there's only so much bandwidth the old hands have to review changes for unnecessary complexity or over-engineering. Personnel churn is a lossy process; every generation of programmers on a project tends to know less about it, and to be less in control of it.
C. Vestigial features. Even after accounting for compatibility considerations, projects past a certain age often have features that can be removed. However, such simplification rarely happens because of the risk of regressions. We forget precisely why we did what we did, and that forces us to choose between risking regressions and continuing to cargo-cult old solutions long after they've become unnecessary.
There may be other phenomena I haven't considered, but these three suffice to illuminate a crucial point: they're independent of most technology choices we tend to think about. A new language or tool will at best have a short-term effect unless we're able to keep the big picture of a codebase comprehensible over time as members join and leave it.
Constraints on a solution
In direct correspondence to the above list, I've nailed down the following primary design invariants:
A. Minimize compatibility constraints. We can't avoid users creating habits and muscle memory in the tools they use, but things are different when we're building libraries for other programmers. In that situation the user of our creation is another programmer with the ability to empathize with our situation if given the right context, and without our API deeply embedded in muscle memory.
B. Be friendly to outsiders, because they will be the insiders of tomorrow. Many best practices we teach programmers today help insiders manage a project but hinder understanding in newcomers. In a strange new project straight-line code is usually easier to follow than lots of indirection and abstractions. Comments are of limited value because most comments explain local features, but fail to put them in a global context. Build systems that automate a lot of work in our specialized industrial-strength setup turn out to be brittle on someone's laptop when running for the first time.
C. Be rewrite-friendly. Rewrites are traditionally considered a bad idea, but the question shouldn't be whether they're a good or bad idea. Software is the most malleable medium known to man, and we should do all we can to embrace that essential malleability that is its primary characteristic and boon. If rewriting is stressful and error-prone, perhaps we're doing software all wrong in the first place.
These can seem like difficult constraints to satisfy independently, let alone simultaneously, but it seems likely that conveying global structure lies on the road to them all. If a project is to remain easy to understand over the long term it can't afford to have historical baggage; it is by definition friendly to outsiders; and it has to have been constantly getting rewritten.
Having carved out these constraints, everything else is open to question. I'm willing to try global variables, large functions, even goto statements if I can trade readability in the small for comprehension in the large. Our instincts on the right way to manage complexity have failed to yield dividends, so it's time to go against our instincts.
I'm a lot less certain about the solution than about the previous sections, but I've been exploring some promising mechanisms in my current project, an idealized assembly language and OS for teaching programming called Mu:
- Deemphasize interfaces in favor of tests. Automated tests are great not just for avoiding regressions and encouraging a loosely-coupled architecture, but also for conveying the big picture of a project. They can support all three invariants above:
- A. If a library and its users both have good tests, compatibility becomes a lot less important. It's not that hard to change how you call a library when you upgrade, if you can be confident that you'll be alerted to any breakage. It's more desirable to guarantee that upgrades take an hour 100% of the time, than that upgrades take no time 80% of the time but turn into a bottomless time-sink 0.01% of the time.
- B. Instead of describing your architecture, point newcomers at core tests that they can instrument or step through in a debugger. That allows them to focus first on the concrete series of operations desired in a situation rather than how those operations happen to be organized in code.
- C. Tests make a codebase more rewrite-friendly by reducing the possibility of regression that contributes most of the stress and uncertainty to a rewrite.
However, modern tests don't provide nearly enough coverage for our purposes. I'm exploring expanding testing in two directions to help gain confidence that everything is well when all tests pass:
- Testable interfaces at all levels of the stack. I'd like to be able to check that my server can send email even when the disk is full, for example, or that my browser remains responsive even while loading really long webpages. I think the conventional wisdom about not testing I/O is just an artifact of having non-testable interfaces for I/O. For example, a more testable interface for printing to screen would be to require a pointer to a screen object, so that tests can print to a fake screen (dependency injection) and check what was printed.
- White-box testing. Instead of checking that a function merely returns the expected value, allow it to log domain-specific facts that it discovered in the process, and encourage tests to check the log for specific features. This would allow testing more than just functional correctness: performance, fault tolerance, race conditions, etc. For example, you would be able to check that your search function doesn't quadruple the number of lookups required when it searches through double the data.
- Deemphasize abstractions in favor of traces. Traces are like logs of just the timeless properties of a domain deduced by the program, as opposed to details of the current implementation. Traces help us support two of the three invariants above:
- B. It might be more helpful for newcomers to follow a curated and sequenced ‘curriculum’ of traces, rather than to grope blindly through the deep directory tree of a large application as they do today. For example, the repository for a text editor might guide new programmers first to a trace of the events that happen between pressing a key and printing a character to screen, demarcating the major sub-systems of the codebase in the process and allowing each line in the logs to point back at code, silently jumping past details like what the precise function boundaries happen to be at the moment.
- C. Traces can also help veterans more easily manage and rewrite a system. Since traces can be associated with levels of detail (akin to log levels), it is possible to skim them more effectively than a flat listing or a step-by-step debugger. When tracking down bugs I've been finding it very helpful to use a ‘zoomable’ interface to bounce around the trace of a program in a random-access manner.
- Deemphasize modules in favor of layers, a cleaned-up history of the codebase. The key property of an application organized in a sequence of layers is that you can stop after any layer in the sequence and build a working app that runs and passes all its tests. This guarantee helps support invariant B above: it lets noobs first play with a simpler version of the program, focusing on just the core before gradually introducing more peripheral features.
My hypothesis is that building these three mechanisms deep into the stack from day 1 will radically impact the culture of an ecosystem in a way that no bolted-on tool or service at higher levels can replicate. It would be easier to be confident that an app is free from regressions if all automated tests pass. This would make the stack easy to rewrite and simplify by dropping features, without fear that a subset of targeted apps might break. As a result people might fork projects more easily, adding and particularly deleting unused features, exchanging code between disparate forks (copy the tests over, then try copying code over and making tests pass, rewriting and polishing where necessary). The community would have in effect a diversified portfolio of forks, a “wavefront” of possible combinations of features and alternative implementations of features instead of the single trunk with monotonically growing complexity that we get today. Application writers who wrote thorough tests for their apps (something they just can’t do today) would be able to bounce around between forks more easily without getting locked in to a single one as currently happens.
If these techniques are so great, why haven't they been tried before? I think it's because of a ‘drawback’ they all share: they're too powerful, and it's easy to shoot yourself in the foot. It takes taste to trace just implementation-independent facts about the domain and not implementation details, to make white-box tests robust to radical changes, to organize an app into layers. But I think requiring taste is a good thing. A prime reason we forget crucial details about codebases is that we create rules and processes around them, and those who follow us have no reason to remember the original reasons for the rules and processes. If every generation were to be allowed to make its own mistakes, the reasons would stay ‘in the air’ and not get lost. Our software would better reward curiosity.