Would you rather structurally avoid problems or detect them?
On self-imposed complexity and language design
Programmers tend to overcomplicate things. Are we bored at work? Do we like intricate puzzles? Do we have Not Invented Here syndrome? Do we not have enough hobbies? Who knows? But we do it. Time and again, instead of choosing a simple, robust solution, we try for a complicated one. Would we benefit from a few constraints?
That tendency also shows up in language design. We add complex types to our compiler to ensure we never get a nil. We add a borrow checker so we don’t segfault. We add garbage collection instead of using a simple memory management technique. These are complex solutions. Further, they seem to allow complex code. For example, people say they need types because their codebase is so large.
I don’t know if this essay has a point or if it will reach a sure conclusion. But I do want to explore this pattern I see repeatedly. This may meander a bit. It’s pure exploration pulled from anecdotes I’ve collected over the years. I’d appreciate your feedback on any technically incorrect statements.
On memory management
On an episode of Software Unscripted, Casey Muratori has this to say about libcurl’s proliferation of memory management bugs:
This is also why people don't understand why I don't love Rust.
Things like the borrow checker are, to me, just examples of embracing that old bad architecture and going, how do we just prevent the things that Curl got wrong from going wrong? That's just backwards. [It’s like saying] we're sort of C again, but we're gonna try and patch the problems of C. That's not the problem I have with C. Nobody who's good at C has any of those Curl problems.
He was talking about how Curl’s primary source of bugs was memory deallocation errors. Curl does a lot of dynamic allocation (using malloc()), then has to figure out when to free it. That can get complicated and often has bugs.
So people recommend rewriting it in Rust because Rust will complain at compile time if you free too early. Muratori’s point was that no good C programmer would allocate memory that way. Simple architectural patterns make it easy to deallocate at the right time.
One of those patterns is using a memory arena, a small pre-allocated heap of memory that you manually manage. Instead of calling malloc() whenever you need memory, you take some from the arena. If you use one arena per request, you can deallocate the whole arena when the request is finished. It’s simple and robust. It makes the problem trivial.
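Here is a minimal sketch of the idea in C. The API names and the per-request sizing are my own for illustration; they aren’t taken from Curl or from Muratori:

```c
#include <stddef.h>
#include <stdlib.h>

/* A bump arena: allocation is a pointer bump, and deallocation
   is releasing the whole block at once. */
typedef struct {
    char  *base;
    size_t used;
    size_t capacity;
} Arena;

Arena arena_create(size_t capacity) {
    Arena a = { malloc(capacity), 0, capacity };
    return a;
}

void *arena_alloc(Arena *a, size_t size) {
    size = (size + 7) & ~(size_t)7;        /* keep allocations 8-byte aligned */
    if (a->base == NULL || a->used + size > a->capacity)
        return NULL;                       /* arena exhausted */
    void *p = a->base + a->used;
    a->used += size;
    return p;
}

void arena_destroy(Arena *a) {
    free(a->base);                         /* frees every allocation at once */
    a->base = NULL;
    a->used = a->capacity = 0;
}
```

With one arena per request, the lifetime question disappears: everything allocated while handling the request dies together when arena_destroy() runs at the end of the handler. There is no per-object bookkeeping to get wrong.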
So Muratori’s point is that the problems solved by Rust’s lifetime checker are not problems he encounters regularly in good code. Is Rust designed for lousy code? Rust guarantees that no matter how messy your memory usage is, it will tell you if you’re safe. Maybe there should be some maximum messiness. And perhaps that constraint would force us to do a better job of architecting instead of asking the computer for help with our messes.
On garbage collection
I often think about garbage collection in this context. I use garbage collection daily. It has “won,” at least in terms of mindshare and probably in terms of lines of code run daily. It has made programming much easier because we can focus more on the domain problem than on memory management.
But is that true? Sure, we’re not holding onto pointers to eventually call free() on them. But I’ve been on several garbage-collected projects where programmers are constantly monitoring the memory usage of a server, manually tweaking the amount of memory they’re giving it in AWS consoles. Is that not manual memory management?
Is setting up, maintaining, and navigating monitoring systems any easier than operating under constrained memory? Sometimes, I wonder if the lack of constraints causes us to stop thinking critically. It gives us the illusion of unlimited memory, or as Alan Perlis put it:
A LISP programmer knows the value of everything, but the cost of nothing.
That quote now applies to all garbage-collected languages, not just LISP.
On a related topic, the JVM has had billions of dollars of research put into its amazing garbage collector. It is truly impressive how fast it runs. But programmers still run into memory problems eventually. The GC pauses. You’re still leaking memory somehow.
I can’t help but think about the Erlang process model. Each process has its own heap and shares no memory with other processes, so the GC only has a tiny bit of memory to scan. Further, it can pause one process while the others continue. The GC can stay simple because each heap it manages is so small.
In Erlang, a message’s contents are copied when it is passed between processes. The data is immutable anyway, so why not copy it? Copying is much faster than you might think, especially compared to a garbage collector loading large portions of a shared heap into the cache. Avoiding that GC pause is huge.
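To make that concrete, here is a rough sketch of the share-nothing discipline transplanted into C. The Process struct and send_message() are invented for illustration; Erlang’s actual runtime is far more sophisticated:

```c
#include <stddef.h>
#include <string.h>

/* Share-nothing message passing, Erlang-style: each "process" owns a
   private heap, and sending deep-copies the payload into the
   receiver's heap, so no pointers are ever shared between them. */
typedef struct {
    char  *heap;       /* private heap: only this process reads or writes it */
    size_t used;
    size_t capacity;
} Process;

/* Copy a message into the receiver's heap and return the receiver-side
   pointer. The sender can reuse or free its copy immediately; a
   per-process GC only ever scans its own small heap. */
void *send_message(Process *receiver, const void *msg, size_t len) {
    if (receiver->used + len > receiver->capacity)
        return NULL;                        /* receiver's heap is full */
    void *dest = receiver->heap + receiver->used;
    memcpy(dest, msg, len);                 /* the copy is the isolation */
    receiver->used += len;
    return dest;
}
```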
The only problem I can think of is copying large pieces of data, such as deeply nested data structures. But that’s easy to detect and might be a practical constraint to work under. Just don’t copy big data structures.
On type systems
Ironically, Muratori, in the same episode, makes this tautological “argument”:
Anything with static type checking is an improvement because static type checking is very helpful.
Okay, they’re helpful. But are they only helpful because of other design choices you’ve made, similar to how lousy architecture is what makes Rust’s memory checks necessary?
Gilad Bracha had this to say about currying and types:
But Andrew found that trying to write curried functions in Smalltalk, which you can, was the first time he ever wanted a type checker in Smalltalk. And that's weird, 'cause usually you'd really have very little interest in a type checker when you're writing Smalltalk. The lack of structure, the fact that it's this uniform space of unidentifiable things that all can compose every which way, is probably the reason why in that kind of language, maybe you really have no choice but to put in a type checker.
To paraphrase, there’s so little information in point-free style code that the only way to get it right is with the help of a type system.
When I was interviewed on Software Unscripted, we discussed types. Feldman, the host, asked me why I didn’t see more type errors when programming in a dynamically typed language. Here is part of my answer:
And of course those are things you have to, like, learn over time, and it takes a while to see the tricks. 'Cause I don't even know the tricks I do that work. I hear other people say, "oh man, I got this bug. Where did it come from?"
And I'm like, "Well, look at your code! If you just did it this way, there would be no bug." And so that person is, like, pining for a type system, but I'm saying your code is really not idiomatic. There are a lot of habits that Clojure programmers have that avoid those bugs. I'm not trying to say it's better, but it might be one of the reasons why I'm not encountering as many bugs. I've got all these idioms from Clojure, and together they're avoiding the bugs.
In summary, over years of programming Clojure, I’ve learned to program in a way that avoids many type errors. But I don’t know what that way is. This is similar to Casey’s sentiment about memory management. (And today, I am exploring whether it’s better.)
Rich Hickey talked a lot about type systems in his 2017 Conj keynote, Effective Programs (transcript here).
I think that they've created problems that they now use types to solve. Oh, I pattern-matched this thing 500 places and I want to add another thing in the middle. Well thank goodness I have types to find those 500 places. But nobody should have cared about [the change] except the new code that consumed it and if I did that a different way I wouldn't have had to change anything except the producer and the consumer, not everybody else who couldn't possibly know about it.
One of the arguments for a type system is that it helps you refactor. You change a type signature, and the compiler tells you exactly where your errors are. That’s true, but is it a self-created problem? You only need to make changes in so many places because your type system needs to know the type of every expression in the codebase.
Although the compiler will guide you to a corrected program, that’s terrible for exploratory refactoring. In Clojure, I can change a function signature and then test whether the change has the effect I want without correcting the whole program. Sometimes I find out it’s wrong and can back out of the change quickly. In Haskell, I wouldn’t find that out until I’d spent an hour correcting compiler errors.
On puzzle solving
In the same talk, Hickey gives a possible reason for over-complication:
And these type systems they're quite fun, because from an endorphin standpoint solving puzzles and solving problems is the same, it gives you the same rush. Puzzle solving is really cool. But that's not what it should be about. […] Solve problems, not puzzles.
To summarize: solving problems in an overcomplicated way is fun. We should identify the real problem we’re trying to solve and solve that. But that feels like work.
Interestingly, another hero of mine, Chuck Moore, has this to say about his programming language, Forth:
I write Forth code every day. It is a joy to write a few simple words and solve a problem. As brain exercise it far surpasses cards, crosswords or Sudoku; and is useful.
Forth’s aesthetic is to simplify by dropping much of the generality we take for granted in the mainstream programming world. For example, why not hardcode the font size? And if you make it a power of two, the math gets easier. Lots of code becomes unnecessary.
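To see why a power of two helps, here’s the idea sketched in C rather than Forth. The constant and helper names are invented; a Forth programmer would build the equivalent words directly:

```c
/* Hardcode a power-of-two line height and the layout math collapses:
   division becomes a shift, remainder becomes a mask, and the whole
   configurable-font-size subsystem disappears. */
#define LINE_HEIGHT 16  /* 2^4, fixed by decree */

/* Which text row does a pixel y-coordinate fall in? */
static inline int row_of(int y)   { return y >> 4; }                 /* y / 16 */

/* How far into that row is it? */
static inline int y_in_row(int y) { return y & (LINE_HEIGHT - 1); }  /* y % 16 */
```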
Forth programmers talk a lot about factoring, as in math, where you factor out a common factor (2x + 6 = 2(x + 3)). It’s similar to the DRY aesthetic. They factor their code to an extreme extent, often shrinking line counts by orders of magnitude. They start at a very low level (basic stack operations) and build up to domain-level constructs. By controlling the whole stack and factoring relentlessly, they avoid the complications of existing, complicated language features.
Sometimes, when I look at it, it seems like the overall savings in Forth are worth it. If a codebase’s complexity limits future development, the constraint of writing the whole thing keeps things simple. But I also enjoy the higher-level exploration I can do at a Lisp REPL. I’m constantly torn between them. Is a middle ground the way? Or are both modes (unconstrained and constrained) necessary?
On assumptions
John Carmack has this to say in an interview with Lex Fridman:
I'm kind of fond, in a lot of cases, of static array size declarations. I went through this period where it's like, okay, now we have general collection classes, we should just make everything variable. I had this history of, in the early days, you get Doom, which had some fixed limits on it, then everybody started making crazier and crazier things and they kept bumping up the different limits, this many lines, this many sectors, and it seemed like a good idea, well, we should just make this completely generic, it can kind of go up to whatever, and there's cases where that's the right thing to do.
But also, the other aspect of the world changing around you is it's good to be informed when the world has changed more than you thought it would, and if you've got a continuously growing collection, you're never going to find out. You might have this quadratic slowdown on something where you thought, "oh, I'm only ever going to have a handful of these." But something changes, and there's a new design style, and all of a sudden you've got 10,000 of them.
So I kind of like, in many cases, picking a number, some nice round power-of-two number, and setting it up in there and having an assert saying, "hey, you hit this limit, you should probably think: are the choices that you've made around all of this still relevant?" If somebody's using 10 times more than you thought they would, yeah, this code was originally written with this kind of worldview, with this kind of set of constraints; you were thinking of the world in this way.
In other words, he likes to use static array sizes. He’s forced to rethink the assumptions when he reaches the fixed limit. Just because we know how to write the code to handle unlimited items in a list doesn’t mean it will work as expected. The constraint of a fixed-size array might give us the idea of a better solution.
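A small C sketch of the pattern Carmack describes. The limit, the Sector type, and the assert message are mine, invented for illustration:

```c
#include <assert.h>

/* A fixed limit chosen when the design assumed "a handful" of these.
   A nice round power of two, per Carmack. */
#define MAX_SECTORS 1024

typedef struct {
    int floor_height;
    int ceiling_height;
} Sector;

static Sector sectors[MAX_SECTORS];
static int    sector_count = 0;

Sector *add_sector(void) {
    /* Failing loudly here is the point: hitting the limit is a prompt
       to revisit the design, not just a number to bump. */
    assert(sector_count < MAX_SECTORS &&
           "sector limit hit: are the old assumptions still valid?");
    return &sectors[sector_count++];
}
```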
On programmer preferences
Why does Muratori like types but dislike Rust’s borrow checker? Why does Hickey dislike types?
Recently, David Heinemeier Hansson expressed an opinion on the static vs dynamic typing debate:
These days, I've come to appreciate the magnificence of multiplicity. Programming would be an awful endeavor if we were all confined to the same paradigm. Human nature is much too varied to accept such constraint on its creativity.
Could you imagine if all visual art had to be rendered in the style of cubism? Or realism? Or all novels written in the short, direct flavor of Hemingway? What a bore it would all quickly be!
His take is that the static vs dynamic typing choice is all about personal preference. Just do what floats your boat!
I’m not a fan of that explanation. It’s a weak take. We adapt to the environment we grow up in. We learn many lessons within the constraints of that environment, and then we start to see the world from that perspective.
For example, I’ve been doing Clojure for a long time, so I no longer see type errors. I know a subset of Clojure that does the job using data-oriented programming and REPL-Driven Development. When I go to another language, I feel limited by the lack of facilities for that style. It feels limiting, but really, there is a different style that the other language is optimized for. If I learned that, I wouldn’t feel limited.
Likewise, Casey Muratori has been programming games in C and C++ for many years. Memory management is something he internalized long ago. But he still relies on the C type system because it has always been there. He never had to learn another way.
This would explain why people who learn to program in multiple languages seem to develop skills that transcend any particular language. You get more perspectives and learn to write good code under different constraints. We can use that to get better: code under different sets of constraints.
On loud problems
Muratori has a take on the importance of segmentation faults:
I think it's important not to overweight certain classes of bugs either. It's like, do you worry the same way about all the other metrics you might care about? Like when you accept something from a new contributor, do you worry about the fact that it might just be buggy?
Sure, it doesn't do the right things. It's not gonna crash, but it's gonna, like, totally corrupt the machine code output in a way that's gonna take us a really long time to find, and we'll produce bugs in all the programs compiled by Roc. It's like, hey, they didn't segfault. It's like, who cares if they segfault?
It's a compiler. If it segfaults, at least we know where the bug was. It's where the segfault is. There are a lot worse things to be worried about than a segfault. Actually, I think people overindex on segfaults because they're very noticeable. It's a fail-loud kind of a thing. But if your goal is to ship reliable software, you have to be concerned about the bugs across the board.
It can't just be segfaults. And it's like, if you're doing all of this work and the only thing that you got out of that was just that you're telling me the language probably won't segfault, I mean, congratulations.
To summarize: segmentation faults are easy to detect but not especially important. What about all the other kinds of bugs? A compiler can’t tell me about those. Borrow checking has a high cost, and it finds low-value bugs.
I feel the same way when people bring up null pointer exceptions in Clojure. I get them sometimes, but I fix them, and they’re done. Having to track down a bug now and then never made me yearn for Hindley-Milner.
It’s similar to Hickey’s take about the kinds of errors type systems find:
The biggest errors are not caught by these type systems. You need extensive testing to do real-world effectiveness checking. Names dominate semantics. a → a, [a] → [a], it means nothing, it tells you nothing. If you take away the word "reverse", you don't know anything, you really don't. And to elevate this to say, "oh this is an important thing, we have all these properties", it's not true, it just isn't true. There are thousands of functions that take a list of a and return a list of a. What does that mean? It means nothing. And checking it...I mean, if you only had a list of a’s, where you gonna get something else to return? I mean, obviously you're going to return a list of a’s, unless you're, you know, getting stuff from somewhere else, and if you're functional, you're not.
To paraphrase: type systems can’t find the interesting errors. They can only find the obvious ones, because they can’t understand what a function’s name means. But type systems are easy to write, so it seems helpful to have one. But is it solving the hard problem?
Conclusion
We’re prone to solving problems that aren’t real. We often solve one problem with a bigger one, even when a simple solution exists. I’ve shown quotes from people steeped in their aesthetic cultures who have found simple solutions to real problems. It seems like the best case is one in which we learn from each other. Instead of building a technical solution to the problem of segmentation faults, can we teach each other good, robust memory management practices? Instead of complex type systems, could we learn to code so that we make fewer type errors?
I think it's far too quick to claim something like "type systems can't find interesting errors." The example Rich Hickey gives about static types is certainly common enough: people fall into the trap of thinking in terms of type signatures like [a] -> [a] because those are the easiest things to express in the type systems people work with on a day-to-day basis. But a sufficiently advanced type system can express a lot more than that.
Conal Elliott's talk "Can Tensor Programming Be Liberated from the Fortran Data Paradigm" (https://youtu.be/oaIMMclGuog?si=1sz-bAFNGj6rEbNY) shows a correct-by-construction, purely functional approach to two classic "array" programming problems: parallel scan and FFT. The fundamental challenge, of course, is picking the right representation for the problem. The type system won't tell you what representation to use, but once you start to denote the problem, if used correctly, it can quickly identify inconsistencies in your representation and give you information about how general your solution is. This is how the resulting implementation of FFT is shown to be "portable" across different data structures.
We can fix programs, but we cannot (completely) fix people. There will always be violations of conventions and practices that are not strictly enforced. If an interface implementation is not checked by a compiler or a static type checker, then someday somebody will mess up parameter types or order, whether by mistake or out of lack of knowledge. In any case, we will pay the maintenance price: either in time spent writing types and fixing compile errors, or in communication and fixing bugs.
This seems to come down to how easy it is to train people on a team to reach a shared understanding of the project's conventions and design practices. That is also complexity. It is just taken out of the tooling and put into the people.