Utilization, throughput, and latency

Three ways to manage a team

Nov 19, 2024

I’m looking for a full time position. If you’re looking for someone like me, please reach out. I prefer startups.

Check out Grokking Simplicity as a book to recommend to someone who’s curious about functional programming. And leave a review if you liked it!

I’ve been re-reading The Principles of Product Development Flow by Donald G. Reinertsen. It’s a dense read, but it’s great. And it’s getting me thinking about the progression we’ve seen in management processes. This progression goes from utilization accounting, to throughput accounting, to latency accounting.

Utilization accounting

I’ve dug into my experience of management before: They don’t know how to help, only how to dial up and down the pressure. If things feel like they’re going too slow, they try to get you to go faster. And they might even try taking non-programming work off your plate in an attempt to get you to spend more time programming. (PS, this often backfires and turns you into an unthinking code monkey.)

People over process (Eric Normand, February 26, 2024)

What is a programmer's job? (Eric Normand, April 2, 2024)

The framework these managers are using is “utilization accounting.” They want to maximize the amount of work they get from one of their most expensive types of employees. Basically, you measure how much time they worked divided by how much time you’re paying for, and that’s utilization. Higher is better, in this framework.
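As a minimal sketch, that ratio looks like this in Python. The function name and the sample hours are made up for illustration and do not come from any real timesheet.

```python
def utilization(hours_worked: float, hours_paid: float) -> float:
    """Fraction of paid time that was spent working."""
    if hours_paid <= 0:
        raise ValueError("hours_paid must be positive")
    return hours_worked / hours_paid


# Hypothetical week: 34 hours of hands-on work out of 40 paid hours.
weekly = utilization(34, 40)
print(f"{weekly:.0%}")  # 85%
```

Notice that nothing in this number says whether those 34 hours went to the right work.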

Optimizing for utilization seems like a good idea, but it can have some disastrous consequences. Here are just a few:

Working hard on the wrong thing.
Working hard on so many things you lose time context switching between them.
Not being able to respond quickly to an opportunity because everyone is busy.

The book The Goal by Eliyahu M. Goldratt shows how this management framework can lead a factory astray. What he suggests instead is to use throughput accounting, which forms the basis for Theory of Constraints.

Throughput accounting

In throughput accounting, we focus on the rate at which valuable work is delivered. For instance, a car factory might measure their productivity in terms of cars produced per day. What Goldratt showed was that if we focus on the productivity of the parts of the system, we ignore the productivity of the whole system. Local efficiencies could lead to global inefficiencies.

Software teams often count stories or story points as a measure of their throughput, and retrospectives are supposed to get us thinking about how to increase it. A lot of that effort goes into looking for the bottleneck, the most constrained resource. Instead of trying to maximize the use of all resources, you try to maximize the use of the bottleneck, then try to expand the bottleneck. Maximizing the use of anything else won’t help throughput.
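Here is a rough sketch of why only the bottleneck matters, assuming a simple pipeline where work passes through each stage in turn. The stage names and weekly capacities are hypothetical.

```python
# Hypothetical pipeline: stories each stage can finish per week.
stage_capacity = {
    "design": 12,
    "development": 5,
    "review": 9,
    "deploy": 20,
}


def bottleneck(capacities: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and its rate, which caps the whole system."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]


name, rate = bottleneck(stage_capacity)
print(name, rate)  # development 5

# Speeding up a non-bottleneck stage leaves system throughput unchanged.
faster_review = {**stage_capacity, "review": 18}
print(bottleneck(faster_review))  # ('development', 5)

# Expanding the bottleneck helps, until another stage becomes the limit.
more_dev = {**stage_capacity, "development": 10}
print(bottleneck(more_dev))  # ('review', 9)
```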

Here’s a story that illustrates the sometimes counterintuitive thinking that happens when you switch to a throughput mindset.

There was a factory where they discovered that the bottleneck was on a certain machine. They were already maximizing that bottleneck using various techniques. But it was still limiting the throughput. They calculated that if they could run the machine an extra hour per day, they would increase throughput of the factory and be able to make $10 million more per year.

But management resisted adding an extra hour to the shift with numerous excuses:

Excuse: We’ll have to pay overtime.

Yes, you’ll pay one small team time-and-a-half for one hour per day—and make $10 million more per year.

Excuse: Yes, but they’ll miss the last bus back home.

Pay for a taxi. It would cost $30 per person—and make $10 million more per year.

Usually the story ends with a deeper, more sinister reason why they can’t just do the math:

Excuse: Factory workers don’t deserve a taxi ride home. Not even management gets that.

Oof, that story gives me flashbacks to companies who are willing to spend $500,000/year on an employee but won’t pay $200/month for a tool to help them with their job. The moral is that our sense of the value of employees is tied up in pay and perks. Sometimes management cannot see past the hierarchies they set up.
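To put rough numbers on the factory story, here is a back-of-the-envelope sketch. The $10 million, the $30 taxi fare, and time-and-a-half come from the story; the team size, the hourly wage, and the number of working days are assumptions picked only for illustration.

```python
# Figures from the story.
EXTRA_INCOME_PER_YEAR = 10_000_000  # dollars
TAXI_FARE_PER_PERSON = 30           # dollars per ride home
OVERTIME_MULTIPLIER = 1.5           # time-and-a-half

# Assumptions for illustration only; the story does not give these.
TEAM_SIZE = 10
BASE_HOURLY_WAGE = 30               # dollars
WORKING_DAYS_PER_YEAR = 250


def extra_hour_cost_per_year() -> float:
    """Yearly cost of one overtime hour plus a taxi for each team member."""
    overtime_pay = BASE_HOURLY_WAGE * OVERTIME_MULTIPLIER
    per_person_per_day = overtime_pay + TAXI_FARE_PER_PERSON
    return per_person_per_day * TEAM_SIZE * WORKING_DAYS_PER_YEAR


cost = extra_hour_cost_per_year()
net_gain = EXTRA_INCOME_PER_YEAR - cost
print(f"cost: ${cost:,.0f}, net gain: ${net_gain:,.0f}")
```

Under these assumptions the extra hour costs well under 2% of what it brings in, so the objections are not really about money.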

I wrote a post about the real bottleneck on development teams.

The most underrated book in software engineering management (Eric Normand, December 11, 2023)

Okay, but back to throughput accounting itself: one of the issues with maximizing throughput is that you often do it at the expense of latency. In software companies, the bottleneck is often the capacity of the team to do the work. One way to maximize throughput is to add a queue before the workers—like a backlog. If you make sure the backlog is always full, then the workers will never have idle time.

Another way to increase throughput is to increase the batch size. Sometimes a programmer will notice that two tickets are very related. Instead of risking that they start on one ticket, then someone else takes the related ticket, then they duplicate some work or have to coordinate, the programmer takes both.

The problem with long backlogs and large batch sizes is that they increase latency. If you have low latency, you can respond to opportunities and emergencies quickly. But if your latency is high, you’re sluggish, even if you’re logging lots of story points per sprint!
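One way to see the link between backlog size and latency is Little's Law from queueing theory: over the long run in a stable system, the average time an item spends in the system equals the average number of items in it divided by the average throughput. The sketch below applies it to a backlog. The function name and the numbers are hypothetical.

```python
def average_wait_weeks(items_in_backlog: float, throughput_per_week: float) -> float:
    """Little's Law: average time in system = items in system / throughput."""
    if throughput_per_week <= 0:
        raise ValueError("throughput_per_week must be positive")
    return items_in_backlog / throughput_per_week


# Same team, same throughput of 5 stories per week, two backlog sizes.
print(average_wait_weeks(120, 5))  # 24.0 weeks for a long backlog
print(average_wait_weeks(10, 5))   # 2.0 weeks for a short one
```

The team is equally productive in both cases. Only the wait changes.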

Latency accounting

Throughput is very important in a factory where there is more demand than supply. The more cars you produce, the more you can sell. But a factory producing a known commodity is not what we do as programmers. We are more often dealing with an unknown commodity on multiple levels.

We don’t understand the whole of the software (unless it’s very small).
We don’t understand what the customer really wants (they don’t either).
We don’t know how to implement the thing we guess they want.

Maybe one of these things doesn’t apply to a given situation, but I have found that software is characterized by a large degree of uncertainty. Couple that with the speed necessary at a startup, where the clock is ticking to find product-market fit before you run out of money, and you’re firmly in the Complex domain of the Cynefin framework.

To succeed there, we often need to optimize for latency—the ability to respond quickly, do many experiments, and get lots of feedback. Throughput does not matter that much. And so, in such environments, you need to keep the backlog short (or non-existent) and keep the batch size small.

Also, notice that this is the original idea behind agile software development. You want to build software incrementally, with many small experiments, and value delivered quickly so you can get customer feedback.

This reminds me of the 3X framework by Kent Beck. In it, he describes three different phases of software development, and they seem to correspond to these three types of accounting, except in the reverse order. In the Explore phase, you want to optimize for latency because you don’t know what you’re looking for. In the Expand phase, you’ve found something valuable and you want it to grow—so you optimize for throughput because you know what you want. Then, finally, in the Extract phase, you’re more focused on maintenance and operating efficiency, so you worry about utilization.

I’m also thinking about Wiring the Winning Organization by Gene Kim and Steven J. Spear, which goes into much finer granularity about different ways of approaching work.

Conclusion

How many times have managers asked why dev teams are going so slow? It’s happened to me so many times that I started researching all of this stuff to be able to explain it. Often, the manager is caught in a utilization accounting mindset. Things must be slow because people aren’t working hard.

Reading about the Theory of Constraints helped me realize that the dev team will always seem slow if it’s the bottleneck of feature development—and that the answer usually isn’t to take non-programming work off their plate. In the throughput mindset, they must be slow because they need more capacity. That never seemed right, either. We know that adding more people usually makes things slower.

But I’ve also found that people’s perception of speed is often based on when things were requested—meaning added to the backlog. “I asked for this 6 months ago and I’m still waiting!” Yep! And there’s another six months of work waiting in front of yours in the backlog. Even after we prioritize it, your work just never gets close to the front!

Using the latency mindset, if we want to be able to move faster, it’s not enough to increase throughput. We actually need to decrease latency. That means decreasing the size of the backlog, doing smaller work items, and decreasing utilization so there is some extra capacity to respond.
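To illustrate why spare capacity matters, here is a sketch using the textbook single-server queue (M/M/1), which assumes random arrivals and random job sizes. That is a simplification of a real team, not a model of one. In that model, the average time a job waits before work starts is utilization / (1 - utilization) times the average time the job itself takes.

```python
def queue_wait_multiplier(utilization: float) -> float:
    """Mean wait in queue, in multiples of the mean job time (M/M/1 model)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be at least 0 and below 1")
    return utilization / (1 - utilization)


for u in (0.5, 0.8, 0.9, 0.95):
    multiplier = queue_wait_multiplier(u)
    print(f"{u:.0%} utilized: wait is about {multiplier:.1f}x the job time")
```

Under these assumptions, going from 80% to 95% utilization takes the wait from roughly 4 times the job time to roughly 19 times.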

And this is very hard. It goes counter to our tendencies. If we know we have to get in line for 6 months before our work will be started, we tend to add more work to the request. “Let’s put everything we can think of in there that we might want, because we don’t want to find out in 6 months we forgot something.” What we need to do instead is ask: “What tiny thing can we build that will help us answer this question?” Or: “What tiny thing can we build that will derisk the rest of the work?”

But in the end, we programmers find ourselves in a bit of a pickle. We’re expensive, and that stimulates the part of the brain that wants to optimize for utilization. And we are a bottleneck, very often, so we are the part of the system that needs attention. Plus, our work requires us to respond quickly to bugs and outages. We’re always going to be the target of management. If it’s bad management, it’s terrible for us. But if it’s good, it can actually make our work more pleasant.

Rock on!
Eric

Jeff Grigg

Nov 19

We feel that software development is slow because it is time-consuming, labor-intensive work. And because there is an unlimited amount of work that could/should be done that would provide business value. And increasing our effectiveness and/or throughput increases the expectations.

So we need to be able to cope with the fact that it will always seem like software development is "too slow."

So we need to focus on prioritization. And on breaking work up so that we can defer lower priority work. And improving responsiveness makes this possible.

And we also need to frequently invest in improving our capacity, efficiency, and effectiveness to avoid getting trapped in a downward spiral of self-destruction. That's part of the problem of "What can I remove from your plate to get this one thing done sooner?" -- It's short term thinking that gets this one thing done faster -- most likely at the expense of all future work. And, having done this now, on this one thing, we'll probably do it again next week for another, and then again, and again, etc. Again, a downward spiral of self-destruction.

