Heisenberg Effect in Software

One of the greatest services that software engineers provide is that we help people understand their own thought processes. We're transcribers of human thought, in a way. We have to understand a human process so intricately that we can write down instructions in another language that the dumbest, most literal being on earth (a computer) can follow.

We hold a mirror up to the user: We built this system in your image--does this look like you? Is the image distorted? If so, how? In the Agile philosophy, we hold this mirror up as early and often as we can.

Another metaphor I like comes from an old interview with Andy Hunt and Dave Thomas (of "Pragmatic Programmer" fame), where they described a Heisenberg effect in the process of building software for people:

Software [has] a Heisenberg effect, where delivering the software changes the user's perception of the requirements.

Users need to see our interpretation of what they told us. That helps them understand themselves better, and then, we can understand them better.

Are You a Stable or Volatile?

I've written before on this blog about the different personality traits of software engineers, and how different traits balance each other on a team.

Some of the spectra I came up with were:

  • Dreamers ↔ Pragmatists
  • Big Picture ↔ Detail-Oriented
  • Move Fast & Break Things ↔ Slow & Methodical
  • Optimists ↔ Pessimists
  • Answerers ↔ Questioners
Recently I read a comment on Hacker News (which I can't find now) that mentioned a blog post from 2012 on Rands In Repose, a blog I've been familiar with for years, but somehow this post had escaped me.

The author, Michael Lopp, introduces the Stables and Volatiles divide, which feels to me like a meta-category that encompasses the ones I listed above.


You can probably guess just from the names, but Stables tend to like having a well-defined plan, are more risk-sensitive, predictable, and reliable. Volatiles are disruptive, like to blaze their own trail, dislike process, and have a strong bias for shipping something (even if it's ugly and unstable).

Lopp says:
Stables will feel like they’re endlessly babysitting and cleaning up Volatiles’ messes, while Volatiles will feel like the Stables’ lack of creativity and risk acceptance is holding back the company and innovation as a whole. Their perspectives, while divergent, are essential to a healthy business.

I believe a healthy company that wants to continue to grow and invent needs to equally invest in both their Stables and their Volatiles.
I think the Stable / Volatile concept feels very true to my experience, and it's one I'll keep with me. Assuming that everyone on a team is equally intelligent, competent, and well-intentioned, you need people of both the Stable and Volatile persuasions. Even though they annoy each other, a project staffed by only one or the other type is doomed.

Who Writes the Backlog?

The Product Owner has the final say in backlog prioritization. But who writes the backlog? Is it always the product owner?

My experience with Agile has been that, in the real world, the backlog contains two types of items:

  1. The "user stories" that discuss functionality from the perspective of an end user
  2. Technical stories--including things like technical debt, maintenance, package upgrades, etc. Things that don't represent user goals that the product owner would be focusing on.

Both types of items are important and need to be worked on. 

You can get into trouble in a couple ways:

  1. Every item in the backlog is forced into a "user story" template ("As a...", "I want to...", "So that...").
  2. The product owner is writing backlog items that are of the "technical story" variety since they think they need to write every story in the backlog.

If you have a product owner who has consistent time to write Jira tickets (or whatever issue tracker you have), you're pretty lucky. My experience with actually-existing-Agile is that the Product Owner is typically someone in middle management who is juggling multiple jobs, "product owner" of this product being one more job, and certainly does not have the time to dedicate to backlog management.

Occasionally you get an overactive product owner who tries to capture every task the development team is working on as backlog items, including purely technical tasks that are better written up by engineers or engineering managers/leads.

My opinion is that anyone on the team (engineers, QA, managers, product owner) should be empowered to write backlog items. When the nature of the backlog item is technical, let the technical people write it. When the backlog item is a "user story", the product owner should write the item, or at least delegate the requirements to whoever writes it (probably not an engineer).

Backlogs are just another tool in the tool chest of software engineering. As long as the people doing the work are clear on what the goal of the item is, and the acceptance criteria for its completion, then anyone can write them in whatever format they want.

Legacy Systems Age in Reverse

I feel like every job I've ever had in software there has been some old legacy system hanging around that everyone denigrated and perennially spoke about replacing with a newer, shinier system. That replacement was always coming any day now. By the time I'd moved on to a new job, that legacy system was still running, quietly doing some important function for the business, having somehow survived the demise everyone predicted.

The first release of jQuery was in 2006. Within a few years--and every year after--web developers would talk continuously about how outmoded jQuery was, and that no self-respecting developer would still be using it when so many newer, better JavaScript libraries had come along to replace it. Well, guess what. In 2024 (18 years later), jQuery is still in active development, and according to the official jQuery blog, 90% of all websites use jQuery currently. As someone who's been working in the web development space since roughly the dawn of jQuery, I can say that it is still extremely rare that I see a codebase that doesn't use jQuery somewhere, whether as a direct dependency, a transitive dependency, or as part of some 3rd-party tool integrated into the product. jQuery will outlast the human species.

Software systems exhibit the Lindy effect:

The Lindy effect (also known as Lindy's Law) is a theorized phenomenon by which the future life expectancy of some non-perishable things, like a technology or an idea, is proportional to their current age. Thus, the Lindy effect proposes the longer a period something has survived to exist or be used in the present, the longer its remaining life expectancy. Longevity implies a resistance to change, obsolescence, or competition, and greater odds of continued existence into the future. Where the Lindy effect applies, mortality rate decreases with time.

They actually age in reverse. Every year they exist doubles their additional life expectancy. That old system that everyone thinks will be replaced any day now is gaining strength before your eyes, becoming every day less likely to be replaced.

New technologies are the least likely to survive another day, another year. Just as a business that's existed for 100 years is more likely to survive to its 101st year than a 2-year-old business is likely to see its 3rd year.

The implications of the Lindy effect to any of us working in "technology" are so widespread that it's hard to overstate. We're in an industry obsessed with the new, but the new things are constantly passing away as the old geezers are running laps around them. Any new programming language you're excited about right now will be dead long before C is. VS Code will kick the bucket before Vim. Be kind to those elderly technologies around you, because they'll be here long after you're gone.

Scrum Tensions: Code Review

There are tensions in Scrum anywhere you have intra-sprint cycles that must resolve by the end of the sprint. In my last post I wrote about manual quality assurance and the tension that causes in a Scrum setting. Another one of these intra-sprint cycles that most Scrums teams include in their process is code review.

You could, in a way, consider code review to be another form of "QA". In both cases we're initiating a cycle-within-a-sprint with a loop of approval, rejection, re-work, and acceptance.

In the sense that Scrum teams strive to get every sprint backlog item to "done" by the end of the sprint, back-and-forth cycles kill sprints. But they're also essential. Therein lies the tension.

Most Scrum teams conduct estimation sessions prior to starting a sprint. The estimates that a team assigns to product backlog items are typically based on a gut feel of how long it will take or how complicated it will be to "complete" a backlog item. In most teams I've worked with, that estimate is implied to mean how long it will take a software engineer on the team to submit a code review (via a pull request or something similar), i.e., when the engineer thinks their part is "done". Some teams also build into the estimate a duration or complexity estimate based on the work necessary by the QA people.

What estimates don't ever capture, in my experience, is how much effort/time/complexity is required to complete the code review or QA cycles on a backlog item. How can we know ahead of time how many pieces of feedback are going to be left on a pull request? How many of those will necessitate re-work on the item before the end of the sprint? How long will the re-work take? What if the re-work is still not up to snuff? Etc., etc., etc. Each code review starts a cycle of unknown duration.

And all the while everyone who cares about the completion of the sprint is tap-tap-tapping their fingers, nervously wondering if these intra-sprint cycles are going to complete on time.

Even though people mean well, what happens in practice is that thorough code review is tacitly disincentivized. Most product owners are not engineers, and although they understand intuitively that code review, like any form of quality assurance, is necessary, it also adds delay to the immediate "done-ness" of work.

Engineers also know that code reviews are important, but there is always pressure to get to them faster. When bugs crop up later, or the codebase slowly accumulates technical debt, no one in management is going to track down the specific pull request that introduced the problem, read the list of names in the approvers list and assign blame to those individuals. A swift click on the "Approve" button satisfies everyone in the moment.

What do we do about this tension? We can't have code review without introducing an unpredictable amount of delay to each sprint backlog item. You can never fully resolve this tension, in my opinion. But there are some ways to ease the tension.

Be Your Own Reviewer

I'm going to put this one on the engineers. And, no, I'm not suggesting that an engineer who submits a pull request should be clicking "Approve" on their own pull request. What I am suggesting is that before an engineer submits a pull request, it should only be after they have examined their own code to the degree that a reviewer would.

I can't tell you how many times over my career that I've reviewed a pull request where it was clear that the submitter had not even looked at what they were submitting for review. You're not ready to submit a pull request until you've looked line-by-line at the diff you're about to submit and pre-corrected every issue you can find. We're not talking about architectural judgment calls here, we're talking about misspelling method names, pushing blank files, including huge sections of commented-out code from various abandoned local experiments.

It should be rare that a reviewer needs to point out an obvious mistake in a review. I personally feel embarrassed when a reviewer points out something that I know I could have found myself. And I feel doubly embarrassed if it's a mistake they've pointed out multiple times.

Automate, Automate, Automate

Take advantage of every opportunity to automate the tasks associated with code review. We have linters, pre-commit hooks in version control systems, IDE integrations. There are ways to automate basically any tedious, repetitive aspects of code review. If reviewers are repeatedly finding the same kinds of mistakes, it's time to automate the checks for those mistakes, so submitters can't submit them in the first place.

Back-and-forth intra-sprint cycles are what kill sprints. Code review is a cycle, but what can we do to get down to one iteration per cycle as often as possible?

The tension between any kind of quality assurance (including code review) and sprint-based methodologies is impossible to erase, but we have techniques to make it less tense.

Scrum Tensions: Manual QA

After working for many years in the software industry, almost always using a methodology resembling Scrum, there are certain tensions that I've come to believe have no good solution--at least--I've never seen them solved elegantly.

One of those tensions comes from manual quality assurance. When sprint backlog items need to be manually tested by dedicated QA people, there is simply no clean way to do it in my experience.

Here are some of the issues that I've seen repeatedly:

  • QA does not have time to test items where the development work is completed late in the sprint
  • QA people are sitting idle for the first few days of the sprint with no completed development work to test yet
  • If a bug is found, there is no time in the current sprint to fix it

Here are some attempted remedies that I've seen:

  • Let's build some slack into the sprint for the developers so they can complete their development work with X days to spare at the end to allow time for QA
  • Let's have the developers "work ahead" of the QA people, as in, the development work that we say is "done" within one sprint is actually tested in the next sprint

And here's where those remedies fall down:

  • Inevitably, there is pressure to work more items into successive sprints, and development work starts creeping over the false "deadline"
  • We end each sprint not knowing what is actually "done", and any bugs that are found have probably already been merged into whatever common branch the developers work from

QA people are in a really tough position as they are always sort of passively pressured to not "hold up" the sprint. They know that everyone wants to see a clean sprint where every item is signed off on by QA by the last day. And the double whammy is that they get heat when bugs are found in production.

This is one of those essential tensions in Scrum-based development that I've come to accept is not resolvable. I have never seen manual QA fit neatly into Scrum. 

As I see it, there are two options to break the tension:

  • Do manual QA, and abandon sprint-based methodologies (try Kanban, for example)
  • Use a sprint-based methodology, and avoid manual QA (some teams fully automate testing)
In my experience, organizations do neither of the above, and the tension continues.

Continuity of Leadership

There's a Kafkaesque situation that develops in companies that cannot retain senior employees. A team feels a sense of urgency about shipping software, but no one can tell them what to build.

Engineers are responsible for making the products that sustain the company, but there's no one around who can say what the products should do exactly.

It's like a movie where the protagonist wakes up every morning with no knowledge of what happened the day before, and tries to piece together his identity from artifacts scattered about.

Big organizational initiatives fail repeatedly because there's no one around from the beginning to the end of the initiative. People working on the initiative find that they can't even remember why the initiative was necessary. The leader who championed the initiative isn't around to take credit for its completion, so there's no one working on the initiative at any given time who cares about its completion.

Cross-team relationships barely exist because whoever is leading one team at the moment doesn't know the leaders of the other teams and has no history with them.

Leaders are continually "backfilling" for other leaders that left, being asked questions that they can't answer.

Morale stays at a low simmer. People show up to work each Monday morning knowing they won’t be productive. Another week where they won’t know if they did good work or not, because no one can define the goal.

An organization without continuity of leadership is an organization with amnesia.