The Fundamental Problems of Software

As far as I can tell, there are six immutable fundamental problems faced by all commercial software.

1. Identifying the correct problem to solve
2. Getting the right specifications to solve the problem
3. Distributing a shared understanding of the specifications
4. Implementing the specifications
5. Verifying correct implementation
6. Repeating the above process in the context of refining the problem and specifications over time

The hardest problems are 1, 2, 3, and 6.

#1: Problem Discovery & Definition

Identifying the correct problem to solve is difficult because human psychology is hard. No matter the nature of the problem, someone only wants us to solve it because of how the person or people purchasing the solution expect that solution will make them feel.

I don’t care what problem you’re solving; at the end of the day you’re working on it because of how one or more people feel about it. “But Anthony, I’m writing software for NASA’s next Mars rover,” you say. Sorry, but we only want it because humans feel curious about the world and perhaps want some greater sense of mastery over an indifferent universe.

Every software project’s greatest risk is a lack of sufficient value. If what we write doesn’t meet a need sufficient for someone to part with their money, our software won’t be run and we probably won’t be paid to write it much longer.

#2: Solution Discovery & Design

The next most challenging aspect of a software project is actually figuring out how to solve the problem. This solution needs to be fit for purpose in multiple dimensions.

It needs to

be understandable and usable by its users (those pesky humans and their psychology again!).
be cost-effective for the purchaser.
be economically viable for the business providing it.
meet the requirements of customer support teams.
meet legal and regulatory requirements.
be feasible to implement.
meet the performance, reliability, and durability expectations of its users.
meet the maintainability, changeability, testability, and observability requirements of the team working on it over the life of the service (human psychology strikes again!)
meet the financial reporting requirements of the business.
meet the business intelligence needs of the business.

All aspects of this multidimensional fit need to be considered across many different time scales. What race condition could occur if two actors issue a command within a few hundred milliseconds? How many requests per minute can it handle? How many months until the database needs to scale? How many years ago was this piece of code last changed?

As humans, we tend to solve problems by thinking in two-dimensional Euclidean space. If we can draw it on a screen or a page, we can expand our thought process beyond the limited working memory of our brains. And we can even share it with others. But even if we drew sequence diagrams, entity relationship diagrams, process maps, component diagrams, deployment diagrams, infrastructure diagrams, network diagrams, state machine diagrams, interaction overview diagrams, or even the C4 diagrams (system context, container, component, and code), we still would not be able to verify we have a correct and valuable solution until we run code for human use.

And this brings us to the heart of the third fundamental problem of building software.

#3: Shared Understanding

Not only is it effectively impossible to perfectly share the correct specification with a team for implementation, but we can’t know how correct our specification is until after we’ve built it!

So the specification will change while we’re implementing it, and every person implementing it will have a slightly different understanding of what the specification even is. Worse still, every person will have a slightly different understanding of the problem the specification is supposed to solve.

That means that while finding the correct problem and correct solution is a matter of human psychology, creating software is a problem of organisational psychology! Software is written, sold, maintained, and improved in a technosocial and psychosocial context.

For this reason, every software leader should endeavour to keep both their individual teams and their total headcount as small as possible. Interpersonal complexity increases combinatorially.
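To put rough numbers on "combinatorially", here is a minimal sketch that counts only two-person relationships in a group. That is a simplifying assumption on my part: it ignores cliques, subgroups, and cross-team links, all of which grow faster still. The function name `pairwise_channels` is made up for illustration.

```python
from math import comb


def pairwise_channels(people: int) -> int:
    """Count the distinct two-person relationships in a group of `people`."""
    # Choosing 2 people out of n gives n * (n - 1) / 2 pairs.
    return comb(people, 2)


for size in (2, 5, 10, 50):
    print(f"{size} people -> {pairwise_channels(size)} pairs")
# 2 people -> 1 pairs
# 5 people -> 10 pairs
# 10 people -> 45 pairs
# 50 people -> 1225 pairs
```

Under this simple model, growing from 5 people to 50 multiplies headcount by ten but multiplies the number of relationships by more than a hundred.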

But this is in direct tension with the fact that creating a defensible business requires building a large system to address the essential complexity of a valuable (difficult) problem. If the problem could be solved by one or two people, it likely already would be.

#4 & #5: Implementation and Verification

Funnily enough, problems 4 and 5 are the easiest parts of creating a software solution, yet they’re the focus of most of the literature on creating software. Most books, articles, YouTube videos, and online courses are about implementation. How come? Probably because this part of the problem is easily repeatable, consistent, verifiable, and locally reproducible. I can run a Kubernetes cluster on my local machine, but I can’t run the customer’s mind.

That's not to say it isn't important. The people working on this part, the engineers, must have a thorough understanding of the fundamentals of their craft. The stronger their fundamentals, the more capable they are of reducing a problem to its essence and finding the simplest possible solution.

#6: Software Over Time

Problem 6 is the problem of our solution existing and changing over time. It is the fact that this whole process is recursive. Our software solution begets new problems, requiring us to either change our existing solution or implement new ones. This is where software engineering emerges out of programming over time.

Given that most of the complexity, and all of the value, of a software solution exists after it is delivered, this aspect is vitally important. Yet humans tend to be bad at predictions and at reasoning across timescales. There is an enormous wealth of information addressing this problem from a technological perspective with engineering solutions, e.g. Site Reliability Engineering, but the problem also has, perhaps in equal measure, an anthropological component.

Every technological component forms a feedback loop with the people working on it. The people working on it are in a feedback loop with their adjacent teams, the wider business, and the customers. The impact of supposedly technological choices such as branch protection rules or choosing an infrastructure-as-code (IaC) tool is mostly anthropological over a sufficient timescale. That branch protection rule will become your pull request review culture, which will become your team dynamics and hierarchy. That IaC tool choice might determine whether your teams become siloed.

Who Solves These Problems?

When we look at the approach of the people solving these problems, a maturity model emerges. One I see professionals grow through over time. This model consists of three archetypes.

First is the coder, who focuses on problem #4, and a little on #5. Their impact is largely out of their hands; they simply implement the solutions asked of them.

Next is the software engineer, who focuses on problems 4, 5, and 6. They also start thinking about problems 2 and 3, and sometimes problem 1. This person is increasing their impact over time, rather than optimising for the short term.

Finally, we have the product engineer. This person works to solve all six problems. They’re not just leveraging their impact over time, but also in multiple social dimensions. They build trust within their organisation to improve the multidimensional fitness of their solutions by receiving candid feedback, as well as vulnerable expressions of the problems and concerns faced by their colleagues. They also know they need trust across the organisation because aspects of their solution will inevitably be incorrect. Without sufficient trust, they won't get the opportunity to improve their solution. Product engineers aren't just communicating within their group, but also with customers. They know the human context where their solution is used will determine its usefulness. They’re deadly focused on understanding the problem.

Ultimately, the lesson in all of this is that anyone working in software needs to aspire to better understand people. To become a better communicator, to grow their empathy, to become a better person. The technology is just the tool. It's not enough to love our tools, or even the solutions they create. We need to love the people we're building software with, and the people we're building software for.