Technical Debt, & the “Core Four” Practices to Avoid It
Readers! Subscribers! “Followers”! I hope you are all healthy and safe.
What is “Technical Debt”?
People are still debating the one true meaning of the term “technical debt.” Ward Cunningham coined the metaphor back in 1992, and the short definition is this:
Technical Debt is the deferment of good software design for the sake of expediency. Put simply, we make questionable design choices in order to get the product delivered. This may be a conscious decision, but more often than not it’s unconscious, the result of a time crunch.
Why is Technical Debt a concern?
Technical debt has real impact on the bottom line. It can slow the delivery of future releases or sprint increments, make defects harder to find and fix, and erode good testing practices.
Detour: What is “Software Design”?
The word “design” has many meanings within the software industry. “Software design” is distinct from user-interface design (roughly the “look and feel” of the software product) and also user-experience design (roughly, how the user uses the software to do something they want to do, and the subjective quality of that experience).
Software design is the internal structure of the code. It’s something that is chosen and written by software developers, and typically needs to be read and understood only by software developers on the team.
It’s not magic, and it’s not some pie-in-the-sky notion of perfection or art. There’s certainly skill and finesse that goes into doing it well. But the value of a good design is entirely pragmatic. A good software design does two things:
1. Communicates the intent of all software behaviors to the team’s developers, now, and in the future.
2. Facilitates future enhancements, both expected and unexpected.
And that’s really it. There’s a whole library aisle written about it, and a lot of that is very informative. All the lessons of Design Patterns, for example, are very useful. It’s not a waste of time to learn about Design Patterns. And still, they all boil down to those two.
Do we need to get the software design right, upfront?
I built software in ye olde “pre-Agile” days, prior to 1996, and the techniques we used were, by comparison, excessively predictive rather than relying on the fast feedback and empirical methods we later used on XP teams. (Don’t know XP? For now, just think of it as “Scrum++”!)
Doing all the design up-front, in a “design phase,” worked fine until we started testing (if the business felt there was still time for testing…), or until it was in the hands of the customers.
That’s why we enthusiastically embraced Agile methods like Scrum and XP, long before the term “Agile” was used. We said “Let’s do just enough for now, and fold in what we learn.”
That won’t work! We’re doing Agile, and it’s still a painful experience.
Let me tell you about what I call the “Agilist’s Dilemma.”
For about the first 5-8 sprints (or “iterations”), everything may go smoothly: everyone is happy to be doing what they enjoy. Coders code, testers test, teams demonstrate tiny fractions of high-priority working software to stakeholders. There are balloons and cake. (But no glitter! For the love of all that has nooks and corners, please, NO GLITTER!)
After that, everything typically starts to slow down considerably. Why is this?
In order for the developers to add new features, they have to alter code they’ve already written, throughout the application. That’s just the nature of good software development: the more central the lines of code, the more likely they are to change over time, in order to support new functionality.
But developers really don’t want to break anything they’ve worked hard to build. (See, they’re really not trying to make your life miserable…quite the opposite!) So as the software becomes more complex, they have to proceed more and more carefully or risk introducing defects. Either they slow down, or they make mistakes resulting in defects, which means more time spent searching for and fixing those defects. And fixing defects involves more changes, possibly resulting in more defects…
Testers run into a similar dilemma: At first, it’s easy to keep up with the new features. But, because the developers need to change things, testers need to test everything, every sprint. Again, this gets to be a greater and greater challenge. We see teams doing some crazy things, like prioritizing certain tests, or running tests less frequently. Those defects from the previous paragraph sneak past the testers and fall into the user’s lap. Or laptop.
So everything slows down, either because people are trying to do their jobs conscientiously, or because the quality of the product is degrading due to statistically unavoidable human fallibility.
This is unsustainable. And, of course, we then hear that “Agile sucks!”
It should be no surprise that if we’re going to ask our teams to do something highly iterative and incremental, the coding and testing techniques we would have used for a gated “waterfall” process are not going to work anymore. It’s not merely that they’re not sufficient; they’re actually counterproductive!
The solution to the Agilist’s Dilemma is to use development practices that are better suited to a highly iterative and incremental approach and to stop doing practices that act as an impediment to the agility we seek.
What technical practices are needed to reduce or avoid technical debt?
Let’s first look at the heart of the problem: We need to be able to enhance the functionality of our software without damaging any of the prior investment in functionality. We need to have software that is soft: it needs to be easy to change. To that end, our software design needs to:
1. Communicate the existing intent.
2. Be easy to extend and maintain.
On an Agile team, we support this by continuously reshaping the design so that it’s (a) appropriate for the current functionality; and (b) flexible enough to receive unexpected, unpredictable enhancements.
We call this practice “refactoring” – and it’s the core design practice for Agile software development.
Refactoring is the reshaping of the code’s structure without changing any of the behavior of the system, so that we can then more easily add the new functionality.
It’s not rework, and it’s not rewriting. A good design is a changeable design, by definition.
It’s an ongoing, never-ending activity that is best done in very tiny increments, like a few seconds of refactoring every 5 minutes.
Sounds crazy, right? It’s actually quite simple, and very powerful. The best software designs I’ve seen got there through simple, continuous, wholehearted refactoring.
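To make that concrete, here’s a minimal sketch of what one of those few-second reshapings might look like. The order-and-sales-tax code below, and every name in it, is invented purely for illustration; the point is only that the structure improves while the behavior stays identical.

```python
# A hypothetical "before": it works, but the intent is buried in one dense line.
def total_before(order):
    return sum(i["price"] * i["qty"] for i in order) * 1.06  # what is 1.06?


# A few seconds of refactoring later: same behavior, clearer names, and the next
# change (a different tax rate, a discount) now has an obvious home.
TAX_MULTIPLIER = 1.06  # assumed 6% sales tax, named so readers don't have to guess


def subtotal(order):
    return sum(item["price"] * item["qty"] for item in order)


def total_after(order):
    return subtotal(order) * TAX_MULTIPLIER


# The safety-net (see the next section) is what lets us make this change confidently:
order = [{"price": 10.0, "qty": 2}, {"price": 3.5, "qty": 1}]
assert total_before(order) == total_after(order)
```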
But refactoring can’t be done in isolation. You can’t simply tell the team: “Okay, now we refactor!”
How can we refactor safely?
A team can’t refactor unless they have a lot of confidence that their changes won’t alter existing behavior. And the only way to know that is to have a comprehensive and very fast automated test-suite. I will often refer to this test-suite as “the safety-net.”
In order to build and maintain this safety-net of fast tests, a team needs to be doing either Test-Driven Development (TDD) or Behavior-Driven Development (BDD). Or both! (But that’s a longer discussion.)
These practices are often called “test-first” practices because we write a single test or scenario, and we work to get that test passing before we move on to writing another test.
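As a rough illustration only, here’s what one tiny test-first cycle might look like using Python’s unittest. The ShoppingCart class and its behavior are invented for this sketch; in real TDD you would write the first test, watch it fail, and only then write just enough production code to make it pass before starting the next test.

```python
import unittest


class ShoppingCart:
    """Just enough production code to make the current tests pass."""

    def __init__(self):
        self._items = []

    def add(self, price, qty=1):
        self._items.append((price, qty))

    def total(self):
        return sum(price * qty for price, qty in self._items)


class ShoppingCartTest(unittest.TestCase):
    # Written first; it fails until ShoppingCart.total() exists and returns 0.
    def test_new_cart_totals_to_zero(self):
        self.assertEqual(0, ShoppingCart().total())

    # Written second, only after the first test was passing.
    def test_total_sums_price_times_quantity(self):
        cart = ShoppingCart()
        cart.add(price=5.0, qty=3)
        self.assertEqual(15.0, cart.total())


if __name__ == "__main__":
    unittest.main()
```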
Folks always ask me why the team can’t write the tests after coding. There was an old study comparing TDD with unit-test-after that suggested test-after was a little faster. The problem, though, was that the test-after teams’ test-coverage was abysmal, and quality suffered proportionally.
Also, TDD is actually faster in the short-term, because it’s the technique by which developers think about the decomposition of the new behaviors they’re adding to the system. We record what we expect in a test, rather than drawing diagrams and then trying to fit behaviors into our mistaken conceptual notions (been there, done that, pre-1996).
And TDD is faster in the long-term because it keeps defect counts so low that most of my teams stopped tracking defects. Just as one example: The U of M OTIS2 program I worked on in 2002 is still undergoing enhancements via TDD (yes, it’s old enough to vote, now). The last time a developer had to work OT in the evening or weekend hours was in 2004. OTIS2 is a life-critical (“you break it, a patient may die”) application. The phrase “TDD saves” isn’t just a silly meme.
Whereas refactoring is the core solution, TDD and BDD are the core practices of a smoothly-running Agile software team. These practices become the means by which any ambiguities in what we’ve been asked to build get refined into discrete, and concrete, scenarios. Every high-performing Agile software team that I’ve encountered spends most of their day doing one or both of these.
Test-first includes testing, sure, but also incremental design through refactoring, and just-in-time analysis. That is, we think about what is needed, and what is not. The team is continuously growing the product increment, together with the safety-net around it, so that further enhancements and refactorings can happen swiftly and confidently.
Sounds expensive, right? Upfront, perhaps; for a month, perhaps. But the savings in cost-of-rework, and the ability to adapt to changing market conditions, have typically recouped the “additional” expense of time spent writing tests, plus any training or coaching the teams received from me.
Okay, Refactoring and TDD. Got it. Anything else?
Another barrier to good, changeable design is a lack of collaboration. I can tell you from my decades of experience writing code all by myself that I didn’t learn much about software design. Partly because I thought I knew it all; partly because we were expected to learn these things during our copious “free time.” If we ever saw each other’s code, invariably someone else would disagree with my design, or I would disagree with theirs. And how often do you suppose we had the time to go back and incorporate the new knowledge into the code?
What solved this for us was intense, continuous collaboration. Agile developers need to talk to each other about the code, and they need to design that code together. Two practices that have arisen from this need are “Pair Programming,” and “Mob Programming.”
Pairing is two developers working together to test, write, and design the code. Mob Programming is the whole team sitting together, usually including either the Product Advocate (Scrum’s PO) or a business-savvy BA or QA. They are all seeing the product being developed in real-time on a big screen or two, usually by a pair that changes frequently.
Also sounds untenably expensive, yeah? Yet there are numerous benefits that swamp the costs. For example:
1. The code is reviewed as it’s written. Code reviews no longer constrain the team, and the incorporation of good ideas happens immediately, rather than later or not at all.
2. Like brain cells packed closely together in your cerebral cortex, the immediate cross-connectivity of product, tester, developer, ops, et cetera allows the “mob” to instantly put together an optimal solution. Coding, testing, analysis, research all happen simultaneously towards the completion of the most important piece of the most important feature for the most important customer, today.
3. Questions that arise about some enhancement are answered immediately, rather than waiting for a meeting, or—worse—having the team guess at what you meant, and likely guessing incorrectly.
4. Everyone is engaged. What many “mobs” have discovered is that the shorter the length of time you spend in the same role (driver, navigator, observer), the more likely you are to be engaged in what’s happening up on the screen. During one recent training/coaching session I gave to a mobbing organization, they tried one-minute rotations and thought it worked very well.
5. Cross-team understanding of the design, the technologies and frameworks being employed, and the business domain being expressed. Everyone gets some experience in, and a lot of appreciation for, the various skills required to deliver your specific style of high-quality high-value software.
6. It’s very social, creating a bond amongst the teammates, including product, dev, test, ops, et cetera.
Back to design: If the whole team agrees it’s a maintainable design, then it is. When I used to write code alone, there was only one person who thought it was a great design: Me. Just increasing the number of eyes on a particular bit of fresh code turns that one opinion into many. Odds are that if two people agree it’s a good design, the whole team will agree, particularly if they’ve all been working together in this way. Paradoxically, fewer heated design arguments happen on teams that pair or mob.
Refactoring, TDD/BDD, Pairing/Mobbing…that’s three. What is the fourth of the “Core Four”?
The final core Agile tech practice is Continuous Integration (CI).
Your sprint isn’t delivering a shippable product increment if the work is sitting on a version-control branch that hasn’t been incorporated into a potentially shippable whole. So all the existing features and new User Stories (Scrum Product Backlog Items) need to be integrated together.
We take this to the extreme by integrating each pair’s work multiple times per day. The repo trunk—also known to git users as the “master branch”—is therefore always the “source system of record” as to what has been built. And when a pair or mob starts to work on a new task, they first obtain all the useful changes that others have made.
If we don’t do this, refactoring quickly comes to a standstill. If I refactor something on one branch, and you refactor that same code in very different ways on another branch, we’re going to have a hard time integrating. The smaller the incremental changes, though, the easier they are to integrate.
Mob Programmers working with their own isolated code-base still use a repo to avoid losing any changes and to track versions and change-sets. They may never encounter a merge conflict. Multiple pairs on a team will have the occasional merge conflict, but if they integrate numerous times per day, conflicts are rare and easily resolved. Continuous Integration means we integrate continuously! No surprise, right?
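If it helps to picture the rhythm, here’s a rough sketch of a little helper a pair might script for themselves. The use of git, pytest, and a trunk-based workflow are assumptions made for this illustration, not requirements of CI itself; the point is only the cadence of commit, pull, test, push, repeated several times a day.

```python
"""A sketch of an integrate-often rhythm, assuming git and a pytest safety-net."""
import subprocess


def run(*cmd):
    """Run a command and stop immediately if it fails."""
    subprocess.run(cmd, check=True)


def integrate(message):
    run("git", "commit", "--all", "--message", message)  # commit the small, finished change
    run("git", "pull", "--rebase")                        # fold in everyone else's recent work
    run("pytest", "-q")                                   # the whole safety-net must stay green
    run("git", "push")                                    # share the increment with the team


if __name__ == "__main__":
    integrate("Extract TAX_MULTIPLIER; no behavior change")
```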
Can you summarize the “Core Four” again?
1. At the heart we have refactoring. And—supporting healthy, continuous refactoring—we have the other three…
2. TDD or BDD, to build our safety net.
3. Continuous collaboration and dialog using Pair Programming or Mob Programming.
4. Continuous Integration, which delivers the important enhancements and refactorings we’ve made to the rest of the team.
There’s also an interesting way to look at all four of these as practices that facilitate strong team communication:
1. Refactoring creates an understandable (readable, clear, straightforward) and changeable design.
2. Test-first practices use test scenarios to communicate WHAT the software does, whereas the code itself tells us HOW it does that.
3. Collaborative practices like pairing and mobbing are primarily about instant communication, thus considerably shortening typical team feedback loops.
4. Continuous Integration distributes the most recent changes to the rest of the team much more efficiently and accurately than a status update or a code review.
Back to Technical Debt: Is it okay to take on some tech debt, as long as we do these core four practices?
Technical debt is typically not as valuable as people assume. The notion that we either have to rush to market, or we have to design for the future—but not both—is a false choice.
Once the teams have some experience with these techniques, it takes no additional time to use a test-first technique to think about what the code is solving, and to refactor as you go to keep the design tidy.
To read further about Technical Debt, you can check out my post on the topic: https://agileforall.com/ward-cunninghams-debt-metaphor-isnt-a-metaphor/
You can also check out the video version of this topic on our Agile For All YouTube playlist.