Leanpub: Publish Early, Publish Often

Prelude: Software Engineering’s Telephone Game

The software profession has a problem that is widely recognized but that nobody seems willing to do anything about. You can think of this problem as a variant of the well-known “telephone game”, where some trivial rumor is repeated from one person to the next until it has become distorted beyond recognition and blown up out of all proportion.

Unfortunately, the objects of this telephone game are generally considered cornerstone truths of the discipline, to the point that their acceptance now hinders further progress.

It is not that these claims are outlandish in themselves; they started as somewhat reasonable hypotheses. The problem is that they have become entrenched as “fact” supposedly supported by “research”, and attained this elevated status in spite of being merely anecdotal.

How we got there

One of the ways that anecdote persists is by dressing itself up in the garments of proper scholarship. Suppose you come across the following claim for the first time:

Early results were often criticized, but decades of research have now accumulated in support of the incontrovertible fact that bugs are caused by bug-producing leprechauns who live in Northern Ireland fairy rings. (Broom 1968, Falk 1972, Palton-Spall 1981, Falk & Grimberg 1988, Demetrios 1995, Haviland 2001)

Let’s assume that this explanation immediately appeals to you: it makes sense of so many of the things you’ve seen in software engineering! The proliferation of bugs in the face of huge efforts to eradicate them; their capricious-seeming nature - why, that is very leprechaun-like!

Of course, you, my reader, may be the kind of hard-headed skeptic who absolutely and definitely dismisses the idea that fairies and leprechauns exist at all. If so, please allow that there exists the kind of person who would be persuaded by a leprechaun-based explanation; but who, while an open-minded person, nevertheless thinks that it is important that explanations be adequately backed by evidence.

Surely you agree that this claim would be convincing to someone like that, since it cites so many respected authors, and papers published in peer-reviewed journals.

As it happens, there are many ways this citation style can be misleading, even without outright fabrication or evil intent:

the papers are not really empirical research
the papers support weaker versions of the claim
the papers don’t support the claim directly, but only cite research that does
the more recent papers are not original research, but only cite older ones
the papers are in fact books or book-length, and you’ll be looking for a needle in a haystack
the papers are obscure, hard to find, out of print or paywalled, and thus hard to verify
the papers are selected only on one “side” of an ongoing controversy

Surface plausibility

When we look closely at some of the “ground truths” of software engineering - the “software crisis”, the 10x variability in performance, the cone of uncertainty, even the famous “cost of change curve” - in many cases we find these issues popping up, often in combination (so that, for instance, newer opinion pieces citing very old papers are passed off as “recent research”).

Because the claims have some surface plausibility, and because many people use them to support something they sincerely believe in - for instance the Agile styles of planning or estimation - one often voices criticism of the claims at the risk of being unpopular. People like their leprechauns.

In fact, you’re likely to encounter complete blindness to your skepticism. “Come on,” people will say, “are you really trying to say that leprechauns live in, what, Africa? Antarctica?” The leprechaun-belief is so well entrenched that your opposition is taken as support for some other silly claim - your interlocutors aren’t even able to recognize that you question the very terms upon which the research is grounded.

For instance, when I argued against the “well-known fact” of 10x variations in software developers’ productivity, the objection I often met was “do you really believe that all developers have the same productivity?” Very few people can even imagine not believing in “productivity” as a valid construct.

Leprechaun spotting

Leprechauns come in many forms, which I’ll call tacit, folklore and formal. We need to deal with these various forms differently.

Tacit

Some Leprechaun claims have become so pervasive in software engineering discourse that they don’t even appear as claims any more.

For instance, people who are trying to hire “rockstar” or “ninja” programmers are probably influenced by a tacit belief in the supposedly large variations in programmer productivity, even if they don’t explicitly say that they are looking for a “10x productivity programmer”. There is a hidden inference at work: “there exist programmers who are ten times as productive as the average, therefore it is a profitable investment for me to go to great expense to find one of these”.

Another example might be someone who defends Agile testing techniques, such as Test-Driven Development (TDD), because “they reveal defects early”. There is a hidden inference here too, which relies on the “well-known fact” that software defects are more costly to fix the later they are detected - and therefore TDD lowers costs by catching defects early. Unfortunately, this claim about the cost of fixing defects is at best problematic, as we’ll see later on.

Folklore

In many cases, the claims are only secondary. They are reproduced in an article, a blog post or a PowerPoint presentation, often by someone who hasn’t read - in fact hasn’t even looked at - any of the original references.

Here the inference is explicit: there is a point being made, and the claim is offered in support of the point. It can even be the same point as when the claim is tacit, such as the importance of hiring rockstar programmers or the great value of TDD.

Quite frequently, the Leprechaun claim is only ancillary to the main argument: the author has other reasons for believing in the conclusion they are presenting, and the claim is mostly there as a bit of window-dressing.

Formal

Lastly, there is the case of the primary author: someone who did the bibliographical footwork in the first place, should have known better, and is causing a leprechaun-belief to spread.

Whether we like it or not, software practitioners pay scant attention to academic writing about software development. Rather, most of the insights we take for granted come from authors who have a knack as popularizers. They play more or less the same role as popular science journalists with respect to the general public.

Science journalism is a fine and important thing, but it has a well-known failure mode: sensationalism, where the lure of an attention-grabbing headline causes writers to toss caution to the wind and radically misrepresent a claim.

The examples I’ve examined (the cone of uncertainty, the 10x variability, the cost of change curve, etc.) strongly suggest that we should raise our expectations of rigor in software engineering writing, especially writing that popularizes research results.

What you can do

This book is intended as a handbook of skeptical thinking and reading, with worked-out examples.

What I want you to take away from reading the book is a set of reflexes that you will call on whenever you come across a strong opinion about software development, whatever “camp” or “community” or “school” that opinion comes from.

It will probably be easiest to apply these reflexes against what I’ve called the “folklore” and “formal” versions of Leprechaun claims: when you come across them in an article, blog or book, and the claim is spelled out explicitly.

It isn’t necessarily the best of ideas to always call out such claims, especially if you are overly antagonistic about it; you may end up being seen as a “troll” - someone more interested in winning arguments than in the truth of things. However, these false claims will keep spreading unless somehow kept in check. I can no longer accept that it’s better to keep quiet and not rock the boat.

The best approach is probably to keep track of where the best and most even-handed treatments of these various claims reside, and to respectfully point people to them. I hope that this book serves as one such source - but I’m under no illusion that I can deal with even a substantial fraction of all bogus claims within the space of a single book.

The hardest step

The real challenge will be to apply these reflexes to your own beliefs.

An inspiring example

Graham Lee is the author of “Test-Driven iOS Development” (Addison-Wesley, 2012). Page 4 of his book includes a table which reproduces a claim about the “cost of defects”, which we’ll be examining in detail in a later chapter.

In September 2012, after reading an early draft of Leprechauns, Graham published a retraction in the following terms: “I made a mistake. […] I perpetuated what seems (now, since I analyse it) to be a big myth in software engineering. I uncritically quoted some work without checking its authority, and now find it lacking.”

Graham did not just take my warning about the “cost of defects” claim seriously. He went looking for the actual evidence and made his call on that basis, rather than taking my word for it. That’s the kind of behaviour I’d like to see more of.

I hold out little hope that people can, in general, convince others to let go of specific pet notions. Speaking out against belief X may not do much for those who currently hold belief X strongly enough that they are writing or blogging about it, although there will hopefully be some happy exceptions like Graham.

However, I do believe that if we manage to raise our overall level of “epistemic hygiene”, we can prevent Leprechauns from spreading in the first place. Like its real-world counterpart, epistemic hygiene can be vastly improved by the use of specific techniques that aren’t hard to learn, like washing hands.

That’s what’s coming next. Onwards!
