I have been a habitual spreadsheet user for 30 years, and a Keynesian for, like, forever. That particular pairing of predilections is probably why I experienced a brief and thrilling shiver of Schadenfreude as I read one of the many tellings in the press of the replication of a study frequently cited by deficit hawks, “Growth in a Time of Debt.” The results of the replication have called into question the accuracy of the original findings, and much of the brouhaha centered around the unearthing of a bug in an Excel spreadsheet. As reported in the LA Times:
“A new study by three researchers at the University of Massachusetts finds that Rogoff and Reinhart made several mistakes that invalidate their thesis. … Most important, they made a spreadsheet error that resulted in their leaving five countries out of an all-important average of countries with higher than 90% debt-to-GDP ratios. By restoring the full average, the UMass authors say, the growth rate for countries in that range becomes 2.2%, not the -0.1% cited by Rogoff and Reinhart.” [emphasis and link added]
But why, you may ask, is this an appropriate topic for this particular blog, which putatively concerns itself only with financial reporting by public companies? I readily admit that the connection to my bread-and-butter topics is not strong, but I do think there are some interesting lessons to be learned, both for producers of accounting information and for regulators. For starters, the Reinhart/Rogoff brouhaha gives us a highly provocative case example for asking why financial professionals of all stripes choose to stake their careers and reputations on the accuracy of overly complex, ill-structured, poorly documented and untested electronic spreadsheet programs that, if anything, are more error-friendly than the original versions from way back in 1979. Moreover, spreadsheets can be a scourge of corporate internal controls, both financial reporting-wise and management-wise. Who knows what bugs lurk in these things, and who knows whether they reflect the most timely inputs?
For example, my good friend Ken Baker (Dartmouth) and his co-authors found, in their examination of 50 spreadsheets, that 0.87% of cells containing a formula were calculated incorrectly, and that fully 10% of the spreadsheets had a bottom-line error rate of 5% or more.
And part of my motivation for writing this post is based on personal experience. I have been involved in two lawsuits (as an expert witness) in which errors in spreadsheets were discovered long after the horses left the barn, so to speak. I can’t go into more detail, but trust me, neither error made for a pretty picture.
A Behavioral Explanation of Misplaced Confidence in Spreadsheets
Nearly all electronic spreadsheet errors exist for the same reason that carpenters smash their thumbs with their hammers: human mistakes. But it strikes me (I couldn’t resist the pun) that there is an important difference that could explain why we love spreadsheets more than is good for us. With a hammer, the time between making an error (striking one’s thumb) and its discovery is a fraction of a second. With a spreadsheet, the length of time between error and feedback can vary greatly. Nobody hits their thumb twice before feeling the pain of the first blow, but the R&R case vividly illustrates that automated production systems have the potential to produce bad outputs both rapidly and repeatedly. (Note: whether R&R’s faulty paper is responsible for bad policy decisions is not a question I am competent to address, but I do think that it is safe to presume, à la Robert Samuelson, that it at minimum provided “intellectual cover” for decision makers who acted consistently with their prior beliefs and values.)
Moving away from hammers, and focusing purely on spreadsheets, a general proposition I would like to offer for consideration is this: the lure of high-tech tools is that we can easily envision how they can enhance productivity, while at the same time requiring less effort (both thinking effort and elbow grease); and even making work itself a more palatable proposition. In choosing to invest in a high-tech tool, these factors are most salient, because they can be easily recalled or envisioned; hence, we perforce underweight the likelihood that a labor-saving device will produce costly incorrect answers (or faulty output in general) at a much higher rate than more labor-intensive production schemes.
It seems that two streams of behavioral research support this notion. First, there is Tversky and Kahneman’s “availability” heuristic, which is defined as the tendency to overweight probabilities based on the events which most easily come to mind. For example, newspapers report lightning strikes that injure people, but they don’t report the lightning strikes that do no harm; consequently, humans have a documented tendency to overestimate the probability of being struck by lightning.
Second, humans have a natural tendency to favor, or even actively seek, information that confirms their prior beliefs. This is known as the “confirmation bias.” I can’t pinpoint its origin, but I first remember reading about it when I was a doctoral student, in a classic (1978) paper by Hillel Einhorn and Robin Hogarth.
Getting back to R&R’s spreadsheet, here is how these biases may have operated:
- Neither author had (much) experience in working with a completed spreadsheet for which a consequential error was discovered after the fact. Therefore, they underweighted the probability that their spreadsheet might contain such an error, and acted according to their biased subjective assessments (see Baker et al, above) by not scrubbing their work sufficiently.
- Consider what would have happened if R&R had performed their calculations by hand (not saying they should have, mind you). Then, surely, with many more examples of errors to recall, they would have responded to the recollections of their own personal experience with math errors by assigning a graduate assistant the task of re-calculating everything before submitting their work to a journal for publication. My guess is that it never occurred to R&R to hand a grad assistant their data and their mathematical models, and to task her with building another spreadsheet from scratch, so as to validate the results.
- It also appears that the bottom line result of the faulty spreadsheet provided confirming evidence of R&R’s prior beliefs, which would have further reduced their propensity to seek disconfirming evidence through some additional form of validity check on their spreadsheet.
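The independent re-calculation described in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the country labels and growth rates are made up (they are not R&R's data), and the function names are hypothetical. One function mimics a spreadsheet average whose range stops short of the last rows; the other is a second calculation, written separately, over every row. If the two disagree, someone should go looking for the reason.

```python
import math

# Hypothetical growth rates (percent) for made-up countries; NOT R&R's data.
growth = {"A": 2.5, "B": -1.0, "C": 3.0, "D": 1.5, "E": 4.0}


def average_first_pass(rates, last_row):
    """Mimic a spreadsheet AVERAGE whose range ends at row `last_row`."""
    values = list(rates.values())[:last_row]
    return sum(values) / len(values)


def average_independent(rates):
    """A second calculation, written separately, that uses every row."""
    return math.fsum(rates.values()) / len(rates)


def results_agree(a, b, tolerance=1e-9):
    """True when the two results match within an absolute tolerance."""
    return math.isclose(a, b, rel_tol=0.0, abs_tol=tolerance)


first = average_first_pass(growth, last_row=3)  # range silently omits D and E
second = average_independent(growth)

if not results_agree(first, second):
    print(f"Mismatch: first pass {first:.2f} vs. independent {second:.2f}")
```

The point of the design is that the second calculation shares no formulas or cell ranges with the first, so a range that stops short in one cannot hide in the other.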
I suppose that one moral of the story is that even economists, the champions at assuming that everyone is as rational as Mr. Spock, are humans, just like the rest of us. Accountants, a more practical form of humanity than economists, were among the pioneers of strategies for dealing with fallibility, beginning with double-entry accounting at least 500 years ago. One attractive feature of an accounting spreadsheet is that self-checks can be easily built in: cash flows must reconcile to accruals, balance sheets have to balance, and “flows” (like income) must “articulate” with measures of stocks in balance sheets.
So, another moral of the story is to beware of accounting spreadsheets that don’t make full use of the checks that are inherent in basic accounting logic. In fact, beware of any spreadsheet that doesn’t have self-checks built in. I’m betting that R&R’s spreadsheet had no self-checks; and that self-checks could indeed have been formulated.
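To make the idea of built-in self-checks concrete, here is a minimal sketch in Python of two of the checks that basic accounting logic provides: the balance sheet must balance, and income (a flow) must articulate with retained earnings (a stock). The class, field names, figures, and tolerance are all hypothetical, chosen only for illustration; a real model would have many more accounts and checks.

```python
from dataclasses import dataclass


@dataclass
class Statements:
    """Hypothetical summary figures from a set of financial statements."""
    assets: float
    liabilities: float
    equity: float
    opening_retained_earnings: float
    net_income: float
    dividends: float
    closing_retained_earnings: float


def self_checks(s, tolerance=0.005):
    """Return a dict mapping each check's name to True (pass) or False (fail)."""
    balance_gap = s.assets - (s.liabilities + s.equity)
    articulation_gap = (
        s.opening_retained_earnings + s.net_income - s.dividends
        - s.closing_retained_earnings
    )
    return {
        "balance_sheet_balances": abs(balance_gap) <= tolerance,
        "income_articulates": abs(articulation_gap) <= tolerance,
    }


# Made-up figures: the first set is internally consistent, the second is not.
good = Statements(1000.0, 600.0, 400.0, 250.0, 80.0, 30.0, 300.0)
bad = Statements(1010.0, 600.0, 400.0, 250.0, 80.0, 30.0, 300.0)

for label, statements in (("good", good), ("bad", bad)):
    failed = [name for name, ok in self_checks(statements).items() if not ok]
    print(label, "failed checks:", failed)
```

In a spreadsheet, the equivalent would be a few cells that compute these gaps and flag any that are not zero, placed where nobody can miss them.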
On the bright side, now that the R&R spreadsheet booboo has been so widely publicized, the pendulum of the availability heuristic will swing somewhat in the other direction. Those who know and appreciate the cautionary tale of R&R’s faulty spreadsheet will be more careful, at least for a while. Perhaps I’m being too optimistic, but independent review of spreadsheets (and other computational tools) could even become standard procedure prior to the publication of an article, or even become an integral part of every company’s system of internal controls over financial reporting.