Video Transcript
In this video, we’re gonna look at a neat way that forensic accountants can detect fraud. It’s been used to uncover expenses scams, fake research, and falsified accounts. And people have got criminal convictions as a result. Despite this, lots of people still don’t know about Benford’s law.
Before we talk about the main topic though, we’re gonna take a bit of a detour back to a time when people didn’t have calculators or computers to carry out complicated calculations for them. But they did have a few tricks up their sleeves to make life easier.
For example, to multiply two large numbers together, rather than carrying out a huge long multiplication calculation, they would look up the logarithms of their numbers in a book of log tables, add them together, then convert the answer back to a regular number using antilog tables. And they could also carry out division calculations by subtracting the logarithms. It saved a lot of time over long multiplication.
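That log-table trick is easy to sketch in Python. The numbers a and b below are just made-up examples; the point is that adding logarithms and taking the antilog recovers the product, and subtracting them recovers the quotient.

```python
import math

# Multiplying the log-table way: look up the logs, add them,
# then take the antilog of the sum to convert back.
a, b = 4567, 8910
log_sum = math.log10(a) + math.log10(b)   # "add the logarithms"
product = 10 ** log_sum                   # antilog converts back
# product agrees with a * b up to floating-point rounding

# Division works the same way, but subtracting the logarithms
quotient = 10 ** (math.log10(a) - math.log10(b))
```

Of course, the whole point historically was that the lookups and the addition were far quicker by hand than long multiplication.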
The story of logarithms and how they work is actually pretty fascinating. 400 years ago, a man called John Napier spent 20 years creating a huge table which listed the log values of whole numbers up to 10 million, essentially using a base of one minus 10 to the power of negative seven, that is, 0.9999999, to make his calculations easier. Then his friend, Henry Briggs, converted it all into logs of base 10, which made it easier for everyone else to use the tables for their calculations.
Then, back in 1881, an American astronomer called Simon Newcomb was using log tables to do loads of calculations. And he noticed that the early pages, containing logarithms of values beginning with one and two, were getting much more worn out through use than the other pages. He did a bit of research and published a paper about the probabilities of the first digits taking certain values. Later in his career, he suggested that astronomers were almost at the stage where they’d found out everything they could about the night skies. And he was very wrong about that. But he was right about the uneven distribution of first digits.
Nearly 60 years later, a physicist called Frank Benford independently noticed the nonuniform distribution of first digits in a whole range of datasets: the populations of towns, the values of physical constants, statistical numbers used in news articles, the surface areas of rivers, and more. And by now, the world was ready to accept the phenomenon. It became known as Benford’s law, although he called it the law of anomalous numbers. And he wasn’t actually the first person to discover it.
Now this often happens, naming laws after someone other than the first person to discover them. And when I did some research for this video, I found out that this phenomenon is known as Stigler’s law of eponymy. And ironically, when Stephen Stigler proposed his law of eponymy, he noted that it was something first reported by another person called Robert Merton.
You’ll be pleased to hear that many people now call Benford’s law “Newcomb-Benford’s law” in an attempt to give the original discoverer due credit. But it boils down to the frequency distribution of first, or most significant, digits of certain kinds of data looking roughly like this.
This means that one is the first digit around 30 percent of the time, while nine is the first digit only about four or five percent of the time. And that may initially seem surprising. We might expect that all digits are equally likely to occur. So numbers beginning with one, two, three, four, five, six, seven, eight, or nine would all be equally prevalent with a probability of a ninth, around 11 percent.
Now we talked about logarithms to explain how Newcomb first discovered this effect. But the formula for working out the expected probability of each possible first digit actually involves logarithms as well. The probability that the first digit is 𝑥 is equal to log base 10 of one plus one over 𝑥. So the distribution we get with this is quite different to the uniform distribution we’d get if the first digit was equally likely to be one, two, three, four, five, six, seven, eight, or nine.
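We can tabulate those expected probabilities straight from the formula:

```python
import math

# Benford's law: P(first digit = d) = log10(1 + 1/d)
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
for d, p in benford.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities sum to one, with the first digit one taking about 30.1 percent and nine only about 4.6 percent.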
But if we think about it for a moment, we can see that this law would only work with certain kinds of distributions of numbers. For example, if you were to take adult heights in metres, we would expect many more than 30 percent of them to begin with a one. And if you took those same heights in feet, then almost none of them would begin with a one.
If we restricted ourselves to just looking at the counting numbers between one and 90, then about 12 percent begin with a one, 12 percent with a two, and so on. But only two of them, about two percent, begin with a nine. The distribution of first digits is uniform apart from nines. And Benford’s law doesn’t apply. For different ranges of numbers, it seems obvious that fewer numbers will begin with the higher digits.
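That count is quick to verify by tallying the first digits of the numbers one to 90:

```python
from collections import Counter

# First digits of the counting numbers 1 to 90
counts = Counter(int(str(n)[0]) for n in range(1, 91))
# Digits 1 to 8 each lead 11 of the 90 numbers (about 12 percent);
# only 9 itself and 90 begin with a 9
```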
The law works best on numbers that span a range of orders of magnitude, that is, powers of 10, so single digit, tens, hundreds, thousands, and so on, and don’t have any artificially applied constraints. For example, telephone numbers are all the same length and begin with specific area codes. And bank card numbers all have an industry identifier as their first digit, as part of the bank identification number. So we wouldn’t expect the law to apply when first digits are constrained by such human-defined rules.
But when we look at populations of countries, areas of states, house prices, or even a range of statistical numbers chosen at random from articles in a newspaper, they’re likely to be drawn from across a range of orders of magnitude and so will probably follow Benford’s law.
Using the law can help you check the validity of data you’ve collected. For example, if you analyse all the expenses claims for a large organisation and the claims vary in size from very small amounts to many thousands of dollars, then you’d expect something like a Benford’s law frequency of first digits. If you find something very different, maybe someone has been putting in false claims using a more uniform distribution of first digit numbers. It could be that there’s another reason, like certain rules about what people can claim expenses for or limits on the amount of expenses or even an error in the accounting software. But it’s a very useful flag of the unexpected.
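Here's a rough sketch of how such a check might look. The expense amounts below are entirely made up, and a real audit would use a proper statistical test such as chi-squared rather than this simple deviation score; this just illustrates the idea of comparing observed first-digit frequencies against Benford's expected ones.

```python
import math
from collections import Counter

def first_digit(x):
    """First significant digit of a positive number."""
    while x < 1:
        x *= 10
    while x >= 10:
        x /= 10
    return int(x)

def benford_deviation(amounts):
    """Total absolute gap between observed first-digit frequencies
    and Benford's expected ones (0 would be a perfect fit)."""
    counts = Counter(first_digit(a) for a in amounts)
    n = len(amounts)
    return sum(abs(counts.get(d, 0) / n - math.log10(1 + 1 / d))
               for d in range(1, 10))

# Hypothetical fraud: someone submits claims with uniformly
# distributed first digits, which scores a large deviation
suspicious = [d * 100 for d in range(1, 10)] * 50
print(benford_deviation(suspicious))
```

A high score is just a flag, not proof: as noted above, claim rules, caps, or software errors can also skew the distribution.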
Some mathematical sequences of numbers also follow Benford’s law. If you write down all the integer powers of two up to any large number, for example, a hundred or a million or a billion or whatever, you’ll see that the first digits have a frequency distribution similar to that suggested by Benford’s law. The same happens with Fibonacci numbers and factorials. So this leads us to thinking about why Benford’s law works.
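You can verify this for the powers of two with a few lines of code:

```python
import math
from collections import Counter

# First digits of 2^1 up to 2^1000, compared with Benford's prediction
counts = Counter(int(str(2 ** k)[0]) for k in range(1, 1001))
for d in range(1, 10):
    observed = counts[d] / 1000
    expected = math.log10(1 + 1 / d)
    print(d, round(observed, 3), round(expected, 3))
```

The observed frequencies land very close to the predicted ones, with roughly 30 percent of the powers beginning with a one.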
With the mathematically generated sequences, it can be quite easy to see what’s going on, especially with exponential sequences like the powers of two. With the integer powers of two, each term in the sequence is double the previous one. And if we plot the values on a logarithmic scale, where equal spaces represent exponentially increasing amounts as you make your way along the axis, we can see why it’s more likely that the first digit will be one.
On our logarithmic scale, we can see that the space between one and two is much larger than the space between two and three, which is larger than the space between three and four, and so on.
So if we plot our powers of two on the scale, two, four, eight, 16, and so on, notice how the steps between subsequent numbers, two to four, four to eight, eight to 16, are equally spaced. And that’s because of this logarithmic scale. So we’re taking equal-sized steps through this scale. And a larger proportion of the regions relate to situations where the first digit is equal to one. And smaller and smaller areas represent numbers beginning with two, three, four, and so on. As we count them up, more of the powers of two will begin with one. More of them will fall into these regions.
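The same picture can be expressed numerically: on the log scale, 2 to the power of k sits at position k times log base 10 of two, wrapped around modulo one, and its first digit is one exactly when that position lands in the wide region between one and two.

```python
import math

# 2^k starts with a 1 exactly when the fractional part of
# k*log10(2) falls in [0, log10(2)) -- the widest region on the scale
log2 = math.log10(2)
hits = sum((k * log2) % 1 < log2 for k in range(1, 1001))
print(hits / 1000)  # close to log10(2), about 0.301
```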
Now let’s think about other naturally occurring statistics, like town populations and why they might follow Benford’s law. We count people to work out population. So let’s start by thinking about some really small towns. Obviously, we need at least one person to constitute a tiny town. And if our largest town had a population of one, then 100 percent of towns would have populations with a first digit of one. And zero percent would have a first digit of two, three, four, and so on, up to nine.
If the largest town population was two, then on a random distribution basis, about 50 percent of populations would have a first digit of one and 50 percent would have a first digit of two. We’ve got two choices. And again, no first digits would be higher than that. As the maximum size of the town increases up to nine, the proportion of towns you’d expect to have a first digit of one decreases down to a ninth, about 11 percent.
Now let’s consider towns with populations of up to 10 people. Well, those with one or 10 people have populations with a first digit of one. So now two out of the 10 options have a first digit of one. Then as we include towns with 11, 12, 13 people, and so on, up to 19, the percentage of potential town populations beginning with one increases up to 58 percent. If towns have random populations between one and 19, then there are 11 ways to get a first digit of one out of 19 different possibilities.
Then we could think about towns with populations of up to 99 people. And the proportion of towns with population first digits of one will reduce down to 11 out of 99. That’s just over 11 percent.
Then as we work our way through possible populations up to 100, then 101, and so on, up to 199, the proportion of possible populations with a first digit of one gradually increases until you see that there are 111 out of 199 ways of having a first digit of one in the numbers up to 199. That’s about 56 percent.
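These proportions are easy to check by brute force, using a small helper written just for these examples:

```python
# Fraction of the numbers 1 to n_max whose first digit is 1
def prop_leading_one(n_max):
    return sum(str(n)[0] == "1" for n in range(1, n_max + 1)) / n_max

print(prop_leading_one(19))   # 11/19, about 58 percent
print(prop_leading_one(99))   # 11/99, just over 11 percent
print(prop_leading_one(199))  # 111/199, about 56 percent
```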
If we plot these proportions on a line graph, we can see that this pattern continues as we increase the possible town size. Every time we introduce the next order of magnitude of possible populations, the proportion of possibilities for a first digit of one rapidly increases up to about 50 percent and then decreases slowly back down to 11 percent as we include more of the possible populations up to the next order of magnitude.
If each of these theoretical maximum populations is equally likely, the expected proportion of town populations beginning with one is some kind of average between about 11 percent and just over 50 percent. It turns out to be around 30 percent.
So if the data we’re looking at is uniformly randomly distributed over the range one to 9999, then we’d expect about 11 percent of the numbers to begin with a one. But if the data is uniformly randomly distributed over the range one to 19999, then you’d expect about 56 percent of numbers to begin with one. Since neither situation is more likely than the other across a variety of different datasets, it’s not so surprising that we see an average of around 30 percent of numbers in our newspapers and accounts and general statistics with a first digit of one.
Benford’s law then isn’t actually a mysterious law of anomalous numbers saying that ones appear much more often than you’d think. It’s just a simple observation that, depending on where you start and stop counting, more or fewer of the numbers will begin with one.
The situations in which Benford’s law breaks down are those where we approach either extreme. If the maximum possible value for our data sits right on the boundary of an order of magnitude, then you shouldn’t be at all surprised to see only 11 percent of your figures beginning with one. But if the maximum possible value is nearly double that, just short of twice a power of 10, then over 50 percent of your numbers could begin with one.
So Benford’s law is an observation that more numbers representing statistical observations tend to begin with lower digits than higher digits. And this can help us to spot when someone is trying to cook the books or falsify their data. When used wisely, it can help us validate and verify our data and prove to be a really useful scientific tool.