CHAPTER 5 Benford's Law: Completing the Cycle
The high-level tests, such as the data profile, the histogram, the periodic graph, the descriptive statistics, the first-two digits test, and the Benford-related conformity statistics, give us valuable insights into the internal diagnostics of our data. These tests deepen our understanding of the entity and its data, and they can also point us to large-scale frauds, errors, and anomalies when these issues are significant enough to affect the results. The two tests described in this chapter are more focused than our suite of high-level tests. The number duplication test tells us which numbers occurred most frequently in our data. The last-two digits test analyzes the last-two digits of our data and is often effective at detecting invented numbers. Both tests can be run in Excel, Access, or IDEA. The use of R to run the number duplication test is also demonstrated.
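The idea behind the last-two digits test can be sketched in a few lines of Python. This is only an illustration, not the Excel, Access, or IDEA procedure used in this chapter. It assumes dollar amounts recorded to two decimal places, so that the last-two digits are the cents, and it assumes the usual benchmark that each of the 100 combinations from 00 to 99 should occur about 1 percent of the time. The function names and the sample amounts are hypothetical.

```python
from collections import Counter
from decimal import Decimal


def last_two_digits(amount):
    """Return the last-two digits (0-99) of an amount, assuming two decimals."""
    # Convert through str() so that 24.99 becomes exactly 2499 cents.
    cents = int((Decimal(str(amount)) * 100).to_integral_value())
    return abs(cents) % 100


def last_two_digits_proportions(amounts):
    """Return the observed proportion for each last-two digits value 0..99."""
    if not amounts:
        raise ValueError("amounts must not be empty")
    counts = Counter(last_two_digits(a) for a in amounts)
    n = len(amounts)
    return {d: counts.get(d, 0) / n for d in range(100)}


# Made-up amounts for illustration only.
sample = [10.00, 20.00, 30.50, 41.37]
proportions = last_two_digits_proportions(sample)
EXPECTED = 0.01  # assumed uniform benchmark for each of the 100 combinations
excess = {d: p - EXPECTED for d, p in proportions.items() if p > EXPECTED}
```

With real data, the combinations with the largest excess over the 1 percent benchmark would be the ones to review first.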
THE NUMBER DUPLICATION TEST
The number duplication test tells us which specific amounts caused the spikes on the first-two digits graph. Such spikes are usually linked to specific amounts occurring abnormally often. For example, the first-two digits graph of the District of Columbia purchasing card data in Figure 4.6 had 12 visible spikes, and the three largest Z-statistics were for the first-two digits 50, 24, and 25, in that order. The results are presented again, this time in IDEA. The dialog box is activated by ...
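As a rough illustration of what the number duplication test computes, the following Python sketch counts how often each amount occurs and lists the most frequent amounts alongside their first-two digits. It is not the IDEA or R procedure from this chapter. The function names, the minimum amount of 10 (chosen so that every amount has two leading digits), and the sample amounts are all assumptions made for the example.

```python
from collections import Counter
from decimal import Decimal


def first_two_digits(amount):
    """Return the first-two digits (10-99) of an amount of at least 10."""
    # Format without an exponent, drop the decimal point, and keep two digits.
    digits = format(Decimal(str(amount)), "f").replace(".", "").lstrip("0")
    return int(digits[:2])


def number_duplication(amounts, minimum=10.0, top=10):
    """Return (first-two digits, amount, count) rows, most frequent first."""
    counts = Counter(round(a, 2) for a in amounts if a >= minimum)
    return [(first_two_digits(a), a, n) for a, n in counts.most_common(top)]


# Made-up amounts for illustration only; these are not the purchasing card data.
sample = [50.00, 50.00, 50.00, 24.99, 24.99, 25.00, 5.00, 112.35]
report = number_duplication(sample, top=2)
for digits, amount, count in report:
    print(f"{digits}  {amount:10.2f}  {count}")
```

Sorting by count in descending order puts the amounts behind the largest spikes at the top of the report, which is where the review would start.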