Chapter 1. Analytical Thinking and the AI-Driven Enterprise

It is April 2020, and the world is in the middle of a very serious global pandemic caused by the novel coronavirus SARS-CoV-2 and the disease it causes (COVID-19), with confirmed cases in the millions and deaths in the hundreds of thousands. Had you searched online for “AI coronavirus,” you could have found some very prestigious media and academic outlets highlighting the role that artificial intelligence (AI) can play in the battle against the pandemic (Figure 1-1).

What makes many uncomfortable with headlines like these is that they dress AI in a superhero suit that has become rather common, overstretching the limits of what can be achieved with AI today.

What Is AI?

If I had to divide the people of the world according to their understanding of the term “AI,” I’d say there are four types of people.

On one end of the spectrum are those who’ve never heard the term. Since AI has become part of the popular folklore and is now a common theme in movies, TV shows, books, magazines, talk shows, and the like, I’d guess that this group is rather small.

Most people belong to a second group that believes that AI is closer to what practitioners call Artificial General Intelligence (AGI) or human-like intelligence. In their view, AI consists of humanoid machines that are able to complete the same tasks and make decisions as humans do. For them, AI is no longer in the realm of science fiction, as almost every day they come across some type of media coverage on how AI is changing our lives.

A third group, the practitioners, actually dislike the term and prefer to use the less sexy machine learning (ML) label to describe what they do. ML is mainly concerned with making accurate predictions with the use of powerful algorithms and vast amounts of data. There are many such algorithms, but the darling of ML techniques is known as deep learning—short for learning through deep neural networks—and is pretty much responsible for all the media attention the field gets nowadays.

To be sure, deep learning is also about using predictive algorithms that have proven quite powerful in tackling problems that a few years ago were only accessible to humans, specifically in the domains of image recognition and natural language processing (think Facebook automatically labelling your friends in a photo or virtual assistants like Alexa smoothing out your purchase experience on Amazon and controlling your lights and other devices connected to the internet at home).

I don’t want to distract your attention with technical details, so if you want to learn more about these topics please consult the Appendix. The only thing I want to highlight here is that practitioners think “ML” when they hear or read “AI,” and in their minds, this really just means prediction algorithms.

The fourth and final group is what I’ll call “the experts,” those very few individuals who are doing research, and are thus advancing the field of AI. These days most funds are directed toward pushing the boundaries in the field of deep learning, but in some cases they are doing significant research on other topics that aim at achieving AGI.

So what is AI? In this book I’ll use AI and ML interchangeably since it has become the standard in the industry, but keep in mind that there are other topics besides prediction that are part of the AI research arena.

Why Current AI Won’t Deliver on Its Promises

The trouble with AI starts with the name itself, as it inevitably makes us think about machines with human-like intelligence. But the difficulty comes not only from a misnomer but also from comments made within the field, as some recognized leaders have reinforced expectations that will be hard to meet in the short term. One such leader claimed in 2016 that “pretty much anything that a normal person can do in <1 sec, we can now automate with AI.” Others may be more cautious, but their firm conviction that deep neural networks are fundamental building blocks for achieving AGI provides the media with juicy headlines.

But I digress: what really matters for the purpose of this book is how this hype has affected the way we run our businesses. It is not uncommon to hear chief executive officers and other high-ranking executives say that they are disrupting their industries with AI. While they may not be fully aware of what the term entails, they are nonetheless backed by vendors and consultants that are very happy to share the riches before the bubble pops.

Hypes are risky because a natural response to unfulfilled expectations is to cut all funds and organizational focus.¹ My aim with this book is to show that while we may be far from creating human-like intelligence, with the current technology, we can create substantial value by transforming our companies into AI-driven enterprises. To do so we must start using AI as an input to improve our business decision-making capabilities.

Before that, let’s understand how we got here, as this will help showcase some of the difficulties with the current approach and the opportunities that are already achievable.

How Did We Get Here?

Figure 1-2 shows the evolution of the top 10 global companies by market capitalization. With the exception of Berkshire Hathaway (Warren Buffett’s conglomerate), Visa, and JPMorgan, all of the remaining companies are in the technology sector and all have embraced the data and AI revolutions.² At face value, this would suggest that if this worked for them, it must work for any other company. But is this the case?

Behind these successes, there are two stories that only converged recently. One has to do with the evolution of AI, and the other with the big data revolution.

The Data Revolution

Not so long ago, the queen of tech headlines was big data, and hardly anyone talked about AI (according to The Economist, in 2017 big data was the new oil). Let’s briefly tell the story of how big data rose to the crown and how AI surprisingly stole the spotlight in recent years.

In 2004, Google published its famous MapReduce paper that enabled companies to distribute computation of large chunks of data (that wouldn’t fit in a single computer) across different machines. Later, Yahoo! made its own open source version of the Google algorithm, marking the beginning of the data revolution.
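To make the idea concrete, here is a minimal single-machine sketch of the MapReduce programming model, using word counting as the example task. It is only an illustration: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text are assumptions made for this example, and nothing here reproduces Google's or Yahoo!'s actual implementation. In a real cluster, each chunk would be mapped on a different machine and the framework would handle the grouping step.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values that share a key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk could be processed on a separate machine; here we loop locally.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two chunks standing in for data split across machines.
chunks = ["big data big promises", "data lakes store big data"]
counts = word_count(chunks)
print(counts)
```

The point of the split into phases is that the map step needs no coordination between chunks, which is what allows the work to be spread across many machines.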

It took a couple of years for technology commentators and consulting firms to start claiming that data would provide companies with endless opportunities for value creation. At the beginning, this revolution was built around one pillar: having more, diverse, and quickly accessible data. As the hype matured, two more pillars were added: predictive algorithms and a data-driven culture.

The three Vs

The first pillar involved the now well-known three Vs: volume, variety, and velocity. The internet transformation had provided companies with ever-increasing volumes of data. One 2018 estimate claimed that 90% of the data created in the history of humankind had been generated in the previous two years, and many such calculations abound. Technology had to adapt if we wanted to analyze this apparently unlimited supply of information. We not only had to store and process larger amounts of data, but we also needed to deal with new unstructured types of data such as text, images, videos, and recordings that were not easily stored or processed with the data infrastructure available at the time.

Structured and Unstructured Data

The second V, variety, emphasizes the importance of analyzing all types of data, not just structured data. If you have never heard of this distinction, think of your favorite spreadsheet program (Excel, Google Sheets, etc.). These programs organize information in tabular arrangements of rows and columns that provide a lot of structure so that we can efficiently process information within a user-friendly interface. This is a simple example of structured data: anything you can store and analyze using rows and columns belongs to this class.

Have you ever copied and pasted an image in Excel? Not only can you paste images, but you can also use a spreadsheet to store entire texts and even videos. But the fact that you can paste them doesn’t mean you can analyze them. And storage isn’t efficient either: you can save a lot of space on disk by using some type of compression or more efficient formats. Unstructured datasets are not efficiently stored or analyzed using tabular formats, and these include all types of multimedia (images, videos, tweets, etc.). Now, these provide a lot of valuable information for companies, so why shouldn’t we use them?

After the innovations were made, consultants and vendors came up with new ways to market these new technologies. Before the age of big data, the Enterprise Data Warehouse was used to store and analyze structured data. The new age needed something equally new, and thus the data lake was born with the promise of providing flexibility and computational power to store and analyze big data.

Thanks to “linear scalability,” if twice the work needed to be done, we would just have to install twice the computing power to meet the same deadlines. Similarly, for a given task, we could cut the current time in half by doubling the amount of infrastructure. Computing power could be easily added by way of commodity hardware, efficiently operated by open source software readily available for us to use. But the data lake also allowed for quick access to the larger variety of data sources.
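The arithmetic behind this claim can be written down in a few lines. The sketch below assumes idealized linear scaling, where completion time equals total work divided by total capacity; the function name `completion_time` and all the numbers are hypothetical and chosen only to illustrate the two statements above. Real clusters rarely scale perfectly, so treat this as the promise rather than a measurement.

```python
def completion_time(work_units, machines, units_per_machine_hour=1.0):
    """Hours needed under idealized linear scaling: work / total capacity."""
    if machines <= 0 or units_per_machine_hour <= 0:
        raise ValueError("machines and throughput must be positive")
    return work_units / (machines * units_per_machine_hour)


# Hypothetical baseline: 100 units of work on 10 machines.
baseline = completion_time(work_units=100, machines=10)

# Twice the work with twice the machines meets the same deadline.
double_work = completion_time(work_units=200, machines=20)

# The same work with twice the machines takes half the time.
double_machines = completion_time(work_units=100, machines=20)

print(baseline, double_work, double_machines)
```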

Once we tackled the volume and variety problems, velocity was the next frontier, and our objective had to be the reduction of time-to-action and time-to-decision. We were now able to store and process large amounts of very diverse data in real time or near-real time if necessary. The three Vs were readily achievable for any company willing to invest in the technology and the know-how. Nonetheless, the riches were not in sight yet, so two new pillars were added—prediction and data-driven culture—along with a recipe for success.

Data maturity models

Since data alone was not creating the value that was promised, we needed some extra guidance; this is where maturity models entered with the promise of helping companies navigate through the turbulent waters created by the data revolution. One such model is depicted in Figure 1-3, which I will explain now.

Descriptive stage

Starting from the left, one thing was apparent from the outset: having more, better, and timely data could provide a more granular view of our businesses’ performance. And our ability to react quickly would certainly allow us to create some value. A health analogy may help to understand why.

Imagine you install sensors in your body, either externally through wearables or by means of other soon-to-be-invented internal devices, that provide you with more, better, and timely data on your health. Since you may now know when your heart rate or your blood pressure increases above some critical level, you can take whatever measures are needed to bring things back to normal. Similarly, you can track your sleeping patterns or sugar levels and adjust your daily habits accordingly. If you react fast enough, this newly available data may even save your life. This kind of descriptive analysis of past data may provide some insights about your health, and the creation of value depends critically on your ability to react quickly enough.

Predictive stage

But more often than not it’s too late when we react. Can we do better? One approach would be to replace reaction with predictive action. As long as predictive power is strong enough, this layer should buy us time to find better actions, and thus, new opportunities to create value.

This new stage allowed us to develop new data products, such as recommendation engines (think Netflix), and it also gave rise to the age of data monetization. The online advertising business was thus born, marking an important inflection point in our story. The dream of marketers came to life with the promise of selling the right product to the right person at the right time, all thanks to data and the predictions created with it.

Importance of Online Advertising

Most of the riches created by big data were the product of the success of online advertising. The online advertising business is huge and highly lucrative. One source estimates that more than $500 billion will be spent during 2023 across the globe. If that figure alone doesn’t say much, consider that it is close to Belgium’s Gross Domestic Product.

The two main players in this business are Google and Facebook. They have built their businesses largely on the revenues from this profitable industry, and thanks to the riches that came with it, they have been able to fund the fast recent development in the AI arena (often through acquisitions).

So it seems fair to say that the success of big data in online advertising has played an important role in facilitating AI’s current boom.

Prescriptive stage

The top rank in this hierarchy of value creation is taken by our ability to automate and design intelligent systems. We are now at the prescriptive layer: once you have enough predictive power you can start finding the best actions for your business objectives. This is the layer where firms move from prediction to optimization, the throne in the data Olympus, and interestingly enough, this is the least explored step in most maturity models.
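A small, purely hypothetical example may help show the difference between predicting and prescribing. Suppose a model has already produced a churn probability for a customer. The prescriptive step is to compare the expected profit of each available action and choose the best one. The action names, costs, customer value, and the assumption that a discount halves the churn probability are all invented for illustration and do not come from any real dataset or method described in this book.

```python
def expected_profit(churn_prob, customer_value, action):
    """Expected profit of one action, given a predicted churn probability."""
    # The action is assumed to shrink the churn probability by a fixed share.
    retained_prob = 1.0 - churn_prob * (1.0 - action["churn_reduction"])
    return retained_prob * customer_value - action["cost"]


def best_action(churn_prob, customer_value, actions):
    """Prescriptive step: pick the action with the highest expected profit."""
    return max(
        actions,
        key=lambda name: expected_profit(churn_prob, customer_value, actions[name]),
    )


# Hypothetical actions and their assumed effects and costs.
ACTIONS = {
    "do_nothing": {"churn_reduction": 0.0, "cost": 0.0},
    "discount": {"churn_reduction": 0.5, "cost": 10.0},
}

# The prediction is an input; the decision is the output.
low_risk_choice = best_action(churn_prob=0.1, customer_value=100.0, actions=ACTIONS)
high_risk_choice = best_action(churn_prob=0.6, customer_value=100.0, actions=ACTIONS)
print(low_risk_choice, high_risk_choice)
```

Under these assumed numbers, the discount only pays off when the predicted churn probability exceeds 0.2, so the same prediction model can lead to different actions for different customers.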

A Tale of Unrealized Expectations

In less than 15 years, we’ve lived through two booms—the big data revolution followed by the current AI stage—so you may wonder why the promises have yet to be fulfilled.

I’m not a big fan of data maturity models, but I believe the answer lies within them: most companies have yet to arrive at the prescriptive stage. Big data was all about the descriptive stage, and as we’ve mentioned, AI is primarily concerned with prediction. Since everything has been laid out for us in the past few years, the question about what’s behind our apparent inability to move forward remains.

I’m convinced that market forces are an important factor, meaning that once a hype begins, market players want to reap its benefits until they are completely exhausted before moving on to the next big thing. Since we’re still in that phase, there are no incentives to move forward yet.

But it is also true that to become prescriptive we need to acquire a new set of analytical skills. As of today, with the current technology, this stage is done by humans, so we need to prepare humans to pose and solve prescriptive problems. This book aims at taking us closer to that objective.

Analytical Skills for the Modern AI-Driven Enterprise

Tom Davenport’s now classic Competing on Analytics (Harvard Business Press) pretty much equates analytical thinking with what later came to be known as data-drivenness: “By analytics we mean the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions.” One alternative definition can be found in Albert Rutherford’s The Analytical Mind (independently published): “Analytical skills are, simply put, problem-solving skills. They are characteristics and abilities that allow you to approach problems in a logical, rational manner in an effort to sort out the best solution.”

In this book I will define analytical reasoning as the ability to translate business problems into prescriptive solutions. This ability entails both being data-driven and being able to solve problems rationally and logically, so the definition is in fact in accordance with the two described previously.

To make things practical, I will equate business problems with business decisions. Other problems that are purely informative and do not entail actions may have intrinsic value for some companies, but I will not treat them here, as my interest is in creating value through analytical decision-making. Since most decisions are made without knowing the actual consequences, AI will be our weapon to embrace this intrinsic uncertainty. Notice that under this approach, prediction technologies are important inputs into our decision-making process but not the end goal. Improvements in the quality of predictions can have first- or second-order effects depending on whether we are already making near-optimal choices.

Key Takeaways

Most companies haven’t been able to create value through data or AI in a sustainable and systematic way: nonetheless, many have already embarked on their own efforts, only to hit a wall of disappointment.
Today’s AI is about prediction: AI is overhyped, not only because of its deceiving name, but also because there is only so much one can achieve through better prediction. These days, AI most commonly refers to deep learning. Deep neural networks are highly nonlinear prediction algorithms that have shown remarkable success in the areas of image recognition and natural language processing.
Before AI, we had the big data revolution: the data revolution preceded the current hype and also came with the promise to generate outstanding business results. It was built around the three Vs—volume, variety, and velocity—and later complemented with prediction algorithms and data-driven culture.
Data and prediction cannot create sustainable value by themselves: maturity models suggest that value is created by making optimal decisions in a data-driven way. For this, we need data and prediction as inputs in our decision-making process.
We need a new set of analytical skills to be successful in this prescriptive stage: current technology precludes us from automating the process of translating business problems into prescriptive solutions. Since humans need to be involved all along the way, we need to upgrade our skill set to capture all the value from data- and AI-driven decision-making.

Analytical Skills for AI and Data Science by Daniel Vaughan

Figure 1-1. AI and the coronavirus

Figure 1-2. Evolution of market capitalization top-10 ranking—(companies that left the ranking before 2018 are not labeled)

Figure 1-3. A possible data maturity model showing a hierarchy of value creation
