Data Analytics Toolkit: From Excel to Python, R, and Tableau

Video description

11+ Hours of Video Instruction

The perfect way to up your data analytics game: tools and parallel case studies!

Overview

There are many ways to learn data analysis, but one that really helps it sink in is seeing analyses of actual data. Add four analytic tools, each applied to all five data sets, and you have an excellent way to learn data analysis. Data Analytics Toolkit: From Excel to Python, R, and Tableau does exactly that: data from five case studies are each analyzed using Excel, Tableau, R, and Python. This approach provides a unique, thorough, and in-depth way to learn data analytics.

About the Instructors

McKenzie Lamb is professor of mathematical sciences at DePauw University in Indiana. Although his graduate work focused on topics in differential geometry, about 10 years ago McKenzie pivoted toward more applied areas, including data analysis, applied optimization, machine learning, and quantitative reasoning. Since 2019 he has been focusing on data visualization utilizing Tableau. Since moving to DePauw University, he has had the immensely enjoyable opportunity to teach a dedicated course called Data Visualization in Tableau as part of their new School of Business and Leadership. He also recently led a Tableau workshop for master’s degree students at the University of Colorado Boulder.

Eric Gaze directs the Quantitative Reasoning (QR) program in the Baldwin Center for Learning and Teaching (BCLT) at Bowdoin College, and he is also a Senior Lecturer in the Mathematics Department. He is a past President of the National Numeracy Network (NNN 2013–2019) and past chair of the BCLT (2015–2019). He gives talks and leads workshops on QR course development and assessment, in addition to being on review teams of academic support programs. He is the author of a 2023 QR textbook published by Pearson, Thinking Quantitatively: Communicating with Numbers (3rd edition).

Learn How To

  • Use Excel, Tableau, R, and Python to analyze data from five case studies.

Who Should Take This Course

Anyone who

  • Is interested in learning data analytics through the lens of four different tools applied to five case studies

Course Requirements

No specific requirements

Lesson Descriptions

Lesson 1: Bank Data

In this first lesson we look at a basic dataset involving bank information that introduces the concepts of both quantitative and qualitative variables. The dataset is small enough and tidy enough to be easily manipulated in Excel and lends itself nicely to the fundamental concepts of transforming and visualizing your data. Creating identical column charts in Excel using pivot tables and in Tableau illustrates the parallel data analysis structures of these two software packages. Tableau further introduces the idea of “importing” your data, something basic to using R and Python but unfamiliar to Excel users who always “see” their data in front of them.
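The pivot-table idea described above can be sketched in Python with pandas. This is only an illustration: the column names (account_type, balance) and all values are invented here and are not taken from the course's bank dataset.

```python
import pandas as pd

# Hypothetical bank records (NOT the course's data): one qualitative
# variable (account_type) and one quantitative variable (balance).
bank = pd.DataFrame({
    "account_type": ["checking", "savings", "checking", "savings", "checking"],
    "balance": [1200.0, 5400.0, 300.0, 2600.0, 900.0],
})

# Rough equivalent of an Excel pivot table: mean balance per account type.
summary = bank.pivot_table(index="account_type", values="balance", aggfunc="mean")
print(summary)
```

Unlike in Excel, the data must first be loaded into a named object (here bank) before anything can be done with it. If matplotlib is installed, summary.plot.bar() would draw the corresponding column chart.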

Lesson 2: Countries

This simple case study in Lesson 2 highlights some of the key functionality benefits of Tableau. Excel is relatively simple to use, and without a clear rationale for learning a new software package, faculty (like students) will resist the learning curve required for orientation to a new package. This case study uses data easily downloaded from the internet consisting of basic statistics for countries: GDP, life expectancy, infant mortality, and population.

The bank data example started with tidy data (so no cleaning or wrangling was involved) and introduced the basic column chart or bar chart. This Countries example highlights the cleaning process of data analysis. This is typically the most arduous part of data analysis and a major weakness in Excel, which lacks a straightforward way to merge data sets and remove observations with missing values. Typically, we simply sort the data and delete rows by hand in Excel. In the other packages, there are powerful data cleaning functions and operations. We start with Tableau and create four more of the fundamental data visualizations: a scatterplot, a histogram, a boxplot, and a choropleth, or heat map.
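As a minimal sketch of the two cleaning steps mentioned above (merging data sets and removing observations with missing values), here is how they might look in Python with pandas. The country labels, column names, and numbers are placeholders invented for illustration, not the course's data.

```python
import pandas as pd

# Hypothetical values for illustration only; not the course's data.
gdp = pd.DataFrame({
    "country": ["A", "B", "C"],
    "gdp_per_capita": [1000.0, None, 3000.0],
})
life = pd.DataFrame({
    "country": ["A", "B", "D"],
    "life_expectancy": [70.0, 75.0, 80.0],
})

# Merge on the shared key; how="outer" keeps countries found in either table,
# filling the columns they lack with missing values.
merged = gdp.merge(life, on="country", how="outer")

# Drop every row that has a missing value in any column.
clean = merged.dropna()
print(clean)
```

In this toy example only country A has both values, so it is the only row left after cleaning. Doing the same by hand in Excel would mean matching rows across two sheets and deleting the incomplete ones.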

Lesson 3: Wisconsin Elections

In Lesson 3 we explore how voting patterns changed in Wisconsin between the 2012 and 2016 presidential elections. In 2012, Democrat Barack Obama defeated Republican Mitt Romney by a substantial margin, both nationally and in Wisconsin. On the other hand, Republican Donald Trump won both Wisconsin and the electoral college vote over Democrat Hillary Clinton (although not the overall popular vote) in 2016. Based on the final vote counts, Wisconsin became redder (more Republican) between 2012 and 2016. We create a variety of graphics to explore how this happened.

Lesson 4: COVID-19

Lesson 4 illustrates a phenomenon called Simpson’s Paradox, which can occur when applying an aggregate calculation to subsets of a data set. Counterintuitively (though not actually paradoxically), it is possible for the aggregate calculation, say an average, to yield one kind of result on the majority or even all of the subsets while producing a contradictory result on the data set as a whole. The Wikipedia page on Simpson’s Paradox describes an especially compelling example. The baseball player David Justice had a better batting average than Derek Jeter in both 1995 and 1996, but combining all of the data for those two years into a single data set yields the opposite result: Derek Jeter’s batting average was better for the two-year period as a whole.

In this example, we have data about a group of people who all got COVID-19. Overall, a greater percentage of vaccinated people died than of unvaccinated people. However, if we divide the people in the data set into two age groups, we find that the death rate for unvaccinated people was greater in each group on its own.
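The reversal is easier to believe once you see it in numbers. The following Python sketch uses made-up counts, chosen only so that the effect appears; they are not the course's COVID-19 data, and the age cut-off of 50 is likewise an assumption.

```python
# Hypothetical counts chosen to exhibit the effect; NOT the course's data.
# (age_group, status) -> (cases, deaths)
counts = {
    ("under 50", "vaccinated"): (1000, 1),
    ("under 50", "unvaccinated"): (9000, 18),
    ("50 and over", "vaccinated"): (9000, 180),
    ("50 and over", "unvaccinated"): (1000, 40),
}


def death_rate(status, age_group=None):
    """Deaths divided by cases for one status, optionally within one age group."""
    rows = [
        value
        for (group, s), value in counts.items()
        if s == status and (age_group is None or group == age_group)
    ]
    cases = sum(c for c, _ in rows)
    deaths = sum(d for _, d in rows)
    return deaths / cases


# Within each age group, the unvaccinated death rate is higher...
for group in ("under 50", "50 and over"):
    print(group, death_rate("vaccinated", group), death_rate("unvaccinated", group))

# ...but in the combined data, the vaccinated death rate is higher.
print("overall", death_rate("vaccinated"), death_rate("unvaccinated"))
```

The reversal happens because, in these invented numbers, most vaccinated people fall in the older group, where the death rate is much higher regardless of vaccination status.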

Lesson 5: Nightingale’s Rose

In Lesson 5 we reproduce a famous visualization created by the nurse and statistician Florence Nightingale. The graphic shows deaths at a military hospital in Crimea, broken down by month and by cause of death. The visualization is further divided into two timeframes: before improved sanitation measures were implemented and after. The point of the graphic was to illustrate that preventable diseases did the following:

  • Caused the vast majority of deaths
  • Could be prevented using basic sanitation procedures

Nightingale used a highly creative and compelling graphical form called a polar bar chart to display this information. In the lesson, you see an image of her original chart, and then we recreate it using our visualization tools.
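For a sense of what a polar bar chart involves, here is a rough Python sketch using matplotlib. It is not the course's implementation. The monthly counts are invented placeholders, and the square-root scaling is an assumption reflecting the usual reading of Nightingale's chart, in which the area of each wedge (not its radius) represents the count.

```python
import math

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt

# Hypothetical monthly death counts (NOT Nightingale's actual figures).
months = ["Apr", "May", "Jun", "Jul", "Aug", "Sep",
          "Oct", "Nov", "Dec", "Jan", "Feb", "Mar"]
deaths = [10, 15, 12, 40, 80, 75, 60, 90, 160, 250, 200, 120]

n = len(months)
width = 2 * math.pi / n                  # every wedge spans the same angle
angles = [i * width for i in range(n)]   # starting angle of each wedge

# Assumption: wedge AREA encodes the count, so radius scales with sqrt(count).
radii = [math.sqrt(d) for d in deaths]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.bar(angles, radii, width=width, align="edge", edgecolor="black")
ax.set_xticks([a + width / 2 for a in angles])  # label each wedge at its center
ax.set_xticklabels(months)
ax.set_yticklabels([])                   # radial tick labels add little here
```

Calling fig.savefig("rose_sketch.png") (the file name is arbitrary) or plt.show() would then render the chart. Reproducing the full original would also require separate wedges per cause of death and two panels, one for each timeframe.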

About Pearson Video Training

Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development.

Product information

  • Title: Data Analytics Toolkit: From Excel to Python, R, and Tableau
  • Author(s): Eric Gaze / McKenzie Lamb
  • Release date: October 2024
  • Publisher(s): Pearson
  • ISBN: 0135397731