Data Cleaning and Manipulation

Before we dive into data analysis, the data needs to be properly prepared and structured. Some datasets, such as structured computer logs, are ready to use from the start, but in most cases the bulk of the time goes into preparing the data. This process inevitably requires decisions that depend on the specifics of the task.

In this chapter, we will learn how to prepare the data with pandas, using the dataset we collected from Wikipedia in Chapter 7, Scraping Data from the Web with Beautiful Soup 4, as an example.

We will cover the following topics in this chapter:

  • Quick start with pandas
  • Working with real data
  • Regular expressions
  • Using custom functions with pandas dataframes
  • Writing the ...
