CHAPTER 3: Time Series Data Preparation

In this chapter, I will walk you through the most important steps to prepare your time series data for forecasting models. Data preparation is the practice of transforming your raw data so that data scientists can run it through machine learning algorithms to discover insights and, eventually, make predictions.

Each machine learning algorithm expects input data formatted in a specific way, so time series data sets generally require some cleaning and feature engineering before they can generate useful insights. Time series data sets may have missing values or may contain outliers, which makes the data preparation and cleaning phase essential. Because time series data has a temporal property (observations are ordered in time), only some statistical methodologies are appropriate for it. Good time series data preparation produces clean, well-curated data, which leads to more practical and accurate predictions.
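
To make the cleaning step concrete, here is a minimal sketch of handling one missing value and one outlier with pandas. The series, its name (load), the values, and the 3.5 threshold are hypothetical choices made for illustration, and the median-absolute-deviation rule is only one of several reasonable ways to flag outliers.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with one missing reading and one obvious spike.
idx = pd.date_range("2024-01-01", periods=10, freq="D")
values = [10.0, 11.0, np.nan, 13.0, 14.0, 100.0, 16.0, 17.0, 18.0, 19.0]
series = pd.Series(values, index=idx, name="load")

# Fill the gap by interpolating linearly along the time index.
filled = series.interpolate(method="time")

# Flag outliers with a robust z-score based on the median absolute
# deviation (MAD). The 3.5 cutoff is an assumed, commonly used value.
median = filled.median()
mad = (filled - median).abs().median()
robust_z = 0.6745 * (filled - median) / mad
is_outlier = robust_z.abs() > 3.5

# Treat flagged points as missing and interpolate over them as well.
cleaned = filled.mask(is_outlier).interpolate(method="time")

print(cleaned)
```

Interpolating along the time index, rather than filling with a global mean, respects the ordering of the observations, which is the temporal property mentioned above.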

Specifically, in this chapter we will discuss the following:

  • Python for Time Series Data – Python is a very powerful programming language for handling data, offering an assorted suite of libraries for time series data and excellent support for time series analysis. In this section of Chapter 3, you will see how libraries such as SciPy, NumPy, Matplotlib, pandas, statsmodels, and scikit-learn can help you prepare, explore, and analyze your time series data.
  • Time Series Exploration and Understanding ...
