Chapter 10. Creating ML Models to Predict Sequences

Chapter 9 introduced sequence data and the attributes of a time series, including seasonality, trend, autocorrelation, and noise. You created a synthetic series to use for predictions and explored how to do basic statistical forecasting. Over the next couple of chapters, you’ll learn how to use ML for forecasting. But before you start creating models, you need to understand how to structure the time series data for training predictive models, by creating what we’ll call a windowed dataset.

To understand why you need to do this, consider the time series you created in Chapter 9. You can see a plot of it in Figure 10-1.

Figure 10-1. Synthetic time series

If at any point you want to predict a value at time t, you’ll want to predict it as a function of the values preceding time t. For example, say you want to predict the value of the time series at time step 1,200 as a function of the 30 values preceding it. In this case, the values from time steps 1,170 to 1,199 would determine the value at time step 1,200, as shown in Figure 10-2.

Figure 10-2. Previous values impacting prediction
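To make the idea concrete, here is a minimal sketch of that slicing in NumPy. It is an illustration only: the `make_windows` helper is a hypothetical name, and the noisy sine wave is a stand-in rather than the series you built in Chapter 9. The length of 1,461 steps is likewise an arbitrary assumption.

```python
import numpy as np

def make_windows(series, window_size):
    """Split a 1-D series into windows and the value that follows each one.

    Hypothetical helper for illustration. Row i of the first result holds
    series[i : i + window_size]; entry i of the second is series[i + window_size].
    """
    series = np.asarray(series)
    n = len(series) - window_size
    # Row i of the index grid is [i, i+1, ..., i+window_size-1]
    idx = np.arange(n)[:, None] + np.arange(window_size)[None, :]
    windows = series[idx]
    next_values = series[window_size:]
    return windows, next_values

# Stand-in data (an assumption, not the Chapter 9 series): a noisy sine wave
rng = np.random.default_rng(0)
time = np.arange(1461)
series = np.sin(time * 2 * np.pi / 365) + 0.1 * rng.standard_normal(len(time))

windows, next_values = make_windows(series, window_size=30)

# The window starting at time step 1,170 covers steps 1,170 through 1,199,
# and the value that follows it is the one at time step 1,200.
x = windows[1170]
y = next_values[1170]
```

With a window size of 30, a series of 1,461 values yields 1,431 windows, one for every time step that has 30 values before it.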

Now this begins to look familiar: you can consider the values from 1,170–1,199 to be your features and the value at 1,200 to be your label. If you can ...

