2
Data Preparation for Deep Learning Projects
The first step in every machine learning (ML) project is data collection and preparation. As a subset of ML, deep learning (DL) relies on the same data processing steps. We will start this chapter by setting up a standard DL Python notebook environment using Anaconda. Then, we will provide concrete examples of collecting data in various formats (JSON, CSV, HTML, and XML). In many cases, the collected data must be cleaned and preprocessed because it contains unnecessary information or invalid formats.
The chapter will introduce popular techniques in this domain: filling in missing values, dropping unnecessary entries, and normalizing the values. Next, you will learn common feature ...
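As a rough illustration of the three cleaning techniques named above, the following minimal sketch uses only the Python standard library. The sample data, the column names (`name`, `age`, `notes`), and every helper function are hypothetical and chosen for illustration; the chapter itself may rely on different tools. Mean imputation and min-max scaling are assumed here as one possible way to fill in and normalize values.

```python
import csv
import io

# Hypothetical sample data; the empty "age" cell stands in for a missing value.
RAW_CSV = """name,age,notes
Ann,30,ok
Bob,,ok
Cy,50,ok
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_column(rows, column):
    """Drop an unnecessary entry (here: a whole column) from every row."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


def fill_missing_with_mean(rows, column):
    """Convert a column to float, replacing empty cells with the column mean.

    Assumes at least one row has a value in the column.
    """
    present = [float(r[column]) for r in rows if r[column] != ""]
    mean = sum(present) / len(present)
    return [
        {**r, column: float(r[column]) if r[column] != "" else mean}
        for r in rows
    ]


def min_max_normalize(rows, column):
    """Rescale a numeric column to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread, so map it to 0.0 to avoid dividing by zero.
    return [
        {**r, column: (r[column] - lo) / span if span else 0.0}
        for r in rows
    ]


def prepare(text):
    """Run the three cleaning steps in sequence."""
    rows = load_rows(text)
    rows = drop_column(rows, "notes")
    rows = fill_missing_with_mean(rows, "age")
    return min_max_normalize(rows, "age")
```

Calling `prepare(RAW_CSV)` returns one dict per input row with the `notes` column removed and `age` rescaled. The order of the steps matters in this sketch: the missing value is filled before normalization so that the minimum and maximum are computed over a complete column.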