Chapter 8. Learning signal and ignoring noise: introduction to regularization and batching

In this chapter

  • Overfitting
  • Dropout
  • Batch gradient descent

“With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”

John von Neumann, mathematician, physicist, computer scientist, and polymath

Three-layer network on MNIST

Let’s return to the MNIST dataset and attempt to classify it with the new network.

In the last several chapters, you’ve learned that neural networks model correlation. The hidden layer (the middle one in the three-layer network) can even create intermediate correlation to help solve a task (seemingly out of thin air). How do you know the network is creating good correlation?
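To make this concrete, here is a minimal sketch of such a three-layer network trained on MNIST with plain NumPy. It assumes Keras is installed purely as a convenient MNIST loader, and the hyperparameters (alpha = 0.005, 40 hidden units, 1,000 training images, 300 iterations) are illustrative choices rather than the chapter’s exact values.

import numpy as np
np.random.seed(1)

from tensorflow.keras.datasets import mnist   # assumes TensorFlow/Keras is installed
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten the 28x28 images, scale pixels to [0, 1], and keep the first 1,000 examples
images = x_train[:1000].reshape(1000, 28 * 28) / 255.0
labels = np.zeros((1000, 10))
labels[np.arange(1000), y_train[:1000]] = 1   # one-hot encode the digit labels

def relu(x):
    return (x > 0) * x        # pass positive values through, zero out the rest

def relu_deriv(x):
    return x > 0              # derivative of ReLU: 1 where the input was positive

alpha, iterations, hidden_size = 0.005, 300, 40
weights_0_1 = 0.2 * np.random.random((28 * 28, hidden_size)) - 0.1
weights_1_2 = 0.2 * np.random.random((hidden_size, 10)) - 0.1

for it in range(iterations):
    error = 0.0
    for i in range(len(images)):
        layer_0 = images[i:i + 1]                    # input layer  (1 x 784)
        layer_1 = relu(layer_0.dot(weights_0_1))     # hidden layer (1 x 40)
        layer_2 = layer_1.dot(weights_1_2)           # output layer (1 x 10)

        error += np.sum((labels[i:i + 1] - layer_2) ** 2)

        # Backpropagate: output delta, then hidden delta through the ReLU derivative
        layer_2_delta = labels[i:i + 1] - layer_2
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu_deriv(layer_1)

        weights_1_2 += alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 += alpha * layer_0.T.dot(layer_1_delta)

A network like this will drive its training error very low, which is exactly where the chapter’s question comes in: low training error alone doesn’t tell you whether the correlation it found will hold up on images it hasn’t seen.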

When we discussed ...
