Chapter 5. Relational Databases

In Chapter 2, Data Preprocessing, we looked at some standard ways that data is stored. We saw that small unstructured datasets are often stored as text files, using spaces, tabs, or commas to separate the data fields. Small, structured datasets are better handled by formats such as XML and JSON.

A database is a large, usually structured data collection that is accessed by an independent software system.

In this chapter, we will look at relational databases and the relational database systems that manage them. In Chapter 10, Working with NoSQL Databases, we will examine non-relational databases.

The relation data model

A relational database (RDB) is a database that stores its data in tables that are related by certain ...
