Preface

The MySQL database management system is popular for many reasons. It’s fast, and it’s easy to set up, use, and administer. It runs under many varieties of Unix and Windows, and MySQL-based programs can be written in many languages.

MySQL’s popularity raises the need to address questions its users have about how to solve specific problems. That is the purpose of MySQL Cookbook: to serve as a handy resource to which you can turn for quick solutions or techniques for attacking particular types of questions that come up when you use MySQL. Naturally, because it’s a cookbook, it contains recipes: straightforward instructions you can follow, rather than how to develop your own code from scratch. It’s written using a problem-and-solution format designed to be extremely practical and to make the contents easy to read and assimilate. It contains many short sections, each describing how to write a query, apply a technique, or develop a script to solve a problem of limited and specific scope. This book doesn’t develop full-fledged, complex applications. Instead, it assists you in developing such applications yourself by helping you get past problems that have you stumped.

For example, a common question is How can I deal with quotes and special characters in data values when I’m writing queries? That’s not difficult, but figuring out how to do it is frustrating when you’re not sure where to start. This book demonstrates what to do; it shows you where to begin and how to proceed from there. This knowledge will serve you repeatedly because after you see what’s involved, you’ll be able to apply the technique to any kind of data, such as text, images, sound or video clips, news articles, compressed files, or PDF documents. Another common question is Can I access data from multiple tables at the same time? The answer is Yes, and it’s easy to do because it’s just a matter of knowing the proper SQL syntax. But it’s not always clear how until you see examples, which this book gives you. Other techniques that you’ll learn from this book include how to do the following:

  • Use SQL to select, sort, and summarize rows

  • Find matches or mismatches between tables

  • Perform transactions

  • Determine intervals between dates or times, including age calculations

  • Identify or remove duplicate rows

  • Use LOAD DATA to read your datafiles properly or find which values in the file are invalid

  • Use CHECK constraints to prevent entry of bad data into your database

  • Generate sequence numbers to use as unique row identifiers

  • Use a view as a virtual table

  • Write stored procedures and functions, set up triggers that activate to perform specific data-handling operations when you insert or update table rows, and use the Event Scheduler to run queries on a schedule

  • Set up replication

  • Manage user accounts

  • Control server logging

One part of using MySQL is understanding how to communicate with the server—that is, how to use Structured Query Language (SQL; pronounced “sequel”), the language in which queries are formulated. Therefore, one major emphasis of this book is using SQL to formulate queries that answer particular kinds of questions. One helpful tool for learning and using SQL is the mysql client program that is included in MySQL distributions. You can use client interactively to send SQL statements to the server and see the results. This is extremely useful because it provides a direct interface to SQL—so useful, in fact, that the first chapter is devoted to mysql.

But the ability to issue SQL queries alone is not enough. Information extracted from a database often requires further processing or presentation in a particular way. What if you have queries with complex interrelationships, such as when you need to use the results of one query as the basis for others? What if you need to generate a specialized report with very specific formatting requirements? These problems bring us to the other major emphasis of the book—how to write programs that interact with the MySQL server through an application programming interface (API). When you know how to use MySQL from within the context of a programming language, you gain other ways to exploit MySQL’s capabilities:

  • You can save query results and reuse them later.

  • You have full access to the expressive power of a general-purpose programming language. This enables you to make decisions based on the success or failure of a query, or on the content of the rows that are returned, and then tailor the actions taken accordingly.

  • You can format and display query results however you like. If you’re writing a command-line script, you can generate plain text. If it’s a web-based script, you can generate an HTML table. If it’s an application that extracts information for transfer to some other system, you might generate a datafile expressed in XML or JSON.

Combining SQL with a general-purpose programming language gives you an extremely flexible framework for issuing queries and processing their results. Programming languages increase your capability to perform complex database operations. But that doesn’t mean this book is complex. It keeps things simple, showing how to construct small building blocks using techniques that are easy to understand and easily mastered.

We’ll leave it to you to combine these techniques in your own programs, which you can do to produce arbitrarily complex applications. After all, the genetic code is based on only four nucleic acids, but these basic elements have been combined to produce the astonishing array of biological life we see all around us. Similarly, there are only 12 notes in the scale, but in the hands of skilled composers, they are interwoven to produce a rich and endless variety of music. In the same way, when you take a set of simple recipes, add your imagination, and apply them to the database programming problems you want to solve, you can produce applications that perhaps are not works of art but are certainly useful and will help you and others be more productive.

Who This Book Is For

This book will be useful for anybody who uses MySQL, ranging from individuals who want to use a database for personal projects such as a blog or wiki, to professional database and web developers. The book is also intended for people who do not know how to use MySQL but would like to.

If you’re new to MySQL, you’ll find lots of ways to use it here. If you’re more experienced, you’re probably already familiar with many of the problems addressed here but may not have had to solve them before and should find the book a great time-saver. Take advantage of the recipes given in the book, and use them in your own programs rather than writing the code from scratch.

The material ranges from introductory to advanced, so if a recipe describes techniques that seem obvious to you, skip it. Conversely, if you don’t understand a recipe, set it aside and come back to it later, perhaps after reading some of the other recipes.

What’s in This Book

It’s very likely when you use this book that you’re trying to develop an application but are not sure how to implement certain pieces of it. In this case, you already know what type of problem you want to solve; check the table of contents or the index for a recipe that shows how to do what you want. Ideally, the recipe will be just what you had in mind. Alternatively, you may be able to adapt a recipe for a similar problem to suit the issue at hand. We explain the principles involved in developing each technique so that you can modify it to fit the particular requirements of your own applications.

Another way to approach this book is to just read through it with no specific problem in mind. This can give you a broader understanding of the things MySQL can do, so we recommend that you page through the book occasionally. It’s a more effective tool if you know the kinds of problems it addresses.

As you get into later chapters, you’ll find recipes that assume a knowledge of topics covered in earlier chapters. This also applies within a chapter, where later sections often use techniques discussed earlier in the chapter. If you jump into a chapter and find a recipe that uses a technique with which you’re not familiar, check the table of contents or the index to find where the technique is explained earlier. For example, if a recipe sorts a query result using an ORDER BY clause that you don’t understand, turn to Chapter 9, which discusses various sorting methods and explains how they work.

Here’s a summary of each chapter to give you an overview of the book’s contents.

Chapter 1, “Using the mysql Client Program”, describes how to use the standard MySQL command-line client. mysql is often the first or primary interface to MySQL that people use, and it’s important to know how to exploit its capabilities. This program enables you to issue queries and see their results interactively, so it’s good for quick experimentation. You can also use it in batch mode to execute canned SQL scripts or send its output into other programs. In addition, the chapter discusses other ways to use mysql, such as how to make long lines more readable or generate output in various formats.

Chapter 2, “Using MySQL Shell”, introduces the new MySQL command-line client, developed by the MySQL Team for versions 5.7 and newer. mysqlsh is compatible with mysql when it is running in SQL mode but also supports NoSQL in JavaScript and Python programming interfaces. With MySQL Shell, you can run SQL and NoSQL queries and automate many administrative tasks easily.

Chapter 3, “MySQL Replication”, describes how to set up and use replication. Some of the content in this chapter is advanced. However, we decided to place it in the beginning of the book, because the replication is necessary for stable MySQL installations that can survive such disasters as corruptions or hardware failures. Practically, any production MySQL installation should use one of the replication setups. While setting up a replication is an administrative task, we believe that all MySQL users need to have knowledge of how the replication works and, as a result, write effective queries that would be performant on both source and replica servers.

Chapter 4, “Writing MySQL-Based Programs”, demonstrates the essential elements of MySQL programming: how to connect to the server, issue queries, retrieve the results, and handle errors. It also discusses how to handle special characters and NULL values in queries, how to write library files to encapsulate code for commonly used operations, and various ways to gather the parameters needed for making connections to the server.

Chapter 5, “Selecting Data from Tables”, covers several aspects of the SELECT statement, which is the primary vehicle for retrieving data from the MySQL server: specifying which columns and rows you want to retrieve, dealing with NULL values, and selecting one section of a query result. Later chapters cover some of these topics in more detail, but this chapter provides an overview of the concepts on which they depend if you need some introductory background on row selection or don’t yet know a lot about SQL.

Chapter 6, “Table Management”, covers table cloning, copying results into other tables, using temporary tables, and checking or changing a table’s storage engine.

Chapter 7, “Working with Strings”, describes how to deal with string data. It covers character sets and collations, string comparisons, dealing with case-sensitivity issues, pattern matching, breaking apart and combining strings, and performing FULLTEXT searches.

Chapter 8, “Working with Dates and Times”, shows how to work with temporal data. It describes MySQL’s date format and how to display date values in other formats. It also covers how to use MySQL’s special TIMESTAMP data type, how to set the time zone, how to convert between different temporal units, how to perform date arithmetic to compute intervals or generate one date from another, and how to perform leap-year calculations.

Chapter 9, “Sorting Query Results”, describes how to put the rows of a query result in the order you want. This includes specifying the sort direction, dealing with NULL values, accounting for string case sensitivity, and sorting by dates or partial column values. It also provides examples that show how to sort special kinds of values, such as domain names, IP numbers, and ENUM values.

Chapter 10, “Generating Summaries”, shows techniques for assessing the general characteristics of a set of data, such as how many values it contains or its minimum, maximum, and average values.

Chapter 11, “Using Stored Routines, Triggers, and Scheduled Events”, describes how to write functions and procedures that are stored on the server side, triggers that activate when tables are modified, and events that execute on a scheduled basis.

Chapter 12, “Working with Metadata”, discusses how to get information about the data that a query returns, such as the number of rows or columns in the result, or the name and data type of each column. It also shows how to ask MySQL what databases and tables are available or how to determine the structure of a table.

Chapter 13, “Importing and Exporting Data”, describes how to transfer information between MySQL and other programs. This includes how to use LOAD DATA, convert files from one format to another, and determine table structure appropriate for a dataset.

Chapter 14, “Validating and Reformatting Data”, describes how to extract or rearrange columns in datafiles, check and validate data, and rewrite values such as dates that often come in a variety of formats.

Chapter 15, “Generating and Using Sequences”, discusses AUTO_INCREMENT columns, MySQL’s mechanism for producing sequence numbers. It shows how to generate new sequence values or determine the most recent value, how to resequence a column, and how to use sequences to generate counters. It also shows how to use AUTO_INCREMENT values to maintain a master-detail relationship between tables, including pitfalls to avoid.

Chapter 16, “Using Joins and Subqueries”, shows how to perform operations that select rows from multiple tables. It demonstrates how to compare tables to find matches or mismatches, produce master-detail lists and summaries, and enumerate many-to-many relationships.

Chapter 17, “Statistical Techniques”, illustrates how to produce descriptive statistics, frequency distributions, regressions, and correlations. It also covers how to randomize a set of rows or pick rows at random from the set.

Chapter 18, “Handling Duplicates”, discusses how to identify, count, and remove duplicate rows—and how to prevent them from occurring in the first place.

Chapter 19, “Working with JSON”, illustrates how to use JSON in MySQL. It covers such topics as validation, searching, and manipulation of JSON data. The chapter also discusses how to use MySQL as a Document Store.

Chapter 20, “Performing Transactions”, shows how to handle multiple SQL statements that must execute together as a unit. It discusses how to control MySQL’s auto-commit mode and how to commit or roll back transactions.

Chapter 22, “Server Administration”, is written for database administrators. It covers server configuration, the plug-in interface, and log management.

Chapter 23, “Monitoring the MySQL Server”, illustrates how to monitor and troubleshoot MySQL issues, such as startup or connection failures. It shows how to use MySQL log files, built-in instruments, and standard operating system utilities to get information about the performance of MySQL queries and internal structures.

Chapter 24, “Security”, is another administrative chapter. It discusses user account management, including creating accounts, setting passwords, and assigning privileges. It also describes how to implement password policy, find and fix insecure accounts, and expire or unexpire passwords.

MySQL APIs Used in This Book

MySQL programming interfaces exist for many languages, including C, C++, Eiffel, Go, Java, Perl, PHP, Python, Ruby, and Tcl. Given this fact, writing a MySQL cookbook presents authors with a challenge. The book should provide recipes for doing many interesting and useful things with MySQL, but which API or APIs should the book use? Showing an implementation of every recipe in every language results either in covering very few recipes or in a very, very large book! It also results in redundancies when implementations in different languages bear a strong resemblance to one another. On the other hand, it’s worthwhile taking advantage of multiple languages, because one is often more suitable than another for solving a particular problem.

To resolve this dilemma, we’ve chosen a small number of APIs to write the recipes in this book. This makes its scope manageable while permitting latitude to choose from multiple APIs:

  • The Perl DBI module

  • Ruby, using the Mysql2 gem

  • PHP, using the PDO extension

  • Python, using the MySQL Connector/Python driver for the DB API

  • Go, using the Go-MySQL-Driver for the sql interface

  • Java, using the MySQL Connector/J driver for the JDBC interface

Why these languages? Perl is a widely used language that was very popular for writing MySQL programs when the first edition of this book was published and is still used in many applications today. Ruby has an easy-to-use database-access module. PHP is widely deployed, especially on the web. Go is getting very popular lately and replaces other languages, especially Perl, in many MySQL applications. Python and Java each has a significant number of followers.

We believe these languages taken together reflect pretty well the majority of the existing user base of MySQL programmers. If you prefer some language not shown here, be sure to pay careful attention to Chapter 4, to familiarize yourself with the book’s primary APIs. Knowing how to perform database operations with the programming interfaces used here will help you translate recipes for other languages.

Version and Platform Notes

Development of the code in this book took place under MySQL 5.7 and 8.0. Because new features are added to MySQL on a regular basis, some examples will not work under older versions. For example, MySQL 5.7 introduces group replication, and MySQL 8.0 introduces CHECK constraints and common table expressions.

We do not assume that you are using Unix, although that is our own preferred development platform. (In this book, Unix also refers to Unix-like systems such as Linux and macOS X.) Most of the material here is applicable both to Unix and Windows.

Conventions Used in This Book

This book uses the following font conventions:

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Used to indicate text that you type when running commands.

Constant width italic

Used to indicate variable input; you should substitute a value of your own choosing.

Italic

Used for URLs, hostnames, names of directories and files, Unix commands and options, programs, and occasionally for emphasis.

Tip

This element signifies a tip or suggestion.

Caution

This element indicates a warning or caution.

Note

This element signifies a general note.

Commands often are shown with a prompt to illustrate the context in which they are used. Commands issued from the command line are shown with a $ prompt:

$ chmod 600 my.cnf

That prompt is one that Unix users are used to seeing, but it doesn’t necessarily signify that a command works only under Unix. Unless indicated otherwise, commands shown with a $ prompt generally should work under Windows, too.

If you should run a command under Unix as the root user, the prompt is # instead:

# perl -MCPAN -e shell

Commands that are specific to Windows use the C:\> prompt:

C:\> "C:\Program Files\MySQL\MySQL Server 5.6\bin\mysql"

SQL statements that are issued from within the mysql client program are shown with a mysql> prompt and terminated with a semicolon:

mysql> SELECT * FROM my_table;

For examples that show a query result as you would see it when using mysql, we sometimes truncate the output, using an ellipsis (...) to indicate that the result consists of more rows than are shown. The following query produces many rows of output, from which those in the middle have been omitted:

mysql> SELECT name, abbrev FROM states ORDER BY name;
+----------------+--------+
| name           | abbrev |
+----------------+--------+
| Alabama        | AL     |
| Alaska         | AK     |
| Arizona        | AZ     |
…
| West Virginia  | WV     |
| Wisconsin      | WI     |
| Wyoming        | WY     |
+----------------+--------+

Examples that show only the syntax for SQL statements do not include the mysql> prompt, but they do include semicolons as necessary to make it clearer where statements end. For example, this is a single statement:

CREATE TABLE t1 (i INT)
SELECT * FROM t2;

But this example represents two statements:

CREATE TABLE t1 (i INT);
SELECT * FROM t2;

The semicolon is a notational convenience used within mysql as a statement terminator. But it is not part of SQL itself, so when you issue SQL statements from within programs that you write (for example, using Perl or Java), don’t include terminating semicolons.

If a statement or a command output is too long and does not fit the book page, we use the symbol to show that the line was indented to fit:

mysql> SELECT 'Mysql: The Definitive Guide to Using, Programming,↩
    -> and Administering Mysql 4 (Developer\'s Library)' AS book;
+-----------------------------------------------------+
| book                                                |
+-----------------------------------------------------+
| Mysql: The Definitive Guide to Using, Programming,↩
  and Administering Mysql 4 (Developer's Library)     |
+-----------------------------------------------------+
1 row in set (0,00 sec)

The MySQL Cookbook Companion GitHub Repository

MySQL Cookbook has a companion GitHub repository where you can obtain source code and sample data for examples developed throughout this book, and auxiliary documentation.

Recipe Source Code and Data

The examples in this book are based on source code and sample data from a distribution named recipes available at the companion GitHub repository.

The recipes distribution is the primary source of examples, and references to it occur throughout the book. The distribution is also available as a compressed TAR file (recipes.tar.gz) or as a ZIP file (recipes.zip). Either distribution format when unpacked creates a directory named mysqlcookbook-VERSION/recipes.

Use the recipes distribution to save yourself a lot of typing. For example, when you see a CREATE TABLE statement in the book that describes what a database table looks like, you’ll usually find a SQL batch file in the tables directory that you can use to create the table instead of entering the definition manually. Change location into the tables directory and execute the following command, where filename is the name of the file containing the CREATE TABLE statement:

$ mysql cookbook < filename

If you need to specify MySQL username or password options, add them to the command line.

To import all the tables from the recipes distribution, use the command:

$ mysql cookbook < cookbook.sql

The recipes distribution contains programs as shown in the book, but in many cases it also includes implementations in additional languages. For example, a script shown in the book using Python may be available in the recipes distribution in Perl, Ruby, PHP, Go, or Java as well. This may save you translation effort should you wish to convert a program shown in the book to a different language.

Amazon Review Data (2018)

Amazon-related review data used in Chapter 7, “Working with Strings”, can be found at http://deepyeti.ucsd.edu/jianmo/amazon/index.html and can be downloaded using this form. Justifying recommendations using distantly labeled reviews and fined-grained aspects. Jianmo Ni, Jiacheng Li, Julian McAuley Empirical Methods in Natural Language Processing (EMNLP), 2019.

MySQL Cookbook Companion Documents

Some appendices included in previous MySQL Cookbook editions are now available in standalone form at the companion GitHub repository. They provide background information for topics covered in the book.

“Executing Programs from the Command Line” provides instructions for executing commands at the command prompt and setting environment variables such as PATH.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/svetasmirnova/mysqlcookbook.

If you have a technical question or a problem using the code examples, please email .

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “MySQL Cookbook, Fourth Edition by Sveta Smirnova and Alkin Tezuysal (O’Reilly). Copyright 2022 Sveta Smirnova and Alkin Tezuysal, 978-1-492-09316-9.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

O’Reilly Online Learning

Note

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/oreillymysql-ckbk4e.

Email to comment or ask technical questions about this book.

For more information about our books, courses, conferences, and news, see our website at https://www.oreilly.com.

Find us on LinkedIn: https://linkedin.com/company/oreilly-media

Follow us on Twitter: https://twitter.com/oreillymedia

Watch us on YouTube: https://www.youtube.com/oreillymedia

Acknowledgments

To each reader, thank you for reading our book. We hope that it serves you well and that you find it useful.

From Paul DuBois, for the Third Edition

Thanks to my technical reviewers, Johannes Schlüter, Geert Vanderkelen, and Ulf Wendel. They made several corrections and suggestions that improved the text in many ways, and I appreciate their help.

Andy Oram prodded me to begin the third edition and served as its editor, Nicole Shelby guided the book through production, and Kim Cofer and Lucie Haskins provided proofreading and indexing.

Thanks to my wife, Karen, whose encouragement and support throughout the writing process means more than I can say.

From Sveta Smirnova and Alkin Tezuysal

Many thanks to our technical reviewers for their invaluable contributions for this book.

Gillian Gunson not only provided comprehensive technical feedback but also showed how our text could be read by people with different backgrounds. Her language suggestions helped us make the recipes easier to read. Her attentiveness to details helped us identify inaccuracies and even potential risk areas that may show up when your database load grows. Gillian also reviewed all code examples and suggested how to make Ruby and Java code more aligned to current standards.

Ege Gunes reviewed all Go language examples to ensure they were aligned with Go’s standard style.

Karthik Appigatla, Timur Solodovnikov, Daniel Guzman Burgos, and Vladimir Fedorkov reviewed selected chapters of the book. Their suggested corrections helped us improve the book a great deal.

Andy Kwan invited us to write the fourth edition of this book. Amelia Blevins and Jeff Bleiel were our editors and helped make the book easier to read. Rita Fernando reviewed a few chapters and provided feedback that allowed us to make the book easier to read and be more aligned with O’Reilly standards.

From Sveta Smirnova

I want to thank my colleagues at Percona Support who understood that I needed to work a second shift on the book and allowed me to take time off when needed.

Many thanks to my husband, Serguei Lassounov, who always supports me in all of my professional endeavors.

From Alkin Tezuysal

I want to thank my wife, Aslihan, and my daughters, Ilayda and Lara, for their patience and support when I needed to focus and use family time to write this book.

Many thanks to my colleagues and team at PlanetScale, especially Deepthi Sigireddi, for her extra care and support. Special thanks go to the MySQL community and my friends, and family members as well.

I also want to take a moment to thank Sveta Smirnova for her endless support while coaching me throughout my first book journey.

Get MySQL Cookbook, 4th Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.