Chapter 4. Writing MySQL-Based Programs

4.0 Introduction

This chapter discusses how to use MySQL from within the context of a general-purpose programming language. It covers basic application programming interface (API) operations that are fundamental to and form the basis for the programming recipes developed in later chapters. These operations include connecting to the MySQL server, executing statements, and retrieving the results.

MySQL-based client programs can be written using many languages. This book covers the languages and interfaces shown in Table 4-1 (for information on obtaining the interface software, see the Preface).

Table 4-1. Languages and interfaces covered in this book
LanguageInterface
PerlPerl DBI
RubyMysql2 gem
PHPPDO
PythonDB-API
GoGo sql
JavaJDBC

MySQL client APIs provide the following capabilities, each covered in a section of this chapter:

Connecting to the MySQL server, selecting a database, and disconnecting from the server

Every program that uses MySQL must first establish a connection to the server. Most programs also select a default database, and well-behaved MySQL programs close the connection to the server when they’re done with it.

Checking for errors

Any database operation can fail. If you know how to find out when that occurs and why, you can take appropriate action, such as terminating the program or informing the user of the problem.

Executing SQL statements and retrieving results

The point of connecting to a database server is to execute SQL statements. Each API provides at least one way to do this, as well as methods for processing statement results.

Handling special characters and NULL values in statements

Data values can be embedded directly in statement strings. However, some characters such as quotes and backslashes have special meaning, and their use requires certain precautions. The same is true for NULL values. If you handle these improperly, your programs will generate SQL statements that are erroneous or yield unexpected results. If you incorporate data from external sources into queries, your program might become subject to SQL injection attacks. Most APIs enable you to avoid these problems by using placeholders, which refer to data values symbolically in a statement to be executed and supply those values separately. The API inserts data into the statement string after properly encoding any special characters or NULL values. Placeholders are also known as parameter markers.

Identifying NULL values in result sets

NULL values are special not only when you construct statements but also in the results returned from them. Each API provides a convention for recognizing and dealing with them.

No matter which programming language you use, it’s necessary to know how to perform each of the fundamental database API operations just described, so this chapter shows each operation in all five languages. Seeing how each API handles a given operation should help you see the correspondences between APIs more easily and better understand the recipes shown in the following chapters, even if they’re written in a language you don’t use much. (Later chapters usually implement recipes using only one or two languages.)

It may seem overwhelming to see each recipe in several languages if your interest is in only one particular API. If so, we advise you to read just the introductory recipe part that provides the general background, then go directly to the section for the language in which you’re interested. Skip the other languages; should you develop an interest in them later, come back and read about them then.

This chapter also discusses the following topics, which are not directly part of the MySQL APIs but help you use them more easily:

Writing library files

As you write program after program, you find that you carry out certain operations repeatedly. Library files enable encapsulating code for those operations so they can be performed easily from multiple scripts without repeating the code in each one. This reduces code duplication and makes your programs more portable. This chapter shows how to write a library file for each API that includes a routine for connecting to the server—one operation that every program that uses MySQL must perform. Later chapters develop additional library routines for other operations.

Additional techniques for obtaining connection parameters

An early section on establishing connections to the MySQL server relies on connection parameters hardwired into the code. However, there are other (and better) ways to obtain parameters, ranging from storing them in a separate file to enabling the user to specify them at runtime.

To avoid manually typing in the example programs, get a copy of the recipes source distribution (see the Preface). Then, when an example says something like Create a file named xyz that contains the following information, you can use the corresponding file from the recipes distribution. Most scripts for this chapter are located under the api directory; library files are located in the lib directory.

The primary table used for examples in this chapter is named profile. It first appears in Recipe 4.4, which you should know in case you skip around in the chapter and wonder where it came from. See also the section at the very end of the chapter about resetting the profile table to a known state for use in other chapters.

Note

The programs discussed here can be run from the command line. For instructions on invoking programs for each language covered here, read cmdline.pdf in the recipes distribution.

Assumptions

To use the material in this chapter most effectively, make sure to satisfy these requirements:

  • Install MySQL programming support for any languages that you plan to use (see the Preface).

  • You should already have set up a MySQL user account for accessing the server and a database for executing SQL statements. As described in Recipe 1.1, the examples in this book use a MySQL account that has a username and password of cbuser and cbpass, and we’ll connect to a MySQL server running on the local host to access a database named cookbook. To create the account or the database, see the instructions in that recipe.

  • The discussion here shows how to use each API language to perform database operations but assumes a basic understanding of the language itself. If a recipe uses programming constructs with which you’re unfamiliar, consult a general reference for the language of interest.

  • Proper execution of some of the programs might require that you set certain environment variables. General syntax for doing so is covered in cmdline.pdf in the recipes distribution (see the Preface). For details about environment variables that apply specifically to library file locations, see Recipe 4.3.

MySQL Client API Architecture

Each MySQL programming interface covered in this book uses a two-level architecture:

  • The upper level provides database-independent methods that implement database access in a portable way that’s the same whether you use MySQL, PostgreSQL, Oracle, or whatever.

  • The lower level consists of a set of drivers, each of which implements the details for a single database system.

This two-level architecture enables application programs to use an abstract interface not tied to details specific to any particular database server. This enhances portability of your programs: to use a different database system, just select a different lower-level driver. However, perfect portability is elusive:

  • The interface methods provided by the upper level of the architecture are consistent regardless of the driver you use, but it’s still possible to write SQL statements that use constructs supported only by a particular server. For example, MySQL has SHOW statements that provide information about database and table structure, but using SHOW with a non-MySQL server likely will produce an error.

  • Lower-level drivers often extend the abstract interface to make it more convenient to access database-specific features. For example, the MySQL driver for Perl DBI makes the most recent AUTO_INCREMENT value available as a database handle attribute accessible as $dbh->{mysql_insertid}. Such features make a program easier to write but less portable. To use the program with another database system will require some rewriting.

Despite these factors that compromise portability to some extent, the general portability characteristics of the two-level architecture provide significant benefits for MySQL developers.

Another characteristic common to the APIs used in this book is that they are object oriented. Whether you write in Perl, Ruby, PHP, Python, Java, or Go, the operation that connects to the MySQL server returns an object that enables you to process statements in an object-oriented manner. For example, when you connect to the database server, you get a database connection object with which to further interact with the server. The interfaces also provide objects for statements, result sets, metadata, and so forth.

Now let’s see how to use these programming interfaces to perform the most fundamental MySQL operations: connecting to and disconnecting from the server.

4.1 Connecting, Selecting a Database, and Disconnecting

Problem

You need to establish a connection to the database server and shut down the connection when you’re done.

Solution

Each API provides routines for connecting and disconnecting. The connection routines require that you provide parameters specifying the host on which the MySQL server is running and the MySQL account to use. You can also select a default database.

Discussion

This section shows how to perform some fundamental operations common to most MySQL programs:

Establishing a connection to the MySQL server

Every program that uses MySQL does this, no matter which API you use. The details on specifying connection parameters vary between APIs, and some APIs provide more flexibility than others. However, there are many common parameters, such as the host on which the server is running, and the username and password of the MySQL account to use for accessing the server.

Selecting a database

Most MySQL programs select a default database.

Disconnecting from the server

Each API provides a way to close an open connection. It’s best to do so as soon as you’re done using the server. If your program holds the connection open longer than necessary, the server cannot free up resources allocated to servicing the connection. It’s also preferable to close the connection explicitly. If a program simply terminates, the MySQL server eventually notices, but an explicit close on the user end enables the server to perform an immediate orderly close on its end.

This section includes example programs that show how to use each API to connect to the server, select the cookbook database, and disconnect. The discussion for each API also indicates how to connect without selecting any default database. This might be the case if you plan to execute a statement that doesn’t require a default database, such as SHOW VARIABLES or SELECT VERSION(). Or perhaps you’re writing a program that enables the user to specify the database after the connection has been made.

Tip

The scripts shown here use localhost as the hostname. If they produce a connection error indicating that a socket file cannot be found, try changing localhost to 127.0.0.1, the TCP/IP address of the local host. This tip applies throughout the book.

Perl

To write MySQL scripts in Perl, the DBI module must be installed, as well as the MySQL-specific driver module, DBD::mysql. To obtain these modules if they’re not already installed, see the Preface.

The following Perl script, connect.pl, connects to the MySQL server, selects cookbook as the default database, and disconnects:

#!/usr/bin/perl
# connect.pl: connect to the MySQL server

use strict;
use warnings;
use DBI;

my $dsn = "DBI:mysql:host=localhost;database=cookbook";
my $dbh = DBI->connect ($dsn, "cbuser", "cbpass")
            or die "Cannot connect to server\n";
print "Connected\n";
$dbh->disconnect ();
print "Disconnected\n";

To try connect.pl, locate it under the api directory of the recipes distribution and run it from the command line. The program should print two lines indicating that it connected and disconnected successfully:

$ perl connect.pl
Connected
Disconnected

In the rest of the section, we will walk through the code and explain how it works.

Tip

If you get an Access Denied error when you connect to MySQL 8.0, ensure that the version of DBD::MySQL is linked with the MySQL 8.0 client library, or use the authentication plug-in mysql_native_password instead of the default caching_sha2_password plug-in. We discuss authentication plug-ins in Recipe 24.2.

For background on running Perl programs, read cmdline.pdf in the recipes distribution (see the Preface).

The use strict line turns on strict variable checking and causes Perl to complain about any variables that are used without having been declared first. This precaution helps find errors that might otherwise go undetected.

The use warnings line turns on warning mode so that Perl produces warnings for any questionable constructs. Our example script has none, but it’s a good idea to get in the habit of enabling warnings to catch problems that occur during the script development process. use warnings is similar to specifying the Perl -w command-line option but provides more control over which warnings to display. (For more information, execute a perldoc warnings command.)

The use DBI statement tells Perl to load the DBI module. It’s unnecessary to load the MySQL driver module (DBD::mysql) explicitly. DBI does that itself when the script connects to the database server.

The next two lines establish the connection to MySQL by setting up a data source name (DSN) and calling the DBI connect() method. The arguments to connect() are the DSN, the MySQL username and password, and any connection attributes you want to specify. The DSN is required. The other arguments are optional, although usually it’s necessary to supply a username and password.

The DSN specifies which database driver to use and other options that indicate where to connect. For MySQL programs, the DSN has the format DBI:mysql:options. The second colon in the DSN is required even if you specify no following options.

Use the DSN components as follows:

  • The first component is always DBI. It’s not case sensitive.

  • The second component tells DBI which database driver to use, and it is case sensitive. For MySQL, the name must be mysql.

  • The third component, if present, is a semicolon-separated list of name=value pairs that specify additional connection options, in any order. For our purposes, the two most relevant options are host and database, to specify the hostname where the MySQL server is running and the default database.

Based on that information, the DSN for connecting to the cookbook database on the local host localhost looks like this:

DBI:mysql:host=localhost;database=cookbook

If you omit the host option, its default value is localhost. These two DSNs are equivalent:

DBI:mysql:host=localhost;database=cookbook
DBI:mysql:database=cookbook

To select no default database, omit the database option.

The second and third arguments of the connect() call are your MySQL username and password. Following the password, you can also provide a fourth argument to specify attributes that control DBI’s behavior when errors occur. With no attributes, DBI by default prints error messages when errors occur but does not terminate your script. That’s why connect.pl checks whether connect() returns undef, which indicates failure:

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass")
            or die "Cannot connect to server\n";

Other error-handling strategies are possible. For example, to tell DBI to terminate the script if an error occurs in any DBI call, disable the PrintError attribute and enable RaiseError instead:

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass",
                        {PrintError => 0, RaiseError => 1});

Then you need not check for errors yourself. The trade-off is that you also lose the ability to decide how your program recovers from errors. Recipe 4.2 discusses error handling further.

Another common attribute is AutoCommit, which sets the connection’s auto-commit mode for transactions. MySQL enables this by default for new connections, but we’ll set it from this point on to make the initial connection state explicit:

my $dbh = DBI->connect ($dsn, "cbuser", "cbpass",
                        {PrintError => 0, RaiseError => 1, AutoCommit => 1});

As shown, the fourth argument to connect() is a reference to a hash of attribute name/value pairs. An alternative way of writing this code follows:

my $conn_attrs = {PrintError => 0, RaiseError => 1, AutoCommit => 1};
my $dbh = DBI->connect ($dsn, "cbuser", "cbpass", $conn_attrs);

Use whichever style you prefer. Scripts in this book use the $conn_attr hashref to make connect() calls simpler to read.

Assuming that connect() succeeds, it returns a database handle that contains information about the state of the connection. (In DBI parlance, references to objects are called handles.) Later we’ll see other handles, such as statement handles, which are associated with particular statements. Perl DBI scripts in this book conventionally use $dbh and $sth to signify database and statement handles.

To specify the path to a socket file for localhost connections on Unix, provide a mysql_socket option in the DSN:

my $dsn = "DBI:mysql:host=localhost;database=cookbook"
          . ";mysql_socket=/var/tmp/mysql.sock";

To specify the port number for non-localhost (TCP/IP) connections, provide a port option:

my $dsn = "DBI:mysql:host=127.0.0.1;database=cookbook;port=3307";

Ruby

To write MySQL scripts in Ruby, the Mysql2 gem must be installed. To obtain this gem if it is not already installed, see the Preface.

The following Ruby script, connect.rb, connects to the MySQL server, selects cookbook as the default database, and disconnects:

#!/usr/bin/ruby -w
# connect.rb: connect to the MySQL server

require "mysql2"

begin
  client = Mysql2::Client.new(:host => "localhost", 
                              :username => "cbuser",
                              :password => "cbpass",
                              :database => "cookbook")
  puts "Connected"
rescue => e
  puts "Cannot connect to server"
  puts e.backtrace
  exit(1)
ensure
  client.close()
  puts "Disconnected"
end

To try connect.rb, locate it under the api directory of the recipes distribution and run it from the command line. The program should print two lines indicating that it connected and disconnected successfully:

$ ruby connect.rb
Connected
Disconnected

For background on running Ruby programs, read cmdline.pdf in the recipes distribution (see the Preface).

The -w option turns on warning mode so that Ruby produces warnings for any questionable constructs. Our example script has no such constructs, but it’s a good idea to get in the habit of using -w to catch problems that occur during the script development process.

The require statement tells Ruby to load the Mysql2 module.

To establish the connection, create a Mysql2::Client object. Pass connection parameters as named arguments for the method new.

To select no default database, omit the database option.

Assuming that the Mysql2::Client object is successfully created, it will act as a database handle that contains information about the state of the connection. Ruby scripts in this book conventionally use client to signify a database handle object.

If the new() method fails, it raises an exception. To handle exceptions, put the statements that might fail inside a begin block, and use a rescue clause that contains the error-handling code. Exceptions that occur at the top level of a script (that is, outside of any begin block) are caught by the default exception handler, which prints a stack trace and exits. Recipe 4.2 discusses error handling further.

To specify the path to a socket file for localhost connections on Unix, provide a socket option for the method new:

client = Mysql2::Client.new(:host => "localhost",
                            :socket => "/var/tmp/mysql.sock",
                            :username => "cbuser",
                            :password => "cbpass",
                            :database => "cookbook")

To specify the port number for non-localhost (TCP/IP) connections, provide a port option:

client = Mysql2::Client.new(:host => "127.0.0.1",
                            :port => 3307,
                            :username => "cbuser",
                            :password => "cbpass",
                            :database => "cookbook")

PHP

To write PHP scripts that use MySQL, your PHP interpreter must have MySQL support compiled in. If your scripts are unable to connect to your MySQL server, check the instructions included with your PHP distribution to see how to enable MySQL support.

PHP actually has multiple extensions that enable the use of MySQL, such as mysql, the original (and now deprecated) MySQL extension; mysqli, the MySQL improved extension; and, more recently, the MySQL driver for the PDO (PHP Data Objects) interface. PHP scripts in this book use PDO. To obtain PHP and PDO if they’re not already installed, see the Preface.

PHP scripts usually are written for use with a web server. I assume that if you use PHP that way, you can copy PHP scripts into your server’s document tree, and request them from your browser, and they will execute. For example, if you run Apache as the web server on the host localhost and you install a PHP script named myscript.php at the top level of the Apache document tree, you should be able to access the script by requesting this URL:

http://localhost/myscript.php

This book uses the .php extension (suffix) for PHP script filenames, so your web server must be configured to recognize the .php extension. Otherwise, when you request a PHP script from your browser, the server simply sends the literal text of the script and that’s what appears in your browser window. You don’t want this to happen, particularly if the script contains the username and password for connecting to MySQL.

PHP scripts often are written as a mixture of HTML and PHP code, with the PHP code embedded between the special <?php and ?> tags. Here is an example:

<html>
<head><title>A simple page</title></head>
<body>
<p>
<?php
  print ("I am PHP code, hear me roar!");
?>
</p>
</body>
</html>

For brevity in examples consisting entirely of PHP code, typically I’ll omit the enclosing <?php and ?> tags. If you see no tags in a PHP example, assume that <?php and ?> surround the entire block of code that is shown. Examples that switch between HTML and PHP code do include the tags, to make it clear what is PHP code and what is not.

PHP can be configured to recognize short tags as well, written as <? and ?>. This book does not assume that you have short tags enabled and does not use them.

The following PHP script, connect.php, connects to the MySQL server, selects cookbook as the default database, and disconnects:

<?php
# connect.php: connect to the MySQL server

try
{
  $dsn = "mysql:host=localhost;dbname=cookbook";
  $dbh = new PDO ($dsn, "cbuser", "cbpass");
  print ("Connected\n");
}
catch (PDOException $e)
{
  die ("Cannot connect to server\n");
}
$dbh = NULL;
print ("Disconnected\n");
?>

To try connect.php, locate it under the api directory of the recipes distribution, copy it to your web server’s document tree, and request it using your browser. Alternatively, if you have a standalone version of the PHP interpreter for use from the command line, execute the script directly:

$ php connect.php
Connected
Disconnected

For background on running PHP programs, read cmdline.pdf in the recipes distribution (see the Preface).

$dsn is the data source name (DSN) that indicates how to connect to the database server. It has this general syntax:

driver:name=value;name=value ...

The driver value is the PDO driver type. For MySQL, this is mysql.

Following the driver name, semicolon-separated name=value pairs specify connection parameters, in any order. For our purposes, the two most relevant options are host and dbname, to specify the hostname where the MySQL server is running and the default database. To select no default database, omit the dbname option.

To establish the connection, invoke the new PDO() class constructor, passing to it the appropriate arguments. The DSN is required. The other arguments are optional, although usually it’s necessary to supply a username and password. If the connection attempt succeeds, new PDO() returns a database-handle object that is used to access other MySQL-related methods. PHP scripts in this book conventionally use $dbh to signify a database handle.

If the connection attempt fails, PDO raises an exception. To handle this, put the connection attempt within a try block and use a catch block that contains the error-handling code, or just let the exception terminate your script. Recipe 4.2 discusses error handling further.

To disconnect, set the database handle to NULL. There is no explicit disconnect call.

To specify the path to a socket file for localhost connections on Unix, provide a unix_socket option in the DSN:

$dsn = "mysql:host=localhost;dbname=cookbook"
         . ";unix_socket=/var/tmp/mysql.sock";

To specify the port number for non-localhost (TCP/IP) connections, provide a port option:

$dsn = "mysql:host=127.0.0.1;database=cookbook;port=3307";

Python

To write MySQL programs in Python, a module must be installed that provides MySQL connectivity for the Python DB API, also known as Python Database API Specification v2.0 (PEP 249). This book uses MySQL Connector/Python. To obtain it if it’s not already installed, see the Preface.

To use the DB API, import the database driver module that you want to use (which is mysql.connector for MySQL programs that use Connector/Python). Then create a database connection object by calling the driver’s connect() method. This object provides access to other DB API methods, such as the close() method that serves the connection to the database server.

The following Python script, connect.py, connects to the MySQL server, selects cookbook as the default database, and disconnects:

#!/usr/bin/python3
# connect.py: connect to the MySQL server

import mysql.connector

try:
  conn = mysql.connector.connect(database="cookbook",
                                 host="localhost",
                                 user="cbuser",
                                 password="cbpass")
  print("Connected")
except:
  print("Cannot connect to server")
else:
  conn.close()
  print("Disconnected")

To try connect.py, locate it under the api directory of the recipes distribution and run it from the command line. The program should print two lines indicating that it connected and disconnected successfully:

$ python3 connect.py
Connected
Disconnected

For background on running Python programs, read cmdline.pdf in the recipes distribution (see the Preface).

The import line tells Python to load the mysql.connector module. Then the script attempts to establish a connection to the MySQL server by calling connect() to obtain a connection object. Python scripts in this book conventionally use conn to signify connection objects.

If the connect() method fails, Connector/Python raises an exception. To handle exceptions, put the statements that might fail inside a try statement and use an except clause that contains the error-handling code. Exceptions that occur at the top level of a script (that is, outside of any try statement) are caught by the default exception handler, which prints a stack trace and exits. Recipe 4.2 discusses error handling further.

The else clause contains statements that execute if the try clause produces no exception. It’s used here to close the successfully opened connection.

Because the connect() call uses named arguments, their order does not matter. If you omit the host argument from the connect() call, its default value is 127.0.0.1. To select no default database, omit the database argument or pass a database value of "" (the empty string) or None.

Another way to connect is to specify the parameters using a Python dictionary and pass the dictionary to connect():

conn_params = {
  "database": "cookbook",
  "host": "localhost",
  "user": "cbuser",
  "password": "cbpass",
}
conn = mysql.connector.connect(**conn_params)
print("Connected")

This book generally uses that style from now on.

To specify the path to a socket file for local host connections on Unix, omit the host parameter and provide a unix_socket parameter:

conn_params = {
  "database": "cookbook",
  "unix_socket": "/var/tmp/mysql.sock",
  "user": "cbuser",
  "password": "cbpass",
}
conn = mysql.connector.connect(**conn_params)
print("Connected")

To specify the port number for TCP/IP connections, include the host parameter and provide an integer-valued port parameter:

conn_params = {
  "database": "cookbook",
  "host": "127.0.0.1",
  "port": 3307,
  "user": "cbuser",
  "password": "cbpass",
}
conn = mysql.connector.connect(**conn_params)

Go

To write MySQL programs in Go, a Go SQL Driver must be installed. This book uses Go-MySQL-Driver. To obtain it if it’s not already installed, install Git, then issue the following command:

$ go get -u github.com/go-sql-driver/mysql

To use the Go SQL interface, import the database/sql package and your driver package. Then create a database connection object by calling the sql.Open() function. This object provides access to other database/sql package functions, such as the db.Close() that closes the connection to the database server. We also use a defer statement to call the db.Close() to make sure the function call is performed later in the program execution. You will see this usage throughout this chapter.

Tip

The Go database/sql package and the Go-MySQL-Driver support context cancellation. This means that you can cancel database operations, such as running a query, if you cancel the context. To use this feature, you need to call context-aware functions of the sql interface. For brevity, we didn’t use Context in our examples in this chapter. We include an example using Context when we discuss transaction handling in Recipe 20.9.

The following Go script, connect.go, connects to the MySQL server, selects cookbook as the default database, and disconnects:

// connect.go: connect to MySQL server
package main

import (
    "database/sql"
    "fmt"
    "log"

    _ "github.com/go-sql-driver/mysql"
)

func main() {

    db, err := sql.Open("mysql", "cbuser:cbpass@tcp(127.0.0.1:3306)/cookbook")

    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
    
    err = db.Ping()

    if err != nil {
        log.Fatal(err)
    }


    fmt.Println("Connected!")
}

To try connect.go, locate it under the api/01_connect directory of the recipes distribution and run it from the command line. The program should print a single line indicating that it connected:

$ go run connect.go
Connected!

The import line tells Go to load the go-sql-driver/mysql package. Then the script validates connection parameters and obtains a connection object by calling sql.Open(). No MySQL connection established yet!

If the sql.Open() method fails, go-sql-driver/mysql returns an error. To handle the error, store it into a variable (in our example err) and use an if block that contains the error-handling code. Recipe 4.2 discusses error handling further.

The db.Ping() call establishes the database connection. Only then can we say that we connected to the MySQL server successfully.

To specify the path to a socket file for local host connections on Unix, omit the tcp parameter in the DSN and provide a unix parameter:

// connect_socket.go : Connect MySQL server using socket
package main

import (
    "database/sql"
    "fmt"
    "log"

    _ "github.com/go-sql-driver/mysql"
)

func main() {
    db, err := sql.Open("mysql","cbuser:cbpass@unix(/tmp/mysql.sock)/cookbook")
    defer db.Close()

    if err != nil {
        log.Fatal(err)
    }


    var user string
    err = db.QueryRow("SELECT USER()").Scan(&user)

    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Connected User:", user, "via MySQL socket")
}

Run this program:

$ go run connect_socket.go
Connected User: cbuser@localhost via MySQL socket

To specify the port number for TCP/IP connections, include the tcp parameter into the DSN and provide an integer-valued port port number:

// connect_tcpport.go : Connect MySQL server using tcp port number
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql",
	"cbuser:cbpass@tcp(127.0.0.1:3306)/cookbook?charset=utf8mb4")

	if err != nil {
		log.Fatal(err)
	}

	var user string
	err2 := db.QueryRow("SELECT USER()").Scan(&user)

	if err2 != nil {
		log.Fatal(err2)
	}

	fmt.Println("Connected User:", user, "via MySQL TCP/IP localhost on port 3306")
}

Run this program:

$  go run connect_tcpport.go
Connected User: cbuser@localhost via MySQL TCP/IP localhost on port 3306

Go accepts a DSN (Data Source Name) in this form:

[username[:password]@][protocol[(address)]]/dbname[?param1=value1&..&paramN=valueN]

Where protocol could be either tcp or unix.

A DSN in its fullest form is as follows:

username:password@protocol(address)/dbname?param=value

Java

Database programs in Java use the JDBC interface, together with a driver for the particular database engine you want to access. That is, the JDBC architecture provides a generic interface used in conjunction with a database-specific driver.

Java programming requires a Java Development Kit (JDK), and you must set your JAVA_HOME environment variable to the location where your JDK is installed. To write MySQL-based Java programs, you’ll also need a MySQL-specific JDBC driver. Programs in this book use MySQL Connector/J. To obtain it if it’s not already installed, see the Preface. For information about obtaining a JDK and setting JAVA_HOME, read cmdline.pdf in the recipes distribution (see the Preface).

The following Java program, Connect.java, connects to the MySQL server, selects cookbook as the default database, and disconnects:

// Connect.java: connect to the MySQL server

import java.sql.*;

public class Connect {

  public static void main (String[] args) {
    Connection conn = null;
    String url = "jdbc:mysql://localhost/cookbook";
    String userName = "cbuser";
    String password = "cbpass";

    try {
      conn = DriverManager.getConnection (url, userName, password);
      System.out.println("Connected");
    } catch (Exception e) {
      System.err.println("Cannot connect to server");
      System.exit (1);
    }
    
    if (conn != null) {
      try {
        conn.close();
        System.out.println("Disconnected");
      } catch (Exception e) { /* ignore close errors */ }
    }
  }
}

To try Connect.java, locate it under the api directory of the recipes distribution, compile it, and execute it. The class statement indicates the program’s name, which in this case is Connect. The name of the file containing the program must match this name and include a .java extension, so the filename for the program is Connect.java. Compile the program using javac:

$ javac Connect.java

If you prefer a different Java compiler, substitute its name for javac.

The Java compiler generates compiled byte code to produce a class file named Connect.class. Use the java program to run the class file (specified without the .class extension). The program should print two lines indicating that it connected and disconnected successfully:

$ java Connect
Connected
Disconnected

You might need to set your CLASSPATH environment variable before the example program will compile and run. The value of CLASSPATH should include at least your current directory (.) and the path to the Connector/J JDBC driver. For background on running Java programs or setting CLASSPATH, read cmdline.pdf in the recipes distribution (see the Preface).

Tip

Starting from Java 11, you can skip the javac call for a single-file program and run it as:

$ java Connect.java
Connected
Disconnected

The import java.sql.* statement references the classes and interfaces that provide access to the data types used to manage different aspects of your interaction with the database server. These are required for all JDBC programs.

To connect to the server, call DriverManager.getConnection() to initiate the connection and obtain a Connection object that maintains information about the state of the connection. Java programs in this book conventionally use conn to signify connection objects.

DriverManager.getConnection() takes three arguments: a URL that describes where to connect and the database to use, the MySQL username, and the password. The URL string has this format:

jdbc:driver://host_name/db_name

This format follows the Java convention that the URL for connecting to a network resource begins with a protocol designator. For JDBC programs, the protocol is jdbc, and you’ll also need a subprotocol designator that specifies the driver name (mysql, for MySQL programs). Many parts of the connection URL are optional, but the leading protocol and subprotocol designators are not. If you omit host_name, the default host value is localhost. To select no default database, omit the database name. However, you should not omit any of the slashes in any case. For example, to connect to the local host without selecting a default database, the URL is the following:

jdbc:mysql:///

In JDBC, you don’t test method calls for return values that indicate an error. Instead, provide handlers to be called when exceptions are thrown. Recipe 4.2 discusses error handling further.

Some JDBC drivers (Connector/J among them) permit you to specify the username and password as parameters at the end of the URL. In this case, omit the second and third arguments of the getConnection() call. Using that URL style, write the code that establishes the connection in the example program like this:

// connect using username and password included in URL
Connection conn = null;
String url = "jdbc:mysql://localhost/cookbook?user=cbuser&password=cbpass";

try
{
  conn = DriverManager.getConnection (url);
  System.out.println ("Connected");
}

The character that separates the user and password parameters should be &, not ;.

Connector/J does not natively support Unix domain socket file connections, so even connections for which the hostname is localhost are made via TCP/IP. To specify an explicit port number, add :port_num to the hostname in the connection URL:

String url = "jdbc:mysql://127.0.0.1:3307/cookbook";

However, you can use third-party libraries that provide support for connections via a socket. See “Connecting Using Unix Domain Sockets” in the Reference Manual for details.

4.2 Checking for Errors

Problem

Something went wrong with your program, and you don’t know what.

Solution

Everyone has problems getting programs to work correctly. But if you don’t anticipate problems by checking for errors, the job becomes much more difficult. Add some error-checking code so your programs can help you figure out what went wrong.

Discussion

After working through Recipe 4.1, you know how to connect to the MySQL server. It’s also a good idea to know how to check for errors and how to retrieve specific error information from the API, so we cover that next. You’re probably anxious to do more interesting things (such as executing statements and getting back the results), but error checking is fundamentally important. Programs sometimes fail, especially during development, and if you can’t determine why failures occur, you’re flying blind. Plan for failure by checking for errors so that you can take appropriate action.

When an error occurs, MySQL provides three values:

  • A MySQL-specific error number

  • A MySQL-specific descriptive text error message

  • A five-character SQLSTATE error code defined according to the ANSI and ODBC standards

This recipe shows how to access this information. The example programs are deliberately designed to fail so that the error-handling code executes. That’s why they attempt to connect using a username and password of baduser and badpass.

Tip

A general debugging aid not specific to any API is to use the available logs. Check the MySQL server’s general query log to see what statements the server is receiving. (This requires that log to be enabled; see Recipe 22.3.) The general query log might show that your program is not constructing the SQL statement string you expect. Similarly, if you run a script under a web server and it fails, check the web server’s error log.

Perl

The DBI module provides two attributes that control what happens when DBI method invocations fail:

  • PrintError, if enabled, causes DBI to print an error message using warn().

  • RaiseError, if enabled, causes DBI to print an error message using die(). This terminates your script.

By default, PrintError is enabled and RaiseError is disabled, so a script continues executing after printing a message if an error occurs. Either or both attributes can be specified in the connect() call. Setting an attribute to 1 or 0 enables or disables it, respectively. To specify either or both attributes, pass them in a hash reference as the fourth argument to the connect() call.

The following code sets only the AutoCommit attribute and uses the default settings for the error-handling attributes. If the connect() call fails, a warning message results, but the script continues to execute:

my $conn_attrs = {AutoCommit => 1};
my $dbh = DBI->connect ($dsn, "baduser", "badpass", $conn_attrs);

Because you really can’t do much if the connection attempt fails, it’s often prudent to exit instead after DBI prints a message:

my $conn_attrs = {AutoCommit => 1};
my $dbh = DBI->connect ($dsn, "baduser", "badpass", $conn_attrs)
            or exit;

To print your own error messages, leave RaiseError disabled and disable PrintError as well. Then test the results of DBI method calls yourself. When a method fails, the $DBI::err, $DBI::errstr, and $DBI::state variables contain the MySQL error number, a descriptive error string, and the SQLSTATE value, respectively:

my $conn_attrs = {PrintError => 0, AutoCommit => 1};
my $dbh = DBI->connect ($dsn, "baduser", "badpass", $conn_attrs)
            or die "Connection error: "
                   . "$DBI::errstr ($DBI::err/$DBI::state)\n";

If no error occurs, $DBI::err is 0, or undef; $DBI::errstr is the empty string, or undef; and $DBI::state is empty, or 00000.

When you check for errors, access these variables immediately after invoking the DBI method that sets them. If you invoke another method before using them, DBI resets their values.

If you print your own messages, the default settings (PrintError enabled, RaiseError disabled) are not so useful. DBI prints a message automatically, then your script prints its own message. This is redundant, as well as confusing to the person using the script.

If you enable RaiseError, you can call DBI methods without checking for return values that indicate errors. If a method fails, DBI prints an error and terminates your script. If the method returns, you can assume it succeeded. This is the easiest approach for script writers: let DBI do all the error checking! However, if both PrintError and RaiseError are enabled, DBI may call warn() and die() in succession, resulting in error messages being printed twice. To avoid this problem, disable PrintError whenever you enable RaiseError:

my $conn_attrs = {PrintError => 0, RaiseError => 1, AutoCommit => 1};
my $dbh = DBI->connect ($dsn, "baduser", "badpass", $conn_attrs);

This book generally uses that approach. If you don’t want the all-or-nothing behavior of enabling RaiseError for automatic error checking versus having to do all your own checking, adopt a mixed approach. Individual handles have PrintError and RaiseError attributes that can be enabled or disabled selectively. For example, you can enable RaiseError globally by turning it on when you call connect(), and then disable it selectively on a per-handle basis.

Suppose that a script reads the username and password from the command-line arguments and then loops while the user enters statements to be executed. In this case, you’d probably want DBI to die and print the error message automatically if the connection fails (you cannot proceed to the statement-execution loop in that case). After connecting, however, you wouldn’t want the script to exit just because the user enters a syntactically invalid statement. Instead, print an error message and loop to get the next statement. The following code shows how to do this. The do() method used in the example executes a statement and returns undef to indicate an error:

my $user_name = shift (@ARGV);
my $password = shift (@ARGV);
my $conn_attrs = {PrintError => 0, RaiseError => 1, AutoCommit => 1};
my $dbh = DBI->connect ($dsn, $user_name, $password, $conn_attrs);
$dbh->{RaiseError} = 0; # disable automatic termination on error
print "Enter statements to execute, one per line; terminate with Control-D\n";
while (<>)              # read and execute queries
{
  $dbh->do ($_) or warn "Statement failed: $DBI::errstr ($DBI::err)\n";
}

If RaiseError is enabled, you can execute code within an eval block to trap errors without terminating your program. If an error occurs, eval returns a message in the $@ variable:

eval
{
  # statements that might fail go here...
};
if ($@)
{
  print "An error occurred: $@\n";
}

This eval technique is commonly used to perform transactions (see Recipe 20.4).

Using RaiseError in combination with eval differs from using RaiseError alone:

  • Errors terminate only the eval block, not the entire script.

  • Any error terminates the eval block, whereas RaiseError applies only to DBI-related errors.

When you use eval with RaiseError enabled, disable PrintError. Otherwise, in some versions of DBI, an error may simply cause warn() to be called without terminating the eval block as you expect.

In addition to using the error-handling attributes PrintError and RaiseError, lots of information about your script’s execution is available using DBI’s tracing mechanism. Invoke the trace() method with an argument indicating the trace level. Levels 1 to 9 enable tracing with increasingly more verbose output, and level 0 disables tracing:

DBI->trace (1); # enable tracing, minimal output
DBI->trace (3); # elevate trace level
DBI->trace (0); # disable tracing

Individual database and statement handles also have trace() methods, so you can localize tracing to a single handle if you want.

Trace output normally goes to your terminal (or, in the case of a web script, to the web server’s error log). To write trace output to a specific file, provide a second argument that indicates the filename:

DBI->trace (1, "/tmp/trace.out");

If the trace file already exists, its contents are not cleared first; trace output is appended to the end. Beware of turning on a file trace while developing a script but forgetting to disable the trace when you put the script into production. You’ll eventually find to your chagrin that the trace file has become quite large. Or worse, a filesystem will fill up, and you’ll have no idea why!

Ruby

Ruby signals errors by raising exceptions, and Ruby programs handle errors by catching exceptions in a rescue clause of a begin block. Ruby Mysql2 methods raise exceptions when they fail and provide error information by means of a Mysql2::Error object. To get the MySQL error number, error message, and SQLSTATE value, access the errno, message, and sql_state methods of this object. The following example shows how to trap exceptions and access error information in a Ruby script:

begin
  client = Mysql2::Client.new(:host => "localhost",
                              :username => "baduser",
                              :password => "badpass",
                              :database => "cookbook")
  puts "Connected"
rescue Mysql2::Error => e
  puts "Cannot connect to server"
  puts "Error code: #{e.errno}"
  puts "Error message: #{e.message}"
  puts "Error SQLSTATE: #{e.sql_state}"
  exit(1)
ensure
  client.close()s
end

PHP

The new PDO() constructor raises an exception if it fails, but other PDO methods by default indicate success or failure by their return value. To cause all PDO methods to raise exceptions for errors, use the database handle resulting from a successful connection attempt to set the error-handling mode. This enables uniform handling of all PDO errors without checking the result of every call. The following example shows how to set the error mode if the connection attempt succeeds and how to handle exceptions if it fails:

try
{
  $dsn = "mysql:host=localhost;dbname=cookbook";
  $dbh = new PDO ($dsn, "baduser", "badpass");
  $dbh->setAttribute (PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  print ("Connected\n");
}
catch (PDOException $e)
{
  print ("Cannot connect to server\n");
  print ("Error code: " . $e->getCode () . "\n");
  print ("Error message: " . $e->getMessage () . "\n");
}

When PDO raises an exception, the resulting PDOException object provides error information. The getCode() method returns the SQLSTATE value. The getMessage() method returns a string containing the SQLSTATE value, MySQL error number, and error message.

Database and statement handles also provide information when an error occurs. For either type of handle, errorCode() returns the SQLSTATE value, and errorInfo() returns a three-element array containing the SQLSTATE value and a driver-specific error code and message. For MySQL, the latter two values are the error number and message string. The following example demonstrates how to get information from the exception object and the database handle:

try
{
  $dbh->query ("SELECT"); # malformed query
}
catch (PDOException $e)
{
  print ("Cannot execute query\n");
  print ("Error information using exception object:\n");
  print ("SQLSTATE value: " . $e->getCode () . "\n");
  print ("Error message: " . $e->getMessage () . "\n");

  print ("Error information using database handle:\n");
  print ("Error code: " . $dbh->errorCode () . "\n");
  $errorInfo = $dbh->errorInfo ();
  print ("SQLSTATE value: " . $errorInfo[0] . "\n");
  print ("Error number: " . $errorInfo[1] . "\n");
  print ("Error message: " . $errorInfo[2] . "\n");
}

Python

Python signals errors by raising exceptions, and Python programs handle errors by catching exceptions in the except clause of a try statement. To obtain MySQL-specific error information, name an exception class, and provide a variable to receive the information. Here’s an example:

conn_params = {
  "database": "cookbook",
  "host": "localhost",
  "user": "baduser",
  "password": "badpass"
}

try:
  conn = mysql.connector.connect(**conn_params)
  print("Connected")
except mysql.connector.Error as e:
  print("Cannot connect to server")
  print("Error code: %s" % e.errno)
  print("Error message: %s" % e.msg)
  print("Error SQLSTATE: %s" % e.sqlstate)

If an exception occurs, the errno, msg, and sqlstate members of the exception object contain the error number, error message, and SQLSTATE values, respectively. Note that access to the Error class is through the driver module name.

Go

Go does not support exceptions. Instead, its multivalue returns make it easy to pass an error when needed. To handle errors in Go, store the returned value of the type Error into a variable (we use the variable name err here) and handle it accordingly. To handle errors, Go offers a defer statement and Panic() and Recover() built-in functions, shown in Table 4-2.

Table 4-2. Error handling in Go
Function or statementMeaning
deferDefers statement execution until the calling function returns.
Panic()The normal execution of the calling function stops, all deferred functions are executed, then the function returns a call to panic up the stack. The process continues. Finally, the program crashes.
Recover()Allows you to regain control in the panicking goroutine, so the program won’t crash and will continue executing. This works only in the deferred functions. If called in the function that is not deferred, it does nothing and returns nil.
// mysql_error.go : MySQL error handling
package main

import (
    "database/sql"
    "log"
    "fmt"

    _ "github.com/go-sql-driver/mysql"
)

var actor string

func main() {

    db, err := sql.Open("mysql", "cbuser:cbpass!@tcp(127.0.0.1:3306)/cookbook")
    defer db.Close()

    if err != nil {
        log.Fatal(err)
    }

    err = db.QueryRow("SELECT actor FROM actors where actor='Dwayne Johnson'").
          Scan(&actor)
    if err != nil {
	    if err == sql.ErrNoRows {
		    fmt.Print("There were no rows, but otherwise no error occurred")
	    } else {
		    fmt.Println(err.Error())
	    }
    }
    fmt.Println(actor)
}

If an error occurs, the function returns an object of the type error. Its Error() function returns a MySQL error code and message for the errors, raised by the Go-MySQL-Driver.

There is an exceptional case for the QueryRow() function with the subsequent Scan() call. By default, Scan() returns nil if there is no error and error if there is an error. However, if the query ran successfully but returned no rows, this function returns sql.ErrNoRows.

Java

Java programs handle errors by catching exceptions. To do the minimum amount of work, print a stack trace to inform the user where the problem lies:

try
{
  /* ... some database operation ... */
}
catch (Exception e)
{
  e.printStackTrace ();
}

The stack trace shows the location of the problem but not necessarily what the problem was. Also, it may not be meaningful except to you, the program’s developer. To be more specific, print the error message and code associated with an exception:

  • All Exception objects support the getMessage() method. JDBC methods may throw exceptions using SQLException objects; these are like Exception objects but also support getErrorCode() and getSQLState() methods. getErrorCode() and getMessage() return the MySQL-specific error number and message string, and getSQLState() returns a string containing the SQLSTATE value.

  • Some methods generate SQLWarning objects to provide information about nonfatal warnings. SQLWarning is a subclass of SQLException, but warnings are accumulated in a list rather than thrown immediately. They don’t interrupt your program, and you can print them at your leisure.

The following example program, Error.java, demonstrates how to access error messages by printing all the error information available to it. It attempts to connect to the MySQL server and prints exception information if the attempt fails. Then it executes a statement and prints exception and warning information if the statement fails:

// Error.java: demonstrate MySQL error handling

import java.sql.*;

public class Error {
  public static void main(String[] args) {
    Connection conn = null;
    String url = "jdbc:mysql://localhost/cookbook";
    String userName = "baduser";
    String password = "badpass";

    try {
      conn = DriverManager.getConnection(url, userName, password);
      System.out.println("Connected");
      tryQuery(conn);    // issue a query
    } catch (Exception e) {
      System.err.println("Cannot connect to server");
      System.err.println(e);
      if (e instanceof SQLException)  // JDBC-specific exception?
      {
        // e must be cast from Exception to SQLException to
        // access the SQLException-specific methods
        printException((SQLException) e);
      }
    } finally {
      if (conn != null) {
        try {
          conn.close ();
          System.out.println("Disconnected");
        } catch (SQLException e) {
          printException (e);
        }
      }
    }
  }

  public static void tryQuery(Connection conn) {
    try {
      // issue a simple query
      Statement s = conn.createStatement();
      s.execute("USE cookbook");
      s.close();

      // print any accumulated warnings
      SQLWarning w = conn.getWarnings();
      while (w != null) {
        System.err.println("SQLWarning: " + w.getMessage());
        System.err.println("SQLState: " + w.getSQLState());
        System.err.println("Vendor code: " + w.getErrorCode());
        w = w.getNextWarning();
      }
    } catch (SQLException e) {
      printException(e);
    }
  }

  public static void printException(SQLException e) {
    // print general message, plus any database-specific message
    System.err.println("SQLException: " + e.getMessage ());
    System.err.println("SQLState: " + e.getSQLState ());
    System.err.println("Vendor code: " + e.getErrorCode ());
  }
}

4.3 Writing Library Files

Problem

You notice that you’re repeating code to perform common operations in multiple programs.

Solution

Write routines to perform those operations, put them in a library file, and arrange for your programs to access the library. This enables you to write the code only once. You might need to set an environment variable so that your scripts can find the library.

Discussion

This section describes how to put code for common operations in library files. Encapsulation (or modularization) isn’t really a recipe so much as a programming technique. Its principal benefit is that you need not repeat code in each program you write. Instead, simply call a routine that’s in the library. For example, by putting the code for connecting to the cookbook database into a library routine, you need not write out all the parameters associated with making that connection. Simply invoke the routine from your program, and you’re connected.

Connection establishment isn’t the only operation you can encapsulate, of course. Later sections in this book develop other utility functions to be placed in library files. All such files, including those shown in this section, are located under the lib directory of the recipes distribution. As you write your own programs, be on the lookout for operations that you perform often and that are good candidates for inclusion in a library. Use the techniques in this section to write your own library files.

Library files have other benefits besides making it easier to write programs, such as promoting portability. If you write connection parameters directly into each program that connects to the MySQL server, you must change all those programs if you move them to another machine that uses different parameters. If instead you write your programs to connect to the database by calling a library routine, it’s necessary only to modify the affected library routine, not all the programs that use it.

Code encapsulation can also improve security. If you make a private library file readable only to yourself, only scripts run by you can execute routines in the file. Or suppose that you have some scripts located in your web server’s document tree. A properly configured server executes the scripts and sends their output to remote clients. But if the server becomes misconfigured somehow, the result can be that it sends your scripts to clients as plain text, thus displaying your MySQL username and password. If you place the code for establishing a connection to the MySQL server in a library file located outside the document tree, those parameters won’t be exposed to clients.

Warning

Be aware that if you install a library file to be readable by your web server, you don’t have much security if other developers use the same server. Any of those developers can write a web script to read and display your library file because, by default, the script runs with the permissions of the web server and thus will have access to the library.

The recipes that follow demonstrate how to write, for each API, a library file that contains a routine for connecting to the cookbook database on the MySQL server. The calling program can use the error-checking techniques discussed in Recipe 4.2 to determine whether a connection attempt fails. The connection routine for each language returns a database handle or connection object when it succeeds or raises an exception if the connection cannot be established.

Libraries are of no utility in themselves, so the following discussion illustrates each one’s use by a short test harness program. To use any of these harness programs as the basis for creating new programs, make a copy of the file and add your own code between the connect and disconnect calls.

Library-file writing involves not only the question of what to put in the file but also subsidiary issues such as where to install the file so it is accessible by your programs, and (on multiuser systems such as Unix) how to set its access privileges so its contents aren’t exposed to people who shouldn’t see it.

Choosing a library-file installation location

If you install a library file in a directory that a language processor searches by default, programs written in that language need do nothing special to access the library. However, if you install a library file in a directory that the language processor does not search by default, you must tell your scripts how to find it. There are two common ways to do this:

  • Most languages provide a statement that can be used within a script to add directories to the language processor search path. This requires that you modify each script that needs the library.

  • You can set an environment or configuration variable that changes the language processor search path. With this approach, each user who executes scripts that require the library must set the appropriate variable. Alternatively, if the language processor has a configuration file, you might be able to set a parameter in the file that affects scripts globally for all users.

We’ll use the second approach. For our API languages, Table 4-3 shows the relevant variables. In each case, the variable value is a directory or list of directories.

Table 4-3. Default library paths
LanguageVariable nameVariable type
PerlPERL5LIB Environment variable
RubyRUBYLIB Environment variable
PHPinclude_path Configuration variable
PythonPYTHONPATH Environment variable
GoGOPATH Environment variable
JavaCLASSPATH Environment variable

For general information on setting environment variables, read cmdline.pdf in the recipes distribution (see the Preface). You can use those instructions to set environment variables to the values in the following discussion.

Suppose that you want to install library files in a directory that language processors do not search by default. For purposes of illustration, let’s use /usr/local/lib/mcb on Unix and C:\lib\mcb on Windows. (To put the files somewhere else, adjust the pathnames in the variable settings accordingly. For example, you might want to use a different directory, or you might want to put libraries for each language in separate directories.)

Under Unix, if you put Perl library files in the /usr/local/lib/mcb directory, set the PERL5LIB environment variable appropriately. For a shell in the Bourne shell family (sh, bash, ksh), set the variable like this in the appropriate startup file:

export PERL5LIB=/usr/local/lib/mcb
Note

For the original Bourne shell, sh, you may need to split this into two commands:

PERL5LIB=/usr/local/lib/mcb
export PERL5LIB

For a shell in the C shell family (csh, tcsh), set PERL5LIB like this in your .login file:

setenv PERL5LIB /usr/local/lib/mcb

Under Windows, if you put Perl library files in C:\lib\mcb, set PERL5LIB as follows:

PERL5LIB=C:\lib\mcb

In each case, the variable value tells Perl to look in the specified directory for library files, in addition to any other directories it searches by default. If you set PERL5LIB to name multiple directories, the separator character between directory pathnames is a colon (:) in Unix or a semicolon (;) in Windows.

Specify the other environment variables (RUBYLIB, PYTHONPATH, and CLASSPATH) using the same syntax.

Note

Setting these environment variables as just discussed should suffice for scripts that you run from the command line. For scripts intended to be executed by a web server, you likely must configure the server as well so that it can find the library files.

For PHP, the search path is defined by the value of the include_path variable in the php.ini PHP initialization file. On Unix, the file’s pathname is likely /usr/lib/php.ini or /usr/local/lib/php.ini. Under Windows, the file is likely found in the Windows directory or under the main PHP installation directory. To determine the location, run this command:

$ php --ini

Define the value of include_path in php.ini with a line like this:

include_path = "value"

Specify value using the same syntax as for environment variables that name directories. That is, it’s a list of directory names, with the names separated by colons in Unix or semicolons in Windows. In Unix, if you want PHP to look for included files in the current directory and in /usr/local/lib/mcb, set include_path like this:

include_path = ".:/usr/local/lib/mcb"

In Windows, to search the current directory and C:\lib\mcb, set include_path like this:

include_path = ".;C:\lib\mcb"

If PHP is running as an Apache module, restart Apache to make php.ini changes take effect.

Setting library-file access privileges

If you use a multiple-user system such as Unix, you must make decisions about library-file ownership and access mode:

  • If a library file is private and contains code to be used only by you, place the file under your own account and make it accessible only to you. Assuming that a library file named mylib is already owned by you, you can make it private like this:

    $ chmod 600 mylib
  • If the library file is to be used only by your web server, install it in a server library directory and make it owned by and accessible only to the server user ID. You may need to be root to do this. For example, if the web server runs as wwwusr, the following commands make the file private to that user:

    # chown wwwusr mylib
    # chmod 600 mylib
  • If the library file is public, you can place it in a location that your programming language searches automatically when it looks for libraries. (Most language processors search for libraries in some default set of directories, although this set can be influenced by setting environment variables as described previously.) You may need to be root to install files in one of these directories. Then you can make the file world readable:

    # chmod 444 mylib

Now let’s construct a library for each API. Each section here demonstrates how to write the library file itself and discusses how to use the library from within programs.

Perl

In Perl, library files are called modules and typically have an extension of .pm (Perl module). It’s conventional for the basename of a module file to be the same as the identifier on the package line in the file. The following file, Cookbook.pm, implements a module named Cookbook:

package Cookbook;
# Cookbook.pm: library file with utility method for connecting to MySQL
# using the Perl DBI module

use strict;
use warnings;
use DBI;

my $db_name = "cookbook";
my $host_name = "localhost";
my $user_name = "cbuser";
my $password = "cbpass";
my $port_num = undef;
my $socket_file = undef;

# Establish a connection to the cookbook database, returning a database
# handle.  Raise an exception if the connection cannot be established.

sub connect
{
my $dsn = "DBI:mysql:host=$host_name";
my $conn_attrs = {PrintError => 0, RaiseError => 1, AutoCommit => 1};

  $dsn .= ";database=$db_name" if defined ($db_name);
  $dsn .= ";mysql_socket=$socket_file" if defined ($socket_file);
  $dsn .= ";port=$port_num" if defined ($port_num);

  return DBI->connect ($dsn, $user_name, $password, $conn_attrs);
}

1;  # return true

The module encapsulates the code for establishing a connection to the MySQL server into a connect() method, and the package identifier establishes a Cookbook namespace for the module. To invoke the connect() method, use the module name:

$dbh = Cookbook::connect ();

The final line of the module file is a statement that trivially evaluates to true. (If the module doesn’t return a true value, Perl assumes that something is wrong with it and exits.)

Perl locates library files by searching the list of directories named in its @INC array. To check the default value of this variable on your system, invoke Perl as follows at the command line:

$ perl -V

The last part of the output from the command shows the directories listed in @INC. If you install a library file in one of those directories, your scripts will find it automatically. If you install the module somewhere else, tell your scripts where to find it by setting the PERL5LIB environment variable, as discussed in the introductory part of this recipe.

After installing the Cookbook.pm module, try it from a test harness script, harness.pl:

#!/usr/bin/perl
# harness.pl: test harness for Cookbook.pm library

use strict;
use warnings;
use Cookbook;

my $dbh;
eval
{
  $dbh = Cookbook::connect ();
  print "Connected\n";
};
die "$@" if $@;
$dbh->disconnect ();
print "Disconnected\n";

harness.pl has no use DBI statement. It’s unnecessary because the Cookbook module itself imports DBI; any script that uses Cookbook also gains access to DBI.

If you don’t catch connection errors explicitly with eval, you can write the script body more simply:

my $dbh = Cookbook::connect ();
print "Connected\n";
$dbh->disconnect ();
print "Disconnected\n";

In this case, Perl catches any connection exception and terminates the script after printing the error message generated by the connect() method.

Ruby

The following Ruby library file, Cookbook.rb, defines a Cookbook class that implements a connect class method:

# Cookbook.rb: library file with utility method for connecting to MySQL
# using the Ruby Mysql2 module

require "mysql2"

# Establish a connection to the cookbook database, returning a database
# handle.  Raise an exception if the connection cannot be established.

class Cookbook
  @@host_name = "localhost"
  @@db_name = "cookbook"
  @@user_name = "cbuser"
  @@password = "cbpass"

  # Class method for connecting to server to access the
  # cookbook database; returns a database handle object.

  def Cookbook.connect
    return Mysql2::Client.new(:host => @@host_name,
                              :database => @@db_name,
                              :username => @@user_name, 
                              :password => @@password)
  end
end

The connect method is defined in the library as Cookbook.connect because Ruby class methods are defined as class_name.method_name.

Ruby locates library files by searching the list of directories named in its $LOAD_PATH variable (also known as $:), which is an array. To check the default value of this variable on your system, use interactive Ruby to execute this statement:

$ irb
>> puts $LOAD_PATH

If you install a library file in one of those directories, your scripts will find it automatically. If you install the file somewhere else, tell your scripts where to find it by setting the RUBYLIB environment variable, as discussed in the introductory part of this recipe.

After installing the Cookbook.rb library file, try it from a test harness script, harness.rb:

#!/usr/bin/ruby -w
# harness.rb: test harness for Cookbook.rb library

require "Cookbook"

begin
  client = Cookbook.connect
  print "Connected\n"
rescue Mysql2::Error => e
  puts "Cannot connect to server"
  puts "Error code: #{e.errno}"
  puts "Error message: #{e.message}"
  exit(1)
ensure
  client.close()
  print "Disconnected\n"
end

harness.rb has no require statement for the Mysql2 module. It’s unnecessary because the Cookbook module itself imports Mysql2; any script that imports Cookbook also gains access to Mysql2.

If you want a script to die if an error occurs without checking for an exception yourself, write the script body like this:

client = Cookbook.connect
print "Connected\n"
client.close
print "Disconnected\n"

PHP

PHP library files are written like regular PHP scripts. A Cookbook.php file that implements a Cookbook class with a connect() method looks like this:

<?php
# Cookbook.php: library file with utility method for connecting to MySQL
# using the PDO module

class Cookbook
{
  public static $host_name = "localhost";
  public static $db_name = "cookbook";
  public static $user_name = "cbuser";
  public static $password = "cbpass";

  # Establish a connection to the cookbook database, returning a database
  # handle.  Raise an exception if the connection cannot be established.
  # In addition, cause exceptions to be raised for errors.

  public static function connect ()
  {
    $dsn = "mysql:host=" . self::$host_name . ";dbname=" . self::$db_name;
    $dbh = new PDO ($dsn, self::$user_name, self::$password);
    $dbh->setAttribute (PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return ($dbh);
  }

} # end Cookbook
?>

The connect() routine within the class is declared using the static keyword to make it a class method rather than an instance method. This designates it as directly callable without instantiating an object through which to invoke it.

The new PDO() constructor raises an exception if the connection attempt fails. Following a successful attempt, connect() sets the error-handling mode so that other PDO calls raise exceptions for failure as well. This way, individual calls need not be tested for an error return value.

Although most PHP examples throughout this book don’t show the <?php and ?> tags, we’ve shown them as part of Cookbook.php here to emphasize that library files must enclose all PHP code within those tags. The PHP interpreter makes no assumptions about the contents of a library file when it begins parsing it because you might include a file that contains nothing but HTML. Therefore, you must use <?php and ?> to specify explicitly which parts of the library file should be considered as PHP code rather than as HTML, just as you do in the main script.

PHP looks for libraries by searching the directories named in the include_path variable in the PHP initialization file, as described in the introductory part of this recipe.

Note

PHP scripts often are placed in the document tree of your web server, and clients can request them directly. For PHP library files, we recommend that you place them somewhere outside the document tree, especially if (like Cookbook.php) they contain a username and password.

After installing Cookbook.php in one of the include_path directories, try it from a test harness script, harness.php:

<?php
# harness.php: test harness for Cookbook.php library

require_once "Cookbook.php";

try
{
  $dbh = Cookbook::connect ();
  print ("Connected\n");
}
catch (PDOException $e)
{
  print ("Cannot connect to server\n");
  print ("Error code: " . $e->getCode () . "\n");
  print ("Error message: " . $e->getMessage () . "\n");
  exit (1);
}
$dbh = NULL;
print ("Disconnected\n");
?>

The require_once statement accesses the Cookbook.php file that is required to use the Cookbook class. require_once is one of several PHP file-inclusion statements:

  • require and include instruct PHP to read the named file. They are similar, but require terminates the script if the file cannot be found; include produces only a warning.

  • require_once and include_once are like require and include except that if the file has already been read, its contents are not processed again. This is useful for avoiding multiple-declaration problems that can easily occur when library files include other library files.

Python

Python libraries are written as modules and referenced from scripts using import statements. To create a method for connecting to MySQL, write a module file, cookbook.py (Python module names should be lowercase):

# cookbook.py: library file with utility method for connecting to MySQL
# using the Connector/Python module

import mysql.connector

conn_params = {
  "database": "cookbook",
  "host": "localhost",
  "user": "cbuser",
  "password": "cbpass",
}

# Establish a connection to the cookbook database, returning a connection
# object.  Raise an exception if the connection cannot be established.

def connect():
  return mysql.connector.connect(**conn_params)

The filename basename determines the module name, so the module is called cookbook. Module methods are accessed through the module name; thus, import the cookbook module and invoke its connect() method like this:

import cookbook

conn = cookbook.connect();

The Python interpreter searches for modules in directories named in the sys.path variable. To check the default value of sys.path on your system, run Python interactively and enter a few commands:

$ python
>>> import sys
>>> sys.path

If you install cookbook.py in one of the directories named by sys.path, your scripts will find it with no special handling. If you install cookbook.py somewhere else, you must set the PYTHONPATH environment variable, as discussed in the introductory part of this recipe.

After installing the cookbook.py library file, try it from a test harness script, harness.py:

#!/usr/bin/python
# harness.py: test harness for cookbook.py library

import mysql.connector
import cookbook

try:
  conn = cookbook.connect()
  print("Connected")
except mysql.connector.Error as e:
  print("Cannot connect to server")
  print("Error code: %s" % e.errno)
  print("Error message: %s" % e.msg)
else:
  conn.close()
  print("Disconnected")

The cookbook.py file imports the mysql.connector module, but a script that imports cookbook does not thereby gain access to mysql.connector. If the script needs Connector/Python-specific information (such as mysql.connector.Error), the script itself must import mysql.connector.

If you want a script to die if an error occurs without checking for an exception yourself, write the script body like this:

conn = cookbook.connect()
print("Connected")
conn.close()
print("Disconnected")

Go

Go programs are organized into packages that are a collection of the source files, located in the same directory. Packages, in their turn, are organized into modules that are collections of Go packages that are released together. Modules belong to a Go repository. A typical Go repository contains only one module, but you may have several modules in the same repository.

The Go interpreter searches for packages in directories named in the $GOPATH/src/{domain}/{project} variable. However, when using modules, Go no longer uses GOPATH. You do not need to change this variable no matter where your module is installed. We’ll use modules for our examples.

To create a method for connecting to MySQL, write a package file, cookbook.go:

package cookbook

import (
  "database/sql"
  _"github.com/go-sql-driver/mysql"
)

func Connect() (*sql.DB, error) {
  db, err := sql.Open("mysql","cbuser:cbpass@tcp(127.0.0.1:3306)/cookbook")

  if err != nil {
    panic(err.Error())
  }

  err = db.Ping()

  return db, err
}

The filename basename does not determine the package name: Go searches through all files in the import path until it finds the one with the required package declaration. Package methods are accessed via the package name.

To test the package, you can specify a relative path to the directory where the package file is located:

import "../../lib"

This is a very easy way to quickly test your libraries, but such commands, like go install, won’t work for packages imported this way. As a result, your program will be rebuilt from scratch each time you access it.

A better way to work with packages is to publish them as parts of modules. To do this, run the following in the directory where you store cookbook.go:

go mod init cookbook

This will create a file, go.mod, that will have your module name and version of Go. You can name the module as you wish.

You can publish your module on the internet and access it from the local program as you would do with any other module. However, during development, it would be useful to have the module only locally. In this case, you need to make few adjustments in the program directory that will use it.

First, create a program that will call the package, harness.go:

package main

import (
  "fmt"
  "github.com/svetasmirnova/mysqlcookbook/recipes/lib"
)

func main() {
  db, err := cookbook.Connect()

  if err != nil {
    fmt.Println("Cannot connect to server")
    fmt.Printf("Error message: %s\n", err.Error())
  } else {
    fmt.Println("Connected")
  }
  defer db.Close()
}

Then, in the directory, after the package is installed, initialize the module:

go mod init harness

Once the module is initialized and go.mod is created, edit it with the following:

go mod edit -replace ↩
github.com/svetasmirnova/mysqlcookbook/recipes/lib=↩
/home/sveta/src/mysqlcookbook/recipes/lib

Replace the URL and the local path with the ones that are valid in your environment.

This command will tell Go to replace the remote module path with the local directory.

Once done, you can test your connection:

$ go run harness.go
Connected

Java

Java library files are similar to Java programs in most ways:

  • The class line in the source file indicates a class name.

  • The file should have the same name as the class (with a .java extension).

  • Compile the .java file to produce a .class file.

Java library files also differ from Java programs in some ways:

  • Unlike regular program files, Java library files have no main() function.

  • A library file should begin with a package identifier that specifies the position of the class within the Java namespace.

A common convention for Java package identifiers is to use the domain of the code author as a prefix; this helps make identifiers unique and avoids conflict with classes written by other authors. Domain names proceed right to left, from more general to more specific within the domain namespace, whereas the Java class namespace proceeds left to right, from general to specific. Thus, to use a domain as the prefix for a package name within the Java class namespace, it’s necessary to reverse it. For example, Paul’s domain is kitebird.com, so if he writes a library file and places it under mcb within his domain’s namespace, the library begins with a package statement like this:

package com.kitebird.mcb;

Java packages developed for this book are placed within the com.kitebird.mcb namespace to ensure their uniqueness in the package namespace.

The following library file, Cookbook.java, defines a Cookbook class that implements a connect() method for connecting to the cookbook database. connect() returns a Connection object if it succeeds and throws an exception otherwise. To help the caller deal with failures, the Cookbook class also defines getErrorMessage() and printErrorMessage() utility methods that return the error message as a string and print it to System.err, respectively:

// Cookbook.java: library file with utility methods for connecting to MySQL
// using MySQL Connector/J and for handling exceptions

package com.kitebird.mcb;

import java.sql.*;

public class Cookbook {
  // Establish a connection to the cookbook database, returning
  // a connection object.  Throw an exception if the connection
  // cannot be established.

  public static Connection connect() throws Exception {
    String url = "jdbc:mysql://localhost/cookbook";
    String user = "cbuser";
    String password = "cbpass";

    return (DriverManager.getConnection(url, user, password));
  }

  // Return an error message as a string

  public static String getErrorMessage(Exception e) {
    StringBuffer s = new StringBuffer ();
    if (e instanceof SQLException) { // JDBC-specific exception?
      // print general message, plus any database-specific message
      s.append("Error message: " + e.getMessage () + "\n");
      s.append("Error code: " + ((SQLException) e).getErrorCode() + "\n");
    } else {
      s.append (e + "\n");
    }
    return (s.toString());
  }

  // Get the error message and print it to System.err

  public static void printErrorMessage(Exception e) {
    System.err.println(Cookbook.getErrorMessage(e));
  }
}

The routines within the class are declared using the static keyword, which makes them class methods rather than instance methods. That is done here because the class is used directly rather than creating an object from it and invoking the methods through the object.

To use the Cookbook.java file, compile it to produce Cookbook.class, then install the class file in a directory that corresponds to the package identifier.

This means that Cookbook.class should be installed in a directory named com/kitebird/mcb (Unix) or com\kitebird\mcb (Windows) that is located under some directory named in your CLASSPATH setting. For example, if CLASSPATH includes /usr/local/lib/mcb under Unix, you can install Cookbook.class in the /usr/local/lib/mcb/com/kitebird/mcb directory. (For more information about the CLASSPATH variable, see the Java discussion in Recipe 4.1.)

To use the Cookbook class from within a Java program, import it and invoke the Cookbook.connect() method. The following test harness program, Harness.java, shows how to do this:

// Harness.java: test harness for Cookbook library class

import java.sql.*;
import com.kitebird.mcb.Cookbook;

public class Harness {
  public static void main(String[] args) {
    Connection conn = null;
    try {
      conn = Cookbook.connect ();
      System.out.println("Connected");
    } catch (Exception e) {
      Cookbook.printErrorMessage (e);
      System.exit (1);
    } finally  {
      if (conn != null) {
        try {
          conn.close();
          System.out.println("Disconnected");
        } catch (Exception e) {
          String err = Cookbook.getErrorMessage(e);
          System.out.println(err);
        }
      }
    }
  }
}

Harness.java also shows how to use the error message utility methods from the Cookbook class when a MySQL-related exception occurs:

  • printErrorMessage() takes the exception object and uses it to print an error message to System.err.

  • getErrorMessage() returns the error message as a string. You can display the message yourself, write it to a logfile, or whatever.

4.4 Executing Statements and Retrieving Results

Problem

You want a program to send a SQL statement to the MySQL server and retrieve its result.

Solution

Some statements return only a status code; others return a result set (a set of rows). Some APIs provide different methods for executing each type of statement. If so, use the appropriate method for the statement to be executed.

Discussion

You can execute two general categories of SQL statements. Some retrieve information from the database; others change that information or the database itself. Statements in the two categories are handled differently. In addition, some APIs provide multiple routines for executing statements, complicating matters further. Before we get to examples demonstrating how to execute statements from within each API, we’ll describe the database table the examples use, and then further discuss the two statement categories and outline a general strategy for processing statements in each category.

In Chapter 1, we created a table named limbs to try some sample statements. In this chapter, we’ll use a different table named profile. It’s based on the idea of a buddy list, that is, the set of people we like to keep in touch with while we’re online. The table definition looks like this:

CREATE TABLE profile
(
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name  VARCHAR(20) NOT NULL,
  birth DATE,
  color ENUM('blue','red','green','brown','black','white'),
  foods SET('lutefisk','burrito','curry','eggroll','fadge','pizza'),
  cats  INT,
  PRIMARY KEY (id)
);

The profile table indicates the things that are important to us about each buddy: name, age, favorite color, favorite foods, and number of cats. Additionally, the table uses several different data types for its columns, and these come in handy to illustrate how to solve problems that pertain to specific data types.

The table also includes an id column containing unique values so that we can distinguish one row from another, even if two buddies have the same name. id and name are declared as NOT NULL because they’re each required to have a value. The other columns are implicitly permitted to be NULL (and that is also their default value) because we might not know the value to assign them for any given individual. That is, NULL signifies unknown.

Notice that although we want to keep track of age, there is no age column in the table. Instead, there is a birth column of DATE type. Ages change, so if we store age values, we’d have to keep updating them. Storing birth dates is better: they don’t change and can be used to calculate age any time (see Recipe 8.14). color is an ENUM column; color values can be any one of the listed values. foods is a SET, which permits the value to be any combination of the individual set members. That way we can record multiple favorite foods for any buddy.

To create the table, use the profile.sql script in the tables directory of the recipes distribution. Change location into that directory, then run this command:

$ mysql cookbook < profile.sql

The script also loads sample data into the table. You can experiment with the table, then restore it if you change its contents by running the script again. (See Recipe 4.9 on the importance of restoring the profile table after modifying it.)

The contents of the profile table as loaded by the profile.sql script look like this:

mysql> SELECT * FROM profile;
+----+---------+------------+-------+-----------------------+------+
| id | name    | birth      | color | foods                 | cats |
+----+---------+------------+-------+-----------------------+------+
|  1 | Sybil   | 1970-04-13 | black | lutefisk,fadge,pizza  |    0 |
|  2 | Nancy   | 1969-09-30 | white | burrito,curry,eggroll |    3 |
|  3 | Ralph   | 1973-11-02 | red   | eggroll,pizza         |    4 |
|  4 | Lothair | 1963-07-04 | blue  | burrito,curry         |    5 |
|  5 | Henry   | 1965-02-14 | red   | curry,fadge           |    1 |
|  6 | Aaron   | 1968-09-17 | green | lutefisk,fadge        |    1 |
|  7 | Joanna  | 1952-08-20 | green | lutefisk,fadge        |    0 |
|  8 | Stephen | 1960-05-01 | white | burrito,pizza         |    0 |
+----+---------+------------+-------+-----------------------+------+

Although most of the columns in the profile table permit NULL values, none of the rows in the sample dataset actually contain NULL yet. (We want to defer the complications of NULL value processing to Recipes 4.5 and 4.7.)

SQL statement categories

SQL statements can be grouped into two broad categories, depending on whether they return a result set (a set of rows):

INSERT, DELETE, or UPDATE

Statements that return no result set, such as INSERT, DELETE, or UPDATE. As a general rule, statements of this type generally change the database in some way. There are some exceptions, such as USE db_name, which changes the default (current) database for your session without making any changes to the database itself. The example data-modifying statement used in this section is an UPDATE:

UPDATE profile SET cats = cats+1 WHERE name = 'Sybil';

We’ll cover how to execute this statement and determine the number of rows that it affects.

SELECT, SHOW, EXPLAIN, or DESCRIBE

Statements that return a result set, such as SELECT, SHOW, EXPLAIN, or DESCRIBE. We refer to such statements generically as SELECT statements, but you should understand that category to include any statement that returns rows. The example row-retrieval statement used in this section is a SELECT:

SELECT id, name, cats FROM profile;

We’ll cover how to execute this statement, fetch the rows in the result set, and determine the number of rows and columns in the result set. (To get information such as the column names or data types, access the result set metadata. That’s Recipe 12.2.)

The first step in processing a SQL statement is to send it to the MySQL server for execution. Some APIs (those for Perl and Java, for example) recognize a distinction between the two categories of statements and provide separate calls for executing them. Other APIs (such as the one for Python or Ruby) use a single call for all statements. However, one thing all APIs have in common is that no special character indicates the end of the statement. No terminator is necessary because the end of the statement string terminates it. This differs from executing statements in the mysql program, where you terminate statements using a semicolon (;) or \g. (It also differs from how this book usually includes semicolons in examples to make it clear where statements end.)

When you send a statement to the server, be prepared to handle errors if it did not execute successfully. If a statement fails and you proceed on the basis that it succeeded, your program won’t work. For the most part, this section does not show error-checking code, but that is for brevity. Production code should always include error handling. The sample scripts in the recipes distribution from which the examples are taken do include error handling, based on the techniques illustrated in Recipe 4.2.

If a statement does execute without error, your next step depends on the statement type. If it’s one that returns no result set, there’s nothing else to do, unless you want to check how many rows were affected. If the statement does return a result set, fetch its rows, then close the result set. In a context where you don’t know whether a statement returns a result set, Recipe 12.2 discusses how to tell.

Perl

The Perl DBI module provides two basic approaches to SQL statement execution, depending on whether you expect to get back a result set. For a statement such as INSERT or UPDATE that returns no result set, use the database handle do() method. It executes the statement and returns the number of rows affected by it, or undef if an error occurs. If Sybil gets a new cat, the following statement increments her cats count by one:

my $count = $dbh->do ("UPDATE profile SET cats = cats+1
                       WHERE name = 'Sybil'");
if ($count)   # print row count if no error occurred
{
  $count += 0;
  print "Number of rows updated: $count\n";
}

If the statement executes successfully but affects no rows, do() returns a special value, "0E0" (the value zero in scientific notation, expressed as a string). "0E0" can be used for testing the execution status of a statement because it is true in Boolean contexts (unlike undef). For successful statements, it can also be used when counting how many rows were affected because it is treated as the number zero in numeric contexts. Of course, if you print that value as is, you’ll print "0E0", which might look odd to people who use your program. The preceding example makes sure that doesn’t happen by adding zero to the value to coerce it to numeric form so that it displays as 0. Alternatively, use printf with a %d format specifier to cause an implicit numeric conversion:

if ($count)   # print row count if no error occurred
{
  printf "Number of rows updated: %d\n", $count;
}

If RaiseError is enabled, your script terminates automatically for DBI-related errors, so you need not check $count to find out whether do() failed and consequently can simplify the code:

my $count = $dbh->do ("UPDATE profile SET cats = cats+1
                       WHERE name = 'Sybil'");
printf "Number of rows updated: %d\n", $count;

To process a statement such as SELECT that does return a result set, use a different approach that involves these steps:

  1. Specify the statement to be executed by calling prepare() using the database handle. prepare() returns a statement handle to use with all subsequent operations on the statement. (If an error occurs, the script terminates if RaiseError is enabled; otherwise, prepare() returns undef.)

  2. Call execute() to execute the statement and generate the result set.

  3. Loop to fetch the rows returned by the statement. DBI provides several methods for this; we cover them shortly.

  4. If you don’t fetch the entire result set, release resources associated with it by calling finish().

The following example illustrates these steps, using fetchrow_array() as the row-fetching method and assuming that RaiseError is enabled so that errors terminate the script:

my $sth = $dbh->prepare ("SELECT id, name, cats FROM profile");
$sth->execute ();
my $count = 0;
while (my @val = $sth->fetchrow_array ())
{
  print "id: $val[0], name: $val[1], cats: $val[2]\n";
  ++$count;
}
$sth->finish ();
print "Number of rows returned: $count\n";

The row array size indicates the number of columns in the result set.

The row-fetching loop just shown is followed by a call to finish(), which closes the result set and tells the server to free any resources associated with it. If you fetch every row in the set, DBI notices when you reach the end and releases the resources for you. Thus, the example could omit the finish() call without ill effect.

As the example illustrates, to determine how many rows a result set contains, count them while fetching them. Do not use the DBI rows() method for this purpose. The DBI documentation discourages this practice because rows() is not necessarily reliable for SELECT statements—due to differences in behavior among database engines and drivers.

DBI has several methods that fetch a row at a time. The one used in the preceding example, fetchrow_array(), returns an array containing the next row, or an empty list when there are no more rows. Array elements are present in the order named in the SELECT statement. Access them as $val[0], $val[1], and so forth.

The fetchrow_array() method is most useful for statements that explicitly name the columns to select. (With SELECT *, there are no guarantees about the positions of columns within the array.)

fetchrow_arrayref() is like fetchrow_array(), except that it returns a reference to the array, or undef when there are no more rows. As with fetchrow_array(), array elements are present in the order named in the statement. Access them as $ref->[0], $ref->[1], and so forth:

while (my $ref = $sth->fetchrow_arrayref ())
{
  print "id: $ref->[0], name: $ref->[1], cats: $ref->[2]\n";
}

fetchrow_hashref() returns a reference to a hash structure, or undef when there are no more rows:

while (my $ref = $sth->fetchrow_hashref ())
{
  print "id: $ref->{id}, name: $ref->{name}, cats: $ref->{cats}\n";
}

To access the elements of the hash, use the names of the columns selected by the statement ($ref->{id}, $ref->{name}, and so forth). fetchrow_hashref() is particularly useful for SELECT * statements because you can access elements of rows without knowing anything about the order in which columns are returned. You need know only their names. On the other hand, it’s more expensive to set up a hash than an array, so fetchrow_hashref() is slower than fetchrow_array() or fetchrow_arrayref(). It’s also possible to lose row elements if they have the same name because column names must be unique. Same-name columns are not uncommon for joins between tables. For solutions to this problem, see Recipe 16.11.

In addition to the statement execution methods just described, DBI provides several high-level retrieval methods that execute a statement and return the result set in a single operation. All are database-handle methods that create and dispose of the statement handle internally before returning the result set. The methods differ in the form in which they return the result. Some return the entire result set, others return a single row or column of the set, as summarized in Table 4-4.

Table 4-4. Perl methods to retrieve results
MethodReturn value
selectrow_array() First row of result set as an array
selectrow_arrayref() First row of result set as a reference to an array
selectrow_hashref() First row of result set as a reference to a hash
selectcol_arrayref() First column of result set as a reference to an array
selectall_arrayref() Entire result set as a reference to an array of array references
selectall_hashref() Entire result set as a reference to a hash of hash references

Most of these methods return a reference. The exception is selectrow_array(), which selects the first row of the result set and returns an array or a scalar, depending on how you call it. In array context, selectrow_array() returns the entire row as an array (or the empty list if no row was selected). This is useful for statements from which you expect to obtain only a single row. The return value can be used to determine the result set size. The column count is the number of elements in the array, and the row count is 1 or 0:

my @val = $dbh->selectrow_array ("SELECT name, birth, foods FROM profile
                                  WHERE id = 3");
my $ncols = @val;
my $nrows = $ncols ? 1 : 0;

selectrow_arrayref() and selectrow_hashref() select the first row of the result set and return a reference to it, or undef if no row was selected. To access the column values, treat the reference the same way you treat the return value from fetchrow_arrayref() or fetchrow_hashref(). The reference also provides the row and column counts:

my $ref = $dbh->selectrow_arrayref ($stmt);
my $ncols = defined ($ref) ? @{$ref} : 0;
my $nrows = $ncols ? 1 : 0;

my $ref = $dbh->selectrow_hashref ($stmt);
my $ncols = defined ($ref) ? keys (%{$ref}) : 0;
my $nrows = $ncols ? 1 : 0;

selectcol_arrayref() returns a reference to a single-column array representing the first column of the result set. Assuming a non-undef return value, access elements of the array as $ref->[i] for the value from row i. The number of rows is the number of elements in the array, and the column count is 1 or 0:

my $ref = $dbh->selectcol_arrayref ($stmt);
my $nrows = defined ($ref) ? @{$ref} : 0;
my $ncols = $nrows ? 1 : 0;

selectall_arrayref() returns a reference to an array containing an element for each row of the result. Each element is a reference to an array. To access row i of the result set, use $ref->[i] to get a reference to the row. Then treat the row reference the same way, as a return value from fetchrow_arrayref(), to access individual column values in the row. The result set row and column counts are available as follows:

my $ref = $dbh->selectall_arrayref ($stmt);
my $nrows = defined ($ref) ? @{$ref} : 0;
my $ncols = $nrows ? @{$ref->[0]} : 0;

selectall_hashref() returns a reference to a hash, each element of which is a hash reference to a row of the result. To call it, specify an argument that indicates which column to use for hash keys. For example, if you retrieve rows from the profile table, the primary key is the id column:

my $ref = $dbh->selectall_hashref ("SELECT * FROM profile", "id");

Access rows using the keys of the hash. For a row that has a key column value of 12, the hash reference for the row is $ref->{12}. That row value is keyed on column names, which you can use to access individual column elements (for example, $ref->{12}->{name}). The result set row and column counts are available as follows:

my @keys = defined ($ref) ? keys (%{$ref}) : ();
my $nrows = scalar (@keys);
my $ncols = $nrows ? keys (%{$ref->{$keys[0]}}) : 0;

The selectall_XXX() methods are useful when you need to process a result set more than once because Perl DBI provides no way to rewind a result set. By assigning the entire result set to a variable, you can iterate through its elements multiple times.

Take care when using the high-level methods if you have RaiseError disabled. In that case, a method’s return value may not enable you to distinguish an error from an empty result set. For example, if you call selectrow_array() in scalar context to retrieve a single value, an undef return value is ambiguous because it may indicate any of three things: an error, an empty result set, or a result set consisting of a single NULL value. To test for an error, check the value of $DBI::errstr, $DBI::err, or $DBI::state.

Ruby

The Ruby Mysql2 API uses the same calls for SQL statements that do not return a result set and those that do. To process a statement in Ruby, use the query method. If the statement fails with an error, query raises an exception. Otherwise, the affected_rows method returns the number of rows changed for the last statement that modifies data:

client.query("UPDATE profile SET cats = cats+1 WHERE name = 'Sybil'")
puts "Number of rows updated: #{client.affected_rows}"

For statements such as SELECT that return a result set, the query method returns the result set as an instance of the Mysql2::Result class. The affected_rows method will return the number of rows in the result set for such statements. You can also obtain the number of rows in the result set by using the count method of the Mysql2::Result object:

result = client.query("SELECT id, name, cats FROM profile")
puts "Number of rows returned: #{client.affected_rows}"
puts "Number of rows returned: #{result.count}"
result.each do |row|
  printf "id: %s, name: %s, cats: %s\n", row["id"], row["name"], row["cats"]
end

result.fields contains the names of the columns in the result set.

PHP

PDO has two connection-object methods to execute SQL statements: exec() for statements that do not return a result set and query() for those that do. If you have PDO exceptions enabled, both methods raise an exception if statement execution fails. (Another approach couples the prepare() and execute() methods; see Recipe 4.5.)

To execute statements such as INSERT or UPDATE that don’t return rows, use exec(). It returns a count to indicate how many rows were changed:

$count = $dbh->exec ("UPDATE profile SET cats = cats+1 WHERE name = 'Sybil'");
printf ("Number of rows updated: %d\n", $count);

For statements such as SELECT that return a result set, the query() method returns a statement handle. Generally, you use this object to call a row-fetching method in a loop, and count the rows if you need to know how many there are:

$sth = $dbh->query ("SELECT id, name, cats FROM profile");
$count = 0;
while ($row = $sth->fetch (PDO::FETCH_NUM))
{
  printf ("id: %s, name: %s, cats: %s\n", $row[0], $row[1], $row[2]);
  $count++;
}
printf ("Number of rows returned: %d\n", $count);

To determine the number of columns in the result set, call the statement handle columnCount() method.

The example demonstrates the statement handle fetch() method, which returns the next row of the result set or FALSE when there are no more. fetch() takes an optional argument that indicates what type of value it should return. As shown, with an argument of PDO::FETCH_NUM, fetch() returns an array with elements accessed using numeric subscripts, beginning with 0. The array size indicates the number of result set columns.

With a PDO::FETCH_ASSOC argument, fetch() returns an associative array containing values accessed by column name ($row["id"], $row["name"], $row["cats"]).

With a PDO::FETCH_OBJ argument, fetch() returns an object having members accessed using the column names ($row->id, $row->name, $row->cats).

fetch() uses the default fetch mode if you invoke it with no argument. Unless you’ve changed the mode, it’s PDO::FETCH_BOTH, which is a combination of PDO::FETCH_NUM and PDO::FETCH_ASSOC. To set the default fetch mode for all statements executed within a connection, use the setAttribute database-handle method:

$dbh->setAttribute (PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_ASSOC);

To set the mode for a given statement, call its setFetchMode() method after executing the statement and before fetching the results:

$sth->setFetchMode (PDO::FETCH_OBJ);

It’s also possible to use a statement handle as an iterator. The handle uses the current default fetch mode:

$sth->setFetchMode (PDO::FETCH_NUM);
foreach ($sth as $row)
  printf ("id: %s, name: %s, cats: %s\n", $row[0], $row[1], $row[2]);

The fetchAll() method fetches and returns the entire result set as an array of rows. It permits an optional fetch-mode argument:

$rows = $sth->fetchAll (PDO::FETCH_NUM);
foreach ($rows as $row)
  printf ("id: %s, name: %s, cats: %s\n", $row[0], $row[1], $row[2]);

In this case, the row count is the number of elements in $rows.

Python

The Python DB API uses the same calls for SQL statements that do not return a result set and those that do. To process a statement in Python, use your database connection object to get a cursor object. Then use the cursor’s execute() method to send the statement to the server. If the statement fails with an error, execute() raises an exception. Otherwise, if there is no result set, statement execution is complete, and the cursor’s rowcount attribute indicates how many rows were changed:

cursor = conn.cursor()
cursor.execute("UPDATE profile SET cats = cats+1 WHERE name = 'Sybil'")
print("Number of rows updated: %d" % cursor.rowcount)
conn.commit()
cursor.close()
Note

The Python DB API specification indicates that database connections should begin with auto-commit mode disabled, so Connector/Python disables auto-commit when it connects to the MySQL server. If you use transactional tables, modifications to them are rolled back when you close the connection unless you commit the changes first, which is why the preceding example invokes the commit() method. For more information on auto-commit mode, see Chapter 20, particularly Recipe 20.7.

If the statement returns a result set, fetch its rows, then close the cursor. The fetchone() method returns the next row as a sequence, or None when there are no more rows:

cursor = conn.cursor()
cursor.execute("SELECT id, name, cats FROM profile")
while True:
  row = cursor.fetchone()
  if row is None:
    break
  print("id: %s, name: %s, cats: %s" % (row[0], row[1], row[2]))
print("Number of rows returned: %d" % cursor.rowcount)
cursor.close()

As you can see from the preceding example, the rowcount attribute is useful for SELECT statements, too; it indicates the number of rows in the result set.

len(row) tells you the number of columns in the result set.

Alternatively, use the cursor itself as an iterator that returns each row in turn:

cursor = conn.cursor()
cursor.execute("SELECT id, name, cats FROM profile")
for (id, name, cats) in cursor:
  print("id: %s, name: %s, cats: %s" % (id, name, cats))
print("Number of rows returned: %d" % cursor.rowcount)
cursor.close()

The fetchall() method returns the entire result set as a list of tuples. Iterate through the list to access the rows:

cursor = conn.cursor()
cursor.execute("SELECT id, name, cats FROM profile")
rows = cursor.fetchall()
for row in rows:
  print("id: %s, name: %s, cats: %s" % (row[0], row[1], row[2]))
print("Number of rows returned: %d" % cursor.rowcount)
cursor.close()

The DB API provides no way to rewind a result set, so fetchall() can be convenient when you must iterate through the rows of the result set more than once or access individual values directly. For example, if rows holds the result set, you can access the value of the third column in the second row as rows[1][2] (indexes begin at 0, not 1).

Go

The Go sql interface has two connection-object functions to execute SQL statements: Exec() for statements that do not return a result set and Query() for the statements that do. Both return error if the statement fails.

To run a statement that doesn’t return any row, such as INSERT, UPDATE, or DELETE, use the Exec() function. Its return values can have a Result type or an error type. The interface Result has a RowsAffected() function that indicates how many rows were changed:

sql := "UPDATE profile SET cats = cats+1 WHERE name = 'Sybil'"
res, err := db.Exec(sql)

if err != nil {
	panic(err.Error())
}

affectedRows, err := res.RowsAffected()

if err != nil {
	log.Fatal(err)
}

fmt.Printf("The statement affected %d rows\n", affectedRows)

For the statements that return a result set, typically SELECT, use the Query() function. This function returns the cursor to the object of the Rows type that holds the result of the query. Call the Next() function to iterate through the result and store returned values in the variables using the Scan() function. If Next() returns false, this means there is no result:

res, err := db.Query("SELECT id, name, cats FROM profile")

defer res.Close()

if err != nil {
	log.Fatal(err)
}

for res.Next() {

	var profile Profile
	err := res.Scan(&profile.id, &profile.name, &profile.cats)

	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("%+v\n", profile)
}

If Next() is called and returns false, the Rows are closed automatically. Otherwise, you need to close them using the Close() function.

For the queries that expect to return at most one row, there is a special function, QueryRow(), that returns a Row object that can be scanned immediately. QueryRow() never returns an error until Scan() is called. If the query returns no row, Scan() returns ErrNoRows:

row := db.QueryRow("SELECT id, name, cats FROM profile where id=3")

var profile Profile
err = row.Scan(&profile.id, &profile.name, &profile.cats)

if err == sql.ErrNoRows {
	fmt.Println("No row matched!")
} else if err != nil {
	log.Fatal(err)
} else {
	fmt.Printf("%v\n", profile)
}

Java

The JDBC interface provides specific object types for the various phases of SQL statement processing. Statements are executed in JDBC using Java objects of one type. The results, if any, are returned as objects of another type.

To execute a statement, first get a Statement object by calling the createStatement() method of your Connection object:

Statement s = conn.createStatement ();

Then use the Statement object to send the statement to the server. JDBC provides several methods for doing this. Choose the one that’s appropriate for the type of statement: executeUpdate() for statements that don’t return a result set, executeQuery() for statements that do, and execute() when you don’t know. Each method raises an exception if the statement fails.

The executeUpdate() method sends a statement that generates no result set to the server and returns a count indicating the number of affected rows. When you’re done with the statement object, close it:

Statement s = conn.createStatement ();
int count = s.executeUpdate(
               "UPDATE profile SET cats = cats+1 WHERE name = 'Sybil'");
s.close();   // close statement
System.out.println("Number of rows updated: " + count);

For statements that return a result set, use executeQuery(). Then get a result set object, and use it to retrieve the row values. When you’re done, close the result set and statement objects:

Statement s = conn.createStatement ();
s.executeQuery("SELECT id, name, cats FROM profile");
ResultSet rs = s.getResultSet();
int count = 0;
while (rs.next ()) { // loop through rows of result set\
  int id = rs.getInt(1);   // extract columns 1, 2, and 3
  String name = rs.getString(2);
  int cats = rs.getInt(3);
  System.out.println("id: " + id
                     + ", name: " + name
                     + ", cats: " + cats);
  ++count;
}
rs.close ();  // close result set
s.close ();   // close statement
System.out.println ("Number of rows returned: " + count);

The ResultSet object returned by the getResultSet() method of your Statement object has its own methods, such as next(), to fetch rows and various getXXX() methods that access columns of the current row. Initially, the result set is positioned just before the first row of the set. Call next() to fetch each row in succession until it returns false. To determine the number of rows in a result set, count them yourself, as shown in the preceding example.

Tip

For queries that return a single result set, it isn’t necessary to call getResultSet. The preceding code could be written as follows:

ResultSet rs = s.executeQuery("SELECT id, name, cats FROM profile");

A separate call is needed when your query can return multiple result sets, for example, if you call a stored routine.

To access column values, use the getInt(), getString(), getFloat(), or getDate() methods. To obtain the column value as a generic object, use getObject(). The argument to a getXXX() call can indicate either column position (beginning at 1, not 0) or column name. The previous example shows how to retrieve the id, name, and cats columns by position. To access columns by name instead, write the row-fetching loop as follows:

while (rs.next ()) { // loop through rows of result set
  int id = rs.getInt("id");
  String name = rs.getString("name");
  int cats = rs.getInt("cats");
  System.out.println("id: " + id
                     + ", name: " + name
                     + ", cats: " + cats);
  ++count;
}

To retrieve a given column value, use any getXXX() call that makes sense for the data type. For example, getString() retrieves any column value as a string:

String id = rs.getString("id");
String name = rs.getString("name");
String cats = rs.getString("cats");
System.out.println("id: " + id
                   + ", name: " + name
                   + ", cats: " + cats);

Or use getObject() to retrieve values as generic objects and convert the values as necessary. The following example uses toString() to convert object values to printable form:

Object id = rs.getObject("id");
Object name = rs.getObject("name");
Object cats = rs.getObject("cats");
System.out.println("id: " + id.toString()
                   + ", name: " + name.toString()
                   + ", cats: " + cats.toString());

To determine the number of columns in the result set, access its metadata:

ResultSet rs = s.getResultSet();
ResultSetMetaData md = rs.getMetaData(); // get result set metadata
int ncols = md.getColumnCount();         // get column count from metadata

The third JDBC statement-execution method, execute(), works for either type of statement. It’s particularly useful when you receive a statement string from an external source and don’t know whether it generates a result set or returns multiple result sets. The return value from execute() indicates the statement type so that you can process it appropriately: if execute() returns true, there is a result set, otherwise not. Typically, you’d use it something like this, where stmtStr represents an arbitrary SQL statement:

Statement s = conn.createStatement();
if (s.execute(stmtStr)) {
  // there is a result set
  ResultSet rs = s.getResultSe();

  // ... process result set here ...

  rs.close();  // close result set
} else {
  // there is no result set, just print the row count
  System.out.println("Number of rows affected: " + s.getUpdateCount ());
}
s.close();   // close statement

4.5 Handling Special Characters and NULL Values in Statements

Problem

You need to construct SQL statements that refer to data values containing special characters such as quotes or backslashes, or special values such as NULL. Or you are constructing statements using data obtained from external sources and want to prevent SQL injection attacks.

Solution

Use your API’s placeholder mechanism or quoting function to make data safe for insertion.

Discussion

Up to this point in the chapter, our statements have used safe data values that require no special treatment. For example, we can easily construct the following SQL statements from within a program by writing the data values literally in the statement strings:

SELECT * FROM profile WHERE age > 40 AND color = 'green';

INSERT INTO profile (name,color) VALUES('Gary','blue');

However, some data values are not so easily handled and cause problems if you are not careful. Statements might use values that contain special characters such as quotes, backslashes, binary data, or values that are NULL. The following discussion describes the difficulties these values cause and the proper techniques for handling them.

Suppose that you want to execute this INSERT statement:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('Alison','1973-01-12','blue','eggroll',4);

There’s nothing unusual about that. But if you change the name column value to something like De'Mont that contains a single quote, the statement becomes syntactically invalid:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De'Mont','1973-01-12','blue','eggroll',4);

The problem is the single quote inside a single-quoted string. To make the statement legal by escaping the quote, precede it with either a single quote or a backslash:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De''Mont','1973-01-12','blue','eggroll',4);
INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De\'Mont','1973-01-12','blue','eggroll',4);

Alternatively, quote the name value itself within double quotes rather than within single quotes (assuming that the ANSI_QUOTES SQL mode is not enabled):

INSERT INTO profile (name,birth,color,foods,cats)
VALUES("De'Mont",'1973-01-12','blue','eggroll',4);

If you are writing a statement literally in your program, you can escape or quote the name value by hand because you know what the value is. But if the name is stored in a variable, you don’t necessarily know what the variable’s value is. Worse yet, a single quote isn’t the only character you must be prepared to deal with; double quotes and backslashes cause problems, too. And if the database stores binary data such as images or sound clips, a value might contain anything—not only quotes or backslashes but other characters such as nulls (zero-valued bytes). The need to handle special characters properly is particularly acute in a web environment where statements are constructed using form input (for example, if you search for rows that match search terms entered by the remote user). You must be able to handle any kind of input in a general way because you can’t predict in advance what kind of information a user will supply. It’s not uncommon for malicious users to enter garbage values containing problematic characters in a deliberate attempt to compromise the security of your server and even execute fatal commands, such as DROP TABLE. That is a standard technique for exploiting insecure scripts, called SQL injection.

The SQL NULL value is not a special character, but it too requires special treatment. In SQL, NULL indicates no value. This can have several meanings depending on context, such as unknown, missing, out of range, and so forth. Our statements thus far have not used NULL values, to avoid dealing with the complications they introduce, but now it’s time to address these issues. For example, if you don’t know De’Mont’s favorite color, you can set the color column to NULL—but not by writing the statement like this:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De''Mont','1973-01-12','NULL','eggroll',4);

Instead, the NULL value must have no enclosing quotes:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De''Mont','1973-01-12',NULL,'eggroll',4);

Were you writing the statement literally in your program, you’d simply write the word NULL. But if the color value comes from a variable, the proper action is not so obvious. You must know whether the variable’s value represents NULL to determine whether to enclose it within quotes when you construct the statement.

You have two means at your disposal for dealing with special characters such as quotes and backslashes and with special values such as NULL:

  • Use placeholders in the statement string to refer to data values symbolically, then bind the data values to the placeholders when you execute the statement. This is the preferred method because the API itself does all or most of the work for you of providing quotes around values as necessary, quoting or escaping special characters within the data value, and possibly interpreting a special value to map onto NULL without enclosing quotes.

  • Use a quoting function (if your API provides one) for converting data values to a safe form that is suitable for use in statement strings.

This section shows how to use these techniques to handle special characters and NULL values for each API. One of the examples demonstrated here shows how to insert a profile table row that contains De'Mont for the name value and NULL for the color value. However, the principles shown here have general utility and handle any special characters, including those found in binary data. Also, the principles are not limited to INSERT statements. They work for other kinds of statements as well, such as SELECT. One of the other examples shown here demonstrates how to execute a SELECT statement using placeholders.

Processing of special characters and NULL values comes up in other contexts covered elsewhere:

  • The placeholder and quoting techniques described here are only for data values and not for identifiers such as database or table names. For a discussion of identifier quoting, refer to Recipe 4.6.

  • Comparisons of NULL values require different operators than non-NULL values. Recipe 5.6 discusses how to construct SQL statements that perform NULL comparisons from within programs.

  • This section covers the issue of getting special characters into your database. A related issue is the inverse operation of transforming special characters in values returned from your database for display in various contexts. For example, if you generate HTML pages that include values taken from your database, you must perform output encoding to convert < and > characters in those values to the HTML entities &lt; and &gt; to make sure they display properly.

Using placeholders

Placeholders enable you to avoid writing data values literally in SQL statements. Using this approach, you write statements using placeholders—special markers that indicate where the values go. Two common parameter markers are ? and %s. Depending on the marker, rewrite the INSERT statement to use placeholders like this:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES(?,?,?,?,?);

INSERT INTO profile (name,birth,color,foods,cats)
VALUES(%s,%s,%s,%s,%s);

Then pass the statement string to the database server and supply the data values separately. The API binds the values to the placeholders to replace them, resulting in a statement that contains the data values.

One benefit of placeholders is that parameter-binding operations automatically handle escaping of characters such as quotes and backslashes. This is especially useful for inserting binary data such as images into your database or using data values with unknown content such as input submitted by a remote user through a form in a web page. Also, there is usually some special value that you bind to a placeholder to indicate that you want a SQL NULL value in the resulting statement.

A second benefit of placeholders is that you can prepare a statement in advance, then reuse it by binding different values to it each time it’s executed. Prepared statements thus encourage statement reuse. Statements become more generic because they contain placeholders rather than specific data values. If you perform an operation over and over, you may be able to reuse a prepared statement and simply bind different data values to it each time you execute it. Some database systems (MySQL not among them) have the capability of performing some preparsing or even execution planning prior to executing a prepared statement. For a statement that is executed multiple times later, this reduces overhead because anything that can be done prior to execution need be done only once, not once per execution. For example, if a program executes a particular type of SELECT statement several times while it runs, such a database system can construct a plan for the statement and then reuse it each time, rather than rebuild the plan over and over. MySQL doesn’t build query plans in advance, so you get no performance boost from using prepared statements. However, if you port a program to a database that does reuse query plans and you’ve written your program to use prepared statements, you can get this advantage of prepared statements automatically. You need not convert from nonprepared statements to enjoy that benefit.

A third (admittedly subjective) benefit is that code that uses placeholder-based statements can be easier to read. As you work through this section, compare the statements used here with those from Recipe 4.4 that didn’t use placeholders to see which you prefer.

Using a quoting function

Some APIs provide a quoting function that takes a data value as its argument and returns a properly quoted and escaped value suitable for safe insertion into a SQL statement. This is less common than using placeholders, but it can be useful for constructing statements that you don’t intend to execute immediately. However, you must have a connection open to the database server while you use such a quoting function because the API cannot select the proper quoting rules until the database driver is known. (The rules differ among database systems.)

Note

As we’ll indicate later, some APIs quote as strings all non-NULL values, even numbers, when binding them to parameter markers. This can be an issue in contexts that require numbers, as described further in Recipe 5.11.

Perl

To use placeholders with Perl DBI, put a ? in your SQL statement string at each data value location. Then bind the values to the statement by passing them to do() or execute(), or by calling a DBI method specifically intended for placeholder substitution. Use undef to bind a NULL value to a placeholder.

With do(), add the profile row for De’Mont by passing the statement string and the data values in the same call:

my $count = $dbh->do ("INSERT INTO profile (name,birth,color,foods,cats)
                       VALUES(?,?,?,?,?)",
                      undef,
                      "De'Mont", "1973-01-12", undef, "eggroll", 4);

The arguments following the statement string are undef, then one data value for each placeholder. The undef argument is a historical artifact but must be present.

Alternatively, pass the statement string to prepare() to get a statement handle, then use that handle to pass the data values to execute():

my $sth = $dbh->prepare ("INSERT INTO profile (name,birth,color,foods,cats)
                          VALUES(?,?,?,?,?)");
my $count = $sth->execute ("De'Mont", "1973-01-12", undef, "eggroll", 4);

In either case, DBI generates this statement:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De\'Mont','1973-01-12',NULL,'eggroll','4');

The Perl DBI placeholder mechanism provides quotes around data values when they are bound to the statement string, so don’t put quotes around the ? characters in the string.

Note that the placeholder mechanism adds quotes around numeric values. DBI relies on the MySQL server to perform type conversion as necessary to convert strings to numbers. If you bind undef to a placeholder, DBI puts a NULL into the statement and correctly refrains from adding enclosing quotes.

To execute the same statement over and over again, use prepare() once, then call execute() with the appropriate data values each time you run it.

You can use these methods for other types of statements as well. For example, the following SELECT statement uses a placeholder to look for rows that have a cats value larger than 2:

my $sth = $dbh->prepare ("SELECT * FROM profile WHERE cats > ?");
$sth->execute (2);
while (my $ref = $sth->fetchrow_hashref ())
{
  print "id: $ref->{id}, name: $ref->{name}, cats: $ref->{cats}\n";
}

High-level retrieval methods such as selectrow_array() and selectall_arrayref() can be used with placeholders, too. Like the do() method, the arguments are the statement string, undef, and the data values to bind to the placeholders. Here’s an example:

my $ref = $dbh->selectall_arrayref (
  "SELECT name, birth, foods FROM profile WHERE id > ? AND color = ?",
  undef, 3, "green"
);

The Perl DBI quote() database-handle method is an alternative to using placeholders. Here’s how to use quote() to create a statement string that inserts a new row in the profile table. Write the %s format specifiers without enclosing quotes because quote() provides them automatically as necessary. Non-undef values are inserted with quotes, and undef values are inserted as NULL without quotes:

my $stmt = sprintf ("INSERT INTO profile (name,birth,color,foods,cats)
                     VALUES(%s,%s,%s,%s,%s)",
                    $dbh->quote ("De'Mont"),
                    $dbh->quote ("1973-01-12"),
                    $dbh->quote (undef),
                    $dbh->quote ("eggroll"),
                    $dbh->quote (4));
my $count = $dbh->do ($stmt);

The statement string generated by this code is the same as when you use placeholders.

Ruby

Ruby DBI uses ? as the placeholder character in SQL statements and nil as the value for binding a SQL NULL value to a placeholder.

To use the ?, pass the statement string to prepare to get a statement handle, then use that handle to invoke execute with the data values:

sth = client.prepare("INSERT INTO profile (name,birth,color,foods,cats)
                   VALUES(?,?,?,?,?)")
sth.execute("De'Mont", "1973-01-12", nil, "eggroll", 4)

Mysql2 includes properly escaped quotes and a properly unquoted NULL value in the resulting statement:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De\'Mont','1973-01-12',NULL,'eggroll',4);

The Ruby Mysql2 placeholder mechanism provides quotes around data values as necessary when they are bound to the statement string, so don’t put quotes around the ? characters in the string.

PHP

To use placeholders with the PDO extension, pass a statement string to prepare() to get a statement object. The string can contain ? characters as placeholder markers. Use this object to invoke execute(), passing to it the array of data values to bind to the placeholders. Use the PHP NULL value to bind a SQL NULL value to a placeholder. The code to add the profile table row for De’Mont looks like this:

$sth = $dbh->prepare ("INSERT INTO profile (name,birth,color,foods,cats)
                          VALUES(?,?,?,?,?)");
$sth->execute (array ("De'Mont","1973-01-12",NULL,"eggroll",4));

The resulting statement includes a properly escaped quote and a properly unquoted NULL value:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De\'Mont','1973-01-12',NULL,'eggroll','4');

The PDO placeholder mechanism provides quotes around data values when they are bound to the statement string, so don’t put quotes around the ? characters in the string. (Note that even the numeric value 4 is quoted; PDO relies on MySQL to perform type conversion as necessary when the statement executes.)

Python

The Connector/Python module implements placeholders using %s format specifiers in the SQL statement string. (To place a literal % character into the statement, use %% in the statement string.) To use placeholders, invoke the execute() method with two arguments: a statement string containing format specifiers and a sequence containing the values to bind to the statement string. Use None to bind a NULL value to a placeholder. The code to add the profile table row for De’Mont looks like this:

cursor = conn.cursor()
cursor.execute('''
               INSERT INTO profile (name,birth,color,foods,cats)
               VALUES(%s,%s,%s,%s,%s)
               ''', ("De'Mont", "1973-01-12", None, "eggroll", 4))
cursor.close()
conn.commit()

The statement sent to the server by the preceding execute() call looks like this:

INSERT INTO profile (name,birth,color,foods,cats)
VALUES('De\'Mont','1973-01-12',NULL,'eggroll',4);

The Connector/Python placeholder mechanism provides quotes around data values as necessary when they are bound to the statement string, so don’t put quotes around the %s format specifiers in the string.

If you have only a single value, val, to bind to a placeholder, write it as a sequence using the syntax (val,):

cursor = conn.cursor()
cursor.execute("SELECT id, name, cats FROM profile WHERE cats = %s", (2,))
for (id, name, cats) in cursor:
  print("id: %s, name: %s, cats: %s" % (id, name, cats))
cursor.close()

Alternatively, write the value as a list using the syntax [val].

Go

The Go sql package uses question marks (?) as placeholder markers. You can use placeholders with single Exec() or Query() calls, and you can also prepare the statement in advance and execute it later. The latter method is good when you need to execute the statement multiple times. The code to add the profile table row for De’Mont looks like this:

stmt := `INSERT INTO profile (name,birth,color,foods,cats)
           VALUES(?,?,?,?,?)`
_, err = db.Exec(stmt, "De'Mont", "1973-01-12", nil, "eggroll", 4)

The same code with the Prepare() call looks like this:

pstmt, err := db.Prepare(`INSERT INTO profile (name,birth,color,foods,cats)
                 VALUES(?,?,?,?,?)`)
if err != nil {
  log.Fatal(err)
}
defer pstmt.Close()

_, err = pstmt.Exec("De'Mont", "1973-01-12", nil, "eggroll", 4)

Java

JDBC provides support for placeholders if you use prepared statements. Recall that the process for executing nonprepared statements in JDBC is to create a Statement object and then pass the statement string to the executeUpdate(), executeQuery(), or execute() function. To use a prepared statement instead, create a PreparedStatement object by passing a statement string containing ? placeholder characters to your connection object’s prepareStatement() method. Then bind the data values to the statement using setXXX() methods. Finally, execute the statement by calling executeUpdate(), executeQuery(), or execute() with an empty argument list.

Here is an example that uses executeUpdate() to execute an INSERT statement that adds the profile table row for De’Mont:

PreparedStatement s;
s = conn.prepareStatement(
            "INSERT INTO profile (name,birth,color,foods,cats)"
            + " VALUES(?,?,?,?,?)");
s.setString(1, "De'Mont");         // bind values to placeholders
s.setString(2, "1973-01-12");
s.setNull(3, java.sql.Types.CHAR);
s.setString(4, "eggroll");
s.setInt(5, 4);
s.close();   // close statement

The setXXX() methods that bind data values to statements take two arguments: a placeholder position (beginning with 1, not 0) and the value to bind to the placeholder. Choose each value-binding call to match the data type of the column to which the value is bound: setString() to bind a string to the name column, setInt() to bind an integer to the cats column, and so forth. (Actually, we cheated a bit by using setString() to treat the date value for birth as a string.)

One difference between JDBC and the other APIs is that you don’t bind a NULL to a placeholder by specifying some special value (such as undef in Perl or nil in Ruby). Instead, invoke setNull() with a second argument that indicates the type of the column: java.sql.Types.CHAR for a string, java.sql.Types.INTEGER for an integer, and so forth.

The setXXX() calls add quotes around data values if necessary, so don’t put quotes around the ? placeholder characters in the statement string.

To handle a statement that returns a result set, the process is similar, but execute the prepared statement with executeQuery() rather than executeUpdate():

PreparedStatement s;
s = conn.prepareStatement("SELECT * FROM profile WHERE cats > ?");
s.setInt(1, 2);  // bind 2 to first placeholder
s.executeQuery();
// ... process result set here ...
s.close();     // close statement

4.6 Handling Special Characters in Identifiers

Problem

You need to construct SQL statements that refer to identifiers containing special characters.

Solution

Quote each identifier so it can be inserted safely into statement strings.

Discussion

Recipe 4.5 discusses how to handle special characters in data values by using placeholders or quoting methods. Special characters can also be present in identifiers such as database, table, and column names. For example, the table name some table contains a space, which is not permitted by default:

mysql> CREATE TABLE some table (i INT);
ERROR 1064 (42000): You have an error in your SQL syntax near 'table (i INT)'

Special characters are handled differently in identifiers than in data values. To make an identifier safe for insertion into a SQL statement, quote it by enclosing it within backticks:

mysql> CREATE TABLE `some table` (i INT);
Query OK, 0 rows affected (0.04 sec)

In MySQL, backticks are always permitted for identifier quoting. The double-quote character is permitted as well, if the ANSI_QUOTES SQL mode is enabled. Thus, with ANSI_QUOTES enabled, both of these statements are equivalent:

CREATE TABLE `some table` (i INT);
CREATE TABLE "some table" (i INT);

If it’s necessary to know which identifier quoting characters are permitted, execute a SELECT @@sql_mode statement to retrieve the SQL mode for your session and check whether its value includes ANSI_QUOTES.

If a quoting character appears within the identifier itself, double it when quoting the identifier. For example, quote abc`def as `abc``def`.

Be aware that although string data values in MySQL normally can be quoted using either single-quote or double-quote characters ('abc', "abc"), that is not true when ANSI_QUOTES is enabled. In that case, MySQL interprets 'abc' as a string and "abc" as an identifier, so you must use only single quotes for strings.

Within a program, you can use an identifier-quoting routine if your API provides one, or write one yourself if not. Perl DBI has a quote_identifier() method that returns a properly quoted identifier. For an API that has no such method, you can quote an identifier by enclosing it within backticks and doubling any backticks that occur within the identifier. Here’s a PHP routine that does so:

function quote_identifier ($ident)
{
  return ('`' . str_replace('`', '``', $ident) . '`');
}

Portability note: if you write your own identifier-quoting routines, remember that other database management systems (DBMSs) may require different quoting conventions.

In contexts where identifiers are used as data values, handle them as such. If you select information from the INFORMATION_SCHEMA metadata database, it’s common to indicate which rows to return by specifying database object names in the WHERE clause. For example, this statement retrieves the column names for the profile table in the cookbook database:

SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'cookbook' AND TABLE_NAME = 'profile';

The database and table names are used here as data values, not as identifiers. Were you to construct this statement within a program, parameterize them using placeholders, not identifier quoting. For example, in Ruby, do this:

sth = client.prepare("SELECT COLUMN_NAME
                        FROM INFORMATION_SCHEMA.COLUMNS
                        WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?")
names = sth.execute(db_name, tbl_name)

4.7 Identifying NULL Values in Result Sets

Problem

A query result includes NULL values, but you’re not sure how to identify them.

Solution

Your API probably has some special value that represents NULL by convention. You just have to know what it is and how to test for it.

Discussion

Recipe 4.5 describes how to refer to NULL values when you send statements to the database server. In this section, we’ll deal instead with the question of how to recognize and process NULL values returned from the database server. In general, this is a matter of knowing what special value the API maps NULL values to, or what method to call. Table 4-5 shows these values.

Table 4-5. Detected NULL values
LanguageNULL-detection value or method
Perl DBIundef value
Ruby Mysql2 gemnil value
PHP PDONULL value
Python DB APINone value
Go sql interfaceGo Null type implementation for the nullable data types.
Java JDBCwasNull() method

The following sections show a very simple application of NULL value detection. The examples retrieve a result set and print all values in it, mapping NULL values onto the printable string "NULL".

To make sure that the profile table has a row that contains some NULL values, use mysql to execute the following INSERT statement, then execute the SELECT statement to verify that the resulting row has the expected values:

mysql> INSERT INTO profile (name) VALUES('Amabel');
mysql> SELECT * FROM profile WHERE name = 'Amabel';
+----+--------+-------+-------+-------+------+
| id | name   | birth | color | foods | cats |
+----+--------+-------+-------+-------+------+
|  9 | Amabel | NULL  | NULL  | NULL  | NULL |
+----+--------+-------+-------+-------+------+

The id column might contain a different number, but the other columns should appear as shown, with values of NULL.

Perl

Perl DBI represents NULL values using undef. To detect such values, use the defined() function; it’s particularly important to do so if you enable warnings with the Perl -w option or by including a use warnings line in your script. Otherwise, accessing undef values causes Perl to issue Use of uninitialized value warnings.

To prevent these warnings, test column values that might be undef with defined() before using them. The following code selects a few columns from the profile table and prints "NULL" for any undefined values in each row. This makes NULL values explicit in the output without activating any warning messages:

my $sth = $dbh->prepare ("SELECT name, birth, foods FROM profile");
$sth->execute ();
while (my $ref = $sth->fetchrow_hashref ())
{
  printf "name: %s, birth: %s, foods: %s\n",
         defined ($ref->{name}) ? $ref->{name} : "NULL",
         defined ($ref->{birth}) ? $ref->{birth} : "NULL",
         defined ($ref->{foods}) ? $ref->{foods} : "NULL";
}

Unfortunately, testing multiple column values is ponderous and becomes worse the more columns there are. To avoid this, test and set undefined values using a loop or map prior to printing them. The following example uses map:

my $sth = $dbh->prepare ("SELECT name, birth, foods FROM profile");
$sth->execute ();
while (my $ref = $sth->fetchrow_hashref ())
{
  map { $ref->{$_} = "NULL" unless defined ($ref->{$_}); } keys (%{$ref});
  printf "name: %s, birth: %s, foods: %s\n",
         $ref->{name}, $ref->{birth}, $ref->{foods};
}

With this technique, the amount of code to perform the tests is constant, not proportional to the number of columns to be tested. Also, there is no reference to specific column names, so it can more easily be used in other programs or as the basis for a utility routine.

If you fetch rows into an array rather than into a hash, use map like this to convert undef values:

my $sth = $dbh->prepare ("SELECT name, birth, foods FROM profile");
$sth->execute ();
while (my @val = $sth->fetchrow_array ())
{
  @val = map { defined ($_) ? $_ : "NULL" } @val;
  printf "name: %s, birth: %s, foods: %s\n",
         $val[0], $val[1], $val[2];
}

Ruby

The Ruby Mysql2 module represents NULL values using nil, which can be identified by applying the nil? method to a value. The following example uses the nil? method and ternary operator to determine whether to print result set values as is or as the string "NULL" for NULL values:

result = client.query("SELECT name, birth, foods FROM profile")
result.each do |row|
  printf "name %s, birth: %s, foods: %s\n", 
         row["name"].nil? ? "NULL" : row["name"], 
         row["birth"].nil? ? "NULL" : row["birth"], 
         row["foods"].nil? ? "NULL" : row["foods"]
end

PHP

PHP represents SQL NULL values in result sets as the PHP NULL value. To determine whether a value from a result set represents a NULL value, compare it to the PHP NULL value using the === triple equal operator:

if ($val === NULL)
{
  # $val is a NULL value
}

In PHP, the triple equal operator means exactly equal to. The usual ==, equal to, comparison operator is not suitable here: with ==, PHP considers the NULL value, the empty string, and 0 all equal.

The following code uses the === operator to identify NULL values in a result set and print them as the string "NULL":

$sth = $dbh->query ("SELECT name, birth, foods FROM profile");
while ($row = $sth->fetch (PDO::FETCH_NUM))
{
  foreach (array_keys ($row) as $key)
  {
    if ($row[$key] === NULL)
      $row[$key] = "NULL";
  }
  print ("name: $row[0], birth: $row[1], foods: $row[2]\n");
}

An alternative to === for NULL value tests is is_null().

Python

Python DB API programs represent NULL in result sets using None. The following example shows how to detect NULL values:

cursor = conn.cursor()
cursor.execute("SELECT name, birth, foods FROM profile")

for row in cursor:
  row = list(row)  # convert nonmutable tuple to mutable list
  
  for i, value in enumerate(row):
    if value is None:  # is the column value NULL?
      row[i] = "NULL"
      
  print("name: %s, birth: %s, foods: %s" % (row[0], row[1], row[2]))
  
cursor.close()

The inner loop checks for NULL column values by looking for None and converts them to the string "NULL". The example converts row to a mutable object (list) prior to the loop because fetchall() returns rows as sequence values, which are immutable (read only).

Go

The Go sql interface provides special data types to handle values in the result set that may contain NULL values. They are defined for the standard Go types. Table 4-6 contains the list of the standard data types and their nullable equivalents.

Table 4-6. Handling NULL values in Go
Standard Go typeType that can contain NULL values
boolNullBool
float64NullFloat64
int32NullInt32
int64NullInt64
stringNullString
time.TimeNullTime

To define a variable that can take both NULL and non-NULL values when passed as an argument to the function Scan(), use the corresponding nullable type.

All nullable types contain two functions: Valid() that returns true if the value is not NULL and false if the value is NULL. The second function is the type name, started from the capital letter, for example, String() for string values and Time() for time.Time values. This method returns the actual value when it is not NULL.

The following example shows you how to handle NULL values in Go:

// null-in-result.go : Selecting NULL values in Go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

type Profile struct {
    name     string
    birth    sql.NullString
    foods    sql.NullString
}

func main() {

	db, err := sql.Open("mysql", "cbuser:cbpass@tcp(127.0.0.1:3306)/cookbook")

	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	sql := "SELECT name, birth, foods FROM profile"
	res, err := db.Query(sql)

	if err != nil {
		log.Fatal(err)
	}
	defer res.Close()

	for res.Next() {
		var profile Profile
		err = res.Scan(&profile.name, &profile.birth, &profile.foods)
		if err != nil {
			log.Fatal(err)
		}

        if (profile.birth.Valid && profile.foods.Valid) {
          fmt.Printf("name: %s, birth: %s, foods: %s\n", 
                      profile.name, profile.birth.String, profile.foods.String)
        } else if profile.birth.Valid {
          fmt.Printf("name: %s, birth: %s, foods: NULL\n",
                      profile.name, profile.birth.String)
        } else if profile.foods.Valid {
          fmt.Printf("name: %s, birth: NULL, foods: %s\n",
                      profile.name, profile.foods.String)
        } else {
          fmt.Printf("name: %s, birth: NULL, foods: NULL\n",
                      profile.name)
        }
	}
}
Warning

We used the NullString type for the birth column for simplicity. If you want to use the NullTime type, you need to add the parseTime=true parameter to your connection string.

Tip

Alternatively, you can use MySQL’s COALESCE() function to convert the NULL value to a string during the query execution:

sql := `SELECT name, 
               COALESCE(birth, '') as birthday
        FROM profile WHERE id = 9`
res, err := db.Query(sql)
defer res.Close()

Java

For JDBC programs, if it’s possible for a column in a result set to contain NULL values, it’s best to check for them explicitly. The way to do this is to fetch the value and then invoke wasNull(), which returns true if the column is NULL and false otherwise, for example:

Object obj = rs.getObject (index);
if (rs.wasNull ())
{ /* the value's a NULL */ }

The preceding example uses getObject(), but the principle holds for other getXXX() calls as well.

Here’s an example that prints each row of a result set as a comma-separated list of values, with "NULL" printed for each NULL value:

Statement s = conn.createStatement();
s.executeQuery("SELECT name, birth, foods FROM profile");
ResultSet rs = s.getResultSet();
ResultSetMetaData md = rs.getMetaData();
int ncols = md.getColumnCount();
while (rs.next ()) { // loop through rows of result set
  for (int i = 0; i < ncols; i++) { // loop through columns
    String val = rs.getString(i+1);
    if (i > 0)
      System.out.print(", ");
    if (rs.wasNull())
      System.out.print("NULL");
    else
      System.out.print(val);
  }
  System.out.println();
}
rs.close();  // close result set
s.close();   // close statement

4.8 Obtaining Connection Parameters

Problem

You need to obtain connection parameters for a script so that it can connect to a MySQL server.

Solution

There are several ways to do this. Take your pick from the alternatives described here.

Discussion

Any program that connects to MySQL specifies connection parameters such as the username, password, and hostname. The recipes shown so far have put connection parameters directly into the code that attempts to establish the connection, but that is not the only way for your programs to obtain the parameters. This discussion briefly surveys some of the available techniques:

Hardwire the parameters into the program

The parameters can be given either in the main source file or in a library file used by the program. This technique is convenient because users need not enter the values themselves, but it’s also inflexible. To change parameters, you must modify your program. It is also insecure, because everyone who accesses the library would be able to read your database credentials.

Ask for the parameters interactively

In a command-line environment, you can ask the user a series of questions. In a web or GUI environment, you might do this by presenting a form or dialog. Either way, this becomes tedious for people who use the application frequently, due to the need to enter the parameters each time.

Get the parameters from the command line

You can use this method either for commands run interactively or from within a script. Like the method of obtaining parameters interactively, you must supply parameters for each command invocation. (A factor that mitigates this burden is that many shells enable you to easily recall commands from your history list for re-execution.) This method could be insecure if you provide your credentials this way.

Get the parameters from the execution environment

The most common way to do this is to set the appropriate environment variables in one of your shell’s startup files (such as .profile for sh, bash, or ksh; or .login for csh or tcsh). Programs that you run during your login session then can get parameter values by examining their environment.

Get the parameters from a separate file

With this method, store information such as the username and password in a file that programs can read before connecting to the MySQL server. Reading parameters from a file that’s separate from your program gives you the benefit of not having to enter them each time you use the program, without hardwiring the values into it. Also, storing the values in a file enables you to centralize parameters for use by multiple programs, and for security purposes, you can set the file access mode to keep other users from reading the file.

The MySQL client library itself supports an option file mechanism, although not all APIs provide access to it. For those that don’t, workarounds may exist. (As an example, Java supports the use of properties files and supplies utility routines for reading them.)

Use a combination of methods

It’s often useful to combine methods, to give users the flexibility of providing parameters different ways. For example, MySQL clients such as mysql and mysqladmin look for option files in several locations and read any that are present. They then check the command-line arguments for further parameters. This enables users to specify connection parameters in an option file or on the command line.

These methods of obtaining connection parameters do involve security issues:

  • Any method that stores connection parameters in a file may compromise your system’s security unless the file is protected against access by unauthorized users. This is true whether parameters are stored in a source file, an option file, or a script that invokes a command and specifies the parameters on the command line. (Web scripts that can be read only by the web server don’t qualify as secure if other users have administrative access to the server.)

  • Parameters specified on the command line or in environment variables are not particularly secure. While a program is executing, its command-line arguments and environment may be visible to other users who run process status commands such as ps -e. In particular, storing the password in an environment variable perhaps is best limited to those situations in which you’re the only user on the machine or you trust all other users.

The rest of this section discusses how to process command-line arguments to get connection parameters and how to read parameters from option files.

Getting parameters from the command line

The convention used by standard clients such as mysql and mysqladmin for command-line arguments is to permit parameters to be specified using either a short option or a long option. For example, the username cbuser can be specified either as -u cbuser (or -ucbuser) or --user=cbuser. In addition, for either of the password options (-p or --password), the password value may be omitted after the option name to cause the program to prompt for the password interactively.

The standard flags for these command options are -h or --host, -u or --user, and -p or --password. You could write your own code to iterate through the argument list, but it’s much easier to use existing option-processing modules written for that purpose. Under the api directory of the recipes distribution, you’ll find example programs that show how to process command arguments to obtain the hostname, username, and password for Perl, Ruby, Python, and Java. An accompanying PDF file explains how each one works.

Note

Insofar as possible, the programs mimic option-handling behavior of the standard MySQL clients. An exception is that option-processing libraries may not permit making the password value optional, and they provide no way of prompting the user for a password interactively if a password option is specified without a password value. Consequently, the programs are written so that if you use -p or --password, you must provide the password value following the option.

Getting parameters from option files

If your API supports it, you can specify connection parameters in a MySQL option file and let the API read the parameters from the file for you. For APIs that do not support option files directly, you may be able to arrange to read other types of files in which parameters are stored or to write your own functions that read option files.

Recipe 1.4 describes the format of MySQL option files. We assume that you’ve read the discussion there and concentrate here on how to use option files from within programs. You can find files containing the code discussed here under the api directory of the recipes distribution.

Under Unix, user-specific options are specified by convention in ~/.my.cnf (that is, in the .my.cnf file in your home directory). However, the MySQL option-file mechanism can look in several different files if they exist, although no option file is required to exist. (For the list of standard locations in which MySQL programs look for them, see Recipe 1.4.) If multiple option files exist and a given parameter is specified in several of them, the last value found takes precedence.

Programs you write do not use MySQL option files unless you tell them to:

  • Perl DBI and Ruby Mysql2 gem provide direct API support for reading option files; simply indicate that you want to use them at the time that you connect to the server. It’s possible to specify that only a particular file should be read, or that the standard search order should be used to look for multiple option files.

  • PHP PDO, Connector/Python, Java, and Go do not support option files. (The PDO MySQL driver does but not if you use mysqlnd as the underlying library.) As a workaround for PHP, we’ll write a simple option-file parsing function. For Java, we’ll adopt a different approach that uses properties files. For Go, we will utilize the INI parsing library.

Although the conventional name under Unix for the user-specific option file is .my.cnf in the current user’s home directory, there’s no rule that your own programs must use this particular file. You can name an option file anything you like and put it wherever you want. For example, you might set up a file named mcb.cnf and install it in the /usr/local/lib/mcb directory for use by scripts that access the cookbook database. Under some circumstances, you might even want to create multiple option files. Then, from within any given script, select the file that’s appropriate for the access privileges the script needs. For example, you might have one option file, mcb.cnf, that lists parameters for a full-access MySQL account, and another file, mcb-readonly.cnf, that lists connection parameters for an account that needs only read-only access to MySQL. Another possibility is to list multiple groups within the same option file and have your scripts select options from the appropriate group.

Perl

Perl DBI scripts can use option files. To take advantage of this, place the appropriate option specifiers in the third component of the Data Source Name (DSN) string:

  • To specify an option group, use mysql_read_default_group=groupname. This tells MySQL to search the standard option files for options in the named group and in the [client] group. Write the groupname value without the surrounding square brackets. (If a group in an option file begins with a [my_prog] line, specify the groupname value as my_prog.) To search the standard files but look only in the [client] group, groupname should be client.

  • To name a specific option file, use mysql_read_default_file=filename in the DSN. When you do this, MySQL looks only in that file and only for options in the [client] group.

  • If you specify both an option file and an option group, MySQL reads only the named file but looks for options both in the named group and in the [client] group.

The following example tells MySQL to use the standard option-file search order to look for options in both the [cookbook] and [client] groups:

my $conn_attrs = {PrintError => 0, RaiseError => 1, AutoCommit => 1};
# basic DSN
my $dsn = "DBI:mysql:database=cookbook";
# look in standard option files; use [cookbook] and [client] groups
$dsn .= ";mysql_read_default_group=cookbook";
my $dbh = DBI->connect ($dsn, undef, undef, $conn_attrs);

The next example explicitly names the option file located in $ENV{HOME}, the home directory of the user running the script. Thus, MySQL looks only in that file and uses options from the [client] group:

my $conn_attrs = {PrintError => 0, RaiseError => 1, AutoCommit => 1};
# basic DSN
my $dsn = "DBI:mysql:database=cookbook";
# look in user-specific option file owned by the current user
$dsn .= ";mysql_read_default_file=$ENV{HOME}/.my.cnf";
my $dbh = DBI->connect ($dsn, undef, undef, $conn_attrs);

If you pass an empty value (undef or the empty string) for the username or password arguments of the connect() call, connect() uses whatever values are found in the option file or files. A nonempty username or password in the connect() call overrides any option-file value. Similarly, a host named in the DSN overrides any option-file value. Use this behavior to enable DBI scripts to obtain connection parameters both from option files as well as from the command line as follows:

  1. Create $host_name, $user_name, and $password variables, each with a value of undef. Then parse the command-line arguments to set the variables to non-undef values if the corresponding options are present on the command line. (The cmdline.pl Perl script under the api directory of the recipes distribution demonstrates how to do this.)

  2. After parsing the command arguments, construct the DSN string, and call connect(). Use mysql_read_default_group and mysql_read_default_file in the DSN to specify how you want option files to be used, and, if $host_name is not undef, add host=$host_name to the DSN. In addition, pass $user_name and $password as the username and password arguments to connect(). These will be undef by default; if they were set from the command-line arguments, they will have non-undef values that override any option-file values.

If a script follows this procedure, parameters given by the user on the command line are passed to connect() and take precedence over the contents of option files.

Ruby

Ruby Mysql2 scripts can read option files, specified by the default_file connection parameter. If you want to specify the default group, use the default_group option.

This example uses the standard option-file search order to look for options in both the [cookbook] and [client] groups:

client = Mysql2::Client.new(:default_group => "cookbook", :database => "cookbook")

The following example uses the .my.cnf file in the current user’s home directory to obtain parameters from the [client] group:

client = Mysql2::Client.new(:default_file => "#{ENV['HOME']}/.my.cnf",
:database => "cookbook")

PHP

As mentioned earlier, the PDO MySQL driver does not necessarily support using MySQL option files (it does not if you use mysqlnd as the underlying library). To work around that limitation, use a function that reads an option file, such as the read_mysql_option_file() function shown in the following listing. It takes as arguments the name of an option file and an option group name or an array containing group names. (Group names should be written without square brackets.) It then reads any options present in the file for the named group or groups. If no option group argument is given, the function looks by default in the [client] group. The return value is an array of option name/value pairs, or FALSE if an error occurs. It is not an error for the file not to exist. (Note that quoted option values and trailing #-style comments following option values are legal in MySQL option files, but this function does not handle those constructs.):

function read_mysql_option_file ($filename, $group_list = "client")
{
  if (is_string ($group_list))           # convert string to array
    $group_list = array ($group_list);
  if (!is_array ($group_list))           # hmm ... garbage argument?
    return (FALSE);
  $opt = array ();                       # option name/value array
  if (!@($fp = fopen ($filename, "r")))  # if file does not exist,
    return ($opt);                       # return an empty list
  $in_named_group = 0;  # set nonzero while processing a named group
  while ($s = fgets ($fp, 1024))
  {
    $s = trim ($s);
    if (preg_match ("/^[#;]/", $s))              # skip comments
      continue;
    if (preg_match ("/^\[([^]]+)]/", $s, $arg))  # option group line
    {
      # check whether we are in one of the desired groups
      $in_named_group = 0;
      foreach ($group_list as $group_name)
      {
        if ($arg[1] == $group_name)
        {
          $in_named_group = 1;    # we are in a desired group
          break;
        }
      }
      continue;
    }
    if (!$in_named_group)         # we are not in a desired
      continue;                   # group, skip the line
    if (preg_match ("/^([^ \t=]+)[ \t]*=[ \t]*(.*)/", $s, $arg))
      $opt[$arg[1]] = $arg[2];    # name=value
    else if (preg_match ("/^([^ \t]+)/", $s, $arg))
      $opt[$arg[1]] = "";         # name only
    # else line is malformed
  }
  return ($opt);
}

Here are two examples showing how to use read_mysql_option_file(). The first reads a user’s option file to get the [client] group parameters and uses them to connect to the server. The second reads the system-wide option file, /etc/my.cnf, and prints the server startup parameters that are found there (that is, the parameters in the [mysqld] and [server] groups):

$opt = read_mysql_option_file ("/home/paul/.my.cnf");
$dsn = "mysql:dbname=cookbook";
if (isset ($opt["host"]))
  $dsn .= ";host=" . $opt["host"];
$user = $opt["user"];
$password = $opt["password"];
try
{
  $dbh = new PDO ($dsn, $user, $password);
  print ("Connected\n");
  $dbh = NULL;
  print ("Disconnected\n");
}
catch (PDOException $e)
{
  print ("Cannot connect to server\n");
}

$opt = read_mysql_option_file ("/etc/my.cnf", array ("mysqld", "server"));
foreach ($opt as $name => $value)
  print ("$name => $value\n");

PHP does have a parse_ini_file() function that is intended for parsing .ini files. These have a syntax that is similar to MySQL option files, so you might find this function of use. However, there are some differences to watch out for. Suppose that you have a file written like this:

[client]
user=paul

[client]
host=127.0.0.1

[mysql]
no-auto-rehash

Standard MySQL option parsing considers both the user and host values part of the [client] group, whereas parse_ini_file() returns only the contents of the final [client] stanza; the user option is lost. Also, parse_ini_file() ignores options that are given without a value, so the no-auto-rehash option is lost.

Go

The Go-MySQL-Driver doesn’t support option files. However, the INI parsing library supports reading properties files that contain lines in the name=value format. Here is a sample properties file:

# this file lists parameters for connecting to the MySQL server
[client]
user=cbuser
password=cbpass
host=localhost

The MyCnf() function shows one way to read a properties file named ~/.my.cnf to obtain connection parameters:

import (
        "fmt"
        "os"
        "gopkg.in/ini.v1"
)

// Configuration Parser
func MyCnf(client string) (string, error) {
    cfg, err := ini.LoadSources(ini.LoadOptions{AllowBooleanKeys: true}, 
                                os.Getenv("HOME")+"/.my.cnf")
    if err != nil {
        return "", err
    }
    for _, s := range cfg.Sections() {
        if client != "" && s.Name() != client {
            continue
        }
        host := s.Key("host").String()
        port := s.Key("port").String()
        dbname := s.Key("dbname").String()
        user := s.Key("user").String()
        password := s.Key("password").String()
        return fmt.Sprintf("%s:%s@tcp(%s:%s)/%s", user, password, host, port, dbname),
               nil
    }
    return "", fmt.Errorf("No matching entry found in ~/.my.cnf")
}

The function MyCnf() defined in the cookbook.go, developed elsewhere in the chapter (see Recipe 4.3). It is used in the file mycnf.go, which you will find in the directory api/06_conn_params in the recipes distribution:

// mycnf.go : Reads ~/.my.cnf file for DSN construct
package main

import (
	"fmt"
	"github.com/svetasmirnova/mysqlcookbook/recipes/lib"
)

func main() {
    fmt.Println("Calling db.MyCnf()")
    var dsn string

    dsn, err := cookbook.MyCnf("client")
    if err != nil {
	  fmt.Printf("error: %v\n", err)
    } else {
	  fmt.Printf("DSN is: %s\n", dsn)
    }
}

The MyCnf() function accepts the section name as a parameter. If you want to replace the [client] section with any other name, change MyCnf() to MyCnf("other"), where other is the name of the section.

Java

The JDBC MySQL Connector/J driver doesn’t support option files. However, the Java class library supports reading properties files that contain lines in the name=value format. This is similar but not identical to the MySQL option-file format (for example, properties files do not permit [groupname] lines). Here is a simple properties file:

# this file lists parameters for connecting to the MySQL server
user=cbuser
password=cbpass
host=localhost

The following program, ReadPropsFile.java, shows one way to read a properties file named Cookbook.properties to obtain connection parameters. The file must be in some directory named in your CLASSPATH variable, or you must specify it using a full pathname (the example shown here assumes that the file is in a CLASSPATH directory):

import java.sql.*;
import java.util.*;   // need this for properties file support

public class ReadPropsFile {
  public static void main(String[] args) {
    Connection conn = null;
    String url = null;
    String propsFile = "Cookbook.properties";
    Properties props = new Properties();

    try {
      props.load(ReadPropsFile.class.getResourceAsStream(propsFile));
    } catch (Exception e) {
      System.err.println("Cannot read properties file");
      System.exit (1);
    }
    try {
      // construct connection URL, encoding username
      // and password as parameters at the end
      url = "jdbc:mysql://"
            + props.getProperty("host")
            + "/cookbook"
            + "?user=" + props.getProperty("user")
            + "&password=" + props.getProperty("password");
      conn = DriverManager.getConnection(url);
      System.out.println("Connected");
    } catch (Exception e) {
      System.err.println("Cannot connect to server");
    } finally {
      try {
        if (conn != null) {
          conn.close();
          System.out.println("Disconnected");
        }
      } catch (SQLException e) { /* ignore close errors */ }
    }
  }
}

To have getProperty() return a particular default value when the named property is not found, pass that value as a second argument. For example, to use 127.0.0.1 as the default host value, call getProperty() like this:

String hostName = props.getProperty("host", "127.0.0.1");

The Cookbook.java library file developed elsewhere in the chapter (see Recipe 4.3) includes an extra library call in the version of the file that you’ll find in the lib directory of the recipes distribution: a propsConnect() routine that is based on the concepts discussed here. To use it, set up the contents of the properties file, Cookbook.properties, and copy the file to the same location where you installed Cookbook.class. You can then establish a connection within a program by importing the Cookbook class and calling Cookbook.propsConnect() rather than by calling Cookbook.connect().

4.9 Resetting the profile Table

Problem

While working on the examples in this chapter, you changed the original content of the profile table and now want it back, so you can use it while working with other recipes.

Solution

Reload the table using the mysql client.

Discussion

It’s a good idea to reset the profile table used in this chapter to a known state. Change location into the tables directory of the recipes distribution, and run these commands:

$ mysql cookbook < profile.sql
$ mysql cookbook < profile2.sql

Several statements in later chapters use the profile table; by reinitializing it, you’ll get the same results displayed in those chapters when you run the statements shown there.

This chapter discussed the basic operations provided by each of our APIs for handling various aspects of interaction with the MySQL server. These operations enable you to write programs that execute any kind of statement and retrieve the results. Up to this point, we’ve used simple statements because the focus is on the APIs rather than on SQL. The next chapter focuses on SQL instead, to show how to ask the database server more complex questions.

Get MySQL Cookbook, 4th Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.