BUY THIS BOOK
Add to Cart

Print Book $44.95


Add to Cart

Print+PDF $58.44

Add to Cart

PDF $35.99

Safari Books Online

What is this?

Add to UK Cart

Print Book £31.95

What is this?

Looking to Reprint or License this content?


SQL in a Nutshell
SQL in a Nutshell, Second Edition A Desktop Quick Reference

By Kevin Kline
With  Daniel Kline Brand Hunt
Book Price: $44.95 USD
£31.95 GBP
PDF Price: $35.99

Cover | Table of Contents | Colophon


Table of Contents

Chapter 1: SQL History and Implementations
In the early 1970s, the seminal work of IBM research fellow Dr. E. F. Codd led to the development of a relational data model product called SEQUEL, or Structured English Query Language. SEQUEL ultimately became SQL, or Structured Query Language.
IBM, along with other relational database vendors, wanted a standardized method for accessing and manipulating data in a relational database. Although IBM was first to develop relational database theory, Oracle was first to market with the technology. Over time, SQL proved popular enough in the marketplace to attract the attention of the American National Standards Institute (ANSI), which released standards for SQL in 1986, 1989, 1992, 1999, and 2003. Since 1986, competing languages have allowed programmers and developers to access and manipulate relational data. However, few were as easy to learn or as universally accepted as SQL. Programmers and administrators now have the benefit of learning a single language that, with minor adjustments, is applicable to a wide variety of database platforms, applications, and products.
SQL in a Nutshell, Second Edition, describes five common implementations of SQL2003:
  • IBM's DB2 Universal Database Version 8 for Linux, Unix, and Windows
  • MySQL Version 4
  • Oracle Database 10g
  • PostgreSQL Version 7
  • Microsoft's SQL Server 2000
We also cover Sybase Adaptive Server Enterprise (ASE), but to a lesser degree, in Appendix B.
Relational Database Management Systems (RDBMSs), such as those covered in this book, are the primary engines of information systems worldwide, particularly web applications and distributed client/server computing systems. They enable a multitude of users to quickly and simultaneously access, create, edit, and manipulate data without impeding or impacting other users. They also allow developers to write useful applications to access their resources as well as provide administrators with the capabilities they need to maintain, secure, and optimize organizational data resources.
An RDBMS is defined as a system whose users view data as a collection of tables related to each other through common data values. Data is stored in
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Relational Model and ANSI SQL
Relational Database Management Systems (RDBMSs), such as those covered in this book, are the primary engines of information systems worldwide, particularly web applications and distributed client/server computing systems. They enable a multitude of users to quickly and simultaneously access, create, edit, and manipulate data without impeding or impacting other users. They also allow developers to write useful applications to access their resources as well as provide administrators with the capabilities they need to maintain, secure, and optimize organizational data resources.
An RDBMS is defined as a system whose users view data as a collection of tables related to each other through common data values. Data is stored in tables, which are composed of rows and columns. Tables of independent data can be linked (or related) to one another if they each have unique, identifying columns of data (called keys) that represent a data value held in common. E. F. Codd first described relational database theory in his landmark paper "A Relational Model of Data for Large Shared Data Banks," published in the Communications of the ACM (Association for Computing Machinery) in June, 1970. Under Codd's new relational data model, data was structured (into tables of rows and columns); manageable using operations like selections, projections, and joins; and consistent as the result of integrity rules like keys and referential integrity. Codd also articulated rules that governed how a relational database should be designed; the process for applying these rules is now known as normalization.
Codd applied rigorous mathematical theories, primarily set theory, to the management of data, and he compiled a list of criteria a database must meet to be considered relational. At its core, the relational database concept centers around storing data in tables. This concept is now so common as to seem trivial; however, not long ago designing a system capable of sustaining the relational model was considered a long shot with limited usefulness. Following are Codd's
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
History of the SQL Standard
Because SQL dialects began to proliferate, the ANSI published its first SQL standard in 1986 to bring greater conformity among vendors, and a second, widely-adopted standard in 1989. The International Standards Organization (ISO) also approves the SQL standard. ANSI released updates in 1992, known as SQL92 and SQL2, and again in 1999, termed both SQL99 and SQL3. Now that the ANSI standard has advanced, we generally use the industry standard term SQL2003. Each time it revises the SQL standard, ANSI adds new features and incorporates new commands and capabilities into the language. For example, the SQL99 standard added a group of capabilities that handled object-oriented datatype extensions.
SQL99 had two main parts, Foundation:1999 and Bindings:1999. The SQL2003 Foundation section includes all of the Foundation and Bindings standards from SQL99, but a new section called Schemata was created.
The Core requirements of SQL2003 did not change from Core SQL99. So the database platforms that conformed to Core SQL99 automatically conform to SQL2003. Although the Core of SQL2003 has no additions (except for a few new reserved words), a number of individual statements and behaviors have been updated or modified. Because these updates are reflected in the individual syntax descriptions of each statement in Chapter 3, we won't spend time on them here.
A few elements of the Core in SQL99 have been deleted from SQL2003, including:
  • The BIT and BIT VARYING datatypes
  • The UNION JOIN clause
  • The UPDATE...SET ROW statement
In addition, a number of features, most of which were rather obscure, have been added, deleted, or renamed. At this point in time, many of the new features of SQL2003 standard are interesting mostly from an academic standpoint because none of the database platforms support the feature yet. However, there are a few new features that hold more than passing interest:
Elementary OLAP functions
Adds an Online Analytical Processing (OLAP) amendment, including a number of windowing functions to support widely used calculations such as moving averages and cumulative sums. Windowing functions are aggregates computed over a window of data:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
SQL Dialects
The constantly evolving nature of the SQL standard has given rise to a number of SQL dialects among the various vendors and platforms. These dialects most commonly evolved because the user community of a given database vendor required capabilities in the database before the ANSI committee created a standard. Occasionally, though, a new feature is introduced by the academic or research communities due to competitive pressures from competing technologies. For example, many database vendors are augmenting their current programmatic offerings with Java (as is the case with DB2, Oracle, and Sybase) or VBScript (as Microsoft is doing). In the future, programmers and developers will use these programming languages in concert with SQL to build SQL programs.
Many of these dialects include conditional processing capabilities (such as that controlled through IF . . . THEN statements), control-of-flow functions (such as WHILE loops), variables, and error handling. Because ANSI had not yet developed a standard for these important features at the time users began to demand them, RDBMS developers and vendors were free to create their own commands and syntax. In fact, some of the earliest vendors from the 1980s have variances in the most elementary commands, such as SELECT, because their implementations predate the standards. ANSI is now refining standards that address these inconsistencies.
Some of these dialects have introduced procedural commands to support the functionality of a much more complete programming language. For example, these procedural implementations contain error-handling commands, control-of-flow language, conditional commands, variable handling, arrays, and many other extensions. Although these are technically divergent procedural implementations, they are called dialects here. The SQL/PSM (Persistent Stored Module) package provides many features associated with programming stored procedures and incorporates many of the extensions offered by these dialects.
Some popular dialects of SQL include:
PL/SQL
Found in Oracle. PL/SQL stands for Procedural Language/SQL and contains many similarities to the language Ada.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: Foundational Concepts
SQL provides an easy, intuitive way to interact with a database. While the SQL2003 standard does not define the concept of a "database," it does define all the functions and concepts needed for a user to create, retrieve, update, and delete data. It is important to know the types of syntax in the SQL2003 standard and the particular platform-specific syntax guidelines.
SQL in a Nutshell, Second Edition, describes the SQL standard and the platform-specific implementations of five leading RDBMSs:
DB2
IBM's popular DBMS runs on hardware platforms ranging from PCs all the way up to huge mainframes. It runs on a multitude of operating systems, including the burgeoning Linux OS. DB2 is very popular in corporate environments, especially those with a healthy investment in IBM software, equipment, or services. This book covers DB2 Universal Database Version 8.1 for Linux, Unix, and Windows.
MySQL
MySQL is a popular open source DBMS that is known for its blistering performance. It runs on numerous operating systems, including most Linux variants. To improve performance, it has a slimmer feature set than do many other DBMSs. This book covers MySQL 4.0.
Oracle
Oracle is a leading RDBMS in the commercial sector. It runs on a multitude of operating systems and hardware platforms. Its scalable, reliable architecture has made it the platform of choice for many users. SQL in a Nutshell, Second Edition, covers Oracle Database 10g.
PostgreSQL
PostgreSQL is the most feature-rich open source database platform available. Where MySQL is best known for its high-speed performance, PostgreSQL is best known for its excellent support for ANSI standards and robust transaction processing capabilities, as well as rich datatype and database object support. In addition to its full set of features, PostgreSQL runs on a wide variety of operating systems and hardware platforms. This book covers PostgreSQL 7.2.
SQL Server
Microsoft SQL Server is a popular RDBMS that runs only on the Windows platform. Its features include ease of use, an all-inclusive feature set, low cost, and high performance. This book covers Microsoft SQL Server 2000.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Database Platforms Described in This Book
SQL in a Nutshell, Second Edition, describes the SQL standard and the platform-specific implementations of five leading RDBMSs:
DB2
IBM's popular DBMS runs on hardware platforms ranging from PCs all the way up to huge mainframes. It runs on a multitude of operating systems, including the burgeoning Linux OS. DB2 is very popular in corporate environments, especially those with a healthy investment in IBM software, equipment, or services. This book covers DB2 Universal Database Version 8.1 for Linux, Unix, and Windows.
MySQL
MySQL is a popular open source DBMS that is known for its blistering performance. It runs on numerous operating systems, including most Linux variants. To improve performance, it has a slimmer feature set than do many other DBMSs. This book covers MySQL 4.0.
Oracle
Oracle is a leading RDBMS in the commercial sector. It runs on a multitude of operating systems and hardware platforms. Its scalable, reliable architecture has made it the platform of choice for many users. SQL in a Nutshell, Second Edition, covers Oracle Database 10g.
PostgreSQL
PostgreSQL is the most feature-rich open source database platform available. Where MySQL is best known for its high-speed performance, PostgreSQL is best known for its excellent support for ANSI standards and robust transaction processing capabilities, as well as rich datatype and database object support. In addition to its full set of features, PostgreSQL runs on a wide variety of operating systems and hardware platforms. This book covers PostgreSQL 7.2.
SQL Server
Microsoft SQL Server is a popular RDBMS that runs only on the Windows platform. Its features include ease of use, an all-inclusive feature set, low cost, and high performance. This book covers Microsoft SQL Server 2000.
In addition to the five databases just mentioned, Sybase Adaptive Server Enterprise (ASE) Version 12.5 is covered separately in Appendix A. Because Sybase and SQL Server have common roots, the appendix discusses the differences between the generally usable SQL Server syntax and those specifics and extensions offered by Sybase ASE.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Categories of Syntax
To begin to use SQL, you must understand how statements are written. SQL syntax falls into four main categories. Each category is introduced in the following list and then explained in further detail in the sections that follow:
Identifiers
Describe a user- or system-supplied name for database objects, such as a database, a table, a constraint on a table, the columns in a table, a view, etc.
Literals
Describe a user- or system-supplied string or value that is not otherwise an identifier or a keyword. Literals may be strings like "hello", numbers like 1234, dates like "Jan 01, 2002", or Boolean values like TRUE.
Operators
Are symbols specifying an action performed on one or more expressions, most often in DELETE , INSERT, SELECT, or UPDATE statements. Operators are also used frequently in the creation of database objects.
Reserved words and keywords
Have special meaning to the database SQL parser, such as SELECT, GRANT, DELETE, or CREATE. Reserved words are words that cannot be used as identifiers within the database platform, usually commands or SQL statements. Keywords are words that may become reserved some time in the future. You can circumvent the restriction on using reserved words and keywords as identifiers through quoted identifiers, described in a moment. This is not recommended! Elsewhere in the book (besides here), we use the term keyword to describe both concepts.
Keep in mind that RDBMSs are built upon set theory. In ANSI terms, clusters contain sets of catalogs, catalogs contain sets of schemas, schemas contain sets of objects, and so on. Most database platforms use corollary terms: instances contain one or more databases; databases contain one or more schemas; schemas contain one or more tables, views, or stored procedures, and the privileges associated with each object. At each level of this structure, items require a unique name (that is, an identifier) so that they can be referenced by programs and system processes.This means that each
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
SQL2003 and Platform-Specific Datatypes
A table can contain one or many columns. Each column must be defined with a datatype that provides a general classification of the data that the column will store. In real-world applications, datatypes provide some control and efficiency as to how tables are defined and how the data is stored within the table. Using specific datatypes enables better, more understandable queries and helps control the integrity of data.
The tricky thing about SQL2003 datatypes is that they do not always map directly to an identical implementation in a given platform. Although the platforms specify "datatypes" that correspond to the SQL2003 datatypes, these are not always true SQL2003 datatypes. For example, MySQL's implementation of a BIT datatype is actually identical to a CHAR(1) datatype value. Nonetheless, each of the platform datatypes is close enough to the standard to be both easily understandable and job-ready.
The official SQL2003 datatypes (as opposed to platform-specific datatypes) fall into the general categories described in Table 2-8. (Note that the SQL2003 standard contains a few rarely used datatypes (ARRAY, MULTISET, REF, and ROW) that are shown only in Table 2-8 and not elsewhere in the book.)
Table 2-8: SQL2003 categories and datatypes
Category
Example datatypes and abbreviations
Description
BINARY
BINARY LARGE OBJECT (BLOB)
This datatype stores binary string values in hexadecimal format. Binary string values are stored without reference to any character set and without any length limit.
BOOLEAN
BOOLEAN
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Constraints
Constraints allow you to automatically enforce the integrity of data and to filter the data that is placed in a database. In a sense, constraints are rules that define which data values are valid during INSERT, UPDATE, and DELETE operations. When a data modification transaction breaks the rules of a constraint, the transaction is rejected.
In the ANSI standard, there are four constraint types: CHECK, PRIMARY KEY, UNIQUE, and FOREIGN KEY. (The RDBMS platforms may allow more. Refer to different platform sections of the to find out these exceptions.)
Constraints may be applied at the column-level or the table-level:
Column-level constraints
Declared as part of a column definition and apply only to that column.
Table-level constraints
Declared independently from any column definitions (traditionally, at the end of a CREATE TABLE statement) and may apply to one or more columns in the table. A table constraint is required when you wish to define a constraint that applies to more than one column.
Constraints are defined when you create or alter a table. The general syntax for constraints is shown here:
CONSTRAINT [constraint_name] constraint_type [(column [,...])] 
[predicate] [constraint_deferment] [deferment_timing]
The syntax elements are as follows:
CONSTRAINT [constraint_name]
Begins a constraint definition and, optionally, provides a name for the constraint. When you omit constraint_name, the system will create one for you automatically. On some platforms, like DB2, you may omit the CONSTRAINT keyword as well. System-generated names are often incomprehensible. It is good practice to specify human-readable, sensible names for constraints.
constraint_type
Declares the constraint as one of the allowable types: CHECK, PRIMARY KEY, UNIQUE, or FOREIGN KEY. More information about each type of constraint appears later in this section.
column [,...]
Associates one or more columns with the constraint. Specify the columns in a comma-delimited list. The column list must be enclosed in parentheses. The column list should be omitted for column-level constraints. Columns are not used in every constraint. For example,
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 3: SQL Statement Command Reference
This chapter is the heart of SQL in a Nutshell, Second Edition. It is an alphabetical listing of SQL commands with detailed explanations and examples. Each command and function is identified in Table 3-1 as being "supported," "supported with variations," "supported with limitations," or "not supported," for each of the five SQL dialects covered in this book: DB2, MySQL, Oracle, PostgreSQL, SQL Server. After a brief description of the SQL2003 standard, each vendor application is discussed briefly but thoroughly, with supporting examples. If a specific platform does not support a particular command, that fact will be noted in the table that heads the command and then the command will not be listed for that platform. Similarly, although SQL in a Nutshel l, Second Edition, is not a comprehensive review of the SQL2003 standard, each command has been validated against SQL2003.
When researching a command in this chapter:
  1. Read Section 3.2.
  2. Check the platform support table.
  3. Read the section on SQL2003 syntax and description even if you are looking for a specific platform implementation.
  4. Finally, read the specific platform implementation information.
Any common features between the platform implementations of a command are discussed and compared against the SQL2003 topic. Thus, the subsection on a platform's implementation of a particular command may not describe every aspect of that command, since some of its details may be covered in the SQL2003 section. Please note that if there is a keyword that appears in a command syntax but not in its keyword description, this is because we chose not to repeat descriptions that appear under the ANSI entry.
If you skip straight to a platform implementation and do not see a discussion of a keyword or clause that you are interested in, that means it is described (in common across all supporting platforms) under the SQL2003 entry of the command.
Table 3-1 provides a listing of the SQL statements, the platforms that support them, and the degree to which they support them. The following list offers useful tips for reading Table 3-1, as well as what each abbreviation stands for. The sections that follow describe the commands from the table in detail:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
How to Use This Chapter
When researching a command in this chapter:
  1. Read Section 3.2.
  2. Check the platform support table.
  3. Read the section on SQL2003 syntax and description even if you are looking for a specific platform implementation.
  4. Finally, read the specific platform implementation information.
Any common features between the platform implementations of a command are discussed and compared against the SQL2003 topic. Thus, the subsection on a platform's implementation of a particular command may not describe every aspect of that command, since some of its details may be covered in the SQL2003 section. Please note that if there is a keyword that appears in a command syntax but not in its keyword description, this is because we chose not to repeat descriptions that appear under the ANSI entry.
If you skip straight to a platform implementation and do not see a discussion of a keyword or clause that you are interested in, that means it is described (in common across all supporting platforms) under the SQL2003 entry of the command.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
SQL Platform Support
Table 3-1 provides a listing of the SQL statements, the platforms that support them, and the degree to which they support them. The following list offers useful tips for reading Table 3-1, as well as what each abbreviation stands for. The sections that follow describe the commands from the table in detail:
  1. The first column contains the alphabetized SQL commands.
  2. The SQL statement class for each command is indicated in the second column.
  3. The subsequent columns list the vendor's level of support:
Supported (S)
The platform supports the SQL2003 standard for the particular command.
Supported, with variations (SWV)
The platform supports the SQL2003 standard for the particular command, using vendor-specific code or syntax.
Supported, with limitations (SWL)
The platform supports some but not all of the functions specified by the SQL2003 standard for the particular command.
Not supported (NS)
The platform does not support the particular command according to the SQL2003 standard.
Remember that even if a specific SQL2003 command is listed as "Not supported," the platform usually has alternative coding or syntax to enact the same command or function. Therefore, be sure to read the discussion and examples for each command later in this chapter. Likewise, a few of the commands in Table 3-1 are not found in the SQL2003 standard, and these have been indicated with the term "Non-ANSI" under the heading "SQL2003 class" in the table.
Conversely, a few ANSI commands in the table (such as CREATE DOMAIN and ALTER DOMAIN) are not currently supported by any of the database platforms covered in this text. Since this book focuses on the implementation of the SQL language, unsupported ANSI commands are shown in Table 3-1 but are not documented elsewhere in the book.
Table 3-1: Alphabetical quick SQL command reference
SQL2003 Command
SQL2003 class
DB2
v8.1
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
SQL Command Reference
ALL/ANY/SOME Operators
The ALL operator performs a Boolean test of a subquery for the existence of a value in all rows. The ANY operator, and its synonym SOME, performs a Boolean test of a subquery for the existence of a value in any of the rows tested.
Platform
Command
DB2
Supported
MySQL
Not supported
Oracle
Supported
PostgreSQL
Supported
SQL Server
Supported
SQL2003 Syntax
SELECT ...
WHERE expression comparison {ALL | ANY | SOME} ( subquery )
Keywords
WHERE expression
Tests a scalar expression, such as a column, against every value in the subquery, for ALL, and against every value until a match is found for ANY and SOME. All rows must match the expression to return a Boolean TRUE value for the ALL operator, while one or more rows must match the expression, for ANY and SOME, to return a Boolean TRUE value.
comparison
Compares the expression to the subquery. The comparison must be a standard comparison operator like =, <>, !=, >, >=, <, or <=.
Rules at a Glance
The ALL operator returns a Boolean TRUE value when one of two things happen: either the subquery returns an empty set (i.e., no records) or every record in the set meets the comparison. ALL returns FALSE when any records in the set do not match the value comparison.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 4: SQL Functions
A function is a special type of command word in the SQL command set, and each SQL dialect varies in its implementation of the command set. In effect, functions are one-word commands that return single values. The value of a function can be determined by input parameters, as with a function that averages a list of database values. But many functions do not use any type of input parameter, such as the function that returns the current system time, CURRENT_TIME.
The ANSI standard supports a number of useful functions. This chapter covers those functions, providing detailed descriptions and examples for each platform. In addition, each database maintains a long list of their own internal functions that are outside of the scope of the standard SQL. This chapter provides parameters and descriptions for each database implementation's internal functions.
In addition, most database platforms support the ability to create user-defined functions (UDFs). For more information on UDFs, refer to the .
There are different ways to categorize functions into groups. The following subsections describe distinctions that are critical to understand how functions work.
Functions can be either deterministic or nondeterministic . A deterministic function always returns the same results if given the same input values. A nondeterministic function may return different results every time it is called, even when the same input values are provided.
Why is it important that a given input always returns the same output? It is important because of how functions may be used within views, in user-defined functions, and in stored procedures. Restrictions vary across implementations, but these objects sometimes allow only deterministic functions within their defining code. For example, SQL Server allows the creation of an index on a column expression—as long as the expression does not contain nondeterministic functions. Rules and restrictions vary between the platforms, so check the specific documentation when using functions.
Another way of categorizing functions is in terms of whether they operate on values from just one row at a time, on values from a collection, or on a set of rows.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Types of Functions
There are different ways to categorize functions into groups. The following subsections describe distinctions that are critical to understand how functions work.
Functions can be either deterministic or nondeterministic . A deterministic function always returns the same results if given the same input values. A nondeterministic function may return different results every time it is called, even when the same input values are provided.
Why is it important that a given input always returns the same output? It is important because of how functions may be used within views, in user-defined functions, and in stored procedures. Restrictions vary across implementations, but these objects sometimes allow only deterministic functions within their defining code. For example, SQL Server allows the creation of an index on a column expression—as long as the expression does not contain nondeterministic functions. Rules and restrictions vary between the platforms, so check the specific documentation when using functions.
Another way of categorizing functions is in terms of whether they operate on values from just one row at a time, on values from a collection, or on a set of rows. Aggregate functions operate against a collection of values and return a single summarizing value. Scalar functions return a single value based on scalar input arguments. Some scalar functions, such as CURRENT_TIME, do not require any arguments.
Window functions can be thought of as being similar to aggregate functions in that they operate over many rows at one time. The difference lies in how you define those rows. Aggregate functions operate over the sets of rows defined by a query's GROUP BY clause. With window functions, you specify the set of rows for each function call, and different invocations of a function within the same query can execute over different sets of rows.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
ANSI SQL Aggregate Functions
Aggregate functions return a single value based upon a set of other values. If used among other expressions in the item list of a SELECT statement, the SELECT must have a GROUP BY or HAVING clause. No GROUP BY or HAVING clause is required if the aggregate function is the only value retrieved by the SELECT statement. The supported aggregate functions and their syntax are listed in Table 4-1.
Table 4-1: ANSI SQL aggregate functions
Function
Usage
AVG(expression)
Computes the average value of a column given by expression.
CORR(dependent, independent)
Computes a correlation coefficient.
COUNT(expression)
Counts the rows defined by the expression.
COUNT(*)
Counts all rows in the specified table or view.
COVAR_POP ( 
 
dependent, independent)
Computes population covariance.
COVAR_SAMP( 
 
dependent, independent)
Computes sample covariance.
CUME_DIST( 
 
value_list) WITHIN GROUP (ORDER BY sort_list)
Computes the relative rank of a hypothetical row within a group of rows, where the rank is equal to the number of rows less than or equal to the hypothetical row divided by the number of rows in the group.
DENSE_RANK( 
 
value_list) WITHIN GROUP (ORDER BY sort_list)
Generates a dense rank (no ranks are skipped) for a hypothetical row (
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
ANSI SQL Window Functions
SQL2003 allows for a window_clause in aggregate function calls, the addition of which makes those functions into window functions. Both Oracle and DB2 support this window function syntax. This section describes how to use the window_clause within Oracle and DB2.
Oracle tends to refer to window functions as analytic functions.
Window, or analytic, functions are similar to standard aggregate functions in that they operate on multiple rows, or groups of rows, within the result set returned from a query. However, the groups of rows that a window function operates on are defined not by a GROUP BY clause, but by partitioning and windowing clauses. Furthermore, the order within these groups is defined by an ordering clause, but that order only affects function evaluation, and has no effect on the order in which rows are returned by the query.
Window functions are the last items in a query to be evaluated except for the ORDER BY clause. Because of this late evaluation, window functions cannot be used within the WHERE, GROUP BY, or HAVING clauses.
SQL2003 specifies the following syntax for window functions:
FUNCTION_NAME(expr) OVER {window_name|(window_specification)}

window_specification ::= [window_name][partitioning][ordering][framing]

partitioning ::= PARTITION BY value [, value...] [COLLATE collation_name]

ordering ::= ORDER [SIBLINGS] BY rule [, rule...]

rule ::= {value|position|alias} [ASC|DESC] [NULLS {FIRST|LAST}]

framing ::= {ROWS|RANGE} {start|between} [exclusion]

start ::= {UNBOUNDED PRECEDING|unsigned-integer PRECEDING|CURRENT ROW}

between ::= BETWEEN bound AND bound

bound ::= {start|UNBOUNDED FOLLOWING|unsigned-integer FOLLOWING}

exclusion ::= {EXCLUDE CURRENT ROW|EXCLUDE GROUP
              |EXCLUDE TIES|EXCLUDE NO OTHERS}
Oracle's window function syntax is as follows:
FUNCTION_NAME(expr) OVER (window_clause)

window_clause ::= [partitioning] [ordering [framing]]

partitioning ::= PARTITION BY value [, value...]

ordering ::= ORDER [SIBLINGS] BY rule [, 
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
ANSI SQL Scalar Functions
The ANSI SQL scalar functions return a single value each time they are invoked. The SQL standard provides many scalar functions that can be used to manipulate date and time types, strings, and numbers, as well as retrieve system information such as the current user or login name. Scalar functions fall into the categories listed in Table 4-2.
Table 4-2: Categories of scalar functions
Function category
Explanation
Built-in
Performs operations on values or settings built into the database.
Oracle uses the term "built-in" to describe all the specialty functions that are provided by Oracle, and thus "built into" their DBMS. This is a distinct and separate usage from the "built-in" functions described here.
CASE and CAST
While these two functions operate on scalar input values, they are in a category all their own. CASE supplies IF-THEN logic to SQL statements and CAST can convert values from one datatype to another.
Date and Time
Performs operations on temporal datatypes and returns values in a temporal datatype format. There is no SQL2003 function that operates on a temporal datatype and returns a temporal result. The closest function is EXTRACT (covered in Section 4.4.3 later in this chapter), which operates on temporal values and returns numeric values. Functions returning temporal values but operating on no arguments are covered in the later Section 4.4.1.
Numeric
Performs operations on numeric values and returns numeric values.
String
Performs operations on character values (e.g.,
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Platform-Specific Extensions
The following sections provide a full listing and description of each vendor-supported function. The functions are platform-specific. Thus, a MySQL function, for example, is not guaranteed to be supported by any other vendor.
This section provides an alphabetical listing of DB2-supported functions that are specific to DB2 and not from the ANSI standard, with examples and corresponding results.
ABSVAL( number )
Synonym for ABS( number ). For example:
VALUES( ABSVAL( -1 ) ) -> 1
ACOS( number )
Returns the arc cosine of a number ranging from -1 to 1. The result ranges from 0 to and is expressed in radians. For example:
SELECT ACOS( 0 ) -> 1.570796
ASCII( text )
Returns the ASCII code of the first character of text. For example:
SELECT ASCII('x') -> 120
ASIN( number )
Returns the arc sine of number ranging from -1 to 1. The resulting value ranges from -/2 to /2 and is expressed in radians. For example:
SELECT ASIN( 0 ) -> 0.000000
ATAN( number )
Returns the arctangent of any number. The resulting value ranges from -/2 to /2 and is expressed in radians. For example:
SELECT ATAN( 3.1415 ) -> 1.262619
ATAN2( number,nbr )
Returns the arctangent of number and nbr. The values for number and nbr are not restricted, but the results range from - to and are expressed in radians. ATAN2( x,y ) is similar to ATAN( y/x ) with the exception that the signs of
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 5: Database Programming
While SQL plays an important role in standardizing the communication with different RDBMSs, a missing piece still remains for those who want to write database software applications. The missing piece is the database programming Application Programming Interface (API) used to transport SQL statements to and process their results from an RDBMS. While all the database platforms discussed in this book provide their own proprietary interfaces for database application developers, this chapter focuses on two commonly used APIs that provide a consistent interface across database platforms. Specifically, this chapter introduces you to:
ADO.NET
ADO.NET is Microsoft's high-level database programming API on the .NET platform. The ADO.NET API is a collection of .NET interfaces that are accessible from any of the .NET languages. The primary benefits of ADO.NET are ease of use, portability within the .NET platform, XML integration, and access to data sources other than relational databases. The ADO.NET examples covered in this book are written in C#; however, ADO.NET is also available from Visual Basic and the other .NET languages.
JDBC
JDBC, or Java Database Connectivity, was developed primarily by Sun Microsystems to be the primary database programming API for the Java language. JDBC is the most popular database programming API for the Java language and offers operating-system portability, reasonable performance for most applications, and a well-documented interface. In addition, drivers for most database systems are typically free. This chapter covers JDBC Version 3.0. For additional information, please browse http://java.sun.com/jdbc.
As a quick desk reference, this chapter won't provide all the information needed to develop a large enterprise database application. However, we give you enough to get started by covering components that are common to all database applications, both large and small.
Developing successful database applications, large and small, involves many steps. Careful thought must be given to application architecture, and especially to the following issues:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Database Programming Overview
Developing successful database applications, large and small, involves many steps. Careful thought must be given to application architecture, and especially to the following issues:
  • How to map application data, which is typically object-oriented, to a relational database
  • How to handle errors gracefully
  • How to maximize performance and scalability
A typical database application will require many different SQL statements. The management of so many statements is simplified by the fact that all SQL statements will follow roughly the same pattern of execution within an application. Figure 5-1 is a state diagram showing how SQL statements are prepared, executed, and then processed by a database application when interacting with a relational database system. The state diagram has been broken down into eleven steps, four of which are optional (and are indented in the diagram).
Figure 5-1: Statement execution state diagram
Following are detailed descriptions of each step shown in Figure 5-1:
  1. Establish connectivity : Establishing connectivity is the first step in every successful database application. It is in this step that the client, or database application, makes a physical connection to the database that will be used to transmit the SQL statements to the database and the results back to the client. The actual physical connection could be over a LAN, WAN, or even a simple logical connection back to the client in cases where the database application and server are running on the same machine. For more information on how to establish connectivity, please see the section Section 5.2 later in this chapter.
  2. Begin transaction (optional) : A transaction may be begun so that the database changes may be rolled back on failure or committed on success. For more information on transaction control from database programming APIs, please see Section 5.4 later in this chapter.
  3. Create statement object : Most modern database programming APIs are object-oriented and therefore use an object to represent a SQL statement. There will typically be one statement object per SQL statement executed within the application. The statement object holds the state information required to execute the SQL statement, such as the SQL statement itself and input parameters when they exist.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Opening a Database Connection
Before interacting with a database, an application must first establish a connection to the database server. The APIs covered in this book abstract the low-level connectivity details into a few simple object-oriented classes, making it easier to focus on the database application instead of protocols and network topology.
Opening a connection with ADO.NET requires instantiating a connection object with a properly formatted connection string and then invoking the Open method on the connection object. The connection object can be an OdbcConnection, SqlConnection, or an OleDbConnection. The OdbcConnection is designed for any ODBC datasource, and the OleDbConnection type will work with any OLE DB Provider. For the highest performance data access, use connection objects specifically tuned to the specific database platform, such as SqlConnection for Microsoft SQL Server. Following is the syntax for creating a Connection object in ADO.NET:
{Odbc|OleDb|Sql}Connection connection = 
   new {Odbc|OleDb|Sql}Connection(connection_string);
connection.Open( );
The format of the connection string is the same for all of the connection types. The format is a string of key/value pairs delimited by semicolons. For example:
key1=value1; key2=value2; key3=value3; ...
While the format is the same for every connection type, the keys and values are quite different. Tables Table 5-1 through Table 5-3 list the attributes for the three connection types listed above. Many database platforms support additional attributes that can also be set through the connection string. For a list of these attributes, please consult the appropriate database vendor documentation.
Following are examples of two connection strings for an OdbcConnection:
DSN=MyOracleDSN; UID=scott; PWD=tiger;
DRIVER={SQL Server};SERVER=(local);UID=sa;PWD=;DATABASE=pubs;
The first string connects to a Data Source Name, or DSN, with the name MyOracleDSN using the username scott and the password tiger.
The second string connects using the SQL Server driver to a database named pubs on the local server. The username is
Additional content appearing in thi