BUY THIS BOOK

Safari Books Online

What is this?

Looking to Reprint this content?


Oracle Parallel Processing By Tushar Mahapatra, Sanjay Mishra
August 2000
Pages: 288

Cover | Table of Contents | Colophon


Table of Contents

Chapter 1: Introduction
Parallel processing is becoming increasingly important in the world of database computing. These days, databases often grow to enormous sizes and are accessed by larger and larger numbers of users. This growth strains the ability of single-processor and single-computer systems to handle the load. More and more organizations are turning to parallel processing technologies to give them the performance, scalability, and reliability they need. Oracle Corporation is a leader in providing parallel processing technologies in a wide range of products. This chapter provides an overview of parallel processing in general and also describes how parallel processing features are implemented in an Oracle environment.
Parallel processing involves taking a large task, dividing it into several smaller tasks, and then working on each of those smaller tasks simultaneously. The goal of this divide-and-conquer approach is to complete the larger task in less time than it would have taken to do it in one large chunk.
Your local grocery store provides a good, real-life analogy to parallel processing. Your grocer must collect money from customers for the groceries they purchase. He could install just one checkout stand, with one cash register, and force everyone to go through the same line. However, the line would move slowly, people would get fidgety, and some would go elsewhere to shop. To speed up the process, your grocer doubtless uses several checkout stands, each with a cash register of its own. This is parallel processing at work. Instead of checking out one customer at a time, your grocer can now handle several at a time.
In our grocery store analogy, parallel processing required several checkout stands, each with its own cash register. Without trying to push the analogy too far, think of each checkout stand as a computer and each cash register as a processor. In a computing environment, the multiple processors in a parallel processing system may all reside on the same computer, or they may be spread across separate computers. When they are spread across separate computers, each computer is referred to as a
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
About Parallel Processing
Parallel processing involves taking a large task, dividing it into several smaller tasks, and then working on each of those smaller tasks simultaneously. The goal of this divide-and-conquer approach is to complete the larger task in less time than it would have taken to do it in one large chunk.
Your local grocery store provides a good, real-life analogy to parallel processing. Your grocer must collect money from customers for the groceries they purchase. He could install just one checkout stand, with one cash register, and force everyone to go through the same line. However, the line would move slowly, people would get fidgety, and some would go elsewhere to shop. To speed up the process, your grocer doubtless uses several checkout stands, each with a cash register of its own. This is parallel processing at work. Instead of checking out one customer at a time, your grocer can now handle several at a time.
In our grocery store analogy, parallel processing required several checkout stands, each with its own cash register. Without trying to push the analogy too far, think of each checkout stand as a computer and each cash register as a processor. In a computing environment, the multiple processors in a parallel processing system may all reside on the same computer, or they may be spread across separate computers. When they are spread across separate computers, each computer is referred to as a node.
There are a few basic requirements of parallel computing:
  • Computer hardware that is designed to work with multiple processors and that provides a means of communication between those processors
  • An operating system that is capable of managing multiple processors
  • Application software that is capable of breaking large tasks into multiple smaller tasks that can be performed in parallel
Weather forecasting provides another real-life example of parallel processing at work. Satellites used for weather forecasting collect millions of bytes of data per second on the condition of earth's atmosphere, formation of clouds, wind intensity and direction, temperature, and so on. This huge amount of data has to be processed by complex algorithms to arrive at a proper forecast. Thousands of iterations of computation may be needed to interpret this environmental data. Parallel computers are used to perform these computations in a timely manner so a weather forecast can be generated early enough for it to be useful.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel Processing for Databases
Three issues are driving the increasing use of parallel processing in database environments:
The need for increased speed or performance
Database sizes are increasing, queries are becoming more complex—especially in data warehouse systems—and the database software must somehow cope with the increasing demands that result from this complexity.
The need for scalability
This requirement goes hand-in-hand with performance. Databases often grow rapidly, and companies need a way to easily and cost-effectively scale their systems to match that growth.
The need for high availability
High availability refers to the need to keep a database up and running with minimal or no downtime. With the increasing use of the Internet, companies need to accommodate users at all hours of the day and night.
Database sizes have been increasing steadily, and it's now quite common to find data warehouses holding several hundred gigabytes of data. Some databases, referred to as Very Large Databases (VLDBs), even hold several terabytes of data. Complex queries are run on these data warehouses to gather business intelligence and to aid in decision making. Such queries require a lot of processing time to execute. By executing these queries in parallel, you can reduce the elapsed time while still providing the required processor time.
Speedup is defined as the ratio between the runtime with one processor and the runtime using multiple processors. It measures the performance improvement gained using multiple processors instead of a single processor and is calculated using the following formula:
Speedup = Time1 / Timem
Time1 is the time it takes to execute a task using only one processor, while Timem is the time it takes to execute that same task using m processors.

Section 1.2.1.1: Speedup example

Figure 1.1 shows a query that takes four minutes to complete using one processor, but that takes only one minute to complete using four processors.
Figure 1.1: Parallel processing speedup
Plugging the values from Figure 1.1 into the speedup formula yields the following results:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel Processing in Oracle
Most modern commercial DBMS products have implemented parallel features, and Oracle is no exception. With every release, Oracle has consistently improved its support for parallel processing. Oracle's support for parallel processing can be divided into the following two specific feature sets:
Parallel execution
Refers to intra-query parallelism
Parallel server
Refers to the use of multiple instances to open a single, shared database
As you delve into Oracle's parallel feature set, you'll encounter several very similar terms that all begin with the word "parallel." Read through the following definitions; they will help you understand these terms before you read further:
Oracle Parallel Server
Refers to the Oracle option allowing you to share one database among multiple instances. Typically, these instances will be running on separate nodes of a clustered system.
Parallel Server
The same as Oracle Parallel Server.
Parallel execution
Refers to Oracle's ability to apply multiple CPUs to the task of executing a single SQL statement in order to complete execution faster than would be possible using only a single CPU.
Parallel SQL
Means the same thing as parallel execution.
Parallel Query
Refers to Oracle's ability to execute SELECT statements in parallel, using multiple CPUs. When parallel features first were introduced into Oracle years ago, the only support was for parallel SELECT statements, and at that point the feature was known as Parallel Query and was available through the Parallel Query Option (PQO). Now, Parallel Query is only a subset of Oracle's parallel execution features.
Parallel DML
Refers to Oracle's ability to execute Data Manipulation Language (DML) statements in parallel. Parallel DML is a subset of Oracle's parallel execution feature set.
Parallel DDL
Refers to Oracle's ability to execute some Data Definition Language (DDL) statements in parallel. Parallel DDL is a subset of Oracle's parallel execution feature set.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Overhead for Oracle Parallel Processing
As you might imagine, the benefits of parallel execution and Oracle Parallel Server do not come without a price. The next two sections discuss the various overhead issues that apply to parallel execution and Oracle Parallel Server.
Parallel execution entails a cost in terms of the processing overhead necessary to break up a task into pieces, manage the execution of each of those pieces, and combine the results when the execution is complete. Figure 1.8 illustrates some of the steps involved in parallel execution.
Figure 1.8: Steps in parallel execution
Parallel execution overhead can be divided into three areas: startup cost, interference, and skew.

Section 1.4.1.1: Startup cost

Startup cost refers to the time it takes to start parallel execution of a query or a DML statement. It takes time and resources to divide one large task into smaller subtasks that can be run in parallel. Time also is required to create the processes needed to execute those subtasks and to assign each subtask to one of these processes. For a large query, this startup time may not be significant in terms of the overall time required to execute the query. For a small query, however, the startup time may end up being a significant portion of the total time.

Section 1.4.1.2: Interference

Interference refers to the slowdown that one process imposes on other processes when accessing shared resources. While the slowdown resulting from one process is small, with a large number of processors the impact can be substantial.

Section 1.4.1.3: Skew

Skew refers to the variance in execution time of parallel subtasks. As the number of parallel subtasks increases (perhaps as a result of using more processors), the amount of work performed by each subtask decreases. The result is a reduction in processing time required for each of those subtasks and also for the overall task. There are always variations, however, in the size of these subtasks. In some situations these variations may lead to large differences in execution time between the various subtasks. The net effect, when this happens, is that the processing time of the overall task becomes equivalent to that of the longest subtask.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Requirements for Oracle's Parallel Features
In order to use Oracle's parallel processing features for your database applications, you must:
  • Have both suitable hardware and operating system software that supports parallel processing
  • Configure Oracle Server appropriately
  • Modify your database applications to get the benefit of parallel processing
Table 1.1 briefly summarizes the requirements for using Oracle's parallel execution features and for using Oracle Parallel Server.
Table 1.1: Requirements for Using Oracle's Parallel Features
Type of Requirement
Parallel Execution
Oracle Parallel Server
Hardware
Multiple processors (SMP, MPP, or NUMA).
Multiple nodes (cluster, MPP, or NUMA cluster) shared disk architecture.
Operating system software
Operating system must support multiple processors.
Operating system must support resource management and communication across nodes. The operating system also must support a disk-sharing configuration such as the use of raw devices on Unix and NT platforms.
Oracle DBMS
Configure initialization parameters. If using Oracle7, install the Oracle Parallel Query Option.
Purchase and install the Oracle Parallel Server Option. Configure initialization parameters.
Application
Minor changes to SQL statements.
Application partitioning.
The following sections discuss some of the Table 1.1 requirements in greater detail.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: Architectures for Parallel Processing
In the never-ending quest for increased performance, vendors over the years have developed several different hardware and software architectures to use as a basis for parallel processing. In this chapter, we present the different hardware architectures used for parallel processing in databases. This chapter discusses the basic characteristics of these hardware architectures and contrasts their relative strengths and weakness. In addition, this chapter also discusses the three types of software architectures commonly used by DBMS vendors to implement parallel processing in their database software; this discussion provides you with a framework for understanding the software architecture choices made by Oracle when implementing parallel processing under the different hardware architectures. Finally, this chapter describes the availability of Oracle's parallel processing features on each type of hardware architecture.
Parallel processing refers to the use of multiple processors to reduce the time needed to complete a given task. Instead of one processor's executing an entire task, several processors each work on a separate piece of the task. Obviously, parallel processing requires computer hardware that can support more than a single processor. Several architectural approaches to multiple processor systems have been developed over the years. The following four are in commercial use today:
  • Symmetric Multiprocessing (SMP)
  • Clustered
  • Massively Parallel Processing (MPP)
  • Non Uniform Memory Access (NUMA)
The next four sections describe each of these architectures.
In a Symmetric Multiprocessing (SMP) architecture, multiple processors (CPUs) within a single computer share the same system memory and I/O resources. (See Figure 2.1 for an illustration.) The sharing is accomplished through the use of a high-speed system bus. One copy of the operating system runs on the computer and controls all of the processors. This operating system must be designed to support multiple processors, and it must have a scheduling algorithm that utilizes all the processors in the system evenly
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Hardware Architectures
Parallel processing refers to the use of multiple processors to reduce the time needed to complete a given task. Instead of one processor's executing an entire task, several processors each work on a separate piece of the task. Obviously, parallel processing requires computer hardware that can support more than a single processor. Several architectural approaches to multiple processor systems have been developed over the years. The following four are in commercial use today:
  • Symmetric Multiprocessing (SMP)
  • Clustered
  • Massively Parallel Processing (MPP)
  • Non Uniform Memory Access (NUMA)
The next four sections describe each of these architectures.
In a Symmetric Multiprocessing (SMP) architecture, multiple processors (CPUs) within a single computer share the same system memory and I/O resources. (See Figure 2.1 for an illustration.) The sharing is accomplished through the use of a high-speed system bus. One copy of the operating system runs on the computer and controls all of the processors. This operating system must be designed to support multiple processors, and it must have a scheduling algorithm that utilizes all the processors in the system evenly. The SMP architecture also is referred to as a tightly coupled or shared memory architecture.
Please assume that all of the figures presented in this chapter are conceptual. The figures show notable features of the architectures being discussed and illustrate differences between them. Although the concepts remain the same, the detailed hardware implementations will vary significantly among various vendors.
Figure 2.1: Symmetric Multiprocessing (SMP) architecture
Each processor in an SMP system can execute programs independently, with each processor accessing a separate area of memory. A single-processor system running in a time-sharing mode really can work on only one job at a time. It accomplishes the illusion of doing several things at once by rapidly cycling through all the different jobs that are running. An SMP system, on the other hand, really can work on several jobs simultaneously, because separate processors are dedicated to each job. (This is equivalent to the inter-query parallelism that you read about in Chapter 1.) Because processor job assignment is entirely transparent to applications, database applications can reap the benefits of increased throughput from SMP systems without requiring that any changes be made to the RDBMS software. In addition, the many processors in an SMP system can be made to work cooperatively together on a single large program in order to complete it faster than any single processor could. (This is the equivalent of intra-query parallelism, which you also read about in Chapter 1.) Intra-query parallelism, however, does require specific support from the RDBMS software and from database applications.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
High-Speed Interconnect
In Figure 2.2 through Figure 2.4, we've referred to the connection networks used in cluster, MPP, and NUMA architectures as high-speed interconnects . Performance of this interconnect is an important consideration in parallel architectures. Interconnect performance is measured in two dimensions: bandwidth and latency. Bandwidth is the rate at which data can be moved between nodes and is measured in megabytes (MB) per second. Latency is defined as the time spent in setting up access to a remote node so that communications can occur. Interconnects should have low latency in order to maximize the number of messages that can be set up and placed on the interconnect in a given period of time. As the number of nodes in a configuration increases, more data and messages are passed around between nodes. It's important that the interconnect have a high enough bandwidth and a low enough latency to support this message traffic.
Cluster interconnects often are implemented using standard LAN-based technology such as Ethernet or Fiber Distributed Data Interchange (FDDI). When a cluster is configured with a large number of nodes, the limitations of the network can degrade performance. The bandwidth and latency of the cluster interconnect is the key to improving the scalability of a cluster.
Figure 2.5 shows a sample network configuration in a two-node cluster. Nodes in the cluster are connected to the network with both primary and standby interface cards. When the primary interface fails, the operating system switches the IP address assigned to the primary interface to the standby interface, and the system continues to operate. In addition to redundant network interface cards, a secondary LAN also is provided. In the event that the primary LAN fails, the secondary LAN will take over, and the cluster will continue to operate.
Figure 2.5: Redundant network components in a two-node cluster
The interconnect in IBM RS/6000 (an MPP architecture) is referred to as a high-performance switch (HPS). The switch architecture is such that as the number of nodes is increased in an MPP system, switching components are added to maintain node-to-node bandwidth. Because of this feature, HPS scales well and can support a large number of nodes. RS/6000 SP systems have latency on the order of a microsecond, and peak node-to-node bandwidth of 100 MB per second.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Software Architectures of Parallel Database Systems
In implementing parallelism in their database software, database vendors use one of three software architectures, commonly referred to as:
  • Shared everything
  • Shared disk
  • Shared nothing
The sharing refers to the sharing of disk and memory by multiple processors.
Implementing each of these software architectures requires an appropriate underlying parallel hardware architecture. Some software architectures are a better match than others for a given hardware architecture. For example, a shared everything software architecture is a natural match for SMP hardware, because all processors in an SMP system share the same memory and disk. Other combinations are not so good. Implementing a shared everything software architecture on an MPP platform, for example, is not a good choice, because MPP hardware is based on a distributed memory architecture. The software implementation of a shared memory abstraction on top of the distributed memory of the individual nodes in an MPP system would be difficult, and performance would be poor.
The three figures in this section, Figure 2.6 through Figure 2.8, highlight the differences among the three database software architectures with respect to the sharing of memory and disks. They do not imply any particular underlying hardware architecture. It is possible to implement more than one software architecture on a given type of hardware architecture. For example, on IBM RS/6000 SP, which is an MPP system, Oracle Parallel Server runs with a shared disk software architecture. DB2/6000 Parallel Edition, on the other hand, runs on the same IBM RS/6000 SP system with a shared nothing software architecture.
In a shared everything database architecture, all processors share the same memory and disks. Figure 2.6 illustrates such a system. One copy of the operating system and one copy of the database software run on the system. Shared memory allows for efficient coordination between DBMS processes. In this architecture, it is relatively simple to implement inter-query and intra-query parallelism, because the operating system automatically allocates the queries and subqueries to available CPUs. The shared everything software architecture is widely used on SMP hardware. The NUMA hardware architecture also is suitable for a shared everything software architecture.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Oracle's Parallel Processing Architecture
Oracle's parallel features fall into two broad categories: parallel execution and parallel server. Table 2.1 shows which parallel features are available under each combination of hardware and software architecture that Oracle supports. Oracle's parallel processing features are available on all parallel hardware architectures. Table 2.1 provides examples of specific hardware architectures in each type. Oracle runs under many operating systems, including Windows NT, Alpha VMS, and variations of Unix.
Table 2.1: Availability of Oracle's Parallel Processing Features on Different Hardware Architectures
Hardware/Software Architecture
Oracle Parallel Server?
Parallel Execution?
Vendor Platform Examples
SMP/shared everything
No
Yes
IBM RS/6000
Sun Enterprise 4500
SGI Origin 2000
HP 9000
Windows NT
Cluster/shared disk
Yes
Yes
HP 9000 S800 Cluster
IBM RS/6000 HACMP Cluster
Sun Ultra Enterprise PDB Cluster
Pentium-based NT Clusters
MPP/shared disk
Yes
Yes
IBM RS/6000 SP
Siemens MPP RM1000
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 3: Parallel Execution Concepts
This chapter presents Oracle's parallel execution (sometimes known as parallel SQL) features: parallel query, parallel DML, parallel DDL, and several other types of parallel operations. If implemented effectively, parallel operations can significantly improve the performance of data-intensive jobs. The decision to execute an operation in parallel is made by Oracle at runtime and is mostly transparent to programmers and end users.
Oracle's parallel execution features are different from the features provided by Oracle Parallel Server (OPS). You don't need OPS to perform parallel SQL execution. However, parallel SQL execution can work in conjunction with OPS, and some aspects of parallel execution apply only to Oracle Parallel Server. Throughout this chapter, unless we note otherwise, we are dealing with parallel SQL operations in a standalone instance environment. Part III of this book discusses OPS; in particular, Chapter 13, deals with parallel execution in an OPS environment.
When a user connects to an Oracle instance, a process referred to as a server process, or shadow process, starts on the database server. The purpose of this shadow process is to perform the operations requested by the user process. When the user executes a SQL statement, the corresponding shadow process performs the necessary tasks to get the result. In a serial execution environment, all the tasks are performed by a single process (the shadow process), as shown in Figure 3.1.
Figure 3.1: Serial statement execution
Things are different in a parallel execution environment. In a parallel execution environment, the tasks involved in executing a SQL statement are divided among multiple processes, which work together to complete the execution faster. These multiple processes are referred to as slave processes. Figure 3.2 illustrates parallel execution.
Figure 3.2: Parallel statement execution
Symmetric Multiprocessing (SMP) systems, Massively Parallel Processing (MPP) systems, and clustered systems benefit significantly from executing SQL statements in parallel, because they allow tasks to be spread across the multiple CPUs available in these systems. You can use parallel SQL on single-CPU machines too. However, multiple parallel processes working on behalf of one SQL statement would contend for the same CPU and ultimately might result in poor performance.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is Parallel SQL?
When a user connects to an Oracle instance, a process referred to as a server process, or shadow process, starts on the database server. The purpose of this shadow process is to perform the operations requested by the user process. When the user executes a SQL statement, the corresponding shadow process performs the necessary tasks to get the result. In a serial execution environment, all the tasks are performed by a single process (the shadow process), as shown in Figure 3.1.
Figure 3.1: Serial statement execution
Things are different in a parallel execution environment. In a parallel execution environment, the tasks involved in executing a SQL statement are divided among multiple processes, which work together to complete the execution faster. These multiple processes are referred to as slave processes. Figure 3.2 illustrates parallel execution.
Figure 3.2: Parallel statement execution
Symmetric Multiprocessing (SMP) systems, Massively Parallel Processing (MPP) systems, and clustered systems benefit significantly from executing SQL statements in parallel, because they allow tasks to be spread across the multiple CPUs available in these systems. You can use parallel SQL on single-CPU machines too. However, multiple parallel processes working on behalf of one SQL statement would contend for the same CPU and ultimately might result in poor performance.
Parallel SQL was first introduced in Oracle7 (release 7.1) as the Oracle Parallel Query Option (PQO). PQO is a separately installable option in Oracle7 and helps in parallelizing queries (SELECT statements). With the release of Oracle8, PQO was incorporated into the core RDBMS product and now gets installed along with it. In addition, the parallel execution feature was enhanced to add support for more parallel operations. The full list of Oracle parallel execution features currently includes the following:
Parallel query
The ability to break up the execution of a SELECT statement into multiple tasks and then to execute those tasks in parallel using multiple processors
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
How Parallel Execution Works
Parallel execution of SQL statements is mostly transparent to end users. Oracle divides the task of executing a SQL statement into multiple smaller units, each of which is executed by a separate process. When parallel execution is used, the user's shadow process takes on the role of the parallel coordinator. The parallel coordinator is also referred to as parallel execution coordinator or query coordinator. The parallel coordinator does the following:
  1. Dynamically divides the work into smaller units that can be parallelized.
  2. Acquires a sufficient number of parallel processes to execute the individual smaller units. These parallel processes are called parallel slave processes. They also are sometimes referred to as parallel execution server processes, parallel server processes, parallel query slaves, or simply slave processes. The most common of the terms, parallel slave processes and slave processes, are used throughout this book.
  3. Assigns each unit of work to a slave process.
  4. Collects and combines the results from the slave processes, and returns those results to the user process.
  5. Releases the slave processes after the work is done.
Oracle maintains a pool of parallel slave processes for each instance. The parallel coordinator for a SQL statement assigns parallel tasks to slave processes from this pool. These parallel slave processes remain assigned to a task until its execution is complete. After that, these processes return to the pool and can be assigned tasks from some other parallel operation. A parallel slave process serves only one SQL statement at a time.
The following parameters control the number of parallel slave processes in the pool:
PARALLEL_MIN_SERVERS
Specifies the minimum number of parallel slave processes for an instance. When an instance starts up, it creates the specified number of parallel slave processes. The default value for this parameter is 0, meaning that no slave processes would be created at startup.
PARALLEL_MAX_SERVERS
Specifies the maximum number of parallel slave processes that an instance is allowed to have at one time. The default value for PARALLEL_MAX_SERVERS is platform-specific.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 4: Using Parallel Execution
In Chapter 3, we introduced Oracle's parallel execution features and talked about the concepts behind how they work. These features include:
  • Parallel query
  • Parallel DML
  • Parallel DDL
  • Parallel data loading
  • Parallel recovery
  • Parallel replication propagation
In this chapter, we'll describe how to use each of these features and show you how to take advantage of them. Most of these features come with a list of restrictions and caveats that may affect your ability to use them in a given situation. We'll describe those as well.
Throughout this chapter, we assume that you're familiar with the basic Oracle features being discussed. This isn't, for example, a tutorial on recovery or replication. The focus is always on the parallel features that can be applied to the task at hand. If you're familiar with DML and DDL, for example, you won't have any problem following the discussion of how to parallelize those types of statements. The same is true for the other types of operations covered in this chapter.
Parallel query is the most commonly used of Oracle's parallel execution features. It was the first parallel execution feature to be developed by Oracle and was introduced in Oracle7 (release 7.1) as the Oracle Parallel Query Option (PQO). Parallel execution can significantly reduce the elapsed time for large queries, but it doesn't apply to every query. To parallelize a SELECT statement, the following conditions must be met:
  • At least one of the tables is accessed through a full table scan, or an index is accessed through a range scan involving multiple partitions.
  • If the execution involves a full table scan, the statement must contain a PARALLEL hint specifying the corresponding table, or the corresponding table must have a parallel declaration in its definition.
  • If the execution involves an index range scan spanning multiple partitions, the statement must contain a PARALLEL_INDEX hint specifying the corresponding index, or the corresponding index must have a parallel declaration in its definition.
The following two sections explain how the degree of parallelism is chosen for a SELECT statement and discuss restrictions on the use of the parallel query feature.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel Query
Parallel query is the most commonly used of Oracle's parallel execution features. It was the first parallel execution feature to be developed by Oracle and was introduced in Oracle7 (release 7.1) as the Oracle Parallel Query Option (PQO). Parallel execution can significantly reduce the elapsed time for large queries, but it doesn't apply to every query. To parallelize a SELECT statement, the following conditions must be met:
  • At least one of the tables is accessed through a full table scan, or an index is accessed through a range scan involving multiple partitions.
  • If the execution involves a full table scan, the statement must contain a PARALLEL hint specifying the corresponding table, or the corresponding table must have a parallel declaration in its definition.
  • If the execution involves an index range scan spanning multiple partitions, the statement must contain a PARALLEL_INDEX hint specifying the corresponding index, or the corresponding index must have a parallel declaration in its definition.
The following two sections explain how the degree of parallelism is chosen for a SELECT statement and discuss restrictions on the use of the parallel query feature.
Once Oracle decides to execute a SELECT statement in parallel, the degree of parallelism is determined by following precedence rules:
  1. Oracle retrieves the DEGREE and INSTANCES specifications from the definition of all tables and indexes involved in the query and chooses the highest values found for those settings.
  2. Oracle checks the statement for a parallel hint. If such a hint is found, the hint overrides the degree of parallelism obtained as a result of the previous step.
You can use the PARALLEL and PARALLEL_INDEX hints to specify the degree of parallelism for a SELECT statement. You can use the NOPARALLEL and NOPARALLEL_INDEX hints to ensure that parallel execution is not performed for the SELECT.
The PARALLEL hint controls the degree of parallelism applied to tables accessed by a query and takes this form:
/*+ PARALLEL (table_name[, degree[, instances]]) */
where:
table_name
Is the name of the table on which to parallelize operations
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel DML
Data Manipulation Language (DML) operations such as INSERT, UPDATE, and DELETE can be parallelized by Oracle. Parallel execution can speed up large DML operations and is particularly advantageous in data warehousing environments where it's necessary to maintain large summary or historical tables. In OLTP systems, parallel DML sometimes can be used to improve the performance of long-running batch jobs.
When you issue a DML statement such as an INSERT, UPDATE, or DELETE, Oracle applies a set of rules to determine whether that statement can be parallelized. For UPDATE and DELETE statements, the rules are identical. INSERT statements, however, have their own set of rules.
To parallelize DML (INSERT, UPDATE, and DELETE) statements, you must enable parallel DML. Refer to the section later in this chapter titled "Enabling Parallel DML" for details.

Section 4.2.1.1: UPDATE and DELETE statements

Oracle can parallelize UPDATE and DELETE statements on partitioned tables, but only when multiple partitions are involved. You cannot parallelize UPDATE or DELETE operations on a nonpartitioned table or when such operations affect only a single partition. If such a statement includes a subquery, then the decision to parallelize the UPDATE or DELETE operation is made independently of the decision to parallelize the subquery.
UPDATE or DELETE operations are parallelized only if the table being updated or deleted has a parallel specification or if a PARALLEL hint is included after the UPDATE or DELETE keyword in the statement.

Section 4.2.1.2: INSERT statements

Standard INSERT statements using a VALUES clause cannot be parallelized. Oracle can parallelize only INSERT . . . SELECT . . . FROM statements. The decision to parallelize the INSERT operation is made independently from the decision to parallelize the SELECT operation.
The INSERT operation can be parallelized only if the table being inserted into has a parallel specification or if a PARALLEL hint is included after the INSERT keyword in the statement.
As we discussed in Chapter 3, the degree of parallelism can be specified at the statement level, the object definition level, or the instance level. The PARALLEL and PARALLEL_INDEX hints discussed in relation to the SELECT statements apply to other DML statements as well. The following sections describe the rules used to determine the degree of parallelism for DML statements.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel DDL
Parallel DDL works for both tables and indexes, whether partitioned or nonpartitioned. For nonpartitioned tables and indexes, only the following types of DDL statements can be parallelized:
CREATE TABLE...AS SELECT
CREATE INDEX
ALTER INDEX...REBUILD
If you're working with partitioned tables and indexes, the scope of Oracle's parallel DDL support broadens. The following statements can be parallelized for partitioned tables and indexes:
CREATE TABLE...AS SELECT
ALTER TABLE...MOVE PARTITION
ALTER TABLE...SPLIT PARTITION
CREATE INDEX
ALTER INDEX...REBUILD PARTITION
ALTER INDEX...SPLIT PARTITION
Not all tables allow these operations to be executed in parallel. Tables with object columns or LOB columns don't allow parallel DDL.
With DDL statements, you do not specify the degree of parallelism using hints. Instead, you use the PARALLEL clause. The PARALLEL clause takes the following form:
{NOPARALLEL |
PARALLEL (DEGREE {degree | DEFAULT} 
   [INSTANCES {instances | DEFAULT}] )}
Here's an example that shows the PARALLEL clause being used in a CREATE INDEX statement. The specified DEGREE is 4, so 4 parallel slave processes will be used to create the index:
CREATE INDEX customer_ix ON customers (customer_id)
   TABLESPACE ind01
   STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0 MAXEXTENTS 20)
   PARALLEL (DEGREE 4);
When you create a table or index using parallel DDL, two or more parallel slave processes work on behalf of your statement to create the object. Each parallel slave process creates a temporary segment during the creation process. At the end, the parallel coordinator trims each of these temporary segments to release any free space and then combines these segments into one segment.
When you are using parallel DDL to create an object such as a table or an index, you need to be aware of two issues:
  • How space is allocated
  • The potential for fragmentation
You should be careful while specifying the storage clause for tables or indexes being created in parallel, because the amount of space allocated may be much greater than what you would otherwise expect. When you create a table or an index using parallel DDL, each parallel slave process allocates space based on the table or index's storage clause. For example, if you create a table with INITIAL 10 MB and a degree of parallelism of 4, then each of the four parallel slave processes allocates an extent of 10 MB. The result is that a total of 40 MB will be allocated for the table during creation.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel Data Loading
Oracle's SQL*Loader utility loads data into Oracle tables from external files. With some restrictions, SQL*Loader supports the loading of data in parallel. If you have a large amount of data to load, SQL*Loader's parallel support can dramatically reduce the elapsed time needed to perform that load.
SQL*Loader supports parallel loading by allowing you to initiate multiple concurrent direct path load sessions that all load data into the same table or into the same partition of a partitioned table. Unlike the case when you execute a SQL statement in parallel, the task of dividing up the work falls on your shoulders. Follow these steps to use parallel data loading:
  1. Create multiple input datafiles.
  2. Create a SQL*Loader control file for each input datafile.
  3. Initiate multiple SQL*Loader sessions, one for each control file and datafile pair.
When you initiate the SQL*Loader sessions, you must tell SQL*Loader that you are performing a parallel load. You do that by adding the PARALLEL=TRUE parameter to the SQL*Loader command line. For example, the following commands could be used to initiate a load performed in parallel by four different sessions:
SQLLOAD scott/tiger CONTROL=part1.ctl DIRECT=TRUE PARALLEL=TRUE
SQLLOAD scott/tiger CONTROL=part2.ctl DIRECT=TRUE PARALLEL=TRUE
SQLLOAD scott/tiger CONTROL=part3.ctl DIRECT=TRUE PARALLEL=TRUE
SQLLOAD scott/tiger CONTROL=part4.ctl DIRECT=TRUE PARALLEL=TRUE
Note that the commands here should be executed from four different operating system sessions. The intent is to get four SQL*Loader sessions going at once, not to run four sessions one at a time. For example, if you are using the Unix operating system, you might open four command-prompt windows and execute one SQL*Loader command in each window.
Another important thing to note here is that you need to use the direct path in order to perform a load in parallel, as explained in the next section. This is achieved by the command-line argument DIRECT=TRUE. Parallel loads are not possible using the conventional path option.
Parallel loads must be done using the direct path. When you initiate a direct path load, SQL*Loader formats the input data into Oracle data blocks and writes those blocks directly to the datafiles. The blocks are always added above the target table's high-water mark (HWM). Direct path loads bypass SQL command processing, and they bypass the database buffer cache in the SGA. The result is much higher performance than you can get using a conventional path load.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel Recovery
Parallel recovery can speed up both instance recovery and media recovery. In parallel recovery, multiple parallel slave processes are used to perform recovery operations. The SMON background process reads the redo log files, and the parallel slave processes apply the changes to the datafiles. Parallel recovery is most beneficial when several datafiles on different disks are being recovered.
In a serial recovery scenario, the SMON background process both reads the redo log files and applies the changes to the datafiles. This may take a considerably long time when multiple datafiles need to be recovered. However, when parallel recovery is being used, the SMON process is responsible only for reading the redo log files. The changes are applied to the datafiles by multiple parallel slave processes, thereby reducing the recovery time.
Recovery requires that the changes be applied to the datafiles in exactly the same order in which they occurred. This is achieved by single-threading the read phase of the recovery process by the SMON process. SMON reads the redo log files and serializes the changes before dispatching them to the parallel slave processes. The parallel slave processes then apply those changes to the datafiles in the proper order. Therefore, the reading of the redo log files is performed serially even during a parallel recovery operation.
The RECOVERY_PARALLELISM initialization parameter controls the degree of parallelism to use for a recovery. You can override that setting for a specific situation by using the RECOVER command's PARALLEL clause. Both options are described in the following sections.
The initialization parameter RECOVERY_PARALLELISM specifies the number of parallel slave processes to participate in a recovery process. It applies to both instance recovery and media recovery. A value of or 1 indicates serial recovery—no parallelism will be used. The RECOVERY_PARALLELISM parameter setting cannot exceed the PARALLEL_MAX_SERVERS setting.
The PARALLEL clause can be used with the RECOVER command to parallelize media recovery. You use it to specify the degree or the number of parallel slave processes that will be used. The syntax for the PARALLEL clause is discussed in Section 4.3 of this chapter. You can use the PARALLEL clause with the RECOVER DATABASE, RECOVER TABLESPACE, and RECOVER DATAFILE commands. Here are some examples:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Parallel Replication Propagation
Oracle provides replication mechanisms allowing you to maintain copies of database objects in multiple databases. Changes are propagated among these databases over database links. The SNP (snapshot) background processes perform the replication process. For large volumes of replicated data, parallel propagation can be used to enhance throughput.
With serial replication, Oracle propagates replicated transactions one at a time in the same order in which they are committed in the source database. With parallel propagation, Oracle enlists multiple parallel slave processes to propagate replicated transactions using multiple parallel streams. Oracle orders the dependent transactions properly based on the System Change Number (SCN). During parallel propagation, you can see multiple connections to the destination database.
You enable parallel replication propagation at the database link level. A database link is created for a particular destination database. When you enable parallel propagation for a database link, Oracle uses multiple parallel slave processes to replicate to the corresponding destination. If you're using Replication Manager, edit the property sheet for the link, enable the "Parallel Propagation" setting, and then specify the desired degree of parallelism in the "Processes" field.
To enable parallel replication propagation from the SQL*Plus command line, you need to use the Oracle built-in package DBMS_DEFER_SYS. Execute the DBMS_DEFER_SYS.SCHEDULE_PUSH procedure for the destination database link, and pass the desired degree of parallelism as the value for the parallelism argument. The following shows an example of setting the degree of parallelism for replication propagation using the DBMS_DEFER_SYS.SCHEDULE_PUSH procedure:
SQL> EXECUTE DBMS_DEFER_SYS.SCHEDULE_PUSH (-
> DESTINATION => 'finprod.world', -
> INTERVAL => 'SYSDATE+1/24', -
> NEXT_DATE => 'SYSDATE+1/24', -
> PARALLELISM => 6);
            
This example sets the degree of parallelism to 6 for propagating to the "finprod.world" destination database. It also specifies the time when the first propagation will start (NEXT_DATE) and the frequency of propagation (INTERVAL). For a detailed discussion of replication, refer to
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 5: Monitoring and Tuning Parallel Execution
Parallel execution can significantly reduce the processing time required for queries and other SQL statements, but only if it is properly implemented and tuned. You must monitor and tune the performance of parallel execution in your environment on a regular basis. In this chapter you'll learn about the various tools and procedures at your disposal for monitoring the parallel execution performance of a database. You'll also find some tuning tips and information on some important Oracle8i enhancements related to parallel execution performance.
Oracle tuning in general is a very large area, but when you focus strictly on parallel execution, there are only a few tuning-related adjustments that you can make. These include:
  • Changing the PARALLEL_MIN_SERVERS initialization parameter, which controls the minimum number of parallel slave processes maintained in the parallel slave process pool
  • Changing the PARALLEL_MAX_SERVERS initialization parameter, which controls the maximum number of parallel slave processes maintained in the parallel slave process pool
  • Changing the PARALLEL_SERVER_IDLE_TIME initialization parameter, which controls the amount of time a slave process is kept around when it's not being used
  • Changing the degree of parallelism, using the DEGREE and INSTANCES keywords, for either a statement or a table
  • Changing your statements to avoid "parallel from serial" bottlenecks (i.e., those caused by your execution plan's having to perform the PARALLEL_FROM_SERIAL step, described in the discussion of EXPLAIN PLAN later in this chapter)
  • Using Oracle8i's new automatic parallel execution tuning feature
Of course, to tune something, you need a way to gather statistics and measure performance. As you might have guessed already, Oracle's dynamic performance views (the V$ views) are your prime source of information on parallel execution performance. Another useful source of information is the EXPLAIN PLAN statement, which can tell you if the execution plan for a given SQL statement contains any "parallel from serial" bottlenecks.
Several Oracle dynamic performance views provide important information that's useful for monitoring and tuning parallel SQL execution. The views that you need to be aware of are shown in the following list:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Tuning Overview
Oracle tuning in general is a very large area, but when you focus strictly on parallel execution, there are only a few tuning-related adjustments that you can make. These include:
  • Changing the PARALLEL_MIN_SERVERS initialization parameter, which controls the minimum number of parallel slave processes maintained in the parallel slave process pool
  • Changing the PARALLEL_MAX_SERVERS initialization parameter, which controls the maximum number of parallel slave processes maintained in the parallel slave process pool
  • Changing the PARALLEL_SERVER_IDLE_TIME initialization parameter, which controls the amount of time a slave process is kept around when it's not being used
  • Changing the degree of parallelism, using the DEGREE and INSTANCES keywords, for either a statement or a table
  • Changing your statements to avoid "parallel from serial" bottlenecks (i.e., those caused by your execution plan's having to perform the PARALLEL_FROM_SERIAL step, described in the discussion of EXPLAIN PLAN later in this chapter)
  • Using Oracle8i's new automatic parallel execution tuning feature
Of course, to tune something, you need a way to gather statistics and measure performance. As you might have guessed already, Oracle's dynamic performance views (the V$ views) are your prime source of information on parallel execution performance. Another useful source of information is the EXPLAIN PLAN statement, which can tell you if the execution plan for a given SQL statement contains any "parallel from serial" bottlenecks.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Dynamic Performance Views
Several Oracle dynamic performance views provide important information that's useful for monitoring and tuning parallel SQL execution. The views that you need to be aware of are shown in the following list:
V$PQ_SYSSTAT
V$PQ_SESSTAT
V$PQ_SLAVE
V$PQ_TQSTAT
V$SYSSTAT
V$SESSTAT
V$PX_PROCESS
V$PX_SESSION
V$PX_SESSTAT
V$PX_PROCESS_SYSSTAT
Several of these views are new with the release of Oracle8i. V$SYSSTAT and V$SESSTAT have been around for a while; they contain general statistics, some of which apply to parallel execution. The other views return information specific to parallel execution. The names of the new parallel execution views all begin with V$PX, while the names of older parallel query views begin with V$PQ. For the most part, the new V$PX views will eventually replace the older V$PQ views. Table 5.1 describes the relationship between these two sets of views.
Table 5.1: Relationship of V$PX Views to V$PQ Views
New in Oracle8i
Oracle8
Comments
V$PX_SESSION
N/A
V$PX_SESSTAT
N/A
N/A
V$PQ_SESSTAT
V$PQ_SESSTAT eventually will be dropped.
V$PX_PROCESS
V$PQ_SLAVE
V$PQ_SLAVE eventually will be dropped.
V$PX_PROCESS_SYSTAT
V$PQ_SYSSTAT
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Using EXPLAIN PLAN to View Parallel Execution
The sequence of operations that Oracle performs to execute a SQL statement is called an execution plan. Sometimes it's useful to look at the execution plan for a statement, in order to determine whether the plan picked by the optimizer is an efficient one. You can use Oracle's EXPLAIN PLAN statement for this purpose. EXPLAIN PLAN takes a SQL statement, determines the execution plan, and stores that plan in a special table known as the plan table.
If you are not familiar with the use of EXPLAIN PLAN, you may want to consult Oracle Corporation's Oracle8 Tuning manual, Oracle SQL*Plus: The Definitive Guide by Jonathan Gennick (O'Reilly & Associates, 1998), or Oracle SQL: The Essential Reference by David Kreines (O'Reilly & Associates, 2000).
The plan table used by EXPLAIN PLAN is most often named PLAN_TABLE. The columns of the table describe various aspects of the execution plan. Two columns are of special importance for parallel execution:
OBJECT_NODE
Describes the order in which the output from execution plan operations is consumed.
OTHER_TAG
Describes the parallel or serial relationships between steps. The possible values for the OTHER_TAG column are described in Table 5.4.
Table 5.4: Values of the OTHER_TAG Column in the Plan Table
Value
Meaning
SERIAL
A serial operation
No value; the column is null