Java Programming with Oracle JDBC
By Donald Bales
0-596-00088-x, Order Number: 088x
496 pages, $39.95
Performance is usually considered an issue at the end of a development cycle when it should really be considered from the start. Often, a task called "performance tuning" is done after the coding is complete, and the end user of a program complains about how long it takes the program to complete a particular task. The net result of waiting until the end of the development cycle to consider performance includes the expense of the additional time required to recode a program to improve its performance. It's my opinion that performance is something that is best considered at the start of a project.
When it comes to performance issues concerning JDBC programming, there are two major factors to consider. The first is the performance of the database structure and the SQL statements used against it. The second is the relative efficiency of the different ways you can use the JDBC interfaces to manipulate a database.
In terms of the database's efficiency, you can use the EXPLAIN PLAN facility to explain how the database's optimizer plans to execute your SQL statements. Armed with this knowledge, you may determine that additional indexes are needed, or that you require an alternative means of selecting the data you desire.
On the other hand, when it comes to using JDBC, you need to know ahead of time the relative strengths and weaknesses of using auto-commit, SQL92 syntax, and a CallableStatement object. In this chapter, we'll examine the relative performance of various JDBC objects using example programs that report the amount of time it takes to accomplish a given task. We'll first look at auto-commit. Next, we'll look at the impact of the SQL92 syntax parser. Then we'll start a series of comparisons of the Statement object versus the PreparedStatement object versus the CallableStatement object. At the same time, we'll also examine the performance of the OCI versus the Thin driver in each situation to see if, as Oracle claims, there is a significant enough performance gain with the OCI driver that you should use it instead of the Thin driver. For the most part, our discussions will be based on timing data for 1,000 inserts into the test performance table TESTXXXPERF. There are separate programs for performing these 1,000 inserts using the OCI driver and the Thin driver.
The performance test programs themselves are very simple and are available online with the rest of the examples in this book. However, for brevity, I'll not show the code for the examples in this chapter. I'll only talk about them. Although the actual timing values change from system to system, their relative values, or ratios from one system to another, remain consistent. The timings used in this chapter were gathered using Windows 2000. Using objective data from these programs allows us to come to factual conclusions on which factors improve performance, rather than relying on hearsay.
I'm sure you'll be surprised at the reality of performance for these objects, and I hope you'll use this knowledge to your advantage. Let's get started with a look at the testing framework used in this chapter.
A Testing Framework
For the most part, the test programs in this chapter report the timings for inserting data into a table. I picked an INSERT statement because it eliminates the performance gain of the database block buffers that may skew timings for an UPDATE, DELETE, or SELECT statement.
The test table used in the example programs in this chapter is a simple relational table. I wanted it to have a NUMBER, a small VARCHAR2, a large VARCHAR2, and a DATE column. Table TESTXXXPERF is defined as:
create table TestXXXPerf (
insert_date date )
tablespace users pctfree 20
storage( initial 1 M next 1 M pctincrease 0 );
alter table TestXXXPerf
add constraint TestXXXPerf_Pk
primary key ( id )
tablespace users pctfree 20
storage( initial 1 M next 1 M pctincrease 0 );
The initial extent size used for the table makes it unlikely that the database will need to take the time to allocate another extent during the execution of one of the test programs. Therefore, extent allocation will not impact the timings. Given this background, you should have a context to understand what is done in each section by each test program.
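The book's test programs are not reproduced here, but the timing approach described above can be sketched roughly as follows. This is a minimal, hypothetical harness and not the author's code: the class name InsertTimer, its method names, and the stand-in workload are all assumptions made for illustration. A real test would replace the workload with code that executes the requested number of INSERT statements over JDBC.

```java
import java.util.function.IntConsumer;

/** Hypothetical timing helper; not the book's actual test program. */
class InsertTimer {

    /** Runs the task once for the given row count and returns elapsed milliseconds. */
    static long timeMillis(IntConsumer task, int rows) {
        long start = System.nanoTime();
        task.accept(rows);
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Averages the elapsed time over several runs (the chapter averages three). */
    static double averageMillis(IntConsumer task, int rows, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            total += timeMillis(task, rows);
        }
        return (double) total / runs;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption): a real test would perform
        // `count` INSERT statements against TESTXXXPERF here.
        IntConsumer standIn = count -> {
            long sum = 0;
            for (int i = 0; i < count; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        System.out.println("average ms: " + averageMillis(standIn, 1000, 3));
    }
}
```

Averaging several runs smooths out one-off effects such as class loading on the first pass, which is presumably why the chapter reports three-run averages.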
Auto-Commit
By default, JDBC's auto-commit feature is on, which means that each SQL statement is committed as it is executed. If more than one SQL statement is executed by your program, then a small performance increase can be achieved by turning off auto-commit.
Let's take a look at some numbers. Table 19-1 shows the average time, in milliseconds, needed to insert 1,000 rows into the TESTXXXPERF table using a Statement object. The timings represent the average from three runs of the program. Both drivers experience approximately a one-second loss as overhead for committing between each SQL statement. When you divide that one second by 1,000 inserts, you can see that turning off auto-commit saves approximately 0.001 seconds (1 millisecond) per SQL statement. While that's not interesting enough to write home about, it does demonstrate how auto-commit can impact performance.
Table 19-1: Auto-commit timings (in milliseconds)
Clearly, it's more important to turn off auto-commit for managing multistep transactions than for gaining performance. But on a heavily loaded system where many users are committing transactions, the amount of time it takes to perform commits can become quite significant. So my recommendation is to turn off auto-commit and manage your transactions manually. The rest of the tests in this chapter are performed with auto-commit turned off.
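As a rough illustration of that recommendation, the sketch below turns off auto-commit and commits once after a group of statements. It uses only the standard java.sql calls Connection.setAutoCommit( ), commit( ), and rollback( ). The class and method names are hypothetical, and the caller is assumed to supply an open connection and the SQL text.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper showing manual transaction management. */
class ManualCommitExample {

    /**
     * Executes all statements as one transaction: a single commit on
     * success, a rollback if any statement fails. Returns the total
     * number of rows affected. The connection is left in manual-commit mode.
     */
    static int executeInOneTransaction(Connection conn, String[] sqlStatements)
            throws SQLException {
        conn.setAutoCommit(false); // stop committing after every statement
        int rows = 0;
        try (Statement stmt = conn.createStatement()) {
            for (String sql : sqlStatements) {
                rows += stmt.executeUpdate(sql);
            }
            conn.commit(); // one commit for the whole group
            return rows;
        } catch (SQLException e) {
            conn.rollback(); // undo the partial work
            throw e;
        }
    }
}
```

Besides saving the per-statement commit, this shape keeps a multistep transaction atomic, which is the more important reason given above.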
SQL92 Token Parsing
Like auto-commit, SQL92 escape syntax token parsing is on by default. In case you don't recall, SQL92 token parsing allows you to embed SQL92 escape syntax in your SQL statements (see "Oracle and SQL92 Escape Syntax" in Chapter 9). These standards-based snippets of syntax are parsed by a JDBC driver transforming the SQL statement into its native syntax for the target database. SQL92 escape syntax allows you to make your code more portable--but does this portability come with a cost in terms of performance?
Table 19-2 shows the number of milliseconds needed to insert 1,000 rows into the TESTXXXPERF table. Timings are shown with the SQL92 escape syntax parser on and off for both the OCI and Thin drivers. As before, these timings represent the result of three program runs averaged together.
Table 19-2: SQL92 token parser timings (in milliseconds)
Notice from Table 19-2 that with the OCI driver we lose 177 milliseconds when escape syntax parsing is turned off, and we lose only 37 milliseconds when the parser is turned off with the Thin driver. These results are the opposite of what you might intuitively expect. It appears that both drivers have been optimized for SQL92 parsing, so you should leave it on for best performance.
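For readers who want to repeat this comparison, the standard JDBC switch for the parser is Statement.setEscapeProcessing( ). The sketch below is a minimal illustration under stated assumptions: the class name, the constants, and the two-column insert are hypothetical, and only the id and insert_date columns are taken from the table definition shown earlier (any other columns are assumed to accept nulls).

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper for running one statement with the escape parser on or off. */
class EscapeProcessingExample {

    // Uses the JDBC date escape {d 'yyyy-mm-dd'}; needs escape processing ON.
    static final String INSERT_WITH_ESCAPE =
        "insert into TestXXXPerf ( id, insert_date ) values ( 1, {d '2001-12-31'} )";

    // Same insert in Oracle's native syntax; works with escape processing OFF.
    static final String INSERT_NATIVE =
        "insert into TestXXXPerf ( id, insert_date ) "
        + "values ( 1, to_date( '2001-12-31', 'YYYY-MM-DD' ) )";

    /** Executes the SQL with escape processing set as requested; returns rows affected. */
    static int executeUpdate(Connection conn, String sql, boolean escapeProcessing)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setEscapeProcessing(escapeProcessing);
            return stmt.executeUpdate(sql);
        }
    }
}
```

With the parser off, a statement containing escape syntax is passed to the database untranslated, so only the native form should be used in that mode.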
Now that you know you never have to worry about turning the SQL92 parser off, let's move on to something that has some potential for providing a substantial performance improvement.
Statement Versus PreparedStatement
There's a popular belief that using a PreparedStatement object is faster than using a Statement object. After all, a prepared statement has to verify its metadata against the database only once, while a statement has to do it every time. So how could it be any other way? Well, the truth of the matter is that it takes about 65 iterations of a prepared statement before its total time for execution catches up with a statement. This has performance implications for your application, and exploring these issues is what this section is all about.
When it comes to which SQL statement object performs better under typical use, a Statement or a PreparedStatement, the truth is that the Statement object yields the best performance. When you consider how SQL statements are typically used in an application--1 or 2 here, maybe 10-20 (rarely more) per transaction--you realize that a Statement object will perform them in less time than a PreparedStatement object. In the next two sections, we'll look at this performance issue with respect to both the OCI driver and the Thin driver.
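To make the two approaches concrete, here is a minimal sketch of the same insert loop written both ways. It is not the book's test program. The class and method names are hypothetical, and the insert supplies only the id and insert_date columns from the table definition shown earlier, on the assumption that any other columns accept nulls. Neither method commits; the caller is assumed to manage the transaction.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical side-by-side of Statement and PreparedStatement inserts. */
class StatementVersusPrepared {

    /** Builds and executes a new SQL string for every row. */
    static int insertWithStatement(Connection conn, int rows) throws SQLException {
        int inserted = 0;
        try (Statement stmt = conn.createStatement()) {
            for (int id = 1; id <= rows; id++) {
                inserted += stmt.executeUpdate(
                    "insert into TestXXXPerf ( id, insert_date ) values ( "
                    + id + ", sysdate )");
            }
        }
        return inserted;
    }

    /** Prepares the SQL once, then binds a new id and executes for every row. */
    static int insertWithPreparedStatement(Connection conn, int rows) throws SQLException {
        int inserted = 0;
        try (PreparedStatement pstmt = conn.prepareStatement(
                "insert into TestXXXPerf ( id, insert_date ) values ( ?, sysdate )")) {
            for (int id = 1; id <= rows; id++) {
                pstmt.setInt(1, id);
                inserted += pstmt.executeUpdate();
            }
        }
        return inserted;
    }
}
```

The prepared version pays its setup cost once in prepareStatement( ), which is why the comparison depends so heavily on how many times the statement is executed.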
The OCI Driver
Table 19-3 shows the timings in milliseconds for 1 insert and 1,000 inserts in the TESTXXXPERF table. The inserts are done first using a Statement object and then a PreparedStatement object. If you look at the results for 1,000 inserts, you may think that a prepared statement performs better. After all, at 1,000 inserts, the PreparedStatement object is almost twice as fast as the Statement object, but if you examine Figure 19-1, you'll see a different story.
Table 19-3: OCI driver timings (in milliseconds)
Figure 19-1 is a graph of the timings needed to insert varying numbers of rows using both a Statement object and a PreparedStatement object. The number of inserts begins at 1 and climbs in intervals of 10 up to a maximum of 150 inserts. For this graph and for those that follow, the lines themselves are polynomial trend lines with a factor of 2. I chose polynomial lines instead of straight trend lines so you can better see a change in the performance as the number of inserts increases. I chose a factor of 2 so the lines have only one curve in them. The important thing to notice about the graph is that it's not until about 65 inserts that the PreparedStatement object outperforms the Statement object. 65 inserts! Clearly, the Statement object is more efficient under typical use when using the OCI driver.
Figure 19-1. OCI driver timings
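One way to reason about a crossover point like this is a simple linear cost model: each approach has a one-time setup cost plus a fixed cost per execution. The sketch below computes the break-even count under that model. The model itself is an assumption (the figure uses curved trend lines, so real behavior is not strictly linear), the class and method names are hypothetical, and the input values are invented for illustration and chosen so that the result lands on 65. They are not the book's measurements.

```java
/** Hypothetical break-even calculation under an assumed linear cost model. */
class BreakEvenExample {

    /** Total time = one-time setup + per-execution cost * number of executions. */
    static double totalMillis(double setupMillis, double perExecMillis, int executions) {
        return setupMillis + perExecMillis * executions;
    }

    /**
     * Smallest execution count at which the prepared statement's total time
     * is no greater than the plain statement's. Returns 1 if the prepared
     * statement is never behind, or -1 if it never catches up.
     */
    static int breakEven(double stmtSetup, double stmtPerExec,
                         double prepSetup, double prepPerExec) {
        double extraSetup = prepSetup - stmtSetup;       // cost to win back
        double savedPerExec = stmtPerExec - prepPerExec; // gain per execution
        if (extraSetup <= 0) {
            return 1;
        }
        if (savedPerExec <= 0) {
            return -1;
        }
        return (int) Math.ceil(extraSetup / savedPerExec);
    }

    public static void main(String[] args) {
        // Invented inputs: 10 ms vs. 140 ms setup, 3 ms vs. 1 ms per execution.
        System.out.println(breakEven(10, 3, 140, 1));
    }
}
```

Under these made-up inputs the extra 130 ms of setup is recovered at 2 ms per execution, so the two totals meet at 65 executions; below that count the plain statement finishes first.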
The Thin Driver
If you examine Table 19-4 (which shows the same timings as for Table 19-3, but for the Thin driver) and Figure 19-2 (which shows the data incrementally), you'll see that the Thin driver follows the same behavior as the OCI driver. However, since the Statement object starts out performing better than the PreparedStatement object, it takes about 125 inserts for the PreparedStatement object to outperform the Statement object.
Table 19-4: Thin driver timings (in milliseconds)
Figure 19-2. Thin driver timings
When you consider typical SQL statement usage, even with the Thin driver, you'll get better performance if you execute your SQL statements using a Statement object instead of a PreparedStatement object. Given that, you may ask: why use a PreparedStatement at all? It turns out that there are some reasons why you might use a PreparedStatement object to execute SQL statements. First, there are several types of operations that you simply can't perform without a PreparedStatement object. For example, you must use a PreparedStatement object if you want to use large objects like BLOBs or CLOBs or if you wish to use object SQL. Essentially, you trade some loss of performance for the added functionality of using these object technologies. A second reason to use a PreparedStatement is its support for batching.
Batching
As you saw in the previous section, PreparedStatement objects eventually become more efficient than their Statement counterparts after 65-125 executions of the same statement. If you're going to execute a given SQL statement a large number of times, it makes sense from a performance standpoint to use a PreparedStatement object. But if you're really going to do that many executions of a statement, or perhaps more than 50, you should consider batching. Batching is more efficient because it sends multiple SQL statements to the server at one time. Although JDBC defines batching capability for Statement objects, Oracle supports batching only when PreparedStatement objects are used. This makes some sense. A SQL statement in a PreparedStatement object is parsed once and can be reused many times. This naturally lends itself to batching.
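A batched insert using the standard JDBC calls addBatch( ) and executeBatch( ) might look like the sketch below. This is an illustration only: the class name, method name, and batchSize parameter are hypothetical, the insert again assumes that only id and insert_date need values, and it uses standard batching, which may not be what a particular driver release supports or what the book's own tests used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical example of standard JDBC batching with a PreparedStatement. */
class BatchedInsertExample {

    /**
     * Queues one insert per row and sends the queue to the server every
     * batchSize rows. Returns the number of batches sent. Does not commit.
     */
    static int insertBatched(Connection conn, int rows, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batchesSent = 0;
        try (PreparedStatement pstmt = conn.prepareStatement(
                "insert into TestXXXPerf ( id, insert_date ) values ( ?, sysdate )")) {
            int pending = 0;
            for (int id = 1; id <= rows; id++) {
                pstmt.setInt(1, id);
                pstmt.addBatch(); // queue locally; nothing is sent yet
                if (++pending == batchSize) {
                    pstmt.executeBatch(); // one round trip for the whole group
                    batchesSent++;
                    pending = 0;
                }
            }
            if (pending > 0) { // send any leftover rows
                pstmt.executeBatch();
                batchesSent++;
            }
        }
        return batchesSent;
    }
}
```

For example, 1,000 rows with a batch size of 100 would reach the server in 10 round trips instead of 1,000, which is where the savings described above come from.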
The OCI Driver
Table 19-5 lists Statement and batched PreparedStatement timings, in milliseconds, for 1 insert and for 1,000 inserts. At the low end, one insert, you take a small performance hit for supporting batching. At the high end, 1,000 inserts, you've gained 75% throughput.
Table 19-5: OCI driver timings (in milliseconds)
If you examine Figure 19-3, a trend line analysis of the Statement object versus the batched PreparedStatement object, you'll see that this time, the batched PreparedStatement object becomes more efficient than the Statement object at about 50 inserts. This is an improvement over the prepared statement without batching.
Figure 19-3. OCI driver timings for batched SQL
WARNING: There's a catch here. The 8.1.6 OCI driver has a defect by which it does not support standard Java batching, so the numbers reported here were derived using Oracle's proprietary batching.
Now, let's take a look at batching in conjunction with the Thin driver.
The Thin Driver
The Thin driver is even more efficient than the OCI driver when it comes to using batched prepared statements. Table 19-6 shows the timings for the Thin driver using a Statement object versus a batched PreparedStatement object in milliseconds for the specified number of inserts.
Table 19-6: Thin driver timings (in milliseconds)
The Thin driver takes the same performance hit on the low end, one insert, but gains a whopping 86% improvement on the high end. Yes, 1,000 inserts in less than a second! If you examine Figure 19-4, you'll see that with the Thin driver, the use of a batched PreparedStatement object becomes more efficient than a Statement object more quickly than with the OCI driver--at about 40 inserts.
Figure 19-4. Thin driver timings for batched SQL
If you intend to perform many iterations of the same SQL statement against a database, you should consider batching with a PreparedStatement object.
We've finished looking at improving the performance of inserts, updates, and deletes. Now let's see what we can do to squeak out a little performance while selecting data.
Predefined SELECT Statements
Every time you execute a SELECT statement, the JDBC driver makes two round trips to the database. On the first round trip, it retrieves the metadata for the columns you are selecting. On the second round trip, it retrieves the actual data you selected. With this in mind, you can improve the performance of a SELECT statement by 50% if you predefine the SELECT statement by using Oracle's defineColumnType( ) method with an OracleStatement object (see "Defining Columns" in Chapter 9). When you predefine a SELECT statement, you provide the JDBC driver with the column metadata using the defineColumnType( ) method, obviating the need for the driver to make a round trip to the database for that information. Hence, for a singleton SELECT, you eliminate half the work when you predefine the statement.
Table 19-7 shows the timings in milliseconds required to select a single row from the TESTXXXPERF table. Timings are shown for when the column type has been predefined and when it has not been predefined. Timings are shown for both the OCI and Thin drivers. Although the defineColumnType( ) method shows little improvement with either driver in my test, on a loaded network, you'll see a differentiation in the timings of about 50%. Given a situation in which you need to make several tight calls to the database using a Statement, a predefined SELECT statement can save you a significant amount of time.
Table 19-7: Select timings (in milliseconds)
Now that we've looked at auto-commit, SQL92 parsing, prepared statements, and a predefined SELECT, let's take a look at the performance of callable statements.
CallableStatements
As you may recall, CallableStatement objects are used to execute database stored procedures. I've saved CallableStatement objects until last, because they are the slowest performers of all the JDBC SQL execution interfaces. This may sound counterintuitive, because it's commonly believed that calling stored procedures is faster than using SQL, but that's simply not true. Given a simple SQL statement, and a stored procedure call that accomplishes the same task, the simple SQL statement will always execute faster. Why? Because with the stored procedure, you not only have the time needed to execute the SQL statement but also the time needed to deal with the overhead of the procedure call itself.
Table 19-8 lists the relative time, in milliseconds, needed to call the stored procedure TESTXXXPERF$.SETTESTXXXPERF( ). This stored procedure inserts one row into the table TESTXXXPERF. Timings are provided for both the OCI and Thin drivers. Notice that both drivers are slower when inserting a row this way than when using either a statement or a batched prepared statement (refer to Tables 19-3 through 19-6). Common sense will tell you why. The SETTESTXXXPERF( ) procedure inserts a row into the database. It does exactly the same thing that the other JDBC objects did but with the added overhead of a round trip for executing the remote procedure call.
Table 19-8: Stored procedure call timings (in milliseconds)
Stored procedures do have their uses. If you have a complex task that requires several SQL statements to complete, and you encapsulate those SQL statements into a stored procedure that you then call only once, you'll get better performance than if you executed each SQL statement separately from your program. This performance gain is the result of your program not having to move all the related data back and forth over the network, which is often the slowest part of the data manipulation process. This is how stored procedures are supposed to be used with Oracle--not as a substitute for SQL, but as a means to perform work where it can be done most efficiently.
OCI Versus Thin Drivers
Oracle's documentation states that you should use the OCI driver for maximum performance and the Thin driver for maximum portability. However, I recommend using the Thin driver all the time. Let's take a look at some numbers from Windows 2000. Table 19-9 lists all the statistics we've covered in this chapter.
Table 19-9: OCI versus Thin driver timings (in milliseconds)
1,000 inserts with auto-commit
1,000 inserts with manual commit
1 insert with
1,000 inserts with
1 insert with
1,000 inserts batched
1 insert with
1,000 inserts with
As you can see from Table 19-9, the Thin driver clearly outperforms the OCI driver for every type of operation except executions of CallableStatement objects. On a Unix platform, my experience has been that the CallableStatement numbers are tilted even more in favor of the OCI driver. Nonetheless, you can feel completely comfortable using the Thin driver in almost any setting. The Thin driver has been well-tuned by Oracle's JDBC development team to perform better than its OCI counterpart.
© 2001, O'Reilly & Associates, Inc.