10.2 Logical Query Optimization

The first step in optimization is to choose the relational algebra translation for the high-level query. The relational algebra formulation of a query is of central importance in query processing and optimization, because it specifies the order of operations, and that order can largely determine how efficient the query plan will be. We will use the University database schema to illustrate manipulation of relational algebra queries. We will assume that Student has 10,000 records, Class has 2,500 records, Faculty has 500 records, and Enroll has 50,000 records. The schema is as follows:

An illustration of the relation between 4 schemas. The schemas are as follows. Student, open parentheses, s t u I d, comma, last Name, comma, first Name, comma, major, comma, credits, close parentheses. The variable s t u I d is underlined. Class, open parentheses, class Number, comma, f a c I d, comma, schedule, comma, room, close parentheses. The variable labeled class Number is underlined. Faculty, open parentheses, f a c I d, comma, name, comma, department, comma, rank, close parentheses. The variable labeled f a c I d is underlined. Enroll, open parentheses, s t u d I d, comma, class Number, comma, grade, close parentheses. The variables s t u I d and class Number are underlined. An arrow is drawn from the variable s t u d I d of Enroll to the variable of the same name in Student. An arrow is drawn from the variable class Number of Enroll to the variable of the same name in Class. An arrow is drawn from the variable f a c I d of Class to the variable of the same name in Faculty.

10.2.1 The Query Tree

A technique ...

Get Databases Illuminated, 4th Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.