In this chapter
Creating Table Aliases with AS
Creating Joins with JOIN or WHERE
Creating a Cross Join with CROSS JOIN
Creating a Natural Join with NATURAL JOIN
Creating an Inner Join with INNER JOIN
All the queries so far have retrieved rows from a single table. This chapter explains how to use joins to retrieve rows from multiple tables simultaneously. Recall from “Relationships” in Chapter 2 that a relationship is an association established between common columns in two tables. A join is a table operation that uses related columns to combine rows from two input tables into one result table. You can chain joins to retrieve rows from an unlimited number of tables.
Why do joins matter? The most important database information isn’t so much stored in the rows of individual tables; rather, it’s the implied relationships between sets of related rows. In the sample database, for example, the individual rows of the tables authors, publishers, and titles contain important values, of course, but it’s the implied relationships that let you understand and analyze your data in its entirety: Who wrote what? Who published what? To whom do we send royalty checks? For how much? And so on.
This chapter explains the different types of joins, why they’re used, and how to create a SELECT statement that uses them.
Recall from “Tables, Columns, and Rows” in Chapter 2 that column names must be unique within a table but can be reused in other tables. The tables authors and publishers in the sample database both contain a column named city, for example.
To identify an otherwise-ambiguous column uniquely in a query that involves multiple tables, use its qualified name. A qualified name is a table name followed by a dot and the name of the column in the table. Because tables must have different names within a database, a qualified name identifies a single column uniquely within the entire database.
To qualify a column name:
Type:
table.column
column is a column name, and table is name of the table that contains column (Listing 7.1 and Figure 7.1).
Listing 7.1Here, the qualified names resolve otherwise-ambiguous references to the column city in the tables authors and publishers. See Figure 7.1 for the result.
SELECT au_id, authors.city
FROM authors
INNER JOIN publishers
ON authors.city = publishers.city;
Figure 7.1Result of Listing 7.1. This result lists authors who live in the same city as some publisher; the join syntax is explained later in this chapter.
au_id city
----- -------------
A03 San Francisco
A04 San Francisco
A05 New York
Qualification still works in queries that involve a single table. In fact, every column has an implicit qualifier. The following two statements are equivalent:
SELECT
au_fname,
au_lname
FROM authors;
and
SELECT
authors.au_fname,
authors.au_lname
FROM authors;
Your query might require still more qualifiers, depending on where it resides in the DBMS hierarchy. You might need to qualify a table with a server, database, or schema name, for example (see Table 2.2 in Chapter 2). Table aliases, described in the next section, are useful in SQL statements that require lengthy qualified names. A fully qualified table name in Microsoft SQL Server, for example, is:
server.database.owner.table
Oracle 8i and earlier require WHERE join syntax; see “Creating Joins with JOIN or WHERE” later in this chapter. To run Listing 7.1, type:
SELECT au_id, authors.city
FROM authors, publishers
WHERE authors.city =
publishers.city;
You can create table aliases by using AS just as you can create column aliases; see “Creating Column Aliases with AS” in Chapter 4. Table aliases:
To create a table alias:
In a FROM clause or JOIN clause, type:
table [AS] alias
table is a table name, and alias is its alias name. alias is a single, unquoted word that contains only letters, digits, or underscores; don’t use spaces, punctuation, or special characters. The AS keyword is optional (Listing 7.2 and Figure 7.2).
Listing 7.2Tables aliases make queries shorter and easier to read. Note that you can use an alias in the SELECT clause before it’s actually defined later in the statement. See Figure 7.2 for the result.
SELECT au_fname, au_lname, a.city
FROM authors a
INNER JOIN publishers p
ON a.city = p.city;
Figure 7.2Result of Listing 7.2.
au_fname au_lname city
--------- -------- -------------
Hallie Hull San Francisco
Klee Hull San Francisco
Christian Kells New York
An alias name hides a table name. If you alias a table, then you must use its alias in all qualified references. The following statement is illegal because the alias a occludes the table name authors:
SELECT authors.au_id
FROM authors a; --Illegal
Using WHERE join syntax , Listing 7.2 is equivalent to:
SELECT a.au_fname, a.au_lname, a.city
FROM authors a, publishers p
WHERE a.city = p.city;
In Oracle, you must omit the keyword AS when you create a table alias.
PostgreSQL implicitly adds table name(s) that appear in the SELECT clause to the FROM clause, which can cause unexpected cross joins. The query SELECT titles.title_id FROM titles t;
, for example, cross-joins the table titles, returning 169 (132) rows instead of an error. To turn off this behavior, use SET ADD_MISSING_FROM TO FALSE;
.
You can use a join to extract data from more than one table. The rest of this chapter explains the different types of joins (Table 7.1), why they’re used, and how to create SELECT statements that use them.
Join | Description |
---|---|
Cross join | Returns all rows from the first table in which each row from the first table is combined with all rows from the second table. |
Natural join | A join that compares, for equality, all the columns in the first table with corresponding columns that have the same name in the second table. |
Inner join | A join that uses a comparison operator to match rows from two tables based on the values in common columns from each table. Inner joins are the most common type of join. |
Left outer join | Returns all the rows from the left table, not just the ones in which the joined columns match. If a row in the left table has no matching rows in the right table, then the associated result row contains nulls for all SELECT-clause columns coming from the right table. |
Right outer join | The reverse of a left outer join. All rows from the right table are returned. Nulls are returned for the left table if a right-table row has no matching left-table row. |
Full outer join | Returns all rows in both the left and right tables. If a row has no match in the other table, then the SELECT-clause columns from the other table contain nulls. If there is a match between the tables, then the entire result row contains values from both tables. |
Self-join | A join of a table to itself. |
The important characteristics of joins are:
The data types of the join columns must be compatible, meaning that the DBMS can convert values to a common type for comparisons. For most DBMSs, numeric data types (INTEGER, FLOAT, and NUMERIC, for example), character data types (CHAR, VARCHAR), and datetime data types (DATE, TIMESTAMP) are compatible. You can’t join binary objects.
Conversions require computational overhead. For the best performance, the join columns should have identical data types and constraints, including whether nulls are allowed.
Domains and Comparisons
The values that you compare in joins and WHERE clauses must be meaningfully comparable—that is, have the same data type and the same meaning. The sample-database columns au_id and pub_id, for example, have the same data type—both are CHAR(3), a letter followed by two digits—but mean different things, so they can’t be compared sensibly.
Recall from “Tables, Columns, and Rows” in Chapter 2 that a domain is the set of permissible values for a column. To prevent meaningless comparisons, the relational model requires that comparable columns draw from domains that have the same meaning. Unfortunately, SQL and DBMSs stray from the model and have no intrinsic mechanism that prevents users from comparing, say, IQ and shoe size. If you’re building a database application, then it’s up to you to stop (or warn) users from making meaningless comparisons that waste processing time or, worse, yield results that might be interpreted as valid.
You have two alternative ways of specifying a join: by using JOIN syntax or WHERE syntax. SQL-92 and later standards prescribe JOIN syntax, but older standards prescribe WHERE; hence, both JOIN and WHERE are used widely.
This section explains the general syntax for JOIN and WHERE joins that involve two tables. The actual syntax that you’ll use in real queries will vary by the join type, the number of columns joined, the number of tables joined, and the syntax requirements of your DBMS. The syntax diagrams and examples in the following sections show you how to create specific joins.
To create a join by using JOIN:
Type:
SELECT columns
FROM table1 join_type table2
ON join_conditions
[WHERE search_condition]
[GROUP BY grouping_columns]
[HAVING search_condition]
[ORDER BY sort_columns];
columns is one or more comma-separated expressions or column names from table1 or table2. If table1 and table2 have a column name in common, then you must qualify all references to these columns throughout the query to prevent ambiguity; see “Qualifying Column Names” earlier in this chapter.
table1 and table2 are the names of the joined tables. You can alias the table names; see “Creating Table Aliases with AS” earlier in this chapter.
join_type specifies what kind of join is performed: CROSS JOIN, NATURAL JOIN, INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, or FULL OUTER JOIN.
join_conditions specifies one or more join conditions to be evaluated for each pair of joined rows. (The ON clause isn’t allowed in cross joins and natural joins.) A join condition takes this form:
[table1.]column op [table2.]column
op usually is = but can be any comparison operator: =, <>, <, <=, >, or >= (refer to Table 4.2 in Chapter 4). You can combine multiple join conditions with AND or OR; see “Combining and Negating Conditions with AND, OR, and NOT” in Chapter 4.
The WHERE and ORDER BY clauses are covered in Chapter 4; GROUP BY and HAVING are covered in Chapter 6.
To create a join by using WHERE:
Type:
SELECT columns
FROM table1, table2
WHERE join_conditions
[GROUP BY grouping_columns]
[HAVING search_condition]
[ORDER BY sort_columns];
columns, table1, and table2 have the same meaning as in “To create a join by using JOIN” earlier in this section.
join_conditions also has the same meaning as in “To create a join by using JOIN” earlier in this section, except that op can be a special symbol that indicates the join type. The WHERE clause also can include (nonjoin) search conditions to filter rows; see “Filtering Rows with WHERE” in Chapter 4.
The ORDER BY clause is covered in Chapter 4; GROUP BY and HAVING are covered in Chapter 6.
Listings 7.3a and 7.3b show equivalent queries that use JOIN and WHERE syntax. See Figure 7.3 for the result.
Listing 7.3aA join that uses JOIN syntax. See Figure 7.3 for the result.
SELECT au_fname, au_lname, a.city
FROM authors a
INNER JOIN publishers p
ON a.city = p.city;
Listing 7.3bThe same join, using WHERE syntax. See Figure 7.3 for the result.
SELECT au_fname, au_lname, a.city
FROM authors a, publishers p
WHERE a.city = p.city;
Figure 7.3Result of Listings 7.3a and 7.3b.
au_fname au_lname city
--------- -------- -------------
Hallie Hull San Francisco
Klee Hull San Francisco
Christian Kells New York
Query Execution Sequence
When your DBMS processes joins, it uses a logical sequence to execute the entire query. The DBMS:
Oracle 8i and earlier don’t support JOIN syntax; use WHERE joins instead. Oracle 9i and later support JOIN syntax.
Your DBMS might prohibit joins on columns with particular data types (especially binary and long-text data types). Microsoft SQL Server prohibits joins on nvarchar(max), varchar(max), and varbinary(max) columns, and Oracle prohibits joins on LOB columns, for example. Search your DBMS documentation for joins.
The USING Clause
For JOIN syntax, the SQL standard also defines a USING clause that can be used instead of the ON clause if the joined columns have the same name and are compared for equality:
FROM table1 join_type table2
USING (columns)
columns is a comma-separated list of one or more column names. The parentheses are required. The query performs an equijoin on the named pair(s) of columns. The type of join is called a named columns join. Rewriting Listing 7.3a with USING:
SELECT au_fname, au_lname, city
FROM authors
INNER JOIN publishers
USING (city);
The USING clause acts like a natural join, except that you can use it if you don’t want to join all pairs of columns with the same name in both tables. Note that the preceding USING example joins only on the column city in both tables, whereas a natural join would join on both the columns city and state common to the tables. See “Creating a Natural Join with NATURAL JOIN” later in this chapter.
USING is a syntactic convenience that doesn’t add extra functionality to SQL. A USING clause always can be replicated with an ON clause in JOIN syntax or with a WHERE clause in WHERE syntax.
Microsoft Access, Microsoft SQL Server, and Db2 don’t support USING. MySQL requires the SELECT clause’s common column names to be qualified in USING queries. To run the preceding example, change city to authors.city in the SELECT clause.
A cross join:
To create a cross join:
Type:
SELECT columns
FROM table1
CROSS JOIN table2
columns is one or more comma-separated expressions or column names from table1 or table2. table1 and table2 are the names of the joined tables. If the tables have some column names in common, then qualify those column names with the names of the tables (Listing 7.4 and Figure 7.4).
Listing 7.4A cross join displays all possible combinations of rows from two tables. See Figure 7.4 for the result.
SELECT
au_id,
pub_id,
a.state AS "au_state",
p.state AS "pub_state"
FROM authors a
CROSS JOIN publishers p;
Figure 7.4Result of Listing 7.4.
au_id pub_id au_state pub_state
----- ------ -------- ---------
A01 P01 NY NY
A02 P01 CO NY
A03 P01 CA NY
A04 P01 CA NY
A05 P01 NY NY
A06 P01 CA NY
A07 P01 FL NY
A01 P02 NY CA
A02 P02 CO CA
A03 P02 CA CA
A04 P02 CA CA
A05 P02 NY CA
A06 P02 CA CA
A07 P02 FL CA
A01 P03 NY NULL
A02 P03 CO NULL
A03 P03 CA NULL
A04 P03 CA NULL
A05 P03 NY NULL
A06 P03 CA NULL
A07 P03 FL NULL
A01 P04 NY CA
A02 P04 CO CA
A03 P04 CA CA
A04 P04 CA CA
A05 P04 NY CA
A06 P04 CA CA
A07 P04 FL CA
Using WHERE syntax, Listing 7.4 is equivalent to:
SELECT au_id, pub_id,
a.state AS "au_state",
p.state AS "pub_state"
FROM authors a, publishers p;
Use SELECT * to retrieve all columns from both tables. This query retrieves all columns from the tables authors and publishers:
SELECT *
FROM authors
CROSS JOIN publishers;
Equivalently, using WHERE syntax:
SELECT *
FROM authors, publishers;
Use SELECT table.* to retrieve all columns from just one of the tables. The following query retrieves all columns from the table authors and only the column pub_id from the table publishers:
SELECT authors.*, p.pub_id
FROM authors
CROSS JOIN publishers p;
Equivalently, using WHERE syntax:
SELECT authors.*, p.pub_id
FROM authors, publishers p;
To find the cross product of n tables by using JOIN syntax, type:
SELECT columns
FROM table1
CROSS JOIN table2
...
CROSS JOIN tableN
Equivalently, using WHERE syntax:
SELECT columns
FROM table1, table2,..., tableN
The join
t1 CROSS JOIN t2
is equivalent to any of the following joins:
t1 INNER JOIN t2 ON 1 = 1
t1 LEFT OUTER JOIN t2 ON 1 = 1
t1 RIGHT OUTER JOIN t2 ON 1 = 1
t1 FULL OUTER JOIN t2 ON 1 = 1
t1 and t2 are tables, and 1 = 1 represents any condition that’s always true. Inner and outer joins are covered later in this chapter.
Microsoft Access supports only WHERE syntax for cross joins. To run Listing 7.4, use the SQL statement given in the first tip in this section.
Oracle 8i and earlier don’t support JOIN syntax; use WHERE joins instead.
A natural join:
To create a natural join:
Type:
SELECT columns
FROM table1
NATURAL JOIN table2
columns is one or more comma-separated expressions or column names from table1 or table2. Your DBMS might require identical column names to be qualified with the names of the tables (see the DBMS tip in “Tips for NATURAL JOIN”). table1 and table2 are the names of the joined tables.
The columns in table1 are joined with the identically named columns in table2 and compared for equality. NATURAL JOIN creates natural inner joins; to create natural outer joins, see “Tips for NATURAL JOIN”.
When your DBMS runs Listing 7.5, it will join rows in the table publishers with rows in the table titles that have equal values in the columns publishers.pub_id and titles.pub_id—the two columns that have the same name in both tables. See Figure 7.5 for the result.
Listing 7.5List each book’s publisher. See Figure 7.5 for the result.
SELECT
title_id,
pub_id,
pub_name
FROM publishers
NATURAL JOIN titles;
Figure 7.5Result of Listing 7.5.
title_id pub_id pub_name
-------- ------ -------------------
T01 P01 Abatis Publishers
T02 P03 Schadenfreude Press
T03 P02 Core Dump Books
T04 P04 Tenterhooks Press
T05 P04 Tenterhooks Press
T06 P01 Abatis Publishers
T07 P03 Schadenfreude Press
T08 P04 Tenterhooks Press
T09 P04 Tenterhooks Press
T10 P01 Abatis Publishers
T11 P04 Tenterhooks Press
T12 P01 Abatis Publishers
T13 P03 Schadenfreude Press
In Listing 7.6, I’ve added another join to Listing 7.5 to retrieve the advance for each book. The WHERE condition retrieves books with advances less than $20000. When your DBMS runs Listing 7.6, it will join the pub_id columns in the tables publishers and titles, and it will join the title_id columns in the tables titles and royalties. See Figure 7.6 for the result.
Listing 7.6List each book’s publisher and advance for books with advances less than $20000. See Figure 7.6 for the result.
SELECT
title_id,
pub_id,
pub_name,
advance
FROM publishers
NATURAL JOIN titles
NATURAL JOIN royalties
WHERE advance < 20000;
Figure 7.6Result of Listing 7.6.
title_id pub_id pub_name advance
-------- ------ ------------------- -------
T01 P01 Abatis Publishers 10000
T02 P03 Schadenfreude Press 1000
T03 P02 Core Dump Books 15000
T08 P04 Tenterhooks Press 0
T09 P04 Tenterhooks Press 0
To replicate a natural join by using WHERE syntax, use an equijoin with a WHERE clause that uses AND operators to combine join conditions. Each join condition equates each pair of columns with the same name in the input tables. The equivalent WHERE queries are (Listing 7.5):
SELECT t.title_id, t.pub_id,
p.pub_name
FROM publishers p, titles t
WHERE p.pub_id = t.pub_id;
and (Listing 7.6):
SELECT t.title_id, t.pub_id,
p.pub_name, r.advance
FROM publishers p, titles t,
royalties r
WHERE p.pub_id = t.pub_id
AND t.title_id = r.title_id
AND r.advance < 20000;
To replicate a natural join by using inner or outer JOIN syntax, use an equijoin with an ON clause that uses AND operators to combine join conditions. Each join condition equates each pair of columns with the same name in both input tables. The equivalent JOIN queries are (Listing 7.5):
SELECT t.title_id, t.pub_id,
p.pub_name
FROM publishers p
INNER JOIN titles t
ON p.pub_id = t.pub_id;
and (Listing 7.6):
SELECT t.title_id, t.pub_id,
p.pub_name, r.advance
FROM publishers p
INNER JOIN titles t
ON p.pub_id = t.pub_id
INNER JOIN royalties r
ON t.title_id = r.title_id
WHERE r.advance < 20000;
You also can replicate a natural join by using JOIN syntax with a USING clause (see “The USING Clause” earlier in this chapter). NATURAL JOIN is a shorthand form of USING; it forms a USING list consisting of exactly those column names that appear in both tables. The equivalent USING queries are (Listing 7.5):
SELECT title_id, pub_id,
pub_name
FROM publishers
INNER JOIN titles
USING (pub_id);
and (Listing 7.6):
SELECT title_id, pub_id,
pub_name, advance
FROM publishers
INNER JOIN titles
USING (pub_id)
INNER JOIN royalties
USING (title_id)
WHERE advance < 20000;
The syntax NATURAL JOIN actually creates an inner join: NATURAL JOIN is equivalent to NATURAL INNER JOIN. You can create natural outer joins with:
NATURAL LEFT [OUTER] JOIN
NATURAL RIGHT [OUTER] JOIN
NATURAL FULL [OUTER] JOIN
The meaning of natural join differs slightly in the relational model (Chapter 2) and the SQL standard. In the model, a natural join always is a join from a foreign key to its parent key. In SQL, a natural join is a join of two tables over all columns that have the same name (not just key columns). See Listing 7.9 later in this chapter for an example of a natural join that doesn’t involve key columns.
To make the model and the SQL definitions of a natural join agree, a database designer will ensure that all the foreign keys have the same names as their parent keys and that all other columns have unique names.
Microsoft Access, Microsoft SQL Server, and Db2 don’t support NATURAL JOIN syntax. To run Listings 7.5 and 7.6, use either WHERE syntax (given in the first tip in this section) or equivalent JOIN syntax (given in the second tip in this section).
Oracle 8i and earlier don’t support JOIN syntax; use WHERE joins instead.
MySQL 4.1 and earlier require common column names to be qualified in natural joins. To run Listings 7.5 and 7.6, add qualifiers (Listing 7.5):
SELECT
t.title_id,
t.pub_id,
p.pub_name
FROM publishers p
NATURAL JOIN titles t;
and (Listing 7.6):
SELECT
t.title_id,
t.pub_id,
p.pub_name,
r.advance
FROM publishers p
NATURAL JOIN titles t
NATURAL JOIN royalties r
WHERE r.advance < 20000;
An inner join:
To create an inner join:
Type:
SELECT columns
FROM table1
INNER JOIN table2
ON join_conditions
columns is one or more comma-separated expressions or column names from table1 or table2. table1 and table2 are the names of the joined tables. If the tables have some column names in common, then qualify those column names with the names of the tables.
join_conditions specifies one or more join conditions to be evaluated for each pair of joined rows. A join condition takes this form:
[table1.]column op [table2.]column
op usually is = but can be any comparison operator: =, <>, <, <=, >, or >= (refer to Table 4.2 in Chapter 4). You can combine multiple join conditions with AND or OR; see “Combining and Negating Conditions with AND, OR, and NOT” in Chapter 4.
To create an inner join of three or more tables by using JOIN syntax, type:
SELECT columns
FROM table1
INNER JOIN table2
ON join_condition1
INNER JOIN table3
ON join_condition2
...
Using WHERE syntax, type:
SELECT columns
FROM table1, table2,...
WHERE join_condition1
AND join_condition2
...
You can use WHERE syntax or JOIN syntax in Microsoft Access, but if you use JOIN syntax in joins that involve three or more tables, then Access requires you to nest joins by using the following general syntax:
SELECT columns
FROM table1
INNER JOIN (table2 INNER JOIN [(]table3
[INNER JOIN [(]tableN [INNER JOIN ...)]
ON table3.column3 op tableN.columnN)]
ON table2.column2 op table3.column3)
ON table1.column1 op table2.column2;
(Other DBMSs also let you nest joins by using parentheses, but Access requires it.)
Oracle 8i and earlier don’t support JOIN syntax; use WHERE joins instead. Oracle 9i and later support JOIN syntax.
Listing 7.7 joins two tables on the column au_id to list the books that each author wrote (or cowrote). Each author’s au_id in the table authors matches zero or more rows in the table title_authors. See Figure 7.7 for the result. Note that author A07 (Paddy O'Furniture) is omitted from the result because he has written no books and so has no matching rows in title_authors.
Listing 7.7List the books that each author wrote (or cowrote). See Figure 7.7 for the result.
SELECT
a.au_id,
a.au_fname,
a.au_lname,
ta.title_id
FROM authors a
INNER JOIN title_authors ta
ON a.au_id = ta.au_id
ORDER BY a.au_id ASC, ta.title_id ASC;
Figure 7.7Result of Listing 7.7.
au_id au_fname au_lname title_id
----- --------- --------- --------
A01 Sarah Buchman T01
A01 Sarah Buchman T02
A01 Sarah Buchman T13
A02 Wendy Heydemark T06
A02 Wendy Heydemark T07
A02 Wendy Heydemark T10
A02 Wendy Heydemark T12
A03 Hallie Hull T04
A03 Hallie Hull T11
A04 Klee Hull T04
A04 Klee Hull T05
A04 Klee Hull T07
A04 Klee Hull T11
A05 Christian Kells T03
A06 Kellsey T08
A06 Kellsey T09
A06 Kellsey T11
Using WHERE syntax, Listing 7.7 is equivalent to:
SELECT a.au_id, a.au_fname,
a.au_lname, ta.title_id
FROM authors a, title_authors ta
WHERE a.au_id = ta.au_id
ORDER BY a.au_id ASC,
ta.title_id ASC;
Listing 7.8 joins two tables on the column pub_id to list each book’s title name and ID, and each book’s publisher name and ID. Note that the join is necessary to retrieve only the publisher name (the fourth column in the result); all the other three columns are available in the table titles. See Figure 7.8 for the result.
Listing 7.8List each book’s title name and ID and each book’s publisher name and ID. See Figure 7.8 for the result.
SELECT
t.title_id,
t.title_name,
t.pub_id,
p.pub_name
FROM titles t
INNER JOIN publishers p
ON p.pub_id = t.pub_id
ORDER BY t.title_name ASC;
Figure 7.8Result of Listing 7.8.
title_id title_name pub_id pub_name -------- ----------------------------------- ------ ------------------- T01 1977! P01 Abatis Publishers T02 200 Years of German Humor P03 Schadenfreude Press T03 Ask Your System Administrator P02 Core Dump Books T04 But I Did It Unconsciously P04 Tenterhooks Press T05 Exchange of Platitudes P04 Tenterhooks Press T06 How About Never? P01 Abatis Publishers T07 I Blame My Mother P03 Schadenfreude Press T08 Just Wait Until After School P04 Tenterhooks Press T09 Kiss My Boo-Boo P04 Tenterhooks Press T10 Not Without My Faberge Egg P01 Abatis Publishers T11 Perhaps It's a Glandular Problem P04 Tenterhooks Press T12 Spontaneous, Not Annoying P01 Abatis Publishers T13 What Are The Civilian Applications? P03 Schadenfreude Press
Using WHERE syntax, Listing 7.8 is equivalent to:
SELECT t.title_id, t.title_name,
t.pub_id, p.pub_name
FROM titles t, publishers p
WHERE p.pub_id = t.pub_id
ORDER BY t.title_name ASC;
Listing 7.9 uses two join conditions to list the authors who live in the same city and state as some publisher (any publisher). See Figure 7.9 for the result. Note that this query is a natural join on the identically named, nonkey columns city and state in the two tables (see “Creating a Natural Join with NATURAL JOIN” earlier in this chapter). An equivalent query is:
SELECT a.au_id, a.au_fname,
a.au_lname, a.city, a.state
FROM authors a
NATURAL JOIN publishers p
ORDER BY a.au_id ASC;
Listing 7.9List the authors who live in the same city and state in which a publisher is located. See Figure 7.9 for the result.
SELECT
a.au_id,
a.au_fname,
a.au_lname,
a.city,
a.state
FROM authors a
INNER JOIN publishers p
ON a.city = p.city
AND a.state = p.state
ORDER BY a.au_id;
Figure 7.9Result of Listing 7.9.
au_id au_fname au_lname city state
----- --------- -------- ------------- -----
A03 Hallie Hull San Francisco CA
A04 Klee Hull San Francisco CA
A05 Christian Kells New York NY
Using WHERE syntax, Listing 7.9 is equivalent to:
SELECT a.au_id, a.au_fname,
a.au_lname, a.city, a.state
FROM authors a, publishers p
WHERE a.city = p.city
AND a.state = p.state
ORDER BY a.au_id ASC;
Listing 7.10 combines an inner join with WHERE conditions to list books published in California or outside the large North American countries; see “Filtering Rows with WHERE” in Chapter 4. See Figure 7.10 for the result.
Listing 7.10List the books published in California or outside the large North American countries. See Figure 7.10 for the result.
SELECT
t.title_id,
t.title_name,
p.state,
p.country
FROM titles t
INNER JOIN publishers p
ON t.pub_id = p.pub_id
WHERE p.state = 'CA'
OR p.country NOT IN
('USA', 'Canada', 'Mexico')
ORDER BY t.title_id ASC;
Figure 7.10Result of Listing 7.10.
title_id title_name state country -------- ----------------------------------- ----- ------- T02 200 Years of German Humor NULL Germany T03 Ask Your System Administrator CA USA T04 But I Did It Unconsciously CA USA T05 Exchange of Platitudes CA USA T07 I Blame My Mother NULL Germany T08 Just Wait Until After School CA USA T09 Kiss My Boo-Boo CA USA T11 Perhaps It's a Glandular Problem CA USA T13 What Are The Civilian Applications? NULL Germany
Using WHERE syntax, Listing 7.10 is equivalent to:
SELECT t.title_id, t.title_name,
p.state, p.country
FROM titles t, publishers p
WHERE t.pub_id = p.pub_id
AND (p.state = 'CA'
OR p.country NOT IN
('USA', 'Canada', 'Mexico'))
ORDER BY t.title_id ASC;
Listing 7.11 combines an inner join with the aggregate function COUNT() and a GROUP BY clause to list the number of books that each author wrote (or cowrote). For information about aggregate functions and GROUP BY, see Chapter 6. See Figure 7.11 for the result. Note that, as in Figure 7.7, author A07 (Paddy O'Furniture) is omitted from the result because he has written no books and so has no matching rows in title_authors. See Listing 7.30 in “Creating Outer Joins with OUTER JOIN” later in this chapter for an example that lists authors who have written no books.
Listing 7.11List the number of books that each author wrote (or cowrote). See Figure 7.11 for the result.
SELECT
a.au_id,
COUNT(ta.title_id) AS "Num books"
FROM authors a
INNER JOIN title_authors ta
ON a.au_id = ta.au_id
GROUP BY a.au_id
ORDER BY a.au_id ASC;
Figure 7.11Result of Listing 7.11.
au_id Num books
----- ---------
A01 3
A02 4
A03 2
A04 4
A05 1
A06 3
Using WHERE syntax, Listing 7.11 is equivalent to:
SELECT a.au_id,
COUNT(ta.title_id)
AS "Num books"
FROM authors a, title_authors ta
WHERE a.au_id = ta.au_id
GROUP BY a.au_id
ORDER BY a.au_id ASC;
Listing 7.12 uses WHERE conditions to list the advance paid for each biography. See Figure 7.12 for the result.
Listing 7.12List the advance paid for each biography. See Figure 7.12 for the result.
SELECT
t.title_id,
t.title_name,
r.advance
FROM royalties r
INNER JOIN titles t
ON r.title_id = t.title_id
WHERE t.type = 'biography'
AND r.advance IS NOT NULL
ORDER BY r.advance DESC;
Figure 7.12Result of Listing 7.12.
title_id title_name advance
-------- ------------------------- -----------
T07 I Blame My Mother 1000000.00
T12 Spontaneous, Not Annoying 50000.00
T06 How About Never? 20000.00
Using WHERE syntax, Listing 7.12 is equivalent to:
SELECT t.title_id, t.title_name,
r.advance
FROM royalties r, titles t
WHERE r.title_id = t.title_id
AND t.type = 'biography'
AND r.advance IS NOT NULL
ORDER BY r.advance DESC;
Listing 7.13 uses aggregate functions and a GROUP BY clause to list the count and total advance paid for each type of book. See Figure 7.13 for the result.
Listing 7.13List the count and total advance paid for each type of book. See Figure 7.13 for the result.
SELECT
t.type,
COUNT(r.advance)
AS "COUNT(r.advance)",
SUM(r.advance)
AS "SUM(r.advance)"
FROM royalties r
INNER JOIN titles t
ON r.title_id = t.title_id
WHERE r.advance IS NOT NULL
GROUP BY t.type
ORDER BY t.type ASC;
Figure 7.13Result of Listing 7.13.
type COUNT(r.advance) SUM(r.advance)
---------- ---------------- --------------
biography 3 1070000.00
children 2 0.00
computer 1 15000.00
history 3 31000.00
psychology 3 220000.00
Using WHERE syntax, Listing 7.13 is equivalent to:
SELECT t.type,
COUNT(r.advance)
AS "COUNT(r.advance)",
SUM(r.advance)
AS "SUM(r.advance)"
FROM royalties r, titles t
WHERE r.title_id = t.title_id
AND r.advance IS NOT NULL
GROUP BY t.type
ORDER BY t.type ASC;
Listing 7.14 is similar to Listing 7.13, except that it uses an additional grouping column to list the count and total advance paid for each type of book by publisher. See Figure 7.14 for the result.
Listing 7.14List the count and total advance paid for each type of book, by publisher. See Figure 7.14 for the result.
SELECT
t.type,
t.pub_id,
COUNT(r.advance) AS "COUNT(r.advance)",
SUM(r.advance) AS "SUM(r.advance)"
FROM royalties r
INNER JOIN titles t
ON r.title_id = t.title_id
WHERE r.advance IS NOT NULL
GROUP BY t.type, t.pub_id
ORDER BY t.type ASC, t.pub_id ASC;
Figure 7.14Result of Listing 7.14.
type pub_id COUNT(r.advance) SUM(r.advance) ---------- ------ ---------------- -------------- biography P01 2 70000.00 biography P03 1 1000000.00 children P04 2 0.00 computer P02 1 15000.00 history P01 1 10000.00 history P03 2 21000.00 psychology P04 3 220000.00
Using WHERE syntax, Listing 7.14 is equivalent to:
SELECT t.type, t.pub_id,
COUNT(r.advance)
AS "COUNT(r.advance)",
SUM(r.advance)
AS "SUM(r.advance)"
FROM royalties r, titles t
WHERE r.title_id = t.title_id
AND r.advance IS NOT NULL
GROUP BY t.type, t.pub_id
ORDER BY t.type ASC, t.pub_id ASC;
Listing 7.15 uses a HAVING clause to list the number of coauthors of each book written by two or more authors. For information about HAVING, see “Filtering Groups with HAVING” in Chapter 6. See Figure 7.15 for the result.
Listing 7.15List the number of coauthors of each book written by two or more authors. See Figure 7.15 for the result.
SELECT
ta.title_id,
COUNT(ta.au_id) AS "Num authors"
FROM authors a
INNER JOIN title_authors ta
ON a.au_id = ta.au_id
GROUP BY ta.title_id
HAVING COUNT(ta.au_id) > 1
ORDER BY ta.title_id ASC;
Figure 7.15Result of Listing 7.15.
title_id Num authors
-------- -----------
T04 2
T07 2
T11 3
Using WHERE syntax, Listing 7.15 is equivalent to:
SELECT ta.title_id,
COUNT(ta.au_id) AS "Num authors"
FROM authors a, title_authors ta
WHERE a.au_id = ta.au_id
GROUP BY ta.title_id
HAVING COUNT(ta.au_id) > 1
ORDER BY ta.title_id ASC;
You also can join values in two columns that aren’t equal. Listing 7.16 uses greater-than (>) join to find each book whose revenue (= price × sales) is at least 10 times greater than the advance paid to the author(s). See Figure 7.16 for the result. The use of <, <=, >, and >= joins is common, but not-equal joins (<>) are used rarely. Generally, not-equal joins make sense only when used with a self-join; see “Creating a Self-Join” later in this chapter.
Listing 7.16List each book whose revenue (= price × sales) is at least 10 times greater than its advance. See Figure 7.16 for the result.
SELECT
t.title_id,
t.title_name,
r.advance,
t.price * t.sales AS "Revenue"
FROM titles t
INNER JOIN royalties r
ON t.price * t.sales > r.advance * 10
AND t.title_id = r.title_id
ORDER BY t.price * t.sales DESC;
Figure 7.16Result of Listing 7.16.
title_id title_name advance Revenue -------- ----------------------------------- ---------- ----------- T07 I Blame My Mother 1000000.00 35929790.00 T05 Exchange of Platitudes 100000.00 1400008.00 T12 Spontaneous, Not Annoying 50000.00 1299012.99 T03 Ask Your System Administrator 15000.00 1025396.65 T13 What Are The Civilian Applications? 20000.00 313905.33 T06 How About Never? 20000.00 225834.00 T02 200 Years of German Humor 1000.00 190841.70 T09 Kiss My Boo-Boo .00 69750.00 T08 Just Wait Until After School .00 40950.00
Using WHERE syntax, Listing 7.16 is equivalent to:
SELECT t.title_id, t.title_name, r.advance,
t.price * t.sales AS "Revenue"
FROM titles t, royalties r
WHERE t.price * t.sales >
r.advance * 10
AND t.title_id = r.title_id
ORDER BY t.price * t.sales DESC;
Complicated queries can arise from simple questions. In Listing 7.17, I must join three tables to list the author names and the names of the books that each author wrote (or cowrote). See Figure 7.17 for the result.
Listing 7.17List the author names and the names of the books that each author wrote (or cowrote). See Figure 7.17 for the result.
SELECT
a.au_fname,
a.au_lname,
t.title_name
FROM authors a
INNER JOIN title_authors ta
ON a.au_id = ta.au_id
INNER JOIN titles t
ON t.title_id = ta.title_id
ORDER BY a.au_lname ASC,
a.au_fname ASC, t.title_name ASC;
Figure 7.17Result of Listing 7.17.
au_fname au_lname title_name --------- --------- ----------------------------------- Sarah Buchman 1977! Sarah Buchman 200 Years of German Humor Sarah Buchman What Are The Civilian Applications? Wendy Heydemark How About Never? Wendy Heydemark I Blame My Mother Wendy Heydemark Not Without My Faberge Egg Wendy Heydemark Spontaneous, Not Annoying Hallie Hull But I Did It Unconsciously Hallie Hull Perhaps It's a Glandular Problem Klee Hull But I Did It Unconsciously Klee Hull Exchange of Platitudes Klee Hull I Blame My Mother Klee Hull Perhaps It's a Glandular Problem Christian Kells Ask Your System Administrator Kellsey Just Wait Until After School Kellsey Kiss My Boo-Boo Kellsey Perhaps It's a Glandular Problem
Using WHERE syntax, Listing 7.17 is equivalent to:
SELECT a.au_fname, a.au_lname, t.title_name
FROM authors a, title_authors ta,
titles t
WHERE a.au_id = ta.au_id
AND t.title_id = ta.title_id
ORDER BY a.au_lname ASC,
a.au_fname ASC,
t.title_name ASC;
To run Listing 7.17 in Microsoft Access, type:
SELECT a.au_fname, a.au_lname, t.title_name
FROM titles AS t
INNER JOIN (authors AS a
INNER JOIN title_authors AS ta
ON a.au_id = ta.au_id)
ON t.title_id = ta.title_id
ORDER BY a.au_lname ASC,
a.au_fname ASC,
t.title_name ASC;
Expanding on Listing 7.17, Listing 7.18 requires a four-table join to list the publisher names along with the names of the authors and books. See Figure 7.18 for the result.
Listing 7.18List the author names, the names of the books that each author wrote (or cowrote), and the publisher names. See Figure 7.18 for the result.
SELECT
a.au_fname,
a.au_lname,
t.title_name,
p.pub_name
FROM authors a
INNER JOIN title_authors ta
ON a.au_id = ta.au_id
INNER JOIN titles t
ON t.title_id = ta.title_id
INNER JOIN publishers p
ON p.pub_id = t.pub_id
ORDER BY a.au_lname ASC, a.au_fname ASC,
t.title_name ASC;
Figure 7.18Result of Listing 7.18.
au_fname au_lname title_name pub_name --------- --------- ----------------------------------- ------------------- Sarah Buchman 1977! Abatis Publishers Sarah Buchman 200 Years of German Humor Schadenfreude Press Sarah Buchman What Are The Civilian Applications? Schadenfreude Press Wendy Heydemark How About Never? Abatis Publishers Wendy Heydemark I Blame My Mother Schadenfreude Press Wendy Heydemark Not Without My Faberge Egg Abatis Publishers Wendy Heydemark Spontaneous, Not Annoying Abatis Publishers Hallie Hull But I Did It Unconsciously Tenterhooks Press Hallie Hull Perhaps It's a Glandular Problem Tenterhooks Press Klee Hull But I Did It Unconsciously Tenterhooks Press Klee Hull Exchange of Platitudes Tenterhooks Press Klee Hull I Blame My Mother Schadenfreude Press Klee Hull Perhaps It's a Glandular Problem Tenterhooks Press Christian Kells Ask Your System Administrator Core Dump Books Kellsey Just Wait Until After School Tenterhooks Press Kellsey Kiss My Boo-Boo Tenterhooks Press Kellsey Perhaps It's a Glandular Problem Tenterhooks Press
Using WHERE syntax, Listing 7.18 is equivalent to:
SELECT a.au_fname, a.au_lname,
t.title_name, p.pub_name
FROM authors a, title_authors ta,
titles t, publishers p
WHERE a.au_id = ta.au_id
AND t.title_id = ta.title_id
AND p.pub_id = t.pub_id
ORDER BY a.au_lname ASC,
a.au_fname ASC,
t.title_name ASC;
To run Listing 7.18 in Microsoft Access, type:
SELECT a.au_fname, a.au_lname,
t.title_name, p.pub_name
FROM (publishers AS p
INNER JOIN titles AS t
ON p.pub_id = t.pub_id)
INNER JOIN (authors AS a
INNER JOIN title_authors AS ta
ON a.au_id = ta.au_id)
ON t.title_id = ta.title_id
ORDER BY a.au_lname ASC,
a.au_fname ASC,
t.title_name ASC;
Listing 7.19 calculates the total royalties for all books. The gross royalty of a book is the book’s revenue (= sales × price) times the royalty rate (the fraction of revenue paid to the author). In most cases, the author receives an initial advance against royalties. The publisher deducts the advance from the gross royalty to get the net royalty. If the net royalty is positive, then the publisher must pay the author; if the net royalty is negative or zero, then the author gets nothing because he or she still hasn’t “earned out” the advance. See Figure 7.19 for the result. Gross royalties are labeled “Total royalties”, gross advances are labeled “Total advances”, and net royalties are labeled “Total due to authors”.
Listing 7.19 calculates total royalties for all books; the subsequent examples in this section will show you how to break down royalties by author, book, publisher, and other groups.
Listing 7.19Calculate the total royalties for all books. See Figure 7.19 for the result.
SELECT
SUM(t.sales * t.price *
r.royalty_rate)
AS "Total royalties",
SUM(r.advance)
AS "Total advances",
SUM((t.sales * t.price *
r.royalty_rate) - r.advance)
AS "Total due to authors"
FROM titles t
INNER JOIN royalties r
ON r.title_id = t.title_id
WHERE t.sales IS NOT NULL;
Figure 7.19Result of Listing 7.19.
Total royalties Total advances Total due to authors --------------- -------------- -------------------- 4387219.55 1336000.00 3051219.55
Using WHERE syntax, Listing 7.19 is equivalent to:
SELECT
SUM(t.sales * t.price *
r.royalty_rate)
AS "Total royalties",
SUM(r.advance)
AS "Total advances",
SUM((t.sales * t.price *
r.royalty_rate) - r.advance)
AS "Total due to authors"
FROM titles t, royalties r
WHERE r.title_id = t.title_id
AND t.sales IS NOT NULL;
Listing 7.20 uses a three-table join to calculate the royalty earned by each author for each book that the author wrote (or cowrote). Because a book can have multiple authors, per-author royalty calculations involve each author’s share of a book’s royalty (and advance). The author’s royalty share for each book is given in the table title_authors in the column royalty_share. For a book with a sole author, royalty_share is 1.0 (100 percent). For a book with multiple authors, the royalty_share of each author is a fractional amount between 0 and 1 (inclusive); all the royalty_share values for a particular book must sum to 1.0 (100 percent). See Figure 7.20 for the result. The sum of the values in each of the last three columns in the result equals the corresponding total in Figure 7.19.
Listing 7.20Calculate the royalty earned by each author for each book that the author wrote (or cowrote). See Figure 7.20 for the result.
SELECT
ta.au_id,
t.title_id,
t.pub_id,
t.sales * t.price *
r.royalty_rate * ta.royalty_share
AS "Royalty share",
r.advance * ta.royalty_share
AS "Advance share",
(t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)
AS "Due to author"
FROM title_authors ta
INNER JOIN titles t
ON t.title_id = ta.title_id
INNER JOIN royalties r
ON r.title_id = t.title_id
WHERE t.sales IS NOT NULL
ORDER BY ta.au_id ASC, t.title_id ASC;
Figure 7.20Result of Listing 7.20.
au_id title_id pub_id Royalty share Advance share Due to author ----- -------- ------ ------------- ------------- ------------- A01 T01 P01 622.32 10000.00 -9377.68 A01 T02 P03 11450.50 1000.00 10450.50 A01 T13 P03 18834.32 20000.00 -1165.68 A02 T06 P01 18066.72 20000.00 -1933.28 A02 T07 P03 1976138.45 500000.00 1476138.45 A02 T12 P01 116911.17 50000.00 66911.17 A03 T04 P04 8106.38 12000.00 -3893.62 A03 T11 P04 15792.90 30000.00 -14207.10 A04 T04 P04 5404.26 8000.00 -2595.74 A04 T05 P04 126000.72 100000.00 26000.72 A04 T07 P03 1976138.45 500000.00 1476138.45 A04 T11 P04 15792.90 30000.00 -14207.10 A05 T03 P02 71777.77 15000.00 56777.77 A06 T08 P04 1638.00 .00 1638.00 A06 T09 P04 3487.50 .00 3487.50 A06 T11 P04 21057.20 40000.00 -18942.80
Using WHERE syntax, Listing 7.20 is equivalent to:
SELECT ta.au_id, t.title_id, t.pub_id,
t.sales * t.price *
r.royalty_rate * ta.royalty_share
AS "Royalty share",
r.advance * ta.royalty_share
AS "Advance share",
(t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)
AS "Due to author"
FROM title_authors ta,
titles t, royalties r
WHERE t.title_id = ta.title_id
AND r.title_id = t.title_id
AND t.sales IS NOT NULL
ORDER BY ta.au_id ASC,
t.title_id ASC;
To run Listing 7.20 in Microsoft Access, type:
SELECT ta.au_id, t.title_id, t.pub_id,
t.sales * t.price *
r.royalty_rate * ta.royalty_share
AS "Royalty share",
r.advance * ta.royalty_share
AS "Advance share",
(t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)
AS "Due to author"
FROM (titles AS t
INNER JOIN royalties AS r
ON t.title_id = r.title_id)
INNER JOIN title_authors AS ta
ON t.title_id = ta.title_id
WHERE t.sales IS NOT NULL
ORDER BY ta.au_id ASC,
t.title_id ASC;
Listing 7.21 is similar to Listing 7.20 except that it adds a join to the table authors to print the author names and includes a WHERE condition to retrieve rows with only positive royalties. See Figure 7.21 for the result.
Listing 7.21List only positive royalties earned by each author for each book that the author wrote (or cowrote). See Figure 7.21 for the result.
SELECT
a.au_id,
a.au_fname,
a.au_lname,
t.title_name,
(t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)
AS "Due to author"
FROM authors a
INNER JOIN title_authors ta
ON a.au_id = ta.au_id
INNER JOIN titles t
ON t.title_id = ta.title_id
INNER JOIN royalties r
ON r.title_id = t.title_id
WHERE t.sales IS NOT NULL
AND (t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share) > 0
ORDER BY a.au_id ASC, t.title_id ASC;
Figure 7.21Result of Listing 7.21.
au_id au_fname au_lname title_name Due to author ----- --------- --------- ----------------------------- ------------- A01 Sarah Buchman 200 Years of German Humor 10450.50 A02 Wendy Heydemark I Blame My Mother 1476138.45 A02 Wendy Heydemark Spontaneous, Not Annoying 66911.17 A04 Klee Hull Exchange of Platitudes 26000.72 A04 Klee Hull I Blame My Mother 1476138.45 A05 Christian Kells Ask Your System Administrator 56777.77 A06 Kellsey Just Wait Until After School 1638.00 A06 Kellsey Kiss My Boo-Boo 3487.50
Using WHERE syntax, Listing 7.21 is equivalent to:
SELECT a.au_id, a.au_fname,
a.au_lname, t.title_name,
(t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)
AS "Due to author"
FROM authors a, title_authors ta,
titles t, royalties r
WHERE a.au_id = ta.au_id
AND t.title_id = ta.title_id
AND r.title_id = t.title_id
AND t.sales IS NOT NULL
AND (t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share) > 0
ORDER BY a.au_id ASC,
t.title_id ASC;
To run Listing 7.21 in Microsoft Access, type:
SELECT a.au_id, a.au_fname,
a.au_lname, t.title_name,
(t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)
AS "Due to author"
FROM (titles AS t
INNER JOIN royalties AS r
ON t.title_id = r.title_id)
INNER JOIN (authors AS a
INNER JOIN title_authors AS ta
ON a.au_id = ta.au_id)
ON t.title_id = ta.title_id
WHERE t.sales IS NOT NULL
AND (t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share) > 0
ORDER BY a.au_id ASC,
t.title_id ASC;
Listing 7.22 uses a GROUP BY clause to calculate the total royalties paid by each publisher. The aggregate function COUNT() computes the total number of books for which each publisher pays royalties. Note that each author’s royalty share is unnecessary here because no per-author calculations are involved. See Figure 7.22 for the result. The sum of the values in each of the last three columns in the result equals the corresponding total in Figure 7.19.
Listing 7.22Calculate the total royalties paid by each publisher. See Figure 7.22 for the result.
SELECT
t.pub_id,
COUNT(t.sales)
AS "Num books",
SUM(t.sales * t.price * r.royalty_rate)
AS "Total royalties",
SUM(r.advance)
AS "Total advances",
SUM((t.sales * t.price
* r.royalty_rate) -
r.advance)
AS "Total due to authors"
FROM titles t
INNER JOIN royalties r
ON r.title_id = t.title_id
WHERE t.sales IS NOT NULL
GROUP BY t.pub_id
ORDER BY t.pub_id ASC;
Figure 7.22Result of Listing 7.22.
pub_id Num books Total royalties Total advances Total due to authors ------ --------- --------------- -------------- -------------------- P01 3 135600.21 80000.00 55600.21 P02 1 71777.77 15000.00 56777.77 P03 3 3982561.72 1021000.00 2961561.72 P04 5 197279.85 220000.00 -22720.15
Using WHERE syntax, Listing 7.22 is equivalent to :
SELECT t.pub_id,
COUNT(t.sales)
AS "Num books",
SUM(t.sales * t.price *
r.royalty_rate)
AS "Total royalties",
SUM(r.advance)
AS "Total advances",
SUM((t.sales * t.price *
r.royalty_rate) -
r.advance)
AS "Total due to authors"
FROM titles t, royalties r
WHERE r.title_id = t.title_id
AND t.sales IS NOT NULL
GROUP BY t.pub_id
ORDER BY t.pub_id ASC;
Listing 7.23 is similar to Listing 7.22 except that it calculates the total royalties earned by each author for all books written (or cowritten). See Figure 7.23 for the result. The sum of the values in each of the last three columns in the result equals the corresponding total in Figure 7.19.
Listing 7.23Calculate the total royalties earned by each author for all books written (or cowritten). See Figure 7.23 for the result.
SELECT
ta.au_id,
COUNT(sales)
AS "Num books",
SUM(t.sales * t.price *
r.royalty_rate * ta.royalty_share)
AS "Total royalties share",
SUM(r.advance * ta.royalty_share)
AS "Total advances share",
SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share))
AS "Total due to author"
FROM title_authors ta
INNER JOIN titles t
ON t.title_id = ta.title_id
INNER JOIN royalties r
ON r.title_id = t.title_id
WHERE t.sales IS NOT NULL
GROUP BY ta.au_id
ORDER BY ta.au_id ASC;
Figure 7.23Result of Listing 7.23.
au_id Num books Total royalties share Total advances share Total due to author ----- ----------- --------------------- -------------------- ------------------- A01 3 30907.14 31000.00 -92.86 A02 3 2111116.34 570000.00 1541116.34 A03 2 23899.28 42000.00 -18100.72 A04 4 2123336.32 638000.00 1485336.32 A05 1 71777.77 15000.00 56777.77 A06 3 26182.70 40000.00 -13817.30
Using WHERE syntax, Listing 7.23 is equivalent to:
SELECT ta.au_id,
COUNT(sales)
AS "Num books",
SUM(t.sales * t.price *
r.royalty_rate * ta.royalty_share)
AS "Total royalties share",
SUM(r.advance * ta.royalty_share)
AS "Total advances share",
SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share))
AS "Total due to author"
FROM title_authors ta, titles t,
royalties r
WHERE t.title_id = ta.title_id
AND r.title_id = t.title_id
AND t.sales IS NOT NULL
GROUP BY ta.au_id
ORDER BY ta.au_id ASC;
To run Listing 7.23 in Microsoft Access, type:
SELECT ta.au_id,
COUNT(sales)
AS "Num books",
SUM(t.sales * t.price *
r.royalty_rate * ta.royalty_share)
AS "Total royalties share",
SUM(r.advance * ta.royalty_share)
AS "Total advances share",
SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share))
AS "Total due to author"
FROM (title_authors AS ta
INNER JOIN titles AS t
ON t.title_id = ta.title_id)
INNER JOIN royalties AS r
ON r.title_id = t.title_id
WHERE t.sales IS NOT NULL
GROUP BY ta.au_id
ORDER BY ta.au_id ASC;
Listing 7.24 uses two grouping columns to calculate the total royalties to be paid by each U.S. publisher to each author for all books written (or cowritten) by the author. The HAVING condition returns retrieve rows with only positive net royalties, and the WHERE condition retrieves only U.S. publishers. See Figure 7.24 for the result.
Listing 7.24Calculate the positive net royalties to be paid by each U.S. publisher to each author for all books written (or cowritten) by the author. See Figure 7.24 for the result.
SELECT
t.pub_id,
ta.au_id,
COUNT(*)
AS "Num books",
SUM(t.sales * t.price *
r.royalty_rate * ta.royalty_share)
AS "Total royalties share",
SUM(r.advance * ta.royalty_share)
AS "Total advances share",
SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share))
AS "Total due to author"
FROM title_authors ta
INNER JOIN titles t
ON t.title_id = ta.title_id
INNER JOIN royalties r
ON r.title_id = t.title_id
INNER JOIN publishers p
ON p.pub_id = t.pub_id
WHERE t.sales IS NOT NULL
AND p.country IN ('USA')
GROUP BY t.pub_id, ta.au_id
HAVING SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)) > 0
ORDER BY t.pub_id ASC, ta.au_id ASC;
Figure 7.24Result of Listing 7.24.
pub_id au_id Num books Total royalties share Total advances share Total due to author ------ ----- ----------- --------------------- -------------------- ------------------- P01 A02 2 134977.89 70000.00 64977.89 P02 A05 1 71777.77 15000.00 56777.77 P04 A04 3 147197.87 138000.00 9197.87
Using WHERE syntax, Listing 7.24 is equivalent to:
SELECT t.pub_id, ta.au_id,
COUNT(*)
AS "Num books",
SUM(t.sales * t.price *
r.royalty_rate * ta.royalty_share)
AS "Total royalties share",
SUM(r.advance * ta.royalty_share)
AS "Total advances share",
SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share))
AS "Total due to author"
FROM title_authors ta, titles t,
royalties r, publishers p
WHERE t.title_id = ta.title_id
AND r.title_id = t.title_id
AND p.pub_id = t.pub_id
AND t.sales IS NOT NULL
AND p.country IN ('USA')
GROUP BY t.pub_id, ta.au_id
HAVING SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)) > 0
ORDER BY t.pub_id ASC,
ta.au_id ASC;
To run Listing 7.24 in Microsoft Access, type:
SELECT t.pub_id, ta.au_id,
COUNT(*)
AS "Num books",
SUM(t.sales * t.price *
r.royalty_rate * ta.royalty_share)
AS "Total royalties share",
SUM(r.advance * ta.royalty_share)
AS "Total advances share",
SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share))
AS "Total due to author"
FROM ((publishers AS p
INNER JOIN titles AS t
ON p.pub_id = t.pub_id)
INNER JOIN royalties AS r
ON t.title_id = r.title_id)
INNER JOIN title_authors AS ta
ON t.title_id = ta.title_id
WHERE t.sales IS NOT NULL
AND p.country IN ('USA')
GROUP BY t.pub_id, ta.au_id
HAVING SUM((t.sales * t.price *
r.royalty_rate * ta.royalty_share) -
(r.advance * ta.royalty_share)) > 0
ORDER BY t.pub_id ASC,
ta.au_id ASC;
In the preceding section, you learned that inner joins return rows only if at least one row from both tables satisfies the join condition(s). An inner join eliminates the rows that don’t match with a row from the other table, whereas an outer join returns all rows from at least one of the tables (provided that those rows meet any WHERE or HAVING search conditions).
Outer joins are useful for answering questions that involve missing quantities: authors who have written no books or classes with no enrolled students, for example. Outer joins also are helpful for creating reports in which you want to list all the rows of one table along with matching rows from another table: all authors and any books that sold more than a given number of copies, for example, or all products with order quantities, including products that no one ordered.
Unlike other joins, the order in which you specify the tables in outer joins is important, so the two join operands are called the left table and the right table. Outer joins come in three flavors:
To summarize, all rows are retrieved from the left table referenced in a left outer join, all rows are retrieved from the right table referenced in a right outer join, and all rows from both tables are retrieved in a full outer join. In all cases, unmatched rows are padded with nulls. In the result, you can’t distinguish the nulls (if any) that were in the input tables originally from the nulls inserted by the outer-join operation. Remember that the conditions NULL = NULL and NULL = any_value are unknown and not matches; see “Nulls” in Chapter 3.
To create a left outer join:
Type:
SELECT columns
FROM left_table
LEFT [OUTER] JOIN right_table
ON join_conditions
columns is one or more comma-separated expressions or column names from left_table or right_table. left_table and right_table are the names of the joined tables. If the tables have some column names in common, then qualify those column names with the names of the tables.
join_conditions specifies one or more join conditions to be evaluated for each pair of joined rows. A join condition takes this form:
[left_table.]column op [right_table.]column
op usually is = but can be any comparison operator: =, <>, <, <=, >, or >= (refer to Table 4.2 in Chapter 4). You can combine multiple join conditions with AND or OR; see “Combining and Negating Conditions with AND, OR, and NOT” in Chapter 4.
The keyword OUTER is optional.
To create a right outer join:
Type:
SELECT columns
FROM left_table
RIGHT [OUTER] JOIN right_table
ON join_conditions
columns, left_table, right_table, and join_conditions have the same meanings as in “To create a left outer join” earlier in this section.
The keyword OUTER is optional.
To create a full outer join:
Type:
SELECT columns
FROM left_table
FULL [OUTER] JOIN right_table
ON join_conditions
columns, left_table, right_table, and join_conditions have the same meanings as in “To create a left outer join” earlier in this section.
The keyword OUTER is optional.
Be wary of the order in which tables appear in outer joins. Unlike other joins, outer joins aren’t associative—that is, the result of a query that involves an outer join depends on the order in which the tables are grouped and joined (associated). The following two three-table inner joins are equivalent (except for the column order in the result):
SELECT * FROM table1
INNER JOIN table2
INNER JOIN table3
and:
SELECT * FROM table2
INNER JOIN table3
INNER JOIN table1
But the following three-table outer joins yield different results:
SELECT * FROM table1
LEFT OUTER JOIN table2
LEFT OUTER JOIN table3
and:
SELECT * FROM table2
LEFT OUTER JOIN table3
LEFT OUTER JOIN table1
Prior to the SQL:2003 standard, SQL had a union join, which doesn’t actually match rows from two tables but returns a full outer join with matching rows removed. Every row in a union join has the columns of one table joined with nulls for the columns of the other table. The result of the statement t1 UNION JOIN t2
looks like this table:
All rows of t1 | Nulls |
Nulls | All rows of t2 |
A UNION JOIN has few practical uses, and not many DBMSs support it. You can simulate a union join by using a full outer join.
t1 UNION JOIN t2
is equivalent to:
t1 FULL OUTER JOIN t2 ON 1 = 2
t1 and t2 are tables, and 1 = 2 represents any condition that’s always false. Note that UNION JOIN differs from UNION, which is a set operation and not a join; see “Combining Rows with UNION” in Chapter 9.
Microsoft SQL Server supports the standard OUTER JOIN syntax but older versions use the (nonstandard) outer join operator * in WHERE syntax to create outer joins. Attach * to the left or right of the comparison operator to create a left or right outer join. For outer joins, WHERE syntax is less precise than OUTER JOIN syntax and can yield an ambiguous query. The *= and =* outer-join operators are deprecated (slated for extinction) in newer versions of SQL Server.
Oracle 8i and earlier don’t support JOIN syntax; use WHERE joins instead. Oracle 9i and later support the standard OUTER JOIN syntax. In WHERE syntax, Oracle uses the (nonstandard) outer join operator (+) to create outer joins. Add (+) after the table that must be expanded (filled with nulls). See the examples later in this section.
For reference in the following four examples, Listing 7.25 and Figure 7.25 show the city for each author and publisher.
Listing 7.25List the cities of the authors and the cities of the publishers. See Figure 7.25 for the result.
SELECT a.au_fname, a.au_lname, a.city
FROM authors a;
SELECT p.pub_name, p.city
FROM publishers p;
Figure 7.25Result of Listing 7.25.
au_fname au_lname city
--------- ----------- -------------
Sarah Buchman Bronx
Wendy Heydemark Boulder
Hallie Hull San Francisco
Klee Hull San Francisco
Christian Kells New York
Kellsey Palo Alto
Paddy O'Furniture Sarasota
pub_name city
------------------- -------------
Abatis Publishers New York
Core Dump Books San Francisco
Schadenfreude Press Hamburg
Tenterhooks Press Berkeley
Listing 7.26 performs an inner join of the tables authors and publishers on their city columns. The result, Figure 7.26, lists only the authors who live in cities in which a publisher is located. You can compare the result of this inner join with the results of the outer joins in the following three examples.
Listing 7.26List the authors who live in cities in which a publisher is located. See Figure 7.26 for the result.
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a
INNER JOIN publishers p
ON a.city = p.city;
Figure 7.26Result of Listing 7.26.
au_fname au_lname pub_name
--------- -------- -----------------
Hallie Hull Core Dump Books
Klee Hull Core Dump Books
Christian Kells Abatis Publishers
Using WHERE syntax, Listing 7.26 is equivalent to:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city = p.city;
Listing 7.27 uses a left outer join to include all authors in the result, regardless of whether a publisher is located in the same city. See Figure 7.27 for the result.
Listing 7.27This left outer join includes all rows in the table authors in the result, whether or not there’s a match in the column city in the table publishers. See Figure 7.27 for the result.
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a
LEFT OUTER JOIN publishers p
ON a.city = p.city
ORDER BY p.pub_name ASC,
a.au_lname ASC, a.au_fname ASC;
Figure 7.27Result of Listing 7.27. Note that there’s no matching data for four of the listed authors, so these rows contain null in the column pub_name.
au_fname au_lname pub_name
--------- ----------- -----------------
Sarah Buchman NULL
Wendy Heydemark NULL
Kellsey NULL
Paddy O'Furniture NULL
Christian Kells Abatis Publishers
Hallie Hull Core Dump Books
Klee Hull Core Dump Books
To run Listing 7.27 in older versions of Microsoft SQL Server by using WHERE syntax, type:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city *= p.city
ORDER BY p.pub_name ASC,
a.au_lname ASC, a.au_fname ASC;
To run Listing 7.27 in Oracle 8i or earlier, type:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city = p.city (+)
ORDER BY p.pub_name ASC,
a.au_lname ASC, a.au_fname ASC;
Listing 7.28 uses a right outer join to include all publishers in the result, regardless of whether an author lives in the publisher’s city. See Figure 7.28 for the result.
Listing 7.28This right outer join includes all rows in the table publishers in the result, whether or not there’s a match in the column city in the table authors. See Figure 7.28 for the result.
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a
RIGHT OUTER JOIN publishers p
ON a.city = p.city
ORDER BY p.pub_name ASC,
a.au_lname ASC, a.au_fname ASC;
Figure 7.28Result of Listing 7.28. Note that there’s no matching data for two of the listed publishers, so these rows contain nulls in the columns au_fname and au_lname.
au_fname au_lname pub_name
--------- -------- -------------------
Christian Kells Abatis Publishers
Hallie Hull Core Dump Books
Klee Hull Core Dump Books
NULL NULL Schadenfreude Press
NULL NULL Tenterhooks Press
To run Listing 7.28 in older versions of Microsoft SQL Server by using WHERE syntax, type:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city =* p.city
ORDER BY p.pub_name ASC,
a.au_lname ASC, a.au_fname ASC;
To run Listing 7.28 in Oracle 8i or earlier, type:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city (+) = p.city
ORDER BY p.pub_name ASC,
a.au_lname ASC, a.au_fname ASC;
Listing 7.29 uses a full outer join to include all publishers and all authors in the result, regardless of whether a publisher and author are located in the same city. See Figure 7.29 for the result.
Listing 7.29This full outer join includes all rows in the tables authors and publishers in the result, whether or not there’s a match in the city columns. See Figure 7.29 for the result.
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a
FULL OUTER JOIN publishers p
ON a.city = p.city
ORDER BY p.pub_name ASC,
a.au_lname ASC, a.au_fname ASC;
Figure 7.29Result of Listing 7.29. This result contains nine rows: four rows for authors who have no matching rows in the table publishers, three rows in which the author and publisher coexist in the same city, and two rows for publishers who have no matching city in the table authors.
au_fname au_lname pub_name
--------- ----------- -------------------
Sarah Buchman NULL
Wendy Heydemark NULL
Kellsey NULL
Paddy O'Furniture NULL
Christian Kells Abatis Publishers
Hallie Hull Core Dump Books
Klee Hull Core Dump Books
NULL NULL Schadenfreude Press
NULL NULL Tenterhooks Press
In older versions of Microsoft SQL Server, you can’t place the * operator on both sides of the comparison operator to form a full outer join. Instead, form the union of a left and right outer join; see “Combining Rows with UNION” in Chapter 9. To run Listing 7.29 by using WHERE syntax, type:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city *= p.city
UNION ALL
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city =* p.city
AND a.city IS NULL;
In Oracle, you can’t place the (+) operator on both sides of the comparison operator to form a full outer join. Instead, form the union of a left and right outer join; see “Combining Rows with UNION” in Chapter 9. To run Listing 7.29 in Oracle 8i or earlier, type:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city = p.city (+)
UNION ALL
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a, publishers p
WHERE a.city (+) = p.city
AND a.city IS NULL;
Microsoft Access and MySQL don’t support full outer joins, but you can replicate one by taking the union of left and right outer joins; see “Combining Rows with UNION” in Chapter 9. In the following example, the first UNION table is a left outer join restricted to return all the rows in authors as well as the matched rows in publishers based on city. The second UNION table is a right outer join restricted to return only the unmatched rows in publishers. To run Listing 7.29, type:
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a
LEFT OUTER JOIN publishers p
ON a.city = p.city
UNION ALL
SELECT a.au_fname, a.au_lname, p.pub_name
FROM authors a
RIGHT OUTER JOIN publishers p
ON a.city = p.city
WHERE a.city IS NULL;
Listing 7.30 uses a left outer join to list the number of books that each author wrote (or cowrote). See Figure 7.30 for the result. Note that in contrast to Listing 7.11 in “Creating an Inner Join with INNER JOIN” earlier in this chapter, the author A07 (Paddy O'Furniture) appears in the result even though he has written no books.
Listing 7.30List the number of books that each author wrote (or cowrote), including authors who have written no books. See Figure 7.30 for the result.
SELECT
a.au_id,
COUNT(ta.title_id) AS "Num books"
FROM authors a
LEFT OUTER JOIN title_authors ta
ON a.au_id = ta.au_id
GROUP BY a.au_id
ORDER BY a.au_id ASC;
Figure 7.30Result of Listing 7.30.
au_id Num books
----- ---------
A01 3
A02 4
A03 2
A04 4
A05 1
A06 3
A07 0
To run Listing 7.30 in Oracle 8i or earlier, type:
SELECT a.au_id,
COUNT(ta.title_id) AS "Num books"
FROM authors a, title_authors ta
WHERE a.au_id = ta.au_id (+)
GROUP BY a.au_id
ORDER BY a.au_id ASC;
Listing 7.31 uses a WHERE condition to test for null and list only the authors who haven’t written a book. See Figure 7.31 for the result.
Listing 7.31List the authors who haven’t written (or cowritten) a book. See Figure 7.31 for the result.
SELECT a.au_id, a.au_fname, a.au_lname
FROM authors a
LEFT OUTER JOIN title_authors ta
ON a.au_id = ta.au_id
WHERE ta.au_id IS NULL;
Figure 7.31Result of Listing 7.31.
au_id au_fname au_lname
----- -------- -----------
A07 Paddy O'Furniture
To run Listing 7.31 in Oracle 8i or earlier, type:
SELECT a.au_id, a.au_fname, a.au_lname
FROM authors a, title_authors ta
WHERE a.au_id = ta.au_id (+)
AND ta.au_id IS NULL;
Listing 7.32 combines an inner join and a left outer join to list all authors and any books they wrote (or cowrote) that sold more than 100,000 copies. In this example, first I created a filtered INNER JOIN result and then OUTER JOINed it with the table authors, from which I wanted all rows. See Figure 7.32 for the result.
Listing 7.32List all authors and any books written (or cowritten) that sold more than 100,000 copies. See Figure 7.32 for the result.
SELECT a.au_id, a.au_fname, a.au_lname,
tta.title_id, tta.title_name, tta.sales
FROM authors a
LEFT OUTER JOIN
(SELECT ta.au_id, t.title_id,
t.title_name, t.sales
FROM title_authors ta
INNER JOIN titles t
ON t.title_id = ta.title_id
WHERE sales > 100000) tta
ON a.au_id = tta.au_id
ORDER BY a.au_id ASC, tta.title_id ASC;
Figure 7.32Result of Listing 7.32.
au_id au_fname au_lname title_id title_name sales ----- --------- ----------- -------- ------------------------- ------- A01 Sarah Buchman NULL NULL NULL A02 Wendy Heydemark T07 I Blame My Mother 1500200 A02 Wendy Heydemark T12 Spontaneous, Not Annoying 100001 A03 Hallie Hull NULL NULL NULL A04 Klee Hull T05 Exchange of Platitudes 201440 A04 Klee Hull T07 I Blame My Mother 1500200 A05 Christian Kells NULL NULL NULL A06 Kellsey NULL NULL NULL A07 Paddy O'Furniture NULL NULL NULL
To run Listing 7.32 in Oracle 8i or earlier, type:
SELECT a.au_id, a.au_fname, a.au_lname,
tta.title_id, tta.title_name, tta.sales
FROM authors a,
(SELECT ta.au_id, t.title_id,
t.title_name, t.sales
FROM title_authors ta, titles t
WHERE t.title_id = ta.title_id
AND sales > 100000) tta
WHERE a.au_id = tta.au_id (+)
ORDER BY a.au_id ASC, tta.title_id ASC;
MySQL 4.1 and later will run Listing 7.32, but earlier versions don’t support subqueries; see the DBMS tip in “Tips for Subqueries” in Chapter 8. For complicated queries, you often can create a temporary table to hold the subquery; see “Creating a Temporary Table with CREATE TEMPORARY TABLE” in Chapter 11. To run Listing 7.32 in MySQL 4.0 and earlier, type:
CREATE TEMPORARY TABLE tta
SELECT ta.au_id, t.title_id,
t.title_name, t.sales
FROM title_authors ta
INNER JOIN titles t
ON t.title_id = ta.title_id
WHERE sales > 100000;
SELECT a.au_id, a.au_fname,
a.au_lname, tta.title_id,
tta.title_name, tta.sales
FROM authors a
LEFT OUTER JOIN tta
ON a.au_id = tta.au_id
ORDER BY a.au_id ASC, tta.title_id ASC;
DROP TABLE tta;
A self-join is a normal SQL join that joins a table to itself and retrieves rows from a table by comparing values in one or more columns in the same table. Self-joins often are used in tables with a reflexive relationship, which is a primary-key/foreign-key relationship from a column or combination of columns in a table to other columns in that same table. For information about keys, see “Primary Keys” and “Foreign Keys” in Chapter 2.
Suppose that you have the following table, named employees:
emp_id emp_name boss_id
------ ----------------- -------
E01 Lord Copper NULL
E02 Jocelyn Hitchcock E01
E03 Mr. Salter E01
E04 William Boot E03
E05 Mr. Corker E03
emp_id is a primary key that uniquely identifies the employee, and boss_id is an employee ID that identifies the employee’s manager. Each manager also is an employee, so to ensure that each manager ID that is added to the table matches an existing employee ID, boss_id is defined as a foreign key of emp_id. Listing 7.33 uses this reflexive relationship to compare rows within the table and retrieve the name of the manager of each employee. (You wouldn’t need a join at all to get just the manager’s ID.) See Figure 7.33 for the result.
Listing 7.33List the name of each employee and the name of his or her manager. See Figure 7.33 for the result.
SELECT
e1.emp_name AS "Employee name",
e2.emp_name AS "Boss name"
FROM employees e1
INNER JOIN employees e2
ON e1.boss_id = e2.emp_id;
Figure 7.33Result of Listing 7.33. Note that Lord Copper, who has no boss, is excluded from the result because his null boss_id doesn’t satisfy the join condition.
Employee name Boss name
----------------- -----------
Jocelyn Hitchcock Lord Copper
Mr. Salter Lord Copper
William Boot Mr. Salter
Mr. Corker Mr. Salter
The same table (employees) appears twice in Listing 7.33 with two different aliases (e1 and e2) that are used to qualify column names in the join condition:
e1.boss_id = e2.emp_id
As with any join, a self-join requires two tables, but instead of adding a second table to the join, you add a second instance of the same table. That way, you can compare a column in the first instance of the table to a column in the second instance. As with all joins, the DBMS combines and returns rows of the table that satisfy the join condition. You actually aren’t creating another copy of the table—you’re joining the table to itself—but the effect might be easier to understand if you think about it as being two tables.
To create a self-join:
Type:
SELECT columns
FROM table [AS] alias1
INNER JOIN table [AS] alias2
ON join_conditions
columns is one or more comma-separated expressions or column names from table. alias1 and alias2 are different alias names that are used to refer to table in join_conditions. See “Creating Table Aliases with AS” earlier in this chapter.
join_conditions specifies one or more join conditions to be evaluated for each pair of joined rows. A join condition takes this form:
alias1.column op alias2.column
op can be any comparison operator: =, <>, <, <=, >, or >= (refer to Table 4.2 in Chapter 4). You can combine multiple join conditions with AND or OR; see “Combining and Negating Conditions with AND, OR, and NOT” in Chapter 4.
Listing 7.34 uses a WHERE search condition and self-join from the column state to itself to find all authors who live in the same state as author A04 (Klee Hull). See Figure 7.34 for the result.
Listing 7.34List the authors who live in the same state as author A04 (Klee Hull). See Figure 7.34 for the result.
SELECT a1.au_id, a1.au_fname,
a1.au_lname, a1.state
FROM authors a1
INNER JOIN authors a2
ON a1.state = a2.state
WHERE a2.au_id = 'A04';
Figure 7.34Result of Listing 7.34.
au_id au_fname au_lname state
----- -------- -------- -----
A03 Hallie Hull CA
A04 Klee Hull CA
A06 Kellsey CA
Using WHERE syntax, Listing 7.34 is equivalent to:
SELECT a1.au_id, a1.au_fname,
a1.au_lname, a1.state
FROM authors a1, authors a2
WHERE a1.state = a2.state
AND a2.au_id = 'A04';
Self-joins can often be restated as subqueries (see Chapter 8). Using a subquery, Listing 7.34 is equivalent to:
SELECT au_id, au_fname,
au_lname, state
FROM authors
WHERE state IN
(SELECT state
FROM authors
WHERE au_id = 'A04');
For every biography, Listing 7.35 lists the other biographies that outsold it. Note that the WHERE search condition requires type = 'biography' for both tables t1 and t2 because the join condition considers the column type to be two separate columns. See Figure 7.35 for the result.
Listing 7.35For every biography, list the title ID and sales of the other biographies that outsold it. See Figure 7.35 for the result.
SELECT t1.title_id, t1.sales,
t2.title_id AS "Better seller",
t2.sales AS "Higher sales"
FROM titles t1
INNER JOIN titles t2
ON t1.sales < t2.sales
WHERE t1.type = 'biography'
AND t2.type = 'biography'
ORDER BY t1.title_id ASC, t2.sales ASC;
Figure 7.35Result of Listing 7.35.
title_id sales Better seller Higher sales
-------- ------ ------------- ------------
T06 11320 T12 100001
T06 11320 T07 1500200
T12 100001 T07 1500200
Using WHERE syntax, Listing 7.35 is equivalent to:
SELECT t1.title_id, t1.sales,
t2.title_id AS "Better seller",
t2.sales AS "Higher sales"
FROM titles t1, titles t2
WHERE t1.sales < t2.sales
AND t1.type = 'biography'
AND t2.type = 'biography'
ORDER BY t1.title_id ASC, t2.sales ASC;
Listing 7.36 is a self-join to find all pairs of authors within New York state. See Figure 7.36 for the result.
Listing 7.36List all pairs of authors who live in New York state. See Figure 7.36 for the result.
SELECT
a1.au_fname, a1.au_lname,
a2.au_fname, a2.au_lname
FROM authors a1
INNER JOIN authors a2
ON a1.state = a2.state
WHERE a1.state = 'NY'
ORDER BY a1.au_id ASC, a2.au_id ASC;
Figure 7.36Result of Listing 7.36.
au_fname au_lname au_fname au_lname
--------- -------- --------- --------
Sarah Buchman Sarah Buchman
Sarah Buchman Christian Kells
Christian Kells Sarah Buchman
Christian Kells Christian Kells
Using WHERE syntax, Listing 7.36 is equivalent to:
SELECT
a1.au_fname, a1.au_lname,
a2.au_fname, a2.au_lname
FROM authors a1, authors a2
WHERE a1.state = a2.state
AND a1.state = 'NY'
ORDER BY a1.au_id ASC, a2.au_id ASC;
The first and fourth rows of Figure 7.36 are unnecessary because they indicate that Sarah Buchman lives in the same state as Sarah Buchman, and likewise for Christian Kells. Adding a join condition retains only those rows in which the two authors differ (Listing 7.37 and Figure 7.37).
Listing 7.37List all different pairs of authors who live in New York state. See Figure 7.37 for the result.
SELECT
a1.au_fname, a1.au_lname,
a2.au_fname, a2.au_lname
FROM authors a1
INNER JOIN authors a2
ON a1.state = a2.state
AND a1.au_id <> a2.au_id
WHERE a1.state = 'NY'
ORDER BY a1.au_id ASC, a2.au_id ASC;
Figure 7.37Result of Listing 7.37.
au_fname au_lname au_fname au_lname
--------- -------- --------- --------
Sarah Buchman Christian Kells
Christian Kells Sarah Buchman
Using WHERE syntax, Listing 7.37 is equivalent to:
SELECT
a1.au_fname, a1.au_lname,
a2.au_fname, a2.au_lname
FROM authors a1, authors a2
WHERE a1.state = a2.state
AND a1.au_id <> a2.au_id
AND a1.state = 'NY'
ORDER BY a1.au_id ASC, a2.au_id ASC;
Listing 7.37 still isn’t quite what I want because the two result rows are redundant. The first row states that Sarah Buchman lives in the same state as Christian Kells, and the second row gives the same information. To eliminate this redundancy, I’ll change the second join condition’s comparison operator from not-equal to less-than (Listing 7.38 and Figure 7.38).
Listing 7.38List all different pairs of authors who live in New York state, with no redundancies. See Figure 7.38 for the result.
SELECT
a1.au_fname, a1.au_lname,
a2.au_fname, a2.au_lname
FROM authors a1
INNER JOIN authors a2
ON a1.state = a2.state
AND a1.au_id < a2.au_id
WHERE a1.state = 'NY'
ORDER BY a1.au_id ASC, a2.au_id ASC;
Figure 7.38Result of Listing 7.38.
au_fname au_lname au_fname au_lname
-------- -------- --------- --------
Sarah Buchman Christian Kells
Using WHERE syntax, Listing 7.38 is equivalent to:
SELECT
a1.au_fname, a1.au_lname,
a2.au_fname, a2.au_lname
FROM authors a1, authors a2
WHERE a1.state = a2.state
AND a1.au_id < a2.au_id
AND a1.state = 'NY'
ORDER BY a1.au_id ASC, a2.au_id ASC;