23
Sat, Nov
1 New Articles

Of Shoes and SQL Subqueries

SQL
Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

In 1936, track athlete Jesse Owens competed in the Berlin Olympic Games, capturing four gold metals. On his feet were the work of the master running shoe makers, the Dassler brothers. These brothers eventually had a large fight and split. Each brother formed his own firm: One started Puma, and the other started Adidas. Sometimes things get even better when they are split up.

Think of SQL in the same way. There are some tasks you just can’t accomplish, or at least not elegantly, without breaking a statement up into subqueries. Subqueries are also known as subselects, inner queries, or inner selects. A subquery is a nested query within another query. You can even nest a subquery within another subquery. A subquery may be used in the WHERE, HAVING, or FROM clauses of an SQL statement. The subquery answers some intermediate question, the result of which will be used to complete the main query.

To demonstrate the subquery feature in SQL, I created four tables. Figures 1 through 4 show sample data for the Customer, Inventory, Order, and Item Type tables. If you’d like to put these tables on your system so that you can try out my examples, download the text file that accompanies this article from the Midrange Computing Web site at www.midrangecomputing.com/mc. The text file contains the SQL statements for creating the tables and for inserting rows into them. Note that SQL naming conventions are used here. That is, the term table is used instead of physical file. Row means record, and column means field.

The WHERE Clause

Use a subquery in an SQL WHERE clause whenever there is a need for comparison to summarized data within a table. Suppose, for example, you work for the Adidas outlet. One morning, the sales manager comes and asks for a simple query. She wants to know which orders have a value that is greater than the average value of all the orders. She wants to reward the customers who placed the largest orders with a special gift. Your first reaction may be to type in an SQL statement:

select * from orders

where ordervalue > avg(ordervalue)


Unfortunately, this is not allowed. You may not use a column function like AVG (average) in this manner. Instead, you need a summary of the orders table that will return the average of the ordervalue column. This is a subquery. The subquery will look like this: In our hypothetical scenario, this subquery will return the value of 335.833. To get the query to work, insert the subquery into the WHERE clause comparison: This returns a result set with the orders 22, 23, 25, 30, and 36—all of the orders that have a value over $335.83. The parentheses around the subquery are required. They indicate to the SQL engine that this is a subquery and not a syntax error on the user’s part.

Be careful how you define your subquery when you will use the result set to compare to a column in the outer query. The result set of the subquery must be compatible with the column’s datatype to which the comparison is being made. In this example, ordervalue is decimal, so the result set from the subquery must return a numeric value.

Correlated Subqueries

You can also use a subquery in an SQL WHERE clause whenever you need a comparison to find out if a row exists in another table. Suppose the sales manager liked the first report so well that she came back and had another request. She decides to pay a personal visit to customers who are not doing business with the company right now. What she needs is a list of every customer who does not have anything on order. The SQL code to accomplish this looks like the following: Notice that the inner and outer queries are tied together by using the customernumber from the outer query in the subquery. This has a special name. It is called a correlated subquery.

In this statement, the outer query takes each row from the customer’s table and keeps it in the result set if a row does not exist in the result set of the subquery. In this case, the only row returned is that of customer 208; all other existing customers have something on order. For each row in the outer query, the result set in the subquery is recalculated. EXISTS is a keyword in SQL that returns true if the subquery returns any rows at all. It returns false if no rows are returned. The NOT keyword was added because the request was for customers without orders. If the NOT were removed from the WHERE clause, the result set of the outer query would be all customers who have orders.

ALL and ANY

The next thing you know, the inventory manager pops his head in the door. He needs to know all of the inventory items for which the quantity on hand is greater than or equal to the average quantity on hand of any item type. SQL has an answer here as well:


select avg(ordervalue) from orders

select * from orders

where ordervalue >

(select avg(ordervalue)

from orders)

select firstname, lastname,

customernumber

from customers

where not exists

(select * from orders

where customernumber =

customers.customernumber)

select itemnumber,

itemdescription,

itemtype,

quantityonhand

from inventory

where quantityonhand >= all

(select avg(quantityonhand)

from inventory

group by itemtype)

The subquery returns a result set that will include the average quantity on hand for each item type. The outer query takes each row from the inventory table and compares the quantity on hand from that row with each row in the subquery’s result set. The row is included in the result set of the outer query if the outer query’s quantity on hand is greater than or equal to all entries in the subquery result set. This statement returns items 102, 104, 105, 108, and 110.

You can take the same example and replace the ALL keyword with the keyword ANY to achieve a different result set. ANY will compare each row in the outer query with the complete result set of the subquery. The row will be included in the outer query result set if the quantity on hand is greater than or equal to any of the subquery’s rows. With this change, the statement would return items 101, 102, 104, 105, 106, 108, and 110.

Set Membership

Two minutes later, the inventory manager comes back. He thinks there is a data integrity problem and wonders if there may be a bug in one of your programs that is allowing item types into the item master table that are not in the item type master table.

The keyword IN can be used to compare the rows generated from the outer query to a subselect’s set of values: In this case, the item type from the outer query is compared against a list of all of the item types contained within the item type master table. If the statement finds a type in the item master that is not in the item type master table, the row will be in the result set. In this case, that would be CASUL. When you run this query, item numbers 106 and 109 appear in the result set.

I have shown how to construct a subselect and how to use a subselect in a WHERE clause. In addition, I’ve given examples of a variety of essential keywords like IN, NOT, ALL, ANY, and EXISTS. These are fundamental building blocks for advanced SQL use. From here, I will show you examples of other places within SQL that the subselect can be used. These examples will use the basic syntax already seen in this article.

Using Subselects in the HAVING Clause

The HAVING clause is similar to the WHERE clause. The WHERE clause selects individual rows from tables. The HAVING clause selects summaries of rows. During processing, the query selects the rows that should be included based on the WHERE


select itemnumber,

itemdescription,

itemtype,

quantityonhand

from inventory

where itemtype not in

(select tittyp

from itemtype)

clause. Then, the records are grouped based on the GROUP BY clause. It is this intermediate result set that is then acted upon by the HAVING clause to produce the final result set.

This statement produces a summary of the inventory table by itemtype that shows each item type and the average quantity on hand for that item type.

ITEMTYPE AVG
CASUL 20.0000
DRESS 60.0000
RUN 53.2857

Perhaps you were really only interested in item types that had an average quantity on hand greater than fifty. If so, use the HAVING clause to exclude some of the summarized records. The following example does this and its result set is the same as the previous result set without the row for CASUL.

As with the WHERE clause, you can add subqueries to the HAVING clause for more powerful SQL statements. The inventory manager needs a new report. Once again, he wants to know the average quantity on hand for each item type. But this time, he only wants the item types reported where the average quantity on hand is less that the overall average quantity on hand for the entire inventory. To accomplish this, use a subquery in the HAVING clause.

The result set contains only the CASUL shoe item type.

The JOIN Clause

Imagine that you were limited to just one F-spec per RPG program. Life would be quite interesting trying to build sophisticated information systems with only one file allowed per program. Storing data in relational databases implies normalized data and multiple tables connected by key columns. Whether you are coding in RPG or SQL, you will need to access more than one table at a time to solve complex problems. SQL implements access to multiple tables in the FROM clause by using the JOIN keyword. A JOIN will always specify at least two tables.

A simple join would look like this:

Select itemtype,

avg(quantityonhand)

from inventory

group by itemtype

Select itemtype,

avg(quantityonhand)

from inventory

group by itemtype

having avg(quantityonhand)<50

select itemtype, avg(quantityonhand)

from inventory

group by itemtype

having avg(quantityonhand) <

(select avg(quantityonhand)

from inventory)

Select ordernumber,


ord.customernumber,

firstname, lastname

from orders ord

join customers cust

on ord.customernumber =

cust.customernumber

This joins the order table and the customer table by the customer number so that the result set of order numbers can also include the first and last name of the person ordering. Figure 5 shows the result set.

SQL is so flexible that even here in the join clause, a subselect can be used. How can this be an advantage? Look at this example: What will you get as output? (Hey, you should know this stuff by now!) You will get a list of all orders in which the order value is greater than the average of all order values for that customer. The SQL command certainly was a very simple way to get out this information, much easier and faster than coding in RPG. However, something is missing. The result set does not tell you what the average value is. To get the average value in the result set, you must execute a second query containing the subquery—not very elegant.

Once again, subqueries can come to the rescue. By using a subquery in the FROM clause, you get a final result set that includes the average amount. The ability to combine detailed and summarized data using a subquery in the FROM clause is very powerful and little known.

Select ord.ordernumber,

ord.customernumber,

ord.ordervalue,

ordsum.val

from orders ord join

(select customernumber,

avg(ordervalue) as val

from orders

group by customernumber) ordsum

on ord.customernumber=ordsum.customernumber

where ord.ordervalue>ordsum.val

order by ordernumber

See Figure 6 for a comparison of the result sets of this and the previous examples.

Select This, Select That

So far, every example has been with a select statement. Is it possible to use subqueries elsewhere in SQL? It is not only possible, it is profitable to do so. Subqueries in select statements make your reporting options easier and more powerful. Subqueries in other SQL statements make your maintenance tasks a breeze.


select ord.ordernumber,

ord.customernumber,

ord.ordervalue

from orders ord

where ordervalue>

(select avg(ordervalue)

from orders sums

where sums.customernumber =

ord.customernumber)

What if you needed to create a subset of data in a table? Let’s say you have to create a copy of your item master table that only contains items of a single item type. Assuming the copy of the table is called partialInventoryMaster, your query would look like this: This would save all of those pesky CASUL rows to partialInventory Master. As you may recall, they caused the data integrity problem since CASUL did not exist in the item type master table. Be careful, you want to match the layout of the target table with the layout of the result set from the subquery. I used an asterisk (*) to get all of the columns in itemmaster. The copy of the table, partialInventoryMaster, did not have to have all of the columns defined in it that are in itemmaster. That, too, could have been a subset. In that case, the asterisk would be replaced by the explicit column names, in the explicit column order found in partialInventoryMaster. A subselect can be used just as easily in a DELETE statement or an UPDATE statement for powerful control.

Tying It All Up

Some of these examples could have had the exact same results with simpler, non-subquery SQL statements, but I have tried to keep the examples as simple as possible so that you can build upon them later. The key thing to remember is that a subquery may be included in the WHERE, HAVING, and FROM clauses in SQL. When in doubt, just try it out. At worst, the SQL processor will give you a nasty message. At best, you will save yourself hours of RPG programming. It is always a joy to do in one SQL statement what would have taken many lines of RPG, DDS, and CL code.

If more RPG programmers made the small effort to learn SQL, there would be a measurable leap in RPG programmer productivity. On top of that, SQL is cross-platform, so it is a skill that will travel with you wherever you may roam. Now put on your running shoes and keep up with the competition by learning SQL.

FIRSTNAME LASTNAME CUSTOMERNUMBER CUSTOMERTYPE

BILLY SMITH 200 RUN JOHNNY WILLIAMS 201 RUN MYRON WILTON 202 DRESS GUNTHER MASTERSON 204 RUN LARRY LEVIATHAN 205 CASUL HARRY NARVISTUS 206 RUN LARRY LION 207 DRESS CYNTHIA CELLO 209 RUN WENDY LAVA 210 RUN TONYA QUEST 203 RUN YIMMEE AWAY 208 CASUL

Figure 1: This is the sample data for the Customer table.


Insert into partialInventoryMaster

select *

from inventory

where itemtype = ‘CASUL’

ITEMNUMBER ITEMDESCRIPTION ITEMTYPE QUANTITYONHAND

100 Fast Running Shoe RUN 16 101 Dress Shoe w/Holes DRESS 45 102 Slow Running Shoe RUN 105 103 Blue Running Shoe RUN 12 104 Sprinting Shoe RUN 66 105 Pilgrim Shoe DRESS 75 106 Wagon Wheel Shoe CASUL 35 107 Pole Vault Shoe RUN 2 108 Fast Green Shoe RUN 80 109 Earthy Shoe CASUL 5 110 Yellow Slow Shoe RUN 92

Figure 2: This is the sample data for the Inventory table.

ORDERNUMBER CUSTOMERNUMBER ORDERVALUE LINEITEMS

19 207 100.00 4 20 201 75.00 3 22 210 500.00 4 23 210 600.00 6 24 205 67.00 1 25 206 360.00 10 30 209 1000.00 35 35 202 100.00 1 36 201 890.00 23 37 200 98.00 2 38 204 62.00 3 39 203 178.00 5

Figure 3: This is the sample data for the Order table.


ITEMTYPE TYPEDESCRIPTION

RUN Running Shoe DRESS Dress Shoe

Figure 4: This is the sample data for the Item Type table.

ORDERNUMBER CUSTOMERNUMBER FIRSTNAME LASTNAME

19 207 LARRY LION 20 201 JOHNNY WILLIAMS 22 210 WENDY LAVA
23 210 WENDY LAVA
24 205 LARRY LEVIATHAN 25 206 HARRY NARVISTUS 30 209 CYNTHIA CELLO
35 202 MYRON WILTON
36 201 JOHNNY WILLIAMS 37 200 BILLY SMITH
38 204 GUNTHER MASTERSON 39 203 TONYA QUEST

Figure 5: Joins are simple to master and can be used to build complex queries.

First Query: ORDERNUMBER CUSTOMERNUMBER ORDERVALUE 23 210 600.00
36 201 890.00 With Subquery:
ORDERNUMBER CUSTOMERNUMBER ORDERVALUE VA 23 210 600.00 550.00 36 201 890.00 482.50

Figure 6: A subquery in a FROM


BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$

Book Reviews

Resource Center

  • SB Profound WC 5536 Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

  • SB Profound WP 5539More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

  • SB HelpSystems ROBOT Generic IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

  • Magic MarkTRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

  • SB HelpSystems ROBOT GenericForms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

  • SB HelpSystems ROBOT GenericIT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

  • SB HelpSystems ROBOT GenericCan you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

  • FORTRA Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

  • FORTRAManaging messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

  • FORTRAThe thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

  • FORTRAFor over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

  • LANSA Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

  • LANSAWhen it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

  • LANSASupply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

  • The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

  • Profound Logic Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.

  • SB Profound WC 5536Join us for this hour-long webcast that will explore:

  • Fortra IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn: