23
Mon, Dec
1 New Articles

SQL Subselect: Soup to Nuts

SQL
Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times
This article will discuss SQL subselect, starting with basic subselect, then moving to scalar subselect, subselect with the UPDATE, INSERT, and DELETE statements, subselect and CREATE TABLE, and ending with subselect and derived table.

Example Tables

The following tables are used in the examples in this article to illustrate how subselect works.

Employee Master (EMP)
Nbr
Nam
Cls
Sex
Dpt
Sal
10
Ed
5
M
911
700
20
Heikki
2
M
901
600
50
Marcela
3
F
911
750
40
Mike
4
M
977
650
30
John
5
M
977
320
60
Frank
2
M
990
650

Department Master (DEP)
Dpt
Dnm
901
Accounts
911
Sales
977
Manufact
990
Spares

 

Employee Transfer Master (TRANSFER)
Nbr
Nam
Dpt
20
Heikki
901
30
John
977
50
Marcela
911

 

New Salary Table #1 (NEWSAL)
Nbr
Nsal
10
789
20
689
30
409
40
739
50
839
60
739

 

New Salary Table #2 (NEWSAL2)
Nbr
Nsal
10
752
20
675
30
372
60
702

Simple Subselect

The simplest form of subselect can be found in a SELECT statement where a subselect is used to create a selection list for the IN predicate within the WHERE clause.

Suppose you want to see all the employees who work in a department that have a department name beginning with an uppercase S. The challenge is that the department name (the column called dnm) exists only in the Department Master table, not in the Employee Master table.

You need a technique that allows SQL to find those department numbers that have a name beginning with an S and then use that information to determine which employees work in those departments. The use of a subselect on the right side of the IN predicate for the WHERE clause of the SELECT statement provides this capability and is coded as shown below.

SELECT  nbr,  nam,  dpt, sal  FROM  emp
      WHERE  dpt  IN  
                  (SELECT  dpt  FROM  dep  WHERE  dnm  LIKE  'S%')
      ORDER  BY  nbr

This SQL statement retrieves the columns employee number (nbr), employee name (nam), department number (dpt), and salary (sal) FROM the Employee Master table (emp) WHERE the department number (dpt) is IN the list of department numbers (dpt) SELECTed FROM the Department Master table (dep) WHERE the department name (dnm) begins with or is LIKE S. The rows that are retrieved are ORDERed BY employee number (emp).

Nbr
Nam
Dpt
Sal
10
Ed
911
700
50
Marcela
911
750
60
Frank
990
650

 

Subselect Terminology

At this point and before proceeding further, we need to discuss some subselect terminology. The first SQL statement — in this case, the first SELECT statement — is called the primary or outer SQL statement. In this example, it is called the primary or outer SELECT:

SELECT  nbr,  nam,  sal,  dpt, FROM  emp 

The second SELECT statement is the subselect (also called the inner SELECT):

(SELECT  dpt  FROM  dep  WHERE  dnm  LIKE  'S%') 

A maximum of 256 SELECT statements, all referred to as inner selects, can be embedded or nested inside the outer select or outer SQL statement. Please keep in mind that the more subselects you embed together inside an SQL statement, the longer that statement may take to execute.

Scalar Subselect

A scalar subselect retrieves data from a secondary table based on data in the primary table and can be used anywhere an SQL function or expression is used. Let's modify our simple subselect example to include the department name in the rows returned. To do this, you must add a scalar subselect to the column list of the example immediately after the column named sal, as shown below:

SELECT  nbr,  nam,  dpt,  sal,
           (SELECT  dnm  FROM  dep  b  WHERE  a.dpt  =  b.dpt)  
                 AS Dept_Name
      FROM  emp  a
      WHERE  dpt  IN  (SELECT  dpt  FROM  dep  

      WHERE  dnm  LIKE  'S%')
      ORDER  BY  nbr

This example now contains two subselect statements: The scalar subselect within the column list (SELECT nbr, nam, dpt, sal, (SELECT dnm FROM dep b ...)) and the subselect within the WHERE clause (WHERE dpt IN (SELECT dpt FROM dep ...)) of the outer SELECT statement. Executing this SELECT statement returns the same rows as in the previous example: all the employees who work in a department with a department name that begins with S. However, now the department name is also included.

Nbr
Nam
Dpt
Sal
Dept_Name
10
Ed
911
700
Sales
50
Marcela
911
750
Sales
60
Frank
990
650
Spares

In this example, what happens if an employee row exists with an invalid department number in it and the scalar subselect cannot find a matching department number in the Department Master table? If this situation occurs, no department number is retrieved and the default value for the column will be used instead.

If you think about it, this SQL process is very similar to an application program in which you read a record from a primary file and use information from this record to do a single-record random retrieval (CHAIN in RPG, GET BY KEY in other high-level languages) to retrieve information from a secondary file. As you will see later, the information from the record retrieved in the secondary file can also be used to update the current record from the primary file. What should become readily apparent is that it takes significantly less lines of code to do this in SQL than in a high-level language like RPG, Cobol, C, or Java.

Correlated Naming

In this example, the scalar subselect must use correlated naming (translation: qualified column names) since the emp and dep tables each have a column called dpt and these columns are compared in the WHERE clause (WHERE a.dpt = b.dpt) of the scalar subselect. A qualified column name is required whenever the same column name exists in one or more tables, and a comparison will be made between tables using that column name.

Correlated naming is used because SQL supports long table names, up to 128 characters in length. Because of this, SQL allows you to assign a short correlated or alternate name to the table in the FROM clause of the SQL statement, and you can use this alternate name (in place of the long SQL name) to qualify column names where necessary.

In the FROM clause for the outer SELECT statement (FROM emp a) and the FROM clause for the inner scalar subselect (FROM dep b), the table called emp is followed by an a, and the table called dep is followed by a b. The a is the correlated or alternate name for emp, and the b is the correlated or alternate name for dep; these are then used to qualify the column name in the WHERE clause of the scalar subselect (WHERE a.dpt = b.dpt). You can use any alternate name of your choosing; I arbitrarily chose a and b.

Subselect and INSERT

Subselect can be used in the INSERT statement to add one or more rows to a table or to initially populate the table with rows. To illustrate this, let's create an Employee Name work table using the CREATE statement and then populate the work table using a simple subselect in the INSERT statement. The CREATE and INSERT statements required to do this are shown below:

CREATE TABLE    empname                         
      (number   DEC (3,0)   NOT NULL WITH DEFAULT,
       name   CHAR (10)    NOT NULL WITH DEFAULT,  
       dept    DEC (3,0)    NOT NULL WITH DEFAULT) 

INSERT INTO    empname                
      SELECT    nbr,    nam,    dpt    FROM    emp

After executing these two SQL statements, the work table contains the following rows:

Number
Name
Dept
10
Ed
911
20
Heikki
901
30
John
977
40
Mike
977
50
Marcela
911
60
Frank
990

You should note that the column names in the INTO clause of the INSERT statement must be listed in the same order and must have attributes compatible with the corresponding columns in the CREATE TABLE statement.

Now, let's add a layer of complexity to this example by creating the work table and including the department name in addition to the three original columns. Can subselect still be used with the INSERT statement to populate the work table? Yes! The CREATE TABLE and INSERT statements to do that are shown below:

CREATE TABLE    empname                         
      (number   DEC (3,0)   NOT NULL WITH DEFAULT,
       name   CHAR (10)    NOT NULL WITH DEFAULT,  
       dept    DEC (3,0)    NOT NULL WITH DEFAULT,
       dptnam   CHAR (10)    NOT NULL WITH DEFAULT) 

INSERT INTO    empname                
      SELECT   nbr,   nam,   dpt,  
                  (SELECT   dnm   FROM   dep   b 
                        WHERE   a.dpt   =   b.dpt)   AS   dptnam 
            FROM   emp   a


After executing these SQL statements, the work table contains the following rows, which now include the department name.

Number
Name
Dept
Dptnam
10
Ed
911
Sales
20
Heikki
901
Accounts
30
John
977
Manufact
40
Mike
977
Manufact
50
Marcela
911
Sales
60
Frank
990
Spares

Subselect and CREATE TABLE

This last example can actually be done in one step (instead of two steps) by using a single subselect within the CREATE TABLE statement: the CREATE TABLE followed by the INSERT. To do this, replace the column list in the previous CREATE TABLE statement with subselect from the INSERT statement, as shown below:

CREATE TABLE    empname    AS                
      (SELECT   nbr,   nam,   dpt,                              
                   (SELECT   dnm   FROM   dep   b 
                         WHERE   a.dpt   =   b.dpt)   AS   dptnam 
            FROM   emp   a)
      WITH DATA

The work table that results from executing this CREATE TABLE is the same as the one that results from executing the INSERT followed by the CREATE TABLE.

When using CREATE TABLE with the subselect, each derived column in the subselect must be given a name with the AS operator, and the CREATE TABLE statement must be ended with the WITH DATA or WITH NO DATA clause. The WITH DATA clause causes the table to be created and populated with data, and the WITH NO DATA clause causes the table to be created without being populated with data (an empty member).

Subselect and DELETE

You can use subselect in the DELETE statement much the same way you can use it in the INSERT statement.

Suppose that a few of the employees in the Employee Master table have been transferred to another division of the company. After sending the employee information to the receiving division, you need to delete the rows for the transferred employees in the Employee Master table.

The Employee Transfer Master table shown below has the information for three employees that are to be transferred.

Nbr
Nam
Dpt
20
Heikki
901
30
John
977
50
Marcela
911

This table can be referenced in the FROM clause of a subselect within a DELETE statement that will delete those employees that are being transferred, as shown in the following example:

DELETE   FROM   emp   WHERE   nbr   IN 
      (SELECT   nbr   FROM   transfer)

After the code executes, the three employees will have been deleted from the Employee Master table, and it will contain the following rows:

Nbr
Nam
Cls
Sex
Dpt
Sal
10
Ed
5
M
911
700
40
Mike
4
M
977
650
60
Frank
2
M
990
650

 

Subselect and UPDATE

You can use a subselect in an UPDATE statement to update one or more columns in a row from one table based on the value or values retrieved from one or more columns in a row from another table. This is similar to an RPG program that retrieves a column value from a row in a primary table and then chains to a row in a secondary table using that column value as the key. Another, different column value is then retrieved from the row in the secondary table and used to update a corresponding column in the current row of the primary table.

In the following example, we want to change each employee's current salary to the amount listed in the new salary file. To do this, we use the UPDATE statement in conjunction with the scalar subselect, as shown:

UPDATE   emp   aa 
      SET   sal   =   
            (SELECT   nsal   FROM   newsal   bb  
                  WHERE   aa.nbr   =   bb.nbr)

When using the SQL syntax listed above, each row in the emp table must have a corresponding, matching row in the newsal table, based on employee number (nbr). If you review the rows in the newsal table found in the Example Tables section at the beginning of this article, you will see that this is the case. Therefore, all rows in the emp table are updated and will reflect the new salaries as shown below:

Nbr
Nam
Cls
Sex
Dpt
Sal
10
Ed
5
M
911
789
20
Heikki
2
M
901
689
30
John
5
M
977
409
40
Mike
4
M
977
739
50
Marcela
3
F
911
839
60
Frank
2
M
990
739

The obvious question is what happens when each row in the emp table does not have a corresponding, matching row in the newsal table. The answer is simple: The update fails with the first row that does not have a match in the newsal table. If you review the rows in the newsal2 table listed in the example tables at the beginning of this article, you will see that employee numbers 40 and 50 do not have corresponding rows in the newsal2 table. If the update of the original emp table is attempted using the newsal2 table, the update fails with the following error message:

UPDATE   emp   aa 
      SET   sal   =   
            (SELECT   nsal   FROM   newsal2   bb  
                  WHERE   aa.nbr   =   bb.nbr)

Null values not allowed in column or variable SAL.

This UPDATE statement fails at employee number 40 because there is no corresponding employee number 40 row provided in the newsal2 table to update the salary column in the emp table. Therefore, the default update value for the salary column becomes a null, and since the salary column is not null-capable, the UPDATE statement fails at employee number 40. In this situation, the Employee Master table is not journaled and is not using commitment control. Therefore, employee numbers 10, 20, and 30 are successfully updated, but employee number 60 is not.

This UPDATE statement can be written or coded so it will complete successfully even though each row in the emp table does not have a corresponding, matching row in the newsal2 table. By adding a WHERE clause with a second subselect to the UPDATE statement as shown below, the designated rows for employee numbers 10, 20, 30, and 60 will be updated successfully.

UPDATE   emp   aa 
            SET   sal   =   
                  (SELECT   nsal   FROM   newsal2   bb  
                        WHERE   aa.nbr   =   bb.nbr)
     WHERE   aa.nbr   IN (SELECT   nbr   FROM   newsal2)

The second subselect in the WHERE clause of the UPDATE statement is used to build an IN list based on the employee numbers in the newsal2 table. The result is that only those employee rows in the emp table that have a corresponding matching row in the newsal2 table are updated with a new salary. The result is that employee numbers 10, 20, 30, and 60 are updated with a new salary, but employee numbers 40 and 50 are not updated. The results of the update to the salary column in the emp table are shown below:

Nbr
Nam
Cls
Sex
Dpt
Sal
10
Ed
5
M
911
752
20
Heikki
2
M
901
675
30
John
5
M
977
372
40
Mike
4
M
977
650
50
Marcela
3
F
911
750
60
Frank
2
M
990
702

 

Derived Table and Subselect

A derived table is one that does not exist until the SQL statement is executed; it's created dynamically on the fly by the SQL statement execution. A subselect is used in the FROM clause of a SELECT statement to create the derived table as part of the statement execution process.

Suppose you want to see all the employee data for the first or lowest employee number in each department. (Yes, I know this isn't the best example, but it works with the example tables provided at the beginning of the article). There is no way to do this without using a derived table. The SELECT statement using a derived table is shown below:
 

SELECT   *   FROM   emp   
      WHERE   nbr   IN                              
            (SELECT   number   FROM                 
                  (SELECT   dpt,   MIN(nbr)   AS  number   
                        FROM emp                         
                        GROUP BY   dpt)   
                  AS   first_row_table)          
     ORDER BY   nbr

This solution employs two subselects, with the innermost subselect using a derived table function. It works as follows: The desired result is to retrieve the first or lowest employee number for each department number in the emp table. To accomplish this, an IN list containing the employee numbers for these rows meeting the above criteria needs to be created within the WHERE clause for the outer SELECT statement.

This is done with the subselect immediately after the WHERE clause. This first (or leftmost) subselect employs a second (or inner) subselect that uses the MINIMUM function (MIN(nbr) AS number) and derives a temporary table (... FROM (SELECT number ...)) named first_row_table, which uses the GROUP BY function to group the rows in the derived table by department number and the first or lowest employee number for that department. Note that the AS clause (AS first_row_table) in the first subselect appears not to be referenced, but it is required.

When this SELECT statement with the derived table is executed, the following rows are retrieved:

Nbr
Name
Class
Sex
Dept
Salary
10
Ed
5
M
911
700
20
Heikki
2
M
901
600
30
John
5
M
977
320
60
Frank
2
M
990
650


If you compare the retrieved rows to the rows listed in the Employee Master table, you will find that the first employee row (the row with the lowest employee number) for each department is listed in the result set.

Take Advantage of Subselect

A subselect allows you to embed up to 256 SELECT statements inside another SELECT, INSERT, UPDATE, DELETE, or CREATE TABLE statement to create very powerful, advanced SQL statements. This provides the capability to reference multiple tables in a SELECT, INSERT, UPDATE, DELETE, or CREATE TABLE statement, without having to do a join of the tables involved. Subselect provides the capability to write a single SQL statement that can perform a complex database task that could require many more lines of code and be significantly more complex if you used a high-level language like RPG, Cobol, C, or Java. The result can be a significant increase in productivity when performing database-related types of tasks.

By learning and using SQL subselect, you add a powerful tool to your repertoire as an IT professional. The key is to make the commitment to do so.

BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$

Book Reviews

Resource Center

  • SB Profound WC 5536 Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

  • SB Profound WP 5539More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

  • SB HelpSystems ROBOT Generic IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

  • Magic MarkTRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

  • SB HelpSystems ROBOT GenericForms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

  • SB HelpSystems ROBOT GenericIT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

  • SB HelpSystems ROBOT GenericCan you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

  • FORTRA Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

  • FORTRAManaging messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

  • FORTRAThe thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

  • FORTRAFor over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

  • LANSA Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

  • LANSAWhen it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

  • LANSASupply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

  • The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

  • Profound Logic Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.

  • SB Profound WC 5536Join us for this hour-long webcast that will explore:

  • Fortra IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn: