Practical SQL: OLAP, Part 2


In this next installment on SQL's Online Analytical Processing (OLAP) functions, you'll see how OLAP continues to provide SQL users the business functions we've enjoyed in RPG for so long.

 

In my previous OLAP article, I used the OLAP function ROW_NUMBER to provide the concept of "next record" that we take for granted in RPG's native access (contrast RPG's native ISAM access with SQL's relational access). In this article, I'll show you the magic of GROUPING SETS and how they can be used to provide the same functionality as control-level breaks.

 

Did You Say Control-Level Breaks?

 

Why yes, yes I did. You recall those lovely little L1 and L2 indicators, don't you? Control-level breaks (also known as level breaks or control breaks) have been around for a very, very long time. In fact, we can reach back nearly two decades into the MC Press archives to read an article on control breaks written by my favorite technical writer of all time, Ted Holt. And of course the concept reaches back decades before then. Control breaks are just an indicator to the programmer that a key field has changed. The simplest use of this concept is when accumulating totals; a change in a key field indicates that a line should be printed and an accumulator should be cleared.

This is an easy concept in RPG. In fact, judicious use of level breaks and blank-after field controls allowed RPG programmers to write reports that tabulated incoming data with very few lines of actual code. The mainline code simply accumulated each record into total fields, and then the level break indicators were used to write the data to a report. The very act of writing the report line cleared the accumulators and the program continued on. I refuse to replicate the code; as wonderful as the cycle was for its time, I don't wish to revisit the days of I-specs, O-specs, and fixed-format C-specs. But trust me when I say it didn't take a lot of programming to do this:

SALES REP   CUSTOMER     Quantity        Amount
---------   --------     --------     ---------
123         1001           200.00      2,464.00
123         1002           343.00      5,155.00
123         2001         1,500.00        780.00
123                      2,043.00      8,399.00
555         1001            50.00      1,000.00
555         2001           150.00      1,805.00
555                        200.00      2,805.00
                         2,243.00     11,204.00

The file processed is the order file, which has among its fields Sales Rep (salesrep), Customer (custnum), Quantity, and Amount. The program simply reads through the file in salesrep/custnum order and accumulates quantity and amount at the three levels of the report: salesrep/custnum, salesrep, and grand totals. It prints out a line at the total break for each level. A good RPG programmer knocks this out in maybe 30 lines of code, including F-, I-, C-, and O-specs.

GROUP BY Doesn't Quite Cut It

In SQL, it's not quite so simple. Yes, you can use GROUP BY to accumulate things in SQL, but it doesn't lend itself very well to interspersing totals and details, or to expressing totals at different levels. In practice, to do the simple list above, you have to do three separate selects with GROUP BY clauses: one for the sales rep and customer totals, one for the sales rep totals, and one for the grand totals. You need to insert dummy values into the customer and sales rep fields in the higher-level accumulations and then finally perform a union between those three. It's not always an easy task. In this case, it's doable, but the results are far from pretty and the code is not particularly intuitive. Here's the SQL:

with t1 as
   -- detail level: one row per sales rep and customer
   (select salesrep, custnum, sum(quantity), sum(amount)
    from salesord group by salesrep, custnum),
t2 as
   -- sales rep totals; 999999 is a dummy customer number
   (select salesrep, 999999 custnum, sum(quantity), sum(amount)
    from salesord group by salesrep),
t3 as
   -- grand totals; 999 and 999999 are dummy key values
   (select 999 salesrep, 999999 custnum, sum(quantity), sum(amount)
    from salesord)
select * from t1 union
select * from t2 union
select * from t3 order by salesrep, custnum

You'll see three common table expressions (CTEs): those are the sub-selects with identifiers T1, T2, and T3. T1 accumulates quantity and sales amount by salesrep and custnum, T2 accumulates by salesrep using a dummy custnum of 999999, and finally T3 accumulates the grand totals using dummy values of 999 for salesrep and 999999 for custnum. The last part of the statement then ties those three subselects together with a UNION and orders them. The result is functional, if not particularly pretty (for brevity's sake, I'm not going to bother with column headings for the rest of the article):

123     1001       200.00      2,464.00
123     1002       343.00      5,155.00
123     2001     1,500.00        780.00
123   999999     2,043.00      8,399.00
555     1001        50.00      1,000.00
555     2001       150.00      1,805.00
555   999999       200.00      2,805.00
999   999999     2,243.00     11,204.00

You end up with the right numbers, certainly, as long as you recognize the 999 and 999999 values as special values that indicate a total line. But I hope you'd agree that the SQL statement isn't exactly the most intuitive thing you've ever seen. Not to mention the problem you might have if you actually have a salesrep number 999!
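One way to sidestep the 999 collision while staying with the UNION approach is to use nulls in place of the dummy values and carry an explicit level column. This is only a sketch: the lvl column and the INTEGER casts are assumptions, so adjust them to match the actual definitions of salesrep and custnum.

```sql
-- Sketch only: "lvl" is a hypothetical column (0 = detail, 1 = sales rep total,
-- 2 = grand total), and INTEGER is an assumed type for the two key fields.
with t1 as
   (select salesrep, custnum, 0 as lvl,
           sum(quantity) as quantity, sum(amount) as amount
    from salesord group by salesrep, custnum),
t2 as
   (select salesrep, cast(null as integer) as custnum, 1 as lvl,
           sum(quantity) as quantity, sum(amount) as amount
    from salesord group by salesrep),
t3 as
   (select cast(null as integer) as salesrep,
           cast(null as integer) as custnum, 2 as lvl,
           sum(quantity) as quantity, sum(amount) as amount
    from salesord)
select * from t1 union all
select * from t2 union all
select * from t3
order by salesrep, custnum
```

Because Db2 sorts nulls after all other values in ascending order, each total row still lands below its detail rows, and the lvl column tells a total row apart from a row whose key is genuinely null. It works, but it is no shorter and no more intuitive than the original.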

Enter GROUPING SETS

And that brings us to the subject of this article, the concept of grouping sets (be careful with the IBM documentation page on the topic; it goes into detail not only on grouping sets but also on the related, somewhat more complex, and in my opinion less flexible CUBE and ROLLUP functions). Grouping sets perform for SQL the same function that level breaks and accumulators perform for our older RPG programs, and they do it in a very intuitive fashion. The SQL statement is quite simple:

select salesrep, custnum, sum(quantity), sum(amount) from salesord
group by grouping sets ( (salesrep, custnum), (salesrep), () )
order by salesrep, custnum

Notice that the statement requires no CTEs and no dummy values. Summing is handled by the GROUPING SETS clause, which specifies that you want totals at three different levels: salesrep and custnum, salesrep only, and then grand totals, indicated by the empty set "()". Here's what you get:

123     1001       200.00      2,464.00
123     1002       343.00      5,155.00
123     2001     1,500.00        780.00
123        -     2,043.00      8,399.00
555     1001        50.00      1,000.00
555     2001       150.00      1,805.00
555        -       200.00      2,805.00
  -        -     2,243.00     11,204.00

Nice! The dashes indicate null values; you can check for the nulls in your program, or you can convert them to a special value using IFNULL in your select statement. Either way, the SQL code is very simple. More important to me is the fact that it's really easy to change the grouping. Here's how I run the same query, grouping by custnum and item instead of salesrep and custnum:

select custnum, item, sum(quantity), sum(amount) from salesord
group by grouping sets ( (custnum, item), (custnum), () )
order by custnum, item

1001   ABC123      200.00      2,464.00
1001   DEF333       50.00      1,000.00
1001   -           250.00      3,464.00
1002   ABC123      220.00      2,695.00
1002   DEF333      123.00      2,460.00
1002   -           343.00      5,155.00
2001   ABC123      150.00      1,805.00
2001   GHI987    1,500.00        780.00
2001   -         1,650.00      2,585.00
   -   -         2,243.00     11,204.00

Boom! The report is reordered. And you know the totals are correct because the grand totals match the grand totals from the other report. I don't even want to go through the changes required for the other report. They're not horrible by any means, but in my opinion they're not nearly as intuitive as the simple changes required here. So, the next time you need to do some accumulation in your SQL, please take a look at the GROUPING SETS clause and see how it can help you. I've got more OLAP functions in the works, so stay tuned!
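A footnote on the nulls mentioned earlier: the sketch below shows two ways a program might recognize the total rows in the first report. It assumes the same salesord table, that 0 is never a real customer number, and that the GROUPING function is available on your release; the column aliases are purely illustrative.

```sql
-- Sketch only: aliases are hypothetical, and 0 is an assumed "no customer" marker.
select salesrep,
       custnum,
       ifnull(custnum, 0) as cust_or_zero,     -- null replaced by the assumed marker value
       grouping(salesrep) as rep_rolled_up,    -- 1 only on the grand-total row
       grouping(custnum)  as cust_rolled_up,   -- 1 on sales rep totals and the grand total
       sum(quantity) as quantity,
       sum(amount)   as amount
from salesord
group by grouping sets ( (salesrep, custnum), (salesrep), () )
order by salesrep, custnum
```

The GROUPING flags are the safer of the two if a key field can legitimately be null in the data, because they mark only the nulls produced by the grouping itself.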
