24
Tue, Dec
1 New Articles

TechTip: Creating Local OLAP Cube Data Files with DB2 Data

Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

Give users offline access to DB2 data in a versatile pivot table format.

 

In "Crunching AS/400 Data with OLAP Cubes and Excel 2000," I outlined how to use the Pivot Table wizard with Excel 2000 (and higher) to create pivot tables from DB2 data using an ODBC data source. This TechTip will go a step further.

 

Pivot tables are extremely useful because they allow a user to aggregate data in innumerable ways. For instance, sales dollars and quantity can be summarized by calendar quarter, by customer, by item classification, etc. Pivot tables, therefore, relieve IT from writing the same basic report over and over. (How many times have developers written the same basic report, just summarized a little differently?) Instead, pivot tables allow IT to collect the basic data and leave it to the user to summarize and format it as desired.

 

In this TechTip, I'll demonstrate how a script process can create a local OLAP cube file from DB2 data without using the Excel interface. (An OLAP cube file is a specialized file that stores aggregations in an optimized format.) Once a cube file is created, it can be distributed via a network share, emailed, or posted on a Web site for users to access and review using pivot tables within Excel. Since this is a script process, the cube data build can be scheduled to refresh at regular intervals. Furthermore, creating a local cube file alleviates the need for the user to have DB2 connectivity software and to go through the complicated Create Cube wizard.

Requirements

I'll demonstrate the technique to build a cube file using VBScript, although you could use VBA, .NET, or any other development environment that can take advantage of the ActiveX Data Objects library. The ODBC component from the System i Access CD (formerly iSeries Access) is required to be installed on the computer that will create the cube file. Alternatively, a third-party ODBC driver can be substituted. Keep in mind that almost any database server in your organization can be used as a data source for an offline cube.

 

You computer should also have the latest Microsoft Data Access Components (MDAC) library installed, along with a version of Excel (2000 or higher.) Most computers already have a recent MDAC version installed.

The Data

Since IBM hasn't given DB2 for i users a large test data set to tinker with (hint to IBM: the CORPDATA sample data is good, but we need more data!), I decided to re-use the data from the original article, an export of Microsoft's Northwind Traders Access database to DB2. The Northwind database stores customer order data. Figure 1 below shows the tables and relationships in the Northwind schema.

 

102408SansoterraFigure1.JPG

Figure 1: These are the tables and relationships in the Northwind schema example. (Click images to enlarge.)

 

A Quick Review

How can we make this relational data useful to management? One way is to create a cube file that will "pre-aggregate" the sales data, such as quantity and extended price (aka measures) by a variety of summarization columns (aka dimensions) such as item number, item description, item category, and customer name. Once the data is pre-aggregated in a local cube file, we can allow users to quickly review the aggregations in Excel using pivot tables (more about that in a minute).

 

But first, remember that when defining a cube we need to define what is being aggregated (sometimes called measures, facts, or values) and how to summarize these aggregates (called dimensions or labels). The table below shows a list of measures and dimensions for the sample cube.

 

Sample Cube Definition

Dimensions

Measures / Aggregate

Product Category

Quantity

Sum

Country

Extended Price

Sum

Customer

Highest Price

Max

Order Date

 

Product

 

Supplier

 

 

When this cube is built, it will contain summarizations of the aggregates by all combinations of dimensions. For example, a user could ask for total quantity and sales by supplier, by country within product, etc., and the cube will give an answer. This example is simple, but other meaningful measures, such as gross profit (extended price minus cost), could be added to the cube as a measure.

 

If you find yourself lost at this point, it is probably a good idea to review the prior article. Here is a warning from the previous article when designing a cube: local cube files are limited in scalability, so choose your dimension combinations wisely. Each additional dimension exponentially bumps up the number of aggregations done and increases the cube file size, processing time, and wait time accordingly.

A Brief Review of Pivot Tables

What is a pivot table? A pivot table is a tool in Excel that allows summarization, sorting, and analysis of data from a data source. Possible data sources to feed a pivot table are a range of cells in a spreadsheet or, you guessed it, an OLAP cube file.

 

The best way to talk about pivot table is to show one. Figure 2 shows an Excel 2007 spreadsheet containing a pivot table based on our sample cube.

 

102408SansoterraFigure2.JPG

Figure 2: This Excel spreadsheet contains a pivot table based on our sample cube.

 

As you can see, the Extended Price column (i.e., sales data) is summarized by date (columns) and product category (rows). The Country box at the top left is called a report filter. In this case, the report filter can be used to drill down to one or more countries.

 

The beauty is that, by a simple drag-and-drop operation, the data can be summarized in another way. Instead of summarizing rows by product category, we can summarize by supplier or customer or customer within product. If we wanted, we could put the order date (year, quarter, etc.) as a report filter in place of country. The possibilities are endless, and users can do this themselves so that IT doesn't have to write yet another variation on an existing report.

 

For those who need pivot table help, there are plenty of tutorials in the Microsoft help and on the Web.

Building a Local Cube Offline

In the prior article, we used the "Microsoft Query add-in for Excel" to build the cube data within Excel. Now however, we want to write a program to build a cube. In order to do this, we'll need the following:

•·        A connection to the cube's data source (in this case, ODBC will get data from DB2)

•·        A cube definition that defines the cube's measures and dimensions

•·        A DB2 SQL statement that can gather all of the required cube information into a single result set

 

With these things in mind, look at the end of this TechTip for the sample VBScript code (CreateSalesCube.vbs) required to build a cube file.

 

A local cube file is created using a specialized OLE DB provider from Microsoft called MSOLAP. As with any OLE DB provider, to use MSOLAP, you need to do the following:

•·        Instantiate an ActiveX Data Objects (ADO) connection object. (ADO is a standard Microsoft technology that allows a data connection to be made to a variety of data sources, usually database servers but also Exchange, Active Directory, and spreadsheets. An OLE DB provider is a vendor-specific plug-in to the ADO library.)

•·        Set a connection string to a data source.

•·        Open the connection.

 

Once the connection is opened, a program performs operations against a data source and then closes the connection.

 

The great thing about using MSOLAP is that it will build a local cube file from a remote data source using instructions specified all within the connection string. Once the connection is opened, the local cube will be built with no subsequent instructions. Therefore, the VBScript program does the following four steps:

•·        Create a connection object.

•·        Set the connection string to create a cube from a remote data source.

•·        Open the connection (which executes the connection string instructions).

•·        Close the connection.

 

Now that's easy enough to understand, but the connection string itself needs explanation. Not your typical connection string, the connection string for MSOLAP is divided into the following sections:

•·        LOCATION identifies where the local cube file will be created. It can be a drive path or a UNC path.

•·        SOURCE_DSN is the ODBC data source name to use when executing the underlying data source query (in this case, it will be an ODBC DSN called AS400).

•·        CREATECUBE is the statement that defines the structure of the cube, including the measures and dimensions. A tutorial on this topic would be extensive, but I will give a hint on an easy way to do this! After conceptually designing my local cube and building a supporting SQL statement, I use the Microsoft Query cube wizard to build this statement for me! (This method is outlined in the previous article.) The file name for an OLAP query created by MS Query is .OQY. Once you've created an .OQY file, simply open it with Notepad and extract the CREATECUBE section.

•·        INSERTINTO is the statement that populates the cube from the supplied SQL statement. The DB2 SQL joins various tables to get all the relevant data for the cube. As a side note, if your dimension columns have the possibility of containing a NULL, it is a good idea to modify your SQL to use a function such as COALESCE to replace NULLs with a value such as *Undefined*. Also notice here that the number and sequence of columns in the SELECT statement should correspond with the cube's columns.

 

With a mammoth connection string like that, it's no wonder all the work is done when the connection is opened!

What to Do with the Local Cube File

Once a cube file is created, there are a number of options for getting the data to the users. Here are a few ideas:

1.      Cube files can be opened directly by Excel using the File-Open dialog. Unfortunately, the Excel installation process doesn't automatically associate the cube (.CUB) extension with the Excel application. However, this can be easily remedied by associating Excel with a .CUB file extension using Windows Explorer. Also, if Excel squawks about a data source being potentially unsafe, just tell Excel to "get over it" and move on. When opening a cube file directly, Excel will present an empty pivot table.

2.      Cube files can be placed on a network drive for access by Excel users. If my memory serves me, though, with Excel 2000 only one user at a time can access a cube file.

3.      The cube creation script can be enhanced to zip and distribute cube files to users via email. This is a good solution for road warrior sales personnel or busy execs who can't necessarily connect directly to the DB2 server but may want to analyze current numbers.

4.      If the network configuration allows, copy the cube file right to a user's "My Documents" folder for easy access.

5.      Create a pre-built spreadsheet with pivot tables and pivot charts that reference the cube data on a network share. This option alleviates the need to have users start the pivot table process every time they open a cube file. In fact, a workbook can be created that has several predefined pivot table views that can serve as a starting point for users.

6.      It is a good idea to schedule the cube to be rebuilt automatically for a sensible period of time (depending on the application) such as daily, weekly, or monthly.

 

Number Crunching the Fun Way

The combination of OLAP cubes and pivot tables is compelling because it allows users to create their own report variations with existing tools. This allows developers to do more useful things than writing a million variations of a single report. The cube file also allows quite a bit of data to be analyzed without having to purchase an OLAP product.

 

OLAP cubes allow DB2 users to crunch data offline in a flexible and easy-to-use format. With a little training, the bean counters and sales reps in your company will love them!

 

Sample VBScript (CreateSalesCube.vbs) Code Required to Build a Cube File

 

'

' CreateSalesCube.VBS

'

'

'

' This script is intended to build

' a local OLAP cube file (.CUB) from

' "DB2 for i" data on a regular basis.

'

'

' Michael Sansoterra    09/20/2008

'

'

'

'

' Build Connection String

' (4 Components required for MSOLAP provider)

'

' 1.  Location of Cube

'

Dim ConnLocation

ConnLocation="LOCATION=C:NorthwindExample.cub;"

'

' 2.  ODBC Data Source Name

'     (Watch for those embedded double quotes!)

'

'     Place your own DSN name, credentials and other

'     connection string keywords here.

'

'

'     This example assumes AS400 uses the iSeries Access

'     (aka System i Access) ODBC Driver

'

Dim ConnDSN

ConnDSN="SOURCE_DSN=""DSN=AS400;UID=USERID;PWD=PASSWORD"";"

'

' 3.  Create Cube Statement (Define Dimensions & Measures)

'

Dim ConnCreateCube

ConnCreateCube = _

"CREATECUBE=CREATE CUBE [OrderCube] (" & _

"DIMENSION [Customer]," & _

"    LEVEL [All] TYPE ALL," & _

"    LEVEL [CompanyName]," & _

"DIMENSION [Country]," & _

"    LEVEL [All] TYPE ALL," & _

"    LEVEL [Country]," & _

"    LEVEL [Region], " & _

"DIMENSION [Product]," & _

"    LEVEL [All] TYPE ALL," & _

"    LEVEL [ProductName]," & _

"DIMENSION [Supplier]," & _

"    LEVEL [All] TYPE ALL," & _

"    LEVEL [CompanyName1]," & _

"DIMENSION [Category]," & _

"    LEVEL [All] TYPE ALL," & _

"    LEVEL [CategoryName]," & _

"DIMENSION [OrderDate] TYPE TIME," & _

"    LEVEL [All] TYPE ALL," & _

"    LEVEL [Year] TYPE YEAR," & _

"    LEVEL [Quarter] TYPE QUARTER," & _

"    LEVEL [Month] TYPE MONTH," & _

"   MEASURE [Quantity]    FUNCTION SUM," & _

"   MEASURE [Ext Price]   FUNCTION SUM," & _

"   MEASURE [Highest Price] FUNCTION MAX);"

'

' 4. Insert Data into Cube

'

Dim ConnInsertCube

ConnInsertCube="INSERTINTO=INSERT INTO OrderCube(" & _

"[Quantity], [Ext Price], [Highest Price], [CompanyName], " & _

"[Country].[Country], [Region], [ProductName], " & _

"[CompanyName1], [CategoryName], [OrderDate])  " & _

"OPTIONS ATTEMPT_ANALYSIS " & _

"SELECT OD.Quantity, " & _

"           OD.UnitPrice*OD.Quantity ExtPrice1," & _

"           OD.UnitPrice*OD.Quantity ExtPrice2," & _

"           CS.CompanyName, CS.Country," & _

"           CS.Region, PD.ProductName," & _

"           SP.CompanyName," & _

"           CT.CategoryName, OH.OrderDate" & _

"      FROM Northwind.OrderDetails OD " & _

"INNER JOIN Northwind.Orders       OH ON OH.OrderID=OD.OrderID " & _

"INNER JOIN Northwind.Products     PD ON PD.ProductID=OD.ProductID " & _

"INNER JOIN Northwind.Suppliers    SP ON SP.SupplierID=PD.SupplierID " & _

"INNER JOIN Northwind.Customers    CS ON CS.CustomerID=OH.CustomerID " & _

"INNER JOIN Northwind.Categories   CT ON CT.CategoryID=PD.CategoryID "

'

' Define ADO Connection Object

' Set the connection provider to MSOLAP

'

Dim Connection

Set Connection = CreateObject("ADODB.Connection")

Connection.Provider = "msolap"

'

' Combine 4 components of the MSOLAP connection string

' into one

'

Connection.ConnectionString = _

    ConnLocation & _

    ConnDSN & _

    ConnCreateCube & _

    ConnInsertCube

'

' Open the connection (this will create the cube and

' populate the cube from the remote data source.  The

' provider works to pre-aggregate the cube's measures by

' the various combinations of the cube's dimensions)

'

Connection.Open

'

' Close The Connection

'

Connection.Close

Set Connection=Nothing

'

' Signal user the script is done

'

MsgBox "Cube has been created!"

BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$

Book Reviews

Resource Center

  • SB Profound WC 5536 Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

  • SB Profound WP 5539More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

  • SB HelpSystems ROBOT Generic IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

  • Magic MarkTRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

  • SB HelpSystems ROBOT GenericForms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

  • SB HelpSystems ROBOT GenericIT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

  • SB HelpSystems ROBOT GenericCan you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

  • FORTRA Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

  • FORTRAManaging messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

  • FORTRAThe thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

  • FORTRAFor over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

  • LANSA Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

  • LANSAWhen it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

  • LANSASupply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

  • The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

  • Profound Logic Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.

  • SB Profound WC 5536Join us for this hour-long webcast that will explore:

  • Fortra IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn: