Point-and-click your way to building a repeatable, dynamic Excel spreadsheet from DB2.
Thanks for sticking it out through my preliminary Microsoft articles. It's about to pay off. Creating Excel spreadsheets from DB2 seems to be a common theme these days, with multiple ways of going about it. The topic of this article is using SQL Server Integration Services (SSIS) with DB2, but it's only the tip of the iceberg for articles to come in the interests of business intelligence, data warehousing, and website feeds.
Using SSIS, you simply point-and-click to set up your flow and configurations without writing any code except the SQL that gets the data. Using this technique, you could easily have your Microsoft gurus accessing your DB2 data with minimal programming experience or in-depth DB2 knowledge.
In previous articles, I covered introductory information on SSIS and how to use it to create a basic package for use with a Microsoft SQL Server database:
- Simplify Common Database Operations, Including ETL, with a Few Point-and-Clicks
- Getting Started with Microsoft Visual Studio Express
This article will tie all of the previous Microsoft topics together to create a file from a DB2 physical file. This process, which can be run daily as a scheduled task, will be a simple query to retrieve all active users.
Simple Specifications for Our First DB2 File Export
We'll export a user file daily, including only the active members. The file has the simple DDS shown below; active members have the JRSTATUS field set to either "A" or "Y."
* FILE: JR_USER - SIMPLE MEMBER USER FILE
A R JRUSER
A JRUSERKEY 6S 0
A JRFNAME 32A
A JRLNAME 32A
A JRMI 1A
A JRSTATUS 1A
We'll seed the initial data to look as follows.
Note that I have attached the DDL and DML source code for download for easy setup of this example.
Figure 1: Initial Data from DML
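The "active members" rule from the specification can be expressed as a single SQL predicate. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for DB2 so that it runs anywhere, and the sample rows are made up rather than taken from the downloadable DML.

```python
import sqlite3

# Query for active members: JRSTATUS is either "A" or "Y".
ACTIVE_USERS_SQL = """
    SELECT JRUSERKEY, JRFNAME, JRLNAME
      FROM JR_USER
     WHERE JRSTATUS IN ('A', 'Y')
     ORDER BY JRUSERKEY
"""


def load_active_users():
    """Create a throwaway JR_USER table and return only the active rows."""
    # In-memory SQLite stands in for DB2 here (an assumption for the demo).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE JR_USER ("
        " JRUSERKEY NUMERIC(6, 0), JRFNAME CHAR(32), JRLNAME CHAR(32),"
        " JRMI CHAR(1), JRSTATUS CHAR(1))"
    )
    # Hypothetical sample rows, not the article's seed data.
    conn.executemany(
        "INSERT INTO JR_USER VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Ann", "Able", "A", "A"),
            (2, "Bob", "Baker", "B", "I"),
            (3, "Cy", "Clark", "C", "Y"),
        ],
    )
    rows = conn.execute(ACTIVE_USERS_SQL).fetchall()
    conn.close()
    return rows


active = load_active_users()
print(active)
```

Only the rows with a status of "A" or "Y" come back; the inactive row is filtered out by the WHERE clause.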
Connecting SSIS to DB2 with ODBC
In order to access your data, we'll be setting up an ODBC connection to your DB2 database. That requires an installation unless you already have it. ODBC is an optional component to install with iSeries Access.
To install ODBC, start up your installation media for iSeries Access (whether it be CD, DVD, media share, etc.) and perform a custom installation to select the ODBC component.
Figure 2: ODBC Component Selection
Due to my limited access to IBM products these days, please forgive the screenshots as I took them from a V5R4 installation that I have.
Once ODBC is installed, you can set it up using ODBC Data Sources under Windows Administrative Tools.
Figure 3: Viewing ODBC Data Sources
If you're on a computer that's pre-Windows 8, you can find this on your Control Panel. I'm using the 32-bit ODBC driver, so I selected the 32-bit version. The default screen will go to the User DSN tab. Click on the Add button to set up your new connection.
Figure 4: Adding a New ODBC Data Source
Select Client Access ODBC. (Perhaps yours will say iSeries Access ODBC?)
Figure 5: Selecting the Client Access ODBC Driver
Once you have selected your ODBC driver, you'll be able to configure it to your system. Select a Data source name that you'll be able to identify when referencing it and select your system.
Figure 6: ODBC Setup for Your Server
Note: You may want to go into the Server and Packages tabs to set your default library.
Now that you have your ODBC connection, we can begin building our SSIS package. We'll be reusing the code from our previous SSIS article "Simplify Common Database Operations, Including ETL, with a Few Point-and-Clicks" to update the database connection and verify the output.
Visual Studio 2013 Note: I have updated from Visual Studio 2012 to Visual Studio 2013 since my last article. The process will be the same as the previous article, but I wanted to provide the location of the SSDT package for 2013 if you're interested, which is here: Microsoft SQL Server Data Tools - Business Intelligence for Visual Studio 2013
Building the SSIS
To begin creating our SSIS:
- Open Visual Studio.
- Select File > New Project…
- Select Templates > Business Intelligence > Integration Services Project.
- Enter the name for your project and select the directory location to be created.
Figure 7: Integration Services for SSIS Project
Connection Managers
With the new project created, we now need to add the source and destination of our project. We'll be pulling data from a Database source and writing the data to an Excel destination. These are specified in the SSIS package using connection managers.
Source Connection Manager—Database
To create our database source connection manager, go to the bottom panel labeled Connection Managers, right-click, and select New Connection…
Figure 8: New Connection Manager
When you select the New Connection… option, the following screen will allow you to choose ODBC.
Figure 9: Using ODBC with Connection Manager
When you select ODBC, you'll see your data source listed in the dropdown box for selection. Here, you will also provide the user name and password.
Figure 10: Login Information for ODBC Connection
When you're specifying your database details, it's a good idea to click on the Test Connection button. This confirms that your information is entered correctly and that your authentication method works, so you won't end up troubleshooting your SSIS package when the real problem is the connection.
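If you ever want to check the same DSN outside of Visual Studio, the idea behind Test Connection can be sketched in a few lines. This is a hedged illustration, not part of the SSIS package: the DSN name, user, and password are placeholders, and the connection check assumes the third-party pyodbc package is installed.

```python
def build_odbc_connection_string(dsn, user, password):
    """Build a DSN-based ODBC connection string (DSN, UID, PWD keywords)."""
    # Values containing semicolons would need brace quoting; not handled here.
    parts = {"DSN": dsn, "UID": user, "PWD": password}
    return ";".join(f"{key}={value}" for key, value in parts.items())


def can_connect(connection_string):
    """Return True if a connection opens, similar in spirit to Test Connection."""
    # pyodbc is a third-party package, imported lazily so that the string
    # builder above works even where pyodbc is not installed.
    import pyodbc

    try:
        conn = pyodbc.connect(connection_string, timeout=5)
    except pyodbc.Error:
        return False
    conn.close()
    return True


# Placeholder values: substitute the data source name you chose in Figure 6.
conn_str = build_odbc_connection_string("MY_DB2_DSN", "MYUSER", "secret")
print(conn_str)
```

Calling can_connect(conn_str) on a machine with the DSN configured tells you whether a failure is in the connection or in the package itself.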
Destination Connection Manager—Excel Spreadsheet
To create our Excel destination connection manager, right-click in the Connection Managers panel again and select New Connection… again. Then select EXCEL for the connection type. Now you can enter the location of the output Excel spreadsheet.
Figure 11: Excel Connection Manager for Output File
For our example, we'll just create a file in the C:\Temp folder and use Microsoft Excel 97-2003, leaving the "First row has column names" checkbox checked. Then click on the OK button. Note that you need to name your file with the correct extension or Excel will complain later on that the format doesn't match the file name.
Building the Excel Spreadsheet
In your main window, there are several tabs over the top. We'll be working only with the Control Flow and the Data Flow tabs in this article.
First, we need to specify how the spreadsheet will look. To do this, we'll use Execute SQL Task from the SSIS toolbox on the far left. Click and drag the Execute SQL Task into the work area.
Figure 12: Execute SQL Task
It's not intuitive that you would use Execute SQL Task to build the Excel spreadsheet. The way I look at it is that we're building some output, which could be a database table, an Excel spreadsheet, etc. To create the worksheet in the spreadsheet, we'll create a table, and that's exactly what the syntax looks like when we build our task as follows:
Figure 13: SQL Task to Create Excel Sheet
The key settings that you need to make are outlined in green above and shown in the table below.
Property | Value
ConnectionType | EXCEL
Connection | Excel Connection Manager
SQLSourceType | Direct Input
SQLStatement | CREATE TABLE…
For the SQLStatement, click on the ellipsis (…) button to get the Enter SQL Query window as shown in Figure 13. The query we are using will be:
CREATE TABLE MCPress (
MC_KEY LongText,
F_NAME LongText,
L_NAME LongText
)
I deliberately left the middle initial out of the table to illustrate that you could change the list of fields that are selected for your output.
We're keeping it simple for this example by putting the SQL directly into the SSIS package to keep this article at a reasonable size. We could have also called a stored procedure to separate the SQL from the SSIS package, making it easier to split the project across multiple developers who may have different skill sets. Using stored procedures would also separate the maintenance of the code by enabling you to update the stored procedure to change the results of the SSIS without having to change the SSIS itself.
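Because the worksheet layout is nothing more than a CREATE TABLE statement, changing the list of output fields means changing the column list. The following sketch generates the statement text from a list of column names; the helper function is hypothetical and exists only to show the shape of the SQL, which you would still paste into the Execute SQL Task yourself.

```python
def build_create_table(sheet_name, columns, column_type="LongText"):
    """Return a CREATE TABLE statement that defines one worksheet."""
    column_lines = ",\n".join(f"  {name} {column_type}" for name in columns)
    return f"CREATE TABLE {sheet_name} (\n{column_lines}\n)"


# Same sheet and columns as the article's statement (middle initial omitted).
statement = build_create_table("MCPress", ["MC_KEY", "F_NAME", "L_NAME"])
print(statement)
```

Adding a fourth name to the list, such as a column for the middle initial, would add a fourth header to the generated worksheet definition.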
Data Flow Task
Next, we'll create a Data Flow Task by dragging Data Flow Task from the left onto your main work area, as shown:
Figure 14: Creating a Data Flow Task
After the Data Flow Task is added, click on the Execute SQL Task and you should see a green arrow on the bottom. Click the arrow and drag to the Data Flow Task to connect them. This will allow the Data Flow Task to execute on successful completion of the Execute SQL Task.
ODBC Source
Double-click on the newly added Data Flow Task and you'll be taken to the Data Flow tab, which will have a new blank work area. Under Other Sources, click and drag the ODBC Source from the toolbox onto the Data Flow work area that is now available.
Double-click on ODBC Source to set the properties.
Figure 15: Selecting Connection Manager for ODBC Source
We'll be selecting all the records from the jr_user table. To do this, simply:
- Select the ODBC connection manager we created earlier that points to our database.
- Select the Table Name option for Data Access Mode.
- The dropdown box will list our JR_USER table to be selected. Click on it.
- Click on OK.
Data Conversion
Because we're writing to text fields in an Excel spreadsheet, we'll need to convert the text data from the database into Unicode characters; otherwise, you'll get an error when you try to run it. To do this, we'll pass all of the data through a Data Conversion by clicking and dragging Data Conversion from the toolbox onto the work area.
Then click on the ODBC Source that we created earlier to display the arrows and drag the blue arrow to the new Data Conversion you just created. This tells the Data Conversion which data needs to be converted.
Double-click on the Data Conversion and you should see the columns from your jr_user table available for conversion.
Figure 16: Data Conversion Transformation
Output Aliases are automatically created with a "Copy of" prefix, as shown above in the blue box. You can rename these or leave them as is, which is what we'll do here.
The default Data Type for the text fields is String [DT_STR]. We'll change all text fields to Unicode string [DT_WSTR]. Once the Data Types are all changed to Unicode string, click on the OK button.
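Conceptually, the Data Conversion step takes code-page byte strings (DT_STR) and publishes Unicode versions (DT_WSTR) under new "Copy of" names, leaving the originals in place. The sketch below mimics that behavior in plain Python. The code page (cp1252) and the sample values are assumptions for illustration; the real code page depends on your driver and system settings.

```python
def add_unicode_copies(row, code_page="cp1252"):
    """Return the row plus a "Copy of <name>" Unicode value per byte string."""
    converted = dict(row)  # the original columns stay available
    for name, value in row.items():
        if isinstance(value, bytes):
            # Decoding a code-page string yields Unicode text,
            # which is the DT_STR to DT_WSTR idea.
            converted[f"Copy of {name}"] = value.decode(code_page)
    return converted


# Hypothetical source row with the three text fields as raw bytes.
source_row = {"JRFNAME": b"Ren\xe9e", "JRLNAME": b"Able", "JRMI": b"Q"}
result = add_unicode_copies(source_row)
print(result["Copy of JRFNAME"])
```

The byte 0xE9 only becomes a proper accented character once a code page is applied, which is why the destination insists on Unicode input.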
Excel Destination
Finally, we plug the Excel Destination into our SSIS package circuitry. Click and drag Excel Destination from the toolbox onto the work area. Click on Data Conversion to display the arrows, and drag the blue arrow down to Excel Destination to connect them. Your Data Flow should now look something like this. Orientation doesn't matter, but the cleaner the better.
Figure 17: Excel Destination
Double-click on Excel Destination to set the properties.
Figure 18: Excel Destination Connection Manager
Select:
- Your Excel Connection Manager
- Data access mode: Table or view
- Name of Excel sheet: MCPress$
For the name of the Excel sheet, you have two options:
- MCPress is the name of a range.
- MCPress$ is the worksheet name.
The available Excel sheet names are populated from the table that you created earlier with Execute SQL Task. For our example, we'll use the one with the dollar sign ($).
If there are no tables listed, you can execute the individual Execute SQL Task by going back to the Control Flow tab. Then right-click on the Execute SQL Task and click on Execute. If that task is set up correctly, you should see a green checkmark on that task.
Figure 19: Execute Only the SQL Task to Make MCPress$ Visible for Selection
After executing the task, go back to the Excel Destination on the Data Flow tab. You should now see your table in the dropdown list.
Mappings
Upon completing the selection of the table, while still within the Excel Destination, click on Mappings on the far left.
Figure 20: Map Data to Output
In the Mappings window, you will map the "Copy of" fields that are the converted values in Unicode over to the fields that you defined in your Excel spreadsheet. This is done by clicking on one side and dragging to the other side. Just like those matching tests back in the day. Then click OK to finish.
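The mapping you draw by dragging is, in effect, a lookup from converted input names to worksheet column names. As a rough sketch, assuming the pairing implied by the column names (key to MC_KEY, first name to F_NAME, last name to L_NAME), it looks like this:

```python
# Input column (from Data Conversion) -> worksheet column (from CREATE TABLE).
# The pairing is inferred from the names; adjust it to match your own mapping.
COLUMN_MAP = {
    "Copy of JRUSERKEY": "MC_KEY",
    "Copy of JRFNAME": "F_NAME",
    "Copy of JRLNAME": "L_NAME",
}


def map_row(converted_row, column_map=COLUMN_MAP):
    """Keep only the mapped inputs and rename them to the worksheet columns."""
    return {dest: converted_row[src] for src, dest in column_map.items()}


# Hypothetical converted row; the middle initial is present but unmapped.
row = {
    "Copy of JRUSERKEY": "1",
    "Copy of JRFNAME": "Ann",
    "Copy of JRLNAME": "Able",
    "Copy of JRMI": "A",
}
mapped = map_row(row)
print(mapped)
```

Any input column without a mapping, like the middle initial here, is simply not written to the spreadsheet.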
Debugging the SSIS Package
Now that all of your "development" is complete, you need to save your work. You should actually save regularly during your development. To run your SSIS package, select Debug > Start Debugging from the menu bar.
You will likely encounter your first error. Stop debugging and review your output at the bottom. You may see this: "Table 'MCPress' already exists."
To fix this error, go to the location you specified for your Excel destination and delete the file. To permanently fix the problem of the table already existing on repeat runs, you could drop the table or name the output file to contain the datetime, which could be a follow-up article if the interest is there.
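Both workarounds come down to simple file handling. The sketch below shows the logic only, deleting a leftover output file or building a datetime-stamped file name; how you would hook either one into the package is not covered here, and the file name is a hypothetical example.

```python
import os
import tempfile
from datetime import datetime


def timestamped_name(base_path, now=None):
    """Insert a datetime stamp before the file extension."""
    now = now or datetime.now()
    root, ext = os.path.splitext(base_path)
    return f"{root}_{now:%Y%m%d_%H%M%S}{ext}"


def remove_stale_output(path):
    """Delete an earlier output file; return True if one was removed."""
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


# Demo in a temporary folder rather than the article's C:/Temp location.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "ActiveUsers.xls")
open(demo_file, "w").close()  # simulate the file left by a previous run

removed = remove_stale_output(demo_file)
stamped = timestamped_name("ActiveUsers.xls", datetime(2024, 1, 31, 6, 30, 0))
print(removed, stamped)
```

Deleting the file keeps a single, always-current spreadsheet, while the stamped name keeps one file per run; which is preferable depends on who consumes the output.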
After deleting the file, run again. You should see green checkmarks on all of your tasks. If you open your output file, you should now see your data with headers.
Figure 21: Diagram Illustrating Characteristics of Excel Destination Output
Things to notice are:
- The spreadsheet name is what you specified for the Excel Destination.
- The headers match what you specified in your CREATE TABLE statement.
- The sheet name equals the name of the table that you created.
Behold the Value
I hope you can see the potential here, and that it sparks other ideas! Of course, SSIS can access Microsoft SQL Server, but now you see that it can access DB2 too! That could open the mental door to how many other databases you could use this tool with. Or maybe you're thinking of how many other things SSIS can do that you could use with DB2.
Chances are that you have a Microsoft-focused team in your company that collaborates with your IBM-centric team. The Microsoft SSIS tools are freely available with Microsoft SQL Server and easily integrate with your IBM database. Building SSIS packages is more on the level of a power user than a developer, so either your IBM developers or your Microsoft team could do it. You give even more value to your DB2 data by exposing it to Microsoft land, which is a good segue to my next articles.