Accessing i Data from MS SQL Server

SQL

Typography

Smaller Small Medium Big Bigger
Default Helvetica Segoe Georgia Times
Reading Mode

It's one thing to access SQL Server data from the System i, but it's another thing to access System i data from SQL Server.

Heterogeneous data is one of the hallmarks of today's IT. Whether due to disparities in disk price or processing power, data in many shops has become fragmented across not only multiple physical machines, but multiple platforms and databases. One of the primary jobs of any IT department is allowing seamless access to those different databases.

If you're a programmer, you can usually find the tools you need to access data on any machine in your network. It's a bit harder to do that without programming--that is, by using the user-oriented data access tools that are prevalent on today's desktops. In fact, as you'll read here, sometimes its just plain impossible.

Getting at Data Is Easy for a Programmer

Whether it's a Type 4 Java Database Connectivity (JDBC) driver or an Object Linking and Embedding Database (OLE DB) provider, objects typically exist to allow a programmer on one platform to access databases on another. This means that, with just a bit of effort, a competent programmer can access data from multiple data sources in the network and provide meaningful information to the user.

Perhaps the simplest route is to use Java and JDBC. JDBC has progressed significantly since its original incarnation. When JDBC first appeared, it was primarily as a wrapper around Open Database Connectivity (ODBC). ODBC was Microsoft's API for allowing standardized access to all databases, and it involved a "native" piece written for the operating system on which the client ran. In the Microsoft world, that meant primarily Windows, so Windows drivers for various databases soon became available. This in turn meant JDBC access for those databases.

This, however, didn't translate to access from the i. While there was plenty of support for ODBC drivers to access DB2 data, it was hard to find someone interested in writing OS/400- or i5/OS-specific ODBC drivers. Thus, it was easy to access DB2 data from Windows yet hard to access SQL Server data from the System i. This changed with JDBC Type 4 drivers, in which the entire driver is written in Java, with no native driver required. With this type of driver, all you need is a JVM, and the i has a very good one. So, now it's relatively easy to access SQL Server data and DB2 data using a Java program.

However, by definition this involves someone writing a program; it doesn't provide the sort of ad hoc access that many users have come to expect. The ability for a user to be able to easily write an SQL statement and import the results of that statement into a desktop document has become a commonplace requirement. I'll resist the urge here to expound on why I don't like direct user access to rows and columns in the database, because that's outside the scope of this article. Instead, I'll assume that you understand the risks, and I'll instead focus on how to make this happen.

Non-Programmatic Access

What is needed is a way for users to write simple queries that access data on two databases simultaneously. Two ways exists to do this: either use a tool that knows how to join multiple databases, or convince the SQL engine on one platform to talk with data on another platform.

Tools for the former exist, although they're usually relatively expensive. For example, IBM has its DB2 Information Integrator. This is a powerful tool that allows you to create queries over widely disparate data sources (not just RDBMS), but it's not cheap. The last pricing I saw was in the realm of $20K per server and $10K per database. That was a long time ago, but it gives you an idea of the value placed on these tools.

Another option is to get a Windows-based package that understands the various databases, such as UniDirect .NET. I haven't used this product; I only bring it up because it really highlights the biggest issue when dealing with System i data: most people in the "outside" world, if they've heard of DB2, only know of it from the small-server world (that would be DB2 LUW, the Linux/UNIX/Windows version). Occasionally, they know the mainframe version (DB2 for z/OS). In any case, they tend to provide support for "DB2," but that doesn't always mean DB2 for the i.

DB2 Data and SQL Server Data Side by Side

The other option is to somehow convince one SQL engine to look at data from another data source. For example, to get SQL Server and DB2 to work together at the SQL engine level, you have two choices: get DB2 to talk to SQL Server or get SQL Server to talk to DB2.

For the first option, IBM uses Distributed Relational Database Architecture (DRDA). DRDA was originally an IBM architecture. It's been around for a decade or so and is currently under the auspices of The Open Group. However, while a number of RDBMS products will participate in a DRDA environment, Microsoft SQL Server is not one of them (surprise, surprise).

Linked Servers in SQL

The SQL Server approach is called a linked server. Michael Sansoterra wrote about this in some detail, and while the article is a couple of years old now, it still has great information. But the basic concept can be boiled down pretty quickly: if your database has an ODBC or OLE DB connector, then it can (probably) be used as a linked server. The "probably" indicates one of two cautions: first, not every feature may be supported by your driver, and second, even if something is supported, the performance may be so bad that you need to find a different approach after all.

Ignoring those caveats for the time being, let's take a quick look at the linked server approach. As I said, Michael's article goes into this subject in depth, and it really covers the topic well from a technical perspective, so all I really have to do here is hit the high points. First, you define your server using some special Transact-SQL (T-SQL) statements that identify the server type and location.

Once that's done, you can use the new server (and its schemas and tables) in your basic SQL statements using a four-part naming syntax that includes the server definition name and schema in the table name. This isn't a perfect solution. The operations you can perform are pretty limited, especially by today's standards. Basically, you're stuck with what I consider to be first-generation SQL statements: SELECT, UPDATE, INSERT. You can also execute basic database management functions like DROP TABLE and SELECT INTO. The latter is a powerful way to move data from SQL Server to another database (or vice versa).

However, linked servers have no support for parameter markers, and stored procedure support is nonexistent for remote databases other than SQL Server. To use those advanced SQL features, you need to use either the EXEC AT feature or Data Transformation Services (DTS), which actually has been replaced by SQL Server Integration Services (SSIS).

DTS Is Itself Transformed, to SSIS

If your data needs include access to stored procedures, Data Transformation Services (DTS) was the route that had been available since version 7 of SQL Server. DTS basically allowed you to create sort of an SQL batch job that was designed to move data from one database to another. Because it was the only way to do certain things, DTS was often used for jobs other than data transformation, but the idea remained the same: you created a set of SQL statements and then invoked them as needed. Note that these statements depend on the capabilities of the driver used, but they usually provide enough capability to make DTS the tool of choice for transformations between different databases.

However, with the introduction of SQL Server 2005, DTA was replaced by SQL Server Integration Services (SSIS), which was available only in the enterprise edition of the product. With the release of SQL Server 2008, SSIS has been broken up into a variety of components with differing availabilities. So, if you plan to use SSIS, you need to take a careful look at your server edition to see if the required features are available.

It's not surprising that Microsoft doesn't include SSIS in all editions of its database (it seems that Redmond is taking a page from IBM and unbundling technology with new releases); the functionality is pretty extensive, including support for things like XML and allowing complex transformations and scheduled data conversions. It really is now much more of a data mining and warehousing utility, even a rudimentary form of business intelligence. However, that unbundling does mean that you have to budget for which components you will need in your environment.

EXEC AT

As of SQL Server 2005, the other option is to use the EXEC AT capability of T-SQL. This is sort of like RCMD in FTP; it passes a statement directly through to the host database and executes it there. In its base form, the EXEC (or EXECUTE) operation in T-SQL lets you execute an arbitrary character string. Similar to prepared statements in embedded SQL languages, this is the T-SQL method of executing dynamic SQL. You can build an entire SQL statement character by character and then execute it.

One of the lesser-known capabilities of this technique is to take that dynamic statement and transmit it for execution at a remote database that has been previously defined as a linked server. The data is literally sent to the other machine and executed there. This very powerful capability has some caveats, however. First, the syntax for the call has to be in the syntax of the remote database. While this makes sense, it can be somewhat confusing to see non-Microsoft syntax in a T-SQL program. Also, you have to take into account character-coding issues; character parameters are subject to translation.

Performance

So it's not impossible to manipulate databases remotely and to copy data from one database to another using simple SQL syntax. The linked server approach in SQL Server makes that syntax relatively easy, and the enhancements to the EXEC AT operation allow even more complex vendor-specific instructions to be passed through to the remote database.

The last issue is performing operations that actually join two databases and, specifically, the performance issue that can occur. One of the primary reasons that SQL is becoming so prevalent is the fact that the query optimization capabilities of the SQL engines make them very fast. Because an SQL engine has a deep understanding of its database, including statistics on the data itself, the engine has the ability to make decisions that improve performance.

Such understanding is lost when two databases are joined, especially two completely different databases such as SQL Server and DB2. The performance statistics gathered by the DB2 database are not available to SQL Server, so it must take the most pessimistic options for every decision. Add to that the pure overhead of having to go across the wire just to get a record, and it's clear that queries can quickly become very complex. As important as database and query design are to standard single-database applications, they are that much more important in multi-database environments.

Another hurdle to the multiple-database environment is the fact that it is often very hard to find appropriate test data. It's difficult enough to get a realistic snapshot of data on one machine, much less get snapshots on two machines, especially if the data needs to be somehow related. With cross-database queries, you need to test using data whose keys are going to coincide in order to properly exercise your join criteria. You have to understand both databases in order to extract the appropriate test information as well as have a place to put it. These issues are magnified in smaller shops with limited resources; it's hard enough finding extra disk space for Windows machines, much less extra room on an i.

Because of the difficulty of setting up proper test data, this is the type of situation where programmers may be less stringent in their testing procedures. However, testing is more rather than less important in this situation, especially load testing. More issues factor into the equation. In addition to raw data size, you have to take into account system load on both of the machines as well as network congestion. Before putting any query into production, it is essential to try it out on production quantities of data when network traffic will be the same as the time that the statements will be executed.

Multi-database queries are not to be taken lightly. For online transactions, they can add a measurable delay to response time. For batch operations, they can add an unwelcome level of overhead. Depending on the business requirement, you may actually be better off with an alternate approach, such as staging data from one machine to another during off-peak hours.

Final Note and Denouement

I noted above that both the EXEC AT and the DTS approaches require that you use DB2 syntax for your SQL code rather than SQL Server's T-SQL syntax. This allows me a quick reference to the fact that SQL dialects differ from one vendor to another; the differences can be subtle and a little tricky. Syntactical differences in stored-procedure syntax can be even trickier. So moving back and forth is not always easy.

Back in my early days of SQL, I used a tool called SwisSQL to convert between various flavors of SQL query syntax. The tool is one of the best I've ever used, although the pricing has risen considerably over the years. I think the basic query converter is $395, while the stored procedure converter is $995. For awhile, AdventNet had free versions of the tools available, but now I think they offer a 30-day, limited-use trial.

Still, you might get a kick out of looking at what happens. What's important is that you realize that while it's probably reasonable to think about allowing users to create ad hoc queries that join disparate databases, as soon as you start talking about more advanced features such as stored procedures, or even trying to do large-scale inserts or updates, you may be better off attacking the problem programmatically.

Joe Pluta is the founder and chief architect of Pluta Brothers Design, Inc. He has been extending the IBM midrange since the days of the IBM System/3. Joe uses WebSphere extensively, especially as the base for PSC/400, the only product that can move your legacy systems to the Web using simple green-screen commands. He has written several books, including Developing Web 2.0 Applications with EGL for IBM i, E-Deployment: The Fastest Path to the Web, Eclipse: Step by Step, and WDSC: Step by Step. Joe performs onsite mentoring and speaks at user groups around the country. You can reach him at This email address is being protected from spambots. You need JavaScript enabled to view it..

MC Press books written by Joe Pluta available now on the MC Press Bookstore.


		Developing Web 2.0 Applications with EGL for IBM i Joe Pluta introduces you to EGL Rich UI and IBM’s Rational Developer for the IBM i platform. List Price $39.95 Now On Sale

		WDSC: Step by Step Discover incredibly powerful WDSC with this easy-to-understand yet thorough introduction. List Price $74.95 Now On Sale

		Eclipse: Step by Step Quickly get up to speed and productivity using Eclipse. List Price $59.00 Now On Sale

BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Book Reviews

Book Review: Extract, Transform, and Load with SSIS

Do your business apps access different data sources? This book shows you how to make that task easier
Book Review: 21st Century RPG: /Free, ILE, and MVC

David Shirey’s first book is an educational and entertaining read for “modern” and “old” RPG programmers alike
Book Review: Developing Business Applications for the Web--With HTML, CSS, JSP, PHP, ASP.NET, and JavaScript

If you are ready to get into Web application development, take this book along as your guide
Book Review: DB2 10.5 Fundamentals for LUW: Certification Study Guide (Exam 615)

DBAs who use the book will find it very helpful first in their test study and later as a reference book.
Book Review: DB2 11 for z/OS Database Administration—Certification Study Guide

This is a well-written DB2 11 book that could easily stand on its own as a reference manual, not just a certification guide.
Book Review: Free-Format RPG IV, Third Edition

Jim Martin comes through for us again.
Book Review: IBM i Security Administration and Compliance, Second Edition
Book Review: Programming in ILE RPG, Fifth Edition

This book really hits the mark and is a must-read for all RPG developers.
Book Review: DB2 10.1/10.5 for Linux, UNIX, and Windows Database Administration: Certification Guide
Book Review: Subfiles in Free-Format RPG

Whether you're a newbie or a seasoned pro, this book has something for you.
Book Review: Evolve Your RPG Coding: Move from OPM to ILE ... and Beyond

This book provides an amazingly comprehensive introduction to the concepts while at the same time delivering enough technical detail to make you productive very quickly.
Book Review: Database Design and SQL for DB2
Book Review: The Chief Data Officer Handbook for Data Governance

When implemented appropriately, data governance is a powerful framework.
Book Review: DB2 10 for z/OS: The Smarter, Faster Way to Upgrade

Trying to figure out whether to upgrade? Read on.
Book Review: 5 Keys to Business Analytics Program Success
Book Review: DB2 11: The Ultimate Database for Cloud, Analytics, and Mobile
Book Review: Flexible Input, Dazzling Output with IBM i

Today, it's all about input and output. Getting data into the IBM i from non-traditional sources and then displaying it back out again in varied formats. But where can you go to learn all that you need to know about this critical skill?
Book Review: Advanced Guide to PHP on IBM i

Enterprise-level PHP skills and techniques have been adapted for IBM i developers in Kevin Schroeder's new book.
Book Review: Java for RPG Programmers

If you've been putting off learning Java, you have no excuse anymore!
Book Review: DB2 10.1 Fundamentals: Certification Study Guide

Too valuable to be classified as merely excellent certification material, this book should also rightly take its place on DB2 DBA bookshelves as a solid day-to-day DB2 reference.
Book Review: DB2 10 for Z/OS Database Administration: Certification Study Guide

Whether you're trying to get certified or you just need a great reference book, this is the book for you.
Book Review: Developing Web 2.0 Applications with EGL for IBM i

It's everything you need to know, from the bottom up.
Book Review: Advanced Integrated RPG

Isn't it about time somebody told us how to integrate RPG and Java?
Book Review: Managing Without Walls

If you manage remote or satellite teams, this book is a must-read!
Book Review: Managing Without Walls

If you manage remote or satellite teams, this book is a must-read!
Book Review: The Remote System Explorer

This book speaks directly to the thousands of IBM i programmers who develop in RPG, COBOL, CL, and DDS every day.
Book Review: IBM System i APIs at Work, Second Edition

API expert Bruce Vining delivers the only comprehensive guide to APIs.
Book Review: Functions in Free-Format RPG IV

This one short volume manages to essentially be both a general introduction and a detailed reference.
Book Review: DB2 11: The Database for Big Data and Analytics
Book Review: IBM Mainframe Security: Beyond the Basics

Beginners will have a strong foundation after reading this book. Experienced professionals will reference it frequently.
Book Review: IBM InfoSphere: A Platform for Big Data Governance and Process Data Governance

Find out how IBM is addressing the challenges of big data.
Book Review: Fundamentals of Technology Project Management

Projects can be overwhelming, but taken in small, deliberate steps, all projects are achievable.
Book Review: Customer Experience Analytics

Use CEA as a strategic weapon to stay ahead of your competitors.
Book Review: Big Data Analytics: Disruptive Technologies for Changing the Game

The disciplines of data analytics are evolving to meet the new challenges of big data.
Book Review: IBM i Security: Administration and Compliance

If you have any interest in IBM i security, whether as an administrator, a programmer, or an auditor, then this book is the perfect resource.
Book Review: DB2 9.7 for Linux, UNIX, and Windows Database Administration (Exam 541)

This book, written by the creator of the certification exam, reveals exactly what you'll need to know to prep for the test.
Book Review: Selling Information Governance to the Business

Who governs the information that runs your company?
Book Review: You Want to Do WHAT with PHP?

If you're serious about programming in PHP, get a book that treats you that way.
Book Review: The IBM i Programmer's Guide to PHP

Both a primer and a reference, this book is a must-have for anyone who wants to program in PHP.
Book Review: JavaScript for the Business Developer

There's no faster, easier way to become proficient in JavaScript.
Book Review: SOA for the Business Developer

If you want to know how SOA works in the real world, this is your book.
Book Review: DB2 9 Fundamentals

Whether you want to obtain an IBM certified DB2 professional certification or simply become well-rounded in the fundamental concepts of DB2 and general database theory, this is your book.
Book Review: The Modern RPG IV Language, Fourth Edition

This book isn't a training manual; it's a reference book.

Resource Center

How to Modernize Fast and Within Budget (Quick Guide)
Why Migrate When You Can Modernize?

Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.
Resource Center

The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit
IBM i Transformation Risks Every Business Leader Should Know

Join us for this hour-long webcast that will explore:
What to Do When Your AS/400 Talent Retires

IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn:

Analytics & Cognitive Categories

Latest Analytics & Cognitive News

Career Catgories

Latest Career News

Cloud Categories

Latest Cloud News

IT Infrastructure Categories

Latest IT Infrastructure News

News Categories

Latest News

Programming Categories

Latest Programming News

Security Categories

Latest Security News

Typography

Share This

Getting at Data Is Easy for a Programmer

Non-Programmatic Access

DB2 Data and SQL Server Data Side by Side

Linked Servers in SQL

DTS Is Itself Transformed, to SSIS

EXEC AT

Performance

Final Note and Denouement

LATEST COMMENTS

MC Press Online

Support MC Press Online

Book Reviews

Book Review: Extract, Transform, and Load with SSIS

Book Review: 21st Century RPG: /Free, ILE, and MVC

Book Review: Developing Business Applications for the Web--With HTML, CSS, JSP, PHP, ASP.NET, and JavaScript

Book Review: DB2 10.5 Fundamentals for LUW: Certification Study Guide (Exam 615)

Book Review: DB2 11 for z/OS Database Administration—Certification Study Guide

Book Review: Free-Format RPG IV, Third Edition

Book Review: IBM i Security Administration and Compliance, Second Edition

Book Review: Programming in ILE RPG, Fifth Edition

Book Review: DB2 10.1/10.5 for Linux, UNIX, and Windows Database Administration: Certification Guide

Book Review: Subfiles in Free-Format RPG

Book Review: Evolve Your RPG Coding: Move from OPM to ILE ... and Beyond

Book Review: Database Design and SQL for DB2

Book Review: The Chief Data Officer Handbook for Data Governance

Book Review: DB2 10 for z/OS: The Smarter, Faster Way to Upgrade

Book Review: 5 Keys to Business Analytics Program Success

Book Review: DB2 11: The Ultimate Database for Cloud, Analytics, and Mobile

Book Review: Flexible Input, Dazzling Output with IBM i

Book Review: Advanced Guide to PHP on IBM i

Book Review: Java for RPG Programmers