Avoid headaches and wasted time by learning the major differences between DB2 for i and T-SQL dialects.
It's apparent by emails I've received that some SQL Server developers are having an opportunity to delve into the DB2 for i realm. Even though SQL would ideally be a platform-neutral, standards-based language, the truth is there are many SQL syntax and feature differences among the major database providers.
This article is intended as a "cheat sheet" for SQL Server T-SQL developers who are doing development in DB2 for i. DB2 for i SQL developers who have the chance to use T-SQL can benefit also.
Most of these tips are aimed at common coding tasks—not security, schema structures, or any of the other innumerable differences that can arise. Finally, the tips are only supposed to be informational to save you some time finding equivalents between the two platforms, and they are presented in no special order.
Catalog Views
System catalogs (metadata catalogs containing names of all views, procedures, columns, etc.) can be indispensable for a database developer, especially in an environment where changes are done by SQL scripts (or procedures). The names of the DB2 for i catalog views can be found in Appendix F of the SQL Reference manual.
Character Collations vs. CCSIDs
In SQL Server, character data is assigned a collation. The collation controls many things, including code page, case sensitivity, accent sensitivity, and sort order. The default collation name for a U.S.-based SQL Server install is SQL_Latin1_General_CP1_CI_AS. The CI in this name stands for "case-insensitive" and the AS stands for "accent-sensitive." So this collation uses a Latin character set with case-insensitive (but accent-sensitive) sorts and comparisons.
In contrast, DB2 for i tags character data with a coded character set identifier (CCSID) and relies on a session setting (sort sequence) to determine how the sorting and comparing of character data will be done. The default CCSID for a U.S.-based IBM i (aka AS/400) install is EBCIDC 37, and the default sort sequence is hex (i.e., binary sorting and comparing). So DB2 for i databases are most often case-sensitive; therefore (assuming the defaults), sorts and comparisons will behave differently between the two database engines.
CLR Routines
Using their favorite .NET language, SQL Server developers can create .NET versions of the following types of database objects: procedures, triggers, scalar functions, table functions, user-defined aggregates, and user-defined data types. This feature allows developers to write database routines that can talk with Web services, download an FTP file, query an LDAP server, connect to a socket, read and parse a Web page, etc.
DB2 for i developers can also create "external" functions, procedures, and triggers (but not aggregates and data types) using languages such as COBOL, C, RPG, or Java. See, for example, the CREATE PROCEDURE (external) statement.
Clustered Index Views
These are not supported in DB2 for i; however, a materialized query table (see the materialized-query-definition section of CREATE TABLE) shares many similarities. The main difference is that the materialized query table's content must be refreshed periodically, whereas clustered index views are always up to date based on the underlying data.
Column Aliases
In a DB2 for i query, column aliases must come after the column names. SQL Server allows this, but many developers use this alternative method to alias a column:
SELECT EXT_PRICE = SUM(QTY * PRICE) -- Now allowed in DB2 for i
FROM MY_ORDERS
The correct way is as follows:
SELECT SUM(QTY * PRICE) AS EXT_PRICE -- Allowed in DB2/SQL Server
FROM MY_ORDERS
Computed Columns on Tables
These are not supported in DB2 for i.
CROSS APPLY and OUTER APPLY
CROSS APPLY (similar to a row-based INNER JOIN) and OUTER APPLY (similar to a row-based LEFT JOIN) are not supported in DB2 for i, although sometimes the same thing can be accomplished. For example, a table function that is participating in a join in DB2 can have the input parameters vary (say, from a column in another table) from row to row, whereas SQL Server table functions can't be used in this manner without using one of the APPLY operators. So in the case of table functions, the APPLY operators aren't even required in DB2.
Additionally, LATERAL correlation can be used to mimic the behavior of CROSS APPLY. However, I'm not aware of any way to use LATERAL to mimic OUTER APPLY.
Cursor Parameters Are Evaluated Differently
At what point does a cursor assimilate its parameter values? This subtle difference once got me good on a T-SQL project! In DB2 for i, any parameters in the cursor are evaluated when the cursor is opened. However, in T-SQL, the parameter values are assimilated at the point the cursor is declared. In the following T-SQL code snippet, the cursor will return region codes that start with 'S' because that's the value of the parameter at cursor declaration.
DECLARE @REGION VARCHAR(10)
DECLARE @REGION_NAME VARCHAR(30)
SET @REGION='S%'
DECLARE DEMO CURSOR FOR
SELECT REGION_NAME FROM dbo.REGIONS
WHERE REG_CODE LIKE @REGION
SET @REGION='M%'
OPEN DEMO
If this were a DB2 for i cursor definition, the selected regions would begin with M, the value of the @REGION variable at the time the cursor is opened.
Data Type Differences
Common T-SQL data types such as BIT, TINYINT, SMALLMONEY, and MONEY are not supported in DB2 for i. Also, while NUMERIC and DECIMAL are supported in DB2 for i, they have different internal storage mechanisms (zoned decimal vs. packed decimal), which translates to different storage requirements.
Default Sorting of NULLs and Numbers Is Incompatible
In SQL Server, NULLs are placed at the top of the sort order before any NON-NULL values.
SELECT MY_COL
FROM (VALUES(null),('A'),('B'),('C')) DATA(MY_COL)
ORDERBY MY_COL
In contrast, NULLs are at the bottom of the barrel in the DB2 for i sort order.
A related issue is that many legacy DB2 for i tables are EBCDIC-based. In EBCDIC, numbers are placed at the bottom of the sort order, whereas systems with an ASCII heritage place numbers before letters in the sort order.
Dynamic SQL
Dynamic SQL in SQL Server is executed by either the EXEC statement or the sp_ExecuteSQL stored procedure. In DB2, the EXECUTE IMMEDIATE statement can be used for executing SQL strings similar to T-SQL's EXEC (@SQL). Or, the PREPARE and EXECUTE statements can be used to execute parameterized SQL statements (similar to sp_ExecuteSQL). Dynamically prepared parameterized statements in DB2 for i must use parameter markers (?) instead of variable names.
Error Trapping
SQL Server error-handling utilizes the common TRY/CATCH block, whereas in DB2 for i, a handler declaration is used to control the logic flow when a particular error is encountered within a Compound Statement (commonly recognized as a BEGIN/END block).
In the DB2 for i realm, errors can be reviewed by checking the numeric SQL Code (IBM i OS–specific) or the character-based SQL State (cross-platform compatible among the DB2 family). If you're new to DB2, it's best to get in the habit of using the SQL State.
FROM Clause Requirement
SQL Server does not always require a FROM clause when writing a query. The following are valid T-SQL queries that return one row as a result set:
SELECT 'MyName';
SELECT GET_DATE();
SELECT dbo.MyFunction();
DB2 always requires a FROM clause:
Select *
From (Values(CURRENT_DATE)) As Dummy_Table(Now)
Select Current_Date AS Now
From SYSIBM.SYSDUMMY1 -- Special 1 Row System Table
T-SQL System Functions
Many T-SQL system functions that provide feedback about the status of the last executed statement, such as @@ROWCOUNT and @@ERROR, have their DB2 for i analog in the GET DIAGNOSTICS statement. Connection-related functions, such as SYSTEM_USER, HOST_NAME, and APP_NAME, are replaced with various SPECIAL REGISTERS.
Square Bracket Identifiers
When creating identifiers (column names, table names, etc.) with special characters such as spaces, SQL Server allows the usage of square brackets. These oft-used square bracketed identifiers in T-SQL are not allowed in DB2 for i. For example:
CREATE TABLE [My Table]
([Column 1] INT NOT NULL PRIMARY KEY,
etc.
In DB2 for i, the double quotes are used:
CREATE TABLE "My Table"
("Column 1" INT NOT NULL PRIMARY KEY,
etc.
The named identifiers within double quotes are case-sensitive. By default, SQL Server can use identifiers with double quotes, but most T-SQL developers are accustomed to using the square brackets.
If you're creating DB2 for i objects that will be accessed by RPG or COBOL programs, I'd advise staying away from identifiers that contain special characters.
IF Statement Coding
DB2 for i requires an explicit END IF to be specified at the end of an IF block as opposed to using the BEGIN/END multi-statement block in T-SQL.
INSERT/EXECUTE
INSERT/EXECUTE in T-SQL is not available in DB2 for i. T-SQL employs a special syntax to INSERT rows into a table based on the first result set returned from a stored procedure.
INSERT INTO MYTABLE
EXEC MYPROCEDURE
DB2 for i does not support this syntax. However, in DB2 for i 7.1, a procedure's result set can be opened as a cursor and each row inserted individually into a work table.
Linked Servers
Linked Servers are not supported in DB2 for i, although writing a table function with Java can allow you to simulate a linked server for read-only applications. See "Query Remote Database Tables from the iSeries Using SQL and Java Database Tables from the iSeries Using SQL and Java" for more info.
Also, something called Distributed Relational Database Architecture (DRDA) can be used to allow various members of the DB2 family (Windows, Linux, z/OS, etc.) to communicate with one another. See the CONNECT and SET CONNECTION statements for more info on how to connect to another database within the DB2 family. The major limitation here is that a DB2 for i session can be connected to only one relational database at a time, so you can't query different DB2 servers in a single statement like you can with linked servers.
"Max Length" Character and Binary Data Types
T-SQL has the VARBINARY(MAX), VARCHAR(MAX), and NVARCHAR(MAX) data types. Variables and columns defined as one of these types can be used like normal variables but can grow very large like a BLOB or CLOB. In DB2 for i, the maximum character length for a variable is about 32K. Anything bigger requires the use of the BLOB, CLOB, or DBCLOB large object data types.
Multi-Part Naming for Objects Is Different
In SQL Server, four part names can be specified:
SELECT * FROM server.database.schema.object
In DB2 for i, the multi-part naming is like this:
SELECT * FROM catalog.schema.object
They look similar, but the terms have different meanings. The objects are the same (TABLES and VIEWS). In DB2 for i, a schema (aka a library) is analogous to the SQL Server database container. The SQL Server schema is generally used for security (ownership) or object classification purposes. The DB2 for i catalog is the name of the relational database that the SQL session is connected with and is somewhat similar in concept to (but not as versatile as) SQL Server's linked/remote server.
To recap, SQL Server developers will, for the most part, want to forget about ever using the server and schema aspects of the four part naming and further remember that a DB2 for i schema is akin to a SQL Server database.
Overloading Parameter Signatures
In SQL Server, each stored procedure and function must have a unique name. However, in DB2, multiple stored procedures and functions can exist with the same name as long as the parameter signatures are different.
In the following example, two external CREATE FUNCTION definitions are shown for user-defined function Item_Price; however, the second parameter varies in each definition. The first signature accepts an effective date value as a decimal data type (common in legacy IBM i applications), and the second signature accepts an effective date using a date data type.
Create Function Item_Price(parmItem Char(15),
parmEffDate Dec(7,0))
Returns Dec(7,0)
External Name 'MYLIB/MYRPGPGM(ITEM_PRICE_DEC)'
Specific Item_Price_Dec
Language RPGLE
Parameter Style General
Deterministic
Allow Parallel
No SQL
Create Function Item_Price(parmItem Char(15),
parmEffDate Date)
Returns Dec(7,0)
External Name 'MYLIB/MYRPGPGM(ITEM_PRICE_DATE)'
Specific Item_Price_Date
Language RPGLE
Parameter Style General
Deterministic
Allow Parallel
No SQL
Both functions can co-exist because of the differences in the signature. In DB2, a "specific name" is assigned to each function/procedure name, and it is this name that must be unique. If a specific name is not supplied in the CREATE statement, DB2 will automatically assign a specific name based on the function or procedure's name.
OUTPUT Clause of DML
The OUTPUT Clause of DML statements, which is used to capture a picture of modified data as a result set, is not supported. However, in i7.1, the results of an INSERT statement (including identity values) can be immediately captured in a SELECT statement using a special table reference called a data change table reference.
Raising an Error
When SQL code needs to raise an error or warning condition, SQL Server developers use the RAISERROR statement (THROW is also supported in the upcoming SQL Server 2012). Similarly, DB2 for i users can use the SIGNAL statement to create an error condition and can use the RESIGNAL statement to percolate an existing error condition up to the previous caller.
Automatic Result Sets
SQL Server's automatic result sets have no analog in DB2. In SQL Server, a query statement will generate a result set and prepare it to be consumed by the caller without any additional code.
SELECT * FROM MYTABLE -- can be consumed automatically by client (JDBC, ODBC, .NET, etc.)
However, in DB2, a cursor declaration is required and the cursor must be left open for the client to be able to consume the result set:
DECLARE MY_RESULT CURSOR FOR
SELECT * FROM MYTABLE;
OPEN MY_RESULT; -- query results can now be consumed by client
When writing SQL in DB2 for i, there just are no "plain SELECT" statements to return result sets like those so often witnessed in T-SQL code.
Scripting
Scripting is unavailable in DB2 for i. This is one of the most disappointing differences. T-SQL developers are accustomed to writing T-SQL scripts with conditional logic that can be executed dynamically. This feature aids immensely in development, testing, and debugging. Consider this example where a result set is generated and returned to the user only if a certain condition is true:
DECLARE @DEBUG BIT
SET @DEBUG=1
-- lots of sql code here
IF @DEBUG=1
AND EXISTS (SELECT * FROM ORDERS WHERE SHIP_DATE<ORDER_DATE)
BEGIN
SELECT * FROM ORDERS WHERE SHIP_DATE<ORDER_DATE
END
-- lots of sql code here
There is no such "script" animal in the DB2 for i world; all code has to be written and executed within an SQL routine or an embedded SQL program. Some tools can execute single statements sequentially to give the appearance of script execution, but there is no ability to evaluate variables or perform conditional logic in an interactive setting.
Starting in i7.1, the CREATE OR REPLACE syntax (e.g., CREATE OR REPLACE MYPROCEDURE) can be used as a substitute for the common SQL Server scripting practice of checking whether an object exists, dropping the object if it does, and creating or re-creating it.
SELECT INTO vs. CREATE TABLE AS
T-SQL's SELECT INTO is roughly equivalent to DB2's CREATE TABLE AS. T-SQL developers often use a SELECT/INTO statement to create a temporary (or permanent) table based on the results of a SELECT query:
SELECT *
INTO #MY_TEMP_TABLE -- Store query results here
FROM MY_BIG_QUERY
In DB2, SELECT INTO is available but is used for a different purpose (more on this later). The approximate equivalent of T-SQL's SELECT INTO would be the CREATE TABLE AS statement:
CREATE TABLE QTEMP.MY_TEMP_TABLE AS
(SELECT * FROM MY_BIG_QUERY)
WITH DATA
Be sure to check out all the options this powerful statement offers. Unlike T-SQL's SELECT INTO, you have the option to propagate identity, default, and other attributes from an existing table or query definition.
SELECT @VAR=COLUMN
T-SQL's SELECT @VAR=COLUMN is not supported in DB2 for i . T-SQL allows a SELECT statement to function as an assignment statement, similar to combining multiple SET statements into one:
DECLARE @MYVAR1 INT, @MYVAR2 INT
SELECT @MYVAR1=100, @MYVAR2=200
The equivalent statements in DB2 for i are the SET and VALUES INTO statements. Both SET and VALUES INTO can assign multiple values at once, similar to how T-SQL's SELECT does in the above example.
Further, in T-SQL, a single row lookup from a query can also be used to populate variables, as follows:
DECLARE @MYVAR1 INT, @MYVAR2 INT
SELECT @MYVAR1=COLUMN1,@MYVAR2=COLUMN2
FROM MY_TABLE
WHERE MY_KEY = 123
In DB2 for i, the SELECT INTO syntax is used when intending to fetch a single row into variables:
SELECT COLUMN1,COLUMN2
INTO @MYVAR1,@MYVAR2
FROM MY_TABLE
WHERE MY_KEY = 123
If the query returns multiple rows during a SELECT INTO, DB2 for i issues an error. Remember, while both T-SQL and DB2 for i have SELECT INTO statements, their usage is very different.
Other T-SQL variations of assigning values to variables directly in a query include the ability to modify a variable's content based on each row in a result set (such as the concatenation example below) and the ability to use a self-referencing variable in an UPDATE statement. For example:
DECLARE @ORDER_ITEMS VARCHAR(MAX)
SET @ORDER_ITEMS='' -- Concat Example is Unsupported in DB2
SELECT @ORDER_ITEMS=@ORDER_ITEMS+OD.ITEM+','
FROM ORDER_DETAIL OD
WHERE ORDER_ID = 123
These variable assignment variations are not supported in DB2 for i.
Semicolon Termination Character
DB2 for i requires a semicolon at the end of every executable statement within a procedure. In contrast, T-SQL, with few exceptions, is pretty forgiving about whether or not you use one. In fact, most T-SQL coders don't seem to use them except for the few minor instances when they're required.
Stored Procedure Invocation
Stored procedure invocation syntax is different. The DB2 CALL statement is used to invoke a stored procedure instead of T-SQL's EXEC statement:
EXEC MyDatabase..MyProcedure PARM1, PARM2 -- SQL Server
CALL MyDatabase.MyProcedure PARM1, PARM2 -- DB2 Call
String Concatenation
The string concatenation operator is different. In T-SQL, the plus sign (+) is used to concatenate strings. In DB2 for i, two pipe (|) characters are required:
SELECT FIRST_NAME+' '+LAST_NAME ...-- T-SQL
SELECT FIRST_NAME||' '||LAST_NAME ...-- DB2
Table Variables
Table variables are not supported in DB2 for i. Using QTEMP (aka temporary) tables or an array (starting with i7.1) are the usual alternatives.
TOP vs. FETCH FIRST
T-SQL's TOP is roughly equivalent to DB2's FETCH FIRST clause. One popular feature in T-SQL is to use the TOP clause to retrieve, update, or delete the first n rows (or the top p percent of rows) in a result set.
DB2 for i has a similar FETCH FIRST clause that allows a query to retrieve the first n rows.
SELECT * FROM MyData
ORDER BY MyKey
FETCH FIRST 10 ROWS ONLY
FETCH FIRST does not support a percentage option. Also, the DELETE and UPDATE statements do not support the FETCH FIRST clause.
Transactions
All SQL Server data modifications are recorded in the transaction log, and there is no way to shut this feature off. Each data modification statement in SQL Server is treated as an individual implicit transaction unless the BEGIN TRANSACTION and COMMIT/ROLLBACK statements are used to define an explicit transaction boundary.
Data modifications are logged in DB2 for i only when the database tables are being "journaled." A journal is roughly equivalent to a transaction log except that, in DB2 for i, the developer decides which objects in the database are going to be logged or journaled. A journal receiver object is what holds the before and after pictures of each data modification. Many DB2 for i shops use the journal feature for auditing because, unlike the SQL Server log, the DB2 for i journal does record user and session information for each change made.
In DB2 for i, transactions are allowed only when journaling is active on all tables being modified within the transaction. There will not be an atomic transaction if there is no journal/log in use for a rollback. To further complicate matters, logging can be shut off on a DML statement within a transaction by specifying the "No Commit" isolation clause. Hence, because the isolation level can be changed in this manner, a rollback may not be able to restore your database to a consistent state. SQL Server developers generally use table variables to persist data beyond a rolled back transaction.
Further, it is common for DB2 for i developers and administrators to restore individual tables from a backup, whereas SQL Server users typically restore the entire database to a point in time.
If you use the DB2 for i SQL environment from start to finish, starting with the creation of a schema using the CREATE SCHEMA statement, all of your database data will be journaled and consistent (assuming that non-transacted modifications are not allowed). However, most legacy applications in the IBM i arena do not conform to this pattern.
Transaction boundaries are established with a SET TRANSACTION statement and finished with a COMMIT or ROLLBACK statement. DB2 for i, like SQL Server, has the concept of a save point to define and, if necessary, roll back a partial transaction.
Tempdb vs. QTEMP
The SQL Server tempdb database is known as QTEMP in DB2 for i. T-SQL developers rely on a special database called tempdb for storing and manipulating temporary data sets. Data in tempdb disappears when the SQL session ends. Likewise, in DB2 for i, each SQL session has its own private temporary area called QTEMP and temporary objects are purged when the session (aka job) ends.
Unlike tempdb, a hash symbol is not required for creating a temporary table in DB2; rather, you create the table within the QTEMP schema:
CREATE TABLE QTEMP.TEMP_DATA
(DATA_COLUMN VARCHAR(1024) NOT NULL)
The DECLARE GLOBAL TEMPORARY TABLE statement provides another way to create a temporary table. In DB2 for i, "global" means the table is available for all processes in the session. Don't confuse the DB2 "Global Temporary Table" concept with SQL Server's. DB2 for i has no equivalent to SQL Server's global temporary table, which is a table that is sharable among all user sessions on the server (a table name prefixed with two hashes.)
Truncate Table
DB2 for i doesn't offer an equivalent to the TRUNCATE TABLE statement. However, there is an equivalent OS command called CLRPFM that can, with a little work, be invoked from SQL. Otherwise, you're stuck deleting all the rows from a table.
UPDATE/DELETE Statements with a JOIN
One thing I love about T-SQL is its ability to run an UPDATE or DELETE statement involving one or more JOINs. However, this syntax is not supported in DB2 for i. Instead of using JOINs, all correlations are usually done with a subquery and an EXISTS predicate.
UPDATE MYDATA.NEW_CUSTOMER TARGET
SET BALANCE_DUE=(SELECT BALANCE_DUE
FROM QTEMP.OLD_CUSTOMER AS SOURCE
WHERE TARGET.CUSTOMER_NO=SOURCE.CUSTOMER_NO)
WHERE EXISTS
(SELECT BALANCE_DUE
FROM QTEMP.OLD_CUSTOMER AS SOURCE
WHERE TARGET.CUSTOMER_NO=SOURCE.CUSTOMER_NO
AND TARGET.BALANCE_DUE<>SOURCE.BALANCE_DUE)
Yes, it's redundant to specify the subquery twice!
Variable Naming Requirement
In contrast to T-SQL, DB2 for i's SQL variable names are not required to start with an "at" (@) sign (although I like this practice in DB2 because it makes differentiating variable names and column names easier).
Window Aggregate Functions
Window aggregate functions such as SUM(expression) OVER(partition clause) are not yet supported (as of IBM i 7.1).
Are All Database Servers Created Equal?
Just because there are many unsupported features between the two database servers doesn't imply that DB2 for i is an inferior product, as DB2 for i has implemented the core requirements of the ANSI 93 SQL standard. Most of the unsupported SQL Server options mentioned are proprietary to SQL Server. Further, there are many areas of convergence, including these:
- National character NCHAR and NVARCHAR data types (i6.1, although you can use the GRAPHIC data type equivalent starting in V5R3)
- OLAP ranking functions (V5R4)
- Super groups (V5R4)
- Merge (i7.1)
- Recursive queries (V5R4)
- Filtered indexes (i6.1)
- Row version columns (V5R4)
- XML processing (i7.1—and prior with a separate product—although the available features are highly divergent between the two systems)
- Instead of triggers (V5R3)
- Sequences (V5R3 and will be available in SQL Server 2012)
- Full outer join (i6.1)
- Common table expressions (V5R1) although the syntax is a little different between the two database engines for DML statements
If you're already familiar with SQL Server programming, a little practice will get you up to speed quickly with DB2 for i.
Links
Microsoft SQL Server to IBM DB2 UDB Conversion Guide (SQL Server 2005)
LATEST COMMENTS
MC Press Online