Learn why recreating your SQL routines can improve performance.
As a developer, it’s easy to operate by the motto “If it ain’t broke, don’t fix it.” That practice is understandable, given there’s usually not enough hours in the workday to get everything done. If you have SQL procedures, functions, or triggers that are frequently used, then you may be able to improve performance by making an exception to this practice.
You may not be aware that when an SQL routine object such as a user-defined function (UDF) is created with the SQL procedural language, the Db2 underneath the covers generates a C program object with embedded SQL statements. To maximize performance, Db2 does try to generate C code to implement some of the simpler SQL statements, such as SET and IF. An assignment statement like SET v1=’ABC’ will run much faster by using generated C code than by invoking the Db2 engine to copy 3 bytes of data into a variable.
The good news from an SQL performance perspective is that IBM has improved its C code generation over the years. The bad news is that your SQL routines must be recreated to reap the benefits of these C code generation improvements. These code generation improvements are most often delivered as part of a new IBM i release. Code gen enhancements are also delivered with Database Group PTFs as this Db2 website shows.
Based on these delivery vehicles, a good strategy to employ is to recreate all your SQL routines after installing a new release or Database Group PTF. If your shop uses a change management tool, then recreating all of your SQL procedures, functions, and triggers after these events should be an easy task.
Another strategy is focusing your recreate efforts on those SQL routines that haven’t been recreated in a while and that are frequently used. You might be thinking that will be difficult to figure out. However, the Db2 system catalog views make this task quite simple.
The catalog views that you will want to use for this task are the SYSPROGRAMTSTAT and SYSPROGRAMSTMTSTAT views in QSYS2. The SYSPROGRAMTSTAT view contains one row for every program that contains an embedded SQL statement on the system. As discussed earlier, SQL routines get implemented as a C program with embedded SQL, so this catalog view will contain a row for every SQL procedure, function, and trigger on your system. Correspondingly, the SYSPROGRAMSTMTSTAT view contains one row for every embedded SQL statement in the programs contained in the prior view. An SQL statement that gets implemented with generated code will not be represented in this view.
The code below contains an example of a query that can be run against these two catalog views to identify SQL routines that haven’t been recreated in over 12 months and that have been used in the last 6 months.
WITH Assignment_Count AS (SELECT program_schema, program_name,
SUM( CASE WHEN Statement_Text like 'SET%' THEN 1 ELSE 0 END) Assignment_Stmts
FROM qsys2.sysprogramstmtstat GROUP BY program_schema, program_name)
SELECT p.program_schema, p.program_name, routine_type, number_statements, assignment_stmts,
DATE(creation_timestamp) creation_date, DATE(last_used_timestamp) last_used_date, days_used_count FROM qsys2.sysprogramstat p INNER JOIN assignment_count a
ON p.program_schema = a.program_schema AND p.program_name = a.program_name
WHERE routine_body='SQL' AND routine_type IN ('PROCEDURE','FUNCTION','TRIGGER') AND
MONTHS_BETWEEN(CURRENT_DATE, DATE(last_used_timestamp))<6 AND
MONTHS_BETWEEN(CURRENT_DATE, DATE(creation_timestamp))>=12
ORDER BY last_used_timestamp DESC, creation_timestamp ;
The Assignment_Count common table expression is used on the SYSPROGRAMSTMTSTAT view to count the number of SET statements in each routine that are using SQL instead of C code since those are the easiest targets for C code generation to replace. The main SELECT statement joins SYSPROGRAMSTMT to this table expression and returns only those programs that are associated with SQL routines by filtering on routine_body and routine_type. The statement also specifies filtering criteria on the two timestamp columns to return only those routines used in the last 6 months that have not been recreated in over 12 months.
Figure 1 contains a sample result set from this query. The catalog view query has returned three functions that meet the specified criteria. Notice that it returned one SQL procedure that was last recreated in 2014. That procedure and the SQL function in the list have the most potential for improved performance from a recreate since a high percentage of the SQL statements are assignment statements. You can easily customize this query to generate a more targeted list of routines.
Figure 1: Output from catalog view query to identify SQL routines to recreate
It should now be obvious how simple recreates of your SQL routines could improve performance. Once you’ve mastered regular recreates for performance, I’ll leave you with a challenge of reviewing this white paper to ensure that your SQL coding techniques are optimal for C code generation.
LATEST COMMENTS
MC Press Online