Don't let a small thing like a literal string completely ruin your multi-national program.
CCSID variations are a vast topic and are in fact the main issue to consider when globalizing an application. You will find many pages on the Web about it (search for Unicode, CCSID, globalization, or national language support). Some of the most interesting are those in the IBM i Information Center.
In my previous article, I showed how a simple line of RPG code can send a program haywire just because the job CCSID has changed. In this article, I will explain how to handle literal strings, which are sensitive to CCSID variations.
This simple line of RPG code made my MAIL program misbehave because the literal string '@' is hard-coded in the source code:
// * one @ is mandatory
i = %scan ('@':email);
A Coded Character Set Identifier (CCSID) identifies a table that assigns hexadecimal codes to a list of pictures (the picture of each printable character). This list of pictures is called the character set.
For more information about characters, globalization, and the i, have a look at the V5R2 iSeries Information Center. Yes, I said V5R2! Globalization is not really a new problem.
The character set breaks down into three parts:
- The invariant character set
- The portable character set
- The remainder
The invariant character set is composed of the following characters:
0 1 2 3 4 5 6 7 8 9 - % & ( ) * , . / : ; ? _ ' " + < = > a A b B c C d D e E f F g G h H i I j J k K l L m M n N o O p P q Q r R s S t T u U v V w W x X y Y z Z
These characters almost always have the same hexadecimal code (there are some exceptions).
The invariant character set is named CCSID(640). CCSID(640) contains all these characters, and only these.
The portable character set is the character set that drives C compilers crazy, particularly in the UNIX world. Details are in the V5R2 iSeries Information Center.
It is composed of these 13 characters: $ ^ ~ # @ [ \ ] { | } ! and the grave accent (`). Note: Wikipedia includes the invariant character set in the portable character set.
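The split between the two sets can be checked off-platform. The sketch below is in Python rather than RPG, only because Python ships with EBCDIC codecs. It assumes that the codecs `cp037`, `cp500`, and `cp273` are faithful stand-ins for CCSIDs 37, 500, and 273, and the helper name `varying_chars` is made up for this illustration.

```python
# Characters taken from the two lists above.
INVARIANT = (
    "0123456789-%&()*,./:;?_'\"+<=>"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
)
PORTABLE = "$^~#@[\\]{|}!`"

# Assumption: these Python codecs stand in for CCSIDs 37, 500 and 273.
CODECS = ("cp037", "cp500", "cp273")


def varying_chars(chars, codecs=CODECS):
    """Return the characters whose byte value differs between the codecs."""
    result = []
    for ch in chars:
        codes = {ch.encode(codec) for codec in codecs}
        if len(codes) > 1:
            result.append(ch)
    return result


if __name__ == "__main__":
    print("invariant chars that vary:", varying_chars(INVARIANT))
    print("portable chars that vary: ", varying_chars(PORTABLE))
```

With these three code pages, none of the invariant characters should show up as varying, while several of the portable ones (the @ among them) should.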
About My RPG Programs
- In the MAIL program, the "at" sign @ is an issue.
- In the PDF-generation program, the brackets ([ and ]) are an issue.
- In the RTF-generation program, the braces ({ and }) are an issue.
All these characters are in the portable character set. The solution, therefore, is to determine the correct value of these constants at run time, according to the CCSID of the job.
Constants can vary? Yes. Hence the title of this series of articles.
In every program, every literal constant should contain only characters from the invariant character set; that is what makes them truly constant.
Fine, but what do I do about my @?
How Do I Fix the Problem?
To fix this problem, I need to have a correct @—that is, the correct hexadecimal value that corresponds to the @, depending on the actual CCSID of the job. Once I have the correct hexadecimal code of the @, I will be able to verify the email address without getting confusing error messages.
To get the correct value of the @ according to the job, I need two things:
- A known value that is "fixed" (meaning hard-coded, unchangeable)
- An adaptation process
I'll describe that now.
The Solution, Data Side
I added these declarations into an /INCLUDE:
     dPortableCharInz  ds                  qualified
     d Dollar                         1    inz(x'5B')
     d AccentAcute                    1    inz(x'BE')
     d Caret                          1    inz(x'5F')
     d Tilde                          1    inz(x'A1')
     d NumberSign                     1    inz(x'7B')
     d AtSign                         1    inz(x'7C')
     d LeftBracket                    1    inz(x'4A')
     d BackSlash                      1    inz(x'E0')
     d RightBracket                   1    inz(x'5A')
     d LeftBrace                      1    inz(x'C0')
     d LogicalOr                      1    inz(x'BB')
     d RightBrace                     1    inz(x'D0')
     d ExclamationPoint...
     d                                1    inz(x'4F')
     d CCSID                          5s 0 inz(500)
and
     d PortableChar    ds                  qualified
     d Dollar                         1    inz('_')
     d AccentAcute                    1    inz('_')
     d Caret                          1    inz('_')
     d Tilde                          1    inz('_')
     d NumberSign                     1    inz('_')
     d AtSign                         1    inz('_')
     d LeftBracket                    1    inz('_')
     d BackSlash                      1    inz('_')
     d RightBracket                   1    inz('_')
     d LeftBrace                      1    inz('_')
     d LogicalOr                      1    inz('_')
     d RightBrace                     1    inz('_')
     d ExclamationPoint...
     d                                1
     d CCSID                          5s 0 inz(0)
I also added into this /INCLUDE the prototypes for the IBM-supplied iConv API and the CONVCCSID procedure (a wrapper I wrote).
What is the objective?
PortableCharInz is in effect a hexadecimal constant whose values are chosen according to CCSID 500, the international Latin-1 EBCDIC CCSID, which serves as a common base for America and Western Europe. I could have coded this string as a single constant, but it would be much less readable.
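As a cross-check on those hexadecimal values, here is a small Python sketch (Python only because it bundles an EBCDIC codec; `cp500` is assumed to match CCSID 500 for these bytes). It decodes the thirteen initial bytes of PortableCharInz in declaration order. The second byte, x'BE', comes out as the acute accent, in line with the subfield name AccentAcute.

```python
# The 13 initial values of PortableCharInz, in declaration order.
PORTABLE_CHAR_INZ = bytes.fromhex("5B BE 5F A1 7B 7C 4A E0 5A C0 BB D0 4F")

# Assumption: Python's cp500 codec matches CCSID 500 for these bytes.
decoded = PORTABLE_CHAR_INZ.decode("cp500")

if __name__ == "__main__":
    for byte, char in zip(PORTABLE_CHAR_INZ, decoded):
        print(f"x'{byte:02X}' -> {char}")
```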
The Solution, Code Side
The idea is to take PortableCharInz (whose content is fixed, known, and in CCSID 500) at the beginning of the program and convert it to the current job CCSID.
Like this:
PortableChar = ConvCcsid(PortablecharInz.ccsid:0:PortableCharInz);
From there, the change to make in the MAIL program is the following:
// * one @ is mandatory
i = %scan(portablechar.AtSign:email);
And the program runs correctly again.
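The same idea can be tried outside IBM i. The following is a minimal Python sketch, not the author's RPG: `conv_ccsid` is a hypothetical stand-in for the ConvCcsid wrapper, and the codecs `cp500` and `cp273` are assumed to behave like CCSIDs 500 and 273 (the job is assumed to run in German EBCDIC).

```python
def conv_ccsid(data: bytes, from_codec: str, to_codec: str) -> bytes:
    """Re-encode bytes from one code page to another (hypothetical helper)."""
    return data.decode(from_codec).encode(to_codec)


# Assumption: the job runs in German EBCDIC (cp273 standing in for CCSID 273).
JOB_CODEC = "cp273"

# The '@' as hard-coded in the source, valid for CCSID 500 only.
AT_SIGN_500 = b"\x7c"

# Done once at program start, like the ConvCcsid call above.
at_sign = conv_ccsid(AT_SIGN_500, "cp500", JOB_CODEC)

# An e-mail address as the job would hold it in memory.
email = "user@example.com".encode(JOB_CODEC)

hard_coded_pos = email.find(AT_SIGN_500)  # -1 means "not found"
converted_pos = email.find(at_sign)       # 0-based, unlike %scan

if __name__ == "__main__":
    print("hard-coded literal:", hard_coded_pos)
    print("converted constant:", converted_pos)
```

Note that Python's `find` is 0-based and returns -1 on failure, whereas `%scan` is 1-based and returns 0. The point of the sketch is only that the search succeeds once the constant has been converted to the job's code page.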
The complete RPG source code of ConvCcsid is in JP4INC.MBR here. The complete RPG code of MAIL is available here.
This second article has described the solution. We now know there is a way to handle the job CCSID. In the next (and final) article of this series, I will share some tips and tricks I found while solving the bug with the @.
Below, you will find the included prototypes for the IBM i APIs and my wrapper, copied from the include JP4INC.
These are the most important prototypes and data declarations:
      *==================================================================
      * Type definitions for Code Conversion APIs
      *==================================================================
     D iconv_t         ds                  based(pDummy)
     d                                     qualified
     D rc                            10i 0
     D cd                            10i 0 dim(12)
     D iconvtoCode     ds                  qualified
     D ccsid                         10i 0 inz(500)
     D convA                         10i 0
     D subA                          10i 0
     D shftA                         10i 0
     D lnOpt                         10i 0
     D erOpt                         10i 0
     D res                           12a   inz(*ALLx'00')
     D iconvfromCode   ds                  qualified
     D ccsid                         10i 0 inz(0)
     D convA                         10i 0 inz(0)
     D subA                          10i 0 inz(0)
     D shftA                         10i 0 inz(1)
     D lnOpt                         10i 0 inz(0)
     D erOpt                         10i 0 inz(0)
     D res                           12a   inz(*ALLx'00')
      *==================================================================
      * Prototype for iconv_open()--Code Conversion Allocation API
      *==================================================================
     D iconv_open      pr                  extproc('QtqIconvOpen') like(iconv_t)
     D pToCode                         *   value
     D pFromCode                       *   value
      *==================================================================
      * Prototype for iconv()--Code Conversion API
      *==================================================================
     D iconv           pr            10i 0 extproc('iconv')
     D cd                                  value like(iconv_t)
     D pInBuf                          *   const
     D inBytesLft                    10i 0
     D pOutBuf                         *   const
     D outBytesLft                   10i 0
      *==================================================================
      * Prototype for iconv_close()--Code Conversion Deallocation API
      *==================================================================
     D iconv_close     pr            10i 0 extproc('iconv_close')
     D cd                                  value like(iconv_t)
      *==================================================================
     d convccsid       pr          1000    varying
     d                               10i 0 const
     d                               10i 0 const
     d                             1000    const varying
This is the procedure for ConvCCSID:
     P convccsid       b                   export
     d                 pi          1000    varying
     d InCcsid                       10i 0 const
     d OutCcsid                      10i 0 const
     d InString_p                  1000    const varying
     d InString        s           1000    static varying
     d OutString       s           1000    static varying
     d inLen           s             10i 0 static
     d OutLen          s             10i 0 inz(1000)
     D hIconv          ds                  likeds(iconv_t) inz
     d errcode         ds                  likeds(ErrorCodeHandler)
     d                                     inz(*likeds)
     d rc              s             10i 0
     d ToCCSID         s             10i 0
      /free
       ToCCSID = OutCCSID;
       if ToCCSID = 0;
         // system info
         if not jp4.GotOsVersion;
           reset ErrorCodeHandler;
           APIlen = %size(PRDR0100) ;
           APIformat = 'PRDR0100' ;
           // product ID (7), release (6), option (4), load ID (10)
           osinfo='*OPSYS *CUR  0000*CODE     ' ;
           rtvprdinf(PRDR0100
                    : APIlen
                    : APIformat
                    : osinfo
                    : ErrorCodeHandler );
           if (ErrorCodeHandler.available>0);
             message(ErrorCodeHandler.msgid:ErrorCodeHandler.msgdta
                    :'':'QCPFMSG':'*ESCAPE');
           endif;
           jp4.gotosversion = true;
           jp4.OSVersion=prdr0100.Release_level;
         endif;
         if jp4.OSVersion < 'V6R1M0';
           // get job info - iconv V5R4 does not handle ccsid 65535
           if not jp4.GotJobi04 ;
             reset ErrorCodeHandler;
             RtvJobA ( JOBI0400
                     : %Size( JOBI0400 )
                     : 'JOBI0400'
                     : '*'
                     : *Blank
                     : ECH
                     );
             if (ErrorCodeHandler.available>0);
               message(ErrorCodeHandler.msgid:ErrorCodeHandler.msgdta
                      :'':'QCPFMSG':'*ESCAPE');
             endif;
             jp4.GotJobi04 =true;
           endif;
           if jobi0400.CodedcharactersetID=65535;
             jp4.jobccsid=jobi0400.Defaultcodedcharactersetidentifier;
           else;
             jp4.jobccsid=jobi0400.CodedcharactersetID ;
           endif;
           ToCCSID = jp4.jobccsid;
         else;
           // v6r1 handles ccsid 65535
         endif;
       endif;
       iconvfromCode.ccsid = InCcsid ;
       iconvtoCode.ccsid = ToCCSID ;
       hIconv = iconv_open(%addr(iconvtoCode) :
                           %addr(iconvfromCode) ) ;
       if hiconv.rc <> 0;
         if errno() <> 0;
           message(errnomsg(errno()):'':'*LIBL':'QCPFMSG':'*DIAG');
           return '';
         endif;
       endif;
       if jp4.OSVersion < 'V6R1M0';
         // iconv does not return the job ccsid
       else;
         if toccsid = 0;
           jp4.jobccsid=HICONV.CD(2);
         endif;
       endif;
       InString = InString_P;
       InLen = %len(InString) ;
       OutString = '';
       %len(OutString)=OutLen;
       rc = iconv( hIconv :
                   %addr(InString )+2 :
                   Inlen :
                   %addr(OutString )+2 :
                   Outlen );
       if rc< 0;
         if c_errno <> 0;
           message(errnomsg(c_errno):'':'*LIBL':'QCPFMSG') ;
           return '';
         endif;
       endif;
       // OutLen now holds the number of unused bytes left in OutString
       %len(OutString)=%size(OutString)-OutLen-2;
       iconv_close( hiconv);
       return OutString ;
       begsr *pssr;
         monitor;
           dumpcallstack();
           // debug mode ?
           clear errcode;
           errcode.provided =%size(errcode);
           debugmode.dbgattr='*DEBUGJOB';
           debugging=true;
           RtvDbgAttr ( debugmode.DbgAttr :
                        debugmode.RtnAttr :
                        errcode );
           if (errcode.available>0);
             if errcode.msgid ='CPF9541';
               debugging=false;
             else;
               message(errcode.msgid :errcode.msgdta :'':'QCPFMSG':'*DIAG');
             endif;
           endif;
           if debugging ;
             dump;
           endif;
         on-error;
         endmon;
       endsr;
      /end-free
     P                 e
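The CCSID selection at the top of the procedure boils down to one rule for releases before V6R1: when the job CCSID is 65535, use the job's default CCSID instead. Here is a minimal Python sketch of that rule alone. The function name `effective_ccsid` and the sample values are made up for the illustration and are not part of JP4INC.

```python
NO_CONVERSION = 65535  # job CCSID that iconv on V5R4 does not handle


def effective_ccsid(job_ccsid: int, default_ccsid: int) -> int:
    """Pick the target CCSID the way the pre-V6R1 branch does."""
    if job_ccsid == NO_CONVERSION:
        return default_ccsid
    return job_ccsid


if __name__ == "__main__":
    print(effective_ccsid(65535, 297))  # falls back to the default CCSID
    print(effective_ccsid(273, 273))    # job CCSID is usable as-is
```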
MC Press Online