asciistr

Oracle character set conversion downgrade checkup with utl_i18n

Doing an export/import, a CTAS, or an OCI/JDBC client operation into a database with a smaller character set, multibyte to single-byte for example, raises the problem of losing character data. The loss, however, is not limited to dropping individual characters. Some characters may also be silently replaced by a default or best-guess replacement character in the target character set.
A thorough analysis of the conversion outcome can also be difficult when only the source database is available and all you need is an estimate of the information loss. You may reach for good old convert() to do an input/output comparison, but apart from the fact that using convert() is discouraged in the latest releases of Oracle, convert() will also fracture the strings on input as soon as the first multibyte character appears. convert() will alert you to a problem, but it will not tell you why or, even better, which character is the (first) stumbling block.
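As a rough illustration of such an estimate, the query below round-trips a sample string through the target character set with utl_i18n.string_to_raw and utl_i18n.raw_to_char and compares the result with the input. It is only a sketch: the sample string, the AL32UTF8 source and the WE8MSWIN1252 target are assumptions for illustration, and this is not necessarily the approach the full post takes.

```sql
-- Sketch: round-trip a sample string through the planned target character set.
-- Assumptions: the source database is AL32UTF8, the target is WE8MSWIN1252.
with sample as (
  -- "café" plus two CJK characters that have no WE8MSWIN1252 equivalent
  select to_char(unistr('caf\00E9 \4E2D\6587')) as str from dual
),
converted as (
  select str,
         utl_i18n.raw_to_char(
           utl_i18n.string_to_raw(str, 'WE8MSWIN1252'),  -- encode in target charset
           'WE8MSWIN1252') as round_trip                 -- decode back to db charset
  from   sample
)
select asciistr(str)        as source_codepoints,
       asciistr(round_trip) as round_trip_codepoints,
       -- a difference (or a NULL) means the conversion would not be lossless
       case when round_trip = str then 'ok' else 'check' end as verdict
from   converted;
```

Comparing the two asciistr() columns side by side shows which code points were replaced, which is exactly the detail a plain convert() comparison does not give you.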

(more…)

Just another discussion of unicode character conversion for oracle

Running Oracle with multibyte Unicode storage such as AL32UTF8 is, disregarding the char and byte column length topic, actually no different from the old days of single-byte storage, e.g. in WE8MSWIN1252. However, any job that involves some sort of character conversion between character, decimal and hex representations does require at least a basic understanding of the Unicode storage options and SQL functions available in Oracle. To me, the main cause of the common problems is the mismatch imposed by Oracle’s inconsistent layout of the SQL functions ascii, asciistr, chr, nchr and unistr with respect to the database and the national character set.

The following was executed on an 11gR2 database on win64 using these database and national character set storage options.

SQL> select * from NLS_database_PARAMETERS where parameter like '%CHARACTERSET%';
PARAMETER                VALUE
---------                -----
NLS_CHARACTERSET         AL32UTF8
NLS_NCHAR_CHARACTERSET   AL16UTF16
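
To make the mismatch concrete, here is a small sketch that runs the five functions against the euro sign (U+20AC), assuming the AL32UTF8 / AL16UTF16 setup shown above. The choice of character is mine; dump() with format 1016 is used only to reveal the character set and hex bytes of each result. Since the UTF-8 encoding of U+20AC is the three bytes E2 82 AC, ascii() should report their combined decimal value (14844588) rather than the code point 8364, while nchr() and unistr() work with the code point.

```sql
-- Sketch, assuming NLS_CHARACTERSET = AL32UTF8 and
-- NLS_NCHAR_CHARACTERSET = AL16UTF16 as listed above.
select dump(to_char(unistr('\20AC')), 1016)  as db_bytes,     -- euro in the database charset
       dump(unistr('\20AC'), 1016)           as nchar_bytes,  -- euro in the national charset
       ascii(to_char(unistr('\20AC')))       as ascii_dec,    -- decimal value of the db charset bytes
       asciistr(to_char(unistr('\20AC')))    as asciistr_esc, -- backslash-escaped UTF-16 code unit
       dump(chr(14844588), 1016)             as chr_db,       -- chr() takes the db charset byte value
       dump(nchr(8364), 1016)                as nchr_nat,     -- nchr() takes the Unicode code point
       dump(chr(8364 using nchar_cs), 1016)  as chr_nat       -- same as nchr(8364)
from   dual;
```

The point of the side-by-side columns is that ascii/chr speak in database character set byte values, whereas asciistr/unistr/nchr speak in Unicode code units, so feeding the output of one family into the other does not round-trip.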

(more…)