XMLType.schemaValidate is your only true friend for xmlschema validation

When it comes to introducing xml-data into your database storage, you might, as all sincere developers do, at first attempt to take care of proper data integrity checking on import. Since xml is a really powerful but a somewhat complex document-alike data representation, such integrity checking must incorporate proving the document strucuture, as sort of integrated data types, iff you like, as well as the document data, in terms of facets of given actual values. According to the xml standard, xmlschema is the means of choice here, offering another xml-spelled specification layer to achive the two beforementioned goals. Eventually, a given xml-instance will have to be thrown against the xmlschema provided, to assure its integrity (and being well-formed too, btw), the earlier the better, at best on data import already.

The paragraph above is actually not very oracle specific. Any implementation of the xml standard, as oracle’s xml db does, proposes this course of action. However, since some xml operations may become quite costly when the xml-instances get large and the whole dom tree has to be set up in memory, oracle, as others, dabbles at dodging and shifting pricey work to the most reasonable extend possible. You may load xml into the database, claiming any xml-instances are just fine, validate xml-instances only to be well-formed, manage the validation status of xml-instances at your own responsibility and so forth. There is nothing wrong about that whatsoever. The point is, though, it might be a stony walkway to learn to distinguish the maybe from the certain, to establish a reliable check to safeguard your xml data integrity, just as simple as ALTER TABLE mytab ADD(CONSTRAINT mytab_CHK CHECK(INSERT_TS is not null) NOT DEFERRABLE ENABLE);.

Jep, what next? I’m going to briefly discuss the xmlschema / xmlinstance used below and then show what I experienced will happen using the various ways of “validating” the instance against the schema. The post is somewhat lengthy, but do not get frightened, this is for the c&p examples all along the way.


Oracle character set conversion downgrade checkup with utl_i18n

Doing an export/import or a ctas or an oci/jdbc client action into a database that has smaller sized character set, multi- to single-byte for example, will raise the problem of information loss in terms of character data. The information loss, however, is not limited to loosing one or the other character. Some characters may also become replaced by a default or a best guess replacement character in the target character set automatically.
Profound analysis of the outcome of the character set conversion might also cause difficulties when only the source database is already available and just an estimation of the information loss is what you need. You may come up with good old convert() to execute an input/output compare but aside from using convert() is discouraged with the latest releases of Oracle, convert() will also fracture the strings on input as soon as the first multibyte character appears. convert() will alert you of some problem but it will not tell you why or even better: what character is (first) the stumbling block.


The 11g pivot query and the group by clause

There has been this long awaited pivot query feature available since 11g that saves us time in writing those infamous decode / group by queries (see expert one-on-one from Tom Kyte or the web for examples) to flip (grouped by) leading row values to column names.

Searching the web for application examples, however, does only reveal this emp-table stuff, e.g. on, that still contains the well known group by with the base query.

  FROM (SELECT job, deptno, sum(sal) sal FROM emp GROUP BY job, deptno)
         PIVOT ( sum(sal) FOR deptno IN (10, 20, 30, 40) );

JOB               10         20         30         40
--------- ---------- ---------- ---------- ----------
CLERK           1300       1900        950
SALESMAN                              5600
PRESIDENT       5000
MANAGER         2450       2975       2850
ANALYST                    6000

Jep so, what I wanted to achive was a simple flip of rows of status values and counts into columns of status names and count values based on some data like the following.


Just another discussion of unicode character conversion for oracle

Running an oracle in multibyte unicode storage like AL32UTF8, disregarding the char and byte column length topic, is actually no different from the old days single byte storage, e.g. in WE8MSWIN1252. However, any job that includes sort of character conversion in terms of character, decimal and hex reprasentations, does require at least a basic understanding of available unicode storage options and sql functions with oracle. To me, the main reason of common problems is the mismatch being imposed by oracle’s impure layout of the sql functions ascii, asciistr, chr, nchr and unistr concerning the database and the national characterset.

The following has been executed on a 11gR2 on win64 using these database and the national characterset storage options.

SQL> select * from NLS_database_PARAMETERS where parameter like '%CHARACTERSET%';
PARAMETER                VALUE
---------                -----


Implementing bottom-up path traversal for hierarchical tables

Update: There is a newer and imho better approach available with Bottom-up hierarchical root path query recipe in oracle .

oracle supports hierarchical queries on tables by the start with and connect by clauses. this usually preassumes that the rows of some table are chained by a value in column dad_id of some row pointing to a value in column son_id in a parent row, where son_id also denotes the primary key in the table. the root rows of the tree being set up that way, furthermore have a dedicated value in column dad_id. for positive, evolving numbers (id’s ;-), one may just choose 0 here. so far, a typical hierarchical query for a (male) family tree may look like this: