Presenting “Analytic Functions in Practice” at the DOAG-Regio Hamburg/Nord Dec 4 2018

Save the date to see my presentation about “Oracle Analytic Functions in Practice Applications” along the regular meetup of the DOAG/Oracle-community in Hamburg on Dec 4 2018. Although analytic functions is nothing really new in the Oracle world, I’m again attempting to propagate the thrilling productivity of this SQL-toolset, since practice applications are still not as dispersed as expected. To this extent, I want to stress windowing and aggregation over analytics and statistics in real world scenarios and examples. The presentation will be held in German, I suppose. Event details go here and there:


Setting up and mastering radius search in PostgreSQL (9.1)

Radius search in PostgreSQL may come in employing a light and/or a much more sophisticated version. This article discusses the light one, namely the cube and the earth distance extensions, most probably sufficient for the web user’s getting here and there requirements. earth distance, depending on cube, assumes the earth to be perfectly spherical, anyone demanding a higher accuracy level, especially for the mountainous parts, may take a look at the PostGIS project.

Although radius search, the light variety, will be up fast and performing well, there may be some mantrap around, for the ones who prefer to read documentation the easy way too. First of all, PostgreSQL: Documentation: 9.1: earthdistance indicates that the point-based earth distance calculation is hard-wired to statute miles in units. You may use this circumstance to your advantage, like datachomp did in Radius Queries in Postgres, as long as you know what you’re doing. Second to that, taking on the alternate cube-based earth distance calculation, the earth_box function, accepting a lat/long and a radius on input, may return locations farther than the actual radius given (documented alike). This is because earth_box, as the name implies, still handles a box geometry on the idealized sphere (and not some higher order circle surface). But more on that below.


Flashback version query and the proper use of timestamp and scn clauses

Flashback version query essentially enables you to lookup the incarnations of a row (defined by primary key) in the past, in a consecutive manner. Version information is depicted by a couple of pseudo-columns, namely versions_xid, versions_startscn, versions_endscn, versions_starttime, versions_endtime and versions_operation. See Using Oracle Flashback Version Query in the docs for explanations.

In combination with flashback query or flashback transaction query, one may restore a row incarnation from the past into a new table or even rollback to a past row incarnation within the same table.

This article will discuss flashback version query together with flashback query to restore one to many rows, just shown for a row of a unique key here for brevity, detailing when and when not to use timestamp and scn select where clauses to prevent pitfalls. An example table / dataset will be given, representing a real world scenario where some past data needs to be identified first and is then to be made available again.

Flashback version query uses the following pattern, including the pseudo-columns introduced above, on an actual application-, but not a system-table (alike flashback transaction query). A timestamp– or scn-range must be supplied to define the lookup window (defined by the stock of the available undo-data, remember) and to actually populate the pseudo-columns, respectively:


Grepping the actual bind values of sql through v$sql_bind_capture

Programmatically or intercepted by forced cursor sharing, sql where clause actual values shall always be bound to variable placeholders and never / no longer be literally coded. You know Tom Kyte talking that story on each and every occasion, pointing out that so many sql areas got stuck in size and performance because of literal values and the resulting multitude of variants of sql statements. On the other hand, people complain about the sporadic issue that the optimizer takes wrong decisions due to one-time probing of actual values in a given session, a nightmare in 10g, much improved by adaptive cursor sharing throughout 12c meanwhile. Well, I do not want to dive into this again, what follows is just a commented recipe sql statement to inspect what actual bind values have had been in action for a (potentially) sub-optimal sql execution. Please note, that the STATISTICS_LEVEL initialization parameter takes to be greater than BASIC to have the statement deliver any data.

The statement is neither complex nor long-running, just using two perfomance views v$sql_bind_capture and v$sqlarea, resp. While v$sql_bind_capture provides for the bind names, values, capture state and time, v$sqlarea offers different ways to approach the subject, by schema or sql-id or module etc, and not at least, the raw text of the statement in question.


Getting a raw constant number of rows from oracle’s table sample function

You may of course know these two famous posts called To sample or not to sample… (part-2) about data sampling by Mark Hornick. Although very limited in scope, the two posts (imho) very well sketch why we may employ data sampling and how we may lift off table sampling in oracle.
In general, sampling is used to make a representative statement about a collection of data while only regarding a limited random selection, the sample. As long as you are ok to analyze just a sufficient subset of your 1o million rows table for an analysis, you will save your environment a lot of resources and time. On some other scenario, a limited random data selection may also serve verification or testing purposes where, however, not the representativeness but the randomness at a more or less constant sample size, determines the quality of the sample output. Again, as long as you are ok to not exceed this 15 minutes time window overnight, you will be allowed to run that live unit test on any table in question, on 1, 10 or 100 million rows.
In sql, selecting in regard to gain a representative statement will feed the sample function with a requested percentage of rows to sample from. This is what the oracle sample function already offers. Yet another sql to accept a requested actual number of rows to return, independent of the table size, is not available so far (although most people do expect exactly this behaviour when they spot the sql sample function for the first time, weird). The following text will outline a snippet of pl/sql to provide for a sample function to accept the expected number of rows as a parameter.