Flashback version query and the proper use of timestamp and scn clauses

Flashback version query essentially enables you to lookup the incarnations of a row (defined by primary key) in the past, in a consecutive manner. Version information is depicted by a couple of pseudo-columns, namely versions_xid, versions_startscn, versions_endscn, versions_starttime, versions_endtime and versions_operation. See Using Oracle Flashback Version Query in the docs for explanations.

In combination with flashback query or flashback transaction query, one may restore a row incarnation from the past into a new table or even rollback to a past row incarnation within the same table.

This article will discuss flashback version query together with flashback query to restore one to many rows, just shown for a row of a unique key here for brevity, detailing when and when not to use timestamp and scn select where clauses to prevent pitfalls. An example table / dataset will be given, representing a real world scenario where some past data needs to be identified first and is then to be made available again.

Flashback version query uses the following pattern, including the pseudo-columns introduced above, on an actual application-, but not a system-table (alike flashback transaction query). A timestamp– or scn-range must be supplied to define the lookup window (defined by the stock of the available undo-data, remember) and to actually populate the pseudo-columns, respectively:

SELECT pseudo_col1, ..., app_col1, ... FROM app_tab
VERSIONS BETWEEN
  { SCN | TIMESTAMP } { expr | MINVALUE }
  AND { expr | MAXVALUE } ]

Flashback query then provides for the historic data selection in terms of application-table columns, typically used to run a create table as select..., if you like, supplying an as of timestamp– or scn selector:

-- CTAS, maybe
SELECT app_col1, ... FROM app_tab
  AS OF { SCN | TIMESTAMP } expr

All theory so far, yet the real example upfront, make shure have grasped what this trinity in fact means:

  • Flashback version query
  • Flashback query
  • Flashback transaction query

Also be prepared to unhesitantly get down to pen and paper when it comes to history lookup time windows and row level incarnations of data. It’s quite tricky and bewildering sometimes, be concentrated, have a sketch of timelines at hand (you know for https://en.wikipedia.org/wiki/Primer_(film), for example, some visualization like https://en.wikipedia.org/wiki/File:Time_Travel_Method-2.svg may come in very handy).

Example

Ok then, the example table / data comprises an excerpt from a storage application, where some document file will be uploaded somewhere in three or three plus steps. Here it goes:

  1. Enter the upload job item into some workflow table, (insert) columns doc_id = 13390083, file_id is null, status = 'UPLOAD_TODO', try_cnt = 0 and p_slave_id = -1.
  2. Assign a parallel slave upload channel to the job item, (update) column p_slave_id = 1921590.
  3. Perform the upload. Iff successful, (update) column file_id = 2676994, status = 'UPLOAD_DONE' and try_cnt = 1. Iff unsuccessful, for whatever reason, retry, incerementing column try_cnt up to some max until bailing out.

That said, a single row of doc_id = 13390083 might have been existed in three or more incarnations as shown by this flashback version query, using an appropriate timestamp-range for the versions between clause.

Identification

This is the identification step, using Flashback version query, of the above mentioned scenario. Of course, you might need to have any clue about the overall location of the timestamp-range in history, may also need to iterate a few times. Using scn is not of much sense here imho, since our wall clocks read time and not scn‘s (and the application manager phoned in at 8am).

Application-data part:

select doc_id, file_id, status, status_ts, try_cnt, p_slave_id
from doc_storage
  versions between timestamp
      to_timestamp('03.11.2017 10:05:00', 'dd.mm.yyyy hh24:mi:ss')
    and to_timestamp('03.11.2017 11:05:00', 'dd.mm.yyyy hh24:mi:ss')
  where doc_id  = 13390083
order by versions_startscn
;

Version-(meta)data part:

select versions_xid, versions_startscn, versions_endscn, versions_starttime,
  versions_endtime, versions_operation
from doc_storage
  versions between timestamp
      to_timestamp('03.11.2017 10:05:00', 'dd.mm.yyyy hh24:mi:ss')
    and to_timestamp('03.11.2017 11:05:00', 'dd.mm.yyyy hh24:mi:ss')
  where doc_id  = 13390083
order by versions_startscn
;

Restore

Having identified the historic data, we may continue restoring whatever share of incarnations we prefer. Unlike in identification, in restore, we exclusively employ scn-selectors to retrieve the rows in question. There is a couple of reasons why to proceed so, first of all, the versions_starttime– and versions_endtime-timestamp-columns, as shown above, essentially do not have the numeric resolution to select a row uniquely. Second to that, the above values for versions_starttime and versions_endtime do not, as one may suggest, relate to the according values of versions_startscn and versions_endscn in the metadata!

For example, since we have versions_starttime = '03.11.2017 10:25:59,000000000' for the initial insert step, the following select should deliver any result, but it remanins empty. Actually, only from versions_starttime = '03.11.2017 10:26:02,000000000' onwards, any data will be returned at all.

select doc_id, file_id, status, status_ts, try_cnt, p_slave_id
from doc_storage
    as of timestamp to_timestamp('03.11.2017 10:25:59', 'dd.mm.yyyy hh24:mi:ss')
  where doc_id  = 13390083
;

Funny, right? However, since there is no point in jesting for machines, let’s play around with timestamp_to_scn and scn_to_timestamp and see what comes up next.

Obviously, timestamp_to_scn(versions_starttime) does map to another scn, which is even younger in time as vstart_scn_diff proves. The other way around, looking at the results of scn_to_timestamp(versions_startscn), everything is fine, no difference in vstart_ts_diff.

select
  versions_startscn, timestamp_to_scn(versions_starttime) vstart_ts_2_scn,
  versions_startscn - timestamp_to_scn(versions_starttime) as vstart_scn_diff,
  versions_starttime, scn_to_timestamp(versions_startscn) vstart_scn_2_ts,
  versions_starttime - scn_to_timestamp(versions_startscn) as vstart_ts_diff,
  versions_operation
from doc_storage
  versions between timestamp
      to_timestamp('03.11.2017 10:05:00', 'dd.mm.yyyy hh24:mi:ss')
    and to_timestamp('03.11.2017 11:05:00', 'dd.mm.yyyy hh24:mi:ss')
  where doc_id  = 13390083
order by versions_startscn
;

Entirely puzzled now? Not necessary. Just use scn to restore the data, as this recipe recommends. Do note in addition, for sufficiency, that as of scn (must) relate(s) to versions_startscn, because the final incarnation/version of a row will have versions_endscn is null.

select doc_id, file_id, status, status_ts, try_cnt, p_slave_id
from doc_storage
    as of scn 8816864221191 -- compares to versions_startscn!
  where doc_id  = 13390083
;

So, finally, do use timestamp selection in identification and then switch to scn selection in restore and you’ll be fine. That’s it, essentially.

Enjoy

Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.