Solr 2 Oracle: date indexing, timezone handling and probable issues


Apache Solr provides a field type for datetime values, solr.TrieDateField (TrieDateField), that is based on a representation optimized for comparison and sorting. Up to Solr 4.x it is a class derived from the well-known solr.DateField (DateField); from Solr 5.0 onward, solr.TrieDateField replaces solr.DateField. Using either date field presupposes one important convention: every value passed around or processed within Solr is handled as UTC (Coordinated Universal Time), also known as Zulu time (with a Z appended), so that all the timezone detection and timezone math hassle can be avoided. Solr therefore only accepts date strings given to the DataImportHandler in the form defined by ISO 8601, “1995-12-31T23:59:59Z” for example. However, if you do not pass the value as a string but as a database date or timestamp without time zone datatype within an SQL select statement to the DataImportHandler, secondary Solr-side processing may have to be taken into account.
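As a side note for the string case, here is a minimal Python sketch of how a client could produce such a Zulu time string before handing it to Solr. The helper name to_solr_date is hypothetical and not part of Solr or the DataImportHandler; it only assumes that the value already carries a timezone.

```python
from datetime import datetime, timezone

def to_solr_date(dt: datetime) -> str:
    """Format a timezone-aware datetime as a UTC string ending in Z."""
    if dt.tzinfo is None:
        # A naive value has no defined UTC equivalent, so refuse to guess.
        raise ValueError("naive datetime: attach a timezone before formatting")
    # Normalize to UTC first, then print with second precision and a Z suffix.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(to_solr_date(datetime(1995, 12, 31, 23, 59, 59, tzinfo=timezone.utc)))
# 1995-12-31T23:59:59Z
```

Rejecting naive values is a deliberate choice in this sketch: it forces the caller to decide on a timezone instead of leaving that decision to whatever default happens to be active.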

<!-- schema -->
...
<fieldType name="date" class="solr.TrieDateField" omitNorms="true"
  precisionStep="6" positionIncrementGap="0"/>
<field name="XYZ_TS" type="date" indexed="true" stored="true" />
...
<!-- import (xyz_ts is of type oracle date) -->
...
<document name = "DOC">
  <entity name = "ENT" pk = "DOC_ID"
    query = "select doc_id, xyz_ts from some_tab"
...

That is, to actually derive a Zulu time from a timezone-less database value, Solr acts in a reasonably smart way: it uses the timezone setting of the JVM process that runs the integrated Jetty server, which is visible as user.timezone in the Java system properties. The corresponding setting resides in the solr.in.sh shell script in the bin directory and originally reads as follows (it may be verified on the Solr admin page http://ip:port/solr/#/~java-properties in the line “user.timezone | UTC”):

# By default the start script uses UTC; override the timezone if needed
#SOLR_TIMEZONE="UTC"

A setting of SOLR_TIMEZONE="UTC" simply takes the given timezone-less value as Zulu time. Running a query through http://ip:port/solr/#/core/query then shows a Zulu time formatted string for a database-fed solr.TrieDateField, "XYZ_TS": "2014-07-07T00:00:00Z" for example.

If, however, you prefer to run your (Jetty/JVM)/Solr with SOLR_TIMEZONE="Europe/Berlin" for whatever reason, be prepared to find something like "XYZ_TS": "2014-07-06T22:00:00Z" in the query output: the value has been shifted by 2 hours, namely the 1 hour offset of Europe/Berlin to UTC plus 1 more hour for daylight saving time in summer.
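The arithmetic behind both outputs can be replayed outside Solr. The following Python sketch is not Solr code; it only mimics the effect described above. The function name index_as_zulu is hypothetical, and fixed UTC offsets (+2 for Berlin summer time, +1 for Berlin winter time) are an assumption standing in for the JVM timezone setting.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets stand in for the JVM timezone (assumption for this sketch).
UTC = timezone.utc
CEST = timezone(timedelta(hours=2), "CEST")  # Europe/Berlin in summer
CET = timezone(timedelta(hours=1), "CET")    # Europe/Berlin in winter

def index_as_zulu(db_value: datetime, jvm_tz: timezone) -> str:
    """Interpret a timezone-less database value in jvm_tz, return Zulu time."""
    local = db_value.replace(tzinfo=jvm_tz)  # attach the assumed timezone
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

db_value = datetime(2014, 7, 7, 0, 0, 0)  # midnight as stored in the database
print(index_as_zulu(db_value, UTC))   # 2014-07-07T00:00:00Z
print(index_as_zulu(db_value, CEST))  # 2014-07-06T22:00:00Z
```

A winter date would shift by only one hour under the same assumptions: index_as_zulu(datetime(2014, 1, 7), CET) gives 2014-01-06T23:00:00Z.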

That’s it, have fun, Peter
