data guard

Client-failover for dataguard switchover and failover


The following is little more than an extension of the oracle metalink document:

[ID 740029.1] Step By Step Guide On How To Configure And Test Client-Failover For Dataguard Switchover And Failover

That is, the article not only presents the cookbook how-to of its ancestor but also examines what happens in the database and what the client's experience will be with a so-called transparent application failover (TAF). The scenario has been tested with oracle on windows. Please note that the metalink document points to a more applicable technique available with oracle 11gR2 (Client Failover Best Practices for Highly Available Oracle Databases: Oracle Database 11g Release 2).

Client-failover for dataguard, as opposed to other failover scenarios, essentially aims at using one general tnsnames entry that points to a server-side database endpoint, no matter which database instance is currently running in the dataguard primary role. This is the actual transparency in a transparent application failover: the application (the client, here) does not have to change its network configuration in any way during a failover or switchover. A simple and not uncommon alternative would be to provide tnsnames.ora, tnsnames.ora.prm and tnsnames.ora.stb and rename the files as needed.
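To make the idea of one general tnsnames entry more concrete, here is a minimal sketch in Python that renders such an entry with both dataguard hosts in a single address list and a TAF FAILOVER_MODE clause. The alias DGAPP, the host names prmhost and stbhost, the service name dgapp_rw and the retry/delay values are made up for illustration and do not come from the metalink document. The sketch also assumes that the service is offered only by the database currently running in the primary role; how that is arranged is outside its scope.

```python
def taf_tns_entry(alias, hosts, service_name, port=1521, retries=180, delay=5):
    """Render one tnsnames.ora entry that lists every dataguard host.

    All argument values used below are hypothetical examples.
    """
    # One ADDRESS line per host; the client tries them in order (LOAD_BALANCE = OFF).
    addresses = "\n".join(
        f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        for host in hosts
    )
    return (
        f"{alias} =\n"
        "  (DESCRIPTION =\n"
        "    (ADDRESS_LIST =\n"
        "      (LOAD_BALANCE = OFF)\n"
        "      (FAILOVER = ON)\n"
        f"{addresses}\n"
        "    )\n"
        "    (CONNECT_DATA =\n"
        f"      (SERVICE_NAME = {service_name})\n"
        "      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)"
        f"(RETRIES = {retries})(DELAY = {delay}))\n"
        "    )\n"
        "  )\n"
    )


# Hypothetical alias, hosts and service name.
entry = taf_tns_entry("DGAPP", ["prmhost", "stbhost"], "dgapp_rw")
print(entry)
```

With an entry of this shape the client keeps using the same alias before and after a role change, so no file renaming is needed.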


RFS: No standby redo logfiles …

Some day, you may find the following error message in your standby database alert log (seen on a

RFS: No standby redo logfiles available

The reason for this message is most probably that you have cloned your standby database with rman and configured your dataguard environment to write directly into the redo log at the standby (using LGWR), instead of just transferring the archived logs from the primary (using ARCH), but have forgotten to provide the necessary standby redo log files. RFS may also log RFS[1]: Unable to open standby log 9: 313 or something similar (the primary will also complain about connection problems to the standby destination, which is actually the standby redo log).

RFS: No standby redo logfiles available of size 104857600 bytes
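The usual remedy is to add standby redo log groups of the size named in the message, one group more per thread than there are online redo log groups. As a minimal sketch, the following Python helper pulls the size out of the alert log line and prints matching ALTER DATABASE ADD STANDBY LOGFILE statements. The directory, the file naming, the count of three online groups and the group numbering (continuing after the online groups) are assumptions for illustration, so check them against your own v$log before running anything.

```python
import re


def srl_size_from_alert(line):
    """Return the size in bytes that RFS asked for, or None if the line does not match."""
    match = re.search(r"No standby redo logfiles available of size (\d+) bytes", line)
    return int(match.group(1)) if match else None


def add_standby_logfile_sql(online_groups, size_bytes, threads=1,
                            directory="/u01/oradata/stb"):
    """Build (online_groups + 1) ADD STANDBY LOGFILE statements per thread.

    directory and the srl_NN.log file names are hypothetical.
    """
    statements = []
    group = online_groups * threads + 1  # assumed: number on after the online groups
    for thread in range(1, threads + 1):
        for _ in range(online_groups + 1):
            statements.append(
                f"ALTER DATABASE ADD STANDBY LOGFILE THREAD {thread} GROUP {group} "
                f"('{directory}/srl_{group:02d}.log') SIZE {size_bytes};"
            )
            group += 1
    return statements


alert_line = "RFS: No standby redo logfiles available of size 104857600 bytes"
size = srl_size_from_alert(alert_line)
statements = add_standby_logfile_sql(online_groups=3, size_bytes=size)
for statement in statements:
    print(statement)
```

The generated statements are meant to be reviewed and then run on the standby (and ideally on the primary as well, so that it is prepared for a later role change).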


Simulating and resolving a dataguard archivelog gap

I was wondering what happens if an archivelog gap really occurs in a dataguard environment. How to resolve it? What has to be done manually and what happens automatically, due to the fetch archive log mechanism (fal), for example?

Well, my attempt at producing the problem goes like this:

  • verify normal operation by doing manual log switches on primary and watching what happens in the alert.log on primary and standby
  • defer the remote log destination (usually #2) on primary
  • do some more manual log switches on primary that now don’t get shipped to standby
  • back up primary, removing all backed-up archive logs from the recovery destination
  • bounce primary and standby
  • activate the remote log destination (usually #2) on primary again, do another manual log switch on primary and watch what happens in the alert.log on primary and standby
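For reference, the steps above can be written down as the statements one would typically type. The following Python sketch only assembles and prints them, it does not connect to any database. The destination number 2 follows the "usually #2" above; the number of log switches per step and the exact rman backup command are assumptions, not taken from the original test.

```python
def gap_simulation_steps(dest=2, switches=3):
    """Return the gap simulation as an ordered list of (where, statements) pairs."""
    switch = ["ALTER SYSTEM SWITCH LOGFILE;"] * switches
    return [
        # 1. verify normal operation (watch both alert.log files meanwhile)
        ("primary / sqlplus", list(switch)),
        # 2. defer the remote log destination
        ("primary / sqlplus",
         [f"ALTER SYSTEM SET log_archive_dest_state_{dest} = DEFER;"]),
        # 3. these switches are archived locally but not shipped
        ("primary / sqlplus", list(switch)),
        # 4. back up and remove the archived logs (assumed rman command)
        ("primary / rman", ["BACKUP DATABASE PLUS ARCHIVELOG DELETE INPUT;"]),
        # 5. bounce both databases, then restart each in its usual mode
        ("primary and standby / sqlplus", ["SHUTDOWN IMMEDIATE"]),
        # 6. re-enable shipping and switch again to provoke gap detection
        ("primary / sqlplus",
         [f"ALTER SYSTEM SET log_archive_dest_state_{dest} = ENABLE;"] + list(switch)),
    ]


steps = gap_simulation_steps()
for number, (where, statements) in enumerate(steps, start=1):
    print(f"-- step {number} ({where})")
    for statement in statements:
        print(statement)
```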

Ok, let’s follow the alert.log on primary and standby …


PING[ARC1]: Heartbeat failed to connect to standby ‘dgp’. Error is 1031.

just in short: dug up another “circumstance” of oracle operation, in a data guard environment this time. i usually set up standby databases from an rman backup set: clone and adapt the spfile, clone the password file, create a new instance and so on. this works great, no problems so far.

this afternoon, however, i got notified that the alert.log of a primary database was packed with messages of the following pattern: