Simulating and resolving a dataguard archivelog gap


I was wondering what will happen iff some archivelog gap really happens in a dataguard environment. How to resolve it? What has to be done manually and what happens automatically due to the remote file service (fal), for example.

Well, my attempt in producing the problem goes like this:

  • verify normal operation by doing manual log switches on primary and watching the alert.log on primary and standby what happens
  • defering the remote log destination (usually #2) on primary
  • doing another manual log switches on primary that do now, don’t get shipped to standby
  • backup primary with removing all backed up archive logs from the recovery destination
  • bounce primary and standby
  • activating the remote log destination (usually #2) on primary again, doing another manual log switches on primary, watching the alert.log on primary and standby what happens

Ok, let’s follow the alert.log on primary and standby …

When standby comes up it finds logseq #83 and #84 to recover and waits for #85.

...
Media Recovery Log E:\ORADATA\DGS\ARCH\ARC00083_0733948503.001
Media Recovery Log E:\ORADATA\DGS\ARCH\ARC00084_0733948503.001
Media Recovery Waiting for thread 1 sequence 85 (in transit)
...

However, primary is already far further with log seq’s,up to #92 with a defered remote destination and now up to #95 with a reactivated remote destination.

...
Thread 1 advanced to log sequence 92 (LGWR switch)
  Current log# 1 seq# 92 mem# 0: J:\ORADATA\DGP\REDO\REDO01.LOG
  Current log# 1 seq# 92 mem# 1: J:\ORADATA\DGP\REDO\REDO01B.LOG
...
Thread 1 advanced to log sequence 95 (LGWR switch)
  Current log# 1 seq# 95 mem# 0: J:\ORADATA\DGP\REDO\REDO01.LOG
  Current log# 1 seq# 95 mem# 1: J:\ORADATA\DGP\REDO\REDO01B.LOG
...

Of course, standby already gets aware of this. It firstly finds a gap of #85 to #94 but then notifies #93 and #94 being present though, correcting the warning message.

...
Fetching gap sequence in thread 1, gap sequence 85-94
...
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00093_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00094_0733948503.001'
Fetching gap sequence in thread 1, gap sequence 85-92
...

Standby tries to get the missing archive logs but to no avail.

...
FAL[client]: Failed to request gap sequence 
 GAP - thread 1 sequence 85-92
 DBID 1647279508 branch 733948503
FAL[client]: All defined FAL servers have been attempted.
...

Primary has meanwhile proceeded to log seq #98

...
Thread 1 advanced to log sequence 98 (LGWR switch)
  Current log# 1 seq# 98 mem# 0: J:\ORADATA\DGP\REDO\REDO01.LOG
  Current log# 1 seq# 98 mem# 1: J:\ORADATA\DGP\REDO\REDO01B.LOG
...

Looking up the archive log repository with the control file on standby shows the following, which clearly states the log seq’s gap between #85 and #92.

SQL> SELECT * FROM (
  2    SELECT dest_id, standby_dest, fal, CREATOR, REGISTRAR, applied, sequence#,
  3      to_char(completion_time, 'dd.mm.yyyy hh24:mi:ss')
  4    FROM v$archived_log
  5    ORDER BY recid DESC
  6  ) WHERE rownum < 26;

   DEST_ID STA FAL CREATOR REGISTR APP  SEQUENCE# TO_CHAR(COMPLETION_
---------- --- --- ------- ------- --- ---------- -------------------
         1 NO  NO  ARCH    RFS     NO          97 02.11.2010 10:01:37
         1 NO  NO  ARCH    RFS     NO          96 02.11.2010 10:01:26
         2 NO  YES ARCH    RFS     NO          94 02.11.2010 10:00:51
         2 NO  YES ARCH    RFS     NO          93 02.11.2010 10:00:51
         1 NO  NO  ARCH    RFS     NO          95 02.11.2010 10:00:47
         1 NO  NO  ARCH    RFS     YES         84 02.11.2010 09:43:58
         1 NO  NO  ARCH    RFS     YES         83 02.11.2010 09:43:54
         2 YES NO  FGRD    FGRD    NO          82 02.11.2010 09:43:32
         1 NO  NO  FGRD    RFS     YES         82 02.11.2010 09:43:32
         2 YES NO  FGRD    FGRD    NO          81 02.11.2010 09:43:16
         1 NO  NO  FGRD    FGRD    YES         81 02.11.2010 09:43:16
         2 YES NO  ARCH    ARCH    YES         80 02.11.2010 09:42:52
         1 NO  NO  ARCH    ARCH    YES         80 02.11.2010 09:42:52
         2 YES NO  LGWR    LGWR    YES         79 02.11.2010 09:42:47
         1 NO  NO  ARCH    ARCH    YES         79 02.11.2010 09:42:47
         2 YES NO  LGWR    LGWR    YES         78 02.11.2010 09:35:30
         1 NO  NO  FGRD    FGRD    YES         78 02.11.2010 09:35:30
         2 YES NO  LGWR    LGWR    YES         77 02.11.2010 09:34:52
         1 NO  NO  FGRD    FGRD    YES         77 02.11.2010 09:34:52
         2 YES NO  LGWR    LGWR    YES         76 02.11.2010 09:34:06
         1 NO  NO  FGRD    FGRD    YES         76 02.11.2010 09:34:06
         1 NO  NO  ARCH    ARCH    YES         75 02.11.2010 09:32:04
         2 YES NO  LGWR    LGWR    YES         75 02.11.2010 09:31:59
         2 YES YES ARCH    ARCH    YES         74 02.11.2010 09:31:48
         1 NO  NO  ARCH    ARCH    YES         74 02.11.2010 09:31:48

Weird! Log transfer is still active in the environment but apply on standby does not proceed. This is due to the backup with the delete archive log option on primary! The necessary archive logs have just vanished and can no longer be shipped for log gap resolution. What else to do than simply restoring the needed archive logs to the standard recovery destination on primary …

RESTORE ARCHIVELOG FROM SEQUENCE 85 UNTIL SEQUENCE 92;

And, what wonder, standby immediately gets and applies the restored objects as well as the newer archive logs.

...
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00085_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00086_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00087_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00088_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00089_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00090_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00091_0733948503.001'
RFS[4]: Archived Log: 'E:\ORADATA\DGS\STBARCH\ARC00092_0733948503.001'
...
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00085_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00086_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00087_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00088_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00089_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00090_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00091_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00092_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00093_0733948503.001
Media Recovery Log E:\ORADATA\DGS\STBARCH\ARC00094_0733948503.001
Media Recovery Log E:\ORADATA\DGS\ARCH\ARC00095_0733948503.001
Media Recovery Log E:\ORADATA\DGS\ARCH\ARC00096_0733948503.001
Media Recovery Log E:\ORADATA\DGS\ARCH\ARC00097_0733948503.001
Media Recovery Waiting for thread 1 sequence 98 (in transit)
...

From that on, the day is sunny again on standby.

...
Primary database is in MAXIMUM PERFORMANCE mode
RFS[3]: Successfully opened standby log 4: 'E:\ORADATA\DGS\STBREDO\STBREDO04A.LOG'
Tue Nov 02 11:13:38 2010
Media Recovery Log E:\ORADATA\DGS\ARCH\ARC00103_0733948503.001
Media Recovery Waiting for thread 1 sequence 104 (in transit)
...

Another look up of the archive log repository with the control file on standby again shows

  • a clean state: column APP=YES, for applied, as well as
  • the remote fetch: column FAL=YES, for log seq’s between #85 and #92.
   DEST_ID STA FAL CREATOR REGISTR APP  SEQUENCE# TO_CHAR(COMPLETION_
---------- --- --- ------- ------- --- ---------- -------------------
         1 NO  NO  ARCH    RFS     YES        103 02.11.2010 11:13:36
         1 NO  NO  ARCH    RFS     YES        102 02.11.2010 10:58:37
         1 NO  NO  ARCH    RFS     YES        101 02.11.2010 10:43:36
         1 NO  NO  ARCH    RFS     YES        100 02.11.2010 10:28:36
         1 NO  NO  ARCH    RFS     YES         99 02.11.2010 10:13:37
         1 NO  NO  ARCH    RFS     YES         98 02.11.2010 10:13:34
         2 NO  YES ARCH    RFS     YES         92 02.11.2010 10:11:45
         2 NO  YES ARCH    RFS     YES         91 02.11.2010 10:11:45
         2 NO  YES ARCH    RFS     YES         90 02.11.2010 10:11:45
         2 NO  YES ARCH    RFS     YES         89 02.11.2010 10:11:45
         2 NO  YES ARCH    RFS     YES         88 02.11.2010 10:11:45
         2 NO  YES ARCH    RFS     YES         87 02.11.2010 10:11:45
         2 NO  YES ARCH    RFS     YES         86 02.11.2010 10:11:45
         2 NO  YES ARCH    RFS     YES         85 02.11.2010 10:11:45
         1 NO  NO  ARCH    RFS     YES         97 02.11.2010 10:01:37
         1 NO  NO  ARCH    RFS     YES         96 02.11.2010 10:01:26
         2 NO  YES ARCH    RFS     YES         94 02.11.2010 10:00:51
         2 NO  YES ARCH    RFS     YES         93 02.11.2010 10:00:51
         1 NO  NO  ARCH    RFS     YES         95 02.11.2010 10:00:47
         1 NO  NO  ARCH    RFS     YES         84 02.11.2010 09:43:58
         1 NO  NO  ARCH    RFS     YES         83 02.11.2010 09:43:54
         2 YES NO  FGRD    FGRD    NO          82 02.11.2010 09:43:32
         1 NO  NO  FGRD    RFS     YES         82 02.11.2010 09:43:32
         2 YES NO  FGRD    FGRD    NO          81 02.11.2010 09:43:16
         1 NO  NO  FGRD    FGRD    YES         81 02.11.2010 09:43:16

Interestingly, the fal-fetch targets the standby-arch-location on standby, iff you do use such an extra directoriy, and registeres this behaviour with column DEST_ID=2.

Cool stuff! Have fun.

Advertisements

One comment

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s