The look of : transaction deadlock due to missing foreign key index

This about transaction deadlocks resulting from an index missing on foreign key colum(s). Actually, the post will not dive into the odds resulting from this failure but show how to detect this dedicated reason for an “ORA-00060: deadlock detected while waiting for resource” message. In particular, what to look out for in the trace file and how to find the table objects (instead of rows) being deadlocked against each other. The best resource on the web tackling this subject is of course (Reading deadlock trace files) from Tom Kyte, which also guided me along the way of a “no row deadlock”, see below.
Ok, on first place, some tool, Enterprise Manager for example, or whatever will notify you about entries in the alert.log, reading roughly the like as “Sun May 29 02:51:07 2016 ORA-00060: Deadlock detected. More info in file H:\ORACLE\diag\rdbms\got\got\trace\got_ora_4536.trc“. Inspecting the trace file, you should focus your attention on the first 70 lines, in most cases neglegting the trailing process information dumps. I got this, anything noteworthy will be highlighted and discussed below:

Oracle Database 11g Enterprise Edition Release - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows NT Version V6.1 Service Pack 1 
CPU                 : 4 - type 8664, 4 Physical Cores
Process Affinity    : 0x0x0000000000000000
Memory (Avail/Total): Ph:3493M/16383M, Ph+PgF:19457M/32764M 
VM name             : VMWare Version (6)
Instance name: got
Redo thread mounted by this instance: 1
Oracle process number: 167
Windows thread id: 6960, image: ORACLE.EXE (J001)

*** 2016-12-09 12:16:58.820
*** SESSION ID:(447.46153) 2016-12-09 12:16:58.820
*** CLIENT ID:() 2016-12-09 12:16:58.820
*** SERVICE NAME:(SYS$USERS) 2016-12-09 12:16:58.820
*** MODULE NAME:(srch_queue_mgr.set_iu_event) 2016-12-09 12:16:58.820
*** ACTION NAME:(BLD) 2016-12-09 12:16:58.820

*** 2016-12-09 12:16:58.820
[Transaction Deadlock]
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
                     ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name        process session holds waits  process session holds waits
TM-0001e2cf-00000000     167     447    SX   SSX      172      38    SX   SSX
TM-0001e2cf-00000000     172      38    SX   SSX      167     447    SX   SSX
session 447: DID 0001-00A7-004D3814	session 38: DID 0001-00AC-001DCB08 
session 38: DID 0001-00AC-001DCB08	session 447: DID 0001-00A7-004D3814 
Rows waited on:
  Session 447: no row
  Session 38: no row
----- Information for the OTHER waiting sessions -----
Session 38:
  sid: 38 ser: 53179 audsid: 319758247 user: 70/HUGO
    flags: (0x8010041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
    flags2: (0x40009) -/-/INC
  pid: 172 O/S info: user: svc_dbprd, term: DBPRD, ospid: 6104
    image: ORACLE.EXE (J000)
  client details:
    O/S info: user: svc_dbprd, term: DBPRD, ospid: 6104
    machine: DBPRD program: ORACLE.EXE (J000)
    application name: srch_queue_mgr.set_iu_event, hash value=2326741934
    action name: BLD, hash value=1673980416
  current SQL:
  DELETE FROM thattable WHERE thattable_ID = :B1 
----- End of information for the OTHER waiting sessions -----
Information for THIS session:

----- Current SQL Statement for this session (sql_id=06sfjafxypyh6) -----
DELETE FROM thattable WHERE thattable_ID = :B1 
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
00000003E71A8CC8       479  package body BLA.MIG
00000003E71A8CC8       511  package body BLA.MIG
00000003E1D4D348         1  anonymous block
  • 28-32: the deadlock graph depicting processes/sessions and objects in the scenario
  • 37-39: the weird thing, a deadlock with no rows incorporated, how come?
  • 41: other ssesions wrap up, may have more that just one entry like here, regard the sql
  • 58: this sessions info, sql given again, along with a pl/sql call stack

What now? Since we have a “no row deadlock” in 37-39, the deadlock appears to be schema- but not data-related. Further on, although we noticed the (current sessions) sql in the trace as of 41 and 58, the deadlock may not necessarily have happenend to the named table objects, thattable here, but to tables below in some call stack. The most important next step is therefore to evaluate the “resource name”s given in the the deadlock graph (28-32), i.e. map these names to actual table object names as follows.

-- use the middle part of TM-0001e2cf-00000000
SQL> select object_name from user_objects
 where object_id = to_number('0001e2cf','xxxxxxxx');

Ok, knowing that thistable holds a foreign key to thattable, a concurrent delete on thattable from at least two sessions will concurrently lock data in thistable. For an unindexed foreign key up to thattable, any delete (and updates to) will lock any data in thistable completely whatsoever and its just a matter of time when locking timespans may overlap. On the other hand, just not the case here, redundant (function-based) unique key indexes is another pitfall spotted often.

Being curious to reproduce the problem in a straightforward way, I adopted and executed an example given by Tom Kyte in the article mentioned above like this (#2# identifies the second, concurrent session). Great to see, how problems become manageable (and avoidable) when there is a simple script of reproduction.

<CODE@HOST> create table p ( x int primary key );
<CODE@HOST> create table c ( x references p );
-- <CODE@HOST> create index c_idx on c(x); -- leave it == be deadlocked
<CODE@HOST> insert into p select rownum from all_users where rownum <= 2;
<CODE@HOST> insert into c select * from p;
<CODE@HOST> commit;
<CODE@HOST> select * from p;
<CODE@HOST> select * from c;
<CODE@HOST> delete from c where x = 1;
#2# <CODE@HOST> delete from c where x = 2;
#2# <CODE@HOST> delete from p where x = 2;
<CODE@HOST> delete from p where x = 1;
#2# delete from p where x = 2
#2#             *
#2# FEHLER in Zeile 1:
#2# ORA-00060: Deadlock beim Warten auf Ressource festgestellt
#2# <CODE@HOST> commit; -- unblock #1#
<CODE@HOST> commit;
<CODE@HOST> select * from p;
<CODE@HOST> select * from c;

Regards, Peter


Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s