Implementing CANCEL_SQL with the resource manager
November 2, 2009 3 Comments
this article is more a recipe on how to get CANCEL_SQL working for a dedicated use case than a discussion of oracle resource manager concepts and technology. but, however, as long as CANCEL_SQL can not be implemented without having at least a basic understanding of resource manager, be adviced to study the relevant oracle documentation (from 10g onwards) before running or even reintegrating the code below. It is important to learn that activating a resource manager plan is a system wide effect because there can only be one resource manager plan active at one defined point in time. additionally, others may also be granted to activate or switch resource manager plans. sounds like the big wheel and side effects en masse? no, not really, you just need to know where to plug in your resource requirements and how to adapt your application code accordingly.
plugging into the resource manager for CANCEL_SQL
without too much detailing resource manager concepts and technology please regard the following (application specific) terms and definitions: there shall always exist a dedicated resource manager plan being active. this plan may have subplans and/or consumer groups to manage every defined aspect of resource allocation. this is called a plan directive. in our case we do not regard subplans but different consumer groups that may be toggled when appropriate. that is, for a certain requirement, instead of switching the resource manager plan over and over, which implies a system-wide effect, one should better set up a specific consumer group and switch to/from it on a schema/session basis.
one may now intent to homebrew a structure as given above but this is not nessecary or even advicable. oracle already comes with a predefined resource manager plan and consumer groups right after installation that can be reused. namely you have SYSTEM_PLAN, as a resource manager plan that is just not being activated by default, and assigned consumer groups as for example:
- SYS_GROUP, for SYS/SYSTEM-based sessions,
- AUTO_TASK_CONSUMER_GROUP, for the automatically running database maintainance tasks, or
- OTHER_GROUP, the default for SYSTEM_PLAN.
this query shows you all resource consumer groups that exist in the database:
select * from dba_rsrc_consumer_groups;
the assignment of plan directives, that is subplans or consumer groups, to resource manager plans can be queried as follows:
select * from dba_rsrc_plan_directives order by plan, group_or_subplan, type;
ok, having done the theory, let’s set up some specific consumer group for implementing CANCEL_SQL as an extension of SYSTEM_PLAN by adding another consumer group directive. the following should be run as SYS and regards all the change management calls nessecary to alter the resource manager environment on a live database. it introduces the new consumer group MY_QUERY_GROUP and configures it to automatically switch to the predefined CANCEL_SQL state iff a sql-statements run longer than 10 seconds. do not care so much to band together this semantics with the method parameter names below. dbms_resource_manager.create_plan_directive serves a lot of purposes with a single signature.
begin dbms_resource_manager.clear_pending_area; dbms_resource_manager.create_pending_area; dbms_resource_manager.create_consumer_group( consumer_group => 'MY_QUERY_GROUP', comment => 'Setup of the MY_QUERY_GROUP cons-group for canceling long-running sql'); -- switch_time_in_call behaves like switch_time, but provides for pooled sessions/connections -- in multitier-environments, such a way, the original consumer group will be restored after -- finishing the top call -->> great stuff for a possible later employment - but did not work -- on testing with 10.2.0.3 dbms_resource_manager.create_plan_directive( plan => 'SYSTEM_PLAN', group_or_subplan => 'MY_QUERY_GROUP', comment => 'Integration of MY_QUERY_GROUP cons-group into SYSTEM_PLAN', switch_group => 'CANCEL_SQL', -- predefined, KILL_SESSION is also available switch_time => 10); -- in seconds (switch_time_in_call => 10) dbms_resource_manager.validate_pending_area; dbms_resource_manager.submit_pending_area; -- dbms_resource_manager.switch_plan( plan_name => 'SYSTEM_PLAN'); end; /
the last call above comprises dbms_resource_manager.switch_plan and actually activates the (system-wide) resource manager plan, iff nessecary. please note, that this setting is not persistent, i.e. will not endure the next instance bounce. for a persistent setting of the resource manager plan do employ:
ALTER SYSTEM SET resource_manager_plan = 'SYSTEM_PLAN' COMMENT = 'Activate this predefined plan to integrate the cons-group MY_QUERY_GROUP' SCOPE = both;
also, iff you later need to adopt another switch time do proceed like this:
begin dbms_resource_manager.clear_pending_area; dbms_resource_manager.create_pending_area; dbms_resource_manager.update_plan_directive( plan => 'SYSTEM_PLAN', group_or_subplan => 'MY_QUERY_GROUP', new_switch_time => 25); -- in seconds dbms_resource_manager.validate_pending_area; dbms_resource_manager.submit_pending_area; end; /
finally, the right to switch to/from a certain consumer group has to be granted to a schema, FREDDA, the Berghutze, in our case. take into regard, by the way, that there is no way to grant this to a role for later application with pl/sql, as usual.
begin dbms_resource_manager_privs.grant_switch_consumer_group( grantee_name => 'FREDDA', consumer_group => 'MY_QUERY_GROUP', grant_option => false); end; /
instrumenting the application code
after extending and activating SYSTEM_PLAN one may now employ the provided CANCEL_SQL functionality by switching the consumer group of a session with suspiciously (;-) long running sql. this time, however, we do not call any dbms_resource_manager method but restick to dbms_session. the following code may be regarded self-explanatory and should be compiled into as well as run from the FREDDA account being intruduced above. some sqlpus settings as
set timing on set serveroutput on
are also required for debugging output.
CREATE OR REPLACE procedure test_cancel_sql is v_query_grp varchar2(30) := 'MY_QUERY_GROUP'; v_empty_grp varchar2(30) := ''; v_rt number; v_cnt NUMBER; query_timeout exception; pragma exception_init(query_timeout, -00040); -- procedure switch is begin DBMS_SESSION.SWITCH_CURRENT_CONSUMER_GROUP(new_consumer_group => v_query_grp, old_consumer_group => v_empty_grp, initial_group_on_error => true); dbms_output.put_line('consumer group switch = old: ' || nvl(v_empty_grp, '*EMPTY*') || ', new: ' || v_query_grp); end; procedure switch_back is begin DBMS_SESSION.SWITCH_CURRENT_CONSUMER_GROUP(new_consumer_group => v_empty_grp, old_consumer_group => v_query_grp, initial_group_on_error => true); dbms_output.put_line('consumer group backswitch = old: ' || v_query_grp || ', new: ' || nvl(v_empty_grp, '*EMPTY*')); end; begin -- change the consumer group and activate the time limit until setting CANCEL_SQL switch(); v_rt := DBMS_UTILITY.GET_TIME; -- do a very long running thing (the following query does a cartesian product without -- any predicate and takes up significant cpu time) FOR i IN 1..100000 loop FOR j IN 1..100000 LOOP select count(*) into v_cnt from v$sysstat, v$system_event; END LOOP; END LOOP; -- clean up the switch to be able to execute another consumer group change in that session v_rt := (DBMS_UTILITY.GET_TIME - v_rt); switch_back(); dbms_output.put_line('runtime info: ' || round(v_rt/100, 2) || 'secs'); exception when query_timeout then dbms_output.put_line('sqlcode = ' || sqlcode || ', sqlerrm = ' || sqlerrm); switch_back(); end test_cancel_sql; /
now run the code within sqlplus and notice that the session initially comes up with consumer group OTHER_GROUP, a state of execution we derived from reusing SYSTEM_PLAN as expained above:
you may also consider testing the execution within jobs, because the job runtime sometimes imposes surprising restrictions on code evaluation. this is especially true, iff you try to run a collection of code granted from different schemata.
DECLARE job BINARY_INTEGER; BEGIN dbms_job.submit(job, 'test_cancel_sql;', TO_DATE('01-JAN-4000 12.00.00 AM', 'DD-MON-RRRR HH.MI.SS AM'), 'sysdate + 1/24/60/2', True); commit; END; /
CANCEL_SQL does not work at least before 10.1.0.5 (see bugs 6805380 as a duplicate of 2932210). the code above successfully runs on 10.2.0.3 and 10.2.0.4 windows, solaris and linux.
the availabilty of CANCEL_SQL within the wider context of resource manager is really great stuff. you just identify your resource requirements and plug in/configure them into resource manager, integrating with others resource requirements on a defined basis. reusing predefined resource manager components such as SYSTEM_PLAN or OTHER_GROUP comes in handy because you do not have to reinvent system defaults or fallbacks and stuff.
i can not explain why things like CANCEL_SQL do lie buggy for a whole 10gR1 sub-release period. does nobody use it?
just found a blog entry from Martin Ruthner as http://www.pythian.com/news/2740/oracle-limiting-query-runtime-without-killing-the-session. he uses another approach in direcly mapping an oracle schema to a consumer group. this is also possible and actually the the way SYS and SYSTEM become connected to the SYS_GROUP (see dba_rsrc_group_mappings). however, this article also shows how much it pays off to plug into SYSTEM_PLAN and not unnessesarily reinvent an own homebrew plan.
11g: there is no plan identifier SYSTEM_PLAN any longer, the new name is DEFAULT_PLAN. however, the semantics seems to stay the same in that this basic default plan just prioritizes SYS_GROUP operations.