Thursday, May 22, 2008

SQL Merge and Sequence gaps

I hope You my dear reader isn't one of the people cursed by requirement for gapless sequences ;)
Yesterday I just found another nail in the coffin for oracle sequence objects used as gapless sequence generators. Of course there are many causes not to do that, just like simple rollback, users changing their minds, network connection error, flushing cache etc. But I didn't know at least till yesterday, that MERGE statements generate gaps with quite big guarantee. It seems that Merge statement is generating the same number of calls for sequence next value as row count of source table or subquery.
Let's look at example:

CREATE TABLE target (pk number, data varchar2(10));
CREATE TABLE source (pk_s number, data_s varchar2(10));
INSERT INTO target VALUES (1, 'a');
INSERT INTO target VALUES (3, 'c');
INSERT INTO source VALUES (1, 'a');
INSERT INTO source VALUES (2, 'b');
INSERT INTO source VALUES (3, 'c');
INSERT INTO source VALUES (4, 'd');
COMMIT;
CREATE SEQUENCE seq START WITH 5;

So now we have 2 rows in target table, and 4 rows in source table. We'll compare them using data and data_s column. We'd insert only two rows, because another 2 already exists and even won't use WHEN MATCHED clause for update.

MERGE INTO target
USING source
ON (data = data_s)
WHEN NOT MATCHED THEN
INSERT VALUES (seq.nextval, data_s);
SQL> SELECT * FROM target;
PK DATA
---------- ----------
1 a
3 c
7 d
8 b

Although the sequence should have started from 5 new pk values are 7 and 8. Let's add another row into source and look what happens:

INSERT INTO source VALUES (5, 'e');
MERGE INTO target
USING source
ON (data = data_s)
WHEN NOT MATCHED THEN
INSERT VALUES (seq.nextval, data_s);
SQL> SELECT * FROM target;
PK DATA

---------- ----------
1 a
3 c
7 d
8 b
13 e

So now the value is 13 i.e. 8 + count of rows from source table (5). Now let's add another row and delete all previous.

INSERT INTO source VALUES (6, 'f');
DELETE source WHERE pk_s <=5;
MERGE INTO target
USING source
ON (data = data_s)
WHEN NOT MATCHED THEN
INSERT VALUES (seq.nextval, data_s);
SQL> SELECT * FROM target;
PK DATA

---------- ----------
1 a
3 c
7 d
8 b
13 e
14 f

As we can see pk is incremented only by 1.
After my experiments I found also metalink note "Merge Increments SEQUENCE.NEXTVAL for Both Insert and Update" (Note:554656.1), which also assures this as expected situation and not bug. So another reason not to worry about sequence values, just uniqueness of them ;)

2 comments:

Edgar Chupit said...

Good morning Gints,
We all know that MERGE is internally converted into OUTER JOIN and MERGE (INSERT/UPDATE). So I think, this is just one of the side effects of this conversion. Just compare the output of two following explain plans:

SQL> explain plan for
2 MERGE INTO target
3 USING source
4 ON (data = data_s)
5 WHEN NOT MATCHED THEN
6 INSERT VALUES (seq.nextval, data_s);

Explained.

SQL> @?/rdbms/admin/utlxpls
Plan hash value: 11443630

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 4 | 108 | 5 (0)| 00:00:01 |
| 1 | MERGE | TARGET | | | | |
| 2 | VIEW | | | | | |
| 3 | SEQUENCE | SEQ | | | | |
| 4 | NESTED LOOPS OUTER| | 4 | 208 | 5 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL| SOURCE | 4 | 80 | 2 (0)| 00:00:01 |
|* 6 | TABLE ACCESS FULL| TARGET | 1 | 32 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------

SQL> explain plan for
2 select t.*, seq.nextval
3 from (
4 select /*+ use_nl(s,t) */ *
5 from source s, target t
6 where s.pk_s = t.pk(+)
7 ) t
8 /

Explained.

SQL> @?/rdbms/admin/utlxpls
Plan hash value: 243054123

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 160 | 5 (0)| 00:00:01 |
| 1 | SEQUENCE | SEQ | | | | |
| 2 | NESTED LOOPS OUTER| | 4 | 160 | 5 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| SOURCE | 4 | 80 | 2 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| TARGET | 1 | 20 | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------

Of course, in this case, Oracle could see, that MERGE statement does not have UPDATE clause, but only insert and could transform outer join statement to something like this:
select t.*, seq.nextval
from (
select /*+ use_nl(s,t) */ *
from source s, target t
where s.pk_s = t.pk(+)
and t.pk is null
) t

But apparently, for some reason, Oracle does not do this...

PS: I’ve used use_nl in select statement to demonstrate the same plan, on my setup, select was tend to do hash join instead of nested loop.
PSS: I’ve checked the same queries on 11g, the results are the same. It doesn’t look like Oracle is trying to improve this part or as I said before, there could be a good reason for not doing so :)

Have a nice day,
Edgar

Gints Plivna said...

Hello!
You are right of course. My main point was to show another reason not to expect sequences to be gapless :)
However I experimented a bit more - it seems that if there isn't MERGE INSERT part then oracle is doing INNER JOIN (not important hash or nested loops) but in case the MERGE INSERT part is present then it switches to OUTER JOIN.
See below (for both clauses and just insert clause HASH JOIN OUTER, for just update clause HASH JOIN)
SQL> set autot on explain
SQL> MERGE INTO target
2 USING source
3 ON (data = data_s)
4 WHEN MATCHED THEN UPDATE SET pk = pk_s
5 WHEN NOT MATCHED THEN
6 INSERT VALUES (seq.nextval, data_s)
7 /

4 rows merged.


Execution Plan
----------------------------------------------------------
Plan hash value: 1520567312

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 4 | 100 | 7 (15)| 00:00:01 |
| 1 | MERGE | TARGET | | | | |
| 2 | VIEW | | | | | |
| 3 | SEQUENCE | SEQ | | | | |
|* 4 | HASH JOIN OUTER | | 4 | 40 | 7 (15)| 00:00:01 |
| 5 | TABLE ACCESS FULL| SOURCE | 4 | 20 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL| TARGET | 2 | 10 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("DATA"(+)="DATA_S")

SQL> ed
Wrote file afiedt.buf

1 MERGE INTO target
2 USING source
3 ON (data = data_s)
4 --WHEN MATCHED THEN UPDATE SET pk = pk_s
5 WHEN NOT MATCHED THEN
6* INSERT VALUES (seq.nextval, data_s)
SQL> /

0 rows merged.


Execution Plan
----------------------------------------------------------
Plan hash value: 1520567312

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 4 | 48 | 7 (15)| 00:00:01 |
| 1 | MERGE | TARGET | | | | |
| 2 | VIEW | | | | | |
| 3 | SEQUENCE | SEQ | | | | |
|* 4 | HASH JOIN OUTER | | 4 | 40 | 7 (15)| 00:00:01 |
| 5 | TABLE ACCESS FULL| SOURCE | 4 | 20 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL| TARGET | 2 | 10 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("DATA"(+)="DATA_S")

SQL> ed
Wrote file afiedt.buf

1 MERGE INTO target
2 USING source
3 ON (data = data_s)
4 WHEN MATCHED THEN UPDATE SET pk = pk_s
5 --WHEN NOT MATCHED THEN
6* --INSERT VALUES (seq.nextval, data_s)
SQL> /

4 rows merged.


Execution Plan
----------------------------------------------------------
Plan hash value: 2275125706

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 2 | 32 | 7 (15)| 00:00:01 |
| 1 | MERGE | TARGET | | | | |
| 2 | VIEW | | | | | |
|* 3 | HASH JOIN | | 2 | 20 | 7 (15)| 00:00:01 |
| 4 | TABLE ACCESS FULL| TARGET | 2 | 10 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL| SOURCE | 4 | 20 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("DATA"="DATA_S")