Thursday, October 25, 2007

Hints "stronger" than db parameters?

Actually I've tested only one parameter so the plural in title is questionable generalization :)

Recently battling with ORA-04030: out of process memory errors I found at least for me quite interesting fact.
USE_HASH hint is more powerful than parameter _hash_join_enabled = false (for 10g) or hash_join_enabled = false (for 9i).
I was quite suprised because without this test case I'd thought completely opposite. Of course neither hash_join_enabled nor _hash_join_enabled should be often altered to different value than default "true", but in case of "false" and your code having hints use_hash, you'll have hash joins regardless of above mentioned db parameter setting. It might be quite nasty surprise in case you have eliminated hash joins because of possible bugs, for example.
I've tested this behaviour for 10.2.0.1.0 SE and 10.2.0.1.0 EE on Windows as well as 10.2.0.3.0 SE on AIX and 9.2.0.7.0 EE on windows, all had the same behaviour.
Here is my test case and 10046 event trace execution plan results on 10.2.0.1.0 SE/Windows (all other tests were similar):


SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Release 10.2.0.1.0 - Production
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> create table a as select rownum rn, object_name
2 from all_objects;

Table created.

SQL> alter session set events '10046 trace name context forever, level 1';

Session altered.

SQL> select count(*) from (
2 select *
3 from a a1, a a2
4 where a1.rn = a2.rn);

COUNT(*)
----------
60141

Execution plan:
STAT #24 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=556 pr=73 pw=0 time=212727 us)'
STAT #24 id=2 cnt=60141 pid=1 pos=1 obj=0 op='HASH JOIN (cr=556 pr=73 pw=0 time=406508 us)'
STAT #24 id=3 cnt=60141 pid=2 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=73 pw=0 time=61652 us)'
STAT #24 id=4 cnt=60141 pid=2 pos=2 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=60192 us)'


SQL> alter session set "_hash_join_enabled" = false;

Session altered.

SQL> select count(*) cnt from (
2 select *
3 from a a1, a a2
4 where a1.rn = a2.rn);

CNT
----------
60141

Execution plan:
STAT #12 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=556 pr=0 pw=0 time=374302 us)'
STAT #12 id=2 cnt=60141 pid=1 pos=1 obj=0 op='MERGE JOIN (cr=556 pr=0 pw=0 time=470826 us)'
STAT #12 id=3 cnt=60141 pid=2 pos=1 obj=0 op='SORT JOIN (cr=278 pr=0 pw=0 time=88667 us)'
STAT #12 id=4 cnt=60141 pid=3 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=44 us)'
STAT #12 id=5 cnt=60141 pid=2 pos=2 obj=0 op='SORT JOIN (cr=278 pr=0 pw=0 time=231092 us)'
STAT #12 id=6 cnt=60141 pid=5 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=60174 us)'

SQL> select count(*) cnt from (
2 select /*+ full(a1) full(a2) use_hash(a1 a2)*/ *
3 from a a1, a a2
4 where a1.rn = a2.rn);

CNT
----------
60141

Execution plan:
STAT #24 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=556 pr=73 pw=0 time=214116 us)'
STAT #24 id=2 cnt=60141 pid=1 pos=1 obj=0 op='HASH JOIN (cr=556 pr=73 pw=0 time=399462 us)'
STAT #24 id=3 cnt=60141 pid=2 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=73 pw=0 time=60803 us)'
STAT #24 id=4 cnt=60141 pid=2 pos=2 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=60188 us)'

Wednesday, October 17, 2007

All latvian Oracle (and not only) users - UNITE!

I'd like to announce two resources for people that like relational databases and speak Latvian.
The first one is discussion group for Latvian Oracle User Group. The near future plans are described here (in Latvian).
If you speak Latvian and work with Oracle then it is your duty :) to join it! Of course one can join even if he doesn't speak Latvian, however there will be a small problem - everyone will understand you, but it is doubtful that you will understand anything except some keywords like Oracle, SQL, etc :)

The second resource is mine. And it is more general, it is blog about relational databases (in Latvian).
Recently I've searched Internet and was very unpleasantly surprised - I've found very few resources devoted to relational databases. So not to make only complaints I've started to blog about relational databases in my mother tongue. I'm of course not living in illusions that I can do it only myself and therefore anyone that is at least a little bit eager to help me in this process and has some knowledge either in general SQL, Oracle, MS SQL Server, MySQL, PosgreSQL or another relational db is encouraged to join me and become author in this blog.

Tuesday, October 02, 2007

Natural joins are evil

Motto:
If you like time bombs use them ;)

During my teaching courses I once again stumble upon ANSI SQL NATURAL JOIN syntax. And once again I suspect that creators of them had in mind something like how to write unmaintainable code. Or probably they liked time bombs ;)
OK for those unfamiliar with Natural joins let's firstly see what they mean:
"A natural join is based on all columns in the two tables that have the same name. It selects rows from the two tables that have equal values in the relevant columns."
So it joins two tables if column names are the same. There isn't any explicit join condition. And here is the biggest problem. The work has been done behind the scenes, and that is one of the biggest advantages of writing unmaintainable code.
So let's assume we have following tables:
CREATE TABLE persons (
person_id NUMBER(10) NOT NULL,
address_id NUMBER(10) NOT NULL,
surname VARCHAR2(40) NOT NULL,
first_name VARCHAR2(40) NOT NULL,
remarks VARCHAR2(1000),
CONSTRAINT person_pk PRIMARY KEY (person_id));


CREATE TABLE addresses (
address_id NUMBER(10) NOT NULL,
country VARCHAR2(40) NOT NULL,
city VARCHAR2(40),
text VARCHAR2(100) NOT NULL,
CONSTRAINT address_pk PRIMARY KEY (address_id));

ALTER TABLE persons ADD CONSTRAINT person_address_fk
FOREIGN KEY (address_id) REFERENCES addresses (address_id);

Lets add a few rows:

INSERT INTO addresses
VALUES (1, 'LATVIA', 'RIGA', 'BRIVIBAS STREET 100');

INSERT INTO persons
VALUES (1, 1, 'PLIVNA', 'GINTS', 'Actually he lives in another place');

So what would be the SQL query to get persons and their countries using NATURAL JOIN syntax?

SQL> SELECT surname, first_name, country
2 FROM persons
3 NATURAL JOIN addresses;
SURNAME

----------------------------------------
FIRST_NAME
----------------------------------------
COUNTRY
----------------------------------------
PLIVNA
GINTS
LATVIA

1 row selected.

SQL>

Nice one - we got one row as expected.
So assume times goes by and as per our new requirements we have to add some description field also in addresses table. To be consistent and following practices that differ from how to write unmaintainable code we use the same column name and data type as in persons.

ALTER TABLE addresses ADD remarks VARCHAR2(1000);

So what our query is showing now?

SQL> SELECT surname, first_name, country
2 FROM persons
3 NATURAL JOIN addresses;

no rows selected

SQL>

WoW! Where the row has gone? We didn't change our data!
Yeahhhh, everything works as expected. We added column remarks with the same name as in persons table and now join is being done using both columns. And of course data in persons.remarks column doesn't match with addresses.remarks. So our time bomb has blown up and we silently introduced a nasty bug in our system. Code is as valid as before from the compilation viewpoint, so we won't get compilation errors. It simply gives wrong result. Invalid SQL would be much better, then we'd spotted bug much easier, but this time we got exactly very nasty and in the beginning absolutely unclear bug.

There are also other things to consider:
1) natural joins between two tables without at least one common column results in cartesian product.
CREATE TABLE numbers AS
SELECT rownum id
FROM all_source;

Lets see how many rows table numbers have and how many rows we got for natural join between them:
SQL> SELECT count(*) FROM numbers;
COUNT(*)

----------
549290

1 row selected.

SQL> SELECT count(*) FROM (
2 SELECT * FROM persons
3 NATURAL JOIN numbers);
COUNT(*)
----------
549290

1 row selected.

SQL>

2) with different data types but the same column names there might be type conversion problems and quite strange error messages (see that it is NATURAL OUTER LEFT or RIGHT JOIN):

ALTER TABLE addresses MODIFY remarks NUMBER (15);

SQL> SELECT surname, first_name, country
2 FROM persons
3 NATURAL LEFT JOIN addresses;
SURNAME

----------------------------------------
FIRST_NAME
----------------------------------------
COUNTRY
----------------------------------------
PLIVNA
GINTS

1 row selected.

SQL> ed
Wrote file afiedt.buf
1 SELECT surname, first_name, country

2 FROM persons
3* NATURAL RIGHT JOIN addresses
SQL> /
SELECT surname, first_name, country
*
ERROR at line 1:
ORA-01722: invalid number

SQL>

So what to do?
1) Do not use this syntax. This is the most efficient and most obvious solution.
2) Use the same schema object naming conventions as me :) Natural joins simply never work using my favourite naming conventions.

Similar threads:
1) Natural Join and Index Monitoring in asktom.
2) Thread should one use ANSI join syntax when writing an Oracle application from Oracle-l list.