Saturday, December 08, 2007

Oracle wiki

I've thought for a while about creating an article about the resources available in Oracle problem solving process. Meanwhile Oracle wiki was born and already exists for about two months. As it is supposed to be shared and editable resource by everyone in Oracle community I thought that it would be a perfect place for my eventual article for two reasons:
1) more people hopefully will read it and probably get some help;
2) more people can add their suggestions how they solve Oracle related problems.
So my first contribution to Oracle wiki is How to solve Oracle Database related problems. Everyone is invited to correct it and/or add some other resources. As well as for other Oracle wiki pages of course.

Monday, November 26, 2007

How to choose database

Most of us (I mean readers of Oracle related blogs) most probably have already decided which DBMS to use. However the world is changing and sooner or later either you'll have to choose a new DBMS or defend your favourite one.
Quite often we hear such argument like DB X is cheaper than DB Y, therefore we have to take this one. And cheaper usually means only license cost. However license cost is only one part of direct expenses not to speak even about indirect expenses.
So i've written an article trying to explain more than 10 criteria which should be analyzed before final decision. Article contains also a list of more than 30 databases with links to their sites one can choose from.

Sunday, November 04, 2007

The curse of gapless sequences

Gapless sequences are one of the best examples of bad requirements. Initially it was thought as an article for my blog but the result was too big and I've put it on my site.

Thursday, October 25, 2007

Hints "stronger" than db parameters?

Actually I've tested only one parameter so the plural in title is questionable generalization :)

Recently battling with ORA-04030: out of process memory errors I found at least for me quite interesting fact.
USE_HASH hint is more powerful than parameter _hash_join_enabled = false (for 10g) or hash_join_enabled = false (for 9i).
I was quite suprised because without this test case I'd thought completely opposite. Of course neither hash_join_enabled nor _hash_join_enabled should be often altered to different value than default "true", but in case of "false" and your code having hints use_hash, you'll have hash joins regardless of above mentioned db parameter setting. It might be quite nasty surprise in case you have eliminated hash joins because of possible bugs, for example.
I've tested this behaviour for SE and EE on Windows as well as SE on AIX and EE on windows, all had the same behaviour.
Here is my test case and 10046 event trace execution plan results on SE/Windows (all other tests were similar):

SQL> select * from v$version;
Oracle Database 10g Release - Production
PL/SQL Release - Production
CORE Production
TNS for 32-bit Windows: Version - Production
NLSRTL Version - Production

SQL> create table a as select rownum rn, object_name
2 from all_objects;

Table created.

SQL> alter session set events '10046 trace name context forever, level 1';

Session altered.

SQL> select count(*) from (
2 select *
3 from a a1, a a2
4 where a1.rn = a2.rn);


Execution plan:
STAT #24 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=556 pr=73 pw=0 time=212727 us)'
STAT #24 id=2 cnt=60141 pid=1 pos=1 obj=0 op='HASH JOIN (cr=556 pr=73 pw=0 time=406508 us)'
STAT #24 id=3 cnt=60141 pid=2 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=73 pw=0 time=61652 us)'
STAT #24 id=4 cnt=60141 pid=2 pos=2 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=60192 us)'

SQL> alter session set "_hash_join_enabled" = false;

Session altered.

SQL> select count(*) cnt from (
2 select *
3 from a a1, a a2
4 where a1.rn = a2.rn);


Execution plan:
STAT #12 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=556 pr=0 pw=0 time=374302 us)'
STAT #12 id=2 cnt=60141 pid=1 pos=1 obj=0 op='MERGE JOIN (cr=556 pr=0 pw=0 time=470826 us)'
STAT #12 id=3 cnt=60141 pid=2 pos=1 obj=0 op='SORT JOIN (cr=278 pr=0 pw=0 time=88667 us)'
STAT #12 id=4 cnt=60141 pid=3 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=44 us)'
STAT #12 id=5 cnt=60141 pid=2 pos=2 obj=0 op='SORT JOIN (cr=278 pr=0 pw=0 time=231092 us)'
STAT #12 id=6 cnt=60141 pid=5 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=60174 us)'

SQL> select count(*) cnt from (
2 select /*+ full(a1) full(a2) use_hash(a1 a2)*/ *
3 from a a1, a a2
4 where a1.rn = a2.rn);


Execution plan:
STAT #24 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=556 pr=73 pw=0 time=214116 us)'
STAT #24 id=2 cnt=60141 pid=1 pos=1 obj=0 op='HASH JOIN (cr=556 pr=73 pw=0 time=399462 us)'
STAT #24 id=3 cnt=60141 pid=2 pos=1 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=73 pw=0 time=60803 us)'
STAT #24 id=4 cnt=60141 pid=2 pos=2 obj=176936 op='TABLE ACCESS FULL A (cr=278 pr=0 pw=0 time=60188 us)'

Wednesday, October 17, 2007

All latvian Oracle (and not only) users - UNITE!

I'd like to announce two resources for people that like relational databases and speak Latvian.
The first one is discussion group for Latvian Oracle User Group. The near future plans are described here (in Latvian).
If you speak Latvian and work with Oracle then it is your duty :) to join it! Of course one can join even if he doesn't speak Latvian, however there will be a small problem - everyone will understand you, but it is doubtful that you will understand anything except some keywords like Oracle, SQL, etc :)

The second resource is mine. And it is more general, it is blog about relational databases (in Latvian).
Recently I've searched Internet and was very unpleasantly surprised - I've found very few resources devoted to relational databases. So not to make only complaints I've started to blog about relational databases in my mother tongue. I'm of course not living in illusions that I can do it only myself and therefore anyone that is at least a little bit eager to help me in this process and has some knowledge either in general SQL, Oracle, MS SQL Server, MySQL, PosgreSQL or another relational db is encouraged to join me and become author in this blog.

Tuesday, October 02, 2007

Natural joins are evil

If you like time bombs use them ;)

During my teaching courses I once again stumble upon ANSI SQL NATURAL JOIN syntax. And once again I suspect that creators of them had in mind something like how to write unmaintainable code. Or probably they liked time bombs ;)
OK for those unfamiliar with Natural joins let's firstly see what they mean:
"A natural join is based on all columns in the two tables that have the same name. It selects rows from the two tables that have equal values in the relevant columns."
So it joins two tables if column names are the same. There isn't any explicit join condition. And here is the biggest problem. The work has been done behind the scenes, and that is one of the biggest advantages of writing unmaintainable code.
So let's assume we have following tables:
CREATE TABLE persons (
person_id NUMBER(10) NOT NULL,
address_id NUMBER(10) NOT NULL,
surname VARCHAR2(40) NOT NULL,
first_name VARCHAR2(40) NOT NULL,
remarks VARCHAR2(1000),
CONSTRAINT person_pk PRIMARY KEY (person_id));

CREATE TABLE addresses (
address_id NUMBER(10) NOT NULL,
country VARCHAR2(40) NOT NULL,
city VARCHAR2(40),
text VARCHAR2(100) NOT NULL,
CONSTRAINT address_pk PRIMARY KEY (address_id));

ALTER TABLE persons ADD CONSTRAINT person_address_fk
FOREIGN KEY (address_id) REFERENCES addresses (address_id);

Lets add a few rows:

INSERT INTO addresses

VALUES (1, 1, 'PLIVNA', 'GINTS', 'Actually he lives in another place');

So what would be the SQL query to get persons and their countries using NATURAL JOIN syntax?

SQL> SELECT surname, first_name, country
2 FROM persons
3 NATURAL JOIN addresses;


1 row selected.


Nice one - we got one row as expected.
So assume times goes by and as per our new requirements we have to add some description field also in addresses table. To be consistent and following practices that differ from how to write unmaintainable code we use the same column name and data type as in persons.

ALTER TABLE addresses ADD remarks VARCHAR2(1000);

So what our query is showing now?

SQL> SELECT surname, first_name, country
2 FROM persons
3 NATURAL JOIN addresses;

no rows selected


WoW! Where the row has gone? We didn't change our data!
Yeahhhh, everything works as expected. We added column remarks with the same name as in persons table and now join is being done using both columns. And of course data in persons.remarks column doesn't match with addresses.remarks. So our time bomb has blown up and we silently introduced a nasty bug in our system. Code is as valid as before from the compilation viewpoint, so we won't get compilation errors. It simply gives wrong result. Invalid SQL would be much better, then we'd spotted bug much easier, but this time we got exactly very nasty and in the beginning absolutely unclear bug.

There are also other things to consider:
1) natural joins between two tables without at least one common column results in cartesian product.
SELECT rownum id
FROM all_source;

Lets see how many rows table numbers have and how many rows we got for natural join between them:
SQL> SELECT count(*) FROM numbers;


1 row selected.

SQL> SELECT count(*) FROM (
2 SELECT * FROM persons
3 NATURAL JOIN numbers);

1 row selected.


2) with different data types but the same column names there might be type conversion problems and quite strange error messages (see that it is NATURAL OUTER LEFT or RIGHT JOIN):

ALTER TABLE addresses MODIFY remarks NUMBER (15);

SQL> SELECT surname, first_name, country
2 FROM persons
3 NATURAL LEFT JOIN addresses;


1 row selected.

SQL> ed
Wrote file afiedt.buf
1 SELECT surname, first_name, country

2 FROM persons
SQL> /
SELECT surname, first_name, country
ERROR at line 1:
ORA-01722: invalid number


So what to do?
1) Do not use this syntax. This is the most efficient and most obvious solution.
2) Use the same schema object naming conventions as me :) Natural joins simply never work using my favourite naming conventions.

Similar threads:
1) Natural Join and Index Monitoring in asktom.
2) Thread should one use ANSI join syntax when writing an Oracle application from Oracle-l list.

Sunday, September 16, 2007

Avoiding trouble

Following post could be seen absolutely obvious mechanism to avoid unnecessary trouble in development process. But it was something new for me when several years ago I've firstly read it from the long term oracle-l list ex member Rachel Carmichael (unfortunately I've lost this e-mail) and now I've seen it is something new for many developers. In one sentence it is:
Inform your boss (customer, colleague or other responsible person) about possible trouble in a written and provable way.
OK now let's explain a bit more. There are three steps in this process:
1) Understand possibility for eventual trouble;
2) Fight for your right decision;
3) If you lost the battle inform responsible person about it explaining why you think there is possibility for trouble.
Let's see what each of these steps mean.

Understand possibility for eventual trouble

Here is the trickiest part. The first tricky thing is to understand that there might be possible trouble. Of course that comes with experience ;) Probably the first time you'll miss the eventual trouble but next time you'll be already smarter. The second tricky thing is to avoid never ending wretch syndrome. And here of course the main factor is your common sense - decide when to warn responsible person and when to not use false warnings.

Fight for your right decision

Almost always there is a possibility to fight for your decision. Of course here is also one tricky thing - you have to explain and motivate your decision. You can enforce your decision with test cases, examples, resources from books, Internet etc. You can find supporters among your colleagues and sometimes even usual opponents. Even if you'll loose at least you'll have clear conscience :)

Inform responsible person about it

This is the easy part. If you have decided that you should warn anybody then most of the time there is no need to seek for a notary. Usually a simple e-mail saved in your sent mails will be enough. You can of course consider possibility to add some more recipients to TO: or CC: fields. If it is a meeting protocol with customer a footnote with warning can be a solution. Of course if the possible consequences are quite serious you have to ensure that your insurance is at least as serious. One thing of course remains - the motivation has to be kept. Also in your written email, meeting protocol or whatever else.

The result

With above mentioned mechanism you've got two points:
1) Once again informed about your decision.
It might be possible that your boss simply missed your viewpoint. Probably he thought you are not serious enough. Probably he misunderstood you. Written text with appropriate explanation and motivation decrease such probability to the minimum.
2) Cover your a$$ (sorry for that expression but I couldn't resist :) in case of failure.
You will be able to point to your written message and say - I already warned you. Without it there are always place for different interpretations. Even the most ordinary thing - your boss probably simply fairly and honestly forgot about your warnings. With such cover even if you need to clean up the mess you won't pay for it. Your boss, your customer or whatever else responsible party being warned will pay for it.

Have I used that?

Yes, I have. I was lucky enough to have knowing and smart bosses till now so I haven't necessity to use such scenario (3rd step) inside my company. Either I could convince my bosses or the problem was out of my responsibility. But in communication with customers I've used this technique quite much along with catching their bad requirements. Sometimes that convinced them at last change their minds, sometimes not, but at least I and my company had very strong argument in case of failure.

Monday, September 03, 2007

Unrealistic optimism or just pure stupidity?

A while ago I've written a blog entry Where bad performance starts explaining that we as IT professionals should prevent customers from stupid requirements or at least strongly warn them that their desires will make much harder life both for customers and developers.

Some days ago I've spotted a nice estimate of a work task that takes the problem described above to the next level i.e. the core of problem is initiated even before the real project begins. Task was much bigger containing of many subtasks including:
1) three separated subtasks for three different reports. Each report was quite precisely explained;
2) a task formulated as follows: "Up to 20 reports defined more accurately in analysis phase". No more information was supplied, no more information was known to anyone.

So how do you think - what were the estimates for analysis and implementation? According to estimate analysis of each separated report should take 1, 2 and 3 workdays. Implementation estimate of each of them was 2, 3.5 and 3.5 workdays. And now the most interesting part: analysis and implementation of up to 20 unknown reports was estimated 15 days for analysis and 20 days for implementation.

So for known and quite precisely defined reports the average analysis and implementation work was estimated 2 and 3 days respectively for each report. BUT for completely unknown reports without any further information and/or restriction estimate was 0.75 days for analysis and 1 day for implementation.
So what to think? What was thinking both analyst and programmer making such estimates? What was thinking PM allowing such estimates to be published and giving them to customer?Are they afraid of big numbers? Why are they being so optimistic? It is a real mystery...
I'd be afraid to give any estimate for such purely defined task at all. And if I'd give I'd take at least average or even probably maximum of other similar tasks and also warn my boss about quite big risk to give estimates of unknown tasks. Probably I've seen too much projects fail or at least delay due to overly optimistic estimates...

Monday, August 27, 2007

Oracle Express Edition and MS SQL Server Express Edition

Today I was listening to a course from the dark side ;) One quite small fact (I was not aware of) about SQL Server 2005 Express Edition compared to Oracle 10g XE caught my attention though.
These versions are similar in their limitations i.e.
  • 1 CPU
  • 1 GB RAM
  • 4 GB user data
But there is one difference not in favour of Oracle - for SQL Server Express Edition one can create up to 16 instances on a single box, but for Oracle XE only one. Of course in general it is not the main decisive factor to choose between SQL Server and Oracle, but I can imagine some scenarios where more than one instance of Oracle XE on a single box would be nice, for example, simultaneous development of different applications on XE.

Tuesday, August 21, 2007

A new experience

I've completed my first Oracle certified training course in Mebius Latvia. This time as a trainer though.
For already a few years I feel necessity somebody to tell what I've learned and how to do or not to do certain things in Oracle ;)

The first step was my website where I could put some articles telling some experience stories. Now I've completed the next step – directly telling people how to work with Oracle. The content of my first course as a trainer was quite trivial – 10g PL/SQL fundamentals. Although describing more than 250 slides in two days, even quite easy slides, was a completely new experience for me. I hope I've completed it more or less so that my trainees understood what I was talking about ;) In fact I was really satisfied with them, because all had more or less experience with Oracle and programming as such, so there wasn't necessity to tell for example what the "IF" statement and logical values are. In the very beginning I frankly told them that questions help not only trainees to deeper understand the covered material but questions also help the trainer to feel better and understand that at least some trainees are following the lesson material ;) So they really followed my plea and time after time we had even discussions about one or another topic. So I'm thankful to my trainees, I hope they understood at least main things I told them and I'm looking forward to my next course after a while.

Thursday, August 09, 2007

Flashback for source code

Recent OTN forum question made me think that I'm quite lucky. Everywhere I've worked we had one or another source control software for source code. All scripts for schema creation as well as procedural units were put and maintained there. Sometimes reading about other people's problems that they've lost the only (sic!) version of their procedure dropping the schema or just recreating the procedure gives me the shudders. So for everyone that yet hasn't understood - keeping procedural units only in the db means it is just matter of time when you'll lose them. Either all of them or just one because of erroneous create or replace but sooner or later the time will come ;)

OK but what if you are unlucky and just realized that you have overwritten the last and only version of your source code unit (procedure, package, function, etc)? If you are at least on 9i (10g as well of course) you are not completely lost. Flashback that works on regular user's tables works in the same way also on internal tables holding source code. So the only problem remains how to get it and how to get it fast! Fast because it is just like with lost extremities - surgeon can put it back only if you are acting fast. And fast because the old data can simply be overwritten if Oracle feels the necessity. And now what's the mechanism is?

There are 2 scenarios:
1. query directly from base tables owned by sys or
2. query using provided dba/all/user views i.e. dba/all/user_source.

In any case for ordinary user by default it is not possible. You need sys access to do that or grant privilege from sys to any other user. Even DBA role is not enough.

1st scenario is as follows:

1) get object_id (from dba/all/user_objects)
SELECT object_id
FROM user_objects
WHERE object_name = 'yourname'
AND object_type = 'yourtype';

2) get text from source$ as of timestamp
SELECT source
FROM sys.source$
AS OF timestamp timestamp_before_error
WHERE obj# = object_id_from_1st_step
ORDER BY line;

1st step can be done by everyone. 2nd step needs flashback on source$.

2nd scenario is as follows:

simply select text from user/all/dba_source as of timestamp.
FROM dba_source
AS OF timestamp timestamp_before_error
WHERE name = 'yourname'
AND owner = 'yourowner'
AND type = 'yourtype'
ORDER BY line;

Of course this can be done by sys. I was not able to do that by other user because user_source for example is based on source$, obj$ and fixed table x$joxfs. I was able to grant flashback on source$, obj$ to another user but for fixed tables it is not possible (quite understandable as these are actually memory structures). But it seems on the other hand without flashback one cannot make queries from view that is based both on ordinary tables and fixed tables.
Above experiments were performed both on and EE Oracle.

Friday, June 15, 2007 domain has lived already a year now

My provider has provided a wonderful feature - collecting various statistics about visitors of my site. I really suggest everyone to find such provider or at least use Google analytics to collect various stats about your site.

I started my site a bit more than a year ago on May 2006. In the first month I was the only visitor testing how it looks like and adding some content. In the second month (June 2006) I had already 65 unique ip addresses visited my site. It was a real advancement :) Compared that to today - this is normal weekday rate for my site. Of course on weekends attendance drops.

Statistics provided by my Internet provider gave me interesting possibility to create a world map. All countries having at least one person visiting my site has been coloured more or less black. More black - more visitors. Colouring the map was interesting task - I even had to look in world atlas for some countries, shame on me - but just using skeleton map I was not sure exactly which country Ghana, Dominican Republic and Guatemala are. The most lovely were countries like Indonesia, Canada, Philippines and Russia - having so many islands and each one has to be coloured separately.

So what are the lessons I've learned this year?

1. Don't make fun with Google! It can give you so many hits. And it also can be annoyed and give you nothing. Annoyance for my site was because of two reasons:
1) changing content too rapidly
2) having the same articles in 2 different formats - html and doc.
So the medicine was think more about content and change it only once and place deny on robots.txt for doc files.

2. Google webmasters tool is really useful. One can submit sitemaps, so decreasing the time for Google to catch your new page, view restricted pages by robots.txt and a bunch of other features.

3. All other search engines unfortunately at least for me gives almost nothing. Even yahoo is almost dead for my site.

4. Placing valuable content is as critical as thinking about titles and words in your articles at least for Google.

5. Writing articles that hopefully are usable also for other people aren't so easy therefore I even more admire people having written books with precise technical info like Thomas Kyte and Jonathan Lewis for example.

So one month May 2007 gave almost the same stats as all previous year together (actually half of the year because June was the first month for wide public). Let's hope the trend in the future will be the same :)

Monday, May 07, 2007

Deferred constraint real life scenario

It is worth to know oracle features even if you have never used them.

Some days ago one of our developers came in and started to tell that our testers found another bug. This time bug is raised in Oracle by violated unique key. Of course developer didn't catch it in unit testing because it was a bit tricky but our testers were smart and nasty :) enough to dig it out.
So we had a form that may have many child records for a previously chosen parent record. To feel something concrete let's choose for the parent an order for tax discount - suppose you have submitted it in written form to your local government (not that I'm suggesting you to do that :))). But of course your local government doesn't want to give you tax discount and therefore as always asks you many documents explaining reason for it. But there is one restriction - each document must be of different kind e.g. one proving that you have 5 children, second proving that your rent has risen by 25% etc.
So as I strongly believe that business constraints should be enforced in db as much as possible I've created UK in supporting documents for two columns:
1) FK to order table and
2) document type.
Unfortunately in the form where all documents are entered one can change all attributes and our evil testers of course found the most evil one - swapped type of document A to document B and document B to document A.
So where are the problem you ask? Yea there aren't any problem if one looks on the start point (docs A and B for one order) and finish point (docs B and A for the same order), but somehow situation in the middle is very unpleasant. There is a moment when we have 2 documents with type B simultaneously. And here testers got the error as below:

SQL> create table parent (par_id number not null constraint par_pk primary key);
Table created.
SQL> ed

Wrote file afiedt.buf
1 create table child (cld_id number not null constraint cld_pk primary key,

2 cld_par_id number not null constraint cld_par_fk references parent (par_id),
3 cld_type varchar2(3) not null,
4 cld_data varchar2(100),
5* constraint cld_uk unique (cld_par_id, cld_type))
SQL> /
Table created.
SQL> insert into parent values (1);
1 row created.
SQL> insert into child values (1, 1, 'A', 'Doc type A');
1 row created.
SQL> insert into child values (2, 1, 'B', 'Doc type B');
1 row created.
SQL> commit;
Commit complete.
SQL> update child set cld_type = 'B' where cld_id = 1;

update child set cld_type = 'B' where cld_id = 1
ERROR at line 1:ORA-00001: unique constraint (GINTS.CLD_UK) violated

So what to do?
My immediate response was - deferred constraints! I definitely knew that I don't want to loose the constraint forever because I'd like to enforce as much as possible integrity constraints in DB. So how it looks with deferred constraints?

SQL> alter table child drop constraint cld_uk;
Table altered.
SQL> ed

Wrote file afiedt.buf
1 alter table child add constraint cld_uk unique (cld_par_id, cld_type)

2* deferrable initially deferred
SQL> /
Table altered.
SQL> update child set cld_type = 'B' where cld_id = 1;
1 row updated.
SQL> update child set cld_type = 'A' where cld_id = 2;
1 row updated.
SQL> commit;
Commit complete.

Everything works!
How does it look like now?

SQL> select cld_id, cld_par_id, cld_type from child;

---------- ---------- ---
1 1 B
2 1 A

Let's try some more experiments. Let's update only one row back and try to commit:

SQL> update child set cld_type = 'A' where cld_id = 1;
1 row updated.
SQL> commit;

ERROR at line 1:
ORA-02091: transaction rolled back
ORA-00001: unique constraint (GINTS.CLD_UK) violated

Bahhh!!! We got expected error!
The only thing - one has to remember that deferred constraints rolls back ALL transaction since it started even if the changed rows are completely valid and unrelated to the deferred constraint for example let's try to insert one completely valid row in parent table at first and then try to do nasty things with deferred constraint:

SQL> insert into parent values (2);
1 row created.
SQL> update child set cld_type = 'A' where cld_id = 1;
1 row updated.
SQL> select * from parent;

SQL> commit;

ERROR at line 1:
ORA-02091: transaction rolled back
ORA-00001: unique constraint (GINTS.CLD_UK) violated
SQL> select * from parent;


Our inserted row into parent table also has gone. With normal (i.e. not deferred) constraint Oracle rolls back only changes made by the latest statement not all statements since transaction started. One has to remember this behaviour when using deferred constraints.

Quite popular use of deferred constraints seems to be update of primary keys (and after that also foreign keys) but as I'm always (OK almost always) using surrogates and primary key update for me is something absolutely awful I've never had such need and I'd never recommend that to anyone.

Sunday, April 15, 2007

Statistics is power! ... And the goat...

Hmmm, you say absolutely stupid title? OK let's see whether you think the same after you read all this stuff.

No I'm not going to write about statistics on Oracle tables, although of course we must have stats on Oracle tables for CBO to operate as well as possible. I'm going to write a bit about a very old problem that now is at least partially resolved. And particularly - inablity to change e-mail in Oracle forums. In my previous blog entry I expressed hope that sometime in the future I'd be able to write another entry about it. Now the time has come. To be precise it came already 3 weeks ago on March 25.

And I was agreeably surprised that the time came so fast. According to this thread the problem started at July 2002. Since then many people many times write about this problem in feedback forum but the best what we got were promises about the bright future. Just like Communism that according to Khrushchev had to be built by 1980 but never started. And surprise, surprise! Just about two months after I published my article summarizing some statistics and counting all the desires, complaints and statements about inability to change the e-mail in 500 most recent threads we got it!

You see - my ego is fantastically high! ;)

OK I'll go down some steps.

The most I'm dare to think of - is that my article was one of the drops that broke up the big rock. But of course important drop that probably showed people the real size of the problem.

You see - my ego is fantastically high anyway even after stepping down! ;)

So I personally learned at least one thing - statistics is power! Even outside CBO!

And where's the goat you ask?
OK here is an old story (most probably already known to you in some variation):

A poor man comes to the rabbi complaining that his family has only one small room, many kids, and almost no money.
The rabbi says, "Take all your money, buy a goat, and keep the goat in your room. Come back in a month."
"But, rabbi, we don't have enough space even for us," the man said
"Just do what I say," the rabbi replied.
A month later the man comes back complaining that the goat smells and breaks everything.
"Sell the goat and come back in a month," the rabbi tells him.
A month later the man comes back to the rabbi with flowers.
"Thank you, rabbi! We're so happy the goat is out, now we have more room and some money!"

By objective considerations rabbi was a bit of cheater and poor man of course was stupid.

So we had a goat in our room for more than four years - much longer than poor man. Of course after removing our own "goat" the feeling is quite releasing and pleasing but we have to understand that it was removing the goat instead of giving a present.

Monday, March 26, 2007

Where bad performance starts

"My application is slow" - who haven't heard this very common complaint? Both from customers and developers (sic!) that created the application. No this won't be story how to diagnose what is slow and/or what to do to make it run faster. This will be a short story - how to try to not reach this complaint.

So where are the roots of bad performance? Many times the roots are very simple - allowing customer to promote and what's worse - accept bad requirements from customer. Bad and many times completely unnecessary requirements. The primary task of system analyst is of course to analyze, formalize, systematize and record customer requirements. But that should be done using sound criticism of initial requirements. And although the primary goal of specification is to record WHAT is needed and although specification should be implementation independent I'm sure that a really good system analyst has to keep in mind at least the vision HOW specified requirements will be implemented. Otherwise it is possible to result in a SRS describing business illogic instead of business logic (as was quite precisely named by a user in this thread).

Why illogical and unnecessary requirements are bad and should be thrown away instead of just implementing them - probably someone sometime would need them? There are at least following reasons to do that:
1) starting from requirements gathering through development and design until the testing and deployment development staff is wasting their time to implement them resulting in wasted man-hours and $$$,
2) every added functionality makes every application at least a little bit more harder to understand for customer and support which also results in wasted man-hours and $$$,
3) every added functionality needs some system resources either in CPU cycles, I/O calls or increased memory space. Everybody knows that resources are finite and so they simply are stolen from other users and useful functionality,
4) every added functionality usually takes some space on the screen. Of course it is not so critical today with common 17' and 19' monitors, but needless functionality many times can be as annoying as advertisements.

What to do to eliminate unnecessary requirements, if customer asks something crazy or you think that every other similar product already has it? There are at least following possibilities starting from the best:
1) frankly tell the customer that the requirement is bad explaining why it should be eliminated and what bad will happen if it would be created. You can even a bit overstate that :) This to my mind is the best solution, because even if customer and/or somebody else would force to implement you this particular bad feature you'd later have a wonderful (of course it should be written on paper or in e-mail!) argument that you have already predicted that!
2) use some scheme of prioritization, for example a nice one (just coincidence) called like very well known city - MoSCoW. I've never seen a project where all requirements REALLY are with the same priority. So mark the foolish requirements "could have", "won't have", "never have" (OK the latter is my own addition :). Thanks God even "could have" usually guarantees the "never have" because - how many software projects have you seen where the actual implementation is ahead of the schedule and budgeted expenses? :) So usually there is no time left for "could have's" that are ugly and undesirable from development perspective.
3) cash more $$$ for the particular feature. This is the most risky solution, because if customer is spiteful you'd cash him 3 times more than normal and would not deliver a good functionality. I in the place of customer would be 3 times angrier. And most of them are.

How to minimize the impact of the unnecessary requirement? If you have to implement it, of course you have to waste your and your customer resources to implement and accept it. The only possibility to minimize the impact is to make it configurable so that after the day (or week or hour) of intensive annoyance customer is able to switch it off. Of course this is also a solution for a pure marketing driven feature, which is absolutely necessary to sell your product but absolutely needless and harmful in real production.

What are examples of unnecessary functionality? Although very common and accustomed functionality of forums the precise threads count, view count and count of found searches actually is nothing more than satisfying the curiosity. What you'd lose if you'd don't know that this particular Database-General Oracle forum wouldn't display that it has Messages: 209,897 - Threads: 48,569 and searching for term Oracle it found 453 Messages?
Or this Oracle FAQ forum for SQL & PL/SQL Newbies has 44090 messages and 11735 topics and here searching for Oracle you'd get 84 results? Of course I assume that they are not counting these statistical numbers each time the user requests the page, but eventually this requirement can be very disastrous if implemented without caution. I really don't care whether the forum has 209,897 or 185,786 messages, probably the only thing that I care is to see the comparable activity that can be very approximate to judge in which forum there is more probability to get answer. And what I do care the most is to get the forum page as fast as possible. This doesn't mean I'm asking to remove this functionality. This means I'm explaining that for real work this functionality is not needed and if it takes significant additional time it is even harmful to my mind. The same can be said about returning the precise count of search results even if they are some quite big number. Do you really care whether your search resulted in 5000 or 5647 hits? I personally don't unless it is statistical report which is absolutely different subject than everyday search.

Another case study. Another quite common incorrectly used potentially very offensive feature is to put wildcards '%' around the search word. As a result we get potential resource intensive and long lasting full scan of table instead of minimal index scan. Actually I think it is really rare occasions when customer really needs to search for something like %A%. What does %A% really mean? A found document where the only symbol left in person name is "A"? A completely absentminded customer only remembering that his product name contains symbol "A"? Both customer and developer will be desperate in this situation - customer because the application is slow, developer because customer constantly complains that application is slow.

So as a result for the latest case study I thought of writing a basic article how to create indexes in Oracle and put it on my website but after searching the web I abandoned my idea, because so much articles have already been created and I doubt that mine would be better than already existing :) So here is a very small test case how to effectively search using wildcards either in the end (using the most basic index) or in the start (using function based index) but of course not both. So be aware in the following example how both searches for DBA_DB% and %DB_LINKS uses index range scan. This example was included mostly because I haven't seen any website mentioning the possibility to search using wildcard in the beginning of the string (probably haven't searched long enough :)

SQL> create table src (txt varchar2(100) NOT NULL);
Table created.

SQL> insert into src select distinct object_name from dba_objects;
22474 rows created.

SQL> create unique index src_uk1 on src(txt);
Index created.

SQL> create unique index src_uk2 on src(reverse(txt));
Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'src')
PL/SQL procedure successfully completed.

SQL> set autot on explain

SQL> select * from src where txt like 'DBA_DB%';
Execution Plan

0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=2 Card=1 Bytes=22)
1 0 INDEX (RANGE SCAN) OF 'SRC_UK1' (UNIQUE) (Cost=2 Card=1 Bytes=22)

SQL> select * from src where reverse(txt) like reverse('%DB_LINKS');

Execution Plan

0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=3 Card=1 Bytes=22)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'SRC' (Cost=3 Card=1 Bytes=22)
2 1 INDEX (RANGE SCAN) OF 'SRC_UK2' (UNIQUE) (Cost=2 Card=1)

And here are many links mentioning how to create indexes in Oracle and what one or another access path means:

Oracle® Database Concepts 10g Release 2 (10.2)
Chapter 5 Schema Objects
Overview of Indexes
Overview of Index-Organized Tables
Overview of Application Domain Indexes

Oracle® Database Application Developer's Guide - Fundamentals 10g Release 2 (10.2)
Chapter 5 Using Indexes in Application Development

Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)
Chapter 15 Using Indexes and Clusters

Oracle® Database Concepts 10g Release 2 (10.2)
Chapter 18 Partitioned Tables and IndexesT
Overview of Partitioned Indexes

When to Use an Index by Cary Millsap

Indexing options available in Oracle by Brian Hengen

Why isn't Oracle using my index ?! by Jonathan Lewis

Index Internals by Julian Dyke

Understanding Indexes By Tim Gorman

Oracle B-Tree Index Internals: Rebuilding The Truth by Richard Foote

Bitmap index Internals by Julian Dyke

Understanding Bitmap indexes by Jonathan Lewis

Bitmap Indexes 3 - Bitmap Join Indexes by Jonathan Lewis - what are Bitmap Join Indexes and when these are useful.

Operations of execution plans by Julian Dyke

Is this an index I use ? by Jonathan Lewis - how to find indexes that are not used.

Rebuilding indexes – why, when, how? by Jonathan Lewis

How high can you go ? by Jonathan Lewis - what is the biggest possible height of a b-tree index?

Expert Oracle Database Architecture 9i and 10g Programming by Thomas Kyte
Chapter 11 Indexes
Table of contents

Cost-Based Oracle Fundamentals by Jonathan Lewis
Chapter 4 Simple B-tree Access
Chapter 8 Bitmap Indexes
Table of contents

Tuesday, January 30, 2007

Five Million and Counting - WHAT???

Recently I got one of the regular mails from Oracle (called OTN TechBlast). Title of the first article was "Five Million and Counting" and it was written by Justin Kestelyn, Editor-in-Chief, OTN.
So what is he counting? First paragraph answers that question:
"I can hardly believe it, but OTN membership passed the 5-million benchmark in 2006. But why should you care?"

What is so remarkable in this statement? Yes of course it is a big number. Yes of course it is nice that people around the world are interested in Oracle and its products. But the question remains - What the author is counting?

And the question remains for the on big reason - OTN member cannot change his e-mail for already at least a few years (according to this post since about July 2002). It is surprising in nowadays when

  • people change their works quite often and the right thing is not to work all the life in one company;
  • companies merge, crash, change names, domains;
  • free e-mail servers come and go.
One of the important parts of OTN is forums. Among other forums there exists one called Feedback. The goal of the forum is defined as follows: "Use this forum for feedback about OTN services and problems with OTN accounts.". Actually nothing is said that someone will solve your problem... Anyway people obviously are hoping for the best and continually posting their problems. As the Justin Kestelyn, Editor-in-Chief, OTN hasn't gathered statistics of what the main problem for OTN members is (or at least hasn't indicated that in his article about 5 million members), I've tried to do this job myself.

The source of statistics were 500 most recent threads starting from Mar 6, 2006 6:04 PM till Jan 30, 2007 ~20:00 PM.

So 49 of 500 threads (or 9.8%) were only about this very problem asking how to change email, asking for possibility to do that or asking to delete old account because they've created new with lack of possibility to change email. When looking at such statistics like thread views and replies on threads the situation was even more dramatic:

  • 79632 out of 329834 views (24.1%) were on above mentioned threads;
  • 412 out of 1684 replies (including original posts) were to above mentioned threads.

So 24.5% or about one fourth of all posted messages were about only one problem.

I can at least try to convert that to lost $$.

Assuming each view took 10 seconds (which is very very modest assumption due to fact that forum response times cannot be called very short :) employees of Oracle and its partners and clients have used 79632 views * 10 sec = 796320 person seconds = 13272 person minutes = 221 person hours = 28 person days = more than one person month to just watch threads about this particular topic. Of course that is not including

  • time to formulate and write responses including OTN members from Oracle;
  • people not knowing about such forums and trying to find out how to change their e-mails;
  • people wasting time for creating new accounts;
  • people wasting time for asking to delete old accounts because they have created new ones;
  • forums support group wasting time to delete old accounts;
  • time wasted by other Oracle employees according to this message;
  • people getting angry and tearing their hairs wondering how such basic functionality cannot be included;
  • people loosing trust to Oracle seeing how their feedback is (not) being listened to;
  • 6 hours of my time to gather above mentioned statistics and write these paragraphs :)

Yeahhh, as we can see even Oracle ACEs are suffering from this forced duplication of individualities (see here and here) so what can we say?

Anyway I hope that I will waste another hour or so to write another article that this nightmare has been ended!

P.S. Here is the list of the above mentioned 49 of 500 threads speaking about the same problem along with thread views and replies (the first is one of the most famous in forum history :) 32374 260 83 2 129 2 785 6 489 1 273 1 257 1 392 1 406 2 1342 9 4593 15 418 2 582 1 981 3 370 0 348 0 442 0 620 1 2033 5 1313 6 2031 3 1622 3 779 1 1218 2 1819 1 1843 1 1955 2 1173 1 1447 1 1442 1 1952 5 659 2 1111 0 1860 4 1048 2 1075 3 572 1 417 1 395 1 228 0 810 0 986 1 718 0 459 3 819 3 726 1 789 1 709 0 740 1