Monday, March 26, 2007

Where bad performance starts

"My application is slow" - who haven't heard this very common complaint? Both from customers and developers (sic!) that created the application. No this won't be story how to diagnose what is slow and/or what to do to make it run faster. This will be a short story - how to try to not reach this complaint.

So where are the roots of bad performance? Many times the roots are very simple - allowing customer to promote and what's worse - accept bad requirements from customer. Bad and many times completely unnecessary requirements. The primary task of system analyst is of course to analyze, formalize, systematize and record customer requirements. But that should be done using sound criticism of initial requirements. And although the primary goal of specification is to record WHAT is needed and although specification should be implementation independent I'm sure that a really good system analyst has to keep in mind at least the vision HOW specified requirements will be implemented. Otherwise it is possible to result in a SRS describing business illogic instead of business logic (as was quite precisely named by a user in this thread).

Why illogical and unnecessary requirements are bad and should be thrown away instead of just implementing them - probably someone sometime would need them? There are at least following reasons to do that:
1) starting from requirements gathering through development and design until the testing and deployment development staff is wasting their time to implement them resulting in wasted man-hours and $$$,
2) every added functionality makes every application at least a little bit more harder to understand for customer and support which also results in wasted man-hours and $$$,
3) every added functionality needs some system resources either in CPU cycles, I/O calls or increased memory space. Everybody knows that resources are finite and so they simply are stolen from other users and useful functionality,
4) every added functionality usually takes some space on the screen. Of course it is not so critical today with common 17' and 19' monitors, but needless functionality many times can be as annoying as advertisements.

What to do to eliminate unnecessary requirements, if customer asks something crazy or you think that every other similar product already has it? There are at least following possibilities starting from the best:
1) frankly tell the customer that the requirement is bad explaining why it should be eliminated and what bad will happen if it would be created. You can even a bit overstate that :) This to my mind is the best solution, because even if customer and/or somebody else would force to implement you this particular bad feature you'd later have a wonderful (of course it should be written on paper or in e-mail!) argument that you have already predicted that!
2) use some scheme of prioritization, for example a nice one (just coincidence) called like very well known city - MoSCoW. I've never seen a project where all requirements REALLY are with the same priority. So mark the foolish requirements "could have", "won't have", "never have" (OK the latter is my own addition :). Thanks God even "could have" usually guarantees the "never have" because - how many software projects have you seen where the actual implementation is ahead of the schedule and budgeted expenses? :) So usually there is no time left for "could have's" that are ugly and undesirable from development perspective.
3) cash more $$$ for the particular feature. This is the most risky solution, because if customer is spiteful you'd cash him 3 times more than normal and would not deliver a good functionality. I in the place of customer would be 3 times angrier. And most of them are.

How to minimize the impact of the unnecessary requirement? If you have to implement it, of course you have to waste your and your customer resources to implement and accept it. The only possibility to minimize the impact is to make it configurable so that after the day (or week or hour) of intensive annoyance customer is able to switch it off. Of course this is also a solution for a pure marketing driven feature, which is absolutely necessary to sell your product but absolutely needless and harmful in real production.

What are examples of unnecessary functionality? Although very common and accustomed functionality of forums the precise threads count, view count and count of found searches actually is nothing more than satisfying the curiosity. What you'd lose if you'd don't know that this particular Database-General Oracle forum wouldn't display that it has Messages: 209,897 - Threads: 48,569 and searching for term Oracle it found 453 Messages?
Or this Oracle FAQ forum for SQL & PL/SQL Newbies has 44090 messages and 11735 topics and here searching for Oracle you'd get 84 results? Of course I assume that they are not counting these statistical numbers each time the user requests the page, but eventually this requirement can be very disastrous if implemented without caution. I really don't care whether the forum has 209,897 or 185,786 messages, probably the only thing that I care is to see the comparable activity that can be very approximate to judge in which forum there is more probability to get answer. And what I do care the most is to get the forum page as fast as possible. This doesn't mean I'm asking to remove this functionality. This means I'm explaining that for real work this functionality is not needed and if it takes significant additional time it is even harmful to my mind. The same can be said about returning the precise count of search results even if they are some quite big number. Do you really care whether your search resulted in 5000 or 5647 hits? I personally don't unless it is statistical report which is absolutely different subject than everyday search.

Another case study. Another quite common incorrectly used potentially very offensive feature is to put wildcards '%' around the search word. As a result we get potential resource intensive and long lasting full scan of table instead of minimal index scan. Actually I think it is really rare occasions when customer really needs to search for something like %A%. What does %A% really mean? A found document where the only symbol left in person name is "A"? A completely absentminded customer only remembering that his product name contains symbol "A"? Both customer and developer will be desperate in this situation - customer because the application is slow, developer because customer constantly complains that application is slow.

So as a result for the latest case study I thought of writing a basic article how to create indexes in Oracle and put it on my website but after searching the web I abandoned my idea, because so much articles have already been created and I doubt that mine would be better than already existing :) So here is a very small test case how to effectively search using wildcards either in the end (using the most basic index) or in the start (using function based index) but of course not both. So be aware in the following example how both searches for DBA_DB% and %DB_LINKS uses index range scan. This example was included mostly because I haven't seen any website mentioning the possibility to search using wildcard in the beginning of the string (probably haven't searched long enough :)

SQL> create table src (txt varchar2(100) NOT NULL);
Table created.

SQL> insert into src select distinct object_name from dba_objects;
22474 rows created.

SQL> create unique index src_uk1 on src(txt);
Index created.

SQL> create unique index src_uk2 on src(reverse(txt));
Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'src')
PL/SQL procedure successfully completed.

SQL> set autot on explain

SQL> select * from src where txt like 'DBA_DB%';
Execution Plan

0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=2 Card=1 Bytes=22)
1 0 INDEX (RANGE SCAN) OF 'SRC_UK1' (UNIQUE) (Cost=2 Card=1 Bytes=22)

SQL> select * from src where reverse(txt) like reverse('%DB_LINKS');

Execution Plan

0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=3 Card=1 Bytes=22)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'SRC' (Cost=3 Card=1 Bytes=22)
2 1 INDEX (RANGE SCAN) OF 'SRC_UK2' (UNIQUE) (Cost=2 Card=1)

And here are many links mentioning how to create indexes in Oracle and what one or another access path means:

Oracle® Database Concepts 10g Release 2 (10.2)
Chapter 5 Schema Objects
Overview of Indexes
Overview of Index-Organized Tables
Overview of Application Domain Indexes

Oracle® Database Application Developer's Guide - Fundamentals 10g Release 2 (10.2)
Chapter 5 Using Indexes in Application Development

Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)
Chapter 15 Using Indexes and Clusters

Oracle® Database Concepts 10g Release 2 (10.2)
Chapter 18 Partitioned Tables and IndexesT
Overview of Partitioned Indexes

When to Use an Index by Cary Millsap

Indexing options available in Oracle by Brian Hengen

Why isn't Oracle using my index ?! by Jonathan Lewis

Index Internals by Julian Dyke

Understanding Indexes By Tim Gorman

Oracle B-Tree Index Internals: Rebuilding The Truth by Richard Foote

Bitmap index Internals by Julian Dyke

Understanding Bitmap indexes by Jonathan Lewis

Bitmap Indexes 3 - Bitmap Join Indexes by Jonathan Lewis - what are Bitmap Join Indexes and when these are useful.

Operations of execution plans by Julian Dyke

Is this an index I use ? by Jonathan Lewis - how to find indexes that are not used.

Rebuilding indexes – why, when, how? by Jonathan Lewis

How high can you go ? by Jonathan Lewis - what is the biggest possible height of a b-tree index?

Expert Oracle Database Architecture 9i and 10g Programming by Thomas Kyte
Chapter 11 Indexes
Table of contents

Cost-Based Oracle Fundamentals by Jonathan Lewis
Chapter 4 Simple B-tree Access
Chapter 8 Bitmap Indexes
Table of contents