Coding notes: July 2017

This is a well known fact: (almost) every developer knows that "sometimes, a full table scan can perform better than using an index".

Lets show this in a little demonstration, which I picked up from Tom Kyte's Expert Oracle Database Architecture. The purpose of the demo is to show that an index access can be more expensive than a full table scan, under some circumstances.

I'm running this demo on a Oracle XE 11gR2 instance, running on a Lenovo X1 Yoga notebook, running Windows 10 (64 bits), with an Intel Core i7-6600U CPU (2.60 GHz), and 16 GB RAM.

Setup

Let's first create a table to hold data with a column having values sorted:

CREATE TABLE ORDERED (X NUMBER, Y VARCHAR2(2000), PRIMARY KEY(X));

Now populate ORDERED, taking care of sorting the data in the column X from 1 to 1'000'000.

begin

for i in 1 .. 1000000

loop

insert into ordered

values(i, rpad(dbms_random.random, 75, '*'));

end loop;

end;

commit;

... then create a table UNORDERED with identical structure, and populate it with the data from the first table, putting the values in the column X in random order:

CREATE TABLE UNORDERED AS SELECT X, Y FROM ORDERED ORDER BY Y;

ALTER TABLE UNORDERED ADD CONSTRAINT PK_UNORDERED PRIMARY KEY (X);

And before starting with the demo, gather the statistics:

begin

dbms_stats.gather_table_stats(USER, 'UNORDERED');

dbms_stats.gather_table_stats(USER, 'ORDERED');

end;

Comparing the execution plans

1. Table with ordered data:

SQL Statement:
$EXPLAIN PLAN FOR SELECT * FROM ORDERED WHERE X BETWEEN 10000 AND 30000;

------------------------------------------------------------------------------------------
|Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20002 | 1582K| 323 (0)| 00:00:04 |
| 1 | TABLE ACCESS BY INDEX ROWID| ORDERED | 20002 | 1582K| 323 (0)| 00:00:04 |
|* 2 | INDEX RANGE SCAN | SYS_C007666 | 20002 | | 68 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

2. Table with unordered data, full table scan:

SQL Statement:
EXPLAIN PLAN FOR SELECT * FROM UNORDERED WHERE X BETWEEN 10000 AND 30000;

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20002 | 1582K| 3275 (1)| 00:00:40 |
|* 1 | TABLE ACCESS FULL| UNORDERED | 20002 | 1582K| 3275 (1)| 00:00:40 |
-------------------------------------------------------------------------------

3. Table with unordered data, using the index:

SQL Statement:
EXPLAIN PLAN FOR SELECT /*+ index( UNORDERED PK_UNORDERED ) */*
FROM UNORDERED WHERE X BETWEEN 10000 AND 30000;

--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20002 | 1582K| 20049 (1)| 00:04:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| UNORDERED | 20002 | 1582K| 20049 (1)| 00:04:01 |
|* 2 | INDEX RANGE SCAN | PK_UNORDERED | 20002 | | 44 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Conclusion

Execution plans displays a much higher cost when the index is used on an unordered set of data - in that particular case. This might be unexpected, but it shows how important physical design is. So blindly putting indexes when encountering full table scans is not always a good idea.

Trying to change the optimizer first choice might also not be a good idea. Oddly, there's often in room a developer tempted to prevent the optimizer to to use a full table scan. Not good.

Finally, let's also point out that maintaining an index is also bound to costs.

Coding notes

Wednesday, July 5, 2017

A full table scan is not always bad, an index is not always good