Bitmap index size, or, Size does matter.

18 09 2007

For the past couple of years I have been working with the notion that Oracle’s bitmap indexes are a cool feature. Perfect for an index on low cardinality columns. And even the definition of ‘low’ in this regard has been stretched a lot. We have a datawarehouse where we create an index on every column of the dimension tables and the index is created as a bitmap index when the cardinality is below 50.000. Queries that are restricted on these columns fly.

The biggest downside to using bitmap indexes is (or so I believed) that they tend to grow enormously in size when DML is applied to the table that is underlying the bitmap index. Resulting in frequently rebuilding bitmap indexes to reclaim space (and thus performance by reducing the required I/O) or not using bitmap indexes at all.

It appears I have been sleeping or did not read the ‘What’s New’ guides thoroughly enough, because in the What’s new guide for 10gR1 this is mentioned:

Enhanced Bitmap Index Performance and Space Management

Bitmap indexes now perform better and are less likely to be fragmented when subjected to large volumes of single-row data manipulation language (DML) operations.

Let’s get dirty and do some querying. Here’s what happens in 9.2.0.7 (script shamelessly copied from Jonathan Lewis’s blog):
SQL> select * from v$version where banner like 'Oracle%';BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
SQL>
SQL>
SQL> create table t1 as
2 select
3 rownum id,
4 mod(rownum,2) bit_col
5 from
6 all_objects -- {or your favourite row source}
7 where
8 rownum <= 10000
9 ;
Table created. SQL>
SQL> create unique index t1_pk on t1(id);
Index created. SQL> create bitmap index t1_bi on t1(bit_col);

Index created.

SQL>
SQL> — gather statistics on the table
SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 1 1 2 16188 2891

SQL>
SQL> begin — update 250 rows
2 for i in reverse 9500..10000 loop
3 update t1
4 set bit_col = 0
5 where
6 id = i
7 and bit_col = 1;
8 end loop;
9 commit;
10 end;
11 /

PL/SQL procedure successfully completed.

SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 3 82 496 1505836 810761

SQL>
SQL> drop table t1;

Table dropped.

SQL>
SQL>

So what I see here is that the bitmap index has grown from 1 8K block to 0,77M, just by updating 250 rows of a 10.000 row table. Imagine what that will do to your bitmap index on your multi-million row DWH table.
Now, I only have an 10gR2 database available and nothing is mentioned in that documentation regarding this subject, but let’s see what’s happening in 10gR2 (assuming the same will be true for 10gR1):

SQL> select * from v$version where banner like 'Oracle%';BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi
SQL>
SQL>
SQL> create table t1 as
2 select
3 rownum id,
4 mod(rownum,2) bit_col
5 from
6 all_objects -- {or your favourite row source}
7 where
8 rownum <= 10000
9 ;
Table created.SQL>
SQL> create unique index t1_pk on t1(id);

Index created.

SQL> create bitmap index t1_bi on t1(bit_col);

Index created.

SQL>
SQL> — gather statistics on the table
SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 1 1 2 7996 2916

SQL>
SQL> begin — update 250 rows
2 for i in reverse 9500..10000 loop
3 update t1
4 set bit_col = 0
5 where
6 id = i
7 and bit_col = 1;
8 end loop;
9 commit;
10 end;
11 /

PL/SQL procedure successfully completed.

SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 1 1 2 7996 2853

SQL>
SQL> drop table t1;

Table dropped.

SQL>
SQL>

The index has not grown at all!

Now don’t go throwing away all your regular indexes and replace them all by bitmap indexes. There still is another downside to updating bitmap indexes, as Jonathan Lewis mentions in his post. They used to create a lot of redo in 9i and they still do use a lot of redo in 10g (Not to self; create script to demonstrate this). But there are certainly more cases where a bitmap index can save your day, or the performance of your datawarehouse.

And, as always, check if the results for this kind of use is the same in your own environment, or in IM speak; YMMV.

About these ads

Actions

Information

One response

30 10 2007
John Wu

There are some very significant research work that improve the performance of bitmap indexes too and there is a free (LGPL) software package called FastBit (http://sdm.lbl.gov/fastbit/) that implements a good number of these techniques. If you have an application where you don’t need a fledge database system, then it might be a good alternative to consider.

John

PS: Here is a highlight from a 2001 paper (http://crd.lbl.gov/%7Ekewu/ps/LBNL-48975.html_. FastBit uses a bitmap compression technique called WAH. In this 2001 study, WAH compressed bitmap indexes were found to perform 12 times faster than the closest competitor.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

%d bloggers like this: