Number of days in a year

26 03 2009

I tend to forget these things. So I\’d better write them down.

select to_date(\'09-sep-2009\') - trunc(\'09-sep-2009\', \'mm\') days_since_month_begin 
,trunc (to_date(\'09-sep-2009\'),\'yyyy\') first_day_of_year
,interval \'1\' year just_curious
,to_char(trunc (to_date(\'09-sep-2009\'),\'yyyy\')+ interval \'1\' year - 1 , \'ddd\') days_in_year
from dual;

Advertisements




Photograph dates

23 09 2007

Today I was trying to get some photographs in the right chronological order. I use a tool, Exif Pilot, that can change the dates that are incorporated in most jpeg’s created with a digital camera. And another one, Flash renamer, that can rename the jpeg files based on the dates contained in the file. The last tool is not so necessary any more as a lot of file managers (both windows and linux world) can detect these dates as well. But when you rename a file and let it start with a date and time, a simple ‘dir’ or ‘ls’ lists the files in the ‘correct’ order.

There is one problem with all this. That is when you, or the person operating the digital camera, forget to set the right date and time in the camere when taking the photographs. Hence Exif pilot to change the dates in the jpeg file.

Exif Pilot change date dialogThere is however a quirk in this tool as you can see in the date dialog. When you have multiple files that have an incorrect date but are in the right chronological order, you can add or substract a number of days, hours, minutes and/or seconds. All you then have to do is figure out what the interval is between the date and time the camera has been set to and the actual date the photograph was taken. You probably can find out what the actual date has been. But then calculating the interval! That means finding a calendar and counting the days. Especially fun when the date the camera has been set to is several years back (typically when resetting the camera or batteries have run out).

Now it is by no means easy in SQL, but after several times of leaving through a calender I decided to spend a bit of time to calculate the difference between two dates in days, hours, minutes and seconds. Mainly to use it again after the next holiday or festivity when I have been taking photographs. So finally some Oracle:

SQL> ed
Wrote file afiedt.buf

1 select numtodsinterval(
2 to_date(’02-jul-2007 14:00:00′, ‘dd-mon-yyyy hh24:mi:ss’) –
3 to_date(’08-jan-2004 01:41:00′, ‘dd-mon-yyyy hh24:mi:ss’)
4 , ‘day’)
5* from dual
SQL> /

NUMTODSINTERVAL(TO_DATE(’02-JUL-200714:00:00′,’DD-MON-YYYYHH24:MI:SS’)-TO_D
—————————————————————————
+000001271 12:18:59.999999999

SQL>





Bitmap index size, or, Size does matter.

18 09 2007

For the past couple of years I have been working with the notion that Oracle’s bitmap indexes are a cool feature. Perfect for an index on low cardinality columns. And even the definition of ‘low’ in this regard has been stretched a lot. We have a datawarehouse where we create an index on every column of the dimension tables and the index is created as a bitmap index when the cardinality is below 50.000. Queries that are restricted on these columns fly.

The biggest downside to using bitmap indexes is (or so I believed) that they tend to grow enormously in size when DML is applied to the table that is underlying the bitmap index. Resulting in frequently rebuilding bitmap indexes to reclaim space (and thus performance by reducing the required I/O) or not using bitmap indexes at all.

It appears I have been sleeping or did not read the ‘What’s New’ guides thoroughly enough, because in the What’s new guide for 10gR1 this is mentioned:

Enhanced Bitmap Index Performance and Space Management

Bitmap indexes now perform better and are less likely to be fragmented when subjected to large volumes of single-row data manipulation language (DML) operations.

Let’s get dirty and do some querying. Here’s what happens in 9.2.0.7 (script shamelessly copied from Jonathan Lewis’s blog):
SQL> select * from v$version where banner like 'Oracle%';BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
SQL>
SQL>
SQL> create table t1 as
2 select
3 rownum id,
4 mod(rownum,2) bit_col
5 from
6 all_objects -- {or your favourite row source}
7 where
8 rownum <= 10000
9 ;
Table created. SQL>
SQL> create unique index t1_pk on t1(id);
Index created. SQL> create bitmap index t1_bi on t1(bit_col);

Index created.

SQL>
SQL> — gather statistics on the table
SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 1 1 2 16188 2891

SQL>
SQL> begin — update 250 rows
2 for i in reverse 9500..10000 loop
3 update t1
4 set bit_col = 0
5 where
6 id = i
7 and bit_col = 1;
8 end loop;
9 commit;
10 end;
11 /

PL/SQL procedure successfully completed.

SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 3 82 496 1505836 810761

SQL>
SQL> drop table t1;

Table dropped.

SQL>
SQL>

So what I see here is that the bitmap index has grown from 1 8K block to 0,77M, just by updating 250 rows of a 10.000 row table. Imagine what that will do to your bitmap index on your multi-million row DWH table.
Now, I only have an 10gR2 database available and nothing is mentioned in that documentation regarding this subject, but let’s see what’s happening in 10gR2 (assuming the same will be true for 10gR1):

SQL> select * from v$version where banner like 'Oracle%';BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi
SQL>
SQL>
SQL> create table t1 as
2 select
3 rownum id,
4 mod(rownum,2) bit_col
5 from
6 all_objects -- {or your favourite row source}
7 where
8 rownum <= 10000
9 ;
Table created.SQL>
SQL> create unique index t1_pk on t1(id);

Index created.

SQL> create bitmap index t1_bi on t1(bit_col);

Index created.

SQL>
SQL> — gather statistics on the table
SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 1 1 2 7996 2916

SQL>
SQL> begin — update 250 rows
2 for i in reverse 9500..10000 loop
3 update t1
4 set bit_col = 0
5 where
6 id = i
7 and bit_col = 1;
8 end loop;
9 commit;
10 end;
11 /

PL/SQL procedure successfully completed.

SQL>
SQL> validate index t1_bi;

Index analyzed.

SQL>
SQL> select
2 name, height, lf_blks, lf_rows, btree_space, used_space
3 from
4 index_stats
5 ;

NAME HEIGHT LF_BLKS LF_ROWS BTREE_SPACE USED_SPACE
—————————— ———- ———- ———- ———– ———-
T1_BI 1 1 2 7996 2853

SQL>
SQL> drop table t1;

Table dropped.

SQL>
SQL>

The index has not grown at all!

Now don’t go throwing away all your regular indexes and replace them all by bitmap indexes. There still is another downside to updating bitmap indexes, as Jonathan Lewis mentions in his post. They used to create a lot of redo in 9i and they still do use a lot of redo in 10g (Not to self; create script to demonstrate this). But there are certainly more cases where a bitmap index can save your day, or the performance of your datawarehouse.

And, as always, check if the results for this kind of use is the same in your own environment, or in IM speak; YMMV.





Encyclopedia spine in a DWH/BI context

26 05 2007

One of the blogs I read by Dominic Delmolini had an interesting question that seemed an nice exercise in the use of sql and probably analytic sql. It turns out that an elegant solution (by the questioner himself) uses the 10g addition of reg_exp_replace as well. I have been using regular expressions a lot in unix scripting and have grown fond of them, although they are terrible to read if you haven’t made them yourself.

The solution presented by Dominic made me think about uses for a query of this kind in the current day and age. Who still has a need for an encyclopedia after all, these days? Other than that it looks so nice on your bookshelves?

The thing that I came up with would be something along the lines of a materialized view for the contents of, for example, an enormous parts database with millions of parts, bolt and nut like parts that is. With this query (or materialized view based on it) you could present someone who is looking for a part but does not know what it is called exactly, with a quick way to find what he/she is looking for.  You could also use this in a datawarehouse to present an overview of the number of people selected for a mailing of some kind per part of the city – or street names the selected people live in. I think that would be a rather intuitive way of showing results that many people are used to, but, up to now, was not so easy to create.

Another thing learned.





Formatting elapsed seconds

10 05 2007

I guess we all have had that awkward moment when you want to know how many hours and minutes 11311 seconds is. I don’t understand why I can’t find such a function in Oracle, seems strange. Today was one of those moments so I created a query that does just that. I leave it to others to create a function from this.

This is what I started with:

SELECT TO_CHAR ( TO_DATE ('1900-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') + 1 / (24 * 3600),'HH24:MI:SS')
FROM DUAL;

A bit verbose so I changed it to:

SELECT TO_CHAR ( TRUNC (SYSDATE, 'DD') + 1 / (24 * 3600),'HH24:MI:SS')
FROM DUAL;

Both give the following result:

|formatted|
-----------
|00:00:01 |

That’s the principle, now for some test data

-- Create a table to hold some test data
CREATE TABLE father_time (seconds NUMBER);
-- Generate seconds as a fraction of a day into the table
-- I personally like the use of the pseudo column LEVEL in this statement

INSERT INTO father_time
WITH x AS
(SELECT (LEVEL - 1) / (24 * 3600)
FROM DUAL
CONNECT BY 1 = 1 AND LEVEL < 24 * 3600 + 3) -- slightly more than a day of seconds
SELECT *
FROM x;

-- Format the elapsed time into hours, minutes and seconds
-- if there are more than 24 hours of seconds the days are 'truncated'

SELECT seconds,
TO_CHAR (TRUNC (SYSDATE, 'DD') + seconds, 'HH24:MI:SS') AS formatted
FROM father_time
ORDER BY seconds DESC;

DROP TABLE father_time;

And here are some interesting parts of the output:

|seconds |formatted|
|-----------|---------|
|1,000011574| 0:00:01|
| 1| 0:00:00|
|0,999988426| 23:59:59|
|0,999976852| 23:59:58|
.
.
|0,99931713 | 23:59:01|
|0,999305556| 23:59:00|
|0,999293981| 23:58:59|
|0,999282407| 23:58:58|
|0,999270833| 23:58:57|
|0,999259259
| 23:58:56|

You can imagine the rest I guess.





Everything you always wanted to know about Dates

9 05 2007

If you start in a new sql dialect (Oracle PL/SQL, Transact-SQL, Ingres) it is always a hassle when you want to work with dates. And it appears I am not the only one that thinks so. In the weblog of Herod T. is a post that gives just the amount of detail someone needs who knows what he/she wants to do to a date column. Just not the exact right syntax. That is given in the pdf that is linked in the post above. Very useful to developers and reporters.