To illustrate, we will use a user-defined function to solve a problem encountered by many of our clients that use a specific Advancement product where dates are stored as character strings in the database. In this product, the "date" column is mandatory; however, there may be situations when the end user does not know the date value to be entered. In that case, as much information is entered as possible. This may lead to zeroed out values for unknown date components. For example, if the year is not known, but the month and day is, the end user may enter a value of '00000905' (for September 5th with an unknown year). If no date component is known, the end user may enter placeholder values such as all zeros or a space. To further complicate matters, since the date is a character string and validation at the database level is not possible, there may be occurrences of invalid date formats entered.
Note: Of course storing dates as character strings is not good practice, we are simply using this as an example of overcoming a problem with the use of user-defined functions.
To begin, we will create a test table with "date" values stored as character strings. The expected "good" format for the date values should be 'YYYYMMDD'.
CREATE TABLE date_test
( date_string VARCHAR2(8) );
( date_string VARCHAR2(8) );
Now a few rows will be inserted into the table.
INSERT INTO date_test VALUES ('00000000');
INSERT INTO date_test VALUES (' ');
INSERT INTO date_test VALUES ('19920821') -- Valid format;
INSERT INTO date_test VALUES ('08011999');
INSERT INTO date_test VALUES ('00000810');
INSERT INTO date_test VALUES (' ');
INSERT INTO date_test VALUES ('19920821') -- Valid format;
INSERT INTO date_test VALUES ('08011999');
INSERT INTO date_test VALUES ('00000810');
Additional rows are added to illustrate performance strain encountered with user-defined functions and context switching.
INSERT INTO date_test SELECT * FROM date_test;
/
/
/
/
/
/
/
/
/
/
/
/
/
COMMIT;
/
/
/
/
/
/
/
/
/
/
/
/
/
COMMIT;
Next, a function is created to test for valid dates. The function tries to perform an explicit conversion using TO_DATE. If the conversion fails, the date is not valid and an 'N' is returned; otherwise, a 'Y' is returned.
CREATE OR REPLACE FUNCTION is_date ( p_string IN VARCHAR2 , p_format IN VARCHAR2 DEFAULT 'YYYYMMDD' ) RETURN VARCHAR2 IS tmp DATE; BEGIN tmp := TO_DATE(p_string, p_format); RETURN 'Y'; EXCEPTION WHEN OTHERS THEN RETURN 'N'; END is_date; / SHO ERR
Take a look at the number of rows in the table, along with the result of the IS_DATE function for each distinct value.
SELECT COUNT(*) FROM date_test; COUNT(*) ---------- 81920 COL is_date FORMAT A7 SELECT DISTINCT date_string, is_date FROM ( SELECT date_string , is_date(date_string) AS is_date FROM date_test ); DATE_STR IS_DATE -------- ------- 00000000 N 08011999 N 00000810 N 19920821 Y N
EXPLAIN PLAN FOR
SELECT * FROM date_test
WHERE is_date(date_string) = 'Y';
SELECT * FROM TABLE(dbms_xplan.display());
Attempt to create a function-based index to improve performance. Unfortunately, a user-defined function cannot be used in a function-based index unless the function is determined to be DETERMINISTIC.SELECT * FROM date_test
WHERE is_date(date_string) = 'Y';
SELECT * FROM TABLE(dbms_xplan.display());
--------------------------------------- | Id | Operation | Name | --------------------------------------- | 0 | SELECT STATEMENT | | |* 1 | TABLE ACCESS FULL| DATE_TEST | ---------------------------------------
CREATE INDEX date_test_fbi ON date_test(is_date(date_string));
*
ERROR at line 1:
ORA-30553: The function is not deterministic
Change the function to be 'DETERMINISTIC' to allow an FBI to be created.*
ERROR at line 1:
ORA-30553: The function is not deterministic
CREATE OR REPLACE FUNCTION is_date ( p_string IN VARCHAR2 , p_format IN VARCHAR2 DEFAULT 'YYYYMMDD' ) RETURN VARCHAR2 DETERMINISTIC IS tmp DATE; BEGIN tmp := TO_DATE(p_string, p_format); RETURN 'Y'; EXCEPTION WHEN OTHERS THEN RETURN 'N'; END is_date; / SHO ERR
CREATE INDEX date_test_fbi ON date_test(is_date(date_string));
Re-examine the execution plan of the query executed earlier to make sure the index is being used.EXPLAIN PLAN FOR
SELECT * FROM date_test
WHERE is_date(date_string) = 'Y';
SELECT * FROM TABLE(dbms_xplan.display());
Now that the index is used, the function provides a much improved performance boost, as the function does not have to be executed, thus allowing the PL/SQL logic to be contained within a single function (rather than trying to valid dates directly in SQL) in an efficient way.SELECT * FROM date_test
WHERE is_date(date_string) = 'Y';
SELECT * FROM TABLE(dbms_xplan.display());
----------------------------------------------------- | Id | Operation | Name | ----------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | TABLE ACCESS BY INDEX ROWID| DATE_TEST | |* 2 | INDEX RANGE SCAN | DATE_TEST_FBI | -----------------------------------------------------
Our next blog will illustrate the difference between DETERMINISTIC and NON-DETERMINISTIC functions.
No comments:
Post a Comment