Saturday, October 20, 2012

Gotcha - Oracle: Null

NULL essentially points to an "undefined" value in the database. It helps to conserve space in database columns. It's useful when we want to store "nothing" in a column. NULL is defined in ANSI, so all databases including ORACLE implement them. But, ORACLE differs from ANSI standards in dealing with NULLs. Some of these could be real gotchas for developers coming from other databases.

Gotcha 1 - Empty String is the same as NULL in Oracle SQL

In case you didn't know, empty String (also known as null string)('') and NULL are considered the same in Oracle. This is not ANSI standard and is a real gotcha for developers coming from other databases, where they are different.

Here is a SQL I found in one of the programs I am supporting:
SELECT * FROM employee WHERE NOT((employee.dept_nbr IS NULL) AND (employee.dept_nbr = '') );

In Oracle, the above Where condition is *not* really checking for 2 different things. Seems to me, the developer wanted to make sure it was not empty and it was not null either. This is because, he/she didn't realize Null String is essentially same as NULL in Oracle.

When in doubt, I always try a simpler query than original. Oracle system tables and the built in table DUAL come in handy for those queries. Here is a sample query that shows empty string is Null.
SELECT 'empty string is null' FROM dual WHERE '' IS NULL;

Another Gotcha - a bug in the SQL above


The above example had another bug. if the NULL was not the same as '', the above query would have failed completely!! (How can same field be NULL AND something else (here null string ('')) at the same time?). The above condition must have been
SELECT * from employee WHERE NOT((employee.dept_nbr IS NULL) OR (employee.dept_nbr = '') );

--
OR even below SQL:

SELECT * from employee WHERE (NOT(employee.dept_nbr IS NULL) AND NOT(employee.dept_nbr = '') );

Gotcha 2 - Checking for NULL

Did you notice "IS NULL" there? When you check for a value being NULL, you cannot use equal to (=). You have to use IS or IS NOT. This is yet another gotcha for many new developers to Oracle. Strange thing is Oracle will not complain if you entered '' = NULL in the above SQL. It simply won't find any rows!! Who better to explain this than Ask Tom?









[caption id="attachment_585" align="alignleft" width="329"] Oracle Select "IS NULL"[/caption]


[caption id="attachment_587" align="aligncenter" width="412"] Select equal to NULL results in "no rows found"[/caption]

If you are coming other databases, this may be a surprise to you (SQL Server for e.g., allowed = NULL check). It's a real gotcha in Oracle, because Oracle won't complain if you said "= NULL", but the query may not work as expected. A query may return less number of rows than expected, because of NULLs in some fields in the where condition. For e.g., below SQL will return only 2 rows (I expected the row with NULL to show as well).
SELECT emp_nbr, dept_nbr, first_name || ' ' || last_name || '(' || dept_nbr || ')' AS employee
FROM employee WHERE employee.dept_nbr IN (NULL, 100. 200);

The above WHERE condition must be written as,
WHERE employee.dept_nbr IS NULL OR employee.dept_nbr IN (100, 200);

Another way of doing this is to use the NVL function in Oracle. NVL translates NULL value to the value passed in.
WHERE NVL(employee.dept_nbr, 0) IN (0, 100, 200);

Here NULL is translated to 0 and this we can check in equality or IN condition.

Gotcha-3 Concatenating NULL (Empty String)


When you concatenate text fields together, if any of the value is NULL, the result will *NOT* be NULL in Oracle. In other databases (for e.g., SQL Server, mysql), String + NULL results in NULL. In Oracle, only NULL + NULL results in a NULL.
SELECT emp_nbr, dept_nbr, first_name || ' ' || last_name || '(' || dept_nbr || ')' AS employee
FROM employee WHERE (employee.dept_nbr IS NULL);

In this case, if you expected NULL, you are in for surprise. The concatenation actually works. Only the NULL values will be missing. (For e.g., this will return Sam Varadar() based on the sample data in the screenshots above). Though, Oracle is not promising this in the future versions. See here.

Gotcha-4 Nulls and Indexes


Another hidden gotcha in this is that when we a query uses an index, we are implicitly applying Equality checks. If a column that is in an index has NULL values in some rows, then those rows will not be indexed and thus won't be picked up!

This also means a primary key column cannot have NULL values. This is because a primary key column value identifies each row uniquely and thus they have to be in the primary index on the table.

Other NULLs


We saw above, concatenation of strings works even if one of the values is NULL. This is because, a NULL is also considered an Empty String. Other operations may not be as kind:

For e.g., Adding NULL to a Number will result in NULL.
SELECT 1 + null FROM dual;

In PL/SQL NULL is a NULL statement - a statement that does nothing. It is often needed in control blocks, like if-then-else where we want to leave the if or else part empty, but PL/SQL syntax doesn't allow this.
IF (dept_nbr = 100) THEN
dbms_output.put_line('Inside IF block');
ELSE
NULL; -- Do nothing
END IF;

When sorting a query result using ORDER BY, you can sort rows with NULLs to come FIRST or LAST (default).
SELECT * FROM employee ORDER BY dept_nbr DESC NULLS FIRST;

Functions to deal with NULLs

Because of the unpredictable  behavior of queries around NULL values, it's often better to translate NULL into something meaningful, so equality (or inequality) tests can be done without worrying about NULLs.

We showed the use of NVL above. It translates NULL into a more meaningful value, 0. NVL2 is a similar function that returns one value for NULL and another for Non-NULL values. COALESCE is another function that could be passed in a series of comma separated values and it will return the first non-NULL value in the list.

These functions can also be used while creating index on a column that may have NULLs and thus help with performance also. See here for more on this.

Other functions include NULLIF, DECODE etc. See here for a nice discussion of the functions related to NULL.

References

  1. http://psoug.org/reference/null.html

  2. http://stackoverflow.com/questions/203493/why-does-oracle-9i-treat-an-empty-string-as-null

  3. http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:17320984423926

  4. http://www.sqlines.com/oracle/string_concat

  5. http://www.oracle-base.com/articles/misc/null-related-functions.php

  6. http://thinkoracle.blogspot.com/2005/06/nulls-in-oracle.html

Quick Tip - EA Server: Reset Jagadmin password

This is a really quick tip on how to reset password for the administrator id "jagadmin" used in EA Server. We use this id for everything from running it in Development to production environment, only difference being in DEV we don't have a password for the id, to make it easier to move from machine to machine.

By default there is no password for the jagadmin id. The problem is, once you set a password, you cannot reset the password directly. The tools Jaguar manager (GUI) and the Jagtool (command line version) don't allow you to reset password "for security reasons". After I googled a bit, I found a way.

Surprisingly, all you have to do, to reset the password is to remove the line that has the below setting in the file called jaguar.props (this is the properties of the server itself, named jaguar).
com.sybase.jaguar.server.jagadminpassword=<password>

Where the <password> itself is stored encrypted.

This tip is only about resetting the password. So, I won't go into any further detail here. But this opens up the bigger topic about the default security model in Sybase EA Server, which  seems very weak. The administrator id, jagadmin is the default id and is needed if you want to do anything serious with EA Server. But, we manage to make it secure enough.More on this later.

If you are new to EA Server (also Jaguar), several of the terms will be new to you.  I intend to write  more on EA Server in future post. Please check back again later.

Quick Tip - *nix: Back Quote

While researching on an issue with a Java Stored procedure at work recently, I had to go to oracle udump directory to find out if there was any errors. To do this, we do grep like,
grep -i -l "nullpointer" *

to list the names of all log files that may contain NullPointer exceptions. -i = ignore case, -l = list only file names.

This tip is to be able to use that list in another command using back-quote (`). This is the character (also called grave accent) before 1 on PC keyboards, if you haven't used it. Back-quote is used for command substitution in *nix (unix, linux) systems. It's available in

For eg, to list those files with full directory listing:
ls -lt `grep -i -l "nullpointer" *`

Or to open the trace files found by grep, all at once in vi:
vi `grep -i -l "nullpointer" *`

(and then use :n in vi to go through the files)

Below line copies the found files out to another directory,
cp `grep -i -l "nullpointer" *` ~/nullsod

(where nullsod is a directory in my home directory.)

Command Substitution (using Back quote)

This process of executing commands inside back-quote is called command substitution. The command inside the back-quote is executed first and the output is plugged into the commandline that contains it. For e.g.,
echo `date`

returns System Date. Notice a back-quoted command is mostly embedded inside another. If you miss the echo there, it will try to return the date string to the shell which it will try to execute as a command which results in "Command not found" error.

Back-quote is often used in scripts to set the result into a variable. For e.g., we can save the date value in a variable called dt. Then we can use this variable in script.
$> export dt=`date +"%m%d%C%y%H%M%S"`
$> echo "Date is $dt"
$> cp output.log output.log."$dt"


#Where, $> is the shell prompt

In the above list, Date is executed with a specific format (mmddyyyhhmiss). This is then stored in a variable called dt. First we echo dt. Then we use dt to rename a file with date suffix, very handy when you are running a script on a regular basis and want to save the output in timestamped logs.

Incidentally, the statements above, show another kind of substitution, a string or variable substitution using double quotes("). Any expression inside a double quote is often evaluated first and substituted in the command that uses it. Notice the difference: double quote simply evaluates a string/variable and plugs in the value. A back-quote on the other hand, actually executes the string inside as a command and returns the  output to the enclosing command.

You can combine many of these expressions, and that's what makes it all powerful. If you haven't tried back-quote yet, try it. You will be glad you did!

Useful links

  1. Command Substitution in Unix

  2. Unix Shell Substitutions

  3. Advanced Bash Scripting






† udump - user dump - contains user trace files in Oracle database; all print statements from stored programs in Oracle (Java, PL/SQL) end up here.
‡ udump directory may have a large # of files depending on activity, in which case, grep may not work with wild card (*). You may have to refine the grep parms.

Quick Tip - *nix: ulimit

Last week one of our DBAs called me. She was trying to install Oracle 11g. She obtained zip files for the installation from another co-worker. This had many zip files, probably one per installation disk. She was trying to unzip the files and got some error.

Like always, I got down to business of checking and googling. When I tried to unzip the first file, it gave an error as to missing zip file headers. It suggested that the file could be part of a multi-part zip file. Good old google suggested how to combine a multi-part zip file. Just cat each file into one big file, then unzip it:
cat file1.zip file1.zip file3.zip > fullfile.zip

Unzip failed on the fullfile.zip. When I checked the file size, it had he same size as the 1st file, the largest file. Strangely, they both had the size of about 1GB. More googling revealed role of ulimit!

Resource limits on *NIX systems (ulimit)

All *nix systems (all flavors of Unix and Linux) have a limits on system resources per user. The limits include cpu time per process, number of processes, largest file size, memory size, number of file handles etc (file size is defined in number of blocks.). The root user gets unlimited resources, but all others are assigned a finite settings as defined in /etc/security/limits on AIX. (/etc/security/limits.conf in Ubuntu). Here is a good post, as to why you want to limit resources for users. To check the limits, use ulimit command.
ulimit -a

There are Hard limits (ulimit -Ha) set by the root and Soft limits (ulimit -Sa) a user can set herself.  See this blog for details on using ulimit.

Role of ulimit in our problem

Continuing our story about faile Oracle install above, when we did ulimit -a, one particular entry caught our attention. file size was set to 1GB (approx., remember it's set in blocks?). Bingo, this is why when we try to create the fullfile.zip, it got truncated at 1GB, the limit for the user. Apparently, the DBAs are supposed to have unlimited file size. She got onto calling the Unix admin.

To cut the story short, apparently ulimit played a role in getting the original zip files as well. When she downloaded the files, it truncated the biggest of files to 1GB each. Our first file happened to be one and that's why it looked incomplete. To make matters worse unzip and gzip ended in pointing to a multi-part zip file, which they weren't.

Like I said, this is true for all *nix systems. This is very powerful, in controlling runaway processes sucking system resources away. But there may be a legitimate situation like above, where you need to override the settings. So, next time your file seem corrupted or truncated, remember to check ulimit.

References

  1. ulimit in IBM AIX

  2. ulimit on HP-UX

  3. ulimit on Linux - blog

  4. ulimit on Unix - blog






Depending on the *nix you are using, this may be variously mentioned as limits per user, per shell and it's descendants. Some of the settings like cpu time is applied to individual processes user/shell is running.