When Others Then John Doe; —

A few weeks ago, while doing PL/SQL code review, I came across this gem:Screenshot-john-doe.sql (-tmp) - gedit

Of course upon seeing that, my head exploded.  Then it started a chain reaction in our office with everyone’s head exploding in sequence, like in the movies where our hero sets off a sequence of claymore mines.  Some things cannot be unseen 🙂

While this is an extreme example, I see plenty of other examples of WHEN OTHERS abuse.  This one would return “John Doe” if you ran out of temp space, had a constraint violation, too many rows, no data found, and practically any other condition.  I pretty sure that is not the intended design of the application.

Consider this misuse of WHEN OTHERS that I see a lot:

WHEN OTHERS THEN
ROLLBACK;

The developer is obviously trying to ensure that all incomplete transactions are rolled back on an error, but Oracle does that anyway.  So this is unnecessary, but if there is an error, it appears to work correctly while not actually doing anything.  This can be catastrophic.  Consider the following trivial example, where we want to backup the EMP table and keep backups for 30 days:

Screenshot-backup_emp.sql (~) - gedit-2

So we set this up and run it for a year as a scheduled job.  EMP has 896 rows in it, and the backup runs perfectly each time:

SQL> select count(*) from emp;

  COUNT(*)
----------
       896

Since I don’t really have a year to wait for the result, I’ll just run the backup job in a loop and display the counts in the backup_emp table afterwards. It should increase over time, then after 30 days remain about the same right? This script just runs it several times:

set serveroutput on
  2  declare
  3    v_count number;
  4  begin
  5  for i in 1..11 loop
  6    run_backup_emp;
  7    select count(*) into v_count from backup_emp;
  8    dbms_output.put_line('Backup run ' || i || ' complete, backup_emp count is ' || v_count);
  9  end loop;
 10* end;

So lets run it:

Backup run 1 complete, backup_emp count is 896
Backup run 2 complete, backup_emp count is 1792
Backup run 3 complete, backup_emp count is 2688
Backup run 4 complete, backup_emp count is 3584
Backup run 5 complete, backup_emp count is 4480
Backup run 6 complete, backup_emp count is 5376
Backup run 7 complete, backup_emp count is 6272
Backup run 8 complete, backup_emp count is 7168
Backup run 9 complete, backup_emp count is 8064
Backup run 10 complete, backup_emp count is 8960
Backup run 11 complete, backup_emp count is 0

PL/SQL procedure successfully completed.

WTF? What deleted all of the data in my backup_emp table? Of course since Murphy is an optimist, someone runs an “upgrade” script which fails but clobbers the EMP table before it fails, and here we are:

SQL> select count(*) from backup_emp;

  COUNT(*)
----------
	 0

1 row selected.

After a week of downtime, during the blamestorming session, everyone still asks, “what happened to all backup data? Have we been hacked?”

Our DBA takes out the WHEN OTHERS THEN ROLLBACK; and re-runs the job:

SQL> @p1
Backup run 1 complete, backup_emp count is 896
Backup run 2 complete, backup_emp count is 1792
Backup run 3 complete, backup_emp count is 2688
Backup run 4 complete, backup_emp count is 3584
Backup run 5 complete, backup_emp count is 4480
Backup run 6 complete, backup_emp count is 5376
Backup run 7 complete, backup_emp count is 6272
Backup run 8 complete, backup_emp count is 7168
Backup run 9 complete, backup_emp count is 8064
Backup run 10 complete, backup_emp count is 8960
declare
*
ERROR at line 1:
ORA-01653: unable to extend table C##SCOTT.BACKUP_EMP by 8 in tablespace DEMO
ORA-06512: at "C##SCOTT.RUN_BACKUP_EMP", line 3
ORA-06512: at line 5

So our backup failed silently for a year because we had shitty error handling and ran out of space. We didn’t really need those backups, did we?

Comments are closed.