Category Archives: sql

SQL – Nuances of IsNumeric() function

I love complex sql queries.

Today I was working on generating a report that called for a join between two tables which looked as those below. The fist table (company) contained approx.80K records and the second (company_lookup) contained approx. 20K records.


.--------------------------------.
|             Company            |
+----+------------------+--------+
| Id | Name             | Ticker |
+----+------------------+--------+
|  1 | Dummy company 1  | IBM    |
|  2 | Dummy company 2  |   0002 |
|  3 | Dummy company 3  | 46ABD  |
|  4 | Dummy company 4  | NAN    |
|  5 | Dummy company 5  | BP     |
|  6 | Dummy company 6  | BP     |
|  7 | Dummy company 7  | MSFT   |
|  8 | Dummy company 8  | E      |
|  9 | Dummy company 9  | GOOG   |
| 10 | Dummy company 10 | SUN    |
| 11 | Dummy company 11 |   0046 |
‘----+------------------+--------’


.---------------------------.
|       Company_Lookup      |
+------------------+--------+
| Name             | Ticker |
+------------------+--------+
| Dummy company 1  | IBM    |
| Dummy company 2  |      2 |
| Dummy company 3  | 46ABD  |
| Dummy company 4  | NAN    |
| Dummy company 5  | BP     |
| Dummy company 7  | MSFT   |
| Dummy company 8  | E      |
| Dummy company 9  | GOOG   |
| Dummy company 11 |     46 |
‘------------------+--------’

As there can be multiple records with the same ticker, I tried writing a simple join on ticker and name.

select * from company a, company_lookup b
where a.ticker = b.ticker and
         a.name = b.name

However, notice the following rows


.--------------------------------.
|             Company            |
+----+------------------+--------+
| Id | Name             | Ticker |
+----+------------------+--------+
|  1 | Dummy company 1  | IBM    |
|  2 | Dummy company 2  |   0002 |  <-----
|  3 | Dummy company 3  | 46ABD  |
|  4 | Dummy company 4  | NAN    |
|  5 | Dummy company 5  | BP     |
|  6 | Dummy company 6  | BP     |
|  7 | Dummy company 7  | MSFT   |
|  8 | Dummy company 8  | E      |
|  9 | Dummy company 9  | GOOG   |
| 10 | Dummy company 10 | SUN    |
| 11 | Dummy company 11 |   0046 |  <-----
‘----+------------------+--------’

.---------------------------.
|       Company_Lookup      |
+------------------+--------+
| Name             | Ticker |
+------------------+--------+
| Dummy company 1  | IBM    |
| Dummy company 2  |      2 |   <-----
| Dummy company 3  | 46ABD  |
| Dummy company 4  | NAN    |
| Dummy company 5  | BP     |
| Dummy company 7  | MSFT   |
| Dummy company 8  | E      |
| Dummy company 9  | GOOG   |
| Dummy company 11 |     46 |   <-----
‘------------------+--------’

Essentially, the ticker is same but is stored differently in the two tables (although the data type is varchar in both tables). A simple join doesn’t work as it fails to compare, for ex., ‘2’ with ‘0002’. I can’t blindly convert the column to numeric either since the column also stores alphanumeric values. After a lot of tinkering, I managed to write a rather ugly query using ‘isnumeric’ and ‘case’ clause.

select b.*, a.* from company a, company_lookup b
where
case isnumeric(a.ticker)
    when 0 then a.ticker
    when 1 then cast(cast(a.ticker as int) as varchar(10))  --stripping leading zeros, if any, by converting to int and then back to varchar
end = b.ticker
and
b.name = a.name

My intention was to do a numeric conversion only if the tickers are numeric. To my dismay, this query didn’t work either and kept giving me a ‘Can’t convert NaN to varchar’ error. Upon debugging further, it turned out that the isNumeric() function treats ‘NaN’, ‘E’, ’31E’ etc. as numeric. However, the cast/convert idn’t able to convert ‘NAN’ to a valid numeric value.

This issue has also been touched upon in the following stackoverflow query: http://stackoverflow.com/questions/570075/sql-isnumeric-returns-true-but-sql-reports-conversion-failed

So, finally I had to modify the query to filter out such values as follows.

select b.*, a.* from company a, company_lookup b
where
case a.ticker
    when 'NAN' then 'NAN'
    when 'E' then 'E'
else
    case isnumeric(a.ticker)
        when 0 then a.ticker
        when 1 then cast(cast(a.ticker as int) as varchar(10))
    end
end = b.ticker
and
b.name = a.name

P.S. – By the way, the above tables have been generated using the Text::ASCIITable perl module. Also, WordPress has a habit of converting double dashes to some special character so I had to edit the HTML of this blog entry to replace all dashes with ampersand hash045;

Advertisements