Making Use of Translate

From TekiWiki
Jump to: navigation, search

The Oracle Database has a SQL function called "translate" which can do character by character switching and removal. The format of the function is:

 translate(original_text, source_characters, target_characters)

If any of the source_characters are found in the original_text then they are replaced with the corresponding target_character. So any occurrences of the first source character in the original_text will be replaced with the first character of the target, the second with the second and so on.

If there is no correspond target for a source character (due to the target being shorter than the source), then that character is removed from the source altogether.

Finding Formats

Ever had the problem where some of the rows on a large Oracle Database table do not conform to a particular format? Or you wonder which different formats are in use?

Whether you are using SQL Plus or PeopleSoft Query, this where the translate command comes into its own. For instance, imagine you have a column the contains telephone numbers, but they are in a variety of formats, and you want to know how many different formats there are and how many rows in each format;

select translate(PHONE_NBR
, '1234567890'
, '9999999999') PHONE_FORMAT
, count(*) ROW_COUNT
group by translate(PHONE_NBR,'1234567890','9999999999')

This would yield something like:

(999) 999 9999   2567
999-999-9999     23848
(99999) 999999   76658
+99 999 999 999  3

The last three might have been difficult to find without this trick.

Finding Odd Characters

An odd character is just simply one you weren't expecting. If you get rid of all the "normal" characters all you are left with is oddities. You just need to list those characters you think are OK. If there is no corresponding character in the second string then it gets removed. If the replacement characters is empty however, translate gets over ambitious and removes all the characters, so we have to include one odd character to map to itself at the beginning. We can pick any character that isn't in our "normal" list. It does not need to exist in the column we are examining.

So let's say our "normal" list is alphabetic characters, and we have chosen tilde (~) as our oddity. This would give us the following translate:

, '~abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
, '~') ODD_CHAR

This will give all the rows with non alphabetic characters and show them. Depending on the requirements, other characters may be excluded too - such as space or hyphen.

I hope you found this tip useful. For more hints and tips like this, see the eBook:


Using Functions in PeopleSoft Query

PeopleSoft Query