Hashing is part of Data Vault 2.0. Hash Keys are used to

  • provide a surrogate for business keys
  • detect changes in fields in Satellite tables.
Oracle, SQL Server, Informatica, etc have built-in functions to compute a hash key.
Example with Oracle:
 select upper(rawtohex(DBMS_OBFUSCATION_TOOLKIT.MD5(input_string => upper('Hash Function Test')))) as Hash  
 from  dual;

HASH
—————————————————————————————————————————-
477203E7062D1C485D7BF526FCF5CBAF

Example with SQL Server:

 select UPPER(CONVERT(char(32), HASHBYTES('MD5', UPPER('Hash Function Test')), 2)) as Hash 

 

DB2 has no built-in MD5 Hash Function that produces compatible results. DB2 has an Oracle compatibility mode that provides some of the functions from Oracle. Unfortunately, DBMS_OBFUSCATION_TOOLKIT is not part of the Oracle compatibility mode. Therefore it is necessary to extend DB2 functionality by implementing a UDF.

There are several Open Source MD5 implementations in C available that could be used to create a DB2 UDF. A ready-for-use DB2 approach is provided by Helmut K. C. Tessarek on github.

He provides C Code and DDL scripts to create DB2 UDFs for Linux and Windows. The complete installation is documented in a manual available on github. Installation steps for Linux in short are:

  1. If not already part of your Linux system, install apr and apr-util from http://apr.apache.org/
  2. Install Helmut K. C. Tessarek’s functions by running makertn
    Variable APRPATH and APUPATH must be set within the script (e.g. /usr/local/apr/bin)
  3. Run register.dll to create the UDF functions within DB2.
It should be checked if apr and apr-util libs are loaded after installation of apr and apr-util
 ldconfig -p |grep apr  
     libexpat.so.0 (libc6,x86-64) => /usr/local/apr/lib/libexpat.so.0  
     libexpat.so (libc6,x86-64) => /usr/local/apr/lib/libexpat.so  
     libaprutil-1.so.0 (libc6,x86-64) => /usr/local/apr/lib/libaprutil-1.so.0  
     libaprutil-1.so (libc6,x86-64) => /usr/local/apr/lib/libaprutil-1.so  
     libapr-1.so.0 (libc6,x86-64) => /usr/local/apr/lib/libapr-1.so.0  
     libapr-1.so (libc6,x86-64) => /usr/local/apr/lib/libapr-1.so  

If the the libs are not loaded, the path to apr/apr-util libs (e.g. /usr/local/apr/lib) has to be entered into file “/etc/ld.so.conf”. The command “ldconfig” will load the libs afterwards.

Finally, there is an md5 function “php_md5” available that can be used to generate Hash Keys in DB2 LUW:
 select UPPER(php_md5(UPPER('Hash Function Test')))  
 from  sysibm.sysdummy1;

——————————–
477203E7062D1C485D7BF526FCF5CBAF

 

The result of “php_md5” is char(32). The other provided md5 function is a different implementation and produces different results.

 

Update: Db2 LUW 11.1 onwards has a hash function: HASH (<string>, algorithm). MD5 is algorithm = 1: HASH(‘Test’, 1)