background shape
background shape

How to use MD5 in SQL and AMPScript

Message Digest 5 (MD5) is a hash function used for data integrity, but it is not suitable for security-critical applications. It has found use for various applications, one of which is to create an encrypted contact subscriber key. MD5 takes any string and produces a fixed-size (128-bit) hash value, that cannot be easily decrypted.

To generate MD5 hash with MSSQL we can use following SQL select clause

CONVERT(VARCHAR(32), HashBytes('MD5', LTRIM(RTRIM(Lower([EmailAddress])))), 2) as SubscriberKey
  1. HashBytes('MD5', LTRIM(RTRIM(Lower([EmailAddress])))):
    • Lower([EmailAddress]): This function converts the value in the [EmailAddress] field to lowercase. It ensures that the hash calculation is case-insensitive.
    • LTRIM(RTRIM()): These functions are used to remove leading and trailing spaces from the lowercase email address. Trimming is commonly done to ensure consistent results when hashing.
    • HashBytes('MD5', ...): This function calculates the MD5 hash of the processed email address, producing a binary output.
  2. CONVERT(VARCHAR(32), ..., 2) as SubscriberKey:
    • CONVERT(VARCHAR(32), ..., 2): This part converts the binary MD5 hash result into a hexadecimal string representation of length 32 characters.
    • as SubscriberKey: This gives the resulting hexadecimal string the alias “SubscriberKey” for use in the query’s output.

In MSSQL Server, NVARCHAR and NCHAR data types store Unicode characters, and by default, they use UTF-16 encoding. So, when you convert binary data (like an MD5 hash) to a VARCHAR in SQL Server without specifying a character encoding, it typically uses the database’s default encoding, which is UTF-16 for Unicode data types.

If you intend to hash strings in SSJS or AMPScript to match the results obtained from SQL Server (which uses UTF-16 encoding by default), you should ensure that your string encoding matches. This means that you should normalize and encode your strings in UTF-16 before hashing to achieve consistent results.

Thus in AMPScript and SSJS we will have to use UTF-16 character set for MD5 hashing.

%%[
/*
MD5(1,2)
1 - string to hash
3 - character set. Default is UTF-8
*/
SET @emailAddress = "exmample@martechnotes.com"
SET @MD5EmailAddress  = MD5(@emailAddress, "UTF-16")
]%%
<script runat=server>
  /*
  MD5(1,2)
  1 - string to hash
  3 - character set. Default is UTF-8
  */
   var email = Platform.Variable.GetValue("@emailAddress");
   Platform.Function.MD5(email, 'UTF-16');
</script>

Oh hi there 👋
I have a FREE e-book for you.

Sign up now to get an in-depth analysis of Adobe and Salesforce Marketing Clouds!

We don’t spam! Read our privacy policy for more info.

Share With Others

Leave a Comment

Your email address will not be published. Required fields are marked *

MarTech consultant

Marcel Szimonisz

Marcel Szimonisz

I specialize in solving problems, automating processes, and driving innovation through major marketing automation platforms.

Buy me a coffee