Checksum Transformation

The Checksum Transformation computes a hash value, the checksum, across one or more columns, returning the result in the Checksum output column. The transformation provides functionality similar to the T-SQL CHECKSUM function, but is encapsulated within SQL Server Integration Services, for use within the pipeline without code or a SQL Server connection.

As featured in The Microsoft Data Warehouse Toolkit by Joy Mundy and Warren Thornthwaite from the Kimbal Group. Have a look at the book samples especially Sample package for custom SCD handling.

All input columns are passed through the transformation unaltered, those selected are used to generate the checksum which is passed out through a single output column, Checksum. This does not restrict the number of columns available downstream from the transformation, as columns will always flow through a transformation. The Checksum output column is in addition to all existing columns within the pipeline buffer.

The Checksum Transformation uses an algorithm based on the .Net framework GetHashCode method, it is not consistent with the T-SQL CHECKSUM() or BINARY_CHECKSUM() functions. The transformation does not support the following Integration Services data types, DT_NTEXT, DT_IMAGE and DT_BYTES.

ChecksumAlgorithm Property

There ChecksumAlgorithm property is defined with an enumeration. It was first added in v1.3.0, when the FrameworkChecksum was added. All previous algorithms are still supported for backward compatibility as ChecksumAlgorithm.Original (0).

  • Original – Orginal checksum function, with known issues around column separators and null columns. This was deprecated in the first SQL Server 2005 RTM release.
  • FrameworkChecksum – The hash function is based on the .NET Framework GetHash method for object types. This is based on the .NET Object.GetHashCode() method, which unfortunately differs between x86 and x64 systems. For that reason we now default to the CRC32 option.
  • CRC32 – Using a standard 32-bit cyclic redundancy check (CRC), this provides a more open implementation.

Installation

Installation requires Administrative privileges on the install machine. The task is provided as an MSI file which you can download and run to install it. This simply places the files on disk in the correct locations and also installs the assemblies in the Global Assembly Cache as per Microsoft’s recommendations. For SQL Server 2008 and below, you will have to add the transformation to the Visual Studio toolbox by hand. This process has been described in detail in the related FAQ entry for How do I install a task or transform component?, just select Checksum from the SSIS Data Flow Items list in the Choose Toolbox Items window.

Downloads

The Checksum Transformation is available for all versions of SQL Server from 2005 to 2016. Please choose the version to match your SQL Server version, or you can install multiple versions and use them side by side if you have more than one version of SQL Server installed.

Checksum Transformation for SQL Server 2005

Checksum Transformation for SQL Server 2008

Checksum Transformation for SQL Server 2012

Checksum Transformation for SQL Server 2014

Checksum Transformation for SQL Server 2016

Checksum Transformation for SQL Server 2017

Checksum Transformation for SQL Server 2019

Version History

SQL Server 2019

Version 7.0.2.30 – SQL Server 2019.
(29 Sep 2019)

SQL Server 2017

Version 6.0.2.30 – SQL Server 2017.
(29 Sep 2019)

SQL Server 2016

Version 5.0.2.29 – SQL Server 2016 GA release.
(6 Jun 2016)

SQL Server 2014

Version 4.0.2.29 – Updated installer including upgrade, Administrative install, extension mapping to support SSDT in Visual Studio 2015, and public Expression Editor control.
(6 Sep 2014)

Version 4.0.0.28 – SQL Server 2014 release. Includes upgrade support for 2005, 2008 and 2012 packages to 2014.
(6 Sep 2014)

SQL Server 2012

Version 3.0.0.28 – Fixed issue which caused reuse of 208 UI assembly. No runtime impact, just prevented UI from working on 2012 only install.
(23 Feb 2013)

Version 3.0.0.27 – SQL Server 2012 release. Includes upgrade support for both 2005 and 2008 packages to 2012.
(5 Jun 2012)

SQL Server 2008

Version 2.0.0.27 – Fix for CRC-32 algorithm that inadvertently made it sort dependent. Fix for race condition which sometimes lead to the error Item has already been added. Key in dictionary: ‘79764919’ . Fix for upgrade mappings between 2005 and 2008.
(19 Oct 2010)

Version 2.0.0.24 – SQL Server 2008 release. Introduces the new CRC-32 algorithm, which is consistent across x86 and x64.. The default algorithm is now CRC32.
(29 Oct 2008)

Version 2.0.0.6 – SQL Server 2008 pre-release. This version was released by mistake as part of the site migration, and had known issues.
(20 Oct 2008)

SQL Server 2005

Version 1.5.0.43 – Fix for CRC-32 algorithm that inadvertently made it sort dependent. Fix for race condition which sometimes lead to the error Item has already been added. Key in dictionary: ‘79764919’ .
(19 Oct 2010)

Version 1.5.0.16 – Introduces the new CRC-32 algorithm, which is consistent across x86 and x64. The default algorithm is now CRC32.
(20 Oct 2008)

Version 1.4.0.0 – Installer refresh only.
(22 Dec 2007)

Version 1.4.0.0 – Refresh for minor UI enhancements.
(5 Mar 2006)

Version 1.3.0.0 – SQL Server 2005 RTM. The checksum algorithm has changed to improve cardinality when calculating multiple column checksums. The original algorithm is still available for backward compatibility. Fixed custom UI bug with Output column name not persisting.
(10 Nov 2005)

Version 1.2.0.1 – SQL Server 2005 IDW 15 June CTP. A user interface is provided, as well as the ability to change the checksum output column name.
(29 Aug 2005)

Version 1.0.0 – Public Release (Beta).
(30 Oct 2004)

Screenshot

Checksum Transformation Editor dialog