SSIS – Merge Join vs Lookup Transformation

SSIS Concepts

What is the difference between Lookup and Merge Join transformation in SSIS?

Lookup and Merge Join component in SSIS are primarily used for joining two different sources. But there are several differences between these two transformations. Let us first look at what each transformation is mainly used for.

If you want to see the difference between Merge and Merge Join Transformations, you can read them here.

Lookup Transformation

Let us consider an example where we have an OrderDetails table which contains information about orders placed and another table Customers which have Customer information. If you want to get the Customer information based on each order in the OrderDetails table, you can use Lookup Transformation.

Merge vs Lookup

So, in short, we can use Lookup when you want to find or lookup rows matching in Source 2 based on Source 1 input. Typically, values from the reference dataset (Source 2) are added to the transformation output. In our example, the Lookup transformation will extract Customer Name from the Customer table using a value (CustomerKey) from the input column, and then add the Customer name to the transformation output. The values from the reference table can replace column values or can be added to new columns.

  • The Lookup transformation tries to perform an equi-join between values in Source 2 and Source 1 input.
    • If there is no matching entry in the reference dataset, rows without matching entries are considered as errors by default. But we can configure the Lookup transformation to redirect such rows to a no match output.
    • If there are multiple matches in the reference table, the Lookup transformation returns only the first match returned by the lookup query.
    • If multiple matches are found, the Lookup transformation generates an error or warning only when the transformation has been configured to load all the reference dataset into the cache. In this case, the Lookup transformation generates a warning when the transformation detects multiple matches as the transformation fills the cache.
  • We can join multiple columns in the input to columns in the reference dataset. The transformation supports join columns with any data type, except for DT_R4, DT_R8, DT_TEXT, DT_NTEXT, or DT_IMAGE.
  • The lookups performed by the Lookup transformation are case sensitive.

You can read more about Lookup Transformation here.

What is SSIS Merge Join Transformation

Merge Join transformation merge the 2 sorted datasets and gives output as single data set by joining on FULL or LEFT or INNER JOIN. The joining columns in both the datasets should be in sorted order and the same metadata type. This transformation has two inputs and one output. It does not support an error output.

In our example, if we use Merge Join to join OrderDetails and Customer tables, the result is a Table which lists all the Orders and their corresponding Customer details.

  • Merge Join Transformation requires sorted data for its inputs.
  • Merge Join transformation requires that the joined columns have matching metadata.

You can configure the Merge Join transformation in the following ways:

  • Specify the join is a FULL, LEFT, or INNER join.
  • Specify the columns the join uses.
  • Specify whether the transformation handles null values as equal to other nulls.

You can read more about Merge Join here.

We can summarize the difference between Merge Join and Lookup Transformation as below

Author: simpleBIinsights

Leave a Reply