All the dataflow components available in SSIS can be categorized as either Synchronous or Asynchronous components.
Synchronous components (non-blocking)
A simple explanation of Synchronous transformation is that a synchronous transformation processes incoming rows and passes them on in the data flow one row at a time. The output is synchronous with input, meaning that it occurs at the same time. Therefore, to process a given row, the transformation does not need information about other rows in the data set.
The output of a synchronous component uses the same buffer as the input. Reusing the input buffer is possible because the output of a synchronous component always contains exactly the same number of records as the input. Synchronous (non-blocking) transformations always offer the highest performance.
Synchronous transformations are either stream-based or row-based. Streaming transformations are calculated in memory and do not require any data from outside resources to transform the data. These are the fastest transformations around. Row-based transformations run a little bit slower because they require calling a service or looking up data from another source to calculate their values.
We can summarize Synchronous Transformations as below
- The output is synchronous with the input
- Input and Output record count stays the same (rows entering = rows exiting the transformation)
- Does not need information about any other rows inside the data set
- Can operate on the same buffer
- Better performing than an asynchronous transformation
- Synchronous transformations are either stream-based or row-based.
Asynchronous components (Semi-blocking and Fully-blocking)
We use Asynchronous transformation when it is not possible to process each row independently of all other rows. An example is the Sort transformation, where the component has to process the complete set of rows to generate a sort output. The output of an asynchronous component uses a new buffer. It’s not possible to reuse the input buffer because an asynchronous component can have more or less output records then input records.
All source adapters are asynchronous, they create two buffers; one for the success output and one for the error output. All destination adapters, on the other hand, are synchronous.
Semi-Blocking Asynchronous Transformations require a subset of the data to be collected before they can be sent to the destination(s). The shape of the data can change. A subtotal or sampling of data may be extracted from the source(s).
Fully Blocking Asynchronous Transformations are the slowest transformations. They require all the data to be pulled from the source(s) before they can be sent to the destination(s). All source data must be loaded into memory first – ouch! As much as we should try to avoid these, they can sometimes still be required, such as sorting data pulled from a flat-file source.
If there is more data than the memory available, it will use the %TEMP% directory to cache some of the data. If you want to use a different location, you can set the BufferTempStoragePath property of the Data Flow Task to point to a different folder location.
We can summarize Asynchronous Transformations as below
- Does not process rows independently in the dataset
- Rather than output rows as they are processed, the transformation must output data asynchronously, or at a different time
- Record counts usually change from input to output
- Must create a new buffer upon the output of the transformation
- Generally poorer performance than synchronous transformation
- Typically a Semi-Blocking or Blocking Transformation
Below table categorizes various Synchronous and Asynchronous Transformations available in SSIS