Comparing Overhead on the Execution Methods

by Allan Mitchell 7 Oct 2005 14:00

So we got to playing with some of the ways that you can execute an SSIS package and got kind of curious about any overhead associated with the execution methods. From these thoughts came a few tests and here are our findings.

We decided to look at performance over two different pipelines. We also had to decide on a "Control" so that we had something against which to measure. In the tests the "Control" was achieved by executing each pipeline in the Business Intelligence Development Studio with debugging (F5).

Pipeline 1

Single flow

This pipeline is a straight Source --> Destination pipeline. A Component is going to generate 10 million * 1 column DT_I4 rows.

Inside the data generator

The table below shows the comparison of the methods against our "Control"

Method Extra Information Duration as Percentage of Control
Business Intelligence Development Studio Debugging (F5) 100%
Business Intelligence Development Studio No Debugging (Open Window) (Ctrl+F5) 115%
Business Intelligence Development Studio No Debugging (Minimised Window) (Ctrl+F5) 55%
DTExec No Reporting 29%

Pipeline 2

Single flow

This pipeline is slightly different in that we are going to take the multicast transform and from our component generated 10 million rows generate 50 million rows by casting the data five times. Again a Component is going to generate 10 million * 1 column DT_I4 rows. The table below shows the comparison of the methods against our "Control"

Method Extra Information Duration as Percentage of Control
Business Intelligence Development Studio Debugging (F5) 100%
Business Intelligence Development Studio No Debugging (Open Window) (Ctrl+F5) 125%
Business Intelligence Development Studio No Debugging (Minimised Window) (Ctrl+F5) 32%
DTExec No Reporting 7%

What you cannot see from these figures is that the difference between the first and second pipelines' execution times is negligible (0.3 secs) when using DTExec which helps to illustrate a really important concept in SSIS. The Multicast transform is a synchronous component and therefore simply passes the input buffer downstream to the downstream components. The buffer is essentially memory. No mapping of memory between buffers is involved in our second pipeline so basically what you are seeing in the pipeline is 5 representations of the exact same thing. This makes it incredibly fast. The overhead in Business Intelligence Development Studio is incurred by trying to make things look pretty and logging what is happening.

Disclaimer

The results you see here are results we have experienced on our hardware, makes sure you test appropriately for your own configuration.

Add comment

  Country flag

biuquote
  • Comment
  • Preview
Loading