Multicast Transform and Threading in SQL Server 2008

by Allan Mitchell 19 Sep 2007 14:00

The Multicast transform in SSIS 2005 enables us to take 1 input dataset and from it generate n output datasets that are an exact copy of the input dataset which is extremely useful at times and we have used it on a number of projects. The downside is that those outputs are synchronous outputs and therefore are on the exact same thread as the input dataset. Normally a Synchronous output is good as no memory has to swap buffers and it is exceptionally quick. A problem arises if we have on one or more of those outputs a blocking Asynchronous transform. When this happens processing of the other outputs is halted until the Asynchronous transform completes. This can really hurt performance. What people used to do was to force a new buffer on the outputs of the multicast by adding a UNION ALL transform with only one input. This had the effect of generating a new execution tree and therefore engine thread.

In SSIS 2008 the threading model around SSIS has changed to be more performant and take advantage of today's multi core, multi-processor boxes. The Multicast transform is an excellent example of this optimisation. Each output on the transform is broken down into parallel execution paths. This threading optimisation promises to have many more enhancements on performance in other parts of SSIS such as packages with deep parallel execution paths.

Take this very simple data flow for instance:

Multicast sample package

When this flow runs in SSIS 2005 the Execution Tree format looks like the image underneath. As you can see only one tree is created and therefore only one thread:

begin execution tree 0
   output "Output 0" (4)
   input "Multicast Input 1" (24)
   output "Multicast Output 1" (25)
   input "Row Count Input 1" (29)
   output "Row Count Output 1" (30)
   output "Multicast Output 2" (40)
   input "Row Count Input 1" (33)
   output "Row Count Output 1" (34)
   output "Multicast Output 3" (42)
   input "Row Count Input 1" (37)
   output "Row Count Output 1" (38)
   output "Multicast Output 4" (44)
end execution tree 0

Let us take this a step further and look at the Execution plan for this SQL Server 2005 SSIS package Data Flow:

SourceThread0
   Drives: 1 
   Influences: 23 27 31 35 
   Output Work List
      CreatePrimeBuffer of type 1 for output ID 4.
      SetBufferListener: "WorkThread0" for input ID 24 
      CallPrimeOutput on component "Script Component" (1)
   End Output Work List
End SourceThread0
WorkThread0
   Drives: 23 
   Influences: 23 27 31 35 
   Input Work list, input ID 24 (1 EORs Expected)
      CallProcessInput on input ID 24 on component "Multicast" (23) for view type 0
      CallProcessInput on input ID 29 on component "Row Count" (27) for view type 0
      CallProcessInput on input ID 33 on component "Row Count 1" (31) for view type 0
      CallProcessInput on input ID 37 on component "Row Count 2" (35) for view type 0
   End Input Work list for input 24
   Output Work List
   End Output Work List
End WorkThread0

Now have a look at the exact same data flow except this time in SSIS 2008. As you can see the outputs are broken into Subpaths and this allows those outputs to go out and get their own engine threads:

Begin Path 0
   output "Output 0" (4); component "Script Component" (1)
   input "Multicast Input 1" (49); component "Multicast" (48)
   Begin Subpath 0
      output "Multicast Output 1" (50); component "Multicast" (48)
      input "Row Count Input 1" (54); component "Row Count" (52)
   End Subpath 0
   Begin Subpath 1
      output "Multicast Output 2" (65); component "Multicast" (48)
      input "Row Count Input 1" (58); component "Row Count 1" (56)
   End Subpath 1
   Begin Subpath 2
      output "Multicast Output 3" (67); component "Multicast" (48)
      input "Row Count Input 1" (62); component "Row Count 2" (60)
   End Subpath 2
End Path 0

Finally for the SQL Server 2008 package Data Flow, the Execution Plan:

Begin output plan   
   Begin transform plan
   End transform plan
   
   Begin source plan
      Call PrimeOutput on component "Script Component" (1)
	       for output "Output 0" (4)
   End source plan
End output plan

Begin path plan
   Begin Path Plan 0
      Call ProcessInput on component "Multicast" (48) for input "Multicast Input 1" (49)
	  Create new execution item for subpath 0
	  Create new execution item for subpath 1
	  Create new execution item for subpath 2
	  Begin Subpath Plan 0         
	     Call ProcessInput on component "Row Count" (52) for input "Row Count Input 1" (54)
	  End Subpath Plan 0
	  Begin Subpath Plan 1
	     Call ProcessInput on component "Row Count 1" (56) for input "Row Count Input 1" (58)
	  End Subpath Plan 1
	  Begin Subpath Plan 2         
	     Call ProcessInput on component "Row Count 2" (60) for input "Row Count Input 1" (62)
      End Subpath Plan 2
   End Path Plan 0
End path plan

Add comment

  Country flag

biuquote
  • Comment
  • Preview
Loading