12 Nov 2010 13:46
This is a quick walkthrough of how you can use the Script Component to perform Conditional Split-like behaviour, splitting your data across multiple outputs. We will use C# code to decide which rows flow to which output, rather than the expression syntax of the Conditional Split transformation. Start by setting up the source. For my example the source is a list of SQL objects from sys.objects, just a quick way to get some data: SELECT type, name FROM sys.objects
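A minimal sketch of the idea, assuming a Script Component (transformation) with two outputs that are both synchronous to the input and share the same non-zero ExclusionGroup. The DirectRowToTables/DirectRowToOthers methods are generated by SSIS from my assumed output names "Tables" and "Others", and Row.type/Row.name come from the source query above; your generated names will differ.

```csharp
// Sketch only: runs inside the SSIS Script Component host, where
// Input0Buffer and the DirectRowTo* methods are code-generated
// from the component's metadata.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // sys.objects stores type as a padded char, e.g. "U " for user tables
    if (Row.type.Trim() == "U")
        Row.DirectRowToTables();   // user tables go to the first output
    else
        Row.DirectRowToOthers();   // everything else goes to the second
}
```

Because the outputs are synchronous and in one exclusion group, each row stays in the same buffer and is simply directed to exactly one of the two outputs, mirroring what the Conditional Split transformation does with expressions.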
29 May 2010 20:27
In this post I want to show a couple of ways to order the data that comes into the pipeline. A number of people have asked me about this, primarily because there are several ways to do it, but also because some components in the pipeline require sorted inputs. One of the methods I show is visually easy to understand; the other is less visual but potentially more performant.
14 May 2010 15:40
How to optimise doing the same lookup multiple times. [More]
14 May 2010 11:51
Not many people realise that the Integration Services pipeline can be intentionally throttled back under the covers to stop you from running out of memory. In this video we show you how to detect when this is happening and explain more about the limiter itself. More...
1 May 2009 08:40
The Trace File Source adapter is a useful addition to your SSIS toolbox. It allows you to read profiler traces stored as .trc files into the Data Flow. From there you can perform filtering and analysis using the power of SSIS. There is no need for a SQL Server connection; this just uses the trace file.
Cache warming for SQL Server Analysis Services
Reading the flight recorder
Find out the longest running queries on a server
Analyze statements for CPU and memory usage by user, or by some other criteria you choose
The Trace File Sour... [More]
18 Feb 2009 12:23
We in no way work for Attunity, but we were asked to test drive a beta version of their Oracle CDC solution for SSIS. Everybody should know that moving more data than you need to takes too much time and uses resources that may be better employed doing something else. Change Data Capture is a technology designed to help you identify only the data that has had something done to it, so you can move only what is needed. Microsoft has implemented this exact functionality in SQL Server 2008, and I really like it there. Attunity, though, is doing it for Oracle... [More]
31 Oct 2008 08:22
Having your users walk into a cold cache in the morning will cause their queries to run slower initially. Cache warming takes care of a lot of that by executing typical queries against the newly processed cube, thereby building the cache ready for your users when they get in. There are a number of ways to warm the cache. Chris Webb has already blogged about how he has done it, and we thought we would show you a variation on that theme. UPDATE: The Trace File Source Adapter is now available. The package is relatively simple. ... [More]
28 Oct 2008 08:40
Data Flow Property Expressions Property expressions within the data flow are not immediately obvious. Whilst most pipeline or data flow components do support them for their properties, the expression is not generally exposed through the component user interface, nor is the Expressions property shown in the properties grid for the component. Property expressions are only available at the task level, so for those components that do support expressions, you need to look in the parent data flow task's Expressions property collection. This can be accessed through the Property Grid for the task. You n... [More]
28 Oct 2008 07:58
Date Parts Expressions support a range of date-related functions such as DATEADD, with the same basic syntax as that found in T-SQL. Whilst the familiarity is very helpful, the difference that catches me out is the format of the date part, which must be quoted. T-SQL allows this: DATEADD(n, -10, GETDATE())
DATEADD(mi, -10, GETDATE())
DATEADD(minute, -10, GETDATE())
The SSIS equivalent is:
DATEADD("n", -10, GETDATE())
DATEADD("mi", -10, GETDATE())
DATEADD("minute", -10, GETDATE())
Related functions that use the same date part tokens -
6 Mar 2008 14:00
Download the slides and sample code from my Extending SSIS with custom Data Flow components presentation, first presented at the SQLBits II (The SQL) Community Conference.
Get some real-world insights into developing data flow components for SSIS. This starts with an introduction to the data flow pipeline engine, and explains the real differences between adapters and the three sub-types of transformation. Understanding how the different types of component behave and manage data is key to writing components of your own, and probably should be required knowledge for anyone bu... [More]
19 Sep 2007 14:00
The Multicast transform in SSIS 2005 enables us to take one input dataset and from it generate n output datasets that are exact copies of the input, which is extremely useful at times, and we have used it on a number of projects. The downside is that those outputs are synchronous outputs and therefore run on the exact same thread as the input dataset. Normally a synchronous output is good, as no data has to be copied between buffers and it is exceptionally quick. A problem arises if one or more of those outputs feeds a blocking asynchronous transform. When this happens, processing of the other ou... [More]
10 Sep 2007 14:00
Michael Entin has blogged about something that exists in SQL Server 2005 (well, you feel the effects anyway) but isn't as visible as it should be, and it is really important. Backpressure is an interesting internal memory limiter inside SSIS, and in SQL Server 2008 you get to see when the limiter kicks in. For a detailed explanation see Michael's blog entry here, but we wanted to show you how to reproduce it very simply.
Here is a package we are going to use to show the limiter kick in and how SQL Server 2008 shows us it taking effect:
Inside the Script Component a... [More]
26 May 2005 14:00
Sometimes source systems like to send us text files that are a little out of the ordinary. More and more frequently we are seeing people being sent master and detail, or parent and child, rows in the same file. Handling this in DTS is painful. In SSIS that changes. In this article we'll take you through one file of this type and provide a solution that is more elegant than what we have today, and hopefully it will give you ideas on how to handle your own versions of this type of file. As mentioned, this is a simple solution, but we could extend it to include more powerful processing tech... [More]
6 May 2005 14:00
I was about to write about data flow buffer size configuration properties but need to get this one out first so that we can refer to it.
The Data Flow task (internally and in the object model also called the 'Pipeline') logs some pretty interesting information that describe the internal scheduler.
To see these messages, one way is:
1. In a package you want to get detailed pipeline logs on, select the 'Logging...' menu option on the control flow.
2. Check the Data Flow task of interest on the tree on the left. Switch over to Details and se... [More]
5 May 2005 14:00
Adjusting the following properties can have an impact on parallelization during execution:
This is the MaxConcurrentExecutables property on the Package. It defines how many tasks can run concurrently. A value of -1 means the number of processors; when hyperthreading is turned on, it counts the number of logical processors, not physically present ones. In the little testing that I (Ash) have done, I've not noticed a significant difference (<5 percent) in performance between turning hyperthreading on or leaving it off. It'd be interesting to know if you see better resu... [More]
4 Apr 2005 14:00
(By Jamie Thomson)
Surrogate keys are generally considered fundamental building blocks of a data warehouse. They are used as identifiers for dimensional members and enable us to manage slowly changing dimensions.
SSIS does not contain a built-in component for generating surrogate keys, but there is still a mechanism for doing it: the Script Component. The Script Component allows us to modify the data in a data flow path using managed code, and we can use it to generate surrogate keys.
The Row Number Transformation can be used to help generate surrogate keys... [More]
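A minimal sketch of the counter approach, assuming a Script Component (transformation) with a synchronous output to which an int column has been added. The column name "SurrogateKey" and the seed value are my assumptions; in practice the seed would be fetched from the dimension table's current maximum key, for example via a package variable.

```csharp
// Sketch only: runs inside the SSIS Script Component host, where
// Input0Buffer is code-generated from the component's metadata.
// Seed of 1 is a placeholder; a real package would initialise this
// from the dimension's current MAX(key) + 1.
private int nextKey = 1;

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    Row.SurrogateKey = nextKey;  // stamp the row with the next key
    nextKey++;                   // advance the counter for the next row
}
```

Because ProcessInputRow is called once per row on a single thread within the component, the counter increments monotonically and every row in the flow receives a unique key.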
1 Sep 2004 14:00
(By Jamie Thomson)
A common requirement when building a data warehouse is to be able to get all rows from a staging table where the business key is not in the dimension table. For example, I may want to get all rows from my STG_DATE table where the DateID is not in DIM_DATE.DateID.
There are 2 ways to do this in conventional SQL.
-- Method #1
SELECT s.*
FROM STG_DATE s
LEFT OUTER JOIN DIM_DATE d
  ON s.DateID = d.DateID
WHERE d.DateID IS NULL
-- Method #2
FROM STG_DATE s
WHERE D... [More]
2 Aug 2004 14:00
In this article we are going to assume a few things about the reader, as we want to concentrate as much as possible on the Script Component itself and not the peripheral stuff.
The things we will assume are:
You are happy adding a Data Flow task to the workflow designer
You are happy to configure an OLE-DB Source adapter
You are comfortable with configuring mappings in the Destination adapter
As usual we will show you a graphic of the Data Flow once it is completed so we know that if yours resembles ours at the end then chances are we are at the same place. ... [More]