Downloading a file over HTTP the SSIS way

by Darren Green 2 Jul 2009 18:14

This post shows you how to download files from a web site whilst really making the most of the SSIS objects that are available. There is no task to do this, so we have to use the Script Task and some simple VB.NET or C# (if you have SQL Server 2008) code. Very often I see suggestions about how to use the .NET class System.Net.WebClient and of course this works, you can code pretty much anything you like in .NET. Here I’d just like to raise the profile of an alternative.

This approach uses the HTTP Connection Manager, one of the stock connection managers, so you can use configurations and property expressions in the same way you would for all other connections. Settings like the security details that you would want to make configurable already are, but if you take the .NET route you have to write quite a lot of code to manage those values via package variables. Using the connection manager we get all of that flexibility for free. The screenshot below illustrate some of the options we have.

HTTPConnectionManagerEditor

Using the HttpClientConnection class makes for much simpler code as well. I have demonstrated two methods, DownloadFile which just downloads a file to disk, and DownloadData which downloads the file and retains it in memory. In each case we show a message box to note the completion of the download. You can download a sample package below, but first the code:

Imports System
Imports System.IO
Imports System.Text
Imports System.Windows.Forms
Imports Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain

    Public Sub Main()

        ' Get the unmanaged connection object, from the connection manager called "HTTP Connection Manager"
        Dim nativeObject As Object = Dts.Connections("HTTP Connection Manager").AcquireConnection(Nothing)

        ' Create a new HTTP client connection
        Dim connection As New HttpClientConnection(nativeObject)


        ' Download the file #1
        ' Save the file from the connection manager to the local path specified
        Dim filename As String = "C:\Temp\Sample.txt"
        connection.DownloadFile(filename, True)

        ' Confirm file is there
        If File.Exists(filename) Then
            MessageBox.Show(String.Format("File {0} has been downloaded.", filename))
        End If


        ' Download the file #2
        ' Read the text file straight into memory
        Dim buffer As Byte() = connection.DownloadData()
        Dim data As String = Encoding.ASCII.GetString(buffer)

        ' Display the file contents
        MessageBox.Show(data)


        Dts.TaskResult = Dts.Results.Success
    End Sub

End Class

Sample Package HTTPDownload.dtsx (74KB)

Comments

7/16/2009 2:42:06 AM #

Works like a dream. Thanks for making the dtsx and sample files available so I could try it out straight away!

Joel Watson United States

7/16/2009 9:21:29 PM #

Just out of curosity, why are you doing this:

        Dim filename As String = "C:\Temp\Sample.txt"
        connection.DownloadFile("C:\Temp\Sample.txt", True)

Seems to me that the variable containing the name of the file should be specifed as a parameter to the .DownloadFile() method:

        Dim filename As String = "C:\Temp\Sample.txt"
        connection.DownloadFile(filename, True)

Just my .02 worth

PKSpence United States

7/17/2009 2:17:02 PM #

PKSpence, you're quite right I meant to use the variable consistently, code updated. Thanks.

Darren Green United Kingdom

7/19/2009 3:02:03 PM #

You can check an alternative based on WebClient class here: www.cozyroc.com/script/http-upload-download-task

CozyRoc United States

9/16/2009 8:35:21 PM #

Thank you very much for the clear article, worked like a champ (except I had to change MessageBox to MsgBox).

Kyle United States

9/16/2009 10:29:58 PM #

Kyle, did you forget to add Imports System.Windows.Forms? MsgBox only works in VB.NET, and as far as I recall is a hangover from old VB, really just a shortcut left in the VB.NET language. If you prefer it then that's fine too.

Darren Green United Kingdom

9/22/2009 9:42:08 PM #

This worked wonderfully and was exactly what we needed.  We were able to accomplish something much faster than I thought possible using the information you supplied. Thank you.

Casey United States

9/29/2009 9:55:59 PM #

Hi Darren,

Your Script Works Like a  Charm.

But i Have One Question Though - How to DownLoad the Dinamically Changing File Name.
I want to Automate the Process of Downloading.
The File Name Changes Everyday. e.g with New Date.

I trying to use expressions buyt doesn't Works.

Thanks in Advance
Kanthi

Kanthi United States

11/20/2009 7:04:32 PM #

I am gettng a couple errors. It looks like it has to do with using this package on a different machine... Since no one else has mentioned it, I bet it is because I am not familiar with dtsx... Anyone have an idea of what's going on?

Error: 2009-11-20 10:59:49.17
   Code: 0xC0016016
   Source:
   Description: Failed to decrypt protected XML node "DTS:Property" with error 0
x8009000B "Key not valid for use in specified state.". You may not be authorized
to access this information. This error occurs when there is a cryptographic err
or. Verify that the correct key is available.
End Error


and

Error: 2009-11-20 10:59:49.35
   Code: 0x00000002
   Source: Script Task
   Description: The script threw an exception: The system cannot find the path s
pecified. (Exception from HRESULT: 0x80070003)
End Error


Anyone? Help a n00b!

SeanO

11/21/2009 12:21:37 AM #

Would you be so kind to translate these to line to C#:
===============
        ' Get the unmanaged connection object, from the connection manager called "HTTP Connection Manager"
        Dim nativeObject As Object = Dts.Connections("HTTP Connection Manager").AcquireConnection(Nothing)

        ' Create a new HTTP client connection
        Dim connection As New HttpClientConnection(nativeObject)
===============
Thanks

Joe

11/26/2009 2:38:04 PM #

Cant we get whole directory, i mean use wild cards.

Rizwan India

12/1/2009 10:38:35 PM #

Hi Darren,

Can you light me how to download gzip file from http site using authentication.
cheers
dass

dass New Zealand

12/3/2009 6:41:32 PM #

Issue: couldn't download the file Using HTTPS:// with Certicates.

I try to use this above script and download the file file from HTTPS:// using the CERTIFICATES. It connects to the server. But It doesn't download the file. It creates the file with the Ceriticate name on it.

The text file shows "Virtual user wll0001f logged in." The wll0001f is the certificate name.

I have created the HTTP Connection manager in my SSIS with CERTIFICATE property.

Could I get any help?

dhiva United States

1/11/2010 7:18:20 PM #

Thanks very much, works like a charm.  

HollyK United States

7/28/2010 4:01:42 PM #

I am in SSIS. Where do I add this code? I have to copy the internet file to a local folder. I created the HTTP conn manager. But now whihc SSIS task will handle this code? Please help.

apal United States

7/29/2010 7:58:01 AM #

apal, the Script Task can be used to host code.

Darren United Kingdom

8/9/2010 8:33:14 PM #

Need some help here,  would this script work with https and certificates.  I am getting a proxy error.
When I select Test Connection in connection manager it succeeds however it gives Proxy authentication failed.
Error: The script threw an exception: Proxy authentication failed. This error occurs when login credentials are not provided, or the credentials are incorrect.

Thanks in advance.

Naresh Jagtiani United States

8/10/2010 10:33:24 PM #

Naresh, This work fine with SSL.

Your error indicates a problem with the proxy server authentication, so I would talk to you network team.

There is a limitation with the HTTP Connection manager, it doesn't have any option to pass through the default Windows credentials which is a simple way to supply valid Windows credentials to a proxy server. If that is issue use the System.Net.WebClient approach and pass in the default credentials like this -

webClient.Credentials = System.Net.CredentialCache.DefaultCredentials

Darren United Kingdom

8/11/2010 7:35:49 PM #

Hello Darren,  Thanks for your reply.  This is what I put in the script task which worked for me.

Imports System.IO
Imports System.Net
Imports System.Text
Imports System.Web
Imports System.Security.Cryptography.X509Certificates


Public Class WebRetrieve
    Public Shared Sub Main()
        Dim mycert As New System.Security.Cryptography.X509Certificates.X509Certificate _
(System.Security.Cryptography.X509Certificates.X509Certificate.CreateFromCertFile _
("C:\Certifcates\cert.cer"))
  
       Dim wr As HttpWebRequest = CType(WebRequest.Create("https://reports/reports.zip"), HttpWebRequest)
        wr.ClientCertificates.Add(mycert)

        Dim ws As HttpWebResponse = CType(wr.GetResponse(), HttpWebResponse)
        Dim str As Stream = ws.GetResponseStream()
        Dim inBuf(100000000) As Byte
        Dim bytesToRead As Integer = CInt(inBuf.Length)
        Dim bytesRead As Integer = 0
        While bytesToRead > 0
            Dim n As Integer = str.Read(inBuf, bytesRead, bytesToRead)
            If n = 0 Then
                Exit While
            End If
            bytesRead += n
            bytesToRead -= n
        End While
        Dim fstr As New FileStream("c:\New\reports.zip", FileMode.OpenOrCreate, FileAccess.Write)
        fstr.Write(inBuf, 0, bytesRead)
        str.Close()
        fstr.Close()
    End Sub
End Class

Naresh Jagtiani United States

Add comment




  Country flag

biuquote * Required fields. Your email address will not shown, check the preview for what you see. We use it to send you notifications on new comments if you ask us to below.
  • Comment
  • Preview
Loading




Welcome to SQLIS.com our free SQL Server Integration Services (SSIS) resource site.

MVP

RecentComments

Comment RSS