I believe the first time I ever learned JSON was a thing in the data world was when I attended SQLSaturday Washington DC in December 2012, but I never encountered it in any client projects. My clients relied solely on relational data sources for analytics, so that was my focus.
And then I had to write about it in my book Introducing Microsoft SQL Server 2016 (which is free to download) when JSON support was added to SQL Server 2016. But I still didn’t have clients using JSON. It was interesting to me that I could use SQL Server to work with JSON data, but it was still theoretical to me rather than practical.
Therefore, I never thought much about how I would handle it in SQL Server Integration Services (SSIS). I just didn’t have a reason.
Until now. This seems to be the year that I am bumping into JSON left and right. It’s everywhere!
My first client project forays into JSON this year have been to use it as a data source in Power BI. And then later I discovered how to get hockey data from JSON files into Power BI as I’ve been discussing in a series that started earlier this year here. In my last post, I decided that the refresh queries are getting too slow for my liking and I want to experiment with the data in other ways, so I need to move it into another format – starting with a relational database. Meanwhile, I started a client project that requires me to extract JSON files to load into a relational database.
I’m not one to miss the signals. It’s time for me to figure out my options for working with JSON in SSIS…. so here we are.
There are many different ways that one can acquire JSON files. The situations of interest to me at the moment is using API calls to retrieve the data and then somehow moving the data into a table. The purpose of this post is simply to review the options that I have uncovered to date.
Because every data movement situation is different, I won’t claim that any of these methods is preferred over another. The beauty of SSIS is that in many cases it’s flexible enough to handle the same process in multiple ways. At the end of the day, I have two golden rules when it comes to getting things done in SSIS:
- Make sure the data is right when the process is complete.
- Make sure the process completes in as timely a manner as possible.
Ideally, the end result adheres to these rules in such a way that it’s easy to troubleshoot or to adapt as needs change later. This means leaning towards an approach that matches the skills available in your organization. For example, using a Script Component with C# is not the best approach if most of your SSIS developers are more comfortable with T-SQL than C#.
Some time spent researching my options for working with JSON in SSIS turned up the following possible approaches:
- Parsing JSON Data Sources Using SSIS. This technique was written for SQL Server 2012. It uses a flat file source with single column with data type of text stream (DT_TEXT), then uses a Script Component as a transformation. In the Script Component, the key is to use the JavaScript Deserialize method from System.Web.Extensions.
- Use a custom component. You could write your own, acquire an open source JSON processing solution, or buy a commercial component that plugs into SSIS. I found these:
- SSIS JSON Integration Pack from ZappySys
- JSON Source Component from KingswaySoft
Note: I am not familiar with these companies and I have not tried these components. I’m just noting their existence in the universe of possibilities.
- Bulk import JSON File Data to Table. SQL Server 2016 added JSON support to T-SQL so this technique simplifies the process quite a lot if you are using SQL Server 2016 or later.
Personally, I lean towards the last option, Bulk Import. Why?
- I don’t want to have to write code in a Script Transformation if I don’t have to.
- I don’t want to have to pay for a commercial component when I can find another way to accomplish my goals without a whole lot of trouble.
- On the one hand, the open-source version might be useful. On the other hand, in one of client scenarios I have a slightly different than normal authentication situation. In that case, the use of the open-source component is more complicated to figure out than I want to deal with.
- I understand the Bulk Import approach. It’s easy enough to translate into an SSIS workflow, so that’s what I’m going to do.
In my next post, I’ll describe how I use the bulk import option in SSIS to move hockey JSON data into a SQL Server database.
2 Comments
[…] Stacia Varga shows a few methods for handling JSON data in SQL Server Integration Services: […]
[…] I mentioned in my previous post, I’ve been wrangling at lot of JSON lately. And, in my spare time, I’ve been wrangling […]