In my last post, I was stymied by a problem in SSIS. My goal was to use a T-SQL statement to generate a set of rows to represent each NHL season since 1917 as JSON and then to store the JSON in a flat file.
Specifically, this T-SQL statement:
DECLARE @seasonStart INT=1917; DECLARE @seasonEnd INT=year(getdate()); WITH seasonYears AS ( SELECT @seasonStart AS seasonYear UNION ALL SELECT seasonYear+1 FROM seasonYears WHERE seasonYear+1<=@seasonEnd ) SELECT convert(varchar, seasonYear) + convert(varchar, seasonYear + 1) as season FROM seasonYears option (maxrecursion 200) for json path, root('seasons');
What happened? The T-SQL produces the correct results in SQL Server Management Studio (SSMS). However, in SSIS, the same T-SQL statement in an OLE DB Source in a Data Flow Task produces two rows of data which adds a line feed into the flat file and renders the JSON unusable.
The problem is visible even before sending output to the flat file. Here’s what I see when I preview the query results for the OLE DB Source:
I wound up just using SSMS to generate the JSON and using copy/paste to create the file I needed because that got the job done. Meanwhile, I learned a bit more about why I ran into this problem, and I came up with a solution that I want to share with you in this post.
Official word from Microsoft is that FOR JSON always returns output in 2-4KB chunks. Although SSMS doesn’t display these chunks in separate rows, SSIS handles it differently. I could solve this problem either by using a Script Component as a source – and that would mean I would have to write code, which I don’t want to have to do if I don’t have to. Or I could handle the T-SQL in a different way. Which I did.
The trick is to store the JSON output in a variable and then output the result of the variable. I wound up having to rewrite the T-SQL statement because I couldn’t use a CTE in the statement assigned to the variable. Here’s the revised statement that returns the same results as the original shown at the beginning of this post.
DECLARE @seasonStart INT=1917; DECLARE @seasonEnd INT=year(getdate()); DECLARE @jsonText NVARCHAR(max) = (SELECT CONVERT(VARCHAR(4), @seasonStart + seasonYears) + CONVERT(VARCHAR(4), @seasonStart + seasonYears + 1) AS season FROM (SELECT number AS seasonYears FROM master..spt_values WHERE type = 'p' AND number BETWEEN 0 AND @seasonEnd - @seasonStart) AS t FOR JSON PATH, ROOT('seasons')); SELECT @jsonText AS jsonOutput;
When I preview this query in SSIS, I get one row of output:
And when I produce the flat file, I get the file I wanted. Problem solved!
7 Comments
[…] Update 2 June 2018: I discovered the reason for this behavior and developed a workaround which you can see here. […]
[…] Stacia Varga works around an oddity in the way SSIS reads JSON outputs: […]
Thanks for posting this. I was having exactly the same issues with “for json” queries working in SSMS but failing in SSIS packages. Your solution worked for me and made my day.
Nice one, this save me couple of hours!
Your article provides a simple canned example which is ok.
but there are problems on will encounter if using this approach in an automation setting.
Not dynamic (column names would need updated in an automation scenario)
ALSO
FOR JSON PATH is an nvarchar(max) which imposes a character limitation on large datasets.
ALSO
If you have an nchar (sp?) field it will pad zero’s that mucks with your output. This also may be an issue if coding automation
Thanks for the additional comments!
Thank you!!
I was stunk since for 2 days without understand how to solve this issue and your post helped me a lot!