Thus far in my series of posts using hockey data to explore the subject of data analytics in Power BI (which began all the way back here), I have been focused on “using” the data, and was generally refreshing data whenever I opened the PBIX file. I didn’t set up a refresh schedule for the published versions of the hockey datasets that are embedded into previous posts, and was thinking that someone might come along some day and wonder why the data is old. So today I decided to fix that situation and set up a refresh schedule.

There’s just one problem. I’m using the Web.Contents() function in several queries to get the JSON data from the NHL API.

Actually, the function is not the problem. It’s the way I’m using it to dynamically construct URLs so that I can iterate through lists of players or teams or whatever to retrieve the data I want.

Clearly that technique works just fine because I have data. However, when I try to schedule the refresh in the Power BI service, I get the dreaded error: “You can’t schedule refresh for this dataset because one or more sources currently don’t support refresh.”

I explained how to solve this particular problem once upon a time. For the hockey data queries, I had to use variations on that theme to get the dynamic URLs to work. To do this, I had to fix up my queries (which are actually functions) in the Query Editor by opening up the Advanced Editor and fixing the code as noted below.

getGames and getScores

The first few lines for each of these functions looked like this originally:

(season as text) as table =>
let
Source = Json.Document(Web.Contents("https://statsapi.web.nhl.com/api/v1/schedule?season=" & season)),

I changed to this:

(seasonValue as text) as table =>
let
Source = Json.Document(Web.Contents("https://statsapi.web.nhl.com/api/v1/schedule?season=20172018",
[Query=[season=seasonValue]])),

I had to hard-code a single season into the URL and then the value in the Query argument replaces the hard-coded value when the function is called.

I also had to change the variable for the query argument from season to seasonValue. When I tried Query=[season=season], the function worked independently. I could enter a parameter and invoike the function to see results correctly.

However, when I called the function from another query, I got the following error message: “Operation is not valid due to the current state of the object.” Not exactly a lot of information to work with that from a troubleshooting point of view. That’s when I tried changing season=season to season=seasonValue, the error message went away, and everything worked nicely.

getRoster

I changed this:

Source = Json.Document(Web.Contents("https://statsapi.web.nhl.com/api/v1/teams/" 
& teamid & "/roster"))

To this:

Source = Json.Document(Web.Contents("https://statsapi.web.nhl.com/api/v1/teams/",
[RelativePath= teamid & "/roster"]
))

Here I use the RelativePath argument to append a string (generated dynamically) to the URL in the first argument.

getPlayer

I replaced this:

Source = Json.Document(Web.Contents("https://statsapi.web.nhl.com/api/v1/people/" & playerid))

With this:

Source = Json.Document(Web.Contents("https://statsapi.web.nhl.com/api/v1/people",
 [RelativePath= "/" &playerid ]
 )),
))

In this case, it’s the same idea as the change I made in getRoster except here I had to put the static string first and follow it with the variable.

getYearlyPlayerStatistics

I went from this:

Source = Json.Document(Web.Contents("http://statsapi.web.nhl.com/api/v1/teams/"& teamid & "?hydrate=roster(person(stats(splits=yearByYear)))"))

To this:

Source = Json.Document(Web.Contents("http://statsapi.web.nhl.com/api/v1/teams",
[RelativePath=teamid & "?hydrate=roster(person(stats(splits=yearByYear)))"]))

Technically hydrate is a query parameter, but it’s static in this case (unlike seasonValue in the getGames example). Therefore, I could just append it to the relative path argument’s value.

One more thing…

I also had hard-coded the season start years from 1917 to 2017 into the scores away and scores home queries. I don’t want to have to update this range of values each year, so I adjusted these queries also.

Instead of this:

Source = {1917..2017}

I did this:

Source = {1917..Date.Year(DateTime.LocalNow())}

Voila!

After I publish the PBIX file to the cloud, I need to update the dataset credentials and set a refresh schedule. To do this, I perform the following steps:

  • Open the Power BI web page, locate the dataset, point to it to display the ellipsis icon, and then click the ellipsis.
  • Click Schedule Refresh.
  • In the Data Source Credentials section, click the Edit Credentials link and click Sign In. Repeat for each link.

And… Houston, we have a problem…

Each of the links for credentials is the URL for the functions that Power BI is attempting to resolve before it allows me to schedule the refresh. Everything is fine except for the getPlayer function which is trying this URL: https://statsapi.web.nhl.com/api/v1/people.

The problem in this case is that the API returns an error message because it requires a player ID. Which the function actually returns, but it doesn’t have one when Power BI is trying it out. Because the player ID is not sent as a query parameter, I can’t hard-code a value and then override it in the Query argument for the Web.Contents() function. Stumped…

Well, this leads me to change up my priority list for working with hockey data. I’ve been thinking recently that refreshing the PBIX file is slower than I would like and it’s only going to get worse as more data is available from the NHL. I want to use the data with other tools besides Power BI, so maybe it’s time to move the data into another repository. Stay tuned!