Recently, one of my clients was working with a new data source in Power BI that required multiple calls to a REST API to iterate through an unknown number of pages to retrieve JSON documents. We were able to use a combination of parameters and M code to dynamically construct the URI necessary to call a series of pages, and everything worked fine while working in the Power BI Desktop.

However, when the PBIX file was published to the Power BI service, the data set could not be refreshed because the data source was unsupported. Sad trombone…

The most promising clue to a resolution was found in a post by my friend Chris Webb (b|t), Web.Contents(), M Functions And Dataset Refresh Errors In Power BI. Chris describes how to deal with resolving the refresh error when the query parameters change. In my case, I needed the base URL to change. Our scenario was just enough different that I thought a post to document our approach might be useful to someone someday.

Background

The problem relates to the manner in which we need to dynamically build a URI to download JSON documents. Although the examples in this post refer to my client’s cloud-based applicant tracking system, JazzHR, the same scenario could apply to many other different REST APIs, so I’ll focus more on the structure of the URI and the M language in Power BI rather than JazzHR specifically.

My client wants to analyze applicant data in Power BI that can be obtained through the JazzHR API. By using a specifically structured URI, you can download data from JazzHR for the first 100 applicants. Because the information is secure, you must supply an API key to access your data as a query parameter in the URI, like this:

https://api.resumatorapi.com/v1/applicants?apikey= xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

If you have more than 100 applicants, you need to request data by page number like this for page 1:

https://api.resumatorapi.com/v1/applicants/page/1</?apikey= xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The URI above returns the same 100 applicants as the URI without a page number. To get applicants 101 through 199, you use this URI:

https://api.resumatorapi.com/v1/applicants/page/2?apikey= xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The API request returns only an array of JSON documents. There is no metadata that you can use to determine how many pages are available. You just keep trying a new page number until no more data is returned. Therefore, our solution has to be able to dynamically generate some number of URIs for which we’ll arbitrarily set a limit.

Get a Static Page First

I like to approach development incrementally to make it easier to troubleshoot. In this case, I started by setting up a hard-coded reference to the URI for page 1. I created a new Web data source using the applicable URI. The results are returned as a list, so I used the To Table transform to get a table of records, and then I expanded the records so that my table includes all the fields from the JSON documents. When finished, the M script for this operations looks like this:

let

  Source = Json.Document(Web.Contents("https://api.resumatorapi.com/v1/applicants/page/1?apikey= xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")),

  #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

  #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "first_name", "last_name", "prospect_phone", "apply_date", "job_id", "job_title"}, {"id", "first_name", "last_name", "prospect_phone", "apply_date", "job_id", "job_title"})

in

  #"Expanded Column1"

Get a Set of Pages

Now I need to not only make this URI dynamic so that I can programmatically change page numbers, but I also need to make sure that all of the JSON documents from each page are combined into a single result set. Matt Masson (b | t) describes how to do this in his post, Iterating over multiple pages of web data using Power Query.

So I dutifully convert my query to a function (by editing in the Advanced Editor) like this:

(page as number) as table =>

let

  Source = Json.Document(Web.Contents("https://api.resumatorapi.com/v1/applicants/page/" & Number.ToText(page) & "?apikey= xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")),

  #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

  #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "first_name", "last_name", "prospect_phone", "apply_date", "job_id", "job_title"}, {"id", "first_name", "last_name", "prospect_phone", "apply_date", "job_id", "job_title"})

in

  #"Expanded Column1"

In this code, I’m setting up page as my parameter and then using it in the Web.Contents argument by concatenating it (after converting it to a string) in between the static URL and the apikey query parameter. Then I rename the query to getPage.

Next, I need to use the function to loop through a series of pages. For simplicity, I’m going to set up a loop for 3 pages. (In production, we set a much higher number. There’s no way to anticipate this value, so we’ll have to periodically review the data and adjust the upper limit as needed.)

To do this, I’ll add a blank query, open the Advanced Editor, and add in the following code:

let
    Source = {1..3},
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Custom", each <strong>getPage([Page])</strong>),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"id", "first_name", "last_name", "prospect_phone", "apply_date", "job_id", "job_title"}, {"id", "first_name", "last_name", "prospect_phone", "apply_date", "job_id", "job_title"})
in
    #"Expanded Custom"

The initial line containing Source sets up a list containing 3 values: 1, 2, and 3. The next line converts this list to a table, and then the single column it contains is renamed from Column1 to Page.

Notice the line containing #”Added Custom”. Here’s where a custom column gets added that calls the getPage function and passes in the value in the Page column. In other words, on the first row, the Page column contains 1, so the 1 gets passed into the function which returns a set of rows for the converted JSON documents from page 1. These get added to the table in the new query.

Then the second row passes in a 2, the function is invoked, and gets another set of documents which are converted into rows. Now my table contains 200 rows of data. The final row of the converted-from-a-list-of-3 table gets the final 100 rows of data so that I know have a total of 300 rows in my new query.

Data Source Unsupported for Refresh

I can adjust the list to get any number of pages that I want and it just works. In Power BI Desktop, that is. When I publish to the Power BI service, and try to refresh the data, I get the following error:

“You can’t schedule refresh for this dataset because one or more sources currently don’t support refresh.”

Grr… This defeats the whole purpose of getting pages of data dynamically! And so this is where the quest for another solution began, and the adaptation of Chris Webb’s approach to dynamic URLs.

“Faking Out” Web.Contents

As Chris explains in his post, the problem is that the Power BI service wants to validate the URI before it commits to refreshing the data source. Because the URI isn’t static, there is no URI to validate.

If the page number were a query parameter, as it is in many REST APIs, then the approach that Chris describes works beautifully. I was stumped for a bit because I got hung up on the notion that the base URL should be the part that’s static – https://api.resumatorapi.com/v1. I was thinking that the applicants portion was really treated as a parameter by the JazzHR API just like page number. My thought process for the solution was influenced by another post by Chris, Using The RelativePath And Query Options with Web.Contents() In Power Query And Power BI M Code.

I decided to set up the static URI with the applicants included, and then set up my dynamic page numbering in the RelativePath argument. I changed the Source line in the getPage function to this:

 Source = Json.Document(Web.Contents("https://api.resumatorapi.com/v1/applicants?apikey= xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    [RelativePath="page/" & Number.ToText(page),
    Query=[apikey=" xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"]]))

Success!

This technique highlights the behavior of the RelativePath and Query options. RelativePath appends to the string that appears before the ? in the URI and Query replaces whatever you have set as the query parameter in the static URI.

I published to the Power BI service and was successfully able to refresh. The only “issue” was that I had to edit the credentials – which were anonymous. But all this entailed was opening the dialog box for the data set credentials and hitting OK.

Hope this helps should you find yourself wrangling with REST APIs that require dynamic values.