All posts by Dustin Ryan

Dustin is a Data Platform Solutions Architect on the Education team at Microsoft. You can find Dustin blogging and speaking at events like SQL Saturday, Code Camp, and SQL Rally. Follow Dustin on Twitter @SQLDusty.

Power BI Incremental Refresh with Web API Data Source

It’s been a while since I typed up a blog post, but life, even in the midst of a pandemic, has been busy and crazy. As you can imagine, the customers I support in the education industry have been forced to pivot on a dime in order to continue to operate. My customers, in school districts and higher education institutions, are proving to be nimble and adaptive by adopting Azure cloud technologies such as Power BI and the Azure data platform in a time of high stress.

But I wanted to share one of the neat things I’ve recently worked on with one of my academic medical customers. The customer came to me with a specific challenge related to their adoption of Power BI, so I’ll take a few minutes to walk through a summary of the challenge and how we solved the problem.

The Challenge

The customer is using Power BI to report on data from ServiceNow via its APIs, so they were able to connect Power BI to ServiceNow and begin reporting on relevant datasets very quickly. The challenge, however, was that querying multiple years of data via the API on every refresh was less than desirable for a variety of reasons.

The customer was interested in exploring the incremental refresh capabilities of Power BI but was worried about using the native incremental refresh capability, since query folding (if you’re new to query folding, read about it here) is not supported by Power BI’s web connector. So the customer team reached out to me with the challenge, and I worked up an example that I think will help them meet their requirement.

The Solution

The solution I came up with relies on the fact that the ServiceNow API accepts a date as a parameter. Since I don’t have access to ServiceNow, I used WeatherAPI.com for demonstration purposes. There are a few different APIs you can use, but in my example I’m using the history API. If you want to play along, you can register for a free account at WeatherAPI.com, which gives you access to 7 days’ worth of weather history (the docs say 30 days, but I could only query 7 days’ worth).

Calling the Data Source API

The first step was to create a function in Power BI that queries the weather API and accepts a date as a parameter. To query the API data source, we’ll use the Web.Contents function with the Query argument, and this is really important (Chris Webb has a very nice blog post on this that I found helpful).

Take a look at my query below and notice the Web.Contents function. Because I need to construct the URL and its query string parameters dynamically, the Query argument allows me to add the parameters at refresh time while still maintaining a valid base URL, which is important if you want Power BI to be able to validate the connection prior to the data refresh.

(dtquery as text) =>
let
    // Call the WeatherAPI.com history endpoint; the Query argument appends the
    // key, location, and date query string parameters to the base URL at refresh time
    Source = Xml.Tables(
        Web.Contents(
            "http://api.weatherapi.com/v1/history.xml",
            [Query = [key = APIkey, q = "32212", dt = dtquery]])),
    // Pivot the top-level XML tables (location, forecast) into columns
    #"Pivoted Column" = Table.Pivot(Source, List.Distinct(Source[Name]), "Name", "Table"),
    // Expand the location attributes
    #"Expanded location" = Table.ExpandTableColumn(#"Pivoted Column", "location", {"name", "region", "country", "lat", "lon", "tz_id", "localtime_epoch", "localtime"}, {"location.name", "location.region", "location.country", "location.lat", "location.lon", "location.tz_id", "location.localtime_epoch", "location.localtime"}),
    // Drill from the forecast table down to the hourly readings
    #"Expanded forecast" = Table.ExpandTableColumn(#"Expanded location", "forecast", {"Table"}, {"Table"}),
    #"Expanded Table" = Table.ExpandTableColumn(#"Expanded forecast", "Table", {"hour"}, {"hour"}),
    // Expand the hourly weather measures, one row per hour
    #"Expanded hour" = Table.ExpandTableColumn(#"Expanded Table", "hour", {"time_epoch", "time", "temp_c", "temp_f", "is_day", "condition", "wind_mph", "wind_kph", "wind_degree", "wind_dir", "pressure_mb", "pressure_in", "precip_mm", "precip_in", "humidity", "cloud", "feelslike_c", "feelslike_f", "windchill_c", "windchill_f", "heatindex_c", "heatindex_f", "dewpoint_c", "dewpoint_f", "will_it_rain", "chance_of_rain", "will_it_snow", "chance_of_snow", "vis_km", "vis_miles", "gust_mph", "gust_kph"}, {"time_epoch", "time", "temp_c", "temp_f", "is_day", "condition", "wind_mph", "wind_kph", "wind_degree", "wind_dir", "pressure_mb", "pressure_in", "precip_mm", "precip_in", "humidity", "cloud", "feelslike_c", "feelslike_f", "windchill_c", "windchill_f", "heatindex_c", "heatindex_f", "dewpoint_c", "dewpoint_f", "will_it_rain", "chance_of_rain", "will_it_snow", "chance_of_snow", "vis_km", "vis_miles", "gust_mph", "gust_kph"}),
    // Keep just the text description from the condition record
    #"Expanded condition" = Table.ExpandTableColumn(#"Expanded hour", "condition", {"text"}, {"text"})
in
    #"Expanded condition"

When the Power BI dataset is refreshed, the query string parameters are added to the base URL. I can see this if I take a look at the query diagnostics for the step that invokes the function. For the weather API, I need to include the following parameters:
1. key = API key
2. q = location zip code
3. dt = date
Notice the parameters “key”, “q”, and “dt” being added to the base URL with corresponding values.
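
For illustration, with the key redacted and a sample date plugged in, the resolved request looks something like this:

    http://api.weatherapi.com/v1/history.xml?key=<your-api-key>&q=32212&dt=2020-07-01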

With the function created, I’m ready to build the query that will call the function and load a table into my Power BI data model, which will then be configured for incremental refresh.

Building the Query & Calling the Function

First, I connected to a table in an Azure SQL Database that contains a list of dates and the corresponding year, month, and day values. Part of the reason I’m using Azure SQL as a data source is that it supports query folding.

Next, I created the RangeStart and RangeEnd parameters, which are required for using Power BI’s incremental refresh capabilities; these parameters should be of the Date/Time type. Then I applied a Custom filter to the Date column based on the RangeStart and RangeEnd parameters. Be aware that the data types of the parameters and the column you’re filtering must match (Date/Time in this case).
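
For reference, here is roughly what those two parameter queries look like in the Advanced Editor (each parameter is its own query in Power Query, and the default values below are just placeholders):

    // RangeStart
    #datetime(2020, 7, 1, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]

    // RangeEnd
    #datetime(2020, 7, 8, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]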

When applying the custom filter, ensure that you do not use “is after or equal to” and “is before or equal to” at the same time, as this can cause duplicated data in adjacent partitions, as noted in the docs.
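
In M, the resulting filter step looks something like the sketch below (the step and column names come from my date table and are just illustrative); note the asymmetric comparison, which keeps adjacent partition ranges from overlapping:

    // Half-open range: include RangeStart, exclude RangeEnd
    #"Filtered Rows" = Table.SelectRows(dbo_DimDate, each [Date] >= RangeStart and [Date] < RangeEnd)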

Also, as a side note, I created a parameter to hold my API key, which you saw referenced as APIkey in the function we first discussed.

Next, I performed a few transformations to construct the value that will be passed to the previously mentioned function’s “dt” parameter.
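
The exact steps will depend on your date table, but a minimal sketch that formats the Date column into the yyyy-MM-dd text the weather API expects (the column and step names here are assumptions) could be:

    // Convert the Date/Time value to a yyyy-MM-dd text value for the API
    #"Added dt" = Table.AddColumn(#"Filtered Rows", "dt", each Date.ToText(Date.From([Date]), "yyyy-MM-dd"), type text)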

Then I’m ready to invoke the custom function I created in the first step. After I invoke the function and expand the contents, I end up with a view of the relevant weather data.
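
Assuming the function from earlier was saved as GetWeatherHistory (the name is mine; use whatever you named yours), the invocation is just another added column:

    // Call the API function once per date row
    #"Invoked Custom Function" = Table.AddColumn(#"Added dt", "WeatherData", each GetWeatherHistory([dt]))

Expanding the WeatherData column then surfaces the hourly weather columns returned by the function.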

I clicked Close & Apply, and the data was loaded into the model.

Configure the Incremental Refresh Policy

Now I’m ready to configure incremental refresh. I won’t go through the steps of configuring incremental refresh since it’s clearly documented here. In the below screenshot, you’ll see how I configured my incremental refresh policy: I want to keep 7 days’ worth of history and refresh only the 3 days previous to the current day. This means that when I first publish the dataset, Power BI will refresh all 7 days’ worth of data to load the dataset. On subsequent refreshes, only the past 3 days will be refreshed, and days older than 7 days will be dropped from the dataset.

Notice the warning above, which states “Unable to confirm if the M query can be folded. It is not recommended to use incremental refresh with non-foldable queries.” I suspect this warning is appearing because of the function we’re using to call the API. In these scenarios, the documentation states that we should validate that incremental refresh is occurring correctly. In a few moments, I’ll show you how I validated that only the past 3 days were being refreshed.

Publishing and Refreshing the Data

Lastly, I published the data model and report to the Power BI service. In this example, I’m publishing to a workspace on a Power BI Premium capacity so I can connect to the workspace with SQL Server Management Studio and view the partitions that Power BI will create.

Now that the data model and report are published, I can see the report. At this point I only have one day’s worth of data.

To load the previous 7 days’ worth of data, I need to update the data source credentials. To do so, go to the settings for the dataset and click Edit credentials next to each data source in the Data source credentials menu.

For my SQL Database I entered my SQL credentials, and for the Web data source I used anonymous authentication but checked the box to Skip test connection, because the connection will fail until the API key is provided (which happens in the function at query time).

Verify the Incremental Refresh is Working

Because this workspace is on my Premium capacity, I can enable the XMLA endpoint to allow connectivity to the workspace with other tools. So before we process the data model, let’s connect to the workspace and take a look at the partitions with SSMS.

As you can see, the WeatherData table has only the single, default partition. Now let’s go back to Power BI and kick off a refresh of the dataset. Once the refresh completes, let’s re-check the partitions in the WeatherData table in SSMS. You can see that we now have 8 partitions (7 days of history plus the current day) and each partition has 24 rows. Power BI created these partitions based on the incremental refresh policy we defined in the previous steps. Cool stuff!

What do you think will happen now if I refresh the data again? Let’s find out! After the data refresh completes, I check the Last Processed timestamp for the partitions. Notice that only the last 3 partitions have been processed, based on the timestamps! This indicates that my incremental refresh policy is working correctly, and I’m ready to schedule the data refresh in Power BI based on my requirements.

I can also check the Analytics on my API calls against WeatherAPI.com and see the additional API calls show up.

The Summary

In this blog post, I shared some steps for incrementally refreshing data in Power BI with a web API data source. If you’d like to take a look at my Power BI template that includes this logic, you can download the .pbit file here.

Here are some of the resources I shared inline in the blog post.

Chris Webb’s blog post on Web.Contents: https://blog.crossjoin.co.uk/2016/08/16/using-the-relativepath-and-query-options-with-web-contents-in-power-query-and-power-bi-m-code/
Miguel Escobar’s blog post on incremental refresh with SharePoint Online data source, which I found helpful because it follows a similar approach: https://www.poweredsolutions.co/2020/01/19/incremental-refresh-for-files-in-a-folder-or-sharepoint-power-bi/
What is Power BI Premium? https://docs.microsoft.com/en-us/power-bi/admin/service-premium-what-is
Incremental refresh with Power BI: https://docs.microsoft.com/en-us/power-bi/admin/service-premium-incremental-refresh
Dataset connectivity with the XMLA endpoint: https://docs.microsoft.com/en-us/power-bi/admin/service-premium-connect-tools
Recording Query Diagnostics in Power BI: https://docs.microsoft.com/en-us/power-query/recordingquerydiagnostics
Download SQL Server Management Studio: https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-ver15
Configure schedule refresh with Power BI: https://docs.microsoft.com/en-us/power-bi/connect-data/refresh-scheduled-refresh
WeatherAPI.com: https://www.weatherapi.com/

I hope you found this blog post useful! If you did, leave a comment and let me know and share it with your friends and teammates!

Welcome to Azure Synapse at SQL Saturday Tampa 2020


SQL Saturday Tampa is coming up again in a couple of weeks on February 29th, and I’m excited to announce that I’ll be there presenting my session, Welcome to Azure Synapse. In this session, I’m planning to deliver an overview of Azure Synapse as well as a demo of Synapse Studio!

Microsoft’s powerful new service, Azure Synapse, brings together the worlds of data warehousing and big data analytics by providing a unified experience for ingesting, cleansing, modeling, and sharing data. In this session, we’ll discuss the basics of Azure Synapse and ideal use cases for it, and demonstrate how attendees can start using the service. By the conclusion of this session, attendees will have a deeper understanding of Azure Synapse and how to begin using it.

So if you’re in the area and interested in learning more about Azure Synapse, this will be a great session for you. You’ll need to register for SQL Saturday Tampa here. Don’t hesitate though, this event will fill up very quickly! I hope to see you there!

Dynamic Column Level Security with Power BI and SSAS

Last week I was asked to tackle a requirement by a customer adopting Analysis Services to enable data exploration and ad hoc analysis by their users. One of their requirements was to secure columns based on a grant related to a cost center. For example, a grant has several attributes, with some attributes considered “sensitive” and others considered “non-sensitive”. Non-sensitive grant attributes would be accessible to all users, while the subset of attributes in the grant table considered “sensitive” would be accessible only to users related to the corresponding cost center. The challenge here is that while Analysis Services supports column level security, dynamic column level security is not supported. So my colleague and friend, the great Steve Pontello, and I put our heads together to address the requirement.

Continue reading Dynamic Column Level Security with Power BI and SSAS

See You at SQL Saturday South Florida on Saturday June 8th!


SQL Saturday South Florida is on Saturday, June 8th, and it’s just around the corner! The event is hosted at the Nova Southeastern University campus in Davie, Florida near Fort Lauderdale, and it’s always a tremendous event with great speakers, content, and fun with the community. If you haven’t registered, then you need to do it right away by going here. I’m also excited and pleased to announce that I’ll be delivering two sessions! Continue reading See You at SQL Saturday South Florida on Saturday June 8th!

Visualizing NFL Draft History with Power BI

With the 2019 NFL Draft upon us, of course I wanted to visualize some NFL draft data in Power BI. So I put together this interesting set of visualizations based on some data I scraped from Pro Football Reference. The dataset includes drafts from 1936 to 2019 including picks through round 6 where applicable. I’ll update the dataset as the remaining rounds of the 2019 draft are completed. So feel free to take a look, interact with the dashboards, and let me know if you have any questions.

If you’re a football and Power BI fan, check out my NFL Stats Report in Power BI here.

There are three pages in the NFL Draft Picks Power BI report. Continue reading Visualizing NFL Draft History with Power BI

Power BI Governance 101 at SQL Saturday 820 in Jacksonville, Florida on May 4


Coming up in just a couple of weeks, I’ll be presenting my session, Power BI Governance 101, at SQL Saturday #820 in Jacksonville, Florida on May 4th. The event will kick off at 8 AM with breakfast, and sessions will start at 9 AM! SQL Saturday Jacksonville is hosted at the beautiful University of North Florida campus. I’ve attended a lot of SQL Saturday events, and the Jacksonville event is my favorite! It’s well organized, the speakers and content are top quality, and it’s always a great crowd.

But this year’s event is special! This year’s SQL Saturday in Jacksonville is a Jedi-themed event! So don’t forget to wear your favorite Star Wars gear! I know I will! You seriously do not want to miss this event!

Get Registered Here

My session, Power BI Governance 101, is scheduled for 10:15 AM in room Coruscant 51-1205. Deploying an enterprise business intelligence solution is no small task, and one of the most important factors for a successful roll-out of a self-service analytics tool like Power BI is governance. We’ll discuss why governance is so important for a successful Power BI deployment, which components of Power BI require governance, and how to govern your Power BI environment. By the end of this session, you should have a clearer understanding of governance in the context of Power BI. Check out the event schedule here.

If you’re planning to attend, leave a comment here and don’t forget to stop by and say “hi”!

Data Warehouse Design and Power BI at SQL Saturday Tampa 2019


Hello, fellow data nerds! I wanted to make you aware of a fantastic event coming up this month in Tampa, Florida. On Saturday, March 23, 2019, the annual SQL Saturday event will be hosted at the University of South Florida! This is a tremendous opportunity to engage in some quality training at little to no cost. If you’re able to be in the Tampa area on Saturday 3/23, then you need to register for this event. I’ve seen the schedule, and the lineup is impressive! Old dogs and young dogs alike will learn something new from a host of professionals and industry experts!

I’ll be presenting two sessions during the morning of March 23 at SQL Saturday Tampa and I’d love to see you there!

Making Sense of the Power BI Ecosystem

Power BI has taken the world by storm since its release in mid-2015. Since then, Microsoft’s cloud analytics service has matured and grown in scope and complexity. In this session, we’ll seek to understand the moving parts of the Power BI ecosystem, including Power BI Pro, Premium, Report Server, Embedded, Dataflows, and more. By the end of this session, the audience should have a firm grasp of the components of the ecosystem and how those components work together.

Building a Data Warehouse from the Ground Up

What if you could approach any business process in your organization and quickly design an effective and optimal dimensional model using a standardized step-by-step method? In this session we’ll discuss the steps required to design a unified dimensional model that is optimized for reporting and follows widely accepted best practices. We’ll also discuss how the design of our dimensional model affects semantic modeling technologies like SQL Server Analysis Services and Power BI, and how the choices we make during the data warehouse design phase can make or break our reports.

I hope you can make it to this really amazing event and to one of my sessions! Don’t forget to register! If you do, swing by and say hi!

Power BI Architecture Diagram v4 is now available!

I’ve finally updated my Power BI Architecture Diagram to include some of the new features that are now available or coming soon. I made three updates to the diagram in v4.

Download the Power BI Architecture Diagram v4

Download the Power BI Architecture Diagram Legend

First and most importantly, I updated the Power BI logo in the diagram to the latest version of the logo!


Secondly, I included Power BI Dataflows in the diagram, tagged #6. Power BI Dataflows are used to ingest, transform, integrate, and enrich big data by defining data source connections, ETL logic, refresh schedules, and more. Data is stored as entities in the Common Data Model in Azure Data Lake Storage Gen2. Dataflow entities can be consumed as a data source in the Power BI service and in Power BI Desktop. Read more about Dataflows here.

Lastly, I included item #14 to illustrate the upcoming capability for Power BI users to directly access Power BI hosted data models via XMLA. This will allow you to connect to the underlying Analysis Services data models using Tableau, Qlik, SSRS, SSMS, DAX Studio, SQL Server Profiler, and other tools. Initially only read access will be supported; eventually write access will be supported as well, meaning that I’ll be able to refresh Power BI data models using other client tools. This capability is not available at the time of writing this blog post, but stay tuned.

And don’t forget to download the updated legend that goes with the Power BI Architecture Diagram, which describes and explains each of the components.

Thoughts and feedback?

Take a look at the diagram and let me know your thoughts and any feedback you have! And if you found it useful, leave a comment and let me know and share it with your colleagues!

Calculating Quartiles with DAX and Power BI

Like many of my blog posts, this post is inspired by some work I was doing with a customer who had a requirement to calculate quartiles of a ranked set of data. So I thought I’d take a few minutes to demonstrate how I accomplished this using DAX in Power BI. For this example, I’m going to be using the NFL dataset that I downloaded as part of the NFL analysis dashboards I published last week.


Continue reading Calculating Quartiles with DAX and Power BI

NFL Football Player Stats & Analysis 2018 is now available!

I’ve had a lot of people ask me for this over the past few months, and it’s finally (mostly) ready! There are still a few things I’d like to do with the data models and reports, but I wanted to go ahead and share the content since I know many people use this for their fantasy football drafts, which generally happen during the third week of the NFL preseason.

So here it is. Continue reading NFL Football Player Stats & Analysis 2018 is now available!