dax powerbi quartiles

Calculating Quartiles with DAX and Power BI

Like many of my blog posts, this post is inspired by some work I was doing with a customer who had a requirement to calculate quartiles of a ranked set of data. So I thought I’d take a few minutes to demonstrate how I accomplished this using DAX in Power BI. For this example, I’m going to be using the NFL dataset that I download as part of my NFL analysis dashboards I published last week.

image

Download the Power BI 2018 NFL Stats and Analysis Report Dashboard here

So for this example I’m going to calculate the quartiles for NFL football players based on yards rushing.

image

To calculate the quartile, we’re going to use the PERCENTILEX.INC DAX function. The PERCENTILEX.INC function returns the number at the specified percentile. So for example, if I had numbers 0 and 100 in my data set, the 25th percentile value would be 25. The 50th percentile value would be 50 and the 75th percentile value would be 75, and you can figure out what the 100th percentile value would be.

image

If I want to break my rushers into quarters so I can understand how they compare to the rest of the rushers in the NFL at their position in a given season, calculating the quartiles could be a good way to do that.

First, I create a calculated column. And I want to write an expression that calculates the 75th percentile value of rushing yards, which is also the upper third quarter boundary.

Quartile Rank by Rushing Yards = PERCENTILEX.INC(‘Stats’,[Rush-Yds],.75)

This provides me with my 3rd quartile upper boundary.

image

So I can tell if a player has more than 43 rushing yards, they’re in the upper quartile.

There’s a problem though. My dataset contains rushing stats for multiple years, so I’d like to calculate quartiles within the context of a single season. Also, it doesn’t make sense to compare the rushing stats of running backs to quarter back or wide receivers or tight ends. So I need to edit my calculation to only calculate the quartile for the current players position and season. To accomplish this I need to use a Filter function and the Earlier function as seen here.

Quartile Rank by Rushing Yards = PERCENTILEX.INC(
FILTER(
FILTER(‘Stats’,
[Season] = EARLIER([Season])
),
[Position] = EARLIER([Position])
)
,[Rush-Yds],.75)

The Filter function allows me to filter the Stats tables to a subset of records based on a condition and the Earlier function allows me to limit the subset of records returned from the Stats table to the current row’s season and position. I’m also filtering out players without any rushing yards

This calculation gives me the third quarter boundary. I could also create another calculation for the second quarter boundary using the following code:

PERCENTILEX.INC (
FILTER (
FILTER (
FILTER ( ‘Stats’, ISBLANK(‘Stats'[Rush-Yds]) = FALSE()),
[Season] = EARLIER ( [Season] )
),
[Position] = EARLIER ( [Position] )
),
[Rush-Yds],
.75
)

But this would require me to create multiple columns: one for the Q1 boundary, Q2 boundary, and Q3 boundary. Instead of doing that, I’ll just use a variable in DAX. I can use the variables to store the values for the different boundaries and then use the IF function to do the comparison to calculate which quarter the player is in.

Quartile Rank by Rushing Yards =
var high = PERCENTILEX.INC(
FILTER(
FILTER(
FILTER(
‘Stats’,
ISBLANK(‘Stats'[Rush-Yds]) = FALSE()
),
[Season] = EARLIER([Season])
),
[Position] = EARLIER([Position])
)
,[Rush-Yds],.75)
var mid = PERCENTILEX.INC(
FILTER(
FILTER(
FILTER(‘Stats’,ISBLANK(‘Stats'[Rush-Yds]) = FALSE()),
[Season] = EARLIER([Season])
),
[Position] = EARLIER([Position])
)
,[Rush-Yds],.5)
var low = PERCENTILEX.INC(
FILTER(
FILTER(
FILTER(‘Stats’,ISBLANK(‘Stats'[Rush-Yds]) = FALSE()),
[Season] = EARLIER([Season])
),
[Position] = EARLIER([Position])
)
,[Rush-Yds],.25)
return
IF(‘Stats'[Rush-Yds] <= low, 4, IF(‘Stats'[Rush-Yds] <= mid, 3, IF(‘Stats'[Rush-Yds] <= high, 2, 1)))

Using this method, I can tell which quartile each player is in within the context of a single season and position group. And then the fun with the charts begin.
image

If you’d like to download my Power BI Desktop file, you can get that here.

I hope you found that useful! Leave a comment and let me know what you thought!

One thought on “Calculating Quartiles with DAX and Power BI”

  1. Great explanation! How do you (or the DAX) deal with boundary ties? I’ve found this to be my nemesis when trying to quartile. Usually I have to work in rng (random number generator) and interpolate into unique values, painful.

Comments are closed.