3 SSAS Dimension Design Best Practices to Live By

After spending the last five years or so designing and performance tuning SSAS cubes and dimensions for the clients of Pragmatic Works, I’ve noticed there seems to be a recurring pattern with poorly designed dimensions. Because of this I wanted to take a few minutes to point out three easy things you can do to improve the performance and usability of your solution’s dimensions Continue reading 3 SSAS Dimension Design Best Practices to Live By

Learn Designing a Data Warehouse from the Ground Up at SQL Saturday 442 Orlando, FL

image I’m very excited to announce that I’ve been selected to present a session entitled Designing a Data Warehouse from the Ground Up on October 10, 2015 in Orlando, Florida at Seminole State College of Florida! I’ll be presenting this session for the first time along side Mitchell Pearson (b|t). This is going to be an amazing event with tons of amazing, FREE training for everyone including SQL server newbies on up to those who have been in the profession for years. Continue reading Learn Designing a Data Warehouse from the Ground Up at SQL Saturday 442 Orlando, FL

Make a Difference as a Thought Leader: A Book Review

imageIf you’ve ever spoken at an event like a SQL Saturday or Code Camp, answered a question on a forum, written a blog post, or helped lead a discussion at your place of work then on some level you are already a Thought Leader. If you’re asking what a Thought Leader really is, you’re probably a lot like I was when I first started reading Denise Brousseau’s Ready to Be a Thought Leader. A Thought Leader is somebody that drives thinking and learning in a particular industry, group, or profession. These people are widely recognized as an expert and authority on their subject matter and a go-to-person for learning and insight into their field. Thought Leaders are men and women that take the time to increase their knowledge, share what they’ve learned, and make a difference in the lives of others in their niche. Ready to Be a Thought Leader demonstrates a seven step pattern laid out by Denise Brosseau instructing the reader on how to become an innovative, forwarding-thinking, cutting edge Thought Leader. Continue reading Make a Difference as a Thought Leader: A Book Review

What’s New in SQL Server Analysis Services 2016?

There’s a load of new features that are included in the release of SQL Server Analysis Services 2016 CTP2. I’m pretty excited about these changes and while these changes have been public for a while now, I’d like to share my thoughts. I’ll say that these features are included in the SSAS 2016 CTP2 release. This release does not include all the enhancements to SSAS 2016 and these enhancements are subject to change. You can read about the enhancements here. Continue reading What’s New in SQL Server Analysis Services 2016?

TSQL Script to Find Foreign Key References to a Given Column

It’s always kind of a pain to have to hunt down all those foreign key references so you can address the issues. So I put this script together (based on a script found on StackOverflow) in order to help me find all the required information related to a particular column in a specified table. I’m mostly posting this for my own reference later and for anyone else that may find this useful, so enjoy!

SELECT OBJECT_NAME(f.object_id) as ForeignKeyConstraintName,
    OBJECT_NAME(f.parent_object_id) TableName,
    COL_NAME(fk.parent_object_id,fk.parent_column_id) ColumnName,
    OBJECT_NAME(fk.referenced_object_id) as ReferencedTableName,
    COL_NAME(fk.referenced_object_id,fk.referenced_column_id) as ReferencedColumnName

FROM sys.foreign_keys AS f
    INNER JOIN sys.foreign_key_columns AS fk 
        ON f.OBJECT_ID = fk.constraint_object_id
    INNER JOIN sys.tables t
        ON fk.referenced_object_id = t.object_id

WHERE OBJECT_NAME(fk.referenced_object_id) = 'your table name'
    and COL_NAME(fk.referenced_object_id,fk.referenced_column_id) = 'your key column name'

Here’s a picture of what the results look like. I ran this query against the ReportServer database used for SSRS in case you were wondering.

T-SQL to find FK key columns

If you want to find every Foreign Key in your database, just eliminate the Where clause to bring back all the FKs. Hopefully you found this as useful as I did.

Three SSAS Best Practices to Follow

There is a lot of work that goes into performance tuning a SQL Server Analysis Services solution for a client. And even though there may be many steps involved in identifying performance and management issues with a large SSAS solution, there are a few things that we can quickly check and implement to give us quick wins for improving the performance of our cube. In this post, I’d like to quickly point out three best practices that we can follow to improve performance and create a more positive experience for our users. These are not meant to be the top three best practices to follow, but rather three (among many) very important best practices you should follow. I believe following these three best practices will make a difference in your solution.

Create Hierarchies with Attribute Relationships

In my opinion, creating natural hierarchies are the single most beneficial thing an SSAS developer can do to improve the performance and usability of a cube. There are several reasons correctly defined user hierarchies are beneficial, but here are a couple of the top reasons.

Increased Query Performance

Calendar HierarchyCreating attribute relationships between attributes that are included in a user defined hierarchy improve the performance of queries using these attributes for a couple different reasons. In this case, we’ll look at the Calendar hierarchy in the Adventure Works Date dimension. The Calendar hierarchy has five levels with the Calendar Year attribute at the top level and the Date attribute at the bottom level of the hierarchy. We can also see the following attribute relationships created to give SSAS an understanding of how the members of these attributes relate to one another. Once created, these relationships give SSAS the ability to understand that a given date member is related to a single month member, a given month member relates to a single quarter, and so on and so forth.

imageThis also means that during processing, special indexes are created that map the relationships between each member of each level in the hierarchy. This means that before a query is written, SSAS also knows to which month, quarter, semester, and year each date aggregates. These indexes are only created if you correctly define the attribute relationships between the attributes in the hierarchy.

Improved User Experience

imageUser defined hierarchies also improve the user’s experience with the dimension because the hierarchy provides a very natural way for the user to navigate through the data in the dimension. A user defined hierarchy does two things for a user: 1) Using the hierarchy obviously presents the data to the user in an understandable and clear way. And 2) the hierarchy organizes the data based on your business requirements. This means that the user doesn’t have to think about how should these attributes be organized or labeled because you’ve done that for them. The user can focus on reading their reports, understanding the data, and making business decisions.

Partition Measure Groups to Separate Volatile and Static Data

imageMeasure groups that are larger than (about) one million records should be separated into multiple partitions. There are several advantages to separating larger measure groups into multiple partitions.

Increased Processing Performance

Partitions in a measure group are processed in parallel. This means that a measure group containing three years worth of data separated into one partition for each month will process faster than a measure group with all three years worth of data in a single partition. Your processing strategy for each measure group may vary depending on the amount of data. For instance, a large telecommunication company may collect hundreds of millions of records per day, dictating a more complex and granular partitioning strategy.

Also, we should consider which partitions contain data that is changing vs. data that is now static. If our businesses will continually log transaction in the current month, we can partition our data by month effectively separating our volatile data from the static data. This means we only need to process one months worth of data in our cube to pick up the latest changes instead of having to processing the whole measure group.

Increased Query Performance

By partitioning our data at the month level, certain queries will also perform better. For instance, imagine a user queries our measure group searching for last month’s sales figures. Because we have partitioned our measure group with each month being loaded into a separate partition, this means that the entire measure group does not have to be scanned. Only the partition containing the data for the month in question needs to be queried.

Create Aggregations for Large Measure Groups

You can think of Aggregations as indexes for SSAS. Aggregations are used to help SSAS find the answer to a user’s query faster. Aggregations are typically used for measure groups that are large and take a considerable amount to query.

Improved Query Performance

Aggregations contain the data of a measure group at a summary level typically higher than the lowest level of data included in the measure group. Aggregations are populated during the processing phase of the partition. You can think of Aggregations as exactly the same as the lowest level of the measure group just at a summary level. This means that when SSAS receives a query that can be answered using an Aggregation, SSAS does not have to spend the additional time required to retrieve the measure group data from the lowest level and roll up the data to the requested level because the Aggregation design already contains the data at the desired level.

Beware Too Many Aggregations

Because Aggregations are built during the processing phase, this means that for every aggregation you define more time is required to build the aggregations. This also means that it is especially important to only build useful aggregations that are necessary to improve performance. By creating aggregations that are not often used, you can degrade query performance and increase processing time with little to no benefit. Like indexes on a SQL Server table, too many aggregations or the wrong aggregations can actually hurt performance, so make sure you test, test, and then test to ensure your aggregations are helping your query performance.

I hope you have found this information useful. By following these three best practices, I really believe you’ll see a generous improvement in the performance of your SSAS cubes.

If you found this information helpful, I’d love to hear from you! Please leave me a comment and let me know what you think. Do you have any ideas on something I left out or should have included? Please let me know! Thank you!

Changes are A’comin’!

Over the last couple days you may have noticed some slight changes on my blog. Well believe you me when I say some more are coming! You may be asking yourself, “Self, what are all these changes for? Dustin’s blog was already great! Why mess with a good thing?” That’s a good question and one I hope to answer through this blog post.

Changes to my Blog

My WordPress blog site will be getting an upgrade! My current blog layout has served me well and I have always liked the look and feel but Continue reading Changes are A’comin’!

Generate a Date Table via Common Table Expression (CTE)

Occasionally I find myself needing to generate a small table with a list of dates for various queries I may be running. To do this, I usually leverage the Date dimension since I do most of my work in BI environments with a traditional data warehouse. But if you don’t have access to a Date dimension table, you can quickly generate a date table using Continue reading Generate a Date Table via Common Table Expression (CTE)

T-SQL Script to Dynamically Create Table, Build Clustered Columnstore Index, and Partition Switch

Recently myself and Mitchell Pearson (blog|twitter) were working on a project for a client that required us to load a ton of data (dozens of TBs) into some tables each built with a clustered columnstore index. We discovered during testing that the fastest way to get that much data into the clustered columnstore index is to create an empty uncompressed table, load the data into the uncompressed table, then apply the clustered columnstore index to the table, and partition switch the data into the main table. In order to facilitate this, I created this script to dynamically create a copy of the target table (without the columnstore index), create the clustered columnstore index, and then do the partition switch automatically.

Continue reading T-SQL Script to Dynamically Create Table, Build Clustered Columnstore Index, and Partition Switch

I’m Speaking at SQL Saturday #391 in Jacksonville, FL May 9th

image It’s that time of year again! SQL Saturday in Jacksonville, FL is upon us once again and I’m excited to be presenting a session titled, “Welcome to SSAS Tabular Models.” SQL Saturday #391 will be held on May 9th, 2015 in Jacksonville, Florida at the University of North Florida and is a totally free training event for SQL Server professionals and anyone wanting to learn about SQL Server!

My session is called, “Welcome to SSAS Tabular Models”, and will function as an introductory session on developing an SSAS Tabular model the right way. In this session I’ll discuss how to decide if building a Tabular model is the right choice, how to build a Tabular model completely from scratch, best practices you should follow, and things to avoid. If you’re new to Tabular Modeling or wanting to learn best practices, this will be a great session for you.

To get registered for SQL Saturday #391 completely for free, head over to SQLSaturday.com and click Register Now!

Dustin Ryan is a Data Specialist interested in Azure, SQL Server, and Power BI.

%d bloggers like this: