All posts by Dustin Ryan

Dustin is a Senior Business Intelligence Consultant with Pragmatic Works. You can find Dustin blogging and speaking at events like SQL Saturday, Code Camp, and SQL Rally. Follow Dustin on Twitter @SQLDusty.

Three SSAS Best Practices to Follow

There is a lot of work that goes into performance tuning a SQL Server Analysis Services solution for a client. And even though there may be many steps involved in identifying performance and management issues with a large SSAS solution, there are a few things that we can quickly check and implement to give us quick wins for improving the performance of our cube. In this post, I’d like to quickly point out three best practices that we can follow to improve performance and create a more positive experience for our users. These are not meant to be the top three best practices to follow, but rather three (among many) very important best practices you should follow. I believe following these three best practices will make a difference in your solution.

Create Hierarchies with Attribute Relationships

In my opinion, creating natural hierarchies are the single most beneficial thing an SSAS developer can do to improve the performance and usability of a cube. There are several reasons correctly defined user hierarchies are beneficial, but here are a couple of the top reasons.

Increased Query Performance

Calendar HierarchyCreating attribute relationships between attributes that are included in a user defined hierarchy improve the performance of queries using these attributes for a couple different reasons. In this case, we’ll look at the Calendar hierarchy in the Adventure Works Date dimension. The Calendar hierarchy has five levels with the Calendar Year attribute at the top level and the Date attribute at the bottom level of the hierarchy. We can also see the following attribute relationships created to give SSAS an understanding of how the members of these attributes relate to one another. Once created, these relationships give SSAS the ability to understand that a given date member is related to a single month member, a given month member relates to a single quarter, and so on and so forth.

imageThis also means that during processing, special indexes are created that map the relationships between each member of each level in the hierarchy. This means that before a query is written, SSAS also knows to which month, quarter, semester, and year each date aggregates. These indexes are only created if you correctly define the attribute relationships between the attributes in the hierarchy.

Improved User Experience

imageUser defined hierarchies also improve the user’s experience with the dimension because the hierarchy provides a very natural way for the user to navigate through the data in the dimension. A user defined hierarchy does two things for a user: 1) Using the hierarchy obviously presents the data to the user in an understandable and clear way. And 2) the hierarchy organizes the data based on your business requirements. This means that the user doesn’t have to think about how should these attributes be organized or labeled because you’ve done that for them. The user can focus on reading their reports, understanding the data, and making business decisions.

Partition Measure Groups to Separate Volatile and Static Data

imageMeasure groups that are larger than (about) one million records should be separated into multiple partitions. There are several advantages to separating larger measure groups into multiple partitions.

Increased Processing Performance

Partitions in a measure group are processed in parallel. This means that a measure group containing three years worth of data separated into one partition for each month will process faster than a measure group with all three years worth of data in a single partition. Your processing strategy for each measure group may vary depending on the amount of data. For instance, a large telecommunication company may collect hundreds of millions of records per day, dictating a more complex and granular partitioning strategy.

Also, we should consider which partitions contain data that is changing vs. data that is now static. If our businesses will continually log transaction in the current month, we can partition our data by month effectively separating our volatile data from the static data. This means we only need to process one months worth of data in our cube to pick up the latest changes instead of having to processing the whole measure group.

Increased Query Performance

By partitioning our data at the month level, certain queries will also perform better. For instance, imagine a user queries our measure group searching for last month’s sales figures. Because we have partitioned our measure group with each month being loaded into a separate partition, this means that the entire measure group does not have to be scanned. Only the partition containing the data for the month in question needs to be queried.

Create Aggregations for Large Measure Groups

You can think of Aggregations as indexes for SSAS. Aggregations are used to help SSAS find the answer to a user’s query faster. Aggregations are typically used for measure groups that are large and take a considerable amount to query.

Improved Query Performance

Aggregations contain the data of a measure group at a summary level typically higher than the lowest level of data included in the measure group. Aggregations are populated during the processing phase of the partition. You can think of Aggregations as exactly the same as the lowest level of the measure group just at a summary level. This means that when SSAS receives a query that can be answered using an Aggregation, SSAS does not have to spend the additional time required to retrieve the measure group data from the lowest level and roll up the data to the requested level because the Aggregation design already contains the data at the desired level.

Beware Too Many Aggregations

Because Aggregations are built during the processing phase, this means that for every aggregation you define more time is required to build the aggregations. This also means that it is especially important to only build useful aggregations that are necessary to improve performance. By creating aggregations that are not often used, you can degrade query performance and increase processing time with little to no benefit. Like indexes on a SQL Server table, too many aggregations or the wrong aggregations can actually hurt performance, so make sure you test, test, and then test to ensure your aggregations are helping your query performance.

I hope you have found this information useful. By following these three best practices, I really believe you’ll see a generous improvement in the performance of your SSAS cubes.

If you found this information helpful, I’d love to hear from you! Please leave me a comment and let me know what you think. Do you have any ideas on something I left out or should have included? Please let me know! Thank you!

Changes are A’comin’!

Over the last couple days you may have noticed some slight changes on my blog. Well believe you me when I say some more are coming! You may be asking yourself, “Self, what are all these changes for? Dustin’s blog was already great! Why mess with a good thing?” That’s a good question and one I hope to answer through this blog post.

Changes to my Blog

My WordPress blog site will be getting an upgrade! My current blog layout has served me well and I have always liked the look and feel but I think I could do for a better layout that makes the content more easily accessible to the reader. Because of this, I’ll be meeting with ladies in the marketing department at Pragmatic Works to have them give my blog and social media outlets a look to see what improvements I can make.

image

You may have also noticed that I’ve added my speaking calendar to my blog. At the top of my blog you’ll find a link to Dustin’s Calendar, as well as a smaller version of the calendar in my blog’s right sidebar. The purpose of the calendar is to make my speaking schedule more available to you. I love to have interaction with the audience and readers of my blog when I’m out speaking at SQL Saturday and other PASS events so hopefully I’ll be able to increase my availability and interaction with you through the calendar.

image

I’m also looking at creating some new and in-depth content for the blog including some new styles of posts and whitepapers, so definitely stay on the lookout for that material. If you have some great ideas of new content or style in a specific area of the SQL Server world send me a message! I’d love to hear from you!

Social Media

Follow Dustin on Twitter

For the longest time the only social media avenue I’ve really leveraged semi-professionally was my Twitter account. Sometimes I published technical stuff, links to my blog, interesting articles, but other times I published tweets unrelated to SQL Server. Going forward, I’ll be using my @SQLDusty Twitter account strictly for professional purposes.

I’ve always loved Twitter because the Twitter world provides you and I with a unique opportunity to connect, network, and gain access to incredibly qualified experts, professionals, and celebrities that work in our field who, without Twitter, we would never gain access outside of an expensive conference or training event. Almost anyone will respond to a 140 character message.

image

If you’re not following me on Twitter, please do so! Twitter is a great place to chat me up for ideas, questions, or just to say “Hi”. I’ll post links to new blog posts, whitepapers, speaking events, giveaways, and more, so be on the lookout for some changes to my Twitter Account and for some new content.

Like Dustin on Facebook

Also, I’ve created my own Facebook page! Facebook is also a great tool, but unlike Twitter, Facebook gives us the ability to make available a greater variety of types of content to you. Here you can send me a live chat message if you have a question about a challenge your facing on a project at work or if you want to pick my brain on a particular topic. With over 1 billion active users, Facebook gives you and I to interact in a completely different way than Twitter and in more than 140 characters.

image

So head over to my Facebook page and give me a Like, as I’ll be doing some giveaways in the very near future for some great free technical material and other fun stuff.

YouTube & Google+

Coming soon, I’ll be experimenting with a new (to me) style of content for training, education, and networking: YouTube! I’ll begin producing educational videos regarding SQL Server and Microsoft Business Intelligence. People learn in many different ways, but most people learn visually. I hope to increase the quality of the delivery of my content by leveraging short, to-the-point, self-produced training videos that you can watch during the work day while you take your lunch or during a short break from the grind.

image

So give me a follow on Google+  and subscribe to my channel on YouTube and be on the lookout for some very cool and informative videos on SQL Server and BI.

I hope you’re as excited as I am about all the new content I’ll be releasing and making available to you. If you have any great ideas regarding materials and content you’d love to see, leave me a comment, send me a message on Twitter, Facebook, YouTube, or Google+ and let me know! I’d love you to discuss your ideas!

Generate a Date Table via Common Table Expression (CTE)

Occasionally I find myself needing to generate a small table with a list of dates for various queries I may be running. To do this, I usually leverage the Date dimension since I do most of my work in BI environments with a traditional data warehouse. But if you don’t have access to a Date dimension table, you can quickly generate a date table using the following CTE:

;with dates ([Date]) as (
    Select convert(date,'2000-01-01') as [Date] -- Put the start date here

    union all 

    Select dateadd(day, 1, [Date])
    from dates
    where [Date] <= '2020-12-31' -- Put the end date here 
)

select [Date]
from dates
option (maxrecursion 32767) -- Don't forget to use the maxrecursion option!

Don’t forget to use the option (maxrecursion 32767) in your query to allow your CTE to surpass the default maximum recursion limit of 100. The maximum value you can use with option maxrecursion is 32767, which should allow you to generate a date table with with almost 90 years worth of dates, which for my purposes is usually plenty! Anyways, I hope you find this helpful.

T-SQL Script to Dynamically Create Table, Build Clustered Columnstore Index, and Partition Switch

Recently myself and Mitchell Pearson (blog|twitter) were working on a project for a client that required us to load a ton of data (dozens of TBs) into some tables each built with a clustered columnstore index. We discovered during testing that the fastest way to get that much data into the clustered columnstore index is to create an empty uncompressed table, load the data into the uncompressed table, then apply the clustered columnstore index to the table, and partition switch the data into the main table. In order to facilitate this, I created this script to dynamically create a copy of the target table (without the columnstore index), create the clustered columnstore index, and then do the partition switch automatically.

I hope you find this script helpful as an example.

DECLARE @DEBUG INT = 1

DECLARE @sql NVARCHAR(MAX),
    @cols NVARCHAR(MAX) = N'',
    @targettable nvarchar(max) = N'FactInternetSales', -- Target table 
    @partitionrange varchar(8) = 20110101, -- The partition range value you are loading
    @partitioncolumn varchar(100) = 'OrderDateKey' -- The column in the fact table you wish to partition your table by
    
-- GET PARTITION NUMBER 
declare @partitionnumber varchar(25)
Select @partitionnumber = $PARTITION.RangePF1(@partitionrange) -- Lookup the partition number based on the range value. This is needed for the partition switch

-- PREPARE SYNTAX TO CREATE NEW TABLE
declare @newtable nvarchar(max) = @targettable + '_' + @partitionrange

SELECT @cols += N', [' + name + '] ' + system_type_name + case is_nullable when 1 then ' NULL' else ' NOT NULL' end
 FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM dbo.'+ @targettable , NULL, 1);

SET @cols = STUFF(@cols, 1, 1, N'');

SET @sql = N'CREATE TABLE '+ @newtable + '(' + @cols + ') '

-- CREATE TABLE 
Set @sql = @sql + ' on ps_daily(' + @partitioncolumn + ');' -- Use the Partition Scheme in your environment

IF @DEBUG = 1
    BEGIN 
        PRINT @sql;
    END
        ELSE 
            BEGIN
                EXEC sp_executesql @sql PRINT @sql
            END

-- CREATE CCI
SET @sql = 
'CREATE CLUSTERED COLUMNSTORE INDEX CCIX_' + @newtable + ' ON ' + @newtable 

IF @DEBUG = 1
    BEGIN 
        PRINT @sql 
    END
        ELSE 
            BEGIN
                EXEC sp_executesql @sql PRINT @sql
            END
/*

Add your logic here to insert your data into the Clustered Columnstore Index table

*/

-- PARTITION SWITCH PARTITION
Set @sql = 
'ALTER TABLE ' + @newtable + ' SWITCH PARTITION '+ @partitionnumber +' TO ' + @targettable + ' PARTITION '+ @partitionnumber 

IF @DEBUG = 1
    BEGIN 
        PRINT @sql 
    END
        ELSE 
            BEGIN
                EXEC sp_executesql @sql PRINT @sql  
            END


-- DROP OLD TABLE
set @sql = 
'DROP TABLE ' + @newtable 

IF @DEBUG = 1
    BEGIN 
        print @sql 
    END
        ELSE 
            BEGIN
                EXEC sp_executesql @sql print @sql
            END

 

DISCLAIMER: Here’s the part where I say its not my fault if you blow up your environment because you ran some script you downloaded off the internet. Don’t run it in prod until you’ve verified it. :)

I’m Speaking at SQL Saturday #391 in Jacksonville, FL May 9th

image It’s that time of year again! SQL Saturday in Jacksonville, FL is upon us once again and I’m excited to be presenting a session titled, “Welcome to SSAS Tabular Models.” SQL Saturday #391 will be held on May 9th, 2015 in Jacksonville, Florida at the University of North Florida and is a totally free training event for SQL Server professionals and anyone wanting to learn about SQL Server!

My session is called, “Welcome to SSAS Tabular Models”, and will function as an introductory session on developing an SSAS Tabular model the right way. In this session I’ll discuss how to decide if building a Tabular model is the right choice, how to build a Tabular model completely from scratch, best practices you should follow, and things to avoid. If you’re new to Tabular Modeling or wanting to learn best practices, this will be a great session for you.

To get registered for SQL Saturday #391 completely for free, head over to SQLSaturday.com and click Register Now!

Choose Your Weapon: Self Service BI Edition

With an estimated 500 million Excel users in the world, it’s no wonder that Excel is the #1 business intelligence too in many organizations around the globe. And with the release of Excel 2013, the collection of powerful and flexible data analysis tools built into Excel has only continued to grow. Microsoft is constantly adding new features and functionality to Power BI at pretty fast rate, so now is a great time to start learning about everything that MS Power BI can offer your organization.

Because Excel is just full of a slew of incredible tools, its important for us to understand the difference between the tools, when you should choose each tool, and the role each tool can fill in your organization. That’s what this blog post is all about: Choosing the right self-service tool for the job.

Power Query: Data Discovery & Transformation

What: Power Query is a totally free add-in for Excel 2010 and Excel 2013 to search for and discover data from both within and outside your organization. All you have to do is open the Online Search window, type in what you’re looking for, and viola! Power Query will return a list of data source that you can preview and import into your Excel workbook and/or Power Pivot model.
image

And the list of possible data source types that Power Query can connect to is growing every month. Aside from the traditional data sources such as SQL Server, Oracle, Access, or SAP, Power Query can also connect to a pretty ridiculous amount of non traditional data sources such as: Hadoop, Facebook, Twitter, text files, Azure, Active Directory, Microsoft Exchange, and a bunch more. Gone are the days of being forced to work with IT in order to get the data you’re looking for. That’s kind of the point of self-service BI, right? :)

image

Not only can you import the data from your selected data source, you also have the ability to clean, transform, filter, and merge the data. With Power Query, we can easily apply a series of transformations in order to perfectly shape our data for our specific reporting requirements. Do you need to pivot your data? Transpose it? Split a column by a delimiter? Rename fields? Create calculations? Upper case the first letter of each word? Change a data type? Apply formatting? Merge census data with your corporate data? Power Query can do all of that and more without writing a single line of code. That’s pretty impressive if you ask me.
image 

And once you’ve customized the data to your liking, the query can be saved for future use or shared with other business users within your organization.

Where: Like I said, Power Query is a free add-in for Excel (2010 or 2013). Once the Power Query add-in is installed, Power Query gets it’s own little ribbon in Excel. Download Power Query here.

image

When: You’ll most likely find yourself using Power Query when you need to find a specific data set that may or may not currently exist within your organization in order to gain new insights into your business in ways that you previously haven’t had the capability for. If your data needs needs to transformed and shaped for your analysis, Power Query is a perfect candidate for the job.

Power Pivot: Data Modeling & Analysis

What: Power Pivot is a data modeling and data analysis tool all rolled up into one. With Power Pivot, we have the ability to extract data from multiple sources of differing types, create data relationships between the sources, and then model and format the data to our liking using Power Pivot’s drag and drop interface. We can embed hierarchies, unique calculations, and Key Performance Indicators into our Power Pivot model for use in Pivot Tables, Pivot Charts, and Power View reports.

image

With Power Pivot, your data can be refreshed and updated from its original data sources as often if you like. And if you deploy your Power Pivot workbook to SharePoint, your Power Pivot workbook can even be refreshed on a schedule of your choosing.

And with Power Pivot’s in-memory X-Velocity storage engine, performance is pretty incredible (like, don’t-blink fast), even with large data sets! 

Where: Power Pivot is an add-in for Excel 2010 that must be downloaded and install. Power Pivot comes included with Excel 2013 but must simply be turned on. Power Pivot gets its own ribbon in Excel and has its own development window that opens for designing your Power Pivot model.

image

image 

When: You’ll be using Power Pivot when you need to relate disparate data source to each other and create a custom data model, create calculations for new metrics, create natural hierarchies for navigating through your data, or create Key Performance Indicators (KPI) for dashboard or scorecard type reporting.

Power View: Interactive & Intelligent Data Visualizations

What: Power View is an adhoc & data exploration tool that is designed with the end user in mind. You don’t need to be a data visualizations expert to be able to put together impressive and always presentable reports with Power View. With Power View, you’re always working with live data whether you’re building a Power View report in Excel 2013 or in SharePoint 2010/2013.

image

Power View gives you the capabilities to explore you data in multiple formats quite easily by switching between chart types with a single click. No coding is required with Power View making this an ideal tool for the business person that just wants to do their job with no fuss or muss.

If you’re developing your Power View reports in SharePoint, you also have the very cool ability export your Power View report to Power Point in an interactive format! That’s right! I can export my Power View report into a Power Point presentation and still interact with it! Very cool!

Where: Power View reports can be developed in Excel 2013 using an Excel table or Power Pivot model as a data source. Power View reports can also be developed in SharePoint 2010 or 2013 using a Power Pivot model, SSAS Tabular Model, or an SSAS Multidimensional Cube as a data source.


Power Map: Geographic Data Visualizations

What: Power Map is a data visualization and data exploration tool that specializes in visualizes geographic data in a 3-dimensional format (using BING maps) and even animate it across time. Power Map allows you tell a story with your data by creating a Tour comprised of multiple scenes that focus on different aspects and trends in your data in a very unique and compelling way.

image

Once you’ve created your Tour, it can be exported as a movie to be shared with your team or to be shown during a presentation as an interactive guide of your data. I guarantee your audience hasn’t seen their data like this before!

Where: Power Map is a free add-in that can be downloaded and installed in Excel 2013. Get to Power Map on the Insert ribbon in Excel 2013. Download Power Map here.

image

When: Use Power Map when you want to tell a story with your geographic data on a 3D or 2D map, animate it across time to identify trends and gain new insights, or to simply explore your data in a different way.

I hope you find this information useful and that you’ve gained a better understanding of the self-service business intelligence tools that are part of Excel. Feel free to post any questions, comments, and let me know if you’ve enjoyed this post. Thanks for reading!

Navigating Hierarchies with MDX webinar recording is now available!

image Thanks to everyone that attended my webinar on Navigating Hierarchies with MDX! We looked at a bunch of different ways we can navigate up, down, and side to side in our hierarchies in order to do some really neat things with calculations. If you would like to view the recording, you can do that here completely for free! Also, if you’d like to view my PowerPoint slide deck and scripts I used for the webinar, you can download those from here. Just download the Navigating Hierarchies with MDX .zip file. :)

Now on to the questions!

Q: Is there anything like storeproc / pre stroed mdx query in SSAS which can be called in .net application.
A: There is a concept of SSAS stored procedures, which you can read more about here: http://msdn.microsoft.com/en-us/library/ms176113.aspx and here: http://asstoredprocedures.codeplex.com/

Q: How would ParallelPeriod handle a leap year?
A: ParallelPeriod returns the member at the same position in the specified period. So if the 29th day of February does not exist in the previous year, then no value will be returned:
image

Q: Can you use PeriodsToDate() on a ‘custom’ period like an Academic Term?
A: PeriodsToDate can be used on any user defined hierarchy.

Q: Setting date property for MTD, QTD seems straightforward.  What about WTD (week-to-date)?  It seems it might take some careful work each year.
A: If you have a Week attribute in your Date dimension, that should be set to Week, as well. That’s all that is required. :)
image

Q: What is the name of the zoom tool and highlighter used, just curious.
A: Zoomit. It’s free, too! I get asked that question every time I present.

Thanks for all the great questions, everyone! If you have any further questions, please feel free to post it here or to send me a tweet!