SQL Saturday #86 Aftermath

Thanks to everyone who attended my session “Reporting on a Cube with SSRS 2008.” I had a great time at the event and I hope those who attended enjoyed the great sessions and maybe even learned a couple things.

If you’re interested, here is the link to download my slide deck and examples I used during the session. In my example, I used the Adventure Works 2008 R2 cube, so head over to Codeplex to pick up the materials.

Also, here is the link to Mosha Pasumansky’s MDX Studio Online tool that will format your MDX for you. It’s a very cool tool and very helpful. Thanks to Pam Shaw for providing this link!

Data Cleansing with Regular Expressions in SSIS

Regular Expressions are a powerful way for you to search for patterns in strings of text. In SSIS, we can leverage Regular Expressions to assist us in cleansing data. I’m going to show you a couple different ways we can accomplish this task.

The first method involved using a Script Component within a Data Flow Task to remove unwanted characters from our data. In this example, I’ve created some junk data that contains a mixture of letter and numbers.

image

I want to eliminate the letters from each value. To accomplish this, I will use a Script Component as a Transformation.

First, drag the Script Component into your Data Flow. Make sure you select the Transformation option and click OK.

Next, open the Script Component Editor and go to the Input Columns tab. Select the field you wish you cleanse and set the Usage Type to ReadWrite.

image

Then go back to the Script tab and click the Edit Script button near the bottom.

image

Add the following code to your Script Component:

using System;
using System.Data;
using System.Text.RegularExpressions;  //<<— Added to allow the use of regular expressions
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
//TestJunk is the field I am cleansing. Make sure you use the correct field name.
       Row.TestJunk = RegExCleanse(Row.TestJunk);
    }

public String RegExCleanse(String field)
{
// This is the RegEx pattern we are looking for
String RegExPattern = “[a-z]”;

Regex objRegExp = new Regex(RegExPattern);
field = objRegExp.Replace(field, String.Empty);
return field;
}
}

Once you’ve pasted the script into the Script Component and changed the name of the field and are using the desired RegEx, click OK a couple times to close the script editor and the Script Component Editor. We are now ready to test our script.

Here are the results of our RegEx data cleanse:

image

All the letters have been removed from our text strings.

But there’s actually an easier way to handle data cleansing with Regular Expressions. It’s called the RegEx Replace Transform and its included in Task Factory developed by Pragmatic Works. So lets take a look at how easy it is to use RegEx to cleanse our data without a bit of scripting.

image

I’ve already added the RegEx Replace Transform to my Data Flow Task and connected it up to my OLE DB Source. Double-click the RegEx Replace Transform to open the editor.

Next, click the drop down arrow next to the field you wish you modify with RegEx. Now we can configure the transform to perform the desired cleansing. For the Action, select “Replace matching regular expression pattern with user defined value.” You can also choose to extract a string of text from your field based on the RegEx patten you specify.

For the Output Action, select “Replace the column data with fixed data.”

Lastly, specify the RegEx patten next to “Search RegX Pattern.” Then click OK. Your configuration should closely resemble mine, seen here.

image

Now when I run my Data Flow Task, all letters are removed from my data. The RegEx Replace Transform from Task Factory is an explosive tool that easily allows you to utilize Regular Expressions to scrub your data without the need to write complicated scripts.

Also, just an FYI. If you’re interested in learning more about RegEx or need some helping writing Regular Expressions, check out http://gskinner.com/RegExr/. This site is an amazing learning tool when it comes to RegEx.

SSIS Batch Update without Staging the Data

The native SSIS functionality currently prevent you from performing a batch update from a Data Flow Task without staging the data. If you wish to update records from within the Data Flow Task, you’re going to have to use the OLE DB Command transform, which is very slow. If you’re update more than just a few records, I’d highly recommend first staging the data and then using an Execute SQL Task to perform the batch update.

Or you could just use the Task Factory Update Batch Transform from Pragmatic Works and avoid staging the data at all or having to use the performance-devouring OLE DB Command transform.

10-31-2011 3-07-16 PM

The Update Batch transform included with Task Factory utilizes an ADO.Net connection to perform a batch update of a SQL Server table quickly and efficiently. Configuring the Update Batch transform is very easy.

After dragging the Update Batch transform into your Data Flow design area and connecting a Data Flow Path to it, you can then double-click the transform to open the Update Batch Transform editor.

First you’ll need to either create a new ADO.Net connection or select an existing connection. In my case, I’m going to select a previously existing ADO.Net connection, ADO NET AW.

10-31-2011 3-13-31 PM

Next I’ll select the Table I would like to update. After selecting the table, I need to specify which column(s) should be used as key values in determining if a record in the table is a match and meets the criteria for updating the record. I can either select a single column or multiple columns to create a composite key.

10-31-2011 3-15-44 PM

Lastly, any fields in the table that I do not wish to update should be marked as ignore. Fields that need to be updated should mapped to the corresponding field in the Destination Column.

Now we’re ready to update our records in batch without the need for staging. One of the neat advantages to using the Update Batch Transform from Task Factory is the ability to use this transform as a destination or a transformation. If I choose to do so, I can allow the records moving through the data flow to move on to another transformation or destination, as you can see here.

10-31-2011 3-18-30 PM

I’ve found the Update Batch Transform to be very useful when performing mass updates. I’ve seen tremendous performance gains in my SSIS packages since I no longer need to perform additional reads/writes to first stage the data and then perform the update.

Head to PragmaticWorks.com to download the free Task Factory trial and check out some of the other useful task and transformations included with Task Factory.

Quickly Edit Package Configuration Values from BIDS

If you’ve followed my blog at least somewhat, you’ve probably read a blog post or two where I cover one or several of the really cool features of BI xPress. That’s because, simply put, BI xPress is a very versatile and useful tool that no SSIS developer should be without. It pains me to think about all the time I have to waste whenever I go work for a client that does not have BI xPress.

With that said, one of the smaller but more useful features of BI xPress that I found myself making use of is the Quick Config Edit feature. With Quick Config Edit, you can easily view the configurations being used in your package and edit the values of those configurations very easily, all without leaving BI xPress.

To open Quick Config Edit, go to the BI xPress menu at the top of BIDS and select Quick Config Edit.

10-31-2011 2-17-05 PM

Once Configuration Editor is open, you’ll need to specify where your configurations are stored, whether it be on the File System or in a SQL Server table. In my case, I’m using a SQL Server table to configure this particular package. Specify the server, the credentials used, the database, and the table.

10-31-2011 2-23-37 PM

You’ll then be able to view the Configurations in your package, the package properties being configured, the values used to configure the package, and also make changes to those configurable values. It even becomes possible to edit multiple package configurations at one time.

10-31-2011 2-39-22 PM

Here you can see a configuration I have for a connection string. From this window I can easily adjust the connection string stored in the SQL Server table. Once I make a change to the Value I can just click Save to make the change final.

10-31-2011 2-27-32 PM

This is an example of a configuration for a Package Variable. From this view I can easily see the scope of the variable (Foreach Loop Container, in this case), the variable being configured, and the value of the package configuration.

10-31-2011 2-27-48 PM

Quick Config Edit also has a very handy feature that makes is easy for you to do a quick Find and Replace of your package configurations, allowing you to update multiple package configurations in the blink of an eye.

10-31-2011 2-34-09 PM

Quick Config Edit is a very simple but extremely useful feature. It may seem small, but these are the kinds of features that increase the speed of package development by allowing you to stay in the BIDS environment. This is hands down one of my favorite features of BI xPress because it is so convenient.

Check out this page if you’d like to read more about the features of BI xPress.

MDX Calculation Builder Automatically Builds MDX Calculations for You

One of the most powerful aspects of Analysis Services is the ability to easily view your facts over time. What’s not always so easily is writing those calculations that make time analysis possible. MDX can have a steep learning curve, which is where the BI xPress MDX Calculation Builder by Pragmatic Works steps in.

After installing BI xPress, you will very quickly and easily add calculations to your cube without having to write a lick of MDX. To access the MDX Calculation Builder, open BIDS and open your SSAS project. After opening the cube you would like create the calculations in, navigate to the Calculations tab of the cube designer. There you’ll see the Calculation Builder icon.

10-30-2011 10-22-07 PM

After you click the Calculation Builder icon, the Calculation Builder wizard will open. The first thing you’ll need to do is select a calculation.

10-30-2011 10-29-37 PM

You’ll have about a dozen different calculations to choose from, including Year to Date, Month to Date, and Year Over Year Growth, just to name a few. In this example, we’re going to create a Year Over Year Growth calculation. Next we’ll need to select our fact.

10-30-2011 10-36-48 PM

Then we’ll select our Year level attribute.

10-30-2011 10-41-19 PM

The third step is optional. We can apply conditional formatting to our calculation. In this example, if the measure is less than 0, the font will display red. Greater than zero, the font of the measure will display as green.

10-30-2011 10-46-34 PM

Lastly, we’ll give our measure a name, select the measure group we’d like the measure to be a part of, select the formatting, and preview the calculation that was written by the tool.

10-30-2011 10-49-13 PM

Then we’ll click Finish. The calculation has been added to our calculation script within our cube.

10-30-2011 10-50-58 PM

After processing the cube, we can test out our Year Over Year Growth.

10-30-2011 10-58-48 PM

And its that easy. In just four simple steps we’ve created our calculation without writing a bit of MDX.

The MDX Calculation Builder also allows you to build named sets just as easily. Of course, as you learn MDX, the MDX Calculation Builder won’t always be necessary. But it sure will accelerate your learning and make writing MDX calculations easier in the mean time.

SQL Saturday #86: BI Edition in Tampa, FL – Reporting on a Cube w/ SSRS 2008

Next month on November 5th is SQL Saturday #86 in Tampa, Florida. But this is no ordinary SQL Saturday. This is SQL Saturday: BI Edition! Which means this SQL Saturday is going to be even more amazing, hard core, and in your face than usual! If you’ve never been to a SQL Saturday event, let me just say that you are missing out. And if you have been to one before, then you know exactly how awesome these events are. Whether you show up for the incredible FREE training from qualified industry pro’s, the networking with fellow SQL Server fiends, or the great lunch and snacks, you’re sure to have a good time.

So before you waste anymore time, head on over to their site and get registered. Check out the extensive list of sessions and speakers and plan your day.

Yours truly will be giving a session covering Reporting on a Cube w/ SSRS 2008 at 10 a.m.  I’ll be covering some of the advantages the cube presents as a data source and how you can leverage the power of the cube with SQL Server Reporting Services 2008. So head on over to SQLSaturday.com and get registered before it’s completely full! We’re going to have a blast!

MDX VBA Functions

The Visual Basic for Applications library functions are a great way to extend the functionality of your MDX  statements, allowing you to do some very handy stuff. Irina Gorbach has created a nice little document outlining the VBA functions available, so definitely check that out. There’s an important performance difference between the VBA functions and native MDX functions, which Irina points out.

One of the VBA functions I’ve been asked about by a couple people is the IsNumeric function. The IsNumeric functions returns a Boolean value indicating whether an expression can be evaluated as a numeric value. Here is an example (using the AW example cube) of how one might utilize IsNumeric:

With Member [Measures].[LastOrderYear] as

IsNumeric([Reseller].[Reseller].CurrentMember.Properties("Last Order Year"))

Select [Measures].[LastOrderYear] on 0,

[Reseller].[Reseller].children on 1

From [Adventure Works]

And here is what the results look like:

image

So check out the Irina’s document on the SSAS VBA functions. I think you’ll find it very informative and useful.

Showing an SSAS Hierarchy as Delimited String

I recently encountered a requirement for a client to display a list of all the individual stores of a ragged hierarchy (the leaf members) on an SSRS report with a bunch of different measures. But they also wanted to be able to quickly see the the ancestor members of each level above each store quickly by using a tool tip to display the different ancestors.

Turns out there are a couple nifty, little MDX functions that can allow us to do this very easily. The Ancestors and Generate functions make this possible. Here an example I put together with the Adventure Works 2008 R2 cube so you can play along.

WITH MEMBER [Measures].[Management Structure] AS
            GENERATE(
            EXCEPT(({Ancestors([Employee].[Employees].CURRENTMEMBER,6),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,5),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,4),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,3),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,2),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,1)}),[Employee].[Employees].[All]),
            [Employee].[Employees].CURRENTMEMBER.Name, "  >  ")

Select {[Measures].[Management Structure],
    [Measures].[Reseller Sales Amount],
    [Measures].[Reseller Order Quantity]} on 0,

NON EMPTY(DESCENDANTS([Employee].[Employees],10,LEAVES)) ON 1

From [Adventure Works]

Where [Employee].[Title].&[Sales Representative]

Pay attention to the calculate member [Measures].[Management Structure]. The way the Generate function works is that it takes one set and applies that set to each Member in another set. In this case, our first set is the ancestor from each level in our hierarchy of the current member, excluding the All member.

So this our first set:

EXCEPT(({Ancestors([Employee].[Employees].CURRENTMEMBER,6),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,5),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,4),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,3),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,2),
                  Ancestors([Employee].[Employees].CURRENTMEMBER,1)}),[Employee].[Employees].[All])

Our second set is the Name property of the CurrentMember of the Employee hierarchy being displayed:
 
[Employee].[Employees].CURRENTMEMBER.Name
 
The Generate function actually also excepts a third argument: A delimiter. We can specify a delimiter. In my example, I use “  >  “ to make the string a bit easier to read. If we execute the query we can see the results of our calculated member.

10-10-2011 9-47-39 PM

Pretty cool, huh? So now that we have each employees management hierarchy in sweet, little string, we can easily display this in a report as a tool tip.

I added the tool tip to the cell in my report’s tablix that contains the employee name. Here’s the expression I used:

=REPLACE(Fields!Management_Structure.Value,">",VBCRLF)

This expression replaces the “>” with a carriage return, line feed. Here’s what our tool tip looks like:

10-10-2011 10-02-24 PM

Using this method, we can easily display the managers for each employee in a tooltip without having to waste any real estate with additional columns.

I hope somebody finds this useful. I’m out!

Manage, Save, and Share SSIS Expressions

If you’re even slightly familiar with SSIS you know that one of the incredible strengths of SSIS is the ability to make packages dynamic in countless different ways. One of the ways SSIS accomplishes this is through the use of SSIS expressions and the expression language.

BI xPress has a really useful feature called Expression Manager. The Expression Manager is a very cool feature that allows an SSIS developer to create, share, and manage custom SSIS expressions.

Creating your own SSIS expressions is very easy. Simply open BI xPress and click the Expression Manager button in the top right of the window.

Capture 2

Or if you’re already developing a package in BIDS, go to your variables window and click the Expression List icon.

Capture 3

Now that the Expression Manager is open, we can browse the canned SSIS expression or create our own. Let’s create our own custom SSIS expression for us to use later on or share with our development team.

First, click the Create New Expression icon.

Capture 4

Then give your expression a name and a description. My expression is going to append the user’s name and the date to create a file name. Here you can see the expression.

Capture 5

And in case you want to copy it for use:

"C:\\DataOut\\Archive\\"  + RIGHT(@[System::UserName],FINDSTRING(REVERSE(@[System::UserName]), "\\", 1)-1)
+ (DT_WSTR,4)YEAR(GETDATE()) + "-"
   + RIGHT("0" + (DT_WSTR,2)MONTH(GETDATE()), 2) + "-"
   + RIGHT("0" + (DT_WSTR,2)DAY( GETDATE()), 2) + ".txt"

In my expression you can see that I leverage the UserName system variable to get the name of the user executing the package. I use FindString and Reverse to eliminate the domain and \ from the UserName, as well. Once you’re done, click Save and then click OK to close the Expression Manager.

Now that our expression is saved, we can use our expression in our SSIS packages. To utilize our expression in the package all we need to do is the use the BI xPress Expression Editor. We can clearly see our new expression in the My Expressions folder.

Capture 6

Besides allowing a user to manage their own custom SSIS expression, the Expression Manager features a host of canned, commonly used SSIS expressions. This drastically reduces the learning curve of the expression language. The Expression Manager comes with expressions that can create a file path and name with the date appended, create a dynamic SQL statement, or calculate the beginning of the previous month, just to name a few.

Capture 1

As you can see, the Expression Manager is a very powerful tool that allows you to fully realize the power of the SSIS expression language. To download a free trial version of BI xPress, check out PragmaticWorks.com.

Welcome the New Addition to the Ryan Family

If you’re wondering why you haven’t seen much activity on my blog lately, there’s a very good reason for that. On September 1st at 1:03 a.m., my wife gave birth to our first son and second child, Bradley James. Bradley weighed 6 lbs and 11 oz and is perfectly healthy. Being a father is an amazing experience, a lot of work, and an incredible blessing. Once we get the little guy on a good sleeping routine and we’re not up every couple hours to feed him, I hope to get back to blogging regularly. So stay tuned for some more great content. Now I’m gonna go take a nap.

20110924-120802.jpg

Dustin Ryan is a Data Specialist interested in Azure, SQL Server, and Power BI.

%d bloggers like this: