This post is the second in a series of topics on MDX for SQL University. In my first post, Location, Location, Location, I introduced the tuple as a way to retrieve data from a cube with very precise instructions about what you want. In this post, I’m going to show you how to do that by writing queries. One thing that you will have to promise me as you read this post is to forget what you know about SQL queries (presuming you have some knowledge in that area). It will mess you up. If you start with a clean slate, you will have an easier time of it.
Preparing the Query Environment
Before I dive into writing queries, let me explain the setup. I’m using SQL Server 2008 R2 Analysis Services for these examples. You can work with earlier versions, but in some cases you will encounter date ranges that are four years earlier than the date shown in my examples. For example, instead of using Q3 2007, you would need to use Q3 2004. The measure values will remain the same. Here are links to the samples that you need to have installed:
- If using Microsoft SQL Server 2005, install the sample databases (AdventureWorks, AdventureWorksDW)
- If using a later version, download the sample databases from http://msftdbprodsamples.codeplex.com/, install the databases, and then follow the instructions at http://msftdbprodsamples.codeplex.com/wikipage?title=Installing%20Analysis%20Services%20Database to install and deploy the Analysis Services database.
Requesting a Tuple from a Cube
Okay, let’s start simple. Very, very simple. Go to SQL Server Management Studio, connect to Analysis Services, and click the MDX button on the query toolbar. An MDX query editor window will open. In the Available Databases drop-down list, select the Adventure Works database for your version of SSAS.
To retrieve a tuple from the cube named Adventure Works, you write a query like this:
SELECT FROM [Adventure Works] WHERE ([Product].[Product].&[359],[Date].[Date].[January 1, 2008],[Measures].[Sales Amount])
Notice the way that I refer to dimension members here. I’m no longer writing a pseudo-tuple as I did in my previous post, but am now using member unique names which tells Analysis Services the dimension and hierarchy to which the member belongs. If you leave out the dimension and hierarchy, you will still get an answer, but no guarantee that you have the right answer if it’s possible that another dimension and hierarchy contain a member of the same name. For example, what if I have a West region and a sales person with the last name of West? Analysis Services will use one or the other to construct the tuple. The only way I can control which one is to use the dimension and hierarchy name. In fact, I switched from using Mountain-200 Black, 38 to [Product].[Product].&[359] because there are two products with the name Mountain-200 Black, 38 (due to Slowly Changing Dimensions) and I want the one that actually has sales for January 1, 2008. I can refer to members not only by name, but by the value in the key column – which in this case is the surrogate key for the dimension. I use an ampersand before the key value wrapped in brackets to specify that it’s a key column in the tuple rather than a member name .
Now let’s see the results of the query. Press F5 or the Execute button in the toolbar to see a single value appear in the Results pane.
The WHERE clause in your query contains the tuple that you want to retrieve. It does NOT behave like a WHERE clause in a SQL query, which is why I asked you nicely to forget about that other language – at least for the duration of this post. The problem with requesting a tuple this way is that the result does not include any metadata that tells me what I’m looking at. So I can enhance this query by showing one of the members of the tuple as a column label and one of the other members of the tuple as a row label, like this:
SELECT [Date].[Date].[January 1, 2008] ON COLUMNS, [Product].[Product].&[359] ON ROWS FROM [Adventure Works] WHERE [Measures].[Sales Amount]
Press F5 to see that the result of the query is the same, but now we have some more context for the value that displays.
The way that I like to think about the query results for this example is that it works in three phases. First, Analysis Services retrieves my request for the member on columns. Second, it retrieves my request for the member on rows. Those two steps occur completely independently of one another. Third, Analysis Services looks at the intersections produced by placing member on columns and rows and adds in the WHERE clause to produce a tuple. That’s why we get the same value in the result. It doesn’t matter which member you put on columns and which member you put on rows and which you leave in the WHERE clause. Collectively, they produce a tuple.
You don’t have to use a WHERE clause in the query. Instead, you could put everything onto an axis – either rows or columns. Essentially, you’re moving the tuple fragments out of the WHERE clause onto an axis. The tuple produced by the query parser is the same, but we get more metadata in the results to see the context of the query. For example, try this:
SELECT ([Date].[Date].[January 1, 2008], [Measures].[Sales Amount]) ON COLUMNS, [Product].[Product].&[359] ON ROWS FROM [Adventure Works]
Here you see the results of the new query.
You can put any member of the tuple on either rows or columns. You can even write the query to show rows first and columns second like this:
SELECT [Product].[Product].&[359] ON COLUMNS, ([Date].[Date].[January 1, 2008], [Measures].[Sales Amount]) on ROWS FROM [Adventure Works]
But the result of your query stays the same. You can have a query that only has a tuple fragment on columns like this:
SELECT ([Date].[Date].[January 1, 2008], [Measures].[Sales Amount]) ON COLUMNS FROM [Adventure Works]
But you cannot have a query that has a tuple fragment on rows like this:
SELECT ([Date].[Date].[January 1, 2008], [Measures].[Sales Amount]) ON ROWS FROM [Adventure Works]
If I don’t ask for something in a tuple explicitly by putting it on columns, on rows, or in the WHERE clause, Analysis Services plugs in the All member for the dimension and hierarchy that I didn’t include. Usually. There is an exception but we’ll save that explanation for another post. This behavior is especially helpful when you have lots of dimensions and hierarchies. Otherwise, you’d have a tuple that is ginormous (your new technical term for the day).
Using Sets in a Query
Asking for data one tuple at a time is rather tedious. Fortunately, we can take our knowledge about query construction to the next level. Technically speaking, when we put a member on columns or rows, we were asking for a set. A set of one. When we put the tuple fragment on columns or rows, we were creating a tuple set. Another option is to explicitly create a set like this:
SELECT {[Date].[Date].[January 1, 2008], [Date].[Date].[February 1, 2008]} ON COLUMNS, [Product].[Product].&[359] ON ROWS FROM [Adventure Works] WHERE [Measures].[Sales Amount]
Whereas we use parentheses to denote a tuple, we use braces to denote a set. The order in which we define the member of the set determines the order in which the members display in the results. Try changing the sequence of members in the set to see the results.
MDX provides us with lots of functions to produce sets or to produce members that we use in sets so that we don’t have to list out members explicitly in a query. In my next post, we’ll explore some some of these functions.
1 Comment
[…] me, you should check out the first post, Location, Location, Location, and my previous post, Writing Simple Queries, to get oriented. I’m going to continue building on the concepts introduced in the earlier […]