Once again, I am serving as the professor at SQL University during MDX week. MDX stands for MultiDimension eXpression language and we use it to add business logic to cubes in the form of calculations, named sets, and KPIs as well as to configure security and to write reports. I can’t teach you everything you need to know about MDX in just a few blog posts this week, but I can help you get into the right frame of mind to learn MDX, to get some insight into how the language works, and where to learn more to further your studies.
Before you undertake MDX, you should know something about Analysis Services and cubes. Earlier this year, I presented a series of posts as part of SQL University to introduce you to this topic:
- Why Do I Need a Cube and How Do I Get Started?
- Dimension Design 101
- Cube Construction 101
- Cube Enhancements 101
- Cube Deployment 101
Before we dive into using MDX later this week, let’s start with perhaps the most fundamental concept that you need to understand – location, location, location.
Build On What You Know
You’ve probably used an Excel spreadsheet at some point and are familiar with using cell references to create formulas. For example, in the spreadsheet below, if I want to perform a calculation that includes the sales for the product “Mountain-200 Black, 38” on 1/1/2008, I would use the formula =B2. The letter B stands for the column reference and the number 2 stands for the row reference. It’s a two-dimensional cell reference.
“OK,” you’re thinking. “I got that. Rudimentary Excel. What does that have to do with MDX?” Bear with me a little bit longer and I’ll connect the dots for you.
Now consider that I have two sheets in a workbook with a similar layout. One sheet for 2008 sales and one sheet for 2007 sales. I want to calculate total sales for the same product in January. In this case, I can use the formula =’Sales 2008’!B2 + ‘Sales 2007’!B2. I’ve added a third dimension, the sheet name, to the cell references to tell Excel where to find the data for the calculation.
Let’s continue expanding on this concept:
- For four dimensions, let’s assume that I have a separate workbook for different stores and I want to combine the sales for all stores for the same product and same month across the two years. My formula would now look like this – ‘[Store A.xlsx]Sales 2008’!B2 + ‘[Store B.xlsx]Sales 2008’!B2 + ‘[Store A.xlsx]Sales 2007’!B2 + ‘[Store B.xlsx]Sales 2007’!B2.
- For five dimensions, I might use different folders on the same drive to store workbooks by promotion. So now each workbook reference would look something like this: ‘[C:\Top Customer Promotion\Store A.xlsx]Sales 2008’!B2.
- Moving on to six dimensions, I can reference workbooks on different drives.
- For seven dimensions, I can reference workbooks on different servers.
Now I’m going to get a bit silly, but just use your imagination with me for a moment because I can’t really do this. It’s just expanding the concept of location. Here goes… I can reference workbooks on different servers on different floors (eight dimensions), in different buildings (nine dimensions), on different streets (ten dimensions), in different cities (eleven dimensions), in different states (twelve dimensions), in different countries (thirteen dimensions), on different planets (fourteen dimensions) in different universes (fifteen dimensions)!
Okay, I’ve probably gone a bit too far with this example, and technically speaking, I wouldn’t make each of these aspects of a location into a separate dimension from an Analysis Services point of view. But I want you to get the idea that I can use location references in my Excel formulas to be very precise about which data to include in the formula. Excel understands relative references and absolute references. If you don’t tell it otherwise, it’s going to assume that you want a relative reference. That is, use the current cell, column, sheet, or whatever that happens to be “current” for the cell in which you are typing the formula unless you give it explicit instructions to use a different cell, columns, sheet, or whatever.
And Now for the Tuple
Now let’s bring this back to MDX. A key concept in MDX is the tuple. I say tuh-ple because I learned it that way and I’m too old to change. (Think quintuple, sextuple, octuple – et cetera. The letter u is short the way that I learned.) Some people say too-ple, and that’s okay too. The beauty of reading a blog post is that you can read it with whichever pronunciation you prefer and my pronunciation won’t get in your way!
Let’s move our sales data from Excel into an Analysis Services cube. I’ve simplified the product name because the tire size in there seems to distract people. So “Mountain-200 Black, 38” becomes “Mountain-200 Black.”
To get data from a cube, I need to ask for that data in the form of a tuple. A tuple consists of one member from every dimension. So in my two dimensional representation of cube data above, the tuple to get sales for the Mountain-200 Black product in January would be (Mountain-200 Black, 1/1/2008).
Actually, that’s what I call a pseudo-tuple because it’s not exactly the way we would write it in MDX, but I don’t want you to get distracted by the details. For now, let’s focus on the concepts of a tuple.
Tuples are like coordinates that you learned in geometry. If you have a graph with two dimensions, x and y, you represent a point by using the coordinate (x,y). In geometry, order matters – the x must come before the y. But with tuples, order does NOT matter. We can write your example above as either (Mountain-200 Black, 1/1/2008) or (1/1/2008, Mountain-200 Black) and get the same answer: $66,095.71.
Now in our cube, most dimensions have an All level. And that counts as a cell, too. So we can create a tuple (All Products, All Dates) to get the grand total sales: $4,831,250.71.
If we have two kinds of measures in our sales, Sales Amount and Order Quantity, then that adds a third dimension to our cube. So our tuple changes accordingly to include the new dimension: (Mountain-200 Black, 1/1/2008, Sales Amount).
In summary, a tuple is just a location in a cube. We use it to tell Analysis Services where to find the data that we want. Using MDX functions, we can describe sets of tuples that we want to access “as is” or use in formulas. In my next post, I’ll explain how to write simple MDX queries using this concept of tuples.