This is the second post for SSAS week at SQL University. If you’re just joining the class, go back to the first post, Why Do I Need a Cube & How Do I Get Started, to get your bearings and get your environment ready, and then join me back here. Today, I’ll walk you through the basic steps of building dimensions. Then, later in the week, I’ll cover cube construction and also discuss various cube enhancements that you can make, security measures to implement, and tools that enable user access to cube data.
Before you actually start construction, you need to open Business Intelligence Development Studio—affectionately known as BIDS. You’ll find it in the Microsoft SQL Server 2008 R2 program group on your Start menu. BIDS is a Visual Studio 2008 shell that installs as one of the SQL Server Management Tools when installing SQL Server 2008 R2. You don’t need a Visual Studio license to use it.
To create a new project, open BIDS, click File->New->Project, and then select the Analysis Services Project template in the Business Intelligence Projects. Give the project a name – something creative and inspiring like My SSAS DB. Your project will later become a database on the Analysis Server, but it’s nothing like a SQL Server database. The SSAS database has a lot more files associated with it, and stores data in a completely different way.
Data for your cube has to come from somewhere and the Data Source file is the first step in defining where it comes from. It’s quite literally copied from the source and placed into structures that Analysis Services manages. For this course, you’ll use the AdventureWorksDW2008R2 database as described in this week’s first lesson. To create a data source, right-click the Data Sources folder in the Solution Explorer window (which you can open from the View menu if it’s not visible), and select New Data Source. Using the wizard, you can step through the process of creating a new data source that defines the data provider, connection string, and authentication method (Windows integrated security or database login) for the source data. Nothing new here—you’ve probably done this sort of thing lots of times using similar interfaces. Bottom line is that you can access any data source with an OLE DB or .NET provider.
The Impersonation Information page of the wizard is something that you’ve probably not seen before. Here you define the account that Analysis Services uses to connect to the data source. Usually, you select the Use the service account option here. The service account is the one that is running the Analysis Services service, either a domain account or a built-in account like NETWORK SERVICE. You need to make sure that this service account has read permissions on AdventureWorksDW2008R2.
Next, you create a Data Source View (DSV). The DSV lets you focus on selected tables of the source database and make changes as a logical layer without changing the physical structure of your data source, which is helpful if you have only read permissions for that source. For example, you can create logical primary keys and foreign key relationships if they do not exist in your data source. Additionally, you can enhance the data by adding a Named Calculation that uses a valid SQL expression to calculate values in a new logical column of the selected table in the Data Source View. You can also replace a table with a Named Query, which allows you to select specific columns or add derived columns to a selected table just as if you were adding a new view to the underlying data source.
To keep keep this cube relatively small and simple while you learn the basics, you’ll use only a few tables from the source database. Right-click the Data Source View folder in the Solution Explorer window, and select New Data Source View. Step through the wizard and select following tables: DimDate, DimProduct, DimProductSubcategory, DimProductCategory, FactInternetSales, and FactResellerSales.
All the primary keys and foreign key relationships are detected, so you don’t need to make any adjustments to the DSV to define those required elements. However, to get the Date dimension to work properly, you need to concatenate some columns in the DimDate table to create new derived columns. Right-click on the DimDate label in the diagram and click New Named Calculation. For the first calculation, assign a name, like Quarter, and use the following expression:
CAST(CalendarYear AS CHAR(4)) + ‘ Q’ + CAST(CalendarQuarter AS CHAR(1))
Add a second calculation named Month using the following expression:
CAST(CalendarYear AS CHAR(4)) + ‘ ‘ + LEFT(EnglishMonthName, 3)
As a final task for the DSV, rename all of the tables to remove the Dim or Fact prefix and to shorten the table names related to product, so that you have Date, Product, Subcategory, Category, InternetSales, and ResellerSales. As you create database objects in subsequent steps, names of objects will be inherited from the tables here in the DSV and it’s usually best to rename everything here to minimize the renaming effort elsewhere. Tip: When working with your own data, the ideal is to rename everything back in the source database, using views if necessary.
When finished, your DSV will be similar to the figure below. Notice the icons for your named calculations Quarter and Month as a visual cue that you’ve enhanced the DSV.
The Data Source and DSV definitions are the foundation for building dimensions and cubes. In this step, you build as many dimensions as you need. You might have one table in a dimension, such as the Date table, or multiple tables (called a snowflake), such as the Product, Subcategory, and Category tables. Most cubes have many more dimensions, but you’ll get the idea of the steps involved by building the two dimensions for this lesson.
When you design a dimension, you select the relevant tables in the DSV, pick the attributes (which you know as table columns) to include in the dimension, define the behavior of these attributes, and relate each attribute to other attributes by setting properties. You use a wizard to do the basic setup of the dimension and then fine-tune the dimension design using a designer interface and the Properties window. There are a lot of properties that you can configure which can seem overwhelming when you’re new to cubes. Frankly, the default values are fine most of the time for simple cubes and need to be changed only when you’re working with more advanced scenarios. I’ll walk you through changing the most important properties in today’s lesson.
To get started, right-click the Dimensions folder in Solution Explorer, and click New Dimension. For the creation method, select Use an existing table, start with the Date table as the main table, and change the Name column to FullDateAlternateKey on the Specify Source Information page of the wizard. On the Select Dimension Attributes page, change Date Key to Date, and select the checkbox for the following attributes: CalendarYear, Quarter, and Month. Click through the rest of the wizard to wrap it up.
Now it’s time to fine-tune the dimension design in the dimension designer. First, define a hierarchy. Adding a hierarchy not only helps users navigate data more efficiently, but can also improve query performance when a hierarchy contains a natural one-to-many relationship between each level in the hierarchy from top to bottom such as exists between Year, Quarter, and Month. Just drag and drop attributes from the Attributes pane in the dimension designer to the Hierarchies pane. You can also right-click the name of the new hierarchy, select Rename, and give it a better name than Hierarchy. Maybe something like Calendar. The rule is that it must be unique across all hierarchies and attributes in your dimension. When finished, your dimension designer looks like this:
The warning symbol is there because you haven’t defined attribute relationships for the levels of the hierarchy. Attribute relationships are important for performance reasons, so do that next. Click the Attribute Relationships tab in the dimension designer. Drag Month on top of Quarter, and drag Quarter on top of Calendar Year. Now you’ve defined the one-to-many relationship from the key attribute (which is the primary key in the table) to each level in the hierarchy (and the little warning symbol is now gone on the Dimension Structure tab – go ahead and take a peek!).
Now use what you’ve learned to set up the Product dimension. When you add the dimension using the wizard, set the Name column to EnglishProductName. You’ll see that the wizard automatically includes Subcategory and Category because it sees the foreign key relationships between the three product-related tables. For attributes, select the following: Product Key (which you should rename to Product), Product Subcategory Key (which you should rename to Subcategory), and Product Category Key (which you should rename to Category).
Create a Products hierarchy in this order from top to bottom: Category, Subcategory, Product. If you go to the Attribute Relationships page, you’ll see that the attributes have the correct many-to-one sequence because the table relationships contained the necessary information. With Date, all attributes were in the same table so the designer couldn’t determine the sequencing automatically.
Tip: If you forget to select an attribute when you’re using the wizard, don’t worry. You can drag the attribute from the diagram on the Dimension Structure page to the Attributes pane at any time.
To view a complete dimension as it will appear in a cube, you must deploy your project. Just right-click the project in Solution Explorer, and click Deploy. Each Analysis Services project becomes a database on the Analysis Services server when you use the Deploy command in BIDS the first time. In addition, the files that you see in Solution Explorer get copied to the Analysis Services server as database objects and data gets copied from the source tables into these database objects. Tip: If you don’t want to deploy to a local Analysis Services server, right-click the project in Solution Explorer, click Properties, and change the Server property on the Deployment page.
After processing is complete, you can browse each dimension in BIDS using the Browser tab of its respective dimension designer. If you closed the designer, just double-click the dimension in Solution Explorer to re-open it. Each attribute becomes a two-level hierarchy with the top level containing an All member and the bottom level containing all the unique values found in the corresponding table column. Use the Hierarchy drop-down list to switch between hierarchies in the dimension. Here’s a fragment of the Month attribute hierarchy:
Notice anything strange? We’ll fix that in the next step. Whenever you design dimensions, you need to check each attribute hierarchy and each user hierarchy (Calendar and Products in today’s lesson) to make sure the names of each member appear correctly and in the right sort order. A member is the unique value in a hierarchy – such as 2005 Aug or 2005 Dec in the Month attribute hierarchy.
Step 5: Sort Order
The default sort order for dimension members is by name. You can sort by the key value, assuming the key is correctly defined and will produce the sort order that you want. You can also sort by another attribute in the dimension. To fix the problem above, you start by changing the key value to a composite key based on the year and the month number. Switch to the Dimension Structure tab, click Month, and then find the Key Columns property in the Properties window. If that window isn’t open, use the View menu to open it.
Click in the Key Columns box to display the ellipsis button, and then click that button. In the Key Columns list, select Month, and then click the arrow pointing left to remove the column from the list. Then double-click CalendarYear and MonthNumberOfYear to add the two columns as a composite key to the list. Click OK to close the dialog box. Now you’ll need to change the NameColumn property to Month.
Whenever you use a composite key, you must specify a name column. If you’re using only a single column as a key, that column will do double-duty as both key and name column unless you specifically define a name column.
To check the results, you’ll need to redeploy the project. Only the changes you made get deployed so it goes much faster than the initial deployment. Go to the Browser tab and click the Reconnect button (the second button on the toolbar) to re-establish communication with the server. The sort order should be correct now for the Month attribute hierarchy. The Month level of the Calendar user hierarchy will also correctly sort.
Most cubes have a few more dimensions than you’re creating here, but the process is very similar so I won’t walk you through the steps. For extra credit, though, you can try adding in dimensions like Promotion or Sales Territory. As I mentioned earlier, there are a lot of properties to configure for dimensions that allow you to get specific behavior for advanced dimension structures, but the default settings are sufficient when dealing with simple dimensions such as you have in the project now. If you’re really curious, you can check out the official list of properties, but most of this won’t likely make more sense until you can put it into context with examples.
In the next post in this series, Cube Construction 101, I’ll show you how to work with the cube wizard and the cube designer. Once you get the cube built, you’ll see how the dimensions work with cubes to facilitate the slice-and-dice capability for which SSAS is famous.
P.S. If you have any trouble with today’s lesson or have questions about dimensions, post a comment.