The Local Cube Information Center
The Local Cube    Information Center    
www.cubeslice.com      

Supplement Your Microsoft Business Intelligence Strategy with the Fast Performance and Excellent ROI of Personal Data Marts

Tim Peterson

How fast is fast enough for OLAP browsing?

OLAP is at its finest when the data changes in less than a second. The user clicks and the data changes instantaneously. This is the kind of performance that allows a user to really explore an OLAP cube – drilling up, drilling down, trying one filter, adding an additional filter, switching the dimension displayed on the columns, etc.

Nigel Pendse wrote an article called Performance Matters for The OLAP Report (www.olapreport.com) in August, 2007. In it he says the following about OLAP tools:

Slow query performance has been consistently the most serious product-related reported problem, and for the last few years it has been the single most often complained of problem.

According to data gathered by the OLAP Report, industry-wide median OLAP query time averaged between 7 and 7.5 seconds in 2006.

Median input data volume and median query performance trends
Used with permission of The OLAP Report

Microsoft SQL Server 2005/2008 Analysis Services provides excellent OLAP browsing speed. But most organizations using Analysis Services could benefit in certain situations with the performance boost provided by personal data marts. If your OLAP query response time is over sixty seconds, you may be able to bring it down to five seconds. If you now have a response time of ten seconds, you may be able to reduce it to one second.

With faster speed your OLAP cubes will provide more actionable business intelligence because your business users will be willing to examine the data from more perspectives.

What Are Personal Data Marts and Why Haven’t I Heard About Them Before?

A personal data mart is a set of one or more customized local cube files that have been created for a specific user’s queries. Each local cube contains a small portion of the data available in the Analysis Server cube. Because they are small, they can be very fast.

For example, a company could have a large Sales cube, with data for 1,000 Sales Representatives for the past 5 years. The Sales Representatives examine the Sales cube every day. They spend most of their time analyzing their own sales for the most recent month. A local cube containing only the data for one Sales Representative for one month will have only 1/60,000th of the data in the Analysis Server cube. Unused members from the Product dimension, the Customer dimension, and other dimensions can be removed from the local cube. This small cube will browse much more quickly than the Analysis Server cube, especially when calculations are being evaluated. This fast speed can be achieved while still keeping all of the cube’s dimensions, attributes, measures, and calculated members.

Besides improving browsing speed, the local cube files also provide the flexibility for the Sales Representative to look at data when not connected to the corporate network.

Local cube files have been a part of the Microsoft BI toolset since 1998 with the release of SQL Server 7. Their purpose has been to provide users access to OLAP data while they are offline. The use of local cubes to improve cube browsing speed has often not been recognized.

While Microsoft has always provided the ability to create local cube files, it has often been difficult to find tools to automate the repeated creation of customized local cubes.

With the release of SQL Server 2005, the ability to design local cube files was improved. The Analysis Services Scripting Language (ASSL) gives the user full control over the creation of the local cube. Most users, however, do not have access to a tool that creates local cubes using ASSL. Local cube creation tools usually use the Create Global Cube statement, which gives far fewer local cube creation options. A cube created with Create Global Cube cannot be optimized for browsing speed in the same way as a cube created with ASSL.

The result is that there are organizations with a lot of experience in using Microsoft Business Intelligence, but little experience in using local cube files. Many of these organizations could achieve a dramatic improvement in their cube browsing speed by switching to personal data marts for their OLAP data delivery or by using personal data marts for at least some of their more challenging cube browsing situations.

How Can I Know If My OLAP Performance Will Improve With Personal Data Marts?

The only way to know for sure, of course, is to try it. However, we have found that personal data marts can be particularly useful in the following situations.

1. The cubes have one or more large, flat dimensions. Sometimes dimensions are designed without hierarchies or with hierarchies having very few levels. One example of this is the Sales Order dimension in the Adventure Works sample database that is distributed with Analysis Services. The Analysis Server cannot build effective aggregations for a large, flat dimension.

If you design a local cube that removes a significant portion of a large, flat dimension (such as limiting the Sales Order dimension to the sales for a single Sales Representative), you will greatly improve browsing speed when the user is looking at that dimension.

2. The cubes use complex calculated members. Calculations can slow browsing speed. Because there is less data in a local cube, the calculations can be evaluated more quickly. The improved performance is most evident in a calculation that has to evaluate a large number of null cells. Because local cubes can have all unused dimension members removed, the number of null cells can be exponentially reduced.

3. Each individual user is using only a small portion of the cube. Personal data marts are especially effective at improving browsing speed when each user typically looks at a small subset of the cube’s data – like the previous example of a Sales Representative looking at the most recent month’s data.

4. Some users only want to see an odd subset of the cube. I have often seen situations where users want to see an oddly shaped subset of the data – such as all sales from Region 1, together with the sales from two cities in Region 2, excluding sales initiated by the corporate headquarters staff, and only including sales to customers who haven’t purchased anything in the previous two years. A good OLAP client tool will allow a user to see this subset of the data, but it is much easier and there can be much better performance if the user starts browsing with this exact set of data.

5. Cube browsing speed is adequate for some but not all users, or is adequate at certain times but not other times. Analysis Server cubes can be effectively optimized for the typical user, but there may be certain users who are browsing the cubes in unusual ways that greatly increase query response time. When one user issues a long query, the query response time can be slowed for everyone using that Analysis Server. In these situations, it can be very helpful to create personal data marts for the users that have the lengthy queries. Specialized local cubes can be designed to improve the browsing speed for these users. Because the Analysis Server no longer has to evaluate these complex queries, the browsing speed for all the other users can be improved and will be more consistent.

Can I Achieve the Same Level of Performance Without Using Personal Data Marts?

In some situations, you can match the performance of personal data marts by using other strategies. In many cases, however, these other strategies are inadequate or may be more expensive to implement.

1. Add MOLAP aggregations. The Analysis Server gains a lot of its ability to quickly respond to queries through the use of aggregations. If you haven’t done so already, create MOLAP aggregations for your cubes. Aggregations will not, however, increase the browsing speed in large, flat dimensions. They may or may not be helpful in improving the speed of calculated members.

2. Set proper attribute relationships in hierarchies. When using the Analysis Server from SQL Server 2005/2008, aggregations cannot be created for hierarchies unless relationships are set for the attributes used to create the levels.

3. Add partitions to the cube. When used with appropriate slicing in the queries, partitioning can greatly improve cube browsing speed. If there is one single dimension that is causing browsing speed problems, adding partitions along that dimension can be very effective. Personal data marts, though, are more effective when different users need a cube optimized along different dimensions. It’s very difficult to take every individual’s situation into account when designing partitions for an Analysis Server cube.

4. Optimize calculated members. There are many strategies for optimizing calculated members which, in some cases, can dramatically improve browsing speed.

5. Use more powerful hardware. The Analysis Server performs better with a 64-bit server, multiple processors, and a large amount of memory.

6. Teach users how to avoid problem browsing areas. Many Analysis Server cubes will perform well as long as the user doesn’t pick the wrong combinations of attributes. In many, but not all, situations, slow performance can be avoided by picking appropriate slicers before attributes are placed on the columns and rows.

7. Simplify the Analysis Server cubes. Look at the list of situations where personal data marts can improve performance and modify the Analysis Server cube so that these conditions are removed:

  • Eliminate large dimensions
  • Remove the lowest level of detail in large dimensions
  • Add levels in large, flat dimensions
  • Remove calculated members if they are slowing cube browsing
  • Remove seldom used attributes

In summary, all these strategies can speed OLAP browsing. Some of them will make a big difference with very little effort. But in many cases, the easiest and cheapest option is to give your users personal data marts.

There are many excellent resources available for optimizing Analysis Server cubes. See the end of this article for suggestions on papers, web sites, and blogs.

Personal Data Mart ROI

In considering the ROI of a personal data mart, there are three primary courses of action:

1. Be satisfied with cubes that have slow browsing speed
2. Use one or more of the other strategies to improve cube browsing speed
3. Use personal data marts to improve cube browsing speed

Each of these options has a cost associated with it.

Course of Action #1 – Be Satisfied

When looking at the costs of slow browsing speed, consider the following:

1. What is the overall value of your Microsoft Business Intelligence system? How much did your BI system cost? What benefit is your organization receiving from the cubes? What additional benefits are available through more effective use of the cubes?

2. Your cubes will not give a good ROI unless they work well for your users. Whatever benefit you receive from the cubes will be greatly enhanced if they can be browsed quickly. Fast browsing cubes are an essential part of a Business Intelligence system. It is unwise to spend a lot of money building a BI system without also achieving the best possible cube browsing speed.

3. What speed is too slow? Nigel Pendse, in an article entitled What Is OLAP?, has said that a defining characteristic of OLAP is fast browsing response time – where simple queries take one second, most responses are returned in about 5 seconds, and very few responses take more than 20 seconds. I think this should be considered to be a minimum standard. Users (and organizations) will gain much more benefit from OLAP if the typical query is under 1 second and very few queries exceed 5 seconds. With this level of performance, users feel that they’re getting their questions answered immediately.


Course of Action #2 – Other Strategies

The cost is quite varied for the other performance-enhancing strategies:

1. Some strategies are very cost-effective. If you can achieve excellent cube browsing speed by adding aggregations, setting attribute relations, adding partitions, or by optimizing calculated members these would usually be the best options. Even if you have to hire a consultant to help you do this, the benefit should be worth the cost.

2. Some strategies reduce the value of the cubes. If you have to remove dimensions, attributes, levels, or calculations you need to consider the cost of giving up some of the cube’s functionality. Adding levels to a hierarchy can also make a cube harder to use. The option of teaching users to avoid certain types of browsing has a very high cost because it restricts OLAP browsing to the people who have learned how to do it, rather than having the benefits of OLAP insight available to everyone in the organization.

3. Some strategies can be expensive. The improved hardware option has a definite cost attached to it. And though better hardware can be very helpful, by itself hardware may not bring optimal performance.

Course of Action #3 – Personal Data Marts

When using the personal data mart strategy, you need to calculate the costs of doing these tasks:

1. Designing and creating the local cubes. It is fairly easy to create local cube files in SQL Server 2000. In SQL Server 2005/2008, it is usually necessary to use the Analysis Services Scripting Language (ASSL) to create optimized local cubes. The ASSL can be quite difficult to manipulate. Our company has developed a product, CubeSlice, which uses ASSL to create customized local cube files that can be optimized for browsing performance.

2. Distributing the local cube files. Local cube files need to be recreated when new data is available. These new local cube files need to be distributed to the users on a regular basis.

Summary of ROI Considerations

There are many factors involved in calculating the ROI of personal data marts. From my perspective, the key factors are as follows:

1. Poor cube performance is not acceptable. If OLAP cubes are needed by an organization, those OLAP cubes need to have excellent performance.

2. There are some inexpensive ways to improve the performance of Analysis Server cubes. You should always try these methods to see if they will give the performance you need.

3. In some situations, personal data marts and local cubes provide the fastest possible cube browsing speed. There are expenses in setting up a system of creating and distributing local cube files, but the results are often well worth the expense. When you use personal data marts, you can give individual users a set of cubes that are optimized for the querying they want to do.

Fast Cubes – Happy Users – More Business Insight

The goal of using personal data marts is to make OLAP fast – convenient, easy, and effective.

Fast cubes make for happy users. If there are happy users, there will be more users. And with fast cubes, each of those users will be able to find more insights to improve the organization.

About the Author

Tim Peterson is the lead author of Microsoft OLAP Unleashed (SAMS, 1999) and the author of Microsoft SQL Server 2000 Data Transformation Services (SAMS, 2000). He is a speaker, teacher, and consultant. He is the chief software architect of CubeSlice, a tool that automates the creation of local cube files.

Would you like to see if the use of local cube files in personal data marts could speed up your cube browsing? Sign up for a free demo of CubeSlice now. Using remote desktop software, we will show you how to design high performance, customized local cube files. You will see for yourself whether they would be useful in your organization.


For More Information

www.cubeslice.com

How to Dramatically Improve Browsing Speed Performance for Microsoft SQL Server 2005 Analysis Services

Local Cubes in SQL Server 2005 Analysis Services

The Local Cube Information Center


Tim Peterson’s BI Performance Blog – Local Cube and Other Strategies to Boost OLAP Cube Browsing Speed Performance

www.olapreport.com

What is OLAP?

Performance Matters

Extensive and detailed reviews of OLAP products

www.microsoft.com

Microsoft SQL Server 2005 Analysis Services Performance Guide by Elizabeth Vitt.

Query performance directly impacts the quality of the end user experience. As such, it is the primary benchmark used to evaluate the success of an OLAP implementation.

www.xlcubed.com

Build Better Cubes: Real-Life Advice on Building Analysis Services Cubes by Gabhan Berry

“Every business intelligence solution will have its problems with data scalability. It’s inevitable. Almost always, the source data will increase in size over time. What you have today may perform adequately but this may not be true next month or next year. As technology and hardware has improved, this problem has been alleviated but not solved; the limits have simply been moved, not extinguished. Building a cube that uses all data for all time, and where the data increases over time, is a recipe for a cube that will eventually be too slow to use.”

Richard Tkachuk’s Analysis Services Page, with an article about optimizing calculations – MDX Performance Hints

www.ssas-info.com

Everything about MS SQL Server Analysis Services 2005
(Suggestion – Select the menu item SSAS Articles by Subject and then choose Performance)

Microsoft OLAP by Mosha Pasumansky

Microsoft SQL Server Development Customer Advisory Team

Chris Webb’s BI Blog – MDXtreme Programming