Thomas.I Microsoft BI & Analytics

A Site Dedicated to General BI Issues and MS BI Issues: Architecture and Technology

Archive for the ‘No Category’ Category

2012 in review

Posted by thomasivarssonmalmo on January 1, 2013

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 39,000 views in 2012. If each view were a film, this blog would power 9 Film Festivals

Click here to see the complete report.


Posted in No Category | Leave a Comment »

Estimated File Size in BISM Multidimensional(SQL Server 2012)

Posted by thomasivarssonmalmo on September 28, 2012

One property that has changed in the latest release of SSAS, now called BISM Multidimensional, is the Estimated Size of the cube database in Management Studio. This property no longer tells the whole story about the cube database size. 

Start Management Studio and connect to your SSAS service. Expand the Databases folder and right-click –> Properties on a cube database of your choice.

MyCubeDatabases

I will right click on the Adventure Works DW 2008R2 database and select properties.

MMS Estimate

In the Estimated Size property above you can see 20.66 MB. This might look good, since the source database file size is 141 MB. That implies a reduction to around 14 percent of the original database size.

The problem is that this estimate is not correct for the total cube database size, as you can see in the next picture.

FileFolderSize

The cube database resides in C:\Program Files\Microsoft SQL Server\MSAS10_50.MSSQLSERVER\OLAP\Data on my hard drive. There is one folder for each cube database.

The dialog is in Swedish (storlek på disk = size on disk) and says that the cube database is 89.5 MB on my hard drive. The size is 63 percent of the original database file size. This is the true story.

Be careful with the estimated file size in Management Studio in SQL Server 2012.

Edit: Update (2012-11-11). After kind assistance from Akshai Mirchandini of the Microsoft team, I have received some clarifications on points that were not clear to me when I wrote the post.

  • The estimate was already changed in SQL Server 2008 R2
  • The reason for the change was that opening Management Studio and updating the total file size took too long
  • It is an estimate of the compressed total file size.

Posted in No Category | Leave a Comment »

Power View in Excel 2013 Part Two: Maps

Posted by thomasivarssonmalmo on July 22, 2012

This is my second blog post about the functionality in Power View for Excel 2013, and it will show you how easy it is to create map visualizations in Excel 2013.

I start with a basic model with relational data from the Adventure Works database: country, calendar year and internet order quantity. What you see here is the basic table you get when you drag the attributes and the measure to the default field box in the lower-right corner.

The Basic Model

Now we will change this boring table into something more visually useful. Make sure that the table is selected and press the map button in the ribbon.

The Map Button

After you have pressed the map button you will see something like this on the canvas.

 

The first map

 

But we can extend this even more. Here is the design tool to the right, where I have added CalendarYear to vertical multiples.

 

The map design tool

The result is this map.

The final report

The purpose of this blog post is to show you how easy it is to integrate maps into your reporting and change a boring table into something more useful with a few clicks.

Posted in No Category | 1 Comment »

The Data Warehouse Value Proposition

Posted by thomasivarssonmalmo on February 18, 2012

This is a slide that I am working on, trying to present what the customer pays for compared to running canned reports directly on top of source systems. I do try to avoid details, from a customer perspective.

 

image

 

This is what I have in mind when trying to explain what I am doing and what the customer is paying for.

Posted in No Category | 3 Comments »

My Quick Guide to SSAS Query Tuning Part One

Posted by thomasivarssonmalmo on January 28, 2012

The intention of this blog post is to give a practical introduction to SSAS query tuning with the SQL Server Profiler tool. I think that SQL Server Profiler can be very useful, but it also has its issues, delivering too much information in a very unclear way. I hope that this post can be helpful and show how Profiler interacts with the SSAS engine.

The first step is to build a modified Adventure Works cube that only uses basic measures from the fact tables, with no MDX scripts (except for the default Calculate()) and no measure expressions. With measure expressions you will get more events in Profiler, for a single query, than you need when starting to learn this tool. I am using SSAS 2008 R2 SP1 and the Adventure Works DW database and cube project available on www.codeplex.com.

I have two measure groups in this model.

TheMeasureGroups

The following dimensions are used. They have no attribute relationships yet, only flat connections to the dimension key in each dimension. You can use as many as you like. I have included more dimensions than I need for this blog post, but I will refer back to this model in later posts.

TheDimensions

Finally, these are the relations between the dimensions and the measure groups. They are exactly as in the standard Adventure Works model, but without the measure expressions.

The relations

There are no partitions and no aggregations in the measure groups. We will add those in a later post and see their impact. I will improve this model step by step.

I assume that starting SQL Server Profiler is not an issue here. I have a saved template that I reuse for query tuning. The focus in this blog post is on some of the following events.

The Profiler Trace Events

Now it is time to start the fun. I assume that you have the SSAS service up and running and that you have started SQL Server Management Studio and connected to SSAS. Otherwise you will see a lot of file load events that I have described in two previous blog posts. Start the Profiler just before running each query; after each query, stop it and clear the events. What you will see in this blog post is storage engine events and nothing in the formula engine, since this model does not include MDX scripts yet.

I will start with this query:

Select {} on 0
From [Adventure Works DW2008R2 Performance];

Here is a subset of the events that you will see in Profiler.

FirstQueryNull

There are no Query Subcube events, so no activity has been going on in the storage engine. The query was sent to the server and a result was returned (Serialize Results Current). Resource usage shows zero on all counters. Thanks to Chris Webb's research about these counters here.

The second query will use a basic measure group measure on the columns axis and nothing more.

Select {[Measures].[Order Quantity]} on 0
from [Adventure Works DW2008R2 Performance];

Query2aPartitions

There is one read from the measure group and the partition containing the [Order Quantity] measure.

A new event occurs that was not shown for the first query. The event subclass numbers 2 and 22 (Query Subcube and Query Subcube Verbose) mean that the data was not retrieved from the storage engine cache.

Query 2 Query Subcube Verbose

If you look at the Query Subcube Verbose event in the more detailed window that appears when you select that event, you will see this.

Query 2 Subcube Verbose

We only sent a query that requested a measure from the cube, but no dimensions, as you can see above. For each dimension you have a line of zeroes, meaning that no specific dimension member was asked for in the query. Profiler only shows the dimensions related to the measure's measure group. The lines of zeroes are vectors, a mathematical notation, and that is how SSAS currently shows its internals.

This is not the main point here, but the resource usage counters for this query are the following, as explained in Chris Webb's blog post above.

Query 2 Resource Usage

The reads counter tells you the number of disk read operations for the query, and the reads KB counter tells you the physical amount of data that was retrieved.

The number of rows scanned is the number of rows decoded/filtered/aggregated by the storage engine. The number of rows returned is the number of resulting rows from the same query. I will not go into detail about these counters.

The next query will retrieve the same measure as before but add a new measure from the second measure group.

Select {[Measures].[Order Quantity],[Measures].[Order Quantity – Fact Reseller Sales]} on 0
from [Adventure Works DW2008R2 Performance];

For the same measure that we asked for in the previous query, we will get this event in Profiler.

Query 3 Cache

The data above is retrieved from the storage engine measure group cache. In Query Subcube and Query Subcube Verbose you will see event subclasses 1 and 21, showing that the data was retrieved from the measure group cache.

The second measure goes through the same process as the measure in the previous query.

Query 3 second Measure

The data is not cached since it has never been retrieved before.

Both Query Subcube Verbose events still show the same zeroes for all dimensions. This is the Query Subcube Verbose event for the second measure.

Query 3 second measure query subcube verbose

If you like you can run the third query a second time to confirm that both measures will be retrieved from the storage engine cache.

Finally, it is time to see what happens in Profiler when you run queries for dimension members. I will only use a date dimension to show this.

This is the first query.

Select {[Order Date].[Calendar Year].&[2010]} on 0
From [Adventure Works DW2008R2 Performance];

This is the clear message from the SSAS engine, in the Query Subcube event, that you have requested a dimension member.

Query Subcube query 4 query subcube

Can you see the 1 appearing at the end?

Query subcube verbose is a little bit more generous.

Query4 verbose

There is a number 2 on the Order Date dimension. This is the internal SSAS DataID of that member.

Next, if you run this query with many dimension members:

Select {[Order Date].[Calendar Year].&[2005],[Order Date].[Calendar Year].&[2006],[Order Date].[Calendar Year].&[2007]} on 0
From [Adventure Works DW2008R2 Performance];

Query 4 Many dimension members

Can you see the star (*) on dimension 4? This means that all members were requested from that attribute.

If you run this query with two dimension members:

Select {[Order Date].[Calendar Year].&[2005],[Order Date].[Calendar Year].&[2006]} on 0
From [Adventure Works DW2008R2 Performance];

Query4 two members

Here you will see the plus sign (+). This means that more than one member was requested.
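If you want to see the measure group read and the dimension vector in the same trace, you can combine a measure on columns with the two Calendar Year members on rows. This combined query is my own example, not one from the screenshots above:

```mdx
Select {[Measures].[Order Quantity]} on 0,
       {[Order Date].[Calendar Year].&[2005],
        [Order Date].[Calendar Year].&[2006]} on 1
From [Adventure Works DW2008R2 Performance];
```

On a cold cache you should see event subclasses 2 and 22 again, this time with the plus sign on the Calendar Year vector instead of a zero.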

I hope that this first part of the query tuning series explains how Profiler describes the storage engine's retrieval of data and of dimension members.

Posted in Analysis Services SSAS, No Category | Tagged: | Leave a Comment »

SQL Server 2012 Books

Posted by thomasivarssonmalmo on December 23, 2011

With a big release like SQL Server 2012, you can expect to see some new book titles on the shelves next year.

Here is the list that I have found so far:

The SQL Server RDBMS

Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions (MS Press, 30 April 2012): SQL Server 2012 TSQL has some important additions to the language, like being able to navigate to previous and next records and compute running sums, which we in the MDX community know quite well. Itzik Ben-Gan is the writer, and that tells me that it is a good book.

SQL Server 2012 Query Performance Tuning (APress, 30 April 2012): This is always an interesting subject and is usually a chapter in TSQL books. No details have been revealed yet about the content.

Analysis Services

Professional Microsoft SQL Server 2012 Analysis Services with MDX (WROX, 30 April 2012): The title hides the fact that this book will cover both BISM Multidimensional with MDX and BISM Tabular with DAX.

Applied Microsoft SQL Server 2012 Analysis Services: Tabular Modeling (Prologika, March 2012): Teo Lachev is the author of this book, which will describe the self-service concept in BISM, how to build applications in Excel and how to publish in SharePoint 2010. There will also be chapters dedicated to BI professionals.

Marco Russo, Alberto Ferrari and Chris Webb are working on a book about BISM Tabular. I will write more when I have the details. The title is “Microsoft SQL Server 2012 Analysis Services: The BISM Tabular Model”.

Reporting Services

Professional Microsoft SQL Server 2012 Reporting Services (WROX, 30 April 2012): I think that Reporting Services is the MSBI product with the most new features added in each release, so it is usually mandatory to buy a new book. This release will also cover Power View.

Visualizing Data with Power View (McGraw-Hill Osborne, 1 June 2012): This is a book about the new MS SharePoint report building tool for BISM Tabular models.

Integration Services

Professional Microsoft SQL Server 2012 Integration Services (WROX, 11 April 2012): Integration Services in SQL Server 2012 is a major release, so this book is also important. I am not sure if it will cover Data Quality Services and Master Data Management as well.

General MS SQL Server 2012 BI Books

Programming SQL Server 2012 (MS Press, 31 May 2012): I would have second thoughts about the title of this book. It is not about programming in TSQL, which is what the book with almost the same title covered in the SQL Server 2008 version. I can see some MSBI content, like integrating data mining into applications, building and maintaining data warehouses, and using Excel to build front ends for OLAP cubes. I will probably move this book to a general MS BI 2012 section in an update.

Introducing Microsoft SQL Server 2012 (O’REILLY/MS Press, March 2012): Half of the chapters in this book will cover MS BI topics like SSIS, DQS, PowerPivot, SSAS and MDX, as well as SSRS.

 

These are the titles that I have found on Amazon so far. I will update this list as soon as I find more interesting titles.

Happy Christmas Everybody!

Posted in Books, No Category | Tagged: | 3 Comments »

When Analysis Services 2008 R2 Wakes Up

Posted by thomasivarssonmalmo on November 13, 2011

This is the start of a small series about SSAS performance scenarios that might help in everyday performance discussions with users and customers.

The first post is about why SSAS can initially be perceived as slow after a restart of the SSAS service, when the first user connects. It does not matter if that first user is an SSAS administrator or a user connecting with Excel. I have used the Adventure Works 2008 R2 demo cube project, but I think that you will see the same behaviour on versions from SSAS 2005 onwards.

This post only uses the SQL Server Profiler performance tool, together with Management Studio or Excel 2010, to connect to the Adventure Works cube after a restart of the SSAS service.

Go to SQL Server Profiler under Performance Tools and start a new trace with a connection to SSAS. Use the standard template and save it to a file to get a first view of what you can do with Profiler. On the second tab I have selected the following events in Events Selection.

 

Profiler File Load Events

In a future post I will continue with more events in Profiler after a restart of the service, but this is enough for this scenario. Actually, you only have to select the File Load Begin and File Load End events.

Now you can stop SQL Server Profiler, if it is running, and push the clear trace window button. Restart the SSAS service in SQL Server Configuration Manager, and when the service is restarted you can start Profiler with the trace that you have created. When both the SSAS service and Profiler are started, you can connect to SSAS with a client like Management Studio or Excel.

In Profiler you will see the file load events in the picture below; they only show up after a restart of the SSAS service. These are the events that can make you perceive SSAS as slow during the first connection.

SSAS File Loading Profiler

If you play with ClearCache XMLA commands, or with clearing the file system cache, the file load events will not show up again for the same cube database. When you connect with XMLA in Management Studio, all database files on the server will be loaded from the SSAS files. With an MDX query, only the target database's files will be loaded.
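For reference, a minimal ClearCache command of the kind mentioned above looks like this. Run it in an XMLA query window in Management Studio; note that the DatabaseID below is an assumption and must match the ID (not necessarily the name) of your own cube database.

```xml
<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008R2</DatabaseID>
  </Object>
</ClearCache>
```

Clearing the SSAS cache this way does not unload the database files, which is why the file load events do not reappear.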

What can you do to improve the performance of the SSAS file loading? Buy a fast I/O system (disks). With an unlimited budget I would recommend solid state disks, which will give you quite amazing performance after a restart.

In a coming blog post we will look at more events after a restart of the service and compare that to what happens when you clear the SSAS cache and the file system cache.

Posted in Analysis Services SSAS, No Category | Tagged: | 1 Comment »

SQL Server DENALI Short BI Guide

Posted by thomasivarssonmalmo on September 3, 2011

There is a lot on my plate at the moment, so I have had limited time to blog, but since I am back in consulting, expect to see more posts here after the end of September.

SQL Server Denali is a big release from the BI perspective and I will add my own reflections here, but Denali has also been commented on by a lot of good BI thinkers that I link to on my site.

First, I think you will hear the abbreviation BISM (Business Intelligence Semantic Model) as many times as you have heard about the UDM (Unified Dimensional Model) in the past. Our old multidimensional friend Analysis Services, alias the UDM, will be called BISM Multidimensional when Denali ships. The product team blog still calls it Analysis Services; the blog itself is the Analysis Services and PowerPivot Team blog.

What about the news for BISM Multidimensional in Denali? Not much in the way of new features, except that you can choose to bypass the 4 GB string store limit for a cube. This probably comes with a price, as usual in the trade-off between performance and features in this business. There is no product that combines good performance with a lot of features, even if sales people claim otherwise. The big news may be that this is still the best tool for building really large cubes on top of big data marts. It is not dead, and you cannot exchange it for other MS BI servers when you need to construct complex business logic or browse large cubes.

BISM Tabular is the Vertipaq engine, or column store mode, for Analysis Services. When you install Analysis Services in Denali you will have to choose one of the modes, tabular or multidimensional, in the first installation. To add the second mode on the same server you will have to run the installation a second time. BISM Tabular is really a server version of PowerPivot for SharePoint, outside of SharePoint. You do not need SharePoint to install BISM Tabular, that is, Analysis Services in tabular mode.

Some questions arise if you try to install BISM Tabular and BISM Multidimensional on the same box. BISM Tabular is totally memory based, so if you need both I would not run it on the same box as BISM Multidimensional or the SQL Server relational engine (RDBMS). In fact, I think that you need several separate servers to support the SQL Server RDBMS, BISM Multidimensional, BISM Tabular and SharePoint 2010 (with Reporting Services). This will mean some additional decisions regarding license costs and hardware budgets. It will also add complexity to infrastructure setups, with Kerberos knowledge being mandatory.

Why would you need BISM Tabular? First, it is more flexible than BISM Multidimensional. You can deploy Excel 2010 PowerPivot models to BISM Tabular, so users (or non-consultants) with a good knowledge of DAX can build their models in Excel and send them to an IT department for further work or deployment. I must admit that it is far quicker for me, as a professional, to build models in Excel 2010 PowerPivot version 2 than to do the same in BISM Multidimensional. In BIDS for tabular projects, the UI experience is very close to building models directly in Excel 2010. I would recommend using BISM Tabular for dimensional models that require more changes, within shorter time frames, than what BISM Multidimensional can offer.

BISM Tabular models can be queried in Reporting Services with MDX. That can be a way to bypass the SharePoint 2010 requirement and integrate Reporting Services in native mode with other portals. Excel 2010 is also supported as a client, and Excel also uses MDX to query BISM Tabular models.
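As a sketch of what that looks like, here is an MDX query against a hypothetical tabular model; the model, table and column names below are illustrative assumptions, not from a real project.

```mdx
Select {[Measures].[Sum of Sales Amount]} on 0,
       [Date].[Calendar Year].Members on 1
From [Sales Model];
```

Tables in the tabular model show up as dimensions and each column becomes an attribute hierarchy, so the MDX reads just like a query against a cube.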

The second pro of using BISM Tabular is that it is the only supported data source for Crescent, the new reporting tool in Denali, focused on letting non-IT professionals build quite complex reports. Crescent uses DAX, the query and expression language of BISM Tabular. This is a killer application for non-IT professionals and the first serious attempt from Microsoft to build a user-friendly reporting tool. The downside is the SharePoint 2010 Server requirement for Crescent.

DAX is now a limited query language, and not only an expression language, but without the capabilities of MDX. It is like a miniature MDX that uses functions for the query part.

Reporting Services in Denali, and only in SharePoint 2010 integrated mode, also has a new event subscription model for KPI changes that users can configure themselves.

In TSQL and the relational engine, new window functions are added for building cumulative values and getting the notion of a previous record, which we have had in MDX since the start. Reporting with TSQL on relational sources will have more capable tools in the Denali release.

PowerPivot in Excel comes with some great features that are also part of BISM Tabular. You can build natural hierarchies in the dimension tables. There is an ER diagram view where you can relate tables with a drag-and-drop columns UI, build hierarchies and see DAX calculations. DAX measures can be formatted. Another important addition is that some DAX functions have been simplified. An example is the native DistinctCount() function. No more CountRows().

The relational engine in Denali also supports the Vertipaq column store model, with special indexes. Vertipaq, or the column store, is managed by the relational engine's query optimizer, so adding these indexes to a relational data mart can boost the performance of TSQL-based reports. I guess that this will make the ROLAP mode of BISM Multidimensional even more interesting in the near future.

SSIS will see a huge release in Denali, but I will have to refer to other good blogs for comments about the improvements in this product. I use SSIS not only for ETL but also for building recordsets for import into Excel, for when you do not have the time to wait 30 minutes for SSRS to build a report with 50 thousand records.

Posted in No Category | 1 Comment »

Performance and Query tools for SSAS

Posted by thomasivarssonmalmo on May 6, 2011

SSAS consultants and management experts are starved of good tools for analyzing MDX queries. We have nothing. Please vote on Connect to make a change to this, via this link.

This is VERY IMPORTANT.

Posted in No Category | Leave a Comment »

MVP for the third year

Posted by thomasivarssonmalmo on October 2, 2010

I got the email yesterday saying that I have been awarded SQL Server MVP for the third year. This is a great honour and I will work hard during the next 12 months.

My blog has moved from Live Spaces to here, since MS will not continue with LS. After the migration a lot of the pictures no longer have the proper size, and I will have to adjust that. I will also remove older entries that no longer serve any purpose.

Posted in No Category | Tagged: , | Leave a Comment »