Big Mac Index with Power Map and Power View

The Big Mac Index was invented by the news magazine “The Economist” to measure whether a country’s currency is over- or undervalued. The idea is to use the Big Mac as the comparative product, since it is essentially the same product all over the world. You take the price of the Big Mac in each local currency, convert it to USD at the market exchange rate, and compare it to the US price. You can read more here.
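As a rough sketch of the arithmetic, with made-up prices rather than figures from The Economist:

Implied PPP rate = local Big Mac price / US Big Mac price
Example: 48 SEK / 4.80 USD gives an implied rate of 10 SEK per USD.
At an actual exchange rate of 6.5 SEK per USD the burger costs 48 / 6.5 ≈ 7.38 USD,
so the krona would be about 7.38 / 4.80 - 1 ≈ 54 percent overvalued against the dollar.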

So what I have done is import the data from The Economist, spread over several Excel spreadsheets, and used Power Query to change data types and make some minor corrections. Only a few countries have continuous data from 2000 to 2014, and many of the world’s countries are not part of the index.

Power Map

Above is the state in early 2014. Since the US is the comparative country, nothing will show up there. Overvalued countries and currencies are entirely dark blue. Undervalued countries and currencies have white edges. According to the map, Norway, Switzerland, Sweden and Venezuela have the most overvalued currencies.

Here is a recording of the tour where you can see the development over 14 years.

 

We can also have a look at Norway in Power View.

Power View

As I can say from several stays in Norway, it is a really expensive country, and a beautiful one. Norway stays at the top during most years.

My BI Year 2013

I had too little time to blog as much as I wanted. The post about SSAS evaluation nodes took me a lot of time to write, and it should have had a second part with more practical material, but there was simply no time for that in 2013.

So what happened then? I can mention the word certification. I took both the 70-463, Implementing a Data Warehouse with SQL Server 2012, and the 70-461, Querying Microsoft SQL Server 2012, and I am currently preparing for the 70-462, Administering Microsoft SQL Server 2012 Databases. The last two, from my point of view as a BI consultant, are a waste of time. I am very happy to have passed the 70-463, because it is very useful in my daily work. Parts of the TSQL certification are useful, maybe 30-40 percent, but I would like to see a more BI-focused database engine part with the admin and TSQL parts in one exam. Right now both exams are too developer- and DBA-focused. The goal is to get the MCSA title, since it is mandatory in the company that I work for. It will not make me a better BI consultant, since I never do cluster installations, Always On installations or set up replication for the database engine. Most of my customers either have outsourced maintenance of hardware and software or would never let me play with their IT infrastructure. I have colleagues who have these skills and do this work daily.

The parts of BI that I have spent most of my time on over the last ten years, like Reporting Services and BISM MD/Tabular, come in the second certification step, to become a “MS Certified Solutions Expert”.

Now to the other parts. The year was mostly spent on the ETL or ELT part of BI. I have been involved in a naming standard document for BI, cleaning up cases where the same term applies to different objects or different terms apply to the same object. I was also involved in discussions with customers about ETL or ELT strategies.

I attended a five-day course on the new PDW V2, and that hardware and software solution really can compete in the high data volume scenarios of BI. Do not think of it as only a part of the Big Data scenarios, because it can integrate well with SQL Server environments without the Big Data part. My personal opinion is that you should question investing in more than a 4-socket CPU system, because of the declining return on adding more CPU power. This also calls into question the Fast Track part of the Microsoft BI offerings, since PDW V2 gives much better value for money.

I also worked with a POC for a customer that included SQL Server 2012, BISM Tabular, SharePoint 2013 and Excel 2013. The fun part was to load a 340 million record fact table into a dedicated tabular server and see that it worked very well, and that compression shrank the data to 15 percent of the original size.

I met a very smart customer that maintains their own business catalog with product and customer definitions outside of their ERP system. When they upgrade their ERP system they use this business catalog with cleaned product and customer keys. That is what I call true master data management.

Slowly the interest in more self-service is rising here in Sweden. Since Power Pivot requires quite capable hardware, like 64-bit laptops with 8 GB RAM and an SSD, plus Office 2013, it will be an investment for most organizations that requires approval, since most laptops are still 32-bit with 2-4 GB RAM. On top of that, a SharePoint infrastructure is also needed. Still, the idea that IT provides datasets to users, who then use Excel 2013 with Power Pivot and perhaps Power Query, is attractive since it will reduce the pressure on IT to help with data mashups.

Alternative to drill through in Excel 2013: Quick Explore

I sometimes get feedback from customers on features that I did not recognize when testing on my own. Quick Explore in Excel 2013 is such a feature.

When we build BI solutions and start to deliver data in a tool, customers usually want to validate the data against the underlying transactions in the source system. BISM Tabular and BISM Multidimensional both have a feature for this called drillthrough, but you have limited control over what will be presented.

I start with a Pivot Table in Excel 2013 that uses a multidimensional cube, but it will work the same way in Tabular. You probably recognize the Adventure Works multidimensional cube.

StartUpPivotTable

In the next step I will highlight the CY 2005 and United Kingdom cell, which shows that the order quantity was 96. You will see a small magnifying glass appear to the right of the cell. That is Quick Explore.

QuickExplore

Click on that icon and a small window will open up. I have chosen to drill down to Internet Sales Order Details –> Sales Order Line to get the reference numbers.

 

DrillToQE

The result might seem disappointing at first, since we will see all years of order quantity for United Kingdom, not only 2005.

 

UK for all years

This issue can be quickly fixed by moving years from columns in the report to the report filter box.

 

MoveDateToFilters

Finally we have a clean list with the sales order numbers for United Kingdom and the year 2005, without any other redundant information.

 

EndResult

In a Tabular or Multidimensional model you will need to create a dimension with a 1:1 relation to the fact table, referred to as a degenerate dimension, and in large fact table scenarios this can be a performance problem.
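As a rough illustration of the relational side of such a degenerate dimension (a sketch only, using AdventureWorksDW-style table and column names, not the exact model above), the dimension source is typically just the key columns of the fact table itself:

CREATE VIEW dbo.vDimSalesOrder
AS
-- Degenerate dimension: the attribute keys come straight from the fact table,
-- so the dimension has roughly as many members as the fact table has rows.
SELECT DISTINCT
    SalesOrderNumber,
    SalesOrderLineNumber
FROM dbo.FactInternetSales;

With hundreds of millions of fact rows this dimension becomes just as large, which is exactly the performance problem mentioned above.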

Happy Quick Explore!

BISM MD: Introduction to Evaluation Nodes

This is a first introduction to the concept of evaluation nodes in SQL Server Profiler and what they can tell you about what is going on in the SSAS engine. I would also like to examine where the evaluation nodes show up in the Profiler trace. Evaluation nodes were introduced as events in SQL Server Profiler with the SQL Server 2012 release; they were part of the engine before that, but it was in SQL Server 2012 that they were added as Profiler events.

I must acknowledge Akshai Mirchandani, from Microsoft, who has answered my questions and given me feedback on this text. If anything is wrong here, blame me.

SSAS/BISM Multidimensional Query Plans

Like the SQL Server RDBMS, SSAS produces query plans when receiving queries, before doing any real physical work to retrieve data. Query plans in multidimensional structures are complex and would require many chapters in a book to cover, so it is not possible to go into all the details about how this works in SSAS/BISM MD in this blog post. I refer to Analysis Services 2008 Unleashed for details.

There are two main parts at work in this process. The first is the Formula Engine (FE), which does most of the work except going to the disk system to retrieve data. A simplified general picture is that the FE handles the MDX queries sent to SSAS and breaks each request down into smaller parts, called subcubes. It then tries to find out if it can find the data in the FE cache or will have to send the request further to the Storage Engine.
Retrieving the actual data from disk or the file system cache is handled by the other part, the Storage Engine (SE), unless the data is already cached in the FE.

Logical and Physical Query Plans

The SSAS FE creates both logical and physical query plans. The SE has its own query plans, but they are not part of this blog post. The word plan here is not fully accurate, since the Profiler trace shows how the query was actually executed, not a planned query execution. The SQL Server RDBMS query optimizer works with estimated and actual query plans, but that is not the case for SSAS/BISM MD.

Evaluation nodes are essentially nodes in the logical plan. Instead of using the term logical query plan, it is recommended to use evaluation node. They get converted into nodes in the physical plan and might get executed; evaluation nodes can be created but not always executed. They can also be cached and reused by other cells during query execution. All of this takes place in the Formula Engine.

Physical plan nodes in the FE implement calculation operations like multiplication and division, but none of that is exposed in the Profiler trace events for evaluation nodes.

Can Evaluation Nodes help in Query Execution Analysis?

Evaluation nodes are chatty events, like most SQL Server Profiler trace events for SSAS/BISM MD. A lot of data gets generated, and not in a friendly user interface. With more complex calculations than this example you can get a lot of evaluation node events that can be hard to find patterns in. This blog post shows how the evaluation nodes appear as three groups of events, one group that gets executed and two that do not. The post will also help you find the XML tags with interesting information about the query plan.

One example of interesting information is whether an evaluation node is run in cell-by-cell mode or block mode. Have a look at the table at the end of this blog post, where several XML tags in an evaluation node are explained.

 

The cube used and the queries

This is my sample cube with two measure groups (no calculations or measure expressions). It is the same cube that I used in my previous blog post.

The Cube Structure

There are no natural or user hierarchies, only attribute hierarchies. Each measure group has only one partition. There are no MDX scripts and no aggregations. The aggregation property of each attribute is set to the default.

This is the first query that was executed on a non-cached cube.

The MDX Query

Below is the result of the simple query above.

The result of the MDX Query

The Complete Profiler Trace

To set up the trace with Profiler you need to add these events before running the query above.

ProfilerTraceSettings

I assume that the reader knows how to configure SQL Server Profiler.

Here is the first part of the Profiler Trace events.

FirstPartProfilerTrace

This is the second part of the Profiler trace, with the events following the last event above.

SecondPartProfilerTrace

Grouping the Trace Events

Let me first try to describe what is going on here with a high level picture.

EvaluationNodesTopView

The VALUE, FORMAT_STRING and LANGUAGE groups are derived from an XML tag (CellProperty) in each evaluation node event, which will be shown later. Each group of evaluation nodes has its own NodeIndex property. The first group starts with NodeIndex 0.

EvalNodesNodeIndex

First the FE analyzes the sets on each axis and builds subcubes that cover these sets. Then an iteration over the cells starts with a request for the current cell’s property value. Each intersection of the axis sets points to a cell. The FE will then find a subcube that was previously built and create an evaluation node for that subcube. This triggers the Init-Build-Prepare events of the evaluation node in the Profiler trace. Data needed by this evaluation node is then requested from the SE, including data needed by other evaluation nodes (look at the Query Subcube Verbose event in the Profiler trace) that were created indirectly by the Build/Prepare phases. Now it is time to run the evaluation node (event 7, RunEvalNode Start, and event 8, RunEvalNode End). This step is optional, because evaluation nodes do not always have to be run (for non-VALUE cell properties or in cell-by-cell mode). Finally, the value for the current cell property is fetched as a result of running the evaluation node (or calculated as a cell-by-cell operation). Then these steps are repeated for the next cell.

If we look at the Profiler trace earlier we can see that the evaluation node that was run was cached and reused by the other cells. The two later groups, with CellProperty FORMAT_STRING and LANGUAGE, were never run, since they do not have the RunEvalNode events as their last step.

This is the first group of evaluation nodes, with CellProperty (VALUE) and NodeIndex (0).

FirstGroupOfEvaluationNodesValue

If I click on the Calculation Evaluation-2 InitEvalNodeEnd of the first group I can see this property:

InitEvalNodeEnd

 

Single Evaluation Nodes: Prefetch Operation

To sum up the discussion about the first group of evaluation nodes, we can say this:

  • The evaluation nodes are initialized and built.
  • The same group of evaluation nodes is then prepared.
  • The evaluation nodes are run.

Actually there is also a less obvious step included in this group: a prefetch operation, where the SE requests for all evaluation nodes that have been prepared are issued to the SE. The prefetch gathers together the subcubes that were detected during the prepare phase of one or more evaluation nodes, not just the last one. There is no event in the trace indicating the prefetch itself. This prefetch operation is an SE request for all prepared evaluation nodes, and that is why the evaluation nodes show intermingled SE queries. After the prefetching is finished, the evaluation node can be run.

This is the second group with CellProperty (FORMAT_STRING) that was not run.

CellPropertyFormatString

There is no RunEvalNode event in this group.

This is the third group, with CellProperty (LANGUAGE), which also was not run.

CellPropertyLanguage

Like the previous group of evaluation node events there is no RunEvalNode event in this group.

Evaluation node tags for the run evaluation node

Here I have two fragments from the same evaluation node in Profiler.
I refer to the table for details about each tag. The most interesting here are:

  • <Exact>1</Exact>: the subcube is an exact match.
  • <LazyEvaluation>0</LazyEvaluation>: the evaluation node was run in bulk (block) mode.
  • <SparseIterator>1</SparseIterator>: the evaluation node did not have to iterate over the entire cube; it iterated over a smaller, sparse space.

The collection of tags I refer to is here.

InterestingTags1

InterestingTags2

Summary

This blog post is an introduction to evaluation nodes in SQL Server Profiler with the scenario of a cube with no calculations. This is why the resulting information from the profiler trace in this scenario is limited. The next post will have a look at evaluation nodes when MDX expressions are involved.

Table: Evaluation Node Tags

EvaluationNodeTag

Explanation

Exact

Exact means that the subcube is an exact match of the cells that are needed by the query. Sometimes the server can build a subcube that is *larger* than the cells needed by the query, and in that case Exact will be false.

Empty

Empty means that the result of this evaluation node is empty; none of the cells have any data. This can happen if you have a subcube that is somehow not valid (e.g. perhaps the subcube has a coordinate that doesn’t auto-exist with other coordinates, or perhaps some other situation where calculations get eliminated until nothing remains and the subspace is guaranteed to have no cells with data).

PlanWasRun

PlanWasRun means the execution plan for this evaluation node has or has not yet been run. It will get run when the Run events show up – or it will *never* be run, because FE ends up fetching cell values in a cell-by-cell mode.

PlanMustBeRun

PlanMustBeRun is too internal to explain. But essentially there are some situations where we build an evaluation node, and it turns out that it matches another evaluation node that has already been executed – in that case, we can just point the evaluation node to the cached result of that earlier evaluation node, and the plan for the new evaluation node does not need to be run, and so the flag will be set to false.

NaiveEvaluation

This is on the evaluation node item. Use the value of the LazyEvaluation tag instead.

Status

PrepareStarted/Built/Uninitialized/RunStarted/RunSucceeded

CalcsAtSameGranularity

This indicates whether the calculations are on the same granularity or not.

SingleCell

Indicates whether the subcube of the evaluation node is for a single cell coordinate. This may translate into cell-by-cell evaluation.

LazyEvaluation

Indicates cell-by-cell mode or bulk evaluation.

SparseIterator

When in block mode, do we have to iterate over the entire (dense) space of the entire subcube or can we iterate over a smaller (sparse) space. Remember that Sparse iteration is good. E.g. if you have a calculation with the expression [Measure1] * 2, and Measure1 is a storage engine measure, then we can execute a storage engine request and iterate just over the cells returned by the Storage engine request – which is a much sparser space than the entire subspace because many of the cell values are null and we don’t need to calculate them at all.

EvaluationItemCount

The number of calculations that apply to the subspace – these can change as we proceed through the stages of Init/Build/Prepare, because some of the calculations will get overlapped/split/eliminated. The final set is what you see at the end of Prepare or the beginning of Run.

Overlaps

Overlaps means whether individual calculations overlap with each other. E.g. a scope on all states and a scope on the state of Washington – the item representing the first calculation has an overlapping calculation.

CoversFullSpace

CoversFullSpace means whether the calculation item covers the full space of the evaluation node, or whether the item has filters (e.g. State = Washington) that make it smaller than the full space of the evaluation node.

HasConditions

HasConditions refers to IF conditions in the calculations.

NoCalculations

0 -> Only Storage Engine Query

CellProperty

As discussed above.

CalculationLocation

This is the location of the calculation. When it is only SE queries this tag contains the measure group. In general it will contain the location and expression that applies to the evaluation node item.

VaryingAttributes

Drives the way the expression is evaluated and makes the expression dependent upon it. The 0s and 1s are bits, and the numbers from 0 to 140 are the bits that are turned on.

Ancient Monument Spotting with Excel 2013 Data Explorer and Power View

When I am not working with BI or learning new technical stuff, I read about archaeology in Scandinavia, and especially about what can be found in the region of Sweden where I live. One idea of mine is to compare different types of archaeological findings from different time periods and see whether they appear in the same geographical spot or not. Can Excel 2013 help me with this? Yes, it is possible with the help of Excel Data Explorer (currently in beta), PowerPivot and Power View.

First I need to show you the geographical spot so that you can relate to the area. I have highlighted one type of ancient monument in the Bing map that also shows the final result after this exercise. It is from Trelleborg municipality in the south of Sweden, since I have found a really good dataset from that part with a list of ancient monuments (Stone Age, Bronze Age and Iron Age). Most of these monuments are graves and burial mounds.

 

So where are we

The dataset can be found here on Wikipedia: http://sv.wikipedia.org/wiki/Lista_%C3%B6ver_fasta_fornminnen_i_Trelleborgs_kommun

Below you can see what this wiki page looks like. Name is the unique registered number for each ancient monument. “Lämningstyp” is the classification, like rune stones or burial mounds. “Historisk indelning” is the actual geographic location of the monument. “Läge” is the latitude and longitude of the monument and the key to mapping this information to the Bing map in Power View.

The Data source

I started by launching Excel 2013 with the Data Explorer preview and selected From Web as the data source. I simply added the URL above and Data Explorer started the import. As you can see, I got several datasets for different sub-areas in Trelleborg. Here is an issue I have not solved yet: whether I can import all the datasets with one query by setting a parameter. With my approach I am at least able to show the other steps in the cleaning and consolidation process with Data Explorer.

 

First Query

The cleaning process is straightforward. I have right-clicked the columns I do not need and selected Hide. I have also changed the data type of the Id-nr column, a not very important column, to text with the same right-click quick menu. Finally I have added an index column, for no other reason than being able to run a count aggregation on that column. The final query result looks like this, with the steps shown to the right.

cleaning steps

I have created one query for each different sub area within Trelleborg. With all queries defined I have used the Append button to add the sub areas to the same result set.

 

The appending process

This append process is not the best way to run a union over all the queries. It is possible to do this much more quickly by changing the append query slightly: you simply add the query names to the function, as you can see below.

 

Extend the query

 

With this work done it is time to load the data into PowerPivot. In Excel 2013 this is done transparently and since I have only one single table there is no question about joins.

 

LoadToDataModel

After that I started Power View in Excel 2013 (Insert) and the data model appears. I selected maps as the graphical tool in Power View. The selections for my map are the following: I need a measure in the Size field to see anything, so I simply run a count on Lämningstyp (the classification of the monument), and in Location I put the latitude and longitude data fields. I added Lämningstyp a second time for the color of the circles in Power View.

 

Selections for the map

The result looks like this with two “Lämningstyp” selected in a filter. Gravfält is a burial ground and Hög is a burial mound, usually from the Bronze Age. Not all sub-areas are loaded into the model, so there are more monuments than you see in the map.

Resulting Map

Below you also see the filter section.

The filter section

 

So what is the point of doing this? There is an existing database, together with maps, of all ancient monuments in Sweden, but without the flexibility of the Excel 2013 solution I have shown here. The fact that I can add colors to different groups or classifications of monuments in Power View is very useful. It is possible to see different clusters of classifications in the same geographical spot or nearby. I can also quickly select and unselect categories of monuments.

Multiselecting Dates in Excel 2013 on top of SSAS/BISM Multidimensional

It is not uncommon to compare BI clients on a very detailed feature level in discussions with customers, but this blog post is about a question in the Analysis Services forum that I posted an answer to. I have written a longer blog post on the SolidQ blog about the most important new features in Excel 2013 for multidimensional analysis, and this post adds a feature that was not on my mind when I wrote that one:

http://blogs.solidq.com/Business-Intelligence-SQL-Server-pa-svenska/Post.aspx?ID=15&title=Excel+2013+and+the+new+Multidimensional+Improvements

You can work with sets of dates in Excel 2013 if you use the GUI approach. First you can use the old Pivot Table report filter approach like below where you have a filter above the Pivot Table.

Add Dates as Report Filter

If you would like to use a date interval you activate “Select Multiple Items” in the left corner, and then you are left to click, click and click for all the relevant dates.

Report Filter Select Multiple Items

This might involve a lot of clicking, especially if you would like intervals of 20-30 dates.

With the Excel slicers you have a much more efficient way of marking ranges of dates. I assume that you know how to add slicers for a natural hierarchy in Excel. Below I have selected the first five days in July 2005 by clicking the first date in the interval and then Shift+clicking the last date in the interval.

Use shift in slicers to mark date intervals

In Excel 2013 there is a third approach, with timelines. That option is placed next to Insert Slicer on the ribbon under PivotTable Tools.

Insert TimeLine Ribbon

You are presented with the option to select the date hierarchy.

DateHierarchyChoicesTimeLine

I have Swedish settings, and År means Year. You can select levels in the upper right corner to select years and then the months of interest for quick navigation. I am on my way to the first five days in July in the picture below.

TimeLineSelection2005

When I have navigated to the date level of July 2005 I simply drag the cursor over the dates that I am interested in and the Internet Order Quantity sums up in the same way as with the slicer approach.

DragTheCursorOnTheTimelIne

It is always important to know about these less apparent features in the clients that will make daily analytic work easier for our customers.

2012 in review

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 39,000 views in 2012. If each view were a film, this blog would power 9 Film Festivals

Click here to see the complete report.

ISOWeeks in SSAS Revisited

I wrote a blog post a few years ago about ISO weeks in SSAS, and now it is time for an update. Since SQL Server 2008 we have a TSQL DATEPART() argument that removes the need to build TSQL user-defined functions to add this information to a date dimension table. You will see this in the code samples, but have a look at the DATEPART() TSQL function in Books Online for more information.

ISO weeks seem to be the fashion here in the Scandinavian countries. The main idea is that a year can contain 52 or 53 weeks, depending on which weekday the year starts on. All ISO weeks have seven days, and the main problem for SSAS or BISM Multidimensional is that ISO weeks cross over calendar years, which means a many-to-many relation between the ISO week and the calendar year if you try to add calendar year as a hierarchy level above the ISO week. The question is whether you should add a year level above the ISO week or not. If you decide to add one, it must be a business-defined ISO year level and not a calendar year. In this blog post we assume that such a level is needed, and the business rule is that weeks 52 and 53 should always belong to the previous year and week 1 to the new year. An outcome of this decision is that ISO years and calendar years will not sum up in the same way: totals over all years will be the same, but not the yearly distributions.

Enough said; now it is time to turn this ISO week requirement into code and a working solution. My code is a theoretical example and has no real-world background except for the date dimension. It is simplified to check that the numbers are summed correctly and includes one order for each day that is part of the date dimension.

I will start with the code for the date dimension below. I am using a CTE (TSQL common table expression) with a start date (2009-01-01) and an end date at the last date of the current year (where Year(DateValue) <= Year(GetDate())). You can also see the DATEPART function with the ISO_WEEK argument. I am using a SELECT INTO TSQL statement to create a new table from the CTE. The last column (1 as IsoYearIsoWeek) was added to create an integer column that we will use in a second step. The column created with the TSQL CASE statement implements the business rule mentioned earlier to create the IsoYear classification. I thank my colleague Eva Eriksson for her kind help.

 

DateDimensionCode
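Since the code above is only shown as a picture, here is a minimal sketch of what such a statement could look like. It is my own reconstruction, not the exact code in the screenshot; the table and ISO column names are taken from the text, while CalendarYear and MonthInt are assumptions added for the calendar hierarchy:

WITH Dates AS
(
    SELECT CAST('2009-01-01' AS date) AS DateValue
    UNION ALL
    SELECT DATEADD(day, 1, DateValue)
    FROM Dates
    WHERE YEAR(DATEADD(day, 1, DateValue)) <= YEAR(GETDATE())
)
SELECT
    DateValue,
    YEAR(DateValue)               AS CalendarYear,
    MONTH(DateValue)              AS MonthInt,
    DATEPART(ISO_WEEK, DateValue) AS IsoWeekInt,
    -- Business rule: weeks 52 and 53 belong to the previous year, week 1 to the new year
    CASE
        WHEN DATEPART(ISO_WEEK, DateValue) >= 52 AND MONTH(DateValue) = 1  THEN YEAR(DateValue) - 1
        WHEN DATEPART(ISO_WEEK, DateValue) = 1  AND MONTH(DateValue) = 12 THEN YEAR(DateValue) + 1
        ELSE YEAR(DateValue)
    END                           AS IsoYear,
    1                             AS IsoYearIsoWeek   -- placeholder, filled in the next step
INTO dbo.TestDateDim
FROM Dates
OPTION (MAXRECURSION 0);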

 

In a second step I use this TSQL statement to update the IsoYearIsoWeek column:

Update dbo.TestDateDim
Set IsoYearIsoWeek = IsoYear * 100 + IsoWeekInt

The result will look like this in Management Studio.

DateDimensionTable

 

Finally it is time to create the dummy fact table with one order for each date in the date dimension. This is also done with a CTE, a part of SQL Server since the 2005 version.

 

FactTable
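Again, the actual code is in the screenshot; a minimal sketch of the idea, reusing the table names from the date dimension sketch above, could look like this (one order with quantity 1 per date):

WITH Orders AS
(
    SELECT DateValue FROM dbo.TestDateDim
)
SELECT
    ROW_NUMBER() OVER (ORDER BY DateValue) AS OrderId,
    DateValue                              AS OrderDate,
    1                                      AS OrderQty
INTO dbo.TestFactOrders
FROM Orders;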

It is time to build the date dimension and the cube in SQL Server 2012 Data Tools, previously named BIDS. I assume that you know how to do this, so I will only cover the settings. I have added all attributes to the date dimension. In the wizard you should set the Type properties for the different date attributes that will take part in MDX time calculations. Some of the attribute types are kept as Regular because they are not important for this example.

AttributeTypesDateDim

I have created the hierarchies below in the date dimension.

 

DateHierarchies

I have also created the attribute relations for the hierarchies like this.

 

Attribute relations

Finally it is time to add the fact table. In the cube structure tab it looks like this.

 

FactTable2

And in the dimension usage tab it looks like this.

Fact Table

 

Process the cube and start Excel to have a look at it. I am using Excel 2013. I have created two pivot tables from the same connection. The pivot table to the left uses the IsoWeek hierarchy and the one to the right the calendar hierarchy. We have the same totals, but the distributions differ. If you expand the IsoWeek hierarchy for IsoYear 2012, all weeks should have 7 as the sum of OrderQty, as you can see in the second picture.

 

TheTwoHierrachies

 

IsoWeekExcel
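If you want to double-check the same numbers outside the cube, a quick relational sanity check against the sketched tables above could look like this (every full ISO week should sum to 7):

SELECT d.IsoYear, d.IsoWeekInt, SUM(f.OrderQty) AS OrderQty
FROM dbo.TestFactOrders AS f
JOIN dbo.TestDateDim    AS d ON f.OrderDate = d.DateValue
GROUP BY d.IsoYear, d.IsoWeekInt
ORDER BY d.IsoYear, d.IsoWeekInt;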

In a future post I will cover a many-to-many requirement in a date dimension with weeks.

Excel 2010 and Excel 2013 Slicer Settings

Here is an interesting new piece of functionality in Excel 2013 regarding slicers with BISM Multidimensional. One of my colleagues asked for this functionality in Excel 2010 when he built reports with slicers for publishing in SharePoint 2010 Excel Services. The slicers should not show members without data, and this is obviously not the case in Excel 2010, as you can see in the picture below. There is no such choice in the slicer settings (right-click on the slicer).

OldExcel2010SlicerSettings

In Excel 2013 this was changed for the better, and I did not figure this out until today when I remembered my colleague’s wish.

 

Excel 2013SlicerSettings

There is a new setting, which I have underlined: Hide items with no data. This setting works, as you can see if you compare the slicers in Excel 2013 to the ones in Excel 2010 (first picture). The members without data, light blue, are gone.

Estimated File Size in BISM Multidimensional (SQL Server 2012)

One property that has changed in the latest release of SSAS, now called BISM Multidimensional, is the Estimated Size of the cube database in Management Studio. This property no longer tells the whole story about the cube database size. 

Start Management Studio and connect to your SSAS service. Expand the databases folder and right-click –> Properties on a cube database of your choice.

MyCubeDatabases

I will right-click on the Adventure Works DW 2008R2 database and select Properties.

MMS Estimate

In the Estimated Size property above you can see 20.66 MB. This might look good since the source database file size is 141 MB. That implies a size down to around 14 percent of the original database size.

The problem is that this estimate is not correct for the total cube database size, as you can see in the next picture.

FileFolderSize

The cube database resides in C:\Program Files\Microsoft SQL Server\MSAS10_50.MSSQLSERVER\OLAP\Data on my hard drive. There is one folder for each cube database.

The dialog is in Swedish (storlek på disk = size on disk) and says that the cube database is 89.5 MB on my hard drive. That is 63 percent of the original database file size. This is actually the true story.

Be careful with the estimated file size in Management Studio in SQL Server 2012.

Edit: Update (2012-11-11). With the kind assistance of Akshai Mirchandani from the Microsoft team, I have received some clarifications on things that were not clear to me when I wrote the post.

  • This estimate was changed already in SQL Server 2008 R2.
  • The reason for the change was that the time to open Management Studio and get an update of the total file size was too long.
  • It is an estimate of the compressed total file size.