Monday 22 December 2008

SQL 2005 SP3 has landed

SP3 for SQL 2005 is now available for download.

Included are a number of bug fixes, performance enhancements for SSRS running in SharePoint integrated mode, improved rendering of PDF fonts in SSRS, support for SQL 2008 databases with SQL 2005 Notification Services, and support for Teradata databases as a data source for Report Models.

This service pack is cumulative, so it can be applied to any existing build of SQL 2005 (RTM, SP1 or SP2).
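If you want to check whether a given instance has actually picked up the service pack, SERVERPROPERTY will tell you. A minimal sketch in Python using pyodbc - the server name and trusted connection are placeholders, so adjust to suit:

```python
# Minimal sketch: check the patch level of a SQL 2005 instance via pyodbc.
# Server name and trusted connection are assumptions - substitute your own.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=MYSERVER;Trusted_Connection=yes"
)
row = conn.cursor().execute(
    "SELECT SERVERPROPERTY('ProductLevel'), SERVERPROPERTY('ProductVersion')"
).fetchone()

# Once SP3 is applied, ProductLevel should report 'SP3'
# (ProductVersion will report a build in the 9.00.4xxx range).
print(f"Patch level: {row[0]}, build: {row[1]}")
conn.close()
```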

Wednesday 17 December 2008

PPS-P: Where, oh where, does my data go?

A user enters a value into a PerformancePoint Planning form... so what happens next? And why are we the least bit interested?

Firstly, to answer the "why do we care" question, ask yourself this...

  1. When do your extracts from PPS run (presuming you run them more regularly than, say, once overnight)?

  2. Do users (e.g. approvers) expect to see updates immediately - particularly at certain times of the cycle, such as month end?

  3. What data do reviewers, approvers and report consumers expect to see? Approved? Currency Converted?

So we do care after all. Now, on to what actually happens. In fact, there are a few things that can happen; let's examine each separately.

(1) The user selects the Save Privately action

  • The data is stored on the local file system - known as the Offline Cache. This location is usually C:\Documents and Settings\%username%\Local Settings\Application Data\Microsoft\PerformancePoint\OfflineCache
  • In the Offline Cache, the Catalog.xml file stores a list of all assignments stored locally
  • In the Offline Cache, each assignment is stored in a .dat file - an XML file that wraps binary content (a quick way to inspect the cache is sketched after this list)
  • Data is NOT sent to the SQL server or the cube
  • The status of the assignment is still 'started'
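If you're curious what Save Privately actually leaves behind, you can poke around the Offline Cache yourself. Here's a rough sketch in Python - it uses the XP-style cache path from the list above and makes no assumptions about the internal structure of Catalog.xml beyond it being well-formed XML:

```python
# Sketch: list what the PerformancePoint Offline Cache holds after a "Save Privately".
# Path is the XP-style location mentioned above; on Vista-era clients it will differ.
import os
import xml.etree.ElementTree as ET

cache_dir = os.path.expandvars(
    r"C:\Documents and Settings\%USERNAME%\Local Settings"
    r"\Application Data\Microsoft\PerformancePoint\OfflineCache"
)

# Catalog.xml lists the assignments that have been saved locally.
catalog = os.path.join(cache_dir, "Catalog.xml")
if os.path.exists(catalog):
    root = ET.parse(catalog).getroot()
    print(f"Catalog root element: {root.tag}, {len(list(root))} child entries")

# Each locally saved assignment is held in a .dat file (XML wrapping binary payloads).
for name in os.listdir(cache_dir):
    if name.lower().endswith(".dat"):
        size_kb = os.path.getsize(os.path.join(cache_dir, name)) / 1024
        print(f"{name}: {size_kb:.0f} KB")
```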
(2) The user selects the Submit Draft action
  • The data is stored in the SQL table called MG_ModelName_Default_Partition
  • The resulting values from Definition Rules are also stored in the MG table; however, they are linked to the 'No Assignment' assignment
  • Once the pre-configured polling period has passed, the cube called ApplicationLabel_ModelSiteLabel will be refreshed
  • The status of the assignment is 'Partial'
(3) The user selects the Submit Final action.
  • The data is stored in the SQL table called MG_ModelName_Default_Partition
  • The resulting values from Definition Rules are also stored in the MG table; however, they are linked to the 'No Assignment' assignment
  • Once the pre-configured polling period has passed, the cube called ApplicationLabel_ModelSiteLabel will be refreshed
  • The status of the assignment is 'Submitted' (a quick way to eyeball what has landed in the MG table is sketched below)
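If you want to see those submitted values sitting in the relational store before the next cube refresh, you can query the measure group table directly. The sketch below assumes pyodbc, a placeholder application database (Alpine_AppDB) and a model called Budget, so the table follows the MG_Budget_Default_Partition pattern described above. The column list is model-specific, so no column names are assumed here:

```python
# Sketch: peek at what Submit Draft / Submit Final wrote to the model's fact table.
# Server, database and model names are placeholders - substitute your own.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=MYSERVER;DATABASE=Alpine_AppDB;Trusted_Connection=yes"
)
cursor = conn.cursor()

# A row count gives a quick sanity check that the submission actually landed.
cursor.execute("SELECT COUNT(*) FROM dbo.MG_Budget_Default_Partition")
print("Rows in measure group table:", cursor.fetchone()[0])

# Pull a handful of rows to eyeball; the columns vary by model,
# so nothing is assumed about them here.
cursor.execute("SELECT TOP 5 * FROM dbo.MG_Budget_Default_Partition")
for row in cursor.fetchall():
    print(row)
conn.close()
```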

(4) A data movement job occurs

  • The data is copied from the source model's MG_ModelName_Default_Partition table to the destination model's MG_ModelName_Default_Partition table (the names follow the same pattern, but they belong to different models)
  • Data is aggregated when the scope of a dimension (in the data association) has been set to "All" on the source model (see the toy example after this list)
  • All data is copied, regardless of status: Partial, Submitted and Approved
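The "scope set to All" behaviour is easier to picture with a toy example. This is purely conceptual - plain Python, nothing to do with the actual data movement implementation - but it shows the effect: collapsing a dimension to "All" means the remaining dimensions are grouped and the values summed.

```python
# Conceptual illustration of data movement aggregation (not the real implementation):
# when a dimension's scope is "All", rows are summed across that dimension.
from collections import defaultdict

source_rows = [
    # (entity, department, month, value)
    ("Sydney",    "Sales",     "2008-11", 100.0),
    ("Sydney",    "Marketing", "2008-11",  40.0),
    ("Melbourne", "Sales",     "2008-11",  75.0),
]

# Suppose the data association sets Department's scope to "All":
# aggregate across Department, keeping Entity and Month.
aggregated = defaultdict(float)
for entity, department, month, value in source_rows:
    aggregated[(entity, month)] += value

for (entity, month), value in sorted(aggregated.items()):
    print(entity, month, value)
# The two Sydney rows collapse into a single row of 140.0 in the destination model.
```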
Then of course the question is, how long does all this take? Sadly, there's no simple answer to that question, as so aptly discussed by Peter Ebley. To paraphrase Peter, there are a few steps that occur:
  1. Send it to the server (Status = Pending)
    Depends on the size of the submission (How many rows? Is the workbook attached? How many dimensions?) and network bandwidth.

  2. Process the Queue (Status = WaitProcess)
    Depends on the Planning Process Poll Interval (the recommendation is 1 minute), how many items are in the queue and, of course, how long it takes to validate the submission.

  3. Cube Processes
    Depends on the OLAP Cube Refresh Interval property in the Workflow tab of PAC (a back-of-the-envelope worst case is sketched below).
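Putting rough numbers against Peter's three steps at least gives a worst-case feel for "how long until my number shows up in the cube?". A back-of-the-envelope sketch - every interval here is illustrative, not gospel:

```python
# Back-of-the-envelope worst case for submission-to-cube latency.
# All values are illustrative; plug in your own environment's settings.

upload_seconds = 30                  # step 1: send to server (rows, attached workbook, bandwidth)
poll_interval_seconds = 60           # step 2: Planning Process Poll Interval (recommended 1 minute)
validation_seconds = 20              # step 2: time to validate the submission once dequeued
cube_refresh_interval_seconds = 300  # step 3: OLAP Cube Refresh Interval (Workflow tab of PAC)

# Worst case: the submission just misses both the queue poll and the cube refresh.
worst_case = (
    upload_seconds
    + poll_interval_seconds + validation_seconds
    + cube_refresh_interval_seconds
)
print(f"Worst case: ~{worst_case / 60:.1f} minutes before the value is visible in the cube")
```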

Weighing in on Project Gemini

My reaction to Project Gemini to date has been lukewarm. Sure, self-service cubes in Excel sound sexy and look sexy... but... we don't let our users have free rein with data for a reason. These controls didn't just happen because we want to keep ourselves in a job (I hope!).

I'm gonna take a stab and suggest that every BI pro has walked into a job and found an environment where there are 2000 reports on a single system and some of them consume all the machine's resources and take 10 minutes to execute. That sort of malarkey. So we did the hard slog, got the BAs' backs up and whipped everyone back into line. The users were happier because the reports executed quickly and were easy to find, you were happier because there were far fewer reports to maintain, the server team was happier because the box didn't have to be rebooted every day, and the BAs were happier because everyone else was happy (one hopes!).

This is why we put controls in place.

So. Doesn't this leave Project Gemini dead in the water in the real world?

Well, I read an interesting perspective from Patrick Husting which made me re-think my thoughts (no pun intended).

"People are always going to do self service reporting in some way whether we like it or not, no matter how good the underlying data is, so why not do it in a controlled manner where everything is audited and logged and the IT team has full visibility of what is going"

OK, so to quote a really crappy movie I saw recently: "just because it is, doesn't mean it should be". But I think he might actually be right. To a point.

I really do hate to be a naysayer. And I'm not just saying that. But, I have a big BUT (please no jokes!).

Where is all this magical data that is being analysed coming from in the first place? From systems. Well, guess what: we don't let users have holus-bolus access to our databases, precisely to avoid the situations caused by our predecessors.

Actually that's not my only issue... users do NOT have a big picture view of the world. They don't care what other parts of the business are doing and they are not interested in how data aggregates upwards beyond their little fish pond. They are not interested in whether there is a single source of truth or if their version of the truth is different to someone else's. They are not interested in network bandwidth or performance at other sites.

Nor should they care about this stuff, that's not their job. It's ours.

It's all very well to say "everything is audited and logged", but aren't we supposed to operate preventatively, not reactively?

So I have to ask, isn't that what something like ProClarity is for? To allow users to slice and dice data however they choose while ensuring that the structures and models have been put in place to consider the big picture?

You know what, I think I've just talked myself back to my original stance. It's sexy and I like it but I can't envision practical applications in the real world yet.

That said, I'm always happy to be proven wrong.