Want to know The Truth About CPM?

22 August 2016

Stupid Programming Tricks No 29, part 2 -- Dynamic Load Rule columns

Where we were, or where are we, or most importantly, where am I?

I have no idea as to the last point but then again I never do.  Ever.

Moving on, in the first part of this exciting (surely not but at least useful) series, I related how to stack dimensions in a single column.

Alas and alack, My Man In California, Glenn Schwartzberg, pointed out in the comments to that post that he had already covered this at two different Kscopes.  Oh the shame, but as a soon-to-be-ex-board member the number of sessions I get to attend is severely limited.  Sorry, Glenn.  I had to figure it out on my own.  I never do things the easy way.  Ever.  Again.  Bugger.

The Network54 use case I addressed was primarily a need to both stack dimensionality as well as selectively address more or fewer columns of data depending on data scope.

This is quite easily done in a Load Rule, indeed it’s possible in both a SQL as well as a text Load Rule, and it all centers around how a Load Rule reads its initial record.  The Stupid Trick here is that if ten data columns are defined at design time and only five are passed at load time, the Load Rule ignores the last five.

One might think that this would be best accomplished by changing the SQL query within the Load Rule, but by doing that one would edit the Load Rule itself, this would be a design-time change, and the number of columns would be modified.  I’ll also mention that Load Rules are tricky little buggers that just beg to go FOOM! so I’m loath to modify them.

Instead, a SQL view changes the scope of the columns passed to the Load Rule’s SELECT * FROM viewname (well, you have to skip the “SELECT” because that’s the action the rule performs) and ta-da, the Load Rule now selects fewer (or even more, with an important caveat) data columns.

That caveat

This more-or-fewer-columns Load Rule behavior is predicated on the columns that are defined within the SQL view.

I take the point, perhaps even before you’ve raised it, that modifying the SQL is a design-time change.  But with these requirements something and somehow is going to change.  Load Rule or ALTER VIEW?  One man’s meat is another man’s poison so it’s time to pick yours.

What kind of poison for the evening, sir?

I’ll have just a wee slice of SQL:
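A view along these lines is what I mean; the view name and the non-stacked column names below are assumptions for illustration, with fStacked being the fact table from part 1:

```sql
-- A sketch, not the original DDL: vStacked and the leading
-- dimension columns are assumed names.  The stacked headers
-- live in the column names themselves, bracketed SQL Server-style.
CREATE VIEW vStacked AS
SELECT
    [Product],
    [Market],
    ["Actual","Sales"],
    ["Actual","COGS"],
    ["Budget","Sales"],
    ["Budget","COGS"]
FROM fStacked;
```

The Load Rule’s data source is then simply vStacked – the rule performs the SELECT * FROM for you.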
Pulling this in a SQL Load Rule looks like this:

And this:

So no different than the original fStacked table which is in turn no surprise given that the fields are the same in the view as they are in the table.

Let’s cut that view down to just Actual columns:
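A hedged sketch of what that might look like, assuming a view named vStacked over an fStacked fact table with assumed dimension columns:

```sql
-- A sketch with assumed names: dropping the Budget columns
-- from the view means the Load Rule now sees only Actual.
ALTER VIEW vStacked AS
SELECT
    [Product],
    [Market],
    ["Actual","Sales"],
    ["Actual","COGS"]
FROM fStacked;
```

The rule itself is untouched; its trailing Budget column definitions simply go unused because the view no longer feeds them.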

Et voilà!  Dynamic Load Rules.


Change the SQL in the view, no change to the rule, change the data columns and thus the scope, all with no editing of The Devil’s Own.  Gee whiz, could it be that Load Rules aren’t the spawn of Old Scratch?  Maybe.

Would I do it this way?

Let’s review what this approach does:
  • It stacks dimensions in a column, i.e. it allows more than one dimension to be defined for each column of fact data.  That’s not a condition of dynamic Load Rules but instead is a requirement of that post way back when in part 1.
  • It shows that removing or adding columns in a data source makes the Load Rule display more or fewer columns.

The caveat to the above approach is that the definition of the Load Rule’s columns must happen before the data change and the maximum number of possible columns needs to be defined up front.  

If this last bit seems odd, think what happens when you load a poorly defined text file, such as when you’re told, “There are 13 columns of data,” but in fact there are really 14 columns 2,312 records down, although not in the first 2,311 rows.  Whoops, someone forgot to mention that, and because the Load Rule defines columns based on its initial 50 row read (yes, you can change this and even the starting position but you’d have to know the exact row to go to), Essbase is going to throw a rod because it doesn’t know how to handle data column 14.  The damnable thing of it is that if the Load Rule can’t see the column, the Essbase geek can’t add it.

The “fix” is to create a one record file that has that 14th column, tag the column as Ignore During Data Load, and for the 2,311 preceding rows it’s as if that column doesn’t exist (remember, there is no tab or comma delimiter at the end of those 13 fact field records) until record 2,312.  This is the same concept as the “Budget”,”Sales” columns being defined when the data exists and then being dropped when the data source no longer contains said columns.

Whew.

So what do I do?  Benchmark.  I’d benchmark this approach, particularly the full Actual and Budget load example, vs. two separate Load Rule streams, one for Actual and the other for Budget.  And as this is an ASO cube, I’d kick that latter approach off in multiple MaxL statements writing the contents to a single buffer.  Is the performance issue tied to how fast Essbase can commit a temporary tablespace to a permanent one, or is it how big the pipe from the fact source is?  Dunno till tested.

If a two stream approach worked, there’d need to be some kind of conditional branching in the batch script to decide which or both to load.  

Whew, again.  No one said that system requirements are easy.  That’s why we’re (hopefully) paid the big (even more hopefully) bucks (insert currency of your choice).

Be seeing you.

03 August 2016

Stupid Programming Tricks No. 29 -- Dynamic Load Rule columns

Preface

This was originally going to be one ginormous post on Stupid Essbase Load Rule Tricks but once I got to page 12 and realized I was about halfway through I decided to split this subject into multiple posts.  At the end of it all I’ll put in links to bring everything into a single spot.

With the warning, off we go.

Hate is such an ugly emotion

It is moderately well known that Yr. Obt. Svt. hates Load Rules.  Why?

I hate them because they have an interface that is little improved from the days of Essbase Application Manager.  If you want a feel for what that looked like, have a read through the DBAG which (un)surprisingly hasn’t bothered changing the screen shots of what is, after all, fundamentally unchanged since 1993.

I hate them because they are a temptation to those without other data integration tools to manipulate data.  I really and truly have seen instances of over 240 Replace selections in a single rule.  How does one audit that?  Understand it?  Manage it?  Know that it’s actually performing per requirements?  The short answer is notgonnahappen.  Lest I be accused of casting stones at those less fortunate (eh, those of you who have had the (dis)pleasure of meeting me may think that’s a pretty low bar), that’s what the admin had at hand, that’s what he used.  But It Just Isn’t Right when a SQL INNER JOIN would have done the same thing and in a much better way.  But I digress.

And lastly, I hate them because I have comprehensively, completely, and totally shot myself in the foot, fired to slide lock, reloaded from my spare magazine, and repeated ad infinitum.  Seriously, it’s easy to do this even when one is being careful and, Best and Brightest, I’ll bet you’ve done it more than once.  Ugh.

Another Load Rule rant over.  As always, it feels so good to vent my Load Rule spleen.

A plea for healing

And yet we need Load Rules.  The latest and greatest version of Essbase, Essbase Cloud Service (EssCS), that Tim German and I presented at Kscope16, used – Wait for it! – Load Rules, albeit in much improved fashion.  The need to get metadata and data into Essbase (and Planning) remains.  Until that golden day when all of us Essbase geeks are using EssCS, or at least enjoying the functionality it provides in on-premises Essbase, we are well and truly stuck with them.  An even better alternative would be the INSERT INTO…SELECT FROM data and metadata nirvana.  A man can hope.

And now a use case

And actually, they can be at least bearable if we only didn’t have to suffer through the interface and kludginess that is a Load Rule.  How can this be done?  Quite simply really – SQL is the answer.
My Very Favorite Essbase Message Board In The Whole Wide World (MVFEMBITWWW)
Over on Network54, there was a thread where the original poster (OP) wanted to dynamically change the number of columns he’d read from a table via a single Load Rule.  Huh?   And then there was the assertion that both overloading (or stacking) a single data column and dynamic Load Rules aren’t possible.  Double huh?  That just isn’t so, or at least I don’t think that’s so.

It seems odd on its face, but his requirement is to sometimes load history across multiple years or a single year and/or sometimes load just the latest period.  It would certainly be possible to stick the year and the period in the column but this is a pretty big data set, at least in Essbase-land – potentially up to 3 BILLION rows.  Eeek.

A far more efficient approach would be to stick the periods in the columns.  What is likely to be faster when loading three years of data: this format at ~83 million rows (3,000,000,000 / 12 periods / 3 years) or 3,000,000,000 rows (the whole kit and caboodle) regardless of the number of columns?  Exactly.

Putting aside efficiency although I’m not clear on why one would want to do that, this approach would require multiple Load Rules – at least one for just a single period and one for multiple periods – and batch processes and then a selective execution of those Load Rules.  While this isn’t an impossible task, it’s certainly annoying.

Is there another way?  You betcha.

Stacking dimensions

The first step is figuring out how to put years into the columns instead of into the rows.  Using Sample.Basic and (gasp) a text Load Rule – and yes, this is possible in SQL as you will see a bit later on – let’s see how this might be done.

Typical and atypical

Typically each column in a Load Rule is a single dimension, as shown below.  Easy peasey no big deasy.  When I build Load Rules, and when I look at what others create, this one-to-one relationship is standard.


That may be the default but it isn’t the only way to peel an egg.  You can stack or overload each column with multiple dimensions.  By that I mean a column that refers to measure Sales can also be directly tied to the scenario Actual while the next column can be Budget and Sales.

This can be built in a Load Rule by selecting two or more dimensions:

Note well the double quotes around the two member names.  These are important because they act as delimiters.  If not used, Essbase thinks that the string is a single member name of Actual,Sales.  What’s desired are two dimension assignments in one column.

When selected via double clicking on the member names, the result is a data column definition that looks like this:  "Actual","Sales"

Unpossible!

That’s all well and good but as you might imagine I prefer to have the data describe itself.  To do that I can create a data Load Rule that reads the first record as the header:

Without data, the Load Rule looks a bit empty:

By placing the header record as the first record, and using the above Data Source Property header row definition, the Load Rule magically has the stacked dimension definitions by column.  

NB – Essbase automatically trims double quotes so using an escape character is a requirement as shown below:

Maybe an easier way to view it is in Excel:

With that, I have a Load Rule that is Actual in columns 5 through 12 and Budget in columns 13 through 20.

For the record, escaping double quotes is performed using the backslash symbol as follows:
\"Actual\",\"Sales\"

Here’s what it looks like in a Load Rule column in EAS looking just as if you’d’ve selected it by hand.

Does it work?  Yep.

As always, the proof is in the delicious pudding.  I like South African Malva Pudding but to each his own.

Dynamic SQL your way

It’s easier in a relational source.  Yes, that’s a drum I beat again and again simply because it’s true.

I can create a simple fact table with the stacked headers I need in the column names:
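Something along these lines, hedged as ever – this is a reconstruction with assumed dimension columns and data types, not the original DDL:

```sql
-- A sketch of the fact table: the bracketed column names carry
-- the stacked "Scenario","Measure" headers with no escaping needed.
CREATE TABLE fStacked
(
    [Product]            varchar(80),
    [Market]             varchar(80),
    ["Actual","Sales"]   numeric(18,2),
    ["Budget","Sales"]   numeric(18,2)
);
```

A SELECT * FROM fStacked then hands the Load Rule both the data and the stacked column definitions in one go.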

Note how SQL Server (one might think that I’d switch to Oracle sooner or later) sticks [ and ] around the column names containing the special character double quotes.  There is no need to escape those characters.  Why escaping is required in one place and not in another is the Sweet Mystery of Life.

A simple query returns the values:

Ta da:

And a load:

Good grief that was easy.  

Three cool things thus far

We now know:
  1. Data columns can refer to more than one dimension on a column-by-column basis.
  2. Either text files or relational sources can pass that multiple dimensionality either in a header record with escaped double quotes or via a column name in a fact table.
  3. Never say never again when it comes to Essbase.

The next Stupid Trick in this series will show how to selectively turn columns on and off in a Load Rule.  

Be seeing you.