I will be available on 31 May, 2014. Essbase forever! Or something like that. Contact me on LinkedIn if you're interested.

18 August 2014

Stupid Programming Tricks No. 19 -- TRUNCATEing DATAEXPORT

An interesting statistic

The code I am about to show you, and the solution that I came up with, took me over 15 hours of research, cursing, testing, and finally triumph.  Multiple cups of coffee were also consumed as were a few cups of proper tea.  Why do you care about what someone (such as yr. obt. svt.)  does for free or how he imbibes caffeine?  Perhaps for the amusement factor, as the solution that I came up with consists of eleven lines of code.  Eleven lines of code in 15 hours, for those of you mathematically challenged, is 0.73 lines of code per hour, or 5.5 lines of code per calendar day.  Ouch.  But that is the price for:  having a “good” idea, venturing into technology areas one knows very little about, and not giving up.

I tend to agree (if only to salve my ego) with the first point by David Veksler of the Mises Economic Blog (What, you’re not all Hayek fans?).  I quote:
A programmer spends about 10-20% of his time writing code, and most programmers write about 10-12 lines of code per day that goes into the final product, regardless of their skill level. Good programmers spend much of the other 90% thinking, researching, and experimenting to find the best design.

Does that put me in the category of a good programmer?  It’s hard to say – I do spend an awful lot of time thinking, “What am I trying to do and conceptually, how would I do it?” and then later, much later, I spend even more time trying to figure out how I turn that concept into reality.  So is that being a good programmer or just someone flailing about in areas that he doesn’t know much about?  You decide.

Why oh why oh why am I prattling on about this?

There was a thread the other day over on Network54 that discussed how to use the headers that come from a BSO DATAEXPORT statement as they are, without a doubt, pants.  I suggested that SQL output and subsequent manipulation (I should note that Adam_M went and did it in a text file but no matter, this was the inspiration and based on what he wanted to do, I think SQL might have been a better fit for transforming the data – Read The Whole Thing and make up your own mind) was the answer to Adam_M’s issue, but when I did so I had in the back of my mind the nagging reminder that, “Yeah, he’ll write it to relational, but then he’ll do it two or three or more times as he tweaks the extract and will end up with many more records than he ought.  This is a bummer (and erroneous – I would personally rather have it error out) and cannot be resolved by Essbase.”  And that’s right; Essbase will not truncate the table before it writes.  You have to do this manually and that of course means you have some kind of SQL access which in the real world, particularly in production, you will most likely not have.

The goal aka the functional spec

It would be super nice if BSO’s DATAEXPORT calc script function did a TRUNCATE to clear the table, or at least had the option of doing a TRUNCATE, before writing to a SQL target.

Two paths not taken

Just too much of a hack, even for me

One approach would be to use the @CalcMgrExecute function I wrote about (and I still owe part three of that series to you, Gentle Reader), run a MaxL script, in that MaxL script use the shell grammar, run a SQL*Plus (if Oracle) or sqlcmd (if SQL Server) script that does the TRUNCATE and then return back to the calling MaxL script, end it, and do the export as the next step in the original calc script.  At least I am pretty sure that would work but it is just too clunky for words.  I would go down this path if I wasn’t allowed to make the modifications I am about to describe but I would hate myself in the morning.

The way I would like to do it, but cannot

What occurred to me was that a SQL trigger that did a TRUNCATE before the INSERT from DATAEXPORT would do the trick.  Unfortunately, this cannot work because of the way Essbase writes to relational tables.

Would you believe…

That the default for Essbase is to do separate INSERTs for each and every line in the export?  I had heard this from My Man In California (MMIC) aka Glenn Schwartzberg but like oh so many words of wisdom he drops my way (MMIC is a very giving guy and I a forgetful one), I completely forgot.  Why does this inefficient approach to writing to relational matter?

It all goes KABOOM

It matters because the common SQL triggers of BEFORE INSERT TRIGGER (Oracle) and INSTEAD OF INSERT (SQL Server) would fire each and every time Essbase writes to the table.

Here’s what SQL Profiler showed me as the DATAEXPORT code ran:

There are 2,645 records to export out of Sample.Basic and that requires 2,645 INSERT statements.  Eeek.

It might be possible, although inefficient, to use these triggers to do the TRUNCATE and then copy the single record to yet another table, but that is an ugly approach one that I am not sure I have thought through completely as that target would need to be truncated on first write.  So maybe an INSERT trigger is not on.

Another approach

But then I again recalled of one of MMIC’s pearls of wisdom (see, Glenn, I really do listen to you) – there is an Essbase.cfg file setting that can allow batch inserts if the database supports that functionality.

DATAEXPORTENABLEBATCHINSERT

Off I went to the Tech Ref and did my requisite RTM.  Oh goody, thought I, it’s going to work.  Just read the description:  “When DATAEXPORTENABLEBATCHINSERT is set to TRUE, Essbase determines whether the relational database and the ODBC driver permit batch insert.”  But I should have known that when Essbase giveth with one hand, it taketh away with the other, because the next sentence states, “If they do, Essbase uses the batch-insert method, and, thus, performance is optimized.  Essbase determines the batch size; however, you can control the number of rows (from 2 to 1000) that are inserted at one time by using the DEXPSQLROWSIZE configuration setting.”

Oh, bugger, all this statement does is increase the number of records until that INSERT trigger fails.  I know that Sample.Basic has 2,645 rows and that means three INSERTs at the maximum row DEPSQLROWSIZE.  Bugger, again.  What to do?

Is there a way out?  Why yes there is.

INSERT triggers are out, but is there another trappable event that could allow the TRUNCATE to occur?  I looked again at SQL Profiler and saw two interesting facts.

  1. The LoginName column reflects whatever username I defined in the ODBC System DSN I used.
  2. There is a login event, at least as trapped via SQL Profiler.

The simple solution, although figuring out how to do it took, oh, forever, and it isn’t enough

If I could write a trigger that intercepts the act of EssbaseLogin (or whatever SQL username I chose) connecting to SQL Server I could have the login trigger TRUNCATE the table right then and there.  This frees me from the multiple INSERT issue quite handily.  How oh how oh how to do it was the question.

Figuring it out

SQL Server has a trigger database operation keyword called FOR LOGON (the Oracle equivalent is AFTER LOGON).  All I needed to do was to create that trigger, test for the username, do the TRUNCATE, and, because I am a paranoid nut, log: the fact that I did this, when I did it, and that the TRUNCATE succeeded.

A note for all of you SQL n00bs and experts – what you are about to see is the very first SQL trigger yr. obt. svt. ever wrote.  At least that’s the excuse I am sticking to as explanation why it took so long.  SQL n00bs – take heart, if I can do it, so can you; experts – everyone has to start somewhere, so stop the snickering at my plodding progress.

The target database

Here is what QueryTest.SampleBasicExport looks like – a simple table that sticks Product, Market, Measures, and Scenario into the rows and the 12 level zero months of Year as columns.

Approach #1

I stole the genesis for this code somewhere on the web (I meant to keep the link, but forgot to do so).   Here’s what I ended up with:

Line by line

As most of the readers of this blog are Essbase practitioners first, and SQL geeks second (or third, or fourth, etc.), I will explain each line of the code.

  1. The first line defines the trigger EssbaseDataExport – as I changed this about eleventy billion times the code shows ALTER instead of CREATE – and that this is a server-level trigger.
  2. The second line states that this is a LOGON trigger.
  3. Line three’s AS statement defines the trigger.
  4. The BEGIN statement on line four is the beginning block of the trigger logic.
  5. We finally get to the TRUNCATE logic on line five.
  6. Line six’s END marks the end of the trigger.

All I need to do is execute the code and ta-da, I now have the server-level trigger EssbaseDataExport.  You can see EssbaseDataExport in SQL Server Studio that the trigger is now there.

EssbaseDataExport  in action

Here’s what my calc script looks like:
Note that the SQL username EssbaseLogin is used.  At this point this isn’t important but it will be in just a bit.

Let’s do a count of the rows in QueryTest.SampleBasicExport.

If I run the export  I expect to see 2,645 records.

What’s the count?

When I run it a second time I expect the same number of rows as I have done a TRUNCATE just after logon but before the INSERTs actually happen:

And check the count:

DATAEXPORT ran, the LOGON trigger did the TRUNCATE, and the table still has 2,645 records.  Success boil in bag!!!

Is this really working? If I disable the trigger, there will be no truncation on run and the record count will now be 2,645 + 2,645 or 5,290 records.  I hope.
SQL Server gives me confirmation that the trigger EssbaseDataExport is disabled.

Run the DATAEXPORT one more time (Do I have to show you yet another MaxL screen shot?  Hopefully not as I am not going to do so.)

What’s the count?  Why yes it is 5,290 records.  So proof that Essbase won’t do a TRUNCATE before writing when the trigger is turned off.  Huzzah!

With a quick re-enable of the trigger, what happens when I run the DATAEXPORT calc script?
It runs yet again.
And now I have a row count of…
2,645.  I have now proved that every time there’s a login to SQL Server, I can get a TRUNCATE.  

Approach #2

But there’s a problem with this approach and it’s really quite a big one.  Every connection to SQL Server is going to result in a TRUNCATE.  Don’t believe me?  I will disconnect from SQL Server Studio, and then reconnect.

Here’s my connect to SQL Server dialog box. I click on the Connect button and…

What’s my row count on QueryTest.SampleBasicExport?

Yup, it’s zero.  As Donald Fauntleroy Duck would say, “Aw, Nuts!”  I need to be able to test for the username so that only the username that does that particular write to relational forces the clear.  And I’m going to need to be very careful about what username does that clear – EssbaseLogin could do two different kinds of connects – once to write (the TRUNCATE makes sense) and then again to read (which would be a bummer because the trigger would blow away the table contents).

So what I really need to do is create another username just used for writes, and maybe even a username just used for writing to that particular table.  I don’t want a SQL Load Rule to force that TRUNCATE to occur because that too will fire the CONNECT trigger.

Setting up a different DSN

As I am admin on my box and my SQL Server instance, this is easy peasy lemon squeezy.

Create the SQL Server username

As I am a dba on my own server, I will create the username EssbaseDataExport_Writer.

And then give EssabseDataExport_Writer db_owner access so it can do a TRUNCATE.

Creating the system DSN

Create the ODBC system DSN with that username EssbaseDataExport_Writer.

Set the default database to QueryTest:

And confirm that the username can connect:

Modifying the trigger to test for a specific username

What the trigger needs to do is test for the username EssbaseDataExport_Writer and then, and only then, perform the TRUNCATE.  All other usernames will not perform the clear of the SampleBasicExport table.

Happily there is a database operation keyword called SYSTEM_USER that will return the connection username.  Stick that SYSTEM_USER into an IF statement and then do the TRUNCATE if true and we should be good.

Here’s the trigger with the IF test for SYSTEM_USER.

Proof of the pudding part 1

If I the DataExp1 calc script using the non-tested username EssbaseLogin, SQL Server should not perform that TRUNCATE.  If there were already 2,645 records in the table SampleBasicExport, I should now have 5,290.

And so it is:

Proof of the pudding part 2

I created another calc script, this time pointing to the new DSN with the username DataExport_Writer.  In my typically unimaginative/lazy way, I named it DataExp2.

What happens now when I run the calc script?

And the row count?

Yup, I am now testing for just the username EssbaseDataExport_Writer!  Whew, what a long journey, but now when Essbase writes to SampleBasicExport using the EssbaseDataExport_Writer DSN/username, it will never double, triple, quadruple, etc. count the data.  

More cool trigger functions

It’s great that the TRUNCATE works, but wouldn’t it also be nice to have a record of each and every time the export fired and that the target table was cleared out before the write?  Why yes it would.  All I need to do is create a table with that kind of information.

Over in QueryTest, I created a table called LoginAudit with three fields:  DateTime, SystemUser, and RowsBeforeInsert.

I then altered the EssbaseDataExport trigger to do an INSERT after the TRUNCATE.  SYSDATETIME() is a date/time stamp, SYSTEM_USER() we’re already familiar with, and then I did a subquery to get the row count in SampleBasicExport which, if the TRUNCATE is successful, should always be zero.

After I apply the trigger, I then run the calc script DataExp2 and…

This result is just what I hoped for.  I could stick that INSERT statement before the IF to log everyone (this is actually a pretty common usage) and track everyone, I could extend the IF to an ELSEIF and test for other tables, in short I could do all of the cool things that SQL allows me to do.

And that is why SQL should be used for ETL work

I have gone pretty far afield and it took me 19 pages in MS Word to get to the end but a lot of that was spelling out each and every step in the process for we SQL beginners.

ETL in a Load Rule is, in my opinion, the work of the Devil Hisself because it isn’t transparent, is easy to get wrong, and has functions that cannot be undone, e.g. splitting a column.  Whatever the theological genesis of SQL is, it doesn’t suffer from any of the faults of Essbase Load Rules.  Yes, this is another Cameron rant about the Evils of Load Rules.  But I’m done.  :)

Thanks to the power of SQL, I’ve solved the issue with getting TRUNCATE access to an Essbase DATAEXPORT table which is quite often very difficult to do in a production environment through a very simple trigger.  See, SQL is awesome.

A caution, or at least a series of questions
One point about awesomeness -- I  reached out to my buddy Rich Magee, who knows SQL inside and out from a prior professional existence as a DBA, and asked him if there were any downsides to this approach.  Of course there were (life is often a combination of good and bad).  Here are his comments and they are food for thought:

"My understanding is that Logon triggers are typically used to monitor and control max number of logins, time outs and such. However, I could see no reason to not use them in your circumstance.
 
My questions (dba hat on) to you would be:
  • Why can you not simply schedule the stored proc to run?
  • What if the user logs on and off 5 times in a 5 minute period?
  • Would that not spawn unnecessary/redundant jobs all doing the same thing?
  • Could a risk be filling up the CPU or Disk with jobs/temp files that aren’t needed?"
So definitely some food for thought -- as always, you, Gentle Reader, must decide if what I hand out for free is genius or madness.  I tend to think it is the latter, not the former, but that is for you to decide.
The conclusion to the conclusion

Re the point about being a good (or bad) programmer based on approach  – I cannot say which category I fall into but I do know that I spent a lot of time figuring out what I wanted to do and then a lot more time figuring out how to do it with very little code at the end.  As before, you have to decide what class I (or you) fall into.  At least this is something you won’t have to figure out.

Be seeing you.

11 August 2014

Enhanced Planning Validations: Part 2

Introduction

At the end of July Tyler Feddersen very generously wrote part one of this two part series on data validation in Planning; you now are reading the concluding entry.  I’ll let Tyler do the explaining below but I want to make sure you, Gentle Reader, understand that other than this introduction and a super short concluding paragraph that everything is 100% Tyler’s.  I don’t have many guest bloggers (if you are interested send me a comment via this blog or a LinkedIn message) but I am always beyond happy to use my blog as a platform for sharing information.  

Thanks must also go out to Performance Architects – not many companies (in my experience, “not many” should read “just about none”) would allow one of their employees to write for something other than their blog.  PA are to be commended for their spirit of information sharing.  Chuck Persky, whom I know from the long ago days of ISA (and thus my connection to PA), will be presenting at Oracle Open World – if you are going to that event I encourage you to attend his session Oracle Planning and Budgeting Cloud Service: Oracle Hyperion Planning in the Cloud [UGF9091] on Sunday, 28 September.  I’m not presenting on that Sunday (or at all at OOW, for that matter – sanity occasionally comes my way), but I am the guy who solicited the ODTUG EPM user group sessions.  And lest unworthy thoughts enter your mind,  I reached out to Chuck long before this blog post was written – there is no quid pro quo.

With that, enjoy.

Tyler Feddersen, Performance Architects

Let’s take it a step further


If you are reading this and have not read Part 1, I would definitely recommend stopping now and doing so. The rest of this blog will be taking the results of Part 1 and taking it a step further in creating a pretty good overall validation solution.

The validation process discussed in Part 1 was a simple way to be able to identify errors that would cause data integrity issues. However, it does not prevent them. This is the part of the process that Part 2 will attempt to accomplish.

A Look Ahead


Each individual step with this solution is fairly simple and direct. However, the process that combines them all starts to get a little confusing. Let’s take a glance at the general process flow that an end user would go through when working with this technique.

  1. The user completes all necessary data entries
  2. Following data entry, the user would change a lock status to “Locked”
    1. Done through a separate entry form
    2. A business rule checks for any errors prior to allowing the lock
    3. Causes a flag to show as “Locked”, which also prevents additional business rule execution
  3. If Process Management is used, the user would approve the entity, which would validate against the data form used in Step #2
    1. If approvals process finds that the entity has not gone through the initial locking process, an error will be displayed
    2. By successfully approving, the user is no longer the owner of the entity
  4. Following a successful approval, the user is no longer able to perform any data entry, which means that the initial lock in Step #2 cannot be reverted
  5. At this point, the user would need to have the entity sent back through Process Management in order to perform any further data modifications

The end result of the process allows for analysts and administrators to switch their focus to more business-related issues rather than focusing on locking down the system and having to check for all of the typical data entry errors. Administrators generally have to pick a universal end date to lock down the system in order to be assured that users are not making additional changes. With the process mentioned above, budgets and forecasts can be potentially analyzed as they are completed since a user’s ability to make any modifications is removed as soon as everything is approved.

The important part to remember is that this process is done completely using out-of-the-box functionality. If there is anything through the steps that you feel could be simplified….it probably could. But this will at least get us started.

Step 1: Dimensionality


If you recall from Part 1, I created two validation members: AllocValidate and Validate. AllocValidate was created in the dense Accounts dimension  while the Validate member was created within a sparse dimension, to give us block suppression capabilities. For this portion of the validation process, I created an additional Accounts member, LockFlag. This new member will be used in coordination with the previously created member, AllocValidate, to create a locking “flag” that all business rules can use to decide whether the rule should continue to process or not.

Additionally, I added a “NO_XXXX” for each dimension. The flag only needs to be stored at the Entity level, so each dimension outside of Accounts, Period, Scenario, Version, Year (although, I use No Year), and the dimension containing “Validate” will need the “NO_XXXX” member.  





Step 2: SmartLists


Building off the SmartList from Part 1, I added a few new entries. These entries will be used to display the status of an entity in the approval process, including: locked, open, and invalid. Additionally, I created a new SmartList to accompany the “LockFlag” member. This SmartList will be used by the users to lock and unlock their entity.  



Step 3: Create the Data Form


Create a data form with following parameters:

Rows: Entity
Columns: AllocValidate and LockFlag
POV: All other members where the flag will be stored (including Validate)

Additional Specifications:
  • Make the column for AllocValidate Read-Only. This column will be modified based on the user selection for LockFlag in coordination with a business rule that will run on save.
  • Add a Validation Rule for the AllocValidate column to show as invalid if the column does not appear as being locked.  



Step 4: Business Rules


First, create a rule to run on save that will check for the “LockFlag” selection and update the “AllocValidate” status accordingly. The script should initially check if an entity has any error messages. If not, then it will lock or unlock based on the LockFlag selection. Note that this rule actually runs for all entities each time the save occurs. However, there is very little to no potential conflict issue between users, as the rule runs very quickly. However, this rule can also be run as a Menu option to select a single entity at a time if desired. An example script is shown below.

The rule checks if any validation errors already exist. If so, it changes the AllocValidate flag to appear as invalid.


If the rule detects an invalid entry, the rest of the rule is not processed. If there was no error found, the rest of the rule will lock or unlock based on the LockFlag selection.

Next, update each business rule to take the “AllocValidate” flag into consideration. If the flag’s current status is “Locked”, then we no longer want to run the business rule if it will change the current data set. An example is shown below, using the @RETURN function as well.

Lastly, I’ve udated the rule that was created in Part 1 to include an automated re-assignment of the AllocValidate lock status to be invalid, if an error occurs. This is to prevent an issue where the user locks the entity but then submits invalid data, as data entries cannot be prevented through the use of business rules.  The example below would assign an error to a specific assignment in addition to assigning the “Invalid” flag for the whole entity.

Step 5: Process Management


Process Management is a fairly complex subject within Hyperion Planning, so I will not be going into a lot of detail on the specifics of setting everything up as that could be its own blog…or two. The assumption here is that the decision has been made to use Process Management, which allows it to incorporate this validation process. The important part is to make sure to select the checkbox “Do Not Promote” within the options for the Validation Rule set up in Step 3. This will cause approvals to fail if it detects an error within the data form.


Step 6: Example


If you’re confused at this point, it’s quite understandable. I am as well. Nonetheless, let’s put everything together and actually go through a sample process that a user might go through.

  1. The user has invalid data and is trying to lock department (entity) DP_0000. However, the rule has found a validation error.


  1. The user can navigate to a form with the list of errors affecting all of their entities. The example below was the form that was created in Part 1.


  1. The error is displayed on the form, and the user fixes the error. Note: My error flags were thrown off during the preparation for this demo, and the error doesn’t EXACTLY reflect the real error. We’ll pretend.


  1. After fixing the error, the user navigates back to the locking form and tries to lock again. The lock is successful this time.


  1. The user tries to run a rule to add a new position to the department that has been locked. Since this rule would change the data set that has been validated, an error is returned.


  1. The user goes back into the data set, entering a change to the data set that will cause an error to occur. The flag is automatically reset to be invalid.


  1. As an example for Process Management, we’ll pretend the user has approved the department successfully. In doing so, the depatment can no longer be unlocked unless the department is sent back, creating a completely locked down department.


Conclusion


By now, you probably get the gist of what is trying to be accomplished here. However, it all can get a litlte complex with how all of these pieces interact with each other. As mentioned earlier, all of these steps are just an example of how all of these pieces can work together to create a customized flow within any Planning application. They can be refined, simplified, or even expanded as needed. The best part about all of this is that it’s simply a new take on existing functionality, allowing developers and users to remain within the Planning technology. With the Hyperion functionality, we’re always attempting to achieve higher efficiency, and there is nothing more efficient than removing some redundant tasks from the administrator’s plate. Thanks to all for reading and a special thanks to Cameron as well for allowing me to hijack his blog for a few entries. Let me know if you have any questions or comments!

Cameron’s conclusion

So there you have it – a way to prevent planners from inputting incorrect data values.  It’s sort of the budget data quality Holy Grail for Planning administrators and consultants alike.  And oh yeah, business owners too.  This is awesome stuff and again my thanks to Tyler Feddersen for writing this and Performance Architects for allowing Tyler to write outside the walls of his company.  PA understands that sharing information is how we all grow and learn. Thanks for sharing, guys.

Be seeing you.