An interesting statistic
The code I am about to show you, and the solution that I came up with, took me over 15 hours of research, cursing, testing, and finally triumph. Multiple cups of coffee were also consumed as were a few cups of proper tea. Why do you care about what someone (such as yr. obt. svt.) does for free or how he imbibes caffeine? Perhaps for the amusement factor, as the solution that I came up with consists of eleven lines of code. Eleven lines of code in 15 hours, for those of you mathematically challenged, is 0.73 lines of code per hour, or 5.5 lines of code per calendar day. Ouch. But that is the price for: having a “good” idea, venturing into technology areas one knows very little about, and not giving up.
A programmer spends about 10-20% of his time writing code, and most programmers write about 10-12 lines of code per day that goes into the final product, regardless of their skill level. Good programmers spend much of the other 90% thinking, researching, and experimenting to find the best design.
Does that put me in the category of a good programmer? It’s hard to say – I do spend an awful lot of time thinking, “What am I trying to do and conceptually, how would I do it?” and then later, much later, I spend even more time trying to figure out how I turn that concept into reality. So is that being a good programmer or just someone flailing about in areas that he doesn’t know much about? You decide.
Why oh why oh why am I prattling on about this?
There was a thread the other day over on Network54 that discussed how to use the headers that come from a BSO DATAEXPORT statement as they are, without a doubt, pants. I suggested that SQL output and subsequent manipulation (I should note that Adam_M went and did it in a text file but no matter, this was the inspiration and based on what he wanted to do, I think SQL might have been a better fit for transforming the data – Read The Whole Thing and make up your own mind) was the answer to Adam_M’s issue, but when I did so I had in the back of my mind the nagging reminder that, “Yeah, he’ll write it to relational, but then he’ll do it two or three or more times as he tweaks the extract and will end up with many more records than he ought. This is a bummer (and erroneous – I would personally rather have it error out) and cannot be resolved by Essbase.” And that’s right; Essbase will not truncate the table before it writes. You have to do this manually and that of course means you have some kind of SQL access which in the real world, particularly in production, you will most likely not have.
The goal aka the functional spec
It would be super nice if BSO’s DATAEXPORT calc script function did a TRUNCATE to clear the table, or at least had the option of doing a TRUNCATE, before writing to a SQL target.
Two paths not taken
Just too much of a hack, even for me
One approach would be to use the @CalcMgrExecute function I wrote about (and I still owe part three of that series to you, Gentle Reader), run a MaxL script, in that MaxL script use the shell grammar, run a SQL*Plus (if Oracle) or sqlcmd (if SQL Server) script that does the TRUNCATE and then return back to the calling MaxL script, end it, and do the export as the next step in the original calc script. At least I am pretty sure that would work but it is just too clunky for words. I would go down this path if I wasn’t allowed to make the modifications I am about to describe but I would hate myself in the morning.
The way I would like to do it, but cannot
What occurred to me was that a SQL trigger that did a TRUNCATE before the INSERT from DATAEXPORT would do the trick. Unfortunately, this cannot work because of the way Essbase writes to relational tables.
Would you believe…
That the default for Essbase is to do separate INSERTs for each and every line in the export? I had heard this from My Man In California (MMIC) aka Glenn Schwartzberg but like oh so many words of wisdom he drops my way (MMIC is a very giving guy and I a forgetful one), I completely forgot. Why does this inefficient approach to writing to relational matter?
It all goes KABOOM
Here’s what SQL Profiler showed me as the DATAEXPORT code ran:
There are 2,645 records to export out of Sample.Basic and that requires 2,645 INSERT statements. Eeek.
It might be possible, although inefficient, to use these triggers to do the TRUNCATE and then copy the single record to yet another table, but that is an ugly approach one that I am not sure I have thought through completely as that target would need to be truncated on first write. So maybe an INSERT trigger is not on.
Another approach
But then I again recalled of one of MMIC’s pearls of wisdom (see, Glenn, I really do listen to you) – there is an Essbase.cfg file setting that can allow batch inserts if the database supports that functionality.
Off I went to the Tech Ref and did my requisite RTM. Oh goody, thought I, it’s going to work. Just read the description: “When DATAEXPORTENABLEBATCHINSERT is set to TRUE, Essbase determines whether the relational database and the ODBC driver permit batch insert.” But I should have known that when Essbase giveth with one hand, it taketh away with the other, because the next sentence states, “If they do, Essbase uses the batch-insert method, and, thus, performance is optimized. Essbase determines the batch size; however, you can control the number of rows (from 2 to 1000) that are inserted at one time by using the DEXPSQLROWSIZE configuration setting.”
Oh, bugger, all this statement does is increase the number of records until that INSERT trigger fails. I know that Sample.Basic has 2,645 rows and that means three INSERTs at the maximum row DEPSQLROWSIZE. Bugger, again. What to do?
Is there a way out? Why yes there is.
INSERT triggers are out, but is there another trappable event that could allow the TRUNCATE to occur? I looked again at SQL Profiler and saw two interesting facts.
The LoginName column reflects whatever username I defined in the ODBC System DSN I used.
There is a login event, at least as trapped via SQL Profiler.
The simple solution, although figuring out how to do it took, oh, forever, and it isn’t enough
If I could write a trigger that intercepts the act of EssbaseLogin (or whatever SQL username I chose) connecting to SQL Server I could have the login trigger TRUNCATE the table right then and there. This frees me from the multiple INSERT issue quite handily. How oh how oh how to do it was the question.
Figuring it out
SQL Server has a trigger database operation keyword called FOR LOGON (the Oracle equivalent is AFTER LOGON). All I needed to do was to create that trigger, test for the username, do the TRUNCATE, and, because I am a paranoid nut, log: the fact that I did this, when I did it, and that the TRUNCATE succeeded.
A note for all of you SQL n00bs and experts – what you are about to see is the very first SQL trigger yr. obt. svt. ever wrote. At least that’s the excuse I am sticking to as explanation why it took so long. SQL n00bs – take heart, if I can do it, so can you; experts – everyone has to start somewhere, so stop the snickering at my plodding progress.
The target database
Here is what QueryTest.SampleBasicExport looks like – a simple table that sticks Product, Market, Measures, and Scenario into the rows and the 12 level zero months of Year as columns.
Approach #1
I stole the genesis for this code somewhere on the web (I meant to keep the link, but forgot to do so). Here’s what I ended up with:
Line by line
As most of the readers of this blog are Essbase practitioners first, and SQL geeks second (or third, or fourth, etc.), I will explain each line of the code.
The first line defines the trigger EssbaseDataExport – as I changed this about eleventy billion times the code shows ALTER instead of CREATE – and that this is a server-level trigger.
The second line states that this is a LOGON trigger.
Line three’s AS statement defines the trigger.
The BEGIN statement on line four is the beginning block of the trigger logic.
We finally get to the TRUNCATE logic on line five.
Line six’s END marks the end of the trigger.
All I need to do is execute the code and ta-da, I now have the server-level trigger EssbaseDataExport. You can see EssbaseDataExport in SQL Server Studio that the trigger is now there.
EssbaseDataExport in action
Here’s what my calc script looks like:
Note that the SQL username EssbaseLogin is used. At this point this isn’t important but it will be in just a bit.
Let’s do a count of the rows in QueryTest.SampleBasicExport.
If I run the export I expect to see 2,645 records.
What’s the count?
When I run it a second time I expect the same number of rows as I have done a TRUNCATE just after logon but before the INSERTs actually happen:
And check the count:
DATAEXPORT ran, the LOGON trigger did the TRUNCATE, and the table still has 2,645 records. Success boil in bag!!!
Is this really working? If I disable the trigger, there will be no truncation on run and the record count will now be 2,645 + 2,645 or 5,290 records. I hope.
SQL Server gives me confirmation that the trigger EssbaseDataExport is disabled.
Run the DATAEXPORT one more time (Do I have to show you yet another MaxL screen shot? Hopefully not as I am not going to do so.)
What’s the count? Why yes it is 5,290 records. So proof that Essbase won’t do a TRUNCATE before writing when the trigger is turned off. Huzzah!
With a quick re-enable of the trigger, what happens when I run the DATAEXPORT calc script?
It runs yet again.
And now I have a row count of…
2,645. I have now proved that every time there’s a login to SQL Server, I can get a TRUNCATE.
Approach #2
But there’s a problem with this approach and it’s really quite a big one. Every connection to SQL Server is going to result in a TRUNCATE. Don’t believe me? I will disconnect from SQL Server Studio, and then reconnect.
Here’s my connect to SQL Server dialog box. I click on the Connect button and…
What’s my row count on QueryTest.SampleBasicExport?
Yup, it’s zero. As Donald Fauntleroy Duck would say, “Aw, Nuts!” I need to be able to test for the username so that only the username that does that particular write to relational forces the clear. And I’m going to need to be very careful about what username does that clear – EssbaseLogin could do two different kinds of connects – once to write (the TRUNCATE makes sense) and then again to read (which would be a bummer because the trigger would blow away the table contents).
So what I really need to do is create another username just used for writes, and maybe even a username just used for writing to that particular table. I don’t want a SQL Load Rule to force that TRUNCATE to occur because that too will fire the CONNECT trigger.
Setting up a different DSN
Create the SQL Server username
As I am a dba on my own server, I will create the username EssbaseDataExport_Writer.
And then give EssabseDataExport_Writer db_owner access so it can do a TRUNCATE.
Creating the system DSN
Create the ODBC system DSN with that username EssbaseDataExport_Writer.
Set the default database to QueryTest:
And confirm that the username can connect:
Modifying the trigger to test for a specific username
What the trigger needs to do is test for the username EssbaseDataExport_Writer and then, and only then, perform the TRUNCATE. All other usernames will not perform the clear of the SampleBasicExport table.
Happily there is a database operation keyword called SYSTEM_USER that will return the connection username. Stick that SYSTEM_USER into an IF statement and then do the TRUNCATE if true and we should be good.
Here’s the trigger with the IF test for SYSTEM_USER.
Proof of the pudding part 1
If I the DataExp1 calc script using the non-tested username EssbaseLogin, SQL Server should not perform that TRUNCATE. If there were already 2,645 records in the table SampleBasicExport, I should now have 5,290.
And so it is:
Proof of the pudding part 2
I created another calc script, this time pointing to the new DSN with the username DataExport_Writer. In my typically unimaginative/lazy way, I named it DataExp2.
What happens now when I run the calc script?
And the row count?
Yup, I am now testing for just the username EssbaseDataExport_Writer! Whew, what a long journey, but now when Essbase writes to SampleBasicExport using the EssbaseDataExport_Writer DSN/username, it will never double, triple, quadruple, etc. count the data.
More cool trigger functions
It’s great that the TRUNCATE works, but wouldn’t it also be nice to have a record of each and every time the export fired and that the target table was cleared out before the write? Why yes it would. All I need to do is create a table with that kind of information.
Over in QueryTest, I created a table called LoginAudit with three fields: DateTime, SystemUser, and RowsBeforeInsert.
I then altered the EssbaseDataExport trigger to do an INSERT after the TRUNCATE. SYSDATETIME() is a date/time stamp, SYSTEM_USER() we’re already familiar with, and then I did a subquery to get the row count in SampleBasicExport which, if the TRUNCATE is successful, should always be zero.
After I apply the trigger, I then run the calc script DataExp2 and…
This result is just what I hoped for. I could stick that INSERT statement before the IF to log everyone (this is actually a pretty common usage) and track everyone, I could extend the IF to an ELSEIF and test for other tables, in short I could do all of the cool things that SQL allows me to do.
And that is why SQL should be used for ETL work
I have gone pretty far afield and it took me 19 pages in MS Word to get to the end but a lot of that was spelling out each and every step in the process for we SQL beginners.
ETL in a Load Rule is, in my opinion, the work of the Devil Hisself because it isn’t transparent, is easy to get wrong, and has functions that cannot be undone, e.g. splitting a column. Whatever the theological genesis of SQL is, it doesn’t suffer from any of the faults of Essbase Load Rules. Yes, this is another Cameron rant about the Evils of Load Rules. But I’m done. :)
Thanks to the power of SQL, I’ve solved the issue with getting TRUNCATE access to an Essbase DATAEXPORT table which is quite often very difficult to do in a production environment through a very simple trigger. See, SQL is awesome.
A caution, or at least a series of questions
One point about awesomeness -- I reached out to my buddy Rich Magee, who knows SQL inside and out from a prior professional existence as a DBA, and asked him if there were any downsides to this approach. Of course there were (life is often a combination of good and bad). Here are his comments and they are food for thought:
"My understanding is that Logon triggers are typically used to monitor and control max number of logins, time outs and such. However, I could see no reason to not use them in your circumstance.
My questions (dba hat on) to you would be:
- Why can you not simply schedule the stored proc to run?
- What if the user logs on and off 5 times in a 5 minute period?
- Would that not spawn unnecessary/redundant jobs all doing the same thing?
- Could a risk be filling up the CPU or Disk with jobs/temp files that aren’t needed?"
So definitely some food for thought -- as always, you, Gentle Reader, must decide if what I hand out for free is genius or madness. I tend to think it is the latter, not the former, but that is for you to decide.
The conclusion to the conclusion
Re the point about being a good (or bad) programmer based on approach – I cannot say which category I fall into but I do know that I spent a lot of time figuring out what I wanted to do and then a lot more time figuring out how to do it with very little code at the end. As before, you have to decide what class I (or you) fall into. At least this is something you won’t have to figure out.
Be seeing you.