21 October 2010

Who will rid me of this turbulent bug

It’s déjà vu all over again

Yep, life is sometimes like a Yogi Berra saying. That’s scary.

I just rolled off a Planning migration from h-e-double-toothpicks. I am reminded, again, that I am an applications, not an infrastructure consultant. For some strange reason, I seem to enjoy parading my serial infrastructure incompetence to all and sundry via this blog. Dirty Harry said it best. I am embracing my limitations with renewed fervor.

My pain=your gain

In an effort to ensure that this particular problem doesn’t bite you, oh applications consultant/administrator reader, in the unmentionables, think back, far back to the long-ago days of Planning 2.2. Was there a release with that number? Oh yes, and even before that. I have been around Planning a long time. So why have I learnt so little?

Moving past questions that cannot be answered (or at least questions that have answers I do not want to hear), there was a problem in older releases of Planning – ephemeral port consumption. No, that is not a Victorian-era disease that involves sanitariums and bloody coughs.

Why do you care and what are they?

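Ephemeral ports are the short-lived port numbers the OS hands out for the client end of a TCP connection. The issue is that when Planning refreshes filters, it consumes these ports during its communication with Essbase. When the OS runs out of them, Planning filter refreshes fail.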
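To put rough numbers on that, here is a back-of-the-envelope sketch. The default figures are my assumptions, not something from the Metalink article: an untuned Windows 2003 box handing out ephemeral ports from 1025 to 5000 and parking each closed connection's port in TIME_WAIT for 240 seconds. The function name is made up for illustration, and the model pretends every call opens a fresh connection.

```python
def sustainable_connections_per_second(first_port, last_port, time_wait_seconds):
    """Rough ceiling on new outbound connections per second before the
    ephemeral range is exhausted, assuming each closed connection parks
    its port in TIME_WAIT for the full delay."""
    port_count = last_port - first_port + 1
    return port_count / time_wait_seconds


# Assumed Windows 2003 defaults: ports 1025-5000, 240 second TIME_WAIT.
default_rate = sustainable_connections_per_second(1025, 5000, 240)

# Values from the registry fix later in this post: top port 65534, 30 second delay.
tuned_rate = sustainable_connections_per_second(1025, 65534, 30)

print(f"default: about {default_rate:.0f} new connections per second")
print(f"tuned:   about {tuned_rate:.0f} new connections per second")
```

Under those assumptions the ceiling moves from roughly 17 new connections a second to roughly 2,150, which would go some way towards explaining why a few hundred users refresh happily and a few hundred more do not.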

What does it look like?

The symptoms

What should have tipped me off as to the error was that with 100 users in the app (I got pretty darn good with the Planning importsecurity.cmd/exportsecurity.cmd utilities) the refresh would work. The fact that the command line syntax for invoking the import and export utilities is completely different was just a dollop of Hyperion icing on the misery cake.

Getting back to what worked and didn’t, the filter refresh would work with 300 users in the app.

As the number of usernames increased (I was slowly adding known good MSAD usernames) to just over 600, at some point (and no, I never did get to the actual count that just tripped failure as I was adding in groups of 50) Planning would fail on the refresh.

I (and quite a few others) spent a lot of time trying to figure out if the MSAD ids were “bad” (some were and “bad” in MSAD means a bunch of different things, e.g., corrupted, locked out, etc.). But that wasn’t the issue.

Should have paid attention, but didn’t

What really threw me is that as I did the refresh, I’d get a pretty consistent list of failed usernames. However, when I selected those usernames individually, their refresh would work. Huh? Also, these same ids worked in other Planning apps. Huh, again.
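In hindsight, a look at the server's connection table during a refresh might have pointed at ports rather than ids. What follows is a minimal sketch, assuming you have captured the text output of netstat -an; the sample rows are invented and the function name is hypothetical.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Tally TCP connection states from `netstat -an` style text.

    Assumes the Windows layout: protocol, local address, remote
    address, state, separated by whitespace.
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Keep only TCP rows that carry a state column.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


# Invented sample; real output would come from running netstat on the server.
sample = """
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:1423          10.0.0.9:4951          TIME_WAIT
  TCP    10.0.0.5:1423          10.0.0.9:4952          TIME_WAIT
  TCP    10.0.0.5:1423          10.0.0.9:4953          ESTABLISHED
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  UDP    0.0.0.0:445            *:*
"""

states = count_tcp_states(sample)
print(states["TIME_WAIT"], "connections in TIME_WAIT")
```

A TIME_WAIT count that climbs into the thousands while the refresh runs would be consistent with port exhaustion, and would fit the pattern of ids that fail in bulk yet work one at a time.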

And the answer(s) are

I would love to tell you that I came up with the diagnosis and the cure to this filter refresh failure, especially because I suffered through this in 2002, but I must give credit where it is due: say hello to Jason D’Onofrio, who went into Metalink and started searching for an answer. Why would anyone want to search the help? If you don’t fancy my preferred diagnostic method of blindly poking around, you too can search Metalink for knowledge base article 826673.1.

And the thing of it is, Tim Tow has documented this error and its fix for, oh, forever, maybe? A long time certainly.

If you can’t be bothered to read any of the explanations, here’s the quick and dirty Windows fix. (The same issue affects *nix, though not nearly as much; the concept applies to that OS, but the mechanics below do not.)

1) Go into Windows Registry editor on the Essbase server.

2) Navigate to the following key: HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

3) Right click and select New or Edit->New and then select DWORD Value.

4) The name should be MaxUserPort.

5) Right click on the new “MaxUserPort” and edit the DWORD value. Enter a decimal value of 65534. You have just increased the number of ephemeral ports to their maximum value.

6) Again create a new DWORD Value. Call it TcpTimedWaitDelay. Set it to a decimal value of 30 (the unit is seconds). You have now decreased to the minimum the time Windows will take to release a port.

When you’re done, the Parameters key should show MaxUserPort set to 65534 and TcpTimedWaitDelay set to 30, both as decimal DWORD values.

7) Reboot the Essbase box after stopping your various services – you know the boot order.

8) After starting the Oracle EPM services back up, try doing a refresh. You should have bottled magic at this point.

NB – The Metalink instructions go on about adding MaxFreeTcbs and setting that to a decimal value of 6250. That wasn’t necessary in my case.
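If you would rather not click through regedit by hand, the two values from steps 4 to 6 can be captured as .reg text. This is only a sketch of that idea, not anything from the Metalink article: the function name is hypothetical, and you should read the output before importing it anywhere.

```python
REG_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def build_reg_file(values):
    """Render DWORD values under the Tcpip Parameters key as .reg text."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{REG_KEY}]"]
    for name, number in values.items():
        # .reg files store DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{number:08x}')
    return "\r\n".join(lines) + "\r\n"


# The two settings from the steps above, given in decimal.
reg_text = build_reg_file({"MaxUserPort": 65534, "TcpTimedWaitDelay": 30})
print(reg_text)
```

The decimal values come out as hexadecimal in the file (65534 becomes fffe, 30 becomes 1e), so don't be alarmed when the numbers look different. The reboot in step 7 still applies.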

Why might you not see an error?

Maybe the registry settings are already there and you don’t know it.

Maybe you have small user communities and you never blow through Windows’ ephemeral ports.

Maybe you just can’t believe that this issue exists almost a decade after Hyperion Planning 1.0 was released on an unsuspecting world.

Maybe you’re on 11.1.2 and are using Windows 2008 which has a larger ephemeral port range. Yes, despite Essbase.sec’s almost complete emasculation in this release, filters are still stored in good old Essbase.sec.

Maybe you’re running some version of *nix.

Maybe you’re just lucky. :)

Phew, this is a problem I never want to revisit. Thanks again, Jason, for finding the answer.

Cameron's Blog For Essbase Hackers