There are three nasty bugs haunting X-Plane 10 that are on my plate right now; these are serious bugs that pretty much ruin the X-Plane experience for those who hit them. They are marked in the bug base as “hair on fire” – the highest priority we have for a bug. (Yes, the bug base literally has a priority category called “hair on fire”!)
Every morning I have to ask “can I close any of my hair-on-fire bugs?”, but these three have been nasty enough that a status update is warranted, even before we have a fix.
Bug 1: Poor Performance with ATI Hardware on Windows
A code path used by the cars and clouds is stalling in the ATI driver on Windows. I don’t know why this happens, but we are trying to follow up with ATI directly. The result is pretty unfortunate: if you have any cars (even “Siberian”) or any clouds (even scattered), framerate absolutely tanks.
As it turns out, if you set weather to clear and turn off cars completely, ATI performance on Windows is pretty good! Obviously that’s not a useful work-around, but if you have ATI hardware and are wondering what kind of fps you’ll get once we fix this, those are the two features to turn off. (I suspect rain may hit this code path too.)
As with all driver-related problems, I must point out: I do not know if it’s our fault or ATI’s, and the party at fault may not be the party to fix it. Apps work around driver bugs and drivers work around illegal app behavior all the time…that’s just how it goes. Philosophically, the OpenGL spec doesn’t require any particular API call to be fast, so you can’t really call a performance bug a bug at all. It’s ATI’s job to make the card really fast, and my job to guess which subset of OpenGL will make the card run its fastest.
I have a Radeon HD 7970 and will write up some of my performance findings later. Unfortunately, with the clouds affected by the problem code path, it’s hard to tell how the card will really perform.
(OpenGL nerds: the problem is that glBufferData with an in-flight VBO and a NULL pointer is stalling. This particular use of glBufferData is meant to tell the GPU that we want a new fresh buffer, with the old buffer being asynchronously discarded once the GPU has completed drawing. This “orphan the buffer” pattern is the standard way to do high performance vertex streaming in OpenGL prior to glMapBufferRange.)
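(For those following along at home, here is a minimal sketch of the orphaning pattern. This is hypothetical illustration code, not X-Plane’s actual streaming code, and it assumes a GL 1.5+ context with the buffer-object entry points already loaded; on Windows those come from an extension loader.)

    // Sketch of the orphan-and-refill streaming pattern. The vbo may still
    // be in use by in-flight draw calls when this is called.
    void stream_vertices(GLuint vbo, const void * verts, GLsizeiptr bytes)
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);

        // Orphan the buffer: same size and usage, NULL data pointer. The
        // driver detaches the old storage (freeing it asynchronously once
        // the GPU finishes drawing from it) and hands back a fresh buffer.
        // This is the call that stalls in the ATI driver on Windows.
        glBufferData(GL_ARRAY_BUFFER, bytes, NULL, GL_STREAM_DRAW);

        // Fill the fresh buffer; nothing is reading it yet, so no stall.
        glBufferSubData(GL_ARRAY_BUFFER, 0, bytes, verts);
    }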
Bug 2: Hang on Start with NVidia Hardware on Windows
A small percentage of Windows users get a hang on start with NVidia hardware. We’ve had this bug for years, but it’s more of a problem in version 10, where command-line work-arounds are not available. We have identified four third-party programs (Omnipage, Matrox Powerdesk, Window Blinds, and TeamViewer) that all seem to cause the problem at least some of the time, but some users have this problem with no obvious cause. There isn’t a hardware or driver pattern to it, either.
Until recently, we could never reproduce this bug in-house; it was only through the patience of users that we were able to characterize exactly what was going wrong via copious logging. But a few days ago we got a break: a user reported Window Blinds to be hanging X-Plane, and unlike before, Window Blinds hangs our machines up too! (That’s a huge step forward in getting a bug fixed.)
At this point we’re working directly with NVidia to try to find out what’s going on, as the hang occurs inside their driver code. The same rant applies here as above: until we know exactly what’s going on, we can’t say whether this is a driver bug or an app bug.
Bug 3: Crashes Near Munich on Linux
Linux users see crashes near EDDM (Munich) with memory exhaustion, even at very low rendering settings. This bug appears to be Linux-specific, and we now know what’s going on.
For reasons that I cannot yet fathom, the math deep inside the ATC airport taxi layout auto-generator goes haywire on Linux, resulting in an infinite loop that allocates memory, eventually crashing X-Plane.
What’s infuriating about this bug is that it is a Heisenbug: all of my attempts to diagnose or “prove” what is going on cause the math to stabilize and the app to function. The bug does not appear when the optimizer is off; it certainly looks like an optimization bug, but I hate to call foul on the compiler guys without a ton of evidence. It’s just too easy to write C++ floating point code with sharp edges.
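(To give a flavor of what I mean by sharp edges, here is a contrived example, not the actual ATC code: an exact equality test on a double is all it takes to turn a terminating loop into an infinite one. And on 32-bit x86, the x87 FPU’s excess precision means such comparisons can even change behavior with optimization flags, which is exactly the Heisenbug pattern.)

    // Contrived illustration: a loop whose exit test requires exact
    // floating point equality. 0.1 has no exact binary representation,
    // so the accumulator steps over 1.0 and the loop would spin forever
    // without the guard.
    #include <cstdio>

    int main()
    {
        int steps = 0;
        for (double t = 0.0; t != 1.0; t += 0.1)   // should be t < 1.0
        {
            if (++steps > 20)                      // guard so the demo halts
            {
                printf("never hit 1.0 exactly; t = %.17f\n", t);
                return 1;
            }
        }
        printf("landed on 1.0 exactly\n");
        return 0;
    }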
I don’t have a good time frame for the two video-card-related bugs; I consider us lucky to be able to communicate with ATI and NVidia at all. I don’t have insight into their development processes, and if I did, it would be covered under our NDA. The only thing I can say is that these are top-priority bugs.
The Linux bug is still in my court; I last put it down in a state of GCC-induced rage. I believe I will be able to get it fixed sometime during the 10.04 beta process.
Final unrelated thought: all of the GPUs in my office, listed in performance order…
GeForce FX 5200, GeForce 6200, Radeon 9600 XT, Radeon 9700, Radeon X1600M, GeForce 8800 GT, Radeon HD 4870, GeForce GTX 285, Radeon HD 6950, GeForce GTX 580, Radeon HD 7970