There are three nasty bugs haunting X-Plane 10 that are on my plate right now; these are serious bugs that pretty much ruin the X-Plane experience for those who hit them.  They are marked in the bug base as “hair on fire” – the highest priority we have for a bug. (Yes, the bug base literally has a priority category called “hair on fire”!)

Every morning I have to ask “can I close any of my hair-on-fire bugs?”, but these three have been nasty enough that a status update is warranted, even before we have a fix.

Bug 1: Poor Performance with ATI Hardware on Windows

A code path used by the cars and clouds is stalling in the ATI driver on Windows.  I don’t know why yet; we are following up with ATI directly.  The result is pretty unfortunate: if you have any cars (even “Siberian”) or any clouds (even scattered), framerate absolutely tanks.

As it turns out, if you set weather to clear and turn off cars completely, ATI performance on Windows is pretty good!  Obviously that’s not a useful work-around, but if you have ATI hardware and are wondering what kind of fps you’ll get once we fix this, those are the two features to turn off.  (I suspect rain may hit this code path too.)

As with all driver-related problems, I must point out: I do not know if it’s our fault or ATI’s, and the party at fault may not be the party to fix it.  Apps work around driver bugs and drivers work around illegal app behavior all the time…that’s just how it goes.  Philosophically, the OpenGL spec doesn’t require any particular API calls to be fast, so you can’t really call a performance bug a bug at all.  It’s ATI’s job to make the card really fast and my job to speculate which subset of OpenGL will cause the card to be its fastest.

I have a Radeon 7970 and will write up some of my performance findings later.  Unfortunately with the clouds affected by the problem code path, it’s hard to see how the card will really do.

(OpenGL nerds: the problem is that glBufferData with an in-flight VBO and a NULL pointer is stalling.  This particular use of glBufferData is meant to tell the GPU that we want a new fresh buffer, with the old buffer being asynchronously discarded once the GPU has completed drawing.  This “orphan the buffer” pattern is the standard way to do high performance vertex streaming in OpenGL prior to glMapBufferRange.)
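
To make that concrete, here is a minimal sketch of the orphaning pattern (the names are illustrative; this is not our actual streaming code):

```cpp
// Minimal sketch of VBO orphaning for streamed vertex data -- the
// function and parameter names are made up, and this assumes a desktop
// GL context with the buffer-object entry points resolved.
#include <GL/gl.h>

void stream_vertices(GLuint vbo, const void* verts, GLsizeiptr size)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);

    // Re-specify the buffer with a NULL pointer.  This "orphans" the
    // old storage: the driver should hand back fresh memory right away
    // and retire the old block asynchronously once the GPU finishes
    // drawing from it.
    glBufferData(GL_ARRAY_BUFFER, size, NULL, GL_STREAM_DRAW);

    // Fill the fresh buffer with this frame's vertices.
    glBufferSubData(GL_ARRAY_BUFFER, 0, size, verts);
}
```

That glBufferData call is supposed to return immediately; the stall we see on ATI/Windows happens at exactly that call.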

Bug 2: Hang on Start with NVidia Hardware (Windows)

A small percentage of Windows users get a hang on start with NVidia hardware.  We’ve had this bug for years, but it’s more of a problem in version 10, where command-line work-arounds are not available.  We have identified four third-party programs (Omnipage, Matrox Powerdesk, Window Blinds, and TeamViewer) that all seem to cause the problem at least some of the time, but some users have this problem with no obvious cause.  There isn’t a hardware or driver pattern to it either.

Until recently, we could never reproduce this bug in-house; it was only through the patience of users that we were able to characterize exactly what was going wrong via copious logging.  But a few days ago we got a break: a user reported Window Blinds to be hanging X-Plane, and unlike before, Window Blinds hangs our machines up too!  (That’s a huge step forward in getting a bug fixed.)

At this point we’re working directly with NVidia to try to find out what’s going on, as the hang occurs inside their driver code.  The same rant applies here as above: until we know exactly what’s going on, we can’t say whether this is a driver bug or an app bug.

Bug 3: Crashes Near Munich on Linux

Linux users see crashes near EDDM with memory exhaustion, even at very low rendering settings.  This bug appears to be Linux-specific, and we now know what’s going on.

For reasons that I cannot yet fathom, the math deep inside the ATC airport taxi layout auto-generator goes haywire on Linux, resulting in an infinite loop that allocates memory, eventually crashing X-Plane.

What’s infuriating about this bug is that it is a Heisenbug: all of my attempts to diagnose or “prove” what is going on cause the math to stabilize and the app to function.  The bug does not appear when the optimizer is off; it certainly looks like an optimization bug, but I hate to call foul on the compiler guys without a ton of evidence.  It’s just too easy to write C++ floating point code with sharp edges.
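
To give a flavor of those sharp edges, here is a contrived example (not the actual taxi-layout code) where the optimizer legitimately changes the result of floating point math:

```cpp
// Contrived illustration of optimizer-sensitive floating point -- not
// the actual taxi-layout code.  On 32-bit x86 with x87 math, the sum
// below can be held in an 80-bit register at -O2 (prints 1) but gets
// spilled and rounded to a 64-bit double at -O0 (prints 0); with SSE
// scalar math it always prints 0.  Same source, different answers.
#include <cstdio>

volatile double big = 1e16;  // volatile keeps the compiler from
volatile double one = 1.0;   // constant-folding the expression away

int main()
{
    // Exact in 80-bit extended precision; the +1 rounds away in 64-bit.
    double d = (big + one) - big;
    printf("%g\n", d);
    return 0;
}
```

Nothing in that example is a compiler bug; the language simply gives the compiler latitude over intermediate precision, which is why I want a ton of evidence before blaming GCC.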

I don’t have a good time frame for the two video card related bugs; I consider us lucky to be able to communicate with ATI and NVidia at all.  I don’t have insight into their development process and if I did, it would be covered under our NDA.  The only thing I can say is that these are top priority bugs.

The Linux bug is still in my court; I last put it down in a state of GCC-induced rage.  I believe I will be able to get it fixed some time during the 10.04 beta process.

Final unrelated thought: all of the GPUs in my office, listed in performance order…

GeForce 5200 FX, GeForce 6200, Radeon 9600 XT, Radeon 9700, Radeon X1600M, GeForce 8800 GT, Radeon HD 4870, GeForce 285 GTX, Radeon HD 6950, GeForce 580 GTX, Radeon HD 7970

About Ben Supnik

Ben is a software engineer who works on X-Plane; he spends most of his days drinking coffee and swearing at the computer -- sometimes at the same time.

46 comments on “Three Nasty Bugs”

  1. Wow!!! Thanks for the insight Ben! Just turned off clouds and cars completely, and my fps jumped from 18-22 to 34-38. Radeon HD 6970 at fault 😉

    1. Yep – once you remove the driver hit, things pick up. Now the next trick is: at 34-38 fps, can you turn on even MORE stuff that you didn’t already have on and KEEP those fps? Modern PCs with decent cards can typically run the autogen at extreme res.

      1. Have a 6950 with 2GB VRAM and have the same problem. Once I turn on cars, framerate drops dramatically to 10-13. I can turn on objects, trees and Texture Resolution to max without losing too many frames. So, obviously I’m looking very forward to the bug fixes regarding the ATI cards 😀 Good luck and hopefully it doesn’t take too long :-))

      2. You are completely right Ben – same setup, and now with INSANE objects, I still keep the same fps (and 40 fps at night)! And that’s with HDR on, global shadows, textures at extreme resolution, airport with extreme details, running at 5760×1080 (3x1080p)!!!

        Wow wow wow!!! What a difference!

        1. wotaskd – are you running a single 6970 in eyefinity? Your results are good to hear as I want to keep a single 7970 GPU and run at the same setup (5760 x 1080).

          1. I _must_ point out at this point that it is _not_ a goal for X-Plane to run at 5760 x anything at 60 fps with maxed-out options on a single GPU. I do want to be able to run at 2560 x 1440 at 30-60 fps with all rendering settings maxed out on a premium single-GPU video card. But when you get into dual and triple monitors, you’ll have to turn something down, even after we finish performance tuning.

          2. I’m not looking to max everything – just the important visuals (mainly HDR, shadows and weather). I’ve actually had pretty good luck so far so any improvement would be a bonus.

          3. LOL – not trying to maximize everything, eh? Just the fill-rate expensive things: HDR, shadows, clouds, and huge res. 😉

            Seriously, if you’re willing to dial it back you’ll find you can free up some room for fill rate. We’re trying to provide good compromises for how things get dialed in and out.

          4. Understood. I do like the way the options allow you to dial things in and out. Until the 7970 drivers are more mature it will be tough to know what I can really achieve with my setup. My goal is anything over 30fps consistently – to me that is perfectly enjoyable and smooth.

          5. Yes, Eyefinity @ 5760×1080. I’m with Ben, this setup is non-standard, not what 90% of users need, and probably not a target for X-Plane 10 optimization. That said, it’s amazing to see what can be done with current hardware. Still, I’d probably stick to the standard presets and enable the features that matter the most.

            Also, my intention is to dial down my screen setup to 3x720p as soon as AMD’s 12.2 drivers are released (with support for alternative resolutions) and X-Plane supports 720p (something Ben already said will be added in the future).

  2. I feel 1/3 of your pain. We recently found a little ditty buried in some F77 from back in the 80s. Linux made the float end w/ a 0 and Windows made it end w/ a 1. Different compilers on either side.

    I too am suddenly loving the jump in fps sans cars and clouds w/ a 5850. Good luck.

  3. Hi Ben,

    I’m a new x-plane user (starting out with v10), and have been in a holding pattern on getting a new video card, since I’ve been lusting after the 7970 but holding off on buying it due to reported issues with x-plane.

    So I’m very grateful for the update! You mentioned performance issues with the 7970, but other than that, I’ve also heard reports of serious rendering issues (artifacts or flashing textures). Are you also seeing these issues in your testing, or have they been confirmed fixed? Refer to //forums.x-plane.org/index.php?showtopic=56217&st=0 (in particular, OptimusPrimeX and acompany both report these issues).

    I’m currently using on-board video until I can buy a video card, which makes me sad… 🙁 so I really would love to buy the 7970 — if it will work!

    Thanks,
    Jim

    1. I see the crazy geometry too – I’m hoping this is just an early driver bug; we see artifacts come and go in the Windows drivers for both NV and ATI from time to time. The 7970 is so new the regular Cat drivers don’t support it yet…I’m assuming they’ll get back to a unified model soon.

    2. Hi Ben,
      So glad I stumbled on this topic this morning! I’m a new user as well and I actually filed a bug report last night about the flashing/crazy textures and artifacts. I have a 7970 and can’t wait to get home today and turn off the cars and clouds and see what happens!
      When you mentioned the “crazy geometry” you are seeing, can you be more specific? I am having flashing/shaking textures that are very unnerving and unenjoyable, and want to feel some sense of relief that this is what you are seeing as well.
      Thanks!
      Marc

        1. Thanks for the quick response! Do you happen to see jagged or really blocky 3D shadows with the 7970 as well?

          1. I didn’t notice them being any worse than normal; shadow blockiness varies with camera angle, weather and settings. It’s better than it was in 10.00 but still needs more tuning.

          2. Interesting… I did submit a screenshot of mine to show the level of blockiness (though I’m sure you have hundreds of submissions to go through), so if you happen to come across it any feedback would be great.

          3. I’m sorry to be a jerk, but: please do _not_ use the blog to follow up on filed bug reports. Once a bug report is filed, please let it be.

            (Once you do it, everyone will want to know, and we have enough open bugs that we really need to spend our time fixing them, not fielding questions on them.)

            Don’t worry – once the bug is in the DB, it is preserved until we can get to it!!

  4. Ben,

    I don’t know which compiler is being used on Windows, but if GCC is used for the Linux port, to my knowledge this compiler supports different possible paths for translating floating point math into CPU-specific instructions. Maybe one of these paths is broken? Scalar floating point operations can be generated either with legacy x87 instructions (deprecated because of performance penalties) or with scalar SSEx instructions (recommended). Does the bug arise with both? (The two compile paths are sketched below.)

    Regards,
    Filippo
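
    For reference, a sketch of how those two paths are selected when compiling for 32-bit x86; the flags are standard GCC options and the file name is made up:

    ```
    # Hypothetical compile lines for the two scalar float paths above
    # ("taxi_math.cpp" is a made-up file name):
    g++ -O2 -m32 -mfpmath=387        taxi_math.cpp   # legacy x87, 80-bit temporaries
    g++ -O2 -m32 -mfpmath=sse -msse2 taxi_math.cpp   # strict 64-bit SSE scalar math
    # A related knob: -ffloat-store forces intermediates out to memory,
    # rounding away the extra x87 precision at some cost in speed.
    ```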

  5. It seems like the Nvidia 580 GTX on Windows is also suffering from serious performance issues. I get miserable performance on an overclocked Core-i7 + 3x 580 GTX with SLI enabled _and_ disabled (SLI is worse). I get a lot better frame rates with a crappy C2D + 480 GTX. Someone on the forum was also complaining about the 580 GTX on Windows. It’s good to know you have one of those GPUs on hand.

    Here’s a tip from an OpenGL driver developer: avoid calling glBufferData again on a used buffer name; it might be better to delete the buffer and start with a fresh one. I don’t know if that would trigger a slow path in the ATI driver. There must be a reason for you not using (double-buffered) memory mappings with buffers.

    As for the bug you suspect to be a compiler problem: have you tried using a different version of GCC, or even Clang? Grab the GCC sources from Git, build it and test whether the problem is still there. Of course, if you can’t reproduce the issue reliably, that will be difficult.

    1. Hi Riku,
      Re: 580, I don’t know what’s going on; I have seen at least one other 580 complaint, but I have the 580, and it runs pretty well at high settings, and a number of other users have confirmed this. So…I think this may be an exercise in “diffing” your system against other users’ systems. I do not believe we have a 580 compatibility/perf bug.

      Re: glBufferData, where did you hear this? Re-specifying a null block is pretty standard – it correlates to lock-discard on D3D. I also tried a map-buffer-range with the discard flag (sketched at the end of this comment) and that also stalls.

      I _think_ I tried using two full VBOs and it didn’t fix it, but I should double-check.

      Here’s a reason to use orphaning/map-discard and not double-buffering: you don’t know when the GPU has “released the lock” on your buffer (that is, when the command stream no longer references the buffer). On a triple-buffered system this could be a long way in the future; on an AFR SLI system this could be even further in the future. So if you explicitly double-buffer, double may not be enough. How much is enough? 3? 4? With orphaning (where you basically pass ownership to the GPU for disposal and grab a new buffer) you let the card release and recycle the memory as quickly as possible.

      Re: the math bug, the version of GCC is not the same on Mac and Linux in our build environment, so we have at least demonstrated that it is not all versions of GCC. Moving the project to CLANG or a _majorly_ different GCC version may be non-trivial.
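
      A sketch of that map-discard variant, with illustrative names (not our actual streaming code; assumes a context/loader exposing the GL 3.0 entry points):

      ```cpp
      // Sketch of map-discard vertex streaming -- illustrative only.
      #include <GL/gl.h>
      #include <cstring>

      void stream_vertices_mapped(GLuint vbo, const void* verts, GLsizeiptr size)
      {
          glBindBuffer(GL_ARRAY_BUFFER, vbo);

          // INVALIDATE_BUFFER asks the driver to throw away the old
          // contents and hand back a fresh region -- like orphaning, the
          // old storage is retired once the GPU is done with it.  This is
          // the glMapBufferRange analogue of lock-discard in D3D.
          void* dst = glMapBufferRange(GL_ARRAY_BUFFER, 0, size,
                                       GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
          if (dst)
          {
              memcpy(dst, verts, (size_t)size);
              glUnmapBuffer(GL_ARRAY_BUFFER);
          }
      }
      ```

      On the ATI Windows driver, this path stalls for us just like the glBufferData orphan.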

      1. Re 580: On the i7/Triple-580GTX/Win7-64 box I have tried, X-Plane 10 is not playable at all; even at minimum detail settings, FPS is around 30. Increasing the detail level decreases FPS, but not dramatically. I have tried enabling and disabling SLI from the control panel, and there is a ~3x difference in FPS (without SLI, single screen only, it is better), using the same (small) window size. I think there is a strange compatibility issue, and I think it’s because both XP10 _and_ the Nvidia drivers have some bugs that together cause this. My plan is to try installing Linux on this box to see if it works better. Feel free to contact me through e-mail and we can debug this issue together. I don’t know if you have access to a huge triple-GPU SLI setup, so I can help you on that front.

        Re glMapBuffers: I don’t know about the driver in question and this use case in particular. On the GL(ES) driver I work on, I’ve noticed that sticking to simple use cases is usually the fastest path. Exploring the dark corners of GL (like texture completeness oddities) will often trigger a nasty software workaround, which may be slow and/or buggy.

        Re math bug: Playing around with different compiler versions and optimization flags is laborious but may reveal nasty bugs; you have to be able to reliably reproduce the problem to get anything meaningful, though. Depending on the subset of C or C++ you’re using, changing from GCC to Clang may be anything from trivial to impossible. Clang is intended to be GCC-compatible, but not all (C++) features are quite ready yet.

  6. Wow! I turned off clouds and cars and set the autogen to “tons” and I couldn’t quite believe my eyes! Now I want to see that every time. I’m using a Radeon 6770 and couldn’t max out the autogen, but what I saw is enough – so realistic, what a wonderful job Ben! I think I’m getting a better Nvidia card in order to enjoy those cities. Do you think RAM is also a factor here? I’m running Win32 with 2GB, wondering if I should move to 64 and more RAM…

    1. 64-bit Windows is a win. In the short term you get 4 GB of address space instead of 2; in the long term, you’ll need 64-bit Windows to get any benefit when we go 64-bit.

    1. Not really diligent – partly I had to play musical parts to get my PC working…as far as I can tell I had a freak mobo problem but I went through a lot of parts. I also wanted to see how “more fill rate” would affect things to get some sense of how Moore’s law is going to affect the sim over the next year.

  7. Ben,
    Any chance this problem with cars/clouds and ATI is limited to Windows and not causing a problem on Macs too?
    Not sure which update of XP it was, but at some point cars started causing a huge FPS hit on my Mac when previously it had not.
    I’m running on an early 2008 Mac Pro 2.8 GHz Dual Quad with 14 GB RAM and an ATI 5770. Most of my settings are default, no HDR as that turns XP into a slideshow. Clouds set to 20%, and I used to be able to run with cars at near max setting, but now it just kills it.

      1. Just tried it on 64-bit Arch Linux, AMD 6950. Over KPDX with no clouds and cars set to “Siberian” the framerate is about 25 fps. Changing cars to “none” the framerate immediately goes up to 45 fps. I don’t know if it’s related to the Windows bug but cars really impact the fps in Linux on my machine.

        Thanks for the updates Ben – I enjoy reading the posts that you and Chris post.

    1. Tony, PM me through the .org

      I have almost the same system as you (except only 10 gigs of RAM) and am getting better results. We should compare notes – but not here 😉

      Sorry Ben, minimal hijacking going on, nothing to see, move right along.

  8. Ben,
    That’s not my intent. Your original post on the bug with Windows is more or less identical to the problems I’ve been experiencing on my Mac.

    1. I do not think I agree with this.

      I can go to EDDM on Windows on my machine.
      I cannot go to EDDM on Linux even under the lowest settings.

      The Linux math bug is one that would crash any machine, regardless of settings.

  9. If I fly from the south, no matter where, over the Fulda VOR (FUL 112.10) heading exactly north, then my XP10 crashes at exactly 51.55° N and 009.75° E.

    If you want, I will send you my email again with the crash log.

  10. Hi Ben,

    thanks for sharing this info. I am relatively sure that I encountered a crash near Munich a few weeks ago ON WINDOWS 7! I tried a flight from EDDF to EDDM three times, always with a CTD near Munich. Don’t know if it’s related, but it may be (Windows 7, 64-bit, GeForce 560 Ti).

  11. hello Ben….
    v10 on Linux crashes only when swap is activated
    I pray for fixing the bug that the voice sound displays the graphics settings
    thanks
