Category: Development

Hardware Buckets

Yesterday I went off on a rant about how the OpenGL 3.0 spec doesn’t matter much, because OpenGL grows via a la carte extensions.  I also mentioned that this creates a dilemma for anyone designing a rendering engine that has to run on a wide range of hardware with very different underlying capabilities.

Back in the old days of X-Plane 6, there wasn’t a lot of variety in hardware.  Some cards were faster than others, but the only real difference in capability was how many texture units you had. X-Plane’s job was simple…
  • First we have a runway texture.
  • Still got a second texture unit?  Great!  Skid marks!
  • Still got a third texture unit?  Great!  Landing light!
  • Got another one?  Etc…

Other than texture stacking, there wasn’t much to do.

Since then the rendering engine has become a lot more complex, as have OpenGL cards.  To keep the combinations down, I tried a “bucketing” strategy for X-Plane 9.  The idea of bucketing is to group cards into major buckets based on whole sets of common functionality, so that we only have to test a few configurations (a “low end” bucket and a “high end” bucket) rather than testing every single card for its own unique set of features.
The obvious bucketing choice was pixel shaders – given a card with shaders and a few other features, we can render all of the new effects.  A card without shaders basically gets X-Plane 8.
So what went wrong?  Driver compatibility, that’s what.  Ideally we don’t want to let every single underlying rendering-engine feature be turned on or off individually, because the combinations become uncontrollable.*  But in practice, being able to turn features on and off is necessary to troubleshoot drivers that don’t always do what we expect them to.
With the GeForce 8800 and Radeon HD cards, there is a potential third bucket for DirectX 10-capable cards, which can handle significantly more advanced pixel-shading effects.  But time will tell whether we can actually make that a bucket, or whether we have to look at each feature individually.  My suspicion is that even if we organize the shipping features into buckets, we’ll have to support a lot of combinations under the hood just to troubleshoot our own application.
*Example: cross a standard vs. oversized panel with the presence or absence of advanced framebuffer blending, crossed with whether render-to-texture works. That’s 2 x 2 x 2 = 8 possible ways just to render the 3-d cockpit panel texture.  Cross that with panel regions, 3-d cockpits, and the new panel spotlighting, and you have 64 configurations. Ouch!
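To make the bucketing idea (and the per-feature escape hatches for driver troubleshooting) concrete, here is a rough sketch of the shape of such code. This is not X-Plane’s actual capability detection – the struct, the bucket rules, and the particular extensions tested are illustrative only, and it assumes an OpenGL context is already current:

    /* Illustrative sketch of hardware bucketing - not X-Plane's real code. */
    #include <string.h>
    #include <GL/gl.h>

    enum hw_bucket { bucket_fixed_function, bucket_shaders, bucket_dx10 };

    struct gl_caps {
        hw_bucket bucket;
        /* Individual switches we can still force off to work around drivers. */
        int has_glsl;
        int has_fbo;          /* render-to-texture */
        int has_float_tex;    /* needed for fancier framebuffer effects */
    };

    static int has_ext(const char * ext)
    {
        const char * all = (const char *) glGetString(GL_EXTENSIONS);
        return all != NULL && strstr(all, ext) != NULL;
    }

    void detect_caps(gl_caps * caps)
    {
        caps->has_glsl      = has_ext("GL_ARB_shading_language_100");
        caps->has_fbo       = has_ext("GL_EXT_framebuffer_object");
        caps->has_float_tex = has_ext("GL_ARB_texture_float");

        if (has_ext("GL_EXT_gpu_shader4"))          /* DX10-class hardware  */
            caps->bucket = bucket_dx10;
        else if (caps->has_glsl && caps->has_fbo)   /* shader-capable cards */
            caps->bucket = bucket_shaders;
        else
            caps->bucket = bucket_fixed_function;   /* gets the X-Plane 8 path */
    }

The point of keeping the individual flags around is exactly the troubleshooting problem above: when a driver misbehaves, we can turn one feature off without demoting the whole card to a lower bucket.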
Posted in Development by | Comments Off on Hardware Buckets

OpenGL 3.0

A few people have asked me about OpenGL 3.0 – and if you read some of the news coverage of the OpenGL community, you’d think the sky was falling.  In particular, a bunch of OpenGL developers posted their unhappiness that the spec had prioritized compatibility over new features.  Here’s my take on OpenGL 3.0:

First, major revisions to the OpenGL specification simply don’t matter that much.  OpenGL grows by extensions – incremental, a la carte additions to what OpenGL can do.  Eventually the more important ones become part of a new spec, but the extensions almost always come before the spec.  So what really matters for OpenGL is: are extensions coming out quickly enough to support new hardware to its fullest capacity?  Are the extensions cross-vendor, so that applications don’t have to code to specific cards?  Are the actual driver implementations of high quality?
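For reference, “growing by extensions” looks like this from the application’s side: check the extension string, then fetch the new entry points at runtime. This is a hedged, Windows-flavored sketch (Mac and Linux use different loaders), and it assumes glext.h from opengl.org for the function-pointer typedefs:

    #include <string.h>
    #include <windows.h>
    #include <GL/gl.h>
    #include <GL/glext.h>   /* PFNGL... typedefs for extension entry points */

    static int gl_has_extension(const char * name)
    {
        const char * exts = (const char *) glGetString(GL_EXTENSIONS);
        return exts != NULL && strstr(exts, name) != NULL;
    }

    static PFNGLBINDFRAMEBUFFEREXTPROC my_glBindFramebufferEXT = NULL;

    void init_fbo_support(void)
    {
        /* Cross-vendor render-to-texture, usable long before it was "core". */
        if (gl_has_extension("GL_EXT_framebuffer_object"))
            my_glBindFramebufferEXT = (PFNGLBINDFRAMEBUFFEREXTPROC)
                wglGetProcAddress("glBindFramebufferEXT");
    }

Whether or not a spec revision folds a feature into the core, this mechanism is what actually delivers new hardware features to a shipping app.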
So how are we doing with extensions?  My answer would be: “okay”.  When the GeForce 8800 first came out, the OpenGL extensions that provide DirectX 10-like functionality were NVidia-specific.  Since then, it has become clear that all of this functionality will make it into cross-vendor extensions, the core spec, or some of each.  But for early adopters there was a difficult stretch when there was no guarantee that ATI’s and NVidia’s DirectX 10 features would be accessible through the same extensions.
(This was not as much of an issue for DX9-like features, e.g. the first generation of truly programmable cards.  NVidia had a bunch of proprietary additional extensions designed to make the GeForce FX series less slow, but the basic cross-platform shader interface was available everywhere.)
Of more concern to me is the quality of OpenGL implementations – and for what it’s worth, I have not found cases where a missing API is standing between me and the hardware.  A number of developers have voiced concern that OpenGL drivers are made too complex (and thus too unreliable, too slow, or too expensive to maintain) because the OpenGL spec carries too many old features.  I have to leave that to the driver writers themselves to decide!  But when we profile X-Plane, we either see a driver that’s very fast, or a driver that’s slow on a modern code path in a way that is simply buggy.
Finally, I may be biased by the particular application I work on, but new APIs that replace the old ones don’t do me a lot of good unless they get me better performance.  X-Plane runs on a wide range of hardware; we can’t drop cards that don’t support the latest and greatest APIs.  So let’s imagine that OpenGL 3.0 contained some of the features whose absence generated such fury.  Now if I want to take advantage of these features, I need to code that part of the rendering engine twice: once with the new implementation and once with the old implementation.  If that doesn’t get me better speed, I don’t want the extra code and complexity and wider matrix of cases to debug and test.
In particular, the dilemma for anyone designing a renderer on top of modern OpenGL cards is how to create an implementation that is efficient on hardware whose capabilities are so different.  I’ll comment on that more in my next post.  But for the purposes of OpenGL 3.0: I’m not in a position to drop support for old implementations of the GL, so it doesn’t bother me at all that the spec doesn’t drop support either.
The real test for OpenGL is not when a major revision is published; it is when the next generation of hardware comes out.
Posted in Development by | 2 Comments

Bad Alloc Crashes in 920 – Bad Timing

I just received a series of reports today that certain converted scenery will cause X-Plane to crash with a “bad alloc” error. Basically, this couldn’t have hit us at a worse time. The final 920 was cut a week ago. We physically can’t recut; Austin is on the road, and I am knee deep in it. But there is a possible work-around, and there will be a patch. Here’s the whole situation.

What is a Bad Alloc?

A bad alloc error is an error that comes up when X-Plane runs out of memory. This can happen for two reasons:

  • We have run out of address space – that is, there is no more virtual memory left, or
  • We have run out of page file/physical memory – that is, we can’t back that virtual memory.

The first case is by far the most common – you’d only hit the second if you are on Windows with a fixed-size (but small) page file. (Hint: if you have a fixed size page file, make it big!)
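For the curious, the name comes straight from C++: when operator new cannot get memory from the operating system (for either of the reasons above), it throws std::bad_alloc. A trivial illustration – not X-Plane code:

    #include <new>
    #include <cstddef>
    #include <cstdio>

    // Allocation sized like a chunk of scenery mesh; if the address space
    // (or the page file) can't supply it, new throws std::bad_alloc.
    float * alloc_mesh(std::size_t vertex_count)
    {
        try {
            return new float[vertex_count * 8];   // xyz + normal + uv per vertex
        } catch (const std::bad_alloc &) {
            std::printf("out of memory for %zu vertices\n", vertex_count);
            return nullptr;                       // caller must cope
        }
    }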

X-Plane can run out of memory for many reasons – everything that runs in the sim uses memory, and the amount used depends on what area you are in, what rendering settings you pick, and what third party add-ons you use. While I’d like to someday reach a point when the sim tells you gracefully that it’s out of memory, it will always be a fact of life that at some point (hopefully an absurdly high one) the amount of stuff you’ve asked X-Plane to do will exceed how much memory you have.

(If you are thinking 64 bits, well, that will just change the problem from a crash to a grinding halt when we run out of physical memory.)

We see bad allocs when there are too many third-party add-ons installed (XSquawkBox is a particular pig because it loads every CSL on startup), when scenery is too complex, and when drivers use memory inefficiently (a particular problem on Vista RTM).

The Bug

When X-Plane creates a curved airport taxiway, it allocates a temporary memory buffer to hold the intermediate product of the pavement. The size of that buffer depends on the complexity of the curve being processed and on a constant based on the maximum curve smoothness.

In 920 I provided an option to crank up the curve smoothness in X-Plane. In the process, I increased that constant factor by 4x, which causes X-Plane to hit its memory ceiling on layouts that used to be acceptable. You’ll see this problem more often on:

  • Bigger, more complex layouts.
  • Configurations that were already chewing up a lot of memory.
  • Machines with less address space (Windows without /3GB, older Mac OS X operating systems.)
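To see why a 4x constant matters so much, here is a hypothetical sketch of the kind of allocation involved. The names and numbers are invented; the shape is the point – the scratch buffer is (curve complexity) x (a smoothness constant), so quadrupling the constant quadruples peak memory for every curved taxiway in a layout:

    struct pavement_vertex { float x, y, z, s, t; };

    // Hypothetical smoothness constant: imagine 16 points per curve segment
    // in 9.00, raised 4x in 920 to support the maximum smoothness setting.
    const int MAX_POINTS_PER_SEGMENT = 64;

    pavement_vertex * alloc_curve_scratch(int segment_count)
    {
        // A big layout with thousands of curved segments now needs 4x the
        // temporary memory it did before, which can push an already-loaded
        // sim past its address-space ceiling.
        return new pavement_vertex[segment_count * MAX_POINTS_PER_SEGMENT];
    }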

What really suckered us about this bug is that it looks almost exactly like a driver issue we’ve seen with ATI drivers on Windows – strange forms of memory exhaustion when shifting scenery at high rendering settings.  So we didn’t recognize this as something new until G5 users reported the bug, which made it clear it wasn’t a driver thing.

What To Do

The bad news is that we can’t do an RC5 – we’re out of time. But there will be a patch, relatively soon; this bug is on the short list for a patch to 920.

In the meantime, there is actually a work-around. By coincidence, some of the internal rendering-engine constants are visible via the “private dataref” system – basically a series of datarefs in the sim/private/… domain that I use for on-the-fly debugging. The dataref that matters here is:

sim/private/airport/recurse_depth

If you load up DataRef Editor you’ll see it has a value of 12. That’s too high. Changing it to 10 will allow otherwise-problematic airports to load.

I will try to post a plugin in the next 10 days that sets this dataref to 10 on startup, effectively patching the problem. This will also limit the maximum smoothness of curves – but my guess is that if you see the crash (not all users do), you can’t run at the max airport curve setting anyway.
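If you want to roll that plugin yourself rather than wait, a minimal version looks roughly like this. It is a sketch against the XPLM SDK, and it assumes the private dataref is registered, writable, and an integer by the time plugins are enabled:

    /* Minimal work-around plugin sketch: clamp the private airport
       recursion dataref at startup. */
    #include <string.h>
    #include "XPLMDefs.h"
    #include "XPLMDataAccess.h"

    PLUGIN_API int XPluginStart(char * name, char * sig, char * desc)
    {
        strcpy(name, "Recurse Depth Patch");
        strcpy(sig,  "example.recurse_depth_patch");
        strcpy(desc, "Sets sim/private/airport/recurse_depth to 10.");
        return 1;
    }

    PLUGIN_API int XPluginEnable(void)
    {
        XPLMDataRef d = XPLMFindDataRef("sim/private/airport/recurse_depth");
        if (d != NULL)
            XPLMSetDatai(d, 10);    /* 12 is the problematic value in 920 */
        return 1;
    }

    PLUGIN_API void XPluginDisable(void) { }
    PLUGIN_API void XPluginStop(void)    { }
    PLUGIN_API void XPluginReceiveMessage(XPLMPluginID from, int msg, void * param) { }

Remove the plugin once the patch ships, so you get full curve smoothness back.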

Of course the next patch will contain a real solution: a more efficient memory allocation scheme!

Posted in Development, News, Scenery by | 4 Comments

DVD Not Found: Mystery Solved

Some users reported during 920 beta that X-Plane would sometimes not detect its DVD – a condition that would come and go. Tonight I figured out what is happening.

  1. In order to validate the DVD, X-Plane decompresses part of its contents into the preferences folder. Why preferences? There is no good reason – it’s historic.
  2. X-Plane will create a preferences folder if there is not one. But it does not do that until you quit.
  3. The X-Plane installer will not make directories unless they contain files.

So put these three things together: on the first run of a new install there is no preferences folder, so the DVD check fails because the directory that should hold its temporary files is missing. Run the sim a second time and the folder is there (created when you quit the first run), so the DVD check succeeds.
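The fix is conceptually tiny – make sure the folder exists before the DVD check writes its temporary files into it, rather than relying on a folder left over from a previous run. A sketch of that idea (illustrative only, not the actual shipping code):

    #include <errno.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #if defined(_WIN32)
        #include <direct.h>
    #endif

    /* Create the directory if it is missing; succeed if it already exists. */
    int ensure_dir_exists(const char * path)
    {
    #if defined(_WIN32)
        return _mkdir(path) == 0 || errno == EEXIST;
    #else
        return mkdir(path, 0755) == 0 || errno == EEXIST;
    #endif
    }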

The next patch of the sim will fix this, but in the meantime, if you delete your preferences, leave the empty directory in place!

Posted in Development by | Comments Off on DVD Not Found: Mystery Solved

Engine Modeling and Autopilot

Two random and unrelated notes:

First, RC4 is going out as is, despite the engine modeling changes being incomplete. Basically we now have a more sane approach to the engines themselves, but no FADEC control. FADECs are on the short list for the next update. Sometimes we just run out of time – not every release can have everything.

Second, a note on autopilot customization – I field a fair number of questions about whether the plugin system can be used to make subtle changes to the autopilot logic. The answer is, of course, no. If you really want a different autopilot, you have to replace the entire “top half” of the logic and drive the flight directors yourself – in this situation you are responsible for:

  • All modes and mode changes based on conditions.
  • The actual selected flight envelope to achieve the desired AP setting.

But you are not responsible for driving the trim and yoke – X-Plane still does that, based on the flight-director values you set.
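To make that division of labor concrete, a plugin that replaces the “top half” ends up doing something like the sketch below in a flight-loop callback. The dataref names here are from memory, so verify them in DataRef Editor before relying on them; the mode logic itself is entirely yours to write:

    #include <stddef.h>
    #include "XPLMDataAccess.h"
    #include "XPLMProcessing.h"

    static XPLMDataRef g_override, g_fd_pitch, g_fd_roll;

    static float MyAutopilot(float dt, float, int, void *)
    {
        /* 1. Your mode logic: decide what the AP should be doing right now,
              based on armed/engaged modes and current conditions.           */
        /* 2. Turn that decision into flight-director targets.               */
        float target_pitch_deg = 2.5f;   /* placeholder values */
        float target_roll_deg  = 0.0f;

        XPLMSetDataf(g_fd_pitch, target_pitch_deg);
        XPLMSetDataf(g_fd_roll,  target_roll_deg);
        return -1.0f;                    /* call again next flight loop */
    }

    void start_custom_autopilot(void)
    {
        g_override = XPLMFindDataRef("sim/operation/override/override_autopilot");
        g_fd_pitch = XPLMFindDataRef("sim/cockpit/autopilot/flight_director_pitch");
        g_fd_roll  = XPLMFindDataRef("sim/cockpit/autopilot/flight_director_roll");

        XPLMSetDatai(g_override, 1);     /* take over the "top half" */
        XPLMRegisterFlightLoopCallback(MyAutopilot, -1.0f, NULL);
    }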

Why can’t you just override one specific behavior? It’s an issue of infrastructure.

Fundamentally, the autopilot only does a few specific tricks. If it were capable of customized behaviors, you’d already see them, in the form of a dataref or (more likely) a Plane-Maker setting. There is no generality to the autopilot that we secretly have inside the code but don’t expose.

Will there be a more general autopilot someday? Maybe – I don’t know; I don’t work on that code. But the plugin system has always aimed to make things possible, not necessarily easy. In particular, it doesn’t aim to make your development easier by recycling the simulator itself as a convenient library of Lego bricks. At the end of the day, X-Plane is an application, not a library. A library would be lots of fun for third parties, but that’s not what X-Plane is.

Posted in Development, News by | Comments Off on Engine Modeling and Autopilot

Framebuffer Incomplete: I Need Your Help!

I believe I am getting close to a possible solution for the dreaded “Framebuffer Incomplete” errors – the error messages that pop up when X-Plane starts and force you to quit.

If you meet these criteria, please contact me:

  1. You have an ATI card that has shown this error in the past.
  2. You can put on the latest Catalyst drivers. (I know a lot of you have put on older drivers to work around this.)
  3. You can run X-Plane 920 RC2.

If you’re in this crew, please email me at my XSquawkBox email address!

The rub is: despite having four machines with ATI cards, I never see this error. So I need to send you a build to get close to a fix!!! Let’s swat this bug for real!

Posted in Development, News by | 3 Comments

How Many Cores Will You Have?

My last post generated a number of responses from both sides of the “hardware divide” (that’d be the haves and the have-nots).  I think everyone at least grasps that developer time is finite and features have to be prioritized at the cost of other features, even if not everyone agrees about what we should be coding.

I think the term “hardware divide” is the right one, because the hardware market has changed. Years ago, when I bought myself a nice shiny new Dell (back when that wasn’t an idiotic idea), a medium-priced Dell had medium-priced hardware.  Not only did I get a decently fast CPU (for the time), but I got a decent AGP bus, a decent motherboard, etc.  The machine wasn’t top-end, but it scaled.
When you look at any computer market, you have to consider what happens when consumers stop wanting “more” and instead want “the same for cheaper”.  This change in economics turns an industry on its head, and there are always winners and losers.  (I have claimed in the past that operating systems have turned that corner from “we want more” to “we want cheaper”, a shift that is very good for Linux and very bad for Microsoft.)
Desktop computers hit this point a while ago, and the result is that a typical non-gamer computer contains parts picked from the lower end of the current hardware menu.  You’re more likely to see:
  • Integrated graphics/graphics by the chipset-vendor.
  • System memory used for VRAM.
  • Slower bus speeds, or no graphics bus.
  • GPU picked from the lowest end of the line (with the fewest shader units).
  • CPUs with less cache (this matters).

Someone commented a few days ago that computers would get more and more cores, and therefore multi-core scalability would be very important to fully utilizing a machine.  I agree. 

But: how many cores will those low-end PCs, aimed at general use (read: email, the web, text editing), actually have?
My guess is: not that many.  Probably 2-4 at most.
These low-end PCs are driven by one thing: price.  The absence of VRAM or dedicated graphics hardware is all about bringing hardware costs down – a $25 savings matters!  In that situation, box-builders will want the cheapest CPUs, and the cheapest CPUs will be the physically smallest ones, allowing more chips per wafer.  A low-end PC gets no benefit from more than 4 cores – the intended use probably doesn’t even saturate one.*
Multiple cores are great because they give us a new way to benefit from smaller transistors (that is, by packing more cores on a chip, rather than clocking it faster, which has real limitations).  But I think you’ll start to see the same kinds of gaps in CPU count that you see now with GPUs.
(In fact, the mechanics are very similar.  The main difference between high-end and low-end GPUs of the same family is the number of parallel pixel pipelines – the low-end chip is often a high-end chip with a few defective pipelines disabled.  Thus you can see a 4x or 8x performance difference, due purely to parallel processing, between siblings in a GPU family.  Perhaps we’ll see the same idea with multi-core chips: build an 8-core chip, and if 4 of the cores fail, cut them out with the laser and sell it as a low-end part.)
* One advantage of multiple cores is that they can take the place of dedicated hardware.  For example, there is no penalty for doing CPU-based audio mixing (rather than having a DSP chip on the sound card) if the mixing happens on a second core.  Being able to replace a dedicated component with a percentage of a core is a win in getting total hardware cost down, particularly if you were going to have the second core already.
Posted in Development by | Comments Off on How Many Cores Will You Have?

Pixel Shaders and Moore’s Law

In my post on 64-bit computing and X-Plane, there’s a point that’s implicit: there is a cost (in development time) to adopting any new technology, and it takes away from other things. I’ve been slow to work on 64-bit X-Plane because it would take time away from things like texture paging and generic instruments. Similarly, there is a cost, every time we do a build, for each configuration we support, so we would pay for 64-bit continuously by supporting six platforms instead of three (three operating systems x two “bit widths”, 32 and 64).

We have a similar problem with graphics hardware, but it’s even more evil. Moore’s Law more or less says that in a given period of time, computer technology gets twice as fast. In the case of graphics cards, each generation of cards (coming out about every 12-18 months) is twice as fast as the last.

This has some scary implications for X-Plane. Consider these stats for video cards (taken from Wikipedia):

Card      Date    Fill rate       Bus             Memory bandwidth
GF3       01Q4     1,920 MT/s     AGP 4x           8.0 GB/s
GF4 Ti    03Q1     2,400 MT/s     AGP 8x          10.0 GB/s
GF5950    03Q4     3,800 MT/s     AGP 8x          30.4 GB/s
GF6800    04Q2     7,200 MT/s     PCIe x16        35.2 GB/s
GF7900    06Q1    15,600 MT/s     PCIe x16        51.2 GB/s
GF8800    06Q4    36,800 MT/s     PCIe x16        86.4 GB/s
GF9800    08Q2    47,232 MT/s     PCIe x16 2.0    70.4 GB/s

Let’s assume we support any video card from the last 5 years (in truth we support more than that). In 2006, the difference between the best and the oldest supported card was 13,680 MT/s of fill rate.

Now in 2008 the difference is 43,432 megatexels per second!

In other words, the gap between the best and worst cards we might support has grown more than 3x in only two years!

This is no surprise – since cards get twice as fast with every revision, the gap for a given number of generations also gets twice as wide.
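Put as a formula (my own notation, nothing official): if the newest card has fill rate F and the oldest card we still support is k doublings behind it, then

    gap = F - F/2^k = F(1 - 2^-k)

For any k of 3 or more, (1 - 2^-k) is nearly 1, so the gap is essentially the newest card’s fill rate – and that doubles every generation, which is exactly what the 2006-to-2008 jump above shows.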

What this means for us, programming X-Plane, is that building a single simulator that runs on both the very best and the very worst hardware is becoming increasingly difficult, as performance gains at the high end run away from the low end.

Posted in Development by | 1 Comment

Custom Datarefs for OBJ Animation

I don’t usually blog about plugin issues, but this one falls into limbo, somewhere between plugins, aircraft design and OBJ animation. This is written for plugin developers; if you don’t write plugins, you can tune out.

Plugin Drawing

Plugin drawing is documented in a few tech notes – I strongly recommend reading them all.

The basic idea is this: to draw in a plugin, you register a callback. X-Plane tells you “it’s time to draw” and you draw.
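In code, the registration looks roughly like this (a sketch; which drawing phase you pick depends on what you are drawing):

    #include <stddef.h>
    #include "XPLMDisplay.h"

    static int MyDrawCallback(XPLMDrawingPhase phase, int is_before, void * refcon)
    {
        /* Draw here - and only draw. No physics, no file I/O, no bookkeeping. */
        return 1;   /* 1 = let X-Plane also do its own drawing for this phase */
    }

    void hook_drawing(void)
    {
        XPLMRegisterDrawCallback(MyDrawCallback,
                                 xplm_Phase_Objects,   /* when to call us         */
                                 0,                    /* 0 = after X-Plane draws */
                                 NULL);                /* refcon                  */
    }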

The first rule of drawing callbacks is: you only draw. You don’t do anything else.
The second rule of drawing callbacks is: you only draw. You don’t do anything else.

There are a number of good reasons for this.

  1. You don’t know how many times, or in what order, your callbacks will be called! You might be called four times per frame, or not at all. So doing flight-model calculations in a draw callback is a bad idea.
  2. Drawing has to be fast. In the drawing code we are trying to stuff the GPU to the gills with work, so drawing time is not a good time to go off and do other expensive calculations. When we look at the interaction of the CPU and GPU, we know that the flight model takes some pure CPU time, and we can improve efficiency by queuing an expensive OpenGL operation before we hit that CPU-only phase. If your plugin does its real calculations at draw time, our pipelining gets thrown off.
  3. We’re continually increasing the parallelism of the sim via threads to take advantage of multiple cores, but plugins are single-threaded by definition. This means that plugin interaction points will (in a future sim) halt parallel operation and hurt overall performance. When we design for parallel operation we can try to optimize around this “halt” case, but if you are doing something strange (like flight calculations in a draw loop), you’re more likely to fall off the fast path.

Surprise

Now here’s the surprising thing: your dataref callback is actually a drawing callback, sort of!

Object datarefs are called back during object drawing. Therefore they have all of the above problems that drawing callbacks have. Here is my recommendation:

If you create a custom dataref that is used by an OBJ for animation, have the dataref callback return a cached value from a variable, and update that variable from the flight loop.

This avoids the risk of your “systems modeling” code being called more than once per frame, etc.
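In code, the pattern looks something like this – a sketch with plugin boilerplate and error handling omitted, and a made-up dataref name:

    #include <stddef.h>
    #include "XPLMDataAccess.h"
    #include "XPLMProcessing.h"

    static float g_door_angle = 0.0f;     /* the cached value */

    /* Called by X-Plane during OBJ drawing - possibly several times a frame,
       possibly not at all. It only reads the cache. */
    static float GetDoorAngle(void * refcon)
    {
        return g_door_angle;
    }

    /* Called once per flight loop - do the real systems modeling here. */
    static float UpdateSystems(float dt, float, int, void * refcon)
    {
        g_door_angle = /* ...advance your door model by dt here... */ 0.0f;
        return -1.0f;                     /* run again next flight loop */
    }

    void init_animation_dataref(void)
    {
        /* "myplane/gear/door_angle" is a hypothetical name for this example. */
        XPLMRegisterDataAccessor("myplane/gear/door_angle",
                                 xplmType_Float, 0,        /* read-only        */
                                 NULL, NULL,               /* int read/write   */
                                 GetDoorAngle, NULL,       /* float read/write */
                                 NULL, NULL,               /* double           */
                                 NULL, NULL,               /* int array        */
                                 NULL, NULL,               /* float array      */
                                 NULL, NULL,               /* byte data        */
                                 NULL, NULL);              /* refcons          */
        XPLMRegisterFlightLoopCallback(UpdateSystems, -1.0f, NULL);
    }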

Posted in Aircraft & Modeling, Development by | 1 Comment

Why Is Installer Scanning Slow?

When you update X-Plane, it scans about 10,000 files before it runs the update. This scan checks every single byte of every file. As a result, it’s a little bit slow – it takes a few minutes.

With the new installer, when you add and remove scenery, we do the same scan on the installed scenery. This scan is very slow – 78 GB is a lot of data!

When the next installer comes out (2.05, which will be cut after 920 goes final), the add-remove scenery function will do a quick scan to see what files exist. This is a lot faster – even with full scenery the scan takes only 10-20 seconds on a slower computer.

I can’t pull the same trick for updates – we need to detect any possible defect in a file during update to ensure that the install is correct!
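For the curious, the difference between the two scans is roughly this (illustrative code, not the installer’s – assume the install manifest stores a per-file checksum to compare against):

    #include <cstdio>
    #include <string>
    #include <sys/stat.h>

    /* Quick scan (add/remove scenery): just check that the file is present. */
    bool file_exists(const std::string & path)
    {
        struct stat st;
        return stat(path.c_str(), &st) == 0;
    }

    /* Full scan (updates): read every byte and checksum it, so any corrupt
       or out-of-date file is caught and re-downloaded. */
    unsigned long checksum_file(const std::string & path)
    {
        unsigned long sum = 0;
        if (FILE * f = std::fopen(path.c_str(), "rb")) {
            unsigned char buf[65536];
            size_t n;
            while ((n = std::fread(buf, 1, sizeof(buf), f)) > 0)
                for (size_t i = 0; i < n; ++i)
                    sum = sum * 31 + buf[i];   /* toy hash, stands in for the real one */
            std::fclose(f);
        }
        return sum;
    }

Statting 10,000 files takes seconds; reading and hashing 78 GB takes however long your disk takes to deliver 78 GB.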

Posted in Development by | 4 Comments