Tag: performance

Multi-Core Texture Loading

In a previous post I discussed the basic ideas behind using multiple threads in an application to get better performance out of a multi-core machine.

Now before I begin, I need to disclaim some things, because I get very nervous posting anything involving hardware. This blog is me running my mouth, not buying advice; if you are the kind of person who would be grumpy if you bought a $3000 PC and found that it wouldn’t let you do X with X-Plane (where X includes run at a certain rendering setting, framerate, or make your laundry) my advice is very simple: don’t spend $3000. So…

  • I do not advocate buying the biggest, fastest system you can get; you pay a huge premium to be at the top of the hardware curve, particularly for game-oriented technologies like fast-clock CPUs and high-end GPUs.
  • I do not advocate buying the Mac Pro with your own money; it’s too expensive. I have one because my work pays for it.
  • 8 cores are not necessary to enjoy X-Plane. See above about paying a lot of money for that last bit of performance.

Okay…now that I have enough crud posted to be able to say “I told you so”…

My goal in reworking the threading system inside X-Plane for 920 (or whatever the next major patch is called) is, among other things, to get X-Plane’s work to span across as many cores as you have, rather than across as many tasks as are going on. (See my previous post for more on this.)

Today I got just one bit of the code doing this: the texture loader. The texture loader’s job is to load textures from the hard drive to the video card (using the CPU, via main memory) while you fly. In X-Plane 901 it will use up to one core to do this, and that core is also shared with building forests and airports.

With the new code, it will load as many textures at a time as it can, using as many cores as you have. I tested this on RealScenery’s Seattle-Tacoma custom scenery package – the package is an ENV with about 1.5 GB of custom PNGs, covering about half of the ENV tile with non-repeating orthophotos.

On my Mac Pro, 901 will switch to KSEA from LOWI in about one minute – the vast majority of the time is spent loading about 500 PNG files. The CPU monitor shows one core maxed out. With the new code, the load takes fourteen seconds, with all eight cores maxed out.
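To put those two timings side by side, here is a small back-of-the-envelope calculation. It assumes "about one minute" means 60 seconds, and the function names are made up for illustration; it is a sketch of the arithmetic, not a benchmark.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new load is than the old one."""
    return old_seconds / new_seconds


def parallel_efficiency(observed_speedup: float, cores: int) -> float:
    """Fraction of the ideal N-times speedup that was actually achieved."""
    return observed_speedup / cores


# Figures from the KSEA test; "about one minute" is taken as 60 s (an assumption).
s = speedup(60.0, 14.0)
print(f"speedup: {s:.1f}x, efficiency on 8 cores: {parallel_efficiency(s, 8):.0%}")
```

That works out to roughly a 4.3x speedup, or a bit over half of the ideal 8x. The rest presumably goes to work that cannot be spread out, such as waiting on the disk.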

(This also means that the time from when the scenery shifts to when the new scenery has its textures loaded would be about fourteen seconds, rather than a minute, which means very fast flight is unlikely to get to the new area before the textures are loaded and see a big sea of gray.)
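The general shape of a loader that spreads out over every core can be sketched as a worker pool sized to the core count. This is not X-Plane's code: it is a minimal Python illustration in which `zlib.decompress` stands in for PNG decoding, and `decode_texture` and `load_textures` are hypothetical names.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor


def decode_texture(compressed: bytes) -> bytes:
    """Stand-in for PNG decoding: inflate one compressed blob."""
    return zlib.decompress(compressed)


def load_textures(blobs, workers=None):
    """Decode every blob, using one worker per core unless told otherwise."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands blobs to whichever worker is free and keeps input order.
        return list(pool.map(decode_texture, blobs))


# Sixteen fake "textures" of 64 KB each, compressed so there is work to do.
fake_textures = [zlib.compress(bytes([i]) * 65536) for i in range(16)]
decoded = load_textures(fake_textures)
print(len(decoded), "textures decoded")
```

The point of the design is that the pool size comes from the machine, not from the task: the same code uses two workers on a dual-core box and eight on an eight-core one. (In CPython, threads only overlap here because `zlib` releases the interpreter lock while inflating; pure-Python decoding would need processes instead.)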

Things to note:

  • Even if we don’t expect everyone to have eight cores, knowing that the code can run on a lot of cores proves the design – the more the code can “spread out” over a lot of cores, the more likely the sim will use all hardware available.
  • Even if you only have two or four cores, there’s a win here.
  • Texture load time is only a factor for certain types of scenery; we’ll need to keep doing this type of work in a number of cases.

This change is the first case where X-Plane will actually spread out to eight cores for a noticeable performance gain. Of course the long-term trend will be more efficient use of multi-core hardware in more cases.

Posted in Development, Scenery by | 6 Comments

A Tale of Three Operating Systems, Part II (Why You Need Bootcamp)

A while ago I put three operating systems on my laptop. With the Mac Pro I’ve done the same thing – it’s a huge win to be able to cover such a wide swath of OS/GPU/CPU combinations with fewer machines. Last time it was OS X 10.4, Windows XP SP2, and Ubuntu 6.06. This time I repeated the process with OS X 10.5.2, Windows Vista RTM, and Ubuntu 8.04. Random observations:

  • Linux really just keeps getting stronger. I’ve always been a bit skeptical about Linux as a desktop environment, particularly as a Windows/Mac developer (that is to say, I’m spoiled by free high quality IDEs and debuggers that require no setup to use the platform SDK, comprehensive platform documentation in one location, etc.). But Linux installation is becoming more plug & play and trouble-free each time I make myself a live CD.
  • Windows Vista is a train wreck. I feel a little bit lame blogging this, as taking pot-shots at Vista is sort of like shooting fish in a barrel. But the contrast between Ubuntu, which has become easier to use over a year and a half, and Windows, which has not, is stark.
  • There are some positive things to say about Vista. The partition-aware installer is a real convenience for multi-booters. And once you figure out where everything has been moved to and go back to “classic” views, the OS is tolerable. But you’ll still find plenty of things that will make you want to tear your hair out. My recommendation: stick with XP. (Duh.)

Now on to the performance numbers. These numbers are the Xp900 time demo fps tests 1, 2 and 3. Each set of 3 numbers is from the three phases.

       Test 1        Test 2        Test 3
MAC    49/ 60/ 62    38/ 43/ 44    21/ 20/ 21
WIN   121/128/133   114/115/119    77/ 75/ 82
LIN   143/144/157   130/123/132    92/104/113

That’s not a typo. Linux is beating out Vista, but both are absolutely killing OS X. What’s going on here? I don’t know. But there appears to be something that isn’t well optimized in the GeForce 8 drivers on OS X.
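To see how big the gap is, the numbers from the table can be averaged per test and compared against OS X. This is just arithmetic on the figures above; the helper names are made up for illustration.

```python
from statistics import mean

# Time demo results copied from the table: three phases per test, in fps.
RESULTS = {
    "MAC": [(49, 60, 62), (38, 43, 44), (21, 20, 21)],
    "WIN": [(121, 128, 133), (114, 115, 119), (77, 75, 82)],
    "LIN": [(143, 144, 157), (130, 123, 132), (92, 104, 113)],
}


def mean_fps(os_name):
    """Average fps over the three phases of each test."""
    return [mean(phases) for phases in RESULTS[os_name]]


def ratio_vs_mac(os_name):
    """Per-test mean fps relative to OS X on the same machine."""
    return [other / mac for other, mac in zip(mean_fps(os_name), mean_fps("MAC"))]


for name in ("WIN", "LIN"):
    print(name, [f"{r:.1f}x" for r in ratio_vs_mac(name)])
```

On these figures both Vista and Linux come out at more than twice the OS X framerate in every test, and the gap widens on the heavier tests.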

I suspect Apple will close this gap eventually; don’t bother asking me for status information on this because if they ever tell me what’s going on, I’ll be bound by NDA not to tell you.

For now my recommendation is: consider dual-booting into Linux – it’s pretty easy to install Ubuntu and you’ll get great X-Plane performance. With good drivers, the Mac Pro and 8800 are just monstrous.

Posted in Development by | 5 Comments

Threads and Cores

Now that multi-core machines are mainstream, you’ll hear a lot of talk about “threads”. What is a thread, and how does it relate to using more cores?

Definitions

A “core” is an execution unit in a CPU capable of doing one thing. So an 8-core machine might have two CPUs, each with four cores, and it can do eight tasks at once.

A “thread” is a single stream of work inside an application – every application has at least one thread. Basically a two-threaded application can do two things at once (think of driving and talking on your cellular phone at the same time).

Now here’s the key: only one core can run a thread at one time. In other words, if you have an eight core machine and a one-thread application, only one core can run that application, and the other seven cores have to find something else to do (like run other applications, or do nothing).

Two more notes: a thread can be “blocked” – this means it’s waiting for something to happen. Blocked threads don’t use a core and don’t do anything. For example, if a thread asks for a file from disk, it will “block” until the disk drive comes up with the data. (By CPU standards, disk drives are slower than snails, so the CPU takes a nap while it waits.)

So if you want to use eight cores, it’s not enough to have eight threads – you have to have eight unblocked threads!

If there are more unblocked threads than cores, the operating system makes them take turns, and the effect is for each of them to run slower. So if we have an application with eight unblocked threads and one core, it will still run, but at one eighth the speed of an eight core machine.

It’s not quite that simple; there are overheads that come into play. But for practical purposes we can say:

  • If you have more unblocked threads than cores, the execution speed of those threads slows down.
  • If you have more cores than unblocked threads, some of those cores are doing nothing.
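Those two rules of thumb can be written down as a tiny model. It ignores the overheads mentioned above, and the function names are made up for illustration.

```python
def per_thread_speed(cores: int, unblocked_threads: int) -> float:
    """Fraction of a full core each unblocked thread gets, ignoring overhead."""
    if unblocked_threads == 0:
        return 0.0
    return min(1.0, cores / unblocked_threads)


def idle_cores(cores: int, unblocked_threads: int) -> int:
    """Cores left with nothing to do."""
    return max(0, cores - unblocked_threads)


print(per_thread_speed(1, 8))   # eight threads on one core: 0.125 each
print(per_thread_speed(8, 8))   # eight threads on eight cores: 1.0 each
print(idle_cores(8, 1))         # one thread on eight cores: 7 idle
```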

Trivial Threads

When a thread is blocked, it does not use any cores. So while X-Plane has a lot of threads, most of them are blocked either most or all of the time. For all practical purposes we don’t need to count them when asking “how many cores do we use”. For example, G1000 support is done on a thread so that we keep talking to the G1000 even if the sim is loading scenery. But the G1000 thread spends about 99.9% of its time blocked (waiting for the next time it needs to talk) and only 0.1% actually communicating.
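A mostly-blocked thread looks something like the sketch below. It is not the G1000 code; it is a generic Python worker (with hypothetical names) that sleeps inside `queue.get()` until there is something to send, and so uses no core in between.

```python
import queue
import threading


def device_worker(inbox: queue.Queue, sent: list) -> None:
    """Mostly-blocked worker: waits in get() until there is something to send."""
    while True:
        message = inbox.get()   # blocks here, using no core, until work arrives
        if message is None:     # sentinel value: shut down
            break
        sent.append(f"sent {message}")


inbox = queue.Queue()
sent = []
worker = threading.Thread(target=device_worker, args=(inbox, sent))
worker.start()

inbox.put("altitude=5000")   # wakes the worker briefly
inbox.put(None)              # ask it to exit
worker.join()
print(sent)
```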

What Threads Are Floating Around

So with those definitions, what threads are floating around X-Plane? Here’s a short list from throwing the debugger on X-Plane 9.0. (Some threads may be missing because they are created as needed.)

  • X-Plane’s “main” thread which does the flight model, drawing, and user interface processing.
  • A thread that can be used to debug OpenGL (made by the video driver, it blocks all the time).
  • Two “worker” threads that can do any task that X-Plane wants to “farm out” to other cores. (Remember, if we want to use more cores, we need to use more threads.)
  • The DSF tile loader (blocks most of the time, loads DSF tiles while you fly).
  • At least 3 threads made by the audio driver (they all block most of the time).
  • At least four threads made by the operating system’s user interface code (they block most of the time).
  • The G1000 worker thread (blocks most of the time, or all the time if you don’t have the G1000 support option).
  • The QuickTime thread (only exists when QuickTime recording is going on).

So if there’s anything to take away from this it is: X-Plane has a lot of threads, but most of them block most of the time.

Core Use Now

So how many cores can we use at once? We only need to look at threads that aren’t blocked to add it up. In the worst flying case I can think of:

  1. The main thread is rendering while
  2. The DSF tile loader is loading a just-loaded tile while
  3. One of the pool threads is building forests while
  4. You are recording a QuickTime movie (so the QT thread is compressing data).

Yep. If you really, really put your mind to it, you can use four cores at once. 🙂 Of course, two cores is a lot more common (DSF tile loading or forests, but not both at once, and no QuickTime).

Core Use In the Future

Right now some of X-Plane’s threads are “task” oriented (e.g. this thread only loads DSF tiles), while others can do any work that comes up (the “pool threads”; it’s like having a pool car at the company: anyone can take one as needed). The problem with this scheme is that sometimes there will be too many threads and sometimes too few.

  • If you have a dual-core machine, having forests building and DSF loading at the same time is bad – with the main thread that’s three threads, two cores; each one runs at two-thirds speed. But you don’t want the main thread to slow down by 33%; that’s a fps hit.
  • If you have a four-core machine, then when the DSF tile is not loading, you have cores being wasted.

Our future design will allow any task to be performed on a “pool thread”. The advantage of this is that we’ll execute as many tasks as we have cores. So if you have a dual-core machine, when a DSF tile load task comes along while forests are being built, the one pool thread will alternate tasks, leaving one core to do nothing but render (at max fps). If you have a four-core machine, the DSF load and forests can run at the same time (on two pool threads) and you’ll have faster load times.*

* Who cares about load time if you’re flying? Well, if you crank up the settings a lot and fly really fast, the loader can get behind, and you’ll see trees missing. X-Plane is always building trees in front of you as you fly and deleting the ones behind you. So using more cores to build the forests faster means you’re less likely to fly right out of the forest zone at high settings.
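The pool-thread idea can be sketched like this. It is an illustration only: the sizing rule (one pool thread fewer than there are cores, so the main thread keeps a core to itself) is my reading of the dual-core example, and `load_dsf_tile`, `build_forest`, and the tile name are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def pool_size(cores: int) -> int:
    """Assumed policy: leave one core for the main (rendering) thread."""
    return max(1, cores - 1)


def load_dsf_tile(tile: str) -> str:
    return f"loaded {tile}"          # placeholder for real loading work


def build_forest(tile: str) -> str:
    return f"forest for {tile}"      # placeholder for real forest building


def run_background_tasks(tasks, cores: int):
    """Run (function, argument) pairs on a shared pool; results keep task order."""
    with ThreadPoolExecutor(max_workers=pool_size(cores)) as pool:
        futures = [pool.submit(fn, arg) for fn, arg in tasks]
        return [future.result() for future in futures]


tasks = [(load_dsf_tile, "+47-123"), (build_forest, "+47-123")]
print(run_background_tasks(tasks, cores=2))   # one pool thread takes the tasks in turn
print(run_background_tasks(tasks, cores=4))   # enough pool threads to run both at once
```

Any task can land on any pool thread, so the number of things running at once follows the hardware rather than the list of task types.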

Posted in Development by | Comments Off on Threads and Cores

Hardware Profiles

X-Plane 9 currently recognizes (roughly) three categories of graphics hardware:

  • Non-Pixel-Shader hardware (GeForce 2,3,4, Radeon 7000-9200)
  • First-Generation Shader Hardware (GeForce FX 5nnn, Radeon 9500-9800, X300-X600)
  • Later-Generation Shader Hardware (GeForce 6, 7, 8, Radeon X200, Radeon X700+)

That first bucket is pretty simple: those cards don’t support programmable pixel shaders (as we know them today) and can’t run any shader effects. The “use pixel shaders” check box doesn’t appear in the rendering settings.

The distinction between the latter two is a little bit more subtle. Basically the first generation of pixel shader cards (the 9700 and friends) support only 96 instructions for each pixel shader; this puts us right on the edge of sometimes not being able to draw all of our effects; we have to simplify the water slightly to make it work. The next generation of chips (X850 and friends) doesn’t have this limitation.

By comparison, while NVidia cards have been able to handle long shaders from day one, the GeForce 5’s shader performance is really poor.

So we bucket all of these chips as “first-gen”. When we detect this we:

  • Simplify shaders slightly (gets us out of trouble with the 9700).
  • Don’t default to shaders being on when the sim is first booted (because the framerate will probably be unusably slow).

Even though the 9700 provided very usable shader performance in its day, by the standards of modern GPUs, this older chip isn’t that fast, so it’s probably for the best that we not enable reflective water by default on machines with these cards.
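The bucketing described above can be sketched as a small classifier. This is not X-Plane's detection code: the `Card` fields are hypothetical, and the only number taken from the text is the 96-instruction limit; the other instruction counts are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    NO_SHADERS = "non-pixel-shader"
    FIRST_GEN = "first-generation shader"
    LATER_GEN = "later-generation shader"


@dataclass
class Card:
    name: str
    has_pixel_shaders: bool
    max_shader_instructions: int = 0   # 0 when there are no shaders
    slow_shaders: bool = False         # e.g. the GeForce FX family


def classify(card: Card) -> Bucket:
    """Put a card in one of the three buckets."""
    if not card.has_pixel_shaders:
        return Bucket.NO_SHADERS
    if card.max_shader_instructions <= 96 or card.slow_shaders:
        return Bucket.FIRST_GEN
    return Bucket.LATER_GEN


def shaders_on_by_default(card: Card) -> bool:
    """First-gen cards can run shaders, but do not start with them enabled."""
    return classify(card) is Bucket.LATER_GEN


cards = [
    Card("GeForce 4", has_pixel_shaders=False),
    Card("Radeon 9700", True, max_shader_instructions=96),
    Card("GeForce FX", True, max_shader_instructions=1024, slow_shaders=True),
    Card("GeForce 8", True, max_shader_instructions=65536),
]
for card in cards:
    print(card.name, "->", classify(card).value, "| default on:", shaders_on_by_default(card))
```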

By comparison, X-Plane deals with almost all other capabilities on an a-la-carte basis; particular features are enabled if the right menu of hardware features is available. We do this to try to deal more flexibly with the wide variety of cards that are out there. Some examples:

  • You’ll get hardware accelerated runway lights if your card supports pixel shaders and sprites (virtually all shader-enabled cards have sprites).
  • You’ll get sun-glare effects if your card supports pixel counting (virtually all modern cards can do this).
  • The non-pixel-shader rendering code will show more detail if your card supports more texture units (this is only an issue with very old hardware).
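The a-la-carte approach amounts to deriving each effect from the capabilities it needs, rather than from a card's bucket. A minimal sketch, with hypothetical capability names and an assumed texture-unit threshold (the text does not give one):

```python
MIN_UNITS_FOR_DETAIL = 4   # assumed threshold, for illustration only


def enabled_features(caps: dict) -> dict:
    """Map detected hardware capabilities to the effects that can be switched on."""
    return {
        # Needs both pixel shaders and sprites.
        "hw_runway_lights": caps.get("pixel_shaders", False)
        and caps.get("point_sprites", False),
        # Needs the card to count drawn pixels.
        "sun_glare": caps.get("pixel_counting", False),
        # The non-shader path gets richer with more texture units.
        "detailed_non_shader_path": caps.get("texture_units", 1) >= MIN_UNITS_FOR_DETAIL,
    }


modern = enabled_features(
    {"pixel_shaders": True, "point_sprites": True, "pixel_counting": True, "texture_units": 8}
)
old = enabled_features({"texture_units": 2})
print(modern)
print(old)
```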

I’ve been looking over hardware profiles a lot lately, and I suspect that the next big “jump” in hardware will be the DX10-compliant cards (GeForce 8, Radeon HD). There’s a lot of fine print in what the various cards can do between all of the pre-DX10 cards; at some point when we decide what menu of features we’ll require for rendering, we need to simplify.

My guess is that when we start to have “really advanced” pixel shaders that require hardware more sophisticated than what we need now, we’ll simply require a DX10 card. Otherwise we’ll have to sort through 8 different profiles of fine print, only to attempt to partially support cards that probably won’t be fast enough anyway.

(That is to say, a feature is only useful for us if it can run reasonably quickly. It doesn’t make sense for us to try to make a special “simplified” version of a rendering feature for, say, the X850 if every X850 is going to turn it off every time for framerate reasons.)

If any of this turns into hardware buying advice, I suppose it would be this:

  • If you are deciding between a DX10 and DX9 card (e.g. between the HD2400 and X1900, or GeForce 8600 vs 7900) go for the newer generation DX10 cards (HD or GeForce 8); if the card has decent performance you’ll also be setting yourself up for future features.
  • As always, pay attention to the fine print of the model numbers, particularly the configuration. Lower model number cards basically have fewer parallel components than higher number ones and that leads directly to lower framerate.

I see an ad online for the GeForce 8500 for $70 and 8600 for $90. But if you look at the links below, you’ll see that the 8500 has only half the shaders of the 8600 – that’s going to be a huge performance difference for $20.

(So the moral of this story is: try to get an HD or GeForce 8 card, but don’t dip into the really low end cards because they’re stripped down too far for X-Plane use.)

These pages (NV, ATI) on Wikipedia list specs for a whole pile of cards and can be useful to decode the fine print.

Posted in Development by | Comments Off on Hardware Profiles

GeForce 7 and Water Performance

A number of Windows and Linux GeForce 7 users have discovered that the command-line option –no_fbos improves their pixel-shader framerate a lot. Windows and Linux Radeon HD users have also discovered that –no_fbos cleans up artifacts in the water. Here’s what’s going on, at least as far as I can tell. (Drivers are black boxes to us app developers, so all we can do is theorize based on available data and often be proved wrong.)

Warning: this is going to get a bit technical.

FBO stands for framebuffer object, and simply put, it’s an OpenGL extension that lets X-Plane build dynamic textures (textures that change per frame) by drawing directly into the texture using the GPU. Without FBOs we have to draw to the main screen and copy the results into the dynamic texture. (You don’t see the drawing because we never tell the card “show the user”.)

We like FBOs for a few reasons:

  • Most importantly, FBOs allow us to draw big images in one pass even if the screen is small. For example, if we have a 1024×1024 dynamic texture but the screen is 1024×768, then without FBOs we have to draw the image in two parts and stitch it together. That sucks. With FBOs we can just draw straight to the texture and not worry about our “workspace” being smaller than our texture. This is going to become a lot more important for future rendering features where we need really-frickin’ big textures.
  • It’s faster to draw to the texture than to copy to it.
  • If you’re running the sim with FSAA, then we end up using FSAA to prepare all of those dynamic textures. In virtually all cases, we don’t need the quality improvements of FSAA, so there’s no point in taking the performance penalty. When we render right into the texture, FSAA is bypassed and we prep our dynamic textures a lot faster.
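The "two parts" in the first point is just a tiling count: how many screen-sized pieces it takes to cover the texture. A quick sketch of that arithmetic (the function name is made up, and the second example size is only for illustration):

```python
import math


def passes_without_fbo(tex_w: int, tex_h: int, screen_w: int, screen_h: int) -> int:
    """Screen-sized tiles needed to cover a texture when drawing via the screen."""
    return math.ceil(tex_w / screen_w) * math.ceil(tex_h / screen_h)


print(passes_without_fbo(1024, 1024, 1024, 768))   # the example above: 2
print(passes_without_fbo(2048, 2048, 1024, 768))   # a bigger texture: 6
```

With an FBO the answer is always one pass, which is why the difference grows with texture size.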

Since copying to a texture from the screen predates these new-fangled FBOs by several years, most drivers can copy from the screen to the texture very quickly; however we have hit at least one case where FBOs are much faster than copy-from-screen. That’s really a rare bug, and as you’ll see below, we see more weird behavior with FBOs.

When do we use FBOs vs. copying? Well, it’s pretty random:

  • Pixel shader reflective water and fog use FBOs.
  • Cloud shadows and the sun reflection when pixel shaders are off do not use FBOs.
  • The airplane panel uses FBOs if the panel is 1024×1024 or smaller; if the panel is larger than 1024×1024 we draw from the screen and patch things together. So the P180 and the C172 are using different driver techniques!!

When you run X-Plane with –no_fbos, you instruct X-Plane to ignore the FBO capability of the driver, and we use copy-from-screen everywhere.
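For the panel case, the choice between the two paths boils down to something like the following. This is a sketch of the rule as described above, not the sim's code; the parameter names and the panel sizes in the examples are hypothetical.

```python
FBO_PANEL_LIMIT = 1024   # panels up to 1024x1024 go through the FBO path


def panel_uses_fbo(panel_w: int, panel_h: int,
                   no_fbos: bool = False, driver_has_fbos: bool = True) -> bool:
    """True if the panel texture is drawn via an FBO, False for copy-from-screen."""
    if no_fbos or not driver_has_fbos:
        return False   # --no_fbos (or a driver without FBOs) forces the copy path
    return panel_w <= FBO_PANEL_LIMIT and panel_h <= FBO_PANEL_LIMIT


print(panel_uses_fbo(1024, 1024))                 # small panel: FBO
print(panel_uses_fbo(2048, 1024))                 # large panel: copy and patch
print(panel_uses_fbo(1024, 1024, no_fbos=True))   # flag set: copy everywhere
```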

Mipmapping

There is one more element: mipmapping. A mip map is a series of smaller versions of a texture. Mipmapping allows the video card to rapidly find a texture that is about the size it needs. Here’s an example: imagine you have a building with a 128×128 texture. If you park your plane by the building, the building might take up about 100×100 pixels on the screen; your 128×128 texture is a good fit.

Now taxi away from the building and watch it get smaller out your rear window. After a while the building is only taking up 8×8 pixels. What good is that 128×128 texture? It’s much too big for the job. With mipmapping, the card has a bunch of reduced-size versions of your texture lying around…64×64, 32×32, 16×16, 8×8, 4×4, 2×2, 1×1. The video card realizes the building is tiny and grabs the 8×8 version.

Why not just use the 128×128 texture? Well, we’d only have two options with this texture:

  1. Examine all 16384 pixels of the texture to compute the 64 pixels on screen. That sucks…we’re accessing VRAM 256 times for each pixel. Accessing VRAM is slow, so this would kill performance.
  2. Simply pick 64 pixels out of the 16384 based on whatever is nearby. This is what the card will do if mipmapping is not used (because option 1 is too slow) and it looks bad. Those 64 pixels may not be a good representation of the 16384 that make up your building side.

So mipmapping lets the video card grab a small number of pixels that still capture everything we need to know about the building at low res.
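The mip chain and the "grab the one that fits" step can be sketched in a few lines. Real hardware picks (and blends) levels per pixel with more care than this; the sketch only shows the idea, and the function names are made up.

```python
def mip_chain(size: int) -> list:
    """Edge lengths of a square texture's mip levels, largest first."""
    levels = [size]
    while levels[-1] > 1:
        levels.append(levels[-1] // 2)
    return levels


def pick_level(size: int, on_screen: int) -> int:
    """Smallest mip level that still has at least `on_screen` pixels per edge."""
    candidates = [edge for edge in mip_chain(size) if edge >= on_screen]
    # If the object is drawn larger than the texture, the full-size level is all we have.
    return candidates[-1] if candidates else size


print(mip_chain(128))         # [128, 64, 32, 16, 8, 4, 2, 1]
print(pick_level(128, 100))   # parked next to the building: 128
print(pick_level(128, 8))     # far away: 8
```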

We don’t mipmap our dynamic textures very often; the only ones that we do mipmap are the non-pixel-shader sun reflections and the pixel-shader sun reflections.

ATI

As far as we can tell, the current ATI Catalyst 8.1 drivers do not generate mipmaps correctly for an FBO-rendered texture. This is why without –no_fbos ATI users on Windows or Linux see very strange water artifacts. –no_fbos switches to the copy path, which works correctly.

At risk of further killing my track record of driver bugs in v9, we do think this is a bug. We have good contact with the ATI Linux driver guys so I have hopes of getting this fixed.

nVidia

It appears that the process of creating mipmaps for FBO textures is not accelerated by the hardware on the GeForce 7 GPU series. This is why GeForce 7 users are seeing such poor pixel shader performance, while GeForce 8 users aren’t having problems.

Now poor performance is not a bug; there’s nothing in the OpenGL spec that says “your graphics card has to do this function wickedly fast”. Nonetheless, what we’re seeing now is unusably slow. So more investigation is needed — given that the no-FBO case runs so quickly, I suspect the hardware itself can do what we want and it’s just a question of the driver enabling the functionality. But I don’t know for sure.

Posted in Development by | 3 Comments

The Limits of Orthophotos and Meshes in X-Plane

I get asked a lot about the limits of meshes and orthophotos in X-Plane. I’ll try to answer this, but the answer isn’t as simple as most people expect.

Texture Limits and Orthophotos

The maximum single texture size in X-Plane 8 is 1024×1024, and in X-Plane 9 it is 2048×2048.

I believe the maximum number of unique custom orthophotos that can be attached to a single DSF is at least 32768.

In practice, that number is pretty useless because X-Plane loads all textures for a DSF at the highest user-allowed res when the DSF is loaded. That means you tend to load a lot of textures. Every system is different and drivers have a lot to do with RAM efficiency, but generally you’ll run out of virtual address space and crash the sim before you can attach 32768x2048x2048 of pixels.
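To get a feel for how far out of reach that number is, here is the arithmetic. The bytes-per-pixel figures are assumptions (4 for uncompressed RGBA, 0.5 for DXT1-compressed DDS), mipmaps are ignored, and the 4 GiB figure is simply the size of a 32-bit address space.

```python
GIB = 1024 ** 3


def texture_bytes(count: int, width: int, height: int, bytes_per_pixel: float) -> float:
    """Raw pixel storage for `count` textures, ignoring mipmaps and overhead."""
    return count * width * height * bytes_per_pixel


uncompressed = texture_bytes(32768, 2048, 2048, 4)     # assumed RGBA, 4 bytes per pixel
compressed = texture_bytes(32768, 2048, 2048, 0.5)     # assumed DXT1, 4 bits per pixel
print(uncompressed / GIB, "GiB uncompressed")
print(compressed / GIB, "GiB compressed")

# How many uncompressed 2048x2048 textures fill a 32-bit address space outright?
print(int(4 * GIB // texture_bytes(1, 2048, 2048, 4)), "textures in 4 GiB")
```

Either way the total is tens to hundreds of gigabytes, so memory runs out long before the texture count does.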

X-Plane has no limits on how the texturing is applied – that is, you can use your 2048×2048 texture to cover an entire tile or a single meter. So again, the limiting factor on the resolution of your orthophotos is how much total area you want to cover and how much RAM you can spend (remember RAM is also used for mesh complexity, 3-d models, etc.).

You do not need to have enough VRAM to hold all loaded orthophotos; the video driver will page the textures into VRAM. Virtual address space is the limiting factor. How far you push it depends on a lot of subjective things:

  • If you expect your users to also run with a lot of trees, 3-d objects, cars on roads, and some plugins, you can’t use a lot of RAM.
  • If you expect your users to have /3GB in their boot.ini and use nothing but your add-on, you can use a lot more RAM.

Generally the size of the DDS texture on disk is a good proxy for the virtual memory that is required to hold your textures.

It should be noted that these limits on texturing (due to X-Plane blindly loading a lot of stuff at once) affect all scenery types: objects, draped polygons, very complex airplanes, plugins, and not just terrain mesh orthophotos.

Getting Past the Texture Limit

It will take a future extension to the rendering engine to get past the current limits. Basically X-Plane will have to load textures at lower resolutions when they’re farther away. I don’t know when that is coming, but when it happens, it will increase the total amount of image data a DSF mesh can contain, because the limiting factor will be how much data is in the small area the user is looking at (since the rest can be stored at much lower res for far-away views). At that point the limiting bottleneck will be resolution (smaller means more data at once), not total image data.

Mesh Limits

Unfortunately, limits to the mesh are even more vague than limits to texture usage. X-Plane uses an adaptive mesh – basically you can put your vertices wherever you want. So the highest resolution you can achieve might be much smaller than 1 meter resolution, but you can only do this for a small area before the total mesh size gets too big. But this is okay – the intention of DSF is to let you put a lot of detail where you need it.

I believe that once again memory provides the first limitation to the mesh. That is – you’ll run out of memory loading your insanely huge mesh long before you hit a limit to the DSF container structure. And once again, even the RAM limit isn’t a hard limit because that virtual address space is shared with textures. Your mesh density limits actually go down when your textures go up because it’s a zero-sum game.

Estimating Memory

Here are some ideas on how to estimate your memory footprint:

  • Run X-Plane over ocean to get an idea of the baseline memory use that the sim needs without extra scenery.
  • Load your mesh without textures (move the textures away) to find the cost of the mesh itself. (I am going on the assumption here that you can rescale your mesh using whatever mesh generation tool you’re using).
  • The size of DDS textures is a good proxy for the memory used.
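The last point is easy to automate: add up the size of every DDS file in the package. A minimal sketch, with made-up function names; the demo at the bottom uses placeholder files in a temporary folder rather than a real scenery package.

```python
import tempfile
from pathlib import Path

MIB = 1024 * 1024


def dds_bytes(scenery_dir) -> int:
    """Total size on disk of every .dds file under scenery_dir."""
    return sum(
        path.stat().st_size
        for path in Path(scenery_dir).rglob("*")
        if path.is_file() and path.suffix.lower() == ".dds"
    )


def estimated_footprint_mib(mesh_only_mib: float, scenery_dir) -> float:
    """Memory measured with textures moved away, plus the DDS size as a proxy."""
    return mesh_only_mib + dds_bytes(scenery_dir) / MIB


# Tiny self-contained demo with placeholder files instead of a real package.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "a.dds").write_bytes(b"\0" * 1000)
    (Path(demo_dir) / "b.DDS").write_bytes(b"\0" * 500)
    (Path(demo_dir) / "readme.txt").write_bytes(b"\0" * 200)
    demo_total = dds_bytes(demo_dir)

print(demo_total, "bytes of DDS in the demo folder")
```

The mesh-only measurement already includes the sim's baseline, so comparing it with the over-ocean figure tells you what the mesh itself costs.
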
Posted in File Formats, Scenery by | 2 Comments

Performance Wrap-up (for now)

The story on X-Plane performance is never over, but the chapter that is 9.00 pretty much is. I think we’ll be RC in the next build (if all goes well). Certainly a lot of the things that are still performance “problems” will require changes larger than we can do in a late beta.

I say problems in quotes because a lot of what’s been reported lately is in the form of: a huge screen res + a lot of shaders + a lot of FSAA = slow fps. That’s not really a bug, that’s an engine limitation. Now I want to make the engine as fast as possible, and a lot of this pixel shader stuff is new to 9.0, so if our track record for tuning stays the way it was for v8, we’ll probably get some efficiency improvements later.

But unfortunately there’s an underlying limitation: the new water and fog both cause the rendering engine to consume significantly more hardware resources than it would otherwise. Turn them on and you get prettier pictures at a price.

Just to post a few general things I’ve found:

  • X-Plane 9 will tell you where your GPU really stands. GPUs that were very adequate for X-Plane 8 (like the GeForce 6600 GT) will turn out to have nothing left in reserve for v9, while GPUs that were bored in v8 (the GeForce 8800 GTX for example) will show what they really have.
  • Generally the cost of going from no shaders to shaders with water reflections of “none” and no volumetric fog should be very low if your screen res and FSAA don’t add up to something crazy (like 16x FSAA at 2048×2048).
  • If you do have serious performance hits, try –no_fbos in the command-line; some drivers seem to have trouble with them.
  • The P180’s virtual cockpit is a lot more expensive than the other ones, because it has a huge panel that is used in 3-d. We’ll hopefully rebuild the cockpit at some point.
  • Turning water reflections to “complete” is very expensive. Watch the water and use the lowest setting that looks good. You don’t need complete reflections if there are a lot of waves!
  • Shaders, FSAA, and screen size are all pulling from the same set of resources – be careful about cranking up all three.
  • Check your v-sync – a lot of users whose vsync clamped them at 60 fps in v8 will be clamped at 20 in v9.
  • Do your testing with texture res set low, then crank texture res later; pixel shaders also require the allocation of VRAM that can’t be purged (for things like reflection images) so running out of VRAM can show up in some weird ways performance-wise.
  • The new Intel iMacs have serious performance problems with shaders on. This is due to driver limitations; given the much better performance under BootCamp, I expect the Mac performance to get better when the drivers are updated. For now I’d keep shaders off.
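The v-sync point in the list above is worth a worked example, because the drop is not gradual. Assuming a 60 Hz display and plain double-buffered v-sync (both assumptions), every frame has to wait for a whole number of refreshes, so the framerate snaps to 60, 30, 20, 15 and so on:

```python
import math


def vsync_fps(unclamped_fps: float, refresh_hz: float = 60.0) -> float:
    """Framerate once every frame must wait for a whole number of refreshes."""
    refreshes_per_frame = math.ceil(refresh_hz / unclamped_fps)
    return refresh_hz / refreshes_per_frame


for raw in (75, 45, 25):
    print(raw, "fps without v-sync ->", vsync_fps(raw), "fps with v-sync")
```

So a machine that could manage 25 fps with v-sync off ends up at 20 with it on, which is why it is worth checking the setting before blaming the sim.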

For now, please hold off on sending me performance reports. I just don’t have time to address them. In the future I will try to solicit very specific performance data points that we need to check. Perhaps in the future we can also set up a database of fps-test results to have a more comprehensive idea of how the hardware does its job.

I expect future features to appear in v9 that further eat hardware; those features will have an off switch. You may have to pick and choose what graphics you enable; there is no free lunch here. I also expect new graphics cards to emerge that make the GeForce 8800 GTX seem quaint!

Posted in Development by | 2 Comments

Simple Optimizations for Airplanes and Objs

Just a few basic things:

For airplanes where you don’t want to show the PlaneMaker part (because you’ve rebuilt the plane visually using OBJs):

  • In X-Plane 9, set the parts to be invisible – this is faster than drawing them with a transparent texture.
  • In X-Plane 8, if the texture is transparent, please downsize it! A 1024×1024 texture that’s fully transparent is just a waste of VRAM!

For any version, any object, avoid using ATTR_diffuse_rgb when possible. You can get the same effect by tinting your texture and save unnecessary state thrash.


X-Plane Water – Now and the Future

Randy forwarded me a very detailed email message from the features list about water. First a few notes:

  1. I don’t read the features list – I’m only blogging this because Randy forwarded it my way.
  2. Procedural water (that is, procedural waves with the sky as a reflection) is a default shader option because when we first looked at it, it seemed to be “low cost”. Some hardware really chokes on this option. But one trend that’s clear: I have not seen any hardware that can do volumetric fog but has trouble with water. When it comes to expensive computations, vol-fog is the heavy effect. If your card can run with vol-fog even remotely well, you’re not going to get any kind of fps boost by turning off water. So the question of whether procedural water can be optional is one of whether the water looks good (which I will discuss below), not one of performance.
  3. I’ve received plenty of emails about how we can “cheat” on the reflections to make them faster. To be honest, these ideas aren’t very useful…you really need to know how the rendering engine works on a low level to find good ways to cheat on reflection quality.

(It’s not enough to make the reflection texture less good looking, you have to do so in a way that makes it take less time to render! We’ve basically already taken all the optimizations we can – that’s what that water reflection detail popup does. Keep it on a lower setting.)

So with that in mind I’d like to bring up three issues with the reflective water and give you some idea of the roadmap. I should say that when I list features as coming in the 9.xx or 10.xx time frame what I really mean is: some depend on new global scenery and some do not. It’s possible that we may do a ton of water work in 9.1 or we may not do any more for five years. I can’t really make a good prediction on future features after 9.0, except that they’ll be, um, really cool. 🙂

Water Weather Settings (9.00)

X-Plane currently provides a global wave height setting. X-Plane 9 beta 15 ignores this setting and simply creates calm water. This is simply a link-up between the physics and the shader that I had not gotten to until now. I believe that beta 16 will address this; the water wave height will match what you dial in. This still is only a baby step, but there will at least be a workaround for the most common complaint (the ocean looks like glass) in that you can set the wave height to 10 meters.

Water Properties (9.xx, 10.xx)

Now Peter brings up an important point: how the water looks varies with its location. First I must point out an architectural issue: you can’t do a great job of computing “fetch” (open runs where wind builds up waves) in the sim because the area adjacent to the current water may not be loaded. That is, if X-Plane doesn’t have the Pacific Ocean loaded, how can it know that the waves hammering San Francisco can be pretty big? So I have always viewed fetch as something to pre-compute into a DSF mesh.

(This does bring up an architectural issue…how can we have properties on open ocean with no DSF mesh? The answer might end up being that we do have to provide water-only DSFs for bays and inlets that are large enough to cover a whole tile.)
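As a rough illustration of why fetch wants the neighboring water loaded, here is a toy sketch that measures fetch along a single row of grid cells with the wind blowing from the west. The grid, the cell size, and the function name `fetch_from_west` are all assumptions made up for this example; this is not how the DSF fetch data is actually computed.

```python
def fetch_from_west(row, cell_km=1.0):
    """Return, for each cell, the length of unbroken water to its west.

    `row` is a list of booleans (True = water) ordered west to east.
    Hypothetical helper for illustration only.
    """
    fetch = []
    run = 0.0
    for is_water in row:
        if is_water:
            run += cell_km      # open water keeps building the run
            fetch.append(run)
        else:
            run = 0.0           # land resets the run
            fetch.append(0.0)
    return fetch


# A bay, a spit of land, then more water.
coast = [False, True, True, True, False, True]
print(fetch_from_west(coast))   # [0.0, 1.0, 2.0, 3.0, 0.0, 1.0]

# The "not loaded" problem: six cells of open ocean, but only the
# eastern three are in memory, so the computed fetch comes up short.
ocean = [True] * 6
full_fetch = fetch_from_west(ocean)[-1]         # 6.0
loaded_fetch = fetch_from_west(ocean[3:])[-1]   # 3.0
print(full_fetch, loaded_fetch)
```

The last two lines are the point: the same easternmost cell gets half the fetch when half the ocean is missing, which is why pre-computing the value with all the data on hand is attractive.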

Now it turns out that (secretly) the DSFs have contained a very crude version of fetch since 8.20. In version 8 we didn’t really have the shading power to visualize it (I did have some experimental code once) and in version 9 it is not yet hooked up. The fetch calculation isn’t very good, but it does exist. In the long term, we’ve talked within the company about including bathymetric data (water depth) and even water properties like clarity and the color of the goo in the water. Improving the water metadata would be a next-global-scenery feature, and we don’t recut global scenery very often.

(The DSF format is flexible…you can encode just about as much extra data into the mesh for water as you want – it’s not a format change.)

So I think you may see some improvement in the water as we utilize existing fetch data in the mesh, and some improvement as we encode more meta-data into the mesh. Note that to do any of this we need to change the sim a bit…right now it assumes a constant wave height – this would no longer be true. I think these kinds of improvements will start during the 9.x run but not be in 9.0.

Filtering Errors (9.xx)

This is the most technical issue with the water, relating to how the graphics hardware works. The problem is that the far view of the water comes out too reflective because of down-sampling.

In real life, if I am in front of a body of water with 6 inch waves, I will see two things:

  1. Near me, I will see the waves themselves. Part of the waves will be dark, because their surface normal faces me, so the Fresnel equation says I see down into the water, where I see the bottom (or if it’s deep enough, darkness). Part of the waves will be at an angle to me and act reflective, picking up colors from the sky and maybe surrounding terrain. So my waves are going to be a mix of the sky color and some kind of dark color, with the color contrast allowing me to see the shape of the wave.
  2. Farther out, the waves will be smaller than my eyes can distinguish and I’ll start to see a more consistent water color, which is a mix of all of the various sky color inputs and darkness. The particular mix might depend on the chop and shape of the waves and angle to the sky. The important thing is: there is scattering of the reflection at a level smaller than my visual acuity, so I see sky color, but I don’t see a sky reflection, and that color is darkened by the Fresnel equation.
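To put rough numbers on the Fresnel behavior described in the two points above, here is a small sketch using Schlick's approximation, a common stand-in for the full Fresnel equations. This is an assumption for illustration, not a statement about what X-Plane's shader does; the function names and the grey-scale sky and deep-water values are made up.

```python
def schlick_reflectance(cos_theta, n1=1.0, n2=1.33):
    """Schlick's approximation of Fresnel reflectance.

    cos_theta is the cosine of the angle between the surface normal and
    the direction to the viewer; n1 and n2 are the refractive indices of
    air and water.
    """
    r0 = ((n1 - n2) / (n1 + n2)) ** 2   # reflectance looking straight down
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5


def water_brightness(cos_theta, sky=0.8, deep=0.1):
    """Blend hypothetical sky and deep-water brightness by reflectance."""
    r = schlick_reflectance(cos_theta)
    return r * sky + (1.0 - r) * deep


# Normal facing the viewer: almost no reflection, so we see the dark water.
print(round(schlick_reflectance(1.0), 3))
# Surface seen at a grazing angle: mostly reflection, so we see the sky.
print(round(schlick_reflectance(0.05), 3))
```

The first value is small and the second is close to one, which is the dark-versus-reflective contrast that makes nearby wave shapes readable.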

Now in X-Plane the problem is this: the water waves are built up procedurally from a noise texture. As we get farther away, that wave texture is reduced in quality. Unfortunately, the graphics hardware averages together the wave shape, not the resulting color from the wave shape.

So instead of getting sky + deep = darker blue for the water, I get peak + trough = flat water! In other words, at a far distance the waves are canceling each other out before color lookup, giving us a perfect mirror in the far-ground.
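The order-of-operations problem can be shown with a toy 2D model: two wave facets, one tilted toward the viewer and one tilted away, seen from a low angle. Everything here is an assumption for illustration (the Schlick-style reflectance, the grey-scale sky and deep-water values, the projected-area weighting, and all the function names); it is a sketch of the idea, not the sim's shader.

```python
import math

SKY, DEEP = 0.8, 0.1   # hypothetical sky and deep-water brightness


def facet_cos(tilt_deg, eye_elev_deg):
    """Cosine between a facet's normal and the direction to the eye.

    tilt_deg > 0 tilts the facet toward the viewer; eye_elev_deg is the
    eye's elevation above the horizon as seen from the water.
    """
    return math.sin(math.radians(eye_elev_deg + tilt_deg))


def shade(tilt_deg, eye_elev_deg):
    """Color of one facet: blend sky and deep water by reflectance."""
    c = max(0.0, facet_cos(tilt_deg, eye_elev_deg))
    r = 0.02 + 0.98 * (1.0 - c) ** 5   # Schlick-style approximation
    return r * SKY + (1.0 - r) * DEEP


def filter_then_shade(tilts, eye_elev_deg):
    """What down-sampling the wave shape does: average shape, then color."""
    mean_tilt = sum(tilts) / len(tilts)
    return shade(mean_tilt, eye_elev_deg)


def shade_then_filter(tilts, eye_elev_deg):
    """Color each facet first, then average by how much of it is visible."""
    weights = [max(0.0, facet_cos(t, eye_elev_deg)) for t in tilts]
    total = sum(weights)
    if total == 0.0:
        return DEEP   # nothing faces the viewer in this toy model
    return sum(w * shade(t, eye_elev_deg) for w, t in zip(weights, tilts)) / total


tilts = [10.0, -10.0]   # a wave face toward the viewer and one away
eye = 2.0               # distant water, seen at a grazing angle

mirror = filter_then_shade(tilts, eye)
mixed = shade_then_filter(tilts, eye)
print(mirror == shade(0.0, eye))   # True: the tilts cancel to flat water
print(mixed < mirror)              # True: coloring first gives a darker result
```

In this toy setup the filter-first path returns exactly the flat-water color, while coloring each facet before averaging mixes in the dark water that the viewer-facing wave faces reveal.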

This is fundamentally an implementation problem – I bring it up only because it’s a counter-intuitive one. In the immediate future, the “glass lake” problem will become less noticeable because filtering only kicks in like this when the waves become smaller than one pixel – with the option for taller waves coming, the waves should be visible farther out. In the longer term we’ll probably put in new shading code to address filtering problems.

Reflection Positioning Bugs (9.0, 9.xx)

As of beta 15 I thought I had fixed most of the reflection positioning bugs (that’s what happens when something reflects in the wrong place in the water) – the geometry for this is made complicated by the Earth being round. I don’t expect to nip all of these bugs in 9.0 but I do hope to get most of them, and I will keep working on this as bug reports come in.

Wave Shape (9.xx)

Finally, our wave shapes are quite primitive – it’s just shaped procedural noise designed to look tolerably like water. We have a framework into which we can insert more complex wave equations (at the cost of some framerate). I don’t know what the future will bring in this area. The v9 water sets out a new foundation onto which we can do more complex water. But we have to crawl before we can walk.

Posted in Scenery

A New Broken Record

For years now I’ve been harping about ways to keep the number of batches down in your scenery. A batch is a single submission of triangles to the graphics card for drawing. Batches get rendered fast even if they contain a ton of triangles, but changing modes between batches is not very fast, so a few large batches is hugely better for performance than a large number of tiny batches.

To play that broken record one more time, there are two ways that you (a scenery designer) can cut down the number of batches:

  • Use a small number of larger textures instead of a large number of small textures, preferably sharing textures between similar scenery elements that are placed nearby. X-Plane will do its best to merge the content that uses those textures into single batches. We call this the “crayon rule.”
  • Use fewer attributes in your objects. Attributes usually require a new batch (after the graphics card mode has been changed due to the attribute). So if you’ve got 1000 attributes in your object, you’ve got a problem.
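A toy model makes the batch-counting idea above concrete: treat every change of texture between consecutive draws as the start of a new batch. The texture names and the `count_batches` helper are hypothetical, and the real renderer's merging rules are certainly more involved than this.

```python
from itertools import groupby


def count_batches(draws):
    """Count runs of consecutive draws that share the same texture.

    Each run stands in for one batch; a change of texture starts a new one.
    """
    return sum(1 for _ in groupby(draws))


# Five objects alternating between two small textures.
draws = ["house.png", "barn.png", "house.png", "barn.png", "house.png"]
print(count_batches(draws))           # 5: every draw forces a mode change

# Same objects, grouped so that like textures are drawn together.
print(count_batches(sorted(draws)))   # 2

# Same objects sharing one larger texture (the "crayon rule").
print(count_batches(["farm_atlas.png"] * 5))   # 1
```

Attributes behave the same way in this model: each one splits a run, so an object full of attributes can never collapse into a few large batches.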

Well, with X-Plane 9 I have a new broken record: avoid overdraw!

Overdraw is the process of drawing pixels on top of other pixels on screen. It happens any time we use blending to do translucency, and any time we use polygon offset to build the image in layers.

Overdraw is bad because with X-Plane 9’s pixel shaders, most users are slowed down by the graphics card’s ability to fill in pixels (pixel fill rate), with those complex shaders being run for every pixel. If you are at a screen-res of 1200×1024 looking at the ground with no objects, that might be 1.2 million pixels to fill. But if there is an overlay polygon covering the ground, we have 2.4 million pixels to fill! That’s a huge framerate hit.
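The arithmetic above is simple enough to write down. This sketch assumes every layer covers the whole screen, which is the worst case; the function name is made up for the example.

```python
def pixels_filled(width, height, layers=1):
    """Pixels the shader must run for when `layers` full-screen layers are drawn."""
    return width * height * layers


ground_only = pixels_filled(1200, 1024)                # 1,228,800
with_overlay = pixels_filled(1200, 1024, layers=2)     # 2,457,600
print(ground_only, with_overlay, with_overlay / ground_only)
```

If the card is limited by pixel fill rate, doubling the pixels roughly doubles the time spent filling them, which is why a single full-screen overlay hurts so much.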

Right now there’s not much you can do about overdraw. Once MeshTool comes out I will post some guidelines on how you can limit overdraw.

We took a step in the v9 global scenery to limit overdraw: in X-Plane 8 the global scenery tried to hide repetition of flat textures by drawing them over each other with offsets. In X-Plane 9 this is done in a pixel shader (i.e. the texture is analyzed and swizzled in the shader and then drawn once), cutting down the number of times we must draw.

If you turn pixel shaders on and off in a flat area like Kansas you might see this if you compare the screenshots – the farm textures are more repetitive without shaders. This gives faster fps to everyone (with or without shaders) by eliminating overdraw.

Posted in Development, Scenery