Randy forwarded me a very detailed email message from the features list about water. First a few notes:
- I don’t read the features list – I’m only blogging this because Randy forwarded it my way.
- Procedural water (that is, procedural waves with the sky as a reflection) is a default shader option because when we first looked at it, it seemed to be “low cost”. Some hardware really chokes on this option. But one trend is clear: I have not seen any hardware that can do volumetric fog but has trouble with water. When it comes to expensive computations, vol-fog is the heavy effect. If your card can run with vol-fog even remotely well, you’re not going to get any kind of fps boost by turning off water. So the question of whether procedural water can be optional is one of whether the water looks good (which I will discuss below), not one of performance.
- I’ve received plenty of emails about how we can “cheat” on the reflections to make them faster. To be honest, these ideas aren’t very useful…you really need to know how the rendering engine works on a low level to find good ways to cheat on reflection quality.
(It’s not enough to make the reflection texture less good looking, you have to do so in a way that makes it take less time to render! We’ve basically already taken all the optimizations we can – that’s what that water reflection detail popup does. Keep it on a lower setting.)
So with that in mind I’d like to bring up three issues with the reflective water and give you some idea of the roadmap. I should say that when I list features as coming in the 9.xx or 10.xx time frame what I really mean is: some depend on new global scenery and some do not. It’s possible that we may do a ton of water work in 9.1 or we may not do any more for five years. I can’t really make a good prediction on future features after 9.0, except that they’ll be, um, really cool. 🙂
Water Weather Settings (9.00)
X-Plane currently provides a global wave height setting. X-Plane 9 beta 15 ignores this setting and simply creates calm water. This is simply a link-up between the physics and the shader that I had not gotten to until now. I believe that beta 16 will address this; the water wave height will match what you dial in. This still is only a baby step, but there will at least be a workaround for the most common complaint (the ocean looks like glass) in that you can set the wave height to 10 meters.
Water Properties (9.xx, 10.xx)
Now Peter brings up an important point: how the water looks varies with its location. First I must point out an architectural issue: you can’t do a great job of computing “fetch” (open runs where wind builds up waves) in the sim because the area adjacent to the current water may not be loaded. That is, if X-Plane doesn’t have the Pacific Ocean loaded, how can it know that the waves hammering San Francisco can be pretty big? So I have always viewed fetch as something to pre-compute into a DSF mesh.
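To make the “not loaded” problem concrete, here is a minimal sketch of a fetch calculation on a grid of water and land cells. It is purely illustrative: the grid layout, the `fetch_cells` name and the cell-counting approach are assumptions for this example, not how X-Plane or the DSF tools compute fetch.

```python
def fetch_cells(grid, row, col, d_row, d_col):
    """Count open-water cells upwind of (row, col).

    grid is a list of strings: 'W' = water, 'L' = land (a made-up layout).
    (d_row, d_col) is the step toward the direction the wind comes from.
    Returns (cells, hit_edge). hit_edge is True when the walk ran off the
    loaded grid before reaching land, so the true fetch is unknown.
    """
    if d_row == 0 and d_col == 0:
        raise ValueError("wind step must be non-zero")
    cells = 0
    r, c = row + d_row, col + d_col
    while 0 <= r < len(grid) and 0 <= c < len(grid[r]):
        if grid[r][c] != 'W':
            return cells, False   # land ends the open run
        cells += 1
        r, c = r + d_row, c + d_col
    return cells, True            # ran out of loaded data


# Wind from the west, so we walk toward lower column numbers.
tile = [
    "LLLLLL",
    "LWWWWL",   # enclosed water: fetch is fully known
    "WWWWWL",   # water runs off the west edge of the tile
]
print(fetch_cells(tile, 1, 4, 0, -1))  # (3, False)
print(fetch_cells(tile, 2, 4, 0, -1))  # (4, True)
```

The second result is the San Francisco case: the walk reaches the edge of the loaded data while still over water, so the number it returns is only a lower bound.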
(This does bring up an architectural issue…how can we have properties on open ocean with no DSF mesh? The answer might end up being that we do have to provide water-only DSFs for bays and inlets that are large enough to cover a whole tile.)
Now it turns out that (secretly) the DSFs have contained a very crude version of fetch since 8.20. In version 8 we didn’t really have the shading power to visualize it (I did have some experimental code once) and in version 9 it is not yet hooked up. The fetch calculation isn’t very good, but it does exist. In the long term, we’ve talked within the company about including bathymetric data (water depth) and even water properties like clarity and the color of the goo in the water. Improving the water metadata would be a next-global-scenery feature, and we don’t recut global scenery very often.
(The DSF format is flexible…you can encode just about as much extra data into the mesh for water as you want – it’s not a format change.)
So I think you may see some improvement in the water as we utilize existing fetch data in the mesh, and some improvement as we encode more meta-data into the mesh. Note that to do any of this we need to change the sim a bit…right now it assumes a constant wave height – this would no longer be true. I think these kinds of improvements will start during the 9.x run but not be in 9.0.
Filtering Errors (9.xx)
This is the most technical issue with the water, relating to how the graphics hardware works. The problem is that the far view of the water comes out too reflective because of how the wave texture is down-sampled.
In real life, if I am in front of a body of water with 6 inch waves, I will see two things:
- Near me, I will see the waves themselves. Part of the waves will be dark, because their surface normal faces me, so the Fresnel equation says I see down into the water, where I see the bottom (or if it’s deep enough, darkness). Part of the waves will be at an angle to me and act reflective, picking up colors from the sky and maybe surrounding terrain. So my waves are going to be a mix of the sky color and some kind of dark color, with the color contrast allowing me to see the shape of the wave.
- Farther out, the waves will be smaller than my eyes can distinguish and I’ll start to see a more consistent water color, which is a mix of all of the various sky color inputs and darkness. The particular mix might depend on the chop and shape of the waves and angle to the sky. The important thing is: there is scattering of the reflection at a level smaller than my visual acuity, so I see sky color, but I don’t see a sky reflection, and that color is darkened by the Fresnel equation.
Now in X-Plane the problem is this: the water waves are built up procedurally from a noise texture. As we get farther away, that wave texture is reduced in quality. Unfortunately, the graphics hardware averages together the wave shape, not the resulting color from the wave shape.
So instead of getting sky + deep = darker blue for the water, I get peak + trough = flat water! In other words, at a far distance the waves are canceling each other out before color lookup, giving us a perfect mirror in the far-ground.
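The order-of-operations problem can be sketched in a few lines. This is a toy 2-D model, not X-Plane’s shader: it assumes Schlick’s approximation for the Fresnel term, a reflectance of 0.02 for water seen head-on, two wave faces tilted 8 degrees either way, and a weighting of each face by how much of it is turned toward the eye. All of those numbers and function names are made up for the illustration.

```python
import math

F0 = 0.02  # assumed reflectance of water at normal incidence


def schlick(cos_theta):
    """Schlick's approximation to Fresnel reflectance."""
    return F0 + (1.0 - F0) * (1.0 - cos_theta) ** 5


def facing(tilt_deg, view_deg):
    """Cosine of the angle between a tilted wave normal and the eye.

    Both angles are measured from vertical in the same 2-D plane.
    """
    return math.cos(math.radians(view_deg - tilt_deg))


def shade_then_filter(tilts_deg, view_deg):
    """Shade every wave face, then average, weighted by visible area."""
    total, weight = 0.0, 0.0
    for tilt in tilts_deg:
        c = max(0.0, facing(tilt, view_deg))  # faces turned away are hidden
        total += c * schlick(c)
        weight += c
    return total / weight


def filter_then_shade(tilts_deg, view_deg):
    """Average the normals first (what down-sampling does), then shade once."""
    nx = sum(math.sin(math.radians(t)) for t in tilts_deg)
    ny = sum(math.cos(math.radians(t)) for t in tilts_deg)
    mean_tilt = math.degrees(math.atan2(nx, ny))  # 0 for symmetric waves
    return schlick(facing(mean_tilt, view_deg))


waves = [+8.0, -8.0]   # one face tilted toward the eye, one away
view = 80.0            # looking far out across the water
print(round(filter_then_shade(waves, view), 2))  # about 0.40
print(round(shade_then_filter(waves, view), 2))  # about 0.24
```

In this toy setup the two opposite tilts average to a perfectly flat normal, so filtering first gives exactly the reflectance of a mirror-flat lake, while shading each face first gives a noticeably less reflective (darker) mix. The exact figures depend entirely on the assumptions above; the point is only that the two orders disagree.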
This is fundamentally an implementation problem – I bring it up only because it’s a counter-intuitive one. In the immediate future, the “glass lake” problem will become less noticeable because filtering only kicks in like this when the waves become less than one pixel – with the option for taller waves coming, the waves should be visible farther out. In the longer term we’ll probably put in new shading code to address filtering problems.
Reflection Positioning Bugs (9.0, 9.xx)
As of beta 15 I thought I had fixed most of the reflection positioning bugs (that’s what happens when something reflects in the wrong place in the water) – the geometry for this is made complicated by the Earth being round. I don’t expect to nip all of these bugs in 9.0 but I do hope to get most of them, and I will keep working on this as bug reports come in.
Wave Shape (9.xx)
Finally, our wave shapes are quite primitive – it’s just shaped procedural noise designed to look tolerably like water. We have a framework into which we can insert more complex wave equations (at the cost of some framerate). I don’t know what the future will bring in this area. The v9 water sets out a new foundation onto which we can do more complex water. But we have to crawl before we can walk.
For years now I’ve been harping about ways to keep the number of batches down in your scenery. A batch is a single submission of triangles to the graphics card for drawing. Batches get rendered fast even if they contain a ton of triangles, but changing modes between batches is not very fast, so a few large batches is hugely better for performance than a large number of tiny batches.
To play that broken record one more time, there are two ways that you (a scenery designer) can cut down the number of batches:
- Use a small number of larger textures instead of a large number of small textures, preferably sharing textures between similar scenery elements that are placed nearby. X-Plane will do its best to merge the content that uses those textures into single batches. We call this the “crayon rule.”
- Use fewer attributes in your objects. Attributes usually require a new batch (after the graphics card mode has been changed due to the attribute). So if you’ve got 1000 attributes in your object, you’ve got a problem.
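As a rough illustration of the texture side of this, the sketch below counts batches for a draw list. It assumes a deliberately simple rule (consecutive draws that share a texture merge into one batch, and any texture change starts a new one); the file names and the `count_batches` helper are invented for the example and say nothing about how X-Plane actually merges geometry.

```python
from itertools import groupby


def count_batches(draws):
    """Count batches for a draw list of (texture, triangle_count) pairs.

    Assumes consecutive draws that share a texture can be merged into
    one batch, and that any texture change forces a new one.
    """
    return sum(1 for _ in groupby(draws, key=lambda d: d[0]))


# Six small objects, each with its own small texture.
many_small = [(f"house_{i}.png", 200) for i in range(6)]

# The same six objects sharing one larger texture.
one_big = [("houses_atlas.png", 200) for _ in range(6)]

print(count_batches(many_small))  # 6
print(count_batches(one_big))     # 1
```

The triangle count is the same in both cases (1200); only the number of mode changes differs.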
Well, with X-Plane 9 I have a new broken record: avoid overdraw!
Overdraw is the process of drawing pixels on top of other pixels on screen. It happens any time we use blending to do translucency, and any time we use polygon offset to build the image in layers.
Overdraw is bad because with X-Plane 9’s pixel shaders, most users are slowed down by the graphics card’s ability to fill in pixels (pixel fill rate), with those complex shaders being run for every pixel. If you are at a screen-res of 1200×1024 looking at the ground with no objects, that might be 1.2 million pixels to fill. But if there is an overlay polygon covering the ground, we have 2.4 million pixels to fill! That’s a huge framerate hit.
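The arithmetic is simple enough to write down. This small sketch assumes the cost is just “pixels touched times layers”, which ignores everything else the card does; the `pixels_filled` name is made up for the example.

```python
def pixels_filled(width, height, coverages):
    """Pixels shaded per frame.

    coverages lists, for each drawn layer, the fraction of the screen it
    covers (1.0 = full screen). Every covered pixel runs the shader again.
    """
    return round(width * height * sum(coverages))


print(pixels_filled(1200, 1024, [1.0]))        # 1228800
print(pixels_filled(1200, 1024, [1.0, 1.0]))   # 2457600
print(pixels_filled(1200, 1024, [1.0, 0.25]))  # 1536000
```

The last line shows that even an overlay covering a quarter of the screen adds a quarter more shader work.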
Right now there’s not much you can do about overdraw. Once MeshTool comes out I will post some guidelines on how you can limit overdraw.
We took a step in the v9 global scenery to limit overdraw: in X-Plane 8 the global scenery tried to hide repetition of flat textures by drawing them over each other with offsets. In X-Plane 9 this is done in a pixel shader (i.e. the texture is analyzed and swizzled in the shader and then drawn once), cutting down the number of times we must draw.
If you turn pixel shaders on and off in a flat area like Kansas you might see this if you compare the screenshots – the farm textures are more repetitive without shaders. This gives faster fps to everyone (with or without shaders) by eliminating overdraw.
I’ll try to get it fixed soon….it’s a bit frustrating, because it’s the second time in the last week that I made a change to X-Plane to try to improve performance, tested it on my hardware, then discovered in-field that it helps some machines but screws up a lot of others.
The fog was a screw-up though…it’s broken on the 9600XT and I have one of those.
(I’m not entirely sure what the minimum graphics card for volumetric fog will turn out to be…right now we let you use it no matter how slow it makes the computer, but generally it’s been sort of a performance problem…it’s just a very expensive algorithm that needs some kind of restructuring.)
X-Plane 8 didn’t care much whether you used ATTR_cockpit in scenery objects or other strange places. It would simply show the cockpit panel texture, and if it hadn’t been updated, you might see an old one, and if it had never been used, maybe you’d see the random (but colorful) contents of memory. Similarly if you could get close enough to another airplane to look in the window, you’d see your own panel, since there is only one panel texture (for the user’s airplane) in the entire scenery system.
This is a bigger problem in X-Plane 9.
- Because the panel texture can be expensive for big panels, we are a lot more aggressive about not setting up the panel texture if we can avoid it. This means that sometimes the texture doesn’t exist at all. This is why in beta 14 you get an error if you do a formation flight having only been in “w” (forward 2-d) view…the panel texture doesn’t yet exist, but the exterior view of the Cessna tow plane uses it.
- With panel regions there can be up to four panel textures, so you can see the potential for anarchy.
- Panel textures aren’t even the same size any more, causing the wrong-panel-in-AI-plane problem to look even weirder than before.
So in beta 15, the panel texture is replaced with a dummy white texture for:
- Any cockpit object for an AI plane.
- Any scenery objects that are illegally using the panel texture.
This prevents crashes and other nasty stuff. If you want to make the panel be visible in your AI plane, consider using LOD to make a non-panel-texture “fake” cockpit image (at a very small res) at farther LODs. My guess is that in normal usage of the sim you’d really have to do something dangerous to get close enough to see the hack.
We did discuss live panels for all planes (for all of about 3 seconds), but the live panel texture in 3-d is so expensive that it’d be prohibitive to most users for even one AI airplane, let alone 20!
We’ve been preaching “one big texture, not lots of little textures” for a while now, and generally speaking, packing a lot of art into one big texture makes life easier for X-Plane, because it can draw more triangles at once before it has to tell the card to change what it’s doing. Inside the company we call this the “crayon rule”.
Now the total set of geometry and textures that X-Plane needs to use for one frame is the “working set” – you can think of it as the crayons that you keep out of the box because you need them all the time. And as I said before, if the working set becomes too big, your framerate dies.
Now with large panels we’re seeing a new phenomenon, one of the first cases where the crayon rule might not be true. The reason is due to working set.
When you make an airplane with a large panel in version 9, you can either use ATTR_cockpit, which lets you use the entire panel as a texture, or you can use ATTR_cockpit_region, which will let you use several parts of the panel. Each ATTR_cockpit_region is a texture change, so that’s more crayons. And yet ATTR_cockpit_region is usually faster.
The reason is two-fold:
- You can often use cockpit regions that don’t cover the entire cockpit texture. Large panels are rounded up to 2048 if they are larger than 1024 in any dimension, so the “wasted space” in a 1600×1600 panel is actually quite huge. If you can get away with some smaller regions, your total panel texture area is smaller because there isn’t wasted space due to this rounding, and you can also skip things like windows. Prepping the panel texture takes time, and it’s done once for lit and once for non-lit elements, so it adds up!
- It turns out there are two categories of textures that contribute to the working set: static textures and dynamic ones, and their impact on VRAM is very different. Dynamic textures are much more expensive. The panel texture is dynamic and it’s uncompressed, so it really costs a fortune. (32 MB of VRAM for 1600×1600. That’s not a lot for a static texture but for a dynamic one that’ll kill you.)
Here are the details on dynamic vs static textures: the OpenGL driver keeps a backup copy of a texture in main memory, so that if it has to purge VRAM (to make room for more stuff) it still has the texture. As it “swaps” textures, the process is to simply send textures as needed from main memory to VRAM. No big deal.
But with a dynamic texture, the texture has been modified in VRAM! So the copy in system memory is old and stale. The graphics card thus must send the texture back to main memory, consuming twice as much bus bandwidth as normal. (To free 16 MB of VRAM and refill it takes 32 MB of transfer, 16 MB to copy the old texture back to system RAM and another 16 to send the new textures to VRAM.) On non-PCIe cards, this back-transfer might be at 1/8th the speed of the transfer to the card, so this is even worse on AGP cards.
Thus the driver does its best to not throw out dynamic textures. And this is why the panel texture is so expensive. That P180 will cause X-Plane to make two 16-MB dynamic textures, and those textures will cause 32 MB of VRAM to basically be off the table. That’s less space for the other textures to swap in and out of. This kind of “permanent allocation” makes the VRAM budget tighter for all other drawing operations.
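The 16 MB and 32 MB figures above can be reproduced with a back-of-the-envelope calculation. This sketch assumes each panel dimension is rounded up to the next power of two, 4 bytes per pixel for an uncompressed texture, and two copies (lit and non-lit); the function names are invented for the example and the rounding rule is a simplification of whatever the sim really does.

```python
def next_pow2(n):
    """Smallest power of two >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def panel_vram_bytes(width, height, bytes_per_pixel=4, copies=2):
    """Rough VRAM cost of an uncompressed panel texture.

    Assumes power-of-two rounding per dimension, 4 bytes per pixel,
    and two copies (lit and non-lit elements).
    """
    return next_pow2(width) * next_pow2(height) * bytes_per_pixel * copies


def swap_transfer_bytes(size, dynamic):
    """Bus traffic to evict a texture and load `size` bytes in its place.

    A static texture is simply dropped (the driver has a backup copy);
    a dynamic one has to be copied back to system RAM first.
    """
    return size * 2 if dynamic else size


MB = 1024 * 1024
print(panel_vram_bytes(1600, 1600) // MB)        # 32
print(panel_vram_bytes(1024, 1024) // MB)        # 8
print(swap_transfer_bytes(16 * MB, True) // MB)  # 32
```

Under these assumptions a 1600×1600 panel costs four times as much VRAM as a 1024×1024 one, even though it has less than two and a half times the pixels.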
Given the right combination of large panels, large res, pixel shader effects (which make more dynamic textures), clouds, and FSAA, you can easily get even a 256 MB card to a state where the free space into which static textures are shuffled becomes horribly small, and the framerate just dies.
So the moral of the story is: yes, it can be worth 4 crayons (using panel regions) to avoid the huge cost of dynamic textures from large panels.
As to static textures (regular DDS files) that are 2048×2048 – the jury is still out but my guess is they don’t represent a huge performance problem. As one user pointed out to me, they’re only 2 MB when compressed (maybe more with alpha) so they’re not insanely huge, and they can be swapped out.
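The 2 MB figure checks out if the DDS file uses DXT compression, which is an assumption here since the post does not name the format. DXT1 packs each 4×4 block of pixels into 8 bytes, and the DXT3/DXT5 variants that carry a full alpha channel use 16 bytes per block. The `dxt_bytes` helper below is made up for the example and ignores mipmaps.

```python
def dxt_bytes(width, height, has_alpha=False):
    """Size of the top mip level of a DXT-compressed texture.

    DXT1 stores each 4x4 pixel block in 8 bytes; DXT3/DXT5 (full alpha)
    use 16 bytes per block. The mipmap chain is not counted.
    """
    blocks = ((width + 3) // 4) * ((height + 3) // 4)
    return blocks * (16 if has_alpha else 8)


MB = 1024 * 1024
print(dxt_bytes(2048, 2048) // MB)                  # 2
print(dxt_bytes(2048, 2048, has_alpha=True) // MB)  # 4
```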
Lori and I are about to leave for a New Year’s Eve ski trip, but before I shut down the laptop for the last time in 2007 I wanted to say: Happy New Year to everyone in the X-Plane community. I had a lot of fun working on X-Plane in 07 and hopefully the sim brought you enjoyment too. I think 2008 is going to be very exciting – version 9 plants the seeds for a lot of interesting new possibilities during the version run.
When I get back I’ll post a bit on panel regions, for which I have the first performance numbers, as well as some of the strange effects FSAA has on the sim. We should have a progress report on Linux soon too.
See you next year!
I found the root cause of another NVidia specific bug, and once again it’s my own stupid code. If you Google for driver bugs, you’ll find plenty of grumpy developers ranting about how card X does this wrong thing and card Y does that wrong thing…I figure it’s only fair to follow up and say “yep, that one was mine.”
Like the previous nVidia-only crash, this was a case where X-Plane was always doing something wrong, but only some drivers had problems with the behavior. So the crash was NVidia-specific, but X-Plane caused.
I believe that this bug was manifesting itself either as a message that “scenery shift took more than 30 seconds” or some kind of crash. One of the problems was that the diagnostics for this particular bit of code were really bad. So we’ve improved things a bunch…
- There is more careful error checking during scenery shift, and those error messages are reported.
- If the sim does crash, some new code will output a crash log on Windows that helps us isolate what actually happened.
Beta 12 will be out soon with the fix for the bug that caused problems on NV hardware as well as the improved diagnostics. So you may find that the sim just works better, but if it does still crash or report errors, please tell us – now we’ll have log files that will let us diagnose the problem a lot faster!
At this point my in-box has approximately 180 emails from the month of December regarding X-Plane 9. So while I appreciate all of the feedback we’ve gotten (bug reports, performance, etc.) it’s going to take a while to reply. If you haven’t heard from me, don’t panic! I hope to answer a whole pile of emails next week.
In the meantime, I’ve been working on improved crash logging on Windows. Right now when we have a crash on Windows, all we know is that (1) X-Plane crashed and (2) what DLL we crashed in (which is always us or the video driver – not useful).
Coming soon, X-Plane will catch the fatal crash, examine memory to see what was going on, examine its own EXE to figure out the names of the functions going on, and output it all to a crash log that users can send us to get a much clearer picture of what’s going on. This information is called a “backtrace” – we’ve had it for the Mac for a while (OS X provides back-traces automatically with a crash) and it’s really useful.*
So my top priority is all of the users who are seeing problems during scenery load, and a new build with a back-trace should help to reveal what’s really going on.
I’m also working on putting additional timing and performance information into the sim so we can learn more about why some users have poor performance. So far here’s what I’m seeing:
- 8800 users seem to have great performance. If you have this card and don’t have good fps, adjust your x-plane settings and NV control panel settings – this card can bring it.
- 8600 users sometimes have performance problems – not sure why.
- Older nvidia GPUs (7600, 6800) sometimes have performance problems with the new eye-candy features – I am investigating.
- The pixel shaders seem to slow down the new HD2x00 Radeons a lot more than expected…I still need to investigate this. This is the most surprising datapoint – the X1600 does very well, so I would expect newer GPUs to at least have that level of performance. I think this is something we might be able to address.
However not all of the reports are consistent, so I think it’s too soon to make some calls on recommended hardware. The only thing that’s clear is that most 8800 users who we do careful perf experiments with end up with huge framerates.
* Microsoft provides some back-trace facilities, but since we don’t use their compiler tools, we had to roll our own.
To those who have sent performance info: thank you, but you probably won’t hear from me for a week. I’m up to my eyeballs in reports and it’s going to take a while to get through them.
I finally found the code that allows X-Plane to turn off V-Sync. This should help nVidia users who are having framerate problems.
The basic idea is this:
- X-Plane tells the graphics card to draw a lot of stuff.
- The video card accumulates this “todo” list and works on it while X-Plane runs.
- X-Plane indicates that the entire frame to be seen is done and tells the card to show the results.
- Eventually some time later the card finishes the todo list and then shows it to the user.
V-Sync relates to the question of when this last step happens. When V-Sync (vertical sync) is off, the card shows the results as soon as it is done drawing.
But when V-Sync is on, when the card finishes drawing the world, it then waits until the monitor is done drawing its frame, and then shows the results. Without V-Sync we can have a situation where the top half of the monitor is showing a new frame and the bottom half is showing an old frame. (This is called “tearing”.)
So normally V-Sync is good because it prevents tearing. But the problem with V-Sync is that a frame can only be shown when the monitor refreshes. The video card has to wait until this happens, and this slows our framerate down.
In particular, most users have their monitors set to 60 hz. If X-Plane can only produce frames at 50 hz, the video card will have to further slow the framerate down to 30 hz (one X-Plane frame for every two monitor refreshes). If X-Plane falls below 30 hz, we end up with 20 hz (one X-Plane frame for three monitor refreshes), and if X-Plane goes below 20 hz, we would clamp at 15.
So when V-Sync is on, there can be large framerate hits for small losses of performance in the sim, and a real risk of getting locked around 20 fps.
(The minimum framerate in X-Plane is intentionally set to 19 so that we won’t fog up if the video card clamps us at 20 fps.)
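The stair-step behavior described above falls out of one line of arithmetic. This is a simplified model that assumes a steady render time and a frame that simply waits for the next refresh once it is done; `vsync_fps` is a made-up helper, not anything in the sim or the driver.

```python
import math


def vsync_fps(raw_fps, refresh_hz=60):
    """Displayed framerate when each frame must wait for a monitor refresh.

    raw_fps is what the sim could manage with V-Sync off.
    """
    refreshes_per_frame = max(1, math.ceil(refresh_hz / raw_fps))
    return refresh_hz / refreshes_per_frame


for raw in (60, 50, 29, 19):
    print(raw, vsync_fps(raw))
# 60 60.0
# 50 30.0
# 29 20.0
# 19 15.0
```

Note how losing a single frame per second (30 to 29, or 20 to 19) costs a third or a quarter of the displayed framerate. Calling `vsync_fps(50, refresh_hz=75)` gives 37.5 instead of 30, which is why changing the monitor refresh rate is a handy way to spot V-Sync.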
So when beta 11 comes out, you may get some framerate back if you haven’t already hacked your graphics card’s control panel settings. If you still want v-sync, you can always set it this way in the control panel. But most users I’ve talked to are happy to have it off.
In an only vaguely related note, one of the reasons to have high frame rate is to have a smooth flight model. But Austin has now put a new setting in the operations-and-warnings dialog box: you can pick how many times per graphics frame the physics run. The normal ratio is 1:1, but for fighter and acrobatic pilots, you might find that you can get a nice feel at lower fps (20-30) by setting a higher ratio.
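To see what a higher ratio buys you, here is a minimal sketch of running several physics steps inside one graphics frame. The function names and the `advance` callback are hypothetical, and this is not Austin’s flight model, just the general sub-stepping idea.

```python
def physics_rate_hz(graphics_fps, steps_per_frame):
    """How many times per second the physics run."""
    return graphics_fps * steps_per_frame


def run_frame(state, frame_dt, steps_per_frame, advance):
    """Advance the physics several times within one graphics frame.

    advance(state, dt) is a caller-supplied function that moves the
    simulation forward by dt seconds and returns the new state.
    """
    dt = frame_dt / steps_per_frame
    for _ in range(steps_per_frame):
        state = advance(state, dt)
    return state


print(physics_rate_hz(20, 1))  # 20
print(physics_rate_hz(20, 4))  # 80
```

At 20 fps with a 4:1 ratio, each physics step covers 1/80th of a second instead of 1/20th, which is the kind of time step you would otherwise only get at 80 fps.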
Random thoughts:
Update your drivers! Version 9 uses driver features that version 8 does not. Just because version 864 works doesn’t mean that your drivers are up to date and bug free! First thing to try when weird things happen in version 9 is a driver update.
If you have 60 fps and a rendering setting cuts it to 30, you probably have vsync on – that is, your graphics card can only run at an even divisor of your refresh rate. The next hit will be 20, then fog. Change your monitor refresh rate to 75 or 80 hz. If the framerates all change (to 80, 40, 20, etc.) it must be v-sync. Turn v-sync off for better framerates under heavy load. Nvidia users, you need to turn v-sync to off, not application controlled.
X-Plane 900b7 should be able to put sustained load on three cores – if you’re recording a QuickTime movie, one core draws the world, one compresses QuickTime frames, and one rebuilds 3-d as you fly. So…I guess we’ve already hit a point where a quad-core machine has some benefit over a dual-core machine. (I think we’ll start to see more central features use more cores during the v9 run.)
The new forests rebuild 3-d very frequently – dual core users who run on “tree hugger” should see utilization of 75% or higher on both cores, depending on video card power. (If your CPU usage isn’t 100% then probably your video card is holding you back – turn down FSAA.)