X-Plane 10 will have rendering options for global illumination and global shadows. This leaves one question: what if the user has these features disabled?
The plan for version 10 is this: the OBJ file format will have some extensions to allow conditional commands based on rendering settings. A few notes on these conditional commands:
- They will only be based on rendering settings.
- They will be evaluated once when the object is loaded. (If rendering settings change, the object will be reloaded.)
The idea is to be able to change which lit texture you use or remove a set of shadow polygons depending on rendering settings.
The conditionals are evaluated once at load time so that the object can be fully optimized based on the particular set of conditionals used. For example, if your drop shadow (with ATTR_poly_os) is fully removed at load time (because global shadows are on) your object now has fewer attributes, which is good for frame-rate.
This is very different from ANIM_hide. The hide animation may or may not hide depending on datarefs; to keep this fast, you cannot “hide” an attribute, only triangles. This means you “pay” for your atttributes no matter what.
The motivation for both designs is this: if the set of attributes in a file never changes (e.g. they are either conditionally removed at file load once, or they are always present regardless of animation) then we can optimize the attributes of an object once knowing how they relate to each other, to create the leanest, meanest OBJ.
I wrote up some performance tuning notes for OBJs on the wiki. A few notes on how these rules will change with version 10:
Everything on that note applies to version 10 too. If you’ve tuned your model for version 9, that effort will be worth it in version 10.
A few rules are even more important in version 10 than before. In particular, I’ve done a lot of performance tuning for OBJ drawing, but you don’t get those wins if you use ATTRibutes. Clean your objects out for maximum speed.
One special case: objects with very small vertex count are sometimes extra fast in version 10. For example, in version 9, a tree with 8 vertices and no attributes is horribly slow. In version 10, this tree will be quite fast. So in version 9 you might make the tree have 64 vertices and look nicer; in version 10 by keeping the tree lean and mean, you get a speed improvement.
Normal maps are expensive. A 2048 x 2048 normal map takes 22 MB of VRAM! So make sure you get every bit of image quality you can out of it. Two things to consider:
- Normal maps are uncompressed (because texture compression really screws up the normal map). So per-pixel detail will be preserved. Use it!
- VRAM is allocated for an alpha channel whether you use one or not. This is because the cards need 32-bit pixels for performance. So include an alpha channel in your normal maps and use it to modulate shininess; this can help create the illusion of different materials.
For scenery objects, do not include an alpha channel if you don’t need it. When textures are compressed, the alpha channel does take more VRAM, and X-Plane will hit a faster rendering path for textures with no alpha channel.
For quite a while now, I have been advocating in favor of DDS compression. I am pretty damned obstinate, but eventually if enough people yell at me, I get a clue. I have come to appreciate that there are some cases where DDS compression is not a net win; this blog post explains when it happens and what we might do in X-Plane 10 to work around this.
DDS – The Good, The Bad, the Ugly
DDS is a file format that contains image data pre-mipmapped (that is, the smaller versions of the image that the video driver needs are included) in a format that may or may not be compressed. DDS is virtually always used with a compressed image format (like DXT1 or DXT5). This has three positive effects for X-Plane:
- Because the image is already compressed, we save CPU time when loading the texture that would be spent compressing while X-Plane is running.
- Because small versions of the image (the “mipmap pyramid”) is already in the file, we save time down-sizing the image with the CPU, another win for load time.
- Because the image is compressed ahead of time, it can be compressed with a slow high quality compressor rather than a fast low quality compressor, so relative to other compressed images we get an image quality improvement.
The bad is that the DDS file does not contain the original uncompressed file. If the user unchecks “compress textures to save VRAM”, DDS files remain compressed. If the image file contains details that don’t compress well, they’re going to get splatted and stay splatted.
What If VRAM Grew On Trees?
My original heavy arguments for DDS were based on the idea that VRAM is a limited commodity; if we don’t compress textures, the user runs out of VRAM faster and has to go down a level of resolution…and once that happens, everything starts to look ugly.
But what if the user has 1 GB of VRAM? At this point, we’ve limited the maximum quality the user can see because we don’t have the original uncompressed image anymore, only the DDS/DXT version. This can be frustrating to authors who spent a lot of time on their textures.
If you ship PNGs with your airplane or scenery, turning off texture compression will reveal this beautiful, uncompressed image, but now when texture compression is on, the compression will be done by the video driver, and that will look extra ugly.
The Best Of Both Worlds
This is my thinking for version 10. (These are just musings, we haven’t coded this yet.) Currently DDS are preferred to PNG files. We could relax the load rules in version 10 to prefer PNG over DDS when texture compression is off and DDS over PNG when it is on. This would allow authors to ship both PNGs and DDS files and have the right one be picked for the scenario: the pre-compressed one when texture compression is on and the uncompressed one when compression is off.
Over the last two years, we’ve been working on the X-Plane rendering engine, setting up the code to take advantage of multiple cores when possible. Most of these changes are in X-Plane 9 already: multi-core creation of 3-d scenery, multi-core loading of scenery, multi-core loading of textures.
X-Plane 10 will leverage this work, pushing even more work to multiple cores. And yet, nine women cannot have a baby in one month. The ultimate limit on framerate will be based on the performance of one core pushing data to your graphics card.
So how many cores do you need, and is it better to have a few fast cores or more slow cores?
I can’t give you a firm answer, because I don’t know how important money is to you, I don’t know which rendering settings you care most about, and X-Plane 10 isn’t finalized. (And even if it was, we often improve performance in patches.) But I can suggest our attitude to how cores are used.
A Graphic Analogy
With graphics cards, the companies target different markets. The “enthusiast” market is the top end, where money is no object and maximum performance is the goal. Below that you have “performance” cards (a good value for the money but not as fast) and “mainstream” (which by video game standards means “slower than snot” – main stream users don’t need fast 3-d graphics to check email).
When it comes to rendering features, we expect X-Plane to be efficient enough to meet the performance expectations of a given slice of the market. In other words, if you have bought a top-level graphics card, we need to be efficient enough to let you run at 8x FSAA on a huge screen with per pixel lighting – that’s what is expected of the enthusiast crowd and that’s what the cards are built for.
But if you have a mainstream card and get 5 fps with per pixel lighting, well, too bad. You’ve got a cheap, low powered card, you need to dial down the sim. Here our goal is only to give you a way to run X-Plane at all. (And frankly, if X-Plane could run at 60 fps on a mainstream card, it means the max rendering settings don’t take advantage of top end hardware!)
At this point we expect pretty much any modern computer to have at least two cores, and users with single core machines are going to have to make serious graphic quality sacrifices to run X-Plane. (This is already true for version 9.)
When it comes to version 10, I think we will categorize “a lot of cores” (e.g. 4, 6, or more) as enthusiast – some of the top level rendering options may not function well with less than four cores, but dual core machines should at least run decently with some rendering options enabled.
How far we go toward utilizing really high core count (Austin upgraded to the new Mac Pro and now has 8 hyper-threaded i7s) I don’t yet know. My guess is that we will get at least some measurable benefit from up to 8 cores.
Trading off clock speed for core count is a difficult choice. Obviously if you had a choice of one core that was twice as fast as a 2-core CPU, you’d take the speed – you’re getting the same total computing power but the clock speed can help frame-rate. But the trade-off is almost never formulated like this, and it’s too soon to have hard data from the sim itself.
For what it’s worth, the latest generation of CPUs is really fast, so it is unlikely that you’d upgrade to a new CPU and not get some serious benefit, whether it comes in cores or clock-speed. The users I have heard from with new iMacs seem quite happy with their machines.
I have updated some of the facade documentation on the wiki with new performance tips for using facades.
A few quick notes:
- Your facades must be counter-clockwise (when viewed from above). Do not repeat the first point; X-Plane will “close” your building for you. (A four sided building should have four points, not five.)
- If you turn off two-sided facade drawing and your walls look wrong (and your roof disappears) your facade is wound in the wrong direction.
The performance tips go into a fair amount of detail about saving memory. Most of X-Plane’s rendering fall into two categories:
- Shared meshes (objects), where the geometry of the object is saved once and used lots of times. Objects usually hurt frame-rate by consuming CPU time, because for each drawing of the object, we have to do some setup to draw that shared geometry in a different location. (Version ten should feature some major improvements in object efficiency.)
- Non-shared meshes (everything else), where every single “instance” of a tree, facade, forest, etc. is uniquely constructed in memory. Non-shared messages are very fast (because we can submit a huge pile of non-shared messages to the video card in one shot) but they consume a lot of memory (because we pay for RAM per building/tree, not just once). Typically non-shared meshes are limited by virtual address space, not by framerate.
Facades are non-shared meshes, so the performance tips focus on how to limit the amount of RAM needed to represent your facade.
I sometimes get questions from authors considering how much to rely on a 2-d panel mapped to 3-d via the panel texture vs. a true 3-d panel. I can’t comment on what will look best, but I can comment on the relative performance characteristics of both techniques, and the answer might surprise you: in some cases you’ll get better performance by modeling directly in 3-d.
The 2-D Way
When you use the panel texture to make an object, X-Plane goes through a lot of steps to create the final result:
- Your panel has to be rendered in 2-d. We atlas your panel textures, but we don’t necessarily order them optimally – we don’t know the optimal order. Each generic instrument is at least one batch, perhaps even two. Those batches have very low vertex count, and the vertices are stored non-optimally on the CPU. There may be a fair number of texture changes between instruments.
- If you use ATTR_cockpit_region, we then go back and do the same thing…again! Why? Well, we need your panel’s raw color (“albedo” to graphics nerds) and the emissive light given off by anything self-lit separately, so that we can do correct 3-d lighting.
- Both of these are rendered to an off-screen texture that the video driver will feeel obligated to preserve at all costs, putting pressure on VRAM.
- Only when all that is done do we begin drawing your object, with the usual batches to change to panel texture and change back, perform animations, etc.
If this seems expensive, that’s because it is. Periodically users send me airplanes to look at their performance, and lately I’ve been seeing a lot more problems with 2-d panels (that fuel 3-d cockpits) being the performance bottleneck, not the 3-d modeling itself.
The 3-d Way
What if we want to go 3-d? Well, we’re going to “eat” a lot more of what your 3-d pit already has:
- You’ll need a lot more animations to move all of those parts.
- You’ll need new batches with ATTR_lit_level to dial up and down various lighting levels.
But you do get some advantages:
- Geometry in objects is processed about as optimally as we possibly can. All of that work we’ve done on the rendering engine to make OBJs fast is available in your cockpit. So you can increase 3-d detail ‘for free’.
- Your lit geometry can be drawn in a single pass (we don’t need to prepare two separate lit textures). So for example a needle would take three batches via the panel-texture route (a batch to rotate the needle for albedo, a second batch to draw the rotated night needle, and a third batch to draw the resulting texture in 3-d) but only one if you use the OBJ directly.
- Since you organize your textures for OBJs, you can guarantee that all of the cockpit stuff is together, saving texture thrash.
- You can use normal maps to add per pixel detail to your cockpit; panel textured geometry cannot be normal mapped.
A Balancing Act
Given the high cost of panel texture relative to native OBJ drawing, you’d think going native OBJ would be a no-brainer, right? Well, not quite.
A needle is an easy case: you can model a needle using a rotation animation, so your implementation in an OBJ and our generic instrument are quite similar. Same with the throttle lever generic instrument.
But what about a “glass pie indicator”? What about a moving map? What about a rotary?
There are some generic instruments that have “movement” for which there is no equivalent OBJ technique. With these generics, the generic instrument/panel code may be able to render the generic quite a bit more directly than your OBJ can simulate the same effect.
This is my suggestion on a cut-off: if you can directly model a generic instrument with an OBJ (needles, throttles, and other “simple moving things”), consider 3-d. If you would have to use a lot of extra texture space, copies of your mesh, or a lot of show-hides, use the panel texture.
Your goal should not be to eliminate the use of panel texture. But if you can cut panel texture down to a single 1024 x 1024 region from a larger area, you’ll probably see a performance win or a reduction in your airplane’s system requirements.
Performance Test First
Final thought: before you invest months in a complex cockpit design, mock up the “work-load” X-Plane must do and performance test it! For an OBJ, simply make one moving instrument and duplicate the mesh to get the number of expected animations. For the panel, drag out a bunch of instruments, make custom textures and just paint junk into them with photoshop. The goal is to make X-Plane do the same amount of work as it will in the final version. Then fly your test panel on target computers and observe performance.
The short answer is: to save memory.
The cars and replay seem to be a case of damned-if-we-do, damned-if-we-don’t. If we don’t stop the cars and reverse them in replay, we get piles of bug reports. If we do try to replay the traffic, we get bug reports too.
The current implementation is a bit strange: when you replay traffic, the cars will go back a bit in time, but at some point they will just stop and refuse to reverse any more. What’s going on?
The answer is that the cars have the memory of a goldfish. They simply don’t remember where they came from. Each car knows what “link” it is on, and about when it got onto that link and how fast it is going. (A link is a single straight piece of road.) So when we go into replay, we can easily move the cars along their links as time goes forward and backward.
But when we reach a time earlier than when the car entered the current link, the car has know idea how it got there, so it is forced to stop.
This is a simple case of not wanting to burn four tons of memory on a feature that is mainly visual. To replay the cars, we would have to accumulate a history of every link a car has been on as it drives. For 20,000 cars and a sim that’s been running a while, that’s a lot of memory to burn just in case you happen to hit the replay button.
In fact it gets worse. The cars are kept in a data structure that tells us who needs to make a driving decision and when.* This structure is optimized for the cars moving forward in time. We’d have to make and maintain an entire second copy of this structure to move the cars backward; again burning CPU and memory while you fly just in case you might hit a replay.
So instead we just provide replay on the current link.
* Programming nerds: the cars are in a priority queue by time to next navigation decision. I consider this to be very clever.
A few years ago I blogged about gamma correction for png files. Here’s the very, very short version:
- PC and Mac monitors are calibrated differently. Dark tones on a PC appear darker than on a Mac. The curve of how colors are mapped to the monitor is the gamma correction curve, typically expressed as a number like 1.8 for Mac and 2.2 for PC. The higher the number, the more Gothic your dark tones.
- A png file can have a gamma value written into the file, which tells X-Plane (and anyone else) what kind of monitor the png was drawn on. This lets X-Plane brighten a png from a Mac when you are on a PC, and darken a png from a PC when you are on a Mac.
- If you leave off the gamma value on your png, we assume 1.8 (Mac) which can be bad if you’re a PC author.
While this is confusing, it was an improvement over the BMP situation (where everything was set up for a Mac and PC users had to simply crank their monitor brightness).
In version 9 we added a gamma correction setting to X-Plane. The setting you enter in the rendering settings is how “dark” your monitor is (bigger number = darker). We then attempt to compensate by lightening the textures more; thus a bigger number results in a lighter looking X-Plane (because you told us your monitor was dark and we tried to “fix it”).
There are two other developments since the original png situation which have unfortunately been a step backward in terms of X-Plane color correction.
DDS and Gamma
The handling of DDS and gamma is, to put it mildly, quite problematic. The problem is two-fold:
- DDS doesn’t actually have gamma information, so we can’t tag DDSes as having originated on Macs and PCs. So we assume a DDS is authored at a gamma of 1.8 (Mac). I think DDSTool correctly does a gamma correction when grinding files at other gammas.
- (If you are a real graphics programmer, please do not read this next sentence.) X-Plane attempts to adjust the color of the DDS in its compressed form. This is a big hack designed to keep framerate high, but it’s really not a very good idea. The result can be color distortion when a DDS is viewed at 2.2 gamma.
So that’s not good, but what happened next made things a lot worse.
Apple Goes Gothic
Apple adopted the sRGB color profile for OS X 10.6, which has a gamma curve of about 2.2. So now the situation with DDS is particularly ugly:
- All DDS are authored at a gamma of 1.8.
- All users are moving toward a display gamma of 2.2.
- X-Plane thus has to always color correct, but its color correction is low quality for performance reasons.
This is…very sad.
There are two things we can do about this:
- In the short term, we can provide post-decompression color correction. This will cost a (hopefully) small amount of framerate and improve color fidelity for users with 2.2 gamma. This is the kind of thing that any user with a modern card would want, but that we might make optional for users with very old hardware.
- In the long term, we can provide a gamma calibration in the text files that wrap DDS files so that authors can mark their DDS as already being 2.2. This will mean that for most users X-Plane won’t have to do any color correction at all.
In order to understand the difference between hiding geometry and disabling drawing, you need to understand that an OBJ triangle can serve many purposes. Broadly, those purposes are:
- Drawing (the most basic use).
- Collision detection, of which we have three flavors: collision of the plane with the object (“hard surfaces”, or “physics”), collision of the mouse with the panel (manipulators, or clickable triangles) and collisions of the camera with the airplane (“solid camera”, which constrains the camera).
Any given triangle can be drawn and/or used to check for any of these collisions*; attributes change what the triangle is used for.
By default, all triangles are drawn; ATTR_draw_disable marks future triangles as not being drawn. This allows you to make a triangle that is used only for collisions. Examples might include a “hot spot” in front of a region on the panel (the hot spot might be easier to click than a small switch) and an invisible simple mesh to constrain the camera.
By comparison, ANIM_hide effectively removes some triangles from your model (temporarily) for all uses – drawing and collision detection of any kind. If a door is hidden, it’s not only not drawn, but it’s not going to stop the camera moving through it either.
Some key points to these distinctions:
- Categorizing what a triangle is used for (drawing, various flavors of collisions) is static – that is, it is always the same for the object and never changes with datarefs or animation. This is intentional for performance reasons!
- Animation to hide triangles affects the triangle in every way consistently – drawing and collisions.
Generally, you will get better performance improvements by removing categories from a triangle than by hiding it. (That is, it is better to not have manipulators on your cockpit, so it isn’t mouse-click collision-checked, than to hide it.) But the purpose of ATTR_draw_disable and ANIM_hide are different enough that which you use will be determined by the effect you are trying to create.
Finally, note that hiding an object completely (that is, the object does no drawing) does not provide the maximum performance benefit of not having an object at all. ANIM_hide was created to allow authors to create clever effects, not as a performance enhancer!
* This is not quite accurate: airplane-object collision checks are only available in scenery objects, and camera/airplane or mouse/panel collision checks are only available in the cockpit object.