Tag: hardware

I’m Looking Through You

Order independent transparency (OIT): first, the shiny robot.

So first, what’s so special about this? Well, if you’ve ever worked with a lot of translucency in X-Plane, you know that it doesn’t work very well – you invariably get some surfaces that disappear at some camera angles.

The problem is that current GL rendering requires translucent surfaces to be drawn from farthest to nearest, and who is far and who is near changes as the camera changes. There are lots of tricks for trying to get the draw order mostly right, but in the end it’s somewhere between a huge pain in the ass and impossible.

What’s cool about the robot data is that the graphics card is drawing the transparency even if it is not drawn from back to front, which means the app can just shovel piles of translucent triangles into the card and let the hardware sort it out (literally).

X-Plane is currently riddled with transparency-order bugs, and the only thing we can do is burn a pile of CPU and add a ton of complexity to solve some of them partly. That proposition doesn’t make me happy.

So I am keeping an eye on hardware-accelerated OIT – it’s a case where a GPU feature would make it easier for modelers to create great looking content.

Posted in Development, Scenery by | Comments Off on I’m Looking Through You

Pie in the Sky: Future Rendering Tricks

I’ve had a few inquiries about environment maps, normal/specular maps in more places than just OBJs, etc. The short answer is: a lot of these future rendering engine enhancements are good ideas that we like, but we have other features that are already partially implemented that we need to productize first.

In particular, one of those half-finished features not only could use to be shipped, but also affects the way just about every other rendering effect works. So better to get these features finished first, and build new effects within the context of these “new rules”.

Here are some of the ideas that I’ve heard kicked around:

  • Environment maps on the plane – I like it, it’s not that hard to do, and the framerate hit could be ramped up and down with detail. See above about finishing other features first.
  • Next-gen texturing on runways (wet runways, environment maps, bump mapping) – I like all of it. The runways really need to be addressed comprehensively, not piece-wise, in order to find a rendering configuration that meets our scalability and efficiency needs. (In other words, we can’t just burn tons of VRAM on the runways, and we need a way to render them that works on low and high end computers with one set of art assets.)
  • Normal mapping on the ground. I like it, but I wonder if this isn’t part of a bigger idea: procedural texturing on the ground. E.g. if we want to add detail on the ground, can we add it with multiple layers at different resolutions with a shader adding yet more detail.

Generally speaking, it’s not that hard to push a feature out from one part of the rendering engine to the other. For example, normal maps on facades (if anyone cares 🙂 would make sense and be a fairly trivial change. I hesitate on polygons only because the ground might require something a little bit more sophisticated.

Going Crazy With Choices

Here’s a straw man of why polygons might be different: draped polygons can’t have specular shininess right now, and I don’t think anyone is complaining. So it’s a bit of a waste to use the alpha channel of a normal map for shininess. Perhaps it could be used for something else like an environment mapping parameter.

Hrm…but we would like to have environment mapping on airplane objects too someday. Well, we could go two ways with that. We could just use the alpha channel for shininess and environment mapping…not totally unreasonable but it wouldn’t let us have a glossy non-reflective material, e.g. aluminum vs. shiny white paint.

Math nerds will realize that the blue channel in normal maps can be “reconstituted” from the red and green channels by the GPU (at the cost of a tiny bit of GPU power). That would give us two channels to have fun with – blue and alpha. Perhaps one could be shininess and one could be an environment mapping material.

Well, shininess is still no good on the ground. But…perhaps that would be a good place to store dynamic snow accumulation? Hrm…

My point of this stream-of-consciousness rant is that the design of any one rendering engine feature is heavily influenced by its neighbors. We’ll get all of these effects someday. If there are features that are really easy, we can get them into the sim quickly, but the only obvious one I see now is using bump maps on other OBJ-like entities (which at this point would mean facades).

Posted in Development by | 3 Comments

12 GB For the RenderFarm

The global scenery gets cut on “the renderfarm”, which is our name for the cluster of computers, each loaded with all of the input data for the global scenery. These computers chug for a few days to churn out the whole planet.

With the next global render I am trying an experiment: using one 8-core Mac Pro as the RenderFarm. We’ve never had more than 8 processors in the farm at a time; in the past to get 5 processors might have taken 3 machines. The appeal of a single machine is ease of setup; no data to sync between machines, no sorting out which machine did which tile and merging it all back.

Today I upgraded the Mac Pro’s memory (again) to 12 GB. I thought the logic of why I did this might be of interest to X-Plane users who are trying to figure out “should I have more memory”?

Basically there are three memory limits we care about:

  1. The virtual memory limit per process – generally 3 GB per process for a 32-bit application. If an application wants more memory tan this, regardless of what you have, it is dead.
  2. The virtual memory limit for the whole machine. Since the machine virtual memory limit is a function of hard disk space, normal users will never care about this – we can have a huge amount of virtual memory.
  3. The physical memory actually used by the sum of all programs actually running. Once all programs need more physical memory than you have, they start using hard drive, and they get really, really, really slow.

In the case of the render farm, our processing program runs on one DSF tile at a time. But with 18,000+ tiles we can take advantage of more cores by processing 8 tiles (using 8 copies of the program running at once) on 8 cores.

This is where memory comes in. Before the upgrade my machine had 4 GB of memory, allowing each of the 8 tiles to use a little bit less than 512 MB of RAM before we ran out of physical memory and started paging. (The OS takes a little bit for itself.)

Normally this is all good — a typical run might only use 300 MB of RAM. But every now and then we hit New York City or Boston and the RAM use spikes out to 1500 MB or more.

This is okay if just one process hits New York, but what if a bunch of processes hit a big city at once? We blow past the physical RAM of the machine and every tile becomes slow at once. And because they are all slow at once, they take a very long time (hours) to clear this state.

(This state of thrash due to many processes is like 8 people trying to go through a doorway at once. If they would just take turns, they’d be through in a second. But by forcing 8 at once they get stuck. The operating system won’t pause a few of the high-memory programs to let the others complete.)

Since the RenderFarm has to run overnight unattended, I upgraded RAM to 12 GB, for 1500 MB per rpocess before thrash. My hope is that for this investment, we’ll be able to run the processing through without a human to unjam it.

What About X-Plane

So would a RAM upgrade for X-Plane help? Well we can apply what we know from above to figure it out. Generally, you need enough RAM that X-Plane + all other programs won’t run out of physical memory. Since X-Plane is 32-bit (and can only use 3 GB of RAM) you are likely to be fine with 4 GB when running X-Plane + the OS. Any more and X-Plane can’t take advantage.

The exception might be if you need to run X-Plane and another big program like PhotoShop at the same time. At that point you might want enough RAM for both to run at the same time.

Posted in Development by | 3 Comments

More Threads

X-Plane 940 beta 1 is out…it will take a little bit to get the release notes and docs on the website completely up-to-date. We’re still dealing with loose ends from our migration to the new web site, and most of my office is packed up for a move to the Boston area. I’ll try to get docs up as fast as I can given the chaos flying about.

Given the interest multi-core stirred up in previous notes, I will mention one change to 940: with this build we’ve added yet more multi-core to X-Plane.

In 931, X-Plane will use as many cores as you have to load textures, but only one to build “3-d scenery” (a loose category for the work we do when we make airport taxiways and lines, build forests, and extrude roads).

In 940, this “3-d scenery” is also done on as many cores as you have. This should speed up load times a bit, particularly under very heavy tree settings, and hopefully keep the forest engine running faster for users with more cores.

It also sets us up for long-term growth; X-Plane’s visual quality is sometimes limited by the time to build 3-d meshes…being able to do this work on many cores means we can use higher quality algorithms.

Consider for example the roads: my original “road extruder” (the code that converts a vector road into a 3-d model, called an extruder because it builds a 3-d road from a cross section like one of those play-dough toys) made beautiful intersections with stop-lines and cross walks and lots of other great stuff.

It was also really slow. And at the time the sim wouldn’t fly at all while roads were extruded, so speed was of primary concern. So I replaced it with the much dumber extruder you see today, where intersections are basically ignored.

Now that we have 3-d scenery build on multiple cores, we can begin to provide rendering options that take more CPU time but produce higher quality results. The trees and airport layouts already do this (in that they take more time and produce slower, more detailed, higher triangle count sceneery at higher rendering setting for the same input DSF ad apt.dat file). With more cores, we can continue this strategy with roads and other parts of the sim without worrying about overloading the one core that was doing this work.

Of course just because we can use 8 cores doesn’t mean we do…you won’t see 8 cores maxed out very often, particularly if you have simple scenery and a very fast machine.

Posted in Development, Scenery by | 6 Comments

Optimization By Check-Boxes

In the next X-Plane 930 beta (it should post today I think) the rendering settings have two new check-boxes: one to enable the “dynamic” airplane shadow and one to enable per-pixel lighting.

In the last week a number of users emailed me performance numbers via the fps test, and from what I can tell, 99% of performance problems can be attributed to these two new features chewing up resources in a way that 922 did not. When the features are both disabled, from what I can tell, the sim should be as fast or faster than 922.

The new beta will also limits dynamic shadows to your aircraft – beta 13 will calculate a dynamic shadow for every aircraft, which is unacceptably slow when you have a lot of AI planes enabled.

I may still be able to improve the performance of the per-pixel-lighting shader, but fundamentally per-pixel lighting is going to be more expensive than per vertex. The average X-Plane scene might have 200,000 to 500,000 vertices. At absolute minimum resolution, no FSAA, you’re going to have over 700,000 pixels even if there is no “over-draw” – you could easily have 10x that fill rate with only a modest increase in overdraw, full-screen anti-aliasing and window size. Simply put, per-pixel lighting is more work.

Please bear in mind: without per-pixel lighting X-Plane’s pixel shader is extremely simple. If you have a “low-end” card this could give you the illusion of GPU power when there is really not much under the hood.

Examples of low-end cards: GeForce 7300, GeForce 8400, GeForce 9400, Radeon X300, Radeon X1300, Radeon HD2400. All of these cards are the younger brother of a fairly capable card, but with fewer pixel shader units/cores. If each unit is doing very little work, you don’t need that much pixel-filling power…but when we go to a “real” shader, the difference between a GeForce 8400 and 8800 becomes very, very apparent. Simply put, even with optimization the GeForce 7300 (for example) will never run a huge monitor with per pixel lighting and high FSAA.

Posted in Development by | 3 Comments

The Constraints of Hardware

In a previous post (in which I tried to argue that threading is a “how” and not a “what” when it comes to feature requests) a user made this comment:

That is, that I feel you are a bit too concerned about the fact that XP has to be possible to run on a 2001 year machine. This really halts the development although you could add options to turn this and that off.

I’d like to side-step around the details of cost-benefit analysis (e.g. do the sales from low-end systems pay for the development of a renderer with lower system requirements) but take a second to focus on three general issues:

  • Is there a cost to developing a scalable renderer?
  • How does the trend of hardware development affect hardware?
  • How do marketing forces affect both of the above?

Scalability

Is there a cost to writing a renderer that can run on a wide range of hardware? Absolutely. Obviously we have to write more code to do that.

But there is an additional cost: there are some rendering engine design decisions that have to be made system-wide. It’s not practical to provide different scenery files for different hardware (since we are limited by distribution on DVD). In some cases we have to pick a non-ideal data layout (for the highest end hardware) to support everyone.

But: before you raise up arms against your fellow X-Plane user who is holding you down with his GeForce 2 MX and P-III 800 mhz machine, bear in mind that the problem of picking a data format is a bit unavoidable. Even if we targeted the highest-end machines and told everyone else to jump in a lake, those decisions would appear to target rather quaint machines only a year into the version run. At some point we have to pick a line in the sand.

There is some light at the end of the tunnel when it comes to scalability: as computers become (on average) bigger and faster, we can start to defer at least a little bit of the work of scenery generation to while the sim is running. When we first designed the new sceney system (for X-Plane 8) most users did not have dual-core machines, so the doing work on the scenery was very expensive. We preprocessed as much as possible. This isn’t as necessary any more.

So are high-end users limited by having one renderer that fits all sizes? Perhaps a little bit, but any design choice is only going to fit one hardware profile perfectly, and hardware is a moving target; today’s shiny new toy is tomorrow’s junk.

Hardware Growth

Every two years (to be very loose about things) the number of transistors we can pack on a chip doubles. This “transistor dividend” can be turned into more cores for a CPU, or more shading units (which are now really just cores) for a GPU.

And this gets to the heart of why I don’t think we can say “forge the low-end” any time soon. Imagine that we support 6 years of hardware with X-Plane, and the best hardware is 8 times as powerful as the low-end hardware. Fast-forward two years – we drop two-years of hardware and two-years of new ATI and NV graphics cards come out. What is the result?

Well, the newest hardware is still 8x as powerful as the old hardware, but the difference in the polygon budget between the two has now doubled! In other words, the gap in absolute performance is doubling every two years, driving the two ends of our hardware spectrum farther apart. (Absolute performance is what Sergio and I have to worry about when we design a new feature. How many triangles can we use is an absolute measurement.)

If we say “okay forget it, only 3 years of supported hardware” that gets us out of jail for a little while, but eventually even the difference between the newest and slightly off-the-run hardware will be very large.

A gap in hardware capability is inevitable and it will only get worse!

Market Divergence

You may have noticed that the above paragraph makes a really gross assumption: that the lowest end hardware we support is the very best card on the market from a certain number of years ago. Of course this isn’t true at all. The lowest end hardware we support was probably pretty lame even when it was first made. The GeForce FX 5200 was never, even for a microsecond, a good graphics card. (It was, however, quite cheap even when first released.)

So the gap we really have is between the oldest low-end and newest high-end hardware, which is really quite wide. Consider that in May 2007 the GeForce 8800 Ultra was capable of 576 GFLOPs. Two months later (July 2007) the GeForce 8300 GS was released, packing a whopping 22 GFLOPs. In other words, in one video card generation the gap between the best and worst new card NVidia was putting out was 26x! (I realize GFLOPs isn’t a great metric for graphics card performance – really no one metric is adequate, but this example is to illustrate a point.)

Let’s go back in time a few years. In February 2002, NVidia released the GeForce 4 Ti (high-end) and MX (low-end. The slowest MX could fill 1000 MT/s, while the fastest Ti could fill 2400 MT/s. That’s a difference in fil rate of “only” 2.4x.

What’s going on here? Commodification! Simply put, graphics cards have reached the point where a lot of people just don’t care. Very few users need the full power of a GeForce 8800, so a lot of lower-end machines are sold with low-end technology – more than adequate for checking email and watching web videos. This creates a market for low-end parts and creates a wider “gap” for X-Plane. Dedicated returning X-Plane users might do the research and buy the fastest video card, but plenty of new users already have the computer, and it might have something unfortunately (like a Radeon X300 or Intel GMA950) already on the motherboard.

As X-Plane’s hardware needs diverge from the needs of mainstream computer users, we can expect some but not all of our users to have the latest and greatest. We can also expect plenty of new users to have underpowered machines.

Let me go out on a limb (I am not a technologist or even a hardware guy, so this opinion isn’t worth the bits it is printed on) and suggest this: we’re going to see a commodification fall-off in the number of cores everyone has too. Everyone is going to have two cores because it is cheap to put a second core on the main CPU if it lets you get rid of a whole array of special-purpose hardware. Give me multi-core and maybe I can get away with software-driven rendering (who needs hardware acceleration), software-driven sound (goodbye DSP chips), maybe I can even find cheaper ways to build my I/O. But 16 cores? The average user doesn’t need 16 cores to check email and run Windows 7.

So as transistors continue to shrink and it becomes possible to pack 8 or 16 cores on a die, I expect some people to have this and others not to. We’ll end up in the same situation as the graphics chips.

Summing It Up

To sum it up, sure there may be some drag on X-Plane in supporting a wider range of hardware. But it’s an inevitable requirement, because hardware shifts in capability even during a single version run, and as hardware becomes faster, the gap between -end and cheap systems gets wider.

Posted in Development by | 8 Comments

Multi-Threading Is a Weird Feature Request

Over and over, whether it is a feature request list for X-Plane or another simulator, I see the same thing: “multi-core support” or “multi-threading” as a feature request.

Now before I continue, I must remind everyone: X-Plane is already multi-threaded and will take advantage of multi-core hardware. How much we use those cores depends on the type of scenery loaded.
The problem is that multi-threading (as a way to use multi-core hardware) is a solution technique, not a problem statement.  What is threading going to be used for?  If I simply program the other 7 cores of your computer to calculate PI to 223,924 digits have I met the feature request?  This probably isn’t what anyone wants.
Implicit in the request for multi-core is (I speculate) a request for better frame-rate.  (I did see one user who wanted multi-core to be used for a more accurate flight model.  This strikes me as a poor trade-off for hardware based on my understanding of the flightmodel – we would use a lot of hardware for only a marginal accuracy improvement – but I commend the user for stating the problem and not just a possible solution.) But is multi-threading the best way to get framerate?
If I had two patches to X-Plane, one that doubled fps by using two cores and one that doubled fps by using more efficient code, which would be better?  To me the obvious answer is: the code that is more efficient.  It will run on any hardware (not just multi-core) and if you have multi-core hardware, we still have that second core free for some other functionality.
So to me the feature request should be something like: “higher framerate – and yes I have multi-core hardware”.  Or perhaps “more visual detail at the same framerate – and yes I have multi-core hardware”.
All feature requests need to be in terms of problem statements, not possible solutions.  This lets us find the set of problems that can be solved together in a coherent manner, and it lets us pick a solution that meets our engineering goals.
Posted in Development by | 21 Comments

Per Pixel Lighting Isn’t Free

I’ve had a little bit of time to look at X-Plane 930 performance. The data isn’t 100% conclusive yet, but one performance issue sticks out like a sore thumb: per-pixel lighting hurts fps.

Now, part of this is that the per-pixel lighting shaders are not yet optimized (and perhaps are not terribly well written). I need to take some time to see if I can get some more performance out of them.
But…per-pixel lighting isn’t free – when per-pixel lighting is on, the video card is simply doing a lot more work than it used to. Consider: a typical X-Plane scene might have 250,000 vertices on screen at once.  At a minimum, you have at least 750,000 pixels on screen*.  Make your window bigger and that number goes up – fast!  Turn on 16x FSAA and watch the pixel count get even larger.  So the number of lighting calculations done by your graphics card are at least 3x higher with per-pixel lighting and potentially 50x higher.  Even if your graphics card has a lot of power, that’s going to cost a bit.
So one option I am considering is making per-pixel lighting a rendering option. This would allow users who want 922-level fps to simply turn it off. In my tests so far, turning off per-pixel lighting gets fps to within a few percent of 922.
(The only reason to have shaders on but per-pixel lighting off would be to have a cheap version of the reflective water. In the long term I want to limit the number of a la carte rendering settings, but for now it seems reasonable to support v9.00 base configurations through the entire version run.)
* In practice, not every pixel on screen requires full shading, e.g. the sky does not require complex shading.  But some parts of the screen may be shaded multiple times.  This is called “overdraw”.  For example, with a runway we pay for our shaders twice – first with the ground underneath the runway, then with the runway itself.
Posted in Development by | 5 Comments

Why Can’t I Mark My Object As “Extreme Resolution”

I receive a number of requests from authors for an attribute to tag an object as “needs maximum texture resolution” or “needs compression disabled” or “needs maximum anisotropic filtering”. The general idea is that the author wants to ensure a viewing environment that looks good.

For the most part, I am against these ideas – think of the two cases:
  1. If the attribute on the content is request for a relative improvement in resolution (e.g. set my object to one texture res higher than the rest of the world) then what we’ll have is an arms race – every author will set their content with this flag, and the result will be that the entire sim tends to run at one res setting higher than expected.  The result: users without enough VRAM will turn their res settings down another notch and all the content will look like it did before.
  2. If the attribute on the content is a request for an absolute setting (e.g. load this texture at the highest resolution possible) some content will simply not run on some computers that do run X-Plane.

My general point is this: users run X-Plane with texture resolution, anisotropic filtering, and compression set to lower settings for a reason – because their hardware isn’t very fast!  Forcing the sim to ignore the settings and run at a higher res won’t make the user’s video card any better – it will just take the framerate vs. visual quality tradeoff out of the hands of the user.

That’s a simplification of the issue – in fact I am sympathetic to the notion of differential settings – that is, we need to use more texture resolution for art elements that are closer to the viewer.  The sim already improves airplane resolution a bit and cockpit resolution a lot. We set anisotropic filtering a bit higher on runways because they are viewed from a shallow view angle pretty much all the time during normal flight.
At this point I am looking at some more specific overrides for cockpit objects.  In particular, modern cockpits are built out of many attached objects, and not just the “cockpit object” itself – reducing the resolution of these objects can make cockpit labels illegible.
If we do get extensions to improve resolution I can only say this: use them very, very sparingly! Adding the extension doesn’t improve the user’s hardware. If the user had the ability to run your airplane at extreme res without compression and 16x anisotropic filtering, he’d already be doing that!
Posted in Development, Modeling by | Comments Off on Why Can’t I Mark My Object As “Extreme Resolution”

More Aircraft RFCs – Landing Lights

It looks to me like we could afford a few landing light halos on most (but not all) hardware.  This gets a bit tricky in terms of how we make this available to authors…

  • We have to allow access without breaking old planes.
  • There will be two distinct cases due to very different hardware.

So…I have posted an RFC on the X-Plane Wiki.  Please post your thoughts on the discussion page!

One option (not really discussed in the RFC) is to do nothing at all.  Basically I hit upon this during some routine refactoring of the shaders.  The whole issue can be deferred indefinitely.
Why wait?  Well, I don’t believe that an incremental increase in the number of landing light halos is the future.  Our end goal must be some kind of truly global illumination, hopefully without a fixed lighting budget.  It may not make sense to add a bunch of complexity to the aircraft SDK only to have all of those limit become unnecessary cruft a short time later.
(I think I can hear the airport designers typing “why do the airplane designers get four lights and we get none?  Give us a light or two!”  My answer is: because of the fixed budget problem. We can allocate a fixed budget of lights to the user’s aircraft because it is first in line – we know we either have the lights or we don’t.  As soon as we start putting global lights in the scenery, we have to deal with the case where we run out of global lights.  For scenery I definitely want to wait on a scheme that isn’t insanely resource limited!)
Programmers: yes – Dx10 hardware can do a hell of a lot more than 4 global lights.  Heck – it can do a hell of a lot period!  For example, it can do deferred rendering, or light pre-rendering. A true global lighting solution might not have anything to do with “let’s add more global lights a few at a time.”
Posted in Aircraft, Development, File Formats by | 5 Comments