I’ll try to summarize some of our hardware findings for X-Plane 10 over the next few posts. But in my previous post I mentioned that the new MacBook Pros have only an 8x PCIe connection to the discrete GPU (that is, the nice GPU that isn’t built in to the CPU, the one you want to fly X-Plane with) and this got a bit of attention.
So it raises the question: what is this PCIe bus and why do we need to care all of a sudden?
The PCIe bus is the connection between the CPU/main memory and your graphics card (with its memory and GPU). It is the bottleneck through which all communications must flow – sometimes every frame, sometimes every now and then.
PCIe slots are named by the number of lanes (e.g. 16x means 16 lanes) – each lane has a fixed capacity (which is doubled in PCIe 2.0). So a graphics card in a 16x slot can drink data from your computer at double the rate of one in an 8x slot – it’s an extra wide straw.
(Nerds: I realize this is about the worst description of the PCIe bus you will ever find. Go read Wikipedia!)
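To put rough numbers on the straw, here is a back-of-the-envelope sketch. The per-lane figures are the commonly quoted approximate rates per direction (about 250 MB/s for PCIe 1.x and 500 MB/s for PCIe 2.0); real-world throughput is lower, and the function names are made up for illustration.

```python
# Approximate usable bandwidth per lane, per direction, in MB/s.
# These are nominal figures; actual throughput is lower in practice.
MB_PER_S_PER_LANE = {"1.x": 250, "2.0": 500}

def link_bandwidth_mb_s(lanes, generation="2.0"):
    """Nominal one-way bandwidth of a PCIe link with the given lane count."""
    return lanes * MB_PER_S_PER_LANE[generation]

def per_frame_budget_mb(lanes, fps, generation="2.0"):
    """How many MB can cross the bus per frame if the link did nothing else."""
    return link_bandwidth_mb_s(lanes, generation) / fps

if __name__ == "__main__":
    for lanes in (8, 16):
        budget = per_frame_budget_mb(lanes, fps=30)
        print(f"{lanes}x PCIe 2.0 at 30 fps: about {budget:.0f} MB per frame")
```

Under these assumptions an 8x PCIe 2.0 link has roughly 133 MB of headroom per frame at 30 fps, and a 16x link has twice that.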
What Do We Use the PCIe Bus For?
X-Plane needs the PCIe bus to:
- Send the instructions to draw each frame to the GPU.
- Transfer any textures, new OBJ meshes, and other data that will be held in VRAM. The data is born on the CPU, goes over the PCIe bus once, and then lives in VRAM.
- Send to the GPU anything that changes every frame. For example, smoke puffs and car headlights have to go over the PCIe bus every frame because they are constantly changing.
- Send to the GPU mountains, forests and other non-repeating geometry. This data gets sent every frame.
If the sum of all of the stuff on that list gets too big, your framerate drops as the CPU and GPU both wait for data to make it over the bus. In other words, the bus can at times be the bottleneck in terms of framerate.
If you set your rendering settings near the maximum that your computer can handle and get the occasional stutter, that may be X-Plane running out of PCIe bus bandwidth. As you fly to a region with new textures that haven’t been used before, the OpenGL driver will transfer our textures over the PCIe bus from system RAM to VRAM. If the PCIe bus is already nearly maxed out, the extra traffic of those textures is going to temporarily hurt framerate – sometimes in the form of a stutter or pause.
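Here is a toy model of that stutter. It assumes the frame time is simply set by whichever is slower, the bus transfer or everything else, which is a big simplification of how a real driver overlaps work. All of the numbers and names are invented for illustration; none of them are measurements from X-Plane.

```python
def frame_time_ms(bus_traffic_mb, bus_mb_per_s, other_work_ms):
    """Frame time when the slowest stage sets the pace (a deliberate simplification)."""
    bus_ms = bus_traffic_mb * 1000.0 / bus_mb_per_s
    return max(bus_ms, other_work_ms)

BUS_MB_PER_S = 4000.0   # hypothetical usable bus bandwidth
OTHER_WORK_MS = 30.0    # hypothetical CPU/GPU work per frame

# Steady state: 100 MB of per-frame traffic takes 25 ms, hidden behind other work.
steady = frame_time_ms(100.0, BUS_MB_PER_S, OTHER_WORK_MS)

# Flying into a new region: 64 MB of textures page in on top of the usual traffic.
paging = frame_time_ms(100.0 + 64.0, BUS_MB_PER_S, OTHER_WORK_MS)

if __name__ == "__main__":
    print(f"steady frame: {steady:.1f} ms, paging frame: {paging:.1f} ms")
```

In this model the bus sits just under the limit most of the time, so one burst of texture traffic is enough to make the bus the slowest stage for that frame.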
Are You Sure You Know What You’re Doing?
At this point those of you who know some things about 3-d graphics are shouting at your monitors: why are you guys transferring the mountains and forests over the PCIe bus every frame? Why not just put them in VRAM, since they don’t change?
That’s a good question and if you have a better solution than the one we use, I’d love to hear it.
The problem is this: OpenGL doesn’t give us a good way to prioritize which meshes (VBOs) stay in VRAM and which ones are purged out when we run out of VRAM. If we put every mesh in the sim into VRAM, framerate gets better (because we aren’t using the PCIe bus) right up until we run out of VRAM. At that point the OpenGL driver freaks out and starts throwing out textures to make room for meshes, and then the textures have to be sent back over the PCIe bus, and we end up in a world of hurt. We end up in a state of texture thrash as we have too much “stuff” for VRAM and framerate falls off of a cliff.
The real problem is this: X-Plane has no idea how much VRAM is available for its own use. Sure, the card might have 256 MB, but how much is being used by the OS window manager for those translucent window effects, or by other applications? We can’t even add up how much VRAM we use with ultimate precision because we don’t know the granularity of allocation on the video card (there’s real overhead for VBOs being rounded up to the VM page size, for example) or whether side buffers like a hierarchical Z buffer have been allocated.
X-Plane works around this with a simple rule: all OBJs go to VRAM, because their geometry is likely to be repeated, and non-repeating geometry, like forests and mountains, stays in system RAM and goes over the bus.
This heuristic actually works pretty well in X-Plane 9 – we have enough bandwidth to transfer all of that “stuff” once per frame, and we tend not to run out of VRAM and thrash.
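As a sketch, the rule boils down to something like the following. This is not X-Plane's actual code; the `Mesh` type, the `kind` labels and the sizes are hypothetical, and the real decision is made through OpenGL rather than a Python function.

```python
from dataclasses import dataclass

@dataclass
class Mesh:
    name: str
    kind: str       # "obj" for repeated objects; anything else counts as one-off geometry
    size_mb: float

def placement(mesh):
    """Repeated OBJ geometry lives in VRAM; non-repeating geometry stays in system RAM."""
    return "vram" if mesh.kind == "obj" else "system_ram"

def per_frame_bus_mb(meshes):
    """Only meshes left in system RAM cross the bus every frame."""
    return sum(m.size_mb for m in meshes if placement(m) == "system_ram")

scene = [
    Mesh("house", "obj", 0.5),         # hypothetical sizes
    Mesh("mountain_tile", "terrain", 12.0),
    Mesh("forest_tile", "forest", 8.0),
]

if __name__ == "__main__":
    for m in scene:
        print(m.name, "->", placement(m))
    print("streamed per frame:", per_frame_bus_mb(scene), "MB")
```

The trade is deliberate: the streamed meshes cost bus bandwidth every frame, but they never compete with textures for VRAM.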
Why Does X-Plane 10 Want more PCIe Bus Bandwidth?
X-Plane 10 is hungrier for bus bandwidth for three reasons:
- The OBJ engine’s performance has been improved a lot. In the past, we’d run out of CPU capacity (to draw OBJs) long before we ran out of bus bandwidth. This isn’t always the case with X-Plane 10. The graphics are always held back by their weakest link. If you have a strong GPU (and low effects settings) and the OBJ engine is efficient, the PCIe bus is the weakest link.
- The art assets are more detailed and thus contain more vertices.
- When shadowing is on we have to draw the entire world multiple times, once to build shadow maps and once to draw the real world. So shadowing can double (or even triple or worse) our bus bandwidth usage. We didn’t have that kind of free capacity on the bus in the first place.
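The shadow arithmetic is easy to sketch. This assumes, as a simplification, that every pass re-sends the same streamed geometry; a real renderer may cull each pass differently. The function name and the 50 MB figure are made up for illustration.

```python
def streamed_mb_per_frame(per_pass_mb, shadow_passes):
    """Bus traffic for streamed geometry: one main pass plus one per shadow map."""
    return per_pass_mb * (1 + shadow_passes)

if __name__ == "__main__":
    for passes in (0, 1, 2):
        total = streamed_mb_per_frame(50.0, passes)
        print(f"{passes} shadow pass(es): {total:.0f} MB per frame")
```

One shadow pass doubles the streamed traffic and two passes triple it, which is why a bus that was merely busy without shadows can become the bottleneck with them.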
We’re still working on the engine, art assets, and performance, so my hope is that we’ll find ways to cut down bus use (especially with shadows). And there has to be one slowest part of the system – as of this writing, the PCIe bus is often it.