- 1 Types of Rendering Primitives
- 2 Hardware Budget
- 3 Measuring CPU and Bus Usage
- 4 Identifying Back-End Bottlenecks (Raster Ops, etc.)
- 5 What Isn’t Counted
Types of Rendering Primitives
The X-Plane renderer’s “low level” component is responsible for the scenery drawn to the screen. It is considered low level because the more abstract scenery descriptions from DSF tiles are held in memory and then converted into low level primitives for drawing when they come into the local area of flight.
There are three kinds of low level primitives:
- Patches are static sets of triangles or lines that are not repeated and exist at a specific world location.
- Objects are statically placed instances of OBJ files, in other words, repeated meshes with possible attributes and animation.
- Vehicles are OBJ instances that continually change their position.
While a little bit more book-keeping is needed for vehicles, for the purpose of budgeting they can be treated as static OBJ instances; the cost of moving them is very low relative to the rendering load.
Rendering Primitives, DSFs and apt.dat
This is how DSF primitives and apt.dat primitives map to low level primitives:
- DSF Mesh -> Patches. (These are the only patches that always exist no matter where the plane flies.)
- DSF Object -> Object Instance.
- DSF Forest -> Patches.
- DSF Facade -> Patches.
- DSF Beach -> Patches.
- DSF Draped Lines -> Patches.
- DSF Draped Polygons -> Patches.
- DSF Object Strings -> Object Instances.
- DSF Road Network -> Patches, vehicles and object instances.
- Apt.dat pavement -> Patches.
- Apt.dat marking lines -> Patches.
- Apt.dat lights and fixtures -> Objects.
Most of the dynamic scenery in the sim (balloons, boats, birds) also becomes object instances.
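The mapping above can be written down as a small lookup table, which is handy when estimating what a scenery pack will cost. This is only a sketch: the key strings and the `primitive_kinds` helper are illustrative names, not identifiers from X-Plane or from the DSF and apt.dat formats.

```python
from collections import Counter

# Mapping copied from the list above. The key strings are illustrative
# labels (assumptions), not names used by X-Plane itself.
LOW_LEVEL_PRIMITIVES = {
    "DSF mesh": {"patch"},
    "DSF object": {"object"},
    "DSF forest": {"patch"},
    "DSF facade": {"patch"},
    "DSF beach": {"patch"},
    "DSF draped line": {"patch"},
    "DSF draped polygon": {"patch"},
    "DSF object string": {"object"},
    "DSF road network": {"patch", "vehicle", "object"},
    "apt.dat pavement": {"patch"},
    "apt.dat marking line": {"patch"},
    "apt.dat light or fixture": {"object"},
}


def primitive_kinds(scenery_types):
    """Count how many of the given scenery types produce each primitive kind."""
    counts = Counter()
    for name in scenery_types:
        counts.update(LOW_LEVEL_PRIMITIVES[name])
    return counts
```

For example, `primitive_kinds(["DSF forest", "DSF road network"])` reports two sources of patches and one source each of vehicles and object instances.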
Special Processing for Objects
The billboarded lights from objects are sometimes stripped from the objects, consolidated, and converted to patches. Most of the objects used for runway lights have been carefully built to make them candidates for this process.
An object with no content (e.g. no LODs or only empty LODs after light stripping) will be discarded from the scene graph.
What this means is that when the low-quality airport-light objects are used (i.e. “high detailed runway environment” is off), all of the lights are stripped and turned into patches, and no object instances are generated. The low-quality rendering load of O’Hare is therefore at most about 10,000 patch vertices and 0 object instances.
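The stripping and discard rule can be illustrated with a toy model. The `Lod` and `ObjModel` classes and the `strip_lights` function below are hypothetical and exist only for this sketch. It also assumes every light is a candidate for stripping, whereas the sim only strips lights "sometimes".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Lod:
    triangles: int = 0  # regular geometry in this LOD
    lights: int = 0     # billboarded lights in this LOD


@dataclass
class ObjModel:
    lods: List[Lod] = field(default_factory=list)


def strip_lights(model: ObjModel) -> Tuple[int, Optional[ObjModel]]:
    """Return (lights moved to patches, remaining object or None).

    Assumption: all lights are strippable. LODs left empty are dropped,
    and an object with no remaining LODs is discarded (None).
    """
    light_count = sum(lod.lights for lod in model.lods)
    remaining = [Lod(triangles=lod.triangles) for lod in model.lods if lod.triangles > 0]
    return light_count, (ObjModel(remaining) if remaining else None)
```

A lights-only runway fixture therefore contributes patch data but no object instance, while a fixture with real geometry keeps its instance.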
Hardware Budget
Graphics cards are pipelined – that is, different parts of the hardware perform different tasks in parallel; the pipeline runs as fast as the slowest stage. So your framerate will be limited by the resource you run out of first.
This section does not cover uses of the budget that tend not to matter. For example, objects consume video memory bandwidth, but video memory buses are so much faster than the rate at which the sim can draw objects that you would never max this out.
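The "slowest stage wins" rule can be sketched in a few lines. The stage names and timings below are made-up illustrations, and real hardware does not overlap stages this perfectly, so treat this as a mental model only.

```python
def frame_rate(stage_ms):
    """Return (frames per second, limiting stage) for stages running in parallel.

    stage_ms maps a stage name to the time in milliseconds that stage
    needs per frame. In an ideal pipeline the frame time equals the
    slowest stage's time.
    """
    limiting = max(stage_ms, key=stage_ms.get)
    return 1000.0 / stage_ms[limiting], limiting
```

With hypothetical timings of `{"cpu": 10.0, "bus": 4.0, "raster": 20.0}`, the result is 50 fps limited by `"raster"`. Halving the CPU cost would not change the framerate at all.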
Every time X-Plane has to draw a set of triangles (a “batch”), the CPU gets involved. The amount of work the CPU does depends on how much state change there is between batches.
Batches are used up by having a larger number of individual textures (each texture swap is a new batch), having more objects (each object is at least one batch), and having more attributes in objects (each attribute change is a new batch).
Patches also create batches, but because state is shared between patches, the cost of rendering a lot of similar patches is not as large as rendering a lot of objects.
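To see why shared state matters, the sketch below counts how often the graphics state changes across an ordered sequence of batches. Representing each batch by a single state label (for example a texture name) is a simplification made for this example. How X-Plane actually orders its batches is not described here.

```python
def state_changes(batches):
    """Count how many consecutive batches differ in graphics state.

    batches is an ordered list of hashable state labels, e.g. texture names.
    """
    return sum(1 for prev, cur in zip(batches, batches[1:]) if prev != cur)
```

For instance, `state_changes(["a", "b", "a", "b"])` is 3, while the same four batches grouped as `["a", "a", "b", "b"]` give 1. This is the sense in which many similar patches are cheaper than many unrelated objects.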
Graphics Bus Bandwidth
All data needed to render the scene that isn’t already in video memory has to travel over the graphics bus. Frame-rate can be limited if the graphics bus is maxed out.
The biggest consumer of bus bandwidth is patches, because they reside in system memory. Graphics bus bandwidth is also consumed when the working set exceeds video memory (see below).
Video memory is RAM on the graphics card. Because it is connected directly to the GPU by a high performance custom memory bus, data can be fetched from it much faster than from system memory.
Usage of video memory is controlled by the video driver; X-Plane requests that a set of objects be cached in video memory and the card does its best to store as many of them in video memory as possible; the rest are stored in system memory and transferred as needed.
X-Plane requests video-memory caching of the mesh vertices for all objects and all textures.
Dynamic textures (textures rendered by X-Plane) are particularly notable: not only do they take up video memory, but if they have to be removed from video memory they must be transferred over the bus, consuming bus bandwidth. Dynamic textures can therefore be considered significantly more expensive in video memory terms than static ones. One example is a very large panel texture used in a 3-d cockpit.
The working set is the set of all data that X-Plane needs to render a single frame. Every bit of data in the working set must either be cached in VRAM or transferred over the graphics bus. Thus you could say that:
framerate = graphics bus bandwidth / (working set - video memory)
In other words, once the working set exceeds video memory, framerate will start to go down, but less so if the graphics bus is fast. This equation isn’t perfectly accurate, because X-Plane will not attempt to cache the mesh vertex data for patches, even if video memory is free.
Working set becomes larger when more textures are visible on screen, more objects are visible on screen, more patches are visible on screen, etc. Careful LOD to fully remove objects can reduce working set, as can arranging textures so that shared textures are used for a set of nearby objects.
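As a worked example of the working-set relation above, the function below evaluates framerate = graphics bus bandwidth / (working set - video memory). The function name and all numbers are hypothetical. The result is a rough upper bound from bus traffic alone, not a prediction of real framerate.

```python
def bus_limited_fps(bus_bytes_per_sec, working_set_bytes, vram_bytes):
    """Upper bound on framerate imposed by bus traffic alone.

    Implements framerate = bus bandwidth / (working set - video memory).
    If the working set fits in video memory, this term imposes no limit,
    so infinity is returned.
    """
    overflow = working_set_bytes - vram_bytes
    if overflow <= 0:
        return float("inf")
    return bus_bytes_per_sec / overflow
```

With an assumed 4 GB/s bus, a 600 MB working set and 500 MB of video memory, `bus_limited_fps(4e9, 600e6, 500e6)` gives 40 fps. Shrinking the working set to 500 MB removes the limit entirely.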
Video Memory Bandwidth
While video memory is exceedingly fast, it is still possible for the video card to have trouble fetching texture data. This typically only happens when a very large texture is being drawn over a very large number of pixels and the drawing mode is very inexpensive (e.g. no shaders, no blending, etc.). It is also possible to consume extra video memory bandwidth by overriding X-Plane’s settings in the NVIDIA control panel (e.g. turning texture LOD bias way up).
Video memory bandwidth is effectively only consumed by textures.
Shader Ops and Raster Ops
Shader Ops are operations performed by pixel shaders; more complex shading algorithms (like the reflective water or volumetric fog) consume more shader ops, as does drawing more pixels.
Raster Ops are operations committing pixels to the screen. Drawing one texture over another (e.g. with overlay polygons) can consume a lot of raster ops, as well as shader ops, since we’re filling in the same memory over and over.
Measuring CPU and Bus Usage
CPU and bus usage can be measured by looking at specific dataref editor statistics. From the dataref editor, pick “show stats”; the sections below indicate what stats to look at. Use the filter function to view groups of stats easily.
tells the total number of “layers” (that is, groups of stuff with unique graphics state) drawn per frame – higher numbers indicate the use of more expensive batches and more CPU time.
Patch Vertex Throughput
Patches are the direct contributor to vertex throughput, which consumes bus bandwidth. Of all patch sources, forests are usually the one that maxes out the graphics bus, because it’s possible to create hundreds of thousands of trees with a simple polygon and .for file.
shows the number of patch vertices rendered per frame.
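A back-of-the-envelope estimate shows how quickly forests can add up. The per-tree vertex count and per-vertex size used in the example are assumptions chosen for illustration, not figures from X-Plane.

```python
def patch_bus_bytes_per_sec(vertices_per_frame, bytes_per_vertex, fps):
    """Bus traffic from patch vertices that are sent across the bus every frame."""
    return vertices_per_frame * bytes_per_vertex * fps
```

For example, 200,000 trees at a hypothetical 8 vertices each is 1,600,000 patch vertices per frame. At an assumed 32 bytes per vertex and 30 fps, `patch_bus_bytes_per_sec(1_600_000, 32, 30)` is about 1.5 GB per second of bus traffic from trees alone.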
Patch CPU Load
The terrain patch stats show how the CPU is being used to draw patches.
tells the number of patches being drawn, adding (relatively inexpensive) batches.
tells the number of patches that are visibility tested, consuming CPU.
tells the number of steps taken over the quad-tree containing all patches, consuming CPU.
The water stats are rather poorly named; when reflective water is on, the sim has to make an initial pass over all visible water patches to figure out how to position the water reflection plane. This usually isn’t very expensive compared to the rest of drawing, but could be if you had a really huge number of water patches.
tells the number of steps taken over the quad-tree containing information about water location, consuming CPU.
tells how many water patches were evaluated, consuming CPU.
tells how many water patches were visibility-tested, consuming CPU.
Object Throughput and Batching
Static object instances and vehicles are captured through the following stats:
– the number of steps through the scene graph for cars.
– the number of cars drawn.
– the number of car visibility tests.
– the number of steps through the scene graph for objects.
– the number of drawn static objects.
– the number of object visibility tests.
These stats apply to all objects drawn from any source in the sim.
– the number of batches caused by all object drawing.
– the number of commands executed by all objects.
– the number of lines drawn for all objects.
– the number of triangles evaluated for mouse-click testing for the cockpit object.
– the total number of objects drawn (static, vehicles, dynamic scenery, and airplane-attached).
– number of non-sequential drawn objects.
– number of tris drawn for all objects.
– number of objects drawn that are attached to airplanes.
Identifying Limiting Bottlenecks
A few tips on identifying limitations due to CPU and bandwidth:
- If X-Plane is not fully utilizing one core, it is probably not bottlenecked on CPU use. (Note that the sim can be bottlenecked on the CPU and, later in the pipeline, on raster ops at the same time.)
- If drawing fewer objects improves frame-rate when the objects in question are very small (e.g. an aerial view) and texture res is set conservatively, then you are probably batch-limited and CPU limited.
- If you are not drawing objects and pixel shaders and FSAA are off and you don’t appear to be batch-limited, you could be bus limited.
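The tips above can be collected into a small checklist function. This is a sketch of the reasoning only: the function name and parameter names are invented for this example, and the answers have to come from your own observations in the sim.

```python
def front_end_hints(core_saturated, fewer_small_objects_helps,
                    drawing_objects, shaders_or_fsaa_on):
    """Turn the three tips into a list of hints.

    fewer_small_objects_helps should only be True if the test was made
    with very small objects on screen and a conservative texture resolution.
    """
    hints = []
    if not core_saturated:
        hints.append("probably not CPU-bound")
    batch_limited = fewer_small_objects_helps
    if batch_limited:
        hints.append("probably batch-limited and CPU-limited")
    if not drawing_objects and not shaders_or_fsaa_on and not batch_limited:
        hints.append("possibly bus-limited")
    return hints
```

The hints are deliberately worded as "probably" and "possibly", matching the tips themselves. None of these tests is conclusive on its own.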
Identifying Back-End Bottlenecks (Raster Ops, etc.)
X-Plane can be bottlenecked on back-end operations, e.g. Raster Ops and Shader Ops.
- If decreasing the sim’s screen size improves framerate, the sim may be bottlenecked on raster ops or shader ops.
- If turning down the sim’s anti-aliasing settings improves framerate, the sim may be bottlenecked on raster ops or shader ops.
- If turning off shaders improves framerate (compared to shaders with no volumetric fog or water reflections), you may be bottlenecked on shader ops.
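These three experiments amount to comparing framerates before and after a single settings change. The sketch below encodes that comparison. The function name, its parameters and the 5% noise threshold are assumptions made for this example. For the shader test, the baseline should be measured with shaders on but volumetric fog and water reflections off.

```python
def back_end_hints(baseline_fps, smaller_window_fps, lower_aa_fps,
                   shaders_off_fps, min_gain=0.05):
    """Suggest back-end bottlenecks from framerates measured by hand.

    A change counts as an improvement only if it beats the baseline by
    more than min_gain (5% by default), to ignore measurement noise.
    """
    def improved(fps):
        return fps > baseline_fps * (1.0 + min_gain)

    hints = []
    if improved(smaller_window_fps) or improved(lower_aa_fps):
        hints.append("raster ops or shader ops")
    if improved(shaders_off_fps):
        hints.append("shader ops")
    return hints
```

Change one setting at a time and return to the baseline between measurements. Otherwise the comparisons do not isolate a single stage.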
There are a few specific authoring techniques that can bottleneck back-end ops:
- Over-drawing the terrain using overlay polygons consumes a lot of fill, since the sim must draw pixels and then draw over them again.
- Overlapping a lot of pavement in an airport can also consume fill rate.
Use Mesh Tool, not overlay polygons, to add orthophotos to the sim.
Don’t overlap pavement in apt.dat files.
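The cost of overlapping layers can be expressed as an overdraw factor: the number of pixels written divided by the number of pixels on screen. The helper below is a simplified illustration with an invented name. It ignores depth testing and any other way the hardware might skip hidden pixels.

```python
def overdraw_factor(layer_pixels, screen_pixels):
    """Average number of times each screen pixel is written.

    layer_pixels lists the on-screen pixel coverage of each layer drawn.
    """
    return sum(layer_pixels) / screen_pixels
```

On a 1920 x 1080 window, terrain covering the whole screen plus a full-screen orthophoto overlay gives `overdraw_factor([2073600, 2073600], 2073600)`, which is 2.0. Every pixel is filled twice, which is the fill cost that baking the orthophoto into the mesh avoids.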
What Isn’t Counted
The statistics above do not count three major sources of slow-down:
- The 2-d panel.
- Non-object airplane geometry (e.g. geometry modeled by PlaneMaker).
- Clouds and weather drawing.
To get a clean estimate, use a forward-no-HUD view on a clear day to eliminate these other consumers of resources.