In a previous post I discussed a new facility in X-Plane 10.25r1 (coming real soon) to disable aircraft-attached objects for performance optimization.  There is a second use for this feature: performance analysis.

This post is targeted at aircraft authors, particularly authors who create complex aircraft with Lua scripts or plugins.

The basic idea here is to remove work from X-Plane and measure the improvement in performance.  I strongly recommend you look at X-Plane’s framerate in milliseconds.  An improvement of 1 frame per second is a big improvement at 15 fps and almost nothing at 60 fps.  But saving 5 ms is always a good thing, no matter what framerate.
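To see why milliseconds are the better unit, here is a small sketch that converts framerate to frame time and computes how much time a 1 fps gain actually represents. The function names are made up for this illustration.

```python
def frame_time_ms(fps):
    """Convert a framerate (frames per second) to a frame time in milliseconds."""
    return 1000.0 / fps


def ms_saved_by_one_fps(fps):
    """Frame time saved by going from `fps` to `fps + 1` frames per second."""
    return frame_time_ms(fps) - frame_time_ms(fps + 1)


if __name__ == "__main__":
    for fps in (15, 60):
        saved = ms_saved_by_one_fps(fps)
        print(f"{fps} -> {fps + 1} fps saves {saved:.2f} ms per frame")
```

Going from 15 to 16 fps saves about 4.17 ms per frame; going from 60 to 61 fps saves about 0.27 ms.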

The GPU Is Confusing

Your graphics card is basically a second computer, with its own memory, its own bus, and its own CPU (in this case called a GPU).  Ideally the CPU on your computer and GPU on your graphics card are both working at the same time, in parallel, to present each frame of X-Plane.

OpenGL drivers accomplish this by building a “todo” list for the GPU, which the GPU then does on its own time.  Ideally the CPU is off computing flight models and running plugins while the GPU is still filling in mountain pixel shaders on screen.

For any given frame, however, one of the CPU or GPU will have more work than the other, and the other will be idle some of the time.

  • If you have a lot of GPU work (e.g. 4x SSAA HDR with clouds on a huge screen) and not much CPU work (e.g. no autogen) then the CPU will have to wait.  It will give all of its instructions to the GPU for a frame, then the next one, then the next one and eventually the GPU will go “stop!  I’m not done with the frames you gave me” and the CPU will wait.
  • If you have a lot of CPU work (e.g. lots of objects and shadows and plugins) but not much GPU work (e.g. a small screen without HDR on a powerful graphics card) the GPU will wait; it will finish its entire todo list and then go “uh, now what?”.  Until the CPU can get more instructions ready, the GPU is idle.

Here’s the key point: your framerate is determined by the processor that is not idle.  If your GPU is idle and your CPU is not, you are “CPU bound”; if your CPU is idle and your GPU is not, you are GPU bound.  Optimizing the use of the processor that is idle will not improve framerate at all.
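A deliberately simplified way to picture this is to model frame time as whichever of the two processors takes longer. This is only a sketch: it assumes perfect overlap between CPU and GPU and ignores synchronization overhead, and the numbers are invented for illustration.

```python
def modeled_frame_time_ms(cpu_ms, gpu_ms):
    """Simplified model: the busier processor sets the frame time."""
    return max(cpu_ms, gpu_ms)


# Invented workload: the GPU needs twice as long as the CPU (GPU bound).
before = modeled_frame_time_ms(cpu_ms=10.0, gpu_ms=20.0)

# Saving 5 ms on the idle processor (the CPU) changes nothing.
after_cpu_opt = modeled_frame_time_ms(cpu_ms=5.0, gpu_ms=20.0)

# Saving 5 ms on the busy processor (the GPU) shortens the frame.
after_gpu_opt = modeled_frame_time_ms(cpu_ms=10.0, gpu_ms=15.0)

if __name__ == "__main__":
    print(before, after_cpu_opt, after_gpu_opt)
```

In this model the CPU optimization leaves the frame at 20 ms, while the same-sized GPU optimization brings it down to 15 ms.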

Viewing Specific Processor Load in X-Plane

X-Plane’s “frame rate” data output gives you two useful numbers for figuring out where X-Plane is spending its time:

  • “frame time” gives you the total time to draw one frame, in milliseconds.
  • “CPU load” gives you the fraction of that frame time that the CPU was busy.

For example, my copy of X-Plane is showing 9 ms frame time and .78 CPU load.  That means that the GPU needs 9 ms to draw the frame, but the CPU needs only about 7 ms (9 ms × 0.78) – I am GPU bound. (I am also in the middle of the ocean, hence the low numbers.)

Unfortunately if you are CPU bound (CPU load > 0.95 or so) there is no current display for the GPU’s utilization; we are working on that for X-Plane 10.30.
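The arithmetic behind these two data outputs can be sketched as follows. The function names are hypothetical, and the 0.95 cutoff is the rough "0.95 or so" figure from the text, not an exact boundary.

```python
# Rough cutoff taken from the text ("0.95 or so"); treat it as approximate.
CPU_BOUND_THRESHOLD = 0.95


def cpu_time_ms(frame_time_ms, cpu_load):
    """CPU time per frame: the fraction of the frame time the CPU was busy."""
    return frame_time_ms * cpu_load


def bottleneck(cpu_load, threshold=CPU_BOUND_THRESHOLD):
    """Classify a reading as CPU bound or GPU bound from the CPU load alone."""
    return "CPU bound" if cpu_load > threshold else "GPU bound"


if __name__ == "__main__":
    # The reading quoted above: 9 ms frame time, 0.78 CPU load.
    print(f"CPU time: {cpu_time_ms(9.0, 0.78):.2f} ms, {bottleneck(0.78)}")
```

For the 9 ms / 0.78 reading this gives about 7.02 ms of CPU time and classifies the frame as GPU bound.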

Analyze Performance By Subtraction

Putting it all together:

  • You can calculate the actual CPU time spent in your add-on from the CPU load and frame time data outputs.
  • You can disable your add-on to see how much CPU time is now used; the difference is the CPU time your add-on is consuming.
  • You can change settings to be GPU bound (and confirm that by seeing CPU load < 0.95).  Once you are GPU bound, improvements in framerate when you turn off your add-on show the amount of time you used the GPU for.
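The three steps above can be sketched as a small calculation over two readings, one with the add-on enabled and one with it disabled. Everything here is an assumption made for illustration: the `Reading` type, the function names, and the sample numbers are invented, and the 0.95 cutoff is the approximate figure used in this post.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One pair of values read from the frame rate data output."""
    frame_time_ms: float
    cpu_load: float

    @property
    def cpu_ms(self):
        # CPU time per frame = frame time * fraction of it the CPU was busy.
        return self.frame_time_ms * self.cpu_load


def addon_cpu_cost_ms(enabled, disabled):
    """CPU time attributable to the add-on: CPU time with it minus without it."""
    return enabled.cpu_ms - disabled.cpu_ms


def addon_gpu_cost_ms(enabled, disabled, threshold=0.95):
    """GPU time attributable to the add-on; only meaningful when GPU bound."""
    if enabled.cpu_load >= threshold or disabled.cpu_load >= threshold:
        raise ValueError("not GPU bound in both readings")
    return enabled.frame_time_ms - disabled.frame_time_ms


if __name__ == "__main__":
    with_addon = Reading(frame_time_ms=20.0, cpu_load=0.60)     # invented numbers
    without_addon = Reading(frame_time_ms=17.0, cpu_load=0.50)  # invented numbers
    print(f"CPU cost: {addon_cpu_cost_ms(with_addon, without_addon):.1f} ms")
    print(f"GPU cost: {addon_gpu_cost_ms(with_addon, without_addon):.1f} ms")
```

With these invented readings the add-on costs about 3.5 ms of CPU time and 3.0 ms of GPU time. The GPU calculation refuses to run on a CPU-bound reading, because in that case a change in frame time says nothing about the GPU.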

Armed with this knowledge, you can find the real cost in milliseconds of GPU and CPU time of the various parts of your add-on.  You can find out what costs a lot (the panel? The 3-d? The systems modeling?) and then focus your optimizations in places where they will matter.

Support Performance Analysis in Your Add-Ons

In order to tell what part of your add-on is consuming CPU and GPU time, you need to be able to separate your add-on into its sub-components and disable each one.

If your add-on makes heavy use of plugin code or scripts, I recommend putting in datarefs that turn off entire sub-sections of processing.  Good choices might be:

  • Bypassing custom drawing of glass displays in an aircraft.
  • Bypassing per-frame systems calculations.
  • Bypassing per-frame audio code, or shutting off the entire audio subsystem.
  • Turning off any 2-d overlay UI.

You can then use DataRefEditor to turn off each part of your system and see how much framerate comes back.  If you get 5 ms of CPU time back from turning off your 2-d UI you can then say “wow, the UI should not be so CPU expensive” and investigate.
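The shape of such kill switches can be sketched without any SDK calls. This is a language-neutral illustration written in Python, not X-Plane plugin code: the switch names, the subsystem table, and the placeholder functions are all hypothetical, and in a real add-on each switch would be a writable dataref that your plugin or script publishes.

```python
# Hypothetical kill switches; in a real add-on each would be a writable dataref.
kill_switches = {
    "kill_glass_displays": False,
    "kill_systems": False,
    "kill_audio": False,
    "kill_2d_ui": False,
}


def _placeholder_work():
    """Stand-in for a subsystem's real per-frame work."""
    pass


# Each subsystem is paired with the switch that bypasses it.
SUBSYSTEMS = {
    "glass_displays": ("kill_glass_displays", _placeholder_work),
    "systems": ("kill_systems", _placeholder_work),
    "audio": ("kill_audio", _placeholder_work),
    "2d_ui": ("kill_2d_ui", _placeholder_work),
}


def run_frame(subsystems, switches):
    """Run every subsystem whose kill switch is off; return the names that ran."""
    ran = []
    for name, (switch_name, work) in subsystems.items():
        if switches.get(switch_name, False):
            continue  # bypassed for performance analysis
        work()
        ran.append(name)
    return ran


if __name__ == "__main__":
    kill_switches["kill_2d_ui"] = True
    print(run_frame(SUBSYSTEMS, kill_switches))
```

Checking the switch once per frame at the top of each subsystem keeps the bypass cheap, so the switch itself does not distort the measurement.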

X-Plane Supports Performance Analysis

The technique of shutting a system off and measuring the increase in performance is exactly how we analyze performance on X-Plane itself, and as a result X-Plane ships with a pile of “art controls” that disable various pieces of the simulator.

This article provides a big list of art controls you can use to disable parts of your aircraft and measure the performance improvement.

Here’s where object-kill datarefs come in: the “cost” of drawing an object is mostly inside X-Plane (in driver and OpenGL code) but also in the time spent calling dataref handlers that drive animation.  With an object-kill dataref, you can turn off your objects one at a time and see which ones consume a lot of CPU and GPU time.  If an object is found to be expensive, it might be worth an optimization pass in a 3-d editor, or some time spent to improve the scripts that control its datarefs.

 

About Ben Supnik

Ben is a software engineer who works on X-Plane; he spends most of his days drinking coffee and swearing at the computer -- sometimes at the same time.

4 comments on “Performance Analysis by Subtraction”

  1. I tried accessing these art controls as datarefs
    XPLMSetDatai(XPLMFindDataRef("perf/kill_fm"), 1);
    but it didn’t work. So to kill the flight model, the way to do it is still to use dataref
    “…/override_flightcontrol” ?

  2. Well written.

    These CPU and GPU load rates are static values, right? Would it not be more suitable if they were measured over time, i.e. an average framerate test?

    1. No, they are updated in real time – they represent the instantaneous utilization of your hardware “right now.” They are updated on a 1-second basis – long enough to be statistically useful but short enough that you can change the view and get a new answer quickly.
