This blog post is more or less a “stern talking to” for plugin developers, but before I go there, I want to acknowledge that we (Laminar) screwed up the docs here in a way I didn’t even realize until working on a bug report. A decade ago, when the plugin SDK was young, we did have clear docs stating that the plugin APIs were not at all thread-safe. Sandy and I were also on the plugin dev email list and we’d administer a Scottish-style beating to anyone who even started typing the first few letters of “threading”.
However, this document was lost in the migration from the old XSquawkBox server to developer.x-plane.com. I’m working with Jennifer on new docs now, and I apologize for the thrash any plugin developer gets hit with from not knowing the threading guidelines.
With that in mind: the X-Plane plugin API is not thread-safe. You can only call plugin APIs on the thread that called you. No exceptions!
The plugin SDK was invented for X-Plane 6. At the time, multi-CPU and multi-core hardware was totally unavailable to the flight sim community. Apple hadn’t released the dual-G5, and the Pentium D hadn’t come out yet. There was no point in thinking about multi-core because there weren’t multiple cores.
The plugin guidelines were therefore set up very simply: the API is single-threaded; call us back on the thread we called you on. The SDK internally has no locks or handling of re-entrancy and has no model to cope with resource sharing or data integrity across threads.
Furthermore, X-Plane’s crash detection system is not meant to categorize other-thread crashes, so you can’t easily tell that your plugin is crashing the sim. You might even crash a different plugin, or some internal part of X-Plane.
One more way we’re not thread-safe that might not be obvious: when you read a dataref, you are executing code from another plugin. You can’t do this on a worker thread if the XPLM isn’t thread-safe, but you also can’t do this if the other plugin isn’t thread-safe.
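To make the dataref point concrete, here is a minimal sketch of why a read is really a function call into someone else's code. Everything in it is hypothetical: `dataref_entry`, `read_dataref`, and the `owner_state` accessor are illustrative stand-ins, not the XPLM's actual types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a dataref: a name plus an accessor function that
   belongs to whichever plugin published it. Not the real XPLM types. */
typedef float (*read_float_fn)(void *refcon);

typedef struct {
    const char   *name;
    read_float_fn read;
    void         *refcon;
} dataref_entry;

/* "Reading" the dataref runs the owner's accessor on the caller's thread. */
float read_dataref(const dataref_entry *ref) {
    assert(ref != NULL && ref->read != NULL);
    return ref->read(ref->refcon);
}

/* The owning plugin's accessor. It updates its own state with no locking,
   which is fine on one thread and a data race if a second thread calls it. */
typedef struct {
    float fuel_kg;
    int   reads;
} owner_state;

static float owner_read_fuel(void *refcon) {
    owner_state *s = (owner_state *)refcon;
    s->reads++;            /* unsynchronized write inside a "read" */
    return s->fuel_kg;
}
```

The reader never sees `owner_read_fuel`, but it runs on whatever thread asked for the value, so the reader's threading choices become the owner's problem.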
Stopping the Bleeding
In X-Plane 11.30 we made a change to stabilize the sim: we added code to actively detect and ignore plugin calls to the XPLMTerrainProbe APIs from background threads. We did this after seeing automatically reported crashes that turned out to be due to plugins calling the SDK terrain probe API from worker threads. By ignoring the call, we avoid the crash.
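A plugin can apply the same "detect and ignore" idea on its own side while it hunts down stray worker-thread calls. The sketch below is an assumption-laden illustration, not how X-Plane implements its check: it assumes POSIX threads (on Windows you would compare thread IDs from `GetCurrentThreadId` instead), and `remember_sim_thread`, `on_sim_thread`, `guarded_probe`, and `fake_sdk_probe` are all made-up names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical plugin-side guard; none of these names come from the XPLM. */
static pthread_t g_sim_thread;
static bool      g_sim_thread_known = false;

/* Call once from code that X-Plane itself invokes (for example the plugin's
   start or enable entry point), so the recorded thread is the sim's. */
void remember_sim_thread(void) {
    g_sim_thread = pthread_self();
    g_sim_thread_known = true;
}

/* True only when the caller is on the thread recorded above. */
bool on_sim_thread(void) {
    assert(g_sim_thread_known);   /* remember_sim_thread() must run first */
    return pthread_equal(pthread_self(), g_sim_thread) != 0;
}

/* Stand-in for a real SDK call; it only counts how often it was reached. */
static int g_probe_calls = 0;
static void fake_sdk_probe(void) { g_probe_calls++; }

/* Wrapper: turn the call into a no-op when made from the wrong thread.
   Returns true if the call was actually made. */
bool guarded_probe(void) {
    if (!on_sim_thread()) {
        return false;             /* ignore the call rather than risk a crash */
    }
    fake_sdk_probe();
    return true;
}
```

In a debug build it is probably more useful to log or assert in the wrong-thread branch than to stay silent, since the silent no-op hides the bug you are trying to find.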
My plan is to start doing this for all plugin calls at a patch point in the next few months. The problem here is that there’s basically no such thing as a benign threading bug – what’s at stake is the complete destabilization of the sim.
If you are a plugin author and you are using background threads to call XPLM APIs, please stop doing this now. Please plan to fix this in your plugin as soon as possible. The change will probably not make things any worse – my current idea is to no-op the calls, just like we did with terrain probes. But if your plugin is using these async calls and sometimes succeeding but sometimes crashing, you’re going to stop seeing both the crashes and the occasional lucky “success”.
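One common way to fix a plugin like this is to have the worker thread hand its results to the sim thread instead of calling the SDK itself. Here is a minimal sketch of that hand-off, assuming POSIX threads and a small fixed-size queue. The names `result_queue`, `queue_push`, `queue_drain`, and `apply_on_sim_thread` are hypothetical, and the "apply" step stands in for whatever XPLM call your plugin really makes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 64

/* Hypothetical hand-off queue shared by a worker thread and the sim thread. */
typedef struct {
    pthread_mutex_t lock;
    float           items[QUEUE_CAPACITY];
    size_t          count;
} result_queue;

static result_queue g_results = { PTHREAD_MUTEX_INITIALIZER, {0}, 0 };

/* Worker-thread side: store a result and make no SDK calls at all.
   Returns 1 on success, 0 if the queue is full and the value was dropped. */
int queue_push(result_queue *q, float value) {
    int ok = 0;
    assert(q != NULL);
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAPACITY) {
        q->items[q->count++] = value;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Sim-thread side: copy everything out under the lock, then apply the
   results in arrival order with the lock released, so the worker is never
   blocked behind the apply step. Returns the number of results applied. */
size_t queue_drain(result_queue *q, void (*apply)(float)) {
    float  local[QUEUE_CAPACITY];
    size_t n, i;
    assert(q != NULL && apply != NULL);
    pthread_mutex_lock(&q->lock);
    n = q->count;
    for (i = 0; i < n; i++) local[i] = q->items[i];
    q->count = 0;
    pthread_mutex_unlock(&q->lock);
    for (i = 0; i < n; i++) apply(local[i]);
    return n;
}

/* Stand-in for the SDK work (e.g. writing a dataref) done on the sim thread. */
static float g_last_applied = 0.0f;
static void apply_on_sim_thread(float v) { g_last_applied = v; }
```

The worker would call `queue_push(&g_results, value)` whenever it has something, and a flight loop callback (or any other callback X-Plane invokes) would call `queue_drain(&g_results, apply_on_sim_thread)` once per invocation. The only thing the two threads share is the queue, and the SDK is only ever touched from the thread that called you.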
I’m working with Tyler and Jennifer on a docs update now – hopefully this week all of the docs should be completely consistent. But they’re going to say what I’ve said above: no calls to the XPLM on worker threads.
Will We Ever Be Thread-Safe?
Sandy and I did some work back in the XPLM 2.0 days to start making the SDK partly thread-safe. This work was not completed, but the idea was to at least make a small number of APIs callable from worker threads. For example, by making flight loop callbacks schedulable from worker threads, a plugin could “wake up” SDK code from an async IO callback. That idea still makes some sense and we may get there someday.
For expensive tasks, we’ve already made API changes to address the underlying performance problems. For example, you can’t load an object synchronously from a thread, but you can ask us to load it asynchronously and call you back, and we use one of our threads to offload the work.
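From the plugin's side, the shape of that kind of contract looks roughly like the sketch below: you make a request that returns immediately, and the result shows up later in a callback. This is a toy model with hypothetical names (`request_load_async`, `host_deliver_completed_loads`, `on_model_loaded`); the SDK's real asynchronous loading call has its own signature, which this does not try to reproduce, and the "host" here fakes the load rather than doing work on another thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: the host reports a finished load later. */
typedef void (*load_done_fn)(int object_handle, void *refcon);

typedef struct {
    const char  *path;
    load_done_fn done;
    void        *refcon;
} load_request;

#define MAX_PENDING 8
static load_request g_pending[MAX_PENDING];
static size_t       g_pending_count = 0;

/* Plugin-facing request: records the request and returns immediately.
   Returns 1 if accepted, 0 if too many requests are already pending. */
int request_load_async(const char *path, load_done_fn done, void *refcon) {
    assert(done != NULL);
    if (g_pending_count == MAX_PENDING) return 0;
    g_pending[g_pending_count].path   = path;
    g_pending[g_pending_count].done   = done;
    g_pending[g_pending_count].refcon = refcon;
    g_pending_count++;
    return 1;
}

/* Stand-in for the host: some time later, on the thread that normally calls
   into plugins, report each finished load. The handles are fake. */
void host_deliver_completed_loads(void) {
    load_request batch[MAX_PENDING];
    size_t i, n = g_pending_count;
    /* Copy first so a callback may safely queue a new request. */
    for (i = 0; i < n; i++) batch[i] = g_pending[i];
    g_pending_count = 0;
    for (i = 0; i < n; i++) {
        int fake_handle = (int)(i + 1);
        batch[i].done(fake_handle, batch[i].refcon);
    }
}

/* Plugin side: state that stays "not ready" until the callback fires. */
typedef struct {
    int object_handle;
    int ready;
} my_model;

static void on_model_loaded(int object_handle, void *refcon) {
    my_model *m = (my_model *)refcon;
    m->object_handle = object_handle;
    m->ready = 1;
}
```

The point for plugin authors is the `ready` flag: code that wants the object has to cope with it not being there yet, which is the price of not blocking the sim while the load happens.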
Some APIs I expect to never be thread-safe. For example, we can’t sanely provide a threaded API to datarefs because we can’t promise that the plugin on the other side of the call is thread-safe. Given that a dataref read function can call other datarefs or other arbitrary plugin code, the opportunity for deadlocks is limitless.
Tyler’s Fever Dream
I should mention something that Tyler has been looking at as a future SDK initiative. This is somewhere between pie-in-the-sky and a fever-dream: plugins running asynchronously in separate processes.
The idea is to have a version of the main SDK APIs that is “fundamentally asynchronous” (i.e. the response comes later and that’s baked into the contract). This would allow plugins to run in another process, with results being communicated via IPC. Out-of-process plugins would have a bunch of advantages:
- You could write a plugin in pretty much any language we can write a binding for. Right now, plugins must fit into an unmanaged DLL inside X-Plane; this would allow the full “async API” to be used in .NET or even MATLAB. The requirement would only be that the plugin “app” environment be able to host unmanaged C code. This is a much less difficult requirement to meet.
- Plugins would be isolated by process boundaries; if a plugin crashes, the sim keeps running. Plugins can’t go around trashing each other’s memory.
- Use of multiple cores for plugin CPU processing is basically free, because the plugins can execute in parallel.*
- Plugins can each have their own CEF instance – or any other library that doesn’t like to be multiply instantiated in a single process.
One of the major hurdles to implementing an API like this was sharing graphics across the process boundary. As it turns out, we have to solve this problem for Vulkan anyway – the same APIs that share surfaces between Metal/Vulkan and OpenGL are IPC-friendly from day 1. So a 2-d plugin OpenGL window in Vulkan doesn’t have to be in X-Plane’s process.
I don’t see async plugins ever replacing 100% of in-process plugins, and this isn’t a plan to kill off the current C API. But I do think out-of-process plugins would be a much better fit for a wide range of plugin tasks.
* Users familiar with game programming will appreciate one of the fundamental dangers here: plugins in other processes might steal oversubscribed cores from the sim itself, causing stutters and unreliable framerate. Tyler and I have talked about that a little bit, and have some ideas for taking plugin background work into account and giving plugins ways to opt in and say “please run this background work for me at a time that won’t hose framerate.”