What Vulkan Means to Developers

A user asked me to write a little bit about Vulkan. My first reaction was to not post anything for a simple reason: Vulkan is a feature for me, not for you. That is, Vulkan does not directly make your life as an X-Plane user better; instead it makes my development job easier and makes it possible for me to create a better X-Plane.

Some day if we end up running on multiple drivers (e.g. Vulkan and OpenGL and Metal), you may not be able to tell the difference between the Vulkan and OpenGL version; the sky won’t be more blue, the clouds won’t be more puffy. I can’t think of any features exposed by Vulkan that aren’t in OpenGL 4.x with extensions now — we already have sparse memory, tessellation and compute in OpenGL. (But then, you might actually be able to tell because framerate might be higher.)

Anyway, having finished reading the (700+ page – that’s what down time in the airport is for) Vulkan spec last week, the rest of this post is my view on Vulkan as a 3-d graphics developer. This is definitely an “inside baseball” kind of post, so if you want to go surf cat videos, I won’t be offended!

The Problems with OpenGL

For an application like X-Plane, Vulkan is an improvement over OpenGL; to understand why, we have to look at the problems with OpenGL as an API that allows applications to communicate with 3-d hardware drivers. There are a few:

OpenGL’s approach to multi-core and threading is antithetical to performance. This is really the straw that broke the camel’s back for me and OpenGL. Developers who know me know that I’m not a “burn it down and start from scratch” kind of guy*. But the threading model in OpenGL requires drivers to hurt performance with safety checks and locks that an application can’t get away from. This isn’t something you can just fix with an extension.
OpenGL’s object and binding model also makes multicore performance difficult. The use of a “current” object in a context and the ability to radically change objects after they are built mean lots of driver paths require internal locks.
OpenGL’s compatibility is its greatest strength and greatest weakness. The original plan to rebuild the object model in OpenGL 3.0 died before it was released. Instead the ARB came up with a plan to optionally deprecate APIs. Every vendor of OpenGL except for Apple has chosen to keep backward compatibility with everything. That’s great for keeping old apps working, but it means every change to OpenGL is expected to work with everything else there ever was, ever.
OpenGL requires that shaders be compiled by the driver. This means an application is exposed to idiosyncrasies in the compiler of each driver we ship with. This isn’t as bad as it used to be, but writing shaders is still a matter of write-once, check everywhere.**

How Vulkan Helps: The design principles behind the Vulkan API address all of these issues.

The biggest single feature of Vulkan is its new multi-core friendly threading model. Vulkan is “externally” synchronized, which basically means applications can do whatever they want, but have to talk to different parts of the driver from different threads. To use an analogy: OpenGL is filled with traffic lights. Vulkan doesn’t even have stop signs, and it’s up to two drivers to not be on the same road at the same time. (For us application developers, this is a great setup, as we know what roads we are on and can plan to not have collisions.)
The object model clearly separates expensive creation operations from inexpensive usage operations. Expensive object creation can be done on worker threads or at initialization time. Objects can’t be radically reconfigured once they are built, so the small changes that are allowed to existing objects aren’t slow.
Shaders are compiled ahead of time into an intermediate representation (SPIR-V); no more shader compile failures after a driver update.
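To make the "externally synchronized" idea a little more concrete, here is a minimal sketch in Python rather than real Vulkan code. Every name in it (CommandBuffer, record_chunk, build_frame) is hypothetical and not part of any graphics API. The only point is the division of labor: each worker thread records into its own buffer with no lock at all, and the application alone decides when recording is finished and in what order the buffers are handed over.

```python
import threading


class CommandBuffer:
    """Hypothetical stand-in for a per-thread command buffer (not a real API)."""

    def __init__(self, name):
        self.name = name
        self.commands = []

    def draw(self, mesh):
        # No lock: by convention only one thread ever touches this buffer.
        self.commands.append(("draw", mesh))


def record_chunk(cmd, meshes):
    """Worker body: record draws for one slice of the scene."""
    for mesh in meshes:
        cmd.draw(mesh)


def build_frame(scene, worker_count):
    """Split the scene across workers, then 'submit' buffers in a fixed order."""
    buffers = [CommandBuffer(f"worker-{i}") for i in range(worker_count)]
    chunks = [scene[i::worker_count] for i in range(worker_count)]
    threads = [
        threading.Thread(target=record_chunk, args=(buf, chunk))
        for buf, chunk in zip(buffers, chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the application, not the driver, provides the synchronization
    # Submission order is chosen by the application, so it is deterministic.
    return [command for buf in buffers for command in buf.commands]


frame = build_frame([f"mesh{i}" for i in range(8)], worker_count=4)
```

If two threads ever shared one CommandBuffer here, nothing would stop them from colliding; avoiding that is the application's job, which is the "no stop signs" part of the analogy.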

Vulkan is a smaller, lower level API with clear performance guidelines and a focus on multicore from day one. This is a good fit for what we need with X-Plane.

The Problems with the OpenGL Ecosystem

The OpenGL ecosystem is the collection of all of the companies and programmers working with OpenGL. This includes the vendors of graphics chips who create OpenGL drivers (NVidia, AMD, Intel, Imagination, Qualcomm, ARM), OS vendors who provide OpenGL interfaces (Apple and Google), the major game engine developers that have the ear of the hardware vendors (think Unreal Engine 4, Unity, etc.), the major CAD application developers, etc.

More serious for OpenGL than the problems of the API is the state of the ecosystem.

OpenGL’s API is underspecified: there is no comprehensive conformance test for OpenGL, so we can’t really know if an OpenGL driver works or is buggy.
OpenGL’s API is underspecified: no performance guarantees or even recommendations are made. If you ever look at the tech blog Chris and I maintain, it’s full of posts about the latest witchcraft I’ve found to make vertex throughput go faster. That stuff isn’t part of the GL spec, yet you have to know it to build a real-time graphics application.

Given the lack of specifics, application developers and driver writers end up locked in a sort of death-spiral:

Driver writers observe the behavior of applications and change the driver behavior to work around applications. This can include trying to improve performance and trying to bandage around broken behavior. (Since games are the typical benchmark for new graphics cards, NVidia and AMD are hugely incentivized to make games run faster by any means possible.)
App developers observe the behavior of the drivers and change application behavior to work around driver issues. In X-Plane’s case this often means intentionally not using the optimal code path for driver stack X because it is slow. If the driver team ever fixes code path X, X-Plane still isn’t using it; when the driver team looks at performance they then decide to improve code path Y (that we are using, the backup plan) because that is what will make our app benchmark faster.

How Vulkan Helps: Vulkan helps by being much more highly specified in terms of both conformance and performance.

Vulkan is a much smaller, simpler API – OpenGL has simply become too complicated to completely test. With Vulkan, we can hope to test the entire driver.
Vulkan is being built with an open source test suite from day 1, with the goal being to build up a huge number of tests so we can know that a given driver is correct.
The Vulkan API is very clear about what operations are fast and what operations are slow. An application that uses the fast API can expect fast performance on those code paths. Guessing is not required.

Downsides to Vulkan

For smaller OpenGL application developers like Laminar Research, I can think of just one downside; it’s one that I haven’t seen a lot of application developers talk about, probably because it requires admitting that we (the app developers) might not be as smart as those driver guys.

OpenGL is a higher level API; OpenGL applications leave some of the hard problems of 3-d graphics up to the driver. This means that some of these very hard, very important performance problems are being solved by a team of engineers from the company that built the hardware. They have resources, they know everything about the particular hardware they are coding for, and performance is job one.

Vulkan is a low level API; with Vulkan, some of those hard problems will be solved by Chris.

Ha ha…no, I’m totally kidding. We don’t let Chris play with pointers or any other sharp objects. That code will be written by me, and it’s a safe bet that I know less about the hardware than the driver team and have less time to do pure performance work than all of the engineers at NVidia or AMD who work on the OpenGL stack.

This is a calculated risk for Vulkan as an ecosystem. The hope is that (1) with specific information about performance as part of an API, we application developers won’t screw things up too badly and (2) because we (the application developers) know more about our specific applications, we can optimize performance in ways the driver guys can’t because they don’t have the bigger picture. One of the hardest problems in a graphics API is conveying intent; OpenGL drivers spend a ton of code trying to guess what the application is trying to do. Vulkan solves this problem by letting the application make performance decisions itself.

Next Steps

If you know OpenGL but don’t know a lot about how 3-d graphics hardware really works, reading the Vulkan spec is a little bit like believing that Santa Claus brings your presents down the chimney and then [spoiler alert] one day reading a 700+ page PDF that explains how your parents actually buy the presents a month or two in advance, wrap and hide them, and then sneak them under the tree while you are asleep.

For X-Plane, the Vulkan spec makes clear what types of operations future drivers will support, what will be done for us, and what we have to do ourselves. This gives us a good framework to then incrementally build next-generation rendering code in X-Plane that is “Vulkan-friendly” – even if it is still using OpenGL. My guess is that this new code will be faster than the code it replaces even before we change drivers.

Once we have the rendering code restructured and modernized, we can then set it up to run on Vulkan or OpenGL, taking advantage of the Vulkan pathways where they exist.

One final thought: I have no idea how the actual experience of coding for Vulkan will be. It may be that everything “just works”, or it may be an exercise in frustration. Once we’ve been through a full port I’ll post a post-mortem, but I think it’s too soon for anyone out there to have good realistic feedback on a full Vulkan porting experience from OpenGL.***

* And when people suggest ground up rewrites I usually link to this.

** For what it’s worth, the driver shader bugs I see (and the ones that really take up my time) are bugs in the back end of the compiler, where optimized machine code for the GPU is generated. Vulkan isn’t going to fix this; Vulkan removes the front end of the compiler from the driver but not the back end.

*** Yes, there are games running on Vulkan now. But if your game is already running on multiple APIs, that’s different from porting a straight OpenGL app. The final spec and production drivers haven’t been out long enough for anyone to be able to say “I ported half a million lines of OpenGL code to Vulkan and here’s the result.”

About Ben Supnik

Ben is a software engineer who works on X-Plane; he spends most of his days drinking coffee and swearing at the computer -- sometimes at the same time.


19 comments on “What Vulkan Means to Developers”

700+ pages. You are my hero.

The hype seems to generate 2 areas of benefit: “more control of threads” and “everything else currently handled by driver devs”. Where does xplane stand to gain the most ground? You predict that gradual architecture changes in opengl (that would be vulkan-like in structure) would bring a boost without a switch to vulkan. It sounds like xplane is not as bound by restrictive opengl threading?

Furthermore, if a layer of driver is removed (closer to silicon), will the situation be more or less similar to the mobile market? In other words, I can only imagine that the number of hardware combinations in the PC world is significantly more than that of the Android world…will this make that situation worse?

Ben Supnik says:

March 16, 2016 at 10:09 pm

We definitely _are_ bound by OpenGL threading! The OpenGL threading rules require the driver to take locks to protect us in places where we don’t need protection (because X-Plane is lock free for the rendering thread), hurting performance and smoothness. There are future features we’d like to do where this penalty will be worse.

So we have our own things we can fix (I’m working on improving FPS right now – the code changes I’ve made yesterday and today appear to improve things 5% on my iMac) but we also need a threading model where we can leverage multi-core and get the locking system we want.

Regarding the mobile market, I don’t follow your logic. Android is a very difficult platform to develop 3-d apps for because there are four major GPU vendors instead of three, but also because the GPU’s driver comes with OS updates, and the carriers are really, really bad about pushing them. The problem on Android is mostly users running on old drivers with no recourse.

Vulkan may be closer to the metal, but it -is- a common denominator – it’s an API that every modern GPU can support. The standards committee included the mobile GPU vendors and made sure to specifically include abstractions in Vulkan that make it mobile-friendly.

Sounds promising, maybe with Vulkan we can get a high enough framerate for VR. Yes, I know it won’t be anytime soon, but a man can dream, can’t he?

Also, any chance we’ll see a 10.50 blog soon?

Hi Ben, have you already researched Metal? And any plans to share some notes as well? Am I correct in saying Vulkan does not benefit Macs, since it is not supported there?

I do understand these new 3D APIs are features for you, not for us. But ultimately we (users) are always struggling for performance and cranking settings up. All the marketing around these new low-level APIs makes us believe our fps are going up, so we are very interested in them even though we do not understand most of it 🙂

Ben Supnik says:

March 16, 2016 at 10:19 pm

I have researched Metal. It is very similar to Vulkan with four major differences:
1. Metal manages GPU memory for us, similar to OpenGL.
2. We don’t get access to push constants.
3. There don’t appear to be persistent descriptor sets – resource bindings are streamed.
4. I don’t see the tessellation or geometry shader stages anywhere – perhaps they’ll be added later.
Except for item 4, all of these things make Metal more like OpenGL and less like Vulkan.

Fortunately, Metal does share a few really important concepts with Vulkan:
– Flexible mostly lock-free threading. (Now that the Vulkan threading model is public, I should probably re-read what Metal offers.)
– All expensive CPU computations to set up shaders are done “up front” to make a single pipeline object – this helps offload rendering CPU load.
– Distinct rendering passes to help tiling GPUs like the PowerVR chips.
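The "up front" pipeline idea in the list above can be sketched as follows. This is a conceptual Python illustration, not Metal or Vulkan code; PipelineState, PipelineLibrary and their fields are hypothetical names chosen for the example. The assumption being shown is simply that all expensive work happens once per distinct state at load time, and that the frame loop only performs cheap lookups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineState:
    """Everything that must be known to build a pipeline (hypothetical fields)."""
    shader: str
    blending: bool
    depth_test: bool


class PipelineLibrary:
    """Builds pipelines once, up front; drawing only looks them up."""

    def __init__(self):
        self._pipelines = {}
        self.expensive_builds = 0

    def prebuild(self, states):
        for state in states:
            if state not in self._pipelines:
                self.expensive_builds += 1  # stands in for compile/validate work
                self._pipelines[state] = f"pipeline<{state.shader}>"

    def bind(self, state):
        # Cheap path: a dictionary lookup, no compilation while rendering.
        # An unknown state raises KeyError instead of silently building mid-frame.
        return self._pipelines[state]


terrain = PipelineState("terrain", blending=False, depth_test=True)
clouds = PipelineState("clouds", blending=True, depth_test=True)

library = PipelineLibrary()
library.prebuild([terrain, clouds, terrain])  # load time
for _ in range(1000):                         # frame loop
    library.bind(terrain)
    library.bind(clouds)
```

Raising on an unknown state is a deliberate choice in this sketch: it makes a missing pipeline a loud load-time bug rather than a hidden stall in the middle of a frame.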

Vulkan doesn’t benefit Mac users because (at least as of now) Apple isn’t doing a Vulkan driver. I read somewhere that there’s a project to write a Vulkan emulator on top of Metal, but this will probably be less efficient; emulating a low level API with a high level API means duplication of computation.
1. Jon Hay says:
  
  March 17, 2016 at 1:41 pm
  
  It sounds, to me, like the Mac platform is being marginalized. It already is marginalized when X-Plane builders don’t, for whatever reason, port to the Mac platform. It will be further marginalized when the “mine is faster” crowd starts chiming in. Similar but separate is not equal. My two cents!
  1. Ben Supnik says:
    
    March 17, 2016 at 2:06 pm
    
    Users have already observed faster fps running their Mac on Boot Camp than with the Apple GL stack. And Apple doesn’t really -make- a machine that’s competitive with a 250-watt GPU game machine. So the “mine is faster” crowd has been chiming in for years now.
    
    (Basically the last Mac that could really be wattage-competitive with a desktop PC was the Mac Pro when the 5870 was the ‘big’ card. It’s just a question of power consumption.)

As I downloaded the latest nVidia driver today, I noticed that Vulkan is supported. It may have been there before, but I just noticed it. In case others, like me, were wondering if they had the equipment to run it:

Version: 364.51 WHQL
Release Date: 2016.3.8
Operating System: Windows 7 64-bit, Windows 8.1 64-bit, Windows 8 64-bit, Windows Vista 64-bit
Language: English (US)
File Size: 322.97 MB
Release Highlights
Gaming Technology
Support for Vulkan API

So reading between the lines, Vulkan will be an X-Plane 11 feature?

Hi Ben,

based on your comments, it looks like the freedom in the management of resources and shaders is what X-Plane may benefit most from the future transition to Vulkan.

But there is another aspect that was heavily marketed by the DirectX 12 guys (but, if I understood well, DX12 and Vulkan look quite similar to each other, so I expect that a feature of one will more or less be present in the other too), and it is a sort of “heterogeneous computing”. In other words, with DX12, you can throw any number and combination of video cards (from different manufacturers, too) at your PC and the workload is (almost) magically shared among them. I think this is also a noteworthy feature deserving some attention. What I don’t know is: does Vulkan offer a similar capability? If yes, is it something that comes “for free” for the simple fact that an application supports Vulkan, or will applications have to be developed in a certain way to take advantage of this?

Ben Supnik says:

March 20, 2016 at 12:28 pm

Vulkan supports this too, but “for free” is the exact -opposite- of how I would describe the feature!!!

First, I should say: I -think- some of the multi-vendor and multi-GPU interop features are -not- yet well specified in Vulkan. I saw a note from Graham Sellers on the forums basically saying “we deferred this to get the standard case out without delaying the spec, it’s better not to rush and screw this up but it’s coming real soon”. (And I think that’s the right decision.) But we can look at DX12 and look at the proposed models to at least understand the landscape.

And the landscape is this:
– The drivers aim to provide some reliable functionality for communication between “devices” (read: graphics cards), and in some cases to allow that to be portable between GPUs.
– In return, you, the app, do 100% of the work to make multi-GPU work.

That second point is kind of a big caveat. NO app running DX12 or Vulkan supports multi-GPU unless the app developer puts the feature in. And that means the app team stopping development on other features to -just- do multi-GPU.

The reason this happened is that making the driver do things like multi-GPU is basically impossible in modern apps. The driver can’t know enough about what the app is doing to know how to distribute work across hardware.

The question will be: how many apps will actually go to the lengths needed to make this work? My guess is: AAA games and games whose engines “just support it” will do it. So it’s not that an app has to be written a certain way. It’s that the app has to code the entire feature itself.

To give you an idea of what simple multi-GPU (e.g. SLI alternate frame rendering) costs, basically every single resource in the app has to be replicated to both GPUs: the resource management has to be done separately and we have to identify resource problems on each GPU independently and track them. For any resources that are truly shared, we have to either create them in both places or synchronize them across GPUs. We then have to render to one GPU or the other, track that state separately (e.g. resources on one GPU aren’t visible on the other) and finally somehow get the finished frame put back together. And that’s for the “easy” case, alternate frame rendering (which adds latency, which all the VR guys hate).
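A rough sketch of that bookkeeping, in Python and with entirely hypothetical class names (Gpu, AlternateFrameRenderer), might look like the following. It is not DX12 or Vulkan code and it ignores frame presentation and cross-GPU synchronization; it only shows that the application has to replicate resources itself and that a resource uploaded to one device simply does not exist on the other.

```python
class Gpu:
    """Hypothetical device: resources uploaded here are invisible elsewhere."""

    def __init__(self, name):
        self.name = name
        self.resources = {}

    def upload(self, key, data):
        self.resources[key] = data

    def render(self, frame_index, needed):
        missing = [key for key in needed if key not in self.resources]
        if missing:
            raise RuntimeError(f"{self.name} is missing {missing}")
        return f"frame {frame_index} from {self.name}"


class AlternateFrameRenderer:
    def __init__(self, gpus):
        self.gpus = gpus

    def upload_everywhere(self, key, data):
        # The app, not the driver, replicates every resource to every GPU.
        for gpu in self.gpus:
            gpu.upload(key, data)

    def render(self, frame_index, needed):
        gpu = self.gpus[frame_index % len(self.gpus)]  # alternate per frame
        return gpu.render(frame_index, needed)


afr = AlternateFrameRenderer([Gpu("gpu0"), Gpu("gpu1")])
afr.upload_everywhere("terrain_texture", b"...")
frames = [afr.render(i, ["terrain_texture"]) for i in range(4)]
```

Uploading a resource to only one of the two devices makes every other frame fail in this toy, which is the kind of per-GPU tracking problem described above.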
1. elios says:
  
  March 21, 2016 at 8:37 am
  
  All the more reason to seriously look at DX12: MS, NV, and AMD have dev tools to make DX12 with multi-GPU and VR as easy as they can for devs.

Hi ben !!

Are you aware of that potential big problem which is the Vulkan reference shader compiler that apparently does no optimizations at all ? I (educated) guess that you are… Here is the take on that matter from another developer of another game :
https://forums.inovaestudios.com/t/to-vulkan-or-not-to-vulkan-that-is-the-question/3174

From what I understand (as I’m not an expert), optimizations are now independent from the drivers, and using the reference shader compiler without doing anything else results in a frame rate hit compared to OpenGL (in some cases), DX11 and, even more seriously, DX12.

Is it a “deal breaker” for using Vulkan, or does that “simply” mean you will have to spend much more dev time optimizing just to reach, at least, the same level of performance (frame rate, that is) as with OpenGL, DX11 or DX12?

You will correct me if I’m wrong, but it seems to be a serious problem… For you devs at least 🙂

Bye.
Dave.

Ben Supnik says:

March 20, 2016 at 12:46 pm

I don’t understand why the blogger referred to is so freaked out about a shader gap. Given how new Vulkan is, I’d be shocked if the gap is any smaller. My understanding from talking to IHVs is that the big problem in proving that any of the new tech is ready is not comparing them to each other but rather the new drivers getting beaten by D3D11. A huge amount of optimization has gone into a legacy and stable technology.

With that in mind, optimization can happen at -two- levels:
1. When you compile your shader to SPIR-V, “structure”-level optimization can happen in the generation of byte code. For example:
int x = 0;
if(x)
{
    do_expensive_thing();
}
Once the shader compiler builds a back-end representation of this, it can apply optimization: the compiler can reason that x will _always_ be 0 at the point of the if statement, and therefore everything inside the if clause can simply be _deleted_ from the shader ahead of time, making it faster. This is the kind of optimization that LLVM does really well (e.g. reason about your code, re-org it to take advantage of things it can see).
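As a toy illustration of that reasoning, here is a small Python sketch of constant propagation plus dead-branch removal. It works on a made-up statement list, not on SPIR-V or LLVM IR, and the function names (fold, assigned_names) are invented for this example. It assumes that "call" statements never modify variables.

```python
def assigned_names(program):
    """Collect every variable a block might assign, including nested ifs."""
    names = set()
    for statement in program:
        if statement[0] == "set":
            names.add(statement[1])
        elif statement[0] == "if":
            names |= assigned_names(statement[2])
    return names


def fold(program, known=None):
    """Remove 'if' blocks whose condition is known to be 0.

    Statements are tuples:
      ("set", name, int_value)     assign a literal
      ("if", name, [statements])   run the body when name is non-zero
      ("call", function_name)      opaque work; assumed not to touch variables
    """
    known = {} if known is None else known
    output = []
    for statement in program:
        if statement[0] == "set":
            _, name, value = statement
            known[name] = value
            output.append(statement)
        elif statement[0] == "if":
            _, name, body = statement
            if name not in known:
                # Unknown condition: keep the branch, forget what it may assign.
                output.append(("if", name, fold(body, dict(known))))
                for assigned in assigned_names(body):
                    known.pop(assigned, None)
            elif known[name] != 0:
                output.extend(fold(body, known))  # always taken: inline the body
            # else: the condition is always false, so the body is deleted
        else:
            output.append(statement)
    return output


program = [
    ("set", "x", 0),
    ("if", "x", [("call", "do_expensive_thing")]),
    ("call", "shade_pixel"),
]
optimized = fold(program)
```

After the pass, the call to do_expensive_thing is no longer in the statement list at all, which mirrors the "simply deleted" outcome described above.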

2. When the SPIR-V binary is loaded into the driver, the driver has to generate actual machine assembly for it, and this is where the vendor can optimize the generated code. Frankly, I don’t see why the driver can’t also run a structure-level optimization pass of its own, although one would hope there’s nothing left to be had by the time we get to SPIR-V. (My guess is: the drivers will end up optimizing – it always looks good for the GPU vendor to be fast.)

I expect that over time the IHVs will jump on case 2 to improve shader performance; the bigger question is how long it will take to get a GLSL->SPIR-V compiler that has a really good back-end.

There _is_ one set of optimization that is now strictly in the application’s hands. The Vulkan spec provides a specific mechanism for specialization – that is, the generation of more than one compiled shader where the constants have changed. Let’s look at our example again:

uniform float want_expensive_fx;
…
for(int i = 0; i < 1000; ++i) rgba += want_expensive_fx * sample(my_sampler,uv);

So, that's terrible code, and you should never ever write it. But the key point is this:
- In Vulkan, if you code that, your loop runs 1000 times. Sorry, it sucks to be you.
- You can declare want_expensive_fx to be a specialization constant in your app; this results in multiple pipelines, where you must pick the right one. You could specialize with want_expensive_fx = 0 and the GPU compiler will delete that entire loop.
- In OpenGL (and probably D3D??) this process was _totally automatic_. We app developers have been complaining about it ("you specialized my shader and it caused a pause in rendering"), but it has also meant that we could write really stupid code and have the driver clean up after us. That's gone in Vulkan.

And that is a general ecosystem problem with Vulkan: any case where the driver was performing an optimization that is now on the app side (since Vulkan has moved the app/driver split DOWN in the stack by quite a bit) is now strictly on the app, and if the app doesn't do it, that optimization opportunity is lost forever.

And I have to think: maybe we app developers aren't the gods of optimization that we think we are. If we were, why have the driver guys been rewriting and replacing our shaders for so many years? 🙂

So the short of it is: driver code gen will get better fast, pre-compiled code gen will get better eventually, but app optimizations are on us. New ecosystems always take time to catch up in benchmarks to the existing fully optimized incumbent.

EDIT: one last note on this: X-Plane has had its own "poor man" version of specialization constants for a while now (and I suspect other apps do too): X-Plane's shaders have pre-processor #defines. When X-Plane builds a shader, it builds several GLSL shader objects, with the #defines pre-defined, and keeps them bundled together.

This means:
- Changing one of these 'constants' is really a shader change in the app, and
- New GPU code is generated for each case.

That matters because those constants tend to turn off and on major features. E.g. if we compile our terrain shader with atmospheric scattering off, the code for atmospheric scattering is -totally gone- from the shader. It doesn't cost us an if statement, instruction length, samplers, uniforms, or anything. We tend to use these #defines for major features where the amount of code removed is very large.
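The #define approach can be sketched in a few lines of Python. This is not X-Plane's actual code: the shader text, the flag names (SCATTERING, FOG) and the helper functions are all made up for illustration, and base_color / apply_scattering are hypothetical GLSL functions. The sketch only generates the source variants; stripping the disabled code is left to the GLSL preprocessor in the driver.

```python
from itertools import product

# Hypothetical shader body; a real one would also need a #version line first.
TERRAIN_SOURCE = """\
void main() {
    vec4 color = base_color();
#if SCATTERING
    color = apply_scattering(color);
#endif
    gl_FragColor = color;
}
"""


def build_variant(source, flags):
    """Prepend one #define per flag so the GLSL compiler sees constants."""
    header = "".join(
        f"#define {name} {int(value)}\n" for name, value in sorted(flags.items())
    )
    return header + source


def build_all_variants(source, flag_names):
    """Return {frozenset of enabled flags: shader source} for every combination."""
    variants = {}
    for values in product([False, True], repeat=len(flag_names)):
        flags = dict(zip(flag_names, values))
        enabled = frozenset(name for name, on in flags.items() if on)
        variants[enabled] = build_variant(source, flags)
    return variants


variants = build_all_variants(TERRAIN_SOURCE, ["SCATTERING", "FOG"])
no_scattering = variants[frozenset({"FOG"})]
```

Note that the number of variants doubles with every flag, which is one reason to reserve this technique for major features where a lot of code disappears.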

Thank you, very interesting reading, Ben.

Hello Ben, I am interested in this question: after X-Plane 10 transitions to Vulkan, will the water quality setting still affect FPS?
In the current versions of X-Plane 10, the water quality setting greatly affects the fps.

Ben Supnik says:

March 20, 2016 at 6:13 pm

The water reflection detail (that’s what the setting controls) affects framerate for CPU-bound machines. If there is an improvement of this on Vulkan, it will be the same as an overall improvement in fps.

All of this sounds great!
I just hope Vulkan won’t cause you to drop Linux support. 🙁

Notreallyme says:

March 26, 2016 at 1:14 pm

Why would Vulkan do that? On the contrary it should make cross-platform support easier.
