“Unlimited Detail” is back – you can see the videos and read some criticism here.  I have never seen a really good white paper on the technology, so I’m going to have to speculate a bit about what it is they’ve actually done, and then I’ll use the rest of the post to describe why this isn’t the only way to improve perceived realism in a game (and why it is not the most likely one to succeed).

But first: some video.  This is Euclideon’s promo video, showing lots of really ugly polygonal models, and some clearly not polygonal models with a lot of repeating things.

And here we have in-game footage from the upcoming Battlefield 3, using the Frostbite 2 engine.

I’m starting with the video because I had the same first reaction that I think a lot of other 3-d graphics developers had: attacking six-sided trees from Crysis 1 is a straw man; the industry has moved beyond that.  Look at BF3: is the problem really that they don’t have enough polygons in their budget?  Do you see anything that looks like a mesh?

What Is Unlimited Detail, Anyway?

Short answer is: I don’t know – the company has been quite vague about specific technical claims.  This is what I think is going on from their promotional material.

Their rendering engine uses point cloud data instead of shaded, mapped, textured “polygonal soup” as the input data to be rendered.  Their algorithm does high-performance culling and level of detail on the highest-resolution level of the point cloud data. (Whether this is done by precomputing lower-res point clouds for farther views, like we do now for textures and meshes, is not specified.)
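
If you want a mental model of how a system like that could be organized, here is a minimal sketch: an octree whose interior nodes store pre-down-sampled “proxy” points – essentially mipmaps for point clouds. To be clear, this is my speculation about a plausible structure, not Euclideon’s algorithm (they haven’t published one); every name is invented and frustum culling is omitted.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Speculative sketch only - NOT Euclideon's algorithm.  An octree whose
// interior nodes hold a down-sampled "proxy" point set, the point-cloud
// analog of texture mipmaps: far-away cells are drawn from a coarse level,
// nearby cells from the leaves.
struct Point { float x, y, z; std::uint32_t color; };

struct OctreeNode {
    float              center[3];
    float              half_size;     // half the edge length of this cell
    std::vector<Point> proxy;         // down-sampled points for far views
    OctreeNode*        children[8];   // all nullptr at a leaf
};

// Gather points for rendering: if a cell is small on screen (roughly,
// size over distance below a threshold), its proxy points are good enough;
// otherwise recurse toward the finer levels.
void collect_points(const OctreeNode* n, const float cam[3],
                    float detail_threshold, std::vector<Point>& out)
{
    float dx = n->center[0] - cam[0];
    float dy = n->center[1] - cam[1];
    float dz = n->center[2] - cam[2];
    float dist = std::sqrt(dx * dx + dy * dy + dz * dz) + 1e-6f;

    bool is_leaf = (n->children[0] == nullptr);
    if (is_leaf || n->half_size / dist < detail_threshold) {
        out.insert(out.end(), n->proxy.begin(), n->proxy.end());
        return;
    }
    for (int i = 0; i < 8; ++i)
        if (n->children[i])
            collect_points(n->children[i], cam, detail_threshold, out);
}
```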

Why Are Polygons Limiting?

First, we have to observe the obvious: a 1080p video image contains a bit over 2 million pixels; today’s video cards can easily draw 2 million vertices per frame at 30+ fps (even over the AGP bus!).  So for a modern GPU, polygon count is not the operative limit.  If you add more polygons, you can’t see them because they become smaller than one pixel on the screen.

The limit for polygons is level of detail.  If the polygonal mesh of your model is static, then when you walk away from it the polygons become (in screen space) too small – we waste our vertex budget drawing more than one vertex per screen pixel – and when we move in, the polygons are too big.

In other words, the problem with polygons is scalability, not raw count.

And in this, Euclideon may have a nugget of truth in their claims: if there is a general purpose algorithm to take a high-polygon irregular 3-d triangle mesh and produce a lower LOD in real time, I am not aware of it.  In other words, you can’t tell the graphics card “listen, the airplane has a million vertices, but can you just draw 5,000 vertices to make an approximation.”  Polygons don’t work like that.

Coping With Polygon Limit: Old School Edition

There’s a fairly simple solution to the problem of non-scalable polygons: you simply pre-create multiple versions of your mesh with different polygon counts – usually by letting your authoring system do this for you.*  X-Plane has this with our ATTR_LOD system.  It’s simple, and it works, sort of.

The biggest problem with this is simple data storage.  I usually advise authors to store only two LODs of their models because each LOD takes up storage, and you get limited benefit from each one.  Had really smooth LOD on objects been a design priority, we could have easily designed a streaming system to push the LODs of an object out to disk (just like we do for orthophoto textures), which would allow for a large number of stored LOD variants.  Still, even with this system you can see that the scalability is so-so.
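
For the curious, substitution LOD really is as blunt as it sounds. Here is a minimal sketch of the selection step – not X-Plane’s actual ATTR_LOD implementation, just an illustration with invented structure names:

```cpp
#include <vector>

// Substitution LOD in miniature.  This is NOT X-Plane's ATTR_LOD code,
// just an illustration of the scheme: several complete pre-built meshes,
// each valid over a camera-distance range.
struct MeshLOD {
    float near_dist;   // use this mesh when the camera is at least this far away...
    float far_dist;    // ...and closer than this
    int   mesh_id;     // handle to the pre-built mesh for this range
};

// Pick the pre-built mesh whose range contains the camera distance.
// Returns -1 when the object is beyond its farthest LOD (draw nothing).
int pick_lod(const std::vector<MeshLOD>& lods, float dist_to_camera)
{
    for (const MeshLOD& l : lods)
        if (dist_to_camera >= l.near_dist && dist_to_camera < l.far_dist)
            return l.mesh_id;
    return -1;
}
```

The storage cost is visible right in the data structure: every additional distance range means another complete pre-built mesh on disk.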

There’s another category of old-school solutions: CPU-generated dynamic meshes.  More traditional flight simulators often use an algorithm like ROAM to rebuild meshes on the fly at varying levels of detail.  When the goal is to render a height field (e.g. a DEM), ROAM gives you all of the nice properties that Euclideon claims – unlimited detail in the near view scaled out arbitrarily far in the far view.  But it must be pointed out that ROAM is specific to height fields – for general purpose meshes like rocks and airplanes, we only have “substitution LOD”, and it’s not that good.
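
To give a flavor of how height-field refinement decides where the detail goes, here is a toy version of the split test. Real ROAM maintains a triangle bintree with split and merge queues and fixes cracks between patches; treat this only as a sketch of the screen-space-error idea, with made-up names:

```cpp
#include <cmath>

// Toy version of the refinement test at the heart of ROAM-style height-field
// rendering: split a terrain patch until its geometric error, projected into
// screen space, drops below a pixel tolerance.
struct Patch {
    float center_x, center_z;  // patch center in world units
    float geom_error;          // worst-case height deviation from the coarser mesh
};

// Approximate screen-space error: world-space error shrinks with distance;
// pixels_per_radian folds in field of view and viewport size.
bool needs_split(const Patch& p, float cam_x, float cam_z,
                 float pixels_per_radian, float max_pixel_error)
{
    float dx = p.center_x - cam_x;
    float dz = p.center_z - cam_z;
    float dist = std::sqrt(dx * dx + dz * dz) + 1e-6f;
    float projected_error = p.geom_error / dist * pixels_per_radian;
    return projected_error > max_pixel_error;
}
```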

Don’t Repeat Yourself

It should be noted that if we only had to have one unique type of house in our world, we could create unlimited detail with polygons.  We’d just build the house at 800 levels of detail, all the way from “crude” to “microscopic” and show the right version.  Polygonal renderers do this well.

What stops us is that the mesh budget would be blown on one house; if we need every house to be different, LOD by brute force isn’t going to work.

That’s why the number of repeating structures in the Euclideon demo videos gives developers a queasy feeling.  There are two possibilities:

  1. When they built their demo world, they repeated structures over and over because it was a quick way to make something complex, then saved the huge resulting data set to disk.
  2. They stored each repeating part only once and are drawing multiple copies.

If it’s the second case, that’s just not impressive, because games can do this now – it’s called “instancing”, and it’s very high performance.  If it’s the first case, well, that was just silly – if their engine can draw that much unique detail, they should have filled their world with unique “stuff” to show the engine off.
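
For reference, instancing is essentially a one-call affair on current hardware. A minimal OpenGL sketch (buffer and shader setup omitted; the mesh and parameter names are invented, and GLEW is assumed only to provide the GL function declarations):

```cpp
#include <GL/glew.h>   // assumed only for the GL function declarations

// Instancing sketch: one copy of the pylon mesh lives in GPU memory and is
// drawn num_pylons times in a single call, with per-instance transforms
// supplied via instanced vertex attributes (setup not shown).
void draw_pylons_instanced(GLuint pylon_vao, GLsizei index_count,
                           GLsizei num_pylons)
{
    glBindVertexArray(pylon_vao);              // mesh + per-instance attribute state
    glDrawElementsInstanced(GL_TRIANGLES,      // same index list...
                            index_count,
                            GL_UNSIGNED_INT,
                            nullptr,           // ...from the bound index buffer...
                            num_pylons);       // ...repeated once per instance
}
```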

Where Does Detail Come From?

Before we go on to how modern games create scalable polygonal meshes, we have to ask an important question: where do these details come from?

The claim of infinite detail is that you would build your world in ridiculously high resolution and let the engine handle the down-sampling for scalability automatically.  I don’t think this is a realistic view of the production process.

For X-Plane, the limit on detail is primarily data size.  We already (for X-Plane 9) ship a 78 GB scenery product.  But it’s the structure of that detail that is more interesting.

The scenery is created by “crossing” data sets at two resolutions to create the illusion of something much more detailed.  We take the mesh data (at 90m or worse resolution) and texture it with “landclass” textures – repeating snippets of terrain texture at about 4 meters per pixel.  The terrain is about 78 GB (with 3-d annotations, uncompressed) and the terrain textures are perhaps 250 MB.  If we were to simply ship 4 meter per pixel orthophotos for the world, I think we’d need about 9.3 trillion pixels of texture data.
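
(Back-of-the-envelope check on that last number, assuming we only cover the Earth’s roughly 149 million square kilometers of land: 1.49 × 10^14 square meters divided by the 16 square meters each 4-meter pixel covers comes out to about 9.3 × 10^12 pixels – over nine trillion.)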

I mention this because crossing multiple levels of detail is often both part of an authoring process (I will apply the “scales” bump map to the “demon” mesh, then apply a “skin” shader) and how we achieve good data compression.  If the “crossed” art elements never have to be multiplied out, we can store the demon at low res, and the scales bump map over a short distance.  There can be cases where an author simply wants to create one huge art asset, but a lot of the time, large scale really means multiple scales.

Coping With Polygon Limit: New School Edition

If we understand that art assets often are a mash-up of elements running at different scales, we can see how the latest generation of hardware lets us blow past the polygon limit while keeping our data set on disk small.

  • DX11 cards come with hardware tessellation.  If our mesh becomes detailed via a control mesh, curve interpolation (e.g. NURBS or whatever) and some kind of displacement mapping, we can simply put the source art elements on the GPU and let the GPU multiply them out on the fly, with variable polygon resolution based on view angle.  Since this is done per frame, we can get the right number of polygons per frame.** (A sketch of this data flow follows below the list.)
  • Since DX10 we’ve had reasonably good flow control in shaders, which allows for displacement mapping and other convincing promotion of 2-d detail to 3-d.
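
As a concrete (if CPU-side) illustration of the tessellation bullet above: the data that ships is just a coarse control patch plus a displacement map, and the dense mesh is manufactured at whatever resolution the current frame needs. On DX11-class hardware the equivalent work runs in the tessellation stages on the GPU; the sketch below does it on the CPU purely to show the data flow, and nothing in it comes from a real engine.

```cpp
#include <vector>

struct Vec3 { float x, y, z; };

// CPU-side sketch of the GPU tessellation + displacement idea: a tiny
// "source" patch plus a displacement function gets expanded into however
// many vertices this frame needs.  None of these names are a real engine API.
//
// corners:  four control points of a quad patch, in the order
//           (0,0), (1,0), (1,1), (0,1) in patch (u,v) space.
// subdiv:   chosen per frame from the view (must be >= 1).
// displace: samples the displacement map, e.g. a height in meters.
std::vector<Vec3> tessellate_patch(const Vec3 corners[4], int subdiv,
                                   float (*displace)(float u, float v))
{
    std::vector<Vec3> verts;
    verts.reserve((subdiv + 1) * (subdiv + 1));
    for (int j = 0; j <= subdiv; ++j)
    for (int i = 0; i <= subdiv; ++i) {
        float u = float(i) / subdiv;
        float v = float(j) / subdiv;
        // Bilinear interpolation across the control quad.
        float w0 = (1 - u) * (1 - v), w1 = u * (1 - v), w2 = u * v, w3 = (1 - u) * v;
        Vec3 p;
        p.x = w0 * corners[0].x + w1 * corners[1].x + w2 * corners[2].x + w3 * corners[3].x;
        p.y = w0 * corners[0].y + w1 * corners[1].y + w2 * corners[2].y + w3 * corners[3].y;
        p.z = w0 * corners[0].z + w1 * corners[1].z + w2 * corners[2].z + w3 * corners[3].z;
        // For simplicity, displace straight up; a real implementation would
        // displace along the interpolated surface normal.
        p.y += displace(u, v);
        verts.push_back(p);
    }
    return verts;
}
```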

So we can see a choice for game engine developers:

  • Switch to point cloud data and a new way of rendering it.  Use the art tools to generate an absolutely ginormous point cloud and trust the scalability of the engine, or
  • Switch to DX11, push the sources of the art asset to the GPU, and let the GPU do the data generation in real-time.

The advantage of pushing the problem “down to the GPU” (rather than moving to point clouds) is that it lets you ship the smaller set of “generators” of detail, rather than the complete data set.

Euclideon does mention this toward the end of their YouTube video, when they try to categorize art assets into “fiction” (generated by art tools) and “non-fiction” (generated by super-high-resolution scanners).

I don’t deny that if your goal is “non-fiction” – that is, simply to high-res scan a huge world and have it in total detail – not even clever DX11 tricks are going to help you.  But I question how useful that is for anyone in the games industry.  I expect game worlds to be mostly fiction, because they can be.

If I build a game world and I populate my overpasses with concrete support pylons, which am I going to do?

  • Scan hundreds of thousands of pylons all around San Diego so I can have “the actual concrete”? or
  • Model about 10 pylons and use them over and over, perhaps with a shader that “dirties them up” a little bit differently every time based on a noise source? (A sketch of that per-instance variation follows below the list.)
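
The second option is cheap to implement. Here is a minimal sketch of per-instance variation – deriving a repeatable “dirt” amount from the instance index with an arbitrary integer hash (nothing here is a real engine API):

```cpp
#include <cstdint>

// Per-instance variation sketch: derive a repeatable "dirt" factor in [0,1)
// from the instance index, so a handful of identical pylon meshes each get
// shaded a little differently.  Any cheap integer hash will do; this one is
// purely illustrative.
float dirt_factor(std::uint32_t instance_id)
{
    std::uint32_t h = instance_id * 2654435761u;  // Knuth-style multiplicative hash
    h ^= h >> 16;
    return (h & 0xFFFFu) / 65536.0f;              // map to [0, 1)
}
```

The same ten pylon meshes then read slightly differently everywhere they appear, without storing ten thousand unique pylons.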

There are industries (I’m thinking GIS and medical imaging) where being able to visualize “the real data set” is absolutely critical – and it may be that Euclideon gains traction there.  But for the game development pipeline, I expect fiction, I expect the crossing of multiple levels of detail, and I expect final storage space to be a real factor.

Final Id Thoughts

Two final notes, both regarding Id software…

John Carmack has come down on the side of “large art assets” as superior to “procedural generation” – that is, between an algorithm that expands data and having the artists “just make more”, the latter is preferable.  The thrust of my argument (huge data sets aren’t shippable, and the generators of detail are being pushed to the GPU) seems like it goes against that, but I agree with Carmack for the scales he is referring to.  Procedural mountains aren’t a substitute for real DEMs.  I think Carmack’s argument is that we can’t cut down the amount of game content from what currently ships without losing quality.  My argument is that we can’t scale it up a ton without hitting distribution problems.

Finally, point clouds aren’t the only way to get scalable non-polygonal rendering; a few years ago everyone got very excited about Sparse Voxel Octrees (SVOs).  An SVO is basically a 3-d texture with transparency for empty space, encoded in a very clever and efficient manner for fast rasterization and high compression.  Will SVOs replace polygons?  I don’t know; I suspect that we can make the same arguments against SVOs that we make against point clouds.  I’m waiting to see a game put them into heavy use.

* E.g. the artist would model using NURBS and a displacement map, then let the 3-d tool “polygonalize” the model with different levels of subdivision.  At high subdivision levels, the curves stay smooth and the displacement map provides smaller detail.

** The polygon limit also comes from CPU-GPU interaction, so when final mesh generation is moved to the GPU we also just get a lot more polygons.

About Ben Supnik

Ben is a software engineer who works on X-Plane; he spends most of his days drinking coffee and swearing at the computer -- sometimes at the same time.

20 comments on “Point Clouds Are the Technology of the Future (And Always Will Be)”

    1. Right, Notch and I had similar reactions I think:
      – The actual test demo art asset has to be less than they claim – a linear sampling at that res is insane.
      – If they don’t require instancing, they shouldn’t have shown instancing – instancing is much easier than huge datasets.

  1. Will X-Plane 10 have this point cloud stuff? Since you already produced a nice video with palm trees and everything, I expect you to issue a beta version within two weeks!

  2. Ben, thanks for taking the time to convert a great deal of tech soup into something very readable, enjoyable and understandable. Euclideon’s video is almost magical. It’s too good to be true. And what is too good to be true, usually isn’t. It’s important to understand the difference, one you make clear. It’s not the point cloud thingie that’s the problem, it’s the size of the dataset that you have to generate and ship with your product. That’s the real thing they have to prove: can they ship their product without it looking like instancing gone wild?

    Now I just have to hope that I can get a DX11 video card that will run on a Windows XP system….. I want too much. But then again….I’m a NURBS modeler. 🙂

    1. DX11 card that will work under Windows XP? Well, you can achieve that already, but your card will be limited to DX9 features. DirectX releases beyond 9 are not supported under WinXP, so getting a DX10 or better card and letting it run under WinXP is sort of a waste, although it basically works (you will obtain the equivalent of a DX9 card with the raw power of a DX11 card, that’s it).

      The only way to exploit all the hardware features of a recent video card under Windows XP, as far as I understood, is to go with OpenGL.

  3. Thanks for that explanation. I don’t usually post on this, I just read everything haha. I have seen stuff like this before: I am a Civil Designer and remember once seeing a surveying company who could literally drive down a street with laser and video mounted on the roof and create a survey from the point data – highly accurate and very impressive. The only problem was, try to put 5, 10, 100 or 1000 million points into civil design packages and you run out of memory pretty fast.

    I am also very interested in how this will handle lighting; theoretically it will have to calculate light bounces off of every “atom”, and if that is the case then lighting will probably be the bottleneck of the software. As we know, lighting is very important, and if you don’t have it right it won’t look right.

    Just my thoughts if this is not just hot air then it looks like exciting times to come.

    1. As far as lighting goes, you might be able to get away with approximating it in a lot of cases.

      Even if this turned out to be the best thing since sliced bread, you’d still almost certainly need to retain some underlying, lower resolution meshes for other purposes, such as collision detection. It’s just a thought off the top of my head, but perhaps you could take the polygonal data from those meshes, calculate the lighting on them, and maybe apply some Phong shader or something. Then, using a depth map, blend the shading data onto the rendered scene.

      I’m not really sure how well that would work, but it sounds good in my head!

      1. That actually sort of exists.
        //www.slideshare.net/DICEStudio/siggraph10-arrrealtime-radiosityarchitecture
        See around slides 12 or so.
        But since we’re not doing anything that’s vertex constrained (our lighting is all on the GPU and all screen space/deferred – we’re not trying for radiosity yet) we don’t need to reduce vertex count. If we can afford to save the detailed meshes at all (since RAM is the issue) we can easily afford to light them.

  4. @Filippo:
    In the context of X-Plane 10, read “Direct X11 card” as “recent card with support for Shader model 4”. It has nothing to do with DirectX itself (Mac and Linux versions can’t use DirectX anyway), but with the capabilities of the graphics card.
    It is just that these cards are advertised as “Direct X11”, because “Shader model 4” sounds even more nerdy…

    Philipp

    1. Right – there is still an issue that OS vendors don’t always make it possible to install the newest drivers on the oldest OSes. You won’t be able to get GL3 on an older Mac OS (despite having a DX10 card) and you can’t put new DX on XP. I’m not sure it matters though when it comes to polygon rasterization vs point search; DX11 hw exists, and you can get drivers most of the time. 🙂

  5. It seems there should be a way to mix the two technologies. I don’t see why you couldn’t store the polygon information in such a way that the level of detail calculation could be made on the fly.

    You’re obviously storing vertex information. That vertex information becomes your point cloud. Priority could be assigned to each of these vertices; and vertex pairs could even be flagged as edges to define texture boundaries and such. Low priority points can fall away in favor of creating larger polygons from the vertex data on-the-fly.

    For example, if you have mountainous terrain, while in the distance you only have to read a small subset of the point data (in order of point priority), and new, larger polygons could be created on the fly. Your bucket system can be used to trigger that level of detail calculation. Distant terrain can be prepared by creating polygons out of every 1/and vertex, where N is a distance factor. Terrain, close-up, can use 1/1 vertex, embellished with procedural refinements.

    Or, store the terrain data in terms of NURBS information. Tessellate coarse, or fine, based on how far away from the camera you are. As you get closer, you can procedurally augment the level of detail with fractal, noise, and other common algorithms.

    1. “1/and” was supposed to read “1/N”. N would be ceiling|f * d|, where d is distance and f is an attenuating factor.

    2. Any time the source mesh isn’t actually a mesh (e.g. it’s a nurb or height field) and the “refinement” from source data to mesh is on the GPU, then yes you can scale nicely.

      Until relatively recently, this kind of refinement on the GPU wasn’t possible, hence Euclideon can take pot-shots at older games with their square trees.

      But if the artist creates a mesh (and not some source format like a nurb or a height field that becomes a mesh) then re-evaluating the mesh to change LOD becomes extremely expensive. In other words, if you want to build the mesh with variable LOD on the fly, you really need an input data format that is amenable to doing that in real time; this means changing the tool chain for art assets in an extensive manner.

  6. If a GPU can generate the high altitude (eg 3000 ft+) global scenery on the fly using 2d terrain + modified 2d open street maps + a library of 50+ different road / water / land types, and use a more extensive local area building library for lower altitudes, then would you be able to ship x-plane on a single DVD, or are these 2d files still massive files across the globe?

    Could this be the direction for x-plane 11 with the next generation of graphics cards?

      1. Cool. Thanks for your reply.

        I wonder though if the install program could generate the .dsf files from the raw 2d data, textures and objects?

        Could this be a way of getting much more detail onto 6 DVDs?

        1. That wouldn’t help. The input data for the global scenery is currently 69 GB, after zip compression – that is, it’s larger than the final scenery. As OSM grows, this will get even worse. (We might get slightly better ratios with 7zip but it’s still moot.) The scenery creation program took three days to run on an 8-core Mac Pro last time around, and may take quite a bit longer this time. The scenery creation program also requires a Unix environment, which hoses Windows.

          1. Thanks for explaining that. It is a fascinating process, and highly creative, in the original sense!

            We already have pretty accurate terrain, but having full roads and rivers will make it easier to find one’s way around.

  7. Have you seen the 40-plus-minute video of an interview that Euclideon’s founder and lead engineer Bruce Dell gave to HARDOCP on August 10th, which shows the demo engine running in real time and “flying” over the landscape under user control, not pre-saved stuff? Do you still feel the same after seeing this? Please let us know your much appreciated thoughts, Tks 😉

    you can find it at:
    HARDOCP Interview and demo show

    //www.hardocp.com/article/2011/08/10/euclideon_unlimited_detail_bruce_dell_interview

    1. I don’t think my opinion has changed that much – but it wasn’t meant to be that negative either. My point is _not_ that Euclideon’s work is bad, lame, fake, or anything like that – only that the existing polygon-rendering “establishment” is a moving target, and it’s really moving quickly!

      In particular, to me the big question is: will point rendering be enough of an improvement once displacement-based tessellation is wide-spread? Bruce Dell is absolutely right that displacement tessellation is not the same as an arbitrary geometry budget with smooth LOD fall-off. But I’m not sure how much that will actually matter for production games.

      The devil will be in a bunch of details we don’t have:
      – How does their rendering system interact with existing material systems and other key rendering-pipeline tech (e.g. screen-space effects, deferred rendering, lighting volumes, etc.).
      – How well does it run on GPGPU hardware.
      – How good is the compression ratio – in the end, the difference between point-search and multiple polygon meshes is storage efficiency. I feel a bit stupid saying that because multiple polygon meshes is a _lousy_ way to make continuous LOD (hence the interest in displacement tessellation) but still, if we had infinite machine resources we surely wouldn’t care.

      And of course, I am totally sympathetic to a small number of programmers trying to do something a bit at odds with the “standard” design of the industry on a shoe-string budget…if they want to go hide and get work done, I can never ever fault them for that! 🙂
