Volumetric World Technology: A Discussion
Hello hbgames!
This topic is to discuss some thoughts and opinions I have in relation to Volumetric Worlds and the future of gaming in general. Specifically, I enjoy tinkering, and I'd like to gather some thoughts and ideas that may help me tinker my way into something significantly useful.
This post may be a bit long-winded, just a heads up. If you have no interest in voxels or volumetric world data, turn around now because it only gets worse.
What is Volumetric World Data?
Why, I'm glad you asked. Volumetric implies that matter has substance. Polygon meshes, for example, are not volumetric. If you cut them open, without the aid of artist-supplied polygons and animations, there would be nothing inside. There is no knowledge of what's inside. The leather strap on a character's clothing is not leather at all - it's a bump that an artist painted brown. Obviously, this fashion of simulating reality has clear and inherent limits.
A volumetric simulation of reality, on the other hand, would somehow divide that leather strap into a number of quantized space units, and assign each one a material value of some sort - e.g. "leather". With this information, we can still render the same polygonal mesh, or any other unit of rendering. But if an interaction is made that an artist did not accommodate, assuming some fancy programming, at least the engine would know what belongs on those missing surfaces - leather, in the example above.
What are Voxels?
Well, the term is a bit subjective, but it generally describes arrays of data, like pixels, that apply to 3D volumetric representation. In short, Minecraft. This is one solution for volumetric world data. In Minecraft, if you dig away some dirt, rather than empty space being below it, we know that there is stone there, and it becomes visible. That said, voxels generally require massive amounts of data: even if you only store 1 byte per voxel, that racks up faster than you can imagine. Assuming you want more information than that, it gets obscene rather quickly. You could consider Minecraft to be a volumetric world of extremely low resolution, as each "voxel" represents a cubic meter of space. That's big. Imagine if a pixel on a screen had to represent an entire meter of space - aside from cool aerial shots of the earth, that resolution would be deemed nearly unusable. The challenge, therefore, is to use less than 1 byte per voxel, allowing higher resolutions to be rendered without running out of RAM on a household machine.
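To put a number on "racks up fast", here's a quick back-of-the-envelope in Java. The region size is purely illustrative:

```java
// Back-of-the-envelope memory cost of a naive dense voxel grid.
// A region 1,024 voxels on a side is an arbitrary illustrative size.
public class VoxelMemory {
    public static void main(String[] args) {
        long side = 1024;
        long voxels = side * side * side;   // 1,073,741,824 cells
        long bytes = voxels * 1;            // at just 1 byte per voxel
        System.out.println(bytes / (1024 * 1024) + " MiB"); // prints 1024 MiB, i.e. a full gibibyte
    }
}
```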
How do we reduce the overhead?
Well, there are dozens of potential methods. Flat arrays, instead of 3D arrays, compressed somehow (typically something cheap and efficient like Run-Length Encoding). This approach is probably the most common, but its efficiency is circumstantial and depends on the flattening order. Other algorithms can be combined with this, such as Lempel–Ziv–Welch dictionary compression on top of it. But at this point, especially in the case of cascading compressions, modification becomes expensive, as you typically have to convert the data back to a flat array and then re-encode it. This is suboptimal for two reasons: the conversion itself has overhead, and the data passes through the flat-array state - even if only temporarily, that's a lot of potential allocation and deallocation of RAM.
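As a point of reference, a basic flat-array RLE pass is only a few lines. This is a minimal sketch with names of my own choosing, storing each run as a (count, value) pair:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal run-length encoder for a flattened voxel array:
// each run of equal values becomes a single (count, value) pair.
public class FlatRle {
    public static List<int[]> encode(byte[] flat) {
        List<int[]> runs = new ArrayList<>();
        int count = 1;
        for (int i = 1; i <= flat.length; i++) {
            if (i < flat.length && flat[i] == flat[i - 1]) {
                count++;                                    // extend the current run
            } else {
                runs.add(new int[] { count, flat[i - 1] }); // close it out
                count = 1;
            }
        }
        return runs;
    }
}
```

Decoding is the reverse walk, and that round trip is exactly the expensive part described above whenever a single cell changes.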
As an experiment, I made a "realtime Run-Length Encoding". That is, one I can modify in place: it is aware of every possible operation (and never performs more than 2 changes per state change), and executes these operations inline without ever reverting the data to a flat array. It simply checks the values of the cells adjacent to the modified cell and updates them accordingly, performing merges and splits as needed.
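Here is a minimal 1D sketch of that idea - the structure and names are my illustration, not the exact experimental code. Runs are stored as (length, value) pairs, and a write either rewrites a whole run, shrinks an edge, or splits the interior, merging with equal-valued neighbours where possible:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a 1D run-length encoding that supports in-place writes without
// ever expanding back to a flat array.
public class RealtimeRle {
    private static class Run {
        int length; byte value;
        Run(int length, byte value) { this.length = length; this.value = value; }
    }

    private final List<Run> runs = new ArrayList<>();

    public RealtimeRle(int size, byte fill) { runs.add(new Run(size, fill)); }

    // Write one cell, splitting the containing run and merging with
    // equal-valued neighbours as needed.
    public void set(int index, byte value) {
        int i = 0, start = 0;
        while (start + runs.get(i).length <= index) { start += runs.get(i).length; i++; }
        Run run = runs.get(i);
        if (run.value == value) return;            // no change needed

        int offset = index - start;                // position inside the run
        if (run.length == 1) {                     // whole run replaced
            run.value = value;
            mergeAround(i);
        } else if (offset == 0) {                  // shrink the left edge
            run.length--;
            if (i > 0 && runs.get(i - 1).value == value) runs.get(i - 1).length++;
            else runs.add(i, new Run(1, value));
        } else if (offset == run.length - 1) {     // shrink the right edge
            run.length--;
            if (i + 1 < runs.size() && runs.get(i + 1).value == value) runs.get(i + 1).length++;
            else runs.add(i + 1, new Run(1, value));
        } else {                                   // split the interior: left | new | right
            runs.add(i + 1, new Run(1, value));
            runs.add(i + 2, new Run(run.length - offset - 1, run.value));
            run.length = offset;
        }
    }

    // Absorb equal-valued neighbours after a full run replacement.
    private void mergeAround(int i) {
        if (i + 1 < runs.size() && runs.get(i + 1).value == runs.get(i).value) {
            runs.get(i).length += runs.remove(i + 1).length;
        }
        if (i > 0 && runs.get(i - 1).value == runs.get(i).value) {
            runs.get(i - 1).length += runs.remove(i).length;
        }
    }
}
```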
Trying to expand this realtime implementation to 2D is where it becomes a headache. There are 7 possible scenarios for 1D, and that expands exponentially for 2D, to the point where a 3D realtime RLE is a bit nightmarish to even consider implementing.
What would a 3D RLE look like?
Well, seeing as I used Minecraft as an example above, let's roll with that for some visualization aids:
Here we have a simple two-value 4x4 two-dimensional configuration on the right, using stone and wood planks as our test values. On the left, we have the elements a 2D RLE would store this as, in realtime. Instead of a total of 16 "voxels", the engine would consider the entire configuration to consist of 7 voxels in this example - and these voxels have width and height parameters.
This can be encoded by starting at one corner, then iterating over every position towards the other corner. When you reach an unprocessed position that is occupied, you check each dimension, one at a time, expanding in that direction until a differing value is encountered. For example, starting in the top-left corner, we cannot expand the width, as stone comes in the next column. Attempting to expand down, we can. Then we proceed to the next unprocessed position and repeat the process. Every position that is consumed needs to be flagged as processed to avoid inclusion in multiple elements. Now, this works well for encoding, but encoding is not the goal; the goal is realtime maintenance, and this is just the format the world exists in.
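In code, that greedy expansion might look like the following 2D sketch, matching the 2D figure (a 3D version adds a depth pass). The names are mine, and the dense input grid exists only for this one initial encode:

```java
import java.util.ArrayList;
import java.util.List;

// Greedy 2D box encoding of a dense grid: walk in scan order, and when an
// unprocessed cell is found, grow right, then down, while every covered
// cell shares its value.
public class GreedyBoxes {
    public record Box(int x, int y, int w, int h, byte value) {}

    public static List<Box> encode(byte[][] grid) {
        int rows = grid.length, cols = grid[0].length;
        boolean[][] done = new boolean[rows][cols];
        List<Box> boxes = new ArrayList<>();
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                if (done[y][x]) continue;
                byte v = grid[y][x];
                int w = 1, h = 1;
                // Expand width while the next column matches and is unprocessed.
                while (x + w < cols && !done[y][x + w] && grid[y][x + w] == v) w++;
                // Expand height while the entire next row segment matches.
                outer:
                while (y + h < rows) {
                    for (int i = 0; i < w; i++)
                        if (done[y + h][x + i] || grid[y + h][x + i] != v) break outer;
                    h++;
                }
                // Flag everything consumed so no cell lands in two elements.
                for (int dy = 0; dy < h; dy++)
                    for (int dx = 0; dx < w; dx++)
                        done[y + dy][x + dx] = true;
                boxes.add(new Box(x, y, w, h, v));
            }
        }
        return boxes;
    }
}
```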
Here is the same process demonstrated in 3D (sorry for the poor angle):
Here a 3x3x3 cube of 27 voxels is reduced to 10 voxels, each one having width, height, and depth. (Yes, that's 40 bytes instead of 27, but this system is targeting larger, sparser areas in which the gain would be far greater.)
The problem comes from maintaining this in realtime:
When even a single voxel is broken, there are a significant number of checks that need to be performed to avoid fragmentation of the remaining data. Perhaps, as a temporary solution, a defragmentation could be performed periodically, or when data is saved/unloaded. But defragmentation requires converting the data back to a flat array and recounting it - and the goal here is to never have the data in array form, i.e. to never, even temporarily, store at a 1:1 or greater byte-to-voxel ratio.
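To illustrate the fragmentation cost: carving a single voxel out of a W x H x D box can leave up to six remainder boxes. Here is a sketch of one possible carving order (slabs along z, then y, then x - the layout choice is mine):

```java
import java.util.ArrayList;
import java.util.List;

// Removing one voxel at (px, py, pz) from an axis-aligned box: carve slabs
// along z, then y within the voxel's slice, then x within its row.
public class BoxSplit {
    public record Box(int x, int y, int z, int w, int h, int d) {}

    public static List<Box> remove(Box b, int px, int py, int pz) {
        List<Box> out = new ArrayList<>();
        // Slabs in front of / behind the voxel along z.
        if (pz > b.z()) out.add(new Box(b.x(), b.y(), b.z(), b.w(), b.h(), pz - b.z()));
        if (pz < b.z() + b.d() - 1)
            out.add(new Box(b.x(), b.y(), pz + 1, b.w(), b.h(), b.z() + b.d() - pz - 1));
        // Within the voxel's z-slice: slabs above / below along y.
        if (py > b.y()) out.add(new Box(b.x(), b.y(), pz, b.w(), py - b.y(), 1));
        if (py < b.y() + b.h() - 1)
            out.add(new Box(b.x(), py + 1, pz, b.w(), b.y() + b.h() - py - 1, 1));
        // Within the voxel's row: segments left / right along x.
        if (px > b.x()) out.add(new Box(b.x(), py, pz, px - b.x(), 1, 1));
        if (px < b.x() + b.w() - 1)
            out.add(new Box(px + 1, py, pz, b.x() + b.w() - px - 1, 1, 1));
        return out; // worst case: six boxes where one used to be
    }
}
```

Every removal can therefore multiply the element count, which is why some merge or defragmentation strategy is unavoidable.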
By maintaining storage in this manner, it can be passed to the renderer as-is, as it is already in a fairly optimized form (though allowing blocks to clip would optimize it further, that would only be suitable for rendering - that's a different discussion for later).
Obviously, a 3D RLE would only really benefit sparse arrays where the rate of change is low. In densely packed, chaotic arrays, it has the potential to be less efficient than flat arrays. The only way I can imagine to compensate for this would be to track the density of a given region (think a "Chunk" in Minecraft), and switch methods once a certain threshold is crossed.
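A sketch of that switching idea, with an arbitrary placeholder threshold and a bit of hysteresis so a chunk doesn't flip back and forth right at the boundary:

```java
// Sketch of per-chunk representation switching: keep the box encoding while
// it stays compact, and fall back to a flat array once fragmentation makes
// it lose. The 0.25 ratio is an arbitrary placeholder, not a tuned value.
public class AdaptiveChunk {
    private static final double MAX_BOXES_PER_VOXEL = 0.25;

    private final int voxels;      // total cells in this chunk
    private int boxCount;          // current number of encoded boxes
    private boolean dense = false; // true once we've fallen back to a flat array

    public AdaptiveChunk(int voxels) { this.voxels = voxels; }

    // Called after every edit; decides which representation to use.
    public void onBoxCountChanged(int newCount) {
        boxCount = newCount;
        if (!dense && boxCount > voxels * MAX_BOXES_PER_VOXEL) {
            dense = true;   // a real engine would expand the boxes to a flat array here
        } else if (dense && boxCount < voxels * MAX_BOXES_PER_VOXEL * 0.5) {
            dense = false;  // hysteresis: re-encode only once well below the threshold
        }
    }

    public boolean isDense() { return dense; }
}
```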
Another major benefit of the 3D RLE, storing larger blocks of arbitrary size, is that you don't have to store "empty". Minecraft's "air", for example, still occupies a full integer in RAM simply to store a 0. (There are a couple of shortcuts that even Minecraft takes here, such as empty chunks not containing a data array, and using shorts instead of ints - but shorts can actually be less efficient than ints and are often widened to full integers in RAM anyway, at least on managed runtimes like the JVM.)
Closing Thoughts:
I have more to share, and will expand on this information in time, as I am aiming to construct a very-high-resolution voxel engine. I would love feedback from any developers here who understand my ramblings and have related ideas or thoughts.
Thanks for reading!