Sunday, September 15, 2013

The end of an era...

It's with great sadness that I have to announce the company I have greatly enjoyed working for over the last eighteen years has had to close:

http://www.develop-online.net/news/45340/Blitz-Games-Studios-closes

I think it impossible to overstate just what a great group of tremendously skilled people we had there producing a huge variety of projects on many diverse platforms.  The games industry is an unforgiving place however and a whole raft of factors conspired against us over the last year or two making it impossible to continue.

I wish everyone from there the very best of luck finding something new and exciting to work on.

Thursday, July 11, 2013

Damn, Voxel Data is BIG

I left off the last post with a little question about how many voxels I might need to represent a planet.  There were a few guesses of 4.7810528e+20, 9.8782084e+17 and 10^21, which at first glance seem absurdly large but in fact they were pretty close to the mark.  You see, the thing about voxel data is it's big.  Very big.

Apart from the inherent inefficiency of using GPUs to render voxels with raycasting, when they would generally be much happier doing what they were designed for and rasterising triangles, it's the amount of memory voxel data sets consume that prohibits their use in many scenarios.  While that's pretty much accepted wisdom and hardly newsworthy, I thought it might be interesting to take a moment to put some cold hard numbers out there.

So, how many voxels and therefore of course bytes of storage do you need to render a planet?  The answer as you would expect depends on the fidelity of your representation and your method of storage, but these are my numbers for what they're worth:

To encompass a maximum data set of 25,000 Km cubed I am using 22 levels of detail each offering eight times the number of voxels of the previous one (i.e. twice the number along each axis).  The lowest detail level uses a 19^3 grid of voxel bricks each of which is 14^3 voxels in size so the entire lowest detail level provides 266^3 voxels.  Dividing the 25,000 Km region by this equates to each voxel encompassing a cubic region of space 93.98 Km on a side (830,185 Km^3).  That's pretty big!

So obviously the lowest level of detail is pretty coarse but even at that low detail level there are 266*266*266 voxels making up the space which equates to 18,821,096 individual voxels.  So that's nearly Twenty Million voxels for just one very crude detail level!

The next level down gives us 532^3 voxels each 46.99 Km on a side so we end up with over 150 Million of them there - things are scaling up pretty quickly.  Continuing this down to the maximum detail 22nd level gives us the following results:
Voxel counts, sizes and storage requirements at each level of detail
As you can see the numbers get silly very quickly, and this is storing just a single distance field value per voxel in half precision floating point (i.e. two bytes per voxel) without any thought to storing normals or occlusion for lighting or any kind of texturing information.
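The per-level figures can be reproduced with a few lines of arithmetic. This is a minimal sketch in Python rather than code from the project; the constants are the ones quoted in the post, while the function name `level_stats` and the convention that level 0 is the coarsest level are assumptions made for illustration.

```python
# Figures quoted in the post: a 25,000 km cube, 22 levels of detail,
# a 19^3 grid of bricks per level, 14^3 voxels per brick and one
# half precision distance value (two bytes) per voxel.
CUBE_SIZE_KM = 25_000
LEVEL_COUNT = 22
BRICKS_PER_AXIS = 19
VOXELS_PER_BRICK_AXIS = 14
BYTES_PER_VOXEL = 2


def level_stats(level, cube_size_km=CUBE_SIZE_KM):
    """Return (voxels per axis, voxel size in km, voxel count, bytes).

    Level 0 is assumed to be the coarsest level; each later level
    doubles the number of voxels along every axis.
    """
    voxels_per_axis = BRICKS_PER_AXIS * VOXELS_PER_BRICK_AXIS * 2 ** level
    voxel_size_km = cube_size_km / voxels_per_axis
    voxel_count = voxels_per_axis ** 3  # exact, Python integers do not overflow
    return voxels_per_axis, voxel_size_km, voxel_count, voxel_count * BYTES_PER_VOXEL


# Print one row per level: grid size, voxel size in metres, count, bytes.
for level in range(LEVEL_COUNT):
    per_axis, size_km, count, size_bytes = level_stats(level)
    print(f"{level:2d}  {per_axis:>11,}^3  {size_km * 1000:16.4f} m  "
          f"{count:.4e} voxels  {size_bytes:.4e} bytes")
```

Passing a different `cube_size_km` gives the same statistics for a smaller region of space, since only the voxel size changes and not the voxel count.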

So to put it another way, storing the full 25,000 Km sized cube of space at a maximum detail level of 4.5 cm per voxel would take 173,593,970,549,359,000,000,000,000 (173 million billion billion) voxels taking a mind boggling 294,000 Zettabytes of storage (counting a Zettabyte as 2^70 bytes) just for the distance field!  Putting that astronomical number into real world terms, if you burnt it all to dual layer DVD disks and stacked them on top of each other the pile would be nearly five times as high as the distance from the Earth to the edge of the Solar System!

I know the cost of hard drive storage continues to decrease but I'm pretty sure I can neither afford that sort of storage nor fit it in my PC!

Let's think about it a bit more though, firstly 25,000 Km is the maximum size my system can store - let's for the sake of argument say that I'm only interested in storing something Earth sized for now. It's not a perfect sphere but The Earth has a generally agreed approximate radius of 6371 Km which when plugged in to my formula reduces the numbers quite substantially:
Same voxel statistics but for an Earth sized cube of space
Which while better is still pretty extreme at a cool 39,000 Zettabytes - but let's keep going anyway. These numbers are for the entire cubic space but planets are conventionally round so with the volume of a sphere being (4/3)*PI*(Radius^3) our Earth sized planet has a volume of 1,083,206,916,845 Km^3, just 52.4% of the entire cube volume.
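The sphere-versus-cube comparison is easy to check. A small sketch, with function names chosen purely for illustration; the bounding cube is assumed to have a side of twice the radius:

```python
import math

EARTH_RADIUS_KM = 6371  # approximate radius quoted in the post


def sphere_volume_km3(radius_km):
    """Volume of a sphere, (4/3) * PI * Radius^3."""
    return 4.0 / 3.0 * math.pi * radius_km ** 3


def bounding_cube_volume_km3(radius_km):
    """Volume of the smallest axis aligned cube that contains the sphere."""
    return (2 * radius_km) ** 3


sphere = sphere_volume_km3(EARTH_RADIUS_KM)
cube = bounding_cube_volume_km3(EARTH_RADIUS_KM)
print(f"sphere volume: {sphere:,.0f} km^3")
print(f"fraction of bounding cube: {sphere / cube:.1%}")
```

The fraction is PI/6 whatever the radius, so the saving from ignoring the corners of the cube is the same for any planet size.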

Before proceeding further let's define the planet a bit more accurately though; while they are basically spheres a perfectly smooth spherical surface isn't very interesting so we need some amount of surface detail. Conversely though we don't typically want to be able to travel to the planet's core so data below a certain depth underground probably isn't needed. Combine these two and you end up with a spherical shell of some determined thickness that defines the area of interest for rendering. My chosen thickness is currently 18 Km (about 59,000 feet) which is approximately twice the height of Everest; I am anticipating that having roughly 10 Km available for the tallest mountains and 8 Km available for the deepest caves or underwater abysses ought to be sufficient.

Making this assumption that we are only interested in a hollow shell of data is a great optimisation both of storage and rendering performance because instead of tracing the ray through the entire cubic data set we can first intersect it with the outer sphere of the planet's data shell and start tracing from there removing the need to store a great many of the distance samples. Assuming there aren't any holes in the shell you can also ignore any space inside the shell as you're bound to hit something solid before getting there.

Some basic calculations show that for an Earth sized planet an 18 Km thick shell takes up just 0.44% of the planet's bounding cube and has a volume of about 9 billion Km^3, requiring something like 1.017E+23 voxels taking 172 ZB at two bytes per voxel.  Damn, that's still pretty large.
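The shell figures can be reproduced as follows. This is a sketch under two assumptions that the post does not spell out: the outer surface of the shell sits at the 6371 Km radius, and a "ZB" is counted as 2^70 bytes. With those choices the voxel count and storage figures quoted above come out as stated.

```python
import math

EARTH_RADIUS_KM = 6371
SHELL_THICKNESS_KM = 18
CUBE_SIZE_KM = 25_000                      # full data set size from the post
FINEST_VOXELS_PER_AXIS = 19 * 14 * 2 ** 21  # 266 voxels doubled 21 times
BYTES_PER_VOXEL = 2


def shell_volume_km3(outer_radius_km, thickness_km):
    """Volume between two concentric spheres (assumed shell placement)."""
    inner_radius_km = outer_radius_km - thickness_km
    return 4.0 / 3.0 * math.pi * (outer_radius_km ** 3 - inner_radius_km ** 3)


# Volume of a single voxel at the highest detail level (about 4.5 cm a side).
voxel_size_km = CUBE_SIZE_KM / FINEST_VOXELS_PER_AXIS
voxel_volume_km3 = voxel_size_km ** 3

shell_km3 = shell_volume_km3(EARTH_RADIUS_KM, SHELL_THICKNESS_KM)
shell_voxels = shell_km3 / voxel_volume_km3
shell_zb = shell_voxels * BYTES_PER_VOXEL / 2 ** 70  # binary "ZB", an assumption

print(f"shell volume: {shell_km3:.3e} km^3")
print(f"voxels:       {shell_voxels:.3e}")
print(f"storage:      {shell_zb:.0f} ZB")
```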

This estimation is pretty crude as it doesn't take account of the topology of the actual terrain which will allow many more bricks to be discarded and there are basic storage optimisations such as compressing the brick data but no matter which way you look at it you simply can't store brick data for a whole planet at this kind of detail.

There is also a major downside to all these assumptions of course - we are restricted to sensible planet shapes with sensible features, which is a shame as being able to create crazy shaped planets you can travel through the core of sounds like a lot of fun.  Even though my efforts at the moment are focused on conventional planet shapes, I'm taking care to make sure there are no artificial restrictions in place that would preclude more radical geology. These limitations are simply optimisations to make the data set more manageable and easier to generate (i.e. faster) while I develop the system; pushing the boundaries of what's possible is a large part of this project's raison d'être.

Although all these crazy numbers make this project sound like an exercise in futility, remember that these are for the highest 4.5cm per voxel detail level.  Each level lower takes only 1/8th the amount and of course you only need high detail data for the immediate vicinity of the viewpoint; the key therefore is to have a system that can generate brick data on demand for any given position in the clipmap hierarchy.  Combine this with a multi-tier memory and disk caching strategy and you get something usable.

Remember also that one of the benefits of clipmaps is that the memory overhead for rendering is fixed regardless of detail level.  I support ten renderable levels from the set of 22, so it makes no difference whether I'm rendering levels 0 to 9 or levels 12 to 21: the overhead is the same.
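To put a rough number on that fixed overhead, here is a sketch that assumes each renderable level holds the 19^3 grid of 14^3 voxel bricks described elsewhere on this blog, at two bytes per voxel, and ignores any extra border, index or cache data. The function name `clipmap_bytes` is hypothetical.

```python
RENDERABLE_LEVELS = 10
BRICKS_PER_AXIS = 19
VOXELS_PER_BRICK_AXIS = 14
BYTES_PER_VOXEL = 2


def clipmap_bytes(first_level):
    """Bytes of voxel data resident when rendering ten consecutive levels.

    first_level is accepted only to make the point explicit: it does not
    appear in the calculation, because every level has the same voxel
    resolution and only the region of space it covers changes.
    """
    voxels_per_level = (BRICKS_PER_AXIS * VOXELS_PER_BRICK_AXIS) ** 3
    return RENDERABLE_LEVELS * voxels_per_level * BYTES_PER_VOXEL


for first in (0, 12):
    size = clipmap_bytes(first)
    print(f"levels {first} to {first + 9}: {size:,} bytes "
          f"({size / 2 ** 20:.0f} MiB)")
```

Under these assumptions the renderable set is a few hundred megabytes of distance values, compared with the zettabytes needed for the whole data set.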

Finally, you're maybe wondering why bother to store the bricks at all when there are some pretty cool projects out there showing what can be achieved with noise based functions directly on the GPU such as iq's Volcanic.  The main reason I'm sticking with them is that I want to be able to add all sorts of voxely infrastructure onto my planets such as roads, cities and bridges which are hard to do dynamically in real-time but also because I want to experiment more with creating truly heterogeneous terrains rather than the fairly homogeneous looking constructs real time noise tends to favour.  That's the goal anyway.

I realise this has been a pretty dry number-heavy post so +1 if you're still with me, hopefully the next one will be a bit more visual.

Wednesday, July 3, 2013

Interplay of Light

This is just a quick shout-out for the blog of a colleague here at Blitz Games Studios:

"Interplay of Light" is the work of talented graphics programmer Kostas Anagnostou and is a recommended addition to the feed list of anyone interested in the field.

Tuesday, June 18, 2013

Goodbye Octrees

It's been a while since I posted an update on the Voxelus project but it has been advancing slowly as time allows.  As with most hobby projects there never seems to be enough time but it's certainly not defunct and I enjoy fiddling with it when I can pull myself away from the Kerbal Space Program!

Framework Switch

Most of the changes however are fairly invisible from the outside.  One of the largest, for example, was changing C++ framework from the commercial one that I use and help develop at work to an entirely new one of my own creation.  Although a considerable amount of work, I was finding the constantly moving goalposts of using a framework being actively developed for other purposes frustrating, as was the ever increasing complexity it brought.

By creating my own framework I am now entirely in control and can create one that works exactly as I want and is only as complex as necessary to achieve my goals.  Another benefit is that because it's now 100% my own code should I ever wish to distribute source for any of my projects I can do so freely without worrying about copyright or IP ownership issues.

Goodbye Octrees

Apart from changes to the underlying code structure, the biggest change to the project and definitely the largest consumer of development effort is my move away from sparse voxel octrees.  I had started out using SVOs as they have been receiving quite a lot of coverage recently and they felt like a good way to experiment with raycasting voxels.  While they work well for relatively small bounded areas I was having trouble working out a way to make them scale sufficiently to encompass an entire planet - my stated goal for this project.

The main problem with the SVO approach was the number of levels of data necessary to represent a planet to a sufficient fidelity.  Using some basic arithmetic I worked out that I would need 22 levels of voxel data to represent a cube of space 25,000 km on a side (roughly twice the size of the Earth) down to a resolution of 4.5 cm per voxel which I felt would be sufficient for walking around on the surface.  Ignoring the issues of storage, having to recurse down up to 22 levels of octree every step along every pixel ray reading index and data textures at every iteration felt like asking too much of the GPU.
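The level count follows from repeatedly halving the voxel size until it drops to the target resolution. A minimal sketch, assuming the coarsest level has 266 voxels along each axis (the 19 x 14 brick layout described further down); for an octree with a different root resolution the count would change accordingly, and the function name is hypothetical.

```python
import math


def levels_needed(cube_size_m, target_voxel_m, coarsest_voxels_per_axis):
    """Number of levels needed so the finest voxel is no larger than the target.

    Each level halves the voxel size of the previous one, so the number of
    halvings is the base 2 logarithm of the size ratio, rounded up.
    """
    coarsest_voxel_m = cube_size_m / coarsest_voxels_per_axis
    halvings = math.ceil(math.log2(coarsest_voxel_m / target_voxel_m))
    return max(halvings, 0) + 1  # +1 counts the coarsest level itself


levels = levels_needed(25_000_000, 0.045, 266)
finest_voxel_m = 25_000_000 / (266 * 2 ** (levels - 1))
print(f"{levels} levels, finest voxel {finest_voxel_m * 100:.2f} cm")
```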

The problem can be simplified somewhat by restricting the number of levels actually being used in any given frame to a sensible range based on the distance of the viewpoint from the surface - my guesstimate at the moment is using ten of the 22 levels - but you either need to still recurse down from the root to get to the first of the ten levels of interest or you need some sort of strategy allowing you to jump in to the octree structure at any given level and point in space.  Neither of these felt like problems I wanted to tackle.

I decided instead to ditch the octree structure altogether and move instead to a clipmap based system.  I had used clipmaps previously on other terrain based projects for rendering heightfields where they provide a relatively simple and effective solution for storing, streaming and rendering multi-resolution data so I thought I would be able to drop them in fairly easily to this current project.  Well...sort of.

Clipmap Background

Clipmaps have been around for a long time: there is an SGI paper on them from 1998, for example, or more recently an article on using them for terrain rendering in GPU Gems 2.  Miguel Cepero also appears to use them in 3D in his Voxel Farm Engine project. Essentially they provide a way to scroll a potentially infinite data field through a fixed size multi-resolution cache.  If you are familiar with mip-maps you can think of the 2D case as being somewhat like a reverse mip-map: instead of each level of data representing the same area of space but at half the resolution, in a clipmap each level of data is the same resolution and instead represents an area of space twice as wide and twice as high as the preceding level.
Three levels of a 4x4 clipmap illustrating how each level is the same resolution but covers four times the area of the preceding level at quarter the data density
So for example, if the first level of the clipmap stores a 1 km square of terrain in a 512x512 grid, the second level would store a 2 km square also in a 512x512 grid while the third level would store a 4 km square once again in a 512x512 grid.  It can be easily seen here that each level is storing four times the area of the preceding one, a geometric progression that allows huge areas to be covered with a relatively low number of levels.
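The progression in that example can be tabulated with a short sketch. The function name and the dictionary keys are made up for illustration; the inputs are the 1 km extent and 512x512 grid from the example above.

```python
def clipmap_levels(base_extent_km, grid_size, level_count):
    """Describe each level of a 2D clipmap.

    Every level uses the same grid_size x grid_size grid; only the extent
    of terrain it covers doubles, so the sample spacing doubles as well.
    """
    levels = []
    for level in range(level_count):
        extent_km = base_extent_km * 2 ** level
        levels.append({
            "level": level,
            "extent_km": extent_km,
            "area_km2": extent_km ** 2,
            "sample_spacing_m": extent_km * 1000 / grid_size,
        })
    return levels


for info in clipmap_levels(base_extent_km=1, grid_size=512, level_count=3):
    print(info)
```

Ten such levels would already span a 512 km square at the coarsest level while keeping samples roughly 2 m apart next to the camera.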

When rendering with clipmaps you centre each level around the viewpoint so the highest fidelity data from the first level represents the terrain closest to the camera, the next level the data a little further away and so on.  As perspective is reducing the size of features on-screen anyway the drop in resolution with distance provides a natural level of detail scheme.

A key trick for efficiency when moving clipmaps around your data set is to think of the mapping of terrain to the clipmap buffers in memory as toroidal so instead of having to copy all the data around as the viewpoint moves you simply write the new data that's just moved into range over the old data that's just moved out of range.  In this way the minimum amount of memory is rewritten and because the clipmaps never actually change size or move away from the viewpoint you don't get any numerical precision issues during rendering facilitating an effectively infinite data set.
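Toroidal addressing is easiest to see in one dimension. The sketch below is illustrative only and is not taken from Voxelus: the class name, the `load_sample` callback and the use of a plain Python list are all assumptions. A world index always maps to the same slot via a modulo, so scrolling the window only rewrites the slots whose contents have just moved out of range.

```python
class ToroidalBuffer:
    """Fixed size window over an unbounded 1D data set with wrap-around storage."""

    def __init__(self, size, origin, load_sample):
        self.size = size
        self.origin = origin            # world index of the first sample in range
        self.load_sample = load_sample  # hypothetical callback: world index -> value
        self.slots = [None] * size
        for world_index in range(origin, origin + size):
            self.slots[world_index % size] = load_sample(world_index)

    def read(self, world_index):
        if not self.origin <= world_index < self.origin + self.size:
            raise IndexError("sample is outside the clipmap window")
        return self.slots[world_index % self.size]

    def move_to(self, new_origin):
        """Scroll the window and return how many slots had to be rewritten."""
        old_lo, old_hi = self.origin, self.origin + self.size
        rewritten = 0
        # A real implementation would compute the dirty range directly;
        # scanning the whole window keeps the sketch short.
        for world_index in range(new_origin, new_origin + self.size):
            if not old_lo <= world_index < old_hi:
                self.slots[world_index % self.size] = self.load_sample(world_index)
                rewritten += 1
        self.origin = new_origin
        return rewritten


buffer = ToroidalBuffer(size=8, origin=0, load_sample=lambda i: i * 10)
print(buffer.move_to(3))  # only the three newly exposed samples are loaded
print(buffer.read(10))
```

The same mapping extends to 2D or 3D by applying the modulo independently on each axis, in which case the dirty regions become strips or slabs along the direction of movement.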

For a system like Voxelus where creating or loading the data for a region can take a number of frames and it's possible to move the viewpoint rapidly around you do get inevitable flicker if the data for the new region can't be copied in fast enough but unnecessary flickering can be minimised by keeping a border around the edge of the clipmap that isn't normally rendered.  When the viewpoint moves at a more measured pace this extra data gives you something to render while the new data is paged in.  Of course the faster your viewpoint moves the wider this band of pre-prepared data must be to avoid flickering so it's a trade-off of memory versus visual artifacts.

Adding Dimensions...

As I said, I had used 2D clipmaps before in a fairly traditional heightfield renderer, but to use them for rendering a volumetric planet I had to extend them to work in 3D.  In fact, as I was looking to render a set of ten levels from my full planet sized 22 level data set, I actually had to implement a four dimensional clipmap arrangement: the renderable data set scrolls not just in the X, Y and Z spatial axes but also along the W axis representing which level of data to use as the viewpoint moves closer to and further away from the surface.

This isn't conceptually that much more difficult, but it did give me some head scratching moments trying to work out the shape of dirty data regions as the viewpoint moved around!

It's well known that voxels can produce vast amounts of data which can be difficult to manage efficiently in a real-time environment, and trying to manage them individually makes this even worse, so although I dropped the octree data structure I decided to keep the brick concept from the earlier version.  In my current implementation each of the ten clipmap levels stores a 19^3 grid of bricks each of which is a 14^3 array of voxels.

Wireframe bounds of each of the 10 clipmap levels when the viewpoint is located close to the planet surface on the left hand side
As can be seen here, each level encloses 1/8th of the volume of the preceding level.  Normally they are centred around the viewpoint but that makes them hard to visualise, so for the purposes of this screenshot I moved the viewpoint close to the surface on the left hand side of the planet then locked their positions before flying back out.

Clipmap levels coloured by level
This second image shows the view from the original camera position near the surface, but here the voxels from each clipmap level are being rendered in a different colour.  Note the smooth blending between voxels of different levels achieved by sampling adjacent levels and blending the distance values.  This is one of the main benefits of raycasting voxels rather than triangulating them - seamless per-pixel level of detail blending - which I plan to expand on in another post.

Voxel brick visualisation
Finally this image shows the same style of colour coded level of detail view but also illustrates the size of the voxel bricks.  Each of those patches is a 14^3 brick of voxels which is the smallest unit of data I generate, store and transfer.  Only the higher detail brick size is shown here so each pixel is in fact sampling from the illustrated brick size and the equivalent one from the next level down (i.e. double the width, height and depth).  Note how the bricks get larger automatically with distance - a simple metric of pixel size over distance is used to pick the best pair of levels at each point along each pixel's ray.
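The post does not give the exact metric, so the following is only one plausible reading of "pixel size over distance", sketched in Python. It assumes the footprint of a pixel grows linearly with distance along the ray, that level 0 is the coarsest level, and that the ideal level is the one whose voxel size matches that footprint; the function name and every parameter are hypothetical.

```python
import math


def pick_level_pair(distance_m, pixel_angle_rad, level0_voxel_m,
                    first_level, last_level):
    """Pick two adjacent levels and a blend weight for a point on a ray.

    Returns (coarse_level, fine_level, fine_weight) where fine_weight is the
    share of the higher detail level in the blend, from 0.0 to 1.0.
    Requires last_level > first_level.
    """
    # Approximate width of one pixel projected to this distance.
    footprint_m = max(distance_m * pixel_angle_rad, 1e-9)
    # Fractional level whose voxel size equals the footprint
    # (voxel size halves with every level).
    ideal = math.log2(level0_voxel_m / footprint_m)
    ideal = min(max(ideal, first_level), last_level)
    coarse = min(math.floor(ideal), last_level - 1)
    return coarse, coarse + 1, ideal - coarse


# Voxel size of 1024 m at level 0 and one radian per pixel are artificial
# values chosen so the results are easy to follow.
print(pick_level_pair(1.0, 1.0, 1024.0, 0, 21))
print(pick_level_pair(2.0, 1.0, 1024.0, 0, 21))
```

Doubling the distance moves the selection one level coarser, which matches the way the bricks in the image grow with distance from the camera.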

As the viewpoint moves bricks are pulled from a CPU side cache and copied into the dirty regions.  By transferring voxel data around in brick sized units it's possible to use the hardware far more efficiently than trying to do so individually.

The Devil's in the Details

As anyone who's ever tried to implement complicated real time graphics will know, even conceptually simple systems can end up taking plenty of effort to get working well, but to help stave off TL;DR syndrome I'll save further detail for another post.

Until then, anyone care to guess how many voxels I'll need to make up a planet?