Wednesday, October 15, 2014

Country Boundaries

After spending some time on a system to make names for places as described in my last project post I thought it would be interesting to look at creating countries for my planet I could then attach such names to.  I wanted to create a system where I could determine from any point on the surface of the planet which country it was part of.

I thought the easiest way to go about this would be to create a triangulated shell around the planet against which a ray from and normal to the surface could be intersected, the triangle that was hit would let me know which country the point was within.

My very first go at this involved setting up an array of 200 points chosen at random around the surface of a sphere enclosing the planet, these were created by simply choosing random numbers in the [-1, 1] range for the X, Y and Z positions then normalising the resulting vector. Once I had these I use the QuickHull Algorithm to produce a triangulation of the convex hull formed by these points and assigned a country ID to each - albeit really just a colour at this point.  The control net mesh looked like this:

Triangulation of country control net created purely from seed points
For now I'm just using a sphere created by subdividing an Icosahedron for my planet's terrain and it's purely spherical with no elevation variation to ease visualisation.  Taking the centroid of each terrain triangle and shooting a ray from it normal to the surface against the country boundary control net described above then colouring the terrain triangle with the country ID proves the concept of determining countries from surface points:

Countries found by ray-casting from the surface
Note the jagged edges of the triangles here (shown in close up in the top corner) in contrast to the smooth ones in the direct rendering of the control net above - remember this second image shows the individual terrain triangles flat shaded, effectively point sampling the country control net.

Unless you know of a planet with purely triangular countries though it's not particularly realistic.  To make the results a bit more organic I played with adding some noise to the vector shot from the surface against the control net which initially produced some slightly more encouraging results:

Perturbing the ray from the surface
The triangular nature of the control net however is still quite apparent and more problematically closer examination reveals there are problem cases where the noise causes unrealistic "bubbles" of a country to leak into adjacent countries due to the perturbation of the vectors being intersected against the control net.  This occurs most near the country boundaries as points on one side can end up being displaced enough to end up embedded in the neighbouring country:

"Bubbles" of one country leaking into it's neighbour
This is not something you generally see in the real world and was what I felt to be an unacceptable results which unfortunately I couldn't think of a way to correct.  I ruled out noise perturbation as a result and went back to the drawing board.

Edge Subdivision 


Instead of noise then I had to come up with plan B - my second approach was to subdivide the edges of the triangles adding displacements to the introduced points to make the edges more interesting.  In this way the boundaries can be more interesting while still preventing "bubble" errors.  The process here is to recursively divide any boundary edge over some minimum length introducing a new boundary point in the middle of it.  The cross product of the edge vector and the normal gives a tangent which is used to displace this new point by some random amount.  The magnitude of this displacement is capped to be a proportion of the length of the edge so long edges have their midpoint displaced further than short edges to keep things in proportion.

Subdividing the edges did turn my nice triangles into n-gons so I also had to add in a triangulation step to turn each set of country boundary points back into a mesh I could both ray test against and render for debugging purposes.  As I knew the boundaries couldn't have holes I used the trusty Ear Clipping Algorithm as it's so simple to implement and fast enough for this case.

There is still a potential for error here though as there is a possibility that the displaced midpoint might end up within an adjacent country's boundary causing the boundary edges to intersect creating invalid boundary pockets:
An edge between two countries that has been subdivided
Displacing the newly introduced midpoint perpendicularly to the edge by a distance proportional to the edge's length to make the boundary more interesting...

...but if you displace it too far the new edges can intersect the other existing edges of one or other boundary making an invalid boundary
So for each potentially displaced midpoint the two edges that would connect it to the original edge's end points are tested for intersecting any of the remaining edges of the two boundaries that adjoin across the edge being split.

If no intersection is found the displaced point is valid and I move on testing the two created edges to see if they in turn can be sub-divided, if however either of the two new edges intersect a remaining edge I have to pick a new position for the midpoint and try again.  To reduce the chance of getting stuck trying to find a valid position each time I have to reposition it I reduce the maximum magnitude of the displacement with each attempt so the new midpoint gradually tends back to the original edge.  If the magnitude gets so low it's not worth dividing the edge at all I just give up, leave the original edge undivided and move on to the next edge.  This only tends to happen in highly acute boundary corners where slivers are produced.

Running the process dividing any edge longer than 50Km gives a noticeably better result than the simple triangles I started with:
Country control net produced by subdividing edges of triangular countries
Although it's better there is no getting away from the fact that I started with triangles and I was still not particularly happy with it.  To make the country shapes more interesting my next step was to randomly take pairs of adjoining countries and merge them together producing fewer hopefully more interesting shapes:

Randomly merging adjacent countries helps obscure their triangular origins somewhat
Another improvement but again to me it's still quite obvious I started with triangles - there are a large number of nexus points where to my mind an unrealistically high number of countries meet at a single place:

Unrealistic nexus points where many countries meet, another consequence of the triangular basis
I experimented with increasing the number of seed points which meant more smaller triangles to start with but although the effect was improved a bit more it felt more like treating the symptom than a cure.

Dual Meshing


Rather than shrink the triangles I thought it would be better to start from a more friendly base so instead of using the triangulation of the convex hull surrounding my initial seed points I instead calculated the dual of this mesh effectively producing faces surrounding each seed point rather than passing through them.  This immediately gave me what I thought was a better initial mesh, effectively similar to a 3D Voronoi diagram:

Using the dual of the triangular mesh produces better country shapes as a basis
Running the edge subdivision on this mesh finally gave me a result I was pretty happy with:

Subdividing the edges of the new countries makes them more realistic
I also added back the random merging of adjacent countries to make things more varied:

Randomly merging adjacent countries provides more variety of scale
Note how the irritating nexus points no longer feature, the countries join in what I feel to be a far more realistic manner.  The main issue I have with it now though is the countries before merging are a bit too similar in size causing the result to be a bit uniform - I think in the future I'll play around with clustering the seed points somehow instead of having them uniformly random and see how that affects things.

Note the triangulation of the country boundary control net doesn't necessarily produce a spherical mesh as such as the edges of triangles spanning large areas will at times be much closer to the centre of the sphere than their unit length vertices producing odd looking kinks in the geometry:

An obvious kink in the yellow country caused by the control net's triangles spanning a wide solid angle on the planet
this isn't a problem however as remember I'm not normally rendering the country boundary geometry directly, I'm just intersecting rays shot from the far denser planet surface mesh against it to determine which country a point is located within.

A Note on Ray-Triangle Intersections


While working on this there were a number of times when I had to intersect rays or 3D line segments with triangles.  I was using a trusty old intersection routine I've had kicking around for years but found that even with double precision there were times when a ray or line would be reported as missing a triangle that it really should have intersected, usually when falling on an edge between two adjoining triangles.  Puzzled by this I did a little research (i.e. consulted Google) and found a better intersection algorithm specifically designed for intersecting with meshes, "Watertight Ray/Triangle Intersection"

Implementing this solved all my problem cases in one fell swoop so it turned out to be a good find, if you are interested in using it do note however that there appears to be a minor error in the code presented in the paper, see near the bottom of this blog post for more info.

Conclusion


Although this has been focussed on country generation, while working on it I have thought that the system could also be used to produce areas of water for oceans and seas by simply flagging certain boundaries as water zones.  This would produce some nicely shaped features while avoiding the problem of parts of a country being underwater if water was placed by a different system.  I will need to feed the country boundaries into the elevation production system however for this to work so the water zones and coastlines are given appropriate topology.  Something for the future anyway.

So there you have it, a set of what I feel are quite realistically shaped countries I can use to move on to the next step, possibly placement of towns and cities followed by primary road or rail networks...that sounds like it should be interesting to look into anyway.

Saturday, July 26, 2014

TerraTech

This is just a shout-out to publicise the kickstarter for TerraTech, an outstanding indie game being developed by a group of dedicated developers including a friend of mine:

It only has about a day to go and is 99% there on their funding target, it would be a real shame if this exciting project couldn't go ahead so check it out and support it if you can.  There is a free demo on Steam so there is no excuse :-)

Sunday, July 13, 2014

Nomenclature

I've had to take a bit of a break from project work recently due to other commitments but hopefully I'll be able to get back to at least tinkering with things soon.  As a bit of a preamble to that I've been looking at a bit of a fringe task for procedurally generated worlds, namely the procedural generation of names.

My plan for the near future is to focus  more on generating a plausible political and logistical infrastructure for a planet instead of purely working on the visual terrain as I have done many times before so I'm going to need to be able to name things be they continents, countries, regions, cities, airports, premiers or whatever.  This is a completely new area to me and so sounded like an interesting diversion.

There are many ways to procedurally construct names ranging from simply producing random sequences of letters to utilising complex linguistic rules but the one I chose to experiment with was the use of Markov Chains (http://en.wikipedia.org/wiki/Markov_chain), an algorithm that decides what comes next in a sequence by looking only at the preceding X elements and using some probability tables to decide what's most likely to follow.  In my case the sequence is the letters of the name being generated so by looking at the last so many characters of the name so far I can choose a plausible next character and so on.

So where do these probability tables come from?  Well to make names that look sensible I thought it best to reference the real world as a starting point, so to generate names for countries I use a set of 275 real world country names as my exemplars - something that's not too hard to come by these days thanks to the internet.  These real names are analysed to store the probability that any given sequence of characters is followed by another, for example it may be that 23% of the time the letter 'a' is followed by the letter 't', or that 15% of the time the sequence "tr" is followed by the letter "s" in the real world names.

The resultant behaviour of the algorithm is heavily dependent upon the size of the exemplar pool but even more significantly upon the number of characters you look at when deciding what might come next.  For example running the same algorithm with the same exemplars but different length groups produced the following sets of names:


It's immediately obvious that the longer the character sequence examined the more rigidly the generated names stick to the exemplars while the shorter the character sequence the more variety the generated names exhibit.  The extreme example here is table (1d) where only the last character emitted is used to decide what comes next resulting in a virtually random sequence of characters that bears little resemblance to the exemplars or even anything vaguely pronounceable!

Of the variations I think the three character sequence version (1b) is most promising but while the longer sequence length prevents too much randomness in the output at times there simply isn't enough randomness meaning that the generated names are quite similar or even identical to the real world exemplars.  There is also nothing preventing the same sequence being generated more than once such as "falklands uk".

To address the first of these issues I decided to run a check on each generated name to see if it was different enough to the exemplars to be accepted.  This is done by comparing the generated name to each exemplar and calculating the Levenshtein Distance between the two strings (http://en.wikipedia.org/wiki/Levenshtein_distance), essentially the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.

By rejecting any generated names that produced a Levenshtein Distance to any exemplar that was too low we can help ensure that we don’t accidentally end up with something too similar to a real world place name.  Running the same test again but now with this added exemplar distance check rejecting any generated names with an exemplar distance score of less than two we get a better selection of plausible but not actually real world names:


Note here that the distance check is performed on each separate word of the generated and exemplar names separately to avoid generating names where one word is unique but the other lifted straight from an exemplar - certain key words such as "the", "of" or "island" are exempted from this check however as they are perfectly valid in the output.

As an experiment, running the test one final time but this time with the distance rejection limit raised from less than two to less than five produced these country names:


Comparing these to the previous test's output the biggest change here is with the longer character sequence generated names (3b and to a lesser extent 3c) as the distance check has rejected anything even remotely similar to one of the exemplars causing far more random and less recognisable names.  The shorter sequence names at the bottom have been less affected as they were pretty random to start with.  Note there is no four character sequence output here as the system was unable to generate any names using that most restrictive of criteria that also passed the distance check - the names were by definition most similar to the exemplars therefore least likely to have an acceptably high distance score.

So far these have all been country names but how does the system work with a different set of exemplars?  To find out I used it to try to generate plausible city names instead - first using a full set of 26914 Earth city names with a three character sequence length and a distance threshold of three:


As you can see there is a lot of variety in the names reflecting the huge real world variety of city names depending on culture and language, as one of my main goals at the moment is to produce a procedural world that exhibits more heterogeneity than it typically found by global application of procedural systems however this one-size-fits-all name style is not really what I'm after.

By limiting the set of exemplars to those real world cities from a particular country however more localised results can be achieved.  Out of interest I ran the system over just those real world city names from the United Kingdom, United States, France and Japan in turn to see how closely the output would reflect their dramatically contrasting morphology:


I'm no language expert but to my layman eyes the output from each while varied amongst it's peers is noticeably different from those in the other tables which I think is good enough for my needs - I just want each of my virtual countries to have an identity reflected in it's nomenclature.

These tables have been generated from just a few runs and with specific parameters, but it's important to bear in mind that when used for real the exemplar pool, sequence length and the distance threshold can all be varied each time a name is needed providing a steerable but near limitless potential set of generated names.

Anyway, I’m happy enough with it for now so I’m going to move on to something else...comments as always are welcome



Sunday, September 15, 2013

The end of an era...

It's with great sadness that I have to announce the company I have greatly enjoyed working for over the last eighteen years has had to close:

http://www.develop-online.net/news/45340/Blitz-Games-Studios-closes

I think it impossible to overstate just what a great group of tremendously skilled people we had there producing a huge variety of projects on many diverse platforms.  The games industry is an unforgiving place however and a whole raft of factors conspired against us over the last year or two making it impossible to continue.

I wish everyone from there the very best of luck finding something new and exciting to work on.