Friday, 30 March 2012

Bezier Curve template

Sometimes you just want some simple code you can drop in to do bezier curves:

Templated Bezier function:

static inline float CR40( float t ) { return 0.5f*(2*t*t-t*t*t-t); }
static inline float CR41( float t ) { return 0.5f*(2.0f-5*t*t+3*t*t*t); }
static inline float CR42( float t ) { return 0.5f*(4*t*t+t-3*t*t*t); }
static inline float CR43( float t ) { return 0.5f*(t*t*t-t*t); }
template
T CRBez4( T a, T b, T c, T d, float t )
{
return CR40(t) * a + CR41(t) * b + CR42(t) * c + CR43(t) * d;
}

download

Well, now you can. Here's some rather old code I found, and recently reused.

Thursday, 24 November 2011

Existence checks

Let's say you have a large table of data, lots of rows with a few columns. You know that every row is unique because that's a basic feature of your table, no duplicates. There are only a few instances where non-unique elements are a good idea in tables, so we're going to ignore that corner case for now.

Let's look at what we have with an example:

Find out if Bob is the leader of team A
Name Clan Rank
alice team A leader
bob team B private
bob A team leader
charlie A team private

We can see that we can't just check for any one column and return on success, this is much like checking for a string, we have to check every element before we're sure we have a match. You could maintain the table's alphabetical order, reducing the string queries down to only the rows that match "bob" or "team A" (dependent on which column you sorted by). If you walk through the table, you have to string match at least one column.

But, we know that the table is made up of unique elements. There will be only one row returned if at all, so that makes this table a set. It's the set of Name Clan Rank associations that are known to be true. There are a very large number of potentially true values in this set, the product of the list of all potential names, all potential clans, and all potential ranks. Let's assume 4 billion names, 1 million clans, and 100 ranks. If we were to use a bit field to identify all the possible permutations, we'd not be able to fit the array into our computers. 400 trillion combinations would be quite literally, larger than we could handle. But sparse bit fields have better solutions than just simple array lookups.

What we've done is reduced the problem to its essential components. We're not looking for the row that contains the data, we're looking to see if the data exists. We don't want to access the data, only find out if it's in there. For this task, there are many solutions, but I'll just mention one.

I like hashes. My solution to this is to store the hashes of all the rows in an array of small arrays. I index into those small arrays by a small number of bits, the smallest amount of bits I can that differentiates the hashes enough that all the associated ones don't overflow the capacity of the cacheline sized mini arrays.

H1[bits0-n] -> index of array it should be in.

With this, I do one cache line load to get at the array of hashes, and if the hash is in there, then I can return that the row does exist. If not, then I can assert that it doesn't. And that's quick.

Monday, 31 October 2011

static

The static C++ keyword has multiple meanings based on context.

When used inside a function,

void foo() {
  static int i;
  printf( "%i", i );
  i++;
}

it means that the declared variable is global in lifetime scope, even though only local to the function for reading and writing. Normally this is used to implement singletons.

When used inside a class,

class Bar {
public:
  static void Baz() { ++foo; }
  static int foo;
};

it means that the declared member variable or function don't require a this pointer, and thus don't require an instance for operation. This can be useful for implementing class specific memory pools, or other types of class specific global state encapsulation.

When used in front of a variable or function,

static void Baz() { ++foo; }
static int foo;

it means that the global variable or free function are not destined to be used outside the current compilation unit, which reduces the number of symbols during link time, which can lead to better build times.

There is a missing functionality in that you cannot have a class defined that doesn't offer external linkage. You cannot specify that a class doesn't export its member functions. This is a shame as it guarantees that the link time of object-oriented C++ code gets worse, and you can't fix it.

Monday, 22 August 2011

float == float

Never use equality checking with floats, they are not accurate for checking with an equals operator.
For example, this loop would run forever:

for( float a = 0.0f; a != 1.0f; a += 0.1f )
blah();

Not ten times. Try it.

This is because 0.1f is an approximation. The IEEE standard float cannot represent 0.1f, only a very near neighbour. In fact, once you've added all ten, the bit pattern is only off by 1 bit, so it's pretty accurate, but still not equal.

1.0f = 0x3f700000
the sum of ten 0.1f is 0x3f700001
quite remarkable that it's so close.

So, you agree, floats are not safe for checking equality, right?


Wrong.
floats are great for equality, but just don't use the wrong ones.
for example, 0.125f has an exact representation in floats, so adding together eight of them will in fact, be perfect.
In fact, floats have exact representations for all the integers right up to 16,777,216.0f, after which the mantissa runs out of accuracy.
so, when you see code like this:

for( float a = 0.0f; a < 100.0f; a += 1.0f )
{
float percent = a * 0.01f
/* do something with a percent */
if( a == 0.0f || a == 25.0f || a == 50.0f || == 75.0f )
/* something to do with quadrants */
}

then you can trust it just fine. Seriously. Also, this is useful for times when you might want to not suffer the notorious load-hit-store. If you had done this:

for( int a = 0; a < 100; ++a )
{
float percent = a*0.1f;
/* do something with a percent */
if( a == 0 || a == 25 || a == 50 || == 75 )
/* something to do with quadrants */
}

you would be converting from int to float on every run through the loop, which is slow on any modern console.

Tuesday, 12 April 2011

Why use a quaternion?

No, Gimbal-lock* is not the answer.

Quaternions provide a different way of representing a transform. The representation is limited to rotations, optionally you can try to use it for mirrorings too, but it get's a little awkward so we don't normally do that.

Matrices provide a complete solution to representation of the orientation and position of an object, so they can be considered to be the de-facto solution for static scene transform representation, however, because they are a linear re-combination, they do suffer when used dynamically.

When you create your model matrix, the one for you windmill, you would normally generate a matrix (something like Matrix::MakeXRot( bladeAngle )) then set it into the rendering transform hierarchy. All well and good. What this is doing is setting a static matrix every time you change the bladeAngle value. If you are rendering a robotic arm, you would have a number of different matrices at different levels in that hierarchy, each one concatenating with another to finally provide that finger transform.

If you're not provided with all the angles, and instead being told that you have a start and end value for each transform in the hierarchy, then that works fine for the examples so far, you interpolate the angles and produce new matrices that are in-between the start and end matrices. This works fine because we're still in the world of scalars, interpolating between 0 and 100 for the windmill just gives us a value that we use to generate a rotation matrix. So all is fine.

But, when we are given the matrices, and not the rotations, suddenly we have a surplus of information, and we're not sure what to do about it. If you have a character end position that is a 180 turn from its start position (a Y-axis turn of PI), then you cannot tell which way to turn to interpolate, nor can you tell whether or not the real intention was not something more obscure, like a scaling to -1 in both the x and z axes.

Okay, so that last possible animation is unlikely to be a feature, more likely to be a bug, but that's what happens if you linearly interpolate a matrix. That's what normally happens when you linearly interpolate a linear transform.

This is where quaternions come in: they aren't a linear representation of orientation, not only are they non-linear, they're also not really rotations. The best explanation I've read / figured out for myself about what they really are, is that they are a kind of 4D mirroring operation, which when applied twice gives you the rotation you would expect. As far as anyone using them should be concerned though, they are vectors that give you a space for doing interpolation of 3D rotations.

A quaternion represents a rotation, it can be interpolated to produce a rotation that will be on the shortest path between the two given orientations. Quaternions can be interpolated linearly, spherically, or used in higher order functions such as beziers or TCB splines.

That is what you need to know. It allows you to interpolate properly.

Spherical linear interpolation, which leads to very smooth results, can often be relegated to trying too hard territory. Most animators and motion capture data already provide enough keys that doing the interpolation that carefully is not really necessary. Don't forget that artists don't generally see sub frame data, so getting it right is pointless in most cases.

* Gimbal-lock is when you can't interpolate from one rotation to another without doing some long winded route all round the houses. It manifests in Eulers due to the polar representation. Any time you get near a pole, you have to work overtime to get the transform in the right orientation.

Friday, 24 September 2010

Invent a new spline

For some reason of another, I had to invent a spline, and not a normal four control point one, but a three control point variant. It had to go through all three points, so a three point Bezier was out. I decided to limit the spline so that the tangent would match the end point difference, like Catmull-Rom splines do.

So, here is my new spline.

P(t) = P0*(1+2t^2-2t) + P1*(1-(2t-1)^2) + P2*(2t^-t)

P(0) == P0
P(0.5) == P1
P(1) == P2

At t=0.5 the tangent has equal slope to (P2-P0)

This should make for a reasonably useful curve.

view in glory: http://is.gd/fqyBT

Tuesday, 24 August 2010

How fast is your debug build?

Why is your debug build slow? Lots of asserts? Awkward templates that only compile out to sensible stuff in release?

It's only a theory right now, but I think that if your game is running like a dog in debug, maybe there's something wrong with the number of branches you're doing. Not just in debug, but in general.
Asserts are the usual cause of slow down (they break the cache and branch and cause memory wastage), but they are valuable as warnings when things are going wrong. They provide great protection, but they aren't necessary if you make sure that the condition that would normally trigger them cannot be achieved.

One of the most common uses I've seen for an assert is checking for NULL. This can be avoided by just making sure that your queries are either returning null objects instead (dummy objects), or by making the query inherently null proof (use a queued up todo list rather than a fetch from and check for work to do).

Another slowdown can be from templates that aren't being fully expanded or optimised in debug code. This is a lot simpler to fix. Stop using template meta programming. Your compile time will go down too. Template meta-programming is also a replacement for moving wholesale to scripted development. Consider the problem your meta-programming is solving in light of a scripted environment.

On the PS2, we never saw that big a difference in final build and the debug build. This was mostly down to the fact that the PS2 was spending all its time streaming data from one place to another (no nulls in the middle of a stream), and processing them with data driven transforms (no meta-programming for us, just plain coded transforms driven by data demand). The only code that ever got all that faster was the C++ class style math library (which was hilarious to watch as it went from being slower than hand coded vector unit code by a factor of four to being faster than the hand coded vector unit code by a factor of two.)