This post is part of a series in which I try to explain everything I learned at GDC ’09. In it, I cover Jim Van Verth’s talk on affine transformations. I had hoped to get this post done a week ago, but I wasn’t confident in my understanding of the subject matter, so I took some time to research it independently. So, caveat lector: although I’m a math guy, I’m not a 3D graphics expert. I linked to the references I used at the end of this article, and I encourage you to take a look at some of them. However, the fact that I learned this recently could be a good thing; it’s always easier to teach things you just learned. People who’ve known a thing for years have totally internalized it and don’t know what it was like to hear it for the first time.
What is an affine transformation?
It’s a way of changing the size, shape, and/or position of an object in a 2D or 3D scene. Affine transforms are not the only kinds of transforms; they have some special properties.
- If three points were on a line before an affine transform, they will still be collinear after it.
- The same holds true for four points on a plane.
- If two lines were parallel before an affine transform, they will be afterwards.
An affine transformation is a function that takes a vector or point and returns another vector or point. To transform an entire shape, just apply it to all the vectors and points in the shape.
I struggled with deciding when to introduce equations in this article. I decided to put them at the end, so that I can motivate them with concepts first. The trade-off is that it’s going to sound a little weird when I talk about composing transforms, because I’ll have to split it between the two sections. Luckily, they’re both pretty short.
Why use affine transforms?
Objects in a 3D scene are stored in memory somewhere with their own local coordinate system. To be in the right relative positions when the scene is rendered, they need to be moved, rotated, and sometimes scaled.
What kinds of affine transformations are there?
Translations, rotations, scales, reflections, and shears. All but shears are easy to imagine and describe, which is convenient because shears are “the bad one” that usually only happen by accident.
- A translation moves objects around without rotating them or changing their sizes. It’s a rigid body transform, meaning it could be performed on a rigid object, like a brick, in the real world.
- A scale increases or decreases the size of objects. It can just make an object bigger, or stretch it in just one direction, or any combination of those. It’s not a rigid body transform, because you can’t stretch bricks.
- A rotation does just what you’d think: it rotates an object. It’s a rigid body transform. It only rotates objects around the origin; to rotate objects around other points, you have to first translate the object, rotate it, then translate it back. I’ll go into that later.
- A reflection is a mirror-image. It swaps left and right or up and down. It’s not a rigid body transform. Another way to think of it is as a negative scale; it’s what you’d get if you squash something down to zero and stretch it back out again in the other direction.
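- A shear slides each point along one axis by an amount proportional to its position along another axis, a bit like a translation that varies across the object. When you apply one to an object, the effect is visually similar to slanting. It’s not a rigid body transform. You rarely want these in computer graphics.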
How do you compose them?
By applying one and then applying the next. There are only two things to say about composing before looking at the math. First is that order matters; for example, if you’re applying both a non-uniform scale and a rotation, and you apply them in the wrong order, you can end up with a shear. Second, composing transforms is the tool that lets you rotate around a point other than the origin. You do this by translating until that point is the origin, applying the desired rotation, and then applying the inverse of the original translation.
How does the math work?
Affine transforms can be represented in more than one way. I’m only going to show the one I think is easiest. In fact, the method I present is not the standard, as many readers have pointed out. I only picked it to make this material as approachable as possible, which will make learning the other methods easier later. For more on this, see the Van Verth presentation and the Wikipedia page in the references. That said, here’s how an affine transform’s function looks to mathematicians:
T(x) = Ax + y
x is the point you’re transforming, represented as a column vector. A and y define the affine transform; A is a matrix and y is another column vector. Here’s what that looks like in 3 dimensions:
In this example, I filled in all the numbers in the matrix A and the vector y at random. Intelligently generated transforms have a specific form for each type.
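For reference, the same T(x) = Ax + y form can be written out in 3 dimensions with symbols in place of numbers. The entry names a_ij, x_i, and y_i are labels chosen here for illustration only:

```latex
T(\mathbf{x}) =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
+
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
```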
Translating an object by a units along the x axis, b units along the y axis, and c units along the z axis:
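In the T(x) = Ax + y form, a translation should leave A as the identity matrix and put the three offsets in the vector y. A reconstruction under that assumption:

```latex
T(\mathbf{x}) =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
\mathbf{x}
+
\begin{bmatrix} a \\ b \\ c \end{bmatrix}
```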
Scaling an image by a factor of a in the x direction, a factor of b in the y direction, and a factor of c in the z direction:
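In the T(x) = Ax + y form, a scale should put the three factors on the diagonal of A and leave the vector y at zero. A reconstruction under that assumption:

```latex
T(\mathbf{x}) =
\begin{bmatrix}
a & 0 & 0 \\
0 & b & 0 \\
0 & 0 & c
\end{bmatrix}
\mathbf{x}
+
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
```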
Rotating an image θ degrees around the x axis:
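Assuming the usual right-handed, counterclockwise convention with column vectors, and writing θ for the angle, the vector y would be zero and the matrix A would be:

```latex
A =
\begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\theta & -\sin\theta \\
0 & \sin\theta & \cos\theta
\end{bmatrix}
```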
Rotating an image θ degrees around the y axis:
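Again assuming a right-handed, counterclockwise convention with column vectors and an angle θ, the vector y would be zero and the matrix A would be the following. Note that the signs on the sine terms are flipped relative to the other two axes:

```latex
A =
\begin{bmatrix}
\cos\theta & 0 & \sin\theta \\
0 & 1 & 0 \\
-\sin\theta & 0 & \cos\theta
\end{bmatrix}
```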
Rotating an image θ degrees around the z axis:
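With the same assumptions (right-handed, counterclockwise, column vectors, angle θ), the vector y would be zero and the matrix A would be the following. Its upper-left 2x2 block is the familiar 2D rotation matrix:

```latex
A =
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{bmatrix}
```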
Reflecting an image across the x axis:
Reflecting an image across the y axis:
Reflecting an image across the z axis:
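Taking the earlier description of a reflection as a negative scale, one plausible reading is a scale matrix with -1 in one diagonal entry. Which coordinate "across the x axis" is meant to flip is an assumption here, and other conventions exist. Below is a minimal Python sketch of the forms in this section under that reading. Every function name is hypothetical, and a transform is stored as a plain (A, y) pair:

```python
import math

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [0, 0, 0]

def mat_vec(A, x):
    """Multiply a 3x3 matrix (a list of rows) by a 3-vector."""
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def apply(transform, x):
    """Evaluate T(x) = Ax + y for transform = (A, y)."""
    A, y = transform
    Ax = mat_vec(A, x)
    return [Ax[i] + y[i] for i in range(3)]

def translation(a, b, c):
    # Identity matrix, offsets in the vector.
    return (IDENTITY, [a, b, c])

def scale(a, b, c):
    # Factors on the diagonal, zero vector.
    return ([[a, 0, 0], [0, b, 0], [0, 0, c]], ZERO)

def rotation_z(degrees):
    # Counterclockwise rotation about the z axis (assumed convention).
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return ([[c, -s, 0], [s, c, 0], [0, 0, 1]], ZERO)

def reflection_x():
    # ASSUMPTION: "across the x axis" is read as negating the x coordinate.
    return scale(-1, 1, 1)

p = [1, 2, 3]
moved = apply(translation(10, 0, 0), p)      # [11, 2, 3]
stretched = apply(scale(2, 1, 1), p)         # [2, 2, 3]
mirrored = apply(reflection_x(), p)          # [-1, 2, 3]
turned = apply(rotation_z(90), [1, 0, 0])    # approximately [0, 1, 0]
```

The rotation result is only approximately [0, 1, 0] because cos(90°) comes out as a tiny nonzero float.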
How do you compose them? (with math!)
Same answer as before: apply one, then apply the other. And, as above, order matters. Let’s take a look at one. Suppose I want to apply two transforms: Ax + y and then Bx + z. It would look like this:
B(Ax + y) + z
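Multiplying that out gives (BA)x + (By + z), so the composition is itself an affine transform with matrix BA and vector By + z. Here is a minimal Python sketch of that idea, using it to rotate around a point other than the origin and to show that order matters. The function names and the (A, y) pair representation are assumptions made for illustration:

```python
import math

def mat_vec(A, x):
    """Multiply a 3x3 matrix (a list of rows) by a 3-vector."""
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def mat_mul(B, A):
    """Multiply two 3x3 matrices, returning BA."""
    return [[sum(B[i][k] * A[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(transform, x):
    """Evaluate T(x) = Ax + y for transform = (A, y)."""
    A, y = transform
    Ax = mat_vec(A, x)
    return [Ax[i] + y[i] for i in range(3)]

def compose(first, second):
    """Return one transform equal to applying `first`, then `second`."""
    A, y = first
    B, z = second
    By = mat_vec(B, y)
    # B(Ax + y) + z = (BA)x + (By + z)
    return (mat_mul(B, A), [By[i] + z[i] for i in range(3)])

def translation(a, b, c):
    return ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [a, b, c])

def rotation_z(degrees):
    # Counterclockwise rotation about the z axis (assumed convention).
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return ([[c, -s, 0], [s, c, 0], [0, 0, 1]], [0, 0, 0])

def rotation_z_about(degrees, px, py, pz):
    """Translate the pivot to the origin, rotate, then translate back."""
    to_origin = translation(-px, -py, -pz)
    back = translation(px, py, pz)
    return compose(compose(to_origin, rotation_z(degrees)), back)

# Rotate the point (2, 1, 0) by 90 degrees around the pivot (1, 1, 0).
pivoted = apply(rotation_z_about(90, 1, 1, 0), [2, 1, 0])  # about [1, 2, 0]

# Order matters: the same two transforms, composed both ways.
t, r = translation(1, 0, 0), rotation_z(90)
move_then_turn = apply(compose(t, r), [0, 0, 0])  # about [0, 1, 0]
turn_then_move = apply(compose(r, t), [0, 0, 0])  # about [1, 0, 0]
```

Collapsing a chain into a single (A, y) pair means each point in a shape only needs one matrix multiply and one vector add, no matter how many transforms were chained.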
Let’s compose a scale that increases an object’s size by a factor of 2:
and a rotation by 90 degrees about the z axis:
Here’s the composition:
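A reconstruction of the three pieces, reusing the names from B(Ax + y) + z with A as the scale and B as the rotation. It assumes the rotation is counterclockwise and that both transforms have a zero vector:

```latex
A =
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix},
\qquad
B =
\begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
\mathbf{y} = \mathbf{z} = \mathbf{0}

B(A\mathbf{x} + \mathbf{y}) + \mathbf{z}
= BA\,\mathbf{x}
=
\begin{bmatrix} 0 & -2 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\mathbf{x}
```

Because this scale is uniform, applying the two in the opposite order gives the same matrix. The order only starts to matter once the scale factors differ between axes.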
References
- Jim Van Verth’s PowerPoint presentation on affine transformations from GDC
- Wikipedia’s entry on transform matrices
- Wikipedia on affine transforms
- Diana Gruber’s presentation on rotation matrices from Xtreme Game Developers Conference in 2000
Every GDC article I write has the goal of helping us do a better job at IMVU. Since I want my articles to be relevant to a wider audience, I’m separating out the IMVU tie-ins. What follows is the IMVU-centric section of this post, which won’t be relevant to everyone. For the rest of you, please enjoy the fuzzy kitten.
At IMVU, we’ve already got this pretty well licked. 3D rendering is our core product! Maybe we could get me involved in it, though, now that I’ve spent all this time studying it. Affine transforms are not nearly all there is to 3D rendering, though, so maybe we should wait until I finish studying and writing about the rest of the math involved. Also, I’m not the only programmer we have who is not a 3D graphics guru, so we could simply use this article as education.