Opening GL: Camera - Playing with Matrices

So one of the first things I wanted was to have a free roaming camera that would let me explore the world from all angles. I felt this would be very useful for seeing how different things that I had coded would affect the scene and to make sure that there weren't any nasties hiding in a perspective that I hadn't considered!

After playing with OpenGL for a little while, I of course stumbled upon the gluLookAt() function. This takes in 3D world coordinates for the position of the camera and the point that the camera is looking at, along with an up vector. The function then constructs vectors from these and does some jiggery-pokery to position the camera at the given coordinates, looking in the direction given, with the world rotated to simulate the camera having moved to that position.

While this was all well and good, I wasn't comfortable with gluLookAt() - it felt too restrictive, too clunky. Looking around the internet, I found many people commenting that for a camera with any real power, managing one's own matrices for world and view transformations was the way to go and in the long run offered more freedom.

I eventually found two very useful websites that helped me learn how to do this:

The mathematics of the 3D Rotation Matrix - An extensive article by Diana Gruber describing how to construct and maintain 3D rotation matrices to control transformations of the view and world.
Matrix and Quaternions FAQ - A summary article of various techniques and optimisations when dealing with 3D rotation matrices.

For most purposes, there are two types of camera: a look at camera, and a roaming camera.

Look at Camera

A look at camera is a camera that fixes its view on a single point. An example would be a camera that follows a ball as it moves around in a circle.

The first step in building this camera is to construct the 3D rotation matrix. For this purpose, some information is needed: namely, vectors describing the position and orientation of the camera and the world.

To get this information, the user has to provide coordinate information from which vectors can be constructed. These vectors take the following form:

Out vector - The vector representing the line of sight of the camera. It is a vector that points from the camera's location to some distant point in the direction the camera is facing. A "look at" vector.
Right Vector - This is the vector that is perpendicular to the out vector, originating at the camera's location. It forms the x-axis of the camera's coordinate system.
Up Vector - This vector is perpendicular to the plane formed by the out and right vectors. It indicates which direction is up relative to the camera.

With these vectors, we know the orientation of our camera in the world. Together, they can be used to construct the rotation matrix. The rotation matrix is initially constructed by taking the projection of these vectors onto the world coordinate system. The projection onto the world coordinate system merely gives us the end point of each vector when it is drawn from the origin.

The projected coordinates are then used to construct the rotation matrix (RM), with the coordinates of each vector forming one row. Row 1 in the RM is the right vector, row 2 is the up vector and row 3 is the out vector. 3D rotation matrices are 4x4 matrices as they make use of homogeneous coordinates. This allows translations and rotations to be stored in the same matrix, so a chain of transformations can be combined by multiplying matrices together.

The RM is filled in with the negated values of the Out vector. In OpenGL, instead of moving the camera, the world is moved, and by default the camera looks down the negative z-axis, so we negate each of the Out vector's coordinates.
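
Written out, the layout described above would look something like this. The symbols R, U and O for the Right, Up and Out vectors are my own shorthand rather than notation taken from the referenced articles:

```latex
\[
RM =
\begin{pmatrix}
 R_x &  R_y &  R_z & 0 \\
 U_x &  U_y &  U_z & 0 \\
-O_x & -O_y & -O_z & 0 \\
 0   &  0   &  0   & 1
\end{pmatrix}
\]
```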

So, we use the Out, Right and Up vectors to build the RM but how do we find these vectors?

Finding the out vector is simple. Because we have been given the camera position and look at point by the user, these coordinates define a look at vector with the camera as the origin. To construct the vector, we simply do lookAtPoint - cameraPosition to obtain our Out vector. Simple. We then normalize the vector as dealing with unit vectors simplifies calculations.
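
As a rough sketch of that step in C++: the Vec3 type and the function names here are made up for illustration (they are not part of OpenGL), and the code assumes the camera position and look at point are not the same point.

```cpp
#include <cmath>

// Minimal 3D vector type for illustration (hypothetical, not an OpenGL type).
struct Vec3 { float x, y, z; };

// Component-wise subtraction: a - b.
Vec3 subtract(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

// Scale v to unit length. Assumes v is not the zero vector.
Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Out = normalize(lookAtPoint - cameraPosition).
Vec3 computeOut(const Vec3& cameraPosition, const Vec3& lookAtPoint) {
    return normalize(subtract(lookAtPoint, cameraPosition));
}
```

For a camera at (0, 0, 5) looking at the origin, this gives an Out vector pointing down the negative z-axis.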

Calculating Right and Up is a little more complicated. Once we have our Out vector, we know what the camera is looking at but we have no idea as to what orientation the camera is in - there are an infinite number of rotations around the out vector that could define the camera's orientation.

Knowing the Up vector would fix the orientation but how to find it? The answer lies in defining a reference vector with which to fix the orientation of the camera. This reference vector is called the WorldUp vector and is typically defined as the unit y axis vector of the world coordinates (0, 1, 0).

The WorldUp vector lies in the same plane as the Out vector and the Up vector we are looking for. We can then utilise the cross product of Out and WorldUp to find our Right vector, as the cross product returns a vector that is orthogonal to both input vectors. Since we now have our Out and Right vectors, it is a simple matter of again using the cross product to find the Up vector for the rotated camera coordinates. After normalising, we have all the information we need to construct the RM for a look at camera. Note that this breaks down if the camera looks straight along WorldUp, as the first cross product is then the zero vector.
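
The two cross products can be sketched as below. The argument order (Out x WorldUp, then Right x Out) is my assumption; it is the order that gives a Right vector along +x when the camera looks down the negative z-axis. The Vec3 and CameraBasis types and the function names are hypothetical.

```cpp
#include <cmath>

// Minimal 3D vector type for illustration (hypothetical, not an OpenGL type).
struct Vec3 { float x, y, z; };

// Cross product a x b: a vector orthogonal to both a and b.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Scale v to unit length. Assumes v is not the zero vector.
Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// The three axes of the camera's coordinate system.
struct CameraBasis { Vec3 right, up, out; };

// Derive Right and Up from Out and a WorldUp reference vector.
// Assumes out is unit length and not parallel to worldUp.
CameraBasis buildBasis(const Vec3& out, const Vec3& worldUp) {
    const Vec3 right = normalize(cross(out, worldUp));
    const Vec3 up = normalize(cross(right, out));
    return {right, up, out};
}
```

With Out = (-1, 0, 0) and WorldUp = (0, 1, 0), this produces Right = (0, 0, -1) and Up = (0, 1, 0).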

One final thing remains to complete our look at camera and that is to perform translation. Translation is performed by creating a translation matrix whereby the amount of translation is contained in the final column of the 4x4 matrix. In the case of the camera, this takes the form of translating the world away from the origin (as in OpenGL, the camera stays at the origin and the world is moved). Since it is the world that moves rather than the camera, the world is translated by the negative of the camera's position.
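
For a camera positioned at C = (C_x, C_y, C_z), the translation matrix would then look something like this (the symbol C is my own shorthand for the camera position):

```latex
\[
T =
\begin{pmatrix}
1 & 0 & 0 & -C_x \\
0 & 1 & 0 & -C_y \\
0 & 0 & 1 & -C_z \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
```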

In order to obtain the final transformation matrix for the world/view, the RM and the translation matrix are multiplied together. OpenGL stores its matrices in column-major order, so compared with the row-by-row layout used here its rows and columns are swapped. For this reason, the matrix is transposed before being loaded into OpenGL using the glLoadMatrixf(float* matrix) function for use with drawing.
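
A minimal sketch of the multiply-then-transpose step follows. It assumes 4x4 matrices stored as row-major arrays and the multiplication order RM x T (translate the world first, then rotate it). The order is my assumption, as is every name in the code.

```cpp
#include <array>
#include <cstddef>

// 4x4 matrix stored row-major: element (row, col) lives at index row * 4 + col.
using Mat4 = std::array<float, 16>;

// Translation matrix with the offsets in the final column.
Mat4 translation(float tx, float ty, float tz) {
    return {1.0f, 0.0f, 0.0f, tx,
            0.0f, 1.0f, 0.0f, ty,
            0.0f, 0.0f, 1.0f, tz,
            0.0f, 0.0f, 0.0f, 1.0f};
}

// Standard matrix product a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 result{};  // zero-initialised
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            for (std::size_t k = 0; k < 4; ++k)
                result[row * 4 + col] += a[row * 4 + k] * b[k * 4 + col];
    return result;
}

// Swap rows and columns.
Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            t[col * 4 + row] = m[row * 4 + col];
    return t;
}

// Combine the rotation matrix with a translation by the negated camera
// position, then transpose the result into column-major order.
Mat4 viewMatrixForOpenGL(const Mat4& rm, float camX, float camY, float camZ) {
    const Mat4 t = translation(-camX, -camY, -camZ);
    return transpose(multiply(rm, t));
}
```

The resulting array could then be handed over with glLoadMatrixf(m.data()), since glLoadMatrixf reads 16 consecutive floats in column-major order.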

Roaming Camera

The roaming camera utilises Euler angles, which specify angles of rotation around the axes. We then construct rotation matrices that produce rotation around each axis. Each rotation is defined as follows.
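
The original figures are not reproduced here. As a stand-in, these are the standard right-handed rotation matrices for angles alpha, beta and gamma about the x, y and z axes, shown as 3x3 matrices that sit in the upper-left corner of the 4x4 RM. The sign convention in the referenced articles may differ from this one.

```latex
\[
R_x(\alpha) =
\begin{pmatrix}
1 & 0 & 0 \\
0 & \cos\alpha & -\sin\alpha \\
0 & \sin\alpha & \cos\alpha
\end{pmatrix},
\quad
R_y(\beta) =
\begin{pmatrix}
\cos\beta & 0 & \sin\beta \\
0 & 1 & 0 \\
-\sin\beta & 0 & \cos\beta
\end{pmatrix},
\quad
R_z(\gamma) =
\begin{pmatrix}
\cos\gamma & -\sin\gamma & 0 \\
\sin\gamma & \cos\gamma & 0 \\
0 & 0 & 1
\end{pmatrix}
\]
```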

The final RM that would rotate the world to simulate the movement of the camera is then given by multiplying these matrices together. This process can be optimised however, by noticing that some trigonometric calculations appear multiple times in the final RM. The final RM is evaluated as follows.
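
The original figure for the combined matrix is not reproduced here either. The sketch below shows the idea of evaluating each sine and cosine once and reusing it, assuming the standard right-handed rotation matrices and the multiplication order Rz * Ry * Rx. Both the order and the function name are my assumptions, and a different order gives a different matrix.

```cpp
#include <array>
#include <cmath>

// 3x3 rotation matrix stored row-major: element (row, col) at index row * 3 + col.
using Mat3 = std::array<float, 9>;

// Closed form of Rz(angleZ) * Ry(angleY) * Rx(angleX), angles in radians.
// Each trigonometric value is computed once and reused.
Mat3 eulerRotation(float angleX, float angleY, float angleZ) {
    const float cx = std::cos(angleX), sx = std::sin(angleX);
    const float cy = std::cos(angleY), sy = std::sin(angleY);
    const float cz = std::cos(angleZ), sz = std::sin(angleZ);
    return {cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx,
            sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx,
            -sy,     cy * sx,                cy * cx};
}
```

This replaces two full matrix multiplications with six trigonometric calls and a handful of multiplications.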

Once the RM has been constructed, the translation matrix is constructed in the same way as with the look at camera and the final transformation matrix obtained by multiplying the two and transposing the result.

Relative Rotation

One of the benefits of a roaming camera is being able to explore the world at will. In order to do this, relative rotation is required - that is, a rotation with respect to the view coordinate system and not the world system. Thankfully this is simple. Once the initial transformation matrix has been constructed, any further rotations applied are relative to the new coordinate system. Relative rotation is thus achieved by multiplying the RM by the new rotation matrix, multiplying the result by the current translation matrix and then loading the final transformation matrix into OpenGL after performing a transpose.

Relative Motion

Moving the camera relative to its coordinate system is slightly more complex. Multiplying a vector by a scalar is the geometric equivalent of moving along the vector by some amount. Using this, relative motion is accomplished by incrementing the amount of translation in the translation matrix by a vector. In order to move the camera forward in the direction it is facing, increase the amount of translation by a scalar multiplied by the Out vector.
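
The original figure is not reproduced here. A plausible reconstruction, with s as a hypothetical scalar step size and O as shorthand for the Out vector, would have the translation column of the new translation matrix read as follows. The sign of s needed to move forward depends on whether the matrix stores the camera position or its negative.

```latex
\[
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}
=
\begin{pmatrix} x + s\,O_x \\ y + s\,O_y \\ z + s\,O_z \end{pmatrix}
\]
```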

Where (x,y,z) is the original amount of translation. The RM is then multiplied by this new translation matrix and the resulting transformation matrix loaded into OpenGL after performing a transpose operation.

(*Images taken from http://www.fastgraph.com/makegames/3drotation/)

Thursday, 2 February 2012