Monday 30 April 2012

Adding Lighting: Normals

Building on the previous post, we can use that lighting to light a cube of our own making, rather than an object provided by GLUT. Here is the definition of our new cube:

glPushMatrix();
 glBegin(GL_TRIANGLES);

 /*      This is the bottom face*/
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);

 //top face
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 
 //right face
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, 1.0f);
 
 //front face
 glVertex3f(1.0f, -1.0f, 1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 
 //left face
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);
 
 //back face
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);

 glEnd();
 glPopMatrix();

The following pictures compare the result of drawing this cube with the result of using the cube provided by GLUT, each rendered under exactly the same lighting set-up from the previous post.


Our cube

GLUT cube

Both cubes are drawn in the exact same environment, using the exact same lighting, yet here we see completely different lighting results.

Surface Normals

The GLUT cube is lit correctly whilst ours is not. This is because the GLUT cube defines normals along with its vertices. A normal is a vector that is perpendicular to a plane; a surface normal is a normal that is perpendicular to the plane that a surface lies on. For example, each side of our cube is made up of two triangles, and each of these has its own surface normal. To illustrate this, here is our cube shown in Blender with its surface normals displayed.


The surface normals tell us the direction each surface is facing. OpenGL uses this information to calculate how much light from the light source the surface is receiving: it takes the direction of the light and the surface normal and calculates the angle between them using the dot product operation. The smaller the angle between the two vectors, the more light the surface receives. Because our cube provides no normal information, OpenGL has nothing to work with and so we get incorrect results.
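In essence (glossing over attenuation, ambient and specular terms), the diffuse light a vertex receives is proportional to the cosine given by the dot product of the unit light direction L and the unit surface normal N:

 I_{diffuse} \propto \max(\mathbf{N} \cdot \mathbf{L}, 0)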

So how do we go about calculating the surface normals? I covered this in a previous post, but briefly: a vector operation called the cross product is performed using the three vertices of each face, which returns a vector perpendicular to the surface. The following code will accomplish this.

Vector3 crossProduct(Vector3 v1, Vector3 v2){
 
 /* The cross product of two vectors is perpendicular to both of them. */
 Vector3 cross = {v1.y * v2.z - v1.z * v2.y,
     v1.z * v2.x - v1.x * v2.z,
     v1.x * v2.y - v1.y * v2.x};
 
 return cross;
 
}

Vector3 getSurfaceNormal(Vector3 v1, Vector3 v2, Vector3 v3){
 
 /*
  * Obtain two edge vectors of the triangle
  * from its three vertices.
  */
 Vector3 polyVector1 = {v2.x - v1.x, v2.y - v1.y, v2.z - v1.z};
 Vector3 polyVector2 = {v3.x - v1.x, v3.y - v1.y, v3.z - v1.z};
 
 Vector3 cross = crossProduct(polyVector1, polyVector2);
 
 /* normalize() scales the vector in place to unit length. */
 normalize(cross);
 
 return cross;
 
}

You will notice that after calculating this perpendicular vector, the normalize function is called on it before it is returned, which produces a unit vector. A unit vector is simply a vector whose magnitude is 1; the process of normalising a vector is described here. (If these concepts are confusing, reading up on the basics of vector mathematics is strongly recommended before continuing to pursue OpenGL.) When providing normals, we always provide unit vectors: OpenGL expects them, and they simplify the lighting calculations. Failing to do so will result in strange lighting!
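The Vector3 type and the normalize function are not shown in the snippet above, so here is a minimal sketch of what they might look like, assuming a plain struct of three floats and that normalize modifies its argument in place:

#include <cmath>

struct Vector3 {
 float x, y, z;
};

/*
 * Scale the vector to unit length. The vector is taken
 * by reference so the caller's copy is modified in place.
 */
void normalize(Vector3 &v){
 float magnitude = sqrtf(v.x * v.x + v.y * v.y + v.z * v.z);
 if(magnitude > 0.0f){
  v.x /= magnitude;
  v.y /= magnitude;
  v.z /= magnitude;
 }
}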

Because our cube is very simple, we can actually just mentally calculate the normals and write them directly into the code. The following example now attaches normals to our cube.

void drawCube(){

 glPushMatrix();
 glBegin(GL_TRIANGLES);

 /*      This is the bottom face*/
 glNormal3f(0.0f, -1.0f, 0.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);

 //top face
 glNormal3f(0.0f, 1.0f, 0.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 
 //right face
 glNormal3f(1.0f, 0.0f, 0.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, 1.0f);
 
 //front face
 glNormal3f(0.0f, 0.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, 1.0f);
 glVertex3f(1.0f, 1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 
 //left face
 glNormal3f(-1.0f, 0.0f, 0.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, 1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);
 
 //back face
 glNormal3f(0.0f, 0.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(1.0f, -1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);
 glVertex3f(1.0f, 1.0f, -1.0f);
 glVertex3f(-1.0f, -1.0f, -1.0f);
 glVertex3f(-1.0f, 1.0f, -1.0f);

 glEnd();
 glPopMatrix();

}

Normals are added by making a call to glNormal3f and giving the vector's components. OpenGL then uses this normal in the calculations for any following vertices declared with glVertex3f until either a new normal is declared or the drawing ends. Because OpenGL now knows the correct normals, we achieve the correct lighting, exactly the same as the GLUT cube.


Vertex Normals


Vertex normals are similar to surface (face) normals. Whereas a surface normal is, by definition, perpendicular to the polygon's surface, a vertex normal has no single geometric definition: it is usually taken as an average of the surface normals of the faces that share the vertex, and it is used to calculate more realistic, less blocky lighting.

One problem with using only surface normals is that the lighting for each surface is calculated from its single normal and then applied to every pixel of the surface. Each surface therefore has one uniform colour, the edges between surfaces appear as abrupt changes in shade, and the model looks blocky.


This is due to the way OpenGL calculates lighting: using the Gouraud shading model. The Gouraud method first requires that each vertex has its own normal. A vertex normal is calculated by averaging the surface normals of every surface that the vertex is a member of.


Once it has these normals, OpenGL uses the Blinn-Phong reflection model to calculate how much light is received at each vertex. Gouraud shading then interpolates these values along the edges of the polygon, weighting the lighting calculated at the vertices at either end of an edge by how far along the edge the current point is. Once the lighting values have been interpolated along the edges, the process is repeated for the pixels within the polygon, this time interpolating between the values just calculated on the edges, weighting each pixel by how close it is to each edge. Hopefully the following diagram showing the interpolation of interior points makes it a little clearer (I represents the light intensity value at a point).
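As a rough sketch of the interpolation (this follows the usual textbook formulation of Gouraud shading rather than anything OpenGL-specific), the intensity at a point 4 on the edge between vertices 1 and 2, and then at a point p on the scanline between edge points 4 and 5, is:

 I_4 = \frac{y_4 - y_2}{y_1 - y_2} I_1 + \frac{y_1 - y_4}{y_1 - y_2} I_2

 I_p = \frac{x_5 - x_p}{x_5 - x_4} I_4 + \frac{x_p - x_4}{x_5 - x_4} I_5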

This is the cause of the blocky lighting seen above. When we set only a surface normal for a polygon, every vertex normal of that polygon is the same as the surface normal. The lighting value at each vertex is therefore identical, and so the interpolated values across the entire surface are identical too, which means there are no smooth transitions between polygons (no smooth shading).

One way around this is to define, within our program, a normal for each vertex in the model. To do this we look at each face that uses the vertex, called its adjacent faces. The normal for the vertex is then calculated as the average of those faces' surface normals, as described above. This blends the lighting values of the neighbouring surfaces so that the edges between them no longer stand out.


Cube shown with face normals

   Cube shown with vertex normals

Hopefully it is clear from the pictures that at the edges of the cube, the vertex normals point outwards at a 45 degree angle. This is because each one is a combination of the face normal pointing upwards and the face normal pointing outwards.

In order to create the vertex normals, we run through each face of the model and, for each vertex making up that face, mark the face as adjacent to that vertex. After this initial pass, we run through each vertex and work out the average of the surface normals of its marked adjacent faces. Example code is presented below.

void WavefrontOBJModel::calculateNormals(){

 for(int i = 0; i < numberOfFaces; i++){

  WavefrontOBJFace * face = &modelFaces.at(i);
  
  int vertexIndex1 = face->getVertexIndices().at(0);
  int vertexIndex2 = face->getVertexIndices().at(1);
  int vertexIndex3 = face->getVertexIndices().at(2);
  
  WavefrontOBJVertex * vertex1 = &modelVertices.at(vertexIndex1);
  WavefrontOBJVertex * vertex2 = &modelVertices.at(vertexIndex2);
  WavefrontOBJVertex * vertex3 = &modelVertices.at(vertexIndex3);
  
  vertex1->addAdjacentFace(i);
  vertex2->addAdjacentFace(i);
  vertex3->addAdjacentFace(i);
  
  CVector3 v1 = *vertex1->getCoords();
  CVector3 v2 = *vertex2->getCoords();
  CVector3 v3 = *vertex3->getCoords();

  CVector3 cross1 = v2 - v1;
  CVector3 cross2 = v3 - v1;

  float normalX = cross1.y * cross2.z - cross1.z * cross2.y;
  float normalY = cross1.z * cross2.x - cross1.x * cross2.z;
  float normalZ = cross1.x * cross2.y - cross1.y * cross2.x;

  CVector3 normal (normalX, normalY, normalZ);

  normal.normalize();

  face->setSurfaceNormal(normal);

 }
 
 for(int i = 0; i < numberOfVertices; i++){
  
  WavefrontOBJVertex * v = &modelVertices.at(i);
  
  int numAdjacentFaces = v->getAdjacentFaces().size();
  
  float xNormalTotal = 0.0f;
  float yNormalTotal = 0.0f;
  float zNormalTotal = 0.0f;
  
  for(int j = 0; j < numAdjacentFaces; j++){
   
   int faceIndex = v->getAdjacentFaces().at(j);
   
   CVector3 faceNormal = *modelFaces.at(faceIndex).getSurfaceNormal();
   
   xNormalTotal += faceNormal.x;
   yNormalTotal += faceNormal.y;
   zNormalTotal += faceNormal.z;
   
  }
  
  CVector3 newVertexNormal (xNormalTotal, yNormalTotal, zNormalTotal);
  
  newVertexNormal.normalize();
  
  v->setNormal(newVertexNormal);
  
 }

}

Now that we have a normal for each vertex, we simply make a call to glNormal3f for every glVertex3f call that we make, rather than setting it once for each face.

Strange Lighting: A Caveat


We're nearly at our goal of achieving smooth shading for our models. There is however one problem: if we run our cube example with calculated vertex normals included this time, this is what we get:



What gives!? We went through all the trouble of calculating and defining our vertex normals and it's worse than it was before!

The reason we get this cool but incorrect bevelled effect at the edges is that Gouraud shading is intended for shading curved surfaces; it does not handle sharp edges well. Consider a light shining directly down on the top of the cube. The top face should be fully lit and the side faces dark, because the light direction is at 90 degrees to the side faces' normals. However, the vertex normals at the edges are averages of the top and side surface normals, so they tilt at 45 degrees, and the interpolation produces a smooth transition from light to dark across the faces - which is not correct here.

After racking my brains a while and asking a few questions on the interwebs, the solution is that because Gouraud only works well on smooth surfaces, we need to break the connection between vertices across a sharp edge - we need to stop them averaging surface normals that are separated by that edge. The easy way to do this is to simply duplicate your vertices at every sharp edge in the model. The faces on either side of a sharp edge then no longer share vertices, so they are not treated as adjacent faces and will not be averaged together when calculating vertex normals. I'm sure this can be done programmatically, however it is much easier to ask (read: demand) that your artist provide models with vertices duplicated at sharp edges. I'm no artist, so I can't say whether this is standard practice! Besides, it can be accomplished easily in Blender using the edge split modifier.

Improve Lighting: Add Polygons


Another common problem with lighting is that the models being lit simply don't have enough polygons. A cube with only twelve triangles will not look very good when put under the lights: there are few vertices, so the lighting values calculated at a particular vertex have to be interpolated across too large a distance, giving poor results. Of course, increasing the vertex count hurts performance, so balancing the two is a bit of an art. To illustrate the effect of having too few polygons, compare the lighting results of using a spotlight to light our cube with 12 polygons and with 46343 polygons (far too many, but it demonstrates the effect). Thankfully Blender makes it very easy to subdivide our object.



12 polygons

   46343 polygons


As we can see, the low-polygon cube is not lit at all. This is because the spotlight's cone doesn't reach even one of its 24 vertices!

Next time we'll improve the appearance of our cube by applying materials and textures. This also means that we'll be able to bring in the specular lighting that I so conveniently skipped over ;).


*Gouraud shading description and images inspired by the book 'Computer Graphics' by Hearn & Baker, Third international edition*

Sunday 29 April 2012

Adding Lighting: Basics

In order to add some basic lighting to the 3D model that I'm now able to load in, the lighting must first be initialised and then applied.

There are three main types of lights that can be simulated with OpenGL: directional, point and spotlight. We will talk about these and how to include them in the program later.

Initialising Lighting

Thankfully OpenGL does a lot of the lighting calculations for us. For every lit vertex in the scene, a lighting model known as the Blinn-Phong shading model is used to calculate how much light the vertex is receiving from the light sources in the scene, and from this the colour at that vertex is decided. A shading model called Gouraud shading is then used to interpolate the lighting across each face, meaning that every pixel across the face is coloured efficiently based on the colours of the nearest vertices and its distance from them. This is how our 2D faces are lit and coloured. Thankfully OpenGL does all of this for us, so we don't have to worry about it (not until shaders, anyway!).

So how do we utilise this functionality? A simple call to glEnable(GL_LIGHTING) will turn on lighting for us and the corresponding glDisable(GL_LIGHTING), will turn it off. Every vertex sent to the hardware between these calls will be lit using the method described above. That's it!

There are a number of other options that can alter the appearance of the lighting but for now, this will do nicely.

Creating Lights

So now that we've told OpenGL we want to use lighting, how do we actually include this in our scene? After all, a lit scene with no lights isn't much use at all...

OpenGL allows us the use of 8 lights, each with a symbolic name: GL_LIGHT0, GL_LIGHT1, ..., GL_LIGHT7. In order to use these lights, we have to call glEnable as before. By default all lights apart from light 0 are dark and their colours need to be set; light 0, however, has a bright white light (1.0, 1.0, 1.0, 1.0) by default. We could just enable light 0 and use that, however it's much more fun to customise it to our needs...

The colour of a light in OpenGL is mainly defined by three components:

  • Ambient: In real life, light rays bounce around and illuminate every object they collide with, gradually losing energy; this is why a room, for example, is filled with light. Modelling this efficiently in software proves to be a very hard problem, so to simplify matters OpenGL simply applies a small amount of light to everything in the scene, roughly simulating all the light rays bouncing around. This can be thought of as the "background lighting".


  • Diffuse: Many objects spread the light they receive in a uniform manner: when light hits the surface it is scattered in what appears to be every direction away from it, providing a nice, even light. To approximate this effect, if an object is within range of a light, the calculations used to colour the surface rely heavily on the diffuse property of the light. Diffuse lighting can be thought of as the "colour" of the light.


  • Specular: Some objects do not reflect light in a uniform manner; they reflect light mostly in a particular direction. This is called specular reflection and occurs for "shiny" objects such as metal or a mirror. Typically we see this as a small white highlight. For objects in our scene that are shiny (more on this when we get to materials and texturing), we need to set the colour of the light that will appear in this reflected highlight. This can be thought of as the colour of the "shininess" of the interaction between the light and the object.


Each of these colour properties is set using three values representing a mix of red, green and blue, plus a fourth alpha value which comes into play when blending is used. All properties of a particular light are set using the glLight[f/i] and glLight[f/i]v functions (f means floats are supplied, i means integers). A typical light setting is applied in the example below:

 float ambientColour[4] = {0.2f, 0.2f, 0.2f, 1.0f};
 float diffuseColour[4] = {1.0f, 1.0f, 1.0f, 1.0f};
 float specularColour[4] = {1.0f, 1.0f, 1.0f, 1.0f};
 
 glLightfv(GL_LIGHT0, GL_AMBIENT, ambientColour);
 glLightfv(GL_LIGHT0, GL_DIFFUSE, diffuseColour);
 glLightfv(GL_LIGHT0, GL_SPECULAR, specularColour);

The 'v' form of the glLight function means you are providing an array argument. This should be a float or integer array of size four, describing the RGBA values. This is enough to configure the colouring of our simple light!

Positioning the Light

The final step in creating this light is to specify where in the scene it is placed. This step also states the type of light that we want to use. In order to position the light, we call glLight again, this time with GL_POSITION. Again this takes an array of size four, with the first three floats representing the x, y and z coordinates of the light in the world. What, then, is the fourth float used for?

The fourth float is where we finally specify the type of light we want OpenGL to provide. As we said before, there are three types of light we can utilise: directional, point and spotlight.

A directional light is a light that originates infinitely far away. You define a vector giving the direction from which the light shines towards the world origin; all rays from the light can then be thought of as parallel to this vector, essentially a wall of light. This approximates a large, uniform light source such as the sun.

As an example, we can define a directional light source with the following coordinates.

float position[4] = {-5.0f, 0.0f, 0.0f, 0.0f};

glLightfv(GL_LIGHT0, GL_POSITION, position);

While the coordinates {-5, 0, 0} define a direction, it is easier to think of them as the position of the light, but with the vector extended to infinity. The following figure hopefully demonstrates the concept a little better. A sphere is located at the world origin {0, 0, 0} and another at {-10, 5, 0}. The light direction is defined as {-5, 0, 0}, denoted by the small red cube. The light then shines towards the origin parallel to this vector, from infinitely far away.



The second light type we can include in our scene is the point light. Rather than being a global light like the directional light, the point light is a local light; an example would be a light-bulb. Like the directional light, a point light is defined with an array of size four, however this time the final member is 1 instead of 0. The other difference is that rather than defining a direction, the first three members define the point in the world where the light is placed. Think of it as positioning any other object in your scene.

float position[4] = {-5.0f, 0.0f, 0.0f, 1.0f};

glLightfv(GL_LIGHT0, GL_POSITION, position);

In this example, we can see that the lighting effect is different. Rather than emanating from an infinite point towards the world origin, the light emanates from the position of the light outwards in all directions.


The final type of light is the spotlight. The spotlight is a specialised form of the point light in which the light is limited to a cone. We see this type of light in lamps and torches (flash-lights). Rather than modifying the final member of the position array, we leave it as 1 and instead call some additional glLight functions.

We will call three additional functions: one specifying the direction in which the spotlight points, one specifying the angle of the cone from which light will come, and a final one setting whether the light is concentrated in the middle of the beam or spread uniformly across it.

 float spotDirection[4] = {1.0f, 0.0f, 0.0f, 1.0f};
 float spotArc = 45.0f;
 float spotCoefficient = 0.0f; // Ranges from 0 to 128.
 
 glLightfv(GL_LIGHT0, GL_SPOT_DIRECTION, spotDirection);
 glLightf(GL_LIGHT0, GL_SPOT_CUTOFF, spotArc);
 glLightf(GL_LIGHT0, GL_SPOT_EXPONENT, spotCoefficient);

Here we have a spotlight that emits light in the direction of the vector {1.0f, 0.0f, 0.0f}. We also define the cut-off angle to be 45 degrees; this corresponds to the angle between the centre line of the cone and its edge, so we have a cone of light totalling 90 degrees. The final function sets a spot exponent of 0. This value ranges from 0 to 128, with 0 giving a uniform distribution of light and 128 focusing most of the light in the centre of the cone. By adding this spotlight to the previous example, we get the following result.


As we can see, the other sphere is not lit, as it lies outside the spotlight's cone.

Organising the Code

With these elements we can accomplish simple lighting. We can also animate the position values, or use OpenGL's transformation functions (glTranslate, glRotate), to move the lights and create interesting lighting within our scene. Some caution has to be taken, however, as to when these calls are made.

Some of the calls that set up the light can be made once at the beginning of the program and left alone until you need to change them. These include enabling the light; setting the ambient, diffuse and specular colours; setting the spotlight cut-off angle and exponent if a spotlight is used; and, provided lighting is to be used for every element in the scene, enabling lighting itself. We can bundle this up into an initialise-lighting function, called once at the start of the program.

Two calls must be made every time the scene is rendered, however: the call to glLight to position the light and, if a spotlight is used, the call to glLight to set its direction. This is because these calls transform the light by the current modelview matrix, so the light will end up at different points in the scene depending on whether you make them before setting up your modelview matrix (e.g. using gluLookAt()) or after. (This is the difference between the light being positioned in world coordinates or in eye coordinates. Read this if it is not clear; I may write about the coordinate systems myself sometime.) For most purposes you will want to make these calls after setting up the modelview matrix, so the light stays in the same place in the world rather than following the camera around.

I have provided some example code of a program that introduces simple lighting into a scene.

#include <GL/gl.h>
#include <GL/glut.h>

float playerX = 0.0f;
float playerY = 0.0f;
float playerZ = 15.0f;

/*
* Initialise the data used
* for creating our light.
*/

float ambient[4] = {0.2f, 0.2f, 0.2f, 1.0f};
float diffuse[4] = {1.0f, 1.0f, 1.0f, 1.0f};
float specular[4] = {1.0f, 1.0f, 1.0f, 1.0f};

float position[4] = {-5.0f, 0.0f, 0.0f, 1.0f};

float spotDirection[4] = {1.0f, 0.0f, 0.0f, 1.0f};
float spotArc = 45.0f;
float spotCoefficient = 0.0f; // Ranges from 0 to 128.

void positionLight(){
 
 /*
  * Tell OpenGL where we want our light
  * placed and since we're creating a spotlight,
  * we need to set the direction from which
  * light emanates.
  */
 glLightfv(GL_LIGHT0, GL_POSITION, position);
 glLightfv(GL_LIGHT0, GL_SPOT_DIRECTION, spotDirection);
 
}

void display(){

 glClearColor(0.0f, 0.0f, 0.0f, 1.0f);

 glLoadIdentity();

 glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

 gluLookAt(playerX, playerY, playerZ, 0.0f, 0.0f, 0.0f, 0.0f, 1.0f, 0.0f);
 
 /*
  * Tell OpenGL we want all the following
  * objects in our scene to have lighting
  * applied to them.
  */
 glEnable(GL_LIGHTING);
 
 /*
  * Position the lights AFTER the model View matrix
  * has been set up.
  */
 positionLight();
 
 glutSolidSphere(1.0f, 50, 50);
 
 glPushMatrix();
 
 glTranslatef(-7.0f, 5.0f, 0.0f);
 
 glutSolidSphere(1.0f, 50, 50);
 
 glPopMatrix();
 
 /*
  * We don't need the lighting anymore
  * so disable it.
  */
 glDisable(GL_LIGHTING);
 
 glPushMatrix();
 
 glColor4f(1.0f, 0.0f, 0.0f, 1.0f);
 
 /*
  * create a small red cube where the light
  * is.
  */
 glTranslatef(position[0], position[1], position[2]);
 
 glutSolidCube(0.2f);
 
 glPopMatrix();

 glutSwapBuffers();

}

void reshape(int x, int y){

 glMatrixMode(GL_PROJECTION);

 glViewport(0, 0, x, y);

 glLoadIdentity();

 gluPerspective(60.0, (GLdouble)x / (GLdouble)y, 1.0, 100.0);

 glMatrixMode(GL_MODELVIEW);

}

void initLighting(){
 
 /*
  * Tell OpenGL we want to use
  * the first light, GL_LIGHT0.
  */
 glEnable(GL_LIGHT0);
 
 /*
  * Set the ambient, diffuse and specular
  * colour properties for LIGHT0.
  */
 glLightfv(GL_LIGHT0, GL_AMBIENT, ambient);
 glLightfv(GL_LIGHT0, GL_DIFFUSE, diffuse);
 glLightfv(GL_LIGHT0, GL_SPECULAR, specular);
 
 /*
  * We're going to make GL_LIGHT0 a spotlight.
  * Set the angle of the cone of light and
  * how uniform the dispersion of light is.
  */
 glLightf(GL_LIGHT0, GL_SPOT_CUTOFF, spotArc);
 glLightf(GL_LIGHT0, GL_SPOT_EXPONENT, spotCoefficient);
 
}

int main(int argc, char **argv){

 glutInit(&argc, argv);
 glutInitDisplayMode(GLUT_DOUBLE | GLUT_DEPTH | GLUT_RGBA);
 
 glutInitWindowSize(500, 500);

 glutCreateWindow("Adding Lighting");

 glutDisplayFunc(display);
 glutReshapeFunc(reshape);

 glEnable(GL_DEPTH_TEST);
 
 /*
  * setup the lighting once
  * in our program.
  */
 initLighting();

 glutMainLoop();

 return 0;

}


You may note that using this example to light your own models will not work; this is why I have used GLUT to draw some spheres for us. The reason is that no normals or materials have been set for your objects, whereas the GLUT routines include them. More on normals and materials and their importance in lighting a scene in the next few posts.

Tuesday 27 March 2012

Creating and Loading .OBJ Models

In order to have the ability to play around with OpenGL in a virtual environment other than a handful of randomly scattered polygons, I thought I better look into a bit of simple 3D model making.

A quick scour of the internet revealed that 3DS Max seems to be somewhat of an industry standard, however that requires a substantial investment so I looked for alternatives. Blender (http://www.blender.org/) seemed to be an open source favourite, matching much of what 3DS Max does so I decided to give it a go and download it.

The tutorials on the website are pretty good and get you going quickly. Through the use of predefined meshes and the "snap" tool, I was able to knock up a curious-looking little room fairly quickly, defining vertex normals along the way. A particularly useful feature was being able to "triangulate" all the polygons in the model, turning each of them into a set of triangles.


Once the model had been made, I chose to export it as a .OBJ file. The .OBJ format, defined by Wavefront, describes the model in an ASCII text file. This has the downside of large file sizes, but it is popular because plain text is much simpler to read and load than a binary format such as .3DS. For this reason, I thought it would be a good place to start learning how to load and handle 3D models from files before moving on to more complex formats later.

Exporting a model to .OBJ format in Blender affords the option of including a lot of data describing the model in the output file. Since I'm just starting out, I've decided to only include vertex information and to exclude all the other information such as material and normal data which I will gradually introduce into my programs in future posts.


I quickly discovered that the model information written to the .OBJ file isn't particularly program friendly. From various sources I learnt that a lot of the information is optional and quite often completely absent from the file. The information is given one line at a time, with a short code at the start of each line indicating the kind of data that follows. For instance, a "v" at the beginning of a line indicates vertex data and is followed by three float values (x, y, z) separated by spaces. Face information is presented in a similar fashion: prefixed by "f" and listing the indices of the vertices that make up the face, where 1 is the first vertex defined in the file and n is the last vertex of a model with n vertices in total.


With this file, one can read in the model information and reconstruct it in the OpenGL program. To do this I created a few classes to store the model information, so that the model can be passed around and accessed within the program as a single object. The method of reading in the file was straightforward.

In terms of 3D modelling, the file was composed of the following elements:

  • Vertices: prefixed by 'v' and containing 3 float values: the x, y and z coordinates of the vertex in the model relative to the model origin in Blender.
  • Faces: A face refers to one polygon that makes up the model. Prefixed by 'f' and containing 3 integer values: the indices of the vertices that make up the face. The vertex index represents the nth vertex defined in the file, starting at 1, not 0.
  • Object: An object is a smaller construction within the model: a model is made up of objects and an object is made up of faces. Objects are not explicitly represented within the file (an object name line can be added, but it is only metadata), and a large model can be declared as just a single object; however, objects provide a logical structure to the model. They are represented by a block of vertices followed by their faces, e.g. the snippet of the .OBJ file above contains information for 2 objects.

After reading a line from the file, the first character is checked and the corresponding method to handle that data type (vertex or face) is called. Since the faces of an object are defined after the vertex definitions for the vertices that make up those faces, an object is complete once its final face declaration has been read and a new vertex definition begins. So, before a vertex is read and processed, a check is made to determine whether a face was just read and we have therefore reached the end of an object; if so, the information read so far is stored in a modelObject instance, which is in turn stored in the model object. Continuing this process, the whole file is read and we obtain a model object made up of modelObject objects, which are made up of faces, which in turn store the vertices.
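As an illustration, here is a heavily simplified sketch of this kind of loader for the vertex-and-face-only files described above. The struct and function names are invented for the example rather than taken from my actual classes, it assumes faces are plain "f i j k" triangles (which is what the Blender export above produces), and it ignores objects and every optional field:

#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct ObjVertex { float x, y, z; };
struct ObjFace { int v1, v2, v3; }; // 1-based vertex indices, as stored in the file

struct ObjModel {
 std::vector<ObjVertex> vertices;
 std::vector<ObjFace> faces;
};

bool loadObj(const std::string &path, ObjModel &model){

 std::ifstream file(path.c_str());
 if(!file.is_open()) return false;

 std::string line;
 while(std::getline(file, line)){

  std::istringstream stream(line);
  std::string prefix;
  stream >> prefix;

  if(prefix == "v"){
   // "v x y z" - a vertex position
   ObjVertex v;
   stream >> v.x >> v.y >> v.z;
   model.vertices.push_back(v);
  }
  else if(prefix == "f"){
   // "f i j k" - a triangulated face, indices starting at 1
   ObjFace f;
   stream >> f.v1 >> f.v2 >> f.v3;
   model.faces.push_back(f);
  }
  // everything else ("o", "s", comments, ...) is ignored in this sketch
 }

 return true;
}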

Displaying the model is then simply a case of accessing the model object during frame drawing, looping through its objects and faces, and drawing each face as a triangle.
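Using the structures from the sketch above, the drawing code reduces to something like this (remembering that OBJ face indices start at 1):

#include <GL/gl.h>

void drawObjModel(const ObjModel &model){

 glBegin(GL_TRIANGLES);

 for(unsigned int i = 0; i < model.faces.size(); i++){

  const ObjFace &f = model.faces[i];

  /* Convert the 1-based OBJ indices to 0-based array indices. */
  const ObjVertex &v1 = model.vertices[f.v1 - 1];
  const ObjVertex &v2 = model.vertices[f.v2 - 1];
  const ObjVertex &v3 = model.vertices[f.v3 - 1];

  glVertex3f(v1.x, v1.y, v1.z);
  glVertex3f(v2.x, v2.y, v2.z);
  glVertex3f(v3.x, v3.y, v3.z);
 }

 glEnd();

}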

Next I plan to include vertex normals to the model and describe how to use them to introduce lighting to a scene.

As an aside, I also learnt some interesting C++ lessons from mistakes that I made. My C++ isn't good, and my endeavour to learn OpenGL is also an endeavour to improve my C++. The first thing I learnt is that declaring a new object within a code block without using the new operator creates the object on the function's stack, not on the heap.


This means that as soon as the end of the function is reached, the entire stack frame for the function is destroyed, including the new object - it no longer exists. If you were planning to use it later, the information is lost and garbage will be read instead. For an object to persist for use in other parts of the program (e.g. via pointers), it should be allocated on the heap using the new operator. The delete operator should then be used when the object is no longer required, otherwise a memory leak is introduced.
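A tiny illustration of the difference (ModelObject here is just a stand-in name for the example):

struct ModelObject { int id; }; // stand-in class for the example

ModelObject * makeOnStack(){
 ModelObject object; // created on this function's stack
 return &object;     // bug: object is destroyed when the function returns, so the pointer dangles
}

ModelObject * makeOnHeap(){
 ModelObject * object = new ModelObject(); // created on the heap
 return object;      // valid until someone calls delete on it
}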

This also led on to the discovery of the Rule of Three, a C++ coding principle. It states that if, within a class, you declare one of the following:
  • Destructor
  • Copy Constructor
  • Assignment Operator
then the other two should be implemented as well. This is particularly important when a destructor is declared, because declaring a destructor usually means the class holds non-primitive data that needs to be handled correctly (e.g. released with the delete operator). C++ generates a default copy constructor and assignment operator that only perform a shallow copy, which for most such classes is insufficient. To avoid memory-related issues, a copy constructor and assignment operator should be implemented that correctly copy the dynamically allocated memory.
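A bare-bones sketch of the Rule of Three for a class that owns dynamically allocated memory (again, an invented class rather than one from the loader):

class VertexBuffer {

public:
 VertexBuffer(int size) : size(size), data(new float[size]) {}

 // Destructor: we own dynamic memory, so the other two are needed as well.
 ~VertexBuffer(){ delete[] data; }

 // Copy constructor: perform a deep copy instead of copying the pointer.
 VertexBuffer(const VertexBuffer &other) : size(other.size), data(new float[other.size]){
  for(int i = 0; i < size; i++) data[i] = other.data[i];
 }

 // Assignment operator: free the old buffer, then deep copy.
 VertexBuffer & operator=(const VertexBuffer &other){
  if(this != &other){
   delete[] data;
   size = other.size;
   data = new float[size];
   for(int i = 0; i < size; i++) data[i] = other.data[i];
  }
  return *this;
 }

private:
 int size;
 float * data;
};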

Within the OBJ loader I made this mistake, and it caused object data that I had moved into the model object to become garbage once the temporary objects I had used went out of scope. By implementing the copy constructor and assignment operator in the necessary classes, the data was copied properly and the integrity of the model data was maintained.



Monday 12 March 2012

Sphere Intersecting a Polygon

It had become clear that being able to detect the collision of a sphere with a polygon is an important concept in 3D graphics. Many collisions are optimised by using a sphere or some other simple volume to represent one of the colliding objects, which simplifies the calculations greatly. Now to learn how!

Firstly, as a simple test case for this concept, I used a simple triangle and a wireframe sphere drawn using GLUT. The wireframe makes it easier to see any intersections.

The previous posts documented methods for checking for different forms of collision. These techniques can be used for the sphere intersection and modified if necessary. In order to check for the sphere intersecting with the polygon, three checks were necessary:

  1. Check if the sphere lies completely outside the plane formed by the polygon
  2. Check whether the projection of the centre of the sphere onto the plane formed by the polygon lies within bounds of the polygon itself.
  3. If the projection of the centre of the sphere does not lie inside the polygon, check whether the sphere intersects any of the edges of the polygon.

Performing these checks is enough to determine if the sphere intersects with the triangle.

1. Check if the sphere lies outside the plane formed by the polygon


If the sphere has a radius of 1 metre and its centre is 100 metres away from the plane formed by the polygon, it is obvious that no collision can occur. To determine whether this is the case, we use a technique I described in an earlier post: first we calculate the distance of the centre of the sphere from the plane using the plane equation.

We then make one minor change to see if the sphere is intersecting the plane: if the distance of the sphere's centre from the plane is less than the sphere's radius, then we have an intersection. Because the sphere can be either in front of or behind the plane, we take the absolute value of the distance and compare it with the sphere's radius. If it is smaller, we continue on to step 2; otherwise there is no collision and we can return false.
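A sketch of this first check, assuming the plane's unit-length normal and its D value have already been computed from the polygon as described in the earlier posts, and assuming a plain Vector3 struct of three floats:

#include <cmath>

/*
 * Signed distance of a point from the plane Ax + By + Cz + D = 0,
 * where (A, B, C) is the plane's unit-length normal.
 */
float distanceFromPlane(Vector3 point, Vector3 normal, float D){
 return normal.x * point.x + normal.y * point.y + normal.z * point.z + D;
}

/*
 * Step 1: the sphere can only touch the polygon if its centre lies
 * within one radius of the polygon's plane. The signed distance is
 * returned through distanceOut because it is reused in step 2.
 */
bool sphereTouchesPlane(Vector3 centre, float radius, Vector3 normal, float D, float &distanceOut){
 distanceOut = distanceFromPlane(centre, normal, D);
 return fabsf(distanceOut) < radius;
}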

2. Projection of Sphere lies inside polygon

If the sphere does intersect the plane, we need to perform more checks. We first need to find whether the point at which the sphere intersects the plane is within the bounds of the polygon or not. How do we determine that point? If the sphere is intersecting the plane, it actually touches infinitely many points on it, so instead we project the sphere's centre onto the plane and test whether that projected point lies within the polygon.

To do this, we take the distance of the sphere's centre from the plane, which we calculated in step 1, and multiply it by the normal of the plane. Remember that, geometrically speaking, multiplying a vector by a scalar moves us along the vector by a distance determined by the scalar. Since the normal has unit length, scaling it by the distance of the sphere's centre from the plane gives us a vector representing the offset of the sphere's centre from the plane.

Because this offset points from the plane to the centre of the sphere, subtracting it from the sphere's centre gives us the point on the plane directly beneath the centre. This is the projection of the sphere's centre onto the plane and serves as our intersection point.


This technique is illustrated above. We know from test 1 that the sphere is intersecting the plane, so we want to find whether the projection of its centre lies inside the bounds of the polygon. The normal of the plane N is multiplied by the distance d of the sphere's centre from the plane to obtain the offset vector N * d. Subtracting this vector from the sphere's centre takes us back in the opposite direction, giving the intersection point with the plane, i.
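A sketch of that projection, reusing the signed distance from step 1 (and again assuming a unit-length normal):

/*
 * Step 2 (first half): project the sphere's centre onto the plane by
 * moving back along the normal by the centre's signed distance from it.
 */
Vector3 projectCentreOntoPlane(Vector3 centre, Vector3 normal, float signedDistance){
 Vector3 projected = {centre.x - normal.x * signedDistance,
     centre.y - normal.y * signedDistance,
     centre.z - normal.z * signedDistance};
 return projected;
}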

Now that we know the sphere centre's intersection point with the plane, we need to know whether that point is inside or outside the polygon. If it is outside, this test alone cannot rule out a collision, so we must proceed to step 3. To determine whether the point is inside the polygon, we use the same trick as before: construct vectors between the intersection point and the polygon's vertices and sum the interior angles. If the sum equals 360 degrees, the point is inside the polygon and we can return true for the collision; otherwise it is outside. I discussed this technique in more detail in the previous post.


3. Detecting Sphere-Edge Collisions

So, we know that the sphere intersects the plane but that the projection of the centre of the sphere onto the plane lies outside the polygon. Surely there is no collision? Well actually, there remains one case where the sphere could still be colliding with the polygon. Take a look at the figure below.

Here we can see that the sphere intersects the polygon, and yet our previous test would have found the projection of the sphere's centre onto the plane to be outside the polygon and would have concluded that no collision occurred. For this reason, we need a final test that checks for collisions with the edges of the polygon.

In order to do this, we construct vectors between each pair of neighbouring vertices, starting with the bottom-left vertex and proceeding in a clockwise fashion. For each edge we find the point on it that is closest to the sphere's centre and check the distance between them. If the distance is less than the sphere's radius, the sphere must be intersecting the polygon and we can return a true collision.

How do we find the closest point on the edge vector to the sphere's centre? The following technique does just that.

Firstly, we construct the edge vector by subtracting the current vertex from the next vertex in the polygon. This gives us a vector representing an edge of the polygon; we normalize it because we only care about its direction, not its magnitude, for reasons I'll explain shortly. We then construct a vector from the current vertex to the sphere's centre by subtracting the current vertex's coordinates from the sphere's centre's coordinates. This is what we now have:



The distance along the edge vector E of the point closest to the sphere's centre (c) is found by performing a dot product of the two vectors we just created. This is cd. Why is this? Well, remember that geometrically the dot product means this:

 \mathbf{a} \cdot \mathbf{b}=\left\|\mathbf{a}\right\| \, \left\|\mathbf{b}\right\| \cos \theta \,

That is, the dot product is equal to the product of the magnitudes of the two vectors multiplied by the cosine of the angle between them. Performing the dot product between E and P gives us the following:


However, remember that we normalized the edge vector, meaning that its magnitude is 1, so it drops out of the equation, leaving:


Remembering basic trigonometry, and observing that the vector P and the distance to the closest point on the edge form a right-angled triangle, we can see that the dot product gives us cd: the hypotenuse of the triangle multiplied by the cosine of the angle gives us the adjacent side, i.e. cd.

Now that we have the distance along the edge vector to the closest point, we can find the coordinates of the closest point on the edge by taking the coordinates of the current vertex and adding the edge vector multiplied by that distance.


Now that we know the closest point on the edge to the sphere's centre, we simply use the Euclidean distance formula to find the distance between the two points (d in the diagram).


We then compare this with the radius of the sphere; if it is smaller, we must be colliding. If the distance is greater, there is no collision with this edge and we move on to check the next edge of the polygon.
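A sketch of this third check for a triangle (or any convex polygon stored as an array of vertices), assuming the same Vector3 struct and in-place normalize helper as before; as in the description above, it does not clamp the closest point to the ends of the edge:

#include <cmath>

float distanceBetween(Vector3 a, Vector3 b){
 float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
 return sqrtf(dx * dx + dy * dy + dz * dz);
}

/* Step 3: does the sphere touch any edge of the polygon? */
bool sphereTouchesEdges(const Vector3 vertices[], int numVertices, Vector3 centre, float radius){

 for(int i = 0; i < numVertices; i++){

  Vector3 current = vertices[i];
  Vector3 next = vertices[(i + 1) % numVertices];

  /* Normalized edge vector E and the vector P from the current vertex to the sphere's centre. */
  Vector3 edge = {next.x - current.x, next.y - current.y, next.z - current.z};
  normalize(edge);
  Vector3 toCentre = {centre.x - current.x, centre.y - current.y, centre.z - current.z};

  /* cd: distance along the edge of the point closest to the sphere's centre (the dot product E . P). */
  float cd = edge.x * toCentre.x + edge.y * toCentre.y + edge.z * toCentre.z;

  /* Closest point on the edge's line to the sphere's centre. */
  Vector3 closest = {current.x + edge.x * cd,
      current.y + edge.y * cd,
      current.z + edge.z * cd};

  if(distanceBetween(closest, centre) < radius){
   return true;
  }
 }

 return false;
}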

By performing these 3 checks, we are able to fully determine if a sphere is intersecting with an arbitrary convex polygon.

Thursday 23 February 2012

Line Intersecting A Polygon

Being able to detect when a line is intersecting a plane will be useful, but I thought next I would have a go at detecting a more specific case: whether or not a line is intersecting a polygon.

For this, I used the same set-up as before: a simple 2D triangle and a line that changes colour based on whether it is intersecting the triangle. Because a line can only intersect a polygon if it intersects the plane that the polygon lies on, the code from the previous program was reused: first the plane-intersection test is performed. If that test fails, the line cannot be intersecting the polygon, so we are done. If the line does intersect the plane, we need to do more work to determine whether it intersects the polygon itself.

In order to do this, we first need to determine the point on the plane that the line is intersecting. For this, we need to utilise a vector operation known as the dot product. For arbitrary vectors a and b, the dot product is defined as the following:

\mathbf{a}\cdot \mathbf{b} = \sum_{i=1}^n a_ib_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n

That is, the corresponding components of the two vectors are multiplied together and the results summed, returning a single value. Given two vectors A and B that represent points in 3D space, the dot product is therefore:

\mathbf{A} \cdot \mathbf{B} = A_x B_x + A_y B_y + A_z B_z
Geometrically speaking, the dot product represents an operation on the length of the vectors and the angle between them. A geometric interpretation of the dot product is as follows:

 \mathbf{a} \cdot \mathbf{b}=\left\|\mathbf{a}\right\| \, \left\|\mathbf{b}\right\| \cos \theta \,

That is, performing the dot product gives us the magnitudes of the two vectors multiplied together and then multiplied again by the cosine of the angle between them. This proves to be a very useful equation for probing the geometric relationship between two vectors. For now, observe that if a and b are both unit vectors, the product of their magnitudes is simply one, meaning that the dot product of two unit vectors gives you exactly the cosine of the angle between them.

So how can we use this information to find the point at which the line intersects the plane? Firstly we need a representation of the line to work with, so we obtain the line's direction vector by subtracting one endpoint of the line from the other (and normalizing it). We then need to find the point at which this vector crosses the plane that the polygon lies on. The first step is to choose one of the line's endpoints and find the closest distance from it to the plane, which is obtained by substituting the endpoint's coordinates into the plane equation. We negate this distance because we want to move back along the vector towards the plane. This distance is also the magnitude of a vector that runs perpendicularly from the plane to the line's endpoint; since the plane's normal is also perpendicular to the plane, that vector and the normal share the same orientation. Because of this, the dot product of the plane's normal and the line's direction vector gives us the angle between the line and the perpendicular from the plane to the endpoint. After these two operations we have an angle and the length of one side of the triangle formed between the plane and the line. This is demonstrated below.



Here we can see that the line vector L and the plane normal N (in the reverse direction, because we negated the distance) form a triangle. The plane equation and the dot product have given us the length of one side (-d) and the angle between two of its sides (θ). What we want to find is the length of the side of the triangle running from the line endpoint L1 to the intersection point i. This comes down to basic trigonometry: since we want the length of the hypotenuse and we know the length of the side adjacent to the angle, the hypotenuse can be found by dividing the adjacent side by the cosine of the angle:

\left\| \mathbf{i} - \mathbf{L_1} \right\| = \frac{-d}{\cos \theta}
This distance tells us how far along the line vector we need to move until we reach the point that intersects the plane. Geometrically, multiplying a vector by a scalar means moving along the vector; so, to obtain the intersection point, we take our endpoint L1 and add the (normalized) line vector multiplied by the distance from L1 to i. This effectively means we start at L1 and move along the line until we reach i.
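A sketch of finding this intersection point, assuming a unit-length plane normal, the plane's D value, and the same plain Vector3 struct and in-place normalize helper as elsewhere in these posts:

/*
 * Point at which the line through l1 and l2 crosses the plane with the
 * given unit normal and D value. The sign test described below guarantees
 * the line is not parallel to the plane, so cosAngle is never zero here.
 */
Vector3 linePlaneIntersection(Vector3 l1, Vector3 l2, Vector3 normal, float D){

 /* Normalized direction of the line. */
 Vector3 direction = {l2.x - l1.x, l2.y - l1.y, l2.z - l1.z};
 normalize(direction);

 /* Signed distance of the first endpoint from the plane, negated so we move back towards the plane. */
 float distance = -(normal.x * l1.x + normal.y * l1.y + normal.z * l1.z + D);

 /* Cosine of the angle between the line and the plane normal. */
 float cosAngle = normal.x * direction.x + normal.y * direction.y + normal.z * direction.z;

 /* Length of the hypotenuse: how far along the line we travel to reach the plane. */
 float length = distance / cosAngle;

 Vector3 intersection = {l1.x + direction.x * length,
     l1.y + direction.y * length,
     l1.z + direction.z * length};

 return intersection;
}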


Now that we have the coordinates of the point where the line intersects the plane, we have one final step: to determine whether that point lies inside the polygon. To determine the "insidedness" of the point, a simple trick can be performed. We visualise drawing lines from the intersection point to each of the polygon's vertices; each pair of neighbouring lines forms a triangle. This is illustrated in the figure below.



The key to this solution is realising that the angles around the intersection point formed by these triangles should add up to 360 degrees. If they do not, the point is not inside the polygon and we can return false; otherwise the point is inside the polygon and we can return true. How do we construct the triangles and find the sum of their angles? The first step is to take two neighbouring vertices and create two vectors originating from the intersection point, by taking each vertex's coordinates and subtracting the coordinates of the intersection point. After normalising, our trusty dot product can then be used to find the angle between the two vectors, and thus one of the interior angles around the intersection point. Repeating this for each pair of polygon vertices and totalling the angles found means we can determine whether the intersection point lies inside the polygon or not.
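A sketch of this angle-sum test, again assuming the plain Vector3 struct and in-place normalize helper; a small tolerance is used because the floating point total will rarely be exactly 360 degrees:

#include <cmath>

/* Is the point inside the convex polygon? (angle-sum test) */
bool pointInsidePolygon(const Vector3 vertices[], int numVertices, Vector3 point){

 float totalAngle = 0.0f;

 for(int i = 0; i < numVertices; i++){

  /* Vectors from the intersection point to a neighbouring pair of vertices. */
  Vector3 next = vertices[(i + 1) % numVertices];
  Vector3 a = {vertices[i].x - point.x, vertices[i].y - point.y, vertices[i].z - point.z};
  Vector3 b = {next.x - point.x, next.y - point.y, next.z - point.z};

  normalize(a);
  normalize(b);

  /* The dot product of two unit vectors is the cosine of the angle between them. */
  float cosAngle = a.x * b.x + a.y * b.y + a.z * b.z;
  if(cosAngle > 1.0f) cosAngle = 1.0f;
  if(cosAngle < -1.0f) cosAngle = -1.0f;

  totalAngle += acosf(cosAngle);
 }

 /* Inside if the interior angles sum to (roughly) 360 degrees, i.e. 2 * pi radians. */
 return totalAngle > (2.0f * 3.1415927f) - 0.01f;
}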

Images: Wikipedia
Images and 3D "insidedness" algorithm: http://paulbourke.net/geometry/insidepoly/

Wednesday 8 February 2012

Line Intersecting a Plane

One of the next tasks that I set myself was to write a little demo whereby the intersection of a line with a plane is detected and responded to.

In order to test this, I first used OpenGL to draw a line in space as well as a triangle that could move along the line with the use of the keyboard.


All points on the triangle are coplanar with the plane that is being tested to see whether the line intersects. The goal is to have the colour of the line turn to red once the line is intersecting the plane that the triangle lies on.

After scouring the web, this proved to be a fairly simple process, but one that requires a few mathematical calculations. The key to solving the problem is the general equation of a plane:

Ax + By + Cz + D = 0

where (A, B, C) is the normal vector of the plane, (x, y, z) is any point on the plane and D is the distance of the plane from the origin. This equation evaluates to 0 for any point that lies on the plane. A point not on the plane yields a value representing the nearest distance from the point to the plane: if the value is positive, the point lies in front of the plane (in the direction the plane normal points), and if it is negative, the point lies behind it.

In order to know whether the line is intersecting the plane, we take the coordinates of the endpoints of the line and plug them into the equation of the plane. By inspecting the distance of each point from the plane, we can determine whether the line intersects it. To use the plane equation, we need two pieces of information:

  • The normal of the plane
  • The distance of the plane from the origin (D).

Calculating the Normal of the Plane

The normal of a plane is a vector that is perpendicular to the plane. To make calculations easier, we also normalize the vector, making its length 1. The image below (credit: Wikipedia) shows two normals for a polygon.

In order to calculate the normal for a polygon or plane, we use the cross product operation, which takes two vectors and returns a vector orthogonal to both - exactly what we require. Having performed the cross product (and normalized the result!), we have the normal to the plane, whose components are the A, B and C values in the general equation of the plane. Now we only need D.

Calculating the Distance from the Origin (D)

Once the normal for the plane has been calculated, we need D, the distance of the plane from the origin. This is found by simply rearranging the general equation of the plane:

D = -(Ax + By + Cz)

For the (x, y, z) point we can use any point that lies on the plane, since any such point satisfies the equation; for the demo, any of the points used to draw the triangle will do.

Checking Line Intersection

We can now calculate a point's distance from a plane using the general equation. How can we use this to check whether a line is intersecting the plane? Well, if the line is intersecting the plane, one of its endpoints will be in front of the plane and the other behind it, so one of the values given by the general equation will be positive and the other negative. To check for this, we simply substitute the coordinates of each endpoint of the line as the (x, y, z) point in the general equation and multiply the two results together: if the line is indeed intersecting the plane, the multiplication yields a negative number.
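Putting the pieces together, a sketch of the whole check (assuming a plain Vector3 struct and crossProduct/normalize helpers like the ones used elsewhere in these posts, and using the triangle's three vertices to build the plane):

/*
 * Does the line segment between l1 and l2 cross the plane that the
 * triangle (v1, v2, v3) lies on?
 */
bool lineIntersectsPlane(Vector3 l1, Vector3 l2, Vector3 v1, Vector3 v2, Vector3 v3){

 /* Normal of the plane from two edges of the triangle. */
 Vector3 edge1 = {v2.x - v1.x, v2.y - v1.y, v2.z - v1.z};
 Vector3 edge2 = {v3.x - v1.x, v3.y - v1.y, v3.z - v1.z};
 Vector3 normal = crossProduct(edge1, edge2);
 normalize(normal);

 /* D: the plane's distance from the origin, from the rearranged plane equation. */
 float D = -(normal.x * v1.x + normal.y * v1.y + normal.z * v1.z);

 /* Signed distances of the two endpoints from the plane. */
 float d1 = normal.x * l1.x + normal.y * l1.y + normal.z * l1.z + D;
 float d2 = normal.x * l2.x + normal.y * l2.y + normal.z * l2.z + D;

 /* Opposite signs (a negative product) mean the endpoints straddle the plane. */
 return d1 * d2 < 0.0f;
}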

Checking for this intersection and drawing the line in the appropriate colour means we are finished!

Thursday 2 February 2012

Camera - Playing with Matrices

So one of the first things I wanted was to have a free roaming camera that would let me explore the world from all angles. I felt this would be very useful for seeing how different things that I had coded would affect the scene and to make sure that there weren't any nasties hiding in a perspective that I hadn't considered!

After playing with OpenGL for a little while, I of course stumbled upon the gluLookAt() function. This takes the 3D world coordinates of the camera's position and of the point that the camera is looking at. The function then constructs vectors from these and does some jiggery-pokery to position the camera at the given coordinates, looking in the given direction, with the world rotated to simulate the camera having moved to that position.

While this was all well and good, I wasn't comfortable with gluLookAt() - it felt too restrictive, too clunky. Looking around the internet, many people commented that for a camera with any real power, managing one's own matrices for the world and view transformations was the way to go, and in the long run offered more freedom.

I eventually found two very useful websites that helped me learn how to do this:

For most purposes, there are two types of camera: a look at camera, and a roaming camera.

Look at Camera

A look at camera is a camera that fixes its view on a single point. This may be a camera that follows the trajectory of a ball as it rotates in a circle.

The first step in building this camera is to construct the 3D rotation matrix. For this, some information is needed: namely, vectors describing the position and orientation of the camera relative to the world.

To get this information, the user has to provide coordinate information from which vectors can be constructed. These vectors take the following form:

  1. Out vector - The vector representing the line of sight of the camera. It is a vector that points from the camera's location to some distant point in the direction the camera is facing. A "look at" vector.
  2. Right Vector - This is the vector that is perpendicular to the out vector, originating at the camera's location. It forms the x-axis of the camera's coordinate system
  3. Up Vector - This vector is perpendicular to the plane formed by the out and right vectors. It indicates which direction is up relative to the camera.

With these vectors, we know the position and orientation of our camera in the world, and together they can be used to construct the rotation matrix. The rotation matrix is constructed by projecting these vectors onto the world coordinate system; this projection simply gives us the end point of each vector expressed in world coordinates, originating from the origin.


The projected coordinates are then used to construct the rotation matrix (RM) by having each vector form a row of the matrix: row 1 is the Right vector, row 2 is the Up vector and row 3 is the Out vector. 3D rotation matrices here are 4x4 matrices because they use homogeneous coordinates, which allows translations and rotations to be stored in the same matrix and allows matrices to be multiplied together whilst preserving any transformations.

The RM is filled in with the negated values of the Out vector because in OpenGL, instead of moving the camera, the world is moved, and by default the camera looks down the negative z-axis; so we negate each component of the Out vector.
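Written out (with R, U and O as the Right, Up and Out vectors), my understanding of the resulting matrix is:

 RM = \begin{pmatrix} R_x & R_y & R_z & 0 \\ U_x & U_y & U_z & 0 \\ -O_x & -O_y & -O_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}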

So, we use the Out, Right and Up vectors to build the RM but how do we find these vectors?

Finding the Out vector is simple. Because the user has given us the camera position and the look-at point, these two coordinates define a look-at vector with the camera as its origin: we simply compute lookAtPoint - cameraPosition to obtain our Out vector. Simple. We then normalize it, as dealing with unit vectors simplifies the calculations.

Calculating Right and Up is a little more complicated. Once we have our Out vector we know what the camera is looking at, but we have no idea what orientation the camera is in - there are an infinite number of rotations around the Out vector that could define the camera's orientation.

Knowing the Up vector would fix the orientation but how to find it? The answer lies in defining a reference vector with which to fix the orientation of the camera. This reference vector is called the WorldUp vector and is typically defined as the unit y axis vector of the world coordinates (0, 1, 0).

The WorldUp vector lies in the same plane as the Out and Up vectors, so we can use the cross product to find our Right vector, since the cross product returns a vector orthogonal to both of its inputs. Now that we have the Out and Right vectors, it is a simple matter of using the cross product again to find the Up vector of the rotated camera coordinate system. After normalising, we have all the information we need to construct the RM for a look-at camera.

One final thing remains to complete our look-at camera: translation. Translation is performed by creating a translation matrix in which the amount of translation occupies the final column of the 4x4 matrix. In the case of the camera, this means translating the world away from the origin (as in OpenGL the camera stays at the origin and the world is moved), so the world is translated by the negative of the camera's position.
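With the camera at position (c_x, c_y, c_z), the translation matrix therefore carries the negated camera position in its final column:

 T = \begin{pmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 1 & -c_z \\ 0 & 0 & 0 & 1 \end{pmatrix}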

In order to obtain the final transformation matrix for the world/view, the RM and the translation matrix are multiplied together. Because OpenGL expects its matrices in column-major order, the row-major matrix built here has its rows and columns swapped: it is transposed before being loaded into OpenGL with the glLoadMatrixf(float* matrix) function for use when drawing.

Roaming Camera

The roaming camera uses Euler angles, which specify angles of rotation around each of the axes. We then construct rotation matrices that produce the rotation around each axis; each rotation is defined as follows.
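For reference, the standard rotation matrices about the x, y and z axes (shown as 3x3 here; in the camera code they occupy the upper-left of a 4x4 homogeneous matrix) are:

 R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix} \quad R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix} \quad R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}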

The final RM that rotates the world to simulate the movement of the camera is then given by multiplying these matrices together. This process can be optimised by noticing that some trigonometric terms appear multiple times in the final RM. The final RM is evaluated as follows.

Once the RM has been constructed, the translation matrix is constructed in the same way as with the look at camera and the final transformation matrix obtained by multiplying the two and transposing the result.

Relative Rotation

One of the benefits of a roaming camera is being able to explore the world at will. In order to do this, relative rotation is required - that is, rotation with respect to the view coordinate system rather than the world system. Thankfully this is simple: once the initial transformation matrix has been constructed, any further rotations applied are relative to the new coordinate system. Relative rotation is thus achieved by multiplying the RM by the new rotation matrix, multiplying the result by the current translation matrix, and then loading the final transformation matrix into OpenGL after performing a transpose.

Relative Motion

Moving the camera relative to its coordinate system is slightly more complex. Multiplying a vector by a scalar is the geometric equivalent of moving along the vector by some amount. Using this, relative motion is accomplished by incrementing the amount of translation in the translation matrix by such a vector: to move the camera forward in the direction it is facing, increase the translation by a scalar multiple of the Out vector.

Where (x,y,z) is the original amount of translation. The RM is then multiplied by this new translation matrix and the resulting transformation matrix loaded into OpenGL after performing a transpose operation.

(*Images taken from http://www.fastgraph.com/makegames/3drotation/)