Lessons learnt in 3D graphics and game programming
This post is a result of some insight and necessary steps to get the June 2010 DirectX SDK working on a Windows 8 system, with Visual C++ 2010 express. Given the fact that DirectX 11 comes bundled in with the Windows 8 SDK, my stubbornness to change the configuration is due to the fact that I wanted to follow Frank D. Luna’s latest book, mirroring the same development environment (Visual Studio 2010 and the DirectX SDK) in my Windows 8 machine. Without further ado, lets dive in to the details…
If you are like me and want to use Visual C++ 2010 and the June 2010 DirectX SDK in Windows 8, then the below steps outline what needs to be done (I am assuming you are using a Windows 8 machine):
If everything went well, you should be up and running in coding and building DirectX applications using VC++ 2010 (provided that you link the appropriate libraries and set the folder paths in the project settings) on your Windows 8 system.
In a 3D game environment, vectors are used to hold the values of points or vertices. Thus a vector would contain a coordinate [x, y, z] which represents the specific vertex in 3D space. Points are defined such by vectors because the start position of the vector is usually taken as [0, 0, 0] which is the origin of the coordinate space. Thus all the vertices in 3D models and game elements are represented by vectors, but it is important to remember that vectors are not 3D vertices. (There is another type of vector that is represented by 4 coordinates [x, y, z, w], which are homogenous coordinates. We will talk about this in a later post dedicated solely to this very important and interesting topic).
Another place where vectors are used are surface normals. Every 3D surface has a surface normal, which is nothing but a vector pointing away from the surface and perpendicular to that surface. This vector (surface normal) determines how light sources in the environment light up the specific surface. This is just one aspect of the surface normal, as it is used in many other places in the game. Vectors are also used in shading (which we will talk about later), and dynamically processing visual elements in the game.
Simply put, vectors are one of the most used constructs in 3D games. This is why learning and understanding about vectors and their operations is very important. Without further ado, lets dive in to learn about vector operations and some basic linear algebra.
THE ZERO VECTOR
The zero vector is the additive identity in the set of vectors. The 3D zero vector would be denoted by [0, 0, 0]. This vector is a special vector because it is the only vector with no magnitude or direction. It would be easy to assume the zero vector as a point, but the reader should remember that a point represents a location. It is better to remember the zero vector as a vector with zero displacement.
Vectors negation can be related to multiplication of scalar numbers by -1. The negated vector is known as the additive inverse of the original vector. A vector of any dimension is negated by all its individual components as shown below:
– [x, y] = [-x, -y]
-[x, y, z] = [-x, -y, -z]
Geometrically speaking, vector negation produces a vector same as the original, but with opposite direction.
MAGNITUDE OF A VECTOR
Vectors have a magnitude (length) which can be calculated simply by taking the root of the sum of squares of the individual dimensions. This is very simple if you remember the very basic skill of calculating the length of the hypotenuse in a right angled triangle using the Pythagorean theorem (for 2 dimensions).
For a 2D and 3D vector, the magnitudes are defined as below, respectively:
VECTOR MULTIPLICATION AND DIVISION BY A SCALAR
Vectors can be multiplied by scalars, and this happens by multiplying the individual components of the vector by the scalar value. What we geometrically obtain by scalar multiplication is another vector that is parallel to the original vector, but which could differ by magnitude or direction, depending on the scalar value. Some examples are given below:
k[x, y, z] = [kx, ky, kz], -2[4, 0, 1] = [-8, 0, -2]
Vectors can be divided by scalars as well, and this would be equivalent to multiplying the vector with the reciprocal of the scalar value, which is shown as follows:
1/k[x, y, z] = [x/k, y/k, z/k]
Some important aspects to note:
In many situations, it is not the magnitude of the vector that is important, but the direction. In these cases it is convenient to work with unit vectors, which have the same direction of the original vectors, but their magnitude is 1. The process of taking a vector, and converting it into a vector of magnitude 1 while maintaining the direction, is known as vector normalization. The unit vector is known as the normal.
A vector is normalized by dividing the vector by its magnitude (scalar division, as the magnitude value is a scalar). The result is a vector which is the normal to the given original vector:
Vnorm = V/||V||, Where V is not zero.
The below image (courtesy of the book 3D math primer for graphics and games development, by F.Dunn and I.Parberry) show unit vectors in 2D space, which touch the surface of a circle of unit radius. In 3D space, unit vectors would touch the surface of a sphere of unit radius:
VECTOR ADDITION AND SUBTRACTION
Vectors can be added or subtracted only when their dimensions are equal. The individual components of the vectors are added or subtracted to obtain the resultant vector. Though vector addition is commutative, vector subtraction is not. Examples of vector addition and subtraction is given below:
[2, 5, -1] + [3, 1, 0] = [5, 6, -1]
[3, 0, -3] – [5, -2, 0] = [-2, 2, -3]
The geometrical concept of vector addition and subtraction is the basic triangle rule. Given the vector addition of two vectors A and B as A+B, we need to find the resultant vector which has the starting position of A, but the ending position of B. This can be applied to many vectors. Vector addition may seem a simple enough concept, but we will later see a similar mechanism to transform vectors from one coordinate space to another.
In the next post, we will continue the rest of the vector operations, and look at two very important operations: Vector dot product and the vector cross product.
Well what are vectors anyway? The topic of vectors crop up in geometry, physics, engineering disciplines, mechanics, etc. How they are used, as well as their definitions at times, vary from context to context. Below I have listed how vectors are defined in some contexts:
1. In geometry, Euclidian (or spatial) vectors are line segments that represent length and direction.
2. In physics, vectors are used to show magnitude (usually in some unit) and direction, representing aspects such as velocity, force etc.
3. In linear algebra, vectors are elements of vector spaces, but unlike the above attributes of vectors, may not always be made up of real numbers.
I liken vectors to cross-cutting concerns in regular software applications. One example of a cross cutting concern in software applications is logging. In any one of the layers in a layered software architecture, logging is an important function that is applied across the layers (or used as an aspect, in Aspect Oriented Programming lingo).
Vectors are a cross-cutting concern across geometry, linear algebra, mechanics, engineering, fluid dynamics, etc. Vectors are a necessary and critical element in each of these areas, but is pretty much the same thing when taken by itself, and can be treated as an aspect, if I may use the term again from AOP.
Lets backtrack for a moment: A geometrical point is something and nothing at the same time. It is purely a location in space, but has no width, height, length, or any type of dimensional size. Next, a line can be defined as the straight path between two points. But can this straight path have any thickness or size? Points and lines are abstract idealizations in geometry, we cannot create or draw them, but we can visualize them by giving them size, thickness etc, that will make sense to our eyes as points and lines. A vector is yet another abstraction, which represents the magnitude of something (denoted by the length of a directional line segment), and the direction of the acting element to which the magnitude is applicable. A vector starts from a initial point, and ends at a terminal point, with the directed line segment connecting the two points representing the magnitude and direction.
What game programmers need to know is that vectors can be represented as lists of numbers (or arrays, to be more accurate). If the initial point of each vector is taken as the origin of a coordinate system, every vector can be represented by a list of numbers. In two dimension, vectors can exist only on a plane, and thus need a minimum of two numbers (or values) to be defined in a list. In 3D, vectors can exist in 3D space, and need a minimum of three numbers to be defined. But it can be more than three dimensions, and we will see about higher dimension vectors in a future post, which has further implications in 3D game programming.
One important consideration when talking about vectors, is the relationship they have to points. Points were explained to represent only position. Vectors do not have position, but have magnitude and directions (displacement). But points do not have precise or absolute locations, their locations are defined relative to some coordinate space. Now, what happens when you draw a line segment from the origin of this coordinate space to the point in question? What we get is the displacement of that point, from the origin. Thus, in a given coordinate system, if we have a vector starting from the origin and describing a displacement of [x, y], we end up in the location of the point represented by [x, y]. What we need to remember is that points and vectors are conceptually (think physics) different but mathematically (think geometry) equivalent.
To sum up, vectors are simply directional line segments that represent a certain direction, and a magnitude which is denoted by the length of the line. If the vector is initiated from the origin of a coordinate system, the vector is equivalent to a point in the coordinate space whose coordinates are the same as for the vectors terminal point (And vice-versa: The displacement of a point in a coordinate space from the origin, is given by the vector that begins from the origin and ends at the point).
In the next post, we will look at where vectors are used in 3D games development, and some of the basic vector operations that we need to know about.
My sincere apologies to the readers and followers of GameCoderLogic, for the long period of inactivity in publishing posts. I have just recently moved to Singapore from Sri Lanka due to my change of workplace and job, and had to put blogging (and many other things) on hold till I transitioned from my previous job to the current one.
Now that I have had time to settle down (the writing of this post being proof of this fact, somewhat…), regular 3D Game Math primer posts will continue, starting from the upcoming weekend.
WHY DO WE NEED MULTIPLE COORDINATE SPACES?
This question is best answered with a story given in the book ‘3D Math Primer for graphics and games development’ by Fletcher Dunn and Ian Parberry. The story goes that there are two cities, ignorant of each other and having their own coordinate system to map the different areas within the city. In the first city, the inhabitants do not care or have any knowledge of the coordinate system used in the second city and vice versa. They each have their own coordinate system and live their blissful day to day lives. But one day, the state engineer is tasked with building a road between the two cities. This creates a dilemma: the coordinate systems used by the cities (maybe with the origins at the city centers) is not adequate for the engineer. He needs a larger frame of reference, encompassing the two cities within it. He needs a new coordinate space.
The 3D models we see rendered in games are generally created using 3D modeling packages. For example, if I use the open source software ‘Blender’ to draw a 3D model that I will use in my game, say for example a Spherical spaceship, then in my modeling package the origin of the model will be in the center of the sphere, denoting (0, 0, 0). But when I import this model to my game, I need this model placed in reference to ‘something’ that represents the whole virtual world of the game. This ‘something’ is the world coordinate space. With regards to multiple coordinate spaces and computer games, three coordinate spaces are vitally important and need to be understood clearly: Object space, World space, and Camera space.
Object space is what I was describing above when I mentioned that the origin of the model is at the center of the spherical spaceship. Object space refers to the coordinate frame that the object is embodied and described in. World space is the global coordinate space, where everything of interest to the game in a graphical sense will reside. Finally, camera space is your view into the game world: The origin in camera space represents the point where the camera is stationed (your screen viewport) and where the view frustum begins. Camera space is the frame of reference relative to this 3D origin. Note that camera space can be treated as a ‘special object space’ where the object in question is the camera through which you view the game, and all the game constructs are referenced relative to this camera object space.
NESTED COORDINATE SPACES
Objects in the world space of a game are rarely rigid objects. One example of a rigid object would be a model of a barrel. But even this object weaves a complicated path through world space when tumbling down a flight of stairs (based on the physics that is coded). Now think of a model of a Robot with a large number of abilities. Let’s assume for the moment that the robot has pincers that open and close in the Y-Z plane of world space, and a circular saw that is spinning in the X-Z plane of world space, and the robot is traveling along the Z-axis in world space (assume the robot is traveling away from our field of view, towards the horizon). The programmer trying to implement this behavior of the robot in only world space context is going to have a very hard time! This is because, the movements of the robot in world space is complicated when we look at the movement collectively. Add to this that the robotic pincers can move around, and the circular saw can change orientation, and you have a very complicated movement pattern in world space that the programmer would find hard to implement. This is why we need nested coordinate spaces.
Think of nested coordinate spaces as parent – child coordinate space nodes in the structure of a graph or tree. In the example of the robot, the object space of the robot is the parent space. The object spaces of the pincers and the circular saw can be child spaces of the robots object space, and the pincer blade object space can be a child space of the pincers object space. In this manner, if a given object space and object is responsible for the movement of the connected child coordinate space nodes, implementing complicated movements in world space becomes very much less complicated to the programmer. This reduction in complexity is due to the nested coordinate spaces of the object and constituent parts of the object, and the responsibility of each node in controlling the individual movements of the child nodes, which collectively results in the complicated movement we observe in world space.
BETWEEN TWO PARALLEL UNIVERSES: A NOTE ON INERTIAL SPACE
Hopefully the reader has a clear understanding of object space and world space by now. I like to think of object space and world space as two parallel universes, and traveling between these two universes is the idea of transformation between object and world space. What I would like to focus on here, is the space between these universes of object space and world space, or the intermediate transition stage between the two universes, which is known as ‘inertial space’. The idea of inertial space would be more clear by referring the image shown below
Inertial space has the same origin as that of object space, but the primary axes are aligned to the world space primary axes. How does inertial space help in transforming an object from object space to world space? Inertial space makes it easy by providing a two step process:first we transform from object space to inertial space by simple rotation. In this operation, the X and Y axes of the object space above, coincide with the X and Y axes of the inertial space, and the robots alignment becomes upright. Second, we move the location of the inertial space origin to coincide with the origin of world space, and this operation is known as translation. In this two step process, which is easy to think about individually, we have transformed the object from object space to world space, via the intermediary inertial space.
The 3D coordinate system is a natural extension of the 2D cartesian coordinate system we are all familiar with and have studied. The cartesian coordinate system has an interesting history, first documented by the French mathematician Rene Descartes in around 1637. Early work in this time period dealt with only one or two axis coordinate systems, although later findings indicated that the mathematician Pierre de Fermat worked with a three axis (3D) system which was not published. The work done by these great men laid the basis for the great discoveries by Newton and Leibniz later on: the invention of the Calculus. But enough reflections on history, and back to the matter at hand…
We will be covering the following aspects in this first blog post dealing with 3D mathematics:
THE 3D COORDINATE SPACE
The 3D coordinate system consists of three lines or axes (singular form: axis), which are mutually perpendicular to each other, and form the abstract representation of the three dimensions that we humans are able to see. If we had the ability to sense or see in more than 3 dimensions, we would not be limited to the 3 axis system, though this limitation is absent in areas such as linear algebra where you can describe systems with any number of dimensions in an abstract sense. In real life, we see only in three dimensions, and will be our basis behind creating real life-like game environments and models.
The three axes are referred to as the X, Y, and Z axes. Any and every point in 3D space (relative to the axes) can be described in reference to these three axes. The point where the three axis lines intersect is known as the origin. The origin is to the 3D space, what the number zero is to the number line. A coordinate in 3D space is described by the length units projected on the X, Y, and Z axes respectively to reach that position, starting from the origin. Thus the position of the origin can be described as (0, 0, 0) and the point shown in the coordinate system below has the coordinates (x, y, z).
LEFT HANDED AND RIGHT HANDED COORDINATE SYSTEMS
Left handed and right handed orientation is nothing more than a categorization of how the 3D axes can be aligned. If you take your left hand and extend the thumb, index, and third finger at right angles to each other, and hold your hand so that the thumb (positive x-axis) is pointing to your right, the index (positive y-axis) is pointing up, and the third finger (positive z-axis) is pointing towards the direction you are facing, then you have a left handed 3D coordinate system. If we do the same with our right hand, where the thumb, index, and third finger are the positive x, y, and z axes respectively, we have a right handed coordinate system. Notice that if you rotate the right handed orientation to match the x and y axis directions of the left handed system, only the positive z-axis changes direction. We could say that the difference between the orientations is that in a left handed system, the positive z-axis points away from us towards where we are facing, whereas in a right handed orientation, the positive z-axis points towards us. The figure below represents this fact.
In the next post, we will learn about multiple (3D) coordinate spaces and nested coordinate space, and also how basic coordinate transformations can take place. Please leave any feedback on content, clarity of writing, on what you think should be changed, or errata in general, and I will do my best to work on it.
Many a time I’ve heard statements like “Games programming? 3D graphics? that requires an awful lot of maths right? and graduate level math at that…” ….and therein onwards the interest dies off due to the pre-mis-conceptions about the role that mathematics will play in games development.
For a programmer/developer/software engineer, learning the concepts of 3D mathematics needed for developing games is not an impossible task (the math we use in 3D game programming is referred to as ‘3D math’ but is actually a composite collection of topics from different mathematical disciplines as we shall see shortly). Of course, there will be some due inherent complexity, as there should be with good reason, but we tackle complexity everyday in software engineering: software architecture and patterns, databases and database design, mastering specific technologies that we use to build systems with, web site user experience, all bring about their own inherent complexities, but we master them and control them in order to reach our objectives. In that regard, what is so different about 3D mathematics that we cannot master the complexity, understand it, and apply it? That being said, this may be a good time as any to elaborate on what 3D mathematics is made up of.
The math used in the rendering of 3D games is based on a sub-set of topics derived from the mathematical branches of linear algebra, (solid) geometry, real analysis, and higher algebra. In addition, any artificial intelligence used in game logic and physics simulations, employs real world physics and mathematics as well. But we are currently concerned with the mathematics needed for drawing and rendering 3D graphics for games, and that is what we will focus on at the moment, though we will discus all aspects of game programming, including AI, in later topics.
To this end, I will be creating a thread of posts in this blog that deal with 3D math topics in a linearly progressive fashion, including code examples showing their applications where possible. This is in order to show and prove to the reader, that learning 3D mathematics is not the daunting task it is labeled to be, and also to pave the way for the game programming concepts and topics discussed in this blog that rely on mathematics. It is one of the most rewarding experiences, that when you are developing 3D computer games, you actually understand the mathematical concepts and principles behind what you are doing. For at that point, no aspect of the game is out of your control, as you know and understand exactly what is happening under the hood.