Matrices and Translations

Before Reading: If you're rusty on how matrix multiplication works, please review it here: learn/linear/matrices before continuing.

OpenGL Lives in 4D

One of the things that confuses most people when they are learning OpenGL is that the API seems to live in 4D. Even though we use OpenGL to render 3D graphics, all the vertices of our objects are specified with 4 coordinates, $$(x,y,z,w)$$, and all of the matrices used to scale, rotate, and translate are $$4\times4$$ matrices. One of the main reasons for this is that a $$3\times3$$ matrix cannot describe a translation of 3D, while a $$4\times4$$ matrix can. On this page I hope to explain how and why $$4\times4$$ matrices do this.
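One way to see why a $$3\times3$$ matrix is not enough is to look at what any such matrix does to the origin. The symbol $$M$$ below is only a placeholder for an arbitrary $$3\times3$$ matrix.

$$M \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \quad \text{for every } 3\times3 \text{ matrix } M$$

A translation by a nonzero amount moves the origin somewhere else, so it cannot be written as multiplication by a $$3\times3$$ matrix. The extra dimension is what gives us room to work around this.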

Our Goal: Understand how and why translations of 3D require 4x4 matrices.

How we get there: First I will show how translations of 1D can be accomplished with $$2\times2$$ matrices, then how translations of 2D can be accomplished with $$3\times3$$ matrices, and lastly I will explain how $$4\times4$$ matrices describe translations of 3D.

Moving 1D with $$2\times2$$ Matrices

The animation to the left shows the effect of the matrix $$T_{(1)} = \begin{bmatrix} 1 & {\color{vred}x = 1} \\ 0 & 1 \end{bmatrix}$$ on the plane. The trick to understanding how this matrix encodes a translation of 1D is to focus on the effect it has on the line $$y = 1$$, which has three dots on it. Even though the line $$y = 1$$ lives in 2D, we can think of it as a copy of the 1D line we want to translate. The matrix shears all of 2D space, but every point on the line $$y = 1$$ moves to the right by 1. The matrix is really transforming all of 2D space; we only care about the effect it has on our copy of the line, $$y = 1$$.

By pressing the blue button in the bottom-right corner of the animation you can rotate the camera to view the line from the top. Notice how this makes the translating effect of the matrix more evident.
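The same behavior can be checked numerically. The following is a minimal sketch in plain Python, chosen here only for illustration; `apply_2x2` is a hypothetical helper and not part of any OpenGL API.

```python
def apply_2x2(m, p):
    """Multiply a 2x2 matrix (given as a list of rows) by a 2D point."""
    return (m[0][0] * p[0] + m[0][1] * p[1],
            m[1][0] * p[0] + m[1][1] * p[1])

# The matrix T_(1) from the text.
T1 = [[1, 1],
      [0, 1]]

# Three points on the line y = 1, our copy of the 1D line.
on_line = [(-1, 1), (0, 1), (2, 1)]
moved = [apply_2x2(T1, p) for p in on_line]

# Points off that line move by a different amount; this is the shear.
off_line = apply_2x2(T1, (0, 2))
origin = apply_2x2(T1, (0, 0))
```

Here every point in `moved` has its first coordinate increased by 1 while its second coordinate stays at 1. The point at height 2 moves by 2 and the origin does not move at all, which is why only the line $$y = 1$$ behaves like a translated copy of 1D.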

Try it Yourself

In the example above, our matrix described a translation by 1 because $$x = 1$$. To translate the line $$y = 1$$ by $$x$$ units, use the slider below.

$$T_{(x)} = \begin{bmatrix} 1 & {\color{vred}x} \\ 0 & 1 \end{bmatrix}$$
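Multiplying this matrix out by hand shows where the translation comes from. Here $$p$$ is a name introduced only for this example; it stands for the position of a point on our 1D line, written as the 2D point $$(p, 1)$$.

$$T_{(x)} \begin{bmatrix} p \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \cdot p + x \cdot 1 \\ 0 \cdot p + 1 \cdot 1 \end{bmatrix} = \begin{bmatrix} p + x \\ 1 \end{bmatrix}$$

The second coordinate stays at 1, so the point remains on the line $$y = 1$$, and the first coordinate has moved by exactly $$x$$.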


Moving 2D with $$3\times3$$ Matrices

The animation to the left shows the effect of the matrix $$T_{(1,1)} = \begin{bmatrix} 1 & 0 & {\color{vred}x = 1} \\ 0 & 1 & {\color{vred}y = 1} \\ 0 & 0 & 1 \end{bmatrix}$$ on 3D space. In the previous example we translated points in 1D by embedding a copy of the real number line in 2D. In this case we will translate points in 2D by embedding a copy of the 2D plane in 3D. This copy of 2D is the plane $$z=1$$. The matrix $$T_{(1,1)}$$ shears 3D space, but the effect it has on points lying on the plane $$z=1$$ is a translation up and to the right by 1. Even though our matrix is really shearing 3D space, we only care about the effect it has on our copy of the plane, $$z=1$$.
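To see the shear and the translation at once, we can apply $$T_{(1,1)}$$ to a general point of 3D space. The names $$p$$ and $$q$$ are introduced only for this example.

$$T_{(1,1)} \begin{bmatrix} p \\ q \\ z \end{bmatrix} = \begin{bmatrix} p + z \\ q + z \\ z \end{bmatrix}$$

Each horizontal plane slides by an amount proportional to its height $$z$$, which is the shear. On the plane $$z = 1$$ the result is $$(p + 1, q + 1, 1)$$, a translation by 1 in each direction, while the plane $$z = 0$$ does not move at all.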

Try it Yourself

In the example above, our matrix described a translation up and to the right by 1 because $$x = 1$$ and $$y = 1$$. To translate the plane $$z = 1$$ by $$x$$ units to the right and $$y$$ units up, use the sliders below.

$$T_{(x,y)} = \begin{bmatrix} 1 & 0 & {\color{vred}x} \\ 0 & 1 & {\color{vred}y} \\ 0 & 0 & 1 \end{bmatrix}$$
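As a rough sketch of how this matrix could be used in code, the plain Python below lifts a 2D point onto the plane $$z = 1$$, multiplies by $$T_{(x,y)}$$, and drops the extra coordinate again. The function names are hypothetical and chosen only for this illustration.

```python
def translation_2d(x, y):
    """Build the 3x3 matrix T_(x,y) as a list of rows."""
    return [[1, 0, x],
            [0, 1, y],
            [0, 0, 1]]

def apply_3x3(m, v):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

def translate_point_2d(point, x, y):
    """Translate a 2D point by embedding it in the plane z = 1."""
    px, py = point
    lifted = (px, py, 1)                      # copy of the point on z = 1
    moved = apply_3x3(translation_2d(x, y), lifted)
    return moved[0], moved[1]                 # third component is still 1

result = translate_point_2d((2, 3), 1.5, -4)
still_on_plane = apply_3x3(translation_2d(1, 1), (0, 0, 1))
```

With these inputs, `result` is the point $$(2, 3)$$ moved 1.5 units to the right and 4 units down, and `still_on_plane` shows that the third coordinate remains 1 after the multiplication.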


Moving 3D with $$4\times4$$ Matrices

So far we have shown how matrices can encode translations in 1D, the line, and 2D, the plane. The trick we have used is to embed the space we want to translate in a space that is one dimension higher.

When we wanted to translate 1D we embedded it in a higher dimension, 2D. The 2x2 matrix would then shear 2D space but the effect it had on our 1D space was a translation.

When we wanted to translate 2D we embedded it in a higher dimension, 3D. The 3x3 matrix would then shear 3D space but the effect it had on our 2D space was a translation.

In the same way, if we want to use a matrix to describe a translation of 3D, we think of our 3D space as embedded in 4D space. Just as our copy of 1D in 2D space was the line $$y=1$$, our copy of 3D in 4D space is the volume $$w=1$$. This volume consists of all points $$(x,y,z,w)$$ whose $$w$$ component is 1. This is why in OpenGL we generally specify vertices with the regular 3 components $$x$$, $$y$$, $$z$$ and the $$w$$ component set to 1. Our $$4\times4$$ matrix then describes a shearing of 4D space, but the effect it has on our 3D space is a translation. To translate a point in 3D by $$x$$ units along the x axis, $$y$$ units along the y axis, and $$z$$ units along the z axis, the following matrix is used.

$$T_{(x,y,z)} = \begin{bmatrix} 1 & 0 & 0 & {\color{vred}x} \\ 0 & 1 & 0 & {\color{vred}y} \\ 0 & 0 & 1 & {\color{vred}z} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
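The 4D case cannot be drawn, but it can be computed. The plain Python sketch below applies $$T_{(x,y,z)}$$ to a vertex with $$w = 1$$ and, for comparison, to the same coordinates with $$w = 0$$. The helper names are hypothetical, and the matrix is stored as a simple list of rows for readability, which is an assumption of this example rather than the memory layout any particular graphics API expects.

```python
def translation_3d(x, y, z):
    """Build the 4x4 matrix T_(x,y,z) as a list of rows."""
    return [[1, 0, 0, x],
            [0, 1, 0, y],
            [0, 0, 1, z],
            [0, 0, 0, 1]]

def apply_4x4(m, v):
    """Multiply a 4x4 matrix by a 4-component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

T = translation_3d(10, 20, 30)

# A vertex in our copy of 3D, the volume w = 1, is translated.
moved = apply_4x4(T, (1, 2, 3, 1))

# The same coordinates with w = 0 are left where they are.
unmoved = apply_4x4(T, (1, 2, 3, 0))
```

The vertex with $$w = 1$$ ends up at $$(11, 22, 33, 1)$$, while the one with $$w = 0$$ is unchanged. This mirrors the lower-dimensional pictures, where only the line $$y = 1$$ or the plane $$z = 1$$ was translated and the line or plane through the origin stayed put.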

In the end we can’t visualize this 4th dimension, but hopefully the analogies I have shown in lower dimensions have given you some intuition for how and why we use $$4\times4$$ matrices in OpenGL.