How the deprecated OpenGL matrix model works

Even if nowadays everybody seems to drop OpenGL methods when they are deprecated on the core profile, it doesn't mean that you don't need to use them in compatibity profile or that you don't want to know how they work. I searched on the web to find more information on how the old and deprecated OpenGL matrices are implemented and I didn't find anything (except tutorials on how to use them!). My doubt was mainly about the operations order, because I needed to make a C++ implementation of them, maintaining the same exact behavior. I used OpenGL matrices In the past without worrying about how they were implemented, I had a precise idea but now I have to be 100% sure. Even if we know how to implement operations between matrices, the row-column product doesn't have the commutative property so the internal implementation can make the difference. At the end, my question is:

- What is the matrix row-column order and how the product is implemented on OpenGL?

Tired of finding pages saying how they are useless and deprecated now, I had to check by myself the Mesa source code to find what I was searching for:

P = A * B;

P[0] = A[0] * B[0] + A[4] * B[1] + A[8] * B[2] + A[12] * B[3];
P[4] = A[0] * B[4] + A[4] * B[5] + A[8] * B[6] + A[12] * B[7];
P[8] = A[0] * B[8] + A[4] * B[9] + A[8] * B[10] + A[12] * B[11];
P[12] = A[0] * B[12] + A[4] * B[13] + A[8] * B[14] + A[12] * B[15];

P[1] = A[1] * B[0] + A[5] * B[1] + A[9] * B[2] + A[13] * B[3];
P[5] = A[1] * B[4] + A[5] * B[5] + A[9] * B[6] + A[13] * B[7];
P[9] = A[1] * B[8] + A[5] * B[9] + A[9] * B[10] + A[13] * B[11];
P[13] = A[1] * B[12] + A[5] * B[13] + A[9] * B[14] + A[13] * B[15];

P[2] = A[2] * B[0] + A[6] * B[1] + A[10] * B[2] + A[14] * B[3];
P[6] = A[2] * B[4] + A[6] * B[5] + A[10] * B[6] + A[14] * B[7];
P[10] = A[2] * B[8] + A[6] * B[9] + A[10] * B[10] + A[14] * B[11];
P[14] = A[2] * B[12] + A[6] * B[13] + A[10] * B[14] + A[14] * B[15];

P[3] = A[3] * B[0] + A[7] * B[1] + A[11] * B[2] + A[15] * B[3];
P[7] = A[3] * B[4] + A[7] * B[5] + A[11] * B[6] + A[15] * B[7];
P[11] = A[3] * B[8] + A[7] * B[9] + A[11] * B[10] + A[15] * B[11];
P[15] = A[3] * B[12] + A[7] * B[13] + A[11] * B[14] + A[15] * B[15];

where A and B are 4x4 matrices and P is the result of the product. As you can see, this snippet clarifies how rows and columns are internally ordered and how the product is implemented. In conclusion, the OpenGL methods to modify the current matrix are implemented by Mesa in this way:

- translation

void glTranslatef(GLfloat x, GLfloat y, GLfloat z)
{
   GLfloat *m = currentMatrix->m;

   m[12] = m[0] * x + m[4] * y + m[8]  * z + m[12];
   m[13] = m[1] * x + m[5] * y + m[9]  * z + m[13];
   m[14] = m[2] * x + m[6] * y + m[10] * z + m[14];
   m[15] = m[3] * x + m[7] * y + m[11] * z + m[15];
}

- rotation

void glRotatef(GLfloat angle, GLfloat x, GLfloat y, GLfloat z)
{
   GLfloat xx, yy, zz, xy, yz, zx, xs, ys, zs, one_c, s, c;
   GLfloat rotateMatrix[16];
   GLboolean optimized;

   s = sinf( angle * M_PI / 180.0 );
   c = cosf( angle * M_PI / 180.0 );

   memcpy(rotateMatrix, Identity, sizeof(Identity));

#define M(row, col) rotateMatrix[col*4+row]

  const GLfloat mag = sqrtf(x * x + y * y + z * z);

  if (mag <= 1.0e-4F) {
     /* no rotation, leave mat as-is */
     return;
  }

  x /= mag; y /= mag; z /= mag;

  xx = x * x; yy = y * y; zz = z * z;
  xy = x * y; yz = y * z; zx = z * x;
  xs = x * s; ys = y * s; zs = z * s;
  one_c = 1.0F - c;

  M(0,0) = (one_c * xx) + c;
  M(0,1) = (one_c * xy) - zs;
  M(0,2) = (one_c * zx) + ys;
  M(1,0) = (one_c * xy) + zs;
  M(1,1) = (one_c * yy) + c;
  M(1,2) = (one_c * yz) - xs;
  M(2,0) = (one_c * zx) - ys;
  M(2,1) = (one_c * yz) + xs;
  M(2,2) = (one_c * zz) + c;

   // currentMatrix = currentMatrix * rotateMatrix

   matrix_multf( currentMatrix, rotateMatrix );
}

- scaling

void glScalef(GLfloat x, GLfloat y, GLfloat z)
{
   GLfloat *m = currentMatrix->m;

   // currentMatrix = currentMatrix * scaleMatrix

   m[0] *= x;   m[4] *= y;   m[8]  *= z;
   m[1] *= x;   m[5] *= y;   m[9]  *= z;
   m[2] *= x;   m[6] *= y;   m[10] *= z;
   m[3] *= x;   m[7] *= y;   m[11] *= z;
}

Another thing that I found tricky is the exact number of matrix stacks and the exact behavior when you select one that is unsupported. The current matrix stack can be selected with glMatrixMode, so we are going to have four kind of matrix stacks (COLOR, MODELVIEW, PROJECTION and TEXTURE) that can be current to matrix operations and somebody can think that four matrix stacks can be enough, but it's wrong. The TEXTURE category includes also the matrix stacks supported for each texture units that can be choosen with glActiveTexture. However, you can have only 4 texture units for the fixed functions (GL_MAX_TEXTURE_UNITS), 8 texture coords for the vertex attributes (GL_MAX_TEXTURE_COORDS) and more than 32 texture units for the program shaders (GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS). So, at this point:

- What is the exact number of matrix stacks associated to texture units on OpenGL?

The answer is not so simple. I checked the Mesa source code, but this time it didn't help alot. I found that they supported a number of matrix stacks equal to fragment shader image units, that I know for sure that is wrong because matrices are deprecated and more that 32 texture units are supported only by non-deprecated functions. In fact, the answer to my question lies in the deprecated functions, in particular those related to program shaders. As the texture coords are only GL_MAX_TEXTURE_COORDS (gl_MaxTextureCoords), there are only gl_MaxTextureCoords (deprecated) built-in matrices on the shader language:

uniform mat4 gl_TextureMatrix[gl_MaxTextureCoords];

So I can deduce that the same applies to matrix stacks. To be 100% sure, I did a test on my GeForce GTX 670 with glActiveTexture, passing a texture unit that is less than GL_MAX_TEXTURE_COORDS and it worked, while texture units with greater-equal values where simply ignored without errors. So my final answer is that the number of texture matrices associated to texture units is equal to GL_MAX_TEXTURE_COORDS. If you are not sure, you can try it by your self, with the following code:

GLint maxTextureCoords = 0;
glGetIntegerv(GL_MAX_TEXTURE_COORDS, &maxTextureCoords);
glMatrixMode(GL_TEXTURE);
glActiveTexture(GL_TEXTURE0 + maxTextureCoords);

glLoadIdentity();

GLfloat matrix[16];
glGetFloatv(GL_TEXTURE_MATRIX, matrix);

As the matrix stack GL_TEXTURE0 + maxTextureCoords is not available, the glLoadIdentity has no effect and glGetFloatv should return garbage memory. See that this is the true behavior that should be respected by most of the graphics driver, but it could be violated by alternative implementations (like Mesa). Anyway, we care only about the standard and not the particular cases, so we can ignore these kind of exceptions.