Category Archives: Graphics

Everything about graphics, from hand drawn pictures to CGI. All the applications, algorithms, demos, tutorials and even the games used to produce graphics.

How to parse gl.xml and produce your own loading library

In the previous article I emphasized the importance of not having a third-party loading library like glew because OpenGL is too complex and unpredictible. For example, if you want to implement a videogame with an average graphics and a large audience of users, probably OpenGL 2.1 is enough. At this point, you may need to load only that part of the library and make the right check of the extensions or just use the functions that have been promoted to the core of the current version. Remember that an extension is not guaranteed to be present on that version of OpenGL if it's not a core feature and this kind of extensions has been introduced after 3.0 to maintain the forward compatibility.

For instance, it's useful to check the extension GL_ARB_vertex_buffer_object only on OpenGL 1.4 (in that case you may want to use glBindBufferARB instead of glBindBuffer) but not on superior versions because it has been promoted to the core from the version 1.5 onward. The same applies to other versions of the core and extensions. If you target OpenGL 2.1, you have to be sure that the extensions tipically used by 2.1 applications have not been promoted to the latest OpenGL 4.5 version and to check the extenions on previous versions of the library, making sure to use the appropriate vendor prefix, like ARB. Even if with glew you can make this kind of check before using the loaded functions, I don't recommend it because glewInit() is going to load also parts that you don't want to use and you run the risk to understimate the importance of checking the capabilities.

Anyway, reading the OpenGL spec and add manually the required extensions is a time expensive job that you may don't have the time to do. Recently, the Khronos group has released an xml file where there is a detailed description of the extensions and the functions for every version of the library, it is also used to generate the gl.h and the glext.h header files with a script in Python. In the same way, you can program a script that parses the gl.xml file to generate your own loading library, making the appropriate check of the extensions and including only the part that you really need to load on your project. You can find the gl.xml file here:

Continue reading

Why loading libraries are dangerous to develop OpenGL applications

OpenGL is not so easy to use. The API exposes thousand of functions that are grouped into extensions and core features that you have to check for every single display driver release or the 3D application may not work. Since OpenGL is a graphics library used to program cool gfx effects without a serious knowledge of the underlying display driver, a large range of developers is tempted to use it regardless of the technical problems. For example, the functions are loaded "automagically" by an external loading library (like glew) and they are used to produce the desired effect, pretending that they are available everywhere. Of course this is totally wrong because OpenGL is scattered into dozens of extensions and core features that are linked to the "target" version that you want to support. Loading libraries like glew are dangerous because they try to load all the available OpenGL functions implemented by the display driver without making a proper check, giving you the illusion that the problem doesn't exist. The main problem with this approach is that you cannot develop a good OpenGL application without taking the following decision:

- How much OpenGL versions and extensions I have to support?

From this choice you can define the graphics aspect of the application and how to scale it to support a large range of display drivers, including the physical hardware and the driver supported by the virtual machines. For example, VirtualBox with guest addictions uses chromium 1.9 that comes with OpenGL 2.1 and GLSL 1.20, so your application won't start if you programmed it using OpenGL 4.5, or even worse you won't start also on graphics cards that support maximum the version 4.4 (that is very recent). For this reason, it's necessary to have a full awareness of the OpenGL scalability principles that must be applied to start on most of the available graphics cards, reducing or improving the graphics quality on the base of the available version that you decided to target. With this level of awareness, you will realize that you don't need any kind of loading library to use OpenGL, but only a good check of the available features, that you can program by yourself. Moreover, libraries like glew are the worst because they are implemented to replace the official gl.h and glext.h header files with a custom version anchored to the OpenGL version supported by that particular glew version.

Continue reading

How the deprecated OpenGL matrix model works

Even if nowadays everybody seems to drop OpenGL methods when they are deprecated on the core profile, it doesn't mean that you don't need to use them in compatibity profile or that you don't want to know how they work. I searched on the web to find more information on how the old and deprecated OpenGL matrices are implemented and I didn't find anything (except tutorials on how to use them!). My doubt was mainly about the operations order, because I needed to make a C++ implementation of them, maintaining the same exact behavior. I used OpenGL matrices In the past without worrying about how they were implemented, I had a precise idea but now I have to be 100% sure. Even if we know how to implement operations between matrices, the row-column product doesn't have the commutative property so the internal implementation can make the difference. At the end, my question is:

- What is the matrix row-column order and how the product is implemented on OpenGL?

Tired of finding pages saying how they are useless and deprecated now, I had to check by myself the Mesa source code to find what I was searching for:

P = A * B;

P[0] = A[0] * B[0] + A[4] * B[1] + A[8] * B[2] + A[12] * B[3];
P[4] = A[0] * B[4] + A[4] * B[5] + A[8] * B[6] + A[12] * B[7];
P[8] = A[0] * B[8] + A[4] * B[9] + A[8] * B[10] + A[12] * B[11];
P[12] = A[0] * B[12] + A[4] * B[13] + A[8] * B[14] + A[12] * B[15];

P[1] = A[1] * B[0] + A[5] * B[1] + A[9] * B[2] + A[13] * B[3];
P[5] = A[1] * B[4] + A[5] * B[5] + A[9] * B[6] + A[13] * B[7];
P[9] = A[1] * B[8] + A[5] * B[9] + A[9] * B[10] + A[13] * B[11];
P[13] = A[1] * B[12] + A[5] * B[13] + A[9] * B[14] + A[13] * B[15];

P[2] = A[2] * B[0] + A[6] * B[1] + A[10] * B[2] + A[14] * B[3];
P[6] = A[2] * B[4] + A[6] * B[5] + A[10] * B[6] + A[14] * B[7];
P[10] = A[2] * B[8] + A[6] * B[9] + A[10] * B[10] + A[14] * B[11];
P[14] = A[2] * B[12] + A[6] * B[13] + A[10] * B[14] + A[14] * B[15];

P[3] = A[3] * B[0] + A[7] * B[1] + A[11] * B[2] + A[15] * B[3];
P[7] = A[3] * B[4] + A[7] * B[5] + A[11] * B[6] + A[15] * B[7];
P[11] = A[3] * B[8] + A[7] * B[9] + A[11] * B[10] + A[15] * B[11];
P[15] = A[3] * B[12] + A[7] * B[13] + A[11] * B[14] + A[15] * B[15];

where A and B are 4x4 matrices and P is the result of the product. As you can see, this snippet clarifies how rows and columns are internally ordered and how the product is implemented. In conclusion, the OpenGL methods to modify the current matrix are implemented by Mesa in this way:

Continue reading

Do you want to work for NICE/Amazon? We are hiring!

Hi. Since NICE was acquired by Amazon I became part of the Amazon EC2 and its team in the world. Me and my collegues are working hard to improve our High Performance Computing and remote visualization technologies, which basically require advanced C/C++ programming skills and a deep knowledge of the OpenGL libraries. If you meet the requirements and want to be part of our world-class team, check our current offers here:

In addition to the skills listed in the announcements, the candidate must make a moderate use of modern C++ features and third-party dependencies (e.g. the use of high-level frameworks like QT or boost is justified only if it brings real benefits to the project and not to skip programming). know how to manage device contexts, choose / set pixel formats / fbconfigs, destroy / create rendering contexts, set the default frame buffer or FBO as rendering target, use graphics commands to render frames with multiple contexts running on multiple threads, without performance issues. A good knowledge of Desktop OpenGL specifications (from 1.0 to 4.5), deprecation and compatibility mode is required (e.g. the candidate must know that some OpenGL functions can be taken with wgl / glXGetProcAddress instead of using blindly a loading library like glew). If you have concerns or questions, do not hesitate to contact me. Regards.

Unrelated Engine – Deferred Rendering and Antialiasing

I tried to implement the explicit multisample antialiasing and I got good results, but it's slow on a GeForce 9600GT. A scene of 110 fps became 45 fps with only four samples, just to point out the slow down. While I was jumping to the ceiling for the amazing image quality of a REAL antialiasing with deferred shading (not the fake crap called FXAA) I fell down to the floor after I seen the fps, what a shame.


Anyway, I decided to change from a deferred shading to a deferred lighting model just to implement a good trick in order to use the classic multisample (that in my card can do pretty well also with 16 samples!) reading from the light accumulation buffer in the final step and writing the geometry to the screen with the antialiasing enabled. The result is a little weird, but you can fix it by using that crap fxaa on the light accumulation buffer which is smoother than the other image components. For example, I can use: a mipmapping or anisotropic filtering to eliminate the texture map aliasing, a FXAA to eliminate the light accumulation buffer aliasing and finally a MSAA to eliminate the geometry aliasing.

ps: I used the nanosuit model from this site:

TSREditor – Last screenshots

TSREditor is a huge editor in the style of Blender3D  projected to create or edit the resources like textures, 3d models, sound and levels for games and other stuff.


In spite of the large amount of work needed to reach a decent version of this software, I decided that it will be free as a part of my Unrelated Framework. In this moment I'm far from a decent beta version to release, but I can show you two nice screenshots of the program at work.


Unrelated Engine – Work in progress

This is the first "work in progress" video of my 3d engine called Unrelated Engine. Some complex animated models come from Doom 3. They were converted to a maximum of 4 weights per vertex as well as the identical vertices have been cancelled to improve the speed. The shader language was used to obtain a large amount of skinned meshes and complex materials with a reasonable speed.

The 3d models are from doom3 and from, they were used only to test my engine and to make this video. The rights of these 3d models and the music are reserved by their respective authors.

Unrelated Engine – A nice test with Mario 64 map

I created a nice video with an engine that I'm still developing and that is part of my Unrelated Framework. It uses:

- OpenGL (to draw the graphics)
- DevIL (to import images)
- Assimp (to import 3d models)

The 3d models are from, they were used only to test my engine and to make this video. The rights of these 3d models and the music are reserved by their respective authors.

Unrelated Framework – Inclusion of OpenIL/DevIL

OpenIL is a library with very powerful image loading capabilities and I decided to include it in my framework. My standard can handle several image formats like the classic 8 bit RGB or more advanced  formats like 16 bit RGB or High Dynamic Range. The images are imported from file and used maintaining the original format (if it's possible).  For other feature check  OpenIL official web site (

Supports loading of:

* Windows Bitmap - .bmp
* Dr. Halo - .cut
* Multi-PCX - .dcx
* Dicom - .dicom, .dcm
* DirectDraw Surface - .dds
* OpenEXR - .exr
* Flexible Image Transport System - .fits, .fit
* Heavy Metal: FAKK 2 - .ftx
* Radiance High Dynamic - .hdr
* Macintosh icon - .icns
* Windows icon/cursor - .ico, .cur
* Interchange File Format - .iff
* Infinity Ward Image - .iwi
* Graphics Interchange Format - .gif
* Jpeg - .jpg, .jpe, .jpeg
* Jpeg 2000 - .jp2
* Interlaced Bitmap - .lbm
* Homeworld texture - .lif
* Half-Life Model - .mdl
* MPEG-1 Audio Layer 3 - .mp3
* Palette - .pal
* Kodak PhotoCD - .pcd
* ZSoft PCX - .pcx
* Softimage PIC - .pic
* Portable Network Graphics - .png
* Portable Anymap - .pbm, .pgm, .pnm, .pnm
* Alias | Wavefront - .pix
* Adobe PhotoShop - .psd
* PaintShop Pro - .psp
* Pixar - .pxr
* Raw data - .raw
* Homeworld 2 Texture - .rot
* Silicon Graphics - .sgi, .bw, .rgb, .rgba
* Creative Assembly Texture - .texture
* Truevision Targa - .tga
* Tagged Image File Format - .tif
* Gamecube Texture - .tpl
* Unreal Texture - .utx
* Quake 2 Texture - .wal
* Valve Texture Format - .vtf
* HD Photo - .wdp, .hdp
* X Pixel Map - .xpm
* Doom graphics

Supports saving of:

* Windows Bitmap - .bmp
* DirectDraw Surface - .dds
* OpenEXR - .exr
* C-style Header - .h
* Radiance High Dynamic - .hdr
* Jpeg - .jpg
* Jpeg 2000 - .jp2
* Palette - .pal
* ZSoft PCX - .pcx
* Portable Network Graphics - .png
* Portable Anymap - .pbm, .pgm, .pnm, .pnm
* Adobe PhotoShop - .psd
* Raw data - .raw
* Silicon Graphics - .sgi, .bw, .rgb, .rgba
* Truevision Targa - .tga
* Tagged Image File Format - .tif
* Valve Texture Format - .vtf

This is an example of how it works in my framework:

C_Image *newImage = new C_Image();
//load from file
newImage->ImportFormFile(UF_FILE_JPG, "test.jpg");
//export to file
newImage->ExportToFile(UF_FILE_HDR, "test.hdr");
//set file format jpeg compression
newImage->SetFileFormat(UF_FileFormat_JPG(50)); //the image will be saved in my format with a jpeg quality of 50

Gianpaolo Ingegneri
Copyright @ 2010 – All right reserved

Unrelated Framework – Gui and Gui Editor

Finally I completed the GUI of my framework. The best feature is that the Gui can work in two modes: via software or using the OpenGL. It's  very helpful for cross platform compatibility, for video games or other OpenGL purpose.


The Gui is in his first version but it has all the widgets necessary to create professional applications. An interesting feature is that you don't need to program a single row of code to create particular interfaces: with the Gui Editor you can easily project all kind of professional interfaces and load them in your program using few functions. You don't need to code the widgets to make them work properly.  In this way you can save hours of programming.

There is a full list of widgets implemented:

- Button
- Radio button
- CheckBox
- Form
- Frame Window
- FrameBox
- PictureBox
- ScrollBar
- Scroll space
- TextBox
- ComboBox
- Menu
- ListView
- TreeView
- ToolBar
- Image button
- Graphic Api Viewer

Of course the GUI was coded in C++, it's object oriented and it's an integrative part of my Unrelated Framework.

Gianpaolo Ingegneri
Copyright @ 2010 – All right reserved

Unrelated Engine – Aquarium 3D

Aquarium 3D is a little demo of an engine that I'm developing for my framework. It uses a multithreading system with a thread for the physic engine and a thread that draws the graphics on the screen: the two threads are perfectly synchronized to maintain the best fluidity possible with different framerates.

The physic engine has a static number of iterations per second, in this case 30. It can obtain a good fluidity of movements also on higher fps of the graphics card (like 75 for example) upscaling the static framerate with a series of trajectory corrections. It uses also OpenGL for graphics,GLUT to open the window and lib3ds to import the 3d studio meshes. The fish models are property of this site:

Gianpaolo Ingegneri
Copyright @ 2010 – All right reserved

Ultra Fast Interpolation (via software)

In general, interpolation is a method used to construct a range of values from a set of data points. In digital image computing there are several methods of interpolation to improve the aspect of a transformed image but there is a problem: all of them are too slow to work via software in real time. Infact, we can see fast interpolations in 3d games only because they are performed via hardware by the graphic card (infact in the past it was very difficult to see an interpolation performed in real time).  However, interpolation is usefull  not only for 3d engines but they are an important part of digital image computing so there is a real need to develop a method to make it faster, expecially if you have to work with a large amount of images at the same time. For this reason I have programmed from scratch a set of  optimized algorithms of interpolation that are definite as in the past... but 100 times faster! The only limitation is that they can be used only for scale trasform but numerically they are perfect and faster at the same time.


Moreover, they are useful to generate procedural images in real time, like textures, that can be used in 3d engine or in some paint softwares where you cannot have a 3d card to speed up all the stuff. In this demo you can see the high performaces of different algorithms with images of 16 bit per pixel. To make sure that my interpolation is numerically perfect, I have included also the calculus of the normal map and the bump mapping effect. With biquadratic interpolation you will obtain a still perfect scaled bump map because the normal map calculus is derivative and the biquadratic is a second order reconstruction filter.

There are the performances on my computer:

Zoom 1x (on a 512x512 image)

- Nearest: 400 fps - Bilinear: 270 fps - Biquadratic: 80 fps

Zoom 4x (on a 512x512 image)

- Nearest: 965 fps - Bilinear: 780 fps - Biquadratic: 245 fps

Zoom 16x (on a 512x512 image)

- Nearest: 1405 fps - Bilinear: 1165 fps - Biquadratic: 390 fps

As you can see, the speed of the algorithm is directly proportional to the size of the zoom, however it is very fast also at the minimum size of 1x. This condition is very useful if you have to resize images or to generate textures with a large amount of stretched layers, like perlin noise. CPU: Intel 2 QuadCore 2333 Mhz; RAM: 4 GB DDRII 800Mhz

Gianpaolo Ingegneri
Copyright @ 2010 – All right reserved

CryEngine 3!

The version 3 of CryEngine is the porting of the version 2 for the world of consoles, supporting both Playstation 3 and Xbox360 (but still compatible with the PC). Check the following video to figure out the new features.

Texture Generation V1.0

This time it was a little bit harder. As you can see in this video, my engine can generate in real time very complex textures with the maximum detail at the maximum speed possible.

Each texture is generated in real time at the frame rate showed on the top-left corner of the window (the "generation" label).

- Lumps (512x512, 200 objs, 16 bpp, 345 FPS)
- Blobs (512x512, 16 bpp, 107 FPS)
- Perlin noise (512x512, 8 octaves, 16 bpp, 115 FPS)

Also, the normal map and the bump mapping are calculated in real time. I used 16 bit per pixel to have more precision when I get the normal map (for more informations check my High Static Range video). Precision and speed of this engine are awesome. The final result can loop in all horizontal and vertical directions.

CPU: Intel 2 QuadCore 2333 Mhz; RAM: 4 GB DDRII 800Mhz

©2009 Gianpaolo Ingegneri

High Static Range

This demo shows a new feature about the texture generation engine implemented in my Unrelated Framework. The normalmap used by bump mapping is calculated on the base of an heightmap generated with two different methods: Low Static Range and High Static Range. The enviroment bump mapping shows the difference in quality between the two ranges.

The first range uses only 8 bits per pixel that is too low to represent a continuous surface like the sphere generated in this example and as result we can see many rings on the surface during the light effect. On the contrary, an high range of 16 bits per pixel is perfect to have a light effect on a continuous surface infact we don't have any kind of imperfection. I could use the famous High Dynamic Range to generate the height map, but in my opinion it could be too expensive for the use of memory and the lack of speed (32 bits floating point per pixel, or 16 bits half-float has continuous conversions because cpu doesn't support it).

©2009 Gianpaolo Ingegneri