I want to create this post to clarify once and for all how the OpenGL extensions mechanism works and the correct proceedings to target OpenGL versions. I named this article in this way because OpenGL are generally bad documented (or difficult to understand) and OpenGL.org wiki makes the things worse. For example, several people got confused by this page:
These are useful extensions when targeting GL 2.1 hardware. Note that many of the above extensions are also available, if the hardware is still being supported. These represent non-hardware extensions introduced after 2.1, or hardware features not exposed by 2.1's API. Most 2.1 hardware that is still being supported by its maker will provide these, given recent drivers.
(1) Why don't the new tokens and entry points in this extension have "ARB" suffixes like other ARB extensions?
RESOLVED: Unlike a normal ARB extension, this is a strict subset of functionality already approved in OpenGL 3.0. This extension exists only to support that functionality on older hardware that cannot implement a full OpenGL 3.0 driver. Since there are no possible behavior changes between the ARB extension and core features, source code compatibility is improved by not using suffixes on the extension."
so the question is:
- GL_ARB_map_buffer_range is a core extension or not?
In the previous article I emphasized the importance of not having a third-party loading library like glew because OpenGL is too complex and unpredictible. For example, if you want to implement a videogame with an average graphics and a large audience of users, probably OpenGL 2.1 is enough. At this point, you may need to load only that part of the library and make the right check of the extensions or just use the functions that have been promoted to the core of the current version. Remember that an extension is not guaranteed to be present on that version of OpenGL if it's not a core feature and this kind of extensions has been introduced after 3.0 to maintain the forward compatibility.
For instance, it's useful to check the extension GL_ARB_vertex_buffer_object only on OpenGL 1.4 (in that case you may want to use glBindBufferARB instead of glBindBuffer) but not on superior versions because it has been promoted to the core from the version 1.5 onward. The same applies to other versions of the core and extensions. If you target OpenGL 2.1, you have to be sure that the extensions tipically used by 2.1 applications have not been promoted to the latest OpenGL 4.5 version and to check the extenions on previous versions of the library, making sure to use the appropriate vendor prefix, like ARB. Even if with glew you can make this kind of check before using the loaded functions, I don't recommend it because glewInit() is going to load also parts that you don't want to use and you run the risk to understimate the importance of checking the capabilities.
Anyway, reading the OpenGL spec and add manually the required extensions is a time expensive job that you may don't have the time to do. Recently, the Khronos group has released an xml file where there is a detailed description of the extensions and the functions for every version of the library, it is also used to generate the gl.h and the glext.h header files with a script in Python. In the same way, you can program a script that parses the gl.xml file to generate your own loading library, making the appropriate check of the extensions and including only the part that you really need to load on your project. You can find the gl.xml file here:
OpenGL is not so easy to use. The API exposes thousand of functions that are grouped into extensions and core features that you have to check for every single display driver release or the 3D application may not work. Since OpenGL is a graphics library used to program cool gfx effects without a serious knowledge of the underlying display driver, a large range of developers is tempted to use it regardless of the technical problems. For example, the functions are loaded "automagically" by an external loading library (like glew) and they are used to produce the desired effect, pretending that they are available everywhere. Of course this is totally wrong because OpenGL is scattered into dozens of extensions and core features that are linked to the "target" version that you want to support. Loading libraries like glew are dangerous because they try to load all the available OpenGL functions implemented by the display driver without making a proper check, giving you the illusion that the problem doesn't exist. The main problem with this approach is that you cannot develop a good OpenGL application without taking the following decision:
- How much OpenGL versions and extensions I have to support?
From this choice you can define the graphics aspect of the application and how to scale it to support a large range of display drivers, including the physical hardware and the driver supported by the virtual machines. For example, VirtualBox with guest addictions uses chromium 1.9 that comes with OpenGL 2.1 and GLSL 1.20, so your application won't start if you programmed it using OpenGL 4.5, or even worse you won't start also on graphics cards that support maximum the version 4.4 (that is very recent). For this reason, it's necessary to have a full awareness of the OpenGL scalability principles that must be applied to start on most of the available graphics cards, reducing or improving the graphics quality on the base of the available version that you decided to target. With this level of awareness, you will realize that you don't need any kind of loading library to use OpenGL, but only a good check of the available features, that you can program by yourself. Moreover, libraries like glew are the worst because they are implemented to replace the official gl.h and glext.h header files with a custom version anchored to the OpenGL version supported by that particular glew version.
Even if nowadays everybody seems to drop OpenGL methods when they are deprecated on the core profile, it doesn't mean that you don't need to use them in compatibity profile or that you don't want to know how they work. I searched on the web to find more information on how the old and deprecated OpenGL matrices are implemented and I didn't find anything (except tutorials on how to use them!). My doubt was mainly about the operations order, because I needed to make a C++ implementation of them, maintaining the same exact behavior. I used OpenGL matrices In the past without worrying about how they were implemented, I had a precise idea but now I have to be 100% sure. Even if we know how to implement operations between matrices, the row-column product doesn't have the commutative property so the internal implementation can make the difference. At the end, my question is:
- What is the matrix row-column order and how the product is implemented on OpenGL?
Tired of finding pages saying how they are useless and deprecated now, I had to check by myself the Mesa source code to find what I was searching for:
where A and B are 4x4 matrices and P is the result of the product. As you can see, this snippet clarifies how rows and columns are internally ordered and how the product is implemented. In conclusion, the OpenGL methods to modify the current matrix are implemented by Mesa in this way:
I tried to implement the explicit multisample antialiasing and I got good results, but it's slow on a GeForce 9600GT. A scene of 110 fps became 45 fps with only four samples, just to point out the slow down. While I was jumping to the ceiling for the amazing image quality of a REAL antialiasing with deferred shading (not the fake crap called FXAA) I fell down to the floor after I seen the fps, what a shame.
Anyway, I decided to change from a deferred shading to a deferred lighting model just to implement a good trick in order to use the classic multisample (that in my card can do pretty well also with 16 samples!) reading from the light accumulation buffer in the final step and writing the geometry to the screen with the antialiasing enabled. The result is a little weird, but you can fix it by using that crap fxaa on the light accumulation buffer which is smoother than the other image components. For example, I can use: a mipmapping or anisotropic filtering to eliminate the texture map aliasing, a FXAA to eliminate the light accumulation buffer aliasing and finally a MSAA to eliminate the geometry aliasing.
ps: I used the nanosuit model from this site: www.gfx-3d-model.com/2009/09/nanosuit-3d-model/
TSREditor is a huge editor in the style of Blender3D projected to create or edit the resources like textures, 3d models, sound and levels for games and other stuff.
In spite of the large amount of work needed to reach a decent version of this software, I decided that it will be free as a part of my Unrelated Framework. In this moment I'm far from a decent beta version to release, but I can show you two nice screenshots of the program at work.
This is the first "work in progress" video of my 3d engine called Unrelated Engine. Some complex animated models come from Doom 3. They were converted to a maximum of 4 weights per vertex as well as the identical vertices have been cancelled to improve the speed. The shader language was used to obtain a large amount of skinned meshes and complex materials with a reasonable speed.
The 3d models are from doom3 and from http://www.models-resource.com/, they were used only to test my engine and to make this video. The rights of these 3d models and the music are reserved by their respective authors.
I created a nice video with an engine that I'm still developing and that is part of my Unrelated Framework. It uses:
- OpenGL (to draw the graphics)
- DevIL (to import images)
- Assimp (to import 3d models)
The 3d models are from http://www.models-resource.com/, they were used only to test my engine and to make this video. The rights of these 3d models and the music are reserved by their respective authors.
OpenIL is a library with very powerful image loading capabilities and I decided to include it in my framework. My standard can handle several image formats like the classic 8 bit RGB or more advanced formats like 16 bit RGB or High Dynamic Range. The images are imported from file and used maintaining the original format (if it's possible). For other feature check OpenIL official web site (http://openil.sourceforge.net/)
This is an example of how it works in my framework:
C_Image *newImage = new C_Image();
//load from file
//export to file
//set file format jpeg compression
newImage->SetFileFormat(UF_FileFormat_JPG(50)); //the image will be saved in my format with a jpeg quality of 50
Copyright @ 2010 – All right reserved
Finally I completed the GUI of my framework. The best feature is that the Gui can work in two modes: via software or using the OpenGL. It's very helpful for cross platform compatibility, for video games or other OpenGL purpose.
The Gui is in his first version but it has all the widgets necessary to create professional applications. An interesting feature is that you don't need to program a single row of code to create particular interfaces: with the Gui Editor you can easily project all kind of professional interfaces and load them in your program using few functions. You don't need to code the widgets to make them work properly. In this way you can save hours of programming.
There is a full list of widgets implemented:
- Radio button
- Frame Window
- Scroll space
- Image button
- Graphic Api Viewer
Of course the GUI was coded in C++, it's object oriented and it's an integrative part of my Unrelated Framework.
Copyright @ 2010 – All right reserved
Aquarium 3D is a little demo of an engine that I'm developing for my framework. It uses a multithreading system with a thread for the physic engine and a thread that draws the graphics on the screen: the two threads are perfectly synchronized to maintain the best fluidity possible with different framerates.
The physic engine has a static number of iterations per second, in this case 30. It can obtain a good fluidity of movements also on higher fps of the graphics card (like 75 for example) upscaling the static framerate with a series of trajectory corrections. It uses also OpenGL for graphics,GLUT to open the window and lib3ds to import the 3d studio meshes. The fish models are property of this site: http://toucan.web.infoseek.co.jp/3DCG/3ds/FishModelsE.html
Copyright @ 2010 – All right reserved
In general, interpolation is a method used to construct a range of values from a set of data points. In digital image computing there are several methods of interpolation to improve the aspect of a transformed image but there is a problem: all of them are too slow to work via software in real time. Infact, we can see fast interpolations in 3d games only because they are performed via hardware by the graphic card (infact in the past it was very difficult to see an interpolation performed in real time). However, interpolation is usefull not only for 3d engines but they are an important part of digital image computing so there is a real need to develop a method to make it faster, expecially if you have to work with a large amount of images at the same time. For this reason I have programmed from scratch a set of optimized algorithms of interpolation that are definite as in the past... but 100 times faster! The only limitation is that they can be used only for scale trasform but numerically they are perfect and faster at the same time.
Moreover, they are useful to generate procedural images in real time, like textures, that can be used in 3d engine or in some paint softwares where you cannot have a 3d card to speed up all the stuff. In this demo you can see the high performaces of different algorithms with images of 16 bit per pixel. To make sure that my interpolation is numerically perfect, I have included also the calculus of the normal map and the bump mapping effect. With biquadratic interpolation you will obtain a still perfect scaled bump map because the normal map calculus is derivative and the biquadratic is a second order reconstruction filter.
As you can see, the speed of the algorithm is directly proportional to the size of the zoom, however it is very fast also at the minimum size of 1x. This condition is very useful if you have to resize images or to generate textures with a large amount of stretched layers, like perlin noise. CPU: Intel 2 QuadCore 2333 Mhz; RAM: 4 GB DDRII 800Mhz
Copyright @ 2010 – All right reserved
The version 3 of CryEngine is the porting of the version 2 for the world of consoles, supporting both Playstation 3 and Xbox360 (but still compatible with the PC). Check the following video to figure out the new features.
Also, the normal map and the bump mapping are calculated in real time. I used 16 bit per pixel to have more precision when I get the normal map (for more informations check my High Static Range video). Precision and speed of this engine are awesome. The final result can loop in all horizontal and vertical directions.
This demo shows a new feature about the texture generation engine implemented in my Unrelated Framework. The normalmap used by bump mapping is calculated on the base of an heightmap generated with two different methods: Low Static Range and High Static Range. The enviroment bump mapping shows the difference in quality between the two ranges.
The first range uses only 8 bits per pixel that is too low to represent a continuous surface like the sphere generated in this example and as result we can see many rings on the surface during the light effect. On the contrary, an high range of 16 bits per pixel is perfect to have a light effect on a continuous surface infact we don't have any kind of imperfection. I could use the famous High Dynamic Range to generate the height map, but in my opinion it could be too expensive for the use of memory and the lack of speed (32 bits floating point per pixel, or 16 bits half-float has continuous conversions because cpu doesn't support it).