Hi, I read that the iPhone's OpenGL ES implementation of matrix multiplication is very slow and that its VBO implementation yields no significant improvement in performance. Sorry, but I lost the references to the articles.
My question is: assuming these shortcomings are real, how do I optimize matrix multiplication, and is there a workaround for the VBO inadequacy?
I just downloaded the "Optimizing OpenGL for iPhone" lecture from Stanford University (you will find it at iTunes U), where Tim Omernick from ngmoco talked a lot about batching all drawing operations into one call. It however requires you to do all the matrix multiplications such as rotations and translations manually on the vertices instead of changing GL state with glTranslate and glRotate for each object.
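To make the "manual" approach concrete, here is a minimal sketch of what transforming vertices on the CPU and appending them to one shared array might look like. The `Vec3` and `Batch` types and the `batch_append` function are hypothetical names made up for illustration, not code from the lecture. It assumes an affine, column-major 4x4 matrix (the layout OpenGL uses), so the w row is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
typedef struct { float x, y, z; } Vec3;

typedef struct {
    Vec3  *verts;     /* caller-provided storage for the whole batch */
    size_t count;     /* vertices written so far */
    size_t capacity;  /* total room in verts, in vertices */
} Batch;

/* Transform n vertices by the column-major 4x4 matrix m (translation in
   m[12..14]) and append them to the batch. Assumes m is affine, so the
   bottom row is not used. Returns 1 on success, 0 if there is no room. */
int batch_append(Batch *b, const Vec3 *mesh, size_t n, const float m[16])
{
    assert(b != NULL && mesh != NULL && m != NULL);
    if (n > b->capacity - b->count) return 0;
    for (size_t i = 0; i < n; ++i) {
        Vec3 v = mesh[i];
        Vec3 *out = &b->verts[b->count + i];
        out->x = m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12];
        out->y = m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13];
        out->z = m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14];
    }
    b->count += n;
    return 1;
}
```

Once every object has been appended with its own matrix, the whole batch could be drawn with one `glVertexPointer(3, GL_FLOAT, 0, batch.verts)` followed by one `glDrawArrays(GL_TRIANGLES, 0, batch.count)`, instead of a glTranslate/glRotate/glDrawArrays() sequence per object.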
I find it weird that it would cost less to do all the matrix multiplications manually on the vertices instead of letting OpenGL take care of it. Perhaps it depends on how complex the mesh is? Tim Omernick showed that it is better to combine many simple objects into one big array and calculate all the translations and rotations manually, than using glTranslate/glRotate operations and calling glDrawArrays() for each object. I wonder if that is still the case for complex objects? If you have a few but very complex mesh objects, you will also have few glTranslate/glRotate and glDrawArrays() calls for each frame, in which case it might be better to let OpenGL do it all?
Does anyone have any insight into this? Should we always combine all the objects into one big array and make just one glDrawArrays() call per frame, or does it depend on the complexity of the scene?
It depends... the old iPhone has an inefficient implementation that needlessly copies the vertex data on each draw call, but I wouldn't go as far as calculating everything on the CPU and then issuing a single draw call.
The key optimization for the old iPhone is to minimize the size of your vertex data, which means turning floats (UVs/positions) into shorts etc., and, if possible, using PVRTC compression for textures.
On the other hand, if you are doing 2D (quads), then combining as many of them as possible into one big array obviously makes perfect sense, since you don't want to issue a separate batch for every quad (4 vertices).
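As a rough sketch of the float-to-short idea, the helper below packs float components into 16-bit integers, halving the bytes per component. The function name `quantize_to_shorts` and the `max_abs` range parameter are assumptions made for illustration; a real mesh would pick the range from its own bounding box.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert n floats, expected to lie in [-max_abs, +max_abs], to 16-bit
   integers. Out-of-range inputs are clamped. Returns the factor to multiply
   the shorts by to get back (approximately) the original values. */
float quantize_to_shorts(const float *in, int16_t *out, size_t n, float max_abs)
{
    assert(in != NULL && out != NULL && max_abs > 0.0f);
    float to_short = 32767.0f / max_abs;
    for (size_t i = 0; i < n; ++i) {
        float v = in[i];
        if (v >  max_abs) v =  max_abs;
        if (v < -max_abs) v = -max_abs;
        float s = v * to_short;
        /* round to nearest; the cast itself truncates toward zero */
        out[i] = (int16_t)(s >= 0.0f ? s + 0.5f : s - 0.5f);
    }
    return max_abs / 32767.0f;
}
```

The shorts can then be passed with `GL_SHORT` as the type argument of `glVertexPointer`, and the returned factor applied once through `glScalef` on the modelview matrix so the geometry renders at its original size. The price is precision: positions snap to a grid of 65535 steps across the chosen range.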
I generally combine all static objects into a single renderable (calculating their world coordinates once) but keep all movable objects as separate geometry.
This only applies to OpenGL ES 1.x. The latest iPhone fully supports VBOs and overall behaves quite differently in terms of performance.
I generally combine all static objects into a single renderable (calculating their world coordinates once) but keep all movable objects as separate geometry.
Ditto. I'm working in 2D, and I had no problem drawing 50-100 sprites to the screen using separate draw calls. Drawing a 16x16 tilemap the same way was unfeasible, however; 256 draw calls was obviously excessive. So now I create the tilemap as a single piece of geometry, but still draw the sprites separately, and everything is back in butter.
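For anyone wondering what "a single piece of geometry" could look like for a tilemap, here is a minimal sketch that writes two triangles per tile into one flat array of 2D positions. The function `build_tilemap` and its parameters are made-up names, and texture coordinates are left out to keep it short; they would be filled in the same loop from each tile's index.

```c
#include <assert.h>
#include <stddef.h>

enum { FLOATS_PER_VERT = 2, FLOATS_PER_TILE = 12 };  /* 6 vertices * (x, y) */

/* Fill `out` with two triangles per tile for a cols x rows grid of square
   tiles, row by row starting at the origin. `out` must have room for
   cols * rows * FLOATS_PER_TILE floats. Returns the number of vertices. */
size_t build_tilemap(float *out, int cols, int rows, float tile_size)
{
    assert(out != NULL && cols > 0 && rows > 0);
    size_t k = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            float x0 = c * tile_size, y0 = r * tile_size;
            float x1 = x0 + tile_size, y1 = y0 + tile_size;
            const float quad[FLOATS_PER_TILE] = {
                x0, y0,  x1, y0,  x0, y1,   /* first triangle  */
                x1, y0,  x1, y1,  x0, y1    /* second triangle */
            };
            for (int i = 0; i < FLOATS_PER_TILE; ++i) out[k++] = quad[i];
        }
    }
    return k / FLOATS_PER_VERT;
}
```

For a 16x16 map this yields 1536 vertices, which can go to the GPU through one `glVertexPointer(2, GL_FLOAT, 0, buf)` and one `glDrawArrays(GL_TRIANGLES, 0, 1536)` call rather than 256 separate draws.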