OpenGL ES 2.0 optimisations

Discussion in 'Public Game Developers Forum' started by GlennX, Sep 9, 2011.

  1. GlennX

    GlennX Well-Known Member

    May 10, 2009
    761
    0
    0
    UK
    #1 GlennX, Sep 9, 2011
    Last edited: Sep 10, 2011
    I posted a long blog about my experiences with switching to and optimising for OpenGLES2.0 HERE.

    Could be useful to some here.
     
  2. NinthNinja

    NinthNinja Well-Known Member

    Jan 31, 2011
    441
    0
    0
    #2 NinthNinja, Sep 9, 2011
    Last edited: Sep 9, 2011
    That's a very interesting read Glen :)

    I too found out about GL_STREAM_DRAW ;)

    There's a lot of misconception in what Apple says about updating VBOs. They recommend orphaning the buffer before you update it. What they forget to tell you is that GL driver bugs introduced in iOS 4.0 mean this method can kill the framerate on those drivers. The bugs were fixed in a later firmware, so your game will run great on the latest firmware, but people who never update will complain about sluggish speeds. The methods that work best with all GL drivers from iOS 4.0 onwards for OpenGL ES 2.0 are as follows.

    1. Double buffer your VBOs and VAOs.
    2. If you have one large buffer that acts as a dumping ground for everything with the same vertex format, it's best to update it with glBufferSubData. I found that the larger the buffer, the longer glUnmapBufferOES took to process; I think the GL driver does more work internally as the buffer grows. Using glBufferSubData to update regions of the buffer ended up quicker.
    3. If your buffer is smallish and is used for one specific thing, then glMapBufferOES and glUnmapBufferOES are the better choice.

    My vertex class for dynamic updates looks like this and caters for both update paths (glMapBufferOES and glBufferSubData):

    Code:
    
    // Call before writing vertices: picks where this frame's data is written.
    void Vertex::DynamicBegin()
    {
    	if(resetIndexCount) IndexCount = 0;
    	
    	if(useMapBuffer)
    	{
    		// Map path: write straight into the buffer object's storage.
    		glBindBuffer(GL_ARRAY_BUFFER, vbo_array_dynamic);
    		ActiveDynamicPtr = (char *)glMapBufferOES(GL_ARRAY_BUFFER, GL_WRITE_ONLY_OES);
    			
    		if(updateIndex)
    		{			
    			glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,vbo_element);
    			ActiveIndexPtr = (char *)glMapBufferOES(GL_ELEMENT_ARRAY_BUFFER, GL_WRITE_ONLY_OES);
    		}
    	}else{
    		// SubData path: write into CPU-side staging memory, uploaded in DynamicEnd().
    		ActiveDynamicPtr = DynamicVertsPtr;
    		if(updateIndex) ActiveIndexPtr = IndexPtr;				
    	}
    }
    
    /************************************************************************/
    
    // Call after writing vertices: unmaps the buffers or uploads the staged data.
    void Vertex::DynamicEnd(bool bindOff)
    {
    	if(isVBO)
    	{	
    		if(useMapBuffer)
    		{
    			// Map path: both buffers are still bound from DynamicBegin().
    			glUnmapBufferOES(GL_ARRAY_BUFFER);
    			if(bindOff) glBindBuffer(GL_ARRAY_BUFFER, 0);
    			
    			if(updateIndex)
    			{
    				glUnmapBufferOES(GL_ELEMENT_ARRAY_BUFFER);
    				if(bindOff) glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
    			}			
    		}else{
    			// SubData path: upload only the bytes written since DynamicBegin().
    			glBindBuffer(GL_ARRAY_BUFFER, vbo_array_dynamic);
    			// Subtract the char pointers first, then narrow. Casting each
    			// pointer to u32 would truncate addresses on a 64-bit build.
    			u32 size = (u32)(ActiveDynamicPtr - DynamicVertsPtr);
    			glBufferSubData(GL_ARRAY_BUFFER, 0, size, DynamicVertsPtr);
    			if(bindOff) glBindBuffer(GL_ARRAY_BUFFER, 0);
    			
    			if(updateIndex)
    			{
    				glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,vbo_element);
    				size = (u32)(ActiveIndexPtr - IndexPtr);
    				glBufferSubData(GL_ELEMENT_ARRAY_BUFFER, 0, size, IndexPtr);
    				if(bindOff) glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
    			}
    		}
    	}	
    }
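
    The double buffering from point 1 isn't part of the class above. A minimal sketch of the bookkeeping could look something like the following. DoubleBufferedVbo and its method names are made up for illustration, and the buffer names are plain integers here (in a real renderer they would come from glGenBuffers) so the sketch compiles without a GL context.

```cpp
#include <cstdint>
#include <cstddef>

// Hypothetical helper: alternates between two buffer object names so the
// CPU is not writing into the buffer the GPU may still be reading from.
// The ids are plain integers in this sketch; a real renderer would get
// them from glGenBuffers.
class DoubleBufferedVbo
{
public:
    DoubleBufferedVbo(std::uint32_t first, std::uint32_t second)
        : ids_{first, second}, writeIndex_(0) {}

    // Buffer to fill this frame (bind it, then map it or use glBufferSubData).
    std::uint32_t writeId() const { return ids_[writeIndex_]; }

    // Buffer filled on the previous frame, possibly still in use by the GPU.
    std::uint32_t previousId() const { return ids_[writeIndex_ ^ 1u]; }

    // Call once per frame, after the draw calls have been issued.
    void swap() { writeIndex_ ^= 1u; }

private:
    std::uint32_t ids_[2];
    std::size_t writeIndex_;
};
```

    The idea would be to pass writeId() to glBindBuffer in place of a single vbo_array_dynamic, and call swap() at the end of each frame.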
    
    
    
    Another way to gain speed on vertex submission is to include a w component in your position data and put a scale in your view matrix, and to use packed formats. One of my structures shows what I mean:

    struct Vertex1
    {
    GLshort x,y,z,w;
    u32 colour;
    GLshort u,v;
    };

    Which packs down to 16 bytes :)
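
    As a rough sketch of how the packing and the matrix scale fit together: the struct below mirrors Vertex1 but uses fixed-width types in place of GLshort and u32 so it compiles without GL headers. The quantise and dequantiseScale helpers and the "range" parameter are assumptions for illustration, not the original engine code.

```cpp
#include <cstdint>
#include <cstddef>
#include <cmath>

// Same layout as Vertex1, with fixed-width types standing in for
// GLshort and u32 (an assumption so this builds without GL headers).
struct PackedVertex
{
    std::int16_t x, y, z, w;   // position quantised to 16 bits, w pads it to 8 bytes
    std::uint32_t colour;      // packed 8-bit-per-channel colour
    std::int16_t u, v;         // texture coordinates
};

static_assert(sizeof(PackedVertex) == 16, "vertex should pack to 16 bytes");

// Hypothetical quantiser: maps [-range, +range] onto [-32767, 32767],
// clamping anything outside that range.
inline std::int16_t quantise(float value, float range)
{
    float scaled = value / range * 32767.0f;
    if (scaled > 32767.0f) scaled = 32767.0f;
    if (scaled < -32767.0f) scaled = -32767.0f;
    return static_cast<std::int16_t>(std::lround(scaled));
}

// Scale to fold into the view matrix so the shader gets world units
// back from the raw (non-normalised) short values.
inline float dequantiseScale(float range)
{
    return range / 32767.0f;
}
```

    With this layout w would typically be set to 1 so translation still works, and the position attribute stays 8 bytes wide.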


    When I was involved with True Axis I spent a lot of time optimizing Space Tripper. When I got the code converted to run on iOS I was really disappointed because it ran at 12 to 15 fps. When I left in January this year the game was running at 60fps. It just goes to show what can be done if you really go in depth into this stuff. All that knowledge was used for Jet Car Stunts to get a super smooth update with no framerate variance.

    Anyway, from my extensive testing, I've found a few things that matter for framerates.

    1. Vertex submission to the GL drivers is very important.
    2. Using low precision for as many shader variables as possible is very important on the 3GS, iPhone 4, and iPad 1.
    3. The iPad 2 is a different story - if you build your 3D engine with all the best practices then that thing flies with so much power.
     
  3. GlennX

    Hi Andy, you should really get a namecheck in that blog as most searches on GL optimisations on the Apple forums throw up a thread where you are a major contributor!

    Whatever happened to Spacetripper? That could have been huge if it'd been released when you first showed a 30 FPS video two years ago.
     
  4. NinthNinja

    When iOS 4.0 came out Apple added some very handy things for stopping the gl driver from copying the z buffer and other buffers back to the CPU side. If you are not using these new functions then I recommend using them. To show you what I mean:

    Code:
    		if(multisampled)
    		{			
    			glBindFramebufferOES(GL_FRAMEBUFFER_OES, msaaFramebuffer);
    			
    			Render();
    			
    			glBindFramebufferOES(GL_READ_FRAMEBUFFER_APPLE, msaaFramebuffer);
    			glBindFramebufferOES(GL_DRAW_FRAMEBUFFER_APPLE, resolveFramebuffer);
    			glResolveMultisampleFramebufferAPPLE();
    			
    			GLenum attachments[] = { GL_COLOR_ATTACHMENT0_OES, GL_DEPTH_ATTACHMENT_OES};
    			glDiscardFramebufferEXT(GL_READ_FRAMEBUFFER_APPLE, 2, attachments);
    			
    			glBindRenderbufferOES(GL_RENDERBUFFER_OES, resolveColorbuffer);
    			[context presentRenderbuffer:GL_RENDERBUFFER_OES];
    		}else{
    			Render();
    			
    			if(discardFramebufferSupported)
    			{
    				GLenum attachments[] = { GL_DEPTH_ATTACHMENT_OES};
    				glDiscardFramebufferEXT(GL_FRAMEBUFFER_OES, 1, attachments);
    			}
    			[context presentRenderbuffer:GL_RENDERBUFFER_OES];
    		}
    
    This gains a lot of speed back :)
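
    The code above relies on a discardFramebufferSupported flag. One way that flag might be computed is by looking for GL_EXT_discard_framebuffer in the extension string. The hasExtension helper below is a sketch written for this example, not part of the original code; it takes the string as a parameter so it can be tried without a GL context.

```cpp
#include <cstring>

// Returns true if `name` appears as a whole space-separated token in
// `extensions`, which is expected to be in the format returned by
// glGetString(GL_EXTENSIONS). A plain strstr alone could match a prefix
// of a longer extension name, hence the boundary checks.
inline bool hasExtension(const char* extensions, const char* name)
{
    if (!extensions || !name || *name == '\0') return false;

    const std::size_t nameLen = std::strlen(name);
    const char* p = extensions;
    while ((p = std::strstr(p, name)) != nullptr)
    {
        const bool startOk = (p == extensions) || (p[-1] == ' ');
        const char after = p[nameLen];
        const bool endOk = (after == ' ') || (after == '\0');
        if (startOk && endOk) return true;
        p += nameLen;   // keep searching past this partial match
    }
    return false;
}
```

    At startup this could be called once, for example as hasExtension((const char *)glGetString(GL_EXTENSIONS), "GL_EXT_discard_framebuffer"), with the result stored in discardFramebufferSupported.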
     
  5. NinthNinja

    I think Space Tripper is coming out - Luke is currently finishing it up. I got terribly burnt out from doing all the updates for Jet Car Stunts and getting pissed off with iOS firmware updates that broke Space Tripper, especially the ones that made the accelerometer updates so sluggish that the game was unplayable. But it seems Apple fixed those problems after I left True Axis, so Luke decided to finish it off... I think Luke posted that it will be submitted sometime in Oct.

    I really should write up an article on how to optimize for OpenGL ES 1.1 and 2.0 on iOS, from 1st gen hardware to the latest and greatest ;) There are so many tricks I learnt that no one knows about. Plus it would cover the GL driver bugs on each particular firmware and what to avoid in those cases. But I guess it's a matter of finding the time to do that.

    I'm actually working on an iPad project at the moment with a framerate set at 60fps. Very interesting stuff going on there but it's too early to go into details yet. I'm pretty happy with how it's turning out :)

    I'm loving your new stuff Glen - it looks stunning!
     
  6. GlennX

    I'd been running with the framebuffer discard for a while, though I only added MSAA a few days ago. I'm a bit crap at finding info. In the end I got most of it from watching a few of last year's WWDC videos (well worth downloading on iTunesU) and from searches in the Apple dev forums.
     
  7. blitter

    blitter Well-Known Member

    Great thread!

    GL_STATIC_DRAW and a bunch of 'handy' glUniformMatrix3fv() here...wish I could talk more about it :)
     
  8. blitter

    I'll get my coat!
     
  9. nvx

    nvx Well-Known Member

    Jan 7, 2011
    195
    0
    0
    UK
    Interesting blog Glen, I am in the process of updating our engine to use GLES2 (yea kinda late, but better than never) so these tidbits of info are very timely :cool:

    That would be fantastic and very much appreciated :D
     
