OpenGLES2.0 optimisations

09-09-2011, 08:25 AM
#1
Joined: May 2009
Location: UK
Posts: 741

I posted a long blog post about my experiences switching to, and optimising for, OpenGL ES 2.0 HERE.

Could be useful to some here.

-=< Fat Owl With A Jetpack >=-
-=< Topia World Builder >=-
-=< Twitter >=-
-=< Blog >=-

Last edited by GlennX; 09-10-2011 at 05:04 AM.
09-09-2011, 09:43 AM
#2
Joined: Jan 2011
Posts: 425
That's a very interesting read, Glen.

I too found out about GL_STREAM_DRAW.

There's a lot of misconception in what Apple says about updating VBOs. They recommend orphaning the buffer before you update it. What they forget to tell you is that GL driver bugs introduced in iOS 4.0 mean this method can kill the framerate on those drivers; the bugs were fixed in a later firmware. So your game will run great on the latest firmware, but people who never update will complain about sluggish speeds. The best methods I've found that work with all GL drivers from iOS 4.0 onwards for OpenGL ES 2.0 are:

1. Double buffer your VBOs and VAOs.
2. If you have one large buffer that acts as a dumping ground for everything with the same vertex format, it's best to update it with glBufferSubData. I found that the larger the buffer, the longer glUnmapBufferOES took to process; I think the GL driver internally does more work the bigger the buffer is. Using glBufferSubData to update regions of the buffer ended up quicker.
3. If your buffer is smallish and used for one specific thing, then glMapBufferOES and glUnmapBufferOES are the better choice.
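
A minimal sketch of the double-buffering idea in point 1, assuming the buffer objects have already been created with glGenBuffers. BufferRing and its method names are hypothetical and not taken from any engine; the sketch only tracks which buffer is safe to write each frame, and the GL calls are left as comments so it stays self-contained.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: rotates between N buffer object names so the CPU
// does not write into the buffer that was submitted on the previous frame.
template <std::size_t N>
class BufferRing {
public:
    explicit BufferRing(const std::array<std::uint32_t, N>& names)
        : names_(names), write_(0) {}

    // Name of the buffer to update this frame.
    std::uint32_t writeBuffer() const { return names_[write_]; }

    // Name of the buffer that was updated on the previous frame.
    std::uint32_t previousBuffer() const { return names_[(write_ + N - 1) % N]; }

    // Call once per frame, after the draw calls have been issued.
    void advance() { write_ = (write_ + 1) % N; }

private:
    std::array<std::uint32_t, N> names_;
    std::size_t write_;
};

// Per frame, roughly:
//   glBindBuffer(GL_ARRAY_BUFFER, ring.writeBuffer());
//   glBufferSubData(GL_ARRAY_BUFFER, 0, bytesUsed, cpuSideVerts);
//   ... draw ...
//   ring.advance();
```

With N = 2 this is plain double buffering. If VAOs are in use, one VAO per buffer would be rotated in the same way.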

My vertex class for dynamic updates looks like this and caters for both methods:

Code:
void Vertex::DynamicBegin()
{
	if(resetIndexCount) IndexCount = 0;
	
	if(useMapBuffer)
	{
		glBindBuffer(GL_ARRAY_BUFFER, vbo_array_dynamic);
		ActiveDynamicPtr = (char *)glMapBufferOES(GL_ARRAY_BUFFER, GL_WRITE_ONLY_OES);
			
		if(updateIndex)
		{			
			glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,vbo_element);
			ActiveIndexPtr = (char *)glMapBufferOES(GL_ELEMENT_ARRAY_BUFFER, GL_WRITE_ONLY_OES);
		}
	}else{
		ActiveDynamicPtr = DynamicVertsPtr;
		if(updateIndex) ActiveIndexPtr = IndexPtr;				
	}
}

/************************************************************************/

void Vertex::DynamicEnd(bool bindOff)
{
	if(isVBO)
	{	
		if(useMapBuffer)
		{
			glUnmapBufferOES(GL_ARRAY_BUFFER);
			if(bindOff) glBindBuffer(GL_ARRAY_BUFFER, 0);
			
			if(updateIndex)
			{
				glUnmapBufferOES(GL_ELEMENT_ARRAY_BUFFER);
				if(bindOff) glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
			}			
		}else{
			glBindBuffer(GL_ARRAY_BUFFER, vbo_array_dynamic);
			// Bytes written this update: subtract the char pointers directly instead of casting each one to u32.
			u32 size = (u32)(ActiveDynamicPtr - DynamicVertsPtr);
			glBufferSubData(GL_ARRAY_BUFFER, 0, size, DynamicVertsPtr);
			if(bindOff) glBindBuffer(GL_ARRAY_BUFFER, 0);
			
			if(updateIndex)
			{
				glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,vbo_element);
				// Same pointer-difference calculation for the index data.
				size = (u32)(ActiveIndexPtr - IndexPtr);
				glBufferSubData(GL_ELEMENT_ARRAY_BUFFER, 0, size, IndexPtr);
				if(bindOff) glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
			}
		}
	}	
}
Another way to gain speed on vertex submission is to include a w component in your position data and put a scale in your view matrix, and to use packed formats. One of my structures, to show what I mean:

struct Vertex1
{
GLshort x,y,z,w;
u32 colour;
GLshort u,v;
};

This packs down to 16 bytes.
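
As a rough sketch of how positions might be packed into a layout like this: the struct below mirrors the one above using fixed-width types, and toFixed is a hypothetical helper that converts a float to a 16-bit value with a chosen scale. The names and the scale of 256 used in the note are assumptions for illustration, not the actual engine code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Same layout as the structure above, written with fixed-width types.
struct PackedVertex {
    std::int16_t x, y, z, w;  // position as 16-bit fixed point
    std::uint32_t colour;     // four 8-bit colour channels
    std::int16_t u, v;        // texture coordinates as 16-bit fixed point
};

// 8 bytes of position + 4 bytes of colour + 4 bytes of UVs.
static_assert(sizeof(PackedVertex) == 16, "PackedVertex should be 16 bytes");

// Hypothetical helper: scale a float, round it, and clamp it to the int16 range.
inline std::int16_t toFixed(float value, float scale) {
    const float scaled = std::round(value * scale);
    const float clamped = std::max(-32768.0f, std::min(32767.0f, scaled));
    return static_cast<std::int16_t>(clamped);
}
```

If positions are packed with a scale of 256, the matching 1/256 scale goes into the view matrix so the shader sees the original units. The position attribute would then be declared as four GL_SHORT components with a stride of sizeof(PackedVertex).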


When I was involved with True Axis I spent a lot of time optimizing Space Tripper. When I first got the code converted to run on iOS I was really disappointed because it ran at 12 to 15 fps. When I left in January this year the game was running at 60fps. It just goes to show what can be done if you really go in depth into this stuff. All that knowledge was used for Jet Car Stunts to get a super smooth update with no framerate variance.

Anyway, from my extensive testing, I've found a few things that matter for framerates.

1. Vertex submission to the GL drivers is very important.
2. Using as low a precision as you can get away with in shader variables is very important for the 3GS, iPhone 4, and iPad 1.
3. The iPad 2 is a different story - if you build your 3D engine with all the best practices then that thing flies with so much power.

Last edited by NinthNinja; 09-09-2011 at 09:46 AM.

09-09-2011, 10:07 AM
#3
Joined: May 2009
Location: UK
Posts: 741
Hi Andy, you should really get a namecheck in that blog as most searches on GL optimisations on the Apple forums throw up a thread where you are a major contributor!

Whatever happened to Spacetripper? That could have been huge if it'd been released when you first showed a 30 FPS video two years ago.

09-09-2011, 10:12 AM
#4
Joined: Jan 2011
Posts: 425
When iOS 4.0 came out Apple added some very handy things for stopping the gl driver from copying the z buffer and other buffers back to the CPU side. If you are not using these new functions then I recommend using them. To show you what I mean:

Code:
		if(multisampled)
		{			
			glBindFramebufferOES(GL_FRAMEBUFFER_OES, msaaFramebuffer);
			
			Render();
			
			glBindFramebufferOES(GL_READ_FRAMEBUFFER_APPLE, msaaFramebuffer);
			glBindFramebufferOES(GL_DRAW_FRAMEBUFFER_APPLE, resolveFramebuffer);
			glResolveMultisampleFramebufferAPPLE();
			
			GLenum attachments[] = { GL_COLOR_ATTACHMENT0_OES, GL_DEPTH_ATTACHMENT_OES};
			glDiscardFramebufferEXT(GL_READ_FRAMEBUFFER_APPLE, 2, attachments);
			
			glBindRenderbufferOES(GL_RENDERBUFFER_OES, resolveColorbuffer);
			[context presentRenderbuffer:GL_RENDERBUFFER_OES];
		}else{
			Render();
			
			if(discardFramebufferSupported)
			{
				GLenum attachments[] = { GL_DEPTH_ATTACHMENT_OES};
				glDiscardFramebufferEXT(GL_FRAMEBUFFER_OES, 1, attachments);
			}
			[context presentRenderbuffer:GL_RENDERBUFFER_OES];
		}
This gains a lot of speed back
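
The discardFramebufferSupported flag above has to be set somewhere. One way to do it, sketched here on the assumption that you pass in the string returned by glGetString(GL_EXTENSIONS), is a whole-token search of the extension list. hasExtension is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstring>

// Returns true if `name` appears as a complete token in the
// space-separated extension list `extensions`.
inline bool hasExtension(const char* extensions, const char* name) {
    if (!extensions || !name || !*name) return false;

    const std::size_t len = std::strlen(name);
    const char* p = extensions;
    while ((p = std::strstr(p, name)) != nullptr) {
        // Reject matches that are only part of a longer extension name.
        const bool startOk = (p == extensions) || (p[-1] == ' ');
        const bool endOk = (p[len] == ' ') || (p[len] == '\0');
        if (startOk && endOk) return true;
        p += len;
    }
    return false;
}
```

At startup this could be called with "GL_EXT_discard_framebuffer" for the discard path and "GL_APPLE_framebuffer_multisample" for the MSAA resolve path.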
09-09-2011, 10:27 AM
#5
Joined: Jan 2011
Posts: 425
Quote:
Originally Posted by GlennX
Hi Andy, you should really get a namecheck in that blog as most searches on GL optimisations on the Apple forums throw up a thread where you are a major contributor!

Whatever happened to Spacetripper? That could have been huge if it'd been released when you first showed a 30 FPS video two years ago.
I think Space Tripper is coming out - Luke is presently finishing it up. I got terribly burnt out from doing all the updates for Jet Car Stunts, and pissed off with iOS firmware updates that broke Space Tripper, especially the ones that made the accelerometer updates so sluggish that the game was unplayable. But it seems Apple fixed those problems after I left True Axis, so Luke decided to finish it off... I think Luke posted that it will be submitted sometime in October.

I really should write up an article on how to optimize for OpenGL ES 1.1 and 2.0 on iOS, from 1st gen hardware to the latest and greatest. There are so many tricks I've learnt that no one knows about. It would also cover the GL driver bugs in each particular firmware and what to avoid in those cases. But I guess it's a matter of finding the time to do that.

I'm actually working on an iPad project at the moment with a framerate set at 60fps. Very interesting stuff going on there but it's too early to go into details yet. I'm pretty happy with how it's turning out.

I'm loving your new stuff Glen - it looks stunning!
09-09-2011, 02:47 PM
#6
Joined: May 2009
Location: UK
Posts: 741
Quote:
Originally Posted by NinthNinja
When iOS 4.0 came out Apple added some very handy things for stopping the gl driver from copying the z buffer and other buffers back to the CPU side. If you are not using these new functions then I recommend using them. To show you what I mean:

This gains a lot of speed back
I'd been running with the framebuffer discard for a while, though I only added MSAA a few days ago. I'm a bit crap at finding info; in the end I got most of it from watching a few of last year's WWDC videos (well worth downloading on iTunes U) and from searches in the Apple dev forums.

09-10-2011, 10:00 PM
#7
Joined: Dec 2010
Location: ΖΞN
Posts: 304
Great thread!

GL_STATIC_DRAW and a bunch of 'handy' glUniformMatrix3fv() here...wish I could talk more about it :-)

BubbleSand - "the best sand app"
Tetroms
09-18-2011, 05:12 AM
#8
Joined: Dec 2010
Location: ΖΞN
Posts: 304
I'll get my coat!
10-05-2011, 03:55 PM
#9
Joined: Jan 2011
Location: UK
Posts: 195
Interesting blog, Glen. I am in the process of updating our engine to use GLES2 (yeah, kinda late, but better late than never), so these tidbits of info are very timely.

Quote:
Originally Posted by NinthNinja
I really should write up an article on how to optimize for OpenGL ES 1.1 and 2.0 on iOS, from 1st gen hardware to the latest and greatest. There are so many tricks I've learnt that no one knows about. It would also cover the GL driver bugs in each particular firmware and what to avoid in those cases. But I guess it's a matter of finding the time to do that.
That would be fantastic and very much appreciated.