I can't seem to understand how frame drawing syncs with buffer swapping.
My questions are:
1. Since most OpenGL calls are non-blocking (or buffered), how do you know if the GPU is done with the current frame?
2. Does OpenGL handle it so that an unfinished frame won't get swapped to the window buffer?
3. How do you calculate the frame rate? I mean, what is the basis for determining the number of frames drawn or the time taken by each frame?
The modern way of syncing with the GL is using sync objects. Using glFinish() (or other blocking GL calls) has the disadvantage of stalling both the GPU and the CPU (thread): the CPU will wait until the GPU is finished, and the GPU then stalls because there is no new work queued up. If sync objects are used properly, both can be completely avoided.
You just insert a fence sync into the GL command stream at any point you are interested in, and later you can check whether all commands before it have completed, or you can wait for their completion (while you can still have further commands queued up).
Note that for frame rate estimation, you don't need any explicit means of synchronization. Just using SwapBuffers() is sufficient. The GPU might queue up a few frames in advance (the NVIDIA driver even has a setting for this), but this won't disturb fps counting, since only the first n frames are queued up; once that queue is full, the application is throttled to the rate at which frames are actually completed. Just count the number of SwapBuffers() calls issued each second, and you will do fine. If the user has enabled sync to vblank, the frame rate will be limited to the refresh rate of the monitor, and no tearing will appear.
If you need more detailed GPU timing statistics (but for a frame rate counter, you don't), you should have a look at timer queries.
Question 1 and 2: Invoke glFinish() instead of glFlush():
Description
glFinish does not return until the effects of all previously called GL commands are complete. Such effects include all changes to GL state, all changes to connection state, and all changes to the frame buffer contents.
Question 3: Start a Timer and count how many calls to glFinish() were executed within one second.
Generally, you don't. And I can't think of a very good reason why you should ever worry about it in a real application. If you really have to know, using sync objects (as already suggested in the answer by @derhass) is your best option in modern OpenGL. But you should make sure that you have a clear understanding of why you need it, because it seems unusual to me.
Yes. While the processing of calls in OpenGL is mostly asynchronous, the sequence of calls is still maintained. So if you make a SwapBuffers call, it will guarantee that all the calls you made before the SwapBuffers call will have completed before the buffers are swapped.
There's no good easy way to measure the time used for a single frame. The most practical approach is that you render for a sufficiently long time (at least a few seconds seems reasonable). You count the number of frames you rendered during this time, and the elapsed wall clock time. Then divide number of frames by the time taken to get a frame rate.
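The counting approach described above could be sketched roughly like this. It is only a minimal illustration, not tied to any GL binding: the class name FrameRateCounter and its methods are made up for this example, and the timestamps are passed in by the caller so the arithmetic is easy to check.

```java
/** Counts frames and reports the average rate over the elapsed wall clock time. */
final class FrameRateCounter {
    private long frames;      // frames counted since the interval started
    private long startNanos;  // timestamp at which the interval started

    FrameRateCounter(long nowNanos) {
        startNanos = nowNanos;
    }

    /** Call once per frame, right after the buffer swap has been issued. */
    void frameRendered() {
        frames++;
    }

    /** Average frames per second between the interval start and nowNanos. */
    double framesPerSecond(long nowNanos) {
        long elapsed = nowNanos - startNanos;
        if (elapsed <= 0) {
            return 0.0; // no time has passed yet, so there is no meaningful rate
        }
        return frames * 1_000_000_000.0 / elapsed;
    }

    /** Starts a new measurement interval. */
    void reset(long nowNanos) {
        frames = 0;
        startNanos = nowNanos;
    }
}
```

In a real render loop you would presumably pass System.nanoTime() as the timestamp, call frameRendered() after each swap, and read framesPerSecond() only every few seconds so the average is not dominated by a handful of frames.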
Some of the above is slightly simplified, because this opens up some areas that could be very broad. For example, you can use timer queries to measure how long the GPU takes to process a given frame. But you have to be careful about the conclusions you draw from it.
As a hypothetical example, say you render at 60 fps, limited by vsync. You put a timer query on a frame, and it tells you that the GPU spent 15 ms to render the frame. Does this mean that you were right at the limit of being able to maintain 60 fps? And making your rendering/content more complex would drop it below 60 fps? Not necessarily. Unless you also tracked the GPU clock frequency, you don't know if the GPU really ran at its limit. Power management might have reduced the frequency/voltage to the level necessary to process the current workload. And if you give it more work, it might be able to handle it just fine, and still run at 60 fps.
I have a Java application that, from time to time, seems to have hiccups where it lags a lot or becomes unresponsive for a few seconds, then continues as normal again. This isn't associated with any disk or network output, but CPU usage goes way up for a short time when this happens.
I'd like to use JProfiler to see what happens during that time, but I don't know what triggers the behaviour (so I can't just move my application to that point, then start CPU recording), and leaving CPU recording on all the time until a hiccup occurs doesn't help much either, since that will include the CPU percentages of everything up to that point in the calculation, distracting from what's using CPU now.
So what I'd like is a view that shows me "average CPU usage by method over the last X seconds", that throws away all data that's older than X seconds automatically, and calculates just the averages over those last X samples (assuming 1 sample per second). I wasn't able to find any option that allows me to do this; is this something that JProfiler just doesn't support, or haven't I looked hard enough?
This kind of exceptional circumstance can be analyzed with JProfiler's "exceptional method runs" feature.
In the call tree, select the method that shows the performance spike and select "Add As Exceptional Method" from the context menu.
Then, you can see the slowest invocations separately with all other invocations merged into a single node:
This screen cast shows the entire feature:
http://blog.ej-technologies.com/2011/02/methods-statistics-and-exceptional.html
I'm developing a Live Wallpaper using OpenGL ES 3.0. I've set up according to the excellent tutorial at http://www.learnopengles.com/how-to-use-opengl-es-2-in-an-android-live-wallpaper/, adapting GLSurfaceView and using it inside the Live Wallpaper.
I have a decent knowledge of OpenGL/GLSL best practices, and I've set up a simple rendering pipeline where the draw loop is as tight as possible. No re-allocations, using one static VBO for non-changing data, a dynamic VBO for updates, using only one draw call, no branching in the shaders et cetera. I usually get very good performance, but at seemingly random yet recurring times, the framerate drops.
Profiling with the on-screen bars gives me intervals where the yellow bar ("waiting for commands to complete") shoots away and takes everything above the critical 60fps threshold.
I've read any resources on profiling and interpreting those numbers I can get my hands on, including the nice in-depth SO question here. However, the main takeaway from that question seems to be that the yellow bar indicates time spent on waiting for blocking operations to complete, and for frame dependencies. I don't believe I have any of those, I just draw everything at every frame. No reading.
My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.
Here are some details that may or may not have impact:
I'm rendering on demand, onOffsetsChanged is the trigger (render when dirty).
There is one single texture (created and bound only once), 1024x1024 RGBA. Replacing the one texture2D call with a plain vec4 seems to help remove some of the framerate drops. Reducing the texture size to 512x512 does nothing for performance.
The shaders are not complex, and as stated before, contain no branching.
There is not much data in the scene. There are only ~300 vertices and the one texture.
A systrace shows no suspicious methods - the GL related methods such as buffer population and state calls are not on top of the list.
Update:
As an experiment, I tried to render only every other frame, not requesting a render on every onOffsetsChanged (swipe left/right). This was horrible for the look and feel, but got rid of the yellow lag spikes almost completely. This seems to tell me that doing 60 render requests per second is too much, but I can't figure out why.
My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.
(1) Accumulation of render state. Make sure you "glClear" the color/depth/stencil buffers before you start each render pass (although if you are rendering directly to the window surface this is unlikely to be the problem, as state is guaranteed to be cleared every frame unless you set EGL_BUFFER_PRESERVE).
(2) Buffer/texture ghosting. Rendering is deeply pipelined, but OpenGL ES tries to present a synchronous programming abstraction. If you try to write to a buffer (SubBuffer update, SubTexture update, MapBuffer, etc) which is still "pending" use in a GPU operation still queued in the pipeline then you either have to block and wait, or you force a copy of that resource to be created. This copy process can be "really expensive" for large resources.
(3) Device DVFS (dynamic voltage and frequency scaling) can be quite sensitive on some devices, especially for content which happens to sit just around a level decision point between two frequencies. If the GPU or CPU frequency drops then you may well get a spike in the amount of time a frame takes to process. For debug purposes some devices provide a means to fix the frequency via sysfs - although there is no standard mechanism.
(4) Thermal limitations - most modern mobile devices can produce more heat than they can dissipate if everything is running at high frequency, so the maximum performance point cannot be sustained. If your content is particularly heavy then you may find that thermal management kicks in after a "while" (1-10 minutes depending on device, in my experience) and forcefully drops the frequency until thermal levels drop within safe margins. This shows up as somewhat random increases in frame processing time, and is normally unpredictable once a device hits the "warm" state.
If it is possible to share an API sequence which reproduces the issue it would be easier to provide more targeted advice - the question is really rather general and OpenGL ES is a very wide API ;)
With regard to JavaFX, I have the following questions:
Do I need to use setCache(true) on a Node for a cache hint set by setCacheHint() to actually have any effect?
Should calling setCache actually improve performance i.e. frame rate some or most of the time? I am unable to observe any change in frame rate when I use setCache(true) and I apply scaling and other transforms.
Do I need to use setCache(true) on a Node for a cache hint set by setCacheHint() to actually have any effect?
Yes.
The cache property is a hint to the system whether the node rendering should be cached (as an internal image) at all or not.
The cacheHint property is a hint to the system of what transforms are expected on the node so that the caching operation can be optimized for those transform types, (e.g. the cache hints rotate, scale or speed, etc).
If the node is not set to be cached at all, the cacheHint is irrelevant.
Should calling setCache actually improve performance i.e. frame rate some or most of the time?
Not necessarily. JavaFX has a default frame rate cap of 60fps, so if performance is good enough that the frame rate is reached even without any cache hints, you won't see any visible difference. This is the case for many basic animations and transforms.
Even if frame rate is not improved, the cache hint may make each transform a bit more efficient to perform so that it is less CPU or GPU intensive (usually by trading visual quality).
There may be other things which have a far greater impact on your frame rate. These things may have nothing to do with rendering speed of cacheable items (for example a long running operation executed on the JavaFX application thread during a game loop or constantly changing node content).
I have used a combination of setCache(true) and setCacheHint(CacheHint.SPEED) in small games I have written that featured multiple simultaneously animated nodes with multiple effects and translucency applied to the nodes. The settings did speed things up a lot (Mac OS X, MacBook Air 2012, JavaFX 2.2).
Rather than relying on hints to the rendering system, you can also manually take a snapshot of a node tree and manually replace the node tree with the snapshot Image. This snapshot technique is not always the best way to go, but it does give you an alternative if the hints aren't working out well in your case.
I'm learning libGDX and have some questions about updating my game logic during the render method.
I would ideally like to keep my game logic and my rendering separate. The reason for this is that if I have a high FPS on a system, my game loop would "run" faster.
What I am looking for is to keep the experience constant and possibly limit my updates. Can anyone point me towards a tutorial on how to:
a) Limit my render updates via delta time
b) Limit my game logic updates via delta time
Thank you :)
After re-reading your question, I think the trick that you are missing (based on your comment that running on a higher-refresh system would result in your game logic running faster), is that you actually scale your updates based on the "delta" time that is passed to render. Andrei Bârsan mentions this above, but I thought I'd elaborate a bit on how delta is used.
For instance, within my game's render(), I first call my entityUpdate(delta), which updates and moves all of the objects in my game scaled by the distance traveled in time "delta" (it doesn't render them, just moves their position variables). Then I call entityManageCollisions(delta), which resolves all of the collisions caused by the update, then I finally call entityDraw(batch, delta), which uses delta to get the right frames for sprite animations, and actually draws everything on the screen.
I use a variant of an Entity/Component/System model so I handle all of my entities generically, and those method calls I mention above are essentially "Systems" that act on Entities with certain combinations of components on them.
So, all that to say, pass delta (the parameter passed into render()) into all of your logic, so you can scale things (move entities the appropriate distance) based on the amount of time that has elapsed since the last call. This requires that you set your speeds based on units / second for your entities, since you're passing in a value to scale them by that is a fraction of a second. Once you do it a few times, and experiment with the results, you'll be in good shape.
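As a rough illustration of scaling movement by delta, here is a minimal sketch in plain Java. The MovingEntity class is hypothetical and not part of libGDX; it only assumes that speed is stored in units per second and that the delta passed in is a fraction of a second.

```java
/** A hypothetical entity whose speed is expressed in world units per second. */
final class MovingEntity {
    float x;                    // current position, in world units
    final float speedPerSecond; // velocity, in world units per second

    MovingEntity(float startX, float speedPerSecond) {
        this.x = startX;
        this.speedPerSecond = speedPerSecond;
    }

    /** Advances the position by the distance covered during deltaSeconds. */
    void update(float deltaSeconds) {
        x += speedPerSecond * deltaSeconds;
    }
}
```

With this, an entity updated 60 times with a delta of 1/60 s and one updated 30 times with a delta of 1/30 s should end up at (almost exactly) the same position after one second, which is the point of passing delta into the logic.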
Also note: This will drive you insane in interactive debug sessions, since the delta timer keeps accumulating time since the last render call. Entities generally get sub-second updates, but may wind up getting passed 30 seconds (or however long you spent stepping through the debugger), causing them to fly across the whole screen (and beyond, which at least tests those boundaries for you!). So at the very top of my render(), I have a line that says delta = 0.016036086f;, which I comment out for builds to be deployed but leave uncommented when debugging, so each frame moves the game forward a consistent amount regardless of how long I spend looking at things in the debugger. That number was a sample delta from my dev workstation and seems to give decent results; you can capture your video system's typical delta by writing it to the console during a test run and use that value instead, if you like.
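One possible variation on the debugging trick, sketched under my own assumptions: instead of commenting a line in and out, the fixed step is selected by a flag. The DebugDelta class and the debugging parameter are hypothetical names for this example; the constant is the sample value quoted in the answer.

```java
/** Picks the time step to feed into the game logic. */
final class DebugDelta {
    /** Sample frame time quoted in the answer, in seconds. */
    static final float FIXED_DEBUG_DELTA = 0.016036086f;

    /** Returns a constant step while debugging, otherwise the measured delta. */
    static float effectiveDelta(float measuredDelta, boolean debugging) {
        return debugging ? FIXED_DEBUG_DELTA : measuredDelta;
    }
}
```

At the top of render(delta) you would then write something like delta = DebugDelta.effectiveDelta(delta, DEBUGGING); where DEBUGGING is whatever switch your build uses.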
Good luck!
The answer so far isn't using parallel threads - I've had this question myself in the past and I've been advised against it - link. A good idea would be to run the world update first, and then skip the rendering if there isn't enough time left in the frame for it. Delta times should be used nevertheless to keep everything going smooth and prevent lagging.
If using this approach, it would be wise to prevent more than X consecutive frame skips from happening. In the case that the update logic lasts longer than the total time allocated for a frame (unlikely, but possible, depending on how much update logic there is compared to rendering), your rendering might otherwise never happen, and that isn't something that you'd want. By limiting the number of frames you skip, you ensure the updates can run smoothly, but you also guarantee that the game doesn't freeze when there's too much logic to handle.
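The skip-with-a-limit idea could look roughly like the sketch below. It is only an illustration: FrameSkipPolicy and its parameters are invented names, and the estimated render cost is an assumption you would have to measure or guess yourself.

```java
/** Decides per frame whether to render, allowing only a bounded run of skips. */
final class FrameSkipPolicy {
    private final int maxConsecutiveSkips; // the "X" from the text above
    private int consecutiveSkips;          // frames skipped in a row so far

    FrameSkipPolicy(int maxConsecutiveSkips) {
        this.maxConsecutiveSkips = maxConsecutiveSkips;
    }

    /**
     * @param updateNanos          time the logic update took this frame
     * @param frameBudgetNanos     total time allotted to one frame
     * @param estimatedRenderNanos assumed cost of drawing the frame
     * @return true if the frame should be drawn
     */
    boolean shouldRender(long updateNanos, long frameBudgetNanos, long estimatedRenderNanos) {
        boolean enoughTime = frameBudgetNanos - updateNanos >= estimatedRenderNanos;
        if (enoughTime || consecutiveSkips >= maxConsecutiveSkips) {
            consecutiveSkips = 0; // rendering ends the current skip streak
            return true;
        }
        consecutiveSkips++;
        return false;
    }
}
```

With a limit of 2, a game whose updates always overrun the budget would still draw every third frame instead of freezing.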
My game has been stuttering because of the GC and it ranges from 40ms to 140ms.
My game is not creating new objects or anything in the update or render threads so I'm pretty sure my project is clean EXCEPT for one.
In the update method I have a List<TouchEvents> touchEvents = getTouchEvents();
I am pretty sure this is what is causing the GC to kick in, as it only runs when I'm moving around, which requires me to touch the screen (using the ACTION_MOVE event).
How would I optimize or prevent this?
EDIT:
Now I'm starting to think it has to do with my FPS limit method.
I'm assuming since I am limiting FPS to 30 the GC does not have enough time without interfering with my game.
I came up with this theory after I took the limiter off and ran my game at full 60FPS.
The game goes PERFECTLY SMOOTH when running at 60FPS but not at 30FPS.
Any ideas?
Personally I wouldn't recommend capping the fps. Instead, let it run as fast as it can and refer to elapsed time when doing movement and physics.