I'm learning libGDX and have some questions about updating my game logic during the render method.
Ideally I would like to keep my game logic and my rendering separate. The reason is that on a system with a high FPS my game loop would "run" faster.
What I am looking for is a way to keep the experience constant and possibly limit my updates. Can anyone point me towards a tutorial on how to
a) limit my render updates via delta time
b) limit my game logic updates via delta time?
Thank you :)
After re-reading your question, I think the trick that you are missing (based on your comment that running on a higher-refresh system would result in your game logic running faster), is that you actually scale your updates based on the "delta" time that is passed to render. Andrei Bârsan mentions this above, but I thought I'd elaborate a bit on how delta is used.
For instance, within my game's render(), I first call my entityUpdate(delta), which updates and moves all of the objects in my game scaled by the distance traveled in time "delta" (it doesn't render them, just moves their position variables). Then I call entityManageCollisions(delta), which resolves all of the collisions caused by the update, then I finally call entityDraw(batch, delta), which uses delta to get the right frames for sprite animations, and actually draws everything on the screen.
I use a variant of an Entity/Component/System model, so I handle all of my entities generically, and the method calls I mention above are essentially "Systems" that act on Entities with certain combinations of components on them.
So, all that to say, pass delta (the parameter passed into render()) into all of your logic, so you can scale things (move entities the appropriate distance) based on the amount of time that has elapsed since the last call. This requires that you set your speeds based on units / second for your entities, since you're passing in a value to scale them by that is a fraction of a second. Once you do it a few times, and experiment with the results, you'll be in good shape.
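As a minimal sketch of the units-per-second idea, assuming a hypothetical `Mover` class (none of these names come from libGDX): the speed is stored per second and multiplied by delta on every update, so 60 short frames and 30 long frames cover the same distance in one second.

```java
// Hypothetical entity used only for illustration; not a libGDX type.
final class Mover {
    float x;                     // position in world units
    final float speedPerSecond;  // velocity in world units per second

    Mover(float speedPerSecond) {
        this.speedPerSecond = speedPerSecond;
    }

    // Advance by the time elapsed since the previous frame (in seconds).
    void update(float delta) {
        x += speedPerSecond * delta;
    }

    // Run `frames` updates of `delta` seconds each and return the final position.
    static float simulate(int frames, float delta) {
        Mover m = new Mover(100f);
        for (int i = 0; i < frames; i++) {
            m.update(delta);
        }
        return m.x;
    }
}
```

In a real game the value passed to `update` would be the delta handed to render(); here `simulate` just replays a fixed number of frames so the two frame rates can be compared.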
Also note: this will drive you insane in interactive debug sessions. The delta timer keeps accumulating time since the last render call, so entities that normally get sub-second updates may be handed 30 seconds (or however long you spent stepping through the debugger) and fly across the whole screen and beyond (which does test your boundary handling for you!). So at the very top of my render(), I have a line that says delta = 0.016036086f;. That number was a sample delta from my dev workstation and seems to give decent results; you can capture your own video system's typical delta by writing it to the console during a test run and use that value instead, if you like. I comment the line out for builds to be deployed but leave it un-commented when debugging, so each frame moves the game forward a consistent amount, regardless of how long I spend looking at things in the debugger.
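One way to avoid commenting a line in and out is a small helper like the one below. This is only a sketch: `DeltaGuard`, the `debugging` flag, and both constants are hypothetical, and the cap applied to non-debug frames is an extra assumption that goes beyond what is described above.

```java
// Hypothetical helper; the names and constants are illustrative assumptions.
final class DeltaGuard {
    static final float FIXED_DEBUG_STEP = 1f / 60f; // assumed typical frame time, in seconds
    static final float MAX_DELTA = 0.1f;            // assumed upper bound for normal runs

    // Returns the delta the game logic should use for this frame.
    static float effectiveDelta(float rawDelta, boolean debugging) {
        if (debugging) {
            // Every frame advances the game by the same amount,
            // however long the debugger kept the thread paused.
            return FIXED_DEBUG_STEP;
        }
        // Outside the debugger, only cap unusually long stalls.
        return Math.min(rawDelta, MAX_DELTA);
    }
}
```

The first line of render() would then read something like `delta = DeltaGuard.effectiveDelta(delta, DEBUG);`, where `DEBUG` is whatever switch the project already has.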
Good luck!
The answer so far isn't using parallel threads - I've had this question myself in the past and I've been advised against it - link. A good idea would be to run the world update first, and then skip the rendering if there isn't enough time left in the frame for it. Delta times should still be used to keep everything running smoothly and prevent lag.
If using this approach, it would be wise to prevent more than X consecutive frame skips from happening. In the case (unlikely, but possible, depending on how much update logic there is compared to rendering) that the update logic takes longer than the total time allocated for a frame, your rendering might never happen, and that isn't something you'd want. By limiting the number of frames you skip, you ensure the updates can run smoothly, but you also guarantee that the game doesn't freeze when there's too much logic to handle.
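The skip-with-a-limit rule could be sketched roughly like this. `FrameSkipPolicy` and its parameters are hypothetical names, and the caller is assumed to measure how long the world update took (for example with System.nanoTime()).

```java
// Hypothetical policy object; decides after each world update whether to render.
final class FrameSkipPolicy {
    private final long frameBudgetNanos;   // total time allotted to one frame
    private final int maxConsecutiveSkips; // the "X" from the text above
    private int consecutiveSkips = 0;

    FrameSkipPolicy(long frameBudgetNanos, int maxConsecutiveSkips) {
        this.frameBudgetNanos = frameBudgetNanos;
        this.maxConsecutiveSkips = maxConsecutiveSkips;
    }

    // updateNanos: how long the world update for this frame took.
    boolean shouldRender(long updateNanos) {
        boolean timeLeft = updateNanos < frameBudgetNanos;
        if (timeLeft || consecutiveSkips >= maxConsecutiveSkips) {
            consecutiveSkips = 0; // rendering resets the skip streak
            return true;
        }
        consecutiveSkips++;       // no time left: skip this frame's rendering
        return false;
    }
}
```

With a 16 ms budget and a limit of 2, a run of slow updates renders every third frame instead of never.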
Before adding the detune variable, the physics update on a computer and a smartphone was strikingly different.
After adding and multiplying some variables, it turned out to smooth out the difference, but not make it completely the same.
In this regard, I ask for help, because I cannot figure out what to do myself.
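public void update(float dt, Camera cam) {
    // Ratio of the actual frame time to a 60 fps reference frame (~16.66 ms).
    float detune = dt / 0.01666f;
    if (!ignoreGraviry)
        attraction.add(getGravity().cpy().scl(detune));
    // Per-axis damping factors, scaled linearly by detune.
    float lx = 1 - .09f * detune;
    float ly = 1 - .015f * detune;
    attraction.scl(lx, ly);
    Vector v = getMotion().scl(lx, ly).cpy();
    lastPos = getPosition().cpy();
    // Move by (motion + attraction), both rotated by the camera rotation, scaled by dt.
    getPosition().add(
        v.rotate(cam.getRotation()).add(
            attraction.cpy().rotate(cam.getRotation())
        ).scl(dt));
}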
Problem
What detune does is to scale up simulation cycle effects by a factor of 60. So basically, instead of having to simulate 60 cycles, this only has to simulate 1 cycle. But the results will be less accurate, maybe only a bit, maybe a lot, depending on whether the rest of the simulation outside is stable/converging or not. Also, with lx and ly, the way this detune is done LOOKS awfully bad (it MIGHT be OK with some outside knowledge that your question does not provide), because you should never combine linear scaling effects with addition. This will throw you into hell's pit faster than you can imagine. lx, for example, will be negative or positive depending on dt. dt is usually the 'delta time' and lets you adjust the granularity vs. speed of the simulation. So if someone adjusts dt and all of a sudden the simulation runs backwards, this will become a real sore issue.
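To make the sign problem concrete, here is a small sketch comparing the question's linear damping factor with an exponential form. The exponential version is a commonly used alternative that I am swapping in for illustration; it is not taken from the question or from libGDX, and the class and method names are hypothetical.

```java
// Hypothetical comparison of two damping factors for a frame of length dt.
final class Damping {
    static final float REFERENCE_STEP = 0.01666f; // the question's reference frame time

    // The question's form: 1 - k * (dt / reference).
    // Goes negative as soon as k * (dt / reference) exceeds 1.
    static float linearFactor(float k, float dt) {
        return 1f - k * (dt / REFERENCE_STEP);
    }

    // Exponential form: the per-reference-step factor (1 - k) raised to the
    // number of reference steps that fit into dt. For 0 < k < 1 and dt >= 0
    // the result stays in (0, 1], so the motion never flips direction.
    static float exponentialFactor(float k, float dt) {
        return (float) Math.pow(1f - k, dt / REFERENCE_STEP);
    }
}
```

Both forms agree when dt equals the reference step. With k = 0.09 and a 0.25 s hitch, the linear factor is negative while the exponential one is still a positive fraction.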
Solution
You should NOT have detune in your code like this. Better increase the dt value. Ensure that calculation cycles have the same temporal distance on PC and Smartphones, like 30 times a second (30 fps, dt=33ms) and sleep the rest of the time. If you cannot guarantee that, simulation results will always differ between them, bringing advantages or disadvantages to either.
I do not know if libgdx has a fixed simulation-graphics-cycle, so exactly one simulation per graphics update. But in most engines (yes, especially games, that's why multithread/-core is usually useless there) they are heavily coupled, which - in modern programming languages - is really bad because then you'd have to restrict your simulation algorithm AND graphics updates to the lowest hardware for BOTH PCs AND phones, AND restricting them to both the worst graphical AND computational minimum requirements.
If you can decouple simulation and graphics, you'd only have to consider the lowest computational capabilities. Concerning graphics, you could always run the max frame rate on each system (or limit to 90fps, only very few people have a higher acuity), making the best of the graphics hardware, getting the smoothest rendering.
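One common way to decouple the two is a fixed-step accumulator: rendering runs as often as the hardware allows, while the simulation only advances in whole steps of a fixed length. The sketch below is an assumption about how this might look, not something libGDX provides; `FixedStepClock` is a hypothetical name and times are kept in nanoseconds to avoid rounding drift.

```java
// Hypothetical fixed-step clock; feeds real frame time in, hands whole steps out.
final class FixedStepClock {
    private final long stepNanos;      // length of one simulation step
    private long accumulatorNanos = 0; // real time not yet simulated

    FixedStepClock(long stepNanos) {
        this.stepNanos = stepNanos;
    }

    // Add the real duration of the last frame and return how many
    // fixed simulation steps should be run before rendering.
    int advance(long frameNanos) {
        accumulatorNanos += frameNanos;
        int steps = (int) (accumulatorNanos / stepNanos);
        accumulatorNanos -= (long) steps * stepNanos;
        return steps;
    }

    // Fraction of a step left over, which a renderer could use for interpolation.
    double alpha() {
        return (double) accumulatorNanos / stepNanos;
    }
}
```

Per frame the loop would call `advance`, run the simulation that many times with the same dt, and then render once. A production loop would probably also cap the number of steps per frame so a long stall cannot snowball.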
I'm developing a Live Wallpaper using OpenGL ES 3.0. I've set up according to the excellent tutorial at http://www.learnopengles.com/how-to-use-opengl-es-2-in-an-android-live-wallpaper/, adapting GLSurfaceView and using it inside the Live Wallpaper.
I have a decent knowledge of OpenGL/GLSL best practices, and I've set up a simple rendering pipeline where the draw loop is as tight as possible. No re-allocations, using one static VBO for non-changing data, a dynamic VBO for updates, using only one draw call, no branching in the shaders et cetera. I usually get very good performance, but at seemingly random but reoccurring times, the framerate drops.
Profiling with the on-screen bars gives me intervals where the yellow bar ("waiting for commands to complete") shoots away and takes everything above the critical 60fps threshold.
I've read any resources on profiling and interpreting those numbers I can get my hands on, including the nice in-depth SO question here. However, the main takeaway from that question seems to be that the yellow bar indicates time spent on waiting for blocking operations to complete, and for frame dependencies. I don't believe I have any of those, I just draw everything at every frame. No reading.
My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.
Here are some details that may or may not have impact:
I'm rendering on demand, onOffsetsChanged is the trigger (render when dirty).
There is one single texture (created and bound only once), 1024x1024 RGBA. Replacing the one texture2D call with a plain vec4 seems to help remove some of the framerate drops. Reducing the texture size to 512x512 does nothing for performance.
The shaders are not complex, and as stated before, contain no branching.
There is not much data in the scene. There are only ~300 vertices and the one texture.
A systrace shows no suspicious methods - the GL related methods such as buffer population and state calls are not on top of the list.
Update:
As an experiment, I tried to render only every other frame, not requesting a render on every onOffsetsChanged (swipe left/right). This was horrible for the look and feel, but got rid of the yellow lag spikes almost completely. This seems to tell me that doing 60 requests per second is too much, but I can't figure out why.
My question is broad - but I'd like to know what things can cause this
type of framerate drop, and how to move forward in pinning down the
issue.
(1) Accumulation of render state. Make sure you "glClear" the color/depth/stencil buffers before you start each render pass (although if you are rendering directly to the window surface this is unlikely to be the problem, as state is guaranteed to be cleared every frame unless you set EGL_BUFFER_PRESERVE).
(2) Buffer/texture ghosting. Rendering is deeply pipelined, but OpenGL ES tries to present a synchronous programming abstraction. If you try to write to a buffer (SubBuffer update, SubTexture update, MapBuffer, etc) which is still "pending" use in a GPU operation still queued in the pipeline then you either have to block and wait, or you force a copy of that resource to be created. This copy process can be "really expensive" for large resources.
(3) Device DVFS (dynamic voltage and frequency scaling) can be quite sensitive on some devices, especially for content which happens to sit just around a decision point between two frequencies. If the GPU or CPU frequency drops then you may well get a spike in the amount of time a frame takes to process. For debug purposes some devices provide a means to fix the frequency via sysfs, although there is no standard mechanism.
(4) Thermal limitations - most modern mobile devices can produce more heat than they can dissipate if everything is running at high frequency, so the maximum performance point cannot be sustained. If your content is particularly heavy then you may find that thermal management kicks in after a "while" (1-10 minutes depending on device, in my experience) and forcefully drops the frequency until thermal levels drop within safe margins. This shows up as somewhat random increases in frame processing time, and is normally unpredictable once a device hits the "warm" state.
If it is possible to share an API sequence which reproduces the issue it would be easier to provide more targeted advice - the question is really rather general and OpenGL ES is a very wide API ;)
I cannot seem to understand how frame drawing syncs with buffer swapping.
My questions are:
1. Since most OpenGL calls are non-blocking (or buffered), how do you know whether the GPU is done with the current frame?
2. Does OpenGL handle it so that an unfinished frame won't get swapped to the window buffer?
3. How do you calculate the frame rate? I mean, what is the basis for determining the number of frames drawn or the time taken by each frame?
The modern way of syncing with the GL is using sync objects. Using glFinish() (or other blocking GL calls) has the disadvantage of stalling both the GPU and the CPU (thread): the CPU will wait until the GPU is finished, and the GPU then stalls because there is no new work queued up. If sync objects are used properly, both can be completely avoided.
You just insert a fence sync into the GL command stream at any point you are interested in, and later you can check whether all commands before it have completed, or you can wait for their completion (while you can still have further commands queued up).
Note that for frame rate estimation, you don't need any explicit means of synchronization. Just using SwapBuffers() is sufficient. The GPU might queue up a few frames in advance (the NVIDIA driver even has a setting for this), but this won't disturb fps counting, since only the first n frames are queued up. Just count the number of SwapBuffers() calls issued each second, and you will do fine. If the user has enabled sync to vblank, the frame rate will be limited to the refresh rate of the monitor, and no tearing will appear.
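Counting swaps per second might look like the following sketch. `FpsCounter` is a hypothetical helper with no GL dependency; the caller is assumed to invoke `onSwap` right after each buffer swap and pass in a monotonic timestamp such as System.nanoTime().

```java
// Hypothetical frame counter; call onSwap once per buffer swap.
final class FpsCounter {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private long windowStartNanos;
    private int framesInWindow = 0;
    private int lastFps = 0;

    FpsCounter(long startNanos) {
        this.windowStartNanos = startNanos;
    }

    // Returns the frame count of the most recently completed one-second window
    // (0 until the first window has finished).
    int onSwap(long nowNanos) {
        framesInWindow++;
        if (nowNanos - windowStartNanos >= ONE_SECOND_NANOS) {
            lastFps = framesInWindow;
            framesInWindow = 0;
            windowStartNanos = nowNanos;
        }
        return lastFps;
    }
}
```

Taking the timestamp as a parameter keeps the counter independent of any particular windowing or GL binding.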
If you need more detailed GPU timing statistics (but for a frame rate counter, you don't), you should have a look at timer queries.
Question 1 and 2: Invoke glFinish() instead of glFlush():
Description
glFinish does not return until the effects of all previously called GL commands are complete. Such effects include all changes to GL state, all changes to connection state, and all changes to the frame buffer contents.
Question 3: Start a Timer and count how many calls to glFinish() were executed within one second.
1. Generally, you don't. And I can't think of a very good reason why you should ever worry about it in a real application. If you really have to know, using sync objects (as already suggested in the answer by @derhass) is your best option in modern OpenGL. But you should make sure that you have a clear understanding of why you need it, because it seems unusual to me.
2. Yes. While the processing of calls in OpenGL is mostly asynchronous, the sequence of calls is still maintained. So if you make a SwapBuffers call, it will guarantee that all the calls you made before the SwapBuffers call will have completed before the buffers are swapped.
3. There's no good easy way to measure the time used for a single frame. The most practical approach is to render for a sufficiently long time (at least a few seconds seems reasonable). You count the number of frames you rendered during this time, and the elapsed wall clock time. Then divide the number of frames by the time taken to get a frame rate.
Some of the above is slightly simplified, because this opens up some areas that could be very broad. For example, you can use timer queries to measure how long the GPU takes to process a given frame. But you have to be careful about the conclusions you draw from it.
As a hypothetical example, say you render at 60 fps, limited by vsync. You put a timer query on a frame, and it tells you that the GPU spent 15 ms to render the frame. Does this mean that you were right at the limit of being able to maintain 60 fps? And making your rendering/content more complex would drop it below 60 fps? Not necessarily. Unless you also tracked the GPU clock frequency, you don't know if the GPU really ran at its limit. Power management might have reduced the frequency/voltage to the level necessary to process the current workload. And if you give it more work, it might be able to handle it just fine, and still run at 60 fps.
All I know is that delta relates somehow to adapting to different frame rates, but I'm not sure exactly what it stands for and how to use it in the math that calculates speeds and what not.
Where is delta declared? initialized?
How is it used? How are its values (min,max) set?
It's the number of milliseconds between frames. Rather than trying to build your game on a fixed number of milliseconds between frames, you want to alter your game to move/update/adjust each element/sprite/AI based on how much time has passed since the last time the update method has come around. This is the case with pretty much all game engines, and allows you to avoid needing to change your game logic based on the power of the hardware on which you're running.
Slick also has mechanisms for setting the minimum update times, so you have a way to guarantee that the delta won't be smaller than a certain amount. This allows your game to basically say, "Don't update more often than every 'x' milliseconds," because if you're running on powerful hardware, and have a very tight game loop, it's theoretically possible to get sub-millisecond deltas which starts to produce strange side effects, such as slow movement, or collision detection that doesn't seem to work the way you expect.
Setting a minimum update time also allows you to minimize recalculating unnecessarily, when only a very, very small amount of time has passed.
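The idea of a minimum update interval can be sketched in plain Java as below. This is not Slick's implementation, only an illustration under the assumption that tiny deltas are saved up until enough time has passed; `MinIntervalUpdater` is a hypothetical name.

```java
// Hypothetical accumulator; not part of Slick.
final class MinIntervalUpdater {
    private final int minIntervalMs; // "don't update more often than this"
    private int pendingMs = 0;       // time collected since the last logic update

    MinIntervalUpdater(int minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    // Feed the raw per-frame delta in milliseconds. Returns the delta to pass
    // to the game logic, or 0 when the update should be skipped this frame.
    int accumulate(int rawDeltaMs) {
        pendingMs += rawDeltaMs;
        if (pendingMs < minIntervalMs) {
            return 0;
        }
        int delta = pendingMs;
        pendingMs = 0;
        return delta;
    }
}
```

With a 10 ms minimum, four 3 ms frames produce a single 12 ms update instead of four tiny ones, and no elapsed time is lost.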
Have a read of the LWJGL timing tutorial found here. It's not strictly Slick, but it explains what the delta value is and how to use it.
I have been working on a childish little program: there are a bunch of little circles on the screen, of different colors and sizes. When a larger circle encounters a smaller circle it eats the smaller circle, and when a circle has eaten enough other circles it reproduces. It's kind of neat!
However, the way I have it implemented, the process of detecting nearby circles and checking them for edibility is done with a for loop that cycles through the entire living population of circles... which takes longer and longer as the population tends to spike into the 3000s before it starts to drop. The process doesn't slow my computer down (I can go off and play Dawn of War or whatever and there isn't any slowdown); it's just the process of checking every circle to see if it has collided with every other circle that is slow...
So what occurred to me, is that I could try to separate the application window into four quadrants, and have the circles in the quadrants do their checks simultaneously, since they would have almost no chance of interfering with each other: or something to that effect!
My question, then, is: how does one make for loops that run side by side? In Java, say.
The problem you have here can actually be solved without threads.
What you need is a spatial data structure. A quadtree would be best, or if the field in which the circles move is fixed (I assume it is) you could use a simple grid. Here's the idea.
Divide the display area into a square grid where each cell is at least as big as your biggest circle. For each cell keep a list (a linked list is best) of all the circles whose center is in that cell. Then during the collision detection step go through each cell and check each circle in that cell against all the other circles in that cell and the surrounding cells.
Technically you don't have to check all the cells around each one, as some of them might have already been checked.
You can combine this technique with multithreading techniques to get even better performance.
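A minimal sketch of the grid idea, assuming circles are given as parallel arrays of centers and radii and that all centers lie inside a field of the given width and height. `GridCollisions` is a hypothetical name, the cell size is taken as the largest diameter, and ArrayList is used for the per-cell lists purely for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; finds overlapping circles with a uniform grid.
final class GridCollisions {
    // Returns index pairs {i, j} with i < j whose circles overlap or touch.
    static List<int[]> findCollisions(float[] x, float[] y, float[] r,
                                      float width, float height) {
        // Cell size = largest diameter, so overlapping circles always have
        // their centers in the same cell or in directly neighbouring cells.
        float cell = 1e-3f;
        for (float radius : r) {
            cell = Math.max(cell, 2f * radius);
        }
        int cols = Math.max(1, (int) Math.ceil(width / cell));
        int rows = Math.max(1, (int) Math.ceil(height / cell));

        // Bucket every circle index by the cell that contains its center.
        List<List<Integer>> buckets = new ArrayList<>();
        for (int c = 0; c < cols * rows; c++) {
            buckets.add(new ArrayList<>());
        }
        int[] col = new int[x.length];
        int[] row = new int[x.length];
        for (int i = 0; i < x.length; i++) {
            col[i] = clamp((int) (x[i] / cell), cols - 1);
            row[i] = clamp((int) (y[i] / cell), rows - 1);
            buckets.get(row[i] * cols + col[i]).add(i);
        }

        // Compare each circle only against circles in its 3x3 cell neighbourhood.
        List<int[]> hits = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            for (int gy = Math.max(0, row[i] - 1); gy <= Math.min(rows - 1, row[i] + 1); gy++) {
                for (int gx = Math.max(0, col[i] - 1); gx <= Math.min(cols - 1, col[i] + 1); gx++) {
                    for (int j : buckets.get(gy * cols + gx)) {
                        if (j <= i) continue; // report each pair once
                        float dx = x[i] - x[j];
                        float dy = y[i] - y[j];
                        float reach = r[i] + r[j];
                        if (dx * dx + dy * dy <= reach * reach) {
                            hits.add(new int[] {i, j});
                        }
                    }
                }
            }
        }
        return hits;
    }

    private static int clamp(int value, int max) {
        return Math.max(0, Math.min(max, value));
    }
}
```

The `j <= i` check is one simple way to avoid testing a pair twice; it still visits all nine neighbouring cells, so there is room for the half-neighbourhood optimization mentioned above.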
Computers are usually single-tasked; this means they can usually execute one instruction at a time per CPU or core.
However, as you have noticed, your operating system (and other programs) appear to run many tasks at the same time.
This is accomplished by splitting the work into processes, and each process can further implement concurrency by spawning threads. The operating system then switches between each process and thread very quickly to give the illusion of multitasking.
In your situation, your Java program is a single process, and you would need to create 4 threads, each running its own loop. It can get tricky, because threads need to synchronize their access to shared variables, to prevent one thread editing a variable while another thread is trying to access it.
Because threading is a complex subject, it would take far more explaining than I can do here.
However, you can read Sun's excellent tutorial on Concurrency, which covers everything you need to know:
http://java.sun.com/docs/books/tutorial/essential/concurrency/
What you're looking for is not a way to have these run simultaneously (as people have noted, this depends on how many cores you have, and can only offer a 2x or maybe 4x speedup), but instead to somehow cut down on the number of collisions you have to detect.
You should look into using a quadtree. In brief, you recursively break down your 2D region into four quadrants (as needed), and then only need to detect collisions between objects in nearby components. In good cases, it can effectively reduce your collision detection time from N^2 to N * log N.
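For a rough idea of what such a structure looks like, here is a compact point quadtree sketch that stores circle centers and answers rectangle queries. The class name, the leaf capacity, and the depth limit are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical point quadtree over the region [x0, x1] x [y0, y1].
final class QuadTree {
    private static final int CAPACITY = 4;  // points per leaf before it splits (assumed)
    private static final int MAX_DEPTH = 8; // stops endless splitting on duplicates (assumed)

    private final float x0, y0, x1, y1;
    private final int depth;
    private final List<float[]> points = new ArrayList<>();
    private QuadTree[] children; // null while this node is a leaf

    QuadTree(float x0, float y0, float x1, float y1) {
        this(x0, y0, x1, y1, 0);
    }

    private QuadTree(float x0, float y0, float x1, float y1, int depth) {
        this.x0 = x0; this.y0 = y0; this.x1 = x1; this.y1 = y1;
        this.depth = depth;
    }

    // Returns false if the point lies outside the tree's region.
    boolean insert(float px, float py) {
        if (px < x0 || px > x1 || py < y0 || py > y1) return false;
        add(px, py);
        return true;
    }

    private void add(float px, float py) {
        if (children == null) {
            if (points.size() < CAPACITY || depth >= MAX_DEPTH) {
                points.add(new float[] {px, py});
                return;
            }
            split();
        }
        childFor(px, py).add(px, py);
    }

    // Pick the quadrant by comparing against the node's midpoint.
    private QuadTree childFor(float px, float py) {
        float mx = (x0 + x1) / 2, my = (y0 + y1) / 2;
        return children[(px >= mx ? 1 : 0) + (py >= my ? 2 : 0)];
    }

    // Turn this leaf into an inner node and hand its points to the four children.
    private void split() {
        float mx = (x0 + x1) / 2, my = (y0 + y1) / 2;
        children = new QuadTree[] {
            new QuadTree(x0, y0, mx, my, depth + 1),
            new QuadTree(mx, y0, x1, my, depth + 1),
            new QuadTree(x0, my, mx, y1, depth + 1),
            new QuadTree(mx, my, x1, y1, depth + 1)
        };
        for (float[] p : points) {
            childFor(p[0], p[1]).points.add(p);
        }
        points.clear();
    }

    // Collect every stored point inside the rectangle [qx0, qx1] x [qy0, qy1].
    void query(float qx0, float qy0, float qx1, float qy1, List<float[]> out) {
        if (qx0 > x1 || qx1 < x0 || qy0 > y1 || qy1 < y0) return; // regions do not overlap
        for (float[] p : points) {
            if (p[0] >= qx0 && p[0] <= qx1 && p[1] >= qy0 && p[1] <= qy1) out.add(p);
        }
        if (children != null) {
            for (QuadTree c : children) c.query(qx0, qy0, qx1, qy1, out);
        }
    }
}
```

For collision detection, each circle would query the square around its center that extends by its own radius plus the largest radius in the population, and then run the exact distance test only against the points returned.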
Instead of trying to do parallel processing, you may want to look into collision detection optimization, because in many situations performing fewer calculations in one thread is better than distributing the calculations among multiple threads. Plus, it's easy to shoot yourself in the foot in this multi-threading business. Try googling "collision detection algorithm" and see where it gets you ;)
If your computer has multiple processors or multiple cores, then you could easily run multiple threads and run smaller parts of the loops in each thread. Many PCs these days do have multiple cores, so have each thread take 1/nth of the loop count and then create n threads.
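Splitting the loop across n threads might be sketched as follows, using the standard java.util.concurrent executor API. `ParallelPairCount` is a hypothetical name. Each worker only reads the shared arrays and returns its own partial count, so no locking is needed. The outer loop is divided in strides rather than contiguous blocks, which is my own choice to keep the triangular workload roughly balanced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; counts overlapping circle pairs using several threads.
final class ParallelPairCount {
    static long countOverlaps(float[] x, float[] y, float[] r, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int first = t;
                Callable<Long> task = () -> {
                    long count = 0;
                    // This worker handles i = first, first + threads, first + 2 * threads, ...
                    for (int i = first; i < x.length; i += threads) {
                        for (int j = i + 1; j < x.length; j++) {
                            float dx = x[i] - x[j];
                            float dy = y[i] - y[j];
                            float reach = r[i] + r[j];
                            if (dx * dx + dy * dy <= reach * reach) {
                                count++;
                            }
                        }
                    }
                    return count;
                };
                parts.add(pool.submit(task));
            }
            // Combine the partial results once every worker has finished.
            long total = 0;
            for (Future<Long> part : parts) {
                try {
                    total += part.get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

This is still the brute-force check of every pair, only divided among workers, so the speedup is bounded by the number of cores.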
If you really want to get into concurrent programming, you need to learn how to use threads.
Sun has a tutorial for programming Java threads here:
http://java.sun.com/docs/books/tutorial/essential/concurrency/
This sounds quite similar to an experiment of mine - check it out...
http://tinyurl.com/3fn8w8
I'm also interested in quadtrees (which is why I'm here)... hope you figured it all out.