I am currently working on a 2D Game that uses LWJGL, but I have stumbled across some serious performance issues.
When I render more than ~100 sprites, the window freezes for a very short time. I did some tests and found the following:
The problem occurs with Vsync both enabled and disabled
The problem occurs even if I cap the frames at 60
The program is not just rendering fewer frames for a short time; the rendering seems to actually pause
There are no other operations, like matrix calculations, slowing the program down
I have already implemented batch rendering, but it does not seem to improve performance
The frequency of the freezes increases with the number of sprites
My graphics card driver is up to date
The problem occurs although the framerate seems quite acceptable: with 100 sprites rendered at the same time I get ~1500 fps, with 1000 sprites ~200 fps
I use a very basic shader; the transformation matrices are passed to the shader via uniform variables on each render call (once per sprite per frame), so the size of the CPU/GPU bus shouldn't be an issue.
I have found a very similar issue here, but none of the suggested solutions work for me.
This is my first question here, please let me know if I am missing some important information.
It's probably GC.
Java is sadly not the best language for games because of GC and the lack of any structures that can be allocated on the stack. Among similar languages, C# is often a better choice thanks to many more tools for controlling memory, like stackalloc and structs in general.
So when writing a game in a language with GC, you should make sure your game loop does not allocate too many objects; in other languages people often aim for zero or near-zero allocations in the loop.
You can create object pools for your entities/sprites, so instead of allocating new ones you reuse existing ones.
And if it's a simple 2D game, then just avoiding allocations where there is no need for them should be enough (like passing two ints instead of an object holding a location on the 2D map).
And you should use a profiler to confirm which changes are worth it.
There are also trickier solutions, like using off-heap, manually allocated memory to store some data without object overhead, but I don't think a simple game will need such solutions. Typical game-dev techniques like pooling and avoiding unneeded objects should be enough.
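A minimal sketch of the pooling idea mentioned above; the Sprite and SpritePool names are illustrative stand-ins, not from any particular engine. The point is that steady-state frames borrow and return objects instead of allocating, so the GC has nothing to collect:

```java
import java.util.ArrayDeque;

// Illustrative object pool: Sprite is a stand-in for whatever per-frame
// object the game loop would otherwise allocate fresh each time.
final class Sprite {
    float x, y;
    Sprite reset(float x, float y) { this.x = x; this.y = y; return this; }
}

final class SpritePool {
    private final ArrayDeque<Sprite> free = new ArrayDeque<>();

    // Reuse a pooled instance if one is available; allocate only on a cold pool.
    Sprite acquire(float x, float y) {
        Sprite s = free.poll();
        if (s == null) s = new Sprite();
        return s.reset(x, y);
    }

    // Return the instance so later frames reuse it instead of allocating.
    void release(Sprite s) { free.push(s); }
}
```

Once the pool is warm, acquire/release pairs allocate nothing, which is exactly the near-zero-allocation loop described above.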
I'm developing a Live Wallpaper using OpenGL ES 3.0. I've set up according to the excellent tutorial at http://www.learnopengles.com/how-to-use-opengl-es-2-in-an-android-live-wallpaper/, adapting GLSurfaceView and using it inside the Live Wallpaper.
I have a decent knowledge of OpenGL/GLSL best practices, and I've set up a simple rendering pipeline where the draw loop is as tight as possible. No re-allocations, using one static VBO for non-changing data, a dynamic VBO for updates, using only one draw call, no branching in the shaders et cetera. I usually get very good performance, but at seemingly random but reoccurring times, the framerate drops.
Profiling with the on-screen bars gives me intervals where the yellow bar ("waiting for commands to complete") shoots away and takes everything above the critical 60fps threshold.
I've read all the resources on profiling and interpreting those numbers that I could get my hands on, including the nice in-depth SO question here. However, the main takeaway from that question seems to be that the yellow bar indicates time spent waiting for blocking operations to complete, and for frame dependencies. I don't believe I have any of those; I just draw everything at every frame. No reading.
My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.
Here are some details that may or may not have impact:
I'm rendering on demand, onOffsetsChanged is the trigger (render when dirty).
There is one single texture (created and bound only once), 1024x1024 RGBA. Replacing the one texture2D call with a plain vec4 seems to help remove some of the framerate drops. Reducing the texture size to 512x512 does nothing for performance.
The shaders are not complex, and as stated before, contain no branching.
There is not much data in the scene. There are only ~300 vertices and the one texture.
A systrace shows no suspicious methods - the GL related methods such as buffer population and state calls are not on top of the list.
Update:
As an experiment, I tried to render only every other frame, not requesting a render on every onOffsetsChanged (swipe left/right). This was horrible for the look and feel, but got rid of the yellow lag spikes almost completely. This seems to tell me that 60 render requests per second is too much, but I can't figure out why.
My question is broad - but I'd like to know what things can cause this type of framerate drop, and how to move forward in pinning down the issue.
(1) Accumulation of render state. Make sure you "glClear" the color/depth/stencil buffers before you start each render pass (although if you are rendering directly to the window surface this is unlikely to be the problem, as state is guaranteed to be cleared every frame unless you set EGL_BUFFER_PRESERVE).
(2) Buffer/texture ghosting. Rendering is deeply pipelined, but OpenGL ES tries to present a synchronous programming abstraction. If you try to write to a buffer (SubBuffer update, SubTexture update, MapBuffer, etc) which is still "pending" use in a GPU operation still queued in the pipeline then you either have to block and wait, or you force a copy of that resource to be created. This copy process can be "really expensive" for large resources.
(3) Device DVFS (dynamic voltage and frequency scaling) can be quite sensitive on some devices, especially for content which happens to sit just around a level decision point between two frequencies. If the GPU or CPU frequency drops then you may well get a spike in the amount of time a frame takes to process. For debug purposes some devices provide a means to fix the frequency via sysfs - although there is no standard mechanism.
(4) Thermal limitations - most modern mobile devices can produce more heat than they can dissipate if everything is running at high frequency, so the maximum performance point cannot be sustained. If your content is particularly heavy then you may find that thermal management kicks in after a "while" (1-10 minutes depending on device, in my experience) and forcefully drops the frequency until thermal levels drop within safe margins. This shows up as somewhat random increases in frame processing time, and is normally unpredictable once a device hits the "warm" state.
If it is possible to share an API sequence which reproduces the issue it would be easier to provide more targeted advice - the question is really rather general and OpenGL ES is a very wide API ;)
Hello, I am using JDK 7 on a machine with 6 GB of RAM and an Intel Core i5 processor. I have Java code with an ArrayList of more than 6000 elements, each containing 12 double values. The processing speed has decreased very much, and it now takes around 20 minutes to run the entire code.
What the code does is as follows:
There are some 4500 iterations happening in nested for loops, and in each iteration a file of 400 KB is read, some processing happens, and some values are stored in the ArrayList.
Once the ArrayList is ready, its values are written to another file through CSVWriter. I also use a JTable that refers to some values in that ArrayList, so I basically can't clear it.
I have given the values for heap memory as follows
-Xms1024M -Xmx4096M
I am new to programming and rather confused as to what I should do. Can I increase the heap size beyond this? Will that help? My senior suggests using hard disk memory for storing or processing the ArrayList elements, but I doubt that is possible.
Please help. Any help will be appreciated.
12 doubles × 8 bytes × 6000 elements ≈ 576 KB, which is not a significant amount of memory.
If your program's speed is getting slower each time until it eventually crashes with an OutOfMemoryError, then it's possible that you have a coding error that is causing a memory leak.
This question has some examples of memory leak causes in Java.
Using VisualVM or some manual logging will help to identify the issue. Static code analysers like FindBugs or PMD may also help.
Heap size isn't your issue. When you have one of those, you'll see an OutOfMemoryError.
Usually what you do when you encounter performance issues like this is you profile, either with something like VisualVM or by hand, using System.nanoTime() to track which part of your code is the bottleneck. From there, you make sure you're using appropriate data structures, algorithms, etc., and then see where you can parallelize your code.
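A minimal sketch of the by-hand approach with System.nanoTime(); the SectionTimer class is made up for illustration. Wrap each suspect section (file reading, processing, CSV writing) and log the result to see which one dominates the 20 minutes:

```java
// Hand-rolled timing sketch: call begin() before a suspect section and
// endMillis() after it, then log or print the elapsed time.
final class SectionTimer {
    private long start;
    void begin() { start = System.nanoTime(); }
    long endMillis() { return (System.nanoTime() - start) / 1_000_000; }
}
```

Timing each candidate section separately narrows the bottleneck down before you reach for a full profiler.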
I guess you're leaking the JTables somehow. This can easily happen with listeners, TableSorters, etc. A proper tool would tell you, but the better way is IMHO to decompose the problem.
Either the GUI part is what makes trouble, or it isn't. Ideally, the remaining program should be completely independent of the GUI, so you can run it in isolation and see what happens.
My senior suggests using hard disk memory for storing or processing the ArrayList elements, but I doubt that is possible.
Many things are possible but few make sense. If you're really storing just 12 × 6000 doubles, then it makes absolutely no sense. Even with the high overhead of Double, it's just a few megabytes.
Another idea: If all the data you need fits into memory, then you can try to read it all upfront. This ensures you're not storing multiple copies.
I have written a game app with bitmaps moving around the screen. It employs a separate thread which writes directly to a canvas. On my Samsung Galaxy Y the animations seems smooth throughout the game, however on a "Tabtech m7" tablet the smooth graphics appear to be interrupted by intermittent freezes of about half a second duration, and spaced about three or four seconds apart. Is it possible that it is just a feature of the (cheap) tablet hardware, or is it more likely that it's some aspect of my programming? And if it's me, how could I go about diagnosing the cause?
Have a look in your log to see if the garbage collector is running at approximately the times you get the freezes. If so, you could try to find out whether it's you or the system that is allocating memory in an inappropriate way.
In DDMS you can have a look at the Allocation Tracker, which could possibly tell you what's going on.
Yes, echoing erbsman. To avoid GC, make sure you're not allocating any new objects in your game loop. Also, GCs can be kicked off if you do a lot of string conversions (i.e., updating the score), like Integer.toString(10) kind of stuff.
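One hedged sketch of the score-string point: instead of calling Integer.toString every frame (which allocates a new String each time), reuse a single StringBuilder. The ScoreText class is illustrative, not from the question's code:

```java
// Reuse one StringBuilder so that formatting the score each frame
// produces no new garbage after warm-up.
final class ScoreText {
    private final StringBuilder buf = new StringBuilder(16);

    CharSequence format(int score) {
        buf.setLength(0);                       // reuse the same backing array
        buf.append("Score: ").append(score);
        return buf;                             // draw this CharSequence directly
    }
}
```

Many Android drawing APIs accept a CharSequence, so the intermediate String can often be skipped entirely.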
I am profiling an application that is suddenly using a lot of memory, and I am getting this:
sun.java2d.loops.ProcessPath$Point
as being allocated almost 11,000,000 times.
What is it, and is there a solution to this?
My initial response would be to question whether this is actually using a lot of memory/CPU cycles. The sun.* packages are internal implementations of Sun's JVM, so they're likely to be low-level details of what your code is doing. If these objects are taking up a vast amount of memory that might be an issue, but simply seeing 11 million allocations is no indication that anything is out of the ordinary.
Edit: a little Googling seems to show that this is an object used to encode a reference to a particular point on a 2D plane. Chances are that if you're doing anything that involves graphics then yes, you'd have a large number of them generated. Additionally, each one only stores two integers (x and y coordinates) and a boolean, so they are going to be very small objects in the grand scheme of things. Even if none of those 11 million allocations were garbage collected (and I expect the majority were local variables that have been quickly collected), they're not going to account for a large part of the heap unless you're running on devices with tiny amounts of RAM.
In other words, look elsewhere for your problem. It would probably be more helpful to look at objects that are taking up a large amount of the current heap space, or even look at the number of objects currently referenced, in order to find your leak. Read documents giving guidelines on how to find and quash memory leaks with your tool(s) of choice. Looking at total allocations is rarely that useful, unless you know for a given class how many there should be (e.g. it can be good to check that singletons are only created once, for example).
I solved the memory problem. I was doing some nasty reference handling some places in my code.
I have been working on a childish little program: there are a bunch of little circles on the screen, of different colors and sizes. When a larger circle encounters a smaller circle it eats the smaller circle, and when a circle has eaten enough other circles it reproduces. It's kind of neat!
However, the way I have it implemented, the process of detecting nearby circles and checking them for edibility is done with a for loop that cycles through the entire living population of circles, which takes longer and longer as the population spikes to around 3000 before it starts to drop. The process doesn't slow my computer down; I can go off and play Dawn of War or whatever without any slowdown. It's just the process of checking every circle to see if it has collided with every other circle.
So what occurred to me, is that I could try to separate the application window into four quadrants, and have the circles in the quadrants do their checks simultaneously, since they would have almost no chance of interfering with each other: or something to that effect!
My question, then, is: how does one make for loops that run side by side? In Java, say.
The problem you have here can actually be solved without threads.
What you need is a spatial data structure. A quadtree would be best, or, if the field in which the circles move is fixed (I assume it is), you could use a simple grid. Here's the idea:
Divide the display area into a square grid where each cell is at least as big as the diameter of your biggest circle. For each cell, keep a list (a linked list is best) of all the circles whose center is in that cell. Then, during the collision detection step, go through each cell and check each circle in that cell against all the other circles in that cell and the surrounding cells.
Technically you don't have to check all the cells around each one, as some of them will have already been checked.
You can combine this technique with multithreading to get even better performance.
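A rough sketch of that grid in Java; the Grid class and its {x, y, radius} representation are illustrative assumptions, not the asker's code. It avoids double-counting by checking pairs within a cell plus the four "forward" neighbours, so every adjacent cell pair is visited exactly once:

```java
import java.util.ArrayList;
import java.util.List;

// Uniform grid for circle collision detection. Cell size is assumed to be
// at least the diameter of the biggest circle, so only a cell and its
// immediate neighbours can contain colliding circles.
final class Grid {
    final int cols, rows;
    final float cell;
    final List<float[]>[] buckets; // each circle stored as {x, y, radius}

    @SuppressWarnings("unchecked")
    Grid(float width, float height, float cellSize) {
        cell = cellSize;
        cols = (int) Math.ceil(width / cellSize);
        rows = (int) Math.ceil(height / cellSize);
        buckets = new List[cols * rows];
        for (int i = 0; i < buckets.length; i++) buckets[i] = new ArrayList<>();
    }

    // Bucket a circle by the cell containing its center.
    void add(float x, float y, float r) {
        int cx = Math.min(cols - 1, (int) (x / cell));
        int cy = Math.min(rows - 1, (int) (y / cell));
        buckets[cy * cols + cx].add(new float[] { x, y, r });
    }

    static boolean overlaps(float[] a, float[] b) {
        float dx = a[0] - b[0], dy = a[1] - b[1], r = a[2] + b[2];
        return dx * dx + dy * dy < r * r;
    }

    // Count colliding pairs: within-cell pairs, plus pairs against the four
    // "forward" neighbours so each adjacent cell pair is checked only once.
    int countCollisions() {
        int hits = 0;
        int[][] fwd = { { 1, 0 }, { -1, 1 }, { 0, 1 }, { 1, 1 } };
        for (int cy = 0; cy < rows; cy++) {
            for (int cx = 0; cx < cols; cx++) {
                List<float[]> here = buckets[cy * cols + cx];
                for (int i = 0; i < here.size(); i++)
                    for (int j = i + 1; j < here.size(); j++)
                        if (overlaps(here.get(i), here.get(j))) hits++;
                for (int[] d : fwd) {
                    int nx = cx + d[0], ny = cy + d[1];
                    if (nx < 0 || nx >= cols || ny >= rows) continue;
                    for (float[] a : here)
                        for (float[] b : buckets[ny * cols + nx])
                            if (overlaps(a, b)) hits++;
                }
            }
        }
        return hits;
    }
}
```

Rebuilding the buckets each frame is cheap compared with the N^2 all-pairs check, and the per-cell work parallelizes naturally if threads are added later.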
Computers can usually execute only one instruction at a time per CPU or core.
However, as you have noticed, your operating system (and other programs) appear to run many tasks at the same time.
This is accomplished by splitting the work into processes, and each process can further implement concurrency by spawning threads. The operating system then switches between each process and thread very quickly to give the illusion of multitasking.
In your situation, your Java program is a single process, and you would need to create 4 threads, each running its own loop. It can get tricky, because threads need to synchronize their access to shared variables, to prevent one thread editing a variable while another thread is trying to access it.
Because threading is a complex subject, it would take far more explaining than I can do here.
However, you can read Sun's excellent tutorial on concurrency, which covers everything you need to know:
http://java.sun.com/docs/books/tutorial/essential/concurrency/
What you're looking for is not a way to have these run simultaneously (as people have noted, this depends on how many cores you have, and can only offer a 2x or maybe 4x speedup), but instead to somehow cut down on the number of collisions you have to detect.
You should look into using a quadtree. In brief, you recursively break down your 2D region into four quadrants (as needed), and then only need to detect collisions between objects in nearby components. In good cases, it can effectively reduce your collision detection time from N^2 to N * log N.
Instead of trying to do parallel processing, you may want to look into collision detection optimization, because in many situations performing fewer calculations in one thread is better than distributing the calculations among multiple threads; plus, it's easy to shoot yourself in the foot in this multithreading business. Try googling "collision detection algorithm" and see where it gets you ;)
If your computer has multiple processors or multiple cores, then you could easily run multiple threads and run smaller parts of the loop in each thread. Many PCs these days do have multiple cores, so have each thread handle 1/nth of the loop count and create n threads.
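A small sketch of that 1/nth split with plain Java threads; the sum-of-squares body is just a stand-in for whatever per-iteration work the real loop does, and the class name is illustrative:

```java
import java.util.concurrent.atomic.AtomicLong;

// Split a loop of `count` iterations across n threads, each taking a
// contiguous 1/nth slice. Each thread accumulates into a local variable
// and merges once at the end, so there is almost no contention.
final class ParallelLoop {
    static long sumSquares(int count, int nThreads) {
        AtomicLong total = new AtomicLong();
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int from = t * count / nThreads;       // slice start (inclusive)
            final int to = (t + 1) * count / nThreads;   // slice end (exclusive)
            workers[t] = new Thread(() -> {
                long local = 0;                          // thread-local accumulator
                for (int i = from; i < to; i++) local += (long) i * i;
                total.addAndGet(local);                  // one synchronized merge per thread
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();           // wait for every slice to finish
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return total.get();
    }
}
```

The result is the same regardless of the thread count, which is the property to check when splitting any loop this way.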
If you really want to get into concurrent programming, you need to learn how to use threads.
Sun has a tutorial for programming Java threads here:
http://java.sun.com/docs/books/tutorial/essential/concurrency/
This sounds quite similar to an experiment of mine - check it out...
http://tinyurl.com/3fn8w8
I'm also interested in quadtrees (which is why I'm here)... hope you figured it all out.