How to initialize huge float arrays in Java (Android)?

I am creating an OpenGL Android application and trying to render an OpenGL object with more than 50,000 vertices.
float itemVerts [] = {
// f 231/242/231 132/142/132 131/141/131
0.172233487787643f, -0.0717437751698985f, 0.228589675538813f,
0.176742968653347f, -0.0680393472738536f, 0.2284149434494f,
0.167979223684599f, -0.0670168837233226f, 0.24286384937854f,
// f 131/141/131 230/240/230 231/242/231
0.167979223684599f, -0.0670168837233226f, 0.24286384937854f,
0.166391290343292f, -0.0686544011752973f, 0.241920432968569f,......
and many more. But when I do this in a function or constructor, I get an error while compiling: "The code of method () is exceeding the 65535 bytes limit". So I was wondering if there is a different way to do this.
I tried storing the values in a file and reading them back, but the IO operation, with string parsing of such a huge record, is very slow; it takes more than 60 seconds, which is not good.
Please let me know if there is any other way to do this. Thank you for your time and help.

But when I do this in a function or constructor I get an error while compiling that the code of method () is exceeding the 65535 bytes limit. So I was wondering if there is a different way to do this.
Put it outside the constructor, as a class variable or field. If the data doesn't change, just make it a constant. If it does change, make it a constant anyway and copy it in the constructor.
I tried storing the values in a file and reading them back, but the IO operation, with string parsing of such a huge record, is very slow; it takes more than 60 seconds, which is not good.
If you do decide to keep it in an external file and read it in, don't read it as a string, just serialize it somehow (Java serialization, Protocol Buffers, etc.).

The program doesn't have to parse the floats if we preprocess the data.
Write another program that writes all the floats to a binary file using DataOutputStream.
In your program, read them back using DataInputStream. You might want to chain it with a BufferedInputStream.
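A minimal sketch of that idea, assuming a simple format of a count followed by raw floats (the class and method names are mine, not from the answer):

import java.io.*;

public class VertexIO {
    // Preprocessing tool (run once, off-device): write each float as
    // 4 raw bytes instead of text, with the count up front.
    static void writeVerts(float[] verts, File out) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(out)))) {
            dos.writeInt(verts.length);
            for (float v : verts) dos.writeFloat(v);
        }
    }

    // At app startup: read the count, then the floats; no string parsing.
    static float[] readVerts(InputStream in) throws IOException {
        try (DataInputStream dis = new DataInputStream(new BufferedInputStream(in))) {
            float[] verts = new float[dis.readInt()];
            for (int i = 0; i < verts.length; i++) {
                verts[i] = dis.readFloat();
            }
            return verts;
        }
    }
}

On Android the binary file can live in the assets folder and be opened with getAssets().open(...), which ties in with the next answer.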

For cases like this I normally use the assets folder to store files in binary format (you can even define a file format that includes the vertices, normals, etc.) and allocate everything on application initialization, as wannik explains.

I would preprocess the floats and store them in binary form, then mmap the file as a byte buffer and create a float view of it. This way you get the float data without parsing or extra allocation of space.
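A sketch of the mmap variant, assuming the same preprocessed binary file; the ByteOrder just has to match whatever wrote the file:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class VertexMapper {
    static FloatBuffer mapVerts(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.order(ByteOrder.LITTLE_ENDIAN); // must match the writer's byte order
            return buf.asFloatBuffer();         // float view of the mapping; no copy, no parsing
        }
    }
}

GLES methods such as glVertexAttribPointer accept a Buffer directly, so you may not need a float[] at all; if you do, one bulk get() into a pre-sized array copies it out.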

Related

From InputStream to List<String>: why is Java allocating space twice in the JVM?

I am currently trying to process a large txt file (a bit less than 2 GB) containing lines of strings.
I am loading all of its content from an InputStream into a List<String>. I do that via the following snippet:
try(BufferedReader reader = new BufferedReader(new InputStreamReader(zipInputStream))) {
List<String> data = reader.lines()
.collect(Collectors.toList());
}
The problem is, the file itself is less than 2 GB, but when I look at the memory, the JVM is allocating twice the size of the file:
Also, here are the heaviest objects in memory:
So what I understand is that Java is allocating twice the memory needed for the operation: once to put the content of the file in a byte array, and once more to instantiate the string list.
My question is: can we optimize that and avoid needing twice the memory size?
tl;dr String objects can take 2 bytes per character.
The long answer: conceptually a String is a sequence of char. Each char represents one code point (or half of one, but we can ignore that detail for now).
Each code point tends to represent a character (sometimes multiple code points make up one "character", but that's another detail we can ignore for this answer).
That means that if you read a 2 GB text file that was stored with a single-byte encoding (usually a member of the ISO-8859-* family) or variable-byte encoding (mostly UTF-8), then the size in memory in Java can easily be twice the size on disk.
Now there are a good amount of caveats to this, primarily that Java can (as an internal, invisible operation) use a single byte for each character in a String if and only if the characters used allow it (effectively, if they fit into the fixed internal encoding that the JVM picked for it). But that didn't seem to happen for you.
What can you do to avoid that? That depends on what your use-case is:
Don't use String to store the data in the first place. Odds are that this data is actually representing some structure, and if you parse it into a dedicated format, you might get away with way less memory usage.
Don't keep the whole thing in memory: more often than not, you don't actually need everything in memory at once. Instead, process and write away the data as you read it, thus never having more than a handful of records in memory at once (see the sketch after this list).
Build your own string-like data type for your specific use-case. While building a full string replacement is a massive undertaking, if you know what subset of features you need it might actually be a quite surmountable challenge.
Try to make sure that the data is stored as compact strings, if possible, by figuring out why that's not already happening (this requires digging deep into the details of your JVM).
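As a concrete illustration of the second option, a minimal sketch that handles each line as it arrives instead of collecting them (processLine is a stand-in for whatever you do with a record):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class StreamingReader {
    static void consume(InputStream zipInputStream) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(zipInputStream, StandardCharsets.UTF_8))) {
            // Each line becomes garbage right after it is handled, so the
            // working set stays at a handful of lines instead of ~2 GB.
            reader.lines().forEach(StreamingReader::processLine);
        }
    }

    private static void processLine(String line) {
        // stand-in: parse, aggregate, or write out the record here
    }
}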

Performance and size limitations of HttpServletResponse.getOutputStream().print(String) vs getWriter().write(String)

For a web project I'm writing large sections of text to a webpage (table), or even bigger amounts (several MB) to CSV files for download.
The Java method dealing with this receives a StringBuilder content string, which originally (by the creator of this module) was being sent char by char in a loop:
response.getOutputStream().write(content.charAt(i));
Upon questioning the loop, the reason given was that he thought the string might be too big to write in one go (this is on Java 1.6).
I can't find any size restrictions anywhere, and then there is also the question of which method to use instead: print() or getWriter()?
The data in the string is all text.
He assumed wrong. If anything, writing one character at a time is inefficient, or at least pointless. If you have a String in memory, you can write it out in one go without worrying.
If you're only writing text, use a Writer. OutputStream is for binary data (although you can wrap it in an OutputStreamWriter to convert between the two). See Writer or OutputStream?
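A sketch of the whole-string write through a Writer; content is the StringBuilder from the question, and the content type is only an example:

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServletResponse;

public class CsvDownload {
    static void send(HttpServletResponse response, StringBuilder content) throws IOException {
        response.setContentType("text/csv");
        response.setCharacterEncoding("UTF-8");
        PrintWriter out = response.getWriter();
        out.write(content.toString()); // one call; the container buffers and chunks as needed
    }
}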

Efficient way to parse a datagram in Java

Right now I am using a socket and a datagram packet. This program is for a LAN and sends at least 30 packets a second, at 500 bytes maximum.
This is how I am receiving my data:
payload = new String(incomingPacket.getData(), incomingPacket.getOffset(), incomingPacket.getLength(), "UTF-8");
Currently I am using no offset, and I parse through the message one character at a time. Right now I use the first 2 characters to determine what type of message it is (though that is subject to change), then I break out the variables, with an exclamation mark separating them to tell me where the next variable begins. At the end I parse it all and apply it to my program. Is there a faster way to break down and interpret datagram packets? Will there be a performance difference if I put the lengths of the variables in the offset? Maybe an example would be useful. Also, I think my variables are too small to benefit from StringBuilder, so I use normal concatenation.
What you are talking about here is setting up your own protocol for communication. While I cover this in the fourth part of my socket tutorial (I'm currently working on part 3, non-blocking sockets), I can explain some things here already.
There are several ways of setting up such a protocol, depending on your needs.
One way of doing it is having a byte in front of each piece of data declaring its size in bytes. That way, you know the length of the byte array containing the next variable's value, which makes it easy to read out whole variables in one go via the System.arraycopy method. This is a method I've used before. If the object being sent is always the same, this is all you need to do: write the variables in a standardized order, prefix each value with its size, and you're done.
If you have to send multiple types of objects through the stream, you might want to add a bit of metadata that tells what kind of object is being sent and the order of its variables. This metadata can be put in a header that you add before the actual message. Once again, the values in the header are preceded by the size byte.
In my tutorial I'll write up a complete code example.
Don't use a String at all. Just process the underlying byte array directly: scan it for delimiters, counts, what have you. You can use a DataInputStream wrapped around a ByteArrayInputStream wrapped around the byte array if you want an API oriented to Java primitive data types.
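For illustration, a sketch combining both answers, assuming a hypothetical wire format of a 2-byte message type followed by length-prefixed fields:

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.net.DatagramPacket;

public class PacketParser {
    static void handle(DatagramPacket incomingPacket) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(
                incomingPacket.getData(), incomingPacket.getOffset(), incomingPacket.getLength()));
        int type = in.readUnsignedShort();                  // 2-byte message type
        while (in.available() > 0) {
            byte[] field = new byte[in.readUnsignedByte()]; // 1-byte size prefix
            in.readFully(field);                            // bulk read; no delimiter scanning
            // interpret 'field' according to 'type' ...
        }
    }
}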

Best data structure for storing dynamically sized blocks from file input in Java

I'm working on a Java program where I'm reading from a file in dynamic, unknown blocks; that is, each block of data will not always be the same size, and the size is determined as the data is being read. For I/O I'm using a MappedByteBuffer (the file inputs are on the order of MB).
My goal:
Find an efficient way to store each complete block during the input phase so that I can process it.
My constraints:
I am reading one byte at a time from the buffer
My processing method takes a primitive byte array as input
Each block gets processed before the next block is read
What I've tried:
I played around with dynamic structures like Lists, but they don't expose primitive backing arrays, and the conversion time to a primitive array concerns me
I also thought about using a String to store each block and then getBytes() to get the byte[], but it's very slow
Reading the file multiple times in order to find each block's size first, and then grabbing the relevant bytes
I am trying to find an approach that doesn't defeat the purpose of fast I/O. Any advice would be greatly appreciated.
Additional Info:
I'm using a rolling hash to decide where blocks should end
Here's a bit of pseudo-code:
circular_buffer[] = read first 128 bytes
rolling_hash = hash(circular_buffer[])
block_storage = ???   // this is the data structure I'd like to use
while file has more bytes:
    b = next byte
    add b to block_storage
    add b to next index in circular_buffer (wrapping to the front when the end is reached)
    shift rolling_hash one byte to the right
    if hash has a certain characteristic:
        process block_storage as a byte[]   // should contain the entire block of data
As you can see, I'm reading one byte at a time and storing/overwriting that one byte repeatedly. However, once I get to the processing stage, I want to be able to access all of the info in the block. There is no predetermined maximum block size either, so I can't pre-allocate.
It seems to me that you require a dynamically growing buffer. You can use the built-in ByteArrayOutputStream to achieve that: it will automatically grow to store all data written to it. You can use write(int b) and toByteArray() to realize add b to block_storage and process block_storage as a byte[].
But take care: this stream will grow unbounded. You should implement some sanity checks around it to avoid using up all memory (e.g. count the bytes written to it and break by throwing an exception when the count exceeds a reasonable amount). Also make sure to close the stream and throw away the reference to it after consuming each block, to allow the GC to free up memory.
Edit: as @marcman pointed out, the buffer can instead be reset().
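Mapped onto the pseudo-code above, a sketch; hashHasBoundary and process are stand-ins for the rolling-hash test and the processing method:

import java.io.ByteArrayOutputStream;
import java.nio.MappedByteBuffer;

public class BlockReader {
    static void readBlocks(MappedByteBuffer in) {
        ByteArrayOutputStream blockStorage = new ByteArrayOutputStream(8192);
        while (in.hasRemaining()) {
            byte b = in.get();
            blockStorage.write(b);                   // "add b to block_storage"
            // ... update the circular buffer and rolling hash with b here ...
            if (hashHasBoundary()) {                 // "if hash has a certain characteristic"
                process(blockStorage.toByteArray()); // the whole block as a primitive byte[]
                blockStorage.reset();                // reuses the backing array for the next block
            }
        }
    }

    private static boolean hashHasBoundary() { return false; } // stand-in
    private static void process(byte[] block) { }              // stand-in
}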

Java Strings: how memory works with immutable Strings

I have a simple question.
byte[] responseData = ...;
String str = new String(responseData);
String withKey = "{\"Abcd\":" + str + "}";
In the above code, are these three lines taking 3x the memory? For example, if responseData is 1 MB, then line 2 will take an extra 1 MB in memory, and line 3 will take an extra 1 MB + xx. Is this true? If not, how does it work? If yes, what is the optimal way to fix this? Will StringBuffer help here?
Yes, that sounds about right. Probably even more, because your 1 MB byte array needs to be turned into UTF-16: depending on the input encoding, the String may be even bigger (2 MB if the input was ASCII).
Note that the garbage collector can reclaim memory as soon as the variables that use it go out of scope. You could set them to null as early as possible to help it make this as timely as possible (for example responseData = null; after you constructed your String).
if yes, then what is the optimal way to fix this
"Fix" implies a problem. If you have enough memory there is no problem.
the problem is that I am getting OutOfMemoryException as the byte[] data coming from server is quite big,
If you don't, you have to think about a better alternative to keeping a 1MB string in memory. Maybe you can stream the data off a file? Or work on the byte array directly? What kind of data is this?
The problem is that I am getting OutOfMemoryException as the byte[] data coming from server is quite big, thats why I need to figure it out first that am I doing something wrong ....
Yes. Well, basically your fundamental problem is that you are trying to hold the entire string in memory at one time. This is always going to fail for a sufficiently large string, even if you code it in the most memory-efficient fashion possible (and that would be complicated in itself).
The ultimate solution (i.e. the one that "scales") is to do one of the following:
stream the data to the file system, or
process it in such a way that you don't ever need the entire "string" to be represented.
You asked if StringBuffer will help. It might help a bit, provided that you use it correctly. The trick is to make sure that you preallocate the StringBuffer (actually, a StringBuilder is better!) to be big enough to hold all of the characters required, then copy data into it using a charset decoder (directly or using a Reader pipeline).
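A sketch of that decoder pipeline, pre-sizing a StringBuilder and streaming the bytes through a Reader so the only large allocations are the builder and the final String (the chunk size and pre-sizing heuristic are mine):

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

public class JsonWrapper {
    static String wrap(byte[] responseData) throws IOException {
        // Pre-size for the payload plus the wrapper to avoid repeated grow-and-copy.
        StringBuilder sb = new StringBuilder(responseData.length + 16);
        sb.append("{\"Abcd\":");
        Reader r = new InputStreamReader(new ByteArrayInputStream(responseData), "UTF-8");
        char[] chunk = new char[8192];
        for (int n; (n = r.read(chunk)) != -1; ) {
            sb.append(chunk, 0, n);
        }
        return sb.append('}').toString();
    }
}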
But even with optimal coding, you are likely to need a peak of 3 times the size of your input byte[].
Note that your OOME problem is probably nothing to do with GC or storage leaks. It is actually about the fundamental space requirements of the data types you are using ... and the fact that Java does not offer a "string of bytes" data type.
There is no OutOfMemoryException in my API docs. If it's OutOfMemoryError, especially on the server side, you definitely have a problem.
When you receive big requests from clients, those String-related statements are not the first problem. Reducing 3x to 1x is not the solution.
I'm sorry I can't help without any further code.
Use back-end storage
You should not store the whole request body in a byte[]. You can store it directly in any back-end storage, such as a local file, a remote database, or cloud storage. I would copy the stream from the request to the back-end storage with a small chunked buffer.
Use streams
If you can, use streams, not objects. I would do something like:
response.getWriter().write("{\"Abcd\":");
copy <your back-end stored data as a stream>
response.getWriter().write("}");
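A sketch of that streaming approach, assuming the body was spooled to a local file (bodyFile is a hypothetical path); only one small char buffer is ever in memory:

import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.servlet.http.HttpServletResponse;

public class StreamedJson {
    static void send(HttpServletResponse response, Path bodyFile) throws IOException {
        PrintWriter out = response.getWriter();
        out.write("{\"Abcd\":");
        try (Reader in = Files.newBufferedReader(bodyFile, StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
        }
        out.write("}");
    }
}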
Yes, if you use a StringBuffer for the code you have, you would save 1 MB of heap space in the last step. However, considering the size of your data, I recommend an external-memory algorithm, where you bring only part of your data into memory, process it, and put it back into storage.
As others have mentioned, you should really try not to have such a big object in your mobile app, and streaming should be your best solution.
That said, there are some techniques to reduce the amount of memory your app is using now:
Remove byte[] responseData entirely if possible, so the memory it uses can be released ASAP (assuming it is not used anywhere else).
Create the largest String first, and then substring() it. Android uses Apache Harmony for its standard Java library implementation, and if you check its String class you'll see that substring() is implemented simply by creating a new String object with the proper start and end offsets into the original data; no duplicate copy is created. So doing the following cuts the overall memory consumption by at least 1/3:
String withKey = new StringBuilder().append("{\"Abcd\":").append(str).append("}").toString();
String str = withKey.substring("{\"Abcd\":".length(), withKey.length() - "}".length());
Never ever use something like "{\"Abcd\":" + str + "}" for large Strings. Under the hood, "string_a" + "string_b" is implemented as new StringBuilder().append("string_a").append("string_b").toString(), so implicitly you are creating two (or at least one, if the compiler is smart) StringBuilders. For large Strings it's better to take over this process yourself, as you have deep domain knowledge about your program that the compiler doesn't, and you know how best to manipulate the strings.
