Java: Need optimized (fast) method for writing integer array to FileOutputStream [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Searching yields many questions about how to convert an int to byte[]. I have a project with a critical loop that writes a long int[] to a FileOutputStream. FileOutputStream requires a byte[] for writing. I can brute-force different methods; I'm looking for a way to send an int[] directly to a FileOutputStream or the fastest method to convert int[] to byte[] - something like wrapping a buffer. I see ways to wrap a byte[] to convert to int[] and float[]... but none the other way (from int[] to byte[]). Thanks.
Update: still hoping to avoid the complexity (or experimenting - for now) of memory mapped I/O until the need is proven. The comments below prompted me to look at creating a ByteBuffer, wrapping it in an IntBuffer, writing ints to the IntBuffer, then extracting a byte[] from the ByteBuffer to send to the FileOutputStream. The obvious alternative is just to use byte[] directly, which requires that I manipulate my data as bytes rather than ints, which I can do - but how much more efficient (if at all) is it compared to the byte[]/ByteBuffer/IntBuffer wrapping scheme?
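The ByteBuffer/IntBuffer wrapping scheme described in the update can be sketched roughly as follows. This is a minimal illustration, not a benchmarked solution: the class and method names (IntArrayWriter, writeInts) are made up for the example, and it assumes big-endian output (the ByteBuffer default) is acceptable.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical helper: converts an int[] to bytes in one bulk copy, then writes once.
class IntArrayWriter {
    static void writeInts(OutputStream out, int[] values) throws IOException {
        // Heap buffer backed by a byte[]; byte order is big-endian unless changed.
        ByteBuffer bytes = ByteBuffer.allocate(values.length * Integer.BYTES);
        // The IntBuffer is a view over the same storage, so this fills the byte[].
        bytes.asIntBuffer().put(values);
        // A single write call instead of one call per int.
        out.write(bytes.array(), 0, bytes.capacity());
    }
}
```

Passing a FileOutputStream as the OutputStream gives the behaviour asked about; for a critical loop the ByteBuffer could be allocated once and reused rather than created per call.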

Your bottleneck is most likely to be disk I/O, in which case what you do on the CPU hardly matters. I would first make sure you are trying to solve a problem that will make a measurable difference to your application.
If you have a fast disk subsystem and short bursts of data, the CPU can matter, and the fastest way to do the conversion is to avoid performing it in the first place, i.e. don't use a byte[] at all. An example is OpenHFT/Java Chronicle, which takes an int value and writes it directly to a memory-mapped file region as a 32-bit value. Each write then consists of a single machine-code instruction and takes about 1.5 ns on average.
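To give a feel for the memory-mapped approach, here is a minimal sketch using only the JDK's own NIO classes. It is not the Chronicle API, and the class and method names (MappedIntWriter, writeInts) are invented for the example. It maps a region large enough for the whole array and copies the ints in through an IntBuffer view, with no intermediate byte[].

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: writes an int[] through a memory-mapped region of the file.
class MappedIntWriter {
    static void writeInts(Path file, int[] values) throws IOException {
        // READ_WRITE mapping needs a channel opened for both reading and writing.
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            long size = (long) values.length * Integer.BYTES;
            // Mapping beyond the current end of the file grows the file to fit.
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            mapped.asIntBuffer().put(values); // big-endian by default
            mapped.force();                   // flush the mapped region to the storage device
        }
    }
}
```

Whether this beats a plain buffered write depends on the disk and the access pattern, so it is worth measuring before adopting it.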

Try ObjectOutputStream.writeObject(intArray). You can later read it back with ObjectInputStream.readObject().
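A round trip with Java serialization might look like the sketch below. The class name IntArraySerialization is made up, and an in-memory stream stands in for the FileOutputStream so the example is self-contained. Note that the serialized form carries a stream header and class information, so the output is not just the raw ints.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper: serializes an int[] and reads it back.
class IntArraySerialization {
    static byte[] serialize(int[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(values); // arrays of primitives are serializable
        }
        return bytes.toByteArray();
    }

    static int[] deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (int[]) in.readObject();
        }
    }
}
```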

Related

Why are Java substrings bad? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I have recently started using Java for the first time (I used to primarily use C, C++ or Assembly before this) and ran into substrings. I know that Java characters and strings take up at least double the space the character or string should take. But why are substrings bad? I have been advised by a lot of people to avoid them if possible on processing intensive platforms but Strings are used everywhere in web services which can be very processing intensive, so I am curious as to why so many people have this opinion.
This may be related to how substring() was previously implemented. In earlier Java versions, calling substring() on a long String kept the original String's data in memory, because the substring and the original shared the same internal char[]. This could cause memory problems when a small substring kept a very large char[] alive unnecessarily.
Since Java 7 update 6 (and so also in Java 8) this is no longer the case: the relevant part of the internal char[] is copied, and you can freely take substrings of even long Strings.

String.substring() making a copy of the underlying char[] value [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
A question relating to performance considerations for String.substring. Prior to Java 1.7.0_06, the String.substring() method returned a new String object that shared the same underlying char array as its parent but with a different offset and length. To avoid keeping a very large string in memory when only a small substring needed to be kept, programmers used to write code like this:
s = new String(queryReturningHugeHugeString().substring(0,3));
From 1.7.0_06 onwards, it has not been necessary to create a new String because in Oracle's implementation of String, substrings no longer share their underlying char array.
My question is: can we rely on Oracle (and other vendors) not going back to char[] sharing in some future release, and simply do s = s.substring(...), or should we explicitly create a new String just in case some future release of the JRE starts using a sharing implementation again?
The actual representation of the String is an internal implementation detail, so you can never be sure. However, according to public talks by Oracle engineers (most notably #shipilev), it's very unlikely that it will be changed back. The change was made not only to avoid possible memory leaks, but also to simplify the String internals. With simpler strings it's easier to implement many optimization techniques like String deduplication or Compact Strings.

Data types usage in java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am not new to Java, but when I write a program I always use the int type for my variables. I want to know when I need to use int, when byte, when long and so on. Can you explain this to me with examples, please?
If you are asking when to use float, double, long etc., this document can help you understand.
For example, int is 32 bits but long is 64 bits. If you need to store a value that does not fit in 32 bits, you should use long.
Good luck.
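A small sketch of the int versus long point made above: the class and method names are invented for illustration. An int addition wraps around silently when the result does not fit in 32 bits, widening to long avoids that, and Math.addExact turns the overflow into an exception.

```java
// Hypothetical demo of what happens when a value no longer fits in an int.
class IntVersusLong {
    // Plain int arithmetic wraps around silently on overflow.
    static int sumAsInt(int a, int b) {
        return a + b;
    }

    // Widening one operand to long makes the addition happen in 64 bits.
    static long sumAsLong(int a, int b) {
        return (long) a + b;
    }

    // Math.addExact throws ArithmeticException instead of wrapping.
    static int sumChecked(int a, int b) {
        return Math.addExact(a, b);
    }
}
```

For example, sumAsInt(Integer.MAX_VALUE, 1) yields Integer.MIN_VALUE, while sumAsLong(Integer.MAX_VALUE, 1) yields 2147483648L.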
It basically depends on how you want to use them and what considerations you have (size, data type, etc.). I would recommend going through Oracle's docs.
The types you use should be the product of careful thought about a variety of things, including (but not limited to) int versus long (32-bit vs 64-bit), char versus byte (user friendliness vs performance), complex data structures versus simple ones (performance), backwards compatibility (JRE version), the platforms the program is going to run on (Windows? Unix? Mac OS?), and readability of the code (sometimes writing byte x = (byte) 0xFF; char ch = (char) x; is worse than char ch = 'a';). The list goes on, and of course some of the items I mentioned fit into more than one category.
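The byte-to-char cast mentioned above is a good example of how a terse form can hide behaviour. In the sketch below (class and method names are invented), note that the literal 0xFF is the int 255, which is outside the byte range, so it needs an explicit cast before it can be assigned to a byte at all.

```java
// Hypothetical demo of the byte/char readability point.
class ByteCharCast {
    static char castThroughByte() {
        byte x = (byte) 0xFF; // without the cast this line does not compile; x is -1
        return (char) x;      // -1 is sign-extended to int, then narrowed to the char 0xFFFF
    }

    static char literal() {
        return 'a';           // says exactly what it means
    }
}
```

So the cast version does not produce the character with code 255, which is easy to miss when reading it.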
This usually comes with experience. The more you code, the more platforms you want to support, the faster you want your program to respond etc...
You should always have a plan regarding your program:
What platforms will I support?
Is the task more important than performance?
...
...
I'm not saying you should carefully consider every type you choose, I'm saying you should always make the effort to tick all the V's and be satisfied about it, accomplishing everything you wanted.

What's the best java method to keep a list of string values read from file [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I know it's a simple question but really want to see if there's another way to do it instead of using ArrayList to hold all the values. And if that's necessary, what should be the design of the java class.
Say I need 5 lists of values read from 5 files. Previously I just used 5 ArrayLists to store the 5 lists of values.
import java.util.ArrayList;

public class Values {
    ArrayList<String> o1 = new ArrayList<String>();
    ArrayList<String> o2 = new ArrayList<String>();
    ArrayList<String> o3 = new ArrayList<String>();
    ...
    public void readFromFile(ArrayList<String> listName, String filePath) {
        /* read file contents into list */
    }
}
But my problem is, each may contain more than 2000 string values. Is this an appropriate way to do so? If so, what would be a better design of it?
I think you will be fine with using ArrayLists for such a task. I have processed a large dataset of Tweets (aka Twitter Streaming data) to the tune of 5 GB and 1.5 million individual tweets. It wasn't an issue.
You can always increase your heap size if you have problems. Do realize that unless you really need to create and store so many ArrayLists, you can always clear them after intermediate processing.
java -Xms2048M -Xmx4096M YourProgramName
I think this should give you an idea of how you should design your program. The idea here is to add, process, remove. For my case, I just parsed, manipulated a tweet, cleared and moved on.
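The add, process, remove idea can be sketched as reading one line at a time and handing it to a handler, so that only the current line needs to be in memory. The class and method names (LineProcessor, processLines) are made up for this example, and the file is assumed to be UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

// Hypothetical helper: streams a file line by line instead of storing every line.
class LineProcessor {
    // Returns the number of lines handed to the handler.
    static long processLines(Path file, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file)) { // UTF-8 by default
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line); // process the line, then let it go
                count++;
            }
        }
        return count;
    }
}
```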
Given that you really need to have that data in memory, there is nothing wrong with using ArrayList for it. 5 files with 2000 strings of 80 characters each come to 5*2000*80*2 bytes = 1.6 MB of character data, plus some overhead for the 10000 String objects and the 5 ArrayList objects. In total that is on the order of 2 MB. Not a big deal.
You should change the declaration and use List instead of ArrayList, like this:
List<String> o1 = new ArrayList<String>();
This way you can switch to, for example, a LinkedList instead of the ArrayList without changing too much of your code. But as long as you don't have a specific reason to use something else, go ahead and use the ArrayList; it is the simplest solution.
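One possible shape for such a class, programming against the List interface, is sketched below. Keeping the lists in a Map keyed by name is my own assumption (the question uses separate fields), and the names NamedLists, readFromFile and get are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for several named lists of lines, each read from a file.
class NamedLists {
    // Declared as List so the concrete implementation can change in one place.
    private final Map<String, List<String>> lists = new LinkedHashMap<>();

    void readFromFile(String listName, Path file) throws IOException {
        // Files.readAllLines reads the whole file as UTF-8 text.
        lists.put(listName, new ArrayList<>(Files.readAllLines(file)));
    }

    List<String> get(String listName) {
        return lists.getOrDefault(listName, Collections.emptyList());
    }
}
```

With a couple of thousand strings per file this stays small; the map only saves declaring one field per file.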
KISS.
Unless a different solution enhances testability, maintainability, clarity and simplicity of what you're trying to accomplish, go with what you have. Writing good, clean code is much more important at the outset than writing highly optimized, fast performing code. Clean code is code that's easy to optimize later anyway.

Java mimicking assembly [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
What would be the best way to write a Java program that would simulate machine code? For example, I need to create a series of instructions such as add, subtract, increment, decrement, etc.
Let's say I'm writing the add instruction which accepts 3 parameters/registers (adding the values in the first 2 registers and storing the result in the 3rd). Is it as simple as writing a function such as:
int add(int x, int y) {
    int result;
    result = x + y;
    return result;
}
I'm also open to the possibility that I'm way off base here. Any input would be much appreciated.
If you just want to write Java code that will be more or less 1:1 with machine instructions I'd suggest you create variables for all of the registers and define methods for most of the instructions (similar to what you suggested). But this will not "restrict" what you can do the way real machine instructions do, since you can multiply the BX reg by the AX reg when the machine may not allow that.
Better would be to define a class that represents the machine state (i.e., registers and RAM) with methods on the class for all of the instructions. Then you couldn't multiply BX by AX unless there were a MUL_BX_AX method. Many methods would not have parameters (because the registers are inside the "opaque" object), but some would have parameters where the "real" instructions accept an offset or an immediate value (e.g., ADD_AX_IMMED(5)).
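A minimal sketch of such a machine-state class follows. The register names AX and BX and the ADD_AX_IMMED method come from the description above; the class name Machine, the RAM size and the remaining methods are assumptions chosen for illustration.

```java
// Hypothetical machine-state class: registers and RAM are hidden inside the object,
// and only the methods defined here are available as "instructions".
class Machine {
    private int ax;
    private int bx;
    private final int[] ram = new int[256]; // assumed size

    void MOV_AX_IMMED(int value) { ax = value; }
    void MOV_BX_IMMED(int value) { bx = value; }
    void ADD_AX_BX()             { ax += bx; }
    void ADD_AX_IMMED(int value) { ax += value; }
    void INC_BX()                { bx++; }
    void STORE_AX(int address)   { ram[address] = ax; }

    // Inspection helper for tests and debugging, not an instruction.
    int peek(int address)        { return ram[address]; }
}
```

Because there is no MUL_BX_AX method, code using this class simply cannot perform that operation, which is the restriction the answer is after.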
Added: There is the issue of branching, though, that would require some additional thought. Java doesn't have a GOTO equivalent that would fill the role very well, so initially (until you think of something better) you might have to use standard if/else logic, et al, testing "condition codes" in the machine state class.
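As a rough sketch of testing condition codes with ordinary Java control flow: the class below keeps a zero flag that the decrement instruction updates, and a do/while loop stands in for a "decrement, jump if not zero" pair. All names here are hypothetical.

```java
// Hypothetical machine with a single condition code (the zero flag).
class FlagMachine {
    private int ax;
    private boolean zeroFlag;

    void MOV_AX_IMMED(int value) { ax = value; }

    // Like many real decrement instructions, this updates the zero flag.
    void DEC_AX() {
        ax--;
        zeroFlag = (ax == 0);
    }

    boolean isZero() { return zeroFlag; }

    // Java stand-in for "loop: DEC AX; JNZ loop". Assumes start > 0.
    static int countDown(int start) {
        FlagMachine m = new FlagMachine();
        m.MOV_AX_IMMED(start);
        int iterations = 0;
        do {
            m.DEC_AX();
            iterations++;
        } while (!m.isZero()); // the "branch" tests the condition code
        return iterations;
    }
}
```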
The best way to simulate assembly would be to handle the raw bits and bytes and do the operations accordingly.
Sure you could do that, but the big thing is how to switch on the op-codes and do all the address-field calculations.
Typically, address fields can contain literal constants, global addresses, registers, offsets relative to registers, etc.
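Switching on op-codes over raw instruction words could look roughly like the sketch below. The 16-bit instruction format (4-bit opcode, 4-bit register index, 8-bit operand), the opcode numbers and the four-register machine are all invented for this example and do not correspond to any real CPU. Branching is handled by assigning to a program counter variable, which is a different technique from plain if/else and sidesteps the lack of GOTO.

```java
// Hypothetical toy CPU that decodes instruction words with shifts and masks.
class TinyCpu {
    static final int HALT = 0, LOADI = 1, ADD = 2, SUBI = 3, JNZ = 4;

    final int[] reg = new int[4];

    // Packs opcode (bits 15-12), register (bits 11-8) and operand (bits 7-0).
    static int encode(int op, int r, int operand) {
        return (op << 12) | (r << 8) | (operand & 0xFF);
    }

    void run(int[] program) {
        int pc = 0; // program counter: index of the next instruction word
        while (pc < program.length) {
            int word = program[pc++];
            int op = (word >>> 12) & 0xF;
            int r = (word >>> 8) & 0xF;
            int operand = word & 0xFF;
            switch (op) {
                case HALT:  return;
                case LOADI: reg[r] = operand; break;            // operand is a literal constant
                case ADD:   reg[r] += reg[operand]; break;      // operand is a register index
                case SUBI:  reg[r] -= operand; break;           // operand is a literal constant
                case JNZ:   if (reg[r] != 0) pc = operand; break; // operand is a jump target
                default:    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }
}
```

A real machine would need far more addressing modes in the decode step, which is where most of the work in a realistic simulator goes.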
It depends if you're simulating a simple machine or a real one.
