Java: re-initializing an object with new, performance and memory

I want to know the drawbacks of the code below, which uses new to reinitialize a variable every time an object with a different value is needed.
List<Value> valueList = new ArrayList<>();
Value value = new Value();
value.setData("1");
valueList.add(value);
value = new Value();
value.setData("2");
valueList.add(value);
value = new Value();
value.setData("3");
valueList.add(value);
Alternatively, a method could be added that returns a Value object, like this:
private Value getData(String input){
Value value = new Value();
value.setData(input);
return value;
}
List<Value> valueList = new ArrayList<>();
valueList.add(getData("1"));
valueList.add(getData("2"));
valueList.add(getData("3"));
Code-wise, the second approach looks better to me.
Please suggest the best approach in terms of memory and performance.

Both options create 3 objects and add them to a list. There is no difference in memory. Performance doesn't matter here. If this code is executed often enough to "matter", the JIT will most likely inline those method calls anyway. If the JIT decides it is not important enough to inline, then we are talking about nanoseconds anyway.
Thus: focus on writing clean code that gets the job done in a straightforward way.
From that perspective, I would suggest that you rather have a constructor that takes that data; and then you can write:
List<Value> values = Arrays.asList(new Value("1"), new Value("2"), new Value("3"));
Long story short: performance is a luxury problem. Meaning: you only worry about performance when your tests/customers complain about "things taking too long".
Before that, you worry about creating a good, sound OO design and writing down a correct implementation. It is much easier to fix a specific performance problem within a well-built application than to get "quality" into a code base that was driven by thoughts like those we find in your question.
Please note: that of course implies that you are aware of typical "performance pitfalls" which should be avoided. So: an experienced Java programmer knows how to implement things in an efficient way.
But you as a newbie: you only focus on writing correct, human readable programs. Keep in mind that your CPU does billions of cycles per second - thus performance is simply OK by default. You only have to worry when you are doing things on very large scale.
Finally: option 2 is in fact "better" - because it reduces the amount of code duplication.
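To make the comparison concrete, here is a minimal sketch that runs the question's first option next to the constructor-based version suggested above. The Value class is a hypothetical stand-in (the question never shows it), and the String constructor is an assumption; both methods end up with three distinct Value objects in a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the question's Value class. The String
// constructor is an assumption; the question only shows setData().
class Value {
    private String data;

    Value() { }

    Value(String data) { this.data = data; }

    void setData(String data) { this.data = data; }

    String getData() { return data; }
}

class ValueListDemo {
    // Option 1 from the question: reassign one variable with new each time.
    static List<Value> withReassignment() {
        List<Value> values = new ArrayList<>();
        Value value = new Value();
        value.setData("1");
        values.add(value);
        value = new Value();   // a fresh object; the first one stays in the list
        value.setData("2");
        values.add(value);
        value = new Value();
        value.setData("3");
        values.add(value);
        return values;
    }

    // Constructor-based variant: same three allocations, less repetition.
    static List<Value> withConstructor() {
        return Arrays.asList(new Value("1"), new Value("2"), new Value("3"));
    }

    public static void main(String[] args) {
        for (Value v : withReassignment()) System.out.println(v.getData());
        for (Value v : withConstructor()) System.out.println(v.getData());
    }
}
```

Either way, three Value instances are allocated; the only thing that changes is how much code you have to read.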

In both cases, you create 3 instances of Value that are stored in a List.
There is no significant difference in memory consumption.
The second one nevertheless produces cleaner code: you don't reuse the same variable, and the variable has a limited scope.
You effectively have a factory method that does the job and returns the object for you.
So client code just has to "consume" it.
An alternative is a method with a varargs parameter :
private List<Value> getData(String... input){
// check not null before
List<Value> values = new ArrayList<>();
for (String s : input){
Value value = new Value();
value.setData(s);
values.add(value);
}
return values;
}
List<Value> values = getData("1","2","3");

There is no difference in memory footprint, and there's little difference in performance, because method invocations are very inexpensive.
The second form of your code is a better-looking version of the first form of your code, with less code repetition. Other than that, the two are equivalent.
You can shorten your code by using streams:
// assumes Value has a constructor that takes a String
List<Value> values = Arrays.asList("1", "2", "3").stream()
.map(Value::new)
.collect(Collectors.toList());

Every time you call the new operator to create an object, it allocates space for this object on the heap. It doesn't matter whether you use the 1st or the 2nd approach; the objects are allocated on the heap in the same way.
What you do need to understand, though, is the life cycle of each object you create, and terms like Dependency, Aggregation, Association and Full Composition.

Related

Might introducing an intermediate list cause performance overhead?

List<UserData> dataList = new ArrayList<>();
List<UserData> dataList1 = dataRepository.findAllByProcessType(ProcessType.OUT);
List<UserData> dataList2 = dataRepository.findAllByProcessType(ProcessType.CORPORATE_OUT);
dataList.addAll(dataList1);
dataList.addAll(dataList2);
return dataList;
vs
List<UserData> dataList = new ArrayList<>();
dataList.addAll(dataRepository.findAllByProcessType(ProcessType.OUT));
dataList.addAll(dataRepository.findAllByProcessType(ProcessType.CORPORATE_OUT));
return dataList;
Will the first implementation cause any performance overhead (i.e. more garbage / memory allocation than the second one)?
P.S. - Yes, it can be optimised to use one round trip to the db, as mentioned by @Tim. But that's not the answer I am looking for. In general, I want to know whether this type of implementation causes overhead or not, because it helps with debugging.
I'm going to say no, on the basis that I would be very surprised if the two code blocks produce different bytecode.
The first code does not "introduce an intermediate list". All it does is create new variables to reference lists that were created by the dataRepository call. I would expect the compiler to simply optimise those variables out.
Those lists are also created in the second code example, so there's no real difference.
Knowing that the compiler performs these sorts of optimisations frees us as programmers to write code that is well laid-out, clear, and maintainable, whilst still remaining confident that it will perform well.
The other consideration is debugging. In the first code block, it is easy to set breakpoints on the variable declaration lines, and inspect the values of the variables. Those simple operations become a pain when code is implemented in the second code block.
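A small sketch of the point that a local variable only names an existing list and does not copy it. The findAll method is a hypothetical in-memory stand-in for dataRepository.findAllByProcessType, and plain strings stand in for UserData; both forms call it twice and build the same result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntermediateVariableDemo {
    // Hypothetical stand-in for dataRepository.findAllByProcessType(...).
    static List<String> findAll(String processType) {
        return new ArrayList<>(Arrays.asList(processType + "-1", processType + "-2"));
    }

    // First form: locals give the returned lists a name, nothing is copied.
    static List<String> withLocals() {
        List<String> dataList = new ArrayList<>();
        List<String> dataList1 = findAll("OUT");
        List<String> dataList2 = findAll("CORPORATE_OUT");
        dataList.addAll(dataList1);
        dataList.addAll(dataList2);
        return dataList;
    }

    // Second form: the same two lists are created, just without names.
    static List<String> inline() {
        List<String> dataList = new ArrayList<>();
        dataList.addAll(findAll("OUT"));
        dataList.addAll(findAll("CORPORATE_OUT"));
        return dataList;
    }

    public static void main(String[] args) {
        System.out.println(withLocals());
        System.out.println(inline());
    }
}
```

In both methods exactly three lists are allocated: the two returned by findAll and the result list.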
As the addAll() method should just be referencing the same data, both of your versions should perform about the same. But, the best thing to do here is to avoid the two unnecessary roundtrips to your database, and just use a single query:
List<ProcessType> types = Arrays.asList(ProcessType.OUT, ProcessType.CORPORATE_OUT);
List<UserData> dataList = dataRepository.findAllByProcessTypeIn(types);

Java software design - Looping, object creation VS modifying variables. Memory, performance & reliability comparison

Let's say we are trying to build a document scanner class in java that takes 1 input argument, the log path(eg. C:\document\text1.txt). Which of the following implementations would you prefer based on performance/memory/modularity?
ArrayList<String> fileListArray = new ArrayList<String>();
fileListArray.add("C:\\document\\text1.txt");
fileListArray.add("C:\\document\\text2.txt");
.
.
.
//Implementation A
for(int i =0, j = fileListArray.size(); i < j; i++){
MyDocumentScanner ds = new MyDocumentScanner(fileListArray.get(i));
ds.scanDocument();
ds.resultOutput();
}
//Implementation B
MyDocumentScanner ds = new MyDocumentScanner();
for(int i=0, j=fileListArray.size(); i < j; i++){
ds.setDocPath(fileListArray.get(i));
ds.scanDocument();
ds.resultOutput();
}
Personally I would prefer A due to its encapsulation, but it seems like more memory usage due to creation of multiple instances. I'm curious if there is an answer to this, or it is another "that depends on the situation/circumstances" dilemma?
Although this is obviously opinion-based, I will try an answer to tell my opinion.
Your approach A is far better. Your document scanner obviously handles a file. That should be set at construction time and be saved in an instance field. So every method can refer to this field. Moreover, the constructor can do some checks on the file reference (null check, existence, ...).
Your approach B has two very serious disadvantages:
After constructing a document scanner, clients could easily call all of the methods. If no file was set before, you must handle that "illegal state" with maybe an IllegalStateException. Thus, this approach increases code and complexity of that class.
There seems to be a series of method calls that a client should or can perform. It's easy to call the file setting method again in the middle of such a series with a completely other file, breaking the whole scan facility. To avoid this, your setter (for the file) should remember whether a file was already set. And that nearly automatically leads to approach A.
Regarding the creation of objects: Modern JVMs are really very fast at creating objects. Usually, there is no measurable performance overhead for that. The processing time (here: the scan) usually is much higher.
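The two designs can be sketched roughly as follows. Everything here is hypothetical: the class names follow the question, ReusableDocumentScanner is a name made up for approach B, and the scan itself is replaced by a placeholder string, since the original does not show the implementation.

```java
import java.util.Objects;

// Approach A: the path is fixed at construction time and validated once.
class MyDocumentScanner {
    private final String docPath;

    MyDocumentScanner(String docPath) {
        this.docPath = Objects.requireNonNull(docPath, "docPath must not be null");
    }

    String scanDocument() {
        // Placeholder for the real scan; no state check is needed here.
        return "scanned " + docPath;
    }
}

// Approach B: the path is set later, so every method must guard against
// being called before setDocPath().
class ReusableDocumentScanner {
    private String docPath;

    void setDocPath(String docPath) {
        this.docPath = docPath;
    }

    String scanDocument() {
        if (docPath == null) {
            throw new IllegalStateException("setDocPath() must be called first");
        }
        return "scanned " + docPath;
    }
}

class ScannerDemo {
    public static void main(String[] args) {
        System.out.println(new MyDocumentScanner("C:\\document\\text1.txt").scanDocument());
        try {
            new ReusableDocumentScanner().scanDocument();
        } catch (IllegalStateException e) {
            System.out.println("approach B needs a state check: " + e.getMessage());
        }
    }
}
```

With approach A an instance can never exist without a path, so the "illegal state" branch simply disappears.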
If you don't need multiple instances of DocumentScanner to co-exist, I see no point in creating a new instance in each iteration of the loop. It just creates work for the garbage collector, which has to free each of those instances.
If the length of the array is small, it doesn't make much difference which implementation you choose, but for large arrays, implementation B is more efficient, both in terms of memory (fewer instances created that the GC hasn't freed yet) and CPU (less work for the GC).
Are you implementing DocumentScanner or using an existing class?
If the latter, and it was designed for being able to parse multiple documents in a row, you can just reuse the object as in variant B.
However, if you are designing DocumentScanner, I would recommend to design it such that it handles a single document and does not even have a setDocPath method. This leads to less mutable state in that class and thus makes its design much easier. Also using an instance of the class becomes less error-prone.
As for performance, there won't be a measurable difference unless instantiating a DocumentScanner is doing a lot of work (like instantiating many other objects, too). Instantiating and freeing objects in Java is pretty cheap if they are used only for a short time due to the generational garbage collector.

The "Why" behind PMD's StringInstantiation rule

Along the lines of an existing thread, The “Why” behind PMD's rules, I'm trying to figure out the meaning of one particular PMD rule : String and StringBuffer Rules.StringInstantiation.
This rule states that you shouldn't explicitly instantiate String objects. As per their manual page :
Avoid instantiating String objects; this is usually unnecessary since
they are immutable and can be safely shared.
This rule is defined by the following Java
class:net.sourceforge.pmd.lang.java.rule.strings.StringInstantiationRule
Example(s):
private String bar = new String("bar"); // just do a String bar = "bar";
http://pmd.sourceforge.net/pmd-5.0.1/rules/java/strings.html
I don't see how this syntax is a problem, other than it being pointless. Does it affect overall performance?
Thanks for any thought.
With String foo = "foo" there will be one instance of "foo" in the string pool (this is referred to as string interning). The pool lived in PermGen space up to Java 6; since Java 7 it is part of the main heap. If you were to later write String bar = "foo", there would still be only one "foo" in the pool.
Writing String foo = new String( "foo" ) will additionally create a separate String object on the heap.
Thus, the rule is there to prevent wasting memory.
It shouldn't usually affect performance in any measurable way, but:
private String bar = new String("bar"); // just do a String bar = "bar";
If you execute this line a million times you will have created a million objects
private String bar = "bar"; // just do a String bar = "bar";
If you execute this line a million times you will have created one Object.
There are scenarios where that actually makes a difference.
Does it affect overall performance?
Well, performance and maintenance. Doing something which is pointless makes the reader wonder why the code is there in the first place. When that pointless operation also involves creating new objects (two in this case - a new char[] and a new String) that's another reason to avoid it...
In the past, there has been a reason to call new String(existingString) if the existing string was originally obtained as a small substring of a longer string - or other ways of obtaining a string backed by a large character array. I believe that this is not the case with more recent implementations of Java, but obviously you can still be using an old one. This shouldn't be a problem for constant strings anyway, mind you.
(You could argue that creating a new object allows you to synchronize on it. I would avoid synchronizing on strings to start with, though.)
One difference is the memory footprint:
String a = "abc"; //one object
String b = "abc"; //same object (i.e. a == b) => still one object in memory
String c = new String("abc"); // This is a new object - now 2 objects in memory
To be honest, the only reason I can think of, why one would use the String constructor is in combination with substring, which is a view on the original string. Using the String constructor in that case helps getting rid of the original string if it is not needed any longer.
However, since java 7u6, this is not the case any more so I don't see any reasons to use it any more.
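The object counts above can be checked with reference comparison. This is a minimal sketch; sameObject is a hypothetical helper that only wraps ==.

```java
class StringIdentityDemo {
    // True only when both references point at the very same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = "abc";              // same pooled literal as a
        String c = new String("abc");  // explicitly allocated, separate object

        System.out.println(sameObject(a, b));          // true
        System.out.println(sameObject(a, c));          // false
        System.out.println(a.equals(c));               // true
        System.out.println(sameObject(a, c.intern())); // true
    }
}
```

The contents are equal in all cases; only the explicit constructor call produces a second object.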
It can be useful, because it creates a new identity, and sometimes object identities are important/crucial to an application. For example, it can be used as an internal sentinel value. There are other valid use cases too, e.g. to avoid constant expression.
If a beginner writes such code, it's very likely a mistake. But that is a very short learning period. It is highly unlikely that any moderately experienced Java programmer would write that by mistake; it must be for a specific purpose. File it under "it looks like a stupid mistake, but it takes efforts to make, so it's probably intended".
It is pointless, confusing, and slightly slower.
You should try to write the simplest, clearest code you can. Adding pointless code is bad all round.

What will use more memory

I am working on improving the performance of my app. I am confused about which of the following will use more memory. Here sb is a StringBuffer:
String strWithLink = sb.toString();
clickHereTextview.setText(
Html.fromHtml(strWithLink.substring(0,strWithLink.indexOf("+"))));
OR
clickHereTextview.setText(
Html.fromHtml(sb.toString().substring(0,sb.toString().indexOf("+"))));
In terms of memory an expression such as
sb.toString().indexOf("+")
has little to no impact as the string will be garbage collected right after evaluation. (To avoid even the temporary memory usage, I would recommend doing
sb.indexOf("+")
instead though.)
However, there's a potential leak involved when you use String.substring. Last time I checked, substring basically returns a view of the original string, so the original string stays resident in memory.
The workaround is to do
String strWithLink = sb.toString();
... new String(strWithLink.substring(0,strWithLink.indexOf("+"))) ...
    ^^^^^^^^^^
to detach the wanted string, from the original (potentially large) string. Same applies for String.split as discussed over here:
Java String.split memory leak?
The second will use more memory, because each call to StringBuilder#toString() creates a new String instance.
http://www.docjar.com/html/api/java/lang/StringBuilder.java.html
Analysis
If we look at StringBuilder's OpenJDK sources:
public String toString() {
// Create a copy, don't share the array
return new String(value, 0, count);
}
We see that it instantiates a whole new String object. These Strings are ordinary heap objects, not entries in the string pool, so you allocate as many new instances as the number of times you call sb.toString().
Outcome
Use String strWithLink = sb.toString(); and reuse that variable, so that toString() is called once and only one String is allocated.
Check other people's answers, the second one does take a little bit more memory, but this sounds like you are over optimizing. Keeping your code clear and readable should be the priority. I'd suggest you don't worry so much about such tiny optimizations if readability will suffer.
The less work you do, the more efficient it usually is. In this case, you don't need to call toString at all
clickHereTextview.setText(Html.fromHtml(sb.substring(0, sb.indexOf("+"))));
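Here is a rough, Android-free sketch of the two ways to get the text before the "+". The method names are made up for illustration; StringBuffer.indexOf(String) and StringBuffer.substring(int, int) are standard methods. Note that both variants throw StringIndexOutOfBoundsException if the buffer contains no "+", because indexOf then returns -1.

```java
class SubstringDemo {
    // Variant from the question: call toString() once and reuse the result.
    static String viaToString(StringBuffer sb) {
        String s = sb.toString();
        return s.substring(0, s.indexOf("+"));
    }

    // Variant from the answer above: work on the buffer directly,
    // so no intermediate full-length String is created.
    static String direct(StringBuffer sb) {
        return sb.substring(0, sb.indexOf("+"));
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("Click here+ignored");
        System.out.println(viaToString(sb)); // Click here
        System.out.println(direct(sb));      // Click here
    }
}
```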
Creating new objects always takes up more memory. However, in your case the difference seems insignificant.
Also, in your case, you are creating a local variable; the variable itself is only a reference, while the String it refers to takes heap space.
Whenever the string is referenced in more than one place in your method, it is good to use
String strWithLink = sb.toString();, as you can use the same strWithLink everywhere. Otherwise, if there is only one reference, it's fine to just use sb.toString() directly.

Java: Different between two ways when using new Object

For example, if you want to reverse a string, there are two ways:
first:
String a = "StackOverFlow";
a = new StringBuffer(a).reverse().toString();
and second is:
String a = "StackOverFlow";
StringBuffer b = new StringBuffer(a);
a = b.reverse().toString();
About the code above, I have two questions:
1) In the first snippet, does Java create a "dummy" StringBuffer object in memory before doing the reverse and converting to String?
2) Is the first snippet more optimized than the second because it lets the GC work more effectively? (This is the main question I want to ask.)
Both snippets will create the same number of objects. The only difference is the number of local variables. This probably won't even change how many values are on the stack etc - it's just that in the case of the second version, there's a name for one of the stack slots (b).
It's very important that you differentiate between objects and variables. It's also important to write the most readable code you can first, rather than trying to micro-optimize. Once you've got clear, working code you should measure to see whether it's fast enough to meet your requirements. If it isn't, you should profile it to work out where you can make changes most effectively, and optimize that section, then remeasure, etc.
The first way will create a very real, not at all a "dummy object" for the StringBuffer.
Unless there are other references to b below the last line of your code, the optimizer has enough information to let the environment garbage-collect b as soon as it's done with toString.
The fact that there is no variable for b does not make the object created by new less real. The compiler will probably optimize both snippets into identical bytecode, too.
StringBuffer b is not a dummy object; it is a reference, basically just a pointer, that resides on the stack and is very small memory-wise. So not only does it make no difference in performance (GC has nothing to do with this example), but the Java compiler will probably remove it altogether (unless it's used in other places in the code).
In answer to your first question, yes, Java will create a StringBuffer object. It works pretty much the way you think it does.
To your second question, I'm pretty sure that the Java compiler will take care of that for you. The compiler is not without its faults but I think in a simple example like this it will optimize the byte code.
Just a tip though, in Java Strings are immutable. This means they cannot be changed. So when you assign a new value to a String Java will carve out a piece of memory, put the new String value in it, and redirect the variable to the new memory space. After that the garbage collector should come by and clear out the old string.
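As a quick illustration of both points (same objects either way, and immutable Strings), here is a minimal sketch. The method names are invented; each method allocates exactly one StringBuffer and one result String.

```java
class ReverseDemo {
    // First form from the question: the StringBuffer has no name,
    // but it is still a real object.
    static String reverseInline(String s) {
        return new StringBuffer(s).reverse().toString();
    }

    // Second form: the same StringBuffer, now referenced by a local variable.
    static String reverseWithLocal(String s) {
        StringBuffer b = new StringBuffer(s);
        return b.reverse().toString();
    }

    public static void main(String[] args) {
        String a = "StackOverFlow";
        String original = a;        // second reference to the same String
        a = reverseInline(a);       // a now points at a new String
        System.out.println(a);        // wolFrevOkcatS
        System.out.println(original); // StackOverFlow, the old String is unchanged
    }
}
```

Reassigning a does not modify the original String; it only makes the variable point at the newly created one.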
