Code design: performance vs maintainability [closed] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
Contextualisation
I am implementing a bytecode instrumenter using the Soot framework in a testing context, and I want to know which design is better.
I am building a TraceMethod object for every method in a class that I am instrumenting, and I want to run this instrumenter on multiple classes.
Which option offers better performance (space and time)?
Option 1: (Maps)
public class TraceMethod {
    boolean[] decisionNodeList;
    boolean[] targetList;
    Map<Integer, List<Integer>> dependenciesMap;
    Map<Integer, List<Double>> decisionNodeBranchDistance;
}
Option 2: (Objects)
public class TraceMethod {
    ArrayList<Target> targets = new ArrayList<Target>();
    ArrayList<DecisionNode> decisionNodes = new ArrayList<DecisionNode>();
}

public class DecisionNode {
    int id;
    Double branchDistance;
    boolean reached;
}

public class Target {
    int id;
    boolean reached;
    List<DecisionNode> dependencies;
}
I implemented option 2 myself, but my boss suggested option 1, arguing that it is "lighter". I saw in the article "Class Object vs Hashmap" that HashMaps use more memory than objects, but I'm still not convinced that my solution (option 2) is better.
It's a small detail, but I want to be sure I am using the optimal solution. My concern is performance (space and time). I know the second option is far better in terms of maintainability, but I can sacrifice that if it is not optimal.

In general you should always go for maintainability, not for supposed performance. There are a few good reasons for this:
We tend to be fascinated by the speed difference between an array and a HashMap, but in a real enterprise application these differences are rarely big enough to produce a visible difference in application speed.
The most common bottlenecks in an application are in the database or the network.
The JVM optimizes code to some extent.
It is very unlikely that your application will have performance issues due to maintainable code. The more likely outcome is that your boss will run out of money when you have millions of lines of unmaintainable code.

Approach 1 has the potential to be much faster and to use less space.
Especially for a bytecode instrumenter, I would first implement approach 1.
Then, once it works, replace both Lists with non-generic lists that use primitive types instead of the Integer and Double objects.
Note that an int needs 4 bytes while an Integer (object) needs 16-20 bytes, depending on the machine (16 on a PC, 20 on Android).
The List can be replaced with a GrowingIntArray (I found that in a statistics package of Apache, if I remember correctly), which uses primitive ints. (Or it can simply be replaced by an int[] once you know the content can no longer change.)
Then you just write your own GrowingDoubleArray (or use double[]).
Remember that Collections are handy but slower.
Boxed objects use about four times more space than primitives.
A bytecode instrumenter needs performance; it is not software that runs once a week.
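Since the exact Apache class isn't named in the answer, here is a minimal hand-rolled sketch of the idea (class and method names are mine): a growable list backed by a primitive int[], so no Integer boxing ever occurs.

```java
// Minimal sketch of a growable primitive-int list, avoiding Integer boxing.
public class GrowingIntArray {
    private int[] data = new int[16];
    private int size = 0;

    public void add(int value) {
        if (size == data.length) {
            // grow geometrically to keep add() amortized O(1)
            data = java.util.Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    public int get(int index) {
        if (index >= size) throw new IndexOutOfBoundsException(Integer.toString(index));
        return data[index];
    }

    public int size() {
        return size;
    }

    // Once the contents are final, trim to an exact-size int[].
    public int[] toArray() {
        return java.util.Arrays.copyOf(data, size);
    }
}
```

Calling toArray() once the contents can no longer change gives exactly the compact int[] representation suggested above.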
Finally, I would not replace the Maps with non-generic ones; that seems like too much work to me. But you may try it as a last step.
As a final optimization step: look at how many elements are in your lists or maps. If it is usually fewer than about 16 (you have to measure that), you may switch to a linear search, which is the fastest for a very small number of elements.
You can even make your code switch search algorithms once the number of elements exceeds a specific threshold (Sun/Oracle Java does this, and Apple/iOS too, in some of their Collections).
However, this last step will make your code much more complex.
Space as an example:
DecisionNode: 16 bytes for the object header + 4 (id) + 20 (Double) + 4 (boolean) = 44, plus 4 bytes of padding to the next multiple of 8 = 48 bytes.

Related

Best way to think about implementing recursive methods? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 2 years ago.
So I was wondering if any of you can give me tips on this. I've been doing some challenges, like the classic one of writing a method to calculate the nth number of a Fibonacci sequence using a single recursive call (i.e., avoiding return fibo(n-1) + fibo(n-2);).
I really scratched my head on that one and ended up looking at the solution, which made use of a helper method:
public static int fibonacci(int n) {
    if (n < 2) {
        return n;
    }
    return fibonacci_helper(n, 1, 0);
}

public static int fibonacci_helper(int n, int previous, int current) {
    if (n < 1) {
        return current;
    }
    return fibonacci_helper(n - 1, current, previous + current);
}
I'm not really sure what approach one takes to solve questions like that quickly (without first solving it iteratively and translating that to a tail recursion, which takes a lot of time).
Would really appreciate some tips, thanks in advance.
You first need to decide whether the question needs a recursive solution. Typically, recursion is needed when the present solution depends on some previous (already calculated) solution.
To start, check small inputs (call them corner/base cases). Then build on them by manually dry-running the algorithm on small inputs. Once you have done this, you can in most cases figure out the recurrence relation (as here with Fibonacci). Test its validity, and then, using the base cases and the recurrence relation, write the recursion.
For example, the given code searches for a node with a particular value in a binary tree (see https://en.wikipedia.org/wiki/Binary_tree if you don't know what a binary tree is):
boolean search(Node root, int val) {
    if (root == null)          // base case 1: empty subtree
        return false;
    if (root.value == val)     // base case 2: found the value
        return true;
    // recurse into the left and right subtrees looking for the value
    return search(root.left, val) || search(root.right, val);
}
Play with it on paper, and try discover hidden computations that are redone needlessly. Then try to avoid them.
Here you have f(n) = f(n-1) + f(n-2); obviously f(n-1) = f(n-2) + f(n-3) redoes f(n-2) needlessly, etc. etc. etc.. What if you could do the two at once?
Have f2(n) return two values, for n and for (n-1); then you do (in pseudocode)
f(n) = let { (a,b) := f2(n-1) } in (a+b)
Now you have two functions, neither of them defined yet; what good does that do? Turn this f into an f2 as well, so that it returns two values, not one, just as we expect it to:
f2(n) = let { (a,b) := f2(n-1) } in (a+b,a)
And voila, a recursive definition where a is reused.
All that's left is to add some corner/edge/base case(s), and check for the off-by-1 errors.
Or even better, reverse the time arrow, start from the base case, and get your iterative version for free.
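The f2 pseudocode above translates almost directly to Java. In this sketch (class and method names are mine), the pair (a, b) is represented as a two-element array holding {fib(n), fib(n-1)}:

```java
// f2(n) returns the pair {fib(n), fib(n-1)}, so each Fibonacci value
// is computed exactly once -- no redundant recomputation of fib(n-2).
public class FibPair {
    static int[] f2(int n) {
        if (n == 1) {
            return new int[] {1, 0};      // base case: fib(1) = 1, fib(0) = 0
        }
        int[] p = f2(n - 1);              // p = {fib(n-1), fib(n-2)}
        return new int[] {p[0] + p[1], p[0]};
    }

    static int fib(int n) {
        if (n == 0) return 0;             // edge case not covered by f2
        return f2(n)[0];
    }
}
```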
Recursion is a tool which is there to help us, to make problem solving easier.
The area you're thinking of is called Dynamic Programming. The way it works is that the solution to the larger problem you're trying to solve is composed of solutions to smaller problems, and the time complexity can be reduced dramatically if you keep those solutions and reuse them, instead of calculating them multiple times. The general approach to take is to consider how the problem can be broken down, and which solutions to the smaller problems you'll need to remember in order to solve it. In this case, you could do it in linear time and linear space by keeping all the results in an array, which should be pretty easy to think of if you're looking for a DP solution. Of course that can be simplified because you don't need to keep all those numbers, but that's a separate problem.
Typically, DP solutions will be iterative rather than recursive, because you need to keep a large number of solutions available to calculate the next larger one. To change it to use recursion, you just need to figure out which solutions you need to pass on, and include those as the parameters.
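As a small sketch of the memoization idea described above (names are illustrative, not from the original): keep the recursive shape, but cache each subproblem's answer in an array so it is computed only once, giving linear time and space.

```java
import java.util.Arrays;

// Top-down dynamic programming (memoization): the recursion looks the
// same, but each subproblem is solved only once and then reused.
public class FibMemo {
    static long fib(int n) {
        long[] memo = new long[n + 1];
        Arrays.fill(memo, -1);            // -1 marks "not yet computed"
        return fib(n, memo);
    }

    private static long fib(int n, long[] memo) {
        if (n < 2) return n;              // base cases
        if (memo[n] != -1) return memo[n];// reuse the cached solution
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo);
        return memo[n];
    }
}
```

Without the memo array this would take exponential time; with it, fib(50) is instant.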

When not to use java8 streams? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I was just trying out a few code snippets and observed that a simple for loop gave me better performance than a Java 8 stream. I may have missed something in my understanding of these things; I need help understanding the difference. My code is below.
// The following takes almost 3 ms
public int[] testPerf(int[] nums, int[] index) {
    List<Integer> arrayL = new ArrayList<>();
    for (int i = 0; i < index.length; i++) {
        arrayL.add(index[i], nums[i]);
    }
    return arrayL.stream().mapToInt(i -> i).toArray();
}
// The following takes almost 1 ms
public int[] testPerf(int[] nums, int[] index) {
    List<Integer> arrayL = new ArrayList<>();
    for (int i = 0; i < index.length; i++) {
        arrayL.add(index[i], nums[i]);
    }
    int[] result = new int[index.length];
    for (int i = 0; i < index.length; i++) {
        result[i] = arrayL.get(i);
    }
    return result;
}
EDIT: START
What am I trying to test? Injecting elements of nums at indexes specified by index array to form a final result.
Input: nums = [0,1,2,3,4], index = [0,1,2,2,1]
Output: [0,4,1,3,2]
Explanation:
nums  index  target
0     0      [0]
1     1      [0,1]
2     2      [0,1,2]
3     2      [0,1,3,2]
4     1      [0,4,1,3,2]
EDIT: END
Note that I have tested these with different inputs, including arrays of 2,000+ elements, but the for-loop code still gave better performance.
Please explain what makes the other code take more time.
Also, please point me to any references where I can learn when NOT to use Java streams. (When not to overcomplicate? :) )
I use streams over loops (I use anything over loops) when I can, because
streams are usually much easier to write without bugs, even in the absence of tests, because of their declarative, uniform syntax. For the same reason they are usually much easier for a non-author to read, even without thorough commenting. Also, streams are lazy (think "pull"), while for loops are eager (think "push"), and being lazy is usually better.
Unfortunately, streams have (sometimes substantial) overhead, so you need to pay attention when writing performance-critical code, especially with collectors.
So: something like "use streams where you can and loops where you must".
The goal of streams is to make code more compact and more readable, to reduce boilerplate, and thus to simplify the developer's work. The goal of streams is not to give performance gains. The performance in a specific case can differ from for, and streams can take a bit more memory. How streams are converted to bytecode can also depend on the JVM: Oracle JDK, OpenJDK, IBM JDK, etc.
If you have strict requirements for memory and performance, there is no ready answer. You should compare your production code (not the example you show here) in both variants, with for and with streams, then choose what fits your strict requirements.
But...
But in most applications nowadays, the impact of loops, whether for or streams, on performance is very low compared to other operations like accessing a database or calling a web service over the network. Even if for were 10x faster than streams, when a database access takes 2,000 ms you would not see a real difference.
The difference in code style, on the other hand, can be essential. Developers can produce code that is more readable. When some developers on your team move to other projects and new ones join and have to extend the existing code, they will need less time to understand it. It is not that understanding for is hard :) But reading code with multiple streams feels more natural to many developers than code with for.
There are a few special cases where streams are hard to apply, such as looping over 2 or 3 collections simultaneously. In such cases streams make little sense, and for is preferable.
But in general there are no rules for when to use or not use a specific construct. You know that any while and do-while loop can be replaced with a for loop. So why does Java have different types of loops? Because some of them express the logic of particular cases better than others. The same goes for streams. If you and your team don't feel comfortable with streams, you don't have to use them. Give them a try, use them for 3 months, then discuss within the team and decide whether every developer on your project should use them as much as possible or should avoid them.
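As an illustration of that special case (the names here are made up): pairing up two lists in lockstep is trivial with an index-based for loop, while the JDK's streams have no built-in zip operation.

```java
import java.util.ArrayList;
import java.util.List;

// Iterating two lists in lockstep: natural with an index-based for loop,
// awkward with streams (the JDK has no zip).
public class ZipExample {
    static List<String> pairUp(List<String> names, List<Integer> scores) {
        List<String> out = new ArrayList<>();
        int n = Math.min(names.size(), scores.size());
        for (int i = 0; i < n; i++) {
            out.add(names.get(i) + "=" + scores.get(i));
        }
        return out;
    }
}
```

The closest stream-based equivalent, IntStream.range(0, n).mapToObj(...), is really the same index loop in disguise.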

How slow/fast is String concatenation in Java relative to other compiled languages? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I'm just getting started with Java, and while reading through this guide I noticed the following snippet describing a recent update to the JUnit framework.
We can now write assertion messages in a lambda in JUnit 5, allowing
the lazy evaluation to skip complex message construction until needed:
@Test
public void shouldFailBecauseTheNumbersAreNotEqual_lazyEvaluation() {
    Assertions.assertTrue(
        2 == 3,
        () -> "Numbers " + 2 + " and " + 3 + " are not equal!");
}
As someone new to Java, this feels like a lot of machinery just to get around string concatenation.
Is evaluating strings in Java really that slow (relative to other languages)? How does it compare to other compiled languages like C, Go, etc.?
The point is: there is no lazy string formatting in Java.
Meaning that in languages like C you might see things such as:
#define debug_print...
(see some real-world examples here)
The idea is to define a macro to which you pass a complicated string, and the compiler makes sure that code is only generated for situations that actually need that string to be present.
Meaning: when using debug_print(), the complicated string concatenation that might be required to build the message passed to the macro only happens when the message is really needed. The message is concatenated lazily.
In Java, we traditionally have no way to express that. You always have to write
if (someCondition) {
    // then pull together that large string
}
which isn't nice, especially when doing it for tracing. Some people in our group write that code, and it is just overly annoying that each and every trace statement has that leading if statement. It renders your whole code much less readable.
Therefore: this is not at all about the cost of string concats. It is about only spending the required CPU cycles if that string is truly needed.
And to answer the actual question: in the end, when that code gets invoked often enough, the JIT will turn it into carefully optimized machine code anyway. Thus: the actual string concat is not the issue.
In other words: you don't think of "performance" for Java in terms of source code. What matters is what happens at runtime, by the JIT.
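The JUnit lambda in the question is exactly this lazy-message pattern, and since Java 8 the standard library uses the same trick: java.util.logging's Logger has Supplier<String> overloads. A minimal sketch (logger name and helper are illustrative):

```java
import java.util.logging.Logger;

// Supplier-based logging: the message lambda runs only if the log level
// is actually enabled, so no leading `if` is needed at the call site.
public class LazyLogging {
    private static final Logger LOG = Logger.getLogger(LazyLogging.class.getName());

    static String expensiveDump() {
        // imagine a large, costly string being built here
        return "state: ...";
    }

    public static void main(String[] args) {
        // eager: expensiveDump() would run even if FINE is disabled
        // LOG.fine("dump " + expensiveDump());

        // lazy: the Supplier is only invoked if FINE is actually logged
        LOG.fine(() -> "dump " + expensiveDump());
    }
}
```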
Here's the bottom line. Starting out in Java, don't worry about minor performance issues like string concatenation. It may be a small issue in a large application server where lots of string concatenation is done but the results are not used; an example is logging, where the log level causes the event to be ignored. Also, Java uses a StringBuilder to concatenate a series of literals separated by the "+" operator, which is reasonably performant.

Java library to calculate the relative difference between two Strings? [duplicate]

This question already has answers here:
Fuzzy string search library in Java [closed]
(8 answers)
Closed 9 years ago.
I'm looking for a way to programmatically detect the delta ratio between two strings. I could use string length, but that doesn't give much useful information for like-sized but different inputs. There is a Java diff tool on Google Code, Java Diff Utils, but it hasn't been updated since 2011, and I don't need to actually modify the strings themselves.
I'm attempting to do change detection with threshold values, for instance: "Updated string is 42% different from the existing string; are you sure you want to proceed?"
Does anyone know of a library that could be used for this, or is java-diff-utils my only option? I couldn't find much in Apache Commons, and Googling returns irrelevant information.
You could use the Levenshtein distance to calculate how different two strings are. There's some fairly complex math behind it, but the actual code is rather short; you can easily rewrite the code in that wiki article in Java.
The difference is measured as an integer: how many steps it takes to turn one string into the other, where a step is a character insertion, deletion, or replacement. It tells you the number of steps, but not which steps or in which order. Since you only want to measure the total difference, that should be enough information for your needs.
Edit: one of the commenters (kaos) provided a link to an implementation of Levenshtein distance in Apache Commons.
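A minimal sketch of the suggestion above (class and method names are mine): the classic two-row dynamic-programming Levenshtein distance, plus a ratio helper matching the questioner's "42% different" use case.

```java
// Levenshtein distance via two-row dynamic programming, plus a
// normalized difference ratio (0.0 = identical, 1.0 = fully different).
public class StringDelta {
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;  // distance from ""
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1,   // insertion
                                            prev[j] + 1),      // deletion
                                   prev[j - 1] + cost);        // replacement
            }
            int[] tmp = prev; prev = curr; curr = tmp;         // roll the rows
        }
        return prev[b.length()];
    }

    static double differenceRatio(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 0.0 : (double) levenshtein(a, b) / max;
    }
}
```

A differenceRatio above your threshold (e.g. 0.42) would then trigger the "are you sure?" prompt.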

Someone told me that it is saving memory to use numbers directly instead of static final int fields, is that true?

In my Android project, there are many constants representing bundle extra keys, Handler message arguments, dialog ids, and so on.
Someone on my team uses plain numbers for this, like:
handler.sendMessage(handler.obtainMessage(MESSAGE_OK, 1, 0));
handler.sendMessage(handler.obtainMessage(MESSAGE_OK, 2, 0));
handler.sendMessage(handler.obtainMessage(MESSAGE_OK, 3, 0));
in the handler:
switch (msg.arg1) {
    case 1:
        break;
    case 2:
        break;
    case 3:
        break;
}
He said too many static final constants cost a lot of memory, but I think his solution makes the code hard to read and refactor.
I have read this question and Googled a lot, but failed to find an answer:
java: is using a final static int = 1 better than just a normal 1?
I hope someone can show me the memory cost of static finals.
Sorry for my poor English.
You shouldn't bother changing them to literals; it will make your code less readable and less maintainable.
In the long run you will benefit from this "loss" of memory.
Technically, he is right: static int fields do cost some additional memory.
However, the cost is negligible. It's an int, plus the associated metadata for reflection support. The benefit of using meaningful names that make your code more readable, and that ensure the semantics of the number are well known and consistent everywhere it is used, clearly outweighs that cost.
You can do a simple test: write a small application that calls handler.sendMessage 1000 times with different number literals, build it, and note the size of the .dex file. Then replace the 1000 literals with 1000 static int constants and do the same. Comparing the two sizes will give you an idea of the order of magnitude of additional memory your app will need. (And just for completeness, post the numbers here as a comment. :-))
It saves a very small amount of memory - basically just the extra metadata required to record the extra constant in the relevant class and refer to it from other classes.
It is NOT worth worrying about this, unless you are extremely memory constrained.
Using well-named static final constants rather than mysterious magic numbers is much better for your code maintainability and sanity in the long run.
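One concrete point worth adding: javac inlines static final primitives initialized with constant expressions into each use site, so a switch written with named constants compiles to essentially the same bytecode as one written with bare literals. A sketch (constant and method names are illustrative):

```java
// Compile-time constants: javac copies the value of a `static final int`
// initialized with a constant expression directly into the bytecode of
// every use site, so the readability of named constants is essentially free.
public class Messages {
    static final int ARG_START  = 1;
    static final int ARG_STOP   = 2;
    static final int ARG_RESUME = 3;

    static String describe(int arg) {
        switch (arg) {
            case ARG_START:  return "start";
            case ARG_STOP:   return "stop";
            case ARG_RESUME: return "resume";
            default:         return "unknown";
        }
    }
}
```

The only residual cost is the field metadata in the declaring class, which is exactly the negligible overhead the answers above describe.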
