FindBugs and comparing - Java

I recently started using the FindBugs static analysis tool in a Java build I was doing. The first report came back with loads of High Priority warnings. Being the obsessive type, I was ready to go knock them all out. However, I must be missing something: I get most of the warnings when comparing things, such as in the following code:
public void setSpacesPerLevel(int value)
{
    if (value >= 0)
    {
        spacesPerLevel = value;
    }
    else
    {
        spacesPerLevel = 0;
    }
}
This produces a high priority warning at the if statement that reads:
File: Indenter.java, Line: 60, Type: BIT_AND_ZZ, Priority: High, Category: CORRECTNESS
Check to see if ((...) & 0) == 0 in sample.Indenter.setSpacesPerLevel(int)
I am comparing an int to an int, which seems like a common thing to do. I get quite a few errors of that type on similarly simple comparisons.
I have a lot of other high priority warnings on what appear to be simple code blocks. Am I missing something here? I realize that static analysis can produce false positives, but the cases I am seeing seem too trivial to be false positives.
This one has me scratching my head as well.
for (int spaces = 0; spaces < spacesPerLevel; spaces++)
{
    result = result.concat(" ");
}
Which gives the following findbugs warning:
File: Indenter.java, Line: 160, Type: IL_INFINITE_LOOP, Priority: High, Category: CORRECTNESS
There is an apparent infinite loop in sample.Indenter.indent()
This loop doesn't seem to have a way to terminate (other than by perhaps throwing an exception).
Any ideas?
So basically I have a handful of files and 50-60 high priority warnings similar to the ones above. I am using FindBugs 1.3.9 and calling it from the FindBugs Ant task.
UPDATE:
I have this build being executed by a hudson server and had the code being instrumented by Clover for code coverage. When I turned that off, all of my high priority warnings disappeared. That makes sense now. Thanks for the feedback.

A side note:
for (int spaces = 0; spaces < spacesPerLevel; spaces++)
{
    result = result.concat(" ");
}
If result is a java.lang.String, this may be inefficient, as you do the following steps for each space character:
create a new char[] to hold the result of the concatenation
create a new java.lang.String instance that is wrapped around the character array
If you do this repeatedly, especially when result is already long, this takes a lot of time.
If performance (both time and memory) is important for that method, you should consider using a StringBuilder (not thread-safe) or a StringBuffer (thread-safe).
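As a minimal sketch of the StringBuilder approach, the indentation can be built in one buffer and converted to a String once at the end. The class name IndentSketch and the indent(level, spacesPerLevel) signature are made up for illustration; the asker's real Indenter class is not shown in the question.

```java
// Sketch only: IndentSketch and indent(...) are illustrative names,
// not the asker's actual Indenter class.
final class IndentSketch {

    // Returns level * spacesPerLevel spaces; negative inputs are treated as 0.
    static String indent(int level, int spacesPerLevel) {
        int count = Math.max(0, level) * Math.max(0, spacesPerLevel);
        // Pre-size the builder so the backing array is allocated once.
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(' ');
        }
        // A single String is created here, instead of one per appended space.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + indent(2, 4) + "]");
    }
}
```

StringBuilder is enough here because the buffer never leaves the method, so the synchronization of StringBuffer buys nothing.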

Are you running FindBugs through the Eclipse plugin, Ant, or the GUI? Is it possible that the code hasn't been recompiled since you ran it (before making changes)?
If setSpacesPerLevel isn't too long, post the output of
javap -v TheClassThatContainssetSpacerPerLevel
As for the second bug, you'd have to show the whole loop before anyone could say whether it is a problem.

Aparapi data types

I have the following code for studying.
My calculate function produces unexpected results when it runs on Aparapi.
Is there a problem with my code, or with Aparapi?
The results are:
Result Num Expected
2026982348 406816880 40681688012
2026982516 406816881 40681688180
2026982594 406816882 40681688258
2026982662 406816883 40681688326
2026982830 406816884 40681688494
2026982898 406816885 40681688562
2026982966 406816886 40681688630
2026983044 406816887 40681688708
2026983212 406816888 40681688876
2026983280 406816889 40681688944
2026983338 406816890 40681689002
2026983506 406816891 40681689170
2026983584 406816892 40681689248
2026983652 406816893 40681689316
2026983820 406816894 40681689484
2026983888 406816895 40681689552
2026983956 406816896 40681689620
2026984134 406816897 40681689798
2026984202 406816898 40681689866
2026984270 406816899 40681689934
Edit: If I set the execution mode to JTP or CPU, I get correct results (result == expected), but in GPU mode there is a problem. I'm using a late 2013 MacBook Pro Retina with Windows 10.
Edit 2: The return line of my calculate method causes the problem. If I return Long.MAX_VALUE, it works. But (long) tc * 100 (or ((long) tc) * 100) does not give the expected value (e.g. 40681688900).
I think you should review your code against the Aparapi Java Kernel Guidelines, paying special attention to the Other restrictions and Beware Of Side Effects sections.
Remember to keep your code as simple as you can.
Looking at your code, the calculate method makes wide use of the modulus (%) operator. I would suggest logging each calculation so you can compare what you get in JTP mode with what you get in GPU mode, to find out whether there is an issue with this operator.
EDIT:
In your calculate method you use int variables, which can hold numbers up to 2^31-1, namely 2147483647, also known as Integer.MAX_VALUE.
If you perform int value=2147483647; value++; the result is -2147483648, also known as Integer.MIN_VALUE.
Alternatively, you can try your program with lower starting numbers, or change your variable declarations to long, which can hold values up to Long.MAX_VALUE, namely 2^63-1.
Both long and int values are supported by Aparapi.
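The numbers in the question's table fit this picture: each Result equals the Expected value with everything above the low 32 bits discarded. The sketch below only illustrates that arithmetic in plain Java. It does not reproduce what the GPU path does, and scaleWide and scaleNarrow are hypothetical stand-ins, since the asker's calculate method is not shown.

```java
// Sketch only: plain-Java illustration of 32-bit truncation.
// scaleWide/scaleNarrow are hypothetical stand-ins for the asker's return expression.
final class OverflowSketch {

    // Widen to long first, then multiply: the product fits in 64 bits.
    static long scaleWide(int tc) {
        return (long) tc * 100;
    }

    // Multiply in 32-bit int arithmetic: the product wraps around.
    static int scaleNarrow(int tc) {
        return tc * 100;
    }

    public static void main(String[] args) {
        int tc = 406816880; // first Num value from the table
        System.out.println(scaleWide(tc));      // 40681688000
        System.out.println(scaleNarrow(tc));    // 2026982336
        // Truncating the first Expected value to int gives the first Result value.
        System.out.println((int) 40681688012L); // 2026982348
    }
}
```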
Hi, I'm the primary maintainer over at the new Aparapi.com and the new GitHub repository. We are much more active at the new project home and already have about a dozen releases in Maven Central. You might want to consider moving over to the new Aparapi.
With that said, I ran this test case and confirmed that it is a legitimate Aparapi bug. I will look into what is causing it and hopefully can get a fix in for you before the next release. The issue has been reported here if you would like to track it. Remember that this is for the new Aparapi project, so the fix is not likely to show up in the older Aparapi project.

Checking error output is on my errorlist

I have the following code for logging all the errors after every command I run in cmd with my tool. (It runs p4 integrate commands, about 1000-1500/task)
if (errorArrayList.size() > 0) {
LoggerSingleton.I.writeDebugInfoTimeStampedLog("[INFO-CMD] CommandExecuter.java -> runAndGetResults: errors happened while running the following command: [ " + commandResultBean.getCommand() + " ]");
for (int i = 0; i < errorArrayList.size(); i++) {
LoggerSingleton.I.writeDebugErrorTimeStampedLog(errorArrayList.get(i));
commandResultBean.addToCLI_Error(errorArrayList.get(i));
}
LoggerSingleton.I.writeDebugInfoTimeStampedLog("[INFO-CMD] CommandExecuter.java -> runAndGetResults: Listing errors of command [" + commandResultBean.getCommand() + "] finished");
}
The feature I'm working on right now is to check each error I get: if it is on a predefined list (errors that don't matter and are in fact not real errors, for example "all revision(s) already integrated"), do nothing else; but if it is a "real" error, write it to another log file too (the debug logs are far too long for the users of the tool; they are meant more for the developers).
The question is, what is the best way to do this?
I want to avoid a big slowdown. I have many commands, and although there are fewer errors than commands, it is not unusual at all to get 700-800 "irrelevant" errors in one task.
I will use another class for the I/O part, and it is not a problem if the running time grows when we catch a "real" error.
The list is constant; it is okay if it can only be modified by changing the code.
At the moment I don't know what type to use (2-3 single Strings, a List, an array ...). What type should I use? I have never used enums in Java before; should I use one here?
I guess a for or foreach loop with errorArrayList.get(i).contains(<myVariable>) in a method is the only option for the check.
If I'm wrong, is there a better way to do this?
EDIT
If I have an ArrayList<String> called knownErrors with the irrelevant errors (I can define only parts of them), will the following code perform better than the method described above? Also, can I use it if I have only parts of the String? How?
if (errorArrayList.removeAll(knownErrors)) {
//do the logging and stuff
}
ArrayList itself has a removeAll(Collection c) method, which removes every element that is equal to an element of the given collection. So if you put the known errors to be skipped in a list and pass it to removeAll, the known errors are removed and errorArrayList is left with only the new errors. Note that elements are matched with equals, so this only removes whole-string matches, not messages that merely contain a known fragment.
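For the case where only parts of the messages are known, a substring check per error is one option. The following is a minimal sketch under that assumption. KnownErrorFilter, isKnown and realErrors are made-up names, and the second list entry is a placeholder rather than a real p4 message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: all names here are illustrative, not part of the asker's tool.
final class KnownErrorFilter {

    // Constant list of fragments that mark an error as irrelevant.
    // The first entry comes from the question; the second is a placeholder.
    private static final List<String> KNOWN_FRAGMENTS = Arrays.asList(
            "all revision(s) already integrated",
            "hypothetical harmless message");

    // True if the error text contains any known fragment.
    static boolean isKnown(String error) {
        for (String fragment : KNOWN_FRAGMENTS) {
            if (error.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    // Returns only the errors that match no known fragment; the input is not modified.
    static List<String> realErrors(List<String> errors) {
        List<String> result = new ArrayList<>();
        for (String error : errors) {
            if (!isKnown(error)) {
                result.add(error);
            }
        }
        return result;
    }
}
```

With two or three fragments and a few hundred errors per task, this is a small number of short substring scans, so it is unlikely to be the slow part next to running the commands themselves.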

Error that is neither syntactic nor semantic?

I had this question on a homework assignment (don't worry, already done):
[Using your favorite imperative language, give an example of each of ...] An error that the compiler can neither catch nor easily generate code to catch (this should be a violation of the language definition, not just a program bug)
From "Programming Language Pragmatics" (3rd ed) Michael L. Scott
My answer was to call main from main, passing in the same arguments (in C and Java), inspired by this. But I personally felt that would just be a semantic error.
To me, this question is asking how to produce an error that is neither syntactic nor semantic, and frankly, I can't really think of a situation that wouldn't fall into one of the two.
Would it be code that is susceptible to exploitation, like buffer overflows (and maybe other exploits I've never heard of)? Some sort of pitfall arising from the structure of the language (I don't know, maybe lazy evaluation or weak type checking)? I'd like a simple example in Java/C++/C, but other examples are welcome.
Undefined behaviour springs to mind. A statement invoking UB is neither syntactically nor semantically incorrect, but rather the result of the code cannot be predicted and is considered erroneous.
An example of this would be (from the Wikipedia page) an attempt to modify a string-constant:
char * str = "Hello world!";
str[0] = 'h'; // undefined-behaviour here
Not all UB-statements are so easily identified though. Consider for example the possibility of signed-integer overflow in this case, if the user enters a number that is too big:
// get number from user
char input[100];
fgets(input, sizeof input, stdin);
int number = strtol(input, NULL, 10);
// print its square: possible integer-overflow if number * number > INT_MAX
printf("%i^2 = %i\n", number, number * number);
Here there may not necessarily be signed-integer overflow. And it is impossible to detect it at compile- or link-time since it involves user-input.
Statements invoking undefined behavior [1] are semantically as well as syntactically correct but make programs behave erratically.
a[i++] = i; // Syntax (symbolic representation) and semantic (meaning) both are correct. But invokes UB.
Another example is using a pointer without initializing it.
Logical errors are also neither semantic nor syntactic.
[1] Undefined behavior: Anything at all can happen; the Standard imposes no requirements. The program may fail to compile, or it may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended.
Here's an example for C++. Suppose we have a function:
int incsum(int &a, int &b) {
return ++a + ++b;
}
Then the following code has undefined behavior because it modifies an object twice with no intervening sequence point:
int i = 0;
incsum(i, i);
If the call to incsum is in a different TU from the definition of the function, then it's impossible to catch the error at compile time, because neither bit of code is inherently wrong on its own. It could be detected at link time by a sufficiently intelligent linker.
You can generate as many examples as you like of this kind, where code in one TU has behavior that's conditionally undefined for certain input values passed by another TU. I went for one that's slightly obscure, you could just as easily use an invalid pointer dereference or a signed integer arithmetic overflow.
You can argue how easy it is to generate code to catch this -- I wouldn't say it's very easy, but a compiler could notice that ++a + ++b is invalid if a and b alias the same object, and add the equivalent of assert (&a != &b); at that line. So detection code can be generated by local analysis.

Why does a Matcher throw an exception after a successful find?

I'm trying to write a loop that will find all the instances of "${arbitraryTextHere}" in an input string. E.g:
someText${findMe}moreText${findMeToo}EvenMoreText${DontForgetMe}
Here is my code:
Pattern placeholderPattern = Pattern.compile("\\$\\{[\\w|\\d]+\\}");
Matcher placeholderMatcher = placeholderPattern.matcher(templateString);
int workingIndex = 0;
while(placeholderMatcher.find()){
workingIndex = placeholderMatcher.start();
}
Note: The templateString I'm testing this out with is "SomeString ${someProp}"
The strange thing is that .find() has to return true in order to get inside the loop, but then .start() throws an IllegalStateException. The reason why this is so strange is that .start() only throws an IllegalStateException if the matcher's internal first variable is less than 0, but .find(), via the Matcher's boolean search(int from) method, will make sure that first is zero or greater unless no match is found, but if no match is found then .find() will return false, and we won't wind up in the loop body.
So what exactly is going on here?
Update: If I encapsulate the above code so that it all runs in one unit test, then it works. So I think the problem is related to having it in a class whose method is called from the unit test. But that's kind of weird. I'm going to dig into this aspect of the problem a bit more and then post an update.
Update: OK, I tried turning it off and on again (I restarted IntelliJ and recompiled my code) and now it's not broken anymore, so I think I must have screwed something up in that department.
As per the last update on my question, restarting IntelliJ and recompiling my code fixed things.
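For reference, here is a self-contained sketch of the find loop, which works as expected on a clean build: start() and group() are valid after every find() that returns true. The class name PlaceholderFinder is made up. The pattern is also simplified relative to the question's, since \w already covers digits and a | inside a character class is a literal pipe rather than alternation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: PlaceholderFinder is an illustrative name.
final class PlaceholderFinder {

    // Matches ${name}; group 1 captures the name between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Returns the names of all placeholders in the template, in order of appearance.
    static List<String> findNames(String template) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(template);
        while (matcher.find()) {
            // After a successful find(), start()/end()/group() refer to that match.
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String template = "someText${findMe}moreText${findMeToo}EvenMoreText${DontForgetMe}";
        System.out.println(findNames(template)); // [findMe, findMeToo, DontForgetMe]
    }
}
```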

Why does the MongoDB Java driver use a random number generator in a conditional?

I saw the following code in this commit for MongoDB's Java Connection driver, and it appears at first to be a joke of some sort. What does the following code do?
if (!((_ok) ? true : (Math.random() > 0.1))) {
return res;
}
(EDIT: the code has been updated since posting this question)
After inspecting the history of that line, my main conclusion is that there has been some incompetent programming at work.
That line is gratuitously convoluted. The general form
a ? true : b
for boolean a, b is equivalent to the simple
a || b
The surrounding negation and excessive parentheses convolute things further. Keeping in mind De Morgan's laws it is a trivial observation that this piece of code amounts to
if (!_ok && Math.random() <= 0.1)
return res;
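The equivalence can be checked mechanically. The sketch below compares the original condition with the simplified one for both values of the flag and a few sample values in the range Math.random() can return, [0.0, 1.0). The class and method names are made up for the check, and the random value is passed in as a parameter so the comparison is deterministic.

```java
// Sketch only: names are illustrative; r stands in for the value of Math.random().
final class TernaryEquivalence {

    // The condition as written in the driver code.
    static boolean original(boolean ok, double r) {
        return !((ok) ? true : (r > 0.1));
    }

    // The condition after applying a ? true : b == a || b and De Morgan's laws.
    static boolean simplified(boolean ok, double r) {
        return !ok && r <= 0.1;
    }

    public static void main(String[] args) {
        boolean[] flags = {true, false};
        double[] samples = {0.0, 0.05, 0.1, 0.100001, 0.5, 0.999};
        for (boolean ok : flags) {
            for (double r : samples) {
                System.out.println(ok + " " + r + " -> "
                        + original(ok, r) + " / " + simplified(ok, r));
            }
        }
    }
}
```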
The commit that originally introduced this logic had
if (_ok == true) {
_logger.log( Level.WARNING , "Server seen down: " + _addr, e );
} else if (Math.random() < 0.1) {
_logger.log( Level.WARNING , "Server seen down: " + _addr );
}
This is another example of incompetent coding, but notice the reversed logic: here the event is logged either if _ok is true or in 10% of the other cases, whereas the code in the question returns 10% of the time and logs 90% of the time. So the later commit ruined not only clarity, but correctness itself.
I think in the code you have posted we can actually see how the author intended to transform the original if-then more or less literally into its negation, as required for the early return condition. But then he messed up and inserted an effective "double negative" by reversing the inequality sign.
Coding style issues aside, stochastic logging is quite a dubious practice all by itself, especially since the log entry does not document its own peculiar behavior. The intention is, obviously, to reduce restatements of the same fact: that the server is currently down. The appropriate solution is to log only changes of the server state, and not every observation of it, let alone a random selection of 10% of such observations. Yes, that takes just a little bit more effort, so let's see some.
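A minimal sketch of logging only state changes could look like the following. ServerStateLogger and observe are made-up names, not part of the MongoDB driver, and the sketch assumes the server is considered up until the first observation says otherwise.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: not the driver's code; names and the initial "up" state are assumptions.
final class ServerStateLogger {

    // Last observed state; assumed to be "up" before any observation.
    private final AtomicBoolean up = new AtomicBoolean(true);

    // Records the observed state and returns true only when it differs from
    // the previous one, i.e. when a log entry is worth writing.
    boolean observe(boolean nowUp) {
        boolean previouslyUp = up.getAndSet(nowUp);
        return previouslyUp != nowUp;
    }

    public static void main(String[] args) {
        ServerStateLogger logger = new ServerStateLogger();
        boolean[] observations = {false, false, false, true, true, false};
        for (boolean nowUp : observations) {
            if (logger.observe(nowUp)) {
                System.out.println(nowUp ? "Server seen up" : "Server seen down");
            }
        }
    }
}
```

getAndSet makes the read and the update a single atomic step, so two threads that see the same transition cannot both log it.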
I can only hope that all this evidence of incompetence, accumulated from inspecting just three lines of code, does not speak fairly of the project as a whole, and that this piece of work will be cleaned up ASAP.
https://github.com/mongodb/mongo-java-driver/commit/d51b3648a8e1bf1a7b7886b7ceb343064c9e2225#commitcomment-3315694
11 hours ago by gareth-rees:
Presumably the idea is to log only about 1/10 of the server failures (and so avoid massively spamming the log), without incurring the cost of maintaining a counter or timer. (But surely maintaining a timer would be affordable?)
Add a class member initialized to negative 1:
private int logit = -1;
In the try block, make the test:
if( !ok && (logit = (logit + 1 ) % 10) == 0 ) { //log error
This always logs the first error, then every tenth subsequent error. Logical operators "short-circuit", so logit only gets incremented on an actual error.
If you want the first and every tenth of all errors, regardless of the connection, make logit a static class field instead of an instance member.
As has been noted, this should be thread safe:
private synchronized int getLogit() {
return (logit = (logit + 1 ) % 10);
}
In the try block, make the test:
if( !ok && getLogit() == 0 ) { //log error
Note: I don't think throwing out 90% of the errors is a good idea.
I have seen this kind of thing before.
There was a piece of code that could answer certain 'questions' that came from another 'black box' piece of code. If it could not answer them, it would forward them to another piece of 'black box' code that was really slow.
So sometimes previously unseen 'questions' would show up, and they would show up in a batch, like 100 of them in a row.
The programmer was happy with how the program was working, but he wanted some way of improving the software in the future if new questions were discovered.
So the solution was to log unknown questions, but as it turned out, there were thousands of different ones. The logs got too big, and there was no benefit in speeding these up, since they had no obvious answers. But every once in a while, a batch of questions would show up that could be answered.
Since the logs were getting too big, and the logging was getting in the way of logging the really important things, he arrived at this solution:
Only log a random 5%. This cleans up the logs, while in the long run still showing which questions/answers could be added.
So, if an unknown event occurred, it would be logged in a random fraction of cases.
I think this is similar to what you are seeing here.
I did not like this way of working, so I removed this piece of code and just logged these messages to a different file, so they were all present but not cluttering the general logfile.
