Enhanced for loop shortcut - java

Is the following version of for loop possible (or a variation thereof fulfilling the purpose of shortening code with one line)?
for(String string: stringArray; string.toLowerCase()){
    //stuff
}
Instead of
for(String string: stringArray){
    string = string.toLowerCase();
    //stuff
}
May seem like a stupid question but that one line is tiresome to write all the time when it applies to every element of the loop.

Write it like this
for(String string: stringArray)string=string.toLowerCase();
This is just as short. Also, in a normal for loop such as for(int i=0;i<40;i++), you can put several comma-separated expressions in the initialization and update parts to keep everything on one line.

No, there isn't.
The trick with the enhanced-for loop is that it behaves like any other loop over a collection - you're working with the individual elements one at a time, as opposed to all at once.
Furthermore, since toLowerCase() returns a new String, as it should, call it only where it is actually needed rather than creating a new variable for it (unless you need the lower-cased value in several places, in which case it is better to move the lower-casing into the methods that use it).

You should consider refactoring your code into several methods each with their own loops. One method creates a new array (or list) with transformed elements from the original list (such as applying toLowerCase() to the Strings in an array). The other methods process the new array rather than the original.
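The refactoring suggested above could look roughly like this. It is only a sketch, assuming Java 8 or later; the class name LowerCaseDemo and the method name toLowerCaseAll are made up for illustration and do not come from the question.

```java
import java.util.Arrays;
import java.util.Locale;

class LowerCaseDemo {
    // Hypothetical helper: returns a new array with every element lower-cased.
    // The input array is left untouched.
    static String[] toLowerCaseAll(String[] input) {
        return Arrays.stream(input)
                .map(s -> s.toLowerCase(Locale.ROOT))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] stringArray = {"Lower", "Case"};
        // The processing loop only ever sees the transformed elements.
        for (String string : toLowerCaseAll(stringArray)) {
            System.out.println(string); // stuff
        }
    }
}
```

The loop body no longer needs the extra assignment line, at the cost of building one extra array up front.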

Unfortunately that's not possible. You could take a look at Google Guava, which has something like this (Predicates/Closures), but it doesn't help much in improving your code.
Completely off-topic maybe, but it might help: if you used Groovy, which is fully compatible with Java, it would be something like:
String[] stringArray = ["Lower", "Case"] as String[]
stringArray.collect { it.toLowerCase() }.each { item ->
    println item
}
Which would print:
lower
case
But, like I said, this might not be a viable option in your case.

I don't think that's possible as of now. :)

Related

is this use of objects redundant and/or inefficient?

I'm fairly inexperienced with using objects so I would really like some input.
I'm trying to remove comments that contain certain "unwanted words" from a list. Both the comments and the list of "unwanted words" are in ArrayList objects.
This is inside a class called FormHelper, which holds the comments in a private ArrayList member. The auditList ArrayList is created locally in a member function called populateComments(), which then calls the function below. populateComments() is called by the constructor, so this function only gets called once, when an instance of FormHelper is created.
private void filterComments(ArrayList<String> auditList) {
    for (String badWord : auditList) {
        for (String thisComment : this.comments) {
            if (thisComment.contains(badWord)) {
                int index = this.comments.indexOf(thisComment);
                this.comments.remove(index);
            }
        }
    }
}
Something about the way I implemented this doesn't feel right, and I'm also concerned that I'm using ArrayList functions inefficiently. Is my suspicion correct?
It is not particularly efficient. However, finding a more efficient solution is not straightforward.
Let's step back to a simpler problem.
private void findBadWords(List<String> wordList, List<String> auditList) {
    for (String badWord : auditList) {
        for (String word : wordList) {
            if (word.equals(badWord)) {
                System.err.println("Found a bad word");
            }
        }
    }
}
Suppose that wordList contains N words and auditList contains M words. Some simple analysis will show that the inner loop is executed N x M times. The N factor is unavoidable, but the M factor is disturbing. It means that the more "bad" words you have to check for the longer it takes to check.
There is a better way to do this:
private void findBadWords(List<String> wordList, HashSet<String> auditWords) {
    for (String word : wordList) {
        if (auditWords.contains(word)) {
            System.err.println("Found a bad word");
        }
    }
}
Why is that better? It is better (faster) because HashSet::contains doesn't need to check all of the audit words one at a time. In fact, in the optimal case it will check none of them (!) and the average case just one or two of them. (I won't go into why, but if you want to understand read the Wikipedia page on hash tables.)
But your problem is more complicated. You are using String::contains to test if each comment contains each bad word. That is not a simple string equality test (as per my simplified version).
What to do?
Well, one potential solution is to split the comments into an array of words (e.g. using String::split) and then use the HashSet lookup approach. However:
That changes the behavior of your code. (In a good way actually: read up on the Scunthorpe problem!) You will now only match the audit words if they are actual words in the comment text.
Splitting a string into words is not cheap. If you use String::split it entails creating and using a Pattern object to find the word boundaries, creating substrings for each word and putting them into an array. You can probably do better, but it is always going to be a non-trivial calculation.
So the real question will be whether the optimization is going to pay off. That is ultimately going to depend on the value of M; i.e. the number of bad words you are looking for. The larger M is, the more likely it is that splitting the comments into words and using a HashSet to test the words will pay off.
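To make the split-and-lookup idea concrete, here is a minimal sketch. The class name WordFilterDemo, the method name containsBadWord and the sample data are invented for illustration, and it assumes that words are separated by whitespace only, so punctuation attached to a word will prevent a match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WordFilterDemo {
    // True if any whitespace-separated word of the comment is in the audit set.
    // Each lookup is a hash lookup, so the cost no longer grows with the
    // number of audit words.
    static boolean containsBadWord(String comment, Set<String> auditWords) {
        for (String word : comment.split("\\s+")) {
            if (auditWords.contains(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> auditWords = new HashSet<>(Arrays.asList("spam", "junk"));
        List<String> comments = new ArrayList<>(
                Arrays.asList("nice post", "this is spam", "spammer alert"));
        comments.removeIf(c -> containsBadWord(c, auditWords));
        System.out.println(comments); // [nice post, spammer alert]
    }
}
```

Note that "spammer alert" survives here, whereas the String::contains version would have removed it. That is the behavior change mentioned above.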
Another possible solution doesn't involve splitting the comments. You could take the list of audit words and assemble them into a single regex like this: \b(word-1|word-2|...|word-n)\b. Then use this regex with Matcher::find to search each comment string for bad words. The performance will depend on the optimizing capability of the regex engine in your Java platform. It has the potential to be faster than splitting.
My advice would be to benchmark and profile your entire application before you start. Only optimize:
when the benchmarking says that the overall performance of the requests where this comment checking occurs is concerning. (If it is OK, don't waste your time optimizing.)
when the profiling says that this method is a performance hotspot. (There is a good chance that the real hotspots are somewhere else. If so, you should optimize them rather than this method.)
Note there is an assumption that you have (sufficiently) completed your application and created a realistic benchmark for it before you think about optimizing. (Premature optimization is a bad idea ... unless you really know what you are doing.)
As a general approach, removing individual elements from an ArrayList in a loop is inefficient, because it requires shifting all of the "following" elements along one position in the array.
A B C D E
  ^ if you remove this
    ^---^ you have to shift these 3 along by one
   / / /
A C D E
If you remove lots of elements, this will have a substantial impact on the time complexity. It's better to identify the elements to remove, and then remove them all at once.
I suggest that a neater way to do this would be using removeIf, which (at least for collection implementations such as ArrayList) does this "all at once" removal:
this.comments.removeIf(
    c -> auditList.stream().anyMatch(c::contains));
This is concise, but probably quite slow because it has to keep checking the entire comment string to see if it contains each bad word.
A probably faster way would be to use regex:
Pattern p = Pattern.compile(
    auditList.stream()
        .map(Pattern::quote)
        .collect(Collectors.joining("|")));
this.comments.removeIf(
    c -> p.matcher(c).find());
This would be better because the compiled regex would search for all of the bad words in a single pass over each comment.
The other advantage of a regex-based approach is that you can check case insensitively, by supplying the appropriate flag when compiling the regex.
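As a rough sketch of the case-insensitive variant: the flags below are the standard java.util.regex.Pattern constants, while the class name CaseInsensitiveFilterDemo, the method name buildPattern and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CaseInsensitiveFilterDemo {
    // Builds one alternation of the quoted audit words, matched case-insensitively.
    // UNICODE_CASE extends the case folding beyond ASCII.
    static Pattern buildPattern(List<String> auditList) {
        String alternation = auditList.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    public static void main(String[] args) {
        Pattern p = buildPattern(Arrays.asList("spam", "junk"));
        List<String> comments = new ArrayList<>(
                Arrays.asList("Nice post", "This is SPAM", "Junk mail"));
        comments.removeIf(c -> p.matcher(c).find());
        System.out.println(comments); // [Nice post]
    }
}
```

One caveat: an empty audit list produces an empty pattern, which matches every comment, so that case probably needs a guard before calling removeIf.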

Is use of AtomicInteger for indexing in Stream a legit way?

I would like an answer that explains why the idea described below, shown on a very simple example, is commonly considered bad, and what its weaknesses are.
I have a sentence and my goal is to make every second word uppercase. My starting point for both cases is exactly the same:
String sentence = "Hi, this is just a simple short sentence";
String[] split = sentence.split(" ");
The traditional and procedural approach is:
StringBuilder stringBuilder = new StringBuilder();
for (int i=0; i<split.length; i++) {
    if (i%2==0) {
        stringBuilder.append(split[i]);
    } else {
        stringBuilder.append(split[i].toUpperCase());
    }
    if (i<split.length-1) { stringBuilder.append(" "); }
}
When I want to use java-stream, its use is limited by the constraint that variables used in a lambda expression must be final or effectively final. I have to use a workaround with a one-element array, which was suggested in the first comment on my question How to increment a value in Java Stream. Here is the example:
int index[] = {0};
String result = Arrays.stream(split)
    .map(i -> index[0]++%2==0 ? i : i.toUpperCase())
    .collect(Collectors.joining(" "));
Yeah, it's a bad solution, and I have heard a few good reasons why, hidden somewhere in the comments of a question I am unable to find (if you remind me of some of them, I'd upvote twice if possible). But what if I use AtomicInteger? Does it make any difference, and is it a good and safe way with no side effects compared to the previous one?
AtomicInteger atom = new AtomicInteger(0);
String result = Arrays.stream(split)
    .map(i -> atom.getAndIncrement()%2==0 ? i : i.toUpperCase())
    .collect(Collectors.joining(" "));
Regardless of how ugly it might look to anyone, I ask for a description of the possible weaknesses and their reasons. I don't care about performance, only about the design and possible weaknesses of the 2nd solution.
Please don't associate AtomicInteger with multi-threading issues here. I used this class since it receives, increments and stores the value in the way I need for this example.
As I often say in my answers, the "Java Stream-API" is not a silver bullet for everything. My goal is to explore and find the edge where this sentence applies, since I find the last snippet quite clear, readable and brief compared to the StringBuilder snippet.
Edit: Is there any alternative that works for the snippets above, and in general whenever both the item and its index are needed while iterating with the Stream-API?
The documentation of the java.util.stream package states that:
Side-effects in behavioral parameters to stream operations are, in general, discouraged, as they can often lead to unwitting violations of the statelessness requirement, as well as other thread-safety hazards.
[...]
The ordering of side-effects may be surprising. Even when a pipeline is constrained to produce a result that is consistent with the encounter order of the stream source (for example, IntStream.range(0,5).parallel().map(x -> x*2).toArray() must produce [0, 2, 4, 6, 8]), no guarantees are made as to the order in which the mapper function is applied to individual elements, or in what thread any behavioral parameter is executed for a given element.
This means that the elements may be processed out of order, and thus the Stream-solutions may produce wrong results.
This is (at least for me) a killer argument against the two Stream-solutions.
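To make the hazard visible, the following sketch runs both styles on a parallel stream. The class and method names (StreamOrderDemo, withCounter, withIndex) are invented for illustration. The counter version depends on the order in which the mapper happens to be invoked; the index version takes each position from the stream itself and has no side effects.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamOrderDemo {
    // Relies on a side effect: the counter is bumped in whatever order
    // the elements happen to be processed.
    static String withCounter(String[] words) {
        AtomicInteger counter = new AtomicInteger();
        return Arrays.stream(words).parallel()
                .map(w -> counter.getAndIncrement() % 2 == 0 ? w : w.toUpperCase())
                .collect(Collectors.joining(" "));
    }

    // No side effect: each element's index comes from the stream itself.
    static String withIndex(String[] words) {
        return IntStream.range(0, words.length).parallel()
                .mapToObj(i -> i % 2 == 0 ? words[i] : words[i].toUpperCase())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String[] split = "Hi, this is just a simple short sentence".split(" ");
        System.out.println(withCounter(split)); // may differ from run to run
        System.out.println(withIndex(split));   // Hi, THIS is JUST a SIMPLE short SENTENCE
    }
}
```

On a small input the counter version may well print the expected sentence most of the time, which is exactly what makes this kind of bug easy to miss.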
By the process of elimination, we only have the "traditional solution" left. And honestly, I do not see anything wrong with this solution. If we wanted to get rid of the for-loop, we could re-write this code using a foreach-loop:
boolean toUpper = false; // 1st String is not capitalized
for (String word : split) {
    stringBuilder.append(toUpper ? word.toUpperCase() : word);
    toUpper = !toUpper;
}
For a streamified and (as far as I know) correct solution, take a look at Octavian R.'s answer.
Your question wrt. the "limits of streams" is opinion-based.
The answer to the question (s) ends here. The rest is my opinion and should be regarded as such.
In Octavian R.'s solution, an artificial index-set is created through an IntStream, which is then used to access the String[]. For me, this has a higher cognitive complexity than a simple for- or foreach-loop and I do not see any benefit in using streams instead of loops in this situation.
In Java, compared with Scala, you must be inventive. One solution without mutation is this one:
String sentence = "Hi, this is just a simple short sentence";
String[] split = sentence.split(" ");
String result = IntStream.range(0, split.length)
    .mapToObj(i -> i % 2 == 0 ? split[i] : split[i].toUpperCase())
    .collect(Collectors.joining(" "));
System.out.println(result);
In Java streams you should avoid mutation. Your solution with AtomicInteger is ugly and is bad practice.
Kind regards!
As explained in Turing85’s answer, your stream solutions are not correct, as they rely on the processing order, which is not guaranteed. This can lead to incorrect results with parallel execution today, but even if it happens to produce the desired result with a sequential stream, that’s only an implementation detail. It’s not guaranteed to work.
Besides that, there is no advantage in rewriting code to use the Stream API with a logic that basically still is a loop, but obfuscated with a different API. The best way to describe the idea of the new APIs, is to say that you should express what to do but not how.
Starting with Java 9, you could implement the same thing as
String result = Pattern.compile("( ?+[^ ]* )([^ ]*)").matcher(sentence)
    .replaceAll(m -> m.group(1)+m.group(2).toUpperCase());
which expresses the wish to replace every second word with its upper case form, but doesn’t express how to do it. That’s up to the library, which likely uses a single StringBuilder instead of splitting into an array of strings, but that’s irrelevant to the application logic.
As long as you’re using Java 8, I’d stay with the loop and even when switching to a newer Java version, I would consider replacing the loop as not being an urgent change.
The pattern in the above example has been written in a way to do exactly the same as your original code splitting at single space characters. Usually, I’d encode “replace every second word” more like
String result = Pattern.compile("(\\w+\\W+)(\\w+)").matcher(sentence)
    .replaceAll(m -> m.group(1)+m.group(2).toUpperCase());
which would behave differently when encountering multiple spaces or other separators, but usually is closer to the actual intention.
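The difference between the two patterns shows up as soon as the input contains a double space. The sketch below uses both patterns from this answer unchanged; the class name EverySecondWordDemo, the method name upperEverySecond and the sample sentence are made up for illustration. It needs Java 9 or later because of Matcher.replaceAll(Function).

```java
import java.util.regex.Pattern;

class EverySecondWordDemo {
    // Mirrors split(" "): every single space is a separator, so a double
    // space contains an empty "word" that still counts.
    static final Pattern SINGLE_SPACE = Pattern.compile("( ?+[^ ]* )([^ ]*)");
    // Treats any run of non-word characters as one separator.
    static final Pattern WORD_BASED = Pattern.compile("(\\w+\\W+)(\\w+)");

    static String upperEverySecond(Pattern p, String sentence) {
        // Matcher.replaceAll(Function) requires Java 9 or later.
        return p.matcher(sentence)
                .replaceAll(m -> m.group(1) + m.group(2).toUpperCase());
    }

    public static void main(String[] args) {
        String sentence = "one two  three four"; // note the double space
        System.out.println(upperEverySecond(SINGLE_SPACE, sentence)); // one TWO  THREE four
        System.out.println(upperEverySecond(WORD_BASED, sentence));   // one TWO  three FOUR
    }
}
```

The string returned by the lambda is still interpreted as a replacement string, so input containing $ or a backslash would need Matcher.quoteReplacement.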

Is a for/break more efficient than iterator().next()?

It is known that mapX has a single entry in it. A code metric tool complains about using 'break' inside for loops, but normally, in such a small loop, I'd let it pass. In this case though, with it known that there's only a single entry, I'm not sure which is more efficient.
So, which of the following is more efficient when you know that the map will only ever contain one element?
Possible replacement:
Map.Entry<String,String> e = mapX.entrySet().iterator().next();
Y.setMsg(e.getValue());
Y.setMsgKey(e.getKey());
Original code:
for (String key : mapX.keySet()){
    Y.setMsg(mapX.get(key));
    Y.setMsgKey(key);
    break;
}
The first version is faster than the second one (with the "for" loop / "break"):
The loop version has to call hasNext() before the call to next().
The loop version is iterating the keySet rather than the entryset, and therefore has to do an extra map lookup to get the corresponding value.
The loop version possibly has an extra branch instruction at the break ... though this is minor and can possibly be optimized away by the JIT compiler.
But the best reason for using the first version (IMO) is that the code is easier to understand / less complex. (The code metric tool is helpful in this case ...)
Of course, the flip-side is that if the map is empty, the non-looping version of the code is going to throw an exception. If you need to deal with the "empty map" case, you should write it like this:
if (!mapX.isEmpty()) {
    Map.Entry<String,String> e = mapX.entrySet().iterator().next();
    Y.setMsg(e.getValue());
    Y.setMsgKey(e.getKey());
}
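For what it's worth, here is a small sketch of the empty-map behavior, plus one possible alternative on Java 8 or later that wraps the first entry in an Optional. The class name SingleEntryDemo and the method name firstEntry are invented for illustration; nothing in the question says the asker uses streams.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

class SingleEntryDemo {
    // Hypothetical helper: the first entry if there is one,
    // or an empty Optional for an empty map.
    static <K, V> Optional<Map.Entry<K, V>> firstEntry(Map<K, V> map) {
        return map.entrySet().stream().findFirst();
    }

    public static void main(String[] args) {
        Map<String, String> mapX = new HashMap<>();
        mapX.put("key", "message");
        firstEntry(mapX).ifPresent(
                e -> System.out.println(e.getKey() + " -> " + e.getValue()));

        // Without a guard, next() on an empty map's iterator throws.
        try {
            new HashMap<String, String>().entrySet().iterator().next();
        } catch (NoSuchElementException ex) {
            System.out.println("empty map: next() threw NoSuchElementException");
        }
    }
}
```

Whether the Optional version reads better than the isEmpty() check is a matter of taste; it is not meant as a performance improvement.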
Personally I find the first variant more concise. If you're using Guava, you could also write it like this:
Map.Entry<String,String> e = Iterables.getOnlyElement(mapX.entrySet());
This
Map.Entry<String,String> e = mapX.entrySet().iterator().next();
Y.setMsg(e.getValue());
Y.setMsgKey(e.getKey());
is the same as
for (String key : mapX.keySet()){
    Y.setMsg(mapX.get(key));
    Y.setMsgKey(key);
    break;
}
with one exception: The loop first calls hasNext() on the iterator.
The former implies more strongly that there's only one element (at least, only one that you're interested in). The latter says "I'm going to loop through all elements...except I'm going to stop after the first one." Why use a loop, if you're not going to loop?
Seems like the non-looping version is the one to go with.

Manipulating Strings on Arrays

I'm still new to Java and I would like to understand Strings and Arrays, so I got this idea of manipulating elements and placing them according to my objective. The objective is that there will be an Array of Strings "ABBCCCBBAA", and "AA" and "BB" must be replaced with "A"; "BA" and "AB" with "CC"; and "CC" and "BC" with "B". I basically have no idea how to make it happen, but I know it must use Arrays of String. Please help.
Regular expressions can be very handy for you. The code below can do your job with the use of a regular expression:
String mainStr = "ABBCCCBBAA";
Pattern p = Pattern.compile("(AA)|(BB)|(BA)|(AB)|(CC)|(BC)");
Matcher m = p.matcher(mainStr);
while (m.find()) {
    String matchedStr = m.group(0);
    if("AA".equals(matchedStr) || "BB".equals(matchedStr)){
        mainStr = mainStr.replaceFirst(matchedStr,"X");
    }
    else if("BA".equals(matchedStr) || "AB".equals(matchedStr)){
        mainStr = mainStr.replaceFirst(matchedStr,"Y");
    }
    else if("CC".equals(matchedStr) || "BC".equals(matchedStr)){
        mainStr = mainStr.replaceFirst(matchedStr,"Z");
    }
}
mainStr = mainStr.replaceAll("X","A").replaceAll("Y","CC").replaceAll("Z","B");
System.out.println(mainStr);
Above code will handle your case of multiple occurrence of same pattern in a given string like:
ABBCCCBBAABBBBAA
will generate output:
CCBBAAAAA.
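A different way to approach the same rules, offered only as a sketch, is a single left-to-right pass that looks up each matched pair in a map and builds the result with Matcher.appendReplacement. This assumes the question means non-overlapping pairs scanned from the left, which the question does not actually state. The class name PairReplaceDemo and the method name replacePairs are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairReplaceDemo {
    // Replacement rules taken from the question.
    private static final Map<String, String> RULES = new HashMap<>();
    static {
        RULES.put("AA", "A");
        RULES.put("BB", "A");
        RULES.put("BA", "CC");
        RULES.put("AB", "CC");
        RULES.put("CC", "B");
        RULES.put("BC", "B");
    }

    private static final Pattern PAIRS = Pattern.compile("AA|BB|BA|AB|CC|BC");

    // Scans the input once; each matched pair is replaced via the map,
    // everything else is copied through unchanged.
    static String replacePairs(String input) {
        Matcher m = PAIRS.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(RULES.get(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replacePairs("ABBCCCBBAA")); // CCBBAA
    }
}
```

Because the output is built in a separate buffer, text that has already been replaced is never scanned again, so no placeholder characters such as X, Y and Z are needed.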
I am assuming that by "array of strings" you mean:
String[] myvariable = new String[number];
myvariable[0] = "ABBCCBBAA";
myvariable[1] = "some_other_string";
If you are new to Java I suggest you read a beginner's book like Head First Java and also look into the Java documentation. You don't even have to go that far if you are programming with a decent IDE: NetBeans, for example, is itself a source of documentation for what you seek, thanks to its intelli-sense feature (meaning that you can look at all the methods available for a String, read what they do, and see if they can help accomplish what you need).
I am assuming (from what you have said) that you want to replace "AA" for "A", and from that result replace "BB" for "BA", and from that result replace "AB" into "CC", and from that result "BC" into "B".
The code I am posting is REAL simple, and it will only work for this particular case (as I have understood it), if you want to create a method that does this for any string, you need to change some things, but I'll leave that to you.
String[] yourArrayOfStrings = new String[1];
yourArrayOfStrings[0] = "ABBCCBBAA";
String resultOfReplacement= yourArrayOfStrings[0].replaceFirst("AA", "A");
System.out.println(resultOfReplacement); //debugging purposes
resultOfReplacement = resultOfReplacement.replaceFirst("BB", "BA");
System.out.println(resultOfReplacement); //debugging purposes
resultOfReplacement = resultOfReplacement.replaceFirst("AB", "CC");
System.out.println(resultOfReplacement); //debugging purposes
resultOfReplacement = resultOfReplacement.replaceFirst("BC", "B");
System.out.println(resultOfReplacement); //debugging purposes
The only reason why I created a String[] was because that's what you stated in your question, otherwise I would have simply created a String variable like I did with resultOfReplacement. To access an element in an array you do arrayVariable[index]; the first element is at index 0. Here I use the replaceFirst function that comes with Java for variables of type String. If you look the method up, it'll tell you that it will look for the first match of the first parameter and replace it with the second parameter.
The System.out.println calls I have added are for debugging purposes, so you can see clearly on the console what is happening with each replacement. So, the first time I call replaceFirst(...), it is on the original string, which is yourArrayOfStrings[0].
This will happen:
The method will look in "ABBCCBBAA" for the FIRST AND ONLY THE FIRST time "AA" appears and replace it with "A". The result is returned, and you must assign it to a variable if you want access to it to do more actions upon it. In this case, I assign it to a new String variable. You could have just assigned it back to yourArrayOfStrings[0], which is likely what you want. (You'd do so like this: yourArrayOfStrings[0] = yourArrayOfStrings[0].replaceFirst("AA", "A");)
For the second replacement, the method will look in "ABBCCBBA" for the first time "BB" appears and replace it with "BA".
See the pattern? This is just a start, and depending on what you want you might need other methods like replaceAll().
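To illustrate the difference between the two methods, here is a tiny sketch. The class name ReplaceDemo, the helper names and the placeholder "X" are chosen only for this example.

```java
class ReplaceDemo {
    // Replaces only the first occurrence of "BB".
    static String firstOnly(String s) {
        return s.replaceFirst("BB", "X");
    }

    // Replaces every occurrence of "BB".
    static String everyMatch(String s) {
        return s.replaceAll("BB", "X");
    }

    public static void main(String[] args) {
        String original = "ABBCCBBAA";
        System.out.println(firstOnly(original));  // AXCCBBAA
        System.out.println(everyMatch(original)); // AXCCXAA
        // Strings are immutable, so the original is unchanged.
        System.out.println(original);             // ABBCCBBAA
    }
}
```

Both methods treat their first argument as a regular expression, which makes no difference for plain letters like these but matters for characters such as "." or "*".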
Most IDEs will tell you what methods are available for a variable when you access it via ".", so that when you type " variablename. " a list of the methods available for it should appear. If it doesn't, you can use a shortcut like ctrl+space to make it appear, and then navigate through the methods with the arrow keys so you can read what they do (at least in Eclipse and NetBeans, while programming in Java, it works). Documentation is power!

What is preferred option for assignment and formatting?

Which one is recommended considering readability, memory usage, other reasons?
1.
String strSomething1 = someObject.getSomeProperties1();
strSomething1 = doSomeValidation(strSomething1);
String strSomething2 = someObject.getSomeProperties2();
strSomething2 = doSomeValidation(strSomething2);
String strSomeResult = strSomething1 + strSomething2;
someObject.setSomeProperties(strSomeResult);
2.
someObject.setSomeProperties(doSomeValidation(someObject.getSomeProperties1()) +
doSomeValidation(someObject.getSomeProperties2()));
If you would do it some other way, what would that be? Why would you do that way?
I'd go with:
String strSomething1 = someObject.getSomeProperties1();
String strSomething2 = someObject.getSomeProperties2();
// clean-up spaces
strSomething1 = removeTrailingSpaces(strSomething1);
strSomething2 = removeTrailingSpaces(strSomething2);
someObject.setSomeProperties(strSomething1 + strSomething2);
My personal preference is to organize by action, rather than sequence. I think it just reads better.
I would probably go in-between:
String strSomething1 = doSomeValidation(someObject.getSomeProperties1());
String strSomething2 = doSomeValidation(someObject.getSomeProperties2());
someObject.setSomeProperties(strSomething1 + strSomething2);
Option #2 seems like a lot to do in one line. It's readable, but takes a little effort to parse. In option #1, each line is very readable and clear in intent, but the verbosity slows me down when I'm going over it. I'd try to balance brevity and clarity as above, with each line representing a simple "sentence" of code.
I prefer the second. You can make it just as readable with a little bit of formatting, without declaring the extra intermediate references.
someObject.setSomeProperties(
    doSomeValidation( someObject.getSomeProperties1() ) +
    doSomeValidation( someObject.getSomeProperties2() ));
Your method names provide all the explanation needed.
Option 2 for readability. I don't see any memory concerns here if the methods only do what their names indicate. I would be wary of concatenations though. Performance definitely takes a beating as the number of string concatenations grows, because of the immutability of Java Strings.
Just curious to know, did you really write your own removeTrailingSpaces() method or is it just an example ?
I try to have one operation per line. The main reason is this:
setX(getX().getY()+getA().getB())
If you have a NPE here, which method returned null? So I like to have intermediate results in some variable which I can see after the code fell into the strong arms of the debugger and without having to restart!
For me, it depends on the context and the surrounding code.
[EDIT: does not make any sense, sorry]
if it was in method like "setSomeObjectProperties()", I'd prefer variant 2 but perhaps would create a private method "getProperty(String name)" which removes the trailing spaces if removing the spaces is not an important operation
[/EDIT]
If validating the properties is an important step of your method, then I'd call the method "setValidatedProperties()" and would prefer a variant of your first suggestion:
validatedProp1 = doValidation(someObject.getSomeProperty1());
validatedProp2 = doValidation(someObject.getSomeProperty2());
someObject.setSomeProperties(validatedProp1, validatedProp2);
If validation is not something important of this method (e.g. there's no point in returning properties which are not validated), I'd try to put the validation-step in "getSomePropertyX()"
Personally, I prefer the second one. It's less cluttered and I don't have to keep track of those temporary variables.
Might change easily with more complex expressions, though.
I like both Greg and Bill versions, I think I would more naturally write code like Greg's one. One advantage with intermediary variables: it is easier to debug (in the general case).
