Is there a way to determine if the loop is iterating for the last time? My code looks something like this:
int[] array = {1, 2, 3...};
StringBuilder builder = new StringBuilder();
for(int i : array)
{
builder.append("" + i);
if(!lastiteration)
builder.append(",");
}
The thing is, I don't want to append the comma in the last iteration. Is there a way to determine whether it is the last iteration, or am I stuck with a classic for loop or an external counter to keep track?
Another alternative is to append the comma before you append i, just not on the first iteration. (Please don't use "" + i, by the way - you don't really want concatenation here, and StringBuilder has a perfectly good append(int) overload.)
int[] array = {1, 2, 3...};
StringBuilder builder = new StringBuilder();
for (int i : array) {
if (builder.length() != 0) {
builder.append(",");
}
builder.append(i);
}
The nice thing about this is that it will work with any Iterable - you can't always index things. (The "add the comma and then remove it at the end" is a nice suggestion when you're really using StringBuilder - but it doesn't work for things like writing to streams. It's possibly the best approach for this exact problem though.)
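To illustrate the point about streams, here is a minimal sketch of the same "delimiter before every element except the first" idea, written against Appendable so it works for a Writer, a PrintStream or a StringBuilder alike. The class name IterableJoin and the method writeJoined are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class IterableJoin {
    // Writes the items to 'out', putting the delimiter before every item
    // except the first. Nothing ever has to be removed afterwards, so this
    // also works for targets that cannot be "un-written".
    static void writeJoined(Iterable<?> items, String delimiter, Appendable out) {
        boolean first = true;
        try {
            for (Object item : items) {
                if (!first) {
                    out.append(delimiter);
                }
                out.append(String.valueOf(item));
                first = false;
            }
        } catch (IOException e) {
            // Appendable.append declares IOException; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        writeJoined(Arrays.asList(1, 2, 3), ",", System.out); // prints 1,2,3
        System.out.println();
    }
}
```

A boolean flag is used here instead of builder.length() because an arbitrary Appendable has no length to inspect.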
Another way to do this:
String delim = "";
for (int i : ints) {
sb.append(delim).append(i);
delim = ",";
}
Update: For Java 8, you now have Collectors
It might be easier to always append and then, once the loop is done, remove the final character (after checking that the builder is not empty). That means far fewer conditionals too.
You can use StringBuilder's deleteCharAt(int index) with index being length() - 1.
Maybe you are using the wrong tool for the job.
This is more manual than what you are doing, but in a way it is more elegant, if a bit "old school":
StringBuffer buffer = new StringBuffer();
Iterator iter = s.iterator();
while (iter.hasNext()) {
buffer.append(iter.next());
if (iter.hasNext()) {
buffer.append(delimiter);
}
}
This is almost a repeat of this StackOverflow question. What you want is StringUtils, and to call the join method.
StringUtils.join(strArr, ',');
Another solution (perhaps the most efficient)
int[] array = {1, 2, 3};
StringBuilder builder = new StringBuilder();
if (array.length != 0) {
builder.append(array[0]);
for (int i = 1; i < array.length; i++ )
{
builder.append(",");
builder.append(array[i]);
}
}
Keep it simple and use a standard for loop:
for(int i = 0 ; i < array.length ; i ++ ){
builder.append(array[i]);
if( i != array.length - 1 ){
builder.append(',');
}
}
Or just use Apache commons-lang StringUtils.join().
Explicit loops always work better than implicit ones.
builder.append( "" + array[0] );
for( int i = 1; i != array.length; i += 1 ) {
builder.append( ", " + array[i] );
}
You should wrap the whole thing in an if-statement just in case you're dealing with a zero-length array.
As toolkit mentioned, in Java 8 we now have Collectors. Here's what the code would look like:
String joined = Arrays.stream(array).mapToObj(String::valueOf).collect(Collectors.joining(", "));
I think that does exactly what you're looking for, and it's a pattern you could use for many other things.
If you convert it to a classic index loop, yes.
Or you could just delete the last comma after it's done. Like so:
int[] array = {1, 2, 3...};
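StringBuilder builder = new StringBuilder();
for(int i : array)
{
builder.append(i).append(",");
}
// Guard against an empty builder before looking at the last character.
if(builder.length() > 0 && builder.charAt(builder.length() - 1) == ',')
builder.deleteCharAt(builder.length() - 1);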
Me, I just use StringUtils.join() from commons-lang.
You need Class Separator.
Separator s = new Separator(", ");
for(int i : array)
{
builder.append(s).append(i);
}
The implementation of class Separator is straightforward. It wraps a string that is returned on every call of toString() except for the first call, which returns an empty string.
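The answer does not show the class itself, so here is one possible sketch of it, assuming nothing beyond the description: a wrapper whose toString() yields an empty string on the first call and the wrapped separator on every later call. The demo class SeparatorDemo is made up for this example.

```java
class SeparatorDemo {
    static String join(int[] array) {
        StringBuilder builder = new StringBuilder();
        Separator s = new Separator(", ");
        for (int i : array) {
            // append(Object) calls s.toString() once per iteration.
            builder.append(s).append(i);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new int[] {1, 2, 3})); // prints 1, 2, 3
    }
}

class Separator {
    private final String separator;
    private boolean first = true;

    Separator(String separator) {
        this.separator = separator;
    }

    // Returns "" on the first call and the separator on every later call.
    @Override
    public String toString() {
        if (first) {
            first = false;
            return "";
        }
        return separator;
    }
}
```

One design caveat: a toString() with side effects can be surprising, because anything else that calls it (a logger, a debugger) uses up the "first" call, and each Separator instance can serve only one join.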
Based on java.util.AbstractCollection.toString(), it exits early to avoid the delimiter.
StringBuffer buffer = new StringBuffer();
Iterator iter = s.iterator();
for (;;) {
buffer.append(iter.next());
if (! iter.hasNext())
break;
buffer.append(delimiter);
}
It's efficient and elegant, but not as self-evident as some of the other answers.
Here is a solution:
int[] array = {1, 2, 3...};
StringBuilder builder = new StringBuilder();
boolean firstiteration=true;
for(int i : array)
{
if(!firstiteration)
builder.append(",");
builder.append("" + i);
firstiteration=false;
}
Look for the first iteration :)
Yet another option.
StringBuilder builder = new StringBuilder();
for(int i : array)
builder.append(',').append(i);
String text = builder.toString();
if (text.startsWith(",")) text=text.substring(1);
Many of the solutions described here are a bit over the top, IMHO, especially those that rely on external libraries. There is a nice clean, clear idiom for achieving a comma separated list that I have always used. It relies on the conditional (?) operator:
Edit: Original solution correct, but non-optimal according to comments. Trying a second time:
int[] array = {1, 2, 3};
StringBuilder builder = new StringBuilder();
for (int i = 0 ; i < array.length; i++)
builder.append(i == 0 ? "" : ",").append(array[i]);
There you go, in 4 lines of code including the declaration of the array and the StringBuilder.
Here's a SSCCE benchmark I ran (related to what I had to implement) with these results:
elapsed time with checks at every iteration: 12055(ms)
elapsed time with deletion at the end: 11977(ms)
On my example at least, skipping the check at every iteration isn't noticeably faster especially for sane volumes of data, but it is faster.
import java.util.ArrayList;
import java.util.List;
public class TestCommas {
public static String GetUrlsIn(int aProjectID, List<String> aUrls, boolean aPreferChecks)
{
if (aPreferChecks) {
StringBuffer sql = new StringBuffer("select * from mytable_" + aProjectID + " WHERE hash IN ");
StringBuffer inHashes = new StringBuffer("(");
StringBuffer inURLs = new StringBuffer("(");
if (aUrls.size() > 0)
{
for (String url : aUrls)
{
if (inHashes.length() > 1) { // the buffers start with "(", so length 1 means no element has been added yet
inHashes.append(",");
inURLs.append(",");
}
inHashes.append(url.hashCode());
inURLs.append("\"").append(url.replace("\"", "\\\"")).append("\"");//.append(",");
}
}
inHashes.append(")");
inURLs.append(")");
return sql.append(inHashes).append(" AND url IN ").append(inURLs).toString();
}
else {
StringBuffer sql = new StringBuffer("select * from mytable_" + aProjectID + " WHERE hash IN ");
StringBuffer inHashes = new StringBuffer("(");
StringBuffer inURLs = new StringBuffer("(");
if (aUrls.size() > 0)
{
for (String url : aUrls)
{
inHashes.append(url.hashCode()).append(",");
inURLs.append("\"").append(url.replace("\"", "\\\"")).append("\"").append(",");
}
// Drop the trailing commas only if something was appended.
inHashes.deleteCharAt(inHashes.length()-1);
inURLs.deleteCharAt(inURLs.length()-1);
}
inHashes.append(")");
inURLs.append(")");
return sql.append(inHashes).append(" AND url IN ").append(inURLs).toString();
}
}
public static void main(String[] args) {
List<String> urls = new ArrayList<String>();
for (int i = 0; i < 10000; i++) {
urls.add("http://www.google.com/" + System.currentTimeMillis());
urls.add("http://www.yahoo.com/" + System.currentTimeMillis());
urls.add("http://www.bing.com/" + System.currentTimeMillis());
}
long startTime = System.currentTimeMillis();
for (int i = 0; i < 300; i++) {
GetUrlsIn(5, urls, true);
}
long endTime = System.currentTimeMillis();
System.out.println("elapsed time with checks at every iteration: " + (endTime-startTime) + "(ms)");
startTime = System.currentTimeMillis();
for (int i = 0; i < 300; i++) {
GetUrlsIn(5, urls, false);
}
endTime = System.currentTimeMillis();
System.out.println("elapsed time with deletion at the end: " + (endTime-startTime) + "(ms)");
}
}
Another approach is to have the length of the array (if available) stored in a separate variable (more efficient than re-checking the length each time). You can then compare your index to that length to determine whether or not to add the final comma.
EDIT: Another consideration is weighing the performance cost of removing a final character (which may cause a string copy) against having a conditional be checked in each iteration.
If you're only turning an array into a comma-delimited string, many languages have a join function for exactly this. It turns an array into a string with a delimiter between each element.
In this case there is really no need to know if it is the last repetition.
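In Java 8 and later the JDK itself offers such a join. A short sketch follows; the class name JdkJoin and its two helper methods are made up for this example, while String.join and Collectors.joining are standard library calls.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JdkJoin {
    // String.join works directly on strings (or any CharSequence).
    static String joinStrings(List<String> parts) {
        return String.join(",", parts);
    }

    // An int[] has to be turned into strings first.
    static String joinInts(int[] array) {
        return Arrays.stream(array)
                .mapToObj(String::valueOf)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(joinStrings(Arrays.asList("a", "b", "c"))); // prints a,b,c
        System.out.println(joinInts(new int[] {1, 2, 3}));             // prints 1,2,3
    }
}
```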
There are many ways we can solve this. One way would be:
String del = null;
for(int i : array)
{
if (del != null)
builder.append(del);
else
del = ",";
builder.append(i);
}
Two alternate paths here:
1: Apache Commons String Utils
2: Keep a boolean called first, set to true. In each iteration, if first is false, append your comma; after that, set first to false.
Since it's a fixed array, it would be easier simply to avoid the enhanced for loop. If the object is a collection, an iterator would be easier.
int nums[] = getNumbersArray();
StringBuilder builder = new StringBuilder();
// non enhanced version
for(int i = 0; i < nums.length; i++){
builder.append(nums[i]);
if(i < nums.length - 1){
builder.append(",");
}
}
//using iterator
Iterator<Integer> numIter = Arrays.stream(nums).boxed().iterator(); // generics cannot use primitives, so box the ints
while(numIter.hasNext()){
int num = numIter.next();
builder.append(num);
if(numIter.hasNext()){
builder.append(",");
}
}
You can use StringJoiner.
int[] array = { 1, 2, 3 };
StringJoiner stringJoiner = new StringJoiner(",");
for (int i : array) {
stringJoiner.add(String.valueOf(i));
}
System.out.println(stringJoiner);
I wrote a method to reduce a sequence of the same characters to a single character as follows. Its logic seems correct, but according to my tutor there is room for improvement in terms of performance. Could anyone shed some light on this?
Comments on aspects other than performance are also much appreciated.
public class RemoveRepetitions {
public static String remove(String input) {
String ret = "";
String last = "";
String[] stringArray = input.split("");
for(int j=0; j < stringArray.length; j++) {
if (! last.equals(stringArray[j]) ) {
ret += stringArray[j];
}
last = stringArray[j];
}
return ret;
}
public static void main(String[] args) {
System.out.println(RemoveRepetitions.remove("foobaarrbuzz"));
}
}
We can improve the performance by using StringBuilder instead of String concatenation, since each concatenation creates a new String. The split call is not required either (it also makes the program slower).
Here is a way to solve this:
public static String remove(String input)
{
StringBuilder answer = new StringBuilder("");
int N = input.length();
int i = 0;
while (i < N)
{
char c = input.charAt(i);
answer.append( c );
while (i<N && input.charAt(i)==c)
++i;
}
return answer.toString();
}
The idea is to iterate over all characters of the input string and keep appending every new character to the answer and skip all the same consecutive characters.
Possible changes to consider in your code:
Time complexity: your code makes a single pass over the input, which is the best you can do for the scan itself. Note, however, that each ret += inside the loop copies the string built so far, so the total cost can grow quadratically for long inputs.
Space complexity: your code uses extra memory for the array created by splitting.
Question to ask: can you produce this output without the extra array that split creates? (Character-by-character traversal is possible directly on the string.)
I could provide the code here, but it would be great if you tried it on your own first. Once you are done with your attempts,
you can look up a reference solution here (you are almost there):
https://www.geeksforgeeks.org/remove-consecutive-duplicates-string/
Good luck!
As mentioned before, it is much better to access the characters in the string using String::charAt, or at least by iterating over a char array retrieved with String::toCharArray, instead of splitting the input string into a String array.
However, Java strings may contain characters outside the Basic Multilingual Plane of Unicode (e.g. emojis 😂😍😊 and some rarer CJK ideographs), which are stored as two char values (a surrogate pair), and therefore String::codePointAt should be used. Correspondingly, Character.charCount should be used to calculate the appropriate offset while iterating over the input string.
The input string should also be checked for null or empty, so the resulting code may look like this:
public static String dedup(String str) {
if (null == str || str.isEmpty()) {
return str;
}
int prev = -1;
int n = str.length();
System.out.println("length = " + n + " of [" + str + "], real length: " + str.codePointCount(0, n));
StringBuilder sb = new StringBuilder(n);
for (int i = 0; i < n; ) {
int cp = str.codePointAt(i);
if (i == 0 || cp != prev) {
sb.appendCodePoint(cp);
}
prev = cp;
i += Character.charCount(cp); // for emojis it returns 2
}
return sb.toString();
}
A version with String::charAt may look like this:
public static String dedup2(String str) {
if (null == str || str.isEmpty()) {
return str;
}
int n = str.length();
StringBuilder sb = new StringBuilder(n);
sb.append(str.charAt(0));
for (int i = 1; i < n; i++) {
if (str.charAt(i) != str.charAt(i - 1)) {
sb.append(str.charAt(i));
}
}
return sb.toString();
}
The following test shows that the charAt version fails to deduplicate repeated emojis:
System.out.println("codePoint: " + dedup ("😂😂😍😍😊😊😂 hello"));
System.out.println("charAt: " + dedup2("😂😂😍😍😊😊😂 hello"));
Output:
length = 20 of [😂😂😍😍😊😊😂 hello], real length: 13
codePoint: 😂😍😊😂 helo
charAt: 😂😂😍😍😊😊😂 helo
Two functions are defined below. They do exactly the same thing: each takes a template (in which some substrings should be replaced) and an array of string values (key/value pairs to replace, e.g. [subStrToReplace1,value1,subStrToReplace2,value2,.....]) and returns the populated string.
In the second function I iterate over the words of the template, look up each one in the HashMap, and then move on to the next word. If I want to replace a word with some substring that itself contains another key from values, I need to iterate over the template twice. That's what I did.
I would like to know which one I should use, and why. Any alternatives better than these are also welcome.
1st function
public static String populateTemplate1(String template, String... values) {
String populatedTemplate = template;
for (int i = 0; i < values.length; i += 2) {
populatedTemplate = populatedTemplate.replace(values[i], values[i + 1]);
}
return populatedTemplate;
}
2nd function
public static String populateTemplate2(String template, String... values) {
HashMap<String, String> map = new HashMap<>();
for (int i = 0; i < values.length; i += 2) {
map.put(values[i],values[i+1]);
}
StringBuilder regex = new StringBuilder();
boolean first = true;
for (String word : map.keySet()) {
if (first) {
first = false;
} else {
regex.append('|');
}
regex.append(Pattern.quote(word));
}
Pattern pattern = Pattern.compile(regex.toString());
int N0OfIterationOverTemplate =2;
// Pattern allowing to extract only the words
// Pattern pattern = Pattern.compile("\\w+");
StringBuilder populatedTemplate = new StringBuilder();
String temp_template=template;
while(N0OfIterationOverTemplate!=0){
populatedTemplate = new StringBuilder();
Matcher matcher = pattern.matcher(temp_template);
int fromIndex = 0;
while (matcher.find(fromIndex)) {
// The start index of the current word
int startIdx = matcher.start();
if (fromIndex < startIdx) {
// Add what we have between two words
populatedTemplate.append(temp_template, fromIndex, startIdx);
}
// The current word
String word = matcher.group();
// Replace the word by itself or what we have in the map
// populatedTemplate.append(map.getOrDefault(word, word));
if (map.get(word) == null) {
populatedTemplate.append(word);
}
else {
populatedTemplate.append(map.get(word));
}
// Start the next find from the end index of the current word
fromIndex = matcher.end();
}
if (fromIndex < temp_template.length()) {
// Add the remaining sub String
populatedTemplate.append(temp_template, fromIndex, temp_template.length());
}
N0OfIterationOverTemplate--;
temp_template=populatedTemplate.toString();
}
return populatedTemplate.toString();
}
Definitely the first one, for at least two reasons:
It is easier to read and shorter, so it is easier to maintain and much less error-prone.
It doesn't rely on a regular expression, so it should be faster.
The first function is much clearer and easier to understand. I would prefer it unless you find out (by a profiler) that it takes a considerable amount of time and slows your application down. Then you can figure out how to optimize it.
Why make things complicated when you can keep them simple?
Keep in mind that simple solutions tend to be the best.
FYI, if the number of elements is odd, you will get an ArrayIndexOutOfBoundsException.
I propose this improvement:
public static String populateTemplate(String template, String... values) {
String populatedTemplate = template;
int nextTarget = 2;
int lastTarget = values.length - nextTarget;
for (int i = 0; i <= lastTarget; i += nextTarget) {
String target = values[i];
String replacement = values[i + 1];
populatedTemplate = populatedTemplate.replace(target, replacement);
}
return populatedTemplate;
}
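The version above silently ignores a dangling key when the number of values is odd. If you would rather be told about the mistake, a variant that rejects odd-length input up front could look like the sketch below. The class name StrictTemplate and the choice of IllegalArgumentException are assumptions made for this example, not part of the original answer.

```java
class StrictTemplate {
    // Same replacement loop as populateTemplate1, but an odd number of
    // values is reported immediately instead of failing mid-loop.
    static String populateTemplate(String template, String... values) {
        if (values.length % 2 != 0) {
            throw new IllegalArgumentException(
                    "Expected key/value pairs but got " + values.length + " values");
        }
        String populated = template;
        for (int i = 0; i < values.length; i += 2) {
            populated = populated.replace(values[i], values[i + 1]);
        }
        return populated;
    }

    public static void main(String[] args) {
        // prints Hello Ann, welcome to Oslo
        System.out.println(populateTemplate(
                "Hello {name}, welcome to {place}",
                "{name}", "Ann", "{place}", "Oslo"));
    }
}
```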
"Good programmers write code that humans can understand". Martin Fowler
I'm creating a Bukkit plugin, and one of its features is to show the plugins on the server. Here's my code that handles the plugin listing:
for(int i = 0; i < plugins.length; i++){
String conplugin = plugins[i].toString();
String[] conplugin2 = conplugin.split(" ");
if(i + 1 == plugins.length) {
pluginlist.add(ChatColor.BLUE + conplugin2[0]);
} else {
pluginlist.add(ChatColor.BLUE + conplugin2[0] + ChatColor.DARK_GRAY + ", " );
}
}
I want to get all the strings from the array (pluginlist) and make one string out of them.
If you want to construct a String from a String array, you could use a for loop and append the array element to the end of your new string.
StringBuilder newString = new StringBuilder();
for (int i = 0; i < arr.length; i++) {
newString.append(arr[i]);
}
// Convert the builder to the final String.
return newString.toString();
You could also concatenate onto a String, but depending on the size of the array of plugins, it would probably be faster to use a StringBuilder.
Use a StringBuilder, as it is mutable and you can append to it. Strings are immutable, so every concatenation creates a new String.
You can use the previous StringBuilder example.
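Applied to the plugin list, the idea could look like the following sketch. Plain strings stand in for the Bukkit plugin objects and the ChatColor prefixes are left out, so the class name PluginListJoin, the method joinNames and the sample entries are all made up for this example.

```java
class PluginListJoin {
    // Takes the first space-separated token of each description (the name)
    // and joins the names with ", " into a single String.
    static String joinNames(String[] descriptions) {
        StringBuilder joined = new StringBuilder();
        for (String description : descriptions) {
            if (joined.length() > 0) {
                joined.append(", ");
            }
            joined.append(description.split(" ")[0]);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String[] plugins = {"PluginA v1.0", "PluginB v2.3", "PluginC v0.9"};
        System.out.println(joinNames(plugins)); // prints PluginA, PluginB, PluginC
    }
}
```

Because the separator is added inside the join, the per-element "is this the last one?" check from the question is no longer needed.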
I know this question has already been asked several times, but I can't find a way to apply the answers to my code.
So my goal is the following:
I have two files, griechenland_test.txt and outagain5.txt. I want to read them and then find out what percentage of outagain5.txt is contained in the other file.
outagain5.txt has input like this:
mit dem 542824
und die 517126
And Griechenland is a normal article from Wikipedia about that topic (so plain text, without frequency counts).
1. Problem
- How can I split the input into bigrams? That is, every two words, but always overlapping with the previous one? So if I have words A, B, C, D --> get AB, BC, CD?
I have this:
while ((sCurrentLine = in.readLine()) != null) {
// System.out.println(sCurrentLine);
arr = sCurrentLine.split(" ");
for (int i = 0; i < arr.length; i++) {
if (null == hash.get(arr[i])) {
hash.put(arr[i], 1);
} else {
int x = hash.get(arr[i]) + 1;
hash.put(arr[i], x);
}
}
}
Then I read the other file with this code (I only add the words, not the number; I split on 4 spaces, so the two words are at h[0]):
for (String line = br.readLine(); line != null; line = br.readLine()) {
String h[] = line.split("    ");
words.add(h[0]);
}
2. Problem
Now I make the comparison between the String x in hash and the String s in words. I added the System.out.println in the else branch to see which words are not contained in outagain5.txt, but several words are printed that ARE contained in outagain5.txt. I don't understand why :D
So I think that the comparison doesn't work well, or maybe this will be solved once I fix the first problem.
ArrayList<String> words = new ArrayList<String>();
ArrayList<String> neuS = new ArrayList<String>();
ArrayList<Long> neuZ = new ArrayList<Long>();
for (String x : hash.keySet()) {
summe = summe + hash.get(x);
long neu = hash.get(x);
for (String s : words) {
if (x.equals(s)) {
neuS.add(x);
neuZ.add(neu);
disc = disc + 1;
} else {
System.out.println(x);
break;
}
}
}
Hope I made my question clear, thanks a lot!!
public static List<String> ngrams(int n, String str) {
List<String> ngrams = new ArrayList<String>();
String[] words = str.split(" ");
for (int i = 0; i < words.length - n + 1; i++)
ngrams.add(concat(words, i, i+n));
return ngrams;
}
public static String concat(String[] words, int start, int end) {
StringBuilder sb = new StringBuilder();
for (int i = start; i < end; i++)
sb.append((i > start ? " " : "") + words[i]);
return sb.toString();
}
It is much easier to use the generic "n-gram" approach, so you can group every 2 or 3 words if you want. I have used this exact code almost any time I needed to split words in the (AB), (BC), (CD) format. The code comes from the linked page, NGram Sequence.
If I recall correctly, String has a method split(String regex, int limit) that splits the string around matches of the pattern, where the limit caps how many pieces you get back.
I am referencing this JavaDoc https://docs.oracle.com/javase/6/docs/api/java/lang/String.html#split(java.lang.String, int).
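As a small illustration of the limit parameter, here is a sketch using one of the sample lines from the question; the class name SplitLimitDemo and the helper splitOnce are made up for this example.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // With a limit of 2 the string is cut at the first match only;
    // everything after it stays together in the second element.
    static String[] splitOnce(String text) {
        return text.split(" ", 2);
    }

    public static void main(String[] args) {
        // prints [mit, dem 542824]
        System.out.println(Arrays.toString(splitOnce("mit dem 542824")));
    }
}
```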
As for running the comparison between the two text files, I would recommend having your code read both of them, populate two separate collections, and then compare the entries of one against the other. Hope I helped.
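One way to do that comparison is to put the bigrams of the article into a Set and look each known bigram up in it. This avoids the nested loop from the question, whose else branch fires as soon as the first entry of words does not match, even if a later entry would. The sketch below works on in-memory strings rather than files, and the class name BigramOverlap and its methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BigramOverlap {
    // Overlapping word pairs: A B C D -> "A B", "B C", "C D".
    static List<String> bigrams(String text) {
        String[] words = text.trim().split("\\s+");
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < words.length; i++) {
            result.add(words[i] + " " + words[i + 1]);
        }
        return result;
    }

    // Percentage of the known bigrams that occur somewhere in the text.
    static double percentFound(Set<String> known, String text) {
        if (known.isEmpty()) {
            return 0.0;
        }
        Set<String> inText = new HashSet<>(bigrams(text));
        int found = 0;
        for (String bigram : known) {
            if (inText.contains(bigram)) {
                found++;
            }
        }
        return 100.0 * found / known.size();
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("mit dem", "und die"));
        // One of the two known bigrams occurs, so this prints 50.0
        System.out.println(percentFound(known, "er kam mit dem Zug"));
    }
}
```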
I have the following string:
A:B:1111;domain:80;a;b
The A is optional so B:1111;domain:80;a;b is also valid input.
The :80 is optional as well so B:1111;domain;a;b or :1111;domain;a;b are also valid input
What I want is to end up with a String[] that has:
s[0] = "A";
s[1] = "B";
s[2] = "1111";
s[3] = "domain:80"
s[4] = "a"
s[5] = "b"
I did this as follows:
List<String> tokens = new ArrayList<String>();
String[] values = s.split(";");
String[] actions = values[0].split(":");
for(String a:actions){
tokens.add(a);
}
//Start from 1 to skip A:B:1111
for(int i = 1; i < values.length; i++){
tokens.add(values[i]);
}
String[] finalResult = tokens.toArray(new String[0]);
I was wondering: is there a better way to do this? How could I do it more efficiently?
There are not many efficiency concerns here; everything I see is linear.
Anyway, you could either use a regular expression or a manual tokenizer.
You can avoid the list. You know the length of values and actions, so you can do
String[] values = s.split(";");
String[] actions = values[0].split(":");
String[] result = new String[actions.length + values.length - 1];
System.arraycopy(actions, 0, result, 0, actions.length);
System.arraycopy(values, 1, result, actions.length, values.length - 1);
return result;
It should be reasonably efficient, unless you insist on implementing split yourself.
Untested low-level approach (make sure to unit test and benchmark before use):
// Separator characters, as char, not string.
final static int s1 = ':';
final static int s2 = ';';
// Compute required size:
int components = 1;
for(int p = Math.min(s.indexOf(s1), s.indexOf(s2));
p < s.length() && p > -1;
p = s.indexOf(s2, p+1)) {
components++;
}
String[] result = new String[components];
// Build result
int in=0, i=0, out=Math.min(s.indexOf(s1), s.indexOf(s2));
while(out < s.length() && out > -1) {
result[i] = s.substring(in, out);
i++;
in = out + 1;
out = s.indexOf(s2, in);
}
assert(i == result.length - 1);
result[i] = s.substring(in, s.length());
return result;
Note: this code is optimized in a rather crazy way, in that it will consider a : only in the first component. Handling the last component is a bit tricky, as out will have the value -1.
I would usually not use this last approach unless performance and memory are extremely crucial. Most likely there are still some bugs in it, and the code is fairly unreadable, in particular compared to the one above.
With some assumptions about acceptable characters, this regex provides validation as well as splitting into the groups you desire.
Pattern p = Pattern.compile("^((.+):)?(.+):(\\d+);(.+):(\\d+);(.+);(.+)$");
Matcher m = p.matcher("A:B:1111;domain:80;a;b");
if(m.matches())
{
for(int i = 0; i <= m.groupCount(); i++)
System.out.println(m.group(i));
}
m = p.matcher("B:1111;domain:80;a;b");
if(m.matches())
{
for(int i = 0; i <= m.groupCount(); i++)
System.out.println(m.group(i));
}
Gives:
A:B:1111;domain:80;a;b // ignore this
A: // ignore this
A // This is the optional A, check for null
B
1111
domain
80
a
b
And
B:1111;domain:80;a;b // ignore this
null // ignore this
null // This is the optional A, check for null
B
1111
domain
80
a
b
You could do something like:
String str = "A:B:1111;domain:80;a;b";
String[] temp;
/* delimiter */
String delimiter = ";";
/* given string will be split by the argument delimiter provided. */
temp = str.split(delimiter);
/* print substrings */
for(int i =0; i < temp.length ; i++)
System.out.println(temp[i]);
Unless this is a bottleneck in your code, and you have verified that, don't worry much about efficiency, as the logic here is reasonable. You can avoid creating the temporary ArrayList and instead create the array directly, as you know the required size.
If you want to keep the domain and port together, then I believe that you will need two splits. You may be able to do it with some regex magic, but I doubt that you will see any real performance gain from it.
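A sketch of the two-split variant, under the assumption that only the part before the first ; should be broken up on : (the class name TwoSplits and the method tokenize are made up for this example):

```java
import java.util.Arrays;

class TwoSplits {
    // Splits only the first ';'-separated component on ':' and keeps the
    // later components (such as "domain:80") intact.
    static String[] tokenize(String s) {
        String[] headAndRest = s.split(";", 2); // at most two pieces
        String[] head = headAndRest[0].split(":");
        if (headAndRest.length == 1) {
            return head; // no ';' in the input at all
        }
        String[] rest = headAndRest[1].split(";");
        String[] result = Arrays.copyOf(head, head.length + rest.length);
        System.arraycopy(rest, 0, result, head.length, rest.length);
        return result;
    }

    public static void main(String[] args) {
        // prints [A, B, 1111, domain:80, a, b]
        System.out.println(Arrays.toString(tokenize("A:B:1111;domain:80;a;b")));
    }
}
```

Using a limit of 2 on the first split means the remainder is only scanned for ; once, and no intermediate list is needed.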
If you do not mind splitting the domain and port, then:
String s= "A:B:1111;domain:80;a;b";
List<String> tokens = new ArrayList<String>();
String[] values = s.split(";|:");
for(String a : values){
tokens.add(a);
}