Copy array using Regular Expression in Java - java

I'm trying to copy a String array into another String array. However the copy has to contain only parts of each string of the original Array.
For example, if we have
String[] originalArray = {"/data/test2/", "/data/test4/", "/data/dropbox/test5/"}
I want the copy array to be
String[] copyArray = {"test2", "test4", "test5"}
My solution would be to simply iterate through the original array, use a regular expression to grab the last part of each string in originalArray, and build a copyArray consisting of those values.
Is the above method valid, or is there a more efficient solution? Also, what regular expression would I use for this case? The way I'm doing it seems a bit too brute-force.
Ideally, I would just manually create the copyArray, but in this case, the size of the originalArray and the precise content is unknown.
Edit:
This seems trivial but for some reason it's not working.
I added the regular expression. It seems to work in the tester but it's not working as I wanted it to in the program. I first converted the originalArray into a String with | appended for the regex.
String pattern = "/\\w+(?=\\||$)/g";
String testArray = originalArray.replaceAll(pattern," ");
However test array is just giving me the original concatenated String without the regex applied.

Up to Java 7 you need to code a loop, but Java 8 introduced streams that allow a fluent one-line solution:
String[] names = Arrays.stream(originalArray)
    .map(s -> s.replaceAll(".*/", ""))
    .toArray(String[]::new);
The important bit is the lambda expression that converts the path to the name by using a regex to replace everything up to and including the last slash with an empty string (effectively removing it). Note that toArray needs the String[]::new generator; without an argument it returns an Object[].
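As a rough, self-contained sketch of the stream approach: the class and method names below are made up for illustration, and the extra step that strips trailing slashes is an assumption added because the paths in the question end with "/" (without it, ".*/" would match the whole string and leave nothing).

```java
import java.util.Arrays;

class LastPathSegments {
    // Hypothetical helper: returns the last non-empty segment of each path.
    static String[] lastSegments(String[] paths) {
        return Arrays.stream(paths)
                .map(s -> s.replaceAll("/+$", ""))  // assumption: drop trailing slashes first
                .map(s -> s.replaceAll(".*/", ""))  // drop everything up to the last slash
                .toArray(String[]::new);            // typed array instead of Object[]
    }

    public static void main(String[] args) {
        String[] originalArray = {"/data/test2/", "/data/test4/", "/data/dropbox/test5/"};
        System.out.println(Arrays.toString(lastSegments(originalArray)));
    }
}
```

The original array is left untouched; the stream builds a new array of the same length.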

Your method is fine. Note that the length of an array is always known through its length field. You should create the copy using new String[originalArray.length].

If it is an array of strings, you could join it using something like | and then apply a single regex to the whole string, like:
/\w+(?=\||$)/g
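In Java, the /.../g wrapper from that regex literal is not part of the pattern, which is probably why the attempt in the question's edit matched nothing. A minimal sketch of the join-then-match idea, assuming paths without trailing slashes (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JoinedRegexDemo {
    // Hypothetical helper: joins the paths with '|' and collects every word
    // that is directly followed by '|' or by the end of the input.
    static List<String> lastSegments(String[] paths) {
        String joined = String.join("|", paths);
        // No leading "/" or trailing "/g": repeated find() calls do the job of the g flag.
        Matcher m = Pattern.compile("\\w+(?=\\||$)").matcher(joined);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] originalArray = {"/data/test2", "/data/test4", "/data/dropbox/test5"};
        System.out.println(lastSegments(originalArray));
    }
}
```

Collecting matches with find() also avoids replaceAll, which would have to remove everything that is not a match.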

There is no need to use regex for this simple requirement. Simply use String#lastIndexOf() to get the index of last / and use String#substring() method to get the desired sub-string.
Sample code:
String[] originalArray = {"/data/test2", "/data/test4", "/data/dropbox/test5"};
String[] copyArray = new String[originalArray.length];
for (int i = 0; i < originalArray.length; i++) {
    copyArray[i] = originalArray[i].substring(originalArray[i].lastIndexOf("/") + 1);
}

String[] files(String[] originalArray) {
    List<String> copy = new ArrayList<>(originalArray.length);
    for (String s : originalArray)
        // Keep only what follows the last slash; the greedy ".*/" consumes the rest.
        copy.add(s.replaceFirst(".*/([^/]*)$", "$1"));
    return copy.toArray(new String[copy.size()]);
}

Related

String.split() returns an array with an additional empty value

I'm working on a piece of code where I have to split a string into individual parts. The basic logic flow of my code is this: the numbers below on the LHS, i.e. 1, 2 and 3, are ids of an object. Once I split them, I'd use these ids, get the respective values, and replace the ids in the string below with those values. The string that I have is as follows -
String str = "(1+2+3)>100";
I've used the following code for splitting the string -
String[] arraySplit = str.split("\\>|\\<|\\=");
String[] finalArray = arraySplit[0].split("\\(|\\)|\\+|\\-|\\*");
Now the arrays that I get are as such -
arraySplit = [(1+2+3), >100];
finalArray = [, 1, 2, 3];
So, after the string is split, I'd replace the string with the values, i.e the string would now be, (20+45+50)>100 where 20, 45 and 50 are the respective values. (this string would then be used in SpEL to evaluate the formula)
I'm almost there, just that I'm getting an empty element at the first position. Is there a way to not get the empty element in the second array, i.e finalArray? Doing some research on this, I'm guessing it is splitting the string (1+2+3) and taking an empty element as a part of the string.
If this is the thing, then is there any other method apart from String.split() that would give me the same result?
Edit -
Here, (1+2+3)>100 is just an example. The round braces are part of a formula, and the string could also be as ((1+2+3)*(5-2))>100.
Edit 2 -
After splitting this String and doing some processing on it, I'm going to use this string in SpEL. So if there's a better solution that uses SpEL directly, that would be great too.
Also, currently I'm using this formula syntax - (1+2+3) * 4>100 - but if there's a way out by changing the formula syntax a bit, that would also be helpful, e.g. replacing the formula with ({#1}+{#2}+{#3}) * {#4}>100; in this case I'd find the variables using {# as the marker and get the numbers.
I hope this part is clear.
Edit 3 -
Just in case, SpEL is also there in my project although I don't have much idea on it, so if there's a better solution using SpEL then its more than welcome. The basic logic of the question is written at the starting of the question in bold.
If you take a look at the documentation of split(String regex, int limit), it says (emphasis mine):
When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array.
Note that specifying 0 as the limit parameter does not remove that leading element; a zero limit only discards trailing empty strings:
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
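One way to get rid of the empty elements, sketched here with a hypothetical helper name, is to keep using split and filter out empty tokens afterwards. The character class below is an assumption covering the operators and braces that appear in the question's examples.

```java
import java.util.Arrays;

class SplitWithoutEmpty {
    // Hypothetical helper: splits on ( ) + - * and drops every empty token,
    // including the leading one produced by a match at index 0.
    static String[] splitIds(String expr) {
        return Arrays.stream(expr.split("[()+\\-*]"))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitIds("(1+2+3)")));
        System.out.println(Arrays.toString(splitIds("((1+2+3)*(5-2))")));
    }
}
```

Filtering also handles nested braces, where consecutive delimiters would otherwise produce empty strings in the middle of the array.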
If you keep things really simple, you may be able to get away with using a combination of regular expressions and string operations like split and replace.
However, it looks to me like you'd be better off writing a simple parser using ANTLR.
Take a look at Parsing an arithmetic expression and building a tree from it in Java and https://theantlrguy.atlassian.net/wiki/display/ANTLR3/Five+minute+introduction+to+ANTLR+3
Edit: I haven't used ANTLR in a while - it's now up to version 4, and there may be some significant differences, so make sure that you check the documentation for that version.

How can I let a set retain all strings that contain my specified substring in java?

I use a hashset for a dictionary. Now I would like to filter out words that do not start with my substring. So it should be something like this:
String word = "ab";
List<String> list = Arrays.asList(word);
boolean result = lexiconSet.retainAll(list);
And instead of this resulting in the lexicon only containing the word 'ab', I would like to keep all words beginning with 'ab'. How can I do this?
I know I can convert the set to a string arraylist, and loop over all elements to see if the strings starts with 'ab', but since I think this can be time consuming and not efficient, I would like to hear better solutions. Thank you in advance!
With Java 8, life is easy:
lexiconSet.removeIf(s -> !s.startsWith("ab"));
This will remove all elements that don't begin with "ab".
Note that removeIf is defined on Collection, so it works directly on the set (or on the values() view of a map), without the need to convert to an ArrayList.
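A small sketch of that call on a HashSet. The class and method names are illustrative only, and the defensive copy is an assumption made so that the caller's set is not modified; drop it to filter in place.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PrefixFilterDemo {
    // Hypothetical helper: returns a copy of words containing only entries that start with prefix.
    static Set<String> keepPrefix(Set<String> words, String prefix) {
        Set<String> copy = new HashSet<>(words);       // leave the caller's set untouched
        copy.removeIf(s -> !s.startsWith(prefix));     // single pass over the set
        return copy;
    }

    public static void main(String[] args) {
        Set<String> lexiconSet = new HashSet<>(Arrays.asList("ab", "abc", "abd", "xyz"));
        System.out.println(keepPrefix(lexiconSet, "ab"));
    }
}
```

If prefix lookups are frequent, a sorted set might be worth considering, but for a one-off filter a single pass like this is usually enough.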

Can I add a char to a variable specified position within a string?

OK, this is the line I am working on:
newstring.charAt(w) += p;
trying to add a character/char (p) to the string 'newstring' at a particular position within the string which is defined by int 'w'. Is this possible?
Strings are immutable in Java, so the answer is no. But there are many ways around it. The easiest is to create a StringBuilder and use the setCharAt() method. Or insert() if you want to insert a new character at a given position.
If you make multiple modifications to your string, you can (and indeed should) reuse your StringBuilder.
Well, you can't modify your string, because Strings are immutable in Java. Any operation that appears to change a string gives you a new String object as a result.
You can use the String#substring method for that: build a new string by concatenating substrings of the original around the new character:
str = str.substring(0, w) + p + str.substring(w);
But, of course, using StringBuilder as described in @biziclop's answer is the best approach you can follow.
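For illustration, a minimal sketch of both StringBuilder operations mentioned above; the helper names are made up, while w and p follow the question.

```java
class InsertCharDemo {
    // Inserts p before index w, shifting the rest of the string to the right.
    static String insertAt(String s, int w, char p) {
        return new StringBuilder(s).insert(w, p).toString();
    }

    // Overwrites the character at index w with p; the length stays the same.
    static String replaceAt(String s, int w, char p) {
        StringBuilder sb = new StringBuilder(s);
        sb.setCharAt(w, p);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(insertAt("hello", 2, 'X'));
        System.out.println(replaceAt("hello", 2, 'X'));
    }
}
```

When several edits are applied to the same text, keeping one StringBuilder around and calling toString() once at the end avoids creating intermediate strings.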

Java String split is not working

Java experts,
Please look at the split code below and let me know why the last two empty values are not captured.
String test = "1,O1,,,,0.0000,0.0000,,";
String[] splittest = test.split(",");
System.out.println("length -"+splittest.length);
for (String string : splittest) {
System.out.println("value"+string);
}
The result I am getting:
length -7
value1
valueO1
value
value
value
value0.0000
value0.0000
Surprisingly, the length is 7 whereas it should be 9, and as you can see, the values after 0.0000, i.e. the last two empty values, are missing. Now let's say I change the string test to
"1,O1,,,,0.0000,0.0000,0,0"
String test = "1,O1,,,,0.0000,0.0000,0,0";
String[] splittest = test.split(",");
System.out.println("length -"+splittest.length);
for (String string : splittest) {
System.out.println("value"+string);
}
I am getting correctly
length -9
value1
valueO1
value
value
value
value0.0000
value0.0000
value0
value0
I don't think I am doing anything wrong. Is it a bug? Java version: jdk1.6.0_31
It behaves as specified in the javadoc:
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
If you want to keep the trailing blank strings, you can use the 2 argument split method with a negative limit:
String[] splittest = test.split(",", -1);
If the limit is non-positive then the pattern will be applied as many times as possible and the array can have any length.
split silently discards trailing separators, as specified in the Javadoc.
In general, the behavior of split is kind of weird. Consider using Guava's Splitter instead, which has somewhat more predictable and customizable behavior. (Disclosure: I contribute to Guava.)
Splitter.on(',').split("1,O1,,,,0.0000,0.0000,,");
// returns [1, O1, , , , 0.0000, 0.0000, , ]
Splitter.on(',').omitEmptyStrings()
.split("1,O1,,,,0.0000,0.0000,,");
// returns [1, O1, 0.0000, 0.0000]
As mentioned above, test.split(","); will ignore trailing blank strings. You could use the two parameter method with a large second argument. However, the API also states
If n is non-positive then the pattern will be applied as many times
as possible and the array can have any length.
where n is the second argument. So if you want all the trailing strings, I would recommend
test.split(",", -1);
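A quick way to see the difference between the two limits, using the string from the question (the class and method names are just for this sketch):

```java
class TrailingEmptyDemo {
    // Returns the number of fields produced by splitting line on commas with the given limit.
    static int countFields(String line, int limit) {
        return line.split(",", limit).length;
    }

    public static void main(String[] args) {
        String test = "1,O1,,,,0.0000,0.0000,,";
        // limit 0 behaves like split(","): trailing empty strings are dropped
        System.out.println("limit 0:  " + countFields(test, 0));
        // a negative limit keeps every field, including trailing empty ones
        System.out.println("limit -1: " + countFields(test, -1));
    }
}
```

Empty fields in the middle of the string are kept in both cases; only the trailing ones are affected by the limit.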

How do I split a concatenated string into multiple floating point values?

I'm a beginner in Java. I have
packet=090209153038020734.0090209153039020734.0
and I want to split this string and store it in an array as two strings:
1) 090209153038020734.0
2) 090209153039020734.0
I have done like this:
String packetArray[] = packets.split(packets,Constants.SF);
Where:
Constants.SF=0x01.
But it won't work.
Please help me.
I'd think twice about using split since those are obviously fixed width fields.
I've seen them before on another question here (several in fact so I'm guessing this may be homework (or a popular data collection device :-)) and it's plain that the protocol is:
STX (0x01).
0x0f.
date (YYMMDD or DDMMYY).
time (HHMMSS).
0x02.
value (XXXXXX.X).
0x03.
0x04.
And, given that they're fixed width, you should probably just use substrings to get the information out.
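A rough sketch of the substring approach, assuming the field layout listed above with one character per control byte, a 6-character date, a 6-character time and an 8-character value (25 characters per packet). The offsets and names are assumptions for illustration and should be checked against real data.

```java
class FixedWidthPacket {
    // Assumed layout: 0x01, 0x0f, date (6), time (6), 0x02, value (8), 0x03, 0x04.
    static final int PACKET_LENGTH = 25;

    // Hypothetical helper: returns {date, time, value} for a single packet.
    static String[] parse(String packet) {
        String date = packet.substring(2, 8);     // after the two leading control characters
        String time = packet.substring(8, 14);
        String value = packet.substring(15, 23);  // skips the 0x02 separator at index 14
        return new String[] {date, time, value};
    }

    public static void main(String[] args) {
        String packet = "\u0001\u000f" + "090209" + "153038" + "\u0002" + "020734.0" + "\u0003\u0004";
        String[] fields = parse(packet);
        System.out.println(fields[0] + " " + fields[1] + " " + fields[2]);
    }
}
```

For a buffer holding several packets back to back, the same offsets could be applied to each PACKET_LENGTH-sized slice.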
The JavaDoc of String is helpful here: http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html
You have your String packet;
String.indexOf(String) gives you the position of a given substring. You're interested in the "." sign, so you write
int position = packet.indexOf(".") + 2;
+2 because you want the "." and the trailing decimal digit too. It will return something 20-ish: the index just past the end of the first number.
Then we use substring
String first = packet.substring(0, position); will give you everything up to and including the ".0"
String second = packet.substring(position); should give you everything after the ".0", up to the end of the string.
Now if you want them explicitly in an array you can just put them there. The code as a whole (double-check the indices against your data for "off by one" bugs):
int position = packet.indexOf(".") + 2;
String first = packet.substring(0, position);
String second = packet.substring(position);
String[] packetArray = new String[2];
packetArray[0] = first;
packetArray[1] = second;
String packetArray[] = packets.split("\u0001");
should work. You are using
public String[] split(String regex, int limit)
which is doing something else: It makes sure that split() returns an array with at most limit members (1 in this case, so you get what you ask for).
You need to read the Javadocs for the String.split() methods...you are calling the version of String.split() that takes a regular expression and a limit, but you are passing the string itself as the first parameter, which doesn't really make sense.
As Aaron Digulla mentioned, use the other version.
You don't say how you want to do the split. It could be based on a fixed length (number of characters), or you may want to split after one decimal place.
If the former, you could do
packetArray = new String[]{packet.substring(0, 20), packet.substring(20)};
If the latter:
int dotIndex = packet.indexOf('.');
packetArray = new String[]{packet.substring(0, dotIndex+2), packet.substring(dotIndex+2)};
Your solution confuses the regexp with the string.
split uses regular expressions, as its documentation states. Your code seems to be trying to match the whole string Constants.SF = 0x01 times, which doesn't make much sense. If you know what character the boxes are, then you can use something like {[^c]+cc}, where c is the character of the box (I guess this is 0x01), to match each "packet".
I think you are trying to use it like the .net String.Split(...) function?
