Regex for epoch time in milliseconds using Java

I have this string:
String str = "8240d66c-4771-4fae-9631-8a420f9099ca,458,cross_vendor_certificate_will_expire,1565102543758";
I would like to remove the epoch time from the string using a regex. I've searched the web but didn't find a suitable solution.
This is my code I have so far:
public void createHashMapWithAlertCSVContent() throws Exception {
    for (String item : lstServer) { // lstServer is a list containing the names of the CSV files
        // Function to get the CSV contents, split into lines
        String[] contentCSVStr = CmdHelper.Remote.File.cat(SERVER, INDENI_INSIGHT_PATH + "/" + item).split("\n");
        // Finally I add each String[] to the hashmap: key is the CSV file name, value is the content of that CSV file
        mapServer.put(FileUtil.removeExtension(item), contentCSVStr);
    }
    Assert.assertEquals(mapServer.size(), lstServer.size());
    mapServer.remove("job");
}
example of possible content:
1. 0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,1566345600000,5,1565102549213
2. 8240d66c-4771-4fae-9631-8a420f9099ca,0,1565102673040
3. 8240d66c-4771-4fae-9631-8a420f9099ca,0.0.0.develop,4418,1565102673009
EDIT:
The epoch time might be at any location in the string and might exist more than once in the string.
The length of the epoch time string is always at least 10 digits.

String input = "0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,1566345600000,5,1565102549213";
String output = input.replaceAll("\\d{10,},|,\\d{10,}", "");
System.out.println(output);
Output:
0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,5
The vertical bar | in the regular expression denotes two options, one with a number and a comma, the other with the comma before the number. This takes into account that the timestamp may be first or last in the string or somewhere between.
\\d denotes a digit and {10,} that there are at least 10 of them with no upper limit. Please consider yourself whether the lower limit should be 10, 13 or some other number of digits.
Corner case: if the string consists of only one or more timestamps, the above replacement will not remove the last one of them since it insists on removing one comma with each timestamp, and a string consisting of only one timestamp will not have a comma in it.
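One way around that corner case is to skip the comma handling entirely: split the line on commas, drop the fields that look like timestamps, and join the rest. The sketch below assumes the fields never contain embedded or quoted commas; the class name, the method name and the minDigits parameter are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class EpochFieldStripper {

    // Hypothetical helper: removes every comma-separated field that consists
    // only of digits and is at least minDigits long.
    static String stripEpochFields(String csvLine, int minDigits) {
        String epochPattern = "\\d{" + minDigits + ",}";
        return Arrays.stream(csvLine.split(",", -1)) // limit -1 keeps trailing empty fields
                .filter(field -> !field.matches(epochPattern))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String input = "0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,1566345600000,5,1565102549213";
        System.out.println(stripEpochFields(input, 10));
        // A line that holds nothing but a timestamp becomes an empty string.
        System.out.println(stripEpochFields("1565102549213", 10).isEmpty());
    }
}
```

Because whole fields are filtered, a digit run inside a longer field (for example within the UUID) is never touched.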

Related

regex to match a suffix in a given string

I have a function which is supposed to validate a string to not contain the below prefix
I want to match every word with
__test_timestamp__itemname
some examples are as follows
__test_1349333576093__cellphone_modelc1
__test_1349333576090__macbook_model_12
public boolean isvalid(String Name){
    /* pattern match to check for the prefix and return true if the string starts with
       __test_timestamp__
    */
}
The item name in this string can vary, and so will the timestamp, which is in milliseconds. However, the timestamp is always 13 characters long and consists of digits. The item name can also contain numbers and underscores.
How do I write a function to match this pattern? Thank you in advance for helping!
I'm not familiar with Java, but the regex is something like this:
^__test_[0-9]{13}__[A-Za-z0-9_]+$
^: start of the string
[0-9]{13}: exactly 13 digits
[A-Za-z0-9_]+: one or more uppercase/lowercase letters, digits or underscores
$: end of the string
https://regex101.com/r/oWBfes/2
Edit: If you need more flexibility for the timestamp, you can set min and max like this:
{11,13}
^__test_[0-9]{11,13}__[A-Za-z0-9_]+$
Edit: add 100 max length:
(?=^.{0,100}$)(^__test_[0-9]{11,13}__[A-Za-z0-9_]+$)
Edit: to capture the item name in a group:
(?=^.{0,100}$)^__test_[0-9]{11,13}__([A-Za-z0-9_]+)$
To capture the part you want, wrap it in ().
You may use String#matches as follows:
public boolean isvalid(String name) {
    return name.matches("__test_\\d+__\\S+");
}
Note that we don't assign any fixed width to the timestamp, because perhaps you have some earlier data whose timestamps could be less than 13 digits wide.
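If the stricter rules from the question are wanted (exactly 13 digits, item name made of letters, digits and underscores), a precompiled Pattern with capture groups is one option. This is only a sketch: the class and method names are invented here, and \w is used as shorthand for [A-Za-z0-9_].

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TestNameParser {

    // Group 1 is the 13-digit timestamp, group 2 is the item name.
    private static final Pattern TEST_NAME = Pattern.compile("__test_(\\d{13})__(\\w+)");

    // matches() requires the whole string to match, so no ^ or $ is needed.
    static boolean isValid(String name) {
        return TEST_NAME.matcher(name).matches();
    }

    // Hypothetical helper: returns the item name when the string is valid.
    static Optional<String> itemName(String name) {
        Matcher m = TEST_NAME.matcher(name);
        return m.matches() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(isValid("__test_1349333576093__cellphone_modelc1"));
        System.out.println(itemName("__test_1349333576090__macbook_model_12").orElse("(no match)"));
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which String#matches does internally.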

Java - Why does string split for empty string give me a non empty array?

I want to split a String by a space. When I use an empty string, I expect to get an array of zero strings. Instead, I get an array with only empty string. Why ?
public static void main(String [] args){
    String x = "";
    String [] xs = x.split(" ");
    System.out.println("strings :" + xs.length);//prints 1 instead of 0.
}
The single element of the returned array is in fact the empty string. This makes sense: the pattern " " matches nothing in the input, so you just get back the input with which you started. As a general rule, when the pattern does not match anywhere, split returns a one-element array containing the original string.
An interesting puzzle indeed:
> "".split(" ")
String[1] { "" }
> " ".split(" ")
String[0] { }
The question is, when you split the empty string, why does the result contain the empty string, and when you split a space, why does the result not contain anything? It seems inconsistent, but all is explained in the documentation.
The String.split(String) method "works as if by invoking the two-argument split method with the given expression and a limit argument of zero", so let's read the docs for String.split(String, int). The case of the empty string is answered by this part:
If the expression does not match any part of the input then the resulting array has just one element, namely this string.
The empty string has no part matching a space, so the output is an array containing one element, the input string, exactly as the docs say should happen.
The case of the string " " is answered by this part:
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
The whole input string " " matches the splitting pattern, which leaves an empty string on either side of the match. Because the limit parameter n is 0, trailing empty strings are discarded, and since every element is empty, both are removed. Hence the resulting array is empty.
Note that the related rule, "A zero-width match at the beginning however never produces such empty leading substring", does not apply here, because the match is one character wide: " a".split(" ") still returns an array whose first element is the empty string.
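The effect of the limit argument is easiest to see by comparing array lengths. A small sketch, with a made-up helper named parts; lengths are printed rather than the arrays themselves, because an array holding one empty string and an empty array look the same when printed with Arrays.toString.

```java
final class SplitLimitDemo {

    // Hypothetical helper: number of elements produced by splitting on a single space.
    static int parts(String input, int limit) {
        return input.split(" ", limit).length;
    }

    public static void main(String[] args) {
        System.out.println(parts("", 0));   // 1: no match, the input itself is returned
        System.out.println(parts(" ", 0));  // 0: both empty strings are trailing and get discarded
        System.out.println(parts(" ", -1)); // 2: a negative limit keeps trailing empty strings
        System.out.println(parts(" a", 0)); // 2: the leading empty string is kept
        System.out.println(parts("a ", 0)); // 1: the trailing empty string is discarded
    }
}
```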
It appears that since the String exists and cannot be split (there are no spaces), split simply places the entire String into the first array position, so the array has one element. If you were to instead try
String x = " ";
String [] xs = x.split(" ");
System.out.println("strings :" + xs.length);//prints 0
it will give you the zero you are expecting.
See also: Java String split removed empty values

Complex File content search in java

I have a file whose first line is 'HREC ZZ INCOK4 ZZ BEOINDIANEX ICES1_5P CHCAE02 71484 20131104 1230'. I need to reach the 8th word, which could be CHCAE02 or CHCAI02 (words are separated by spaces), and apply some logic checks to it. How can I achieve this with Java? The full file content is shown below.
HREC ZZ INCOK4 ZZ BEOINDIANEX ICES1_5P CHCAE0271484201311041230
INCOK4104112013CHA Not Registered;IEC Not Registered;Invalid Bank Code;Authorised Dealer Code of IEC Not Found;Country of Destination can not be India;Wrong Port of destination:INCOK4;Wrong Port of destination:INCOK4;Wrong RITC Code For Inv./Item No:1/1;
TREC71484
There are several ways to fetch the 8th column.
Using String.split(String regex):
String word = row.split("\\W+")[7];
If the column must match a certain pattern, such as a fixed number of digits, check it against a regex:
String regex = "[0-9]{5}"; // matches exactly five digits
Try String.split(regex)
String words[] = line.split(" ");
String eighthWord = words[7];
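Indexing the array directly throws ArrayIndexOutOfBoundsException when a line has fewer words than expected, so a bounds check is worth adding. The following is a sketch with a hypothetical wordAt helper that takes a 1-based position. Note that in the first line as quoted in the question, CHCAE02 is the 7th whitespace-separated word and 71484 is the 8th, so the position may need adjusting for the real file.

```java
import java.util.Optional;

final class WordPicker {

    // Hypothetical helper: returns the word at the given 1-based position,
    // or an empty Optional when the line has fewer words.
    static Optional<String> wordAt(String line, int position) {
        String[] words = line.trim().split("\\s+"); // one or more whitespace characters
        if (position < 1 || position > words.length || words[position - 1].isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(words[position - 1]);
    }

    public static void main(String[] args) {
        String line = "HREC ZZ INCOK4 ZZ BEOINDIANEX ICES1_5P CHCAE02 71484 20131104 1230";
        String code = wordAt(line, 7).orElse("");
        if (code.equals("CHCAE02") || code.equals("CHCAI02")) {
            System.out.println("Found code: " + code);
        }
    }
}
```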

How to detect if a string input has more than one consecutive space?

For a class I have to make a morse code program using a binary tree. The user is supposed to enter morse code and the program will decode it and print out the result. The binary tree only holds A-Z. And I only need to read dashes, dots, and spaces. If there is one space that is the end of the letter. If there is 2 or more spaces in a row that is the end of the word.
How do you detect if the string input has consecutive spaces? Right now I have it programmed so that it detects 2 spaces (which will then print out a space), but I don't know how to make it recognize 3+ spaces.
This is how I'm reading the input btw:
String input = showInputDialog( "Enter Code", null);
character = input.charAt(i);
And this is how I have it detecting a space: if (character == ' ').
Can anyone help?
Well, you could split on runs of two or more spaces. If the resulting array has more than one item, that tells you the input had at least one instance of 2+ spaces.
String[] foo = "a b  c   d".split(" {2,}");
This splits into "a b", "c", and "d".
You'd probably need more regex checks than just that, though, if you need to count how many runs of each length there are (e.g. how many runs of 2 spaces, how many of 3 spaces, etc).
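Counting the runs by length could be sketched with a Matcher that walks over every run of two or more spaces. The class and method names below are invented for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpaceRunCounter {

    private static final Pattern SPACE_RUN = Pattern.compile(" {2,}");

    // Hypothetical helper: maps run length -> number of runs of exactly that length.
    static Map<Integer, Integer> countRuns(String input) {
        Map<Integer, Integer> counts = new TreeMap<>();
        Matcher m = SPACE_RUN.matcher(input);
        while (m.find()) {
            counts.merge(m.group().length(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two runs of 2 spaces and one run of 3 spaces; single spaces are ignored.
        System.out.println(countRuns("a b  c   d  e"));
    }
}
```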
Note I have made an assumption that you are retrieving the full morse code message in one go and not one character at a time
Focusing on this point:
"If there is one space that is the end of the letter. If there is 2 or more spaces in a row that is the end of the word."
Personally, I'd use the split() method on the String class. This will split up a String into a String[] and then you can do some checks on the individual Strings in the array. Splitting on a space character like this will give you a couple of behavioural advantages:
Any strings that represent characters will have no trailing or leading spaces on them
Any sequences of multiple spaces will result in empty strings in the returned String[].
For example, calling split(" ") on the string "A B  C" (two spaces between B and C) would give you a String[] containing {"A", "B", "", "C"}
Using this, I would first check if the empty string appeared at all. If this was the case, it implies that there were at least 2 space characters next to each other in the input morse code message. Then you can just ignore any empty strings that occur after the first one and it will cater for any number of sequential empty strings.
Without wanting to complete your assignment for you, here is some sample code:
public String decode(final String morseCode) {
    final StringBuilder decodedMessage = new StringBuilder();
    final String[] splitMorseCode = morseCode.split(" ");
    for (final String morseCharacter : splitMorseCode) {
        if ("".equals(morseCharacter)) {
            /* We now know we had at least 2 spaces in sequence.
             * So we check to see if we already added a space to separate the
             * resulting decoded words. If not, then we add one. */
            if (!decodedMessage.toString().endsWith(" ")) {
                decodedMessage.append(" ");
            }
            continue;
        }
        // Some code that decodes your morse code character.
    }
    return decodedMessage.toString();
}
I also wrote a quick test. In my example I made "--" convert to "M". Splitting the decodedMessage on the space character was a way of counting the individual words that had been decoded.
@Test
public void thatDecoderCanDecodeMultipleWordsSeparatedByMultipleSpaces() {
    final String decodedMessage = this.decoder.decode("-- --  -- --   -- --  -- --   -- --  -- --   -- --");
    assertThat(decodedMessage.split(" ").length, is(7));
    assertThat(decodedMessage, is("MM MM MM MM MM MM MM"));
}
Of course, if this is still not making sense, then reading the APIs always helps
To detect if a String has more than one space:
if (str.matches(".* {2,}.*"))
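This will help:
public class StringTester {
    public static void main(String[] args) {
        String s = "Hello  World"; // two spaces between the words
        int run = 0;     // length of the current run of spaces
        int longest = 0; // longest run of spaces seen so far
        for (char c : s.toCharArray()) {
            run = (c == ' ') ? run + 1 : 0;
            longest = Math.max(longest, run);
        }
        if (longest >= 2)
            System.out.println("I got 2 or more consecutive spaces");
    }
}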

Java: Parsing a string based on delimiter

I have to design an interface where it fetches data from machine and then plots it. I have already designed the fetch part and it fetches a string of format A&B#.13409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$
First five A&B#. characters are the identifier. Please note that the fifth character is new line feed i.e. ASCII 0xA.
The function I have written -
public static boolean checkStart(String str,String startStr){
    String Initials = str.substring(0,5);
    System.out.println("Here is start: " + Initials);
    if (startStr.equals(Initials))
        return true;
    else
        return false;
}
shows Here is start: A&B#. which is correct.
Question 1:
Why do we need to take str.substring(0,5)? When I use str.substring(0,4) it shows only Here is start: A&B#, i.e. the new line feed is missing. Why does the new line feed make this difference?
Further, to extract the remaining string I have to use s.substring(5,s.length()) instead of s.substring(6,s.length())
i.e.
s.substring(6,s.length()) produces 3409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$ i.e. it is missing the first char after the identifier A&B#.
Question 2:
My parsing function is:
public static String[] StringParser(String str,String del){
    String[] sParsed = str.split(del);
    for (int i=0; i<sParsed.length; i++) {
        System.out.println(sParsed[i]);
    }
    return sParsed;
}
It parses correctly for String String s = "A&B#.13409/13400/13400/13386/13418/13427/13406/13383/13406/13412/13419/00000/00000/"; and calling the function as String[] tokens = StringParser(rightChannelString,"/");
But for String such as String s = "A&B#.13409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$" , the call String[] tokens = StringParser(rightChannelString,"$"); does not parse the string at all.
I am not able to figure out the reason for this behaviour. Can anyone please let me know the solution?
Thanks
Regarding question 1, the java API says that the substring method takes 2 parameters:
beginIndex the begin index, inclusive.
endIndex the end index, exclusive.
So in your example
String: A&B#.134
Index: 01234567
substring(0,4) = indexes 0 to 3 so A&B#, that's why you have to put 5 as the second parameter to recover your line delimiter.
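The index arithmetic can be checked with a shortened frame. The sketch below assumes, as stated in the question, that the fifth character of the identifier is a line feed. The payloadAfter helper is hypothetical and uses startsWith and the identifier's length instead of hard-coded indexes.

```java
import java.util.Optional;

final class FrameHeaderDemo {

    // Hypothetical helper: returns the text after the identifier,
    // or an empty Optional when the frame does not start with it.
    static Optional<String> payloadAfter(String frame, String identifier) {
        if (!frame.startsWith(identifier)) {
            return Optional.empty();
        }
        return Optional.of(frame.substring(identifier.length()));
    }

    public static void main(String[] args) {
        String frame = "A&B#\n13409$13400$"; // shortened sample; index 4 is the line feed
        System.out.println(frame.substring(0, 4).length()); // 4: indexes 0..3, line feed excluded
        System.out.println(frame.substring(0, 5).length()); // 5: indexes 0..4, line feed included
        System.out.println(frame.substring(6));             // starts at index 6, so the '1' at index 5 is lost
        System.out.println(payloadAfter(frame, "A&B#\n").orElse("(bad header)"));
    }
}
```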
Regarding question 2, the split method takes a regular expression as its parameter, and $ is a special character in regular expressions. To match a literal dollar sign you have to escape it with the \ character, and since \ is itself a special character in Java string literals, it must be escaped as well:
String[] tokens = StringParser(rightChannelString,"\\$");
Q1: review the description of substring in the documentation:
Returns a new string that is a substring of this string.
The substring begins at the specified beginIndex and extends to the
character at index endIndex - 1. Thus the length of the substring
is endIndex-beginIndex.
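Q2: the split method takes a regular expression for the separator. $ is a special character in regular expressions: by default it matches at the end of the input (or at the end of each line in MULTILINE mode), so it never matches a literal dollar sign unless it is escaped.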
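When the delimiter arrives as a parameter, as it does in StringParser, an alternative to escaping by hand is Pattern.quote, which turns any string into a regex that matches it literally. A minimal sketch; splitLiteral is a made-up name.

```java
import java.util.regex.Pattern;

final class LiteralSplit {

    // Hypothetical helper: splits on the delimiter taken literally, not as a regex.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String data = "13409$13400$00000$";
        System.out.println(data.split("$").length);           // 1: unescaped $ never matches a dollar sign
        System.out.println(splitLiteral(data, "$").length);   // 3: split on the literal dollar sign
        System.out.println(splitLiteral("1/2/3", "/").length); // 3: ordinary delimiters still work
    }
}
```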
