Complex File content search in java

I have a file whose first line is 'HREC ZZ INCOK4 ZZ BEOINDIANEX ICES1_5P CHCAE02 71484 20131104 1230'. I need to reach the 8th word, which can be CHCAE02 or CHCAI02 (words are separated by spaces), and run some checks on it. How can I achieve this in Java? Please help, it is urgent. The full file content is shown below.
HREC ZZ INCOK4 ZZ BEOINDIANEX ICES1_5P CHCAE0271484201311041230
INCOK4104112013CHA Not Registered;IEC Not Registered;Invalid Bank Code;Authorised Dealer Code of IEC Not Found;Country of Destination can not be India;Wrong Port of destination:INCOK4;Wrong Port of destination:INCOK4;Wrong RITC Code For Inv./Item No:1/1;
TREC71484

There are several ways to fetch the 8th column.
Using String.split(String regex):
String word = row.split("\\W+")[7];
If the column must match a certain pattern, such as digits only with a fixed count, check it against a regex:
String regex = "[0-9]{5}"; // matches exactly five digits, each between 0 and 9

Try String.split(regex):
String[] words = line.split(" ");
String eighthWord = words[7];
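
The two answers above can be combined into a small helper. This is only a sketch under some assumptions: the class and method names (`HeaderCheck`, `nthWord`, `hasExpectedCode`) are made up for illustration, and the two sample lines are copied from the question. Note that in the spaced sample line CHCAE02 is actually the 7th space-separated word (index 6), while index 7 yields 71484, and in the full file content the code is fused with the digits that follow it. The sketch therefore takes the word position as a parameter and checks the prefix rather than the whole word. In a real program the line would come from the file, for example via BufferedReader.readLine().

```java
class HeaderCheck {

    // Returns the n-th (1-based) whitespace-separated word, or null if the line has fewer words.
    static String nthWord(String line, int n) {
        String[] words = line.trim().split("\\s+");
        return (n >= 1 && n <= words.length) ? words[n - 1] : null;
    }

    // True if the word begins with one of the two codes named in the question.
    static boolean hasExpectedCode(String word) {
        return word != null && (word.startsWith("CHCAE02") || word.startsWith("CHCAI02"));
    }

    public static void main(String[] args) {
        // Both header variants that appear in the question.
        String spaced = "HREC ZZ INCOK4 ZZ BEOINDIANEX ICES1_5P CHCAE02 71484 20131104 1230";
        String fused = "HREC ZZ INCOK4 ZZ BEOINDIANEX ICES1_5P CHCAE0271484201311041230";
        for (String line : new String[] {spaced, fused}) {
            String word = nthWord(line, 7); // 7th word, see the note above
            System.out.println(word + " -> " + hasExpectedCode(word));
        }
    }
}
```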

Related

Regex for epoch time in millisecond using java

I have this string:
String str = "8240d66c-4771-4fae-9631-8a420f9099ca,458,cross_vendor_certificate_will_expire,1565102543758";
I would like to remove the epoch time from the string using a regex. I've searched the web but didn't find a suitable solution.
This is the code I have so far:
public void createHashMapWithAlertCSVContent() throws Exception {
for(String item: lstServer) { //lstServer is a list contains names of the CSV files
String[] contentCSVStr= CmdHelper.Remote.File.cat(SERVER,INDENI_INSIGHT_PATH + "/"+item).split("\n");//Function to get CSV contents
mapServer.put(FileUtil.removeExtension(item), contentCSVStr);//Finally I add each String[] to hashmap key is the csv file name and String[] is the content of each CSV file
}
Assert.assertEquals(mapServer.size(), lstServer.size());
mapServer.remove("job");
}
example of possible content:
1. 0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,1566345600000,5,1565102549213
2. 8240d66c-4771-4fae-9631-8a420f9099ca,0,1565102673040
3. 8240d66c-4771-4fae-9631-8a420f9099ca,0.0.0.develop,4418,1565102673009
EDIT:
The epoch time may be at any position in the string and may occur more than once.
The length of the epoch time string is always greater than 10.

String input = "0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,1566345600000,5,1565102549213";
String output = input.replaceAll("\\d{10,},|,\\d{10,}", "");
System.out.println(output);
Output:
0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,5
The vertical bar | in the regular expression denotes two options, one with a number and a comma, the other with the comma before the number. This takes into account that the timestamp may be first or last in the string or somewhere between.
\\d denotes a digit and {10,} means at least 10 of them with no upper limit. Decide for yourself whether the lower limit should be 10, 13 or some other number of digits.
Corner case: if the string consists of only one or more timestamps, the above replacement will not remove the last one of them since it insists on removing one comma with each timestamp, and a string consisting of only one timestamp will not have a comma in it.
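
One way around that corner case is to skip replaceAll and filter the comma-separated fields instead. The following is a sketch, not the answer's method: the class and method names (`EpochStripper`, `stripEpochFields`) are invented for illustration, and it assumes that a field counts as a timestamp only if the whole field is 10 or more digits.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class EpochStripper {

    // Drops every comma-separated field that consists solely of 10 or more digits.
    static String stripEpochFields(String csvLine) {
        return Arrays.stream(csvLine.split(",", -1)) // limit -1 keeps trailing empty fields
                .filter(field -> !field.matches("\\d{10,}"))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String input = "0,TRIAL,8240d66c-4771-4fae-9631-8a420f9099ca,1566345600000,5,1565102549213";
        System.out.println(stripEpochFields(input));
        // A line that is nothing but a timestamp becomes the empty string.
        System.out.println("[" + stripEpochFields("1565102673040") + "]");
    }
}
```

Because String.matches has to match the entire field, a long digit run inside a larger field is left untouched, which differs from the replaceAll version.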

String format into specific pattern

Is there a clean and flexible way to format String data into a specific pattern? For example:
data input -> 0123456789
data output <- 012345/678/9
I did it by cutting the String into multiple parts, but I'm looking for a more suitable way.

Assuming you want to group the last character and the three characters before it:
String formatted = str.replaceAll("(...)(.)$", "/$1/$2");
This captures the parts you want in groups and replaces them with intervening slashes.

You can use replaceAll with a regex that matches multiple groups, like so:
String text = "0123456789";
text = text.replaceAll("(\\d{6})(\\d{3})(.*)", "$1/$2/$3");
System.out.println(text);
Output
012345/678/9
Details:
(\d{6}) : group one matches 6 digits
(\d{3}) : group two matches 3 digits
(.*) : group three matches the rest of the string
$1/$2/$3 : replaces the match with group 1, a /, group 2, a / and group 3

You can use StringBuilder's insert to insert characters at a certain index:
String input = "0123456789";
String output = new StringBuilder(input)
.insert(6, "/")
.insert(10, "/")
.toString();
System.out.println(output); // 012345/678/9
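
If the group sizes vary, the StringBuilder idea can be wrapped in a small helper. This is a sketch only: `GroupFormatter` and `group` are hypothetical names, and the choice to stop adding separators once the input runs out (so no trailing separator appears) is an assumption about the desired behaviour.

```java
class GroupFormatter {

    // Splits input into leading groups of the given lengths, joined by the separator;
    // whatever is left over forms the final group.
    static String group(String input, String separator, int... groupLengths) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (int len : groupLengths) {
            if (pos + len >= input.length()) {
                break; // nothing would remain after this group, so stop here
            }
            out.append(input, pos, pos + len).append(separator);
            pos += len;
        }
        return out.append(input.substring(pos)).toString();
    }

    public static void main(String[] args) {
        System.out.println(group("0123456789", "/", 6, 3)); // 012345/678/9
    }
}
```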

Regex Help for a Command

I need to parse this command:
direct print conference <style>:<First Name Last Name>,<First Name Last Name>,<First Name Last Name>,<title>,<conference series name>,<location>,<year>
An example command would be:
direct print conference ieee:Sergey Brin,Lawrence Page,,The Anatomy of a Large-Scale Hypertextual Web Search Engine,WWW,Brisbane Australia,1998
My main problem is that a (First Name Last Name) field can be empty. How do I handle that with a regex?
For a (First Name Last Name) field I always use ([a-zA-Z]+) ([a-zA-Z]+), but how do I define places that may be empty?
direct print conference ([a-zA-Z]+):([a-zA-Z]+) ([a-zA-Z]+),([a-zA-Z]+) ([a-zA-Z]+),([a-zA-Z]+) ([a-zA-Z]+),([^,]+),([a-zA-Z]+),([a-zA-Z ]+),([0-9]+)
That is my regex for the case where no name is empty. How can I allow empty fields in it, like:
([a-zA-Z]+) ([a-zA-Z]+) OR EMPTY ?
I hope you can help me.

Since your input is basically a CSV line with a special prefix, you don't need a regex here, just String.split():
String input = "direct print conference ieee:Sergey Brin,Lawrence Page,,The Anatomy of a Large-Scale Hypertextual Web Search Engine,WWW,Brisbane Australia,1998";
String[] parts = input.split(":");
String[] values = parts[1].split(",");
for(int i=0; i<values.length; i++) {
System.out.println(values[i]);
}
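
If a regex is still preferred, the "OR EMPTY" part of the question can be expressed by wrapping each name in an optional non-capturing group, (?: ... )?. The sketch below reuses the character classes from the question's own regex; the class name `ConferenceCommand` and the constant names are assumptions for illustration. When a name is empty, its two capturing groups return null.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConferenceCommand {

    // One optional "First Last" pair: the whole pair may be missing.
    private static final String NAME = "(?:([a-zA-Z]+) ([a-zA-Z]+))?";

    static final Pattern COMMAND = Pattern.compile(
            "direct print conference ([a-zA-Z]+):"
            + NAME + "," + NAME + "," + NAME + ","
            + "([^,]+),([a-zA-Z]+),([a-zA-Z ]+),([0-9]+)");

    public static void main(String[] args) {
        String input = "direct print conference ieee:Sergey Brin,Lawrence Page,,"
                + "The Anatomy of a Large-Scale Hypertextual Web Search Engine,WWW,Brisbane Australia,1998";
        Matcher m = COMMAND.matcher(input);
        if (m.matches()) {
            System.out.println("style:         " + m.group(1));
            System.out.println("first author:  " + m.group(2) + " " + m.group(3));
            System.out.println("third author:  " + m.group(6)); // null, the field is empty
            System.out.println("title:         " + m.group(8));
            System.out.println("year:          " + m.group(11));
        }
    }
}
```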

splits strings in java in different manner

I am very new to Java. I want to split a string in the following manner.
Suppose I am given an input string like sample 1: jayesh =10 vyas =13 harshit=10; and so on.
sample 2: harsh=2, vyas=5;
Now I want to store jayesh, vyas, harshit from sample 1 and harsh, vyas from sample 2 (all the strings that come just before the assignment operator) in a String or char array.
Can anyone please tell me how to do this in Java? I know about the split method, but in this case there are multiple strings I have to store.

You can use the regex =\\d+;?
=\\d+;? matches = followed by one or more digits, with an optional ; at the end
String s="jayesh =10 vyas =13 harshit=10;";
String[] ss=s.split("=\\d+;?");
System.out.println(Arrays.toString(ss));
output (the spaces around the names are kept in the elements)
[jayesh ,  vyas ,  harshit]
To extend it further you can use \\s*=\\d+[,;]?\\s*
\\s* : matches zero or more whitespace characters
[,;]? : optionally matches one of the characters in the list
If you also want to drop any special characters after the digits, use
\\s*=\\d+\\W* :
\\s*= : matches zero or more whitespace characters and the = character
\\d+ : matches one or more digits
\\W* : matches zero or more non-word characters (anything other than a-zA-Z0-9_)
String s="harsh=2, vyas=5; vyas=5";
String s2 ="jayesh =10 vyas=13 harshit=10;";
String regex="\\s*=\\d+\\W*";
String[] ss=s.split(regex);
String[] ss2=s2.split(regex);
System.out.println(Arrays.toString(ss));
System.out.println(Arrays.toString(ss2));
output
[harsh, vyas, vyas]
[jayesh, vyas, harshit]
Note: the space after each , is added by Arrays.toString for formatting; the elements of ss and ss2 contain no spaces.
For a HashSet use
Set<String> mySet = new HashSet<String>(Arrays.asList(ss));
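
An alternative to splitting is to match the names directly: capture every run of word characters that is followed by optional whitespace and an = sign. This is a sketch of that idea rather than part of the answer above; `AssignmentNames` and `names` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AssignmentNames {

    // A word, optional whitespace, then '='; group 1 is the word.
    private static final Pattern NAME_BEFORE_EQUALS = Pattern.compile("(\\w+)\\s*=");

    static List<String> names(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = NAME_BEFORE_EQUALS.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(names("jayesh =10 vyas =13 harshit=10;")); // [jayesh, vyas, harshit]
        System.out.println(names("harsh=2, vyas=5;"));                // [harsh, vyas]
    }
}
```

Since only the captured word is stored, no trimming is needed afterwards, and the list can be passed to new HashSet<>(...) if duplicates should be dropped.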

You can use replaceAll() to get the expected result:
String stringToSearch = "jayesh =10 vyas =13 harshit=10;";
stringToSearch = stringToSearch.replaceAll("=\\d+;?","");
System.out.println(stringToSearch);
output (the original spaces on both sides of each removed part remain):
jayesh  vyas  harshit

How to use Substring when String length is not fixed everytime

I have a string like this:
SKU: XP321654
Quantity: 1
Order date: 01/08/2016
The SKU length is not fixed, so my function sometimes also returns the first one or two characters of Quantity, which I do not want. I want to get only the SKU value.
My code:
int index = Content.indexOf("SKU:");
String SKU = Content.substring(index, index+15);
If the SKU has one or two more digits, it is cut off because I have specified a limit of 15. If I use index + 16 to get a long SKU, then for a short SKU it also returns some characters of Quantity.
How can I solve this? Is there any way to avoid using a fixed character length as the limit?
The last character of my SKU will always be a digit, so is there anything else I can use to get only the SKU up to its last digit?

Using .substring is simply not the way to process such things. What you need is a regex (or regular expression):
// requires java.util.regex.Pattern and java.util.regex.Matcher
Pattern pat = Pattern.compile("SKU\\s*:\\s*(\\S+)");
String sku = null;
Matcher matcher = pat.matcher(Content);
if(matcher.find()) { //we've found a match
sku = matcher.group(1);
}
//do something with sku
Unescaped, the regex looks like this:
SKU\s*:\s*(\S+)
You are thus looking for a pattern that starts with SKU, followed by zero or more \s (whitespace characters such as space and tab), followed by a colon (:), then zero or more whitespace characters (\s), and finally the part you are interested in: one or more (that's the meaning of +) non-whitespace characters (\S). Putting these in parentheses makes them a capturing group. If the regex succeeds in finding the pattern (matcher.find()), you can extract the content of the capturing group with matcher.group(1) and store it in a string.
You can potentially improve the regex further if you know more about what a SKU looks like. For instance, if it consists only of uppercase letters and digits, you can replace \S with [0-9A-Z], so the pattern becomes:
Pattern pat = Pattern.compile("SKU\\s*:\\s*([0-9A-Z]+)");
EDIT: for the quantity data, you could use:
Pattern pat2 = Pattern.compile("Quantity\\s*:\\s*(\\d+)");
int qt = -1;
Matcher matcher2 = pat2.matcher(Content); // separate name so it can coexist with the SKU matcher
if(matcher2.find()) { //we've found a match
qt = Integer.parseInt(matcher2.group(1));
}

You know you can just refer to the length of the string, right?
String s = "SKU: XP321654";
String sku = s.substring(4, s.length()).trim();
I think using a regex is clearly overkill in this case; it is much simpler than that. You can even split the string, although it's a bit less efficient than the solution above, but please don't use a regex for this!
String sku = "SKU: XP321654".split(":")[1].trim();

1: Split your input into lines (split by \n).
2: For the line you want, search for the : and take the remainder of the line (using the string length, as mentioned in Dici's answer).

Depending on how exactly the string contains new lines, you could do this:
public static void main(String[] args) {
String s = "SKU: XP321654\r\n" +
"Quantity: 1\r\n" +
"Order date: 01/08/2016";
System.out.println(s.substring(s.indexOf(": ") + 2, s.indexOf("\r\n")));
}
Note that this one-liner has several restrictions:
The SKU property has to be first. If it is not, modify the start index to search for "SKU: " instead.
The lines might be separated by something other than \r\n; \R is a regex construct that matches any line break sequence.
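
Both restrictions can be lifted by splitting on \R first and then looking for the line with the wanted key. The following is a sketch under assumptions: `KeyValueLines` and `valueOf` are invented names, the content is assumed to consist of "key: value" lines as in the question, and \R requires Java 8 or later.

```java
class KeyValueLines {

    // Returns the trimmed text after "key:" on the first line whose key matches, or null if none does.
    static String valueOf(String content, String key) {
        for (String line : content.split("\\R")) { // \R matches \r\n, \n, \r and other line breaks
            int colon = line.indexOf(':');
            if (colon >= 0 && line.substring(0, colon).trim().equals(key)) {
                return line.substring(colon + 1).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String s = "SKU: XP321654\r\n" +
                   "Quantity: 1\r\n" +
                   "Order date: 01/08/2016";
        System.out.println(valueOf(s, "SKU"));      // XP321654
        System.out.println(valueOf(s, "Quantity")); // 1
    }
}
```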
