Getting substring of a string that has a repeating character Java - java

I'm writing a parser that extracts the tag and value from each line it reads from a file, and I want to know how to get the value. So in this case I want to get
key = "accountName" and
value = "fname LName" and have it repeat with each line.
<accountName>fname LName</accountName>
<accountNumber>12345678912</accountNumber>
<accountOpenedDate>20200218</accountOpenedDate>
This is my code; it runs inside a while loop that reads each line using a BufferedReader. I managed to get the key properly, but when I try to get the value, I get "String index out of range: -12". I'm not sure how to get the value between the two angle brackets > <.
String line;
if(line.startsWith("<")){
key = line.substring(line.indexOf("<")+1, line.indexOf(">"));
value = line.substring(line.indexOf(">")+1, line.indexOf("<")+1);
}

Using an XML parser is recommended, and a regular expression is also a good way to process each line.
If you still want to process the string manually with substring, here is an example:
private static void readKeyValue(String line) {
String key = null;
String value = null;
if (null != line && line.startsWith("<") && line.contains("</")) {
key = line.substring(line.indexOf("</")+ 2 , line.lastIndexOf(">"));
value = line.substring(line.indexOf(">") + 1, line.indexOf("</"));
}
System.out.println("key: "+ key);
System.out.println("value: "+ value);
}
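The exception in the question comes from `line.indexOf("<")` always returning 0, the position of the first `<`, so the end index passed to `substring` is smaller than the start index. One way around it, sketched below, is to begin the search for the closing tag after the first `>` by using the two-argument `indexOf(String, int)`. The class and method names are made up for illustration, and the sketch assumes exactly one `<key>value</key>` pair per line.

```java
public class TagLineParser {

    // Hypothetical helper: returns {key, value} for a line shaped like
    // <key>value</key>, or null when the line does not have that shape.
    static String[] parse(String line) {
        if (line == null || !line.startsWith("<")) {
            return null;
        }
        int endOfOpenTag = line.indexOf('>');
        if (endOfOpenTag < 0) {
            return null;
        }
        // Search for the closing tag only after the opening tag has ended,
        // so the '<' at index 0 is not found again.
        int startOfCloseTag = line.indexOf("</", endOfOpenTag + 1);
        if (startOfCloseTag < 0) {
            return null;
        }
        String key = line.substring(1, endOfOpenTag);
        String value = line.substring(endOfOpenTag + 1, startOfCloseTag);
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        String[] kv = parse("<accountName>fname LName</accountName>");
        System.out.println("key: " + kv[0]);
        System.out.println("value: " + kv[1]);
    }
}
```

Returning null for lines that do not fit lets the calling loop skip them instead of failing on a bad index.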

You can use regular expressions to extract, assuming the line variable is a string read from each line.
String pattern = "<([a-zA-Z]+.*?)>([\\s\\S]*?)</[a-zA-Z]*?>";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
// find
if (m.find()) {
String key = m.group(1);
String value = m.group(2);
System.out.println("Key: " + key);
System.out.println("Value: " + value);
} else {
System.out.println("Invalid");
}
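A possible variation on the regex above, offered as a sketch rather than a replacement: a backreference (`\\1`) makes the closing tag repeat the name captured from the opening tag, so a line with mismatched tags is rejected. The class name, the `extract` helper and the `key=value` return format are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagRegexDemo {

    // \\1 is a backreference: the closing tag must repeat the name captured by group 1.
    private static final Pattern TAG_LINE = Pattern.compile("<(\\w+)>(.*?)</\\1>");

    // Hypothetical helper: returns "key=value", or null when the line does not match.
    static String extract(String line) {
        Matcher m = TAG_LINE.matcher(line);
        return m.find() ? m.group(1) + "=" + m.group(2) : null;
    }

    public static void main(String[] args) {
        // Matching tags: prints accountNumber=12345678912
        System.out.println(extract("<accountNumber>12345678912</accountNumber>"));
        // Mismatched closing tag: prints null
        System.out.println(extract("<accountNumber>12345678912</accountName>"));
    }
}
```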

Related

Length of String within tags in java

We need to find the length of the tag names within the tags in java
{Student}{Subject}{Marks}100{/Marks}{/Subject}{/Student}
so the length of Student tag is 7 and that of subject tag is 7 and that of marks is 5.
I am trying to split the tags and then find the length of each string within the tag.
But the code I am trying gives me only the first tag name and not others.
Can you please help me on this?
I am very new to java. Please let me know if this is a very silly question.
Code part:
System.out.println(
getParenthesesContent("{Student}{Subject}{Marks}100{/Marks}{/Subject}{/Student}"));
public static String getParenthesesContent(String str) {
return str.substring(str.indexOf('{')+1,str.indexOf('}'));
}
You can use Patterns with this regex \\{([a-zA-Z]*)\\} :
String text = "{Student}{Subject}{Marks}100{/Marks}{/Subject}{/Student}";
Matcher matcher = Pattern.compile("\\{([a-zA-Z]*)\\}").matcher(text);
while (matcher.find()) {
System.out.println(
String.format(
"tag name = %s, Length = %d ",
matcher.group(1),
matcher.group(1).length()
)
);
}
Outputs
tag name = Student, Length = 7
tag name = Subject, Length = 7
tag name = Marks, Length = 5
You might want to give a try to another regex:
String s = "{Abc}{Defg}100{Hij}100{/Klmopr}{/Stuvw}"; // just a sample String
Pattern p = Pattern.compile("\\{\\W*(\\w++)\\W*\\}");
Matcher m = p.matcher(s);
while(m.find()) {
System.out.println(m.group(1) + ", length: " + m.group(1).length());
}
Output you get:
Abc, length: 3
Defg, length: 4
Hij, length: 3
Klmopr, length: 6
Stuvw, length: 5
If you need to use charAt() to walk over the input String, you might want to consider using something like this (I made some explanations in the comments to the code):
String s = "{Student}{Subject}{Marks}100{/Marks}{/Subject}{/Student}";
ArrayList<String> tags = new ArrayList<>();
for(int i = 0; i < s.length(); i++) {
StringBuilder sb = new StringBuilder(); // Collects the letters of the current tag name (more efficient than "+=" on a String)
if(s.charAt(i) == '{') { // If start of tag is found...
while(i < s.length() && !(Character.isLetter(s.charAt(i)))) { // Skip characters that are not letters, without running past the end
i++;
}
while(i < s.length() && Character.isLetter(s.charAt(i))) { // Append the letters that are found
sb.append(s.charAt(i));
i++;
}
if(!(tags.contains(sb.toString()))) { // Add final String to ArrayList only if it not contained here yet
tags.add(sb.toString());
}
}
}
for(String tag : tags) { // Printing Strings contained in ArrayList and their length
System.out.println(tag + ", length: " + tag.length());
}
Output you get:
Student, length: 7
Subject, length: 7
Marks, length: 5
Yes, use a regular expression: find the pattern and apply it.
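As a rough sketch of that advice, the two regex answers above can be combined: an optional `/` lets one pattern match both opening and closing tags, and a `LinkedHashMap` keeps each distinct tag name once, in order of first appearance. The class and method names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagLengths {

    // Hypothetical helper: maps each distinct tag name to its length,
    // in order of first appearance.
    static Map<String, Integer> tagLengths(String text) {
        Map<String, Integer> lengths = new LinkedHashMap<>();
        // "/?" lets the same pattern match opening and closing tags.
        Matcher m = Pattern.compile("\\{/?([a-zA-Z]+)\\}").matcher(text);
        while (m.find()) {
            lengths.putIfAbsent(m.group(1), m.group(1).length());
        }
        return lengths;
    }

    public static void main(String[] args) {
        // prints {Student=7, Subject=7, Marks=5}
        System.out.println(tagLengths("{Student}{Subject}{Marks}100{/Marks}{/Subject}{/Student}"));
    }
}
```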

How to get exact match keyword from the given string using java?

I'm trying to match the exact keyword AdvanceJava against the given inputText string, but both the if and the else branch get executed. I want only the AdvanceJava keyword to be matched.
String inputText = ("iwanttoknowrelatedtoAdvancejava").toLowerCase().replaceAll("\\s", "");
String match = "java";
List keywordsList = new ArrayList<>();//where keywordsList{advance,core,programming} -> keywordlist fetch
// from database
Enumeration e = Collections.enumeration(keywordsList);
int size = keywordsList.size();
while (e.hasMoreElements()) {
for (int i = 0; i < size; i++) {
String s1 = (String) keywordsList.get(i);
if (inputText.contains(s1) && inputText.contains(match)) {
System.out.println("Yes we providing " + s1);
} else if (!inputText.contains(s1) && inputText.contains(match)) {
System.out.println("Yes we are working on java");
}
}
break;
}
Thanks
You can simply do this by using the Pattern and Matcher classes:
Pattern p = Pattern.compile("java");
Matcher m = p.matcher("Print this");
m.find();
If you want to find multiple matches in a line, you can call find() and group() repeatedly to extract them all.
Here's how you can achieve what you seek using pattern matching.
In the first example I have taken your input text as it is. This only improves your algorithm which has O(n^2) performance.
String inputText = ("iwanttoknowrelatedtoAdvancejava").toLowerCase().replaceAll("\\s", "");
String match = "java";
List<String> keywordsList = Arrays.asList("advance", "core", "programming");
for (String keyword : keywordsList) {
Pattern p = Pattern.compile(keyword.concat(match));
Matcher m = p.matcher(inputText);
//System.out.println(m.find());
if (m.find()) {
System.out.println("Yes we are providing " + keyword.concat(match));
}
}
But we can improve on this. Here's a more generic version of the above implementation. This code doesn't manipulate the input text before matching; instead, it uses a more generic regular expression that allows whitespace between the words and matches case-insensitively.
String inputText = "i want to know related to Advance java";
String match = "java";
List<String> keywordsList = Arrays.asList("advance", "core", "programming");
for (String keyword : keywordsList) {
Pattern p = Pattern.compile(MessageFormat.format("(?i)({0}\\s*{1})", keyword, match));
Pattern p1 = Pattern.compile(MessageFormat.format("(?i)({0})", match));
Matcher m = p.matcher(inputText);
Matcher m1 = p1.matcher(inputText);
//System.out.println(m.find());
if(m.find()) {
System.out.println("Yes we are providing " + keyword.concat(match));
} else if(m1.find()) {
System.out.println("Yes we are working with " + match);
}
}
@sithum - Thanks, but it executes both branches of the if/else in the output. Please refer to the screenshot which I attached here.
I applied the following logic and it works fine. Please refer to it. Thanks.
String inputText = ("iwanttoknowrelatedtoAdvancejava").toLowerCase().replaceAll("\\s", "");
String match = "java";
List<String> keywordsList = session.createSQLQuery("SELECT QUESTIONARIES_RAISED FROM QUERIES").list(); // Fetch values from database (advance,core,programming)
String uniqueKeyword=null;
String commonKeyword= null;
int size =keywordsList.size();
for(int i=0;i<size;i++){
String s1 = (String) keywordsList.get(i);//get values one by one from list
if(inputText.contains(match)){
if(inputText.contains(s1) && inputText.contains(match)){
Queries q1 = new Queries();
q1.setQuestionariesRaised(s1); //set matched keyword to getter setter method
keywordsList1=session.createQuery("from Queries sentence where questionariesRaised='"+q1.getQuestionariesRaised()+"'").list(); // based on matched keyword fetch according to matched keyword sentence which stored in database
for(Queries ob : keywordsList1){
uniqueKeyword= ob.getSentence().toString();// Store fetched sentence to on string variable
}
break;
}else {
commonKeyword= "java only";
}
}
}
if(uniqueKeyword!= null){
System.out.println("Yes we providing......................" + uniqueKeyword);
}else if(commonKeyword!= null){
System.out.println("Yes we providing " + commonKeyword);
}else{
}
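The "both branches print" symptom in this thread comes from making the decision inside the loop, which produces one line per keyword. A minimal sketch of the alternative, assuming the keywords are already loaded into a list (the database lookup is left out), returns as soon as a keyword matches and falls back to the generic reply only once, after the loop. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class KeywordMatcher {

    // Hypothetical helper: returns the reply for the first keyword that directly
    // precedes `match`, a generic reply if only `match` occurs, or null otherwise.
    static String reply(String inputText, List<String> keywords, String match) {
        for (String keyword : keywords) {
            // Pattern.quote treats the keyword and match as literal text.
            Pattern p = Pattern.compile(
                    Pattern.quote(keyword) + "\\s*" + Pattern.quote(match),
                    Pattern.CASE_INSENSITIVE);
            if (p.matcher(inputText).find()) {
                return "Yes we are providing " + keyword + match;
            }
        }
        // Reached only when no keyword matched, so the fallback is produced at most once.
        Pattern fallback = Pattern.compile(Pattern.quote(match), Pattern.CASE_INSENSITIVE);
        if (fallback.matcher(inputText).find()) {
            return "Yes we are working with " + match;
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("advance", "core", "programming");
        System.out.println(reply("i want to know related to Advance java", keywords, "java"));
        System.out.println(reply("tell me about java", keywords, "java"));
    }
}
```

With the sample keyword list, the first call reports `advancejava` and the second falls back to the generic `java` reply.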

Regex: capture group in list like string

I've searched Stack Overflow and the net and found similar questions, but none that gave me a concrete answer. I have a string that acts as a list with the following formatting:
Key(Value)/Key(value)/Key(value,value). I would like to match them by key name IF the key exists, and I don't want the parentheses included anywhere, just the key and the value. I coded something out, but it's a real mess...
So my conditions are:
1) extract key-value pairs without parentheses
2) extract IF they are available...
3) if the value portion of the list contains two values delimited by a ",", extract them individually
textToParse = "TdkRoot(0x0)/Tdk(0x2,0x0)/Tdk(0x0,0x1)/VAL(40A8F0B32240,2x4)/SN(0000:0000:0000:0000:0000:0000:0000:0000/IP(000.1.000.1)/Blue(2x4,2x4)"
String patternText = "^TdkRoot\(( [A-Za-z0-9]) Tdk\(( \\w}+) VAL\(( \\w) SN\(( \\w) IP\ (( \\w) Blue\(( \\w)"
Pattern pattern = Pattern.compile( patternText );
Matcher matcher = pattern.matcher(textToParse);
//Extract the groups from the regex (e.g. elements in braces)
String messageId = matcher.group( 1 );
String submitDate = matcher.group(4);
String statusText = matcher.group( 6 );
I think a cleaner/easier approach would be to extract the elements using patterns for each individual key/value. If so, what pattern could I use to tell regex: for "key" grab "value" but leave out the parentheses, and if the value is delimited by a comma, return an array, possibly?
Thanks Community!! Hope to hear from you!
PS I know (?<=\()(.*?)(?=\)) will capture anything in the parentheses, e.g. "(This) value was captured", but how can I modify that to specify a key before the parentheses? For example: "I want to capture what's in THIS(parentheses)" with key THIS,
possibly delimited by a comma.
public static void main(String[] args) {
String textToParse = "TdkRoot(0x0)/Tdk(0x2,0x0)/Tdk(0x0,0x1)/VAL(40A8F0B32240,2x4)/SN(0000:0000:0000:0000:0000:0000:0000:0000)/IP(000.1.000.1)/Blue(2x4,2x4)";
Pattern p = Pattern.compile("(\\w+)\\((.*?)\\)");
Matcher m = p.matcher(textToParse);
while (m.find()) {
System.out.println("key :" + m.group(1));
if (m.group(2).contains(",")) {
String[] s = m.group(2).split(",");
System.out.println("values : " + Arrays.toString(s));
} else {
System.out.println("value :" + m.group(2));
}
}
}
Output:
key :TdkRoot
value :0x0
key :Tdk
values : [0x2, 0x0]
key :Tdk
values : [0x0, 0x1]
key :VAL
values : [40A8F0B32240, 2x4]
key :SN
value :0000:0000:0000:0000:0000:0000:0000:0000
key :IP
value :000.1.000.1
key :Blue
values : [2x4, 2x4]
Not sure if this is what you are looking for (your sample code does not compile) but the following code parses the input text into a map :
String inputText = "TdkRoot(0x0)/Tdk(0x2,0x0)/Tdk(0x0,0x1)/VAL(40A8F0B32240,2x4)/SN(0000:0000:0000:0000:0000:0000:0000:0000)/IP(000.1.000.1)/Blue(2x4,2x4)";
Pattern outerPattern = Pattern.compile("([^/()]+)\\(([^()]+)\\)");
Pattern innerPattern = Pattern.compile("([^,]+)");
Map<String, Collection<String>> parsedData = new HashMap<String, Collection<String>>();
Matcher outerMatcher = outerPattern.matcher(inputText);
while (outerMatcher.find()) {
String key = outerMatcher.group(1);
String val = outerMatcher.group(2);
Collection<String> valueCollection = new ArrayList<String>();
Matcher innerMatcher = innerPattern.matcher(val);
while (innerMatcher.find()) {
valueCollection.add(innerMatcher.group(1));
}
parsedData.put(key, valueCollection);
}
System.out.println(parsedData);
The resulting map (printed on last line) is
{Blue=[2x4, 2x4], VAL=[40A8F0B32240, 2x4], IP=[000.1.000.1], TdkRoot=[0x0], SN=[0000:0000:0000:0000:0000:0000:0000:0000], Tdk=[0x0, 0x1]}
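Note that the map above holds only one entry for `Tdk`: the second `put` overwrites the first, so `[0x2, 0x0]` is lost. If both occurrences matter, one option is to store a list of value lists per key. The sketch below reuses the outer pattern from the answer above; the class name, method name and the choice of `LinkedHashMap` (to keep input order) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyValueListParser {

    // Hypothetical helper: each key maps to one value list per occurrence of that key.
    static Map<String, List<List<String>>> parse(String text) {
        Map<String, List<List<String>>> parsed = new LinkedHashMap<>();
        Matcher m = Pattern.compile("([^/()]+)\\(([^()]+)\\)").matcher(text);
        while (m.find()) {
            List<String> values = Arrays.asList(m.group(2).split(","));
            // computeIfAbsent creates the outer list the first time a key is seen.
            parsed.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(values);
        }
        return parsed;
    }

    public static void main(String[] args) {
        // prints {TdkRoot=[[0x0]], Tdk=[[0x2, 0x0], [0x0, 0x1]], IP=[[000.1.000.1]]}
        System.out.println(parse("TdkRoot(0x0)/Tdk(0x2,0x0)/Tdk(0x0,0x1)/IP(000.1.000.1)"));
    }
}
```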

regex matcher check in if logic not working

Hi, you can see my code below. I have the strings country, Rank and Grank in my code. Initially they are null, but if the regex matches, the value should change. Even when the regex matches, though, the value does not change and is always null. If I remove all the if statements and append the string it works fine, but if no match is found it throws an exception. Please let me know how I can check this with if logic.
System.err.println(content);
Pattern c = Pattern.compile("NAME=\"(.*)\" RANK");
Pattern r = Pattern.compile("\" RANK=\"(.*)\"");
Pattern gr = Pattern.compile("\" TEXT=\"(.*)\" SOURCE");
Matcher co = c.matcher(content);
Matcher ra = r.matcher(content);
Matcher gra = gr.matcher(content);
co.find();
ra.find();
gra.find();
String country = null;
String Rank = null;
String Grank = null;
if (co.matches()) {
country = co.group(1);
}
if (ra.matches()) {
Rank = ra.group(1);
}
if (gra.matches()) {
Grank = gra.group(1);
}
You have to escape a single \ - use double \\ then it should work.
Tried this?
while (co.find()) {
System.out.print("Start index: " + co.start());
System.out.print(" End index: " + co.end() + " ");
System.out.println(co.group());
}
Personally, I can't make your program work with or without the if statements, so it's not a logic problem; the patterns simply don't match the string for me.
So I changed it to get something working; maybe you can use it :)
String content = "NAME=\"salut\" RANK=\"pouet\" TEXT=\"text\" SOURCE";
System.out.println(content);
System.out.println(content.replaceAll(("NAME=\"(.*)\"\\sRANK=\"(.*)\"\\sTEXT=\"(.*)\" SOURCE"), "$1---$2---$3"));
Output
NAME="salut" RANK="pouet" TEXT="text" SOURCE
salut---pouet---text
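A likely reason the if checks in the question never succeed: `Matcher.matches()` only returns true when the entire input matches the pattern, whereas `find()` looks for the pattern anywhere in the input. Using the result of `find()` as the condition avoids both the null values and the exception. The sketch below assumes the sample content from the answer above and swaps the greedy `(.*)` for a lazy `(.*?)` so each group stops at the next quote; the class and helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVersusMatches {

    // Hypothetical helper: returns group 1 of the first match, or null if there is none.
    static String firstGroup(String regex, String content) {
        Matcher m = Pattern.compile(regex).matcher(content);
        // find() looks for the pattern anywhere in the input; matches() would
        // require the whole input to match and would return false here.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String content = "NAME=\"salut\" RANK=\"pouet\" TEXT=\"text\" SOURCE";
        System.out.println(firstGroup("NAME=\"(.*?)\" RANK", content));      // salut
        System.out.println(firstGroup(" RANK=\"(.*?)\"", content));          // pouet
        System.out.println(firstGroup(" TEXT=\"(.*?)\" SOURCE", content));   // text
        System.out.println(firstGroup("MISSING=\"(.*?)\"", content));        // null
    }
}
```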

How to replace a word within a square bracket based a certain condition

I have a tricky condition that does not seem to work. For a given string, "Hi [HandleKey], you have [Action]", and a map that contains map<"HandleKey","Peter">, I want to replace the square brackets and the word within them if the key is found in the map. In this case, the map does not contain the key Action, so the string should come back as "Hi Peter, you have [Action]".
Here is the code that I'm working on:
private String messageFormatter(String tMessage, Map<String, String> messageMap)
{
String formattedMsg = null;
Set<String> keyset = messageMap.keySet();
Iterator<String> keySetItr = keyset.iterator();
String msgkey = null;
boolean isFormatted = false;
while (keySetItr.hasNext())
{
msgkey = keySetItr.next();
if(t.contains(msgkey))
{
if(!isFormatted)
{
formattedMsg = tMessage.replaceAll("\\[", "").replaceAll("\\]", "");
formattedMsg = formattedMsg.replaceAll(msgkey, messageMap.get(msgkey));
isFormatted= true;
}else
{
formattedMsg = formattedMsg.replaceAll(msgkey, messageMap.get(msgkey));;
}
}else
{
formattedMsg=tMessage;
}
}
return formattedMsg;
}
The last else part is not right. Can anyone please help me with this? This code works fine for all cases except when a matching key is not found in the map.
Is this idea OK for you?
Instead of applying a regex or extracting the text between [..], you could do a trick on the map side, e.g.
String s = "Hi [HandleKey], you have [Action]";
for(String k: yourMap.keySet()){
s=s.replaceAll("\\["+k+"\\]",yourMap.get(k));
}
You can do this with regex; here is a complete example:
public static void main(String[] args) {
String str = "Hi [HandleKey], you have [Action] ";
Hashtable<String, String> table = new Hashtable<String, String>();
table.put("HandleKey", "Peter");
Pattern pattern = Pattern.compile("\\[(\\w+)\\]");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
String key = matcher.group(1);
if (table.containsKey(key)) {
str = str.replaceFirst("\\[" + key + "\\]", table.get(key));
}
}
System.out.println(str);
}
Output:
Hi Peter, you have [Action]
Note that this is more efficient than looping over the Map if the map size is already large or growing.
To handle the case where the key is not in the map with minimal changes to what you have above, try
formattedMsg = formattedMsg.replaceAll(msgkey,
(messageMap.containsKey(msgkey) ? messageMap.get(msgkey) : "[" + msgkey + "]"));
But looking again, I can see that you're iterating over the set of keys from messageMap, so the issue of a key not appearing in the map doesn't arise?
There's also a reference to if(t.contains(msgkey))... but I'm not sure what t is.
If you want the text to contain the formatted [msgkey] when it's not found, then removing all "[" and "]" seems the wrong way to start, since you would have to put them back in some cases.
I'd look at @iTech's suggestion and get regex doing more for you.
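One way to let the regex do more, sketched here under the assumption that placeholders are always `[word]`: walk the template once with `Matcher.appendReplacement`, look up each key in the map, and fall back to the original bracketed text when the key is missing. `Matcher.quoteReplacement` guards against `$` or `\` in the replacement values. The class and method names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderFormatter {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\[(\\w+)\\]");

    // Hypothetical helper: replaces [key] with its mapped value and
    // leaves placeholders with unknown keys untouched.
    static String format(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // m.group() is the whole "[key]" text, used as the fallback.
            String replacement = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the value from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("HandleKey", "Peter");
        // prints: Hi Peter, you have [Action]
        System.out.println(format("Hi [HandleKey], you have [Action]", values));
    }
}
```

This makes a single pass over the message regardless of how many keys the map holds.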
