I have the following string and need to extract the '20000' from it.
"where f_id = '20000' and (flag is true or flag is null)"
Any suggestions on the best way to do this?
Here's more code to help understand:
List<ReportDto> reportDtoList = new ArrayList<ReportDto>();
for (Report report : reportList) {
List<ReportDetailsDto> ReportDetailsDtoList = new ArrayList<ReportDetailsDto>();
ReportDto reportDto = new ReportDto();
reportDto.setReportId(report.getReportId());
reportDto.setReportName(report.getName());
Pattern p = Pattern.compile("=\\s'[0-9]+'");
String whereClause = report.getWhereClause();
Matcher m = p.matcher(whereClause);
I'm not sure what to do after this.
You can use this regex to extract a single non-negative integer from your String:
Pattern p = Pattern.compile("[0-9]+");
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println(m.group());
}
Or, if you want to preserve the single quotes:
Pattern p = Pattern.compile("['0-9]+");
This pattern matches an '=', a single whitespace character, the opening quote and the digits that follow. substring(3) then drops the '=', the whitespace and the quote, so only the number is printed. If this matches, you know the number comes right after an '='.
Pattern p = Pattern.compile("=\\s'[0-9]+");
Matcher m = p.matcher(text);
if (m.find()) {
System.out.println(m.group().substring(3));
}
EDIT
Based on the code you added, this is how it would look:
List<ReportDto> reportDtoList = new ArrayList<ReportDto>();
Pattern p = Pattern.compile("=\\s'[0-9]+");
for (Report report : reportList) {
List<ReportDetailsDto> ReportDetailsDtoList = new ArrayList<ReportDetailsDto>();
ReportDto reportDto = new ReportDto();
reportDto.setReportId(report.getReportId());
reportDto.setReportName(report.getName());
String whereClause = report.getWhereClause();
Matcher m = p.matcher(whereClause);
if (m.find()) {
String foundThis = m.group().substring(3);
// do something with foundThis
} else {
// didn't find a number or =
}
}
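As an alternative to substring(3), a capturing group can pick out just the digits, so nothing depends on counting characters. This is only a sketch: the class and method names (WhereClauseIds, firstQuotedId) are made up for illustration, and the pattern assumes the id is always a quoted run of digits after an '='.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
final class WhereClauseIds {
    // Compiled once and reused; group 1 holds only the digits between the quotes.
    // \\s* tolerates zero or more whitespace characters after the '='.
    private static final Pattern QUOTED_ID = Pattern.compile("=\\s*'([0-9]+)'");

    /** Returns the first quoted integer that follows an '=', if there is one. */
    static Optional<String> firstQuotedId(String whereClause) {
        Matcher m = QUOTED_ID.matcher(whereClause);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String clause = "where f_id = '20000' and (flag is true or flag is null)";
        System.out.println(firstQuotedId(clause).orElse("no id found")); // prints 20000
    }
}
```

Returning an Optional makes the "no match" case explicit, which plays the same role as the else branch in the loop above.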
Try this:
Pattern p = Pattern.compile("-?\\d+");
String s = "your string here";
Matcher m = p.matcher(s);
List<String> extracted = new ArrayList<String>();
while (m.find()) {
extracted.add(m.group());
}
For floats and negatives:
Pattern p = Pattern.compile("(-?\\d+)(\\.\\d+)?");
String s = "where f_id = '20000' 3.2 and (flag is true or flag is null)";
Matcher m = p.matcher(s);
List<String> extracted = new ArrayList<String>();
while (m.find()) {
extracted.add(m.group());
}
for (String g : extracted)
System.out.println(g);
This prints:
20000
3.2
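If you need numeric values rather than strings, the matches can be converted as they are found. A minimal sketch, assuming BigDecimal is an acceptable target type; the names NumberExtractor and extractNumbers are hypothetical.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class NumberExtractor {
    // Optional minus sign, digits, optional fractional part.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:\\.\\d+)?");

    /** Collects every integer or decimal literal found in the text. */
    static List<BigDecimal> extractNumbers(String text) {
        List<BigDecimal> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(text);
        while (m.find()) {
            numbers.add(new BigDecimal(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String s = "where f_id = '20000' 3.2 and (flag is true or flag is null)";
        System.out.println(extractNumbers(s)); // prints [20000, 3.2]
    }
}
```

Note that a '-' directly in front of a digit is always read as a sign, so "5-3" would yield 5 and -3.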
I have text like this in my String s (which I have already read from a .txt file):
trump;Donald Trump;trump#yahoo.eu
obama;Barack Obama;obama#google.com
bush;George Bush;bush#inbox.com
clinton,Bill Clinton;clinton#mail.com
Then I'm trying to cut off everything except the e-mail address and print it to the console:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i]);
}
and I have output like this:
trump#yahoo.eu
obama#google.com
bush#inbox.com
clinton#mail.com
How can I avoid this, i.e. how can I get the output without line breaks?
Try the approach below. I read your file with both a Scanner and a BufferedReader, and in neither case do I get any line breaks, because nextLine() and readLine() return each line without its line terminator. file.txt is the file that contains the text, and the splitting logic is the same as yours.
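import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
public class CC {
public static void main(String[] args) throws IOException {
Scanner scan = new Scanner(new File("file.txt"));
while (scan.hasNextLine()) {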
String f1[] = null;
f1 = scan.nextLine().split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
scan.close();
BufferedReader br = new BufferedReader(new FileReader(new File("file.txt")));
String str = null;
while ((str = br.readLine()) != null) {
String f1[] = null;
f1 = str.split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
br.close();
}
}
You may just remove all line breaks, as shown in the code below:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i].replaceAll("\r", "").replaceAll("\n", ""));
}
This removes every line break without putting a space in its place.
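The line breaks survive in the original code because '.' does not match line terminators by default, so split("(.*?);") never consumes the '\n' at the end of each address. A sketch that avoids the problem by working line by line and taking whatever follows the last ';' is shown below. The names EmailColumn and joinLastFields are made up, and it assumes the address is always the last ';'-separated field on a line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original answers.
final class EmailColumn {
    /** Joins the last ';'-separated field of every non-blank line, with no separator. */
    static String joinLastFields(String text) {
        List<String> emails = new ArrayList<>();
        // \\R matches any line break sequence (\n, \r\n, \r, ...); available since Java 8.
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // lastIndexOf returns -1 when there is no ';', so the whole line is kept.
            emails.add(trimmed.substring(trimmed.lastIndexOf(';') + 1));
        }
        return String.join("", emails);
    }

    public static void main(String[] args) {
        String s = "trump;Donald Trump;trump#yahoo.eu\n"
                + "obama;Barack Obama;obama#google.com\n"
                + "bush;George Bush;bush#inbox.com\n"
                + "clinton,Bill Clinton;clinton#mail.com";
        System.out.println(joinLastFields(s));
    }
}
```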
Instead of split, you might match an email-like format directly. The negated character class [^\\s;]+ matches one or more characters that are neither whitespace nor a semicolon; follow it with a # and then the same character class again.
final String regex = "[^\\s;]+#[^\\s;]+";
final String string = "trump;Donald Trump;trump#yahoo.eu \n"
+ " obama;Barack Obama;obama#google.com \n"
+ " bush;George Bush;bush#inbox.com \n"
+ " clinton,Bill Clinton;clinton#mail.com";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
final List<String> matches = new ArrayList<String>();
while (matcher.find()) {
matches.add(matcher.group());
}
System.out.println(String.join("", matches));
package com.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String s = "trump;Donald Trump;trump#yahoo.eu "
+ "obama;Barack Obama;obama#google.com "
+ "bush;George Bush;bush#inbox.com "
+ "clinton;Bill Clinton;clinton#mail.com";
String spaceStrings[] = s.split("[\\s,;]+");
String output="";
for(String word:spaceStrings){
if(validate(word)){
output+=word;
}
}
System.out.println(output);
}
public static final Pattern VALID_EMAIL_ADDRESS_REGEX = Pattern.compile(
"^[A-Z0-9._%+-]+#[A-Z0-9.-]+\\.[A-Z]{2,6}$",
Pattern.CASE_INSENSITIVE);
public static boolean validate(String emailStr) {
Matcher matcher = VALID_EMAIL_ADDRESS_REGEX.matcher(emailStr);
return matcher.find();
}
}
Just remove the '\n' that may appear at the start or end of each element.
Write it this way:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
f1[i] = f1[i].replace("\n", "");
System.out.print(f1[i]);
}
I need to parse hashtags from a String (test comment #georgios#gsabanti sefse #afa).
String text = "test comment #georgios#gsabanti sefse #afa";
String[] words = text.split(" ");
List<String> tags = new ArrayList<String>();
for ( final String word : words) {
if (word.substring(0, 1).equals("#")) {
tags.add(word);
}
}
In the end I need an array with the elements "#georgios", "#gsabanti" and "#afa".
But right now #georgios#gsabanti shows up as a single hashtag.
How can I fix it?
+1 for the Regular Expressions:
Matcher matcher = Pattern.compile("(#[^#\\s]*)")
.matcher("test comment #georgios#gsabanti sefse #afa");
List<String> tags = new ArrayList<>();
while (matcher.find()) {
tags.add(matcher.group());
}
System.out.println(tags);
Here is a simple way of doing that
String text = "test comment #georgios#gsabanti sefse #afa";
String patternst = "#[a-zA-Z0-9]*";
Pattern pattern = Pattern.compile(patternst);
Matcher matcher = pattern.matcher(text);
List<String> tags = new ArrayList<String>();
while (matcher.find()) {
tags.add(matcher.group(0));
}
I hope it will work for you :)
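One detail about the two regex answers above: because both patterns end in *, a lone "#" with nothing after it would also be collected as a tag. A sketch that uses + instead and returns an array, as the question asks, could look like this. The names HashtagExtractor and extract are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answers.
final class HashtagExtractor {
    // '#' followed by at least one character that is neither '#' nor whitespace.
    private static final Pattern TAG = Pattern.compile("#[^#\\s]+");

    /** Returns every hashtag in the text, each including its leading '#'. */
    static String[] extract(String text) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String text = "test comment #georgios#gsabanti sefse #afa";
        // prints #georgios, #gsabanti, #afa
        System.out.println(String.join(", ", extract(text)));
    }
}
```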
Use an ArrayList instead of an array:
String text = "test comment #georgios#gsabanti sefse #afa";
ArrayList<String> hashTags = new ArrayList<>();
char[] c = text.toCharArray();
for(int i=0;i<c.length;i++) {
if(c[i]=='#') {
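String hash = "#";
// collect characters until the next space, '#', or the end of the text
for(int j=i+1;j<c.length && c[j]!=' ' && c[j]!='#';j++) {
hash+=c[j];
}
// add the tag here so that a tag at the very end of the text is not lost
hashTags.add(hash);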
}
}
String text = "test comment #georgios#gsabanti sefse #afa";
String[] words = text.split("(?=#)|\\s+");
List<String> tags = new ArrayList<String>();
for ( final String word : words) {
if (!word.isEmpty() && word.startsWith("#")) {
tags.add(word);
}
}
You can split your string before each " " or "#" (the lookaheads keep the delimiters in the pieces) and then keep only the pieces that start with "#", like below:
public static void main(String[] args){
String text = "test comment #georgios#gsabanti sefse #afa";
String[] tags = Stream.of(text.split("(?=#)|(?= )")).filter(e->e.startsWith("#")).toArray(String[]::new);
System.out.println(Arrays.toString(tags));
}
I am trying to get the location data from this string using String.split("[,\\:]"):
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
String[] str = location.split("[,\\:]");
How can I get the data like this?
str[0] = 27.980194
str[1] = 46.090199
str[2] = 0.48
str[3] = 1
str[4] = 6
Thank you for any help!
If you just want to keep the numbers (including dot separator), you can use:
String[] str = location.split("[^\\d\\.]+");
You will need to ignore the first element in the array which is an empty string.
That will only work if the data names don't contain numbers or dots.
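If the names might contain digits, or you want to know which value belongs to which name, one option is to match each name:value pair instead of splitting. The following is only a sketch: LocationParser and parse are made-up names, and it assumes every value is a plain decimal number.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class LocationParser {
    // group 1: field name, group 2: integer or decimal value (optionally negative)
    private static final Pattern FIELD = Pattern.compile("(\\w+):(-?\\d+(?:\\.\\d+)?)");

    /** Parses name:value pairs into a map that keeps the original field order. */
    static Map<String, Double> parse(String location) {
        Map<String, Double> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(location);
        while (m.find()) {
            fields.put(m.group(1), Double.valueOf(m.group(2)));
        }
        return fields;
    }

    public static void main(String[] args) {
        String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
        Map<String, Double> fields = parse(location);
        System.out.println(fields.keySet()); // prints [lat, lng, speed, fix, sats]
        System.out.println(fields.get("speed"));
    }
}
```

A LinkedHashMap is used so that iterating over the result gives the values in the same order as in the input, which matches the str[0]..str[4] layout asked for.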
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
Matcher m = Pattern.compile( "\\d+\\.*\\d*" ).matcher(location);
List<String> allMatches = new ArrayList<>();
while (m.find( )) {
allMatches.add(m.group());
}
System.out.println(allMatches);
Quick and Dirty:
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
List<String> strList = Arrays.asList(location.split("[,\\:]"));
String[] str = new String[5];
int count=0;
for(String s : strList){
try {
Double d =Double.parseDouble(s);
str[count] = d.toString();
System.out.println("In String Array:"+str[count]);
count++;
} catch (NumberFormatException e) {
System.out.println("s:"+s);
}
}
I need a little help.
I don't know why execution enters the body of the while (matcher.find()) loop when I have the string "3*3".
Code:
public void delSin_Cos_Tan()
{
o = new ArrayList<>();
String aDate = "3*3";
Pattern datePattern = Pattern.compile("((sin|cos|tan|sinh|cosh|tanh|asin|acos|atan)\\((.+)\\))");
//Operat.Sin_Cos_Tan.Patter = ((sin|cos|tan|sinh|cosh|tanh|asin|acos|atan)\((.+)\))
Matcher matcher = datePattern.matcher(aDate);
Log.d(TAG,"Sin Startz");
Log.d(TAG,"Sin " + Aufgabe);
while (matcher.find());
{
Log.e(TAG,matcher.group(1)); // the error occurs here, even with the string "3*3", and I don't know why execution gets inside the while
String Gesammt = matcher.group(1);
String TYP = matcher.group(2);
String Inklammer = matcher.group(3);
Log.d(TAG, String.valueOf("------------------------"));
Log.d(TAG, Gesammt);
Log.d(TAG, Inklammer);
Log.d(TAG, TYP);
Log.d(TAG, String.valueOf("------------------------"));
}
}
My complete code: http://pastebin.com/jWN1ghfz
You have a ; right after your while condition.
That is why the block below it always gets executed, whether or not anything matched!
while (matcher.find()); should be while (matcher.find()) (without the ;)
It's because
while (matcher.find());
{
//...
}
is the same as
while (matcher.find()){
;
}
{
//...
}
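With the semicolon removed, the body only runs after a successful find(). In the broken version the block runs after find() has already returned false, and calling group() without a successful match throws an IllegalStateException, which is the error seen at the Log.e line. A small sketch of the corrected loop, using the pattern from the question; the names TrigCallFinder and findCalls are made up and the Android logging is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
final class TrigCallFinder {
    // Same pattern as in the question: group 2 is the function name, group 3 the argument.
    private static final Pattern TRIG_CALL = Pattern.compile(
            "((sin|cos|tan|sinh|cosh|tanh|asin|acos|atan)\\((.+)\\))");

    /** Returns "name -> argument" for every trig call found in the expression. */
    static List<String> findCalls(String expression) {
        List<String> calls = new ArrayList<>();
        Matcher matcher = TRIG_CALL.matcher(expression);
        while (matcher.find()) { // no ';' here, so the body runs once per match
            calls.add(matcher.group(2) + " -> " + matcher.group(3));
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(findCalls("3*3"));      // prints []
        System.out.println(findCalls("sin(3*3)")); // prints [sin -> 3*3]
    }
}
```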
I've been working on a weekend project, a simple, lightweight XML parser, just for fun, to learn more about regexes. I've been able to get the data in attributes and elements, but am having a hard time separating tags. This is what I have:
CharSequence inputStr = "<a>test</a>abc<b1>test2</b1>abc1";
String patternStr = openTag+"(.*?)"+closeTag;
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(inputStr);
StringBuffer buf = new StringBuffer();
boolean found = false;
while ((found = matcher.find())) {
String replaceStr = matcher.group();
matcher.appendReplacement(buf, "found tag (" + replaceStr + ")");
}
matcher.appendTail(buf);
String result = buf.toString();
System.out.println(result);
Output: found tag (<a>test</a>abc<b1>test2</b1>)abc1
I need it to end the 'found tag' at each tag, not at the whole group. Is there any way I can have it do that? Thanks.
You can try something like the following to get it working as you require:
int count = matcher.groupCount();
for(int i=0;i<count;i++)
{
String replaceStr = matcher.group(i);
matcher.appendReplacement(buf, "found tag (" + replaceStr + ")");
}
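Another way to get one 'found tag' per element is to make the pattern itself stop at the matching close tag, using a backreference to the tag name. The question does not show how openTag and closeTag are defined, so the pattern below is an assumption, and TagMarker and markTags are made-up names. It is a sketch only: it does not handle attributes, self-closing tags, or nested elements with the same name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
final class TagMarker {
    // Assumed pattern: <name>, lazily matched content, then </name> via backreference \1.
    private static final Pattern ELEMENT = Pattern.compile("<(\\w+)>(.*?)</\\1>");

    /** Wraps every matched element in "found tag (...)" and leaves other text untouched. */
    static String markTags(String input) {
        Matcher matcher = ELEMENT.matcher(input);
        StringBuffer buf = new StringBuffer();
        while (matcher.find()) {
            String replacement = "found tag (" + matcher.group() + ")";
            // quoteReplacement keeps any '$' or '\' in the matched text literal
            matcher.appendReplacement(buf, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        // prints found tag (<a>test</a>)abcfound tag (<b1>test2</b1>)abc1
        System.out.println(markTags("<a>test</a>abc<b1>test2</b1>abc1"));
    }
}
```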