Parsing in Java [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a few, theoretical ideas, but I don't know the language well. We basically need to make a rudimentary lexical analyzer. I have most of the bits, but here's the context. The straight-forward question is at the end.
I need to read in a line from the console, then see if parts match up with a symbol table, which includes the keywords: "print [variable]", "load [variable]", "mem [variable]", "sqrt" and "stop", as well as mathematical symbols.
It also needs to recognise variables on their own (such as "c = a + b").
So...it's not that hard, in theory. You'd check the first character of the string matched up with keywords or variables. If they do, keep looping through that keyword or variable to check if it's the same string, up until you hit a space.
To summarize: How do I check the characters of a read in string to compare to stuff in Java?

I recommend using regex for text pattern matching. This example receives the text as a command-line argument through the args array of the main method. Here's a small example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public final class Parser {
    public static void main(final String[] args) {
        if (args.length < 1) {
            // Error, no input given
            System.err.println("No input given");
            return;
        }
        String input = args[0];
        Pattern pattern = Pattern.compile("YOUR REGEX HERE");
        Matcher matcher = pattern.matcher(input);
        if (matcher.find()) {
            // Input matches the Regex pattern
            // Access to capturing groups using matcher.group(int)
            // Example: System.out.println(matcher.group(1));
        }
    }
}
For Regex you can find various explanations on the web and on SO. You can try out your patterns at regex101.
Here's an example pattern which matches "name = name + name":
(.+) = (.+) \+ (.+)
The () creates capturing groups. Using matcher.group(x) for x from 1 to 3 you can access the matched values inside the brackets, i.e. the variables.
Here's the same example online with test input: regex101.com/r/mJ9jI5/1
Fairly easy. However, you may need to make the pattern more robust. As written, .+ also accepts whitespace and special characters (for example a +) inside a variable name, which you probably do not want.
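One way to make the pattern stricter is to restrict each name to an identifier and to allow optional whitespace around the operators. The following is only a minimal sketch: the class name AssignmentParser, the method parseAssignment, and the rule that a variable is a letter followed by letters, digits, or underscores are assumptions made for illustration, not something the question specifies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AssignmentParser {
    // Assumed identifier rule: a letter followed by letters, digits, or underscores.
    private static final String NAME = "([A-Za-z][A-Za-z0-9_]*)";

    // Matches "name = name + name" with optional whitespace around '=' and '+'.
    private static final Pattern ASSIGNMENT =
            Pattern.compile("\\s*" + NAME + "\\s*=\\s*" + NAME + "\\s*\\+\\s*" + NAME + "\\s*");

    /** Returns {target, left, right}, or null if the line is not an addition assignment. */
    static String[] parseAssignment(String line) {
        Matcher m = ASSIGNMENT.matcher(line);
        if (!m.matches()) { // matches() requires the whole line to fit the pattern
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parseAssignment("c = a + b");
        System.out.println(String.join(", ", parts)); // c, a, b
    }
}
```

Because each name can no longer contain a + or a space, an input such as "c = a + + b" is rejected instead of being captured as a strange variable name.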

Related

Regex Validation for "ASSOC123" in Java [closed]

I have a task to check an ID that must start with "ASSOC" in uppercase followed by 3 digits. I am a newbie in Java and still learning regex so any help would be welcome!
The following regular expression matches any string that matches the pattern you want: ASSOC[0-9]{3}.
If you also want to extract the 3-digit ID as a so-called regex-group from the match, the expression must be as follows: ASSOC([0-9]{3}).
Here, [0-9] says that we expect a single digit from 0 to 9. The curly brackets say that exactly 3 of them are expected, i.e. a 3-digit number.
On Regex101 you also have the possibility to validate different inputs with this regex. Furthermore the regular expression is explained in detail.
Example 1: https://regex101.com/r/zEN9Gt/1
Example 2: https://regex101.com/r/zEN9Gt/2
In Java you could test this as follows:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "ASSOC([0-9]{3})";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher1 = pattern.matcher("ASSOC123");
final Matcher matcher2 = pattern.matcher("Assoc123");
if (matcher1.find()) {
    System.out.println(matcher1.group(1));
    // Since ASSOC123 is a valid input for the regular expression,
    // matcher1.find() is true and we can output the 3 digit number (attention: group 1!).
    // Group 0 would be the entire expression found, hence "ASSOC123".
}
if (matcher2.find()) {
    // Since "Assoc123" does not match the regular expression, matcher2.find() is false
}
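Note that find() succeeds if the pattern occurs anywhere in the input, so a string such as "XASSOC1234" would also pass the check above. If the whole ID must have this shape, matches() is one possible alternative. This is a sketch only; the class name AssocIdValidator and the methods isValid and digitsOf are hypothetical names chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AssocIdValidator {
    private static final Pattern ID = Pattern.compile("ASSOC([0-9]{3})");

    /** True only if the entire input is "ASSOC" followed by exactly three digits. */
    static boolean isValid(String input) {
        return ID.matcher(input).matches();
    }

    /** Returns the three digits, or null if the input is not a valid ID. */
    static String digitsOf(String input) {
        Matcher m = ID.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(isValid("ASSOC123"));   // true
        System.out.println(isValid("Assoc123"));   // false: wrong case
        System.out.println(isValid("XASSOC1234")); // false: extra characters
        System.out.println(digitsOf("ASSOC123"));  // 123
    }
}
```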

Need help on Regex return values [closed]

I have a regex question assigned by my instructor and he wants us to make all the return values true by changing the string value in the three declared variables. This is my first time doing a regex question and I wanted a little help if that's okay. I tried www.regexpal.com but I didn't know how to use it.
Could someone shed a little light on this topic as to how I begin to solve this? Thanks
Here's the code:
public class RegexTester {
    public static void main(String[] args) {
        String regexSSN = ""; //TODO add a regex for Social Security Numbers
        String regex9000 = ""; //TODO add a regex for GGC 9000 numbers here
        String regexZipPlus4 = ""; //TODO add a regex for zip+4 zipcodes here
        System.out.println("All of the following tests should return true, "
                + "the negative tests are negated (meaning that they should "
                + "also return true)");
        System.out.println("192-192-5555".matches(regexSSN)); // the following tests should all match
        System.out.println("444-183-1212".matches(regexSSN));
        System.out.println("032-431-9375".matches(regexSSN));
        System.out.println("122-650-4343".matches(regexSSN));
        System.out.println("900012389".matches(regex9000));
        System.out.println("900112389".matches(regex9000));
        System.out.println("900012390".matches(regex9000));
        System.out.println("900050000".matches(regex9000));
        System.out.println("30043".matches(regexZipPlus4));
        System.out.println("30043-1234".matches(regexZipPlus4));
        System.out.println(); // the following codes print out true
        System.out.println(!"192-XYZ-5555".matches(regexSSN)); // the following tests should NOT match
        System.out.println(!"00-192-5555".matches(regexSSN));
        System.out.println(!"90005000".matches(regex9000)); // too short!
        System.out.println(!"900250000".matches(regex9000)); // must be 9000* or 9001*
        System.out.println(!"9002500001".matches(regex9000)); // too big
        System.out.println(!"9001FOO00".matches(regex9000)); // no alpha allowed
        System.out.println(!"30043-12345".matches(regexZipPlus4)); // too long
        System.out.println(!"300430-1234".matches(regexZipPlus4)); // too long
        System.out.println(!"30043-12".matches(regexZipPlus4)); // too short
        System.out.println(!"300FO-1234".matches(regexZipPlus4)); // no alpha allowed
        System.out.println(!"30043=1234".matches(regexZipPlus4)); // missing hyphen
    }
}
Start by reading the java.util.regex.Pattern documentation. It contains all the information you need to complete the assignment. Make sure you clearly understand the requirements before constructing each regex pattern; you can then translate those requirements into regex.
E.g., to match a telephone number of the format XXX-XXX-XXXX, where X is any digit, you need 3 digits followed by a dash, followed by 3 digits, followed by another dash and then 4 digits:
^\d{3}-\d{3}-\d{4}$
Please note that when assigning this pattern to a Java string, you need to escape each backslash by doubling it (\d becomes \\d).
I like using RegexPlanet to test my code. Here is a link for the first problem: regexSSN (although an SSN should be 9 digits long; in your code it's 10). Click on the Go button. You will be able to enter your test cases.
Here's the solution for your first case.
String regexSSN = "^(\\d{3}\\-\\d{3}\\-\\d{4})";
Hopefully this will get you started so you can complete other two problems.
When designing regex strings, I like to begin by categorizing parts of the string into similar components. Let's take the SSN regex as an example.
Step 1: We see the format is ###-###-#### where # is a digit 0-9
Step 2: The regex for matching a digit is either [0-9] or \d
Step 3: Now we can write it out in regex \d\d\d-\d\d\d-\d\d\d\d where - is just a literal dash.
Step 4: Notice the repetition? We can take care of that too with {n}, where n is the number of times we want to repeat the previous element, so now we have \d{3}-\d{3}-\d{4}
And that's how you do the SSN regex.
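The steps above can be checked with a short program. This is a minimal sketch: the class name SsnRegexSteps and the constant names are made up for the example, and the sample strings are taken from the test cases in the question.

```java
final class SsnRegexSteps {
    // Step 3: every digit written out. Backslashes are doubled inside a Java string literal.
    static final String LONG_FORM = "\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d";
    // Step 4: the same pattern with {n} repetition.
    static final String SHORT_FORM = "\\d{3}-\\d{3}-\\d{4}";

    public static void main(String[] args) {
        String[] samples = { "192-192-5555", "192-XYZ-5555", "00-192-5555" };
        for (String s : samples) {
            // String.matches() tests the whole string, so no ^ or $ is needed.
            System.out.println(s + " -> " + s.matches(LONG_FORM) + " / " + s.matches(SHORT_FORM));
        }
    }
}
```

Both forms accept the first sample and reject the other two, which shows that the {n} quantifier is only a shorter way of writing the same pattern.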

Blank space in array with String splitter [closed]

My objective is to separate the numbers from the string, but I get an empty string in my array's first position. I need help preventing that.
String str1 = "Y=9x1+29x2";
String[] split2 = str1.split("[^-?.?0-9]+");
The empty string at the start appears because the input begins with non-digit characters, which split() treats as a leading delimiter.
You can remove the leading non-digits before splitting (the ^ anchor makes sure only a run at the very start is removed):
String linha = "Y=9x1+29x2";
String[] split = linha.replaceFirst("^[^-.\\d]+", "").split("[^-.\\d]+");
for (String tok : split)
    System.out.println(tok);
Output:
9
1
29
2
I think your question is rather vague, but after looking at it, I'm guessing that you want to extract the numbers out of the string, where a "number" has this format: an optional minus sign, followed by an optional decimal point, followed by one or more digits. I suspect you also want to include numbers that have digits followed by a decimal point followed by more digits.
I'm guessing this is what you want, because of the ? you put in your regex. The problem is that inside square brackets, ? doesn't mean "optional", and it doesn't mean "zero or one of something". It means a question mark. The regex [^-?.?0-9] means "match one character that is not a digit, a period, a hyphen, or a question mark". A pattern in square brackets always matches one character, and you tell it what characters are OK (or, if you begin with ^, what characters are not OK). This kind of "character set" pattern never matches a sequence of characters. It just looks at one character at a time. If you put + after the pattern, it still looks at one character at a time; it just does so repeatedly.
I think what you're trying to do is to take a pattern that represents a number, and then say "look for something that doesn't look like that pattern", and you tried to do it by using [^...]. That simply will not work.
In fact, split() is the wrong tool for this job. The purpose of split is to break up a string whose delimiters match a given pattern. Using it when the strings you want to keep in the array match a given pattern doesn't work very well, unless the pattern is extremely simple. I recommend that you create a Matcher and use the find() method in a loop. find() is set up so that it can find all matching substrings of a string if you call it repeatedly. This is what you want to accomplish, so it's the right tool.
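A minimal sketch of the find() loop might look like this. The class name NumberExtractor, the method extractNumbers, and the exact number pattern (optional minus sign, optional integer part, optional decimal point, at least one digit) are assumptions based on the guess above about what counts as a number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumberExtractor {
    // Optional minus, optional integer part, optional decimal point, at least one digit.
    private static final Pattern NUMBER = Pattern.compile("-?\\d*\\.?\\d+");

    static List<String> extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {          // each call resumes after the previous match
            numbers.add(m.group()); // group() is the text of the current match
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractNumbers("Y=9x1+29x2")); // [9, 1, 29, 2]
    }
}
```

Since the pattern describes what to keep rather than what to discard, there is no leading empty element to deal with. One caveat of this particular pattern: a minus directly before a digit is always read as a sign, so in "9x1-2" the last number comes out as -2.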

Regex validation issue [closed]

I need to validate an input string and prevent the '=' (equals sign) from being entered. I use the regex [=]* to catch it, but it also catches other strings. Example input: 2c450807-4a4c-4f18-bf4f-5a100ced87a0. The regex above catches this string as well.
Please help me.
Also, can anyone please explain why this regex doesn't catch the input? I need to catch the special characters listed in the regex.
final String REGEX="[.,%*$##?^<!&>'|/\\\\~\\[\\]{}+=\"-]*";
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher("2c450807-4a4c-4f18-bf4f-5a100ced87a0");
if (matcher.matches()) {
    System.out.println("found");
}
else {
    System.out.println("not found!");
}
This prints "not found!"
When you use regular expressions, you can match an item depending on how many times it appears.
To match a group containing exactly n occurrences of a symbol (in your case the equals sign, =), use:
(=){n}
e.g. the === in if(myVar === myValue) is matched when n=3.
To match the symbol one or more times:
(=)+
e.g. if(myVar = myValue), if(myVar == myValue), or if(myVar === myValue).
To match an item that may or may not appear (zero or more times):
(=)*
e.g. if(myVar < myValue).
The item does not need to be present in the text being checked; it can occur anywhere from 0 to n times.
I think the problem you have is that the * quantifier allows 0 occurrences of the preceding subpattern. Thus, [=]* matches any string.
You need to use a mere
=
And then you will not match 2c450807-4a4c-4f18-bf4f-5a100ced87a0.
Also please note that = is not a special regex character, you do not need to escape it, or place into a character class to avoid escaping.
However, as it has been pointed out in another comment, if you do not have to use a "regex", just check if a string contains = with a str.contains("=").
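The difference between these options can be seen in a short sketch. The class name EqualsSignCheck and the helper containsEquals are hypothetical; the [=-]* pattern is a shortened stand-in for the long character class in the question. It also shows why the second snippet in the question prints "not found!": matches() has to consume the entire input, and the letters and digits of the ID are not in the character class.

```java
import java.util.regex.Pattern;

final class EqualsSignCheck {
    static boolean containsEquals(String s) {
        return s.contains("="); // no regex needed for a single literal character
    }

    public static void main(String[] args) {
        String id = "2c450807-4a4c-4f18-bf4f-5a100ced87a0";

        // [=]* also matches the empty string, so find() succeeds on any input.
        System.out.println(Pattern.compile("[=]*").matcher(id).find()); // true

        // A plain "=" with find() only succeeds if the character is present.
        System.out.println(Pattern.compile("=").matcher(id).find());    // false
        System.out.println(Pattern.compile("=").matcher("a=b").find()); // true

        // matches() must consume the whole input; letters and digits are not
        // in the character class, so the whole-string match fails.
        System.out.println(id.matches("[=-]*")); // false

        System.out.println(containsEquals(id));    // false
        System.out.println(containsEquals("a=b")); // true
    }
}
```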

Java Regex Find Characters In Any String [closed]

I'm trying to program a Minecraft Bukkit plugin using Java that will find when an IP is placed in the chat, so I'm using regex for it. What I need is regex to find anywhere in the string, a character followed by a period followed by another character, such as 127.0.0.1 being valid, but it also needs to be able to find it with any characters surrounding it such as This IP: 127.0.0.1 is your localhost IP. This is my current code:
Pattern p = Pattern.compile("[a-z1-9]" + "." + "[a-z1-9]");
Matcher matcher = p.matcher(message);
if (matcher.matches()) {
    player.sendMessage(plugin.prefix + "§7You cannot advertise an IP address!");
    event.setCancelled(true);
}
This code will only search for something like 127.0 and only that, but as I said above I need it to find any amount of [letter/number].[letter/number] in any string, if that made sense.
What I need is regex to find anywhere in the string, a character followed by a period followed by another character. I need it to find any amount of [letter/number].[letter/number] in any string, if that made sense...
What you can do here is use a word boundary \b to match for these patterns in larger text.
For a simple solution, you can use something like this.
\\b((?:(?:[0-9]{1,3}\\.){3}[0-9]{1,3}|(?:[a-z0-9]+(?:-[a-z0-9]+)*\\.)+[a-z]{2,4}))\\b
Example:
import java.util.regex.*;
class rTest {
    public static void main (String[] args) {
        String in = "Let's match 127.0.0.1 being valid, or this IP: 127.0.0.1 and joinTHIS.server.com or build 1.2";
        String re = "\\b((?:(?:[0-9]{1,3}\\.){3}[0-9]{1,3}|(?:[a-z0-9]+(?:-[a-z0-9]+)*\\.)+[a-z]{2,4}))\\b";
        Pattern p = Pattern.compile(re, Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(in);
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}
Output
127.0.0.1
127.0.0.1
joinTHIS.server.com
Please read this: link
String IPADDRESS_PATTERN =
"([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\." +
"([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\." +
"([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\." +
"([01]?\\d\\d?|2[0-4]\\d|25[0-5])";
Try this:
Pattern.compile("(\\d{1,3}\\.){3}\\d{1,3}");
This will simply find any sequence of four numbers, each one to three digits long, separated by periods.
Edit: plsgogame's answer contains a better pattern for finding an IP address (which may only contain numbers between 0 and 255).
Your regex matches any string of length 3 that starts and ends with a character from [a-z1-9], because you're not escaping the '.', which stands for any character. Moreover, the digits-and-period group has to be repeated to cover a whole address. For example, you could use something like:
\d+\.\d+\.\d+\.\d+
which matches one or more digits followed by a period, three times, and finally one or more digits. This means you'll get a match for any string of the form '123.456.789.101' but also for things like '122533252.13242351432142.375547547.62463636', so it is not completely helpful.
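A stricter variant, though still not perfect, is the following:
[\d][\d][\d]\.[\d][\d][\d]\.[\d][\d][\d]\.[\d][\d][\d]
which matches four groups of exactly three digits separated by dots. It rejects the overlong numbers above, but it also misses addresses with shorter groups, such as 127.0.0.1.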
If you want to fast forward to something much more interesting and effective but also more difficult to understand if you are a beginner, you can use the example found on this page, that is:
\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
which does exactly what you need.
Moreover, the matches() method tries to match all of the input, not a section of it, so you could add a '.*' at the beginning and end of the regex and run it from java code like this:
Pattern p = Pattern.compile(".*\\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b.*");
Matcher matcher = p.matcher(message);
if (matcher.matches()) System.out.println("It's a match");
If you want to find all the IPs you can do instead:
Pattern p = Pattern.compile("\\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b");
Matcher matcher = p.matcher(message);
while (matcher.find()) System.out.println("Match: " + matcher.group());
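For the chat-filter use case, the find() variant can be wrapped in small helper methods. This is a sketch under assumptions: the class name IpFinder and the methods findIps and containsIp are hypothetical, and the pattern is the same octet pattern as above, split into a constant for readability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IpFinder {
    // One octet: 0 to 255.
    private static final String OCTET = "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
    // Three "octet." groups followed by a final octet, bounded by word boundaries.
    private static final Pattern IP =
            Pattern.compile("\\b(?:" + OCTET + "\\.){3}" + OCTET + "\\b");

    /** Returns every IPv4-looking substring of the message, in order. */
    static List<String> findIps(String message) {
        List<String> found = new ArrayList<>();
        Matcher m = IP.matcher(message);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    /** True if the message contains at least one IPv4-looking substring. */
    static boolean containsIp(String message) {
        return IP.matcher(message).find(); // find() searches inside the text, unlike matches()
    }

    public static void main(String[] args) {
        System.out.println(findIps("This IP: 127.0.0.1 is your localhost IP")); // [127.0.0.1]
        System.out.println(containsIp("build 1.2 is out"));                     // false
    }
}
```

In the plugin from the question, a check like containsIp(message) could replace the matcher.matches() condition, since it does not require the whole chat message to be an address.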
Regexes are wonderful although the learning curve is steep. So good luck learning!
