Java regex for balanced parentheses - java

I have a string like:
If ({{SQL}}.Employee.Title starts with 'Production')
and (substring of {{SQL}}.Employee.Title from '27' for '2' is not '30')
and ({{SQL}}.Employee.HireDate is greater than or equal to '2000-01-01 00:00:00.000')
then Pull {{SQL}}.Title, {{SQL}}.HireDate from Employee
From this expression, I want to find out in Java whether the round brackets are properly balanced.
One way is to keep a counter that is incremented whenever an opening bracket is found and decremented whenever a closing bracket is encountered. The final value of the counter then tells me whether the brackets are balanced.
But this does not help for a string like (), i.e. one with no alphanumeric characters between the brackets.
Is there any way to determine both that the round brackets are balanced and that there are alphanumeric characters between them?
If the brackets are empty, i.e. there is no character between the opening and closing brackets, it should throw an error.

You'll need code similar to the following. It uses a Stack to track the opened and closed parentheses, and remembers whether the previous character was an opening parenthesis in order to detect empty parentheses:
// Requires: import java.util.Stack;
String test = "{TEST}(A){";
Stack<Integer> stack = new Stack<>();
boolean lastCharIsOpening = false;
for (char c : test.toCharArray()) {
    switch (c) {
        case '{':
        case '(':
            stack.push(1);
            lastCharIsOpening = true;
            continue; // next character; keeps the flag set
        case '}':
        case ')':
            if (lastCharIsOpening) {
                throw new RuntimeException("Empty parentheses");
            }
            // Guard against pop() on an empty stack (EmptyStackException)
            if (stack.empty()) {
                throw new RuntimeException("Closing parenthesis without an opening one");
            }
            stack.pop();
            break;
    }
    lastCharIsOpening = false;
}
if (!stack.empty()) {
    throw new RuntimeException("Not matching number of opened/closed parentheses");
}
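The question also asks that each pair of round brackets contain alphanumeric characters. A minimal sketch of that stricter check follows. The class and method names are made up for illustration, and it assumes that content inside a nested pair also counts as content for the enclosing pair, which the question does not specify.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RoundBracketCheck {
    // Returns true when every '(' has a matching ')' and each pair
    // encloses at least one letter or digit (directly or in a nested pair).
    static boolean isValid(String text) {
        // One flag per open bracket: has a letter or digit been seen inside it yet?
        Deque<Boolean> open = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            if (c == '(') {
                open.push(false);
            } else if (c == ')') {
                // Unmatched ')' or a pair with nothing alphanumeric inside
                if (open.isEmpty() || !open.pop()) {
                    return false;
                }
                // Assumption: nested content counts for the enclosing pair too
                if (!open.isEmpty()) {
                    open.pop();
                    open.push(true);
                }
            } else if (Character.isLetterOrDigit(c) && !open.isEmpty()) {
                open.pop();
                open.push(true);
            }
        }
        return open.isEmpty(); // leftover '(' means unbalanced
    }

    public static void main(String[] args) {
        System.out.println(isValid("If ({{SQL}}.Employee.Title starts with 'Production')"));
        System.out.println(isValid("()"));
    }
}
```

Returning a boolean rather than throwing keeps the sketch easy to test; the caller can throw an error on false if that is what is needed.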

Related

How to skip to next character in Java

I am working on a program that converts a prefix expression to a postfix expression. However, when there is an unexpected blank space in an expression, such as "$+-ABC+D-E F" instead of "$+-ABC+D-EF", the program doesn't work correctly. How do I skip to the next character and ignore the whitespace? I am trying to do it through an if-else statement using a boolean isBlank method.
public class PrefixConverter{
// Checks if character is an operator
public boolean isOperator(char c){
switch (c){
case '+':
case '-':
case'*':
case'/':
case'$':
return true;
}
return false;
}
// Ignores white space
public boolean isBlank(char c){
switch (c){
case ' ':
return true;
}
return false;
}
// Method to convert Prefix expression to Postfix expression
public String preToPost (String prefix_exp){
// Create a new stack with length of the prefix string
int size = prefix_exp.length();
Stack expression_stack = new Stack (size);
// Read expression from right to left
for (int i = size -1; i >=0 ; i-- ){
if (isOperator(prefix_exp.charAt(i))){
// Pop two operands from the stack
String op1 = expression_stack.peek();
expression_stack.pop();
String op2 = expression_stack.peek();
expression_stack.pop();
// Concatenate the operands and the operator
String temp = op1 + op2 + prefix_exp.charAt(i);
// Push the result back onto the stack
expression_stack.push(temp);
}
else if(isBlank(prefix_exp.charAt(i))){
// Skip to next character
}
// If the symbol is an operand
else {
// Push the operand onto the stack
expression_stack.push(prefix_exp.charAt(i) + "");
}
}
return expression_stack.peek();
}
}
One way would be to write a continue; in this else if():
else if(isBlank(prefix_exp.charAt(i))){
// Skip to next character
continue;
}
continue will simply move on to the next iteration of the loop.
However, if you do not need the spaces, you can remove them from the prefix_exp String at the beginning by doing this:
prefix_exp = prefix_exp.replaceAll("\\s", "");
Just make sure you do the replaceAll before you call .length() of the String as the size changes.
Just use the continue statement to skip to the end of the current iteration of your for loop.
This will make the loop run again with the next character. But since the rest of your loop body is an if / else if / else chain anyway, an empty else if branch already does nothing, so your code should behave the same even without it.
...
else if(isBlank(prefix_exp.charAt(i))){
// Skip to next character
continue;
}
See also https://docs.oracle.com/javase/tutorial/java/nutsandbolts/branch.html
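For reference, here is a minimal self-contained sketch of the conversion with whitespace skipped via continue. It swaps the question's custom Stack class for java.util.ArrayDeque and uses Character.isWhitespace instead of isBlank; both are substitutions made for illustration, and the class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PrefixToPostfix {
    static boolean isOperator(char c) {
        return c == '+' || c == '-' || c == '*' || c == '/' || c == '$';
    }

    static String convert(String prefix) {
        Deque<String> stack = new ArrayDeque<>();
        // Read the expression from right to left
        for (int i = prefix.length() - 1; i >= 0; i--) {
            char c = prefix.charAt(i);
            if (Character.isWhitespace(c)) {
                continue; // skip to the next character
            }
            if (isOperator(c)) {
                // pop() throws NoSuchElementException on a malformed expression
                String op1 = stack.pop();
                String op2 = stack.pop();
                stack.push(op1 + op2 + c);
            } else {
                stack.push(String.valueOf(c));
            }
        }
        return stack.peek();
    }

    public static void main(String[] args) {
        System.out.println(convert("$+-ABC+D-E F")); // prints AB-C+DEF-+$
    }
}
```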

delimiters check using stack

The code checks whether the delimiters in a string are balanced. I've been using a stack to solve this. I traverse the string to the end; whenever an opening delimiter is encountered I push it onto the stack, and for each closing delimiter encountered I check whether the stack is empty (and report an error if it is) and then pop the stack to match the popped character against the closing delimiter. I ignore all other characters in the string.
At the end of the traversal I check whether the stack is empty (that is, whether all the opening delimiters were balanced out). If it's not empty, I report an error.
Although I have cross-checked it many times, the code seems to report every string as invalid (i.e. with unbalanced delimiters). Here's the code:
import java.util.*;
public class delimiter {
public static void main(String args[]){
String s1 = "()";
String s2 = "[}[]";
if(delimitercheck(s1)){
System.out.println("s1 is a nice text!");
}
else
System.out.println("S1 is not nice");
if(delimitercheck(s2)){
System.out.println("s2 is a nice text!");
}
else
System.out.println("S2 is not nice");
}
public static boolean delimitercheck(String s){
Stack<Character> stk = new Stack<Character>();
if(s==null||s.length()==0)//if it's a null string return true
return true;
for(int i=0;i<s.length();i++){
if(s.charAt(i)=='('||s.charAt(i)=='{'||s.charAt(i)=='['){
stk.push(s.charAt(i));
}
if(s.charAt(i)==')'||s.charAt(i)=='}'||s.charAt(i)==']'){
if(stk.isEmpty()){
return false;
}
if(stk.peek()==s.charAt(i)){
stk.pop();
}
}
}
if(stk.isEmpty()){
return true;
}
else
return false;
}
}
Can anyone point out where I am going wrong?
Your error is here :
if(stk.peek()==s.charAt(i)){
stk.pop();
}
The i'th character shouldn't be equal to stk.peek(). It should be closing it. i.e. if stk.peek() == '{', s.charAt(i) should be '}', and so on.
In addition, if the current closing parenthesis doesn't match the top of the stack, you should return false.
You can either have a separate condition for each type of parenthesis, or you can create a Map<Character,Character> that maps each opening parenthesis to its corresponding closing parenthesis, and then your condition will become :
if(map.get(stk.peek())==s.charAt(i)){
stk.pop();
} else {
return false;
}
where map can be initialized to :
Map<Character,Character> map = new HashMap<>();
map.put('(',')');
map.put('{','}');
map.put('[',']');
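Putting the map-based condition together with the rest of the check gives something like the sketch below. The class name and the use of ArrayDeque in place of Stack are choices made here for illustration; treating null as balanced follows the question's code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class DelimiterCheck {
    // Maps each opening delimiter to the closing delimiter it requires
    private static final Map<Character, Character> CLOSING_FOR = new HashMap<>();
    static {
        CLOSING_FOR.put('(', ')');
        CLOSING_FOR.put('{', '}');
        CLOSING_FOR.put('[', ']');
    }

    static boolean isBalanced(String s) {
        if (s == null) {
            return true;
        }
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            if (CLOSING_FOR.containsKey(c)) {
                stack.push(c); // opening delimiter
            } else if (CLOSING_FOR.containsValue(c)) {
                // Closing delimiter: the top of the stack must be its opener
                if (stack.isEmpty() || CLOSING_FOR.get(stack.pop()) != c) {
                    return false;
                }
            }
            // All other characters are ignored
        }
        return stack.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("()"));   // true
        System.out.println(isBalanced("[}[]")); // false
    }
}
```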
Yes, when encountering a closing bracket, you check whether it is equal to the opening bracket, which is not correct.
if(stk.peek()==s.charAt(i)){
stk.pop();
}
should be replaced with something similar to
Character toCheck = s.charAt(i);
Character peek = stk.peek();
if (toCheck == ')') {
if (peek == '(') {
stk.pop();
} else {
return false;
}
} else if ( // ... check all three bracket types
And please stick to braces for every if-statement: there's nothing more tedious than one day encountering an error caused by omitted braces.
You are comparing the opening delimiter on the stack directly with the closing delimiter. So for the first example you are testing whether '(' equals ')', which is never true. Instead, for every closing delimiter you should check that the top of the stack holds its corresponding opening delimiter, i.e. ')' requires '(' on top. Hope that makes sense.

Remove all vowels in a string with Java

I am doing a homework assignment for my Computer Science course. The task is to get a user's input, remove all of the vowels, and then print the new statement.
I know I could easily do it with this code:
string.replaceAll("[aeiou](?!\\b)", "")
But my instructor wants me to use nested if and else if statements to achieve the result. Right now I am using something like this:
if(Character.isLetter('a')){
'do something'
}else if(Character.isLetter('e')){
'do something else'
But I am not sure what to do inside the if and else if statements. Should I delete the letter? Or is there a better way to do this?
Seeing as this is my homework I don't want full answers just tips. Thanks!
I think what he might want is for you to read the string, create a new empty string (call it s), loop over your input and add all the characters that are not vowels to s (this requires an if statement). Then, you would simply print the contents of s.
Edit: You might want to consider using a StringBuilder for this because repetitive string concatenation can hinder performance, but the idea is the same. But to be honest, I doubt it would make a noticeable difference for this type of thing.
Character.isLetter('a')
Character.isLetter(char) tells you if the value you give it is a letter, which isn't helpful in this case (you already know that "a" is a letter).
You probably want to use the equality operator, ==, to see if your character is an "a", like:
char c = ...
if(c == 'a') {
...
} else if (c == 'e') {
...
}
You can get all of the characters in a String in multiple ways:
As an array with String.toCharArray()
Getting each character from the String using String.charAt(index)
I think you can iterate through the characters and check whether each one is a vowel, as below:
define a new string
for(each character in input string)
//("aeiou".indexOf(character) < 0) is one way to check that the character is not a vowel
if "aeiou" doesn't contain the character
append the character to the new string
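The pseudocode above translates fairly directly into Java. The following is only a sketch with a made-up class name; like the pseudocode, it treats only lowercase a, e, i, o, u as vowels, so uppercase vowels are kept.

```java
final class VowelFilter {
    static String removeVowels(String input) {
        StringBuilder result = new StringBuilder(); // the "new string"
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // indexOf returns -1 when "aeiou" does not contain the character
            if ("aeiou".indexOf(c) < 0) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeVowels("quick brown fox")); // prints qck brwn fx
    }
}
```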
If you want to do it in O(n) time:
Iterate over the character array of your String.
If you hit a vowel, skip it and copy the next non-vowel character over to the vowel's position.
You will need two counters: one that iterates over the full string, and another that keeps track of the position of the earliest vowel not yet overwritten.
After you reach the end of the array, look at the vowel-tracking counter: everything before it is the result, so the new String can be built from index 0 to 'vowelCounter-1'.
If you do this in Java you will need extra space to build the new String. If you do it in C you can simply terminate the string with a null character and finish the program without any extra space.
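A sketch of the two-counter idea in Java, under the assumption that both lowercase and uppercase vowels should be removed (the answer does not say). The names read and write are illustrative: read walks the whole array, while write marks the next slot to fill, which plays the role of the vowel-tracking counter.

```java
final class VowelCompaction {
    static boolean isVowel(char c) {
        return "aeiouAEIOU".indexOf(c) >= 0;
    }

    static String removeVowels(String input) {
        char[] chars = input.toCharArray(); // the extra space Java needs
        int write = 0; // next position to overwrite
        for (int read = 0; read < chars.length; read++) {
            if (!isVowel(chars[read])) {
                chars[write++] = chars[read]; // shift the non-vowel left
            }
        }
        // Indices 0 .. write-1 now hold the result
        return new String(chars, 0, write);
    }

    public static void main(String[] args) {
        System.out.println(removeVowels("Education")); // prints dctn
    }
}
```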
I don't think your instructor wanted you to call Character.isLetter('a') because it's always true.
The simplest way of building the result without regexp is using a StringBuilder and a switch statement, like this:
String s = "quick brown fox jumps over the lazy dog";
StringBuilder res = new StringBuilder();
for (char c : s.toCharArray()) {
    switch (c) {
        case 'a': // Fall through
        case 'u': // Fall through
        case 'o': // Fall through
        case 'i': // Fall through
        case 'e': break; // Vowel: do nothing
        default: res.append(c); // Not a vowel: keep the character
    }
}
s = res.toString();
System.out.println(s);
You can also replace this with an equivalent if, like this:
if (c!='a' && c!='u' && c!='o' && c!='i' && c!='e') {
// Do something
}

Java: Efficient way to determine if a String meets several criteria?

I would like to find an efficient way (not scanning the String 10,000 times, or creating lots of intermediary Strings for holding temporary results, or string bashing, etc.) to write a method that accepts a String and determine if it meets the following criteria:
It is at least 2 characters in length
The first character is uppercased
The remaining substring after the first character contains at least 1 lowercased character
Here's my attempt so far:
private boolean isInProperForm(final String token) {
if(token.length() < 2)
return false;
char firstChar = token.charAt(0);
String restOfToken = token.substring(1);
String firstCharAsString = firstChar + "";
String firstCharStrToUpper = firstCharAsString.toUpperCase();
// TODO: Giving up because this already seems way too complicated/inefficient.
// Ignore the '&& true' clause - left it there as a placeholder so it wouldn't give a compile error.
if(firstCharStrToUpper.equals(firstCharAsString) && true)
return true;
// Presume false if we get here.
return false;
}
But as you can see I already have 1 char and 3 temp strings, and something just doesn't feel right. There's got to be a better way to write this. It's important because this method is going to get called thousands and thousands of times (for each tokenized word in a text document). So it really really needs to be efficient.
Thanks in advance!
This function should cover it. Each char is examined only once and no objects are created.
public static boolean validate(String token) {
    if (token == null || token.length() < 2) return false;
    if (!Character.isUpperCase(token.charAt(0))) return false;
    for (int i = 1; i < token.length(); i++)
        if (Character.isLowerCase(token.charAt(i))) return true;
    return false;
}
The first criterion is simply the length. This value is stored in the String object, so checking it does not require traversing the string.
You can use Character.isUpperCase() to determine whether the first char is upper case. This does not require traversing the string either.
The last criterion requires a single traversal of the string, stopping as soon as you find a lower case character.
P.S. An alternative for criteria 2 and 3 combined is to use a regex (not more efficient, but more elegant):
return token.matches("[A-Z].*[a-z].*");
The regex is checking if the string starts with an upper case letter, and then followed by any sequence which contains at least one lower case character.
It is at least 2 characters in length
The first character is uppercased
The remaining substring after the first character contains at least 1 lowercased character
Code:
private boolean isInProperForm(final String token) {
    if (token.length() < 2) return false;
    if (!Character.isUpperCase(token.charAt(0))) return false;
    for (int i = 1; i < token.length(); i++) {
        if (Character.isLowerCase(token.charAt(i))) {
            return true; // our last criterion, so we are free
                         // to return on a met condition
        }
    }
    return false; // didn't meet the last criterion, so we return false
}
If you added more criteria, you'd have to revise the last condition.
What about:
return token.matches("[A-Z].*[a-z].*");
This regular expression matches a string that starts with an uppercase letter and has at least one lowercase letter after it, and therefore meets your requirements.
To find if the first character is uppercase:
Character.isUpperCase(token.charAt(0))
To check if there is at least one lowercase:
if(Pattern.compile("[a-z]").matcher(token).find()) {
//At least one lowercase
}
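Since the method is called for every token, a regex-based version would normally compile the pattern once rather than on every call (String.matches and Pattern.compile both recompile each time). A minimal sketch, with the class and constant names invented for illustration:

```java
import java.util.regex.Pattern;

final class TokenForm {
    // Compiled once and reused; Pattern instances are immutable and thread-safe
    private static final Pattern PROPER_FORM = Pattern.compile("[A-Z].*[a-z].*");

    static boolean isInProperForm(String token) {
        // matches() requires the whole token to fit the pattern
        return token != null && PROPER_FORM.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(isInProperForm("Hello")); // true
        System.out.println(isInProperForm("HELLO")); // false
    }
}
```

Note that [A-Z] and [a-z] only cover ASCII letters, whereas Character.isUpperCase and Character.isLowerCase also accept other Unicode letters, so the two approaches are not exactly equivalent.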
To check if first char is uppercase you can use:
Character.isUpperCase(s.charAt(0))
return token.matches("[A-Z].*[a-z].*");

How to create a Pattern matching given set of chars?

I get a set of chars, e.g. as a String containing all of them and need a charclass Pattern matching any of them. For example
for "abcde" I want "[a-e]"
for "[]^-" I want "[-^\\[\\]]"
How can I create a compact solution and how to handle border cases like empty set and set of all chars?
What chars need to be escaped?
Clarification
I want to create a charclass Pattern, i.e. something like "[...]", no repetitions and no such stuff. It must work for any input, that's why I'm interested in the corner cases, too.
Here's a start:
import java.util.*;
public class RegexUtils {
private static String encode(char c) {
switch (c) {
case '[':
case ']':
case '\\':
case '-':
case '^':
return "\\" + c;
default:
return String.valueOf(c);
}
}
public static String createCharClass(char[] chars) {
if (chars.length == 0) {
return "[^\\u0000-\\uFFFF]";
}
StringBuilder builder = new StringBuilder();
boolean includeCaret = false;
boolean includeMinus = false;
List<Character> set = new ArrayList<Character>(new TreeSet<Character>(toCharList(chars)));
if (set.size() == 1<<16) {
return "[\\w\\W]";
}
for (int i = 0; i < set.size(); i++) {
int rangeLength = discoverRange(i, set);
if (rangeLength > 2) {
builder.append(encode(set.get(i))).append('-').append(encode(set.get(i + rangeLength)));
i += rangeLength;
} else {
switch (set.get(i)) {
case '[':
case ']':
case '\\':
builder.append('\\').append(set.get(i));
break;
case '-':
includeMinus = true;
break;
case '^':
includeCaret = true;
break;
default:
builder.append(set.get(i));
break;
}
}
}
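// A lone '^' would end up first in the class and negate it, so escape it in that case
builder.append(includeCaret ? (builder.length() == 0 && !includeMinus ? "\\^" : "^") : "");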
builder.insert(0, includeMinus ? "-" : "");
return "[" + builder + "]";
}
private static List<Character> toCharList(char[] chars) {
List<Character> list = new ArrayList<Character>();
for (char c : chars) {
list.add(c);
}
return list;
}
private static int discoverRange(int index, List<Character> chars) {
int range = 0;
for (int i = index + 1; i < chars.size(); i++) {
if (chars.get(i) - chars.get(i - 1) != 1) break;
range++;
}
return range;
}
public static void main(String[] args) {
System.out.println(createCharClass("daecb".toCharArray()));
System.out.println(createCharClass("[]^-".toCharArray()));
System.out.println(createCharClass("".toCharArray()));
System.out.println(createCharClass("d1a3e5c55543b2000".toCharArray()));
System.out.println(createCharClass("!-./0".toCharArray()));
}
}
As you can see, the input:
"daecb".toCharArray()
"[]^-".toCharArray()
"".toCharArray()
"d1a3e5c55543b2000".toCharArray()
"!-./0".toCharArray()
prints:
[a-e]
[-\[\]^]
[^\u0000-\uFFFF]
[0-5a-e]
[!\--0]
The corner cases in a character class are:
\
[
]
which will need a \ to be escaped. The character ^ doesn't need an escape if it's not placed at the start of a character class, and the - does not need to be escaped when it's placed at the start, or end of the character class (hence the boolean flags in my code).
The empty set is [^\u0000-\uFFFF], and the set of all the characters is [\u0000-\uFFFF]. Not sure what you need the former for as it won't match anything. I'd throw an IllegalArgumentException() on an empty string instead.
What chars need to be escaped?
- ^ \ [ ] - that's all of them, I've actually tested it. And unlike some other regex implementations [ is considered a meta character inside a character class, possibly due to the possibility of using inner character classes with operators.
The rest of task sounds easy, but rather tedious. First you need to select unique characters. Then loop through them, appending to a StringBuilder, possibly escaping. If you want character ranges, you need to sort the characters first and select contiguous ranges while looping. If you want the - to be at the beginning of the range with no escaping, then set a flag, but don't append it. After the loop, if the flag is set, prepend - to the result before wrapping it in [].
Match all characters: ".*" (zero or more repetitions, *, of any character, .).
Match a blank line: "^$" (match the start of a line, ^, and the end of a line, $; note the lack of anything to match in between).
Not sure if the last pattern is exactly what you wanted, as there are different interpretations of "match nothing".
A quick, dirty, and almost-not-pseudo-code answer:
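// Requires: java.util.Arrays, java.util.HashSet, java.util.Set, java.util.regex.Pattern
StringBuilder sb = new StringBuilder("[");
// The characters that need escaping inside a character class
Set<Character> metaChars = new HashSet<>(Arrays.asList('\\', '[', ']', '^', '-'));
while (sourceString.length() != 0) {
    char c = sourceString.charAt(0);
    sb.append(metaChars.contains(c) ? "\\" + c : String.valueOf(c));
    // Strings are immutable: reassign to drop every occurrence of c
    sourceString = sourceString.replace(String.valueOf(c), "");
}
sb.append("]");
//...can check here for the appropriate sb.length cases before compiling
// e.g, 2 = empty ("[]" does not compile), all chars equals the count of whatever set qualifies as all chars, etc
Pattern p = Pattern.compile(sb.toString());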
This gives you the unique set of chars you want to match, with meta-characters escaped. It will not convert things into ranges (which I think is fine; doing so smells like premature optimization to me). You can do some post tests for simple set cases, like matching sb against digits, non-digits, etc., but unless you know that's going to buy you a lot of performance (or the simplification is the point of this program), I wouldn't bother.
If you really want to do ranges, you could instead sourceString.toCharArray(), sort that, iterate deleting repetitions and doing some sort of range check and replacing meta characters as you add the contents to StringBuilder.
EDIT: I actually kind of liked the toCharArray version, so pseudo-coded it out as well:
//...check for empty here, if not...
char[] sourceC = sourceString.toCharArray();
Arrays.sort(sourceC);
char lastC = sourceC[0];
StringBuilder sb = new StringBuilder("[");
StringBuilder range = new StringBuilder();
for (int i=1; i<sourceC.length; i++) {
if (lastC == sourceC[i]) continue;
if (//.. next char in sequence..//) //..add to range
else {
// check range size, append accordingly to sb as a single item, range, etc
}
lastC = sourceC[i];
}
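One way the sort-then-scan outline above could be filled in is sketched below. This is an illustrative completion rather than the answerer's code: the class and method names are invented, it always escapes the five special characters instead of placing - and ^ carefully, it only emits a range for runs of three or more consecutive characters, and it rejects the empty set with an IllegalArgumentException.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class CharClassBuilder {
    // Escapes the characters that are special inside [...]: \ [ ] ^ -
    private static String escape(char c) {
        return "\\[]^-".indexOf(c) >= 0 ? "\\" + c : String.valueOf(c);
    }

    static String build(String source) {
        if (source.isEmpty()) {
            throw new IllegalArgumentException("empty character set");
        }
        char[] chars = source.toCharArray();
        Arrays.sort(chars);
        StringBuilder sb = new StringBuilder("[");
        int i = 0;
        while (i < chars.length) {
            char start = chars[i];
            char end = start;
            // Extend the run over duplicates and directly consecutive characters
            while (i + 1 < chars.length && chars[i + 1] - end <= 1) {
                end = chars[++i];
            }
            i++;
            sb.append(escape(start));
            if (end - start >= 2) {
                sb.append('-').append(escape(end)); // three or more: use a range
            } else if (end != start) {
                sb.append(escape(end)); // exactly two: list both
            }
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(build("daecb")); // prints [a-e]
        System.out.println(Pattern.compile(build("[]^-")).matcher("^").matches()); // true
    }
}
```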
