How to find the smallest positive int efficiently? - java

I'm reading text where I want to find the end of the first sentence, at this point the first index of either '.', '?', or '!' in a string. So here's my Java code:
int next = -1;
int nextQ = text.indexOf("? ");
int nextE = text.indexOf("! ");
int nextDot = text.indexOf(". ");
if (nextDot > 0) {
next = nextDot;
if (nextQ > 0){
if (nextQ < next) {next = nextQ;}
if (nextE > 0) {
if (nextE < next) {next = nextE;}
}
} else if (nextE > 0){
if (nextE < next) {next = nextE;}
}
} else if (nextQ > 0){
next = nextQ;
if (nextE > 0 && nextE < next){next = nextE;}
} else if (nextE > 0) { next = nextE;}
I believe the code works but that's a total of 10 if statements, which doesn't look too neat. I might want to add more sentence delimiters there but I don't think this approach is very flexible. Is there any better way of doing the same? Any shorter way of achieving the same result? ...or should I try some other programming language for this sort of problems? Which one?

I'd suggest using a regular expression to search for any of those delimiters at once.
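// Requires java.util.regex.Matcher and java.util.regex.Pattern.
String text = <TEXT>;
int next;
Pattern p = Pattern.compile("\\? |! |\\. ");
Matcher m = p.matcher(text);
if (m.find()) {
next = m.start(); // assign the variable declared above instead of redeclaring it
} else next = -1;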
You can change the regex to adjust exactly what is matched. For example, I'd suggest that instead of requiring exactly a space after the delimiter, you instead require any whitespace character, so that a line break or tab will also work. This would be as follows: "\\?\\s|!\\s|\\.\\s". You would be able to add extra delimiters in a similar manner, and with a little extra work be able to detect which delimiter was triggered.
See the documentation for the Pattern class and a regular-expression tutorial for the details of Java regular expressions.
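As a rough sketch of the two suggestions above (any whitespace after the delimiter, and detecting which delimiter was triggered), a capturing group can record the matched character. The class and method names here are hypothetical, chosen only for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class SentenceEnd {
    // One of . ? ! followed by any whitespace; group 1 captures the delimiter itself.
    private static final Pattern END = Pattern.compile("([.?!])\\s");

    // Returns the index of the first delimiter, or -1 if there is none.
    static int indexOfEnd(String text) {
        Matcher m = END.matcher(text);
        return m.find() ? m.start() : -1;
    }

    // Returns the delimiter that ended the first sentence, or null if there is none.
    static String delimiterOfEnd(String text) {
        Matcher m = END.matcher(text);
        return m.find() ? m.group(1) : null;
    }
}
```

Adding another delimiter then only means adding a character to the class `[.?!]`. Note that, like the original pattern, this does not match a delimiter at the very end of the text, because no whitespace follows it.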

Use methods to keep DRY:
int firstDelimiterIndex(String s) {
return minIndex(s.indexOf(". "), minIndex(s.indexOf("? "), s.indexOf("! ")));
}
int minIndex(int a, int b) {
if (a == -1) return b;
if (b == -1) return a;
return Math.min(a, b);
}
Or choose a faster algorithm:
for (int i = 0; i < s.length(); i++) {
switch (s.charAt(i)) {
case '.':
case '?':
case '!':
if (i + 1 < s.length() && s.charAt(i + 1) == ' ')
return i;
}
}

Use Math.min and a small modification.
First, turn -1 into large positive integers:
int largeMinusOne(int a)
{
return a==-1 ? 9999999 : a;
}
int nextQ = largeMinusOne(text.indexOf("? "));
int nextE = largeMinusOne(...);
int nextDot = largeMinusOne(...);
And now:
int next = Math.min(Math.min(nextQ, nextE), nextDot);

You could simply filter out the values that are not valid (== -1), using Java 8 streams:
int nextQ = text.indexOf("? ");
int nextE = text.indexOf("! ");
int nextDot = text.indexOf(". ");
OptionalInt res = IntStream.of(nextQ, nextE, nextDot).filter(i -> i != -1).min();
if (res.isPresent())
// ok, using res.getAsInt()
else
// none of these substrings found
This is more of a joke than a real answer; in real life, gandaliter's answer should be used.

I would suggest just looping through the string character by character and stopping when you encounter any of those characters. What you're doing now is many times less efficient.
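A minimal sketch of that single-pass idea, assuming the delimiters are passed as a string so that more can be added later. The helper name `indexOfAny` is made up for this example, and unlike the question's code it does not require a space after the delimiter:

```java
// Hypothetical helper class, not part of the original answer.
final class FirstDelimiter {
    // Returns the index of the first character of text that appears
    // in delimiters, or -1 if none does. The string is scanned once.
    static int indexOfAny(String text, String delimiters) {
        for (int i = 0; i < text.length(); i++) {
            if (delimiters.indexOf(text.charAt(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }
}
```

Calling `FirstDelimiter.indexOfAny(text, ".?!")` stops at the first match, whereas three separate `indexOf` calls may each scan the whole string.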

Related

Finding the last character before a given character inside a Java String

Say I have the following string:
String string = "122045b5423";
In Java, what would be the most efficient way of finding the last 2 before the b?
I know I can split the string and then use the lastIndexOf() method of the String class, but is there a more efficient way that creates fewer variables? Is there a method in the StringBuilder class that would allow this?
If you are looking for a more compact solution, how about regex?
// A 2, followed by arbitrary chars that are not a 2 and finally a b
Pattern pattern = Pattern.compile("(2)[^2]*b");
Matcher matcher = pattern.matcher(string);
if (matcher.find()) {
System.out.print("Start index: " + matcher.start());
System.out.print(" End index: " + matcher.end());
System.out.println(" Found: " + matcher.group());
}
I have not tested it, but something similar should work.
I think the simplest (with almost zero memory-overhead) is to simply scan the string yourself:
int findLastCharBeforeChar(final String string, final char anchor, final char needle) {
int i = string.length() - 1;
while (i >= 0 && string.charAt(i) != anchor) {
--i;
}
while (i >= 0) {
if (string.charAt(i) == needle) return i;
--i;
}
return i;
}
If you want to make that a bit shorter (but likely minimally slower and definitely harder to read):
int findLastCharBeforeChar(final String string, final char anchor, final char needle) {
int i = string.length() - 1; // start scanning from the end
char target = anchor;
while (i >= 0) {
final char ch = string.charAt(i);
if (ch == target) target = needle;
if (target == needle && ch == target) return i;
--i;
}
return i;
}
Not what was asked (most efficient), but followed up in the comments for the "shortest" solution, there you go (note that this is far from efficient, and depending on where you call it, this could be bad):
string.split("b")[0].lastIndexOf('2');
You didn't specify in your OP what should happen if 'b' is not part of the input string. Should the result be -1? (will be with my first implementation) or should the method then just return the index of the last '2' in the string (the string split solution)? Changing the method to handle this case as well is trivial, just check if the first loop terminated at -1 and reset the index to the string's last index.
But this is somewhat moot. You put the 9 lines of code in a method, write proper unit tests for it and then call your new method. Calling the new method is: a) a one-liner b) efficient c) likely to be inlined by the JVM
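One possible way to handle the missing-anchor case described above is to lean on String.lastIndexOf(int ch, int fromIndex), which searches backwards from a given index. This is only a sketch; the class and method names are hypothetical, and it assumes the second behaviour (fall back to the last needle in the whole string when the anchor is absent):

```java
// Hypothetical helper class, not part of the original answer.
final class LastBefore {
    // Index of the last needle strictly before the last anchor.
    // If the anchor is absent, the whole string is searched instead.
    static int lastBefore(String s, char anchor, char needle) {
        int anchorIdx = s.lastIndexOf(anchor);
        int from = (anchorIdx == -1) ? s.length() - 1 : anchorIdx - 1;
        // A negative fromIndex makes lastIndexOf return -1.
        return s.lastIndexOf(needle, from);
    }
}
```

No intermediate strings or arrays are created, and the call site stays a one-liner.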
Look at the method substring of String class (or subSequence). That should give you what you need.
The code should be something like this:
String result = null;
int index = myString.indexOf("b");
if(index > -1) {
if(index >= 2) {
result = myString.substring(index - 2, index);
} else {
result = myString.substring(0, index);
}
}

Java: Find the longest substring without any number and at least one upper case character

Came across a programming exercise and was stuck. The problem is:
You need to define a valid password for an email but the only
restrictions are:
The password must contain one uppercase character
The password should not have numeric digit
Now, given a String, find the length of the longest substring which
is a valid password. For e.g Input Str = "a0Ba" , the output should
be 2 as "Ba" is the valid substring.
I used the concept of longest substring without repeating characters which I already did before but was unable to modify it to find the solution to above problem. My code for longest substring without repeating characters is:
public int lengthOfLongestSubstring(String s) {
int n = s.length();
Set<Character> set = new HashSet<>();
int ans = 0, i = 0, j = 0;
while (i < n && j < n) {
// try to extend the range [i, j]
if (!set.contains(s.charAt(j))){
set.add(s.charAt(j++));
ans = Math.max(ans, j - i);
}
else {
set.remove(s.charAt(i++));
}
}
return ans;
}
How about
final String input = "a0Ba";
final int answer = Arrays.stream(input.split("[0-9]+"))
.filter(s -> s.matches("(.+)?[A-Z](.+)?"))
.sorted((s1, s2) -> s2.length() - s1.length())
.findFirst()
.orElse("")
.length();
out.println(answer);
Arrays.stream(input.split("[0-9]+")) splits the original string into an array of strings. The separator is any sequence of numbers (numbers aren't allowed so they serve as separators). Then, a stream is created so I can apply functional operations and transformations.
.filter(s -> s.matches("(.+)?[A-Z](.+)?")) keeps in the stream only the strings that have at least one upper-case letter.
.sorted((s1, s2) -> s2.length() - s1.length()) sorts the stream by length (desc).
.findFirst() tries to get the first string of the stream.
.orElse("") returns an empty string if no string was found.
.length(); gets the length of the string.
I suggest that you split your String to have an array of strings without digit:
yourString.split("[0-9]")
Then iterate over this array (say, array a) to find the longest string that contains an uppercase character:
a[i].matches("[a-z]*[A-Z]{1}[a-z]*");
You can use a simple array. The algorithm to use would be a dynamic sliding window. Here is an example of a static sliding window: What is a Sliding Window
The algorithm should be as follows:
Keep track of 2 indexes of the array of char. These 2 indexes will be referred to as front and back here, representing the front and back of the array.
Have an int (I'll name it up here) to keep track of the number of upper case char.
Set all to 0.
Use a while loop that terminates if front > N where N is the number of char given.
If the next char is not a number, add 1 to front. Then check if that char is upper case. If so, add 1 to up.
If up is at least 1, update the maximum length if necessary.
If the next char is a number, continue checking the following char if they are also numbers. Set front to the first index where the char is not a number and back to front-1.
Output the maximum length.
You can use my solution which runs in O(n) time and finds the longest part without any digit and with a capital letter:
String testString = "skjssldfkjsakdfjlskdssfkjslakdfiop7adfaijsldifjasdjfil8klsasdfŞdijpfjapodifjpoaidjfpoaidjpfi9a";
int startIndex = 0;
int longestStartIndex = 0;
int endIndex = 0;
int index = 0;
int longestLength = Integer.MIN_VALUE;
boolean foundUpperCase = false;
while(index <= testString.length()) {
if (index == testString.length() || Character.isDigit(testString.charAt(index))) {
if (foundUpperCase && index > startIndex && index - startIndex > longestLength) {
longestLength = index - startIndex;
endIndex = index;
longestStartIndex = startIndex;
}
startIndex = index + 1;
foundUpperCase = false;
} else if (Character.isUpperCase(testString.charAt(index))) {
foundUpperCase = true;
}
index++;
}
System.out.println(testString.substring(longestStartIndex, endIndex));
You don't need regular expressions. Just use a few integers to act as index pointers into the string:
int i = 0;
int longestStart = 0;
int longestEnd = 0;
while (i < s.length()) {
// Skip past all the digits.
while (i < s.length() && Character.isDigit(s.charAt(i))) {
++i;
}
// i now points to the start of a substring
// or one past the end of the string.
int start = i;
// Keep a flag to record if there is an uppercase character.
boolean hasUppercase = false;
// Increment i until you hit another digit or the end of the string.
while (i < s.length() && !Character.isDigit(s.charAt(i))) {
hasUppercase |= Character.isUpperCase(s.charAt(i));
++i;
}
// Check if this is longer than the longest so far.
if (hasUppercase && i - start > longestEnd - longestStart) {
longestEnd = i;
longestStart = start;
}
}
String longest = s.substring(longestStart, longestEnd);
Ideone demo
Whilst more verbose than regular expressions, this has the advantage of not creating any unnecessary objects: the only object created is the longest string, right at the end.
I am using a modification of Kadane's algorithm to find the required password length. You may use the isNumeric() and isCaps() functions or inline the checks as if statements. I have shown the version with functions below.
public boolean isNumeric(char x){
return (x>='0'&&x<='9');
}
public boolean isCaps(char x){
return (x>='A'&&x<='Z');
}
public int maxValidPassLen(String a)
{
int max_so_far = 0, max_ending_here = 0;
boolean cFlag = false;
int max_len = 0;
for (int i = 0; i < a.length(); i++)
{
max_ending_here = max_ending_here + 1;
if (isCaps(a.charAt(i))){
cFlag = true;
}
if (isNumeric(a.charAt(i))){
max_ending_here = 0;
cFlag = false;
}
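else if (max_so_far<max_ending_here){
max_so_far = max_ending_here;
}
// Compare against the current run, so that a longer earlier run without a capital is not counted.
if(cFlag&&max_len<max_ending_here){
max_len = max_ending_here;
}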
}
return max_len;
}
Hope this helps.
There are plenty of good answers here but thought it might be of interest to add one that uses Java 8 streams:
IntStream.range(0, s.length()).boxed()
.flatMap(b -> IntStream.rangeClosed(b + 1, s.length())
.mapToObj(e -> s.substring(b, e)))
.filter(t -> t.codePoints().noneMatch(Character::isDigit))
.filter(t -> t.codePoints().filter(Character::isUpperCase).count() == 1)
.mapToInt(String::length).max();
If you wanted the string (rather than just the length), then the last line can be replaced with:
.max(Comparator.comparingInt(String::length));
Which returns an Optional<String>.
I'd use Streams and Optionals:
public static String getBestPassword(String password) throws Exception {
if (password == null) {
throw new Exception("Invalid password");
}
Optional<String> bestPassword = Stream.of(password.split("[0-9]"))
.filter(TypeErasure::containsCapital)
.sorted((o1, o2) -> Integer.compare(o2.length(), o1.length())) // longest first
.findFirst();
if (bestPassword.isPresent()) {
return bestPassword.get();
} else {
throw new Exception("No valid password");
}
}
/**
* Returns true if word contains capital
*/
private static boolean containsCapital(String word) {
return word.chars().anyMatch(Character::isUpperCase);
}
Be sure to write some unit tests
public String pass(String str){
int length = 0;
boolean uppercase = false;
String s= "";
String d= "";
// Iterate one step past the end so that the final run is also evaluated.
for(int i=0;i<=str.length();i++){
if(i==str.length() || Character.isDigit(str.charAt(i))){
// End of a digit-free run: keep it if it has a capital and is the longest so far.
if(uppercase && s.length()>length){
d = s;
length = s.length();
}
s = "";
uppercase = false;
}else{
if(Character.isUpperCase(str.charAt(i))){
uppercase = true;
}
s = s+str.charAt(i);
}
}
return d;}
Here is a simple solution with Scala
def solution(str: String): Int = {
val strNoDigit = str.replaceAll("[0-9]", "-")
val strAlphas = strNoDigit.split("-")
Try(strAlphas.filter(_.trim.find(_.isUpper).isDefined).maxBy(_.size))
.toOption
.map(_.length)
.getOrElse(-1)
}
Another solution using tail recursion in Scala
def solution2(str: String): Int = {
val subSt = new ListBuffer[Char]
def checker(str: String): Unit = {
if (str.nonEmpty) {
val s = str.head
if (!s.isDigit) {
subSt += s
} else {
subSt += '-'
}
checker(str.tail)
}
}
checker(str)
if (subSt.nonEmpty) {
val noDigitStr = subSt.mkString.split("-")
Try(noDigitStr.filter(s => s.nonEmpty && s.find(_.isUpper).isDefined).maxBy(_.size))
.toOption
.map(_.length)
.getOrElse(-1)
} else {
-1
}
}
This is a dynamic programming problem. You can solve this yourself using a matrix. It is easy enough. Just give it a try. Take the characters of the password as the rows and columns of the matrix. Add the diagonals if the current character appended to the last character forms a valid password. Start with the smallest valid password as the initial condition.
String[] s = testString.split("[0-9]");
int length = 0;
int index = -1;
for(int i=0; i< s.length; i++){
if(s[i].matches("[a-z]*.*[A-Z].*[a-z]*")){
if(length <= s[i].length()){
length = s[i].length();
index = i;
}
}
}
if(index >= 0){
System.out.println(s[index]);
}
//easiest way to do it:
String str = "a0Ba12hgKil8oPlk";
String[] str1 = str.split("[0-9]+");
List<Integer> in = new ArrayList<Integer>();
for (int i = 0; i < str1.length; i++) {
if (str1[i].matches("(.+)?[A-Z](.+)?")) {
in.add(str1[i].length());
}
}
Collections.sort(in);
// Print the longest valid length, or -1 when no segment contains an uppercase letter.
System.out.println(in.isEmpty() ? "-1" : "string : " + in.get(in.size() - 1));
This is my solution with c#. I tested a range of strings and it gave me the correct value. Used Split. No Regex or Substrings. Let me know if it works; open to improvements and corrections.
public static int validPassword(string str)
{
List<int> strLength = new List<int>();
if (!(str.All(Char.IsDigit)))
{
//string str = "a0Bb";
string[] splitStrs = str.Split(new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' });
//check if each string contains a upper case
foreach (string s in splitStrs)
{
//Console.WriteLine(s);
if (s.Any(char.IsUpper) && s.Any(char.IsLower) || s.Any(char.IsUpper))
{
strLength.Add(s.Length);
}
}
if (strLength.Count == 0)
{
return -1;
}
foreach (int i in strLength)
{
//Console.WriteLine(i);
}
return strLength.Max();
}
else
{
return -1;
}
}
I think this solution takes care of all the possible corner cases. It passed all the test cases in an Online Judge. It is a dynamic sliding window O(n) solution.
public class LongestString {
public static void main(String[] args) {
// String testString = "AabcdDefghIjKL0";
String testString = "a0bb";
int startIndex = 0, endIndex = 0;
int previousUpperCaseIndex = -1;
int maxLen = 0;
for (; endIndex < testString.length(); endIndex++) {
if (Character.isUpperCase(testString.charAt(endIndex))) {
if (previousUpperCaseIndex > -1) {
maxLen = Math.max(maxLen, endIndex - startIndex);
startIndex = previousUpperCaseIndex + 1;
}
previousUpperCaseIndex = endIndex;
} else if (Character.isDigit(testString.charAt(endIndex))) {
if (previousUpperCaseIndex > -1) {
maxLen = Math.max(maxLen, endIndex - startIndex);
}
startIndex = endIndex + 1;
previousUpperCaseIndex = -1;
}
}
if (previousUpperCaseIndex > -1)
maxLen = Math.max(maxLen, endIndex - startIndex);
System.out.println(maxLen);
}}
function ValidatePassword(password){
var doesContainNumber = false;
var hasUpperCase = false;
for(var i=0;i<password.length;i++){
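// Compare against the digit range directly; isNaN(' ') is false, so a space would count as a number.
if(password[i] >= '0' && password[i] <= '9')
doesContainNumber = true;
// Only a real uppercase letter differs from its lowercase form.
if(password[i] != password[i].toLowerCase())
hasUpperCase = true;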
}
if(!doesContainNumber && hasUpperCase)
return true;
else
return false;
}
function GetLongestPassword(inputString){
var longestPassword = "";
for(var i=0;i<inputString.length;i++)
{
// Start j at i so that single-character substrings are checked too.
for (var j=i;j<inputString.length;j++)
{
var substring = inputString.substring(i,j+1);
var isValid = ValidatePassword(substring);
if(isValid){
if(substring.length > longestPassword.length)
{
longestPassword = substring;
}
}
}
}
if(longestPassword == "")
{
return "No Valid Password found";
}
else
{
return longestPassword;
}
}

Returning a string minus a specific character between specific characters

I am going through the Java CodeBat exercises. Here is the one I am stuck on:
Look for patterns like "zip" and "zap" in the string -- length-3, starting with 'z' and ending with 'p'. Return a string where for all such words, the middle letter is gone, so "zipXzap" yields "zpXzp".
Here is my code:
public String zipZap(String str){
String s = ""; //Initialising return string
String diff = " " + str + " "; //Ensuring no out of bounds exceptions occur
for (int i = 1; i < diff.length()-1; i++) {
if (diff.charAt(i-1) != 'z' &&
diff.charAt(i+1) != 'p') {
s += diff.charAt(i);
}
}
return s;
}
This is successful for a few of them but not for others. It seems like the && operator is acting like a || for some of the example strings; that is to say, many of the characters I want to keep are not being kept. I'm not sure how I would go about fixing it.
A nudge in the right direction if you please! I just need a hint!
Actually it is the other way around. You should do:
if (diff.charAt(i-1) != 'z' || diff.charAt(i+1) != 'p') {
s += diff.charAt(i);
}
Which is equivalent to:
if (!(diff.charAt(i-1) == 'z' && diff.charAt(i+1) == 'p')) {
s += diff.charAt(i);
}
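For reference, here is a sketch of the whole method with the corrected condition, following the padding approach from the question. The class wrapper and the StringBuilder are additions for illustration; the rule itself is the one given above: a character is dropped only when it sits between a 'z' and a 'p'.

```java
// Hypothetical wrapper class for illustration.
final class ZipZap {
    static String zipZap(String str) {
        StringBuilder out = new StringBuilder();
        String padded = " " + str + " "; // padding avoids bounds checks at both ends
        for (int i = 1; i < padded.length() - 1; i++) {
            boolean middleOfPattern =
                    padded.charAt(i - 1) == 'z' && padded.charAt(i + 1) == 'p';
            // Keep everything except the middle letter of a z?p pattern.
            if (!middleOfPattern) {
                out.append(padded.charAt(i));
            }
        }
        return out.toString();
    }
}
```

The original condition `!= 'z' && != 'p'` kept a character only when both neighbours were wrong, which is why characters next to a lone 'z' or 'p' disappeared.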
This sounds like the perfect use of a regular expression.
The regex "z.p" will match any three letter token starting with a z, having any character in the middle, and ending in p. If you require it to be a letter you could use "z[a-zA-Z]p" instead.
So you end up with
public String zipZap(String str) {
return str.replaceAll("z[a-zA-Z]p", "zp");
}
This passes all the tests, by the way.
You could make the argument that this question is about raw string manipulation, but I would argue that that makes this an even better lesson: applying regexes appropriately is a massively useful skill to have!
public String zipZap(String str) {
//If bigger than 3, because obviously without 3 variables we just return the string.
if (str.length() >= 3)
{
//Create a variable to return at the end.
String ret = "";
//Walk over every character; the bounds check below replaces the earlier end-of-string workaround,
//which could drop trailing characters after a match or read past the end of the string.
for (int i = 0; i < str.length(); i++)
{
//Only look two characters ahead when they exist, so charAt(i+2) cannot go out of bounds.
if (i + 2 < str.length() && str.charAt(i) == 'z' && str.charAt(i+2) == 'p')
{
//Add "zp" to the return string
ret = ret + "zp";
//Increment to skip over z-p.
i += 2;
}
//If it isn't z-p, we do this.
else
{
//Add the character
ret = ret + str.charAt(i);
}
}
//return the string.
return ret;
}
//If it was less than 3 chars, we return the string.
else
{
return str;
}
}

check if a string has four consecutive letters in ascending or descending order

Good day stack overflow.
I'm a noob at using regex, and here is my problem: I need to check whether a password contains 4 consecutive characters. So far I have only covered the digits. Here is my regex:
ascending digits - ^.*?(?:0123|1234|2345|3456|4567|5678|6789).*$
descending digits - ^.*?(?:9876|8765|7654|6543|5432|4321|3210).*$
This works only for the digits. I know this is already overkill in regex, so I don't want to do the same for the letters. It would be waaay too much overkill if I did that.
abcdblah //true because of abcd
helobcde //true because of bcde
dcbablah //true because of dcba
heloedcb //true because of edcb
Any help would be highly appreciated. Thanks stackoverflow.
The answer is simple: don't use regexes.
Use this approach:
iterate over each letter (of course, skip the last three letters)
iterate over the next three letters and check for ascending order
if they all were ascending return true.
iterate over the next three letters and check for descending order
if they all were descending return true.
return false
In code, this would look like this (untested code):
public boolean checkForAscendingOrDescendingPart(String txt, int l)
{
for (int i = 0; i <= txt.length() - l; ++i)
{
boolean success = true;
char c = txt.charAt(i);
for (int j = 1; j < l; ++j)
{
if (((char) c + j) != txt.charAt(i + j))
{
success = false;
break;
}
}
if (success) return true;
success = true;
for (int j = 1; j < l; ++j)
{
if (((char) c - j) != txt.charAt(i + j))
{
success = false;
break;
}
}
if (success) return true;
}
return false;
}
Good luck!
StackOverflow :)
Here is an idea that doesn't use regex:
Every character has a numeric code, and the codes of consecutive letters are consecutive. For example, abcd has the ASCII values 97, 98, 99, 100.
A sketch in Java:
static boolean hasAscendingRun(String s) {
for (int i = 0; i + 4 <= s.length(); i++) {
char c1 = s.charAt(i);
char c2 = s.charAt(i + 1);
char c3 = s.charAt(i + 2);
char c4 = s.charAt(i + 3);
// Each character must be exactly one code above the previous one.
if (c1 + 1 == c2 && c2 + 1 == c3 && c3 + 1 == c4) {
return true;
}
}
return false;
}
Repeat with the comparison reversed (c1 - 1 == c2, and so on) for descending order.
There is no way to solve this using regexes apart from the "overkill" solution of listing each of the possible sequences you want to match. Regexes are not expressive enough to offer a better solution.
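If the listing approach is used anyway, the alternation does not have to be typed by hand. The following is a sketch that generates it; the class and method names are hypothetical, and it assumes the character range contains only letters or digits, so nothing needs regex escaping:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class SequencePattern {
    // Builds an alternation such as "abcd|dcba|bcde|edcb|..." covering every
    // ascending and descending run of runLength characters in [first, last].
    static Pattern consecutiveRuns(char first, char last, int runLength) {
        List<String> runs = new ArrayList<>();
        for (char start = first; start + runLength - 1 <= last; start++) {
            StringBuilder asc = new StringBuilder();
            for (int k = 0; k < runLength; k++) {
                asc.append((char) (start + k));
            }
            runs.add(asc.toString());
            runs.add(new StringBuilder(asc).reverse().toString());
        }
        return Pattern.compile(String.join("|", runs));
    }
}
```

With `Pattern p = SequencePattern.consecutiveRuns('a', 'z', 4);`, calling `p.matcher(password).find()` reports whether any four-letter lowercase run occurs. A second pattern built with `'0'` and `'9'` would cover the digits.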
This is my solution. It uses only a single loop.
Keep in mind that you'll need more logic if you want to constrain it to pure ASCII.
static boolean isWeak(String pass) {
Character prev = null;
Boolean asc = null;
int streak = 0;
for (char c : pass.toCharArray()) {
if (prev != null) {
switch (c - prev) {
case -1:
if (Boolean.FALSE.equals(asc)) streak++;
else { asc = false; streak = 2; }
break;
case 1:
if (Boolean.TRUE.equals(asc)) streak++;
else { asc = true; streak = 2; }
break;
default: asc = null; streak = 0;
}
if (streak == 4) return true;
}
prev = c;
}
return false;
}
Consider this
String s = "aba";
for (int i = 0; i < s.length() - 1; i++) {
char c1 = s.charAt(i);
char c2 = s.charAt(i + 1);
if (!(Character.isLetter(c1) && Character.isLetter(c2))) {
//reject
}
if (c1 > c2) {
//reject
}
}
For s = "aba", the second if statement becomes true at the pair "ba", so you could reject it. If s were "abc", that if statement would never be true.
The code above uses > to check for ascending order. Use < for descending order.

How can I find the index of the first "element" in my string using Java?

I'm working on writing a simple Prolog interpreter in Java.
How can I find the last character index of the first element either the head element or the tail element of a string in "List Syntax"?
List Syntax looks like:
(X)
(p a b)
(func (func2 a) (func3 X Y))
(equal eve (mother cain))
The head for each of those strings in order are:
Head: "X", Index: 1
Head: "p", Index: 1
Head: "func", Index: 4
Head: "equal", Index: 5
Basically, I need to match the string that immediately follows the first "(" and ends either with a space or a closing ")", whichever comes first. I need the character index of the last character of the head element.
How can I match and get this index in Java?
Brabster's solution is really close. However, consider the case of:
((b X) Y)
Where the head element is (b X). I attempted to fix it by removing "(" from the scanner delimiters, but it still hiccups because of the space between "b" and "X".
Similarly:
((((b W) X) Y) Z)
Where the head is (((b W) X) Y).
Java's Scanner class (introduced in Java 1.5) might be a good place to start.
Here's an example that I think does what you want (updated to include char counting capability)
import java.util.Scanner;
public class Test {
public static void main(String[] args) {
String[] data = new String[] {
"(X)",
"(p a b)",
"(func (func2 a) (func3 X Y))",
"(equal eve (mother cain))",
"((b X) Y)",
"((((b W) X) Y) Z)"
};
for (String line:data) {
int headIdx = 0;
if (line.charAt(1) == '(') {
headIdx = countBrackets(line);
} else {
String head = "";
Scanner s = new Scanner(line);
s.useDelimiter("[)|(| ]");
head = s.next();
headIdx = line.indexOf(head) + head.length() - 1;
}
System.out.println(headIdx);
}
}
private static int countBrackets(String line) {
int bracketCount = 0;
int charCount = 0;
for (int i = 1; i < line.length(); i++) {
char c = line.charAt(i);
if (c == '(') {
bracketCount++;
} else if (c == ')') {
bracketCount--;
}
if (bracketCount == 0) {
return charCount + 1;
}
charCount++;
}
throw new IllegalStateException("Brackets not nested properly");
}
}
Output:
1
1
4
5
5
13
It's not a very elegant solution, but regexes can't count (i.e. brackets). I'd be thinking about using a parser generator if there's any more complexity in there :)
Is there a reason you can't just brute force it? Something like this?
public int firstIndex( String exp ) {
int parenCount = 0;
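for (int i = 1; i < exp.length() - 1; i++) {
if (exp.charAt(i) == '(') {
parenCount++;
}
else if (exp.charAt(i) == ')') {
parenCount--;
}
// At nesting depth 0, the head ends where the next character is a space or a closing parenthesis.
if (parenCount == 0 && (exp.charAt(i+1) == ' ' || exp.charAt(i+1) == ')')) {
return i;
}
}
return -1; // no head element found
}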
I may be missing something here, but I think that would work.
I suggest you write a proper parser (operator precedence in the case of Prolog) and represent the terms as trees of Java objects for further processing.
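As a starting point, here is a rough sketch of a recursive-descent parser for just the parenthesised list syntax shown in the question. It does not handle Prolog operators or precedence, and the `ListSyntaxParser` and `Term` names are invented for this example:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the question's list syntax only.
final class ListSyntaxParser {
    // A term is either an atom (children == null) or a list of terms (atom == null).
    static final class Term {
        final String atom;
        final List<Term> children;
        Term(String atom) { this.atom = atom; this.children = null; }
        Term(List<Term> children) { this.atom = null; this.children = children; }
        @Override public String toString() {
            if (atom != null) return atom;
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(' ');
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    private final String src;
    private int pos = 0;

    private ListSyntaxParser(String src) { this.src = src; }

    // Parses exactly one term and rejects trailing input.
    static Term parse(String src) {
        ListSyntaxParser p = new ListSyntaxParser(src);
        Term t = p.term();
        p.skipSpaces();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("Trailing input at " + p.pos);
        }
        return t;
    }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
    }

    private Term term() {
        skipSpaces();
        if (pos >= src.length()) throw new IllegalArgumentException("Unexpected end of input");
        if (src.charAt(pos) == ')') throw new IllegalArgumentException("Unexpected ')' at " + pos);
        if (src.charAt(pos) == '(') {
            pos++; // consume '('
            List<Term> items = new ArrayList<>();
            skipSpaces();
            while (pos < src.length() && src.charAt(pos) != ')') {
                items.add(term()); // nested lists recurse here
                skipSpaces();
            }
            if (pos >= src.length()) throw new IllegalArgumentException("Missing ')'");
            pos++; // consume ')'
            return new Term(items);
        }
        // An atom runs until the next space or parenthesis.
        int start = pos;
        while (pos < src.length() && " ()".indexOf(src.charAt(pos)) < 0) pos++;
        return new Term(src.substring(start, pos));
    }
}
```

With a tree in hand, the head of a list is simply `children.get(0)`, whether it is an atom such as `func` or a nested list such as `(b X)`, and no character indexes need to be tracked.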
