Comparing Strings lexicographically new approach fails for one test case

I was asked to check whether String a is lexicographically larger than String b. Before even thinking about the compareTo() method, I came up with a new idea:
Take the minimum of the lengths of a and b.
Iterate with a for loop up to that minimum length and sum the ASCII values of the characters of a and of b separately.
Compare the two sums to print the result.
Here is my code:
private static void isInLexicographicOrder(String a, String b) {
char[] arr1 = a.toCharArray();
int asciCount1 = 0;
char[] arr2 = b.toCharArray();
int asciCount2 = 0;
long asciLength = (a.length() < b.length()) ? a.length() : b.length();
for(int i=0; i<asciLength; i++) {
asciCount1 += arr1[i];
asciCount2 += arr2[i];
}
if(asciCount1 < asciCount2) {
System.out.println("In Lexicographic Order");
}
else {
System.out.println("Not In Lexicographic Order");
}
}
It works fine for many inputs I provided. Then I found this link, String Comparison in Java, so for confirmation I used the compareTo() method in my code.
System.out.println((a.compareTo(b)) < 0 ? "In Lexicographic Order" : "Not In Lexicographic Order");
Now when I submit the code, the other website says that it fails for one test case.
Sample input:
vuut
vuuuuu
They want the output to be No, i.e. Not In Lexicographic Order. But both my logic and the compareTo() logic say In Lexicographic Order. So what's wrong? Is my logic completely correct?
This is the link where I got the question. Sorry if I'm wrong.

The compareTo method iterates over the characters of the two strings until it reaches a position where the two characters differ. The return value is the difference between the two char values at that position (or the difference of the lengths if one string is a prefix of the other).
Your implementation adds all the char values to a sum and compares the results of these additions.
Try your method with the values abcd and dcba. I expect the difference of your two sums to be 0 instead of a negative number.
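To make that concrete, here is a minimal sketch. The names `SumVersusCompareTo`, `sumOfChars` and `firstDifference` are made up for illustration, and `firstDifference` only mirrors the behaviour described above; it is not the JDK source.

```java
class SumVersusCompareTo {

    // Hypothetical helper: adds up the char values of the first n characters,
    // which is what the code in the question does for each string.
    static int sumOfChars(String s, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // Hypothetical helper: walks both strings to the first position where they
    // differ and returns the difference of the two chars at that position.
    // If one string is a prefix of the other, the length difference is returned.
    static int firstDifference(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return a.charAt(i) - b.charAt(i);
            }
        }
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        String a = "abcd";
        String b = "dcba";
        int n = Math.min(a.length(), b.length());
        // Same characters in a different order: the sums cannot tell them apart.
        System.out.println(sumOfChars(a, n) - sumOfChars(b, n)); // 0
        // The first differing position decides: 'a' comes before 'd'.
        System.out.println(firstDifference(a, b));               // -3
        System.out.println(a.compareTo(b));                      // -3
    }
}
```

The sum throws away the position of each character, while the position of the first difference is the only thing that matters for lexicographic order.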

Your logic is not correct. Comparing the sums of the characters is wrong, since "bab", "abb" and "bba" will have the same value, but that tells you nothing regarding which of them comes first lexicographically.
You should compare each pair of characters separately. The first time you encounter a pair of characters not equal to each other, the one with the lower value belongs to the String that should come first.
for(int i=0; i<asciLength; i++) {
if (arr1[i] > arr2[i]) {
System.out.println("Not In Lexicographic Order");
return;
} else if (arr1[i] < arr2[i]) {
System.out.println("In Lexicographic Order");
return;
}
}
// at this point we know that the Strings are either equal or one
// is fully contained in the other. The shorter String must come first
if (arr1.length <= arr2.length) {
System.out.println("In Lexicographic Order");
} else {
System.out.println("Not In Lexicographic Order");
}

Related

How to handle the time complexity for permutation of strings during anagrams search?

I have a program that computes that whether two strings are anagrams or not.
It works fine for inputs of strings below length of 10.
When I input two strings of equal length that are longer than 10 characters, the program runs and doesn't produce an answer.
My idea is that if two strings are anagrams, one string must be a permutation of the other.
This program generates all permutations of one string and then checks whether any of them matches the other string. I want the comparison to ignore case.
It returns false when no matching string is found or when the two strings differ in length; otherwise it returns true.
public class Anagrams {
static ArrayList<String> str = new ArrayList<>();
static boolean isAnagram(String a, String b) {
// there is no need for checking these two
// strings because their length doesn't match
if (a.length() != b.length())
return false;
Anagrams.permute(a, 0, a.length() - 1);
for (String string : Anagrams.str)
if (string.equalsIgnoreCase(b))
// returns true if there is a matching string
// for b in the permuted string list of a
return true;
// returns false if there is no matching string
// for b in the permuted string list of a
return false;
}
private static void permute(String str, int l, int r) {
if (l == r)
// adds the permuted strings to the ArrayList
Anagrams.str.add(str);
else {
for (int i = l; i <= r; i++) {
str = Anagrams.swap(str, l, i);
Anagrams.permute(str, l + 1, r);
str = Anagrams.swap(str, l, i);
}
}
}
public static String swap(String a, int i, int j) {
char temp;
char[] charArray = a.toCharArray();
temp = charArray[i];
charArray[i] = charArray[j];
charArray[j] = temp;
return String.valueOf(charArray);
}
}
1. I want to know why can't this program process larger strings
2. I want to know how to fix this problem
Can you figure it out?

To solve this problem and check whether two strings are anagrams you don't actually need to generate every single permutation of the source string and then match it against the second one. What you can do instead, is count the frequency of each character in the first string, and then verify whether the same frequency applies for the second string.
The solution above requires one pass for each string, hence Θ(n) time complexity. In addition, you need auxiliary storage for counting characters which is Θ(1) space complexity. These are asymptotically tight bounds.
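A minimal sketch of the counting idea, assuming case should be ignored as in the question. The class name `AnagramByCounting` is made up, and a `HashMap` is used for the counts; its size is bounded by the number of distinct characters, which is what the Θ(1) space bound refers to for a fixed alphabet.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class AnagramByCounting {

    // Returns true if a and b contain the same characters with the same
    // frequencies, ignoring case.
    static boolean isAnagram(String a, String b) {
        if (a == null || b == null) {
            return false;
        }
        String x = a.toLowerCase(Locale.ROOT);
        String y = b.toLowerCase(Locale.ROOT);
        if (x.length() != y.length()) {
            return false;
        }
        Map<Character, Integer> counts = new HashMap<>();
        // First pass: count every character of the first string.
        for (char c : x.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        // Second pass: use up the counts with the characters of the second string.
        for (char c : y.toCharArray()) {
            Integer left = counts.get(c);
            if (left == null || left == 0) {
                return false; // y has a character x lacks, or has more of it
            }
            counts.put(c, left - 1);
        }
        // Equal lengths and no count ran out, so every count is back at zero.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("Listen", "Silent")); // true
        System.out.println(isAnagram("aab", "abb"));       // false
    }
}
```

Each string is read once, so the running time grows linearly with the input length instead of factorially.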

You're doing it in a very expensive way. The time complexity here is factorial, because you generate every permutation of the string, and factorials grow very fast. That is why it takes so long to get any output once the input is longer than 10 characters.
11 factorial = 39916800
12 factorial = 479001600
13 factorial = 6227020800
and so on...
So don't think you will never get an output for big inputs; in principle you will eventually get it.
At something like 20-30 factorial, I think it would take years to produce any output, and since the program keeps every permutation in a list, it would run out of memory long before that.
Fact: 50 factorial is bigger than the number of sand grains on Earth, and computers give up when they have to enumerate that many cases.
That is why you are asked to include special characters in passwords: it makes the number of possible combinations so big that a computer would need years to crack the password by trying every one of them. Encryption also depends on that weakness of computers.
So you don't have to, and should not, do that to solve this problem; it is overkill.
Why don't you take each character from one string and match it with every character of the other string? It will be quadratic in the worst case.
And if you sort both strings, then you can just say
string1.equals(string2)
true means anagram
false means not anagram
and it will take linear time, not counting the time taken by sorting.

You can first get arrays of characters from these strings, then sort them, and then compare the two sorted arrays. This method works with both regular characters and surrogate pairs.
public static void main(String[] args) {
System.out.println(isAnagram("ABCD", "DCBA")); // true
System.out.println(isAnagram("𝗔𝗕𝗖𝗗", "𝗗𝗖𝗕𝗔")); // true
}
static boolean isAnagram(String a, String b) {
// invalid incoming data
if (a == null || b == null
|| a.length() != b.length())
return false;
char[] aArr = a.toCharArray();
char[] bArr = b.toCharArray();
Arrays.sort(aArr);
Arrays.sort(bArr);
return Arrays.equals(aArr, bArr);
}
See also: Check if one array is a subset of the other array - special case

What should be the logic of hashfunction() in order to check that two strings are anagrams or not?

I want to write a function that takes string as a parameter and returns a number corresponding to that string.
Integer hashfunction(String a)
{
//logic
}
Actually, the question I'm solving is as follows:
Given an array of strings, return all groups of strings that are anagrams. Represent a group by a list of integers representing the index in the original list.
Input : cat dog god tca
Output : [[1, 4], [2, 3]]
Here is my implementation :-
public class Solution {
Integer hashfunction(String a)
{
int i=0;int ans=0;
for(i=0;i<a.length();i++)
{
ans+=(int)(a.charAt(i));//Adding all ASCII values
}
return new Integer(ans);
}
// Obviously this approach is incorrect
public ArrayList<ArrayList<Integer>> anagrams(final List<String> a) {
int i=0;
HashMap<String,Integer> hashtable=new HashMap<String,Integer>();
ArrayList<Integer> mylist=new ArrayList<Integer>();
ArrayList<ArrayList<Integer>> answer=new ArrayList<ArrayList<Integer>>();
if(a.size()==1)
{
mylist.add(new Integer(1));
answer.add(mylist);
return answer;
}
int j=1;
for(i=0;i<a.size()-1;i++)
{
hashtable.put(a.get(i),hashfunction(a.get(i)));
for(j=i+1;j<a.size();j++)
{
if(hashtable.containsValue(hashfunction(a.get(j))))
{
mylist.add(new Integer(i+1));
mylist.add(new Integer(j+1));
answer.add(mylist);
mylist.clear();
}
}
}
return answer;
}
}

Oh boy... there's quite a bit of stuff that's open for interpretation here. Case-sensitivity, locales, characters allowed/blacklisted... There are going to be a lot of ways to answer the general question. So, first, let me lay down a few assumptions:
Case doesn't matter. ("Rat" is an anagram of "Tar", even with the capital lettering.)
Locale is American English when it comes to the alphabet. (26 letters from A-Z. Compare this to Spanish, which has 28 IIRC, among which 'll' is considered a single letter and a potential consideration for Spanish anagrams!)
Whitespace is ignored in our definition of an anagram. ("arthas menethil" is an anagram of "trash in a helmet" even though the number of whitespaces is different.)
An empty string (null, 0-length, all white-space) has a "hash" (I prefer the term "digest", but a name is a name) of 1.
If you don't like any of those assumptions, you can modify them as you wish. Of course, that will result in the following algorithm being slightly different, but they're a good set of guidelines that will make the general algorithm relatively easy to understand and refactor if you wish.
Two strings are anagrams if they are exhaustively composed of the same set of characters and the same number of each included character. There are a lot of tools available in Java that make this task fairly simple. We have String methods, Lists, Comparators, boxed primitives, and existing hashCode methods for... well, all of those. And we're going to use them to make our "hash" method.
private static int hashString(String s) {
if (s == null) return 1; // A null string returns 1, the same value an empty list hashes to.
List<Character> charList = new ArrayList<>();
String lowercase = s.toLowerCase(); // This gets us around case sensitivity
for (int i = 0; i < lowercase.length(); i++) {
Character c = Character.valueOf(lowercase.charAt(i));
if (Character.isWhitespace(c)) continue; // spaces don't count
charList.add(c); // Note the character for future processing...
}
// Now we have a list of Characters... Sort it!
Collections.sort(charList);
return charList.hashCode(); // See contract of java.util.List#hashCode
}
And voila; you have a method that can digest a string and produce an integer representing it, regardless of the order of the characters within. You can use this as the basis for determining whether two strings are anagrams of each other... but I wouldn't. You asked for a digest function that produces an Integer, but keep in mind that in Java, an Integer is merely a 32-bit value. This method can only produce about 4.3 billion unique values, and there are a whole lot more than 4.3 billion strings you can throw at it. This method can produce collisions and give you nonsensical results. If that's a problem, you might want to consider using BigInteger instead.
private static BigInteger hashString(String s) {
BigInteger THIRTY_ONE = BigInteger.valueOf(31); // You should promote this to a class constant!
if (s == null) return BigInteger.ONE; // An empty/null string will return 1.
BigInteger r = BigInteger.ONE; // The value of r will be returned by this method
List<Character> charList = new ArrayList<>();
String lowercase = s.toLowerCase(); // This gets us around case sensitivity
for (int i = 0; i < lowercase.length(); i++) {
Character c = Character.valueOf(lowercase.charAt(i));
if (Character.isWhitespace(c)) continue; // spaces don't count
charList.add(c); // Note the character for future processing...
}
// Now we have a list of Characters... Sort it!
Collections.sort(charList);
// Calculate our bighash, similar to how java's List interface does.
for (Character c : charList) {
int charHash = c.hashCode();
r=r.multiply(THIRTY_ONE).add(BigInteger.valueOf(charHash));
}
return r;
}

You need a number that is the same for all strings made up of the same characters.
The String.hashCode method returns a number that is the same for all strings made up of the same characters in the same order.
If you can sort all words consistently (for example: alphabetically) then String.hashCode will return the same number for all anagrams.
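// Arrays.sort sorts in place and returns void, so sort first, then hash.
char[] chars = inputString.toCharArray();
Arrays.sort(chars);
return String.valueOf(chars).hashCode();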
Note: this will work for all words that are anagrams (no false negatives) but it may not work for all words that are not anagrams (possibly false positives). This is highly unlikely for short words, but once you get to words that are hundreds of characters long, you will start encountering more than one set of anagrams with the same hash code.
Also note: this gives you the answer to the (title of the) question, but it isn't enough for the question you're solving. You need to figure out how to relate this number to an index in your original list.
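One way to relate anagrams to indexes, sketched under a few assumptions: the sorted string itself is used as the map key instead of its hash code (which sidesteps the false positives mentioned above), indexes are 1-based as in the question's sample output, and single words also form a group of one. The names `AnagramGroups`, `sortedKey` and `groupAnagrams` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AnagramGroups {

    // All anagrams of a word share the same sorted character sequence.
    static String sortedKey(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Groups the 1-based indexes of the words by their sorted key.
    static List<List<Integer>> groupAnagrams(List<String> words) {
        // LinkedHashMap keeps the groups in order of first appearance.
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < words.size(); i++) {
            groups.computeIfAbsent(sortedKey(words.get(i)), k -> new ArrayList<>())
                  .add(i + 1);
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("cat", "dog", "god", "tca");
        System.out.println(groupAnagrams(input)); // [[1, 4], [2, 3]]
    }
}
```

Each word is visited once, so the nested loop and the containsValue scan from the question are no longer needed.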

Java: I compare two Strings but it didn't recognize it

I have this problem:
I wrote this function because I need to get the index of the occurrence of a particular string st in a String array
static public int indicestring(String[] array, String st) {
int ret = -1;
for (int i = 0; i < array.length; i++){
if (st.equals(array[i])) {
ret=i;
break;
}
}
return ret;
}
I then called:
System.out.println(indicestring(NODO, "ET2"));
and I got the correct number.
But then when I do:
String[] arcos2 = linea.split("-");//reading from a file and separating by "-"
String aux = arcos2[1];
System.out.println(arcos2[1]);
System.out.println(aux);
if (aux.equals(arcos2[1])) {
System.out.println("Is equal 1");
}
if (aux.equals("ET2")) {
System.out.println("Is equal 2");
}
if ("ET2".equals(aux)) {
System.out.println("is equal 3");
}
The first two prints were ET2, but of the three ifs, only "Is equal 1" was printed. The thing is, I have nearly 200 nodes like "ET2" and only 3 are failing and giving me -1 in the first function.
My question is: am I using the arrays wrongly to save and compare the data? If aux=arcos2[1]="ET2", why is aux.equals("ET2") or arcos2[1].equals("ET2") not working? Is there another function you can recommend trying? (I tried replacing equals with compareTo() == 0 and that didn't work either; trimming was also recommended.)
Before, I had a similar error where I compare two arrays like this:
if(a[0] == b[0] && a[1] == b[1])
There was a case that was clearly correct but it was ignored...
But it got corrected when I changed it to:
if (Arrays.equals(a, b))
Is there maybe some change like that?

You should put a debug break point in the code and add expression watches to identify the root cause of the problem.
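If a debugger is not at hand, a rough alternative is to print the numeric value of every character. Strings read from a file can carry invisible characters (for example a trailing carriage return) that look identical when printed but make equals() fail. Whether that is the cause in the question is only a guess; the helper name `describe` and the simulated input are made up for illustration.

```java
class StringInspector {

    // Hypothetical helper: renders every char of s as a 4-digit hex code.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String fromFile = "ET2\r"; // simulated: a field that kept its carriage return
        System.out.println(fromFile.equals("ET2"));        // false
        System.out.println(describe(fromFile));            // 0045 0054 0032 000d
        System.out.println(fromFile.trim().equals("ET2")); // true
    }
}
```

Note that trim() only strips characters up to U+0020, so a byte order mark (U+FEFF) or a non-breaking space (U+00A0) would survive it and would still show up in the dump.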

Why will this java string routine not print the answer?

I have been working on Project Euler problem 4. I am new to Java, and believe I have found the answer (906609 = 993 * 913, by using Excel!).
When I print the lines that are commented out, I can see that my string manipulations have worked. I've researched a few ways to compare strings in case I had not understood something, but this routine doesn't give me a result.
Please help me identify why it is not printing the answer.
James
public class pall{
public static void main(String[] args){
int i;
int j;
long k;
String stringProd;
for(i=994;i>992; i--){
for (j=914;j>912; j--){
k=(i*j);
stringProd=String.valueOf(k);
int len=stringProd.length();
char[] forwards=new char[len];
char[] back = new char[len];
for(int l=0; l<len; l++){
forwards[l]=stringProd.charAt(l);
}
for(int m=0; m<len;m++){
back[m]=forwards[len-1-m];
}
//System.out.println(forwards);
//System.out.println(back);
if(forwards.toString().equals(back.toString())){
System.out.println(k);}
}
}
}
}

You are comparing the string representation of your array. toString() doesn't give you what you think. For example, the below code makes it clear:
char[] arr1 = {'a', 'b'};
char[] arr2 = {'a', 'b'};
System.out.println(arr1.toString() + " : " + arr2.toString());
this code prints:
[C@16f0472 : [C@18d107f
So, the string representations of the two arrays are different, even though the contents are equal. This is because arrays don't override the toString() method. They inherit the Object#toString() method.
The toString method for class Object returns a string consisting of
the name of the class of which the object is an instance, the at-sign
character @, and the unsigned hexadecimal representation of the hash
code of the object. In other words, this method returns a string equal
to the value of:
getClass().getName() + '@' + Integer.toHexString(hashCode())
So, in the above output, [C is the output of char[].class.getName(), and 18d107f is the hashcode.
You also can't compare the arrays using forwards.equals(back), as arrays in Java don't override equals() or hashCode() either. Any options? Yes, for comparing arrays you can use the Arrays#equals(char[], char[]) method:
if (Arrays.equals(forwards, back)) {
System.out.println(k);
}
Also, to get your char arrays, you don't need those loops. You can use the String#toCharArray() method. And to get the reverse of the String, you can wrap the string in a StringBuilder instance and use its reverse() method:
char[] forwards = stringProd.toCharArray();
char[] back = new StringBuilder(stringProd).reverse().toString().toCharArray();
And now that you have found an easy way to reverse a string, how about using the String#equals() method directly and not creating those character arrays at all?
String stringPod = String.valueOf(k);
String reverseStringPod = new StringBuilder(stringPod).reverse().toString();
if (stringPod.equals(reverseStringPod)) {
System.out.println(k);
}
Finally, since this is about Project Euler, which is mostly about speed and mathematics, you should consider avoiding String utilities and doing it with plain division and modulus arithmetic: extract the individual digits from the beginning and the end, and compare them.

To convert a string to char[] use
char[] forward = stringProd.toCharArray();
To convert a char[] to String, use String(char[]) constructor:
String backStr = new String(back); // Not the same as back.toString()
However, this is not the most performant solution, for several reasons:
You do not need to construct a back array to check if a string is a palindrome - you can walk the string from both ends, comparing the characters as you go, until you either find a difference or your indexes meet in the middle.
Rather than constructing a new array in a loop, you could reuse the same array - in case you do want to continue with an array, you could allocate it once for the maximum length of the product k, and use it in all iterations of your loop.
You do not need to convert a number to string in order to check if it is a palindrome - you can get its digits by repeatedly taking the remainder of division by ten, and then dividing by ten to go to the next digit.
Here is an illustration of the last point:
boolean isPalindrome(int n) {
int[] digits = new int[10];
if (n < 0) n = -n;
int len = 0;
while (n != 0) {
digits[len++] = n % 10;
n /= 10;
}
// Start two indexes from the opposite sides
int left = 0, right = len-1;
// Loop until they meet in the middle
while (left < right) {
if (digits[left++] != digits[right--]) {
return false;
}
}
return true;
}
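The first point in the list above (walking the string from both ends without building a reversed copy) can be sketched in the same way. The class name `PalindromeCheck` is made up for illustration.

```java
class PalindromeCheck {

    // Compares characters from both ends and stops at the first mismatch,
    // or when the two indexes meet in the middle.
    static boolean isPalindrome(String s) {
        int left = 0;
        int right = s.length() - 1;
        while (left < right) {
            if (s.charAt(left++) != s.charAt(right--)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long k = 993L * 913L; // the product mentioned in the question
        System.out.println(k + " " + isPalindrome(String.valueOf(k))); // 906609 true
    }
}
```

No extra array is allocated, and the loop ends early for most non-palindromes.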

Quicksort a string array with elements combining number and string

I'm doing my homework: I need to use quicksort to sort a String array whose elements combine numbers and a string. For example:
String s[];
s[0]="172,19,Nina";
s[1]="178,18,Apple";
s[2]="178,18,Alex";
So after sorting, it should be
s[0]=172,19,Nina
s[1]=178,18,Alex
s[2]=178,18,Apple
I'm thinking: should I first split all the Strings into numbers and a string, then sort 172, 178, 178, then sort 19, 18, 18, and at the end sort Nina, Apple, Alex?
What is the best way to do this?

If all your numbers have the same number of characters, lexicographic order is the same as numeric order, so you might just compare your Strings directly.
Else, you should split the strings and transform them into proper objects which implement the Comparable interface:
public class Record implements Comparable<Record> {
private int firstNumber;
private int secondNumber;
private String name;
...
@Override
public int compareTo(Record r) {
// Compare field by field; move on to the next field only on a tie.
int result = Integer.compare(firstNumber, r.firstNumber);
if (result == 0) {
result = Integer.compare(secondNumber, r.secondNumber);
}
if (result == 0) {
result = name.compareTo(r.name);
}
return result;
}
}

Yes, you are right: you need to separate the combined string into its elements and sort the array based on these separated elements. Note that you do not need to sort the full array by the first number, then by the second, etc.; instead, provide a comparison logic based on them. This is called lexical ordering.
Basically, when you need to decide whether an element is less than another, you have the following logic (pseudocode):
if elem1's first number < elem2's first number
then
return elem1 less than elem2
else if elem1's first number > elem2's first number
then
return elem1 greater than elem2
// from here on: elem1's first number == elem2's first number
else if elem1's second number < elem2's second number
then
return elem1 less than elem2
else if elem1's second number > elem2's second number
then
return elem1 greater than elem2
// from here on: elem1's second number == elem2's second number
else if elem1's third string < elem2's third string
then
return elem1 less than elem2
else if elem1's third string > elem2's third string
then
return elem1 greater than elem2
else // everything is the same
return elem1 equal elem2
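The pseudocode above could be turned into Java roughly as follows. This is a sketch under assumptions: every element has exactly the form "number,number,name", the quicksort uses a simple last-element pivot, and the names `RecordQuickSort`, `BY_FIELDS` and `quickSort` are made up for illustration.

```java
import java.util.Comparator;

class RecordQuickSort {

    // Compares two "number,number,name" strings field by field, moving on
    // to the next field only when the previous one is equal.
    static final Comparator<String> BY_FIELDS = (x, y) -> {
        String[] a = x.split(",");
        String[] b = y.split(",");
        int result = Integer.compare(Integer.parseInt(a[0]), Integer.parseInt(b[0]));
        if (result == 0) {
            result = Integer.compare(Integer.parseInt(a[1]), Integer.parseInt(b[1]));
        }
        if (result == 0) {
            result = a[2].compareTo(b[2]);
        }
        return result;
    };

    // In-place quicksort of arr[low..high] with the last element as pivot.
    static void quickSort(String[] arr, int low, int high) {
        if (low >= high) {
            return;
        }
        String pivot = arr[high];
        int i = low; // next slot for an element smaller than the pivot
        for (int j = low; j < high; j++) {
            if (BY_FIELDS.compare(arr[j], pivot) < 0) {
                swap(arr, i++, j);
            }
        }
        swap(arr, i, high); // put the pivot into its final position
        quickSort(arr, low, i - 1);
        quickSort(arr, i + 1, high);
    }

    static void swap(String[] arr, int i, int j) {
        String tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }

    public static void main(String[] args) {
        String[] s = {"172,19,Nina", "178,18,Apple", "178,18,Alex"};
        quickSort(s, 0, s.length - 1);
        for (String line : s) {
            System.out.println(line);
        }
    }
}
```

Splitting inside the comparator keeps the sketch short; parsing each string once into an object, as suggested in the other answer, avoids repeating that work on every comparison.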

We Keep Coding
