Java very long string compare seems not to work - java

Update:
Using your suggestions in checking between strings, I found out that the difference is on the arrangement of some fields since these strings are actually JSON strings.
Example:
the field username: johndoe#dummy.com on string1 is located at the beginning, but is located somewhere in the middle in string2.
I wonder if there is a way to check or compare 2 json objects regardless of the arrangement of their fields/properties... as long as their contents (field values) are the same.
What I tried:
private boolean sameJsonObject(Object o1, Object o2) throws IOException {
if (o1 == null || o2 == null) {
return false;
}
JsonNode json1 = mapper.readTree(mapper.writeValueAsString(o1));
JsonNode json2 = mapper.readTree(mapper.writeValueAsString(o2));
return json1.equals(json2);
}
this works but I am sure that this can be improved.
Original Problem:
I would like to check if two strings are equal, but these strings are so long that they cannot be assigned to a String variable as literals; the compiler reports "string constant too long".
I know that there is equals(), equalsIgnoreCase(), and StringUtils.equals(s1, s2) but none of these seems to work.
The strings that I am comparing come from two different sources, and comparing them manually gives the same result (I mean the contents are the same).
I tried to post the sample strings here but I can't. The size of each string to compare is more than 30k (170k each string).
btw, these strings are actual data (json) and they are really huge and I want to test its equality (content).
Is there a way to do the checking in java?
Thanks!

The simple answer is: compare the two strings char by char.
In other words: most likely, the built-in Java string compare methods are correct, leading to: your input strings aren't equal. It is extremely unlikely that equal strings result in equals() giving false.
Thus the reasonable first option is: change your comparison code so that it:
iterates the first string, char by char
fetches the corresponding char from the second string
compares those (either "full" equals, or ignoring case)
on mismatch: print out the index and the two differing chars (wrap each one, as in "<" + thatChar + ">", so that non-printable chars are noticeable, or print the numeric char code via (int) thatChar)
So, the answer here is basically to enable yourself to do proper debugging.
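The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the class name StringDiff and the method names firstMismatch and describe are made up for this example.

```java
final class StringDiff {
    /** Returns the index of the first differing char, or -1 if the strings are equal. */
    static int firstMismatch(String a, String b) {
        int common = Math.min(a.length(), b.length());
        for (int i = 0; i < common; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // One string is a prefix of the other: they differ where the shorter one ends.
        return a.length() == b.length() ? -1 : common;
    }

    /** Shows the char at index i together with its numeric code, so invisible chars stand out. */
    static String describe(String s, int i) {
        if (i >= s.length()) {
            return "<end of string>";
        }
        char c = s.charAt(i);
        return "<" + c + "> (code " + (int) c + ")";
    }

    public static void main(String[] args) {
        String s1 = "{\"a\": 1}";   // space after the colon
        String s2 = "{\"a\":\t1}";  // tab after the colon
        int i = firstMismatch(s1, s2);
        if (i < 0) {
            System.out.println("strings are equal");
        } else {
            System.out.println("index " + i + ": " + describe(s1, i) + " vs " + describe(s2, i));
        }
    }
}
```

For an ignore-case comparison, the chars could be passed through Character.toLowerCase() before comparing.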

btw, these strings are actual data (json) and they are really huge and
I want to test its equality (content).
If these are JSON data, don't compare them with String.
Use a JSON library (Jackson, Gson or anything else) to represent the data and also to compare it (equals() is generally overridden). It will compare the data more cleanly and more specifically, considering or ignoring things like whitespace, node order and so forth.
You can find some examples here.
You could consider in particular the JSONassert library from Skyscreamer, which provides multiple flavors of JSON comparison. For example, this JSONAssert.assertEquals() overload:
public static void assertEquals(org.json.JSONArray expected,
org.json.JSONArray actual,
JSONCompareMode compareMode)
throws org.json.JSONException
where you can specify a JSONCompareMode object such as JSONCompareMode.LENIENT.

170k is not too large for a String, though it is large for a string literal in source code.
Load your two strings from files that contain them, and compare in the normal way using equals.
You mention that the strings are not text but JSON. For most purposes, you'd want to normalize your json (make the whitespace, property order and punctuation the same).
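A rough sketch of loading the two values from files and comparing them with equals. The class and method names are hypothetical, UTF-8 is an assumption about the data, and the demo writes its own temp files only so that it runs standalone. It does not normalize JSON; it only shows that strings of this size are not a problem at runtime.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileStringCompare {
    /** Reads a whole file into a String (UTF-8 is an assumption about the data). */
    static String load(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes both values to temp files, loads them back and compares them. */
    static boolean equalAfterRoundTrip(String first, String second) {
        try {
            Path a = Files.createTempFile("first", ".json");
            Path b = Files.createTempFile("second", ".json");
            try {
                Files.write(a, first.getBytes(StandardCharsets.UTF_8));
                Files.write(b, second.getBytes(StandardCharsets.UTF_8));
                return load(a).equals(load(b));
            } finally {
                Files.deleteIfExists(a);
                Files.deleteIfExists(b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Build a 170,000-char string at runtime; no string literal limit applies here.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 170_000; i++) {
            sb.append((char) ('a' + i % 26));
        }
        String big = sb.toString();
        System.out.println(equalAfterRoundTrip(big, big));        // identical content
        System.out.println(equalAfterRoundTrip(big, big + " "));  // differs by a trailing space
    }
}
```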

Related

After converting hashcode into String I can't reconvert it to int in Java

I have a problem with the method that aims to print a hypothetical new hashCode for my class. I am trying to create for each object a hashcode that will be composed of the hashCodes of two String variables of my class. That's my code:
public void printHashCodes(){
String plateNumber = String.valueOf(this.liNumber.hashCode()) + String.valueOf(this.country.hashCode());
int hashCodeToConvert = Integer.valueOf(plateNumber);
System.out.println(hashCodeToConvert);
System.out.println(plateNumber);
}
Whenever I delete the "hashCodeToConvert" lines and print only the String called "plateNumber", the method works fine. From that I understand that converting the joined hashCodes to a String was successful.
Whenever I replace the content of the "plateNumber" String with some constant value like "123456789" the method works fine too so the code must be working well but in the form above it just doesn't work.
I assume there must be some limitation in conversion of potentially unlimited String into limited primitive int but what is the actual reason and what is the solution for that problem?
Clarification: My actual aim is to make a hashCode for an object, but I am using the printHashCodes method to verify those hashCodes. Since hashCode must be an int, I have to convert it back into an int. It cannot stay as a String (as far as I know).
I can see two reasons why your code is failing.
First, the result of String.valueOf(this.liNumber.hashCode()) + String.valueOf(this.country.hashCode()) could be too big.
The maximum value for an integer is 2147483647.
Secondly, the output of the hashCode() method could be negative, so the value of plateNumber could be something like 2147483647-1164645587, which cannot be parsed as an integer.
In both cases, it will throw a java.lang.NumberFormatException.
For me, the easiest solution would be to create a hashcode for an object that would contain these two values.
Or to use this:
return Objects.hash(liNumber, country);
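As a sketch of how that could look in a full class: the class name Plate is an assumption (the question does not name its class), while the fields liNumber and country come from the question. The helper concatenatedHashParses is hypothetical and only reproduces the failure described above.

```java
import java.util.Objects;

final class Plate {
    private final String liNumber;
    private final String country;

    Plate(String liNumber, String country) {
        this.liNumber = liNumber;
        this.country = country;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Plate)) return false;
        Plate other = (Plate) o;
        return Objects.equals(liNumber, other.liNumber)
                && Objects.equals(country, other.country);
    }

    @Override
    public int hashCode() {
        // Combines both fields into a single int; no String round trip needed.
        return Objects.hash(liNumber, country);
    }

    /** Reproduces the original approach: returns false when the joined digits do not parse as an int. */
    static boolean concatenatedHashParses(String a, String b) {
        String joined = String.valueOf(a.hashCode()) + String.valueOf(b.hashCode());
        try {
            Integer.parseInt(joined);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Plate("ABC123", "PL").hashCode());
        System.out.println(concatenatedHashParses("ABC123", "PL"));
    }
}
```

Overriding equals() together with hashCode() keeps the two consistent, which hash-based collections rely on.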

Java concatenate strings vs static strings

I try to get a better understanding of Strings. I am basically making a program that requires a lot of strings. However, a lot of the strings are very, very similar and merely require a different word at the end of the string.
E.g.
String one = "I went to the store and bought milk";
String two = "I went to the store and bought eggs";
String three = "I went to the store and bought cheese";
So my question is, what approach would be best suited to take when dealing with strings? Would concatenating 2 strings together have any benefits over just having static strings in, say for example, performance or memory management?
E.g.
String one = "I went to the store and bought ";
String two = "milk";
String three = "cheese";
String four = one + two;
String five = one + three;
I am just trying to figure out the most optimal way of dealing with all these strings. (If it helps to know the number of strings I am using, I currently have 50, but the number could grow by a huge amount.)
As spooky has said the main concern with the code is readability. Unless you are working on a program for a phone you do not need to manage your resources. That being said, it really doesn't matter whether you create a lot of Strings that stand alone or concatenate a base String with the small piece that varies. You won't really notice better performance either way.
You may set the opening sentence in a string like this
String openingSentence = "I went to the store and bought";
and define each word separately in an array of strings, like the following:
String[] thingsToBeBought = { "milk", "water", "cheese" .... };
then you can do foreach loop and concatenate each element in the array with the opening sentence.
In Java, if you concatenate two Strings (e.g. using '+') a new String is created, so the old memory needs to be garbage collected. If you want to concatenate strings, the correct way to do this is to use a StringBuilder or StringBuffer.
Given your comment about these strings really being URLs, you probably want to have a StringBuilder/StringBuffer that is the URL base, and then append the suffixes as needed.
Performance wise final static strings are always better as they are generated during compile time. Something like this
final static String s = "static string";
Non-static strings and strings concatenated as shown in the other example are generated at runtime. So even though performance will hardly matter for such a small thing, the second example is not as good as the first one performance-wise, as in your code:
// not as good performance wise since they are generated at runtime
String four = one + two;
String five = one + three;
Since you are going to use this string as a URL, I would recommend using StringJoiner (if you are using Java 8). It will be as efficient as StringBuilder (it will not create a new string every time you perform concatenation) and will automatically add "/" between strings.
StringJoiner myJoiner = new StringJoiner("/");
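A short sketch of that idea, assuming a hypothetical helper class UrlJoin and made-up URL segments. StringJoiner only inserts the separator; it does not encode or validate anything.

```java
import java.util.StringJoiner;

final class UrlJoin {
    /** Joins a base URL and path segments with "/" (no escaping or validation is done). */
    static String join(String base, String... segments) {
        StringJoiner joiner = new StringJoiner("/");
        joiner.add(base);
        for (String segment : segments) {
            joiner.add(segment);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String base = "http://host";  // hypothetical base URL
        for (String item : new String[] {"milk", "eggs", "cheese"}) {
            System.out.println(join(base, "store", item));
        }
    }
}
```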
There will be no discernible difference in performance, so the manner in which you go about this is more a matter of preference. I would likely declare the first part of the sentence as a String and store the individual purchase items in an array.
Example:
String action = "I went to the store and bought ";
String [] items = {"milk", "eggs", "cheese"};
for (int x = 0; x< items.length; x++){
System.out.println(action + items[x]);
}
Whether you declare every possible String or separate Strings to be concatenated isn't going to have any measurable impact on memory or performance in the example you give. In the extreme case of declaring truly large numbers of String literals, Java's native hash table of interned Strings will use more memory if you declare every possible String, because the table's cached values will be longer.
If you are concatenating more than 2 Strings using the + operator, you will be creating extra String objects to be GC'd. For example if you have Strings a = "1" and b = "2", and do String s = "s" + a + b;, Java will first create the String "s1" and then concatenate it to form a second String "s12". Avoid the intermediate String by using something like StringBuilder. (This wouldn't apply to compile-time declarations, but it would to runtime concatenations.)
If you happen to be formatting a String rather than simply concatenating, use a MessageFormat or String.format(). It's prettier and avoids the intermediate Strings created when using the + operator. So something like, String urlBase = "http://host/res?a=%s&b=%s"; String url = String.format(urlBase, a, b); where a and b are the query parameter String values.

Efficient data structure that checks for existence of String

I am writing a program which will add a growing number of unique strings to a data structure. Once this is done, I later need to constantly check for existence of the string in it.
If I were to use an ArrayList I believe checking for the existence of some specified string would iterate through all items until a matching string is found (or reach the end and return false).
However, with a HashMap I know that in constant time I can simply use the key as a String and return any non-null object, making this operation faster. However, I am not keen on filling a HashMap where the value is completely arbitrary. Is there a readily available data structure that uses hash functions, but doesn't require a value to be placed?
If I were to use an ArrayList I believe checking for the existence of some specified string would iterate through all items until a matching string is found
Correct, checking a list for an item is linear in the number of entries of the list.
However, I am not keen on filling a HashMap where the value is completely arbitrary
You don't have to: Java provides a HashSet<T> class, which is very much like a HashMap without the value part.
You can put all your strings there, and then check for presence or absence of other strings in constant time:
Set<String> knownStrings = new HashSet<String>();
... // Fill the set with strings
if (knownStrings.contains(myString)) {
...
}
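A self-contained version of the snippet above might look like this. The wrapper class KnownStrings and its method names are only for illustration.

```java
import java.util.HashSet;
import java.util.Set;

final class KnownStrings {
    private final Set<String> known = new HashSet<>();

    /** Returns true if the string was not already present. */
    boolean add(String value) {
        return known.add(value);
    }

    /** Expected constant-time membership check. */
    boolean contains(String value) {
        return known.contains(value);
    }

    public static void main(String[] args) {
        KnownStrings strings = new KnownStrings();
        strings.add("alpha");
        strings.add("beta");
        System.out.println(strings.contains("alpha"));
        System.out.println(strings.contains("gamma"));
    }
}
```

Set.add() already reports whether the element was new, so a separate contains() call before adding is not needed to keep the strings unique.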
It depends on many factors, including the number of strings you have to feed into that data structure (do you know the number in advance, or have a basic idea?), and what you expect the hit/miss ratio to be.
A very efficient data structure to use is a trie or a radix tree; they are basically made for that. For an explanation of how they work, see the wikipedia entry (a followup to the radix tree definition is in this page). There are Java implementations (one of them is here; however I have a fixed set of strings to inject, which is why I use a builder).
If your number of strings is really huge and you don't expect a minimal miss ratio then you might also consider using a bloom filter; the problem however is that it is probabilistic; but you can get very quick answers to "not there". Here also, there are implementations in Java (Guava has an implementation for instance).
Otherwise, well, a HashSet...
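To illustrate the probabilistic behaviour of a Bloom filter mentioned above, here is a deliberately tiny sketch built on java.util.BitSet. It is not Guava's implementation: the class TinyBloomFilter, its sizing, and the way the hash positions are derived from String.hashCode() are all assumptions made for this example.

```java
import java.util.BitSet;

final class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    TinyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Derives the i-th bit position from the string's hash code (simple double hashing). */
    private int index(String value, int i) {
        int h1 = value.hashCode();
        int h2 = (h1 >>> 16) | 1;  // odd step derived from the upper bits
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String value) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(value, i));
        }
    }

    /** false means "definitely not added"; true means "possibly added". */
    boolean mightContain(String value) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TinyBloomFilter filter = new TinyBloomFilter(1 << 16, 3);
        filter.add("alice");
        System.out.println(filter.mightContain("alice"));
        System.out.println(filter.mightContain("bob"));
    }
}
```

A string that was added always reports true, while a string that was never added may occasionally report true as well. That is the false-positive trade-off.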
A HashSet is probably the right answer, but if you choose (for simplicity, eg) to search a list it's probably more efficient to concatenate your words into a String with separators:
String wordList = "$word1$word2$word3$word4$...";
Then create a search argument with your word between the separators:
String searchArg = "$" + searchWord + "$";
Then search with, say, contains:
boolean wordFound = wordList.contains(searchArg);
You can maybe make this a tiny bit more efficient by using StringBuilder to build the searchArg.
As others mentioned HashSet is the way to go. But if the size is going to be large and you are fine with false positives (checking if the username exists) you can use BloomFilters (probabilistic data structure) as well.

if statement not working to filter empty names [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
If statement using == gives unexpected result
Hi I'm using this code to add elements to my ComboBox, and I do not want to add empty elements, here's the code:
public void elrendezesBetoltes(ArrayList<Elrendezes> ElrLista){
int i;
Elrendezes tmp;
model.removeAllElements();
model = new DefaultComboBoxModel(comboBoxItems);
for(i=0; i<ElrLista.size(); i++){
tmp = ElrLista.get(i);
if(tmp.getName()!="")comboBoxItems.add(tmp.getName()); //not working
addButton2(tmp.getSeatnum(),tmp.getCoord(),tmp.getFoglalt());
}
}
My problem is that the if statement is not working, it still adds empty names to my combobox. What am I doing wrong?
Always use equals method to compare Strings: -
if (tmp.getName()!="")
should be: -
if (!tmp.getName().equals(""))
or simply use this, if you want to check for empty string: -
if (!tmp.getName().isEmpty()) {
comboBoxItems.add(tmp.getName());
}
Use the equals method to compare strings. By using the != operator, you are comparing the string instances, which is always going to be true, as they (tmp.getName() and "") are not the same string instance.
Change
tmp.getName()!=""
to
!"".equals(tmp.getName())
Putting "" as first string in comparison will take care of your null scenario as well i.e. it will not break if tmp.getName() is null.
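One detail worth checking with a small sketch: !"".equals(name) does not throw for null, but it evaluates to true, so a null name would still pass the filter. The version below rejects both null and empty names. The class NameFilter and its methods are hypothetical and use a plain list instead of the combo box model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NameFilter {
    /** True only for names that are neither null nor empty. */
    static boolean hasName(String name) {
        return name != null && !name.isEmpty();
    }

    /** Keeps only the usable names, preserving their order. */
    static List<String> nonEmptyNames(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (hasName(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "", null, "Bela");
        System.out.println(nonEmptyNames(names));
        // The constant-first idiom is null-safe but lets null through:
        System.out.println(!"".equals(null));
    }
}
```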
Use equals():
if (!tmp.getName().equals(""))
Using == or != compares string references, not string contents. This is almost never what you want.
You have to compare Strings with "equals", then it will work:
if(!tmp.getName().equals(""))comboBoxItems.add(tmp.getName());
you are comparing for identity (==, !=) but each String instance has its own identity, even when they are equal.
So you need to do !tmp.getName().equals("").
Generally it is considered best practice to start with the constant string first, because it will never be null: !"".equals(tmp.getName())
However, I would recommend using Apache Commons Lang StringUtils. It has isNotEmpty() and isNotBlank() methods that take care of null handling and, in the case of isNotBlank(), of whitespace-only strings as well.
PS: sometimes identity will work for Strings. but it should not be relied upon as it is caused by compiler or jvm optimization due to String immutability.
Use String#isEmpty()
if(!tmp.getName().isEmpty())
OR:
if(!tmp.getName().equals(""))
Always, check String equality with equals method. == operator only checks if two references point to the same String object.
Another alternative if not on Java 6 and isEmpty is unavailable is this:
if (tmp.getName().length() > 0)
Checking the length is supposed to be quicker than using .equals, although to be honest the potential gain is so small that it's not worth worrying about.

Comparing two strings in Java [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Java String.equals versus ==
I know it's a dumb question, but why doesn't this code work?
boolean correct = "SampleText" == ((EditText)findViewById(R.id.editText1)).getText().toString();
if(correct) ((TextView)findViewById(R.id.textView1)).setText("correct!");
else ((TextView)findViewById(R.id.textView1)).setText("uncorrect!");
The point is to check whether the content of "editText1" is equal to "SampleText".
In Java, two strings (and in general, two objects) must be compared using equals(), not ==. The == operator tests for identity (meaning: testing if two objects are exactly the same in memory), whereas the method equals() tests two objects for equality (meaning: testing if two objects have the same value), no matter if they're two different objects. Almost always you're interested in equality, not in identity.
To fix your code, do this:
String str = ((EditText)findViewById(R.id.editText1)).getText().toString();
boolean correct = "SampleText".equals(str);
Also notice that it's a good practice to put the string literal first in the call to equals(), in this way you're safe in case the second string is null, avoiding a possible NullPointerException.
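The difference can be reproduced without Android. In this sketch, new String(...) stands in for text read from a widget, which is a distinct object even when its content matches. The class name EqualsVsIdentity and the method matches are made up for this example.

```java
final class EqualsVsIdentity {
    /** Literal-first comparison: safe even when input is null. */
    static boolean matches(String input) {
        return "SampleText".equals(input);
    }

    public static void main(String[] args) {
        // A separate String object with the same characters, like user-entered text.
        String typed = new String("SampleText");

        System.out.println("SampleText" == typed);  // identity: different objects
        System.out.println(matches(typed));         // equality: same characters
        System.out.println(matches(null));          // no NullPointerException
    }
}
```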
In Java Strings have to be compared with their equals() method:
String foo = "foo";
String bar = "bar";
if (foo.equals(bar)) System.out.println("correct");
else System.out.println("incorrect");
to compare the values for two strings (for equality), you need to use equals, not == (or use equalsIgnoreCase if you do not care about case sensitivity).
Using equals will check the contents/values of the strings (as opposed to "==" which will only check if the two variables point to the same object - not the same value).
The correct way to compare 2 objects in java is using equals() method of Object class
And as String is an object in java, it should be compared in same way.
The correct way to compare a String is with,
s1.equals(s2)
So you can use this,
boolean correct = "SampleText".equals(((EditText)findViewById(R.id.editText1)).getText().toString());
((TextView)findViewById(R.id.textView1)).setText("SampleText".equals(((EditText)findViewById(R.id.editText1)).getText().toString()) ? "correct!" : "incorrect!");
It's a bit long and there's probably a better way you could do this. The .toString() feels weird!
