This is a method I use:
public void addSong(LibrarySong song){
    // Check if duplicates exist
    for(int i = 0; i < intoPanel.getComponentCount(); i++){
        if(song.getName().toLowerCase().equals(intoPanel.getComponent(i).getName().toLowerCase()))
            return;
    }
    intoPanel.add(song);
}
.... to insert a new Component into a JPanel. I do this by checking whether the name already exists. This works well, but when I have to drag-and-drop or insert 100,000 items manually, it runs slower and slower.
My question is:
Can I use something better to make this process faster? Thanks.
Edit:
Following an answer, I have changed the code to this:
String name;

public void addSong(LibrarySong song){
    // Check if duplicates exist
    name = song.getName().toLowerCase();
    for(int i = 0; i < intoPanel.getComponentCount(); i++){
        if(name.equals(intoPanel.getComponent(i).getName().toLowerCase()))
            return;
    }
    intoPanel.add(song);
}
Move the song.getName().toLowerCase() call before the loop so that toLowerCase() runs only once.
UPDATE: Actually, it's not a good idea to add 100,000 components. Use e.g. a JList or JTable to represent the songs, with a custom CellRenderer.
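A minimal sketch of the model-based approach suggested above: the data lives in a DefaultListModel (a JList built on it only renders the visible rows, so 100,000 entries stay cheap compared to 100,000 components). The song names and the linear duplicate check mirror the original code; a HashSet of lower-cased names would make the check O(1).

```java
import javax.swing.DefaultListModel;

public class SongListDemo {
    // The model holds the data; new JList<>(model) would display it,
    // and a custom ListCellRenderer would control how each row looks.
    static final DefaultListModel<String> model = new DefaultListModel<>();

    static boolean addSong(String name) {
        // Linear duplicate check, kept for parity with the original code.
        String key = name.toLowerCase();
        for (int i = 0; i < model.getSize(); i++) {
            if (model.get(i).toLowerCase().equals(key)) return false;
        }
        model.addElement(name);
        return true;
    }

    public static void main(String[] args) {
        addSong("Song A");
        addSong("song a"); // duplicate, ignored
        System.out.println(model.getSize()); // prints 1
    }
}
```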
You can store all your values in a sorted collection. With a sorted array-backed list, binary search makes lookups O(log N); note that a plain linked list won't do here, because binary search needs random access. A balanced tree such as TreeSet gives O(log N) for both insert and search.
Seems like a lot of work, though, and for practical purposes a HashMap or HashSet is probably better.
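A minimal sketch of the sorted-collection idea using the standard TreeSet (the song names are made up):

```java
import java.util.TreeSet;

public class DuplicateCheckDemo {
    public static void main(String[] args) {
        // TreeSet keeps names sorted; contains/add run in O(log N).
        TreeSet<String> names = new TreeSet<>();
        System.out.println(names.add("yesterday"));              // true: inserted
        System.out.println(names.add("YESTERDAY".toLowerCase())); // false: duplicate
        // A HashSet would do the same membership test in O(1) on average.
        System.out.println(names.size()); // prints 1
    }
}
```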
First of all, don't add 100,000 items to your user interface. For the number of user-interface items you can add without confusing your users (a few hundred at most), the performance of your linear search will be more than sufficient.
If you want to store more than a few hundred songs you will have to think about both your data structures and the user interface.
You can gain much more, in both performance and user-friendliness, by choosing the right data structures and user interface. For the data structure you can use, for example, a HashMap<String, LibrarySong>. Then you can use the following code:
if (!map.containsKey(song.getName().toLowerCase())) {
    map.put(song.getName().toLowerCase(), song);
}
which runs in near-constant time.
For the user interface it is probably best just to let the user enter a name, and then search for the corresponding song in your data structure.
Have you considered using a database for your data storage?
I'm trying to create an Android app where you can learn hard words; it covers over 300 words. I'm wondering how I should store the data in Java.
I have a text file where all the words are. I have split the text so that I have one array with the words and another array with the definitions; they share the same index. In an activity, I want to make it as clean as possible, because sometimes I need to delete an index, and that's not efficient with an ArrayList since all the following elements need to move down.
PS: I really don't want to use a database like Firebase.
Instead of using two different arrays and trying to ensure that their order/indices are matched, you should consider defining your own class.
class Word {
    String wordName;
    String wordDefinition;
}
You can then make a collection of this using ArrayList or similar.
ArrayList<Word> wordList;
I know you were concerned about using an ArrayList due to the large number of words; however, I think for your use case an ArrayList is fine. Using a database is probably overkill, unless you want to put in the whole dictionary ;)
In any case, it is better to define your own class and use it as the element type of whichever collection you pick. This link may give you some ideas of other feasible data types:
https://developer.android.com/reference/java/util/Collections
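A short sketch of the ArrayList<Word> approach above; removeIf deletes by predicate, so you never juggle indices yourself (the example words are made up):

```java
import java.util.ArrayList;

public class WordListDemo {
    // The Word class from the answer above, with a convenience constructor.
    static class Word {
        String wordName;
        String wordDefinition;
        Word(String n, String d) { wordName = n; wordDefinition = d; }
    }

    public static void main(String[] args) {
        ArrayList<Word> wordList = new ArrayList<>();
        wordList.add(new Word("ephemeral", "lasting a very short time"));
        wordList.add(new Word("laconic", "using very few words"));
        // Delete by value instead of by index; for ~300 words the O(n)
        // shift inside ArrayList is negligible.
        wordList.removeIf(w -> w.wordName.equals("ephemeral"));
        System.out.println(wordList.size()); // prints 1
    }
}
```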
I personally would use a HashMap.
The reason is that you can set the key to be the word and the value to be its definition. You can then grab the definition of a word by doing something like:
// Returns the definition of word or null if the word isn't in the hashmap
hashMapOfWords.getOrDefault(word, null);
Check out this link for more details on HashMap:
https://developer.android.com/reference/java/util/HashMap
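For completeness, the getOrDefault fragment above in runnable form (the word and definition strings are made up):

```java
import java.util.HashMap;

public class DefinitionsDemo {
    public static void main(String[] args) {
        HashMap<String, String> hashMapOfWords = new HashMap<>();
        hashMapOfWords.put("terse", "brief and to the point");
        // O(1) average lookup; the default covers words not in the map.
        System.out.println(hashMapOfWords.getOrDefault("terse", null));    // prints the definition
        System.out.println(hashMapOfWords.getOrDefault("missing", null));  // prints null
    }
}
```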
For one of my school assignments, I have to parse GenBank files using Java. I have to store and retrieve the content of the files, together with the extracted information, while keeping the time complexity as small as possible. Is there a difference between using HashMaps and storing the data as records? I know that using HashMaps would be O(1), but the readability and immutability of records leads me to prefer them instead. The objects will be stored in an array.
This is my approach now:
public static GenBankRecord parseGenBankFile(File gbFile) throws IOException {
    try (var fileReader = new FileReader(gbFile); var reader = new BufferedReader(fileReader)) {
        String organism = null;
        List<String> contentList = new ArrayList<>();
        while (true) {
            String line = reader.readLine();
            if (line == null) break; // Breaking out if file end has been reached
            contentList.add(line);
            if (line.startsWith(" ORGANISM ")) {
                // Organism type found
                organism = line.substring(12); // Selecting the correct part of the line
            }
        }
        // Loop ended
        var content = String.join("\n", contentList);
        return new GenBankRecord(gbFile.getName(), organism, content);
    }
}
with GenBankRecord being the following:
record GenBankRecord(String fileName, String organism, String content) {
    @Override
    public String toString() {
        return organism;
    }
}
Is there a difference between using a record and a HashMap, assuming the key-value pairs are the same as the fields of the record?
String current_organism = gbRecordInstance.organism();
and
String current_organism = gbHashMap.get("organism");
I have to store and retrieve the content of the files together with the extracted information maintaining the smallest time complexity possible.
Firstly, I am somewhat doubtful that your teachers actually stated the requirement like that. It doesn't make a lot of sense to optimize just for time complexity.
Complexity is not efficiency.
Big O complexity is not about the value of the measure (e.g. time taken) itself. It is actually about how the measure (e.g. time taken) changes as some variable gets very large.
For example, HashMap.get(nameStr) and someRecord.name are both O(1) complexity.
But they are not equivalent in terms of efficiency. Using Java 17 record types or regular Java classes with named fields will be orders of magnitude faster than using a HashMap. (And it will use orders of magnitude less memory.)
Assuming that your objects have a fixed number of named fields, the complexity (i.e. how the performance changes with an ever-increasing number of fields) is not even relevant.
Performance is not everything.
The main differences between a HashMap and a record class are actually in the functionality that they provide:
A Map<String, SomeType> provides a set of name/value pairs where:
the number of pairs in the set is not fixed
the names are not fixed
the types of the values are all instances of SomeType or a subtype.
A record (or a classic class) can be viewed as a set of fieldname/value pairs where:
the number of pairs is fixed at compile time
the field names are fixed at compile time
the field types don't have to be subtypes of any single given type.
As #Louis Wasserman commented:
Records and HashMap are apples and oranges -- it doesn't really make sense to compare them.
So really, you should be choosing between records and hashmaps by comparing the functionality / constraints that they provide versus what your application actually needs.
(The problem description in your question is not clear enough for us to make that judgement.)
Efficiency concerns may be relevant, but it is a secondary concern. (If the code doesn't meet functional requirements, efficiency is moot.)
Is Complexity relevant to your assignment?
Well ... maybe yes. But not in the area that you are looking at.
My reading of the requirements is that one of them is that you be able to retrieve information from your in-memory data structures efficiently.
But so far you have been thinking about storing individual records. Retrieval implies that you have a collection of records and you have to (efficiently) retrieve a specific record, or maybe a set of records matching some criteria. So that implies you need to consider the data structure to represent the collection.
Suppose you have a collection of N records (or whatever) representing (say) N organisms:
If the collection is a List<SomeRecord>, you need to iterate the list to find the record for (say) "cat". That is O(N).
If the collection is a HashMap<String, SomeRecord> keyed by the organism name, you can find the "cat" record in O(1).
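The List-versus-HashMap retrieval contrast above can be sketched like this; the GenBankRecord stand-in mirrors the record in the question, and the file names and organism are made up:

```java
import java.util.HashMap;
import java.util.Map;

public class OrganismLookupDemo {
    // Minimal stand-in for the GenBankRecord from the question.
    record GenBankRecord(String fileName, String organism, String content) {}

    public static void main(String[] args) {
        // Keyed by organism name, so retrieval is O(1) on average,
        // instead of the O(N) scan a List would need.
        Map<String, GenBankRecord> byOrganism = new HashMap<>();
        GenBankRecord cat = new GenBankRecord("cat.gb", "Felis catus", "...");
        byOrganism.put(cat.organism(), cat);

        System.out.println(byOrganism.get("Felis catus").fileName()); // prints cat.gb
    }
}
```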
I have a Java assignment where we are to program a "database" of books and journals using the ArrayList class to store them as objects of type Reference.
One of the requirements from this assignment is that we split the titles of the books in the database and save them into a HashMap to allow for keyword searching later.
My hashmap is declared like this:
private HashMap<String, Reference> titles = new HashMap<String, Reference>(10);
I know through testing that the way I add titles and References to the HashMap works. What does not work is my search function.
private void searchBooks(String callNumber, String[] keywords, int startYear, int endYear) {
    Set<String> commonKeys = new HashSet<String>();
    for (int i = 0; i < keywords.length; i++) {
        commonKeys.add(keywords[i]);
    }
    titles.keySet().retainAll(commonKeys);
    System.out.println(titles);
}
This is code I've pieced together based on my own knowledge and on similar problems I've been able to find in this site's various threads.
Am I approaching this right? Is there something I'm missing?
Let's say you have two books. One is titled "Book One" and the other is titled "Book Two". If I understand your question correctly, you are populating the titles map with something like the below.
for each title
    split on " " character
    titles.put(title word, reference to the book)
If I got that right, the problem isn't with your searching code, but rather with the data structure of the titles object itself. If we run the two books through the pseudocode above, what does the map look like at the end?
Book -> Book Two
One -> Book One
Two -> Book Two
Now, if you search for "Book" you will get the behavior described.
How do you solve this? You need something more than a simple map. One option is a map of lists; in your case:
Map<String, List<Reference>>
You would populate this nearly the same way you already are, but instead of
titles.put(title word, reference to the book)
You would do:
if (titles.containsKey(title word)) {
    titles.get(title word).add(reference to the book);
} else {
    titles.put(title word, new list containing the reference to the book);
}
Another option would be the Multimap class from the GS Collections library (now Eclipse Collections), but the code above is a simple starting point.
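The containsKey/put dance above can also be written with computeIfAbsent, which creates the list on first use. A runnable sketch, with String standing in for the Reference class from the question:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TitleIndexDemo {
    public static void main(String[] args) {
        Map<String, List<String>> titles = new HashMap<>();
        for (String title : new String[] {"Book One", "Book Two"}) {
            for (String word : title.split(" ")) {
                // Creates an empty list for a new word, then appends;
                // existing lists are reused, so nothing is overwritten.
                titles.computeIfAbsent(word, k -> new ArrayList<>()).add(title);
            }
        }
        System.out.println(titles.get("Book")); // prints [Book One, Book Two]
        System.out.println(titles.get("One"));  // prints [Book One]
    }
}
```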
Your search function is also permanently damaging your titles HashMap. A search function should only read the map, but yours also alters it: after every search, retainAll wipes out all the entries that didn't match, so the next search won't find them. If you are adding a new entry after searching, this would explain perfectly why only one entry is found: all the others were wiped out in the previous search.
I'm new to Java, and as a learning project I would like to write a little vocabulary application, so that the user can test himself but also search for entries. However, I'm struggling to find the right data structure for this, and even after spending the last few days googling, I'm still at a loss.
Here is what I have in mind for my vocabulary object:
import java.io.*;

class Vocab implements Serializable {
    String lang1;
    String lang2;
    int rightAnswersInARow;    // to influence what to ask during testing
    int numberOfTimesSearched; // to influence search suggestions
    // ... plus the appropriate setter and getter methods.
}
Now for the testing: at first glance an ArrayList seems the most appropriate (choose a random number, then select the object at that index to test). But what if I also want to factor in rightAnswersInARow and ask entries with a low number more often? My approach would be to count the objects for each value, give each value an interval (e.g. the interval for rightAnswersInARow = 0 would be inflated by a factor of 3) and then select randomly from there.
But even if I go through the ArrayList each time, get each rightAnswersInARow and determine the intervals... how would I then map the drawn number back to the right index, given that the elements are not sorted? Would a TreeSet be more appropriate?
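The interval idea described above can be sketched without any sorting: sum the weights, draw a random point into the combined interval, then walk the list subtracting weights until the point falls inside one. The weight() rule here is hypothetical, tripling the chance for a streak of 0 as the question suggests:

```java
import java.util.List;
import java.util.Random;

public class WeightedPickDemo {
    // Hypothetical weighting: a streak of 0 right answers is asked
    // three times as often as any other streak.
    static int weight(int rightAnswersInARow) {
        return rightAnswersInARow == 0 ? 3 : 1;
    }

    static int pickIndex(List<Integer> streaks, Random rnd) {
        int total = 0;
        for (int s : streaks) total += weight(s);
        int r = rnd.nextInt(total);      // a point in the combined interval
        for (int i = 0; i < streaks.size(); i++) {
            r -= weight(streaks.get(i)); // walk intervals until r falls inside one
            if (r < 0) return i;
        }
        throw new AssertionError("unreachable");
    }

    public static void main(String[] args) {
        List<Integer> streaks = List.of(0, 2, 5); // rightAnswersInARow per entry
        System.out.println(pickIndex(streaks, new Random()));
    }
}
```

Because the walk visits the elements in whatever order they sit in the list, no sorting or TreeSet is needed for the selection itself.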
Searching for entries in both languages, and maybe even adding a dropdown list with suggested words (like Google's search box), would require finding the strings quickly (a HashMap?). Or maybe I could walk two or more TreeSets (one per language) to the first element that starts with the typed letters, then select the next few elements from there? But then the search would always suggest the same words, ignoring which words were searched for the most.
What would you suggest? Have a HashMap for each value pair and manually implement something like a relational database?
Thank you in advance! :)
I have two sets of data.
Let's say one is people and the other is groups.
A person can be in multiple groups, while a group can have multiple people.
My operations will basically be CRUD on groups and people,
as well as a method that makes sure a list of people are all in different groups (which gets called a lot).
Right now I'm thinking of making a table of binary 0s and 1s, with the rows representing people and the columns representing groups.
I can perform the method in O(n) time by adding each person's row of bits and comparing the sum with the bitwise OR of the same rows: the two are equal exactly when no group bit overlaps (no carries occur).
E.g.
Group  A B C D
ppl1   1 0 0 1
ppl2   0 1 1 0
ppl3   0 0 1 0
ppl4   0 1 0 0
check (ppl1, ppl2) = (1001 + 0110) == (1001 | 0110)
                   = 1111 == 1111
                   = true
check (ppl2, ppl3) = (0110 + 0010) == (0110 | 0010)
                   = 1000 == 0110
                   = false
I'm wondering if there is a data structure that already does something similar, so I don't have to write and maintain my own while keeping the O(n) runtime.
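For what it's worth, the standard library's java.util.BitSet implements the row-of-bits idea described above: one BitSet per person, one bit per group, and intersects() reports an overlap a word (64 bits) at a time. A sketch using the groups from the example table:

```java
import java.util.BitSet;

public class GroupOverlapDemo {
    public static void main(String[] args) {
        // One BitSet per person; bit i set = member of group i (A=0 .. D=3).
        BitSet ppl1 = new BitSet(); ppl1.set(0); ppl1.set(3); // groups A, D
        BitSet ppl2 = new BitSet(); ppl2.set(1); ppl2.set(2); // groups B, C
        BitSet ppl3 = new BitSet(); ppl3.set(2);              // group C

        // intersects() is true iff the two people share at least one group.
        System.out.println(!ppl1.intersects(ppl2)); // prints true: all groups distinct
        System.out.println(!ppl2.intersects(ppl3)); // prints false: both in group C
    }
}
```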
I don't know all of the details of your problem, but my gut instinct is that you may be overthinking things here. How many objects are you planning on storing in this data structure? If you have really large amounts of data to store, I would recommend using an actual database instead of a data structure. The operations you are describing are classical examples of things relational databases are good at. MySQL and PostgreSQL are examples of large-scale relational databases that could do this sort of thing in their sleep. If you'd like something lighter-weight, SQLite would probably be of interest.
If you do not have large amounts of data to store, I'd recommend keeping it simple and only optimizing once you are sure it won't be fast enough. As a first shot, I'd just use Java's built-in List interface to store your people and a Map to store groups. You could do something like this:
// Use a list to keep track of People
List<Person> myPeople = new ArrayList<Person>();
Person steve = new Person("Steve");
myPeople.add(steve);
myPeople.add(new Person("Bob"));
// Use a Map to track Groups
Map<String, List<Person>> groups = new HashMap<String, List<Person>>();
groups.put("Everybody", myPeople);
groups.put("Developers", Arrays.asList(steve));
// Does a group contain everybody?
groups.get("Everybody").containsAll(myPeople); // returns true
groups.get("Developers").containsAll(myPeople); // returns false
This definitely isn't the fastest option available, but if you do not have a huge number of People to keep track of, you probably won't notice any performance issues. If you do have special conditions that would make regular Lists and Maps unfeasible, please post them and we can make suggestions based on those.
EDIT:
After reading your comments, it appears I misread your issue on the first run-through. It looks like you're not so much interested in mapping groups to people as in mapping people to groups. What you probably want is something more like this:
Map<Person, List<String>> associations = new HashMap<Person, List<String>>();
Person steve = new Person("Steve");
Person ed = new Person("Ed");
associations.put(steve, Arrays.asList("Everybody", "Developers"));
associations.put(ed, Arrays.asList("Everybody"));
// This is the tricky part
boolean sharesGroups = checkForSharedGroups(associations, Arrays.asList(steve, ed));
So how do you implement the checkForSharedGroups method? In your case, since the numbers involved are pretty low, I'd just try the naive method and go from there.
public boolean checkForSharedGroups(
        Map<Person, List<String>> associations,
        List<Person> peopleToCheck) {
    List<String> groupsThatHaveMembers = new ArrayList<String>();
    for (Person p : peopleToCheck) {
        List<String> groups = associations.get(p);
        for (String s : groups) {
            if (groupsThatHaveMembers.contains(s)) {
                // We've already seen this group, so we can return
                return false;
            } else {
                groupsThatHaveMembers.add(s);
            }
        }
    }
    // If we've made it to this point, nobody shares any groups.
    return true;
}
This method probably doesn't have great performance on large datasets, but it is very easy to understand. Because it's encapsulated in its own method, it should also be easy to update if it turns out you need better performance. If you do, I would look at overriding the equals and hashCode methods of Person, which would make lookups in the associations map behave correctly and quickly. From there you could also look at a custom type instead of String for groups, again with overridden equals and hashCode; this would considerably speed up the contains call used above.
The reason I'm not too concerned about performance is that the numbers you've mentioned aren't really that big as far as algorithms are concerned. Because the method returns as soon as it finds two matching groups, in the very worst case you will call ArrayList.contains a number of times equal to the number of groups that exist; in the very best case it only needs to be called twice. Performance will likely only become an issue if you call checkForSharedGroups very, very often, in which case you might be better off finding a way to call it less often instead of optimizing the method itself.
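The equals/hashCode override suggested above might look like this; the Person class here is a hypothetical minimal version of the one assumed in the answer:

```java
import java.util.Objects;

public class PersonDemo {
    // Sketch of a Person whose equals and hashCode compare by name, so
    // HashMap lookups and List.contains work on logically equal objects.
    static class Person {
        final String name;
        Person(String name) { this.name = name; }

        @Override public boolean equals(Object o) {
            return o instanceof Person p && name.equals(p.name);
        }
        @Override public int hashCode() { return Objects.hash(name); }
    }

    public static void main(String[] args) {
        // Without the overrides, two distinct instances would compare unequal.
        System.out.println(new Person("Steve").equals(new Person("Steve"))); // prints true
    }
}
```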
Have you considered a HashTable? If you know all of the keys you'll be using, it's possible to use a Perfect Hash Function which will allow you to achieve constant time.
How about having two separate entities for People and Group, where each People holds a set of Groups and vice versa?
class People {
    Set<Group> groups;
    // API for addGroup, getGroup
}

class Group {
    Set<People> people;
    // API for addPeople, getPeople
}
check(People p1, People p2):
1) call getGroup on both p1 and p2
2) check the size of both sets
3) iterate over the smaller set, and check whether each group is present in the other set
Now you can store the People objects in basically any data structure: preferably a linked list if the size is not fixed, otherwise an array.
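The three check steps above are essentially what the standard Collections.disjoint does: it iterates one collection and probes the other for membership. A sketch with group names standing in for the Group objects:

```java
import java.util.Collections;
import java.util.Set;

public class SharedGroupDemo {
    public static void main(String[] args) {
        Set<String> p1Groups = Set.of("Everybody", "Developers");
        Set<String> p2Groups = Set.of("Everybody");

        // disjoint returns true iff the two people share no group.
        System.out.println(Collections.disjoint(p1Groups, p2Groups)); // prints false: share "Everybody"
        System.out.println(Collections.disjoint(
                Set.of("Developers"), Set.of("Testers")));            // prints true: no overlap
    }
}
```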