In my quest to continue my Java education, I'm trying to figure out whether there is a native Java method that quickly and efficiently looks up a string value in an ArrayList of arrays.
Here is my code that shows what I'm trying to do:
public void exampleArrayListofArray () {
ArrayList<String []> al = new ArrayList<>();
al.add(new String[] {"AB","YZ"});
al.add(new String[] {"CD","WX"});
al.add(new String[] {"EF","UV"});
al.add(new String[] {"GH","ST"});
al.add(new String[] {"IJ","QR"});
al.add(new String[] {"KL","OP"});
displayArrayListofArray(al);
}
public void displayArrayListofArray(List<String []> al) {
for (String [] row : al)
for (int column = 0; column <= 1 ; column ++){
System.out.println("Value at Index Row " + al.indexOf(row) +
" Column " + column + " is " + (row)[column]);
}
String lookUpString = "YZ";
lookUpMethod(al, lookUpString);
lookUpString = "ST";
lookUpMethod(al, lookUpString);
lookUpString = "IJ";
lookUpMethod(al, lookUpString);
lookUpString = "AA";
lookUpMethod(al, lookUpString);
}
public void lookUpMethod(List<String []> al, String lookUpString) {
boolean isStringFound = false;
for (String[] row : al) {
for (int column = 0; column <= 1; column++) {
if (lookUpString.equals(row[column])) { // compare contents with equals(), not references with ==
System.out.println("Index of '" + lookUpString + "': row " + al.indexOf(row) + ", column " + column);
isStringFound = true;
}
}
}
if (!isStringFound) {
System.out.println("Search string '" + lookUpString + "' does not exist.");
}
}
Is this the most efficient way of searching my ArrayList for a given string?
Is there anything that I should be doing to make my code more efficient (besides not using an ArrayList)?
I know there may be more efficient structures than an ArrayList for what I'm trying to do, such as a HashMap, but with my currently very limited Java knowledge I'm making progress with ArrayList and would have to start from scratch with a HashMap. The end goal of my code is to do the following:
Read an asset text file to load the ArrayList
Search the ArrayList for a user entered value
Do some calcs with the neighbouring values in the searched row
Allow the user to update the neighbouring values at the searched row
Allow the user to add a new row if the searched string is not found
Save any changes back to the asset text file in alphabetical order
Airfix
My answer is: don't worry.
I think you are looking at this from the wrong angle. If the users of your application report a "performance" issue, and profiling then shows that your current "search" code is the culprit (the single hot spot that kills end-user perceived performance), then you will have to bite the bullet and learn to use data structures other than ArrayList.
(side note there: in reality, Set/HashSet isn't much "different"; learning how to use them ... isn't as big of a deal as it might sound).
But: if you answered any of the above "questions" with "no" (like: you do not have users that complain about bad performance) ... then there is no point in worrying about performance.
Long story short: either performance is really an issue - then you have to solve it. Otherwise: don't try to fix something that is not broken.
(as said: from a learning perspective, I would still encourage you to save your code; and start a new version that uses sets. There are plenty of tutorials out there that explain all the things you need to know).
But just to give you some direction: your main "performance" killer is (as you thought yourself) the inappropriate use of data structures. There is no advantage in using an ArrayList to store arrays of strings that you want to search. That adds "two layers", each one requiring your code to iterate those "lists" in a sequential way. If you used a single Set (like HashSet) instead and added all your "search strings" to that set, your whole "lookup" for matches would boil down to asking that set: "do you contain this value?"
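A minimal sketch of the set-based lookup described above. The class and method names (SetLookupSketch, buildLookup) are made up for illustration; only HashSet and its contains method come from the standard library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetLookupSketch {
    // Flattens every cell of every row into a single HashSet.
    static Set<String> buildLookup(List<String[]> rows) {
        Set<String> values = new HashSet<>();
        for (String[] row : rows) {
            values.addAll(Arrays.asList(row));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"AB", "YZ"},
                new String[] {"CD", "WX"});
        Set<String> values = buildLookup(rows);
        System.out.println(values.contains("YZ")); // true
        System.out.println(values.contains("AA")); // false
    }
}
```

Note that a Set only answers whether a value is present. If the row holding the value is needed as well (for the neighbouring values mentioned in the question), a Map from each string to its row would be one option, at the cost of learning one more collection type.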
Related
I'm building a small app which auto translates boolean queries in Java.
This is the code to find if the query string contains a certain word and if so, it replaces it with the translated value.
int howmanytimes = originalValues.size();
for (int y = 0; y < howmanytimes; y++) {
String originalWord = originalValues.get(y);
System.out.println("original Word = " + originalWord);
if (toReplace.contains(" " + originalWord.toLowerCase() + " ")
|| toCheck.contains('"' + originalWord.toLowerCase() + '"')) {
toReplace = toReplace.replace(originalWord, translatedValues.get(y).toLowerCase());
System.out.println("replaced " + originalWord + " with " + translatedValues.get(y).toLowerCase());
}
System.out.println("to Replace inside loop " + toReplace);
}
The problem is when a query has, for example, '(mykeyword OR "blue mykeyword")' and the translated values are different, for example, mykeyword translates to elpalavra and "blue mykeyword" translates to "elpalavra azul". What happens in this case is that the result string will be '(elpalavra OR "blue elpalavra")' when it should be '(elpalavra OR "elpalavra azul")' . I understand that in the first loop it replaces all keywords and in the second it no longer contains the original value it should for translation.
How can I fix this?
Thank you
You can sort originalValues by length, descending, and then loop through them.
This way you replace "blue mykeyword" first and only afterwards replace "mykeyword".
It is not explained what the "toCheck" variable is for, and in any case the way it is used looks odd (to me at least).
Keeping that aside, one way to answer your request could be this (based only on the requirements you specified):
Sort your originalValues so that the entries with more words come first. Entries with the same number of words should be ordered from longest to shortest.
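The ordering above could be sketched as follows. This assumes originalValues and translatedValues are parallel lists of plain strings without surrounding quotes; the class and method names are hypothetical. Instead of sorting originalValues directly, it sorts a list of indices so that each original stays paired with its translation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LongestFirstReplace {
    static int wordCount(String s) {
        return s.trim().split("\\s+").length;
    }

    // Replaces multi-word (and longer) originals before shorter ones.
    static String translate(String query, List<String> originalValues, List<String> translatedValues) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < originalValues.size(); i++) {
            order.add(i);
        }
        // More words first; ties broken by character length, longest first.
        order.sort(Comparator
                .comparingInt((Integer i) -> wordCount(originalValues.get(i)))
                .thenComparingInt(i -> originalValues.get(i).length())
                .reversed());
        String result = query;
        for (int i : order) {
            result = result.replace(originalValues.get(i), translatedValues.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        String query = "(mykeyword OR \"blue mykeyword\")";
        List<String> originals = Arrays.asList("mykeyword", "blue mykeyword");
        List<String> translations = Arrays.asList("elpalavra", "elpalavra azul");
        // prints (elpalavra OR "elpalavra azul")
        System.out.println(translate(query, originals, translations));
    }
}
```

One limitation of this sketch: if a translation happens to contain a shorter original keyword, a later pass will replace it again. Avoiding that would need a single-pass approach, which is beyond what the answer proposes.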
I am trying to display a list of words that are similar to what the user is typing. For example, if I have a list of words like ["Software Eng 1", "Software Eng 2", "Blah"]
And the user typed S, it would filter to the Software Eng 1 and Software Eng 2. Again, the user types So and it filters to the same two words. But if the user types Soc, it would have nothing. What is the best way to do this? I tried
for (EmployeeName r : list)
{
if (textField.getText().matches(r.getName()))
{
System.out.println(r.getName() + " is similar");
}
else System.out.println("NOPE");
}
But this only seems to catch the case when textField.getText() is exactly the same as the name.
You could use String#startsWith if you're checking whether any employee's name in the list starts with the text entered.
Example:
for (EmployeeName r : list)
{
if (r.getName().startsWith(textField.getText()))
{
System.out.println(r.getName() + " is similar");
}
}
If you're looking to see whether the text entered is contained anywhere within the employee's name, then String#contains would do the job.
You could even use String#indexOf for the same check.
Example:
for (EmployeeName r : list)
{
if (r.getName().indexOf(textField.getText()) != -1)
{
System.out.println(r.getName() + " is similar");
}
}
You can use contains, for example (lower-casing both sides so the match ignores case):
r.getName().toLowerCase().contains(textField.getText().toLowerCase())
I might have overlooked some factors influencing the process, but that is why I seek help here. This is my first post here, and I have read the guidelines on writing a good question. I hope you will understand (otherwise, please leave a comment with further questions).
The case is that I have created an ArrayList:
ArrayList<String> liste = new ArrayList<String>();
I gather several names, quantities, and dates:
if(shepherd == 0) {
} else if(shepherd <= 0) {
System.out.println(shepherd);
String s = "('shepherd'," + "'" + shepherd + "'," +"'" + ft.format(date) + "'" + ")";
liste.add(s);
}
I have defined shepherd as follows:
double shepherd = 0;
Next, I wish to add these entries to my MySql database.
I construct a query, and print it out so that I can verify that it is of the correct format:
System.out.println("INSERT INTO kennel VALUES");
for(int i = 0; i < liste.size(); i++) {
System.out.println(liste.get(i));
if(i != liste.size()-1) {
System.out.println(",");
}
}
This shows the correct command, with the proper syntax, but it's only output to the console at this point.
I have to send this through JSch or Ganymed, most likely as a String. So I am wondering how I could take all the different parts (the doubles, the strings, the loop) and build a String identical to the line printed in the console.
I sensed it would look like this:
String command = (mysql -e "use kennel;insert into department3 values ('shepherd','1','2013-03-04');";
I believe that I am having some trouble with the " and ( and ' characters.
I hope I made it clear what the trouble is about. Thank you in advance. Sincerely
Your string needs to be enclosed in quotation marks. Because these clash with the quotation marks inside your String, you need to escape the inner ones by placing a backslash in front of each. :)
String command = "mysql -e \"use kennel;insert into department3 values ('shepherd','1','2013-03-04');\"";
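To build the whole command from the list rather than hard-coding one row, something like the following sketch could work. It assumes liste already holds entries in the ('name','qty','date') form shown in the question; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CommandBuilderSketch {
    // Joins the prepared value tuples with commas and wraps them in the mysql call.
    static String buildCommand(List<String> rows) {
        String values = String.join(",", rows);
        return "mysql -e \"use kennel;insert into department3 values " + values + ";\"";
    }

    public static void main(String[] args) {
        List<String> liste = new ArrayList<>();
        liste.add("('shepherd','1','2013-03-04')");
        liste.add("('collie','2','2013-03-05')"); // hypothetical second row
        System.out.println(buildCommand(liste));
    }
}
```

String.join replaces the manual loop that prints a comma between entries. The values are pasted into the command verbatim, so a value containing a quote character would break the command; that case is not handled here.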
I have a list of words in a file. They might contain words like who's, didn't etc. So when reading from it I need to make them proper like "who is" and "did not". This has to be done in Java. I need to do this without losing much time.
This is actually for handling such queries during a search that uses solr.
Below is a sample code I tried using a hash map
Map<String, String> con = new HashMap<String, String>();
con.put("'s", " is");
con.put("'d", " would");
con.put("'re", " are");
con.put("'ll", " will");
con.put("n't", " not");
con.put("'nt", " not");
String temp = null;
String str = "where'd you're you'll would'nt hello";
String[] words = str.split(" ");
int index = -1 ;
for(int i = 0;i<words.length && (index =words[i].lastIndexOf('\''))>-1;i++){
temp = words[i].substring(index);
if(con.containsKey(temp)){
temp = con.get(temp);
}
words[i] = words[i].substring(0, index)+temp;
System.out.println(words[i]);
}
If you are worried about queries containing e.g. "who's" failing to find documents containing e.g. "who is", then you should look at using a stemmer, which is designed exactly for this purpose.
You can easily add a stemmer by configuring it as a filter in your Solr config. See http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Edit:
A SnowballPorterFilterFactory will probably do the job for you.
Following on from @James Jithin's last remark:
the "'s" -> " is" transform is incorrect if the word is a possessive form.
the "'d" -> " would" transform is incorrect in archaic forms, where the "'d" can be a contraction of "ed".
the "'nt" -> " not" transform is not correct because this is really just a misspelling of the "n't" contraction. (I mean, "wo'nt" is just plain wrong, isn't it?)
So, to my mind, the best way to implement this would be to enumerate the small number of contractions that are common and valid, and leave the rest alone. This also has the advantage that you can implement it with a simple string match rather than a suffix match.
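A minimal sketch of that whole-word approach. The list of contractions below is a small, hypothetical sample chosen for illustration, not a vetted list, and the class name is made up.

```java
import java.util.HashMap;
import java.util.Map;

class ContractionExpander {
    // Hypothetical sample of whole-word contractions; extend as needed.
    private static final Map<String, String> CONTRACTIONS = new HashMap<>();
    static {
        CONTRACTIONS.put("who's", "who is");
        CONTRACTIONS.put("didn't", "did not");
        CONTRACTIONS.put("you're", "you are");
        CONTRACTIONS.put("you'll", "you will");
        CONTRACTIONS.put("wouldn't", "would not");
    }

    // Replaces only words that appear in the map; everything else is left alone.
    static String expand(String text) {
        String[] words = text.split(" ");
        for (int i = 0; i < words.length; i++) {
            words[i] = CONTRACTIONS.getOrDefault(words[i].toLowerCase(), words[i]);
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        // prints: who is there did not script's
        System.out.println(expand("who's there didn't script's"));
    }
}
```

Because each word is looked up whole, a possessive such as script's is untouched, and each lookup is a single hash access rather than a suffix scan.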
The code can be written as
Map<String, String> con = new HashMap<String, String>();
con.put("'s", " is");
con.put("'d", " would");
con.put("'re", " are");
con.put("'ll", " will");
con.put("n't", " not");
con.put("'nt", " not");
String str = "where'd you're you'll would'nt hello";
for(String key : con.keySet()) {
str = str.replaceAll(key + "\\b" , con.get(key));
}
using the logic you already have. But suppose the word is a possessive, such as script's: changing it to script is alters the meaning.
I have been playing around a bit with a fairly simple, home-made search engine, and I'm now twiddling with some relevancy sorting code.
It's not very pretty, but I'm not very good when it comes to clever algorithms, so I was hoping I could get some advice :)
Basically, I want each search result to get scoring based on how many words match the search criteria. 3 points per exact word and one point for partial matches
For example, if I search for "winter snow", these would be the results:
winter snow => 6 points
winter snowing => 4 points
winterland snow => 4 points
winter sun => 3 points
winterland snowing => 2 points
Here's the code:
String[] resultWords = result.split(" ");
String[] searchWords = searchStr.split(" ");
int score = 0;
for (String resultWord : resultWords) {
for (String searchWord : searchWords) {
if (resultWord.equalsIgnoreCase(searchWord))
score += 3;
else if (resultWord.toLowerCase().contains(searchWord.toLowerCase()))
score++;
}
}
Your code seems OK to me. I suggest a few small changes:
Since you are going through all possible combinations, you might as well do the toLowerCase() calls once at the start.
Also, once an exact match has occurred for a result word, you don't need to perform another equals for it.
result = result.toLowerCase();
searchStr = searchStr.toLowerCase();
String[] resultWords = result.split(" ");
String[] searchWords = searchStr.split(" ");
int score = 0;
for (String resultWord : resultWords) {
boolean exactMatch = false;
for (String searchWord : searchWords) {
if (!exactMatch && resultWord.equals(searchWord)) {
exactMatch = true;
score += 3;
} else if (resultWord.contains(searchWord))
score++;
}
}
Of course, this is a very basic level. If you are really interested in this area of computer science and want to learn more about implementing search engines start with these terms:
Natural Language Processing
Information retrieval
Text mining
stemming
For acronyms, case sensitivity is important (e.g. SUN); should a word that matches both content and case be weighted more than 3 points (5 or 7)?
Use the strategy design pattern.
For example, consider this naive score model:
interface ScoreModel {
int startingScore();
int partialMatch();
int exactMatch();
}
...
int search(String result, String searchStr, ScoreModel model) {
String[] resultWords = result.split(" ");
String[] searchWords = searchStr.split(" ");
int score = model.startingScore();
for (String resultWord : resultWords) {
for (String searchWord : searchWords) {
if (resultWord.equalsIgnoreCase(searchWord)) {
score += model.exactMatch();
} else if (resultWord.toLowerCase().contains(searchWord.toLowerCase())) {
score += model.partialMatch();
}
}
}
return score;
}
Basic optimization can be done by preprocessing your database: don't split entries into words on every search.
Build a word list for every entry when adding it to the DB (prefer a hash or binary tree to speed up lookups in the list), remove all words that are too short, lower-case the rest, and store this data for later use.
Do the same with the search string when a search starts (split, lower-case, clean up) and compare this word list against every entry's word list.
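The preprocessing idea could be sketched like this. The minimum word length of 3 and all names are assumptions for illustration; the scoring rule (3 for exact, 1 for partial) is taken from the question.

```java
import java.util.HashSet;
import java.util.Set;

class PreprocessedEntry {
    static final int MIN_WORD_LENGTH = 3; // assumed cut-off for "too short"

    // Done once per entry when it is stored, and once per search string.
    static Set<String> toWordSet(String text) {
        Set<String> words = new HashSet<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.length() >= MIN_WORD_LENGTH) {
                words.add(word);
            }
        }
        return words;
    }

    // Same scoring as the question, but on the prepared word sets.
    static int score(Set<String> entryWords, Set<String> searchWords) {
        int score = 0;
        for (String entryWord : entryWords) {
            for (String searchWord : searchWords) {
                if (entryWord.equals(searchWord)) {
                    score += 3;
                } else if (entryWord.contains(searchWord)) {
                    score++;
                }
            }
        }
        return score;
    }

    public static void main(String[] args) {
        Set<String> entry = toWordSet("Winter Snowing"); // stored with the entry
        Set<String> search = toWordSet("winter snow");   // built once per search
        System.out.println(score(entry, search)); // 4
    }
}
```

Storing words in a set drops duplicates, so an entry that repeats a word scores differently from the original loop. Whether that matters depends on the ranking you want.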
1) You can sort searchWords first. You could break out of the loop once your result word was alphabetically after your current search word.
2) Even better, sort both, then walk along both lists simultaneously to find where any matches occur.
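A sketch of the second suggestion, walking two sorted arrays in step. It only finds exact matches: substring ("partial") matches do not follow alphabetical order, so they would still need a separate scan. The class and method names are hypothetical.

```java
import java.util.Arrays;

class SortedMatchCounter {
    // Counts words that appear in both arrays by merging two sorted copies.
    static int countExactMatches(String[] resultWords, String[] searchWords) {
        String[] a = resultWords.clone();
        String[] b = searchWords.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int i = 0;
        int j = 0;
        int matches = 0;
        while (i < a.length && j < b.length) {
            int cmp = a[i].compareTo(b[j]);
            if (cmp == 0) {        // same word on both sides
                matches++;
                i++;
                j++;
            } else if (cmp < 0) {  // a[i] sorts first, so advance in a
                i++;
            } else {               // b[j] sorts first, so advance in b
                j++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String[] result = {"winter", "sun", "snow"};
        String[] search = {"snow", "winter"};
        System.out.println(countExactMatches(result, search)); // 2
    }
}
```

After sorting, the walk itself takes one pass over both arrays instead of comparing every pair.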
You can use regular expressions for finding patterns and lengths of matched patterns (for latter classification/scoring).
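One way this could look, as a rough sketch: find every word that contains the search word, then use the length of the matched text to decide whether the match was exact or partial. The 3/1 weights come from the question; the class and method names are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexScorer {
    // Scores one search word against a result string.
    static int score(String result, String searchWord) {
        // Match any run of word characters that contains the search word.
        Pattern pattern = Pattern.compile(
                "\\w*" + Pattern.quote(searchWord) + "\\w*",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(result);
        int score = 0;
        while (matcher.find()) {
            // Same length as the search word means the whole word matched.
            boolean exact = matcher.group().length() == searchWord.length();
            score += exact ? 3 : 1;
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.println(score("winterland snow", "snow")); // 3
        System.out.println(score("winter snowing", "snow"));  // 1
        System.out.println(score("winter sun", "snow"));      // 0
    }
}
```

Pattern.quote keeps any special characters in the search word from being read as regex syntax.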