Number and string sorting Al - java

I want to sort some number+string combination but the sorting will be based on the number from that combination. Can you suggest an optimal solution?
Say my strings are:
12 Masdf
4 Oasd
44 Twer
and so on. The sorting will be based on the numbers like 12, 4, 44 and after the sorting I have to show the full alphanumeric strings.
Because the program will run on thousands of records, I don't want to split each string and extract its number on every comparison. My plan is to extract the numbers into an array and sort that array. Once the sorting is done, I want to reattach each number to its associated string and keep the results in a string array for display.
It should be done in C++. Algorithms should be applied - Insertion sort, Quick sort, Merge sort, etc.

Create a class to store the full string and the number. Make the class Comparable. Convert your list of strings to a list of that class. Sort the list using whichever sort method is relevant. Iterate over the list and print the string fields.
Sorry, that was an answer for Java, since you tagged it Java. Replace Comparable with whatever fits in C++.
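A minimal Java sketch of that idea follows. The class name NumberedLine and the helper sortByNumber are made up for illustration, and the code assumes every string starts with an integer followed by a space. The number is parsed once per string, not once per comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class: pairs the parsed number with the original line.
class NumberedLine implements Comparable<NumberedLine> {
    final int number;   // numeric prefix, parsed once in the constructor
    final String text;  // the full original string

    NumberedLine(String text) {
        this.text = text;
        // Assumption: the number is everything before the first space.
        this.number = Integer.parseInt(text.substring(0, text.indexOf(' ')));
    }

    @Override
    public int compareTo(NumberedLine other) {
        return Integer.compare(number, other.number);
    }

    // Parses each line once, sorts by number, and returns the full strings.
    static List<String> sortByNumber(List<String> lines) {
        List<NumberedLine> parsed = new ArrayList<>();
        for (String line : lines) {
            parsed.add(new NumberedLine(line));
        }
        Collections.sort(parsed); // uses compareTo above
        List<String> result = new ArrayList<>();
        for (NumberedLine n : parsed) {
            result.add(n.text);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [4 Oasd, 12 Masdf, 44 Twer]
        System.out.println(sortByNumber(Arrays.asList("12 Masdf", "4 Oasd", "44 Twer")));
    }
}
```

Collections.sort stands in for the hand-written insertion, quick or merge sort the question asks for; any of those can be run on the parsed list using compareTo.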

I am going to assume these two parts are in separate variables and are not together as one string (if they were you could just store them in a list).
First consider a Map. Each 'bucket' of the map is keyed by a number, and each bucket holds a list of strings. (Note: this could also be solved with an array, especially if the integer part is always below some fixed value.) The Java equivalent would look like:
Map<Integer, List<String>> map = new HashMap<>();
To look up a value in this custom collection, first search the map for the integer part, which returns a list. Every item in that list has the same starting number. Then search the list for the string part of the value (I am assuming the list is sorted, so you can use whatever sort you want, e.g. selection sort or quicksort).
The advantage of this approach is that if the number is not found in the HashMap, you know immediately that there is no string part for it.
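The bucket idea could be sketched roughly as below. The class NumberBuckets and its method names are hypothetical, and the number and the string are assumed to arrive as separate values, as this answer supposes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper around the Map<Integer, List<String>> described above.
class NumberBuckets {
    // number -> all strings that were stored with that number
    private final Map<Integer, List<String>> buckets = new HashMap<>();

    void add(int number, String text) {
        buckets.computeIfAbsent(number, k -> new ArrayList<>()).add(text);
    }

    // True only if the number has a bucket and that bucket holds the string.
    boolean contains(int number, String text) {
        List<String> bucket = buckets.get(number);
        if (bucket == null) {
            return false; // unknown number: no string needs to be examined
        }
        return bucket.contains(text);
    }

    // All entries ordered by number; strings inside a bucket keep insertion order.
    List<String> sortedEntries() {
        List<Integer> keys = new ArrayList<>(buckets.keySet());
        Collections.sort(keys);
        List<String> result = new ArrayList<>();
        for (Integer key : keys) {
            for (String text : buckets.get(key)) {
                result.add(key + " " + text);
            }
        }
        return result;
    }
}
```

Only the distinct numbers are sorted here, so many strings sharing a few numbers keep the sort small.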

Related

How to calculate domino-chain for integer pairs?

The problem I'm facing is more of an algorithmic nature.
Let's say that I have a list of pair objects containing integers. Is there a way to sort the list so that the second part of the pair is equal to first part of the next pair?
For instance given this list of pairs:
A = Pair(2,1),Pair(2,3),Pair(1,3).
After sorting the list becomes:
A = Pair(1,3), Pair(3,2),Pair(2,1).
As you can see it is allowed to change the order of values inside the pair like the Pair(2,3) which became Pair(3,2).
I thought about using the Comparator or Comparable interfaces, but they don't cover complex cases like the one above.

Data Structure choices based on requirements

I'm completely new to programming and to Java in particular, and I am trying to determine which data structure to use for a specific situation. Since I'm not familiar with data structures in general, I have no idea what each structure does or what its limitations are.
So I have a CSV file with a bunch of items in it, let's say characters and matching numbers. My list looks like this:
A,1,B,2,B,3,C,4,D,5,E,6,E,7,E,8,E,9,F,10......etc.
I need to be able to read this in, and then:
1)display just the letters or just the numbers sorted alphabetically or numerically
2)search to see if an element is contained in either list.
3)search to see if an element pair (for example A - 1 or B-10) is contained in the matching list.
Think of it as an excel spreadsheet with two columns. I need to be able to sort by either column while maintaining the relationship and I need to be able to do an IF column A = some variable AND the corresponding column B contains some other variable, then do such and such.
I need to also be able to insert a pair into the original list at any location. So insert A into list 1 and insert 10 into list 2 but make sure they retain the relationship A-10.
I hope this makes sense and thank you for any help! I am working on purchasing a Data Structures in Java book to work through and trying to sign up for the class at our local college, but it's only offered every spring...
You could use two sorted Maps such as TreeMap.
One would map Characters to numbers (Map<Character,Number> or something similar). The other would perform the reverse mapping (Map<Number, Character>)
Let's look at your requirements:
1)display just the letters or just the numbers sorted alphabetically or numerically
Just iterate over one of the maps. The iteration will be ordered.
2)search to see if an element is contained in either list.
Just check the corresponding map. Looking for a number? Check the Map whose keys are numbers.
3)search to see if an element pair (for example A - 1 or B-10) is contained in the matching list.
Just get() the value for A from the Character map, and check whether that value is 10. If so, then A-10 exists. If there's no value, or the value is not 10, then A-10 doesn't exist.
When adding or removing elements you'd need to take care to modify both maps to keep them in sync.
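A rough sketch of the two-map approach follows. The class PairIndex and its methods are hypothetical. One deliberate deviation from the plain Map<Character, Number> above: the sample data repeats letters (B,2 and B,3), so this sketch maps each letter to a sorted set of numbers and assumes each number belongs to at most one letter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical class keeping the two sorted maps in sync.
class PairIndex {
    // letter -> its numbers (a set, because letters repeat in the sample data)
    private final TreeMap<Character, TreeSet<Integer>> byLetter = new TreeMap<>();
    // number -> its letter (assumption: each number appears at most once)
    private final TreeMap<Integer, Character> byNumber = new TreeMap<>();

    // Inserts a pair, updating both maps so they never disagree.
    void add(char letter, int number) {
        Character old = byNumber.put(number, letter);
        if (old != null && old != letter) {
            // The number moved to a new letter: drop the stale pair.
            TreeSet<Integer> stale = byLetter.get(old);
            stale.remove(number);
            if (stale.isEmpty()) {
                byLetter.remove(old);
            }
        }
        byLetter.computeIfAbsent(letter, k -> new TreeSet<>()).add(number);
    }

    // Requirement 1: each column on its own, in sorted order.
    List<Character> letters() { return new ArrayList<>(byLetter.keySet()); }
    List<Integer> numbers() { return new ArrayList<>(byNumber.keySet()); }

    // Requirement 2: membership in either column.
    boolean containsLetter(char letter) { return byLetter.containsKey(letter); }
    boolean containsNumber(int number) { return byNumber.containsKey(number); }

    // Requirement 3: does the pair letter-number exist?
    boolean containsPair(char letter, int number) {
        Character owner = byNumber.get(number);
        return owner != null && owner == letter;
    }
}
```

Because TreeMap keeps its keys ordered, "insert at any location" reduces to calling add; the pair lands in its sorted position in both views.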

Searching for a set of Strings contain a particular string from ArrayList in Java

Is there a fast algorithm to search an ArrayList of Strings for a particular string?
For example:
I have an ArrayList:
{"white house","yellow house","black door","house in heaven","wife"}
And I want to find the strings that contain "house".
It should return {"white house","yellow house","house in heaven"}, but in minimum time.
My problem is dealing with big data (a list of about 167000 strings) without an index.
Thanks!
There are two answers to your question, depending on whether you are planning to run multiple queries or not:
If you need to run the query only once, you are out of luck: you must search the entire array from the beginning to the end.
If you need to run a significant number of queries, you can reduce the amount of work by building an index.
Make a data structure Map<String,List<String>>, go through the strings in your List<String>, and split them into words. For each word on the list of tokens, add the original string to the corresponding list.
This operation runs in O(N*W), where N is the number of long strings and W is the average number of words per string. With such a map in hand, a whole-word query becomes a single hash lookup, O(1) on average.
Note that this approach pays off only when the number of queries significantly exceeds the average number of words in each string. For example, if your strings have ten words on the average, and you need to run five to eight queries, a linear search would be faster.
I agree with Josh Engelsma. Iterating over the list and checking each string one by one is the simplest way. And 167000 strings is not really big data, unless each String in the List is quite long. A linear search can finish in only a few seconds on a normal PC.
Following the usual coding conventions, the code might look like this:
for (String s : list) {
    if (s.contains("house")) {
        // do something with s
    }
}
If the search will be performed many times on the same list with different keywords, you can build a reverse index to speed it up.
In your example:
{"white house","yellow house","black door","house in heaven","wife"}
You could pre-process the list, separate each sentence into words, and build an index like:
"house" --> {0,1,3}
"white" --> {0}
"yellow" --> {1}
...
which means "house" is contained in elements 0, 1 and 3 of the list, and so on. The index can be implemented with a HashMap:
Map<String, List<Integer>> index = new HashMap<>();
Ideally, the search operation is then sped up to O(1) complexity.
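Building and querying such an index might look like the sketch below. The class WordIndex is hypothetical, and the sketch assumes words are separated by whitespace and that queries are whole, case-sensitive words; a query such as "hous" would not match "house", unlike String.contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index from each word to the positions of the strings containing it.
class WordIndex {
    private final List<String> strings;
    private final Map<String, List<Integer>> index = new HashMap<>();

    WordIndex(List<String> strings) {
        this.strings = strings;
        for (int i = 0; i < strings.size(); i++) {
            // Assumption: words are separated by whitespace.
            for (String word : strings.get(i).split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // produced by leading whitespace
                }
                List<Integer> positions = index.computeIfAbsent(word, k -> new ArrayList<>());
                // Skip the duplicate when a word repeats inside one string.
                if (positions.isEmpty() || positions.get(positions.size() - 1) != i) {
                    positions.add(i);
                }
            }
        }
    }

    // Returns every indexed string that contains the given whole word.
    List<String> search(String word) {
        List<String> result = new ArrayList<>();
        for (int position : index.getOrDefault(word, Collections.emptyList())) {
            result.add(strings.get(position));
        }
        return result;
    }

    public static void main(String[] args) {
        WordIndex idx = new WordIndex(Arrays.asList(
                "white house", "yellow house", "black door", "house in heaven", "wife"));
        // prints [white house, yellow house, house in heaven]
        System.out.println(idx.search("house"));
    }
}
```

The map lookup itself is O(1) on average; copying the matches into the result list still costs time proportional to the number of hits.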

Efficient way to get elements out of a treeset with different sorting criteria

I have a TreeSet containing student objects (name, roll number, address and age). They are stored in ascending order of name and, if the names are the same, of roll number. This list comes from a file and could get really big.
Now I have to provide a way to display the list in any order: sorted ascending or descending by name, age, address or roll number. I am looking for an efficient solution to my problem.
What I am thinking of doing is to take a temporary ArrayList and get the elements into it in the order I want. But for this I will have to implement a different method for every criterion, and this looks inefficient to me.
Is there any way I can get the elements out of the TreeSet into an array in the order I want? I just need to print the values and destroy the temporary list afterwards.
The only way to do it is, as you said, iterating over the tree and pulling out all of the matching elements into an ArrayList. Once you've done that you can sort based on a particular Comparator.
If you just want to pull out elements based on the natural ordering, you can use the subSet method, but that depends on the compareTo method (or the Comparator) the TreeSet was built with, which is not valid for all the different orderings that you want.
Given that, why are you using a TreeSet in the first place? Do your elements have a natural ordering that the TreeSet leverages? If not why not just dump them all into an ArrayList and sort the ArrayList as necessary?
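The copy-and-sort approach does not need a separate method per criterion; one comparator per field is enough. A sketch follows, in which the Student class and its field names are assumed from the question and sortedCopy is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical Student class; the fields are assumed from the question.
class Student {
    final String name;
    final int rollNumber;
    final String address;
    final int age;

    Student(String name, int rollNumber, String address, int age) {
        this.name = name;
        this.rollNumber = rollNumber;
        this.address = address;
        this.age = age;
    }

    // One comparator per criterion; calling reversed() gives descending order.
    static final Comparator<Student> BY_NAME =
            Comparator.comparing((Student s) -> s.name).thenComparingInt(s -> s.rollNumber);
    static final Comparator<Student> BY_ROLL = Comparator.comparingInt(s -> s.rollNumber);
    static final Comparator<Student> BY_ADDRESS = Comparator.comparing(s -> s.address);
    static final Comparator<Student> BY_AGE = Comparator.comparingInt(s -> s.age);

    // Copies the elements into a temporary list and sorts it; the source is untouched.
    static List<Student> sortedCopy(Collection<Student> students, Comparator<Student> order) {
        List<Student> copy = new ArrayList<>(students);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        TreeSet<Student> set = new TreeSet<>(BY_NAME);
        set.add(new Student("Bob", 2, "Elm St", 20));
        set.add(new Student("Alice", 1, "Oak St", 22));
        set.add(new Student("Carol", 3, "Ash St", 19));
        // Oldest first: Alice, Bob, Carol
        for (Student s : sortedCopy(set, BY_AGE.reversed())) {
            System.out.println(s.name + " " + s.age);
        }
    }
}
```

Each display costs one O(n) copy plus an O(n log n) sort, and the temporary list becomes garbage as soon as printing finishes.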

Comparator for TreeBag to sort by the number of occurrences

I have a source of strings (let us say, a text file) and many strings repeat multiple times. I need to get the top X most common strings in the order of decreasing number of occurrences.
The idea that came to mind first was to create a sortable Bag (something like org.apache.commons.collections.bag.TreeBag) and supply a comparator that will sort the entries in the order I need. However, I cannot figure out what is the type of objects I need to compare. It should be some kind of an internal map that combines my object (String) and the number of occurrences, generated internally by TreeBag. Is this possible?
Or would I be better off simply using a HashMap and sorting it by value as described in, for example, Java sort HashMap by value?
Why don't you put the strings in a map? In step 1, build a map from each string to the number of times it appears in the text.
In step 2, traverse the items in the map and keep adding them to a min-heap of size X. When the heap is full, compare the new item with the minimum and replace the minimum only if the new item's count is larger.
This takes O(n log x) time.
Otherwise, after step 1, sort the items by number of occurrences and take the first x items. A TreeMap would come in handy here :) (I'd add a link to the javadocs, but I'm on a tablet.)
This takes O(n log n) time.
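The two steps of the heap approach can be sketched as follows. The class TopStrings and the method topX are hypothetical names. The sketch inserts first and then removes the minimum whenever the heap exceeds x entries, which has the same effect as the compare-then-replace rule; the order among strings with equal counts is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: top-x most frequent strings via a bounded min-heap.
class TopStrings {
    static List<String> topX(List<String> strings, int x) {
        // Step 1: count the occurrences of each string.
        Map<String, Integer> counts = new HashMap<>();
        for (String s : strings) {
            counts.merge(s, 1, Integer::sum);
        }

        // Step 2: min-heap ordered by count, never holding more than x entries.
        PriorityQueue<Map.Entry<String, Integer>> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a.getValue(), b.getValue()));
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            heap.offer(entry);
            if (heap.size() > x) {
                heap.poll(); // evict the entry with the smallest count
            }
        }

        // The heap yields ascending counts, so reverse for most-common-first.
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        // prints [a, b]
        System.out.println(topX(Arrays.asList("a", "b", "a", "c", "a", "b"), 2));
    }
}
```

With n distinct strings, each heap operation costs O(log x), which gives the O(n log x) bound mentioned above on top of the counting pass.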
With Guava's TreeMultiset, just use Multisets.copyHighestCountFirst.
