I have to write a program that sorts a collection of songs by running time. Each song has a "Title" string, a "Composer" string, and a "Running Time" integer. Input will be piped through stdin, and output will go to stdout.
Here's an example input:
3
&
Pink Frost&Phillipps, Martin&234933
Se quel guerrier io fossi&Puccini, Giacomo&297539
Non piu andrai&Mozart&234933
M'appari tutt'amor&Flotow, F&252905
And output:
Se quel guerrier io fossi&Puccini, Giacomo&297539
M'appari tutt'amor&Flotow, F&252905
Non piu andrai&Mozart&234933
I know I have to sort these by Running Time, but I'm not sure which sorting algorithm to use. From general knowledge, the two sorting algorithms that come to mind are merge sort and quicksort, because they seem to be the quickest on average. I also have the idea of using a Comparator to compare two "Running Time" elements in a Collection.
Could someone please point me in the right direction?
The easiest way is to write a class to hold the above values which implements the Comparable interface (or you could write your own Comparator). The compareTo method can check the running time and return a value accordingly.
Then pass it to the Collections.sort() method. This method uses an optimized version of merge sort. You don't have to write your own sorting logic this way, and you can rely on the Java platform to do it for you. Unless you need specific performance tuning of the sorting method, I guess this is the simplest way to go (KISS - Keep It Simple, Stupid).
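A minimal sketch of that approach (the Song class and its field names are my own illustration, not code from the question):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative holder class; the name Song and its fields are assumptions
// based on the question, not code from the original post.
class Song implements Comparable<Song> {
    final String title;
    final String composer;
    final int runningTime;

    Song(String title, String composer, int runningTime) {
        this.title = title;
        this.composer = composer;
        this.runningTime = runningTime;
    }

    // Longest song first, matching the example output. Plain subtraction is
    // safe here because running times are small positive ints (no overflow).
    public int compareTo(Song other) {
        return other.runningTime - this.runningTime;
    }

    public String toString() {
        return title + "&" + composer + "&" + runningTime;
    }
}
```

Then Collections.sort(songList) puts the longest song first.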
Excerpt from the Java API docs on Collections.sort (http://download.oracle.com/javase/1.5.0/docs/api/java/util/Collections.html#sort%28java.util.List%29):
The sorting algorithm is a modified mergesort (in which the merge is omitted if the highest element in the low sublist is less than the lowest element in the high sublist). This algorithm offers guaranteed n log(n) performance. This implementation dumps the specified list into an array, sorts the array, and iterates over the list resetting each element from the corresponding position in the array. This avoids the n² log(n) performance that would result from attempting to sort a linked list in place.
Just stick to the compareTo() method for String or int (the running time) and use them in your Comparators. Next, use Collections.sort(), which uses a merge sort that is quite good :)
Ah, and at runtime you should add those songs to a list of songs - an ArrayList or LinkedList - and sort them with Collections.sort(yourListName, new yourComparatorName());
I have the following homework question:
Suppose you are given two sequences S1 and S2 of n elements, possibly containing duplicates, on which a total order relation is defined. Describe an efficient algorithm for determining if S1 and S2 contain the same set of elements. Analyze the running time of this method
To solve this question I have compared elements of the two arrays using retainAll and a HashSet.
Set1.retainAll(new HashSet<Integer>(Set2));
This would solve the problem in constant time.
Do I need to sort the two arrays before the retainAll step to increase efficiency?
I suspect from the code you've posted that you are missing the point of the assignment. The idea is not to use a Java library to check if two collections are equal (for that you could use collection1.equals(collection2)). Rather, the point is to come up with an algorithm for comparing the collections. The Java API does not specify an algorithm: it's hidden away in the implementation.
Without providing an answer, let me give you an example of an algorithm that would work, but is not necessarily efficient:
for each element in coll1
    if element not in coll2
        return false
    remove element from coll2
return coll2 is empty
The problem specifies that a total order relation is defined on the elements, which means you can do much better than the algorithm above.
In general if you are asked to demonstrate an algorithm it's best to stick with native data types and arrays - otherwise the implementation of a library class can significantly impact efficiency and hide the data you want to collect on the algorithm itself.
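For illustration only, here is a direct Java translation of the pseudocode above (still the inefficient version, not the answer to the assignment; names are mine):

```java
import java.util.ArrayList;
import java.util.List;

public class SameElements {
    // O(n^2): remove() scans the list for each element. The sorted-sequence
    // approach the assignment is fishing for would be O(n log n) or better.
    static boolean sameElements(List<Integer> coll1, List<Integer> coll2) {
        // Copy so the caller's list is not mutated.
        List<Integer> remaining = new ArrayList<Integer>(coll2);
        for (Integer element : coll1) {
            if (!remaining.remove(element)) { // false if element not present
                return false;
            }
        }
        return remaining.isEmpty();
    }
}
```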
I have a Java program where I implement a sorting algorithm. I need to count how many operations (or statements, I guess) were performed during execution. Is there any built-in method in the Java API to do that?
I suggest you add a counter to your comparator. This should add less than a nanosecond per comparison and will be trivial compared to anything else you are doing.
What do you mean by a comparator? I don't use one.
In that case, I suggest you add one which sorts and also counts the number of times it is called.
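A sketch of such a counting comparator (the element type and names are illustrative):

```java
import java.util.Arrays;
import java.util.Comparator;

// A comparator that counts how many comparisons the sort performs.
public class CountingComparator implements Comparator<Integer> {
    long comparisons = 0;

    public int compare(Integer a, Integer b) {
        comparisons++; // one "operation" per comparison
        return a.compareTo(b);
    }

    public static void main(String[] args) {
        Integer[] data = {5, 3, 8, 1, 9, 2};
        CountingComparator counter = new CountingComparator();
        Arrays.sort(data, counter);
        System.out.println("comparisons: " + counter.comparisons);
    }
}
```

Note this only counts comparisons, not every statement the sort executes, but for comparison sorts that is usually the operation you want to measure.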
Somewhat of an odd question, but does anyone know what kind of sort MapReduce uses in the sort portion of shuffle/sort? I would think merge or insertion (in keeping with the whole MapReduce paradigm), but I'm not sure.
It's quicksort; afterwards, the sorted intermediate outputs get merged together.
Quicksort checks the recursion depth and gives up when it gets too deep; in that case, heapsort is used instead.
Have a look at the QuickSort class:
org.apache.hadoop.util.QuickSort
You can change the algorithm used via the map.sort.class value in hadoop-default.xml.
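To illustrate the technique (essentially introsort), here is a sketch of depth-limited quicksort. This is my own illustration, not Hadoop's actual implementation, and it delegates the fallback to the library sort rather than a hand-written heapsort:

```java
import java.util.Arrays;

// Sketch: quicksort that tracks recursion depth and, when the limit is hit,
// falls back to a guaranteed O(n log n) sort for that range (heapsort in
// Hadoop; Arrays.sort is used here as a stand-in).
public class DepthLimitedQuickSort {
    // Sorts a[lo, hi) ascending. depthLimit is typically ~2*log2(n).
    static void sort(int[] a, int lo, int hi, int depthLimit) {
        if (hi - lo < 2) return;
        if (depthLimit == 0) {
            // Recursion too deep: give up on quicksort for this range.
            int[] range = Arrays.copyOfRange(a, lo, hi);
            Arrays.sort(range); // stand-in for heapsort
            System.arraycopy(range, 0, a, lo, range.length);
            return;
        }
        int p = partition(a, lo, hi);
        sort(a, lo, p, depthLimit - 1);
        sort(a, p + 1, hi, depthLimit - 1);
    }

    // Lomuto partition using a[hi - 1] as the pivot.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi - 1], i = lo;
        for (int j = lo; j < hi - 1; j++) {
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        }
        int t = a[i]; a[i] = a[hi - 1]; a[hi - 1] = t;
        return i;
    }
}
```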
TreeMap has O(log n) performance (best case); I'm considering it since I need the following operations to be efficient:
get the highest element
get XY highest elements
insert
Another possibility would be to use a PriorityQueue with the following:
use "index" element as order for PriorityQueue
equals implementation to check only "index" element equality
but this would be a hack, since the "equals" method would be error-prone (if used outside of the PriorityQueue).
Any better structure for this?
More details below, which you might skip since the first answer already addresses these specifics; I'm keeping the question active for the theoretical discussion.
NOTE: I could use non-standard data structures; in this project I'm already using an UnrolledLinkedList, since it is most likely the most efficient structure for another use case.
THIS IS THE USE CASE (in case you are interested): I'm constructing an AI for a computer game where
OffensiveNessHistory myOffensiveNess = battle.pl[orderNumber].calculateOffensivenessHistory();
With possible implementations:
public class OffensiveNessHistory {
PriorityQueue<OffensiveNessHistoryEntry> offensivenessEntries = new PriorityQueue<OffensiveNessHistoryEntry>();
..
or
public class OffensiveNessHistory {
TreeMap<Integer, OffensiveNessHistoryEntry> offensivenessEntries = new TreeMap<Integer, OffensiveNessHistoryEntry>();
..
I want to check the first player's offensiveness and defensiveness history to predict whether I should play the most offensive or the most defensive move.
First, you should think about the size of the structure (optimizing for just a few entries might not be worth it) and the frequency of the operations.
If reads are more frequent than writes (which I assume is the case), I'd use a structure that optimizes reads at the cost of inserts, e.g. a sorted ArrayList where you insert at a position found using a binary search. This would be O(log n) for the search plus the cost of moving other entries to the right, but would mean good cache coherence and O(1) lookups.
A standard PriorityQueue internally also uses an array, but would require you to use an iterator to get element n (e.g. if you'd at some point need the median or the lowest entry).
There might be structures that optimize writes even more while keeping O(1) reads, but unless those writes are very frequent you might not even notice any performance gains.
Finally, and most importantly, you should try not to optimize on guesses but profile first. There might be other parts of your code that eat up performance, which could render optimization of the data structures rather useless.
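A minimal sketch of the sorted-ArrayList idea, assuming Integer elements (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Keeps the list sorted ascending by inserting each value at the position
// binarySearch reports: O(log n) search + O(n) shift per insert,
// O(1) indexed reads.
public class SortedList {
    private final List<Integer> items = new ArrayList<Integer>();

    void insert(Integer value) {
        int pos = Collections.binarySearch(items, value);
        if (pos < 0) pos = -pos - 1; // negative means absent: decode insertion point
        items.add(pos, value);
    }

    Integer highest() {
        return items.get(items.size() - 1); // last element is the largest
    }

    List<Integer> highestN(int n) {
        return items.subList(items.size() - n, items.size());
    }
}
```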
Let's say I have a Java ArrayList that is sorted. Now I would like to find the index of value x. What would be the fastest way to do this (in no more than 30 lines of code)? Use of the indexOf() method? Iterating through all values in a simple for loop? Some cool algorithm? We are talking about, let's say, around 50 integer keys.
Binary search, but since it's only 50 items, who cares (unless you have to do it millions of times)? A simple linear search is simpler and the difference in performance for 50 items is negligible.
Edit: You could also use the built-in java.util.Collections.binarySearch method. Be aware that when the item isn't found it returns a negative value that encodes the insertion point. You may need to make an extra couple of checks to ensure the item really is the one you want. Thanks to @Matthew for the pointer.
tvanfosson is right that the time for either will be very low, so unless this code runs very frequently it won't make much difference.
However, Java has built-in functionality for binary searches of Lists (including ArrayLists), Collections.binarySearch.
import java.util.ArrayList;
import java.util.Collections;

ArrayList<String> myList = new ArrayList<String>();
// ...fill with values
Collections.sort(myList);
int index = Collections.binarySearch(myList, "searchVal");
Edit: untested code
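Collections.binarySearch signals a miss by returning (-(insertion point) - 1), so the extra check mentioned above could look like this (list contents are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SearchCheck {
    public static void main(String[] args) {
        List<String> myList = new ArrayList<String>();
        myList.add("apple");
        myList.add("pear");
        Collections.sort(myList);

        int index = Collections.binarySearch(myList, "banana");
        if (index >= 0) {
            System.out.println("found at " + index);
        } else {
            // Negative result: item absent; decode where it would be inserted.
            int insertionPoint = -index - 1;
            System.out.println("not found; would insert at " + insertionPoint);
        }
    }
}
```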
If the keys have an acceptable distribution, interpolation search might be a very fast option in terms of execution time.
Considering coding time, indexOf() is the way to go (or a built-in binary search, if available for your data type - I'm from C# and don't know Java).
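For reference, a minimal interpolation search sketch over an ascending int array (this is an illustration, not a library routine, and it assumes reasonably uniformly distributed keys):

```java
public class InterpolationSearch {
    // Returns the index of key in the ascending array a, or -1 if absent.
    // Probes where the key "should" be, assuming values are roughly evenly
    // distributed, instead of always probing the middle as binary search does.
    static int search(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi && key >= a[lo] && key <= a[hi]) {
            if (a[hi] == a[lo]) {
                return a[lo] == key ? lo : -1; // flat range: avoid divide by zero
            }
            // Estimate position by linear interpolation between a[lo] and a[hi].
            int pos = lo + (int) ((long) (key - a[lo]) * (hi - lo) / (a[hi] - a[lo]));
            if (a[pos] == key) return pos;
            if (a[pos] < key) lo = pos + 1;
            else hi = pos - 1;
        }
        return -1;
    }
}
```

On uniform keys this averages O(log log n) probes, though it degrades to O(n) on badly skewed distributions.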