How to compare two ArrayList values in Java?

I have two ArrayLists, RunningProcessList and AllProcessList, which contain the following values:
RunningProcessList:
Receiver.jar
AllProcessList:
Receiver.jar
Sender.jar
Timeout.jar
TimeourServer.jar
AllProcessList contains all the Java processes, and RunningProcessList contains the currently running processes. I want to compare these two lists and display the processes that are not running. For example, comparing the two lists above should display the following processes as not running.
Result:
Sender.jar
Timeout.jar
TimeourServer.jar
I used the following code, but it's not working.
Object Result = null;
for (int i = 0; i <AllProcessList.size(); i++) {
for (int j = 0; j < RunningProcessList.size(); j++) {
if( AllProcessList.get(i) != ( RunningProcessList.get(j))) {
System.out.println( RunningProcessList.get(j)));
Result =RunningProcessList.get(j);
}
if(AllProcessList.get(i) != ( RunningProcessList.get(j))) {
list3.add(Result);
}
}
}

Take a look at the documentation for List, especially the removeAll() method.
List result = new ArrayList(AllProcessList);
result.removeAll(RunningProcessList);
You could then iterate over that list and call System.out.println if you wanted, as you've done above... but is that what you want to do?
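As a minimal sketch of the removeAll() approach, assuming both lists hold the jar names as Strings (the question does not say what the element type is), it could look like this. The class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NotRunningExample {
    // Returns the entries of allProcesses that do not appear in runningProcesses.
    static List<String> notRunning(List<String> allProcesses, List<String> runningProcesses) {
        List<String> result = new ArrayList<>(allProcesses); // copy, so the input list stays untouched
        result.removeAll(runningProcesses);                  // drops every element equal to a running one
        return result;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("Receiver.jar", "Sender.jar", "Timeout.jar", "TimeourServer.jar");
        List<String> running = Arrays.asList("Receiver.jar");
        for (String name : notRunning(all, running)) {
            System.out.println(name);
        }
    }
}
```

Copying first matters because removeAll() modifies the list it is called on.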

Assuming your lists are not too long, you can just collect all elements of AllProcessList that are not in RunningProcessList:
for (Object process : AllProcessList) {
if (!RunningProcessList.contains(process)) {
list3.add(process);
}
}
It's important that RunningProcessList contains the same instances as AllProcessList (or the objects must implement a working equals method).
It would be better if your lists contained instances of Process (or some other dedicated class).
List<Process> AllProcessList = new ArrayList<Process>();
List<Process> RunningProcessList = new ArrayList<Process>();
List<Process> list3 = new ArrayList<Process>();
...
for (Process process : AllProcessList) {
if (!RunningProcessList.contains(process)) {
list3.add(process);
}
}
English is not my first (neither second) language, any correction is welcome

Hi lakshmi,
I upvoted noelmarkham's answer as I think it's the best code wise and suits Your needs. So I'm not going to add another code snippet to this already long list, I just wanted to point You towards two things:
If Your processes are unique (by their name, id or whatever), You might consider using (Hash)Sets to store them, for better performance of Your desired operations. This should only be a concern when Your lists are large.
What about using ActiveProcesses and InactiveProcesses instead of Your current two lists? If a process changes its state You just have to remove it from one list and insert it into the other. This would lead to an overall cleaner design and You could access the not-running processes immediately.
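A rough sketch of that two-collection design, assuming processes are identified by a unique name String. ProcessRegistry and its method names are hypothetical and only meant to illustrate the idea:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical registry; the class and method names are illustrative only.
class ProcessRegistry {
    private final Set<String> active = new HashSet<>();
    private final Set<String> inactive = new HashSet<>();

    // A newly registered process starts out as not running.
    void register(String name) {
        if (!active.contains(name)) {
            inactive.add(name);
        }
    }

    // Moves a process from the inactive set to the active set.
    void markRunning(String name) {
        inactive.remove(name);
        active.add(name);
    }

    // Moves a process from the active set to the inactive set.
    void markStopped(String name) {
        active.remove(name);
        inactive.add(name);
    }

    // Read-only view of the processes that are currently not running.
    Set<String> notRunning() {
        return Collections.unmodifiableSet(inactive);
    }

    public static void main(String[] args) {
        ProcessRegistry registry = new ProcessRegistry();
        registry.register("Receiver.jar");
        registry.register("Sender.jar");
        registry.markRunning("Receiver.jar");
        System.out.println(registry.notRunning()); // only Sender.jar is left here
    }
}
```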
Greetings

Depending on the type of AllProcessList and RunningProcessList (which should be allProcessList and runningProcessList to follow the Java naming conventions), the following will not work:
if ( AllProcessList.get(i) != ( RunningProcessList.get(j))) {
you should replace it with
if (!(AllProcessList.get(i).equals(RunningProcessList.get(j)))) {
!= compares physical equality: are the two things the exact same "new"ed object?
.equals(Object) compares logical equality: are the two things the "same"?
To do that you will need to override the equals and hashCode methods. Here is an article on that.
If the class comes from the Java standard library (String, Integer and so on), odds are equals and hashCode are already implemented.
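For a class of your own, overriding both methods could look roughly like the sketch below. ProcessInfo and its jarName field are hypothetical names, not something from the question:

```java
import java.util.Objects;

// Hypothetical value class; two instances are "the same" when their jar names match.
class ProcessInfo {
    private final String jarName;

    ProcessInfo(String jarName) {
        this.jarName = jarName;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true; // same object, trivially equal
        }
        if (!(other instanceof ProcessInfo)) {
            return false; // null or a different type
        }
        return Objects.equals(jarName, ((ProcessInfo) other).jarName);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(jarName); // equal objects must produce equal hash codes
    }

    public static void main(String[] args) {
        ProcessInfo a = new ProcessInfo("Sender.jar");
        ProcessInfo b = new ProcessInfo("Sender.jar");
        System.out.println(a != b);      // true: two distinct objects
        System.out.println(a.equals(b)); // true: same jar name
    }
}
```

With this in place, List.contains() and removeAll() treat two ProcessInfo objects with the same jar name as equal.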

For sorted lists, the following is O(n). If a sort is needed, this method becomes O(nlogn).
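public <T> void compareLists(final List<T> allProcesses, final List<T> runningProcesses) {
// Assume both lists are sorted the same way and every running process also appears in allProcesses;
// if they are not sorted, call Collections.sort() on each list first (making this O(n log n))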
final Iterator<T> allIter = allProcesses.iterator();
final Iterator<T> runningIter = runningProcesses.iterator();
T allEntry;
T runningEntry;
while (allIter.hasNext() && runningIter.hasNext()) {
allEntry = allIter.next();
runningEntry = runningIter.next();
while (!allEntry.equals(runningEntry) && allIter.hasNext()) {
System.out.println(allEntry);
allEntry = allIter.next();
}
// Now we know allEntry == runningEntry, so we can go through to the next iteration
}
// No more running processes, so just print the remaining entries in the all processes list
while (allIter.hasNext()) {
System.out.println(allIter.next());
}
}

Related

What is the fastest way to find orphans between two large (size ~900K ) Vectors of Strings in Java?

I'm currently working on a Java program that is required to handle large amounts of data. I have two Vectors...
Vector collectionA = new Vector();
Vector collectionB = new Vector();
...and both of them will contain around 900,000 elements during processing.
I need to find all items in collectionB that are not contained in collectionA. Right now, this is how I'm doing it:
for (int i=0;i<collectionA.size();i++) {
if(!collectionB.contains(collectionA.elementAt(i))){
// do stuff if orphan is found
}
}
But this causes the program to run for lots of hours, which is unacceptable.
Is there any way to tune this so that I can cut my running time significantly?
I think I've read once that using ArrayList instead of Vector is faster. Would using ArrayLists instead of Vectors help for this issue?
Use a HashSet for the lookups.
Explanation:
Currently your program has to test every item in collectionB to see if it is equal to the item in collectionA that it is currently handling (the contains() method will need to check each item).
You should do:
Set<String> set = new HashSet<String>(collectionB);
for (Iterator i = collectionA.iterator(); i.hasNext(); ) {
if (!set.contains(i.next())) {
// handle
}
}
Using the HashSet will help, because the set will calculate a hash for each element and store the element in a bucket associated with a range of hash values. When checking whether an item is in the set, the hash value of the item will directly identify the bucket the item should be in. Now only the items in that bucket have to be checked.
Using a SortedSet like TreeSet would also be an improvement over Vector, since to find the item, only the position it would be in has to be checked, instead of all positions. Which Set implementation would perform best depends on the data.
If ordering of the elements doesn't matter, I would go for HashSets, and do it as follows:
Set<String> a = new HashSet<>();
Set<String> b = new HashSet<>();
// ...
b.removeAll(a);
So in essence, you're removing from set b all the elements that are in set a, leaving the asymmetric set difference. Note that the removeAll method does modify set b, so if that's not what you want, you would need to make a copy first.
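If you need to keep both original collections intact, a sketch along these lines makes the copy first. OrphanFinder and orphans are invented names, and the element type is assumed to be String as in the question:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class OrphanFinder {
    // Returns the elements of candidates that are not in known; neither input is modified.
    static Set<String> orphans(Collection<String> candidates, Collection<String> known) {
        Set<String> lookup = new HashSet<>(known);      // hashed copy for fast contains()
        Set<String> result = new HashSet<>(candidates); // copy that we are allowed to shrink
        result.removeAll(lookup);
        return result;
    }

    public static void main(String[] args) {
        Set<String> found = orphans(Arrays.asList("a", "b", "c"), Arrays.asList("b", "x"));
        System.out.println(found); // contains a and c, in no particular order
    }
}
```

Wrapping known in a HashSet keeps each lookup cheap even when the caller passes a Vector or ArrayList.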
To find out whether HashSet or TreeSet is more efficient for this type of operation, I ran the below code with both types, and used Guava's Stopwatch to measure execution time.
@Test
public void perf() {
Set<String> setA = new HashSet<>();
Set<String> setB = new HashSet<>();
for (int i=0; i < 900000; i++) {
String uuidA = UUID.randomUUID().toString();
String uuidB = UUID.randomUUID().toString();
setA.add(uuidA);
setB.add(uuidB);
}
Stopwatch stopwatch = Stopwatch.createStarted();
setB.removeAll(setA);
System.out.println(stopwatch.elapsed(TimeUnit.MILLISECONDS));
}
On my modest development machine, using Oracle JDK 7, the TreeSet variant is about 4 times slower (~450ms) than the HashSet variant (~105ms).

How can I add objects to an array list using the equals method to exclude similar objects?

I am trying to create Line objects and add them to an array list. The problem I am having is excluding any lines that are similar to each other. I have already created an equals method that compares two lines to determine if they are equal. I am having trouble using the while loop. I do not have an error message. It compiles just fine. It just will not read from the text file. I am stuck and do not know where else to go from here.
public void read( File fileName ) throws Exception
{
reader = new Scanner(fileName);
//---------------------
//Have to read the first number before starting the loop
int numLines = reader.nextInt();
lines = new ArrayList <Line> (numLines);
//This loop adds a new Line object to the lines array for every line in the file read.
while( reader.hasNext() ) {
for( int i = 0; i < numLines; i++ ) {
int x = reader.nextInt();
int y = reader.nextInt();
Point beg = new Point(x,y);
x = reader.nextInt();
y = reader.nextInt();
Point end = new Point(x,y);
String color = reader.next();
Line l = new Line( beg, end, color );
if (l.equals(lines.get(i)))
break;
else
lines.add(i, l);
}
}
//Print the action to the console
System.out.println( "reading text file: " + fileName );
reader.close();
}
There is a lot to discover in the Java Collections framework. You are using the wrong data structure: a List lets you add equal objects more than once, because the purpose of a List is to be:
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
So you have the elements in a given order which is kept when adding objects and you can access objects at any given index in this order.
Now it seems that's not what you want: you'd rather have no duplicate elements than a fixed order, right? If so, you need to use a class that implements the Set interface, whose purpose is to be:
A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
The Java framework contains, among others, two implementations of Set:
HashSet: a hash-based implementation, which offers expected constant-time add, remove and contains regardless of the size of your collection (given a reasonable hashCode).
TreeSet: a tree-based implementation with log(n) time for the basic operations; it keeps its elements sorted.
I recommend that you look into the first link I gave, it is the oracle tutorial explaining with much details the Java Collections.
Your example with a Set
It's really easy and not that far from using an ArrayList.
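Change the declaration of your List to a Set like this (I used a HashSet, which relies on equals and hashCode, so Line must override hashCode consistently with equals; a TreeSet would instead need Line to implement Comparable or be given a Comparator):
Set<Line> lines = new HashSet<Line>();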
Just use the add(E e) function of the Set interface when you want to populate your collection and let it do the job :
Line l = new Line(beg, end, color);
lines.add(l);
And if you still want to use a List
You can check if an element is in a List (or any other Collection for that matter) by using the contains(Object o) method.
lines.contains(l)
This will return true if the freshly created Line (l) is contained in your collection (lines).
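Put together, the "check with contains() before adding" idea can be sketched as a small generic helper. addIfAbsent is a made-up name, and the example uses Strings only to stay self-contained; for Line objects it depends on the equals method you already wrote:

```java
import java.util.ArrayList;
import java.util.List;

class UniqueListExample {
    // Appends item only when no equal element is already in the list; returns true if it was added.
    static <T> boolean addIfAbsent(List<T> list, T item) {
        if (list.contains(item)) { // contains() compares with equals()
            return false;
        }
        list.add(item);
        return true;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        addIfAbsent(lines, "0 0 10 10 red");
        addIfAbsent(lines, "0 0 10 10 red"); // duplicate, skipped
        addIfAbsent(lines, "5 5 20 20 blue");
        System.out.println(lines.size()); // 2
    }
}
```

Note that contains() scans the whole list, so for many elements a Set is the faster choice.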

Java: How to loop through three List<String>

I have three List<String> variables: classFiles, usernames, and fileDirectories. I have a String (a list of strings but I will be comparing every string in the list with the loop below) that consists of one item from each of the lists. I want to loop through all three lists and check if one value from all three of the lists are in the String
What would be the best way to go about this?
for(String classFile:classFiles) {
//if contains classfile statement
for(String username:usernames) {
//if contains username statement
for(String fileDirectory:fileDirectories) {
//if contains filedirectory statement
}
}
}
or
for(String classFile:classFiles) {
for(String username:usernames) {
for(String fileDirectory:fileDirectories) {
//if statement
}
}
}
or
for(String classFile:classFiles) {
//make list of files that contain classFile
}
for(String username:usernames) {
//remove items from list that do not contain username
}
for(String fileDirectory:fileDirectories){
//remove items from list that do not contain fileDirectory
}
Or is there a better way to do this?
EDIT: Example
classFiles - a1, a2, a3
usernames - noc1, noc2, noc3
fileDirectories - C:/projects/a1/noc1/example.java, C:/projects/a1/ad3/example.java
and the string to check
String - C:/bin/a1/noc1/example.class
what i want to do is if both the fileDirectory and String contain a classFile and username, then add it to a list
so in this example C:/bin/a1/noc1/example.class will be added to the list but C:/bin/a4/fd1/example.class wont be or C:/bin/a3/noc3/example.class would not be added
When you perform remove operations, a for-each loop is not the best choice. You should use an Iterator and call remove on the iterator to avoid a ConcurrentModificationException.
Instead of
for(String fileDirectory:fileDirectories){
//remove items from list that do not contain fileDirectory
}
You should do something like
Iterator<String> iter = fileDirectories.iterator();
while(iter.hasNext())
{
//Get next
//Do your check
iter.remove();
}
This leads to three separate iterations to fulfill your requirement.
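Filled in, the iterator loop could look like the sketch below. The "contains the fragment" check is my assumption about what the filter should test, and keepOnlyContaining is a made-up helper name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveExample {
    // Removes, in place, every path that does not contain the given fragment.
    static void keepOnlyContaining(List<String> paths, String fragment) {
        Iterator<String> iter = paths.iterator();
        while (iter.hasNext()) {
            String path = iter.next();
            if (!path.contains(fragment)) {
                iter.remove(); // safe: the removal goes through the iterator
            }
        }
    }

    public static void main(String[] args) {
        // The list must be modifiable, hence the ArrayList copy
        List<String> fileDirectories = new ArrayList<>(Arrays.asList(
                "C:/projects/a1/noc1/example.java",
                "C:/projects/a1/ad3/example.java"));
        keepOnlyContaining(fileDirectories, "noc1");
        System.out.println(fileDirectories);
    }
}
```

On Java 8 and later, paths.removeIf(p -> !p.contains(fragment)) does the same in one line.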
It is sometimes best to define the function you're actually trying to write:
/**
* Checks to see if candidate has one string in each of classFiles, usernames and fileDirectories
*/
public boolean hasEssentialComponents(List<String> candidate) {
//Code here
}
Now, your first option has a very long worst-case run time, O(n^3): for each item in one list you loop through the whole of the next list. If you expect the check to fail most of the time, most of that work is redundant, and you will have a huge performance impact if these lists are long.
The second one is subtly different, but with the same total runtime.
The third is clearly better; in this you can fail as soon as you find out that a list doesn't have a component and you never check a list for a component twice. However, Java provides some sugar that can make this easier.
public boolean hasEssentialComponents(List<String> candidates) {
//Sanity check the data
if (candidates.size() != 3) { return false; } //I'm assuming a 'good' candidate has only three items.
boolean valid = true;
for (String candidate:candidates) {
if (valid &&
! ( check(this.classFiles, candidate)
|| check(this.usernames, candidate)
|| check(this.fileDirectories, candidate) )
) {
valid = false;
}
}
return valid;
}
private boolean check(List<String> masterList, String candidate) {
return masterList.contains(candidate);
}
Now, I'm being unnecessarily verbose here to tease out the parts of the problem. Please note that you should use Java's built-in functions when possible; they're well optimized. Do not add your lists together; you're spending unnecessary time copying. Also, make each comparison only once if that is all you need: note that if you know which master list each element of your string list should be in, this can be made even better.
Finally, I really recommend you write out a method signature first. It forces you to think about what you're actually trying to do.
So I take it you want to get the intersection between all three lists?
Just use the retainAll method on List.
classFiles.retainAll(usernames);
classFiles.retainAll(fileDirectories);
Now classFiles will just have the intersection between all three lists.
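Keep in mind that retainAll() modifies the list it is called on. A sketch that leaves the three inputs alone, with intersection as a made-up helper name, might be:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntersectionExample {
    // Returns the elements of first that also occur in second and third; inputs are not modified.
    static List<String> intersection(List<String> first, List<String> second, List<String> third) {
        List<String> result = new ArrayList<>(first); // work on a copy
        result.retainAll(second);                     // keep only what second also contains
        result.retainAll(third);                      // then only what third also contains
        return result;
    }

    public static void main(String[] args) {
        List<String> common = intersection(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("b", "c", "d"),
                Arrays.asList("c", "b", "e"));
        System.out.println(common); // [b, c]
    }
}
```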
Could you consider using a HashSet or HashMap to access your string values quickly (with .contains(string) on a HashSet)?
It will eliminate the inner loops.
That is how I would do it, if I understand the problem correctly (not sure ^^).

Java: Getting the 500 most common words in a text via HashMap

I'm storing my wordcount into the value field of a HashMap, how can I then get the 500 top words in the text?
public ArrayList<String> topWords (int numberOfWordsToFind, ArrayList<String> theText) {
//ArrayList<String> frequentWords = new ArrayList<String>();
ArrayList<String> topWordsArray= new ArrayList<String>();
HashMap<String,Integer> frequentWords = new HashMap<String,Integer>();
int wordCounter=0;
for (int i=0; i<theText.size();i++){
if(frequentWords.containsKey(theText.get(i))){
//find value and increment
wordCounter=frequentWords.get(theText.get(i));
wordCounter++;
frequentWords.put(theText.get(i),wordCounter);
}
else {
//new word
frequentWords.put(theText.get(i),1);
}
}
for (int i=0; i<theText.size();i++){
if (frequentWords.containsKey(theText.get(i))){
// what to write here?
frequentWords.get(theText.get(i));
}
}
return topWordsArray;
}
One other approach you may wish to look at is to think of this another way: is a Map really the right conceptual object here? It may be good to think of this as being a good use of a much-neglected-in-Java data structure, the bag. A bag is like a set, but allows an item to be in the set multiple times. This simplifies the 'adding a found word' very much.
Google's guava-libraries provides a Bag structure, though there it's called a Multiset. Using a Multiset, you could just call .add() once for each word, even if it's already in there. Even easier, though, you could throw your loop away:
Multiset<String> words = HashMultiset.create(theText);
Now you have a Multiset, what do you do? Well, you can call entrySet(), which gives you a collection of Multiset.Entry objects. You can then stick them in a List (they come in a Set), and sort them using a Comparator. Full code might look like (using a few other fancy Guava features to show them off):
Multiset<String> words = HashMultiset.create(theText);
List<Multiset.Entry<String>> wordCounts = Lists.newArrayList(words.entrySet());
Collections.sort(wordCounts, new Comparator<Multiset.Entry<String>>() {
public int compare(Multiset.Entry<String> left, Multiset.Entry<String> right) {
// Note reversal of 'right' and 'left' to get descending order
return Integer.compare(right.getCount(), left.getCount());
}
});
// wordCounts now contains all the words, sorted by count descending
// Take the first 50 entries (alternative: use a loop; this is simple because
// it copes easily with < 50 elements)
Iterable<Multiset.Entry<String>> first50 = Iterables.limit(wordCounts, 50);
// Guava-ey alternative: use a Function and Iterables.transform, but in this case
// the 'manual' way is probably simpler:
for (Multiset.Entry<String> entry : first50) {
topWordsArray.add(entry.getElement());
}
and you're done!
Here you can find a guide how to sort a HashMap by the values. After the sorting you can just iterate over the first 500 entries.
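Without any extra library, sorting the map entries by value could be sketched like this. It keeps the method signature idea from the question but takes a List; breaking ties alphabetically is my own addition so that the result is deterministic:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopWords {
    static List<String> topWords(int numberOfWordsToFind, List<String> theText) {
        // Count how often each word occurs
        Map<String, Integer> counts = new HashMap<>();
        for (String word : theText) {
            counts.merge(word, 1, Integer::sum);
        }
        // Sort the entries: highest count first, ties broken alphabetically
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Take at most numberOfWordsToFind words from the front
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(numberOfWordsToFind, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("b", "a", "b", "c", "a", "b");
        System.out.println(topWords(2, text)); // [b, a]
    }
}
```

Calling topWords(500, theText) then gives the 500 most frequent words, or fewer if the text has fewer distinct words.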
Take a look at the TreeBidiMap provided by the Apache Commons Collections package. http://commons.apache.org/collections/api-release/org/apache/commons/collections/bidimap/TreeBidiMap.html
It allows you to sort the map according to both the key or the value set.
Hope it helps.
Zhongxian

Modifying a set during iteration java

I'm looking to make a recursive method iterative.
I have a list of Objects I want to iterate over, and then check their subobjects.
Recursive:
doFunction(Object)
while(iterator.hasNext())
{
//doStuff
doFunction(Object.subObjects);
}
I want to change it to something like this
doFunction(Object)
iIterator = hashSet.iterator();
while(Iterator.hasNext()
{
//doStuff
hashSet.addAll(Object.subObjects);
}
Sorry for the poor pseudo code, but basically I want to iterate over subobjects while appending new objects to the end of the list to check.
I could do this using a list, and do something like
while(list.size() > 0)
{
//doStuff
list.addAll(Object.subObjects);
}
But I would really like to not add duplicate subObjects.
Of course I could just check whether list.contains(each subObject) before I add it.
But I would love to use a Set to accomplish that more cleanly.
So basically, is there any way to append to a set while iterating over it, or is there an easier way to make a List act like a set rather than manually checking .contains()?
Any comments are appreciated.
Thanks
I would use two data structures --- a queue (e.g. ArrayDeque) for storing objects whose subobjects are to be visited, and a set (e.g. HashSet) for storing all visited objects without duplication.
Set visited = new HashSet(); // all visited objects
Queue next = new ArrayDeque(); // objects whose subobjects are to be visited
// NOTE: At all times, the objects in "next" are contained in "visited"
// add the first object
visited.add(obj);
Object nextObject = obj;
while (nextObject != null)
{
// do stuff to nextObject
for (Object o : nextObject.subobjects)
{
boolean fresh = visited.add(o);
if (fresh)
{
next.add(o);
}
}
nextObject = next.poll(); // removes the next object to visit, null if empty
}
// Now, "visited" contains all the visited objects
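For reference, a compilable variant of the same queue-plus-set idea is sketched below. The Node class is a hypothetical stand-in for your objects, assuming each one exposes a list of sub-objects:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class VisitAll {
    // Hypothetical node type standing in for the question's objects with sub-objects
    static class Node {
        final String name;
        final List<Node> subObjects = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every node reachable from root, each exactly once, in visiting order.
    static Set<Node> visitAll(Node root) {
        Set<Node> visited = new LinkedHashSet<>(); // everything seen so far
        Queue<Node> next = new ArrayDeque<>();     // nodes whose sub-objects still need a look
        visited.add(root);
        next.add(root);
        while (!next.isEmpty()) {
            Node current = next.poll();
            // do stuff with current here
            for (Node sub : current.subObjects) {
                if (visited.add(sub)) { // add() returns false for a node already seen
                    next.add(sub);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.subObjects.add(b);
        b.subObjects.add(a); // a cycle does not cause an endless loop
        System.out.println(visitAll(a).size()); // 2
    }
}
```

Node does not override equals here, so "already seen" means the same instance; override equals and hashCode if logically equal objects should count as duplicates.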
NOTES:
ArrayDeque is a space-efficient queue. It is implemented as a cyclic array, which means you use less space than a List that keeps growing when you add elements.
"boolean fresh = visited.add(o)" combines "boolean fresh = !visited.contains(o)" and "if (fresh) visited.add(o)".
I think your problem is inherently a problem that needs to be solved via a List. If you think about it, your Set version of the solution is just converting the items into a List then operating on that.
Of course, List.contains() is a slow operation in comparison to Set.contains(), so it may be worth coming up with a hybrid if speed is a concern:
while(list.size() > 0)
{
//doStuff
for each subObject
{
if (!set.contains(subObject))
{
list.add(subObject);
set.add(subObject)
}
}
}
This solution is fast and also conceptually sound - the Set can be thought of as a list of all items seen, whereas the List is a queue of items to work on. It does take up more memory than using a List alone, though.
If you do not use a List, the set's iterator will throw a ConcurrentModificationException on the next call to next() after the set has been modified. I would recommend using a List and enforcing the insertion limits (no duplicates) yourself, then using a ListIterator, as that allows you to modify the list while iterating over it.
HashSet nextObjects = new HashSet();
HashSet currentObjects = new HashSet(firstObject.subObjects);
while(currentObjects.size() > 0)
{
Iterator iter = currentObjects.iterator();
while(iter.hasNext())
{
//doStuff
nextObjects.add(subobjects);
}
currentObjects = nextObjects;
nextObjects = new HashSet();
}
I think something like this will do what I want, I'm not concerned that the first Set contains duplicates, only that the subObjects may point to the same objects.
Use more than one set and do it in "rounds":
/* very pseudo-code */
doFunction(Object o) {
Set processed = new HashSet();
Set toProcess = new HashSet();
Set processNext = new HashSet();
toProcess.add(o);
while (toProcess.size() > 0) {
for(it = toProcess.iterator(); it.hasNext();) {
Object o = it.next();
doStuff(o);
processNext.addAll(o.subObjects);
}
processed.addAll(toProcess);
toProcess = processNext;
toProcess.removeAll(processed);
processNext = new HashSet();
}
}
Why not create an additional set that contains the entire set of objects? You can use that for lookups.
