Let's suppose I've an object that looks like this:
public class Supermarket {
public String supermarketId;
public String lastItemBoughtId;
// ...
}
and I have two lists of supermarkets, one "old", another "new" (i.e. one is local, the other is retrieved from the cloud).
List<Supermarket> local = getFromLocal();
List<Supermarket> cloud = getFromCloud();
I would like to find all the pairs of Supermarket objects (given supermarketId) that have lastItemBoughtId different from one another.
The first solution I have in mind is iterating the first List, then inside the first iteration iterating the second one, and each time that local.get(i).supermarketId.equals(cloud.get(j).supermarketId), checking if lastItemBoughtId of the i element is different from the id of the j element. If it's different, I add the whole Supermarket object on a new list.
To be clearer, something like this:
List<Supermarket> difference = new ArrayList<>();
for (Supermarket localSupermarket : local) {
for (Supermarket cloudSupermarket : cloud) {
if (localSupermarket.supermarketId.equals(cloudSupermarket.supermarketId) &&
!localSupermarket.lastItemBoughtId.equals(cloudSupermarket.lastItemBoughtId))
difference.add(cloudSupermarket);
}
}
Clearly this looks greatly inefficient. Is there a better way to handle such a situation?
One solution:
Construct a Map of the local supermarkets, using the supermarketId as the key, by running through the local list once.
Loop through the cloud list and do your comparison, looking up the matching local supermarket in the map.
That is O(n) instead of O(n²).
Here's a two-line solution:
Map<String, Supermarket> map = getFromLocal().stream()
.collect(Collectors.toMap(s -> s.supermarketId, s -> s));
List<Supermarket> hasDiffLastItem = getFromCloud().stream()
.filter(s -> map.containsKey(s.supermarketId) && !map.get(s.supermarketId).lastItemBoughtId.equals(s.lastItemBoughtId)) // skip ids missing locally to avoid a NullPointerException
.collect(Collectors.toList());
I would put one of the lists in a Map with as key the Supermarket ID and as value the supermarket instance then iterate over the other getting from the Map and comparing the lastItemBoughtId.
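The map-based approach described above can also be written as two plain loops. This is only a sketch under a few assumptions: the constructor on Supermarket, the class name SupermarketDiff, and the method name changed are made up for the example, and ids that exist only in the cloud list are silently skipped, which may or may not be what you want.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class SupermarketDiff {

    // Mirrors the class from the question; the constructor is added only for this demo.
    static class Supermarket {
        public String supermarketId;
        public String lastItemBoughtId;

        Supermarket(String supermarketId, String lastItemBoughtId) {
            this.supermarketId = supermarketId;
            this.lastItemBoughtId = lastItemBoughtId;
        }
    }

    // Returns the cloud entries whose lastItemBoughtId differs from the
    // local entry with the same supermarketId.
    static List<Supermarket> changed(List<Supermarket> local, List<Supermarket> cloud) {
        // Pass 1: index the local list by id (a later duplicate id overwrites an earlier one).
        Map<String, Supermarket> localById = new HashMap<>();
        for (Supermarket s : local) {
            localById.put(s.supermarketId, s);
        }
        // Pass 2: one constant-time lookup per cloud entry instead of an inner loop.
        List<Supermarket> difference = new ArrayList<>();
        for (Supermarket c : cloud) {
            Supermarket l = localById.get(c.supermarketId);
            // Assumption: ids that exist only in the cloud are ignored.
            if (l != null && !Objects.equals(l.lastItemBoughtId, c.lastItemBoughtId)) {
                difference.add(c);
            }
        }
        return difference;
    }

    public static void main(String[] args) {
        List<Supermarket> local = Arrays.asList(
                new Supermarket("a", "1"), new Supermarket("b", "2"));
        List<Supermarket> cloud = Arrays.asList(
                new Supermarket("a", "1"), new Supermarket("b", "9"), new Supermarket("c", "3"));
        for (Supermarket s : changed(local, cloud)) {
            System.out.println(s.supermarketId);
        }
    }
}
```

Objects.equals is used for the comparison so that a null lastItemBoughtId on either side does not throw.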
Related
So I'm going crazy with this one. This is for an assignment and can't seem to get this to work at all!!
I have the following HashMap:
HashMap<String, ArrayList<Team>> teams;
(Team being another class to obtain the details of the teams)
What I need to be able to do is get the List of teams for the Key(String) from the above HashMap, and assign the List to a local variable I have declared:
List<Team> results = teams.get(division);
But this is where I get stuck. I have no idea how I'm supposed to complete this task.
As a further note "division" is the Key used in the HashMap. The ArrayList is a list of teams that belong to the division.
I have tried the below, which does not compile at all. Really not sure how I can get this to work!!
public void recordResult(String division, String teamA, String teamB, int teamAScore, int teamBScore)
{
List<Team> results = teams.get(division);
for (String i : teams.keySet())
{
results = new ArrayList<Team>();
results.add();
}
}
Note: you can ignore the arguments after "String division". These will be used later.
Iterate over the entrySet() of the Map. Now you can fetch each List for that specific key and proceed further. Something like:
for (Entry<String, ArrayList<Team>> entry : teams.entrySet()) {
// entry.getValue() is the list of teams for entry.getKey(); no extra teams.get(...) lookup is needed
// proceed further with the value obtained
}
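If only the teams of one division are needed, as in the question, a direct lookup by key may be enough and no iteration is required. The following is a minimal sketch: the Team stand-in, the League class, and the addTeam and teamsIn methods are hypothetical, the field is declared as Map<String, List<Team>> rather than the asker's HashMap<String, ArrayList<Team>>, and computeIfAbsent assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class League {

    // Hypothetical stand-in for the asker's Team class.
    static class Team {
        final String name;

        Team(String name) {
            this.name = name;
        }
    }

    // Division name -> teams in that division.
    private final Map<String, List<Team>> teams = new HashMap<>();

    // Adds a team, creating the division's list on first use.
    void addTeam(String division, Team team) {
        teams.computeIfAbsent(division, d -> new ArrayList<>()).add(team);
    }

    // Looks up the list for one division; an unknown division yields an empty list.
    List<Team> teamsIn(String division) {
        List<Team> results = teams.get(division);
        return results != null ? results : new ArrayList<Team>();
    }

    public static void main(String[] args) {
        League league = new League();
        league.addTeam("north", new Team("Lions"));
        league.addTeam("north", new Team("Bears"));
        System.out.println(league.teamsIn("north").size());
        System.out.println(league.teamsIn("south").size());
    }
}
```

Returning an empty list for an unknown division is a design choice made for the example; returning null or throwing would be equally defensible.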
I've been searching SO about this question, and most answers only deal with comparing two arrays using a nested loop. My problem is much the same but on a bigger scale. Suppose I have a hundred or a thousand users on my app, and each user has a list of items they want.
Something like this
User1 = {apple,orange,guava,melon,durian}
User2 = {apple, melon,banana,lemon,mango}
User3 = {orange,carrots,guava,melon,tomato}
User4 = {mango,carrots,tomato,apple,durian}
.
.
Nuser = ...
I want to see how many times apple or orange is listed across all the users' arrays. So I am basically comparing, but on a bigger scale. The data isn't static either: a user can enter a fruit that the developers don't know about, several users may enter that same unknown fruit, and the system should still figure out how many times it was listed. Keep in mind that this is dynamic. The number of users could reach, for example, 100, depending on how popular the app becomes. I can't afford a nested loop here.
PS: this is not the exact problem, but it is the simplest scenario I can think of to explain my problem.
PS: just to clarify, I don't intend to use a third-party library such as Guava either. I am having a problem with ProGuard when using it.
Edit
Just read that the original poster cannot use Java 8, which is a pity, because it would really make this very easy!
Java 7 solution
final Map<String, Integer> occurencesByFruit = new HashMap<>();
for (User user : users) {
String[] fruits = user.getFruits();
for (String fruit : fruits) {
final Integer currentCount = occurencesByFruit.get(fruit);
if (currentCount == null) {
occurencesByFruit.put(fruit, 1);
} else {
occurencesByFruit.put(fruit, currentCount + 1);
}
}
}
Java 8 solution
I'd stream the users, flatMap() to the actual fruit elements, and then use Collectors.groupingBy() with a downstream collector Collectors.counting().
This will give you a Map where the keys are the fruits, and the values are the occurrences of each fruit throughout all your users.
List<User> users = Arrays.asList(/* ... */);
final Map<String, Long> occurencesByFruit = users.stream()
.map(User::getFruits)
.flatMap(Arrays::stream)
.collect(Collectors.groupingBy(f -> f, Collectors.counting()));
Seems it is a good possibility to use HashMap<Item, Integer> fruits. You could iterate over all Users (you would need to store all Users in some kind of list, such as ArrayList<User> users) and check the list of items chosen by each User (I suppose User should have a field ArrayList<Item> items in its body to store items). You could achieve it with something like that:
for (User user : users) { // for each User from users list
for (Item item : user.items) { // check each item chosen by this user
if (fruits.containsKey(item)) { // if the item is already present in the fruits HashMap, increment its count
int previousNumberOfItems = fruits.get(item);
fruits.put(item, previousNumberOfItems + 1);
} else { // otherwise record the first occurrence of this item
fruits.put(item, 1);
}
}
}
I would either create an ArrayList containing a HashMap with strings and ints or use two ArrayLists (one of type String and one of type Integer). Then you can iterate over every entry in each of the user arrays (this is only a simple nested loop). For every entry in the current user array you check if there is already the same entry in the ArrayList you created additionally. If so, you increment the respective int. If not, you add a string and an int. In the end, you have the number of occurrences of all the fruit strings in the added ArrayLists, which is, if I understood you correctly, just what you wanted.
I have two lists, each containing a large number (N) of objects:
List<Foo> objectsFromDB = {{MailId=100, Status=""}, {MailId=200, Status=""}, {MailId=300, Status=""} ... {MailId=N, Status=N}}
List<Foo> feedBackStatusFromCsvFiles = {{MailId=100, Status="OPENED"}, {MailId=200, Status="CLICKED"}, {MailId=300, Status="HARDBOUNCED"} ... {MailId=N, Status=N}}
Little Insights:
objectsFromDB retrieves rows from my database by calling a Hibernate method.
feedBackStatusFromCsvFiles calls a CSV parser method and unmarshals the rows to Java objects.
My entity class Foo has all setters and getters. So I know that the basic idea is to use a foreach like this:
for (Foo fooDB : objectsFromDB) {
for(Foo fooStatus: feedBackStatusFromCsvFiles){
if(fooDB.getMailId().equals(fooStatus.getMailId())){
fooDB.setStatus(fooStatus.getStatus());
}
}
}
As far as my modest knowledge as a junior developer goes, I think doing it like this is very bad practice. Should I implement a Comparator and use it for iterating over my list of objects? Should I also check for null cases?
Thanks to all of you for your answers!
This assumes Java 8 and takes into account that feedbackStatus may contain more than one element with the same ID.
Transform the list into a Map using ID as key and having a list of elements.
Iterate the list and use the Map to find all messages.
The code would be:
final Map<String, List<Foo>> listMap =
objectsFromDB.stream().collect(
Collectors.groupingBy(item -> item.getMailId())
);
for (final Foo feedBackStatus : feedBackStatusFromCsvFiles) {
listMap.getOrDefault(feedBackStatus.getMailId(), Collections.emptyList()).forEach(item -> item.setStatus(feedBackStatus.getStatus()));
}
Use maps from collections to avoid the nested loops.
List<Foo> aList = new ArrayList<>();
List<Foo> bList = new ArrayList<>();
for(int i = 0;i<5;i++){
Foo foo = new Foo();
foo.setId((long) i);
foo.setValue("FooA"+String.valueOf(i));
aList.add(foo);
foo = new Foo();
foo.setId((long) i);
foo.setValue("FooB"+String.valueOf(i));
bList.add(foo);
}
final Map<Long,Foo> bMap = bList.stream().collect(Collectors.toMap(Foo::getId, Function.identity()));
aList.stream().forEach(it->{
Foo bFoo = bMap.get(it.getId());
if( bFoo != null){
it.setValue(bFoo.getValue());
}
});
The only other solution would be to have the DTO layer return a map of MailId -> Foo object, so that you could stream the CSV list and simply look up the DB Foo object. Otherwise, the expense of sorting or iterating over both of the lists is not worth the trade-off in running time. That holds true until the map definitively causes a memory constraint on the platform; until then, let the garbage collector do its job, and do yours as easily as possible.
Given that your lists may contain tens of thousands of elements, you should be concerned that your simple nested-loop approach will be too slow. It will certainly perform a lot more comparisons than it needs to.
If memory is comparatively abundant, then the fastest suitable approach would probably be to form a Map from mailId to (list of) corresponding Foo from one of your lists, somewhat as @MichaelH suggested, and to use that to match mailIds. If mailId values are not certain to be unique in one or both lists, however, then you'll need something a bit different from Michael's specific approach. Even if mailIds are sure to be unique within both lists, it will be a bit more efficient to form only one map.
For the most general case, you might do something like this:
// The initial capacity is set (more than) large enough to avoid any rehashing
Map<Long, List<Foo>> dbMap = new HashMap<>(3 * objectsFromDB.size() / 2);
// Populate the map
// This could be done more efficiently if the objects were ordered by mailId,
// which perhaps the DB could be enlisted to ensure.
for (Foo foo : objectsFromDB) {
Long mailId = foo.getMailId();
List<Foo> foos = dbMap.get(mailId);
if (foos == null) {
foos = new ArrayList<>();
dbMap.put(mailId, foos);
}
foos.add(foo);
}
// Use the map
for (Foo fooStatus: feedBackStatusFromCsvFiles) {
List<Foo> dbFoos = dbMap.get(fooStatus.getMailId());
if (dbFoos != null) {
String status = fooStatus.getStatus();
// Iterate over only the Foos that we already know have matching Ids
for (Foo fooDB : dbFoos) {
fooDB.setStatus(status);
}
}
}
On the other hand, if you are space-constrained, so that creating the map is not viable, yet it is acceptable to reorder your two lists, then you should still get a performance improvement by sorting both lists first. Presumably you would use Collections.sort() with an appropriate Comparator for this purpose. Then you would obtain an Iterator over each list, and use them to iterate cooperatively over the two lists. I present no code, but it would be reminiscent of the merge step of a merge sort (but the two lists are not actually merged; you only copy status information from one to the other). But this makes sense only if the mailIds from feedBackStatusFromCsvFiles are all distinct, for otherwise the expected result of the whole task is not well determined.
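The sort-then-walk idea described above might look roughly like the following. It is a sketch under stated assumptions, not the answerer's code: the Foo class here is a hypothetical stand-in with a Long mailId, the class and method names are invented, both lists are assumed to be mutable and random-access (for example ArrayList), indices are used instead of Iterators for brevity, and mailIds in the CSV list are assumed to be distinct.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortedMerge {

    // Hypothetical stand-in for the asker's entity.
    static class Foo {
        private final Long mailId;
        private String status;

        Foo(Long mailId, String status) {
            this.mailId = mailId;
            this.status = status;
        }

        Long getMailId() { return mailId; }
        String getStatus() { return status; }
        void setStatus(String status) { this.status = status; }
    }

    // Sorts both lists in place by mailId, then walks them together once,
    // copying the status from the CSV entry to every DB entry with the same id.
    static void copyStatuses(List<Foo> fromDb, List<Foo> fromCsv) {
        Comparator<Foo> byMailId = Comparator.comparing(Foo::getMailId);
        fromDb.sort(byMailId);
        fromCsv.sort(byMailId);

        int i = 0;
        int j = 0;
        while (i < fromDb.size() && j < fromCsv.size()) {
            int cmp = fromDb.get(i).getMailId().compareTo(fromCsv.get(j).getMailId());
            if (cmp < 0) {
                i++; // this DB row has no feedback entry
            } else if (cmp > 0) {
                j++; // this feedback entry has no DB row
            } else {
                fromDb.get(i).setStatus(fromCsv.get(j).getStatus());
                i++; // keep j, so further DB rows with the same id get the same status
            }
        }
    }

    public static void main(String[] args) {
        List<Foo> db = new ArrayList<>(Arrays.asList(new Foo(200L, ""), new Foo(100L, "")));
        List<Foo> csv = new ArrayList<>(Arrays.asList(new Foo(100L, "OPENED")));
        copyStatuses(db, csv);
        for (Foo foo : db) {
            System.out.println(foo.getMailId() + " " + foo.getStatus());
        }
    }
}
```

The two sorts dominate the cost at O(n log n); the walk itself is linear and needs no extra memory beyond what the sort uses.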
Your problem is merging each Foo's last status into the database objects, so you can do it in two steps, which makes the code clearer and more readable:
Filter the Foos that need to be merged.
Merge those Foos with the last status.
// Because only the last status matters, you don't need groupingBy to build a more complex Map.
Map<String, String> lastStatus = feedBackStatusFromCsvFiles.stream()
.collect(toMap(Foo::getMailId, Foo::getStatus
, (previous, current) -> current));
//find out Foos in Database that need to merge
Predicate<Foo> fooThatNeedMerge = it -> lastStatus.containsKey(it.getMailId());
// Merge each Foo's last status from the CSV.
Consumer<Foo> mergingFoo = it -> it.setStatus(lastStatus.get(it.getMailId()));
objectsFromDB.stream().filter(fooThatNeedMerge).forEach(mergingFoo);
Given a list in which each entry is an object that looks like
class Entry {
public String id;
public Object value;
}
Multiple entries could have the same id. I need a map where I can access all values that belong to a certain id:
Map<String, List<Object>> map;
My algorithm to achieve this:
for (Entry entry : listOfEntries) {
List<Object> listOfValues;
if (map.contains(entry.id)) {
listOfValues = map.get(entry.id);
} else {
listOfValues = new List<Object>();
map.put(entry.id, listOfValues);
}
listOfValues.add(entry.value);
}
Simply: I transform a list that looks like
ID | VALUE
---+------------
a | foo
a | bar
b | foobar
To a map that looks like
a--+- foo
'- bar
b---- foobar
As you can see, contains is called for each entry of the source list. That's why I wonder if I could improve my algorithm, if I pre-sort the source list and then do this:
List<Object> listOfValues = new List<Object>();
String prevId = null;
for (Entry entry : listOfEntries) {
if (prevId != null && !prevId.equals(entry.id)) {
map.put(prevId, listOfValues);
listOfValues = new List<Object>();
}
listOfValues.add(entry.value);
prevId = entry.id;
}
if (prevId != null) map.put(prevId, listOfValues);
The second solution has the advantage that I don't need to call map.contains() for every entry, but the disadvantage that I have to sort beforehand. Furthermore, the first algorithm is easier to implement and less error-prone, since the second one needs some extra code after the actual loop.
Therefore my question is: Which method has better performance?
The examples are written in Java pseudo code but the actual question applies to other programming languages as well.
If you have a hash map and a very large number of entries, then inserting items one by one will be faster than sorting and inserting them list by list (O(n) vs. O(n log n)). If you use a tree-based map, then the complexity is the same for both approaches.
However, I really doubt you have a large enough number of entries for this to matter; at realistic sizes, memory access patterns and the speed of the compare and hash functions come into play. You have two options: ignore it, since the difference is not going to be significant, or benchmark both options and see which one works better on your system. If you don't have millions of entries, I would ignore the issue and go with whatever is easier to understand.
Don't presort. Even fast sorting algorithms like quicksort take, on average, O(n log n) for n items. Afterwards, you still need O(n) to walk the list. contains on a (hash) map takes constant time (check out this question), so don't worry about it. Walk the list in linear time and use contains.
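For reference, the single-pass approach recommended here can be written in compilable Java roughly as follows. This is a sketch: the constructor on Entry and the GroupById wrapper class are added for the example, and computeIfAbsent (Java 8 or later) replaces the explicit contains/get/put sequence from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupById {

    // Same shape as the Entry class in the question; the constructor is added for the demo.
    static class Entry {
        public String id;
        public Object value;

        Entry(String id, Object value) {
            this.id = id;
            this.value = value;
        }
    }

    // Groups the values by id in one linear pass, without sorting first.
    static Map<String, List<Object>> group(List<Entry> listOfEntries) {
        Map<String, List<Object>> map = new HashMap<>();
        for (Entry entry : listOfEntries) {
            // Creates the list on the first occurrence of an id, then appends to it.
            map.computeIfAbsent(entry.id, id -> new ArrayList<>()).add(entry.value);
        }
        return map;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("a", "foo"), new Entry("a", "bar"), new Entry("b", "foobar"));
        System.out.println(group(entries).get("a"));
    }
}
```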
Would like to offer another solution using streams
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;
Map<String, List<Object>> map = listOfEntries.stream()
.collect(groupingBy(entry -> entry.id, mapping(entry -> entry.value, toList())));
This code is more declarative: it only specifies that the List should be transformed into a Map.
It is then the library's responsibility to perform the transformation efficiently.
I have a HashSet of Strings in the format: something_something_name="value"
Set<String> name= new HashSet<String>();
Farther down in my code I want to check if a String "name" is included in the HashSet. In this little example, if I'm checking to see if "name" is a substring of any of the values in the HashSet, I'd like it to return true.
I know that .contains() won't work since that works using .equals(). Any suggestions on the best way to handle this would be great.
With your existing data structure, the only way is to iterate over all entries checking each one in turn.
If that's not good enough, you'll need a different data structure.
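The linear scan over the existing set could be sketched as below. The class and method names are made up for the example; String.contains and Stream.anyMatch are standard library calls, and anyMatch stops at the first match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SubstringLookup {

    // Returns true if any element of the set contains the given substring.
    // This is O(n) in the size of the set; a HashSet cannot do better for substrings.
    static boolean anyContains(Set<String> values, String needle) {
        return values.stream().anyMatch(v -> v.contains(needle));
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(Arrays.asList(
                "something_something_name=\"value\"",
                "something_something_other=\"x\""));
        System.out.println(anyContains(names, "name"));    // prints true
        System.out.println(anyContains(names, "missing")); // prints false
    }
}
```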
You can build a map (name -> strings) as follows:
Map<String, List<String>> name_2_keys = new HashMap<>();
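for (String entry : names) {
    // Each entry looks like something_something_name="value"
    String[] parts = entry.split("_");
    // parts[2] is name="value"; keep only the part before '='
    String name = parts[2].split("=")[0];
    List<String> keys = name_2_keys.get(name);
    if (keys == null) {
        keys = new ArrayList<>();
        name_2_keys.put(name, keys);
    }
    keys.add(entry);
}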
Then retrieve all the strings whose name part equals name:
List<String> keys = name_2_keys.get(name);
You can keep another map where name is the key and something_something_name is the value.
Thus, you would be able to move from name -> something_something_name -> value. If you want a single interface, you can write a wrapper class around these two maps, exposing the functionality you want.
I posted a MapFilter class here a while ago.
You could use it like:
MapFilter<String> something = new MapFilter<String>(yourMap, "something_");
MapFilter<String> something_something = new MapFilter<String>(something, "something_");
You will need to make your container into a Map first.
This would only be worthwhile doing if you look for the substrings many times.