This is mainly a question intended for me to learn about various performant ways of filtering and assigning objects to Lists.
Assume
public class A implements Comparable<A> {
private String id;
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
@Override
public int compareTo(A o) {
return o.getId().compareTo(this.getId());
}
}
public class B implements Comparable<B>{
private String id;
private List<A> aList = new ArrayList<>();
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public void addA(A a)
{
aList.add(a);
}
@Override
public int compareTo(B o) {
return o.getId().compareTo(this.getId());
}
}
public class Main {
public static void main(String[] args) {
SortedSet<A> aSet = new TreeSet<>();
SortedSet<B> bSet = new TreeSet<>();
for(int i=0;i<100000;i++)
{
UUID uuid = UUID.randomUUID();
String uuidAsString = uuid.toString();
A a1 = new A();
a1.setId(uuidAsString);
aSet.add(a1);
A a2 = new A();
a2.setId(uuidAsString);
aSet.add(a2);
B b = new B();
b.setId(uuidAsString);
bSet.add(b);
}
//this is where the performance part comes in
//scenario: For each B I want to find A whose Id matches B's Id, and assign it to B
//assume B can have 1-5 instances of A (even though for this example I only initialized 2)
bSet.parallelStream().forEach(b -> {
aSet.parallelStream().filter(a -> {
return b.getId().equals(a.getId());
}).forEach(a -> {
b.addA(a);
});
});
}
}
The solution I came up with was to combine parallel streams and filters to find the matching IDs between the two types of objects, and then to loop through the filtered results to add the instances of A to B.
I used TreeSets here because I thought the ordered IDs might help speed things up, same reason I used parallelStreams.
This is mostly abstracted out from a scenario in a project I am doing at the office, which I can't post here. The classes in the actual project have a lot more variables and, in the worst case, have sublists of lists (I resolved that using flatMaps in streams).
However my inexperienced gut tells me there is a more performant way to solve this problem.
I am primarily looking for practical ways to speed this up.
Some ways I thought of speeding this up:
Switch the lists and sets to Eclipse Collections
Assuming the starting point of these classes are CSV files -> Maybe write an apache spark application that will map these(I assumed that Spark could have some internal clever way of doing this faster than Streams).
I dunno......write them all to sql tables....map them via foreign keys and then query them again?
Speed is the name of the game. Solutions using vanilla Java, different libraries (like Eclipse Collections), or entire engines like Spark are acceptable.
Assume the minimum list size is at least 50,000.
Bonus complexity: You can add another class 'C', with multiple instances of 'B' in it. My inexperienced self can only think of writing another streaming operation similar to A->B and running it after the first stream is done. Is there a way to combine both the A->B and B->C operations so that they happen at once? That would definitely speed things up.
Sorry about my inexperienced self and sorry again if this is a duplicate too
In your code, you use b.addA(a); where b is an instance of B while B doesn't have a method addA(A). Is B supposed to keep a list of A's?
However, the answer to your question is hashing. You are looking for a multimap, to be specific. As a quick fix you can use a TreeMap (or a HashMap, since the lookup does not need ordered keys) that stores a List of A's by their id:
public static void main(String[] args) {
TreeMap<String, ArrayList<A>> aSet = new TreeMap<>();
ArrayList<B> bSet = new ArrayList<>();
for (int i = 0; i < 100000; i++) {
UUID uuid = UUID.randomUUID();
String uuidAsString = uuid.toString();
A a1 = new A();
a1.setId(uuidAsString);
ArrayList<A> list = aSet.get(a1.getId());
if (list == null) {
list = new ArrayList<>();
aSet.put(a1.getId(), list);
}
list.add(a1);
A a2 = new A();
a2.setId(uuidAsString);
list = aSet.get(a2.getId());
if (list == null) {
list = new ArrayList<>();
aSet.put(a2.getId(), list);
}
list.add(a2);
B b = new B();
b.setId(uuidAsString);
bSet.add(b);
}
for (B b : bSet) {
System.out.println(aSet.get(b.getId()));
}
}
Please note that this isn't a good implementation; you should instead write your own multimap or use the one in Guava.
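The indexing idea above can be sketched with a plain HashMap and computeIfAbsent. This is only a minimal sketch: the A and B classes here are simplified stand-ins for the ones in the question, and the class name MultimapJoin and the method assign are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultimapJoin {
    // Simplified stand-ins for the question's A and B (assumption: only the id matters here).
    static class A {
        final String id;
        A(String id) { this.id = id; }
    }

    static class B {
        final String id;
        final List<A> aList = new ArrayList<>();
        B(String id) { this.id = id; }
    }

    // Index every A by id in one pass, then attach the matching list to each B.
    // Work grows roughly with |A| + |B| rather than |A| * |B|.
    static void assign(List<A> as, List<B> bs) {
        Map<String, List<A>> byId = new HashMap<>();
        for (A a : as) {
            byId.computeIfAbsent(a.id, k -> new ArrayList<>()).add(a);
        }
        for (B b : bs) {
            b.aList.addAll(byId.getOrDefault(b.id, Collections.<A>emptyList()));
        }
    }

    public static void main(String[] args) {
        List<A> as = Arrays.asList(new A("x"), new A("y"), new A("x"));
        List<B> bs = Arrays.asList(new B("x"), new B("y"), new B("z"));
        assign(as, bs);
        for (B b : bs) {
            System.out.println(b.id + " -> " + b.aList.size());
        }
    }
}
```

Each B does a single map lookup instead of scanning all A's, which is where the speed-up over the nested parallel streams would come from.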
I used the method below to make a copy of a list. As you can see from the output, the two lists are independent. Did I get something wrong, or are they really independent? I did some research on the internet, and it told me this method should pass by reference (so list 'a' and 'copy' should be dependent).
public static void main(String[] args) {
ArrayList<String> a = new ArrayList<>(Arrays.asList("X", "X"));
ArrayList<String> copy = new ArrayList<>(a);
copy.set(0, "B");
copy.remove(copy.size()-1);
System.out.println(a);
System.out.println(copy);
}
Output:
[X, X]
[B]
As per the documentation, the ArrayList copy constructor:
Constructs a list containing the elements of the specified collection, in the order they are returned by the collection's iterator.
Modifying one list has no effect on the other, which your code confirms.
Yes, the elements are copied by reference, so 'a' and 'copy' share the same element objects, even though the two lists themselves are separate. But these two operations don't show that sharing:
copy.set(0, "B");
copy.remove(copy.size()-1);
See if the following code helps you understand:
public static void main(String[] args) {
Process process = new Process(1);
Process process2 = new Process(2);
ArrayList<Process> a = new ArrayList<>(Arrays.asList(process, process2));
ArrayList<Process> copy = new ArrayList<>(a);
copy.get(0).id = 10;
// This proves that both ArrayLists maintain the same Process object at this point
// output:
// [Id:10, Id:2]
// [Id:10, Id:2]
System.out.println(a);
System.out.println(copy);
// copy.remove(copy.size() - 1) or copy.set(0, process3) doesn't affect another ArrayList
Process process3 = new Process(3);
process3.id = 100;
copy.set(0, process3);
copy.remove(copy.size() - 1);
// output:
// [Id:10, Id:2]
// [Id:100]
System.out.println(a);
System.out.println(copy);
}
static class Process {
public int id;
public Process(int id) {
this.id = id;
}
@Override
public String toString() {
return "Id:" + id;
}
}
Modifying a local variable in forEach gives a compile error:
Normal
int ordinal = 0;
for (Example s : list) {
s.setOrdinal(ordinal);
ordinal++;
}
With Lambda
int ordinal = 0;
list.forEach(s -> {
s.setOrdinal(ordinal);
ordinal++;
});
Any idea how to resolve this?
Use a wrapper
Any kind of wrapper is good.
With Java 10+, use this construct as it's very easy to set up:
var wrapper = new Object(){ int ordinal = 0; };
list.forEach(s -> {
s.setOrdinal(wrapper.ordinal++);
});
With Java 8+, use either an AtomicInteger:
AtomicInteger ordinal = new AtomicInteger(0);
list.forEach(s -> {
s.setOrdinal(ordinal.getAndIncrement());
});
... or an array:
int[] ordinal = { 0 };
list.forEach(s -> {
s.setOrdinal(ordinal[0]++);
});
Note: be very careful if you use a parallel stream. You might not end up with the expected result. Other solutions like Stuart's might be more adapted for those cases.
For types other than int
Of course, this is still valid for types other than int.
For instance, with Java 10+:
var wrapper = new Object(){ String value = ""; };
list.forEach(s->{
wrapper.value += "blah";
});
Or if you're stuck with Java 8 or 9, use the same kind of construct as we did above, but with an AtomicReference...
AtomicReference<String> value = new AtomicReference<>("");
list.forEach(s -> {
value.set(value.get() + s);
});
... or an array:
String[] value = { "" };
list.forEach(s-> {
value[0] += s;
});
This is fairly close to an XY problem. That is, the question being asked is essentially how to mutate a captured local variable from a lambda. But the actual task at hand is how to number the elements of a list.
In my experience, upward of 80% of the time there is a question of how to mutate a captured local from within a lambda, there's a better way to proceed. Usually this involves reduction, but in this case the technique of running a stream over the list indexes applies well:
IntStream.range(0, list.size())
.forEach(i -> list.get(i).setOrdinal(i));
If you only need to pass the value from the outside into the lambda, and not get it out, you can do it with a regular anonymous class instead of a lambda:
list.forEach(new Consumer<Example>() {
int ordinal = 0;
public void accept(Example s) {
s.setOrdinal(ordinal);
ordinal++;
}
});
As the variables used from outside the lambda have to be (implicitly) final, you have to use something like AtomicInteger or write your own data structure.
See
https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html#accessing-local-variables.
An alternative to AtomicInteger is to use an array (or any other object able to store a value):
final int ordinal[] = new int[] { 0 };
list.forEach ( s -> s.setOrdinal ( ordinal[ 0 ]++ ) );
But see Stuart's answer: there might be a better way to deal with your case.
Yes, you can modify local variables from inside lambdas (in the way shown by the other answers), but you should not do it. Lambdas have been made for functional style of programming and this means: No side effects. What you want to do is considered bad style. It is also dangerous in case of parallel streams.
You should either find a solution without side effects or use a traditional for loop.
If you are on Java 10, you can use var for that:
var ordinal = new Object() { int value; };
list.forEach(s -> {
s.setOrdinal(ordinal.value);
ordinal.value++;
});
You can wrap it up to work around the compiler, but please remember that side effects in lambdas are discouraged.
To quote the javadoc
Side-effects in behavioral parameters to stream operations are, in general, discouraged, as they can often lead to unwitting violations of the statelessness requirement
A small number of stream operations, such as forEach() and peek(), can operate only via side-effects; these should be used with care
I had a slightly different problem. Instead of incrementing a local variable in the forEach, I needed to assign an object to the local variable.
I solved this by defining a private inner domain class that wraps both the list I want to iterate over (countryList) and the output I hope to get from that list (foundCountry). Then using Java 8 "forEach", I iterate over the list field, and when the object I want is found, I assign that object to the output field. So this assigns a value to a field of the local variable, not changing the local variable itself. I believe that since the local variable itself is not changed, the compiler doesn't complain. I can then use the value that I captured in the output field, outside of the list.
Domain Object:
public class Country {
private int id;
private String countryName;
public Country(int id, String countryName){
this.id = id;
this.countryName = countryName;
}
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getCountryName() {
return countryName;
}
public void setCountryName(String countryName) {
this.countryName = countryName;
}
}
Wrapper object:
private class CountryFound{
private final List<Country> countryList;
private Country foundCountry;
public CountryFound(List<Country> countryList, Country foundCountry){
this.countryList = countryList;
this.foundCountry = foundCountry;
}
public List<Country> getCountryList() {
return countryList;
}
public void setCountryList(List<Country> countryList) {
this.countryList = countryList;
}
public Country getFoundCountry() {
return foundCountry;
}
public void setFoundCountry(Country foundCountry) {
this.foundCountry = foundCountry;
}
}
Iterate operation:
int id = 5;
CountryFound countryFound = new CountryFound(countryList, null);
countryFound.getCountryList().forEach(c -> {
if(c.getId() == id){
countryFound.setFoundCountry(c);
}
});
System.out.println("Country found: " + countryFound.getFoundCountry().getCountryName());
You could remove the wrapper class method "setCountryList()" and make the field "countryList" final, but I did not get compilation errors leaving these details as-is.
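For comparison, the same lookup can be written without mutating anything from inside the lambda, using filter and findFirst. This is a sketch of a different technique, not the approach above: the Country class is a cut-down stand-in, and FindCountry and findById are hypothetical names. Note that it returns the first match, whereas the forEach version keeps the last one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class FindCountry {
    // Minimal stand-in for the Country class above (assumption: only id and name are needed).
    static class Country {
        final int id;
        final String countryName;
        Country(int id, String countryName) {
            this.id = id;
            this.countryName = countryName;
        }
    }

    // Returns the first country with the given id, or an empty Optional if none matches.
    static Optional<Country> findById(List<Country> countries, int id) {
        return countries.stream()
                .filter(c -> c.id == id)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Country> countryList = Arrays.asList(new Country(1, "Peru"), new Country(5, "Chile"));
        String name = findById(countryList, 5).map(c -> c.countryName).orElse("none");
        System.out.println("Country found: " + name);
    }
}
```

Returning an Optional also avoids the NullPointerException the wrapper version would hit when no country has the requested id.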
To have a more general solution, you can write a generic Wrapper class:
public static class Wrapper<T> {
public T obj;
public Wrapper(T obj) { this.obj = obj; }
}
...
Wrapper<Integer> w = new Wrapper<>(0);
this.forEach(s -> {
s.setOrdinal(w.obj);
w.obj++;
});
(this is a variant of the solution given by Almir Campos).
In this specific case it is not a good solution, as Integer is worse than int for your purpose, but I think the approach is more general.
I've got two domain objects of the same type. They contain enums, primitive arrays, and other objects, and there's a list in the hierarchy there too.
I need something to extract a third object of the same type that only contains their differences, almost like a mask that contains only their changes. Anything that hasn't changed should be set to null.
Everything points to Apache BeanUtils, but I can't find exactly what I'm looking for. Any suggestions?
Edit#1
Example to clarify :
If obj1 is the original, and obj2 is the updated version. Then if obj1.value is equal to obj2.value then obj3.value will be null. If obj1.value is not equal to obj2.value then obj3.value will be set to the value of obj2.value
Edit#2
Ideally it should be abstract and in no way need to know what type of object the comparison is being run on, as this could be used for different objects in the future.
If one of the updated values is set to null, then it can be ignored as if it's not a change.
Your question is interesting to me. I searched quite a bit for something that meets your goal and found a small library that does it. The library is hosted on Google Code and is called jettison.
Its main class, Diff4J, has a diff method that compares two objects and finds the differences.
I then wrote the following code for your goal.
First define a model object named Bean:
public class Bean
{
private String name;
private String family;
public String getName()
{
return name;
}
public void setName(String name)
{
this.name = name;
}
public String getFamily()
{
return family;
}
public void setFamily(String family)
{
this.family = family;
}
public Bean()
{
}
public Bean(String name, String family )
{
this.name = name;
this.family = family;
}
}
Then write a test class as follows:
public static void main(String[] args) throws IllegalAccessException,
InvocationTargetException
{
Bean bean_1 = new Bean("Sara", "clooney");
Bean bean_2 = new Bean("Sally", "clooney");
Diff4J comparator = new Diff4J();
Collection<ChangeInfo> diffs = comparator.diff(bean_1, bean_2);
Bean final_result = new Bean();
for(ChangeInfo c : diffs)
{
String filedName = c.getFieldName();
Object to_value = c.getTo();
Object from_value = c.getFrom();
BeanUtilsBean.getInstance().setProperty(final_result, filedName, to_value);
}
System.out.println(final_result);
}
If you run this code, you will see the following result:
Bean [family=null, name=Sally]
This result is what you are after.
Note: In the last line of the loop, I used BeanUtilsBean from Apache Commons BeanUtils to fill the object by reflection.
This utility has a problem: it doesn't support deep comparison (or maybe I couldn't find it), so you have to simulate that yourself.
To see this library, go to http://code.google.com/p/jettison/.
I hope this answer helps you.
It can be done without any external library ;)
Let's take a trivial bean
public class Bean {
public String value;
public List<String> list;
public String[] array;
public EnumType enumValue;
}
and add a static (factory) method:
public static Bean createDelta(Bean master, Bean variant) {
    Bean delta = new Bean();
    // fields are simple
    if (!master.value.equals(variant.value))
        delta.value = variant.value;
    // enums are simple too
    if (master.enumValue != variant.enumValue)
        delta.enumValue = variant.enumValue;
    // for arrays it gets slightly difficult, because arrays may vary in size
    int arraySize = Math.max(master.array.length, variant.array.length);
    delta.array = new String[arraySize];
    for (int i = 0; i < arraySize; i++) {
        // positions that are missing from variant stay null
        if (i < variant.array.length
                && (i >= master.array.length || !master.array[i].equals(variant.array[i]))) {
            delta.array[i] = variant.array[i];
        }
    }
    // same pattern for lists, except we have to add null explicitly
    int listSize = Math.max(master.list.size(), variant.list.size());
    delta.list = new ArrayList<String>();
    for (int i = 0; i < listSize; i++) {
        if (i < variant.list.size()
                && (i >= master.list.size() || !master.list.get(i).equals(variant.list.get(i)))) {
            delta.list.add(variant.list.get(i));
        } else {
            delta.list.add(null);
        }
    }
    return delta;
}
(Note - not tested, no IDE/compiler at hand - but it shows a general approach)
I hope in a good manner :-)
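If the delta must not know the bean type in advance, as Edit#2 of the question asks, the same idea could be sketched with plain reflection. This is only a rough sketch under several assumptions: the type has a no-arg constructor, only fields declared directly on the class are considered, primitive fields are skipped because they cannot be null, and the comparison is shallow (nested objects are compared with equals). ReflectiveDelta and Person are hypothetical names.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;

public class ReflectiveDelta {
    // Example bean used only for the demo; any class with a no-arg constructor would do.
    static class Person {
        String name;
        String family;
        Person() { }
        Person(String name, String family) {
            this.name = name;
            this.family = family;
        }
    }

    // Builds a new T whose fields are null where master and variant agree
    // and hold the variant's value where they differ.
    static <T> T createDelta(T master, T variant, Class<T> type) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            T delta = ctor.newInstance();
            for (Field f : type.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic() || f.getType().isPrimitive()) {
                    continue; // statics are not per-object state; primitives cannot be null
                }
                f.setAccessible(true);
                Object before = f.get(master);
                Object after = f.get(variant);
                // A null in the variant is treated as "no change", as the question asks.
                // deepEquals compares array contents instead of array references.
                if (after != null && !Objects.deepEquals(before, after)) {
                    f.set(delta, after);
                }
            }
            return delta;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not build delta for " + type, e);
        }
    }

    public static void main(String[] args) {
        Person delta = createDelta(new Person("Sara", "clooney"), new Person("Sally", "clooney"), Person.class);
        System.out.println("name=" + delta.name + ", family=" + delta.family);
    }
}
```

Unlike the hand-written createDelta, this copies a changed array or list whole rather than element by element, so it trades precision for generality.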
I wrote this piece of code.
What I wished to do, is to build something like "cache".
I assumed that I had to watch out for different threads, as many calls might reach that class, so I tried the ThreadLocal functionality.
Base pattern is
have "MANY SETS of VECTOR"
The vector holds something like:
VECTOR.FieldName = "X"
VECTOR.FieldValue= "Y"
So many Vector objects in a set. Different set for different calls from different machines, users, objects.
private static CacheVector instance = null;
private static SortedSet<SplittingVector> s = null;
private static TreeSet<SplittingVector> t = null;
private static ThreadLocal<SortedSet<SplittingVector>> setOfVectors = new ThreadLocal<SortedSet<SplittingVector>>();
private static class MyComparator implements Comparator<SplittingVector> {
public int compare(SplittingVector a, SplittingVector b) {
return 1;
}
// No need to override equals.
}
private CacheVector() {
}
public static SortedSet<SplittingVector> getInstance(SplittingVector vector) {
if (instance == null) {
instance = new CacheVector();
//TreeSet<SplittingVector>
t = new TreeSet<SplittingVector>(new MyComparator());
t.add(vector);
s = Collections.synchronizedSortedSet(t);//Sort the set of vectors
CacheVector.assign(s);
} else {
//TreeSet<SplittingVector> t = new TreeSet<SplittingVector>();
t.add(vector);
s = Collections.synchronizedSortedSet(t);//Sort the set of vectors
CacheVector.assign(s);
}
return CacheVector.setOfVectors.get();
}
public SortedSet<SplittingVector> retrieve() throws Exception {
SortedSet<SplittingVector> set = setOfVectors.get();
if (set == null) {
throw new Exception("SET IS EMPTY");
}
return set;
}
private static void assign(SortedSet<SplittingVector> nSet) {
CacheVector.setOfVectors.set(nSet);
}
So... I have it in the attach and I use it like this:
SortedSet<SplittingVector> cache = CacheVector.getInstance(bufferedline);
The nice part: bufferedline is a line from the data files, split on some delimiter. Files can be of any size.
So how do you see this code? Should I be worried?
I apologise for the size of this message!
Writing correct multi-threaded code is not that easy (your singleton, for instance, is not thread-safe), so try to rely on existing solutions if possible. If you're searching for a thread-safe cache implementation in Java, check out this LinkedHashMap. You can use it to implement an LRU cache, and Collections.synchronizedMap() can make it thread-safe.
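A minimal sketch of that suggestion, assuming a fixed maximum size: LinkedHashMap in access order with removeEldestEntry overridden, wrapped in Collections.synchronizedMap. The class name LruCacheSketch and the factory method newLruCache are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSketch {
    // Creates a synchronized LRU map that evicts the least recently accessed entry
    // once it holds more than maxEntries entries.
    static <K, V> Map<K, V> newLruCache(final int maxEntries) {
        // The third constructor argument (true) switches the map to access order.
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
        return Collections.synchronizedMap(lru);
    }

    public static void main(String[] args) {
        Map<String, String> cache = newLruCache(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so that "b" becomes the eldest entry
        cache.put("c", "3"); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

Individual calls on the returned map are synchronized, but iterating over it still requires synchronizing on the map manually, as the Collections.synchronizedMap documentation describes.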
I have two Collection objects, I want to associate each object of these two in a readable way (HashMap, Object created on purpose, you choose).
I was thinking of two loops one nested into the other, but maybe it's a well known problem and has a commonly understandable solution...
What if the number of Collection objects raises above two?
EDIT after Joseph Daigle comment: The items of the Collection objects are all of the same type, they are rooms of hotels found to be bookable under certain conditions.
Collection<Room> roomsFromA = getRoomsFromA();
Collection<Room> roomsFromB = getRoomsFromB();
for(Room roomA : roomsFromA){
for(Room roomB : roomsFromB){
//add roomA and roomB to something, this is not important for what I need
//the important part is how you handle the part before
//especially if Collection objects number grows beyond two
}
}
EDIT 2: I'll try to explain better, sorry for the question being unclear.
Follows an example:
A user requests for a double and a single room.
The hotel has 3 double and 4 single rooms available.
I need to associate every "double room" with every "single room", because each Room has its own peculiarities, say internet, a more pleasant view, and so on. So I need to give the user all the combinations to let him choose.
This is the simple case, in which only two Collection of Room objects are involved, how do you manage the problem when say both hotel and user can offer / request more Room types?
What you are trying to do here is to get all possible permutations of choosing X from a set of Y. This is a well known problem in discrete mathematics and I think it is just called Combinatorial Mathematics.
To solve your problem you need to create a super collection containing all your Room types. If this is an array or a List you can then use this example to calculate all possible ways of choosing X from the set of Y. The example will give you the indices from the list/array.
Do the collections line up exactly?
Map<Object, Object> map = new HashMap<>();
for (int i = 0; i < c1.size(); i++) {
map.put(c1.get(i), c2.get(i)); // assumes c1 and c2 are Lists of equal size
}
Well, since I don't know if you will need to search for both of them having only one, the HashMap won't work.
I would create a class that receives a Pair.. sort of:
private static class Pair<K, T> {
private K one;
private T two;
public Pair(K one, T two) {
this.one = one;
this.two = two;
}
/**
* @return the one
*/
public K getOne() {
return one;
}
/**
* @return the two
*/
public T getTwo() {
return two;
}
}
And create a List with them.
Your example implies that the return value from "roomsFromB" is a subcollection of the return value of "roomsFromA", so it'd be more natural to model it that way:
class Room {
public Collection<Room> getRoomsFromB() { ...
}
which would then let you do :
//Collection rooms
for (Room a: rooms)
{
for (Room b : a.getRoomsFromB()) { ...
This is assuming that they're modeled hierarchically, of course. If they're not then this is inappropriate, but then the question you're asking, it seems to me, is really how to model the relationship between them, and you haven't yet made that explicit.
You might reconsider whether you need exactly this logic. You're introducing an O(n^2) operation, which can quickly get out of hand. (Technically O(mn), but I'm guessing m and n are roughly the same order.)
Is there another solution to your problem? Perhaps you could create a 'set' which includes all of A and all of B, and then each object in A and B could point to this set, instead?
I assume that:
Each element in collection 1 will match a single element in collection 2
The collections have the same size
The collections can be ordered and the order matches each element in both collections
Order both collections (in the same order) by the property that identifies each object.
Iterate through both collections with a single loop, build a relation object and add it into a new collection.
See if this helps you:
public static class Room {
private int number;
private String name;
public Room(int number, String name) {
super();
this.number = number;
this.name = name;
}
public int getNumber() {
return number;
}
public String getName() {
return name;
}
}
public static class RoomRelation {
private Room a;
private Room b;
public RoomRelation(Room a, Room b) {
super();
this.a = a;
this.b = b;
}
public Room getA() {
return a;
}
public Room getB() {
return b;
}
@Override
public String toString() {
return a.getName() + "(" + a.getNumber() + ") " + b.getName() + "(" + b.getNumber() + ")";
}
}
public static void main(String[] args) {
List<Room> roomsFromA = new ArrayList<Room>();
List<Room> roomsFromB = new ArrayList<Room>();
roomsFromA.add(new Room(1,"Room A"));
roomsFromA.add(new Room(2,"Room A"));
roomsFromB.add(new Room(1,"Room B"));
roomsFromB.add(new Room(2,"Room B"));
Comparator<Room> c = new Comparator<Room>() {
@Override
public int compare(Room o1, Room o2) {
return o1.getNumber() - o2.getNumber();
} };
Collections.sort(roomsFromA, c);
Collections.sort(roomsFromB, c);
List<RoomRelation> relations = new ArrayList<RoomRelation>();
for (int i = 0; i < roomsFromA.size(); i++) {
relations.add(new RoomRelation(roomsFromA.get(i), roomsFromB.get(i)));
}
for (RoomRelation roomRelation : relations) {
System.out.println(roomRelation);
}
}
Your question is quite unclear. As I understand it, you want to list all combinations of rooms, minus duplicates. Here is some code to build up a 2D array of all the room combinations. For more kinds of room, put in another nested loop.
Collection<Room> roomsFromA = getRoomsFromA();
Collection<Room> roomsFromB = getRoomsFromB();
// One row per combination; each row holds the pair {roomA, roomB}
Room[][] combinations = new Room[roomsFromA.size() * roomsFromB.size()][];
int i = 0;
for (Room roomA : roomsFromA) {
    for (Room roomB : roomsFromB) {
        combinations[i++] = new Room[] { roomA, roomB }; // Build up array
    }
}
return combinations;
It is a common problem. It's called a Cartesian product. If you have two collections like in your case, I would not hesitate to have two nested loops. Otherwise, see this question.
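When the number of collections is not fixed at two, the nested loops can be replaced by building the product one collection at a time. The following is a minimal sketch of that idea; the class name CartesianProduct and the method product are hypothetical, and strings stand in for Room objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CartesianProduct {
    // Returns every way of picking one element from each input list, in input order.
    static <T> List<List<T>> product(List<List<T>> lists) {
        List<List<T>> result = new ArrayList<>();
        result.add(new ArrayList<T>()); // start with a single empty combination
        for (List<T> options : lists) {
            List<List<T>> next = new ArrayList<>();
            // Extend every partial combination with every option from this list.
            for (List<T> partial : result) {
                for (T option : options) {
                    List<T> extended = new ArrayList<>(partial);
                    extended.add(option);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> roomTypes = Arrays.asList(
                Arrays.asList("double-1", "double-2", "double-3"),
                Arrays.asList("single-1", "single-2", "single-3", "single-4"));
        List<List<String>> combos = product(roomTypes);
        System.out.println(combos.size()); // 3 * 4 = 12 combinations
    }
}
```

The result grows as the product of the list sizes, so for many room types it may be worth generating combinations lazily instead of materialising them all.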