How to intersect a Guava Range and a TreeSet efficiently? - java

I'd like to take the intersection of a set and a range, so that I get a set containing every element of the original set that is in the range. For example, I'd like a way to take set and range from the following code snippet:
import com.google.common.collect.*;
TreeSet<Integer> set = Sets.newTreeSet();
Collections.addAll(set, 1,2,3,5,11);
Range<Integer> range = Range.closed(4,10);
and return a new TreeSet containing just 5

In this particular example, you're better off not using Range at all, but using set.subSet(4, true, 10, true) directly, but presumably you have a more complicated use case, and your code is a simplified example.
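As a minimal sketch of the subSet approach on the example data, using only the JDK (the class name SubSetExample and the helper closedRange are made up for illustration):

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class SubSetExample {
    // Returns a live view of the elements in [low, high], both ends inclusive.
    static NavigableSet<Integer> closedRange(NavigableSet<Integer> set, int low, int high) {
        return set.subSet(low, true, high, true);
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>(Arrays.asList(1, 2, 3, 5, 11));
        // Copy the view if an independent TreeSet is needed.
        TreeSet<Integer> result = new TreeSet<>(closedRange(set, 4, 10));
        System.out.println(result); // prints [5]
    }
}
```

The view returned by subSet is backed by the original set, so the copy is only needed when the result must not track later changes.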
There's really not much alternative but to deal with all the cases yourself. Part of the problem is that a NavigableSet can use an arbitrary Comparator, but Range (deliberately) works only with the natural ordering of the value type, so it'd be somewhat awkward to provide a method in Guava that takes an arbitrary Range and a NavigableSet and intersects them.
The most general solution would look something like...
if (range.hasLowerBound()) {
    if (range.hasUpperBound()) {
        return set.subSet(
            range.lowerEndpoint(),
            range.lowerBoundType() == BoundType.CLOSED,
            range.upperEndpoint(),
            range.upperBoundType() == BoundType.CLOSED);
    } else {
        return set.tailSet(
            range.lowerEndpoint(),
            range.lowerBoundType() == BoundType.CLOSED);
    }
} else {
    if (range.hasUpperBound()) {
        return set.headSet(
            range.upperEndpoint(),
            range.upperBoundType() == BoundType.CLOSED);
    } else {
        return set;
    }
}
That said, it's worth mentioning that if you're not concerned about efficiency, you can just do Iterables.removeIf(set, Predicates.not(range)) or Sets.filter(set, range).


Issue with implementing custom Comparator for Priority Queue in Java

Please pardon my limited understanding of PriorityQueue and Comparator in Java.
I am able to implement a basic comparator for a PriorityQueue based on some sort order, but I am not able to come up with one for the scenario below:
1. Given a list of files with the naming convention xx_yy_zz.dat.
2. xx, yy and zz can each be from 00 to 50.
3. I need to process the files with xx=30 first, xx=35 second, xx=40 third, and then the rest.
Since I have limited knowledge of PriorityQueue, the implementation I tried could only sort by ascending or descending value of xx, which was not the requirement.
My approach was to put the list of file names in a PriorityQueue, split each file name on the regex "_", and then compare the first element of the split array in a comparator based on its value. As expected, this failed, since my requirement was something different.
Please share some ideas or approaches.
Sadly, I am not able to come up with the required comparator for my case.
Thanks in anticipation.
You can use simple if statements inside the compare() method to check whether one string starts with "30" and the other does not. Then you know that this string must come before the other one. You run checks like these, in order, on the first part of each filename:
Are they the same?
Is the left one 30?
Is the right one 30?
Is the left one 35?
Is the right one 35?
Is the left one 40?
Is the right one 40?
The comparator might look like this:
public int compare(String a, String b) {
String[] splitA = a.split("_");
String[] splitB = b.split("_");
if (splitA[0].equals(splitB[0])) {
return 0;
}
if (splitA[0].equals("30")) {
return -1;
}
if (splitB[0].equals("30")) {
return 1;
}
if (splitA[0].equals("35")) {
return -1;
}
if (splitB[0].equals("35")) {
return 1;
}
if (splitA[0].equals("40")) {
return -1;
}
if (splitB[0].equals("40")) {
return 1;
}
return 0;
}
With the following test source code:
System.out.println(Arrays.toString(data));
Arrays.sort(data, new SpecialComparator());
System.out.println(Arrays.toString(data));
You might get an output like this (depending on the data array):
[30_45_35.dat, 00_12_34.dat, 35_50_20.dat, 40_03_05.dat, 33_28_14.dat,
30_16_31.dat, 20_29_23.dat, 24_41_29.dat, 30_49_18.dat, 40_12_13.dat]
[30_45_35.dat, 30_16_31.dat, 30_49_18.dat, 35_50_20.dat, 40_03_05.dat,
40_12_13.dat, 00_12_34.dat, 33_28_14.dat, 20_29_23.dat, 24_41_29.dat]
(new lines added for clarity)
As you can see, the 30s come first, then the only 35, then the 40s, and after that all the remaining files. You might want to fall back to compareTo() on the strings whenever compare() would otherwise return 0, to get a better "sub sorting" of strings that are equal under the basic ordering above.
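One possible way to combine the priority checks with a compareTo() tie-break is to look up each prefix in an ordered list. This is only a sketch: the class name PrefixPriorityComparator, the PRIORITY list and the rank helper are made up for illustration, and the sample data is invented.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PrefixPriorityComparator implements Comparator<String> {
    // Prefixes to process first, in order; every other prefix ranks after them.
    private static final List<String> PRIORITY = Arrays.asList("30", "35", "40");

    // Position of the xx prefix in PRIORITY, or PRIORITY.size() if it is not listed.
    static int rank(String fileName) {
        int index = PRIORITY.indexOf(fileName.split("_")[0]);
        return index >= 0 ? index : PRIORITY.size();
    }

    @Override
    public int compare(String a, String b) {
        int byRank = Integer.compare(rank(a), rank(b));
        // Tie-break on the full name so files of equal rank get a total order.
        return byRank != 0 ? byRank : a.compareTo(b);
    }

    public static void main(String[] args) {
        String[] data = {"00_12_34.dat", "40_03_05.dat", "35_50_20.dat", "30_45_35.dat", "30_16_31.dat"};
        Arrays.sort(data, new PrefixPriorityComparator());
        System.out.println(Arrays.toString(data));
        // prints [30_16_31.dat, 30_45_35.dat, 35_50_20.dat, 40_03_05.dat, 00_12_34.dat]
    }
}
```

Adding another priority prefix then only means extending the list, not adding two more if statements.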
Maybe I don't understand exactly what you need, but try this code; it sorts all the strings for me, as long as they start with two digits:
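import java.util.PriorityQueue;

// The class name is arbitrary.
public class Main {
    public static void main(String[] args) {
        // Order by the numeric value of the first two characters.
        PriorityQueue<String> q = new PriorityQueue<String>((first, second) -> {
            return Integer.parseInt(first.substring(0, 2)) - Integer.parseInt(second.substring(0, 2));
            // and if you want to reverse the order, simply add "-" like this:
            // return -(Integer.parseInt(first.substring(0, 2)) - Integer.parseInt(second.substring(0, 2)));
        });
        q.add("23lk");
        q.add("22lkjl");
        q.add("45ljl");
        // Iterating over a PriorityQueue does not guarantee sorted order; poll() does.
        while (!q.isEmpty()) {
            System.out.println(q.poll());
        }
    }
}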
and the output:
22lkjl
23lk
45ljl
If this is not the solution, please explain the problem in more detail; maybe I or somebody else can help you.

In Java, a "decorate and sort" concise implementation?

I'm learning Java for the first time (my prior experience is Python and Haskell). I have a situation that would, in Python, require a "decorate and sort" idiom. Such as the following (code not tested but this is roughly correct):
origList = <something>
decorated = sorted( [(evalFunc(item), item) for item in origList] )
finalList = [item for _, item in decorated]
By choosing a different evalFunc you can choose how this is sorted.
In Java, I'm writing a program that composes music by choosing from among a list of notes, evaluating the "fitness" of each note, and picking the best. I have a class representing musical notes:
class Note {
...
}
I have a class that represents the fitness of a note as two values, its goodness and badness (yes, these are separate concepts in my program). Note: in Python or Haskell, this would simply be a 2-tuple, but my understanding is that Java doesn't have tuples in the usual sense. I could make it a pair, but it gets unwieldy to declare variables all over the place like List<Pair<Type1,Pair<Type2,Type3>>>. (As an aside, I don't think Java has type aliases either, which would let me shorten the declarations.)
class Fitness {
double goodness;
double badness;
}
The function that evaluates the fitness needs access to several pieces of data other than the Note. We'll say it's part of a "Composition" class:
class Composition {
... data declared here ... ;
public Fitness evaluate(Note n) {
}
}
I'd like to be able to compare Fitness objects in numerical order. There are two ways to compare: either goodness or badness can be numerically compared, depending on the situation.
class CompareFitnessByGoodness implements Comparator<Fitness> {
}
class CompareFitnessByBadness implements Comparator<Fitness> {
}
I'd like to package the Note together with its fitness, so I can sort the combined list by fitness and later pull out the best Note.
class Together {
public Note note;
public Fitness fitness;
}
I'd like to sort a List<Together> by either the goodness, or by the badness. So I might need:
class CompareTogetherByGoodness implements Comparator<Together> {
...
}
class CompareTogetherByBadness implements Comparator<Together> {
...
}
Eventually I'll write something like
Note pickBest(List<Together> notes) {
// Pick a note that's not too bad, and pretty good at the same
// time.
// First sort in order of increasing badness, so I can choose
// the bottom half for the next stage (i.e. the half "least bad"
// notes).
Collections.sort(notes, new CompareTogetherByBadness());
List<Together> leastBadHalf = notes.subList(0, notes.size()/2);
// Now sort `leastBadHalf` and take the last note: the one with
// highest goodness.
Collections.sort(leastBadHalf, new CompareTogetherByGoodness());
return leastBadHalf.get(leastBadHalf.size()-1);
}
Whew! That is a LOT of code for something that would be a few lines in Haskell or Python. Is there a better way to do this?
EDIT:
Addressing some of the answers.
"You don't need to decorate." Well, my fitness computation is very expensive, so I want to compute it once for each note, and save the result for later access as well.
"Store goodness/badness in Note." The goodness or badness is not a property of the note alone; it's only meaningful in context and it can change. So this is a suggestion that I add mutable state which is only meaningful in some contexts, or plain wrong if there's a bug which accidentally mutates it. That's ugly, but maybe a necessary crutch for Java.
Going by what you already have
origList = <something>
decorated = sorted( [(evalFunc(item), item) for item in origList] )
finalList = [item for _, item in decorated]
This is the equivalent in modern Java:
Given your composition object:
Composition composer = ...;
And a list of notes:
List<Note> notes = ...;
Then you can do:
List<Together> notesAllTogetherNow = notes.stream()
.map(note -> new Together(note, composer.evaluate(note)))
.sorted(new CompareTogetherByGoodness())
.collect(Collectors.toList());
To get the best note, you can take it a bit further:
Optional<Note> bestNote = notes.stream()
.map(note -> new Together(note, composer.evaluate(note)))
.sorted(new CompareTogetherByBadness())
.limit(notes.size() / 2) // Taking the top half
.sorted(new CompareTogetherByGoodness())
.findFirst() // Assuming the last comparator sorts in descending order
.map(Together::getNote);
You can use streams:
Function<Foo, Bar> func = ...
Comparator<Foo> comparator = ...
var list = ...
var sorted = list.stream()
.sorted(comparator)
.map(func)
.collect(Collectors.toList());
Java already includes Collections.sort(list, comparator), which does all of this for you. It returns void and mutates the original list, though.
Unfortunately, Java's standard library does not include tuples or even a plain Pair; the Apache Commons library does, though.
In short, you don't need the decorate / undecorate approach in Java.
class Fitness {
double goodness;
double badness;
}
class Together {
Note note;
Fitness fitness;
}
class Note{
}
List<Together> notes = ...
Collections.sort(notes, Comparator.comparingDouble(value -> value.fitness.badness));
List<Together> leastBadHalf = notes.subList(0, notes.size()/2);
return leastBadHalf.stream().max(Comparator.comparingDouble(value -> value.fitness.goodness));
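If the expensive score should be computed exactly once per item, the decorate, sort, undecorate idiom can be sketched generically as follows. The class DecorateSort and the method sortByCachedScore are hypothetical names, java.util.AbstractMap.SimpleImmutableEntry is used here as a stand-in pair type, and String::length stands in for a costly fitness function.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collectors;

class DecorateSort {
    // Evaluates score once per item, sorts ascending by the cached score, then strips it.
    static <T> List<T> sortByCachedScore(List<T> items, ToDoubleFunction<T> score) {
        return items.stream()
                .map(item -> new SimpleImmutableEntry<>(item, score.applyAsDouble(item)))
                .sorted(Comparator.comparingDouble(entry -> entry.getValue()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ccc", "a", "bb");
        // String::length stands in for an expensive fitness function.
        System.out.println(sortByCachedScore(words, String::length)); // prints [a, bb, ccc]
    }
}
```

The original list is left untouched; the sorted result is a new list.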

Java 8 Stream.findAny() vs finding a random element in the stream

In my Spring application, I have a Couchbase repository for a document type of QuoteOfTheDay. The document is very basic, just has an id field of type UUID, value field of type String and created date field of type Date.
In my service class, I have a method that returns a random quote of the day. Initially I tried simply doing the following, which returned a value of type Optional<QuoteOfTheDay>, but it would seem that findAny() would pretty much always return the same element in the stream. There are only about 10 elements at the moment.
public Optional<QuoteOfTheDay> random() {
return StreamSupport.stream(repository.findAll().spliterator(), false).findAny();
}
Since I wanted something more random, I implemented the following which just returns a QuoteOfTheDay.
public QuoteOfTheDay random() {
int count = Long.valueOf(repository.count()).intValue();
if(count > 0) {
Random r = new Random();
List<QuoteOfTheDay> quotes = StreamSupport.stream(repository.findAll().spliterator(), false)
.collect(toList());
return quotes.get(r.nextInt(count));
} else {
throw new IllegalStateException("No quotes found.");
}
}
I'm just curious how the findAny() method of Stream actually works since it doesn't seem to be random.
Thanks.
The reason behind findAny() is to give a more flexible alternative to findFirst(). If you are not interested in getting a specific element, this gives the implementing stream more flexibility in case it is a parallel stream.
No effort will be made to randomize the element returned, it just doesn't give the same guarantees as findFirst(), and might therefore be faster.
This is what the Javadoc says on the subject:
The behavior of this operation is explicitly nondeterministic; it is free to select any element in the stream. This is to allow for maximal performance in parallel operations; the cost is that multiple invocations on the same source may not return the same result. (If a stable result is desired, use findFirst() instead.)
Don’t collect into a List when all you want is a single item. Just pick one item from the stream. By picking the item via Stream operations you can even handle counts bigger than Integer.MAX_VALUE and don’t need the “interesting” way of hiding the fact that you are casting a long to an int (that Long.valueOf(repository.count()).intValue() thing).
public Optional<QuoteOfTheDay> random() {
long count = repository.count();
if(count==0) return Optional.empty();
Random r = new Random();
long randomIndex=count<=Integer.MAX_VALUE? r.nextInt((int)count):
r.longs(1, 0, count).findFirst().orElseThrow(AssertionError::new);
return StreamSupport.stream(repository.findAll().spliterator(), false)
.skip(randomIndex).findFirst();
}
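The skip-based selection can be tried without a repository. In this sketch a plain List stands in for repository.findAll(), the names RandomPick, elementAt and random are hypothetical, and ThreadLocalRandom.nextLong(bound) is swapped in for the Random code above so that no int/long branch is needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.StreamSupport;

class RandomPick {
    // Returns the element at the given zero-based position, or empty if the source is shorter.
    static <T> Optional<T> elementAt(Iterable<T> source, long index) {
        return StreamSupport.stream(source.spliterator(), false)
                .skip(index)
                .findFirst();
    }

    // Picks a uniformly random element, given the number of elements in the source.
    static <T> Optional<T> random(Iterable<T> source, long count) {
        if (count <= 0) {
            return Optional.empty();
        }
        // nextLong(bound) returns a value in [0, bound).
        return elementAt(source, ThreadLocalRandom.current().nextLong(count));
    }

    public static void main(String[] args) {
        List<String> quotes = Arrays.asList("a", "b", "c", "d");
        System.out.println(elementAt(quotes, 2)); // prints Optional[c]
        System.out.println(random(quotes, quotes.size()).isPresent()); // prints true
    }
}
```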

Checking for either/or with an EnumSet

So I'm converting some bitfields in our application to use EnumSet instead, and I'm curious if there's a better way to do a comparison for X|Y. Currently we do something like:
if((bitfield & (X | Y)) != 0) {
//do stuff
}
The EnumSet equivalent seems to be:
if(enumSet.contains(X) || enumSet.contains(Y)) {
//do stuff
}
Is there a cleaner way to do this? I know you can check for containsAll() like so:
EnumSet flagsToCheck = EnumSet.of(X, Y);
if(enumSet.containsAll(flagsToCheck)) {
//do stuff
}
But that's for a scenario where you want to know if (X & Y) is set. Is there an equivalent way to check for (X | Y)? I would think there would be something like a containsAny() method, but I don't see anything that seems to have that effect.
I would say the existing approach is more readable than your bitwise approach. It says exactly what you mean: if the set contains X, or the set contains Y... Keep it as it is. It's already clean.
If the set becomes larger, you could use:
EnumSet<Foo> valid = EnumSet.of(Foo.X, Foo.Y, Foo.A, Foo.B);
valid.retainAll(enumSet);
if (!valid.isEmpty()) { // at least one of the listed values is present
...
}
But I'd only keep that for larger cases. For two or three options I'd use the longhand form.
You can use the AbstractSet method removeAll, which returns true if any of the elements was found. Since removeAll modifies the set, you will probably want to call it on a copy of the original.
If you can't modify the set, just create a new one; @assylias is right. An alternative is to create a new set from the enum values you want and check it accordingly.
public enum ResultingState {
NOT_PERSISTED, PERSISTED, NOT_CALCULATED, CALCULATED;
}
EnumSet<ResultingState> errorsState = EnumSet.of(ResultingState.NOT_PERSISTED, ResultingState.NOT_CALCULATED);
Collection<ResultingState> results = new HashSet<>(phaseResults.values());
// retainAll only reports whether the collection changed, so test for leftovers instead.
results.retainAll(errorsState);
boolean containsAny = !results.isEmpty();
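Another option that avoids copying or mutating either set is java.util.Collections.disjoint, which returns true when two collections have no element in common. A minimal sketch, where the enum Flag and the helper containsAny are hypothetical stand-ins for X and Y from the question:

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

class ContainsAnyExample {
    // Hypothetical flags standing in for X and Y from the question.
    enum Flag { X, Y, Z }

    // True if the two sets share at least one element; neither set is modified.
    static <E> boolean containsAny(Set<E> set, Set<E> candidates) {
        return !Collections.disjoint(set, candidates);
    }

    public static void main(String[] args) {
        EnumSet<Flag> flags = EnumSet.of(Flag.Y, Flag.Z);
        System.out.println(containsAny(flags, EnumSet.of(Flag.X, Flag.Y))); // prints true
        System.out.println(containsAny(flags, EnumSet.of(Flag.X)));         // prints false
    }
}
```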

Is there a way to test for enum value in a list of candidates? (Java)

This is a simplified example. I have this enum declaration as follows:
public enum ELogLevel {
None,
Debug,
Info,
Error
}
I have this code in another class:
if ((CLog._logLevel == ELogLevel.Info) || (CLog._logLevel == ELogLevel.Debug) || (CLog._logLevel == ELogLevel.Error)) {
System.out.println(formatMessage(message));
}
My question is whether there is a way to shorten the test. Ideally I would like something to the tune of (this is borrowed from Pascal/Delphi):
if (CLog._logLevel in [ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error])
Instead of the long list of comparisons. Is there such a thing in Java, or maybe a way to achieve it? I am using a trivial example, my intention is to find out if there is a pattern so I can do these types of tests with enum value lists of many more elements.
EDIT: It looks like EnumSet is the closest thing to what I want. The Naïve way of implementing it is via something like:
if (EnumSet.of(ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error).contains(CLog._logLevel))
But under benchmarking, this performs two orders of magnitude slower than the long if/then statement, I guess because the EnumSet is being instantiated every time it runs. This is a problem only for code that runs very often, and even then it's a very minor problem, since over 100M iterations we are talking about 7ms vs 450ms on my box; a very minimal amount of time either way.
What I settled on for code that runs very often is to pre-instantiate the EnumSet in a static variable, and use that instance in the loop, which cuts down the runtime back down to a much more palatable 9ms over 100M iterations.
So it looks like we have a winner! Thanks guys for your quick replies.
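A minimal sketch of the pre-instantiated set described above. The class name LogFilter, the constant PRINTABLE and the method shouldPrint are hypothetical; only the enum is taken from the question.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

class LogFilter {
    enum ELogLevel { None, Debug, Info, Error }

    // Built once at class load; wrapped so that callers cannot modify it.
    private static final Set<ELogLevel> PRINTABLE =
            Collections.unmodifiableSet(EnumSet.of(ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error));

    // No allocation per call, just a membership test on the shared set.
    static boolean shouldPrint(ELogLevel level) {
        return PRINTABLE.contains(level);
    }

    public static void main(String[] args) {
        System.out.println(shouldPrint(ELogLevel.Info)); // prints true
        System.out.println(shouldPrint(ELogLevel.None)); // prints false
    }
}
```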
What you want is an EnumSet:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/EnumSet.html
Put the elements you want to test for in the set, and then use the Set method contains().
import java.util.EnumSet;
public class EnumSetExample
{
enum Level { NONE, DEBUG, INFO, ERROR };
public static void main(String[] args)
{
EnumSet<Level> subset = EnumSet.of(Level.DEBUG, Level.INFO);
for(Level currentLevel : EnumSet.allOf(Level.class))
{
if (subset.contains(currentLevel))
{
System.out.println("we have " + currentLevel.toString());
}
else
{
System.out.println("we don't have " + currentLevel.toString());
}
}
}
}
There's no way to do it concisely in Java. The closest you can come is to dump the values in a set and call contains(). An EnumSet is probably most efficient in your case. You can shorten the set initialization a little using the double brace idiom, though this has the drawback of creating a new inner class each time you use it, and hence increases the memory usage slightly.
In general, logging levels are implemented as integers:
public static int LEVEL_NONE = 0;
public static int LEVEL_DEBUG = 1;
public static int LEVEL_INFO = 2;
public static int LEVEL_ERROR = 3;
and then you can test for severity using simple comparisons:
if (CLog._logLevel >= LEVEL_DEBUG) {
// log
}
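The same threshold test also works if the levels stay an enum, because enum constants compare by declaration order. A small sketch, assuming the constants are declared from least to most severe as in the question (the class SeverityCheck and the method atLeast are hypothetical names):

```java
class SeverityCheck {
    // Declaration order doubles as severity order, lowest first.
    enum ELogLevel { None, Debug, Info, Error }

    // True if level is declared at or after threshold.
    static boolean atLeast(ELogLevel level, ELogLevel threshold) {
        return level.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(atLeast(ELogLevel.Info, ELogLevel.Debug)); // prints true
        System.out.println(atLeast(ELogLevel.None, ELogLevel.Debug)); // prints false
    }
}
```

This relies on nobody reordering the constants later, which is worth a comment next to the enum.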
You could use a list of required levels, ie:
List<ELogLevel> levels = Lists.newArrayList(ELogLevel.Info,
ELogLevel.Debug, ELogLevel.Error);
if (levels.contains(CLog._logLevel)) {
//
}
