Arrays.equals without length of arrays - java

I have two arrays of different lengths but with the same elements. For example:
A1 = {1,2,3,null,null}
A2 = {1,2,3}
Arrays.equals returns false because the arrays have different lengths. Is there a method in Java that compares only the elements and ignores the length?
I don't want to use .toString.
I'm trying to write a compare method in my own generic stack implementation.

No, because it's a weird request. null does not mean 'not here'; null means 'unknown / unset'. That's why you get an exception when you interact with it: you're asking "hey, thing that has not been set yet, are you X?", and there is no way to answer such a question.
That doesn't mean your code is wrong; it just means you can stop looking for existing implementations. Weird requests generally aren't catered to by the core libraries (or any other). You may also want to change your mindset on null. Programming in Java is a lot less aggravating if a NullPointerException is always a good thing. In other words, avoid giving null any semantic meaning. If you ever write if (x == null || x.isEmpty()), you are doing it wrong. Instead, wherever x comes from, it should hold the empty string, or be updated to it as soon as possible. So, when reading external data (e.g. you deserialized some JSON into an object), do a 'clean' step that replaces every null that has semantic meaning with an object that actually represents that meaning. For methods that return things, always return an object that represents what you are returning. Only return null if you WANT to convey that there is no result, which is not the same as 'an empty result': if any code acts as though there was a result, you want it to crash.
In other words, I doubt you are asking the right question. But in case you are, you have two broad options.
First make null-less arrays then compare those as normal
One option is to make new arrays that have nulls stripped. Something like:
@SuppressWarnings("unchecked")
<T> T[] stripNulls(T[] in) {
    // reuse the runtime component type so the result is a real T[], not an Object[]
    Class<?> componentType = in.getClass().getComponentType();
    return Arrays.stream(in)
        .filter(x -> x != null)
        .toArray(len -> (T[]) java.lang.reflect.Array.newInstance(componentType, len));
}
// which you can then use; you don't need generics for a compare,
// it wouldn't add anything at all.
boolean compare(Object[] a, Object[] b) {
return Arrays.equals(stripNulls(a), stripNulls(b));
}
Just compare in place
If it's performance-sensitive, allocating two new arrays is suboptimal. A better approach takes a little more code:
boolean compare(Object[] a, Object[] b) {
    int ai = 0, bi = 0;
    while (true) {
        /* advance each index to the next non-null element */
        while (ai < a.length && a[ai] == null) ai++;
        while (bi < b.length && b[bi] == null) bi++;
        /* If either array is exhausted, they match only if both are */
        boolean aDone = ai == a.length, bDone = bi == b.length;
        if (aDone || bDone) return aDone && bDone;
        /* check if the 2 current elements are equal, then move past them */
        if (!a[ai++].equals(b[bi++])) return false;
    }
}
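Since the question is about a stack, the trailing nulls are probably unused capacity rather than real elements. In that case it may be simpler to compare only the occupied part of each array. Here is a minimal sketch, assuming the stack tracks its own element count and that Java 9+ is available for the range overload of Arrays.equals. The class ArrayStack and its members are hypothetical names, not taken from the question.

```java
import java.util.Arrays;

// Hypothetical array-backed stack; only the first `size` slots hold elements.
class ArrayStack<T> {
    private Object[] items = new Object[4];
    private int size = 0;

    void push(T item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow when full
        }
        items[size++] = item;
    }

    // Compares only the occupied ranges, so spare capacity is ignored.
    boolean sameElements(ArrayStack<?> other) {
        return Arrays.equals(items, 0, size, other.items, 0, other.size);
    }
}
```

With this layout a null never has to mean "empty slot", which is in line with the advice above about not giving null a semantic meaning.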

Not a native Java developer, but maybe this helps you?
boolean arraysEqual = Arrays.equals(Arrays.stream(a1).filter(n -> n != null).toArray(), Arrays.stream(a2).filter(n -> n != null).toArray());

Related

Custom Java Comparator With Pre Defined Top Result

I want to sort a list of field names alphabetically, but I need to include a condition in the doCompare method of the comparator so that a field named "pk" is always sorted to the top of the list. What I have is below, but I'm not sure I'm taking the right approach, particularly with the return value of -1000. Any advice on this would be much appreciated.
@Override
public int doCompare(Object firstRec, Object secondRec)
{
MyField firstField = (MyField) firstRec;
MyField secondField = (MyField ) secondRec;
if(firstField.name() == "pk")
{
return -1000;
}
return StringUtils.compareStrings(firstField.name().toLowerCase(), secondField.name().toLowerCase());
}
The requirements of a Comparator (and, by extension, methods which are supposed to act like Comparator.compare) are described in the Javadoc:
The implementor must ensure that sgn(compare(x, y)) == -sgn(compare(y, x)) for all x and y. (This implies that compare(x, y) must throw an exception if and only if compare(y, x) throws an exception.)
The implementor must also ensure that the relation is transitive: ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.
Finally, the implementor must ensure that compare(x, y)==0 implies that sgn(compare(x, z))==sgn(compare(y, z)) for all z.
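Assuming StringUtils.compareStrings correctly implements these requirements, the thing you've got wrong is the first requirement: you also need to handle the case where secondField is pk. As written, when both names are "pk" the method returns -1000 in both directions, and when only secondField is "pk" it falls through to the plain string comparison.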
The general pattern for writing correct Comparators is:
int firstComparison = /* compare something about firstField and secondField */;
if (firstComparison != 0) {
return firstComparison;
}
int secondComparison = /* compare something else about firstField and secondField */;
if (secondComparison != 0) {
return secondComparison;
}
// ...
return 0;
Applying that here:
int pkComparison = Boolean.compare(secondField.name().equals("pk"), firstField.name().equals("pk"));
if (pkComparison != 0) {
return pkComparison;
}
int compareStringsComparison = StringUtils.compareStrings(firstField.name().toLowerCase(), secondField.name().toLowerCase());
if (compareStringsComparison != 0) {
return compareStringsComparison;
}
return 0;
Obviously, the last if statement is redundant, because you always return compareStringsComparison whether or not it is zero; so you could write simply:
return StringUtils.compareStrings(firstField.name().toLowerCase(), secondField.name().toLowerCase());
I would recommend sticking to the compare/check and return/finally return 0 pattern, because it's easier to slot in additional conditions later. But it's not terrible either way.
The static factory methods added to Comparator in Java 8 are very handy for building a multi-criteria Comparator like the one you need.
You could try something like this:
List<String> list = ... ;
list.sort(
    Comparator.<String, Boolean>comparing("PK"::equals).reversed()
        .thenComparing(StringUtils::compare)
);
Boolean keys sort false before true, so the .reversed() on the first criterion is what puts "PK" at the top; drop it if you want "PK" last.
The great advantage of Comparator.comparing / Comparator.comparingXXX is that you don't need to twist your mind working out when to return a positive, negative or zero value.
Comparator.thenComparing does proper chaining: it checks further criteria only when needed, that is, only when the previous comparisons returned 0.
If your list may contain null values, there are also methods to handle them properly. This isn't the case in this short example.
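To make the chaining concrete, here is a small self-contained sketch that uses only the JDK. It assumes plain field-name strings rather than the MyField type from the question, and the class name PkFirst is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PkFirst {
    // The key is false for "pk" and true for everything else. Since false
    // sorts before true, "pk" ends up first; ties fall through to the
    // case-insensitive alphabetical comparison.
    static final Comparator<String> PK_FIRST =
        Comparator.comparing((String name) -> !"pk".equals(name))
                  .thenComparing(String.CASE_INSENSITIVE_ORDER);

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sorted(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(PK_FIRST);
        return copy;
    }
}
```

Because each criterion is a total order on its own key, the combined comparator satisfies the sign-symmetry and transitivity requirements quoted above without any hand-written return values.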

Design pattern for multiple combinations

If I have to make a different database query depending on the presence or not of different parameters, which would be the correct design pattern to avoid too many if-else with the different combinations ?
Let's say I have parameters a, b, c (the amount can grow in the future), I'm using repositories so I would have to make a call something like this
public Foo getFoo(String a, String b, String c){
Foo foo;
if(a!=null && !a.isEmpty() && b!=null && !b.isEmpty() && c!=null && !c.isEmpty())
foo = repository.findByAAndBAndC(a,b,c);
if((a==null || a.isEmpty()) && b!=null && !b.isEmpty() && c!=null && !c.isEmpty())
foo = repository.findByBAndC(b,c);
if((a!=null && !a.isEmpty()) && (b==null || b.isEmpty()) && c!=null && !c.isEmpty())
foo = repository.findByAAndC(a,c);
if((a==null || a.isEmpty()) && (b==null || b.isEmpty()) && c!=null && !c.isEmpty())
foo = repository.findByC(c);
if((a==null || a.isEmpty()) && (b==null || b.isEmpty()) && (c==null || c.isEmpty()))
foo = repository.findOne();
etc.
.
.
.
return foo;
}
How can that be better structured ?
To begin with, I would suggest the Specification design pattern, which:
is a particular software design pattern, whereby business rules can be
recombined by chaining the business rules together using boolean
logic. The pattern is frequently used in the context of domain-driven
design.
but your current code doesn't fit it completely, because you invoke a different repository method depending on the case.
So I think you have two options:
1) Refactoring your repository to provide a single common method accepting a specification parameter and able to handle the different cases.
If you use Spring, you could look at the JpaSpecificationExecutor interface that provides methods such as :
List<T> findAll(Specification<T> spec)
Even if you don't use Spring, I think the examples written for that interface could help you.
2) If you cannot refactor the repository, you should look for another way and provide an abstraction layer that decides which repository method to call and with which parameters.
Here, you invoke a different method with different parameters depending on the input, but in every case you return the same type of object to the caller: Foo. So to avoid conditional statements, polymorphism is the way to go.
Each case to handle is ultimately a different strategy. So you could define a strategy interface and select the strategy to use to return the Foo to the client.
Besides, as suggested in a comment, a!=null && !a.isEmpty() repeated multiple times is a code smell. It creates a lot of duplication and makes the code less readable. It would be better to delegate this check to a library such as Apache Commons Lang or to a custom method.
public class FooService {
private List<FindFooStrategy> strategies = new ArrayList<>();
public FooService(){
strategies.add(new FindFooByAAndBAndCStrategy());
strategies.add(new FindFooByBAndCStrategy());
strategies.add(new FindFooByAAndCStrategy());
strategies.add(new FindFooByCStrategy());
}
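public Foo getFoo(String a, String b, String c){
for (FindFooStrategy strategy : strategies){
if (strategy.isApplicable(a, b, c)) {
return strategy.getFoo(a, b, c);
}
}
// no registered strategy handles this combination of parameters
throw new IllegalArgumentException("No applicable strategy");
}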
}
Where FindFooStrategy is defined as :
public interface FindFooStrategy{
boolean isApplicable(String a, String b, String c);
Foo getFoo(String a, String b, String c);
}
And where each subclass defines its rules. For example :
public class FindFooByAAndBAndCStrategy implements FindFooStrategy{
public boolean isApplicable(String a, String b, String c){
return StringUtils.isNotEmpty(a) && StringUtils.isNotEmpty(b) &&
StringUtils.isNotEmpty(c);
}
public Foo getFoo(String a, String b, String c){
return repository.findByAAndBAndC(a,b,c);
}
}
This is not a complete answer. I will offer several suggestions to address the problem at hand.
Dealing with Null Values
To avoid checking whether a value is null, I suggest wrapping each String query parameter in a container class with a method, say getValue(), that returns the condition for that parameter: parameter = 'value' if the value is present, or a match-all default such as parameter like '%' if it is null. This approach follows the Null Object pattern.
Dynamic Construction of Query
After doing this, it no longer matters which parameters were actually supplied, and you can build your condition iteratively, for example:
for parameter in parameters:
    condition += " AND " + parameter.getValue()
Perhaps you can combine this with a generic method for querying that accepts arbitrary length condition such as:
repository.findBy(condition)
I am not 100% sure since I am typing this answer off the top of my head, but I think this approach works and should address the problem described in your post. Let me know what you think.
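The dynamic-construction idea can be sketched in plain Java as follows. This is only an illustration under some assumptions: FooQuery is a hypothetical helper, the column names are supplied by the calling code (never by user input), and the values are kept separate so they can be bound as parameters instead of being concatenated into the query text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds a WHERE clause from whichever parameters are present.
class FooQuery {
    final String where;        // for example "a = ? AND c = ?"
    final List<String> values; // values to bind, in placeholder order

    FooQuery(Map<String, String> params) {
        List<String> clauses = new ArrayList<>();
        List<String> bound = new ArrayList<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            String value = e.getValue();
            if (value != null && !value.isEmpty()) {
                clauses.add(e.getKey() + " = ?"); // one placeholder per present parameter
                bound.add(value);
            }
        }
        this.where = String.join(" AND ", clauses); // empty string when nothing is present
        this.values = bound;
    }
}
```

Adding a parameter d then means adding one more map entry rather than doubling the number of branches. Pass a map with a stable iteration order (such as LinkedHashMap) if the order of the clauses matters.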
You can make use of a enum defining bitmap-constants with a valueOf method:
public enum Combinations{
A_AND_B_AND_C (0b111),
B_AND_C (0b110),
A_AND_C (0b101),
C (0b100),
A_AND_B (0b011),
B (0b010),
A (0b001),
NONE (0b000),
;
private final int bitmap;
Combinations(int bitmap){
this.bitmap = bitmap;
}
public static Combinations valueOf(String... args){
final StringBuilder builder = new StringBuilder();
for(int i = args.length - 1; i >= 0; i--){
final String arg = args[i];
builder.append(arg != null && !arg.isEmpty() ? '1' : '0');
}
final int bitmap = Integer.parseInt(builder.toString(), 2);
final Combinations[] values = values();
for(int i = values.length -1; i >= 0; i--){
if(values[i].bitmap == bitmap){
return values[i];
}
}
throw new NoSuchElementException();
}
}
And another class which has a switch case statement:
public class SomeClass {
public Foo getFoo(String a, String b, String c){
switch(Combinations.valueOf(a, b, c)){
case A_AND_B_AND_C:
return repository.findByAAndBAndC(a, b, c);
case B_AND_C:
return repository.findByBAndC(b, c);
/* all other cases */
case NONE:
return repository.findOne();
default:
// type unknown
throw new UnsupportedOperationException();
}
}
}
This may be a lot of work up front, but you'll be glad once it's done. By using bitmaps you can represent a lot of combinations. The valueOf method takes care of finding out which combination applies, but what should happen afterwards can't be done generically. So when you add another parameter d, you get a lot more combinations, all of which must be added to the enum.
All in all, this solution is overkill for a small number of parameters. It is still quite easy to understand, because the logic is split into many small parts. You still don't get around the big switch statement at the end, though.
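As a side note, the bitmap itself can be computed with shifts instead of building and parsing a binary string. A minimal sketch with a hypothetical PresenceMask class, using the same bit layout as the enum above (a is bit 0, b is bit 1, c is bit 2):

```java
class PresenceMask {
    // Bit i is set when args[i] is non-null and non-empty,
    // so for (a, b, c): a maps to 0b001, b to 0b010, c to 0b100.
    static int of(String... args) {
        int mask = 0;
        for (int i = 0; i < args.length; i++) {
            if (args[i] != null && !args[i].isEmpty()) {
                mask |= 1 << i;
            }
        }
        return mask;
    }
}
```

The resulting int can be compared against the bitmap field of each enum constant exactly as the valueOf method above does.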

A cleaner if statement with multiple comparisons [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
The following statement just looks very messy when you have a lot of terms:
if(a.equals("x") || a.equals("y") || a.equals("z") || Any number of terms...... )
//Do something
Is there a cleaner way of performing the same action, I would like my code to be as readable as possible.
NOTE: x, y and z are just placeholders for any string of any length. There could be 20 string terms here of variable length in if condition each being OR'd together
What do you think looks "unclean" about it?
If you have a bunch of complicated boolean logic, you might separate the different parts of it into individual boolean variables and refer to them in the if statement.
Or you could create a function that takes your 'a' variable and returns a boolean. You'd just be hiding your logic in the method, but it would clean up your if statement.
Set<String> stuff = new HashSet<String>();
stuff.add("x");
stuff.add("y");
stuff.add("z");
if(stuff.contains(a)) {
//stuff
}
If this is a tight loop you can use a static Set.
static Set<String> stuff;
static {
stuff = new HashSet<String>();
stuff.add("x");
stuff.add("y");
stuff.add("z");
}
//Somewhere else in the cosmos
if(stuff.contains(a)) {
//stuff
}
And if you want to be extra sure nothing is getting modified while you're not looking.
Set<String> test = Collections.unmodifiableSet(new HashSet<String>() {
{
add("x");
add("y");
add("z");
}
});
If you just want to get some logic in there for a handful of hard coded conditions then one of the switch or if statement with newlines solutions might be better. But if you have a lot of conditions then it might be good to separate your configuration from logic.
Alternatively, if you are using Java 7+ you can use strings in switch/case. For example (I extracted this from an Oracle doc and modified)
switch (str) {
case "x":
case "y":
case "z":
//do action
break;
default:
throw new IllegalArgumentException("argument not matched "+str);
}
Use a regular expression
if (a.matches("[xyz]")) {
    // matches either "x", "y", or "z"
}
or, for longer strings,
if (a.matches("one|two|three")) {
    // matches either "one", "two" or "three"
}
This is computationally expensive, though probably not much worse than instantiating a set. It's also the clearest way I can think of.
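If the regex route is taken in a hot path, part of the cost can be avoided by compiling the pattern once, because String.matches compiles it again on every call. A small sketch follows; the class and method names are made up, and the terms are assumed to contain no regex metacharacters (otherwise each term would need Pattern.quote).

```java
import java.util.regex.Pattern;

class TermMatcher {
    // Compiled once and reused for every check.
    private static final Pattern TERMS = Pattern.compile("one|two|three");

    static boolean isKnownTerm(String a) {
        // matches() requires the whole input to match, like String.matches
        return TERMS.matcher(a).matches();
    }
}
```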
But in the end, the nicest way is probably to leave things as they are, with an adjustment to the formatting:
if (a.equals("x") ||
a.equals("y") ||
a.equals("z")
){
There is then absolutely no ambiguity in what the code is doing and so your code will be easier to maintain. If performance matters, you can even put the most likely occurrences towards the top of the list.
Reaching for semantics
On a semantic level, what you are checking for is set membership. However, you implement it on a very low level, basically inlining all the code needed to achieve the check. Apart from forcing the reader to infer the intent behind that massive condition, a prominent issue with such an approach is the large number of degrees of freedom in a general Boolean expression: to be sure the whole thing amounts to just checking set membership, one must carefully inspect each clause, minding any parentheses, misspellings of the repeated variable name, and more.
Each loose degree of freedom means exposure to not just one more bug, but to one more class of bugs.
An approach which uses an explicit set would have these advantages:
clear and explicit semantics;
tight constraint on the degrees of freedom to look after;
O(1) time complexity vs. O(n) complexity of your code.
This is the code needed to implement a set-based idiom:
static final Set<String> matches =
unmodifiableSet(new HashSet<>(asList("a","b","c")));
...
if (matches.contains(a)) // do something;
*I'm implying import static java.util.Arrays.asList and import static java.util.Collections.unmodifiableSet
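On Java 9 or later the same idiom can be written a little more compactly with Set.of. This is a sketch under that assumption; the class name Terms is invented for the example.

```java
import java.util.Set;

class Terms {
    // Set.of returns an immutable set; it rejects duplicate and null elements.
    static final Set<String> MATCHES = Set.of("x", "y", "z");

    static boolean isMatch(String a) {
        // contains(null) throws on sets created by Set.of, so guard first
        return a != null && MATCHES.contains(a);
    }
}
```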
Readability Is Mostly Formatting
Not readable...
if(a.equals("x") || a.equals("y") || a.equals("z") || Any number of terms...... )
//Do something
Now easy to read...
if(a.equals("x") ||
a.equals("y") ||
a.equals("z") ||
Any number of terms...... )
//Do something
Readability is very subjective and depends on the person reading the source code.
If I came across code that used collections, loops or one of the many other complicated answers here, I'd shake my head in disbelief.
Separate The Logic From The Problem
You are mixing two different things. There is the problem of making the business logic easy to read, and the problem of implementing the business logic.
if(validState(a))
// Do something
How you implement validState doesn't matter. What's important is that code with the if statement is readable as business logic. It should not be a long chain of Boolean operations that hide the intent of what is happening.
Here is an example of readable business logic.
if(!isCreditCard(a)) {
return false;
}
if(isExpired(a)) {
return false;
}
return paymentAuthorized(a);
At some level there has to be code that processes basic logic, strings, arrays, etc.. etc.. but it shouldn't be at this level.
If you find you often have to check whether a string is equal to one of a bunch of other strings, put that code into a string utility class. Separate it from your business logic and keep your code readable by ensuring it shows what you're really trying to do.
You can use Arrays.asList(). This is the simplest approach and the least verbose.
Arrays.asList("x","y","z"...).contains(a)
For performance reasons, if your collection is large you could put the data in a HashSet, because lookups there take constant time on average.
For example, write your own utility method:
public final class Utils{
private Utils(){}//don't let instantiate
public static <T> boolean contains(T a,T ... x){
return new HashSet<>(Arrays.asList(x)).contains(a);
}
}
Then in your client code:
if(Utils.contains(a,"x","y","z","n")){
//execute some code
}
With a little bit of help, you can get the syntactic sugar of a nicer if-statement with just a tiny bit of overhead. To elaborate on Tim's recommendation and Jesko's recommendation a tad further...
public abstract class Criteria {
public boolean matchesAny( Object... objects ) {
for( int i = 0, count = objects.length; i < count; i++ ) {
Object object = objects[i];
if( matches( object ) ) {
return true;
}
}
return false;
}
public boolean matchesAll( Object... objects ) {
for( int i = 0, count = objects.length; i < count; i++ ) {
Object object = objects[i];
if( !matches( object ) ) {
return false;
}
}
return true;
}
public abstract boolean matches( Object object );
}
public class Identity extends Criteria {
public static Identity of( Object self ) {
return new Identity( self );
}
private final Object self;
public Identity( Object self ) {
this.self = self;
}
@Override
public boolean matches( Object object ) {
return self != null ? self.equals( object ) : object == null;
}
}
Your if-statement would then look like this:
if( Identity.of( a ).matchesAny( "x", "y", "z" ) ) {
...
}
This is sort of a middle ground between having a generic syntax for this sort of conditional matching and having the expression describe a specific intent. Following this pattern also lets you perform the same sort of matching using criteria other than equality, much like how Comparators are designed.
Even with the improved syntax, this conditional expression is still just a little bit too complex. Further refactoring might lead to externalizing the terms "x", "y", "z" and moving the expression into a method whose name clearly defines its intent:
private static final String [] IMPORTANT_TERMS = {
"x",
"y",
"z"
};
public boolean isImportant( String term ) {
return Identity.of( term ).matchesAny( IMPORTANT_TERMS );
}
...and your original if-statement would finally be reduced to...
if( isImportant( a ) ) {
...
}
That's much better, and now the method containing your conditional expression can more readily focus on Doing One Thing.
Independent of what you are trying to achieve, this
if(a.equals("x") || a.equals("y") || a.equals("z") || Any number of terms...... )
//Do something
is always messy and unclean. In the first place it is just too long to make sense of it quickly.
The simplest solution for me would be to express your intent instead of spelling out every comparison.
Try to do this instead:
public class SomeClass{
public void SomeMethod(){
if ( matchesSignificantChar(a) ){
//doSomething
}
}
private boolean matchesSignificantChar(String s){
return s.equals("x") || s.equals("y") || s.equals("z"); // || any number of terms
}
}
This simplifies your conditional statement and makes it easier to understand, while moving the complexity into a much smaller, named scope whose name states your intent.
However, this is still not very extensible. To make it cleaner, you can extract the boolean method into another class and pass it as a delegate to SomeClass's constructor or even to SomeMethod. You can also look into the Strategy pattern for even more extensibility.
Keep in mind that as a programmer you will spend much more time reading code (not only yours) than writing it, so creating better understandable code will pay off in the long run.
I use following pattern
boolean cond = false; // Name this variable reasonably
cond = cond || a.equals("x");
cond = cond || a.equals("y");
cond = cond || a.equals("z");
// Any number of terms......
if (cond) {
// ...
}
Note: no objects created on the heap. Also you can use any conditions, not only "equals".
In ruby you can use operator ||= for this purpose like cond ||= a.equals("x").
The Set answer is good. When not comparing for membership of a collection you can also separate out some or all of the conditional statement into methods. For example
if (inBounds(x) && shouldProcess(x) ) {
}
If a is guaranteed to be of length 1, you could do:
if ("xyz".indexOf(a) != -1)
One really nice way to do something like this is to use ASCII values, assuming a is a char or a single-character string. Convert a to its ASCII integer equivalent (val below), then use something like this:
If you want to check that a is one of "t", "u", "v", ... , "z", then do:
if (val >= 116 && val <= 122) { /* code here */ }
I prefer to use a regexp, as a few people wrote above.
But you can also use the following code:
private boolean isOneMoreEquals(Object arg, Object... conditions) {
if (conditions == null || arg == null) {
return false;
}
for (int i = 0, d = conditions.length; i < d; i++) {
if (arg.equals(conditions[i])) {
return true;
}
}
return false;
}
so your code becomes:
if (isOneMoreEquals(a, "x", "y", "z")) {
//do something
}
Assuming that your "x", "y", and "z" can be of arbitrary length, you can use
if (0 <= java.util.Arrays.binarySearch(new String[] { "x", "y", "z" }, a)) {
// Do something
}
Just make sure that you list your items in lexicographic order, as required by binarySearch(). That should be compatible all the way back to Java 1.2, and it should be more efficient than the solutions that use Java Collections.
Of course, if your "x", "y", and "z" are all single characters, and a is also a character, you can use if (0 <= "xyz".indexOf(a)) { ... } or
switch (a) {
case 'x': case 'y': case 'z':
// Do something
}
If x, y, z... are consecutive, you can use if (a >= 'x' && a <= '...'); if not, you can use an ArrayList or just Arrays.
I think the cleanest and fastest way is to put the values in an array.
String[] values={"value1","value2","value3"};
for (String value : values) {
if (a.equals(value)) {
//Some code
}
}

Data structure to check for pairs?

Say I have objects A, B, C, D. They can contain references to one another; for example, A might reference B and C, and C might reference A. I want to create segments but don't want to create them twice, so I don't want both segment A C and segment C A, just one of them. So I want to keep a list of created segments, e.g. A C, and check whether I already have an A C or a C A and skip it if so.
Is there a data structure that can do this?
Thanks
if(list.contains(a,b))
{
//dont add
}
you may introduce something like
class PairKey<T extends Comparable<T>> {
final T fst, snd;
public PairKey(T a, T b) {
if (a.compareTo(b) <=0 ) {
fst = a;
snd = b;
} else {
fst = b;
snd = a;
}
}
@Override
public int hashCode() {
// combine both fields; fst and snd are already in canonical order
return 37 * fst.hashCode() + snd.hashCode();
}
@Override
public boolean equals(Object other) {
if (other == this) return true;
if (!(other instanceof PairKey)) return false;
PairKey<?> obj = (PairKey<?>) other;
return (obj.fst.equals(fst) && obj.snd.equals(snd));
}
}
Then you can put the edges into a HashSet<PairKey<? extends Comparable>> and check whether a given pair is already there.
You will need to make your vertices comparable, so that PairKey(A,B) can be treated as equal to PairKey(B,A).
HashSet will then do the rest for you, e.g. you will be able to query
pairs.contains(new PairKey(A,B));
and if pairs contains either PairKey(A,B) or PairKey(B,A), it will return true.
The hashCode implementation could be slightly different; your IDE may generate something more sophisticated.
Hope that helps.
I would use an object called Pair that would look something like this:
class Pair
{
Node start;
Node end;
public Pair(Node start, Node end)
{
this.start=start;
this.end=end;
}
public Pair reverse()
{
return new Pair(end,start);
}
}
Now you can do something like this:
if(pairs.contains(currentPair) || pairs.contains(currentPair.reverse()))
{
continue;
} else{
pairs.add(currentPair);
}
As pointed out in the comments, you will need to implement equals and hashCode. However, making equals also match the reversed segment is bad practice in a pure OO sense. Implementing equals in the fashion described in the comments would tie Pair to your application and make it less reusable.
You can use a set of sets of objects.
Set<Set<MyObjectType>> segments = new HashSet<Set<MyObjectType>>();
Then you can add two-element sets representing pairs of MyObject. Since sets are unordered, if segments contains a set with A and B, attempting to add a set containing B and A will treat it as already present in segments.
Set<MyObjectType> segment = new HashSet<MyObjectType>();
segment.add(A); // A and B are instances of MyObjectType
segment.add(B);
segments.add(segment);
segment = new HashSet<MyObjectType>();
segment.add(B);
segment.add(A);
segments.add(segment);
System.out.println("Number of segments: " + segments.size()); // prints 1
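On Java 9+ the same set-of-sets idea can be wrapped up more compactly with Set.of. A hedged sketch: the Segments class is hypothetical, and String labels stand in for whatever object type you actually use (it needs consistent equals and hashCode).

```java
import java.util.HashSet;
import java.util.Set;

class Segments {
    private final Set<Set<String>> seen = new HashSet<>();

    // Returns true if the segment is new, false if a-b or b-a was added before.
    // Assumes a and b are distinct and non-null; Set.of throws otherwise.
    boolean add(String a, String b) {
        return seen.add(Set.of(a, b));
    }

    int size() {
        return seen.size();
    }
}
```

Because two sets are equal whenever they contain the same elements, the order in which the endpoints are passed does not matter.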
Your problem is related to graph theory.
What you can try is to remove that internal list and create an incidence matrix that all your objects share.
The final solution mostly depends on the goal of the task and the available structures, so it is hard to choose the best solution for your problem from the description you have provided.
Use java.util.Set / java.util.HashSet and keep adding the references you find, e.g.
Set set1 = new HashSet();
set1.add(A); set1.add(C); set1.add(C);
You can then add this set to an outer set with finalSet.add(set1):
Set<Set> finalSet = new HashSet<Set>();
finalSet.add(set1);
This filters out the duplicates automatically, and in the end you are left with A and C only.

Lucene: Boolean OR in MultiFieldQueryParser

I have a database with 10 fields, and I need to construct a query that looks something like the following pseudo code:
theQuery = ((field1 == A) &&
(field2 == B) &&
(field3 == C) &&
(field4 == D) &&
(field5 == E) &&
(field6 == F) &&
(field7 == G) &&
((field8 == H) || (field9 == H) || (field10 == H)))
That is to say that I need fields 1-7 to definitely contain the corresponding supplied variable, and I need the variable H to definitely appear in at least one of fields 8-10.
I have been trying to use the MultiFieldQueryParser, but the problem that I have is that the BooleanClauses supplied are MUST, MUST_NOT and SHOULD, and we can set the default operator of the MultiFieldQueryParser to be either AND or OR.
When I try using AND and setting fields 1-7 with MUST and fields 8-10 with SHOULD, the query parser basically ignores fields 8-10 and gives me back anything that contains the specified data in fields 1-7.
I haven't yet tried setting the default operator to OR, because I'm guessing that the query will return results that contain one or more of the supplied variables in fields 1-10.
For those that wish to see code, my code is as follows:
ArrayList queries = new ArrayList();
ArrayList fields = new ArrayList();
ArrayList flags = new ArrayList();
if(varA != null && !varA.equals(""))
{
queries.Add(varA);
fields.Add("field1");
flags.Add(BooleanClause.Occur.Must);
}
//... The same for 2-7
if(varH != null && !varH.equals(""))
{
queries.Add(varH);
queries.Add(varH);
queries.Add(varH);
fields.Add("field8");
fields.Add("field9");
fields.Add("field10");
flags.Add(BooleanClause.Occur.Should);
flags.Add(BooleanClause.Occur.Should);
flags.Add(BooleanClause.Occur.Should);
}
Query q = MultiFieldQueryParser.parse(VERSION.LUCENE_34,
queries.toArray(),
fields.toArray(),
flags.toArray(),
theAnalyzer);
Obviously this is somewhat simplified as the ArrayLists don't neatly return me arrays of Strings and BooleanClause.Occurs, but you get the idea.
Does anyone know of a way of forming a multifield query, including both boolean ANDs and boolean ORs?
Thanks,
Rik
I don't really understand your notation, so it's hard to figure out what the problem is. But just use standard queries, putting the OR part in its own nested BooleanQuery of SHOULD clauses that is added to the top query as a MUST clause:
BooleanQuery topQuery = new BooleanQuery();
topQuery.add(new TermQuery(...), BooleanClause.Occur.MUST);
etc.
Or just do it in text and let the parser parse it for you: +field1:A +field2:B ... +(field8:H field9:H field10:H)
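For the text route, the query string can be assembled mechanically. In the classic query parser syntax, a leading + marks a clause as required and parentheses group a sub-query, so a required group of optional clauses means "at least one of these must match". The sketch below only builds the string and does not call Lucene at all. QueryText is a hypothetical helper, the default operator is assumed to be OR, and values containing special characters would still need escaping (for example with QueryParser.escape) before being passed to a parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only assembles query text.
class QueryText {
    // Each required field becomes "+field:value"; the OR group becomes
    // "+(f1:value f2:value ...)", i.e. at least one of them must match.
    static String build(String[] requiredFields, String[] requiredValues,
                        String[] anyOfFields, String anyOfValue) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < requiredFields.length; i++) {
            parts.add("+" + requiredFields[i] + ":" + requiredValues[i]);
        }
        List<String> group = new ArrayList<>();
        for (String field : anyOfFields) {
            group.add(field + ":" + anyOfValue);
        }
        parts.add("+(" + String.join(" ", group) + ")");
        return String.join(" ", parts);
    }
}
```

This also explains the behaviour described in the question: once a flat query contains MUST clauses, its SHOULD clauses no longer have to match, which is why the OR part has to be a required group of its own.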
