I know Hashtable doesn't allow null keys, but how is the code below working?
And what does initializing the BigDecimal to -99 in the code below do?
private static final BigDecimal NO_REGION = new BigDecimal(-99);

public List getAllParameters(BigDecimal region, String key) {
    List values = null;
    if (region == null) {
        region = NO_REGION;
    }
    Hashtable paramCache = (Hashtable) CacheManager.getInstance().get(ParameterCodeConstants.PARAMETER_CACHE);
    if (paramCache.containsKey(region)) {
        values = (List) ((Hashtable) paramCache.get(region)).get(key);
    }
    return values;
}
I have been struggling with this for a long time and don't understand it.
This is an implementation of the null object pattern: a special object, BigDecimal(-99), is designated to play the role of null in a situation where "real" nulls are not allowed.
The only requirement is that the null object must be different from all "regular" objects. This way, the next time the program needs to find entries with no region, all it needs to do is a lookup by the NO_REGION key.
Regions are identified by a BigDecimal key in the hashtable; when no region is provided (null), the default value of -99 is used instead.
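As a minimal sketch of that convention (the writer side is not shown in the question, so the map-population code here is only a hypothetical illustration):

import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;

public class NullObjectKeyExample {
    private static final BigDecimal NO_REGION = new BigDecimal(-99);

    public static void main(String[] args) {
        // Inner table: parameter key -> values; outer table: region -> inner table.
        Hashtable<String, List<String>> regionless = new Hashtable<>();
        regionless.put("timeout", Arrays.asList("30"));

        Hashtable<BigDecimal, Hashtable<String, List<String>>> paramCache = new Hashtable<>();
        // Entries without a region are stored under the NO_REGION sentinel,
        // because Hashtable would reject a real null key.
        paramCache.put(NO_REGION, regionless);

        BigDecimal region = null;                                      // caller passed no region
        BigDecimal lookupKey = (region == null) ? NO_REGION : region;
        System.out.println(paramCache.get(lookupKey).get("timeout"));  // prints [30]
    }
}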
It just looks like poor code to me - if something that short makes you "struggle for a long time", that is usually the best indicator.
Just cleaning it up a little probably makes it a lot clearer:
private static Hashtable paramCache = (Hashtable) CacheManager.getInstance().get(ParameterCodeConstants.PARAMETER_CACHE);

public List getAllParameters(BigDecimal region, String key) {
    List values = null;
    if (region != null && paramCache.containsKey(region)) {
        Hashtable regionMap = (Hashtable) paramCache.get(region);
        values = (List) regionMap.get(key);
    }
    return values;
}
It seems the code that writes into the hashtable uses NO_REGION as the key for values that have no region, so the reader is simply doing the same thing.
I am new to Hadoop.
I am trying to use MapReduce to get the min and max monthly precipitation value for each year.
Here is what one year of the data set looks like:
Product code,Station number,Year,Month,Monthly Precipitation Total (millimetres),Quality
IDCJAC0001,023000,1839,01,11.5,Y
IDCJAC0001,023000,1839,02,11.4,Y
IDCJAC0001,023000,1839,03,20.8,Y
IDCJAC0001,023000,1839,04,10.5,Y
IDCJAC0001,023000,1839,05,4.8,Y
IDCJAC0001,023000,1839,06,90.4,Y
IDCJAC0001,023000,1839,07,54.2,Y
IDCJAC0001,023000,1839,08,97.4,Y
IDCJAC0001,023000,1839,09,41.4,Y
IDCJAC0001,023000,1839,10,40.8,Y
IDCJAC0001,023000,1839,11,113.2,Y
IDCJAC0001,023000,1839,12,8.9,Y
And this is the result I get for the year 1839:
1839 1.31709005E9 1.3172928E9
Obviously, the result does not match the original data, but I cannot figure out why this happens.
Your code has multiple issues.
(1) In MinMaxExposure, you write doubles but read ints. You also use the Double type (meaning that you care about nulls) but do not handle nulls in serialization/deserialization. If you really need nulls, you should write something like this:
// write
out.writeBoolean(value != null);
if (value != null) {
    out.writeDouble(value);
}
// read
if (in.readBoolean()) {
    value = in.readDouble();
} else {
    value = null;
}
If you do not need to store nulls, replace Double with double.
(2) In the map function you wrap your code in IOException catch blocks. This doesn't make any sense: if the input data has records in an incorrect format, you will most probably get a NullPointerException or NumberFormatException from Double.parseDouble(), and you do not handle those exceptions.
Checking for nulls after calling parseDouble also doesn't make sense.
(3) You pass the map key to the reducer as Text. I would recommend passing the year as an IntWritable instead and configuring your job with job.setMapOutputKeyClass(IntWritable.class); (see the driver sketch after point (4)).
(4) maxExposure must be handled similarly to minExposure in the reducer code. Currently you just return the value of the last record.
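A minimal driver-configuration sketch for point (3); the driver, mapper, and reducer class names here are assumptions for illustration, not the actual code from the question:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class PrecipitationDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "min-max precipitation");
        job.setJarByClass(PrecipitationDriver.class);
        job.setMapperClass(PrecipitationMapper.class);    // hypothetical mapper class
        job.setReducerClass(PrecipitationReducer.class);  // hypothetical reducer class

        // Emit the year as an IntWritable instead of Text.
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(MinMaxExposure.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(MinMaxExposure.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}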
Your logic to find the min and max exposure in the Reducer seems off. You set maxExposure twice, and never check whether it is actually the max exposure. I'd go with:
public void reduce(Text key, Iterable<MinMaxExposure> values,
        Context context) throws IOException, InterruptedException {
    // Start from the infinities; Double.MIN_VALUE is the smallest *positive*
    // double and would be a wrong initial value for the maximum.
    double minExposure = Double.POSITIVE_INFINITY;
    double maxExposure = Double.NEGATIVE_INFINITY;
    for (MinMaxExposure val : values) {
        if (val.getMinExposure() < minExposure) {
            minExposure = val.getMinExposure();
        }
        if (val.getMaxExposure() > maxExposure) {
            maxExposure = val.getMaxExposure();
        }
    }
    MinMaxExposure resultRow = new MinMaxExposure();
    resultRow.setMinExposure(minExposure);
    resultRow.setMaxExposure(maxExposure);
    context.write(key, resultRow);
}
We are storing complex objects in Hazelcast maps and need the possibility to search for objects not only based on the key but also on the content of these complex objects. In order to not take too large a performance hit, we are using indices on those search terms.
We are also using spring-data-hazelcast, which provides repositories that allow us to use findByAbcXyz() style query methods. For some of the more complex queries we are using the @Query annotation (which spring-data-hazelcast internally translates to SqlPredicates).
We have now encountered an issue where under certain situations these #Query based search methods did not return any values, even if we could verify that the searched objects did in fact exist in the map.
I have managed to reproduce this issue with core hazelcast (i.e. without the use of spring-data-hazelcast).
Here is our object structure:
BetriebspunktKey.java
public class BetriebspunktKey implements Serializable {
    private Integer uicLand;
    private Integer nummer;

    public BetriebspunktKey(final Integer uicLand, final Integer nummer) {
        this.uicLand = uicLand;
        this.nummer = nummer;
    }

    public Integer getUicLand() {
        return uicLand;
    }

    public Integer getNummer() {
        return nummer;
    }
}
Betriebspunkt.java
public class Betriebspunkt implements Serializable {
    private BetriebspunktKey key;
    private List<BetriebspunktVersion> versionen;

    public Betriebspunkt(final BetriebspunktKey key, final List<BetriebspunktVersion> versionen) {
        this.key = key;
        this.versionen = versionen;
    }

    public BetriebspunktKey getKey() {
        return key;
    }

    public List<BetriebspunktVersion> getVersionen() {
        return versionen;
    }
}
BetriebspunktVersion.java
public class BetriebspunktVersion implements Serializable {
    private List<BetriebspunktKey> zusatzbetriebspunkte;

    public BetriebspunktVersion(final List<BetriebspunktKey> zusatzbetriebspunkte) {
        this.zusatzbetriebspunkte = zusatzbetriebspunkte;
    }

    public List<BetriebspunktKey> getZusatzbetriebspunkte() {
        return zusatzbetriebspunkte;
    }
}
In my main file, I am now setting up hazelcast:
Config config = new Config();
final MapConfig mapConfig = config.getMapConfig("points");
mapConfig.addMapIndexConfig(new MapIndexConfig("versionen[any].zusatzbetriebspunkte[any].nummer", false));
HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
IMap<BetriebspunktKey, Betriebspunkt> map = instance.getMap("points");
I am also preparing my search criteria for later on:
Predicate equalPredicate = Predicates.equal("versionen[any].zusatzbetriebspunkte[any].nummer", 53090);
Predicate sqlPredicate = new SqlPredicate("versionen[any].zusatzbetriebspunkte[any].nummer=53090");
Next, I am creating two objects, one with the "full depth" of information, the other does not contain any "zusatzbetriebspunkte":
final Betriebspunkt abc = new Betriebspunkt(
        new BetriebspunktKey(80, 166),
        Collections.singletonList(new BetriebspunktVersion(
                Collections.singletonList(new BetriebspunktKey(80, 53090))
        ))
);

final Betriebspunkt def = new Betriebspunkt(
        new BetriebspunktKey(83, 141),
        Collections.singletonList(new BetriebspunktVersion(
                Collections.emptyList()
        ))
);
Here is where things become interesting. If I first insert the "full" object into the map, the search works using both the EqualPredicate and the SqlPredicate:
map.put(abc.getKey(), abc);
map.put(def.getKey(), def);
Collection<Betriebspunkt> equalResults = map.values(equalPredicate);
Collection<Betriebspunkt> sqlResults = map.values(sqlPredicate);
assertEquals(1, equalResults.size()); // contains "abc"
assertEquals(1, sqlResults.size()); // contains "abc"
However, if I insert the objects into the map in the reverse order (i.e. first the "partial" object and then the "full" one), only the EqualPredicate works correctly; the SqlPredicate returns an empty list, no matter what the content of the map or the search criteria are.
map.put(def.getKey(), def);
map.put(abc.getKey(), abc);
Collection<Betriebspunkt> equalResults = map.values(equalPredicate);
Collection<Betriebspunkt> sqlResults = map.values(sqlPredicate);
assertEquals(1, equalResults.size()); // contains "abc"
assertEquals(1, sqlResults.size()); // --> this fails, it returns an empty list
What is the reason for this behaviour? It looks like a bug in the hazelcast code.
The reason for failing
After a lot of debugging, I have found the reason for this issue. The reasons can indeed be found in the hazelcast code.
When putting a value into a hazelcast map, DefaultRecordStore.putInternal is called. At the end of this method DefaultRecordStore.saveIndex is called, which finds the corresponding indexes and then calls Indexes.saveEntryIndex. This method iterates over each index and calls InternalIndex.saveEntryIndex (or rather its implementation IndexImpl.saveEntryIndex). The interesting part of that method consists of the following lines:
if (this.converter == null || this.converter == TypeConverters.NULL_CONVERTER) {
    this.converter = entry.getConverter(this.attributeName);
}
Apparently each index stores a converter class when the first element is put into the map. Looking at QueryableEntry.getConverter explains what happens:
TypeConverter getConverter(String attributeName) {
    Object attribute = this.getAttributeValue(attributeName);
    if (attribute == null) {
        return TypeConverters.NULL_CONVERTER;
    } else {
        AttributeType attributeType = this.extractAttributeType(attributeName, attribute);
        return attributeType == null ? TypeConverters.IDENTITY_CONVERTER : attributeType.getConverter();
    }
}
When first inserting the "full" object, extractAttributeType() will follow the "path" of our index definition "versionen[any].zusatzbetriebspunkte[any].nummer" and find out that nummer is an integer type; accordingly, a TypeConverters.IntegerConverter will be returned and stored.
When first inserting the "partial" object, "zusatzbetriebspunkte[any]" is empty, so there is no way for extractAttributeType to find out what type nummer has; it therefore returns null, which means that TypeConverters.IdentityConverter is used.
Also, whenever a "full" element is inserted, an entry is written into the index map using nummer as its key, i.e. the index map is keyed by an Integer.
So much for writing to the map. Let's now look at how data is read from the map. When calling map.values(predicate) we will eventually get to QueryRunner.runUsingGlobalIndexSafely which contains a line:
Collection<QueryableEntry> entries = indexes.query(predicate);
This will in turn, after some boilerplate code, call
Set<QueryableEntry> result = indexAwarePredicate.filter(queryContext);
For both of our predicates we will eventually get to IndexImpl.getRecords() which looks as follows:
public Set<QueryableEntry> getRecords(Comparable attributeValue) {
    long timestamp = this.stats.makeTimestamp();
    if (this.converter == null) {
        this.stats.onIndexHit(timestamp, 0L);
        return new SingleResultSet((Map) null);
    } else {
        Set<QueryableEntry> result = this.indexStore.getRecords(this.convert(attributeValue));
        this.stats.onIndexHit(timestamp, (long) result.size());
        return result;
    }
}
The crucial call is this.convert(attributeValue) where attributeValue is the value of the predicate.
If we compare our two predicates, we can see that the EqualPredicate has two members:
attributeName = "versionen[any].zusatzbetriebspunkte[any].nummer"
value = {Integer} 53090
The SqlPredicate contains the initial string (which we passed to its constructor), which at construction time was also parsed and mapped to an internal EqualPredicate (which is eventually used when the predicate is evaluated and passed to getRecords() above):
sql = "versionen[any].zusatzbetriebspunkte[any].nummer=53090"
predicate = {EqualPredicate}
attributeName = "versionen[any].zusatzbetriebspunkte[any].nummer"
value = {String} "53090"
And this explains why the manually created EqualPredicate works in both cases: Its value is an integer. When passed to the converter, it does not matter whether it is the IntegerConverter or the IdentityConverter, as both will return the integer which can then be used as key in the index-map (which uses an integer as key).
With the SqlPredicate however, the value is a String. If this is passed to the IntegerConverter, it is converted to its corresponding integer value and accessing the index-map works. If it is passed to the IdentityConverter, the string is returned by the conversion and trying to access the index-map with a string will never find any results.
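Here is a tiny, self-contained Java illustration (independent of Hazelcast) of why the identity-converted String can never hit the Integer-keyed index store:

import java.util.HashMap;
import java.util.Map;

public class ConverterMismatchDemo {
    public static void main(String[] args) {
        // The index store is keyed by the extracted attribute value, here an Integer.
        Map<Comparable<?>, String> indexStore = new HashMap<>();
        indexStore.put(53090, "abc");

        // What the IntegerConverter effectively does with the SqlPredicate value "53090":
        System.out.println(indexStore.get(Integer.valueOf("53090"))); // abc

        // What the IdentityConverter does: the String stays a String.
        System.out.println(indexStore.get("53090"));                  // null
    }
}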
A possible solution
How can we solve this issue? I see several possibilities:
insert a "fully built" dummy value into our map during startup to ensure the converter is correctly initialised. While this works, it is ugly and not maintenance friendly
avoid using SqlPredicate and use the integer-based EqualPredicate. This is not an option when working with spring-data-hazelcast, as it always converts @Query-based searches to SqlPredicates. We could of course use hazelcast directly and circumvent the spring-data wrapper; while that would work, it would mean having two ways of accessing hazelcast, which is also not very maintainable
use hazelcast's ValueExtractor class. This is the elegant solution that works both natively and using spring-data-hazelcast. I will outline what that looks like:
First we need to implement a value extractor which returns all zusatzbetriebspunkte of our Betriebspunkt in a form suitable for us:
public class BetriebspunktExtractor extends ValueExtractor<Betriebspunkt, String> implements Serializable {
    @Override
    public void extract(final Betriebspunkt betriebspunkt, final String argument, final ValueCollector valueCollector) {
        betriebspunkt.getVersionen().stream()
                .map(BetriebspunktVersion::getZusatzbetriebspunkte)
                .flatMap(List::stream)
                .map(zbp -> zbp.getUicLand() + "_" + zbp.getNummer())
                .forEach(valueCollector::addObject);
    }
}
You'll notice that I am not only returning the nummer field but also including the uicLand field; this is something we really wanted but couldn't get working using the "...[any]..." notation. We could of course return only the nummer if we wanted exactly the same behavior as outlined above.
Now we need to modify our hazelcast configuration slightly:
Config config = new Config();
final MapConfig mapConfig = config.getMapConfig("points");
//mapConfig.addMapIndexConfig(new MapIndexConfig("versionen[any].zusatzbetriebspunkte[any].nummer", false));
mapConfig.addMapIndexConfig(new MapIndexConfig("zusatzbetriebspunkt", false));
mapConfig.addMapAttributeConfig(new MapAttributeConfig("zusatzbetriebspunkt", BetriebspunktExtractor.class.getName()));
You'll notice that the "long" index definition using the "...[any]..." notation is no longer needed.
Now we can use this "pseudo attribute" to query our values and it doesn't matter in which order the objects have been added to the map:
Predicate keyPredicate = Predicates.equal("zusatzbetriebspunkt", "80_53090");
Collection<Betriebspunkt> keyResults = map.values(keyPredicate);
assertEquals(1, keyResults.size()); // always contains "abc"
And in our spring-data-hazelcast repository we can now do this:
@Query("zusatzbetriebspunkt=%d_%d")
List<StammdatenBetriebspunkt> findByZusatzbetriebspunkt(Integer uicLand, Integer nummer);
If you do not need to use spring-data-hazelcast, instead of returning a string to the ValueCollector, you could return the BetriebspunktKey directly and then use it in the predicate as well. That would be the cleanest solution:
public class BetriebspunktExtractor extends ValueExtractor<Betriebspunkt, String> implements Serializable {
    @Override
    public void extract(final Betriebspunkt betriebspunkt, final String argument, final ValueCollector valueCollector) {
        betriebspunkt.getVersionen().stream()
                .map(BetriebspunktVersion::getZusatzbetriebspunkte)
                .flatMap(List::stream)
                //.map(zbp -> zbp.getUicLand() + "_" + zbp.getNummer())
                .forEach(valueCollector::addObject);
    }
}
and then
Predicate keyPredicate = Predicates.equal("zusatzbetriebspunkt", new BetriebspunktKey(80, 53090));
However, for this to work, BetriebspunktKey needs to implement Comparable and must also provide its own equals and hashCode methods.
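A sketch of what that could look like (field names taken from the question; the exact equality and ordering semantics are an assumption):

import java.io.Serializable;
import java.util.Objects;

public class BetriebspunktKey implements Serializable, Comparable<BetriebspunktKey> {
    private final Integer uicLand;
    private final Integer nummer;

    public BetriebspunktKey(final Integer uicLand, final Integer nummer) {
        this.uicLand = uicLand;
        this.nummer = nummer;
    }

    public Integer getUicLand() {
        return uicLand;
    }

    public Integer getNummer() {
        return nummer;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof BetriebspunktKey)) {
            return false;
        }
        final BetriebspunktKey other = (BetriebspunktKey) o;
        return Objects.equals(uicLand, other.uicLand) && Objects.equals(nummer, other.nummer);
    }

    @Override
    public int hashCode() {
        return Objects.hash(uicLand, nummer);
    }

    @Override
    public int compareTo(final BetriebspunktKey other) {
        final int byLand = uicLand.compareTo(other.uicLand);
        return byLand != 0 ? byLand : nummer.compareTo(other.nummer);
    }
}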
In my code I have a List<Person>. Attributes of the objects in this list may include something along the lines of:
ID
First Name
Last Name
In a part of my application, I will be allowing the user to search for a specific person by using any combination of those three values. At the moment, I have a switch statement simply checking which fields are filled out, and calling the method designated for that combination of values.
i.e.:
switch typeOfSearch
if 0, lookById()
if 1, lookByIdAndName()
if 2, lookByFirstName()
and so on. There are actually 7 different types.
This makes me have one method for each statement. Is this a 'good' way to do this? Is there a way that I should use a parameter or some sort of 'filter'? It may not make a difference, but I'm coding this in Java.
You can do something more elegant with maps and interfaces. Try this, for example:
interface LookUp {
    Person lookUpBy(HttpRequest req);
}

Map<Integer, LookUp> map = new HashMap<Integer, LookUp>();
map.put(0, new LookUpById());
map.put(1, new LookUpByIdAndName());
...
In your controller you can then do:
int type = Integer.parseInt(request.getParameter("type"));
Person person = map.get(type).lookUpBy(request);
This way you can quickly look up the method with a map. Of course you can also use a long switch but I feel this is more manageable.
If good means "the language does it for me", no.
If good means 'readable', I would define in Person a method match() that returns true if the object matches your search criteria. It is probably also a good idea to create a Criteria class that encapsulates the search criteria (which fields you are looking for and which values) and pass it to match(Criteria criteria).
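A minimal sketch of that idea; the Criteria fields, constructor, and the match() signature are assumptions for illustration, not an existing API:

class Criteria {
    private final String id;        // null means "do not filter on this field"
    private final String firstName;
    private final String lastName;

    Criteria(String id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getId() { return id; }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
}

class Person {
    private String id;
    private String firstName;
    private String lastName;

    // Returns true if this person satisfies every criterion that is set (non-null).
    boolean match(Criteria c) {
        return (c.getId() == null || c.getId().equals(id))
            && (c.getFirstName() == null || c.getFirstName().equals(firstName))
            && (c.getLastName() == null || c.getLastName().equals(lastName));
    }
}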
That approach quickly becomes unmanageable, since the number of combinations grows very fast.
Create a PersonFilter class holding all the possible query parameters, and visit each person in the list:
private class PersonFilter {
    private String id;
    private String firstName;
    private String lastName;

    // constructor omitted

    public boolean accept(Person p) {
        if (this.id != null && !this.id.equals(p.getId())) {
            return false;
        }
        if (this.firstName != null && !this.firstName.equals(p.getFirstName())) {
            return false;
        }
        if (this.lastName != null && !this.lastName.equals(p.getLastName())) {
            return false;
        }
        return true;
    }
}
The filtering is now implemented by
public List<Person> filter(List<Person> list, PersonFilter filter) {
    List<Person> result = new ArrayList<Person>();
    for (Person p : list) {
        if (filter.accept(p)) {
            result.add(p);
        }
    }
    return result;
}
At some point you should take a look at something like Lucene, which will give you the best scalability, manageability, and performance for this type of searching. Not knowing the amount of data you're dealing with, I only recommend this as a longer-term solution for a larger set of objects to search. It's an amazing tool!
HashMap allows one null key and any number of null values. What is the use of it?
I'm not positive what you're asking, but if you're looking for an example of when one would want to use a null key, I use them often in maps to represent the default case (i.e. the value that should be used if a given key isn't present):
Map<A, B> foo;
A search;
B val = foo.containsKey(search) ? foo.get(search) : foo.get(null);
HashMap handles null keys specially (since it can't call .hashCode() on a null object), but null values aren't anything special; they're stored in the map like anything else.
One example would be for modeling trees. If you are using a HashMap to represent a tree structure, where the key is the parent and the value is list of children, then the values for the null key would be the root nodes.
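A small sketch of that idea (Node and the variable names are made up for illustration):

Map<Node, List<Node>> children = new HashMap<>();
children.put(null, Arrays.asList(rootA, rootB));        // roots have no parent
children.put(rootA, Arrays.asList(childA1, childA2));
List<Node> roots = children.get(null);                  // all top-level nodes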
One example of usage for null values is when using a HashMap as a cache for results of an expensive operation (such as a call to an external web service) which may return null.
Putting a null value in the map then allows you to distinguish between the case where the operation has not been performed for a given key (cache.containsKey(someKey) returns false), and where the operation has been performed but returned a null value (cache.containsKey(someKey) returns true, cache.get(someKey) returns null).
Without null values, you would have to either put some special value in the cache to indicate a null response, or simply not cache that response at all and perform the operation every time.
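A compact sketch of that pattern (lookupRemote stands in for the hypothetical expensive operation):

Map<String, String> cache = new HashMap<>();
String result;
if (cache.containsKey(key)) {
    result = cache.get(key);        // may legitimately be null
} else {
    result = lookupRemote(key);     // expensive call, may return null
    cache.put(key, result);         // a null result is cached like any other value
}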
The answers so far only consider the worth of having a null key, but the question also asks about any number of null values.
The benefit of storing the value null against a key in a HashMap is the same as in databases, etc - you can record a distinction between having a value that is empty (e.g. string ""), and not having a value at all (null).
Here's my only-somewhat-contrived example of a case where the null key can be useful:
public class Timer {
    private static final Logger LOG = Logger.getLogger(Timer.class);
    private static final Map<String, Long> START_TIMES = new HashMap<String, Long>();

    public static synchronized void start() {
        long now = System.currentTimeMillis();
        if (START_TIMES.containsKey(null)) {
            LOG.warn("Anonymous timer was started twice without being stopped; previous timer has run for " + (now - START_TIMES.get(null).longValue()) + "ms");
        }
        START_TIMES.put(null, now);
    }

    public static synchronized long stop() {
        if (!START_TIMES.containsKey(null)) {
            return 0;
        }
        return printTimer("Anonymous", START_TIMES.remove(null), System.currentTimeMillis());
    }

    public static synchronized void start(String name) {
        long now = System.currentTimeMillis();
        if (START_TIMES.containsKey(name)) {
            LOG.warn(name + " timer was started twice without being stopped; previous timer has run for " + (now - START_TIMES.get(name).longValue()) + "ms");
        }
        START_TIMES.put(name, now);
    }

    public static synchronized long stop(String name) {
        if (!START_TIMES.containsKey(name)) {
            return 0;
        }
        return printTimer(name, START_TIMES.remove(name), System.currentTimeMillis());
    }

    private static long printTimer(String name, long start, long end) {
        LOG.info(name + " timer ran for " + (end - start) + "ms");
        return end - start;
    }
}
Another example: I use it to group data by date, but some data have no date. I can group those under the null key and display them with the header "NoDate".
A null key can also be helpful when the map stores data for UI selections where the map key represents a bean field.
A corresponding null field value would for example be represented as "(please select)" in the UI selection.
Is there a better way to cache some very large objects that can only be created once and therefore need to be cached? Currently, I have the following:
public enum LargeObjectCache {
    INSTANCE;

    private Map<String, LargeObject> map = new HashMap<...>();

    public LargeObject get(String s) {
        if (!map.containsKey(s)) {
            map.put(s, new LargeObject(s));
        }
        return map.get(s);
    }
}
There are several classes that can use the LargeObjects, which is why I decided to use a singleton for the cache, instead of passing LargeObjects to every class that uses it.
Also, the map doesn't contain many keys (one or two, but the key can vary between different runs of the program), so is there another, more efficient map to use in this case?
You may need thread-safety to ensure you don't end up with two instances for the same name.
It doesn't matter much for small maps, but you can avoid one map lookup, which can make it faster.
public LargeObject get(String s) {
    synchronized (map) {
        LargeObject ret = map.get(s);
        if (ret == null) {
            map.put(s, ret = new LargeObject(s));
        }
        return ret;
    }
}
As has been pointed out, you need to address thread-safety. Simply using Collections.synchronizedMap() doesn't make it completely correct, as the code entails compound operations. Synchronizing the entire block is one solution. However, if scalability matters, using ConcurrentHashMap will result in much more concurrent and scalable behavior.
public enum LargeObjectCache {
    INSTANCE;

    private final ConcurrentMap<String, LargeObject> map = new ConcurrentHashMap<...>();

    public LargeObject get(String s) {
        LargeObject value = map.get(s);
        if (value == null) {
            value = new LargeObject(s);
            LargeObject old = map.putIfAbsent(s, value);
            if (old != null) {
                value = old;
            }
        }
        return value;
    }
}
You'll need to use it exactly in this form to have the correct and the most efficient behavior.
If you must ensure only one thread gets to even instantiate the value for a given key, then it becomes necessary to turn to something like the computing map in Google Collections or the memoizer example in Brian Goetz's book "Java Concurrency in Practice".
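On Java 8 and later, ConcurrentHashMap.computeIfAbsent gives you essentially that memoizer behaviour out of the box; here is a sketch of the same cache using it (assuming, as in the question, that LargeObject has a constructor taking a String):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public enum LargeObjectCache {
    INSTANCE;

    private final ConcurrentMap<String, LargeObject> map = new ConcurrentHashMap<>();

    public LargeObject get(String s) {
        // The mapping function is invoked at most once per key; other threads
        // requesting the same key block until the value is available.
        return map.computeIfAbsent(s, LargeObject::new);
    }
}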