I'm somewhat of a beginner in Java, although I understand the basics. I believed this was the best implementation for my problem, but I may be wrong. This is a mock example, and I'm not looking for different implementations; I only mention my doubt in case what I'm asking for turns out to be impossible. Regardless:
Here I have an enum, inside of which I want a map (specifically a LinkedHashMap) as one of the values stored by each enum constant:
enum Recipe {
PANCAKES(true, new LinkedHashMap<>() ),
SANDWICH(true, new LinkedHashMap<>() ),
STEW(false, new LinkedHashMap<>() );
private final boolean tasty;
private final LinkedHashMap<String, String> directions;
// getter for directions
Recipe(boolean tasty, LinkedHashMap<String, String> directions) {
this.tasty = tasty;
this.directions = directions;
}
}
However, I haven't found a way to initialize and populate a Map of any size in a single expression
(as this would be needed for an enum constant)
For example, I thought this looked fine
PANCAKES(true, new LinkedHashMap<>(){{
put("Pancake Mix","Pour");
put("Water","Mix with");
put("Pan","Put mixture onto");
}})
Until I read that this is dangerous and can cause a memory leak. Plus, it isn't the best looking code.
I also found the method:
Map.ofEntries(entry(), entry()... entry())
Which can be turned into a LinkedHashMap by passing it through its constructor:
PANCAKES(true, new LinkedHashMap<>(Map.ofEntries(entry(), ...)) );
Although I haven't found a way to ensure the insertion order is preserved, since as far as I'm aware the Map it returns doesn't keep insertion order.
Of course, there's always the option to store the LinkedHashMaps in a different place outside of the enum and simply put those in manually, but I feel like this would give me a headache managing, as I intend to add to this enum in the future.
Is there any other way to accomplish this?
To clarify: I don't literally need the code to occupy a single line. I just want the LinkedHashMap initialization and population to be written in the same place, rather than storing these things outside of the enum.
Without more context, I'd say that Recipe is kind of a square peg to try to fit into the round hole of enum. In other words, in the absence of some other requirement or context that suggests an enum is best, I'd probably make it a class and expose public static final instances that can be used like enum values.
For example:
public class Recipe {
public static final Recipe PANCAKES =
new Recipe(true,
new Step("Pancake Mix","Pour"),
new Step("Water","Mix with"),
new Step("Pan","Put mixture onto")
);
public static final Recipe SANDWICH =
new Recipe(true
// ...steps...
);
// ...more Recipes ...
@Getter
public static class Step {
private final String target;
private final String action;
private Step(String target, String action ) {
this.target = target;
this.action = action;
}
}
private final boolean tasty;
private final LinkedHashMap<String, Step> directions;
private Recipe(boolean tasty, Step... steps) {
this.tasty = tasty;
this.directions = new LinkedHashMap<>();
for (Step aStep : steps) {
directions.put(aStep.getTarget(), aStep);
}
}
}
You could also do this as an enum, where the values would be declared like this:
PANCAKES(true,
new Step("Pancake Mix","Pour"),
new Step("Water","Mix with"),
new Step("Pan","Put mixture onto")
),
SANDWICH(true
// ...steps...
);
but like I said, this feels like a proper class as opposed to an enum.
First off, you don't really need to declare the map as a concrete implementation. If you just use Map then you will have a lot more choices.
enum Recipe {
PANCAKES(true, Map.of()),
SANDWICH(true, Map.of()),
STEW(false, Map.of());
private final boolean tasty;
private final Map<String, String> directions;
// getter for directions
Recipe(boolean tasty, Map<String, String> directions) {
this.tasty = tasty;
this.directions = directions;
}
}
Then, assuming you don't have more than 10 directions, you can use this form:
PANCAKES(true, Map.of(
"Pancake Mix","Pour",
"Water","Mix with",
"Pan","Put mixture onto"))
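Map.of creates an immutable map, which is probably what you want for this kind of application, and should not have memory leakage issues. One caveat: the iteration order of a map created by Map.of is unspecified, so if the steps must stay in the order you wrote them, you still need a LinkedHashMap (or a list of steps).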
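For directions that have to stay in the order they were written, one option is a small static helper inside the enum that fills a LinkedHashMap from alternating keys and values. This is only a sketch: the names OrderedRecipe and ordered(...) are made up for illustration and are not part of the JDK or of the original question.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

enum OrderedRecipe {
    PANCAKES(true, ordered(
            "Pancake Mix", "Pour",
            "Water", "Mix with",
            "Pan", "Put mixture onto")),
    STEW(false, ordered());

    private final boolean tasty;
    private final Map<String, String> directions;

    OrderedRecipe(boolean tasty, Map<String, String> directions) {
        this.tasty = tasty;
        this.directions = directions;
    }

    boolean isTasty() {
        return tasty;
    }

    Map<String, String> getDirections() {
        return directions;
    }

    // Hypothetical helper: builds a read-only map from alternating keys and
    // values, keeping the order in which the pairs were written.
    private static Map<String, String> ordered(String... keysAndValues) {
        if (keysAndValues.length % 2 != 0) {
            throw new IllegalArgumentException("Expected key/value pairs");
        }
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i < keysAndValues.length; i += 2) {
            map.put(keysAndValues[i], keysAndValues[i + 1]);
        }
        return Collections.unmodifiableMap(map);
    }
}
```

The map is declared right next to the constant, there is no anonymous subclass involved, and the unmodifiable wrapper keeps callers from changing the directions afterwards.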
I am studying data skew processing in Flink and how I can change the low-level control of physical partition in order to have an even processing of tuples. I have created synthetic skewed data sources and I aim to process (aggregate) them over a window. Here is the complete code.
streamTrainsStation01.union(streamTrainsStation02)
.union(streamTicketsStation01).union(streamTicketsStation02)
// map the keys
.map(new StationPlatformMapper(metricMapper)).name(metricMapper)
.rebalance() // or .rescale() .shuffle()
.keyBy(new StationPlatformKeySelector())
.window(TumblingProcessingTimeWindows.of(Time.seconds(20)))
.apply(new StationPlatformRichWindowFunction(metricWindowFunction)).name(metricWindowFunction)
.setParallelism(4)
.map(new StationPlatformMapper(metricSkewedMapper)).name(metricSkewedMapper)
.addSink(new MqttStationPlatformPublisher(ipAddressSink, topic)).name(metricSinkFunction)
;
According to the Flink dashboard, I could not see much difference among .shuffle(), .rescale(), and .rebalance(), even though the documentation says the rebalance() transformation is more suitable for data skew.
After that I tried to use .partitionCustom(partitioner, "someKey"). However, to my surprise, I could not use setParallelism(4) on the window operation. The documentation says:
Note: This operation is inherently non-parallel since all elements
have to pass through the same operator instance.
I did not understand why. If I am allowed to do partitionCustom, why can't I use parallelism after that? Here is the complete code.
streamTrainsStation01.union(streamTrainsStation02)
.union(streamTicketsStation01).union(streamTicketsStation02)
// map the keys
.map(new StationPlatformMapper(metricMapper)).name(metricMapper)
.partitionCustom(new StationPlatformKeyCustomPartitioner(), new StationPlatformKeySelector())
.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(20)))
.apply(new StationPlatformRichAllWindowFunction(metricWindowFunction)).name(metricWindowFunction)
.map(new StationPlatformMapper(metricSkewedMapper)).name(metricSkewedMapper)
.addSink(new MqttStationPlatformPublisher(ipAddressSink, topic)).name(metricSinkFunction)
;
Thanks,
Felipe
I got an answer from the Flink user mailing list. Basically, calling keyBy() after rebalance() undoes whatever rebalance() was trying to achieve, because keyBy() repartitions the stream by the hash of the key. The first (ad hoc) solution that I found is to create a composite key that takes the skewed key into account.
public class CompositeSkewedKeyStationPlatform implements Serializable {
private static final long serialVersionUID = -5960601544505897824L;
private Integer stationId;
private Integer platformId;
private Integer skewParameter;
}
I use it in the map function, before calling keyBy().
public class StationPlatformSkewedKeyMapper
extends RichMapFunction<MqttSensor, Tuple2<CompositeSkewedKeyStationPlatform, MqttSensor>> {
private SkewParameterGenerator skewParameterGenerator;
public StationPlatformSkewedKeyMapper() {
this.skewParameterGenerator = new SkewParameterGenerator(10);
}
@Override
public Tuple2<CompositeSkewedKeyStationPlatform, MqttSensor> map(MqttSensor value) throws Exception {
Integer platformId = value.getKey().f2;
Integer stationId = value.getKey().f4;
Integer skewParameter = 0;
if (stationId.equals(2) && platformId.equals(3)) {
skewParameter = this.skewParameterGenerator.getNextItem();
}
CompositeSkewedKeyStationPlatform compositeKey = new CompositeSkewedKeyStationPlatform(stationId, platformId,
skewParameter);
return Tuple2.of(compositeKey, value);
}
}
Here is my complete solution.
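To make the composite-key idea concrete without pulling in Flink, here is a plain Java sketch of what the extra skew parameter does to one hot key. SaltedKey and RoundRobinSalt are hypothetical stand-ins: the real CompositeSkewedKeyStationPlatform and SkewParameterGenerator are not shown in full above, so the round-robin behaviour and the equals/hashCode implementation are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the composite key. Anything used as a key needs
// consistent equals/hashCode, otherwise equal keys land in different groups.
final class SaltedKey {
    final int stationId;
    final int platformId;
    final int salt;

    SaltedKey(int stationId, int platformId, int salt) {
        this.stationId = stationId;
        this.platformId = platformId;
        this.salt = salt;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof SaltedKey)) return false;
        SaltedKey that = (SaltedKey) other;
        return stationId == that.stationId
                && platformId == that.platformId
                && salt == that.salt;
    }

    @Override
    public int hashCode() {
        return Objects.hash(stationId, platformId, salt);
    }
}

// Hypothetical generator that cycles through 0 .. buckets-1 (an assumption).
final class RoundRobinSalt {
    private final int buckets;
    private int next;

    RoundRobinSalt(int buckets) {
        this.buckets = buckets;
    }

    int getNextItem() {
        int value = next;
        next = (next + 1) % buckets;
        return value;
    }
}

class SaltingDemo {
    // Counts how many records of the hot key (station 2, platform 3)
    // end up under each composite key.
    static Map<SaltedKey, Integer> distribute(int records, int buckets) {
        RoundRobinSalt salt = new RoundRobinSalt(buckets);
        Map<SaltedKey, Integer> counts = new HashMap<>();
        for (int i = 0; i < records; i++) {
            counts.merge(new SaltedKey(2, 3, salt.getNextItem()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(distribute(1000, 10).size()); // 10
    }
}
```

One consequence of this approach: each composite key only sees part of the hot key's records, so the per-window results are partial aggregates that still have to be combined per (station, platform) in a later step.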
Say I have the following code snippet to create colored vegetables for a small random game I'm making to practice separating object properties out of object classes:
List<Vegetable> vegList = new ArrayList<Vegetable>();
Map<MyProperty, Object> propertyList = new HashMap<MyProperty, Object>();
propertyList.put(MyProperty.COLOR, "#0000BB");
propertyList.put(MyProperty.TYPE, MyType.VEGETABLE);
propertyList.put(MyProperty.COMMONNAME, "Potato");
vegList.add(new Vegetable("Maisie", propertyList));
propertyList.put(MyProperty.COLOR, "#00FF00");
propertyList.put(MyProperty.COMMONNAME, "Poisonous Potato");
vegList.add(new Vegetable("Horror", propertyList));
I realized while doing this (making my own example from Head First OOA&D, basically) I have no idea why changing propertyList the second time doesn't affect the values previously set within Maisie.
I followed the structure provided by the book, but the first time around I was creating a new HashMap for each individual Vegetable object, before adding it to the list. The book shows that's unnecessary but doesn't go into why.
All I can see is the interpreter is making a choice to create a new instance of the hashmap when it's specified in the Vegetable constructor the second time around. But why?
How does it know that I'd rather have a different HashMap in there, rather than reusing the first object and .put() changing its values for both Vegetables?
Second related question is.... should I want to actually have 2 vegetables share the exact same list of properties (the same HashMap object), how would I do that? And should this actually be a horrible idea... why? How would wanting this show I just don't know what I'm doing?
My understanding hits a wall beyond "it has to do with object references".
Thanks for helping me clear this up.
Vegetable class as requested:
public class Vegetable {
public VegetableSpec characteristics;
public String name;
public Vegetable(String name, Map<MyProperty, Object> propertyList) {
this.name = name;
this.characteristics = new VegetableSpec(propertyList);
}
public void display() {
System.out.printf("My name is %s!\n", this.name);
for (Entry<MyProperty, Object> entry : characteristics.properties.entrySet()) {
System.out.printf("key: %s, val: %s\n", entry.getKey().toString(), entry.getValue().toString());
}
}
}
... which made me look at VegetableSpec again (I put it in because the book used a separate Spec class, but I didn't understand why it was necessary beyond adding search capabilities; now I think I see it does 2 things, one is defensive copying!):
public class VegetableSpec {
Map<MyProperty, Object> properties;
public VegetableSpec(Map<MyProperty, Object> properties) {
if (properties == null) {
// return a null = bad way to signal a problem
this.properties = new HashMap<>();
} else {
// correction above makes it clear this isn't redundant
this.properties = new HashMap<>(properties);
}
}
}
It sounds like the constructor for Vegetable is making a defensive copy. It is generally a good idea to do this to prevent anyone from changing an object in ways the designer of the object does not want. You should (nearly) always be making defensive copies.
I want to actually have 2 vegetables share the exact same list of properties (the same HashMap object), how would I do that?
Pass the same hash map in and ignore the fact that the constructor makes a defensive copy; as a consumer of the class, that should not matter to you.
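A stripped-down sketch of the effect, using a hypothetical Spec class with String keys instead of the MyProperty enum from the question: the copy made in the constructor is why changing the caller's map later does not touch the first object.

```java
import java.util.HashMap;
import java.util.Map;

class Spec {
    private final Map<String, Object> properties;

    Spec(Map<String, Object> properties) {
        // Copy the entries, so later changes to the caller's map are not visible here.
        this.properties = new HashMap<>(properties);
    }

    Object get(String key) {
        return properties.get(key);
    }
}

class DefensiveCopyDemo {
    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put("COLOR", "#0000BB");
        Spec first = new Spec(props);

        props.put("COLOR", "#00FF00"); // mutate the caller's map afterwards
        Spec second = new Spec(props);

        System.out.println(first.get("COLOR"));  // #0000BB
        System.out.println(second.get("COLOR")); // #00FF00
    }
}
```

Note that new HashMap<>(properties) is a shallow copy: the map itself is duplicated, but the value objects are shared, which is harmless for immutable values such as strings and enum constants.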
I couldn't find a better title (feel free to edit it if you find a better one), but the use case is the following. I have two lists of constants. One of those contains the constants I use in my application, the other contains the different constants that are sent to me via a CSV file (along with data).
To give a rough example: in the CSV file, there is a field called "id of the client". In my application, I want to use a field called "clientId". So I basically need to create a static link between the two constants, so that I can easily switch from one to the other depending on what I need to achieve.
I've thought about creating a static Map<String, String> of values, but I figured there might be better solutions.
Thanks !
EDIT : changed title to "N" constants instead of 2, because Hashmap doesn't seem to be an option any longer in that case.
You can use the double-brace initialization idiom to keep the map initialization close to the map declaration, so it is not so "ugly", e.g.:
static Map<String, String> someMap = new HashMap<String, String>() {{
put("one", "two");
put("three", "four");
}};
Beware that without the static modifier, each anonymous class (one is created in this example) holds a reference to the enclosing object, and if you hand a reference to this map to some other class, it will prevent the enclosing object from being garbage collected.
Fortunately, there is hope with newer Java versions: Java 9 added the very handy Map.of() to do this more safely.
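As a rough sketch of how that could look for the CSV use case (Java 9 or later): the class name CsvFieldNames is made up, and the two entries are just the example names used in this thread. The reverse map is derived from the forward one, so the two directions cannot drift apart.

```java
import java.util.Map;
import java.util.stream.Collectors;

class CsvFieldNames {
    // CSV header -> application field name.
    static final Map<String, String> CSV_TO_APP = Map.of(
            "id of the client", "clientId",
            "name of the client", "clientName");

    // Application field name -> CSV header, built once from the forward map.
    // Collectors.toMap throws IllegalStateException if two headers map to the same field.
    static final Map<String, String> APP_TO_CSV = CSV_TO_APP.entrySet().stream()
            .collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey));
}
```

Usage would be CsvFieldNames.CSV_TO_APP.get("id of the client") when reading the file and CsvFieldNames.APP_TO_CSV.get("clientId") when writing it back.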
The best way to separate the mapping from your application code is to use a properties file in which you define your mapping.
For example, you could have a csv-mapping.properties in the root of your resources and load them with the following code:
final Properties properties = new Properties();
properties.load( this.getClass().getResourceAsStream( "/csv-mapping.properties" ) );
This will work just like a Map, with the added separation of code from configuration.
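One detail worth knowing if the CSV headers contain spaces: in a properties file, a space inside a key has to be escaped with a backslash, otherwise the key ends at the first space. The sketch below parses the file contents from a string so that it runs on its own; CsvMappingLoader and the sample entries are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class CsvMappingLoader {
    // Parses properties-file text; a real application would read the resource instead.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        // In the file itself each line reads: id\ of\ the\ client=clientId
        String fileContents = "id\\ of\\ the\\ client=clientId\n"
                + "name\\ of\\ the\\ client=clientName\n";
        Properties mapping = parse(fileContents);
        System.out.println(mapping.getProperty("id of the client")); // clientId
    }
}
```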
There are many methods that you can use to easily solve these types of problem.
One way is to use a Properties file, or file containing the key value pair.
Here is the code for Properties.
import java.util.ResourceBundle;
public class ReadingPropertiesFile {
public static void main(String[] args) {
ResourceBundle messages;
messages = ResourceBundle.getBundle("msg");
System.out.println(messages.getString("ID"));
}
}
The msg.properties file contains:
ID = ClientID.
PRODUCT_ID = prod_ID
The output of the program is ClientID.
You can also read from a simple text file, or keep using a map as you do now, but I would suggest using the properties file.
One good option would be to use an enum to map multiple constants to a single common-sense value, e.g.:
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
public enum MappingEnum {
CLIENT_ID("clientId", "id of the client", "clientId", "IdOfTheClient"),
CLIENT_NAME("clientName", "name of the client", "clientName");
private Set<String> aliases;
private String commonSenseName;
private MappingEnum(String commonSenseName, String... aliases) {
this.commonSenseName = commonSenseName;
this.aliases = Collections.unmodifiableSet(new HashSet<String>(Arrays.asList(aliases)));
}
public static MappingEnum fromAlias(String alias) {
for (MappingEnum mappingEnum : values()) {
if (mappingEnum.aliases.contains(alias)) {
return mappingEnum;
}
}
throw new RuntimeException("No MappingEnum for mapping: " + alias);
}
public String getCommonSenseName() {
return commonSenseName;
}
}
and then you can use it like:
String columnName = "id of the client";
String targetFieldName = MappingEnum.fromAlias(columnName).getCommonSenseName();
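If the list of constants grows, the linear scan in fromAlias can be replaced by a lookup table that is filled once when the enum is loaded. This is a variation on the enum above, not the original code; FieldAlias and its members are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

enum FieldAlias {
    CLIENT_ID("clientId", "id of the client", "IdOfTheClient"),
    CLIENT_NAME("clientName", "name of the client");

    // Filled in the static block below, which runs after all constants are created.
    private static final Map<String, FieldAlias> BY_ALIAS = new HashMap<>();

    static {
        for (FieldAlias field : values()) {
            for (String alias : field.aliases) {
                BY_ALIAS.put(alias, field);
            }
        }
    }

    private final String fieldName;
    private final String[] aliases;

    FieldAlias(String fieldName, String... aliases) {
        this.fieldName = fieldName;
        this.aliases = aliases;
    }

    static FieldAlias fromAlias(String alias) {
        FieldAlias field = BY_ALIAS.get(alias);
        if (field == null) {
            throw new IllegalArgumentException("No mapping for alias: " + alias);
        }
        return field;
    }

    String getFieldName() {
        return fieldName;
    }
}
```

The map has to be filled in a static block rather than in the constructor, because an enum constructor is not allowed to touch the enum's own static fields.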
Hello,
I'm currently working on word prediction in Java.
For this, I'm using an n-gram based model, but I have some memory issues...
At first, I had a model like this:
public class NGram implements Serializable {
private static final long serialVersionUID = 1L;
private transient int count;
private int id;
private NGram next;
public NGram(int idP) {
this.id = idP;
}
}
But it takes a lot of memory, so I thought I needed to optimize it: if I have "hello the world" and "hello the people", instead of storing two n-grams, I could keep a single one for "hello the" that has two possible continuations, "people" and "world".
To be clearer, this is my new model:
public class BNGram implements Serializable {
private static final long serialVersionUID = 1L;
private int id;
private HashMap<Integer,BNGram> next;
private int count = 1;
public BNGram(int idP) {
this.id = idP;
this.next = new HashMap<Integer, BNGram>();
}
}
But it seems that my second model consumes twice as much memory... I think it's because of the HashMap, but I don't know how to reduce this. I tried different Map implementations such as Trove and others, but it doesn't change anything.
To give you an idea, for a 9 MB text with 57818 distinct words (distinct words, not the total word count), my javaw process consumes 1.2 GB of memory after n-gram generation...
If I save it with GZIPOutputStream, it takes around 18 MB on disk.
So my question is: what can I do to use less memory? Can I do something with compression (as with the serialization)?
I need to add this to another application, so I need to reduce the memory usage first...
Thanks a lot, and sorry for my bad English...
ZiMath
You need a specialized structure to achieve what you want.
Take a look at PatriciaTrie from Apache Commons Collections. It's like a Map, but it is memory-efficient and works with String keys. It's also extremely fast: operations are O(k), with k being the number of bits of the largest key.
It has an operation that suits your immediate needs: prefixMap(), which returns a SortedMap view of the trie that contains Strings which are prefixed by the given key.
A brief usage example:
public class Patricia {
public static void main(String[] args) {
PatriciaTrie<String> trie = new PatriciaTrie<>();
String world = "hello the world";
String people = "hello the people";
trie.put(world, null);
trie.put(people, null);
SortedMap<String, String> map1 = trie.prefixMap("hello");
System.out.println(map1.keySet()); // [hello the people, hello the world]
SortedMap<String, String> map2 = trie.prefixMap("hello the w");
System.out.println(map2.keySet()); // [hello the world]
SortedMap<String, String> map3 = trie.prefixMap("hello the p");
System.out.println(map3.keySet()); // [hello the people]
}
}
There are also the tests, which contain more examples.
Here, I'm primarily trying to explain why you are observing such an excessive memory consumption, and what you could do about this (if you wanted to stick to the HashMap) :
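A HashMap that is created with the default constructor has a default capacity of 16. This means that its internal table has room for 16 buckets, even if the map only ever holds one or two entries. (Recent JDKs only allocate that table on the first insertion, but from then on a map with a single entry still carries 16 slots.) Additionally, you seem to create the map regardless of whether it is needed or not.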
So a way to reduce the memory consumption in your case would be to:
Create the map only when it is necessary
Create it with a smaller initial capacity
Applied to your class, this could roughly look like this:
public class BNGram {
private int id;
private Map<Integer,BNGram> next;
public BNGram(int idP) {
this.id = idP;
// (Do not create a new `Map` here!)
}
void doSomethingWhereTheMapIsNeeded(Integer key, BNGram value) {
// Create a map, when required, with an initial capacity of 1
if (next == null) {
next = new HashMap<Integer, BNGram>(1);
}
next.put(key, value);
}
}
But...
... conceptually, it is questionable to have a large "tree" structure consisting of many, many maps, each only with "few" entries. This suggests that a different data structure is more appropriate here. So you should definitely prefer a solution like the one in the answer by Magnamag, or (if this is not applicable for you, as suggested in your comments), look out for an alternative data structure - maybe even by formulating this as a new question that does not suffer from the XY Problem.
I have an object like:
class House {
String name;
List<Door> doors;
}
What I want to do is transform a List<House> into a List<Door> containing all doors of all houses.
Is there a way to do this with Guava?
I tried Guava's Lists.transform function, but I only get a List<List<Door>> as the result.
If you really need to use a functional approach, you can do this using FluentIterable#transformAndConcat:
public static ImmutableList<Door> foo(List<House> houses) {
return FluentIterable
.from(houses)
.transformAndConcat(GetDoorsFunction.INSTANCE)
.toList(); // returns an ImmutableList; named toImmutableList() in early Guava versions
}
private enum GetDoorsFunction implements Function<House, List<Door>> {
INSTANCE;
@Override
public List<Door> apply(House input) {
return input.getDoors();
}
}
FluentIterable.from(listOfHouses).transformAndConcat(doorFunction)
would do the job just fine.
You don't need Guava for it (assuming I understood you correctly):
final List<Door> doorsFromAllHouses = new ArrayList<>();
for (final House house : houses) {
doorsFromAllHouses.addAll(house.doors);
}
// and doorsFromAllHouses is full of all kinds of doors from various houses
Using Lists.transform on the input list of houses, with a transform function that gets all doors from a house, gave you the correct output: a list of *each house's doors* (which is exactly List<List<Door>>).
More generally, you want a reduce / fold function instead of transform, which isn't implemented in Guava (see this issue), mostly because of Java's verbose syntax and the presence of the for-each loop, which is good enough. You'll be able to reduce in Java 8 (or you are able to do this in any other mainstream language nowadays...). Pseudo-Java8-code:
List<Door> doors = reduce(
houses, // collection to reduce from
new ArrayList<Door>(), // initial accumulator value
house, acc -> acc.addAll(house.doors)); // reducing function
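With the Java 8 streams API as it actually shipped, the usual way to write this is flatMap rather than a hand-written reduce. A minimal sketch follows; the Door class, the constructors, and the Doors helper are filled in only so the example runs on its own.

```java
import java.util.List;
import java.util.stream.Collectors;

class Door {
    final String label;

    Door(String label) {
        this.label = label;
    }
}

class House {
    String name;
    List<Door> doors;

    House(String name, List<Door> doors) {
        this.name = name;
        this.doors = doors;
    }
}

class Doors {
    // Flattens the per-house door lists into one list, keeping encounter order.
    static List<Door> allDoors(List<House> houses) {
        return houses.stream()
                .flatMap(house -> house.doors.stream())
                .collect(Collectors.toList());
    }
}
```

flatMap maps each house to a stream of its doors and concatenates those streams, which is the same job transformAndConcat does in Guava.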