Asking about threading, arrays and cache memory


I hope I am asking this in a good manner :-)
I wrote this piece of code.
What I want to build is something like a "cache".
I assumed that I had to guard against different threads, since many calls might reach this class, so I tried the ThreadLocal functionality.
The base pattern is to
have "MANY SETS of VECTOR".
Each vector holds something like:
VECTOR.FieldName = "X"
VECTOR.FieldValue= "Y"
So there are many Vector objects in a set, and a different set for different calls from different machines, users, and objects.
private static CacheVector instance = null;
private static SortedSet<SplittingVector> s = null;
private static TreeSet<SplittingVector> t = null;
private static ThreadLocal<SortedSet<SplittingVector>> setOfVectors = new ThreadLocal<SortedSet<SplittingVector>>();
private static class MyComparator implements Comparator<SplittingVector> {
public int compare(SplittingVector a, SplittingVector b) {
return 1;
}
// No need to override equals.
}
private CacheVector() {
}
public static SortedSet<SplittingVector> getInstance(SplittingVector vector) {
if (instance == null) {
instance = new CacheVector();
//TreeSet<SplittingVector>
t = new TreeSet<SplittingVector>(new MyComparator());
t.add(vector);
s = Collections.synchronizedSortedSet(t);//Sort the set of vectors
CacheVector.assign(s);
} else {
//TreeSet<SplittingVector> t = new TreeSet<SplittingVector>();
t.add(vector);
s = Collections.synchronizedSortedSet(t);//Sort the set of vectors
CacheVector.assign(s);
}
return CacheVector.setOfVectors.get();
}
public SortedSet<SplittingVector> retrieve() throws Exception {
SortedSet<SplittingVector> set = setOfVectors.get();
if (set == null) {
throw new Exception("SET IS EMPTY");
}
return set;
}
private static void assign(SortedSet<SplittingVector> nSet) {
CacheVector.setOfVectors.set(nSet);
}
So... that is the code I have, and I use it like this:
CachedVector cache = CachedVector.getInstance(bufferedline);
The nice part: bufferedline is a line that has been split on some delimiter, read from data files. The files can be of any size.
So how does this code look to you? Should I be worried?
I apologise for the length of this message!

Writing correct multi-threaded code is not easy (for example, your singleton is not thread-safe), so try to rely on existing solutions if possible. If you're looking for a thread-safe cache implementation in Java, check out LinkedHashMap. You can use it to implement an LRU cache, and Collections.synchronizedMap() can make it thread-safe.
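As a rough illustration of that suggestion, here is a minimal sketch of an LRU cache built from LinkedHashMap and wrapped with Collections.synchronizedMap(). The class name LruCache, the create() method, and the maxEntries parameter are made up for this example and do not come from the question.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original post.
final class LruCache {
    private LruCache() {}

    // Returns a synchronized map that evicts the least recently used entry
    // once it holds more than maxEntries entries.
    static <K, V> Map<K, V> create(final int maxEntries) {
        // The third constructor argument (true) selects access order instead of insertion order.
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
        return Collections.synchronizedMap(lru);
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        Map<String, String> cache = LruCache.create(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");       // touch "a" so that "b" becomes the eldest entry
        cache.put("c", "3");  // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Note that in access-order mode even get() reorders the map, which is why every call has to go through the synchronized wrapper. Iterating over the map still needs a manual synchronized block on the returned map.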

Related

How do I return a limited number of cached instances in Java?

I have a "configuration" class that becomes a field of several other classes. It indicates some kind of configuration or "abilities" of those other classes to allow or disallow actions. The configuration class currently contains a set of four independent booleans and will likely remain like that (or grow by another boolean). The configuration is immutable: once the object is created, the configuration will never change.
public class Configuration {
// Bit map of the four abilities: 1 -> abilityOne, 2 -> abilityTwo, 4 -> abilityThree, 8 -> abilityFour
private final int configuration;
public Configuration (final boolean abilityOne, final boolean abilityTwo,
final boolean abilityThree, final boolean abilityFour) {
this.configuration = ((1 * (abilityOne ? 1 : 0)) +
(2 * (abilityTwo ? 1 : 0)) +
(4 * (abilityThree ? 1 : 0)) +
(8 * (abilityFour ? 1 : 0)));
}
public boolean isAbilityOne() {
return((1 & this.configuration) > 0);
}
public boolean isAbilityTwo() {
return((2 & this.configuration) > 0);
}
public boolean isAbilityThree() {
return((4 & this.configuration) > 0);
}
public boolean isAbilityFour() {
return((8 & this.configuration) > 0);
}
}
Because of my C / limited-hardware background, my next implementation (an attempt at reducing the memory footprint) used an int as a bit map: 1 -> first boolean, 2 -> second, 4 -> third, 8 -> fourth. This way I store a single integer, and the boolean accessors I needed are the isAbilityOne() to isAbilityFour() methods shown above.
It works fine and it is quite memory efficient. But it is frowned upon by my Java-all-my-life colleagues.
The number of different configurations is limited (the combinations of boolean values), but the number of objects using them is very large. In order to decrease memory consumption I thought of some kind of "multi-singleton", enumeration or cached instances. And this is where I am now. What is best?

I think the multiton pattern is the most efficient way to do this:
public class Configuration {
private static Map<Long, Configuration> configurations = new HashMap<>();
private long key;
private long value;
public static Configuration getInstanse(long key, boolean... configs) {
if (configurations.containsKey(key)) {
return configurations.get(key).setConfigs(configs);
}
Configuration configuration = new Configuration(key, configs);
configurations.put(key, configuration);
return configuration;
}
// Max number of configs.length is 64
private Configuration(long key, boolean... configs) {
this.key = key;
setConfigs(configs);
}
private Configuration setConfigs(boolean[] configs) {
this.value = 0L;
boolean config;
for (int i = 0; i < configs.length; i++) {
config = configs[i];
this.value = this.value | (config ? (1L << i) : 0L);
}
return this; // the method is declared to return Configuration
}
public long getKey() {
return key;
}
public boolean getConfig(int place) {
return (value & (1L << place)) == (1L << place);
}
}

I would suggest the following. It is very easy to extend, as you just have to add another Ability to your enum.
enum Ability {
Ability1, Ability2, Ability3, Ability4
}
public class Configuration {
private static LoadingCache<Set<Ability>, Configuration> cache = CacheBuilder.newBuilder()
.build(new CacheLoader<Set<Ability>, Configuration>() {
@Override
public Configuration load(Set<Ability> withAbilities) {
return new Configuration(withAbilities);
}
});
Set<Ability> abilities;
private Configuration(Collection<Ability> withAbilities) {
this.abilities = createAbilitySet(withAbilities);
}
public static Configuration create(Ability... withAbilities) {
Set<Ability> searchedAbilities = createAbilitySet(Arrays.asList(withAbilities));
try {
return cache.get(searchedAbilities);
} catch (ExecutionException e) {
Throwables.propagateIfPossible(e);
throw new IllegalStateException();
}
}
private static Set<Ability> createAbilitySet(Collection<Ability> fromAbilities) {
if (fromAbilities.size() == 0) {
return Collections.emptySet();
} else {
return EnumSet.copyOf(fromAbilities);
}
}
public boolean hasAbility(Ability ability) {
return abilities.contains(ability);
}
}

If the configuration objects are small and cheap to create, there is no need to cache them, because each monster object has to keep a reference to each of its configurations anyway, and at machine level a reference is a pointer that uses at least as much memory as an int.
The EnumSet approach proposed by @gamulf can probably be used as is, without any caching, because according to the EnumSet javadoc:
Enum sets are represented internally as bit vectors. This representation is extremely compact and efficient. The space and time performance of this class should be good enough to allow its use as a high-quality, typesafe alternative to traditional int-based "bit flags."
I have not benchmarked it, but caching is likely to be useless with @gamulf's solution because a Configuration object contains only an EnumSet that contains no more than an int.
If you had a heavy configuration class (in terms of memory, or expensive to create) and only a small number of possible configurations, you could use a static HashSet member in the class and a static factory method that returns the cached object:
public class Configuration {
static Set<Configuration> confs = new HashSet<>();
...
public Configuration (Ability ... abs) {
...
}
public boolean hasAbility(Ability ab) {
...
}
static Configuration getConfiguration(Ability ... abs) {
for (Configuration conf: confs) {
if (conf.isSame(abs)) { return conf; }
}
Configuration conf = new Configuration(abs);
confs.add(conf);
return conf;
}
private boolean isSame(Ability ... abs) {
// ensures that this configuration has all the required abilities and only them
...
}
}
But as I have already said, that is likely to be useless for objects as lightweight as those proposed by @gamulf.

I want to share the investigation I made based on your answers, so I'm posting one answer with those results. This way it might be clearer why I chose one answer over the others.
The bare results are as follows (memory used for 600 "monster" objects, 10% of what will be needed):
trivial option: Class with four booleans inside: 22.200.040
Initial option: Class with one integer as map of bits: 22.200.040
"multiton" option: one factory class that returns references to the trivial option's Class: 4.440.040
EnumSet (without guava cache): 53.401.896 (in this one I probably messed up, since results are not as expected... I might work further on this later on)
EnumSet with guava cache: 4.440.040
Since my tests first run a series of comparisons to ensure that all implementations give exactly the same results for all configurations, it has become clear that the 4.440.040 figure is the size of the List<> I used to hold the items: before I decided to set it to null before measuring memory, those numbers were consistently 0.
Please don't go into how I measured memory consumption (gc(); freeMemory(); before and after I freed each list and set it to null), since I used the same method for all, and performed 20 executions each time and in different orders of execution. Results were consistent enough for me.
These results point at the multiton solution as the easiest of the best performing. That's why I set it as the selected answer.
As a side note/curiosity, the project for which this investigation started has selected the trivial option as the solution, and most of this investigation was made to satisfy my own curiosity (and with some hidden desire to be able to demonstrate that some other solution would be soooo much more efficient than the trivial one... but no). This is why it took me so long to come up with a conclusion.
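For reference, the "multiton" option described above (a factory that hands out shared instances instead of creating a new object per monster) could look roughly like the following. This is only a sketch under the assumption of exactly four flags; the name SharedConfiguration and the of() factory are invented here, and this is not the code that was measured.

```java
// Hypothetical sketch: names are illustrative, not the measured implementation.
final class SharedConfiguration {
    // One immutable instance per combination of four flags: 2^4 = 16 in total.
    private static final SharedConfiguration[] INSTANCES = new SharedConfiguration[16];

    static {
        for (int bits = 0; bits < INSTANCES.length; bits++) {
            INSTANCES[bits] = new SharedConfiguration(bits);
        }
    }

    private final int bits;

    private SharedConfiguration(int bits) {
        this.bits = bits;
    }

    // Factory: maps the four flags to an index and returns the shared instance.
    static SharedConfiguration of(boolean one, boolean two, boolean three, boolean four) {
        int index = (one ? 1 : 0) | (two ? 2 : 0) | (three ? 4 : 0) | (four ? 8 : 0);
        return INSTANCES[index];
    }

    boolean isAbilityOne()   { return (bits & 1) != 0; }
    boolean isAbilityTwo()   { return (bits & 2) != 0; }
    boolean isAbilityThree() { return (bits & 4) != 0; }
    boolean isAbilityFour()  { return (bits & 8) != 0; }
}
```

Because all instances are created in the static initializer and never modified, the factory needs no locking, and every monster holds only a reference to one of the 16 shared objects.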

How to create a dynamic list of custom objects associated with a flag (boolean)?

I am eventually fetching objects from my backend and I have to keep track of them. I need a collection where there are no duplicates, but every time I fetch the same object from the backend I get a new instance, so I must compare its String key manually, I suppose.
Plus, these objects need a boolean associated with them, because they may be in this list and be "used" and I should know that later.
A typical scenario is that I have a list of 10 objects in my collection and I fetch 8 more, of which only 3 are new. I should add these 3 to the list and discard the 5 repeated ones.
I am about to start implementing a custom Collection for that. Is there any possibility to do it combining Pair with List, or maybe HashMap? I've been thinking on this and I couldn't come up with a conclusion.

http://docs.oracle.com/javase/7/docs/api/java/util/Set.html#add(E) Try something with a set. It allows no duplicates.

In the class of your objects, override both equals() and hashCode() to specify when two instances of your class can be considered to be the same.
If you do this, you can simply throw them into a HashSet and it will make sure that no two elements in it are the same (by the definition that you provided in the overridden methods).
Take a look at this similar question:
Implement equals with Set
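A minimal sketch of that idea follows. The Promo class and its key field are hypothetical stand-ins for the objects fetched from the backend; equality is defined purely by the String key.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class standing in for the backend objects.
final class Promo {
    private final String key;

    Promo(String key) {
        this.key = Objects.requireNonNull(key);
    }

    String getKey() {
        return key;
    }

    // Two instances are considered the same object when their keys match.
    @Override
    public boolean equals(Object o) {
        return o instanceof Promo && key.equals(((Promo) o).key);
    }

    // Must be consistent with equals, otherwise HashSet cannot find duplicates.
    @Override
    public int hashCode() {
        return key.hashCode();
    }
}

class PromoSetDemo {
    public static void main(String[] args) {
        Set<Promo> promos = new HashSet<>();
        System.out.println(promos.add(new Promo("A"))); // prints true
        System.out.println(promos.add(new Promo("A"))); // prints false: new instance, same key
        System.out.println(promos.size());              // prints 1
    }
}
```

This only works if you control the class. If the objects come from a library type you cannot modify, wrapping them or keying a map by the String key is an alternative.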

For future reference, I implemented a custom class with both a list of objects and an array with my booleans. Since I had to keep both list and array synchronized, I had to iterate this list on all steps.
This is my code:
public class PromoCollection {
public static List<ParseObject> promotions = new ArrayList<ParseObject>();
public static List<Boolean> isTriggered = new ArrayList<Boolean>();
public static void add(ParseObject newObj) {
for (ParseObject p : promotions) {
if (p.getObjectId().equals(newObj.getObjectId())) {
return; // Object already in list, do not add
}
}
promotions.add(newObj); // Add new object
isTriggered.add(false); // And respective boolean
}
public static void remove(ParseObject obj) {
for (int i = 0; i < promotions.size(); ++i) {
if (obj.getObjectId().equals(promotions.get(i).getObjectId())) {
promotions.remove(i);
isTriggered.remove(i);
return;
}
}
}
public static void trigger(ParseObject obj) {
for (int i = 0; i < promotions.size(); ++i) {
if (obj.getObjectId().equals(promotions.get(i).getObjectId())) {
isTriggered.set(i, true);
}
}
}
public static boolean isTriggered(ParseObject obj) {
for (int i = 0; i < promotions.size(); ++i) {
if (obj.getObjectId().equals(promotions.get(i).getObjectId())) {
return isTriggered.get(i);
}
}
throw new ArrayStoreException();
}
}
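The same behaviour can probably be had without scanning the list on every call by keying a map on the object id. The sketch below is an assumption-laden variant, not the code above: FlaggedCollection and keyOf are invented names, and keyOf would be something like ParseObject::getObjectId in this case.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical generic variant of PromoCollection.
final class FlaggedCollection<T> {
    private final Function<T, String> keyOf;
    private final Map<String, T> items = new LinkedHashMap<>(); // keeps insertion order
    private final Set<String> triggered = new HashSet<>();

    FlaggedCollection(Function<T, String> keyOf) {
        this.keyOf = keyOf;
    }

    // Returns true if the object was new, false if its key was already present.
    boolean add(T item) {
        return items.putIfAbsent(keyOf.apply(item), item) == null;
    }

    void remove(T item) {
        String key = keyOf.apply(item);
        items.remove(key);
        triggered.remove(key);
    }

    // Marks the item as triggered; does nothing if the item is not in the collection.
    void trigger(T item) {
        String key = keyOf.apply(item);
        if (items.containsKey(key)) {
            triggered.add(key);
        }
    }

    boolean isTriggered(T item) {
        return triggered.contains(keyOf.apply(item));
    }

    int size() {
        return items.size();
    }
}
```

One difference from the code above: isTriggered() here returns false for an unknown object instead of throwing.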

Set variable to reference another variable, Java

In Android Studio, I have two array lists with a custom object
ArrayList<MenuMaker> consessionlist = new ArrayList<MenuMaker>();
ArrayList<MenuMaker> entrylist = new ArrayList<MenuMaker>();
And I have a few void methods that, depending on which mode we are in, need to use one ArrayList or the other:
private void createMenuButtons()
{
int FoodSize = consessionlist.size();
...
I realize I could write an if statement so that if mode == 0 it uses consessionlist, else entrylist, but is there a way to say:
private void setmode(int mode)
{
if (mode == 0){
menulist = consessionlist;
}
else
{
menulist = entrylist;
}
}
private void createMenuButtons()
{
int FoodSize = menulist.size();
...
*Pass-by-reference vs pass-by-value seem to kick my butt on the Oracle test.

I thought I would have to use an if statement every time I need to choose, or add some weird complexity, but so far it's actually working as I wanted it to.
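The reason this works is that assigning one list variable to another copies the reference, not the list, so menulist and the chosen list are the same object. A small sketch, using String in place of MenuMaker and a made-up MenuModeDemo class:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; String stands in for MenuMaker.
class MenuModeDemo {
    private final List<String> consessionlist = new ArrayList<>();
    private final List<String> entrylist = new ArrayList<>();
    private List<String> menulist = consessionlist; // alias, not a copy

    void setMode(int mode) {
        menulist = (mode == 0) ? consessionlist : entrylist;
    }

    // Adds through the alias; the change lands in whichever list is selected.
    void addToCurrent(String item) {
        menulist.add(item);
    }

    int currentSize()    { return menulist.size(); }
    int concessionSize() { return consessionlist.size(); }
    int entrySize()      { return entrylist.size(); }

    public static void main(String[] args) {
        MenuModeDemo demo = new MenuModeDemo();
        demo.setMode(1);
        demo.addToCurrent("ticket");
        System.out.println(demo.entrySize());      // prints 1
        System.out.println(demo.concessionSize()); // prints 0
    }
}
```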

Is Java application performance dependent on passing of variables to method?

Maybe this is a trivial question for experienced programmers, but I wonder if there is any significant performance difference (with a big or very big amount of data in the collection) between two different approaches to passing variables?
I've run some tests, but with rather small data structures, and I don't see any significant differences. Additionally, I am not sure whether these differences are caused by interference from other applications running in the background.
Class with collection:
public class TestCollection
{
ArrayList<String[]> myTestCollection = new ArrayList<String[]>();
public TestCollection()
{
fillCollection();
}
private void fillCollection()
{
// here the collection is filled with a big amount of data
}
public ArrayList<String[]> getI()
{
return myTestCollection;
}
}
And methods that operate on collection:
public class Test
{
static TestCollection tc = new TestCollection();
public static void main(String[] args)
{
new Test().approach_1(tc);
new Test().approach_2(tc.getI());
}
public void approach_1(TestCollection t)
{
for (int i = 0; i < tc.getI().size(); i++)
{
// some actions with collection using tc.getI().DOSOMETHING
}
}
public void approach_2(ArrayList<String[]> t)
{
for (int i = 0; i < t.size(); i++)
{
// some actions with collection using t.DOSOMETHING
}
}
}
Regards.

No, there is no real difference here.
Java passes object references to methods, not copies of the entire object. This is similar to the pass-by-reference concept in other languages (although, strictly speaking, the object reference itself is passed by value).
If you come from a C programming background it's important to understand this!
And some tips: firstly, it's better practice to declare your list as List<...> rather than ArrayList<...>, like this:
List<String[]> myTestCollection = new ArrayList<String[]>();
And secondly, you can use the enhanced for loop on lists, like this:
// first case
for (String[] s : tc.getI()) { /* do something */ }
// second case
for (String[] s : t) { /* do something */ }
Hope this helps :)
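To make the "reference passed by value" point concrete, here is a small sketch (the class and method names are made up for illustration). Mutating the list through the parameter is visible to the caller, while reassigning the parameter is not, and in neither case is the list copied.

```java
import java.util.ArrayList;
import java.util.List;

class PassByValueDemo {
    // Changes the object the caller's reference points to.
    static void mutate(List<String> list) {
        list.add("added");
    }

    // Only rebinds the local copy of the reference; the caller's list is untouched.
    static void reassign(List<String> list) {
        list = new ArrayList<>();
        list.add("lost");
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        mutate(data);
        reassign(data);
        System.out.println(data); // prints [added]
    }
}
```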

Setting a value only on first access -- best practice, (micro)performance?

In the below code, assume that getAndClear() will get called billions of times, i.e. assume that performance matters. It will return an array only during its first call. It must return null in all further calls. (That is, my question is about micro-optimization in some sense, and I'm aware of the fact it's bad practice, but you can also consider it as a question of "which code is nicer" or "more elegant".)
public class Boo {
public static int[] anything = new int[] { 2,3,4 };
private static int[] something = new int[] { 5,6,7 }; // this may be much bigger as well
public static final int[] getAndClear() {
int[] st = something;
something = null;
// ... (do something else, useful)
return st;
}
}
Is the below code faster? Is it better practice?
public static int[] getAndClear() {
int[] array = something;
if (array != null) {
something = null;
// ... (do something else, useful)
return array;
}
// ... (do something else, useful)
return null;
}
A further variant could be this:
public static int[] getAndClear() {
int[] array = something;
if (array != null) {
something = null;
}
// ... (do something else, useful)
return array;
}
I know it probably comes down to the hardware architecture level and CPU instructions (setting something to 0 vs. checking for 0), and performance-wise it doesn't matter, but then I would like to know which is the "good practice" or higher-quality code. In this case, the question can be reduced to this:
private static boolean value = true;
public static boolean getTrueOnlyOnFirstCall() {
boolean b = value;
value = false;
return b;
}
If the method is called 100000 times, this means that value will be set to false 99999 times unnecessarily. The other variant (faster? nicer?) would look like this:
public static boolean getTrueOnlyOnFirstCall() {
boolean b = value;
if (b) {
value = false;
return true;
}
return false;
}
Moreover, compile-time and JIT-time optimizations may also play a role here, so this question could be extended by "and what about in C++". (If my example is not applicable to C++ in this form, then feel free to substitute the statics with member fields of a class.)

IMHO, it's not worth doing the micro-optimization. One drawback of optimization is that it relies heavily on the environment (as you mentioned with the JIT, the version of the JDK plays a strong role; what is faster now may be slower in the future).
Code maintainability is (in my opinion) far more important over the long haul. Implement the version which is the clearest. I like the getTrueOnlyOnFirstCall() which contains the if statement, for example.
In all of these examples, though, you would need synchronization around the getters and around the portions which modify the boolean.
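As one possible way to get that thread safety, here is a sketch that swaps the synchronized blocks for the java.util.concurrent.atomic classes. It is a substitute technique, not what the answer literally describes, and the OneShot class is invented for the example (instance fields are used instead of statics to keep it self-contained).

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder class; each instance has its own one-shot state.
final class OneShot {
    private final AtomicReference<int[]> something;
    private final AtomicBoolean first = new AtomicBoolean(true);

    OneShot(int[] initial) {
        this.something = new AtomicReference<>(initial);
    }

    // Returns the array to exactly one caller and null to everyone else.
    int[] getAndClear() {
        return something.getAndSet(null);
    }

    // Returns true for exactly one caller. The plain read skips the
    // compare-and-set (and its write) once the flag is already false.
    boolean getTrueOnlyOnFirstCall() {
        return first.get() && first.compareAndSet(true, false);
    }
}
```

getAndClear() corresponds to the unconditional "always write" variant, while getTrueOnlyOnFirstCall() corresponds to the "check first, then write" variant. Which one is faster would have to be measured on the target JVM.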
