Best way to define a constant identifier? - java

In my Java application, one of my objects has exactly one value from a set of values. Now I wonder how to define them for the best performance:
private static final String ITEM_TYPE1 = "type1";
private static final int ITEM_TYPE1 = 1;
Is defining it as an int better than as a String? (I need to convert the value to a string anyway, so I would like to define it as a String, but I am worried about performance, because comparing ints is perhaps simpler than comparing strings.)
EDIT: I am aware of enums, but I just want to know whether ints perform better than strings or not. This depends on how the JDK and JRE handle them under the hood (on Android, Dalvik or ART).

In my java application one of my objects has exactly one value from a set of values
That is what Java enums are for.
Regarding the question of whether ints perform better than strings: that is almost nonsensical.
You are talking about static constants. Even if they are used 100 or 1000 times in your app, performance doesn't matter here. What matters is to write code that is easy to read and maintain, because then the JIT can kick in and turn it into nicely optimized machine code.
Please understand: premature optimisation is the root of all evil! Good or bad performance of your app depends on many other factors, definitely not on representing constants as ints or strings.
Beyond that: the type of some thing in Java should reflect its nature. If it is a string, make it a string (like when you want to mainly use it as string, and concatenate it to other strings). When you have numbers and deal with them as numbers, make it an int.
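As a minimal sketch of the enum approach mentioned above: the names ItemType, TYPE1, TYPE2, label() and fromLabel() are hypothetical, and only the label "type1" comes from the question. The enum keeps a string form for display while the constants themselves are compared by reference.

```java
// Hypothetical item types; names and labels are illustrative only.
enum ItemType {
    TYPE1("type1"),
    TYPE2("type2");

    private final String label;

    ItemType(String label) {
        this.label = label;
    }

    // String form, for when the value has to be shown or concatenated.
    String label() {
        return label;
    }

    // Looks up the constant for a stored label; throws if the label is unknown.
    static ItemType fromLabel(String label) {
        for (ItemType t : values()) {
            if (t.label.equals(label)) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown item type: " + label);
    }
}
```

Enum constants are singletons, so they can be compared with == and used in switch statements, which avoids the int-versus-String question for comparisons.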

First of all, an int always occupies a fixed amount of memory; in Java it is always 4 bytes (32 bits).
A String is an object, which means that it takes up not only the bytes of the actual character data but also additional space for things like the object header and the length.
So if you have the choice between String and int, you should always choose int. It takes less space and is faster to operate on.

Related

Using chars as digits

I am writing a simulation that will use a lot of memory and needs to be fast. Instead of using ints I am using chars (8 bits not 32). I need to operate on them as if these chars were ints.
To achieve that I have done something like
char a = 1;
char b = 2;
System.out.println(a*1 + b*1); // prints 3 in the console, so it has int-like behavior
I don't know what's going on "under the hood" when I multiply a char by an integer. Is this the fastest way to do it?
Thank you for help!
Performance-wise, it's not worth using char instead of int, because all modern hardware architectures are optimized for 32- or 64-bit wide memory and register access.
The only reason to use char would be to reduce the memory footprint, i.e. if you work with large amounts of data.
Additional info: Performance of built-in types : char vs short vs int vs. float vs. double
A char is simply a number (not a 32-bit int, but a number; in Java it is an unsigned 16-bit value) that is normally interpreted as a character code. When you multiply it by an integer, the compiler implicitly promotes the char to int, which is why 3 is printed on the console instead of the corresponding character.
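The promotion described above can be seen in a small sketch. The class and method names are made up for illustration; the point is that any arithmetic on two chars already yields an int, so the "* 1" in the question is not needed.

```java
class CharArithmetic {
    // Adds two chars; binary numeric promotion widens both operands to int.
    static int sum(char a, char b) {
        return a + b; // the result type of char + char is int
    }

    public static void main(String[] args) {
        char a = 1;
        char b = 2;
        System.out.println(sum(a, b));      // prints 3
        System.out.println((int) 'A');      // prints 65
        System.out.println(Character.SIZE); // prints 16: a Java char is 16 bits wide
    }
}
```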
You should use ints. The size of the type will not affect the fetch speed of the data, nor the processing time, unless it spans multiple machine words, which is unlikely. There is no reason to do this. Moreover, in local variables and on the operand stack chars are held as integers and are therefore the same size; only in arrays do they reliably take less space. An alternative would be to use a smaller integer type such as short, but again this is not helpful for what you want.
Also, I notice your code has 'System.out.println', which means you are using Java. There are several implications of this. The first is that no, you are not going to be going fast or using very little memory. Period, it will not happen. There is a large amount of overhead involved in running the JVM, JIT, garbage collector and other parts of the Java platform. If efficiency is a relevant factor, you are starting off wrong. The second implication is that your choice of datatypes will have no impact on processing time, because they will be identical to the physical hardware. Only the virtual machine will distinguish between them, and in the case of primitives there is no difference anyway.

Is there a way to efficiently store a sequence of enum values in Java?

I'm looking for a way to encode a sequence of enum values in Java that packs better than one object reference per element. In fantasy-code:
List<MyEnum> list = new EnumList<MyEnum>(MyEnum.class);
In principle it should be possible to encode each element using log2(MyEnum.values().length) bits per element. Is there an existing implementation for this, or a simple way to do it?
It would be sufficient to have a class that encodes a sequence of numbers of arbitrary radix (i.e. if there are 5 possible enum values then use base 5) into a sequence of bytes, since a simple wrapper class could be used to implement List<MyEnum>.
I would prefer a general, existing solution, but as a poor man's solution I might just use an array of longs and radix-encode as many elements as possible into each long. With 5 enum values, 27 elements will fit into a long and waste only ~1.3 bits, which is pretty good.
Note: I'm not looking for a set implementation. That wouldn't preserve the sequence.
You can store bits in an int (32 bits, 32 "switches"). But aside from the exercise value, what's the point? You're really talking about a very small amount of memory. A better question might be: why do you want to save a few bytes in enum references? Other parts of your program are likely to be using much more memory.
If you're concerned with transferring data efficiently, you could consider leaving the Enums alone but using custom serialization, though again, it'd be an unusual situation where it'd be worth the effort.
One object reference typically occupies one 32-bit or 64-bit word. To do better than that, you need to convert the enum values into numbers that are smaller than 32 bits, and hold them in an array.
Converting to a number is as simple as calling ordinal(). From there you could:
cast to a byte or short, then represent the sequence as an array of byte / short values, or
use a suitable compression algorithm on the array of int values.
Of course, all of this comes at the cost of making your code more complicated. For instance you cannot make use of the collection APIs, and you have to do your own sequence management. I doubt that this will be worth it unless you have to deal with very large sequences or huge numbers of sequences.
In principle it should be possible to encode each element using log2(MyEnum.values().length) bits.
In fact you may be able to do better than that ... by compressing the sequences. It depends on how much redundancy there is.
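A minimal sketch of the packing idea, under stated assumptions: PackedEnumArray and the five-value Suit enum are hypothetical names, and this is the simpler fixed-width variant (ceil(log2(n)) bits per element, with no element straddling two longs) rather than the radix encoding from the question. With 5 values it uses 3 bits per element, so 21 elements per long instead of 27.

```java
// Hypothetical enum with 5 values, matching the example size in the question.
enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES, JOKER }

// Fixed-size sequence of enum values stored as packed ordinals in a long[].
class PackedEnumArray<E extends Enum<E>> {
    private final E[] constants;  // enum constants, indexed by ordinal
    private final int bits;       // bits used per element
    private final int perWord;    // elements stored in each long
    private final long mask;      // low 'bits' bits set
    private final long[] words;
    private final int size;

    PackedEnumArray(Class<E> type, int size) {
        this.constants = type.getEnumConstants();
        // Bits needed for ordinals 0 .. length - 1 (at least 1 bit).
        this.bits = Math.max(1, 32 - Integer.numberOfLeadingZeros(constants.length - 1));
        this.perWord = 64 / bits;
        this.mask = (1L << bits) - 1;
        this.words = new long[(size + perWord - 1) / perWord];
        this.size = size;
    }

    // Unset slots read back as the constant with ordinal 0.
    E get(int index) {
        checkIndex(index);
        long word = words[index / perWord];
        int shift = (index % perWord) * bits;
        return constants[(int) ((word >>> shift) & mask)];
    }

    void set(int index, E value) {
        checkIndex(index);
        int w = index / perWord;
        int shift = (index % perWord) * bits;
        // Clear the old slot, then write the new ordinal into it.
        words[w] = (words[w] & ~(mask << shift)) | ((long) value.ordinal() << shift);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
    }
}
```

A List<MyEnum> wrapper could be built on top of this by extending AbstractList and delegating get and set, at the cost of the extra complexity mentioned above.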

Performance implications of using Java BigInteger for a huge bitmask

We have an interesting challenge. We have to control access to data that reside in "bins". There will be, potentially, hundreds of thousands of "bins". Access to each bin is controlled individually but the restrictions can, and probably will, overlap. We are thinking of assigning each bin a position in a bitmask (1,2,3,4, etc..).
Then when a user logs into the system, we look at his security attributes and determine which bins he's allowed to see. With that info we construct a bitmask for this user where the "set" bits correspond to the identifier of the bins he's allowed to see. So if he can see bins 1, 3 and 4, his bit mask would be 1101.
So when a user searches the data, we can look at the bin index of the returned row and see if that bit is set on his bitmask. If his bitmask has that bit set we let him see that row. We are planning for the bitmask to be stored as a BigInteger in Java.
My question is: Assuming the index number doesn't get bigger than Integer.MAX_VALUE, is a BigInteger bitmask going to scale for hundreds of thousands of bit positions? Would it take forever to run BigInteger.testBit(n) where n could be huge (e.g. 874,837)? Would it take forever to create such a BigInteger?
And secondly: If you have an alternative approach, I'd love to hear it.
BigInteger should be fast if you don't change it often.
A more obvious choice would be BitSet which is designed for this sort of thing. For looking up bits, I suspect the performance is similar. For creating/modifying it would be more efficient to use a BitSet.
Note: PaulG has commented the difference is "impressive" and BitSet is faster.
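To make the difference concrete, here is a small sketch that builds the same mask both ways. The class and method names are illustrative only. It relies on BigInteger.setBit returning a new object on every call, whereas BitSet.set mutates in place, which is the likely reason building a large mask is cheaper with BitSet.

```java
import java.math.BigInteger;
import java.util.BitSet;

class BinMasks {
    // Builds a BitSet with one bit set per allowed bin index.
    static BitSet bitSetMask(int... allowedBins) {
        BitSet mask = new BitSet();
        for (int bin : allowedBins) {
            mask.set(bin); // mutates the existing BitSet
        }
        return mask;
    }

    // Builds the same mask as a BigInteger; BigInteger is immutable,
    // so every setBit call allocates a new value.
    static BigInteger bigIntegerMask(int... allowedBins) {
        BigInteger mask = BigInteger.ZERO;
        for (int bin : allowedBins) {
            mask = mask.setBit(bin);
        }
        return mask;
    }
}
```

Lookups are then mask.get(n) for the BitSet and mask.testBit(n) for the BigInteger.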
Java has a more convenient class for this, called BitSet.
You do not need to check if the bit is set in a loop: you can make a mask, use a bitwise and, and see if the result is non-empty to decide on whether to grant or deny the access:
BitSet resourceAccessMask = ...
BitSet userAllowedAccessMask = ...
BitSet test = (BitSet) resourceAccessMask.clone();
test.and(userAllowedAccessMask);
if (!test.isEmpty()) {
    System.out.println("access granted");
} else {
    System.out.println("access denied");
}
We used this class in a similar situation in my prior company, and the performance was acceptable for our purposes.
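As a possible simplification of the clone-and-and check, BitSet also has an intersects method that reports whether two sets share any set bit without copying either one. A minimal sketch, with a hypothetical helper name:

```java
import java.util.BitSet;

class AccessCheck {
    // True if the resource mask and the user's mask have at least one bit in common.
    static boolean hasAccess(BitSet resourceAccessMask, BitSet userAllowedAccessMask) {
        return resourceAccessMask.intersects(userAllowedAccessMask);
    }
}
```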
You could define your own Java interface for this, initially using a Java BitSet to implement that interface.
If you run into performance issues, or if you need long indexes later on, you can always provide a different implementation (e.g. one that uses caching or similar improvements) without changing the rest of the code. Think carefully about the interface you require, and choose a long index just to be sure; you can always check whether it is out of bounds in the implementation later on (or simply return "no access" initially) for any index > Integer.MAX_VALUE.
Using BigInteger is not such a good idea, as the class was not written for that particular purpose, and the only way of changing it is to create a completely new copy. It is efficient regarding memory use, since it stores the bits in a packed primitive array internally (an implementation detail that could of course change).
One thing that is worth considering (besides using BitSet) is using a different granularity. That is, you use a shorter bit set where each bit 'guards' multiple real bits. This way you would not need to have millions of bits per user in RAM.
A simple way to achieve this is to have a smaller bit set of size n/32 and do something like this:
boolean isSet(int n) {
    // assuming guardingBits and realBits are java.util.BitSet instances, which expose get()
    return guardingBits.get(n / 32) && realBits.get(n);
}
This gives you a good chance of avoiding loading the real bits if those bits are mostly zero. You can adapt this approach to the expected contents of the bit set: if you expect almost all bits to be set, you can instead use the guarding bits to store a one when all the bits they guard are set, so you only need to check the real bits for those that might be zero.
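The guarded lookup can be sketched as a small class. This is only an illustration under assumptions: the name GuardedBits and the block size of 32 are made up, and both levels are kept in memory here, whereas in a real system the point would be to load the real bits lazily only when the guard bit is set.

```java
import java.util.BitSet;

class GuardedBits {
    private static final int BLOCK = 32; // real bits covered by each guard bit

    private final BitSet guardingBits = new BitSet();
    private final BitSet realBits = new BitSet();

    // Sets the real bit and marks its block as possibly non-empty.
    void set(int n) {
        realBits.set(n);
        guardingBits.set(n / BLOCK);
    }

    // Checks the coarse guard bit first; the real bits are only consulted
    // when the guard says the block contains at least one set bit.
    boolean isSet(int n) {
        return guardingBits.get(n / BLOCK) && realBits.get(n);
    }
}
```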
Also, this might be just the beginning. Depending on the usage and requirements, you might want to use a B-tree or a paginated version where you only hold a fraction of the big bit field in memory.

Using boolean instead of byte or int in Java

Is using boolean instead of byte (if I need 2 states) in Java useful for performance, or is it just an illusion? Is all the space saving cancelled out by alignment?
You should use whichever is clearer, unless you have profiled your code, and decided that making this optimization is worth the cost in readability. Most of the time, this sort of micro-optimization isn't worth the performance increase.
According to Oracle,
boolean: ... This data type represents one bit of information, but its "size" isn't something that's precisely defined.
To give you an idea, I once consulted in a mini-shop (16-bit machines).
Sometimes people would have a "flag word", a global int containing space for 16 boolean flags.
This was to save space.
Never mind that to test a flag required two 16-bit instructions, and to set a flag required three or more.
Yes, boolean may use only 1 bit. But more important, it makes it clearer for another developer reading your code that there are only two possible states.
The answer depends on your JVM, and on your code. The only way to find out for sure is by profiling your actual code.
If you only have 2 states that you want to represent, and you want to reduce memory usage you can use a java.util.BitSet.
On most JVMs a boolean uses the same amount of space as a byte. Accessing a byte/boolean can be more work than accessing an int or long, so if performance is the only consideration, an int or long can be faster. When you share a value between threads, there can be an advantage to reserving a whole cache line for the field (in the most extreme cases). This is 64 bytes on many CPUs.

Which is faster, int to String or String to int?

This may seem like a fairly basic question, for which I apologise in advance.
I'm writing an Android app that uses a set of predefined numbers. At the moment I'm working with int values, but at some point I will probably need to use float and double values, too.
The numbers are used for two things. First, I need to display them to the user, for which I need a String (I'm creating a custom View and drawing the String on a Canvas). Second, I will be using them in a sort of calculator, for which they obviously need to be int (or float/double).
Since the numbers are the same whether they are used as String or int, I only want to store them once (this will also reduce errors if I need to change any of them; I'll only need to change them in the one place).
My question is: should I store them as String or as int? Is it faster to write an int as a String, or to parse an int from a String? My gut tells me that parsing would take more time/resources, so I should store them as ints. Am I right?
Actually, your gut may be wrong (and I emphasise may, see my comments below on measuring). To convert a string to an integer requires a series of multiply/add operations. To convert an integer to a string requires division/modulo. It may well be that the former is faster than the latter.
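To illustrate the operations involved, here is a simplified sketch of both conversions for non-negative values. The class and method names are hypothetical, and this is not how Integer.parseInt or Integer.toString are actually implemented: it does no validation, sign handling or overflow checking.

```java
class DigitConversion {
    // String -> int: one multiply and one add per digit.
    static int parse(String s) {
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            value = value * 10 + (s.charAt(i) - '0');
        }
        return value;
    }

    // int -> String: one division and one remainder per digit.
    static String format(int value) {
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append((char) ('0' + value % 10)); // least significant digit first
            value /= 10;
        }
        return sb.reverse().toString();
    }
}
```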
But I'd like to point out that you should measure, not guess! The landscape is littered with the corpses of algorithms that relied on incorrect assumptions.
I would also like to point out that, unless your calculator is expected to do huge numbers of calculations each second (and I'm talking millions if not billions), the difference will almost certainly be irrelevant.
In the vast majority of user-interactive applications, 99% of all computer time is spent waiting for the user to do something.
My advice is to do whatever makes your life easier as a developer and worry about performance if (and only if) it becomes an issue. And, just to clarify, I would suggest that storing them in native form (not as strings) would be easiest for a calculator.
I did a test on arrays of 1 000 000 ints and Strings. I only timed the conversion, and the results say:
Case 1, from int to String : 1 000 000 in an average of 344ms
Case 2, from String to int : 1 000 000 in an average of 140ms
Conclusion: your gut was wrong :)!
And I join the others in saying that this is not what is going to make your application slow. Better to concentrate on making it simpler and safer.
I'd say that's not really relevant. What should matter more is type safety: since you have numbers, int (or float and double) would force you to use numbers and not store "arbitrary" data (which String would allow to some extent).
The best is to do a bench test. Write two loops
one that converts 100000 units from numeric to String
one that converts 100000 units from String to numeric
And measure the time elapsed by getting System.currentTimeMillis() before and after each loop.
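The two loops could look roughly like the sketch below. The names are illustrative, and it uses System.nanoTime() rather than System.currentTimeMillis() because it is better suited to measuring short intervals. This is a naive timing with no JIT warm-up, so the numbers are only a rough indication; a harness such as JMH gives more trustworthy results.

```java
class ConversionTiming {
    // Converts every int to a String; returns the total character count
    // so the work cannot be optimized away as unused.
    static long intsToStrings(int[] values) {
        long chars = 0;
        for (int v : values) {
            chars += Integer.toString(v).length();
        }
        return chars;
    }

    // Parses every String to an int; returns the sum for the same reason.
    static long stringsToInts(String[] values) {
        long sum = 0;
        for (String s : values) {
            sum += Integer.parseInt(s);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 100_000;
        int[] ints = new int[n];
        String[] strings = new String[n];
        for (int i = 0; i < n; i++) {
            ints[i] = i;
            strings[i] = Integer.toString(i);
        }

        long t0 = System.nanoTime();
        long chars = intsToStrings(ints);
        long t1 = System.nanoTime();
        long sum = stringsToInts(strings);
        long t2 = System.nanoTime();

        System.out.println("int -> String: " + (t1 - t0) / 1_000_000.0 + " ms, chars=" + chars);
        System.out.println("String -> int: " + (t2 - t1) / 1_000_000.0 + " ms, sum=" + sum);
    }
}
```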
But personally, if I needed to do calculations on these numbers, I would store them in their native format (int or float) and only convert them to String for display. This is more a question of design and maintainability than of execution speed. Focusing on execution speed is sometimes counterproductive: gaining a few microseconds that nobody will notice is not worth sacrificing the design and the robustness (of course, some compromise may be needed when it is a question of saving a lot of CPU time). This reading may interest you.
A human who is using the calculator will not notice a performance difference, but as others have said, using strings as your internal representation is a bad idea since you don't get type safety in that case.
You will most likely get into maintenance problems later on if you decide to use strings.
It's better design practice to have the view displayed to the user being derived from the underlying data, rather than the other way around - at some point you might decide to render the calculator using your own drawing functions or fixed images, and having your data as strings would be a pain here.
That being said, neither of these operations is particularly time-consuming on modern hardware.
Parsing is a slow thing, printing a number is not. The internal representation as a number allows you to compute, which is probably what you intend to do with your numbers. Storing numbers as, well, numbers (ints, floats, decimals) also takes up less space than their string representations, so … you'll probably want to go with storing them as ints, floats, or whatever they are.
You are writing an application for mobile devices, where memory consumption is a huge deal.
Storing an int is cheap, storing a String is expensive. Go for int.
Edit: more explanation. Storing an int between -2^31 and 2^31-1 costs 32 bits, no matter what the number is. Storing it in a String costs 16 bits per digit of its base-10 representation.
