I have observed that when setting heap size, people prefer values like 64, 128, 256, 1024, and so on. If I give a value in between these numbers (say 500), won't the JVM accept that value? Why are these numbers important and preferred? Why do we also upgrade RAM in this pattern?
Please help me to understand.
The JVM will accept any value; there is no problem with that. Using 2^n values is just a "convention"; using other values will have no negative effect in practice.
Well, if you think about it this way:
1 byte is 8 bits
1 KB = 1024 bytes
1 MB = 1024 KB
1 GB = 1024 MB
... and so on ...
It's not just 2^n. Sizes in computer memory are closely related to the number eight, the number of bits that defines one byte in most modern computers.
The main reason bits are grouped together is to represent characters. Because of the binary nature of all things computing, ideal 'clumps' of bits come in powers of 2, i.e. 1, 2, 4, 8, 16, 32..., basically because they can always be divided into smaller equal packages (this also creates shortcuts for storing sizes, but that's another story). Obviously 4 bits (a nybble, in some circles) can give us 2^4, or 16, unique characters. As most alphabets are larger than this, 2^8 (or 256 characters) is a more suitable choice.
Machines have existed that used bytes of other lengths (particularly 7 or 9 bits). These have not really survived, mainly because they are not as easy to manipulate. You certainly cannot split an odd number in half, which means that if you were to divide such bytes, you would have to keep track of the length of the bit string.
Finally, 8 is also a convenient number: many people (psychologists and the like) claim that the human mind can generally recall only 7-8 things immediately (without playing memory tricks).
If it won't accept the value, check whether you put a megabyte (M or m) or gigabyte (G or g) modifier after the amount.
Example: java -Xms500M -Xmx500M -jar myJavaProgram.jar
Also, take a look at this link.
"Why do we also upgrade RAM in this pattern?"
That is because memory chips and cards come in sizes that are a power of 2 bytes. And the fundamental reason for that is that it makes the electronics simpler. And simpler means cheaper, more reliable and (probably) faster.
Apart from the unwritten convention, it can also have a performance impact, depending on the architecture of the machine.
For example, if a machine were ternary-based, it would work better with a heap size set to a power of 3.
Related
Actually, my question is very similar to this one, but that post focuses only on C#. Recently I read an article that said Java will 'promote' some short types (like short) to 4 bytes in memory even if some bits are not used, so it can't reduce memory usage. (Is that true?)
So my question is how languages, especially C, C++ and Java (as Manish said in this post about Java), handle memory allocation of small data types. References or any approaches to figure this out are preferred. Thanks
C/C++ uses only the specified amount of memory but aligns the data (by default) to an address that is a multiple of some value, typically 4 bytes for 32 bit applications or 8 bytes for 64 bit.
So for example if the data is aligned on a 4 or 8 byte boundary then a "char" uses only one byte. An array of 5 chars will use 5 bytes. But the data item that is allocated after the 5 byte char array is placed at an address that skips 3 bytes to keep it correctly aligned.
This is for performance on most processors. There are usually pragmas like "pack" and "align" that can be used to change the alignment or disable it.
In C and C++, different approaches may be taken depending on how you've requested the memory.
For T* p = (T*)malloc(n * sizeof(T)); or T* p = new T[n]; then the data will occupy sizeof(T)*n bytes of memory, so if sizeof(T) is reduced (e.g. to int16_t instead of int32_t) then that space is reduced accordingly. That said, heap allocations tend to have some overheads, so few large allocations are better than a great many allocations for individual data items or very small arrays, where the overheads may be much more significant than small differences in sizeof(T).
For structures, static and stack usage, padding is more significant than for large arrays, as the following data item might be of a different type with different alignment requirements, resulting in more padding.
At the other extreme, you can apply bitfields to effectively pack values into the minimum number of bits they need - very dense compression indeed, though you need to rely on compiler pragmas/attributes if you want explicit control - the Standard leaves it unspecified when a bitfield might start in a new memory "word" (e.g. a 32-bit memory word for a 32-bit process, 64 for 64) or wrap across separate words, and where in the word the bits hold data vs. padding, etc. Data types like C++ bitsets and vector<bool> may be more efficient than arrays of bool (which may well use an int for each element, but that is unspecified in the C++03 Standard).
I'm a pretty new programmer and I'm using Java. My teacher said that there are many types of integers. I don't know when to use them. I know they have different sizes, but why not use the biggest size all the time? Any reply would be awesome!!!
Sometimes, when you're building massive applications that could take up 2+ GB of memory, you really want to be restrictive about what primitive type you want to use. Remember:
int takes up 32 bits of memory
short takes up 16 bits of memory, 1/2 that of int
byte is even smaller, 8 bits.
See this java tutorial about primitive types: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
The space taken up by each type really matters if you're handling large data sets. For example, if your program has an array of 1 million ints, then you're taking up 3.81 MB of RAM. Now let's say you know for certain that those 1,000,000 numbers are only going to be in the range of 1-10. Why not, then, use a byte array? 1 million bytes only take up 976 kilobytes, less than 1 MB.
You always want to use the number type that is just "large" enough to fit, just as you wouldn't put an extra-large T-shirt on a newborn baby.
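Here is a minimal sketch of the int-versus-byte comparison above (the figures count element data only; each array also carries a small header whose size varies by JVM):

    public class ArrayFootprint {
        public static void main(String[] args) {
            // 1,000,000 ints at 4 bytes each: about 4,000,000 bytes (~3.81 MB) of element data
            int[] asInts = new int[1_000_000];

            // The same count stored as bytes: about 1,000,000 bytes (~976 KB) of element data,
            // which is plenty when every value is known to be in the range 1-10
            byte[] asBytes = new byte[1_000_000];

            asBytes[0] = 10; // values 1-10 fit comfortably in a byte (-128..127)
            System.out.println(asInts.length + " ints vs. " + asBytes.length + " bytes");
        }
    }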
So you could. Memory is cheap these days and assuming you are writing a simple program it is probably not a big deal.
Back in the days when memory was expensive, you needed to be a lot more careful with how much memory you use.
Say you are word processing on an IBM 5100, one of the first PCs from the 70s, which had a minimum of 16 KB of RAM (unimaginable these days). If you used 64-bit values all day, you could keep at most 2048 characters, with no memory left for the word processing program itself; that's not enough to hold what I'm typing right now!
Knowing that English has a limited number of characters and symbols, if you choose ASCII to represent the text, you would use 8 bits (or 1 byte) per character, which allows you to go up to about 16,000 characters, and that's quite a bit more room for typing.
Generally you will use a data type that's just big enough to hold the biggest number you might need, to save on memory. Say you are writing a database for the IRS to keep track of all the tax IDs: if you can save 1 bit of memory per record, that's billions of bits (hundreds of megabytes!) of memory savings.
The ones that can hold higher numbers use more memory. Using more memory is bad. One int versus one byte is not a big difference right now, but if you write big programs in the future, the memory used adds up.
Also, you said double in the title. A double is not like an int. It does hold a number, but it can have decimal places (e.g. 2.36), unlike an int, which can only hold whole numbers like 8.
Because we like to be professional, and also because a byte uses less memory than an int, and a double can include decimal places.
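For example, a tiny illustration of the distinction:

    public class TypeChoices {
        public static void main(String[] args) {
            int whole = 8;       // an int holds whole numbers only
            double price = 2.36; // a double can hold decimal places, unlike int
            byte small = 10;     // a byte is enough when values stay within -128..127
            System.out.println(whole + " " + price + " " + small);
        }
    }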
I know that Integers act like interned values for values less than 128 (by default) and not for values above that. I know this has been given as an answer many times, but I couldn't find a place where the reason is asked.
So what I want to know is: why do Integers act as interned values only for values less than 128 (by default) and not for values above that? How does this reduce memory usage or improve performance?
Technically the values are pre-cached when the class is loaded. It is not like String.intern() where a value you created can be returned.
Also, the maximum might not be 127; it can be higher if you set it so or use options like -XX:+AggressiveOpts
The default range is likely to be chosen just to be consistent with Byte. Note: the cached values are
Boolean: both values
Byte: all
Character: 0 to 127
Short: -128 to 127
Integer: -128 to 127
Long: -128 to 127
Float and Double: none
BigInteger: -16 to 16 (in HotSpot Java 7)
BigDecimal: 0 to 10 (if you use valueOf(long)) and 0 to 0.000000000000000 (if you use valueOf(long, int)) (in HotSpot Java 7)
The reason it is done is to improve performance and reduce GC pressure.
Creating garbage can fill your cache with garbage, slowing down all your code; it also takes work to create objects and to clean them up. The less work you do, the faster and more consistent your program will be.
Here is a good article on the difference it makes: http://www.javaspecialists.eu/archive/Issue191.html
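You can see the cache in action with a small sketch (the behavior for 128 and above assumes the default cache range; options such as the one mentioned above can enlarge it):

    public class IntegerCacheDemo {
        public static void main(String[] args) {
            Integer a = Integer.valueOf(127);
            Integer b = Integer.valueOf(127);
            System.out.println(a == b);      // true: 127 is cached, so both refer to the same object

            Integer c = Integer.valueOf(128);
            Integer d = Integer.valueOf(128);
            System.out.println(c == d);      // false with the default cache: two distinct objects
            System.out.println(c.equals(d)); // true: equals() compares values, not identity
        }
    }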
Look at it like this:
It is certainly not desirable to cache all Integer values, for this would mean that when a JVM starts up, it would have to create more than 4 billion Integer objects, which would most likely not fit in the memory of contemporary computers.
Hence, one must decide to cache some smaller number of Integers (if any at all). This number should be small enough that no extra memory usage or slower startup is noticeable. On the other hand, it should cover as many cases as possible. I haven't read any study concerning this, but from experience we can say with confidence that very small integers are used most often.
Thus, the decision to cache 128 numbers is completely arbitrary, yet it still makes sense. It could just as well be 30 or 300. But certainly not a million.
How does this help performance? Well, it speeds up autoboxing for small numbers, because one does not have to construct an Integer, but rather picks one from the cache (which is most likely a small Integer[] array) with a single memory access. At the same time, it is well known that many Integers are short-lived. Using pre-allocated objects that need not be garbage collected takes stress off the GC.
On this blog post, it's said that the minimum memory usage of a String is:
8 * (int) ((((no chars) * 2) + 45) / 8) bytes.
So for the String "Apple Computers", the minimum memory usage would be 72 bytes.
Even if I have 10,000 String objects of twice that length, the memory usage would be less than 2 MB, which isn't much at all. So does that mean I'm underestimating the number of Strings present in an enterprise application, or is that formula wrong?
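Here is roughly how I arrived at those numbers (a quick sketch of the blog's formula; estimatedStringBytes is just my own helper name, not a real API):

    public class StringEstimate {
        // The blog's rule of thumb: 8 * (int) (((chars * 2) + 45) / 8) bytes per String
        static long estimatedStringBytes(int chars) {
            return 8L * (((chars * 2) + 45) / 8);
        }

        public static void main(String[] args) {
            System.out.println(estimatedStringBytes(15));          // "Apple Computers" -> 72 bytes
            System.out.println(10_000 * estimatedStringBytes(30)); // 10,000 strings of twice that length -> ~1 MB
        }
    }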
Thanks
String storage in Java depends on how the string was obtained. The backing char array can be shared between multiple instances. If that isn't the case, you have the usual object overhead plus storage for one pointer and three ints which usually comes out to 16 bytes overhead. Then the backing array requires 2 bytes per char since chars are UTF-16 code units.
For "Apple Computers" where the backing array is not shared, the minimum cost is going to be
backing array for 16 chars -- 32B which aligns nicely on a word boundary.
pointer to array - 4 or 8B depending on the platform
three ints for the offset, length, and memoized hashcode - 12B
2 x object overhead - depends on the VM, but 8B is a good rule of thumb.
one int for the array length.
So roughly 72B in total (32 + 8 + 12 + 16 + 4, taking the 8-byte pointer), of which the actual payload constitutes 44.4%. The payload constitutes a larger share for longer strings.
In Java 7, some JDK implementations are doing away with backing-array sharing to avoid pinning large char[]s in memory. That allows them to do away with two of the three ints.
That changes the calculation to 64B for a string of length 16 of which the actual payload constitutes 50%.
Is it possible to save character data using less memory than a Java String? Yes.
Does it matter for "enterprise" applications (or even Android or J2ME applications, which have to get by on a lot less memory)? Almost never.
Premature optimization is the root...
Compared to the other data types that you have, it is definitely high. The other primitives use 32 bits, 64 bits, etc.
And given that String is immutable, every time you perform any operation on it, you end up creating a new String object, consuming even more memory.
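For instance, a small sketch (the exact number of intermediate objects created depends on the compiler and JVM):

    public class StringChurn {
        public static void main(String[] args) {
            String s = "Java";
            String t = s.concat(" memory"); // s is not modified; a brand-new String is allocated
            System.out.println(s);          // Java
            System.out.println(t);          // Java memory

            // Repeated + on Strings tends to create intermediate objects;
            // StringBuilder appends into one mutable buffer instead.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 3; i++) {
                sb.append(i);
            }
            System.out.println(sb); // 012
        }
    }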
Who decides the size of data types such as int in Java? JVM or OS or Processor?
int size is 4 bytes. Will it always be 4 bytes irrespective of OS or processor?
The Java Language Specification decides them. They're the same size on all VMs, on all OSes, on all processors. If they're not, it's not Java anymore.
It is the JVM specification that drives JVM implementations to decide the size of data types. Refer to http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html#jvms-2.3
While the Java spec decides how many bits each type actually uses, the default 32-bit JVM actually pads some types in memory, using 32 bits of space to store values, even ones that don't need that much space. They all still behave, as far as the program is concerned, as if they took up their real amount of storage, but the amount of space used can be much larger.
Other JVMs can do this differently, for instance they could store an array of booleans with only one bit per value, rather than 32 bits per value.
So while the size of an int will always appear to be 32 bits, because it will always wrap around at 32 bits and can only hold 32 bit values, the JVM does have the option of internally storing it using 64 bits of space, or some other number depending on the hardware and implementation.
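A small illustration (Integer.SIZE is part of the standard API; the wrap-around behavior is mandated by the language spec regardless of OS or CPU):

    public class IntSize {
        public static void main(String[] args) {
            System.out.println(Integer.SIZE);          // 32 bits, fixed by the Java Language Specification
            System.out.println(Integer.MAX_VALUE);     // 2147483647
            System.out.println(Integer.MAX_VALUE + 1); // wraps to -2147483648 on every platform
        }
    }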
Also, floating point values are actually much less restricted in this regard - while the spec for float and double does indeed require they use a particular format, the presence of the two different libraries java.lang.Math and java.lang.StrictMath (see What's the difference between java.lang.Math and java.lang.StrictMath? for an explanation) is a clear example. Using java.lang.Math does not require that the intermediate calculations for those functions be stored in any particular way, or that it use a particular number of bits to compute everything.
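To make that concrete, here is a sketch (on many platforms the two calls happen to return identical results, but only StrictMath is required to be bit-for-bit reproducible):

    public class MathVsStrictMath {
        public static void main(String[] args) {
            double x = 1e10;
            // Math may delegate to faster platform-specific implementations
            System.out.println(Math.sin(x));
            // StrictMath must follow the fdlibm reference algorithms exactly
            System.out.println(StrictMath.sin(x));
        }
    }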