Can someone explain why arithmetic operations on integral types in Java always produce int or long results?
I think it's worth pointing out that this (arithmetic operations on integers producing integers) is a feature of many, many programming languages, not only Java.
Many of those programming languages were invented before Java, many after Java, so I think that arguments that it is a hang-over from the days when hardware was less capable are wide of the mark. This feature of language design is about making languages type-safe. There are very good reasons for separating integers and floating-point numbers in programming languages, and for making the programmer responsible for identifying when and how conversions from type to type take place.
Check this out: http://www.particle.kth.se/~lindsey/JavaCourse/Book/Part1/Java/Chapter02/operators.html#ArithOps
It explains how the type of the return value is determined by the types of the operands. Essentially:
the arithmetic operators require a numeric type
if both operands are integral types, the result is the wider of the two (so int + long = long)
if either operand is a floating-point type, the result is floating-point
if either operand is a double, the result is a double; otherwise it is a float
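For instance, here is a minimal sketch of those rules in action (the variable names are made up for illustration):
int i = 1;
long l = 2L;
float f = 3.0f;
double d = 4.0;
long a = i + l;    // int + long    -> long
float b = i + f;   // int + float   -> float
double c = l + d;  // long + double -> double
double e = f + d;  // float + double -> double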
If you need to control the types, then you'll need to cast the operands to the appropriate types. For example, int * int could be too large for an int, so you may need to do:
long result = myInt * (long) anotherInt;
Likewise for really large or really tiny floats resulting from arithmetic operations.
Because the basic integer arithmetic operators are only defined either between int and int or between long and long. In all other cases types are automatically widened to suit. There are no doubt some abstruse paragraphs in the Java Language Specification explaining exactly what happens.
Dummy answer: because this is how the Java Language Specification defines them:
4.2.2. Integer Operations
The Java programming language provides a number of operators that act on integral values:
[...]
The numerical operators, which result in a value of type int or long:
Do you mean why you don't get a double or BigInteger result? Historical accident and efficiency reasons, mostly. Detecting overflow from + or * and handling the result (from Integer.MAX_VALUE * Integer.MAX_VALUE, say) means generating lots of exception-detection code that will almost never get triggered, but always needs to get executed. Much easier to define addition or multiplication modulo 2^32 (or 2^64) and not worry about it. Same for division with a fractional remainder.
This was certainly the case long ago with C. It is less of an issue today with superscalar processors and lots of bits to play with. But people got used to it, so it remains in Java today. Use Python 3 if you want your arithmetic autoconverted to a type that can hold the result.
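To see the modular wrap-around described above, a quick illustration; Math.multiplyExact is the overflow-checked alternative available in the standard library since Java 8:
int silent = Integer.MAX_VALUE * Integer.MAX_VALUE; // wraps modulo 2^32; the result is 1
System.out.println(silent); // prints 1, no exception

// The checked variant throws instead of wrapping silently:
int checked = Math.multiplyExact(Integer.MAX_VALUE, Integer.MAX_VALUE); // throws ArithmeticException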
The reason is kind of the same as why we have primitive types in Java at all -- it allows writing efficient code. You may argue that it also makes less efficient but correct code much uglier; you'd be about right. Keep in mind that the design choice was made around 1995.
Related
Why are the "F" and "L" suffixes needed when declaring a long or float? According to the documentation:
An integer literal is of type long if it ends with the letter L or l; otherwise it is of type int.
A floating-point literal is of type float if it ends with the letter F or f; otherwise its type is double.
So, from that, obviously the compiler is treating the values as either an int data type or a double data type, by default. That doesn't quite explain things for me.
I dug a bit deeper and found a discussion where a user describes the conversion from a 64-bit double into a 32-bit float would result in data loss, and the designers didn't want to make assumptions.
Questions I still have:
Why would the compiler allow one to write byte myByte = 100;, and the compiler automatically converts 100, an int as described above, into a byte, but the compiler won't allow long myLong = 3_000_000_000;? Why will it not auto-convert 3_000_000_000 into a long, despite it being well within the range of a long?
As discussed above, when designing Java, the designers wouldn't allow a double to be assigned to a float because of the data loss. While this may be true for a value that is outside of the range of a float, obviously something like 3.14 is small enough for a float. So then, why does the compiler throw an error with the assignment float myFloat = 3.14;?
Ultimately, I'm failing to fully understand why the suffixes are needed, the rules surrounding automatic casting (if that's what's happening under the hood), etc.
I know this topic has been discussed before, but the answers given only raise more questions, so I am deciding to create a new post.
In answer to your specific questions:
The problem with long myLong = 3_000_000_000; is that 3_000_000_000 is not a legal int literal because 3,000,000,000 does not fit into 4 bytes. The fact that you want to promote it to a long in order to initialize myLong is irrelevant. (Yes, the language designers could have designed the language so that in this context 3_000_000_000 could have been parsed as a long, but they didn't, probably to keep the language simpler and to avoid ambiguities in other contexts.)
The problem with 3.14 is not a matter of range but of loss of precision. In particular, while 3.14 terminates in base 10 representation, it does not have a finite representation in binary floating point. So converting from a double to a float (in order to initialize myFloat) would involve truncating significant, non-zero bits of the representation. (But just to be clear: Java considers every narrowing conversion from double to float to be lossy, regardless of the actual values involved. So float myFloat = 3.0; would also fail. However, float myFloat = 3; succeeds because conversion from an int value to a float is considered a widening conversion.)
In both cases, the right thing to do is to indicate exactly to the compiler what you are trying to do by appending the appropriate suffix to the numeric literal.
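For instance (a minimal sketch; the variable names are illustrative):
long myLong = 3_000_000_000L; // L marks the literal itself as a long
float myFloat = 3.14f;        // f marks the literal as a float
float alsoFine = 3;           // no suffix needed: int -> float is a widening conversion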
Why would the compiler allow one to write byte myByte = 100;, and the compiler automatically converts 100, an int as described above, into a byte, but the compiler won't allow long myLong = 3_000_000_000;?
Because the spec says so. Note that byte myByte = 100; does work, yes, but that is a special case, explicitly mentioned in the Java Language Specification. Ordinarily, 100 as a literal in a .java file is always interpreted as an int first, and never silently converts itself to a byte, except in two cases, both explicitly mentioned in the JLS: the cast is 'implied' in compound assignment (someByte += anyNumber; always works and implies the cast; again, why? Because the spec says so), and the same explicit presumption is made when declaring a variable: byte b = 100; works, provided the int literal is in fact in byte range (-128 to +127).
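A short sketch of those two special cases (assuming illustrative variable names):
byte b = 100;  // OK: the constant fits in byte range, so the narrowing is implied
// byte c = 128;  // error: 128 is outside byte range (-128..127)
b += 1000;     // OK: compound assignment implies the cast, i.e. b = (byte) (b + 1000);
// b = b + 1;  // error: b + 1 is an int; plain assignment implies no cast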
The JLS does not make an explicit rule that such concepts apply to long x = veryLargeLiteral;. And that is where your quest really ought to end: the spec says so. End of story.
If you'd like to ask: "Surely whoever added this, or rather failed to add this explicit case to the JLS, had reasons for it, and those reasons are more technical and merit-based than 'they thought of it in a dream' or 'they flipped a coin'", then we get to a pure guess (because you'd have to ask them, so probably James Gosling, about why he made a decision 25 years ago):
Because it would be considerably more complex to implement for the javac codebase.
Right now, literals are first considered as an int, and only then, much later in the process, if the code is structured such that the JLS says no cast is needed, can they be 'downcast'. With the long scenario this does not work: once you try to treat 3_000_000_000 as an int, you have already lost the game, because it does not fit. The parser would therefore need to create some sort of bizarre, Schrödinger's-cat style node which represents 3_000_000_000 accurately, but which downstream gets turned into a parse error UNLESS it is used in one of the explicit scenarios where silently-treat-as-long is allowed. That's certainly possible, but slightly more complex.
Presumably the same argument explains why, in 25 years, Java has not seen an update here. It could get one at some point in time, but I doubt it'll have high priority.
As discussed above, when designing Java, the designers won't allow a double to be assigned to a float because of the data loss.
This really isn't related at all. long -> int is lossy, but double -> float mostly isn't (it's floating point; you lose a little every time you do anything with them, but that's sort of baked into the contract when you use them at all, so it should not stop you).
obviously something like 3.14 is small enough for a float.
long and int are easy: ints go from about -2 billion to about +2 billion, and longs go a lot further. But float/double is not like that. They represent roughly the same range (which is HUGE; numbers with 300+ digits are fine), but their accuracy goes down as you move away from 0, and for floats it goes down a lot faster. Almost every number, including 3.14, cannot be perfectly represented by either float or double, so we're really just arguing about how much error is acceptable. Thus, Java does not as a rule silently convert things to a float, because, hey, you picked double, presumably for a reason, so you need to explicitly tell the compiler: "Yup, I get it, I want you to convert and I will accept the potential loss, it is what I want". Once the compiler starts guessing at what you meant, that is an excellent source of hard-to-find bugs. Java has loads of places where it is designed like this. Contrast with languages like JavaScript or PHP, where tons of code is legal even if it is bizarre and seems to make no sense, because the interpreter will just try to guess at what you wanted.
Java is much better than that: it draws a line. Once your code is sufficiently weird that the odds that javac knows what you wanted drop below a threshold, Java will actively refuse to take a wild stab in the dark at what you meant and will just flat out refuse and ask you to be more clear about it. In a 20-year coding career I cannot stress enough how useful that is :)
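To make the 3.14 case concrete, a small sketch:
double d = 3.14;            // not exactly 3.14: the closest double to it
float f = (float) 3.14;     // explicit cast: "I accept the extra loss"
System.out.println(d - f);  // prints a small nonzero difference; the float
                            // is further from the true 3.14 than the double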
I know this topic has been discussed before, but the answers given only raise more questions, so I am deciding to create a new post.
And yet you asked the same question again, instead of asking about the 'more questions' those answers raised. Shouldn't you have asked about those?
First, we need to understand how declaration happens in Java. Java is a statically-typed language: once we declare a variable, we can't change its data type afterwards. Let's look at an example:
long myLong = 3_000_000_000;
Integer literals (used for byte, short, int, long) are int by default; the difference between these integral types is their size (byte < short < int < long).
When we declare the variable, we're telling Java that myLong's type should be long (like int, but with a larger size). Then we try to initialize it with the literal 3_000_000_000, which is an int, BUT int's max value is 2,147,483,647, so the literal is too big. That's why we should write L (or l) at the end of the literal. After adding L, the literal is a long and can be assigned to the declared long myLong: long myLong = 3_000_000_000L;
int myInt = 300L; => (an error will appear)
In this example our literal (300L) is a long. As I mentioned before, long is bigger than the other integral types, so it cannot be assigned to an int without a cast. When we delete the L from the end of the literal, 300 becomes an int and the assignment works.
Here is another example, for float and double:
float myFloat = 5.5; (Error)
float myFloat = 5.5F; (Correct version)
Floating-point literals are double by default; the difference is that double is wider than float. myFloat is declared as float, but 5.5 is a double, so an error appears saying we can't assign it. That is why we should add F (or f) to the end of 5.5. We can also use D (or d) for double, but it's up to us; it's not necessary, because there is no wider floating-point type than double.
Hope it's clear :)
If you use BigInteger (or BigDecimal) and want to perform arithmetic on them, you have to use the methods add or subtract, for example. This may sound fine until you realize that this
i += d + p + y;
would be written like this for a BigInteger:
i = i.add(d.add(p.add(y)));
As you can see, the first line is a little easier to read. This could be solved if Java allowed operator overloading, but it doesn't, which raises the question:
Why isn't BigInteger a primitive type so it can take advantage of the same operators as other primitive types?
That's because BigInteger is not, in fact, anything that is close to being a primitive. It is implemented using an array and some additional fields, and the various operations include complex operations. For example, here is the implementation of add:
public BigInteger add(BigInteger val) {
    if (val.signum == 0)
        return this;
    if (signum == 0)
        return val;
    if (val.signum == signum)
        return new BigInteger(add(mag, val.mag), signum);

    int cmp = compareMagnitude(val);
    if (cmp == 0)
        return ZERO;
    int[] resultMag = (cmp > 0 ? subtract(mag, val.mag)
                               : subtract(val.mag, mag));
    resultMag = trustedStripLeadingZeroInts(resultMag);
    return new BigInteger(resultMag, cmp == signum ? 1 : -1);
}
Primitives in Java are types that are usually implemented directly by the CPU of the host machine. For example, every modern computer has a machine-language instruction for integer addition, so it can map to a very simple bytecode instruction in the JVM.
A complex type like BigInteger cannot usually be handled that way, and it cannot be translated into simple bytecode. It cannot be a primitive.
So your question might be "Why no operator overloading in Java". Well, that's part of the language philosophy.
And why not make an exception, like for String? Because it's not just one operator that would be the exception: you would need exceptions for *, /, +, -, <<, ^ and so on. And you'd still have some operations on the object itself (like pow, which is not represented by an operator in Java), which for primitives are handled by utility classes (like Math).
Fundamentally, because the informal meaning of "primitive" is that it's data that can be handled directly with a single CPU instruction. In other words, they are primitives because they fit in a 32- or 64-bit word, which is the data architecture your CPU works with, so they can explicitly be stored in registers.
And thus your CPU can make the following operation:
ADD REGISTER_3 REGISTER_2 REGISTER_1 ;;; REGISTER_3 = REGISTER_1 + REGISTER_2
A BigInteger, which can occupy an arbitrarily large amount of memory, can't be stored in a single register, and performing even a simple sum requires multiple instructions.
This is why BigInteger couldn't possibly be a primitive type; instead it is an object with methods and fields, a much more complex structure than a simple primitive type.
Note: the reason I called this informal is that ultimately the Java designers could define "Java primitive type" as anything they wanted; they own the term. However, this is roughly the agreed use of the word.
It's not that int, boolean, and char are primitives so that you can take advantage of operators like + and /; they are primitives for historical reasons, the biggest of which is performance.
In Java, primitives are defined as just those things that are not full-fledged Objects. Why create these unusual structures (and then re-implement them as proper objects, like Integer, later on)? Primarily for performance: operations on Objects were (and are) slower than operations on primitive types. (As other answers mention, hardware support made these operations faster, but I'd disagree that hardware support is an "essential property" of primitives.)
So some types received "special treatment" (and were implemented as primitives), and others didn't. Think of it this way: if even the wildly-popular String is not a primitive type, why would BigInteger be?
It's because primitive types have a fixed size: for instance, int is 32 bits and long is 64 bits. So when you create a variable of type int, the JVM allocates 32 bits of memory on the stack for it. But a BigInteger theoretically has no size limit; it can grow arbitrarily large. Because of this, there is no way to know its size in advance and allocate a fixed block of stack memory for it, so it is allocated on the heap, where the JVM can grow it as needed.
Primitive types are mostly historic types defined by processor architecture, which is why byte is 8-bit, short is 16-bit, int is 32-bit and long is 64-bit. Maybe when 128-bit architectures become more common, an extra primitive will be created... but I can't see there being enough drive for it.
Related: Why does the Java API use int instead of short or byte?
I would like to know why byte and short values are promoted to int whenever an expression is evaluated or a bitwise operation is processed.
Because the Java Language Specification says so. Section 5.6.1 defines unary numeric promotion for evaluation of certain operators, and it says:
If the operand is of compile-time type byte, short, or char, it is promoted to a value of type int by a widening primitive conversion (§5.1.2).
Section 5.6.2 on evaluation of binary numeric operators ('binary' meaning operators that have two operands, like '+') says something similar:
If either operand is of type double, the other is converted to double.
Otherwise, if either operand is of type float, the other is converted to float.
Otherwise, if either operand is of type long, the other is converted to long.
Otherwise, both operands are converted to type int.
Why was it defined this way? A major reason is that at the time of design of the Java language and Java virtual machine, 32-bit was the standard word size of computers, where there is no performance advantage to doing basic arithmetic with smaller types. The Java virtual machine was designed to take advantage of this, by using 32-bit as the int size, and then providing dedicated instructions in the Java bytecode for arithmetic with ints, longs, floats, and doubles, but not with any of the smaller numeric types (byte, short, and char). Eliminating the smaller types makes the bytecode simpler, and lets the complete instruction set, with room for future expansion, still fit the opcode in a single byte. Similarly, the JVM was designed with a bias towards easy implementation on 32-bit systems, in the layout of data in classes and in the stack, where 64-bit types (doubles and longs) take two slots and all other types (32-bit or smaller) take one slot.
So, the smaller types were generally treated as second-class citizens in the design of Java, converted to ints at various steps, because that simplified some things. The smaller types are still important because they take less memory when packed together (e.g., in arrays), but they do not help when evaluating expressions.
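A small illustration of these promotions (a sketch; the variable names are made up):
byte a = 10, b = 20;
// byte sum = a + b;         // error: a and b are promoted, so a + b is an int
int sum = a + b;             // OK
byte sum2 = (byte) (a + b);  // OK, with an explicit narrowing cast

char c = 'A';
int code = c + 1;            // char is promoted to int as well; code == 66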
Let's say I were to have the following line of code:
int number = b/2;
where b is an odd int. What would happen?
Also, if b were instead a long, would java automatically convert this long to an int? What if b were a char, or something else ridiculous?
Java will widen types automatically, but you must narrow the types yourself with a cast.
I suggest you try this for yourself as you might learn something. You can't learn to program without actually doing it at some point.
It will return the integer value of b/2: if b = 3, then b/2 evaluates to 1. But if b is a long variable, say long b = 3, then the assignment is a compile-time error, "possible loss of precision", and you would need an explicit cast to int.
Dividing integers results in an integer which is rounded towards 0. When you start mixing types it depends on if they need widening (will happen automatically) or narrowing (will in most cases not happen automatically).
More details for division can be found in Java Language Specification 15.17.2. Division Operator / and for narrowing and widening at Java Language Specification Chapter 5. Conversions and Promotions.
But I think trying and experimenting in your own Java program is a better way to understand than just reading the specification. You will not destroy anything by writing a small test program.
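In that spirit, a tiny test program one might write (the values are only illustrative):
System.out.println(7 / 2);   // 3  (integer division truncates toward zero)
System.out.println(-7 / 2);  // -3 (also toward zero, not toward negative infinity)
System.out.println(7 % 2);   // 1  (the remainder takes the sign of the dividend)

long big = 7L;
// int n = big / 2;          // error: long / int is a long; narrowing needs a cast
int n = (int) (big / 2);     // OK with an explicit cast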
In Java, I know the data type of the result of an arithmetic calculation depends on the data types of the numbers involved in the calculation.
For example,
int + int = int
long / double = double
a. But I can't find any references which can give me all these rules. Could someone help me?
b. How do I avoid overflow in arithmetic calculations? For example, the sum of two longs may not fit into a long anymore...
Thanks a lot.
a. These rules are called numeric promotion rules and are specified in Java Language Specification, §5.6.2 (currently).
b. There are two generally accepted methods for dealing with overflows.
The first method is a post-check: you do the operation, say an addition, and then check the result against the operands. For example:
int c = a + b;
if (c < a) { // assuming a >= 0 and b >= 0
    // overflow happened
}
The second method is a pre-check, where you try to keep the overflow from happening in the first place. Example:
if (a > Integer.MAX_VALUE - b) { // assuming b >= 0
    // a + b would overflow
}
The specific section of the Java Language Specification that deals with integer operations is §4.2.2 (Integer Operations).
If you don't want values to overflow at all, use a BigInteger or some other arbitrary-precision arithmetic type.
For avoiding overflows in the general case, Guava (which I contribute to) provides methods like IntMath.checkedAdd(int, int) and LongMath.checkedMultiply(long, long), which throw exceptions on overflow. (Some of those are nontrivial to implement yourself, but these are all very exhaustively tested.) You can look at the source to see how they work, but most of them rely on cute bit-twiddling tricks to check for overflow efficiently.
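Since Java 8, the standard library offers the same checked behaviour as those Guava methods, so for the basic operations a dependency is no longer required. A minimal sketch:
try {
    int sum = Math.addExact(Integer.MAX_VALUE, 1);  // overflows, so it throws
} catch (ArithmeticException e) {
    System.out.println("overflow detected: " + e.getMessage());
}

long product = Math.multiplyExact(1_000_000_000L, 10L);  // fits in a long, no exception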
The result of an arithmetic operation on any two primitive integer operands will be at least an int -- even if the operands are bytes or shorts.
Answer to question A:
The result takes the type of the operand with the "larger" type
double is the "largest" numeric type
Note: boolean cannot be used with arithmetic operators; char can, but it is first promoted to int
Source: https://www.geeksforgeeks.org/type-conversion-java-examples/