Operations on unsigned numbers in Java using least memory

As we know Java doesn't support unsigned operations because it considers the last bit as a sign bit for all integers.
I know that one way is to use a wider type for the numbers involved in the operations. For example, if you want 32-bit operations, you can do them with long, which is 64-bit, and then take the bottom 32 bits as the result.
But the point is that it takes twice the memory.
I read something about using the Integer class but I didn't understand it. Do you have any idea how to do unsigned operations with the least memory used?
Thanks!

"Signed" and "unsigned" are about the interpretation of bit patterns as numeric values. Most (all?) modern CPUs (and the JVM) use two's complement representation for signed numbers, which has the interesting property that for addition and subtraction, the bit pattern operations are the same as for unsigned numbers.
E.g. in 16 bits, the pattern 0xfff0 is 65520 when interpreted as an unsigned number, and -16 as a signed number. Subtracting 16 (0x0010) from that number gives 0xffe0, which represents the correct result both for the signed case (-32) and for the unsigned case (65504).
So, depending on the operations you are doing on your numbers, there are cases where you can simply ignore the signed/unsigned question - just be careful with input and output.
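The 16-bit example above can be checked directly in Java. This is only a minimal sketch: the class name TwosComplementDemo and the helper subtract16 are made up for illustration, and Short.toUnsignedInt requires Java 8 or later.

```java
class TwosComplementDemo {
    // Subtracts on the raw 16-bit pattern. The resulting bits are the same
    // whether the caller reads them as signed or as unsigned.
    static short subtract16(short a, short b) {
        return (short) (a - b);
    }

    public static void main(String[] args) {
        short pattern = (short) 0xfff0;
        short result = subtract16(pattern, (short) 0x0010);

        System.out.println(pattern);                       // signed view: -16
        System.out.println(Short.toUnsignedInt(pattern));  // unsigned view: 65520
        System.out.println(result);                        // signed view: -32
        System.out.println(Short.toUnsignedInt(result));   // unsigned view: 65504
    }
}
```

Only the printing step needs to know which interpretation is intended; the subtraction itself does not.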
Regarding the relevance of using the shorter representation, only you can tell. And Java can be quite fast when doing numerics. I remember a coding contest quite some years ago where even Java 1.4 outperformed Microsoft C (both with default settings - with optimization switches C was a bit faster).

As we know Java doesn't support unsigned operations because it considers the last bit as a sign bit for all integers.
This is incorrect.
While Java does not support 32 and 64 bit unsigned integer types, it DOES support unsigned integer operations; see the javadoc for the Integer and Long types, and look for the xxxUnsigned methods.
Note that you won't see methods for add, subtract or multiply because they would return the same representations as the primitive operators do.
In short, you can use the 32 bit signed int type to do 32 bit unsigned arithmetic and comparison operations.
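As a rough sketch of what this looks like in practice (assuming Java 8 or later; the class UnsignedIntDemo and the helper sumUnsigned are hypothetical names), values can stay in 4-byte ints and be widened only at the moment a larger range is needed:

```java
class UnsignedIntDemo {
    // Sums values stored compactly as 32-bit patterns, widening each one
    // to long only for the duration of the addition.
    static long sumUnsigned(int[] values) {
        long total = 0;
        for (int v : values) {
            total += Integer.toUnsignedLong(v);
        }
        return total;
    }

    public static void main(String[] args) {
        // 4294967290 does not fit in a signed int, but its 32-bit pattern does.
        int a = Integer.parseUnsignedInt("4294967290");
        int b = 10;

        int sum = a + b;  // wraps modulo 2^32, same bits as unsigned addition

        System.out.println(a);                                               // -6
        System.out.println(Integer.toUnsignedString(a));                     // 4294967290
        System.out.println(Integer.toUnsignedString(sum));                   // 4
        System.out.println(Integer.divideUnsigned(a, b));                    // 429496729
        System.out.println(Integer.remainderUnsigned(a, b));                 // 0
        System.out.println(Integer.compareUnsigned(a, b) > 0);               // true
        System.out.println(sumUnsigned(new int[] {a, b}));                   // 4294967300
    }
}
```

The int[] passed to sumUnsigned uses the same storage as a signed int array, which is the memory saving the question asks about.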
For more information:
Java8 unsigned arithmetic
How to use the unsigned Integer in Java 8 and Java 9?
Unsigned long in Java
https://www.baeldung.com/java-unsigned-arithmetic

Related

Unsigned long in Java

Currently, I am using signed values, -2^63 to 2^63-1. Now I need the same range (2 * 2^63 values), but with positive values only. I found the Java documentation mentioning unsigned long, which suits this use.
I tried to assign 2^64 to a Long wrapper object, but it still loses data; in other words, it only holds values up to Long.MAX_VALUE, so I am clearly missing something.
Is BigInteger the signed long that Java supports?
Is there a definition or pointer as to how to declare and use it?

In Java 8, unsigned long support was introduced. Still, these are typical longs, but the sign doesn't affect adding and subtracting. For dividing and comparing, you have dedicated methods in Long. Also, you can do the following:
long l1 = Long.parseUnsignedLong("12345678901234567890");
String l1Str = Long.toUnsignedString(l1);
BigInteger is a bit different. It can keep huge numbers. It stores them as int[] and supports arithmetic.

Although Java has no unsigned long type, you can treat signed 64-bit two's-complement integers (i.e. long values) as unsigned if you are careful about it.
Many primitive integer operations are sign agnostic for two's-complement representations. For example, you can use Java primitive addition, subtraction and multiplication on an unsigned number represented as a long, and get the "right" answer.
For other operations such as division and comparison, the Long class provides methods like divideUnsigned and compareUnsigned that will give the correct results for unsigned numbers represented as long values.
The Long methods supporting unsigned operations were added in Java 8. Prior to that, you could use 3rd-party libraries to achieve the same effect. For example, the static methods in the Guava UnsignedLongs class.
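A small sketch of the Java 8 methods mentioned above may help. The class UnsignedLongDemo and the helper maxUnsigned are illustrative names, not part of any library:

```java
class UnsignedLongDemo {
    // Returns whichever argument is larger when both are read as unsigned.
    static long maxUnsigned(long a, long b) {
        return Long.compareUnsigned(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        long max = -1L;  // all 64 bits set: 2^64 - 1 when read as unsigned
        long big = Long.parseUnsignedLong("12345678901234567890");

        System.out.println(Long.toUnsignedString(max));          // 18446744073709551615
        System.out.println(big > 1L);                            // false (signed comparison)
        System.out.println(Long.compareUnsigned(big, 1L) > 0);   // true
        System.out.println(Long.divideUnsigned(big, 10));        // 1234567890123456789
        System.out.println(Long.remainderUnsigned(big, 10));     // 0
        System.out.println(Long.toUnsignedString(maxUnsigned(big, 5L)));  // 12345678901234567890
    }
}
```

Addition, subtraction and multiplication use the ordinary operators; only division, remainder, comparison, parsing and printing go through the Long helpers.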
Is BigInteger the signed long that Java supports?
BigInteger would be another way to represent integer values greater than Long.MAX_VALUE. But BigInteger is a heavy-weight class. It is unnecessary if your numbers all fall within the range 0 to 2^64 - 1 (inclusive).

If using a third party library is an option, there is jOOU (a spin off library from jOOQ), which offers wrapper types for unsigned integer numbers in Java. That's not exactly the same thing as having primitive type (and thus byte code) support for unsigned types, but perhaps it's still good enough for your use-case.
import static org.joou.Unsigned.*;
// and then...
UByte b = ubyte(1);
UShort s = ushort(1);
UInteger i = uint(1);
ULong l = ulong(1);
All of these types extend java.lang.Number and can be converted into higher-order primitive types and BigInteger. In your case, earlier versions of jOOU simply stored the unsigned long value in a BigInteger. Version 0.9.3 does some cool bit shifting to fit the value in an ordinary long.
(Disclaimer: I work for the company behind these libraries)

somehow working with unsigned bytes in Java

I'm trying to create a Java program that writes files for my Arduino to read. The Arduino is a simple 8 bit microcontroller board, and with some extra hardware, can read text files from SD cards, byte by byte.
Turns out this was a whole lot harder than I thought. Firstly, there are no unsigned values in Java. Not even bytes for some reason! Even trying to set a byte to 0xFF gives a possible loss of precision error! This isn't very useful for this low-level code.
I would use ints and only use the positive values, but I like using byte overflow to my advantage in a lot of my code (though I could probably do this with modulus right after the math operation or something) and the biggest problem of all is I have no idea how to add an int as an 8 bit character to a String that gets written to a file later. Output is currently my biggest problem.
So, what would be the best way to do unsigned bit math based on some user input and then write those bits to a file as if each one was an ASCII character?

So, here's how it works.
You can treat Java bytes as unsigned. The only places where signs make a difference are
constants: just cast them to bytes
toString and parseInt
division
<, >, >=, <=
Operations where signedness does not matter:
addition
subtraction
multiplication
bit arithmetic (except for >>, just use >>> instead)
To convert bytes to their unsigned values as ints, just use & 0xFF, and to convert those to bytes use (byte).
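The rules above could be applied roughly as follows. This is a sketch under assumptions: the class UnsignedByteDemo and its helpers are invented for illustration, and an in-memory ByteArrayOutputStream stands in for the file. FileOutputStream offers the same write(int) and write(byte[]) methods, so raw bytes can be written without going through a String.

```java
import java.io.ByteArrayOutputStream;

class UnsignedByteDemo {
    // Adds two values in the range 0..255 with 8-bit wraparound.
    static int addWrapping(int a, int b) {
        return (a + b) & 0xFF;
    }

    // Encodes values 0..255 as raw bytes, one byte per value.
    static byte[] toRawBytes(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            out.write(v);  // write(int) keeps only the low 8 bits
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;                       // the constant needs a cast
        System.out.println(b);                      // -1 (signed view)
        System.out.println(b & 0xFF);               // 255 (unsigned view)
        System.out.println(addWrapping(250, 10));   // 4

        byte[] raw = toRawBytes(new int[] {0x41, 0xFF, 200});
        System.out.println(raw.length);             // 3
        System.out.println(raw[1] & 0xFF);          // 255
    }
}
```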
Alternatively, if third-party libraries are acceptable, you might be interested in Guava's UnsignedBytes utility class. (Disclosure: I contribute to Guava.)

Why would you need unsigned types in Java?

I have often heard complaints against Java for not having unsigned data types. See for example this comment. I would like to know how is this a problem? I have been programming in Java for 10 years more or less and never had issues with it. Occasionally when converting bytes to ints a & 0xFF is needed, but I don't consider that as a problem.
Since unsigned and signed numbers are represented with the same bit values, the only places I can think of where signedness matters are:
When converting the numbers to other bit representation. Between 8, 16 and 32 bit integer types you can use bitmasks if needed.
When converting numbers to decimal format, usually to Strings.
Interoperating with non-Java systems through APIs or protocols. Again the data is just bits, so I don't see the problem here.
Using the numbers as memory or other offsets. With 32 bit ints this might be a problem for very large offsets.
Instead I find it easier that I don't need to consider operations between unsigned and signed numbers and the conversions between those. What am I missing? What are the actual benefits of having unsigned types in a programming language and how would having those make Java better?

Occasionally when converting bytes to ints a & 0xFF is needed, but I don't consider that as a problem.
Why not? Is "applying a bitwise AND with 0xFF" actually part of what your code is trying to represent? If not, why should it have to be part of how you write it? I actually find that almost anything I want to do with bytes beyond just copying them from one place to another ends up requiring a mask. I want my code to be cruft-free; the lack of unsigned bytes hampers this :(
Additionally, consider an API which will always return a non-negative value, or only accepts non-negative values. Using an unsigned type allows you to express that clearly, without any need for validation. Personally I think it's a shame that unsigned types aren't used more in .NET, e.g. for things like String.Length, ICollection.Count etc. It's very common for a value to naturally only be non-negative.
Is the lack of unsigned types in Java a fatal flaw? Clearly not. Is it an annoyance? Absolutely.
The comment that you quote hits the nail on the head:
Java's lack of unsigned data types also stands against it. Yes, you can work around it, but it's not ideal and you'll be using code that doesn't really reflect the underlying data correctly.
Suppose you are interoperating with another system, which wants an unsigned 16 bit integer, and you want to represent the number 65535. You claim "the data is just bits, so I don't see the problem here" - but having to pass -1 to mean 65535 is a problem. Any impedance mismatch between the representation of your data and its underlying meaning introduces an extra speedbump when writing, reading and testing the code.
Instead I find it easier that I don't need to consider operations between unsigned and signed numbers and the conversions between those.
The only times you would need to consider those operations is when you were naturally working with values of two different types - one signed and one unsigned. At that point, you absolutely want to have that difference pointed out. With signed types being used to represent naturally unsigned values, you should still be considering the differences, but the fact that you should is hidden from you. Consider:
// This should be considered unsigned - so a value of -1 is "really" 65535
short length = /* some value */;
// This is really signed
short foo = /* some value */;
boolean result = foo < length;
Suppose foo is 100 and length is -1. What's the logical result? The value of length represents 65535, so logically foo is smaller than it. But you'd probably go along with the code above and get the wrong result.
Of course they don't even need to represent different types here. They could both be naturally unsigned values, represented as signed values with negative numbers being logically greater than positive ones. The same error applies, and wouldn't be a problem if you had unsigned types in the language.
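To make the pitfall concrete, here is a minimal runnable version of the snippet with the values filled in. The class UnsignedCompareDemo and the helper lessThanUnsigned are hypothetical; Short.toUnsignedInt is available from Java 8.

```java
class UnsignedCompareDemo {
    // Compares two 16-bit patterns as unsigned values by widening them first.
    static boolean lessThanUnsigned(short a, short b) {
        return Short.toUnsignedInt(a) < Short.toUnsignedInt(b);
    }

    public static void main(String[] args) {
        short length = (short) 65535;  // stored as -1
        short foo = 100;

        System.out.println(foo < length);                   // false: signed comparison
        System.out.println(lessThanUnsigned(foo, length));  // true: the logical answer
    }
}
```

Nothing in the type of length tells the reader or the compiler that the helper is required, which is the point being made.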
You might also want to read this interview with Joshua Bloch (Google cache, as I believe it's gone from java.sun.com now), including:
Ooh, good question... I'm going to say that the strangest thing about the Java platform is that the byte type is signed. I've never heard an explanation for this. It's quite counterintuitive and causes all sorts of errors.

If you like, yes, everything is ones and zeroes. However, your hardware arithmetic and logic unit doesn't work that way. If you want to store your bits in a signed integer value but perform operations that are not natural to signed integers, you will usually waste both storage space and processing time.
An unsigned integer type stores twice as many non-negative values in the same space as the corresponding signed integer type. So if you want to take into Java any data commonly used in a language with unsigned values, such as a POSIX date value (unsigned number of seconds) that is normally used with C, then in general you will need to use a wider integer type than C would use. If you are processing many such values, again you will waste both storage space and fetch-execute time.

The times I have used unsigned data types have been when I read in large blocks of data that correspond to images, or worked with openGL. I personally prefer unsigned if I know something will never be negative, as a "safety feature" of sorts.
Unsigned types are useful for bit-by-bit comparisons, and I'm pretty sure they are used extensively in graphics.

Objective-C and Java Primitive Data Types

I need to convert a piece of code from Objective-C to Java, but I have a problem understanding the primitive types in Objective-C. I have these data types in my Objective-C code:
UInt64, UInt32, UInt8,
which are unsigned integers (as I understand from the internet). So my question is, can I use Java primitive types like byte (8bit) - instead of UInt8, int (32bit) - instead of UInt32, and long (64bit) - instead of UInt64?

Unfortunately, it isn't a straight translation, and without knowing more about your program, it's hard to suggest what the "right" approach is.
If your UInt8 values really range from 0-255, you may have to use Java signed int to be able to hold the entire range.
If you are dealing with byte streams or memory layouts and really need to use just a single byte of memory, then you could try byte, but you may have to test for and handle the cases where the high bit is set (value > 127). Ditto with the other unsigned types.
Ideally, if your code just kind of "defaulted" to the unsigned types, but really the signed versions would have worked fine too (i.e. the ranges of your values never equal or exceed 2^7, 2^15, or 2^31 respectively), then you may be fine with the "straight" translation to byte, int, and long.

Yes, those are the correctly sized data types to use in Java. Make sure you take into account that Java does not have unsigned types; the trick is to widen to the next larger size when you need the unsigned value. 64 bit unsigned arithmetic requires special consideration.
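One way to read "use the next larger size" is to store values in the same-width Java type and widen only when the unsigned value is needed. The following is a sketch assuming Java 8 or later; the class UnsignedWideningDemo and the fromUInt* helpers are names made up here, not an existing API.

```java
class UnsignedWideningDemo {
    // UInt8 stored in a byte, read back as 0..255.
    static int fromUInt8(byte raw) {
        return Byte.toUnsignedInt(raw);
    }

    // UInt32 stored in an int, read back as 0..4294967295.
    static long fromUInt32(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    // UInt64 has no wider primitive, so it is only rendered as text here.
    static String fromUInt64(long raw) {
        return Long.toUnsignedString(raw);
    }

    public static void main(String[] args) {
        System.out.println(fromUInt8((byte) 0xC8));   // 200
        System.out.println(fromUInt32(0xFFFFFFFF));   // 4294967295
        System.out.println(fromUInt64(-1L));          // 18446744073709551615
    }
}
```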

In Java, is it safe to assume a certain size of the primitive types for bitwise operations?

Given Java's "write once, run anywhere" paradigm and the fact that the Java tutorials give explicit bit sizes for all the primitive data types without the slightest hint that this is dependent on anything, I would say that, yes, an int is always 32 bit.
But are there any caveats? The language spec defines the value range, but says nothing about the internal representation, and I guess that it probably shouldn't. However, I have some code which does bitwise operations on int variables that assume 32 bit width, and I was wondering whether that code is safe on all architectures.
Are there good in-depth resources for this type of question?

Java code always works as though ints are 32-bit, regardless of the native architecture.
In the specification, there's also a part that is definitive about representation:
The integral types are byte, short, int, and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers, respectively, and char, whose values are 16-bit unsigned integers representing UTF-16 code units
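Because the width and representation are fixed, bit manipulation on int behaves the same on every platform. A small sketch (the class IntWidthDemo and the helpers pack and unpack are illustrative only):

```java
class IntWidthDemo {
    // Packs four values 0..255 into one int, most significant byte first.
    static int pack(int b3, int b2, int b1, int b0) {
        return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }

    // Extracts byte n (0 = least significant) as a value 0..255.
    static int unpack(int word, int n) {
        return (word >>> (8 * n)) & 0xFF;
    }

    public static void main(String[] args) {
        int word = pack(0xFF, 0x00, 0x12, 0x34);

        System.out.println(Integer.toHexString(word));  // ff001234
        System.out.println(unpack(word, 3));            // 255
        System.out.println(word >> 24);                 // -1 (sign-extending shift)
        System.out.println(word >>> 24);                // 255 (zero-filling shift)
        System.out.println(1 << 32);                    // 1 (int shift distance is taken modulo 32)
    }
}
```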

While the behaviour of Java's primitives is specified completely and exactly in the language spec, there is one caveat: on a 64bit architecture, it's possible that ints will be word-aligned, which means that an array of ints (or any non-64bit primitive type) could take twice as much memory as on a 32bit architecture.

You may also want to check the JVM spec: each bitwise operation has its own opcode (ISHL, IOR, IAND, etc.).

Yes, there is no sizeof operator in Java.
According to Bruce Eckel's Thinking in Java:
short: 16 bits
int: 32 bits
long : 64 bits
These values don't vary between architectures.
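Although there is no sizeof, the wrapper classes expose the fixed widths as constants, so code can refer to them by name. A quick check (the class name PrimitiveSizeDemo is arbitrary; the BYTES constants need Java 8 or later):

```java
class PrimitiveSizeDemo {
    public static void main(String[] args) {
        System.out.println(Byte.SIZE);       // 8
        System.out.println(Short.SIZE);      // 16
        System.out.println(Character.SIZE);  // 16
        System.out.println(Integer.SIZE);    // 32
        System.out.println(Long.SIZE);       // 64
        System.out.println(Integer.BYTES);   // 4
    }
}
```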

This isn't an answer because there is already a good answer but I thought I'd point out that the reason this is so for Java but it wasn't for C or C++ is that Java compiles to a virtual machine (the Java VM or JVM). Because the JVM runs the same bytecode and has the same internal structure no matter which machine it is on, it seems to have the same size for primitive types on every machine. C and C++ did not try to emulate any particular behaviors and were subject to the whims of processor implementations on a variety of machines.
