What is Java Byte Array and How should it be used? - java

What is meant by a byte array? Does it hold the 0s and 1s just like data is held in memory?
For example
String a = "32";
byte [] arr = a.getBytes() ;
What is now inside the arr array, and why and when should it be used?

A byte array is literally an array in which each element is of the primitive type byte. If you do not know the difference between a byte and an int, the main difference is the bit width: a byte is 8 bits wide and an int is 32 bits wide.
If you do not know what an array is, an array is basically a sequence of items (in your case a sequence of bytes, declared as byte[]).
The method a.getBytes() takes a, which is a String, and returns an array of bytes. This works because the characters in a String can be encoded as numbers, where the mapping between numbers and characters is determined by a Charset. Two common charsets are ASCII and UTF-8. Called without an argument, getBytes() uses the platform's default charset, so passing one explicitly, for example a.getBytes(StandardCharsets.UTF_8), gives predictable results. For ASCII text each character becomes exactly one byte; in UTF-8, characters outside ASCII take two to four bytes. In both ASCII and UTF-8, the String "32" is represented by the bytes 51 and 50 in decimal, or 0x33 and 0x32 in hexadecimal, so arr holds {51, 50}.
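As a small illustration of the paragraph above, the following sketch encodes two strings with an explicit charset. The class and method names (GetBytesDemo, utf8Bytes) are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class GetBytesDemo {
    // Encodes text with an explicit charset so the result does not
    // depend on the platform default.
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // '3' is 51 and '2' is 50 in ASCII and UTF-8.
        System.out.println(Arrays.toString(utf8Bytes("32")));      // [51, 50]
        // A non-ASCII character (U+00E9) takes two bytes in UTF-8.
        System.out.println(Arrays.toString(utf8Bytes("\u00e9")));  // [-61, -87]
    }
}
```

The negative numbers in the second line are only how Java prints bytes whose highest bit is set; the bit patterns are 0xC3 and 0xA9.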
Byte arrays are commonly used in applications that read and write data byte-wise, such as socket connections that send data in byte streams through TCP or UDP protocols.
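A minimal sketch of byte-wise I/O, using in-memory streams as a stand-in for a socket's streams (an assumption made so the example runs without a network). The class name ByteStreamDemo and the method roundTrip are hypothetical, and InputStream.readAllBytes() requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ByteStreamDemo {
    // Writes a message as bytes to an OutputStream, then reads the same
    // bytes back from an InputStream and decodes them again.
    static String roundTrip(String message) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            OutputStream out = sink; // a socket's output stream would be used the same way
            out.write(message.getBytes(StandardCharsets.UTF_8));
            out.flush();

            InputStream in = new ByteArrayInputStream(sink.toByteArray());
            byte[] received = in.readAllBytes();
            return new String(received, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```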
Hope I could help!

Related

Java 11 Compact Strings magic behind char[] to byte[]

I have been reading about encoding, Unicode and Java 9 compact Strings for the last two days and I am getting along quite well, but there is something that I don't understand.
About the byte data type
1). It is 8-bit storage with a range from -128 to 127
Questions
1). Why didn't Java implement it as unsigned, like char, which is unsigned 16 bits? I mean it would then have a range of 0 to 255, because from 0 to 127 I can only hold an ASCII value. What happens if I set the value 200, an extended ASCII value? It would overflow to -56.
2). Does the negative value mean something? I have tried a simple example using Java 11
final char value = (char)200;//in byte would overflow
final String stringValue = new String(new char[]{value});
System.out.println(stringValue);//THE SAME VALUE OF JAVA 8
I have checked the String.value variable and I see a byte array of
System.out.println(value[0]);//-56
The same questions as before arise: does the -56 mean something? I mean the negative value; in other languages, is this overflow detected so that the value returns to 200? How can Java know that the value -56 is the same as 200 in char?
I have tried harder examples, like code point 128048, and I see in the String.value variable an array of bytes like this.
0 = 61
1 = -40
2 = 48
3 = -36
I know this code point takes 4 bytes, and I get how the char[] is transformed to byte[], but I don't know how String handles this byte[] data.
Sorry if this question is simple, and sorry for any typos; English is not my native language. Thanks a lot.
Why didn't Java implement it as unsigned, like char, which is unsigned 16 bits? I mean it would then have a range of 0 to 255, because from 0 to 127 I can only hold an ASCII value. What happens if I set the value 200, an extended ASCII value? It would overflow to -56.
Java’s primitive data types were settled with Java 1.0 a quarter century ago. The compact strings were introduced in Java 9, less than two years ago. This new feature, which is merely an implementation detail, did not justify fundamental changes to Java’s type system.
Besides that, you are looking at one interpretation of the data stored in a byte. For the sake of representing iso-latin-1 units, it is entirely irrelevant whether interpreting the same data as Java’s built-in signed byte would result in a positive or negative number.
Likewise, Java’s I/O API allows reading a file into a byte[] array and writing byte[] arrays back to files, and these two operations are already sufficient to copy a file losslessly, regardless of its file format, which would be relevant only when interpreting its content.
So the following works since Java 1.1:
byte[] bytes = "È".getBytes("iso-8859-1");
System.out.println(bytes[0]);
System.out.println(bytes[0] & 0xff);
-56
200
The two numbers, -56 and 200 are just different interpretations of the bit pattern 11001000 whereas the iso-latin-1 interpretation of a byte containing the bit pattern 11001000 is the character È.
A char value is also just an interpretation of a two byte quantity, i.e. as UTF-16 code unit. Likewise, a char[] array is a sequence of bytes in the computer’s memory with a standard interpretation.
We can also interpret other byte sequences this way.
StringBuilder sb = new StringBuilder().appendCodePoint(128048);
byte[] array = new byte[4];
StandardCharsets.UTF_16LE.newEncoder()
.encode(CharBuffer.wrap(sb), ByteBuffer.wrap(array), true);
System.out.println(Arrays.toString(array));
will print the value you’ve seen, [61, -40, 48, -36].
The advantage of using a byte[] array inside the String class is that the interpretation can now be chosen: iso-latin-1 when all characters are representable in this encoding, or utf-16 otherwise.
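The choice described above can be sketched roughly as follows. This is not the JDK's internal code; the class CompactEncodingSketch and the method encodeCompactly are hypothetical, and UTF-16LE is assumed only because it matches the byte order of the values shown in the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CompactEncodingSketch {
    // Hypothetical helper: one byte per char if everything fits into
    // iso-latin-1, otherwise two bytes per char (UTF-16, little-endian).
    static byte[] encodeCompactly(String s) {
        boolean fitsLatin1 = StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
        return s.getBytes(fitsLatin1 ? StandardCharsets.ISO_8859_1
                                     : StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // U+00C8 fits into iso-latin-1: a single byte with bit pattern 11001000.
        System.out.println(Arrays.toString(encodeCompactly("\u00c8")));  // [-56]
        // Code point 128048 needs a surrogate pair, so UTF-16 is used.
        String beyondLatin1 = new StringBuilder().appendCodePoint(128048).toString();
        System.out.println(Arrays.toString(encodeCompactly(beyondLatin1)));  // [61, -40, 48, -36]
    }
}
```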
The possible numeric interpretations are irrelevant to the string. However, when you ask “How can Java know that -56 value is the same as 200”, you should ask yourself, how does it know that the bit pattern 11001000 of a byte is -56 in the first place?
System.out.println(value[0]);
involves an operation that is actually expensive compared to ordinary computer arithmetic: the conversion of a byte (or an int) to a String. This conversion is often overlooked because it has been defined as the default way of printing a byte, but it is no more natural than a conversion to a String that interprets the value as an unsigned quantity. For further reading, I recommend Two's complement.
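To make the point concrete, here is a short sketch that prints one and the same byte in three ways. Byte.toUnsignedInt exists since Java 8; the class UnsignedByteDemo and the helper bits are names invented for this example.

```java
public class UnsignedByteDemo {
    // Renders the eight bits of a byte, including leading zeros.
    static String bits(byte b) {
        return String.format("%8s", Integer.toBinaryString(b & 0xff)).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte b = (byte) 200;                        // stores the bit pattern 11001000
        System.out.println(b);                      // -56 (signed interpretation)
        System.out.println(Byte.toUnsignedInt(b));  // 200 (unsigned interpretation)
        System.out.println(bits(b));                // 11001000
    }
}
```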
This is because not all bytes in a string are interpreted the same way. The interpretation depends on the string's character encoding.
Example:
in UTF-8, a character takes 1 to 4 bytes; ASCII characters take exactly 1 byte.
in UTF-16, a character takes 2 bytes, or 4 bytes for characters outside the Basic Multilingual Plane.
etc...
This means that if the bytes are decoded as UTF-8, ASCII characters are produced by reading 1 byte at a time; if they are decoded as UTF-16, characters are produced by reading 2 bytes at a time.
Look at this code: a single byte array data is transformed to string using UTF-8 and UTF-16.
byte[] data = new byte[] {97, 98, 99, 100};
System.out.println(new String(data, StandardCharsets.UTF_8));
System.out.println(new String(data, StandardCharsets.UTF_16));
The output of this code is:
abcd // 4 bytes = 4 chars, 1 byte per char
慢捤 // 4 bytes = 2 chars, 2 byte per char
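Going the other way, from characters to bytes, shows how many bytes each encoding needs per character. This is only a sketch; the class EncodedLengths and its methods are made-up names, and UTF_16BE is used so that no byte order mark is added to the output.

```java
import java.nio.charset.StandardCharsets;

public class EncodedLengths {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "a",                                    // ASCII
            "\u00c8",                               // Latin-1 letter
            "\u20ac",                               // euro sign
            new String(Character.toChars(128048))   // code point outside the BMP
        };
        for (String s : samples) {
            // Prints "1 2", "2 2", "3 2", "4 4" (UTF-8 bytes, UTF-16 bytes).
            System.out.println(utf8Length(s) + " " + utf16Length(s));
        }
    }
}
```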
Going back to the question, what motivated the developers was reducing the memory footprint of strings. Not all strings use all of the 16 bits a char offers.

How byte array in Java is used to represent a binary data?

I have read that a byte array in Java is used to represent binary data. I am not able to understand this. How can a byte array represent binary data (which can be transferred over the network and reconstructed into its original form)?
A byte can have (integer) values from -128 to 127, so how does a byte array represent binary data?
Byte can be (integer) values -128 to 127, so how does a byte
array represent a binary data?
Each byte (octet) is a sequence of eight bits, and a sequence of bytes lets us represent binary data of any length (though only in increments of 8 bits).
Memory of most modern computers is addressed as a sequence of bytes, network interfaces send packets containing sequences of bytes, hard drives store sequences of bytes (but are addressable only in much larger blocks, say, 4096 bytes).
There is rarely a need to access data bit by bit, and when it is needed it can be done with bitwise operators, so there is no primitive data type for a sequence of bits.
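Bit-level access with bitwise operators could look like the sketch below. The class BitAccess and its methods are hypothetical, and counting bit 0 as the most significant bit of the first byte is a convention chosen only for this example.

```java
public class BitAccess {
    // Returns true if the bit at the given position is set.
    // Position 0 is the most significant bit of data[0].
    static boolean getBit(byte[] data, int index) {
        return ((data[index / 8] >> (7 - index % 8)) & 1) == 1;
    }

    // Sets the bit at the given position to 1.
    static void setBit(byte[] data, int index) {
        data[index / 8] |= (byte) (1 << (7 - index % 8));
    }

    public static void main(String[] args) {
        byte[] data = new byte[2];   // 16 bits, all zero
        setBit(data, 0);             // data[0] becomes 10000000
        setBit(data, 15);            // data[1] becomes 00000001
        System.out.println(getBit(data, 0));   // true
        System.out.println(getBit(data, 1));   // false
        System.out.println(getBit(data, 15));  // true
    }
}
```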
So to conclude:
1 Byte == 8 bits, and Byte Array == stream of bits,
and hence represent binary data?
Yes. For example: A Byte Array of length 2 bytes is a stream of 16 bits of binary data.
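As a sketch of how a value can be turned into bytes, sent somewhere, and reconstructed, the following uses java.nio.ByteBuffer, which writes in big-endian order by default. The class IntRoundTrip and its methods are names invented for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class IntRoundTrip {
    // Splits a 32-bit int into its four bytes, most significant first.
    static byte[] toBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Rebuilds the int from four bytes in the same order.
    static int fromBytes(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = toBytes(200);
        System.out.println(Arrays.toString(wire));  // [0, 0, 0, -56]
        System.out.println(fromBytes(wire));        // 200
    }
}
```

The receiver only needs to know how the bytes are to be interpreted (here: one big-endian 32-bit integer) to recover the original value.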

Java vs. C#: BigInteger hex string yields different result?

Question:
This code in Java:
BigInteger mod = new BigInteger("86f71688cdd2612ca117d1f54bdae029", 16);
produces (in java) the number
179399505810976971998364784462504058921
However, when I use C#,
BigInteger mod = BigInteger.Parse("86f71688cdd2612ca117d1f54bdae029", System.Globalization.NumberStyles.HexNumber); // base 16
I don't get the same number, I get:
-160882861109961491465009822969264152535
However, when I create the number directly from decimal, it works
BigInteger mod = BigInteger.Parse("179399505810976971998364784462504058921");
I tried converting the hex string in a byte array and reversing it, and creating a biginteger from the reversed array, just in case it's a byte array with different endianness, but that didn't help...
I also encountered the following problem when converting Java-Code to C#:
Java
BigInteger k0 = new BigInteger(byte[]);
to get the same number in C#, I must reverse the array because of different Endianness in the biginteger implementation
C# equivalent:
BigInteger k0 = new BigInteger(byte[].Reverse().ToArray());
Here's what MSDN says about BigInteger.Parse:
If value is a hexadecimal string, the Parse(String, NumberStyles) method interprets value as a negative number stored by using two's complement representation if its first two hexadecimal digits are greater than or equal to 0x80. In other words, the method interprets the highest-order bit of the first byte in value as the sign bit. To make sure that a hexadecimal string is correctly interpreted as a positive number, the first digit in value must have a value of zero. For example, the method interprets 0x80 as a negative value, but it interprets either 0x080 or 0x0080 as a positive value.
So, add a 0 in front of the parsed hexadecimal number to force an unsigned interpretation.
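The same sign-bit rule can be observed on the Java side, where the BigInteger(byte[]) constructor treats the bytes as two's complement while BigInteger(int signum, byte[] magnitude) does not. The sketch below is an illustration in Java only; the class HexSignDemo and the helper hexToBytes are hypothetical names.

```java
import java.math.BigInteger;

public class HexSignDemo {
    // Converts a hex string with an even number of digits to raw bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Two's complement interpretation: a leading byte >= 0x80 means negative.
    static BigInteger signed(String hex) {
        return new BigInteger(hexToBytes(hex));
    }

    // Magnitude interpretation with an explicit positive sign.
    static BigInteger unsigned(String hex) {
        return new BigInteger(1, hexToBytes(hex));
    }

    public static void main(String[] args) {
        String hex = "86f71688cdd2612ca117d1f54bdae029";
        System.out.println(signed(hex));    // negative, because the first byte is 0x86
        System.out.println(unsigned(hex));  // same value as new BigInteger(hex, 16)
    }
}
```

The two results differ by exactly 2^128, which is the difference between the signed and unsigned reading of a 128-bit pattern.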
As for round-tripping a big integer represented by a byte array between Java and C#, I'd advise against that, unless you really have to. But both implementations happen to use a compatible two's complement representation, if you fix the endianness issue.
MSDN says:
The individual bytes in the array returned by this method appear in little-endian order. That is, the lower-order bytes of the value precede the higher-order bytes. The first byte of the array reflects the first eight bits of the BigInteger value, the second byte reflects the next eight bits, and so on.
Java docs say:
Returns a byte array containing the two's-complement representation of this BigInteger. The byte array will be in big-endian byte-order: the most significant byte is in the zeroth element.
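On the Java side, converting between the two byte orders quoted above amounts to reversing the array. A minimal sketch, with EndianSwap and reversed as made-up names:

```java
import java.math.BigInteger;
import java.util.Arrays;

public class EndianSwap {
    // Returns a copy of the input with the byte order reversed.
    static byte[] reversed(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[in.length - 1 - i];
        }
        return out;
    }

    public static void main(String[] args) {
        // 258 is 0x0102; Java returns the most significant byte first.
        byte[] bigEndian = BigInteger.valueOf(258).toByteArray();
        System.out.println(Arrays.toString(bigEndian));            // [1, 2]
        // Little-endian order, as described in the MSDN quote.
        System.out.println(Arrays.toString(reversed(bigEndian)));  // [2, 1]
    }
}
```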

Bit masking java, only showing last 6 bits of a hex

I am playing around with how to manipulate bytes from an inputted hex number. data is a hex value:
0x022DA822 == 10001011011010100000100010. After I run the following code:
byte mask= (byte) data;
mask will be 100010, only those last bits. How come it only shows the last 6 bits, or 22 in hex?
Does it mask the first 20 bits by default?
Your cast is causing a loss of data. A byte holds (you guessed it) one byte, that is 8 bits, so its range is [-128, 127], with the most significant bit acting as the sign bit. When you write (byte) data, Java performs a narrowing conversion that keeps only the lowest 8 bits of the int and discards the upper 24. For 0x022DA822 the low byte is 0x22, which is 00100010 in binary. It is printed as 100010 only because leading zeros are not shown, so you are seeing 8 bits, not 6.
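A small sketch of the narrowing cast, using the value from the question. The class ByteCastDemo and the method lowByteBits are hypothetical names.

```java
public class ByteCastDemo {
    // Casts to byte (keeping the lowest 8 bits) and renders all 8 bits,
    // including the leading zeros that toBinaryString would omit.
    static String lowByteBits(int data) {
        byte low = (byte) data;
        return String.format("%8s", Integer.toBinaryString(low & 0xff)).replace(' ', '0');
    }

    public static void main(String[] args) {
        int data = 0x022DA822;
        byte mask = (byte) data;
        System.out.println(Integer.toHexString(mask));     // 22
        System.out.println(Integer.toBinaryString(mask));  // 100010
        System.out.println(lowByteBits(data));             // 00100010
    }
}
```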

outputStream.write() for int[] array

I'm looking for something that does the same thing as outputStream.write() but accepts an array of int.
Currently I'm using outputStream.write(), but it accepts only byte, byte[] or int.
I could use a byte[], but the values I want to send are
[255,44,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255]
so I can't use a byte[], because its range is only from -128 to 127 :/
It's to send a command to a COM port that accepts only packets of 19 bytes, which must start and end with 255.
This is a common misconception about bytes, because of rumors repeated again and again.
Actually, the range of byte is from
00000000 (binary) to 11111111 (binary)
There is no reason to interpret bytes as numbers, if you're only interested in the bit patterns. There is, in particular, no reason to interpret bytes as signed numbers, just because java does it that way by default.
Hence, go ahead, as Jon Skeet says, cast your integers to byte and write those bytes.
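A minimal sketch of that advice, with a ByteArrayOutputStream standing in for the real port's output stream (an assumption so the example runs anywhere). The class IntArrayWriter and the method toBytes are made-up names.

```java
import java.io.ByteArrayOutputStream;

public class IntArrayWriter {
    // Casts each int to a byte, keeping its lowest 8 bits.
    // 255 becomes the bit pattern 11111111, which Java prints as -1.
    static byte[] toBytes(int[] values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] packet = {255, 44, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255};
        byte[] bytes = toBytes(packet);

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        outputStream.write(bytes, 0, bytes.length);

        // The receiver sees the same bit patterns: the first byte is 255 again
        // when read as an unsigned value.
        System.out.println(outputStream.toByteArray()[0] & 0xff);  // 255
    }
}
```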
