Eliminate default zeros while creating string from byte array - java

I am reading bytes from an I/O stream and converting them to a string. From that string I extract a sequence using the substring API.
The byte array is 128 bytes long. If the stream contains only 10 bytes, the remaining bytes keep their initial value of zero. I convert the byte array to a string by passing it to the constructor new String(byte[]) and then check the length. The length is 128. Why is it 128? I expected the length of the 10 characters that were actually read.
How can I eliminate the zeros while converting to a string? Is there an API that drops the default zeros in a byte array? They cause problems when I create a substring from the constructed string.
byte[] b = { 99, 116, 101, 100, 46, 13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0};
System.out.println("byte length = " + b.length);
String str;
try {
str = new String(b, "UTF-8");
System.out.println("String length = " + str.length());
System.out.println(str);
System.out.println(" ## substring = " + str.substring(0));
System.out.println(" substring length = "
+ str.substring(0).length());
System.out.println("Done......");
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

To create a String from part of a byte array, use the constructor String(byte[] bytes, int offset, int length, String charsetName). Example:
// uses the first 10 bytes of b
str = new String(b, 0, 10, "UTF-8");
Also, if you're compiling for Java 7 or later, you might as well use StandardCharsets (from the java.nio.charset package) and avoid having to handle UnsupportedEncodingException. Example:
str = new String(b, 0, 10, StandardCharsets.UTF_8);

When you read from an InputStream, it will tell you how many bytes were read. The length of the byte[] itself is mostly irrelevant (other than defining the max number of bytes which could be read in a single call). There should be no need to later go examine the byte[] to try and determine how much of the data is relevant. Pay attention to the return value from read and use that when creating a String.
Additionally, if all of your data is text, consider using an InputStreamReader, perhaps in combination with a BufferedReader.
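The advice above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the class and method names are made up for the example, and a ByteArrayInputStream stands in for the real stream. Note that a single read call may return fewer bytes than the sender wrote, so real code usually loops until it has a complete message.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadCountExample {
    // Decodes only the first 'count' bytes of the buffer; the rest is ignored.
    static String decode(byte[] buffer, int count) {
        return new String(buffer, 0, count, StandardCharsets.UTF_8);
    }

    // Reads once into a 128-byte buffer and decodes what was actually read.
    static String readOnce(InputStream in) throws IOException {
        byte[] buffer = new byte[128];
        int count = in.read(buffer); // number of bytes read, or -1 at end of stream
        return count <= 0 ? "" : decode(buffer, count);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real stream: the six meaningful bytes from the question.
        byte[] data = { 99, 116, 101, 100, 46, 13 };
        String str = readOnce(new ByteArrayInputStream(data));
        System.out.println("String length = " + str.length()); // String length = 6
    }
}
```

Because the count comes from read, there is no need to scan the array for zeros afterwards, and zero bytes that are genuinely part of the data are preserved.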

First, an explanation.
Not every byte sequence is valid UTF-8. A binary byte 0 (0x00) is valid, however, and does not terminate a String as it does in C.
In fact, the terminating \0 was later deplored by either C's Kernighan or Ritchie as being suboptimal.
To prevent problems, not only are Unicode code points above U+007F (0x7f) multi-byte encoded (with the high bits of the bytes set), but so is U+0000 in Java's modified UTF-8, as used by DataOutputStream.
byte[] bytes = get UTF-8 bytes from string
Now bytes could have a multi-byte sequence for the code point 0.
So you can either clean up the bytes with a small loop, or clean up the string:
str = str.replace("\u0000", ""); // removes every NUL character
str = str.replaceFirst("\u0000+$", ""); // removes only trailing NUL characters (regex)
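To make the two encodings concrete, here is a small sketch (class and method names are invented for the example). It shows that a zero byte decodes to a real character in standard UTF-8, and that DataOutputStream.writeUTF, which uses modified UTF-8, writes U+0000 as the two bytes 0xC0 0x80 after its two-byte length prefix.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class NulEncodingExample {
    // Standard UTF-8: every zero byte becomes the character U+0000.
    static int decodedLength(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8).length();
    }

    // Modified UTF-8 as produced by writeUTF: 2-byte length, then the data.
    static byte[] writeUtf(String s) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            new DataOutputStream(out).writeUTF(s);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodedLength(new byte[] { 99, 0, 0 })); // 3
        System.out.println(Arrays.toString(writeUtf("\0")));        // [0, 2, -64, -128]
    }
}
```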

Your code would be like this
byte[] b = { 99, 116, 101, 100, 46, 13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0};
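// Number of bytes up to and including the last non-zero byte (0 if all bytes are zero)
int nonZeroPos = 0;
for (int i = b.length - 1; i >= 0; i--) {
if (b[i] != 0) {
nonZeroPos = i + 1;
break;
}
}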
System.out.println("byte length = " + b.length);
String str;
try {
str = new String(b, 0, nonZeroPos, "UTF-8");
System.out.println("String length = " + str.length());
System.out.println(str);
System.out.println(" ## substring = " + str.substring(0));
System.out.println(" substring length = "
+ str.substring(0).length());
System.out.println("Done......");
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
You could also do it this way:
String zerostring = new String(new byte[]{0});
str=new String(b).replace(zerostring , "");
System.out.println(str);
The disadvantage is that this also removes any zero bytes that occur inside the text, not just the trailing ones.

Related

getting the binary data represented by the hexadecimal string back in java vs python

I know that in Python, binascii.unhexlify(initValue)
returns the binary data represented by the hexadecimal string.
I am trying to port binascii.unhexlify(initValue) to Java.
I tried the following line in Java, but I am getting different results than in Python:
DatatypeConverter.parseHexBinary(value);
I run the following example:
my input - hexadecimal string:
value = '270000f31d32d1051400000000000000000000000006000000000000000000000000000000000000'
when running in python:
result = binascii.unhexlify(value)
I am getting:
result = "'\x00\x00\xf3\x1d2\xd1\x05\x14\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
when running in java:
byte[] bytes = DatatypeConverter.parseHexBinary(value);
I am getting:
bytes = [39, 0, 0, -13, 29, 50, -47, 5, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
1. Why am I getting different results?
2. Why does the Python output contain '\' characters?
The first byte of your result, "'", is exactly 39 as a signed char. In Python, you can use the built-in function ord("'") to get 39.
You can probably get what you want with this Python code:
value = '270000f31d32d1051400000000000000000000000006000000000000000000000000000000000000'
result = binascii.unhexlify(value)
bytes = [ord(x) for x in result]
You will get these unsigned char values:
[39, 0, 0, 243, 29, 50, 209, 5, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
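The same comparison can be made from the Java side. Java's byte type is signed, so 0xf3 prints as -13 (243 - 256); masking with 0xFF gives the unsigned value that Python shows. The sketch below uses a small hand-written hex parser (an assumption for the example, so that it does not depend on DatatypeConverter) and only the first eight bytes of the input.

```java
import java.util.Arrays;

class UnsignedBytesExample {
    // Parses a hex string into bytes (two hex digits per byte).
    static byte[] parseHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Widens each signed byte to its unsigned value in the range 0..255.
    static int[] toUnsigned(byte[] bytes) {
        int[] out = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i] & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = parseHex("270000f31d32d105");
        System.out.println(Arrays.toString(bytes));             // [39, 0, 0, -13, 29, 50, -47, 5]
        System.out.println(Arrays.toString(toUnsigned(bytes))); // [39, 0, 0, 243, 29, 50, 209, 5]
    }
}
```

Both languages hold the same bytes; only the way they are printed differs.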

how to convert string to collection

I have a string in the below format.
[[115, 1, 0123490, 63824005632, 0036760004, , 01, N, 78, , 7481067028,
122016, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
TABORA, EMMANUEL, J, 4732 WENATCHIE TRL, LIMA, OH, 45805, EM, RXRELIEF CARD,
MUCINEX DM 20 0056-32 TAB SA 12HR 600-30MG], [115, 1, 0123490,
63824005632, 0038380001, ,
01, N, 78, , 7481067028, 122016, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, TABORA, EMMANUEL, J, 4732 WENATCHIE TRL, LIMA,
OH, 45805, EM, APEX AFFINITY DISCOUNT CARD, MUCINEX DM 20 0056-32 TAB SA
12HR 600-30MG]]
I want to store each of the following entries as a separate element of a collection:
[115, 1, 0123490, 63824005632, 0038380001, , 01, N, 78, , 7481067028,
122016, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
TABORA, EMMANUEL, J, 4732 WENATCHIE TRL, LIMA, OH, 45805, EM, APEX AFFINITY
DISCOUNT CARD, MUCINEX DM 20 0056-32 TAB SA 12HR 600-30MG]
[115, 1, 0123490, 63824005632, 0036760004, , 01, N, 78, , 7481067028,
122016, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
TABORA, EMMANUEL, J, 4732 WENATCHIE TRL, LIMA, OH, 45805, EM, RXRELIEF CARD,
MUCINEX DM 20 0056-32 TAB SA 12HR 600-30MG]
How can I split the string and store the entries in a collection?
This should work as long as the character ] does not appear as part of a value inside an entry:
public static void main(String[] args) {
String clob = "[[115, 1, 0123490, 63824005632, 0036760004, , 01, N, 78, , 7481067028, 122016, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, TABORA, EMMANUEL, J, 4732 WENATCHIE TRL, LIMA, OH, 45805, EM, RXRELIEF CARD, MUCINEX DM 20 0056-32 TAB SA 12HR 600-30MG], [115, 1, 0123490, 63824005632, 0038380001, , 01, N, 78, , 7481067028, 122016, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, TABORA, EMMANUEL, J, 4732 WENATCHIE TRL, LIMA, OH, 45805, EM, APEX AFFINITY DISCOUNT CARD, MUCINEX DM 20 0056-32 TAB SA 12HR 600-30MG]]";
List<String> entries = new ArrayList<>();
int start = 1;
while (true) {
start = clob.indexOf("[", start);
int end = clob.indexOf("]", start);
if (start != -1 && end != -1) {
entries.add(clob.substring(start, end + 1));
start = end + 1;
} else {
break;
}
}
}
If you know the escape sequence for characters inside your entries (e.g. \]), you have to check whether the end index that was found belongs to that escape sequence and, if so, search again starting after that index.
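A rough sketch of that escape check follows. It assumes, purely for illustration, that a literal ] inside a value is written as \] and that backslashes themselves are never escaped; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class EscapedEntrySplitter {
    // Splits "[[a, b], [c]]" into the entries "[a, b]" and "[c]".
    // Assumption: a ']' that belongs to a value is preceded by a backslash.
    static List<String> split(String clob) {
        List<String> entries = new ArrayList<>();
        int start = clob.indexOf('[', 1); // skip the outer '['
        while (start != -1) {
            int end = clob.indexOf(']', start);
            // Keep searching while the ']' that was found is escaped.
            while (end > 0 && clob.charAt(end - 1) == '\\') {
                end = clob.indexOf(']', end + 1);
            }
            if (end == -1) {
                break; // unterminated entry
            }
            entries.add(clob.substring(start, end + 1));
            start = clob.indexOf('[', end + 1);
        }
        return entries;
    }

    public static void main(String[] args) {
        System.out.println(split("[[a, b\\], c], [d]]")); // [[a, b\], c], [d]]
    }
}
```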
There are many ways to do it. Here’s my suggestion:
public static List<List<String>> stringTo2DList(String input) {
if (input.equals("[]")) {
return Collections.emptyList();
}
if (! input.startsWith("[[")) {
throw new IllegalArgumentException("Not a list of lists");
}
if (! input.endsWith("]]")) {
throw new IllegalArgumentException("Not a list of lists");
}
List<List<String>> result = new ArrayList<>();
String[] innerLists = input.substring(2, input.length() - 2).split("\\], \\[");
for (String innerList : innerLists) {
// check for empty inner list
if (innerList.isEmpty()) {
result.add(Collections.emptyList());
} else {
result.add(Arrays.asList(innerList.split(", ")));
}
}
return result;
}
Should your string contain [], I am interpreting it as an empty list, even though it might be a list of one element, the empty string. If you prefer the latter, just skip the check for the empty list in the for loop.

Read string from USB HID RFID Reader with Java

I'm trying to read a String from an RFID reader connected via USB. The reader is recognized correctly inside my application, but I do not know how to read the transferred characters into a String.
If I do not detach the device, the String is typed as if by a keyboard (as you would expect from a HID). What I want is to capture that String inside my Java application only. This is why I detach the USB device.
For example, my application prints '"''#$&' to the console (see code below) or something like this:
[0, 0, 39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 34, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 38, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
But what I wanted to read is 05006790.
I think there is a silly mistake in my attempt. I hope that someone can help me figure out how to read the bytes into a String correctly.
Thank you very much.
CODE FOLLOWS HERE
Context context = new Context();
int result = LibUsb.init(context);
DeviceList list = new DeviceList();
result = LibUsb.getDeviceList(context, list);
for (Device device: list)
{
int address = LibUsb.getDeviceAddress(device);
int busNumber = LibUsb.getBusNumber(device);
DeviceDescriptor descriptor = new DeviceDescriptor();
DeviceHandle handle = new DeviceHandle();
int resultOpen = LibUsb.open(device, handle);
// if (resultOpen < 0) // handle = null;
int resultDescriptor = LibUsb.getDeviceDescriptor(device, descriptor);
// if (resultDescriptor< 0) // handle = null;
if(descriptor.idVendor() == 0x8ff && descriptor.idProduct() == 0x0009)
{
System.out.println("found");
LibUsb.detachKernelDriver(handle, 0);
}
}
UsbServices services = UsbHostManager.getUsbServices();
UsbDevice deviceHigh = findDevice(services.getRootUsbHub(), (short) 0x8ff, (short) 0x0009);
if(deviceHigh != null)
{
System.out.println("found high");
UsbConfiguration configuration = deviceHigh.getActiveUsbConfiguration();
UsbInterface iface = configuration.getUsbInterface((byte) 0x00);
iface.claim();
UsbEndpoint endpoint = iface.getUsbEndpoint((byte) 0x81);
UsbPipe pipe = endpoint.getUsbPipe();
pipe.open();
byte[] buffer = new byte[128];
int rx = 0;
rx = pipe.syncSubmit(buffer);
System.out.printf("%d bytes received\n", rx);
System.out.println(Arrays.toString(buffer));
iface.release();
}
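One possible reading of the dump above: the non-zero values are not ASCII but USB HID keyboard usage IDs, where 0x1E to 0x26 are the keys 1 to 9 and 0x27 is the key 0. Under that reading, 39, 34, 39, 39, 35, 36, 38, 39 spells 05006790. The sketch below assumes the buffer holds consecutive 8-byte keyboard reports with the first key code at byte 2, which matches the spacing in the dump but is not confirmed for this device; the class and method names are hypothetical.

```java
class HidKeyDecoder {
    // Decodes digit keys from consecutive 8-byte keyboard reports.
    // Assumed layout: byte 0 modifiers, byte 1 reserved, byte 2 first key code.
    static String decodeDigits(byte[] buffer, int length) {
        StringBuilder sb = new StringBuilder();
        for (int offset = 0; offset + 8 <= length; offset += 8) {
            int usage = buffer[offset + 2] & 0xFF;
            if (usage >= 0x1E && usage <= 0x26) {
                sb.append((char) ('1' + (usage - 0x1E))); // keys 1..9
            } else if (usage == 0x27) {
                sb.append('0');                           // key 0
            }
            // Usage 0 (no key pressed) and all other keys are skipped in this sketch.
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Rebuild the 128-byte dump from the question: one key code every 16 bytes.
        byte[] buffer = new byte[128];
        int[] codes = { 39, 34, 39, 39, 35, 36, 38, 39 };
        for (int i = 0; i < codes.length; i++) {
            buffer[2 + 16 * i] = (byte) codes[i];
        }
        System.out.println(decodeDigits(buffer, buffer.length)); // 05006790
    }
}
```

In the question's code this would be called with the buffer and the rx count returned by syncSubmit, rather than with a rebuilt array.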

use fileReader in java to read one character at a time

As part of a school exercise, I am trying to read characters from a text file and count how often each character appears in it. I store the frequencies in an array, where the index is the ASCII code of the char and the value at that index is its frequency.
int c;
FileReader fr = new FileReader (inputFile);
int [] freq = new int [200];
while ( (c= fr.read())!= -1){
int index = c;
freq [index]= freq [index]+1;
}
PrintWriter pw = new PrintWriter(new FileWriter(outputFile));
for (int i = 0; i < freq.length; i++) {
if (freq[i] != 0) {
pw.println(((char) i) + " " + freq[i]);
}
}
pw.close();
Somehow this method only works with text files that contain a single line, like "abcdefgh". It doesn't work with files that contain multiple lines, like "ab /newline cde /newline...". For this type of file, it produces a blank line and a number at the top of the result when I print out the array. I really can't figure out why.
It looks fine to me.
import java.io.FileReader;
import java.util.Arrays;
public class Foo {
public static void main(String[] args) throws Exception {
FileReader fr = new FileReader("/tmp/a");
int[] freq = new int[200];
int c;
while ((c = fr.read()) != -1) {
freq[c] = freq[c] + 1;
}
System.out.println(Arrays.toString(freq));
}
}
Example contents of /tmp/a:
abc
def
Output:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
edit - In response to the revised question:
The output is
2
a 1
b 1
c 1
d 1
e 1
f 1
The file has two line breaks, so the program is writing a line break, and then "2".
I'm guessing you want to convert the characters to something like their Java escape sequences. Here's a solution using Apache commons-lang:
import org.apache.commons.lang3.StringEscapeUtils;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
public class Foo {
public static void main(String[] args) throws Exception {
write(read());
}
static int[] read() throws IOException {
FileReader fr = new FileReader("/tmp/a");
int[] freq = new int[200];
int c;
while ((c = fr.read()) != -1) {
freq[c] = freq[c] + 1;
}
return freq;
}
static void write(int[] freq) throws IOException {
try (PrintWriter pw = new PrintWriter(new FileWriter("/tmp/b"))) {
for (int i = 0; i < freq.length; i++) {
if (freq[i] != 0) {
char c = (char) i;
String s = StringEscapeUtils.escapeJava("" + c);
pw.println(s + " " + freq[i]);
}
}
}
}
}
Output:
\n 2
a 1
b 1
c 1
d 1
e 1
f 1
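If adding commons-lang is not an option, a small hand-written escape covers the common cases. This is only a sketch with made-up names; it handles line feed, carriage return and tab, and prints every other character unchanged.

```java
class CharLabel {
    // Returns a printable label: common control characters are shown as
    // Java-style escapes, every other character is returned unchanged.
    static String label(char c) {
        switch (c) {
            case '\n': return "\\n";
            case '\r': return "\\r";
            case '\t': return "\\t";
            default:   return String.valueOf(c);
        }
    }

    // Counts the characters of the text and formats one "label count" line each.
    static String report(String text) {
        int[] freq = new int[Character.MAX_VALUE + 1]; // one slot per possible char
        for (char c : text.toCharArray()) {
            freq[c]++;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < freq.length; i++) {
            if (freq[i] != 0) {
                sb.append(label((char) i)).append(' ').append(freq[i]).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report("abc\ndef\n")); // first line printed: \n 2
    }
}
```

Sizing the array for every possible char value also avoids the ArrayIndexOutOfBoundsException that an array of 200 entries would cause for characters with codes of 200 or more.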

Convert IBuffer to Byte Array in Java

I have used Xuggler tutorial to decode and play video.
I need to see the raw data for every frame.
I have used the method getData as follows:
IBuffer img = picture.getData();
How can I convert IBuffer img to a byte array in Java?
Thanks
EDIT
I have found the following commands:
java.nio.ByteBuffer buffer = img.getByteBuffer(0, bufSize);//bytebuffer
byte[] bytes = new byte[10]; // Create a byte array
ByteBuffer buf=buffer.wrap(bytes); // Wrap a byte array into a buffer
bytes = new byte[buf.remaining()]; // Retrieve bytes between the position and limit
// (see Putting Bytes into a ByteBuffer)
buf.get(bytes, 0, bytes.length); // transfer bytes from this buffer into the given destination array
buf.clear(); // Retrieve all bytes in the buffer
bytes = new byte[buf.capacity()];
// transfer bytes from this buffer into the given destination array
buf.get(bytes, 0, bytes.length);
System.out.println(bytes);
The output is:
[B@1ccd51b
[B@215eee
[B@1e579dc
[B@14793a8
[B@1082746
.....
They don't have the same length. Is that correct?
I have used:
System.out.println(Arrays.toString(bytes));
to print out the byte array content, but the output looks like this:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
How can I solve this problem?
THANKS
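A likely cause of the zeros: ByteBuffer.wrap is a static factory, so buffer.wrap(bytes) ignores the frame buffer and returns a new buffer backed by the fresh 10-byte array, which contains only zeros. The [B@... lines are the default toString of a byte array (type plus hash code), so they say nothing about its length. A minimal sketch of copying the content of a ByteBuffer, using only java.nio and a stand-in buffer in place of img.getByteBuffer(0, bufSize), could look like this (names are illustrative):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class ByteBufferCopy {
    // Copies the bytes between position and limit into a new array
    // without moving the position of the caller's buffer.
    static byte[] toByteArray(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // shares content, has its own position and limit
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // Stand-in for the buffer returned by img.getByteBuffer(0, bufSize).
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] { 1, 2, 3, 4 });
        System.out.println(Arrays.toString(toByteArray(buffer))); // [1, 2, 3, 4]
    }
}
```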
