Dart How to parse bytes from a String

My server side Dart application receives a JSON String from a socket. The string is generated by Java code. When the string is sent to the socket in the Java code it is encoded to UTF8 and two bytes, a short int, are prepended to the string. The value of this short is the number of bytes in the string + 2.
I need to extract that value as an int in order to handle the string, but nothing I've tried has worked. It dies at JSON.decode (below) because it encounters the start of another JSON string. The first byte of the second string is the start of the short holding the length of the second JSON string. The two strings are each less than 40 characters long.
(I will need to append the length to strings sent from Dart to Java as well.)
Java line
out.writeUTF(json); // converts and writes to the socket stream.
Dart server side code method
handleJavaSocket(Socket javasocket){
javasocket.transform(UTF8.decoder).listen((String socketString){
var truncated = socketString.substring(2);
String message = JSON.decode(truncated); // dies here
// more code
}, onError: (error) {
print('Bad JavaSocket request');
});
}
One of the JSON strings before encoding
{"target":"DOOR","command":"OPEN"}

So you're always sending [sizeupper8, sizelower8, ...utf8-string...] as your message boundaries? The UTF8 decoder doesn't take a length parameter, so it treats the two header bytes as Unicode characters (probably a null followed by another character).
I'm currently working on a StreamBuffer for Quiver (https://pub.dartlang.org/packages/quiver) that will let you pipe the socket to a buffer that gives you:
read(2).then((bytes) => read(bytes[0]<<8|bytes[1]).then((string) => UTF8.decode(string)));
You can then pass the decoded string on to whatever you like, but this should demux your data.
Current pull request (work in progress): https://github.com/google/quiver-dart/pull/117
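The framing described above can be sketched in Java, the sender's language in this question. This is a minimal illustration, not the Quiver API: the class name FrameSplitter and the method split are hypothetical. It assumes the two header bytes are big-endian and count only the payload bytes that follow, which is how DataOutputStream.writeUTF is documented to behave, and it assumes plain ASCII JSON so that Java's modified UTF-8 matches standard UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class FrameSplitter {
    // Splits a buffer of back-to-back [high, low, payload...] frames into strings.
    static List<String> split(byte[] buf) {
        List<String> messages = new ArrayList<>();
        int pos = 0;
        while (pos + 2 <= buf.length) {
            // Mask with 0xFF because Java bytes are signed.
            int len = ((buf[pos] & 0xFF) << 8) | (buf[pos + 1] & 0xFF);
            pos += 2;
            if (pos + len > buf.length) {
                break; // incomplete frame: wait for more data in a real reader
            }
            messages.add(new String(buf, pos, len, StandardCharsets.UTF_8));
            pos += len;
        }
        return messages;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF("{\"target\":\"DOOR\",\"command\":\"OPEN\"}");
        out.writeUTF("{\"target\":\"DOOR\",\"command\":\"CLOSE\"}");
        System.out.println(split(bytes.toByteArray()));
    }
}
```

The key point is that the header has to be read from the raw bytes before any UTF-8 decoding happens. The same arithmetic applies in Dart to the byte list delivered by the socket.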

I couldn't find a way to convert the two-byte header to an integer, especially since one byte could be 0 and some string methods treat that as a string terminator.
Since a message from the Java application will always consist of the header followed by the message, I ignore the header and parse the message(s) from the Stream, adding them to a list (in order to echo them back as verification).
I then traverse the list and simulate the header by writing a 0 byte followed by the message size to the stream, followed by the message. This won't work as is with message lengths greater than 255, but I don't expect any even close to that size.
It should be noted that the Dart application and the Java application are on the same machine.
handleJavaSocket(Socket javasocket){
// echo socket
javasocket.transform(UTF8.decoder).forEach((item) {
var start = 2;
var end = -1;
var messages = new List<String>();
while((++end < item.length) && (end = item.indexOf('}',end)) != -1) {
messages.add(item.substring(start, ++end));
start = end + 2;
}
for(var message in messages) {
// header message length as two bytes
javasocket.writeCharCode(0); // max length 254
javasocket.writeCharCode(message.length);
javasocket.write(message); // <== send
}
});
}
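The 255-byte limit mentioned above goes away if both header bytes carry the length. Here is a rough sketch of that arithmetic, written in Java for illustration (the class name LengthPrefix and the method frame are hypothetical). It assumes the reader is DataInputStream.readUTF and the payload is plain ASCII. For other characters Java's modified UTF-8 can differ from standard UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class LengthPrefix {
    // Returns [high byte, low byte, payload...] for a payload of up to 65535 bytes.
    static byte[] frame(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        if (payload.length > 0xFFFF) {
            throw new IllegalArgumentException("payload too long for a 2-byte header");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream(payload.length + 2);
        out.write((payload.length >> 8) & 0xFF); // high 8 bits of the length
        out.write(payload.length & 0xFF);        // low 8 bits of the length
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] framed = frame("{\"target\":\"DOOR\",\"command\":\"OPEN\"}");
        // For ASCII payloads this is the layout DataInputStream.readUTF expects.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
        System.out.println(in.readUTF());
    }
}
```

Note that the length is the number of encoded bytes, not message.length, which counts characters and only happens to be the same for ASCII text.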

Related

Processing bufferUntil() method only works with '\n'

TL;DR: bufferUntil() and readStringUntil() work fine when set to '\n' but create problems for other characters.
The code that sends data to pc is below;
Serial.print(rollF);
Serial.print("/");
Serial.println(pitchF);
And the relevant parts from processing are;
myPort = new Serial(this, "COM3", 9600); // starts the serial communication
myPort.bufferUntil('\n');
void serialEvent (Serial myPort) {
// reads the data from the Serial Port up to the character '\n' and puts it into the String variable "data".
data = myPort.readStringUntil('\n');
// if you got any bytes other than the linefeed:
if (data != null) {
data = trim(data);
// split the string at "/"
String items[] = split(data, '/');
if (items.length > 1) {
//--- Roll,Pitch in degrees
roll = float(items[0]);
pitch = float(items[1]);
}
}
}
A picture from my incoming data(from arduino serial monitor):
0.62/-0.52
0.63/-0.52
0.63/-0.52
0.64/-0.53
0.66/-0.53
0.67/-0.53
0.66/-0.54
Until here, everything is fine as it should be. Nothing special. The problem occurs when I change the parameters of bufferUntil() and readStringUntil() functions to anything other than '\n'. Of course when I do that, I also change the corresponding parts from the arduino code. For example when replacing '\n' by 'k', the incoming data seen from arduino serial monitor looks like,
45.63/22.3k21.51/77.32k12.63/88.90k
and so on. But Processing cannot get the second value in each buffer. When I print the values to the Processing console, the first value (roll) is right, but the second value (pitch) is shown as NaN. So what is the problem? Why does it only work when the character is '\n'?

I cannot check it right now but I think you might have two issues.
First off, you don't need to use bufferUntil() and readStringUntil() at the same time.
And second and more important, both functions take the character as an int so if you want to read until the character k you should do:
data = myPort.readStringUntil(int('k'));
Or, since k is ASCII code 107:
data = myPort.readStringUntil(107);
If you call the function with the wrong type, as you are doing, nothing will happen and the port will keep reading until it finds the default line feed.
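To check the parsing logic independently of the serial port, the terminator-and-slash format can be exercised in plain Java. This is only a sketch under assumptions: RollPitchParser and parse are hypothetical names, and it does not use Processing's Serial API. It also shows that the character 'k' and the integer 107 are the same value.

```java
import java.util.ArrayList;
import java.util.List;

final class RollPitchParser {
    // Parses records such as "45.63/22.3k21.51/77.32k" into {roll, pitch} pairs.
    static List<float[]> parse(String stream, char terminator) {
        List<float[]> readings = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = stream.indexOf(terminator, start)) != -1) {
            // trim() drops stray whitespace such as a "\r\n" left over from println()
            String[] items = stream.substring(start, end).trim().split("/");
            if (items.length > 1) {
                readings.add(new float[] {
                    Float.parseFloat(items[0].trim()),
                    Float.parseFloat(items[1].trim())
                });
            }
            start = end + 1; // skip the terminator itself
        }
        return readings;
    }

    public static void main(String[] args) {
        System.out.println((int) 'k'); // the character code of 'k'
        for (float[] r : parse("45.63/22.3k21.51/77.32k12.63/88.90k", 'k')) {
            System.out.println("roll=" + r[0] + " pitch=" + r[1]);
        }
    }
}
```

If the pitch still comes out as NaN with input like this, the terminator or leftover line-ending characters in the second field are worth inspecting before the float conversion.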

Splitting a string with byte length limits in java

I want to split a String to a String[] array, whose elements meet following conditions.
s.getBytes(encoding).length should not exceed maxsize(int).
If I join the split strings with StringBuilder or the + operator, the result should be exactly the original string.
The input string may have unicode characters which can have multiple bytes when encoded in e.g. UTF-8.
The desired prototype is shown below.
public static String[] SplitStringByByteLength(String src,String encoding, int maxsize)
And the testing code:
public boolean isNice(String str, String encoding, int max)
{
//boolean success=true;
StringBuilder b=new StringBuilder();
String[] splitted= SplitStringByByteLength(str,encoding,max);
for(String s: splitted)
{
if(s.getBytes(encoding).length>max)
return false;
b.append(s);
}
if(str.compareTo(b.toString())!=0)
return false;
return true;
}
Though it seems easy when the input string has only ASCII characters, the fact that it could contain multibyte characters confuses me.
Thank you in advance.
Edit: I added my code implementation. (Inefficient)
public static String[] SplitStringByByteLength(String src,String encoding, int maxsize) throws UnsupportedEncodingException
{
ArrayList<String> splitted=new ArrayList<String>();
StringBuilder builder=new StringBuilder();
// Java strings are not '\0'-terminated, so loop on the length instead
for(int i=0;i<src.length();++i)
{
String tmp=builder.toString();
char c=src.charAt(i);
builder.append(c);
if(builder.toString().getBytes(encoding).length>maxsize)
{
splitted.add(tmp);
builder=new StringBuilder();
builder.append(c); // the character that did not fit starts the next chunk
}
}
if(builder.length()>0)
splitted.add(builder.toString()); // keep the final partial chunk
return splitted.toArray(new String[splitted.size()]);
}
Is this the only way to solve this problem?

The class CharsetEncoder has provision for your requirement. Extract from the Javadoc of the encode method:
public final CoderResult encode(CharBuffer in,
ByteBuffer out,
boolean endOfInput)
Encodes as many characters as possible from the given input buffer, writing the results to the given output buffer...
In addition to reading characters from the input buffer and writing bytes to the output buffer, this method returns a CoderResult object to describe its reason for termination:
...
CoderResult.OVERFLOW indicates that there is insufficient space in the output buffer to encode any more characters. This method should be invoked again with an output buffer that has more remaining bytes. This is typically done by draining any encoded bytes from the output buffer.
A possible code could be:
public static String[] SplitStringByByteLength(String src,String encoding, int maxsize) {
Charset cs = Charset.forName(encoding);
CharsetEncoder coder = cs.newEncoder();
ByteBuffer out = ByteBuffer.allocate(maxsize); // output buffer of required size
CharBuffer in = CharBuffer.wrap(src);
List<String> ss = new ArrayList<>(); // a list to store the chunks
int pos = 0;
while(true) {
CoderResult cr = coder.encode(in, out, true); // try to encode as much as possible
int newpos = src.length() - in.length();
String s = src.substring(pos, newpos);
ss.add(s); // add what has been encoded to the list
pos = newpos; // store new input position
out.rewind(); // and rewind output buffer
if (! cr.isOverflow()) {
break; // everything has been encoded
}
}
return ss.toArray(new String[0]);
}
This will split the original string in chunks that when encoded in bytes fit as much as possible in byte arrays of the given size (assuming of course that maxsize is not ridiculously small).

The problem lies in the existence of Unicode "supplementary characters" (see Javadoc of the Character class), that take up two "character places" (a surrogate pair) in a String, and you shouldn't split your String in the middle of such a pair.
An easy approach to splitting would be to stick to the worst-case that a single Unicode code point can take at most four bytes in UTF-8, and split the string after every 99 code points (using string.offsetByCodePoints(pos, 99) ). In most cases, you won't fill the 400 bytes, but you'll be on the safe side.
Some words about code points and characters
When Java started, Unicode had less than 65536 characters, so Java decided that 16 bits were enough for a character. Later the Unicode standard exceeded the 16-bit limit, and Java had a problem: a single Unicode element (now called a "code point") no longer fit into a single Java character.
They decided to go for an encoding into 16-bit entities, being 1:1 for most usual code points, and occupying two "characters" for the exotic code points beyond the 16-bit limit (the pair built from so-called "surrogate characters" from a spare code range below 65535). So now it can happen that e.g. string.charAt(5) and string.charAt(6) must be seen in combination, as a "surrogate pair", together encoding one Unicode code point.
That's the reason why you shouldn't split a string at an arbitrary index.
To help the application programmer, the String class then got a new set of methods, working in code point units, and e.g. string.offsetByCodePoints(pos, 99) means: from the index pos, advance by 99 code points forward, giving an index that will often be pos+99 (in case the string doesn't contain anything exotic), but might be up to pos+198, if all the following string elements happen to be surrogate pairs.
Using the code-point methods, you are safe not to land in the middle of a surrogate pair.
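The worst-case approach described above could look roughly like this. It is a sketch under the assumption that the target encoding is UTF-8, where a code point takes at most four bytes. CodePointSplitter and split are hypothetical names, and the chunk size is derived as maxBytes / 4 rather than the fixed 99 used in the text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class CodePointSplitter {
    // Splits src into chunks of at most maxBytes / 4 code points each, so every
    // chunk encodes to at most maxBytes bytes in UTF-8 and no surrogate pair is cut.
    static List<String> split(String src, int maxBytes) {
        int perChunk = maxBytes / 4;
        if (perChunk < 1) {
            throw new IllegalArgumentException("maxBytes must be at least 4");
        }
        List<String> chunks = new ArrayList<>();
        int pos = 0;
        while (pos < src.length()) {
            // offsetByCodePoints throws if fewer code points remain, so clamp first.
            int remaining = src.codePointCount(pos, src.length());
            int end = src.offsetByCodePoints(pos, Math.min(perChunk, remaining));
            chunks.add(src.substring(pos, end));
            pos = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        String src = "a\uD83D\uDE00b\u00E9"; // 'a', one supplementary code point, 'b', 'é'
        for (String chunk : split(src, 8)) {
            System.out.println(chunk.getBytes(StandardCharsets.UTF_8).length + " bytes");
        }
    }
}
```

Counting the remaining code points on every iteration keeps the sketch short but rescans the tail of the string each time. For very long inputs a single forward pass would be cheaper.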

java - how to pass multiple parameters over serial port to Arduino mega

From my Java program, I want to pass a byte value to the Arduino Mega to blink an LED,
and at the same time I want to pass a string value to the Arduino to be displayed on the LCD.
How can I receive these two inputs from the Java program separately on the Arduino and use them in different processes?
Below is the arduino code
LiquidCrystal lcd (12, 11, 10, 9, 8, 7);
int operation;
void setup() {
lcd.begin(16, 2);
Serial.begin(9600);
Serial1.begin(9600);
Serial2.begin(9600);
pinMode(3, OUTPUT);
pinMode(2, OUTPUT);
}
int count = 0;
void loop() {
//LCD start
if (Serial.available()) {
// wait a bit for the entire message to arrive
delay(50);
// clear the screen
lcd.clear();
delay(10);
// read all the available characters
while (Serial.available() > 0) {
// display each character to the LCD
lcd.write(Serial.read());
}
}
//LCD end
//LED Blink start
if (Serial.available() > 0)
delay(10);
{
operation = Serial.read();
}
if(operation == '2')
{
digitalWrite(2, LOW);
digitalWrite(3, HIGH);
delay(50);
digitalWrite(3, LOW);
delay(50);
}
if(operation == '1')
{
digitalWrite(3, LOW);
digitalWrite(2, HIGH);
delay(50);
digitalWrite(2, LOW);
delay(50);
}
//LED Blink end
// Receive rfid tag numbers
if(Serial1.available()) {
int x = Serial1.read();
Serial1.print(x);
}
if(Serial2.available()) {
int x = Serial2.read();
Serial1.print(x);
}
}
Below is the Java code to send data
Code to send number 1 to Arduino
String buf = "1";
char buf2[] = buf.toCharArray();
output.write((byte)buf2[0]);
Code to send string to display in lcd
output.write("Hellow world. this is a String from java".getBytes());
When I run these pieces of code separately, they work well without any interference. But when I run them together, sometimes the value 1 or 2 is displayed on the LCD and the LED does not blink properly. How can I get two inputs from Java to the Arduino and process them separately inside the Arduino?

I think there might be several issues working in concert.
First of all, Strings are represented internally in Java as UTF-16 encoded characters. I don't remember what Arduino "C" operates on per default, but I am pretty sure it is not UTF-16.
From the JavaDoc
"A String represents a string in the UTF-16 format in which
supplementary characters are represented by surrogate pairs (see the
section Unicode Character Representations in the Character class for
more information). Index values refer to char code units, so a
supplementary character uses two positions in a String."
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html
Second,
the method getBytes() returns an array encoded in the platform's default charset, so depending on what platform you run the program on, the returned bytes can vary.
Try looking into using public byte[] getBytes(String charsetName), which will give you predictable values back.
Try something like byte[] asciiBytes = "Hello World".getBytes("US-ASCII");.
See http://docs.oracle.com/javase/6/docs/api/java/nio/charset/Charset.html for more info on Character sets.

You could concatenate the data.
So, if you are passing "Hello" and "200" as 2 items then combine them before the send to Arduino and send "Hello%200" and split on the % inside the Arduino.
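On the Java side, the two suggestions in this thread (a fixed charset and a single delimited message) could be combined along these lines. This is a sketch only: SerialMessage, encode and decode are hypothetical names, the '%' separator follows the answer above, and the trailing newline as end-of-message marker is an added assumption so the Arduino knows when a message is complete.

```java
import java.nio.charset.StandardCharsets;

final class SerialMessage {
    // Builds one message of the form "<ledCommand>%<lcdText>\n" as ASCII bytes.
    static byte[] encode(char ledCommand, String lcdText) {
        String line = ledCommand + "%" + lcdText + "\n";
        return line.getBytes(StandardCharsets.US_ASCII);
    }

    // Mirrors what the receiver has to do: split once on '%'.
    static String[] decode(byte[] line) {
        String s = new String(line, StandardCharsets.US_ASCII).trim();
        return s.split("%", 2);
    }

    public static void main(String[] args) {
        byte[] message = encode('1', "Hello from Java");
        String[] parts = decode(message);
        System.out.println("LED command: " + parts[0]);
        System.out.println("LCD text: " + parts[1]);
    }
}
```

The resulting array would be passed to output.write(...) in a single call. The Arduino sketch then reads up to the newline, treats the part before '%' as the LED command and sends only the rest to the LCD, so the two consumers no longer compete for the same bytes.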

Getting Exception in Converting ByteArray to String with Fixed length

I want to convert bytes into a String.
I have an Android application and I am using a flat file for data storage.
Suppose I have lots of records in my flat file.
In this flat-file database the record size is fixed at 10 characters, and I store a sequence of many String records.
When I read one record from the flat file, I get a fixed number of bytes, because I wrote 10 bytes for every record.
If my string is S="abc123";
then it is stored in the flat file as the ASCII values of abc123, and the rest is filled with 0.
That means the byte array is [97, 98, 99, 49, 50, 51, 0, 0, 0, 0].
So when I want to get my actual string back from the byte array, I use the code below, and it works fine.
But when my inputString = "1234567890", it causes a problem.
public class MainActivity extends Activity {
public static short messageNumb = 0;
public static short appID = 16;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
// record with size 10 and its in bytes.
byte[] recordBytes = new byte[10];
// fill record by 0's
Arrays.fill(recordBytes, (byte) 0);
// input string
String inputString = "abc123";
int length = 0;
int SECTOR_LENGTH = 10;
// convert in bytes
byte[] inputBytes = inputString.getBytes();
// set how many bytes we have to write.
length = SECTOR_LENGTH < inputBytes.length ? SECTOR_LENGTH
: inputBytes.length;
// copy bytes in record size.
System.arraycopy(inputBytes, 0, recordBytes, 0, length);
// Here i write this record in the file.
// Now time to read record from the file.
// Suppose i read one record from the file successfully.
// convert this read bytes to string which we wrote.
Log.d("TAG", "String is = " + getStringFromBytes(recordBytes));
}
public String getStringFromBytes(byte[] inputBytes) {
String s;
s = new String(inputBytes);
return s = s.substring(0, s.indexOf(0));
}
}
But I get a problem when my string has all 10 characters. In that case there are no 0's in my byte array, so in this line
s = s.substring(0, s.indexOf(0));
I am getting the below exception:
java.lang.StringIndexOutOfBoundsException: length=10; regionStart=0; regionLength=-1
at java.lang.String.startEndAndLength(String.java:593)
at java.lang.String.substring(String.java:1474)
So what can I do when my string length is 10?
I have two solutions: I can check whether inputBytes.length == 10 and skip the substring call, or check whether the byte array contains a 0.
But I don't want to use this solution because I use this code in lots of places in my application. So, is there any other way to achieve this?
Please suggest a good solution that works in every case. I think the second solution would be best (check whether the byte array contains 0's and only then apply the substring function).

public String getStringFromBytes(byte[] inputBytes) {
String s;
s = new String(inputBytes);
int zeroIndex = s.indexOf(0);
return zeroIndex < 0 ? s : s.substring(0, zeroIndex);
}
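An alternative to the method above is to look for the padding in the byte array itself and decode only the used part. The following is a minimal sketch, assuming the records hold ASCII text as described in the question. FixedRecord and decode are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class FixedRecord {
    // Decodes a zero-padded fixed-length record; a full record has no padding.
    static String decode(byte[] record) {
        int len = 0;
        while (len < record.length && record[len] != 0) {
            len++; // stop at the first zero byte or at the end of the record
        }
        return new String(record, 0, len, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] padded = {97, 98, 99, 49, 50, 51, 0, 0, 0, 0};
        byte[] full = "1234567890".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decode(padded));
        System.out.println(decode(full));
    }
}
```

Because the length is found before decoding, there is no special case for full records, and naming the charset avoids depending on the platform default.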

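I think this line causes the error:
s = s.substring(0, s.indexOf(0));
s.indexOf(0)
returns -1 when the string contains no character with code 0, which is the case when all 10 bytes of the record are used, and substring(0, -1) then throws.
Note that 48 is the ASCII code for the digit '0', not for the zero padding byte,
so s.indexOf(48) would instead cut the string at the first '0' digit (for "1234567890" it gives "123456789"). Check the result of indexOf before calling substring.
Check the documentation for indexOf(int):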
public int indexOf (int c) Since: API Level 1 Searches in this string
for the first index of the specified character. The search for the
character starts at the beginning and moves towards the end of this
string.
Parameters c the character to find. Returns the index in this string
of the specified character, -1 if the character isn't found.

Efficient ByteArrayInputStream manipulation

I am working with a ByteArrayInputStream that contains an XML document consisting of one element with a large base 64 encoded string as the content of the element. I need to remove the surrounding tags so I can decode the text and output it as a pdf document.
What is the most efficient way to do this?
My knee-jerk reaction is to read the stream into a byte array, find the end of the start tag, find the beginning of the end tag and then copy the middle part into another byte array; but this seems rather inefficient and the text I am working with can be large at times (128KB). I would like a way to do this without the extra byte arrays.

Base 64 does not use the characters < or > so I'm assuming you are using a web-safe base64 variant meaning you do not need to worry about HTML entities or comments inside the content.
If you are really sure that the content has this form, then do the following:
Scan from the right looking for a '<'. This will be the beginning of the close tag.
Scan left from that position looking for a '>'. This will be the end of the start tag.
The base 64 content is between those two positions, exclusive.
You can presize your second array by using
((end - start + 3) / 4) * 3
as an upper bound on the decoded content length, and then b64decode into it. This works because each 4 base64 digits encodes 3 bytes.
If you want to get really fancy, since you know the first few bytes of the array contain ignorable tag data and the encoded data is smaller than the input, you could destructively decode the data over your current byte buffer.
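The scan-from-the-right steps above might be sketched as follows. This is an illustration under assumptions: TagStripper and decodeContent are hypothetical names, the content is assumed to be standard Base64 with no line breaks (java.util.Base64.getMimeDecoder() tolerates line breaks), and the input is assumed to be a single element with no comments or entities.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class TagStripper {
    // Finds the Base64 text between the start tag and the end tag and decodes it.
    static byte[] decodeContent(byte[] xml) {
        int close = xml.length - 1;
        while (close >= 0 && xml[close] != '<') {
            close--; // beginning of the end tag
        }
        int open = close - 1;
        while (open >= 0 && xml[open] != '>') {
            open--; // end of the start tag
        }
        if (close < 0 || open < 0) {
            throw new IllegalArgumentException("no enclosing tags found");
        }
        int start = open + 1;
        // Wrapping the range avoids copying the encoded text into a second array.
        ByteBuffer decoded = Base64.getDecoder()
                .decode(ByteBuffer.wrap(xml, start, close - start));
        byte[] out = new byte[decoded.remaining()];
        decoded.get(out);
        return out;
    }

    public static void main(String[] args) {
        String encoded = Base64.getEncoder()
                .encodeToString("hello pdf".getBytes(StandardCharsets.UTF_8));
        byte[] xml = ("<doc>" + encoded + "</doc>").getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(decodeContent(xml), StandardCharsets.UTF_8));
    }
}
```

The final copy into out is there only to return an exactly sized array. Code that can work with the returned ByteBuffer directly can skip it.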

Do your search and conversion while you are reading the stream.
// find the start tag
byte[] startTag = new byte[]{'<', 't', 'a', 'g', '>'};
int fnd = 0;
int tmp = 0;
while((tmp = stream.read()) != -1) {
if(tmp == startTag[fnd])
fnd++;
else
fnd=0;
if(fnd == startTag.length) break;
}
// get base64 bytes
while(true) {
int a = stream.read();
int b = stream.read();
int c = stream.read();
int d = stream.read();
byte o1,o2,o3; // output bytes
if(a == -1 || a == '<') break;
//
...
outputStream.write(o1);
outputStream.write(o2);
outputStream.write(o3);
}
Note: the above was written in my web browser, so syntax errors may exist.
