Input stream from Bluetooth is late (Android, Java)

I've written an application that receives Bluetooth data: a string that is split into integers. I work in Android Studio. Two activities use the same functionality; in one activity it works okay (it could be faster), but in the other it is very slow and needs 20 seconds to show the change after a message has been received.
There are two types of messages. One type arrives all the time and is interpreted perfectly fine and fast. The other arrives occasionally, when the sensor registers something (not important), and this occasional message is the slow one. Actually, on one Android device it is fine, but on the others it is horrible. I added a print to the terminal when the message arrives, and the terminal output is also late, which means the problem is probably not in the interface or layout. The code is roughly:
while (reader.read() != -1) {
    // function for splitting the string
    // if (message arrived) { call the handler }
}
I can't post the whole code, since there are multiple files and a lot of code, but if you need some part I can post it. I also filter out the data that arrives all the time when I don't need it, so the program does not lose time interpreting data it does not need. Thanks!
public class onReceive implements Runnable {
    public void run() {
        // Bluetooth connection set up here (it's OK, as it works in other activities)
        int b = 0;
        while ((b = reader.read()) != -1) {
            obj.Split_string(b);
            if (obj.messageArrived()) {
                obj.calculation(parametar_tocka, radni_zahvat);
                obj.set_messageArrived(false);
            }
        }
    }
}
String splitting in the class:
public void Split_string(int b) {
    char character = (char) b;
    // further algorithm
}
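For reference, here is a minimal self-contained sketch of such a reader loop. The BluetoothSocket wiring and the BufferedInputStream are assumptions on my part, not part of the original code; buffering the raw socket stream avoids a system call per byte and is one thing worth trying for latency:

import android.bluetooth.BluetoothSocket;
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BluetoothReaderSketch implements Runnable {
    private final BluetoothSocket socket; // assumed to be already connected

    public BluetoothReaderSketch(BluetoothSocket socket) {
        this.socket = socket;
    }

    @Override
    public void run() {
        try {
            // Buffer the raw socket stream so each read() does not hit the OS.
            InputStream reader = new BufferedInputStream(socket.getInputStream());
            int b;
            while ((b = reader.read()) != -1) {
                // hand each byte to the existing splitting logic here,
                // e.g. obj.Split_string(b);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}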

Related

Why is a particular Guava Stopwatch.elapsed() call much later than others? (output in post)

I am working on a small game project and want to track time in order to process physics. After scrolling through different approaches, I first decided to use Java's Instant and Duration classes and have now switched over to Guava's Stopwatch implementation. However, in my snippet, both approaches show a big gap at the second call of runtime.elapsed(). That doesn't seem like a big problem in the long run, but why does it happen?
I have tried running the code below both directly and as a Thread, on Windows and on Linux (Ubuntu 18.04), and the result stays the same - the exact values differ, but the gap occurs. I am using the IntelliJ IDEA environment with JDK 11.
Snippet from Main:
public static void main(String[] args) {
    MassObject[] planets = {
        new Spaceship(10, 0, 6378000)
    };
    planets[0].run();
}
This is part of my class MassObject, which extends Thread:
public void run() {
    // I am using StringBuilder to eliminate flushing delays.
    StringBuilder output = new StringBuilder();
    Stopwatch runtime = Stopwatch.createStarted();
    // massObjectList = static List<MassObject>;
    for (MassObject b : massObjectList) {
        if (b != this) calculateGravity(this, b);
    }
    for (int i = 0; i < 10; i++) {
        output.append(runtime.elapsed().getNano()).append("\n");
    }
    System.out.println(output);
}
Stdout:
30700
1807000
1808900
1811600
1812400
1813300
1830200
1833200
1834500
1835500
Thanks for your help.
You're calling Duration.getNano() on the Duration returned by elapsed(), which isn't what you want.
The internal representation of a Duration is a number of seconds plus a nano offset for whatever additional fraction of a whole second there is in the duration. Duration.getNano() returns that nano offset, and should almost never be called unless you're also calling Duration.getSeconds().
The method you probably want to be calling is toNanos(), which converts the whole duration to a number of nanoseconds.
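For illustration, a minimal sketch of the difference (the class and names here are mine, not from the question):

import com.google.common.base.Stopwatch;

public class ElapsedDemo {
    public static void main(String[] args) {
        Stopwatch runtime = Stopwatch.createStarted();
        // getNano() returns only the sub-second nano offset of the Duration;
        // toNanos() converts the whole elapsed Duration to nanoseconds.
        System.out.println(runtime.elapsed().getNano());
        System.out.println(runtime.elapsed().toNanos());
    }
}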
Edit: In this case that doesn't explain what you're seeing because it does appear that the nano offsets being printed are probably all within the same second, but it's still the case that you shouldn't be using getNano().
The actual issue is probably some combination of classloading or extra work that has to happen during the first call, and/or JIT improving performance of future calls (though I don't think looping 10 times is necessarily enough that you'd see much of any change from JIT).

Understanding Kafka Streams groupBy and window

I am not able to understand the concept of groupBy/groupByKey and windowing in Kafka Streams. My goal is to aggregate stream data over some time period (e.g. 5 seconds). My streaming data looks something like:
{"value":0,"time":1533875665509}
{"value":10,"time":1533875667511}
{"value":8,"time":1533875669512}
The time is in milliseconds (epoch). Here my timestamp is in my message and not in the key. And I want to average the value over a 5-second window.
Here is the code that I am trying, but I can't seem to get it to work:
builder.<String, String>stream("my_topic")
    .map((key, val) -> {
        TimeVal tv = TimeVal.fromJson(val);
        return new KeyValue<Long, Double>(tv.time, tv.value);
    })
    .groupByKey(Serialized.with(Serdes.Long(), Serdes.Double()))
    .windowedBy(TimeWindows.of(5000))
    .count()
    .toStream()
    .foreach((key, val) -> System.out.println(key + " " + val));
This code does not print anything even though the topic is generating messages every two seconds. When I press Ctrl+C then it prints something like
[1533877059029#1533877055000/1533877060000] 1
[1533877061031#1533877060000/1533877065000] 1
[1533877063034#1533877060000/1533877065000] 1
[1533877065035#1533877065000/1533877070000] 1
[1533877067039#1533877065000/1533877070000] 1
This output does not make sense to me.
Related code:
public class MessageTimeExtractor implements TimestampExtractor {
    @Override
    public long extract(ConsumerRecord<Object, Object> record, long previousTimestamp) {
        String str = (String) record.value();
        TimeVal tv = TimeVal.fromJson(str);
        return tv.time;
    }
}
public class TimeVal {
    public final long time;
    public final double value;

    public TimeVal(long tm, double val) {
        this.time = tm;
        this.value = val;
    }

    public static TimeVal fromJson(String val) {
        Gson gson = new GsonBuilder().create();
        return gson.fromJson(val, TimeVal.class);
    }
}
Questions:
Why do you need to pass a serializer/deserializer to groupBy? Some of the overloads also take a ValueStore; what is that? When grouped, what does the data look like in the grouped stream?
How is the windowed stream related to the grouped stream?
I was expecting the above to print in a streaming fashion, that is, buffer for every 5 seconds, then count, then print. It only prints once I press Ctrl+C on the command prompt, i.e. it prints and then exits.
It seems you don't have keys in your input data (correct me if this is wrong), and it further seems that you want to do a global aggregation?
In general, grouping is for splitting a stream into sub-streams. Those sub-streams are built by key (i.e., one logical sub-stream per key). In your code snippet you set the timestamp as the key and thus generate a sub-stream per timestamp. I assume this is not intended.
If you want to do a global aggregation, you will need to map all records to a single sub-stream, i.e., assign the same key to all records in groupBy(). Note that global aggregations don't scale, as the aggregation must be computed by a single thread; thus, this will only work for small workloads.
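For illustration, a sketch based on your snippet that maps every record to one constant key (the "global" key name is arbitrary):

builder.<String, String>stream("my_topic")
    .map((key, val) -> {
        TimeVal tv = TimeVal.fromJson(val);
        return new KeyValue<>("global", tv.value); // same key for every record
    })
    .groupByKey(Serialized.with(Serdes.String(), Serdes.Double()))
    .windowedBy(TimeWindows.of(5000))
    .count()
    .toStream()
    .foreach((key, val) -> System.out.println(key + " " + val));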
Windowing is applied to each generated sub-stream to build the windows, and the aggregation is computed per window. The windows are built based on the timestamp returned by the TimestampExtractor. It seems you already have an implementation that extracts the timestamp from the value for this purpose.
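Note that a custom extractor only takes effect if it is registered in the application config; a sketch of the standard wiring (the rest of the setup is omitted):

Properties props = new Properties();
// use the timestamp embedded in the message payload for windowing
props.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG,
          MessageTimeExtractor.class);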
This code does not print anything even though the topic is generating messages every two seconds. When I press Ctrl+C then it prints something like
By default, Kafka Streams uses some internal caching, and the cache is flushed on commit -- this happens every 30 seconds by default, or when you stop your application. You would need to disable caching to see results earlier (cf. https://docs.confluent.io/current/streams/developer-guide/memory-mgmt.html)
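A sketch of the two relevant settings (both are standard Kafka Streams configs):

Properties props = new Properties();
// disable record caches so updates are forwarded downstream immediately
props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
// optionally commit (and thus flush) more often than the 30s default
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);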
Why do you need to pass a serializer/deserializer to groupBy?
Because data needs to be redistributed, and this happens via a topic in Kafka. Note that Kafka Streams is built for a distributed setup, with multiple instances of the same application running in parallel to scale out horizontally.
Btw: you might also be interested in this blog post about the execution model of Kafka Streams: https://www.confluent.io/blog/watermarks-tables-event-time-dataflow-model/
It seems like you misunderstand the nature of the window DSL.
It works on the internal message timestamps handled by the Kafka platform, not on arbitrary properties in your specific message type that encode time information. Also, this window does not group into intervals - it is a sliding window. It means any aggregation you get is for the last 5 seconds before the current message.
Also, you need the same key for all elements that should be combined into the same group, for example, null. In your example the key is a timestamp, which is more or less entry-unique, so there will be only a single element in each group.

SlidingWindows for slow data (big intervals) on Apache Beam

I am working with the Chicago Traffic Tracker dataset, where new data is published every 15 minutes. When new data is available, it represents records off by 10-15 minutes from "real time" (for example, look at _last_updt).
For example, at 00:20 I get data timestamped 00:10; at 00:35 I get data from 00:20; at 00:50 I get data from 00:40. So the interval at which I can get new data is fixed (every 15 minutes), although the interval between timestamps changes slightly.
I am trying to consume this data on Dataflow (Apache Beam), and for that I am playing with Sliding Windows. My idea is to collect and work on 4 consecutive datapoints (4 x 15min = 60min), and ideally update my calculation of sums/averages as soon as a new datapoint is available. For that, I've started with this code:
PCollection<TrafficData> trafficData = input
    .apply("MapIntoSlidingWindows", Window.<TrafficData>into(
            SlidingWindows.of(Duration.standardMinutes(60))   // (4x15)
                .every(Duration.standardMinutes(15)))         // interval to get new data
        .triggering(AfterWatermark
            .pastEndOfWindow()
            .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane()))
        .withAllowedLateness(Duration.ZERO)
        .accumulatingFiredPanes());
Unfortunately, it looks like when I receive a new datapoint from my input, I do not get a new (updated) result from the GroupByKey that I have afterwards.
Is something wrong with my SlidingWindows? Or am I missing something else?
One issue may be that the watermark is going past the end of the window, so all later elements are dropped. You can try allowing a few minutes of lateness after the watermark passes:
PCollection<TrafficData> trafficData = input
    .apply("MapIntoSlidingWindows", Window.<TrafficData>into(
            SlidingWindows.of(Duration.standardMinutes(60))   // (4x15)
                .every(Duration.standardMinutes(15)))         // interval to get new data
        .triggering(AfterWatermark
            .pastEndOfWindow()
            .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane())
            .withLateFirings(AfterProcessingTime.pastFirstElementInPane()))
        .withAllowedLateness(Duration.standardMinutes(15))
        .accumulatingFiredPanes());
Let me know if this helps at all.
So @Pablo (from my understanding) gave the correct answer, but I had some suggestions that would not fit in a comment.
I wanted to ask whether you really need sliding windows. From what I can tell, fixed windows would do the job and be computationally simpler as well. Since you are using accumulating fired panes, you don't need a sliding window: your next DoFn will already be computing an average from the accumulated panes.
As for the code, I made changes to the early and late firing logic. I also suggest increasing the window size. Since you know the data comes every 15 minutes, you should be closing the window after 15 minutes rather than exactly on 15 minutes. But you also don't want to pick a size that will eventually collide with multiples of 15 (like 20), because at 60 minutes you'd have the same problem. So pick a number that is co-prime to 15, for example 19. Also allow for late entries.
PCollection<TrafficData> trafficData = input
    .apply("MapIntoFixedWindows", Window.<TrafficData>into(
            FixedWindows.of(Duration.standardMinutes(19)))
        .triggering(AfterWatermark.pastEndOfWindow()
            // fire the moment you see an element
            .withEarlyFirings(AfterPane.elementCountAtLeast(1))
            // this line is optional since you already have a past-end-of-window
            // trigger and an early firing, but just in case
            .withLateFirings(AfterProcessingTime.pastFirstElementInPane()))
        .withAllowedLateness(Duration.standardMinutes(60))
        .accumulatingFiredPanes());
Let me know if that solves your issue!
EDIT
I could not quite follow how you compute your values in the example above, so I am using a generic example. Below is a generic averaging function:
public class AverageFn extends CombineFn<Integer, AverageFn.Accum, Double> {
    public static class Accum {
        int sum = 0;
        int count = 0;
    }

    @Override
    public Accum createAccumulator() { return new Accum(); }

    @Override
    public Accum addInput(Accum accum, Integer input) {
        accum.sum += input;
        accum.count++;
        return accum;
    }

    @Override
    public Accum mergeAccumulators(Iterable<Accum> accums) {
        Accum merged = createAccumulator();
        for (Accum accum : accums) {
            merged.sum += accum.sum;
            merged.count += accum.count;
        }
        return merged;
    }

    @Override
    public Double extractOutput(Accum accum) {
        return ((double) accum.sum) / accum.count;
    }
}
In order to run it you would add the line:
PCollection<Double> average = trafficData.apply(Combine.globally(new AverageFn()));
Since you are currently using accumulating firing triggers, this is the simplest way to code the solution.
HOWEVER, if you want to use discarding fired panes, you would need to use a PCollectionView to store the previous average and pass it as a side input to the next window in order to keep track of the values. This is a little more complex to code, but it would definitely improve performance, since constant work is done every window, unlike with accumulating firing.
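A rough sketch of that side-input mechanism (the wiring and the otherData name are my assumptions, not tested code):

// publish each window's average as a singleton side-input view
PCollectionView<Double> avgView = trafficData
    .apply(Combine.globally(new AverageFn()).asSingletonView());

PCollection<TrafficData> enriched = otherData
    .apply(ParDo.of(new DoFn<TrafficData, TrafficData>() {
        @ProcessElement
        public void processElement(ProcessContext c) {
            // read the previously computed average from the side input
            double prevAvg = c.sideInput(avgView);
            // combine c.element() with prevAvg as needed, then emit
            c.output(c.element());
        }
    }).withSideInputs(avgView));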
Does this make enough sense for you to write your own function for discarding fired panes?

Update data only by difference between files (delta for Java)

UPDATE: I solved the problem with a great external library - https://code.google.com/p/xdeltaencoder/. The way I did it is posted below as the accepted answer
Imagine I have two separate PCs that both have an identical byte[] A.
One of the PCs creates a byte[] B, which is almost identical to byte[] A but is a 'newer' version.
For the second PC to update its copy of byte[] A to the latest version (byte[] B), I currently need to transmit the whole byte[] B to the second PC. If byte[] B is many GBs in size, this takes too long.
Is it possible to create a byte[] C that is the 'difference between' byte[] A and byte[] B? The requirement for byte[] C is that, knowing byte[] A, it must be possible to recreate byte[] B.
That way, I only need to transmit byte[] C to the second PC, which in theory would be only a fraction of the size of byte[] B.
I am looking for a solution to this problem in Java.
Thank you very much for any help you can provide :)
EDIT: The nature of the updates to the data in most circumstances is extra bytes being inserted into parts of the array. Of course it is possible that some bytes will be changed or deleted. The byte[] itself represents a tree of the names of all the files/folders on a target PC. The byte[] is originally created by building a tree of custom objects, marshalling them with JSON, and then compressing that data with a zip algorithm. I am struggling to create an algorithm that can intelligently create object C.
EDIT 2: Thank you so much for all the help everyone here has given, and I am sorry for not being active for such a long time. I'm most probably going to use an external library to do the delta encoding for me. A great thing about this thread is that I now know what the thing I want to achieve is called! When I find an appropriate solution I will post and accept it so others can see how I solved my problem. Once again, thank you very much for all your help.
Using a collection of "change events" rather than sending the whole array
A solution to this would be to send a serialized object describing the change rather than the actual array all over again.
public class ChangePair implements Serializable {
    // glorified struct
    public final int index;
    public final byte newValue;

    public ChangePair(int index, byte newValue) {
        this.index = index;
        this.newValue = newValue;
    }

    public static void main(String[] args) {
        Collection<ChangePair> changes = new HashSet<ChangePair>();
        changes.add(new ChangePair(12, (byte) 2));
        changes.add(new ChangePair(1206, (byte) 3));
    }
}
Generating the "change events"
The most efficient method would be to track changes as you go, but assuming that's not possible you can just brute-force your way through, finding which values differ:
public static Collection<ChangePair> generateChangeCollection(byte[] oldValues, byte[] newValues) {
    // validation
    if (oldValues.length != newValues.length) {
        throw new RuntimeException("new and old arrays are differing lengths");
    }
    Collection<ChangePair> changes = new HashSet<ChangePair>();
    for (int i = 0; i < oldValues.length; i++) {
        if (oldValues[i] != newValues[i]) {
            // generate a change event
            changes.add(new ChangePair(i, newValues[i]));
        }
    }
    return changes;
}
Sending and receiving those change events
As per this answer regarding sending serialized objects over the internet, you could then send your object using the following code:
Collection<ChangePair> changes = generateChangeCollection(oldValues, newValues);
Socket s = new Socket("yourhostname", 1234);
ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream());
out.writeObject(changes);
out.flush();
On the other end you would receive the object:
ServerSocket server = new ServerSocket(1234);
Socket s = server.accept();
ObjectInputStream in = new ObjectInputStream(s.getInputStream());
Collection<ChangePair> objectReceived = (Collection<ChangePair>) in.readObject();
// use Collection<ChangePair> to apply changes
Using those change events
This collection can then simply be used to modify the array of bytes on the other end
public static void useChangeCollection(byte[] oldValues, Collection<ChangePair> changeEvents) {
    for (ChangePair changePair : changeEvents) {
        oldValues[changePair.index] = changePair.newValue;
    }
}
Locally log the changes to the byte array, like a little version control system. In fact you could use a VCS to create patch files, send them to the other side, and apply them to get an up-to-date file.
If you cannot log changes, you would need to keep a second copy of the array locally, or (not 100% safe) use an array of checksums on blocks.
The main problem here is data compression.
Kamikaze offers you good compression algorithms for data arrays. It uses Simple16 and PForDelta coding. Simple16 is a good and (as the name says) simple list-compression option. Or you can use Run-Length Encoding (RLE). Or you can experiment with any compression algorithm you have available in Java...
Anyway, any method you use will be optimized if you first preprocess the data.
You can reduce the data by calculating differences or, as @RichardTingle pointed out, by creating pairs of differing data locations.
You can calculate C as B - A. C will have to be an int array, since the difference between two byte values can fall outside the range of a byte. You can then restore B as A + C.
The advantage of combining at least two methods here is that you get much better results.
E.g. if you use the difference method with A = { 1, 2, 3, 4, 5, 6, 7 } and B = { 1, 2, 3, 5, 6, 7, 7 }, the difference array C will be { 0, 0, 0, 1, 1, 1, 0 }. RLE can compress C very effectively, since it is good at compressing data with many repeated numbers in sequence.
Using the difference method with Simple16 will be good if your data changes in almost every position but the differences between values are small. It can compress an array of 28 single-bit values (0 or 1) or an array of 14 two-bit values into a single 32-bit integer.
Experiment; it will all depend on how your data behaves. And compare the data compression ratios for each experiment.
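A minimal sketch of that difference idea (the method names are mine; it assumes A and B have equal length, so it does not cover insertions or deletions):

static int[] diff(byte[] a, byte[] b) {
    int[] c = new int[a.length];
    for (int i = 0; i < a.length; i++) {
        c[i] = b[i] - a[i]; // can be negative, hence the int array
    }
    return c;
}

static byte[] restore(byte[] a, int[] c) {
    byte[] b = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
        b[i] = (byte) (a[i] + c[i]);
    }
    return b;
}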
EDIT: You will have to preprocess the data before the JSON marshalling and zip compression.
Create two sets, old and now. The latter contains all files that exist now. For the former, the old files, you have at least two options:
It should contain all files as they existed when you last sent them to the other PC. You will need to keep a set of what the other PC knows in order to calculate what has changed since the last synchronization, and send only the new data.
Or it contains all files since you last checked for changes. You can keep a local history of changes and give each version an "id". Then, when you sync, you send the "version id" together with the changed data to the other PC. Next time, the other PC first sends its "version id" (or you keep the "version id" of each PC locally), and then you can send the other PC all the new changes (all the versions that come after the one that PC had).
The changes can be represented by two other sets: newFiles and deletedFiles. (What about files whose content changed? Don't you need to sync those too?) The newFiles set contains the files that only exist in set now (and not in old). The deletedFiles set contains the files that only exist in set old (and not in now).
If you represent each file as a String with its full pathname, you will safely have a unique representation of each file. Or you can use java.io.File.
After you have reduced your changes to the newFiles and deletedFiles sets, you can convert them to JSON, zip them, and do anything else you need to serialize and compress the data.
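A minimal sketch of computing those two sets, assuming each file is represented by its full pathname as a String:

static void computeChanges(Set<String> old, Set<String> now) {
    // files that only exist in 'now': newly created since the last sync
    Set<String> newFiles = new HashSet<>(now);
    newFiles.removeAll(old);
    // files that only exist in 'old': deleted since the last sync
    Set<String> deletedFiles = new HashSet<>(old);
    deletedFiles.removeAll(now);
    // serialize newFiles and deletedFiles (e.g. JSON + zip) and send them
}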
So, what I ended up doing was using this:
https://code.google.com/p/xdeltaencoder/
From my test it works really really well. However, you will need to make sure to checksum the source (in my case fileAJson), as it does not do it automatically for you!
Anyways, code below:
// Create delta
String[] deltaArgs = new String[]{fileAJson.getAbsolutePath(), fileBJson.getAbsolutePath(), fileDelta.getAbsolutePath()};
XDeltaEncoder.main(deltaArgs);

// Apply delta
deltaArgs = new String[]{"-d", fileAJson.getAbsolutePath(), fileDelta.getAbsolutePath(), fileBTarget.getAbsolutePath()};
XDeltaEncoder.main(deltaArgs);

// Trivia: surprisingly, this also works
deltaArgs = new String[]{"-d", fileBJson.getAbsolutePath(), fileDelta.getAbsolutePath(), fileBTarget.getAbsolutePath()};
XDeltaEncoder.main(deltaArgs);

Reading large files for a simulation (Java crashes with out-of-heap-space error)

For a school assignment, I need to create a simulation of memory accesses. First I need to read 1 or more trace files. Each contains a memory address for each access. Example:
0 F001CBAD
2 EEECA89F
0 EBC17910
...
Where the first integer indicates a read/write etc., and the hex memory address follows. With this data, I am supposed to run a simulation. So the idea I had was to parse this data into an ArrayList<Trace> (for now I am using Java), with Trace being a simple class containing the memory address and the access type (just a String and an integer). After that I plan to loop through these array lists to process them.
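For reference, a sketch of the Trace class as described (this is my reconstruction, not the original code):

public class Trace {
    private final int instrType;     // read/write flag from the first column
    private final String hexAddress; // hex memory address from the second column

    public Trace(int instrType, String hexAddress) {
        this.instrType = instrType;
        this.hexAddress = hexAddress;
    }
}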
The problem is that even during parsing it runs out of heap space. Each trace file is ~200MB, and I have up to 8, meaning a minimum of ~1.6 GB of data I am trying to "cache". What baffles me is that I am only parsing 1 file and Java is already using 2GB according to my task manager...
What is a better way of doing this?
A code snippet can be found at Code Review
The answer I gave on Code Review is the same one you should use here...
But, because duplication appears to be OK, I'll duplicate the answer here.
The issue is almost certainly in the structure of your Trace class and its memory efficiency. You should ensure that instrType and hexAddress are stored as memory-efficient structures. The instrType appears to be an int, which is good, but make sure that it is declared as an int in the Trace class.
The more likely problem is the size of the hexAddress String. You may not realise it, but Strings are notorious for 'leaking' memory. In this case, you have a line and you think you are just getting the hexString from it... but in reality, the hexString contains the entire line... yeah, really. For example, look at the following code:
public class SToken {
    public static void main(String[] args) {
        StringTokenizer tokenizer = new StringTokenizer("99 bottles of beer");
        int instrType = Integer.parseInt(tokenizer.nextToken());
        String hexAddr = tokenizer.nextToken();
        System.out.println(instrType + hexAddr);
    }
}
Now, set a breakpoint in your IDE (I use Eclipse), run it, and you will see that hexAddr contains a char[] array for the entire line, with an offset of 3 and a count of 7.
Because of the way that String.substring() and other constructs work, they can consume huge amounts of memory for short strings (in theory that memory is shared with other strings, though). As a consequence, you are essentially storing the entire file in memory!
At a minimum, you should change your code to:
hexAddr = new String(tokenizer.nextToken().toCharArray());
But even better would be:
long hexAddr = parseHexAddress(tokenizer.nextToken());
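parseHexAddress is not shown in the answer; a possible implementation, assuming the addresses are plain hex strings like F001CBAD, would be:

static long parseHexAddress(String hex) {
    // an 8-digit (32-bit) hex address always fits in a long without overflow
    return Long.parseLong(hex, 16);
}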
Like rolfl, I answered your question on Code Review. The biggest issue, to me, is reading everything into memory first and then processing it. You need to read a fixed amount, process that, and repeat until finished.
Try using the class java.nio.ByteBuffer instead of java.util.ArrayList<Trace>. It should also reduce the memory usage.
import java.nio.ByteBuffer;

class TraceList {
    private static final int RECORD_SIZE = 5; // 1 byte for the type + 4 bytes for the address
    private final ByteBuffer buffer;

    public TraceList(int capacity) {
        // allocate a flat byte buffer: 5 bytes per trace instead of one object per entry
        buffer = ByteBuffer.allocate(capacity * RECORD_SIZE);
    }

    public void put(byte operationType, int address) {
        // append the record to the byte buffer
        buffer.put(operationType);
        buffer.putInt(address);
    }

    public Trace get(int index) {
        // read the record from the byte buffer by index
        byte type = buffer.get(index * RECORD_SIZE);
        int address = buffer.getInt(index * RECORD_SIZE + 1);
        // assumes a Trace(byte type, int address) constructor
        return new Trace(type, address);
    }
}
