How to store bytes and read them back to an array - java

I am trying to store a list of numbers (bytes) in a file so that I can read them back into a byte[].
59 20 60 21 61 22 62 23 63 24 64 25 65 26 66 27 67 28 68 29
67 30 66 31 65 32 64 33 63 34 62 35 61 36 60 37 59 38
66 29 65 30 64 31 63 32 62 33 61 34 60 35 59 36 58 37
65 28 64 29 63 30 62 31 61 32 60 33 59 34 58 35 57 36...
I have tried saving them into a text file but the relevant code doesn't seem to read it properly.
try {
File f = new File("cube_mapping2.txt");
array = new byte[file.size()]
FileInputStream stream = new FileInputStream(f);
stream.read(array);
} catch (Exception e) {
e.printStackTrace();
}
Is there a proper way to save the file so that FileInputStream.read(byte[] buffer) will populate the array with my bytes?

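I'd use Scanner. Something like this:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public static void main(String[] args) throws IOException {
    InputStream stream = new FileInputStream("cube_mapping2.txt");
    Scanner s = new Scanner(stream);
    List<Byte> bytes = new ArrayList<Byte>();
    // hasNextByte() stays true while the next whitespace-separated token parses as a byte
    while (s.hasNextByte()) {
        bytes.add(s.nextByte());
    }
    System.out.println(bytes);
}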
I tested this on a file containing your exact input and it worked. Output was:
[59, 20, 60, 21, 61, 22, 62, 23, 63, 24, 64, 25, 65, 26, 66, 27, 67, 28, 68, 29, 67, 30, 66, 31, 65, 32, 64, 33, 63, 34, 62, 35, 61, 36, 60, 37, 59, 38, 66, 29, 65, 30, 64, 31, 63, 32, 62, 33, 61, 34, 60, 35, 59, 36, 58, 37, 65, 28, 64, 29, 63, 30, 62, 31, 61, 32, 60, 33, 59, 34, 58, 35, 57, 36]

FileInputStream works on binary files. The code you posted would read from a binary file, but isn't quite right because stream.read(array) reads up to the length of the array; it doesn't promise to read the whole array. The return value from read(array) is the number of bytes actually read. To be sure of getting all the data you want you need the read() call to be in a loop.
To answer your actual question: to write to a file in such a way that stream.read(array) will be able to read it back, use FileOutputStream.write(array).
If you're happy with a text file instead of a binary file, go with @Bohemian's answer.
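A minimal sketch of the binary round trip, assuming the values are already held in a byte[]. The class name, the helper names and the temporary file are illustrative and not taken from the question. It writes with FileOutputStream.write(array) and reads back in a loop, because a single read() call may return fewer bytes than requested.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper class; all names here are illustrative only.
final class ByteFileRoundTrip {

    // Write the raw bytes to the file, one file byte per array element.
    static void writeAll(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
        }
    }

    // Read the whole file, looping until the array is full.
    // Assumes the file is smaller than 2 GB so its length fits in an int.
    static byte[] readAll(File file) throws IOException {
        byte[] array = new byte[(int) file.length()];
        try (FileInputStream in = new FileInputStream(file)) {
            int offset = 0;
            while (offset < array.length) {
                int n = in.read(array, offset, array.length - offset);
                if (n < 0) {
                    throw new EOFException("file ended after " + offset + " bytes");
                }
                offset += n;
            }
        }
        return array;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = {59, 20, 60, 21, 61, 22};
        File file = File.createTempFile("cube_mapping", ".bin");
        file.deleteOnExit();
        writeAll(file, original);
        System.out.println(Arrays.toString(readAll(file)));
    }
}
```

If writing the loop by hand feels heavy, DataInputStream.readFully(byte[]) and Files.readAllBytes(Path) from the standard library perform the same complete read.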

array = new byte[file.size()]
Does that mean there is no room left for the separator between each pair of numbers?
If every number in your file is exactly two characters wide, you can read each one into a two-byte temporary array. Something like:
byte[] temp = new byte[2];
stream.read(temp);
That way you read the numbers one at a time.

Related

Spring Webflux, Make all request wait for queue to be replenished by first request

I am using Spring WebFlux with the default Project Reactor. Each HTTP request polls a ConcurrentLinkedQueue to get a unique number. If the queue is empty, all requests should wait until it has been repopulated. I am looking for a smart way to refill the queue as part of the reactive chain of the first request that finds it empty, while every other request waits for the refill to finish. I am finding this hard to implement with Spring WebFlux and Project Reactor. By the way, an event listener populates the queue at application startup, so the problem only shows up once the queue runs empty. Any help would be much appreciated. Here is a minified version of the code:
package com.learning;

import reactor.core.publisher.Mono;

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

public class UniqueNumberGenerator {
    Queue<Long> uniqueNumberCache = new ConcurrentLinkedQueue<>();
    static private Long sequence = 0L;

    public static void main(String arg[]) throws InterruptedException {
        ExecutorService executorService = Executors.newFixedThreadPool(9);
        UniqueNumberGenerator uniqueNumberGenerator = new UniqueNumberGenerator();
        for (int i = 0; i < 15; i++) {
            executorService.submit(() -> {
                uniqueNumberGenerator.getUniqueNumber()
                        .doOnNext(number -> System.out.println("Consumed number : " + number))
                        .subscribe();
            });
        }
        Thread.sleep(5000);
    }

    public Mono<Long> getUniqueNumber() {
        Long orderId = uniqueNumberCache.poll();
        if (orderId != null) {
            return Mono.just(orderId);
        } else {
            return generateUniqueNumbers();
        }
    }

    /**
     * This is where the problem lies: when the queue is empty in the method above,
     * all the threads concurrently fetch the DB sequence, generate
     * unique numbers and populate the queue,
     * whereas I want only one thread to populate the queue and all the others to wait for it.
     *
     * @return
     */
    public Mono<Long> generateUniqueNumbers() {
        Mono<Long> sequenceMono = getSequence()
                .flatMap(nextVal -> {
                    var list = LongStream.rangeClosed(1, 9)
                            .boxed()
                            .map(value -> Long.valueOf(nextVal + "" + value))
                            .collect(Collectors.toList());
                    System.out.println("Generated number " + list);
                    var nextOrderId = list.remove(0);
                    uniqueNumberCache.addAll(list);
                    return Mono.just(nextOrderId);
                });
        return sequenceMono;
    }

    /**
     * This method is actually backed by a DB sequence.
     *
     * @return
     */
    public Mono<Long> getSequence() {
        return Mono.create(longMonoSink -> {
            synchronized (sequence) {
                sequence = sequence + 1;
                //System.out.println("sequence : " + sequence);
                longMonoSink.success(sequence);
            }
        });
    }
}
Output
Generated number [11, 12, 13, 14, 15, 16, 17, 18, 19]
Generated number [31, 32, 33, 34, 35, 36, 37, 38, 39]
Generated number [41, 42, 43, 44, 45, 46, 47, 48, 49]
Generated number [51, 52, 53, 54, 55, 56, 57, 58, 59]
Generated number [21, 22, 23, 24, 25, 26, 27, 28, 29]
Consumed number : 31
Consumed number : 11
Consumed number : 21
Consumed number : 41
Consumed number : 51
Generated number [61, 62, 63, 64, 65, 66, 67, 68, 69]
Consumed number : 61
Generated number [71, 72, 73, 74, 75, 76, 77, 78, 79]
Consumed number : 71
Generated number [81, 82, 83, 84, 85, 86, 87, 88, 89]
Consumed number : 81
Generated number [91, 92, 93, 94, 95, 96, 97, 98, 99]
Consumed number : 91
Consumed number : 17
Consumed number : 12
Consumed number : 13
Consumed number : 16
Consumed number : 14
Consumed number : 15
The output above makes it clear that many threads are populating the queue concurrently: nine batches of 9 numbers each (81 in total) were generated to serve only 15 requests.

Binary fields encoding/serialization format in a proprietary XML file (Roche LC480 .ixo file)

I recently received an example export file generated by the Roche LightCycler 480 instrument. It uses a proprietary XML format, for which I haven't found a specification yet.
From such types of files, I would like to extract some information relevant to my purposes. Although most of it can be easily parsed and interpreted, it contains a number of (unpadded) base 64 encoded fields of binary/serialized data representing arrays of integer and/or floating point numbers. A link to the example file can be found in this gist.
I have included a fragment of it at the end of this post. The AcquisitionTable contains a total of 19 such encoded item entries, which likely represent arrays of integer (SampleNo) and floating point (Fluor1) values.
How the decoded bytes are to be translated to integer or floating point values is still unclear to me. When base 64 decoding, each of the items starts with the following (hex) 6 byte sequence:
42 41 52 5A 00 00 ... // ['B','A','R','Z','\0','\0', ...]
Note that while it is my expectation that each 'item' contains the same amount of numbers (or "rows" in this table), I am observing a different number of decoded bytes for similar items: 5654 for Fluor1 and 5530 for Fluor2.
Additionally for those arrays which I suspect contain (sequential) integers, a pattern can be observed:
SampleNo : ... 1F F5 1F 07 2F 19 2F 2B 2F 3D 2F 4F 2F 61 2F 00 73 2F 85 2F 97 2F A9 2F BB 2F CD 2F DF 2F F1 2F 00 03 3F 15 3F 27 ...
Cycles : ... 1F FF 1F 11 2F 23 2F 35 2F 47 2F 59 2F 6B 2F 00 7D 2F 8F 2F A1 2F B3 2F C5 2F D7 2F E9 2F FB 2F 00 0D 3F 1F 3F 31 ...
Gain : ... 1F EE 1F 00 2F 12 2F 24 2F 36 2F 00 48 2F 5A 2F 6C 2F 7E 2F 90 2F A2 2F B4 2F C6 2F 00 D8 2F EA 2F FC 2F 0E 3F 20 3F 32 ...
It looks like pairs of bytes, where the second byte is increasing by 0x12 (18) and occasionally a group of 3 bytes with 0x00 as the second byte in case the last byte's nibble is 3, D or 8 for the three examples respectively.
I was wondering if the type of encoding/serialization format would be obvious to anyone (or, even better, if someone has a specification of this file format).
I believe the software used to create these files is currently Java based, but has a history as a Windows/MFC/C++ product.
<obj name="AcquisitionTable" class="AcquisitionTable" version="1">
<prop name="Count">2400</prop>
<prop name="ChannelCount">6</prop>
<list name="Columns" count="19">
<item name="SampleNo">QkFSWgAABHgCAER0Cu3xAe3wAuv//f8PDyEPADMPRQ9XD2kPew+ND58PsQ8Aww/VD+cP+Q8LHx0fLx9BHwBTH2Ufdx+JH5sfrR+/H9EfAOMf9R8HLxkvKy89L08vYS8Acy+FL5cvqS+7L80v3y/xLwADPxU/Jz85P0s/XT9vP4E/AJM/pT+3P8k/2z/tP/8/EU8AI081T0dPWU9rT31Pj0+hTwCzT8VP10/pT/tPDV8fXzFfAENfVV9nX3lfi1+dX69fwV8A01/lX/dfCW8bby1vP29RbwBjb3Vvh2+Zb6tvvW/Pb+FvAPNvBX8Xfyl/O39Nf19/cX8Ag3+Vf6d/uX/Lf91/738BjwATjyWPN49Jj1uPbY9/j5GPAKOPtY/Hj9mP64/9jw+fIZ8AM59Fn1efaZ97n42fn5+xnwDDn9Wf55/5nwuvHa8vr0GvAFOvZa93r4mvm6+tr7+v0a8A46/1rwe/Gb8rvz2/T79hvwBzv4W/l7+pv7u/zb/fv/G/AAPPFc8nzznPS89dz2/Pgc8Ak8+lz7fPyc/bz+3P/88R3wAj3zXfR99Z32vffd+P36HfALPfxd/X3+nf+98N7x/vMe8AQ+9V72fvee+L753vr+/B7wDT7+Xv9+8J/xv/Lf8//1H/AGP/df+H/5n/q/+9/8//4f8A8/8FDxcPKQ87D00PXw9xDwCDD5UPpw+5D8sP3Q/vDwEfABMfJR83H0kfWx9tH38fkR8Aox+1H8cf2R/rH/0fDy8hLwAzL0UvVy9pL3svjS+fL7EvAMMv1S/nL/kvCz8dPy8/QT8AUz9lP3c/iT+bP60/vz/RPwDjP/U/B08ZTytPPU9PT2FPAHNPhU+XT6lPu0/NT99P8U8AA18VXydfOV9LX11fb1+BXwCTX6Vft1/JX9tf7V//XxFvACNvNW9Hb1lva299b49voW8As2/Fb9dv6W/7bw1/H38xfwBDf1V/Z395f4t/nX+vf8F/ANN/5X/3fwmPG48tjz+PUY8AY491j4ePmY+rj72Pz4/hjwDzjwWfF58pnzufTZ9fn3GfAIOflZ+nn7mfy5/dn++fAa8AE68lrzevSa9br22vf6+RrwCjr7Wvx6/Zr+uv/a8PvyG/ADO/Rb9Xv2m/e7+Nv5+/sb8Aw7/Vv+e/+b8Lzx3PL89BzwBTz2XPd8+Jz5vPrc+/z9HPAOPP9c8H3xnfK98930/fYd8Ac9+F35ffqd+7383f39/x3wAD7xXvJ+8570vvXe9v74HvAJPvpe+378nv2+/t7//vEf8AI/81/0f/Wf9r/33/j/+h/wCz/8X/1//p//v/DQ8fDzEPAEMPVQ9nD3kPiw+dD68PwQ8A0w/lD/cPCR8bHy0fPx9RHwBjH3Ufhx+ZH6sfvR/PH+EfAPMfBS8XLykvOy9NL18vcS8Agy+VL6cvuS/LL90v7y8BPwATPyU/Nz9JP1s/bT9/P5E/AKM/tT/HP9k/6z/9Pw9PIU8AM09FT1dPaU97T41Pn0+xTwDDT9VP50/5TwtfHV8vX0FfAFNc</item>
<item name="ProgramNo">QkFSWgAABHMCAERvANz///8RDyMPNQ9HD1kPaw8AfQ+PD6EPsw/FD9cP6Q/7DwANHx8fMR9DH1UfZx95H4sfAJ0frx/BH9Mf5R/3HwkvGy8ALS8/L1EvYy91L4cvmS+rLwC9L88v4S/zLwU/Fz8pPzs/AE0/Xz9xP4M/lT+nP7k/yz8A3T/vPwFPE08lTzdPSU9bTwBtT39PkU+jT7VPx0/ZT+tPAP1PD18hXzNfRV9XX2lfe18AjV+fX7Ffw1/VX+df+V8LbwAdby9vQW9Tb2Vvd2+Jb5tvAK1vv2/Rb+Nv9W8Hfxl/K38APX9Pf2F/c3+Ff5d/qX+7fwDNf99/8X8DjxWPJ485j0uPAF2Pb4+Bj5OPpY+3j8mP248A7Y//jxGfI581n0efWZ9rnwB9n4+foZ+zn8Wf15/pn/ufAA2vH68xr0OvVa9nr3mvi68Ana+vr8Gv06/lr/evCb8bvwAtvz+/Ub9jv3W/h7+Zv6u/AL2/z7/hv/O/Bc8XzynPO88ATc9fz3HPg8+Vz6fPuc/LzwDdz+/PAd8T3yXfN99J31vfAG3ff9+R36Pftd/H39nf698A/d8P7yHvM+9F71fvae977wCN75/vse/D79Xv5+/57wv/AB3/L/9B/1P/Zf93/4n/m/8Arf+//9H/4//1/wcPGQ8rDwA9D08PYQ9zD4UPlw+pD7sPAM0P3w/xDwMfFR8nHzkfSx8AXR9vH4Efkx+lH7cfyR/bHwDtH/8fES8jLzUvRy9ZL2svAH0vjy+hL7MvxS/XL+kv+y8ADT8fPzE/Qz9VP2c/eT+LPwCdP68/wT/TP+U/9z8JTxtPAC1PP09RT2NPdU+HT5lPq08AvU/PT+FP808FXxdfKV87XwBNX19fcV+DX5Vfp1+5X8tfAN1f718BbxNvJW83b0lvW28AbW9/b5Fvo2+1b8dv2W/rbwD9bw9/IX8zf0V/V39pf3t/AI1/n3+xf8N/1X/nf/l/C48AHY8vj0GPU49lj3ePiY+bjwCtj7+P0Y/jj/WPB58ZnyufAD2fT59hn3OfhZ+Xn6mfu58AzZ/fn/GfA68VryevOa9LrwBdr2+vga+Tr6Wvt6/Jr9uvAO2v/68RvyO/Nb9Hv1m/a78Afb+Pv6G/s7/Fv9e/6b/7vwANzx/PMc9Dz1XPZ895z4vPAJ3Pr8/Bz9PP5c/3zwnfG98ALd8/31HfY99134ffmd+r3wC938/f4d/z3wXvF+8p7zvvAE3vX+9x74Pvle+n77nvy+8A3e/v7wH/E/8l/zf/Sf9b/wBt/3//kf+j/7X/x//Z/+v/AP3/Dw8hDzMPRQ9XD2kPew8AjQ+fD7EPww/VD+cP+Q8LHwAdHy8fQR9TH2Ufdx+JH5sfAK0fvx/RH+Mf9R8HLxkvKy8APS9PL2Evcy+FL5cvqS+7LwDNL98v8S8DPxU/Jz85P0s/AF0/bz+BP5M/pT+3P8k/2z8A7T//PxFPI081T0dPWU9rTwB9T49PoU+zT8VP10/pT/tPAA1fH18xX0NfVV9nUwA</item>
... snipped
<item name="Fluor1">QkFSWgAAFg0CAFYJ+xwg7vGsP1qIWb738CFAHegc//CsnT/u9cqGyQ8A/PcbfVgeAas/qoOpJwDu/P9SgVE/ACFAHmuwHUcArR0GwX9WAWYUD2l9bgFcD9l7hgFmdA9Jep4BLA8pd7YBZqQPmXXOAYwP0XTmAWa8D0Fz/gHsD7FxFhFmBB9Zby4RHB8BbUYRZjQfqWpeEUwfGWl2EWZkH8FmjhHUD2lkphFmfB8RYr4RlB+5X9YRZqwf8V7uEdwfmVwGIWb0H0FaHiEML+lXNiFmxB+RVU4hPC9xUmYhZiQv4VB+IVQviU6WIWZsL2lLriGcL0lIxiFmtC+5Rt4hzC+ZQ/YhMoQvQQ4y/C8hPiYy/f4zkTw+MRQ/cTlWMUQ/M+E3bjFcP4k1hjF0PzMxM54xLD/ZMLYxpD8zSS/OMYw/KSzmMbw/M5kq/jHsP0EoFkHUPzPpJS5BHE+RI0ZBBE8zASJeQUxPqR92QWRPMxkejkF8TzEapkGUT5p3Ehk0TxEX1kFED/GZE+5BrE+ZEQZRxE9BmQ8eUfRPsQ02USRfWZkLTlE8X3EHZlEMXxmZBX5RbF+JA5ZRhF/BuQKuUZxfof+gx1AgxsVP/hDfUMxXrgb6KJz3UMxfmfiYD2D8X7Fz9LAnYBRvIfMgP2HmTU/wAFdgLG9x7nCcb2Bcb6ntqIdgRG+Jc+qIn2B0bzHoMLdgzqRvoeagz2CMb0nkOUjnYNRv8eHw/2Dsb+eZ35gXcLxvCd4InC9wHH/p2uhHcDR/kXPYkF9wTH9x1XB3cJkgRQZ2JtPgj3Bkf8Gb0MCncCBA7vW+Vs05oL9wlH8RzBDXcMR/5ynIKO9wBH+ZxpicB4Dcf3nDeB+A9H8hg8EgN4EtbzeE7vXu9TlzvThngDyP4brgf4DOJI+JuIiXgGyP+bY5+K+AhI+htKDHgLSP50mySN+AnI/xr/Cc94Dkj9Gs0A+QVI95c6p4J5Csf+mo6D+QzvyPyaXIV5BEn3GjOXBvkCyf4aHgh5DMj+fBnsCfkHSfMZ0wnLeQjJ9JmUjPkBSfuXOXuOeQXJ9hlWD/kM7snwmTCBegpJ95kTl4L6C8nyGPIEehFQ7nAYwAX6BMr6mJqJx3oByvGYgYj6Bkr8FzhcCnoJSvaYNov6DOrK9JgEjXoNSf8H3M7qF8r2B8BrHErwh6zB6xDL+wdzaxJL8gdsxOshUOyHNmsTSvcHHMfrFUvxhvlrFsv8BszK6xnL+gacax3K8QaMzesYS/8GT2seS/mGLMDsH8v0BgJsG0v+hdzD7BFM/IWlbB9K9wWMxuwUTPqFeGwXTPiFTMnsGMzzBStsEsz2hRzM7BpM9ITubBvM+4TMz+wVzPYEoW0QTfCEjMLtEc33hGRtE031hDzF7R1M8AQXbRZN+oPsyO0UzfUDym0XzfwDrMvtGs32g41tHE30g1zO7RlN8oMgbh3N/QL8we4QzveC024STvICtkTuJl3yhm4fTfcCZ+4WZU7xgkluGE74giruFmnO8wIMbh7M8QHd7hZmzvuBr24eTvYBgO8WbM79AWJvEU/7ATPvFmtO+QEFbxRP9wDW7xZlz/4AuG8XT/wAie8cj874b/oNQKzvG8/zAHzObx1P/YBP7x7P+AAlwWAQQPoPufLwAfHQ+sLw8Bs/VfXwAfrX6w+viB8EwPAOz/6/+Z63wHdrbqb6cAZA+gc+KfvwCsD4Dff9cAznwPQNk/7wDcD5DUOY8HEPQP4M/fHxDED+cwyy83EAwfgMZ/nE8QPB9gw19nEJQPsPO+r38QVB8Auv+5O/+5hB9QtU+vEJwf5xCvD8cQtB/QqM/s3xAkGJAa7xCqP7CTpa/3EMwfAMZS/B/gc53fJyAULzCZLz8gzmwf8JLvVyAsL9CPOc9vIFwvkImPhyB0L+fghN+fIIwvMIAvmLcgJBdGVQ99ziGkL1+ZeOYh1C8fcv4irX7fmWsWMewvn2UuMQQ/
f5liRjEcP89dXjFMPx+ZWXYyZa7fUo4xfD8vmU6mMZQ/70e+MTQ/P5lD1jGsPx9A7jN2bW+ZOwZB9D+/Nh5BDE/vmS42QSRPPypOQTxPj5klZkFUT28ifkHEPy+ZHJZBbE/vFa5BhE8/mRHGQZxP/wreQcxPv5kE9kHkHw8ADlG0T1/r+14nUB6tfh/1Hsw/UW1P8G5XUCxfL+o5Lm9QXF+f6J6HUERYzu8Uz+DOn1B0Xz/fOT63UERf/9j+z1CkX+dP1E7nUNRfn8+eHP9Q7F/vyu4XYLxX3mXnz8fOL2C8X6/ErpxHYDRvb75uX2BMb99zvN53YBxvD7UOj2DOBG9fsF6nYJRvH6o5Hr9hZa5Pok7XYIxf5y+fLu9grG/vmO6cB3DEb6+Srh9wDH9vc4xuN3Akf9+K3k9wzjx/n4SeZ3B8b15+zH5x9G+ueZZxhH/+dMyucZx/3nHGcbR/DmrM3nFUf85j9nHkf65gzA6BzH/+WyaBFI9OV8w+gfx/LlRWgSyPzkrMboFcj65HhoF0j45EzJ6BjI9OPraBRI8OOMzOgaSPXjPmgWx/ri7M/oG8j24oFpHUj94mzC6RHJ8OH0aRNJ9+HcxekQSfXhp2kUyfrhWYjpF8mO8U/hCmkXyfvpkKvpGUn34E1pGsn165Ae6R3J+u/K0HoB3WBZ/2bR+gHTWf7Q0cN6AMr37rfU+gPKjvFOc+5T1noDyvjuCNnH+gJK/e292XoISvvnPYva+gbK9+0n3HoHKcr15eMsyvHskd96HOrQ5uxG0PsLSvvr85vSew/K9+uX0/sBS/5z6zPVewRL+Oro2cb7Bcv76mvYewLL8Oc6INn7B0v16dXbewzoy/jpWNz7Ckv96QOd3nsNS/noqd/7Dsv0dehF0XwLy/X5J+HM8zbXlGwTTPLXNewUzPM31udsFkzz1ojsF8z5MdZabCNc9gvsEEz00xXdbBrMjvFA1X7sGszzPNUAbR9M89Tx7RxM8z/Ug20QzfTURO0TzfmZ1l0lTf7Tp+0dzPzZk3ltFs3x0zrtKtzy7MxtG0370p3tEk3w0lzPbRhN/NHg7hzN+tG8wm4eTfbRU+4RTvnQ3MVuFE7+0IbuFc760CXIbhdO9t/Gyf4BzN33P3vLfgpO8N8wzP4By86O8UPes85+C8724TztTvTeBMF/Ds753bOZwv8AT/XdVcR/Ac/9fN08xf8Bw1Do3NcYx38GT47xTdyNyP8M5M/y3ELKfwZP99vzl8v/CU/826zNfwrP/njbSM7/DE/22xbJwHAPT/LassHwAMD11zo1w3ADT/HZ0cTwHOrQ7dltxnAFQPvZM5vH8AbA+dkJyXAIQP512KXK8AnA89hzycxwA8D/2A/N8AzA8smXn2Adz/fHQOEbQPPBluJhEUH4xpPhEsF74lM0xjVhEsH3xbbhFcHzNcWIYRdB+sU54RjB8z3Eu2EaQfLEfOEbwfk+xA5hHUHzz+Enz/jJk3FiHsH2w0LiEcL0yZMUYhNC98KV4hTC9cYSZ2IWQvviPkD/wcpiE2fC9MGL4hHEB+VcYWmRXWIZQvzAvuIawvrJkIBjHcL2wCHjH0Lyxr/Cs3MBtlL/RbTzA4PD9OP4BT/Or7fzBsP+dM5kuXMIQ/LOMrnK8wnD/s3OvHMLQ/PHPYO98wzD+M04v3MM7kP0zNSw9A/D+cyHWbJ0AbxS58xXs/QJwUSM+UrL2rV0BET/xzuPtvQFxPLLErh0CdG1V+DK4Ln0B0T8xzp8u3QBRPjKGLz0DOpE9snmvnQIxPvJk5u/9AvE8MlQsXUARf51yQWy9QHF88jTuYR1AsSJ8lhvtfUDRfvDOAu3dQZF8LfI5R1E8zW3emUXxfG3G+Uu1eM2ts1lHEXytm7lHcXzN7YQZh9F87Wx5hDG8zi1Y2YSRv21FOYSxPM5tLZmE8b8tDfmFsbzM7QpZhhG/7O65hnG8zSzfGYbRvKzTeYZRfM+st9mHMb1ssDnHkbzOL
JCZx/G/bHz5xFH8zKxtWcexP6xRucSx/MzsQhnF0f4sLnnGMf3MrArZxTF8L/wrPcM0adX/3Oudw1H+r9Tmq/3Dsf/vw+heABI/HK+kqL4AciMd15gqcR4AcjzveOl+ATI8bc9sad4HFnrvRuo+AmGSPjo7Hdc0Kv4Csj8tzxsrXgMSPG8Ia74HmxV+9ageQ3I+7uLqcH5AMn3uyejeQ9I8766w6T5AaTV7botqcZ5Akn7ufun+QPJ9bc5Zal5CEn6uRqq+Qzpyf+4z6x5Bsn0uIOUrfkLSfm4Oa95DknzPqfg6h/J/KeyahFK8z+nM+oSyvSm9WocyfM5pqbqFErwpphqFcr6M6YZ6hdK+vAlykr7qZVM6hjK9aS+ahvK8aqUX+oi2vQRaxGu2+upk7LrEEv3o1RrHUrzqZL16xVJ9qJ3axZL+6mSKOsTS/Ch6msUy/qpkUvrGUvxoT1rIlntqZDO6xfL+6CQbB3L96uQMewcS/Ov05N8AZxqWv+IlPwDzIL5RK8jlJZ8BUzyrvKX/APM/Xeup5l8AZxS7K5TnJr8CEz4rficfAtM/Mxs/IpNe598Dkz5rUOZkP0PzPWs5ZJ9AU3+eKxok/0Czf2sHZ5FfQbM8KBsJE31q1WZyH0FzfGq8Zn9FVz6c5SbfQdN8qpCnP0Lzf5+qd6efQ1N+ql6mc/9Ck3xqWGRfhTV5Kc45JL+Ds3yqLKUfgrjTvCogJX+AZ3f7KM4HJd+BM7xl9juIlnjNpeKbhfO9Jdb7hrO8zCW/W4cTv6Wvu4dzvMzlnBvH074liHvGU7zPZXTbxHO+5Wk7xDP8z6VJm8VT/OU5+8Wz/M2lGlvGE/7lBrvGc/zPpOcbxJP+pM97xnM8ziTD28cz/KScOATz/M+kgJgG0/zkcPgH8/zPZElYB5P8pDm4BRA8zWQaGASwPOQOeAYwP12n7aLcAGCWeSfg5SM8AvA8J8gjnANQIOM/l5wROXVntWBcQ7A/nqeioLxDUD/nj+JxHEBwfKdwoXxA0Hwnj2Qh3EAQYH0RZ1FicjxBkH6nPqKcQBB/5c8r4vxBMH0nGSNcQzpQfmcGY7xDEH8m5OcgHIKwf+bH4HyAML+fZrtg3INwfuau4nE8gPC95pXhnIPQfWXOiWH8gbC+JmoiXIM5UL0mUSK8gnC8JjjkIxyC0L1mJWN8gzC9nGYMY9yElnrh5DjFmJC94cyYx/C/Ibj4xZiw/GGpWMhVOSGJuMWZEPyhfhjF0P3hanjFmXD/IVbYxjD8YUc4yZiWeSEnmMdQ/CEP+MWbsP8g8FkEUP4g2LkFmHE9IMEZBBE8IKl5BZkxPWCV2QaQ/iB2OQWY0T0gXpkF8T5gSvkFmrE84CdZBZE8YBu5CrjUe2P/XB1AXlU/+NUcfUBeNP/Z3N1AkX8dY81dPUDxY75So7jmnZ1A8X4jrh39QDFjmZzXlR5dQVF943Xecr1CcXzjXN8dQtF/4c9D331DMX7jKt/dQzgxfCMYHD2BsX+jCOecnYBRvGLsXP2FtX3O5h1dg5F+4sbdvYM5cb5iul4dg/F94qzl3n2BEb+ip57dgpG/nOKU3z2C8b2idZ5znYIxvuJi3/2B0bwhzlAcXcNRvyI3HL3DOHH84jDdHcDR/aISZZ19wTH8nfnZxZH9XmXaOcXx/N3OmcZR/15lpvnEEf7dm1nGEX3eZYO5x7G83WgaBrH+HmVUegcR/R082gfR/B5lJToEMjzdBZoFUj6fMfYI8j0c2loFsj7c0zK6BnI/nLMaBJI+nJszegbSPZyD2geSPJxrMDpH8j1cSJpEUnxcMZD6SrX8FVpHcfycBbpGuRJ9X+VaHkBZtj/Rxpp+QjJjvlPfv9reQzoyf1+zWz5CknyfouSbnkLyfl+aW/5AWzg0uV+BWF6Dsn6fbOaYvoByv99b2R6DUn+e30LZfoASvd8p2nHegTK9Xx1aPoDSvN3PENqegfK9nvGa/oM6Ur9e61tegZK8ntjkm
76Csr3exdgew3K+Hx6zGH7DErx61rh2Hc6aGT7Akv/ek9mewzlS/J50mf7A8v5ebOZaXsGy/x5PGr7Ccv2cXjxbHsVW/i/bfsM7Mv7eFtvewhL92f8wOweS/xnomwfy/hnTMPsEUz7ZsVsEsz5ZpzG7BRM/GYYbBXM+mXsyewXTPZli2wYzP1lbMzsGkz7ZT5sG8z+ZLKP7B1M/XEkgE32Yt0hzfZC7f+MQyXtFM33Y0dtGGZN8WK47Sfd+P32EiRpkjvtHsz7Yh1tGs35YxHu7RxNh/ZDYVBuH0rzMWEh7iDS5GCjbhJO9zJgdO4Qzvxv3FZ+CdFQ0uFvkVf+E1HgbD6wWX4Gzvlu+YxOGlHMfgtO/23PXf4ITof2TnltOV9+Dk73bQdZwP8PzvNso1J/AU/2ZzwmU/8MzvJrwlV/DORP9WtFVv8ITvNrE5NYfwdP+2pLWf8Iz/nJ71rh02mDXP8Lz/pnOWpefwpP9GjUX/8M7U/5aIlRcABA9WgrlVLwAcBxrIGT8AqWc/pX1GAez/9XheAcYcDyVxdgFkCH9ktXLMjgFkD1VppgGUDzVmzL4BTA8VY9YCNR4FbszuAawP5WoGEXwPhWHMHhH0D0VbNhEkH3VTzE4RDB+lS2YRPB9lRcx+EWwf1UOWEYQftUAMrhHcD5U9xhG0H8YfYCMzxTX2EZwfNTQOIfwfQ0UpJiEULyYvABQfViFm5B9VHm4hXC+FFoYhZkQv1RGeIYwvBQq2IWakL+UGziF0LzUC5iGuLP/V+NT/IBR9D/UxtBcwBDj/Be3kLzAcP+c16TRHMDQ/9eL0Ol8wFCVetdy0dzAEP+fV39SPMGQ/5dTkcKcwlD+mNc7tFc0U1zGYxT/XNGY+w7QHQPQ/5VO75B9A3D+loPG1JE/nRaxET0A8TyWpJJxnQAxPVaFUf0BUTzVznjSXQGxPFZsUr0HGTQ+X9MdAtEh/ZEWTOUTfQIRPBY0E90DkT+dViFQPUPxPNYU0nCdQfD+FgIQ/UBRftGF4VlFEX1ZVdm00bIZRZlxf9GWeUbRPlFy2UWaMX8RUzlF0XxRQ5lJmZd5ESP5R7F8kRRZhZgRvVD0uYbxfxDtGYWY0bzQ6XmFMb4Q1dmFmZG9kMo5hfG+UKqZhZpRv5CW+YaxvFB7WYmZdniQT7mEcb1QLBnHm9G+kBh5xxG9E/UM6N3ATZd4E9wNPcDx/16Tto2dwE8Vv7BOcf3HN7tTl05dwVH9Ew+RDr3CEf65/cKN03Dlz33Ccf8TXw/dwzHjO7+TUzNMPgOR/RMs5QyeAFI9UwFM/gCyPzD6FZm64g2+AXI/UsznTh4B0j7Sws5+ARI/nBKwDt4Ckj5StkxzPgLyPhJ+D54CMj+aP5rBUlJMXkASfVI5TnC+Q1I/0hPNHkDSfs2F+XpFMn16dPgUTb46SZs3uM3KmkWSfY2q+kWaUn7Nl1pHEn+Nd7pFm3J8TVgaize6jVx6hZgyvQ042oSSvI0tOoWasn8NBZqE8rzNAfqFmbK9DNZahhK+TMK6hZpyv4yvGoVSvMyfeoQC0pwA</item>
... snipped
<item name="Gain6">QkFSWgAACOQCAEjgBu3z8D/u/wAPEg8kDzYPAEgPWg9sD34PkA+iD7QPxg8A2A/qD/wPDh8gHzIfRB9WHwBoH3ofjB+eH7Afwh/UH+YfAPgfCi8cLy4vQC9SL2Qvdi8AiC+aL6wvvi/QL+Iv9C8GPwAYPyo/PD9OP2A/cj+EP5Y/AKg/uj/MP94/8D8CTxRPJk8AOE9KT1xPbk+AT5JPpE+2TwDIT9pP7E/+TxBfIl80X0ZfAFhfal98X45foF+yX8Rf1l8A6F/6XwxvHm8wb0JvVG9mbwB4b4pvnG+ub8Bv0m/kb/ZvAAh/Gn8sfz5/UH9if3R/hn8AmH+qf7x/zn/gf/J/BI8WjwAojzqPTI9ej3CPgo+Uj6aPALiPyo/cj+6PAJ8SnySfNp8ASJ9an2yffp+Qn6KftJ/GnwDYn+qf/J8OryCvMq9Er1avAGiveq+Mr56vsK/Cr9Sv5q8A+K8Kvxy/Lr9Av1K/ZL92vwCIv5q/rL++v9C/4r/0vwbPABjPKs88z07PYM9yz4TPls8AqM+6z8zP3s/wzwLfFN8m3wA430rfXN9u34Dfkt+k37bfAMjf2t/s3/7fEO8i7zTvRu8AWO9q73zvju+g77LvxO/W7wDo7/rvDP8e/zD/Qv9U/2b/AHj/iv+c/67/wP/S/+T/9v8ACA8aDywPPg9QD2IPdA+GDwCYD6oPvA/OD+AP8g8EHxYfACgfOh9MH14fcB+CH5Qfph8AuB/KH9wf7h8ALxIvJC82LwBIL1ovbC9+L5Avoi+0L8YvANgv6i/8Lw4/ID8yP0Q/Vj8AaD96P4w/nj+wP8I/1D/mPwD4PwpPHE8uT0BPUk9kT3ZPAIhPmk+sT75P0E/iT/RPBl8AGF8qXzxfTl9gX3JfhF+WXwCoX7pfzF/eX/BfAm8UbyZvADhvSm9cb25vgG+Sb6Rvtm8AyG/ab+xv/m8QfyJ/NH9GfwBYf2p/fH+Of6B/sn/Ef9Z/AOh/+n8Mjx6PMI9Cj1SPZo8AeI+Kj5yPro/Aj9KP5I/2jwAInxqfLJ8+n1CfYp90n4afAJifqp+8n86f4J/ynwSvFq8AKK86r0yvXq9wr4KvlK+mrwC4r8qv3K/urwC/Er8kvza/AEi/Wr9sv36/kL+iv7S/xr8A2L/qv/y/Ds8gzzLPRM9WzwBoz3rPjM+ez7DPws/Uz+bPAPjPCt8c3y7fQN9S32Tfdt8AiN+a36zfvt/Q3+Lf9N8G7wAY7yrvPO9O72Dvcu+E75bvAKjvuu/M797v8O8C/xT/Jv8AOP9K/1z/bv+A/5L/pP+2/wDI/9r/7P/+/xAPIg80D0YPAFgPag98D44PoA+yD8QP1g8A6A/6DwwfHh8wH0IfVB9mHwB4H4ofnB+uH8Af0h/kH/YfAAgvGi8sLz4vUC9iL3Qvhi8AmC+qL7wvzi/gL/IvBD8WPwAoPzo/TD9eP3A/gj+UP6Y/ALg/yj/cP+4/AE8STyRPNk8ASE9aT2xPfk+QT6JPtE/GTwDYT+pP/E8OXyBfMl9EX1ZfAGhfel+MX55fsF/CX9Rf5l8A+F8KbxxvLm9Ab1JvZG92bwCIb5pvrG++b9Bv4m/0bwZ/ABh/Kn88f05/YH9yf4R/ln8AqH+6f8x/3n/wfwKPFI8mjwA4j0qPXI9uj4CPko+kj7aPAMiP2o/sj/6PEJ8inzSfRp8AWJ9qn3yfjp+gn7KfxJ/WnwDon/qfDK8erzCvQq9Ur2avAHiviq+cr66vwK/Sr+Sv9q8ACL8avyy/Pr9Qv2K/dL+GvwCYv6q/vL/Ov+C/8r8EzxbPACjPOs9Mz17PcM+Cz5TPps8AuM/Kz9zP7s8A3xLfJN823wBI31rfbN9+35Dfot+038bfANjf6t/83w7vIO8y70TvVu8AaO9674zvnu+w78Lv1O/m7wD47wr/HP8u/0D/Uv9k/3b/AIj/mv+s/77/0P/i//T/Bg8AGA8qDzwPTg9gD3IPh
A+WDwCoD7oPzA/eD/APAh8UHyYfADgfSh9cH24fgB+SH6Qfth8AyB/aH+wf/h8QLyIvNC9GLwBYL2ovfC+OL6Avsi/EL9YvAOgv+i8MPx4/MD9CP1Q/Zj8AeD+KP5w/rj/AP9I/5D/2PwAITxpPLE8+T1BPYk90T4ZPAJhPqk+8T85P4E/yTwRfFl8AKF86X0xfXl9wX4JflF+mXwC4X8pf3F/uXwBvEm8kbzZvAEhvWm9sb35vkG+ib7Rvxm8A2G/qb/xvDn8gfzJ/RH9WfwBof3p/jH+ef7B/wn/Uf+Z/APh/Co8cjy6PQI9Sj2SPdo8AiI+aj6yPvo/Qj+KP9I8GnwAYnyqfPJ9On2Cfcp+En5afAKifup/Mn96f8J8CrxSvJq8AOK9Kr1yvbq+Ar5KvpK+2rwDIr9qv7K/+rxC/Ir80v0a/AFi/ar98v46/oL+yv8S/1r8A6L/6vwzPHs8wz0LPVM9mzwB4z4rPnM+uz8DP0s/kz/bPAAjfGt8s3z7fUN9i33Tfht8AmN+q37zfzt/g3/LfBO8W7wAo7zrvTO9e73Dvgu+U76bvALjvyu/c7+7vAP8S/yT/Nv8ASP9a/2z/fv+Q/6L/tP/G/wDY/+r//P8ODyAPMg9ED1YPAGgPeg+MD54PsA/CD9QP5g8A+A8KHxwfLh9AH1IfZB92HwCIH5ofrB++H9Af4h/0HwYvABgvKi88L04vYC9yL4Qvli8AqC+6L8wv3i/wLwI/FD8mPwA4P0o/XD9uP4A/kj+kP7Y/AMg/2j/sP/4/EE8iTzRPRk8AWE9qT3xPjk+gT7JPxE/WTwDoT/pPDF8eXzBfQl9UX2ZfAHhfil+cX65fwF/SX+Rf9l8ACG8abyxvPm9Qb2JvdG+GbwCYb6pvvG/Ob+Bv8m8EfxZ/ACh/On9Mf15/cH+Cf5R/pn8AuH/Kf9x/7n8AjxKPJI82jwBIj1qPbI9+j5CPoo+0j8aPANiP6o/8jw6fIJ8yn0SfVp8AaJ96n4yfnp+wn8Kf1J/mnwD4nwqvHK8ur0CvUq9kr3avAIivmq+sr76v0K/ioQ</item>
</list>
</obj>
This is what I have so far.
The document you're using is not from an actual PCR run, as inferred from the readable data. It is a color compensation run (short overview that seems to match the file) (full updated manual, page 250, not as fitting). Specifically, it seems to be a color compensation run for the "FAM/Pulsar 650" dye.
The output type, as you point out, is this "AcquisitionTable" with 2400 "counts" which must be different, I believe, from output you would normally get from a PCR run. I'm sure you've found these already, but a few public examples of PCR templates (not completed runs) are here, here, here and here.
According to the LCRunProgram in your file, the protocol here was:
hold 95°C for 0" at a speed of 20°C/s
hold 40°C for 30", 20°C/s
hold 95°C for 0" at 0.1°C/s, acquisition mode "2".
So, we're expecting that the acquisition timeframe lasted an estimated (95°C-40°C) / 0.1°C/s = 550 seconds, approximately; during which time, there should have been a fixed number of acquisition events per second.
EDIT 0 - this is what I had done at the beginning, so I'm not deleting it, but I got more interesting information later (see below).
I took a look at the data with a simple Python script (I'm a Python guy) to search for patterns. The script holds your data's initial strings in a dictionary called values, which would be too long to post here, so here it is in a gist, just as you had to do.
#!/usr/bin/env python3
import base64
from collections import OrderedDict, defaultdict
from values import values

def splitme(name, sep):
    splitted = base64.b64decode(values[name]+'==').split(sep)
    print("{:<12} [{}; {}] separated in {} chunks: {}".format(
        name,
        len(values[name]), len(base64.b64decode(values[name]+'==')),
        len(splitted),
        [len(i) for i in splitted]))
    return splitted

if __name__ == '__main__':
    allchunks = defaultdict(list)
    separator = b'\r'
    print("separating by:", separator)
    for key in values:
        data = splitme(key, sep=separator)
        for i, item in enumerate(data):
            allchunks[item].append((key, i))
    print("Common chunks:")
    for location in [value for item, value in allchunks.items() if len(value)>1]:
        print(location)
Let's get the obvious out of the way and say that ProgramNo and CycleNo hold the same data; and all Gain are identical. So I'll only post one of each.
Now trying the script with the separator b'\r' (just to try for one) cuts a few of them in chunks of 272 (271+separator) bytes. The others aren't tidy.
separating by: b'\r'
SampleNo [1536; 1152] separated in 5 chunks: [174, 271, 271, 271, 161]
ProgramNo [1531; 1148] separated in 6 chunks: [47, 271, 271, 271, 271, 12]
SegmentNo [1531; 1148] separated in 5 chunks: [169, 271, 271, 271, 162]
Separating by b'\t' gives similar results:
separating by: b'\t'
SampleNo [1536; 1152] separated in 5 chunks: [204, 271, 271, 271, 131]
ProgramNo [1531; 1148] separated in 5 chunks: [76, 271, 271, 271, 255]
SegmentNo [1531; 1148] separated in 5 chunks: [199, 271, 271, 271, 132]
And separating by b'\n' splits the gains this time, in a similar way:
separating by: b'\n'
Gain1 [3046; 2284] separated in 10 chunks: [81, 271, 271, 271, 271, 271, 271, 271, 271, 26]
So I am not at all implying that these "separators" are of any importance; I'm thinking that they are rare characters that appear to cut the data in 272-byte chunks, and this value, 272 bytes, might be important in understanding how this data is stored.
The beginning of each string "BARZ" seems like a "foo-bar" thing; probably set as check at the start of the header.
Another interesting thing is that the gains data separates into 8 equal-sized chunks (plus two other, smaller blocks). If this data is from a 96-well plate, I would start exploring whether this might be a header followed by 8 chunks (rows), each splittable into 12 items (columns), so that 8*12=96 replicates the layout of a 96-well plate.
Also, if this "272 bytes per line" hypothesis is true, then the data in ProgramNo, SampleNo etc that do split into 272-bytes chunks might be explained if the plate wasn't full, and some wells had samples (with a few complete lines) while others were empty. I'm not sure if this would make sense for a color compensation plate.
Time, Temperature, Error and Fluors do not separate into chunks and you are correct in thinking they are a set of continuous values; not necessarily floats though. Fluorescence can be captured as "units" which might be positive ints (I don't have a LightCycler so I don't know if it's the case or not).
And this is where I am so far. I'm not sure I'll have time to go further. In case I don't reply back, good luck with your endeavour.
EDIT 1:
So regarding the SampleNo data, it seems to be structured in this way:
1) a header, which might or might not be separated by 0x00 like:
* the BARZ header, then 2 times 0x00 (total 6 bytes)
* three bytes, then 0x00 (total 4 bytes)
* 17 bytes, then 0x00 (total 18 bytes)
2) a series of data, each of them comprised of 16 bytes and terminated by 0x00 (so 17 bytes each).
This means that SampleNo holds a header, plus 66 sets of 17 bytes.
EDIT 2:
Splitting everything by 0x00 with this awful piece of code:
def splitme(name):
    data = base64.b64decode(values[name]+'==')
    hit = 0
    index = 0
    countit = 0
    splits = []
    # collect the position of every 0x00 byte (capped at 500 iterations)
    while hit >= 0 and countit < 500:
        countit += 1
        hit = data[index+1:].find(0)
        index += hit+1
        if hit >= 0:
            splits.append(index)
    lastindex = -1
    splitted = []
    if splits:
        # cut the data at each recorded position, dropping the 0x00 itself
        for index in splits:
            splitted.append(data[lastindex+1:index])
            lastindex = index
    else:
        splitted = [data]
    return splitted
Yields:
separating by: 0x0
SampleNo [1536; 1152] separated in 70 chunks: [4, 0, 3, 17, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16]
ProgramNo [1531; 1148] separated in 71 chunks: [4, 0, 3, 2, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 12]
SegmentNo [1531; 1148] separated in 69 chunks: [4, 0, 3, 18, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16]
CycleNo [1531; 1148] separated in 71 chunks: [4, 0, 3, 2, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 12]
Time [11944; 8958] separated in 63 chunks: [4, 0, 3, 45, 14, 42, 76, 46, 172, 110, 109, 15, 81, 90, 111, 108, 78, 46, 175, 141, 88, 209, 74, 117, 156, 170, 59, 107, 78, 103, 125, 171, 103, 170, 191, 333, 154, 187, 11, 257, 149, 208, 173, 156, 153, 412, 72, 55, 207, 131, 131, 274, 284, 238, 19, 241, 247, 13, 74, 558, 763, 8, 0]
Temperature [6731; 5048] separated in 14 chunks: [4, 0, 3, 394, 186, 543, 177, 173, 530, 534, 371, 714, 373, 1032]
Error [398; 298] separated in 21 chunks: [4, 0, 3, 2, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 12]
Fluor1 [7539; 5654] separated in 38 chunks: [4, 0, 3, 31, 13, 7, 7, 426, 331, 218, 187, 11, 10, 13, 7, 6, 7, 48, 45, 217, 840, 6, 7, 14, 7, 6, 7, 7, 6, 1178, 8, 6, 1147, 7, 6, 141, 630, 2]
...
Gain1 [3046; 2284] separated in 145 chunks: [4, 0, 3, 9, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 8, 7, 16, 16, 16, 16]
...
So SampleNo, ProgramNo, SegmentNo, Error and the Gains all split in blocks of 17 bytes (16 bytes + 0x00).
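The chunking above amounts to splitting the decoded bytes on 0x00. Here is a minimal Java sketch of such a split, run on the first 16 Base64 characters of the SampleNo item from the question. The class and method names are illustrative, and unlike the script above this version also keeps the trailing chunk after the last separator.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper; names are illustrative only.
final class ZeroSplit {

    // Lengths of the runs between 0x00 bytes, including the trailing run.
    static List<Integer> chunkLengths(byte[] data) {
        List<Integer> lengths = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == 0) {
                lengths.add(i - start);
                start = i + 1;
            }
        }
        lengths.add(data.length - start); // chunk after the last separator
        return lengths;
    }

    public static void main(String[] args) {
        // First 16 Base64 characters of the SampleNo item quoted in the question.
        byte[] prefix = Base64.getDecoder().decode("QkFSWgAABHgCAER0");
        System.out.println(chunkLengths(prefix));
    }
}
```

For this 12-byte prefix the lengths come out as [4, 0, 3, 2]. The leading 4, 0, 3 agree with the listings above, and the last value is cut short only because the input is a prefix of the field.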
EDIT 3:
The first fifteen 17-byte chunks of ProgramNo (and its copy CycleNo) and Error are identical.
Just to clarify, the "chunks" I describe are what you describe as a series of number pairs, one of which increases by 0x12. The 0x00 that you mention is the separator between the chunks.
EDIT 4:
About Gain data, the link between my initial "272 bytes" blocks and the (16+0x00)-byte blocks, is that there's a repeating pattern of 16 blocks, 15 of them are "16+0x00" blocks and one last block has a 0x00 in the middle. So 17 bytes(=16+0x00) * 16 blocks = 272 bytes total for this repeat.
The whole string is built as follows: the "header" part, then 8 such repeats of 17bytes*16 blocks, and then four 17bytes blocks at the end. So on one side I was right about the 8 blocks, but apparently I was wrong when making the parallel with a 8x12 wells PCR plate. Here it's more like 8*16 (+4).
About Fluor etc. data, I don't have an answer but I'd try to strip the header and see if any (integer or float) compression algorithm can work on it... Compressed data would explain why you have different lengths for these fields.
This is what I found so far. (Some of it overlaps with what you already found)
The data is encoded in Base64, where the padding (=) is missing, so you will need to add that.
The first bytes identify the kind of data. The file I am looking at has DARZ/LARZ/FORM/Empty.
DARZ = Double[]
LARZ = Time? Haven't decoded this
FORM = Double[][] (has 96 DARZ fields), this is the only field where byte 6 is 01x
Empty = Just a bunch of 0-1-2
For the first three types the first four bytes thus identify the type.
byte 1-4 = TypeID
byte 5-8 = The size of the element (BigEndian)
byte 9-12 = Checksum?
byte 13 - 13+length = the actual data.
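A minimal Java sketch of the layout listed above, assuming the 12-byte header really is as guessed (4-byte type ID, 4-byte big-endian size, 4 bytes that may be a checksum). The class name FieldHeader and its members are made up for illustration, and the padding loop only restores the missing '=' characters before decoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical holder for one decoded field; the names are illustrative only.
final class FieldHeader {
    final String typeId;  // bytes 1-4, e.g. "DARZ"
    final int size;       // bytes 5-8, read as big-endian
    final int checksum;   // bytes 9-12, meaning unconfirmed
    final byte[] data;    // payload starting at byte 13

    private FieldHeader(String typeId, int size, int checksum, byte[] data) {
        this.typeId = typeId;
        this.size = size;
        this.checksum = checksum;
        this.data = data;
    }

    static FieldHeader parse(String base64) {
        // Re-add the '=' padding that the file leaves out.
        StringBuilder padded = new StringBuilder(base64);
        while (padded.length() % 4 != 0) {
            padded.append('=');
        }
        byte[] raw = Base64.getDecoder().decode(padded.toString());
        if (raw.length < 12) {
            throw new IllegalArgumentException("shorter than the assumed 12-byte header");
        }
        ByteBuffer buf = ByteBuffer.wrap(raw); // ByteBuffer is big-endian by default
        String typeId = new String(raw, 0, 4, StandardCharsets.US_ASCII);
        int size = buf.getInt(4);
        int checksum = buf.getInt(8);
        // Clamp so a wrong size guess cannot run past the end of the buffer.
        int end = (int) Math.min((long) raw.length, 12L + Math.max(size, 0));
        return new FieldHeader(typeId, size, checksum, Arrays.copyOfRange(raw, 12, end));
    }

    public static void main(String[] args) {
        byte[] raw = {'D', 'A', 'R', 'Z', 0, 0, 0, 4, 0, 0, 0, 0, 1, 2, 3, 4};
        String unpadded = Base64.getEncoder().withoutPadding().encodeToString(raw);
        FieldHeader h = parse(unpadded);
        System.out.println(h.typeId + " size=" + h.size + " data=" + Arrays.toString(h.data));
    }
}
```

If the size field turns out to count something other than payload bytes, only the clamp line needs to change.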
In my case, I needed to extract Fluoresence0 items that have the DARZ header.
Header DARZ (5 bytes including null terminator)
Null bytes (2 bytes)
Block size (1 byte)
Null bytes (2 bytes)
Array size (1 byte)
Array of Doubles / Float64 (8 bytes each one)
End mark (1 byte)
Using the HxD editor and its data inspector, it is possible to validate the values.
With that information, it is easy to parse the data using Python and hachoir:
class CycleFloat64(FieldSet):
    def createFields(self):
        yield CString(self, "DARZ Header")
        yield Bytes(self, "6 bytes", 0x6)
        yield Float64(self, "Value 1")
        yield Float64(self, "Value 2")
        yield Float64(self, "Value 3")
        yield Bytes(self, "1 byte", 0x1)

Retrieving bytes of String returns different results in ObjC than Java

I've got a string that I'm trying to convert to bytes in order to create an md5 hash in both ObjC and Java. For some reason, the bytes are different between the two languages.
Java
System.out.println(Arrays.toString(
("78b4a02fa139a2944f17b4edc22fb175:8907f3c4861140ad84e20c8e987eeae6").getBytes()));
Output:
[55, 56, 98, 52, 97, 48, 50, 102, 97, 49, 51, 57, 97, 50, 57, 52, 52, 102, 49, 55, 98, 52, 101, 100, 99, 50, 50, 102, 98, 49, 55, 53, 58, 56, 57, 48, 55, 102, 51, 99, 52, 56, 54, 49, 49, 52, 48, 97, 100, 56, 52, 101, 50, 48, 99, 56, 101, 57, 56, 55, 101, 101, 97, 101, 54]
ObjC
NSString *str = @"78b4a02fa139a2944f17b4edc22fb175:8907f3c4861140ad84e20c8e987eeae6";
NSData *bytes = [str dataUsingEncoding:NSISOLatin1StringEncoding allowLossyConversion:NO];
NSLog(@"%@", [bytes description]);
Output:
<37386234 61303266 61313339 61323934 34663137 62346564 63323266 62313735 3a383930 37663363 34383631 31343061 64383465 32306338 65393837 65656165 36>
I've tried using different charsets with no luck and can't think of any other reasons why the bytes would be different. Any ideas? I did notice that all of the byte values are different by some factor of 18 but am not sure what is causing it.
Actually, Java is printing in decimal, byte by byte. Obj C is printing in hex, integer by integer.
Referring this chart:
Dec Hex
55 37
56 38
98 62
...
You'll just have to find a way to output byte by byte in Obj C.
I don't know about Obj C, but if that NSLog function works similar to printf() in C, I'd start with that.
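To check from the Java side that the two outputs are the same bytes, one option is to print the Java bytes in hex instead. This is only a sketch; the class and method names (HexDump, toHex) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class HexDump {
    // Formats each byte as two lowercase hex digits, the way NSData's description does.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "78b4".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.toString(bytes)); // [55, 56, 98, 52]
        System.out.println(toHex(bytes));           // 37386234
    }
}
```

The second line matches the first group of the ObjC output, so the underlying bytes agree and only the notation differs.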
A code snippet from Apple
unsigned char aBuffer[20];
NSString *myString = @"Test string.";
const char *utfString = [myString UTF8String];
NSData *myData = [NSData dataWithBytes: utfString length: strlen(utfString)];
[myData getBytes:aBuffer length:20];
The change in bytes can be due to Hex representation. The above code shows how to convert the string to bytes and store the result in a buffer.

Java AES CBC Decryption First Block

I've got a big problem with AES cryptography between Java and C++ (CryptoPP to be specific). I was expecting it to be much easier than asymmetric cryptography, which I managed to solve earlier.
When I'm decrypting 48 bytes and the result is byte[] array of 38 bytes (size + code + hashOfCode), the last 22 bytes are decrypted properly and the first 16 are wrong.
try {
cipher = Cipher.getInstance("AES/CBC/PKCS5Padding", "BC");
byte[] key = { 107, -39, 87, -65, -1, -28, -85, -94, 105, 76, -94,
110, 48, 116, -115, 86 };
byte[] vector = { -94, 112, -23, 93, -112, -58, 18, 78, 1, 69, -92,
102, 33, -96, -94, 59 };
SecretKey aesKey = new SecretKeySpec(key, "AES");
byte[] message = { 32, -26, -72, 25, 63, 114, -58, -5, 4, 90, 54,
88, -28, 3, -72, 25, -54, -60, 17, -53, -27, -91, 34, -101,
-93, -3, -47, 47, -12, -35, -118, -122, -77, -7, -9, -123,
7, -66, 10, -93, -29, 4, -60, -102, 16, -57, -118, 94 };
IvParameterSpec aesVector = new IvParameterSpec(vector);
cipher.init(Cipher.DECRYPT_MODE, aesKey, aesVector);
byte[] wynik = cipher.doFinal(message);
Log.d("Solution here", "Solution");
for (byte i : wynik)
Log.d("Solution", "" + i);
} catch (Exception e) {
Log.d("ERROR", "TU");
e.printStackTrace();
}
Decrypted message, that I'm expecting to get is:
0 0 0 32 10 0 16 43 81 -71 118 90 86 -93 -24 -103 -9 -49 14 -29 -114 82 81 -7 -59 3 -77 87 -77 48 -92 -111 -125 -21 123 21 86 4
But what I'm getting is
28 127 -111 92 -75 26 18 103 79 13 -51 -60 -60 -44 18 126 -9 49 14 -29 -114 82 81 -7 -59 3 -77 87 -77 48 -92 -111 -125 -21 123 21 86 4
As you can see only last 22 bytes are the same.
I know that AES works with blocks and so I was thinking that maybe something with initialization vector is wrong (because only the first block is broken), but as you can see I'm setting vector in the way I think is OK.
And I have no idea why is it working that way. Any help will be really appreciated, cause I'm running out of time.
[EDIT]
I add the Cipher initialization. As you wrote, it is AES/CBC/PKCS5Padding.
On the CryptoPP/C++ side (that is in fact not my code, so I'd provide the least piece of information that I can find useful) there is:
CryptoPP::CBC_Mode< CryptoPP::AES>::Encryption m_aesEncryption;
CryptoPP::CBC_Mode< CryptoPP::AES>::Decryption m_aesDecryption;
QByteArray AESAlgorithmCBCMode::encrypt(const QByteArray& plain)
{
std::string encrypted;
try {
StringSource(reinterpret_cast<const byte*>(plain.data()), plain.length(), true,
new StreamTransformationFilter(m_aesEncryption,
new StringSink(encrypted)));
} catch (const CryptoPP::Exception& e) {
throw SymmetricAlgorithmException(e.what());
}
return QByteArray(encrypted.c_str(), encrypted.length());
}
QByteArray AESAlgorithmCBCMode::decrypt(const QByteArray& encrypted)
{
std::string plain;
try {
StringSource(reinterpret_cast<const byte*>(encrypted.data()), encrypted.length(), true,
new StreamTransformationFilter(m_aesDecryption,
new StringSink(plain)));
} catch (const CryptoPP::Exception& e) {
throw SymmetricAlgorithmException(e.what());
}
return QByteArray(plain.c_str(), plain.length());
}
Key and initialization vector are exactly the same (I checked).
The fun part is that is a part of a bigger communication protocol, and the previous message was encrypted and decrypted perfectly fine. And there were also zeros at the beginning.
The answer was provided in the question; that didn't change even after a clear comment that it should be posted as an answer.
This is said answer:
The point is that every time doFinal() is invoked, it resets the state of cipher. What you should do is store last block of message (encrypted for Decryptor and decrypted for Encryptor) that will be used next time as a new InitializationVector. Then init() with this new IV should be invoked. Naturally, different instances of Cipher for Encryption and Decryption should be provided.
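A rough Java sketch of that idea, assuming the peer keeps one CBC object alive across messages so that each message is chained onto the previous one. In CBC the chaining value is the last ciphertext block, so that is what gets stored as the next IV on both sides. The class name ChainedAesCbc is hypothetical, and the default JCE provider is used instead of "BC" to keep the example self-contained.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: carries the CBC chain across separate doFinal() calls.
final class ChainedAesCbc {
    private static final int BLOCK = 16;
    private final int mode;          // Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE
    private final SecretKeySpec key;
    private byte[] nextIv;           // IV to use for the next message

    ChainedAesCbc(int mode, byte[] keyBytes, byte[] initialIv) {
        this.mode = mode;
        this.key = new SecretKeySpec(keyBytes, "AES");
        this.nextIv = initialIv.clone();
    }

    byte[] process(byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, key, new IvParameterSpec(nextIv));
            byte[] output = cipher.doFinal(input);
            // The ciphertext is the output when encrypting and the input when decrypting.
            byte[] ciphertext = (mode == Cipher.ENCRYPT_MODE) ? output : input;
            nextIv = Arrays.copyOfRange(ciphertext, ciphertext.length - BLOCK, ciphertext.length);
            return output;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];   // demo key, all zeros
        byte[] iv = new byte[16];    // demo IV, all zeros
        ChainedAesCbc enc = new ChainedAesCbc(Cipher.ENCRYPT_MODE, key, iv);
        ChainedAesCbc dec = new ChainedAesCbc(Cipher.DECRYPT_MODE, key, iv);
        byte[] first = enc.process("first message".getBytes());
        byte[] second = enc.process("second message".getBytes());
        System.out.println(new String(dec.process(first)));
        System.out.println(new String(dec.process(second)));
    }
}
```

Decrypting the second message with the original IV instead would corrupt only its first 16 bytes, which is the symptom described in the question.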

byte[] to string and back to byte[]

I have a problem interpreting a file. The file is built as follows:
"name"-#-"date"-#-"author"-#-"signature"
The signature is a byte array. When I read the file back in, I parse it to a String and split it:
myFileInpuStream.read(fileContent);
String[] data = new String(fileContent).split("-#-");
If I look at the variable fileContent, I see that the bytes are all good.
But when i try to get the signature byte array:
byte[] signature= data[3].getBytes();
Sometimes I get wrong values of 63. I tried a few solutions with:
new String(fileContent, "UTF-8")
But no luck. Can someone help?
The signature does not have a fixed length, so I cannot hard-code it...
Some extra info:
Original signature:
[48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112,
75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8,
48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81,
51, 69]
filecontent(var after reading):
... 48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112,
75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8,
48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81,
51, 69]
signature (after split and getBytes()):
[48, 45, 2, 21, 0, -123, -3, -5, 63, 84, -86, 26, -124, 63, 75,
-10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]
You can't access data[4] because you have 4 Strings in your array. So you can access data from 0 to 3.
data[0] = name
data[1] = date
data[2] = author
data[3] = signature
The solution :
byte[] signature = data[3].getBytes();
Edit: I think I finally understand what you are doing.
You have four parts: name, date, author, signature. The name and author are strings, the date is a date and the signature is a hashed or encrypted array of bytes. You want to store them as text in a file, separated by -#-. To do this, you first need to convert each to a valid string. Name and author are already strings. Converting a date to string is easy. Converting an array of bytes to string is not easy.
You can use base64 encoding to convert a byte array to a string. Use javax.xml.bind.DatatypeConverter printBase64Binary() for encoding and javax.xml.bind.DatatypeConverter parseBase64Binary() for decoding.
For example, if you have a name denBelg, date 2013-03-19, author Virtlink and this signature:
30 2D 02 15 00 85 FD FB 8D 54 AA 1A 84 90 4B F6 FF C8 28 0D D2 06 78 C8 64 02 14
42 A4 F8 30 A8 65 39 38 14 7D E0 CF 85 49 60 4C AE 51 33 45
Then, after concatenation and base64 encoding of the signature, the resulting string became, for example:
denBelg-#-20130319-#-Virtlink-#-MC0CFQCF/fuNVKoahJBL9v/IKA3SBnjIZAIUQqT4MKhlOTgUfeDPhUlgTK5RM0U=
Later, when you split the string on -#- you can decode the base64 signature part and get back an array of bytes.
Note that when the name or author can include -#- in their name, they can mess up your code. For example, if I set name as den-#-Belg then your code would fail.
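The Base64 approach can be sketched as follows. This is an illustration only: it uses java.util.Base64 (available since Java 8) rather than the DatatypeConverter methods named above, because the javax.xml.bind module was removed from the JDK in Java 11. The class and method names (SignedRecord, join, signatureOf) are made up.

```java
import java.util.Arrays;
import java.util.Base64;

final class SignedRecord {
    static final String SEP = "-#-";

    // Builds one line of text; the signature is Base64-encoded so it stays valid text.
    static String join(String name, String date, String author, byte[] signature) {
        return name + SEP + date + SEP + author + SEP
                + Base64.getEncoder().encodeToString(signature);
    }

    // Splits the line and decodes the fourth field back into the original bytes.
    static byte[] signatureOf(String line) {
        String[] parts = line.split(SEP, -1);
        return Base64.getDecoder().decode(parts[3]);
    }

    public static void main(String[] args) {
        byte[] signature = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112};
        String line = join("denBelg", "20130319", "Virtlink", signature);
        System.out.println(line);
        System.out.println(Arrays.equals(signature, signatureOf(line))); // true
    }
}
```

The Base64 alphabet contains neither '-' nor '#', so the encoded signature can never contain the separator itself; the name and author fields still can.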
Original post:
Java's String.getBytes() uses the platform default encoding for the string. Encoding is the way string characters are mapped to bytes values. So, depending on the platform the resulting bytes may be different.
Fix the encoding to UTF-8 and read it with the same encoding, and your problems will go away.
byte[] signature = data[3].getBytes("UTF-8");
String sigdata = new String(signature, "UTF-8");
0-???����T�?��K���(
�?x�d??B��0�e98?}�υI`L�Q3E
Your example represents some garbled mess of characters (is it encrypted or something?), but the bytes you highlighted show the problem:
You start with a byte value of -115. The minus indicates it is a byte value above 0x7F, whose character representation highly depends on the encoding used. Let's assume extended US-ASCII, then your byte represents (according to this table) the character ì (with an accent). Now when you decode it the decoder (depending on the encoding you use) might not understand the byte value 0x8D and instead represents it with a question mark ?. Note that the question mark is US-ASCII character 63, and that's where your 63 came from.
So make sure you use your encodings consistently and don't rely on the system's default.
Also, never use string encoding to decode byte arrays that do not represent strings (e.g. hashes or other cryptographic content).
According to your comment you are trying to read encrypted data (which are bytes) and converting them to a string using a decoder? That will never work in any way you expect it to. After you've encrypted something you have an array of bytes which you should store as-is. When you read them back, you have to put the bytes through a decrypter to regain the unencrypted bytes. Only if those decrypted bytes represent a string, then you can use an encoding to decode the string.
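The origin of the 63 values can be reproduced in a few lines. This is a sketch under an assumption: US-ASCII stands in for the asker's platform charset because its behaviour is unambiguous, whereas the real default charset evidently rejected fewer byte values. RoundTrip and viaString are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RoundTrip {
    // Decodes bytes to a String and encodes them back with the same charset.
    static byte[] viaString(byte[] bytes, Charset charset) {
        return new String(bytes, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] sig = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84};
        // Bytes the charset cannot decode become U+FFFD, which is then encoded as '?' (63).
        System.out.println(Arrays.toString(viaString(sig, StandardCharsets.US_ASCII)));
        // prints [48, 45, 2, 21, 0, 63, 63, 63, 63, 84]
        // ISO-8859-1 maps every byte value to a character, so nothing is lost here.
        System.out.println(Arrays.toString(viaString(sig, StandardCharsets.ISO_8859_1)));
        // prints [48, 45, 2, 21, 0, -123, -3, -5, -115, 84]
    }
}
```

ISO-8859-1 happens to survive the round trip, but that is a property of that one charset; keeping binary data as bytes, or encoding it with something like Base64, avoids depending on it.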
You're making extra work for yourself by converting these bytes into Strings by hand. Why aren't you doing it using the classes intended for this?
// get the file /logs/access.log
Path path = FileSystems.getDefault().getPath("/logs", "access.log");
// open it, decoding UTF-8
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
// read a line of text, properly decoded
String line = reader.readLine();
Or, if you're in Java 6:
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("/logs/access.log"), "UTF-8"));
String line = reader.readLine();
Links:
Files.newBufferedReader
InputStreamReader
Sounds like an encoding issue to me.
First you need to know what encoding your file is using, and use that when reading the file.
Secondly, you say your signature is a byte array, but Java strings are always Unicode. If you want a different encoding (I'm guessing you want ASCII), you need to do getBytes("US-ASCII").
Of course, if your input was ascii, it would be strange that this could cause encoding issues.
