Byte alignment problem using Preon - java

Hello everybody :)
I am currently using preon for a spare time project, and I have encountered the following problem: I am trying to read a fixed length String with the following code:
#Bound int string_size;
#ByteAlign #BoundString(size = "string_size") my_string;
The file specification expects a variable padding, so that the next block's offset is a multiple of 4.
For example, if string_size = 5, then 3 null bytes will be added, and so on. I initially thought that the #ByteAlign annotation did exactly this, however, looking into the source code, I realized that it wasn't the case.
I tried to make this quick fix:
#If ("string_size % 4 == 2") #BoundList(size = "2", type = Byte.class) byte[] padding;
Sadly, Limbo doesn't seem to support the "%" operator. Is there a way around this?
(Also, where/how can I get the latest version?)
Thanks in advance.

Preon currently doesn't have a solution for your issue built-in. As you said, it's expression language doesn't have a modulo operator, and it looks like you could use one. You can however implement your own CodecDecorator, which is probably the thing you want to do. You could implement a CodecDecorator that inserts a Codec reading a couple of extrac bytes after it decoded the value.
The latest version of Preon is at Codehaus:
git://git.codehaus.org/preon.git
You could checkout the head, but there's also a separate branch called PREON-35 that has the bits for doing what is discussed over here.

Related

Using libsvm in Java for String classification

Looking around I was not able to find a good way to use libsvm with Java and I still have some open questions:
1) It is possible to use only libsvm or I have to use also weka? If any, what's the difference?
2) When using String type data how can I pass the training set as Strings? I was using matlab for a similar problem for proteins classification and there I just gave the strings to the machine without problem. Is there a way to do this in Java?
Here is an incomplete example of what I did in matlab (it works):
[~,posTrain] = fastaread('dataset/1.25.1.3_d1ilk__.pos-train.seq');
[~,posTest] = fastaread('dataset/1.25.1.3_d1ilk__.pos-test.seq');
trainKernel = spectrumKernel(trainData,k);
testKernel = spectrumKernel(testData,k);
trainKf =[(1:length(trainData))', trainKernel];
testKf = [(1:length(testData))', testKernel];
disp('custom');
model = libsvmtrain(trainLabel,trainKf,'-t 4');
[~, accuracy, ~] = libsvmpredict(testLabel,testKf,model)
As you can see I read the file in fasta format and feed them to libsvm but libsvm for java look like it wants something called Node that is made of double. What I did is to take byte[] from the String and then transform them into Double. Is it correct?
3) How to use a custom kernel? I've found this line of code
KernelManager.setCustomKernel(custom_kernel);
but with my libsvm.jar I don't find. Which lib do I have to use?
Sorry for the multiple questions, I hope you will give me a brief overview of what is going on here.
Thanks.
Please note that I've used LIBSVM for MATLAB, but not for Java. I can only really answer question 1, but hopefully this still helps:
It definitely is possible to use libsvm only, and the code is located here: https://www.csie.ntu.edu.tw/~cjlin/libsvm/. Note that jlibsvm is a port of libsvm, and it seems to be easier to use and more optimized for Java. As far as I can tell, weka just has a wrapper class that runs libsvm anyways (it even requires the libsvm.jar), though I mainly based it off of this: https://weka.wikispaces.com/LibSVM.

What language is this (think it's Java?), and how do I test (using a browser ide) the math is correct in it?

div(1, sum(1, exp(sum(div(5, product(100, .1)), -5))))
I'm using this in a Solr query, and want to verify that it is the same as :
Where x is 5.
Is this language Java?
If it is, why am I getting this output here:
http://ideone.com/LWYWtU
If it isn't, what language is this and how do I test it?
Thanks in advance for your help.
EDIT: To add more of the surrounding code, here is the full boost value I'm sending to Solr:
if(exists(query({!frange l=0 u=60 v=product(geodist(),0.621371)})),div(1, sum(1, exp(sum(div(product(5), product(100, .1)), -5)))),0)
The reason I think it might be Java is because in the docs, it says Most Java Math functions are now supported, including: and then lists the math functions I ended up using for code.
Solr is Java, but that's not relevant since this is a set of functions that Solr parses and evaluate itself (and not related to Java, except that the backing functions are implemented in Java).
As far as I can say from what you've mapped the functions correctly, as long as the 5 in product(5) is the same as X. You shouldn't need product there, as the value can be included in div directly as far as I can see.
A way to validate it would be to use debugQuery in Solr and see what the value is evaluated as, and then compare it to your own value. Remember that floating point evaluation can introduce a few uncertanities.

Which kind of representation can be '\r\x00\x00\x00' (if usually I have hexadecimal code:'\x0\x00\x00\x03')

I'm using a program (klee) that give me tests of c code.
I need to use the results in my program.
It is not readable information, but some of the solutions are hexadecimal data with the next format:
'\x0e\x00\x00\x00'
I have already asked about how to convert it into integer, and I found the solution.
I will have to introduce this kind of results in structs too, I will know the size but any about the fields or anything else about it.
I think I can solve this but now the problem is that sometimes you can obtain things like:
'\n\x00\x00\x00'= 13
or
'\r\x00\x00\x00' = 10
And I didn't found which kind of representation they use to convert it in readable information..
Apparently I could solve this in python with:
import struct
selection = struct.unpack('
I don't have any idea of pyton, and I would like found a solution in java or c.
Thanks very much
The value \n\r is used by Windows systems to indicate a newline - the \n moves to the new line, and \r moves the write pointer to the start of the line. I'm thinking that you might have had some character data containing a newline where each character was converted into a 32-bit integer value in big-endian format.
Hope this helps!

Is the Google Annotations Gallery useful in production code?

I could actually see a use for the Google Annotations Gallery in real code:
Stumble across code that somehow works
beyond all reason? Life's short. Mark
it with #Magic and move on:
#Magic
public static int negate(int n) {
return new Byte((byte) 0xFF).hashCode()
/ (int) (short) '\uFFFF' * ~0
* Character.digit ('0', 0) * n
* (Integer.MAX_VALUE * 2 + 1)
/ (Byte.MIN_VALUE >> 7) * (~1 | 1);
}
This is a serious question. Could this be used in an actual code review?
Quite. Well, not all of them, but many could be substitutes for longer comments.
That holds true for not too many of these annotations, but some (as in your example) could be handy.
It may be said that these annotations present the most common comments in a shorter and perhaps more readable way.
You can later process them, and add tresholds for, say, the number of #Magic annotations. If a project becomes too "magic", measures should be taken.
It would be easier to use comments with a key such as "MAGIC", then work with those. Hudson and Eclipse and other tools can count or mark those occurrences.
I can definitely see how the #CarbonFootprint would fit into several client's CSR policies, and the #WTF("comment") annotation would be really handy when you're working on a new project where you're not sure whether a certain piece of code actually is needed to work around some crazy bug/corner-condition or if it's just random, left-over crap that no one knew how to write better at the time.
FYI, Sonar seems to now include a better revision plugin.
Anyway, were you not to guess, i think the short project name is clear enough about this project's intentions : gag the annotations for what they can become when left free : an equivalent to the oh-so-y2k XML hell.
I guess some people may have missed the acronym and the date of the that Google Annotation Gallery (GAG) on April 1st... or maybe in some countries it's not a national day for jokes, or gags...

Tail -n 1000 in Java (Apache commons, etc)

I'm wondering if util code already exists to implement some/all of *NIX tail. I'd like to copy the last n lines of some file/reader to another file/reader, etc.
This seems like a good bet: Tailer Library. This implementation is based on it, but isn't the same. Neither implement a lookback to get the last 100 lines though. :(
You could take a look at this tail implementation in one of Heritrix's utility classes. I didn't write it but I wrote the code that uses it, works correctly as far as I can tell.
This is a UI app - you can look at the source though to see what it does (basically some threading & IO). Follow.
The "last n lines" is quite tricky to do with potentially variable width encodings etc.
I wrote a reverse line iterator in C# in response to another SO question. The code is all there, although it uses iterator blocks which aren't available in C# - you'd probably be better off passing the desired size into the method and getting it to build a list. (You can then convert the yield return statements in my code into list.add() calls.) You'll need to use a Java Charset instead of Encoding of course, and their APIs are slightly different too. Finally, you'll need to reverse the list when you're done.
This is all assuming you don't want to just read the whole file. If you don't mind doing that, you could use a circular buffer to keep "the last n lines at the moment", reading through until the end and returning the buffer afterwards. That would be much much simpler to implement, but will be much less efficient for very long files. It's easy to make that cope with any reader though, instead of just a few selected charsets over a stream (which my reverse iterator does).

Categories