I want to send some complex objects from a Java client to a C server via a TCP socket.
How can I do that?
Fundamentally the question is, "How to serialize/deserialize objects in a
language agnostic manner?" Specifically Java and C in your case. Since you'll
be sending this data over a network, it is also important to take care of network order/endianness issues.
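To make the byte-order point concrete, here is a minimal sketch (in Python, purely for illustration) of the same 32-bit integer laid out in network order and in little-endian order. The values are made up. Java's DataOutputStream writes multi-byte values big-endian, and a C receiver can convert with ntohl().

```python
import struct

# A 32-bit integer and a 64-bit float packed in network (big-endian) order.
# "!" selects network byte order with standard sizes and no padding.
payload = struct.pack("!id", 258, 1.5)

# The same integer in little-endian order, for comparison.
little = struct.pack("<i", 258)

print(payload[:4].hex())  # 00000102
print(little.hex())       # 02010000

# The receiver must unpack with the same format string.
value, ratio = struct.unpack("!id", payload)
```

If the two sides disagree on the order, 258 would be read back as 33619968, which is why the wire format should state the byte order explicitly.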
I assume you have access to both the client and the server. This means you
get to choose how to serialize the data. (If not, the answer is simple: write
to the spec of what the other side expects.)
Personally, I would use Protocol Buffers.
There are Java bindings
and C bindings.
If you don't like Protocol Buffers, there are other options like:
JSON (already mentioned)
YAML
Apache Thrift
XDR
roll your own
...
Write the fields of the Java objects to a string (perhaps JSON), send them via TCP, and have the C program read the string and use it to initialize new C variables on the other end.
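One detail this approach needs: TCP is a byte stream without message boundaries, so the receiver has to know where the string ends. A rough sketch, assuming a 4-byte big-endian length prefix in front of the JSON text (the prefix and the field names are my assumptions, not part of the suggestion above):

```python
import json
import struct

def frame(obj):
    """Serialize obj to UTF-8 JSON and prefix it with a 4-byte big-endian length."""
    body = json.dumps(obj).encode("utf-8")
    return struct.pack("!I", len(body)) + body

def unframe(data):
    """Return (object, remaining bytes) from a buffer holding at least one frame."""
    (length,) = struct.unpack("!I", data[:4])
    body = data[4:4 + length]
    return json.loads(body.decode("utf-8")), data[4 + length:]

# Hypothetical object fields, for illustration only.
message = frame({"id": 7, "name": "sensor", "values": [1.5, 2.5]})
decoded, rest = unframe(message)
```

The C side would read 4 bytes, convert the length with ntohl(), read that many bytes, and hand them to a JSON parser.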
This question is pretty old, but in case someone is still looking for a good solution, you can try the Protocol Buffers implementation mentioned in the previous answer by @Adam Liss: (developers.google.com/protocol-buffers/)
In short, you define the complex message types your protocol needs, and the tool generates C++/Java/Python code which can serialize and deserialize them.
For the same purpose in C, a research project at the Technische Universität München (TUM), Germany, has created a code generator that emits standard C and can be used in embedded C projects. It is fully compatible (with limitations due to C structs) with Google's protobuf implementation. For me this worked better than the C bindings because the generated code does not need to be linked against any library.
I had trouble getting the C bindings to work on the embedded systems I was working with, because they need to be linked with the support library.
This saved my (painful) day on my embedded project, which passes complex network data (requests and responses) between an embedded system and an Android app (Java) or a desktop app (C++/Qt).
We have the following scenario:
There is a program (already written) in Java which runs on a server (on the web). Let's call it JavaServerProgram. It takes user input, calculates stuff and finally generates a bunch of classes. Let's call these classes JavaClasses. All these classes are serializable to json.
There is a library (already written) in C# that contains many classes describing a tree-like data structure. Let's call it C#Data. Let's call the root class C#Root. All classes in C#Data are (de-)serializable to/from json.
The bunch of JavaClasses that JavaServerProgram outputs can be converted into a C#Root instance. We have a library written in C# for that which takes json representations of the JavaClasses as input and creates a C#Root instance. Let's call it C#Convert. This library will always be needed by another project; i.e. it can not be discontinued.
There is a program (already written) in C# that takes an instance of C#Root and performs some actions (like showing a GUI, modifying files, ...) on a client. Let's call it C#ClientRun.
The workflow should be like this:
JavaServerProgram runs on the server and outputs JavaClasses.
The JavaClasses are converted into a C#Root instance on the server.
C#ClientRun gets the C#Root instance as input and runs on the client.
The question is, how do we implement the whole thing?
Version A:
We use all already existing programs and libraries. That means:
We modify JavaServerProgram so that after creation of the JavaClasses, it serializes them into jsons and outputs them.
We write a C# program that takes the json representations of the JavaClasses from JavaServerProgram as input, uses C#Convert to create a C#Root instance, serializes the C#Root instance to json and outputs it.
Then, after JavaServerProgram has run, we run that C# program and finally send the resulting C#Root json to the client, where it will be deserialized into a C#Root instance and input into C#ClientRun.
Pro: We use existing code.
Con: We have overhead due to
the conversion being a separate program (it takes time and memory for the OS to manage it),
"media discontinuity": Instead of directly converting the JavaClasses into C#Root, we must serialize them to json (in Java) to be able to send them to the converter. (The converter does NOT deserialize them to JavaClasses, though. It processes the jsons directly.)
Version B:
We make a Java clone of C#Convert, i.e.
we duplicate all classes from C#Data in Java as well as their ability to be serialized to json,
we duplicate the conversion algorithm in Java but without using jsons in-between, i.e. we directly convert from JavaClasses to C#Root.
Then we extend JavaServerProgram to contain the above clone, i.e. after creation of the JavaClasses it converts them into a C#Root instance and serializes it to one json. Then we send that C#Root json to the client, where it will be deserialized into a C#Root instance and input into C#ClientRun.
Pro: We have no separate-program overhead and no "media discontinuity".
Con: We need to maintain the C#Data classes and the conversion algorithm in two languages (C# and Java).
Version C:
We find a way to write code both in C# and Java but compile it to some common intermediate language that runs in one shared environment (like JVM / .NET-VM).
Pro: No duplicate code and no overhead/"media discontinuity".
Con: Cannot see any. (The time needed to get to know this new environment does not count as a con since it will be just invested once.)
Can anyone elaborate pros and cons from a practical perspective? Like:
Version A: Will the expected overhead be relevant? Or is it going to be small?
Version B: Is maintaining duplicate code in different languages practical? Is it common? Are there tools to assist? (Maybe there are tools that can automatically convert from C# to Java?)
Version C: Does such an environment as described exist? Which one? Has anyone experience with it?
I would probably prefer the Version B alternative.
From my understanding that will create a clean separation of concerns between java and c#, i.e. all Java runs on the server, all c# on the client. They will also share a common object model of the objects that need to be transferred. Note that the data format should be clearly documented, so it is obvious what side is incorrect if there are any issues.
You might also consider making an entirely separate API between the client and server, even if it happens to look very similar to some existing data structures. That could let you evolve the API without necessarily affecting other systems.
But your question implies that this library is used in other contexts, so I would probably recommend figuring out what language to use in what situations. Otherwise you will keep running into problems like "Code from project K would be perfect for project L, but is in the wrong language". You also risk various staffing issues, e.g. holy language wars, conflicts, knowledge gaps, extra training, recruitment difficulties etc.
Version A will do extra work on the server that might or might not cause a performance overhead, but more importantly it will make debugging more difficult, since it might be difficult to tell if the error is in the java code or the conversion code. And the server developer may not be able to debug the c# code efficiently.
Version C is a nice thought, but even if it is possible it would be an uncommon solution. So you will likely have many more issues with build systems, compatibility and finding help when something goes wrong.
I have a Java server that sends serializable Java objects to my client and receives serializable Java objects for execution. If my client is also written in Java, which is nice, I can communicate without any problems.
But now I would like to extend my program beyond Java clients: the client may be written in C, Objective-C, Python or PHP. So I would like something that converts a client request into a Java object and sends it to the server. For the conversion, I can receive JSON and construct a Java object for the server, but I also need a layer that converts the Java object back to JSON for the client.
My question is: apart from building a JSON-Java translation layer, are there any other ways to do this? We can afford to change some code on the server side, but we must use Java as our primary language there. Any suggestions? Thanks.
I use the Netty API for designing my protocols, and it is quite quick to do so if you understand a NIO-like byte and buffer API.
It is designed around the concept of an Encoder and a Decoder, which could fit your needs, and there are a lot of default Encoder and Decoder implementations for zipping, using SSL...
Your problem seems to look like this one:
JBoss Netty with JSON
I don't know JSON very well, but most of the time it can also be quick and easy to design your own protocol.
Do you need a generic serialization process for any kind of object, or do you simply need to serialize some Strings and primitive types (Integer, Short, Float, etc.)?
In the case of simple objects it is easy and a lot faster to write the wrapper yourself.
If the objects are quite simple, and I would guess this is the case, what you need is to design your own "protocol" specification, meaning how to turn each object into a sequence of primitive types, Strings and arrays. Then it should be quite easy to write both the Encoder and the Decoder in each language.
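As an illustration of such a hand-written "protocol", here is a small sketch in Python. The Reading class and its field layout (int32, double, then a 16-bit length followed by UTF-8 bytes, all big-endian) are invented for the example. The point is only that each field maps to a fixed, documented encoding that any language can reproduce.

```python
import struct
from dataclasses import dataclass

HEADER = "!idH"  # int32, float64, uint16 string length, network byte order

@dataclass
class Reading:
    sensor_id: int      # encoded as a signed 32-bit integer
    temperature: float  # encoded as a 64-bit IEEE 754 double
    label: str          # encoded as a 16-bit length followed by UTF-8 bytes

def encode(reading):
    """Encoder: turn a Reading into bytes according to the layout above."""
    text = reading.label.encode("utf-8")
    return struct.pack(HEADER, reading.sensor_id, reading.temperature, len(text)) + text

def decode(data):
    """Decoder: rebuild a Reading from bytes produced by encode()."""
    sensor_id, temperature, n = struct.unpack_from(HEADER, data, 0)
    start = struct.calcsize(HEADER)
    label = data[start:start + n].decode("utf-8")
    return Reading(sensor_id, temperature, label)

original = Reading(42, 21.5, "kitchen")
wire = encode(original)
restored = decode(wire)
```

On the Java side the same layout could be written with DataOutputStream (writeInt, writeDouble, writeShort, then the raw bytes), which also uses big-endian order.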
Good luck
There are other libraries designed for this, like protocol buffers and thrift.
http://thrift.apache.org/
http://code.google.com/p/protobuf/
I'd like to create a web API of some kind (I don't have a preference for the protocol), where the server uses Java and the client uses PHP.
I want the request and response to both be objects (instances of classes, not JSON-style hashes). The objects' fields can be primitive types or other objects. I would define all the necessary classes in both the client and server code. PHP and Java have similar object models, so it shouldn't be hard to write corresponding classes in both languages.
To make this work, there would need to be some automated way to serialize an object on one side, and unserialize it on the other. It would need to know which PHP class maps to which Java class, and how to convert the fields. I could write something, but is there an existing protocol for transferring objects like this? Can this be done with SOAP?
Java and PHP objects are not interchangeable. You will have to define the object types on both ends, and the transfer protocol could be anything you like. Serialization and deserialization makes the whole process transparent. The transport medium could be JSON, XML, YAML, or anything else for that matter.
For record-like objects:
{"_type":"MyCoolObjectType", "a":1, "b":2, "c":3}
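A rough sketch of how the "_type" field can drive deserialization, written in Python only to keep it short. The registry and the class are hypothetical, and each side (PHP and Java in your case) would keep its own equivalent lookup table from type name to class.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class MyCoolObjectType:
    a: int
    b: int
    c: int

# Hypothetical registry mapping the "_type" tag to a local class.
REGISTRY = {"MyCoolObjectType": MyCoolObjectType}

def dumps(obj):
    """Serialize a registered object, adding its class name as "_type"."""
    fields = asdict(obj)
    fields["_type"] = type(obj).__name__
    return json.dumps(fields)

def loads(text):
    """Look up the class named by "_type" and build an instance from the rest."""
    fields = json.loads(text)
    cls = REGISTRY[fields.pop("_type")]
    return cls(**fields)

obj = loads('{"_type":"MyCoolObjectType", "a":1, "b":2, "c":3}')
```

An unknown "_type" raises a KeyError here, which is usually what you want: the receiver should reject types it has no class for.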
If you want to write once and use everywhere, I'd recommend using the same language on both ends; otherwise you'll need a compiler that can translate between your chosen languages.
A SOAP web service can handle the basic abstraction as long as the request/response is not very complex. You can create the classes in java and then get the API to export a WSDL for them.
You need to have them both serialize to the same string. The PHP and Java serialization formats are different, and therefore incompatible. You need a common exchange format, and I recommend that you DON'T use PHP's. However, PHP's serialization functions are fairly simple and are contained in the ext/standard/var.c file in the PHP source, if you choose to use it.
See the following:
Unserialize in Java a serialized php object - A similar question to yours.
http://en.wikipedia.org/wiki/Serialization#Serialization_formats
http://en.wikipedia.org/wiki/XML
XML, API, CSV, SOAP! Understanding the Alphabet Soup of Data Exchange
From http://en.wikipedia.org/wiki/XML (emphasis mine):
Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures, for example in web services.
I was wondering if anyone had some resources that describe the binary protocol used by ObjectOutputStream. I realize of course that objects themselves can specify what their data looks like by implementing the Externalizable interface, so I guess I'm looking more toward the structure of the object graph - the metadata if you will.
I am writing a C program that has to talk to a legacy Java program. I have no way to change either of these requirements, so I find myself reverse engineering the ObjectOutputStream protocol. (There is a server that uses HTTP for transport and returns Object*Stream as the HTTP response.)
However, I feel like someone else out there has to have done this work before. Can you point to any resources to speed up my work?
http://java.sun.com/javase/6/docs/technotes/guides/serialization/index.html
and from there
http://java.sun.com/javase/6/docs/platform/serialization/spec/protocol.html
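To give a feel for what that spec describes, here is a small sketch (Python for brevity; the same checks translate directly to C) that validates the stream header and reads a top-level String. Per the protocol, a stream starts with the magic number 0xACED and version 5, and a String is written as the tag TC_STRING (0x74) followed by a 2-byte length and the characters. The sample bytes are hand-written, and the sketch handles only ASCII text, since Java uses a modified UTF-8.

```python
import struct

STREAM_MAGIC = 0xACED   # first two bytes of every serialization stream
STREAM_VERSION = 5      # protocol version written after the magic number
TC_STRING = 0x74        # type code announcing a new String

def looks_like_java_stream(data):
    """Check the 4-byte header: magic number, then version, both big-endian."""
    if len(data) < 4:
        return False
    return struct.unpack(">HH", data[:4]) == (STREAM_MAGIC, STREAM_VERSION)

def read_string(data, offset=4):
    """Read a TC_STRING record starting at offset (ASCII only in this sketch)."""
    if data[offset] != TC_STRING:
        raise ValueError("not a TC_STRING record")
    (length,) = struct.unpack_from(">H", data, offset + 1)
    start = offset + 3
    return data[start:start + length].decode("ascii")

# Hand-written bytes: header, TC_STRING, 2-byte length, then "hi".
sample = bytes([0xAC, 0xED, 0x00, 0x05, 0x74, 0x00, 0x02]) + b"hi"
```

Real object graphs add class descriptors, handles and back-references on top of this, so treat the sketch as a starting point for reading the spec, not as a parser.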
The log4j network adapter sends events as a serialised java object. I would like to be able to capture this object and deserialise it in a different language (python). Is this possible?
NOTE: The network capturing is easy; it's just a TCP socket and reading in a stream. The difficulty is the deserialising part.
Generally, no.
The stream format for Java serialization is defined in this document, but you need access to the original class definitions (and a Java runtime to load them into) to turn the stream data back into something approaching the original objects. For example, classes may define writeObject() and readObject() methods to customise their own serialized form.
(edit: lubos hasko suggests having a little java program to deserialize the objects in front of Python, but the problem is that for this to work, your "little java program" needs to load the same versions of all the same classes that it might deserialize. Which is tricky if you're receiving log messages from one app, and really tricky if you're multiplexing more than one log stream. Either way, it's not going to be a little program any more. edit2: I could be wrong here, I don't know what gets serialized. If it's just log4j classes you should be fine. On the other hand, it's possible to log arbitrary exceptions, and if they get put in the stream as well my point stands.)
It would be much easier to customise the log4j network adapter and replace the raw serialization with some more easily-deserialized form (for example you could use XStream to turn the object into an XML representation)
Theoretically, it's possible. The Java Serialization, like pretty much everything in Javaland, is standardized. So, you could implement a deserializer according to that standard in Python. However, the Java Serialization format is not designed for cross-language use; it is closely tied to the way objects are represented inside the JVM. While implementing a JVM in Python is surely a fun exercise, it's probably not what you're looking for (-:
There are other (data) serialization formats that are specifically designed to be language agnostic. They usually work by stripping the data formats down to the bare minimum (number, string, sequence, dictionary and that's it) and thus requiring a bit of work on both ends to represent a rich object as a graph of dumb data structures (and vice versa).
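The "bit of work on both ends" might look roughly like this in Python. The Customer and Address classes are invented for the example. The rich objects are flattened into dictionaries, lists and strings before encoding, and rebuilt by hand after decoding.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class Address:
    city: str
    zip_code: str

@dataclass
class Customer:
    name: str
    addresses: List[Address] = field(default_factory=list)

def to_plain(customer):
    """Rich object -> nested dicts, lists and strings."""
    return asdict(customer)

def from_plain(data):
    """Nested dicts, lists and strings -> rich object."""
    return Customer(
        name=data["name"],
        addresses=[Address(**a) for a in data["addresses"]],
    )

customer = Customer("Ada", [Address("London", "N1")])
text = json.dumps(to_plain(customer))
restored = from_plain(json.loads(text))
```

The other language needs its own pair of functions for the same plain structure, and that structure is the actual contract between the two sides.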
Two examples are JSON (JavaScript Object Notation) and YAML (YAML Ain't Markup Language).
ASN.1 (Abstract Syntax Notation One) is another data serialization format. Instead of dumbing the format down to a point where it can be easily understood, ASN.1 is self-describing, meaning all the information needed to decode a stream is encoded within the stream itself.
And, of course, XML (eXtensible Markup Language), will work too, provided that it is not just used to provide textual representation of a "memory dump" of a Java object, but an actual abstract, language-agnostic encoding.
So, to make a long story short: your best bet is to either try to coerce log4j into logging in one of the above-mentioned formats, replace log4j with something that does that or try to somehow intercept the objects before they are sent over the wire and convert them before leaving Javaland.
Libraries that implement JSON, YAML, ASN.1 and XML are available for both Java and Python (and pretty much every programming language known to man).
I would recommend moving to a third-party format (by creating your own log4j adapters etc) that both languages understand and can easily marshal / unmarshal, e.g. XML.
In theory it's possible. How difficult it is in practice depends on whether the Java serialization format is documented or not. I guess it's not. edit: oops, I was wrong, thanks Charles.
Anyway, this is what I suggest you do:
capture from log4j & deserialize Java object in your own little Java program.
now when you have the object again, serialize it using your own custom formatter.
Tip: Maybe you don't even have to write your own custom formatter. For example, JSON (scroll down for libs) has libraries for Python and Java, so you could in theory use a Java library to serialize your objects and the equivalent Python library to deserialize them.
send output stream to your python application and deserialize it
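The Python end of the last step could be as small as the sketch below. It assumes the Java helper writes one JSON object per line, which is my assumption, and the field names are hypothetical. An in-memory stream stands in for the file object you would get from socket.makefile("rb").

```python
import io
import json

def read_events(stream):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for raw in stream:
        line = raw.strip()
        if line:
            yield json.loads(line.decode("utf-8"))

# Stand-in for a socket stream; the field names are hypothetical.
fake_stream = io.BytesIO(
    b'{"level": "INFO", "message": "started"}\n'
    b'\n'
    b'{"level": "ERROR", "message": "disk full"}\n'
)
events = list(read_events(fake_stream))
```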
Charles wrote:
the problem is that for this
to work, your "little java program"
needs to load the same versions of all
the same classes that it might
deserialize. Which is tricky if you're
receiving log messages from one app,
and really tricky if you're
multiplexing more than one log stream.
Either way, it's not going to be a
little program any more.
Can't you just simply reference Java log4j libraries in your own java process? I'm just giving general advice here that is applicable to any pair of languages (name of the question is pretty language agnostic so I just provided one of the generic solutions). Anyway, I'm not familiar with log4j and don't know whether you can "inject" your own serializer into it. If you can, then of course your suggestion is much better and cleaner.
Well, I am not a Python expert, so I can't comment on how to solve your problem, but if you have a program in .NET you can use IKVM.NET to deserialize Java objects easily. I have experimented with this by creating a .NET client for log4j log messages written to a socket appender, and it worked really well.
I am sorry, if this answer does not make sense here.
If you can have a JVM on the receiving side and the class definitions for the serialized data, and you only want to use Python and no other language, then you may use Jython:
you would deserialize what you received using the correct Java methods
and then you process what you get with your Python code