I couldn't find any "Best Practices" online for usage of gRPC and protobuf within a project.
I'm implementing an event-sourced server side app.
The core defines the domain aggregates, events, and services without any external dependencies. The gRPC server calls the core services, passing in request objects, which eventually translate into events being published. Events are serialized using protobuf and published on the wire.
We're currently facing a dilemma: should our events be the protobuf-generated classes directly, or should we keep the core and its events separate and implement a mapper/serializer layer to translate events between protobuf and the core?
If there's another approach we're not considering, please guide us :)
Thanks for the help.
Domain model objects and data transfer objects (protobuf messages) should be separated as much as possible. The best way to do this is to transform your domain model objects into Google Protobuf messages and vice versa. We've made a protobuf-converter to make this extremely simple.
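For illustration, a hand-written mapper might look roughly like this; OrderPlaced and OrderPlacedProto are made-up names (not from the protobuf-converter library), standing in for a pure domain event and its generated protobuf counterpart:

// Hypothetical domain event (pure core, no protobuf dependency).
public final class OrderPlaced {
    private final String orderId;
    private final long amountCents;

    public OrderPlaced(String orderId, long amountCents) {
        this.orderId = orderId;
        this.amountCents = amountCents;
    }

    public String getOrderId() { return orderId; }
    public long getAmountCents() { return amountCents; }
}

// Mapper living at the edge of the system, next to the gRPC layer.
public final class OrderPlacedMapper {
    public static OrderPlacedProto toProto(OrderPlaced event) {
        return OrderPlacedProto.newBuilder()
                .setOrderId(event.getOrderId())
                .setAmountCents(event.getAmountCents())
                .build();
    }

    public static OrderPlaced fromProto(OrderPlacedProto proto) {
        return new OrderPlaced(proto.getOrderId(), proto.getAmountCents());
    }
}

The mapper is deliberately dumb; all it costs is one class per event type, and in exchange the core never depends on generated code.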
Protobufs are really good for serialization and backwards compatibility, but not so good at being first-class Java objects. Adding custom functionality to protos is currently not possible. You can get a lot of the benefits by using protobufs at the stub layer, wrapping them in one of your event POJOs, and passing them around internally, like so:
public final class Event {
    private final EventProto proto;

    public Event(EventProto proto) {
        this.proto = proto;
    }

    public void foo() {
        // do something with proto.
    }
}
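For instance, the gRPC service implementation could wrap the generated message once at the boundary; the request accessor and eventBus below are hypothetical names for this sketch:

// At the stub layer: wrap the generated message and keep it opaque internally.
EventProto proto = request.getEvent();   // hypothetical accessor on the request message
Event event = new Event(proto);
eventBus.publish(event);                 // internal code only ever deals with Event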
Most projects don't change their .proto file that often, and almost never in a backwards incompatible way (neither wire nor API). Having to change a lot of code because of proto changes has never been a problem in my experience.
Related
Martin Fowler said to avoid automatic deserialization in an API:
I prefer to avoid automatic deserialization altogether. Automatic deserialization usually falls into the WSDL pitfall of coupling consumers and producers by duplicating a static class structure in both.
What does this mean?
Does it mean receiving all information as JSON in each REST service without any "converter" in the middle?
By "converter" I mean some type adapter, like in GsonBuilder.
By automatic deserialization he means that there's a predefined hard structure for the JSON object which is used to retrieve the object itself.
This is however appropriate for most use cases.
Examples of a predefined structure are a Java class or an XML XSD.
Automatic deserialization usually falls into the WSDL pitfall of coupling consumers and producers by duplicating a static class structure in both.
What he means here is that using classes for deserialization is the same as using WSDL to serialize or deserialize objects.
In contrast to the hard structure of classes and XSD documents, JSON is much more relaxed, as it's based on JavaScript, which allows the object definition to be modified at any point in its life cycle.
So the alternative would be to use a combination of HashMap and ArrayList in Java (or to parse the String itself) to deserialize the object; then, even if the server produces something different (like new fields), no change is needed on the client side, and new clients can take advantage of the new fields.
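A rough illustration of the two styles with Gson; the JSON payload and field names are made up for the example:

import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import java.util.Map;

public class DeserializationStyles {
    // "Automatic" deserialization: the client duplicates the producer's structure.
    static class Order {
        String id;
        double total;
    }

    public static void main(String[] args) {
        String json = "{\"id\":\"42\",\"total\":9.99,\"currency\":\"EUR\"}";
        Gson gson = new Gson();

        // Coupled style: the unknown "currency" field is silently dropped;
        // a structural change on the producer side forces a change here.
        Order order = gson.fromJson(json, Order.class);

        // Relaxed style: everything the producer sends is available,
        // at the cost of no compile-time structure.
        Map<String, Object> asMap =
                gson.fromJson(json, new TypeToken<Map<String, Object>>() {}.getType());

        System.out.println(order.id + " / " + asMap.get("currency"));
    }
}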
With a hard structure, since the producer and consumer are strongly coupled through the shared structure of the model classes, any change in the producer has to be reflected in the consumer.
In some SOA projects I worked on, we used to add extra fields to all the request/response objects for future use, so that clients already running in production would not have to change to accommodate the needs of a new client. These fields had generic names like customParam1 to customParam5, and their meaning was published with the documentation. These names were not intuitive, all because we were coupling the producer and consumer on the shared structure of the models.
Example:
class MyClass<S> {
}
Is the above class a POJO?
EDIT: The question has been put on hold, so let me explain further. Firstly, the question is very clear and precise. Secondly, I think it is important, since numerous docs say things like (to quote the Google docs at https://developers.google.com/eclipse/docs/endpoints-addentities):
In the Endpoint methods, the return value type cannot be simple type such as String or int. The return value needs to be a POJO, an array or a Collection.
In such a case I would want to know exactly what classes I can use without having to go through a tedious trial-and-error process.
The term POJO (plain old Java object) became popular around the time of the early versions of J2EE (now called JEE) and Enterprise JavaBeans (EJB).
EJB sought to extend the JavaBeans philosophy of reusable, component-driven architectures by providing enterprise service abstractions - things like database access, security, and messaging.
Unfortunately, these early attempts required extending base classes that could only be used within the context of an application server. This had a lot of problems; for example, it made testing a very cumbersome and slow process.
As a counterpoint to this, POJOs emerged, which aimed to provide enterprise services without having to extend base classes. Spring used Dependency Injection and Aspect-Oriented Programming for this, and quickly became popular as classes could now easily be unit and integration tested outside of the heavy app server.
The idea behind a POJO is that your class should belong to the business domain rather than to an infrastructure domain. Therefore yes, there's no reason why a POJO can't use generics, as long as it honors this philosophy.
A POJO (Plain Old Java Object) is any Java class that doesn't extend prespecified classes and doesn't implement prespecified interfaces. A POJO also doesn't carry prespecified annotations.
This means your example is a POJO.
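A quick illustration of the distinction; java.util.TimerTask below just stands in for any "prespecified" framework class:

// POJO: generic, but tied only to the business domain.
class Pair<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

// Not a POJO in the strict sense: it extends an infrastructure base class.
class CleanupTask extends java.util.TimerTask {
    @Override
    public void run() {
        // infrastructure callback, not business logic
    }
}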
I have an in-house enterprise application (EJB2) that works with a certain BPM vendor. The current implementation of the in-house application involves pulling in an object that is only exposed by the vendor's API and making changes to it through the exposed methods in the API.
I'm thinking that I need to somehow map an internal object to this external one, but that seems too simple and I'm not quite sure of the best strategy to go about doing this. Can anyone shed some light on how they have handled such a situation in the past?
I want to "black box" this vendor's software so I can replace it easily if needed. What would be the best approach from a design point of view to somehow map an internal object to this exposed API object? Keep in mind that my in-house app needs to talk to the API still, so there is going to be some dependency between the two, but I want to reduce it so I can also test in isolation from this software using junit.
Thanks,
Jason
Create an interface for the service layer; internally, all your code can work with that. Then make a class that implements that interface, calls the third-party API methods, and acts as the API facade.
i.e.
interface IAPIEndpoint {
    MyDomainDataEntity getData();
}

class MyAPIEndpoint implements IAPIEndpoint {
    @Override
    public MyDomainDataEntity getData() {
        MyDomainDataEntity dataEntity = new MyDomainDataEntity();
        // Call the third-party API and fill it
        return dataEntity;
    }
}
It is always a good idea to interface out third party apis so you don't get their funk invading your app domain, and you can swap out as needed. You could make another class implementation that uses a different service entirely.
To use it in code you just call
IAPIEndpoint endpoint = new MyAPIEndpoint(); // or obtain it in whatever way is idiomatic for your setup (factory, DI container, etc.)
Basing your code on interfaces when it spans multiple implementations is the way to go. It works great for TDD as well, since you can swap the interface out for a local test implementation and exercise your domain code entirely separately from the third-party API.
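For instance, a hand-rolled fake for tests might look like this (a sketch reusing the hypothetical IAPIEndpoint/MyDomainDataEntity names from above; MyBusinessProcess is a placeholder for the class under test):

// Test double that never touches the vendor API.
class FakeAPIEndpoint implements IAPIEndpoint {
    @Override
    public MyDomainDataEntity getData() {
        MyDomainDataEntity canned = new MyDomainDataEntity();
        // populate with fixed test data instead of calling the vendor
        return canned;
    }
}

// In a JUnit test, the domain code only ever sees the interface:
IAPIEndpoint endpoint = new FakeAPIEndpoint();
MyBusinessProcess process = new MyBusinessProcess(endpoint);   // hypothetical class under test
process.run();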
Abstraction; implement a DAL which will provide the transition from internal to external and back.
Then, if you switched vendors, your internals would remain valuable and you could swap out the vendor-specific code, assuming the new vendor provides the same functionality and the data types relate to each other.
I will be the black sheep here and advocate for the YAGNI principle. The problem is that if you do an abstraction layer now, it will look so close to the third party API that it will just be a redundant layer. Since you don't know now what a hypothetical future second vendor's API will look like, you don't know what differences you need to account for, and any future port is likely to require a rework for those unforeseen differences anyway.
If you need a test framework, my recommendation is to make your own test implementation using the same API as the BPM vendor. Even better, almost all reputable API providers provide some sort of sandbox mode for testing. If they don't, you should ask for one.
I very much like the simplicity of calling remote methods via Java's RMI, but the verbosity of its serialization format is a major buzz kill (Yes, I have benchmarked, thanks). It seems that the architects at Sun did the obvious right thing when designing the RPC (speaking loosely) component, but pulled an epic fail when it came to implementing serialization.
Conversely, it seems the architects of Thrift, Avro, Kryo (especially), protocol buffers (not so much), etc. generally did the obvious right thing when designing their serialization formats, but either do not provide an RPC mechanism, provide one that is needlessly convoluted (or immature), or else one that is more geared toward data transfer than invoking remote methods (perfectly fine for many purposes, but not what I'm looking for).
So, the obvious question: How can I use RMI's method-invocation loveliness but employ one of the above libraries for the wire protocol? Is this possible without a lot of work? Am I evaluating one of the aforementioned libraries too harshly (N.B. I very much dislike code generation, in general; I dislike unnecessary annotations somewhat, and XML configuration quite a bit more; any sort of "beans" make me cringe--I don't need the weight; ideally, I'm looking to just implement an interface for my remote objects, as with RMI).
Once upon a time, I had the same requirement. I changed the RMI method arguments and return types to byte[].
I serialized objects to a byte array with my preferred serializer, then called my modified RMI methods.
As you mentioned, Java serialization is too verbose, so five years ago I implemented a space-efficient serialization algorithm. It saves a lot of space if you are sending a very complex object graph. Recently, I had to port this serialization implementation to GWT, because GWT serialization in dev mode is incredibly slow.
As an example, take an RMI method like this:
public void saveEmployee(Employee emp) {
    // business code
}
You should change it like below:
public void saveEmployee(byte[] empByte) {
    YourPreferredSerializer serializer = YourPreferredSerializerFactory.createSerializer();
    Employee emp = (Employee) serializer.deSerialize(empByte);
    // business code
}
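On the calling side the idea would be the mirror image (again using the made-up YourPreferredSerializer names from above):

// Client side: serialize first, then invoke the byte[]-based remote method.
Employee emp = new Employee();
YourPreferredSerializer serializer = YourPreferredSerializerFactory.createSerializer();
byte[] empByte = serializer.serialize(emp);
remoteService.saveEmployee(empByte);   // the RMI stub obtained from the registry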
EDIT:
You should check out MessagePack. It looks promising.
I don't think there is a way to re-wire RMI itself, but specific replacement projects -- I am specifically thinking of DiRMI -- might allow it. And/or the project owners might be interested in helping with this (Brian, its author, is a very competent s/w engineer from Amazon.com).
Another interesting project is Protostuff -- its author is building an RPC framework too (I think); but even without that, it supports an impressive range of data formats, and does so very efficiently (as per https://github.com/eishay/jvm-serializers/wiki/).
Btw, I personally think the biggest mistake most projects (like PB and Avro) have made is not keeping the RPC and serialization aspects properly separate.
So the ability to do RPC with pluggable data formats or serialization providers seems like a good idea to me.
writeReplace() and readResolve() are probably the best combo for doing so. Mighty powerful in the right hands.
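A rough sketch of that pattern, substituting a compact proxy for the default serialized form (the Employee/SerializedForm layout is illustrative):

import java.io.Serializable;

public final class Employee implements Serializable {
    private final String name;

    public Employee(String name) { this.name = name; }

    // Called by the serialization machinery instead of writing Employee directly.
    private Object writeReplace() {
        return new SerializedForm(name);
    }

    // Compact stand-in that goes over the wire.
    private static final class SerializedForm implements Serializable {
        private final String name;

        SerializedForm(String name) { this.name = name; }

        // Called on the receiving side to rebuild the real object.
        private Object readResolve() {
            return new Employee(name);
        }
    }
}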
Java serialization is only verbose where it describes the classes and fields it's serializing. Overall, the format is as "self describing" as XML. You can actually override this and replace it with something else. This is what the writeClassDescriptor and readClassDescriptor methods are for. Dirmi overrides these methods, and so it is able to use standard object serialization with less wire overhead.
The way it works is related to how its sessions work. Both endpoints may have different versions of the object, and so simply throwing away the class descriptors won't work. Instead, additional data is exchanged (in the background) so that the serialized descriptor is replaced with a session-specific identifier. Upon seeing the identifier, a lookup table is examined to find the descriptor object. Because the data is exchanged in the background, there's a brief "warm up period" after a session is created and for every time an object type is written for the first time.
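Those hooks are protected methods on ObjectOutputStream/ObjectInputStream. A very simplified sketch replaces the full descriptor with just the class name; Dirmi's real scheme, with session-specific identifiers, is more involved:

import java.io.*;

class CompactObjectOutputStream extends ObjectOutputStream {
    CompactObjectOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    @Override
    protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
        // Write only the class name instead of the full descriptor.
        writeUTF(desc.getName());
    }
}

class CompactObjectInputStream extends ObjectInputStream {
    CompactObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
        // Rebuild the descriptor locally from the class name.
        return ObjectStreamClass.lookup(Class.forName(readUTF()));
    }
}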
Dirmi has no way to replace the wire format at this time.
Lots of frameworks let me expose an ejb as a webservice.
But then 2 months after publishing the initial service I need to change the ejb or any part of its interface. I still have clients that need to access the old interface, so I obviously need to have 2 webservices with different signatures.
Anyone have any suggestions on how I can do this, preferably letting the framework do the grunt work of creating wrappers and copying logic (unless there's an even smarter way).
I can choose webservice framework on basis of this, so suggestions are welcome.
Edit: I know my change is going to break compatibility, and I am fully aware that I will need two services with different namespaces at the same time. But how can I do this in a simple manner?
I don't think you need any additional frameworks to do this. Java EE lets you directly expose the EJB as a web service (since EJB 2.1; see the example for J2EE 1.4), but with EE 5 it's even simpler:
@WebService
@SOAPBinding(style = Style.RPC)
public interface ILegacyService extends IOtherLegacyService {
    // the interface methods
    ...
}

@Stateless
@Local(ILegacyService.class)
@WebService(endpointInterface = "...ILegacyService", ...)
public class LegacyServiceImpl implements ILegacyService {
    // implementation of ILegacyService
}
Depending on your application server, you should be able to provide ILegacyService at any location that fits. As jezell said, you should try to put changes that do not change the contract directly into this interface. If you have additional changes, you may just provide another implementation with a different interface. Common logic can be pulled up into a superclass of LegacyServiceImpl.
I'm not an EJB guy, but I can tell you how this is generally handled in the web service world. If you have a non-breaking change to the contract (for instance, adding a property that is optional), then you can simply update the contract and consumers should be fine.
If you have a breaking change to a contract, then the way to handle it is to create a new service with a new namespace for its types. For instance, if your first service had a namespace of:
http://myservice.com/2006
Your new one might have:
http://myservice.com/2009
Expose this contract to new consumers.
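With JAX-WS this versioning typically shows up as two endpoint classes whose targetNamespace differs; the class names and operations below are illustrative:

import javax.jws.WebService;

@WebService(targetNamespace = "http://myservice.com/2006")
public class OrderService2006 {
    // original operations, kept for existing consumers
}

@WebService(targetNamespace = "http://myservice.com/2009")
public class OrderService2009 {
    // new contract with the breaking changes
}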
How you handle the old contract is up to you. You might direct all the requests to an old server and let clients choose when to upgrade to the new servers. If you can use some amount of logic to upgrade the requests to the format that the new service expects, then you can rip out the old service's logic and replace it with calls to the new one. Or, you might just deprecate it altogether and fail all calls to the old service.
PS: This is much easier to handle if you create message class objects rather than reusing domain entities.
OK, here goes: it seems like dozer.sourceforge.net is an acceptable starting point for doing the grunt work of copying data between two parallel structures. I suppose a lot of web frameworks can generate client proxies that can be re-used in a server context to maintain compatibility.
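For reference, the Dozer mapping call itself is a one-liner (assuming Dozer 5.x; OldOrderDto and NewOrder are placeholder types for the two parallel structures):

import org.dozer.DozerBeanMapper;

// Copy matching fields from the old DTO into the new domain object.
DozerBeanMapper mapper = new DozerBeanMapper();
NewOrder target = mapper.map(oldOrderDto, NewOrder.class);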