I'm currently thinking about some design details of remoting/serialization between a Java Swing Web Start application (fat client) and some remote services running on Tomcat. I want to use an HTTP-compatible transport to communicate with the server, and since I'm already using Spring, I assume Spring's HTTP remoting is a good choice. But I'm open to other alternatives here. My design problem is best illustrated with a small example.
The client will call some services on the remote side. A sample service interface:
public interface ServiceInterface extends Serializable {
    // Get immutable reference data
    public List<Building> getBuildings();
    public List<Office> getOffices();

    // Create, read and update Employee objects
    public void insertEmployee(Employee employee);
    public Employee getEmployee(long id);
    public void updateEmployee(Employee employee);
}
Building and Office are immutable reference data objects, e.g.
public class Building implements Serializable {
    private final String name;

    public Building(String name) { this.name = name; }

    public String getName() { return name; }
}
public class Office implements Serializable {
    private final Building building;
    private final int maxEmployees;

    public Office(Building building, int maxEmployees) {
        this.building = building;
        this.maxEmployees = maxEmployees;
    }

    public Building getBuilding() { return building; }

    public int getMaxEmployees() { return maxEmployees; }
}
The available Buildings and Offices won't change at runtime and should be prefetched by the client, to have them available for selection lists, filter conditions, and so on. I want to have only one instance of each particular Building and Office on the client, and one instance on the server side. On the server side this is not a big problem, but in my eyes the problem starts when I call getOffices() after getBuildings(). The Buildings referenced by the Offices returned from getOffices() share the same instance (if they have the same Building assigned), but they are not the same instances as the Buildings returned by getBuildings().
This might be solved by using some getReferenceData() method returning both lists in the same call, but then the problem starts again as soon as I have Employees referencing Offices.
I was thinking about some custom serialization (readObject, writeObject) transferring only the primary key, and then getting the instance of the object from some class holding the reference data objects. But is this the best solution to this problem? I assume this is not an uncommon problem, but I did not find anything on Google. Is there a better solution? If not, what would be the best way to implement it?
If you're going to serialize, you'll probably need to implement readResolve to guarantee that you're not creating additional instances:
From the Javadoc for Serializable:
Classes that need to designate a replacement when an instance of it is read from the stream should implement this special method with the exact signature:
ANY-ACCESS-MODIFIER Object readResolve() throws ObjectStreamException;
I seem to remember reading about this approach in pre-enum days for handling serialization of objects that had to be guaranteed to be singular, like typesafe enums.
I'd also strongly recommend that you include a manual serialVersionUID in your serialized classes so that you can manually control when the application will decide that your classes represent incompatible versions that can't be deserialized.
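For illustration, here is a minimal sketch combining both suggestions on the Building class from the question. The static registry keyed by name is an assumption, standing in for whatever class holds your reference data:

import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Building implements Serializable {

    // Manually controlled version, as recommended above.
    private static final long serialVersionUID = 1L;

    // Hypothetical client-side registry of canonical instances, keyed by name.
    private static final Map<String, Building> CANONICAL = new ConcurrentHashMap<>();

    private final String name;

    public Building(String name) { this.name = name; }

    public String getName() { return name; }

    // Called by deserialization; whatever this returns replaces the freshly
    // read copy, so at most one instance per name survives on the client.
    private Object readResolve() throws ObjectStreamException {
        return CANONICAL.computeIfAbsent(name, n -> this);
    }
}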
However, on a more basic level, I'd question the whole approach. Rather than trying to guarantee object identity over the network, which sounds like, at the very least, a concurrency nightmare, why not just pass the raw data around and have your logic determine identity by ID, the way we did it in my grandpappy's day? Your back end has a Building object; when it gets one from the front end, it compares via ID. (If you've altered the object on the front end, you'll have to commit your object to your central datastore and determine what's changed, which could be a synchronization issue with multiple clients, but you'd have that issue anyway.)
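A minimal sketch of that idea, assuming a numeric primary key (the id field is an assumption; the question's Building only carries a name):

import java.io.Serializable;

public class Building implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long id;
    private final String name;

    public Building(long id, String name) {
        this.id = id;
        this.name = name;
    }

    public long getId() { return id; }
    public String getName() { return name; }

    // Identity is decided by the primary key, not by JVM instance,
    // so copies arriving over the wire compare equal to local ones.
    @Override
    public boolean equals(Object o) {
        return o instanceof Building && ((Building) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}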
Passing data remotely via Spring's HTTP remoting is nice and simple, a bit less low-level than RMI.
Firstly, I'd recommend using RMI for your remoting, which can be proxied over HTTP (IIRC). Secondly, if you serialize the whole graph returned by the service in a single call, I believe the serialization mechanism will maintain the relative references when it is deserialized in the remote JVM.
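The within-one-stream behaviour is easy to verify yourself: ObjectOutputStream writes back-references for objects it has already serialized, so shared instances stay shared after deserialization. A small self-contained check, reusing the Building and Office classes from the question:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;
import java.util.List;

public class SharedReferenceDemo {
    public static void main(String[] args) throws Exception {
        Building hq = new Building("HQ");
        List<Office> offices = Arrays.asList(new Office(hq, 10), new Office(hq, 20));

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(offices);
        }

        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            List<Office> copy = (List<Office>) in.readObject();
            // Prints true: both offices still share one Building instance.
            System.out.println(copy.get(0).getBuilding() == copy.get(1).getBuilding());
        }
    }
}

The catch for the question's scenario is that getBuildings() and getOffices() are two remote calls, hence two separate streams, so this guarantee does not span them.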
Related
I have a use case where I have a Database interface vended by an external vendor; let's say it looks like the following:
interface Database {
    public Value get(Key key);
    public void put(Key key, Value value);
}
The vendor provides multiple implementations of this interface, e.g. ActualDatabaseImpl and MockDatabaseImpl. My consumers want to consume the Database interface, but before calling some of the APIs they want to perform some additional work, e.g. call a client-side rate limiter before making the call. So rather than every consumer having to do the extra work of checking the rate limiter's limit, I thought of creating a decorated class which abstracts away the rate-limiting part, so consumers can interact with the DB without knowing the logic of the RateLimiter, e.g.:
class RateLimitedDatabase implements Database {
    private final Database db;

    public RateLimitedDatabase(Database db) { this.db = db; }

    public Value get(Key key) {
        Ratelimiter.waitOrNoop();
        return db.get(key);
    }

    public void put(Key key, Value value) {
        Ratelimiter.waitOrNoop();
        db.put(key, value); // put is void, so there is nothing to return
    }
}
This works fine as long as the Database interface doesn't introduce new methods. But as soon as they start adding APIs that I don't really care about, e.g. delete, getDBInfo, deleteDB, etc., problems start arising.
Whenever a new version of the DB with newer methods is released, my build for RateLimitedDatabase breaks. One option is to implement the new methods in the decorated class whenever I investigate the build failure, but that's just extra pain for developers. Is there any other way to deal with such cases? This seems to be a common problem when using the decorator pattern with an ever-changing/extending interface.
NOTE: I can also think of building a reflection-based solution, but that seems like overkill/over-engineering for this particular problem.
If that's feasible (you need to modify all your client code), you can extract a "mirror" of the vendor.Database interface, call it e.g. mirror.Database, and copy just the methods you need from vendor.Database to mirror.Database (with the very same signatures).
Edit the client code to use the mirror.Database interface, and let RateLimitedDatabase implement this mirror.Database interface. Since all method signatures are the same, switching the client code to the mirrored interface should be painless. RateLimitedDatabase will delegate to a vendor.Database implementation of course.
(I think what I described is more or less the Bridge pattern, using an interface to "shield" against underlying changes: https://en.wikipedia.org/wiki/Bridge_pattern)
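A sketch of the mirrored layout, reusing the Key, Value and Ratelimiter types from the question (the vendor/mirror package names follow the naming suggested above):

// mirror/Database.java -- only the methods our clients actually use,
// copied with identical signatures from vendor.Database.
public interface Database {
    Value get(Key key);
    void put(Key key, Value value);
}

// RateLimitedDatabase now implements the mirror interface and merely
// delegates to the vendor implementation. New vendor methods no longer
// break the build, because vendor.Database is no longer implemented here.
public class RateLimitedDatabase implements Database {
    private final vendor.Database db;

    public RateLimitedDatabase(vendor.Database db) { this.db = db; }

    @Override
    public Value get(Key key) {
        Ratelimiter.waitOrNoop();
        return db.get(key);
    }

    @Override
    public void put(Key key, Value value) {
        Ratelimiter.waitOrNoop();
        db.put(key, value);
    }
}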
Aspect-oriented programming has a solution to this issue. Most frameworks will generate a dynamic proxy for your interface, so it is always in sync.
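Here is a plain-JDK sketch of what such frameworks generate under the hood, built with java.lang.reflect.Proxy against the question's Database interface:

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public final class RateLimitingProxy {

    // Wraps any vendor implementation in a proxy that rate-limits every
    // call. Because the proxy is derived from Database.class itself, a
    // new vendor method is forwarded automatically; this class never
    // needs to change when the interface grows.
    public static Database wrap(Database delegate) {
        return (Database) Proxy.newProxyInstance(
                Database.class.getClassLoader(),
                new Class<?>[] { Database.class },
                (proxy, method, args) -> {
                    Ratelimiter.waitOrNoop();
                    try {
                        return method.invoke(delegate, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // unwrap the delegate's real exception
                    }
                });
    }
}

Usage would be Database db = RateLimitingProxy.wrap(new ActualDatabaseImpl()); the trade-off is that the limiter now applies to every method, including ones you may not care about, unless you filter on the Method inside the handler.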
I am working on internationalizing user-entered data in a rather large client/server application (HTTP (Hessian) is used for communication); the data is stored in a database. Users can choose the language they want to see, and there is a default language which is used when a translation in the requested language is not present.
Currently a data class may look like this:
class MyDataClass {
private Long id;
private String someText;
/* getters and setters */
}
After internationalization it could look like this:
class MyDataClass {
private Long id;
private Set<LocalizedStrings> localizedStrings;
/* getters and setters */
}
class LocalizedStrings {
private Locale locale;
private String someText;
/* getters and setters */
}
Of course it may be interesting to create a delegate getter in MyDataClass which takes care of getting the text in the correct locale:
public String getSomeText(Locale locale) {
    for (LocalizedStrings localized : localizedStrings) {
        if (localized.getLocale().equals(locale)) {
            return localized.getSomeText();
        }
    }
    return null; // or fall back to the default language here
}
In my team there were some concerns, though, about the need to pass the locale around all the time until it reaches the data class. Since all this happens on the server, and every request to the server is handled in a dedicated thread, some people suggested storing the requested locale in a ThreadLocal and creating a backward-compatible no-argument getter:
public String getSomeText() {
return getSomeText(myThreadLocalLocale.get());
}
The ThreadLocal then needs to be a global variable (static somewhere), or it needs to be injected into MyDataClass on every single instance creation (we are using Spring, so we could inject it if we make our data classes Spring-managed, which feels wrong to me).
Using a ThreadLocal for the locale somehow feels wrong to me. I can vaguely argue that I don't like the invisible magic in the getter and the dependency on a global variable (in a data class!). However, having a "bad feeling" about this is not really a good way to argue with my colleagues. To help, I need an answer that does one of the following:
Tell me that my feeling sucks and the solution is great for reasons X, Y and Z.
Give me some good quotable arguments I can use with my colleagues, and tell me how to do it better (just always pass the locale around, or any other idea?).
Although it is common practice, I don't like doing localization "deep" within the application.
Instead of this:
public String getSomeText() {
return getSomeText(myThreadLocalLocale.get());
}
We do this:
public LocalizableText getSomeText() {
return new LocalizableText(resourceBundle, "someText");
}
And then do, e.g. in a JSP or output layer:
<%= localizable.getString(locale) %>
The logic itself is language-agnostic. We have cases where, after some processing, the application sends out the result by mail, logs it and presents it to the web user, each possibly in a different language. So processing and result generation must be kept separate from localization.
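A minimal sketch of what such a LocalizableText could look like, assuming the bundle is identified by its base name (the constructor call above passes a bundle reference; the exact shape is a design choice):

import java.util.Locale;
import java.util.ResourceBundle;

// Captures *what* to say (bundle + key) and defers *how* to say it
// until an output layer supplies the Locale.
public class LocalizableText {
    private final String bundleBaseName;
    private final String key;

    public LocalizableText(String bundleBaseName, String key) {
        this.bundleBaseName = bundleBaseName;
        this.key = key;
    }

    public String getString(Locale locale) {
        return ResourceBundle.getBundle(bundleBaseName, locale).getString(key);
    }
}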
This approach is perfectly valid.
For example, Spring makes the Locale available via a ThreadLocal through RequestContextListener and LocaleContextHolder.
If you create a custom implementation, make sure you handle your ThreadLocal (set/remove) properly.
Using a thread local like you describe is a very common pattern in web applications. See this class in the Spring API as an example:
org.springframework.web.context.request.RequestContextHolder
Use a servlet filter (or similar) to set the locale in a thread local, and then CLEAR the value after the server finishes each request. Instead of injecting it in each place it is used, use a static factory/accessor method similar to RequestContextHolder.getRequestAttributes().
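A sketch of that filter; LocaleHolder is a hypothetical static holder analogous to RequestContextHolder, and the important part is the finally block, which keeps pooled threads from leaking a previous user's locale:

import java.io.IOException;
import java.util.Locale;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class LocaleFilter implements Filter {

    public void init(FilterConfig config) { }

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        LocaleHolder.set(request.getLocale());
        try {
            chain.doFilter(request, response);
        } finally {
            LocaleHolder.clear(); // always clear, even when the request fails
        }
    }

    public void destroy() { }
}

class LocaleHolder {
    private static final ThreadLocal<Locale> CURRENT = new ThreadLocal<>();

    static void set(Locale locale) { CURRENT.set(locale); }
    static Locale get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}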
ThreadLocal is bad practice. It's a global variable, and there are plenty of articles about how bad that is, in any language. The fact that Spring uses it does not justify using it. I like the solution cruftex has given. Avoid passing data via global variables.
In this link: http://code.google.com/p/ehcache-spring-annotations/wiki/UsingCacheable
they say:
When the above POJO is defined as a bean in a Spring IoC container, the bean instance can be made 'cacheable' by adding merely one line of XML configuration.
Without using a caching framework, I would just declare the Weather and List<Location> as static, and that would have taken care of caching.
So my question is: if I want just the Weather and List<Location> to be cached, why would I cache the entire DAO?
Also, behind the scenes, does the @Cacheable annotation turn Weather and List<Location> into static variables?
This is the code in question:
public interface WeatherDao {
    public Weather getWeather(String zipCode);
    public List<Location> findLocations(String locationSearch);
}

public class DefaultWeatherDao implements WeatherDao {
    @Cacheable(cacheName="weatherCache")
    public Weather getWeather(String zipCode) {
        //Some Code
    }

    @Cacheable(cacheName="locationSearchCache")
    public List<Location> findLocations(String locationSearch) {
        //Some Code
    }
}
I don't quite understand your points about static variables. In essence there is a map from zipCode to Weather and a similar one from locationSearch to List<Location>. This map can come from a database, a file, an external API, etc.
Do you want to create a map with all possible arguments as keys and corresponding values? Sure, you can, but it has several drawbacks:
you put a lot of pressure on your heap. In many cases the amount of data might never fit into memory, or even on your disk (think: caching Google search engine by storing every possible search query and list of hits)
most likely you won't use most of the keys, ever. Why store them in memory?
what about eviction? I bet these methods tend to return different weather for the same ZIP code over time...
Since I don't fully understand your arguments, let me explain briefly what happens behind the scenes when getWeather() is called:
a transparent proxy intercepts the getWeather() call and looks up weatherCache
in that cache it uses the zipCode argument as the cache key
if such an entry exists (of type Weather), it is returned immediately
if not, control is delegated to the real getWeather() method, which can call some API, run a database query or do some lengthy computations
the result of getWeather() is placed in weatherCache for future reference
No static is involved here.
BTW, Spring 3.1 introduced a caching abstraction layer that probably makes this Google Code project obsolete. It looks much the same and allows seamless integration with different cache implementations.
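For comparison, a sketch of the same DAO on Spring's built-in abstraction (the in-memory ConcurrentMapCacheManager is just for illustration; production setups would typically back it with Ehcache or similar):

import java.util.List;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.concurrent.ConcurrentMapCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching
class CacheConfig {
    @Bean
    CacheManager cacheManager() {
        return new ConcurrentMapCacheManager("weatherCache", "locationSearchCache");
    }
}

public class DefaultWeatherDao implements WeatherDao {

    @Cacheable("weatherCache") // zipCode becomes the cache key
    public Weather getWeather(String zipCode) {
        // ...expensive lookup...
        return null;
    }

    @Cacheable("locationSearchCache")
    public List<Location> findLocations(String locationSearch) {
        // ...expensive search...
        return null;
    }
}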
Let's say you have a Client and a Server that want to share/synchronize the same models/objects. The models point to each other, and you want them to keep pointing at the same objects after being sent/serialized between the client and the server. My current solution roughly looks like this:
class Person {
    static Map<Integer, Person> allPeople = new HashMap<>();
    int myDogId;

    static Person getPerson(int key) {
        return allPeople.get(key);
    }

    Dog getMyDog() {
        return Dog.getDog(myDogId);
    }
}

class Dog {
    static Map<Integer, Dog> allDogs = new HashMap<>();
    int myOwnersId;

    static Dog getDog(int key) {
        return allDogs.get(key);
    }

    Person getMyOwner() {
        return Person.getPerson(myOwnersId);
    }
}
But I'm not too satisfied with this solution, with the reference fields being plain integers and so on. This should also be a pretty common problem. So what I'm looking for here is a name for this problem, a pattern, a common solution, or a library/framework.
There are two issues here.
Are you replicating the data in the Client and the Server (if so, why?), or does one of them, or a database agent, hold the Model?
How does each agent access (its/the) model?
If the model is only held by one agent (Client, Server, Database), then the other agents need a way to remotely query the model (e.g., object enumerators, getters and setters for various fields) operating on abstract model entities (e.g., model element identifiers, which might be implemented as integers, as you have done).
Regardless of who holds the model (one or all), each model can be implemented naturally. The normal implementation has each object simply refer to other objects using normal object references, as if you had coded this without any thought of sharing between agents, unlike what you did.
You can associate an objectid with each object, as you have, but your application code doesn't need to use it; it is only necessary when referencing a remote copy of the model. Whether this objectid is associated with each object as a special field, through a hash table, or is computed on the fly is just an implementation detail.
One way to handle this is to compute the objectid on the fly. You can do this if there is a canonical spanning tree over the entire model. In that case, the objectid is "just" the path from the root of the spanning tree to the location of the object. If you don't have a spanning tree, or it is too expensive to compute one, you can assign objectids as objects are created.
The real problem with a duplicated, distributed model such as you suggest you have is keeping it up to date while both agents update it. How do you prevent one agent from creating an object (and assigning an objectid) at the same time as the other, such that the objects created are different but carry the same objectid, or are the same but carry different objectids? You'll need remote locking and signalling to keep the models in sync (this is the same problem as "cache coherency" for multiple CPUs; just think of each object as acting like a cache line). The way it is generally solved is to designate who holds the master copy (perhaps of the entire model, perhaps of individual objects within the model) and then issue queries, reads, reads-with-intent-to-modify, or writes to ensure that the "unique" entire model gets updated.
The only solution I am aware of is to send the complete structure, i.e. Dogs and Persons, over the network. Then they will end up pointing at the correct copies on the other side. The implementation of this solution, however, depends on a lot of circumstances. For example, when your inclusion relation defines a tree, you can approach this problem differently than if it is a graph with cycles.
Have a look at this for more information.
I guess one can use the proxy pattern for this.
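For instance, a reference could be handed out as a lazy proxy that resolves the canonical instance from the question's registry on first use (DogProxy is purely illustrative):

// A stand-in Dog that only stores the id; the real, canonical Dog is
// looked up from the shared map the first time it is actually needed.
public class DogProxy extends Dog {
    private final int dogId;
    private Dog resolved;

    public DogProxy(int dogId) { this.dogId = dogId; }

    private Dog target() {
        if (resolved == null) {
            resolved = Dog.getDog(dogId); // canonical instance from allDogs
        }
        return resolved;
    }

    @Override
    public Person getMyOwner() {
        return target().getMyOwner();
    }
}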
I have a problem regarding Java custom serialization. I have a graph of objects and want to configure where to stop when I serialize a root object from client to server.
Let's make it a bit more concrete and clear with a sample scenario. I have classes of the following types:
Company
Employee (abstract)
Manager extends Employee
Secretary extends Employee
Analyst extends Employee
Project
Here are the relations:
Company(1)---(n)Employee
Manager(1)---(n)Project
Analyst(1)---(n)Project
Imagine I'm on the client side and I want to create a new company, assign it 10 employees (new or existing ones) and send this new company to the server. What I expect in this scenario is that the company and all bound employees are serialized to the server side, because I'll save the relations in the database. So far no problem, since the default Java serialization mechanism serializes the whole object graph, excluding fields which are static or transient.
My goal concerns the following scenario. Imagine I loaded a company and its 1000 employees from the server to the client side. Now I only want to rename the company (or change some other field that directly belongs to the company) and update this record. This time I want to send only the company object to the server side, not the whole list of employees (I just update the name; the employees are irrelevant in this use case). My aim also includes the configurability of saying: transfer the company AND the employees, but not the Project relations; you must stop there.
Do you know any possibility of achieving this in a generic way, without implementing writeObject and readObject for every single entity object? What would be your suggestions?
I would really appreciate your answers. I'm open to any ideas and am ready to answer your questions in case something is not clear.
You can create another class (a Data Transfer Object) containing only the fields you want to transfer.
A way of doing custom serialization is implementing Externalizable.
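A minimal sketch of the Externalizable route for the "company only" use case (field names follow the question; a real version would write the remaining simple fields of Company as well):

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.util.List;

public class Company implements Externalizable {
    private String name;
    private List<Employee> employees; // deliberately not transferred below

    public Company() { } // public no-arg constructor required by Externalizable

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name); // only the company's own data crosses the wire
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        this.name = in.readUTF();
    }
}

Note that writeExternal is fixed per class, so choosing per call whether the employees travel along would still need some out-of-band flag (for example a thread-local switch), which is part of why a fully generic, configurable cut-off is hard to get from the standard mechanism.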
I would say the short answer to your question is no: such varied serialization logic can't easily be implemented without writing the serialization yourself. That said, an alternative might be to write several serializer/deserializer pairs (XML, JSON, whatever your favourite format, instead of the built-in serialization) and then run your objects through those pairs, sending some kind of meta-information preamble.
For example, following your scenarios above, you might have these pairs of (de)serialization mechanisms:
(de)serializeCompany(Company c) - for the base company information
(de)serializeEmployee(Employee e) - for an employee's information
(de)serializeEmployee(Company c) - the base information of employees in a company
(de)serializeRelationships(Company c) - for the project relationships
For XML, each of these can generate a DOM tree, and then you place them all in a root node containing the meta-information, e.g.:
<Company describesEmployees="true" describesRelationships="false">
    [Elements from (de)serializeCompany]
    [Elements from (de)serializeEmployee(Company c)]
</Company>
One potential "gotcha" with this approach is making sure you do the deserialization in the correct order depending on your model (i.e. make sure you deserialize the company first, then the employees, then the relationships). But this approach should afford you the ability to only write the "actual" serialization once, and then you can build your different transport models based on compositions of these pieces.
You could take an object swizzling approach where you send a "stub" object over the wire to your client.
Pros
The same object graph is logically available client-side without the overhead of serializing / deserializing unnecessary data.
Full / stub implementations can be swapped in as necessary without your client code having to change.
Cons
The overhead of getters that dynamically load additional attributes via a call to the server is hidden from the client, which can be problematic if you do not control the client code; e.g. an unwitting user could make an expensive call many times in a tight loop.
If you decide to cache data locally on the client-side you need to ensure it stays in-sync with the server.
Example
/**
* Lightweight company stub that only serializes the company name.
* The collection of employees is fetched on-demand and cached locally.
* The service responsible for returning employees must be "installed"
* client-side when the object is first deserialized.
*/
public class CompanyStub implements Company, Serializable {
    private final String name;
    private transient Set<Employee> employees;
    private transient Service service;

    public CompanyStub(String name) {
        this.name = name;
    }

    public Service getService() {
        return service;
    }

    public void setService(Service service) {
        this.service = service;
    }

    public String getName() {
        return name;
    }

    public Set<? extends Employee> getEmployees() {
        if (employees == null) {
            // Employees not loaded yet, so fetch them via the installed service.
            this.employees = service.getEmployeesForCompany(name);
        }
        return employees;
    }
}