How to have a common object across multiple JVMs - java

There is an application that will need something like a lookup table. This application can be started many times with different configurations. Is there a way to share a data structure across JVMs? A static field would only be valid within a single JVM. Having a database would solve the issue; however, is there something simpler and faster?

You might use a file: write the object to a file. There is no such thing as an object shared across JVMs, because the life cycle of an object is defined for and within a single JVM.
File I/O is usually faster than DB operations and simpler as well. But on the downside, ACID properties are not guaranteed by files, and there could be inconsistencies if multiple processes try to read/write the same file.
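For illustration, here is a minimal sketch of the file approach using plain Java serialization plus a java.nio file lock to reduce (though not eliminate) the risk of two JVMs touching the same file at once; the file name and map type are placeholders:

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class SharedLookupFile {

    // Write the lookup table to a file, holding an exclusive lock while writing.
    static void save(Map<String, String> table, File file) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file)) {
            fos.getChannel().lock();                 // exclusive lock; released when fos closes
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(new HashMap<>(table));
            oos.flush();                             // make sure everything reaches the file
        }
    }

    // Read the lookup table back, holding a shared (read) lock while reading.
    @SuppressWarnings("unchecked")
    static Map<String, String> load(File file) throws IOException, ClassNotFoundException {
        try (FileInputStream fis = new FileInputStream(file)) {
            fis.getChannel().lock(0, Long.MAX_VALUE, true);  // shared lock; released on close
            ObjectInputStream ois = new ObjectInputStream(fis);
            return (Map<String, String>) ois.readObject();
        }
    }
}
```

Note that file locks are advisory on many platforms, so all cooperating processes have to use them for this to help.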

Related

Should I have a single or multiple script files?

I'm creating a program in Java that uses scripting. I'm just wondering if I should split my scripts into one file for each script (more realistically every type of script like "math scripts" and "account scripts" etc.), or if I should use one clumped file for all scripts.
I'm looking for an answer from more of a technical viewpoint rather than a practical viewpoint if possible, since this question kind of already explained the practical side (separate often modified scripts and large scripts).
In terms of technical performance impacts, one could argue that using a single Globals instance is actually more efficient, since any libraries are loaded only once instead of multiple times. However, whether to use multiple files really depends. Multiple physical Lua files can be loaded using the same Globals, or a single file can be loaded using the Globals instance; either way, the Globals table contains the same amount of data in the end, regardless of whether it was loaded from multiple files or not. If you use a separate Globals for each file, this is not the case.
Questions like this really depend on what you intend to use Lua for. Using a single Globals instance will use RAM more efficiently, but beyond that will not really give any performance increase. Loading multiple files versus a single file may take slightly longer, because of the time needed to open and close the file handles, but this is such a micro-optimization that it seriously isn't worth the hassle of writing all the code in a single file, not to mention how hard it would be to organize it efficiently.
There are a few advantages to using multiple Globals as well, however: each Globals instance has its own global storage, so changes such as overloading operators on an object's metatable or overriding functions don't carry over to other instances. If this isn't a problem for you, then my suggestion would be to write the code in multiple files and load them all with a single Globals instance. However, if you do this, be careful to structure all your files properly; if you use the global scope a lot, you may find that keeping track of object names becomes difficult, and you are prone to accidentally modifying values from other files by giving them the same name. To avoid this, each file can define all of its functionality in its own table, and these tables then work as individual modules, where you can select features based on the tables, almost like choosing from a specific file.
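As a rough sketch of that single-Globals, multiple-files arrangement with LuaJ: the file names and the add function below are hypothetical, and each script is assumed to return its functionality as a table.

```java
import org.luaj.vm2.Globals;
import org.luaj.vm2.LuaValue;
import org.luaj.vm2.lib.jse.JsePlatform;

public class ScriptLoader {
    public static void main(String[] args) {
        // One Globals instance: the standard libraries are loaded once and shared.
        Globals globals = JsePlatform.standardGlobals();

        // Load each physical file into the same Globals. Each script is expected
        // to return a table of its functions, which we keep as a "module".
        LuaValue mathModule = globals.loadfile("math_scripts.lua").call();
        LuaValue accountModule = globals.loadfile("account_scripts.lua").call();

        // Call a function from one module without risking name clashes with the other.
        LuaValue result = mathModule.get("add").call(LuaValue.valueOf(2), LuaValue.valueOf(3));
        System.out.println(result.toint());
    }
}
```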
In the end it really doesn't make much of a difference, but depending on which you choose you may need to take care to ensure good organization of the code.
Using multiple Globals takes more RAM, but allows each file to have its own custom libraries without affecting the others; it comes at the cost of requiring more structural management from the Java end of your software to keep all the files organized.
Using a single Globals takes less RAM, but all files share the same global scope, which makes customized versions of libraries more difficult and requires more structural organization from the Lua end of the software to prevent names and other functionality from conflicting.
If you intend other users to use your Lua API to add on to your software, for example through an addon system, you may wish to use multiple instances of Globals, because making the user who creates an addon responsible for ensuring their code won't conflict with other addons is not only dangerous but also a burden that doesn't need to exist. An inexperienced user may come along trying to make an addon, fail to organize it properly, and mess up parts of the software or other addons.

Are transactions on top of "normal file system" possible?

It seems to be possible to implement transactions on top of normal file systems using techniques like write-ahead logging, two-phase commit, and shadow-paging etc.
Indeed, it must have been possible because a transactional database engine like InnoDB can be deployed on top of a normal file system. There are also libraries like XADisk.
However, the Apache Commons Transaction project states:
...we are convinced that the main advertised feature transactional file access can not be implemented reliably. We are convinced that no such implementation can be possible on top of an ordinary file system. ...
Why does Apache Commons Transaction claim that implementing transactions on top of normal file systems is impossible?
Is it impossible to do transactions on top of normal file systems?
Windows offers transactions on top of NTFS. See the description here: http://msdn.microsoft.com/en-us/library/windows/desktop/bb968806%28v=vs.85%29.aspx
It's not recommended for use at the moment, and there's an extensive discussion of alternative scenarios right in MSDN: http://msdn.microsoft.com/en-us/library/windows/desktop/hh802690%28v=vs.85%29.aspx
Also, by some definitions of a file system, a DBMS is itself a kind of file system, and a file system (like NTFS or ext3) can be implemented on top of (or inside) a DBMS as well. So Apache's statement is a bit, hmm, incorrect.
This answer is pure speculation, but you may be comparing apples and oranges. Or perhaps more accurately, milk and dairy products.
When a database uses a file system, it is only using a small handful of predefined files on the system (per database). These include data files and log files. The one operation that is absolutely necessary for ACID-compliant transactions is the ability to force a write to permanent memory (either disk or static RAM). And, I think most file systems provide this capability.
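For what it's worth, that "force to permanent storage" primitive is available from Java as well; a minimal write-ahead-log sketch (the log file name is arbitrary) might look like this:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class WalAppend {
    // Append a record to a write-ahead log and force it to stable storage
    // before the caller is allowed to consider the transaction committed.
    static void appendAndSync(String record) throws IOException {
        try (FileChannel log = FileChannel.open(Paths.get("redo.log"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            log.write(ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8)));
            log.force(true);   // flush data and metadata to disk (fsync-like behaviour)
        }
    }
}
```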
With this mechanism, the database can maintain locks on objects in the database as well as control access to all objects. Happily, the database has layers of memory/page management built on top of the file system. The "database" itself is written in terms of things like pages, tables, and indexes, not files, directories, and disk blocks.
A more generic transactional system has other challenges. It would need, for instance, atomicity across more kinds of actions: if you "transactionally" delete 10 files, all of them would have to disappear at the same time. I don't think "traditional" file systems have this capability.
In the database world, the equivalent would be deleting 10 tables. Well, you essentially create new versions of the system tables, minus the deleted tables, within a transaction, while the old tables are still being used. Then you put a full lock on the system tables (preventing reads and writes) and wait until it is granted. Then you swap in the new table definitions (i.e. without the deleted tables), unlock the tables, and clean up the data. (This is intended as an intuitive view of the locking mechanism in this case, not a 100% accurate description.)
So, notice that locking and transactions are deeply embedded in the actions the database performs. I suspect that the authors of this module came to realize that they would have to basically re-implement all existing file system functionality to support their transactions, and that was a bit too much scope to take on.

Sharing a java object across a cluster

My requirement is to share a Java object across a cluster.
I am confused about whether to write an EJB and share the Java objects across the cluster, or to use a third party such as Infinispan, memcached or Terracotta, or what about JCache?
The constraints are:
I can't change any of my source code to be specific to any application server (such as implementing WebLogic's singleton services).
I can't offer two builds for cluster and non-cluster environments.
Performance should not be degraded.
I am looking only at open-source third-party options, if I need to use one.
It needs to work in WebLogic, WebSphere, JBoss and Tomcat too.
Can anyone come up with the best option with these constraints in mind?
It can depend on the use case of the objects you want to share in the cluster. I think it comes down to the following options, from most complex to least complex:
Distributed caching
http://www.ehcache.org
Distributed caching is good if you need to ensure that an object is accessible from a cache on every node. I have used Ehcache to distribute quite successfully; there is no need to set up a Terracotta server unless you need the scale, you can just point instances at each other via RMI. It works synchronously or asynchronously depending on requirements, and cache replication is handy if nodes go down, so the cache is actually redundant and you don't lose anything. Good if you need to make sure that the object has been updated across all the nodes.
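As a rough illustration of the Ehcache side of this: the cache name below is hypothetical, and the RMI replication itself would be declared in ehcache.xml, which is not shown here.

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class LookupCache {
    public static void main(String[] args) {
        // Reads ehcache.xml from the classpath; the replication settings live there.
        CacheManager manager = CacheManager.newInstance();
        Cache lookup = manager.getCache("lookup");   // "lookup" is a hypothetical cache name

        // Puts are replicated to the other nodes according to the XML configuration.
        lookup.put(new Element("currency.USD", "US Dollar"));

        Element hit = lookup.get("currency.USD");
        if (hit != null) {
            System.out.println(hit.getObjectValue());
        }
        manager.shutdown();
    }
}
```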
Clustered Execution/data distribution
http://www.hazelcast.com/
Hazelcast is also a nice option, as it provides a way of executing Java classes across a cluster. This is more useful if you have an object that represents a unit of work that needs to be performed and you don't care so much where it gets executed.
It is also useful for distributed collections, i.e. a distributed map or queue.
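A minimal Hazelcast sketch of such a distributed map (the map name is arbitrary; by default, instances that run this discover each other and form a cluster):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;

public class ClusterMap {
    public static void main(String[] args) {
        // Each application instance that executes this joins the same cluster.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A distributed map: entries put here are visible on every node.
        Map<String, String> lookup = hz.getMap("lookup-table");  // hypothetical map name
        lookup.put("currency.USD", "US Dollar");

        System.out.println(lookup.get("currency.USD"));
    }
}
```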
Roll your own RMI/Jgroups
You can write your own client/server, but I think you will start to run into issues that the bigger frameworks solve once the requirements of the objects you're dealing with start to get complex. Realistically, Hazelcast is really simple and should eliminate the need to roll your own.
It's not open source, but Oracle Coherence would easily solve this problem.
If you need an implementation of JCache, the only one that I'm aware of being available today is Oracle Coherence; see: http://docs.oracle.com/middleware/1213/coherence/develop-applications/jcache_part.htm
For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.
This is just an idea; you might want to check the exact implementation.
It will degrade performance, but I don't see how that can be avoided.
It is not an easy one to implement; maybe you should consider load balancing instead of clustering.
You might consider RMI and/or a dynamic proxy:
Extract an interface from your objects.
Use RMI to access the real object (from all cluster nodes, even the one that actually holds the object).
To add RMI to existing code you might use a dynamic proxy (again, I'm not sure about the exact implementation).
A dynamic proxy can wrap any object and perform pre- and post-tasks on each method invocation; in this case it might delegate each call to the original object via RMI.
You will need connectivity between the cluster nodes in order to propagate the RMI objects.
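To make the dynamic-proxy part a bit more concrete, here is a small sketch; the LookupService interface is hypothetical, and the handler only shows where the pre/post work (for example, delegating to an RMI stub) would go:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical interface extracted from the object you want to share.
interface LookupService {
    String find(String key);
}

public class LookupProxyFactory {

    // Wraps any LookupService so that each call can do pre/post work,
    // e.g. forwarding the call to a remote node instead of a local copy.
    static LookupService wrap(LookupService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            System.out.println("before " + method.getName());    // pre-task
            Object result = method.invoke(target, args);          // delegate (could be an RMI stub)
            System.out.println("after " + method.getName());      // post-task
            return result;
        };
        return (LookupService) Proxy.newProxyInstance(
                LookupService.class.getClassLoader(),
                new Class<?>[]{LookupService.class},
                handler);
    }
}
```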

Communication between different Jar's/Classloaders

I've got the following problem to solve:
There are two jar files. These jars start independently of each other.
Now, let's say the first jar, A.jar, calculates or computes something and has to commit the results to B.jar.
I tried to communicate via a central singleton (an enum singleton, and a singleton that uses its own classloader, as mentioned here: Singleton class with several different classloaders).
But it didn't seem to work for me. When I start the two jars, the hash codes of the instances are different.
Can anyone tell me what I'm doing wrong, or suggest other ideas for how to solve my problem?
There are two jar files. These jars start independently of each other.
So they're separate processes. They won't share instances of classes, variables etc. You need some form of inter-process communication to communicate between them.
That would normally mean some form of network protocol (e.g. TCP/UDP sockets, HTTP etc.). You could also do something really simple like reading/writing a shared file (that's not particularly nice, but it is straightforward for simple cases)
If you are running 2 jar files separately, then each jar file runs in its own JVM instance and nothing is shared between them. They are 2 separate processes, full stop.
If you wish to communicate between them then here are alternatives:
It depends on what kind of data you wish to transfer.
If it is only Strings, then:
If the number of processes = 2 and you are sure of it, then stdin & stdout is the best way forward. One process can run the other jar file by creating a Process using ProcessBuilder and then use the process's streams to communicate. The other process can just use System.out to send messages. This is preferred over a socket, because you don't have to handle graceful closing of the socket etc. (in case that fails and the port is not unbound successfully, it can be big trouble).
If the number of jar files (i.e. the total number of processes) is > 2 and less than, say, 10, you can probably use sockets and communicate through a Socket. This should work well, though extra effort goes into gracefully managing the sockets.
If the number of processes is large, then JMS should be used. It does a lot of things which you then don't need to handle, but it is too big a tool if the number of processes is small.
So in your case, the process (stdin/stdout) approach is the best way forward.
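A minimal sketch of that process approach (the jar name and the message format are placeholders):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;

public class ParentProcess {
    public static void main(String[] args) throws Exception {
        // Start B.jar as a child process (the path is a placeholder).
        Process child = new ProcessBuilder("java", "-jar", "B.jar")
                .redirectErrorStream(true)
                .start();

        // Whatever the child prints with System.out arrives on this stream.
        BufferedReader fromChild =
                new BufferedReader(new InputStreamReader(child.getInputStream()));

        // Whatever we write here shows up on the child's System.in.
        PrintWriter toChild = new PrintWriter(child.getOutputStream(), true);

        toChild.println("compute 42");             // send a request
        System.out.println(fromChild.readLine());  // read the child's reply
    }
}
```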
If the data you wish to transfer can even be objects, RMI can be used, given that the number of processes is small. If there are more, use JMS again.
Edit: For all of the above, there is a lot of dirty work involved. For a change, if you are looking at something new and exciting, I would suggest Akka. It is an actor-based model in which actors communicate with each other using messages. The beauty is that the actors can be on the same JVM or on another one (with very little configuration), and Akka takes care of the rest for you. I haven't seen a cleaner way of doing this :)

Servlet concurrency/synchronization in Tomcat?

Is there a recommended way to synchronize Tomcat Servlet instances that happen to be competing for the same resource (like a file, or a database like MongoDB that isn't ACID)?
I'm familiar with thread synchronization to ensure two Java threads don't access the same Java object concurrently, but not with objects that have an existence outside the JRE.
edit: I only have 1 Tomcat server running. Whether that means different JVMs or not, I am not sure (I assume it's the same JVM, but potentially different threads).
edit: particular use case (but I'm asking the question in general):
Tomcat server acts as a file store, putting the raw files into a directory, and using MongoDB to store metadata. This is a pretty simple concept except for the concurrency issue. If there are two concurrent requests to store the same file, or to manage metadata on the same object at the same time, I need a way to resolve that and I'm not sure how. I suppose the easiest approach would be to serialize / queue requests somehow. Is there a way to implement queueing in Tomcat?
Typically, your various servlets will be running in the same JVM, and if they're not, you should be able to configure your servlet runner so this is the case. So you can arrange for them to see some central, shared resource manager.
Then for the actual gubbinry, if plain old synchronized isn't appropriate, look for example at the Semaphore class (link is to part of a tutorial/example I wrote a while ago in case it's helpful), which allows you to handle "pools" of resources.
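As a sketch of the semaphore idea, assuming a single Tomcat instance (one JVM), something like the following could serialize access to the shared store; the class and method names are made up:

```java
import java.util.concurrent.Semaphore;

public class FileStoreGate {

    // Allow at most one request at a time to touch the shared store.
    // A larger permit count would allow a bounded pool of concurrent writers.
    private static final Semaphore PERMITS = new Semaphore(1, true);

    public static void store(String name, byte[] data) throws InterruptedException {
        PERMITS.acquire();
        try {
            // ... write the raw file and update the MongoDB metadata here ...
        } finally {
            PERMITS.release();
        }
    }
}
```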
If you are running one Tomcat server and all your servlets are in one context, you can always synchronize on a Java object present in that context's class loader. If you are running multiple contexts, then the "synchronization object" cannot reside in any particular context but needs to reside at a higher level that is shared by all the contexts. You can use the "common" class loader in Tomcat 6.0 (documentation here) to place your "synchronization object" where it will then be shared among all contexts.
There are two cases. If you expect to access a common resource for file editing within the same JVM, you can use "synchronized" on a Java method. If different JVMs and other non-Java processes access the common resource, you might try manual file-locking code, giving each thread a priority number in a queue.
For the database, I believe there's no concurrency issue.
Your external resource is going to be represented by Java object (e.g. java.io.File) in some way or another. You can always synchronize on that object if you need to.
Of course, that implies that said object would have to be shared across your servlet instances.
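One common way to get such a shared object per external resource, assuming everything runs in one JVM, is a map of lock objects keyed by resource id (the names here are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ResourceLocks {

    // One lock object per external resource (e.g. per file path), shared by all
    // servlets in the same JVM via this single class.
    private static final ConcurrentMap<String, Object> LOCKS = new ConcurrentHashMap<>();

    public static Object lockFor(String resourceId) {
        return LOCKS.computeIfAbsent(resourceId, id -> new Object());
    }
}

// Usage inside a servlet:
// synchronized (ResourceLocks.lockFor(fileName)) {
//     // write the file and update its metadata, atomically with respect to
//     // other threads in this JVM
// }
```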
IMO you're asking for trouble. There are reasons why things like databases and shared file systems were invented. Trying to write your own using some Singleton class or semaphores is going to get ugly real quick. Find a storage solution that does this for you and save yourself a lot of headaches.
