I have written a Java app which manipulates a file that is created by another program. I want my program to work in real time, and to do so I need to read from the file while the other program is still writing it.
The simple solution is to keep reading from the file in an infinite loop, even after EOF has been reached, but that's very inefficient.
The better solution is to use a named pipe and configure the program to write to that pipe (I can choose the output file of the program). I know nothing about pipes in Windows and I have no idea how to create them in the filesystem. If possible, I would like to create them from my app (in Java), but any other way will be good as well.
I am working on Windows XP SP3.
Is it even possible in Windows? And what is the best way?
While Windows has pipes, they aren't completely the same as those under *nix (see this wikipedia page), and Java has no built-in support for creating them. The common suggestion is to use a socket for interprocess communication instead. Though I cannot provide any hard evidence, if you are running through localhost this should not create a significant amount of overhead versus a pipe, and it will also make your code more flexible if you later choose to run the processes on different machines.
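A minimal sketch of the socket suggestion, assuming the producing program can be pointed at a TCP port instead of a file (the question does not confirm this). The class and method names are made up for illustration, and sendLines only stands in for the other program. The reader blocks in readLine() until data arrives, so there is no busy loop polling for EOF.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopbackLineReceiver {

    /** Accepts one connection and returns every line the peer sends before it closes. */
    static List<String> receiveLines(ServerSocket server) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Socket peer = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            // readLine() blocks until a line arrives and returns null only when the writer closes.
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    /** Hypothetical stand-in for the producing program: connects and writes some lines. */
    static void sendLines(int port, List<String> lines) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
            for (String line : lines) {
                out.println(line);
            }
        }
    }

    /** Runs producer and consumer in one JVM over the loopback interface. */
    static List<String> demo() throws Exception {
        // Port 0 asks the OS for any free port; binding to loopback keeps it local.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            int port = server.getLocalPort();
            Thread producer = new Thread(() -> {
                try {
                    sendLines(port, Arrays.asList("first", "second"));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();
            List<String> received = receiveLines(server);
            producer.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

In a real setup you would process each line as it arrives rather than collecting them into a list.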
Sorry if the question is too open-ended or otherwise not suitable, but this is due to my lack of understanding about several pieces of technology/software, and I'm quite lost. I have a project where I have an existing java swing GUI, which runs MPI jobs on a local machine. However, it is desired to support running MPI jobs on HPC clusters (let's assume linux cluster with ssh access). To be more specific, the main backend executable (linux and windows) that I need to, erm, execute uses a very simple master-slave system where all relevant output is performed by the master node only. Currently, to run my backend executable on multiple machines, I would simply need to copy all necessary files to the machines (assuming no shared filespace) and call "mpiexec" or "mpirun" as is usual practice. The output produced by the master needs to be read in (or partially read in) by my GUI.
The main problem as I see things is this: Where to run the GUI? Several options:
Local machine - potential problem is needing to read data from cluster back to local machine (and also reading stdout/stderr of the cluster processes) to display current progress to user.
Login node - obvious problem of hogging precious resources, and in many cases will be banned.
Compute node - sounds pretty dodgy - especially if the cluster has a queuing system (slurm, sun grid, etc)! Also possibly banned.
Of these three options, the first seems the most reasonable, and also seems least likely to upset any HPC admin people, but is also the hardest to implement! There are multiple problems associated with that setup:
Passing data from cluster to local machine - because we're using a cluster, by definition we will probably generate large amounts of data, at least part of which the user wants to see! Also, how should this be done? I can see how to execute commands on a remote machine via ssh using jsch or similar, but if I'm currently logged in on the remote machine - how do I communicate information back to the local machine?
Displaying stdout/stderr of backend in local machine. Similar to above.
Dealing with peculiar aspects of individual clusters - the only way I see around that is to allow the user to write custom slurm scripts or such like.
How to detect if backend computations have finished/failed - this problem interacts with any custom slurm scripts written by user.
Hopefully it should be clear from the above that I'm quite confused. I've had a look at Apache Camel, JSch, Ganymed SSH, Apache MINA, Netty, Slurm, Sun Grid, Open MPI, MPICH, PMI, but there's so much information that I think I need to ask for some help and advice. I would greatly appreciate any comments regarding these problems!
Actually, I just came across this: link which seems to suggest that if the cluster allows an "interactive"-mode job, then you can run a GUI from a compute node. However, I don't know much about this, nor do I know if this is common. I would be grateful for comments on this aspect.
You may be able to leverage the approach shown here: a ProcessBuilder is used to execute a command in the background of a SwingWorker, while the command's output is displayed in a suitable component. In the example, ls -l would become ssh username@host 'ls -l'. Use JPasswordField as required.
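A rough sketch of the ProcessBuilder part, assuming an ssh client is on the PATH and key-based login is already set up (password prompts are not handled here). RemoteCommand and its methods are hypothetical names, not taken from the linked example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class RemoteCommand {

    /** Builds the argument list for running a command on a remote host through ssh. */
    static List<String> sshCommand(String user, String host, String remoteCommand) {
        // ProcessBuilder passes each element as one argument, so no local shell quoting is needed.
        return Arrays.asList("ssh", user + "@" + host, remoteCommand);
    }

    /** Runs the command, hands each output line to the sink, and returns the exit code. */
    static int run(List<String> command, Consumer<String> sink)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader sees both
        Process process = builder.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.accept(line);
            }
        }
        return process.waitFor();
    }
}
```

Inside a SwingWorker<Integer, String> you would call run(sshCommand(user, host, "ls -l"), line -> publish(line)) from doInBackground() and append the published lines to a text component in process(), so the event dispatch thread never blocks on the remote command.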
I was just wondering if it's possible to dump a running Java program into a file, and later on restart it (on the same machine).
It sounds a bit weird, but who knows.
--- update -------
Yes, this is the hibernate feature for a process instead of a full system. But google 'hibernate jvm process' and you'll understand my pain.
There is a question for linux on this subject (here). Quickly, it's possible to hibernate a process (far from 100% reliable) with CryoPID.
A similar question was raised in stackoverflow some years ago.
With a JVM, my educated guess is that hibernating should be a lot easier, but still not always possible and not 100% reliable (e.g. because of UI and open files).
Serializing a persistent state of the application is an option but it is not an answer to the question.
This may be a bit overkill, but one thing you can do is run something like VirtualBox and halt/save the machine.
There is also:
- JavaFlow from Apache that should do just that even though I haven't personally tried
- Brakes that may be exactly what you're looking for
Any solution to your problem will have a lot of restrictions: external connections might or might not survive your attempt to freeze and wake them. Think of timeouts on the other side, or even communication partners that have stopped - anything from a web server to a database or even local files.
You are asking for a generic solution that needs no internal knowledge of the program you would like to hibernate. What you can always do is serialize the part of your program's state that you need in order to restart it. It is, or at least was, common wisdom to implement restart points in long-running computations (think of days or weeks). So, when you hit a bug in your program after it has run for a week, you can fix the bug and save some days of computation.
The state of a program could be surprisingly small, compared to the complete memory size used.
You asked "if it's possible to dump a running Java program into a file, and later on restart it." - Yes it is, but I would not suggest a generic and automatic solution that has to handle your program as a black box. Instead, I suggest that you externalize the important part of your program's state and add restart points to the program.
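To make the restart-point idea concrete, here is a small sketch using plain Java serialization. The computation (summing integers) and all names are invented for illustration; the point is only that the state needed to resume is tiny compared to the whole process.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

class CheckpointedSum {

    /** The only state needed to resume: the next number to add and the running total. */
    static final class State implements Serializable {
        private static final long serialVersionUID = 1L;
        long next = 1;
        long total = 0;
    }

    static void save(State state, Path file) throws IOException {
        // A production version would write to a temporary file and rename it,
        // so that a crash during the write cannot corrupt the last good checkpoint.
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(state);
        }
    }

    static State load(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new State(); // no checkpoint yet: start from the beginning
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (State) in.readObject();
        }
    }

    /**
     * Sums 1..limit, doing at most maxSteps additions in this run and
     * checkpointing after every step, so a later run continues where this one stopped.
     */
    static State run(Path checkpoint, long limit, long maxSteps)
            throws IOException, ClassNotFoundException {
        State state = load(checkpoint);
        for (long done = 0; state.next <= limit && done < maxSteps; done++) {
            state.total += state.next;
            state.next++;
            save(state, checkpoint);
        }
        return state;
    }
}
```

Calling run(file, 10, 4) and then run(file, 10, 100) behaves like one uninterrupted run: the second call picks up at 5 and finishes with a total of 55. Real computations would checkpoint far less often than after every step.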
Hope that helps - even if it's more complicated than what you might have hoped for.
I believe what the OP is asking is what the Smalltalk guys have been doing for decades - store the whole programming/execution environment in an image file, and work on it.
AFAIK there is no way to do the same thing in Java.
There has been some research in "persisting" the execution state of the JVM and then move it to another JVM and start it again. Saw something demonstrated once but don't remember which one. Don't think it has been standardized in the JVM specs though...
Found the presentation/demo I was thinking about, it was at OOPSLA 2005 that they were talking about squawk
Good luck!
How about using the Spring Batch framework?
As far as I understand from your question, you need a reliable and resumable Java task. If so, I believe that Spring Batch will do the magic, because you can split your task (job) into several steps, while each step (and also the entire job) has its own execution context persisted to a storage you choose to work with.
In case of crash you can recover by analyzing previous run of specific job and resume it from exact point where the failure occurred.
You can also pause and restart your job programmatically if the job was configured as restartable and the ExecutionContext for this job already exists.
Good luck!
I believe:
1- the only generic way is to implement serialization.
2- a good way to restore a running system is OS virtualization.
3- what you are asking for now is something like single-process serialization.
The problem is I/O.
Say your process uses a temporary file which gets deleted by the system after 'hibernation', but your program does not know it. You will get an IOException.
So the bottom line is: if the program is not designed to be interrupted at an arbitrary point, it won't work.
That's a risky and unmaintainable solution, so I believe only 1 and 2 make sense.
I guess IDEs support debugging in a similar way. It is not impossible, though I don't know how. Maybe you will get details if you contact an Eclipse or NetBeans contributor.
First off you need to design your app to use the Memento pattern or any other pattern that allows you to save state of your application. Observer pattern may also be a possibility. Once your code is structured in a way that saving state is possible, you can use Java serialization to actually write out all the objects etc to a file rather than putting it in a DB.
Just my 2 cents.
What you want is impossible from the very nature of computer architecture.
Every Java program gets compiled into Java intermediate code (bytecode), and this code is then interpreted into native platform code when run. The native code is quite different from what you see in Java files, because it depends on the underlying platform and JVM version. Every platform has a different instruction set, memory management, driver system, etc. So imagine that you hibernated your program on Windows and then ran it on Linux, Mac or any other device with a JRE, such as a mobile phone, car, card reader, etc. All hell would break loose.
Your solution is to serialize every important object into files and then close the program gracefully. When "unhibernating", you deserialize these instances from the files and your program can continue. The number of "important" instances can be quite small: you only need to save the "business data", and everything else can be reconstructed from it. You can use Hibernate or any other ORM framework to automate this serialization on top of a SQL database.
Terracotta can probably do this: http://www.terracotta.org
I am not sure, but they support server failures. If all servers stop, I think the process should be saved to disk and wait.
Otherwise you should refactor your application to hold state explicitly. For example, if you implement something like Runnable and make it Serializable, you will be able to save it.
Is there any way in java to read a file's content, which is being updated by another handler before closing it?
That depends on the operating systems.
Traditionally, POSIX-y operating systems (Linux, Solaris, ...) have absolutely no problem with having a file open for both reading and writing, even by separate processes (they even support deleting a file while it's being read from and/or written to).
In Windows, the more common approach is to open files exclusively (contrary to common belief, Windows does support non-exclusive file access, it's just rarely used by applications).
Java has no way* of specifying what way you want to access a file, so the platform default is used (shared access on Linux/Solaris, exclusive access on Windows).
* This might be wrong for NIO and new NIO in Java 7, but I'm not a big NIO expert.
In theory it's quite easy to do, however files are not designed to exchange data this way, and depending on your requirements it can be quite tricky to get right. This is why there is no general solution for this.
e.g. if you want to read a file as another process writes to it, the reading thread will see an EOF even though the writer hasn't finished. You have to re-open the file, skip to where the file was last read, and continue. The writer might also roll the files it is writing, meaning the reader has to detect this and handle it.
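The re-open-and-skip approach could look roughly like the sketch below. FileTailer is a made-up name, and the rollover check is deliberately naive: it only notices that the file became shorter than the last read position, so a rolled file that has already grown past that position would be missed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

class FileTailer {
    private final Path file;
    private long position = 0; // offset of the first byte that has not been consumed yet

    FileTailer(Path file) {
        this.file = file;
    }

    /** Returns whatever was appended since the previous call, or "" if nothing is new. */
    String readNew() throws IOException {
        // Re-open on every call so the current length is always the one seen on disk.
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < position) {
                // The file shrank: assume the writer truncated or rolled it and start over.
                position = 0;
            }
            if (length == position) {
                return "";
            }
            raf.seek(position);
            // Read in bounded chunks; a chunk may end mid-line or mid-character,
            // so callers that need whole lines must buffer the tail themselves.
            byte[] chunk = new byte[(int) Math.min(length - position, 64 * 1024)];
            raf.readFully(chunk);
            position += chunk.length;
            return new String(chunk, StandardCharsets.UTF_8);
        }
    }
}
```

You would call readNew() from a loop with a short sleep between calls. Whether the file can be opened at all while the writer holds it depends on how the writer opened it, as discussed above.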
What specifically do you want to do?
Alright, so I'm writing this program that essentially batch runs other java programs for me (multiple times, varying parameters, parallel executions, etc).
So far the running part works great. Using ProcessBuilder's .start() method (equivalent to the Runtime.exec() I believe), it creates a separate java process and off it goes.
Problem is I would like to be able to pause/stop these processes once they've been started. With simple threads this is generally easy to do, however the external process doesn't seem to have any inbuilt functionality for waiting/sleeping, at least not from an external point of view.
My question(s) is this: Is there a way to pause a java.lang.Process object? If not, does anyone know of any related exec libraries that do contain this ability? Barring all of that, is extending Process a more viable alternative?
My question(s) is this: Is there a way to pause a java.lang.Process object?
As you've probably discovered, there's no support for this in the standard API. Process for instance provides no suspend() / resume() methods.
If not, does anyone know of any related exec libraries that do contain this ability?
On POSIX compliant operating systems such as GNU/Linux or Mac OS you could use another system call (using Runtime.exec, ProcessBuilder or some natively implemented library) to issue a kill command.
Using the kill command you can send signals such as SIGSTOP (to suspend a process) and SIGCONT (to resume it).
(You will need to get hold of the process id of the external program. There are plenty of questions and answers around that cover this.)
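A sketch of the kill approach, assuming a POSIX system with a kill executable on the PATH and Java 9 or later, where Process.pid() makes the process id directly available. ProcessSuspender is a hypothetical helper, not an existing library class.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class ProcessSuspender {

    /** Builds the kill invocation that delivers the named signal to the given process id. */
    static List<String> killCommand(String signal, long pid) {
        return Arrays.asList("kill", "-" + signal, Long.toString(pid));
    }

    /** Suspends the process with SIGSTOP; it stays in memory but gets no CPU time. */
    static void suspend(Process process) throws IOException, InterruptedException {
        send("STOP", process.pid()); // Process.pid() exists since Java 9
    }

    /** Resumes a process that was previously suspended with SIGSTOP. */
    static void resume(Process process) throws IOException, InterruptedException {
        send("CONT", process.pid());
    }

    private static void send(String signal, long pid) throws IOException, InterruptedException {
        int exit = new ProcessBuilder(killCommand(signal, pid)).inheritIO().start().waitFor();
        if (exit != 0) {
            throw new IOException("kill -" + signal + " " + pid + " exited with " + exit);
        }
    }
}
```

Note that this only affects the process you started. If that process has spawned children of its own, they keep running unless you signal them as well.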
You will need to create a system for sending messages between processes. You might do this by:
Sending signals, depending on OS. (As aioobe notes.)
Having one process occasionally check for presence/absence of a file that another process can create/delete. (If the file is being read/written, you will need to use file locking.)
Have your "main" process listen on a port, and when it launches the children it tells them (via a command-line argument) how to "phone home" as they start up. Both programs alternate between doing work and checking for and handling messages.
From what you have described (all Java programs in a complex batch environment) I would suggest the third option, TCP/IP communication.
While it certainly involves extra work, it also gives you the flexibility to send commands or information of whatever kind you want between different processes.
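Here is one way the "phone home" idea might look, as a sketch only. The text protocol (READY, PAUSE, RESUME, STOP) is invented for this example, and the child runs as a thread so the demo fits in one JVM; in the real setup childLoop would run in the launched process with the port taken from its command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class PhoneHome {

    /** Child side: connects to the parent, then alternates between work and command checks. */
    static int childLoop(int parentPort) throws IOException, InterruptedException {
        int unitsDone = 0;
        boolean paused = false;
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), parentPort);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println("READY");
            while (true) {
                // While paused, block until the next command; otherwise read only if one is waiting.
                if (paused || in.ready()) {
                    String command = in.readLine();
                    if (command == null || command.equals("STOP")) {
                        break;
                    }
                    if (command.equals("PAUSE")) {
                        paused = true;
                    } else if (command.equals("RESUME")) {
                        paused = false;
                    }
                    continue;
                }
                unitsDone++;      // placeholder for one unit of real work
                Thread.sleep(10);
            }
        }
        return unitsDone;
    }

    /** Parent side: listens, starts the child, and sends it a few commands. */
    static String demo() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            int port = server.getLocalPort();
            Thread child = new Thread(() -> {
                try {
                    childLoop(port);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            child.start();
            try (Socket socket = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
                String greeting = in.readLine(); // the child announces itself first
                out.println("PAUSE");
                out.println("RESUME");
                out.println("STOP");
                child.join(); // keep the connection open until the child has seen STOP
                return greeting;
            }
        }
    }
}
```

The pause is cooperative: the child only reacts between units of work, so the units have to be short enough for the response time you need.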
A Process represents a separate process running on the machine. Java definitely does not allow you to pause them through java.lang.Process. You can forcibly stop them using Process.destroy(). For pausing, you will need the co-operation of the spawned process.
What sorts of processes are these? Did you write them?
I am somewhat familiar with ProcessBuilder and do process the streams.
Now I ran into the problem that the process that I am automating reads some information from two files that I need to provide.
Currently, I am writing the files and providing the paths to the program via ProcessBuilder.
Since I am expecting to have millions of runs in the near future, I would like to speed things up by doing all the work in memory instead of reading from and writing to files.
Basically, what I need to be able to do is capture the file open request from the automated program and provide the expected data from a stringstream or something similar.
Of course if I could tell ProcessBuilder somehow that the file paths I am giving be replaced by streams that would be even better.
How can I achieve this?
There is no interface to Process that allows you to intercept and modify I/O access like that. Unless you have the source code for the program whose execution you're trying to automate, you'll most likely have to do it at the OS level.
It could be achieved though by creating a ram disk. If you're on Linux for instance, it's not that complicated. Have a look at this link: Linux RAM Disk: Creating A Filesystem In RAM.
I suppose that another alternative would be to let your Java program create named pipes and pass them as the paths to the automated program.