I have an application deployed in a Linux environment with two JVMs running simultaneously: one is a producer and the other is a consumer.
I have separate targets in my ant script for stopping and starting the two JVMs.
Sometimes when restarting the producer or the consumer, one of the JVMs fails to stop, so we have to manually find the process ID for that particular port, kill that process, and then start the application.
How can I automate this and write one script for everything? The script should call the ant targets to stop the JVMs, kill the process of any JVM that does not stop, and finally start the two JVMs.
The first and last parts are fine, but how do I write things like finding the process ID for a given port and then doing a kill -9?
I am a Java developer, so I don't know much about this.
If your JVMs are communicating on a socket, then try something like
lsof | grep ":$port " | awk '{print $2}'
where $port is the port number. This searches the list of open file descriptors for any entries matching the required port number and prints the process ID. (Recent versions of lsof can do this directly: lsof -t -i :$port prints just the PIDs.)
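If you would rather drive the whole stop/kill/start cycle from one place, a small Java helper (since you mentioned being a Java developer) can wrap the same idea. This is only a sketch, assuming lsof and kill are available on the box; PortKiller and its port argument are made-up names:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class PortKiller {
    // Finds every PID listening on the given port via lsof and kills it.
    public static void killByPort(int port) throws Exception {
        // lsof -t prints bare PIDs; -i :port restricts it to that port
        Process find = Runtime.getRuntime().exec(
                new String[] { "lsof", "-t", "-i", ":" + port });
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(find.getInputStream()))) {
            String pid;
            while ((pid = in.readLine()) != null) {
                // kill -9 is a last resort; a plain kill (SIGTERM) is politer
                Runtime.getRuntime()
                       .exec(new String[] { "kill", "-9", pid.trim() })
                       .waitFor();
            }
        }
        find.waitFor();
    }

    public static void main(String[] args) throws Exception {
        killByPort(Integer.parseInt(args[0])); // port passed on the command line
    }
}

Your wrapper script can then call the ant stop targets, run this (or the lsof pipeline above) for each port that is still occupied, and finally call the ant start targets.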
Related
I have two WebLogic domains, each with one managed server. The problem is that every 3 or 4 hours (maybe less) the four processes are killed suddenly, and in the domain console I found this:
./startWebLogic.sh: line 175: 53875 Killed ${JAVA_HOME}/bin/java ${JAVA_VM} ${MEM_ARGS} -Dweblogic.Name=${SERVER_NAME} -Djava.security.policy=${WL_HOME}/server/lib/weblogic.policy ${JAVA_OPTIONS} ${PROXY_SETTINGS} ${SERVER_CLASS}
There is no problem with free memory on the server.
Two possible explanations for this message are the Linux OOM killer and the WebLogic node manager.
You should be able to find evidence for the first in /var/log/messages (grep -i -n 'killed process' /var/log/messages). If so, add up the Xmx parameters of all the running java processes, add 35% for non-heap overhead, and see if that total tops the total amount of memory in the machine. For example, four JVMs with -Xmx2g add up to 8 GB, or roughly 10.8 GB with the 35% overhead, which would eventually trip the OOM killer on an 8 GB box. If the total is too high, tweak the Xmx parameters downwards.
The easier way to test for the second is to kill the node manager process, keep it down, and see if the problem persists (kill -9 `ps -ef | grep odeManager | awk '{print $2}'`). If the problem does not reoccur, check in the WebLogic admin console how the "Panic action" and "Failure action" are configured for each server, and set them to "No Action". In that case, also check the node manager and server logs to figure out why the node manager killed your managed server processes.
I use Amazon EC2 instances to perform some complex computation with the AWS Java SDK, and these computations can sometimes take very long. Hence, I need to kill the process running on the EC2 instance remotely from my Java code, so that I can reuse the same instance for another task without needing to stop and start it.
The problem with stop and start is that Amazon treats partial hours as complete hours, so my aim is to make my code more cost-effective.
I use SSH to connect to my EC2 instances, and that is how I pass the commands to be executed there. I doubt that disconnecting the SSH connection and connecting again would kill whatever process was running there.
In short, what I need is a way of doing Ctrl+C, but from within my Java code (without user intervention). Is that possible?
EDIT:
To clarify: the computation is executed by a separate tool installed on the Linux EC2 instance. I just pass the command to start this tool, and then the command line waits until it has finished and shows the results. In the manual usage scenario, I can press Ctrl+C on the Linux command line and get control of the command line back. In my case, I want to do a similar thing from Java code, if possible.
Use the SIGAR library to scan the process list on a machine and kill your process.
Use getProcList() to get all process IDs, getProcArgs() to get the full command line of each process, and kill() to kill the relevant process or processes.
So if a task is running and you want to kill it, SSH into the machine again and run your SIGAR-based process killer.
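A rough sketch of how those three calls fit together; the match string is a placeholder for whatever uniquely identifies your tool's command line, and signal 9 (SIGKILL) is just one choice (check the SIGAR javadoc for the exact signatures):

import org.hyperic.sigar.Sigar;
import org.hyperic.sigar.SigarException;

public class SigarKiller {
    public static void main(String[] args) throws SigarException {
        String needle = args[0]; // e.g. the tool's name or a unique argument
        Sigar sigar = new Sigar();
        try {
            for (long pid : sigar.getProcList()) { // every PID on the box
                try {
                    String cmdline = String.join(" ", sigar.getProcArgs(pid));
                    if (cmdline.contains(needle)) {
                        sigar.kill(pid, 9); // SIGKILL; consider SIGTERM (15) first
                    }
                } catch (SigarException e) {
                    // process already exited or access denied; skip it
                }
            }
        } finally {
            sigar.close();
        }
    }
}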
One dirty but functional way would be to kill your process via SSH, using the Java Runtime to execute it.
Something like
Runtime runtime = Runtime.getRuntime();
Process p = runtime.exec("ssh user@host command");
So in your case, if you know the program's PID (1234 for example):
Process p = runtime.exec("ssh user@host kill 1234");
Or if you know the program's name:
Process p = runtime.exec("ssh user@host pkill my_program_name_x64");
Note that you usually have to give absolute paths to the executables invoked via Runtime.
So you'll have to replace ssh with something like /bin/ssh or /usr/bin/ssh, and likewise for kill, pkill, or killall.
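Putting that together with an exit-code check, a minimal sketch might look like this (user@host, the absolute paths, and PID 1234 are placeholders, and key-based SSH authentication is assumed to be set up already):

public class RemoteKill {
    public static void main(String[] args) throws Exception {
        // The String[] form avoids shell-quoting problems with the arguments
        Process p = Runtime.getRuntime().exec(
                new String[] { "/usr/bin/ssh", "user@host", "/bin/kill", "1234" });
        int exit = p.waitFor(); // 0 means the remote kill command succeeded
        System.out.println("remote kill exited with " + exit);
    }
}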
I want to write a program that SSHes into remote boxes and runs jobs there if the remote computer is not actively being used. I'll be logging in as clusterJobRunner@remoteBox, and the other user will be logged in as someLocalUser@remoteBox.
Is there a way to see if a remote user is actively using the box using either Python or Java?
If the aim is to avoid bothering someLocalUser, you could consider running your job at a lower priority, as in the sketch below. See the documentation for nice.
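For example, if the job is launched over SSH from Java, the nice prefix can simply be part of the remote command. A sketch, where the job path is a placeholder:

public class NiceJob {
    public static void main(String[] args) throws Exception {
        // nice -n 19 gives the job the lowest priority, so it only gets CPU
        // time when someLocalUser's own processes are idle
        Process p = Runtime.getRuntime().exec(new String[] {
                "/usr/bin/ssh", "clusterJobRunner@remoteBox",
                "nice", "-n", "19", "/path/to/job" });
        System.exit(p.waitFor()); // propagate the job's exit status
    }
}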
I assume that the "actively used" part is the tricky part.
If it is sufficient to check whether or not another user is logged in, you can use the commands w and who, and perhaps last and lastlog. All these commands take several parameters, which you can look up in the manuals.
From Java / Python you can execute these commands and parse their output.
On the other hand, the tools w and who use the file utmp to get their information. A quick Google turned up nothing for Java, but for Python I found the library pyutmp, which you can use to read the utmp file directly without parsing command output.
Whether the user logged in and then went to lunch (possibly locking the screen) is a completely different story.
In Java you can execute the Linux users command with Runtime.exec(), grab the standard output, and read it into a parsable String. I don't think there is any OS-independent way to do this.
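A minimal sketch of that approach, assuming users is on the PATH (it prints the login names of everyone logged in, space-separated on one line):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Arrays;

public class WhoIsLoggedIn {
    public static void main(String[] args) throws Exception {
        Process p = Runtime.getRuntime().exec(new String[] { "users" });
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line = in.readLine(); // e.g. "alice bob bob"
            String[] users = (line == null)
                    ? new String[0] : line.trim().split("\\s+");
            System.out.println(Arrays.toString(users)); // duplicates mean multiple sessions
        }
        p.waitFor();
    }
}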
I second the answer by @Eero Aaltonen -- you should run your stuff under nice. A Linux computer can run at 100% CPU busy yet feel nice and fast for the user if the extra tasks are all under nice; the scheduler will only run the niced tasks when the main user's tasks are idle.
But if you want to figure out if the machine is being used, I suggest you look into the w command. Try man w at your prompt. The w command prints the load average for the machine, and a list of users and how much time they have been using (a combined time that includes any background tasks they are running, plus a time for their main task).
You need to define what "actively using" means to you. Which of the following defines "actively using":
A user is logged in
A user is actively engaging with the box, typing stuff into their ssh session.
A user is running a job
If #2 is what you are looking for, and users only log in using ssh, you could possibly hook into the sshd daemon and monitor whether it is receiving input from the ssh client.
For example if you attach strace to sshd of a logged in user, you can watch for reads from the ssh client. Here is an example that works for me:
$ for x in $(ps aux | grep sshd | grep "@pts" | sed -E 's/[^ ]+ +([0-9]*).*/\1/'); do {(sudo strace -p $x -eread 2>&1 | grep 'read(4' )&}; done
[1] 166396
[2] 166397
[3] 166399
[4] 166402
read(4, "\206\261\364\271\204\\\26S\3\"El\365W\352\35\375\242\205Qlu#$\2538\306\2777oW\230"..., 16384) = 36
read(4, "#s\2733d\355\17~\2550=\316`)3|^\340\f\252\242_\251\377d[l\221\210|z\37"..., 16384) = 36
read(4, "\5\214\261\25\322\222\242\221\313\314\4$\344\273\200\220a\233\345*\7\17\274\331\246\363f.\346\365\22\255"..., 16384) = 36
read(4, "\325\220<\0:\34^\235\346y\223\304\3061\212\203\373\371rD Rs\254oL*\260\22\234\372\27"..., 16384) = 36
read(4, "TXD\7\373~.\214\321\35\201\350\22\211\34J~m\\\270\364\243\267\261\207\323\224\314x\240i'"..., 16384) = 36
read(4, "\347\320\243\v/Z\213n\7\264\376\27\0340\30\364u!9\n\326\314)c\331\362\346\256\317E8\317"..., 16384) = 36
read(4, "\7\264\207\232\252xT\271\240Aq\210\21m\232l\306i\225\311\356\3
What you are seeing here is read calls on file descriptor 4 for all sshd forked processes for each user logged in using ssh. File descriptor 4 seems to be the one assigned to the socket connecting to the client (YMMV). Each line output means a user (or script) is sending something from the client to the server, and that seems to correspond to the user typing something into the shell.
You may be able to build on that.
I am doing an Eclipse plugin project to create an IDE for a particular language.
To run a program, I connect to the server and ask the user for the command, the type of connection, and so on.
After the program has started executing, the only way to stop it is by pressing Ctrl+C at the command prompt.
I run the program by sending the server the following command:
"probevue filename.e >output.txt"
When I give this command it runs, but I am not able to stop the program.
That is, when I press Ctrl+C the program should stop executing.
How shall I do this?
Thanks in advance.
After reading this article about probevue, it seems that you have started a dynamic tracing session. I think the program you're calling never terminates by itself.
http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp?topic=%2Fcom.ibm.aix.cmds%2Fdoc%2Faixcmds4%2Fprobevue.htm
You should first read up on how to invoke a command properly, e.g. http://www.javalobby.org/java/forums/t53333.html
Then you should think about how you would terminate the command if you didn't have the Ctrl+C option: find the process ID of the shell command you executed before (e.g. ps -ef | grep <name>) and terminate it with another shell command (e.g. kill -9 <pid>), as in the sketch below.
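A rough sketch of that idea from Java; user@host and the grep pattern are assumptions about your setup, and kill -2 sends SIGINT, the same signal Ctrl+C delivers:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class StopRemoteRun {
    public static void main(String[] args) throws Exception {
        // Find the PID of the running probevue command on the server
        String find = "ps -ef | grep 'probevue filename.e' | grep -v grep | awk '{print $2}'";
        Process p = Runtime.getRuntime().exec(
                new String[] { "/usr/bin/ssh", "user@host", find });
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String pid = in.readLine();
            if (pid != null) {
                // SIGINT first; escalate to kill -9 only if the program ignores it
                Runtime.getRuntime().exec(new String[] {
                        "/usr/bin/ssh", "user@host", "kill -2 " + pid.trim() }).waitFor();
            }
        }
        p.waitFor();
    }
}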
Hope this helps you find a solution.
Common Eclipse server plugins like JBoss Tools call the server scripts that are delivered with the server for starting and stopping. Deployment/undeployment and starting/stopping of applications are done via the management port.
So either you have a manageable server to control your program, or you could work around the problem by writing the scripts yourself.
I'm planning to do a heap dump with the JDK 1.5 jmap tool on a production WebLogic (10) instance.
There are actually 3 EARs (perhaps more; I don't really know, as I don't have access) deployed on this WebLogic instance.
Someone told me "WebLogic creates a JVM for each EAR".
Can someone confirm this?
With jmap I need the JVM PID as a parameter to do the heap dump...
Since I have 3 EARs, I guess I have 3 PIDs, so I wonder how to know which PID corresponds to which EAR's JVM?
Nope - each WebLogic server (or any Java process) runs in its own JVM with its own PID, so all your EARs will appear in the same heap dump.
If you have multiple WebLogic server instances running on the same machine, each will have a separate PID and a separate process.
As @josek says, you'll have one JVM per WebLogic server, so if all your EARs are under the same WebLogic server you'll only have one PID to dump. But you may still have multiple servers (maybe an admin server and a managed server, maybe other unrelated instances), so if you just do something like ps -ef | grep java (I'm assuming this is on Unix?) you could see a lot of PIDs, even if you can filter them to your WebLogic's JDK_HOME.
One way to identify which PID belongs to a particular server is to go to the <domains>/servers/<your server>/tmp directory, and in there run fuser -f <your server>.lok. This will list the PIDs of all the processes related to that server, one of which will be the JVM's java process. (There may be others, for JDBC etc.) One way to find just the java process (and I'm sure someone will point out another, better way!) is something like:
cd <domains>/servers/<your server>/tmp
ps -p "`fuser -f <your server>.lok 2>/dev/null`" | grep java
If each EAR is in its own server, I guess you'll have to look at config.xml to see which one you need.