My English is about at the level of a three-year-old, so please bear with me.
Recently I built a website that does a lot of file access.
Unfortunately, my Tomcat gave me the following error message:
Fatal: Socket accept failed
java.net.SocketException: Too many open files
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
at java.net.ServerSocket.implAccept(ServerSocket.java:462)
at java.net.ServerSocket.accept(ServerSocket.java:430)
at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:61)
at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:352)
at java.lang.Thread.run(Thread.java:662)
org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
This happens when I send many requests in a short time; I guess too many streams are opened for this job.
Does anybody know how to solve this problem?
My environment is Tomcat 6.0.35, Java 1.6.0_31, CentOS 5.
Ah, this only happens on Linux.
Thank you.
Check the limit allocated by the system (the last number):
cat /proc/sys/fs/file-nr
Allocate more if needed:
Edit /etc/sysctl.conf and add/change fs.file-max = xxxxx
Apply the changes: sysctl -p
Check: cat /proc/sys/fs/file-max
You may also have per-user limits set.
It's quite possible that you are exceeding the default maximum number of file descriptors.
Explanation and how to increase the values:
http://honglus.blogspot.com.au/2010/08/tune-max-open-files-parameter-in-linux.html
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
Related
We have four LPARs running one Java instance each.
They do a lot of read/write operations to a shared NFS server. When the NFS server goes down abruptly, all the threads that were trying to read an image on each of these four servers get into a hung state.
The trace below shows this (the process is a WebSphere Application Server process).
While we are working on the issues on the NFS server side, is there a way to avoid this from the code side?
If the underlying connection is TCP based (which I assume it is), shouldn't the TCP read/connect timeout take care of this? Basically I want the thread to be returned to the pool instead of waiting indefinitely for the other side to respond.
Or is this something which should be taken care of by the NFS client on the source machine? Some config setting on the client side pertaining to NFS (since FileInputStream.open would not know whether the file it is trying to read is on the local server or in a shared folder on the NFS server)?
Thanks in advance for your answers :)
We are using Java 1.6 on WAS 7.0.
[8/2/15 19:52:41:219 GST] 00000023 ThreadMonitor W WSVR0605W: Thread
"WebContainer : 77" (00003c2b) has been active for 763879 milliseconds
and may be hung. There is/are 110 thread(s) in total in the server
that may be hung.
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:113)
at java.io.FileInputStream.<init>(FileInputStream.java:73)
at org.emarapay.presentation.common.util.ImageServlet.processRequest(Unknown Source)
at org.emarapay.presentation.common.util.ImageServlet.doGet(Unknown Source)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:718)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:831)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1694)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1635)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:113)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain._doFilter(WebAppFilterChain.java:80)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:908)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:965)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:508)
at com.ibm.ws.webcontainer.servlet.ServletWrapperImpl.handleRequest(ServletWrapperImpl
Check this solution: https://stackoverflow.com/a/9832633/1609655
You can do something similar for reading the image. Basically, wrap the read call in a Java Future and signal a thread kill when the operation does not finish in a specified amount of time.
It might not be perfect, but it will at least prevent your server from being stuck forever.
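A rough sketch of that idea, assuming Java 6 (no try-with-resources); the class and method names are made up for illustration. Note that cancel(true) can only interrupt the worker thread, so a read that is blocked in native code on a hard NFS mount may still not return until the server comes back:

import java.io.FileInputStream;
import java.io.IOException;
import java.util.concurrent.*;

public class TimedImageRead {
    // One executor reused across requests; size it to your expected concurrency.
    private static final ExecutorService EXECUTOR = Executors.newFixedThreadPool(10);

    // Reads a file into a byte array, giving up after the supplied timeout.
    public static byte[] readWithTimeout(final String path, long timeout, TimeUnit unit)
            throws Exception {
        Future<byte[]> future = EXECUTOR.submit(new Callable<byte[]>() {
            public byte[] call() throws IOException {
                FileInputStream in = new FileInputStream(path);
                try {
                    java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return out.toByteArray();
                } finally {
                    in.close();
                }
            }
        });
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupt the reading thread
            throw e;
        }
    }
}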
This was the response from shodanshok on Server Fault, and it helped us.
This probably depends on how the NFS share is mounted. By default, NFS shares are mounted with the "hard" parameter, meaning that accesses to a non-responding NFS share will block indefinitely.
You can change the client-side mount point, adding one of the following parameters (I'm using the Linux man page here; your specific options may be a little different):
soft: if the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.
intr: selects whether to allow signals to interrupt file operations on this mount point. Using the intr option is preferred to using the soft option because it is significantly less likely to result in data corruption. FYI, this was deprecated in Linux kernel 2.6.25+.
Source: Linux nfs man page
http://martinfowler.com/bliki/CircuitBreaker.html
This seems to be the perfect solution for this problem (and similar ones). The idea is to wrap the call in another object which will prevent further calls (based on how you design this object to handle the situation) to the failed service.
E.g. when an external service becomes unresponsive, threads slowly go into a hung state. Instead, it would be good to have a threshold that prevents threads from getting into that state. What if we could configure, say, "do not attempt to connect to the external service if it has not responded, or is still waiting to respond, for the previous 30 requests"? In that case the 31st request would directly return an error to the customer trying to access the report (or send an error mail to the team), but at least the 31st thread WILL NOT BE STUCK waiting; instead it will be used to serve other requests from other functionalities.
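For illustration, a minimal sketch of the idea in Java (real implementations such as Hystrix or Akka's CircuitBreaker also add a timed "half-open" state that retries automatically; this class is made up for the example):

import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal circuit breaker: after `threshold` consecutive failures,
// further calls fail fast until reset() is invoked.
public class SimpleCircuitBreaker {
    private final int threshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger(0);

    public SimpleCircuitBreaker(int threshold) {
        this.threshold = threshold;
    }

    public <T> T call(Callable<T> protectedCall) throws Exception {
        if (consecutiveFailures.get() >= threshold) {
            // Circuit is open: do not touch the unresponsive service at all.
            throw new IllegalStateException("Circuit open: service marked unavailable");
        }
        try {
            T result = protectedCall.call();
            consecutiveFailures.set(0); // a success closes the circuit again
            return result;
        } catch (Exception e) {
            consecutiveFailures.incrementAndGet();
            throw e;
        }
    }

    public void reset() {
        consecutiveFailures.set(0);
    }
}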
References:
http://martinfowler.com/bliki/CircuitBreaker.html
http://doc.akka.io/docs/akka/snapshot/common/circuitbreaker.html
http://techblog.netflix.com/2011/12/making-netflix-api-more-resilient.html
https://github.com/Netflix/Hystrix
I have a socket application written with Apache MINA, running on Linux.
This time I got a "too many open files" error in the log files, with this code:
IoAcceptor acceptor = new NioSocketAcceptor();
acceptor.getFilterChain().addLast("logger", new LoggingFilter());
acceptor.getFilterChain().addLast("codec", new ProtocolCodecFilter(new TextLineCodecFactory(Charset.forName("UTF-8"))));
acceptor.setCloseOnDeactivation(true);
acceptor.setHandler(new ChatHandler());
acceptor.getSessionConfig().setReadBufferSize(2);
acceptor.getSessionConfig().setIdleTime(IdleStatus.BOTH_IDLE, 10);
acceptor.bind(new InetSocketAddress(15000));
When I tested it with 2-3 clients at the same time, I got this error:
Caused by: java.io.IOException: Too many open files
at sun.nio.ch.IOUtil.makePipe(Native Method) ~[?:?]
at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65) ~[?:?]
at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36) ~[?:?]
at java.nio.channels.Selector.open(Selector.java:227) ~[?:?]
at org.apache.mina.transport.socket.nio.NioProcessor.<init>(NioProcessor.java:59) ~[MC.jar:?]
I have googled it, but I don't know what the risk of this exception is. Will this error make my application fail at a transaction or not?
If yes, can someone explain it, and how do I solve it?
It can be quite a problem. This error (EMFILE, or ENFILE for the system-wide limit) means that either you, or the OS, has too many open file descriptors.
stdin, stdout and stderr are file descriptors; any open file is a file descriptor; any socket you create is a file descriptor as well.
To know your limit of open files, as a user, do:
ulimit -n
To know the limit of the OS, do:
cat /proc/sys/fs/file-max
Generally, it is the user limit which is the problem.
You can try and raise the limit using:
ulimit -n <a greater number here>
but more than likely it won't work. What you need to do is edit /etc/security/limits.conf or, preferably, create a new file in /etc/security/limits.d with a relevant name and add these two lines:
theuser soft nofile <somelargenumber>
theuser hard nofile <somelargenumber>
Note that in order for these limits to take effect, the user must log out and log in again; if it is a user dedicated to a system service, then restarting this service will do.
Additionally, if you know the PID of the process running your application, you can see the number of currently open file descriptors by issuing the command:
ls /proc/<thepid>/fd|wc -l
If the kernel limit is the problem (very unlikely, but who knows) then you'll have to edit /etc/sysctl.conf, change the fs.file-max entry, and then run sysctl -p as root.
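If you would rather check the file descriptor count from inside the JVM instead of via /proc, something along these lines works on Sun/Oracle JVMs (it relies on the non-portable com.sun.management API, so treat it as a sketch):

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdMonitor {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unixOs = (UnixOperatingSystemMXBean) os;
            // Same information as counting the entries in /proc/<pid>/fd
            System.out.println("open fds: " + unixOs.getOpenFileDescriptorCount());
            System.out.println("max fds:  " + unixOs.getMaxFileDescriptorCount());
        }
    }
}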
I suspect you are not invoking the dispose method of the connector.
This method shuts down the business threads by invoking the ExecutorService's shutdown method.
It also sets the internal "disposed" flag to mark the connector as stopping; the worker threads stop based on this flag.
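Roughly like this, assuming MINA 2.x; the shutdown hook and the placeholder handler here are just for illustration, and the same dispose() call applies to an NioSocketConnector on the client side:

import java.io.IOException;
import java.net.InetSocketAddress;
import org.apache.mina.core.service.IoAcceptor;
import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.transport.socket.nio.NioSocketAcceptor;

public class AcceptorLifecycle {
    public static void main(String[] args) throws IOException {
        final IoAcceptor acceptor = new NioSocketAcceptor();
        acceptor.setHandler(new IoHandlerAdapter()); // stand-in for your ChatHandler
        acceptor.bind(new InetSocketAddress(15000));

        // On shutdown: stop accepting new connections and release the
        // selectors, processor threads and their file descriptors.
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                acceptor.unbind();
                acceptor.dispose(true); // block until the internal executor terminates
            }
        });
    }
}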
Whenever our application handles a large number of HTTP requests, the error "too many open files" appears in the logs, and I am sure the error is connected to the sockets, each of which creates a new file descriptor instance. See the error below:
[java.net.Socket.createImpl(Socket.java:447), java.net.Socket.getImpl(Socket.java:510),
java.net.Socket.setSoTimeout(Socket.java:1101),
org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:122),
org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148),
org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:149),
org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121),
org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:561), org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:415),
I saw on the internet that I should use
EntityUtils.consume(entity);
httpClient.getConnectionManager().shutdown();
The errors were reduced, but not by much. I have a feeling that my consuming of resources is not enough to release all of the file descriptors. Right now I am looking for answers other than changing the ulimit, because the application will be deployed on other servers that we can't reconfigure if changes are needed.
Since you are using Linux, there are some configurations which might be changed to solve the problem. First of all, what is happening? After you close a socket in Java, the operating system puts it into the TIME_WAIT state. That is because something might still be sent to the socket, so the OS keeps it open for some time to make sure all the packets are received (basically, packets which are still on the way, and for which some kind of response is expected, must still arrive).
As far as I know it is not an optimal solution for this problem, but it works. You should set tcp_tw_recycle and tcp_tw_reuse to 1 to allow fast re-use of sockets by the OS. How this is done depends on the Linux version you have. For instance, on Fedora I could do something like:
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
to set those temporarily (until reboot). It is up to you to find out how to set those permanently, because I'm not very strong at Linux administration.
EDIT: I'm mentioning it once again: this is not an optimal solution. Try to think about what can be changed in the application before messing with the OS configuration.
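For example, an application-side pattern along these lines (just a sketch, not your actual code; it assumes HttpClient 4.2.x, which is roughly what your stack trace suggests) keeps one pooled client alive for the whole application and releases each response, instead of shutting the connection manager down after every request:

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.PoolingClientConnectionManager;
import org.apache.http.util.EntityUtils;

public class PooledClientExample {
    // One shared client backed by a pooled connection manager; shut the
    // manager down only at application exit, not after every request.
    private static final HttpClient CLIENT =
            new DefaultHttpClient(new PoolingClientConnectionManager());

    public static String fetch(String url) throws Exception {
        HttpGet get = new HttpGet(url);
        try {
            HttpResponse response = CLIENT.execute(get);
            HttpEntity entity = response.getEntity();
            // Fully consuming the entity returns the connection to the pool
            // instead of leaving the socket (and its file descriptor) open.
            return entity != null ? EntityUtils.toString(entity) : null;
        } finally {
            get.releaseConnection(); // safety net if something went wrong above
        }
    }
}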
We've been having this problem for a long time and still cannot find out where the problem is. Our application uses RTMP for video streaming, and if the web client cannot connect it falls back to RTMPT (RTMP over HTTP). This causes the video to freeze after a couple of seconds of playback.
I have already found some forums where people seem to be having the same issue, but none of the proposed solutions worked. One suggestion was to turn off video recording, but that didn't work either. I have also read that it seems to be a thread problem in Red5, but before hacking into Red5 I would like to know whether somebody has a patch or anything else that fixes this.
One more thing: we've been testing this on Macs, if that counts for anything. Thank you very much in advance.
The very first thing you should look at is the Red5 error log.
Also, Red5 occasionally produces output that might not be in the log but just on plain stdout.
There is a red5-debug.sh or red5-highperf.sh that outputs/logs everything to a file called std.out.
You should use those logs to start your analysis. You will probably already see something in them, for example exceptions like:
broken pipe
connection closed due to too long xxx
handshake error
encoding issue in packet xyz
unexpected connection closed
call xyz cannot be handled
too many connections
heap space error
too many open files
Some of them are operating-system specific, like the number of open files; some are not.
Also, it is very important that you are using the latest revision of Red5 and not an old version. You did not tell us what version you are using.
However, just from symptoms like video freezes, occasional disconnects or similar, you won't be able to start a real analysis of the problem.
Sebastian
Were you connected to the server when the video froze, or after that? I am not sure, but I think the connection closed, which caused the stream to freeze. Just check in the Red5 access logs whether there are any entries for 'idle' packets (possibly after one or more 'send' packets, and more than one in number).
Another thing you could have a look at is your web server log files, because RTMPT runs over HTTP. I once had a problem with my anti-DDoS program on the server. RTMPT makes many connections one after another, and these TCP connections remain alive for about 4 minutes by default. You can easily end up with hundreds of connections at the same time, which is seen as a DDoS attack, and as a result the client's IP address gets banned.
Hi, I have created a server and client program using Java NIO.
My server and client are on different computers; the server runs Linux and the client runs Windows. When I create 1024 sockets on the client, the client machine copes, but on the server I get a "too many open files" error.
So how can I open 15000 sockets on the server without any error?
Or is there any other way to connect with 15000 clients at the same time?
Thanks,
Bapi
OK, questioning why he needs 15K sockets is a separate discussion.
The answer is that you are hitting the user's file descriptor limit.
Log in as the user that will run the listener and do $ulimit -n to see the current limit.
It is most likely 1024.
As root, edit the file /etc/security/limits.conf
and set:
{username} soft nofile 65536
{username} hard nofile 65536
65536 is just a suggestion, you will need to figure that out from your app.
Log off, log in again and re-check with ulimit -n to see that it worked.
You are probably going to need more than 15K fds for all that. Monitor your app with lsof.
Like this:
$lsof -p {pid} <- lists all file descriptors
$lsof -p {pid} | wc -l <- count them
By the way, you might also hit the system wide fd limit, so you need to check it:
$cat /proc/sys/fs/file-max
To increase that one, add this line to /etc/sysctl.conf:
#Maximum number of open FDs
fs.file-max = 65535
Why do you need to have 15000 sockets on one machine? Anyway, look at ulimit -n
If you're going to have 15,000 clients talking to your server (and possibly 200,000 in the future according to your comments) then I suspect you're going to have scalability problems servicing those clients once they're connected (if they connect).
I think you may need to step back and look at how you can architect your application and/or deployment to successfully achieve these sort of numbers.