Commons FTPClient hangs after uploading a large file

I'm using the FTPClient from Apache Commons Net 3.1 to do a simple file upload.
storeFile() works fine for smaller files (under 100MB), but when I try to upload something larger than 100MB, the upload finishes and then the call just hangs.
I've tried entering passive mode as others have suggested, but it doesn't fix the problem. I've tried multiple FTP servers with the same results, so I'm guessing it's not the host.
Here's the gist of what I'm doing:
ftpClient.connect(...);
ftpClient.login(...);
ftpClient.enterLocalPassiveMode();
boolean success = ftpClient.storeFile(...);
if(success)
...
The program hangs at line 4 (the storeFile() call) for large files, even though the file is uploaded successfully.

It's timing out. This link may help:
https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/ftp/FTPClient.html
Control channel keep-alive feature:
During file transfers, the data connection is busy, but the control connection is idle. FTP servers know that the control connection is in use, so won't close it through lack of activity, but it's a lot harder for network routers to know that the control and data connections are associated with each other. Some routers may treat the control connection as idle, and disconnect it if the transfer over the data connection takes longer than the allowable idle time for the router.
One solution to this is to send a safe command (i.e. NOOP) over the control connection to reset the router's idle timer. This is enabled as follows:
ftpClient.setControlKeepAliveTimeout(300); // set timeout to 5 minutes
This will cause the file upload/download methods to send a NOOP approximately every 5 minutes.
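The mechanism can be illustrated without the library. The following is a minimal sketch of the idea only, not Commons Net's actual implementation: a hypothetical NoopKeepAlive helper that a data-copy loop calls after each chunk, and that writes NOOP to the control connection once the configured interval has passed. The class name, the onChunk method and the injected timestamps are assumptions made for the example. A real client must also read the server's replies to those NOOPs from the control connection once the transfer ends, which is left out here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: sends NOOP on the control channel while data is flowing.
final class NoopKeepAlive {
    private final OutputStream control;   // output side of the control connection
    private final long intervalMillis;    // minimum gap between two NOOPs
    private long lastSentMillis;          // time of the last NOOP (or of the start)
    private int sent;                     // number of NOOPs written so far

    NoopKeepAlive(OutputStream control, long intervalMillis, long startMillis) {
        this.control = control;
        this.intervalMillis = intervalMillis;
        this.lastSentMillis = startMillis;
    }

    // Call after each chunk copied over the data connection.
    void onChunk(long nowMillis) {
        if (nowMillis - lastSentMillis < intervalMillis) {
            return; // interval not yet elapsed, nothing to send
        }
        try {
            control.write("NOOP\r\n".getBytes(StandardCharsets.US_ASCII));
            control.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        lastSentMillis = nowMillis;
        sent++;
    }

    int sentCount() {
        return sent;
    }

    public static void main(String[] args) {
        // A byte array stands in for the control socket's output stream.
        ByteArrayOutputStream fakeControl = new ByteArrayOutputStream();
        NoopKeepAlive keepAlive = new NoopKeepAlive(fakeControl, 300_000L, 0L);
        // Simulate a 15-minute transfer with one chunk per minute.
        for (long t = 0; t <= 900_000L; t += 60_000L) {
            keepAlive.onChunk(t);
        }
        System.out.println(keepAlive.sentCount() + " NOOPs sent");
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; real code would typically call System.currentTimeMillis() inside the copy loop.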

Related

FTP code causes port scan detection

I have a C# WinForms application that iteratively fetches files from an FTP server, parses each file for information, then returns the information from each file in a loop. After a certain number of FTP pulls into memory (I'm loading the text files into an array), McAfee sees my pulls as a port-scanning virus and disables the connection.
I thought delaying the thread (Thread.Sleep(int)) might keep my virus scanner from raising this error, but the tradeoff is efficiency. Does anyone know how fast I can run without triggering the port scan error? I'm not going outside the company firewall (both my laptop and the FTP server are within the firewall).

The reason for the warning is that a new connection between server and client is opened for each file transfer, and in general the port number increases by one each time this happens. From the outside this can look like a port scan to so-called personal firewalls, which leads to this effect.
There are a couple of possible solutions:
It seems that you use the so-called active mode for the data transfer, i.e. the FTP server is asked to open the data connection to your system. Switch to passive mode, where the connections are established by your client. The symptom of an incoming port scan should then no longer occur, which keeps your personal firewall quiet.
Whitelist your application or the peer's server in your personal firewall to prevent it from blocking things.
Change the setting of your FTP client (a Java library you use in your program, or why is there a Java tag on this question?) to use the same port for the data transfer in active mode. Because it's always the same port, this should keep your personal firewall quiet as well.

FTP: When to send the code 226 after initiating a file transfer?

From an FTP server's perspective, if a client requests a file through the RETR command, the server creates a data connection (socket) to the client on the specified port and starts the transfer by writing to the output stream. The server is coded (in Java) so that after the write to the socket is complete, the output stream is flushed and then the socket is closed. After this the code "226" is sent to the client on the control channel.
Since the connection is over a very slow network, the 226 message reaches the client before the actual data transfer is complete. This is a tricky situation: the client code cannot be changed, and the server has to make sure that the 226 is sent only after the client has received the data.
I searched the internet and found a few suggestions, but I'm not sure which one is the standard approach:
1. Use the setSoLinger() method to turn on SO_LINGER and set a timeout.
2. Introduce a delay after writing each byte to the socket (performance will suffer on fast connections).
Are there any options other than the above to solve this? Does anyone know the standard followed for sending 226 in Linux/Solaris/Windows FTP servers?
I found a similar thread on Stack Overflow, "When should 226 be sent from the FTP server?", but could not find much there that relates to my question.
Help here is really appreciated. Thanks.

Definitely do not go with the delay. The only thing I can think of is to build a proxy layer that intercepts the acknowledgement code, checks for the file, and then forwards the code to the application, similar to what Telerik Fiddler does as an application.
I used the same concept before with the JMS acknowledge mode when delivering messages to the server, where I had to implement the same thing.
Wish you all the luck, my friend.
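For reference, option 1 from the question (SO_LINGER) could be sketched roughly as follows. This is an illustration under assumptions, not a recommendation from the answer above: the class and method names are made up, the 30-second linger value is arbitrary, and the loopback "client" only stands in for a real FTP client. With SO_LINGER enabled, close() on most platforms blocks until the queued data has been sent and acknowledged or the timeout expires. Note that this reflects acknowledgement by the peer's TCP stack, not proof that the client application has read the data.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch of closing a data connection with SO_LINGER enabled.
final class LingeringDataClose {

    // Writes the payload, then closes; close() may block up to lingerSeconds.
    static void sendAndClose(Socket data, byte[] payload, int lingerSeconds) {
        try {
            data.setSoLinger(true, lingerSeconds);
            OutputStream out = data.getOutputStream();
            out.write(payload);
            out.flush();
            data.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loopback demo: returns the number of bytes the stand-in client received.
    static int demo(int size) {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            int[] received = {0};
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(listener.getInetAddress(), listener.getLocalPort());
                     InputStream in = s.getInputStream()) {
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        received[0] += n; // count bytes until the sender closes
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            client.start();
            Socket data = listener.accept();
            sendAndClose(data, new byte[size], 30);
            // Only at this point would the server write the 226 reply
            // on the control connection.
            client.join();
            return received[0];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("client received " + demo(1_000_000) + " bytes");
    }
}
```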

Shall I use persistent connections to upload files in intranet?

On the intranet, the network is good
Server A will send lots of files to server B over an HTTP service at the same time
The protocol is HTTP 1.1, which uses persistent connections by default
[update] A connection pool holds 100 connections
[update] One connection sends one file at a time
[update] A connection is not closed (persistent connection) and is reused to send the next file
Each file has size of 7K to 30K
Question:
Under the above conditions, will persistent connections perform better than non-persistent connections?
I ask because we found that the connections would block for a very long time when uploading files. I suggested using non-persistent connections, since I think they're more stable, but my colleague insists on using persistent connections, because he thinks they perform better.
UPDATE
See the updated question, thank you ~

In HTTP 1.1, a persistent connection allows pipelining but not parallelism (see RFC 2616). That means that if you share one connection among 100 threads and each one sends a file, you will send those 100 files one at a time (in some order) and receive the response for each file in the order it was sent. You get no advantage out of sending on 100 threads, because they're just lining up to send and receive one at a time.
You may be able to send faster using multiple connections, because that would allow them to actually run in parallel. But this depends on lots of other factors. Depending on your network, setting up and tearing down 100 connections may be slower than pipelining through one connection. Also, the server may not appreciate you opening 100 separate connections. Worse, the server may reject you only some of the time, which is a big headache.
I suggest taking the middle road: open, say, 5 persistent connections (using only 5 threads) and send 20 documents down each connection. Apache HttpClient has a BasicConnPool for this sort of thing, though it may be too basic for your needs.
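The middle road described above might look roughly like this. It is a sketch under assumptions rather than a ready-made uploader: BatchedUpload, partition and uploadAll are hypothetical names, and the sender callback stands in for whatever code actually posts one file over a worker's persistent connection. Each worker thread handles one batch sequentially, so 100 files on 5 workers means 20 files per connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: a few workers, each sending its own batch in sequence.
final class BatchedUpload {

    // Splits items into `parts` contiguous batches of near-equal size.
    static <T> List<List<T>> partition(List<T> items, int parts) {
        List<List<T>> batches = new ArrayList<>();
        int n = items.size();
        for (int i = 0; i < parts; i++) {
            int from = (int) ((long) n * i / parts);
            int to = (int) ((long) n * (i + 1) / parts);
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    // Runs one worker per batch; `sender` is assumed to send a single file
    // over that worker's connection. Returns the number of files sent.
    static <T> int uploadAll(List<T> files, int connections, Consumer<T> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(connections);
        AtomicInteger sent = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : partition(files, connections)) {
                pending.add(pool.submit(() -> {
                    for (T file : batch) {
                        sender.accept(file);
                        sent.incrementAndGet();
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the batch and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return sent.get();
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            files.add("file-" + i);
        }
        // The no-op sender is a placeholder for a real HTTP upload.
        int sent = uploadAll(files, 5, file -> { });
        System.out.println(sent + " files sent over 5 workers");
    }
}
```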

Handling "Connection reset by peer" error in an FTP client

I have a Java program that calculates some stats daily and uploads the file on a server through FTP. However, I get "Connection reset by peer" errors way too often.
Since I cannot change the server configurations, what are the recommended ways to handle such types of errors? How can I make sure that the whole file is transferred to the server?

The message "Connection reset by peer" means the server closed the connection. The cause could be a TCP timeout, a lack of disk space, etc.
Try transferring the file over FTP without Java, using a command-line utility. If the same problem occurs, it is definitely not the Java program.
Make sure the network is not sensitive to the size of the file(s) being transferred.
Make sure the server is not blocking connections from your client after it has already made "N" previous connections or after a certain length of time, e.g. 20 minutes.
See if your client can establish a persistent TCP connection using another protocol, such as SSH. If the problem also occurs with the other protocol, it's likely to be the network.
If you find the issue is caused by a timeout that only happens when your connection has been idle too long, then check this URL:
FTP: "Connection reset by peer"

polling a HTTP server from J2ME client

I have a J2ME app running on my mobile phone (the client).
I would like to open an HTTP connection to the server and keep polling for updated information on the server.
Every poll uses up GPRS bytes and would turn out expensive in the long run, as GPRS billing is based on packets sent and received.
Is there a byte-efficient way of polling using the HTTP protocol?
I have also heard of long polling, but I am not sure how it works or how efficient it would be.
Actually, the preferred way would be for the server to tell the phone app that new data is ready, so that polling would not be needed at all. However, I don't know of any such techniques, especially in J2ME.

If you want to solve this problem using HTTP only, long polling would be the best way. It's fairly easy. First you need to set up a URL on the server side for notifications (e.g. http://example.com/notify), and define a notification protocol. The protocol can be as simple as some text lines where each line is an event. For example,
MSG user1
PHOTO user2 album1
EMAIL user1
HEARTBEAT 300
The polling thread on the phone works like this:
1. Make an HTTP connection to the notification URL. In J2ME, you can use GCF HttpConnection.
2. The server will block if there are no events to push.
3. If the server responds, read each line, spawn a new thread to notify the application, and loop back to step 1.
4. If the connection closes for any reason, sleep for a while and go back to step 1.
You have to pay attention to the following implementation details:
Tune HTTP timeouts on both client and server. The longer the timeout, the more efficient. Timed out connection will cause a reconnect.
Enable HTTP keepalive on both the phone and the server. TCP's 3-way handshake is expensive in GPRS term so try to avoid it.
Detect stale connections. In mobile environments, it's very easy to get stale HTTP connections (connection is gone but polling thread is still waiting). You can use heartbeats to recover. Say heartbeat rate is 5 minutes. Server should send a notification in every 5 minutes. If no data to push, just send HEARTBEAT. On the phone, the polling thread should try to close and reopen the polling connection if nothing received for 5 minutes.
Handle connectivity errors carefully. Long polling doesn't work well when there are connectivity issues. If not handled properly, it can be the deal-breaker. For example, you can waste lots of packets on step 4 if the sleep is not long enough. If possible, check GPRS availability on the phone and put the polling thread on hold when GPRS is not available to save battery.
Server cost can be very high if not implemented properly. For example, if you use Java servlet, every running application will have at least one corresponding polling connection and its thread. Depending on the number of users, this can kill a Tomcat quickly :) You need to use resource efficient technologies, like Apache Mina.
I was told there are other more efficient ways to push notifications to the phone, like using SMS and some IP-level tricks. But you either have to do some low level non-portable programming or run into risks of patent violations. Long polling is probably the best you can get with a HTTP only solution.
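The line-based notification protocol and the heartbeat check described above can be sketched in plain Java. This is an illustration only, with assumptions called out: NotificationLines, Event, parse and isStale are hypothetical names, the number after HEARTBEAT is assumed to be an interval in seconds, and the code targets standard Java rather than J2ME (where the collections and string APIs used here are not all available).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of parsing a long-poll response and detecting staleness.
final class NotificationLines {

    // One event per line: a type followed by zero or more arguments.
    static final class Event {
        final String type;
        final List<String> args;

        Event(String type, List<String> args) {
            this.type = type;
            this.args = args;
        }
    }

    // Parses a response body such as "MSG user1\nHEARTBEAT 300\n".
    static List<Event> parse(String body) {
        List<Event> events = new ArrayList<>();
        for (String line : body.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] parts = trimmed.split("\\s+");
            List<String> args = Arrays.asList(parts).subList(1, parts.length);
            events.add(new Event(parts[0], Collections.unmodifiableList(args)));
        }
        return events;
    }

    // True when nothing has arrived for longer than the heartbeat interval,
    // in which case the polling thread should close and reopen the connection.
    static boolean isStale(long lastReceivedMillis, long nowMillis, int heartbeatSeconds) {
        return nowMillis - lastReceivedMillis > heartbeatSeconds * 1000L;
    }

    public static void main(String[] args) {
        String body = "MSG user1\nPHOTO user2 album1\nEMAIL user1\nHEARTBEAT 300\n";
        for (Event e : parse(body)) {
            System.out.println(e.type + " " + e.args);
        }
    }
}
```

In practice the stale check would probably allow some margin beyond the heartbeat interval, so that a slightly delayed heartbeat does not trigger a needless reconnect.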

I don't know exactly what you mean by "polling". Do you mean something like IMAP IDLE?
There, a connection stays open, so there is no overhead from building up the connection again and again. As stated, another possible solution is the HEAD method of an HTTP request (I forgot it, thanks!).
Look into this tutorial for the basic of HTTP Connections in J2ME.
Pushing data to an application/device without Push Support (like a Blackberry) is not possible.

The HEAD request is the method that HTTP provides for checking whether a page has changed. Browsers and proxy servers use it to check whether a page has been updated without consuming much bandwidth.
In HTTP terms, a HEAD request is the same as a GET without the response body. I assume this would be only a couple of hundred bytes at most, which looks acceptable if your polls are not very frequent.

The best way to do this is to use a socket connection. Many applications, such as Gmail, use them.
