Java - Download zip file from URL

I have a problem downloading a zip file from a URL.
It works fine in Firefox, but my app gets a 404.
Here is my code:
URL url = new URL(reportInfo.getURI().toString());
HttpsURLConnection con = (HttpsURLConnection) url.openConnection();
// Check for errors
int responseCode = con.getResponseCode();
InputStream inputStream;
if (responseCode == HttpURLConnection.HTTP_OK) {
inputStream = con.getInputStream();
} else {
inputStream = con.getErrorStream();
}
OutputStream output = new FileOutputStream("test.zip");
// Process the response
BufferedReader reader;
String line = null;
reader = new BufferedReader(new InputStreamReader(inputStream));
while ((line = reader.readLine()) != null) {
output.write(line.getBytes());
}
output.close();
inputStream.close();
Any ideas?

In Java 7, the easiest way to save a URL to a file is:
try (InputStream stream = con.getInputStream()) {
Files.copy(stream, Paths.get("test.zip"));
}

As for why you're getting a 404, that's hard to tell. You should check the value of url, which, as greedybuddha says, you should get via URI.toURL(). But it's also possible that the server is using a user-agent check or something similar to determine whether or not to give you the resource. You could try something like cURL to fetch the resource programmatically without having to write any code yourself.
However, there's another problem looming. It's a zip file. That's binary data. But you're using InputStreamReader, which is designed for text content. Don't do that. You should never use a Reader for binary data. Just use the InputStream:
byte[] buffer = new byte[8 * 1024]; // Or whatever
int bytesRead;
while ((bytesRead = inputStream.read(buffer)) > 0) {
output.write(buffer, 0, bytesRead);
}
Note that you should close the streams in finally blocks, or use the try-with-resources statement if you're using Java 7 or later.
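The two points above (copy raw bytes, and close both streams reliably) can be combined into one small sketch. The class name BinaryCopy and the helpers copy and save are made up for illustration, and an in-memory byte array stands in for con.getInputStream() so the example runs without a network connection:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BinaryCopy {

    // Copies every byte from in to out without any text decoding.
    // Returns the number of bytes copied.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8 * 1024];
        long total = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        return total;
    }

    // Saves the stream to target; try-with-resources closes both streams,
    // even if the copy fails halfway through.
    public static long save(InputStream source, Path target) throws IOException {
        try (InputStream in = source;
             OutputStream out = Files.newOutputStream(target)) {
            return copy(in, out);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for con.getInputStream(): bytes that are not valid text.
        byte[] payload = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, 0x00, 0x0D, 0x0A};
        Path target = Files.createTempFile("test", ".zip");
        long copied = save(new ByteArrayInputStream(payload), target);
        System.out.println(copied + " bytes written to " + target);
        Files.delete(target);
    }
}
```

In a real download you would pass con.getInputStream() as the source, ideally after checking that the response code is HTTP_OK, so that an error page is not saved as test.zip.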


Java gzip pdf from url to file - result gives minor character mismatch

I'm trying to download a gzipped PDF from a URL, unpack it, and write it to a file. It almost works, but some characters in the PDF produced by my code do not match the real PDF. I checked this by opening both PDFs in Notepad.
Here are some short text samples from the two PDFs.
From my code:
’8 /qªMiUe°Ä[H`ðKíulýªäqvA®v8;xÒhÖßÚ²ý!Æ¢ØK$áýçpF[¸t1#y$93
From the real pdf:
ƒ8 /qªMiUe°Ä[H`ðKíulªäqvA®—v8;ŸÒhÖßÚ²!ˆ¢ØK$áçpF[¸t1#y$‘‹3
Here is my code:
public void readPDFfromURL(String urlStr) throws IOException {
URL myURL = new URL(urlStr);
HttpURLConnection urlCon = (HttpURLConnection) myURL.openConnection();
urlCon.setRequestProperty("Accept-Encoding", "gzip");
urlCon.setRequestProperty("Content-Type", "application/pdf");
urlCon.setRequestMethod("GET");
urlCon.setDoInput(true);
urlCon.connect();
Reader reader;
if ("gzip".equals(urlCon.getContentEncoding())) {
reader = new InputStreamReader(new GZIPInputStream(urlCon.getInputStream()));
}
else {
reader = new InputStreamReader(urlCon.getInputStream());
}
FileOutputStream fos = new FileOutputStream("document.pdf");
int data = reader.read();
while(data != -1) {
char c = (char) data;
fos.write(c);
data = reader.read();
}
fos.close();
reader.close();
}
I can open the PDF, and it has the correct number of pages, but the pages are all blank.
My initial thought is that it might have something to do with character encoding, such as a setting in my Java project, IntelliJ, etc.
Alternatively, I don't actually need to put it in a file. I just need to download it so I can upload it somewhere else. However, the PDF should of course work in either case. I'm only writing it to an actual file to check that it works.
Thank you for your help!
Here is my new implementation, which solves my problem:
public void readPDFfromURL(String urlStr) throws IOException {
URL myURL = new URL(urlStr);
HttpURLConnection urlCon = (HttpURLConnection) myURL.openConnection();
urlCon.setRequestProperty("Accept-Encoding", "gzip");
urlCon.setRequestProperty("Content-Type", "application/pdf");
urlCon.setRequestMethod("GET");
urlCon.setDoInput(true);
urlCon.connect();
GZIPInputStream reader = new GZIPInputStream(urlCon.getInputStream());
FileOutputStream fos = new FileOutputStream("document.pdf");
byte[] buffer = new byte[1024];
int len;
while((len = reader.read(buffer)) != -1){
fos.write(buffer, 0, len);
}
fos.close();
reader.close();
}
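One caveat with the implementation above: it always wraps the response in a GZIPInputStream, so it fails with a ZipException if the server answers without gzip encoding. A minimal sketch that keeps the Content-Encoding check from the original question while staying on byte streams might look like this. The class GzipBody and its helpers are hypothetical names, and in-memory data replaces urlCon.getInputStream() so it runs offline:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBody {

    // Wraps the body in a GZIPInputStream only when the server reports gzip.
    // contentEncoding is the value of urlCon.getContentEncoding() and may be null.
    public static InputStream decode(InputStream body, String contentEncoding) throws IOException {
        return "gzip".equalsIgnoreCase(contentEncoding) ? new GZIPInputStream(body) : body;
    }

    // Reads the whole stream as raw bytes and closes it.
    public static byte[] readAll(InputStream in) throws IOException {
        try (InputStream input = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int len;
            while ((len = input.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
            return out.toByteArray();
        }
    }

    // Test helper: compresses bytes the way a gzip-encoding server would.
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(raw);
        }
        return compressed.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "%PDF" followed by a few non-text bytes.
        byte[] pdfLike = {0x25, 0x50, 0x44, 0x46, (byte) 0x83, (byte) 0x97, (byte) 0x91};
        byte[] viaGzip = readAll(decode(new ByteArrayInputStream(gzip(pdfLike)), "gzip"));
        byte[] viaPlain = readAll(decode(new ByteArrayInputStream(pdfLike), null));
        System.out.println(Arrays.equals(pdfLike, viaGzip));  // true
        System.out.println(Arrays.equals(pdfLike, viaPlain)); // true
    }
}
```

Because the result is a byte array, it can also be uploaded elsewhere directly, without writing document.pdf to disk first.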

reading remote csv file without downloading

I have a requirement to read a large remote CSV file line by line (basically streaming). After each read I want to persist the record in a database. Currently I do this with the code below, but I am not sure whether it downloads the complete file and keeps it in JVM memory. I assume it does not. Can I write this code in a better way using Java 8 stream features?
URL url = new URL(baseurl);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
if(connection.getResponseCode() == 200)
{
BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String current;
while((current = in.readLine()) != null)
{
persist(current);
}
}
First, you should use a try-with-resources statement to automatically close your streams when reading is done.
Next, BufferedReader has a method, BufferedReader::lines, which returns a Stream<String>.
Your code should then look like this:
URL url = new URL(baseurl);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
if (connection.getResponseCode() == 200) {
try (InputStreamReader streamReader = new InputStreamReader(connection.getInputStream());
BufferedReader br = new BufferedReader(streamReader);
Stream<String> lines = br.lines()) {
lines.forEach(s -> persist(s)); //should be a method reference
}
}
Now it's up to you to decide whether the code is better. As for your assumption: BufferedReader::lines populates the stream lazily, so lines are read as the stream is consumed and the whole file is not held in JVM memory.
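To try the line-by-line approach without a remote server, here is a small sketch. CsvLineStreamer and streamLines are hypothetical names, the Consumer plays the role of persist, and a StringReader replaces the InputStreamReader around connection.getInputStream():

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

public class CsvLineStreamer {

    // Hands each line to the sink as soon as it has been read and
    // returns how many lines were processed.
    public static long streamLines(Reader source, Consumer<String> sink) throws IOException {
        long[] count = {0};
        try (BufferedReader br = new BufferedReader(source);
             Stream<String> lines = br.lines()) {
            lines.forEach(line -> {
                sink.accept(line);
                count[0]++;
            });
        } catch (UncheckedIOException e) {
            // lines() wraps read failures; unwrap so callers see a plain IOException.
            throw e.getCause();
        }
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the remote CSV; the list stands in for the database.
        List<String> persisted = new ArrayList<>();
        long n = streamLines(new StringReader("id,name\n1,Ada\n2,Linus\n"), persisted::add);
        System.out.println(n + " lines: " + persisted);
    }
}
```

With a real connection, it is worth passing an explicit charset, for example new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8), rather than relying on the platform default.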

file-size difference while downloading file

I'm working on a Java file downloader. I downloaded a video file both with my app and without it, and the two files differ in size. I also can't open the file that I downloaded with my Java app. When I open the files in Notepad++, the one from my app contains seemingly random symbols. What am I doing wrong?
http://i.imgur.com/lKaofVg.png - here, as you can see, there are randomly generated question marks.
http://i.imgur.com/8bLC2R7.png - but in the original file, they don't exist.
http://i.imgur.com/H3MGgwl.png - here are the file sizes; I marked the generated file with "+".
Here's my code:
String currentRange = "bytes=0-"+num*13107200;
System.out.println(num + " is executing");
URL file = new URL(url);
FileOutputStream stream = new FileOutputStream("tmp"+num+".mp4");
HttpURLConnection urlConnection = (HttpURLConnection) file.openConnection();
urlConnection.setRequestProperty("Range", currentRange);
urlConnection.connect();
BufferedReader in = new BufferedReader(new InputStreamReader(
urlConnection.getInputStream()));
String inputLine;
final PrintStream printStream = new PrintStream(stream);
while ((inputLine = in.readLine()) != null)
printStream.println(inputLine);
in.close();
printStream.close();
I solved it with the code below, thanks to the question "Reading binary file from URLConnection" and @piet.t:
InputStream input = urlConnection.getInputStream();
byte[] buffer = new byte[4096];
int n = - 1;
while ( (n = input.read(buffer)) != -1)
{
stream.write(buffer, 0, n);
}
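To see why the first version changed the file size, the following sketch pushes the same bytes through both loops. The names are invented for illustration, and UTF-8 is fixed explicitly so the run is reproducible (the original code used the platform default charset). Bytes that are not valid text get replaced during decoding, and readLine/println rewrites line endings, so the text round trip cannot reproduce the input:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReaderCorruptionDemo {

    // Mimics the question's loop: decode bytes as text, print line by line.
    public static byte[] copyAsText(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                 new ByteArrayInputStream(input), StandardCharsets.UTF_8));
             PrintStream out = new PrintStream(sink, false, "UTF-8")) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);
            }
        }
        return sink.toByteArray();
    }

    // Mimics the fix: move raw bytes, no decoding involved.
    public static byte[] copyAsBytes(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(input)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Arbitrary binary data: a NUL, an invalid UTF-8 byte, CR LF, and 0x01.
        byte[] original = {0x00, (byte) 0xFF, 0x0D, 0x0A, 0x01};
        System.out.println(Arrays.equals(original, copyAsText(original)));  // false
        System.out.println(Arrays.equals(original, copyAsBytes(original))); // true
    }
}
```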

HTTP GET request in java returns meaningless data from App Engine

I'm requesting a JSON file from an App Engine URL:
http://1-1-26a.wordbuzzweb.appspot.com/json/level-images.json
The file encoding is UTF-8 without a BOM. On my local disk its size is 12414 bytes. If I fetch the file in Chrome, it reads perfectly well, and if I then save it, it is 12414 bytes. However, if I try to download the file with a GET request in Java, I only get 780 bytes back, and the returned data appears to be meaningless.
I've tried several different kinds of GET request; I have used both of the methods below elsewhere without problems. The response code on the GET requests is 200. Interestingly, if I do a POST with no content instead of a GET, I get the valid response.
If I download the file from this URL on Google Drive instead, the GET methods below work perfectly.
Edit: This code is now working. However, this is a recurring issue that comes and goes. If anyone has any idea what might be causing it, please say so!
This doesn't work:
public static String doGetSync(String urlToRead) throws IOException {
URL url = new URL(urlToRead);
InputStream is = url.openStream();
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int nRead;
byte[] data = new byte[BUFFER_SIZE];
while ((nRead = is.read(data, 0, data.length)) != -1) {
buffer.write(data, 0, nRead);
}
buffer.flush();
byte[] bytes = buffer.toByteArray();
return new String(bytes, "UTF-8");
}
Neither does this
public static String doGetSync2(String urlToRead) throws IOException {
final String charset = "UTF-8";
// Create the connection
HttpURLConnection connection = (HttpURLConnection) new URL(urlToRead).openConnection();
// Check the error stream first, if this is null then there have been no issues with the request
InputStream inputStream = connection.getErrorStream();
if (inputStream == null)
inputStream = connection.getInputStream();
// Read everything from our stream
BufferedReader responseReader = new BufferedReader(new InputStreamReader(inputStream, charset));
String inputLine;
StringBuilder response = new StringBuilder();
while ((inputLine = responseReader.readLine()) != null) {
response.append(inputLine);
}
responseReader.close();
return response.toString();
}
This code works
public static String doPostSync(final String url, final String content) throws IOException {
final String charset = "UTF-8";
// Create the connection
HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
// setDoOutput(true) implicitly sets the request method to POST
connection.setDoOutput(true);
connection.setRequestProperty("Accept-Charset", charset);
connection.setRequestProperty("Content-type", "application/json");
// Write to the connection
OutputStream output = connection.getOutputStream();
output.write(content.getBytes(charset));
output.close();
// Check the error stream first, if this is null then there have been no issues with the request
InputStream inputStream = connection.getErrorStream();
if (inputStream == null)
inputStream = connection.getInputStream();
// Read everything from our stream
BufferedReader responseReader = new BufferedReader(new InputStreamReader(inputStream, charset));
String inputLine;
StringBuilder response = new StringBuilder();
while ((inputLine = responseReader.readLine()) != null) {
response.append(inputLine);
}
responseReader.close();
return response.toString();
}

Download file from ftp server

I am writing a small Android application which requires some data that is stored on my web server. The file is a .txt file, currently less than 1 MB. Is it advisable to set up an FTP server to get the data, or can I just use an HTTP GET to fetch the contents of the file? If I use HTTP GET, can someone please tell me the Java code required for this operation?
This is off the top of my head (so an error could have sneaked in):
URL url = new URL("http://www.yourserver.com/some/path");
HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
// try-with-resources closes both streams, even if the copy fails
try (InputStream in = new BufferedInputStream(urlConnection.getInputStream());
     FileOutputStream out = new FileOutputStream("/path/to/your/output/file")) {
    byte[] buffer = new byte[16384];
    int len;
    while ((len = in.read(buffer)) != -1) {
        out.write(buffer, 0, len);
    }
} finally {
    urlConnection.disconnect();
}
