Downloading a large JSON file to a local file using Java

I'm attempting to download a JSON from the following URL - http://api.crunchbase.com/v/1/companies.js - to a local file. I'm using Java 1.7 and the following JSON Libraries - http://www.json.org/java/ - to attempt to make it work.
Here's my code:
public static void download(String address, String localFileName) {
OutputStream out = null;
URLConnection conn = null;
InputStream in = null;
try {
URL url = new URL(address);
out = new BufferedOutputStream(
new FileOutputStream(localFileName));
conn = url.openConnection();
in = conn.getInputStream();
byte[] buffer = new byte[1024];
int numRead;
long numWritten = 0;
while ((numRead = in.read(buffer)) != -1)
{
out.write(buffer, 0, numRead);
numWritten += numRead;
System.out.println(buffer.length);
System.out.println(" " + buffer.hashCode());
}
System.out.println(localFileName + "\t" + numWritten);
} catch (Exception exception) {
exception.printStackTrace();
} finally {
try {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
} catch (IOException ioe) {
}
}
}
When I run the code, everything seems to work until, midway through the loop, the program stops and does not continue reading the JSON object.
Does anyone know why this would stop reading? How could I fix the issue?

Try This:
public void saveUrl(String filename, String urlString) throws MalformedURLException, IOException
{
BufferedInputStream in = null;
FileOutputStream fout = null;
try
{
in = new BufferedInputStream(new URL(urlString).openStream());
fout = new FileOutputStream(filename);
byte data[] = new byte[1024];
int count;
while ((count = in.read(data, 0, 1024)) != -1)
{
fout.write(data, 0, count);
}
}
finally
{
if (in != null)
in.close();
if (fout != null)
fout.close();
}
}

Does anyone know why this would stop reading? How could I fix the issue?
I can't see anything obviously wrong with the client-side code. In the absence of any other evidence on the client side, I'd look at the server-side logs to see if there are any clues there.
IMO, the most likely explanation is one of the following:
There's a bug in the server-side code that is generating the JSON and it is crashing halfway through.
The server (or a proxy / reverse proxy) has a timeout on the time allowed for some part of the interaction, and this particular request is taking too long.
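If a timeout or a stalled connection is the suspect, one way to make the stall visible on the client is to set explicit connect and read timeouts, so that a silent hang surfaces as a java.net.SocketTimeoutException instead. This is a minimal sketch assuming Java 7; the class name TimedDownload, the method name and the timeoutMillis parameter are illustrative and do not come from the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TimedDownload {
    // Copies the resource at 'url' into 'target', overwriting it if it exists.
    // timeoutMillis is applied to connecting and to each blocking read.
    // Returns the number of bytes copied.
    static long download(URL url, Path target, int timeoutMillis) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

If the call fails with a timeout at roughly the same point every time, that points towards the server or an intermediate proxy rather than the copy loop.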

Related

How to download a remote file using Java

I'm trying to download a single file from a web server (http or https) using as few third party libraries as possible.
The method I've come up with is as follows:
private static final int BUFFER_SIZE = 8;
public static boolean download(URL url, File f) throws IOException {
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
FileOutputStream out = new FileOutputStream(f);
BufferedInputStream in = new BufferedInputStream(conn.getInputStream());
byte[] buffer;
long dld = 0, expected = conn.getContentLengthLong(); // TODO expected will be -1 if the content length is unknown
while (true) { // TODO fix endless loop if server timeout
buffer = new byte[BUFFER_SIZE];
int n = in.read(buffer);
if (n == -1) break;
else dld += n;
out.write(buffer);
}
out.close();
System.out.println(dld + "B transmitted to " + f.getAbsolutePath());
return true;
}
However, it by no means works as intended. For example, I tried to download https://upload.wikimedia.org/wikipedia/commons/6/6d/Rubber_Duck_Florentijn_Hofman_Hong_Kong_2013d.jpg, and the result was horrifying:
For some reason I was able to view the picture in IrfanView but not in any other viewer, so this is a re-saved version.
I tried messing with the buffer size or downloading other images but the results are more or less the same.
If I look at the file, there are entire parts of the content simply replaced with dots:
I'm really lost on this one so thanks for any help :)
The problem occurs when read() returns fewer than 8 bytes. Because a new buffer is allocated on every iteration, the unfilled part of the array still holds zeros, which is why you're seeing so many in your hex editor. The solution is simple: replace out.write(buffer); with out.write(buffer, 0, n);. This tells the FileOutputStream to write only the first n bytes of the buffer.
Fixed code:
private static final int BUFFER_SIZE = 8;
public static boolean download(URL url, File f) throws IOException {
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
FileOutputStream out = new FileOutputStream(f);
BufferedInputStream in = new BufferedInputStream(conn.getInputStream());
// We can move the buffer declaration outside the loop
byte[] buffer = new byte[BUFFER_SIZE];
long dld = 0, expected = conn.getContentLengthLong(); // TODO expected will be -1 if the content length is unknown
while (true) {
int n = in.read(buffer);
if (n == -1) break;
else dld += n;
out.write(buffer, 0, n);
}
out.close();
System.out.println(dld + "B transmitted to " + f.getAbsolutePath());
return true;
}
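The effect is easy to reproduce without a network. The sketch below (class and method names are made up for illustration) pushes 10 bytes through an 8-byte buffer. Writing the whole buffer emits 16 bytes, because the second read fills only 2 of the 8 slots; writing only the n bytes that were read emits exactly 10.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class PartialReadDemo {
    private static final int BUFFER_SIZE = 8;

    // Mirrors the loop from the question: always writes the full buffer.
    static byte[] copyWholeBuffer(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int n = in.read(buffer);
            if (n == -1) break;
            out.write(buffer); // bug: also writes the unfilled tail of the buffer
        }
        return out.toByteArray();
    }

    // Corrected loop: writes only the n bytes that read() actually delivered.
    static byte[] copyBytesRead(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[BUFFER_SIZE];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(copyWholeBuffer(new ByteArrayInputStream(source)).length); // prints 16
        System.out.println(copyBytesRead(new ByteArrayInputStream(source)).length);   // prints 10
    }
}
```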
Try something like this to download pictures
public static byte[] download(String param) throws IOException {
InputStream in = null;
ByteArrayOutputStream out = null;
try {
URL url = new URL(param);
HttpURLConnection con = (HttpURLConnection)url.openConnection();
con.setConnectTimeout(120000);
con.setReadTimeout(120000);
con.setRequestMethod("GET");
con.connect();
in = new BufferedInputStream(con.getInputStream());
out = new ByteArrayOutputStream();
byte[] buf = new byte[1024];
int n = 0;
while (-1 != (n = in.read(buf))) {
out.write(buf, 0, n);
}
return out.toByteArray();
} finally {
try {
out.close();
} catch (Exception e1) {
}
try {
in.close();
} catch (Exception e2) {
}
}
}

How to read a file using ByteArrayInputStream in java? [duplicate]

How do I convert a java.io.File to a byte[]?
From JDK 7 you can use Files.readAllBytes(Path).
Example:
import java.io.File;
import java.nio.file.Files;
File file;
// ...(file is initialised)...
byte[] fileContent = Files.readAllBytes(file.toPath());
It depends on what "best" means for you. Productivity-wise, don't reinvent the wheel: use Apache Commons IO, specifically FileUtils.readFileToByteArray(File input).
Since JDK 7 - one liner:
byte[] array = Files.readAllBytes(Paths.get("/path/to/file"));
No external dependencies needed.
import java.io.RandomAccessFile;
RandomAccessFile f = new RandomAccessFile(fileName, "r");
byte[] b = new byte[(int)f.length()];
f.readFully(b);
Documentation for Java 8: http://docs.oracle.com/javase/8/docs/api/java/io/RandomAccessFile.html
Basically you have to read it in memory. Open the file, allocate the array, and read the contents from the file into the array.
The simplest way is something similar to this:
public byte[] read(File file) throws IOException, FileTooBigException {
if (file.length() > MAX_FILE_SIZE) {
throw new FileTooBigException(file);
}
ByteArrayOutputStream ous = null;
InputStream ios = null;
try {
byte[] buffer = new byte[4096];
ous = new ByteArrayOutputStream();
ios = new FileInputStream(file);
int read = 0;
while ((read = ios.read(buffer)) != -1) {
ous.write(buffer, 0, read);
}
}finally {
try {
if (ous != null)
ous.close();
} catch (IOException e) {
}
try {
if (ios != null)
ios.close();
} catch (IOException e) {
}
}
return ous.toByteArray();
}
This has some unnecessary copying of the file content (actually the data is copied three times: from file to buffer, from buffer to ByteArrayOutputStream, from ByteArrayOutputStream to the actual resulting array).
You also need to make sure you read in memory only files up to a certain size (this is usually application dependent) :-).
You also need to treat the IOException outside the function.
Another way is this:
public byte[] read(File file) throws IOException, FileTooBigException {
if (file.length() > MAX_FILE_SIZE) {
throw new FileTooBigException(file);
}
byte[] buffer = new byte[(int) file.length()];
InputStream ios = null;
try {
ios = new FileInputStream(file);
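// a single read() may return fewer bytes than requested, so loop until the buffer is full
int offset = 0;
while (offset < buffer.length) {
int n = ios.read(buffer, offset, buffer.length - offset);
if (n == -1) {
throw new IOException(
"EOF reached while trying to read the whole file");
}
offset += n;
}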
} finally {
try {
if (ios != null)
ios.close();
} catch (IOException e) {
}
}
return buffer;
}
This has no unnecessary copying.
FileTooBigException is a custom application exception.
The MAX_FILE_SIZE constant is an application parameter.
For big files you should probably consider a stream-processing algorithm or use memory mapping (see java.nio).
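As a sketch of what "stream processing" can mean here: when the result you need can be computed incrementally (a checksum, a count, a search), you can work through the file chunk by chunk and never hold more than one buffer in memory. The example below computes a CRC-32 this way. The class name and the choice of checksum are illustrative assumptions, not part of the answer above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

final class StreamingChecksum {
    // Computes the CRC-32 of a file while holding only one 8 KiB buffer in memory.
    static long crc32(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                crc.update(buffer, 0, n); // feed only the bytes actually read
            }
        }
        return crc.getValue();
    }
}
```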
As someone said, FileUtils from Apache Commons IO might have what you are looking for:
public static byte[] readFileToByteArray(File file) throws IOException
Example use (Program.java):
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
public class Program {
public static void main(String[] args) throws IOException {
File file = new File(args[0]); // assume args[0] is the path to file
byte[] data = FileUtils.readFileToByteArray(file);
...
}
}
If you don't have Java 8, and agree with me that including a massive library to avoid writing a few lines of code is a bad idea:
public static byte[] readBytes(InputStream inputStream) throws IOException {
byte[] b = new byte[1024];
ByteArrayOutputStream os = new ByteArrayOutputStream();
int c;
while ((c = inputStream.read(b)) != -1) {
os.write(b, 0, c);
}
return os.toByteArray();
}
Caller is responsible for closing the stream.
// Returns the contents of the file in a byte array.
public static byte[] getBytesFromFile(File file) throws IOException {
// Get the size of the file
long length = file.length();
// You cannot create an array using a long type.
// It needs to be an int type.
// Before converting to an int type, check
// to ensure that file is not larger than Integer.MAX_VALUE.
if (length > Integer.MAX_VALUE) {
// File is too large
throw new IOException("File is too large!");
}
// Create the byte array to hold the data
byte[] bytes = new byte[(int)length];
// Read in the bytes
int offset = 0;
int numRead = 0;
InputStream is = new FileInputStream(file);
try {
while (offset < bytes.length
&& (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
offset += numRead;
}
} finally {
is.close();
}
// Ensure all the bytes have been read in
if (offset < bytes.length) {
throw new IOException("Could not completely read file "+file.getName());
}
return bytes;
}
You can use the NIO API as well. This code worked for me as long as the total file size (in bytes) fits in an int.
File f = new File("c:\\wscp.script");
FileInputStream fin = null;
FileChannel ch = null;
try {
fin = new FileInputStream(f);
ch = fin.getChannel();
int size = (int) ch.size();
MappedByteBuffer buf = ch.map(MapMode.READ_ONLY, 0, size);
byte[] bytes = new byte[size];
buf.get(bytes);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
try {
if (fin != null) {
fin.close();
}
if (ch != null) {
ch.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
I think it's very fast since it's using MappedByteBuffer.
Simple way to do it:
File fff = new File("/path/to/file");
FileInputStream fileInputStream = new FileInputStream(fff);
// int byteLength = fff.length();
// In android the result of file.length() is long
long byteLength = fff.length(); // byte count of the file-content
byte[] filecontent = new byte[(int) byteLength];
fileInputStream.read(filecontent, 0, (int) byteLength);
Simplest Way for reading bytes from file
import java.io.*;
class ReadBytesFromFile {
public static void main(String args[]) throws Exception {
// getBytes from anyWhere
// I'm getting byte array from File
File file = null;
FileInputStream fileStream = new FileInputStream(file = new File("ByteArrayInputStreamClass.java"));
// Instantiate array
byte[] arr = new byte[(int) file.length()];
// read All bytes of File stream
fileStream.read(arr, 0, arr.length);
for (int X : arr) {
System.out.print((char) X);
}
}
}
Guava has Files.toByteArray() to offer you. It has several advantages:
It covers the corner case where files report a length of 0 but still have content
It's highly optimized: if the file is too big, you get an OutOfMemoryError before it even tries to load the file (through clever use of file.length()).
You don't have to reinvent the wheel.
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
File file = getYourFile();
Path path = file.toPath();
byte[] data = Files.readAllBytes(path);
Using the same approach as the community wiki answer, but cleaner and compiling out of the box (preferred approach if you don't want to import Apache Commons libs, e.g. on Android):
public static byte[] getFileBytes(File file) throws IOException {
ByteArrayOutputStream ous = null;
InputStream ios = null;
try {
byte[] buffer = new byte[4096];
ous = new ByteArrayOutputStream();
ios = new FileInputStream(file);
int read = 0;
while ((read = ios.read(buffer)) != -1)
ous.write(buffer, 0, read);
} finally {
try {
if (ous != null)
ous.close();
} catch (IOException e) {
// swallow, since not that important
}
try {
if (ios != null)
ios.close();
} catch (IOException e) {
// swallow, since not that important
}
}
return ous.toByteArray();
}
This is one of the simplest ways:
String pathFile = "/path/to/file";
byte[] bytes = Files.readAllBytes(Paths.get(pathFile));
I believe this is the easiest way:
org.apache.commons.io.FileUtils.readFileToByteArray(file);
readFully reads b.length bytes from this file into the byte array, starting at the current file pointer. This method reads repeatedly from the file until the requested number of bytes are read. This method blocks until the requested number of bytes are read, the end of the stream is detected, or an exception is thrown.
RandomAccessFile f = new RandomAccessFile(fileName, "r");
byte[] b = new byte[(int)f.length()];
f.readFully(b);
If you want to read bytes into a pre-allocated byte buffer, this answer may help.
Your first guess would probably be to use InputStream read(byte[]). However, this method has a flaw that makes it unreasonably hard to use: there is no guarantee that the array will actually be completely filled, even if no EOF is encountered.
Instead, take a look at DataInputStream readFully(byte[]). This is a wrapper for input streams, and does not have the above mentioned issue. Additionally, this method throws when EOF is encountered. Much nicer.
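A minimal sketch of that approach, assuming the data comes from a file and that the buffer is allocated by the caller (the class and method names are hypothetical):

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

final class ReadFullyExample {
    // Fills 'target' completely from the start of 'file'.
    // Throws java.io.EOFException if the file holds fewer than target.length bytes.
    static void fill(File file, byte[] target) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            in.readFully(target);
        }
    }
}
```

Unlike a single read(byte[]) call, readFully either fills the whole array or throws, so there is no partially filled buffer to handle.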
Not only does the following way convert a java.io.File to a byte[], I also found it to be the fastest way to read in a file, when testing many different Java file reading methods against each other:
java.nio.file.Files.readAllBytes()
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
public class ReadFile_Files_ReadAllBytes {
public static void main(String [] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt";
File file = new File(fileName);
byte [] fileBytes = Files.readAllBytes(file.toPath());
char singleChar;
for(byte b : fileBytes) {
singleChar = (char) b;
System.out.print(singleChar);
}
}
}
//The file that you wanna convert into byte[]
File file=new File("/storage/0CE2-EA3D/DCIM/Camera/VID_20190822_205931.mp4");
FileInputStream fileInputStream=new FileInputStream(file);
byte[] data=new byte[(int) file.length()];
BufferedInputStream bufferedInputStream=new BufferedInputStream(fileInputStream);
bufferedInputStream.read(data,0,data.length);
//Now the bytes of the file are contain in the "byte[] data"
Let me add another solution without using third-party libraries. It re-uses an exception handling pattern that was proposed by Scott. I also moved the ugly part into a separate method (I would hide it in some FileUtils class ;) )
public void someMethod() {
final byte[] buffer = read(new File("test.txt"));
}
private byte[] read(final File file) {
if (file.isDirectory())
throw new RuntimeException("Unsupported operation, file "
+ file.getAbsolutePath() + " is a directory");
if (file.length() > Integer.MAX_VALUE)
throw new RuntimeException("Unsupported operation, file "
+ file.getAbsolutePath() + " is too big");
Throwable pending = null;
FileInputStream in = null;
final byte buffer[] = new byte[(int) file.length()];
try {
in = new FileInputStream(file);
in.read(buffer);
} catch (Exception e) {
pending = new RuntimeException("Exception occurred on reading file "
+ file.getAbsolutePath(), e);
} finally {
if (in != null) {
try {
in.close();
} catch (Exception e) {
if (pending == null) {
pending = new RuntimeException(
"Exception occurred on closing file "
+ file.getAbsolutePath(), e);
}
}
}
if (pending != null) {
throw new RuntimeException(pending);
}
}
return buffer;
}
public static byte[] readBytes(InputStream inputStream) throws IOException {
byte[] buffer = new byte[32 * 1024];
int bufferSize = 0;
for (;;) {
int read = inputStream.read(buffer, bufferSize, buffer.length - bufferSize);
if (read == -1) {
return Arrays.copyOf(buffer, bufferSize);
}
bufferSize += read;
if (bufferSize == buffer.length) {
buffer = Arrays.copyOf(buffer, bufferSize * 2);
}
}
}
Another Way for reading bytes from file
Reader reader = null;
try {
reader = new FileReader(file);
char buf[] = new char[8192];
int len;
StringBuilder s = new StringBuilder();
while ((len = reader.read(buf)) >= 0) {
s.append(buf, 0, len);
byte[] byteArray = s.toString().getBytes();
}
} catch(FileNotFoundException ex) {
} catch(IOException e) {
}
finally {
if (reader != null) {
reader.close();
}
}
Try this :
import sun.misc.IOUtils;
import java.io.IOException;
try {
String path="";
InputStream inputStream=new FileInputStream(path);
byte[] data=IOUtils.readFully(inputStream,-1,false);
}
catch (IOException e) {
System.out.println(e);
}
It can be done as simply as this (Kotlin version):
val byteArray = File(path).inputStream().readBytes()
EDIT:
I've read the docs for the readBytes method. They say:
Reads this stream completely into a byte array.
Note: It is the caller's responsibility to close this stream.
So to be able to close the stream, while keeping everything clean, use the following code:
val byteArray = File(path).inputStream().use { it.readBytes() }
Thanks to @user2768856 for pointing this out.
Try this if your target API level is below 26:
private static byte[] readFileToBytes(String filePath) {
File file = new File(filePath);
byte[] bytes = new byte[(int) file.length()];
// if you can use Java 7, use Files.readAllBytes(path) instead
try(FileInputStream fis = new FileInputStream(file)){
fis.read(bytes);
return bytes;
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
In JDK8
String data;
// try-with-resources closes the stream even if collect() throws
try (Stream<String> lines = Files.lines(path)) {
data = lines.collect(Collectors.joining("\n"));
}

Efficient way to write InputStream to a File in Java 6

My application gets an input stream from a third-party library.
I have to write this input stream to a file.
Following is the code snippet I tried:
private void writeDataToFile(Stub stub) {
OutputStream os = null;
InputStream inputStream = null;
try {
inputStream = stub.getStream();
os = new FileOutputStream("test.txt");
int read = 0;
byte[] bytes = new byte[1024];
while ((read = inputStream.read(bytes)) != -1) {
os.write(bytes, 0, read);
}
} catch (Exception e) {
log("Error while fetching data", e);
} finally {
if(inputStream != null) {
try {
inputStream.close();
} catch (IOException e) {
log("Error while closing input stream", e);
}
}
if(os != null) {
try {
os.close();
} catch (IOException e) {
log("Error while closing output stream", e);
}
}
}
}
Is there a better approach?
Since you are stuck with Java 6, do yourself a favour and use Guava and its Closer:
final Closer closer = Closer.create();
final InputStream in;
final OutputStream out;
final byte[] buf = new byte[32768]; // 32k
int bytesRead;
try {
in = closer.register(createInputStreamHere());
out = closer.register(new FileOutputStream(...));
while ((bytesRead = in.read(buf)) != -1)
out.write(buf, 0, bytesRead);
out.flush();
} finally {
closer.close();
}
Had you used Java 7, the solution would have been as simple as:
final Path destination = Paths.get("pathToYourFile");
try (
final InputStream in = createInputStreamHere();
) {
Files.copy(in, destination);
}
And the input stream in would have been automatically closed for you as a "bonus"; Files would have handled destination all by itself.
If you're not on Java 7 and can't use fge's solution, you may want to wrap your OutputStream in a BufferedOutputStream
BufferedOutputStream os = new BufferedOutputStream(new FileOutputStream("xx.txt"));
Such a buffered output stream writes bytes to the file in blocks, which is more efficient than writing byte by byte.
It can get cleaner with an OutputStreamWriter:
OutputStream outputStream = new FileOutputStream("output.txt");
Writer writer = new OutputStreamWriter(outputStream);
writer.write("data");
writer.close();
Instead of writing a string, you can use a Scanner on your inputStream
Scanner sc = new Scanner(inputStream);
while (sc.hasNext())
//read using scanner methods

Request Method GET returns wrong content length

Hi, I am using an HttpURLConnection to fetch a text file's content, and I want to know the size of that file. I use the getContentLength method, but it returns the wrong value: for example, in this code the file's size is 17509, but it returns 5147.
Any help?
Thanks so much in advance :).
new Thread() {
@Override
public void run() {
String path = parser.getValue(e, "txt");
URL u = null;
try {
u = new URL(path);
HttpURLConnection c = (HttpURLConnection) u
.openConnection();
c.setRequestMethod("GET");
c.connect();
int lenghtOfFile = c.getContentLength();
InputStream in = c.getInputStream();
final ByteArrayOutputStream bo = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
long total = 0;
int count;
Log.i("p1",""+lenghtOfFile);
while ((count = in.read(buffer)) != -1) {
total += count;
Log.i("p2",""+total);
bo.write(buffer, 0, count);
}
bo.close();
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (ProtocolException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}.start();
Content-Length is a header set by the server. I would check that your server is returning the correct Content-Length. You can do that with curl:
curl -v http://path/to/file.txt
That should show you the headers that were sent and returned.
A quick workaround I can think of, is just ignoring the content-length and reading input stream until there's nothing left to read.
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(8192);
int read = inputStream.read();
while (read != -1) {
byteArrayOutputStream.write((byte) read);
read = inputStream.read();
}
byteArrayOutputStream.flush();
buf = byteArrayOutputStream.toByteArray();
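One possible cause on Android that is worth ruling out before blaming the server: the platform's HttpURLConnection requests gzip compression by default and decompresses transparently, so Content-Length reports the compressed size while the stream yields the uncompressed bytes. The sketch below asks for the identity encoding and then compares the declared length with the number of bytes actually read. The class and method names are hypothetical, and whether compression is the cause in this particular case is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

final class ContentLengthCheck {
    // Returns {declaredLength, bytesActuallyRead}.
    // declaredLength is -1 when the server sends no usable Content-Length.
    static long[] declaredVersusActual(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        // Ask the server not to compress, so Content-Length describes the raw body.
        conn.setRequestProperty("Accept-Encoding", "identity");
        int declared = conn.getContentLength();
        long actual = 0;
        byte[] buffer = new byte[4096];
        try (InputStream in = conn.getInputStream()) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                actual += n;
            }
        }
        return new long[] {declared, actual};
    }
}
```

If the two numbers still differ with the identity encoding, the header itself is wrong or missing, and reading until end of stream remains the reliable option.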

How to write files to an SD Card from the internet?

An example would be a simple image.
I have tried so many things and it just refuses to work despite making a whole lot of sense.
What I've done so far is I'm able to grab 25 pictures and add them to
/sdcard/app name/sub/dir/filename.jpg
They all appear there according to the DDMS but they always have a filesize of 0.
I'm guessing it's probably because of my input stream?
Here's my function that handles the downloading and saving.
public void DownloadPages()
{
for (int fileC = 0; fileC < pageAmount; fileC++)
{
URL url;
String path = "/sdcard/Appname/sub/dir/";
File file = new File(path, fileC + ".jpg");
int size=0;
byte[] buffer=null;
try{
url = new URL("http://images.bluegartr.com/bucket/gallery/56ca6f9f2ef43ab7349c0e6511edb6d6.png");
InputStream in = url.openStream();
size = in.available();
buffer = new byte[size];
in.read(buffer);
in.close();
}catch(Exception e){
}
if (!new File(path).exists())
new File(path).mkdirs();
FileOutputStream out;
try{
out = new FileOutputStream(file);
out.write(buffer);
out.flush();
out.close();
}catch(Exception e){
}
}
}
It just keeps giving me 25 files in that directory but all of their file sizes are zero. I have no idea why. This is practically the same code I've used in a java program.
PS...
If you're gonna give me a solution... I've already tried code like this. It doesn't work.
try{
url = new URL(urlString);
in = new BufferedInputStream(url.openStream());
fout = new FileOutputStream(filename);
byte data[] = new byte[1024];
int count;
System.out.println("Now downloading File: " + filename.substring(0, filename.lastIndexOf(".")));
while ((count = in.read(data, 0, 1024)) != -1){
fout.write(data, 0, count);
}
}finally{
System.out.println("Download complete.");
if (in != null)
in.close();
if (fout != null)
fout.close();
}
}
Here's an image of what my directories look like
http://oi48.tinypic.com/2cpcprm.jpg
A small change to your second option; try it the following way:
byte data[] = new byte[1024];
long total = 0;
int count;
while ( ( count = input.read(data)) != -1 )
{
total += count;
output.write( data,0,count );
}
The difference is in the while statement, which in your version reads while ((count = in.read(data, 0, 1024)) != -1)
Using Guava something like this should work:
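String fileUrl = "xxx";
File file = null; // replace with the destination file
InputStream in = null;
FileOutputStream out = null;
try {
// java.net.URL, not URI: only URL has openStream()
URL url = new URL(fileUrl);
in = url.openStream();
out = new FileOutputStream(file);
// Guava's ByteStreams.copy reads until end of stream
ByteStreams.copy(in, out);
out.flush();
}
catch (IOException e) {
System.out.println(e.toString());
}
finally {
try {
if (in != null) in.close();
if (out != null) out.close();
} catch (IOException e) {
System.out.println(e.toString());
}
}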
