S3 Multipart Upload Error Code: MalformedXML - java

I am using the following code for multipart upload to S3.
public boolean uploadFileToS3(List<InputStream> filePartitions) throws FileException, IOException {
    AWSCredentials credentials = new BasicAWSCredentials("access_key", "secret_key");
    String existingBucketName = "bucketName";
    String keyName = "file.log";
    AmazonS3 s3Client = new AmazonS3Client(credentials);
    // Create a list of PartETag objects. You get one of these for
    // each part upload.
    List<PartETag> partETags = new ArrayList<PartETag>();
    // Step 1: Initialize.
    InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest(existingBucketName, keyName);
    InitiateMultipartUploadResult initResponse = s3Client.initiateMultipartUpload(initRequest);
    try {
        // Step 2: Upload parts.
        Iterator<InputStream> its = filePartitions.iterator();
        int i = 0;
        while (its.hasNext()) {
            UploadPartRequest uploadRequest = new UploadPartRequest()
                    .withBucketName(existingBucketName)
                    .withKey(keyName)
                    .withUploadId(initResponse.getUploadId())
                    .withPartNumber(i)
                    .withInputStream(its.next());
            i++;
            System.out.println("Part " + i + " is uploaded");
        }
        // Step 3: Complete.
        CompleteMultipartUploadRequest compRequest = new CompleteMultipartUploadRequest(existingBucketName,
                keyName,
                initResponse.getUploadId(),
                partETags);
        s3Client.completeMultipartUpload(compRequest);
    } catch (Exception e) {
        System.out.println("******************************");
        System.out.println(e.getMessage());
        System.out.println(e.getCause());
        System.out.println("************************");
        s3Client.abortMultipartUpload(new AbortMultipartUploadRequest(
                existingBucketName, keyName, initResponse.getUploadId()));
    }
    return true;
}
I am facing the following exception when I run this code.
The XML you provided was not well-formed or did not validate against our published schema (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: C0538A21C25A2DD4)
In the above code, the List filePartitions is created from a large file; each chunk is 20000 bytes.
I have used the following code to split the file into partitions, as I have InputStream data directly rather than a file (it is a REST API).
List<InputStream> filePartitions = new ArrayList<InputStream>();
InputStream inStream = new BufferedInputStream(a_fileInputStream);
int totalBytesRead = 0;
int FILE_SIZE = a_fileInputStream.available();
int chunkSize = 20000;
while (totalBytesRead < FILE_SIZE) {
    int bytesRemaining = FILE_SIZE - totalBytesRead;
    if (bytesRemaining < chunkSize) {
        chunkSize = bytesRemaining;
    }
    byte[] buffer = new byte[chunkSize]; // temporary byte array
    int bytesRead = inStream.read(buffer, 0, chunkSize);
    if (bytesRead > 0) { // if bytes were read
        totalBytesRead += bytesRead;
        // create an InputStream from the temporary byte array
        InputStream partition = new ByteArrayInputStream(buffer);
        filePartitions.add(partition);
    }
}
return filePartitions;
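For reference, the likely cause: Step 2 builds each UploadPartRequest but never calls uploadPart, so partETags stays empty and completeMultipartUpload sends an empty part list, which S3 rejects as MalformedXML. A minimal sketch of the missing loop, reusing the question's variables (note that S3 part numbers must start at 1, and every part except the last must be at least 5 MB, so 20000-byte chunks will also be rejected):
int partNumber = 1; // S3 part numbers run from 1 to 10000; 0 is rejected
for (InputStream part : filePartitions) {
    UploadPartRequest uploadRequest = new UploadPartRequest()
            .withBucketName(existingBucketName)
            .withKey(keyName)
            .withUploadId(initResponse.getUploadId())
            .withPartNumber(partNumber++)
            .withInputStream(part)
            .withPartSize(part.available()); // fine for ByteArrayInputStream chunks
    // The step the question's code is missing: actually upload the part
    // and keep the returned ETag for CompleteMultipartUpload.
    partETags.add(s3Client.uploadPart(uploadRequest).getPartETag());
}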

Related

Java, how can you chunk pieces of a large inputstream efficiently?

I have an input stream that is potentially 20-30 MB. I'm trying to upload chunks as a multipart file upload to S3.
I have the content length and the input stream available. How can I do this efficiently with memory in mind?
I saw someone do something like this, but I'm not sure I fully understand it:
int contentLength = inputStreamMetadata.getContentLength();
int partSize = 512 * 1024; // part size: 512 KB
int filePosition = 0;
ByteArrayInputStream bais = inputStreamMetadata.getInputStream();
List<PartETag> partETags = new ArrayList<>();
byte[] chunkedFileBytes = new byte[partSize];
for (int i = 1; filePosition < contentLength; i++) {
    // Because the last part could be less than 5 MB, adjust the part size as needed.
    partSize = Math.min(partSize, (contentLength - filePosition));
    filePosition += bais.read(chunkedFileBytes, filePosition, partSize);
    // Create the request to upload a part.
    UploadPartRequest uploadRequest = new UploadPartRequest()
            .withBucketName(bucketName)
            .withUploadId(uploadId)
            .withKey(fileName)
            .withPartNumber(i)
            .withInputStream(new ByteArrayInputStream(chunkedFileBytes, 0, partSize))
            .withPartSize(partSize);
    UploadPartResult uploadResult = client.uploadPart(uploadRequest);
    partETags.add(uploadResult.getPartETag());
}
Specifically this piece: .withInputStream(new ByteArrayInputStream(bytes, 0, bytesRead))
Sorry, I cannot (easily) test it, but I think you are really close; you just have to fix and rearrange your loop.
Combining https://stackoverflow.com/a/22128215/592355 with your latest code:
int partSize = 5 * 1024 * 1024; // set part size to 5 MB
ByteArrayInputStream bais = inputStreamMetadata.getInputStream();
List<PartETag> partETags = new ArrayList<>();
byte[] buff = new byte[partSize];
int partNumber = 1;
while (true) {
    int readBytes = bais.read(buff); // readBytes is in [-1 .. partSize]
    if (readBytes == -1) { // EOF
        break;
    }
    // Create the request to upload a part.
    UploadPartRequest uploadRequest = new UploadPartRequest()
            .withBucketName(bucketName)
            .withUploadId(uploadId)
            .withKey(fileName)
            .withPartNumber(partNumber++)
            .withInputStream(new ByteArrayInputStream(buff, 0, readBytes))
            .withPartSize(readBytes);
    UploadPartResult uploadResult = client.uploadPart(uploadRequest);
    partETags.add(uploadResult.getPartETag());
}
// Complete the multipart upload....
// https://docs.aws.amazon.com/AmazonS3/latest/dev/llJavaUploadFile.html
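That completion step might look like this (a sketch assuming the client, bucketName, fileName, uploadId, and partETags from the loop above):
// Finish the multipart upload with the collected part ETags.
CompleteMultipartUploadRequest compRequest = new CompleteMultipartUploadRequest(
        bucketName, fileName, uploadId, partETags);
client.completeMultipartUpload(compRequest);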

Download File from amazon s3

Hi guys, I'm working on a project that downloads a file from Amazon S3. The file downloads successfully in chunks (5 MB or more at a time), but the downloaded file is corrupted after completion.
GetObjectRequest rangeObjectRequest = new GetObjectRequest(existingBucketName, fileName);
long filePosition = 0;
int part = 0;
while (filePosition < contentLength) {
    blockSize = Math.min(blockSize, (contentLength - filePosition));
    if (Thread.interrupted()) {
        throw new InterruptedException();
    }
    rangeObjectRequest.setRange(filePosition, filePosition + blockSize);
    S3Object objectPortion = s3Client.getObject(rangeObjectRequest);
    InputStream objectData = objectPortion.getObjectContent();
    IOUtils.copy(objectData, out);
    objectData.close();
    filePosition = filePosition + blockSize;
    part++;
}
out.close();
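No answer is shown above, but for reference, a ranged-download loop might look like this (a sketch assuming the same s3Client, existingBucketName, fileName, contentLength, and out). setRange takes inclusive bounds, so reusing filePosition + blockSize as both an end offset and the next start likely re-downloads one overlapping byte per chunk, which would corrupt the output:
// Sketch: download in 5 MB ranges; setRange's bounds are inclusive.
GetObjectRequest rangeRequest = new GetObjectRequest(existingBucketName, fileName);
long filePosition = 0;
long blockSize = 5 * 1024 * 1024L;
while (filePosition < contentLength) {
    long bytesToRead = Math.min(blockSize, contentLength - filePosition);
    rangeRequest.setRange(filePosition, filePosition + bytesToRead - 1); // inclusive end
    S3Object portion = s3Client.getObject(rangeRequest);
    InputStream objectData = portion.getObjectContent();
    try {
        IOUtils.copy(objectData, out);
    } finally {
        objectData.close();
    }
    filePosition += bytesToRead;
}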

Split large file into chunks

I have a method which accepts a file and a chunk size and returns a list of chunked files. The main problem is that a line in the file can be broken across chunks; for example, the main file has these lines:
|1|aaa|bbb|ccc|
|2|ggg|ddd|eee|
After the split, one file could contain:
|1|aaa|bbb
In another file:
|ccc|2|
|ggg|ddd|eee|
Here is the code:
public static List<File> splitFile(File file, int sizeOfFileInMB) throws IOException {
    int counter = 1;
    List<File> files = new ArrayList<>();
    int sizeOfChunk = 1024 * 1024 * sizeOfFileInMB;
    byte[] buffer = new byte[sizeOfChunk];
    try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
        String name = file.getName();
        int tmp = 0;
        while ((tmp = bis.read(buffer)) > 0) {
            File newFile = new File(file.getParent(), name + "." + String.format("%03d", counter++));
            try (FileOutputStream out = new FileOutputStream(newFile)) {
                out.write(buffer, 0, tmp);
            }
            files.add(newFile);
        }
    }
    return files;
}
Should I use the RandomAccessFile class for this (the main file is really big, more than 5 GB)?
If you don't mind chunks of different lengths (<= sizeOfChunk but as close to it as possible), here is the code:
public static List<File> splitFile(File file, int sizeOfFileInMB) throws IOException {
    int counter = 1;
    List<File> files = new ArrayList<File>();
    int sizeOfChunk = 1024 * 1024 * sizeOfFileInMB;
    String lineSeparator = System.lineSeparator();
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String name = file.getName();
        String line = br.readLine();
        while (line != null) {
            File newFile = new File(file.getParent(), name + "." + String.format("%03d", counter++));
            try (OutputStream out = new BufferedOutputStream(new FileOutputStream(newFile))) {
                int fileSize = 0;
                while (line != null) {
                    byte[] bytes = (line + lineSeparator).getBytes(Charset.defaultCharset());
                    if (fileSize + bytes.length > sizeOfChunk)
                        break;
                    out.write(bytes);
                    fileSize += bytes.length;
                    line = br.readLine();
                }
            }
            files.add(newFile);
        }
    }
    return files;
}
The only problem here is the file charset, which is the default system charset in this example. If you want to be able to change it, let me know and I'll add a third parameter to the splitFile function for it.
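That third parameter might look like this (a sketch, not part of the original answer): pass a Charset and use it for both decoding and encoding, so chunk sizes are measured against the bytes actually written:
public static List<File> splitFile(File file, int sizeOfFileInMB, Charset charset) throws IOException {
    int counter = 1;
    List<File> files = new ArrayList<>();
    int sizeOfChunk = 1024 * 1024 * sizeOfFileInMB;
    String lineSeparator = System.lineSeparator();
    // Read with the caller's charset instead of FileReader's platform default.
    try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file), charset))) {
        String name = file.getName();
        String line = br.readLine();
        while (line != null) {
            File newFile = new File(file.getParent(), name + "." + String.format("%03d", counter++));
            try (OutputStream out = new BufferedOutputStream(new FileOutputStream(newFile))) {
                int fileSize = 0;
                while (line != null) {
                    // Encode with the same charset so the size check matches the output.
                    byte[] bytes = (line + lineSeparator).getBytes(charset);
                    if (fileSize + bytes.length > sizeOfChunk)
                        break;
                    out.write(bytes);
                    fileSize += bytes.length;
                    line = br.readLine();
                }
            }
            files.add(newFile);
        }
    }
    return files;
}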
Just in case anyone is interested in a Kotlin version.
It creates an iterator of ByteArray chunks:
class ByteArrayReader(val input: InputStream, val chunkSize: Int, val bufferSize: Int = 1024 * 8) : Iterator<ByteArray> {
    var eof: Boolean = false

    init {
        if ((chunkSize % bufferSize) != 0) {
            throw RuntimeException("ChunkSize(${chunkSize}) should be a multiple of bufferSize (${bufferSize})")
        }
    }

    override fun hasNext(): Boolean = !eof

    override fun next(): ByteArray {
        var buffer = ByteArray(bufferSize)
        var chunkWriter = ByteArrayOutputStream(chunkSize) // no need to close - implementation is empty
        var bytesRead = 0
        var offset = 0
        while (input.read(buffer).also { bytesRead = it } > 0) {
            if (chunkWriter.use { out ->
                        out.write(buffer, 0, bytesRead)
                        out.flush()
                        offset += bytesRead
                        offset == chunkSize
                    }) {
                return chunkWriter.toByteArray()
            }
        }
        eof = true
        return chunkWriter.toByteArray()
    }
}
Split a file into multiple chunks (an in-memory operation); here I'm splitting any file into chunks of 500 KB (500000 bytes) and adding them to a list:
public static List<ByteArrayOutputStream> splitFile(File f) {
    List<ByteArrayOutputStream> datalist = new ArrayList<>();
    try {
        int sizeOfFiles = 500000;
        byte[] buffer = new byte[sizeOfFiles];
        try (FileInputStream fis = new FileInputStream(f);
             BufferedInputStream bis = new BufferedInputStream(fis)) {
            int bytesAmount = 0;
            while ((bytesAmount = bis.read(buffer)) > 0) {
                try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
                    out.write(buffer, 0, bytesAmount);
                    out.flush();
                    datalist.add(out);
                }
            }
        }
    } catch (Exception e) {
        // handle the error
    }
    return datalist;
}
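A usage sketch (the file path is hypothetical):
// Split, then hand each chunk to whatever consumes it (e.g. an upload loop).
List<ByteArrayOutputStream> parts = splitFile(new File("/tmp/big.log"));
for (int n = 0; n < parts.size(); n++) {
    byte[] chunk = parts.get(n).toByteArray(); // each chunk is at most 500000 bytes
    // upload chunk as part number n + 1 ...
}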
Split files into chunks depending on your chunk size:
val f = FileInputStream(file)
val data = ByteArray(f.available()) // size of the original file
var subData: ByteArray
f.read(data)
var start = 0
var end = CHUNK_SIZE
val max = data.size
if (max > 0) {
    while (end < max) {
        subData = data.copyOfRange(start, end)
        start = end
        end += CHUNK_SIZE
        if (end >= max) {
            end = max
        }
        // Function to upload your chunk
        uploadFileInChunk(subData, isLast = false)
    }
    // For the last chunk; clamp end in case the file is smaller than CHUNK_SIZE.
    // (copyOfRange's end index is exclusive, so no end-- is needed.)
    if (end > max) {
        end = max
    }
    subData = data.copyOfRange(start, end)
    uploadFileInChunk(subData, isLast = true)
}
If you are taking the file from the user through an intent, you may get the file URI as content; in that case:
Uri uri = data.getData();
InputStream inputStream = getContext().getContentResolver().openInputStream(uri);
fileInBytes = IOUtils.toByteArray(inputStream);
Add the dependency in your build.gradle to use IOUtils:
compile 'commons-io:commons-io:2.11.0'
Now make a small modification to the above code to send your file to the server:
var subData: ByteArray
var start = 0
var end = CHUNK_SIZE
val max = fileInBytes.size
if (max > 0) {
    while (end < max) {
        subData = fileInBytes.copyOfRange(start, end)
        start = end
        end += CHUNK_SIZE
        if (end >= max) {
            end = max
        }
        uploadFileInChunk(subData, isLast = false)
    }
    // For the last chunk (again, copyOfRange's end is exclusive, so no end--)
    if (end > max) {
        end = max
    }
    subData = fileInBytes.copyOfRange(start, end)
    uploadFileInChunk(subData, isLast = true)
}

java zip to binary format and then decompress

I have a task that:
1. reads a zip file from local disk into a binary message
2. transfers the binary message through EMS as a String (done by the Java API)
3. receives the transferred binary message as a String (done by the Java API)
4. decompresses the binary message and then prints it out
The problem I am facing is a DataFormatException while decompressing the message. I have no idea which part went wrong.
I use this to read the file into a binary message:
static String readFile_Stream(String fileName) throws IOException {
    File file = new File(fileName);
    byte[] fileData = new byte[(int) file.length()];
    FileInputStream in = new FileInputStream(file);
    in.read(fileData);
    String content = "";
    System.out.print("Sent message: ");
    for (byte b : fileData) {
        System.out.print(getBits(b));
        content += getBits(b);
    }
    in.close();
    return content;
}

static String getBits(byte b) {
    String result = "";
    for (int i = 0; i < 8; i++)
        result = ((b & (1 << i)) == 0 ? "0" : "1") + result;
    return result;
}
I use this to decompress the message:
private static byte[] toByteArray(String input) {
    byte[] byteArray = new byte[input.length() / 8];
    for (int i = 0; i < input.length() / 8; i++) {
        String read_data = input.substring(i * 8, i * 8 + 8);
        short a = Short.parseShort(read_data, 2);
        byteArray[i] = (byte) a;
    }
    return byteArray;
}
public static byte[] unzipByteArray(byte[] file) throws IOException {
    byte[] byReturn = null;
    Inflater oInflate = new Inflater(false);
    oInflate.setInput(file);
    ByteArrayOutputStream oZipStream = new ByteArrayOutputStream();
    try {
        while (!oInflate.finished()) {
            byte[] byRead = new byte[4 * 1024];
            int iBytesRead = oInflate.inflate(byRead);
            if (iBytesRead == byRead.length) {
                oZipStream.write(byRead);
            } else {
                oZipStream.write(byRead, 0, iBytesRead);
            }
        }
        byReturn = oZipStream.toByteArray();
    } catch (DataFormatException ex) {
        throw new IOException("Attempting to unzip file that is not zipped.");
    } finally {
        oZipStream.close();
    }
    return byReturn;
}
The message I got is
java.io.IOException: Attempting to unzip file that is not zipped.
at com.sourcefreak.example.test.TibcoEMSQueueReceiver.unzipByteArray(TibcoEMSQueueReceiver.java:144)
at com.sourcefreak.example.test.TibcoEMSQueueReceiver.main(TibcoEMSQueueReceiver.java:54)
I have checked that the binary message is not corrupted after transmission.
Please help me figure out the problem.
Have you tried using InflaterInputStream? Based on my experience, using Inflater directly is rather tricky. You can use this to get started:
public static byte[] unzipByteArray(byte[] file) throws IOException {
    InflaterInputStream iis = new InflaterInputStream(new ByteArrayInputStream(file));
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] buffer = new byte[512];
    int length;
    while ((length = iis.read(buffer, 0, buffer.length)) > 0) {
        baos.write(buffer, 0, length);
    }
    iis.close();
    baos.close();
    return baos.toByteArray();
}
I finally figured out the problem: the original file is a .zip file, so I should use ZipInputStream to unzip it before further processing.
public static byte[] unzipByteArray(byte[] file) throws IOException {
    // create a buffer to improve copy performance
    byte[] buffer = new byte[2048];
    byte[] content;
    // open the zip stream over the byte array
    InputStream theFile = new ByteArrayInputStream(file);
    ZipInputStream stream = new ZipInputStream(theFile);
    ByteArrayOutputStream output = new ByteArrayOutputStream();
    try {
        ZipEntry entry;
        while ((entry = stream.getNextEntry()) != null) {
            // Once we get the entry from the stream, the stream is
            // positioned to read the raw data, and we keep
            // reading until read returns 0 or less.
            int len = 0;
            while ((len = stream.read(buffer)) > 0) {
                output.write(buffer, 0, len);
            }
        }
    } finally {
        // we must always close the zip stream
        stream.close();
    }
    content = output.toByteArray();
    return content;
}
This code works for a zip file containing a single file.
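For an archive with several entries, one option (a sketch, not from the original answer) is to collect one byte array per entry name:
public static Map<String, byte[]> unzipEntries(byte[] file) throws IOException {
    Map<String, byte[]> entries = new LinkedHashMap<>();
    byte[] buffer = new byte[2048];
    try (ZipInputStream stream = new ZipInputStream(new ByteArrayInputStream(file))) {
        ZipEntry entry;
        while ((entry = stream.getNextEntry()) != null) {
            // one output buffer per entry, keyed by the entry's name
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            int len;
            while ((len = stream.read(buffer)) > 0) {
                output.write(buffer, 0, len);
            }
            entries.put(entry.getName(), output.toByteArray());
        }
    }
    return entries;
}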

File is not transferred completely (Android)

I am developing an Android app that sends a file via Bluetooth to a Java server using the BlueCove library version 2.1.0, based on this snippet. At the beginning everything looks fine, but the file is not transferred completely: only about 7 KB of 35 KB arrive.
Android
private void sendFileViaBluetooth(byte[] data) {
    OutputStream outStream = null;
    BluetoothDevice device = btAdapter.getRemoteDevice(address);
    btSocket = device.createRfcommSocketToServiceRecord(MY_UUID);
    btSocket.connect();
    try {
        outStream = btSocket.getOutputStream();
        outStream.write(data);
        outStream.write("end of file".getBytes());
        outStream.flush();
    } catch (IOException e) {
    } finally {
        try {
            outStream.close();
            btSocket.close();
            device = null;
        } catch (IOException e) {
        }
    }
}
PC Server
InputStream inStream = connection.openInputStream();
byte[] buffer = new byte[1024];
File f = new File("d:\\temp.jpg");
FileOutputStream fos = new FileOutputStream(f);
InputStream bis = new BufferedInputStream(inStream);
int bytes = 0;
boolean eof = false;
while (!eof) {
    bytes = bis.read(buffer);
    if (bytes > 0) {
        int offset = bytes - 11;
        byte[] eofByte = Arrays.copyOfRange(buffer, offset, bytes);
        String message = new String(eofByte, 0, 11);
        if (message.equals("end of file")) {
            eof = true;
        } else {
            fos.write(buffer, 0, bytes);
        }
    }
}
fos.close();
connection.close();
I have already tried splitting the byte array before writing:
public static byte[][] divideArray(byte[] source, int chunksize) {
    byte[][] ret = new byte[(int) Math.ceil(source.length / (double) chunksize)][chunksize];
    int start = 0;
    for (int i = 0; i < ret.length; i++) {
        ret[i] = Arrays.copyOfRange(source, start, start + chunksize);
        start += chunksize;
    }
    return ret;
}

private void sendFileViaBluetooth(byte[] data) {
    [...]
    byte[][] chunks = divideArray(data, 1024);
    for (int i = 0; i < (int) Math.ceil(data.length / 1024.0); i += 1) {
        outStream.write(chunks[i][1024]);
    }
    outStream.write("end of file".getBytes());
    outStream.flush();
    [...]
}
Any help or ideas are appreciated.
You don't need any of this. The canonical way to copy a stream in Java is this:
while ((count = in.read(buffer)) > 0) {
    out.write(buffer, 0, count);
}
out.close();
Same at both ends. TCP/IP will do all the chunking for you. All you need to do is cope correctly with varying size reads, which this code does.
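Applied to the server side above, that loop might look like this (a sketch assuming the same connection; the end of the stream, signalled by the sender closing the socket, replaces the "end of file" marker):
// Sketch: canonical copy loop on the receiving end; no marker bytes needed.
try (InputStream in = new BufferedInputStream(connection.openInputStream());
     OutputStream out = new FileOutputStream("d:\\temp.jpg")) {
    byte[] buffer = new byte[1024];
    int count;
    while ((count = in.read(buffer)) > 0) {
        out.write(buffer, 0, count);
    }
}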
