Spaces added when writing files - java

I'm modifying the source code of H2 MVStore 1.4.191 so that it writes files with some thread sleeps added.
The big change is that the file is no longer written in one go, but in 2^16-byte chunks.
MVStore uses java.nio FileChannel and ByteBuffer to write its file. The problem is that the result differs from the original version. It seems that FileChannel adds space characters (0x20 in ASCII), sometimes more than 40 in a row. Or maybe it doesn't remove these spaces, unlike the original version; I don't know.
I suppose it's due to the file writing.
The method file.write(buffer, position), where file is the FileChannel object, returns the number of bytes written; in the original version of H2 it sometimes returns a smaller number than the buffer size. In my version, that never happens.
Do you have any tips about ByteBuffer, FileChannel, and my problem?

The original code calls a writeFully function a few times (it writes a header, a footer, and the data):
int off = 0;
do {
    int len = file.write(src, pos + off);
    off += len;
} while (src.remaining() > 0);
src is the ByteBuffer and file is a FileChannelImpl from sun.nio.ch. The buffer can contain more than 50 MB of data.
From this code, I developed a solution that splits the ByteBuffer into 2^16-byte buffers that I write, adding a sleep between each of them:
int off = 0;
byte[] buffer = src.array();
int size = src.array().length;
int chunkSize = 128;
List<byte[]> splittedBuffer = new ArrayList<byte[]>();
int i = 0;
while (i < size) {
    int start = i;
    int end = i + chunkSize;
    if (end > size) {
        // if the buffer size is not a multiple of the chunk size,
        // the last chunk will be smaller
        end = size;
    }
    splittedBuffer.add(Arrays.copyOfRange(src.array(), start, end));
    try {
        Thread.sleep(5);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    i += chunkSize;
}
int offset = 0;
for (byte[] chunk : splittedBuffer) {
    int len = file.write(ByteBuffer.wrap(chunk), pos + offset);
    offset += len;
    try {
        Thread.sleep(5);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
Finally, the problem is maybe not that whitespace is added, but that part of the data is written in the wrong place. I'm going to check.

OK, the problem was that I used the size of the ByteBuffer's backing array to split it, instead of its limit, which is smaller (set by H2 during its processing).
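For illustration, a minimal sketch of the corrected splitting, assuming the fix is simply to iterate up to src.limit() rather than the backing array's length (variable names follow the snippet above):
// Only split up to the buffer's limit, not the full backing array,
// since the array's capacity can exceed the valid data set by H2.
int size = src.limit();
int chunkSize = 65536; // 2^16
List<byte[]> splittedBuffer = new ArrayList<byte[]>();
for (int i = 0; i < size; i += chunkSize) {
    int end = Math.min(i + chunkSize, size);
    splittedBuffer.add(Arrays.copyOfRange(src.array(), i, end));
}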
Thanks for the help
Regards

Related

Java OutOfMemoryError while merge large file parts from chunked files

I have a problem when the user uploads large files (> 1 GB) (I'm using the flow.js library): it creates hundreds of thousands of small chunked files (e.g. 100 KB each) inside a temporary directory, but fails to merge them into a single file due to an OutOfMemoryError. This does not happen when the file is under 1 GB. I know it sounds tedious and you'll probably suggest increasing the -Xmx of my container, but I want another angle besides that.
Here is my code
private void mergeFile(String identifier, int totalFile, String outputFile) throws AppException {
    File[] fileDatas = new File[totalFile]; // we know the number of parts here and create an array of that size
    byte[] fileContents = null;
    int totalFileSize = 0;
    int filePartUploadSize = 0;
    int tempFileSize = 0;
    // create the array of files and add up their lengths
    for (int i = 0; i < totalFile; i++) {
        fileDatas[i] = new File(identifier + "." + (i + 1)); // identifier is the name of the file
        totalFileSize += fileDatas[i].length();
    }
    try {
        fileContents = new byte[totalFileSize];
        InputStream inStream;
        for (int j = 0; j < totalFile; j++) {
            inStream = new BufferedInputStream(new FileInputStream(fileDatas[j]));
            filePartUploadSize = (int) fileDatas[j].length();
            inStream.read(fileContents, tempFileSize, filePartUploadSize);
            tempFileSize += filePartUploadSize;
            inStream.close();
        }
    } catch (FileNotFoundException ex) {
        throw new AppException(AppExceptionCode.FILE_NOT_FOUND);
    } catch (IOException ex) {
        throw new AppException(AppExceptionCode.ERROR_ON_MERGE_FILE);
    } finally {
        write(fileContents, outputFile);
        for (int l = 0; l < totalFile; l++) {
            fileDatas[l].delete();
        }
    }
}
Please show me the inefficiency of this method. Once again, only large files cannot be merged using this method; smaller ones (< 1 GB) are no problem at all.
I'd appreciate it if you didn't suggest increasing the heap memory, but instead showed me the fundamental error of this method. Thanks.
It's unnecessary to allocate the entire file size in memory by declaring a byte array of that size. Building the concatenated file in memory is totally unnecessary.
Just open an OutputStream for your target file, and then, for each file you are combining, read it as an InputStream and write its bytes to the OutputStream, closing each one as you finish. When you're done with them all, close the output file. Total memory use will be a few thousand bytes for the buffer.
Also, don't do I/O operations in a finally block (except closing and the like).
Here is a rough example you can play with:
ArrayList<File> files = new ArrayList<>(); // put your files here
File output = new File("yourfilename");
BufferedOutputStream boss = null;
try
{
    boss = new BufferedOutputStream(new FileOutputStream(output));
    for (File file : files)
    {
        BufferedInputStream bis = null;
        try
        {
            bis = new BufferedInputStream(new FileInputStream(file));
            // read until end of stream; take care not to write the -1 EOF marker
            int data;
            while ((data = bis.read()) >= 0)
            {
                boss.write(data);
            }
        }
        catch (Exception e)
        {
            // do error handling stuff, log it maybe?
        }
        finally
        {
            try
            {
                bis.close(); // do this in a try catch just in case
            }
            catch (Exception e)
            {
                // handle this
            }
        }
    }
}
catch (Exception e)
{
    // handle this
}
finally
{
    try
    {
        boss.close();
    }
    catch (Exception e)
    {
        // handle this
    }
}
... show me the fundamental error of this method
The implementation flaw is that you are creating a byte array (fileContents) whose size is the total file size. If the total file size is too big, that will cause an OOME. Inevitably.
Solution - don't do that! Instead, "stream" the file by reading from the "chunk" files and writing to the final file using a modest-sized buffer.
There are other problems with your code too. For instance, it could leak file descriptors because you are not ensuring that inStream is closed under all circumstances. Read up on the "try-with-resources" construct.
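For example, a minimal sketch of that streaming approach using try-with-resources, reusing identifier, totalFile, and outputFile from the question (the 8 KB buffer size is an illustrative choice):
// Stream each chunk file into the output through a small reusable buffer;
// try-with-resources guarantees every stream is closed, even on failure.
try (OutputStream out = new BufferedOutputStream(new FileOutputStream(outputFile))) {
    byte[] buf = new byte[8192]; // modest fixed-size buffer
    for (int i = 0; i < totalFile; i++) {
        try (InputStream in = new BufferedInputStream(
                new FileInputStream(identifier + "." + (i + 1)))) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }
}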

How to read (all available) data from serial connection when using JSSC?

I'm trying to work with JSSC.
I built my app according to this link:
https://code.google.com/p/java-simple-serial-connector/wiki/jSSC_examples
My event handler looks like:
static class SerialPortReader implements SerialPortEventListener {
    public void serialEvent(SerialPortEvent event) {
        if (event.isRXCHAR()) { // if data is available
            try {
                byte[] buffer = serialPort.readBytes();
            }
            catch (SerialPortException ex) {
                System.out.println(ex);
            }
        }
    }
}
The problem is that I never get the incoming data in one piece. (If the message has a length of 100 bytes, I get 48 and 52 bytes in two separate calls.)
- The other side sends me messages of different lengths.
- In the ICD I'm working with, there is a field which tells us the length of the message (from byte #10 to byte #13).
- I could read 14 bytes (serialPort.readBytes(14)), parse the message length, and then read the rest of the message (serialPort.readBytes(messageLength - 14)). But if I do that, I will not have the message in one piece; I will have two separate byte[] arrays, and I need it in one piece (byte[]) without the work of a copy function.
Is that possible?
When working with Ethernet (SocketChannel) we can read data using a ByteBuffer. But with JSSC we can't.
Is there a good alternative to JSSC?
Thanks
You can't rely on any library to give you all the content you need at once, because:
- the library doesn't know how much data you need
- the library gives you data as it comes, depending on buffers, hardware, etc.
You must develop your own business logic to handle packet reception. It will of course depend on how your packets are defined: are they always the same length, are they separated by the same ending character, etc.
Here is an example that should work with your system (note you should take this as a start, not a full solution; it doesn't include a timeout, for example):
static class SerialPortReader implements SerialPortEventListener
{
    private int m_nReceptionPosition = 0;
    private boolean m_bReceptionActive = false;
    private byte[] m_aReceptionBuffer = new byte[2048];

    @Override
    public void serialEvent(SerialPortEvent p_oEvent)
    {
        byte[] aReceiveBuffer = new byte[2048];
        int nLength = 0;
        int nByte = 0;
        switch (p_oEvent.getEventType())
        {
            case SerialPortEvent.RXCHAR:
                try
                {
                    aReceiveBuffer = serialPort.readBytes();
                    for (nByte = 0; nByte < aReceiveBuffer.length; nByte++)
                    {
                        //System.out.print(String.format("%02X ", aReceiveBuffer[nByte]));
                        m_aReceptionBuffer[m_nReceptionPosition] = aReceiveBuffer[nByte];
                        // Buffer overflow protection
                        if (m_nReceptionPosition >= 2047)
                        {
                            // Reset for next packet
                            m_bReceptionActive = false;
                            m_nReceptionPosition = 0;
                        }
                        else if (m_bReceptionActive)
                        {
                            m_nReceptionPosition++;
                            // Receive at least the start of the packet including the length
                            if (m_nReceptionPosition >= 14)
                            {
                                nLength = (short) ((short) m_aReceptionBuffer[10] & 0x000000FF);
                                nLength |= ((short) m_aReceptionBuffer[11] << 8) & 0x0000FF00;
                                nLength |= ((short) m_aReceptionBuffer[12] << 16) & 0x00FF0000;
                                nLength |= ((short) m_aReceptionBuffer[13] << 24) & 0xFF000000;
                                //nLength += ..; // depending on whether the length field counts ALL bytes of the packet or only the content part
                                if (m_nReceptionPosition >= nLength)
                                {
                                    // You received at least all the content
                                    // Reset for next packet
                                    m_bReceptionActive = false;
                                    m_nReceptionPosition = 0;
                                }
                            }
                        }
                        // Start receiving only if this is a Start Of Header
                        else if (m_aReceptionBuffer[0] == '\0')
                        {
                            m_bReceptionActive = true;
                            m_nReceptionPosition = 1;
                        }
                    }
                }
                catch (Exception e)
                {
                    e.printStackTrace();
                }
                break;
            default:
                break;
        }
    }
}
After writing data to the serial port, it needs to be flushed. Check the timing, and pay attention to the fact that the read should occur only after the other end has written. The read size is just an indication to the read system call and is not guaranteed: the data may have arrived and be buffered in the serial port's hardware buffer but not yet transferred to the operating system buffer, hence not to the application. Consider using the scm library, which flushes data after each write: http://www.embeddedunveiled.com/
Try this:
Write your data to the serial port (using serialPort.writeBytes()) and if you are expecting a response, use this:
byte[] getData() throws SerialPortException, IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] b;
    try {
        while ((b = serialPort.readBytes(1, 100)) != null) {
            baos.write(b);
            // System.out.println("Wrote: " + b.length + " bytes");
        }
        // System.out.println("Returning: " + Arrays.toString(baos.toByteArray()));
    } catch (SerialPortTimeoutException ex) {
        // nothing to do; a timeout just means there is no more data to read
    }
    return baos.toByteArray();
}
Do what you want with the returned byte array; in my case I just display it for testing.
I found it works just fine if you read one byte at a time with a 100 ms timeout; when it does time out, you've read all the data in the buffer.
Source: trying to talk to an Epson serial printer using jssc and ESC/POS.

InputStream.available() and reading a file completely, notes from Oracle

According to:
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
from:
http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available%28%29
and this note
In particular, code of the form
int n = in.available();
byte[] buf = new byte[n];
in.read(buf);
is not guaranteed to read all of the remaining bytes from the given input stream.
http://docs.oracle.com/javase/8/docs/technotes/guides/io/troubleshooting.html
Does this mean that using the function below can fail to read the file completely?
/**
 * Reads a file from /raw/res/ and returns it as a byte array
 * @param res Resources instance for Mosembro
 * @param resourceId ID of resource (ex: R.raw.resource_name)
 * @return byte[] if successful, null otherwise
 */
public static byte[] readRawByteArray(Resources res, int resourceId)
{
    InputStream is = null;
    byte[] raw = new byte[] {};
    try {
        is = res.openRawResource(resourceId);
        raw = new byte[is.available()];
        is.read(raw);
    }
    catch (IOException e) {
        e.printStackTrace();
        raw = null;
    }
    finally {
        try {
            is.close();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
    return raw;
}
available() returns the number of bytes that can be read without blocking. There is no necessary correlation between that number, which can be zero, and the total length of the file.
Yes, it does not necessarily read everything; it is like RandomAccessFile.read(byte[]) as opposed to RandomAccessFile.readFully(byte[]). Furthermore, if available() returns 0, the code actually physically reads 0 bytes.
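To make the contrast concrete, a small sketch (the path and size are illustrative assumptions; a file with at least 1024 bytes is assumed to exist):
try (RandomAccessFile raf = new RandomAccessFile("data.bin", "r")) {
    byte[] buf = new byte[1024];
    int n = raf.read(buf);   // may return anywhere from -1 to 1024 bytes
    raf.seek(0);
    raf.readFully(buf);      // fills all 1024 bytes, or throws EOFException
}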
It probably reads only the first block if it is a slow device like a file system.
The principle: the file is read by the underlying system software, normally buffered, so you have a couple of blocks already in memory, and sometimes the system is already reading ahead. The system reads blocks asynchronously, and a read blocks if it asks for more than has already been buffered.
So in general the software has a read loop over blocks, and a read operation regularly blocks until the physical read has buffered enough.
To hope for a non-blocking read you would need to do:
InputStream is = res.openRawResource(resourceId);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
for (;;) {
    // Read bytes until no longer available:
    for (;;) {
        int n = is.available();
        if (n == 0) {
            break;
        }
        byte[] part = new byte[n];
        int nread = is.read(part);
        assert nread == n;
        baos.write(part, 0, nread);
    }
    // Still a probably blocking read:
    byte[] part = new byte[128];
    int nread = is.read(part);
    if (nread <= 0) {
        break; // end of file
    }
    baos.write(part, 0, nread);
}
return baos.toByteArray();
Now, before you copy that code: simply do a blocking read loop immediately. I cannot see an advantage in using available() unless you can do something with partial data while reading the rest.
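For reference, a plain blocking read loop of that kind might look like this sketch (reusing res and resourceId from the question):
InputStream is = res.openRawResource(resourceId);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] part = new byte[8192];
int nread;
// Block until EOF; read() returns -1 when the stream is exhausted.
while ((nread = is.read(part)) != -1) {
    baos.write(part, 0, nread);
}
return baos.toByteArray();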

RXTX java, inputStream does not return all the buffer

This is my code; I'm using RXTX.
public void Send(byte[] bytDatos) throws IOException {
    this.out.write(bytDatos);
}

public byte[] Read() throws IOException {
    byte[] buffer = new byte[1024];
    int len = 20;
    while (in.available() != 0) {
        in.read(buffer);
    }
    System.out.print(new String(buffer, 0, len) + "\n");
    return buffer;
}
The rest of the code is just the same as before; I only changed two things.
InputStream in = serialPort.getInputStream();
OutputStream out = serialPort.getOutputStream();
They are global variables now and...
(new Thread(new SerialReader(in))).start();
(new Thread(new SerialWriter(out))).start();
...no longer exist.
I'm sending this (each second):
Send(("123456789").getBytes());
And this is what I got:
123456789123
456789
123456789
1234567891
23456789
Can anybody help me?
EDIT
Later, I found a better way to solve it. Thanks. This was the read code:
public byte[] Read(int intEspera) throws IOException {
    try {
        Thread.sleep(intEspera);
    } catch (InterruptedException ex) {
        Logger.getLogger(COM_ClComunica.class.getName()).log(Level.SEVERE, null, ex);
    }
    byte[] buffer = new byte[528];
    int len = 0;
    while (in.available() > 0) {
        len = in.available();
        in.read(buffer, 0, 528);
    }
    return buffer;
}
It was impossible for me to remove that sleep, but it is not a problem, so thanks veer.
You should indeed note that InputStream.available is defined as follows...
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
As you can see, this is not what you expected. Instead, you want to check for end-of-stream, which is indicated by InputStream.read() returning -1.
In addition, since you don't remember how much data you have already read in prior iterations of your read loop, you are potentially overwriting prior data in your buffer, which is again not something you likely intended.
What you appear to want is something as follows:
private static final int MESSAGE_SIZE = 20;
public byte[] read() throws IOException {
final byte[] buffer = new byte[MESSAGE_SIZE];
int total = 0;
int read = 0;
while (total < MESSAGE_SIZE
&& (read = in.read(buffer, total, MESSAGE_SIZE - total)) >= 0) {
total += read;
}
return buffer;
}
This should force it to read up to 20 bytes, less in the case of reaching the end of the stream.
Special thanks to EJP for reminding me to maintain the quality of my posts and make sure they're correct.
Get rid of the available() test. All it is doing is telling you whether there is data ready to be read without blocking. That isn't the same thing as telling you where an entire message ends. There are few correct uses for available(), and this isn't one of them.
And advance the buffer offset when you read. You need to keep track of how many bytes you have read so far, use that count as the second parameter to read(), and pass buffer.length minus that count as the third parameter.

Is there a difference in Java's writeInt when executed on Windows vs an Intel based Mac

I'm currently writing a Java TCP server to handle the communication with a client (which I didn't write). When the server, hosted on Windows, responds to the client with the number of records received, the client doesn't read the integer correctly and instead reads it as an empty packet. When the same server code, hosted on my Mac, responds to the client with the number of records received, the client reads the packet and responds correctly. Through my research I haven't found an explanation that solves the issue. I have tried reversing the bytes (Integer.reverseBytes) before calling the writeInt method, and that didn't resolve the issue. Any ideas are appreciated.
Brian
After comparing the pcap files there are no obvious differences in how they are sent. The first byte is sent followed by the last 3. Both systems send the correct number of records.
Yes, I'm referring to the DataOutputStream.writeInt() method. // Code added
public void run() {
    try {
        InputStream in = socket.getInputStream();
        DataOutputStream datOut = new DataOutputStream(socket.getOutputStream());
        datOut.writeByte(1); // sends correctly and is read correctly by the client
        datOut.flush();
        // below is used to read bytes to determine the length of the message
        int bytesRead = 0;
        int bytesToRead = 25;
        byte[] input = new byte[bytesToRead];
        while (bytesRead < bytesToRead) {
            int result = in.read(input, bytesRead, bytesToRead - bytesRead);
            if (result == -1) break;
            bytesRead += result;
        }
        try {
            inputLine = getHexString(input);
            String hexLength = inputLine.substring(46, 50);
            System.out.println("hexLength: " + hexLength);
            System.out.println(inputLine);
            // used to read the entire sent message
            bytesRead = 0;
            bytesToRead = Integer.parseInt(hexLength, 16);
            System.out.println("bytes to read " + bytesToRead);
            byte[] dataInput = new byte[bytesToRead];
            while (bytesRead < bytesToRead) {
                int result = in.read(dataInput, bytesRead, bytesToRead - bytesRead);
                if (result == -1) break;
                bytesRead += result;
            }
            String data = getHexString(dataInput);
            System.out.println(data);
            // sends received data to a class for processing
            ProcessTel dataValues = new ProcessTel(data);
            String[] dataArray = new String[10];
            dataArray = dataValues.dataArray();
            // assigns the returned number of records to be written to the client
            int towrite = Integer.parseInt(dataArray[0].trim());
            // same write method on Windows & Mac... works on Mac but not Windows
            datOut.writeInt(towrite);
            System.out.println("Returned number of records: " + Integer.parseInt(dataArray[0].trim()));
            datOut.flush();
        } catch (Exception ex) {
            Logger.getLogger(ServerThread.class.getName()).log(Level.SEVERE, null, ex);
        }
        datOut.close();
        in.close();
        socket.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
As described in its Javadoc, DataOutputStream.writeInt() uses network byte order as per the TCP/IP RFCs. Is that the method you are referring to?
No, x86 processors only support little-endian byte order, it doesn't vary with the OS. Something else is wrong.
I suggest using wireshark to capture the stream from a working Mac server and a non-working Windows server and compare.
Some general comments on your code:
int bytesRead = 0;
int bytesToRead = 25;
byte[] input = new byte[bytesToRead];
while (bytesRead < bytesToRead) {
    int result = in.read(input, bytesRead, bytesToRead - bytesRead);
    if (result == -1) break;
    bytesRead += result;
}
This EOF handling is hokey. It means that you don't know whether you've actually read the full 25 bytes. And if you haven't, you'll assume that the bytes-to-send is 0.
Worse, you copy-and-pasted this code lower down, relying on proper re-initialization of the same variables. If there's a typo, you'll never know it. You could refactor it into its own method (with tests), or you could call DataInputStream.readFully().
inputLine = getHexString(input);
String hexLength = inputLine.substring(46, 50);
You're converting to hex in order to extract an integer? Why? More importantly, if you have any endianness issues, this is probably the reason.
I was originally going to recommend using a ByteBuffer to extract values, but on a second look I think you should wrap your input stream with a DataInputStream. That would allow you to read complete byte[] buffers without the need for a loop, and it would let you get rid of the byte-to-hex-to-integer conversions: you'd simply call readInt().
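For instance, a sketch of that DataInputStream approach; the 25-byte header size is carried over from the code above, and the exact header layout is an assumption:
DataInputStream dataIn = new DataInputStream(socket.getInputStream());

// Read the fixed 25-byte header in one call; throws EOFException if the
// stream ends early, instead of silently proceeding with partial data.
byte[] header = new byte[25];
dataIn.readFully(header);

// If the peer wrote the length field with writeInt(), it could be read back
// directly, with no hex round-trip:
// int bodyLength = dataIn.readInt();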
But, continuing on:
String[] dataArray = new String[10];
dataArray = dataValues.dataArray();
Do you realize that the new String[10] is being thrown away by the very next line? Is that what you want?
int towrite = Integer.parseInt(dataArray[0].trim());
datOut.writeInt(towrite);
System.out.println("Returned number of records: " + Integer.parseInt(dataArray[0].trim()) );
If you're using logging statements, print what you're actually using (towrite). Don't recalculate it. There's too much of a chance to make a mistake.
} catch (Exception ex) {
    Logger.getLogger(ServerThread.class.getName()).log(Level.SEVERE, null, ex);
}
// ...
} catch (IOException e) {
    e.printStackTrace();
}
Do either or both of these catch blocks get invoked? And why do they send their output to different places? For that matter, if you have a logger, why are you inserting System.out.println() statements?
