Smooks : return an OutputStream - java

I am currently writing a Java application that reads an EDI file and returns an OutputStream, using the Smooks library. I am struggling to return the output stream and use it without exhausting memory. The idea of returning an output stream is that callers can convert it into an InputStream and use it to create objects, write files, push to a database, etc. I would really appreciate it if somebody could give me an insight into what I am doing wrong. Thanks in advance.
public class EdiToXml {

    private static final int headerBufferSize = 100;
    private static final byte[] buf = new byte[headerBufferSize];
    private static Smooks smooks;
    private static final String headerVersion1 = "IFLIRR\u001F15\u001F2\u001F1A";
    private static StreamSource stream;

    protected static ByteArrayOutputStream TransformBifToJava(FileInputStream inputStream) throws IOException, SAXException, SmooksException {
        Locale defaultLocale = Locale.getDefault();
        Locale.setDefault(new Locale("en", "EN"));
        // Wrap the input in a BufferedInputStream
        BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
        // Mark the stream so it can be reset after sniffing the header;
        // the readlimit must cover the bytes read before reset()
        bufferedInputStream.mark(headerBufferSize);
        // Read the first 100 bytes to detect the file version
        bufferedInputStream.read(buf);
        // Decode the header bytes
        String value = new String(buf);
        if (value.indexOf(headerVersion1) > 0) {
            // Instantiate Smooks with the config for 15.2.1A
            smooks = new Smooks("smooks-config.xml");
        }
        bufferedInputStream.reset();
        stream = new StreamSource(bufferedInputStream);
        try {
            return Parse1(defaultLocale, smooks, stream);
        } finally {
            bufferedInputStream.close();
            inputStream.close();
        }
    }

    protected static ByteArrayOutputStream Parse1(Locale locale, Smooks smooks, StreamSource streamSource) throws IOException, SAXException, SmooksException {
        try {
            ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
            // Create an exec context - no profiles....
            ExecutionContext executionContext = smooks.createExecutionContext();
            // Filter the input message
            smooks.filterSource(executionContext, streamSource, new StreamResult(byteArrayOutputStream));
            Locale.setDefault(locale);
            System.out.println(byteArrayOutputStream.size());
            return byteArrayOutputStream;
        } finally {
            smooks.close();
        }
    }

    public static void main(String[] args) throws IOException, SAXException, SmooksException {
        ByteArrayOutputStream byteArrayOutputStream = EdiToXml.TransformBifToJava(new FileInputStream("xxxx/BifInputFile.DATA"));
        InputStream is = new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
        byteArrayOutputStream.close();
        int b = is.read();
        while (b != -1) {
            System.out.printf("%c", b);
            b = is.read();
        }
        is.close();
        System.out.println("======================================\n\n");
        System.out.print("Finished");
        System.out.println("======================================\n\n");
    }
}
Exception in thread "main" org.milyn.SmooksException: Smooks Filtering operation failed.
at org.milyn.Smooks._filter(Smooks.java:548)
at org.milyn.Smooks.filterSource(Smooks.java:482)
at com.maureva.xfunctional.EdiToXml.Parse1(EdiToXml.java:102)
at com.maureva.xfunctional.EdiToXml.TransformBifToJava(EdiToXml.java:86)
at com.maureva.xfunctional.EdiToXml.main(EdiToXml.java:173)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.milyn.delivery.sax.SAXHandler.flushCurrentWriter(SAXHandler.java:503)
at org.milyn.delivery.sax.SAXHandler.endElement(SAXHandler.java:234)
at org.milyn.delivery.SmooksContentHandler.endElement(SmooksContentHandler.java:96)
at org.milyn.edisax.EDIParser.endElement(EDIParser.java:897)
at org.milyn.edisax.EDIParser.endElement(EDIParser.java:883)
at org.milyn.edisax.EDIParser.mapComponent(EDIParser.java:693)
at org.milyn.edisax.EDIParser.mapField(EDIParser.java:636)
at org.milyn.edisax.EDIParser.mapFields(EDIParser.java:603)
at org.milyn.edisax.EDIParser.mapSegment(EDIParser.java:564)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:535)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:453)
at org.milyn.edisax.EDIParser.mapSegment(EDIParser.java:566)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:535)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:453)
at org.milyn.edisax.EDIParser.mapSegment(EDIParser.java:566)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:535)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:453)
at org.milyn.edisax.EDIParser.mapSegment(EDIParser.java:566)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:535)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:453)
at org.milyn.edisax.EDIParser.mapSegment(EDIParser.java:566)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:535)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:453)
Process finished with exit code 1

I suggest you upgrade to the latest version of Smooks (v2.0.0-RC1), given that the EDI cartridge has been totally overhauled. The app is running out of memory because you're writing to a java.io.ByteArrayOutputStream, which keeps all the written bytes in memory. I haven't understood what you're trying to accomplish: the things you mentioned, like creating objects, writing to files, and saving to a database, can all be done from within Smooks.
If you only want to use Smooks to convert the EDI into XML, then you should write the result to an output stream that doesn't keep the data in memory, such as a FileOutputStream, or implement your own OutputStream if you want to do something more elaborate with the result. Having said this, it doesn't make much sense to me to use Smooks only for transforming the input into XML.
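For illustration, here is a minimal sketch of the FileOutputStream route, reusing the Smooks calls already shown in the question; the output file name is my placeholder, not something from the original post:

import org.milyn.Smooks;
import org.milyn.container.ExecutionContext;

import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;

public class EdiToXmlFile {

    public static void main(String[] args) throws Exception {
        Smooks smooks = new Smooks("smooks-config.xml");
        try (FileInputStream in = new FileInputStream("xxxx/BifInputFile.DATA");
             // The XML result is streamed to disk instead of being buffered in memory
             FileOutputStream out = new FileOutputStream("xxxx/BifOutputFile.xml")) {
            ExecutionContext ctx = smooks.createExecutionContext();
            smooks.filterSource(ctx, new StreamSource(new BufferedInputStream(in)), new StreamResult(out));
        } finally {
            smooks.close();
        }
    }
}

If a consumer really does need an InputStream rather than a file, a PipedOutputStream/PipedInputStream pair, with the Smooks filtering running on its own thread, is another way to avoid holding the whole result in memory; that is my suggestion rather than part of the original answer.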

Related

Strings in downloadfile weird symbols

I've got a String array that contains the content for a downloadable file. I am converting it to a stream for the download, but there are some random values in the download file. I don't know if it is due to the encoding, and if so, how can I change it?
var downloadButton = new DownloadLink(btn, "test.csv", () -> {
    try {
        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
        ObjectOutputStream objectOutputStream = new ObjectOutputStream(byteArrayOutputStream);
        for (int i = 0; i < downloadContent.size(); i++) {
            objectOutputStream.writeUTF(downloadContent.get(i));
        }
        objectOutputStream.flush();
        objectOutputStream.close();
        byte[] byteArray = byteArrayOutputStream.toByteArray();
        ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(byteArray);
        ObjectInputStream objectInputStream = new ObjectInputStream(byteArrayInputStream);
        objectInputStream.close();
        return new ByteArrayInputStream(byteArray);
This is the DownloadLink class.
public class DownloadLink extends Anchor {
    public DownloadLink(Button button, String fileName, InputStreamFactory fileProvider) {
        super(new StreamResource(fileName, fileProvider), "");
        getElement().setAttribute("download", fileName);
        add(button);
        getStyle().set("display", "contents");
    }
}
This is the output file:
ObjectOutputStream is part of the Java serialization system. In addition to the data itself, it also includes metadata about the original Java types and such. It's only intended for writing data that will later be read back using ObjectInputStream.
To create a file for others to download, you could instead use a PrintWriter that wraps the original output stream. On the other hand, since you're using the output stream to create a byte[], a more straightforward (though slightly less efficient) way would be to concatenate all the array elements into a single string and then call getBytes(StandardCharsets.UTF_8) on it to get the byte array directly.
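A minimal sketch of that second approach (my own, assuming the downloadContent list from the question and newline-separated lines, which the original post does not specify):

var downloadButton = new DownloadLink(btn, "test.csv", () -> {
    // Join the lines into one string, then encode it as UTF-8 bytes
    String content = String.join("\n", downloadContent);
    return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
});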

Deserialize Avro Data from bytes

I am trying to deserialize, i.e., get an object of class org.apache.avro.generic.GenericRecord from byte array Avro data. This data contains a header with the full schema.
So far, I have tried this:
public List<GenericRecord> deserializeGenericWithSchema(byte[] message) throws IOException {
    List<GenericRecord> listOfRecords = new ArrayList<>();
    DatumReader<GenericRecord> reader = new GenericDatumReader<>();
    DataFileReader<GenericRecord> fileReader =
            new DataFileReader<>(new SeekableByteArrayInput(message), reader);
    GenericRecord record = null;
    while (fileReader.hasNext()) {
        listOfRecords.add(fileReader.next(record));
    }
    return listOfRecords;
}
But I am getting an error:
java.io.IOException: Invalid int encoding
    at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:145)
    at org.apache.avro.io.BinaryDecoder.readBytes(BinaryDecoder.java:282)
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:112)
    at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
However, if I write the byte array message to disk and change my function like this:
public List<GenericRecord> deserializeGenericWithSchema(String fileName) throws IOException {
    File file = new File(fileName);
    List<GenericRecord> listOfRecords = new ArrayList<>();
    DatumReader<GenericRecord> reader = new GenericDatumReader<>();
    DataFileReader<GenericRecord> fileReader =
            new DataFileReader<>(file, reader);
    GenericRecord record = null;
    while (fileReader.hasNext()) {
        listOfRecords.add(fileReader.next(record));
    }
    return listOfRecords;
}
It works flawlessly. I really don't want to write every Avro message I get to disk, because this is intended to work in real time.
What am I doing wrong in my first approach?
Do you have any follow-up on the issue? My assumption is that it is an encoding issue. Where did the byte[] come from? Is it exactly the same byte[] you are writing to disk? Maybe the explanation lies in the default encoding settings of both the file writer and the reader.
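One way to narrow this down (my own sketch, not part of the original answer) is to check whether the in-memory bytes really start with the Avro object-container magic before handing them to DataFileReader; if they do not, the array is probably a single raw datum, or it was corrupted by a character-encoding round trip on its way to you:

import java.util.Arrays;

public final class AvroBytesCheck {

    /** Returns true if the array begins with the Avro object container file magic "Obj" + 0x01. */
    public static boolean looksLikeContainerFile(byte[] message) {
        byte[] magic = {'O', 'b', 'j', 1};
        return message != null
                && message.length >= magic.length
                && Arrays.equals(Arrays.copyOfRange(message, 0, magic.length), magic);
    }
}

If that check fails, the bytes would need to be decoded as a single datum against a known schema (GenericDatumReader plus DecoderFactory.get().binaryDecoder) rather than with DataFileReader.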

How to compress and decompress between C++ and Java?

Hi all,
I am facing a problem compressing and decompressing data between Java and C++.
Here is the Java code that runs on the server:
public static byte[] CompressByDeflater(byte[] toCompress) throws IOException
{
    ByteArrayOutputStream compressedStream = new ByteArrayOutputStream();
    DeflaterOutputStream inflater = new DeflaterOutputStream(compressedStream);
    inflater.write(toCompress, 0, toCompress.length);
    inflater.close();
    return compressedStream.toByteArray();
}

public static byte[] DecompressByInflater(byte[] toDecompress) throws IOException
{
    ByteArrayOutputStream uncompressedStream = new ByteArrayOutputStream();
    ByteArrayInputStream compressedStream = new ByteArrayInputStream(toDecompress);
    InflaterInputStream inflater = new InflaterInputStream(compressedStream);
    int c;
    while ((c = inflater.read()) != -1)
    {
        uncompressedStream.write(c);
    }
    return uncompressedStream.toByteArray();
}
And I receive a binary file from the server.
Then I have to decompress it using C++.
Where do I start?
Your compression program uses zlib (see JDK documentation), so you need to use a C++ zlib library to decompress its output.
The zlib documentation is the place to start.
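As a sanity check on the Java side (my own sketch, not part of the original answer), you can confirm that DeflaterOutputStream produces a zlib (RFC 1950) stream by looking at its first bytes, which is the format the C++ side should hand to zlib's inflate():

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

public class ZlibHeaderCheck {

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream compressedStream = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(compressedStream)) {
            out.write("hello zlib".getBytes(StandardCharsets.UTF_8));
        }
        byte[] compressed = compressedStream.toByteArray();
        // A zlib stream begins with 0x78 when the default 32K window is used;
        // with the default compression level the second byte is typically 0x9C.
        System.out.printf("first bytes: %02X %02X%n", compressed[0], compressed[1]);
    }
}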

How to zip a single string with ZipOutputStream and save as a readable string

I tried to compress a string with DeflaterOutputStream and converted the output with Base64 to save the result in another string:
public static String compress(String str) throws IOException {
    byte[] data = str.getBytes("UTF-8");
    ByteArrayOutputStream stream = new ByteArrayOutputStream();
    java.util.zip.Deflater compresser = new java.util.zip.Deflater(java.util.zip.Deflater.BEST_COMPRESSION, true);
    DeflaterOutputStream deflaterOutputStream = new DeflaterOutputStream(stream, compresser);
    deflaterOutputStream.write(data);
    deflaterOutputStream.close();
    byte[] output = stream.toByteArray();
    return Base64Coder.encodeLines(output);
}
Now I wish to try ZipOutputStream. I tried:
public static String compress(String str) throws IOException {
    byte[] data = str.getBytes("UTF-8");
    ByteArrayOutputStream stream = new ByteArrayOutputStream();
    ZipOutputStream deflaterOutputStream = new ZipOutputStream(stream);
    deflaterOutputStream.setMethod(ZipOutputStream.DEFLATED);
    deflaterOutputStream.setLevel(8);
    deflaterOutputStream.write(data);
    deflaterOutputStream.close();
    byte[] output = stream.toByteArray();
    return Base64Coder.encodeLines(output);
}
But it doesn't work. ZipOutputStream seems oriented towards a structure of folders and files.
How can I do this?
ZipOutputStream is intended to produce a Zip file, which, as you noticed, is generally used as a container for files and folders (a "compressed folder" or "compressed directory tree", in other words).
If you merely want to compress a string and then convert it to some printable form, ZipOutputStream isn't really the right choice. GZIPOutputStream is more appropriate to that purpose, in my opinion.
Since you marked this question with an android tag, note the comment here: http://developer.android.com/reference/java/util/zip/GZIPOutputStream.html
Using GZIPOutputStream is a little easier than ZipOutputStream because GZIP is only for compression, and is not a container for multiple files.
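For reference, a minimal sketch of the GZIP route suggested above (my own, using java.util.Base64 instead of the Base64Coder class from the question):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipStringCodec {

    public static String compress(String str) throws IOException {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(stream)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8));
        }
        // Base64 turns the binary gzip output into a printable string
        return Base64.getEncoder().encodeToString(stream.toByteArray());
    }

    public static String decompress(String base64) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(base64);
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}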

Best way to write String to file using java nio

I need to write (append) a huge string to a flat file using Java NIO. The encoding is ISO-8859-1.
Currently we are writing it as shown below. Is there a better way to do the same?
public void writeToFile(Long limit) throws IOException {
    String fileName = "/xyz/test.txt";
    File file = new File(fileName);
    FileOutputStream fileOutputStream = new FileOutputStream(file, true);
    FileChannel fileChannel = fileOutputStream.getChannel();
    ByteBuffer byteBuffer = null;
    String messageToWrite = null;
    for (int i = 1; i < limit; i++) {
        //messageToWrite = get String Data From database
        byteBuffer = ByteBuffer.wrap(messageToWrite.getBytes(Charset.forName("ISO-8859-1")));
        fileChannel.write(byteBuffer);
    }
    fileChannel.close();
}
EDIT: Tried both options. Following are the results.
@Test
public void testWritingStringToFile() {
    DiagnosticLogControlManagerImpl diagnosticLogControlManagerImpl = new DiagnosticLogControlManagerImpl();
    try {
        File file = diagnosticLogControlManagerImpl.createFile();
        long startTime = System.currentTimeMillis();
        writeToFileNIOWay(file);
        //writeToFileIOWay(file);
        long endTime = System.currentTimeMillis();
        System.out.println("Total Time is " + (endTime - startTime));
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

/**
 * @param limit
 *            Long
 * @throws IOException
 *             IOException
 */
public void writeToFileNIOWay(File file) throws IOException {
    FileOutputStream fileOutputStream = new FileOutputStream(file, true);
    FileChannel fileChannel = fileOutputStream.getChannel();
    ByteBuffer byteBuffer = null;
    String messageToWrite = null;
    for (int i = 1; i < 1000000; i++) {
        messageToWrite = "This is a test üüüüüüööööö";
        byteBuffer = ByteBuffer.wrap(messageToWrite.getBytes(Charset.forName("ISO-8859-1")));
        fileChannel.write(byteBuffer);
    }
}

/**
 * @param limit
 *            Long
 * @throws IOException
 *             IOException
 */
public void writeToFileIOWay(File file) throws IOException {
    FileOutputStream fileOutputStream = new FileOutputStream(file, true);
    BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, 128 * 100);
    String messageToWrite = null;
    for (int i = 1; i < 1000000; i++) {
        messageToWrite = "This is a test üüüüüüööööö";
        bufferedOutputStream.write(messageToWrite.getBytes(Charset.forName("ISO-8859-1")));
    }
    bufferedOutputStream.flush();
    fileOutputStream.close();
}

private File createFile() throws IOException {
    File file = new File(FILE_PATH + "test_sixth_one.txt");
    file.createNewFile();
    return file;
}
Using ByteBuffer and Channel: took 4402 ms
Using buffered writer: took 563 ms
UPDATED:
Since Java 11 there is a specific method to write strings using java.nio.file.Files:
Files.writeString(Paths.get(file.toURI()), "My string to save");
We can also customize the writing with:
Files.writeString(Paths.get(file.toURI()),
        "My string to save",
        StandardCharsets.UTF_8,
        StandardOpenOption.CREATE,
        StandardOpenOption.TRUNCATE_EXISTING);
ORIGINAL ANSWER:
There is a one-line solution, using Java nio:
java.nio.file.Files.write(Paths.get(file.toURI()),
        "My string to save".getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE,
        StandardOpenOption.TRUNCATE_EXISTING);
I have not benchmarked this solution against the others, but using the built-in open-write-close implementation should be fast, and the code is quite small.
I don't think you will be able to get a strict answer without benchmarking your software. NIO may speed up the application significantly under the right conditions, but it may also make things slower.
Here are some points:
Do you really need strings? If you store and retrieve bytes from your database, you can avoid string allocation and encoding costs altogether.
Do you really need rewind and flip? It seems like you are creating a new buffer for every string and just writing it to the channel. (If you go the NIO way, benchmark strategies that reuse the buffers instead of wrapping/discarding; see the buffer-reuse sketch after these points. I think they will do better.)
Keep in mind that wrap and allocateDirect may produce quite different buffers. Benchmark both to grasp the trade-offs. With direct allocation, be sure to reuse the same buffer in order to achieve the best performance.
And the most important thing is: be sure to compare NIO with BufferedOutputStream and/or BufferedWriter approaches (use an intermediate byte[] or char[] buffer with a reasonable size as well). I've seen many, many, many people discovering that NIO is no silver bullet.
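To make the buffer-reuse point concrete, here is a minimal sketch (my own, under the assumption of a single-threaded writer and ISO-8859-1 text as in the question):

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class ReusedBufferWriter {

    public static void writeAll(String path, Iterable<String> messages) throws IOException {
        // One buffer allocated up front and reused for every message
        ByteBuffer buffer = ByteBuffer.allocate(8 * 1024);
        try (FileOutputStream fos = new FileOutputStream(path, true);
             FileChannel channel = fos.getChannel()) {
            for (String message : messages) {
                byte[] bytes = message.getBytes(StandardCharsets.ISO_8859_1);
                buffer.clear();
                buffer.put(bytes); // assumes each message fits in 8 KB
                buffer.flip();
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
            }
        }
    }
}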
If you fancy some bleeding edge... head back to the IO Trails for some NIO2 :D.
And here is an interesting benchmark about file copying using different strategies. I know it is a different problem, but I think most of the facts and the author's conclusions also apply to your problem.
Cheers,
UPDATE 1:
Since @EJP tipped me off that direct buffers wouldn't be efficient for this problem, I benchmarked it myself and ended up with a nice NIO solution using memory-mapped files. On my MacBook running OS X Lion this beats BufferedOutputStream by a solid margin, but keep in mind that this might be OS/hardware/VM specific:
public void writeToFileNIOWay2(File file) throws IOException {
    final int numberOfIterations = 1000000;
    final String messageToWrite = "This is a test üüüüüüööööö";
    final byte[] messageBytes = messageToWrite.getBytes(Charset.forName("ISO-8859-1"));
    final long appendSize = numberOfIterations * messageBytes.length;
    final RandomAccessFile raf = new RandomAccessFile(file, "rw");
    raf.seek(raf.length());
    final FileChannel fc = raf.getChannel();
    final MappedByteBuffer mbf = fc.map(FileChannel.MapMode.READ_WRITE, fc.position(), appendSize);
    fc.close();
    for (int i = 1; i < numberOfIterations; i++) {
        mbf.put(messageBytes);
    }
}
I admit that I cheated a little by calculating the total size to append (around 26 MB) beforehand. This may not be possible in several real-world scenarios. Still, you can always use a "big enough" append size for the operation and truncate the file afterwards.
UPDATE 2 (2019):
To anyone looking for a modern (as in, Java 11+) solution to the problem, I would follow @DodgyCodeException's advice and use java.nio.file.Files.writeString:
String fileName = "/xyz/test.txt";
String messageToWrite = "My long string";
Files.writeString(Paths.get(fileName), messageToWrite, StandardCharsets.ISO_8859_1);
A BufferedWriter around a FileWriter will almost certainly be faster than any NIO scheme you can come up with. Your code certainly isn't optimal, with a new ByteBuffer per write, and then doing pointless operations on it when it is about to go out of scope, but in any case your question is founded on a misconception. NIO doesn't 'offload the memory footprint to the OS' at all, unless you're using FileChannel.transferTo/From(), which you can't in this instance.
NB don't use a PrintWriter as suggested in comments, as this swallows exceptions. PW is really only for consoles and log files where you don't care.
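For completeness, a minimal sketch of that plain-IO route (my own, with the charset made explicit because the question requires ISO-8859-1 and a bare FileWriter would use the platform default on older Java versions):

import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class AppendWithBufferedWriter {

    public static void append(String path, String messageToWrite) throws IOException {
        // The second FileOutputStream argument (true) opens the file in append mode
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(path, true), StandardCharsets.ISO_8859_1))) {
            writer.write(messageToWrite);
        }
    }
}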
Here is a short and easy way. It creates the file if needed and appends the data, relative to your project directory:
private void writeToFile(String filename, String data) {
    Path p = Paths.get(".", filename);
    try (OutputStream os = new BufferedOutputStream(
            Files.newOutputStream(p, StandardOpenOption.CREATE, StandardOpenOption.APPEND))) {
        // Write the full encoded byte array; using data.length() as the byte count
        // would be wrong for multi-byte characters.
        os.write(data.getBytes());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
This works for me:
// Create a BufferedWriter for writing to the file
BufferedWriter writer = Files.newBufferedWriter(Paths.get(filePath));
writer.write(what);
// Don't forget to flush (and close) so that everything written actually reaches the file
writer.flush();
