FileChannel not writing special characters properly - java

I'm trying to write some text into a file using a FileChannel. So far everything works fine except for the fact that umlauts are not written correctly.
Path fileChannel = Paths.get("c:/channel.txt");
try (FileChannel channel = FileChannel.open(
        fileChannel,
        StandardOpenOption.CREATE,
        StandardOpenOption.READ,
        StandardOpenOption.WRITE)) {
    String response = "Würzburg";
    channel.write(ByteBuffer.wrap(response.getBytes(StandardCharsets.UTF_8)));
}
In this example, I want to write Würzburg into the file, but when I open it, it contains the following: WÃ¼rzburg. The file itself is UTF-8.
Any suggestions as to what could be done?
edit:
Finally, I would like to read out the file again, for example like this:
try (FileChannel channel = FileChannel.open(fileChannel, StandardOpenOption.READ)) {
    byte[] buffer = new byte[(int) channel.size()];
    ByteBuffer bb = ByteBuffer.wrap(buffer);
    channel.read(bb);
    String request = new String(buffer, StandardCharsets.UTF_8);
    System.out.println(request);
}
However, a comparison of the strings response and request shows that they are not identical.
Running on a Windows machine, Java 17 (IntelliJ configured with Adoptium OpenJDK).
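For what it's worth: decoding the very same bytes as Windows-1252 (a common default for Windows editors) reproduces exactly this garbled output, which suggests the bytes on disk are correct UTF-8 and the program displaying the file is simply picking the wrong charset. A minimal sketch:
byte[] bytes = "Würzburg".getBytes(StandardCharsets.UTF_8);
// the UTF-8 bytes of "ü" (0xC3 0xBC) read as Windows-1252 become "Ã¼"
System.out.println(new String(bytes, Charset.forName("windows-1252"))); // prints WÃ¼rzburg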

Related

FileInputStream FileNotFound Exception in AIX with filename with special characters

I have a small Java application running inside IBM Integration Bus, which is installed on an AIX server with the character encoding set to ISO-8859-1.
My application is creating a ZIP file with the filenames received as a parameter. I have a file called "Websërvícès Guide.pdf" in the filesystem which I want to zip, but I'm unable to.
This is my code:
String zipFilePath = "/tmp/EventAttachments_2018.01.25.11.39.34.zip";
// Streams buffer
int BUFFER = 2048;
// Open I/O Buffered Streams
BufferedInputStream origin = null;
FileOutputStream dest = new FileOutputStream(zipFilePath);
ZipOutputStream out = new ZipOutputStream(new BufferedOutputStream(dest));
byte[] data = new byte[BUFFER];
// Open File Stream to my file
Path currentFilePath = Paths.get("/tmp/Websërvícès Guide.pdf");
InputStream fi = Files.newInputStream(currentFilePath, StandardOpenOption.READ);
origin = new BufferedInputStream(fi, BUFFER);
ZipEntry entry = new ZipEntry("Websërvícès Guide.pdf");
out.putNextEntry(entry);
int count;
while ((count = origin.read(data, 0, BUFFER)) != -1) {
    out.write(data, 0, count);
}
origin.close();
out.close();
This throws a "File Not Found" exception at the Files.newInputStream line.
I have read that Java does not behave properly when checking whether files with special characters exist, and so on. I'm not able to change the JVM parameters, as the code is executed inside an IBM JVM.
Any idea on how to solve this issue and pack the file properly in the ZIP?
Thank you
Can you try passing the following flag when running your Java program:
-Dsun.jnu.encoding=UTF-8
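For example (the main class name here is just a placeholder):
java -Dsun.jnu.encoding=UTF-8 MyZipApp
sun.jnu.encoding controls the charset the JVM uses for operating-system-level strings such as file paths, as opposed to file.encoding, which covers file contents.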
First: in your code, you are not taking care of any exceptions that could be thrown. I would suggest handling the exceptions inside the method, or making the method throw them and handling them at a higher level. But somewhere you need to handle the exception.
Maybe that's already the problem. (see https://stackoverflow.com/a/155655/8896833)
Second: according to ISO-8859-1, all the characters used in your filename should be covered. Are you really sure about the path your program is working in at the moment you are trying to access the file?
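As a quick sanity check (a minimal sketch using java.nio.file), you could print the working directory and ask the JVM whether it can see the file at all:
// current working directory, then whether the JVM sees the file
System.out.println(Paths.get("").toAbsolutePath());
System.out.println(Files.exists(Paths.get("/tmp/Websërvícès Guide.pdf")));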
Try using the URLDecoder class method decode(String string, String encoding).
For example:
String path = URLDecoder.decode("Websërvícès Guide.pdf", "UTF-8");

Trying to Change the Encoding of a File in Java is Doubling the Contents of the File

I have a FileOutputStream in Java that is reading the contents of UDP packets and saving them to a file. At the end of reading them, I sometimes want to convert the encoding of the file. The problem is that currently, when doing this, it just ends up doubling all the contents of the file. The only workaround I could think of would be to create a temp file with the new encoding and then save it as the original file, but this seems too hacky.
I must be just overlooking something in my code:
if (mode.equals("netascii")) {
    byte[] convert = new byte[(int) file.length()];
    FileInputStream input = new FileInputStream(file);
    input.read(convert);
    String temp = new String(convert);
    convert = Charset.forName("US-ASCII").encode(temp).array();
    fos.write(convert);
}
JOptionPane.showMessageDialog(frame, "Read Successful!");
fos.close();
}
Is there anything suspect?
Thanks in advance for any help!
The problem is that the array of bytes you've read from the InputStream will be converted as if it contains ASCII chars, which I'm assuming it doesn't. Specify the InputStream's encoding when converting its bytes to a String and you'll get a standard Java string.
I've assumed UTF-16 as the InputStream's encoding here:
byte[] convert = new byte[(int) file.length()];
FileInputStream input = new FileInputStream(file);
// read file bytes until EOF, allowing for partial reads
int total = 0;
int r;
while (total < convert.length
        && (r = input.read(convert, total, convert.length - total)) != -1) {
    total += r;
}
String temp = new String(convert, StandardCharsets.UTF_16);
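Alternatively, and still assuming UTF-16, an InputStreamReader specifies the encoding at the stream level instead of converting after the fact; a minimal sketch:
StringBuilder sb = new StringBuilder();
try (Reader reader = new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_16)) {
    char[] chunk = new char[4096];
    int n;
    // read decoded characters until EOF
    while ((n = reader.read(chunk)) != -1) {
        sb.append(chunk, 0, n);
    }
}
String temp = sb.toString();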

org.xmlpull.v1.XmlPullParserException

I'm trying to bind an XML file (as a byte[]) to a Java object. This is my code:
public void inputConfigXML(String xmlfile, byte[] xmlData) {
    IBindingFactory bFact = BindingDirectory.getFactory(GroupsDTO.class);
    IUnmarshallingContext uctx = bFact.createUnmarshallingContext();
    groups = (GroupsDTO) uctx.unmarshalDocument(new ByteArrayInputStream(xmlData), "UTF8");
}
The unmarshalDocument() call is giving me this exception. What do I do?
FYI: Running as JUnit test case
The following is the stack trace:
Error parsing document (line 1, col 1)
org.xmlpull.v1.XmlPullParserException: only whitespace content allowed before start tag and not \u0 (position: START_DOCUMENT seen \u0... #1:1)
at org.xmlpull.mxp1.MXParser.parseProlog(MXParser.java:1519)
at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1395)
at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
at org.jibx.runtime.impl.XMLPullReaderFactory$XMLPullReader.next(XMLPullReaderFactory.java:291)
at org.jibx.runtime.impl.UnmarshallingContext.toStart(UnmarshallingContext.java:451)
at org.jibx.runtime.impl.UnmarshallingContext.unmarshalElement(UnmarshallingContext.java:2755)
at org.jibx.runtime.impl.UnmarshallingContext.unmarshalDocument(UnmarshallingContext.java:2905)
at abc.dra.DRAAPI.inputConfigXML(DRAAPI.java:31)
at abc.dra.XMLToObject_Test.test(XMLToObject_Test.java:34)
[...]
This is my code that forms the byte[]:
void test() {
    String xmlfile = "output.xml";
    File file = new File(xmlfile);
    byte[] xmlData = new byte[(int) file.length()];
    dra.inputConfigXML(xmlfile, xmlData);
}
The ByteArrayInputStream is empty:
only whitespace content allowed before start tag and not \u0
(position: START_DOCUMENT seen \u0... #1:1)
means that a \u0 byte was found as the first char within the XML.
Ensure you have content within your byte[] and that the UTF-8 doesn't start with a BOM.
I don't think that the BOM is your problem here, but I have often encountered problems regarding BOMs and Java.
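If you want to rule the BOM out explicitly, here is a minimal sketch (the UTF-8 BOM is the byte sequence 0xEF 0xBB 0xBF):
// strip a leading UTF-8 BOM from the byte[] if present
if (xmlData.length >= 3
        && (xmlData[0] & 0xFF) == 0xEF
        && (xmlData[1] & 0xFF) == 0xBB
        && (xmlData[2] & 0xFF) == 0xBF) {
    xmlData = Arrays.copyOfRange(xmlData, 3, xmlData.length);
}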
Update
You don't fill the byte[]. You have to read the file content into the byte[]:
read this: File to byte[] in Java
By the way: byte[] xmlData = new byte[(int) file.length()]; is bad code style, because you will run into problems with larger XML files. If they are larger than Integer.MAX_VALUE bytes, you will read a corrupt file.
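For example, with java.nio this is a one-liner (it still reads the whole file into memory, so the same size caveat applies):
byte[] xmlData = Files.readAllBytes(Paths.get("output.xml"));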
Hari,
JiBX needs characters as input. I think you have specified your encoding incorrectly. Try this code instead:
FileInputStream fis = new FileInputStream("output.xml");
InputStreamReader isr = new InputStreamReader(fis, "UTF8");
groups = (GroupsDTO) uctx.unmarshalDocument(isr);
If you must use the code you have written, I would try outputting the text to the console (System.out.println(xxx)) to make sure you are decoding the UTF-8 correctly.
Don
Go to the mvn repository path and delete that folder for the XML file.

Out of memory when encoding file to base64

Using Base64 from Apache Commons:
public byte[] encode(File file) throws FileNotFoundException, IOException {
    byte[] encoded;
    try (FileInputStream fin = new FileInputStream(file)) {
        byte[] fileContent = new byte[(int) file.length()];
        fin.read(fileContent);
        encoded = Base64.encodeBase64(fileContent);
    }
    return encoded;
}
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
at org.apache.commons.codec.binary.BaseNCodec.encode(BaseNCodec.java:342)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:657)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:622)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:604)
I'm making a small app for a mobile device.
You cannot just load the whole file into memory, like here:
byte fileContent[] = new byte[(int) file.length()];
fin.read(fileContent);
Instead, load the file chunk by chunk and encode it in parts. Base64 is a simple encoding; it is enough to load 3 bytes and encode them at a time (this will produce 4 bytes after encoding). For performance reasons, consider loading multiples of 3 bytes, e.g. 3000 bytes, which should be just fine. Also consider buffering the input file.
An example:
byte[] fileContent = new byte[3000];
try (FileInputStream fin = new FileInputStream(file)) {
    int read;
    while ((read = fin.read(fileContent)) >= 0) {
        // encode only the bytes actually read in this chunk
        Base64.encodeBase64(Arrays.copyOf(fileContent, read));
    }
}
Note that you cannot simply append the results of Base64.encodeBase64() to one encoded byte array. Actually, it is not loading the file but encoding it to Base64 that causes the out-of-memory problem. This is understandable, because the Base64 version is bigger (and you already have a file occupying a lot of memory).
Consider changing your method to:
public void encode(File file, OutputStream base64OutputStream)
and sending Base64-encoded data directly to the base64OutputStream rather than returning it.
UPDATE: Thanks to @StephenC, I developed a much easier version:
public void encode(File file, OutputStream base64OutputStream) throws IOException {
    InputStream is = new FileInputStream(file);
    OutputStream out = new Base64OutputStream(base64OutputStream);
    IOUtils.copy(is, out);
    is.close();
    out.close();
}
It uses Base64OutputStream that translates input to Base64 on-the-fly and IOUtils class from Apache Commons IO.
Note: you must close the FileInputStream and Base64OutputStream explicitly, so that the trailing = padding is written if required, but buffering is handled by IOUtils.copy().
Either the file is too big, or your heap is too small, or you've got a memory leak.
If this only happens with really big files, put something into your code to check the file size and reject files that are unreasonably big.
If this happens with small files, increase your heap size by using the -Xmx command line option when you launch the JVM. (If this is in a web container or some other framework, check the documentation on how to do it.)
If the problem recurs, especially with small files, the chances are that you've got a memory leak.
The other point that should be made is that your current approach entails holding two complete copies of the file in memory. You should be able to reduce the memory usage, though you'll typically need a stream-based Base64 encoder to do this. (It depends on which flavor of the base64 encoding you are using ...)
This page describes a stream-based Base64 encoder / decoder library, and includes links to some alternatives.
Well, do not do it for the whole file at once.
Base64 works on 3 bytes at a time, so you can read your file in batches of "multiple of 3" bytes, encode them and repeat until you finish the file:
// the base64 encoding - acceptable estimation of encoded size
StringBuilder sb = new StringBuilder((int) (file.length() / 3 * 4));
FileInputStream fin = null;
try {
    fin = new FileInputStream("some.file");
    // Max size of buffer (a multiple of 3, so each chunk encodes without padding)
    int bSize = 3 * 512;
    // Buffer
    byte[] buf = new byte[bSize];
    // Actual number of bytes read
    int len = 0;
    while ((len = fin.read(buf)) != -1) {
        byte[] encoded = Base64.encodeBase64(Arrays.copyOf(buf, len));
        // Although you might want to write the encoded bytes to another
        // stream, otherwise you'll run into the same problem again.
        sb.append(new String(encoded, StandardCharsets.US_ASCII));
    }
} finally {
    if (null != fin) {
        fin.close();
    }
}
String base64EncodedFile = sb.toString();
You are not reading the whole file, just the first few kB. The read method returns how many bytes were actually read. You should call read in a loop until it returns -1 to be sure that you have read everything.
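A minimal sketch of such a loop (file and fin as in the question's code):
byte[] fileContent = new byte[(int) file.length()];
int offset = 0;
while (offset < fileContent.length) {
    int read = fin.read(fileContent, offset, fileContent.length - offset);
    if (read == -1) {
        break; // unexpected end of file
    }
    offset += read;
}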
The file is too big for both it and its base64 encoding to fit in memory. Either
process the file in smaller pieces or
increase the memory available to the JVM with the -Xmx switch, e.g.
java -Xmx1024M YourProgram
This is the best code to upload a larger image:
bitmap=Bitmap.createScaledBitmap(bitmap, 100, 100, true);
ByteArrayOutputStream stream = new ByteArrayOutputStream();
bitmap.compress(Bitmap.CompressFormat.PNG, 100, stream); //compress to which format you want.
byte [] byte_arr = stream.toByteArray();
String image_str = Base64.encodeBytes(byte_arr);
Well, looks like your file is too large to keep the multiple copies necessary for an in-memory Base64 encoding in the available heap memory at the same time. Given that this is for a mobile device, it's probably not possible to increase the heap, so you have two options:
make the file smaller (much smaller), or
do it in a stream-based way, reading from an InputStream one small part of the file at a time, encoding it, and writing it to an OutputStream, without ever keeping the entire file in memory.
In the Manifest, in the application tag, write the following:
android:largeHeap="true"
It worked for me
Java 8 added Base64 methods, so Apache Commons is no longer needed to encode large files.
public static void encodeFileToBase64(String inputFile, String outputFile) {
    try (OutputStream out = Base64.getEncoder().wrap(new FileOutputStream(outputFile))) {
        Files.copy(Paths.get(inputFile), out);
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
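For completeness, decoding works the same way through the decoder's wrap method; a sketch, with hypothetical file names:
public static void decodeBase64ToFile(String inputFile, String outputFile) {
    try (InputStream in = Base64.getDecoder().wrap(new FileInputStream(inputFile))) {
        Files.copy(in, Paths.get(outputFile)); // fails if outputFile already exists
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}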

Java program's downloaded file is corrupt. Why?

I have created a java program that downloads a file from a URL part by part into several files, then reads the bytes from those files into the full downloaded object. It works by separating sections of the file to be downloaded into threads. Every time my program downloads a file it gets all of the bytes and the file size is correct, but sometimes with an image the picture is distorted. Other times the image is perfect. What would cause this?
code that individual threads use to download file parts:
URL xyz = new URL(urlStr);
URLConnection connection = xyz.openConnection();
// set the download range
connection.setRequestProperty("Range", "bytes=" + fileOffset + "-");
connection.setDoInput(true);
connection.setDoOutput(true);
// set input stream and output stream
in = new BufferedInputStream(connection.getInputStream());
fos = new FileOutputStream("part_" + this.partNumber);
out = new BufferedOutputStream(fos, this.downloadFileSize);
// create buffer to read bytes from file into
byte[] contentBytes = new byte[downloadFileSize];
// read contents into buffer
in.read(contentBytes, 0, this.downloadFileSize);
out.write(contentBytes, 0, this.downloadFileSize);
code that puts the file together:
int partSize = 0;
// Create output stream
OutputStream saveAs = new FileOutputStream(fileName);
for (int i = 0; i < filePieces; i++) {
    File file = new File("part_" + (i + 1));
    partSize = (int) file.length();
    byte[] fileBuffer = new byte[partSize];
    // Create input stream
    InputStream is = new FileInputStream(file);
    is.read(fileBuffer);
    saveAs.write(fileBuffer);
    is.close();
}
Without further details and sample code you're forcing any answers to be guesses. Here are mine:
You're using Readers and Writers when you should use Input- / OutputStreams.
You've messed up the synchronization somehow. Favour classes from the java.util.concurrent package over home-grown synchronized solutions.
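To make the first guess concrete, here is a minimal sketch of a byte-safe copy loop, using the in and out streams from the thread code above; note that a single read() call is not guaranteed to fill the buffer:
byte[] buffer = new byte[8192];
int n;
// keep copying until the stream signals end of data
while ((n = in.read(buffer)) != -1) {
    out.write(buffer, 0, n);
}
out.flush();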
