Java InputStreamReader Error (org.apache.poi.openxml4j.exceptions.InvalidOperationException) - java

I am trying to convert PPTX files to plain text (text extraction) using the Apache POI framework (Java).
I'm new to Java, so I don't know much about BufferedReader, InputStream, etc.
What I tried is:
import org.apache.poi.xslf.XSLFSlideShow;
import org.apache.poi.xslf.extractor.XSLFPowerPointExtractor;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
... Classes and Stuff ....
String inputfile = "X:\\Master\\simpl_temp\\2d0a44a2-95e7-428c-911c-1f803acbff42.pptx";
InputStream fis = new FileInputStream(inputfile);
BufferedReader br1 = new BufferedReader(new InputStreamReader(fis));
String fileName = br1.readLine();
System.out.println(new XSLFPowerPointExtractor(new XMLSlideShow(new XSLFSlideShow(fileName))).getText());
br1.close();
My goal is to write the extracted text into a variable, but I can't even print it to the console. What I get is:
org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file: 'PK
org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:102)
org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:199)
org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:178)
org.apache.poi.POIXMLDocument.openPackage(POIXMLDocument.java:69)
org.apache.poi.xslf.XSLFSlideShow.<init>(XSLFSlideShow.java:90)
Any help would be greatly appreciated!

You are doing far too much. In fact, you are reading the contents of the PPTX itself and passing them as the file name. It is better to simply use:
System.out.println(new XSLFPowerPointExtractor(
new XMLSlideShow(new XSLFSlideShow(
"X:\\Master\\simpl_temp\\2d0a44a2-95e7-428c-911c-1f803acbff42.pptx"))).getText());
or, more generically:
POITextExtractor extractor = ExtractorFactory.createExtractor(
new java.io.File("X:\\Master\\simpl_temp\\2d0a44a2-95e7-428c-911c-1f803acbff42.pptx"));
System.out.println(extractor.getText());
extractor.close();
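To see where the 'PK in the exception message comes from, here is a minimal sketch that uses only the JDK (no POI). It assumes nothing beyond the fact that a .pptx file is a ZIP container; the class name PkDemo and the entry name slide1.xml are made up for illustration. Reading the first "line" of ZIP data as text yields a string starting with the ZIP signature "PK", and that string is what ended up being passed as the file name.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class PkDemo {
    /** Builds a tiny ZIP archive in memory (hypothetical stand-in for a .pptx file). */
    static byte[] tinyZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("slide1.xml"));
            zip.write("<slide/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Reads the first "line" of binary data as text, as the question's code does. */
    static String firstLine(byte[] data) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.ISO_8859_1))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // The "file name" is really the start of the archive's raw bytes.
        System.out.println(firstLine(tinyZip()).startsWith("PK")); // prints "true"
    }
}
```

In other words, the BufferedReader is not needed at all: the path string itself is what the POI constructor should receive.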

I cannot give you a definitive answer (because I don't use POI myself), but I can tell you where your mistake might lie.
The constructor of the class XSLFSlideShow expects a file path as its argument, but you are passing it the first line read from the file's contents. Try it as follows:
String filePath = "X:\\Master\\simpl_temp\\2d0a44a2-95e7-428c-911c-1f803acbff42.pptx";
System.out.println(new XSLFPowerPointExtractor(new XMLSlideShow(new XSLFSlideShow(filePath))).getText());

Related

Java, Reading a file that has UCS-2 Little Endian encoding

I'm trying to read a text file that has the UCS-2 LE encoding, using the code below. The ??? is the encoding name I need, but I am not sure what it's supposed to be.
InputStream HostFile = new FileInputStream(Location + FileName);
Reader file = new InputStreamReader(HostFile, Charset.forName(???));
PrintWriter writer = new PrintWriter(outLocation, "UTF-8");
Any ideas would be appreciated.
Reader file = new InputStreamReader(HostFile, Charset.forName("UTF-16LE"));
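For completeness, a minimal sketch of the whole read-and-rewrite step, assuming the goal is to copy the UCS-2 LE file to a UTF-8 file as the PrintWriter in the question suggests. The class and method names are made up for illustration. For text in the Basic Multilingual Plane, UCS-2 LE and UTF-16LE are byte-for-byte the same, which is why UTF-16LE works here. Note that UTF-16LE does not strip a leading byte order mark; if the file starts with one, the "UTF-16" charset, which detects the BOM, may be the better choice.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Ucs2ToUtf8 {
    /** Copies a UTF-16LE text file to a UTF-8 file, decoding and re-encoding as it goes. */
    static void transcode(Path source, Path target) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_16LE);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int read;
            // Copy characters, not bytes, so each charset is applied on its own side.
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
    }
}
```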

Create a csv or simple text file using only streams

I'm about to use a JSF PrimeFaces download button to download a CSV file.
The file doesn't exist yet, and I can't use the Export utility because I need to build the CSV at runtime.
This is a test attempt which works:
private StreamedContent file;
/** Getter,setter...*/
public void FileDownloadBean() {
InputStream stream = this.getClass().getResourceAsStream("test.csv");
file = new DefaultStreamedContent(stream, "application/csv", "test.csv");
}
The fact that I'm using PrimeFaces doesn't really matter here. What I want to achieve is to build a file of any kind, preferably CSV, without actually saving a (temp) file in the file system.
I would like to append my data using a stream, so that I can easily append and manipulate Strings, bytes, and image files.
Any ideas? Maybe a StringBuffer?
Thanks in advance.
I don't think you can "create a file without creating a file".
Use a String, StringBuffer, StringBuilder, or another variable to hold the file's contents in memory.
Edit: There are also in-memory streams: ByteArrayInputStream and ByteArrayOutputStream.
As far as I understand your question, the following answer may solve your problem:
public class InMemoryStreaming {
private StringBuilder sb = new StringBuilder();
public void FileDownloadBean() throws IOException {
InputStream csvStream = this.getClass().getResourceAsStream("test.csv");
try (BufferedReader br = new BufferedReader(new InputStreamReader(
csvStream))) {
// on every method call the StringBuilder is appended
sb.append(br.lines().collect(
Collectors.joining(System.lineSeparator())));
}
}
}
If you want to serialize the StringBuilder into a real File, you can do it with the appropriate writer.
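The in-memory streams mentioned above can be combined into a minimal sketch like the following. It assumes the rows are already available as string arrays and that no value contains commas, quotes, or line breaks (real CSV output would need quoting); the class and method names are made up for illustration. No file is created at any point.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class InMemoryCsv {
    /** Builds CSV content in memory and exposes it as an InputStream. */
    static InputStream build(List<String[]> rows) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintWriter out = new PrintWriter(
                new OutputStreamWriter(buffer, StandardCharsets.UTF_8))) {
            for (String[] row : rows) {
                out.print(String.join(",", row)); // assumption: values need no quoting
                out.print("\r\n");
            }
        } // closing the writer flushes everything into the byte buffer
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}
```

The returned stream could then be passed to DefaultStreamedContent in place of the getResourceAsStream result used in the question.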

read greek characters from xls file into java

I am trying to read an XLS file in Java and convert it to CSV. The problem is that it contains Greek characters. I have tried various methods with no success.
br = new BufferedReader(new InputStreamReader(
new FileInputStream(saveDir+"/"+fileName+".xls"), "UTF-8"));
FileWriter writer1 = new FileWriter(saveDir+"/A"+fileName+".csv");
byte[] bytes = thisLine.getBytes("UTF-8");
writer1.append(new String(bytes, "UTF-8"));
I tried that with different encodings, such as UTF-16 and windows-1253, and of course also without the byte array. None of them worked. Any ideas?
Use "ISO-8859-7" instead of "UTF-8". It covers Latin and Greek. See the documentation.
InputStream in = new BufferedInputStream(new FileInputStream(new File(myfile)));
result = new Scanner(in,"ISO-8859-7").useDelimiter("\\A").next();
A byte order mark (BOM) should be written at the start of the CSV file.
Can you try this code?
PrintWriter writer1 = new PrintWriter(saveDir+"/A"+fileName+".csv");
writer1.print('\ufeff');
....
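One caveat with the snippet above: PrintWriter(String) encodes with the platform default charset, so the BOM and the Greek text only come out as UTF-8 if that default happens to be UTF-8. A minimal sketch that names the charset explicitly might look like this; the class and method names are made up for illustration, and the content is assumed to be already available as a Java String.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class CsvWithBom {
    /** Writes text as UTF-8, preceded by the byte order mark U+FEFF (bytes EF BB BF). */
    static void write(Path target, String content) throws IOException {
        try (Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write('\uFEFF'); // lets spreadsheet applications detect UTF-8
            writer.write(content);
        }
    }
}
```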

Saving Information in Hindi

I am using this code to save data to a file. The data that ends up in the file is ????????. Please help me with a suitable solution.
File gpxfile = new File(activate, "activate.csv");
OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream(gpxfile),"UTF-8");
writer.write(merchantId);
It works for me. Make sure merchantId contains valid Hindi. For instance:
String str = "मानक हिन्दी";
writer.write(str);

Writing in the beginning of a text file Java

I need to write something at the beginning of a text file. I have a text file with content, and I want to write something before this content. Say I have:
Good afternoon sir,how are you today?
I'm fine,how are you?
Thanks for asking,I'm great
After modifying it, I want it to look like this:
Page 1-Scene 59
25.05.2011
Good afternoon sir,how are you today?
I'm fine,how are you?
Thanks for asking,I'm great
I just made up the content :) How can I modify a text file this way?
You can't really modify it that way - file systems don't generally let you insert data in arbitrary locations - but you can:
Create a new file
Write the prefix to it
Copy the data from the old file to the new file
Move the old file to a backup location
Move the new file to the old file's location
Optionally delete the old backup file
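The steps above can be sketched with java.nio.file as follows. This is only one possible reading of them, under a few assumptions: the prefix is written as UTF-8, the backup lives next to the original with a ".bak" suffix, and the new file is created in the same directory so the final move stays on one file system. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class PrependToFile {
    static void prepend(Path file, String prefix) throws IOException {
        Path dir = file.toAbsolutePath().getParent();
        // 1. Create a new file next to the old one.
        Path fresh = Files.createTempFile(dir, "prepend", ".tmp");
        Path backup = dir.resolve(file.getFileName() + ".bak"); // hypothetical backup name
        try (OutputStream out = Files.newOutputStream(fresh)) {
            // 2. Write the prefix to it.
            out.write(prefix.getBytes(StandardCharsets.UTF_8));
            // 3. Copy the data from the old file to the new file.
            Files.copy(file, out);
        }
        // 4. Move the old file to a backup location.
        Files.move(file, backup, StandardCopyOption.REPLACE_EXISTING);
        // 5. Move the new file to the old file's location.
        Files.move(fresh, file);
        // 6. Optionally delete the backup.
        Files.delete(backup);
    }
}
```

The old content is streamed rather than loaded into memory, so the file size is not a concern. One thing to be aware of: the new file is created by createTempFile, so its permissions may differ from those of the original.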
In case it is useful to someone, here is the full source code of a method that prepends a prefix to a file using the Apache Commons IO library. The code does not read the whole file into memory, so it works on files of any size.
public static void prependPrefix(File input, String prefix) throws IOException {
LineIterator li = FileUtils.lineIterator(input);
File tempFile = File.createTempFile("prependPrefix", ".tmp");
BufferedWriter w = new BufferedWriter(new FileWriter(tempFile));
try {
w.write(prefix);
while (li.hasNext()) {
w.write(li.next());
w.write("\n");
}
} finally {
IOUtils.closeQuietly(w);
LineIterator.closeQuietly(li);
}
FileUtils.deleteQuietly(input);
FileUtils.moveFile(tempFile, input);
}
I think what you want is random access. Check out the related Java tutorial. However, I don't believe you can just insert data at an arbitrary point in the file; if I recall correctly, you'd only overwrite the data. If you wanted to insert, you'd have to have your code
1. copy a block,
2. overwrite with your new stuff,
3. copy the next block,
4. overwrite with the previously copied block,
5. return to step 3 until there are no more blocks
As @atk suggested, java.nio.channels.SeekableByteChannel is a good interface, but it is only available from Java 7 onward.
Update: If you have no issue using FileUtils, then use
String fileString = FileUtils.readFileToString(file);
This isn't a direct answer to the question, but often files are accessed via InputStreams. If this is your use case, then you can chain input streams via SequenceInputStream to achieve the same result. E.g.
InputStream inputStream = new SequenceInputStream(new ByteArrayInputStream("my line\n".getBytes()), new FileInputStream(new File("myfile.txt")));
I will leave this here in case anyone needs it:
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
try (FileInputStream fileInputStream1 = new FileInputStream(fileName1);
FileInputStream fileInputStream2 = new FileInputStream(fileName2)) {
while (fileInputStream2.available() > 0) {
byteArrayOutputStream.write(fileInputStream2.read());
}
while (fileInputStream1.available() > 0) {
byteArrayOutputStream.write(fileInputStream1.read());
}
}
try (FileOutputStream fileOutputStream = new FileOutputStream(fileName1)) {
byteArrayOutputStream.writeTo(fileOutputStream);
}
