Trying to compare generated inputStream with resource file - jUnit - java

I'm trying to compare the inputStream from a resource file with a created inputStream.
I'm doing that the following way:
InputStream isAsJpg = Thread.currentThread().getContextClassLoader()
.getResourceAsStream("koala.jpg");
InputStream returnedIs = ImageUtil.convertImageStreamToPdfStream(isAsJpg);
//get is from src/test/resources
InputStream expectedIs = Thread.currentThread().getContextClassLoader()
.getResourceAsStream("koala.pdf");
and for my tests I'm calling:
assertTrue(IOUtils.contentEquals(expectedIs, returnedIs));
but it returns false. Therefore I started writing the streams to files so that I could check manually whether a file is empty or otherwise broken. So I added:
File tempFile = File.createTempFile("koala", ".pdf");
tempFile.deleteOnExit();
try (FileOutputStream out = new FileOutputStream(tempFile)) {
IOUtils.copy(returnedIs, out);
}
I checked the content of that file manually and it seems OK. I then created a file from the resource in the same way to check its content, and that PDF was empty, although the file is in the src/test/resources directory and is not empty when I open it there.
What am I doing wrong? It seems that I'm not loading the resource (koala.pdf) correctly, but I can't find the error.
EDIT:
When I look in
C:..\target\test-classes
the file is there, but it is empty (blank page). When I open it from
C:..\src\test\resources
it is not empty. How can that be?

I've found a solution.
Maven can be configured to replace placeholders of the form ${..} in resource files (resource filtering). My binary PDF content was run through that filtering as well, and therefore the file got corrupted.
I've changed the filtering in my pom:
<testResources>
<testResource>
<directory>src/test/resources</directory>
<filtering>false</filtering>
</testResource>
</testResources>
and the file in the target/test-classes also contains the image now.
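To confirm that the copy under target/test-classes really differs from the file under src/test/resources, it can help to find the first byte where two streams diverge. This is only a minimal sketch; the class name StreamDiff and the method firstMismatch are made up for illustration and are not part of IOUtils.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of any library mentioned above.
final class StreamDiff {
    /** Returns the offset of the first differing byte, or -1 if both streams have identical content. */
    static long firstMismatch(InputStream expected, InputStream actual) throws IOException {
        InputStream a = new BufferedInputStream(expected);
        InputStream b = new BufferedInputStream(actual);
        long offset = 0;
        while (true) {
            int x = a.read();
            int y = b.read();
            if (x != y) {
                return offset; // also covers the case where one stream ends early
            }
            if (x == -1) {
                return -1; // both streams ended at the same position
            }
            offset++;
        }
    }
}
```

A mismatch at a very small offset for a file that looks fine in src/test/resources points at the build step rather than at the code under test.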

Does this line
InputStream returnedIs = ImageUtil.convertImageStreamToPdfStream(isAsJpg);
actually write to a file? It doesn't seem like it. It seems like you should use the InputStream returned to write to a corresponding OutputStream. Then continue as you were.

Related

How to read PDF from the .jar file

In my maven project I have PDF file which is located inside resources folder. My function reads the PDF file from the resources folder and adds some values in the document based on the user's data.
This project is packed as .jar file using mvn clean install and is used as dependency in my other spring boot application.
In my Spring Boot project I create an instance of the class that performs some work on the PDF. Once all the work on the PDF file is done and the PDF file is saved to the file system, it is always empty (all pages are blank). I have the impression that mvn clean install does something to the PDF file. Here is what I've tried so far:
First way
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
File file= new ClassPathResource("/pdfs/testpdf.pdf").getFile();//Try to get PDF file
PDDocument pdf = PDDocument.load(file);//Load PDF document from the file
List<PDField> fields = forms.getFields();//Get input fields that I want to update in the PDF
fieldsMap.forEach(throwingConsumerWrapper((field,value) -> changeField(fields,field,value)));//Set input field values
pdf.save(byteArrayOutputStream);//Save value to the byte array
This works great, but as soon as the project is packed into a .jar file I get an exception saying that new ClassPathResource("/pdfs/testpdf.pdf").getFile(); can't find the specified file.
This is normal because the File class can't access anything inside .jar file (it can access the .jar file itself only) and that is clear.
So, the solution to that problem is to use the InputStream instead of the File. Here is what I did:
Second way
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
InputStream inputStream = new ClassPathResource("/pdfs/testpdf.pdf").getInputStream();//Try to get input stream
PDDocument pdf = PDDocument.load(inputStream );//Load PDF document from the input stream
List<PDField> fields = forms.getFields();//Get input fields that I want to update in the PDF
fieldsMap.forEach(throwingConsumerWrapper((field,value) -> changeField(fields,field,value)));//Set input field values
pdf.save(byteArrayOutputStream);//Save value to the byte array
This time getInputStream() doesn't throw an error and the inputStream object is not null. But the PDF file, once saved to my file system, is empty, meaning all pages are blank.
I even tried copying the complete inputStream and saving it to a file byte by byte, but I noticed that every byte is equal to 0. Here is what I did:
Third way
InputStream inputStream = new ClassPathResource("/pdfs/test.pdf").getInputStream();
byte[] buffer = new byte[inputStream.available()];
inputStream.read(buffer);
File targetFile = new File(OUTPUT_FOLDER);
OutputStream outStream = new FileOutputStream(targetFile);
outStream.write(buffer);
The copied test.pdf is saved, but when opened with Adobe Reader it is reported as corrupted.
Does anyone have an idea how to fix this?
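A side note on the third attempt, independent of the corruption itself: available() is only an estimate, and a single read(buffer) call is allowed to return fewer bytes than the buffer holds. A more robust copy loops until the stream reports its end. The sketch below assumes nothing beyond the JDK; the class name ReadFully is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration.
final class ReadFully {
    /** Reads the stream until end of stream and returns everything that was read. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() may deliver fewer bytes than requested, so keep going until it returns -1.
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

On Java 9 and later, InputStream.readAllBytes() does the same job in a single call.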
You have to load it like this:
InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("pdfs/testpdf.pdf");
If you load it via the ClassLoader, the path is always resolved from the root of the classpath and must not start with a slash.
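The difference a leading slash makes for ClassLoader lookups can be tried out without a real project. The following sketch builds a throwaway classpath root in the temp directory; the class name ResourcePathDemo and the sample file content are assumptions made for the demonstration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the directory layout mirrors the /pdfs/test.pdf path from the question.
final class ResourcePathDemo {
    /** True if the loader can open a resource under the given name. */
    static boolean canOpen(ClassLoader loader, String name) throws IOException {
        try (InputStream in = loader.getResourceAsStream(name)) {
            return in != null;
        }
    }

    /** Creates a temp directory containing pdfs/test.pdf and returns a class loader rooted there. */
    static URLClassLoader loaderWithSampleResource() throws IOException {
        Path root = Files.createTempDirectory("classpath-root");
        Path pdfDir = Files.createDirectories(root.resolve("pdfs"));
        Files.write(pdfDir.resolve("test.pdf"), new byte[] {'%', 'P', 'D', 'F'});
        return new URLClassLoader(new URL[] {root.toUri().toURL()});
    }

    public static void main(String[] args) throws IOException {
        try (URLClassLoader loader = loaderWithSampleResource()) {
            System.out.println(canOpen(loader, "pdfs/test.pdf"));  // true
            System.out.println(canOpen(loader, "/pdfs/test.pdf")); // false
        }
    }
}
```

Class.getResourceAsStream behaves differently: there a leading slash means "from the classpath root", and a name without it is resolved relative to the package of the class.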
After a few hours of investigation and good input from @Simon Martinelli and @Tilman Hausherr I had 2 issues to solve:
First issue - Read the file correctly
In order to read a file from the resources folder you have to use appropriate classes. As stated above you can't use File class to read the file from the .jar and I used the following construction in my case:
InputStream inputStream = CreatePDF.class.getResourceAsStream("/pdfs/test.pdf");
PDDocument pdf = PDDocument.load(inputStream);
In my case the code runs in a static context, so I refer to the class by name (CreatePDF.class). In an instance method you can use the following instead:
InputStream inputStream = this.getClass().getResourceAsStream("/pdfs/test.pdf");
PDDocument pdf = PDDocument.load(inputStream);
Second issue - My original problem
One thing I noticed in my third example of my question is, when I'm copying file byte by byte from the resources to my local folder then all bytes were equal to 0. I knew this can't be correct so I tried to do the same thing with simple .txt file and in that case everything worked correctly. This means mvn clean install was causing some problems on PDF files.
After some investigation I realized that mvn filters are causing the problem. If resource filters are enabled:
<resource>
<directory>src/main/resources</directory>
<filtering>true</filtering>
</resource>
then your binary data is going to be corrupted and that was my original problem. When I set it to false it worked like expected.
Here is the warning from the Maven documentation:
Warning: Do not filter files with binary content like images! This
will most likely result in corrupt output.
If you have both text files and binary files as resources it is
recommended to have two separated folders. One folder
src/main/resources (default) for the resources which are not filtered
and another folder src/main/resources-filtered for the resources which
are filtered.
Here is an example how you could do it:
<resource>
<directory>src/main/resources</directory>
<filtering>true</filtering>
<includes>
<include>**/*.properties</include>
<include>**/*.xml</include>
<include>**/*.txt</include>
<include>**/*.html</include>
</includes>
</resource>
<resource>
<directory>src/main/resources</directory>
<filtering>false</filtering>
<includes>
<include>**/*.pdf</include>
</includes>
</resource>
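The posts above do not say exactly how filtering damages the bytes. One plausible contributor (an assumption here, not something stated in the answers) is that a filter has to treat each file as text in some encoding, and binary data does not survive being decoded and encoded again. The sketch below shows that effect in isolation; the class name TextRoundTrip and the sample bytes are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo; it does not call Maven, it only mimics a text round trip.
final class TextRoundTrip {
    /** True if decoding the bytes as text and encoding them again yields the same bytes. */
    static boolean survives(byte[] data, Charset charset) {
        byte[] roundTripped = new String(data, charset).getBytes(charset);
        return Arrays.equals(data, roundTripped);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four bytes above 0x7F, which are not valid UTF-8 in this order.
        byte[] pdfLike = {'%', 'P', 'D', 'F', (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        System.out.println(survives(pdfLike, StandardCharsets.UTF_8));      // false
        System.out.println(survives(pdfLike, StandardCharsets.ISO_8859_1)); // true
    }
}
```

Whatever the exact mechanism, the practical fix stays the same as above: keep binary resources out of any filtered resource set.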

InputStream from jar-File returns always null

I know this question has been asked several times, but I think my problem differs a bit from the others:
String resourcePath = "/Path/To/Resource.jar";
File newFile = new File(resourcePath);
InputStream in1 = this.getClass().getResourceAsStream(resourcePath);
InputStream in2 = this.getClass().getClassLoader().getResourceAsStream(resourcePath);
The File object newFile is completely fine (the .jar file has been found and you can get its metadata, such as newFile.length()).
The InputStreams, on the other hand, are always null.
I know the Javadoc says that getResourceAsStream() returns null if no resource with this name is found, but the file is there! (Obviously, because the File object finds it.)
Does anyone know why this happens and how I can fix it so that I can read the .jar file through an InputStream?
The getResourceAsStream() method doesn't load a file from the file system; it loads a resource from the classpath. You can use it to load, for example, a property file that's packaged inside your JAR. You cannot use it to load a file from the file system.
So, if your file resides on the file system, rather than in your JAR file, better use the FileInputStream class.
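As a rough illustration of the two lookups, the sketch below opens a file by its file-system path and reports where a given location string can actually be found. The class name FileSystemRead and both method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration.
final class FileSystemRead {
    /** Opens a file by its file-system path; the classpath is not consulted. */
    static InputStream open(String fileSystemPath) throws IOException {
        return Files.newInputStream(Paths.get(fileSystemPath));
    }

    /** Reports whether the location resolves on the file system, on the classpath, or not at all. */
    static String locate(Class<?> anchor, String location) {
        if (Files.exists(Paths.get(location))) {
            return "file system";
        }
        if (anchor.getResource(location) != null) {
            return "classpath";
        }
        return "not found";
    }
}
```

For the path in the question, locate(...) would be expected to report "file system", which is why getResourceAsStream() has nothing to return.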

Reading File In JAR using Relative Path

I have some text configuration file that need to be read by my program. My current code is:
protected File getConfigFile() {
URL url = getClass().getResource("wof.txt");
return new File(url.getFile().replaceAll("%20", " "));
}
This works when I run it locally in eclipse, though I did have to do that hack to deal with the space in the path name. The config file is in the same package as the method above. However, when I export the application as a jar I am having problems with it. The jar exists on a shared, mapped network drive Z:. When I run the application from command line I get this error:
java.io.FileNotFoundException: file:\Z:\apps\jar\apps.jar!\vp\fsm\configs\wof.txt
How can I get this working? I just want to tell java to read a file in the same directory as the current class.
Thanks,
Jonah
When the file is inside a jar, you can't use the File class to represent it, since its location is a jar: URI. Instead, the URL class itself already lets you read the contents through openStream().
Or you can shortcut this by using getResourceAsStream() instead of getResource().
To get a BufferedReader (which is easier to use, as it has a readLine() method), use the usual stream-wrapping:
InputStream configStream = getClass().getResourceAsStream("wof.txt");
BufferedReader configReader = new BufferedReader(new InputStreamReader(configStream, "UTF-8"));
Instead of "UTF-8" use the encoding actually used by the file (i.e. which you used in the editor).
Another point: Even if you only have file: URIs, you should not do the URL to File-conversion yourself, instead use new File(url.toURI()). This works for other problematic characters as well.
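The last point can be reproduced with a directory whose name contains a space. This is a small sketch for file: URLs only (a jar: URL cannot be turned into a File this way); the class name UrlToFile and the sample directory name are assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class.
final class UrlToFile {
    /** Converts a file: URL to a File, decoding escapes such as %20 correctly. */
    static File toFile(URL url) throws URISyntaxException {
        return new File(url.toURI());
    }

    /** Creates wof.txt inside a temp directory whose name contains spaces and returns its URL. */
    static URL sampleUrlWithSpace() throws IOException {
        Path dir = Files.createTempDirectory("dir with space");
        Path file = Files.write(dir.resolve("wof.txt"), new byte[] {'o', 'k'});
        return file.toUri().toURL();
    }

    public static void main(String[] args) throws Exception {
        URL url = sampleUrlWithSpace();
        System.out.println(new File(url.getFile()).exists()); // false: the path still contains "%20"
        System.out.println(toFile(url).exists());             // true
    }
}
```

This removes the need for the replaceAll("%20", " ") workaround and also handles other escaped characters.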

get File from JAR

I'm using Spring's Resource abstraction to work with resources (files) in the filesystem. One of the resources is a file inside a JAR file. According to the following code, it appears the reference is valid
ResourcePatternResolver resourceResolver = new PathMatchingResourcePatternResolver();
// The path to the resource from the root of the JAR file
Resource templateResource = resourceResolver.getResource("/META-INF/foo/file.txt");
templateResource.exists(); // returns true
templateResource.isReadable(); // returns true
At this point, all is well, but then when I try to convert the Resource to a File
templateResource.getFile();
I get the exception
java.io.FileNotFoundException: class path resource [META-INF/foo/file.txt] cannot be resolved to absolute file path because it does not reside in the file system: jar:file:/D:/m2repo/uic-3.2.6-0.jar!/META-INF/foo/file.txt
at org.springframework.util.ResourceUtils.getFile(ResourceUtils.java:198)
at org.springframework.core.io.ClassPathResource.getFile(ClassPathResource.java:174)
What is the correct way to get a File reference to a Resource that exists inside a JAR file?
What is the correct way to get a File
reference to a Resource that exists
inside a JAR file?
The correct way is not doing that at all because it's impossible. A File represents an actual file on a file system, which a JAR entry is not, unless you have a special file system for that.
If you just need the data, use getInputStream(). If you have to satisfy an API that demands a File object, then I'm afraid the only thing you can do is to create a temp file and copy the data from the input stream to it.
If you want to read it, just call resource.getInputStream()
The exception message is pretty clear: the file does not reside on the file system, so you can't have a File instance. Besides, what would you do with that File, apart from reading its content?
A quick look at the link you provided for Resource documentation, says the following:
Throws: IOException if the resource cannot be resolved as absolute file path,
i.e. if the resource is not available in a file system
Maybe the text file is inside a jar? In that case you will have to use getInputStream() to read its contents.
Just adding an example to the answers here. If you need a File (and not just the contents of it) from within your JAR, you need to create a temporary file from the resource first. (The below is written in Groovy):
InputStream inputStream = resourceLoader.getResource('/META-INF/foo/file.txt').inputStream
File tempFile = new File('file.txt')
OutputStream outputStream = new FileOutputStream(tempFile)
try {
IOUtils.copy(inputStream, outputStream)
} catch (IOException e) {
// Handle exception
} finally {
outputStream.close()
}
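For comparison, here is a plain Java take on the same idea that writes to a real temp file instead of the working directory. It is only a sketch; the class name ResourceToTempFile and the "resource-" prefix are arbitrary choices.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration.
final class ResourceToTempFile {
    /** Copies the stream into a new temp file that is deleted when the JVM exits. */
    static File copyToTempFile(InputStream in, String suffix) throws IOException {
        Path temp = Files.createTempFile("resource-", suffix);
        temp.toFile().deleteOnExit();
        // Closing the source here also covers the case where the copy fails halfway.
        try (InputStream source = in) {
            Files.copy(source, temp, StandardCopyOption.REPLACE_EXISTING);
        }
        return temp.toFile();
    }
}
```

With a Spring Resource this could be called as copyToTempFile(resource.getInputStream(), ".txt").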

java FileInputStream - differences based on how the File object is referenced: classloader/filesystem

I'm using apache POI to extract some data from an excel file.
I need an InputStream to instantiate the POI HSSFWorkbook class HSSFWorkbook wb = new HSSFWorkbook(inputStreamX);
I'm finding differences if I try to construct the InputStream object like
InputStream inputStream = new FileInputStream(new File("/home/xxx/workspace/myproject/test/resources/importTest.xls"));
InputStream inputStream2 = new FileInputStream(getClass().getResource("/importTest.xls").getFile());
InputStream inputStream3 = new ClassPathResource("importTest.xls").getInputStream();
If I construct the POI object with inputStream it works fine.
But inputStream2 and inputStream3 are throwing this exception
java.io.IOException: Invalid header signature; read -2300849302551019537, expected -2226271756974174256
at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:100)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:84)
It seems that the header of the binary file is different and the library can't recognize it as an Excel file. I can't understand why.
The only difference I see is that inputStream2 & 3 are using the classloader to locate the file. (ClassPathResource is a Spring class).
I'd like to have the file path separated from the system. So I would prefer something like inputStream2 or 3.
Do you have any idea on why this is happening?
Thank you
Update:
I tried writing to disk the inputStream and inputStream2. The excel file that comes with inputStream is Ok. inputStream2 contains an excel file with some strange characters that wrap the real content.
It seems that maven corrupts the excel file in some way during the build.
So it's basically the file I retrieve with the classLoader (under /home/xxx/workspace/myproject/target/test-classes/importTest.xls) that is not ok.
Any idea?
The problem seems to be Maven's filtering option. If the pom looks like this
<testResource>
<directory>${basedir}/src/test/resources</directory>
<includes>
<include>**/*.xml</include>
<include>**/*.properties</include>
<include>**/*.sql</include>
<include>**/*.xls</include>
</includes>
<filtering>true</filtering>
</testResource>
When the filtering option is set to true on xls files it corrupts them.
Have you tried ClassLoader#getResourceAsStream(String)? It will probably behave similarly to your second attempt using Class#getResource(String), as alluded to in the latter's documentation.
My first thought here was that no such file was found, but if it's consistently reading the same value (-2300849302551019537) each time you run the program, that suggests there really is a file there that's being read. Trap the statement after you initialize your InputStream and inspect the stream instance in the debugger. You should be able to find a reference to the underlying file name. To make this easier at first, try using ClassLoader#getResources(String) and inspect the sequence of URLs returned.
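The "expected" number in the exception is simply the first eight bytes of an .xls (OLE2) file, D0 CF 11 E0 A1 B1 1A E1, read as a little-endian long. A quick check of a stream's header can therefore show whether the copy under target/test-classes is still intact. The sketch below uses only the JDK; the class name HeaderSignature is hypothetical and this is not POI's own code.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration.
final class HeaderSignature {
    /** The bytes D0 CF 11 E0 A1 B1 1A E1 interpreted as a little-endian long. */
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;

    /** Reads the first eight bytes of the stream as a little-endian long. */
    static long read(InputStream in) throws IOException {
        byte[] header = new byte[8];
        new DataInputStream(in).readFully(header); // throws EOFException if the stream is shorter
        return ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }
}
```

Comparing HeaderSignature.read(...) for the source file and for the classpath copy shows directly whether the build changed the file.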
