Test if a file is an image file - java

I am using some file IO and want to know if there is a method to check if a file is an image?

This works pretty well for me. Hope I could help
import javax.activation.MimetypesFileTypeMap;
import java.io.File;
class Untitled {
public static void main(String[] args) {
String filepath = "/the/file/path/image.jpg";
File f = new File(filepath);
String mimetype= new MimetypesFileTypeMap().getContentType(f);
String type = mimetype.split("/")[0];
if(type.equals("image"))
System.out.println("It's an image");
else
System.out.println("It's NOT an image");
}
}

if (ImageIO.read(inputStream) == null) {
    // not an image: no registered ImageReader could decode the stream
}
See also this answer: How to check a uploaded file whether it is a image or other file?
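As a hedged illustration of the ImageIO approach above, the sketch below wraps ImageIO.read in a small helper. The class and method names (ImageIoCheck, isImage) are made up for this example. It treats both "no registered reader recognised the data" (a null result) and a decoding failure (an IOException) as "not an image". Keep in mind that ImageIO only knows the formats for which a reader plugin is installed, and that it decodes the whole image into memory.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

class ImageIoCheck {

    // Returns true only if ImageIO could decode the stream into an image.
    static boolean isImage(InputStream in) {
        try {
            BufferedImage image = ImageIO.read(in);
            // ImageIO.read returns null when no registered reader claims the data.
            return image != null;
        } catch (IOException e) {
            // A reader recognised the header but decoding failed (e.g. truncated file).
            return false;
        }
    }

    // Convenience overload for data that is already in memory.
    static boolean isImage(byte[] data) {
        return isImage(new ByteArrayInputStream(data));
    }
}
```

For a file on disk, the same helper could be called with a FileInputStream opened in a try-with-resources block.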

In Java 7, there is the java.nio.file.Files.probeContentType() method. On Windows, this uses the file extension and the registry (it does not probe the file content). You can then check the first part of the MIME type, i.e. whether it has the form image/<X>.

You may try something like this:
String pathname = "abc\\xyz.png"; // backslashes must be escaped in Java string literals
File file=new File(pathname);
String mimetype = Files.probeContentType(file.toPath());
//mimetype should be something like "image/png"
if (mimetype != null && mimetype.split("/")[0].equals("image")) {
System.out.println("it is an image");
}

You may try something like this:
import javax.activation.MimetypesFileTypeMap;
File myFile;
String mimeType = new MimetypesFileTypeMap().getContentType(myFile);
// mimeType should now be something like "image/png"
if(mimeType.substring(0,5).equalsIgnoreCase("image")){
// it's an image
}
This should work, although it doesn't seem to be the most elegant version.

There are a variety of ways to do this; see other answers and the links to related questions. (The Java 7 approach seems the most attractive to me, because it uses platform specific conventions by default, and you can supply your own scheme for file type determination.)
However, I'd just like to point out that no mechanism is entirely infallible:
Methods that rely on the file suffix will be tricked if the suffix is non-standard or wrong.
Methods that rely on file attributes (e.g. in the file system) will be tricked if the file has an incorrect content type attribute or none at all.
Methods that rely on looking at the file signature can be tricked by binary files which just happen to have the same signature bytes.
Even simply attempting to read the file as an image can be tricked if you are unlucky ... depending on the image format(s) that you try.

Other answers suggest loading the full image into memory (ImageIO.read) or using standard JDK methods (MimetypesFileTypeMap and Files.probeContentType).
The first approach is inefficient if you do not need the decoded image and only want to test whether the file is an image (and maybe save its content type, to set it in the Content-Type response header when the image is served later).
The built-in JDK methods usually just test the file extension and do not give a result you can trust.
What works for me is the Apache Tika library.
private final Tika tika = new Tika();
private MimeType detectImageContentType(InputStream inputStream, String fileExtension) {
Assert.notNull(inputStream, "InputStream must not be null");
String fileName = fileExtension != null ? "image." + fileExtension : "image";
MimeType detectedContentType = MimeType.valueOf(tika.detect(inputStream, fileName));
log.trace("Detected image content type: {}", detectedContentType);
if (!validMimeTypes.contains(detectedContentType)) {
throw new InvalidImageContentTypeException(detectedContentType);
}
return detectedContentType;
}
The type detection is based on the content of the given document stream and the name of the document. Only a limited number of bytes are read from the stream.
I pass fileExtension only as a hint for Tika. Detection works without it, but according to the documentation it improves detection in some cases.
The main advantage of this method over ImageIO.read is that Tika does not read the full file into memory, only the first bytes.
The main advantage over the JDK's MimetypesFileTypeMap and Files.probeContentType is that Tika actually reads the first bytes of the file, while the JDK only checks the file extension in its current implementation.
TLDR
If you plan to do something with the loaded image (like resize/crop/rotate it), then use ImageIO.read from Krystian's answer.
If you just want to check (and maybe store) the real Content-Type, then use Tika (this answer).
If you work in a trusted environment and you are 100% sure that the file extension is correct, then use Files.probeContentType from prunge's answer.

Here's my code based on the answer using tika.
private static final Tika TIKA = new Tika();
public boolean isImageMimeType(File src) {
try (FileInputStream fis = new FileInputStream(src)) {
String mime = TIKA.detect(fis, src.getName());
return mime.contains("/")
&& mime.split("/")[0].equalsIgnoreCase("image");
} catch (IOException e) {
throw new RuntimeException(e);
}
}

Related

How to use Scanner to check whether a Zip file contains CSV or other types of files in Java?

I am trying to use Scanner to read a ZipInputStream line by line. Below is the code I have:
ZipInputStream inputStream = new ZipInputStream(bodyPartEntity.getInputStream());
inputStream.getNextEntry();
Scanner sc = new Scanner(inputStream);
while (sc.hasNextLine()) {
log.info(sc.nextLine());
}
and it works fine.
But I have a question: what if a user zipped an image or some other type of file (not CSV)? Is there a way to check for that so I can throw an exception? Also, is there a way to read the next file?
For now, if I zip multiple files, I'm only able to read one, and then sc.hasNextLine() returns false.
Is there any way I can read the next file?
You should loop over your zip file like this:
ZipEntry entry;
while((entry = inputStream.getNextEntry())!=null){
// do your logic here...
// for example: entry.getName()
}
For the file type, you could use something like this inside the while loop:
MimetypesFileTypeMap mtft = new MimetypesFileTypeMap();
String mimeType = mtft.getContentType(entry.getName());
System.out.println(entry.getName()+" type: "+ mimeType);
Happy coding!
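Putting the loop and the line-by-line reading together could look roughly like the sketch below. It is only an illustration under a few assumptions: the names ZipCsvReader and readCsvLines are invented here, the entries are assumed to be UTF-8 text, and the type check is a plain extension test on the entry name (so it can be fooled by a misnamed file; a content-based detector would be stricter). A fresh Scanner is created per entry and deliberately not closed, because closing it would also close the underlying ZipInputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class ZipCsvReader {

    // Reads all lines of every .csv entry; rejects archives with other entries.
    static List<String> readCsvLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() positions the stream at the start of the next file.
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                String name = entry.getName().toLowerCase(Locale.ROOT);
                if (!name.endsWith(".csv")) {
                    throw new IllegalArgumentException("Not a CSV entry: " + entry.getName());
                }
                // One Scanner per entry: the stream reports end-of-data at the end
                // of the current entry, so hasNextLine() turns false there.
                Scanner sc = new Scanner(zip, "UTF-8");
                while (sc.hasNextLine()) {
                    lines.add(sc.nextLine());
                }
            }
        }
        return lines;
    }
}
```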
First, you need to loop over your zip, as was suggested before.
Then you need to check the type of the file with Apache Tika. I think this is the best library if you need to determine the type of a file. It can detect the file type from the file extension and from the file content as well. Using the file content is the safest solution, and it fits your scenario (a stream) best.
See more info here:
description of Tika detector interface
API
example, how to use it

Let Tika suggest a file-extension [duplicate]

I am uploading files to an Amazon s3 bucket and have access to the InputStream and a String containing the MIME Type of the file but not the original file name. It's up to me to actually create the file name and extension before pushing the file up to S3. Is there a library or convenient way to determine the appropriate extension to use from the MIME Type?
I've seen some references to the Apache Tika library but that seems like overkill and I haven't been able to get it to successfully detect file extensions yet. From what I've been able to gather it seems like this code should work, but I'm just getting an empty string when my type variable is "image/jpeg"
MimeType mimeType = null;
try {
mimeType = new MimeTypes().forName(type);
} catch (MimeTypeException e) {
Logger.error("Couldn't Detect Mime Type for type: " + type, e);
}
if (mimeType != null) {
String extension = mimeType.getExtension();
//do something with the extension
}
As some of the commenters have pointed out, there is no universal 1:1 mapping between mimetypes and file extensions... Some mimetypes have more than one possible extension, many extensions are shared by multiple mimetypes, and some mimetypes have no extension.
Wherever possible, you're much better off storing the mimetype and using that going forward, and forgetting about the extension.
That said, if you do want to get the most common file extension for a given mimetype, then Tika is a good way to go. Apache Tika has a very large set of mimetypes it knows about, and for many of these it also knows mime magic for detection, common extensions, descriptions etc.
If you want to get the most common extension for a JPEG file, then as shown in this Apache Tika unit test you just need to do something like:
MimeTypes allTypes = MimeTypes.getDefaultMimeTypes();
MimeType jpeg = allTypes.forName("image/jpeg");
String jpegExt = jpeg.getExtension(); // .jpg
assertEquals(".jpg", jpeg.getExtension());
The key thing is that you need to load the XML file that's bundled in the Tika jar to get the definitions of all the mimetypes. If you might be dealing with custom mimetypes too, Tika supports those; change line one to:
TikaConfig config = TikaConfig.getDefaultConfig();
MimeTypes allTypes = config.getMimeRepository();
By using the TikaConfig method to get the MimeTypes, Tika will also check your classpath for custom mimetype definitions and include those too.

Java URL: Unknown Protocol "C"

I know there are similar questions to this one on SO (like this one), however, after reading through the list of "Questions with similar titles", I still feel strongly that this is unique.
I am working with the iText library to generate PDFs from inside a Swing application. iText's Jpeg class requires a URL in its constructor to locate an image/jpg that you want to add to the PDF file.
When I set this URL to the absolute file path of my JPG file, I get a MalformedURLException claiming unknown protocol: c ("c" being the C:\ drive on my local disk).
Is there any hack/circumvention to this, or do I have to host this JPG somewhere and have the URL find it over the net? Here is the code that is failing:
try {
String imageUrl = "C:\Users\MyUser\image.jpg";
Jpeg image = new Jpeg(new URL(imageUrl));
} catch(Exception exc) {
System.out.println(exc.getMessage());
}
Please note: The URL does properly escape the string (thus "\" are converted to "\ \", etc.).
Thanks in advance!
You need to turn the path to the image.jpg file into a file:// URL, like this:
String imageUrl = "file:///C:/Users/MyUser/image.jpg";
Otherwise it interprets the C as the URL protocol.
Try with
String imageUrl = "file:///C:/Users/MyUser/image.jpg";
Try this
try {
String imageUrl = "file:///C:/Users/MyUser/image.jpg";
Jpeg image = new Jpeg(new URL(imageUrl));
} catch(Exception exc) {
System.out.println(exc.getMessage());
}
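In my case the issue was a "%" in the file name. Once I renamed the file, it loaded successfully. "%" is a legal character in Windows file names, so my guess is that the problem is the URL: "%" introduces a percent-escape there, and such characters have to be encoded when a path is turned into a URL.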
Locate the file by its path and load it as the image for the ImageView:
File file = new File("F:/a.jpg");
Image image = new Image(file.toURI().toString()); // this is where the magic happens
imageView.setImage(image);
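The same File.toURI() trick also addresses the original "unknown protocol: c" problem without writing the file: prefix by hand. A minimal sketch, assuming a helper named FileUrls.toFileUrl (a made-up name for this example): it converts a local path into a java.net.URL, and toURI() takes care of percent-encoding characters such as spaces and "%".

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

class FileUrls {

    // Converts a local file path into a file: URL.
    // File.toURI() makes the path absolute and percent-encodes characters
    // that are not legal in a URI (a space becomes %20, "%" becomes %25).
    static URL toFileUrl(String path) throws MalformedURLException {
        return new File(path).toURI().toURL();
    }
}
```

The resulting URL could then be passed to a constructor that expects a URL, such as the Jpeg class mentioned in the question.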

Java URL problem

A webpage contains a link to an executable (i.e. if we click on the link, the browser downloads the file to the local machine).
Is there any way to achieve the same functionality with Java?
Thank you
Yes, there is.
Here is a simple example:
You can have a JSF (JavaServer Faces) page with a supporting backing bean that contains a method annotated with @PostConstruct. This means that any action (for example, downloading) will occur when the page is created.
There is already a very similar question; have a look at: Invoke JSF managed bean action on page load
You can use Java's URL class to download a file, but it requires a little work. You will need to do the following:
1. Create the URL object pointing at the file
2. Call openStream() to get an InputStream
3. Open the file you want to write to (a FileOutputStream)
4. Read from the InputStream and write to the file, until there is no more data left to read
5. Close the input and output streams
It doesn't really matter what type of file you are downloading (the fact that it's an executable file is irrelevant) since the process is the same for any type of file.
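The steps listed above can be sketched as follows. The class and method names (UrlDownloader, download) are invented for this illustration, error handling is left to the caller, and try-with-resources closes both streams.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

class UrlDownloader {

    // Copies everything readable from the URL into destPath.
    // Returns the number of bytes written.
    static long download(URL source, String destPath) throws IOException {
        long total = 0;
        // Both streams are closed automatically, even if an exception is thrown.
        try (InputStream in = source.openStream();
             OutputStream out = new FileOutputStream(destPath)) {
            byte[] buffer = new byte[8192];
            int n;
            // read() returns -1 once there is no more data left to read.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

The same method works for any URL scheme the JDK supports, including file: URLs, which is convenient for trying it out without a network connection.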
Update: It sounds like what you actually want is to plug the URL of a webpage into the Java app, and have the Java app find the link in the page and then download that link. If that is the case, the wording of your question is very unclear, but here are the basic steps I would use:
First, use steps 1 and 2 above to get an InputStream for the page
Use something like TagSoup or jsoup to parse the HTML
Find the <a> element that you want and extract its href attribute to get the URL of the file you need to download (if it's a relative URL instead of absolute, you will need to resolve that URL against the URL of the original page)
Use the steps above to download that URL
Here's a slight shortcut, based on jsoup (which I've never used before, I'm just writing this from snippets stolen from their webpage). I've left out a lot of error checking, but hey, I usually charge for this:
Document doc = Jsoup.connect(pageUrl).get();
Element aElement = doc.getElementsByTag("a").first(); // Obviously you may need to refine this
String newUrl = aElement.attr("abs:href"); // This is a piece of jsoup magic that ensures that the destination URL is absolute
// assert newUrl != null
URL fileUrl = new URL(newUrl);
String destPath = fileUrl.getPath();
int lastSlash = destPath.lastIndexOf('/');
if (lastSlash != -1) {
destPath = destPath.substring(lastSlash);
}
// Assert that this is really a valid filename
// Now just download fileUrl and save it to destPath
The proper way to determine the destination filename (unless you hardcode it) is to look at the Content-Disposition response header and take the part after filename=. In that case you can't use openStream() on the URL; use openConnection() instead to get a URLConnection. Then call getInputStream() to get your InputStream and getHeaderField("Content-Disposition") to read the header and work out the filename. If that header is missing or malformed, fall back to the method above to determine the destination filename.
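A rough sketch of that filename logic, working on plain strings so it can be tried without a connection. The header value would come from something like URLConnection.getHeaderField("Content-Disposition"). The names ContentDispositionNames and fileNameFor are hypothetical, the parsing is a deliberately simple regular expression rather than a full header parser (it ignores the encoded filename*= form), and urlPath is assumed to be non-null. Directory parts are stripped from the result so a server-supplied name cannot point outside the target folder.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContentDispositionNames {

    // Matches filename="quoted value" or filename=token (case-insensitive).
    private static final Pattern FILENAME = Pattern.compile(
            "filename\\s*=\\s*(\"([^\"]*)\"|[^;\\s]+)", Pattern.CASE_INSENSITIVE);

    // Prefers the name from the header; falls back to the last URL path segment.
    static String fileNameFor(String contentDisposition, String urlPath) {
        if (contentDisposition != null) {
            Matcher m = FILENAME.matcher(contentDisposition);
            if (m.find()) {
                // Group 2 is set only for the quoted form.
                String name = m.group(2) != null ? m.group(2) : m.group(1);
                name = stripDirectories(name);
                if (!name.isEmpty()) {
                    return name;
                }
            }
        }
        return stripDirectories(urlPath);
    }

    // Keeps only the part after the last slash or backslash.
    private static String stripDirectories(String path) {
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return slash == -1 ? path : path.substring(slash + 1);
    }
}
```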
You can do this using apache commons IO FileUtils
http://commons.apache.org/io/apidocs/org/apache/commons/io/FileUtils.html#copyURLToFile(java.net.URL, java.io.File)
Edit:
I was able to successfully download a zip file from the SourceForge site (it is not empty). I did something like this:
import java.io.File;
import java.net.URL;
import org.apache.commons.io.FileUtils;
public class Test
{
public static void main(String args[])
{
try {
URL url = new URL("http://sourceforge.net/projects/gallery/files/gallery3/3.0.2/gallery-3.0.2.zip/download");
FileUtils.copyURLToFile(url, new File("test.zip"));
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
I was able to successfully download tomcat.exe too:
URL url = new URL("http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.16/bin/apache-tomcat-6.0.16.exe");

Check if a file is an image

I am using JAI and create a file with:
PlanarImage img = JAI.create("fileload", myFilename);
I check before that line if the file exists. But how could I check if the file is a .bmp or a .tiff or an image file?
Does anyone know?
The ImageMagick project has facilities to identify images, and there's a Java wrapper for ImageMagick called JMagick, which I think you may want to consider instead of reinventing the wheel:
http://www.jmagick.org
I use ImageMagick all the time, including its "identify" feature from the command line, and it has never once failed to identify a picture.
Back in the days when I absolutely needed that feature and JMagick didn't exist yet, I used to Runtime.exec() ImageMagick's identify command from Java, and it worked perfectly.
Now that JMagick exists this is probably not necessary anymore (but I haven't tried JMagick yet).
Note that it gives much more than just the format, for example:
$ identify tmp3.jpg
tmp3.jpg JPEG 1680x1050 1680x1050+0+0 DirectClass 8-bit 293.582kb
$ identify tmp.png
tmp.png PNG 1012x900 1012x900+0+0 DirectClass 8-bit 475.119kb
Try using the width of the image:
boolean isImage(String image_path){
Image image = new ImageIcon(image_path).getImage();
if(image.getWidth(null) == -1){
return false;
}
else{
return true;
}
}
If the width is -1, then it is not an image.
To tell if something is a PNG, I've used the snippet below in Android Java.
public CompressFormat getCompressFormat(Context context, Uri fileUri) throws IOException {
// create input stream
int numRead;
byte[] signature = new byte[8];
byte[] pngIdBytes = { -119, 80, 78, 71, 13, 10, 26, 10 };
InputStream is = null;
try {
ContentResolver resolver = context.getContentResolver();
is = resolver.openInputStream(fileUri);
// if first 8 bytes are PNG then return PNG reader
numRead = is.read(signature);
if (numRead == -1)
throw new IOException("Trying to read from 0 byte stream");
} finally {
if (is != null)
is.close();
}
if (numRead == 8 && Arrays.equals(signature, pngIdBytes)) {
return CompressFormat.PNG;
}
return null;
}
At the beginning of files, there is an identifying character sequence.
For example, JPEG files start with FF D8 FF.
You can check for this sequence in your program but I am not sure whether this works for every file.
For information about identifying characters you can have a look at http://filext.com
You could use DROID, a tool for file format identification that also offers a Java API, to be used roughly like this:
AnalysisController controller = new AnalysisController();
controller.readSigFile(signatureFileLocation);
controller.addFile(fileToIdentify.getAbsolutePath());
controller.runFileFormatAnalysis();
Iterator<IdentificationFile> it = controller.getFileCollection().getIterator();
Documentation on the API usage is rather sparse, but you can have a look at this working example (the interesting part is in the identifyOneBinary method).
The only (semi-)reliable way to determine the contents of a file is to open it and read the first few characters. Then you can use a set of tests such as implemented in the Unix file command to make an educated guess as to the contents of the file.
Expanding on Birkan's answer, there is a list of 'magic numbers' available here:
http://www.astro.keele.ac.uk/oldusers/rno/Computing/File_magic.html
I just checked a BMP and TIFF file (both just created in Windows XP / Paint), and they appear to be correct:
First two bytes "42 4d" -> BMP
First four bytes "4d 4d 00 2a" -> TIFF
I used VIM to edit the files and then did Tools | Convert to Hex, but you can also use 'od -c' or something similar to check them.
As a complete aside, I was slightly amused when I found out the magic numbers used for compiled Java Classes: 'ca fe ba be' - 'cafe babe' :)
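To make the magic-number idea concrete, here is a small sketch that compares the leading bytes of a file against the signatures mentioned in this thread (JPEG FF D8 FF, the 8-byte PNG signature, BMP 42 4D, TIFF 4D 4D 00 2A). The little-endian TIFF variant 49 49 2A 00 is my addition and is not listed above. The names ImageSignatures and detect are made up, and the method returns null when nothing matches. As noted elsewhere in the thread, a matching signature is only an educated guess, not proof that the rest of the file is a valid image.

```java
class ImageSignatures {

    // Returns a short format name for known image signatures, or null if none match.
    static String detect(byte[] head) {
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return "jpeg";
        }
        if (startsWith(head, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "png";
        }
        if (startsWith(head, 0x42, 0x4D)) {
            return "bmp";
        }
        // Big-endian ("MM") and little-endian ("II") TIFF headers.
        if (startsWith(head, 0x4D, 0x4D, 0x00, 0x2A)
                || startsWith(head, 0x49, 0x49, 0x2A, 0x00)) {
            return "tiff";
        }
        return null;
    }

    // Compares the first bytes of data with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... signature) {
        if (data == null || data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            // Mask to 0..255 because Java bytes are signed.
            if ((data[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

In practice, head would hold the first eight or so bytes read from the file's InputStream.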
Try using the standard JavaBeans Activation Framework (JAF)
With the JavaBeans Activation Framework standard extension, developers who use Java technology can take advantage of standard services to determine the type of an arbitrary piece of data, encapsulate access to it, discover the operations available on it, and to instantiate the appropriate bean to perform said operation(s). For example, if a browser obtained a JPEG image, this framework would enable the browser to identify that stream of data as an JPEG image, and from that type, the browser could locate and instantiate an object that could manipulate, or view that image.
if(currentImageType ==null){
ByteArrayInputStream is = new ByteArrayInputStream(image);
String mimeType = URLConnection.guessContentTypeFromStream(is);
if(mimeType == null){
AutoDetectParser parser = new AutoDetectParser();
Detector detector = parser.getDetector();
Metadata md = new Metadata();
mimeType = detector.detect(is,md).toString();
if (mimeType.contains("pdf")){
mimeType ="pdf";
}
else if(mimeType.contains("tif")||mimeType.contains("tiff")){
mimeType = "tif";
}
}
if(mimeType.contains("png")){
mimeType ="png";
}
else if( mimeType.contains("jpg")||mimeType.contains("jpeg")){
mimeType = "jpg";
}
else if (mimeType.contains("pdf")){
mimeType ="pdf";
}
else if(mimeType.contains("tif")||mimeType.contains("tiff")){
mimeType = "tif";
}
currentImageType = ImageType.fromValue(mimeType);
}
