Creating a jpeg file with metadata - java

I have a Java application that creates a BufferedImage and saves it to disk as a JPEG. I'd really like to add a caption to the image. To prevent the image from getting crowded out by text on the image itself, it'd be great if I could write the caption to the JPEG's metadata.
I've been searching all over the place for a solution, but haven't found anything satisfactory. Sanselan comes up a lot, but I haven't figured out how to use it properly. I found examples that modify existing metadata, but my files don't contain metadata as they are simply created from ImageIO.write() or Sanselan.writeImage().
I found another post that does what I'm looking for, but it's in C# and I need Java.
Any help would be greatly appreciated.

The package you want to look at is javax.imageio.metadata.
The IIOMetadata class (which has a concrete subclass for JPEG) contains methods to get metadata information in various formats, including as an XML DOM tree root node.
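As a rough illustration of that idea, here is a minimal sketch that stores a caption in a JPEG comment (COM) marker through the metadata tree. The class and method names are made up for this example, and it assumes the JDK's built-in JPEG writer and its native metadata format "javax_imageio_jpeg_image_1.0"; other writer plugins may behave differently.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageTypeSpecifier;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataNode;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not part of any library.
final class JpegCaption {
    // Native metadata format name of the JDK's JPEG plugin (an assumption about the writer in use).
    private static final String JPEG_NATIVE = "javax_imageio_jpeg_image_1.0";

    /** Encodes the image as JPEG and stores the caption in a COM marker segment. */
    static byte[] writeWithCaption(BufferedImage image, String caption) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        try {
            ImageWriteParam param = writer.getDefaultWriteParam();
            // A freshly created image has no metadata, so start from the writer's defaults.
            IIOMetadata metadata = writer.getDefaultImageMetadata(
                    ImageTypeSpecifier.createFromRenderedImage(image), param);

            Element root = (Element) metadata.getAsTree(JPEG_NATIVE);
            Node markerSequence = root.getElementsByTagName("markerSequence").item(0);

            // The "com" node carries the comment bytes as its user object.
            IIOMetadataNode comment = new IIOMetadataNode("com");
            comment.setUserObject(caption.getBytes(StandardCharsets.ISO_8859_1));
            markerSequence.insertBefore(comment, markerSequence.getFirstChild());
            metadata.setFromTree(JPEG_NATIVE, root);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ImageOutputStream ios = new MemoryCacheImageOutputStream(out)) {
                writer.setOutput(ios);
                writer.write(null, new IIOImage(image, null, metadata), param);
            }
            return out.toByteArray();
        } finally {
            writer.dispose();
        }
    }
}
```

The returned bytes can be written to disk as they are. Whether a given viewer displays the COM segment as a caption is up to that viewer; many tools read EXIF or IPTC fields instead.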

Related

Need a head start in reading AFP files and extracting the content and metadata

I was assigned to a project where we will receive AFP (Advanced Function Presentation) files and need to extract the documents, i.e. the content and the corresponding metadata. I have been looking into the AFP file format but haven't found any useful resources on how to proceed with the task.
I have found almost no information so far and don't know where to start. I looked into some open source projects and found this: https://github.com/yan74/afplib
I tried running it, but it does not work on the sample AFP file I have.
I really need some insight into which resources I should go through to be able to do this project.
I need to write the code in Java, and I have looked at some licensed software that does the same, like PROARCHIVER and PAPYRUS.
Thanks in advance
AFP is an easy format: it is composed of structured fields, and your first step is decoding them. Download the "Mixed Object Document Content Architecture Reference", read the first 50 pages, and write code to split the AFP file into structured fields so that you can create a simple dump of your file.
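A minimal sketch of such a splitter follows. It assumes the usual layout described in the MO:DCA reference: a 0x5A carriage-control byte, then an 8-byte introducer (2-byte big-endian length that counts the introducer but not the 0x5A, 3-byte identifier, 1 flag byte, 2 reserved bytes), then the data. The class name is hypothetical, and the sketch does not handle introducer extensions, padding, or files that carry extra record separators.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for dumping an AFP file; not taken from afplib.
final class AfpSplitter {
    static final class StructuredField {
        final int id;       // 3-byte structured field identifier
        final int flags;    // flag byte from the introducer
        final byte[] data;  // everything after the 8-byte introducer

        StructuredField(int id, int flags, byte[] data) {
            this.id = id;
            this.flags = flags;
            this.data = data;
        }

        @Override
        public String toString() {
            return String.format("%06X flags=%02X len=%d", id, flags, data.length);
        }
    }

    /** Splits the stream into structured fields, assuming each one starts with 0x5A. */
    static List<StructuredField> split(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        List<StructuredField> fields = new ArrayList<>();
        int carriageControl;
        while ((carriageControl = din.read()) != -1) {
            if (carriageControl != 0x5A) {
                throw new IOException("Expected 0x5A but found 0x"
                        + Integer.toHexString(carriageControl));
            }
            int length = din.readUnsignedShort(); // includes the 8 introducer bytes
            if (length < 8) {
                throw new IOException("Structured field length too small: " + length);
            }
            int id = (din.readUnsignedByte() << 16)
                    | (din.readUnsignedByte() << 8)
                    | din.readUnsignedByte();
            int flags = din.readUnsignedByte();
            din.readUnsignedShort(); // reserved bytes, ignored in this sketch
            byte[] data = new byte[length - 8];
            din.readFully(data);
            fields.add(new StructuredField(id, flags, data));
        }
        return fields;
    }
}
```

Printing each returned field gives the kind of dump mentioned above; the identifiers can then be looked up in the reference.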
After that, if you want to extract images (the AFP world calls them IOCA), you need the "Image Object Content Architecture Reference".
If you want to extract text (called PTX), you need the "Presentation Text Object Content Architecture Reference".
good job

No reader matches PNG-Stream in Java.ImageIO

I'm trying to read the metadata of a PNG file with Java, following the solution proposed here.
But the method ImageIO.getImageReaders(inputStream) is returning an empty list of readers.
I made sure that the stream is correct by reading it via ImageIO.read and rendering the resulting image to the screen.
And this is why I'm confused: since ImageIO.read returns a valid image, I assume there is some ImageReader claiming to be able to interpret this stream. Is there a difference between interpreting image data and the metadata of the image?
Any hints or even solutions to this problem?
Thank you very much.
I believe that ImageIO.getImageReaders() expects an ImageInputStream. You can try to create one from your InputStream using createImageInputStream; I guess that's what ImageIO.read(InputStream) does under the hood.
Anyway, if you already know that you have a PNG, why not use getImageReadersByFormatName("png")?
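A minimal sketch of the first suggestion, wrapping the InputStream before asking for readers. The class and method names are only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, not part of ImageIO.
final class PngMetadataReader {
    /** Reads the metadata of the first image found in the stream. */
    static IIOMetadata read(InputStream in) throws IOException {
        // Readers are looked up against an ImageInputStream, not a plain InputStream.
        try (ImageInputStream iis = ImageIO.createImageInputStream(in)) {
            if (iis == null) {
                throw new IOException("Could not create an ImageInputStream");
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                throw new IOException("No reader claims this stream");
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(iis, true); // forward-only is enough for a single image
                return reader.getImageMetadata(0);
            } finally {
                reader.dispose();
            }
        }
    }
}
```

The returned IIOMetadata can then be inspected with getAsTree(metadata.getNativeMetadataFormatName()).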
BTW: height and width (and color model, etc.) can be considered "image metadata", in the sense that they are not part of the pixel values (which would be the real data), but in common parlance they are regarded rather as (essential) image properties. Image metadata is generally (and specifically in IIOMetadata) understood to be additional "miscellaneous" data (such as physical resolution or timestamp) which is normally not needed to access the image data.

PDF Handling in Java

I have created a program that should one day become a PDF editor.
Its purpose will be saving the GUI's textual content to a PDF and loading it back from it. The GUI resembles a text editor, but it only has certain fields (JTextAreas, actually).
It can look like this (this is only one page; it can have many more, and the upper and lower margins are cut out of the picture). It should actually resemble A4 in pixel size.
I have looked around a bit for PDF libraries and found that iText could suit my PDF-creating needs. However, if I understood correctly, it retrieves text from a whole page as a single string, which won't work for me, because I will need to detect different fields/paragraphs/or something to be able to load them back into the program.
Now, I'm a bit lazy, but I don't want to spend hours going through numerous PDF libraries just to find out that they won't work for me.
Instead, I'm asking someone with a bit more Java PDF handling experience to recommend one that fits my needs.
Or maybe recommend how to add invisible parts to the PDF that will help my program determine where exactly it is situated inside the PDF file...
Just to be clear (I phrased my question badly before), the only thing I need to put in my PDF is text, and that's all I need to get out later. My program should be able to read PDFs that it created itself...
Also, because of the designated use of files created with this program, they need to be in the PDF format.
Short Answer: Use an intermediate format like JSON or XML.
Long Answer: You're using PDFs in a manner that they weren't designed for. PDFs were not designed to store data; they were designed to present and format data in a portable form. Furthermore, a PDF is a very "heavy" way to store data. I suggest storing your data in another manner, perhaps in a format like JSON or XML.
The advantage now is that you are not tied to a specific output-format like PDF. This can come in handy later on if you decide that you want to export your data into another format (like a Word document, or an image) because you now have a common representation.
I found this link and another link that provides examples that show you how to store and read back metadata in your PDF. This might be what you're looking for, but again, I don't recommend it.
If you really insist on using PDF to store data, I suggest that you store the actual data in either XML or RDF and then attach that to the PDF file when you generate it. Then you can read the XML back for the data.
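To make the XML idea a bit more concrete, here is a minimal sketch of the intermediate representation only, using java.util.Properties from the standard library. The class name and the "field.N" key scheme are assumptions made up for this example; attaching the resulting bytes to the PDF is left to whichever PDF library you pick.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: keeps the editor's text fields in a small XML document.
final class FieldStore {
    /** Serializes the field texts, in order, to an XML properties document. */
    static byte[] toXml(List<String> fields) throws IOException {
        Properties props = new Properties();
        for (int i = 0; i < fields.size(); i++) {
            props.setProperty("field." + i, fields.get(i)); // key scheme is an assumption
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.storeToXML(out, "editor fields", "UTF-8");
        return out.toByteArray();
    }

    /** Restores the field texts in their original order. */
    static List<String> fromXml(byte[] xml) throws IOException {
        Properties props = new Properties();
        props.loadFromXML(new ByteArrayInputStream(xml));
        List<String> fields = new ArrayList<>();
        for (int i = 0; props.containsKey("field." + i); i++) {
            fields.add(props.getProperty("field." + i));
        }
        return fields;
    }
}
```

With this split, the PDF is generated from the list of fields for presentation, and the XML travels along with it so the program can reload the fields without parsing page text.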
Assuming that your application will only consume PDF files generated by the same application, there is one part of the PDF specification called Marked Content, that was introduced precisely for this purpose. Using Marked Content you can specify the structure of the text in your document (chapter, paragraph, etc).
Read Chapter 14 - Document Interchange of the PDF Reference Document for more details.

best practices question: How to save a collection of images and a java object in a single file? File is read to be rendered

I am making a Java program that has a collection of flash-card-like objects. I store the objects in a JTree composed of DefaultMutableTreeNodes. Each node has a user object attached to it, which has a few string/native data type parameters. However, I also want each of these objects to have an image (typical formats: jpg, png, etc.).
I would like to be able to store all of this information, including the images and the tree data to the disk in a single file so the file can be transferred between users and the entire tree, including the images and parameters for each object, can be reconstructed.
I had not approached a problem like this before, so I was not sure what the best practices were. I found XMLEncoder (http://java.sun.com/j2se/1.4.2/docs/api/java/beans/XMLEncoder.html) to be a very effective way of storing my tree and the native data type information. However, I couldn't figure out how to save the image data itself inside the XML file, and I'm not sure it is possible since the data is binary (so restricted characters would be invalid). My next thought was to associate a hash string instead of an image with each user object, and then gzip together all of the images, with the hash strings as the names, and the XMLEncoded tree in the same compressed file. That seemed really contrived though.
Does anyone know a good approach for this type of issue?
Thanks!
Assuming this isn't just a serializable graph, consider bundling the files together in Jar format. If you already have your data structures working with XMLEncoder, you can reuse this code by saving the data as a jar entry.
If memory serves, the jar library has better support for Unicode name entries than the zip package, which is why I would favour it.
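A minimal sketch of the jar approach, using java.util.jar from the standard library. The class name and the entry names used in the usage note are assumptions for illustration; the XMLEncoder output and the image bytes are simply treated as named byte arrays.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper: bundles the encoded tree and its images into one archive.
final class CardBundle {
    /** Writes every named byte array as one entry of a jar archive. */
    static byte[] pack(Map<String, byte[]> entries) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(out)) {
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                jar.putNextEntry(new JarEntry(e.getKey()));
                jar.write(e.getValue());
                jar.closeEntry();
            }
        }
        return out.toByteArray();
    }

    /** Reads all entries of the archive back into memory, keeping their order. */
    static Map<String, byte[]> unpack(byte[] archive) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(archive))) {
            JarEntry entry;
            byte[] buffer = new byte[8192];
            while ((entry = jar.getNextJarEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                while ((n = jar.read(buffer)) != -1) { // -1 marks the end of the current entry
                    content.write(buffer, 0, n);
                }
                entries.put(entry.getName(), content.toByteArray());
            }
        }
        return entries;
    }
}
```

For example, the tree could go in an entry such as "tree.xml" and each image in "images/<name>", with the user objects storing only the entry name of their image.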
You might consider using an MS JET database (.mdb file) and storing all the stuff in there. That'll also make it easy to examine and edit the data in (for example) MS Access.
You can employ some virtual file system, which stores its data in a single container. We develop and offer one such file system, SolFS; however, right now there's no Java binding for it. We will release a Java JNI interface for SolFS within a month.

Java: Using Castor XML with images

How can I use Castor XML to marshal a java.awt.Image object to XML, or make the XML reference the image in some way?
Cheers,
Pete
I guess you could write your own field handler. Me, I'd write the image itself to a location and reference the image from within the xml.
I don't know if you can store the image directly. You could try to get the raster and store each pixel.
This depends entirely on what will consume the XML. As jjungnickel said, the easiest way to do this is to write the image to a file and then reference this file within the XML. You can do this in one of two ways:
Put the filename (relative or absolute) in the XML
Put an XML entity reference in the XML, which may or may not work depending on how the image is encoded
Images -- binary content -- can go into XML, but it requires more special handling. It's a lot easier to put the filename into XML and then to open that file separately, but this depends on the needs of the consumer of the XML.
Now, if you want to use Castor to do this, it's a lot easier to serialize the filename rather than the image itself. If you want to put the image itself in the XML, you'll need to write a custom field handler. When I've used Castor for XML that involved images, I always put the filename in the XML, not the image itself, and then the XML consumer read the filename and used that to load the image.
You could certainly Base64-encode a gif/jpg/whatever and wrap it in whatever sort of tag you like. As Jan suggested, you can use a custom FieldHandler to do this (which I haven't done in Castor for at least five years).
The real question is this: What is your goal? Are you trying to design an inter-op scheme that can pass image data along with "normal" information? Are you trying to persist your data as XML in your own systems? If you are trying to do inter-op, I'd go the Base64 route for simplicity. Almost any language with an XML parser will have a Base64 package as well, so it wouldn't place an undue burden on the other party.
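A minimal sketch of the Base64 route, independent of Castor. It assumes the java.awt.Image has already been converted to a BufferedImage and picks PNG as the encoding; the class name is hypothetical. A Castor FieldHandler could delegate to methods like these, but that wiring is not shown here.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper: converts an image to text that is safe inside an XML element.
final class ImageBase64 {
    /** Encodes the image as PNG and returns the bytes as a Base64 string. */
    static String encode(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    /** Decodes element text back into an image; whitespace around the text is tolerated. */
    static BufferedImage decode(String text) throws IOException {
        // The MIME decoder skips line breaks and indentation an XML writer may add.
        byte[] bytes = Base64.getMimeDecoder().decode(text);
        return ImageIO.read(new ByteArrayInputStream(bytes));
    }
}
```

Base64 text is roughly a third larger than the binary data, which is one reason the file-reference approach scales better for large or numerous images.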
If you are persisting XML documents within systems that you control, I'd consider the other posters' suggestion to provide a key in the XML to an image file stored somewhere, especially if the images are large or numerous.
