I'm manipulating GRIB2 forecast files and I'm having trouble using the GRIB2Tools library.
I have an Array[Byte] holding the content of a GRIB2 dataset. Because I want to be able to get the value at a specific location, I wrote this variable's content to a file, which I then load as an InputStream so I can use getValueAtLocation(id, lat, long) and/or interpolateValueAtLocation(id, lat, long). I can read the file's metadata perfectly, but as soon as I call either of those two methods I get an IndexOutOfBoundsException.
Here is the Scala code I use to write the GRIB2 byte array (variable bytes) to a file and then load it as an InputStream:
val file: File = new File("my-data.grib")
FileUtils.writeByteArrayToFile(file, bytes) // Apache Commons IO; returns Unit
val input = new FileInputStream("my-data.grib")
val grib: RandomAccessGribFile = new RandomAccessGribFile("my-grib", "my-data.grib")
grib.importFromStream(input, 0)
According to the README.md I am doing this right, am I not?
Then I can easily read this metadata from the GRIB2 file (using some code from GRIB2FileTest.java):
Body format : GRIB2
Date: 12.10.2021
Time: 9:0.0
Generating centre: 85
Forecast time: 5
Parameter category: 0
Parameter number: 0
Covered area:
from (latitude, longitude): 51.47, 348.0
to: (latitude, longitude): 37.5, 16.0
When calling getValueAtLocation(id, lat, long) and interpolateValueAtLocation(id, lat, long) with id = 0, lat = 48 and long = 2 (which seems to be within the covered area shown in the metadata), I get this:
java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkIndex(Buffer.java:551)
at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:327)
at com.ph.grib2tools.grib2file.RandomAccessGribFile.interpolateValueAt(RandomAccessGribFile.java:196)
at com.ph.grib2tools.grib2file.RandomAccessGribFile.interpolateValueAtLocation(RandomAccessGribFile.java:133)
The faulting line seems to be this one, at RandomAccessGribFile.java:196:
float val11 = sec5.calcValue(ByteBuffer.wrap(data).getShort((jidx1*gridDefinition.numberPointsLon+iidx1)*bytesperval));
Am I doing something wrong, or is there an issue with the library source code or with my GRIB file? The file comes from a national forecast agency and should be fine. The structure of the GRIB2 file (from the Panoply software) is shown in the attached screenshot.
After some research with the developer of the GRIB2Tools library, it turned out that the issue was coming from my GRIB file. The national forecast organization of my country, Météo France, adds a bitmap on top of the data to indicate whether or not a value is available at the given coordinates. This feature wasn't supported by GRIB2Tools, which led to data misinterpretation and errors at execution. Thanks to the developer, this bitmap feature will be supported soon.
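For anyone hitting the same exception: the bitmap lives in Section 6 of a GRIB2 message, and octet 6 of that section is the bitmap indicator (0 means a bitmap follows, 255 means none). Its presence is cheap to detect yourself. A minimal standalone sketch (plain Java, independent of GRIB2Tools; assumes a single GRIB2 message starting at offset 0):

```java
import java.nio.ByteBuffer;

public class BitmapCheck {

    /**
     * Walks the sections of a single GRIB2 message and returns the bitmap
     * indicator from octet 6 of Section 6 (0 = a bitmap follows the header,
     * 255 = no bitmap), or -1 if no Section 6 is found.
     */
    public static int bitmapIndicator(byte[] grib) {
        int pos = 16; // Section 0 ("GRIB" indicator section) is 16 octets
        while (pos + 6 <= grib.length) {
            ByteBuffer buf = ByteBuffer.wrap(grib, pos, 5);
            int length = buf.getInt();       // 4-octet big-endian section length
            int section = buf.get() & 0xFF;  // 1-octet section number
            if (section == 6) {
                return grib[pos + 5] & 0xFF; // bitmap indicator octet
            }
            if (length <= 0) {
                break; // corrupt data; the "7777" end marker also stops the loop
            }
            pos += length;
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        byte[] grib = java.nio.file.Files.readAllBytes(java.nio.file.Paths.get(args[0]));
        System.out.println("Bitmap indicator: " + bitmapIndicator(grib));
    }
}
```

If this prints 0 for your file, the data section is bitmap-packed and a reader that ignores Section 6 will mis-index the values, which matches the IndexOutOfBoundsException above.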
I'm trying to get the length of an audio file, but I run into some issues trying to read that file. This is my code (in Kotlin):
val inputStream = AudioSystem.getAudioInputStream(URL(url))
val format = inputStream.format
val durationSeconds = inputStream.frameLength / format.frameRate
lengthTicks = (durationSeconds * 20).toDouble()
The link I use is https://cdn.bandithemepark.net/lindburgh/HyperionHal.mp3
When my code runs, I get "UnsupportedAudioFileException: URL of unsupported format".
I am unsure why I am getting this error, since MP3 looks like a pretty normal file format to me. I also tried using an MP4 file, but I got the same error with that. Does anybody know what is happening here?
According to the docs:
The provided reference implementation of this API supports the following features:
Audio file formats: AIFF, AU and WAV
Music file formats: MIDI Type 0, MIDI Type 1, and Rich Music Format (RMF)
So it does look like MP3 and MP4 are not supported; you'll most likely need a library/plugin.
Deciding which one you need is beyond the scope of SO, as that would be an opinion-based answer, which is not considered acceptable.
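As a quick check on your own JVM, you can list the audio file types the installed providers support (strictly, the types the system can write, but the built-in set mirrors the readable formats) with a minimal sketch:

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioSystem;

public class SupportedTypes {
    public static void main(String[] args) {
        // File types the installed service providers can handle;
        // the reference implementation reports only WAVE, AU and AIFF
        for (AudioFileFormat.Type type : AudioSystem.getAudioFileTypes()) {
            System.out.println(type);
        }
    }
}
```

Third-party codecs register through the same service-provider mechanism, so once a suitable library is on the classpath the original getAudioInputStream call works unchanged.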
I found some Java code that converts a JPG file and a DICOM file (it takes the metadata from that one) into a final DICOM file. What I want to do is convert the JPG image into a DICOM one, generating the metadata with Java code.
BufferedImage jpg = ImageIO.read(new File("myjpg.jpg"));
// Convert the image to a byte array
DataBuffer buff = jpg.getData().getDataBuffer();
DataBufferUShort buffer = new DataBufferUShort(buff.getSize());
for (int i = 0; i < buffer.getSize(); ++i)
    buffer.setElem(i, buff.getElem(i));
short[] data = buffer.getData();
ByteBuffer byteBuf = ByteBuffer.allocate(2 * data.length);
int i = 0;
while (data.length > i) {
    byteBuf.putShort(data[i]);
    i++;
}
// Copy a header
DicomInputStream dis = new DicomInputStream(new File("fileToCopyheaderFrom.dcm"));
Attributes meta = dis.readFileMetaInformation();
Attributes attribs = dis.readDataset(-1, Tag.PixelData);
dis.close();
// Change the rows and columns
attribs.setInt(Tag.Rows, VR.US, jpg.getHeight());
attribs.setInt(Tag.Columns, VR.US, jpg.getWidth());
System.out.println(byteBuf.array().length);
// Write the file
attribs.setBytes(Tag.PixelData, VR.OW, byteBuf.array());
DicomOutputStream dcmo = new DicomOutputStream(new File("myDicom.dcm"));
dcmo.writeFileMetaInformation(meta);
attribs.writeTo(dcmo);
dcmo.close();
I am not an expert in this toolkit (nor, of course, in Java).
Your "// Copy a header" section reads the source DICOM file and holds all the attributes in the Attributes attribs variable.
Then, your "// Change the rows and columns" section modifies a few attributes as needed.
Then, your "// Write the file" section simply adds the attributes read from the source file to the destination file.
Now, you want to bypass the source DICOM file and convert a plain JPEG to DICOM, adding the attributes yourself.
Replace your "// Copy a header" section with code that builds the instance of Attributes:
Attributes attribs = new Attributes();
attribs.setString(Tag.StudyDate, VR.DA, "20110404");
attribs.setString(Tag.StudyTime, VR.TM, "15");
The tags mentioned in the above example are for illustration only. You have to decide yourself which tags you want to include. Note that the specification defines Types 1, 1C, 2, 2C and 3 for tags, depending on the SOP class you are dealing with.
While adding the tags, you also have to take care to use the correct VR. The specification covers that as well.
I cannot explain all this here; too broad.
I cannot help with dcm4che, but if using another Java DICOM library is an option for you, this task is quite simple using DeCaMino (http://dicomplugin.com):
BufferedImage jpg = ImageIO.read(new File("myjpg.jpg"));
DicomWriter dw = new DicomWriter();
dw.setOutput(new File("myjpg.dcm"));
DicomMetadata dmd = new DicomMetadata();
dw.write(dmd, new IIOImage(jpg, null, null), null);
This will write a DICOM-conformant file with SOP class "secondary capture" and default metadata.
To customize the metadata, add data elements to dmd before writing, e.g. :
DataSet ds = dmd.getDataSet();
ds.set(Tag.StudyDate, LocalDate.of(2011, 4, 4));
ds.set(Tag.StudyTime, LocalTime.of(15, 0, 0));
You can also change the transfer syntax (thus controlling the pixel data encoding) :
dw.setTransferSyntax(UID.JPEG2000TS);
Disclaimer: I'm the author of DeCaMino.
EDIT: As kritzel_sw says, I strongly advise against modifying an existing DICOM object by changing the pixel data and some data elements; you'll mostly end up with a non-conformant object. It is better to write an object from scratch, and the simplest objects are from the secondary capture class. DeCaMino helps you by generating a conformant secondary capture object with the mandatory data elements, but it won't help you generate a modality object (like a CT acquisition).
Just a side note:
attribs.setBytes(Tag.PixelData, VR.OW, byteBuf.array());
VR.OW means 16 bits per pixel/channel. Since you are replacing the pixel data with pixel data read from a JPEG image, and you named the buffer "byteBuf", I suspect this is inconsistent. VR.OB is the value representation for 8-bits-per-pixel/channel images.
Talking about channels, I understand that you want to make construction of a DICOM object easy by modifying an existing DICOM image rather than creating a new one from scratch. However, color pixel data is not appropriate for all types of DICOM images. E.g. if your fileToCopyheaderFrom.dcm is a Radiography, CT or MRI image (or many other radiology types), it is not allowed to add color pixel data to it.
Furthermore, each image contains identifying information (Study-, Series-, SOP Instance UID are the most important ones) which should be replaced by newly generated values.
I understand that it appears appealing to modify an existing DICOM object with new pixel data, but this process is probably much more complicated than you would expect it to be. In both cases, it is inevitable to learn basic DICOM concepts.
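To make the VR.OB point above concrete: an 8-bit sample fits in one byte, so the short-widening loop from the question doubles the pixel data size for nothing. A standalone sketch (synthetic 16x16 RGB image, no DICOM library involved) of extracting the samples byte-for-byte:

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;

public class PixelBytes {

    // Extracts 8-bit samples as a byte[], suitable for VR.OB pixel data
    public static byte[] toBytes(BufferedImage img) {
        DataBuffer buff = img.getData().getDataBuffer();
        byte[] pixelData = new byte[buff.getSize()];
        for (int i = 0; i < pixelData.length; i++) {
            pixelData[i] = (byte) buff.getElem(i); // one byte per 8-bit sample
        }
        return pixelData;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(16, 16, BufferedImage.TYPE_3BYTE_BGR);
        // 16 x 16 pixels x 3 channels = 768 bytes,
        // half the size of the question's short-based buffer
        System.out.println(toBytes(img).length);
    }
}
```

The resulting array can be handed to setBytes(Tag.PixelData, VR.OB, ...) instead of the OW/short version, provided BitsAllocated and friends are set to 8-bit values to match.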
I'm trying to load pre-trained models in Tensorflow using the Java API.
I notice that over time the format of the saved model files has changed, and now there are saved models in .pb and .ckpt formats, as well as model directories containing model.ckpt.data-00000-of-00001 and model.ckpt.index.
I am following the way a model is read in the LabelImage example, but in that example the file format is protobuf (.pb). I see that the latest saved models are stored as .ckpt files or as model.ckpt.data-00000-of-00001 / model.ckpt.index pairs.
I tried to use the SavedModelBundle method with export_dir pointing at the directory containing model.ckpt.data-00000-of-00001 and model.ckpt.index, but I get this error:
2018-07-18 16:54:00.388790: I tensorflow/cc/saved_model/loader.cc:291] SavedModel load for tags { }; Status: fail. Took 95 microseconds.
Exception in thread "main" org.tensorflow.TensorFlowException: SavedModel not found in export directory: /path/to/model_dir
at org.tensorflow.SavedModelBundle.load(Native Method)
at org.tensorflow.SavedModelBundle.load(SavedModelBundle.java:39)
Could someone please tell me what I'm doing wrong, or let me know how I can read saved models in formats other than .pb from Java?
I think there are 2 ways that you can try to solve your problem:
Convert the format of the saved model (checkpoint file) to protobuf file
After restoring the saved model into the current session sess:

# Freeze the graph; output_node_names is the list of output node names
# defined when the model was constructed, e.g. output_node_names = ["prediction"]
frozen_graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, output_node_names)
# Save the frozen graph
with open(frozen_graph_file, "wb") as f:
    f.write(frozen_graph_def.SerializeToString())

It should convert the former format to the new one.
Retrain and save the model in .pb format.
I wish to get the page size of a document, such as A4, A5, A6, etc.
The solution I found is parsing the PostScript text and extracting the string A6 from
featurebegin{
%%BeginFeature: *PageSize A6
<</DeferredMediaSelection true /PageSize [298 420] /ImagingBBox null /MediaClass null>> setpagedevice
%%EndFeature
}featurecleanup
but this works slowly...
How can I do this? Do any libraries exist for getting full document information?
I would prefer solutions in Java, if any exist.
Your solution only works for a file conforming to the DSC (Document Structuring Conventions). While many files do conform, others do not. Also, it only works if the PostScript file contains that comment (% introduces a comment in PostScript).
You could instead override the setpagedevice operator and have it print the requested media size if present.
/Oldsetpagedevice /setpagedevice load def
/setpagedevice {
  dup /PageSize known {
    dup /PageSize get
    dup 0 get 20 string cvs exch 1 get 20 string cvs exch
    (Requested Media Size is ) print print ( points by ) print print ( points\n) print
  } if
  Oldsetpagedevice
} bind def
What do you mean by 'full document information'? By the way, you need to be aware that (unlike PDF) PostScript files are programs, not documents, so the only way to know what's really going on is to interpret the program.
You could use Ghostscript, but it does not have a Java interface, and you would need to be much more specific about the information you want.
If you run the PostScript through Ghostscript with -sDEVICE=bbox (e.g. gs -q -dNOPAUSE -dBATCH -sDEVICE=bbox file.ps), it reports the corners of a rectangle which crops the rendered output, which may be (close to) what you want.
The info is usually printed to stderr in the DSC %%BoundingBox: x0 y0 x1 y1 format.
Is there some way to extract "alternate text" for a specific image using PDFBox?
I have a PDF file which, as described at http://www.w3.org/WAI/GL/2011/WD-WCAG20-TECHS-20110621/pdf.html#PDF1, has had alternate text added to an image. Using PDFBox I can navigate the object model to the image itself (a PDXObjectImage) via PDDocument.getDocumentCatalog().getAllPages() [iterator] .getResources().getImages(), but I cannot see any way to get from the image itself to its alternate text.
A small sample PDF (with a single image which has some alternate text specified) can be found at http://dl.dropbox.com/u/12253279/image_test_pass.pdf (It should say "This is the alternate text for the image.").
I do not know how/if this can be done with PDFBox, but I can tell you that this feature relates to the sections of the PDF spec called Logical Structure/Tagged PDF, which is not fully supported in every PDF tool out there.
Assuming it is supported by the tool you are using, you will have to follow 4 main steps to retrieve this information (I will use the sample PDF file you posted for the following explanation).
Assuming you have access to the internal structure of the PDF file, you will need to:
1- Parse the page content and find the MCID number of the Tag element that wraps the image you are interested in.
Page content:
BT
/P <</MCID 0 >>BDC
/GS0 gs
/TT0 1 Tf
0.0004 Tc -0.0028 Tw 10.02 0 0 10.02 90 711 Tm
(This is an image test )Tj
EMC
ET
/Figure <</MCID 1 >>BDC
q
106.5 0 0 106.5 90 591.0599976 cm
/Im0 Do
Q
EMC
Your image is the /Im0 Do between the /Figure <</MCID 1 >>BDC and EMC markers above.
2- In the page object, retrieve the key StructParents.
3- Now retrieve the Structure Tree (key StructTreeRoot of the Catalog object, which is the root object in every PDF file), and inside it, the ParentTree.
4- The ParentTree starts with an array containing pairs of elements (see Number Trees in the PDF spec for more details). In this specific tree, the first element of each pair is a numeric value that corresponds to the StructParents key retrieved in step 2, and the second element is an array of objects whose indexes correspond to the MCID values retrieved in step 1. So you will search here for the element that corresponds to the MCID value of your image, and you will find a PDF object. Inside this object, you will find the alternate text.
Looks easy, doesn't it?
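The indirection in steps 2-4 can be sketched as a toy in-memory model (plain Java, purely illustrative — the maps and strings below are made up, not a PDF API):

```java
import java.util.List;
import java.util.Map;

public class ParentTreeLookup {

    // Toy model: the ParentTree maps each StructParents number to an
    // array of structure elements indexed by MCID.
    static final Map<Integer, List<String>> PARENT_TREE = Map.of(
        0, List.of("P element (MCID 0)", "Figure element (MCID 1)")
    );

    // Resolves the structure element for the sample page above; in a real
    // file this element's /Alt entry holds the alternate text of the image.
    public static String altHolder() {
        int structParents = 0; // step 2: /StructParents from the page object
        int mcid = 1;          // step 1: /Figure MCID from the content stream
        return PARENT_TREE.get(structParents).get(mcid); // steps 3-4
    }

    public static void main(String[] args) {
        System.out.println(altHolder());
    }
}
```

The real ParentTree can also contain intermediate kids and limits nodes, so a robust implementation has to walk the number tree rather than index a flat array, but the lookup chain is exactly this one.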
Tools used in this answer:
PDF Vole (based on iText)
Amyuni PDF Analyzer
Eric from the PDFBox mailing list sent me the following, though I've not tested it out yet...
Hi,
For your test file, here is a way to access the "/Alt" entry:
PDDocument document = PDDocument.load("image_test_pass.pdf");
PDStructureTreeRoot treeRoot = document.getDocumentCatalog().getStructureTreeRoot();
// get the page for each StructElement
for (Object o : treeRoot.getKids()) {
    if (o instanceof PDStructureElement) {
        PDStructureElement structElement = (PDStructureElement) o;
        System.out.println(structElement.getAlternateDescription());
        PDPage page = structElement.getPage();
        if (page != null) {
            page.getResources().getImages();
        }
    }
}
Please refer to the PDF specification http://www.adobe.com/devnet/acrobat/pdfs/PDF32000_2008.pdf, in particular §14.6, §14.7, §14.9.3 and §14.9.4, to learn all the rules for finding the "/Alt" entry. There seem to be several ways to define this information.
BR,
Eric