Writing a Jagged Array in HDF5 using the Java Native Library - java

I have tried numerous ways and followed some of the examples that are scattered around the web on how to write a jagged array (an array of arrays that may be of differing lengths) in HDF5.
Most of the examples are in C and rather low-level. In any case I can't get it to work, and when I looked at the C source code it pretty much says that variable-length datatypes other than strings are not supported (if I understood correctly).
My miserable dysfunctional code (as is):
public void WIP_createVLenFloatDataSet( List<? extends Number> floats ) throws Exception
{
    String group = "/test";
    long groupId = createGroupIfNotExist( group );
    MDataQualifier qualifier = new MDataQualifierImpl( group, "float", "0.0.0" );
    long datasetId = openDataSet( qualifier );
    long heapType = H5.H5Tcopy( MDataType.FLOAT_ARRAY.getHDFType() );
    heapType = H5.H5Tvlen_create( heapType );
    // heapType = H5.H5Tarray_create( heapType, 1, new long[]{1} );
    if( !exists( datasetId ) )
    {
        long[] maxDims = new long[]{ HDF5Constants.H5S_UNLIMITED };
        long dataspaceId = H5.H5Screate_simple( 1, new long[]{ 1 }, null );
        // Create the dataset.
        long datasetId1 = -1;
        try
        {
            if( exists( m_fileId ) && exists( dataspaceId ) && exists( heapType ) )
            {
                long creationProperties = H5.H5Pcreate( HDF5Constants.H5P_DATASET_CREATE );
                H5.H5Pset_chunk( creationProperties, /*ndims*/1, new long[]{ 1 } );
                datasetId1 = H5.H5Dcreate( groupId, qualifier.getVersionedName(), heapType, dataspaceId, H5P_DEFAULT, creationProperties, H5P_DEFAULT );
                // H5.H5Pclose( creationProperties );
            }
        }
        catch( Exception e )
        {
            LOG.error( "Problems creating the dataset: " + e.getMessage(), e );
        }
        datasetId = datasetId1;
        if( exists( datasetId ) )
        {
            // flushIfNecessary();
            LOG.trace( "Wrote empty dataset {}", qualifier.getVersionedName() );
        }
    }
    List<? extends Number> data = ( List<? extends Number> )floats;
    // H5.H5Dwrite( datasetId, heapType, dataspaceId, memSpaceId, HDF5Constants.H5P_DEFAULT, Floats.toArray( data) );
    ByteBuffer bb = ByteBuffer.allocate( data.size() * 4 );
    floats.forEach( f -> bb.putFloat( f.floatValue() ) );
    // H5.H5Dwrite( datasetId, heapType, H5S_ALL, H5S_ALL, H5P_DEFAULT, Floats.toArray( data ) );
    H5.H5Dwrite( datasetId, heapType, H5S_ALL, H5S_ALL, H5P_DEFAULT, bb.array() );
}
Has anyone done this before and can at least confirm that it's not possible?
The most I can get out of HDF5 is the message "buf does not support variable length type".

Apparently the "glue code" of the JNI wrapper doesn't support this. If you want to use this feature you either have to implement your own JNI or wait for a newer version. The official JNI code is open source and can be found here.

Related

mapstruct does not set relationship properly on a bi-directional OneToMany

I have a JPA one-to-many bi-directional association. In my code I set the relationship on both sides, but the generated MapStruct code does not set the relationship properly; it only sets one side.
I pasted part of my code below. The commented line is one I added manually; it should have been generated by MapStruct:
derivativeFuture.setDerivativeExecutions( derivativeExecutionDTOSetToDerivativeExecutionSet( derivativeDTO.getDerivativeExecutions() ) );
//derivativeFuture.getDerivativeExecutions().forEach(derivativeExecution -> { derivativeExecution.setDerivative(derivativeFuture); });
protected Set<DerivativeExecution> derivativeExecutionDTOSetToDerivativeExecutionSet(Set<DerivativeExecutionDTO> set) {
    if ( set == null ) {
        return null;
    }
    Set<DerivativeExecution> set1 = new HashSet<DerivativeExecution>( Math.max( (int) ( set.size() / .75f ) + 1, 16 ) );
    for ( DerivativeExecutionDTO derivativeExecutionDTO : set ) {
        set1.add( derivativeExecutionDTOToDerivativeExecution( derivativeExecutionDTO ) );
    }
    return set1;
}

protected DerivativeExecution derivativeExecutionDTOToDerivativeExecution(DerivativeExecutionDTO derivativeExecutionDTO) {
    if ( derivativeExecutionDTO == null ) {
        return null;
    }
    DerivativeExecution derivativeExecution = new DerivativeExecution();
    derivativeExecution.setPhysicalQuantity( derivativeExecutionDTO.getPhysicalQuantity() );
    derivativeExecution.setExchangeQuantity( derivativeExecutionDTO.getExchangeQuantity() );
    derivativeExecution.setPurchaseSaleIndicator( derivativeExecutionDTO.getPurchaseSaleIndicator() );
    derivativeExecution.setQuotePricingStartDate( derivativeExecutionDTO.getQuotePricingStartDate() );
    derivativeExecution.setQuotePricingEndDate( derivativeExecutionDTO.getQuotePricingEndDate() );
    derivativeExecution.setContractExecutionId( derivativeExecutionDTO.getContractExecutionId() );
    return derivativeExecution;
}
There are 2 options: the adder-preferred collection mapping strategy: http://mapstruct.org/documentation/stable/reference/html/#collection-mapping-strategies, or using a context: https://github.com/mapstruct/mapstruct-examples/tree/master/mapstruct-jpa-child-parent.
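For the first option, here is a minimal sketch of what the adder-preferred strategy could look like, using the entity names from the question; the mapper method name and the adder method on the entity are assumptions about your model, not generated code:
// DerivativeMapper.java (sketch)
import org.mapstruct.CollectionMappingStrategy;
import org.mapstruct.Mapper;

// ADDER_PREFERRED makes MapStruct call addDerivativeExecution(..) once per element
// instead of setDerivativeExecutions(..), so the adder can wire both sides itself.
@Mapper( collectionMappingStrategy = CollectionMappingStrategy.ADDER_PREFERRED )
public interface DerivativeMapper {
    DerivativeFuture derivativeDTOToDerivativeFuture( DerivativeDTO derivativeDTO );
}

// DerivativeFuture.java (entity side, hand-written)
import java.util.HashSet;
import java.util.Set;

public class DerivativeFuture {
    private Set<DerivativeExecution> derivativeExecutions = new HashSet<DerivativeExecution>();

    public void addDerivativeExecution( DerivativeExecution execution ) {
        execution.setDerivative( this );             // owning side
        this.derivativeExecutions.add( execution );  // inverse side
    }

    public Set<DerivativeExecution> getDerivativeExecutions() {
        return derivativeExecutions;
    }
}
The second option achieves the same back-reference wiring through a context object passed into the mapping instead of the entity adder; the linked mapstruct-jpa-child-parent example shows that variant.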

Access to Drools returned fact object in Java Code

I have a drools rule created via the Guvnor console and the rule validates and inserts a fact into the working memory if conditions were met. The rule is:
1. | rule "EligibilityCheck001"
2. | dialect "mvel"
3. | when
4. | Eligibility( XXX== "XXX" , YYY== "YYY" , ZZZ== "ZZZ" , BBB == "BBB" )
5. | then
6. | EligibilityInquiry fact0 = new EligibilityInquiry();
7. | fact0.setServiceName( "ABCD" );
8. | fact0.setMemberStatus( true );
9. | insert(fact0 );
10. | System.out.println( "Hello from Drools");
11. | end
Java code that executes the rule is as follows
RuleAgent ruleAgent = RuleAgent.newRuleAgent("/Guvnor.properties");
RuleBase ruleBase = ruleAgent.getRuleBase();
FactType factType = ruleBase.getFactType("mortgages.Eligibility");
Object obj = factType.newInstance();
factType.set(obj, "XXX", "XXX");
factType.set(obj, "YYY", "YYY");
factType.set(obj, "ZZZ", "XXX");
factType.set(obj, "BBB", "BBB");
WorkingMemory workingMemory = ruleBase.newStatefulSession();
workingMemory.insert(obj);
workingMemory.fireAllRules();
System.out.println("After drools execution");
long count = workingMemory.getFactCount();
System.out.println("count " + count);
Everything looks great with the output as below:
Hello from Drools
After drools execution
count 2
I cannot seem to find a way to get the EligibilityInquiry fact object back in my Java code and get the attributes set in the rule above (serviceName and status). I have used the StatefulSession approach.
The properties file has the link to the snapshot with basic authentication via username and password. There are 2 total facts: EligibilityInquiry and Eligibility.
I am fairly new to drools and any help with the above is appreciated.
(Note: I fixed the statement order and a typo ("XX"), and removed the comments from the output, to reduce surprises.)
This snippet assumes that EligibilityInquiry is also declared in DRL.
FactType eligInqFactType = ruleBase.getFactType("mortgages", "EligibilityInquiry");
Class<?> eligInqClass = eligInqFactType.getFactClass();
ObjectFilter filter = new FilterByClass( eligInqClass );
Collection<Object> eligInqs = workingMemory.getObjects( filter );
And the filter is:
public class FilterByClass implements ObjectFilter {
    private Class<?> theClass;
    public FilterByClass( Class<?> clazz ){
        theClass = clazz;
    }
    public boolean accept(Object object){
        return theClass.isInstance( object );
    }
}
You might also use a query, which takes about the same amount of code.
// DRL code
query "eligInqs"
eligInq : EligibilityInquiry()
end
// after return from fireAllRules
QueryResults results = workingMemory.getQueryResults( "eligInqs" );
for ( QueryResultsRow row : results ) {
Object eligInqObj = row.get( "eligInq" );
System.out.println( eligInqClass.cast( eligInqObj ) );
}
Or you can call workingMemory.getObjects() and iterate the collection and check for the class of each object.
for( Object obj : workingMemory.getObjects() ){
    if( eligInqClass.isInstance( obj ) ){
        System.out.println( eligInqClass.cast( obj ) );
    }
}
Or you can (with or without inserting the created EligibilityInquiry object as a fact) add the fact to a global java.util.List eligInqList and iterate that in your Java code. Note that the API of StatefulKnowledgeSession is required (instead of WorkingMemory).
// Java - prior to fireAllRules
StatefulKnowledgeSession kSession = ruleBase.newStatefulSession();
List<Object> list = new ArrayList<Object>();
kSession.setGlobal( "eligInqList", list );
// DRL
global java.util.List eligInqList;
// in a rule
then
    EligibilityInquiry fact0 = new EligibilityInquiry();
    fact0.setServiceName( "ABCD" );
    fact0.setMemberStatus( true );
    insert( fact0 );
    eligInqList.add( fact0 );
end
// after return from fireAllRules
for( Object elem : list ){
    System.out.println( eligInqClass.cast( elem ) );
}
Probably an embarras de richesses.

Javassist - How to add line number to method

I am a newbie to Java bytecode and Javassist. I created a new class file using Javassist. Although I added fields and methods, I could not manage to add line numbers to a method. From my research, I understand that I need to add a LineNumberAttribute to the CodeAttribute of the MethodInfo, and that the LineNumberAttribute consists of a line number table. I don't know how to create a new LineNumberAttribute with Javassist.
I am writing a compiler that produces JVM code. I need line numbers in the output. I do it this way.
I build up a list of objects similar to this:
public class MyLineNum {
    public final short pc;
    public final short lineNum;
}
Then I add the line number table:
final ClassFile classFile = ...;
final ConstPool constPool = classFile.getConstPool();
final MethodInfo minfo = new MethodInfo( ... );
final Bytecode code = new Bytecode( constPool );
... code that writes to 'code'
final List<MyLineNum> lineNums = new ArrayList<>();
... code that adds to 'lineNums'
final CodeAttribute codeAttr = code.toCodeAttribute();
if ( !lineNums.isEmpty() ) {
// JVM spec describes method line number table thus:
// u2 line_number_table_length;
// { u2 start_pc;
// u2 line_number;
// } line_number_table[ line_number_table_length ];
final int numLineNums = lineNums.size();
final byte[] lineNumTbl = new byte[ ( numLineNums * 4 ) + 2 ];
// Write line_number_table_length.
int byteIx = 0;
ByteArray.write16bit( numLineNums, lineNumTbl, byteIx );
byteIx += 2;
// Write the individual line number entries.
for ( final MyLineNum ln : lineNums) {
// start_pc
ByteArray.write16bit( ln.pc, lineNumTbl, byteIx );
byteIx += 2;
// line_number
ByteArray.write16bit( ln.lineNum, lineNumTbl, byteIx );
byteIx += 2;
}
// Add the line number table to the CodeAttribute.
@SuppressWarnings("unchecked")
final List<AttributeInfo> codeAttrAttrs = codeAttr.getAttributes();
codeAttrAttrs.removeIf( ( ai ) -> ai.getName().equals( "LineNumberTable" ) ); // remove if already present
codeAttrAttrs.add( new AttributeInfo( constPool, "LineNumberTable", lineNumTbl ) );
}
// Attach the CodeAttribute to the MethodInfo.
minfo.setCodeAttribute( codeAttr );
// Attach the MethodInfo to the ClassFile.
try {
classFile.addMethod( minfo );
}
catch ( final DuplicateMemberException ex ) {
throw new AssertionError( "Caught " + ex, ex );
}

Groovy/Java: Ini4j insert multiple values to single parameter in different lines

I'm trying to add a multi-value option to my ini file from Groovy using ini4j, with the following code (I tried a few variants):
import org.ini4j.Wini
List valuesList = [ 'val1', 'val2', 'val3' ]
( new Wini( new File( "test.ini" ) ) ).with{
    valuesList.each{
        put( 'sectionName', 'optionName', it )
    }
    store()
}
import org.ini4j.Wini
List valuesList = [ 'val1', 'val2', 'val3' ]
( new Wini( new File( "test.ini" ) ) ).with{
    Section sectionObject = get( 'sectionName' )
    sectionObject.put( 'optionName', 'val1' )
    sectionObject.put( 'optionName', 'val2' )
    sectionObject.put( 'optionName', 'val3' )
    store()
}
I got ini file like this one:
[sectionName]
optionName = val3
But I want to get:
[sectionName]
optionName = val1
optionName = val2
optionName = val3
Could you please advise me how to resolve this? Thanks in advance!
Update 1
I am still waiting for a more elegant solution, but in the meantime I implemented direct ini file editing below. Please give me any feedback on it:
List newLines = []
File currentFile = new File( "test.ini" )
List currentLines = currentFile.readLines()
int indexSectionStart = currentLines.indexOf( '[sectionName]' )
(0..indexSectionStart).each{
    newLines.add( currentLines[ it ] )
}
List valuesList = 'val1,val2,val3'.split( ',' )
valuesList.each{
    newLines.add( "optionName = $it" )
}
( indexSectionStart + 1 .. currentLines.size() - 1 ).each{
    newLines.add( currentLines[ it ] )
}
File newFile = new File( "new_test.ini" )
if ( newFile.exists() ) newFile.delete()
newLines.each {
    newFile.append( it + '\n' )
}
Then I simply delete the old file and rename the new one. I implemented it this way because I didn't find any insertLine()-like methods in the standard File class.
Right, how's this:
import org.ini4j.*
List valuesList = [ 'val1', 'val2', 'val3' ]
new File( "/tmp/test.ini" ).with { file ->
    new Wini().with { ini ->
        // Configure to allow multiple options
        ini.config = new Config().with { it.multiOption = true ; it }
        // Load the ini file
        ini.load( file )
        // Get or create the section
        ( ini.get( 'sectionName' ) ?: ini.add( 'sectionName' ) ).with { section ->
            valuesList.each {
                // Then ADD the options
                section.add( 'optionName', it )
            }
        }
        // And write it back out
        store( file )
    }
}

Solr WordDelimiterFilter + Lucene Highlighter

I am trying to get the Highlighter class from Lucene to work properly with tokens coming from Solr's WordDelimiterFilter. It works 90% of the time, but if the matching text contains a ',' such as "1,500" the output is incorrect:
Expected: 'test <b>1,500</b> this'
Observed: 'test 1<b>1,500</b> this'
I am not currently sure whether it is Highlighter messing up the recombination or WordDelimiterFilter messing up the tokenization but something is unhappy. Here are the relevant dependencies from my pom:
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>2.9.3</version>
    <type>jar</type>
    <scope>compile</scope>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-highlighter</artifactId>
    <version>2.9.3</version>
    <type>jar</type>
    <scope>compile</scope>
</dependency>
<dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-core</artifactId>
    <version>1.4.0</version>
    <type>jar</type>
    <scope>compile</scope>
</dependency>
And here is a simple JUnit test class demonstrating the problem:
package test.lucene;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.util.Version;
import org.apache.solr.analysis.StandardTokenizerFactory;
import org.apache.solr.analysis.WordDelimiterFilterFactory;
import org.junit.Test;
public class HighlighterTester {
private static final String PRE_TAG = "<b>";
private static final String POST_TAG = "</b>";
private static String[] highlightField( Query query, String fieldName, String text )
throws IOException, InvalidTokenOffsetsException {
SimpleHTMLFormatter formatter = new SimpleHTMLFormatter( PRE_TAG, POST_TAG );
Highlighter highlighter = new Highlighter( formatter, new QueryScorer( query, fieldName ) );
highlighter.setTextFragmenter( new SimpleFragmenter( Integer.MAX_VALUE ) );
return highlighter.getBestFragments( getAnalyzer(), fieldName, text, 10 );
}
private static Analyzer getAnalyzer() {
return new Analyzer() {
@Override
public TokenStream tokenStream( String fieldName, Reader reader ) {
// Start with a StandardTokenizer
TokenStream stream = new StandardTokenizerFactory().create( reader );
// Chain on a WordDelimiterFilter
WordDelimiterFilterFactory wordDelimiterFilterFactory = new WordDelimiterFilterFactory();
HashMap<String, String> arguments = new HashMap<String, String>();
arguments.put( "generateWordParts", "1" );
arguments.put( "generateNumberParts", "1" );
arguments.put( "catenateWords", "1" );
arguments.put( "catenateNumbers", "1" );
arguments.put( "catenateAll", "0" );
wordDelimiterFilterFactory.init( arguments );
return wordDelimiterFilterFactory.create( stream );
}
};
}
@Test
public void TestHighlighter() throws ParseException, IOException, InvalidTokenOffsetsException {
String fieldName = "text";
String text = "test 1,500 this";
String queryString = "1500";
String expected = "test " + PRE_TAG + "1,500" + POST_TAG + " this";
QueryParser parser = new QueryParser( Version.LUCENE_29, fieldName, getAnalyzer() );
Query q = parser.parse( queryString );
String[] observed = highlightField( q, fieldName, text );
for ( int i = 0; i < observed.length; i++ ) {
System.out.println( "\t" + i + ": '" + observed[i] + "'" );
}
if ( observed.length > 0 ) {
System.out.println( "Expected: '" + expected + "'\n" + "Observed: '" + observed[0] + "'" );
assertEquals( expected, observed[0] );
}
else {
assertTrue( "No matches found", false );
}
}
}
Anyone have any ideas or suggestions?
After further investigation, this appears to be a bug in the Lucene Highlighter code. As you can see here:
public class TokenGroup {
...
protected boolean isDistinct() {
return offsetAtt.startOffset() >= endOffset;
}
...
The code attempts to determine if a group of tokens is distinct by checking to see if the start offset is greater than the previous end offset. The problem with this approach is illustrated by this issue. If you were to step through the tokens, you would see that they are as follows:
0-4: 'test', 'test'
5-6: '1', '1'
7-10: '500', '500'
5-10: '1500', '1,500'
11-15: 'this', 'this'
From this you can see that the third token starts after the end of the second, but the fourth starts at the same place as the second. The intended outcome would be to group tokens 2, 3, and 4, but with this implementation token 3 is seen as separate from 2, so 2 shows up by itself, then 3 and 4 get grouped, leaving this outcome:
Expected: 'test <b>1,500</b> this'
Observed: 'test 1<b>1,500</b> this'
I'm not sure this can be accomplished without 2 passes, one to get all the indexes and a second to combine them. Also, I'm not sure what the implications would be outside of this specific case. Does anyone have any ideas here?
EDIT
Here is the final source code I came up with. It groups things correctly. It also appears to be MUCH simpler than the Lucene Highlighter implementation, but admittedly does not handle different levels of scoring, as my application only needs a yes/no as to whether a fragment of text gets highlighted. It's also worth noting that I am using their QueryScorer to score the text fragments, which has the weakness of being Term oriented rather than Phrase oriented, which means the search string "grammatical or spelling" would end up with highlighting that looks something like "<b>grammatical</b> or <b>spelling</b>", as the "or" would most likely get dropped by your analyzer. Anyway, here is my source:
public TextFragments<E> getTextFragments( TokenStream tokenStream,
String text,
Scorer scorer )
throws IOException, InvalidTokenOffsetsException {
OffsetAttribute offsetAtt = (OffsetAttribute) tokenStream.addAttribute( OffsetAttribute.class );
TermAttribute termAtt = (TermAttribute) tokenStream.addAttribute( TermAttribute.class );
TokenStream newStream = scorer.init( tokenStream );
if ( newStream != null ) {
tokenStream = newStream;
}
TokenGroups tgs = new TokenGroups();
scorer.startFragment( null );
while ( tokenStream.incrementToken() ) {
tgs.add( offsetAtt.startOffset(), offsetAtt.endOffset(), scorer.getTokenScore() );
if ( log.isTraceEnabled() ) {
log.trace( new StringBuilder()
.append( scorer.getTokenScore() )
.append( " " )
.append( offsetAtt.startOffset() )
.append( "-" )
.append( offsetAtt.endOffset() )
.append( ": '" )
.append( termAtt.term() )
.append( "', '" )
.append( text.substring( offsetAtt.startOffset(), offsetAtt.endOffset() ) )
.append( "'" )
.toString() );
}
}
return tgs.fragment( text );
}
private class TokenGroup {
private int startIndex;
private int endIndex;
private float score;
public TokenGroup( int startIndex, int endIndex, float score ) {
this.startIndex = startIndex;
this.endIndex = endIndex;
this.score = score;
}
}
private class TokenGroups implements Iterable<TokenGroup> {
private List<TokenGroup> tgs;
public TokenGroups() {
tgs = new ArrayList<TokenGroup>();
}
public void add( int startIndex, int endIndex, float score ) {
add( new TokenGroup( startIndex, endIndex, score ) );
}
public void add( TokenGroup tg ) {
for ( int i = tgs.size() - 1; i >= 0; i-- ) {
if ( tg.startIndex < tgs.get( i ).endIndex ) {
tg = merge( tg, tgs.remove( i ) );
}
else {
break;
}
}
tgs.add( tg );
}
private TokenGroup merge( TokenGroup tg1, TokenGroup tg2 ) {
return new TokenGroup( Math.min( tg1.startIndex, tg2.startIndex ),
Math.max( tg1.endIndex, tg2.endIndex ),
Math.max( tg1.score, tg2.score ) );
}
private TextFragments<E> fragment( String text ) {
TextFragments<E> fragments = new TextFragments<E>();
int lastEndIndex = 0;
for ( TokenGroup tg : this ) {
if ( tg.startIndex > lastEndIndex ) {
fragments.add( text.substring( lastEndIndex, tg.startIndex ), textModeNormal );
}
fragments.add(
text.substring( tg.startIndex, tg.endIndex ),
tg.score > 0 ? textModeHighlighted : textModeNormal );
lastEndIndex = tg.endIndex;
}
if ( lastEndIndex < ( text.length() - 1 ) ) {
fragments.add( text.substring( lastEndIndex ), textModeNormal );
}
return fragments;
}
@Override
public Iterator<TokenGroup> iterator() {
return tgs.iterator();
}
}
Here's a possible cause.
Your highlighter needs to use the same Analyzer used for the search. IIUC, your code uses a default analyzer for the highlighting, even though it uses a specialized analyzer for parsing the query. I believe you need to change the Fragmenter to work with your specific TokenStream.
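If that is the cause, one way to rule it out is to tokenize the text with your custom analyzer yourself and hand that exact TokenStream to the highlighter instead of letting it re-analyze. This is only a sketch, reusing query, fieldName, text, formatter and getAnalyzer() from the highlightField() helper above, and it additionally needs java.io.StringReader on the import list:
// Build the TokenStream with the same analyzer chain used to parse the query...
TokenStream tokenStream = getAnalyzer().tokenStream( fieldName, new StringReader( text ) );
// ...and pass it to the highlighter explicitly.
Highlighter highlighter = new Highlighter( formatter, new QueryScorer( query, fieldName ) );
highlighter.setTextFragmenter( new SimpleFragmenter( Integer.MAX_VALUE ) );
String[] fragments = highlighter.getBestFragments( tokenStream, text, 10 );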
