I am trying to get the Highlighter class from Lucene to work properly with tokens coming from Solr's WordDelimiterFilter. It works 90% of the time, but if the matching text contains a ',' such as "1,500" the output is incorrect:
Expected: 'test <b>1,500</b> this'
Observed: 'test 1<b>1,500</b> this'
I am not currently sure whether it is the Highlighter messing up the recombination or the WordDelimiterFilter messing up the tokenization, but something is unhappy. Here are the relevant dependencies from my pom:
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>2.9.3</version>
    <type>jar</type>
    <scope>compile</scope>
</dependency>
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-highlighter</artifactId>
    <version>2.9.3</version>
    <type>jar</type>
    <scope>compile</scope>
</dependency>
<dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-core</artifactId>
    <version>1.4.0</version>
    <type>jar</type>
    <scope>compile</scope>
</dependency>
And here is a simple JUnit test class demonstrating the problem:
package test.lucene;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.util.Version;
import org.apache.solr.analysis.StandardTokenizerFactory;
import org.apache.solr.analysis.WordDelimiterFilterFactory;
import org.junit.Test;
public class HighlighterTester {
private static final String PRE_TAG = "<b>";
private static final String POST_TAG = "</b>";
private static String[] highlightField( Query query, String fieldName, String text )
throws IOException, InvalidTokenOffsetsException {
SimpleHTMLFormatter formatter = new SimpleHTMLFormatter( PRE_TAG, POST_TAG );
Highlighter highlighter = new Highlighter( formatter, new QueryScorer( query, fieldName ) );
highlighter.setTextFragmenter( new SimpleFragmenter( Integer.MAX_VALUE ) );
return highlighter.getBestFragments( getAnalyzer(), fieldName, text, 10 );
}
private static Analyzer getAnalyzer() {
return new Analyzer() {
@Override
public TokenStream tokenStream( String fieldName, Reader reader ) {
// Start with a StandardTokenizer
TokenStream stream = new StandardTokenizerFactory().create( reader );
// Chain on a WordDelimiterFilter
WordDelimiterFilterFactory wordDelimiterFilterFactory = new WordDelimiterFilterFactory();
HashMap<String, String> arguments = new HashMap<String, String>();
arguments.put( "generateWordParts", "1" );
arguments.put( "generateNumberParts", "1" );
arguments.put( "catenateWords", "1" );
arguments.put( "catenateNumbers", "1" );
arguments.put( "catenateAll", "0" );
wordDelimiterFilterFactory.init( arguments );
return wordDelimiterFilterFactory.create( stream );
}
};
}
@Test
public void TestHighlighter() throws ParseException, IOException, InvalidTokenOffsetsException {
String fieldName = "text";
String text = "test 1,500 this";
String queryString = "1500";
String expected = "test " + PRE_TAG + "1,500" + POST_TAG + " this";
QueryParser parser = new QueryParser( Version.LUCENE_29, fieldName, getAnalyzer() );
Query q = parser.parse( queryString );
String[] observed = highlightField( q, fieldName, text );
for ( int i = 0; i < observed.length; i++ ) {
System.out.println( "\t" + i + ": '" + observed[i] + "'" );
}
if ( observed.length > 0 ) {
System.out.println( "Expected: '" + expected + "'\n" + "Observed: '" + observed[0] + "'" );
assertEquals( expected, observed[0] );
}
else {
assertTrue( "No matches found", false );
}
}
}
Anyone have any ideas or suggestions?
After further investigation, this appears to be a bug in the Lucene Highlighter code. As you can see here:
public class TokenGroup {
...
protected boolean isDistinct() {
return offsetAtt.startOffset() >= endOffset;
}
...
The code attempts to determine whether a group of tokens is distinct by checking whether the start offset is greater than or equal to the previous end offset. The problem with this approach is illustrated by this very issue. If you step through the tokens, you will see that they are as follows:
0-4: 'test', 'test'
5-6: '1', '1'
7-10: '500', '500'
5-10: '1500', '1,500'
11-15: 'this', 'this'
From this you can see that the third token starts after the end of the second, but the fourth starts at the same place as the second. The intended outcome would be to group tokens 2, 3, and 4, but per this implementation token 3 is seen as separate from 2, so 2 shows up by itself, then 3 and 4 get grouped, leaving this outcome:
Expected: 'test <b>1,500</b> this'
Observed: 'test 1<b>1,500</b> this'
I'm not sure this can be accomplished without two passes: one to get all the indexes, and a second to combine them. Also, I'm not sure what the implications would be outside of this specific case. Does anyone have any ideas here?
EDIT
Here is the final source code I came up with. It groups things correctly. It also appears to be MUCH simpler than the Lucene Highlighter implementation, but admittedly it does not handle different levels of scoring, as my application only needs a yes/no as to whether a fragment of text gets highlighted. It's also worth noting that I am using their QueryScorer to score the text fragments, which has the weakness of being term oriented rather than phrase oriented; this means the search string "grammatical or spelling" would end up with highlighting that looks something like "<b>grammatical</b> or <b>spelling</b>", as the "or" would most likely get dropped by your analyzer. Anyway, here is my source:
public TextFragments<E> getTextFragments( TokenStream tokenStream,
String text,
Scorer scorer )
throws IOException, InvalidTokenOffsetsException {
OffsetAttribute offsetAtt = (OffsetAttribute) tokenStream.addAttribute( OffsetAttribute.class );
TermAttribute termAtt = (TermAttribute) tokenStream.addAttribute( TermAttribute.class );
TokenStream newStream = scorer.init( tokenStream );
if ( newStream != null ) {
tokenStream = newStream;
}
TokenGroups tgs = new TokenGroups();
scorer.startFragment( null );
while ( tokenStream.incrementToken() ) {
tgs.add( offsetAtt.startOffset(), offsetAtt.endOffset(), scorer.getTokenScore() );
if ( log.isTraceEnabled() ) {
log.trace( new StringBuilder()
.append( scorer.getTokenScore() )
.append( " " )
.append( offsetAtt.startOffset() )
.append( "-" )
.append( offsetAtt.endOffset() )
.append( ": '" )
.append( termAtt.term() )
.append( "', '" )
.append( text.substring( offsetAtt.startOffset(), offsetAtt.endOffset() ) )
.append( "'" )
.toString() );
}
}
return tgs.fragment( text );
}
private class TokenGroup {
private int startIndex;
private int endIndex;
private float score;
public TokenGroup( int startIndex, int endIndex, float score ) {
this.startIndex = startIndex;
this.endIndex = endIndex;
this.score = score;
}
}
private class TokenGroups implements Iterable<TokenGroup> {
private List<TokenGroup> tgs;
public TokenGroups() {
tgs = new ArrayList<TokenGroup>();
}
public void add( int startIndex, int endIndex, float score ) {
add( new TokenGroup( startIndex, endIndex, score ) );
}
public void add( TokenGroup tg ) {
for ( int i = tgs.size() - 1; i >= 0; i-- ) {
if ( tg.startIndex < tgs.get( i ).endIndex ) {
tg = merge( tg, tgs.remove( i ) );
}
else {
break;
}
}
tgs.add( tg );
}
private TokenGroup merge( TokenGroup tg1, TokenGroup tg2 ) {
return new TokenGroup( Math.min( tg1.startIndex, tg2.startIndex ),
Math.max( tg1.endIndex, tg2.endIndex ),
Math.max( tg1.score, tg2.score ) );
}
private TextFragments<E> fragment( String text ) {
TextFragments<E> fragments = new TextFragments<E>();
int lastEndIndex = 0;
for ( TokenGroup tg : this ) {
if ( tg.startIndex > lastEndIndex ) {
fragments.add( text.substring( lastEndIndex, tg.startIndex ), textModeNormal );
}
fragments.add(
text.substring( tg.startIndex, tg.endIndex ),
tg.score > 0 ? textModeHighlighted : textModeNormal );
lastEndIndex = tg.endIndex;
}
if ( lastEndIndex < text.length() ) {
fragments.add( text.substring( lastEndIndex ), textModeNormal );
}
return fragments;
}
@Override
public Iterator<TokenGroup> iterator() {
return tgs.iterator();
}
}
Here's a possible cause.
Your highlighter needs to use the same Analyzer used for search. IIUC, your code uses a default analyzer for the highlighting, even though it uses a specialized analyzer for parsing the query. I believe you need to change the Fragmenter to work with your specific TokenStream.
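For what it's worth, here is a sketch of feeding the highlighter an explicit token stream built from the question's own getAnalyzer(), using the Lucene 2.9-era getBestFragments overload that takes a TokenStream (java.io.StringReader is assumed to be imported):
// Build the token stream with the SAME analyzer that parsed the query,
// and hand it to the highlighter explicitly rather than relying on a default.
TokenStream tokenStream = getAnalyzer().tokenStream( fieldName, new StringReader( text ) );
String[] fragments = highlighter.getBestFragments( tokenStream, text, 10 );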
Related
I am a computer science university student working on my first 'big' project outside of class. I'm attempting to read through large text files (2,000 - 3,000 lines of text) line by line with a BufferedReader. When a keyword from a list of enums is located, I want to send the current line from the BufferedReader to its appropriate method to be handled.
I have a solution, but I have a feeling in my gut that there is a much better way to handle this situation. Any suggestions or feedback would be greatly appreciated.
Current Solution
I am looping through the list of enums, then checking whether the current enum's toString() return value appears in the current line from the BufferedReader, using the String.contains method.
If the enum is located, the enum is used in a switch statement for the appropriate method call. (I have 13 total cases; I just wanted to keep the code sample short.)
try (BufferedReader reader = new BufferedReader(new FileReader(inputFile.getAbsoluteFile()))){
while ((currentLine = reader.readLine()) != null) {
for (GameFileKeys gameKey : GameFileKeys.values()) {
if (currentLine.contains(gameKey.toString())) {
switch (gameKey) {
case SEAT -> seatAndPlayerAssignment(currentTableArr, currentLine);
case ANTE -> playerJoinLate(currentLine);
}
}
}
}
}
Previous Solution
Originally, I had a nasty list of if statements checking whether the current line contained one of the keywords, each handled appropriately. Clearly that is far from optimal, but my gut tells me that my current solution is also less than optimal.
try (BufferedReader reader = new BufferedReader(new FileReader(inputFile.getAbsoluteFile()))){
while ((currentLine = reader.readLine()) != null) {
if (currentLine.contains(GameFileKeys.SEAT.toString())) {
seatAndPlayerAssignment(currentTableArr, currentLine);
}
else if (currentLine.contains(GameFileKeys.ANTE.toString())) {
playerJoinLate(currentLine);
}
}
}
Enum Class
In case you need it, or have any general feedback on how I'm implementing my enums.
public enum GameFileKeys {
ANTE("posts ante"),
SEAT("Seat ");
private final String gameKey;
GameFileKeys(String str) {
this.gameKey = str;
}
@Override
public String toString() {
return gameKey;
}
}
I cannot improve over the core of your code: the looping on values() of the enum, performing a String#contains for each enum object’s string, and using a switch. I can make a few minor suggestions.
I suggest you not override the toString method on your enum. The Object#toString method is generally best used only for debugging and logging, not logic or presentation.
The string you pass to the constructor of the enum is likely similar to the idea of a display name commonly seen in such enums. The formal enum name (all caps) is used internally within Java, while the display name is used for display to the user or exchanged with external systems. See the Month and DayOfWeek enums as examples offering a getDisplayName method.
Also, an enum should be named in the singular. This avoids confusion with any collections of the enum’s objects.
By the way, looks like you have a stray SPACE in your second enum's argument.
At first I thought it would help to have a list of all the display names, and a map of display name to enum object. However, in the end neither is needed for your purpose. I kept those as they might prove interesting.
public enum GameFileKey
{
ANTE( "posts ante" ),
SEAT( "Seat" );
private String displayName = null;
private static final List < String > allDisplayNames = Arrays.stream( GameFileKey.values() ).map( GameFileKey :: getDisplayName ).toList();
private static final Map < String, GameFileKey > mapOfDisplayNameToGameFileKey = Arrays.stream( GameFileKey.values() ).collect( Collectors.toUnmodifiableMap( GameFileKey :: getDisplayName , Function.identity() ) );
GameFileKey ( String str ) { this.displayName = str; }
public String getDisplayName ( ) { return this.displayName; }
public static GameFileKey forDisplayName ( final String displayName )
{
return
Objects.requireNonNull(
GameFileKey.mapOfDisplayNameToGameFileKey.get( displayName ) ,
"None of the " + GameFileKey.class.getCanonicalName() + " enum objects has a display name of: " + displayName + ". Message # 4dcefee2-4aa2-48cf-bf66-9a4bde02ac37." );
}
public static List < String > allDisplayNames ( ) { return GameFileKey.allDisplayNames; }
}
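A quick usage sketch of those extras (hypothetical input values; values() iterates in declaration order):
// Resolve an enum object from its display name.
GameFileKey key = GameFileKey.forDisplayName( "posts ante" );   // ANTE
// List every display name.
System.out.println( GameFileKey.allDisplayNames() );            // [posts ante, Seat]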
You can use a stream of the lines of your file being processed. Just FYI, not necessarily better than your code.
public class Demo
{
public static void main ( String[] args )
{
Demo app = new Demo();
app.demo();
}
private void demo ( )
{
try
{
Path path = Demo.getFilePathToRead();
Stream < String > lines = Files.lines( path );
lines.forEach(
line -> {
for ( GameFileKey gameKey : GameFileKey.values() )
{
if ( line.contains( gameKey.getDisplayName() ) )
{
switch ( gameKey )
{
case SEAT -> this.seatAndPlayerAssignment( line );
case ANTE -> this.playerJoinLate( line );
}
}
}
}
);
}
catch ( IOException e )
{
throw new RuntimeException( e );
}
}
private void playerJoinLate ( String line )
{
System.out.println( "line = " + line );
}
private void seatAndPlayerAssignment ( String line )
{
System.out.println( "line = " + line );
}
public static Path getFilePathToRead ( ) throws IOException
{
Path tempFile = Files.createTempFile( "bogus" , ".txt" );
Files.write( tempFile , "apple\nSeat\norange\nposts ante\n".getBytes() );
return tempFile;
}
}
When run:
line = Seat
line = posts ante
I have a method that returns some kind of string. I want to store the individual words in a HashMap with their number of occurrences.
public static void main(String[] args) {
String s = "{link:hagdjh, matrics:[{name:apple, value:1},{name:jeeva, value:2},{name:abc, value:0}]}";
String[] strs = s.split("matrics");
System.out.println("Substrings length:" + strs.length);
for (int i = 0; i < strs.length; i++) {
System.out.println(strs[i]);
}
}
For example, I have the string "{link:https://www.google.co.in/, matrics:[{name:apple, value:1},{name:graph, value:2},{name:abc, value:0}]}";
Now my hashmap should look like
apple = 1
graph = 2
abc = 0
How should I proceed?
I know how to use HashMaps. My problem, in this case, is that I don't know how to parse through the given string and store the words with their number of occurrences.
// Non-greedy (.*?) is needed here; a greedy (.*) would swallow everything
// up to the last "value" and yield a single bogus match.
String regex = "\\{name:(.*?), value:(\\d+)\\}";
HashMap<String, Integer> link = new HashMap<>();
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    String found = matcher.group(1);
    String number = matcher.group(2);
    link.put(found, Integer.parseInt(number));
}
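Run against the example string from the question, this should leave link holding apple=1, jeeva=2, and abc=0 (a HashMap does not guarantee any iteration order).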
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
Map<String, Integer> map = new LinkedHashMap<String, Integer>();
Pattern pattern = Pattern.compile("matrics:\\[\\{(.*?)\\]\\}");
Matcher matcher = pattern
.matcher("{link:hagdjh, matrics:[{name:apple, value:1},{name:jeeva, value:2},{name:abc, value:0}]}");
String data = "";
if (matcher.find()) {
data = matcher.group();
}
List<String> records = new ArrayList<String>();
pattern = Pattern.compile("(?<=\\{).+?(?=\\})");
matcher = pattern.matcher(data);
while (matcher.find()) {
records.add(matcher.group());
}
for (String s : records) {
String[] parts = s.split(", ");
map.put(parts[0].substring(parts[0].indexOf(":") + 1),
Integer.parseInt(parts[1].substring(parts[1].indexOf(":") + 1)));
}
map.entrySet().forEach(entry -> {
System.out.println(entry.getKey() + " = " + entry.getValue());
});
}
}
Output:
apple = 1
jeeva = 2
abc = 0
It appears that your data is in JSON format.
If it is guaranteed to be in JSON format, you can parse it using a JSON parsing library and then analyze the matrics data in a convenient way (code follows).
If the data is not guaranteed to be in JSON format, you can use a regex to help you parse it, as in Reza soumi's answer.
import java.util.HashMap;
import org.json.JSONArray;
import org.json.JSONObject;
public class MatricsJsonDemo {
    public static void main(String[] args) {
        String s = "{link:hagdjh, matrics:[{name:apple, value:1},{name:jeeva, value:2},{name:abc, value:0}]}";
        // org.json's parser is lenient enough to accept the unquoted keys/values here
        JSONObject obj = new JSONObject(s);
        JSONArray matrics = obj.getJSONArray("matrics");
        System.out.println(matrics);
        HashMap<String, Integer> matricsHashMap = new HashMap<String, Integer>();
        for (int i = 0; i < matrics.length(); i++) {
            JSONObject matric = matrics.getJSONObject(i);
            System.out.println("Adding matric: " + matric + " to hash map");
            String matricName = matric.getString("name");
            Integer matricValue = Integer.valueOf(matric.getInt("value"));
            matricsHashMap.put(matricName, matricValue);
        }
        System.out.println(matricsHashMap);
    }
}
Try this:
import static java.lang.System.err;
import static java.lang.System.out;
import static java.util.Arrays.stream;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toMap;
/**
* Counting the words in a String.
*/
public class CountWordsInString
{
/*-----------*\
====** Constants **========================================================
\*-----------*/
/**
* The input string whose words are counted.
*/
public static final String INPUT = "{link:https://www.google.co.in/, matrics:[{name:apple, value:1},{name:graph, value:2},{name:abc, value:0}]}";
/*---------*\
====** Methods **==========================================================
\*---------*/
/**
* The program entry point.
*
* @param args The command line arguments.
*/
public static void main( final String... args )
{
try
{
final var result = stream( INPUT.split( "\\W+" ) )
.filter( s -> !s.isBlank() )
.filter( s -> !s.matches( "\\d*" ) )
.collect( groupingBy( s -> s ) )
.entrySet()
.stream()
.collect( toMap( k -> k.getKey(), v -> Long.valueOf( v.getValue().size() ) ) );
out.println( result.getClass() );
for( final var entry : result.entrySet() )
{
out.printf( "'%s' occurred %d times%n", entry.getKey(), entry.getValue() );
}
}
catch( final Throwable t )
{
//---* Handle any previously unhandled exceptions *----------------
t.printStackTrace( err );
}
} // main()
}
// class CountWordsInString
Confessed, not the most obvious solution, but I wanted to have some fun with it, too.
The INPUT.split( "\\W+" ) gives you the words in the string, but also numbers and an 'empty' word at the beginning.
The 'empty' word is eliminated with the first filter() statement, the numbers go with the second.
The first collect( groupingBy() ) gives you a HashMap<String,List<String>>, so I had to convert that to a HashMap<String,Long> in the following steps (basically with the final collect( toMap() )).
Maybe there is a more efficient solution, or one that is more elegant, or even one that is both more efficient and more elegant … but it works as expected, and I had some fun with it.
The output is:
class java.util.HashMap
'apple' occurred 1 times
'matrics' occurred 1 times
'abc' occurred 1 times
'in' occurred 1 times
'www' occurred 1 times
'name' occurred 3 times
'link' occurred 1 times
'google' occurred 1 times
'https' occurred 1 times
'co' occurred 1 times
'value' occurred 3 times
'graph' occurred 1 times
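For comparison, here is a more compact variant of the same counting logic (a sketch in the same static-import style; groupingBy with counting() skips the intermediate HashMap<String,List<String>> entirely):
import static java.util.Arrays.stream;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import java.util.Map;
public class CountWordsCompact
{
    public static void main( final String... args )
    {
        final String input = "{link:https://www.google.co.in/, matrics:[{name:apple, value:1},{name:graph, value:2},{name:abc, value:0}]}";
        // counting() folds each group directly into its size, giving a Map<String,Long> in one pass.
        final Map<String, Long> result = stream( input.split( "\\W+" ) )
            .filter( s -> !s.isBlank() )
            .filter( s -> !s.matches( "\\d*" ) )
            .collect( groupingBy( s -> s, counting() ) );
        result.forEach( ( word, count ) -> System.out.printf( "'%s' occurred %d times%n", word, count ) );
    }
}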
I have tried numerous ways and followed some of the examples that are scattered around the web on how to write a jagged array (an array of arrays that may be of differing lengths) in HDF5.
Most of the examples are in C and rather low-level. Anyhow, I can't seem to get it working. I just looked at the C source code, and it pretty much says that variable-length datatypes other than strings are not supported (if I understood it correctly).
My miserable dysfunctional code (as is):
public void WIP_createVLenFloatDataSet( List<? extends Number> floats ) throws Exception
{
String group = "/test";
long groupId = createGroupIfNotExist( group );
MDataQualifier qualifier = new MDataQualifierImpl( group, "float", "0.0.0" );
long datasetId = openDataSet( qualifier );
long heapType = H5.H5Tcopy( MDataType.FLOAT_ARRAY.getHDFType() );
heapType = H5.H5Tvlen_create( heapType );
// heapType = H5.H5Tarray_create( heapType, 1, new long[]{1} );
if( !exists( datasetId ) )
{
long[] maxDims = new long[]{ HDF5Constants.H5S_UNLIMITED };
long dataspaceId = H5.H5Screate_simple( 1, new long[]{ 1 }, null );
// Create the dataset.
long datasetId1 = -1;
try
{
if( exists( m_fileId ) && exists( dataspaceId ) && exists( heapType ) )
{
long creationProperties = H5.H5Pcreate( HDF5Constants.H5P_DATASET_CREATE );
H5.H5Pset_chunk( creationProperties, /*ndims*/1, new long[]{ 1 } );
datasetId1 = H5.H5Dcreate( groupId, qualifier.getVersionedName(), heapType, dataspaceId, H5P_DEFAULT, creationProperties, H5P_DEFAULT );
// H5.H5Pclose( creationProperties );
}
}
catch( Exception e )
{
LOG.error( "Problems creating the dataset: " + e.getMessage(), e );
}
datasetId = datasetId1;
if( exists( datasetId ) )
{
// flushIfNecessary();
LOG.trace( "Wrote empty dataset {}", qualifier.getVersionedName() );
}
}
List<? extends Number> data = ( List<? extends Number> )floats;
// H5.H5Dwrite( datasetId, heapType, dataspaceId, memSpaceId, HDF5Constants.H5P_DEFAULT, Floats.toArray( data) );
ByteBuffer bb = ByteBuffer.allocate( data.size() * 4 );
floats.forEach( f -> bb.putFloat( f.floatValue() ) );
// H5.H5Dwrite( datasetId, heapType, H5S_ALL, H5S_ALL, H5P_DEFAULT, Floats.toArray( data ) );
H5.H5Dwrite( datasetId, heapType, H5S_ALL, H5S_ALL, H5P_DEFAULT, bb.array() );
}
Has anyone done this before and can at least confirm that it's not possible?
The most I can get out of HDF5 is the message "buf does not support variable length type".
Apparently the "glue code" of the JNI wrapper doesn't support this. If you want to use this feature you either have to implement your own JNI or wait for a newer version. The official JNI code is open source and can be found here.
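If variable-length types really are off the table in the JNI wrapper, one workaround that avoids VLEN entirely is to flatten the jagged array into a single contiguous dataset plus a second dataset of row lengths. A minimal sketch of just the flattening step (plain Java, independent of the HDF5 calls; flatten and JaggedFlattener are hypothetical names):
import java.util.List;
public final class JaggedFlattener
{
    /**
     * Flattens jagged rows into one contiguous array; lengths[i] receives the
     * length of row i, so the rows can be reconstructed after reading back.
     */
    static float[] flatten( List<float[]> rows, long[] lengths )
    {
        int total = 0;
        for ( int i = 0; i < rows.size(); i++ )
        {
            lengths[i] = rows.get( i ).length;
            total += rows.get( i ).length;
        }
        float[] flat = new float[total];
        int pos = 0;
        for ( float[] row : rows )
        {
            System.arraycopy( row, 0, flat, pos, row.length );
            pos += row.length;
        }
        return flat;
    }
}
Both the flat array and the lengths array can then be written as ordinary fixed-type datasets with the same kind of H5.H5Dwrite calls already shown in the question.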
I have a plsql procedure
PROCEDURE merge_time_bounds(s1_bd_t IN bd_tb_struct, s2_bd_t IN bd_tb_struct, r_bd_t OUT bd_tb_struct);
And I try to call it inside my Java code. I have already done this with other procedures where all parameters were of type VARCHAR, but here all parameters are of type bd_tb_struct:
create or replace TYPE bd_tb_struct FORCE
AS
OBJECT
(
start_ts TIMESTAMP (3) ,
end_ts TIMESTAMP (3) ,
time_type NUMBER (19) ,
duration NUMBER (12) ) FINAL ;
I also have a Java class, BoundsSqlType.java, with this comment: "Class of the corresponding type in the database. (bd_tb_struct)"
Can someone explain to me how I can call my procedure?
Oracle Setup:
CREATE OR REPLACE TYPE BD_TB_STRUCT AS OBJECT(
start_ts TIMESTAMP(3),
end_ts TIMESTAMP(3),
time_type NUMBER(19),
duration NUMBER(12)
) FINAL;
/
CREATE OR REPLACE PROCEDURE merge_time_bounds(
s1_bd_t IN bd_tb_struct,
s2_bd_t IN bd_tb_struct,
r_bd_t OUT bd_tb_struct
)
IS
p_start TIMESTAMP(3) := LEAST( s1_bd_t.start_ts, s2_bd_t.start_ts );
p_end TIMESTAMP(3) := GREATEST( s1_bd_t.end_ts, s2_bd_t.end_ts );
BEGIN
r_bd_t := new BD_TB_STRUCT(
p_start,
p_end,
COALESCE( s1_bd_t.time_type, s2_bd_t.time_type ),
( CAST( p_end AS DATE ) - CAST( p_start AS DATE ) ) * 24 * 60 * 60
);
END;
/
Java SQLData Class:
import java.math.BigDecimal;
import java.math.BigInteger;
import java.sql.SQLData;
import java.sql.SQLException;
import java.sql.SQLInput;
import java.sql.SQLOutput;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
public class BoundsSQL implements SQLData
{
public static final String SQL_TYPE = "BD_TB_STRUCT";
public java.sql.Timestamp start;
public java.sql.Timestamp end;
public BigInteger type;
public BigInteger duration;
public BoundsSQL()
{
}
public BoundsSQL(
final int year,
final int month,
final int dayOfMonth,
final int hour,
final int minute,
final int seconds,
final long duration,
final long type )
{
final long epochSeconds = LocalDateTime.of(
year,
month,
dayOfMonth,
hour,
minute,
seconds
).toEpochSecond( ZoneOffset.UTC );
this.start = new Timestamp( epochSeconds * 1000 );
this.end = new Timestamp( (epochSeconds + duration) * 1000 );
this.duration = BigInteger.valueOf( duration );
this.type = BigInteger.valueOf( type );
}
@Override
public String getSQLTypeName() throws SQLException
{
return SQL_TYPE;
}
@Override
public void readSQL( SQLInput stream,
String typeName ) throws SQLException
{
start = stream.readTimestamp();
end = stream.readTimestamp();
type = stream.readBigDecimal().toBigInteger();
duration = stream.readBigDecimal().toBigInteger();
}
@Override
public void writeSQL( SQLOutput stream ) throws SQLException
{
stream.writeTimestamp( start );
stream.writeTimestamp( end );
stream.writeBigDecimal( new BigDecimal( type ) );
stream.writeBigDecimal( new BigDecimal( duration ) );
}
@Override
public String toString()
{
return String.format(
"Start: %s\nEnd: %s\nDuration: %s\nType: %s",
start,
end,
duration,
type
);
}
}
Call Stored Procedure from Java:
Call the stored procedure using OracleCallableStatement#setObject( int, Object ) to pass the input parameters, put the class into the connection's type map, and use OracleCallableStatement#registerOutParameter( int, int, String ) and OracleCallableStatement#getObject( int ) to retrieve the output parameter.
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Map;
import oracle.jdbc.OracleCallableStatement;
import oracle.jdbc.OracleConnection;
import oracle.jdbc.OracleTypes;
public class PassStructToProcedure
{
public static void main( final String[] args ){
OracleConnection con = null;
try{
Class.forName( "oracle.jdbc.OracleDriver" );
con = (OracleConnection) DriverManager.getConnection(
"jdbc:oracle:thin:#localhost:1521:orcl",
"USERNAME",
"PASSWORD"
);
BoundsSQL bound1 = new BoundsSQL( 2019, 1, 1, 0, 0, 0, 10, 1 );
BoundsSQL bound2 = new BoundsSQL( 2019, 1, 1, 0, 0, 5, 10, 2 );
OracleCallableStatement st = (OracleCallableStatement) con.prepareCall(
"{ call MERGE_TIME_BOUNDS( ?, ?, ? ) }"
);
st.setObject( 1, bound1 );
st.setObject( 2, bound2 );
st.registerOutParameter( 3, OracleTypes.STRUCT, BoundsSQL.SQL_TYPE );
st.execute();
Map<String,Class<?>> typeMap = con.getTypeMap();
typeMap.put( BoundsSQL.SQL_TYPE, BoundsSQL.class );
BoundsSQL out = (BoundsSQL) st.getObject( 3 );
System.out.println( out.toString() );
st.close();
} catch (ClassNotFoundException | SQLException ex) {
System.out.println( ex.getMessage() );
ex.printStackTrace();
} finally {
try{
if ( con != null )
con.close();
}
catch( SQLException e )
{
}
}
}
}
Output:
Start: 2019-01-01 00:00:00.0
End: 2019-01-01 00:00:15.0
Duration: 15
Type: 1
Use oracle.jdbc.OracleStruct to map your custom type. Check Oracle's docs at https://docs.oracle.com/database/121/JJDBC/oraoot.htm#JJDBC28431
PreparedStatement ps= conn.prepareStatement("text_of_prepared_statement");
Struct mySTRUCT = conn.createStruct (...);
((OraclePreparedStatement)ps).setOracleObject(1, mySTRUCT);
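For the BD_TB_STRUCT type from the accepted answer, the struct creation could look roughly like this (a sketch; the attribute order must match the SQL type's declaration order, and the values here are made up):
// Attributes in declaration order: start_ts, end_ts, time_type, duration.
Struct myStruct = conn.createStruct( "BD_TB_STRUCT", new Object[] {
        java.sql.Timestamp.valueOf( "2019-01-01 00:00:00" ),  // start_ts
        java.sql.Timestamp.valueOf( "2019-01-01 00:00:10" ),  // end_ts
        1,                                                    // time_type
        10                                                    // duration
} );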
If you're open to using a third party library, jOOQ generates code for all of your PL/SQL packages, UDTs, procedures, functions, etc. to spare you the hassle of doing the binding manually.
So, in your case, you could call your procedure like this, using the generated stubs:
BdTbStructRecord result = Routines.mergeTimeBounds(
configuration, // This contains your JDBC connection and other things
new BdTbStructRecord(start1, end1, time1, duration1),
new BdTbStructRecord(start2, end2, time2, duration2)
);
This does something similar to what the accepted answer does with JDBC directly, but:
You don't have to work with strings, everything type checks (and IDE auto completes)
You don't have to remember data types and parameter order
When you change the procedure in your database, and regenerate the code, then your client code stops compiling, so you'll notice the problem early on
Disclaimer: I work for the company behind jOOQ.
I am back with a bug/problem that has just come to light. Usually I test local development and changes on an H2 DB, but as I know, this has to work on Oracle and MSSQL too.
Now, testing on Oracle again, this problem occurred:
The keys COR_VIEWSETTINGSCOR_USERSETTINGS_FK0 and COR_VIEWSETTINGSCOR_USERSETTINGS_FK1 are generated automatically and are way too long for an Oracle DB.
To show how these keys are created, here are the entities UserSettings and UserViewSettings.
Hint: you can skip over the entities and go straight to the edits if they confuse you; maybe you can still help me anyway.
UserSettings
/**
The Class UserSettings.
*/
@org.hibernate.envers.Audited
@DataObject( value = UserSettings.DATA_OBJECT_NAME )
@CRUDDefinition( supportsRead = true, supportsCreate = true, supportsUpdate = true, supportsDelete = true )
@Entity( name = UserSettings.DATA_OBJECT_NAME )
@NamedQuery( name = UserSettings.DATA_OBJECT_NAME, query = "from userSettings e where e.name = :name" )
@javax.persistence.Inheritance( strategy = javax.persistence.InheritanceType.TABLE_PER_CLASS )
@AttributeOverrides( { @AttributeOverride( name = "id", column = @Column( name = "USERSETTINGS_ID" ) )
} )
@Table( name = "COR_USERSETTINGS", indexes = {
@javax.persistence.Index( name="COR_USERSETTINGS_FK0", columnList = "SETTINGSTYPE_ID" ),
@javax.persistence.Index( name="COR_USERSETTINGS_FK1", columnList = "USER_ID" ),
}
)
public class UserSettings extends NamedRevisionEntity implements NameSettingsType, NameSettings
{
/** The Constant serialVersionUID. */
private static final long serialVersionUID = 1L;
/** The Constant DATA_OBJECT_NAME. */
public static final String DATA_OBJECT_NAME = "userSettings";
@javax.persistence.Basic( fetch = javax.persistence.FetchType.EAGER, optional = false )
@Column( name = "SETTINGS", nullable = false, unique = false, insertable = true, updatable = true )
@javax.persistence.Lob
private java.lang.String settings;
@javax.persistence.ManyToOne( fetch = javax.persistence.FetchType.EAGER, optional = false )
@javax.persistence.JoinColumn( name = "SETTINGSTYPE_ID", nullable = false, unique = false, insertable = true, updatable = true )
private SettingsType settingsType;
@javax.persistence.ManyToOne( fetch = javax.persistence.FetchType.EAGER, optional = true )
@javax.persistence.JoinColumn( name = "USER_ID", nullable = true, unique = false, insertable = true, updatable = true )
private User user;
public SettingsType getSettingsType()
{
return settingsType;
}
public void setSettingsType( SettingsType settingsType )
{
this.settingsType = settingsType;
}
public User getUser()
{
return user;
}
public void setUser( User user )
{
this.user = user;
}
public java.lang.String getSettings()
{
return settings;
}
public void setSettings( java.lang.String settings )
{
this.settings = settings;
}
@Override
public String getDataObjectName()
{
return DATA_OBJECT_NAME;
}
@Override
public String toString()
{
StringBuilder builder = new StringBuilder( super.toString() );
builder.append( ", " );
try
{
builder.append( ToStringUtils.referenceToString( "settingsType", "SettingsType", this.settingsType ) );
}
catch( Exception ex )
{
builder.append( ex.getClass().getName() );
builder.append( ": " );
builder.append( ex.getMessage() );
}
builder.append( ", " );
try
{
builder.append( ToStringUtils.referenceToString( "user", "User", this.user ) );
}
catch( Exception ex )
{
builder.append( ex.getClass().getName() );
builder.append( ": " );
builder.append( ex.getMessage() );
}
builder.append( "]" );
return builder.toString();
}
}
UserViewSettings
/**
The Class UserViewSettings.
*/
@org.hibernate.envers.Audited
@DataObject( value = UserViewSettings.DATA_OBJECT_NAME )
@CRUDDefinition( supportsRead = true, supportsCreate = true, supportsUpdate = true, supportsDelete = true )
@Entity( name = UserViewSettings.DATA_OBJECT_NAME )
@AttributeOverrides( { @AttributeOverride( name = "id", column = @Column( name = "VIEWSETTINGS_ID" ) )
} )
@Table( name = "COR_VIEWSETTINGS", uniqueConstraints = {
@javax.persistence.UniqueConstraint( name="COR_VIEWSETTINGS_UNQ1", columnNames = { "NAME", "SETTINGSTYPE_ID", "VIEW_NAME", "VIEWTYPE_ID" } ),
}
, indexes = {
@javax.persistence.Index( name="COR_VIEWSETTINGS_FK0", columnList = "VIEWTYPE_ID" ),
}
)
public class UserViewSettings extends UserSettings implements NameViewName, NameViewType
{
/** The Constant serialVersionUID. */
private static final long serialVersionUID = 1L;
/** The Constant DATA_OBJECT_NAME. */
public static final String DATA_OBJECT_NAME = "userViewSettings";
@javax.persistence.Basic( fetch = javax.persistence.FetchType.EAGER, optional = false )
@Column( name = "VIEW_NAME", nullable = false, unique = false, insertable = true, updatable = true )
private java.lang.String viewName;
@javax.persistence.ManyToOne( fetch = javax.persistence.FetchType.EAGER, optional = true )
@javax.persistence.JoinColumn( name = "VIEWTYPE_ID", nullable = true, unique = false, insertable = true, updatable = true )
private ViewType viewType;
public java.lang.String getViewName()
{
return viewName;
}
public void setViewName( java.lang.String viewName )
{
this.viewName = viewName;
}
public ViewType getViewType()
{
return viewType;
}
public void setViewType( ViewType viewType )
{
this.viewType = viewType;
}
@Override
public String getDataObjectName()
{
return DATA_OBJECT_NAME;
}
@Override
public String toString()
{
StringBuilder builder = new StringBuilder( super.toString() );
builder.append( ", " );
builder.append( "viewName" );
builder.append( "=" );
builder.append( this.viewName );
builder.append( ", " );
try
{
builder.append( ToStringUtils.referenceToString( "viewType", "ViewType", this.viewType ) );
}
catch( Exception ex )
{
builder.append( ex.getClass().getName() );
builder.append( ": " );
builder.append( ex.getMessage() );
}
builder.append( "]" );
return builder.toString();
}
}
Starting WildFly 10.0.0 with Hibernate 5.2 and an Oracle 11 database then results in the error that the automatically generated keys COR_VIEWSETTINGSCOR_USERSETTINGS_FK0 and COR_VIEWSETTINGSCOR_USERSETTINGS_FK1 are, naturally, too long for the database.
I took a look at the NamingStrategies for Hibernate and even tried some, but they didn't change the error for me.
How can I influence the generation of these keys?
EDIT:
So turning on DEBUG gave me this:
2016-11-29 09:22:03,190 DEBUG [org.hibernate.SQL] (ServerService Thread Pool -- 58) create index COR_USERSETTINGS_FK0 on COR_USERSETTINGS (SETTINGSTYPE_ID)
2016-11-29 09:22:03,190 DEBUG [org.hibernate.SQL] (ServerService Thread Pool -- 58) create index COR_USERSETTINGS_FK1 on COR_USERSETTINGS (USER_ID)
2016-11-29 09:22:03,190 DEBUG [org.hibernate.SQL] (ServerService Thread Pool -- 58) create index COR_VIEWSETTINGSCOR_USERSETTINGS_FK0 on COR_VIEWSETTINGS (SETTINGSTYPE_ID)
2016-11-29 09:22:03,190 DEBUG [org.hibernate.SQL] (ServerService Thread Pool -- 58) create index COR_VIEWSETTINGSCOR_USERSETTINGS_FK1 on COR_VIEWSETTINGS (USER_ID)
2016-11-29 09:22:03,190 DEBUG [org.hibernate.SQL] (ServerService Thread Pool -- 58) create index COR_VIEWSETTINGS_FK0 on COR_VIEWSETTINGS (VIEWTYPE_ID)
Now I found the class ImplicitIndexNameSource in the package org.hibernate.boot.model.naming, but the internet doesn't really give examples of what I can do with it, and it seems to have been empty for a long time.
EDIT 2:
The previous edit seems to be a wrong path. I found the place where these logs are created: it's StandardIndexExporter, which gets called from SchemaCreatorImpl. So I need to dig even deeper into the framework, but if somebody sees this: is this the right path? Can I modify the code so that it will do what I want? It seems to be hbm2ddl that is the culprit, since the index gets created in StandardIndexExporter in these lines:
final String indexNameForCreation;
if ( dialect.qualifyIndexName() ) {
indexNameForCreation = jdbcEnvironment.getQualifiedObjectNameFormatter().format(
new QualifiedNameImpl(
index.getTable().getQualifiedTableName().getCatalogName(),
index.getTable().getQualifiedTableName().getSchemaName(),
jdbcEnvironment.getIdentifierHelper().toIdentifier( index.getName() )
),
jdbcEnvironment.getDialect()
);
}
else {
indexNameForCreation = index.getName();
}
final StringBuilder buf = new StringBuilder()
.append( "create index " )
.append( indexNameForCreation )
.append( " on " )
.append( tableName )
.append( " (" );
boolean first = true;
Iterator<Column> columnItr = index.getColumnIterator();
while ( columnItr.hasNext() ) {
final Column column = columnItr.next();
if ( first ) {
first = false;
}
else {
buf.append( ", " );
}
buf.append( ( column.getQuotedName( dialect ) ) );
}
buf.append( ")" );
return new String[] { buf.toString() };
I would appreciate help a lot. This is getting really frustrating.
So I got it working.
Answering for future people who might find this and have the same issue.
The index key gets created by the Oracle dialect that Hibernate references. So what had to be done was implementing a custom OracleDialect that overrides the method getIndexExporter and points to a custom IndexExporter (a wiring sketch follows at the end of this answer). In that IndexExporter you can then modify the way the keys are created. In my case I fixed it like this:
/**
* Gets the correct index name if it is an index for a TABLE_PER_CLASS inheritance and longer than
* 30 chars.
*
* @param index the index to decide for
* @return the correct index name
*/
private String getCorrectIndexName( Index index )
{
if ( index.getTable() instanceof DenormalizedTable && index.getName().length() > 30 )
{
String prefixedTable = index.getTable().getName();
String tableName = prefixedTable.substring( prefixedTable.indexOf( '_' ) + 1, prefixedTable.length() );
tableName = shortenName( tableName );
Iterator<Column> columnItr = index.getColumnIterator();
String reference;
if ( columnItr.hasNext() )
{
reference = extractReference( columnItr.next() );
}
else
{
/** backup strategy to prevent exceptions */
reference = shortenName( NamingHelper.INSTANCE.hashedName( index.getName() ) );
}
return tableName + "_" + reference;
}
return index.getName();
}
/**
* Extracts the reference column of the index and hashes the full name before shortening it with
* shortenName().
*
* @param column the column to extract the reference from.
* @return the reference with an appended _FK(hashedReference).
*/
private String extractReference( Column column )
{
String reference = column.getQuotedName( dialect );
String md5Hash = NamingHelper.INSTANCE.hashedName( reference );
md5Hash = md5Hash.substring( md5Hash.length() - 4, md5Hash.length() );
reference = shortenName( reference );
return reference + "_FK" + md5Hash;
}
/**
* Shortens the name to a maximum of 11 chars if it's longer.
*
* @param reference the reference to shorten
* @return the shortened string
*/
private static String shortenName( String reference )
{
if ( reference.length() > 11 )
{
return reference.substring( 0, 11 );
}
return reference;
}
This had to be called in the overridden function getSqlCreateStrings; the changed lines look like this:
String indexName = getCorrectIndexName( index );
indexNameForCreation = jdbcEnvironment.getQualifiedObjectNameFormatter()
.format(
new QualifiedNameImpl( index.getTable().getQualifiedTableName().getCatalogName(),
index.getTable().getQualifiedTableName().getSchemaName(), jdbcEnvironment.getIdentifierHelper().toIdentifier( indexName ) ),
jdbcEnvironment.getDialect() );
I hope that helps someone.
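For completeness, the wiring between the custom dialect and the custom index exporter can look roughly like this (a sketch; CustomIndexExporter stands for the StandardIndexExporter subclass holding the methods above, and your Oracle base dialect class may differ):
import org.hibernate.dialect.Oracle10gDialect;
import org.hibernate.mapping.Index;
import org.hibernate.tool.schema.spi.Exporter;
public class CustomOracleDialect extends Oracle10gDialect
{
    /** Hands Hibernate's schema tooling the exporter that shortens index names. */
    private final Exporter<Index> indexExporter = new CustomIndexExporter( this );
    @Override
    public Exporter<Index> getIndexExporter()
    {
        return indexExporter;
    }
}
The dialect is then selected as usual, e.g. via the hibernate.dialect configuration property.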