CLOB and CriteriaQuery - java

I have an entity that has a CLOB attribute:
public class EntityS {
...
@Lob
private String description;
}
To retrieve certain EntityS from the DB we use a CriteriaQuery where we need the results to be unique, so we do:
query.where(builder.and(predicates.toArray(new Predicate[predicates.size()]))).distinct(true).orderBy(builder.asc(root.<Long> get(EntityS_.id)));
If we do that we get the following error:
ORA-00932: inconsistent datatypes: expected - got CLOB
I know that's because you cannot use distinct when selecting a CLOB. But we need the CLOB. Is there a workaround for this using CriteriaQuery with Predicates and so on?
We are using an ugly workaround: getting rid of the .distinct(true) and then filtering the results in code, but that's crap. We are using it only to be able to keep on developing the app, but we need a better solution and I can't seem to find one...

In case you are using Hibernate as persistence provider, you can specify the following query hint:
query.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false);
This way, "distinct" is not passed through to the SQL command, but Hibernate will take care of returning only distinct values.
See here for more information: https://thoughts-on-java.org/hibernate-tips-apply-distinct-to-jpql-but-not-sql-query/
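As a rough illustration of what de-duplicating on the application side amounts to, here is a minimal sketch in plain Java. The Row class and the distinctById helper are hypothetical stand-ins (they are not part of JPA or Hibernate). The sketch only assumes that each result exposes its id, and it keeps the first occurrence of each id so the ordering produced by the query is preserved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for EntityS: only the id and the CLOB-backed text matter here.
final class Row {
    final long id;
    final String description;

    Row(long id, String description) {
        this.id = id;
        this.description = description;
    }
}

final class DistinctById {
    // Keeps the first row seen for each id; LinkedHashMap preserves encounter order.
    static List<Row> distinctById(List<Row> rows) {
        Map<Long, Row> byId = new LinkedHashMap<>();
        for (Row row : rows) {
            byId.putIfAbsent(row.id, row);
        }
        return new ArrayList<>(byId.values());
    }
}
```

Because the comparison uses only the id, the CLOB content is never compared, which is the step the database rejects.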

Thinking outside the box - I have no idea if this will work, but perhaps it is worth a shot. (I tested it and it seems to work, but I created a table with just one column, CLOB data type, and two rows, both with the value to_clob('abcd') - of course it should work on that setup.)
To de-duplicate, compute a hash of each clob, and instruct Oracle to compute a row number partitioned by the hash value and ordered by nothing (null). Then select just the rows where the row number is 1. Something like below (t is the table I created, with one CLOB column called c).
I expect that execution time should be reasonably good. The biggest concern, of course, is collisions. How important is it that you not miss ANY of the CLOBs, and how many rows do you have in the base table in the first place? Is something like "one chance in a billion" of having a collision acceptable?
select c
from (
select c, row_number() over (partition by dbms_crypto.hash(c, 3) order by null) as rn
from t
)
where rn = 1;
Note - the user (your application, in your case) must have EXECUTE privilege on SYS.DBMS_CRYPTO. A DBA can grant it if needed.
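The same idea can be sketched on the Java side, in case the hashing cannot be done in the database. This is only an illustration under assumptions: the HashDedup class and its methods are hypothetical, SHA-1 is used on the assumption that the 3 passed to dbms_crypto.hash denotes SHA-1, and the CLOB contents are assumed to be available as Java strings. Unlike the SQL above, which picks an arbitrary row per hash, this sketch keeps the first value seen.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HashDedup {
    // Returns the SHA-1 digest of the text, Base64-encoded so it can be used as a map key.
    static String sha1(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Keeps one value per hash, analogous to "rn = 1" within each hash partition.
    static List<String> dedupByHash(List<String> values) {
        Map<String, String> firstPerHash = new LinkedHashMap<>();
        for (String value : values) {
            firstPerHash.putIfAbsent(sha1(value), value);
        }
        return new ArrayList<>(firstPerHash.values());
    }
}
```

The collision concern applies here as well: two different texts with the same hash would be treated as duplicates.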


How to sum values from a database column in AnyLogic?

As a newbie, I want to sum the values of a column pv from a database table evm in my model and store the result in a variable. I have tried the SQL code SELECT SUM(pv) FROM evm; but that doesn't seem to work. I would be grateful for any help on how to do this.
You can always write a native query and use the result set to populate the fields of a POJO. Once you have the list of POJOs/DTOs built from the result set, compute the sum by iterating over the list.
You can just use the SQL you have suggested. (The database in an AnyLogic model is a standard HSQLDB database which supports this SQL syntax.)
The simplest way to execute it is to use AnyLogic's in-built functions for such queries (as would be produced by the Insert Database Query wizard), so
mySumVariable = selectFirstValue("SELECT SUM(pv) FROM evm;");
You didn't say what errors you had; obviously the table and column have to exist (and the column you're summing needs to be numeric, though NULLs are OK), as does the variable you're assigning the sum to.
If you wanted to do this in a way which more easily fits one of the standard query 'forms' suggested by the wizard (i.e., not having to know particular SQL syntax) you could just adapt the "Iterate over returned rows and do something" code to 'explicitly' sum the columns; e.g., (using the Query DSL format this time):
List<Tuple> rows = selectFrom(evm).list();
for (Tuple row : rows) {
mySumVariable += row.get(evm.pv);
}
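One difference between the two approaches is worth a sketch: SQL SUM skips NULLs, whereas adding a null boxed value to a primitive in a Java loop throws a NullPointerException. The following plain-Java helper is hypothetical (NullSafeSum is not an AnyLogic function) and assumes the column values arrive as boxed Double objects.

```java
import java.util.List;

final class NullSafeSum {
    // Sums the values, skipping nulls the way SQL SUM does.
    // Unlike SQL SUM, which yields NULL when there are no non-null values, this returns 0.0.
    static double sumIgnoringNulls(List<Double> values) {
        double sum = 0.0;
        for (Double value : values) {
            if (value != null) {
                sum += value;
            }
        }
        return sum;
    }
}
```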

Decode in SQL vs. If... Else in Java

I'm looking for a solution to a simple scenario. I need to check if a value is present in a table, and if present I need Y else N
I can do it in two ways, either fetch the count of rows from the database, and code the logic in java, or use DECODE(COUNT(*),0,'N','Y')
Which is better? Is there any advantage of one over the other? Or more specifically, is there any disadvantage of using DECODE() instead of doing it in Java?
The database I have is DB2.
You should use exists. I would tend to do this as:
select (case when exists (select 1 from . . . .)
then 'Y' else 'N'
end) as flag
from sysibm.sysdummy1;
The reason you want to use exists is because it is faster. When you use count(*), the SQL engine has to process all the (appropriate) data to get the count. With exists, it can stop at the first one.
The reason to prefer case over decode() is that the former is ANSI standard SQL, available in basically all databases.
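The difference between the two strategies can be illustrated with an in-memory analogy in Java. This is only a sketch: the ExistsVsCount class and its inspected counter are hypothetical, and a real database optimizer may behave differently. It shows that an existence check can stop at the first match, while a count has to look at every row.

```java
import java.util.List;
import java.util.function.Predicate;

final class ExistsVsCount {
    // Number of rows inspected by the most recent call; kept only for illustration.
    static int inspected;

    // Stops as soon as one row satisfies the condition, like EXISTS.
    static <T> boolean existsAny(List<T> rows, Predicate<T> condition) {
        inspected = 0;
        for (T row : rows) {
            inspected++;
            if (condition.test(row)) {
                return true;
            }
        }
        return false;
    }

    // Has to visit every row to produce the total, like COUNT(*).
    static <T> int countAll(List<T> rows, Predicate<T> condition) {
        inspected = 0;
        int count = 0;
        for (T row : rows) {
            inspected++;
            if (condition.test(row)) {
                count++;
            }
        }
        return count;
    }
}
```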
There shouldn't be any considerable difference between the 2 ways that you mentioned.
1) The DECODE will be simple and the IF will be simple.
2) You will be receiving an Int32 versus a CHAR(1) - which is not a significant difference.
So, I would consider another aspect: Which of those 2 will make your code more CLEAR?
And one more thing: if this is the ONLY thing that you're selecting on that query, you could try something like:
SELECT 'Y' FROM DUAL WHERE EXISTS (SELECT 1 FROM YOURTABLE WHERE YOURCONDITION = 1); --Oracle SQL - but should be fairly easy to translate it to DB2
This is an option to not make the DB count for every occurrence of your condition just to check if it exists.
Aggregate functions like count can be optimized with MQTs (Materialized Query Tables):
https://www.ibm.com/developerworks/data/library/techarticle/dm-0509melnyk/
connect to sample
alter table employee add unique (empno)
alter table department add unique (deptno)
create table count_emp_dpto_1 as (select d.deptno, e.empno, count(*) from employee e, department d where d.deptno = 1 and e.workdept = d.deptno) data initially deferred refresh immediate
set integrity for count_emp_dpto_1 immediate checked not incremental
select * from count_emp_dpto_1
connect reset

Better to query once, then organize objects based on returned column value, or query twice with different conditions?

I have a table which I need to query, then organize the returned objects into two different lists based on a column value. I can either query the table once, retrieving the column by which I would differentiate the objects and arrange them by looping through the result set, or I can query twice with two different conditions and avoid the sorting process. Which method is generally better practice?
MY_TABLE
NAME AGE TYPE
John 25 A
Sarah 30 B
Rick 22 A
Susan 43 B
Either SELECT * FROM MY_TABLE, then sort in code based on returned types, or
SELECT NAME, AGE FROM MY_TABLE WHERE TYPE = 'A' followed by
SELECT NAME, AGE FROM MY_TABLE WHERE TYPE = 'B'
Logically, a DB query from Java code will be more expensive than a loop within the code, because querying the DB involves several steps such as connecting to the DB, creating the SQL query, firing the query and getting the results back.
Besides, something can go wrong between firing the first and second query.
With an optimized single query and a loop in the code, you can save a lot of time compared to firing two queries.
In your case, you can sort in the query itself if it helps:
SELECT * FROM MY_TABLE ORDER BY TYPE
In future if there are more types added to your table, you need not fire an additional query to retrieve it.
It is heavily dependent on the context. If each list is really huge, I would let the database do the hard part of the job with 2 queries. Conversely, in a web application using a farm of application servers and a central database, I would use one single query.
For the general use case, IMHO, I would save database resources, because the database is a common point of congestion, and use only one query.
The only objective argument I can find is that the splitting of the list occurs in memory with a hyper simple algorithm and in a single JVM, where each query requires a bit of initialization and may involve disk access or loading of index pages.
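The in-memory split mentioned above could look roughly like the following sketch. The Person class and the splitByType helper are hypothetical names chosen for illustration; the sketch assumes the rows of MY_TABLE have already been mapped to objects carrying the TYPE value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row object for MY_TABLE (NAME, AGE, TYPE).
final class Person {
    final String name;
    final int age;
    final String type;

    Person(String name, int age, String type) {
        this.name = name;
        this.age = age;
        this.type = type;
    }
}

final class SplitByType {
    // Single pass over the rows: each row is appended to the list for its type.
    static Map<String, List<Person>> splitByType(List<Person> rows) {
        Map<String, List<Person>> byType = new LinkedHashMap<>();
        for (Person person : rows) {
            byType.computeIfAbsent(person.type, key -> new ArrayList<>()).add(person);
        }
        return byType;
    }
}
```

Since the result is keyed by type, the same code keeps working if further types are added to the table later.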
In general, one query performs better.
Also, with issuing two queries you can potentially get inconsistent results (which may be fixed with higher transaction isolation level though ).
In any case I believe you still need to iterate through resultset (either directly or by using framework's methods that return collections).
From the database point of view, you optimally have exactly one statement that fetches exactly everything you need and nothing else. Therefore, your first option is better. But don't generalize that answer in a way that makes you query more data than needed. It's a common mistake for beginners to select all rows from a table (no where clause) and do the filtering in code instead of letting the database do its job.
It also depends on your data volume. If you have a large data set, a select * without any condition might take some time, whereas with an index on your 'TYPE' column, adding a where clause will reduce the query's execution time. If you are dealing with a small data set, doing a select * and applying your logic in the Java code is the better approach.
There are four main bottlenecks involved in querying a database:
1) The query itself - how long the query takes to execute on the server depends on indexes, table sizes etc.
2) The data volume of the results - there could be hundreds of columns or huge fields, and all this data must be serialised and transported across the network to your client.
3) The processing of the data - Java must walk the query results gathering the data it wants.
4) Maintaining the query - it takes manpower to maintain queries; simple ones cost little but complex ones can be a nightmare.
By careful consideration it should be possible to work out a balance between all four of these factors - it is unlikely that you will get the right answer without doing so.
You can query by two conditions:
SELECT * FROM MY_TABLE WHERE TYPE = 'A' OR TYPE = 'B'
This will do both for you at once, and if you want them sorted, you could do the same, but just add an order by keyword:
SELECT * FROM MY_TABLE WHERE TYPE = 'A' OR TYPE = 'B' ORDER BY TYPE ASC
This will sort the results by type, in ascending order.
EDIT:
I didn't notice that originally you wanted two different lists. In that case, you could just do this query, and then find the index where the type changes from 'A' to 'B' and copy the data into two arrays.
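Finding the index where the type changes could be sketched as follows. The SortedSplit class is hypothetical; it assumes the rows are already sorted by type (for example by the ORDER BY above) and that only two types are present, so everything after the first change goes into the second list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class SortedSplit {
    // Splits rows sorted by type into two lists at the first index where the type changes.
    // With more than two types, the second list holds all rows after the first change.
    static <T> List<List<T>> splitAtTypeChange(List<T> sortedRows, Function<T, String> typeOf) {
        int boundary = sortedRows.size();
        for (int i = 1; i < sortedRows.size(); i++) {
            String firstType = typeOf.apply(sortedRows.get(0));
            if (!typeOf.apply(sortedRows.get(i)).equals(firstType)) {
                boundary = i;
                break;
            }
        }
        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(sortedRows.subList(0, boundary)));
        parts.add(new ArrayList<>(sortedRows.subList(boundary, sortedRows.size())));
        return parts;
    }
}
```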

Hibernate getting position of a row in a result set

I need to get an equivalent to this SQL that can be run using Hibernate. It doesn't work as is due to special characters like @.
SELECT place from (select @curRow := @curRow + 1 AS place, time, id FROM `testing`.`competitor` JOIN (SELECT @curRow := 0) r order by time) competitorList where competitorList.id=4;
My application manages the results of running competitions. The above query selects, for a specific competitor, their place based on their overall time.
For simplicity I'll only list the COMPETITOR table structure (only the relevant fields). My actual query involves a few joins, but they are not relevant for the question:
CREATE TABLE competitor (
id INT,
name VARCHAR,
time INT
);
Note that competitors are not already ordered by time, thus, the ID cannot be used as rank. As well, it is possible to have two competitors with the same overall time.
Any idea how I could make this work with Hibernate?
Hard to tell without a schema, but you may be able to use something like
SELECT COUNT(*) FROM testing ts
WHERE ts.score < $obj.score
where I am using the $ to stand for whatever Hibernate notation you need to refer to the live object.
I couldn't find any way to do this, so I had to change the way I'm calculating the position. I'm now taking the top results and am creating the ladder in Java, rather than in the SQL query.
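Computing the place in Java could be sketched along these lines. The Competitor and Ladder classes are hypothetical and only model the id and time columns. Note that this follows the counting idea (one plus the number of competitors with a strictly smaller time), so competitors with equal times share a place, which differs from the row-numbering in the original SQL.

```java
import java.util.List;

// Hypothetical model of the relevant COMPETITOR columns.
final class Competitor {
    final int id;
    final int time;

    Competitor(int id, int time) {
        this.id = id;
        this.time = time;
    }
}

final class Ladder {
    // Place = 1 + number of competitors with a strictly smaller time; ties share a place.
    static int placeOf(List<Competitor> competitors, int id) {
        Competitor target = null;
        for (Competitor c : competitors) {
            if (c.id == id) {
                target = c;
                break;
            }
        }
        if (target == null) {
            throw new IllegalArgumentException("No competitor with id " + id);
        }
        int faster = 0;
        for (Competitor c : competitors) {
            if (c.time < target.time) {
                faster++;
            }
        }
        return faster + 1;
    }
}
```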

How Can I Use Hibernate JPA @SQLInsert With a Database Column Having a Default Value

I have a table "groups" with four columns. The database is postgres and the group_id column is a Serial. So in reality it is an Integer with a default to get the next value.
I have a use case where I need to use @SQLInsert (using the normal persist method is not an option), but I can't get it to work with the default. Here is what I have:
@SQLInsert(sql="INSERT INTO groups (group_id, parent_id, group_name, version) VALUES (DEFAULT,?,?,?)")
I set the entity attributes to values where group_id and version are null, and the other two are correctly populated. group_id is not nullable in the DB, version can be null.
I get this exception:
WARNING: SQL Error: 0, SQLState: 22023
SEVERE: The column index is out of range: 4, number of columns: 3.
SEVERE: Could not synchronize database state with session
If I enter the following DML directly on the database, it works:
INSERT INTO groups (group_id, parent_id, group_name, version) VALUES (DEFAULT, 3, 'abcd', null);
Is there some way to make the same thing happen using @SQLInsert?
If the class members you want to save are not reference types, they cannot hold a null value. That may be the cause of the failure to synchronize with the database records. Try to use reference types like Integer and Double, etc., and make sure that the default values are applied with a direct insert query.
Another thing about your error messages: it may be that the default value is outside the range of the type you are using in Java for that column. Check that the default value is in range. If a value out of range is set for your class member, it can't be synced.
EDIT: Sorry, the second part is not true in this case.
So the short answer is "it can't be done this way". Despite quite a few places I've seen this asked, the Hibernate people have not provided for this use case.
My solution was to decouple the Postgres sequence from the table. That is, I removed the default constraint that selects the nextval from the sequence and populates one of the two primary key fields.
I then manually grab the nextval using a native query (yep, forced to un-abstract the database), and use that value to manually populate the primary key field. It works. It's kludgy, but I might use it more often. It is certainly a lot easier to understand what is happening than with the pure ORM methods. This can be debugged without a wizard's hat. :)
public class...
@PersistenceContext(unitName = "persistence_unit")
private EntityManager em;
...
mymethod(){
...
Query q = em.createNativeQuery("SELECT nextval('groups_group_id_seq')");
BigInteger groupId = (BigInteger)q.getSingleResult();
BigInteger parentId = methodToGetParentId();
GroupsPK gpk = new GroupsPK(groupId, parentId);
Groups grps = new Groups(gpk, "other parameters");
...
}
