Is it possible to get partitioned data using SQL? - java

I have an RDBMS table with a BIGINT column whose values are not sequential. I have a Java program in which I want each thread to fetch data according to PARTITION_SIZE, i.e. after doing an ORDER BY on the result, I want pairs of column values like:
Column_Value at Row 0 , Column_Value at Row `PARTITION_SIZE`
Column_Value at Row `PARTITION_SIZE+1` , Column_Value at Row `2*PARTITION_SIZE`
Column_Value at Row `2*PARTITION_SIZE+1` , Column_Value at Row `3*PARTITION_SIZE`
Eventually, I will pass the above value ranges to a SELECT query's BETWEEN clause to get the divided data for each thread.
Currently, I am able to do this partitioning in Java by putting all values in a List (after getting all values from the DB) and then reading the values at those specific indices, {0,PARTITION_SIZE}, {PARTITION_SIZE+1,2*PARTITION_SIZE}, etc. The problem is that the List might hold millions of records, and it is not advisable to keep that in memory.
So I was wondering if it's possible to write such a query in SQL itself, which would return those ranges like below?
row-1 -> minId , maxId
row-2 -> minId , maxId
....
Database is DB2.
For example,
For table column values 1,2,12,3,4,5,20,30,7,9,11, the result of the SQL query for a partition size of 2 should be {1,2},{3,4},{5,7},{9,11},{12,20},{30}.

In my eyes the mod() function would solve your problem, and it lets you choose the number of partitions dynamically.
WITH numbered_rows_temp as (
SELECT rownumber() over (ORDER BY col1) as rownum,
col1,
...
coln
FROM table)
SELECT * FROM numbered_rows_temp
WHERE mod(rownum, <numberofpartitions>) = 0
Fill in the appropriate number of partitions, and vary the comparison value from 0 to <numberofpartitions> - 1 across your queries.
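To make the "one query per remainder" idea concrete, here is a minimal Java sketch that generates the query text for each thread. The table and column names (mytable, col1) and the class name are placeholders I made up, not anything from the question, and the SQL is only a simplified version of the query above.

```java
import java.util.ArrayList;
import java.util.List;

class ModPartitionQueries {
    /**
     * Builds one query per partition. Query k selects the rows whose row
     * number leaves remainder k when divided by numberOfPartitions.
     * mytable and col1 are hypothetical names used for illustration only.
     */
    static List<String> build(int numberOfPartitions) {
        if (numberOfPartitions <= 0) {
            throw new IllegalArgumentException("numberOfPartitions must be positive");
        }
        List<String> queries = new ArrayList<>();
        for (int remainder = 0; remainder < numberOfPartitions; remainder++) {
            queries.add(
                "WITH numbered AS ("
                + "SELECT ROW_NUMBER() OVER (ORDER BY col1) AS rn, col1 FROM mytable) "
                + "SELECT col1 FROM numbered WHERE MOD(rn, " + numberOfPartitions + ") = " + remainder);
        }
        return queries;
    }
}
```

Each thread would then run one element of the returned list. Note that every query numbers the whole table again, so this trades memory in Java for repeated work in the database.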

Michael Tiefenbacher's answer is probably more useful, as it avoids an extra query, but if you do want to determine ID ranges, this might work for you:
WITH parms(partition_size) AS (VALUES 1000) -- or whatever
SELECT
MIN(id), MAX(id),
INT((rn - 1) / parms.partition_size) partition_num -- rn starts at 1, so subtract 1 to get full-sized partitions
FROM (
SELECT id, ROW_NUMBER() OVER (ORDER BY id) rn
FROM yourtable
) t , parms
GROUP BY INT((rn - 1) / parms.partition_size)
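If you would rather compute the ranges in Java without holding every value in a List, the same grouping can be done in one pass over the sorted ids. This is only a sketch: it assumes the ids arrive already sorted (for example from a query with ORDER BY id), and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class RangePartitioner {
    /**
     * Returns {min, max} pairs, one per run of partitionSize consecutive ids.
     * Only the current range is kept in memory, not the ids themselves.
     * Assumes sortedIds is already in ascending order.
     */
    static List<long[]> ranges(Iterator<Long> sortedIds, int partitionSize) {
        if (partitionSize <= 0) {
            throw new IllegalArgumentException("partitionSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        long min = 0;
        long max = 0;
        int inGroup = 0;
        while (sortedIds.hasNext()) {
            long id = sortedIds.next();
            if (inGroup == 0) {
                min = id;          // first id of a new range
            }
            max = id;              // last id seen so far in this range
            inGroup++;
            if (inGroup == partitionSize) {
                result.add(new long[] {min, max});
                inGroup = 0;
            }
        }
        if (inGroup > 0) {
            result.add(new long[] {min, max}); // final, possibly shorter range
        }
        return result;
    }
}
```

For the values in the question, sorted and with a partition size of 2, this yields {1,2},{3,4},{5,7},{9,11},{12,20},{30,30}; the last range has a single element, so its minimum and maximum coincide.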

Related

Can we combine VALUES and SELECT queries in MySQL

I am trying to insert data into a table. That table has 6 attributes, 2 of its own and 4 foreign keys.
Now I write a query like this:
insert into bus
values ( 4 , 45 , (select bus_driver.id , conductor.id , trip_location.trip_id , bus_route.route_id
from bus_driver , conductor , trip_location , bus_route));
And it's giving me an error like:
Error Code: 1241. Operand should contain 1 column(s)
What should I change in my query?
You need to remove the values clause and put the select straight after the table and column names of the insert clause, like below:
insert into bus(column1, column2 ........)
select 4 , 45 , bus_driver.id , conductor.id , trip_location.trip_id ,
bus_route.route_id from bus_driver , conductor , trip_location , bus_route;
It's not clear what you're trying to do. It looks like you're going to end up with a lot of rows inserted into your bus table depending on the data in the other tables you're selecting from.
If you run only the select statement, see what you get for results:
select bus_driver.id, conductor.id, trip_location.trip_id, bus_route.route_id
from bus_driver, conductor, trip_location, bus_route
Then add 4, 45 in front of all those rows. That's what you'll be inserting into the bus table.
You may be looking to do something more like:
insert into bus (column1, column2, column3, column4, column5, column6)
select 4, 45, bus_driver.id, conductor.id, trip_location.trip_id, bus_route.route_id
from bus_driver, conductor, trip_location, bus_route
where bus_driver.column? = ?
and conductor.column? = ?
...
The where clauses would be constructed such that only one record is returned for each table. It depends on what you're trying to do, though. There may be situations where you want more than one record from the selected tables, which would end up inserting multiple records into the bus table.
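To see why the unfiltered select can insert far more rows than expected, here is a small Java sketch that mimics the comma join with nested loops. The id values are invented for illustration; only the shape of the result matters.

```java
import java.util.ArrayList;
import java.util.List;

class CrossJoinDemo {
    /**
     * Mimics "select 4, 45, d.id, c.id, t.trip_id, r.route_id from d, c, t, r"
     * with no WHERE clause: every combination of rows becomes one inserted row.
     */
    static List<int[]> rowsToInsert(int[] driverIds, int[] conductorIds, int[] tripIds, int[] routeIds) {
        List<int[]> rows = new ArrayList<>();
        for (int d : driverIds) {
            for (int c : conductorIds) {
                for (int t : tripIds) {
                    for (int r : routeIds) {
                        rows.add(new int[] {4, 45, d, c, t, r});
                    }
                }
            }
        }
        return rows;
    }
}
```

With 2 drivers, 3 conductors, 1 trip and 2 routes this already produces 2 * 3 * 1 * 2 = 12 rows, which is why the where clauses matter.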

Can you check if a column exists and perform different actions with oracle?

My table looks like the following:
id | value1 | count
I have a list of value1 in RAM and I want to do the following:
(if value1 exists in table){
count + 1}else{
insert new row into table}
Is this possible with Oracle or do I have to take it to the code, do a for loop and execute one element of the list at a time? The list contains 5 million values. I'd have to do something like this in the code:
for (int i = 0; i < list.size(); i++) {
boolean exists = checkifexists(list.get(i));
if (exists) {
countPlusOne(list.get(i));
} else {
createNewRow(list.get(i));
}
}
So I have to do at least two queries for each value, totalling 10m+ queries. This could take a long time and may not be the most efficient way to do this. I'm trying to think of another way.
"I load them into RAM from the database"
You already have the source data in the database so you should do the processing in the database. Instantiating a list of 5 million strings in local memory is not a cheap operation, especially when it's unnecessary.
Oracle supports a MERGE capability which we can use to test whether a record exists in the target table and populate a new row conditionally. Being a set operation, MERGE is far more performant than single-row inserts in a Java loop.
The tricky bit is uniqueness. You need to have a driving query from the source table which contains unique values (otherwise MERGE will fail). In this example I aggregate a count of each occurrence of value1 in the source table. This gives us a set of value1 plus a figure we can use to maintain the count column on the target table.
merge into your_target_table tt
using ( select value1
, count(*) as dup_cnt
from your_source_table
group by value1
) st
on ( st.value1 = tt.value1 )
when not matched then
insert (id, value1, cnt)
values (someseq.nextval, st.value1, st.dup_cnt)
when matched then
update
set tt.cnt = tt.cnt + st.dup_cnt;
(I'm assuming the ID column of the target table is populated by a sequence; amend that as you require).
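For anyone who wants to check the expected outcome before running the statement, the logic of the MERGE above can be sketched in plain Java with maps. This is only an in-memory model of the semantics (aggregate the source, then add or insert); the class and method names are made up, and it is not meant to replace doing the work in the database.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MergeSketch {
    /** Returns the target counts after applying the source values, leaving target untouched. */
    static Map<String, Long> merge(Map<String, Long> target, List<String> sourceValues) {
        // Step 1: the USING subquery, i.e. GROUP BY value1 with COUNT(*).
        Map<String, Long> dupCnt = new TreeMap<>();
        for (String value : sourceValues) {
            dupCnt.merge(value, 1L, Long::sum);
        }
        // Step 2: when matched, add dup_cnt; when not matched, insert with dup_cnt.
        Map<String, Long> result = new TreeMap<>(target);
        dupCnt.forEach((value, cnt) -> result.merge(value, cnt, Long::sum));
        return result;
    }
}
```

With a target holding a=1 and a source of a, a, b, the result is a=3 and b=1.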
In Oracle, we could use a MERGE statement to check if a row exists and do insertion only if it doesn't.
First create a type that defines your list.
CREATE OR REPLACE TYPE value1_type as TABLE OF VARCHAR2(10); --use the datatype of value1
Merge statement.
MERGE INTO yourtable t
USING (
select distinct column_value as value1 FROM TABLE(value1_type(v1,v2,v3))
)s ON ( s.value1 = t.value1 )
WHEN NOT MATCHED THEN INSERT
(value1) VALUES (s.value1); -- the source subquery only exposes value1; add other columns as needed
You may also use NOT EXISTS.
INSERT INTO yourtable t
select * FROM
(
select distinct column_value as value1 from TABLE(value1_type(v1,v2,v3))
) s
WHERE NOT EXISTS
(
select 1 from
yourtable t where t.value1 = s.value1
);
You can do this by two approaches
Approach 1:
Create a temp table in the database and insert all the values you hold in RAM into it.
Write a query that updates the count by joining your main table with the temp table, and
set a flag in the temp table marking which values were updated. For the values that were not updated,
use an insert query to insert them.
Approach 2:
You can create your own data type, which accepts array of values as input:
CREATE OR REPLACE TYPE MyType AS VARRAY(200) OF VARCHAR2(50);
You can then write a procedure containing your logic; the procedure takes the array of values as input: CREATE OR REPLACE PROCEDURE testing (t_in MyType)
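Since a VARRAY(200) holds at most 200 elements and the list in the question has millions of values, the Java side would have to call the procedure in batches. A minimal sketch of the batching step, with made-up class and method names:

```java
import java.util.ArrayList;
import java.util.List;

class Chunker {
    /** Splits values into consecutive sublists of at most maxSize elements. */
    static <T> List<List<T>> chunks(List<T> values, int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int from = 0; from < values.size(); from += maxSize) {
            int to = Math.min(from + maxSize, values.size());
            // Copy the view so each chunk is independent of the source list.
            result.add(new ArrayList<>(values.subList(from, to)));
        }
        return result;
    }
}
```

Each chunk would then be passed to the procedure as one array argument; how the array is bound depends on the JDBC driver, so that part is left out here.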
First load your RAM list into a temporary table TMP
select * from tmp;
VALUE1
----------
V00000001
V00000002
V00000003
V00000004
V00000005
...
You may use a MERGE statement to handle your logic:
if the key exists, increase the count by 1
if the key doesn't exist, insert it with an initial count of 1
merge into val
using tmp
on (val.value1 = tmp.value1)
when matched then update
set val.count = val.count + 1
when not matched then
insert (val.value1, val.count)
values (tmp.value1, 1)
;
Note that I assume you have an IDENTITY key in the column ID, so no key assignment is required.
In case there are duplicate records in the TMP table (several records with the same VALUE1 key), you get an error, as MERGE cannot handle more than one action for one key.
ORA-30926: unable to get a stable set of rows in the source tables
If you want every duplicate occurrence to count as one increment,
you must pre-aggregate the temporary table using GROUP BY and add the counts.
Otherwise simply ignore the duplicates using DISTINCT.
merge /*+ PARALLEL(5) */ into val
using (select value1, count(*) cnt from tmp group by value1) tmp
on (val.value1 = tmp.value1)
when matched then update
set val.count = val.count + tmp.cnt
when not matched then
insert (val.value1, val.count)
values (tmp.value1, tmp.cnt)

Static list MINUS select statement

I have a java program that returns a list of Long values (hundreds).
I would like to subtract from this list the values obtained from a select on an Oracle database,
something like this:
SELECT 23 as num FROM DUAL UNION ALL
SELECT 17 as num FROM DUAL UNION ALL
SELECT 19 as num FROM DUAL UNION ALL
SELECT 67 as num FROM DUAL UNION ALL...
...
...
SELECT 68 as num FROM DUAL MINUS
SELECT NUM FROM MYTABLE
I presume that this operation has some performance issues...
Are there other better approaches?
Thank you
Case 1:
Use Global Temporary Tables (GTT):
CREATE GLOBAL TEMPORARY TABLE my_temp_table (
column1 NUMBER
) ON COMMIT DELETE ROWS;
Insert the List (Long value) into my_temp_table:
INSERT ALL
INTO my_temp_table (column1) VALUES (27)
INTO my_temp_table (column1) VALUES (32)
INTO my_temp_table (column1) VALUES (25)
.
.
.
SELECT 1 FROM DUAL;
Then:
SELECT * FROM my_temp_table
WHERE column1 NOT IN (SELECT NUM FROM MYTABLE);
Let me know if you have any issue.
Case 2:
Use TYPE table:
CREATE TYPE number_tab IS TABLE OF number;
SELECT column_value AS num
FROM TABLE (number_tab(1,2,3,4,5,6)) temp_table
WHERE NOT EXISTS (SELECT 1 FROM MYTABLE WHERE MYTABLE.NUM = temp_table.column_value);
Assuming MyTable is much bigger than literal values, I think the best option is using a temporary table to store your values. This way your query is a lot cleaner.
If you are working in a concurrent environment (e.g. typical web app), use an id field, and delete when finished. Summarizing:
preliminary: create a table for temporary values TEMPTABLE(id, value)
for each transaction
get new unique/atomic id (new oracle sequence value, for example)
for each literal value: insert into temptable(new_id, value)
select * from temptable where id = new_id minus...
process result
delete from temptable where id = new_id
Temporary tables are a good solution in Oracle. This one can be used with an ORM persistence layer.
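Whichever variant you pick, it can help to have a reference for the result you expect. The sketch below models MINUS in plain Java: the distinct values of the list that do not appear in the table. It assumes the table values fit in memory, which may well not hold for a large MYTABLE, so treat it as a way to check small cases rather than as a replacement for the query. The names are made up.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MinusSketch {
    /** Distinct values of listValues that do not occur in tableValues, in first-seen order. */
    static List<Long> minus(List<Long> listValues, Collection<Long> tableValues) {
        // MINUS removes duplicates, so start from a set of the list values.
        Set<Long> remaining = new LinkedHashSet<>(listValues);
        remaining.removeAll(new HashSet<>(tableValues));
        return new ArrayList<>(remaining);
    }
}
```

Unlike this sketch, the SQL result comes back in no guaranteed order unless you add an ORDER BY.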

Select query with multiple where clauses

I have a list of serial numbers: 111111, 222222, AAAAAA, FFFFFF and I want to return a corresponding value or null from a table depending on whether or not the value exists.
Currently I loop through my list of serial numbers, query using the following statement:
"SELECT cnum FROM table WHERE serial_num = " + serialNumber[i];
and then use the value if one is returned.
I would prefer to do this in one query and get results similar to:
Row | cnum
------------
1 | 157
2 | 4F2
3 | null
4 | 93O
5 | null
6 | 9F3
Is there a query to do this or am I stuck with a loop?
It sounds as if you have some sort of Java Array or Collection of serial numbers, and perhaps you want to check to see if these numbers are found in the DB2 table, and you'd like to do the whole list all at once, rather than one at a time. Good thinking.
So you want to have a set of rows with which you can do a left join to the table, with null indicating that the corresponding serial was not in the table. Several answers have started to use this approach. But they are not returning your row number, and they are using SELECT UNION's which seems a round-about way to get what you want.
VALUES clause
Your FROM clause can be a "nested-table-expression"
which can be a (fullselect)
with a correlation-clause. The (fullselect) can, in turn, be a VALUES clause. So you could have something like this:
FROM (VALUES (1, '157'), (2, '4F2'), (3, '5MISSING'), (4, '93O'), ...
) as Lst (rw, sn)
You can then LEFT JOIN this to the table, and get a two-column result table like you asked for:
SELECT Lst.rw, t.cnum
FROM (VALUES (1, '157'), (2, '4F2'), (3, '5MISSING'), (4, '93O'), ...
) as Lst (rw, sn)
LEFT JOIN sometable t ON t.serial_num = Lst.sn
With this method, you will probably need a loop to build your dynamic SQL statement string, using the values from your collection.
If it was embedded SQL, we might be able to reference a host array variable containing your serial numbers. But alas, in Java I am not sure how to manage using the list directly in SQL, without using some loop.
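For what it's worth, the loop that builds the statement could look roughly like the sketch below. It generates parameter markers instead of concatenating the serial numbers into the SQL text. The CASTs are there because DB2 may reject untyped parameter markers inside a VALUES clause; the VARCHAR(20) length, the ORDER BY, and the class name are my assumptions, while sometable, serial_num and cnum follow the example above.

```java
class ValuesJoinBuilder {
    /** Builds the LEFT JOIN query with one (row number, serial) marker pair per serial. */
    static String build(int serialCount) {
        if (serialCount <= 0) {
            throw new IllegalArgumentException("serialCount must be positive");
        }
        StringBuilder sql = new StringBuilder("SELECT Lst.rw, t.cnum FROM (VALUES ");
        for (int i = 0; i < serialCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            // One row of the VALUES list: (row number, serial number).
            sql.append("(CAST(? AS INTEGER), CAST(? AS VARCHAR(20)))");
        }
        sql.append(") AS Lst (rw, sn) LEFT JOIN sometable t ON t.serial_num = Lst.sn ORDER BY Lst.rw");
        return sql.toString();
    }
}
```

With a PreparedStatement you would then bind pair i (counting from 0) with setInt(2 * i + 1, i + 1) and setString(2 * i + 2, serial), and read the rows back in rw order.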
If you use only an "in" it is not going to return null for the missing value forcing you to do some coding in the application (probably the most efficient way).
If you wanted the database to do all the work (may or may not be ideal) then
you would have to trick db2 into returning your list regardless.
Something like this might work, faking the null values to be returned from sysdummy with the common table expression (with part):
with all_serials as (
select '111111' as serialNumber from sysibm.sysdummy1 union all
select '222222' as serialNumber from sysibm.sysdummy1 union all
select 'AAAAAA' as serialNumber from sysibm.sysdummy1 union all
select 'FFFFFF' as serialNumber from sysibm.sysdummy1
)
select
t1.serialNumber,
t2.serialNumber as serialNumberExists
from
all_serials as t1 left outer join
/* Make sure the grain of the_Table is at "serialNumber" */
the_table as t2 on t1.serialNumber = t2.serialNumber
You can use the SQL IN keyword. You'd need to dynamically generate the list, but basically it'd look like:
SELECT cnum FROM table WHERE serial_num in ('111111', '2222222', '3333333', 'AAAAAAA'...)
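A rough sketch of how this could be wired up from Java: generate the IN list with parameter markers, select serial_num along with cnum, and then line the results up with the original list so that missing serials come out as null. The helper names are made up, and the map stands in for whatever you read from the ResultSet.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class InClauseLookup {
    /** Builds the query text with one parameter marker per serial number. */
    static String inQuery(int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        String markers = String.join(", ", Collections.nCopies(count, "?"));
        return "SELECT serial_num, cnum FROM table WHERE serial_num IN (" + markers + ")";
    }

    /** Returns one cnum per serial, in list order, with null where no row was found. */
    static List<String> align(List<String> serials, Map<String, String> cnumBySerial) {
        List<String> result = new ArrayList<>();
        for (String serial : serials) {
            result.add(cnumBySerial.get(serial)); // get() returns null for a missing serial
        }
        return result;
    }
}
```

Keep in mind that databases limit how long an IN list can be, so a very large list may need to be split into several queries.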
Try something like:
select t.cnum
from
(select '111111' serial_num from sysibm.sysdummy1 union all
select '222222' serial_num from sysibm.sysdummy1 union all
select 'AAAAAA' serial_num from sysibm.sysdummy1 union all
select 'FFFFFF' serial_num from sysibm.sysdummy1) v
left join table t on v.serial_num = t.serial_num
I'm not sure if I get you correctly, but this could help:
String query = "SELECT cnum FROM table WHERE ";
for (int i = 0; i < serialNumber.length; i++)
query += "serial_num='" + serialNumber[i] + "' OR ";
query += "serial_num IS NULL ";
System.out.println(query);

SQL with rank and partition

I need to execute this sql:
select * from
(select nt.*,
rank() over (partition by feld0 order by feld1 desc) as ranking
from (select bla from test) nt)
where ranking < 3
order by 1,2
This SQL works fine in my Oracle database, but in the H2 database which I sometimes use it doesn't work, because rank and partition are not defined.
So I need to transform this SQL so that it works in both H2 and Oracle.
I want to use Java to execute this SQL. So is it possible to split it into different statements without rank and partition, and then handle the rest in Java?
If feld1 is unique within feld0 partitions, you could:
select yt1.*
, (
select count(*)
from YourTable yt2
where yt2.feld0 = yt1.feld0 -- Same partition
and yt2.feld1 >= yt1.feld1 -- Higher or equal feld1, matching ORDER BY feld1 DESC
) as ranking
from YourTable yt1
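As for handling it in Java: if you fetch the rows with a plain select, the ranking can also be computed on the client. The sketch below is one way to do it, assuming each row is reduced to a {feld0, feld1} pair of longs (the real rows would carry more columns) and that the data fits in memory. It follows RANK() semantics, so ties share a rank. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RankFilter {
    /**
     * rows are {feld0, feld1} pairs. Keeps the rows whose rank within their
     * feld0 partition, ordered by feld1 descending, is below maxRankExclusive.
     */
    static List<long[]> topRanked(List<long[]> rows, int maxRankExclusive) {
        // PARTITION BY feld0
        Map<Long, List<long[]>> byFeld0 = new LinkedHashMap<>();
        for (long[] row : rows) {
            byFeld0.computeIfAbsent(row[0], key -> new ArrayList<>()).add(row);
        }
        List<long[]> result = new ArrayList<>();
        for (List<long[]> partition : byFeld0.values()) {
            // ORDER BY feld1 DESC
            partition.sort(Comparator.comparingLong((long[] r) -> r[1]).reversed());
            int rank = 0;
            for (int i = 0; i < partition.size(); i++) {
                // RANK(): ties keep the same rank, a new value gets its position i + 1.
                if (i == 0 || partition.get(i)[1] != partition.get(i - 1)[1]) {
                    rank = i + 1;
                }
                if (rank >= maxRankExclusive) {
                    break;
                }
                result.add(partition.get(i));
            }
        }
        return result;
    }
}
```

Calling topRanked(rows, 3) corresponds to "where ranking < 3". The partitions come back in the order their feld0 value first appeared, so sort the result afterwards if you need the final "order by 1,2".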
