So I'm a bit lost and don't really know how to handle this one...
Consider that I have two DB tables in Talend. First,
a table invoices_only, which has as fields the invoiceNummer and the author, like this.
Then, a table invoices_table with the fields (invoiceNummer, article, quantity and price); for one invoice I can have many articles, for example.
Through a tMap I want to obtain a table invoice_table_result with new columns: one for the article position and another for the total price. For the position I know I can use something like the Numeric.sequence("s1",1,1) function, but I don't know how to restart my counter when a new invoiceNummer is found; the total price, of course, is just a basic multiplication.
So my result should be something like this.
Here is a draft of my Talend job; I'm doing a lookup on the invoiceNummer between the tables invoices_only and invoices.
Any advice? Thanks.
A trick I use is to do the sequence like this:
Numeric.sequence("s" + row.InvoiceNummer, 1, 1)
This way, the sequence gets incremented while you're still on the same InvoiceNummer, and a new one is started whenever a new InvoiceNummer is found.
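For completeness, a hedged sketch of the two tMap output expressions this implies (assuming the main flow is named row1 and that quantity and price are numeric; those names are assumptions, adjust them to your schema):

// position: one sequence per invoice number, so a new invoiceNummer starts a new counter
Numeric.sequence("s" + row1.invoiceNummer, 1, 1)
// total price: the basic multiplication mentioned in the question
row1.quantity * row1.price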
There are two ways to achieve it:
tJavaFlex
SQL
tJavaFlex
You can compare the current row with the previous one and reset the sequence value using the function below:
// reset the sequence whenever the invoice number changes
// (row1 and prevInvoiceNummer are assumed names; prevInvoiceNummer is carried between rows)
if (!row1.invoiceNummer.equals(prevInvoiceNummer)) {
    Numeric.resetSequence("s1", 1);
}
SQL
Once the data is loaded into the table, create a post-job and use an update query to update the records. You select the records and take the rank of the values, then perform the update on top of that select.
select invoicenumber, row_number() over (partition by invoicenumber order by invoicenumber) from table_name where -- conditions if any
Update statements vary with respect to the database; please mention which database you are using so that I can provide the update query.
I would recommend achieving this through SQL.
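For illustration, a hedged sketch of such a post-job update in PostgreSQL flavour, run through JDBC (invoice_table_result, its columns, and dataSource are assumptions; the update syntax differs on other databases):

import java.sql.Connection;
import java.sql.Statement;

// number the articles within each invoice, then write the rank back as the position
String sql =
      "UPDATE invoice_table_result r "
    + "SET position = t.rn "
    + "FROM (SELECT invoicenumber, article, "
    + "             ROW_NUMBER() OVER (PARTITION BY invoicenumber ORDER BY article) AS rn "
    + "      FROM invoice_table_result) t "
    + "WHERE r.invoicenumber = t.invoicenumber AND r.article = t.article";
try (Connection conn = dataSource.getConnection();   // dataSource is assumed to exist
     Statement st = conn.createStatement()) {
    st.executeUpdate(sql);
}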
I have a requirement like this.
protected Integer[] updateFullTable(final Class clazz){
    // first query: collect the ids of the rows that will be updated
    final ProjectionList projectionList = Projections.projectionList()
            .add(Projections.property("id"), "id");
    final Criteria criteria = session.createCriteria(clazz)
            .add(Restrictions.eq("typeOfOperation", 1))
            .add(Restrictions.eq("performUpdate", true));
    criteria.setProjection(projectionList);
    final List idsList = criteria.list();
    final Integer[] ids = transformObjectArrayIntoIntegerArray(idsList);
    // NOW WE UPDATE THE ROWS BY ID
    final Query query = session.createQuery("update " + clazz.getName()
            + " set activeRegister = true, updateTime = :updateTime where id in (:ids)")
            .setParameter("updateTime", new Date())
            .setParameterList("ids", ids);
    query.executeUpdate();
    return ids;
}
As you can see, I need to update all the rows in a table. Currently I query all the row ids and later apply the update to those ids in a separate query, but the tables have a lot of records, so this sometimes takes between 30 seconds and 10 minutes, depending on the table.
I have changed this code to a single update, like this:
final Query query = session.createQuery("update " + clazz.getName() + " set activeRegister = true, updateTime = :updateTime where typeOfOperation = 1 and performUpdate = true");
With that single query I avoid the first select, but I can no longer return the affected ids. Later the requirement was changed: a
final StringBuilder logRevert;
parameter was added, which needs to store the updated ids so that a direct reverse update can be applied to the DB if required.
But with my single update I can no longer get the ids. My question is: how can I get or return the affected ids, using a stored procedure or some workaround in the DB or Hibernate? That is, how can I keep the first behaviour with only one query, or with enhanced code?
Any tips?
I have tried:
Using Criteria
Using HQL
Using a named query
Using SQLQuery
Not using a transformer, returning a raw Object[]
But the times are still somewhat high.
I want something like
query.executeUpdate(); // RETURNS THE COUNT OF THE AFFECTED ROWS
But I need the affected ids...
Sorry if the question is simple.
UPDATE
With @dmitry-senkovich's help I could do it using raw SQL, but not with Hibernate; a separate question was asked here.
https://stackoverflow.com/questions/44641851/java-hibernate-org-hibernate-exception-sqlgrammarexception-could-not-extract-re
What about the following solution?
SET @ids = NULL;
UPDATE SOME_TABLE
SET activeRegister = true, updateTime = :updateTime
WHERE typeOfOperation = 1 AND performUpdate = true
AND (SELECT @ids := CONCAT_WS(',', id, @ids));
SELECT @ids;
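A hedged JDBC sketch of that trick (MySQL only; SOME_TABLE and the columns follow the question, dataSource is an assumption). Note that all three statements must run on the same connection, since @ids is a per-session user variable:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

try (Connection conn = dataSource.getConnection();   // dataSource is assumed
     Statement st = conn.createStatement()) {
    st.execute("SET @ids = NULL");
    st.executeUpdate(
          "UPDATE SOME_TABLE SET activeRegister = true, updateTime = NOW() "
        + "WHERE typeOfOperation = 1 AND performUpdate = true "
        + "AND (SELECT @ids := CONCAT_WS(',', id, @ids))");
    try (ResultSet rs = st.executeQuery("SELECT @ids")) {
        if (rs.next()) {
            String affectedIds = rs.getString(1); // comma-separated, e.g. "42,17,3"
        }
    }
}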
If updateTime is a datetime, you can select all the affected record ids afterwards with a select:
Date updateTime = new Date(); // time from update
"select id from " + clazz.getName() + " where updateTime = :updateTime and activeRegister = true and typeOfOperation = 1 and performUpdate = true"
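In Hibernate terms that could look like the following hedged sketch (session and clazz as in the question; updateTime must be exactly the timestamp used in the update):

// fetch the ids touched by the update, using the same updateTime value
final List affectedIds = session.createQuery(
        "select id from " + clazz.getName()
        + " where updateTime = :updateTime and activeRegister = true"
        + " and typeOfOperation = 1 and performUpdate = true")
        .setParameter("updateTime", updateTime)
        .list();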
Updating a large number of rows in a table is a slow operation. This is due to needing to capture the 'old' value of each row in case of a ROLLBACK (due to an explicit ROLLBACK, failure of the UPDATE, failure of a subsequent query in the same transaction, or power failure before the UPDATE finishes).
The usual fix is to rethink the application design that necessitated the large UPDATE.
On the other hand, there is a possible fix to the schema. Please provide SHOW CREATE TABLE so I don't have to do as much 'hand waving' in the following paragraph...
It might be better to move the column(s) that need to be updated into a separate, parallel, table ("vertical partitioning"). This might be beneficial if
The original table has lots of wide columns (TEXT, BLOB, etc) -- by not having to make bulky copies.
The original table is being updated simultaneously -- by the updates not blocking each other.
There are SELECTs hitting the non-updated columns -- by avoiding certain other blockings.
You can still get the original set of columns -- by JOINing the two tables together.
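As a hedged sketch of that split (all names invented; the real layout depends on the SHOW CREATE TABLE asked for above):

// hypothetical DDL: the frequently-updated flags move to a narrow parallel table
// sharing the original primary key, while the bulky columns stay behind
st.execute(                               // st is an assumed java.sql.Statement
      "CREATE TABLE my_table_status ("
    + "  id INT PRIMARY KEY,"
    + "  activeRegister BOOLEAN NOT NULL,"
    + "  updateTime DATETIME,"
    + "  FOREIGN KEY (id) REFERENCES my_table(id))");
// the original set of columns is still available via:
//   SELECT ... FROM my_table JOIN my_table_status USING (id)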
BoundStatement UpdateTable = new BoundStatement(preparedStatement);
UpdateTable.bind(productId, productname, time);
session.execute(UpdateTable);
I am using the following commands to update Cassandra tables. Sometimes it updates and sometimes it doesn't.
UPDATE product SET count = count + 1 where productId = ? AND productname = ? AND time = ?;
It never throws an error.
Why is this ?
EDIT
Table structure
CREATE TABLE IF NOT EXISTS product (productId int, productname text, time timestamp, count counter, PRIMARY KEY (productid, productname, time));
By looking at your (Java?) code, I can't really tell what kind of object insertUpdateTable is. But the bind method should return a BoundStatement object that can be executed. And while UpdateTable is indeed a BoundStatement, I don't see that you're actually binding your variables to it.
Based on the limited amount of code shown, I see two solutions here:
Call the bind method on UpdateTable inside your session.execute:
session.execute(UpdateTable.bind(productId, productname, time));
Wrap your insertUpdateTable.bind inside a session.execute:
session.execute(insertUpdateTable.bind(productId, productname, time));
Check out the DataStax documentation on Using Bound Statements with the Java driver for more information.
Sometimes it updates and sometimes it doesn't.
If you had posted your Cassandra table definition, it might shed some more light on this. But it is important to remember that Cassandra PRIMARY KEYs are unique, and that INSERTs and UPDATEs are essentially the same (an INSERT can "update" existing values and an UPDATE can "insert" new values). Sometimes an UPDATE may appear to not work, when it may be performing a write with the same key values. Just something to look out for.
Also important to note, is that UPDATE product SET count = count + 1 will only work under two conditions:
count is a counter column.
product is a counter table, consisting of only keys and counter columns (all non-counter columns must be a part of the PRIMARY KEY).
Worth noting is that counter columns underwent a big change/improvement with Cassandra 2.1. If you need to use counters and are still on Cassandra 2.0, it may be worth upgrading.
You said the update sometimes works, but note that if you ever delete row with a counter, you'll be unable to modify the row again without dropping the table and recreating it. The update will appear to fail silently. For more, see CASSANDRA-8491
I had a similar issue during high frequency writes & updates.
As the number of concurrent requests goes up, there is a good chance that the latest bind may overwrite the previously bound params. So instead of reusing a single BoundStatement, use preparedStatement.bind() inside the session.execute.
Can you try the following?
Instead of using :
UpdateTable.bind(productId, productname, time);
session.execute(UpdateTable);
Use :
session.execute(preparedStatement.bind(productId, productname, time));
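Putting it together, a hedged sketch with the DataStax Java driver (the CQL follows the question; preparedStatement is created once at startup):

// prepare once; bind fresh values per request so concurrent requests
// don't overwrite each other's parameters on a shared BoundStatement
PreparedStatement preparedStatement = session.prepare(
      "UPDATE product SET count = count + 1 "
    + "WHERE productId = ? AND productname = ? AND time = ?");
session.execute(preparedStatement.bind(productId, productname, time));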
I would like to fetch the last 10 database transactions in IBM DB2.
That is, which last 10 transactions were executed in DB2.
Depending on what you need that for, you will have to set up the DB2 audit facility or use an activity event monitor.
SQL tables have no implicit ordering; the order has to come from the data. Perhaps you should add a field to your table (e.g. an int counter) and re-import the data.
If you cannot do so, then here is one more idea that came to mind while writing this answer. Can we use rownum to get the last 10 records? Perhaps; here is what you can try. I am just throwing the idea out and have not tested it.
Get the MAX(rownum) from the table
Fetch the records from the table between max(rownum) and max(rownum) - 10
It sounds ugly, but see if it works for you.
By the way, if you don't know about rowid, here is a link to learn about it:
http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.apdv.porting.doc%2Fdoc%2Fr0052875.html
If there is a column in your table that you can use to ascertain the correct order, such as a transaction number or a value generated by a sequence reference, or some column(s) you can ORDER BY, then simply add DESCENDING after each column in the ORDER BY clause and FETCH FIRST 10 ROWS.
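For example, a hedged sketch over JDBC (my_transactions and trans_id are invented names standing in for your table and its ordering column):

// last 10 rows by a monotonically increasing transaction number
ResultSet rs = st.executeQuery(        // st is an assumed java.sql.Statement
      "SELECT * FROM my_transactions "
    + "ORDER BY trans_id DESC "
    + "FETCH FIRST 10 ROWS ONLY");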
Just a quick question about locking tables in a Postgres database using JDBC. I have a table to which I want to add a new record; for the primary key, I use an increasing integer value.
I want to be able to retrieve the max value of this column in Java and store it as a variable to be used as a new primary key when adding a new row.
This gives me a small problem, as this is going to be modelled as a multi-user system, what happens when 2 locations request the same max value? This will of course create a problem when trying to add the same primary key.
I realise that I should be using an EXCLUSIVE lock on the table to prevent reading or writing while getting the key and adding a new row. However, I can't seem to find any way to deal with table locking in JDBC, just standard transactions.
Pseudocode as such:
primaryKey = "SELECT MAX(id) FROM table1;"
primaryKey++;
// id retrieved again by a 2nd source in the meantime
"INSERT INTO table1 VALUES (primaryKey, value1, value2);"
You're absolutely right, if two locations request at around the same time, you'll run into a race condition.
The way to handle this is to create a sequence in postgres and select the nextval as the primary key.
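A hedged JDBC sketch of that (table1_id_seq is an assumed sequence name, created beforehand with CREATE SEQUENCE):

// ask the database for the next unique key; no table lock is needed,
// since nextval is safe under concurrent callers
try (Statement st = conn.createStatement();           // conn is an assumed Connection
     ResultSet rs = st.executeQuery("SELECT nextval('table1_id_seq')")) {
    rs.next();
    long newId = rs.getLong(1); // use as the primary key in the INSERT
}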
I don't know exactly what direction you're heading in or how you handle your data, but you could also make the column a serial and not even include it in your insert query. The column will auto-increment automatically.
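With a serial column you can also get the generated key back in the same round trip; a hedged sketch using PostgreSQL's RETURNING clause (column names and v1/v2 are invented):

// omit the id column and let Postgres generate it
try (PreparedStatement ps = conn.prepareStatement(    // conn is an assumed Connection
        "INSERT INTO table1 (value1, value2) VALUES (?, ?) RETURNING id")) {
    ps.setString(1, v1);
    ps.setString(2, v2);
    try (ResultSet rs = ps.executeQuery()) {
        rs.next();
        int newId = rs.getInt(1); // the key the database actually assigned
    }
}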
I have a web service in Java that receives a list of information to be inserted or updated in a database. I don't know which entries are inserts and which are updates.
Which approach gives the best performance:
Iterate over the list (an object list, with the table PK in it) and try to insert each entry into the database; if the insert fails, run an update.
Try to load the entry from the database; if it is found, update it, otherwise insert it.
Another option? Tell me about it :)
On the first calls I believe most of the entries will be new DB entries, but there will be a saturation point after which most of the entries will be updates.
I'm talking about a DB table that could reach over 100 million entries in a mature form.
What will be your approach? Performance is my most important goal.
If your database supports MERGE, I would have thought that was the most efficient approach (and it treats all the data as a single set).
See:
http://www.oracle.com/technology/products/oracle9i/daily/Aug24.html
https://web.archive.org/web/1/http://blogs.techrepublic%2ecom%2ecom/datacenter/?p=194
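For a flavour of it, a hedged sketch of an ANSI-style MERGE run over JDBC (master_table and staging are invented names; the exact syntax varies by database):

// one set-based statement: update matching rows, insert the rest
st.executeUpdate(                       // st is an assumed java.sql.Statement
      "MERGE INTO master_table m "
    + "USING staging s ON (m.id = s.id) "
    + "WHEN MATCHED THEN UPDATE SET m.col1 = s.col1 "
    + "WHEN NOT MATCHED THEN INSERT (id, col1) VALUES (s.id, s.col1)");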
If performance is your goal, then first get rid of the word iterate from your vocabulary! Learn to do things in sets.
If you need to update or insert, always do the update first. Otherwise it is easy to find yourself updating the record you just inserted by accident. If you are doing this it helps to have an identifier you can look at to see if the record exists. If the identifier exists, then do the update otherwise do the insert.
The important thing is to understand the balance, or ratio, between the number of inserts and the number of updates in the list you receive. IMHO you should implement an abstract strategy that says "persist this to the database". Then create concrete strategies that (for example):
check the primary key: if zero records are found, insert, otherwise update;
do the update and, if it fails, do the insert;
others.
Then pull the strategy to use (the fully qualified class name, for example) from a configuration file, so you can switch from one strategy to another easily. If it is feasible (this depends on your domain), you can add a heuristic that selects the best strategy based on the input entities in the set.
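A minimal sketch of that idea (every name here is invented):

import java.util.List;
import java.util.Properties;

// strategy contract: each implementation decides how to persist a batch
interface PersistStrategy {
    void persist(List<Object> entries) throws Exception;
}

// one concrete strategy: insert first, fall back to an update on failure
class InsertFirstStrategy implements PersistStrategy {
    public void persist(List<Object> entries) {
        // ... insert, catch duplicate-key errors, update those entries ...
    }
}

class StrategyLoader {
    // pull the fully qualified class name from configuration, as suggested above
    static PersistStrategy load(Properties config) throws Exception {
        String className = config.getProperty(
                "persist.strategy", InsertFirstStrategy.class.getName());
        return (PersistStrategy) Class.forName(className)
                .getDeclaredConstructor().newInstance();
    }
}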
MySQL supports this:
INSERT INTO foo
SET bar='baz', howmanybars=1
ON DUPLICATE KEY UPDATE howmanybars=howmanybars+1
Option 2 is not going to be the most efficient. The database already makes this check for you when you do the actual insert or update, in order to enforce the primary key. By making the check yourself you incur the overhead of a table lookup twice, as well as an extra round trip from your Java code. Choose whichever case is most likely and code optimistically.
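On the Java side, the optimistic insert-first variant might look like this hedged sketch (my_table, the columns, and conn are assumptions; 23505 is the standard unique-violation SQLSTATE on e.g. PostgreSQL and DB2, other databases differ):

try (PreparedStatement ins = conn.prepareStatement(
        "INSERT INTO my_table (id, col1) VALUES (?, ?)")) {
    ins.setInt(1, id);
    ins.setInt(2, col1);
    ins.executeUpdate();
} catch (SQLException e) {
    if (!"23505".equals(e.getSQLState())) throw e; // not a duplicate key
    try (PreparedStatement upd = conn.prepareStatement(
            "UPDATE my_table SET col1 = ? WHERE id = ?")) {
        upd.setInt(1, col1);
        upd.setInt(2, id);
        upd.executeUpdate();
    }
}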
Expanding on option 1, you can use a stored procedure to handle the insert/update. This example with PostgreSQL syntax assumes the insert is the normal case.
CREATE FUNCTION insert_or_update(_id INTEGER, _col1 INTEGER) RETURNS void
AS $$
BEGIN
    INSERT INTO
        my_table (id, col1)
    SELECT
        _id, _col1;
EXCEPTION WHEN unique_violation THEN
    UPDATE
        my_table
    SET
        col1 = _col1
    WHERE
        id = _id;
END;
$$
LANGUAGE plpgsql;
You could also make the update the normal case and then check the number of rows affected by the update statement to determine if the row is actually new and you need to do an insert.
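That update-first variant is a one-line check on the affected row count; a hedged JDBC sketch (names invented):

int updated = st.executeUpdate(         // st is an assumed java.sql.Statement
        "UPDATE my_table SET col1 = 42 WHERE id = 7");
if (updated == 0) {
    // nothing was touched, so the row is new: insert it
    st.executeUpdate("INSERT INTO my_table (id, col1) VALUES (7, 42)");
}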
As alluded to in some other answers, the most efficient way to handle this operation is in one batch:
Take all of the rows passed to the web service and bulk insert them into a temporary table
Update rows in the master table from the temp table
Insert new rows in the master table from the temp table
Dispose of the temp table
The type of temporary table to use and most efficient way to manage it will depend on the database you are using.
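As a hedged, PostgreSQL-flavoured illustration of those four steps (staging and master_table are invented names):

try (Statement st = conn.createStatement()) {          // conn is an assumed Connection
    // 1. temp table shaped like the master table
    st.execute("CREATE TEMP TABLE staging (LIKE master_table INCLUDING ALL)");
    // ... bulk insert the web-service rows into staging (batched inserts or COPY) ...
    // 2. update existing rows from the temp table
    st.executeUpdate(
          "UPDATE master_table m SET col1 = s.col1 "
        + "FROM staging s WHERE m.id = s.id");
    // 3. insert the rows that don't exist yet
    st.executeUpdate(
          "INSERT INTO master_table (id, col1) "
        + "SELECT s.id, s.col1 FROM staging s "
        + "LEFT JOIN master_table m ON m.id = s.id WHERE m.id IS NULL");
    // 4. the temp table disappears at end of session (or DROP it explicitly)
}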