How to sum inserted rows in the tJavaFlex component when iterating between inputs?

In Talend Open Studio for Data Integration 7.0.1 (and earlier versions), I use tJavaFlex to log how many rows have been inserted into a database.
The Talend job in detail:
1. Split a large file into multiple smaller files
2. Iterate over the smaller files, inserting each into the database
3. Log how many rows have been inserted
The logging part logs every iteration, so it looks like this:
2019-01-31 09:39:20 |Stage_SalesInvoiceLine | Rows inserted 5000
2019-01-31 09:39:25 |Stage_SalesInvoiceLine | Rows inserted 5000
2019-01-31 09:39:32 |Stage_SalesInvoiceLine | Rows inserted 5000
I need help figuring out how to get it to look like this:
2019-01-31 09:39:32 |Stage_SalesInvoiceLine | Rows inserted 15000
I've looked at tJavaFlex behaviour when changing loop position for an answer, but did not manage to solve my problem.
Current code in the tJavaFlex main code part (start and end parts are empty):
// Read this iteration's insert count from the JDBC output component
Integer Inserted = ((Integer)globalMap.get("tJDBCOutput_6_NB_LINE"));
String InsertedS = "Rows inserted " + Integer.toString(Inserted);
// Build the log line written out by this flow
row19.TimeStamp = TalendDate.getDate("yyyy-MM-dd HH:mm:ss ");
row19.LogRow = "Stage_SalesInvoiceLine | " + InsertedS;

If you use local variables in tJavaFlex they will get reset at each iteration. Instead, you could define a global variable before the start of your subjob, increment it inside tJavaFlex, and retrieve its value after you've done all your inserts.
tSetGlobalVar (NB_INSERTS set to 0)
|
OnSubjobOK
|
database inserts -- OnComponentOK -- tJavaFlex
|
tFixedFlowInput -- tFileOutputDelimited
In the above tJavaFlex, you can increment your variable in the main part:
globalMap.put("NB_INSERTS", (Integer)globalMap.get("NB_INSERTS") + (Integer)globalMap.get("tJDBCOutput_1_NB_LINE_INSERTED"));
in tFixedFlowInput: "Rows inserted " + (Integer)globalMap.get("NB_INSERTS")
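Putting it together, the main part of the tJavaFlex could look like this (a minimal sketch; tJDBCOutput_1 and the NB_LINE_INSERTED key are assumptions, so adjust them to match your job):

// Main part: add this iteration's insert count to the running total.
// NB_INSERTS is assumed to be initialised to 0 in the tSetGlobalVar above;
// tJDBCOutput_1 is a placeholder for your actual insert component.
Integer total = (Integer) globalMap.get("NB_INSERTS");
Integer batch = (Integer) globalMap.get("tJDBCOutput_1_NB_LINE_INSERTED");
globalMap.put("NB_INSERTS", total + (batch == null ? 0 : batch));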

Related

How do I understand a Firebase query in Android? [duplicate]

I have a list of numbered entries, each holding certain data, to which I am adding a value "Sí" or "No", as in this image:
The last "Sí" that I added was the one at number 4, but if I filter in Firebase with
.equalTo("Sí").limitToLast(1)
it returns the value of the "Sí" at number 5 and not the one at number 4, which was the last "Sí" that I added to the database. Is there some way to recognize the last "Sí" without requiring it to be in the last position of the list?
I still cannot find a solution to this; I hope you can help me.
Thank you.
Is there some way to recognize the last "Sí" without requiring it to be in the last position of the list?
Yes, there is. The most common approach would be to add under each object a new timestamp property named lastUpdate and then query in descending order on it. Every time you update a value, also set lastUpdate to the current timestamp. This is how your schema might look:
Firebase-root
|
--- Users
|
--- 1
| |
| --- Activo: "Si"
| |
| --- lastUpdate: 1561202277
|
--- 2
|
--- Activo: "No"
|
--- lastUpdate: 1561202299
This is how to save the timestamp:
How to save the current date/time when I add new value to Firebase Realtime Database
And this is how to order descending:
Firebase Data Desc Sorting in Android
How to arrange firebase database data in ascending or descending order?
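For illustration, here is a minimal sketch of both steps using the Realtime Database Java API (the Users path and the child key "4" are taken from the question's schema; since the Realtime Database cannot sort descending, limitToLast(1) on an ascending order grabs the newest entry):

import com.google.firebase.database.*;
import java.util.HashMap;
import java.util.Map;

DatabaseReference users = FirebaseDatabase.getInstance().getReference("Users");

// Write: update the value and stamp lastUpdate with the server time
Map<String, Object> updates = new HashMap<>();
updates.put("Activo", "Sí");
updates.put("lastUpdate", ServerValue.TIMESTAMP);
users.child("4").updateChildren(updates);

// Read: the child with the greatest lastUpdate is the most recently changed
users.orderByChild("lastUpdate").limitToLast(1)
        .addListenerForSingleValueEvent(new ValueEventListener() {
            @Override
            public void onDataChange(DataSnapshot snapshot) {
                for (DataSnapshot child : snapshot.getChildren()) {
                    System.out.println("Last updated entry: " + child.getKey());
                }
            }

            @Override
            public void onCancelled(DatabaseError error) {
                // ignored in this sketch
            }
        });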

How to remove minus and plus sign duplicates via Talend job?

I have loaded a local file into a Talend job and need to apply the conditions below to its data.
My CSV file data looks like this:
NO,DATE,MARK
123,2015-03-01,200
123,2015-03-01,-200
123,2015-03-01,200
123,2015-03-01,200
125,2016-01-01,80
Above, both "200" and "-200" are present as values. If I have a -200,
I need to remove the corresponding +200 value. After that, if I have rows with the same NO, DATE, and MARK, I need to remove the duplicates too:
" 123,2015-03-01,200"," 123,2015-03-01,200" = " 123,2015-03-01,200"
Finally, my result should look like this:
NO,DATE,MARK
123,2015-03-01,200
125,2016-01-01,80
After that, I need to sum the marks: 200 + 80 = 280, giving a final row 125,2016-01-01,280. How can I do this in a Talend job?
Step by step, we can start by removing this:
123,2015-03-01,200
123,2015-03-01,-200
We can do it by summing MARK after grouping by NO and DATE, using the Talend component tAggregateRow. Afterwards, we will get:
123,2015-03-01,0
Now we can use the component tFilterRow to remove all rows having MARK == 0, and the component tUniqRow to remove duplicated rows.
The last step is to get the sum of MARK using tAggregateRow and store it in a context variable, then get the greatest NO and the latest DATE using tSortRow, keep only that row using tSampleRow, and finally assign the stored sum of MARK to it, as sketched below.
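For readers who prefer to see the logic spelled out, here is a plain-Java sketch of the cancel, dedup, and sum steps (a hypothetical standalone illustration, not Talend-generated code; it cancels one positive MARK per matching negative, which yields the expected output above; Java 16+ for records):

import java.util.*;

public class CancelDedupSum {
    record Row(String no, String date, int mark) {}

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row("123", "2015-03-01", 200),
                new Row("123", "2015-03-01", -200),
                new Row("123", "2015-03-01", 200),
                new Row("123", "2015-03-01", 200),
                new Row("125", "2016-01-01", 80));

        // Each negative MARK cancels exactly one matching positive row
        Map<String, Integer> pendingCancels = new HashMap<>();
        for (Row r : rows)
            if (r.mark() < 0)
                pendingCancels.merge(r.no() + "|" + r.date() + "|" + (-r.mark()), 1, Integer::sum);

        // Drop negatives and cancelled positives, then dedup on NO+DATE+MARK
        Set<String> seen = new LinkedHashSet<>();
        List<Row> result = new ArrayList<>();
        for (Row r : rows) {
            if (r.mark() < 0) continue;
            String key = r.no() + "|" + r.date() + "|" + r.mark();
            Integer pending = pendingCancels.get(key);
            if (pending != null && pending > 0) {
                pendingCancels.put(key, pending - 1);
                continue;
            }
            if (seen.add(key)) result.add(r);
        }

        // Sum the remaining MARKs: 200 + 80 = 280
        int total = result.stream().mapToInt(Row::mark).sum();
        System.out.println(result + " | total = " + total);
    }
}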

Insert 2 million records into table from file

I have 2 million records in a file, and I'm trying to insert all of them into my table. I'm unsure which approach I should use: LOAD DATA INFILE or a Hibernate transaction.
How can I insert all the data very fast?
The file format is txt, with one record per line. I need to insert only one column; the others will be generated automatically.
LOAD DATA INFILE is the first choice of MySQL users. But if you want to validate the data, that takes extra effort. You can also use data integration tools for this. For example, Talend is an open-source data integration tool: with a few clicks it loads from file to database, and so on. It's useful for large data sets, and you can also validate and clean your data.
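If you go the plain-JDBC route, a minimal sketch looks like this (hypothetical database, file, and credential values; allowLoadLocalInfile=true is the Connector/J property required for LOCAL infiles):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BulkLoad {
    public static void main(String[] args) throws Exception {
        // allowLoadLocalInfile=true lets Connector/J send a LOCAL infile
        String url = "jdbc:mysql://localhost:3306/mydb?allowLoadLocalInfile=true";
        try (Connection con = DriverManager.getConnection(url, "user", "pass");
             Statement st = con.createStatement()) {
            int rows = st.executeUpdate(
                    "LOAD DATA LOCAL INFILE '/tmp/codes.txt' " +
                    "IGNORE INTO TABLE Code (code) " +
                    "SET point = 0, created = NOW(), activated = 0");
            System.out.println("Rows loaded: " + rows);
        }
    }
}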
I decided to use LOAD DATA INFILE, but here is another problem. When I finish the process, I get this warning:
WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Warning Code: 1062, SQLState: 23000
' for key 'PRIMARY'
This is my query:
String query = " LOAD DATA LOCAL INFILE :file " +
" IGNORE INTO TABLE Code" +
" (code) " +
" SET point = 0, created = NOW(), activated = 0; ";
and when I check my records in MySQL, there is no value in the code column:
+-----------+-------+------+-----------+---------------+---------------------+
| code | point | user | activated | activatedDate | created |
+-----------+-------+------+-----------+---------------+---------------------+
|           |     0 | NULL |         0 | NULL          | 2015-10-01 16:35:02 |
|           |     0 | NULL |         0 | NULL          | 2015-10-01 16:35:02 |
+-----------+-------+------+-----------+---------------+---------------------+
2 rows in set (0.00 sec)

BigQuery WORM work-around for updated data

Using Google's "electric meter" example from a few years back, we would have:
MeterID (Datastore Key) | MeterDate (Date) | ReceivedDate (Date) | Reading (double)
Presuming we received updated info (say, an out-of-calibration or busted meter, etc.) and put in a new row with the same MeterID and MeterDate, using a window function to grab the newest ReceivedDate for each ID+MeterDate pair would only cost more if there are multiple records for that pair, right?
Sadly, we are flying without a SQL expert, but it seems like the query should look like:
SELECT
meterDate,
NTH_VALUE(reading, 1) OVER (PARTITION BY meterDate ORDER BY receivedDate DESC) AS reading
FROM [BogusBQ:TableID]
WHERE meterID = {ID}
AND meterDate BETWEEN {startDate} AND {endDate}
Am I missing anything else major here? Would adding 'AND NOT IS_NAN(reading)' cause the Window Function to return the next row, or nothing? (Then we could use NaN to signify "deleted".)
Your SQL looks good. A couple of suggestions:
- I would use FIRST_VALUE to be a bit more explicit, but otherwise it should work.
- If you can, use NULL instead of NaN. Or better yet, add a new BOOLEAN column to mark deleted rows.
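For illustration, both suggestions applied to the query above might look like this (same legacy-SQL placeholders as the question; is_deleted is a hypothetical name for the suggested BOOLEAN column):

SELECT
  meterDate,
  FIRST_VALUE(reading) OVER (PARTITION BY meterDate ORDER BY receivedDate DESC) AS reading
FROM [BogusBQ:TableID]
WHERE meterID = {ID}
  AND meterDate BETWEEN {startDate} AND {endDate}
  AND NOT is_deleted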

Auto-complete a tuple

I have a database table like this
Port Code | Country |Port_Name
--------------------------------------
1234 | Australia | port1
2345 | India | Mumbai
2341 | Australia | port2
...
The table consists of around 12000 entries. I need to auto-complete as the user enters the query, which can be a port code, a country, or a port name. For example, if the user's partial query is '12', the drop-down should display 1234 | Australia | port1. The problem I'm facing now is that for each keystroke I'm querying the database, which makes the auto-complete really slow. Is there a way to optimize this?
In SmartGWT, use a ComboBoxItem, then override its getPickListFilterCriteria like this:
ComboBoxItem portSelect = new ComboBoxItem("PORT_ATTRIB", "") {
    @Override
    public Criteria getPickListFilterCriteria() {
        // Build a criteria from the typed text on every picklist refresh
        Criteria criteria = null;
        if (getValue() != null && getValue() instanceof String) {
            criteria = new AdvancedCriteria(OperatorId.AND, new Criterion[]{
                new Criterion("portValue", OperatorId.EQUALS, getDisplayValue())});
        }
        return criteria;
    }
};
Every key press will give you a criteria which you can pass to your query. The query will be something like: select * from port where portName like 'criteria%' or portCode like 'criteria%'
You could do this with Lucene and a RAMDirectory. You build an index on your data and implement a data lookup service that checks from time to time whether changes in the database occurred, or use any other update mechanism from your database to your Lucene index. Use Lucene for indexing your DB, and for querying use the MultiFieldQueryParser.
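A minimal sketch of that approach, assuming a classic Lucene version where RAMDirectory still exists (roughly 7.x) and using the table's three columns as field names:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.RAMDirectory;

public class PortIndex {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer();

        // Index every row of the port table once (one Document per row)
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
            Document doc = new Document();
            doc.add(new TextField("portCode", "1234", Field.Store.YES));
            doc.add(new TextField("country", "Australia", Field.Store.YES));
            doc.add(new TextField("portName", "port1", Field.Store.YES));
            writer.addDocument(doc);
        }

        // Search all three fields for the user's partial input
        Query q = new MultiFieldQueryParser(
                new String[]{"portCode", "country", "portName"}, analyzer).parse("12*");
        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
        for (ScoreDoc hit : searcher.search(q, 10).scoreDocs) {
            Document d = searcher.doc(hit.doc);
            System.out.println(d.get("portCode") + " | " + d.get("country") + " | " + d.get("portName"));
        }
    }
}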
Is your database indexed correctly? Lookups on indexed columns should be pretty fast - 12k rows is not a great deal for any relational DB.
Another thing I could suggest is to load the database table data into an in-memory table. I've done this in MySQL a long time back: http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html . This helps especially if the data does not change very frequently, so a one-time load of the data into an in-memory table is quick. After that, all queries are executed against this in-memory table, and these are amazingly fast.
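A one-time load into MySQL's MEMORY engine could look like this (hypothetical table and column names; note the MEMORY engine does not support TEXT/BLOB columns):

CREATE TABLE port_mem ENGINE=MEMORY
    AS SELECT port_code, country, port_name FROM port;

-- autocomplete queries then hit the in-memory copy
SELECT * FROM port_mem
WHERE port_code LIKE '12%' OR country LIKE '12%' OR port_name LIKE '12%';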
