Talend: tMap NullPointerException while merging two CSV files

I want to merge two CSV files. The problem is that one of the two files has dynamic columns.
For example:
The first CSV file has two columns, A and G. Column G holds comma-separated values.
A | G |<-Column Names
--|---------|
A1| G1,G2,G3| <-Row
A2| G2,G5,G6|<-Row
The second CSV file has dynamic columns, but it will always have the column A (uid). For example:
A | C1 |C2 |Othercolumns|<-Column Names
--|-------|---------|------------|
A1|C1Value|C2Value | |<-Row
A2|C1Value| C2Value | |<-Row
I want to merge these two files so that the output will be:
A |G | C1 |C2 |Othercolumns|<-Column Names
--|-----------|-------|---------|------------|
A1| G1,G2,G3 |C1Value|C2Value | |<-Row
A2| G2,G5,G6 |C1Value| C2Value | |<-Row
Here is the job.
I didn't check the "Include Header" option in tFileOutputDelimited_1.
This merges the CSV files correctly, but does not bring over the column names of the second CSV file (the one with dynamic columns). The output is as shown below.
A |G | | | |
--|-----------|-------|---------|------------|
A1| G1,G2,G3 |C1Value|C2Value | |<-Row
A2| G2,G5,G6 |C1Value| C2Value | |<-Row
To get the column names, I check the "Include Header" option in the output file, and then I get the exception below.
java.lang.NullPointerException
at routines.system.DynamicUtils.writeHeaderToDelimitedFile(DynamicUtils.java:72)
at content.csvmergetest_0_1.CSVMergeTest.tFileInputDelimited_2Process(CSVMergeTest.java:2696)
at content.csvmergetest_0_1.CSVMergeTest.runJobInTOS(CSVMergeTest.java:3109)
at content.csvmergetest_0_1.CSVMergeTest.main(CSVMergeTest.java:2975)
As shown below, in this case only one row is fetched from tFileInputDelimited_2. I guess that row is the header row, and that is why the NullPointerException occurs.
Why is this happening? How will I get the headers?
Please let me know how I can achieve this.

Read in the second file with the "Othercolumns" part as a single column of type Dynamic.
Before joining in tMap you need to extract the A column from it.
Then take care to have only one Dynamic column in the output schema, because Talend cannot handle two.
The result file, including one header line and one "Othercolumns" column Z, looks as follows:
A;G;C1;C2;Z
A1;G1,G2,G3;C1Value;C2Value;Z1
A2;G2,G5,G6;C1Value;C2Value;Z2
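For comparison, the same join can be sketched in plain Java outside Talend. This is only an illustration of the idea (keep everything after the key column as one opaque block, much like a single Dynamic column), not what Talend generates. The class name `CsvMergeSketch`, the `;` separator, the assumption that both inputs start with a header line, that the key is the first column, and that fields are never quoted are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Talend: joins two delimited files on their first column.
class CsvMergeSketch {

    /** Returns the first field of a line (the key column A). */
    static String key(String line, String sep) {
        int i = line.indexOf(sep);
        return i < 0 ? line : line.substring(0, i);
    }

    /** Returns everything after the first separator, or "" if the line has a single field. */
    static String afterKey(String line, String sep) {
        int i = line.indexOf(sep);
        return i < 0 ? "" : line.substring(i + sep.length());
    }

    /**
     * Inner join of left and right on the first column.
     * Assumes both lists are non-empty and that element 0 is the header line.
     */
    static List<String> merge(List<String> left, List<String> right, String sep) {
        // Index the right file by key; the remaining columns stay together as one block,
        // so it does not matter how many of them there are.
        Map<String, String> rest = new LinkedHashMap<>();
        for (String line : right.subList(1, right.size())) {
            rest.put(key(line, sep), afterKey(line, sep));
        }

        List<String> out = new ArrayList<>();
        // The output header is the left header plus the right header without its key column.
        out.add(left.get(0) + sep + afterKey(right.get(0), sep));
        for (String line : left.subList(1, left.size())) {
            String extra = rest.get(key(line, sep));
            if (extra != null) {
                out.add(line + sep + extra);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> left = List.of("A;G", "A1;G1,G2,G3", "A2;G2,G5,G6");
        List<String> right = List.of("A;C1;C2;Z", "A1;C1Value;C2Value;Z1", "A2;C1Value;C2Value;Z2");
        merge(left, right, ";").forEach(System.out::println);
    }
}
```

Because the header line goes through the same key/remainder split as the data rows, the dynamic column names end up in the output without being known in advance.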

Related

How to get data from DB using query param?

My DB data is
-----------------------
id | value |
-----------------------
1 | a,b |
-----------------------
2 | c |
-----------------------
3 | d |
-----------------------
I am trying to get DB values using this endpoint: http://localhost:8996/abc/v2/tpids?values=a,b...
String[] tpids = apiData.request.getParam(constants.tpids).split(",")
A request like abc/v2/tpids?values=a,b,c,d fails, because a,b,c,d is treated as four comma-separated values.
In this case, a and b are treated as two different values,
but I need a,b to be treated as a single value. How can I escape the comma in the param value? Thanks
It seems like the splitting is redundant, and you just need to treat the entire input as a single string:
String tpids = apiData.request.getParam(constants.tpids);

How to format data in column for WHERE clause just before executing SELECT?

I am using Microsoft SQL Server with already stored data.
In one of my tables I can find data like:
+--------+------------+
| id | value |
+--------+------------+
| 1 | 12-34 |
| 2 | 5678 |
| 3 | 1-23-4 |
+--------+------------+
I realized that the VALUE column was not properly formatted when inserted.
What I am trying to achieve is to get id by given value:
SELECT d.id FROM data d WHERE d.value = '1234';
Is there any way to format the data in the column just before the SELECT runs?
Should I create a new view and modify the column in that view, or maybe use a complicated regex to get only the digits (with the LIKE comparator)?
P.S. I manage database in Jakarta EE project using Hibernate.
P.S.2. I am not able to modify stored data.
One method is to use replace() before the comparison:
WHERE REPLACE(d.value, '-', '') = '1234'

SQL Dynamic column handling in Table during data load

I need to design a table in Oracle/SQL. Data will be uploaded by a Java/C# application from a CSV file with 50 fields (mapped to columns of the table). How should I design the table/DB given the constraints below on importing data from the CSV?
The CSV may have new fields added to the existing 50 fields.
In that case, instead of manually adding a column to the table and then loading the data, how can we design the table for smooth, automatic file handling with dynamic fields?
EX:
CSV has S_ID, S_NAME, SUBJECT, MARK_VALUE fields in it
+------+---------+-------------+------------+
| S_ID | S_NAME | SUBJECT | MARK_VALUE |
+------+---------+-------------+------------+
| 1 | Stud | SUB_1 | 50 |
| 2 | Stud | SUB_2 | 60 |
| 3 | Stud | SUB_3 | 70 |
+------+---------+-------------+------------+
What if the CSV has a new field "RANK" (and more fields like it) added to it, and I need to store all the new fields in the table?
Please suggest a DB design for this.
A few approaches come to mind. One is to keep the record metadata (column name, data type, any constraints) in one table, and to have another free-form table with a large enough number of columns to hold the data. Use the metadata table when inserting data into the free-form table to maintain data integrity.
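A minimal Java sketch of that idea follows, assuming the free-form table has generic columns named `COL_1`, `COL_2`, and so on. The class name `ColumnMetadataSketch`, the `COL_n` naming, and the table name `STUDENT_DATA` are hypothetical, and an in-memory map stands in for the metadata table. A real loader would persist the mapping and also record data types and constraints.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: maps CSV field names to generic columns of a free-form table.
class ColumnMetadataSketch {
    private final int capacity; // number of generic columns the free-form table has
    private final Map<String, String> fieldToColumn = new LinkedHashMap<>();

    ColumnMetadataSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the generic column for a CSV field, registering the field on first sight. */
    String columnFor(String field) {
        String column = fieldToColumn.get(field);
        if (column == null) {
            if (fieldToColumn.size() >= capacity) {
                throw new IllegalStateException("No free generic column for field " + field);
            }
            column = "COL_" + (fieldToColumn.size() + 1);
            fieldToColumn.put(field, column);
        }
        return column;
    }

    /** Builds a parameterized INSERT statement for the given CSV header. */
    String insertSql(String table, List<String> header) {
        StringBuilder columns = new StringBuilder();
        StringBuilder marks = new StringBuilder();
        for (String field : header) {
            if (columns.length() > 0) {
                columns.append(", ");
                marks.append(", ");
            }
            columns.append(columnFor(field));
            marks.append("?");
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }

    public static void main(String[] args) {
        ColumnMetadataSketch meta = new ColumnMetadataSketch(100);
        System.out.println(meta.insertSql("STUDENT_DATA",
                List.of("S_ID", "S_NAME", "SUBJECT", "MARK_VALUE")));
        // A file that arrives later with an extra RANK field simply claims the next free column.
        System.out.println(meta.insertSql("STUDENT_DATA",
                List.of("S_ID", "S_NAME", "SUBJECT", "MARK_VALUE", "RANK")));
    }
}
```

With this layout a new CSV field needs no schema change, only a new metadata entry. The trade-off is that the generic columns carry no meaningful names or types of their own, so readers of the table have to go through the metadata as well.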

Context variable is null in tMysqlInput query

I have a problem with my job when I want to make a query with 2 context variables. I attached photos with my job and my components and when I run the job, it's giving me this error:
Exception in component tMysqlInput_1 (facebook_amazon_us)
java.lang.NullPointerException
at mava.facebook_amazon_us_0_1.facebook_amazon_us.tWaitForFile_1Process(facebook_amazon_us.java:2058)
at mava.facebook_amazon_us_0_1.facebook_amazon_us.tMysqlConnection_1Process(facebook_amazon_us.java:798)
at mava.facebook_amazon_us_0_1.facebook_amazon_us.runJobInTOS(facebook_amazon_us.java:5363)
at mava.facebook_amazon_us_0_1.facebook_amazon_us.main(facebook_amazon_us.java:5085)
What I want to do in this job: I have a csv file with multiple columns. The first one is called Reporting_Starts. I want to get the first registration from that column and put it in the query for a select like:
SELECT * FROM my_table WHERE MONTH(my_table.Reporting_Starts)='"+context.month+"'.
I cannot see why my tJava_4 sees the variables and tMysqlInput doesn't.
In my tJava_4 I have the following code:
System.out.println(context.month);
Please let me know if you need any additional information about the job.
Thanks!
With all the iterate links you have, I'm guessing the code isn't executing in the order you expect. Could you please make the following changes:
Remove all the iterate links from tFileList_1
Reorganize your job as:
tMysqlConnection_1
|
OnSubjobOk
|
tWaitForFile_1
|
Iterate
|
tFileList_1 -- Iterate -- tJava_3
|
OnSubjobOk
|
tFileInputDelimited_1 -- Main -- tJavaRow_1
|
OnSubjobOk
|
tMysqlInput -- tMap -- tMysqlOutput (delete mode, set a column as delete key)
|
tFileInputDelimited -- tMap -- tMysqlOutput (insert csv)
|
OnSubjobOk
|
tFileCopy
First test with just this part. Then if it works, you can add the rest of your job.

How to break string from an excel file into substrings and load it?

I'm actually working on a talend job. I need to load from an excel file to an oracle 11g database.
I can't figure out how to break a field of my excel entry file within talend and load the broken string into the database.
For example I've got a field like this:
toto:12;tata:1;titi:15
And I need to load into a table, for example grade:
| name | grade |
|------|-------|
| toto |12 |
| titi |15 |
| tata |1 |
|--------------|
Thanks in advance
In a Talend job, you can use tFileInputExcel to read your Excel file, and then tNormalize to split your special column into individual rows with a separator of ";". After that, use tExtractDelimitedFields with a separator of ":" to split the normalized column into name and grade columns. Then you can use a tOracleOutput component to write the result to the database.
While this solution is more verbose than the Java snippet suggested by AlexR, it has the advantage that it stays within Talend's graphical programming model.
for (String pair : str.split(";")) {
    String[] kv = pair.split(":");
    // at this point you have separated values
    String name = kv[0];
    String grade = kv[1];
    dbInsert(name, grade);
}
Now you have to implement dbInsert(). Do it either using JDBC or using a higher-level tool (e.g. Hibernate, iBatis, JDO, JPA).
