We have an accented character value, "POKORNÝ", in a DB2 table. When I process the data through Java and write it into a CSV file, the value changes to "POKORNÃ". I tried converting the value to UTF-8 before writing the CSV, but no luck.
DB (screenshot): POKORNÝ
CSV (screenshot): POKORNÃ
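For what it's worth: "POKORNÃ" is what the UTF-8 bytes of Ý look like when they are written or read with a single-byte charset, and a common culprit is FileWriter, which uses the platform default charset. A minimal sketch of writing the CSV with an explicit UTF-8 writer (file name is a placeholder):
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CsvWrite {
    public static void main(String[] args) throws Exception {
        try (Writer out = Files.newBufferedWriter(
                Paths.get("out.csv"), StandardCharsets.UTF_8)) {
            // Explicit UTF-8 here avoids the platform-default charset
            // that FileWriter would pick and that mangles Ý into Ã.
            out.write("POKORNÝ\n");
        }
    }
}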
I am having an issue getting a value that contains Chinese characters: when I read it from the ResultSet, the Chinese characters come back as µ¥?ÚûÇ. My result is an XML document stored in the database, and I would like to convert µ¥?ÚûÇ back to the Chinese characters.
The original Chinese text is "澳門", but when I extract it from my database it shows as µ¥?ÚûÇ. When I view the XML in my Sybase database in Interactive SQL using the UTF-8 charset, I can see the Chinese characters 澳門.
Object[] results = (Object[]) query.getSingleResult();
xml = String.valueOf(results[2]);
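The µ¥?ÚûÇ pattern looks like the UTF-8 bytes of 澳門 decoded with a single-byte charset (Cp850 is my guess, not confirmed). If that is the case, re-encoding with the wrong charset and decoding as UTF-8 can recover the text; a minimal sketch under that assumption:
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    public static void main(String[] args) {
        String garbled = "µ¥?ÚûÇ"; // value as it came out of the ResultSet
        // Assumption: the driver decoded UTF-8 bytes as Cp850; reverse that.
        byte[] originalBytes = garbled.getBytes(Charset.forName("Cp850"));
        String repaired = new String(originalBytes, StandardCharsets.UTF_8);
        System.out.println(repaired); // ideally prints 澳門
    }
}
Note that the "?" in the garbled value suggests one byte was already lost in display and cannot be recovered this way; the cleaner fix is to make the driver decode correctly in the first place (for Sybase jConnect that should be the CHARSET connection property) rather than repairing after the fact.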
I use the following command to import data from a .csv file into a MySQL database table:
String loadQuery = "LOAD DATA LOCAL INFILE '" + file + "' INTO TABLE source_data_android_cell"
        + " FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"
        + " LINES TERMINATED BY '\n' IGNORE 1 LINES (.....)"
        + " SET test_date = STR_TO_DATE(@var1, '%d/%m/%Y %k:%i')";
However, one of the columns in the source file contains some really screwy data, viva Y31L.RastaMod䋢_Version, and the program refuses to import the data into MySQL, throwing this error:
java.sql.SQLException: Invalid utf8 character string: 'viva
Y31L.RastaMod'
I searched this but can't really understand what exactly the error is, other than that the input format of the string "viva Y31L.RastaMod䋢_Version" was wrong and didn't fit the utf8 format used in the MySQL database?
However, I have already run SET NAMES utf8mb4 in my MySQL DB, since other questions suggested that utf8mb4 is more flexible in accepting unusual characters.
I explored this further by manually inserting the weird data into the MySQL table at the command prompt, which worked fine; in fact, the table displayed almost the full entry: viva Y31L.RastaMod?ã¢_Version. But when I run my program from the IDE, the file gets rejected.
Would appreciate any explanations.
A second, minor question related to importing the CSV file into MySQL:
I noticed that I couldn't import a copy of the same file into the MySQL database; the errors said the data was a duplicate. Is that because MySQL rejects rows that duplicate an existing key? When I changed all the data in one column of the copied file, leaving the rest the same, it imported correctly. Why is that?
I don't think this immediate error has to do with the destination of the data being unable to cope with UTF-8 characters, but rather with the way you are using LOAD DATA. You can try specifying the character set which should be used when loading the data. Consider the following LOAD DATA command, which is what you had originally, slightly modified:
LOAD DATA LOCAL INFILE 'path/to/file' INTO TABLE source_data_android_cell
CHARACTER SET utf8
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES (.....)
SET test_date = STR_TO_DATE(@var1, '%d/%m/%Y %k:%i')
This being said, you should also make sure that the target table uses a character set which supports the data you are trying to load into it.
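If you are running this from Java, here is a minimal sketch of executing the modified statement over JDBC, assuming MySQL Connector/J with local-infile enabled on both client and server (table name taken from the question, the rest placeholders):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CsvLoad {
    public static void main(String[] args) throws Exception {
        // allowLoadLocalInfile lets Connector/J send the local file;
        // characterEncoding keeps the JDBC connection itself on UTF-8.
        String url = "jdbc:mysql://localhost:3306/mydb"
                + "?allowLoadLocalInfile=true&characterEncoding=UTF-8";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             Statement stmt = conn.createStatement()) {
            stmt.execute("LOAD DATA LOCAL INFILE '" + args[0] + "'"
                    + " INTO TABLE source_data_android_cell"
                    + " CHARACTER SET utf8"
                    + " FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"
                    + " LINES TERMINATED BY '\\n'" // \n escape sent to MySQL
                    + " IGNORE 1 LINES");
        }
    }
}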
I connect to an Oracle database whose NLS_CHARACTERSET is WE8ISO8859P1, which as far as I know cannot store Arabic text.
But Toad for Oracle can read Arabic from this database.
However, I cannot read this using Java code.
I even tried to get the row as bytes using UTL_RAW.CAST_TO_RAW.
The result was "218,227,237,225,228,199,32,199,225,218,210,237,210,161,225,222,207,32,199,211,202,229,225,223,202,32,32,56,48,37,32,227,228,32,230,205,207,199,202,32,221,225,237,223,211,32,32,32"
In a test Java class, I tried to build a new String from the bytes above, with no luck displaying the Arabic characters.
Any help? Thank you.
This could be caused by quite a few things:
Check the column type in the database: it should be NVARCHAR2, not VARCHAR2 (notice the "N" at the beginning of the word)
Try putting charset=utf8 in the connection string
Convert the byte[] to a String using UTF-8 encoding, like this:
String arabicText = new String(byteArray, "UTF-8");
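Given the byte values in the question, one more possibility (my assumption, not something the answer above confirms): the text may have been stored as Windows-1256 Arabic bytes inside the WE8ISO8859P1 column, in which case decoding with that charset, rather than UTF-8, would recover it. A sketch using the first few bytes from the question:
import java.nio.charset.Charset;

public class ArabicDecode {
    public static void main(String[] args) {
        // Byte values copied from the UTL_RAW.CAST_TO_RAW output above.
        int[] raw = {218, 227, 237, 225, 228, 199, 32, 199, 225, 218, 210, 237, 210};
        byte[] bytes = new byte[raw.length];
        for (int i = 0; i < raw.length; i++) {
            bytes[i] = (byte) raw[i];
        }
        // Assumption: the bytes are Windows-1256, not ISO-8859-1 or UTF-8.
        System.out.println(new String(bytes, Charset.forName("windows-1256")));
    }
}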
I have a MySQL table with a longtext column that contains the content of a CSV file. Is there a way to create a CSV file from the content of this column and extract data from it using Talend?
You could extract the String from the DB using the proper input component, then use tExtractDelimitedFields to split this long String on a separator character (the comma, I guess). Don't forget to carefully specify your output schema.
Finally, use tFileOutputDelimited to write the delimited file to the file system with data from the outgoing connection.
This could help: Validate a csv file
Sure, request the string out of the database, open a file and write the string to it - where's your actual problem?
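If you do go the plain-Java route, a minimal sketch of that (table and column names my_table/csv_content are hypothetical placeholders):
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LongtextToCsv {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/mydb?characterEncoding=UTF-8";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT csv_content FROM my_table WHERE id = 1");
             Writer out = Files.newBufferedWriter(
                     Paths.get("extracted.csv"), StandardCharsets.UTF_8)) {
            if (rs.next()) {
                // The longtext column already holds the full CSV payload,
                // so writing it out verbatim reproduces the original file.
                out.write(rs.getString("csv_content"));
            }
        }
    }
}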
In my app, I:
let Hibernate create the H2 DB
populate the DB through a JDBC SQL statement with CSV import (INSERT INTO ... SELECT ... FROM CSVREAD('file.csv')). The file is UTF-8 encoded.
On Linux, special characters in the DB are correct.
On Windows (default encoding cp1250), special characters are incorrect.
When I try different CSV file encodings (cp1250, iso-8859-2), it works on Windows but not on Linux.
Is there any way to tell H2 it needs to respect UTF-8 encoding on Windows?
UTF-8 needs to be set in the options parameter of the CSVREAD function, as follows:
CSVREAD('file.csv', null, 'charset=UTF-8')
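In the context of the question's import statement, that would look something like this (table name is a placeholder):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class H2CsvImport {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./testdb");
             Statement stmt = conn.createStatement()) {
            // charset=UTF-8 in the CSVREAD options overrides the platform
            // default (cp1250 on the Windows machines in question).
            stmt.execute("INSERT INTO my_table"
                    + " SELECT * FROM CSVREAD('file.csv', null, 'charset=UTF-8')");
        }
    }
}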