When I fetch the values of a column (EMP_NAME) that holds Arabic text from an Oracle database table, I get question marks (??????) instead of the Arabic name if I read it like this:
String emp_name = rs.getString("EMP_NAME");
But if I change that line to:
String emp_name = new String(rs.getString("EMP_NAME").getBytes("UTF-8"));
I get the Arabic name correctly. In some Arabic names, however, I find garbage characters, as in: "احمد عبدالحليم سي�? ابراهيم محمود"
How can I eliminate the garbage character (�?)? Is there an encoding other than UTF-8 that handles Arabic names correctly, or do I need to modify the code above to solve the issue?
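One detail may help when reasoning about the workaround above: the one-argument new String(byte[]) constructor decodes with the JVM's default charset, so getBytes("UTF-8") followed by new String(bytes) only returns the original text when that default happens to be UTF-8. The sketch below is not a fix for the Oracle setup in the question. It only illustrates, with hypothetical helper names (roundTrip, mismatch), that encoding and decoding with the same explicit charset leaves text unchanged, while mixing charsets produces garbled output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Encode and decode with the same explicit charset: the text comes back unchanged.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Encode with one charset and decode with another: this is how mojibake appears.
    static String mismatch(String s, Charset encodeWith, Charset decodeWith) {
        return new String(s.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        // First word of the name quoted in the question, written with Unicode escapes.
        String name = "\u0627\u062D\u0645\u062F";

        // Same charset both ways: identical to the input.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name));

        // UTF-8 bytes read as ISO-8859-1: every Arabic letter turns into two Latin-1 chars.
        String garbled = mismatch(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());

        // ISO-8859-1 maps every byte to a char, so this particular mismatch is reversible.
        String repaired = mismatch(garbled, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(repaired.equals(name));
    }
}
```

Charsets with unmapped byte values cannot always be reversed this way, which is one plausible source of a stray replacement character in only some names.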
Related
I am having trouble reading Chinese characters from my data: when I read the value from the result set, the Chinese characters become µ¥?ÚûÇ. The value is XML stored in the database, and I would like to convert µ¥?ÚûÇ back to the Chinese characters.
The original Chinese text is "澳門", but when I extract it from my database it shows as µ¥?ÚûÇ. When I view the XML in my Sybase database in Interactive SQL with the charset set to UTF-8, I can see the Chinese text 澳門 correctly.
Object[] results = (Object[]) query.getSingleResult();
xml = String.valueOf(results[2]);
I use the following command to import data from a .csv file into a MySQL database table like so:
String loadQuery = "LOAD DATA LOCAL INFILE '" + file + "' INTO TABLE source_data_android_cell FIELDS TERMINATED BY ','" + " ENCLOSED BY '\"'"
+ " LINES TERMINATED BY '\n' " + "IGNORE 1 LINES(.....)" + " SET test_date = STR_TO_DATE(@var1, '%d/%m/%Y %k:%i')";
However, one of the columns in the source file contains some odd data, viva Y31L.RastaMod䋢_Version, and the program refuses to import it into MySQL and keeps throwing this error:
java.sql.SQLException: Invalid utf8 character string: 'viva Y31L.RastaMod'
I searched for this but can't really work out what the error means, other than that the input format of the string "viva Y31L.RastaMod䋢_Version" is wrong and doesn't fit the utf8 format used in the MySQL database?
However, I have already run SET NAMES UTF8MB4 in my MySQL database, since other questions suggested that UTF8MB4 is more flexible in accepting unusual characters.
I explored this further by manually inserting the odd data into the MySQL table from the Command Prompt, which worked fine. In fact, the table displayed almost the full entry: viva Y31L.RastaMod?ã¢_Version. But when I run my program from the IDE, the file gets rejected.
Would appreciate any explanations.
A second, minor question related to importing the CSV file into MySQL:
I noticed that I couldn't import a copy of the same file into the MySQL database. The errors said that the data was a duplicate. Is that because MySQL rejects duplicate column data? When I changed all the data in one column of the copied file and left the rest the same, it was imported correctly. Why is that so?
I don't think this immediate error is caused by the destination being unable to cope with UTF-8 characters, but rather by the way you are using LOAD DATA. You can try specifying the character set that should be used when loading the data. Consider the following LOAD DATA command, which is what you had originally, slightly modified:
LOAD DATA LOCAL INFILE 'path/to/file' INTO TABLE source_data_android_cell
CHARACTER SET utf8
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES(.....)
SET test_date = STR_TO_DATE(@var1, '%d/%m/%Y %k:%i')
That said, you should also make sure that the target table uses a character set that supports the data you are trying to load into it.
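Since the question builds the statement in Java, here is a minimal sketch of how the CHARACTER SET clause could be placed when concatenating the query. The helper name build, the file path, and the column list passed in main are hypothetical (the real column list is elided in the question), and the user variable is written @var1, MySQL's syntax for user variables. The clause order follows the LOAD DATA grammar: CHARACTER SET comes after INTO TABLE and before FIELDS.

```java
public class LoadDataQuery {
    // Builds the LOAD DATA statement with an explicit CHARACTER SET clause.
    // 'columns' is supplied by the caller because the real column list is not known here.
    static String build(String file, String charset, String columns) {
        return "LOAD DATA LOCAL INFILE '" + file + "'"
                + " INTO TABLE source_data_android_cell"
                + " CHARACTER SET " + charset
                + " FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"
                + " LINES TERMINATED BY '\\n'"   // sends the two characters \n to MySQL
                + " IGNORE 1 LINES " + columns
                + " SET test_date = STR_TO_DATE(@var1, '%d/%m/%Y %k:%i')";
    }

    public static void main(String[] args) {
        // Hypothetical path and column list, for illustration only.
        System.out.println(build("/tmp/cells.csv", "utf8mb4", "(@var1, cell_id)"));
    }
}
```

The charset named here should describe how the file itself is encoded, so if the CSV was saved in something other than UTF-8, that encoding's MySQL name would go in its place.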
I connect to an Oracle database whose NLS_CHARACTERSET is WE8ISO8859P1, which, as far as I know, cannot store Arabic text.
But Toad for Oracle can read Arabic from this database:
However, I cannot read it from Java code.
I even tried to get a row as raw bytes using UTL_RAW.CAST_TO_RAW.
The result was "218,227,237,225,228,199,32,199,225,218,210,237,210,161,225,222,207,32,199,211,202,229,225,223,202,32,32,56,48,37,32,227,228,32,230,205,207,199,202,32,221,225,237,223,211,32,32,32"
In a test Java class, I tried to create a new String(new char[]{}) from the bytes above, but had no luck displaying the Arabic characters.
Any help? Thank you.
This could be caused by quite a few things:
Check the column type in the database: it should be NVARCHAR, not VARCHAR (note the "N" at the beginning).
Try adding charset=utf8 to the connection string.
Convert the byte[] to a String using UTF-8 encoding, like this:
String arabicText = new String(byteArray, "UTF-8");
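The charset passed to new String has to match how the bytes were actually written. The byte values quoted in the question (218, 227, 237, ...) are all single bytes above 127 with no multi-byte UTF-8 lead/continuation pattern, which suggests a single-byte Arabic code page such as Windows-1256 rather than UTF-8. That is an inference from the values, not something the poster confirmed. A minimal sketch, assuming the JVM ships the windows-1256 charset (standard JDK builds do, minimal runtimes may not), with a hypothetical helper named decode:

```java
import java.nio.charset.Charset;

public class DecodeArabicBytes {
    // Converts unsigned byte values (0..255), as printed by the database, to a String
    // using the named charset.
    static String decode(int[] unsignedBytes, String charsetName) {
        byte[] raw = new byte[unsignedBytes.length];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) unsignedBytes[i]; // the cast keeps the low 8 bits
        }
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The first six values from the question, up to the first space (32).
        int[] firstWord = {218, 227, 237, 225, 228, 199};
        System.out.println(decode(firstWord, "windows-1256"));
    }
}
```

Whether the console shows Arabic letters also depends on the terminal's own encoding, so comparing code points can be more reliable than reading the printed output.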
I am fetching tweets from Twitter and storing them in a database for future use. I am using UTF-8 encoding in my driver, utf8mb4_bin in my VARCHAR fields, and utf8mb4_general_ci as the server collation. The problem is that inserting a value into a VARCHAR field throws an exception if the text contains any binary code, since VARCHAR utf8 does not accept binary.
Here is an example, I am fetching the text from here and try inserting it in my database and I get the error:
Incorrect string value: '\xF0\x9F\x98\xB1\xF0\x9F...' for column 'fullTweet' at row 1
My guess is that the two emoticons are causing this. How do I get rid of them before inserting the tweet text in my database?
Update:
Looks like I can manually enter the emoticons. I run this query:
INSERT INTO `tweets`(`id`, `createdAt`, `screenName`, `fullTweet`, `editedTweet`) VALUES (450,"1994-12-19","john",_utf8mb4 x'F09F98B1',_utf8mb4 x'F09F98B1')
and this is what the row in the table looks like:
You can remove non-ASCII characters from the tweet string before inserting it.
tweetStr = tweetStr.replaceAll("[^\\p{ASCII}]", "");
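Note that the replaceAll above also removes accented and non-Latin letters. If the goal is only to drop characters that need four bytes in UTF-8 (the \xF0\x9F\x98\xB1 in the error is U+1F631, an emoji outside the Basic Multilingual Plane), a narrower filter is possible. A minimal sketch, with hypothetical class and method names:

```java
public class TweetSanitizer {
    // Removes only supplementary code points (above U+FFFF), which are the ones
    // that need four bytes in UTF-8. All other characters are kept.
    static String stripSupplementary(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints()
         .filter(cp -> !Character.isSupplementaryCodePoint(cp))
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F631 written as a UTF-16 surrogate pair, twice, as in the error message.
        String tweet = "wow \uD83D\uDE31\uD83D\uDE31 ok";
        System.out.println(stripSupplementary(tweet));
    }
}
```

This loses the emoji, so it is a workaround rather than a substitute for getting utf8mb4 working end to end.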
It looks like utf8mb4 support is still not configured correctly.
In order to use utf8mb4 in your fields you need to do the following:
Set character-set-server=utf8mb4 in your my.ini or my.cnf. Only character-set-server really matters here, other settings don't.
Add characterEncoding=UTF-8 to connection URL:
jdbc:mysql://localhost:3306/db?characterEncoding=UTF-8
Configure collation of the field
I am using MySQL Query Browser to store the following names in the Person table, which has the fields personNumber and personName. The character set of personName is UTF-8, and if I insert a name through the query browser the query runs correctly, but when I try the same through JDBC or JPA, the special characters in the name become '?'. What is the problem here?
The names are
1.Năstase
2.Hrustanović
3.Ogris-Martič and some similar names.
Have you set your connection string correctly?
jdbc:mysql://localhost:3306/administer?characterEncoding=utf8
Try this code
jdbc:mysql://localhost:3306/MY_DB?useUnicode=yes&characterEncoding=UTF8
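The same two settings can also be passed as connection properties instead of being appended to the URL, which avoids having to escape the & when the URL lives in an XML config file. A minimal sketch, assuming MySQL Connector/J is on the classpath when open is actually called. The class and method names are hypothetical, and main only prints the properties without connecting.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class UnicodeConnection {
    // Collects the credentials plus the two encoding settings from the URLs above.
    static Properties unicodeProps(String user, String password) {
        Properties p = new Properties();
        p.setProperty("user", user);
        p.setProperty("password", password);
        p.setProperty("useUnicode", "true");
        p.setProperty("characterEncoding", "UTF-8");
        return p;
    }

    // Opens a connection with those properties; requires a reachable server and the driver.
    static Connection open(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, unicodeProps(user, password));
    }

    public static void main(String[] args) {
        Properties p = unicodeProps("app_user", "secret"); // placeholder credentials
        System.out.println(p.getProperty("useUnicode"));
        System.out.println(p.getProperty("characterEncoding"));
    }
}
```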