I have three text fields (ID, Name, Address) that map to columns in a DB table. When I try to insert Arabic characters into that table, they appear as "?????".
I am connecting to the DB using JDBC.
My database info: MySQL Server 5.5.14 Community
Server characterset: latin1
Db characterset: utf8
Client characterset: latin1
Conn. characterset: latin1
I tried to encode the strings using the following code:
private String ArEncode(String text) {
    String txt = "";
    try {
        Charset cset = Charset.forName("utf8");
        CharsetEncoder encoder = cset.newEncoder();
        CharsetDecoder decoder = cset.newDecoder();
        ByteBuffer buffer = encoder.encode(CharBuffer.wrap(text));
        txt = buffer.asCharBuffer().toString();
    } catch (Exception ex) {
        Logger.getLogger(UserView.class.getName()).log(Level.SEVERE, null, ex);
    }
    return txt;
}
The returned string "txt" is then inserted into the database.
Note: when I insert values directly into the DB from NetBeans, they are inserted correctly and the Arabic characters appear correctly.
Why would you do this? It's the job of the MySQL JDBC driver to encode strings correctly in the declared database character set. Set the character set in the database to UTF-8, set the JDBC options correctly, and just do a plain insert of a plain old String.
Your encoding logic, if it worked correctly, would return exactly the same value in txt as in text, so it is not needed.
And: set the connection character set to utf8; the JDBC driver will/should take care of the rest.
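A minimal sketch of that approach might look like this (the table and column names, database name and credentials are placeholders; the column is assumed to already be CHARACTER SET utf8):

// Minimal sketch of the plain-insert approach described above. The table and
// column names, database name and credentials are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PlainUtf8Insert {
    public static void main(String[] args) throws Exception {
        // useUnicode/characterEncoding make Connector/J talk UTF-8 to the server
        String url = "jdbc:mysql://localhost:3306/mydb"
                + "?useUnicode=true&characterEncoding=UTF-8";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO person (id, name, address) VALUES (?, ?, ?)")) {
            ps.setInt(1, 1);
            ps.setString(2, "محمد");     // plain Java String, no manual encoding
            ps.setString(3, "القاهرة");
            ps.executeUpdate();
        }
    }
}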
Related
The schema and tables all have a charset and collation of latin1, but if I try to retrieve Chinese characters from a table, it only gives ?????. I don't have access to change the schema or table properties. How can I convert the charset to the actual characters in JDBC?
I do an HTTP GET call in Java to get content which may contain Spanish characters, for example: Ñañez
But what I get as a response from MySQL is: Ñañez
So far I have searched online and done the following:
Appended UTF-8 as the encoding in the connection string (using Java):
jdbc:mysql://localhost:3306/dbname?useUnicode=true&characterEncoding=UTF-8
Updated the table's encoding
ALTER TABLE test CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
The problem is still there.
Anything I am missing?
The server is Tomcat 6.
Try altering the table column:
ALTER TABLE `test` CHANGE `columnname` `columnname` VARCHAR(200)
CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;
You must run this query before your insert query in MySQL:
SET NAMES 'utf8'
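If you issue it from Java, it has to run on the same connection that later performs the insert. A minimal sketch, assuming an open Connection and the test/columnname names from the ALTER above:

// Sketch only: runs SET NAMES over the same JDBC connection that performs the
// insert. The table and column names are the placeholders used above.
static void insertWithSetNames(java.sql.Connection conn, String value) throws java.sql.SQLException {
    try (java.sql.Statement st = conn.createStatement()) {
        st.execute("SET NAMES 'utf8'");
    }
    try (java.sql.PreparedStatement ps =
             conn.prepareStatement("INSERT INTO test (columnname) VALUES (?)")) {
        ps.setString(1, value);
        ps.executeUpdate();
    }
}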
Mojibake is usually caused by the following:
The bytes you have in the client are correctly encoded in utf8 (good).
You connected with SET NAMES latin1 (or set_charset('latin1'), or similar), probably by default. (It should have been utf8.)
The column in the table may or may not have been CHARACTER SET utf8, but it should have been.
The fix: include characterEncoding=utf-8 in the connection string.
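To see which of these applies in your case, you can dump the session character-set variables over the same connection. A small diagnostic sketch, assuming an open JDBC Connection:

// Diagnostic sketch: prints the session character-set variables so you can see
// what the connection actually negotiated. Assumes an open JDBC Connection.
static void dumpCharsetVariables(java.sql.Connection conn) throws java.sql.SQLException {
    try (java.sql.Statement st = conn.createStatement();
         java.sql.ResultSet rs = st.executeQuery("SHOW VARIABLES LIKE 'character_set%'")) {
        while (rs.next()) {
            System.out.println(rs.getString(1) + " = " + rs.getString(2));
        }
    }
}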
I connect to an Oracle database whose NLS_CHARACTERSET is WE8ISO8859P1, which as far as I know cannot store Arabic text.
But Toad for Oracle can read Arabic from this database.
However, I cannot read it using Java code.
I even tried to get a row of the data as bytes using UTL_RAW.CAST_TO_RAW.
The result was "218,227,237,225,228,199,32,199,225,218,210,237,210,161,225,222,207,32,199,211,202,229,225,223,202,32,32,56,48,37,32,227,228,32,230,205,207,199,202,32,221,225,237,223,211,32,32,32"
In a test Java class, I tried to create a new String from the above-mentioned bytes, with no luck displaying the Arabic characters.
Any help? Thank you.
This could be caused by quite a few things:
Check the column type in the database: it should be NVARCHAR, not VARCHAR (note the "N" at the beginning of the word).
Try putting charset=utf8 in the connection string.
Convert the byte[] to a String using UTF-8 encoding, like this:
String arabicText = new String(byteArray, "UTF-8");
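Put together, a sketch of reading the column as raw bytes and decoding them explicitly might look like this (the table and column names are placeholders; which charset actually matches the stored bytes depends on how the data was originally written):

// Sketch: read the column as raw bytes and decode with an explicit charset.
// Table and column names are placeholders; the charset that matches the stored
// bytes depends on how the data was written.
static void readArabicColumn(java.sql.Connection conn) throws Exception {
    try (java.sql.Statement st = conn.createStatement();
         java.sql.ResultSet rs = st.executeQuery(
                 "SELECT UTL_RAW.CAST_TO_RAW(text_col) FROM my_table")) {
        while (rs.next()) {
            byte[] byteArray = rs.getBytes(1);
            String arabicText = new String(byteArray, "UTF-8"); // or "Cp1256", etc.
            System.out.println(arabicText);
        }
    }
}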
I got problems while reading Arabic characters from Oracle in Java using the JDBC driver. The main problem was that I couldn't find the proper character encoding to get the correct data, but I solved the problem manually using this method:
public static String cleanORCLString(String s) throws UnsupportedEncodingException {
    byte[] bytes = s.getBytes("UTF16");
    String x = new String(bytes, "Cp1256");
    String finalS = x.substring(3);
    StringBuilder sb = new StringBuilder(finalS);
    for (int k = sb.length() - 1; k > 0; k--) {
        if (!isEven(k)) {
            sb.deleteCharAt(k);
        }
    }
    return sb.toString();
}

// isEven was not shown in the original post; the usual definition is assumed here:
private static boolean isEven(int n) {
    return n % 2 == 0;
}
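For illustration, a hypothetical call site could look like this (the query and column name are placeholders, not part of the original code):

// Hypothetical call site, for illustration only; query and column name are placeholders.
static void printCleanedValues(java.sql.Connection conn) throws Exception {
    try (java.sql.Statement st = conn.createStatement();
         java.sql.ResultSet rs = st.executeQuery("SELECT name_col FROM my_table")) {
        while (rs.next()) {
            System.out.println(cleanORCLString(rs.getString(1)));
        }
    }
}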
This method gives me the correct characters as shown in the database, but when I try to update/insert Arabic data, it saves the wrong characters.
For example, my text is saved in the database as "?????????" instead of "مرحبا".
This is the way I connect to the Oracle database:
URL = ORCLConnProperties.ORCL_THIN_PREFIX + orclProp.getIpAddress()
        + orclProp.getPortNumber() + ORCLConnProperties.ORCL_THIN_SUFIX;
// URL = jdbc:oracle:thin:@10.0.0.12:1521:ORCL
System.out.println("URL: " + URL);

Properties connectionProps = new Properties();
connectionProps.put("characterEncoding", "Cp1256");
connectionProps.put("useUnicode", "true");
connectionProps.put("user", orclProp.getUserName());
connectionProps.put("password", orclProp.getPassword());

try {
    Class.forName("oracle.jdbc.driver.OracleDriver");
} catch (ClassNotFoundException ex) {
    System.out.println("Error: unable to load driver class!");
    System.exit(1);
}
myDriver = new oracle.jdbc.driver.OracleDriver();
DriverManager.registerDriver(myDriver);
conn = DriverManager.getConnection(URL, connectionProps);
Please help me solve this issue.
Thanks.
New note:
The database itself doesn't use the UTF16 character set, but:
"the JDBC OCI driver transfers the data from the server to the client
in the character set of the database. Depending on the value of the
NLS_LANG environment variable, the driver handles character set
conversions: OCI converts the data from the database character set to
UTF-8. The JDBC OCI driver then passes the UTF-8 data to the JDBC
Class Library, where the UTF-8 data is converted to UTF-16."
This note is taken from:
http://docs.oracle.com/cd/B10501_01/java.920/a96654/advanc.htm
First, you can check the NLS_CHARACTERSET parameter of your database using the following SQL*Plus command:
select * from v$nls_parameters where parameter = 'NLS_CHARACTERSET';
The result should be:
PARAMETER                 VALUE
NLS_CHARACTERSET          AR8MSWIN1256
If it's not, you have to change the value of this parameter:
Hit Windows key + R on your keyboard.
Type SQLPLUS sys as sysdba.
Press Enter, then enter the password (or just hit Enter again).
Issue the following commands:
SHUTDOWN IMMEDIATE
STARTUP RESTRICT
ALTER DATABASE CHARACTER SET INTERNAL_USE AR8MSWIN1256;
ALTER DATABASE CHARACTER SET AR8MSWIN1256;
SHUTDOWN IMMEDIATE
STARTUP
Change the value of the NLS_LANG registry string to AMERICAN_AMERICA.AR8MSWIN1256.
If your operating system is a flavor of UNIX, use AR8ISO8859P6 instead of AR8MSWIN1256 as the value of NLS_CHARACTERSET.
DON'T use national datatypes (i.e. NVARCHAR, NTEXT, or NCLOB) in your database unless you are going to use languages other than Arabic and English inside your database.
The AR8MSWIN1256 character set is sufficient for mixing Arabic and English in the same field (as far as I know).
TAKEN FROM
https://www.youtube.com/watch?v=zMphHE78imM
https://ksadba.wordpress.com/2008/06/10/how-to-show-arabic-characters-in-your-client-app/
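For completeness, the same NLS_CHARACTERSET check can also be done from Java over JDBC; a small sketch, assuming an open Connection to the Oracle instance:

// Sketch: the same NLS_CHARACTERSET check done over JDBC instead of SQL*Plus.
// Assumes an open Connection to the Oracle instance.
static void printDatabaseCharset(java.sql.Connection conn) throws java.sql.SQLException {
    try (java.sql.Statement st = conn.createStatement();
         java.sql.ResultSet rs = st.executeQuery(
                 "SELECT value FROM v$nls_parameters WHERE parameter = 'NLS_CHARACTERSET'")) {
        if (rs.next()) {
            System.out.println("NLS_CHARACTERSET = " + rs.getString(1));
        }
    }
}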
Check your Oracle version; if it is old, it may not support UTF16.
Here is an article that may be useful:
http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch6unicode.htm
I'm doing the following:
private void doSomething(ScrollableResults scrollableResults) {
    while (scrollableResults.next()) {
        Object[] result = scrollableResults.get();
        String columnValue = (String) result[0];
    }
}
I tried this on two computers.
On the first it works fine. It is a Windows 7 machine, and System.getProperty("file.encoding") returns Cp1252.
On the second, when the word in the database has accents, columnValue gets strange values. It is a CentOS machine, and System.getProperty("file.encoding") returns UTF-8.
Both databases are MySQL, with charset latin1 and collation latin1_swedish_ci.
What should I do to correct this?
My suggestion would be to use UTF-8 everywhere:
at the database/table level (the following ALTER will change the character set not only for the table itself, but also for all existing textual columns):
ALTER TABLE <some table> CONVERT TO CHARACTER SET utf8
in the connection string (which is required with MySQL's JDBC driver, or it will use the client's encoding):
jdbc:mysql://localhost:3306/db_name?useUnicode=yes&characterEncoding=UTF-8
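Since the question reads rows through Hibernate's ScrollableResults, the same URL can also be supplied through the Hibernate configuration. A minimal sketch (hibernate.connection.* are standard Hibernate property names; the database name and credentials are placeholders):

// Sketch: pass the UTF-8 aware JDBC URL through Hibernate's configuration.
// hibernate.connection.* are standard Hibernate settings; the database name
// and credentials below are placeholders.
org.hibernate.cfg.Configuration cfg = new org.hibernate.cfg.Configuration();
cfg.setProperty("hibernate.connection.url",
        "jdbc:mysql://localhost:3306/db_name?useUnicode=yes&characterEncoding=UTF-8");
cfg.setProperty("hibernate.connection.username", "user");
cfg.setProperty("hibernate.connection.password", "password");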
References
MySQL 5.0 Reference Manual
9.1.3.2. Database Character Set and Collation
9.1.3.3. Table Character Set and Collation
Connector/J (JDBC) Reference
20.3.4.4. Using Character Sets and Unicode