How to read Unicode text from a Java ResultSet?
rs.getString() returns a Java String, which is Unicode by definition.
If you get mangled characters, you have to configure your database driver to use the right encoding for the connection to the database.
Just read the strings. All strings in Java are Unicode already. If you're having problems, then:
It could be a diagnostic problem - you may be reading the right data out of the ResultSet but displaying it so it looks like you haven't read it properly
It could be a configuration problem - there may be something you need to do when connecting to the database so that it determines the right encoding to use
It could be a database problem - the database may not be configured to store full Unicode data
It could be a database schema problem - the particular column you're using may be configured using a column type which doesn't support full Unicode
It could be a problem in the data, e.g. with another program incorrectly submitting data.
I've seen all of these before. You should use detailed logging (e.g. of the individual characters, in hex) to work out whether you've read the data correctly or not - that will tell you where to look next.
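A minimal sketch of that kind of diagnostic logging (plain Java, no database needed; the sample strings are just illustrations):

```java
// Dump each character of a string as its Unicode code point in hex,
// so you can compare what you actually read with what you expected.
public class CharDump {
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // If the driver mis-decoded the bytes, you would see e.g. the pair
        // U+00C3 U+00A9 (Ã©) here instead of the single code point U+00E9 (é).
        System.out.println(dump("é"));   // prints "U+00E9"
    }
}
```

Comparing such a dump against the code points you expect tells you immediately whether the ResultSet handed you correct or mangled data.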
If you are using a DataSource (e.g. com.mysql.jdbc.jdbc2.optional.MysqlDataSource), you can set the connection encoding directly with ds.setEncoding("UTF-8").
SELECT * FROM `employee` WHERE `name` LIKE "%شريف%"
The query above works fine and finds the row when run through phpMyAdmin, but running it from my JavaFX application returns nothing.
English searches work, so what do I need to add in Java to allow searching in Arabic?
As per my comments above, I suspect this is an encoding/decoding issue in Java and has nothing specific to do with JavaFX; I also assume you are not getting any exceptions. You have to use the same encoding consistently when inserting as well as when retrieving data. Helpful information is in "How to store arabic text in mysql database using python?"
Refer to the article "Byte Encodings and Strings" on how to work with bytes so your application is always properly internationalized.
Also see "How can I insert arabic word to mysql database using java" for how to set the encoding in Java.
Your console is probably UTF-8 enabled, which is why you can match strings there and see Arabic characters.
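For MySQL, the connection encoding is usually the culprit. A hedged sketch of forcing UTF-8 on the JDBC URL and using a parameterized query for the Arabic term (the employee table comes from the question; host, schema and the helper names are placeholders, and the connection itself is not opened here):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;

public class ArabicSearch {
    // Force UTF-8 on the connection so Connector/J sends and receives
    // Arabic text correctly. Host and schema are placeholders.
    static String buildUrl(String host, String schema) {
        return "jdbc:mysql://" + host + "/" + schema
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Hypothetical search helper: a parameterized LIKE avoids any quoting
    // or escaping issues with the Arabic term itself.
    static PreparedStatement searchByName(Connection con, String name)
            throws Exception {
        PreparedStatement ps = con.prepareStatement(
                "SELECT * FROM employee WHERE name LIKE ?");
        ps.setString(1, "%" + name + "%");
        return ps;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("localhost:3306", "company"));
    }
}
```

With the URL built this way, the same LIKE search that works in phpMyAdmin should return the same rows from Java.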
Hope it helps.
I tried using Teradata FastLoad.
Here is the sample file that they provide on the official website:
L_INDEX,L_TIMESTAMP,L_TEXT
1,2010-08-11 13:19:05.1,some text
2,2010-08-11 13:19:05.1,
3,2010-08-11 13:19:05.1,more text
4,,text
5,,
It runs perfectly with the file above.
Then I modified ONLY the first data row, so that some text became "some, text". The following is perfectly legitimate CSV:
L_INDEX,L_TIMESTAMP,L_TEXT
1,2010-08-11 13:19:05.1,"some, text" // this row was slightly modified
2,2010-08-11 13:19:05.1,
3,2010-08-11 13:19:05.1,more text
4,,text
5,,
However, I got an error saying that the first row contains 4 values when only 3 were expected.
As far as I understand, I must specify the text qualifier ". How can I do this?
I read the documentation, but nothing is mentioned about this.
According to the FastLoad Utility documentation pertaining to the selection of a delimiter for use with the SET RECORD command and a VARTEXT layout:
Any character sequence that appears in the data cannot be used as a
delimiter. No control character other than a tab character can be used
in a delimiter.
This would likely extend to the use of the FastLoad API mechanism leveraged within the Teradata JDBC driver.
EDIT
FastLoad has been around for 15+ years and does what it was intended to do well -- load lots of data fast. Your other options would be to create a fixed-length record, so that you do not have to rely on a delimiter at all, or to create an INMOD to parse the file as it is streamed into FastLoad.
Other alternatives include MultiLoad, Teradata Parallel Transporter, TPUMP, or a proper ETL tool to load your data. Each has its own advantages and disadvantages that have to be weighed against the format of the data being supplied to the environment.
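Another workaround, since VARTEXT mode has no text-qualifier option, is to pre-process the file yourself: parse the quoted CSV properly and re-emit it with a delimiter that never occurs in the data. The pipe character below is only an assumption about your data, and the sketch deliberately ignores escaped quotes and embedded newlines, which a real file might need a full CSV parser for:

```java
// Convert one quoted CSV line into pipe-delimited VARTEXT, honoring
// double quotes so embedded commas survive. Assumes no "" escapes and
// no embedded newlines.
public class CsvToVartext {
    static String convertLine(String line) {
        StringBuilder out = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;   // toggle on qualifier, drop it
            } else if (c == ',' && !inQuotes) {
                out.append('|');        // field boundary -> new delimiter
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convertLine("1,2010-08-11 13:19:05.1,\"some, text\""));
        // prints: 1|2010-08-11 13:19:05.1|some, text
    }
}
```

You would then point FastLoad at the converted file with SET RECORD VARTEXT and the pipe as the delimiter.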
I wrote a Java class that runs on as/400 to build an XML file from DB2 data.
I access the DB using the native driver (com.ibm.db2.jdbc.app.DB2Driver), and the data is Croatian, with special characters as in PETROVEÅKI, VRANIÑ and so on.
The DB table holding the data written into the XML file has CCSID 65535.
My first problem is that the driver doesn't accept the job's default CCSID (65535), so in my calling CL program I do a CHGJOB setting the CCSID to 870 (which should be the Croatian one).
With this setting I can run the Java class successfully, but the special characters are translated into something that does not match the original characters.
This is the first time I have worked with these special characters and I don't know how to solve it.
Any hint would be appreciated.
If the job CCSID is 65535, probably all of the files are too. Try changing the CCSID of the file to 870 and see if the translation works then. 65535 means 'binary - do not translate'.
I'm Vietnamese, and I use Unicode characters such as 'Việt Nam', â, ẵ, ấ, ị, đ, Đ, Ệ, Ố, ư... I'm working on an exercise that involves inserting data into and retrieving it from a database, using Java. I cannot retrieve the data without many errors in these characters. Who can help me?
It would be easier to understand your problem with the exact error message. But as I understand you, your problem is that you have something in the database (and it's OK there), but when you want to read and display it you run into problems. With this background, I think:
You have to use UTF-8 to store and read the data.
Your application has to support your locale (for instance, as far as I know the JDK doesn't support the fa_IR locale, so that may be your problem).
But as there are lots of Vietnamese Java applications, your problem most likely lies in your data access layer.
Provide the complete error and ensure that your data is stored correctly in your database.
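The Java String itself handles Vietnamese without loss; what matters is that every boundary (JDBC connection, column character set) uses UTF-8. A small runnable check that the characters survive a UTF-8 byte round-trip, and what the typical mojibake looks like when a wrong charset is applied (the database side is assumed, not shown):

```java
import java.nio.charset.StandardCharsets;

public class VietnameseRoundTrip {
    public static void main(String[] args) {
        String s = "Việt Nam";
        // Encode to the bytes a UTF-8 connection would send, then decode
        // them back the way a correctly configured driver would.
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        String back = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(s.equals(back));  // true: nothing is lost in UTF-8
        // Decoding the same bytes with a wrong charset (here ISO-8859-1)
        // reproduces the garbled output you see when the connection
        // character set is misconfigured.
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

If the round-trip works in plain Java but the database output is garbled, the misconfiguration is in the connection or the column definition, not in your Java strings.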
Try Java 7 if you want to use an Access database.
Otherwise, use a different DBMS for which Java currently supports queries and updates with Unicode characters.
Java Programming with Vietnamese
I am reading a column from a database using the rs.getString() method; the column has some multibyte data.
When retrieved through rs.getString(), the data gets garbled and all multibyte characters appear as ??????.
Please suggest what should be done.
I have tried using -Dfile.encoding=UTF8, but that does not help.
Do you have the relevant language sets installed on the machine where you are trying to decode the data? Can you get the data out correctly using any method?
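One way to answer that, and to see whether the damage happens in the driver's conversion or earlier, is to fetch the raw bytes with rs.getBytes() and decode them yourself with an explicit charset. The column name and the assumption that the column is stored as UTF-8 are hypothetical; the decoding itself is plain Java:

```java
import java.nio.charset.Charset;
import java.sql.ResultSet;

public class RawDecode {
    // Decode raw column bytes with an explicit charset instead of trusting
    // the driver's conversion (which produces '?' when it chooses wrongly).
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Hypothetical usage against a live ResultSet:
    static String readColumn(ResultSet rs, String column) throws Exception {
        return decode(rs.getBytes(column), "UTF-8");
    }

    public static void main(String[] args) {
        // The UTF-8 bytes for "日本", decoded with the right charset.
        byte[] raw = {(byte) 0xE6, (byte) 0x97, (byte) 0xA5,
                      (byte) 0xE6, (byte) 0x9C, (byte) 0xAC};
        System.out.println(decode(raw, "UTF-8"));   // prints 日本
    }
}
```

If this explicit decode yields correct text while rs.getString() yields ??????, the bytes in the database are fine and the connection or driver charset is what needs fixing.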