utf8_unicode_ci String is inserted incorrectly? - java

I have a Java application that performs various operations on a MySQL database. The problem is that a UTF-8 string is not inserted correctly. The database charset is utf8 and I have set the collation to utf8_unicode_ci. The server connection collation is also utf8_unicode_ci. When I insert data from phpMyAdmin it is stored correctly, but when I insert it from the Java application using jOOQ it is not. Example:
Result<ExecutorsRecord> executorsRecord =
context.insertInto(EXECUTORS, EXECUTORS.ID, EXECUTORS.NAME, EXECUTORS.SURNAME, EXECUTORS.REGION, EXECUTORS.PHONE, EXECUTORS.POINTS, EXECUTORS.E_TYPE)
.values(id, name, surname, region, phone, 0, type)
.returning(EXECUTORS.ID)
.fetch();
With name = "Бобр" and surname = "Добр", this produces a row with ???? as the name and ???? as the surname. I have checked both strings; they are passed to the method correctly.

As @spencer7593 suggested, the problem could be in the JDBC connector. So I appended ?characterEncoding=utf8 to the connection URL, so that the final URL was "jdbc:mysql://localhost:3306/mydb?characterEncoding=utf8", where mydb is the name of the database. This sorted out my problem. I would also like to quote the following statement (again by @spencer7593):
When we've got things configured correctly, and things aren't working, our go-to suspect is the JDBC driver. To get timezone differences between the JVM and the MySQL database sorted out, to prevent the JDBC driver from "helping" by doing an illogical combination of various operations, we had to add two extra obscurely documented settings to the connection string.
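The same setting can also be passed through a Properties object instead of the URL query string. The following is a minimal sketch, assuming the mydb database from the question; the class name, the helper names and the credentials are hypothetical placeholders, not part of the original code:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

final class MySqlUtf8Connection {
    // URL from the question, without the query string.
    static final String URL = "jdbc:mysql://localhost:3306/mydb";

    // Builds the connection properties; user and password are placeholders.
    static Properties utf8Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Intended to match appending ?characterEncoding=utf8 to the URL.
        props.setProperty("characterEncoding", "utf8");
        return props;
    }

    // Opens the connection; requires the MySQL driver on the classpath.
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(URL, utf8Properties(user, password));
    }

    public static void main(String[] args) {
        // Only inspects the properties, so no database is needed to run this.
        Properties props = utf8Properties("appUser", "secret");
        System.out.println(props.getProperty("characterEncoding"));
    }
}
```

Keeping the encoding in one helper means every connection the application opens gets the same setting, which is harder to guarantee when URLs are assembled in several places.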
Related

Charset issues in ojdbc7/ojdbc8 vs. correct behaviour in ojdbc6

We have an Oracle database with the following charset settings
SELECT parameter, value FROM nls_database_parameters WHERE parameter like 'NLS%CHARACTERSET'
NLS_NCHAR_CHARACTERSET: AL16UTF16
NLS_CHARACTERSET: WE8ISO8859P15
In this database we have a table with a CLOB field, which has a record that starts with the following string, stored obviously in ISO-8859-15: X²ARB (here correctly converted to unicode, in particular that 2-superscript is important and correct).
Then we have the following trivial piece of code to get the value out, which is supposed to automatically convert the charset to unicode via globalization support in Oracle:
private static final String STATEMENT = "SELECT data FROM datatable d WHERE d.id=2562456";
public static void main(String[] args) throws Exception {
Class.forName("oracle.jdbc.driver.OracleDriver");
try (Connection conn = DriverManager.getConnection(DB_URL);
ResultSet rs = conn.createStatement().executeQuery(STATEMENT))
{
if (rs.next()) {
System.out.println(rs.getString(1).substring(0, 5));
}
}
}
Running the code prints:
with ojdbc8.jar and orai18n.jar: X�ARB -- incorrect
with ojdbc7.jar and orai18n.jar: X�ARB -- incorrect
with ojdbc-6.jar: X²ARB -- correct
By using UNISTR and changing the statement to SELECT UNISTR(data) FROM datatable d WHERE d.id=2562456 I can bring ojdbc7.jar and ojdbc8.jar to return the correct value, but this would require an unknown number of changes to the code as this is probably not the only place where the problem occurs.
Is there anything I can do to the client or server configurations to make all queries return correctly encoded values without statement modifications?
It definitely looks like a bug in the JDBC thin driver (I assume you're using thin). It could be related to LOB prefetch where the CLOB's length, character set id and the first part of the LOB data is sent inband. This feature was introduced in 11.2. As a workaround, you can disable lob prefetch by setting the connection property
oracle.jdbc.defaultLobPrefetchSize
to "-1". Meanwhile I'll follow up on this bug to make sure that it gets fixed.
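A minimal sketch of applying that workaround when opening the connection, assuming the property is passed as a regular connection property as described above. The class and method names are hypothetical, and the URL and credentials are supplied by the caller:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

final class LobPrefetchWorkaround {
    // Property name and value as given in the answer above.
    static final String LOB_PREFETCH_PROPERTY = "oracle.jdbc.defaultLobPrefetchSize";

    // Copies the caller's properties (user, password, ...) and disables LOB prefetch.
    static Properties withoutLobPrefetch(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty(LOB_PREFETCH_PROPERTY, "-1");
        return props;
    }

    // Opens the connection; requires the Oracle driver on the classpath.
    static Connection open(String dbUrl, Properties credentials) throws SQLException {
        return DriverManager.getConnection(dbUrl, withoutLobPrefetch(credentials));
    }

    public static void main(String[] args) {
        Properties credentials = new Properties();
        credentials.setProperty("user", "appUser"); // placeholder value
        Properties props = withoutLobPrefetch(credentials);
        System.out.println(props.getProperty(LOB_PREFETCH_PROPERTY));
    }
}
```

Because the setting lives in the connection setup, none of the existing SELECT statements need to change, which is what the question asked for.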
Please have a look at Database JDBC Developer's Guide - Globalization Support
The basic Java Archive (JAR) file ojdbc7.jar, contains all the
necessary classes to provide complete globalization support for:
CHAR or VARCHAR data members of object and collection for the character sets US7ASCII, WE8DEC, WE8ISO8859P1, WE8MSWIN1252, and UTF8.
To use any other character sets in CHAR or VARCHAR data members of
objects or collections, you must include orai18n.jar in the CLASSPATH
environment variable:
ORACLE_HOME/jlib/orai18n.jar

How to get Column Comments in JDBC

I want to fetch column comments using JDBC metadata, but it returns null every time. I tested with both Oracle and SQL Server; in both cases it returns null.
DatabaseMetaData dmt = con.getMetaData();
// Arguments: catalog, schema pattern, table name pattern, column name pattern
ResultSet colRs = dmt.getColumns(null, "dbo", "Student", null);
while (colRs.next()) {
    System.out.println(colRs.getString("REMARKS"));
}
All other data, such as column name and length, is returned without problems.
For Oracle you need to provide a connection property remarksReporting and set that to true or call the method setRemarksReporting() to enable that.
OracleConnection oraCon = (OracleConnection)con;
oraCon.setRemarksReporting(true);
After that, getColumns() will return the column (or table) comments in the REMARKS column of the ResultSet.
See Oracle's JDBC Reference for more details
For SQL Server this is not possible at all.
Neither the Microsoft nor the jTDS driver exposes table or column comments, probably because there is no SQL support for that in SQL Server. The usual approach of using "extended properties" with the property name MS_DESCRIPTION is not reliable, mainly because there is no requirement to use MS_DESCRIPTION as the property name. Not even sp_help returns those remarks, and at least the jTDS driver simply calls sp_help to get the table columns. I don't know what the Microsoft driver does.
The only option you have there is to use fn_listextendedproperty() to retrieve the comments, e.g.:
SELECT objname, cast(value as varchar(8000)) as value
FROM fn_listextendedproperty ('MS_DESCRIPTION','schema', 'dbo', 'table', 'Student', 'column', null)
You need to replace MS_DESCRIPTION with whatever property name you use to store your comments.

SQL Server: what are the differences between executing SQL with the JDBC driver and with a SQL client

I have a table named "T_ROLE" with a single column named "NAME" of type nvarchar(255). The SQL Server collation is "SQL_Latin1_General_CP1_CI_AS" (en_US). I want to insert Japanese characters, and I know that I need to write the SQL like this:
INSERT INTO T_ROLE(NAME) VALUES(N'japaneseString')
This succeeds.
If I run this SQL instead:
INSERT INTO T_ROLE(NAME) VALUES('japaneseString')
which has no N prefix, the value is saved as '?'. I understand this behavior.
But when I use the SQL Server JDBC driver to do the insert like this:
String sql = "INSERT INTO T_ROLE (NAME) VALUES(?)";
PreparedStatement stmt = conn.prepareStatement(sql);
stmt.setString(1, "");
// A PreparedStatement is executed without passing the SQL text again
stmt.execute();
Note that I don't use the stmt.setNString() method, yet the value is saved successfully. Why?
See this blog: https://blogs.msdn.microsoft.com/sqlcat/2010/04/05/character-data-type-conversion-when-using-sql-server-jdbc-drivers/
It turns out that the JDBC driver sends character data including varchar() as nvarchar() by default. The reason is to minimize client side conversion from Java’s native string type, which is Unicode.
So how do you force the JDBC driver not to behave this way? There is a connection string property, named sendStringParametersAsUnicode. By default it’s true.
One would ask what if I want to pass both varchar and nvarchar parameters at the same time? Well, even with the connection property set false, you can explicitly specify nvarchar type like this:
pStmt.setObject(2,Id,Types.NVARCHAR); //Java app code
A simple Google search for "sql server jdbc nvarchar" found this answer.
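The two options from the blog post can be sketched as follows. This is an illustrative sketch only: the class name, the helper names, and the host, port and database values are hypothetical, while sendStringParametersAsUnicode and the setObject call with Types.NVARCHAR come from the text above.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

final class SqlServerStringParams {
    // Builds a SQL Server JDBC URL with the Unicode behavior set explicitly.
    static String url(String host, int port, String database, boolean sendAsUnicode) {
        return "jdbc:sqlserver://" + host + ":" + port
                + ";databaseName=" + database
                + ";sendStringParametersAsUnicode=" + sendAsUnicode;
    }

    // Binds one parameter explicitly as nvarchar, regardless of the URL setting.
    static void bindNvarchar(PreparedStatement stmt, int index, String value)
            throws SQLException {
        stmt.setObject(index, value, Types.NVARCHAR);
    }

    public static void main(String[] args) {
        // Host, port and database name are placeholders.
        System.out.println(url("localhost", 1433, "mydb", false));
    }
}
```

With the property set to false, plain setString parameters are sent as varchar, so only the parameters bound through bindNvarchar keep characters that the column's code page cannot represent.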

UTF-8 Queries with JDBC

I want to run a UTF-8 query against MySQL, but it does not work. When I try the following query, the result comes back correctly:
String query = "select * from Terms where Term = 'lol'";
but the following query returns no result:
String query = "select * from Terms where Term = 'خدابخش'";
where the 'خدابخش' part is in Persian and UTF-8.
Note that the connection to the database is fine.
Chances are that you may need to set your character encoding in your JDBC connection. If you are using MySQL JDBC Connector you do it using the property characterEncoding. Somewhat like this:
jdbc:mysql://localhost/some_db?useUnicode=yes&characterEncoding=UTF-8
You may want to read the reference on encoding and character sets in your connector JDBC documentation.
This is the one that mentions the use of characterEncoding for the MySQL JDBC Connector:
Connector JDBC: Using Character Sets and Unicode
One or more of the following is true:
The Java compiler is reading the source file with an encoding different from the one in which the file was actually saved. In other words, there is a discrepancy between the encoding that your editor uses, the encoding in which the file is actually saved, and the encoding with which the Java compiler reads your source code.
Your database isn't set up correctly to accept and store Unicode characters. Ensure that your database is configured correctly. It looks like you're using MySQL, so you may want to create a dump of the database using mysqldump and check how the database was created with respect to character sets.
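One way to check the first possibility is to print the code points the compiled program actually sees for the literal. The sketch below is illustrative; the class and method names are hypothetical, and the term is written with Unicode escapes so that the check itself does not depend on the source file encoding:

```java
import java.util.stream.Collectors;

final class LiteralCheck {
    // Formats each code point of the string as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // The Persian term from the question, spelled out as escapes.
        String term = "\u062E\u062F\u0627\u0628\u062E\u0634";
        System.out.println(codePoints(term));
    }
}
```

If the same helper prints different code points for the literal typed directly into the source file, the compiler is misreading the file, and compiling with javac -encoding UTF-8 (or the equivalent build tool setting) is the usual remedy.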

Storing Special Character in MySQL

I am using MySQL Query Browser to store the following names in the Person table, which has the fields personNumber and personName. The character set of personName is UTF-8. If I insert the name via the query browser, the query runs correctly, but when I try it via JDBC or JPA, the special characters in the name become '?'. What is the problem here?
The names are
1. Năstase
2. Hrustanović
3. Ogris-Martič and some similar names.
Have you set your connection string correctly?
jdbc:mysql://localhost:3306/administer?characterEncoding=utf8
Try this connection string:
jdbc:mysql://localhost:3306/MY_DB?useUnicode=yes&characterEncoding=UTF8
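The '?' symptom can be reproduced without a database. The sketch below is only an illustration of the mechanism: it assumes ISO-8859-1 as a stand-in for whatever non-UTF-8 encoding the connection falls back to, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LossyEncodingDemo {
    // Encodes the text with the given charset and decodes it again.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The first name from the question; the escape is the letter a with breve.
        String name = "N\u0103stase";
        // Latin-1 has no such letter, so it is replaced by '?'.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1)); // N?stase
        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // true
    }
}
```

Once the character has been replaced on the way to the server, the original cannot be recovered from the stored value, so rows that already contain '?' have to be inserted again after the connection string is fixed.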
