When I insert data with special characters via Liquibase from an SQL file, it works.
When I insert data with special characters from the IntelliJ console via an SQL INSERT, it works.
When I insert data with special characters by persisting an entity, it fails with the following
error:
(Quarkus Main Thread) Incorrect string value: '\xE1\xE1\xE1\xE1\xE1\xE1...' for column 'LEGAL_CLASSIFICATION' at row 1
The data I would like to insert:
product.setLegalClassification("ááááááásdaásdáasáá");
product.persist();
Properties:
hibernate-orm:
  dialect: "org.hibernate.dialect.MySQLInnoDBDialect"
datasource:
  db-kind: mysql
  username: ${DB_USER}
  password: ${DB_PASSWORD}
  jdbc:
    driver: "com.mysql.cj.jdbc.Driver"
    url: "jdbc:mysql://localhost:3306/my-db?useUnicode=true&characterEncoding=utf8"
I've already tried with URLs:
"jdbc:mysql://localhost:3306/my-db?characterEncoding=utf8"
"jdbc:mysql://localhost:3306/my-db?useUnicode=true&characterEncoding=UTF-8"
"jdbc:mysql://localhost:3306/my-db?useUnicode=true&characterEncoding=utf8"
"jdbc:mysql://localhost:3306/my-db?useUnicode=true;characterEncoding=utf8;"
"jdbc:mysql://localhost:3306/my-db?useUnicode=yes&characterEncoding=utf8"
"jdbc:mysql://localhost:3306/my-db?useUnicode=true&characterEncoding=UTF-8"
"jdbc:mysql://localhost:3306/my-db?useUnicode=true;characterEncoding=UTF-8;"
"jdbc:mysql://localhost:3306/my-db?useUnicode=yes&characterEncoding=UTF-8"
and dialects:
"org.hibernate.dialect.MySQLInnoDBDialect"
"org.hibernate.dialect.MySQLDialect"
The column type is set like this:
<column name="LEGAL_CLASSIFICATION"
type="LONGTEXT"/>
and in a MySQL dbms SQL changeset:
ALTER TABLE PRODUCT MODIFY COLUMN LEGAL_CLASSIFICATION LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_hungarian_ci;
I tried to query database info with the following results:
SELECT SCHEMA_NAME 'database', default_character_set_name 'charset', DEFAULT_COLLATION_NAME 'collation' FROM information_schema.SCHEMATA;
gives
database, charset, collation
information_schema, utf8mb3, utf8_general_ci
my-db, utf8mb4, utf8mb4_0900_ai_ci
show variables like 'character%';
gives
variable_name, value
character_set_client, utf8mb4
character_set_connection, utf8mb4
character_set_database, utf8mb4
character_set_filesystem, binary
character_set_results, utf8mb4
character_set_server, utf8mb4
character_set_system, utf8mb3
character_sets_dir, /usr/share/mysql-8.0/charsets/
show variables like 'collation%';
gives
variable_name, value
collation_connection, utf8mb4_unicode_ci
collation_database, utf8mb4_0900_ai_ci
collation_server, utf8mb4_0900_ai_ci
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA S WHERE schema_name = 'my-db';
gives
default_character_set_name, default_collation_name
utf8mb4, utf8mb4_0900_ai_ci
Before I set the collation and character set on the columns, I could not insert data even from an SQL file via Liquibase; altering the tables fixed that, but I still can't insert through Hibernate.
Does anybody have an idea what I should do to make it work?
I really appreciate any help you can provide.
UPDATE (solution/workaround)!!
I've found that even without the JDBC URL parameters, VARCHAR and LONGVARBINARY columns work correctly from an SQL file, from the terminal, and even through Hibernate.
So the only column that is not encoded correctly is the LONGTEXT one. Unfortunately VARCHAR is too short for the job, so for now I solved the issue by using type="JAVA.SQL.TYPES.LONGVARBINARY" as the column type (with this I don't even need the column alteration), putting @Lob on a byte[] entity field, and converting the data while mapping the field to a DTO.
My Entity:
@Column(name = "LEGAL_CLASSIFICATION")
@Lob
private byte[] legalClassification;
Liquibase column:
<column name="LEGAL_CLASSIFICATION"
type="JAVA.SQL.TYPES.LONGVARBINARY"/>
Mapping:
@Mapping(source = "legalClassification", target = "legalClassification", qualifiedByName = "toUtf8")
ProductModel toModel(Product entity);

@Mapping(source = "legalClassification", target = "legalClassification", qualifiedByName = "fromUtf8")
Product toEntity(ProductModel entity);

@Named("toUtf8")
default String toUtf8(byte[] entity) {
    return entity == null ? null : new String(entity, StandardCharsets.UTF_8);
}

@Named("fromUtf8")
default byte[] fromUtf8(String model) {
    return model == null ? null : model.getBytes(StandardCharsets.UTF_8);
}
Now everything works fine, with proper 4-byte UTF-8 encoding. Although I did not find any documentation on how this works in MySQL or Hibernate, nor why I cannot store 4-byte UTF-8 encoded strings in TEXT and LONGTEXT columns. I would appreciate an explanation of that.
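A side note on the original error bytes, as a guess rather than a confirmed diagnosis: `\xE1` is `á` in ISO-8859-1 (Latin-1), while UTF-8 encodes `á` as the two bytes `\xC3\xA1`. That suggests the driver was sending Latin-1 bytes over a connection the server interpreted as UTF-8. A minimal sketch of the difference:

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) {
        // ISO-8859-1 encodes 'á' as the single byte 0xE1 -- the byte
        // repeated in the "Incorrect string value" error message
        byte[] latin1 = "á".getBytes(StandardCharsets.ISO_8859_1);
        // UTF-8 encodes 'á' as the two bytes 0xC3 0xA1
        byte[] utf8 = "á".getBytes(StandardCharsets.UTF_8);
        System.out.printf("latin1: %02X%n", latin1[0] & 0xFF);          // prints latin1: E1
        System.out.printf("utf8: %02X %02X%n", utf8[0] & 0xFF, utf8[1] & 0xFF); // prints utf8: C3 A1
    }
}
```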
In a nutshell
How do you migrate a LONGBLOB from MySQL to Postgres using pgloader such that Hibernate is happy when the column is annotated @Lob and @Basic(fetch = FetchType.LAZY)?
Full story
So I'm migrating (or trying to, at least) a MySQL DB to postgres. And I'm now trying to move this table correctly:
My current pgloader script is fairly simple:
LOAD DATABASE
FROM mysql://foo:bar@localhost:3306/foobar
INTO postgresql://foo:bar@localhost:5432/foobar
CAST
type int to integer drop typemod,
type bigint with extra auto_increment to bigserial drop typemod,
type bigint to bigint drop typemod
ALTER TABLE NAMES MATCHING 'User' RENAME TO 'users'
ALTER TABLE NAMES MATCHING ~/./ SET SCHEMA 'public'
;
This is sufficient to load the data and have the foreign keys working.
The postgres table looks like this:
The File, however, is a java entity and its content is annotated #Lob:
@Entity
@Inheritance(strategy = InheritanceType.JOINED)
public class File extends BaseEntity {

    @NotNull
    private String name;

    @Column
    @Size(max = 4096)
    private String description;

    @NotNull
    private String mimeType;

    @Lob
    @Basic(fetch = FetchType.LAZY)
    private transient byte[] content;

    ...
}
which is why the application fails to connect to the migrated database with error:
Schema-validation: wrong column type encountered in column [content] in table [File];
found [bytea (Types#BINARY)], but expecting [oid (Types#BLOB)]
How do I get this migration to work?
I did try setting
spring.jpa.properties.hibernate.jdbc.use_streams_for_binary=false
as suggested in proper hibernate annotation for byte[] but that didn't do anything.
Hm ... I guess I can just create blobs after the fact, as suggested by Migrate PostgreSQL text/bytea column to large object?
Meaning the migration script will get an extension:
LOAD DATABASE
FROM mysql://foo:bar@localhost:3306/foobar
INTO postgresql://foo:bar@localhost:5432/foobar
CAST
type int to integer drop typemod,
type bigint with extra auto_increment to bigserial drop typemod,
type bigint to bigint drop typemod
ALTER TABLE NAMES MATCHING 'User' RENAME TO 'users'
ALTER TABLE NAMES MATCHING ~/./ SET SCHEMA 'public'
AFTER LOAD DO
$$
ALTER TABLE file RENAME COLUMN content TO content_bytes;
$$,
$$
ALTER TABLE file ADD COLUMN content OID;
$$,
$$
UPDATE file SET
content = lo_from_bytea(0, content_bytes::bytea),
content_bytes = NULL
;
$$,
$$
ALTER TABLE file DROP COLUMN content_bytes
$$
;
I'm trying to write tests using an in-memory DB.
I wrote an SQL script to clean and populate the DB, but I get this exception:
Caused by: org.h2.jdbc.JdbcSQLDataException: Hexadecimal string contains non-hex character: "e7485042-b46b-11e9-986a-b74e614de0b0"; SQL statement:
insert into users (user_id, name, created_on, modified_on) values ('e7485042-b46b-11e9-986a-b74e614de0b0', 'Ann', null, null) -- ('e7485042-b46b-11e9-986a-b74e614de0b0', 'Ann', NULL, NULL) [90004-199]
My sql:
insert into users (user_id, name, created_on, modified_on) values ('e7485042-b46b-11e9-986a-b74e614de0b0', 'Ann', null, null);
insert into product(product_id, name, created_on, modified_on) VALUES ('f3a775de-b46b-11e9-95e4-af440b6044e6', 'product1', '2019-08-01 17:51:51.000000', '2019-08-01 17:51:51.000000');
insert into products_users(user_id, product_id) VALUES ('e7485042-b46b-11e9-986a-b74e614de0b0', 'f3a775de-b46b-11e9-95e4-af440b6044e6');
My application.properties:
spring.h2.console.enabled=true
spring.datasource.url=jdbc:h2:mem:db;DB_CLOSE_DELAY=-1
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
Using spring.datasource.url=jdbc:h2:mem:testdb;MODE=MYSQL fixed it for me.
Or adding a @Type annotation to the UUID field should fix the issue:
@Id
@Type(type = "uuid-char")
private UUID user_id;
The actual cause of this problem is the mismatch between your object mapping and the CREATE TABLE statement Hibernate generates (ddl-auto: create) to build your H2 schema.
If you enable the output of the those ddl statements using:
spring.jpa.properties.hibernate.show_sql=true
spring.jpa.properties.hibernate.use_sql_comments=true
spring.jpa.properties.hibernate.format_sql=true
logging.level.org.hibernate.type=TRACE
you will most likely see that your UUID class has been mapped to a binary column in your database.
Hibernate:
create table <your_table> (
id bigint generated by default as identity,
...,
<your_object> binary(255),
...
primary key (id)
)
This means that your UUID string is written to a binary column, so H2 tries to parse it as a hexadecimal string and fails on the hyphens. You need a varchar column (36 characters for the canonical UUID form) to store a UUID as text. There are several solution strategies; one of them is defining a type, see this StackOverflow answer. You can read about binary columns on the official MySQL reference site.
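The 36-character figure comes from the canonical UUID text form: 32 hex digits plus 4 hyphens. A quick check:

```java
import java.util.UUID;

public class UuidLength {
    public static void main(String[] args) {
        // Canonical form: 8-4-4-4-12 hex digits separated by hyphens
        UUID id = UUID.fromString("e7485042-b46b-11e9-986a-b74e614de0b0");
        System.out.println(id.toString().length()); // prints 36
    }
}
```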
I resolved this problem by adding spring.jpa.hibernate.ddl-auto=none to my application.properties file
I want to read data from a CSV file and then write it to MySQL. The data contains foreign languages.
I got this error when I tried to insert a record containing Japanese characters into MySQL:
"1366 Incorrect string value: '\xE6\xB0\xB4\xE7\x9D\x80...' for column 'name' at row 1"
The SQL statement looks like this:
INSERT INTO `MerchandiseMaster` (id,name) VALUES ('20000101','JANIE AND JACK水着 鶯茶系 大胆花柄')
My CSV file uses UTF-8 encoding, and the collation of the MySQL database schema is utf8_general_ci.
I pass these parameters when I connect to the database through JDBC (mysql-connector-java-5.1.34-bin.jar):
connect = DriverManager.getConnection("jdbc:mysql://localhost/mydata?"
+ "useUnicode=yes&characterEncoding=UTF-8&user=user123&password=user123.");
My question is:
Is there anything else that I am missing to deal with foreign characters correctly?
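For what it's worth, the bytes in the error message are the correct UTF-8 encoding of the first characters of the value: `水着` encodes to `E6 B0 B4 E7 9D 80`. So the client side is sending valid UTF-8, and the rejection happens on the MySQL side. A quick check:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    public static void main(String[] args) {
        // "水着" are the first multi-byte characters of the inserted value
        for (byte b : "水着".getBytes(StandardCharsets.UTF_8)) {
            System.out.printf("%02X ", b & 0xFF); // prints E6 B0 B4 E7 9D 80
        }
    }
}
```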
I found this on a website, so caveat emptor, but apparently MySQL's utf8 support is incomplete: the legacy utf8 charset stores at most three bytes per character. In 2010 MySQL added a new character set, utf8mb4, that supports the entire UTF-8 encoding scheme, including 4-byte characters.
Add to your MySQL configuration file:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
Here's a link to the full article. I haven't tried this out, so test everything carefully first, and make a back-up of your database before doing anything.
How to fix “Incorrect string value for the column at the row 1” errors
In my Java project I am using Hibernate, and in my entity class, for the PASSWORD field,
I want to use encryption and decryption via @ColumnTransformer:
@Column(name = "PASSWORD", length = 100)
@ColumnTransformer(
    read = "AES_DECRYPT(PASSWORD, 'ABCD')",
    write = "AES_ENCRYPT(?, 'ABCD')")
private String password;
but at save time it throws the following exception:
javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Incorrect string value: '\x80\x11(\xC4\xAB\x10...' for column 'PASSWORD' at row 1
How can I resolve this exception?
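Some background on why this fails: `AES_ENCRYPT` returns raw binary, which is generally not a valid UTF-8 byte sequence, so a character column rejects it. A rough JCE sketch (AES/ECB with a zero-padded key; for keys up to 16 bytes this matches MySQL's key folding, but treat it as an illustration rather than a guaranteed byte-compatible reimplementation):

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AesBinaryDemo {
    public static void main(String[] args) throws Exception {
        // Zero-pad the key 'ABCD' to 16 bytes (AES-128 key size)
        byte[] key = Arrays.copyOf("ABCD".getBytes(StandardCharsets.UTF_8), 16);
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        byte[] ciphertext = cipher.doFinal("myPassword".getBytes(StandardCharsets.UTF_8));
        // One padded AES block: 16 raw binary bytes, not valid UTF-8 text
        System.out.println(ciphertext.length); // prints 16
    }
}
```

This is why storing the result in a character column is fragile; a binary column type (VARBINARY or BLOB) sidesteps character-set validation entirely.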
Is it a MySQL database?
For a MySQL database, you may need to change the character set of the column, in your case PASSWORD.
You can do this using the SQL below; change the size and NOT NULL options as needed:
ALTER TABLE database.table MODIFY COLUMN PASSWORD VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;
I am fetching tweets from Twitter and storing them in a database for future use. I am using UTF-8 encoding in my driver, utf8mb4_bin in my VARCHAR fields, and the utf8mb4_general_ci server collation. The problem is that when I insert a value into a VARCHAR field, if the text contains any 4-byte characters the insert throws an exception, since a plain utf8 VARCHAR column does not accept them.
Here is an example, I am fetching the text from here and try inserting it in my database and I get the error:
Incorrect string value: '\xF0\x9F\x98\xB1\xF0\x9F...' for column 'fullTweet' at row 1
My guess is that the two emoticons are causing this. How do I get rid of them before inserting the tweet text in my database?
Update:
Looks like I can manually enter the emoticons. I run this query:
INSERT INTO `tweets`(`id`, `createdAt`, `screenName`, `fullTweet`, `editedTweet`) VALUES (450,"1994-12-19","john",_utf8mb4 x'F09F98B1',_utf8mb4 x'F09F98B1')
and this is what the row in the table looks like:
You can remove non-ASCII characters from the tweet string before inserting:
tweetStr = tweetStr.replaceAll("[^\\p{ASCII}]", "");
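Note that `\p{ASCII}` removes every non-ASCII character, including ordinary accented or Japanese text, not just emoji. If the goal is only to drop the characters that require utf8mb4, a narrower filter (a hypothetical helper, `stripSupplementary`) removes only supplementary-plane code points, which are the ones encoded as surrogate pairs in Java:

```java
public class EmojiStrip {
    // Removes only code points above U+FFFF (e.g. most emoji), which are
    // the characters that need a utf8mb4 column in MySQL
    static String stripSupplementary(String s) {
        return s.replaceAll("[^\\u0000-\\uFFFF]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripSupplementary("hi 😱 there")); // prints "hi  there"
    }
}
```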
It looks like utf8mb4 support is still not configured correctly.
In order to use utf8mb4 in your fields you need to do the following:
1. Set character-set-server=utf8mb4 in your my.ini or my.cnf. Only character-set-server really matters here; the other settings don't.
2. Add characterEncoding=UTF-8 to the connection URL:
jdbc:mysql://localhost:3306/db?characterEncoding=UTF-8
3. Configure the collation of the field.