Why are Java strings not saved as UTF-8 in MySQL?

message = new String(("round " + id).getBytes("UTF-8"));
conn = DriverManager.getConnection("jdbc:mysql://" + host + "/" + db + "?useUnicode=true&characterEncoding=UTF-8&"
+ "user=" + login + "&password=" + password);
When I insert into the database, whose encoding is UTF-8 CI, I get something like this: �������������������� 179. The Java file encoding is UTF-8. What am I doing wrong?
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

Generally, MySQL comes with a list of predefined system variables. If you want to list them, you can open the MySQL prompt and type:
mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | latin1 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
As shown, the default encoding on this server is latin1. To change it, edit your my.cnf file and add the following lines:
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
When you configure the my.cnf file and restart the MySQL server, you'll note the difference.
Edit:
You can set the encoding for the JDBC's DriverManager like this:
DriverManager.getConnection("jdbc:mysql://" + host + "/" + db + "?useUnicode=true&"
+ "user=" + login + "&password=" + password + "&characterEncoding=utf8");
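As a side note on the question's first line: a Java String is held as UTF-16 in memory, so the new String(... .getBytes("UTF-8")) round trip cannot make it "more UTF-8". The one-argument String constructor decodes with the platform default charset, which can corrupt non-ASCII text when that default is not UTF-8. A minimal sketch of that effect follows; the class and method names are illustrative, and ISO-8859-1 merely stands in for a non-UTF-8 platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Encodes s as UTF-8, then decodes those bytes with the given charset.
    // This mimics new String(s.getBytes("UTF-8")) on a platform whose
    // default charset is `platformDefault` (an assumption for illustration).
    static String roundTrip(String s, Charset platformDefault) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, platformDefault);
    }

    public static void main(String[] args) {
        String message = "\u0440\u0430\u0443\u043d\u0434 179"; // Cyrillic sample text plus digits

        // Default charset UTF-8: the round trip is a no-op.
        System.out.println(message.equals(roundTrip(message, StandardCharsets.UTF_8)));

        // Default charset ISO-8859-1: every Cyrillic letter becomes two wrong characters.
        System.out.println(message.equals(roundTrip(message, StandardCharsets.ISO_8859_1)));
    }
}
```

The first comparison holds and the second does not. Passing the original String unchanged to the driver and letting characterEncoding in the connection URL decide the wire encoding avoids this step entirely.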

Besides setting the connection string in Java JDBC, here is an alternative way to change the MySQL charset:
1) Specify utf8 when creating the database and table:
-- database
CREATE DATABASE IF NOT EXISTS databasename
default charset utf8
COLLATE utf8_general_ci;
-- table
CREATE TABLE `Extenics`.`BE_headset` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(255) NOT NULL,
PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=30 DEFAULT CHARSET=utf8;
2) Or alter it after creation (I don't encourage this):
ALTER DATABASE db_name
[[DEFAULT] CHARACTER SET charset_name]
[[DEFAULT] COLLATE collation_name]
See the MySQL documentation for more help.

Related

ERROR 1366 (HY000): Incorrect string value: '\x85\x8A\x8D\x95\x97' for column 'name' at row 1

This is the faulty statement:
insert into customer (id, uuid, name) values (1,uuid(), 'àèìòù');
I am using MySQL server 8.0.32 (just upgraded from 5.7) on Windows.
I am also executing this statement from MySQL client, on the same machine.
Same error using mysql-connector-j-8.0.32.
jdbc:mysql://localhost:3306/risk?allowPublicKeyRetrieval=true&useLocalSessionState=true&rewriteBatchedStatements=true
No error using mysql-connector-java-8.0.22 with the same environment and the same connection string.
This is the table:
| customer | CREATE TABLE `customer` (
`id` bigint NOT NULL,
`uuid` char(36) NOT NULL,
`name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UK_q8w2f8xfdoax44qc8w0epholu` (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
And these are the variables:
mysql> show variables like '%char%';
+--------------------------+------------------------------------------------------+
| Variable_name | Value |
+--------------------------+------------------------------------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8mb3 |
| character_sets_dir | D:\shape\servers\mysql-8.0.32-winx64\share\charsets\ |
+--------------------------+------------------------------------------------------+
8 rows in set (0.00 sec)
mysql> show variables like '%collation%';
+-------------------------------+--------------------+
| Variable_name | Value |
+-------------------------------+--------------------+
| collation_connection | utf8mb4_0900_ai_ci |
| collation_database | utf8mb4_0900_ai_ci |
| collation_server | utf8mb4_0900_ai_ci |
| default_collation_for_utf8mb4 | utf8mb4_0900_ai_ci |
+-------------------------------+--------------------+
4 rows in set (0.00 sec)
Maybe a bug? Am I missing something?
update
this is working:
mysql> set character_set_client=cp850;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into customer (id, uuid, name) values (1,uuid(), 'àèìòù');
Query OK, 1 row affected (0.00 sec)
But why?
And how can I use UTF-8 also on client side?
update2
Also, commenting in my.ini:
character-set-server=utf8mb4
collation-server=utf8mb4_0900_ai_ci
#character-set-client-handshake = FALSE
#init_connect='SET NAMES utf8mb4'
makes JDBC work again...
What application are you using?
Apparently, your client is using "cp850" since that hex decodes to àèìòù.
You could either tell MySQL that you are using cp850 by establishing the connection that way, or (better) fix the encoding inside the client.
That is, the problem does not seem to be with MySQL.
More
If you are using Windows, did you do something like this?
chcp 850
If so, you need this instead:
chcp 65001
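The claim that the rejected bytes are cp850 can be checked from Java. This is a rough sketch, assuming the JDK in use ships the extended charset that Java names "IBM850" (its name for cp850); the class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Cp850Check {
    // The bytes quoted in the error message: '\x85\x8A\x8D\x95\x97'
    static final byte[] REJECTED = {
        (byte) 0x85, (byte) 0x8A, (byte) 0x8D, (byte) 0x95, (byte) 0x97
    };

    // Decodes raw bytes with a charset looked up by name.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    // Renders bytes as upper-case hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Interpreted as cp850, the rejected bytes are the five accented vowels.
        System.out.println(decode(REJECTED, "IBM850"));

        // What a UTF-8 client would have sent for the same text.
        String vowels = "\u00e0\u00e8\u00ec\u00f2\u00f9";
        System.out.println(hex(vowels.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A client that really sends UTF-8 would transmit two bytes per accented letter (C3A0 C3A8 ...), not the single bytes shown in the error.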

MySQL - Selecting only the default value from column

I am wondering whether it is possible to select only the default value of a column in an empty table.
I have a completely empty table and I just want to select the default value of one of its columns. This is important for my Java app, which fills the table.
Thanks.
You can get the default from the INFORMATION_SCHEMA.COLUMNS
select COLUMN_DEFAULT
from INFORMATION_SCHEMA.COLUMNS
where TABLE_SCHEMA='your_db' and TABLE_NAME='your_table' and COLUMN_NAME='your_column'
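Since the question mentions a Java app, the same lookup could be issued over JDBC roughly like this. It is only a sketch: the class and method names are hypothetical, and the caller is assumed to supply an open Connection to the MySQL server.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class ColumnDefaultLookup {
    // Same query as above, with placeholders instead of literal names.
    static final String SQL =
        "SELECT COLUMN_DEFAULT FROM INFORMATION_SCHEMA.COLUMNS "
        + "WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ? AND COLUMN_NAME = ?";

    // Returns the default as text, or null when the column has no default
    // or no matching column exists.
    static String columnDefault(Connection conn, String schema, String table, String column)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, schema);
            ps.setString(2, table);
            ps.setString(3, column);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```

Note that a null result is ambiguous here: it can mean "no default" or "column not found", so a caller that needs to tell these apart would have to check for the row separately.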
You can define a default value for a column when you create a table, if you just want MySQL to insert it automatically:
create table my_table (i INT DEFAULT 1);
But if you mean you want the default value which is stored in the DB dictionary, you can get it by this query:
SELECT Column_Default
FROM Information_Schema.Columns
WHERE Table_Schema = 'yourSchema'
AND Table_Name = 'yourTableName'
AND Column_Name = 'yourColumnName'
I can only think of two ways:
1) Inserting a row
- Insert a row without specifying a value for that column
- Select the column from that row; it will have the default value of the column
- Delete the row
...probably all in a transaction so nothing else sees it.
2) Using describe (explain)
The describe command (aka explain) describes objects in the system, including tables. So if you do explain YourTable, you'll get back information about the table, including its default values.
Here's an example from that linked documentation:
mysql> DESCRIBE City;
+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| Id         | int(11)  | NO   | PRI | NULL    | auto_increment |
| Name       | char(35) | NO   |     |         |                |
| Country    | char(3)  | NO   | UNI |         |                |
| District   | char(20) | YES  | MUL |         |                |
| Population | int(11)  | NO   |     | 0       |                |
+------------+----------+------+-----+---------+----------------+
So you can extract the default from the Default column in the returned rows.
Ah, of course, there's a third way, see slaakso's answer for it.

Emoji renders in MySQL table column and in HTML after ajax, but DOES NOT render after page reload. Why?

I've implemented an emoji picker for comments on my Spring & Thymeleaf web app/blog.
Currently, I can select an emoji, see it appear in the textarea, submit the form, the comment is saved inside the controller post method into my MySQL 5.7.17 db table - I can see the emoji art in the table column- the comment returns via ajax, and I can see the emoji on the page. Yay, woohoo!
But! After I reload the page... I see this:
"ð± and ð¶"
What gives??
In order to insert the emojis in MySQL, I followed this tutorial:
https://mathiasbynens.be/notes/mysql-utf8mb4
Storing is NOT the problem.
My my.cnf file, located at
/usr/local/Cellar/mysql/5.7.17/support-files/my.cnf
my.cnf:
--defaults-extra-file=#
[client]
default-character-set = utf8mb4
[mysqld]
init-connect='SET NAMES utf8mb4'
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
[mysql]
default-character-set = utf8mb4
and then made this query:
ALTER TABLE comments CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
and this:
mysql> SET NAMES 'utf8mb4';
Query OK, 0 rows affected (0.00 sec) [then I put: init-connect='SET NAMES utf8mb4' in the cnf file]
mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR
Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)
However, from what I understand, this only works once, because when I run that command after I run the app, it reads:
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| collation_connection | utf8_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8_general_ci |
+--------------------------+--------------------+
10 rows in set (0.03 sec)
My pom.xml has this:
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF 8</project.reporting.outputEncoding>
<java.version>1.8</java.version>
<property name="hibernate.connection.CharSet" value="utf8mb4" />
<property name="hibernate.connection.characterEncoding"
value="utf8mb4"/>
<property name="hibernate.connection.useUnicode" value="true"/>
</properties>
and on all relevant HTML pages and on the header fragment I have:
<meta charset="UTF-8">
When I System.out.println(comment.getBody()) in the controller's PostMapping method -both before and after I save the comment- I can see the emojis in the terminal just fine! But when I System.out.println(comment.getBody()) in the GetMapping for the page, I see all the weird characters and not the emoji. I'm really confused. What do you think the issue can be and what should I do to resolve it? Any help is appreciated, thank you in advance!
(From Comment:)
CREATE TABLE `comments` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`body` blob NOT NULL,
`created_date` datetime DEFAULT NULL,
`parent_id` bigint(20) DEFAULT NULL,
`post_id` bigint(20) DEFAULT NULL,
`user_id` bigint(20) DEFAULT NULL,
) ENGINE=InnoDB AUTO_INCREMENT=2084 DEFAULT CHARSET=utf8
That looks like 'Mojibake'; see Trouble with UTF-8 characters; what I see is not what I stored
But, since ð is hex F0, and F0 is the start of Emoji (etc), it may be that you have specified utf8 in MySQL instead of utf8mb4. What was "ð± and ð¶" supposed to be??
Spring/Hibernate:
Hibernate XML:
<property name="hibernate.connection.CharSet">utf8mb4</property>
<property name="hibernate.connection.characterEncoding">utf8</property>
<property name="hibernate.connection.useUnicode">true</property>
Connection url:
db.url=jdbc:mysql://localhost:3306/db_name?useUnicode=true&character_set_server=utf8mb4
CREATE TABLE
DEFAULT CHARSET=utf8 says that all VARCHAR and TEXT columns will be CHARSET utf8 unless overridden.
body blob NOT NULL, -- You aren't even using a text-like datatype! BLOB says "just throw the bytes in; don't even think about CHARSET".
Because of BLOB, if the Emoji is going into the body, the bytes should be coming out identical to the way they went in. But, let's check something else. Please get HEX(body), preferably for a very short body, perhaps with nothing but an Emoji in it.
For example, the hex for 😁 --
F09F9881 -- correctly in utf8mb4 (aka "UTF-8" outside MySQL). Note leading F0
C3B0C5B8CB9CC281 -- "Double encoded". Might display as ðŸ˜. Note leading ETH (ð)
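The two hex patterns can be reproduced on the Java side to see which one the application is actually handling. This is a sketch with illustrative names; ISO-8859-1 is used as a stand-in for MySQL's latin1, which treats bytes 0x80-0x9F like cp1252, so the middle bytes of the double-encoded result differ from the C3B0C5B8CB9CC281 value above while the leading C3B0 is the same.

```java
import java.nio.charset.StandardCharsets;

class EmojiHex {
    // Renders bytes as upper-case hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Hex of the string's UTF-8 encoding (what utf8mb4 should store).
    static String utf8Hex(String s) {
        return hex(s.getBytes(StandardCharsets.UTF_8));
    }

    // Simulates double encoding: UTF-8 bytes are misread as a single-byte
    // charset, and the resulting characters are encoded as UTF-8 again.
    static String doubleEncodedHex(String s) {
        String misread = new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        return utf8Hex(misread);
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE01"; // U+1F601, written as a surrogate pair
        System.out.println(utf8Hex(emoji));          // F09F9881
        System.out.println(doubleEncodedHex(emoji)); // C3B0C29FC298C281
    }
}
```

Comparing utf8Hex(comment.getBody()) in the PostMapping and GetMapping handlers against HEX(body) from the table would show at which hop the bytes change.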

Invalid string value: '\ xD0...." for the column .... " when trying to use the utf8 charset

In a Spring project, I am trying to insert Cyrillic characters into the database, but the database does not accept them.
I use an interface that extends JpaRepository and its save(T t) method, and everything works correctly when I send English text;
when the program tries to save an entity containing Cyrillic, I get the exception "Invalid string value: '\ xD0 \ xA5 \ xD0 \ xB0 \ xD0 \ xB1 ..." for the column .... "
So the encoding does not work.
My database character variables:
application.properties:
launchMode=cli
#Database settings
spring.datasource.url=jdbc:mysql://localhost:3306/mySecondBD?serverTimezone=UTC
spring.jpa.hibernate.ddl-auto=update
spring.datasource.username=root
spring.datasource.password=password
#spring.datasource.driver-class-name=org.gjt.mm.mysql.Driver
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.tomcat.connection-properties=useUnicode=true;characterEncoding=utf-8;
spring.datasource.sql-script-encoding=UTF-8
Question:
Where else do I need to set charset encoding parameters?
Edit the properties:
spring.datasource.url=jdbc:mysql://localhost:3306/mySecondBD?useUnicode=yes&characterEncoding=UTF-8
Check your variables:
mysql> show variables like 'char%';
Edit my.cnf:
vi /etc/my.cnf
[client]
default-character-set=utf8
Finally, you recheck your MySQL variables and confirm your query result.
Hex D0A5D0B0D0B1 is the UTF-8 encoding for Cyrillic Хаб. So, using MySQL's utf8 (or utf8mb4) should work.
When you said '\ xD0 \ xA5 \ xD0 \ xB0 \ xD0 \ xB1 ...", you had extra spaces; were they in the output you saw?
Please provide SHOW CREATE TABLE page.
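The hex claim is easy to confirm from Java, which also shows that the application side is producing valid UTF-8. A small sketch (class and method names are illustrative):

```java
import java.nio.charset.StandardCharsets;

class CyrillicHex {
    // Converts a hex string such as "D0A5" into the bytes it denotes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Decodes the hex-encoded bytes as UTF-8 text.
    static String decodeUtf8(String hex) {
        return new String(fromHex(hex), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The first six bytes from the exception message.
        System.out.println(decodeUtf8("D0A5D0B0D0B1"));
    }
}
```

Since the bytes are well-formed UTF-8, the rejection points at the column or connection charset on the MySQL side rather than at the Java string.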
Somehow, you need to say that the client encoding is utf8. One way (in spring) is to put this in the application.yml:
datasource:
connectionInitSql: "SET NAMES 'utf8'"
I used this query, and it works for me:
ALTER TABLE table_name MODIFY COLUMN column_name varchar(255)
CHARACTER SET utf8 COLLATE utf8_general_ci;
In summary, I took the following steps and everything works:
I edited application.properties, as advised:
spring.datasource.url=jdbc:mysql://localhost:3306/mySecondBD?useUnicode=yes&characterEncoding=UTF-8
I changed the encoding of my table (not only the database):
ALTER TABLE page CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE page CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
and I changed the column type to TEXT:
alter TABLE page MODIFY content TEXT(21844);
PS: variables of the database:
| Variable_name | Value |
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |

Not able to insert CAFÉ in mysql, even after setting encoding correctly

I inserted the word CAFE into the name field of a MySQL table.
Unexpectedly, I get the row containing CAFE when I execute the statement below:
SELECT * FROM myTable where name='CAFÉ';
This is wrong for my use case: CAFE should not be equal to CAFÉ.
I think I set all the encodings correctly on server and client side:
Server side:
By modifying /etc/mysql/my.cnf I get below
mysql> show variables like "%character%";show variables like "%collation%";
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)
Client Side:
connect = DriverManager
.getConnection("jdbc:mysql://"+serverName+"/" +
dataBaseName + "?characterEncoding=utf8&user=" + userName + "&password=" + password);
p.s. there are many duplicate questions similar to this, but none of them answering specific to what I am running into.
Collation utf8_general_ci (_ci stands for Case Insensitive) does not only make e equal to E, but also makes E equal to É. To make a select statement case sensitive, you can use the solution from this answer:
SELECT * FROM myTable where BINARY name='CAFÉ';
If you want to make data in column name always case sensitive, use a case sensitive _bin collation as shown in the answers for this question. E.g. when you create a table, use:
CREATE TABLE myTable (
...
) CHARACTER SET utf8 COLLATE utf8_bin ENGINE=MyISAM;
Credit to @vanOekel's comments:
adding CHARACTER SET utf8 COLLATE utf8_bin while creating the table solved my problem
CREATE TABLE myTable (
name CHAR(100) NOT NULL,
CONSTRAINT uc__name UNIQUE (name)
) CHARACTER SET utf8 COLLATE utf8_bin ENGINE=MyISAM;
Now in my table I can have both CAFE and CAFÉ without tripping the unique constraint on name field.
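For intuition only, the difference between a _ci collation and a _bin collation resembles the difference between a primary-strength java.text.Collator and String.equals on the Java side. This is an analogy, not MySQL's implementation, and the class and method names are invented for the sketch.

```java
import java.text.Collator;
import java.util.Locale;

class CollationAnalogy {
    // Loosely comparable to a _ci collation: case and accents are ignored.
    static boolean equalIgnoringCaseAndAccents(String a, String b) {
        Collator collator = Collator.getInstance(Locale.ROOT);
        collator.setStrength(Collator.PRIMARY);
        // Decompose accented letters so the accent is compared separately.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.compare(a, b) == 0;
    }

    // Loosely comparable to a _bin collation: code units must match exactly.
    static boolean equalBinary(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String plain = "CAFE";
        String accented = "CAF\u00c9"; // CAFE with E acute
        System.out.println(equalIgnoringCaseAndAccents(plain, accented));
        System.out.println(equalBinary(plain, accented));
    }
}
```

The first comparison treats the two words as equal and the second does not, which mirrors why the unique constraint only accepts both values under utf8_bin.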
