Character encoding issue with linux and mysql - java

I receive a string from a web service that contains the substring gratuit.AFLĂ MAI MULTEDe. Saving it to the database works fine on my local (Windows) machine, but when the application is deployed on a Linux server I get the following error:
Incorrect string value: '\xC4\x82 MAI...' for column 'description' at row 1
I am using Hibernate 3.3 with MySQL 5.5 (on both Windows and Linux), and the database uses the default encoding (latin1).
I have tried setting -Dfile.encoding=UTF8 in JAVA_OPT, but it did not work. I think this is an OS-related problem.
Any suggestions?

(In general, nowadays I would do everything in UTF-8.) There is a long pipeline of points where the encoding can be set. From the web service you probably get XML in UTF-8. That is read correctly automatically, as XML handles encoding strictly.
At the database level, the database, the table, and the field each have a default or an explicit encoding. Furthermore, the connection URL should be parameterised with the correct encoding.
The error message shows the UTF-8 bytes (C4 82) for Ă, the A with breve, which is not available in Latin1.
For MySQL the connection string could look like:
jdbc:mysql://localhost/MYDB?useUnicode=true&characterEncoding=UTF-8
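The claim about the bytes in the error message can be checked from Java itself. The following is a minimal sketch; the class and method names are made up for illustration, and Java's ISO-8859-1 is used as a stand-in for MySQL's latin1, which is an assumption (the two are close but not identical).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // Formats the bytes of s in the given charset as upper-case hex pairs.
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    // True if every character of s can be represented in the charset.
    static boolean fits(String s, Charset cs) {
        return cs.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String aBreve = "\u0102"; // Ă, written as an escape so the source file encoding does not matter
        System.out.println(hex(aBreve, StandardCharsets.UTF_8));        // C4 82
        System.out.println(fits(aBreve, StandardCharsets.ISO_8859_1));  // false
        System.out.println(fits(aBreve, StandardCharsets.UTF_8));       // true
    }
}
```

The bytes C4 82 match the '\xC4\x82' in the error message, which suggests the string reaches MySQL as UTF-8 and is rejected by the latin1 column.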

Related

Spring Security java.lang.IllegalArgumentException: Non-hex character in input

I deployed an existing Maven project to my Tomcat server in a Windows 7 environment. I'm using Tomcat 7 and spring-security-core 3.1.0.
However, every time I log in to my webapp, I receive an error:
java.lang.IllegalArgumentException: Non-hex character in input
The code works perfectly fine in a Linux environment, so I was thinking it's because I'm using Windows 7 in my local environment. When I searched the internet, I saw that it's an encoding issue between Linux and Windows.
I tried setting
JAVA_TOOL_OPTIONS -Dfile.encoding=UTF8
but it didn't help. Please help me out. Thanks in advance!
Most likely, when you log in, events happen in this order:
Spring selects an entity from the DB by username.
Spring must check whether the entered password matches the stored encoded password.
To check for a match, Spring uses the PasswordEncoder, which you have most likely configured.
Your password encoder expects the stored encoded password to be a hexadecimal char sequence (previously encoded by this PasswordEncoder). Thus, it tries to decode the CharSequence into a byte[], but fails (source).
The solution is to persist users with an already encoded password, e.g. one produced by BCryptPasswordEncoder.
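The failure in the fourth step can be illustrated without Spring. The following is a minimal sketch of a hex decoder, not Spring Security's actual code; the class name and the reuse of the error message are assumptions made for illustration.

```java
class HexDecodeSketch {
    // Decodes a hex string into bytes; throws if any character is not
    // recognised by Character.digit as a base-16 digit.
    static byte[] decodeHex(CharSequence s) {
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have even length");
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < s.length(); i += 2) {
            int hi = Character.digit(s.charAt(i), 16);
            int lo = Character.digit(s.charAt(i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Non-hex character in input");
            }
            out[i / 2] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(decodeHex("0aff").length); // 2
        try {
            decodeHex("root"); // a plain-text password stored in the DB
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Non-hex character in input
        }
    }
}
```

A plain-text value such as root contains letters outside 0-9 and a-f, so any decoder of this shape rejects it regardless of the operating system.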
Alex Derkach's answer worked for me!
In my case I have a (development) DB that stores passwords in plain text, e.g. User=roor, psw=root.
So when I comment out (delete) .passwordEncoder(new StandardPasswordEncoder("53c433t")); it works!
But that is wrong: passwords must be stored in encrypted form!
A possible reason for this is mixing password encoders. There are different implementations of PasswordEncoder. For example, if you use SymmetricPasswordEncoder for encoding and StandardPasswordEncoder for verifying, you may get this exception.

how to read and store non-English character in java on windows os

I am dealing with a Brazilian customer whose data is in Portuguese.
My application reads the data through web-service calls and stores it in our database. The issue I am currently facing is that the Portuguese characters are not preserved as they are and end up in my database as special characters.
I am using a MySQL database with all tables configured with a UTF-8 collation. I tried manually inserting Portuguese characters into my database and it worked, so I suspect it is Java that is converting the Portuguese characters into special characters.
Also, my application uses Hibernate for database operations.
I can see the characters correctly in the logs, so the issue arises when storing the data in the database.
Eg: Original characters: Gerãt
Database characters: Gerãt
What configuration or settings do I need to change so that the data is stored in Portuguese as it is?
It may not be a database issue, but an application configuration issue.
A few pointers to help:
Check whether the web service implementation can accept Portuguese characters.
Check the encoding in your web container, for example Tomcat.
Add logs to find out where the Portuguese characters are lost.
Hope it helps.
Try changing your hibernate configuration to:
<hibernate-configuration>
  <session-factory>
    <property name="connection.useUnicode">true</property>
    <property name="connection.characterEncoding">UTF-8</property>
    <property name="connection.charSet">UTF-8</property>
  </session-factory>
</hibernate-configuration>
If that doesn't work, try adding characterEncoding=UTF-8 to your JDBC URL.
Also, this question might help you: Cannot insert non latin symbols in MySQL
This worked in my case:
Set the environment variable JAVA_TOOL_OPTIONS (instead of _JAVA_OPTIONS) to:
-Dfile.encoding=UTF8

Bad UTF-8 encoding when writing to database (reading is OK)

I have a big problem in my web application, which uses JSF and EclipseLink JPA with a MySQL database.
When I read data from the database, JSF reads and displays my characters in UTF-8 correctly, but the characters written to the database are bad.
For example, input characters: "żźćółzxcv", written in database: "?????zxcv".
But if I write the data to the database manually, for example "żźćółzxcv", then reading it in JSF is perfect.
I tried everything from here: Unicode input retrieved via PrimeFaces input components become corrupted
I discovered that the encoding in JSF is fine and that the problem is in Java, because if I manually set
current.setUwagiZ("żźćóżźćłąśóżźćł TE");
getFacade().edit(current);
in database record is wrong: ???ó??????ó???? TE
I have set characterEncoding and useUnicode in the JDBC resource. Also, when I execute commands with some tools in NetBeans, the encoding is OK and the data in MySQL is in UTF-8, so the connection seems fine.
So the problem is in Java, but I have no idea how to solve it :(
Question marks can occur when the messenger itself is aware of the character encoding used on both sides of the transport, and substitutes a question mark for any character the target encoding cannot represent. That's the difference from mojibake, where it's not the messenger's fault but the producer's and/or consumer's.
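The difference can be reproduced in plain Java. This is a minimal sketch with made-up class and method names, using ISO-8859-1 as an assumed stand-in for whatever non-UTF-8 encoding the JDBC driver picked up from the platform.

```java
import java.nio.charset.StandardCharsets;

class QuestionMarksVsMojibake {
    // The encoder knows the target charset and replaces characters it
    // cannot represent with '?': information is lost, but knowingly.
    static String lossyRoundTrip(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // The bytes are correct UTF-8, but the reader wrongly interprets
    // them as ISO-8859-1: this produces mojibake instead of '?'.
    static String mojibake(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "żźćółzxcv" written with escapes so the source encoding does not matter
        String polish = "\u017C\u017A\u0107\u00F3\u0142zxcv";
        System.out.println(lossyRoundTrip(polish)); // ???ó?zxcv
        System.out.println(mojibake("\u00E3"));     // Ã£ (from ã)
    }
}
```

Note that ó survives the lossy round trip because it exists in ISO-8859-1, which matches the ???ó??????ó???? TE pattern reported in the question.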
In an average web application with a database backend, there are only 2 places where this can happen: communication with the DB and communication with the HTTP client. You've already excluded the HTTP part, which leaves the DB part.
The messenger in the DB part is the JDBC driver. You need to tell the JDBC driver to use UTF-8. MySQL JDBC driver is known to use by default the client platform default encoding, which is in your particular case apparently not UTF-8.
Add the following 2 properties to the JDBC connection:
useUnicode=true
characterEncoding=UTF-8
It's unclear how you've configured the JDBC connection, but if it's "plain vanilla" JDBC, then specify them as a query string in the JDBC URL:
jdbc:mysql://localhost:3306/db_name?useUnicode=true&characterEncoding=UTF-8
Or if it's a container-specific datasource config, then specify them as separate connection properties, exactly the same way as you specify the username and password.
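For the plain JDBC case, the same two properties can also be passed programmatically instead of in the URL. The following is a minimal sketch; the class name, method names, and credentials are placeholders, and it assumes the MySQL driver is on the classpath when open is actually called.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

class Utf8ConnectionSketch {
    // Builds the connection properties, with the two encoding settings
    // placed next to the username and password.
    static Properties utf8Properties(String user, String password) {
        Properties p = new Properties();
        p.setProperty("user", user);
        p.setProperty("password", password);
        p.setProperty("useUnicode", "true");
        p.setProperty("characterEncoding", "UTF-8");
        return p;
    }

    // Opens a connection with those properties; needs a reachable
    // database and the JDBC driver, so it is not called in main.
    static Connection open(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, utf8Properties(user, password));
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        System.out.println(utf8Properties("db_user", "secret").getProperty("characterEncoding"));
    }
}
```

A call such as open("jdbc:mysql://localhost:3306/db_name", user, password) would then behave like the URL form shown above.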
See also:
Unicode - How to get the characters right?

Umlaut characters are not displayed properly in my JSP

I have a JSP with an option for uploading a file. I uploaded a file whose name combines English and umlaut characters. The name is shown in the next JSP, where it displays properly, for example üß_file.xls. The same code displays it as ?_file.xls in the higher environment, i.e. the test environment. I have tried three options:
I set the encoding to UTF-8 in the first line of my JSP.
I changed the html:form attribute (accept character set) to UTF-8.
I included a SetCharacterEncoding servlet filter, which sets the response content type to UTF-8 and the request character encoding to UTF-8. This includes a change in web.xml with a parameter that forces the JSP patterns to UTF-8 encoding.
Please suggest some solutions for this issue in the test environment (it works fine in the DEV and local environments).
Have you checked the encoding of the servlet container? For example, Tomcat might use the platform (OS) encoding, which might not be UTF-8.
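One way to compare environments is to log what the JVM itself reports as its default encoding. This is a minimal sketch with a made-up class name; running it (or logging the same values at application startup) on both DEV and test would show whether the platform encoding differs.

```java
import java.nio.charset.Charset;

class PlatformEncodingCheck {
    // Reports the JVM default charset and the file.encoding system property.
    static String describe() {
        return "default charset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```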

Handling Character Encoding in URI on Tomcat

On the web site I am trying to help with, users can type a URL into the browser containing, for example, the following Chinese characters,
http://localhost:8080?a=测试
On server, we get
GET /a=%E6%B5%8B%E8%AF%95 HTTP/1.1
As you can see, it's UTF-8 encoded, then URL encoded. We can handle this correctly by setting the encoding to UTF-8 in Tomcat.
However, sometimes we get Latin1 encoding from certain browsers,
http://localhost:8080?a=ß
turns into
GET /a=%DF HTTP/1.1
Is there any way to handle this correctly in Tomcat? It looks like the server has to do some intelligent guessing. We don't expect to handle Latin1 correctly 100% of the time, but anything is better than what we are doing now, which is assuming everything is UTF-8.
The server is Tomcat 5.5. The supported browsers are IE 6+, Firefox 2+ and Safari on iPhone.
Unfortunately, UTF-8 encoding is a "should" in the URI specification, which seems to assume that the origin server will generate all URLs in such a way that they will be meaningful to the destination server.
There are a couple of techniques that I would consider; all involve parsing the query string yourself (although you may know better than I whether setting the request encoding affects the query string to parameter mapping or just the body).
First, examine the query string for single "high bytes": a valid UTF-8 sequence for a non-ASCII character must have two or more bytes (the Wikipedia entry has a nice table of valid and invalid bytes).
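The high-byte check can be sketched with a strict UTF-8 decoder that falls back to Latin1 on failure. This is an illustrative sketch under assumptions: the class and method names are made up, the input is taken to be a raw, still percent-encoded query value (for instance a piece of what HttpServletRequest.getQueryString() returns), and Latin1 is assumed to be the only alternative encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class QueryValueDecoder {
    // Turns "%E6%B5%8B"-style input into raw bytes; '+' becomes a space.
    // Assumes the raw value is ASCII, as it is on the wire.
    static byte[] percentDecode(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '%' && i + 2 < raw.length()) {
                int hi = Character.digit(raw.charAt(i + 1), 16);
                int lo = Character.digit(raw.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            out.write(c == '+' ? ' ' : c);
        }
        return out.toByteArray();
    }

    // Tries strict UTF-8 first; if the bytes are not valid UTF-8
    // (e.g. a lone high byte), falls back to ISO-8859-1.
    static String decode(String raw) {
        byte[] bytes = percentDecode(raw);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("%E6%B5%8B%E8%AF%95")); // 测试
        System.out.println(decode("%DF"));                // ß
    }
}
```

The guess is not perfect: some Latin1 byte pairs happen to form valid UTF-8 and would be misread, which is consistent with the question's expectation of less than 100% accuracy.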
Less reliable would be to look at the "Accept-Charset" header in the request. I don't think this header is required (I haven't looked at the HTTP spec to verify), and I know that Firefox, at least, will send a whole list of acceptable values. Picking the first value in the list might work, or it might not.
Finally, have you done any analysis on the logs, to see if a particular user-agent will consistently use this encoding?
