My MySQL database uses utf8_bin, which is limited to 3-byte characters. I'm using a MySQL connector older than 5.1.46.
According to https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-charsets.html, using the "characterEncoding=utf-8" parameter means that the connector uses utf8mb3, which would be correct in my case. The problem is that the application throws a MySQL exception when it tries to write a 4-byte character.
For example:
java.sql.SQLException: Incorrect string value: '\xF0\x9F\x98\x81 \xC4...'
Furthermore, if I don't specify the characterEncoding parameter at all, the connector defaults to some other charset. The problem then is that many characters I need to write into the DB, such as "ğ", are simply replaced by a "?".
So far, the only solution I see is to remove all 4-byte characters from a String before I write it into the database, since changing the charset of the database itself is unfortunately not an option.
But I was wondering if I'm missing something here. Is there a better way to do this?
Thanks a lot
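A minimal sketch of the workaround described above, assuming the goal is simply to drop every character that needs four bytes in UTF-8 (that is, every supplementary code point above U+FFFF) before the string is written. The class and method names are hypothetical.

```java
class FourByteFilter {

    /** Returns a copy of input without code points above U+FFFF (4 bytes in UTF-8). */
    static String stripFourByteChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(cp -> !Character.isSupplementaryCodePoint(cp))
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F601 (bytes F0 9F 98 81 in UTF-8) is removed; "ğ" (U+011F) is kept.
        String withEmoji = "smile \uD83D\uDE01 and \u011F";
        System.out.println(stripFourByteChars(withEmoji));
    }
}
```

Working on code points rather than on chars avoids leaving half of a surrogate pair behind.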
I have a complex situation that I'm trying to deal with involving character encoding.
I have a Perl program which communicates with a Java endpoint via Thrift; the Java side then uses the data to make a request to a legacy PHP service. It's ugly, but it is part of a migration plan, so it needs to work for a short while.
In Perl, a Thrift object is created in which some of the fields are JSON-encoded strings.
The problem is that when Perl makes the request to Java, one of the strings is as follows (this is from Data::Dumper; it is subsequently JSON-encoded and added to the Thrift object):
'offer_message' => "<<>>
&&
\x{c3}\x{82}\x{c2}\x{a9}©
<script>alert(\"XSS\");</script>
https://url.com/imghp?hl=uk",
However, when this data is received on the Java side, the sequence \x{c3}\x{82}\x{c2}\x{a9} has been converted, so in Java we receive the following:
<<>>\\n&&\\nÃ�Â�Ã�©©\\n<script>alert(\"XSS\");</script>\\nhttps://www.google.com.ua/imghp?hl=uk
The problem is that if I pass the second string to the legacy PHP program, it fails; if I pass the string taken from the dump of the Perl hash, it succeeds. So my assumption is that I need to convert the received string to another encoding (correct me if I'm wrong, I'm not sure that this is the right solution).
I've tried taking the parameters received in Java and converting them to every encoding I can think of, but it doesn't work. For example:
byte[] utf8 = templateParams.getBytes("UTF8");
normallisedTemplateParams = new String(utf8, "UTF8");
I've been varying the encoding schemes in the hope I find something that works.
What is the correct way to solve this? For a short time this messy solution is my only option while other re-engineering is happening.
The problem was, in the end, difficult to diagnose but simple to resolve. It turned out that the package I was using to convert in Java was using Java's default encoding of UTF-16. I had to modify the package and force it to use UTF-8. After that, everything worked.
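To illustrate the kind of fix described, here is a minimal sketch that assumes the bytes arriving on the Java side are UTF-8. It passes the charset explicitly wherever bytes become a String, and it shows why the getBytes/new String round trip tried in the question cannot change anything. The class and method names are hypothetical and are not taken from the package mentioned above.

```java
import java.nio.charset.StandardCharsets;

class ExplicitCharsetDemo {

    /** Decodes raw bytes as UTF-8, independent of any platform or library default. */
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Decodes the same bytes as ISO-8859-1, which is how mojibake is produced. */
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] copyright = {(byte) 0xC2, (byte) 0xA9}; // UTF-8 bytes of the copyright sign

        String good = decodeUtf8(copyright);   // a single character, U+00A9
        String bad = decodeLatin1(copyright);  // two characters, U+00C2 U+00A9

        // Encoding and decoding with the same charset is a no-op on the string.
        String roundTrip = new String(bad.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);

        System.out.println(good.length() + " " + bad.length() + " " + roundTrip.equals(bad));
    }
}
```

Re-encoding the wrongly decoded string as UTF-8 yields the bytes C3 82 C2 A9, the same sequence that appears in the Perl dump.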
Why do some people prefer to use the As400Text object to handle EBCDIC/ASCII conversion (Java code with the IBM MQ jars) when we already have the MQC.MQGMO_CONVERT option to handle this?
My requirement is to convert ASCII->EBCDIC during the PUT operation, which I do by setting the character set to 37 and the write format to "STRING", and to use the MQC.MQGMO_CONVERT option to convert EBCDIC->ASCII automatically during the GET operation.
Is there any downside to using the convert option? Could anyone please let me know if this is not a 100 percent safe option?
Best practice is to write the MQ message in your local code page (where the CCSID and Encoding will normally be filled in automatically with the correct values) and to set the Format field. The getter should then use MQGMO_CONVERT to request the message in the CCSID and Encoding they need.
Get with Convert is safe, and will be correct as long as you provide the correct CCSID and Encoding to describe the message when you put it.
In your question, you describe converting from ASCII->EBCDIC before putting the message, and then the getter converts from EBCDIC->ASCII on the MQGET. This means you have paid for two data conversion operations when you could have done none (or only one, if the two sides use different ASCII code pages).
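As a rough illustration of what each of those two conversions does, the sketch below converts text to and from CCSID 37 bytes in plain Java. It does not use the IBM MQ classes at all; it assumes the JDK in use provides the IBM037 charset (it is among the extended charsets in common JDK builds), and the class name is hypothetical.

```java
import java.nio.charset.Charset;

class Ebcdic037Demo {

    // Assumption: the running JDK supports the IBM037 (EBCDIC, CCSID 37) charset.
    static final Charset EBCDIC_037 = Charset.forName("IBM037");

    /** First conversion: text to EBCDIC bytes, as done before the put. */
    static byte[] toEbcdic(String text) {
        return text.getBytes(EBCDIC_037);
    }

    /** Second conversion: EBCDIC bytes back to text, as done on the get. */
    static String fromEbcdic(byte[] bytes) {
        return new String(bytes, EBCDIC_037);
    }

    public static void main(String[] args) {
        byte[] onTheWire = toEbcdic("HELLO 123");
        System.out.println(fromEbcdic(onTheWire));
    }
}
```

If both applications already use the same code page, neither step is needed, which is the point made above.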
I made a program and, for "protection", set some parameters. The first parameter is the date until which the program will work (something like a trial, but with a fixed date), and the second one is the serial number of the HDD on which the program runs (I had some problems with other hardware serials).
Now I need to be able to change these values after the program has been compiled.
I tried adding a login that accepts anything and runs the program with the default values. Only if I log in with my own user/pass values does it allow me to change the default values. After that, on every start, the program checks against the new values I entered earlier.
If someone understands what I want and what I tried, tell me whether this is possible, or whether there is some other, easier or better solution.
Put your "constant" part in a String which must have a fixed size.
That string should contain only ASCII characters; for maximum flexibility I'd use a base64-encoded string, so you can even store encrypted binary data.
Compile the class.
If you open the .class file with a hexadecimal editor, you can see and change that string.
Edit the .class file with the hexadecimal editor, putting in the new ASCII values. Take care not to change the size of the String.
Note also that this approach can be hacked by anyone who knows what to look for :)
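The fixed-size constant described in the steps above could be produced and read back as sketched here. This is only an illustration under assumptions: the date|serial layout, the 20-character serial width, and all names are hypothetical. The output of encode is what would be pasted into the class as the String literal and later overwritten in the .class file.

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.Base64;

class PatchableSettings {

    // Hypothetical fixed width, so every encoded payload has the same length.
    static final int SERIAL_WIDTH = 20;

    /** Builds the base64 payload "yyyy-MM-dd|serial" with the serial padded to a fixed width. */
    static String encode(LocalDate expiry, String serial) {
        if (serial.length() > SERIAL_WIDTH) {
            throw new IllegalArgumentException("serial longer than " + SERIAL_WIDTH);
        }
        String padded = String.format("%-" + SERIAL_WIDTH + "s", serial);
        String plain = expiry + "|" + padded;
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.US_ASCII));
    }

    /** Returns { expiry date as ISO text, serial without padding }. */
    static String[] decode(String payload) {
        byte[] raw = Base64.getDecoder().decode(payload);
        String plain = new String(raw, StandardCharsets.US_ASCII);
        int sep = plain.indexOf('|');
        return new String[] { plain.substring(0, sep), plain.substring(sep + 1).trim() };
    }
}
```

Because the plain text is always 31 characters long, every payload has the same encoded length, so a new value can replace the old one in the file without changing the size of the String.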
Hi,
Let's say you have VARCHAR values in a database column that are cAmeLCaSe and you always want to display them in UPPERCASE in a view.
Is it better to select those entries using, for example, Oracle's UPPER function,
or to loop over the results and call the .toUpperCase() method in the Java code after the selection has been made?
I know it's a bit of a general question, and I will of course comment once I have made performance measurements of the two possibilities above. But I am more after a good source of information that addresses such questions in general (for example, "is it better to run sorting DB-side or in program code?") for common stacks like .NET/Java and Oracle/MS SQL Server.
Many thanks for taking the time to read this question. I appreciate any input and wish you a great day.
Regards
Jan
It depends on where and how the uppercased value is used.
If this is only used in the frontend (I assume with "view" you did not mean a database view) then I'd go for a toUpperCase() ideally using the user's locale.
If you are using the uppercase value for comparison, I'd use the Oracle function to ensure that you have consistent behaviour. I'm thinking of, e.g., a condition where you compare the column value to a string constant: WHERE upper(foobar) = upper('SomeValue'). If you used Java's toUpperCase(), it might apply different (locale-dependent) rules than Oracle would use.
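A minimal sketch of the locale dependence on the Java side; the class and method names are hypothetical. Turkish is the usual example, because it uppercases "i" to a dotted capital İ (U+0130) rather than to "I".

```java
import java.util.Locale;

class UpperCaseLocaleDemo {

    /** Uppercases a value using the rules of the given locale. */
    static String upperFor(String value, Locale locale) {
        return value.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(upperFor("title", Locale.ROOT)); // locale-neutral rules
        System.out.println(upperFor("title", turkish));     // Turkish rules: dotted capital I
    }
}
```

For display, the user's locale is usually the right choice; for comparisons, a fixed locale such as Locale.ROOT keeps the result independent of the machine's default locale.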
I believe my code should always be database independent.
String upper = string.toUpperCase();
This is database independent: if I move to some other database, I do not need to change my code.
In a nutshell, your specific requirements should be taken into consideration.
I am using the REPLACE function to edit text in a MySQL database, and it works well
whenever I replace a string with some other string, e.g.
REPLACE(Eligibility_Points , '(ii)', 'second point is')";
That works well for the case above,
but it does not work in the following case:
REPLACE(Eligibility_Points , '(ii)-(iii)', 'second and third point is')";
How should I fix this problem? Thanks for your help.
Assuming that this is the MySQL REPLACE string function you are talking about, the only reason I can see why the second example wouldn't work is that (maybe) the Eligibility_Points field (or whatever) doesn't contain the first string at all.
Maybe you could provide more context; e.g. what evidence you have that the replace isn't working.
However, @vadchen makes a good point. If you do the replacement in the first example, it will remove all occurrences that might trigger a replacement in the second example. Maybe you just need to do the "edits" in the reverse order.
There is no need to escape any of the characters in those fragments, either from the Java or SQL perspective.
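The ordering effect can be reproduced in plain Java. This sketch uses String.replace, which also does literal (non-regex) replacement, as a stand-in for chained MySQL REPLACE calls; the class and method names are hypothetical.

```java
class ReplaceOrderDemo {

    /** Short pattern first: the longer pattern can no longer match afterwards. */
    static String shortFirst(String text) {
        return text.replace("(ii)", "second point is")
                   .replace("(ii)-(iii)", "second and third point is");
    }

    /** Long pattern first: both replacements get a chance to apply. */
    static String longFirst(String text) {
        return text.replace("(ii)-(iii)", "second and third point is")
                   .replace("(ii)", "second point is");
    }

    public static void main(String[] args) {
        System.out.println(shortFirst("(ii)-(iii)"));
        System.out.println(longFirst("(ii)-(iii)"));
    }
}
```

With the short pattern applied first, the text becomes "second point is-(iii)" and the second replacement finds nothing to match.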