Java Special character encoding issue

I tried to insert a special character into an Oracle table from Java and then read it back, assuming my encoding would work.
Below is the code I tried.
String s=new String("yesterday"+"\u2019"+"s");
...
statement.executeUpdate("INSERT into test1 values ('"+s+"')");
ResultSet rs=statement.executeQuery("select * from test1");
while (rs.next()) {
System.out.println(new String(rs.getString(1).getBytes("UTF-8"),"UTF-8"));
}
...
Now, when I run this from the command line, the output always shows the special character garbled: yesterdayâ€™s
My question is: why is the expected result, yesterday’s, not shown even though I specify an encoding? Is the code above incorrect, or is some modification required?
P.S.: In Eclipse the code may print yesterday’s, but when executed from the command line it shows yesterdayâ€™s
I am using:
-- JDK 1.6
-- Oracle: 11.1.0.6.0
-- NLS_DATABASE_PARAMETERS: NLS_CHARACTERSET WE8MSWIN1252
-- Windows
Edit:
\u2019 is RIGHT SINGLE QUOTATION MARK, and that is specifically the character I am looking for.

Check the Java system property "file.encoding" when you run from the command line. It may be set to something other than "UTF-8", which would cause the text to display incorrectly in the console.
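One way to check this is to print the JVM defaults and to reproduce the garbling on purpose. The sketch below is not from the original post: the class name EncodingCheck and the method name misdecode are made up for illustration. It assumes the bytes are written as UTF-8 and read back as windows-1252, which is consistent with the yesterdayâ€™s output shown in the question.

```java
import java.nio.charset.Charset;

public class EncodingCheck {

    // Encodes the text as UTF-8, then decodes those bytes as windows-1252.
    // This mimics a console that interprets UTF-8 output with the wrong charset.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(Charset.forName("UTF-8"));
        return new String(utf8Bytes, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // What the JVM believes the platform encoding is.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());

        // U+2019 becomes three UTF-8 bytes (E2 80 99); read as windows-1252
        // they turn into three separate characters.
        System.out.println(misdecode("yesterday\u2019s"));
    }
}
```

If the reported default differs from what the console expects, starting the JVM with -Dfile.encoding set to the console's charset is one commonly used adjustment.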

Here is an illustration of what I suggested in a comment (change the character set of your client). Straight from my SQL*Plus:
SQL> select unistr('\2019') from dual;
U
-
Æ
SQL> $chcp 1252
Active code page: 1252
SQL> select unistr('\2019') from dual;
U
-
’
If this works for you, you may want to add $chcp 1252 to your [g]login.sql.

The problem is that the code point for the plain apostrophe is \u0027, not \u2019.
I ran this on the command line:
public class Yesterday {
    public static void main(String[] args) {
        String s = new String("yesterday" + "\u0027" + "s");
        System.out.println(s);
    }
}
it resulted in:
yesterday's
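Since the question asks specifically for \u2019, it may help to see which charsets can represent each of the two characters. This is a minimal sketch, not part of the original answer; the class name QuoteEncodingCheck and the helper canEncode are hypothetical. Oracle's WE8MSWIN1252 corresponds to Windows code page 1252, which Java exposes as "windows-1252".

```java
import java.nio.charset.Charset;

public class QuoteEncodingCheck {

    // Reports whether the named charset has a byte representation for the character.
    static boolean canEncode(String charsetName, char c) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        char apostrophe = '\u0027'; // ASCII apostrophe
        char rightQuote = '\u2019'; // RIGHT SINGLE QUOTATION MARK

        String[] names = {"US-ASCII", "ISO-8859-1", "windows-1252", "UTF-8"};
        for (String name : names) {
            System.out.println(name
                    + ": U+0027=" + canEncode(name, apostrophe)
                    + ", U+2019=" + canEncode(name, rightQuote));
        }
    }
}
```

The apostrophe is encodable everywhere, while U+2019 is available in windows-1252 and UTF-8 but not in US-ASCII or ISO-8859-1, so substituting \u0027 sidesteps the display problem rather than solving it.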

Related

Unable to capture next line character in Java

I need to parse a Python file that contains multiple SQL queries and find the start and end position of each query, so that I can extract only the query part using Java.
I am using .contains to check for sql(''' as the opening of a query. For the closing I have ''') but in some cases ''') appears inside the query when a variable is involved, and that should not be detected as the end of the query.
Something like this:
spark.sql(''' SELECT .......
FROM.....
WHERE xxx IN ('''+ Variable +''')
''')
Here the second-to-last line is also detected as the end of the query if I use line.contains(" ''') "), which is wrong.
All I can think of is to check for the newline character at the end of the query, since the queries are separated by two empty lines. So I tried if (line.contains(" ''')\n")) and if (line.contains(" ''')\r\n")), but neither works for me.
Kindly let me know of any other way to do this.
Note that I do not have permission to change the query file.
Thanks

I believe a simple contains won't solve this problem.
You will have to use Pattern if you want to match \n.
String query = "spark.sql(''' SELECT .......\n" +
        "FROM..... \n" +
        "WHERE xxx IN ('''+ Variable +''')\n" +
        "''')";
// The dot in spark.sql is escaped so that it matches a literal "." only.
Pattern pattern = Pattern.compile("^spark\\.sql\\('''(.*)'''\\)$", Pattern.DOTALL);
System.out.println(pattern.matcher(query).find());
Output:
true
Pattern.DOTALL tells Java to allow the dot to match newline characters, too.
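For a file with several queries, the same idea can be extended with a reluctant quantifier and Pattern.MULTILINE. The sketch below rests on an assumption that is not confirmed by the question: the closing ''') of each query stands alone on its own line, as in the sample, while an inner ''') always follows other text on its line. The class name SparkSqlExtractor and the method extractQueries are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SparkSqlExtractor {

    // DOTALL lets "." cross line breaks; MULTILINE makes ^ and $ match at line boundaries.
    // Assumption: the closing ''') is the only content on its line.
    private static final Pattern QUERY = Pattern.compile(
            "spark\\.sql\\('''(.*?)^[ \\t]*'''\\)[ \\t]*$",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Returns the text between each opening spark.sql(''' and its closing ''').
    static List<String> extractQueries(String source) {
        List<String> queries = new ArrayList<String>();
        Matcher m = QUERY.matcher(source);
        while (m.find()) {
            queries.add(m.group(1).trim());
        }
        return queries;
    }

    public static void main(String[] args) {
        String source =
                "spark.sql(''' SELECT a\n"
                + "FROM t\n"
                + "WHERE xxx IN ('''+ Variable +''')\n"
                + "''')\n"
                + "\n\n"
                + "spark.sql(''' SELECT b\n"
                + "FROM u\n"
                + "''')\n";

        Matcher m = QUERY.matcher(source);
        while (m.find()) {
            // start() and end() give the positions of the whole spark.sql(...) call.
            System.out.println("query at [" + m.start() + ", " + m.end() + "): " + m.group(1).trim());
        }
    }
}
```

If the closing delimiter can share a line with other text, this pattern will not find it and a small parser that tracks the quoting state would be a safer choice.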

AS400 SQL Script on a parameter file returns

I'm integrating an application to the AS400 using Java/JT400 driver. I'm having an issue when I extract data from a parameter file - the data retrieved seems to be encoded.
SELECT SUBSTR(F00001,1,20) FROM QS36F."FX.PARA" WHERE K00001 LIKE '16FFC%%%%%' FETCH FIRST 5 ROWS ONLY
Output
00001: C6C9D9C540C3D6D4D4C5D9C3C9C1D34040404040, - 1
00001: C6C9D9C5406040C3D6D4D4C5D9C3C9C1D3406040, - 2
How can I convert this to a readable format? Is there a function which I can use to decode this?
On the terminal connection to the AS400 the information is displayed correctly through the same SQL query.
I have no experience working with AS400 before this and could really use some help. This issue is only with the parameter files. The database tables work fine.

What you are seeing is EBCDIC output instead of ASCII. This is due to the CCSID not being specified in the database as mentioned in other answers. The ideal solution is to assign the CCSID to your field in the database. If you don't have the ability to do so and can't convince those responsible to do so, then the following solution should also work:
SELECT CAST(SUBSTR(F00001,1,20) AS CHAR(20) CCSID 37)
FROM QS36F."FX.PARA"
WHERE K00001 LIKE '16FFC%%%%%'
FETCH FIRST 5 ROWS ONLY
Replace the CCSID with whichever one you need. The CCSID definitions can be found here: https://www-01.ibm.com/software/globalization/ccsid/ccsid_registered.html

Since the file is in QS36F, I would guess that the file is a flat file and not externally defined ... so the data in the file would have to be manually interpreted if being accessed via SQL.
You could try casting the field, after you substring it, into a character format.
(I don't have a S/36 file handy, so I really can't try it)

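The output is the hexadecimal representation of bytes of text in EBCDIC, the AS/400 character encoding. You can decode it in Java:
import java.nio.charset.Charset;

static String fromEbcdic(String hex) {
    int m = hex.length();
    if (m % 2 != 0) {
        throw new IllegalArgumentException("Must be even length");
    }
    int n = m / 2;
    byte[] bytes = new byte[n];
    for (int i = 0; i < n; ++i) {
        // Each pair of hex digits is one EBCDIC byte.
        int b = Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
        bytes[i] = (byte) b;
    }
    // Decode the raw bytes with the EBCDIC code page Cp500.
    return new String(bytes, Charset.forName("Cp500"));
}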
For example, pass "C6C9D9C540C3D6D4D4C5D9C3C9C1D34040404040" to decode the first row.
To convert a whole file, read it with Cp500 as the charset:
Path path = Paths.get("...");
List<String> lines = Files.readAllLines(path, Charset.forName("Cp500"));
Line endings on the AS/400 are the NEL character, U+0085. To normalize them, one can use a regex:
content = content.replaceAll("\\R", "\r\n");
The regex \R (available since Java 8) matches exactly one line break, whether \r, \n, \r\n, or \u0085.
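To try the approach against the two rows from the question, a small self-contained sketch follows. The class name EbcdicDemo and the method names hexToText and normalizeLineBreaks are hypothetical, and it assumes the Cp500 charset is available in the running JVM, which depends on the JDK's extended charsets being present.

```java
import java.nio.charset.Charset;

public class EbcdicDemo {

    // Assumption: Cp500, as used in the answer above, is installed in this JVM.
    private static final Charset EBCDIC = Charset.forName("Cp500");

    // Turns a string of hex digit pairs into bytes and decodes them as EBCDIC.
    static String hexToText(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Must be even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, EBCDIC);
    }

    // Replaces every line break recognised by \R (Java 8+) with CR LF.
    static String normalizeLineBreaks(String content) {
        return content.replaceAll("\\R", "\r\n");
    }

    public static void main(String[] args) {
        // The two hex strings shown in the question's output.
        String[] rows = {
            "C6C9D9C540C3D6D4D4C5D9C3C9C1D34040404040",
            "C6C9D9C5406040C3D6D4D4C5D9C3C9C1D3406040"
        };
        for (String row : rows) {
            // Brackets make the trailing EBCDIC spaces (0x40) visible.
            System.out.println("[" + hexToText(row) + "]");
        }
    }
}
```

The decoded values keep their trailing blanks because 0x40 is the EBCDIC space, so calling trim() on the result is usually wanted for fixed-width fields like these.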

A Big thank you for all the answers provided, they are all correct.
It is a flat parameter file in the AS400 and I have no control over changing anything in the system. So it has to be at runtime of the SQL query or once received.
I had absolutely no clue about what the code page was as I have no prior experience with AS400 and files in it. Hence all your answers have helped resolve and enlighten me on this. :)
So, the best answer is the last one. I have changed the SQL as follows and I get the desired result.
SELECT CAST(F00001 AS CHAR(20) CCSID 37) FROM QS36F."FX.PARA" WHERE K00001 LIKE '16FFC%%%%%' FETCH FIRST 5 ROWS ONLY
00001: FIRE COMMERCIAL , - 1
00001: FIRE - COMMERCIAL - , - 2
Thanks once again.
Dilanke

Croatian character in Java standard output

I have a database with some Croatian characters in it, such as Đ. The character is stored correctly in the database, and a PrimeFaces datatable shows it on the web page just fine.
The problem is that when I send it to System.out.println(), the character Đ in the name is missing.
for (People p : people) {
    System.out.println(p.getName());
}
I tried converting the name with p.getName().getBytes("ISO-8859-2"), but it is still not working.

I assume you are using UTF-8 as the default encoding for both the database and PrimeFaces.
Also have a look at this:
Display special characters using System.out.println
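One approach commonly suggested for this kind of problem is to wrap System.out in a PrintStream with an explicit encoding, so the output no longer depends on the platform default. The sketch below is an illustration under the assumption that the console itself displays UTF-8; the class name Utf8Console and its methods are made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {

    // Wraps any output stream in a PrintStream that always encodes text as UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Shows which bytes a given text produces when printed through the UTF-8 stream.
    static byte[] encodeViaPrintStream(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = utf8Stream(buffer);
        out.print(text);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        // \u0110 is the capital letter D with stroke, \u0111 the small one.
        out.println("\u0110or\u0111e");
    }
}
```

If the console uses a different charset, such as a Windows code page, pass that charset name instead of "UTF-8" or switch the console's code page.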

How to pass all kind of SQL-Queries via Command Line/Batch Args[] into Java?

I want to write a small .jar that acts as a "translator" for SQL queries directed at a DB2 database on z/OS.
My goal is for the application to accept SQL queries as command-line arguments, entered manually or via shell script/cron, alongside other parameters such as IP, port, user, etc.
Is there a way to leave those arguments unaffected while passing them to the jar?
Example:
java -jar db2sql.jar SQL=={SELECT * FROM TABLE1 TAB1, TABLE2 TAB2 WHERE TAB1.XYZ = TAB2.ZYX AND TAB2.ABC LIKE 'blabla' AND TAB1.DATE >= '01.01.2015'} IP=={192.168.0.1} User=={Santa} Password=={CLAUS}
(Please ignore that this statement is nonsensical; I hope you get the idea.)
My problem is reading those command-line parameters, mainly special characters like * , " ' etc.
Questions:
Is there a list of all characters in SQL queries that must be escaped?
Is there a special character that can be used as a delimiter and will never occur in an SQL query?
Is it possible to pass any kind of SQL statement as ONE argument?
Is it possible to leave special characters unhandled, e.g. argument "*" = String "*", and not .classpath etc.?
Kind Regards

Although I wouldn't recommend what you're trying to do for several reasons, at least in a *NIX environment you could just use the standard way.
java -jar foo.jar -s "SELECT * FROM SOMETHING WHERE foo = 2" -u username -h hostname
You can use additional libraries to parse the parameters, but this way you would use -s to specify the SQL query, and wrap the param value in " to make it a single argument with automatic escape.
In your main method you can then get the full query with (simplified):
if (args[0].equals("-s"))
    sqlString = args[1];
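A slightly fuller version of that idea, without any extra library, could look like the sketch below. It is only an illustration: the class name ArgParser and the method parse are hypothetical, and it assumes every option is given as a flag followed by exactly one value, with the shell quoting already applied as described above.

```java
import java.util.HashMap;
import java.util.Map;

public class ArgParser {

    // Pairs each "-flag" with the single argument that follows it.
    static Map<String, String> parse(String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Every flag needs a value");
        }
        Map<String, String> options = new HashMap<String, String>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-")) {
                throw new IllegalArgumentException("Expected a flag but got: " + args[i]);
            }
            options.put(args[i], args[i + 1]);
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        // The quoted SQL arrives as one argument, with * and ' left untouched by Java.
        System.out.println("SQL:  " + options.get("-s"));
        System.out.println("User: " + options.get("-u"));
        System.out.println("Host: " + options.get("-h"));
    }
}
```

Java itself does not interpret the arguments; any expansion of characters such as * happens in the shell before the JVM starts, which is why the quoting matters.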

What is the MySQL SQL REGEX for this regex

Regular regex:
foo(\((\d{1}|\d{2}|\d{3})\))?
This regex works in Java:
foo(\\((\\d{1}|\\d{2}|\\d{3})\\))?
Examples:
fooa //no match
foo(1)a //no match
foo(a) //no match
foo(1) //match
foo(999) //match
foo //match
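The examples above can be checked in Java with a short sketch. It assumes whole-string matching via Matcher.matches(), which is what the "no match" cases such as foo(1)a imply, and it writes \d{1}|\d{2}|\d{3} as the equivalent \d{1,3}. The class name FooRegexCheck and the method isMatch are made up for illustration.

```java
import java.util.regex.Pattern;

public class FooRegexCheck {

    // "foo", optionally followed by one to three digits in parentheses.
    private static final Pattern FOO = Pattern.compile("foo(\\(\\d{1,3}\\))?");

    // matches() requires the whole input to fit the pattern, not just a part of it.
    static boolean isMatch(String input) {
        return FOO.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"fooa", "foo(1)a", "foo(a)", "foo(1)", "foo(999)", "foo"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isMatch(sample));
        }
    }
}
```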
MySQL 5.5 documentation (https://dev.mysql.com/doc/refman/5.5/en/regexp.html) says
Note:
Because MySQL uses the C escape syntax in strings (for example, “\n” to
represent the newline character), you must double any “\” that you use
in your REGEXP strings.
I tried as a test running the following on MySQL 5.x
select 'foo' REGEXP 'foo(\\((\\d{1}|\\d{2}|\\d{3})\\))?'
Here is the error message I get:
Error: You have an error in your SQL syntax; check the manual
that corresponds to your MySQL server version for the right syntax to
use near ''foo(\\([(]\\d{1}' at line 1
I looked at Adapting a Regex to work with MySQL and tried the suggestion of replacing \d{1} etc.. with [0-9] which gave me:
select 'foo' REGEXP 'foo(\\(([0-9]|[0-9]|[0-9])\\))?'
But still getting MySQL death.

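I don't have a MySQL console immediately available to verify, but this should work:
'^foo(\\([[:digit:]]{1,3}\\))?$'
MySQL 5.x REGEXP does not understand \d, so the POSIX class [[:digit:]] is used instead, and the ^ and $ anchors are there because REGEXP otherwise matches anywhere in the string.
Your other regexes have groups around both (123) and 123. It doesn't look like you need the inner one in MySQL (does it even support capture groups?), and that might be what MySQL is choking on.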

Popping in because I ran into this and found the problem/solution.
Go to Global Preferences -> MySQL tab. Under "Use Custom Query Tokenizer" there is a "Procedure/Function Separator." If that is "|", change it to something else (like "/"). This is what's causing SQuirreL to fail when parsing the REGEX.
