Trying to print "white smiling face" using UTF-16 character code \u263A

I am trying to print the "white smiling face" to the console window using the following line of code in Java:
System.out.println( '\u263A' );
I do not get the smiley but some other character that looks a little like a question mark.
I am running Windows 7 Pro with JDK and JRE version 1.8.0_66. Any hints as to why?
Note: I am using the Consolas font in the console window, which maps this code to the expected glyph according to the Character Map dialog.

This is not really a problem in your code. As commenters have pointed out, there is a difference between writing a Unicode code point and how your applications or OS choose to render a sequence of bytes as a character. Here is what I get on Mac:
> javac TestWhiteSmilingFace.java && java TestWhiteSmilingFace
☺
The Windows console, however, does not handle Unicode output by default. Instead, it operates on Windows code pages.
If you are willing to pipe output to a separate file and then open it in Notepad, then here is an approach that has worked successfully for me.
Start cmd.exe with the /U option. As discussed in the cmd documentation, this option forces command output redirected to a file to be in Unicode.
Redirect the command output to a file, i.e. java TestWhiteSmilingFace > TestWhiteSmilingFace.txt.
Open the file in Notepad, i.e. notepad TestWhiteSmilingFace.txt.
This prior answer discusses the Windows console Unicode limitation in more detail and also suggests using the PowerShell Integrated Scripting Environment as a potential workaround.
Printing Unicode characters to the PowerShell prompt
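As a rough complement to the redirect-to-file approach, the Java side can also fix the output encoding explicitly instead of relying on the platform default. The sketch below is only an illustration: the class name SmileyUtf8, the helper encode, and the choice of UTF-8 are assumptions, not part of the original answer. Whether a console then shows the glyph still depends on its code page and font.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class SmileyUtf8 {
    // Hypothetical helper: returns the bytes a UTF-8 PrintStream emits for the text.
    static byte[] encode(String text) throws UnsupportedEncodingException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true, "UTF-8");
        out.print(text);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Wrap System.out so the encoding no longer depends on the default charset.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println('\u263A');
    }
}
```

Redirecting this program's output to a file, as in the steps above, should then produce a UTF-8 file regardless of the console's code page.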

Related

How to fix Java args not getting Japanese characters properly in string from Windows Explorer?

On Windows 10, I have a shortcut file in the "SendTo" directory. It is a shortcut to a .bat file.
The .bat file contains just the command "python <filepath> %*" or "java -jar <filepath> %*".
When I select and right click file(s) from Windows Explorer and have it sent to this shortcut file, it will run the program from <filepath> with the selected file(s) as arguments.
I am trying to send files with filenames containing Japanese characters as arguments. The filenames are passed to python programs just fine, but for Java programs, the args for the filenames are messed up and the Java program cannot find the file.
For example, in Java and with locale of Japan, a filename of Filename ファイル名.txt becomes Filename 繝輔ぃ繧､繝ｫ蜷�.txt in the args. Other locales also do not work. The result is the same if I send the args to python and then from python to Java.
How can I make Java receive the proper filename, or otherwise find the file?

You are encountering an unresolved issue with Java. See open bug JDK-8124977 cmdline encoding challenges on Windows which consolidates several problems related to passing Unicode arguments to a Java application from the command line.
Java 18 (to be released next month) resolves some UTF-8 issues with the implementation of JEP 400: UTF-8 by Default, but unfortunately not your specific problem. From the "Goals" of JEP 400:
Standardize on UTF-8 throughout the standard Java APIs, except for console I/O. [Emphasis mine]
However, there is a workaround. See Netbeans Chinese characters in java project properties run arguments, and in particular this answer which successfully processes Chinese characters passed as command line arguments using JNA (Java Native Access). From that answer:
JNA allows you to invoke Windows API methods from Java, without using
native code. So in your Java application you can call Win API methods
such as GetCommandLineW() and CommandLineToArgvW() directly, to access
details about the command line used to invoke your program, including
any arguments passed. Both of those methods support Unicode.
So the code in that answer does not read the arguments passed to main() directly. Instead it uses JNA to invoke the Win API methods to access them.
While that code was processing Chinese characters passed as arguments from the command line, it would work just as well for Japanese characters, including your Japanese filenames.
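To see what kind of damage the arguments have suffered, it can help to reproduce the garbling in isolation. The garbled name in the question looks like UTF-8 bytes decoded with a Japanese legacy charset; that is an inference from the sample, not something the answer states. The sketch below assumes Shift_JIS as that charset, and the class and method names are hypothetical. It is a diagnostic only, not a replacement for the JNA workaround, and the reversal is lossy whenever a byte could not be decoded (as with the last character of the question's filename).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeCheck {
    // Decodes the UTF-8 bytes of the text with the wrong charset, reproducing the garbling.
    static String garble(String text, Charset wrong) {
        return new String(text.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // Reverses the garbling; lossless only if every byte was decodable in the wrong charset.
    static String ungarble(String garbled, Charset wrong) {
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumption: Shift_JIS stands in for the Japanese Windows code page.
        Charset sjis = Charset.forName("Shift_JIS");
        // The katakana word "fairu" (file), written with escapes to keep the source ASCII.
        String original = "\u30D5\u30A1\u30A4\u30EB";
        String garbled = garble(original, sjis);
        System.out.println("garbled equals original: " + garbled.equals(original));
        System.out.println("restored equals original: " + ungarble(garbled, sjis).equals(original));
    }
}
```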

Java - working with different cmd charsets

I want to read a file path from the user in a Java console application.
Some of the file paths may contain Hebrew characters.
How can I read the input from the command line when I don't know the encoding charset?
I have spent some time searching the web and have not found a solution that works dynamically on every platform.

If you are using Windows, first check that the terminal encoding supports Hebrew.
To do this, type chcp in the console.
The output should report code page 28598.
If you see a different number, type chcp 28598.
Now your console encoding is set to Hebrew and you should be able to type the path in Hebrew without getting an exception.
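Once the console code page is known, the Java side can decode standard input with a matching charset instead of the platform default. The following is a minimal sketch under that assumption. The class name, the readLine helper, and the convention of passing the charset name as the first program argument are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class ConsolePathReader {
    // Reads one line from the stream, decoding its bytes with the named charset.
    static String readLine(InputStream in, String charsetName) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, Charset.forName(charsetName)));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical convention: the first argument names the console charset;
        // otherwise fall back to the JVM default.
        String charsetName = args.length > 0 ? args[0] : Charset.defaultCharset().name();
        System.out.print("Path: ");
        String path = readLine(System.in, charsetName);
        System.out.println("Read " + (path == null ? 0 : path.length()) + " characters");
    }
}
```

Passing the charset in explicitly keeps the guess about the console encoding in one place, which makes it easier to adjust per platform.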

Junk character output even after encoding

So, I have been trying to use Spanish characters in my program, but wherever I use them, Java prints '?'.
I am using Slackware, and executing my code there.
I updated lang.sh, and added: export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8
After this when I tried printing, it did not print the question marks, but other junk characters. I printed the default Charset on screen, and it has been successfully set, but it is not printing properly.
Help?
Thanks!
EDIT: I'm writing code in windows on NetBeans, and executing .class or .jar on slackware.
Further, I cannot seem to execute locale command. I get error "bash: locale: command not found".
This is what confuses me: when I echo any special characters on the Slackware console, they are displayed perfectly, but when I run a Java program that simply prints its command line arguments (and I enter the special characters as command line input), it outputs garbage.

If you are using an ssh client such as PuTTY, check that it is using a UTF-8 charset as well.

invalid Xsl format (or) file name

What I want to do is list all the installed applications on a computer, and I have decided to use the /output switch of wmic at the command prompt from Java. My code worked properly on my computer with this line:
Process proc = rt.exec("wmic /output:C:\\Users\\Public\\Documents\\list.csv product get name,version /format:csv ");
However, when I run the program on another computer, I get the "Invalid XSL format or file name" error. After reading about similar problems, I added this line before the code above:
proc2 = rt.exec("xcopy /y C:\\Windows\\System32\\wbem\\en-US\\*.xsl C:\\Windows\\System32\\");
But nothing changed; the error is still there. Can anyone help me with this problem?

This is a bug in Windows 7 WMIC. When you use (for example) Dutch regional settings in an English Windows installation, WMIC searches for the xsl files inside C:\Windows\System32\wbem\nl-NL, instead of C:\Windows\System32\wbem\en-US where they are.
Workarounds:
Copy or move the C:\Windows\system32\wbem\en-US\*.xsl files up into the C:\Windows\system32\wbem\ folder.
Change your regional settings to match your Windows language version, log out and back in.
Specify the full path: WMIC process get /format:"%WINDIR%\System32\wbem\en-US\csv".
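The third workaround can be applied from the Java code in the question by passing the full stylesheet path to /format. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, %WINDIR% is resolved with System.getenv because no shell expands it when Java starts the process directly, and the path is left unquoted on the assumption that it contains no spaces.

```java
import java.util.ArrayList;
import java.util.List;

public class WmicCommand {
    // Builds the wmic argument list with an explicit path to the en-US csv stylesheet.
    static List<String> build(String windowsDir, String outputFile) {
        List<String> command = new ArrayList<>();
        command.add("wmic");
        command.add("/output:" + outputFile);
        command.add("product");
        command.add("get");
        command.add("name,version");
        command.add("/format:" + windowsDir + "\\System32\\wbem\\en-US\\csv");
        return command;
    }

    public static void main(String[] args) throws Exception {
        String windowsDir = System.getenv("WINDIR");
        if (windowsDir == null) {
            System.out.println("WINDIR is not set; this sketch only applies on Windows.");
            return;
        }
        List<String> command = build(windowsDir, "C:\\Users\\Public\\Documents\\list.csv");
        // inheritIO forwards wmic's own messages to this program's console.
        Process process = new ProcessBuilder(command).inheritIO().start();
        System.out.println("wmic exit code: " + process.waitFor());
    }
}
```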

Special Characters [^M] appearing at the end of line in file if seen on unix but not when seen in eclipse

As described in the title, if I view a file on Unix I see special characters such as ^M at the end of every line, but if I view the same file in Eclipse I do not see those characters.
How can I remove those characters from the file? If I am using Eclipse to edit the file, do I need to change anything in the Eclipse preferences?
Any guidance would be highly appreciated.
Update:
Yes, it was indeed a carriage return issue, and the following command helped me sort it out:
dos2unix file1.sh>file2.sh
Here file2.sh is the converted file, without any carriage returns.
You may get warnings like:
could not open /dev/kbd to get keyboard type US keyboard assumed
could not get keyboard type US keyboard assumed
The following command suppresses the warnings:
dos2unix -437 file1.txt>file2.txt

You have saved your text file as a DOS/Windows text file. Some Unix text editors do not interpret the DOS/Windows newline convention correctly by default. To convert from Windows to Unix, you can use dos2unix, a command-line utility that does exactly that. If it is not available on your system, you can try tr, which is more standard, with the following invocation:
tr -d '\r' < input.file > output.file
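If neither tool is at hand, the same transformation can be done in Java. This is a minimal sketch that mirrors tr -d '\r' by deleting every carriage return; the class name is hypothetical and UTF-8 is assumed as the file encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class StripCarriageReturns {
    // Deletes every carriage return, like tr -d '\r'.
    static String strip(String text) {
        return text.replace("\r", "");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java StripCarriageReturns <input> <output>");
            return;
        }
        // Assumption: the file is UTF-8; use its real encoding if it differs.
        String content = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), strip(content).getBytes(StandardCharsets.UTF_8));
    }
}
```

Like the tr invocation, this writes to a separate output file rather than modifying the input in place.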

They are probably Windows carriage return characters. On Windows, lines are terminated with a carriage return (CR) followed by a line feed (LF). On Unix, only the line feed is normally used, so many programs display the carriage return as ^M.
You can get rid of them by running dos2unix on the files. You should also change your Eclipse preferences to save files with Unix line endings.

Perhaps this suppressed the Unix warning message and worked when creating the output file:
$ dos2unix -437 file.txt > file2.txt

You can remove those using the dos2unix utility on a Linux or Unix machine. The syntax is: dos2unix filename.

These are Windows newline characters. You can follow the steps shown in this post to correct the issue.
