We have a number of unit tests which assemble a multi-line string in memory and compare it against a reference string in a file.
We got a bit carried away and wrote the tests to use System.getProperty("line.separator") to keep the code OS-agnostic, but the reference files all use \n line endings (Linux). As a result, the tests that compare generated content against reference file content fail on a Windows machine, because System.getProperty("line.separator") returns \r\n there.
This is test code so we'll probably simply define a final String LINE_ENDING="\n" and update tests to use it instead of the "line.separator" property value, but that said, I'd really like to understand why I'm unable to specify a different line separator. I tried mvn -DargLine="-Dline.separator=\n" test, but the newline special character appears to have been interpreted as a literal letter "n" so tests failed again. To my surprise, trying \\n instead of \n made no difference, either.
Can anyone show how one would set the line.separator parameter properly?
Final note: the above commands were issued on a Linux machine. When running one of the tests from within Eclipse on a Windows machine, the \n special character (passed in the debug configuration as the JVM parameter -Dline.separator=\n) seems to be interpreted as the literal value "\\n". Searching the web has proved frustratingly fruitless.
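A minimal sketch of the fallback mentioned above, assuming the tests only care about content and not about the exact line terminator: normalize both strings to \n before comparing them. The names LineEndings, normalize and sameContent are hypothetical and not part of any test framework.

```java
final class LineEndings {
    // Fixed line ending used by the reference files (Linux style).
    static final String LINE_ENDING = "\n";

    /** Converts Windows (\r\n) and lone \r line endings to \n. */
    static String normalize(String text) {
        return text.replace("\r\n", LINE_ENDING).replace("\r", LINE_ENDING);
    }

    /** Compares two multi-line strings while ignoring line-ending differences. */
    static boolean sameContent(String generated, String reference) {
        return normalize(generated).equals(normalize(reference));
    }

    public static void main(String[] args) {
        // The generated text uses whatever separator the current platform reports.
        String generated = "first" + System.lineSeparator() + "second" + System.lineSeparator();
        String reference = "first\nsecond\n";
        System.out.println(sameContent(generated, reference));
    }
}
```

With this approach the value of line.separator no longer affects the outcome of the comparison, so nothing has to be passed on the Maven or Eclipse command line.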
On Windows 10, I have a shortcut file in the "SendTo" directory. It is a shortcut to a .bat file.
The .bat file contains just the command "python <filepath> %*" or "java -jar <filepath> %*".
When I select and right click file(s) from Windows Explorer and have it sent to this shortcut file, it will run the program from <filepath> with the selected file(s) as arguments.
I am trying to send files with filenames containing Japanese characters as arguments. The filenames are passed to python programs just fine, but for Java programs, the args for the filenames are messed up and the Java program cannot find the file.
For example, in Java and with locale of Japan, a filename of Filename ファイル名.txt becomes Filename 繝輔ぃ繧、繝ォ蜷�.txt in the args. Other locales also do not work. The result is the same if I send the args to python and then from python to Java.
How can I make Java receive the proper filename, or otherwise find the file?
You are encountering an unresolved issue with Java. See open bug JDK-8124977 cmdline encoding challenges on Windows which consolidates several problems related to passing Unicode arguments to a Java application from the command line.
Java 18 (to be released next month) resolves some UTF-8 issues with the implementation of JEP 400: UTF-8 by Default, but specifically not your problem unfortunately. From the "Goals" for JEP400:
Standardize on UTF-8 throughout the standard Java APIs, except for console I/O. [Emphasis mine]
However, there is a workaround. See Netbeans Chinese characters in java project properties run arguments, and in particular this answer which successfully processes Chinese characters passed as command line arguments using JNA (Java Native Access). From that answer:
JNA allows you to invoke Windows API methods from Java, without using
native code. So in your Java application you can call Win API methods
such as GetCommandLineW() and CommandLineToArgvW() directly, to access
details about the command line used to invoke your program, including
any arguments passed. Both of those methods support Unicode.
So the code in that answer does not read the arguments passed to main() directly. Instead it uses JNA to invoke the Win API methods to access them.
While that code was processing Chinese characters passed as arguments from the command line, it would work just as well for Japanese characters, including your Japanese filenames.
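The garbled name in the question resembles what you get when UTF-8 bytes are decoded with a Japanese Windows code page. That diagnosis is an assumption and is not confirmed anywhere in this thread. The sketch below reproduces that kind of damage so the symptom can be recognized; MojibakeDemo and misdecode are hypothetical names, and the Shift_JIS charset is assumed to be available (it ships with a full JDK).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    /** Encodes text as UTF-8, then deliberately decodes the bytes with the wrong charset. */
    static String misdecode(String text, Charset wrong) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, wrong);
    }

    public static void main(String[] args) {
        // The Japanese part of the filename from the question, written with
        // Unicode escapes so the source file encoding does not matter.
        String name = "\u30D5\u30A1\u30A4\u30EB\u540D";
        // Assumption: Shift_JIS stands in for the Japanese Windows code page.
        Charset japanese = Charset.forName("Shift_JIS");
        System.out.println(misdecode(name, japanese));
    }
}
```

If the output resembles the garbled argument, the bytes were most likely converted with the wrong charset before reaching main(), which is the situation the JNA workaround sidesteps.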
I have the following simple java program:
import java.util.Arrays;

public class Arguments
{
    public static void main(String[] args)
    {
        System.out.println("args: " + Arrays.toString(args));
    }
}
When I execute this in PowerShell using the command java Arguments "*.java", the string received by the program is not "*.java" but a comma-separated list of all the .java files in the current directory. If there are no .java files in the current directory, the string received is "*.java".
I want to know why this is happening and how to pass a string as it is without converting it.
Update: java Arguments '"*".java' and java Arguments `"*.java`" both worked in PowerShell, but they cause the same problem when executed in cmd. Can anyone explain why this is happening? Is there a common solution for both PowerShell and cmd?
It is not PowerShell (nor cmd.exe) that interprets "*.java" as a filename pattern and expands (resolves) it to the matching files, a process known as globbing in the Unix world.
(You would only get that behavior if you used *.java - i.e., no quoting - in PowerShell Core on Unix-like platforms, but never with a quoted string such as "*.java" or '*.java', and even without quoting never on Windows).
Apparently, it is legacy versions of java.exe on Windows that automatically perform the globbing on unquoted arguments, in an apparent attempt to emulate the behavior of POSIX-like shells such as Bash on Unix.
As of (at least) JDK 12, this behavior no longer seems to be in effect, at least not by default.
The linked answer suggests that in earlier versions there may be a system property that controls the behavior, but it's unclear what its name is.
Generally, the syntax java -D<systemPropName>=<value> ... can be used to set a system property on startup.
Therefore, you have the following options:
Upgrade to a Java version that no longer exhibits this behavior (by default).
In legacy versions, find the relevant system property name that disables the behavior and use the syntax shown above.
Use shell-specific quoting, as shown below.
Using quoting to prevent globbing:
To prevent java.exe from performing globbing, the invocation command line must ultimately contain "*.java", i.e., the argument must be enclosed in double quotes.
Unfortunately, there is no common syntax that works in both PowerShell and cmd.exe:
cmd.exe:
cmd.exe passes double-quoted arguments through as-is, so the following is sufficient:
java Arguments "*.java"
PowerShell:
PowerShell, by contrast, performs re-quoting as needed behind the scenes (see this answer for more information).
Since an argument with content *.java normally does not require quoting when you pass it to an external program, PowerShell translates both "*.java" and '*.java' to unquoted *.java in the command line that is ultimately used behind the scenes - which is what you experienced.
There are two ways around that:
Use java Arguments '"*.java"', i.e., embed the " chars. in the argument, inside a literal string ('...').
Use java Arguments --% "*.java"; --% is the stop-parsing symbol (PSv3+), which instructs PowerShell to pass the remainder of the command line through as-is (except for expanding cmd.exe-style environment-variable references).
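Once the pattern reaches the program unexpanded, the program can do the matching itself. A minimal sketch, assuming that is the goal (the question does not say what the pattern is used for): GlobArgs and matches are hypothetical names, while the "glob:" syntax of FileSystem.getPathMatcher is standard java.nio.file API.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class GlobArgs {
    /** Returns true if fileName matches the glob pattern, e.g. "*.java". */
    static boolean matches(String pattern, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        // Falls back to "*.java" when no argument is given.
        String pattern = args.length > 0 ? args[0] : "*.java";
        System.out.println(pattern + " matches Arguments.java: "
                + matches(pattern, "Arguments.java"));
    }
}
```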
I have a Java application that uses -D system properties that I create. I'm having issues getting one of them to be translated correctly.
In my test environment (localhost) on my local computer, I'm running Windows using IntelliJ Idea IDE and I enter the -D system properties through the IDE like so:
-Dproperty={\"prop1\":\"val1\",\"prop2\":\"val2\"}
I escape the double quotes because they need to be part of the string literal. The above property works, and the entire value, including the curly braces, is stored as a string literal.
The issue occurs when we deploy this application to our Linux cloud environment. I think the difference in architecture is causing the system property to not be read in correctly. In bash, I find the ID of the Java process and run ps -fwwp [processId] to inspect its -D system properties. I see that the above property is being broken up into a bunch of smaller properties that look like the following:
-Dproperty=prop1:val1
-Dproperty=prop2:val2
-Dproperty=prop3:val3
etc...
This is causing the part of my application that uses this property to fail. I've tried doing a bunch of escaping methods and none of them are working.
How can I escape this system property in such a way that the value is treated as the string literal {"prop1":"val1","prop2":"val2"}?
Bash requires the curly braces to be escaped, as in:
-Dproperty=\{\"prop1\":\"val1\",\"prop2\":\"val2\"\}
The other option is to try surrounding the entire string in single quotes. Bash won't do any expansions inside single quotes:
-Dproperty='{"prop1":"val1","prop2":"val2"}'
I don't know which option will be easier to make compatible with your Windows environment.
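To see what the JVM actually received, a small diagnostic can help. This is only a sketch: PropertyCheck and looksIntact are hypothetical names, and the property name "property" is taken from the question. It assumes the value is expected to be a single brace-delimited string.

```java
final class PropertyCheck {
    /** True if the value still has its surrounding braces, i.e. the shell did not split it up. */
    static boolean looksIntact(String value) {
        return value != null && value.startsWith("{") && value.endsWith("}");
    }

    public static void main(String[] args) {
        // Reads the same -D property the application uses (name from the question).
        String value = System.getProperty("property");
        System.out.println("property = " + value);
        System.out.println("intact: " + looksIntact(value));
    }
}
```

Running it with the same -D option as the real application, once with each quoting style, shows which variant survives the shell on a given platform.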
So, I have basically been trying to use Spanish Characters in my program, but wherever I used them, Java would print out '?'.
I am using Slackware, and executing my code there.
I updated lang.sh, and added: export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8
After this when I tried printing, it did not print the question marks, but other junk characters. I printed the default Charset on screen, and it has been successfully set, but it is not printing properly.
Help?
Thanks!
EDIT: I'm writing code in windows on NetBeans, and executing .class or .jar on slackware.
Further, I cannot seem to execute locale command. I get error "bash: locale: command not found".
This is what confuses me: when I echo any special characters on the Slackware console, they are displayed perfectly, but when I run a Java program that simply prints its command-line arguments (and I enter the special characters as command-line input), it outputs garbage.
If you are using an ssh client such as PuTTY, check that it is using a UTF-8 charset as well.
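To rule out the JVM's default charset as the cause, output can be written through a PrintStream with an explicit encoding. A minimal sketch, assuming the terminal (or PuTTY) is itself set to UTF-8; Utf8Out and utf8 are hypothetical names.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

final class Utf8Out {
    /** Wraps a stream so that text is always encoded as UTF-8, whatever the default charset is. */
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        out.println("default charset: " + Charset.defaultCharset());
        // Spanish sample text written with Unicode escapes, so the source
        // file encoding cannot interfere with the experiment.
        out.println("Espa\u00F1a, ma\u00F1ana");
    }
}
```

If this prints correctly while plain System.out.println does not, the problem lies in the default charset rather than in the terminal.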
As described in the question, if I view a file on Unix I see special characters such as ^M at the end of every line, but if I view the same file in Eclipse I do not see those characters.
How can I remove those characters from the file? If I am using Eclipse to edit the file, do I have to change anything in the Eclipse preferences?
Any guidance would be highly appreciated.
Update:
Yes, it was indeed a carriage-return issue, and the following command helped me sort it out:
dos2unix file1.sh > file2.sh
file2.sh will be the converted file, without any carriage returns.
You may get warnings like
could not open /dev/kbd to get keyboard type US keyboard assumed
could not get keyboard type US keyboard assumed
but the following command will suppress the warnings:
dos2unix -437 file1.txt > file2.txt
You have saved your text file as a DOS/Windows text file. Some Unix text editors do not correctly interpret the DOS/Windows newline convention by default. To convert from Windows to Unix, you can use dos2unix, a command-line utility that does exactly that. If it is not available on your system, you can try tr, which is more standard, using the following invocation:
tr -d '\r' < input.file > output.file
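If neither tool is at hand, the same byte-level filtering can be sketched in Java. This is an illustration rather than a replacement for dos2unix: StripCr and stripCarriageReturns are hypothetical names, and the code assumes an ASCII-compatible encoding such as UTF-8 or Latin-1 (it would corrupt UTF-16 files).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class StripCr {
    /** Returns a copy of data with every carriage return (0x0D) removed, like tr -d '\r'. */
    static byte[] stripCarriageReturns(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
        for (byte b : data) {
            if (b != '\r') {
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java StripCr <input.file> <output.file>");
            return;
        }
        // Reads the whole input file, filters it, and writes the result to the output file.
        byte[] cleaned = stripCarriageReturns(Files.readAllBytes(Paths.get(args[0])));
        Files.write(Paths.get(args[1]), cleaned);
    }
}
```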
They are probably Windows carriage-return characters. On Windows, lines are terminated with a carriage-return character followed by a line-feed character. On Unix, only the line-feed character is normally used, so many programs display the carriage return as ^M.
You can get rid of them by running dos2unix on the files. You should also change your Eclipse preferences to save files with Unix end of lines.
Perhaps this will help; it suppressed the UNIX warning message and successfully created the output file:
$ dos2unix -437 file.txt > file2.txt
You can remove those using the dos2unix utility on a Linux or Unix machine. The syntax is: dos2unix filename.
These are Windows newline characters. You can follow the steps shown in this post to correct the issue.