Java strings problem:
I am trying to automate a process of typing special characters (Ctrl+Insert)
using a Java-based system.
I have tried googling it with no luck.
How do I include characters like these in a Java string?
Your problem is that "Ctrl+Insert" and many other keystrokes are not characters, so they cannot appear in a string at all. Programs that respond to such keystrokes handle key events delivered by the system they run on; they are not reading characters to detect them.
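If the goal is to send the keystroke itself to an application, rather than to put it in a String, java.awt.Robot can synthesize the key events. A minimal sketch (it assumes a desktop environment and that the target window currently has focus):

    import java.awt.AWTException;
    import java.awt.Robot;
    import java.awt.event.KeyEvent;

    public class SendCtrlInsert {
        public static void main(String[] args) throws AWTException {
            Robot robot = new Robot();
            // Press Ctrl, then Insert, then release in reverse order;
            // the focused application sees this as Ctrl+Insert.
            robot.keyPress(KeyEvent.VK_CONTROL);
            robot.keyPress(KeyEvent.VK_INSERT);
            robot.keyRelease(KeyEvent.VK_INSERT);
            robot.keyRelease(KeyEvent.VK_CONTROL);
        }
    }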
This seems to be a fairly common issue, but I have not yet been able to find a solution, perhaps because it comes in so many flavors. Here it is, though. I'm trying to read some comma-delimited files (occasionally the delimiters can be a bit more unique than commas, but commas will suffice for now).
The files are supposed to be standardized across the industry, but lately we've seen files arriving in many different character sets. I'd like to be able to set up a BufferedReader to compensate for this.
What is a pretty standard way of doing this and detecting whether it was successful or not?
My first thought is to loop through character sets from simple to complex until I can read the file without an exception. Not exactly ideal, though...
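Something like this sketch is what I have in mind (the candidate list is just an illustration; note the decoders must be set to REPORT, since by default malformed input is silently replaced):

    import java.io.*;
    import java.nio.charset.*;

    public class CharsetFallback {
        static Charset detect(File file) throws IOException {
            // Try strict decoders in order; the first one that decodes the
            // whole file without an error wins.
            for (String name : new String[] {"US-ASCII", "UTF-8", "windows-1252"}) {
                CharsetDecoder decoder = Charset.forName(name).newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT);
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(new FileInputStream(file), decoder))) {
                    while (reader.read() != -1) { /* just decode */ }
                    return decoder.charset();
                } catch (CharacterCodingException e) {
                    // Decoding failed; fall through to the next candidate.
                }
            }
            throw new IOException("No candidate charset could decode " + file);
        }
    }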
Thanks for your attention.
Mozilla's universalchardet is supposed to be the most efficient detector out there. juniversalchardet is the Java port of it, and there is one more port as well. See this SO question for more information: Character Encoding Detection Algorithm.
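Typical usage, adapted from the juniversalchardet documentation (getDetectedCharset() returns null when nothing was recognized):

    import java.io.FileInputStream;
    import java.io.IOException;
    import org.mozilla.universalchardet.UniversalDetector;

    public class DetectEncoding {
        public static String detect(String fileName) throws IOException {
            UniversalDetector detector = new UniversalDetector(null);
            try (FileInputStream fis = new FileInputStream(fileName)) {
                byte[] buf = new byte[4096];
                int nread;
                // Feed bytes to the detector until it is confident.
                while ((nread = fis.read(buf)) > 0 && !detector.isDone()) {
                    detector.handleData(buf, 0, nread);
                }
            }
            detector.dataEnd();
            String encoding = detector.getDetectedCharset(); // may be null
            detector.reset();
            return encoding;
        }
    }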
Our software has a script that creates JAR files for different languages; for Japanese we use the SJIS encoding in a call to native2ascii. This worked the last time a Japanese build was attempted, but now it seems to work only in certain contexts. For example, in the following dialog the encoding seems to work only in the title bar:
Anyone have any idea about what might be causing this? Could this problem be related to a change in Java?
What exactly do you pass through native2ascii? Just to make sure, you're using native2ascii -encoding Shift_JIS, right? And you're passing text files or source files through native2ascii, right?
My only other idea is that after the text has been converted to \uXXXX format, the font you're using to display the dialog may not have all the Kanji and Kana. Explicitly set a font, and try that.
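For example, something like this (a hypothetical test dialog; "MS Gothic" is just one font known to cover Japanese):

    import java.awt.Font;
    import javax.swing.JLabel;
    import javax.swing.JOptionPane;

    public class FontCheck {
        public static void main(String[] args) {
            JLabel label = new JLabel("\u65e5\u672c\u8a9e\u306e\u30c6\u30b9\u30c8"); // "Japanese test"
            // Force a font known to contain the Kanji/Kana glyphs
            // instead of relying on the look-and-feel default.
            label.setFont(new Font("MS Gothic", Font.PLAIN, 14));
            JOptionPane.showMessageDialog(null, label, "Font test",
                    JOptionPane.INFORMATION_MESSAGE);
        }
    }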
I would suggest checking these 2 things:
Make absolutely sure that the native2ascii conversions are correct. You should do a round trip conversion with the -reverse flag, and make sure that your input and output are in sync.
Double-check that your fonts used can support Shift-JIS. Those blocks and symbols that appear in the dialog text and button text look like the characters might be OK, but the fonts might not support them.
An additional word of caution: If this application is intended for use on Windows, then you really should be using the MS932 or windows-31j encoding. SJIS will work for all but a dozen or so symbols, but it turns out these symbols (like the full-width tilde) are actually used quite frequently in Japan.
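The full-width tilde (U+FF5E) makes the difference easy to see; if I remember the mappings correctly, Java's Shift_JIS charset cannot encode it, while MS932 can:

    import java.nio.charset.Charset;

    public class TildeCheck {
        public static void main(String[] args) {
            String fullWidthTilde = "\uFF5E"; // FULLWIDTH TILDE, common in Japanese text
            // Shift_JIS maps 0x8160 to U+301C (WAVE DASH), so it has no encoding
            // for U+FF5E; windows-31j / MS932 maps 0x8160 to U+FF5E instead.
            System.out.println("Shift_JIS can encode it: "
                    + Charset.forName("Shift_JIS").newEncoder().canEncode(fullWidthTilde));
            System.out.println("MS932 can encode it:     "
                    + Charset.forName("MS932").newEncoder().canEncode(fullWidthTilde));
        }
    }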
I think the right way to do this is to use UTF-8 or UTF-16 exclusively. Kanji and Katakana demand special attention.
I know there are hundreds of questions asking how to update already-written text on the console, and I know I can do it by printing the \r character.
My issue comes when I use Console.readPassword or Console.readLine, which creates a new line that I can't overwrite afterwards.
I think my issue isn't related to anything special about the Console.read* methods but to new lines in general: \r goes to the start of the current line, whereas I need to go to the start of the Nth previous line and start overwriting from there, or just clear the entire screen.
Any ideas how I can do this?
Thanks.
In principle, this is terminal-dependent, and with plain Java there is no way to do it for all consoles.
Many terminals (at least on unixoid systems) support ANSI escape sequences, so there you can write something like "\u001B[1;5H" to move the cursor to line 1, column 5.
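For example, on an ANSI-capable terminal (not the Eclipse console, and not a plain cmd.exe window), a sketch like this moves the cursor up two lines, clears one, and rewrites it:

    public class AnsiOverwrite {
        public static void main(String[] args) {
            System.out.println("line one");
            System.out.println("line two");
            // ESC[2A = cursor up 2 lines, ESC[2K = erase the whole line,
            // \r = back to column 1; then overwrite and move back down.
            System.out.print("\u001B[2A\u001B[2K\rline one (overwritten)\n");
            System.out.print("\u001B[1B"); // step back down past "line two"
        }
    }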
In my current project, we always insert an empty new line at the end of the Java source files. We also enforce this with CheckStyle (with error level).
I have been searching this topic for a long time, but unfortunately I can't find any convincing reason for it. It seems that other developers are pretty indifferent about this, because they just check one checkbox in the Eclipse formatter and it's done automatically. But I still don't know why it is needed and why it can be important. So my question is:
Why are empty lines at the end of Java source files needed?
Is it a current need or a relic of the past and undesirable in current code bases?
I think they are trying to ensure every file ends with a trailing newline character. This is different from ending with a blank line, a.k.a. empty newline.
Edit: As @Easy Angel succinctly clarified in the comments: trailing newline = "\n" and blank line = "\n\n"
I think either:
your lead is mandating that every file end with a newline character, but this is being misinterpreted as mandating that every file end with a blank line (i.e., an empty line that ends in a newline), or else
they are trying to ensure every file ends with a newline character by actually mandating that every file end with a blank line (a.k.a. an empty line that ends with a newline), thereby ensuring files end with at least one newline (and possibly a redundant additional newline; overkill?).
Unless the editor actually shows newline symbols, it's not always clear in some editors whether a file:
DOES NOT END with a newline at all,
ENDS with a single trailing newline, or
ENDS with a blank newline, i.e. 2 trailing newlines
I think most modern source code editors insert a trailing newline. However, when using older, more general editors, I would always try to ensure my source code files (and text files in general) ended with a trailing newline (which occasionally came out as a blank line/empty newline, depending on the editor I was using), because:
when using cat to display the file on the command line, if the file lacked a trailing newline, the next output (like the shell prompt or a visual delimiter a script may print between files) would appear right after the last character rather than starting on a new line. In general, the trailing newline made files more user- and script-friendly.
I believe some editors (I can't remember any specifics) would automatically insert a trailing newline if the text file lacked one. This would make it appear as if the file had been modified. It would get confusing if you had a bunch of files open in different windows and then went to close all of them: the editor prompts you to save, but you are unsure whether you made "real changes" to the file or it's just the auto-inserted newline.
Some tools like diff and some compilers will complain about a missing trailing newline. This is more noise that users and tools may have to deal with.
Edit:
About editors adding newlines and not being able to see whether there's a newline vs blank newline at the end of the file, I just tested Vim, Eclipse, and Emacs (on my Windows system with Cygwin): I opened a new file, typed 'h' 'e' 'l' 'l' 'o' and saved without hitting [ENTER]. I examined each file with od -c -t x1.
Vim did add a trailing newline.
Emacs did add a trailing newline.
Eclipse did NOT add a trailing newline.
But
Vim did NOT allow me to cursor down to a blank line under "hello".
Emacs did allow me to cursor down to a blank line under "hello".
Eclipse did NOT allow me to cursor down to a blank line under "hello".
Interpret as you like.
My personal practice is to try to ensure text files end with a trailing newline. I just feel there's the least surprise to people and tools when this is the case. I wouldn't treat source files any differently from text files in this respect.
Google turns up this:
which, as of this edit, shows hits that talk about warnings about a missing trailing newline coming from C compilers, svn (because of diff), diff, etc. I feel there's a general expectation that text files (source files included) end with a trailing newline, and things are least surprising (and less noisy) when they do.
Finally this is interesting:
Sanitizing files with no trailing newline
Text files should have all their lines terminated by newline characters (i.e., \n). This is stated by POSIX, which says that a text file is
A file that contains characters organized into zero or more lines.
A line, in turn, is defined as
A sequence of zero or more non-newline characters plus a terminating newline character.
HOWEVER, all that said, this is just my personal practice. I'm happy to share my opinion with anyone who asks, but I don't foist it on anyone. I don't feel this is something worth mandating, as I say here:
While I'm one who's all for consistency, I'm also against micromanaging every bit of style. Having a huge list of coding conventions, particularly when some of them seem arbitrary, is part of what discourages people from following them. I think coding guidelines should be streamlined to the most valuable practices that improve the -ilities. How much are readability, maintainability, performance, etc. improved by mandating this practice?
Here is a good reason for having an extra line break at the end:
If you have a file without a line break at the end, then the next time the file is edited to add another line, most merge tools will think that the existing last line has changed (I'm 90% sure SVN does too).
In the example below, the line containing "last line before edit" does not have the line break. If we try to add a new line "last line after edit", both lines 5 and 6 are marked as changed, even though the actual contents of line 5 are the same in both versions.
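(The original example here was a screenshot of a merge tool; reconstructed as unified diff output, the "\ No newline at end of file" marker shows why line 5 is reported as changed as well:)

    @@ -4,2 +4,3 @@
     line 4
    -last line before edit
    \ No newline at end of file
    +last line before edit
    +last line after edit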
If everyone follows your project lead's suggestion, then this would be the result (only line 6 differs from the original file). This also avoids misunderstandings during merges.
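(Again as a reconstructed diff: with the trailing newline in place, only the added line 6 shows up:)

    @@ -4,2 +4,3 @@
     line 4
     last line before edit
    +last line after edit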
While this may not look like a big deal, let's say one developer (A) actually meant to change the contents of the last line and another developer (B) added a new line. Without a line break before EOF, you now have a merge conflict, because developer B was forced to also edit the former last line to add a line break. And... who likes CVS/SVN conflicts?
Have a look at this SO question.
The answer, shamelessly stolen from Ralph Rickenbach:
Many older tools misbehave if the last line of data in a text file is not terminated with a newline or carriage return / newline combination. They ignore that line, as it is terminated with ^Z (EOF) instead.
So I figure it's mostly a ghost of the past. Unfortunately, such ghosts can bite you in the tail if you don't properly exorcise them. (Is your build server old and using older shell scripts for summaries and such things?)
Try cutting and pasting the whole file.
It might be a bug in Checkstyle or Eclipse : )
Aside from the already mentioned valid reasons for having a trailing newline character (possible issues with older tools and diff), here is another way to look at it:
Why special-case the last line by not appending a newline character when all other lines in the file have one?
Sometimes your compiler doesn't parse it correctly:
Error: Reached end of file while parsing
It is just a coding style; it doesn't hurt or help anything. I wouldn't let it bother you. It sounds like it is your team's preference to include an empty line. There isn't really a good argument against it, other than: why does anyone care enough to actually add it to Checkstyle?
I have never heard of such a requirement.
In fact, I just confirmed that a Java program will run without any compiler/runtime errors or warnings when there isn't a blank line at the end of the file.
This, as some commenters have said, must be a coding-style issue. Unfortunately, I can't suggest why it might be important for there to be a blank line at the end of a file in Java. In fact, it seems entirely pointless to me.
We had to do this for some C++ code as the compiler generated a warning about it, and we had a 'no error or warnings' policy.
Maybe the issue lies elsewhere... do you have a diffing tool that goes haywire, or a merge tool that can't handle it?
It's not a big deal really.