Converting a string of UTF code point to its respective value - java

I have the string \u5733 and need to convert it to the character it represents. I tried both of the approaches shown below, but each one prints "?". The code point is for a Chinese character. Any help would be appreciated.
char[] arr=Character.toChars(5733);
System.out.println(new String(arr));
String code = "5733";
char c = (char)Integer.parseInt(code, 16);
System.out.println("Code: " + code + " Character: " + c);

The problem seems to be the terminal that displays your output, since your second approach works for me.
Your first approach does contain an error, though. Since 5733 is meant as a hexadecimal number, you should prefix it with 0x:
char[] arr=Character.toChars(0x5733);
An even simpler method would be:
char c = 0x5733;
System.out.println("Code: " + (int)c + " Character: " + c);
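If the input really arrives as text (either "5733" or the six characters backslash, u, 5, 7, 3, 3), the conversion above could be wrapped in a small helper. This is only a sketch: the class name CodePointDemo and the method fromHex are made up for illustration, and it assumes the text holds nothing but hex digits after the optional prefix.

```java
public class CodePointDemo {
    /** Converts hex text such as "5733", with or without a leading backslash-u, to a String. */
    static String fromHex(String code) {
        // Accept both "5733" and the escaped form that starts with backslash-u.
        String hex = code.startsWith("\\u") ? code.substring(2) : code;
        int codePoint = Integer.parseInt(hex, 16);
        // toChars also copes with code points above U+FFFF (surrogate pairs).
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromHex("\\u5733");
        // Print the code point back as hex to confirm the round trip.
        System.out.println(Integer.toHexString(s.codePointAt(0))); // prints 5733
        System.out.println("Character: " + s);
    }
}
```

Whether the last line shows the character or "?" still depends on the encoding of the console it is printed to.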

If you are running this in Eclipse, you can display UTF-8 characters as follows:
Click Run Configurations...
Select your particular Run Config for this App
Click Common->Encoding->Other
Select UTF-8
Run
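Outside Eclipse, or in addition to the setting above, one option is to wrap System.out in a PrintStream that encodes as UTF-8. The sketch below assumes the console itself is set to display UTF-8; the class name Utf8Console and the helper utf8 are hypothetical names for illustration.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    /** Wraps any stream in an auto-flushing PrintStream that writes UTF-8 bytes. */
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        char c = 0x5733;
        // The bytes written are UTF-8 regardless of the platform default charset.
        utf8(System.out).println("Character: " + c);
    }
}
```

If the terminal uses a different encoding, the output will still look wrong, because the bytes are correct but the display interprets them differently.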

cannot split a specific kind of strings using Java

I am working in Java. I have a list of parameters stored in a string that comes from Excel. I want to split it only at the hyphen that starts each new line. One such string is stored in every Excel cell, and I am trying to extract it using Apache POI. The format is as below:
String text =
"- I am string one\n" +
"-I am string two\n" +
"- I am string-three\n" +
"with new line\n" +
"-I am string-four\n" +
"- I am string five";
What I want
array or arraylist which looks like this
[I am string one,
I am string two,
I am string-three with new line,
I am string-four,
I am string five]
What I Tried
I tried to use split function like this:
String[] newline_split = text.split("-");
but the output I get is not what I want
My O/P
[, I am string one,
I am string two,
I am string, // wrong
three // wrong
with new line, // wrong
I am string, // wrong!
four, // wrong!
I am string five]
I might have to tweak split function a bit but not able to understand how, because there are so many hyphens and new lines in the string.
P.S.
If I try splitting only at new lines, then the item "- I am string-three \n with new line" breaks into two parts, which again is not correct.
EDIT:
Please note that the data inside the string really is formatted inconsistently, just as shown above. It comes from an Excel file that I received. I am using Apache POI to extract the content of each Excel cell as a string.
I intentionally kept the format the client gave me. For those who are confused about the text inside A: I changed it because I cannot post the real contents here, as that would violate my workplace's privacy rules.
You can
1. replace each line separator with a space if it is not followed by - on the next line: .replaceAll("\\R(?!-)", " ") should do the trick
   \R (written as "\\R" in a string literal) has been available since Java 8 and represents any line separator
   (?!...) is a negative look-ahead: it ensures that there is no - right after the place where it is used, without including that character in the match (so a - that it inspects is never removed)
2. then remove the - placed at the start of each line (let's also include the whitespace that follows it, to trim the start of each item). In other words, remove every - placed
   after a line separator: can be represented by "\\R"
   at the start of the string: can be represented by ^
   This should do the trick: .replaceAll("(?<=\\R|^)-\\s*","")
3. split on the remaining line separators: .split("\\R")
Demo:
String text =
"- I am string one\n" +
"-I am string two\n" +
"- I am string-three\n" +
"with new line\n" +
"-I am string-four\n" +
"- I am string five";
String[] split = text.replaceAll("\\R(?!-)", " ")
.replaceAll("(?<=\\R|^)-\\s*","")
.split("\\R");
for (String s: split){
System.out.println("'"+s+"'");
}
Output (surrounded with ' to show start and end of results):
'I am string one'
'I am string two'
'I am string-three with new line'
'I am string-four'
'I am string five'
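A variation on the same idea, sketched here under the assumption that every item starts with - at the very beginning of a line (as in the demo text): split first on line separators that are followed by -, then clean up each piece. The class name HyphenItems and the method parse are hypothetical names used only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenItems {
    /** Splits text into items, where each item starts with "-" at the beginning of a line. */
    static List<String> parse(String text) {
        List<String> items = new ArrayList<>();
        // Split only at line separators that are directly followed by a hyphen.
        for (String part : text.split("\\R(?=-)")) {
            String item = part
                    .replaceFirst("^-\\s*", "")  // drop the leading hyphen and spaces
                    .replaceAll("\\R", " ");     // join continuation lines with a space
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        String text =
                "- I am string one\n" +
                "-I am string two\n" +
                "- I am string-three\n" +
                "with new line\n" +
                "-I am string-four\n" +
                "- I am string five";
        for (String s : parse(text)) {
            System.out.println("'" + s + "'");
        }
    }
}
```

Hyphens inside an item, such as the one in string-three, are left alone because only a hyphen that follows a line separator triggers a split.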
This is how I would do it:
import java.util.*;
public class MyClass {
public static void main(String args[]) {
String A = "- I am string one \n" +
" -I am string two\n" +
" - I am string-three \n" +
" with new line\n" +
" -I am string-four\n" +
"- I am string five";
String[] s2 = A.split("\r?\n");
List<String> lines = new ArrayList<String>();
String line = "";
for (int i = 0; i < s2.length; i++) {
String ss = s2[i].trim();
if (i == 0) { // first line MUST start with "-"
line = ss.substring(1).trim();
} else if (ss.startsWith("-")) {
lines.add(line);
ss = ss.substring(1).trim();
line = ss;
} else {
line = line + " " + ss;
}
}
lines.add(line);
System.out.println(lines.toString());
}
}
I hope it helps.
A little explanation:
I process the text line by line, trimming each line.
If a line starts with '-', the previous item is complete, so I add that item to the list. If not, I append the line to the current item.
It looks as if you want to split at the FIRST - of each line, so you need to remove every "-" that directly follows a newline:
str = str.replace("\n-", "\n");
Then remove the initial "-":
str = str.substring(1);

Why won't "\t" create a tab?

I want the "Module Code = " and "Result = " to be separated by a tab but whenever I run the code below it literally just outputs
"Module Code = Biology\tResult = 40.0"
public String toString()
{
return "Module Code = " + moduleCode + "\t" + "Result = " + result;
}
The problem is that you're viewing the value of the produced string in the BlueJ window. That window is good for debugging purposes, but it won't exhibit the same behavior that a proper output device would, especially with respect to characters such as newline, tabulation, etc. Those characters will still appear with their escape sequences, just like you typed them in your source code.
In other words, your toString() method is fine and it works as intended. If you want to see its results formatted properly, don't view them using BlueJ -- print them somewhere else. The console is a good choice:
System.out.println(module.toString());
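To check that the tab really is in the string, a small self-contained test can help. This is a sketch only: the class name ModuleResult and its constructor are assumptions, since the question shows just the toString() method and the two field names.

```java
public class ModuleResult {
    private final String moduleCode;
    private final double result;

    ModuleResult(String moduleCode, double result) {
        this.moduleCode = moduleCode;
        this.result = result;
    }

    @Override
    public String toString() {
        return "Module Code = " + moduleCode + "\t" + "Result = " + result;
    }

    public static void main(String[] args) {
        ModuleResult module = new ModuleResult("Biology", 40.0);
        // On a console the two parts appear separated by a tab stop.
        System.out.println(module);
        // The string holds a single tab character, not a backslash followed by t.
        System.out.println(module.toString().contains("\t")); // prints true
    }
}
```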
Why won't "\t" create a new line?
Well, that is because "\t" is a tab, not a new line ("\n").
If you need a new line, try this instead:
return "Module Code = " + moduleCode + "\n" + "Result = " + result;

Split string after n amount of digits occurrence

I'm parsing some folder names here. I have a program that lists subfolders of a folder and parses folder names.
For example, one folder could be named something like this:
"Folder.Name.1234.Some.Info.Here-ToBeParsed"
and I would like to parse it so name would be "Folder Name". At the moment I'm first using string.replaceAll() to get rid of special characters and then there is this 4-digit sequence. I would like to split string on that point. How can I achieve this?
Currently my code looks something like this:
// Parsing string if regex p matches folder's name
if(b) {
//System.out.println("Folder: \" " + name + "\" contains special characters.");
String result = name.replaceAll("[\\p{P}\\p{S}]", " "); // Getting rid of all punctuations and symbols.
//System.out.println("Parsed: " + name + " > " + result);
// If string matches regex p2
if(b2) {
//System.out.println("Folder: \" " + result + "\" contains release year.");
String parsed_name[] = result.split("20"); // This is the line i would like to split when 4-digits in row occur.
//System.out.println("Parsed: " + result + " > " + parsed_name[0]);
movieNames.add(parsed_name[0]);
}
Or maybe there is even easier way to do this? Thanks in advance!
You should keep it simple like this:
String name = "Folder.Name.1234.Some.Info.Here-ToBeParsed";
String repl = name.replaceFirst( "\\.\\d{4}.*", "" ).
replaceAll( "[\\p{P}\\p{S}&&[^']]+", " " );
//=> Folder Name
replaceFirst removes everything from the first dot that is followed by 4 digits up to the end of the string
replaceAll replaces each run of punctuation and symbols (except the apostrophe) with a single space
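Wrapped in a method, the two calls above could be applied to every folder name in the list. This is a minimal sketch: the class name FolderNames and the method title are hypothetical, and the added trim() is an extra step that only matters if a name starts or ends with punctuation.

```java
public class FolderNames {
    /** Reduces a folder name like "Folder.Name.1234.Some.Info" to "Folder Name". */
    static String title(String folderName) {
        return folderName
                // Cut from the first ".dddd" (dot plus four digits) to the end.
                .replaceFirst("\\.\\d{4}.*", "")
                // Turn runs of punctuation/symbols, except the apostrophe, into one space.
                .replaceAll("[\\p{P}\\p{S}&&[^']]+", " ")
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(title("Folder.Name.1234.Some.Info.Here-ToBeParsed")); // prints Folder Name
    }
}
```

A name without a dot followed by four digits is left uncut by the first step and only has its punctuation replaced.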

Using HTML code in Swing

I can't figure out how to use variables in HTML. In this case, string is a variable.
JOptionPane.showMessageDialog(null,"<html>Error #1<br> + string +</html>","Error",JOptionPane.PLAIN_MESSAGE);
This outputs: Error #1 + string
JOptionPane.showMessageDialog(null,"<html>Error #1<br></html>" + string ,"Error",JOptionPane.PLAIN_MESSAGE);
This outputs: Error #1
Is there a way to use string variables in HTML?
Do you want something like
"<html>Error #1<br>" + string + "</html>"
?
If you want to concatenate the string variable to the html you have there, it must be outside the quotes, otherwise it will be treated as a literal, as in your example.
JOptionPane.showMessageDialog(
null,
"<html>Error #1<br>" + string + "</html>",
"Error",
JOptionPane.PLAIN_MESSAGE);
Though note that logically JOptionPane.PLAIN_MESSAGE should be JOptionPane.ERROR_MESSAGE.
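Put together as a runnable sketch: the message is built in a helper so it can be printed and checked before it is shown. The class name HtmlMessage and the method build are hypothetical, and the headless check is only there so the example also runs where no display is available.

```java
import java.awt.GraphicsEnvironment;
import javax.swing.JOptionPane;

public class HtmlMessage {
    /** Concatenates the variable between the two HTML string literals. */
    static String build(String detail) {
        return "<html>Error #1<br>" + detail + "</html>";
    }

    public static void main(String[] args) {
        String string = "Hi to stack";
        String message = build(string);
        System.out.println(message); // prints <html>Error #1<br>Hi to stack</html>

        // Only open the dialog when a display is available.
        if (!GraphicsEnvironment.isHeadless()) {
            JOptionPane.showMessageDialog(null, message, "Error", JOptionPane.ERROR_MESSAGE);
        }
    }
}
```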
You need to do: "<html>Error #1<br>" + string + "</html>"
A string literal in Java consists of zero or more characters enclosed in double quotes, so anything you write between the quotes, including + string +, is treated as literal text. You need to close the string literal before appending the variable.
So if the string = "Hi to stack":
Then "<html>Error #1<br>" + string + "</html>" will result in:
"<html>Error #1<br>Hi to stack</html>"

uncaught syntax error

I am using the following code for local storage.
for(int i=0; i< files.length; i++)
{
System.out.println("base = " + files[i].getName() + "\n i=" +i + "\n");
AudioFile f = AudioFileIO.read(files[i]);
Tag tag = f.getTag();
//AudioHeader h = f.getAudioHeader();
int l = f.getAudioHeader().getTrackLength();
String s1 = tag.getFirst(FieldKey.ALBUM);
out.print("writeToStorage("+s1+","+s1+");");
}
I am getting "Uncaught SyntaxError: Unexpected identifier" as an error.
I'm guessing you meant Java rather than JavaScript?
Your unexpected identifier is out in out.print: you need System. in front of it.
The reason is that out is not defined in your code. You need to access it through the static field of the System class, which is why you write System.out.
Alternatively, you could assign System.out to a variable named out as shorthand, although I don't tend to. This also lets you switch out to a different type of output stream later without having to refactor your code much.
Have you added the following?
import static java.lang.System.out;
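Both ways of getting a usable out can be sketched in one small class. The class name OutAlias, the method report, and the file name song.mp3 are made up for illustration; passing the stream as a parameter is one way to keep the target easy to swap.

```java
import static java.lang.System.out;

import java.io.PrintStream;

public class OutAlias {
    /** Writes the same text as the loop in the question, to whichever stream is passed in. */
    static void report(PrintStream target, String name, int index) {
        target.println("base = " + name + "\n i=" + index);
    }

    public static void main(String[] args) {
        // With the static import, the bare name "out" resolves to System.out.
        out.println("printed through the static import");

        // A variable works as well and can later point at another stream.
        PrintStream alias = System.out;
        report(alias, "song.mp3", 0);
    }
}
```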
You probably need to output quotes in the last line so that they surround the s1 values.
"writeToStorage("+s1+","+s1+");"
->
"writeToStorage('"+s1+"','"+s1+"');"
Btw for the same reason you have to fix the other line too:
"base = " + files[i].getName() + "...
->
"base = '" + files[i].getName() + "'...
