Reformatting a Java String

I have a string that looks like this:
CALDARI_STARSHIP_ENGINEERING
and I need to edit it to look like
Caldari Starship Engineering
Unfortunately it's three in the morning and I cannot for the life of me figure this out. I've always had trouble with replacing stuff in strings so any help would be awesome and would help me understand how to do this in the future.

Something like this is simple enough:
String text = "CALDARI_STARSHIP_ENGINEERING";
text = text.replace("_", " ");
StringBuilder out = new StringBuilder();
for (String s : text.split("\\b")) {
if (!s.isEmpty()) {
out.append(s.substring(0, 1) + s.substring(1).toLowerCase());
}
}
System.out.println("[" + out.toString() + "]");
// prints "[Caldari Starship Engineering]"
This splits on the word boundary anchor.
See also
regular-expressions.info/Word boundary
Matcher loop solution
If you don't mind using StringBuffer, you can also use Matcher.appendReplacement/Tail loop like this:
String text = "CALDARI_STARSHIP_ENGINEERING";
text = text.replace("_", " ");
Matcher m = Pattern.compile("(?<=\\b\\w)\\w+").matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, m.group().toLowerCase());
}
m.appendTail(sb);
System.out.println("[" + sb.toString() + "]");
// prints "[Caldari Starship Engineering]"
The regex uses a lookbehind assertion to match the "tail" part of a word, the portion that needs to be lowercased. It looks behind (?<=...) to check that there is a word boundary \b followed by a word character \w. The remaining \w+ is then matched so that it can be lowercased.
Related questions
Use Java and RegEx to convert casing in a string
Java regex does not support Perl preprocessing operations \l \u, \L, and \U.
Java split is eating my characters.
More examples of using assertions
StringBuilder and StringBuffer in Java
Unfortunately, appendReplacement/Tail only takes StringBuffer (overloads that accept StringBuilder were added in Java 9)

You can try this:
String originalString = "CALDARI_STARSHIP_ENGINEERING";
String newString =
WordUtils.capitalize(originalString.replace('_', ' ').toLowerCase());
WordUtils is part of the Apache Commons Lang library (http://commons.apache.org/lang/)

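Using reg-exps:
String s = "CALDARI_STARSHIP_ENGINEERING";
StringBuilder camel = new StringBuilder();
Matcher m = Pattern.compile("([^_])([^_]*)").matcher(s);
while (m.find()) {
    // the pattern skips the underscores, so put a space between words
    if (camel.length() > 0) camel.append(' ');
    // group(1) is the first letter of a word, group(2) is the rest of it
    camel.append(m.group(1)).append(m.group(2).toLowerCase());
}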

Untested, but that's how I implemented the same thing some time ago:
String s = "CALDARI_STARSHIP_ENGINEERING";
StringBuilder b = new StringBuilder();
boolean upper = true; // true when the next character starts a word
for (char c : s.toCharArray()) {
    if (upper) {
        b.append(c);
        upper = false;
    } else if (c == '_') {
        b.append(" ");
        upper = true;
    } else {
        b.append(Character.toLowerCase(c));
    }
}
s = b.toString();
Please note that the EVE license agreements might forbid writing external tools that help you in your careers. And it might be the trigger for you to learn Python, because most of EVE is written in Python :).

Quick and dirty way:
Lower case all (strings are immutable, so keep the result):
line = line.toLowerCase();
Split into words:
String[] words = line.split("_");
Then loop through words capitalising first letter:
words[i] = words[i].substring(0, 1).toUpperCase() + words[i].substring(1);
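The steps above (lower-case, split on the underscore, capitalise each first letter) can be put together roughly like this. It is only a sketch: the class name TitleCaseSketch and the method toTitleCase are made up for illustration, and it assumes the words are separated by single underscores.

```java
import java.util.Locale;

class TitleCaseSketch {
    // Hypothetical helper: turns UPPER_SNAKE_CASE into space-separated capitalised words.
    static String toTitleCase(String line) {
        // Strings are immutable, so the lower-cased copy has to be kept.
        String[] words = line.toLowerCase(Locale.ROOT).split("_");
        StringBuilder out = new StringBuilder();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // skip empty pieces caused by leading or doubled underscores
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(Character.toUpperCase(word.charAt(0))).append(word.substring(1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTitleCase("CALDARI_STARSHIP_ENGINEERING"));
        // prints "Caldari Starship Engineering"
    }
}
```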

Related

Convert String="one,two,three" to String='one','two','three'

I need to convert my String value "one,two,three" to 'one','two','three'.
I have the code below:
String input = "One,two,three";
I need to send these input values to a query in Hibernate,
so they have to be sent as 'one','two','three' in a single string. Please suggest an easy way to do it.

You can do the following:
String result = input.replace(",", "','").replaceAll("(.+)", "'$1'");
input.replace(",", "','") replaces each , with ',' so at this step your string will look like:
One','two','three
next we use a regex to surround the string with ', now it'll look
'One','two','three'
which is what you want.
Regex explanation: We capture the whole string, then we replace it with itself, but surrounded with single quotes. The pattern uses .+ rather than .* because in Java .* also matches the empty string at the end of the input, which would append an extra '' to the result.
References:
String#replaceAll
String#replace
Regex tutorial

Using regex :
input.replaceAll("(\\w+)", "\'$1\'")
In regex, \w matches a word character.
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#sum

String[] split = input.split(",");
StringBuilder sb = new StringBuilder();
for (int i = 0; i < split.length; i++) {
    // wrap the current element in single quotes
    sb.append("'" + split[i] + "'");
    if (i != split.length - 1) {
        sb.append(",");
    }
}
String result = sb.toString();
Not smart, but easiest.
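For comparison, the same split-quote-join idea can be written with streams on Java 8 or later. This is just a sketch: QuoteCsvSketch and quoteEach are illustrative names, and it assumes the values themselves contain no commas or single quotes (nothing is escaped).

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class QuoteCsvSketch {
    // Hypothetical helper: wraps each comma-separated value in single quotes.
    // Values are trimmed; quotes inside a value are NOT escaped.
    static String quoteEach(String input) {
        return Arrays.stream(input.split(","))
                .map(String::trim)
                .map(value -> "'" + value + "'")
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(quoteEach("One,two,three"));
        // prints 'One','two','three'
    }
}
```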

String.split by semicolon

I want to split a string by semicolon(";"):
String phrase = "‫;‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid";
String[] dateSplit = phrase.split(";");
System.out.println("dateSplit[0]:" + dateSplit[0]);
System.out.println("dateSplit[1]:" + dateSplit[1]);
But it removes the ";" from the string and puts the whole string into dateSplit[1],
so the output is:
dateSplit[0]:‫
dateSplit[1]:‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid`
and on doing
System.out.println("Real String :"+phrase);
string printed is
Real String :‫;‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid

The phrase contains bidirectional control characters such as right-to-left embedding. That is why some editors fail to display the string correctly.
This piece of code shows the actual characters in the String (for some people the phrase won't display correctly here, but it compiles and looks fine in Eclipse). I just translate left-to-right with ->, right-to-left with <- and pop direction with ^:
public static void main(String[]args) {
String phrase = "‫;‪14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid";
String[] dateSplit = phrase.split(";");
for (String d : dateSplit) {
System.out.println(d);
}
char[] c = phrase.toCharArray();
StringBuilder p = new StringBuilder();
for (int i = 0; i < c.length;i++) {
int code = Character.codePointAt(c, i);
switch (code) {
case 8234:
p.append(" -> ");
break;
case 8235:
p.append(" <- ");
break;
case 8236:
p.append(" ^ ");
break;
default:
p.append(c[i]);
}
}
System.out.println(p.toString());
}
Prints:
<- ; -> 14/May/2015 ^ ^ <- -> FC ^ ^ <- -> Barcelona ^ ^ <- -> VS. ^ ^ <- -> Real ^ ^ <- -> Madrid
The String#split() will work on the actual character string and not on what the editor displays, hence you can see the ; is the second character after a right-to-left, which gives (beware of display again: the ; is not part of the string in dateSplit[1]):
dateSplit[0] = "";
dateSplit[1] = "14/May/2015‬‬ ‫‪FC‬‬ ‫‪Barcelona‬‬ ‫‪VS.‬‬ ‫‪Real‬‬ ‫‪Madrid";
I guess you are processing data from a language that is written and read right-to-left, and there is some mixing with the football team names, which are left-to-right. The solution is certainly to get rid of the directional characters and put the ; in the right place, i.e. as a separator between the tokens.
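Getting rid of the directional characters could look something like the sketch below. It assumes the only offenders are the explicit embedding and override characters U+202A to U+202E (which include the code points 8234, 8235 and 8236 handled above); BidiStripSketch and stripBidiControls are illustrative names, and the sample string is a shortened stand-in for the real phrase.

```java
class BidiStripSketch {
    // Hypothetical helper: removes the directional formatting characters
    // U+202A..U+202E (LRE, RLE, PDF, LRO, RLO) and leaves everything else alone.
    static String stripBidiControls(String text) {
        return text.replaceAll("[\\u202A-\\u202E]", "");
    }

    public static void main(String[] args) {
        // Shortened stand-in for the phrase, with the control characters written as escapes.
        String phrase = "\u202B;\u202A14/May/2015\u202C\u202C \u202B\u202AFC\u202C\u202C";
        String cleaned = stripBidiControls(phrase);
        System.out.println(cleaned);
        // prints ";14/May/2015 FC"
    }
}
```

After stripping, the ; is still the first character, so the data itself would still need the separator moved between the tokens.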

I retyped your code instead of copying it from here, and it works perfectly fine.
public static void main(String[] args) {
String phrase = "14/May/2015; FC Barcelona VS. Real Madrid";
String[] dateSplit = phrase.split(";");
System.out.println("dateSplit[0]:" + dateSplit[0]);
System.out.println("dateSplit[1]:" + dateSplit[1]);
}

Cutting and pasting your code into IntelliJ screwed up the editor; as @Palcente said, there are possibly encoding issues.
However, I would recommend using a StringTokenizer instead.
StringTokenizer sTok = new StringTokenizer(phrase, ";");
You can then iterate over it, which leads to nicer (and safer) code.

Replace a set of substring in a string in more efficient way?

I have to replace a set of substrings in a String with other substrings, for example:
"^t" with "\t"
"^=" with "\u2014"
"^+" with "\u2013"
"^s" with "\u00A0"
"^?" with "."
"^#" with "\\d"
"^$" with "[a-zA-Z]"
So, I've tried with:
String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";
Map<String,String> tokens = new HashMap<String,String>();
tokens.put("^t", "\t");
tokens.put("^=", "\u2014");
tokens.put("^+", "\u2013");
tokens.put("^s", "\u00A0");
tokens.put("^?", ".");
tokens.put("^#", "\\d");
tokens.put("^$", "[a-zA-Z]");
String regexp = "^t|^=|^+|^s|^?|^#|^$";
StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(oppip);
while (m.find())
m.appendReplacement(sb, tokens.get(m.group()));
m.appendTail(sb);
System.out.println(sb.toString());
But it doesn't work. tokens.get(m.group()) throws an exception.
Any idea why?

You don't have to use a HashMap. Consider using simple arrays, and a loop:
String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";
String[] searchFor =
{"^t", "^=", "^+", "^s", "^?", "^#", "^$"},
replacement =
{"\\t", "\\u2014", "\\u2013", "\\u00A0", ".", "\\d", "[a-zA-Z]"};
for (int i = 0; i < searchFor.length; i++)
oppip = oppip.replace(searchFor[i], replacement[i]);
// Print the result.
System.out.println(oppip);
For completeness, you can use a two-dimensional array for a similar approach:
String oppip = "pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^";
String[][] tasks =
{
{"^t", "\\t"},
{"^=", "\\u2014"},
{"^+", "\\u2013"},
{"^s", "\\u00A0"},
{"^?", "."},
{"^#", "\\d"},
{"^$", "[a-zA-Z]"}
};
for (String[] replacement : tasks)
oppip = oppip.replace(replacement[0], replacement[1]);
// Print the result.
System.out.println(oppip);

In regex the ^ means "begin-of-text" (or negation when it is the first character inside a character class). You have to place a backslash before it, which becomes two backslashes in a Java String.
String regexp = "\\^[t=+s?#$]";
I have also shortened the alternatives to a character class.
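Plugged back into the loop from the question, the escaped pattern might be used as in the sketch below. Two things here are my own additions and not part of the answer: the class and method names (TokenReplaceSketch, replaceTokens), and the call to Matcher.quoteReplacement, which is needed because appendReplacement treats \ and $ in the replacement specially (otherwise "\\d" would come out as just d).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenReplaceSketch {
    // A literal ^ followed by one of the known token characters.
    private static final Pattern TOKEN = Pattern.compile("\\^[t=+s?#$]");
    private static final Map<String, String> TOKENS = new HashMap<>();

    static {
        TOKENS.put("^t", "\t");
        TOKENS.put("^=", "\u2014");
        TOKENS.put("^+", "\u2013");
        TOKENS.put("^s", "\u00A0");
        TOKENS.put("^?", ".");
        TOKENS.put("^#", "\\d");
        TOKENS.put("^$", "[a-zA-Z]");
    }

    // Hypothetical helper: replaces every known token in a single pass.
    static String replaceTokens(String input) {
        Matcher m = TOKEN.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Every match is a key of TOKENS, so get() never returns null here.
            // quoteReplacement keeps '\' and '$' in the replacement literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(TOKENS.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceTokens("pippo^t^# p^+alt^shefhjkhfjkdgfkagfafdjgbcnbch^"));
    }
}
```

A ^ that is not followed by one of the listed characters, like the trailing one in the sample input, is left untouched.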

Replace different Regex-Matches with Match-based results in Java

One common usage for regex is the replacement of the matches with something that is based on the matches.
For example a commit-text with ticket numbers ABC-1234: some text (ABC-1234) has to be replaced with <ABC-1234>: some text (<ABC-1234>) (<> as example for some surroundings.)
This is very simple in Java
String message = "ABC-9913 - Bugfix: Some text. (ABC-9913)";
String finalMessage = message;
Matcher matcher = Pattern.compile("ABC-\\d+").matcher(message);
if (matcher.find()) {
String ticket = matcher.group();
finalMessage = finalMessage.replace(ticket, "<" + ticket + ">");
}
System.out.println(finalMessage);
results in <ABC-9913> - Bugfix: Some text. (<ABC-9913>).
But if there are several different matches in the input String, this no longer works. I tried slightly different code, replacing if (matcher.find()) { with while (matcher.find()) {. The result is messed up with doubled replacements (<<ABC-9913>>).
How can I replace all matching values in an elegant way?

You can simply use replaceAll:
String input = "ABC-1234: some text (ABC-1234)";
System.out.println(input.replaceAll("ABC-\\d+", "<$0>"));
prints:
<ABC-1234>: some text (<ABC-1234>)
$0 is a reference to the matched string.
Java regex reference (see "Groups and capturing").

The problem is that replace() replaces every occurrence of the ticket each time it is called, so a ticket that is found twice gets wrapped twice.
A better way is to replace one match at a time. The Matcher class has an appendReplacement method for this.
String message = "ABC-9913, ABC-9915 - Bugfix: Some text. (ABC-9913,ABC-9915)";
Matcher matcher = Pattern.compile("ABC-\\d+").matcher(message);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
String ticket = matcher.group();
matcher.appendReplacement(sb, "<" + ticket + ">");
}
matcher.appendTail(sb);
System.out.println(sb);

removing space before new line in java

I have a space before a newline in a string and can't remove it (in Java).
I have tried the following but nothing works:
strToFix = strToFix.trim();
strToFix = strToFix.replace(" \n", "");
strToFix = strToFix.replaceAll("\\s\\n", "");

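myString = myString.replaceAll("[ \t]+(\r\n?|\n)", "$1");
replaceAll takes a regular expression as its first argument. The [ \t]+ matches one or more spaces or tabs. The (\r\n?|\n) matches a line break (\r\n, \r or \n) and captures it as $1, so the replacement keeps the line break and drops the whitespace before it.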
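A small self-contained check of that pattern, as a sketch (TrailingSpaceSketch and stripSpacesBeforeNewlines are names made up for the example; the regex is written with escaped \\t, \\r and \\n, which the regex engine reads the same way as the raw characters):

```java
class TrailingSpaceSketch {
    // Hypothetical helper: removes spaces and tabs that sit directly before a line break.
    static String stripSpacesBeforeNewlines(String text) {
        return text.replaceAll("[ \\t]+(\\r\\n?|\\n)", "$1");
    }

    public static void main(String[] args) {
        String fixed = stripSpacesBeforeNewlines("abc \ndef\t \r\nghi");
        // Show the line breaks visibly instead of printing them.
        System.out.println(fixed.replace("\r", "\\r").replace("\n", "\\n"));
        // prints abc\ndef\r\nghi
    }
}
```

Whitespace at the very end of the string, with no line break after it, is not touched by this pattern.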

try this:
strToFix = strToFix.replaceAll(" \\n", "\n");
'\' is a special character in regex, so you need to escape it with another '\'.

I believe you should try this instead (note replaceAll rather than replace, since replace would look for a literal backslash followed by n):
strToFix = strToFix.replaceAll(" \\n", "\n");
Edit:
I forgot the escape in my original answer. James.Xu in his answer reminded me.

Are you sure?
String s1 = "hi ";
System.out.println("|" + s1.trim() + "|");
String s2 = "hi \n";
System.out.println("|" + s2.trim() + "|");
prints
|hi|
|hi|

Are you sure that what you're trying to remove is actually a space? You should print the string's bytes and see whether that byte's value is really 32 (decimal) or 20 (hexadecimal).
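One way to do that check is to dump the code points rather than the raw bytes, which avoids any dependence on the encoding. A rough sketch, with CodePointDumpSketch and describe as made-up names:

```java
class CodePointDumpSketch {
    // Hypothetical helper: lists every code point of the string as U+XXXX.
    static String describe(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("hi \n"));
        // prints U+0068 U+0069 U+0020 U+000A
    }
}
```

A real space shows up as U+0020. If something else appears there, for example U+00A0 (non-breaking space), a pattern that looks for " " will not match it.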

trim() seems to do what you're asking on my system. Here's the code I used; maybe you want to try it on your system:
public class so5488527 {
public static void main(String [] args)
{
String testString1 = "abc \n";
String testString2 = "def \n";
String testString3 = "ghi \n";
String testString4 = "jkl \n";
testString3 = testString3.trim();
System.out.println(testString1);
System.out.println(testString2.trim());
System.out.println(testString3);
System.out.println(testString4.trim());
}
}
