ASCII to HTML-Entities Escaping in Java

I found this website with escape codes, and I'm wondering whether someone has already done this so I don't have to spend a couple of hours building this logic:
StringBuffer sb = new StringBuffer();
int n = s.length();
for (int i = 0; i < n; i++) {
char c = s.charAt(i);
switch (c) {
case '\u25CF': sb.append("&#9679;"); break;
case '\u25BA': sb.append("&#9658;"); break;
/*
... the rest of the hex chars literals to HTML entities
*/
default: sb.append(c); break;
}
}

These "codes" are merely the decimal representation of the Unicode value of each character. It seems to me that something like this would work, unless you want to be very strict about which codes get converted and which don't.
StringBuilder sb = new StringBuilder();
int n = s.length();
for (int i = 0; i < n; i++) {
char c = s.charAt(i);
if (Character.UnicodeBlock.of(c) != Character.UnicodeBlock.BASIC_LATIN) {
sb.append("&#");
sb.append((int)c);
sb.append(';');
} else {
sb.append(c);
}
}

The other answers don't work correctly for surrogate pairs, e.g. if you have emoji such as "😀" (U+1F600). Here's how to do it in Java 8:
StringBuilder sb = new StringBuilder();
s.codePoints().forEach(codePoint -> {
if (Character.UnicodeBlock.of(codePoint) != Character.UnicodeBlock.BASIC_LATIN) {
sb.append("&#");
sb.append(codePoint);
sb.append(';');
} else {
sb.appendCodePoint(codePoint);
}
});
And for older Java:
StringBuilder sb = new StringBuilder();
for (int i = 0; i < s.length(); ) {
int c = s.codePointAt(i);
if (Character.UnicodeBlock.of(c) != Character.UnicodeBlock.BASIC_LATIN) {
sb.append("&#");
sb.append(c);
sb.append(';');
} else {
sb.appendCodePoint(c);
}
i += Character.charCount(c);
}
A simple way to test if a solution handles surrogate pairs correctly is to use "\uD83D\uDE00" (😀) as the input. If the output is "&#55357;&#56832;", then it's wrong. The correct output is "&#128512;".
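As a rough sketch, the code-point loop above can be wrapped in a method so the surrogate-pair check can be run directly. The class and method names (HtmlEntityEscaper, escapeNonAscii) are hypothetical; the comparison against 0x7F is used here in place of the BASIC_LATIN block test, since that block covers U+0000 to U+007F.

```java
class HtmlEntityEscaper {
    // Hypothetical helper: writes every code point above U+007F as a
    // decimal numeric character reference and copies the rest unchanged.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);          // reads a full surrogate pair as one value
            if (cp > 0x7F) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.append((char) cp);
            }
            i += Character.charCount(cp);       // advance by 1 or 2 chars
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("\uD83D\uDE00")); // prints &#128512;
        System.out.println(escapeNonAscii("caf\u00E9"));    // prints caf&#233;
    }
}
```

Note that this only handles non-ASCII characters; markup characters such as < and & pass through untouched.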

Hmm, what if you did something like this instead:
if (c > 127) {
sb.append("&#" + (int) c + ";");
} else {
sb.append(c);
}
Then you just need to determine the range of characters you want HTML escaped. In this case I just specified any character beyond the ASCII table space.

Related

A simple decryption in Java

I am writing a method to decrypt an input string. The encryption is straightforward: any repeating character in the string is replaced by the character followed by the number of times it appears in the string. So hello is encrypted as hel2o. Below is the decryption logic I have so far. It works, but it is very imperative and involves multiple loops. How can this be improved?
String input = "hel2o";
String[] inarr = input.split("\\s+");
StringBuilder sb = new StringBuilder();
for(int i = 0; i < inarr.length; i++) {
String s = inarr[i];
char[] c = s.toCharArray();
for(int j = 0; j < c.length; j++) {
if(Character.isDigit(c[j])) {
for(int x = 0; x < Character.getNumericValue(c[j])-1; x++) {
sb.append(c[j-1]);
}
} else {
sb.append(c[j]);
}
}
}
System.out.println(sb.toString());

You pretty much asked for a solution but I had fun doing it so I'll share.
You can do it with one loop by appending carefully. Also, unlike yours, my solution works with multi-digit numbers, e.g.
hel23o will convert to helllllllllllllllllllllllo with 23 l's.
String input = "hel23o";
StringBuilder builder = new StringBuilder();
char previousChar = ' ';
StringBuilder number = new StringBuilder();
for (char c : input.toCharArray()) {
if (Character.isDigit(c)) {
number.append(c);
continue;
}
if (number.length() > 0 ) {
int count = Integer.parseInt(number.toString());
count = count > 1 ? count - 1 : 0;
builder.append(String.join("", Collections.nCopies(count, String.valueOf(previousChar))));
}
builder.append(c);
previousChar = c;
number.setLength(0);
}
Alternatively, without multi-digit number support:
String input = "hel3o";
StringBuilder builder = new StringBuilder();
char previousChar = ' ';
for (char c : input.toCharArray()) {
if (Character.isDigit(c)) {
builder.append(String.join("", Collections.nCopies(Character.getNumericValue(c) - 1, String.valueOf(previousChar))));
continue;
}
builder.append(c);
previousChar = c;
}
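For a less imperative variant, the same decoding can be sketched with a regular expression that matches one non-digit character followed by its count. This is only one possible approach, not the answer's method; the class name RunLengthDecoder and the choice to treat counts of 0 and 1 as a single occurrence (mirroring the loop above) are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RunLengthDecoder {
    // One non-digit character followed by its repeat count.
    private static final Pattern RUN = Pattern.compile("(\\D)(\\d+)");

    static String decode(String input) {
        Matcher m = RUN.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption: a count of 0 or 1 still leaves one occurrence.
            int count = Math.max(1, Integer.parseInt(m.group(2)));
            StringBuilder run = new StringBuilder();
            for (int i = 0; i < count; i++) {
                run.append(m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the input literal.
            m.appendReplacement(out, Matcher.quoteReplacement(run.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("hel2o")); // prints hello
    }
}
```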

Trying to change a string to altcase

I'm trying to write code that makes a string become altcase (i.e. "hello" becomes "HeLlO"). I borrowed code from another question on this forum that asked for something similar (Java Case Switcher). However, that code only switched the casing of each letter instead of producing a pattern of a capital letter first, then a lowercase letter, and so on.
What I have so far:
public String altCase(String text)
{
String str = "";
for (int i = 0; i <= text.length(); i++)
{
char cA = text.charAt(i);
if (text.charAt(0).isUppercase)
{
str += Character.toLowerCase(cA);
}
if (text.charAt(0).isLowercase)
{
str += Character.toUpperCase;
}
if(i != 0 && Character.isUpperCase(cA))
{
if (text.charAt(i)-1.isUpperCase || text.charAt(i)+1.isUpperCase)
{
str += Character.toLowerCase(cA);
}
else
{
str += cA;
}
}
if(i != 0 && Character.isLowerCase(cA))
{
if (text.charAt(i)-1.isLowerCase || text.charAt(i)+1.isLowerCase)
{
str += Character.toUpperCase(cA);
}
else
{
str += cA;
}
}
}
return str;
}
I'm still relatively new to coding in general so please excuse my inefficiencies, as well as any headaches I might induce from the lack of experience in my coding. I cannot tell where I am going wrong except maybe when I typed "text.charAt(i)-1.isLowerCase" as the statement seems a bit illogical, but I am lost in terms of trying to come up with something else that would accomplish the same thing. Or is my error completely elsewhere? Thanks for any help in advance.

The modulus operator could take you a long way here...
StringBuilder rslt = new StringBuilder();
for (int i = 0; i < text.length(); i++) {
char c = text.charAt(i);
switch (i % 2) {
case 0:
rslt.append(Character.toUpperCase(c));
break;
case 1:
rslt.append(Character.toLowerCase(c));
break;
}
}
return rslt.toString();

If I understand correctly, what you want is this:
Take a string and change it into the format AbCdEfG... and so on.
There is a simpler solution.
Loop over the string and, for every character, change its case depending on its position in the string: upper case when i%2 == 0, lower case when i%2 == 1.
public String altCase(String text)
{
String str = "";
for (int i = 0; i < text.length(); i++)
{
char cA = text.charAt(i);
if (i%2 == 0)
{
str += Character.toUpperCase(cA);
}
else
{
str += Character.toLowerCase(cA);
}
}
return str;
}

I would start with a StringBuilder (a mutable character sequence) of text.toLowerCase(); then set the characters at even indices to their capital equivalents (and your method doesn't appear to depend on instance state, so it might be static). Something like,
public static String altCase(String text) {
StringBuilder sb = new StringBuilder(text.toLowerCase());
for (int i = 0; i < text.length(); i += 2) {
sb.setCharAt(i, Character.toUpperCase(sb.charAt(i)));
}
return sb.toString();
}

IntStream.range(0, s.length()).mapToObj(i -> i % 2 == 0 ?
Character.toUpperCase(s.charAt(i)) :
Character.toLowerCase(s.charAt(i)))
.map(String::valueOf)
.collect(Collectors.joining());

Convert java char inside string to lowerCase/upperCase

I have a String called "originalstring" which contains a sentence with mixed upper and lower case characters.
I simply want to flip the string so that if a character is a lowercase make it upper case and vice versa and return it.
I have tried this code, which returns the original string in upperCase:
for (int i = 0; i < originalString.length(); i++) {
char c = originalString.charAt(i);
if (Character.isUpperCase(c)) {
originalString += Character.toLowerCase(c);
}
if (Character.isLowerCase(c)) {
originalString += Character.toUpperCase(c);
}
}
return originalString;

You are adding characters to the original string. This also means that your for loop will never reach its end, because originalString.length() grows on each iteration. It's an infinite loop.
Instead, create a StringBuilder that stores the converted characters as you're iterating over the original string. Then convert it to a String and return it at the end.
StringBuilder buf = new StringBuilder(originalString.length());
for (int i = 0; i < originalString.length(); i++) {
char c = originalString.charAt(i);
if (Character.isUpperCase(c)) {
buf.append(Character.toLowerCase(c));
}
else if (Character.isLowerCase(c)) {
buf.append(Character.toUpperCase(c));
}
// Account for case: neither upper nor lower
else {
buf.append(c);
}
}
return buf.toString();

Apache Commons Lang provides a swapCase function in StringUtils. Sample from the documentation:
StringUtils.swapCase(null) = null
StringUtils.swapCase("") = ""
StringUtils.swapCase("The dog has a BONE") = "tHE DOG HAS A bone"
And if you really want to do it yourself, you can check the source of Commons Lang StringUtils.

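Instead of using existing utilities, you may try the conversion below using bitwise operations. Note that these tricks are only valid for the ASCII letters A-Z and a-z:
To upper case (clears bit 0x20 of a lowercase letter):
char upperChar = (char) (c & 0x5f);
To lower case (flips bit 0x20 of an uppercase letter):
char lowerChar = (char) (c ^ 0x20);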
In your program:
StringBuilder result = new StringBuilder(originalString.length());
for (int i = 0; i < originalString.length(); i++) {
char c = originalString.charAt(i);
if ((c >= 'A') && (c <= 'Z')) { // ASCII only: the bit trick is wrong for other uppercase letters
result.append((char) (c ^ 0x20));
}
else if ((c >= 'a') && (c <= 'z')) {
result.append((char) (c & 0x5f));
}
else {
result.append(c);
}
}
System.out.println(result);
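Because ASCII upper- and lowercase letters differ only in bit 0x20 ('A' is 0x41, 'a' is 0x61), a single XOR can flip the case in both directions. The following is a small sketch of that idea; the class and method names are hypothetical, and everything outside A-Z and a-z is deliberately left untouched.

```java
class AsciiCaseFlip {
    // Hypothetical helper: swaps the case of ASCII letters only.
    static String swapAsciiCase(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean asciiLetter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            // XOR with 0x20 toggles the single bit that separates 'A' from 'a'.
            out.append(asciiLetter ? (char) (c ^ 0x20) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapAsciiCase("The dog has a BONE")); // prints tHE DOG HAS A bone
    }
}
```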

How to make alternate characters in a string to uppercase?

I wrote the following code, but identical characters always end up in the same case. What's wrong with this code, and how can this problem be solved?
private void genBTActionPerformed(java.awt.event.ActionEvent evt) {
String str = new String(strTF.getText());
int n = str.length();
char ch;
int i;
for(i = 0; i < n; i++) {
if(i % 2 == 0) {
ch = Character.toLowerCase(str.charAt(i));
str = str.replace(str.charAt(i), ch);
} else {
ch = Character.toUpperCase(str.charAt(i));
str = str.replace(str.charAt(i), ch);
}
}
jumTF.setText(str);
}

Unlike what its name says, .replace() replaces characters/CharSequences in the whole input. The difference with .replaceAll() is that it takes literals as arguments and not regexes/regex replacements strings (and that it has an overload taking two chars as arguments). That is the second worst misnamed method of the String class after matches().
Moreover, you create a new String for each character you replace, so you end up with n+1 strings for an n-character string. Do it like this instead:
final char[] chars = str.toCharArray();
final int len = chars.length;
char c;
for (int i = 0; i < len; i++) {
c = chars[i];
chars[i] = i % 2 == 0
? Character.toLowerCase(c)
: Character.toUpperCase(c);
}
jumTF.setText(new String(chars));
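The whole-string behaviour of replace() described above can be seen in isolation with a short sketch. The class and helper names here are hypothetical; the helper uses StringBuilder.setCharAt to change a single position for comparison.

```java
class ReplaceVersusSetCharAt {
    // Changes only the character at the given index,
    // leaving equal characters elsewhere alone.
    static String withCharAt(String s, int index, char ch) {
        StringBuilder sb = new StringBuilder(s);
        sb.setCharAt(index, ch);
        return sb.toString();
    }

    public static void main(String[] args) {
        String word = "banana";
        // replace(char, char) touches every 'a', not just one position.
        System.out.println(word.replace('a', 'A'));   // prints bAnAnA
        System.out.println(withCharAt(word, 1, 'A')); // prints bAnana
    }
}
```

This is why, in the question's loop, every occurrence of a letter ends up with the case chosen for its last processed position.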

In your program you were using replace(), which replaces characters/CharSequences in the whole input. What you need to do is:
Put the string into a char array.
Iterate over that array.
Convert the array back into a string.
private void genBTActionPerformed(java.awt.event.ActionEvent evt) {
String str = new String(strTF.getText());
char [] chr= str.toCharArray();
int n = chr.length;
char ch;
int i;
for(i = 0; i < n; i++) {
if(i % 2 == 0) {
ch = Character.toLowerCase(chr[i]);
chr[i]=ch;
} else {
ch = Character.toUpperCase(chr[i]);
chr[i]=ch;
}
}
jumTF.setText(new String(chr));
}
Hope this helps :)

Since Strings are immutable in Java, you can use a StringBuilder or StringBuffer to solve this problem:
StringBuilder str=new StringBuilder(inputString);
You can keep your own logic with a slight change. Instead of using
str = str.replace(str.charAt(i), ch); // replaces in the whole string
use
str.setCharAt(i,ch);
So your final program looks like this:
for(i = 0; i < n; i++) {
if(i % 2 == 0) {
ch = Character.toLowerCase(str.charAt(i));
str.setCharAt(i,ch);
} else {
ch = Character.toUpperCase(str.charAt(i));
str.setCharAt(i,ch);
}
}
Suppose inputString is: stackoverflow
Then the output is: sTaCkOvErFlOw

How to Convert String to another String format

How can I convert a string in the form eyesOfTheTiger to one that reads eyes-of-the-tiger?

Just iterate through the string and take a different action if the character is uppercase.
public class Test {
private static String upperCaseToDash(String input) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < input.length(); i++) {
char c = input.charAt(i);
if (Character.isUpperCase(c))
sb.append('-').append(Character.toLowerCase(c));
else
sb.append(c);
}
return sb.toString();
}
public static void main(String[] args) {
System.out.println(upperCaseToDash("eyesOfTheTiger"));
}
}

Before you start implementing this function yourself via substrings, regex, etc, consider using Google Guava. Class com.google.common.base.CaseFormat solves exactly what you intend to do.
In your case you need the LOWER_CAMEL and LOWER_HYPHEN class constants and the to(CaseFormat format, String s) method.
IMO, it's always better to use a mature and well-tested library than to implement everything yourself.

You can split() the String using a regex such as "(?<!(^|[A-Z0-9]))(?=[A-Z0-9])|(?<!^)(?=[A-Z][a-z])", then append - after each lowercased piece and drop the trailing one.
public String camelCaseToDashSeparated(String initialString) {
if(initialString==null || initialString.length()<1)
return initialString;
StringBuilder str = new StringBuilder();
for (String w : initialString.split("(?<!(^|[A-Z0-9]))(?=[A-Z0-9])|(?<!^)(?=[A-Z][a-z])")) { // split the argument, not a hard-coded literal
str.append(w.toLowerCase()+"-");
}
return str.substring(0, str.length()-1);
}
Another way would be:
Iterate through the String char by char and keep adding the characters to a buffer. Once you find an uppercase char, append - to the buffer followed by the lowercase form of that char.
public static String camelCaseToDashSeparated2 (String initialString) {
StringBuffer buff = new StringBuffer();
for(int x = 0; x < initialString.length(); x++) {
char c = initialString.charAt(x);
if(Character.isUpperCase(c)) {
buff.append("-").append(Character.toLowerCase(c));
}
else {
buff.append(c);
}
}
return buff.toString();
}

Quick and dirty solution could be something like this:
(you should decide what to do with spaces, dashes, full stops,
languages other than English etc.)
public static String toDashed(String value) {
if (null == value)
return null;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < value.length(); ++i) {
char ch = value.charAt(i);
if ((ch >= 'A') && (ch <= 'Z') && (i > 0)) {
sb.append('-');
sb.append(Character.toLowerCase(ch));
}
else
sb.append(ch);
}
return sb.toString();
}
