Count length of String excluding linebreaks - java

I am writing out a String and I need to break the line every time there are 20 characters on it.
However, it counts the \n as a character as well. How can I break the lines when there are 20 characters on the line, excluding the \n?
Edit: Every time it has counted 20 characters, it adds \n to the end of the line.

Before printing your string, get rid of the line breaks it contains like this:
yourString = yourString.replaceAll("\\n", "");
Here "\\n" is a regular expression matching the newline character. The string literal "\\n" is actually a string containing the two characters \ and n, which the regex engine interprets as a newline.

You could first replace all occurrences of "\n" in your source string with an empty string, then run your algorithm.
destinationString = sourceString.replaceAll("\\n", "");

int count=0;
for (char c : someString.toCharArray()) {
if (c == '\n') continue;
System.out.print(c);
++count;
if (count % 20 == 0) System.out.println();
}

You can use the following
stringValue.replaceAll("[\n\r]+", "").length()
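As a minimal sketch, the loop-based approach above could be packaged as a method that returns the wrapped text instead of printing it. The class name LineWrapper, the method name wrap and the width parameter are made up for illustration, and skipping \r as well as \n is an assumption about the input.

```java
class LineWrapper {
    // Hypothetical helper: copies text, dropping existing line breaks and
    // inserting a '\n' after every `width` characters that were kept.
    static String wrap(String text, int width) {
        StringBuilder out = new StringBuilder();
        int count = 0;
        for (char c : text.toCharArray()) {
            if (c == '\n' || c == '\r') {
                continue; // existing line breaks are not counted or copied
            }
            out.append(c);
            if (++count % width == 0) {
                out.append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(wrap("This text has an\nearly break but wraps at twenty.", 20));
    }
}
```

Note that a text whose kept length is an exact multiple of the width ends with a line break.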

Related

Issue with replacing part of string in Java

I have a method that reads a sql query as a string and needs to replace all '\' with '\\'. This is done to prevent \n and \r being processed as line breaks in the output.
private static String cleanTokenForCsv(String inputToken) {
if (inputToken == null) {
return "";
}
if (inputToken.length() == 0 || inputToken.contains(",") || inputToken.contains("\"")
|| inputToken.contains("\n") || inputToken.contains("\r")) {
String replacedToken = inputToken.replace(",", ";");
return String.format("\"%s\"", replacedToken.replace("\"", "\"\""));
} else {
return inputToken;
}
}
Sample Input
(\nSELECT\n a.population_id\n ,a.empi_id\n\r ,a.encounter_id\n ,SPLIT_PART(MIN(a.service_date||'|'||a.norm_numeric_value),'|',2)::FLOAT as earliest_temperature\nFROM\n ph_f_result a)
The expected output for the query would be along the lines of
"(\nSELECT\n a.population_id\n ;a.empi_id\n\r ;a.encounter_id\n ;SPLIT_PART(MIN(a.service_date||'|'||a.norm_numeric_value);'|';2)::FLOAT as earliest_temperature\nFROM\n ph_f_result a)"
The entire query in one line with the line breaks intact
However, the output instead is
"(
SELECT
a.population_id
;a.empi_id
;a.encounter_id
;SPLIT_PART(MIN(a.service_date||'|'||a.norm_numeric_value);'|';2)::FLOAT as earliest_temperature
FROM
ph_f_result a)"
I also tried the following:
replacedToken = replacedToken.replace("\\", "\\\\");
With Regex
replacedToken = replacedToken.replaceAll("\\\\", "\\\\\\\\");
Edit: So I can get it to work if I add individual replace calls for \n and \r like below
replacedToken = replacedToken.replace("\n","\\n");
replacedToken = replacedToken.replace("\r", "\\r");
But I am looking for something more generic for all '\n' instances
You have carriage return and newline characters, not escaped r or n.
replacedToken = replacedToken.replace("\n", "\\n").replace("\r", "\\r");
This replaces all carriage return and newline characters with their escaped equivalents.
I assume that your goal is simply to convert characters \r, \t and \n in an input String to double-quoted two-character strings "\\r" and so on, so that printing the string does not result in newlines or tabs.
Note that the character \n does not really contain the character \ at all. We simply agree to write \n to represent it. This should work:
public static String escapeWhitespaceEscapes(String input) {
return input
.replace("\n", "\\n")
.replace("\r", "\\r")
.replace("\t", "\\t");
}
But note that you will have to perform the reverse operation to get back the original string.
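A rough sketch of what the reverse operation could look like follows. One caveat: with the method above, an input that already contains a literal backslash followed by n cannot be told apart from an escaped newline afterwards. The sketch therefore also doubles backslashes when escaping, which goes beyond what the answer above does. The class and method names are invented for the example.

```java
class WhitespaceEscapes {
    // Escapes backslashes first (an addition to the answer above) so that
    // unescape() can reverse the result unambiguously.
    static String escape(String input) {
        return input
                .replace("\\", "\\\\")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t");
    }

    // Scans once, turning each backslash pair back into its character.
    static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                char next = input.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    case 't': out.append('\t'); break;
                    case '\\': out.append('\\'); break;
                    default: out.append('\\').append(next); // unknown pair: keep as is
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = escape("SELECT\n a.id\r\n\tFROM t");
        System.out.println(escaped);                  // printed on a single line
        System.out.println(unescape(escaped).equals("SELECT\n a.id\r\n\tFROM t"));
    }
}
```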

How can I add a character inside a Regular Expression which changes each time?

String s = scan.nextLine();
s = s.replaceAll(" ", "");
for (int i = 0; i < s.length(); i++) {
System.out.print(s.charAt(i) + "-");
int temp = s.length();
// this line is the problem
s = s.replaceAll("[s.charAt(i)]", '');
System.out.print((temp - s.length()) + "\n");
i = -1;
}
I was actually using the above method to count each character.
I wanted to use s.charAt(i) inside the regular expression so that it counts and displays as below, but I know that the line marked with the comment doesn't work.
Is this possible, and if so, how can I do it?
Example:
MALAYALAM (input)
M-2
A-4
L-2
Y-1
Java does not have string interpolation, so code written inside a string literal will not be executed; it is just part of the string. You would need to do something like "[" + s.charAt(i) + "]" instead to build the string programmatically.
But this is problematic when the character is a regex special character, for example ^. In this case the character class would be [^], which matches absolutely any character. You could escape regex special characters while building the regex, but this is overly complicated.
Since you just want to replace occurrences of an exact substring, it is simpler to use the replace method, which does not take a regex. Don't be fooled by the names replace vs. replaceAll; both methods replace all occurrences. The real difference is that replaceAll takes a regex, whereas replace takes an exact substring. For example:
> "ababa".replace("a", "")
"bb"
> "ababa".replace("a", "c")
"cbcbc"

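Putting this together, the counting loop from the question might be sketched as below with replace instead of replaceAll. The class name CharCounter and the method countChars are made up for the example, and it assumes the input contains no characters outside the Basic Multilingual Plane, since it works on single char values.

```java
class CharCounter {
    // Repeatedly takes the first remaining character, removes every
    // occurrence of it with the literal (non-regex) replace method, and
    // records how much shorter the string became.
    static String countChars(String input) {
        String s = input.replace(" ", "");
        StringBuilder out = new StringBuilder();
        while (!s.isEmpty()) {
            String ch = String.valueOf(s.charAt(0));
            int before = s.length();
            s = s.replace(ch, "");
            out.append(ch).append('-').append(before - s.length()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(countChars("MALAYALAM"));
    }
}
```

Because replace treats its argument literally, characters such as ^ or [ need no special handling here.
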
Use regex to un camelCase Java String

This code seems to work perfectly, but I'd love to clean it up with regex.
public static void main(String args[]) {
String s = "IAmASentenceInCamelCaseWithNumbers500And1And37";
System.out.println(unCamelCase(s));
}
public static String unCamelCase(String string) {
StringBuilder newString = new StringBuilder(string.length() * 2);
newString.append(string.charAt(0));
for (int i = 1; i < string.length(); i++) {
if (Character.isUpperCase(string.charAt(i)) && string.charAt(i - 1) != ' '
|| Character.isDigit(string.charAt(i)) && !Character.isDigit(string.charAt(i - 1))) {
newString.append(' ');
}
newString.append(string.charAt(i));
}
return newString.toString();
}
Input:
IAmASentenceInCamelCaseWithNumbers500And1And37
Output:
I Am A Sentence In Camel Case With Numbers 500 And 1 And 37
I'm not a fan of using that ugly if statement, and I'm hoping there's a way to use a single line of code that utilizes regex. I tried for a bit but it would fail on words with 1 or 2 letters.
My failing attempt:
return string.replaceAll("(.)([A-Z0-9]\\w)", "$1 $2");
This regex and code should do the job:
String s = "IAmASentenceInCamelCaseWithNumbers500And1And37";
System.out.println("Output: " + s.replaceAll("[A-Z]|\\d+", " $0").trim());
This outputs,
Output: I Am A Sentence In Camel Case With Numbers 500 And 1 And 37
Editing answer for query asked by OP in comment:
If input string is,
ThisIsAnABBRFor1Abbreviation
The regex needs a small modification to handle abbreviations and becomes [A-Z]+(?![a-z])|[A-Z]|\\d+.
This code,
String s = "ThisIsAnABBRFor1Abbreviation";
System.out.println("Output: " + s.replaceAll("[A-Z]+(?![a-z])|[A-Z]|\\d+", " $0").trim());
Gives the expected output, as requested by OP ZeekAran in a comment:
Output: This Is An ABBR For 1 Abbreviation
You may use this lookaround based regex solution:
final String result = string.replaceAll(
"(?<=\\S)(?=[A-Z])|(?<=[^\\s\\d])(?=\\d)", " ");
//=> I Am A Sentence In Camel Case With Numbers 500 And 1 And 37
RegEx Details:
Regex matches either of 2 conditions and replaces it with a space. It will ignore already present spaces in input.
(?<=\\S)(?=[A-Z]): Previous char is non-space and next char is an uppercase letter
|: OR
(?<=[^\\s\\d])(?=\\d): previous char is non-digit & non-space and next one is a digit
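To try the lookaround version, a small self-contained sketch could look like this. The class name CamelCaseSplitter is invented for the example; the regex is the one from the answer above.

```java
class CamelCaseSplitter {
    // Inserts a space at every zero-width position matched by the
    // lookaround regex; no characters are consumed or removed.
    static String unCamelCase(String input) {
        return input.replaceAll(
                "(?<=\\S)(?=[A-Z])|(?<=[^\\s\\d])(?=\\d)", " ");
    }

    public static void main(String[] args) {
        System.out.println(unCamelCase("IAmASentenceInCamelCaseWithNumbers500And1And37"));
        System.out.println(unCamelCase("ThisIsAnABBRFor1Abbreviation"));
    }
}
```

Since every uppercase letter preceded by a non-space gets a space in front of it, a run of capitals such as ABBR comes out as "A B B R" with this pattern. If abbreviations should stay together, the modified regex from the previous answer is the better fit.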
I think you can try this
let str = "IAmASentenceInCamelCaseWithNumbers500And1And37";
function unCamelCase(str){
return str.replace(/(?:[A-Z]|[0-9]+)/g, (m)=>' '+m.toUpperCase()).trim();
}
console.log(unCamelCase(str));
Explanation
(?:[A-Z]|[0-9]+)
?: - Non capturing group.
[A-Z] - Matches any one capital character.
'|' - Alternation (This works same as Logical OR).
[0-9]+ - Matches any digit from 0-9 one or more time.
P.S. Sorry for the example in JavaScript, but the same logic can be achieved in Java pretty easily.

String.replace() not replacing all occurrences

I have a very long string which looks similar to this.
355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,....
I tried using the following code to remove the number 382 from the string:
String str = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,....";
str = str.replace(",382,", ",");
But it seems that not all occurrences are being replaced. The string, which originally had more than 3000 occurrences, was still left with about 630 occurrences after replacing.
Is the capability of String.replace() limited? If so, is there a possible way of achieving what I need?
You need to replace the trailing comma as well (if one exists, which it won't if last in the list):
str = str.replaceAll("\\b382,?", "");
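Note the \b word boundary, which prevents matching inside "1382".
The above will convert:
382,111,382,1382,222,382
to:
111,1382,222,
A trailing comma remains when 382 is the last element, because the comma before it is not consumed; strip it separately if needed.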
I think the issue is your first argument to replace(), in particular the comma (,) before and after 382. If you have "382,382,383", you will only match the inner ",382," and leave the initial one behind. Try:
str = str.replace("382,", "");
Although this will fail to match "382" at the very end as it does not have a comma after it.
A full solution might entail two method calls thus:
str = str.replace("382", ""); // Remove all instances of 382
str = str.replaceAll(",,+", ","); // Compress all duplicates, triplicates, etc. of commas
This combines the two approaches:
str = str.replaceAll("382,?", ""); // Remove 382 and an optional comma after it.
Note: both of the last two approaches leave a trailing comma if 382 is at the end.
Try this:
str = str.replaceAll(",382,", ",");
First, remove the leading comma from your search string. Then collapse runs of duplicated commas into a single comma using a Java regular expression.
String input = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399";
String result = input.replace("382,", ","); // remove the preceding comma
String result2 = result.replaceAll("[,]+", ","); // replace duplicate commas
System.out.println(result2);
As dave already said, the problem is that your pattern overlaps. In the string "...,382,382,..." there are two occurrences of ",382,":
"...,382,382,..."
    -----          first occurrence
        -----      second occurrence
These two occurrences overlap at the comma, and thus Java can only replace one of them. When finding occurrences, it does not yet look at what you replace the pattern with, and thus it doesn't see that a new occurrence of ",382," is generated when the first occurrence is replaced by the comma.
If your data is known not to contain numbers with more than 3 digits, then you might do:
str.replace("382,", "");
and then handle occurrences at the end as a special case. But if your data can contain big numbers, then "...,1382,..." will be replaced by "...,1,..." which probably is not what you want.
Here are two solutions that do not have the above problem:
First, simply repeat the replacement until no changes occur anymore:
String oldString = str;
str = str.replace(",382,", ",");
while (!str.equals(oldString)) {
oldString = str;
str = str.replace(",382,", ",");
}
After that, you will have to handle possible occurrences at the end of the string.
Second, if you have Java 8, you can do a little more work yourself and use Java streams:
str = Arrays.stream(str.split(","))
.filter(s -> !s.equals("382"))
.collect(Collectors.joining(","));
This first splits the string at ",", then filters out all strings which are equal to "382", and then concatenates the remaining strings again with "," in between.
(Both code snippets are untested.)
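For anyone who wants to try it, here is a self-contained sketch of the stream variant together with a small demonstration of the overlap problem. The class name RemoveToken and the method removeToken are invented for the example, and passing -1 to split (to keep trailing empty fields) is a choice of this sketch, not part of the snippet above.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class RemoveToken {
    // Splits on commas, drops every field equal to `token`, and joins the
    // remaining fields with commas again.
    static String removeToken(String csv, String token) {
        return Arrays.stream(csv.split(",", -1))
                .filter(s -> !s.equals(token))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Overlapping matches: only the first ",382," is replaced.
        System.out.println("a,382,382,b".replace(",382,", ","));

        // Field-wise filtering is not affected by overlaps or by 1382.
        System.out.println(removeToken("382,111,382,1382,222,382", "382"));
    }
}
```

Comparing whole fields also means that occurrences at the start or end of the string need no special handling.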
Traditional way:
String str = ",abc,null,null,0,0,7,8,9,10,11,12,13,14";
String newStr = "", word = "";
for (int i=0; i<str.length(); i++) {
if (str.charAt(i) == ',') {
if (word.equals("null") || word.equals("0"))
word = "";
newStr += word+",";
word = "";
} else {
word += str.charAt(i);
if (i == str.length()-1)
newStr += word;
}
}
System.out.println(newStr);
Output:
,abc,,,,,7,8,9,10,11,12,13,14

Java: Count empty lines in a text file/string

I am using the following code to count empty lines in Java, but this code returns a greater number of empty lines than there are.
int countEmptyLines(String s) {
int result=0;
Pattern regex = Pattern.compile("(?m)^\\s*$");
Matcher testMatcher = regex.matcher(s);
while (testMatcher.find())
{
result++;
}
return result;
}
What am I doing wrong or is there a better way to do it?
Try this:
final BufferedReader br = new BufferedReader(new StringReader("hello\n\nworld\n"));
String line;
int empty = 0;
while ((line = br.readLine()) != null) {
if (line.trim().isEmpty()) {
empty++;
}
}
System.out.println(empty);
I found a way to fix my own regex while I was at lunch:
Pattern regex = Pattern.compile("(?m)^\\s*?$");
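The '?' makes the \s* reluctant, so it matches as few whitespace characters as possible and stops at the first position where '$' can match, instead of running on across the following line terminators.
\s matches any whitespace character: a space, a tab, a carriage return, a linefeed, a form feed or a vertical tab.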
The easiest way to do this is to count runs of successive EOL characters. I write EOL because you need to determine which character sequence denotes the end of a line in your file. Under Windows, an end of line is a carriage return followed by a linefeed character. Under Unix it is a single linefeed, so for a file written under Unix your program will have to be adjusted.
Then, for every run of successive end-of-line sequences, add its length minus 1 to a counter. At the end, the counter holds the empty line count.
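The run-counting idea might be sketched as follows. Several things here are assumptions of the sketch rather than part of the description above: it normalizes \r\n and lone \r to \n first so one loop covers both conventions, it treats the start of the text as if an end of line preceded it so that leading empty lines are counted, and it only counts lines that are completely empty (lines containing spaces or tabs are not counted). The names are invented for the example.

```java
class EmptyLineCounter {
    static int countEmptyLines(String text) {
        // Assumption: treat "\r\n" and a lone '\r' like '\n'.
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        int empty = 0;
        // Start at 1: the beginning of the text acts like a preceding EOL.
        int run = 1;
        for (int i = 0; i < normalized.length(); i++) {
            if (normalized.charAt(i) == '\n') {
                run++;
            } else {
                if (run > 1) {
                    empty += run - 1; // a run of n EOLs encloses n - 1 empty lines
                }
                run = 0;
            }
        }
        if (run > 1) {
            empty += run - 1; // run that reaches the end of the text
        }
        return empty;
    }

    public static void main(String[] args) {
        System.out.println(countEmptyLines("hello\n\nworld\n"));
    }
}
```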
