Java and Regex to add some linebreaks


So I have the following string, and I want to insert a \n before each of the numbers:
1. Hello 2. Satuday 3.Kidding 4. sdsfjdfkj
I want to replace it to look like this
1. Hello
2. Satuday
3.Kidding
4. sdsfjdfkj
I was thinking something like this
variable.replaceAll("\d.", "\n");
I'm not sure how to keep the text I matched (the number) in the replacement.

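You can use replaceAll with a zero-width look-ahead, like this:
String res = str.replaceAll("\\b(?=\\d+[.])", "\n");
Given your string as input, it prints each numbered item on its own line. The spaces before each number are kept, and because the pattern also matches at the very start of the string, the output begins with an empty line: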
1. Hello
2. Satuday
3.Kidding
4. sdsfjdfkj

So basically you want to replace every run of whitespace that is followed by a number and a dot with a newline. Try
variable = variable.replaceAll("\\s+(\\d+[.])", "\n$1");
// $1 is a reference to captured group 1, which contains the number and the dot
or
variable = variable.replaceAll("\\s+(?=\\d+[.])", "\n");
// (?=...) is called a look-ahead; \\s+(?=\\d+[.]) makes sure that the matched
// whitespace is followed by a number and a dot
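The look-ahead variant can be tried with a minimal sketch like the following. The class and method names (LineBreaks, breakBeforeNumbers) are made up for illustration; only the pattern comes from the answer.

```java
class LineBreaks {
    // Replaces each run of whitespace that is directly followed by
    // "<digits>." with a single newline. The number itself is not consumed,
    // because the look-ahead only checks what comes next.
    static String breakBeforeNumbers(String text) {
        return text.replaceAll("\\s+(?=\\d+[.])", "\n");
    }

    public static void main(String[] args) {
        String input = "1. Hello 2. Satuday 3.Kidding 4. sdsfjdfkj";
        System.out.println(breakBeforeNumbers(input));
    }
}
```

Because the pattern requires whitespace in front of the number, nothing is inserted before the leading "1.", so the result does not start with an empty line.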

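A somewhat slow but easy fix:
string = string.replace("1", "\n1"); // "\n" is the escape sequence for a newline
Then repeat for each of the other numbers. Note that this also breaks the line before any other "1" in the text, for example the one inside "10.".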

Related

Can't get rid of the empty first element when using a regex on polynomials

here is my code
String a = "X^5+2X^2+3X^3+4X^4";
String exp[]=a.split("(|\\+\\d)[xX]\\^");
for(int i=0;i<exp.length;i++) {
System.out.println("exp: "+exp[i]+" ");
}
I am trying to get the output 5,2,3,4,
but instead I got this:
exp:
exp:5
exp:2
exp:3
exp:4
I don't know where the empty first line comes from, and I cannot find a way to get rid of it. I tried other regexes and also compiling the pattern, but I still can't get rid of the first line. When I try the new string "X+X^5+2X^2+3X^3+4X^4", the first line shows exp:X.
I also tried my regex in an online regex tester, which gives 5,2,3,4, but Eclipse gives an empty entry and then 5,2,3,4. I need help figuring this out.

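Try using a regex with a Matcher instead of split, e.g.:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String input = "X^5+2X^2+3X^3+4X^4";
Pattern pattern = Pattern.compile("\\^([0-9]+)");
Matcher matcher = pattern.matcher(input);
// find() advances to the next match; group(1) holds the digits after the ^
while (matcher.find()) {
    System.out.println("exp: " + matcher.group(1));
}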
It gives output:
exp: 5
exp: 2
exp: 3
exp: 4
How it works:
Pattern used: \^([0-9]+)
It matches a ^ followed by one or more digits (note the + sign). The caret (^) is prefixed with a backslash (\) because it has a special meaning in regular expressions (the beginning of the input), but in your case you just want to match a literal ^ character.
We wrap the digits in a group so we can refer to them later when processing each match. That means marking them with parentheses ( and ).
Then we want to put our pattern into a Java String. In a String literal, the \ character has a special meaning: it starts an escape sequence, e.g. "\n" represents a new line. So to put our pattern into a String literal we need to escape the \, and our pattern becomes "\\^([0-9]+)". Note the double \.
Next we iterate through all matches, getting group 1, which is our number. Note that the ^ character is not part of group 1 even though it is part of the overall match. That is because we used parentheses to mark only the part we are looking for, which in our case is the digits.

That's because you are using the split method, which looks for occurrences of the regex and splits the string at those positions. Your string starts with X^, which matches your regex, so split returns the empty string in front of that first match as the first element.
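If you would rather keep split, one option is to filter out the empty element afterwards. This is only a sketch: the class and method names (Exponents, exponents) are hypothetical, and the regex is the one from the question.

```java
import java.util.Arrays;

class Exponents {
    // Splits with the question's regex, then drops empty strings such as
    // the one produced by the "X^" match at the very start of the input.
    static String[] exponents(String polynomial) {
        return Arrays.stream(polynomial.split("(|\\+\\d)[xX]\\^"))
                .filter(part -> !part.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(exponents("X^5+2X^2+3X^3+4X^4")));
    }
}
```

The leading empty string appears because the first match has a positive width ("X^") and sits at index 0, so the text in front of it is empty.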

Regular expression for splitting a String while preserving whitespace

I am doing an Android project which needs to split a String into tokens while preserving whitespace, and without splitting at non-word characters like #, & etc.
Using \b splits at any non-word character, so I need a way to split the string in the following way.
Input: (. indicates whitespace)
A.A#..A##
Desired output:
A
.
A#
..
A##
So these 5 lines are the 5 values I would like in an array or similar. That means the 4th element of the result-array contains 2 spaces.

I think this is what you want:
(?<=\S)(?=\s)|(?<=\s)(?=\S)
Basically I'm saying "if the previous character is a non-space and the next is a space or if the previous is a space and the next is a non-space, then split".
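As a rough illustration of how that pattern behaves with String.split, here is a small sketch. The class and method names (WhitespaceTokens, tokenize) are invented for the example, and the input uses real spaces instead of the dots from the question.

```java
import java.util.Arrays;

class WhitespaceTokens {
    // Splits at every boundary between a non-whitespace and a whitespace
    // character (in either direction). Both alternatives are zero-width,
    // so no characters are removed from the input.
    static String[] tokenize(String text) {
        return text.split("(?<=\\S)(?=\\s)|(?<=\\s)(?=\\S)");
    }

    public static void main(String[] args) {
        // Prints the tokens in brackets so the whitespace-only ones stay visible.
        for (String token : tokenize("A A#  A##")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Runs of whitespace stay together as one token because there is no boundary between two consecutive whitespace characters.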

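Use StringTokenizer:
import java.util.StringTokenizer;
StringTokenizer st = new StringTokenizer("A.A#..A##", "."); // first argument is the string you want to split, the second is the set of delimiter characters (here . stands for whitespace)
while (st.hasMoreTokens())
    System.out.println(st.nextToken());
The output will be (the whitespace itself is not returned):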
A
A#
A##

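Try:
String s = "A.A#..A##";
// Strings are immutable, so assign the result; \\.+ collapses any run of dots into a single one
s = s.replaceAll("\\.+", ".");
// The dot must be escaped: an unescaped "." is a regex that matches any character
String[] out = s.split("\\.");
It should give you an array with the tokens A, A# and A## (the whitespace itself is dropped).
Don't forget to replace the "." with actual spaces :)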

Java split by alphabetic char creates an empty value in array

I want to split my string on every occurrence of an alphabetic character.
For example:
"s1l1e13" to an array of: ["s1","l1","e13"]
When trying to use this simple split by regex I get some weird results:
testStr = "s1l1e13"
Arrays.toString(testStr.split("(?=[a-z])"))
gives me the array:
["","s1","l1","e13"]
How can I do the split without the empty array element?
I tried a couple more things:
testStr = "s1"
Arrays.toString(testStr.split("(?=[a-z])"))
does return the correct array: ["s1"]
But when trying to use substring
testStr = "s1l1e13"
Arrays.toString(testStr.substring(1).split("(?=[a-z])"))
I get ["1","l1","e13"] in return.
What am I missing?

Your look-ahead matches at each position before a character from a to z, marking the following positions:
s1 l1 e13
^ ^ ^
So by splitting using just the look-ahead, it returns ["", "s1", "l1", "e13"].
You can add a negative look-behind here. It looks behind to check that the position is not at the beginning of the string.
String s = "s1l1e13";
String[] parts = s.split("(?<!\\A)(?=[a-z])");
System.out.println(Arrays.toString(parts)); //=> [s1, l1, e13]

Your problem is that (?=[a-z]) means "a position right before [a-z]", and in your text
s1l1e13
you have 3 such positions. I will mark them with |
|s1|l1|e13
so split (unfortunately correctly) produces "" "s1" "l1" "e13" and doesn't automatically remove the leading empty element for you.
To solve this problem you have at least two options:
make sure that there is something before the position you split on (so it is not at the start of your string). You can use for instance (?<=\\d)(?=[a-z]) if you want to split after a digit but before a letter
(PREFERRED SOLUTION) start using Java 8, where split no longer produces an empty leading string when the match at the start of the input is zero-length (look-arounds are zero-length).
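The first option can be sketched as follows. The class and method names (LetterSplit, splitBeforeLetters) are hypothetical; the pattern is the one suggested above.

```java
import java.util.Arrays;

class LetterSplit {
    // Splits only at positions that have a digit behind them and a
    // lowercase letter in front of them, so the start of the string
    // can never be a split position.
    static String[] splitBeforeLetters(String text) {
        return text.split("(?<=\\d)(?=[a-z])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitBeforeLetters("s1l1e13")));
    }
}
```

Since the look-behind requires a digit, this version should behave the same on Java versions before and after 8.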

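The first match produces "" because the regex only looks ahead for an alphabetic character. This is called a zero-width look-ahead: it matches a position without consuming any characters. The "s" at the beginning is alphabetic, so the position in front of it (the very start of the string) matches too.
If you want the regex to always consume something you could use ".+(?=[a-z])", but as a split delimiter that pattern also swallows the text it matches (for "s1l1e13" it consumes "s1l1"), so it does not give the desired array.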

The problem is that the initial "s" counts as an alphabetic character, so the regex also splits at the s.
Since there is nothing before the s, split represents that nothing as an empty string in the first element. This does not happen at the end of the string, because split discards trailing empty strings by default.
If every string you split starts with a letter, just drop the first element of the array. Otherwise, you'll probably need to loop through each array as you make it so that you can drop empty elements.

So it seems your matches have the pattern x###, where x is a letter and ### is one or more digits.
I'd use the following regex and collect each match with a Matcher instead of splitting:
([a-z][0-9]+)
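A minimal sketch of that matching approach, assuming lowercase letters as in the question. The class and method names (TokenCollector, tokens) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenCollector {
    // One lowercase letter followed by one or more digits.
    private static final Pattern TOKEN = Pattern.compile("[a-z][0-9]+");

    // Collects every match in order instead of splitting between them,
    // so no empty elements can appear in the result.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("s1l1e13"));
    }
}
```

Unlike split, this silently skips any text that does not fit the letter-plus-digits shape, which may or may not be what you want.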

Removing Certain Characters inside a String, Java

My problem is that I want to remove a character in some parts of a String, but I do not know how to restrict the removal.
Example:
A computer is a general purpose device that can be\n
programmed to carry out a finite set of\n
millions to billions of times more capable.\n
\n
In this era mechanical analog computers were used\n
for military applications.\n
1.1 Limited-function early computers\n
1.2 First general-purpose computers\n
1.3 Stored-program architecture\n
1.4 Semiconductors and\n
This example is the content of my string. What I want is to remove the \n at the end of lines 1 and 2 above, but not the \n from line 5 onwards. How do I remove those \n without removing the others? My goal is to turn the wrapped lines into paragraphs without a \n after each line. In the example, the first 3 lines form one paragraph and the later lines are in bullet form. In other words, I do not want to remove the \n after bulleted lines.
The real content of the string is dynamic.
I have tried String.replaceAll("\n", " "), but clearly that does not work since it removes every \n. I have thought of using a regex that checks for alphanumeric characters, but it would remove some letters after the \n.

Try using this regex:
str = str.replaceAll("(.+)(?<!\\.)\n(?!\\d)", "$1 ");
System.out.println(str);
This replaces a \n only if it is not preceded by a dot (the end of a paragraph) and not followed by a digit (the start of a bullet point). For example, the \n after your first bullet point is followed by 1.2, so it is not replaced.
The (.+) at the start ensures that you are not replacing the \n of a blank line.
This will work for the string you have shown.
Explanation:
(.+) -> A capture group, capturing any characters (except line breaks), at least one.
(?<!\\.) -> This is called a negative look-behind. It lets the pattern that follows it match only if it is not preceded by a dot (.).
E.g. you don't want to replace the \n after the line: millions to billions of times more capable.\n
(?!\\d) -> This is called a negative look-ahead. It lets the pattern before it match only if it is not followed by a digit (\\d).
E.g. in your bulleted points, computers\n is followed by 1.2, where 1 is a digit. So you don't want to replace that \n.
Now, $1 refers to the first group captured by the pattern, (.+). Since the match includes the text before the \n, we put that text back unchanged with $1 and replace only the "\n" with a space.
Note that look-ahead and look-behind are zero-width assertions: they do not capture or consume any text.
For more details, follow these links:
http://docs.oracle.com/javase/tutorial/essential/regex/
http://docs.oracle.com/javase/tutorial/essential/regex/quant.html
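To see the effect on a small input, here is a sketch that applies the same replaceAll call to a shortened version of the example text. The class and method names (ParagraphJoiner, joinWrappedLines) and the sample text are assumptions made for illustration.

```java
class ParagraphJoiner {
    // Joins a line with the next one unless the line ends with a dot,
    // the next line starts with a digit, or the line is blank.
    static String joinWrappedLines(String text) {
        return text.replaceAll("(.+)(?<!\\.)\n(?!\\d)", "$1 ");
    }

    public static void main(String[] args) {
        String text = "A computer is a device that can be\n"
                + "programmed to carry out operations.\n"
                + "\n"
                + "1.1 Early computers\n"
                + "1.2 Stored-program architecture";
        System.out.println(joinWrappedLines(text));
    }
}
```

The heuristic is only as good as its two assumptions: a wrapped line that happens to end with a dot, or a continuation line that starts with a digit, will not be joined. A trailing \n at the very end of the text is also replaced by a space when the last line does not end with a dot.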

I suspect your requirement is just to remove the \n of lines 1 and 2.
What you can do is as below:
Split your string into segments:
String[] array = yourString.split("\n");
Then concatenate the segments, putting a space instead of "\n" after lines 1 and 2 and keeping "\n" after the others:
array[0] + " " + array[1] + " " + array[2] + "\n" + array[3] + "\n" ... // and so forth

Splitting a Java String with '.'

I have
1. This is a test message
I want to print
This is a test message
I am trying
String delimiter=".";
String[] parts = line.split(delimiter);
int gg=parts.length;
Then I want to print the array:
for (int k ;k <gg;K++)
parts[k];
But my gg is always 0.
Am I missing anything?
All I need is to remove the number, the . and the white space.
The number can be a 1-digit or a 5-digit number.

You are using "." as a delimiter; you need to escape the . char to remove its special meaning.
The . char in a regex means "any character", so your split is splitting on every character, which is obviously not what you are after.
Use "\\." as the delimiter.
For more information on predefined character classes you can have a look at the tutorial.
For more information on regex in general (including the above) you can try this tutorial.
EDIT:
P.S. What you are after (removing the number) can be achieved with a one-liner, using the String.replaceAll() method.
System.out.println(line.replaceAll("[0-9]+\\.\\s+", ""));
will print
This is a test message
for your input example.
The idea is: [0-9] is any digit, and the + indicates there is at least one of them. The \\. is a literal dot (escaped as mentioned above) and the \\s+ is at least one whitespace character.
It is all replaced with an empty string.
Note however that for strings like "1. this is a 2. test" it will produce "this is a test", removing the "2. " as well, so think carefully about whether that is indeed what you are after.
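If only the number at the start of the line should go, one possible variation is to anchor the pattern with ^ and use replaceFirst. This is a sketch, not part of the answer above; the class and method names (PrefixStripper, stripNumberPrefix) are hypothetical. It also shows the difference between an unescaped and a quoted dot in split.

```java
import java.util.regex.Pattern;

class PrefixStripper {
    // Removes a leading "<digits>." plus any whitespace after it.
    // The ^ anchor keeps later occurrences such as "2. " untouched.
    static String stripNumberPrefix(String line) {
        return line.replaceFirst("^\\d+\\.\\s*", "");
    }

    public static void main(String[] args) {
        String line = "1. This is a test message";
        // Unescaped dot: every character is a delimiter, so nothing is left.
        System.out.println(line.split(".").length);
        // Pattern.quote turns the dot into a literal, like writing "\\.".
        System.out.println(line.split(Pattern.quote(".")).length);
        System.out.println(stripNumberPrefix(line));
    }
}
```

Pattern.quote is handy when the delimiter comes from a variable and you do not want to escape it by hand.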

Use the following code:
String delimiter = "\\."; // escape the dot, because an unescaped . matches any character
String[] parts = line.split(delimiter);
And for your for loop:
for (int k = 0; k < parts.length; k++)
    System.out.println(parts[k]);

Try this:
String delimiter = "\\.";
"." has a special meaning in a regular expression.

If you are just trying to remove the number prefix, you can do it in one line without a regex. I'm not sure if you actually want to split on multiple dots.
String s = "1. with single digit";
String s2 = "999. with multiple digits";
String s3 = "999. with multiple digits . and . dots";
assertEquals("with single digit", (s.substring(s.indexOf(".") + 1).trim()));
assertEquals("with multiple digits", (s2.substring(s2.indexOf(".") + 1).trim()));
assertEquals("with multiple digits . and . dots", (s3.substring(s3.indexOf(".") + 1).trim()));

