Split a String into Pieces [duplicate] - java

This question already has answers here:
Split string to equal length substrings in Java
(23 answers)
Closed 6 years ago.
I'm trying to split a string into parts, each containing x characters. How can I go about doing this?
EDIT: I've managed to figure it out thanks to @amith below; however, I'm unsure how to make it not split words. Any ideas?
Thanks,
- Exporting.

List<String> splitString(int interval, String str) {
int length = str.length();
List<String> split = new ArrayList<>();
if(length < interval) {
split.add(str);
} else {
for(int i=0;i<length;i+=interval) {
int endIndex = i + interval;
if(endIndex > length) {
endIndex = length;
}
String substring = str.substring(i, endIndex);
split.add(substring);
}
}
return split;
}
This is sample code that splits a string at regular intervals (it needs java.util.List and java.util.ArrayList).
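The edit in the question asks how to avoid cutting words in half. One possible approach, shown here only as a rough sketch, is to pack whole words into pieces of at most a given length. The class name WordSplitter and the method splitOnWords are made up for this illustration, and the sketch assumes that runs of whitespace may be collapsed to a single space and that a word longer than the limit may still be cut.

```java
import java.util.ArrayList;
import java.util.List;

class WordSplitter {
    // Hypothetical helper: packs whole words into pieces of at most maxLen characters.
    static List<String> splitOnWords(String str, int maxLen) {
        if (maxLen < 1) {
            throw new IllegalArgumentException("maxLen must be at least 1");
        }
        List<String> pieces = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : str.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for blank input
            }
            // Flush the current piece if adding a space plus the word would overflow it.
            if (current.length() > 0 && current.length() + 1 + word.length() > maxLen) {
                pieces.add(current.toString());
                current.setLength(0);
            }
            // A single word longer than maxLen has to be cut anyway.
            while (word.length() > maxLen) {
                pieces.add(word.substring(0, maxLen));
                word = word.substring(maxLen);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            pieces.add(current.toString());
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(splitOnWords("The quick brown fox jumps", 10));
        // prints [The quick, brown fox, jumps]
    }
}
```

Pieces produced this way can be shorter than the limit, which is the price of keeping words intact.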

If the result should be an array of char, try this:
public static char[] split(String str, int numChar) {
char[] s = str.toCharArray();
if (numChar >= str.length()) {
return s;
}
char[] r = new char[numChar];
System.arraycopy(s, 0, r, 0, numChar);
return r;
}
Or, if the result should be a String, try this:
public static String split(String str, int numChar) {
char[] s = str.toCharArray();
if (numChar >= str.length()) {
return String.copyValueOf(s);
}
char[] r = new char[numChar];
System.arraycopy(s, 0, r, 0, numChar);
return String.copyValueOf(r);
}
Note that both methods return only the first numChar characters, and neither changes the original String.

Related

Split a String after every n characters ignoring whitespaces in java store it in arraylist

I have a string that I want to split after every n characters, storing the pieces in an array of strings. The count should ignore all whitespace.
For example, given the following string,
String str = "This is a String which needs to be splitted after every 10 characters";
The output should be,
["This is a Str", "ing which nee", "ds to be split", "ted after ev", "ery 10 chara", "cters"]
(Edit) --> I am using the function below. How can I store the pieces in an array of Strings?
As the expected output shows, whitespace does not count towards the n characters. Is there any way to do this in Java?
public static String test(int cnt, String string) {
AtomicInteger n = new AtomicInteger(cnt);
return string
.chars()
.boxed()
.peek(value -> {
if (!Character.isWhitespace(value)) {
n.decrementAndGet();
}
})
.takeWhile(value -> n.get() >= 0)
.map(Character::toString)
.collect(Collectors.joining());
}
I have used a standard approach with looping through the string and counting chars:
public static void main(String[] args) {
String str = "This is a String which needs to be splitted after every 10 characters";
System.out.println(split(str, 10));
}
public static List<String> split(String string, int splitAfter) {
List<String> result = new ArrayList<String>();
int startIndex = 0;
int charCount = 0;
for (int i = 0; i < string.length(); i++) {
if (charCount == splitAfter) {
result.add(string.substring(startIndex, i));
startIndex = i;
charCount = 0;
}
// only count non-whitespace characters (spaces, tabs, line breaks are skipped)
if (!Character.isWhitespace(string.charAt(i))) {
charCount++;
}
}
// if startIndex is still inside the string, add the remainder (it may hold fewer than splitAfter characters)
if (startIndex < string.length()) {
result.add(string.substring(startIndex));
}
return result;
}
The result differs slightly from what you posted, but your expected result doesn't quite match your description anyway:
[This is a Str, ing which ne, eds to be spl, itted after, every 10 cha, racters]
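The question also asks how to get an array of Strings rather than a List. A small sketch of that conversion follows; the class name ListToArray and its helper method are placeholders for illustration, while List.toArray itself is standard Java.

```java
import java.util.Arrays;
import java.util.List;

class ListToArray {
    // Copies the list elements into a new String[] of matching size.
    static String[] toArray(List<String> parts) {
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("This is a Str", "ing which ne");
        String[] array = toArray(parts);
        System.out.println(array.length);             // prints 2
        System.out.println(Arrays.toString(array));   // prints [This is a Str, ing which ne]
    }
}
```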

Trying to iterate through a String and find if char is a letter or digit then append it to a different String [duplicate]

This question already has answers here:
What is the best way to tell if a character is a letter or number in Java without using regexes?
(9 answers)
Closed 3 years ago.
I need to iterate through the String userInput and check whether each char is a letter or a digit. If it is, I need to append that char to the String endProduct.
public static String converter(String userInput) {
String endProduct = "";
char c = userInput.charAt(0);
Stack<Character> stack = new Stack<Character>();
int len = userInput.length();
//iterates through the word to find symbols and letters, if letter or digit it appends to endProduct, if symbol it pushes onto stack
for (int i = c; i < len; i++) {
if (Character.isLetter(userInput.charAt(i))) {
endProduct = endProduct + c;
System.out.println(c);
}//end if
else if(Character.isDigit(userInput.charAt(i))){
endProduct = endProduct + c;
System.out.println(c);
}
Here are some ways to accomplish this.
Method 1 - traditional Java.
private static String converter(String userInput) {
final StringBuilder endProduct = new StringBuilder();
for(char ch : userInput.toCharArray()) {
if(Character.isLetterOrDigit(ch)) endProduct.append(ch);
}
return endProduct.toString();
}
Method 2 - Streams.
private static String converter(String userInput) {
int[] chars = userInput.codePoints().filter(Character::isLetterOrDigit).toArray();
return new String(chars, 0, chars.length);
}
or
private static String converter(String userInput) {
return userInput.codePoints().filter(Character::isLetterOrDigit).collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append).toString();
}

Split a string at every 4-th character?

I have a string which I have to split into substrings of equal length if possible. I have found this solution, which gives substrings of equal length only if the string length is a multiple of 4.
String myString = "abcdefghijklm";
String[] split = myString.split("(?<=\\G....)");
This will produce:
[abcd, efgh, ijkl, m]
What I need is to split "from the end of the string". My desired output should look like this:
[a, bcde, fghi, jklm]
How do I achieve this?
This ought to do it:
String[] split = myString.split("(?=(....)+$)");
// or
String[] split = myString.split("(?=(.{4})+$)");
What it does is this: split on the empty string only if that empty string has a multiple of 4 chars ahead of it until the end-of-input is reached.
Of course, this has a bad runtime (O(n^2)). You can get a linear running time algorithm by simply splitting it yourself.
As mentioned by @anubhava:
Use (?!^)(?=(?:.{4})+$) to avoid an empty first result when the string length is a multiple of 4.
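To see the lookahead in action, here is a short sketch that runs the guarded pattern on both kinds of input. The class and method names are invented for the example. As far as I know, String.split on Java 8 and later already drops the empty leading element for a zero-width match at the start, so the (?!^) guard mainly matters on older runtimes.

```java
import java.util.Arrays;

class RegexSplitFromEnd {
    // Splits wherever the remaining input is a non-zero multiple of 4 characters,
    // except at the very start of the string.
    static String[] splitFromEnd(String s) {
        return s.split("(?!^)(?=(?:.{4})+$)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitFromEnd("abcdefghijklm"))); // [a, bcde, fghi, jklm]
        System.out.println(Arrays.toString(splitFromEnd("abcdefgh")));      // [abcd, efgh]
    }
}
```

Note that . does not match line terminators by default, so input containing line breaks would need the DOTALL flag, (?s), in front of the pattern.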
A regex is really unnecessary for this. I also don't think this is a good problem for recursion. The following solution walks the string once from the end (it needs java.util.ArrayList).
public static String[] splitIt(String input, int splitLength){
int inputLength = input.length();
ArrayList<String> arrayList = new ArrayList<>();
int i = inputLength;
while(i > 0){
int beginIndex = i - splitLength > 0 ? i - splitLength : 0;
arrayList.add(0, input.substring(beginIndex, i));
i -= splitLength;
}
return arrayList.toArray(new String[0]);
}
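One caveat with the loop above: ArrayList.add(0, ...) shifts the existing elements on every insert, so the list handling is not strictly linear in the number of chunks. A possible variant, sketched here with invented names (SplitFromEnd, splitFromEnd), works out the length of the short first chunk up front and then fills an array front to back.

```java
import java.util.Arrays;

class SplitFromEnd {
    // Splits input into chunks of the given size; only the first chunk may be shorter.
    static String[] splitFromEnd(String input, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be at least 1");
        }
        int n = input.length();
        int remainder = n % size;
        int count = n / size + (remainder == 0 ? 0 : 1); // number of chunks, rounded up
        String[] parts = new String[count];
        int begin = 0;
        // The first chunk takes the leftover characters, or a full chunk if there are none.
        int end = Math.min(remainder == 0 ? size : remainder, n);
        for (int k = 0; k < count; k++) {
            parts[k] = input.substring(begin, end);
            begin = end;
            end = Math.min(end + size, n);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitFromEnd("abcdefghijklm", 4))); // [a, bcde, fghi, jklm]
    }
}
```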
No need to use a regular expression. Instead, you can recursively build a list of head strings and return the tail.
import java.util.*;
public class StringChunker {
public static void main(String[] args) {
String str = "abcdefghijklm";
System.out.println(Arrays.toString(chunk(str, 4))); // [abcd, efgh, ijkl, m]
System.out.println(Arrays.toString(chunk(str, 4, true))); // [a, bcde, fghi, jklm]
}
public static String[] chunk(String str, int size) throws IllegalArgumentException {
return chunk(str, size, false);
}
public static String[] chunk(String str, int size, boolean reverse) throws IllegalArgumentException {
return chunk(str, size, reverse, new ArrayList<String>());
}
private static String[] chunk(String str, int size, boolean reverse, List<String> chunks) throws IllegalArgumentException {
if (size < 1) {
throw new IllegalArgumentException("size must be greater than 0");
}
if (str.length() <= size) { // base case; <= avoids a trailing empty chunk when the length is a multiple of size
if (reverse) {
chunks.add(0, str); // Reverse adds to the front of the list
} else {
chunks.add(str); // Add to the end of the list
}
return chunks.toArray(new String[chunks.size()]); // Convert to an array
} else {
String head, tail;
if (reverse) {
head = str.substring(str.length() - size, str.length());
tail = str.substring(0, str.length() - size);
chunks.add(0, head);
} else {
head = str.substring(0, size);
tail = str.substring(size);
chunks.add(head);
}
return chunk(tail, size, reverse, chunks);
}
}
}

Removing supplementary characters from a Java string [duplicate]

This question already has answers here:
What is the regex to extract all the emojis from a string?
(18 answers)
Closed 5 years ago.
I have a Java string that contains supplementary characters (characters in the Unicode standard whose code points are above U+FFFF). These characters could for example be emojis. I want to remove those characters from the string, i.e. replace them with the empty string "".
How do I remove supplementary characters from a string?
How do I remove characters from an arbitrary code point range (for example, all characters in the range U+1F000–U+1FFFF)?
There are a couple of approaches. As regex replace is expensive, maybe do:
String basic(String s) {
StringBuilder sb = new StringBuilder();
for (char ch : s.toCharArray()) {
if (!Character.isLowSurrogate(ch) && !Character.isHighSurrogate(ch)) {
sb.append(ch);
}
}
return sb.length() == s.length() ? s : sb.toString();
}
You can get a char's numeric value by simply converting it to an int (for supplementary characters, use String.codePointAt() instead, since a single char holds only one half of a surrogate pair).
Therefore, you'll want to do the following:
1. Convert your String to a char[], or loop over the String using String.charAt().
2. Check whether the Unicode value is one you want to remove.
3. If so, replace the character with "".
This is just to start you off; if you're still struggling, I can try to type out a whole example.
Good luck!
Here is a code snippet that collects characters between code point 60 and 100:
public class Test {
public static void main(String[] args) {
new Test().go();
}
private void go() {
String s = "ABC12三○";
String ret = "";
for (int i = 0; i < s.length(); i++) {
int cp = s.codePointAt(i);
System.out.println("code point: " + cp);
// keep only characters whose code point lies strictly between 60 and 100
if (cp > 60 && cp < 100) {
ret += s.substring(i, i+1);
}
}
System.out.println("result: " + ret);
}
}
the result:
code point: 65
code point: 66
code point: 67
code point: 49
code point: 50
code point: 19977
code point: 65518
result: ABC
Hope this helps.
Java strings are UTF-16 encoded. The String type has a codePointAt() method for retrieving a decoded codepoint at a given char (codeunit) index.
So, you can do something like this, for instance:
String removeSupplementaryChars(String s)
{
int len = s.length();
if (len == 0)
return "";
StringBuilder sb = new StringBuilder(len);
int i = 0;
do
{
if (s.codePointAt(i) <= 0xFFFF)
sb.append(s.charAt(i));
i = s.offsetByCodePoints(i, 1);
}
while (i < len);
return sb.toString();
}
Or this:
String removeCodepointsinRange(String s, int lower, int upper)
{
int len = s.length();
if (len == 0)
return "";
StringBuilder sb = new StringBuilder(len);
int i = 0;
do
{
int cp = s.codePointAt(i);
if ((cp < lower) || (cp > upper))
sb.appendCodePoint(cp);
i = s.offsetByCodePoints(i, 1);
}
while (i < len);
return sb.toString();
}
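The same filtering can also be written with String.codePoints(), which walks the string one code point at a time and so never sees half of a surrogate pair. This is only a sketch; the class name SupplementaryFilter and both method names are invented here, while codePoints(), Character.isSupplementaryCodePoint() and StringBuilder.appendCodePoint() are standard Java 8+ API.

```java
class SupplementaryFilter {
    // Drops every code point above U+FFFF.
    static String removeSupplementary(String s) {
        return s.codePoints()
                .filter(cp -> !Character.isSupplementaryCodePoint(cp))
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
                .toString();
    }

    // Drops every code point in the inclusive range [lower, upper].
    static String removeRange(String s, int lower, int upper) {
        return s.codePoints()
                .filter(cp -> cp < lower || cp > upper)
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        String emoji = new String(Character.toChars(0x1F600)); // one supplementary character
        System.out.println(removeSupplementary("a" + emoji + "b"));      // prints ab
        System.out.println(removeRange("a" + emoji + "b", 0x1F000, 0x1FFFF)); // prints ab
    }
}
```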

Break down a long string into smaller string of given length [duplicate]

This question already has answers here:
What's the simplest way to print a Java array?
(37 answers)
Closed 7 years ago.
I am trying to break a long string into smaller strings of a given length x and return them as an array. But I can't print the result; the output is [Ljava.lang.String;@6d06d69c
Please take a look at my code and tell me what I am doing wrong. Thanks so much!
public static String[] splitByNumber(String str, int num) {
int inLength = str.length();
int arrayLength = inLength / num;
int left=inLength%num;
if(left>0){++arrayLength;}
String ar[] = new String[arrayLength];
String tempText=str;
for (int x = 0; x < arrayLength; ++x) {
if(tempText.length()>num){
ar[x]=tempText.substring(0, num);
tempText=tempText.substring(num);
}else{
ar[x]=tempText;
}
}
return ar;
}
public static void main(String[] args) {
String[] str = splitByNumber("This is a test", 4);
System.out.println(str);
}
You're printing the array itself. You want to print the elements.
String[] str = splitByNumber("This is a test", 4);
for (String s : str) {
System.out.println(s);
}
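If a one-line printout is enough, java.util.Arrays.toString does the loop for you, and String.join lets you pick the separator. A small sketch follows; the class name PrintArray and the render helper are placeholders, and the array holds the pieces that splitByNumber("This is a test", 4) would return.

```java
import java.util.Arrays;

class PrintArray {
    // Formats the array as [elem1, elem2, ...] instead of the default type@hash text.
    static String render(String[] parts) {
        return Arrays.toString(parts);
    }

    public static void main(String[] args) {
        String[] parts = {"This", " is ", "a te", "st"};
        System.out.println(render(parts));            // prints [This,  is , a te, st]
        System.out.println(String.join("|", parts));  // prints This| is |a te|st
    }
}
```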
