Converting arraylist to string - java

I have a multi-line string that is taken as user input. I split the string into an ArrayList with str.split("\\s ") and changed a particular word wherever it occurred. Now I want to merge the words in the ArrayList, including the replaced word, back into a string that keeps the original multi-line layout. I can't work out how to do this. Please help.

Using only standard Java (assuming your ArrayList is called words):
StringBuilder sb = new StringBuilder();
for (String current : words)
    sb.append(current).append(" ");
String s = sb.toString().trim();
If you have the Guava library you can use Joiner:
String s = Joiner.on(" ").join(words);
Both of these will work even if the type of words is String[].
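On Java 8 or later, the standard library's String.join is another option and needs no explicit loop. The following is only a minimal sketch; the class name JoinWords, the helper joinWithSpaces and the sample words are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinWords {
    // Joins the words with single spaces; String.join exists since Java 8.
    static String joinWithSpaces(List<String> words) {
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("one", "two", "three"));
        words.set(1, "TWO"); // replace a word in place, as described in the question
        System.out.println(joinWithSpaces(words)); // prints: one TWO three
    }
}
```

String.join also has a varargs overload, so it accepts a String[] as well.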
If you want to preserve the line structure, I suggest the following approach: first break the input string into lines using .split("\n"). Then split each line into words using .split("\\s"). Here's what the code should look like:
public String convert(String input, String wordToReplace, String replacement) {
    StringBuilder result = new StringBuilder();
    String[] lines = input.split("\n");
    for (String line : lines) {
        boolean isFirst = true;
        for (String current : line.split("\\s")) {
            if (!isFirst)
                result.append(" ");
            isFirst = false;
            if (current.equals(wordToReplace))
                current = replacement;
            result.append(current);
        }
        result.append("\n");
    }
    return result.toString().trim();
}

Related

How to replace any of the substrings in a string with empty substring "" (remove substring) in java

I want to allow only few substrings(allowed words) in a string. I want to remove the other substrings.
So I want to replace all the words except few words like "abc" , "def" and "ghi" etc.
I want something like this.
str.replaceAll("^[abc],"").replaceAll("^[def],"").......... (Not correct syntax)
Input: String: "abcxyzorkdefa" allowed words: {"abc","def"}
Output: "abcdef";
How to achieve this?
Thanks in Advance.
This is a more C-like approach, but it uses Java's String.startsWith for matching patterns. The method walks along the provided string, saving matched patterns to the result string in the order they are found.
You just need to make sure that any longer patterns that contain smaller patterns come at the front of the patterns array (so "abcd" comes before "abc").
import java.util.Arrays;
import java.util.List;

class RemoveNegated {
    public static String removeAllNegated(String s, List<String> list) {
        StringBuilder result = new StringBuilder();
        // Move along the string from the front
        while (s.length() > 0) {
            boolean match = false;
            // Try matching a pattern
            for (String p : list) {
                // If the pattern is matched (ignoring case)
                if (s.toLowerCase().startsWith(p.toLowerCase())) {
                    // Save it
                    result.append(p);
                    // Move along the string
                    s = s.substring(p.length());
                    // Signal a match
                    match = true;
                    break;
                }
            }
            // If there was no match, move along the string
            if (!match) {
                s = s.substring(1);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String s = "abcxyzorkdefaef";
        s = removeAllNegated(s, Arrays.asList("abc", "def", "ghi"));
        System.out.println(s);
    }
}
Prints: abcdef
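The same idea can also be expressed with a regular expression. This is a different technique from the startsWith loop above, shown only as a sketch: the class name KeepAllowed and the method keepAllowed are hypothetical, and unlike the loop it matches case-sensitively. It sorts the allowed words longest first itself, so the caller does not have to order them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class KeepAllowed {
    // Keeps only the occurrences of the allowed words, in the order they appear.
    static String keepAllowed(String s, List<String> allowed) {
        // Build an alternation such as \Qabc\E|\Qdef\E, longest word first,
        // so that "abcd" is tried before "abc".
        String alternation = allowed.stream()
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                .map(Pattern::quote) // treat each word literally
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile(alternation).matcher(s);
        StringBuilder result = new StringBuilder();
        while (m.find()) {
            result.append(m.group());
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepAllowed("abcxyzorkdefaef", Arrays.asList("abc", "def", "ghi"))); // prints: abcdef
    }
}
```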

Capitalization of the words in string

How can I avoid a StringIndexOutOfBoundsException when the string starts with a space (" ") or when there are several consecutive spaces in the string?
What I actually need is to capitalize the first letter of each word in the string.
My code looks like:
public static void main(String[] args) throws IOException {
    BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
    String s = reader.readLine();
    String[] array = s.split(" ");
    for (String word : array) {
        word = word.substring(0, 1).toUpperCase() + word.substring(1); //seems that here's no way to avoid extra spaces
        System.out.print(word + " ");
    }
}
Tests:
Input: "test test test"
Output: "Test Test Test"
Input: " test test test"
Output:
StringIndexOutOfBoundsException
Expected: " Test Test Test"
I'm a Java newbie and any help is very appreciated. Thanks!
split breaks the string at every place where the delimiter is found. So if you split on a space and a space is placed at the start of the string, like
" foo".split(" ")
you will get a result array containing two elements: the empty string "" and "foo"
["", "foo"]
Now when you call "".substring(0,1) or "".substring(1) you are using index 1, which doesn't belong to that string.
So before you do any String modification based on indexes, check that it is safe by testing the string's length: check that the word you are trying to modify has a length greater than 0, or use something more descriptive like if (!word.isEmpty()).
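Applied to the loop from the question, the length check could look roughly like the sketch below. The class name CapitalizeWords and the method capitalize are made up for the example; passing -1 as the limit to split is an extra choice here so that trailing spaces are kept as well.

```java
public class CapitalizeWords {
    // Capitalizes the first letter of every word and keeps all spaces where they were.
    static String capitalize(String s) {
        StringBuilder out = new StringBuilder();
        String[] words = s.split(" ", -1); // limit -1 keeps trailing empty strings
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append(' '); // put back the space that split removed
            }
            String word = words[i];
            if (!word.isEmpty()) { // empty strings come from leading or repeated spaces
                out.append(Character.toUpperCase(word.charAt(0)))
                   .append(word.substring(1));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + capitalize(" test test test") + "]"); // prints: [ Test Test Test]
    }
}
```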
A slight modification to "Capitalize first word of a sentence in a string with multiple sentences":
public static void main(String[] args) throws IOException {
    BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
    String s = reader.readLine();
    int pos = 0;
    boolean capitalize = true;
    StringBuilder sb = new StringBuilder(s);
    while (pos < sb.length()) {
        if (sb.charAt(pos) == ' ') {
            capitalize = true;
        } else if (capitalize && !Character.isWhitespace(sb.charAt(pos))) {
            sb.setCharAt(pos, Character.toUpperCase(sb.charAt(pos)));
            capitalize = false;
        }
        pos++;
    }
    System.out.println(sb.toString());
}
I would avoid using split and go with StringBuilder instead.
Instead of splitting the string, simply iterate over all characters of the original string, replacing a character with its uppercase form if it is the first character of the string or if its predecessor is a space.
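Use a regex in your split so that it splits on any run of whitespace:
String[] words = s.split("\\s+");
Note that a string with leading whitespace still yields an empty first element, so trim the input first or skip empty words.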
Easier would be to use existing libraries: WordUtils.capitalize(str) (from apache commons-lang).
To fix your current code, however, a possible solution is to use a regex with the word-character class (\\w) together with StringBuffer/StringBuilder setCharAt and Character.toUpperCase:
public static void main(String[] args) {
    String test = "test test test";
    StringBuffer sb = new StringBuffer(test);
    Pattern p = Pattern.compile("\\s+\\w"); // Matches 1 or more whitespace characters followed by 1 word character
    Matcher m = p.matcher(sb);
    // Since the sentence doesn't always start with a space, we have to capitalize the first word manually
    sb.setCharAt(0, Character.toUpperCase(sb.charAt(0)));
    while (m.find()) {
        sb.setCharAt(m.end() - 1, Character.toUpperCase(sb.charAt(m.end() - 1)));
    }
    System.out.println(sb.toString());
}
Output:
Test Test Test
Capitalize whole words in String using native Java streams
It is a really elegant solution and doesn't require third-party libraries:
String s = "HELLO, capitalized worlD! i am here! ";
CharSequence wordDelimeter = " ";
String res = Arrays.asList(s.split(wordDelimeter.toString())).stream()
    .filter(st -> !st.isEmpty())
    .map(st -> st.toLowerCase())
    .map(st -> st.substring(0, 1).toUpperCase().concat(st.substring(1)))
    .collect(Collectors.joining(wordDelimeter.toString()));
System.out.println(s);
System.out.println(res);
The output is
HELLO, capitalized worlD! i am here!
Hello, Capitalized World! I Am Here!

Using REGEX in Java for splitting a string

I have written code to split the following string
(((OPERATING_CARRIER='AB') OR (OPERATING_CARRIER='EY') OR
((OPERATING_CARRIER='VA') AND ((FLIGHT_NO=604) OR ((FLIGHT_NO=603) AND
(STOCK='9W'))))))
into following array of strings
OPERATING_CARRIER='AB'
OPERATING_CARRIER='EY'
OPERATING_CARRIER='VA'
FLIGHT_NO=604
FLIGHT_NO=603
STOCK='9W'
The code is
String sa="(((OPERATING_CARRIER='AB') OR (OPERATING_CARRIER='EY') OR ((OPERATING_CARRIER='VA') AND ((FLIGHT_NO=604) OR ((FLIGHT_NO=603) AND (STOCK='9W'))))))";
Matcher m = Pattern.compile("\\w+\\s*=\\s*(?:'[^']+'|\\d+)").matcher(sa);
//System.out.println("contains "+sa.contains("((("));
Stack<String> in_cond = new Stack<String>();
Iterator<String> iter = in_cond.iterator();
String new_sa=sa;
System.out.println("Indivisual conditions are as follows : ");
while (m.find()) {
    String aMatch = m.group();
    // add aMatch to match list...
    System.out.println(aMatch);
    in_cond.push(aMatch);
}
System.out.println("End of Indivisual conditions");
But now, in the input string, the "=" can also be "<>", "<", ">" or "LIKE", e.g.:
(((OPERATING_CARRIER<>'AB') OR (OPERATING_CARRIER LIKE'EY') OR
((OPERATING_CARRIER='VA') AND ((FLIGHT_NO<604) OR ((FLIGHT_NO>603) AND
(STOCK='9W'))))))
What changes need to be done in the regex?
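I guess there are simpler (and more readable) ways to do this :).
Use replaceAll() to replace all parentheses with an empty String. Next, split on either AND or OR. Since the split does not depend on the comparison operator, the same code also handles the "<>", "<", ">" and LIKE conditions from the question.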
public static void main(String[] args) {
    String s = "(((OPERATING_CARRIER='AB') OR (OPERATING_CARRIER='EY') OR ((OPERATING_CARRIER='VA') AND ((FLIGHT_NO=604) OR ((FLIGHT_NO=603) AND (STOCK='9W'))))))";
    String[] arr = s.replaceAll("[()]+","").split("\\s+(OR|AND)\\s+");
    for (String str : arr) {
        System.out.println(str);
    }
}
Output:
OPERATING_CARRIER='AB'
OPERATING_CARRIER='EY'
OPERATING_CARRIER='VA'
FLIGHT_NO=604
FLIGHT_NO=603
STOCK='9W'
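If you would rather keep the Matcher-based code from the question, one possible change is to replace the literal "=" in the regex with an alternation of the operators. This is only a sketch: the class name ConditionSplitter and the method conditions are hypothetical, and it assumes the operators are exactly "=", "<>", "<", ">" and LIKE, with values that are either quoted strings or integers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConditionSplitter {
    // "<>" is listed before the single-character class so it is matched as one operator.
    private static final Pattern CONDITION =
            Pattern.compile("\\w+\\s*(?:<>|[=<>]|LIKE)\\s*(?:'[^']+'|\\d+)");

    // Returns every "name operator value" condition found in the expression.
    static List<String> conditions(String expression) {
        List<String> found = new ArrayList<>();
        Matcher m = CONDITION.matcher(expression);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String sa = "(((OPERATING_CARRIER<>'AB') OR (OPERATING_CARRIER LIKE'EY') OR "
                + "((OPERATING_CARRIER='VA') AND ((FLIGHT_NO<604) OR ((FLIGHT_NO>603) AND "
                + "(STOCK='9W'))))))";
        for (String condition : conditions(sa)) {
            System.out.println(condition);
        }
    }
}
```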

Extract the first letter from each word in a sentence

I have developed a speech-to-text program where the user can speak a short sentence, which is then inserted into a text box.
How do I extract the first letters of each word and then insert that into the text field?
For example if the user says: "Hello World". I want to insert HW into the text box.
If you have a string, you could just split it using
input.split(" ") // splitting by space
// (maybe you want to replace dots, etc. with nothing first)
Then iterate over the array:
for (String s : input.split(" "))
And then collect the first letter of every string in a list/array/etc. or append it to a string:
// Outside the for-loop:
String firstLetters = "";
// Inside the for-loop:
firstLetters += s.charAt(0);
The resulting function:
public String getFirstLetters(String text)
{
    String firstLetters = "";
    text = text.replaceAll("[.,]", ""); // Replace dots, etc (optional)
    for (String s : text.split(" "))
    {
        firstLetters += s.charAt(0);
    }
    return firstLetters;
}
The resulting function if you want to use a list (ArrayList matches):
Basically you just use an array/list/etc as argument type and instead of text.split(" ") you just use the argument. Also, remove the line where you would replace characters like dots, etc.
public String getFirstLetters(ArrayList<String> text)
{
    String firstLetters = "";
    for (String s : text)
    {
        firstLetters += s.charAt(0);
    }
    return firstLetters;
}
Use split to get an array of the separate words; then you can get the first N characters of each with substring(0, N).
Assuming the sentence contains only a-z and A-Z, with single spaces separating the words, I suggest the method below if you want an efficient way to do it.
public String getResult(String input) {
    StringBuilder sb = new StringBuilder();
    for (String s : input.split(" ")) {
        sb.append(s.charAt(0));
    }
    return sb.toString();
}
Then write it to the text field.
jTextField.setText(getResult(input_String));
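The versions above call charAt(0) on every piece, which throws a StringIndexOutOfBoundsException if the input contains two spaces in a row (split(" ") then produces an empty string). A slightly more defensive variant, under the assumption that words are separated by any run of whitespace, might look like this; the class name Initials and the method firstLetters are made up for the sketch.

```java
public class Initials {
    // Collects the first character of every whitespace-separated word.
    static String firstLetters(String text) {
        StringBuilder sb = new StringBuilder();
        // trim() drops leading/trailing whitespace; \\s+ treats repeated spaces as one separator
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) { // an empty or all-blank input still yields one empty string
                sb.append(word.charAt(0));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstLetters("Hello World")); // prints: HW
    }
}
```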
You would want to get the string, split it into an array, and loop through it:
String[] old = myTextView.getText().toString().split(" ");
String add = "";
for (String s : old)
    add += s.charAt(0);
myTextView.setText(add);

'Enter' detection for text mining

I'm working on a text mining project. My program also needs to split tokens where the user pressed Enter in the document (or at <br /> in HTML). Right now it can only detect spaces. How can I do this?
This is my code:
private ArrayList tokenize(String inp) {
    ArrayList<String> out = new ArrayList<String>();
    String[] split = inp.split(" ");
    for (int i = 0; i < split.length; i++) {
        if (!split[i].isEmpty()) {
            out.add(split[i]);
        }
    }
    return out;
}
You could also use a simple regular expression to do what you want:
String input = "Line of text \nAnother line<br><br><br />html<br />line";
String [] parts = input.split("\\s+|(<br>|<br\\s*/>)+");
System.out.println(Arrays.asList(parts));
It also treats several whitespace characters or line breaks in a row as a single separator. Regular expressions work really well for this kind of task.
Output:
[Line, of, text, Another, line, html, line]
Explanation: \s is short for any whitespace character (space, tab, newline). \s+ means one or more whitespace characters. <br>|<br\\s*/> means <br>, <br/>, or <br /> with any amount of whitespace before the slash. They are in a group, (<br>|<br\\s*/>), so we can use + to match one or more of them.
The whole stuff together: one or more whitespace characters or one or more of the different versions of <br>.
So your tokenize method could look like this (use generics, if you use java 1.5 or later):
private List<String> tokenize(String inp) {
    List<String> out = new ArrayList<String>();
    String[] split = inp.split("\\s+|(<br>|<br\\s*/>)+");
    for (String s : split) {
        if (!s.isEmpty()) {
            out.add(s);
        }
    }
    return out;
}
Are you sure splitting at the enters isn't already working? Because with this:
String s = "Hi b\nb bye";
System.out.println(s);
System.out.println();
String [] ss = s.split(" ");
for (String s2 : ss)
{
    System.out.println(s2);
}
This is my output:
Hi b
b bye
Hi
b
b
bye
The output looks as if the string was split both at the spaces and at the new line, but that is misleading: split(" ") only splits at spaces, so the array here has three elements ("Hi", "b\nb" and "bye"), and the middle one simply prints on two lines. To actually split at line breaks as well, you could cycle through the String array and call split("\n") on each element, then add the resulting strings to an ArrayList.
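A quick way to check what split really produced is to look at the length of the array rather than at the printed text, because println on an element that contains "\n" spans two lines. The sketch below compares a plain space with the \\s+ regex; the class and method names are invented for the illustration.

```java
public class NewlineSplit {
    // Splits on single spaces only; a newline stays inside its token.
    static String[] bySpace(String s) {
        return s.split(" ");
    }

    // Splits on any run of whitespace, including newlines.
    static String[] byWhitespace(String s) {
        return s.split("\\s+");
    }

    public static void main(String[] args) {
        String s = "Hi b\nb bye";
        System.out.println(bySpace(s).length);      // prints: 3
        System.out.println(byWhitespace(s).length); // prints: 4
    }
}
```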
