How do I exclude capitalizing specific words in a String?

I'm new to programming, and I need to capitalise each word of the user's input except for certain words.
For example, if the input is
THIS IS A TEST, I get This Is A Test.
However, I want to get This is a Test.
String s = in.nextLine();
StringBuilder sb = new StringBuilder(s.length());
String wordSplit[] = s.trim().toLowerCase().split("\\s");
String[] t = {"is","but","a"};
for(int i=0;i<wordSplit.length;i++){
if(wordSplit[i].equals(t))
sb.append(wordSplit[i]).append(" ");
else
sb.append(Character.toUpperCase(wordSplit[i].charAt(0))).append(wordSplit[i].substring(1)).append(" ");
}
System.out.println(sb);
}
This is the closest I have gotten so far, but I can't seem to exclude the specific words from being capitalised.

The problem is that you are comparing each word to the entire array. Java allows this, but the comparison is always false, because a String is never equal to an array. You could loop over the array and compare each element, but that is lengthy, and it gets slow as the array of words grows.
Instead, I'd suggest creating a Set from the array and checking whether it contains the word:
String[] t = {"is","but","a"};
Set<String> t_set = new HashSet<>(Arrays.asList(t));
...
if (t_set.contains(wordSplit[i])) {
...
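Putting the pieces together, a minimal sketch of the Set-based approach could look like this. The class and method names (TitleCaseExample, capitalizeExcept) are made up for illustration, and the excluded words are the ones from the question:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class TitleCaseExample {
    // Words that stay lowercase (same list as in the question).
    private static final Set<String> EXCLUDED =
            new HashSet<>(Arrays.asList("is", "but", "a"));

    // Hypothetical helper: capitalises every word except the excluded ones.
    static String capitalizeExcept(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (String word : input.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input yields a single empty token
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (EXCLUDED.contains(word)) {
                sb.append(word);
            } else {
                sb.append(Character.toUpperCase(word.charAt(0)))
                  .append(word.substring(1));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeExcept("THIS IS A TEST")); // This is a Test
    }
}
```

Appending the separator before each word, rather than after it, also avoids the trailing space that the original loop leaves at the end.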

Your problem (as pointed out by @sleepToken) is that
if(wordSplit[i].equals(t))
is checking to see if the current word is equal to the array containing your keywords.
Instead what you want to do is to check whether the array contains a given input word, like so:
if (Arrays.asList(t).contains(wordSplit[i].toLowerCase()))
Note that there is no case-insensitive contains() method, so it's important to convert the word in question to lower case before searching for it.

You're already doing the iteration once. Just do it again; iterate through every String in t for each String in wordSplit:
for (int i = 0; i < wordSplit.length; i++){
boolean found = false;
for (int j = 0; j < t.length; j++) {
if(wordSplit[i].equals(t[j])) {
found = true;
}
}
if (found) { /* do your stuff */ }
else { }
}

First, write a method that checks whether the array contains the word:
static boolean contains(String[] arr, String word) {
for (int i = 0; i < arr.length; i++) {
if (word.equals(arr[i])) {
return true;
}
}
return false;
}
Then change your condition from wordSplit[i].equals(t) to contains(t, wordSplit[i]).

Your code never compares the word with each of the words to ignore; the line if(wordSplit[i].equals(t)) compares it with the whole array.
You can do something like this:
import java.util.Arrays;
import java.util.List;
public class Sample {
public static void main(String[] args) {
String s = "THIS IS A TEST";
String[] ignore = {"is","but","a"};
List<String> toIgnoreList = Arrays.asList(ignore);
StringBuilder result = new StringBuilder();
for (String s1 : s.split(" ")) {
if(!toIgnoreList.contains(s1.toLowerCase())) {
result.append(s1.substring(0,1).toUpperCase())
.append(s1.substring(1).toLowerCase())
.append(" ");
} else {
result.append(s1.toLowerCase())
.append(" ");
}
}
System.out.println("Result: " + result);
}
}
Output is:
Result: This is a Test

To check for the words to exclude, the List.contains() method is a better choice.
The expression below checks whether the exclude list contains the word and, if it does not, capitalises the first letter:
tlist.contains(x) ? x : (x = x.substring(0,1).toUpperCase() + x.substring(1))
The expression corresponds to:
if(tlist.contains(x)) { // ?
x = x; // do nothing
} else { // :
x = x.substring(0,1).toUpperCase() + x.substring(1);
}
or:
if(!tlist.contains(x)) {
x = x.substring(0,1).toUpperCase() + x.substring(1);
}
If you're allowed to use Java 8:
String s = in.nextLine();
String wordSplit[] = s.trim().toLowerCase().split("\\s");
List<String> tlist = Arrays.asList("is","but","a");
String result = Stream.of(wordSplit).map(x ->
tlist.contains(x) ? x : (x = x.substring(0,1).toUpperCase() + x.substring(1)))
.collect(Collectors.joining(" "));
System.out.println(result);
Output:
This is a Test

Related

Trying to add a count to a variable everytime a string matches an element in my string array

So, I have a string array. For instance, [test, exam, quiz] which we will call cat to prevent misunderstanding. I also have a string. For instance, String school = "We have a test and a quiz next week". I am checking the string school to see if it contains/matches the string array [test, exam, quiz]. However, I want to keep track of how many matches are occurring in a count variable. I haven't been able to figure out how to add one to the count for each match. For instance, based on the scenario constructed, there should be two matches of the three in the string array.
Here is my code:
int i = 0;
for (String s : cat) {
if (school.contains(s)) {
match = true;
i++;
break;
}
}
The output for this code only gives 1, even though the string school contains test and quiz. I want it to give two.
Your help would be much appreciated.

All you have to do is:
Split the given school string into separate words;
Use filter() to keep only the required words;
Count elements in the stream.
String school = "We have a test and a quiz next week test";
String[] cat = { "test", "exam", "quiz" };
Set<String> words = Arrays.stream(cat).collect(Collectors.toSet());
long count = Arrays.stream(school.split("\\s+"))
.filter(words::contains)
.count();

break will stop the for loop after the first match. That's why you get i = 1;
int i = 0;
for (String s: cat) {
if (school.contains(s)) {
match=true;
i++;
break; // This break will stop the for loop iteration after the first match.
}
}
To meet both of your requirements, you could modify your code as below.
String school = "We have a test and a quiz next week";
String[] cat = {"test", "exam", "quiz"};
int i = 0;
boolean match = false;
for (String s : cat) {
if (school.contains(s)) {
match = true;
i++;
}
}
System.out.println(match ? "found " + i + " matches." : "not found any match");
But this code still has an issue: if the same keyword occurs more than once, it is counted only once. For example, if the school String contains test twice, test still adds only 1 to the count.
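If repeated keywords should each be counted, one option is to keep searching with indexOf from the end of the previous hit. This is only a sketch: OccurrenceCounter and countOccurrences are illustrative names, and it matches substrings, so "contest" would also count as a hit for "test".

```java
class OccurrenceCounter {
    // Counts every non-overlapping occurrence of each keyword in the text.
    static int countOccurrences(String text, String[] keywords) {
        int count = 0;
        for (String keyword : keywords) {
            if (keyword.isEmpty()) {
                continue; // an empty keyword would match at every index
            }
            int from = text.indexOf(keyword);
            while (from >= 0) {
                count++;
                // Resume the search right after the match just found.
                from = text.indexOf(keyword, from + keyword.length());
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String school = "We have a test and a quiz next week test";
        String[] cat = {"test", "exam", "quiz"};
        System.out.println(countOccurrences(school, cat)); // 3
    }
}
```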

One approach uses an alternation of search terms along with a formal regex pattern matcher:
String[] cat = new String[] {"test", "exam", "quiz"};
StringBuilder regex = new StringBuilder("\\b(?:");
for (int i=0; i < cat.length; ++i) {
if (i > 0) regex.append("|");
regex.append(cat[i]);
}
regex.append(")\\b");
String school = "We have a test and a quiz next week";
Pattern r = Pattern.compile(regex.toString());
Matcher m = r.matcher(school);
int count = 0;
while (m.find()) {
++count;
}
System.out.println("found " + count + " matches."); // found 2 matches.
To be clear here, the regex pattern we are using against the input sentence is:
\b(?:test|exam|quiz)\b
We iterate over the input sentence, and increment the counter by one for each time we hit a keyword.
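As a variation on the same idea, the alternation can be built with a stream, and each keyword can be passed through Pattern.quote so that regex metacharacters in a keyword are treated literally. The class and method names below are invented for this sketch:

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class KeywordRegexCounter {
    // Counts whole-word occurrences of any keyword in the text.
    static int countWholeWords(String text, String[] keywords) {
        if (keywords.length == 0) {
            return 0; // an empty alternation would match empty strings
        }
        String alternation = Arrays.stream(keywords)
                .map(Pattern::quote)          // treat each keyword literally
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile("\\b(?:" + alternation + ")\\b").matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] cat = {"test", "exam", "quiz"};
        System.out.println(countWholeWords("We have a test and a quiz next week", cat)); // 2
    }
}
```

Because of the \b word boundaries, words such as "contest" or "tests" are not counted as matches for "test".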

The result is 1 because the break statement inside the if block exits the loop as soon as the first match is found. Instead, you should write it like this:
int i=0;
for (String s: cat){
if(school.contains(s)){
match=true;
i++;
}
}

You could use a Hashtable in Java. (If the input sentence happens to contain a keyword more than once, the String#contains logic will still count only one match.)
import java.util.Hashtable;
public class Main
{
public static void main(String[] args) {
String school = "We have a test and a quiz next week";
String cat[] = new String[]{"test", "exam", "quiz"};
Hashtable<String, Integer> my_dict = new Hashtable<String, Integer>();
int count=0; // variable should be named such that it is understandable the purpose of it
for (String s: cat){
if(school.contains(s)){
if(!my_dict.containsKey(s)){
my_dict.put(s, 1);
count++;
}
//else{} //don't need else in your case
}
}
System.out.println( "Occurrence: " + count );
}
}
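The Hashtable only keeps a keyword that is listed more than once in cat from being counted twice. If you don't need that guard, remove the Hashtable and its condition, as below. Note that String#contains still counts each entry in cat at most once, however often it occurs in the sentence: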
int count=0; // variable should be named such that it is understandable the purpose of it
for (String s: cat){
if(school.contains(s)){
count++;
}
}
System.out.println( "Occurrence: " + count );

unable to pass test case in autocomplete

Question
Autocomplete
Doug was using Google and was amazed to see the autocomplete feature. Autocomplete searches the database for all the possible words that can be formed from the characters provided by the user (as input).
For example, if a user types 'cis' in the search bar, then the suggestions would be
• cisco
• cist
• cissp
• cism
• cisa
He thought about applying the same feature in his search engine. In his prototype, he took a string as the domain, which contained all the words he could search.
As his designer, you have to tell him how many autocomplete options will be provided to him if something is entered in the input field.
This is my code for the following problem.
import java.util.ArrayList;
import java.util.List;
public class Test {
public static void main(String[] args) {
String input1 = "Hello world with warm welcome from Mr.kajezevu";
String input2 = "w";
//output should be any word starting with w i.e {world,warm,welcome}
List < String > l = new ArrayList < String > ();
String[] str = input1.split("\\s+");//splits a given string at spaces
for (int i = 0; i < str.length; i++) {
if (str[i].length() >= input2.length()) { // checks if the length of input2 is not greater than the str[i] selected
if (input2.equals(str[i].substring(0, input2.length()))) { //comparing the two string if they are equal
l.add(str[i]);
}
}
}
String[] result = l.toArray(new String[l.size()]);
for (int i = 0; i < result.length; i++) {
System.out.println(result[i]);
}
}
}
But my solution passes only one test case, and it also fails the complexity test case.
I can't figure out what's wrong with it.

It seems you missed the boundary conditions.
The code is below.
public static String[] autoComplete(String input1, String input2){
List<String> listOfPredictions = new ArrayList<String>();
String[] emptyArr = new String[0];
if(isEmpty(input1) || isEmpty(input2)){
return emptyArr;
}
input1 = input1.trim();
input2 = input2.trim();
String tokenizer = " " + input2;
int fromIdx = 1;
if(input1.startsWith(input2)){
int end = input1.indexOf(" ");
if(end < 0) end = input1.length(); // the domain is a single word
listOfPredictions.add(input1.substring(0, end));
}
while(fromIdx > 0){
fromIdx = input1.indexOf(tokenizer, fromIdx) + 1;
if(fromIdx > 0){
int end = input1.indexOf(" ", fromIdx);
if(end < 0) end = input1.length(); // the match is the last word
listOfPredictions.add(input1.substring(fromIdx, end));
}
}
return listOfPredictions.toArray(emptyArr);
}
private static boolean isEmpty(String str){
return str == null || str.trim().length() == 0;
}

We also need to remove all duplicate words from the resulting array.
First, we break the string into words using the string.split() function.
Then we collect all the words that start with the input2 string.
Finally, we remove all duplicates from the resulting array by creating a Set and converting it back into an array.
function autoComplete(input1, input2) {
let results = [];
if(!input1 || !input1.length || !input2 || !input2.length) return results;
input1 = input1.trim();
input2 = input2.trim();
let allWords = input1.split(/\s+/);
allWords.forEach(word => {
if(word.startsWith(input2)) {
results.push(word);
}
})
results = [...new Set(results)];
return results;
}
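Since the question is in Java, the same steps can be sketched there as well. A LinkedHashSet drops duplicates while keeping the order in which the words first appear. Whether the judge actually expects de-duplicated results is an assumption carried over from this answer, and the class name is made up:

```java
import java.util.LinkedHashSet;
import java.util.Set;

class AutoCompleteExample {
    // Returns the distinct words of the domain that start with the prefix.
    static String[] autoComplete(String domain, String prefix) {
        if (domain == null || prefix == null
                || domain.trim().isEmpty() || prefix.trim().isEmpty()) {
            return new String[0];
        }
        String p = prefix.trim();
        Set<String> matches = new LinkedHashSet<>(); // keeps first-seen order
        for (String word : domain.trim().split("\\s+")) {
            if (word.startsWith(p)) {
                matches.add(word);
            }
        }
        return matches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String domain = "Hello world with warm welcome from Mr.kajezevu world";
        // prints world,with,warm,welcome
        System.out.println(String.join(",", autoComplete(domain, "w")));
    }
}
```

The number of autocomplete options asked for in the problem statement is then simply the length of the returned array.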

How can I check if elements in my array list is ending with character "_bp"

How can I check whether the elements in my ArrayList end with "_bp", and append that suffix when they don't?
It would be great if I didn't have to convert my ArrayList into a String.
I tried below code:
ArrayList<String> comparingList = new ArrayList<String>();
String search = "_bp";
for (String str : comparingList)
{
if (str.trim().contains(search))
{
// Do Nothing
}
else
{
str.trim().concat("_bp");
//Replace existing value with this concated value to array list
}
}
But I am still not able to append "_bp" to the existing element, and I don't want to convert my comparingList into a String.
Shorter code is preferable if possible.
Thanks in advance :)

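Strings are immutable: concat() returns a new String and leaves the original unchanged, so you have to store the result:
String a = "test";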
System.out.println("1 " + a.concat("_bp"));
System.out.println("2 " + a);
a = a.concat("_bp");
System.out.println("3 " + a);
Output:
1 test_bp
2 test
3 test_bp
So write the new value back into the list with set(), like this:
ArrayList<String> comparingList = new ArrayList<String>();
comparingList.add("test1_bp");
comparingList.add("test2");
String search = "_bp";
for (int i = 0; i < comparingList.size(); i++) {
String tmpStr = comparingList.get(i);
if(tmpStr.endsWith(search)){
//do nothing
}else{
comparingList.set(i, tmpStr+"_bp");
}
}
System.out.println(comparingList);

This worked :)
String search = "_bp";
int i = 0;
for (String str : comparingList)
{
if (str.trim().contains(search))
{
}
else
{
comparingList.set(i, str.trim().concat("_bp"));
}
i++;
}
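For shorter code on Java 8 or later, List.replaceAll can rewrite every element in place. The sketch below uses endsWith rather than contains, since the question asks about the ending; the helper name ensureSuffix is made up, and unlike the loop above it stores the trimmed value for every element:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SuffixExample {
    // Appends the suffix to every element that does not already end with it.
    static void ensureSuffix(List<String> list, String suffix) {
        list.replaceAll(s -> {
            String t = s.trim();
            return t.endsWith(suffix) ? t : t + suffix;
        });
    }

    public static void main(String[] args) {
        List<String> comparingList = new ArrayList<>(Arrays.asList("test1_bp", "test2"));
        ensureSuffix(comparingList, "_bp");
        System.out.println(comparingList); // [test1_bp, test2_bp]
    }
}
```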

How to Check for Deleted Words Between 2 Sentences in Java

What's the best approach in Java if you want to check for words that were deleted from sentence A in sentence B. For example:
Sentence A: I want to delete unnecessary words on this simple sentence.
Sentence B: I want to delete words on this sentence.
Output: I want to delete (unnecessary) words on this (simple) sentence.
where the words inside the parenthesis are the ones that were deleted from sentence A.

Assuming order doesn't matter: use commons-collections.
Use String.split() to split both sentences into arrays of words.
Use commons-collections' CollectionUtils.addAll to add each array into an empty Set.
Use commons-collections' CollectionUtils.subtract method to get A-B.
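If you would rather not add a dependency, the same set difference can be sketched with the JDK alone. Note this swaps commons-collections for Set.removeAll, so it is not the exact API named above; the class and method names are illustrative:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class SetDifferenceExample {
    // Words that occur in sentence a but nowhere in sentence b (order ignored).
    static Set<String> deletedWords(String a, String b) {
        Set<String> result =
                new LinkedHashSet<>(Arrays.asList(a.trim().split("\\s+")));
        result.removeAll(Arrays.asList(b.trim().split("\\s+")));
        return result;
    }

    public static void main(String[] args) {
        String a = "I want to delete unnecessary words on this simple sentence.";
        String b = "I want to delete words on this sentence.";
        System.out.println(deletedWords(a, b)); // [unnecessary, simple]
    }
}
```

Because this is a set operation, a word that was deleted in one place but still appears elsewhere in sentence B will not be reported.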

Assuming order and position matter, this looks like a variation of the Longest Common Subsequence problem, which has a dynamic programming solution.
Wikipedia has a great page on the topic; there's really too much for me to outline here:
http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
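For a rough idea of what that looks like at word level, here is a small sketch of the standard LCS table followed by a walk that puts parentheses around the words of A that are not part of the common subsequence. The names are made up, and words that exist only in B are simply skipped:

```java
import java.util.ArrayList;
import java.util.List;

class LcsDeletedWords {
    static String markDeleted(String a, String b) {
        String[] x = a.trim().split("\\s+");
        String[] y = b.trim().split("\\s+");
        // lcs[i][j] = length of the longest common subsequence of x[i..] and y[j..]
        int[][] lcs = new int[x.length + 1][y.length + 1];
        for (int i = x.length - 1; i >= 0; i--) {
            for (int j = y.length - 1; j >= 0; j--) {
                lcs[i][j] = x[i].equals(y[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < x.length) {
            if (j < y.length && x[i].equals(y[j])) {
                out.add(x[i]);             // word kept in both sentences
                i++;
                j++;
            } else if (j < y.length && lcs[i][j + 1] > lcs[i + 1][j]) {
                j++;                       // word only in B: skip it
            } else {
                out.add("(" + x[i] + ")"); // word deleted from A
                i++;
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(markDeleted(
                "I want to delete unnecessary words on this simple sentence.",
                "I want to delete words on this sentence."));
        // I want to delete (unnecessary) words on this (simple) sentence.
    }
}
```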

Everyone else is using really heavy-weight algorithms for what is actually a very simple problem. It could be solved using longest common subsequence, but it's a very constrained version of that. It's not a full diff; it only includes deletes. No need for dynamic programming or anything like that. Here's a 20-line implementation:
private static String deletedWords(String s1, String s2) {
StringBuilder sb = new StringBuilder();
String[] words1 = s1.split("\\s+");
String[] words2 = s2.split("\\s+");
int i1, i2;
i1 = i2 = 0;
while (i1 < words1.length) {
if (i2 < words2.length && words1[i1].equals(words2[i2])) {
sb.append(words1[i1]);
i2++;
} else {
sb.append("(" + words1[i1] + ")");
}
if (i1 < words1.length - 1) {
sb.append(" ");
}
i1++;
}
return sb.toString();
}
When the inputs are the ones in the question, the output matches exactly.
Granted, I understand that for some inputs there are multiple solutions. For example:
a b a
a
could be either a (b) (a) or (a) (b) a and maybe for some versions of this problem, one of these solutions is more likely to be the "actual" solution than the other, and for those you need some recursive or dynamic programming approach... but let's not make it too much more complicated than what Israel Sato originally asked for!

String a = "I want to delete unnecessary words on this simple sentence.";
String b = "I want to delete words on this sentence.";
String[] aWords = a.split(" ");
String[] bWords = b.split(" ");
List<String> missingWords = new ArrayList<String> ();
int x = 0;
for(int i = 0 ; i < aWords.length; i++) {
String aWord = aWords[i];
if(x < bWords.length) {
String bWord = bWords[x];
if(aWord.equals(bWord)) {
x++;
} else {
missingWords.add(aWord);
}
} else {
missingWords.add(aWord);
}
}

This also works well for updated strings.
Updated strings are enclosed in square brackets, and deleted ones in parentheses.
import java.util.*;
class Sample{
public static void main(String[] args){
Scanner sc=new Scanner(System.in);
String str1 = sc.nextLine();
String str2 = sc.nextLine();
List<String> flist = Arrays.asList(str1.split("\\s+"));
List<String> slist = Arrays.asList(str2.split("\\s+"));
List<String> completedString = new ArrayList<String>();
String result="";
String updatedString = "";
String deletedString = "";
int i=0;
int startIndex=0;
int endIndex=0;
for(String word: slist){
if(flist.contains(word)){
endIndex = flist.indexOf(word);
if(!completedString.contains(word)){
if(deletedString.isEmpty()){
for(int j=startIndex;j<endIndex;j++){
deletedString+= flist.get(j)+" ";
}
}
}
startIndex=endIndex+1;
if(!deletedString.isEmpty()){
result += "("+deletedString.substring(0,deletedString.length()-1)+") ";
deletedString="";
}
if(!updatedString.isEmpty()){
result += "["+updatedString.substring(0,updatedString.length()-1)+"] ";
updatedString="";
}
result += word+" ";
completedString.add(word);
if(i==slist.size()-1){
endIndex = flist.size();
for(int j=startIndex;j<endIndex;j++){
deletedString+= flist.get(j)+" ";
}
startIndex = endIndex+1;
}
}
else{
if(i == 0){
boolean boundaryCheck = false;
for(int j=i+1;j<slist.size();j++){
if(flist.contains(slist.get(j))){
endIndex=flist.indexOf(slist.get(j));
boundaryCheck=true;
break;
}
}
if(!boundaryCheck){
endIndex = flist.size();
}
if(!completedString.contains(word)){
for(int j=startIndex;j<endIndex;j++){
deletedString+= flist.get(j)+" ";
}
}
startIndex = endIndex+1;
}else if(i == slist.size()-1){
endIndex = flist.size();
if(!completedString.contains(word)){
for(int j=startIndex;j<endIndex;j++){
deletedString+= flist.get(j)+" ";
}
}
startIndex = endIndex+1;
}
updatedString += word+" ";
completedString.add(word);
}
i++;
}
if(!deletedString.isEmpty()){
result += "("+deletedString.substring(0,deletedString.length()-1)+") ";
}
if(!updatedString.isEmpty()){
result += "["+updatedString.substring(0,updatedString.length()-1)+"] ";
}
System.out.println(result);
}
}

This is basically a differ, take a look at this:
diff
and the root algorithm:
Longest common subsequence problem
Here's a sample Java implementation:
http://introcs.cs.princeton.edu/java/96optimization/Diff.java.html
which compares lines. The only thing you need to do is split by word instead of by line or alternatively put each word of both sentences in a separate line.
If you are on Linux, for example, you can see the results of the latter option using the diff program itself before you even write any code. Try this:
$ echo "I want to delete unnecessary words on this simple sentence."|tr " " "\n" > 1
$ echo "I want to delete words on this sentence."|tr " " "\n" > 2
$ diff -uN 1 2
--- 1 2012-10-01 19:40:51.998853057 -0400
+++ 2 2012-10-01 19:40:51.998853057 -0400
@@ -2,9 +2,7 @@
want
to
delete
-unnecessary
words
on
this
-simple
sentence.
The lines with - in front are the ones that were removed from sentence A (likewise, lines with + in front would be ones added in sentence B that were not in sentence A). Try it out to see if that fits your problem.
Hope this helps.

Problem implementing classifier algorithm for whitespace separated words

I have a text and split it into words separated by white spaces.
I'm classifying units, and the classifier works if the value and the unit occur in the same word (e.g. '100m', '90kg', '140°F', 'US$500'), but I'm having problems if they appear separately, each part in its own word (e.g. '100 °C', 'US$ 450', '150 km').
The classifier can tell when a word is a unit whose value is missing, and whether the value belongs on the left or the right side.
My question is: how can I iterate over all the words in the list and provide the correct words to the classifier?
This is only an example of the code. I have tried it in a lot of ways.
for(String word: words){
String category = classifier.classify(word);
if(classifier.needPreviousWord()){
// ?
}
if(classifier.needNextWord()){
// ?
}
}
In other words, I need to iterate over the list classifying all the words. If the previous word is needed for the test, provide the previous word and the unit; if the next word is needed, provide the unit and the next word. It appears to be simple, but I don't know how to do it.

Don't use the implicit iterator of the for-each loop; use an explicit ListIterator. Then you can go back and forth as you like.
ListIterator<String> i = words.listIterator();
while (i.hasNext()) {
String category = classifier.classify(i.next());
if(classifier.needPreviousWord()){
i.previous();
}
if(classifier.needNextWord()){
i.next();
}
}
This is not complete, because I don't know what your classifier does exactly, but it should give you an idea on how to proceed.
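One thing to watch out for: calling previous() right after next() returns the same element again, so it is often simpler to peek at the neighbours by index. The sketch below only shows how to get hold of the previous and next word for each position. The classifier itself is left out because its API is not known, and the names are made up:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class NeighbourWords {
    // Builds "previous|current|next" for every word; a missing neighbour is "".
    static List<String> withNeighbours(List<String> words) {
        List<String> result = new ArrayList<>();
        ListIterator<String> it = words.listIterator();
        while (it.hasNext()) {
            int index = it.nextIndex();   // index of the word about to be read
            String current = it.next();
            String previous = index > 0 ? words.get(index - 1) : "";
            String next = it.hasNext() ? words.get(index + 1) : "";
            // A real loop would hand previous/current/next to the classifier here.
            result.add(previous + "|" + current + "|" + next);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withNeighbours(Arrays.asList("150", "km", "now")));
        // [|150|km, 150|km|now, km|now|]
    }
}
```

The words.get(index) calls assume a random-access list such as ArrayList; on a LinkedList each lookup would be linear.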

This could help.
public static void main(String [] args)
{
List<String> words = new ArrayList<String>();
String previousWord = "";
String nextWord = "";
for(int i=0; i < words.size(); i++) {
if(i > 0) {
previousWord = words.get(i-1);
}
String currentWord = words.get(i);
if(i < words.size() - 1) {
nextWord = words.get(i+1);
} else {
nextWord = "";
}
String category = classifier.classify(currentWord);
if(classifier.needPreviousWord()){
if(previousWord.length() == 0) {
System.out.println("ERROR: missing previous unit");
} else {
System.out.println(previousWord + currentWord);
}
}
if(classifier.needNextWord()){
if(nextWord.length() == 0) {
System.out.println("ERROR: missing next unit");
} else {
System.out.println(currentWord + nextWord);
}
}
}
}
