How to get keywords from a specific index in Java - java

Suppose I have a string:
String advise = "eat healthy food";
In the string I only know the keyword “healthy”. I don’t know what comes before the word or what comes after it; I only know the middle word. So how can I get the words before (“eat”) and after (“food”) “healthy”?
Note: The middle word’s length is always fixed, but the lengths of the other two words vary. “eat” and “food” are used only as an example; these two words may be anything.
I need to get these two words into two different strings, not in the same string.

Just split the string:
advise.split("healthy");
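The first value in the array will be "eat " and the second will be " food" (each keeps the space that sat next to the keyword), so call trim() on them to get "eat" and "food".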
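For longer sentences, splitting on the keyword returns everything before and after it, not just the neighbouring words. A minimal sketch of narrowing that down, assuming words are separated by single spaces and only the first occurrence of the keyword matters; the class and method names (KeywordNeighbors, neighbors) are made up for illustration:

```java
import java.util.regex.Pattern;

class KeywordNeighbors {
    // Hypothetical helper: returns {wordBefore, wordAfter}, or null if the keyword is absent.
    static String[] neighbors(String text, String keyword) {
        if (!text.contains(keyword)) {
            return null;
        }
        // Pattern.quote makes split treat the keyword literally, not as a regex.
        // With limit 2 the result always has exactly two parts when the keyword is present.
        String[] parts = text.split(Pattern.quote(keyword), 2);
        String before = parts[0].trim();
        String after = parts[1].trim();
        // Keep only the last word before the keyword and the first word after it.
        String wordBefore = before.substring(before.lastIndexOf(' ') + 1);
        int firstSpace = after.indexOf(' ');
        String wordAfter = firstSpace < 0 ? after : after.substring(0, firstSpace);
        return new String[] { wordBefore, wordAfter };
    }

    public static void main(String[] args) {
        String[] result = neighbors("I want to eat healthy food today", "healthy");
        System.out.println(result[0]);
        System.out.println(result[1]);
    }
}
```

If the keyword sits at the very start or end of the text, the corresponding element is an empty string.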

Here is a more general purpose solution that will handle more complex strings.
public static void main (String[] args)
{
String keyword = "healthy";
String advise = "I want to eat healthy food today";
Pattern p = Pattern.compile("([\\s]?+[\\w]+[\\s]+)" + keyword + "([\\s]+[\\w]+[\\s]?+)");
Matcher m = p.matcher(advise);
if (m.find())
{
String before = m.group(1).trim();
String after = m.group(2).trim();
System.out.println(before);
System.out.println(after);
}
else
{
System.out.println("The keyword was not found.");
}
}
Outputs:
eat
food

I think you can use split and get all the words separately as you wanted.
String advise = "eat healthy food";
String[] words = advise.split("healthy");
List<String> word = Arrays.asList(words);
word.forEach(w-> System.out.println(w.trim()));

Related

I want to split a string with multiple whitespaces using split() method?

This program is to return the readable string for the given morse code.
class MorseCode{
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
String morseCode = scanner.nextLine();
System.out.println(getMorse(morseCode));
}
private static String getMorse(String morseCode){
StringBuilder res = new StringBuilder();
String characters = new String(morseCode);
String[] charactersArray = characters.split(" "); // this method isn't working for splitting; what should I do?
for(String charac : charactersArray)
res.append(get(charac)); // get() returns the string for the corresponding code, which is then appended
return res.toString();
}
Can you suggest a way to split up a string that contains multiple whitespace characters? And can you give me some examples of other split operations?
Could you please share here the example of source string and the result?
Sharing this will help to understand the root cause.
By the way this code just works fine
String source = "a b c d";
String[] result = source.split(" ");
for (String s : result) {
System.out.println(s);
}
The code above prints out:
a
b
c
d
First, that method will only work if you have a specific number of spaces that you want to split by. You must also make sure that the argument on the split method is equal to the number of spaces you want to split by.
If, however, you want to split by any number of spaces, trim the string first (that removes leading and trailing whitespace), and then split on the regex \\s+, which matches one or more whitespace characters:
charactersArray = characters.trim().split("\\s+");
Also, I don't understand the point of creating the characters string. Strings are immutable so there's nothing wrong with doing String characters = morseCode. Even then, I don't see the point of the new string. Why not just name your parameter characters and be done with it?
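To make the whitespace handling concrete, here is a small sketch that splits on runs of whitespace. The class and method names (WhitespaceSplit, tokens) are hypothetical, and the guard for blank input is an assumption about what the caller wants:

```java
import java.util.Arrays;

class WhitespaceSplit {
    // Splits a line on runs of spaces, tabs, or newlines.
    static String[] tokens(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            // "".split(...) would return one empty string, so handle blank input explicitly.
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("  .-   -...  -.-. ")));
        // prints [.-, -..., -.-.]
    }
}
```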

Find the same word in two different strings

I would like to find the same word in two strings.
startpoint = newresult.indexOf('\'');
endpoint = newresult.lastIndexOf('\'');
variables = newresult.substring(startpoint, endpoint);
variables = variables.replace("\r\n", ",");
variables = variables.replaceAll("'", "");
String variables:
cons,john,$,alex,manag;
String second:
ins_manages(john,cons)
As can be seen, both strings contain john and cons. I want to check whether both have the same character sequences, but I don't know how to check this. Is there any way to check it directly?
Solution:
String [] newvar;
newvar = variables.split(",");
After that, I used a for loop and matched them one by one.
BR
Split both the strings and compare the individual words using foreach as shown below:
String first = "hello world today";
String second = "Yet another hello worldly day today";
//split the second string into words
List<String> wordsOfSecond = Arrays.asList(second.split(" "));
//split and compare each word of the first string
for (String word : first.split(" ")) {
if(wordsOfSecond.contains(word))
System.out.println(word);
}
Your requirements are a little ambiguous. You're asking to find the same singular word in two strings, then your sample asks for finding two words in the same strings.
For verifying that one single word is in two strings, you can just do:
public boolean bothStringsContainWord(String s1, String s2, String word) {
return s1.contains(word) && s2.contains(word);
}
You can put that in a loop if you need to do it over multiple words. Again, your requirements are a little fuzzy though; if you straighten them up a more efficient solution probably exists.
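If the goal is the set of words that both strings share, a set intersection avoids the nested loop. This is only a sketch under the assumption that a "word" is a run of letters, digits, or underscores (the regex \W+ is used as the separator); CommonWords, words, and common are illustrative names:

```java
import java.util.LinkedHashSet;
import java.util.Set;

class CommonWords {
    // Breaks a string into words, treating any run of non-word characters as a separator.
    static Set<String> words(String s) {
        Set<String> out = new LinkedHashSet<>();
        for (String w : s.split("\\W+")) {
            if (!w.isEmpty()) {   // a leading separator produces one empty token
                out.add(w);
            }
        }
        return out;
    }

    // Returns the words that occur in both strings, in the order they appear in the first.
    static Set<String> common(String first, String second) {
        Set<String> result = words(first);
        result.retainAll(words(second));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(common("cons,john,$,alex,manag;", "ins_manages(john,cons)"));
        // prints [cons, john]
    }
}
```

Because the underscore counts as a word character, ins_manages stays a single token here and does not match manag.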

(Java) Substrings & Reading data from two files using hashmap

If I had a .txt file called animals that had fishfroggoat etc. in it, and another file called owners that had something like:
fish:jane
frog:mark
goat:joe
how could I go about pairing the pets to their owners? I'm fairly sure a HashMap would be good here, but I'm stuck. I put the animal text into a string, but I don't know how to break it up into 4 characters properly.
Any help would be great.
Sorry I didn't add any code, but thanks to you guys' help (especially Ted Hopps) I worked it out and, more importantly, understood it. :-)
There are various approaches. The most direct is to split it using the substring method:
String animals = "fishfroggoat";
String fish = animals.substring(0, 4);
String frog = animals.substring(4, 8);
String goat = animals.substring(8); // or (8, 12)
If you have an arbitrarily long list of 4-character animals, you can do this:
String animals = "fishfroggoatbear";
int n = animals.length() / 4;
String[] animalArray = new String[n];
for (int i = 0; i < n; ++i) {
animalArray[i] = animals.substring(4*i, 4*i + 4);
}
You can split the pet/owner strings using split:
String rawData = "fish:jane";
String[] data = rawData.split(":");
String pet = data[0];
String owner = data[1];
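Putting the two pieces together with the HashMap mentioned in the question might look like the sketch below. It assumes the contents of both files have already been read into memory (file reading is left out), that every animal name is exactly 4 characters long, and the names PetOwners and parseOwners are invented for the example:

```java
import java.util.HashMap;
import java.util.Map;

class PetOwners {
    // Builds a pet -> owner map from lines of the form "pet:owner".
    static Map<String, String> parseOwners(String[] lines) {
        Map<String, String> owners = new HashMap<>();
        for (String line : lines) {
            String[] data = line.split(":", 2);
            if (data.length == 2) {          // skip lines without a colon
                owners.put(data[0].trim(), data[1].trim());
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        String animals = "fishfroggoat";                          // contents of the animals file
        String[] ownerLines = { "fish:jane", "frog:mark", "goat:joe" };  // lines of the owners file
        Map<String, String> owners = parseOwners(ownerLines);

        // Walk the animal string in 4-character steps and look each pet up.
        for (int i = 0; i + 4 <= animals.length(); i += 4) {
            String pet = animals.substring(i, i + 4);
            System.out.println(pet + " -> " + owners.get(pet));
        }
    }
}
```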
Use String split as given below.
String msg = "fish:jane";
String[] parts = msg.split(":");
This produces an array of the pieces separated by ":".
This is how you split a string into 4-character chunks in just one line:
String[] animals = input.split("(?<=\\G....)");
This may seem like black magic, so I'll try to demystify it. Welcome to the dark art of regular expressions...
The String.split() method splits the string on every match to the specified regex. So let's look at the regex:
(?<=\\G....)
The construct (?<=regex) is a "positive look behind" for the regex, meaning that the characters preceding the point in the input between characters (because a look behind is zero-width) must match the regex.
The regex \G (coded as \\G as a Java String constant) means "end of previous match" but also initially matches start of input.
The regex .... matches any 4 characters.
Thus, when expressed in English, the regex (?<=\\G....) means "after every 4 characters".
If anyone is interested, removing \G and splitting on (?<=....) causes it to split after every character from the 4th onward, because it just means "preceded by 4 characters"; you need the \G to require 4 new characters each time.
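The effect of the \G anchor can be checked directly. A small sketch that splits the same input with and without it; the class and method names (ChunkSplit, withAnchor, withoutAnchor) are made up for illustration:

```java
import java.util.Arrays;

class ChunkSplit {
    // Splits after every 4 characters counted from the end of the previous match.
    static String[] withAnchor(String input) {
        return input.split("(?<=\\G....)");
    }

    // Splits at every position that merely has 4 characters before it.
    static String[] withoutAnchor(String input) {
        return input.split("(?<=....)");
    }

    public static void main(String[] args) {
        String input = "fishfroggoatbear";
        System.out.println(Arrays.toString(withAnchor(input)));
        System.out.println(Arrays.toString(withoutAnchor(input)));
    }
}
```

With the anchor the result is four 4-character chunks; without it the first chunk is fish and every later character becomes its own element.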
Here's some test code:
public static void main(String[] args) throws Exception {
String input = "fishfroggoatbear";
String[] animals = input.split("(?<=\\G....)");
System.out.println(Arrays.toString(animals));
}
Output:
[fish, frog, goat, bear]

Java - separate numbers from a string

I have a string that contains a few numbers (usually a date) and separators. The separators can be either "," or "."; for example 01.05,2000.5000
Now I need to separate those numbers and put them into an array, but I'm not sure how to do that (the separating part). Also, I need to check that the string is valid; it cannot be 01.,05.
I'm not asking anyone to solve this for me (though I'd appreciate it if someone wants to), just point me in the right direction :)
One way of doing it is with the StringTokenizer class: iterate over the tokens and convert each one to an integer with parseInt to check that it is a valid number. Be aware that StringTokenizer skips empty tokens between adjacent separators, so the empty-token check in the loop never fires for input like 01.,05 and repeated separators need a separate check:
import java.util.*;
public class t {
public static void main(String... args) {
String line = "01.05,2000.5000";
StringTokenizer strTok = new StringTokenizer(line, ",.");
List<Integer> values = new ArrayList<Integer>();
while (strTok.hasMoreTokens()) {
String s = strTok.nextToken();
if (s.length() == 0) {
// Found a repeated separator, String is not valid, do something about it
}
try {
int value = Integer.parseInt(s, 10);
values.add(value);
} catch(NumberFormatException e) {
// Number not valid, do something about it or continue the parsing
}
}
// At the end, get an array from the ArrayList
Integer[] arrayOfValues = values.toArray(new Integer[values.size()]);
for (Integer i : arrayOfValues) {
System.out.println(i);
}
}
}
Iterate through the array generated by String#split(regex) and check each value to make sure your source String is "valid".
In:
String src = "01.05,2000.5000";
String[] numbers = src.split("[.,]");
numbers here will be an array of Strings, like {"01", "05", "2000", "5000"}. Each value is a number.
Now iterate over numbers. If you find an element that is not a number (it's a number when numbers[i].matches("\\d+") is true), then your src is invalid.
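The check described above could be sketched as follows. One addition that is not in the answer: split is called with a limit of -1 so that trailing empty strings are kept, which lets a trailing separator be detected too. The names NumberFields and parse are hypothetical, and returning null for invalid input is just one possible convention:

```java
import java.util.ArrayList;
import java.util.List;

class NumberFields {
    // Returns the parsed numbers, or null if any field is empty or not all digits.
    static List<Integer> parse(String src) {
        List<Integer> values = new ArrayList<>();
        // Limit -1 keeps trailing empty fields, so "01.,05." yields {"01", "", "05", ""}.
        for (String field : src.split("[.,]", -1)) {
            if (!field.matches("\\d+")) {
                return null;
            }
            try {
                values.add(Integer.parseInt(field));
            } catch (NumberFormatException e) {
                return null;   // all digits, but too large for an int
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("01.05,2000.5000")); // prints [1, 5, 2000, 5000]
        System.out.println(parse("01.,05."));         // prints null
    }
}
```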
If possible, I would use Guava's Splitter for that. It is much more reliable, predictable and flexible than String#split. You can tell it exactly what to expect, what to omit, and so on.
For an example usage, and a small rant on how oddly Java's split sometimes behaves, have a look here: http://code.google.com/p/guava-libraries/wiki/StringsExplained#Splitter
Use regex to group and match the input
String s = "01.05,2000.5000";
Pattern pattern = Pattern.compile("(\\d{2})[.,](\\d{2})[.,](\\d{4})[.,](\\d{4})");
Matcher m = pattern.matcher(s);
if(m.matches()) {
String[] matches = { m.group(1),m.group(2), m.group(3),m.group(4) };
for(String match : matches) {
System.out.println(match);
}
} else {
System.err.println("Mismatch");
}
Try this:
String str = "01.05,2000.5000";
str = str.replace(".",",");
int number = StringUtils.countMatches(str, ",");
String[] arrayStr = new String[number+1];
arrayStr = str.split(",");
StringUtils is from Apache Commons >> http://commons.apache.org/proper/commons-lang/
To validate:
if (input.matches("^(?!.*[.,]{2})[\\d.,]+$"))
This regex checks that:
dot and comma are never adjacent
input is comprised only of digits, dots and commas
To split:
String[] numbers = input.split("[.,]");
In order to separate the string, use split(), the argument of the method is the delimiter
array = string.split("separator");

What is the best way to extract the first word from a string in Java?

Trying to write a short method so that I can parse a string and extract the first word. I have been looking for the best way to do this.
I assume I would use str.split(","), however I would like to grab just the first word from a string, save that in one variable, and put the rest of the tokens in another variable.
Is there a concise way of doing this?
The second parameter of the split method is optional; if specified, it limits the result to at most N elements (that is, the string is split at most N-1 times).
For example:
String mystring = "the quick brown fox";
String arr[] = mystring.split(" ", 2);
String firstWord = arr[0]; //the
String theRest = arr[1]; //quick brown fox
Alternatively you could use the substring method of String.
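A short sketch of the limit form that also copes with input containing only one word, where the resulting array has a single element and arr[1] would throw. FirstWord and firstAndRest are illustrative names, and splitting on \s+ rather than a single space is an assumption about the input:

```java
class FirstWord {
    // Returns {firstWord, rest}; rest is empty when the text has only one word.
    static String[] firstAndRest(String text) {
        // Limit 2 means at most two elements: the first word and everything after it.
        String[] parts = text.trim().split("\\s+", 2);
        String first = parts[0];
        String rest = parts.length > 1 ? parts[1] : "";
        return new String[] { first, rest };
    }

    public static void main(String[] args) {
        String[] result = firstAndRest("the quick brown fox");
        System.out.println(result[0]); // prints the
        System.out.println(result[1]); // prints quick brown fox
    }
}
```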
You should be doing this
String input = "hello world, this is a line of text";
int i = input.indexOf(' ');
String word = input.substring(0, i);
String rest = input.substring(i);
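This avoids regex processing, so it is fast. Note that rest keeps the leading space, and that indexOf returns -1 (which makes substring throw) when the input contains no space.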
To simplify the above:
text.substring(0, text.indexOf(' '));
Here is a ready function:
private String getFirstWord(String text) {
int index = text.indexOf(' ');
if (index > -1) { // Check if there is more than one word.
return text.substring(0, index).trim(); // Extract first word.
} else {
return text; // Text is the first word itself.
}
}
The simple one I used to do is
str.contains(" ") ? str.split(" ")[0] : str
Where str is your string. So, if
str is empty, it is returned as is.
str has one word, it is returned as is.
str has multiple words, the first word is extracted and returned.
Hope this is helpful.
import org.apache.commons.lang3.StringUtils;
...
StringUtils.substringBefore("Grigory Kislin", " ")
You can use String.split with a limit of 2.
String s = "Hello World, I'm the rest.";
String[] result = s.split(" ", 2);
String first = result[0];
String rest = result[1];
System.out.println("First: " + first);
System.out.println("Rest: " + rest);
// prints =>
// First: Hello
// Rest: World, I'm the rest.
API docs for: split
For those who are looking for a Kotlin version:
var delimiter = " "
var mFullname = "Mahendra Rajdhami"
var greetingName = mFullname.substringBefore(delimiter)
like this:
final String str = "This is a long sentence";
final String[] arr = str.split(" ", 2);
System.out.println(Arrays.toString(arr));
arr[0] is the first word, arr[1] is the rest
You could use a Scanner
http://download.oracle.com/javase/1.5.0/docs/api/java/util/Scanner.html
The scanner can also use delimiters other than whitespace. This example reads several items in from a string:
String input = "1 fish 2 fish red fish blue fish";
Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");
System.out.println(s.nextInt());
System.out.println(s.nextInt());
System.out.println(s.next());
System.out.println(s.next());
s.close();
prints the following output:
1
2
red
blue
None of these answers appears to define what the OP might mean by a "word". As others have already said, a "word boundary" may be a comma, and certainly can't be counted on to be a space, or even "white space" (i.e. also tabs, newlines, etc.)
At the simplest, I'd say the word has to consist of any Unicode letters, and any digits. Even this may not be right: a String may not qualify as a word if it contains numbers, or starts with a number. Furthermore, what about hyphens, or apostrophes, of which there are presumably several variants in the whole of Unicode? All sorts of discussions of this kind and many others will apply not just to English but to all other languages, including non-human language, scientific notation, etc. It's a big topic.
But a start might be this (NB written in Groovy):
String givenString = "one two9 thr0ee four"
// String givenString = "oňňÜÐæne;:tŵo9===tĥr0eè? four!"
// String givenString = "mouse"
// String givenString = "&&^^^%"
String[] substrings = givenString.split( '[^\\p{L}^\\d]+' )
println "substrings |$substrings|"
println "first word |${substrings[0]}|"
This works OK for the first, second and third givenStrings. For "&&^^^%" it says that the first "word" is a zero-length string, and the second is "^^^". Actually a leading zero-length token is String.split's way of saying "your given String starts not with a token but a delimiter".
NB in regex \p{L} means "any Unicode letter". The parameter of String.split is of course what defines the "delimiter pattern"... i.e. a clump of characters which separates tokens.
NB2 Performance issues are irrelevant for a discussion like this, and almost certainly for all contexts.
NB3 My first port of call was Apache Commons' StringUtils package. They are likely to have the most effective and best engineered solutions for this sort of thing. But nothing jumped out... https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html ... although something of use may be lurking there.
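A plain Java take on the same idea, as a rough sketch: instead of splitting on delimiters, it searches for the first run of Unicode letters and decimal digits, which sidesteps the leading empty token. The definition of a "word" used here (\p{L} plus \p{Nd}) is an assumption in the spirit of the discussion above, and UnicodeFirstWord and firstWord are made-up names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeFirstWord {
    // One or more Unicode letters or decimal digits.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{Nd}]+");

    // Returns the first such run, or an empty string if the input contains none.
    static String firstWord(String s) {
        Matcher m = WORD.matcher(s);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(firstWord("one two9 thr0ee four")); // prints one
        System.out.println(firstWord("&&^^^%").isEmpty());     // prints true
    }
}
```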
You could also use http://download.oracle.com/javase/6/docs/api/java/util/StringTokenizer.html
I know this question has been answered already, but I have another solution (For those still searching for answers) which can fit on one line:
It uses the split functionality but only gives you the 1st entity.
String test = "123_456";
String value = test.split("_")[0];
System.out.println(value);
The output will show:
123
The easiest way I found is this (note that this is Dart, not Java):
void main() {
String input = "hello world, this is a line of text";
print(input.split(" ").first);
}
Output: hello
Assuming Delimiter is a blank space here:
Before Java 8:
private String getFirstWord(String sentence){
String delimiter = " "; //Blank space is delimiter here
String[] words = sentence.split(delimiter);
return words[0];
}
After Java 8:
private String getFirstWord(String sentence){
String delimiter = " "; //Blank space is delimiter here
return Arrays.stream(sentence.split(delimiter))
.findFirst()
.orElse("No word found");
}
String anotherPalindrome = "Niagara. O roar again!";
String roar = anotherPalindrome.substring(11, 15);
You can also do it like this.
