Reading a string with multiple options - java

I have a String like ",yes,,,,,,,,,,,,," which says that option2 is selected out of 15 options. Here, each comma separates two options; if an option is selected, some data appears in its place. I need to read this string and find out exactly which option was selected. In the example above it should be option2. How shall I do this?
I have 15 options in the database; the selected data replaces its slot here, and slots where nothing was selected are left empty between the commas.
Or, looked at another way, there are 15 fields separated by commas. One field — in the example, the second field — has a non-empty value; the others are all empty. How can I determine the first field that is not empty?

Try String.split(",") - this will return String[]
http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29
public class Split {
public static void main(String [] args) {
String [] options = args[0].split(",",15);
for(int i = 0; i < options.length; i++) {
System.out.printf("option %d = [%s]\n", i, options[i]);
}
}
}
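To answer the question as restated (the first field that is not empty), a minimal sketch along the same lines could look like this. The names FirstOption and firstSelected are made up for illustration, and a limit of -1 is used so that trailing empty fields are kept rather than dropped.

```java
class FirstOption {
    // Returns the 1-based position of the first non-empty field,
    // or -1 if every field is empty.
    static int firstSelected(String encoded) {
        // A negative limit keeps trailing empty strings in the result.
        String[] fields = encoded.split(",", -1);
        for (int i = 0; i < fields.length; i++) {
            if (!fields[i].isEmpty()) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shortened input: the second of five fields is set.
        System.out.println("option " + firstSelected(",yes,,,"));
    }
}
```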

If I understand you correctly, you get a string of commas and between two commas there is some word like "Yes". Your task is to retrieve the index of that word, i.e., the number of commas (plus 1) before the word.
First of all, that encoding for an option is quite stupid, so if it lies in your responsibility, change it.
The simple solution is to count the commas before the word.
Something like this will do:
String s = ",yes,,,,";
for ( int i = 0 , len = s.length() ; i < len ; i++ )
if ( s.charAt(i) != ',' )
return i+1;
throw new Exception ("No option found");

Well, I assume you want to split the string on , like this:
String[] options = ",yes,,,,,,,,,,,,,".split(",");
String option2 = options[1]; //yields "yes"
However, why don't you use some more understandable markup, like option2=yes etc.?
Have a look at Apache Commons CLI for some library to support better options.

If your String always follows the format you specify, you can use String.split(). For example,
String[] split = ",yes,,,,,,,,,,,,,".split(",");
System.out.format("The chosen data is in Option %d and is '%s'%n", split.length, split[split.length - 1]);
prints
The chosen data is in Option 2 and is 'yes'
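This works because split(",") with no limit argument drops trailing empty strings, so the array ends at the last non-empty field. It would report the last selected option, not the first, if more than one were set. A small sketch of the difference, using a shortened input of my own and a made-up class name:

```java
class TrailingEmpties {
    // Number of fields when trailing empty strings are discarded (the default).
    static int fieldsWithoutLimit(String s) {
        return s.split(",").length;
    }

    // Number of fields when a negative limit keeps trailing empty strings.
    static int fieldsWithNegativeLimit(String s) {
        return s.split(",", -1).length;
    }

    public static void main(String[] args) {
        String s = ",yes,,,"; // five fields, only the second one set
        System.out.println(fieldsWithoutLimit(s));      // array stops after "yes"
        System.out.println(fieldsWithNegativeLimit(s)); // all five fields are present
    }
}
```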

Related

How can i extract specific terms from string lines in Java?

I have a serious problem with extracting terms from each string line. To be more specific, I have one csv formatted file which is actually not csv format (it saves all terms into line[0] only)
So, here's just example string line among thousands of string lines:
(split() doesn't work.!!! )
test.csv
"31451 CID005319044   15939353   C8H14O3S2    beta-lipoic acid   C1C[S#](=O)S[C##H]1CCCCC(=O)O "
"12232 COD05374044 23439353  C924O3S2    saponin   CCCC(=O)O "
"9048   CTD042032 23241  C3HO4O3S2 Berberine  [C##H]1CCCCC(=O)O "
I want to extract "beta-lipoic acid" ,"saponin" and "Berberine" only which is located in 5th position.
You can see there are big spaces between terms, so that's why I said 5th position.
In this case, how can I extract terms located in 5th position for each line?
One more thing: the length of whitespace between each of the six terms is not always equal. the length could be one, two, three, four, or five, or something like that.
Because the length of whitespace is random, I can not use the .split() function.
For example, in the first line I would get "beta-lipoic" instead of "beta-lipoic acid".
Here is a solution for your problem using the string split and index of,
import java.util.ArrayList;
public class StringSplit {
public static void main(String[] args) {
String[] seperatedStr = null;
int fourthStrIndex = 0;
String modifiedStr = null, finalStr = null;
ArrayList<String> strList = new ArrayList<String>();
strList.add("31451 CID005319044   15939353   C8H14O3S2 beta-lipoic acid C1C[S#](=O)S[C##H]1CCCCC(=O)O ");
strList.add("12232 COD05374044 23439353 C924O3S2 saponin CCCC(=O)O ");
strList.add("9048 CTD042032 23241 C3HO4O3S2 Berberine [C##H]1CCCCC(=O)O ");
for (String item: strList) {
seperatedStr = item.split("\\s+");
fourthStrIndex = item.indexOf(seperatedStr[3]) + seperatedStr[3].length();
modifiedStr = item.substring(fourthStrIndex, item.length());
finalStr = modifiedStr.substring(0, modifiedStr.indexOf(seperatedStr[seperatedStr.length - 1]));
System.out.println(finalStr.trim());
}
}
}
Output:
beta-lipoic acid
saponin
Berberine
Option 1 : Use String.split and check for multiple consecutive spaces. Like the code below:
String s[] = str.split("\\s\\s+");
for (String string : s) {
System.out.println(string);
}
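Option 2 : Implement your own string split logic by browsing through all the characters. Sample code below (This code is just to give an idea. I did not test this code.)
public static List<String> getData(String str) {
    List<String> list = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int spaces = 0; // length of the current run of spaces
    for (char c : str.toCharArray()) {
        if (c == ' ') {
            spaces++;
            continue;
        }
        if (spaces > 1 && current.length() > 0) {
            // two or more spaces end the current term
            list.add(current.toString());
            current.setLength(0);
        } else if (spaces == 1 && current.length() > 0) {
            // a single space is part of the term
            current.append(' ');
        }
        spaces = 0;
        current.append(c);
    }
    if (current.length() > 0) {
        list.add(current.toString()); // last term
    }
    return list;
}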
This would be a relatively easy fix if it weren't for beta-lipoic acid...
Assuming that only spaces/tabs/other whitespace separate terms, you could split on whitespace.
Pattern whitespace = Pattern.compile("\\s+");
String[] terms = whitespace.split(line); // Not 100% sure of syntax here...
// Your desired term should be index 4 of the terms array
While this would work for the majority of your terms, this would also result in you losing the "acid" in "beta-lipoic acid"...
Another hacky solution would be to add in a check for the 6th spot in the array produced by the above code and see if it matches English letters. If so, you can be reasonably confident that the 6th spot is actually part of the same term as the 5th spot, so you can then concatenate those together. This falls apart pretty quickly though if you have terms with >= 3 words. So something like
Pattern possibleEnglishWord = Pattern.compile("[a-zA-Z]+"); // Can add dashes and such as needed
if (possibleEnglishWord.matcher(terms[5]).matches()) {
    // return terms[4] + " " + terms[5] or something like that
}
Another thing you can try is to replace all groups of spaces with a single space, and then remove everything that isn't made up of just English letters/dashes
line = whitespace.matcher(line).replaceAll(" ");
Pattern notEnglishWord = Pattern.compile("\\S*[^a-zA-Z\\s-]\\S*"); // any token containing something other than a letter or dash
line = notEnglishWord.matcher(line).replaceAll("").trim();
Then hopefully the only thing that is left would be the term you're looking for.
Hopefully this helps, but I do admit it's rather convoluted. One of the issues is that it appears that non-term words may have only one space between them, which would fool Option 1 as presented by Hirak... If that weren't the case that option should work.
Oh by the way, if you do end up doing this, put the Pattern declarations outside of any loops. They only need to be created once.
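If the assumption is that only the name can contain spaces (the first four fields and the final structure string never do), the name is simply everything between the fourth token and the last one. A minimal sketch under that assumption; FifthField and extractName are names made up for this example, and runs of whitespace inside a name are collapsed to a single space:

```java
import java.util.Arrays;

class FifthField {
    // Assumes fields 1-4 and the last field contain no whitespace,
    // so any extra tokens in between belong to the name.
    static String extractName(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 6) {
            throw new IllegalArgumentException("Expected at least 6 fields: " + line);
        }
        // Join tokens 5 .. n-1 (indexes 4 .. length-2) back into one name.
        return String.join(" ", Arrays.copyOfRange(tokens, 4, tokens.length - 1));
    }

    public static void main(String[] args) {
        System.out.println(extractName(
            "31451 CID005319044   15939353   C8H14O3S2    beta-lipoic acid   C1C[S#](=O)S[C##H]1CCCCC(=O)O "));
        System.out.println(extractName(
            "9048   CTD042032 23241  C3HO4O3S2 Berberine  [C##H]1CCCCC(=O)O "));
    }
}
```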

Determining if a given string of words has words greater than 5 letters long

So, I'm in need of help on my homework assignment. Here's the question:
Write a static method, getBigWords, that gets a String parameter and returns an array whose elements are the words in the parameter that contain more than 5 letters. (A word is defined as a contiguous sequence of letters.) So, given a String like "There are 87,000,000 people in Canada", getBigWords would return an array of two elements, "people" and "Canada".
What I have so far:
public static getBigWords(String sentence)
{
String[] a = new String;
String[] split = sentence.split("\\s");
for(int i = 0; i < split.length; i++)
{
if(split[i].length => 5)
{
a.add(split[i]);
}
}
return a;
}
I don't want an answer, just a means to guide me in the right direction. I'm a novice at programming, so it's difficult for me to figure out what exactly I'm doing wrong.
EDIT:
I've now modified my method to:
public static String[] getBigWords(String sentence)
{
ArrayList<String> result = new ArrayList<String>();
String[] split = sentence.split("\\s+");
for(int i = 0; i < split.length; i++)
{
if(split[i].length() > 5)
{
if(split[i].matches("[a-zA-Z]+"))
{
result.add(split[i]);
}
}
}
return result.toArray(new String[0]);
}
It prints out the results I want, but the online software I use to turn in the assignment, still says I'm doing something wrong. More specifically, it states:
Edith de Stance states:
⇒     You might want to use: +=
⇒     You might want to use: ==
⇒     You might want to use: +
not really sure what that means....
The main problem is that you can't have an array that makes itself bigger as you add elements.
You have 2 options:
ArrayList (basically a variable-length array).
Make an array guaranteed to be bigger.
Also, some notes:
The definition of an array needs to look like:
int size = ...; // V- note the square brackets here
String[] a = new String[size];
Arrays don't have an add method, you need to keep track of the index yourself.
You're currently only splitting on spaces, so 87,000,000 will also match. You could validate the string manually to ensure it consists of only letters.
It's >=, not =>.
I believe the function needs to return an array:
public static String[] getBigWords(String sentence)
It actually needs to return something:
return result.toArray(new String[0]);
rather than
return null;
The "You might want to use" suggestions hint that you might have to process the array character by character.
First, try printing out all the elements in your split array. Remember, you only want to look at words. So, check whether this is the case by printing out each element of the split array inside your for loop. (I suspect you will get a false positive at the moment.)
Also, you need to revisit your books on arrays in Java. You can not dynamically add elements to an array. So, you will need a different data structure to be able to use an add() method. An ArrayList of Strings would help you here.
Split your string on the basis of whitespace; it will return an array. You can check the length of each word by iterating over that array.
You can split the string this way: myString.split("\\s+");
Try this...
public static String[] getBigWords(String sentence)
{
java.util.ArrayList<String> result = new java.util.ArrayList<String>();
String[] split = sentence.split("\\s+");
for(int i = 0; i < split.length; i++)
{
if(split[i].length() > 5)
{
if(split[i].matches("[a-zA-Z]+"))
{
result.add(split[i]);
}
if (split[i].matches("[a-zA-Z]+,"))
{
String temp = "";
for(int j = 0; j < split[i].length(); j++)
{
if((split[i].charAt(j))!=((char)','))
{
temp += split[i].charAt(j);
//System.out.print(split[i].charAt(j) + "|");
}
}
result.add(temp);
}
}
}
return result.toArray(new String[0]);
}
What you have done is correct, but you can't use an add method on an array. You should assign like a[position] = split[i]; if you want to ignore numbers, check each character with Character.isDigit().
Your logic is valid, but you have some syntax issues. If you are not using an IDE like Eclipse that shows you syntax errors, try commenting out lines to pinpoint which ones are syntactically incorrect. I want to also tell you that once an array is created its length cannot change. Hopefully that sets you off in the right directions.
Apart from the syntax errors, the String array declaration should look like new String[n],
and arrays have no add method, so you should assign by index instead:
a[i] = split[i];
You need to add another condition along with the length condition to check that the given word consists only of letters. This can be done in 2 ways:
the first way is to use the Character.isLetter() method, and the second way is to create a regular expression
that checks that the string has only letters. Google regular expressions and use a matcher like the one below:
Pattern pattern = Pattern.compile("[a-zA-Z]+");
Matcher matcher = pattern.matcher(split[i]);
The final point is to use another counter (say j = 0) for storing the output values, and increment this counter each time you store a string in the array.
a[j++] = split[i];
I would use a string tokenizer (string tokenizer class in java)
Iterate through each entry and if the string length is more than 4 (or whatever you need) add to the array you are returning.
You said no code, so... (This is like 5 lines of code)
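For reference, the assignment's own definition (a word is a contiguous sequence of letters) can also be applied directly with a Matcher instead of splitting on whitespace. This is only a sketch of that idea, not the graded solution, and the class name BigWords is made up:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BigWords {
    // A "word" is a maximal run of ASCII letters.
    private static final Pattern LETTERS = Pattern.compile("[a-zA-Z]+");

    static String[] getBigWords(String sentence) {
        List<String> result = new ArrayList<>();
        Matcher m = LETTERS.matcher(sentence);
        while (m.find()) {
            if (m.group().length() > 5) { // more than 5 letters
                result.add(m.group());
            }
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String word : getBigWords("There are 87,000,000 people in Canada")) {
            System.out.println(word);
        }
    }
}
```

Because the digits and commas never match the pattern, 87,000,000 is skipped without any extra check.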

How to design a string delimiter for a string that contains many params

I need to pass a string parameter that contains many params. When I receive the parameter, I use String.split() to split it and get all the params.
But one problem occurred: how do I design my string delimiter so that any ASCII character on the keyboard can be passed correctly?
Hoping for any advice.
Maybe you could have a look at variadic arguments instead of splitting a string. For example:
public void method(String... strings) {
// strings is actually an array
String firstParam = strings[0];
String secondParam = strings[1];
// ...
}
Calling:
method("string1");
method("string1", "string2", "string3");
// as many string args as you want
If I understood correctly - you need to encode set of parameters to one string. You can use some sequence of characters for this purpose, E.g.
final String delimiter = "###";
String value = "param1###param2###param3";
String[] parameters = value.split(delimiter);
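One thing to keep in mind with this approach: split() treats the delimiter as a regular expression, so a delimiter such as | or . needs quoting before it is used. A small sketch using Pattern.quote; the class and method names are made up for illustration:

```java
import java.util.regex.Pattern;

class QuotedDelimiter {
    // Splits on the delimiter taken literally, whatever characters it contains.
    static String[] splitLiteral(String value, String delimiter) {
        return value.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Without quoting, "|" would be read as regex alternation.
        for (String p : splitLiteral("param1|param2|param3", "|")) {
            System.out.println(p);
        }
    }
}
```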
Choose a character which is easy to enter and unlikely to appear in the input. Let's assume that character is #.
Normal input would look like Item 1#Item 2#Item 3. Actually, you can .trim() every item and let the user enter Item 1 # Item 2 # Item 3 if s/he prefers.
However, like you describe, say the user would like to enter Item #1, Item #2, etc. There are a few ways to let him/her do this, but the easiest is to let them escape the delimiter. For example, instead of Item #1 # Item #2 # Item #3, which would normally result in 6 different items being found, let the user enter, for example, Item ##1 # Item ##2 # Item ##3. Then in your parsing, make sure to handle the case when two or more #'s have been entered in a row. split likely won't be good enough; you'll have to go through the string yourself.
Here's a sketch of a method which would split the input string for you:
private static List<String> parseArguments(String input) {
ArrayList<String> arguments = new ArrayList<String>();
String[] prelArguments = input.split("#");
for (int i = 0; i < prelArguments.length; i++) {
String argument = prelArguments[i];
if (argument.equals("")) {
// We will enter here if there were two or more #'s in a row
StringBuilder combinedArgument = new StringBuilder(arguments.remove(arguments.size() - 1));
int inARow = 0;
while (prelArguments[i+inARow].equals("")) {
inARow++;
combinedArgument.append('#');
}
i += inARow;
combinedArgument.append(prelArguments[i]);
arguments.add(combinedArgument.toString());
} else {
arguments.add(argument);
}
}
return arguments;
}
Error handling, edge-case handling and some performance improvement is missing from the above, but I think the idea comes through.
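Going through the string character by character, as suggested above, might look roughly like this. It is a sketch under the same assumption (a doubled ## stands for a literal #, a single # separates items); the class name EscapedSplit is made up, and an input such as a###b is read as the item "a#" followed by "b".

```java
import java.util.ArrayList;
import java.util.List;

class EscapedSplit {
    static List<String> parseArguments(String input) {
        List<String> items = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '#' && i + 1 < input.length() && input.charAt(i + 1) == '#') {
                current.append('#'); // "##" is an escaped, literal '#'
                i++;                 // skip the second '#'
            } else if (c == '#') {
                items.add(current.toString().trim()); // single '#' ends the item
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        items.add(current.toString().trim()); // last item
        return items;
    }

    public static void main(String[] args) {
        System.out.println(parseArguments("Item ##1 # Item ##2 # Item ##3"));
    }
}
```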
I would eliminate the problem, which is the misuse of String as an argument container. If you need to pass more parameters, pass more parameters. If this gets out of hand, consider passing a map, or a custom object that can contain all the parameters.

String.split() Not Acting on Semicolon or Space Delimiters

This may be a simple question, but I have been Googling for over an hour and haven't found an answer yet.
I'm trying to simply use the String.split() method with a small Android application to split an input string. The input string will be something along the lines of: "Launch ip:192.168.1.101;port:5900". I'm doing this in two iterations to ensure that all of the required parameters are there. I'm first trying to do a split on spaces and semicolons to get the individual tokens sorted out. Next, I'm trying to split on colons in order to strip off the identification tags of each piece of information.
So, for example, I would expect the first round of split to give me the following data from the above example string:
(1) Launch
(2) ip:192.168.1.101
(3) port:5900
Then the second round would give me the following:
(1) 192.168.1.101
(2) 5900
However, the following code that I wrote doesn't give me what's expected:
private String[] splitString(String inputString)
{
String[] parsedString;
String[] orderedString = new String[SOSLauncherConstants.SOCKET_INPUT_STRING_PARSE_VALUE];
parsedString = inputString.trim().split("; ");
Log.i("info", "The parsed data is as follows for the initially parsed string of size " + parsedString.length + ": ");
for (int i = 0; i < parsedString.length; ++i)
{
Log.i("info", parsedString[i]);
}
for (int i = 0; i < parsedString.length; ++i )
{
if (parsedString[i].toLowerCase().contains(SOSLauncherConstants.PARSED_LAUNCH_COMMAND_VALUE))
{
orderedString[SOSLauncherConstants.PARSED_COMMAND_WORD] = parsedString[i];
}
if (parsedString[i].toLowerCase().contains("ip"))
{
orderedString[SOSLauncherConstants.PARSED_IP_VALUE] = parsedString[i].split(":")[1];
}
else if (parsedString[i].toLowerCase().contains("port"))
{
orderedString[SOSLauncherConstants.PARSED_PORT_VALUE] = parsedString[i].split(":")[1];
}
else if (parsedString[i].toLowerCase().contains("username"))
{
orderedString[SOSLauncherConstants.PARSED_USERNAME_VALUE] = parsedString[i].split(":")[1];
}
else if (parsedString[i].toLowerCase().contains("password"))
{
orderedString[SOSLauncherConstants.PARSED_PASSWORD_VALUE] = parsedString[i].split(":")[1];
}
else if (parsedString[i].toLowerCase().contains("color"))
{
orderedString[SOSLauncherConstants.PARSED_COLOR_VALUE] = parsedString[i].split(":")[1];
}
}
Log.i("info", "The parsed data is as follows for the second parsed string of size " + orderedString.length + ": ");
for (int i = 0; i < orderedString.length; ++i)
{
Log.i("info", orderedString[i]);
}
return orderedString;
}
For a result, I'm getting the following:
The parsed data is as follows for the parsed string of size 1:
launch ip:192.168.1.106;port:5900
The parsed data is as follows for the second parsed string of size 6:
launch ip:192.168.1.106;port:5900
192.168.1.106;port
And then, of course, it crashes because the for loop runs into a null string.
Side Note:
The following snippet is from the constants class that defines all of the string indexes --
public static final int SOCKET_INPUT_STRING_PARSE_VALUE = 6;
public static final int PARSED_COMMAND_WORD = 0;
public static final String PARSED_LAUNCH_COMMAND_VALUE = "launch";
public static final int PARSED_IP_VALUE = 1;
public static final int PARSED_PORT_VALUE = 2;
public static final int PARSED_USERNAME_VALUE = 3;
public static final int PARSED_PASSWORD_VALUE = 4;
public static final int PARSED_COLOR_VALUE = 5;
I looked into needing a possible escape (by inserting a \\ before the semicolon) on the semicolon delimiter, and even tried using it, but that didn't work. The odd part is that neither the space nor the semicolon function as a delimiter, yet the colon works on the second time around. Does anybody have any ideas what would cause this?
Thanks for your time!
EDIT: I should also add that I'm receiving the string over a WiFi socket connection. I don't think this should make a difference, but I'd like you to have all of the information that you need.
String.split(String) takes a regex. Use "[; ]". eg:
"foo;bar baz".split("[; ]")
will return an array containing "foo", "bar" and "baz".
If you need groups of spaces to work as a single delimiter, you can use something like:
"foo;bar baz".split("(;| +)")
I believe String.split() treats its argument as a regex that has to match the whole delimiter, not as a set of individual delimiter characters. That is, split("; ") would not split "a;b c" at all, but would split "a; b".
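Applied to the input from the question, a character class does the first round of splitting and a second split on the colon separates each tag from its value. A minimal sketch; LaunchParser and the "command" key are made up here, and it assumes values never contain a space or semicolon:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LaunchParser {
    static Map<String, String> parse(String input) {
        Map<String, String> values = new LinkedHashMap<>();
        // "[; ]+" matches one or more spaces/semicolons as a single delimiter.
        for (String token : input.trim().split("[; ]+")) {
            // Limit 2 so that only the first ':' separates tag and value.
            String[] pair = token.split(":", 2);
            if (pair.length == 2) {
                values.put(pair[0].toLowerCase(), pair[1]);
            } else {
                values.put("command", pair[0]); // token without a tag
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("Launch ip:192.168.1.101;port:5900"));
    }
}
```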
You may have better luck with Guava's Splitter, which is meant to be slightly less unpredictable than java.lang.String.split.
I would write something like
Iterable<String> splits = Splitter.on(CharMatcher.anyOf("; ")).split(string);
but Splitter also provides fluent-style customization like "trim results" or "skip over empty strings."
Is there a reason why you are using String.split(), but not using Regular Expressions? This is a perfect candidate for regex'es, esp if the string format is consistent.
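I'm not sure if your format is fixed, but if it is, then the following patterns should break it down for you (I'm sure that someone can come up with something more elegant): one picks out the command word, and the other is applied in a loop to every id:value pair, so it also works if several pairs follow:
Pattern commandPattern = Pattern.compile("^\\s*(\\w+)");
Pattern pairPattern = Pattern.compile("(\\w+):([^ ;]+)");
Matcher commandMatcher = commandPattern.matcher(input); // input is the received string
if (commandMatcher.find()) {
    String command = commandMatcher.group(1); // the leading command word
}
Matcher pairMatcher = pairPattern.matcher(input);
while (pairMatcher.find()) {
    String id = pairMatcher.group(1);    // text before the colon
    String value = pairMatcher.group(2); // text after the colon
}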
A great place to test out regexes online is http://www.regexplanet.com/simple/index.html. It allows you to play with the regex without having to compile and launch your app every time if you just want to get the regex correct.

Regular expression for validating an answer to a question

Hey everyone,
I'm having a minor difficulty setting up a regular expression that checks a sentence entered by a user in a textbox against one or more keywords. Essentially, the keywords have to appear in order, one after the other, and can have any number of characters or spaces before, between, and after them (i.e., if the keywords are "crow" and "feet", crow must be somewhere in the sentence before feet, so this statement should be valid: "blah blah sccui crow dsj feet "). The extra characters and, to some extent, the spaces are completely optional (I would like the keywords to have at least a one-space buffer at the beginning and end); the main concern is whether the keywords were entered in their proper order.
So far, I was able to get my regular expression to work within a sentence, but it fails if only the answer itself is entered.
I have the regular expression used in the function below:
// Comparing an answer with the right solution
protected boolean checkAnswer(String a, String s) {
boolean result = false;
//Used to determine if the solution is more than one word
String temp[] = s.split(" ");
//If only one word or letter
if(temp.length == 1)
{
if (s.length() == 1) {
// check multiple choice questions
if (a.equalsIgnoreCase(s)) result = true;
else result = false;
}
else {
// check short answer questions
if ((a.toLowerCase()).matches(".*?\\s*?" + s.toLowerCase() + "\\s*?.*?")) result = true;
else result = false;
}
}
else
{
int count = temp.length;
//Regular expression used to
String regex=".*?\\s*?";
for(int i = 0; i<count;i++)
regex+=temp[i].toLowerCase()+"\\s*?.*?";
//regex+=".*?";
System.out.println(regex);
if ((a.toLowerCase()).matches(regex)) result = true;
else result = false;
}
return result;
}
Any help would greatly be appreciated.
Thanks.
I would go about this in a different way. Instead of trying to use one regular expression, why not use something similar to:
String answer = ... // get the user's answer
int crow = answer.indexOf("crow"); // -1 if "crow" is missing
if( crow >= 0 && crow < answer.indexOf("feet") ) {
// "correct" answer
}
You'll still need to tokenize the words in the correct answer, then check in a loop to see if the index of each word is less than the index of the following word.
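The loop described above could be sketched like this. KeywordOrder and inOrder are hypothetical names, the comparison is made case-insensitive by lower-casing both sides, and each keyword is searched for only after the end of the previous one, so a missing keyword makes the check fail:

```java
class KeywordOrder {
    // True if every keyword occurs in the answer, in the given order.
    static boolean inOrder(String answer, String... keywords) {
        String lower = answer.toLowerCase();
        int from = 0; // where the search for the next keyword starts
        for (String keyword : keywords) {
            int at = lower.indexOf(keyword.toLowerCase(), from);
            if (at < 0) {
                return false; // keyword missing, or only present too early
            }
            from = at + keyword.length();
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(inOrder("blah blah sccui crow dsj feet ", "crow", "feet"));
        System.out.println(inOrder("feet crow", "crow", "feet"));
    }
}
```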
I don't think you need to split the result on " ".
If I understand correctly, you should be able to do something like
String regex="^.*crow.*\\s+.*feet.*";
The problem with the above is that it will match "feetcrow feetcrow".
Maybe something like
String regex="^.*\\s+crow.*\\s+feet\\s+.*";
That will enforce that the word is there as opposed to just in a random block of characters.
Depending on the complexity Bill's answer might be the fastest solution. If you'd prefer a regular expression, I wouldn't look for any spaces, but word boundaries instead. That way you won't have to handle commas, dots, etc. as well:
String regex = "\\bcrow(?:\\b.*\\b)?feet\\b";
This should match "crow bla feet" as well as "crowfeet" and "crow, feet".
Having to match multiple words in a specific order you could just join them together using '(?:\b.*\b)?' without requiring any additional sorting or checking.
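Joining an arbitrary keyword list that way could be sketched as follows. The class and method names are made up, each keyword is passed through Pattern.quote in case it contains regex metacharacters, and find() is used rather than matches() so the keywords may sit anywhere inside a longer sentence:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class OrderedKeywordRegex {
    // Builds \bkey1(?:\b.*\b)?key2...\b from the given keywords.
    static Pattern build(String... keywords) {
        List<String> quoted = new ArrayList<>();
        for (String keyword : keywords) {
            quoted.add(Pattern.quote(keyword));
        }
        String regex = "\\b" + String.join("(?:\\b.*\\b)?", quoted) + "\\b";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    static boolean containsInOrder(String answer, String... keywords) {
        return build(keywords).matcher(answer).find();
    }

    public static void main(String[] args) {
        System.out.println(containsInOrder("blah blah sccui crow dsj feet ", "crow", "feet"));
        System.out.println(containsInOrder("feet crow", "crow", "feet"));
    }
}
```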
Following Bill's answer, I'd try this:
String input = ...; // get user input
String key1 = "crow";
String key2 = "feet";
String[] tokens = input.split(" ");
List<String> list = Arrays.asList(tokens);
int first = list.indexOf(key1); // -1 if key1 is not among the tokens
return first >= 0 && first < list.indexOf(key2);
