Selenium Webdriver Java validate name field - java

I am not very experienced with Selenium. To test my knowledge, I tried to validate that a name field in a form contains no special characters, but I was not able to do so. First I put the characters in an array and read from the array, but I kept getting an alert failure message. Then I tried the following approach, which always outputs "valid".
import junit.framework.Assert;
import org.openqa.selenium.Alert;
import org.openqa.selenium.By;
import org.openqa.selenium.NoAlertPresentException;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.testng.annotations.Test;
public class NameField {
public static FirefoxDriver fx= new FirefoxDriver();
public static String doCheck()
{
fx.get("http://www.gogamers.com/#!blank/gs4id");
String regex = "^[A-Z0-9+$";
String str=fx.findElement(By.id("comp-iikjotq8nameField")).getText();
fx.findElement(By.id("comp-iikjotq8nameField")).sendKeys("#john");
if (str.matches("[" + regex + "]+")){
System.out.println("Invalid character in Name field");
}
else{
System.out.println("valid");
}
return str;
What I have in mind is that if you enter a name using sendKeys (e.g. John#, #john), you should get an invalid message. I was also wondering whether I should use an assertion. Please suggest the best approach; a small code sample would be helpful.
Here is the new code I tried today. It still prints "Valid Name" when I expect an invalid result. Can someone kindly take a look? I tried both matches and find.
public class YahooMail {
public static void main(String[] args) {
FirefoxDriver fx= new FirefoxDriver();
fx.get("https://login.yahoo.com/account/create?");
String title=fx.getTitle();
Assert.assertTrue(title.contains("Yahoo"));
//First I send a text, then I get the text
fx.findElement(By.id("usernamereg-firstName")).sendKeys("$John");
fx.findElement(By.id("usernamereg-firstName")).getText();
//This is the String I want to find
String firstName="John";
//If there are these symbols associated with the name-show invalid
String patternString = ".*$%^#:.*";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(firstName);
if(matcher.find()){
System.out.println("Invalid Name" );
}
else{
System.out.println("Valid Name");
}
}
}

You can fix your regular expression to match any non-alphanumeric characters and use Pattern and Matcher instead:
Pattern p = Pattern.compile("\\W");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println("Invalid character in Name field");
}
else {
System.out.println("valid");
}
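The check itself does not depend on Selenium, so it can be sketched as a plain helper and exercised on its own. The class name NameValidator and the allowed character set (ASCII letters and digits only) are assumptions made for illustration; in a real test, the string passed in would be the value read back from the field with getAttribute("value").

```java
import java.util.regex.Pattern;

final class NameValidator {
    // Assumption: a valid name consists only of ASCII letters and digits.
    private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z0-9]+");

    // Returns true when the whole name is made of letters and digits.
    static boolean isValidName(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("John"));   // true
        System.out.println(isValidName("#john"));  // false
        System.out.println(isValidName("John#"));  // false
    }
}
```

Keeping the rule in one method also makes it easy to assert on the result in a TestNG or JUnit test instead of printing it.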

To validate a name field, I used the same regular expression that our developers use on the website. In my case the name field accepts only alphanumeric characters. First I wrote a Java function that randomly generates a string of alphanumeric and special characters, shown below, and then I compare this generated input against the regular expression. Since special characters are not allowed in my case, the if condition is false and the else block runs, reporting that the input is not allowed.
//can also be used for complex passwords
public String randomSpecial(int count)
{
String characters = "~`!@#$%^&*()-_=+[{]}\\|;:\'\",<.>/?ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
String generatedString = RandomStringUtils.random(count, characters);
return generatedString;
}
public void TC_05_regExpression_Invalid()
{
String regex = "^[a-zA-Z0-9- ]*$"; // no /.../ delimiters: Java regex strings do not use them
WebElement element = driver.findElement(By.name("firstName"));
element.sendKeys(randomSpecial(10));
String fieldText = element.getAttribute("value");
// matches() already tests the whole string, so the pattern is used as-is
if(fieldText.matches(regex))
{
logger.info("Valid Input: " + fieldText);
}
else
{
logger.info("InValid Input: " + fieldText + " not allowed");
}
element.clear();
}

It is working now. The problem was that I was not capturing the value entered with sendKeys; I should have used getAttribute.
f.get("https://mail.yahoo.com");
f.findElement(By.id("login-username")).sendKeys("jj%jo.com");
//The getAttribute method returns the value of an attribute of an HTML Tag;
//for example if I have an input like this:
WebElement element = f.findElement(By.id("login-username"));
String text = element.getAttribute("value");
System.out.println(text);
if((text).contains("#")){
System.out.println("pass");
}
else{
System.out.println("not pass");
}

public class Personal_loan {
public String verified_number(String inputNumber) // pass the parameter
{
String validation;
String regexNum = "[0-9]+"; //"[A-Za-z]";//"^[A-Z+$]";
if (inputNumber.matches("[" + regexNum + "]+"))
{
System.out.println("valid");
validation="valid";
}
else{
System.out.println("Invalid character in Name field");
validation="invalid";
}
return validation;
}
public String verified_str(String inputStr)
{
String regexString = "[A-Za-z]";//"^[A-Z+$]";
if (inputStr.matches("[" + regexString + "]+"))
{
System.out.println("valid");
}
else{
System.out.println("Invalid character in Name field");
}
return null;
}
public static void main(String[] args) {
System.setProperty("webdriver.chrome.driver", "C:\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
driver.get("https://www.iservefinancial.com/");
driver.findElement(By.xpath("(//DIV[@itemprop='name'])[1]")).click();
WebElement LoanAmount =driver.findElement(By.xpath("//INPUT[@id='amount_qa']"));
WebElement Income =driver.findElement(By.xpath("//INPUT[@id='income_qa']"));
LoanAmount.sendKeys("12345");
Income.sendKeys("amount");
Personal_loan pl=new Personal_loan(); //creating object
String g = LoanAmount.getAttribute("value"); // store the value in string
String incomevalue = Income.getAttribute("value");
String lavalid=pl.verified_number(g);
String income_valid = pl.verified_number(incomevalue);
System.out.println("Loan Amount "+lavalid);
System.out.println("income Amount "+income_valid);
}
}

Related

Error while replace string with symbol in Java

I'm solving this problem:
problem
And what I did is this:
import java.io.*;
import static java.lang.System.exit;
import java.util.*;
//Driver for Abbreviations
public class AbbreviationsDriver {
//string of message
private static String message = "";
//List of Abbreviations
private static String[] AbbreviationsList;
//Abbreviations list file
private static File AbbreviationsListFile = new File("abbreviations.txt");
//message file
private static File inputMessageFile = new File("sample_msg.txt");
//output message file
private static File outputMessageFile = new File("sample_output.txt");
//main method
public static void main(String[] args) throws FileNotFoundException {
setAbbreviations(readFileList(AbbreviationsListFile));
System.out.println("list of abbriviations:\n" + Arrays.toString(AbbreviationsList));
setMessage(readFile(inputMessageFile));
System.out.println("\nMessage in input file:\n" + message);
writeFile(outputMessageFile,addTags(message, AbbreviationsList));
System.out.println("\nMessage with tag in output file:\n" + addTags(message, AbbreviationsList));
}
//method to add tags
public static String addTags(String toTag, String[] abbreviations){
for(String abbreviation:abbreviations)
if(toTag.contains(abbreviation)){
toTag = toTag.replaceAll(abbreviation, "<" + abbreviation + ">");
}
return toTag;
}
//method to read the file list
public static String[] readFileList(File fileInput){
String input = "";
try{
Scanner inputStream = new Scanner(fileInput);
while(inputStream.hasNextLine()){
input = input + inputStream.nextLine()+ "<String>";
}
inputStream.close();
// System.out.println("list in string: " + input);
return input.split("<String>");
}
catch(Exception exception){
System.out.println("error in getting string array from file:\t" + exception.getMessage());
exit(0);
return new String[] {""};
}
}
//method to read the file
public static String readFile(File fileInput){
String inputFile = "";
try{
Scanner inputStatement = new Scanner(fileInput);
while(inputStatement.hasNextLine()){
inputFile = inputFile + inputStatement.nextLine();
}
inputStatement.close();
return inputFile;
}
catch(Exception exception){
System.out.println("error in getting message from file:\t" + exception.getMessage());
exit(0);
return "";
}
}
//method to write the output file
public static void writeFile(File fileName, String outString){
try{
PrintWriter outputStatement = new PrintWriter(fileName);
outputStatement.print(outString);
outputStatement.close();
}
catch(Exception exception){
System.out.println("error in setting message of file:\t" + exception.getMessage());
exit(0);
}
}
//method to set abbreviations
public static void setAbbreviations(String[] newAbbreviationsList){
AbbreviationsList = newAbbreviationsList;
}
//setter to set message
public static void setMessage(String newMessage){
message = newMessage;
}
//input string
public static String inputString(){
return new Scanner(System.in).nextLine();
}
}
abbreviations.txt is here:
lol
:)
iirc
4
u
ttfn
and sample_msg.txt is here:
How are u today? Iirc, this is your first free day. Hope you are having fun! :)
but when I compile and run, the error message comes out:
list of abbriviations:
[lol, :), iirc, 4, u, ttfn]
Message in input file:
How are u today? Iirc, this is your first free day. Hope you are having fun! :)
Exception in thread "main" java.util.regex.PatternSyntaxException: Unmatched closing ')' near index 0
:)
^
at java.util.regex.Pattern.error(Pattern.java:1969)
at java.util.regex.Pattern.compile(Pattern.java:1706)
at java.util.regex.Pattern.<init>(Pattern.java:1352)
at java.util.regex.Pattern.compile(Pattern.java:1028)
at java.lang.String.replaceAll(String.java:2223)
at AbbreviationsDriver.addTags(AbbreviationsDriver.java:44)
at AbbreviationsDriver.main(AbbreviationsDriver.java:36)
Process finished with exit code 1
I don't know how to solve this error because I've never seen this error before.
Please help me!
You are passing the wrong kind of argument to replaceAll(). Its first parameter must be a regex. For your purpose a regex is not needed, so use the replace() method instead.
You got the error because ) is treated as a metacharacter in regex, so it must either be escaped or be paired with an opening counterpart.
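The difference between the two methods can be sketched as follows. The class and method names are hypothetical; Pattern.quote and Matcher.quoteReplacement are the standard library calls for treating the search text and the replacement text literally when replaceAll has to be used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralReplaceDemo {
    // String.replace treats both arguments as plain text.
    static String tagLiteral(String text, String abbreviation) {
        return text.replace(abbreviation, "<" + abbreviation + ">");
    }

    // String.replaceAll compiles its first argument as a regex, so the
    // abbreviation is quoted; the replacement is quoted too, because
    // $ and \ have a special meaning in replacement strings.
    static String tagQuoted(String text, String abbreviation) {
        return text.replaceAll(Pattern.quote(abbreviation),
                Matcher.quoteReplacement("<" + abbreviation + ">"));
    }

    public static void main(String[] args) {
        System.out.println(tagLiteral("Hope you are having fun! :)", ":)"));
        System.out.println(tagQuoted("Hope you are having fun! :)", ":)"));
    }
}
```

Both calls produce the same tagged text; the first is simpler when no regex feature is needed.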
Solution
You need to treat abbreviations with metacharacters and strings without metacharacters differently. For strings with metacharacters (e.g. :) where ) is a metacharacter), you should use String#replace while for the strings without metacharacter you should use String#replaceAll.
When you use String#replaceAll, you should create a capturing group which includes word boundaries e.g. (\bu\b) so that only those u will be processed which appear as a word. Finally, you should replace the capturing group with <$1> where $1 refers to the first (in the code given below, there is only one capturing group) capturing group e.g. (\bu\b) will be replaced by <u>.
Demo:
public class Main {
public static void main(String[] args) {
String[] abbrWithoutMetaChars = { "lol", "iirc", "4", "u", "ttfn" };
String[] abbrWithMetaChars = { ":)" };
// Test string
String str = "How are u today? iirc, this is your first free day. Hope you are having fun! :)";
// Replace all abbr. without meta chars
for (String abbreviation : abbrWithoutMetaChars) {
str = str.replaceAll("(\\b" + abbreviation + "\\b)", "<$1>");
}
// Replace all abbr. with meta chars
for (String abbreviation : abbrWithMetaChars) {
str = str.replace(abbreviation, "<" + abbreviation + ">");
}
System.out.println(str);
}
}
Output:
How are <u> today? <iirc>, this is your first free day. Hope you are having fun! <:)>
The problem is actually tricky. For example, in the list of abbreviations, u should be interpreted as a word and not a letter, since in your expected output you don't surround the letter u in the word your with angle brackets but only the u that appears by itself. Hence your code needs to locate the abbreviation as a single word in the input.
Also, iirc appears in the abbreviations list but in the input you have Iirc (with a capital I) and in the expected output it should appear as <Iirc> and not as <iirc>. In other words you should ignore case when locating the abbreviation but you need to keep the case after surrounding the abbreviation with angle brackets.
Then you have :) in the abbreviations list but ) has special meaning in regular expression syntax so your code also needs to handle that situation.
All the above implies that you need to analyze the contents of the abbreviations list file in order to turn a raw abbreviation into a valid regular expression that you can then use to locate the abbreviation in the input text.
If you assume that the abbreviations list may contain every possible abbreviation, you would probably need a large amount of code to handle each one properly. Rather than do that, I just concentrated on your sample list which divides easily into two groups:
simple words
punctuation only
Note that the second group is also known as emoticons and some emoticons contain both letters and punctuation which my code, below, does not handle. As I said, my solution only pertains to your sample list of abbreviations.
Here is the code, followed by some notes on it. Please note that I took the liberty of not just fixing your code but refactoring it as well.
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
//Driver for Abbreviations
public class AbbreviationsDriver {
//Abbreviations list file
private static Path abbreviationsListPath = Paths.get("abbreviations.txt");
//message file
private static Path inputPath = Paths.get("sample_msg.txt");
//output message file
private static File outputMessageFile = new File("sample_output.txt");
//main method
public static void main(String[] args) throws FileNotFoundException {
List<String> abbreviationsList = readFileList(abbreviationsListPath);
System.out.println("List of abbreviations: " + abbreviationsList);
String message = readFile(inputPath);
System.out.println("\nMessage in input file:\n" + message);
String result = addTags(message, abbreviationsList);
writeFile(outputMessageFile, result);
System.out.println("\nMessage with tag in output file:\n" + result);
}
//method to add tags
public static String addTags(String toTag, List<String> abbreviations) {
for (String abbreviation : abbreviations) {
String regex;
if (abbreviation.contains(")")) {
regex = "(\\Q" + abbreviation + "\\E)";
}
else {
regex = "(?i)(\\b" + abbreviation + "\\b)";
}
toTag = toTag.replaceAll(regex, "<$1>");
}
return toTag;
}
//method to read the file list
public static List<String> readFileList(Path path) {
List<String> list;
try {
list = Files.readAllLines(path);
}
catch (IOException exception) {
list = List.of();
System.out.println("Failed to load: " + path);
exception.printStackTrace();
}
return list;
}
//method to read the file
public static String readFile(Path path) {
String inputFile;
try {
inputFile = Files.readString(path);
}
catch (IOException exception) {
System.out.println("Failed to read: " + path);
exception.printStackTrace();
inputFile = "";
}
return inputFile;
}
//method to write the output file
public static void writeFile(File fileName, String outString) {
try {
PrintWriter outputStatement = new PrintWriter(fileName);
outputStatement.print(outString);
outputStatement.close();
}
catch (Exception exception) {
System.out.println("Failed to write file: " + fileName);
exception.printStackTrace();
}
}
}
I use interface Path rather than class File so that I can use methods of class Files to read the text files that contain the abbreviations list and the input. Hence my code works with interface List rather than with an array of String.
Passing class members to methods as method parameters defeats the purpose of having a class member in the first place. Hence I removed the members message and AbbreviationsList.
The actual work of locating the abbreviations in the input and surrounding them with angle brackets, all occurs in method addTags. Here I handle each separate group of abbreviations. If the abbreviation contains the character ), I quote it by surrounding it with quote markers \Q and \E. (Refer to javadoc of class Pattern). Otherwise the abbreviation is a regular word, so I surround it with the word boundary marker \b. I also enclose each regular expression in parentheses so as to make it a capturing group. Note that the second regular expression begins with (?i) which means to ignore case. Hence iirc will match Iirc.
The replacement string is <$1>. The $1 is replaced with the string that was actually matched so any abbreviation found in the input will be replaced by the matched string surrounded with angle brackets.
Finally, here is the output when running the above code and using your sample data.
List of abbreviations: [lol, :), iirc, 4, u, ttfn]
Message in input file:
How are u today? Iirc, this is your first free day. Hope you are having fun! :)
Message with tag in output file:
How are <u> today? <Iirc>, this is your first free day. Hope you are having fun! <:)>
There are several ways to do this. Either you use regular expressions, or you do things the old-fashioned way by parsing word-by-word. Others have pointed out problems with your current code, due to using strings that contain regular expression metacharacters. In particular,
String doesNotWork = "I am :)".replaceAll(":)", "happy"); // invalid regex
This can be solved by quoting the string with Pattern.quote, which turns metacharacters into literals. It returns the string that would be written as "\\Q:)\\E": \Q and \E delimit a quoted substring, whereas a single \ quotes only the next character when that character is non-alphabetic, and otherwise introduces one of the many regex escape classes:
String worksAsExpected = "I am :)".replaceAll(Pattern.quote(":)"), "happy");
The most efficient way to process text is to do a single pass. This can be achieved by combining literal expressions with |s:
String regex = Stream.of("lol iirc 4".split(" "))
.map(s -> Pattern.quote(s)) // quotes each emoticon
.collect(Collectors.joining("|")); // joins with |
Matcher m = Pattern.compile(regex).matcher(input);
This yields surprisingly compact code, with nothing hardcoded. Finished code:
import java.util.regex.*;
import java.util.stream.*;
public class T {
public static String mark(
String[] needles, String startMark, String endMark, String input) {
String regex = Stream.of(needles)
.map(s -> s.matches("\\p{Alpha}+") ? // quotes each
"\\b" + Pattern.quote(s) + "\\b" : // to avoid yo<u>r
Pattern.quote(s)) // to handle emoticons
.collect(Collectors.joining("|")); // joins with |
Matcher m = Pattern.compile(regex).matcher(input);
StringBuffer output = new StringBuffer();
while (m.find()) {
m.appendReplacement(output, startMark + m.group() + endMark);
}
m.appendTail(output);
return output.toString();
}
public static void main(String ... args) {
System.out.println(mark(
"lol iirc 4 u ttfn :)".split(" "), // abbreviations
"<", ">", // markers to mark them with
"How are u today? iirc, this is your first free day. "
+ "Hope you are having fun! :)"));
}
}
I used @Arvind's trick of placing word-boundary metacharacters (\\b) only on alphabetical needles. This stops every u inside a word from being marked, but it may yield strange results for 4s: a number that contains a 4 will get that digit marked. Ultimately, natural language processing is hard. Regular expressions are great for very regular inputs.
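One way to soften the problem with digits, sketched here under assumptions rather than as a complete solution, is to guard every needle that starts or ends with a letter or digit with lookarounds, so that it cannot match inside a longer word or number. The class AbbreviationMarker and the method guarded are hypothetical names, needles are assumed to be non-empty, and the sample sentence is made up for this sketch.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class AbbreviationMarker {
    // Quotes the needle and, where it starts or ends with a letter or digit,
    // adds a lookaround so it cannot match inside a longer word or number.
    // Assumption: needles are non-empty.
    static String guarded(String needle) {
        boolean wordStart = Character.isLetterOrDigit(needle.charAt(0));
        boolean wordEnd = Character.isLetterOrDigit(needle.charAt(needle.length() - 1));
        return (wordStart ? "(?<!\\p{Alnum})" : "")
                + Pattern.quote(needle)
                + (wordEnd ? "(?!\\p{Alnum})" : "");
    }

    static String mark(String[] needles, String input) {
        String regex = Arrays.stream(needles)
                .map(AbbreviationMarker::guarded)
                .collect(Collectors.joining("|"));
        // $0 is the text that actually matched, so the original case is kept.
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                .matcher(input)
                .replaceAll("<$0>");
    }

    public static void main(String[] args) {
        System.out.println(mark(
                "lol iirc 4 u ttfn :)".split(" "),
                "How are u today? Iirc, I owe you 14 dollars 4 sure :)"));
        // prints: How are <u> today? <Iirc>, I owe you 14 dollars <4> sure <:)>
    }
}
```

Note that \p{Alnum} covers only ASCII letters and digits by default, so this sketch is limited to ASCII text.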

password generation in java using regex

import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main{
public static void main(String[] args)
{
Scanner sc=new Scanner(System.in);
System.out.println("Generate your Security Code ");
String password=sc.next();
if(password.length()>=8)
{
Pattern letter = Pattern.compile("[a-z]{1}[A-z]{1}");
Pattern digit = Pattern.compile("[0-9]");
Pattern special = Pattern.compile ("[@#*]{1}");
Matcher hasLetter = letter.matcher(password);
Matcher hasDigit = digit.matcher(password);
Matcher hasSpecial = special.matcher(password);
if(hasLetter.find() && hasSpecial.find() &&hasDigit.find()){
System.out.println("Security Code Generated Successfully");
}
}
else{
System.out.println("Invalid Security Code, Try Again!");
}
}
}
I wrote code for password generation, but one test case is failing: the digits in the password are not compulsory. How do I do that?
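If you want the digit to be optional in the regex, add a * quantifier to it.
Another way is to use || instead of &&. Note that && binds more tightly than ||, so this condition also accepts a password that contains a digit but no letter or special character: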
if(hasLetter.find() && hasSpecial.find() ||hasDigit.find()){
System.out.println("Security Code Generated Successfully");
}
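If digits are simply meant to be optional, another option is to drop the digit check altogether. The following is a minimal sketch under stated assumptions: the class name SecurityCodeCheck is hypothetical, "a letter" is taken to mean any ASCII letter, and the allowed special characters are assumed to be @, # and *.

```java
import java.util.regex.Pattern;

final class SecurityCodeCheck {
    // Assumption: any ASCII letter satisfies the letter rule.
    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");
    // Assumption: the allowed special characters are @, # and *.
    private static final Pattern SPECIAL = Pattern.compile("[@#*]");

    // Digits may appear but are not required, so they are not checked.
    static boolean isValid(String code) {
        return code.length() >= 8
                && LETTER.matcher(code).find()
                && SPECIAL.matcher(code).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abcdef@g") ? "Security Code Generated Successfully"
                                               : "Invalid Security Code, Try Again!");
    }
}
```

With a single boolean result, the failure message can be printed for every invalid input, not only for inputs that are too short.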

Matcher can't match

I have the following code. I need to check whether the text contains any word from a list of banned words. But even when such a word exists in the text, the matcher doesn't find it. Here is the code:
final ArrayList<String> regexps = config.getProperty(property);
for (String regexp: regexps){
Pattern pt = Pattern.compile("(" + regexp + ")", Pattern.CASE_INSENSITIVE);
Matcher mt = pt.matcher(plainText);
if (mt.find()){
result = result + "message can't be processed because it doesn't satisfy the rule " + property;
reason = false;
System.out.println("reason" + mt.group() + regexp);
}
}
What is wrong? This code can't find the regexp в[ыy][шs]лит[еe] in plainText = "Вышлите пожалуйста новый счет на оплату на Санг, пока согласовывали, уже прошли его сроки. Лиценз...". I also tried other variants of the regexp, but nothing worked.
The trouble is elsewhere.
import java.util.regex.*;
public class HelloWorld {
public static void main(String []args) {
Pattern pt = Pattern.compile("(qwer)");
Matcher mt = pt.matcher("asdf qwer zxcv");
System.out.println(mt.find());
}
}
This prints out true. You may want to use word boundary as delimiter, though:
import java.util.regex.*;
public class HelloWorld {
public static void main(String []args) {
Pattern pt = Pattern.compile("\\bqwer\\b");
Matcher mt = pt.matcher("asdf qwer zxcv");
System.out.println(mt.find());
mt = pt.matcher("asdfqwer zxcv");
System.out.println(mt.find());
}
}
The parentheses are useless unless you need to capture the keyword in a group. But you already have the keyword to begin with.
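One possible cause that the question does not confirm: by default, Pattern.CASE_INSENSITIVE only folds case for US-ASCII characters, so a lowercase Cyrillic letter in the pattern does not match its uppercase form in the text unless Pattern.UNICODE_CASE is added. The sketch below uses the question's pattern and the first word of its text, written with Unicode escapes so that the source compiles under any file encoding; the class name is hypothetical.

```java
import java.util.regex.Pattern;

final class UnicodeCaseDemo {
    // The question's pattern (lowercase Cyrillic with Latin look-alikes).
    static final String REGEX = "\u0432[\u044By][\u0448s]\u043B\u0438\u0442[\u0435e]";
    // The first word of the question's text, starting with an uppercase letter.
    static final String TEXT = "\u0412\u044B\u0448\u043B\u0438\u0442\u0435";

    // Case folding limited to US-ASCII: the uppercase first letter is not matched.
    static boolean findsAsciiOnly() {
        return Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE)
                .matcher(TEXT).find();
    }

    // Unicode-aware case folding: the uppercase first letter is matched.
    static boolean findsUnicodeAware() {
        return Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
                .matcher(TEXT).find();
    }

    public static void main(String[] args) {
        System.out.println(findsAsciiOnly());     // false
        System.out.println(findsUnicodeAware());  // true
    }
}
```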
Use ArrayList's built in functions indexOf(Object o) and contains(Object o) to check if a String exists anywhere in the Array and where.
e.g.
ArrayList<String> keywords = new ArrayList<String>();
keywords.add("hello");
System.out.println(keywords.contains("hello"));
System.out.println(keywords.indexOf("hello"));
outputs:
true
0
Try this to filter out messages which contain banned words using the following regex which uses OR operator.
private static void findBannedWords() {
final ArrayList<String> keywords = new ArrayList<String>();
keywords.add("f$%k");
keywords.add("s!#t");
keywords.add("a$s");
String input = "what the f$%k";
String bannedRegex = "";
for (String keyword: keywords){
// Pattern.quote: the keywords contain regex metacharacters such as $
bannedRegex = bannedRegex + ".*" + Pattern.quote(keyword) + ".*" + "|";
}
Pattern pt = Pattern.compile(bannedRegex.substring(0, bannedRegex.length()-1));
Matcher mt = pt.matcher(input);
if (mt.matches()) {
System.out.println("message can't be processed because it doesn't satisfy the rule ");
}
}
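The same idea can be sketched more compactly by quoting each keyword, joining the results with |, and using find() so that no leading or trailing .* is needed. The class name BannedWordFilter is hypothetical, and this sketch assumes the list holds literal words rather than regular expressions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class BannedWordFilter {
    // Returns true when the input contains any of the literal keywords,
    // ignoring case (including non-ASCII case via UNICODE_CASE).
    static boolean containsBanned(List<String> keywords, String input) {
        if (keywords.isEmpty()) {
            return false; // an empty alternation would match everything
        }
        String regex = keywords.stream()
                .map(Pattern::quote)               // treat $, !, % etc. literally
                .collect(Collectors.joining("|"));
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
                .matcher(input)
                .find();
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("f$%k", "s!#t", "a$s");
        System.out.println(containsBanned(keywords, "what the f$%k")); // true
        System.out.println(containsBanned(keywords, "what the heck")); // false
    }
}
```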

regex pattern to match particular uri from list of urls

I have a list of URLs (lMapValues) with wildcards, as shown in the code below.
I need to match a URI against this list to find the matching URL.
In the code below, the match should be the value of d in the map m.
That is, if part of the URI matches an entry in the list of URLs, that URL should be picked.
I tried splitting the URI into tokens and then checking each token against the list lMapValues. However, it does not give the correct result. The code is below.
public class Matcher
{
public static void main( String[] args )
{
Map m = new HashMap();
m.put("a","https:/abc/eRControl/*");
m.put("b","https://abc/xyz/*");
m.put("c","https://work/Mypage/*");
m.put("d","https://cr/eRControl/*");
m.put("e","https://custom/MyApp/*");
List lMapValues = new ArrayList(m.values());
List tokens = new ArrayList();
String uri = "cr/eRControl/work/custom.jsp";
StringTokenizer st = new StringTokenizer(uri,"/");
while(st.hasMoreTokens()) {
String token = st.nextToken();
tokens.add(token);
}
for(int i=0;i<lMapValues.size();i++) {
String value = (String)lMapValues.get(i);
String patternString = "\\b(" + StringUtils.join(tokens, "|") + ")\\b";
Pattern pattern = Pattern.compile(patternString);
java.util.regex.Matcher matcher = pattern.matcher(value);
while (matcher.find()) {
System.out.println(matcher.group(1));
System.out.println(value);
}
}
}
}
Please help me with regex pattern to achieve above objective.
Any help will be appreciated.
It's much simpler to check if a string starts with a certain value with String.indexOf().
String[] urls = {
"abc/eRControl",
"abc/xyz",
"work/Mypage",
"cr/eRControl",
"custom/MyApp"
};
String uri = "cr/eRControl/work/custom.jsp";
for (String url : urls) {
if (uri.indexOf(url) == 0) {
System.out.println("Matched: " + url);
}else{
System.out.println("Not matched: " + url);
}
}
Also, there is no need to store the scheme in the map if you are never going to match against it.
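If the map values have to keep their scheme and trailing wildcard, the prefix check can still be applied after stripping both. This is a minimal sketch under assumptions: the class name UrlPrefixMatcher is hypothetical, every pattern is assumed to end in "/*", and a match is assumed to require the prefix to end at a path separator.

```java
import java.util.Arrays;
import java.util.Collection;

final class UrlPrefixMatcher {
    // Returns the first pattern whose path prefix matches the URI, or null.
    static String findMatch(Collection<String> patterns, String uri) {
        for (String pattern : patterns) {
            // Drop the scheme ("https:/" or "https://") and the trailing "/*".
            String prefix = pattern.replaceFirst("^https?:/+", "");
            if (prefix.endsWith("/*")) {
                prefix = prefix.substring(0, prefix.length() - 2);
            }
            // Require the prefix to end at a path separator.
            if (uri.equals(prefix) || uri.startsWith(prefix + "/")) {
                return pattern;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Collection<String> patterns = Arrays.asList(
                "https:/abc/eRControl/*", "https://abc/xyz/*", "https://work/Mypage/*",
                "https://cr/eRControl/*", "https://custom/MyApp/*");
        System.out.println(findMatch(patterns, "cr/eRControl/work/custom.jsp"));
        // prints: https://cr/eRControl/*
    }
}
```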
if I understand your goal correctly, you might not even need regular expressions here.
Try this...
package test;
import java.util.HashSet;
import java.util.Set;
public class PartialURLMapper {
private static final Set<String> PARTIAL_URLS = new HashSet<String>();
static {
PARTIAL_URLS.add("cr/eRControl");
// TODO add more partial Strings to check against input
}
public static String getPartialStringIfMatching(final String input) {
if (input != null && !input.isEmpty()) {
for (String partial: PARTIAL_URLS) {
// this will be case-sensitive
if (input.contains(partial)) {
return partial;
}
}
}
// no partial match found, we return an empty String
return "";
}
// main method just to add example
public static void main(String[] args) {
System.out.println(PartialURLMapper.getPartialStringIfMatching("cr/eRControl/work/custom.jsp"));
}
}
... it will return:
cr/eRControl
The problem is that i is acting as a key not as an index on
String value = (String)lMapValues.get(i);
you will be better served exchanging the map for a list, and using the for each loop.
List<String> patterns = new ArrayList<String>();
...
for (String pattern : patterns) {
....
}

Tokenize a string with a space in java

I want to tokenize a string like this
String line = "a=b c='123 456' d=777 e='uij yyy'";
I cannot split based like this
String [] words = line.split(" ");
Any idea how can I split so that I get tokens like
a=b
c='123 456'
d=777
e='uij yyy';
The simplest way to do this is to hand-implement a simple finite state machine. In other words, process the string one character at a time:
When you hit a space, break off a token;
When you hit a quote keep getting characters until you hit another quote.
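The two rules above can be sketched as a small hand-written tokenizer. The class name QuoteAwareTokenizer is hypothetical, and the sketch assumes single quotes only, with no escape character inside quoted values.

```java
import java.util.ArrayList;
import java.util.List;

final class QuoteAwareTokenizer {
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false; // the only state: inside or outside a quoted value
        for (char c : line.toCharArray()) {
            if (c == '\'') {
                inQuotes = !inQuotes;   // enter or leave a quoted value
                current.append(c);      // keep the quote as part of the token
            } else if (c == ' ' && !inQuotes) {
                if (current.length() > 0) { // a space outside quotes ends a token
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // last token has no trailing space
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("a=b c='123 456' d=777 e='uij yyy'")) {
            System.out.println(token);
        }
    }
}
```

Unlike the split-based approaches, this keeps working when a quoted value contains an = character.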
Depending on the formatting of your original string, you should be able to use a regular expression as a parameter to the java "split" method: Click here for an example.
The example doesn't use the regular expression that you would need for this task though.
You can also use this SO thread as a guideline (although it's in PHP) which does something very close to what you need. Manipulating that slightly might do the trick (although having quotes be part of the output or not may cause some issues). Keep in mind that regex is very similar in most languages.
Edit: going too much further into this type of task may be ahead of the capabilities of regex, so you may need to create a simple parser.
line.split(" (?=[a-z+]=)")
correctly gives:
a=b
c='123 456'
d=777
e='uij yyy'
Make sure you adapt the [a-z+] part in case your keys structure changes.
Edit: this solution can fail miserably if there is a "=" character in the value part of the pair.
StreamTokenizer can help, although it is easiest to set up to break on '=', as it will always break at the start of a quoted string:
String s = "Ta=b c='123 456' d=777 e='uij yyy'";
StreamTokenizer st = new StreamTokenizer(new StringReader(s));
st.ordinaryChars('0', '9');
st.wordChars('0', '9');
while (st.nextToken() != StreamTokenizer.TT_EOF) {
switch (st.ttype) {
case StreamTokenizer.TT_NUMBER:
System.out.println(st.nval);
break;
case StreamTokenizer.TT_WORD:
System.out.println(st.sval);
break;
case '=':
System.out.println("=");
break;
default:
System.out.println(st.sval);
}
}
outputs
Ta
=
b
c
=
123 456
d
=
777
e
=
uij yyy
If you leave out the two lines that convert numeric characters to alpha, then you get d=777.0, which might be useful to you.
Assumptions:
Your variable name ('a' in the assignment 'a=b') can be of length 1 or more
Your variable name ('a' in the assignment 'a=b') can not contain the space character, anything else is fine.
Validation of your input is not required (input assumed to be in valid a=b format)
This works fine for me.
Input:
a=b abc='123 456' &=777 #='uij yyy' ABC='slk slk' 123sdkljhSDFjflsakd#*#&=456sldSLKD)#(
Output:
a=b
abc='123 456'
&=777
#='uij yyy'
ABC='slk slk'
123sdkljhSDFjflsakd#*#&=456sldSLKD)#(
Code:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
// SPACE CHARACTER followed by
// sequence of non-space characters of 1 or more followed by
// first occurring EQUALS CHARACTER
final static String regex = " [^ ]+?=";
// static pattern defined outside so that you don't have to compile it
// for each method call
static final Pattern p = Pattern.compile(regex);
public static List<String> tokenize(String input, Pattern p){
input = input.trim(); // this is important for "last token case"
// see end of method
Matcher m = p.matcher(input);
ArrayList<String> tokens = new ArrayList<String>();
int beginIndex=0;
while(m.find()){
int endIndex = m.start();
tokens.add(input.substring(beginIndex, endIndex));
beginIndex = endIndex+1;
}
// LAST TOKEN CASE
//add last token
tokens.add(input.substring(beginIndex));
return tokens;
}
private static void println(List<String> tokens) {
for(String token:tokens){
System.out.println(token);
}
}
public static void main(String args[]){
String test = "a=b " +
"abc='123 456' " +
"&=777 " +
"#='uij yyy' " +
"ABC='slk slk' " +
"123sdkljhSDFjflsakd#*#&=456sldSLKD)#(";
List<String> tokens = RegexTest.tokenize(test, p);
println(tokens);
}
}
Or, with a regex for tokenizing, and a little state machine that just adds the key/val to a map:
String line = "a = b c='123 456' d=777 e = 'uij yyy'";
Map<String,String> keyval = new HashMap<String,String>();
String state = "key";
Matcher m = Pattern.compile("(=|'[^']*?'|[^\\s=]+)").matcher(line);
String key = null;
while (m.find()) {
String found = m.group();
if (state.equals("key")) {
if (found.equals("=") || found.startsWith("'"))
{ System.err.println ("ERROR"); }
else { key = found; state = "equals"; }
} else if (state.equals("equals")) {
if (! found.equals("=")) { System.err.println ("ERROR"); }
else { state = "value"; }
} else if (state.equals("value")) {
if (key == null) { System.err.println ("ERROR"); }
else {
if (found.startsWith("'"))
found = found.substring(1,found.length()-1);
keyval.put (key, found);
key = null;
state = "key";
}
}
}
if (! state.equals("key")) { System.err.println ("ERROR"); }
System.out.println ("map: " + keyval);
prints out
map: {d=777, e=uij yyy, c=123 456, a=b}
It does some basic error checking, and takes the quotes off the values.
This solution is both general and compact (it is effectively the regex version of cletus' answer):
String line = "a=b c='123 456' d=777 e='uij yyy'";
Matcher m = Pattern.compile("('[^']*?'|\\S)+").matcher(line);
while (m.find()) {
System.out.println(m.group()); // or whatever you want to do
}
In other words, find all runs of characters that are combinations of quoted strings or non-space characters; nested quotes are not supported (there is no escape character).
public static void main(String[] args) {
String token;
String value="";
HashMap<String, String> attributes = new HashMap<String, String>();
String line = "a=b c='123 456' d=777 e='uij yyy'";
StringTokenizer tokenizer = new StringTokenizer(line," ");
while(tokenizer.hasMoreTokens()){
token = tokenizer.nextToken();
value = token.contains("'") ? value + " " + token : token ;
if(!value.contains("'") || value.endsWith("'")) {
//Split the strings and get variables into hashmap
attributes.put(value.split("=")[0].trim(),value.split("=")[1]);
value ="";
}
}
System.out.println(attributes);
}
output:
{d=777, a=b, e='uij yyy', c='123 456'}
In this case, consecutive spaces inside a value are collapsed to a single space.
Here the attributes HashMap contains the values.
import java.io.*;
import java.util.Scanner;
public class ScanXan {
public static void main(String[] args) throws IOException {
Scanner s = null;
try {
s = new Scanner(new BufferedReader(new FileReader("<file name>")));
while (s.hasNext()) {
System.out.println(s.next());
// write the token to the output file here
}
} finally {
if (s != null) {
s.close();
}
}
}
}
java.util.StringTokenizer tokenizer = new java.util.StringTokenizer(line, " ");
while (tokenizer.hasMoreTokens()) {
String token = tokenizer.nextToken();
int index = token.indexOf('=');
String key = token.substring(0, index);
String value = token.substring(index + 1);
}
Have you tried splitting by '=' and creating a token out of each pair of the resulting array?
