I have the following code which reads a sentence from the keyboard and extracts the start and the arrival. The start is identified by the preposition "da" (from), while the arrival is identified by the preposition "a" (to). What I am struggling with now is this: I want to get the start and the arrival correctly even if I change the order of the prepositions. Do you know how I could proceed?
This is the output I get:
I want to go from ostuni to trapani
Partenza :ostuni
Arrivo :trapani
But if I write it like this:
I want to go to ostuni from trapani
I would like the same start and arrival to be printed correctly, that is:
Partenza :trapani
Arrivo :ostuni
Is this processing possible?
thanks a lot for the attention! Good day
package eustema.eubot.controller;
import eustema.eubot.intent.Intent;
public class EubotEngine {
public Intent getIntent(String stringInput) {
String str1 = "";
String str2 = "";
Intent dictionary = null;
for (String str3 : Intent.keyWord) {
if (stringInput.contains(str3)) {
//System.out.println("La stringa contiene : " + str3);
int indice1 = stringInput.indexOf(str3) + str3.length();
String splittable =
stringInput.substring(indice1,stringInput.length()).trim();
String splittable2[] = splittable.split(" ");
int index = 0;
for (String str : splittable2) {
str = splittable2[index +1];
str1 = str;
System.out.println("Partenza :" + str1);
break;
}
String splittable3[] = splittable.split(" ");
for(String str : splittable3) {
str = splittable3[index + 3];
str2 = str;
System.out.println("Arrivo :" + str2);
break;
}
index++;
dictionary = new Intent();
dictionary.setTesto(stringInput);
}
}
return dictionary;
}
}
package eustema.eubot.intent;
public class Intent {
public String testo;
public String getTesto() {
return testo;
}
public void setTesto(String testo) {
this.testo = testo;
}
public static String[] keyWord = { "devo andare", "voglio andare", "vorrei andare", "devo recarmi"};
public static String[] parameter = { "bari", "roma", "milano","pisa","firenze","napoli","como","torino" };
}
package eustema.eubot.main;
import java.util.Scanner;
import eustema.eubot.controller.*;
import eustema.eubot.intent.*;
public class Test {
public static void main(String[] args) {
System.out.println("<<-|-|-|-|-|-|-|-|-|<<<BENVENUTO IN EuBoT>>>|-|-|-|-|-|-|-|-|->>");
EubotEngine controller = new EubotEngine();
Scanner input = new Scanner(System.in);
String string;
while (true) {
string = input.nextLine();
Intent intent = controller.getIntent(string);
}
}
}
I know this will not be considered a good answer:)
This is non-trivial to solve by means of imperative programming. The reason is there are many forms in which one can express the same intent. Things like filler words, synonyms, inversions and in general things you did not think about could disrupt your algorithm.
Of course it depends on the level of accuracy you want to achieve. If you are happy that this will not work for all cases, you could always put in conditions like:
if ("from".equals(arr[index-1])) setStart(arr[index]);
if ("to".equals(arr[index-1])) setDestination(arr[index]);
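A minimal sketch of that idea, assuming lower-case input in which "da" marks the start and "a" marks the arrival, as in the question. The class name RouteParser, the method wordAfter and the two sample sentences are hypothetical and not part of the asker's code. It only handles the simple case and will still be fooled by filler words:

```java
import java.util.Optional;

public class RouteParser {

    // Returns the word that directly follows the given preposition, if there is one.
    static Optional<String> wordAfter(String input, String preposition) {
        String[] words = input.trim().toLowerCase().split("\\s+");
        for (int i = 0; i < words.length - 1; i++) {
            if (words[i].equals(preposition)) {
                return Optional.of(words[i + 1]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "voglio andare da ostuni a trapani",
            "voglio andare a ostuni da trapani"
        };
        for (String input : inputs) {
            // The lookup is keyed on the preposition, so word order no longer matters.
            System.out.println("Partenza :" + wordAfter(input, "da").orElse("?"));
            System.out.println("Arrivo :" + wordAfter(input, "a").orElse("?"));
        }
    }
}
```

Because each city is found through the preposition in front of it rather than through a fixed position in the sentence, both sample sentences give their own correct start and arrival.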
Google, Amazon and Apple are battling to improve this sort of human-computer interaction, but they are using a more mathematical/statistical approach through machine learning.
So, if you're looking for state of the art:
Main search terms: context-free grammars.
Other key words: Markov models, Information extraction, vector space models, tf-idf
I am currently working on a Java program that crawls a webpage and prints out some information from it.
There is one part that I can't figure out: when I try to print out one specific String array with some information in it, all it gives me is " ] " for that line. However, a few lines before, I print out another String array in exactly the same way and it prints fine. When I test what is actually being passed to the "categories" variable, it's the correct information and it can be printed out there.
public class Crawler {
private Document htmlDocument;
String [] keywords, categories;
public void printData(String urlToCrawl)
{
nextURL=urlToCrawl;
crawl();
//This does what its supposed to do. (Print Statement 1)
System.out.print("Keywords: ");
for (String i :keywords) {System.out.print(i+", ");}
//This doesnt. (Print Statement 2)
System.out.print("Categories: ");
for (String b :categories) {System.out.print(b+", ");}
}
public void crawl()
{
//Gather Data
//open up JSOUP for HTTP parsing.
Connection connection = Jsoup.connect(nextURL).userAgent(USER_AGENT);
Document htmlDocument = connection.get();
this.htmlDocument=htmlDocument;
System.out.println("Recieved Webpage "+ nextURL);
int guacCounter = 0;
for(Element guac : htmlDocument.select("script"))
{
if(guacCounter==5)
{
//String concentratedGuac = guac.toString();
String[] items = guac.toString().split("\\n");
categories = processGuac(items);
break;
}
else if(guacCounter<5) {
guacCounter++;
}
}
}
public String[] processKeywords(String totalKeywords)
{
String [] separatedKeywords = totalKeywords.split(",");
//System.out.println(separatedKeywords.toString());
return separatedKeywords;
}
public String[] processGuac(String[] inputGuac)
{
int categoryIsOnLine = 6;
String categoryData = inputGuac[categoryIsOnLine-1];
categoryData = categoryData.replace(",","");
categoryData = categoryData.replace("'","");
categoryData = categoryData.replace("|",",");
categoryData = categoryData.split(":")[1];
//this prints out the list of categories in string form.(Print Statement 3)
System.out.println("Testing here: " + categoryData.toString());
String [] categoryList=categoryData.split(",");
//This prints out the list of categories in array form correctly.(Print statement 4)
System.out.println("Testing here too: " );
for(String a : categoryList) {System.out.println(a);}
return categoryList;
}
}
I cut out a lot of the irrelevant parts of my code so there might be some missing variables.
Here is what my printouts look like:
PS1:
Keywords: What makes a good friend, making friends, signs of a good friend, supporting friends, conflict management,
PS2:
]
PS3:
Testing here: wellbeing,friends-and-family,friendships
PS4:
Testing here too:
wellbeing
friends-and-family
friendships
The program that I am writing is in Java. I am attempting to make my program read the file "name.txt" and store the values of the text file in an array.
So far I am using a text file that will be read in my main program, a service class called People.java which will be used as a template for my program, and my main program called Names.java which will read the text file and store its values into an array.
name.txt:
John!Doe
Jane!Doe
Mike!Smith
John!Smith
George!Smith
People.java:
public class People
{
String firstname = " ";
String lastname = " ";
public People()
{
firstname = "First Name";
lastname = "Last Name";
}
public People(String firnam, String lasnam)
{
firstname = firnam;
lastname = lasnam;
}
public String toString()
{
String str = firstname+" "+lastname;
return str;
}
}
Names.java:
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
import java.util.StringTokenizer;
public class Names
{
public static void main(String[]args)
{
String a = " ";
String b = "empty";
String c = "empty";
int counter = 0;
People[]peoplearray=new People[5];
try
{
File names = new File("name.txt");
Scanner read = new Scanner(names);
while(read.hasNext())
{
a = read.next();
StringTokenizer token = new StringTokenizer("!", a);
while(token.hasMoreTokens())
{
b = token.nextToken();
c = token.nextToken();
People p = new People(b,c);
peoplearray[counter]=p;
++counter;
}
}
}
catch(IOException ioe1)
{
System.out.println("There was a problem reading the file.");
}
System.out.println(peoplearray[0]);
}
}
As I show in my program, I tried to print the value of peoplearray[0], but when I do this, my output reads: "null."
If the program were working correctly, the value of peoplearray[0] should be "John Doe", as those are the first values in "name.txt".
Is the value of peoplearray[0] supposed to be null?
If not, what can I do to fix this problem?
Thanks!
The order of your arguments is wrong:
StringTokenizer token = new StringTokenizer("!", a);
According to the API, the constructor is
public StringTokenizer(String str, String delim)
so use
StringTokenizer token = new StringTokenizer(a,"!");
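A small sketch of why the swapped arguments leave the array full of nulls. The class name TokenizerOrder and the helper splitName are made up for illustration. With the arguments reversed, the text to split is just "!" and every character of the name line counts as a delimiter, so the tokenizer has no tokens and the inner loop in the question never runs:

```java
import java.util.StringTokenizer;

public class TokenizerOrder {

    // Splits a line such as "John!Doe" into first and last name.
    static String[] splitName(String line) {
        StringTokenizer token = new StringTokenizer(line, "!");
        String first = token.nextToken();
        String last = token.nextToken();
        return new String[] { first, last };
    }

    public static void main(String[] args) {
        // Swapped order: "!" is the text, and '!' is also one of the delimiter characters.
        StringTokenizer swapped = new StringTokenizer("!", "John!Doe");
        System.out.println(swapped.hasMoreTokens()); // false

        // Correct order: text first, delimiters second.
        String[] parts = splitName("John!Doe");
        System.out.println(parts[0] + " " + parts[1]); // John Doe
    }
}
```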
I have trouble splitting a name by a space, and I can't seem to figure out why. Could someone please provide me with a solution?
My code is like this:
public void getPlayerNames(int id){
try {
Document root = Jsoup.connect("http://www.altomfotball.no/element.do?cmd=team&teamId=" + id).get();
Element table = root.getElementById("sd_players_table");
Elements names = table.getElementsByTag("a");
for(Element name : names){
getPlayers().add(new Player(name.text()));
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
This returns the names of football players as strings. The names are retrieved in the form Mario Balotelli, Steven Gerrard, and so on, and I assumed I could use string.split(" "); to get the first and last names, but whenever I try to access the second element of the string array I get an index out of bounds exception. Here is the code that tries to fetch the first name:
/**
* Method to get the first name of a player
*/
public static String getFirstName(String name){
String[] nameArray = name.split(" ");
return nameArray[0];
}
Thanks for answers!
Sindre M
EDIT ######
So I got it to work, but thanks for the effort. The problem was that, even though I could not see it in a simple sysout statement, the names actually contained a "&nbsp;" character, so I solved it by running replaceAll("&nbsp;", " ") on the names for better formatting.
If you're trying to write a screen-scraper you need to be more defensive in your code... Definitely test the length of the array first and log any unexpected inputs so you can incorporate them later...
public static String getFirstName(String name) {
String[] nameArray = name.split(" ");
if (nameArray.length >= 1) { // <== check length before you access nameArray[0]
return nameArray[0];
} else {
// log error
}
return null;
}
Additionally java.util.Optional in Java 8 provides a great alternative to returning null...
public static Optional<String> getFirstName(String name) {
String[] nameArray = name.split(" ");
if (nameArray.length >= 1) {
return Optional.of(nameArray[0]);
} else {
// log error
}
return Optional.empty();
}
You might be getting &nbsp; in the actual string, since you are retrieving it from an HTML page. Try debugging to check.
package com.appkart.examples;
public class SplitProgram {
public void firstNameArray(String nameString) {
String strArr[] = nameString.split(",");
for (String name : strArr) {
String playerName = name.trim();
String firstName = playerName.substring(0, playerName.indexOf(" "));
System.out.println(firstName);
}
}
public static void main(String[] args) {
String nameString = "Mario Balotelli, Steven Gerrard";
SplitProgram program = new SplitProgram();
program.firstNameArray(nameString);
}
}
I think that the correct answer should be:
String[] nameArray = name.split("\\s+");
But to be honest, there are already a couple of answers to this on Stack Overflow.
Eg.
How to split a String by space
How do I split a string with any whitespace chars as delimiters?
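One caveat worth checking: by default, \s in a Java regular expression does not match the non-breaking space (U+00A0) that an HTML &nbsp; usually turns into. The sketch below assumes that the stray character in the scraped names is U+00A0. The class name NameSplitter and the sample string are invented for illustration:

```java
public class NameSplitter {

    // Replaces non-breaking spaces with plain spaces, then splits on runs of whitespace.
    static String[] splitName(String name) {
        return name.replace('\u00A0', ' ').trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Looks like "Mario Balotelli" when printed, but the gap is U+00A0.
        String scraped = "Mario\u00A0Balotelli";

        // Plain \s+ does not see the non-breaking space, so nothing is split.
        System.out.println(scraped.split("\\s+").length); // 1

        String[] parts = splitName(scraped);
        System.out.println(parts[0]); // Mario
        System.out.println(parts[1]); // Balotelli
    }
}
```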
First remove the white space, keeping the result because strings are immutable:
string = string.replace(" ","");
Then split on the comma:
String strAr[] = string.split(",");
So I have a string variable which is meant to hold names of cars separated by commas.
String cars = "";
What I want to do is append cars to this string. The way a new car would be added:
String newCar1 = "Mini";
String newCar2 = "LandRover";
appendToCars(newCar1);
appendToCars(newCar2);
Then currently I have this, which I primarily need help with.
public void appendToCars(String newCar)
{
cars = cars + "," + newCar;
}
So output should be:
Mini,LandRover
but it's:
[,]Mini
Been racking my brain about this for hours figuring out how to do it, but I just can't get the result I actually want.
I'm also using a JUnit test for this, which reads:
@Test
public void testAppendToCars() {
System.out.println("appendToCars");
String newCar1 = "Mini";
String newCar2 = "LandRover";
String expResult = newCar1 + "," + newCar2;
testDel.appendToCars(newCar1);
testDel.appendToCars(newCar2);
String result = testDel.getCars();
assertEquals("Delivery notes incorrectly stored", expResult, result);
}
I think you just have a variable scope issue. This example uses your code but takes the scope into consideration:
public class temp {
static String cars = "";
public static void appendToCars(String something)
{
if (cars.equals("")){
cars = something;
}
else {
cars= cars + "," + something;
}
}
public static void main(String[] args){
String newcar1 = "Mini";
String newcar2 = "LandRover";
appendToCars(newcar1);
appendToCars(newcar2);
System.out.println(cars);
}
}
Running this class prints the following:
Mini,LandRover
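An alternative sketch that avoids the empty-string check altogether: keep the cars in a list and only build the comma-separated string when it is requested. The class name Delivery is a guess based on the testDel variable in the question's test; appendToCars and getCars are the method names the test calls:

```java
import java.util.ArrayList;
import java.util.List;

public class Delivery {

    // Each car is stored separately; the separator is only added when joining.
    private final List<String> cars = new ArrayList<>();

    public void appendToCars(String newCar) {
        cars.add(newCar);
    }

    public String getCars() {
        return String.join(",", cars);
    }

    public static void main(String[] args) {
        Delivery delivery = new Delivery();
        delivery.appendToCars("Mini");
        delivery.appendToCars("LandRover");
        System.out.println(delivery.getCars()); // Mini,LandRover
    }
}
```

String.join never emits a leading or trailing separator, so an empty list gives an empty string and a single car gives just its name.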
DurationOfRun:5
ThreadSize:10
ExistingRange:1-1000
NewRange:5000-10000
Percentage:55 - AutoRefreshStoreCategories Data:Previous/30,New/70 UserLogged:true/50,false/50 SleepTime:5000 AttributeGet:1,16,10106,10111 AttributeSet:2060/30,10053/27
Percentage:25 - CrossPromoEditItemRule Data:Previous/60,New/40 UserLogged:true/50,false/50 SleepTime:4000 AttributeGet:1,10107 AttributeSet:10108/34,10109/25
Percentage:20 - CrossPromoManageRules Data:Previous/30,New/70 UserLogged:true/50,false/50 SleepTime:2000 AttributeGet:1,10107 AttributeSet:10108/26,10109/21
I am trying to parse the above .txt file. The first four lines are fixed, and the number of Percentage lines can grow beyond the three shown. I wrote the code below and it works, but it looks messy. Is there a better way to parse this file, and, if performance matters, which approach would be best?
private static int noOfThreads;
private static List<Command> commands;
public static int startRange;
public static int endRange;
public static int newStartRange;
public static int newEndRange;
private static BufferedReader br = null;
private static String sCurrentLine = null;
private static List<String> values;
private static String commandName;
private static String percentage;
private static List<String> attributeIDGet;
private static List<String> attributeIDSet;
private static LinkedHashMap<String, Double> dataCriteria;
private static LinkedHashMap<Boolean, Double> userLoggingCriteria;
private static long sleepTimeOfCommand;
private static long durationOfRun;
br = new BufferedReader(new FileReader("S:\\Testing\\PDSTest1.txt"));
values = new ArrayList<String>();
while ((sCurrentLine = br.readLine()) != null) {
if(sCurrentLine.startsWith("DurationOfRun")) {
durationOfRun = Long.parseLong(sCurrentLine.split(":")[1]);
} else if(sCurrentLine.startsWith("ThreadSize")) {
noOfThreads = Integer.parseInt(sCurrentLine.split(":")[1]);
} else if(sCurrentLine.startsWith("ExistingRange")) {
startRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[0]);
endRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[1]);
} else if(sCurrentLine.startsWith("NewRange")) {
newStartRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[0]);
newEndRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[1]);
} else {
attributeIDGet = new ArrayList<String>();
attributeIDSet = new ArrayList<String>();
dataCriteria = new LinkedHashMap<String, Double>();
userLoggingCriteria = new LinkedHashMap<Boolean, Double>();
percentage = sCurrentLine.split("-")[0].split(":")[1].trim();
values = Arrays.asList(sCurrentLine.split("-")[1].trim().split("\\s+"));
for(String s : values) {
if(s.startsWith("Data")) {
String[] data = s.split(":")[1].split(",");
for (String n : data) {
dataCriteria.put(n.split("/")[0], Double.parseDouble(n.split("/")[1]));
}
//dataCriteria.put(data.split("/")[0], value)
} else if(s.startsWith("UserLogged")) {
String[] userLogged = s.split(":")[1].split(",");
for (String t : userLogged) {
userLoggingCriteria.put(Boolean.parseBoolean(t.split("/")[0]), Double.parseDouble(t.split("/")[1]));
}
//userLogged = Boolean.parseBoolean(s.split(":")[1]);
} else if(s.startsWith("SleepTime")) {
sleepTimeOfCommand = Long.parseLong(s.split(":")[1]);
} else if(s.startsWith("AttributeGet")) {
String[] strGet = s.split(":")[1].split(",");
for(String q : strGet) attributeIDGet.add(q);
} else if(s.startsWith("AttributeSet:")) {
String[] strSet = s.split(":")[1].split(",");
for(String p : strSet) attributeIDSet.add(p);
} else {
commandName = s;
}
}
Command command = new Command();
command.setName(commandName);
command.setExecutionPercentage(Double.parseDouble(percentage));
command.setAttributeIDGet(attributeIDGet);
command.setAttributeIDSet(attributeIDSet);
command.setDataUsageCriteria(dataCriteria);
command.setUserLoggingCriteria(userLoggingCriteria);
command.setSleepTime(sleepTimeOfCommand);
commands.add(command);
Well, parsers usually are messy once you get down to the lower layers of them :-)
However, one possible improvement, at least in terms of code quality, would be to recognize the fact that your grammar is layered.
By that, I mean every line is an identifying token followed by some properties.
In the case of DurationOfRun, ThreadSize, ExistingRange and NewRange, the properties are relatively simple. Percentage is somewhat more complex but still okay.
I would structure the code as (pseudo-code):
def parseFile (fileHandle):
    while (currentLine = fileHandle.getNextLine()) != EOF:
        if currentLine.beginsWith ("DurationOfRun:"):
            processDurationOfRun (currentLine[14:])
        elsif currentLine.beginsWith ("ThreadSize:"):
            processThreadSize (currentLine[11:])
        elsif currentLine.beginsWith ("ExistingRange:"):
            processExistingRange (currentLine[14:])
        elsif currentLine.beginsWith ("NewRange:"):
            processNewRange (currentLine[9:])
        elsif currentLine.beginsWith ("Percentage:"):
            processPercentage (currentLine[11:])
        else:
            raise error
Then, in each of those processWhatever() functions, you parse the remainder of the line based on the expected format. That keeps your code small and readable and easily changed in future, without having to navigate a morass :-)
For example, processDurationOfRun() simply gets an integer from the remainder of the line:
def processDurationOfRun (line):
    this.durationOfRun = line.parseAsInt()
Similarly, the functions for the two ranges split the string on - and get two integers from the resultant values:
def processExistingRange (line):
    values[] = line.split("-")
    this.existingRangeStart = values[0].parseAsInt()
    this.existingRangeEnd = values[1].parseAsInt()
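In Java, the top layer and the simple processors might look roughly like the sketch below. The class name LayeredParser, the helpers rest and parseRange and the field names are all invented here. Percentage lines are deliberately left out, so they fall through to the error branch:

```java
public class LayeredParser {

    long durationOfRun;
    int threadSize;
    int[] existingRange;
    int[] newRange;

    // Top layer: choose a processor from the identifying token at the start of the line.
    void parseLine(String line) {
        if (line.startsWith("DurationOfRun:")) {
            durationOfRun = Long.parseLong(rest(line));
        } else if (line.startsWith("ThreadSize:")) {
            threadSize = Integer.parseInt(rest(line));
        } else if (line.startsWith("ExistingRange:")) {
            existingRange = parseRange(rest(line));
        } else if (line.startsWith("NewRange:")) {
            newRange = parseRange(rest(line));
        } else {
            throw new IllegalArgumentException("Unrecognised line: " + line);
        }
    }

    // Everything after the first colon, without surrounding whitespace.
    static String rest(String line) {
        return line.substring(line.indexOf(':') + 1).trim();
    }

    // Lower layer: "1-1000" becomes {1, 1000}.
    static int[] parseRange(String text) {
        String[] values = text.split("-");
        return new int[] {
            Integer.parseInt(values[0].trim()),
            Integer.parseInt(values[1].trim())
        };
    }

    public static void main(String[] args) {
        LayeredParser parser = new LayeredParser();
        String[] lines = {
            "DurationOfRun:5", "ThreadSize:10",
            "ExistingRange:1-1000", "NewRange:5000-10000"
        };
        for (String line : lines) {
            parser.parseLine(line);
        }
        System.out.println(parser.durationOfRun + " " + parser.threadSize + " "
                + parser.existingRange[0] + "-" + parser.existingRange[1] + " "
                + parser.newRange[0] + "-" + parser.newRange[1]);
        // prints: 5 10 1-1000 5000-10000
    }
}
```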
The processPercentage() function is the tricky one but that is also easily doable if you layer it as well. Assuming those things are always in the same order, it consists of:
an integer;
a literal -;
some sort of textual category; and
a series of key:value pairs.
And even these values within the pairs can be parsed by lower levels, splitting first on commas to get subvalues like Previous/30 and New/70, then splitting each of those subvalues on slashes to get individual items. That way, a logical hierarchy can be reflected in your code.
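A possible Java sketch of that layering for a single Percentage line. The class and method names (PercentageLineParser, parseFields, parseWeights) and the "Name" key used for the bare category token are assumptions made for illustration. The sample line is shortened from the question's data:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PercentageLineParser {

    // Lowest layer: "Previous/30,New/70" becomes {Previous=30.0, New=70.0}.
    static Map<String, Double> parseWeights(String text) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (String item : text.split(",")) {
            String[] pair = item.split("/");
            weights.put(pair[0], Double.parseDouble(pair[1]));
        }
        return weights;
    }

    // Middle layer: splits the space-separated fields into key and raw value.
    // The one field without a colon is the category, stored under "Name".
    static Map<String, String> parseFields(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String field : text.trim().split("\\s+")) {
            int colon = field.indexOf(':');
            if (colon < 0) {
                fields.put("Name", field);
            } else {
                fields.put(field.substring(0, colon), field.substring(colon + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "Percentage:55 - AutoRefreshStoreCategories "
                + "Data:Previous/30,New/70 UserLogged:true/50,false/50 SleepTime:5000";

        // Top layer: the integer, the literal " - ", then everything else.
        String[] halves = line.split(" - ", 2);
        int percentage = Integer.parseInt(halves[0].substring("Percentage:".length()).trim());
        Map<String, String> fields = parseFields(halves[1]);

        System.out.println(percentage);                       // 55
        System.out.println(fields.get("Name"));               // AutoRefreshStoreCategories
        System.out.println(parseWeights(fields.get("Data"))); // {Previous=30.0, New=70.0}
    }
}
```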
Unless you're expecting to parse this text file many times per second, or unless it's many megabytes in size, I'd be more concerned about the readability and maintainability of your code than the speed of the parsing.
Mostly gone are the days when we need to wring the last ounce of performance from our code but we still have problems in fixing said code in a timely manner when bugs are found or enhancements are desired.
Sometimes it's preferable to optimise for readability.
I would not worry about performance until I was sure there was actually a performance issue. Regarding the rest of the code, if you won't be adding any new line types I would not worry about it. If you do worry about it, however, a factory design pattern can help you separate the selection of the type of processing needed from the actual processing. It makes adding new line types easier without introducing as much opportunity for error.
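One lightweight way to get that separation is a lookup table from line type to handler. Note that this is a plain handler map rather than a full factory pattern, and every name in it (LineHandlers, handle, seen) is hypothetical. Selecting the handler and doing the processing are kept apart, so a new line type needs only one more registration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

public class LineHandlers {

    // Line type -> code that processes the rest of the line.
    private final Map<String, Consumer<String>> handlers = new LinkedHashMap<>();

    // Stand-in for real processing: simply records what was seen.
    final Map<String, String> seen = new LinkedHashMap<>();

    LineHandlers() {
        // Supporting a new line type means adding one line here.
        handlers.put("DurationOfRun", value -> seen.put("DurationOfRun", value));
        handlers.put("ThreadSize", value -> seen.put("ThreadSize", value));
    }

    // Selection step: find the handler by the text before the first colon.
    void handle(String line) {
        String[] parts = line.split(":", 2);
        Consumer<String> handler = handlers.get(parts[0]);
        if (handler == null || parts.length < 2) {
            throw new IllegalArgumentException("No handler for: " + line);
        }
        handler.accept(parts[1]);
    }

    public static void main(String[] args) {
        LineHandlers lineHandlers = new LineHandlers();
        lineHandlers.handle("DurationOfRun:5");
        lineHandlers.handle("ThreadSize:10");
        System.out.println(lineHandlers.seen); // {DurationOfRun=5, ThreadSize=10}
    }
}
```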
A newer and more convenient class is Scanner. You just need to modify the delimiter, and you can read the data in the desired format (nextInt, nextLong) in one go, with no need for separate x.parseX calls.
Second: Split your code into small, reusable pieces. They make the program readable, and you can hide details easily.
Don't hesitate to use a struct-like class for a range, for example. Returning multiple values from a method can be done by these, without boilerplate (getter,setter,ctor).
import java.util.*;
import java.io.*;

public class ReadSampleFile
{
    // struct-like classes:
    class PercentageRow {
        public int percentage;
        public String name;
        public int dataPrevious;
        public int dataNew;
        public int userLoggedTrue;
        public int userLoggedFalse;
        public List<Integer> attributeGet;
        public List<Integer> attributeSet;
    }

    class Range {
        public int from;
        public int to;
    }

    // Each read method consumes the key token, checks it, then reads the value(s).
    private int readInt (String name, Scanner sc) {
        String s = sc.next ();
        if (!s.startsWith (name)) {
            throw err (name + " expected, found: " + s);
        }
        return sc.nextInt ();
    }

    private long readLong (String name, Scanner sc) {
        String s = sc.next ();
        if (!s.startsWith (name)) {
            throw err (name + " expected, found: " + s);
        }
        return sc.nextLong ();
    }

    private Range readRange (String name, Scanner sc) {
        String s = sc.next ();
        if (!s.startsWith (name)) {
            throw err (name + " expected, found: " + s);
        }
        Range r = new Range ();
        r.from = sc.nextInt ();
        r.to = sc.nextInt ();
        return r;
    }

    private PercentageRow readPercentageLine (Scanner sc) {
        // reuse above methods
        PercentageRow percentageRow = new PercentageRow ();
        percentageRow.percentage = readInt ("Percentage", sc);
        // ... (the remaining fields are left out of this sketch)
        return percentageRow;
    }

    public ReadSampleFile () throws FileNotFoundException
    {
        /* I only read from my sourcefile for convenience.
           So I could scroll up to see what's the next entry.
           Don't do this at home. :) The dummy later ...
        */
        Scanner sc = new Scanner (new File ("./ReadSampleFile.java"));
        // The trailing + stops runs of delimiters (as in "55 - Name") from yielding empty tokens.
        sc.useDelimiter ("[\\s/,:-]+");
        // ... is the comment I had to insert.
        String dummy = sc.nextLine ();
        if (sc.hasNext ()) {
            // see how nicely the data structure is reflected
            // by this code:
            long duration = readLong ("DurationOfRun", sc);
            int noOfThreads = readInt ("ThreadSize", sc);
            Range eRange = readRange ("ExistingRange", sc);
            Range nRange = readRange ("NewRange", sc);
            List <PercentageRow> percentageRows = new ArrayList <PercentageRow> ();
            // including the repetition ...
            while (sc.hasNext ()) {
                percentageRows.add (readPercentageLine (sc));
            }
        }
    }

    public static void main (String args[]) throws FileNotFoundException
    {
        new ReadSampleFile ();
    }

    // Builds the exception that the read methods throw on unexpected input.
    public static IllegalStateException err (String msg)
    {
        return new IllegalStateException ("Err:\t" + msg);
    }
}