java - scanning txt into 2D arrays with own class

To start with, I want to say this is not homework or anything; I just want deeper knowledge about these kinds of arrays with I/O, so feel free to just tell me how you would tackle the problem WITH SCANNER, if it's solvable :P
If I have a txt file like this:
car 1 2 3 4 5
boat 1 2 3 4 5
plane 1 2 3 4 5
and I have made a new class in a new .java file to act as an abstract 2D array:
class Type
{
    String type;
    int number;
    @Override
    public String toString()
    {
        return String.format("%s:%d", type, number);
    }
}
is it possible to get output like:
car:1 car:2 car:3
boat:1 boat:2 boat:3
etc? thanks.
edit: also an ArrayList of course..
edit2:
while (scanner.hasNext())
{
list.add(scanner.hasNext(), 0); //the array should be <car, 0>
} //later i will loop through numbers

The pseudo-code of what I'm suggesting is this:
create line Scanner
while scanner has next *line*
get the next line
use String#split(" ") to create a String array from this line
Create your object from the items in this array
Add your new object into your list
end while loop
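A minimal sketch of the pseudo-code above in Java, assuming each line holds one word followed by whitespace-separated integers. The names TypeParser and parse are made up for illustration, and the Scanner reads from a String here only to keep the example self-contained; a Scanner built from a File would be used the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class Type {
    final String type;
    final int number;

    Type(String type, int number) {
        this.type = type;
        this.number = number;
    }

    @Override
    public String toString() {
        return type + ":" + number;
    }
}

class TypeParser {
    // Reads the input line by line and creates one Type per number on each line.
    static List<List<Type>> parse(Scanner scanner) {
        List<List<Type>> rows = new ArrayList<>();
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\\s+");
            List<Type> row = new ArrayList<>();
            // parts[0] is the word, everything after it is assumed to be a number
            for (int i = 1; i < parts.length; i++) {
                row.add(new Type(parts[0], Integer.parseInt(parts[i])));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        String input = "car 1 2 3 4 5\nboat 1 2 3 4 5\nplane 1 2 3 4 5\n";
        for (List<Type> row : parse(new Scanner(input))) {
            StringBuilder sb = new StringBuilder();
            for (Type t : row) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(t);
            }
            System.out.println(sb);
        }
    }
}
```

A list of lists stands in for the 2D array here because the number of items per line does not have to be known up front.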

First thing that comes to mind is this (assuming your file will always have that structure):
Read your line and use string.split("\\s+"); to break your line into word tokens (\\s+ denotes one or more whitespace characters in regular expression language).
If you can be certain that the word will always be the first element in the line, then, you can iterate from your 2nd token (your first number) till your last token (your last number) and with each iteration you create a new Type object where the type is the first token and the number is the nth token.
The above should allow you to construct objects which when printed, will yield your desired output.
If, on the other hand, the order of the tokens is not known, but you are sure that in every line you will have one word and n numbers, you can use further regular expressions such as ^\\w+$ and ^\\d+$ to see which token is a word or a number respectively (test for the number first, since \\w also matches digits). Once you know which token is what, you can refer to the above points to get your code to work.
For more information on regular expressions, you can take a look at this tutorial.
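For the case where the word can sit anywhere in the line, a rough sketch could look like the following. TokenSorter and pairs are hypothetical names, and the code assumes exactly one word per line. Note that String.matches already tests the whole token, so the ^ and $ anchors are not needed there.

```java
import java.util.ArrayList;
import java.util.List;

class TokenSorter {
    // Returns strings like "car:1" for a line whose single word may appear
    // anywhere among the numbers. Assumes exactly one word per line.
    static List<String> pairs(String line) {
        String word = null;
        List<Integer> numbers = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (token.matches("\\d+")) {
                // digits only: treat as a number (checked first, \w matches digits too)
                numbers.add(Integer.parseInt(token));
            } else if (token.matches("\\w+")) {
                // any other run of word characters: treat as the word
                word = token;
            }
        }
        List<String> result = new ArrayList<>();
        for (int n : numbers) {
            result.add(word + ":" + n);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairs("1 2 car 3"));
    }
}
```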

Related

Parsing natural text to array

How can I parse natural strings like these:
"10 meters"
"55m"
Into instances of this class:
public class Units {
    public String name; //will be "meters"
    public int howMuch; //will be 10 or 55
}
P.S. I want to do this with NLP libraries. I'm really a noob in NLP, and sorry for my bad English.
It is possible, but I recommend you don't do this. An array usually holds only one type of data, so it cannot hold an int and a String at the same time. If you did do it, you would have to use Object[][].
You could use the following algorithm:
1. Separate the text into words by looping through each character and breaking off a new word each time you encounter a space: this can be stored in a String array. Make sure that each word is stored lowercase.
2. Store a 2-dimensional String array as a database of all the units you want to recognize: this could be done with each sub-array representing one unit and all its equivalent representations: for example, the sub-array for meters might look like {"meter","meters","m"}.
3. Make two parallel ArrayLists: the first representing all numerical values and the second representing their corresponding units.
4. Loop through the list of words from step 1: for each word, check if it is in the format number+unit (without an adjoining space). If so, then split the number off and put it in the first ArrayList. Then, find the unabbreviated unit corresponding with the abbreviated unit given in the text by referring to the 2-dimensional String array (this should be the first index of the sub-array). Add this unit to the second ArrayList. Otherwise, if the word is a single number, check if the next word corresponds with any of the units; if it does, then find its unabbreviated unit (the first index of the sub-array). Then add the number and its unit to their respective ArrayLists.
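A compact sketch of the lookup-table idea for a single phrase, using a plain regular expression rather than an NLP library. UnitParser, KNOWN_UNITS and the "seconds" row are assumptions added for illustration; the canonical name is put first in each row so that "m" and "meter" both come back as "meters", as the question asks.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Units {
    public String name;
    public int howMuch;

    Units(String name, int howMuch) {
        this.name = name;
        this.howMuch = howMuch;
    }
}

class UnitParser {
    // Each row lists one unit: canonical name first, then the accepted spellings.
    private static final String[][] KNOWN_UNITS = {
        {"meters", "meter", "m"},
        {"seconds", "second", "s"}, // hypothetical extra unit
    };

    // A number (at most 9 digits, so it fits in an int), optional whitespace, then letters.
    private static final Pattern NUMBER_THEN_UNIT =
            Pattern.compile("(\\d{1,9})\\s*([a-z]+)");

    // Returns the parsed Units, or null if the text is not "<number><unit>".
    static Units parse(String text) {
        Matcher m = NUMBER_THEN_UNIT.matcher(text.trim().toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            return null;
        }
        String canonical = canonicalName(m.group(2));
        if (canonical == null) {
            return null; // unit not in the table
        }
        return new Units(canonical, Integer.parseInt(m.group(1)));
    }

    private static String canonicalName(String unit) {
        for (String[] row : KNOWN_UNITS) {
            for (String spelling : row) {
                if (spelling.equals(unit)) {
                    return row[0];
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Units u = parse("55m");
        System.out.println(u.howMuch + " " + u.name);
    }
}
```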

I am trying to find the specific character within a String Array. What must I end up using?

Below is the code. I imagine you use a for loop and then another, but I cannot seem to make it work. I attempted research; however, most topics were too complex since I am a novice. I'm trying to find a way to get the fifth character out of each string within the variable. I'll use the information given to me so I can then solve the rest of my program. I have more to do.
public static void main (String[]args)
{
    String[] decoder = {"Nexa2f5", "Z52Bizlm" , "Diskapr" , "emkem9sD", "LaWYrUs", "dAStn78L", "mPTuriye", "aaeeiuUu", "IL8Ctmpn"};
    int character = 4;
    for(int i=0; i<=decoder.length-1; i++)
    {
    }
}
I am trying to get the third and fourth characters of the odd numbered Strings. I am trying to put the letters into an array and decode the message. I am also trying to print the 5th character of all other words. I'm having issues commenting right and I've tried to reply a couple times but no dice.
Within your for loop use the array indexing notation. For example, String current = decoder[0] results in current having the value Nexa2f5. Once you have the String Object (in my example, I named it current) you can use the charAt() method shown in the String class documentation to get the 4th and 5th characters. If you need more help than that, read my comments then update your code and ask another question.
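As a small illustration of charAt() inside the loop, here is a sketch that collects the character at one position from every string. FifthCharacter and charsAt are made-up names; keep in mind that charAt() is zero-based, so the fifth character is at index 4.

```java
class FifthCharacter {
    // Collects the character at the given zero-based index from every
    // string that is long enough to have one.
    static String charsAt(String[] words, int index) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            if (index < word.length()) {
                sb.append(word.charAt(index));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] decoder = {"Nexa2f5", "Z52Bizlm", "Diskapr", "emkem9sD", "LaWYrUs",
                            "dAStn78L", "mPTuriye", "aaeeiuUu", "IL8Ctmpn"};
        // index 4 is the fifth character of each string
        System.out.println(charsAt(decoder, 4));
    }
}
```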

String method java assignment

So I'm having a bit of trouble on one of my assignments. I can't seem to figure out what I need to do. Here is the question:
Points for reading are assigned on the following basis: The first three books read are worth 10 points each. The next three books read are worth 15 points each. All books read over six are worth 20 points each. A student who reads 7 books would be awarded 95 points (30 for the first three plus 45 for the next 3 and 20 for the 7th book).
An external file contains a first and last name followed by an integer (the number of books read)
Print each person's name, the number of books read and the points earned. The names should be in the order last, first, with only the last name in all capital letters. At the bottom of the list print out the average points for all readers and the winner of the contest.
Statements Required: input, output, decision making, loop control, strings
Data Location: prog700c.dat
Sample Output:
Reading Contest
Name             Books   Points
SUMMER Sam       4       45
LAZY Linda       2       20
PRODDER Paul     5       60
MASTER K.C.      8       115
READER Richie    6       75
Average points per reader = 63.0
The winner of the contest is K.C. Master
The external file data is this:
SUMMER Sam 4
LAZY Linda 2
PRODDER Paul 5
MASTER K.C. 8
READER Richie 6
My current code:
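import java.io.*;
import java.util.*;
public class Prog505a
{
    public static void main(String[] args) throws FileNotFoundException
    {
        Scanner kbReader = new Scanner(new File("C:\\Users\\Guest\\Documents\\java programs\\Prog700c\\Prog700c.in"));
        while(kbReader.hasNextLine())
        {
            String data = kbReader.nextLine(); // one "LAST First books" record per line
            // the number of books still has to be extracted from data here
        }
        kbReader.close();
    }
}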
I'm having a problem where I don't know what string method to use to only get the numbers from the external file to use them in the calculations. I know I can do the decision making, etc. but I just don't know what to do on this one part to get only the numbers from the external file. If someone could provide some direction or guidance would be greatly appreciated. Thank you!
You can use String.split("\\s+") to split the line into three parts (check if they are three parts to be sure, print a warning and skip if not). This will return an array of strings. You can use Integer.parseInt() to read the number of books on the third item in the array (position 2, arrays in Java are "zero based").
Note that "\\s+" is a regular expression that matches one or more whitespace characters. The strings between the runs of whitespace (and at the beginning/end of the line) are returned. If regular expressions are not allowed, find the index of the space character within the string with String.indexOf() and use String.substring() - and don't ignore the return value of string operations.
Alternatively you could use next(), next() and nextInt() instead of nextLine() in your scanner.
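To make the split/parseInt suggestion concrete, here is a sketch that pulls the book count out of one line and applies the scoring rule from the assignment. ReadingPoints, booksFromLine and points are names invented for this example, and returning -1 for a malformed line is just one possible convention.

```java
class ReadingPoints {
    // 10 points each for books 1-3, 15 each for books 4-6, 20 each beyond six.
    static int points(int books) {
        int firstThree = Math.max(0, Math.min(books, 3));
        int nextThree = Math.max(0, Math.min(books - 3, 3));
        int beyondSix = Math.max(0, books - 6);
        return firstThree * 10 + nextThree * 15 + beyondSix * 20;
    }

    // Returns the book count from a line such as "SUMMER Sam 4",
    // or -1 if the line does not have exactly three parts or the
    // third part is not an integer.
    static int booksFromLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            return -1;
        }
        try {
            return Integer.parseInt(parts[2]); // third item, index 2
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        int books = booksFromLine("MASTER K.C. 8");
        System.out.println(books + " books, " + points(books) + " points");
    }
}
```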
You could use a regex to do what you're describing. Try something like:
str = str.replaceAll("\\D+","");
This way you would delete all non-digits in a string. After that you would need to parse the String to an integer or any other Number type.
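A short sketch of that approach, with DigitsOnly and digitsToInt as hypothetical names. It assumes the only digits on the line belong to the book count: digits anywhere else in the line would be merged into the same number, and a line without any digits makes Integer.parseInt throw a NumberFormatException.

```java
class DigitsOnly {
    // Strips every non-digit character and parses what is left.
    // Assumes the line contains exactly one number.
    static int digitsToInt(String line) {
        String digits = line.replaceAll("\\D+", "");
        return Integer.parseInt(digits);
    }

    public static void main(String[] args) {
        System.out.println(digitsToInt("MASTER K.C. 8"));
    }
}
```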

Implementing the useDelimiter method

I have the following code. Please keep in mind I'm just starting to learn a language and as such have been looking for fairly simple exercises. Coding etiquette tips and criticism welcome.
import java.util.*;
import java.io.*;
public class Tron
{
    public static void main(String[] args) throws Exception
    {
        int x,z,y = 0;
        File Tron= new File("C:\\Java\\wordtest.txt");
        Scanner word = new Scanner(Tron);
        HashMap<String, Integer> Collection = new HashMap<String, Integer>();
        //noticed that hasNextLine and hasNext both work.....why one over the other?
        while (word.hasNext())
        {
            String s = word.next();
            Collection.get(s);
            if (Collection.containsKey(s))
            {
                Integer n = Collection.get(s);
                n = n+1;
                Collection.put(s,n);
                //why does n++ and n+1 give you different results
            }else
            {
                Collection.put(s,1);
            }
        }
        System.out.println(Collection);
    }
}
Without the use of useDelimiter() I get my desired output based on the file I have:
Far = 2, ran = 4, Frog = 2, Far = 7, fast = 1, etc...
Inserting the useDelimiter method as follows
Scanner word = new Scanner(Tron);
word.useDelimiter("\\p{Punct} \\p{Space}");
provides the following output as it appears in the text file shown below.
the the the the the
frog frog
ran
ran ran ran
fast, fast fast
far, far, far far far far far
Why such a difference in output if useDelimiter was supposed to account for punctuation new lines etc? Probably pretty simple but again first shot at a program. Thanks in advance for any advice.
With word.useDelimiter("\\p{Punct} \\p{Space}") you are actually telling the scanner to look for delimiters consisting of a punctuation character followed by a space followed by another whitespace character. You probably wanted to have one (and only one) of these instead, which would be achieved by something like
word.useDelimiter("\\p{Punct}|\\p{Space}");
or at least one of these, which would look like
word.useDelimiter("[\\p{Punct}\\p{Space}]+");
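Here is a small sketch of the character-class delimiter in a word counter. WordCount and count are hypothetical names, and the Scanner reads from a String so the example runs without a file. The trailing + matters: a delimiter that matches only one character at a time can make the Scanner return empty tokens between a comma and the space that follows it.

```java
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

class WordCount {
    // Counts words, treating any run of punctuation and/or whitespace as one delimiter.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        try (Scanner words = new Scanner(text)) {
            words.useDelimiter("[\\p{Punct}\\p{Space}]+");
            while (words.hasNext()) {
                // add 1 to the existing count, or start at 1 for a new word
                counts.merge(words.next(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("fast, fast fast\nfar, far, far far"));
    }
}
```

The merge call replaces the containsKey/get/put sequence from the question; a TreeMap is used only so the printed order is predictable.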
Update
@Andrzej nicely answered the questions in your code comments (which I forgot about), however he missed one little detail which I would like to expand / put straight here.
why does n++ and n+1 give you different results
This obviously relates to the line
n = n+1;
and my hunch is that the alternative you tried was
n = n++;
which indeed gives confusing results (namely the end result is that n is not incremented).
The reason is that n++ (the postfix increment operator by its canonical name) increments the value of n but the result of the expression is the original value of n! So the correct way to use it is simply
n++;
the result of which is equivalent to n = n+1.
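The three variants side by side, as a sketch with made-up method names:

```java
class PostfixIncrement {
    // n++ evaluates to the old value, which is then assigned back over the increment.
    static int assignPostfix(int n) {
        n = n++;
        return n;
    }

    // n++ on its own simply increments n.
    static int plainPostfix(int n) {
        n++;
        return n;
    }

    // n = n + 1 assigns the incremented value.
    static int plusOne(int n) {
        n = n + 1;
        return n;
    }

    public static void main(String[] args) {
        System.out.println(assignPostfix(4)); // prints 4
        System.out.println(plainPostfix(4));  // prints 5
        System.out.println(plusOne(4));       // prints 5
    }
}
```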
Here is a thread with code example which hopefully helps you understand better how these operators work.
Péter is right about the regex, you're matching a very specific sequence rather than a class of characters.
I can answer the questions from your source comments:
noticed that hasNextLine and hasNext both work.....why one over the other?
The Scanner class is declared to implement Iterator<String> (so that it can be used in any situation where you want some arbitrary thing that provides Strings). As such, since the Iterator interface declares a hasNext method, the Scanner needs to implement this with the exact same signature. On the other hand, hasNextLine is a method that the Scanner implements of its own volition.
It's not entirely unusual for a class which implements an interface to declare both a "generically-named" interface method and a more domain-specific method, which both do the same thing. (For example, you might want to implement a game-playing client as an Iterator<GameCommand> - in which case you'd have to declare hasNext, but might want to have a method called isGameUnfinished which did exactly the same thing.)
That said, the two methods aren't identical. hasNext returns true if the scanner has another token to return, whereas hasNextLine returns true if the scanner has another line of input to return.
In practice the difference shows up with trailing whitespace. If the file ends in a newline and you consume all of the tokens with next(), then hasNext returns false (only whitespace is left) while hasNextLine still returns true (the rest of the current line, an empty string, can still be returned by nextLine). If the file doesn't end in a newline and one token is left, both return true. So the two often agree, but they're not technically the same.
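A quick way to see how the two methods relate is to consume some tokens and then ask both. This is only a sketch; HasNextDemo and state are made-up names, and the Scanner reads from a String instead of a file.

```java
import java.util.Scanner;

class HasNextDemo {
    // Consumes the given number of tokens with next(), then reports
    // {hasNext(), hasNextLine()} for whatever input is left.
    static boolean[] state(String input, int tokensToConsume) {
        try (Scanner sc = new Scanner(input)) {
            for (int i = 0; i < tokensToConsume; i++) {
                sc.next();
            }
            return new boolean[] { sc.hasNext(), sc.hasNextLine() };
        }
    }

    public static void main(String[] args) {
        // Input ends in a newline and both tokens have been consumed.
        boolean[] r = state("a b\n", 2);
        System.out.println("hasNext=" + r[0] + ", hasNextLine=" + r[1]);
    }
}
```

With "a b\n" and both tokens consumed, hasNext() is false while hasNextLine() is still true, because the empty remainder of the line up to the newline has not been read yet. With "a b" and one token consumed, both are true.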
why does n++ and n+1 give you different results
This is quite straightforward.
n + 1 simply evaluates to a value that is one greater than the current value of n, and leaves n unchanged. Whereas n++ sets n to be one greater, but the expression itself evaluates to the old value of n (++n is the form that evaluates to the new value).
So if n was currently 4, then n + 1 would evaluate to 5 and n would still be 4, while n++ would evaluate to 4 and n would be 5 afterwards.
In general, it's wise to avoid using the ++ operator except in situations where it's used as boilerplate (such as in for loops over an index). Taking two or three extra characters, or even an extra line, to express your intent more clearly and unambiguously is such a small price that it's almost always worth doing.

Help me understand question related to HashMap in Java

I'm given a task that I am a little confused about. Here is the question statement:
The following program should read a file and store all its tokens in a member variable.
Your task is to write a single method that returns the number of items in tokenMap, the average length (as double value) of the elements in tokenMap, and the number of tokens starting with character "a".
Here the tokenMap is an object of type HashMap<String, Integer>;
I do have some idea about HashMap, but what I want to know is whether the key for the HashMap should be a single character or the whole word that I store in tokenMap.
Also, how can I compute the average length?
Looks like you have to use the entire word as the key.
The average length of tokens can be computed by summing the lengths of each token and dividing by the number of tokens.
In Java, you can find the number of tokens in the HashMap by tokenMap.size().
You can write loops that visit each member of the map like this:
for (String t : tokenMap.keySet()) {
    // t is a token (the tokens are the keys; values() would give the Integers)
}
and if you look up String in the Java API docs you will see that it is easy to find the length of a String.
To compute the average length of the items in a hash map, you'll have to iterate over them all and count the length and calculate the average.
As for your other question about what to use for a key, how are we supposed to know? A hashmap can use practically any* value for a key.
*The value must be hashable, which is defined differently for different languages.
Reading the question closely, it seems that you have to read a file, extract each word and use it as the key value, and store the length of each key as the integer:
an example line
leads to a HashMap like this
an : 2
example : 7
line : 4
After you've built your map (made of keys mapping to entries, or seemingly elements in the question), you'll need to run some statistics over it to find
the number of keys (look at HashMap)
the average length of all keys (again, simple enough)
the number beginning with "a" (just look at the String)
Then make a value object containing these values and return it from the method that does the statistics.
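A sketch of what such a value object and the statistics method might look like, assuming the tokens are the keys of the map. TokenStats and its field names are invented for this example and are not part of the assignment.

```java
import java.util.HashMap;
import java.util.Map;

class TokenStats {
    final int count;            // number of distinct tokens (keys)
    final double averageLength; // average key length
    final int startingWithA;    // keys beginning with "a"

    TokenStats(int count, double averageLength, int startingWithA) {
        this.count = count;
        this.averageLength = averageLength;
        this.startingWithA = startingWithA;
    }

    // Computes all three statistics in one pass over the keys.
    static TokenStats of(Map<String, Integer> tokenMap) {
        int totalLength = 0;
        int startingWithA = 0;
        for (String token : tokenMap.keySet()) {
            totalLength += token.length();
            if (token.startsWith("a")) {
                startingWithA++;
            }
        }
        int count = tokenMap.size();
        // avoid dividing by zero for an empty map
        double average = count == 0 ? 0.0 : (double) totalLength / count;
        return new TokenStats(count, average, startingWithA);
    }

    public static void main(String[] args) {
        Map<String, Integer> tokenMap = new HashMap<>();
        tokenMap.put("an", 2);
        tokenMap.put("example", 7);
        tokenMap.put("line", 4);
        TokenStats s = of(tokenMap);
        System.out.println(s.count + " " + s.averageLength + " " + s.startingWithA);
    }
}
```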
I know I've given more information than you require, but someone else may benefit from a little extra help.
Guys, there is some confusion. I'm not asking for a solution. I'm just confused about one thing.
For the time being, I'm gonna use String as the key type.
The only confusion I have is: once I read the file line by line, should I split it into words or into individual characters? That is, should the key be a single-character String or a String holding a whole word?
If you can go through the question statement, what do you suggest? That's all I'm asking.
should i split it based upon words or based upon each character
The requirement is to make tokens, so you should split them based on words. Each word becomes a unique String key. It would make sense for the value to be the count of each token.
If the file you are reading has these three lines:
int alpha;
int beta;
float delta;
Then you should have something like
<"int", 2>
<";", 3>
<"alpha", 1>
<"beta", 1>
<"float", 1>
<"delta", 1>
(The semicolon may or may not be considered a token.)
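Your average length over all token occurrences would be (3x2 + 1x3 + 5 + 4 + 5 + 5) / 9, since there are 9 tokens in total; averaged over the 6 distinct keys instead, it would be (3 + 1 + 5 + 4 + 5 + 5) / 6.
Your number of tokens starting with "a" would be 1 (alpha).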
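To check the arithmetic for the three-line example, here is a sketch that counts the tokens (treating each punctuation character as its own token, which is an assumption) and computes the average length two ways: over the distinct keys, and over every occurrence. TokenCounter and its method names are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenCounter {
    // A token is a run of word characters or a single punctuation character.
    private static final Pattern TOKEN = Pattern.compile("\\w+|\\p{Punct}");

    static Map<String, Integer> count(String source) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    // Average length where each distinct token counts once.
    static double averageOverKeys(Map<String, Integer> counts) {
        int total = 0;
        for (String token : counts.keySet()) {
            total += token.length();
        }
        return counts.isEmpty() ? 0.0 : (double) total / counts.size();
    }

    // Average length where each token counts as often as it occurred.
    static double averageOverOccurrences(Map<String, Integer> counts) {
        int total = 0;
        int occurrences = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            total += e.getKey().length() * e.getValue();
            occurrences += e.getValue();
        }
        return occurrences == 0 ? 0.0 : (double) total / occurrences;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("int alpha;\nint beta;\nfloat delta;\n");
        System.out.println(counts);
        System.out.println(averageOverKeys(counts));
        System.out.println(averageOverOccurrences(counts));
    }
}
```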
Look elsewhere on this forum for keySet and you should be good to go.
