Translate words in a string using BufferedReader (Java)

I've been working on this for a few days now and I just can't make any headway. I've tried using Scanner and BufferedReader and had no luck.
Basically, I have a working method (shortenWord) that takes a String and shortens it according to a text file formatted like this:
hello,lo
any,ne
anyone,ne1
thanks,thx
It also accounts for punctuation so 'hello?' becomes 'lo?' etc.
I need to be able to read in a String and translate each word individually, so "hello? any anyone thanks!" will become "lo? ne ne1 thx!", basically using the method I already have on each word in the String. The code I have will translate the first word but then does nothing to the rest. I think it's something to do with how my BufferedReader is working.
import java.io.*;
public class Shortener {
private FileReader in ;
/*
* Default constructor that will load a default abbreviations text file.
*/
public Shortener() {
try {
in = new FileReader( "abbreviations.txt" );
}
catch ( Exception e ) {
System.out.println( e );
}
}
public String shortenWord( String inWord ) {
String punc = new String(",?.!;") ;
char finalchar = inWord.charAt(inWord.length()-1) ;
String outWord = new String() ;
BufferedReader abrv = new BufferedReader(in) ;
// ends in punctuation
if (punc.indexOf(finalchar) != -1 ) {
String sub = inWord.substring(0, inWord.length()-1) ;
outWord = sub + finalchar ;
try {
String line;
while ( (line = abrv.readLine()) != null ) {
String[] lineArray = line.split(",") ;
if ( line.contains(sub) ) {
outWord = lineArray[1] + finalchar ;
}
}
}
catch (IOException e) {
System.out.println(e) ;
}
}
// no punctuation
else {
outWord = inWord ;
try {
String line;
while( (line = abrv.readLine()) != null) {
String[] lineArray = line.split(",") ;
if ( line.contains(inWord) ) {
outWord = lineArray[1] ;
}
}
}
catch (IOException ioe) {
System.out.println(ioe) ;
}
}
return outWord;
}
public void shortenMessage( String inMessage ) {
String[] messageArray = inMessage.split("\\s+") ;
for (String word : messageArray) {
System.out.println(shortenWord(word));
}
}
}
Any help, or even a nudge in the right direction would be so much appreciated.
Edit: I've tried closing the BufferedReader at the end of the shortenWord method and it just results in me getting an error on every word in the String after the first one saying that the BufferedReader is closed.

So I took a look at this. First of all, if you have the option to change the format of your text file, I would change it to something like this (or XML):
key1=value1
key2=value2
By doing this you could later use Java's Properties.load(Reader). This would remove the need for any manual parsing of the file.
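As a minimal sketch of that idea, assuming the abbreviations were rewritten as key=value lines: the class name AbbreviationProperties and the inline sample data are made up for illustration, and a StringReader stands in for the FileReader you would use on a real file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class AbbreviationProperties {
    // Loads key=value lines from any Reader into a Properties object.
    static Properties load(Reader source) {
        Properties rules = new Properties();
        try (Reader reader = source) {
            rules.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rules;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; replace with a FileReader on your own file.
        Properties rules = load(new StringReader("hello=lo\nany=ne\nanyone=ne1\nthanks=thx\n"));
        System.out.println(rules.getProperty("anyone"));          // prints ne1
        // Unknown words can fall back to themselves via the default argument.
        System.out.println(rules.getProperty("goodbye", "goodbye")); // prints goodbye
    }
}
```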
If by any chance you don't have the option to change the format, then you'll have to parse it yourself. Something like the code below would do that and put the results into a Map called shortningRules, which could then be used later.
private void parseInput(FileReader reader) {
try (BufferedReader br = new BufferedReader(reader)) {
String line;
while ((line = br.readLine()) != null) {
String[] lineComponents = line.split(",");
this.shortningRules.put(lineComponents[0], lineComponents[1]);
}
} catch (IOException e) {
e.printStackTrace();
}
}
When it comes to actually shortening a message I would probably opt for a regex approach, e.g. \\bKEY\\b, where KEY is the word you want shortened. \\b is an anchor in regex that marks a word boundary; it matches a position rather than a character, so surrounding spaces and punctuation are left in place.
The whole code for doing the shortening would then become something like this:
public void shortenMessage(String message) {
for (Entry<String, String> entry : shortningRules.entrySet()) {
message = message.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
}
System.out.println(message); //This should probably be a return statement instead of a sysout.
}
Putting it all together will give you something like this; here I've added a main for testing purposes.
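The combined listing itself is not reproduced in this thread. As a rough sketch of how the parsing and regex pieces above might fit together (not the answerer's actual code), the class below reads the rules from any Reader and applies them with word boundaries. The class name RegexShortener, the use of Pattern.quote and Matcher.quoteReplacement, and the inline sample data are assumptions added for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexShortener {
    // Keeps the rules in file order.
    private final Map<String, String> shorteningRules = new LinkedHashMap<>();

    RegexShortener(Reader source) {
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split(",");
                if (parts.length == 2) { // skip blank or malformed lines
                    shorteningRules.put(parts[0].trim(), parts[1].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String shortenMessage(String message) {
        for (Map.Entry<String, String> rule : shorteningRules.entrySet()) {
            // Quote the key so regex metacharacters in it are matched literally.
            String wholeWord = "\\b" + Pattern.quote(rule.getKey()) + "\\b";
            message = message.replaceAll(wholeWord, Matcher.quoteReplacement(rule.getValue()));
        }
        return message;
    }

    public static void main(String[] args) {
        // A StringReader stands in for new FileReader("abbreviations.txt").
        RegexShortener s = new RegexShortener(
                new StringReader("hello,lo\nany,ne\nanyone,ne1\nthanks,thx\n"));
        System.out.println(s.shortenMessage("hello? any anyone thanks!")); // prints lo? ne ne1 thx!
    }
}
```

Because the rules are applied one after another, a replacement that happens to equal a later key would be shortened again; with the sample rules that does not occur.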

I think you can have a simpler solution using a HashMap. Read all the abbreviations into the map when the Shortener object is created, and just reference it once you have a word. The word will be the key and the abbreviation the value. Like this:
import java.io.*;
import java.util.HashMap;
public class Shortener {
private FileReader in;
//the map
private HashMap<String, String> abbreviations;
/*
* Default constructor that will load a default abbreviations text file.
*/
public Shortener() {
//initialize the map
this.abbreviations = new HashMap<>();
try {
in = new FileReader("abbreviations.txt" );
BufferedReader abrv = new BufferedReader(in) ;
String line;
while ((line = abrv.readLine()) != null) {
String [] abv = line.split(",");
//If a line does not contain exactly two items, the file is malformed
if (abv.length != 2) {
throw new IllegalArgumentException("Malformed abbreviation file");
}
//populate the map with the word as key and abbreviation as value
abbreviations.put(abv[0], abv[1]);
}
}
catch ( Exception e ) {
System.out.println( e );
}
}
public String shortenWord( String inWord ) {
String punc = new String(",?.!;") ;
char finalchar = inWord.charAt(inWord.length()-1) ;
// ends in punctuation
if (punc.indexOf(finalchar) != -1) {
String sub = inWord.substring(0, inWord.length() - 1);
//Reference map
String abv = abbreviations.get(sub);
if (abv == null)
return inWord;
return new StringBuilder(abv).append(finalchar).toString();
}
// no punctuation
else {
//Reference map
String abv = abbreviations.get(inWord);
if (abv == null)
return inWord;
return abv;
}
}
public void shortenMessage( String inMessage ) {
String[] messageArray = inMessage.split("\\s+") ;
for (String word : messageArray) {
System.out.println(shortenWord(word));
}
}
public static void main (String [] args) {
Shortener s = new Shortener();
s.shortenMessage("hello? any anyone thanks!");
}
}
Output:
lo?
ne
ne1
thx!
Edit:
From atomman's answer, you can basically remove the shortenWord method by modifying the shortenMessage method like this (it needs import java.util.Map.Entry; the \\b word boundaries keep "any" from matching inside "anyone"):
public void shortenMessage(String inMessage) {
for (Entry<String, String> entry : this.abbreviations.entrySet()) {
inMessage = inMessage.replaceAll("\\b" + entry.getKey() + "\\b", entry.getValue());
}
System.out.println(inMessage);
}

Related

Sorting a 2D string array in Java in descending order and writing it to a file

So I have to read out a string from a file in Java. It's for a highscore system.
Each line of the file contains something like this: "24/Kilian".
The number in front of the / is the score and the text after the / is the name.
Now, my problem is that I have to sort the scores in descending order and write them back into the file. The new scores should overwrite the old ones.
I tried it but I can't get it working properly.
I already wrote some code which reads the score + name line by line out of the file.
public static void sortScores() {
String [][]scores = null;
int i = 1;
try (BufferedReader br = new BufferedReader(new FileReader("score.txt"))) {
String line;
while ((line = br.readLine()) != null) {
System.out.println(line);
scores[i][0] = line.substring(0, line.indexOf("/"));
scores[i][1] = line.substring(line.indexOf("/"), line.length());
i++;
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
So, this code basically writes the score and the name in a 2D array like this:
score[0][0] = "24";
score[0][1] = "Kilian";
score[1][0] = "33";
score[1][1] = "Name";
score[2][0] = "45";
score[2][1] = "AnotherName";
I hope someone can help me with my problem.
You can use the sort method of java.util.Arrays:
Arrays.sort(scores, (a, b) -> -a[0].compareTo(b[0]));
But this compares the scores as strings, so "3" will sort above "23". You should probably create a new class that holds the values and use an ArrayList.
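If you want to stay with the 2D array, one option is to parse the score column as a number inside the comparator. This is only a sketch: the class name ScoreArraySort is made up, and it assumes every row has a valid integer in column 0.

```java
import java.util.Arrays;
import java.util.Comparator;

class ScoreArraySort {
    // Sorts rows of {score, name} by numeric score, highest first.
    static void sortDescending(String[][] scores) {
        Arrays.sort(scores,
                Comparator.comparingInt((String[] row) -> Integer.parseInt(row[0])).reversed());
    }

    public static void main(String[] args) {
        String[][] scores = { { "3", "A" }, { "45", "B" }, { "23", "C" } };
        sortDescending(scores);
        System.out.println(Arrays.deepToString(scores)); // prints [[45, B], [23, C], [3, A]]
    }
}
```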
I'd recommend making a new class Score that holds your data (score + name) and adding a new instance of Score to an ArrayList for each row you read from the file. After that you can implement a Comparator and sort your ArrayList. It's much easier because an array needs a fixed size up front, and you don't know how many lines the file holds.
public class Score {
private final int score;
private final String name;
public Score(int score, String name) {
this.score = score;
this.name = name;
}
public int getScore() {
return score;
}
public String getName() {
return name;
}
}
List<Score> scoreList = new ArrayList<>();
String line;
while ((line = br.readLine()) != null) {
int slash = line.indexOf("/");
// the name starts one character after the slash
scoreList.add(new Score(Integer.parseInt(line.substring(0, slash)), line.substring(slash + 1)));
}
Collections.sort(scoreList, new Comparator<Score>() {
@Override
public int compare(Score s1, Score s2) {
// descending order: highest score first
return Integer.compare(s2.getScore(), s1.getScore());
}
});
// write to file
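For the remaining "write to file" step, here is one possible sketch that reads the lines, sorts them by numeric score and overwrites the file. The class name HighscoreFile is hypothetical, the demo uses a temporary file instead of score.txt, and it assumes every line is well formed ("score/name").

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class HighscoreFile {
    // Returns a copy of the "score/name" lines sorted by score, highest first.
    static List<String> sortDescending(List<String> lines) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator.comparingInt(HighscoreFile::scoreOf).reversed());
        return sorted;
    }

    static int scoreOf(String line) {
        return Integer.parseInt(line.substring(0, line.indexOf('/')).trim());
    }

    // Reads the file, sorts it and replaces the old contents.
    static void rewrite(Path file) throws IOException {
        Files.write(file, sortDescending(Files.readAllLines(file)));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("score", ".txt"); // stand-in for score.txt
        Files.write(file, Arrays.asList("24/Kilian", "33/Name", "45/AnotherName"));
        rewrite(file);
        System.out.println(Files.readAllLines(file)); // prints [45/AnotherName, 33/Name, 24/Kilian]
        Files.delete(file);
    }
}
```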
You can try it (note that a Map keyed by score keeps only one name per score, so ties overwrite each other):
HashMap<Integer, String> map = new HashMap<>();
try (BufferedReader br = new BufferedReader(new FileReader("score.txt"))) {
String line;
while ((line = br.readLine()) != null) {
System.out.println(line);
String[] parts = line.split("/");
map.put(Integer.valueOf(parts[0]), parts[1]);
}
// sort the scores in descending order
SortedSet<Integer> keys = new TreeSet<Integer>(Collections.reverseOrder());
keys.addAll(map.keySet());
keys.forEach(k -> System.out.println(map.get(k) + " value " + k));
} catch (IOException e) {
e.printStackTrace();
}
Use Arrays.sort(arr, comparator) with a custom comparator:
Arrays.sort(theArray, new Comparator<String[]>(){
@Override
public int compare(final String[] first, final String[] second){
// here you should usually check that first and second
// a) are not null and b) have at least two items
// compare as numbers, not Strings; in this question the score is in column 0
return Double.valueOf(second[0]).compareTo(
Double.valueOf(first[0])
);
}
});
System.out.println(Arrays.deepToString(theArray));

Java parsing alternative to current solution

I have a text file to parse that requires different logic depending on certain conditions. Below is my current solution, which works. However, I find it very clunky, and I have been looking into other solutions such as StringTokenizer or the Pattern class and am wondering whether I may be able to implement this more elegantly using them.
Do let me know if I should move this to the Code Review forum--I have not initially put it there, as I am unable to implement the other mentioned solutions.
File file = fileChooser.getSelectedFile();
java.io.BufferedReader reader = new java.io.BufferedReader(new java.io.FileReader(file));
memoryMap = new HashMap<Integer, Integer>();
registerMap = new HashMap<Integer, Integer>();
String line = reader.readLine();
while (line != null) {
if (line.contains("#")) {
System.out.println(line);
line = reader.readLine();
}
if (!Character.isDigit(line.charAt(0))) {
System.out.println(line);
String[] setFirstSplit = line.split(":");
if (setFirstSplit[0].equals("M")) {
boolean isFirst = true;
for (String setFirstSegment : setFirstSplit) {
if (!isFirst) {
String[] setSecondSplit = setFirstSegment.split(",");
for (String setSecondSegment : setSecondSplit) {
String[] setThirdSplit = setSecondSegment.split("=");
for (String setThirdSegment : setThirdSplit) {
System.out.println(setThirdSegment);
memoryMap.put(Integer.parseInt(setThirdSplit[0]), Integer.parseInt(setThirdSplit[1]));
System.out.println("Memory Set Result: " + memoryMap);
}
}
} else {
isFirst = false;
}
}
}
if (setFirstSplit[0].equals("R")) {
boolean isFirst = true;
for (String setFirstSegment : setFirstSplit) {
if (!isFirst) {
String[] setSecondSplit = setFirstSegment.split(",");
for (String setSecondSegment : setSecondSplit) {
String[] setThirdSplit = setSecondSegment.split("=");
for (String setThirdSegment : setThirdSplit) {
System.out.println(setThirdSegment);
registerMap.put(Integer.parseInt(setThirdSplit[0]), Integer.parseInt(setThirdSplit[1]));
System.out.println("Register Set Result: " + registerMap);
}
}
} else {
isFirst = false;
}
}
}
line = reader.readLine();
} else {
System.out.println(line);
String[] actionFirstSplit = line.split(" ");
if (actionFirstSplit[1].equals("LOAD")) {
String[] actionSecondSplit = actionFirstSplit[2].split(",");
LoadStep action = new LoadStep();
action.executeStep(Integer.parseInt(actionSecondSplit[0]), Integer.parseInt(actionSecondSplit[1]));
System.out.println("Memory Action Result: " + memoryMap);
System.out.println("Register Action Result: " + registerMap);
}
else {
System.out.println(line);
}
line = reader.readLine();
}
}
reader.close();
The text file looks like this:
# sets the memory address 0 to store the value 1. M stands for memory.
M:0=1,1=11
# All programs starts with an initial setup of values in memory such as the example shown above
0 LOAD 1,3
1 LOAD 0,2
2 ADD 1,2
3 ADD 0,1
4 LSS 1,3,2
5 STOR 62,1
6 STOP
Write it top-down.
String line = reader.readLine();
while (line != null) {
if (parsedComment(line)) {
} else if (parsedMemory(line)) {
} else if (parsedInstruction(line)) {
} else {
error(...);
}
line = reader.readLine();
}
The parse functions can pass their results through fields, like those maps, or take extra parameters.
(If you have multi-line syntax, it may be better to hold the reader in a field rather than pass it as a parameter. You can then read one line ahead into a field and check against that.)
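To make the top-down shape more concrete, here is a minimal sketch under a few assumptions: comments start with #, setup lines start with M: or R:, and instruction lines start with a digit. The class name ProgramParser, the helper parsedSetup and the instructions list are invented for illustration; executing the instructions is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ProgramParser {
    final Map<Integer, Integer> memoryMap = new HashMap<>();
    final Map<Integer, Integer> registerMap = new HashMap<>();
    final List<String> instructions = new ArrayList<>();

    void parse(List<String> lines) {
        for (String line : lines) {
            // Each parse function returns true once it has consumed the line.
            boolean handled = parsedComment(line)
                    || parsedSetup(line, "M:", memoryMap)
                    || parsedSetup(line, "R:", registerMap)
                    || parsedInstruction(line);
            if (!handled) {
                throw new IllegalArgumentException("Unrecognized line: " + line);
            }
        }
    }

    // Blank lines and lines starting with '#' carry no data.
    private boolean parsedComment(String line) {
        return line.trim().isEmpty() || line.startsWith("#");
    }

    // Handles lines such as "M:0=1,1=11" by storing each address=value pair.
    private boolean parsedSetup(String line, String prefix, Map<Integer, Integer> target) {
        if (!line.startsWith(prefix)) {
            return false;
        }
        for (String pair : line.substring(prefix.length()).split(",")) {
            String[] kv = pair.split("=");
            target.put(Integer.parseInt(kv[0].trim()), Integer.parseInt(kv[1].trim()));
        }
        return true;
    }

    // Instruction lines start with a step number, e.g. "0 LOAD 1,3".
    private boolean parsedInstruction(String line) {
        if (!Character.isDigit(line.charAt(0))) {
            return false;
        }
        instructions.add(line);
        return true;
    }

    public static void main(String[] args) {
        ProgramParser parser = new ProgramParser();
        parser.parse(Arrays.asList("# setup", "M:0=1,1=11", "0 LOAD 1,3", "6 STOP"));
        System.out.println("Memory: " + parser.memoryMap);
        System.out.println("Instructions: " + parser.instructions);
    }
}
```

Each kind of line now has a single place where it is recognized and handled, so supporting another kind means adding one more function to the chain.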
You could use a parser generator like ANTLR http://www.antlr.org/

apply extraction information with java

I am trying to apply a dictionary (a file of words) to a text (a file of text):
For each line of the text, we test whether a dictionary word occurs in it; if so, we print the line. Every word of the dictionary is tested against every line of the text.
I used regular expressions (Pattern + Matcher), but the problem is the time: the operation takes 5 hours.
The two files are 3330 KB and 55 KB.
My question is: is there another way to do this, like UNITEX, but in Java?
public class Tratemant_Dic extends Thread {
Tratemant_Dic() {
}
public void run() {
try {
BufferedReader file_corpus = new BufferedReader(
new InputStreamReader(new FileInputStream(
"corpus-medical.TXT"), "UTF-16LE"));
PrintWriter ecrire = new PrintWriter("sort.html");
String line;
String nom = null;
ecrire.write("<mot><span style=\"color:red\">startsss</span></mot></br><ligne>start\n");
while ((line = file_corpus.readLine()) != null) {
BufferedReader file_nom = new BufferedReader(
new InputStreamReader(new FileInputStream(
"Fichie_sorte.DIC"), "UTF-16LE"));
while ((nom = file_nom.readLine()) != null) {
nom = nom.substring(0, nom.length() - 3);
Pattern p = Pattern.compile("(.*)\\W+" + nom + "\\b.*",
Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(line);
if (m.find()) {
System.out.println(nom + "==>" + line);
ecrire.write("<mot><span style=\"color:red\">" + nom
+ "</span></mot></br><ligne>" + line + "\n");
}
}
file_nom.close();
}
ecrire.close();
System.out.println("FIN");
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
If I understand what you are trying to do correctly, I would not use regular expressions to do it. They're slow and you do not need them.
This is really a string matching problem. Your dictionary should probably be stored in a hash table, using the hashCode() method to get a key for the string. You then search in your dictionary for each word as you read it ( calculating the appropriate hash code as you read it ) from the text. Properly done that should be as fast as it gets.
Remember that hash codes are not guaranteed to be unique, so always make sure the actual strings match even if the hash code is found in the table.
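In Java, a HashSet<String> already does the hashing and the equality check for you, so a sketch of this idea can be quite short. The class name DictionaryMatcher and the sample words are hypothetical, and the sketch assumes single-word dictionary entries; multi-word entries would need a different lookup.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class DictionaryMatcher {
    private final Set<String> dictionary = new HashSet<>();

    // Load the dictionary once, lower-cased for case-insensitive lookups.
    DictionaryMatcher(List<String> words) {
        for (String word : words) {
            dictionary.add(word.toLowerCase(Locale.ROOT));
        }
    }

    // Returns the first dictionary word found in the line, or null if there is none.
    String firstMatch(String line) {
        // Split on anything that is not a Unicode letter or digit.
        for (String token : line.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (dictionary.contains(token)) {
                return token;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        DictionaryMatcher matcher = new DictionaryMatcher(Arrays.asList("Aspirine", "Morphine"));
        System.out.println(matcher.firstMatch("Prise d'ASPIRINE, 500 mg")); // prints aspirine
        System.out.println(matcher.firstMatch("rien ici"));                 // prints null
    }
}
```

Each line is tokenized once and every token costs one set lookup, instead of compiling and running one regex per dictionary word per line.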
I would start by timing each of the "things" your application does and then target the slowest item (as mentioned in a comment by Jay, one issue in your case is that you are loading the dictionary for every line) rather than basing the improvements on a guess of what is wrong (the regex being slow).
You can use System.nanoTime() or one of the many stopwatch classes to do this. I normally use Guava.
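A small helper along these lines is enough for a first measurement with System.nanoTime(). The class name StepTimer and the dummy workload are made up; for serious benchmarking a dedicated harness would be more reliable than a single timing.

```java
import java.util.concurrent.TimeUnit;

class StepTimer {
    // Runs the task and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder workload; wrap "load dictionary" or "match one line" here instead.
        long elapsed = timeMillis(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        });
        System.out.println("step took " + elapsed + " ms");
    }
}
```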
Why not, instead of
Pattern p = Pattern.compile("(.*)\\W+" + nom + "\\b.*",
Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(line);
if (m.find()) {
...
just
if(line.indexOf(nom) > -1) {
...
?
Update: if you need word-boundary handling, use something like this:
String lineToLowerCase = line.toLowerCase(); // before second while
...
int index = lineToLowerCase.indexOf(nom.toLowerCase());
if(index > -1) {
if(index ==0 || Character.isWhitespace(lineToLowerCase.charAt(index-1))) {
int indexEnd = index + nom.length();
if (indexEnd >= lineToLowerCase.length() || !Character.isAlphabetic(lineToLowerCase.charAt(indexEnd))) {
...
For testing:
public static void main(String[] s) {
check("skdc s dcd dsf", "dcd"); // print true
check("skdc sdcd dsf", "dcd"); // print false
check("dcd dsf", "dcd"); // print true
check("afasa dcd", "dcd"); // print true
check("afasa dCD11", "dcD"); // print true
check("skdc s dcda dsf", "dcd"); // print false
}
public static void check(String line, String nom) {
String lineToLowerCase = line.toLowerCase();
int index = lineToLowerCase.indexOf(nom.toLowerCase());
if(index > -1) {
if(index ==0 || Character.isWhitespace(lineToLowerCase.charAt(index-1))) {
int indexEnd = index + nom.length();
if (indexEnd >= lineToLowerCase.length() || !Character.isAlphabetic(lineToLowerCase.charAt(indexEnd))) {
System.out.println("true");
return;
}
}
}
System.out.println("false");
}

Handle Empty lines in Java

I am facing a problem with the following code. When I run the program, it terminates when it hits an empty value in my input. How else should I approach this?
try {
BufferedReader sc = new BufferedReader(new FileReader("text.txt"));
ArrayList<String> name = new ArrayList<>();
ArrayList<String> id = new ArrayList<>();
ArrayList<String> place = new ArrayList<>();
ArrayList<String> details = new ArrayList<>();
String line = null;
while ((line = sc.readLine()) !=null) {
if (!line.trim().equals("")) {
System.out.println(line);
if (line.toLowerCase().contains("name")) {
name.add(line.split("=")[1].trim());
}
if (line.toLowerCase().contains("id")) {
id.add(line.split("=")[1].trim());
}
if (line.toLowerCase().contains("location")) {
place.add(line.split("=")[1].trim());
}
if (line.toLowerCase().contains("details")) {
details.add(line.split("=")[1].trim());
}
}
}
PrintWriter pr = new PrintWriter(new File("text.csv"));
pr.println("Name;Id;;Location;Details");
for (int i = 0; i < name.size(); i++) {
pr.println(name.get(i) + ";" + id.get(i) + ";" + place.get(i) + ";" + details.get(i));
}
pr.close();
sc.close();
} catch (Exception e) {
e.printStackTrace();
} }
My Input looks like
name = abc
id = 123
place = xyz
details = hsdyhuslkjaldhaadj
name = ert
id = 7872
place =
details = shahkjdhksdhsala
name = sfd
id = 4343
place = ksjks
Details = kljhaljs
When I run this on the text above, my program terminates at the empty "place =" line because there is no value there. I need the output to contain an empty field for that place and the rest printed as follows in a .csv file.
If you process the location, line.split("=")[1] can throw an ArrayIndexOutOfBoundsException: for a line like "place =", split drops the trailing empty string and returns a one-element array, so there is no index 1.
You can avoid this by testing your parsed result.
Instead of place.add(line.split("=")[1].trim());, do place.add(parseContentDefaultEmpty(line));, with:
private String parseContentDefaultEmpty(final String line) {
final String[] result = line.split("=");
// "place =" splits into a single element, so there is no value to return
if (result.length <= 1) {
return "";
}
return result[1].trim();
}
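A variation on the same idea, offered only as a sketch: split("=", 2) keeps the empty value, and comparing the parsed key exactly avoids the contains checks, where for example a line such as "name = david" also contains "id". The class name KeyValueLine is made up for illustration.

```java
import java.util.Locale;

class KeyValueLine {
    final String key;
    final String value;

    private KeyValueLine(String key, String value) {
        this.key = key;
        this.value = value;
    }

    // Returns null when the line has no '=' at all.
    static KeyValueLine parse(String line) {
        // Limit 2 keeps a trailing empty value, so "place =" yields ["place ", ""].
        String[] parts = line.split("=", 2);
        if (parts.length < 2) {
            return null;
        }
        return new KeyValueLine(parts[0].trim().toLowerCase(Locale.ROOT), parts[1].trim());
    }

    public static void main(String[] args) {
        KeyValueLine empty = KeyValueLine.parse("place =");
        System.out.println(empty.key + " -> [" + empty.value + "]"); // prints place -> []
        KeyValueLine details = KeyValueLine.parse("Details = kljhaljs");
        System.out.println(details.key + " -> [" + details.value + "]"); // prints details -> [kljhaljs]
    }
}
```

In the reading loop you could then test pair.key.equals("place") and so on, and add pair.value to the matching list.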
First, there is an issue: your input file uses the key "place", but your code checks for the word "location".
if (line.toLowerCase().contains("location")) { //this must be changed to place
place.add(line.split("=")[1].trim());
}
I modified the code snippet as below; check it.
while ((line = sc.readLine()) != null) {
if (!line.trim().equals("")) {
System.out.println(line);
if (line.toLowerCase().contains("name")) {
name.add(line.split("=")[1].trim());
}
if (line.toLowerCase().contains("id")) {
id.add(line.split("=")[1].trim());
}
if (line.toLowerCase().contains("place")) {
// change done here to add space if no value
place.add(line.split("=").length > 1 ? line.split("=")[1]
.trim() : " ");
}
if (line.toLowerCase().contains("details")) {
details.add(line.split("=")[1].trim());
}
}
}
Setting question to line doesn't appear to change what line is read later (if you're wanting the line to advance before it hits the while loop).

Issue with text file conversion to arraylist

I apologize in advance if the solution is relatively obvious; however, the AP compsci curriculum at my highschool included practically no involvement with IO or file components. I've been trying to write an elementary flashcard program - thus it would be much more practical to read strings off a text file than add 100 objects to an array. My issue is that when I go to check the size and contents of the ArrayList at the end, it's empty. My source code is as follows:
public class IOReader
{
static ArrayList<FlashCard> cards = new ArrayList<FlashCard>();
static File file = new File("temp.txt");
public static void fillArray() throws FileNotFoundException, IOException
{
FileInputStream fiStream = new FileInputStream(file);
if(file.exists())
{
try( BufferedReader br = new BufferedReader( new InputStreamReader(fiStream) ) )
{
String line;
String[] seperated;
while( (line = br.readLine()) != null)
{
try
{
seperated = line.split(":");
String foreign = seperated[0];
String english = seperated[1];
cards.add( new FlashCard(foreign, english) );
System.out.println(foreign + " : " + english);
}
catch(NumberFormatException | NullPointerException | ArrayIndexOutOfBoundsException e)
{
e.printStackTrace();
}
finally{
br.close();
}
}
}
}
else{
System.err.print("File not found");
throw new FileNotFoundException();
}
}
public static void main(String[] args)
{
try{
fillArray();
}
catch (Exception e){}
for(FlashCard card: cards)
System.out.println( card.toString() );
System.out.print( cards.size() );
}
}
My text file looks as thus:
Volare : To Fly
Velle : To Wish
Facere : To Do / Make
Trahere : To Spin / Drag
Odisse : To Hate
... et alia
My FlashCard class is very simplistic; it merely takes two Strings as parameter. The issue though is that the output whenever I run this is that nothing is printed except for the 0 printed in the main method, indicating that the ArrayList is empty. I thank you in advance for any help, as any would be appreciated.
Some points to consider:
It is fine for fillArray() to declare its exceptions with throws and let the calling code, main() in this case, catch them. That keeps fillArray() more readable and does not hide the exceptions.
I do not think there is any need to check whether the file exists: if it does not, an exception is thrown and main() will handle it.
I used a class named Igal instead of FlashCard; it is equivalent to your FlashCard class.
Code for the Igal class:
public class Igal {
private String st1;
private String st2;
public Igal(String s1, String s2){
st1 = s1;
st2 = s2;
}
/**
* @return the st1
*/
public String getSt1() {
return st1;
}
/**
* @param st1 the st1 to set
*/
public void setSt1(String st1) {
this.st1 = st1;
}
/**
* @return the st2
*/
public String getSt2() {
return st2;
}
/**
* @param st2 the st2 to set
*/
public void setSt2(String st2) {
this.st2 = st2;
}
@Override
public String toString(){
return getSt1() + " " + getSt2();
}
}
Code:
static List<Igal> cards = new ArrayList<>();
static File file = new File("C:\\Users\\xxx\\Documents\\NetBeansProjects\\Dictionary\\src\\temp.txt");
public static void fillArray() throws FileNotFoundException, IOException {
FileInputStream fiStream = new FileInputStream(file);
BufferedReader br = new BufferedReader(new InputStreamReader(fiStream));
String line;
String[] seperated;
while ((line = br.readLine()) != null) {
seperated = line.split(":");
String foreign = seperated[0];
String english = seperated[1];
System.out.println(foreign + " : " + english);
Igal fc = new Igal(foreign, english);
cards.add(fc);
}
}
public static void main(String[] args) {
try {
fillArray();
} catch (IOException e) {
System.out.println(e);
}
System.out.println("------------------");
for (Igal card : cards) {
System.out.println(card.toString());
}
System.out.print("the size is " + cards.size()+"\n");
}
The contents of temp.txt are as follows:
Volare : To Fly
Velle : To Wish
Facere : To Do / Make
Trahere : To Spin / Drag
Odisse : To Hate
output:
------------------
Volare To Fly
Velle To Wish
Facere To Do / Make
Trahere To Spin / Drag
Odisse To Hate
the size is 5
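Reading the question's code, two things look likely to hide the real failure, though this is an inference and not something the thread confirms: the finally block calls br.close() inside the while loop, so the reader is closed after the first line, and the empty catch in main swallows whatever exception follows. A sketch that avoids both, using try-with-resources, is shown below. The class name CardReader is hypothetical, and plain String pairs stand in for the FlashCard or Igal objects.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CardReader {
    // Returns one {foreign, english} pair per well-formed "foreign : english" line.
    static List<String[]> readCards(Reader source) {
        List<String[]> cards = new ArrayList<>();
        // try-with-resources closes the reader once, after the whole loop has finished.
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split(":", 2);
                if (parts.length == 2) { // skip lines without a ':' separator
                    cards.add(new String[] { parts[0].trim(), parts[1].trim() });
                }
            }
        } catch (IOException e) {
            // Surface the problem instead of silently ignoring it.
            throw new UncheckedIOException(e);
        }
        return cards;
    }

    public static void main(String[] args) {
        // A StringReader stands in for new FileReader("temp.txt").
        List<String[]> cards = readCards(
                new StringReader("Volare : To Fly\nVelle : To Wish\n... et alia\n"));
        for (String[] card : cards) {
            System.out.println(card[0] + " -> " + card[1]);
        }
        System.out.println("the size is " + cards.size()); // prints the size is 2
    }
}
```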
