How to use String object to parse input in Java - java

I'm creating a command-line utility in Java as an experiment, and I need to parse a string of input from the user. The string needs to be split into components at every occurrence of '&'. What's the best way to do this using the String object in Java?
Here is my basic code:
//Process p = null;
Process p = null;
Runtime r = Runtime.getRuntime();
String textLine = "";
BufferedReader lineOfText = new BufferedReader(new InputStreamReader(System.in));
while(true) {
System.out.print("% ");
try {
textLine = lineOfText.readLine();
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
//System.out.println(textLine);
}

I think the simplest way is
String[] tokens = textLine.split("&");

A Scanner or a StringTokenizer is another way to do this. But for a simple delimiter like this, the split() method mentioned by MByd will work perfectly.
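As a minimal sketch of the split() approach: the class and method names below are illustrative and do not come from the question. Besides splitting on '&', it trims each piece and drops empty ones, which split("&") alone does not do.

```java
import java.util.ArrayList;
import java.util.List;

class AmpersandSplitter {
    // Splits a line on '&', trimming each piece and dropping empty ones.
    static List<String> splitCommands(String textLine) {
        List<String> commands = new ArrayList<>();
        if (textLine == null) { // readLine() returns null at end of input
            return commands;
        }
        for (String token : textLine.split("&")) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                commands.add(trimmed);
            }
        }
        return commands;
    }

    public static void main(String[] args) {
        // prints [ls -l, pwd, date]
        System.out.println(splitCommands("ls -l & pwd &  & date"));
    }
}
```

In the question's loop, textLine from readLine() could be passed straight to splitCommands.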

Related

Cannot convert file to string for very large file?

I have tried to use a BufferedReader to convert a .txt file, specifically the Iliad, to a string. Small files work, but larger ones do not: when I print fileString after the while loop has finished, no output is shown. Here's my code.
String fileString = "";
String line = "";
char readChar;
BufferedReader br;
try {
br = new BufferedReader(new FileReader(inputFile));
while((line=br.readLine())!=null)
{
fileString = fileString + line;
System.out.println(fileString);
}
br.close();
} catch (IOException e) {
e.printStackTrace();
}
System.out.println(fileString);
Recall that Strings are immutable in Java. Every + creates a new String and copies everything read so far, so building the result this way is extremely inefficient and costly in both time and memory.
You can use either StringBuilder or StringBuffer. In my example I use StringBuilder since it does not seem that you need to worry about synchronization.
StringBuilder fileString = new StringBuilder();
String line = "";
char readChar;
BufferedReader br;
try {
br = new BufferedReader(new FileReader(inputFile));
while((line=br.readLine())!=null)
{
fileString.append(line);
}
br.close();
} catch (IOException e) {
e.printStackTrace();
}
System.out.println(fileString.toString());
Try this, although I am not sure whether println will be able to print the whole string.
You are doing string concatenation the slow way. The very slow way. Use a StringBuffer or StringBuilder.
It has exactly nothing to do with BufferedReader whatsoever, as a simple test will show: remove the code in the body of the loop.
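The StringBuilder advice can be sketched as follows. This is an illustration under assumptions: the class and method names are made up, the method takes any Reader so that a StringReader can stand in for new FileReader(inputFile), and it re-adds a line break after each line because readLine() strips the terminator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class FileSlurper {
    // Reads everything from the reader into one String, keeping line breaks.
    static String readAll(Reader source) {
        StringBuilder contents = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                // readLine() strips the line terminator, so add one back
                contents.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return contents.toString();
    }

    public static void main(String[] args) {
        // A StringReader stands in for a FileReader on the real file
        System.out.print(readAll(new StringReader("first line\nsecond line")));
    }
}
```

The string is printed once, after the loop, rather than on every iteration.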

Read file, one line at a time and run code

I have a file with text in this format:
text:text2:text3
text4:text5:text6
text7:text8:text9
What I want to do is read the first line, split it at each ":", and save the three strings into separate variables. Those variables are then passed as parameters to a method, after which the program reads the next line and does the same thing, over and over. So far I've got this:
public static void main(String[] args) {
BufferedReader reader = null;
try {
File file = new File("C://Users//Patrick//Desktop//textfile.txt");
reader = new BufferedReader(new FileReader(file));
String line;
while ((line = reader.readLine()) != null) {
System.out.println(line);
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
reader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
Also, I've tried this for the separation (although I'm not sure an array is the best option):
String[] strArr = sCurrentLine.split("\\:");
Use String[] parts = line.split(":"); to get an array with text, text2 etc. You can then loop through parts and call the method you want with each item in the list.
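Escaping the colon is not necessary, because : is not a special character in regex; an escape is only required when the delimiter is a regex metacharacter such as . or |. Note that "\\:" is still a valid pattern and also matches a literal colon, so the more likely problem is that your loop reads into line while the split is called on sCurrentLine.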
More information here.
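Putting the pieces together, a sketch of the read-split-call loop might look like this. The handle method is a hypothetical stand-in for the method the question wants to call, and the class name and Reader parameter are assumptions made so the example runs without a file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ColonLineProcessor {
    // Hypothetical stand-in for the method that receives the three values.
    static String handle(String first, String second, String third) {
        return first + "|" + second + "|" + third;
    }

    // Reads line by line, splits each line at ':' and calls handle() on it.
    static List<String> processLines(Reader source) {
        List<String> results = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(":");
                if (parts.length != 3) {
                    continue; // skip blank or malformed lines
                }
                results.add(handle(parts[0], parts[1], parts[2]));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        String sample = "text:text2:text3\ntext4:text5:text6\ntext7:text8:text9";
        System.out.println(processLines(new StringReader(sample)));
    }
}
```

For a real file, new FileReader(file) would replace the StringReader; try-with-resources closes the reader, so the finally block from the question is no longer needed.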

java string matching from a large text file issue

I would like to implement a task of string matching from a large text file.
1. replace all the non-alphanumeric characters
2. count the occurrences of a specific term in the text file. For example, when matching the term "tom", the matching is not case sensitive, so "Tom" should be counted. However, the term "tomorrow" should not be counted.
code template one:
try {
in = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile)));
} catch (FileNotFoundException e1) {
System.out.println("Not found the text file: "+inputFile);
}
Scanner scanner = null;
try {
while (( line = in.readLine())!=null){
String newline=line.replaceAll("[^a-zA-Z0-9\\s]", " ").toLowerCase();
scanner = new Scanner(newline);
while (scanner.hasNext()){
String term = scanner.next();
if (term.equalsIgnoreCase(args[1]))
countstr++;
}
}
} catch (IOException e) {
e.printStackTrace();
}
code template two:
try {
in = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile)));
} catch (FileNotFoundException e1) {
System.out.println("Not found the text file: "+inputFile);
}
Scanner scanner = null;
try {
while (( line = in.readLine())!=null){
String newline=line.replaceAll("[^a-zA-Z0-9\\s]", " ").toLowerCase();
String[] strArray=newline.split(" ");//split by blank space
for (int i = 0; i < strArray.length; i++) {
if (strArray[i].equalsIgnoreCase(args[1]))
countstr++;
}
}
} catch (IOException e) {
e.printStackTrace();
}
Running the two versions gives different results, and the Scanner version appears to give the correct one. For a large text file, however, the Scanner version runs much more slowly than the split version. Can anyone tell me the reason and suggest a more efficient solution?
In your first approach, you don't need to create a Scanner for every line. A Scanner splitting on whitespace is not a good choice for very long lines.
Your line is already converted to lowercase, so you only need to lowercase the key once, outside the loop, and then use equals inside the loop.
Or match each line against a regex:
// Word-boundary pattern; the term is lowercased once, outside the loop
String key = "\\b" + "Tom".toLowerCase() + "\\b";
Pattern p = Pattern.compile(key);
word = word.toLowerCase().replaceAll("[^a-zA-Z0-9\\s]", " ");
Matcher m = p.matcher(word);
while (m.find()) { // count every occurrence, not just the first
countstr++;
}
Personally, I would choose the BufferedReader approach for a large file.
String key = "\\b" + args[0].toLowerCase() + "\\b";
Pattern p = Pattern.compile(key);
try (final BufferedReader br = Files.newBufferedReader(inputFile,
StandardCharsets.UTF_8)) {
for (String line; (line = br.readLine()) != null;) {
// Lowercase the line and replace non-alphanumerics with spaces.
line = line.toLowerCase().replaceAll("[^a-zA-Z0-9\\s]", " ");
Matcher m = p.matcher(line);
while (m.find()) { // count every occurrence on the line
countstr++;
}
}
}
This sample uses Java 7 features (try-with-resources and Files.newBufferedReader); adapt it if you are on an older version.
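For comparison, here is a self-contained sketch of whole-word, case-insensitive counting. It is one possible approach rather than the answer's exact method: it relies on Pattern.CASE_INSENSITIVE and \b instead of lowercasing and replaceAll, and the class and method names are assumptions. One behavioural difference to be aware of: \b treats the underscore as a word character, so "tom_cat" is not counted here, whereas the replaceAll step in the question would turn it into "tom cat".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TermCounter {
    // Counts whole-word, case-insensitive occurrences of term in the input.
    static int countTerm(Reader source, String term) {
        // \b anchors at word boundaries; Pattern.quote treats the term literally
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        int count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                Matcher m = p.matcher(line);
                while (m.find()) { // every occurrence on the line
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Tom met tom.\nTomorrow, TOM!";
        // prints 3 ("Tomorrow" is not counted)
        System.out.println(countTerm(new StringReader(text), "tom"));
    }
}
```

The pattern is compiled once and no per-line copy of the text is made, which matters for large files.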

SetText String[] in a TextView

I am trying to use setText with a String array. First I create a String[], then I assign data to element 0, and then I want to call .setText() with that element on my TextView. Is this the right way?
Note: I'm using a StringTokenizer to split the strings in the text file.
try {
filename = "myk.txt";
FileReader filereader = new FileReader(Environment.getExternalStorageDirectory() + "/Q/" + filename);
BufferedReader bufferedreader = new BufferedReader(filereader);
try {
while ((text = bufferedreader.readLine()) != null){
sb.append(text);
sb.toString().split(";");
tokens = new StringTokenizer(sb.toString(), ";");
// NullPointerException here:
if (tokens.countTokens() > 0){
questionfromfile[0] = tokens.nextToken();
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
// ETC ...
// and now:
textview.setText(question[0]);
Sure, you mean something like
String[] strings = new String [5];
strings[0] = "foobar";
component.setText(strings[0]);
Why do you have this line?
sb.toString().split(";");
Remember that a String is immutable: the standard API methods you are calling never change the string itself but return new objects instead, so the result of split() is simply discarded here.
About StringTokenizer, as the Javadoc says:
StringTokenizer is a legacy class that is retained for compatibility
reasons although its use is discouraged in new code. It is recommended
that anyone seeking this functionality use the split method of String
or the java.util.regex package instead.
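Following the Javadoc's recommendation, a sketch of the split-based alternative could look like this. The class name, method name, and sample line are assumptions; on Android, the returned string would then be passed to textview.setText(...).

```java
class QuestionParser {
    // Returns the first ';'-separated field of the line, or "" if there is none.
    static String firstField(String line) {
        if (line == null) {
            return "";
        }
        String[] fields = line.split(";"); // split returns a new array
        return fields.length > 0 ? fields[0].trim() : "";
    }

    public static void main(String[] args) {
        // prints What is 2+2?
        System.out.println(firstField("What is 2+2?;4;5;6"));
    }
}
```

Unlike the discarded split call in the question, the array returned by split is kept and used here.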

how do you get String Tokenizer to ignore text?

I have this code:
public void readTroops() {
File file = new File("resources/objects/troops.txt");
StringBuffer contents = new StringBuffer();
BufferedReader reader = null;
try {
reader = new BufferedReader(new FileReader(file));
String text = null;
// repeat until all lines are read
while ((text = reader.readLine()) != null) {
StringTokenizer troops = new StringTokenizer(text,"=");
String list = troops.nextToken();
String value = troops.nextToken();
}
and this file:
//this is a comment part of the text file//
Total=1
The problem is that 1) I can't get it to ignore everything between the // markers, and 2) I can't get it to read the file when there is an 'ENTER' (blank line) in between. For example, this text works:
Total=1
So my question is: what do I put in the delimiter argument, i.e.:
StringTokenizer troops = new StringTokenizer(text,"=","WHAT GOES HERE?");
So how can I get the tokenizer to ignore 'ENTER'/newline and anything between the // markers, or something similar? Thanks.
P.S. I don't mind if you use String.split to answer my question.
Use the method countTokens to skip lines that don't have two tokens:
while ((text = reader.readLine()) != null) {
StringTokenizer troops = new StringTokenizer(text,"=");
if(troops.countTokens() == 2){
String list = troops.nextToken();
String value = troops.nextToken();
....
}else {
//ignore this line
}
}
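Since the question allows String.split, the same skip-what-does-not-fit idea can be sketched without StringTokenizer. The class and method names are assumptions, and the method takes a Reader so a StringReader can stand in for the troops file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class TroopFileParser {
    // Parses key=value lines, skipping blank lines and lines starting with "//".
    static Map<String, String> parse(Reader source) {
        Map<String, String> values = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String text;
            while ((text = reader.readLine()) != null) {
                String trimmed = text.trim();
                if (trimmed.isEmpty() || trimmed.startsWith("//")) {
                    continue; // blank line or comment
                }
                String[] pair = trimmed.split("=", 2); // at most two parts
                if (pair.length == 2) {
                    values.put(pair[0].trim(), pair[1].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String file = "//this is a comment part of the text file//\n\nTotal=1\n";
        // prints {Total=1}
        System.out.println(parse(new StringReader(file)));
    }
}
```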
Properties prop = new Properties();
prop.load(new FileInputStream("properties_file.txt"));
assertEquals("1", prop.getProperty("Total"));
P.S. You should keep a reference to the input stream and close it when done.
Thinking outside the box, maybe you can use Properties instead of a tokenizer (if you change your comments to start with #)?
Properties troops = new Properties();
InputStream inputStream = SomeClass.class.getResourceAsStream("troops.properties");
try {
troops.load(inputStream);
} catch (IOException e) {
// Handle error
} finally {
// Close inputStream in a safe manner
}
troops.getProperty("Total"); // Returns "1"
Or if you are using Java 7:
Properties troops = new Properties();
try (InputStream inputStream = SomeClass.class.getResourceAsStream("troops.properties")) {
troops.load(inputStream);
} catch (IOException e) {
// Handle error
}
troops.getProperty("Total"); // Returns "1"
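A runnable sketch of the Properties idea, loading from an in-memory string instead of a resource file (the class name, method name, and sample content are assumptions). It shows that lines starting with # and blank lines are ignored by Properties.load.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class TroopProperties {
    // Loads key=value pairs; Properties skips '#' comments and blank lines.
    static Properties load(Reader source) {
        Properties troops = new Properties();
        try (Reader in = source) {
            troops.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return troops;
    }

    public static void main(String[] args) {
        String file = "# this is a comment\n\nTotal=1\n";
        // prints 1
        System.out.println(load(new StringReader(file)).getProperty("Total"));
    }
}
```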
If you are reading from a file, a better way would be to use a StreamTokenizer, which lets you declare your own syntax for the tokenizer. I used this approach to create an HTML rendering engine. It lets you parse directly from a reader and also provides useful functions for identifying numbers, which it seems you may need.
(I will post an example once my eclipse loads!)
public static String render(String file, HashMap vars){
// Create a stringbuffer to rebuild the string
StringBuffer renderedFile = new StringBuffer();
try{
FileReader in = new FileReader(file);
BufferedReader reader = new BufferedReader(in); // create your reader
StreamTokenizer tok;
tok = new StreamTokenizer(reader); // the tokenizer reads from the reader
tok.resetSyntax();
tok.wordChars(0, 255); // treat all chars (including spaces) as word chars
/*
* quoteChar sets a string delimiter: for "$ hello $", the text between the
* '$' characters is returned in sval as one token (it is not skipped)
*/
tok.quoteChar('$');
while(tok.nextToken()!=StreamTokenizer.TT_EOF){ //while it is not at the end of file
String s = tok.sval;
if (vars.containsKey(s))
s =(String)vars.get(s);
renderedFile.append(s);
}
}
catch(Exception e){System.out.println("Error Loading Template");}
return renderedFile.toString();
}
Check this out for a good tutorial http://tutorials.jenkov.com/java-io/streamtokenizer.html
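Applied to the troops file from the question, a StreamTokenizer sketch might look like the following. The class and method names are assumptions; it uses the tokenizer's default syntax plus slashSlashComments(true), and values are read as numbers (double) because number parsing is on by default.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class TroopTokenizer {
    // Reads name=number pairs, skipping "//" comments and blank lines.
    static Map<String, Double> parse(Reader source) {
        Map<String, Double> values = new LinkedHashMap<>();
        StreamTokenizer tok = new StreamTokenizer(source);
        tok.ordinaryChar('/');        // by default a single '/' starts a comment
        tok.slashSlashComments(true); // skip from "//" to the end of the line
        try {
            String key = null;
            while (tok.nextToken() != StreamTokenizer.TT_EOF) {
                if (tok.ttype == StreamTokenizer.TT_WORD) {
                    key = tok.sval; // remember the name before '='
                } else if (tok.ttype == StreamTokenizer.TT_NUMBER && key != null) {
                    values.put(key, tok.nval);
                    key = null;
                }
                // '=' and other single-character tokens are ignored
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String file = "//this is a comment part of the text file//\n\nTotal=1\n";
        // prints {Total=1.0}
        System.out.println(parse(new StringReader(file)));
    }
}
```

Newlines count as whitespace by default, so blank lines between entries need no special handling.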
