StreamTokenizer - How to split every character into tokens - java

In short: how do you configure the StreamTokenizer so that it splits each character in an input file into its own token?
For example, if I have the following input:
1023021023584
How can this be read so that each individual character can be saved to a specific index of an array?

To read characters individually from a file as "tokens", use a Reader:
try (BufferedReader in = Files.newBufferedReader(Paths.get("test.txt"))) {
for (int charOrEOF; (charOrEOF = in.read()) != -1; ) {
String token = String.valueOf((char) charOrEOF);
// Use token here
}
}
For full support of Unicode characters from the supplementary planes, e.g. emojis, we need to read surrogate pairs:
try (BufferedReader in = Files.newBufferedReader(Paths.get("test.txt"))) {
for (int char1, char2; (char1 = in.read()) != -1; ) {
String token = (Character.isHighSurrogate((char) char1) && (char2 = in.read()) != -1)
? String.valueOf(new char[] { (char) char1, (char) char2 })
: String.valueOf((char) char1);
// Use token here
}
}
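If the text is already in memory as a String, the same code-point splitting can be sketched with String.codePoints() instead of handling surrogates by hand. The class name CodePointTokens and the method tokens are made up for this illustration and are not part of the answer above.

```java
import java.util.List;
import java.util.stream.Collectors;

class CodePointTokens {
    // Hypothetical helper: one String token per Unicode code point.
    static List<String> tokens(String text) {
        return text.codePoints()
                // Character.toChars yields one char, or a surrogate pair for supplementary code points
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "\uD83D\uDE00" is a single emoji stored as a surrogate pair
        List<String> result = tokens("a\uD83D\uDE00b");
        System.out.println(result.size() + " tokens: " + result);
    }
}
```

This trades streaming for simplicity: it assumes the whole input fits in memory, which is fine for small files.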

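You have to call resetSyntax() on the StreamTokenizer instance. After that call every character is treated as an ordinary character, so nextToken() returns one character at a time, as below: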
public static void main(String[] args) {
try (FileReader fileReader = new FileReader("C:\\test.txt");){
StreamTokenizer st = new StreamTokenizer(fileReader);
st.resetSyntax();
int token =0;
while((token = st.nextToken()) != StreamTokenizer.TT_EOF) {
if(st.ttype == StreamTokenizer.TT_NUMBER) {
System.out.println("Number: "+st.nval);
} else if(st.ttype == StreamTokenizer.TT_WORD) {
System.out.println("Word: "+st.sval);
}else {
System.out.println("Ordinary Char: "+(char)token);
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
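To get from the tokens to "a specific index of an array", as the question asks, something like the following sketch may work. The class CharTokens and the method split are hypothetical names, and the sketch assumes the input only contains characters below U+0100 (such as ASCII digits), since StreamTokenizer treats all other characters as word characters even after resetSyntax().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class CharTokens {
    // Hypothetical helper: returns every character of the input as its own array element.
    // Assumes all characters are below U+0100 (e.g. ASCII digits).
    static char[] split(Reader reader) {
        StreamTokenizer st = new StreamTokenizer(reader);
        st.resetSyntax(); // no word, number, whitespace or comment characters any more
        StringBuilder chars = new StringBuilder();
        try {
            int token;
            while ((token = st.nextToken()) != StreamTokenizer.TT_EOF) {
                // For an ordinary character, the returned token type is the character itself
                chars.append((char) token);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chars.toString().toCharArray();
    }

    public static void main(String[] args) {
        // A StringReader stands in for the FileReader used in the answer
        char[] digits = split(new StringReader("1023021023584"));
        System.out.println(digits.length + " tokens, first = " + digits[0]);
    }
}
```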

Related

java - BufferedReader readLine stops reading when it encounters an empty string

I am using BufferedReader to read a text file line by line. Then I use a method to normalize the text of each line. But there is something wrong with my normalization method: after the call to it, the BufferedReader object stops reading the file. Can someone help me with this?
Here is my code:
public static void main(String[] args) {
String string = "";
try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
String line;
while ((line = br.readLine()) != null) {
string += normalize(line);
}
} catch (Exception e) {
}
System.out.println(string);
}
public static String normalize(String string) {
StringBuilder text = new StringBuilder(string.trim());
for(int i = 0; i < text.length(); i++) {
if(text.charAt(i) == ' ') {
removeWhiteSpaces(i + 1, text);
}
}
if(text.charAt(text.length() - 1) != '.') {
text.append('.');
}
text.append("\n");
return text.toString();
}
public static void removeWhiteSpaces(int index, StringBuilder text) {
int j = index;
while(text.charAt(j) == ' ') {
text.deleteCharAt(j);
}
}
And here is the text file that I use:
abc .
asd.
dasd.
I think the problem is in removeWhiteSpaces(i + 1, text); if the string processing throws an exception, the reader never gets to read the next line.
You don't check for an empty string before calling text.charAt(text.length() - 1), which is a problem too.
Print the exception: change your catch block so that it no longer swallows it:
} catch (Exception e) {
e.printStackTrace();
}
The cause is in your while (text.charAt(j) == ' ') loop: you delete characters without ever checking the length of the StringBuilder.
Try this:
while ((line = br.readLine()) != null) {
if(line.trim().isEmpty()) {
continue;
}
string += normalize(line);
}
Try Scanner:
Scanner scan = new Scanner(is);
int rowCount = 0;
while (scan.hasNextLine()) {
String temp = scan.nextLine();
if(temp.trim().length()==0){
continue;
}
}
//rest of your logic
The normalize function is causing this.
The following tweak to it should fix this:
public static String normalize(String string) {
if(string.length() < 1) {
return "";
}
StringBuilder text = new StringBuilder(string.trim());
if(text.length() < 1){
return "";
}
for(int i = 0; i < text.length(); i++) {
if(text.charAt(i) == ' ') {
removeWhiteSpaces(i + 1, text);
}
}
if(text.charAt(text.length() - 1) != '.') {
text.append('.');
}
text.append("\n");
return text.toString();
}
The empty line in the file triggers the failure, but not because of readLine() itself. The documentation states:
Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.
https://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#readLine()
readLine() returns the contents of the line without the line terminator, so an empty line comes back as an empty string (""); null is returned only at the end of the stream. normalize("") then calls text.charAt(text.length() - 1), i.e. charAt(-1), which throws a StringIndexOutOfBoundsException. The empty catch block swallows it, so reading appears to stop silently.
The code proposed by @tijn167 works around this using BufferedReader. If you are not restricted to BufferedReader, use Scanner as @Abhishek Soni suggested.
Also, your method removeWhiteSpaces() only looks for the space character ' '. An empty line contains no spaces (readLine() has already stripped the carriage return \r or line feed \n), so the condition text.charAt(j) == ' ' is never satisfied for it.
The second line of your file is empty, so that is where the while loop stops.
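Putting the suggestions from the answers together, a defensive normalize could look roughly like this. It is only a sketch: the class name LineNormalizer is invented here, and it assumes the intended behaviour is "collapse runs of spaces, make sure the line ends with a period".

```java
class LineNormalizer {
    // Sketch of a normalize() that tolerates empty and blank lines.
    static String normalize(String line) {
        StringBuilder text = new StringBuilder(line.trim());
        if (text.length() == 0) {
            return ""; // empty or blank line: nothing to normalize
        }
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == ' ') {
                int j = i + 1;
                // Bounds check first, so charAt(j) can never run past the end
                while (j < text.length() && text.charAt(j) == ' ') {
                    text.deleteCharAt(j);
                }
            }
        }
        if (text.charAt(text.length() - 1) != '.') {
            text.append('.');
        }
        return text.append('\n').toString();
    }

    public static void main(String[] args) {
        // An empty line in the middle no longer aborts the processing
        for (String line : new String[] {"abc   .", "", "dasd"}) {
            System.out.print(normalize(line));
        }
    }
}
```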

Manipulate this code so that it counts the # of digits in a file

I need to manipulate this code so that it will read the # of digits from a file.
I am honestly stumped on this one for some reason. Do I need to tokenize it first?
Thanks!
import java.io.*;
import java.util.*;
public class CountLetters {
public static void main(String args[]) {
if (args.length != 1) {
System.err.println("Synopsis: Java CountLetters inputFileName");
System.exit(1);
}
String line = null;
int numCount = 0;
try {
FileReader f = new FileReader(args[0]);
BufferedReader in = new BufferedReader(f);
while ((line = in.readLine()) != null) {
for (int k = 0; k < line.length(); ++k)
if (line.charAt(k) >= 0 && line.charAt(k) <= 9)
++numCount;
}
in.close();
f.close();
} catch (Exception e) {
e.printStackTrace();
}
System.out.println(numCount + " numbers in this file.");
} // main
} // CountNumbers
Use '' to indicate a char constant: you are comparing chars to the ints 0 and 9, which are control characters, not the digits '0' and '9'. I would also suggest a try-with-resources statement to avoid explicit close calls, and please avoid one-line loops without braces (unless you are using lambdas). Like:
public static void main(String args[]) {
if (args.length != 1) {
System.err.println("Synopsis: Java CountLetters inputFileName");
System.exit(1);
}
String line = null;
int numCount = 0;
try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
while ((line = in.readLine()) != null) {
for (int k = 0; k < line.length(); ++k) {
if ((line.charAt(k) >= '0' && line.charAt(k) <= '9')) {
++numCount;
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
System.out.println(numCount + " numbers in this file.");
} // main
Also, you could use a regular expression to remove all non-digits (\\D) and add the length of the resulting String (which is all-digits). Like,
while ((line = in.readLine()) != null) {
numCount += line.replaceAll("\\D", "").length();
}
Use if (Character.isDigit(c)), where c is each char in turn. This counts every digit and, I believe, Arabic-Indic digits as well.
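A minimal sketch of the Character.isDigit approach, with the counting pulled out into a method so it can be tried without a file. DigitCounter and countDigits are illustrative names, not from the question.

```java
class DigitCounter {
    // Counts the characters that Character.isDigit accepts (any Unicode decimal digit).
    static int countDigits(String line) {
        int count = 0;
        for (int k = 0; k < line.length(); k++) {
            if (Character.isDigit(line.charAt(k))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // In the original program this would be called once per line read from the file
        System.out.println(countDigits("a1b22") + " digits");
    }
}
```

If only the ASCII digits should count, the range check c >= '0' && c <= '9' from the other answer is the narrower test.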

Counting { and } braces in program JAVA

I have to read an input file that contains code and produce output that numbers the matching braces ({ and }).
Example of how the output will look:
import java.util.scanner;
public class Tester {1
public static void main(String[] args) {2
Scanner in = new Scanner (System.in);
int price = in.nextInt;
if (price < 10)
System.out.println("Good price");
System.out.println ("Buy it");
}2
}1
}0
}0
0 represents extra braces that have no match.
What is the most efficient way to approach this?
Should I just process line by line with Strings?
You can keep a count. Iterate the characters in every line, increment (or decrement) the count and (output the count) for { and } respectively. Don't forget to close your Scanner with a finally block or a try-with-resources. Assuming your file Tester.java is in the user's home folder you could do something like,
File f = new File(System.getProperty("user.home"), "Tester.java");
try (Scanner scan = new Scanner(f)) {
int count = 0;
while (scan.hasNextLine()) {
String line = scan.nextLine();
for (char ch : line.toCharArray()) {
System.out.print(ch);
if (ch == '{') {
System.out.print(++count);
} else if (ch == '}') {
if (count > 0) {
System.out.print(--count);
} else {
System.out.print(count);
}
}
}
System.out.println();
}
} catch (Exception e) {
e.printStackTrace();
}
You can find the extra braces by using a stack, as below:
public static void main(final String[] args) {
Stack<String> stack = new Stack<String>();
File file = new File("InputFile");
int lineCount = 0;
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String line;
while ((line = br.readLine()) != null) {
lineCount++;
for (int i = 0; i < line.length(); i++) {
if (line.charAt(i) == '{') {
stack.push("{");
} else if (line.charAt(i) == '}') {
if (!stack.isEmpty()) {
stack.pop();
} else {
System.out.println("Extra brace found at line number : " + lineCount);
}
}
}
}
if (!stack.isEmpty()) {
System.out.println(stack.size() + " braces are opened but not closed");
}
} catch (Exception e) {
e.printStackTrace();
}
}
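In the sample output of the question, a closing brace carries the same number as the opening brace it matches (}2 closes {2). A variation of the counting approach that labels braces this way might look like the sketch below; BraceAnnotator and annotate are hypothetical names, and braces inside string literals or comments are deliberately not treated specially.

```java
class BraceAnnotator {
    // Appends the nesting depth after every brace; unmatched closing braces get 0.
    static String annotate(String source) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (char ch : source.toCharArray()) {
            out.append(ch);
            if (ch == '{') {
                out.append(++depth);     // entering a new level
            } else if (ch == '}') {
                if (depth > 0) {
                    out.append(depth--); // same number as the matching '{', then leave the level
                } else {
                    out.append(0);       // extra brace with no match
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(annotate("class A { void f() { } } }"));
    }
}
```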

How can I ignore comment statements /*.....*/ when reading a Java file?

How can I ignore the comment statements that begin with "/*" and end with
"*/", for example: /*the problem is..*/ or
/* problem is very difficult */? I want to remove these statements when reading a Java file line by line.
public class filename1 {
public static void main (String args[])
{
try {
fileName = "C:\\NetBeansProjects\\filename\\src\\filename\\filename.java";
FileReader fr = new FileReader(fileName);
BufferedReader br = new BufferedReader(fr);
line = br.readLine();
while (line !=null) {
for( int i=0;i<line.length();i++)
{
b=line.indexOf("/",i);
ee=line.indexOf("*",i);
if(b!=-1 && ee!=-1)
v=line.indexOf("*/",i);
if (v==-1)
line=" ";
}
System.out.println(line);
line = br.readLine();
}}
catch (IOException e)
{
e.printStackTrace();
}
}
}
Simply include:
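int index = str.indexOf("/*");
while (index != -1) {
// Search for the closing marker only after the opening one
int end = str.indexOf("*/", index + 2);
if (end == -1) {
break; // comment is not closed in this string
}
str = str.substring(0, index) + str.substring(end + 2);
index = str.indexOf("/*");
}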
Edit:
Assuming that you have to account for fragments where you have a comment interrupted by the start or end of the string:
Edit2:
Now, also assuming that you have to account for the string literals "/*" and "*/":
str = str.replace("\"/*\"", "literal_string_open_comment");
str = str.replace("\"*/\"", "literal_string_close_comment");
int start = str.indexOf("/*"), end = str.indexOf("*/");
while(start > -1 || end > -1) {
if(start != -1) {
if(end != -1) {
if(end < start) {
str = str.substring(end+2);
} else {
str = str.substring(0, start) + str.substring(end+2);
}
} else {
str = str.substring(0, start);
}
} else {
str = str.substring(end+2);
}
start = str.indexOf("/*");
end = str.indexOf("*/");
}
str = str.replace("literal_string_open_comment", "\"/*\"");
str = str.replace("literal_string_close_comment", "\"*/\"");
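Since a comment can span several lines, another option is to scan the whole text character by character and keep track of whether you are inside a comment or a string literal. The following is a rough sketch under stated assumptions, not a full Java lexer: CommentStripper and stripBlockComments are invented names, and // comments and char literals such as '"' are not handled.

```java
class CommentStripper {
    // Removes /* ... */ comments, leaving string literals untouched.
    static String stripBlockComments(String src) {
        StringBuilder out = new StringBuilder();
        boolean inComment = false;
        boolean inString = false;
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            char next = i + 1 < src.length() ? src.charAt(i + 1) : '\0';
            if (inComment) {
                if (c == '*' && next == '/') {
                    inComment = false;
                    i++; // skip the '/' of the closing marker
                }
                // everything else inside the comment is dropped
            } else if (inString) {
                out.append(c);
                if (c == '\\' && i + 1 < src.length()) {
                    out.append(next); // keep escaped characters such as \" as they are
                    i++;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '/' && next == '*') {
                inComment = true;
                i++; // skip the '*' of the opening marker
            } else {
                if (c == '"') {
                    inString = true;
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String code = "int a; /* first\n second */ String s = \"/* kept */\";";
        System.out.println(stripBlockComments(code));
    }
}
```

When reading line by line, the same two flags can be kept across lines instead of joining the file into one string first.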

How can I read every letter from the file?

I have this piece of code, which reads all the lines from the file. But I want to read every letter separately and put it into an array. How can I do it?
For example: the file contains the digits 00010 and I want to put them into an array like this: array[0,0,0,1,0]
public void readTest()
{
try
{
InputStream is = getResources().getAssets().open("test.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(is));
String st = "";
StringBuilder sb = new StringBuilder();
while ((st=br.readLine())!=null)
{
sb.append(st);
}
br.close();
}catch (IOException e)
{
Log.d(TAG, "Error: " + e);
}
}
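Use br.read(). It returns the character as an integer, or -1 at the end of the stream. Note that a generic type argument cannot be a primitive, so the list has to hold Character rather than char:
List<Character> charArray = new ArrayList<>();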
int i;
while ((i = br.read()) != -1) {
char c = (char) i;
charArray.add(c);
}
Straight from the Javadoc:
public int read() throws IOException
Reads a single character.
You should read every line and add its letters to the array by iterating through it, like this:
while ((st=br.readLine())!=null) {
sb.append(st);
for (int i = 0; i < st.length(); i++) {
char c = st.charAt(i);
yourArray.add(c);
}
}
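If the goal is an int array like [0,0,0,1,0] rather than a list of chars, the read() loop can convert each digit as it goes. This is a sketch with made-up names (DigitArray, readDigits); it assumes that anything that is not a decimal digit, such as a line break, should simply be skipped.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DigitArray {
    // Reads the stream one character at a time and collects the decimal digits.
    static int[] readDigits(Reader reader) {
        List<Integer> digits = new ArrayList<>();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                int value = Character.digit((char) c, 10); // -1 if c is not a decimal digit
                if (value >= 0) {
                    digits.add(value);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return digits.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // A StringReader stands in for the BufferedReader over test.txt
        System.out.println(Arrays.toString(readDigits(new StringReader("00010\n"))));
    }
}
```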
