java string StringTokenizer doesn't recognize token after "//"? - java

I am writing code that prints only the comments in a Java file. It works when a comment looks like this:
// a comment
but when a comment looks like this:
// /* cdcdf
it does not print "/* cdcdf"; it only prints a blank line.
Does anyone know why this happens?
Here is my code:
package printC;
import java.io.*;
import java.util.StringTokenizer;
import java.lang.String ;
public class PrintComments {
public static void main(String[] args) {
try {
String line;
BufferedReader br = new BufferedReader(new FileReader(args[0]));
while ((line = br.readLine()) != null) {
if (line.contains("//") ) {
StringTokenizer st1 = new StringTokenizer(line, "//");
if(!(line.startsWith("//"))) {
st1.nextToken();
}
System.out.println(st1.nextToken());
}
}
}catch (Exception e) {
System.out.println(e);
}
}
}

You can simplify the code by just looking for the first position of the //. indexOf works fine for this. You don't need to tokenize as you really just want everything after a certain position (or text), you don't need to split the line into multiple pieces.
If the // is found (indexOf returns -1 for "not found"), use substring to print only the characters starting at that position.
This minimal example should do what you want:
import java.io.*;
public class PrintComments {
public static void main(String[] args) throws IOException {
String line; // comment
BufferedReader br = new BufferedReader(new FileReader(args[0]));
while ((line = br.readLine()) != null) {
int commentStart = line.indexOf("//");
if (commentStart != -1) {
System.out.println(line.substring(commentStart));
}
} // /* that's it
}
}
If you don't want to print the //, just add 2 to commentStart.
Note that this primitive approach to finding comments is very brittle. If you run the program on its own source, it will happily report //"); as well, for the line with the indexOf call. Any serious attempt to find comments needs to properly parse the source code.
Edit: If you want to look for other comments marked by /* and */ as well, do the same thing for the opening comment, then look for the closing comment at the end of the line. This will find a /* comment */ when all of the comment is on a single line. When it sees the opening /* it looks whether the line ends with a closing */ and if so, uses substring again to only pick the parts between the comment markers.
import java.io.*;
public class PrintComments {
public static void main(String[] args) throws IOException {
String line; // comment
BufferedReader br = new BufferedReader(new FileReader(args[0]));
while ((line = br.readLine()) != null) {
int commentStart;
String comment = null;
commentStart = line.indexOf("//");
if (commentStart != -1) {
comment = line.substring(commentStart + 2);
}
commentStart = line.indexOf("/*");
if (commentStart != -1) {
comment = line.substring(commentStart + 2);
if (comment.endsWith("*/")) {
comment = comment.substring(0, comment.length() - 2);
}
}
if (comment != null) {
System.out.println(comment);
}
} // /* that's it
/* test */
}
}
To extend this for comments that span multiple lines, you need to remember whether you're in a multi-line comment, and if you are keep printing line and checking for the closing */.
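A minimal sketch of that multi-line extension might look like the following. The class name MultiLineComments and the method extractComments are made up for illustration, and the code works on a list of lines rather than a file so it is easy to try out. Like the examples above, it does not recognise string literals, so it is still a primitive scan and not a real parser.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiLineComments {

    // Collects comment text from the given lines; handles // comments and
    // /* ... */ comments, including block comments spanning several lines.
    // Hypothetical helper: it does not recognise string literals.
    static List<String> extractComments(List<String> lines) {
        List<String> comments = new ArrayList<>();
        boolean inBlock = false; // true while inside an unclosed /* ... */
        for (String line : lines) {
            int pos = 0;
            while (pos < line.length()) {
                if (inBlock) {
                    int end = line.indexOf("*/", pos);
                    if (end == -1) {
                        // the comment continues on the next line
                        comments.add(line.substring(pos));
                        pos = line.length();
                    } else {
                        comments.add(line.substring(pos, end));
                        inBlock = false;
                        pos = end + 2;
                    }
                } else {
                    int slashes = line.indexOf("//", pos);
                    int block = line.indexOf("/*", pos);
                    if (slashes == -1 && block == -1) {
                        break; // no comment left on this line
                    }
                    if (slashes != -1 && (block == -1 || slashes < block)) {
                        // a // comment runs to the end of the line
                        comments.add(line.substring(slashes + 2));
                        break;
                    }
                    inBlock = true;
                    pos = block + 2;
                }
            }
        }
        return comments;
    }

    public static void main(String[] args) {
        List<String> source = List.of(
            "int a = 1; // first",
            "/* start",
            "middle",
            "end */ int b = 2;",
            "// /* that's it");
        extractComments(source).forEach(System.out::println);
    }
}
```

Whichever marker comes first on a line wins, so a line such as // /* that's it is reported as one // comment and does not open a block comment.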

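StringTokenizer takes a set of delimiter characters, not a single string delimiter, so it is splitting on every '/' char. It also never returns empty tokens: runs of consecutive delimiters are skipped. For "// /* cdcdf" the first token is therefore the single space between the "//" and the "/*", which is why you see a blank line.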
If you just want the rest of the line after the "//", you could use:
if(line.startsWith("//")) {
line = line.substring(2);
}
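To see the delimiter behaviour for yourself, a small sketch like this one lists every token the tokenizer produces. DelimiterDemo and tokens are names invented for this example; only StringTokenizer itself comes from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterDemo {

    // Returns every token StringTokenizer produces for the given delimiters.
    static List<String> tokens(String line, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delimiters);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // "//" is read as the character set {'/'}, so every single slash splits the line.
        for (String token : tokens("// /* cdcdf", "//")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

The brackets make the first token visible: it is just the space between the // and the /*, and the second token is "* cdcdf" with the slash of /* swallowed as a delimiter.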

In addition to @jtahlborn's answer: you can check all of the tokens by iterating over them:
e.g:
...
StringTokenizer st1 = new StringTokenizer(line, "//");
while (st1.hasMoreTokens()){
System.out.println("token found:" + st1.nextToken());
}
...

If you are reading line by line, the StringTokenizer doesn't do much in your code. Try changing the body of the if like this:
if (line.trim().startsWith("//")) { // true only if the line starts with //, i.e. a comment line
// do stuff with the line
String withoutSlashes = line.trim().replace("//", " "); // replaces every // in the line with a space
String withoutPrefix = line.trim().substring(2); // removes only the leading //
}
Note: always use trim() to remove whitespace at the beginning and end of the string.
To split the line on // use:
line.split("//")
For more general purpose,check out :
Java - regular expression finding comments in code

Related

reading int from text line by line

I have a 3-line text file of ints that I need to put into some variables. I want to access them one by one, like an array. I can read the first line but don't know how to go to the next line. I know it's a stupid thing, but I'm stuck.
public void Load () throws IOException {
BufferedReader in = new BufferedReader(new FileReader("prova.txt"));
String inputLine = in.readLine();
String [] fields = inputLine.split(" "); // Splits at the space
int i=0;
while(inputLine!=null) {
System.out.println(fields[i]); //prints out name
i++;
}
}
I want to access a single int from any line. Any tips?
You can get all lines from the file using Files.readAllLines() from Java 8:
List<String> lines = Files.readAllLines(Paths.get("prova.txt"));
for (String line : lines) {
String[] split = line.split(" ");
// use element access by index to do what you want
}
Also if you are familiar with stream api:
Files.lines(Paths.get("prova.txt"))
.flatMap(s -> Arrays.stream(s.split(" ")))
.forEach(System.out::println);
Use the Java NIO API.
Path myPath = Paths.get("prova.txt");
List<String> contents = Files.readAllLines(myPath);
for(String line : contents) {
System.out.println(line);
}
You have to iterate twice : once over the lines in the files (for example using Files.lines(...)) and then over the fields in the line (with say a for loop).
Something like so :
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
public class Snippet {
public static void main(String[] args) throws IOException {
new Snippet().Load();
}
public void Load() throws IOException {
Files.lines(new File("prova.txt").toPath()).forEach(line -> {
String[] fields = line.split("\\s"); // Splits at the space
for (String field : fields) {
System.out.println(field);
}
System.out.println();
});
}
}
HTH!
The problem is that you read only the first line and then try to print everything on it by incrementing i indefinitely. Since inputLine never changes, the loop never ends, and fields[i] eventually throws an ArrayIndexOutOfBoundsException. I have kept the same approach as yours below. Let me know if you have any questions.
public class Snippet {
public static void Load() throws IOException {
BufferedReader in = new BufferedReader(new FileReader("prova.txt"));
String inputLine = in.readLine();
// Splits at the space
while (inputLine != null) {
int i = 0;
String[] fields = inputLine.split(" ");
while (i < fields.length) {
System.out.println(fields[i]); // prints out name
i++;
}
inputLine = in.readLine();
}
}
}
What I am doing here is reading each line, splitting it on spaces, printing every field on that line, and then reading the next line at the end of the loop.

Reading unicode char codes JAVA

Hi, I'm reading a file (please use the link to see the file) that contains these rows:
U+0000
U+0001
U+0002
U+0003
U+0004
U+0005
using this code
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
public class fgenerator {
public static void main(String[] args) {
try(BufferedReader br = new BufferedReader(new FileReader(new File("C:\\UNCDUNCD.txt")))){
String line;
String[] splited;
while ((line = br.readLine()) != null){
splited = line.split(" ");
System.out.println(splited[0]);
}
}catch(Exception e) {
e.printStackTrace();
}
}
}
but output is
U+D01C
U+D01D
U+D01E
U+D01F
U+D020
U+D021
Why does this happen?
How do I get the char for each code?
Change the datatype of line to char; if that doesn't work, try String.getBytes().
I am assuming that you want to take the Unicode representation that is on each line of the file and output the actual Unicode character which the code represents.
If we start with your loop that reads each line from the file...
while ((line = br.readLine()) != null){
System.out.println( line );
}
... then what we want to do is convert the input line to the character, and print that ...
while ((line = br.readLine()) != null){
System.out.println( convert(line) ); // I just put in a method call to convert()
}
So, how do you convert(line) into a character before printing it?
As my earlier comment suggested, you want to take the numeric string that follows the U+ and convert it to an actual numeric value. That, then, is the character value you want to print.
The following is a complete program — essentially like yours but I take the filename as an argument rather than hard-coding it. I've also added skipping blank lines, and rejecting invalid strings -- printing a blank space instead.
Reject the line if it does not match the U+nnnn form of a Unicode representation — match against "(?i)U\\+[0-9A-F]{4}", which means:
(?i) - ignore case
U\\+ - match U+, where the + has to be escaped to be a literal plus
[0-9A-F] - match any character 0-9 or A-F (ignoring case)
{4} - exactly 4 times
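As a quick check of that pattern, a sketch like the following (CodePointPattern and isCodePointLine are names made up for this example) shows which lines it accepts. String.matches requires the whole line to match, so anything after the four hex digits causes a rejection.

```java
public class CodePointPattern {

    // Same pattern as described above: "U+" followed by exactly four hex digits.
    static final String PATTERN = "(?i)U\\+[0-9A-F]{4}";

    // True only if the entire line has the U+nnnn form.
    static boolean isCodePointLine(String line) {
        return line.matches(PATTERN);
    }

    public static void main(String[] args) {
        String[] samples = {"U+0041", "u+00e9", "U+41", "U+0041 # LATIN CAPITAL LETTER A"};
        for (String s : samples) {
            System.out.println(s + " -> " + isCodePointLine(s));
        }
    }
}
```

The first two samples are accepted and the last two are rejected: one has too few digits, the other carries a trailing comment.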
With your update that includes a linked sample file, which includes # comments, I have modified my original program (below) so it will now strip comments and then convert the remaining representation.
This is a complete program that can be run as:
javac Reader2.java
java Reader2 inputfile.txt
I tested it with a subset of your file, starting inputfile.txt at line 1 with U+0000 and ending at line 312 with U+0138
import java.io.*;
public class Reader2
{
public static void main(String... args)
{
final String filename = args[0];
try (BufferedReader br = new BufferedReader(
new FileReader(new File( filename ))
)
)
{
String line;
while ((line = br.readLine()) != null) {
if (line.trim().length() > 0) { // skip blank lines
//System.out.println( convert(line) );
final Character c = convert(line);
if (Character.isValidCodePoint(c)) {
System.out.print ( c );
}
}
}
System.out.println();
}
catch(Exception e) {
e.printStackTrace();
}
}
private static char convert(final String input)
{
//System.out.println("Working on line: " + input);
if (! input.matches("(?i)U\\+[0-9A-F]{4}(\\s+#.*)?")) { // the trailing # comment is optional
System.err.println("Rejecting line: " + input);
return ' ';
}
else {
//System.out.println("Accepting line: " + input);
}
// else
final String stripped = input.replaceFirst("\\s+#.*$", "");
final Integer cval = Integer.parseInt(stripped.substring(2), 16);
//System.out.println("cval = " + cval);
return (char) cval.intValue();
}
}
Original program that assumed a line consisted only of U+nnnn is here.
You would run this as:
javac Reader.java
java Reader input.txt
import java.io.*;
public class Reader
{
public static void main(String... args)
{
final String filename = args[0];
try (BufferedReader br = new BufferedReader(
new FileReader(new File( filename ))
)
)
{
String line;
while ((line = br.readLine()) != null) {
if (line.trim().length() > 0) { // skip blank lines
//System.out.println( line );
// Write all chars on one line rather than one char per line
System.out.print ( convert(line) );
}
}
System.out.println(); // Print a newline after all chars are printed
}
catch(Exception e) { // don't catch plain `Exception` IRL
e.printStackTrace(); // don't just print a stack trace IRL
}
}
private static char convert(final String input)
{
// Reject any line that doesn't match U+nnnn
if (! input.matches("(?i)U\\+[0-9A-F]{4}")) {
System.err.println("Rejecting line: " + input);
return ' ';
}
// else convert the line to the character
final Integer cval = Integer.parseInt(input.substring(2), 16);
//System.out.println("cval = " + cval);
return (char) cval.intValue();
}
}
Try it using this as your input file:
U+0041
bad line
U+2718
U+00E9
u+0073
Redirect standard error when you run it (java Reader input.txt 2> /dev/null), or comment out the System.err.println line.
You should get this output: A ✘és

Can i read file in java and print its contents without comment statements?

When I read a Java file as tokens and print its content using BufferedReader and StringTokenizer, how can I print only the content, without the comments that begin with "//" or are enclosed in "/* */"?
I want to print the content of the file without these statements, which are only there to clarify the code.
You can do that very easily using JavaParser: just parse the code specifying that you want to ignore comments and then dump the AST
CompilationUnit cu = JavaParser.parse(reader, false /*considerComments*/);
String codeWithoutComments = cu.toString();
While dumping it will reformat the code.
1. If you want to remove the comments yourself, you can:
remove // comments: see the same question here, no regex needed: Find single line comments in byte array
remove /* */ comments: this is more difficult. A regex could work, but it can be painful; I don't recommend it.
2. Use a Java parser: Java : parse java source code, extract methods
javaparser for example: https://github.com/javaparser/javaparser
Then iterate over the parsed code and remove the comments, etc.
This code removes the comment text from a text file, but it does not remove the comment symbols. If you need to remove them too, edit the three functions below. This is the test case I used:
// helloworld
/* comment */
a /* comment */
b
/*
comment
*/
c
d
e
// xxxx
f // xxxx
The Output will be:
//
/* */
a /* */
b
/*
*/
c
d
e
//
f //
In this program I didn't remove the comment symbols because I was writing a lexical analyzer. You can remove them by editing the statements I have marked with comments.
import java.io.BufferedReader;
import java.io.FileReader;
public class testSpace {
public static void main(String[] args) {
try {
String filePath = "C:\\Users\\Sibil\\eclipse-workspace\\Assignment1\\src\\Input.txt";
FileReader fr = new FileReader(filePath);
String line;
BufferedReader br = new BufferedReader(fr);
int lineNumber = 0;
while ((line = br.readLine()) != null) {
lineNumber++;
if ((line.contains("/*") && line.contains("*/")) || (line.contains("//"))) {
line = findreplacement(line);
System.out.println(line);//Comment that starts and ends on this line
} else if (line.contains("/*")) {
line = getStartString(line);
System.out.println(line);
while ((line = br.readLine()) != null) {
lineNumber++;
if (line.contains("*/")) {
line = getEndString(line);
System.out.println(line);//Print the end of a multiline comment
break;
} else {
line = " ";
System.out.println(line);//Blank Space for commented line inside a multiline comment
}
}
} else
System.out.println(line);//Line without comment
}
} catch (Exception e) {
System.out.println(e);
}
}
private static String getEndString(String s) {
int end = s.indexOf("*/");
String lineEnd = s.substring(end, s.length());//Edit here if you don't need the comment symbol by subtracting 2 or adding 2
return lineEnd;
}
private static String getStartString(String s) {
int start = s.indexOf("/*");
String lineStart = s.substring(0, start + 2);//Edit here if you don't need the comment symbol by subtracting 2 or adding 2
return lineStart;
}
private static String findreplacement(String s) {
String line = "";
if (s.contains("//")) {
int start = s.indexOf("//");
line = s.substring(0, start + 2);//Edit here if you don't need the comment symbol by subtracting 2 or adding 2
} else if ((s.contains("/*") && s.contains("*/"))) {
int start = s.indexOf("/*");
int end = s.indexOf("*/");
String lineStart = s.substring(0, start + 2);//Edit here if you don't need the comment symbol by subtracting 2 or adding 2
String lineEnd = s.substring(end, s.length());//Edit here if you don't need the comment symbol by subtracting 2 or adding 2
line = lineStart + " " + lineEnd;
}
return line;
}
}
If your file has a line like this,
System.out.println("Hello World/*Do Something */");
it will fail, because the /* inside the string literal is treated as the start of a comment, and the output will be:
System.out.println("Hello World/* */");
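One way around the string-literal failure shown above is to scan the source character by character and track whether you are in code, in a literal, or in a comment. The following is only a rough sketch under that assumption: CommentStripper and strip are invented names, the comment markers are removed along with the comment text, and Java text blocks (""") are not handled. For anything serious, a real parser such as JavaParser is still the safer choice.

```java
public class CommentStripper {

    private enum State { CODE, STRING, CHAR, LINE_COMMENT, BLOCK_COMMENT }

    // Removes // and /* */ comments but leaves comment markers that appear
    // inside string and character literals untouched.
    static String strip(String source) {
        StringBuilder out = new StringBuilder();
        State state = State.CODE;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            char next = i + 1 < source.length() ? source.charAt(i + 1) : '\0';
            switch (state) {
                case CODE:
                    if (c == '/' && next == '/') {
                        state = State.LINE_COMMENT;
                        i++; // skip the second slash
                    } else if (c == '/' && next == '*') {
                        state = State.BLOCK_COMMENT;
                        i++; // skip the asterisk
                    } else {
                        if (c == '"') state = State.STRING;
                        else if (c == '\'') state = State.CHAR;
                        out.append(c);
                    }
                    break;
                case STRING:
                case CHAR:
                    out.append(c);
                    if (c == '\\' && next != '\0') {
                        out.append(next); // copy the escaped character as-is
                        i++;
                    } else if ((state == State.STRING && c == '"')
                            || (state == State.CHAR && c == '\'')) {
                        state = State.CODE; // closing quote
                    }
                    break;
                case LINE_COMMENT:
                    if (c == '\n') { // the comment ends; keep the line break
                        state = State.CODE;
                        out.append(c);
                    }
                    break;
                case BLOCK_COMMENT:
                    if (c == '*' && next == '/') {
                        state = State.CODE;
                        i++; // skip the closing slash
                    } else if (c == '\n') {
                        out.append(c); // preserve the line structure
                    }
                    break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("System.out.println(\"Hello World/*Do Something */\"); // greet"));
    }
}
```

Here the text inside the string literal survives intact and only the trailing // greet is dropped.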

Splitting a text file into multiple files by specific character sequence

I have a file with the following format.
.I 1
.T
experimental investigation of the aerodynamics of a
wing in a slipstream . 1989
.A
brenckman,m.
.B
experimental investigation of the aerodynamics of a
wing in a slipstream .
.I 2
.T
simple shear flow past a flat plate in an incompressible fluid of small
viscosity .
.A
ting-yili
.B
some texts...
some more text....
.I 3
...
".I 1" indicate the beginning of chunk of text corresponding to doc ID1 and ".I 2" indicates the beginning of chunk of text corresponding to doc ID2.
What I need is to read the text between ".I 1" and ".I 2" and save it as a separate file like "DOC_ID_1.txt", then read the text between ".I 2" and ".I 3" and save it as a separate file like "DOC_ID_2.txt", and so on. Let's assume that the number of .I # markers is not known.
I have tried this but cannot finish it. Any help will be appreciated.
String inputDocFile="C:\\Dropbox\\Data\\cran.all.1400";
try {
File inputFile = new File(inputDocFile);
FileReader fileReader = new FileReader(inputFile);
BufferedReader bufferedReader = new BufferedReader(fileReader);
String line=null;
String outputDocFileSeperatedByID="DOC_ID_";
//Pattern docHeaderPattern = Pattern.compile(".I ", Pattern.MULTILINE | Pattern.COMMENTS);
ArrayList<ArrayList<String>> result = new ArrayList<> ();
int docID =0;
try {
StringBuilder sb = new StringBuilder();
line = bufferedReader.readLine();
while (line != null) {
if (line.startsWith(".I"))
{
result.add(new ArrayList<String>());
result.get(docID).add(".I");
line = bufferedReader.readLine();
while(line != null && !line.startsWith(".I")){
line = bufferedReader.readLine();
}
++docID;
}
else line = bufferedReader.readLine();
}
} finally {
bufferedReader.close();
}
} catch (IOException ex) {
Logger.getLogger(ReadFile.class.getName()).log(Level.SEVERE, null, ex);
}
You want to find the lines which match ".I n".
The regex you need is: ^\.I \d$
^ indicates the beginning of the line. Hence, if there is whitespace or text before the .I, the line will not match the regex.
\. matches a literal dot; an unescaped . would match any character.
\d indicates any digit. For the sake of simplicity, I allow only one digit in this regex.
$ indicates the end of the line. Hence, if there are characters after the digit, the line will not match the expression.
Now, you need to read the file line by line and keep a reference to the file to which you write the current line.
Reading a file line by line is much easier in Java 8 with Files.lines():
private String currentFile = "root.txt";
public static final String REGEX = "^\\.I \\d$";
public void foo() throws Exception{
Path path = Paths.get("path/to/your/input/file.txt");
Files.lines(path).forEach(line -> {
if(line.matches(REGEX)) {
//Extract the digit and update currentFile
currentFile = "DOC_ID_"+line.substring(3)+".txt";
System.out.println("Current file is now : " + currentFile);
} else {
System.out.println("Writing this line to "+currentFile + " :" + line);
//Files.write(...);
}
});
}
Note: To extract the digit, I use a raw substring(), which I consider evil, but it is easier to understand. You can do it in a better way with a Pattern and a Matcher.
With this regex: "\\.I (\\d)" (the same as before, but with parentheses that mark what you want to capture). Then:
Pattern pattern = Pattern.compile("\\.I (\\d)");
Matcher matcher = pattern.matcher(".I 3");
if(matcher.find()) {
System.out.println(matcher.group(1));//display "3"
}
Look up regex, Java has inbuilt libraries for this.
https://docs.oracle.com/javase/tutorial/essential/regex/
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
These links will give you a starting point. Effectively, you can use a counter and perform a pattern match against each line, storing everything between the first match and the second match. This information can then be written to a separate file using the Formatter class,
documented here:
http://docs.oracle.com/javase/7/docs/api/java/util/Formatter.html
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
public class Test {
/**
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException {
// TODO Auto-generated method stub
String inputFile="C:\\logs\\test.txt";
BufferedReader br = new BufferedReader(new FileReader(new File(inputFile)));
String line=null;
StringBuilder sb = new StringBuilder();
int count=1;
try {
while((line = br.readLine()) != null){
if(line.startsWith(".I")){
if(sb.length()!=0){
File file = new File("C:\\logs\\DOC_ID_"+count+".txt");
PrintWriter writer = new PrintWriter(file, "UTF-8");
writer.println(sb.toString());
writer.close();
sb.delete(0, sb.length());
count++;
}
continue;
}
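sb.append(line).append(System.lineSeparator()); // keep the original line breaks
}
// write the last chunk, which is not followed by another ".I" line
if(sb.length()!=0){
File file = new File("C:\\logs\\DOC_ID_"+count+".txt");
PrintWriter writer = new PrintWriter(file, "UTF-8");
writer.print(sb.toString());
writer.close();
}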
} catch (Exception ex) {
ex.printStackTrace();
}
finally {
br.close();
}
}
}
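Putting the ideas from the answers above together, the grouping step can be kept separate from the file writing, which makes it easier to test. This is a sketch only: DocSplitter and split are hypothetical names, and, unlike the single-digit regex above, the pattern here assumes IDs may have several digits (\d+).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DocSplitter {

    // ".I" followed by a space and one or more digits; the dot is escaped.
    private static final Pattern HEADER = Pattern.compile("^\\.I (\\d+)$");

    // Groups the lines that follow each ".I n" header under "DOC_ID_n.txt".
    static Map<String, List<String>> split(List<String> lines) {
        Map<String, List<String>> docs = new LinkedHashMap<>();
        List<String> current = null; // lines before the first header are ignored
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                current = new ArrayList<>();
                docs.put("DOC_ID_" + m.group(1) + ".txt", current);
            } else if (current != null) {
                current.add(line);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> input = List.of(".I 1", ".T", "first title", ".I 2", ".T", "second title");
        split(input).forEach((name, body) -> System.out.println(name + " -> " + body));
        // To write real files, each entry could be passed to
        // java.nio.file.Files.write(Paths.get(name), body).
    }
}
```

Because the header line itself decides the file name, no counter is needed and the last document is handled like every other one.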

How to use string tokenizer when reading in from a file?

I am implementing a RPN calculator in Java and need help creating a class to parse the equations into separate tokens.
My input file will have an unknown number of equations similar to the ones shown below:
49+62*61-36
4/64
(53+26)
0*72
21-85+75-85
90*76-50+67
46*89-15
34/83-38
20/76/14+92-15
I have already implemented my own generic stack class to be used in the program, but I am now trying to figure out how to read data from the input file. Any help appreciated.
I've posted the source code for my stack class at PasteBin, in case it may help.
I have also uploaded the Calculator with no filereading to PasteBin to show what I have done already.
I have now managed to read the file in and break up the tokens, thanks for the help. I am getting an error when it reaches the end of the file and was wondering how to solve that.
Here is the code:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;
public class TestClass {
static public void main(String[] args) throws IOException {
File file = new File("testEquations.txt");
String[] lines = new String[10];
try {
FileReader reader = new FileReader(file);
BufferedReader buffReader = new BufferedReader(reader);
int x = 0;
String s;
while((s = buffReader.readLine()) != null){
lines[x] = s;
x++;
}
}
catch(IOException e){
System.exit(0);
}
String OPERATORS = "+-*/()";
for (String st : lines) {
StringTokenizer tokens = new StringTokenizer(st, OPERATORS, true);
while (tokens.hasMoreTokens()) {
String token = tokens.nextToken();
if (OPERATORS.contains(token))
handleOperator(token);
else
handleNumber(token);
}
}
}
private static void handleNumber(String token) {
System.out.println(""+token);
}
private static void handleOperator(String token) {
System.out.println(""+token);
}
}
Also, how would I make sure the RPN works line by line? I am getting quite confused by the algorithms I am trying to follow.
Because all of the operators are single characters, you can instruct StringTokenizer to return them along with the numeric tokens.
String OPERATORS = "+-*/()";
String[] lines = ...
for (String line : lines) {
StringTokenizer tokens = new StringTokenizer(line, OPERATORS, true);
while (tokens.hasMoreTokens()) {
String token = tokens.nextToken();
if (OPERATORS.contains(token))
handleOperator(token);
else
handleNumber(token);
}
}
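As a self-contained illustration of the returnDelims flag used above, the sketch below collects the tokens into a list. ExpressionTokens and tokenize are names invented for this example, and the trimming of whitespace is an extra assumption that the original input may not need.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ExpressionTokens {

    static final String OPERATORS = "+-*/()";

    // Splits an infix expression into numbers and single-character operators.
    static List<String> tokenize(String expression) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true makes the tokenizer hand back the operators too
        StringTokenizer st = new StringTokenizer(expression, OPERATORS, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken().trim();
            if (!token.isEmpty()) { // ignore whitespace between tokens
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("49+62*61-36")); // [49, +, 62, *, 61, -, 36]
        System.out.println(tokenize("(53+26)"));     // [(, 53, +, 26, )]
    }
}
```

Multi-digit numbers stay together because only the operator characters act as delimiters.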
As your question has now changed completely from its original version: this is in response to your original one, which was how to use FileReader to get the values from your file.
This will put each line into a separate element of a String array. You should probably use an ArrayList instead, as it's far more flexible, but I have just done this as a quick demo. You can clean it up as you wish, although I notice the code you are using expects a String array as its input. Perhaps you could read the values initially into an ArrayList, then copy that to an array once you have all the lines. That way you can put as many lines in as you wish and keep your code flexible for changes in the number of lines in your input file.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
public class TestClass {
static public void main(String[] args) {
File file = new File("myfile.txt");
String[] lines = new String[10];
try {
FileReader reader = new FileReader(file);
BufferedReader buffReader = new BufferedReader(reader);
int x = 0;
String s;
while((s = buffReader.readLine()) != null){
lines[x] = s;
x++;
}
}
catch(IOException e){
//handle exception
}
// And just to prove we have the lines right where we want them..
for(String st: lines)
System.out.println(st);
}
}
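The ArrayList variant suggested above might look roughly like this. ReadAllLines and readLines are made-up names, and a StringReader stands in for the FileReader so the sketch runs without a file. It probably also explains the error at the end of the file in the question: with a fixed array of 10 and only 9 input lines, the last slot stays null, and passing null to StringTokenizer throws a NullPointerException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ReadAllLines {

    // Reads every line into a list, so no slot is ever left as null.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String s;
            while ((s = reader.readLine()) != null) {
                lines.add(s);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for new FileReader("myfile.txt") here.
        List<String> lines = readLines(new StringReader("4/64\n(53+26)\n0*72\n"));
        String[] asArray = lines.toArray(new String[0]); // if an array is required
        System.out.println(asArray.length + " lines: " + lines);
    }
}
```

The list grows with the input, and the copy to an array at the end has exactly as many elements as there were lines.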
You mentioned before that you were using the code on this link:
http://www.technical-recipes.com/2011/a-mathematical-expression-parser-in-java/#more-1658
This appears to already deal with operator precedence doesn't it? And with parsing each String from the array and sorting them into numbers or operators? From my quick look it at least it appears to do that.
So it looks like all you need is for your lines to be in a String array, which you then pass to the code you already have. From what I can see anyway.
Obviously this doesn't address the issue of numbers greater than 9, but hopefully it helps with the first half.
:-)
public void actionPerformed(ActionEvent e) {
double sum=0;
int count = 0 ;
try {
String nomFichier = "Fichier.txt";
FileReader fr = new FileReader(nomFichier);
BufferedReader br = new BufferedReader(fr);
String ligneLue;
do {
ligneLue = br.readLine();
if(ligneLue != null) {
StringTokenizer st = new StringTokenizer(ligneLue, ";");
String nom = st.nextToken();
String prenom = st.nextToken();
String age = st.nextToken();
String tele = st.nextToken();
String adress = st.nextToken();
String codePostal = st.nextToken();
String ville = st.nextToken();
String paye = st.nextToken();
double note = Double.parseDouble(st.nextToken());
count++;
}
}
while(ligneLue != null);
br.close();
double mediane = count / 2;
if(mediane % 2 == 0) {
JOptionPane.showMessageDialog(null, "The median in the file is " + mediane);
}
else {
mediane +=1;
JOptionPane.showMessageDialog(null, "The median in the file is " + mediane);
}
}// end try
catch(FileNotFoundException ex) {
System.out.println(ex.getMessage());
}
catch(IOException ex) {
System.out.println(ex.getMessage());
}
}
