Finding errors in LaTeX document, java

So I have a bunch of LaTeX styled documents, one looks like this...
\documentclass{article}
\usepackage{amsmath, amssymb, amsthm}
\begin{document}
{\Large \begin{center} Homework Problems \end{center}}\begin{itemize}\item\end{itemize}
\begin{enumerate}
\item Prove: For all sets $A$ and $B$, $(A - B) \cup
(A \cap B) = A$.
\begin{proof}
\begin{align}
& (A - B) \cup (A \cap B) && \\
& = (A \cap B^c) \cup (A \cap B) && \text{by
Alternate Definition of Set Difference} \\
& = A \cap (B^c \cup B) && \text{by Distributive Law} \\
& = A \cap (B \cup B^c) && \text{by Commutative Law} \\
& = A \cap U && \text{by Union with the Complement Law} \\
& = A && \text{by Intersection with $U$ Law}
\end{align}
\end{proof}
\item If $n = 4k + 3$, does 8 divide $n^2 - 1$?
\begin{proof}
Let $n = 4k + 3$ for some integer $k$. Then
\begin{align}
n^2 - 1 & = (4k + 3)^2 - 1 \\
& = 16k^2 + 24k + 9 - 1 \\
& = 16k^2 + 24k + 8 \\
& = 8(2k^2 + 3k + 1) \text{,}
\end{align}
which is certainly divisible by 8.
\end{proof}
\end{enumerate}
\end{document}
Now first I had to read through each document line by line, find all of the "\begin{BLOCK}" and "\end{BLOCK}" commands, and push the BLOCK string onto a Stack; when I found the matching "\end" I would call pop() on my Stack. I pretty much got all that done, it's just not well organized, or at least I think there is a better way to go about it than all my "if" statements. So that is my first question: is there something better than the way I did it?
Next, I want to find errors and report them. For example, if I removed the line "\begin{document}" from the text above, I want the program to run through and do everything it is supposed to, but when it reaches the line "\end{document}" it should report the missing "\begin" command. I got my code to handle other examples, such as removing the "\begin" commands for enumerate or itemize, but I can't get that case to work.
Finally, I want to be able to handle missing "\end" commands. I have made an attempt at it but I can't quite get the conditions right. Let's say I have this document...
\begin{argument}
\begin{Palin}No it can't. An argument is a connected series of statements intended to establish a proposition.\end{Palin}
\begin{Cleese}No it isn't.\end{Cleese}
\begin{Palin}\expression{exclamation}Yes it is! It's not just contradiction.\end{Palin}
\begin{Cleese}Look, if I argue with you, I must take up a contrary position.\end{Cleese}
\begin{Palin}Yes, but that's not just saying \begin{quotation}'No it isn't.'\end{Palin}
\begin{Cleese}\expression{exclamation}Yes it is!\end{Cleese}
\begin{Palin}\expression{exclamation}No it isn't!\end{Palin}
\begin{Cleese}\expression{exclamation}Yes it is!\end{Cleese}
\begin{Palin}Argument is an intellectual process. Contradiction is just the automatic gainsaying of any statement the other person makes.\end{Palin}
\end{argument}
You'll notice on line 6 there is a "\begin{quotation}" command without an "\end". My code when going through this particular document gives me this as output...
PARSE ERROR Line 6: Missing command \begin{Palin}.
PARSING TERMINATED!
This is obviously not true, but I don't know how to restructure my error handling to get these cases to work. Can anyone provide any help, especially with organizing this code so it is better suited to finding these issues?
-------------------------------------------CODE-------------------------------------------
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.Stack;
import java.util.StringTokenizer;
public class LaTeXParser{
public static void main(String args[]) throws FileNotFoundException{
Scanner scan = new Scanner(System.in);
Stack s = new Stack();
int lineCount = 0;
String line;
String nextData = null;
String title = null;
String fname;
System.out.print("Enter the name of the file (no extension): ");
fname = scan.next();
fname = fname + ".txt";
FileInputStream fstream = new FileInputStream(fname);
Scanner fscan = new Scanner(fstream);
System.out.println();
while(fscan.hasNextLine()){
lineCount++;
line = fscan.nextLine();
StringTokenizer tok = new StringTokenizer(line);
while(tok.hasMoreElements()){
nextData = tok.nextToken();
System.out.println("The line: "+nextData);
if(nextData.contains("\\begin") && !nextData.contains("\\end")){
if(nextData.charAt(1) == 'b'){
title = nextData.substring(nextData.indexOf("{") + 1, nextData.indexOf("}"));
s.push(title);
}
}//end of BEGIN if
if(nextData.contains("\\end") && !nextData.contains("\\begin")){
String[] theLine = nextData.split("[{}]");
for(int i = 0 ; i < theLine.length ; i++){
if(theLine[i].contains("\\end") && !s.isEmpty() && theLine[i+1].equals(s.peek())){
s.pop();
i++;
}
if(theLine[i].contains("\\end") && !theLine[i+1].equals(s.peek())){
System.out.println("PARSE ERROR Line " + lineCount + ": Missing command \\begin{" + theLine[i+1] + "}.");
System.out.println("PARSING TERMINATED!");
System.exit(0);
}
}
}//end of END if
if(nextData.contains("\\begin") && nextData.contains("\\end")){
String[] theLine = nextData.split("[{}]");
for(int i = 0 ; i < theLine.length ; i++){
if(theLine[i].contains("\\end") && theLine[i+1].equals(s.peek())){
s.pop();
}
if(theLine[i].equals("\\begin")){
title = theLine[i+1];
s.push(title);
}
}
}//end of BEGIN AND END if
}
}//end of whiles
fscan.close();
if(s.isEmpty()){
System.out.println();
System.out.println(fname + " LaTeX file is valid!");
System.exit(0);
}
while(!s.isEmpty()){
}
}
}
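One possible restructuring, offered as a sketch rather than a definitive fix: scan each line with a regular expression that finds every \begin{...} and \end{...}, and keep the open names in a Deque. In the code above, the missing \begin{document} case ends up calling s.peek() on an empty Stack, which throws EmptyStackException, and the Palin case is reported as a missing \begin because the top of the stack (quotation) is compared only once. The class name EnvironmentChecker, the method check and the message wording are my own assumptions, not part of the original program.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports unbalanced \begin{...} / \end{...} pairs.
class EnvironmentChecker {
    // Group 1 is "begin" or "end", group 2 is the environment name.
    private static final Pattern TAG = Pattern.compile("\\\\(begin|end)\\{([^}]*)\\}");

    // Returns one message per problem; an empty list means the nesting is balanced.
    static List<String> check(List<String> lines) {
        List<String> errors = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // names of currently open environments
        for (int n = 0; n < lines.size(); n++) {
            Matcher m = TAG.matcher(lines.get(n));
            while (m.find()) {
                String name = m.group(2);
                if (m.group(1).equals("begin")) {
                    open.push(name);
                } else if (!open.isEmpty() && open.peek().equals(name)) {
                    open.pop(); // well-formed pair
                } else if (open.contains(name)) {
                    // The matching \begin is deeper in the stack,
                    // so everything above it was never closed.
                    while (!open.peek().equals(name)) {
                        errors.add("Line " + (n + 1) + ": Missing command \\end{" + open.pop() + "}.");
                    }
                    open.pop();
                } else {
                    // No open environment with this name at all.
                    errors.add("Line " + (n + 1) + ": Missing command \\begin{" + name + "}.");
                }
            }
        }
        // Anything still open at end of file is missing its \end.
        while (!open.isEmpty()) {
            errors.add("End of file: Missing command \\end{" + open.pop() + "}.");
        }
        return errors;
    }
}
```

Because the matcher finds every tag on a line, there is no need for separate "begin only", "end only" and "both" branches. Collecting messages instead of calling System.exit(0) is a design choice of this sketch; it lets one run report several problems.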

Related

How to write a Java method which escapes all the whitespaces, newline characters and comments?

I am trying to write a parser for a programming language and I need to skip all whitespace, newlines and comments (in the /* and // styles) in a file and stop just before the next token. Assume that the file does not have any syntax errors.
This is what I have tried, but I do not think it is the simplest solution. Any help is appreciated.
reader.charPrefetch() gets one character but does not move the reading head in file
reader.charPrefetch(2) gets the 2nd character counting from reading head
private void passCommentsAndWhitespaces(){
char endOfFileChar = (char) 65535;
String twoCharSeq = "" + reader.charPrefetch() + reader.charPrefetch(2);
if(twoCharSeq.charAt(0) == endOfFileChar){
return;
}
while (twoCharSeq.charAt(0) == ' ' || twoCharSeq.charAt(0) == '\n' ||
twoCharSeq.equals("//") || twoCharSeq.equals("/*")){
if(twoCharSeq.equals("/*")){
twoCharSeq = "" + reader.readChar(3) + reader.charPrefetch();
while (!twoCharSeq.equals("*/")){
twoCharSeq = "" + reader.readChar() + reader.charPrefetch();
}
reader.readChar();
}else if(twoCharSeq.equals("//")){
reader.readLine();
}else {
twoCharSeq = "" + reader.readChar() + reader.charPrefetch();
}
}
}
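A simpler shape for the same idea, as a sketch under assumptions: the API of the reader object above is not shown, so this version uses the standard java.io.PushbackReader instead, with a pushback buffer of two characters playing the role of charPrefetch. The names CommentSkipper, skip and rest are made up for the example.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper built on PushbackReader (not the asker's reader class).
class CommentSkipper {
    // Consumes whitespace, line comments and block comments,
    // leaving the reader positioned just before the next token.
    // The reader must have a pushback buffer of at least 2 characters.
    static void skip(PushbackReader in) throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            if (Character.isWhitespace(c)) {
                continue; // plain whitespace, keep going
            }
            if (c != '/') {
                in.unread(c); // start of a real token
                return;
            }
            int next = in.read();
            if (next == '/') {
                // Line comment: discard up to and including the newline.
                while ((c = in.read()) != -1 && c != '\n') { }
            } else if (next == '*') {
                // Block comment: discard until a '*' is directly followed by '/'.
                int prev = 0;
                while ((c = in.read()) != -1 && !(prev == '*' && c == '/')) {
                    prev = c;
                }
            } else {
                // A lone '/' is a token of its own: restore both characters.
                if (next != -1) {
                    in.unread(next);
                }
                in.unread(c);
                return;
            }
        }
    }

    // Convenience wrapper for experiments: returns what is left of src after skipping.
    static String rest(String src) {
        try (PushbackReader in = new PushbackReader(new StringReader(src), 2)) {
            skip(in);
            StringBuilder sb = new StringBuilder();
            for (int ch; (ch = in.read()) != -1; ) {
                sb.append((char) ch);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A single loop that decides per character what to do avoids the need to keep a two-character string in sync by hand, which is where the original version gets complicated.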

Java: IndexOf(String string) that returns wrong character

I am writing a file browser program that displays file directory path while a user navigates between folders/files.
I have the following String as file path:
"Files > Cold Storage > C > Capital"
I am using Java's indexOf(String) method to get the index of the 'C' character between > C >, but it returns the first occurrence, which is the C in the word Cold.
I need the standalone 'C' that sits between > C >.
This is my code:
StringBuilder mDirectoryPath = new StringBuilder("Files > Cold Storage > C > Capital");
String mTreeLevel = "C";
int i = mDirectoryPath.indexOf(mTreeLevel);
if (i != -1) {
mDirectoryPath.delete(i, i + mTreeLevel.length());
}
I need a flexible solution that also works for other, similar paths.
Any help is appreciated!
A better approach would be to use a List of Strings.
public void test() {
List<String> directoryPath = Arrays.asList("Files", "Cold Storage", "C", "Capital");
int cDepth = directoryPath.indexOf("C");
System.out.println("cDepth = " + cDepth);
}
Search for the first occurrence of " C ":
String mTreeLevel = " C ";
int i = mDirectoryPath.indexOf(mTreeLevel);
Then add 1 to get the index of 'C' itself (assuming the String you searched for was found).
If you only want to delete the single 'C' character:
if (i >= 0) {
mDirectoryPath.delete(i + 1, i + 2);
}
EDIT:
If searching for " C " may still return the wrong occurrence, search for " > C > " instead.
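The two answers above can be combined into one small sketch: split the path on the " > " separator so that each level is compared as a whole, remove the level, and join the rest back together. PathLevels and removeLevel are hypothetical names, and the sketch assumes the separator is always exactly " > ".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: treats the path as a list of levels instead of raw characters.
class PathLevels {
    // Removes the first level that is exactly equal to `level` and rebuilds the path.
    // If no level matches, the path comes back unchanged.
    static String removeLevel(String path, String level) {
        List<String> parts = new ArrayList<>(Arrays.asList(path.split(" > ")));
        parts.remove(level); // List.remove(Object) compares whole elements, so "Cold Storage" is safe
        return String.join(" > ", parts);
    }
}
```

Calling PathLevels.removeLevel("Files > Cold Storage > C > Capital", "C") gives "Files > Cold Storage > Capital", because "C" is only ever compared with complete levels.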

Recursive command parser that solves a repeat statement

I am building a parser that recognizes simple commands such as "DOWN.", "UP." and "REP 3.". It must be able to parse the commands rather freely. It should be legal to write
"DOWN % asdf asdf asdf
."
Here % starts a comment and the full stop signifies end-of-command. The full stop can be on the next line.
This is all well and good so far; however, I'm struggling with the REP part (REP stands for repeat).
I should be able to issue a command as follows:
DOWN .DOWN. REP 3 " DOWN. DOWN.
DOWN . % hello this is a comment
REP 2 " DOWN. ""
This should give me 17 DOWNs. The semantics of repeat are as follows: REP x " commands ", where x is the number of times the commands listed inside the quotation marks are repeated. Note that REP can be nested inside of REP. The following code handles the DOWN command. The incoming text is read from System.in or a text file.
public void repeat(String workingString) {
if (workingString.matches(tokens)) {
if (workingString.matches("REP")) {
repada();
} else
if (workingString.matches("(DOWN).*")) {
String job = workingString.substring(4);
job = job.trim();
if (job.equals("")) {
String temp= sc.next();
temp= temp.trim();
// Word after DOWN.
if (temp.matches("\\.")) {
leo.down();
// If word after DOWN is a comment %
} else if (temp.matches("%.*")) {
boolean t = comment();
} else {
throw new SyntaxError();
}
} else if (job.matches("\\..*")) {
workingString += job;
System.out.println("Confirm DOWN with .");
}
} else if (workingString.matches("\\.")) {
instructions += workingString;
System.out.println("Fullstop");
} else if (workingString.matches("%.*")) {
comment();
} else {
// work = sc.next();
work = work.trim().toUpperCase();
System.out.println(work);
}
} else {
System.out.println("No such token: " + workingString);
}
}
I got a working start on the repeat function:
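public String repada() {
    String times = sc.next().trim(); // trim() returns a new String, so keep the result
    if (times.matches("%.*")) {
        comment();
        times = sc.next().trim();
    }
    String quote = sc.next().trim();
    if (quote.matches("%.*")) {
        comment();
        quote = sc.next().trim();
    }
    String repeater = "";
    System.out.println("REP " + times + " " + quote);
    return repeater; // nothing is collected yet; the body of the repeat is still to be parsed
}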
However I'm thinking my whole system of doing things might need a rework. Any advice on how I could more easily solve this issue would be greatly appreciated!
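One way to rework this, sketched under assumptions: strip the comments first, split the rest into tokens, and let a single recursive method parse a list of commands. A nested REP then simply calls the same method again for its quoted body. The grammar is inferred from the example above (DOWN and UP end with a full stop, REP is followed by a number and a quoted body with no full stop), and the names RepParser, parse and commands are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical recursive-descent parser for the DOWN / UP / REP language.
class RepParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private RepParser(String source) {
        // Drop comments (% to end of line), then split into words, numbers
        // and single symbols such as '.' and '"'.
        String noComments = source.replaceAll("%[^\\n]*", "");
        Matcher m = Pattern.compile("[A-Za-z]+|\\d+|\\S").matcher(noComments);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    // Parses the whole input and returns the fully expanded command list.
    static List<String> parse(String source) {
        RepParser p = new RepParser(source);
        List<String> result = p.commands();
        if (p.pos < p.tokens.size()) {
            throw new IllegalArgumentException("Unmatched quote");
        }
        return result;
    }

    // commands := { DOWN "." | UP "." | REP number '"' commands '"' }
    // Stops at a closing quote or at the end of the input.
    private List<String> commands() {
        List<String> out = new ArrayList<>();
        while (pos < tokens.size() && !tokens.get(pos).equals("\"")) {
            String word = tokens.get(pos++);
            if (word.equals("DOWN") || word.equals("UP")) {
                expect(".");
                out.add(word);
            } else if (word.equals("REP")) {
                int times = Integer.parseInt(next());
                expect("\"");
                List<String> body = commands(); // recursion handles nested REP
                expect("\"");
                for (int i = 0; i < times; i++) {
                    out.addAll(body);
                }
            } else {
                throw new IllegalArgumentException("No such token: " + word);
            }
        }
        return out;
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        return tokens.get(pos++);
    }

    private void expect(String wanted) {
        String got = next();
        if (!got.equals(wanted)) {
            throw new IllegalArgumentException("Expected " + wanted + " but found " + got);
        }
    }
}
```

Feeding the example program from the question to RepParser.parse yields a list of 17 DOWN entries: 2 from the first two commands plus 3 times (3 + 2) from the outer REP. Handling comments and whitespace before parsing means the command logic never has to ask whether the next word is a comment.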

Reading a file and displaying wanted results

I have a program that reads files like the one below.
12 9-62-1
Sample Name: 9-62-1 Injection Volume: 25.0
Vial Number: 37 Channel: ECD_1
Sample Type: unknown Wavelength: n.a.
Control Program: Anions Run Bandwidth: n.a.
Quantif. Method: Anions Method Dilution Factor: 1.0000
Recording Time: 10/2/2013 19:55 Sample Weight: 1.0000
Run Time (min): 14.00 Sample Amount: 1.0000
No. Ret.Time Peak Name Height Area Rel.Area Amount Type
min µS µS*min % mG/L
1 2.99 Fluoride 7.341 1.989 0.87 10.458 BMB
2 3.88 Chloride 425.633 108.551 47.72 671.120 BMb
3 4.54 Nitrite 397.537 115.237 50.66 403.430 bMB
4 5.39 n.a. 0.470 0.140 0.06 n.a. BMB
5 11.22 Sulfate 4.232 1.564 0.69 13.064 BMB
Total: 835.213 227.482 100.00 1098.073
From these files, the program should output only a few fields, not everything.
The final results that I need should look like this:
0012.TXT
Sample#,Date,Time,Peak Name, Amount
9-62-1,10/2/2013,19:55,Fluoride,10.458
9-62-1,10/2/2013,19:55,Chloride,671.120
9-62-1,10/2/2013,19:55,Nitrite,403.430
9-62-1,10/2/2013,19:55,Sulfate,13.064
But, right now they look like this:
0012.TXT
Sample#,Date,Time,Peak Name, Amount
9-62-1,10/2/2013,19:55,Fluoride,10.458 ,
Chloride,671.120 ,
Nitrite,403.430 ,
n.a.,n.a.,
Sulfate,13.064 ,
,1098.073 ,
Here is my code and what I have done.
Scanner input = new Scanner(new FileReader(selectFile.getSelectedFile()));
System.out.println("Sample#,Date,Time,Peak Name,Amount");
int linesToSkip = 28;
BufferedReader br = new BufferedReader(new FileReader(selectFile.getSelectedFile()));
String line;
while ( (line = br.readLine()) != null) {
if (linesToSkip-- > 0) {
continue;
}
if (line.contains("n.a.")) {
continue;
}
if (line.contains("Total")) {
continue;
}
String[] values = line.split("\t");
int index = 0;
for (String value : values) {
/*System.out.println("values[" + index + "] = " + value);*/
index++;
}
while (input.hasNext()) {
String word = input.next();
Pattern pattern1 = Pattern.compile("Name:");
Pattern pattern2 = Pattern.compile("Time:");
Matcher matcher1 = pattern1.matcher(word);
Matcher matcher2 = pattern2.matcher(word);
Matcher matcher3 = pattern2.matcher(word);
if(matcher1.matches()){
System.out.print(input.next() + ",");
}
if(matcher2.matches()){
System.out.print(input.next() + ",");
}
if(matcher3.matches()){
System.out.print(input.next() + ",");
}
System.out.print("");
}
System.out.print(values[2]+",");
System.out.println(values[6]+"\b,");
}
br.close();
How can I make the output look like this, with the sample#, date and time followed by the peak name and amount on every line?
Sample#,Date,Time,Peak Name, Amount
9-62-1,10/2/2013,19:55,Fluoride,10.458
9-62-1,10/2/2013,19:55,Chloride,671.120
9-62-1,10/2/2013,19:55,Nitrite,403.430
9-62-1,10/2/2013,19:55,Sulfate,13.064
Thanks!
Something like:
while ( (line = br.readLine()) != null) {
if (line.contains("n.a.")) {
continue;
}
//Your code
You can do the same in your inner while loop for a specific table item, such as the peak name and amount values. In that case you can use the String#equals() method.
Edit for Comments:
You are overcomplicating the way you read and print the file content. Don't use both Scanner and BufferedReader; one will do the work for you.
Your file has a very specific format. You really don't need regex for this purpose, which is what you are using in your inner while loop.
To match the sample name, use the String#equals() method and do your operations accordingly.
Get the values you need from the upper section of your file, like Sample Name and Recording Time, and keep them handy so that you can use them later.
From the lower section, get the Peak Name and Amount from each row.
When printing, construct your String from these collected values.
Another Edit for Comments:
the following code is not tested, so there could be some issues, but you can figure them out.
If you look at String Class then you will find many useful methods.
BufferedReader br = new BufferedReader(new FileReader(selectFile.getSelectedFile()));
String sample = "", recTime = "";
String line = br.readLine();
if (line != null) {
    // The first line holds the sample number in its second tab-separated field
    String[] first = line.split("\t");
    if (first.length > 1) {
        sample = first[1].trim();
    }
}
System.out.println("Sample#,Date,Time,Peak Name,Amount");
while ((line = br.readLine()) != null) {
    // Skip rows without a usable amount before looking at the columns
    if (line.contains("n.a.") || line.contains("Total")) {
        continue;
    }
    String[] values = line.split("\t");
    if (line.startsWith("Recording Time") && values.length > 1) {
        // Assumption: the field looks like "10/2/2013 19:55"; the space becomes the Date,Time comma
        recTime = values[1].trim().replace(' ', ',');
    } else if (values.length > 6 && values[0].trim().matches("\\d+")) {
        // Data row (starts with the peak number): column 2 is Peak Name, column 6 is Amount
        System.out.println(sample + "," + recTime + "," + values[2].trim() + "," + values[6].trim());
    }
}
br.close();
I hope this helps. Good Luck.

Having trouble reading files in java

I have some trouble reading a file in Java.
This is what the file looks like:
Answer 1:
1. This is an apple
2. Something
Answer 2:
1. This is a cool website
2. I love banana
3. This is a table
4. Programming is fun
Answer 3.
1. Hello World
....
What I want to do is separate them into two items:
One is the answer number; the other is the list of answers.
So assume I have an object class called Answer with:
String of answer number
List of answers.
This is what I have done so far to debug my code before I put it into the object class, but I'm not able to get the correct result.
public void reader(String file) throws FileNotFoundException, IOException {
FileReader fR = new FileReader(file);
BufferedReader bR = new BufferedReader(fR);
String line = null;
int count = 0 ;
String blockNum = "";
String printState = "" ;
while ((line = bR.readLine()) != null) {
if(line.contains("Answer")){
//System.out.println("Contain Answer statement: " + line);
count++;
blockNum = line;
printState = "";
}
else{
//System.out.println("No Answer Statement: " + line);
printState += line + " / " ;
}
System.out.println( count + " " + blockNum + " " + printState );
}
// Close the input stream
bR.close();
fR.close();
}
I'm pretty sure I did something stupid while coding. I'm not sure how to read the file so that the answers are kept separate.
Right now the output looks like this:
1 Answer 1:
1 Answer 1: 1. This is an apple /
1 Answer 1: 1. This is an apple / 2. Something /
2 Answer 2:
2 Answer 2: 1. This is a cool website /
2 Answer 2: 1. This is a cool website / 2. I love banana /
2 Answer 2: 1. This is a cool website / 2. I love banana / 3. This is a table /
2 Answer 2: 1. This is a cool website / 2. I love banana / 3. This is a table / 4. Programming is fun /
3 Answer 3.
3 Answer 3. 1. Hello World /
But I want the output to be something like this:
1 Answer 1: 1. This is an apple / 2. Something /
2 Answer 2: 1. This is a cool website / 2. I love banana / 3. This is a table / 4. Programming is fun /
3 Answer 3. 1. Hello World /
You are printing a line of output for each line of input you read. Try moving the println inside the part of the loop that checks for answer to make sure you print each answer/answer value set only once. E.g.:
if(line.contains("Answer")) {
if (!printState.isEmpty()) {
System.out.println(count + " " + blockNum + " " + printState);
}
...
}
EDIT: You will also need to print when you exit the while loop to make sure you print the last answer/answer value set.
One solution is to use a boolean flag, printTime.
boolean printTime = false;
...
if(line.contains("Answer")) {
if (printTime) {
System.out.println( count + " " + blockNum + " " + printState );
printTime=false;
}
...
}else{
//System.out.println("No Answer Statement: " + line);
printState += line + " / " ;
printTime=true; // you have one answer
}
...
Add one extra print after the while loop for the last answer.
That way you can collect several printState entries for one answer on one line.
But the more idiomatic Java way to handle this is to create your objects:
List<Answer> listOfAnswer = new LinkedList<Answer>();
Answer answer = null;
...
if(line.contains("Answer")){
//System.out.println("Contain Answer statement: " + line);
count++;
answer = new Answer(line);
listOfAnswer.add(answer);
}
else if (answer != null) {
answer.add(line); // assumes Answer has an add(String) method that appends to its list of answers
}
}
...
...
and only after print them out :)
System.out.println(listOfAnswer.toString());
A simpler solution is to use a
Map<Integer, LinkedList<String>> answers = new LinkedHashMap<Integer, LinkedList<String>>();
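The map idea above could look roughly like this. It is a sketch, not the answerer's code: it keys the map by the header line itself rather than by an Integer, uses a regular expression instead of contains("Answer") so that an option mentioning the word "Answer" is not mistaken for a header, and the names AnswerGrouper and group are invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups option lines under their "Answer N:" header.
class AnswerGrouper {
    // Each header line ("Answer 1:", "Answer 3.") becomes a key;
    // the lines that follow it become that key's list.
    static Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> answers = new LinkedHashMap<>(); // keeps file order
        List<String> current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.matches("Answer \\d+[:.]")) {
                current = new ArrayList<>();
                answers.put(line, current);
            } else if (current != null && !line.isEmpty()) {
                current.add(line); // lines before the first header are ignored
            }
        }
        return answers;
    }
}
```

The lines can come from Files.readAllLines or from a BufferedReader loop. Printing then happens once per map entry after reading has finished, so there is no special case for the last answer.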
The main output issue is that you are printing inside the line-iterating loop (while).
To solve it, you can follow what @Albion answered and change where the print is done inside the while, but as the EDIT on his answer states, there is a flaw: you will have to print an extra time after the loop in order to get the correct result.
There is an alternative though, which is to not print inside the loop at all; I consider this the correct approach in your case. To do it, you need little more than a StringBuilder instead of a String!
I have also spotted some needless variables, and there is a flaw in using if(line.contains("Answer")): if the string "Answer" appears inside one of the option texts, the check returns true and messes up your results, for example:
Answer 1:
1. This is an apple
2. Capitalizing Is The Answer!
3. Something
Will output:
1 Answer 1: 1. This is an apple /
2 2. Capitalizing Is The Answer! 3. Something
In most cases, the best approach to finding a dynamic pattern (as yours is, since the number changes and, in the last Answer, the ':' becomes a '.') is to use a regular expression. I used this: if(line.matches("Answer \\d[\\:\\.]")), and if you are not yet used to it, see the Pattern docs, as String.matches() is something you will probably use a lot when processing text.
Explaining every single change is a little troublesome, and the code is simple enough for you to master after you analyse it, so I'll simply post what my approach would be:
public static void StackAnswer(String file) throws FileNotFoundException, IOException{
BufferedReader br = new BufferedReader(new FileReader(file));
StringBuilder output = new StringBuilder();
int count = 0;
while(br.ready()){
String line = br.readLine().trim();
if(line.matches("Answer \\d[\\:\\.]")){
count++;
output.append(System.lineSeparator()).append(count).append(' ').append(line);
} else {
output.append(" / ").append(line);
}
}
System.out.println(output.toString().trim());
br.close();
}
Good luck!
See working example:
public void reader(String file) throws FileNotFoundException,
IOException {
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "";
while (reader.ready()) {
line = reader.readLine();
if (!line.contains("Answer")) {
System.out.print(line + " / ");
} else {
System.out.println();
System.out.print(line + " ");
}
}
}
