I decided to create a currency converter in Java, and have it so that it would pull the conversion values out of a text file (to allow for easy editability since these values are constantly changing). I did manage to do it by using the Scanner class and putting all the values into an ArrayList.
Now I'm wondering if there is a way to add comments to the text file for the user to read, which Scanner will ignore. "//" doesn't seem to work.
Thanks
The best way would be to read the file line by line using java.io.BufferedReader and check each line for comments using String#startsWith(), searching for "//".
But have you considered using a properties file and managing it with the java.util.Properties API? This way you benefit from a ready-made specification and API, and you can use # to start a comment line. Also see the tutorial at sun.com.
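A minimal sketch of the properties approach, assuming the rates are stored as CODE=rate pairs. The class name, the keys and the rate values are made up for illustration; in real use you would pass something like a FileReader for your own file instead of the StringReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class RatesFromProperties {

    // Loads key=value pairs; Properties.load skips lines whose first
    // non-blank character is '#' or '!'.
    static Properties loadRates(Reader source) throws IOException {
        Properties rates = new Properties();
        try (Reader in = source) {
            rates.load(in);
        }
        return rates;
    }

    public static void main(String[] args) throws IOException {
        // Sample content only; the values are placeholders, not real rates.
        String sample = "# Rates relative to USD (sample values)\n"
                + "EUR=0.92\n"
                + "GBP=0.79\n";
        Properties rates = loadRates(new StringReader(sample));
        double eur = Double.parseDouble(rates.getProperty("EUR"));
        System.out.println("EUR rate: " + eur);
    }
}
```

Values come back as Strings, so each rate still has to be parsed with Double.parseDouble (or BigDecimal if rounding matters).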
Scanner won't ignore anything; you will have to remove the comments from your data after you have read it in.
Yea, while ((currentLine = bufferedReader.readLine()) != null) is possibly the easiest, then perform your necessary tests. currentLine.split(regex) is also very handy for converting a line into an array of values using a delimiter.
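Putting the readLine() loop, the startsWith("//") check and split() together could look roughly like this. It is only a sketch: the class name, the "CODE, rate" line layout and the sample values are assumptions, not something from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class CommentSkippingReader {

    // Returns one String[] per data line, skipping blank lines and "//" comments.
    static List<String[]> readRows(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String currentLine;
            while ((currentLine = reader.readLine()) != null) {
                String trimmed = currentLine.trim();
                if (trimmed.isEmpty() || trimmed.startsWith("//")) {
                    continue; // comment or blank line: nothing to parse
                }
                // Split on commas, tolerating spaces around them
                rows.add(trimmed.split("\\s*,\\s*"));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Sample content only; in real use pass a FileReader for your file.
        String sample = "// rates relative to USD (sample values)\n"
                + "EUR, 0.92\n"
                + "\n"
                + "GBP,0.79\n";
        for (String[] row : readRows(new StringReader(sample))) {
            System.out.println(row[0] + " -> " + Double.parseDouble(row[1]));
        }
    }
}
```

This only treats whole-line comments as comments; a "//" after a value on the same line would still end up in the data.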
With Java NIO, you could do something like this, assuming you want to ignore lines that start with "//" and end up with an ArrayList.
List<String> dataList;
Path path = FileSystems.getDefault().getPath(".", "data.txt");
// Files.lines keeps the file open until the stream is closed, so close it explicitly
try (Stream<String> lines = Files.lines(path)) {
    dataList = lines
            .filter(line -> !line.startsWith("//"))
            .collect(Collectors.toCollection(ArrayList::new));
}
Related
I would like to store some String in a file and then read it back again. The problem is Strings could be anything for instance it could even be something like "Entry1","Entry2" for one field. So if I simply check commas and split Strings accordingly to that it will definitely fail.
Is there any built-in Java class that handles situations like that? If not how can I make a simple CSV parser in Java?
You might want to have a look at this specification for CSV. Bear in mind that there is no officially recognized specification. You can also try this parser.
There is an Apache Commons library for CSV too that can help.
If you do not know the delimiter, it will not be possible to do this, so you have to find it out somehow. If the delimiter can vary, your only hope is to deduce it from the formatting of the known data. When Excel imports CSV files it lets the user choose the delimiter, and this is a solution you could use as well.
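One crude way to deduce the delimiter from known data is to count how often each candidate character appears in a sample line and pick the most frequent one. This is only a heuristic sketch with made-up names; it ignores quoting and can easily guess wrong on free text, so letting the user confirm the choice (as Excel does) is still safer.

```java
public class DelimiterGuess {

    // Returns the candidate that occurs most often in the sample line.
    // On a tie, the candidate listed first wins.
    static char guessDelimiter(String sampleLine, char[] candidates) {
        char best = candidates[0];
        long bestCount = -1;
        for (char candidate : candidates) {
            long count = sampleLine.chars().filter(ch -> ch == candidate).count();
            if (count > bestCount) {
                bestCount = count;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        char[] candidates = {',', ';', '\t'};
        System.out.println(guessDelimiter("name;street;city,state", candidates));
    }
}
```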
I would recommend openCSV: http://opencsv.sourceforge.net/
I have used it for numerous Java projects requiring CSV support, both reading and writing. A simple example of how it writes a CSV from the docs:
CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"), ',');
// feed in your array (or convert your data to an array)
String[] entries = "first,second,third".split(",");
writer.writeNext(entries);
writer.close();
Assuming you can make a String[] out of your data it's that simple.
To deal with commas in your entries, you'd need to quote the entire entry:
"Make,Take,Break","Top,Right,Left,Bottom"
With OpenCSV you can provide a quote character; you just pass it in the constructor:
CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"), ',', '"');
That should take care of the needs you listed.
I'm currently writing something which is validating our vbscript files. Right at the start I wish to remove all lines of code which are comments. I was expecting to be able to use the "'" (comment symbol in vbscript) and '\n'. However, when I write the content of the file to screen, the new lines are not formatting. Does this mean there are actually no new lines in the original vbscript file and if not, how could I remove comments?
First, read the whole file into a string, then use a regex or simply substring to remove the extra syntax.
How are you parsing the file? Are you also taking the '\r' into consideration when removing the comments? Or maybe you are accidentally removing all newline characters.
I would create some state flags to tell the parser when I was in a comment or not.
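A rough sketch of the state-flag idea, written in Java purely for illustration (the question does not say which language the validator uses). It assumes a comment starts at the first ' that is not inside a double-quoted string literal; Rem comments are not handled, and the class and method names are invented.

```java
public class VbCommentStripper {

    // Returns the part of the line before the first comment marker (')
    // that is not inside a string literal.
    static String stripComment(String line) {
        boolean inString = false; // state flag: are we inside "..."?
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // A doubled quote ("") inside a literal toggles twice,
                // so the flag ends up unchanged, which is what we want.
                inString = !inString;
            } else if (c == '\'' && !inString) {
                return line.substring(0, i);
            }
        }
        return line;
    }

    // Strips comments from a whole script and drops lines that end up empty.
    static String stripComments(String source) {
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\\R")) { // \R matches \r\n, \r or \n
            String code = stripComment(line).replaceAll("\\s+$", "");
            if (!code.isEmpty()) {
                out.append(code).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String script = "' header comment\r\nx = 1 ' set x\r\nMsgBox \"don't stop\"\r\n";
        System.out.print(stripComments(script));
    }
}
```

Splitting on \R rather than '\n' alone means Windows-style \r\n line endings, which are common in VBScript files, do not leave stray '\r' characters behind.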
What if I have a file that I am using a StringTokenizer on to get the values between commas? It's a CSV file. Here is some sample input:
test,first,second,,fourth,fifth
So how can I catch that empty field between the commas? Right now it's just pretending nothing is there. It doesn't even see that there is a place with nothing in it.
Using String#split() would be recommended over StringTokenizer.
String[] s = "test,first,second,,fourth,fifth".split(",");
System.out.println(Arrays.asList(s));
System.out.println(s.length);
// output:
// [test, first, second, , fourth, fifth]
// 6
Also, if you have much more involved CSV parsing in your code, if possible, try using an existing library like JavaCSV.
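One caveat with split(","): empty fields in the middle are kept, but trailing empty fields are dropped unless you pass a negative limit. A small sketch (class and method names are only for illustration):

```java
import java.util.Arrays;

public class SplitTrailingEmpty {

    // A negative limit tells split() to keep trailing empty strings.
    static String[] splitKeepingEmpty(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "test,first,,";
        // Default split: the two trailing empty fields are discarded
        System.out.println(line.split(",").length);
        // Negative limit: the trailing empty fields are preserved
        System.out.println(splitKeepingEmpty(line).length);
        System.out.println(Arrays.asList(splitKeepingEmpty(line)));
    }
}
```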
I am not sure if I am understanding your question correctly. I would use well-known packages like opencsv.
The split technique works great, so long as none of your elements have a comma inside them. You can use existing libraries. I've also had good results using regular expressions for CSV processing.
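If elements can contain commas, a hand-rolled alternative to split() is a small character-by-character parser that tracks whether it is inside quotes. The sketch below assumes the common convention that fields are wrapped in double quotes and a literal quote is written as ""; it handles a single line only (no line breaks inside fields), and the names are invented. For anything more involved, a library is still the better choice.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleCsvLine {

    // Splits one CSV line into fields, honoring double-quoted fields.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas in here are plain data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("test,first,second,,fourth,fifth"));
        System.out.println(parseLine("\"Make,Take,Break\",\"Top,Right,Left,Bottom\""));
    }
}
```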
I am trying to read a simple text file into a String. Of course there is the usual way of getting the input stream, iterating with readLine() and reading the contents into a String.
Having done this hundreds of times in the past, I just wondered how I can do this in the minimum number of lines of code. Isn't there something in Java like String fileContents = XXX.readFile(myFile/*File*/), or rather anything that looks as simple as this?
I know there are libraries like Apache Commons IO that provide such simplifications, and I could even write a simple utility class to do this. But what I wonder is: this is such a frequent operation that everyone needs, so why doesn't Java provide a simple function for it? Isn't there really a single method somewhere to read a file into a String with a default or specified encoding?
Yes, you can do this in one line (though for robust IOException handling you wouldn't want to).
String content = new Scanner(new File("filename")).useDelimiter("\\Z").next();
System.out.println(content);
This uses a java.util.Scanner, telling it to delimit the input with \Z, which is the end of the string anchor. This ultimately makes the input have one actual token, which is the entire file, so it can be read with one call to next().
There is a constructor that takes a File and a String charSetName (among many other overloads). These two constructors may throw FileNotFoundException, but like all Scanner methods, no IOException can be thrown beyond these constructors.
You can query the Scanner itself through the ioException() method if an IOException occurred or not. You may also want to explicitly close() the Scanner after you read the content, so perhaps storing the Scanner reference in a local variable is best.
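A slightly longer sketch that follows the advice above: it keeps the Scanner in a local variable, closes it via try-with-resources, passes an explicit charset name, and checks ioException() afterwards. The class and method names are made up. The hasNext() check is there because next() throws NoSuchElementException when the file is empty.

```java
import java.io.File;
import java.io.IOException;
import java.util.Scanner;

public class ScannerReadAll {

    // Reads the whole file into one String using the \Z delimiter trick.
    static String readAll(File file, String charsetName) throws IOException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            scanner.useDelimiter("\\Z");
            // An empty file has no token at all, so guard the next() call
            String content = scanner.hasNext() ? scanner.next() : "";
            // Scanner swallows IOExceptions while reading; surface them here
            IOException failure = scanner.ioException();
            if (failure != null) {
                throw failure;
            }
            return content;
        }
    }

    public static void main(String[] args) throws IOException {
        // "filename" is a placeholder, as in the one-liner above
        System.out.println(readAll(new File("filename"), "UTF-8"));
    }
}
```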
See also
Java Tutorials - I/O Essentials - Scanning and formatting
Related questions
Validating input using java.util.Scanner - has many examples of more typical usage
Third-party library options
For completeness, these are some really good options if you have these very reputable and highly useful third party libraries:
Guava
com.google.common.io.Files contains many useful methods. The pertinent ones here are:
String toString(File, Charset)
Using the given character set, reads all characters from a file into a String
List<String> readLines(File, Charset)
... reads all of the lines from a file into a List<String>, one entry per line
Apache Commons/IO
org.apache.commons.io.IOUtils also offers similar functionality:
String toString(InputStream, String encoding)
Using the specified character encoding, gets the contents of an InputStream as a String
List readLines(InputStream, String encoding)
... as a (raw) List of String, one entry per line
Related questions
Most useful free third party Java libraries (deleted)?
From Java 7 (API Description) onwards you can do:
new String(Files.readAllBytes(Paths.get(filePath)), StandardCharsets.UTF_8);
Where filePath is a String representing the file you want to load.
You can use Apache Commons IO:
FileInputStream fisTargetFile = new FileInputStream(new File("test.txt"));
String targetFileStr = IOUtils.toString(fisTargetFile, "UTF-8");
This should work for you:
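import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
public class ReadFile {
    public static void main(String[] args) throws IOException {
        // Decode explicitly as UTF-8 rather than relying on the platform default charset
        String content = new String(Files.readAllBytes(Paths.get("abc.java")), StandardCharsets.UTF_8);
    }
}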
Using Apache Commons IO.
import org.apache.commons.io.FileUtils;
//...
String contents = FileUtils.readFileToString(new File("/path/to/the/file"), "UTF-8");
You can see the Javadoc for the method for details.
Don't write your own util class to do this - I would recommend using Guava, which is full of all kinds of goodness. In this case you'd want either the Files class (if you're really just reading a file) or CharStreams for more general purpose reading. It has methods to read the data into a list of strings (readLines) or totally (toString).
It has similar useful methods for binary data too. And then there's the rest of the library...
I agree it's annoying that there's nothing similar in the standard libraries. Heck, just being able to supply a Charset to a FileReader would make life a little simpler...
Another alternative approach is:
How do I create a Java string from the contents of a file?
Another option is to use the utilities provided by open source libraries:
http://commons.apache.org/io/api-1.4/index.html?org/apache/commons/io/IOUtils.html
Why doesn't Java provide such a common utility API?
a) to keep the APIs generic so that encoding, buffering etc is handled by the programmer.
b) make programmers do some work and write/share opensource util libraries :D ;-)
Sadly, no.
I agree that such a frequent operation should have an easier implementation than copying the input line by line in a loop, but you'll have to either write a helper method or use an external library.
I discovered that the accepted answer actually doesn't always work, because \\Z may occur in the file. Another problem is that if you don't have the correct charset a whole bunch of unexpected things may happen which may cause the scanner to read only a part of the file.
The solution is to use a delimiter which you are certain will never occur in the file. However, this is theoretically impossible. What we CAN do, is use a delimiter that has such a small chance to occur in the file that it is negligible: such a delimiter is a UUID, which is natively supported in Java.
String content = new Scanner(file, "UTF-8")
.useDelimiter(UUID.randomUUID().toString()).next();
How would you parse a structure similar to this in Java?
\\Header (name)\\\
1JohnRide 2MarySwanson
1 password1
2 password2
\\\1 block of data name\\\
1.ABCD
2.FEGH
3.ZEY
\\\2-nd block of data name\\\
1. 123232aDDF dkfjd ksksd
2. dfdfsf dkfjd
....
etc
Suppose it comes from a text buffer (plain file).
Each line of text is "\n"-delimited. Spaces separate the words.
The structure is more or less defined. There may sometimes be ambiguity, though: the number of fields in each line may differ, some blocks of data may be missing, and the number of lines in each block may vary as well.
The question is how to do it most effectively?
The first solution that comes to mind is to use regular expressions.
But are there other solutions? Problem-oriented? Maybe some java library already written?
Check out UTAH: https://github.com/sonalake/utah-parser
It's a tool that's pretty good at parsing this kind of semi structured text
As no one recommended any library, my suggestion would be : use REGEX.
From what you have posted it looks like the data is delimited by whitespace. One idea is to use a Scanner or a StringTokenizer to get one token at a time. You can then check the first char of a token to see if it is a digit (in which case the part of the token after the digit(s) will be the data, if there is any).
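A small sketch of that token-by-token idea using Scanner. It assumes every entry of interest starts with one or more digits, as in "1JohnRide"; tokens that do not start with a digit are simply skipped here, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class NumberedTokens {

    // Returns {number, data} pairs for every whitespace-separated token
    // that starts with a digit. The data part may be empty.
    static List<String[]> parse(String text) {
        List<String[]> entries = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                String token = scanner.next();
                if (!Character.isDigit(token.charAt(0))) {
                    continue; // not a numbered entry
                }
                // Find where the leading run of digits ends
                int end = 0;
                while (end < token.length() && Character.isDigit(token.charAt(end))) {
                    end++;
                }
                entries.add(new String[] {token.substring(0, end), token.substring(end)});
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        for (String[] entry : parse("1JohnRide 2MarySwanson")) {
            System.out.println(entry[0] + " = " + entry[1]);
        }
    }
}
```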
This sounds like a homework problem so I'm going to try to answer it in such a way to help guide you (not give the final solution).
First, you need to consider each object of data you're reading. Is it a number then a text field? A number then 3 text fields? Variable numbers and text fields?
After that you need to determine what you're going to use to delimit each field and each object. For example, in many files you'll see something like a semi-colon between the fields and a new line for the end of the object. From what you said it sounds like yours is different.
If an object can go across multiple lines you'll need to bear that in mind (don't stop partway through an object).
Hopefully that helps. If you research this and you're still having problems post the code you've got so far and some sample data and I'll help you to solve your problems (I'll teach you to fish....not give you fish :-) ).
If the fields are fixed length, you could use a DataInputStream to read your file. Or, since your format is line-based, you could use a BufferedReader to read lines and write yourself a state machine which knows what kind of line to expect next, given what it's already seen. Once you have each line as a string, then you just need to split the data appropriately.
E.g., the password can be gotten from your password line like this:
final int pos = line.indexOf(' ');
String passwd = line.substring(pos+1, line.length());
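The BufferedReader state machine mentioned above could be sketched like this. It assumes that any line starting with a backslash is a block header and that every other non-blank line belongs to the most recent header; the class name and the choice of a Map of lists are just for illustration. Each collected line can then be split further, for example with the indexOf/substring approach shown for the password line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BlockParser {

    // Groups data lines under the header line that precedes them.
    static Map<String, List<String>> parse(Reader source) throws IOException {
        Map<String, List<String>> blocks = new LinkedHashMap<>();
        List<String> currentBlock = null; // state: block the next data line goes into
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // tolerate blank lines between blocks
                }
                if (line.startsWith("\\")) {
                    // Header line: drop the backslashes, keep the name
                    String name = line.replace("\\", "").trim();
                    currentBlock = new ArrayList<>();
                    blocks.put(name, currentBlock);
                } else if (currentBlock != null) {
                    currentBlock.add(line.trim());
                }
                // Data lines before the first header are ignored in this sketch
            }
        }
        return blocks;
    }

    public static void main(String[] args) throws IOException {
        String sample = "\\\\Header (name)\\\\\\\n"
                + "1 password1\n"
                + "2 password2\n"
                + "\\\\\\1 block of data name\\\\\\\n"
                + "1.ABCD\n"
                + "2.FEGH\n";
        parse(new StringReader(sample))
                .forEach((name, lines) -> System.out.println(name + " -> " + lines));
    }
}
```

Because missing blocks and varying line counts just mean fewer or shorter map entries, this layout copes with the ambiguity described in the question without special cases.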