This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Scanner vs. BufferedReader
Is there any situation in which it's appropriate to use java.util.Scanner to read input of some sort? In my small test I've found it to be incredibly slow compared to java.io.BufferedReader or implementing your own reader on top of java.io.InputStreamReader.
So is there any reason as to why I would want to use a Scanner?
From the docs:
A simple text scanner which can parse primitive types and strings
using regular expressions.
A BufferedReader won't do that.
The main purpose of the Scanner class is to parse text into primitive types and strings using regular expressions. It can read from several kinds of sources, such as a File, an InputStream, a Readable or a String.
While Scanner is slower, it is often more than fast enough, and it is much more powerful than BufferedReader.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Scanner;

public static void main(String... args) throws IOException {
    // Build 10,000 short lines of test input in memory.
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
        sb.append("line: ").append(i).append("\n");
    String lines = sb.toString();
    // Run both tests several times; the last iteration's numbers are the ones quoted.
    for (int i = 0; i < 6; i++) {
        testBufferedReader(lines);
        testScanner(lines);
    }
}

private static void testBufferedReader(String text) throws IOException {
    int count = 0;
    BufferedReader br = new BufferedReader(new StringReader(text));
    long start = System.nanoTime();
    while (br.readLine() != null)
        count++;
    long time = System.nanoTime() - start;
    System.out.printf("BufferedReader.readLine() took an average of %,d ns count=%,d%n", time / count, count);
}

private static void testScanner(String text) throws IOException {
    int count = 0;
    Scanner sc = new Scanner(new StringReader(text));
    long start = System.nanoTime();
    while (sc.hasNextLine()) {
        sc.nextLine();
        count++;
    }
    long time = System.nanoTime() - start;
    System.out.printf("Scanner nextLine took an average of %,d ns count=%,d%n", time / count, count);
}
finally prints
BufferedReader.readLine() took an average of 124 ns count=10,000
Scanner nextLine took an average of 1,549 ns count=10,000
While the relative difference is large, Scanner still takes less than a couple of microseconds per line.
Scanner: many of its methods are built around the idea of parsing the input stream into tokens. BufferedReader doesn't rely on breaking up its input into tokens. It lets you read character by character if you want, or read an entire line and do what you want with it.
A Scanner has more built into it: it can do everything a BufferedReader can do, though usually with more overhead per read. In addition, a Scanner can parse the underlying stream for primitive types and strings using regular expressions. It can also tokenize the underlying stream with the delimiter of your choice, and it can scan forward in the underlying stream disregarding the delimiter.
EDIT: I forgot to mention...
"A scanner however is not thread safe, it has to be externally synchronized."
Here is the difference between java.util.Scanner and java.io.BufferedReader. While both are useful for taking input from the user in a Java class, there is a significant difference that one needs to understand.
Scanner is a token-based input system that uses whitespace as the default delimiter. You can change the delimiter with useDelimiter().
BufferedReader, by contrast, is a buffered input system. It reads chunks of characters from the underlying stream into an internal buffer and serves read() and readLine() calls from that buffer, so most calls never touch the underlying stream. It hands you raw characters or lines and leaves any parsing to you.
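The tokenizing features mentioned in these answers can be illustrated with a small sketch. The class and method names here (ScannerTokenDemo, parseCsvInts, firstMatch) are made up for illustration; only useDelimiter, hasNextInt, nextInt and findInLine are actual Scanner methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; not part of any of the answers above.
class ScannerTokenDemo {

    // Splits the text on commas (with optional surrounding whitespace)
    // and collects the leading run of tokens that parse as ints.
    static List<Integer> parseCsvInts(String text) {
        List<Integer> values = new ArrayList<>();
        try (Scanner sc = new Scanner(text).useDelimiter("\\s*,\\s*")) {
            while (sc.hasNextInt()) {
                values.add(sc.nextInt());
            }
        }
        return values;
    }

    // Scans forward in the current line for the first regex match,
    // ignoring the delimiter. Returns null if nothing matches.
    static String firstMatch(String text, String regex) {
        try (Scanner sc = new Scanner(text)) {
            return sc.findInLine(regex);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseCsvInts("10, 20,30"));           // prints [10, 20, 30]
        System.out.println(firstMatch("id=42 name=x", "\\d+")); // prints 42
    }
}
```

Doing the same with a BufferedReader would mean reading the line and then applying String.split or a Pattern by hand, which is the extra work the answers refer to.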
Related
I have to read through a text file with roughly 100K words and create a HashMap with the frequency of each word. The code I have so far takes about 15-20 minutes to execute and I'm guessing I'm doing something horribly wrong.
How long should a task like this take?
This is the code I'm using:
Scanner scanner = new Scanner(new FileReader("myFile.txt"));
HashMap<String, Integer> wordFrequencies = new HashMap<>();
while (scanner.hasNextLine()) {
wordFrequencies.merge(scanner.next(), 1, (a, b) -> a + b);
}
return wordFrequencies;
It should take next-to-no-time. As in, if you're doing this just once, you should barely notice the time it takes. If it's taking 20 minutes, you're processing roughly 100 words per second, which is abysmal performance, even if your words are really long.
From the Javadoc of BufferedReader (emphasis added):
In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReaders and InputStreamReaders.
Try wrapping the FileReader in a BufferedReader:
Scanner scanner = new Scanner(new BufferedReader(new FileReader("myFile.txt")));
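Put together, the buffered version of the word count might look like the sketch below. The class name WordFrequency and the Reader parameter are assumptions made so the example can run on an in-memory string; with a real file you would pass new FileReader("myFile.txt"). The loop pairs hasNext() with next(), since hasNextLine() can still return true when only trailing whitespace is left and no token remains.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper class for illustration.
class WordFrequency {

    // Counts whitespace-separated words read from the given source.
    static Map<String, Integer> count(Reader source) {
        Map<String, Integer> frequencies = new HashMap<>();
        // The BufferedReader keeps each Scanner read from hitting the raw source directly.
        try (Scanner scanner = new Scanner(new BufferedReader(source))) {
            while (scanner.hasNext()) {
                frequencies.merge(scanner.next(), 1, Integer::sum);
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        // In-memory input stands in for the file from the question.
        Map<String, Integer> result = count(new StringReader("to be or not to be\n"));
        System.out.println(result.get("to")); // prints 2
    }
}
```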
I am trying to solve UVa problem 458 - decoder and I came up with the following algorithm which gives me the correct output for the sample input data, but runs longer than allowed.
public class Decoder {
public void decoder() {
Scanner sc = new Scanner(System.in);
while (sc.hasNext()) {
String line = sc.nextLine();
for (int i = 0; i < line.length(); i++) {
if(line.charAt(i)>=32 && line.charAt(i)<=126)
System.out.print((char) (line.charAt(i) - 7));
}
System.out.println();
}
}
}
What I've looked into
I have read the forums, and most of the solutions were pretty similar. I have been looking for a way to avoid the for loop that runs through the string and prints each new char, but this loop seems inevitable, and I believe this algorithm's time complexity is always going to be n^2.
The problem also says to change only printable ASCII values, which is why the condition checks that the value is between 32 and 126 inclusive. According to Wikipedia, that is the range of printable values.
http://ideone.com/XkByW9
Avoid decoding the stream to characters. It's ok to use bytes if you only have to support ASCII.
Read and write the data in big chunks to avoid function/system call overhead.
Avoid unnecessary allocations. Currently you are allocating new String for every line.
Do not split the input into lines to avoid bad performance for very small lines.
Example:
public static void main(String[] args) throws IOException {
byte[] buffer = new byte[2048];
while (true) {
int len = System.in.read(buffer);
if (len <= 0) {
break;
}
for (int i = 0; i < len; i++) {
...
}
System.out.write(buffer, 0, len);
}
}
This processes the input the way you would normally process a binary file. On every iteration it reads up to 2048 bytes into a buffer, processes them and writes them to standard output. The program ends when EOF is reached and read returns -1. 2048 is usually a good buffer size, but you might want to try different sizes and see which one works best.
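One way the per-byte step could be filled in is sketched below. It assumes the rule from the question's code (subtract 7 from every byte in the printable range 32 to 126) and, unlike that code, passes all other bytes such as newlines through unchanged. The class name ByteDecoder and the String convenience overload are hypothetical additions for testing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; the shift rule is taken from the question's code.
class ByteDecoder {
    static final int SHIFT = 7;

    // Reads chunks of bytes, shifts printable ASCII in place, writes each chunk out.
    static void decode(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[2048];
        int len;
        while ((len = in.read(buffer)) > 0) {
            for (int i = 0; i < len; i++) {
                byte b = buffer[i];
                if (b >= 32 && b <= 126) {
                    buffer[i] = (byte) (b - SHIFT);
                }
            }
            out.write(buffer, 0, len);
        }
        out.flush();
    }

    // Convenience overload so the logic can be tried on a String.
    static String decode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            decode(new ByteArrayInputStream(encoded.getBytes(StandardCharsets.US_ASCII)), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(decode("1JKJ")); // prints *CDC
    }
}
```

For a contest submission, the stream version would be called as decode(System.in, System.out).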
Never use Scanner for long inputs. Scanner is dramatically slower than other means of reading input in Java, such as BufferedReader. This UVa problem looks like one with quite a long input.
Possible Duplicate:
Java: Reading integers from a file into an array
I want to read integer values from a text file, say contactids.txt. In the file I have values like
12345
3456778
234234234
34234324234
I want to read them from the text file. Please help.
You might want to do something like this (if you're using Java 5 or later):
Scanner scanner = new Scanner(new File("tall.txt"));
int [] tall = new int [100];
int i = 0;
while(scanner.hasNextInt())
{
tall[i++] = scanner.nextInt();
}
Via Julian Grenier from Reading Integers From A File In An Array
You can use a Scanner and its nextInt() method.
Scanner also has nextLong() for larger integers, if needed.
Try this:
File file = new File("contactids.txt");
Scanner scanner = new Scanner(file);
while(scanner.hasNextLong())
{
// Read values here like long input = scanner.nextLong();
}
How large are the values? Scanner (available since Java 5) can read anything from int (32-bit) and long (64-bit) to BigInteger (arbitrarily large integers), using nextInt(), nextLong() and nextBigInteger().
Before Java 5 there is no Scanner. You have to read line by line (with readLine() of BufferedReader) and create a BigInteger object from each String.
Use BufferedReader's readLine() method to read each line, and parse the returned String to an int using Integer.parseInt().
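A line-by-line version using BufferedReader.readLine() might look like the sketch below. It uses Long.parseLong rather than Integer.parseInt because the sample value 34234324234 does not fit in an int. The class name ContactIdReader and the Reader parameter are assumptions for illustration; for the real file you would pass new FileReader("contactids.txt").

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration.
class ContactIdReader {

    // Reads one number per line, skipping blank lines.
    // Throws NumberFormatException if a non-blank line is not a valid long.
    static List<Long> readIds(Reader source) {
        List<Long> ids = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    ids.add(Long.parseLong(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String sample = "12345\n3456778\n234234234\n34234324234\n";
        System.out.println(readIds(new StringReader(sample)));
    }
}
```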
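I would do it nearly the same way, but with a List as the buffer for the integers that are read:
static Object[] readFile(String fileName) throws FileNotFoundException {
    // new Scanner(File) throws FileNotFoundException, so it must be declared or caught
    Scanner scanner = new Scanner(new File(fileName));
    List<Integer> tall = new ArrayList<Integer>();
    while (scanner.hasNextInt()) {
        tall.add(scanner.nextInt());
    }
    scanner.close();
    return tall.toArray();
}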
I'm practicing for a competitive tournament that will be in my faculty in a few weeks, and thus I encountered a small problem.
The competition restricts the use of java.io.* (except IOException).
I need to read input from stdin. Test cases are separated by a blank line, and the input ends at EOF.
I need a way to read the data without using java.io.
So far I have this, which works: it returns a string containing each test case, and null when I'm out of test cases.
public static String getInput() throws IOException {
int curr=0;
int prev=0;
StringBuilder sb = new StringBuilder();
while (true) {
curr = System.in.read();
if (curr == -1) {
return null; //end of data
}
if (curr == '\r') {
curr = System.in.read();
}
if (curr == prev && curr == '\n') {
return sb.toString(); //end of test case
} //else:
sb = sb.append((char)curr);
prev = curr;
}
}
Performance of the I/O doesn't matter here, so I don't mind reading only one byte at a time.
Question: Is there a more elegant (shorter and faster to code) way to achieve the same thing?
In fact, there are a few ways that you can process input in Java in competitive programming.
Approach 1: Using java.util.Scanner
This is the simplest way to read input, and it is also really straightforward to use. It can be slow if you have a huge amount of input. If your program keeps getting TLE (Time Limit Exceeded), but your program has the correct time complexity, try reading input with the second or third approach.
Initialization: Scanner sc = new Scanner(System.in);
Reading an integer: int n = sc.nextInt();
Approach 2: Using java.io.BufferedReader
Use this one if there is a huge amount of input, and when the time limit of the problem is strict. It does require some more work, involving splitting the input by spaces, or using Integer.parseInt(str); to extract integers from the input.
You can find a speed comparison here https://www.cpe.ku.ac.th/~jim/java-io.html
Initialization: BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
Reading an integer: int n = Integer.parseInt(reader.readLine());
Approach 3: Reading directly from FileDescriptor using custom reader
This is typically the fastest of the three approaches. It does require a lot of work, including implementing the reader and debugging it should any problems arise. Use this approach if the time limit is strict and you are allowed to bring code into the competition. It has been measured at roughly 2x the speed of the BufferedReader approach, so it usually does not provide a decisive advantage.
This is one implementation of such an approach written by my friend:
https://github.com/jackyliao123/contest-programming/blob/master/Utils/FastScanner.java
The usage of the reader really depends on your implementation of the reader. It is suggested to maintain one copy of the reader that is somewhat guaranteed to work, because the last thing you want in a contest is having a non-functional reader and debugging the rest of your program, thinking there are some bugs there.
Hope this helps and best wishes on your competition!
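As a rough illustration of the second approach, the sketch below wraps a BufferedReader and splits each line with StringTokenizer so that several integers per line can be read. The class name TokenReader and its methods are hypothetical and are not taken from the linked FastScanner. In a contest you would construct it with new InputStreamReader(System.in).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.StringTokenizer;

// Hypothetical token reader built on BufferedReader.
class TokenReader {
    private final BufferedReader reader;
    private StringTokenizer tokens = new StringTokenizer("");

    TokenReader(Reader source) {
        this.reader = new BufferedReader(source);
    }

    // Returns the next whitespace-separated token, or null at end of input.
    String next() {
        try {
            while (!tokens.hasMoreTokens()) {
                String line = reader.readLine();
                if (line == null) {
                    return null;
                }
                tokens = new StringTokenizer(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens.nextToken();
    }

    // Throws NumberFormatException if the input is exhausted or the token is not an int.
    int nextInt() {
        return Integer.parseInt(next());
    }

    public static void main(String[] args) {
        // In-memory input stands in for System.in here.
        TokenReader in = new TokenReader(new StringReader("3 4\n5\n"));
        System.out.println(in.nextInt() + in.nextInt() + in.nextInt()); // prints 12
    }
}
```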
You could try the following, and make it more efficient by wrapping System.in.
public static String readLine() throws IOException {
    StringBuilder sb = new StringBuilder();
    // read() returns -1 at end of stream
    for (int ch; (ch = System.in.read()) != -1;) {
        if (ch == '\r') continue;   // skip carriage returns
        if (ch == '\n') break;      // end of line
        sb.append((char) ch);       // cast so the character, not its numeric code, is appended
    }
    return sb.toString();
}
EDIT: On Oracle JVM, System.in is a BufferedInputStream which wraps a FileInputStream which wraps a FileDescriptor. All these are in java.io.
You can try using the java.util.Scanner class if java.util is allowed. It has useful methods for reading in a line, a token or even a number as needed. But it is slower than BufferedReader and possibly slower than using System.in.read() directly.
Since System.in is an InputStream, it may also be faster to use System.in.read(byte[] b) to read the input. This way you can read a bunch of bytes at a time instead of just one. But the added complexity of having to code and debug it during the contest might not be worth it.
Edit:
Searching the web I found someone discussing using System.in.read(byte[] b) in the UVa forum back when UVa had terrible Java support.
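If java.util is allowed, the blank-line-separated test cases from the question could be split with a Scanner delimiter, as in the sketch below. It imports nothing from java.io. The class name TestCaseSplitter is hypothetical, and the sketch assumes that separator lines are truly empty (no spaces or tabs on them).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper; uses only java.util.
class TestCaseSplitter {

    // Splits the input into blocks separated by one or more empty lines.
    static List<String> split(Scanner in) {
        in.useDelimiter("(\\r?\\n){2,}"); // two or more consecutive line breaks
        List<String> cases = new ArrayList<>();
        while (in.hasNext()) {
            String block = in.next().trim();
            if (!block.isEmpty()) {
                cases.add(block);
            }
        }
        return cases;
    }

    public static void main(String[] args) {
        // In-memory input stands in for new Scanner(System.in).
        List<String> cases = split(new Scanner("1 2\n3\n\n4 5\n"));
        System.out.println(cases.size()); // prints 2
    }
}
```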
You can use a scanner
import java.util.Scanner;//put this above the class
Scanner scanner = new Scanner(System.in); //this creates the scanner
int input = scanner.nextInt();
.nextInt() reads the next token as an integer
.nextLine() reads the rest of the current line as a String
In trying to resolve Facebook's Puzzle "Hoppity Hop", http://www.facebook.com/careers/puzzles.php?puzzle_id=7, I'm reading one integer only from a file. I'm wondering if this is the most efficient mechanism to do this?
private static int readSoleInteger(String path) throws IOException {
BufferedReader buffer = null;
int integer = 0;
try {
String integerAsString = null;
buffer = new BufferedReader(new FileReader(path));
// Read the first line only.
integerAsString = buffer.readLine();
// Remove any surplus whitespace.
integerAsString = integerAsString.trim();
integer = Integer.parseInt(integerAsString);
} finally {
buffer.close();
}
return integer;
}
I have seen How do I create a Java string from the contents of a file?, but I don't know the efficiency of the idiom which answers that question.
Looking at my code, it seems like a lot of lines of code and Objects for a trivial problem...
The shortest method would be with a Scanner:
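private static int readSoleInteger(String path) throws FileNotFoundException {
    // new Scanner(File) throws FileNotFoundException if the file cannot be opened
    Scanner s = new Scanner(new File(path));
    int ret = s.nextInt();
    s.close();
    return ret;
}
Note that Scanner swallows any IOExceptions thrown while reading (the last one is available through ioException()), so that simplifies things a lot. The constructor still throws FileNotFoundException if the file cannot be opened.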
As for "most efficient"... well, the simple act of opening a file from the disk is likely to be the slowest part of any method you write for this. Don't worry too much about efficiency in this case.
Edit: I hadn't realized that the integer can have whitespace on either side of it. My code does not account for this currently, but it's easy to make the Scanner skip things. I've added the line
s.skip("\\s+");
to correct this.
Edit 2: Never mind, Scanner ignores whitespace when it's trying to parse numbers:
The strings that can be parsed as numbers by an instance of this class are specified in terms of the following regular-expression grammar:
(regexes snipped)
Whitespace is not significant in the above regular expressions.
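The whitespace behaviour is easy to check with a small sketch. The class name SoleInteger and the Readable parameter are assumptions so that the example runs on an in-memory string; for a file you would pass new FileReader(path) or construct the Scanner from a File as above. It also uses try-with-resources (Java 7 or later) so the Scanner is closed even if nextInt() throws.

```java
import java.io.StringReader;
import java.util.Scanner;

// Hypothetical helper class for illustration.
class SoleInteger {

    // Reads the first integer from the source; surrounding whitespace is skipped
    // because whitespace is Scanner's default delimiter.
    static int parse(Readable source) {
        try (Scanner s = new Scanner(source)) {
            return s.nextInt();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(new StringReader("   42  \n"))); // prints 42
    }
}
```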
I would use the Scanner class:
Scanner sc = new Scanner(new File("my_file"));
int some_int = sc.nextInt();