Java: Processing Stream line by line without forEach? - java

I am new to Java and trying out Streams for the first time.
I have a large input file where there is a string on each line like:
cart
dumpster
apple
cherry
tank
laptop
...
I'm trying to read the file in as a Stream and do some analysis on the data. For example, to count all the occurrences of a particular string, I might try something like:
Stream<String> lines = Files.lines(Path.of("/path/to/input/file.txt"));
int count = 0;
lines.forEach((line) -> {
if (line.equals("tank")) {
count++;
}
});
But Java doesn't allow a local variable such as count to be modified inside the lambda.
I'm not sure if there's another way to read from the stream line by line. How would I do this properly?

You don't need a variable external to the stream. And if you have a really big file to count, long is preferable to int; it is also what count() returns:
long tanks = lines
.filter(s -> s.equals("tank"))
.count();
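For completeness, here is a minimal sketch of the same idea as a self-contained class. The class and method names (LineCounter, countMatches, countInFile) are made up for illustration; the try-with-resources block is there because a stream returned by Files.lines holds an open file until it is closed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class LineCounter {
    // Counts the lines equal to target; no external counter variable is needed.
    static long countMatches(Stream<String> lines, String target) {
        return lines.filter(target::equals).count();
    }

    // Opens the file, counts the matches, and closes the file when done.
    static long countInFile(Path file, String target) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return countMatches(lines, target);
        }
    }

    public static void main(String[] args) {
        // In-memory sample instead of a real file, so the sketch runs anywhere.
        Stream<String> sample = Stream.of("cart", "tank", "apple", "tank");
        System.out.println(countMatches(sample, "tank")); // prints 2
    }
}
```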

To iterate a stream using a regular loop, you can get an iterator from your stream and use a for-loop:
Iterable<String> iterable = lines::iterator;
for (String line : iterable) {
if (line.equals("tank")) {
++count;
}
}
But in this particular case, you could just use the stream's count method:
int count = (int) lines.filter("tank"::equals).count();
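A runnable sketch of the loop variant, assuming an in-memory stream in place of the file; StreamLoop and countWithLoop are hypothetical names. Because the loop body is ordinary code rather than a lambda, the counter can be a plain local variable.

```java
import java.util.stream.Stream;

class StreamLoop {
    // Counts matches with a plain for-loop over the stream's iterator.
    static int countWithLoop(Stream<String> lines, String target) {
        int count = 0; // an ordinary local; the loop body is not a lambda
        Iterable<String> iterable = lines::iterator;
        for (String line : iterable) {
            if (line.equals(target)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithLoop(Stream.of("tank", "cart", "tank"), "tank")); // prints 2
    }
}
```

Note that an Iterable built this way can only be traversed once, since the underlying stream is consumed by the first loop.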

You can also read the file line by line and filter or transform each line in the stream:
try (Stream<String> lines = Files.lines(Path.of("/path/to/input/file.txt"))) {
list = lines
.filter(line -> !line.startsWith("abc"))
.map(String::toUpperCase)
.collect(Collectors.toList());
} catch (IOException e) {
e.printStackTrace();
}

Related

Process string using Java streams

I am in need of some guidance. I am not sure how to read the sample text file into an array of objects using Java Streams. Does a stream provide a way to get the position of each character in a string that it reads from the file?
I am reading a file using Java I/O and then passing the content as a string to this function to create an array of Squares....
Can this array of objects be created using a Java 8 Stream? If so, how? Thank you.
Using java streams you could do something like this:
AtomicInteger row = new AtomicInteger(-1);
// count specific characters with this:
AtomicInteger someCount = new AtomicInteger();
try (Stream<String> stringStream = Files.lines(Paths.get("yourFile.txt"))) { // read all lines from file into a stream of strings
// This Function makes an array of Square objects of each line
Function<String, Square[]> mapper = (s) -> {
AtomicInteger col = new AtomicInteger();
row.incrementAndGet();
return s.chars()
.mapToObj(i -> {
// increment counter if the char fulfills condition
if((char)i == 'M')
someCount.incrementAndGet();
return new Square(row.get(), col.getAndIncrement(), (char)i);
})
.toArray(i -> new Square[s.length()]);
};
// Now streaming all lines using the mapper function from above you can collect them into a List<Square[]> and convert this List into an Array of Square objects
Square[][] squares = stringStream
.map(mapper)
.collect(Collectors.toList()).toArray(new Square[0][]);
}
Answering your second question: If you have an array of Square[] and want to find the first Square with val == 'M' you could do this:
Optional<Square> optSquare = Stream.of(squares).flatMap(Stream::of).filter(s -> s.getVal() == 'M').findFirst();
// mySquare will be null if no Square was matching condition
Square mySquare = optSquare.orElse(null);
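As an alternative sketch, the row and column positions can come from index ranges instead of shared AtomicInteger counters. This assumes the lines are already in a List (for example from Files.readAllLines) and uses a hypothetical stand-in for the Square class, whose real definition is not shown in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class SquareGrid {
    // Hypothetical stand-in for the question's Square class.
    static final class Square {
        final int row;
        final int col;
        final char val;

        Square(int row, int col, char val) {
            this.row = row;
            this.col = col;
            this.val = val;
        }
    }

    // Builds the grid by index, so each Square knows its position
    // without any counter being mutated inside a lambda.
    static Square[][] toSquares(List<String> lines) {
        return IntStream.range(0, lines.size())
                .mapToObj(r -> {
                    String line = lines.get(r);
                    return IntStream.range(0, line.length())
                            .mapToObj(c -> new Square(r, c, line.charAt(c)))
                            .toArray(Square[]::new);
                })
                .toArray(Square[][]::new);
    }

    public static void main(String[] args) {
        Square[][] grid = toSquares(Arrays.asList("AB", "CM"));
        System.out.println(grid[1][1].val); // prints M
    }
}
```

Indexing this way also stays correct if the stream is ever made parallel, which the counter-based version does not guarantee.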

Read specific columns from a file in Java 8 using streams, and put them in a 2D array

I have an input file that looks like this
#id1 1.2 3.4
#id2 6.8 8.1
#id3 1.5 9.4
#id4 5.9 2.7
I would like to store the numbers only in a 2D array, and forget about the 1st column that contains the #id.
I also want to use streams only for that operation.
So far I have written two methods.
The first method reads the input file and stores each line in a List, as an array of strings:
private List<String[]> savefromfile(String filePath) throws IOException {
List<String[]> rowsOfFile = new LinkedList<String[]>();
try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
lines.forEach(line -> {
String rows[] = line.trim().split("\\s+");
rowsOfFile.add(rows);
});
lines.close();
}
return rowsOfFile;
}
The second method takes the List as input and returns a 2D array that contains only the number columns:
private double[][] storeAllID(List<String[]> rowsOfFile) {
int numberOfID = rowsOfFile.size();
double[][] allID = new double[numberOfID][2];
int i = 0;
for (String[] line : rowsOfFile) {
double id[] = new double[2];
id[0] = Double.parseDouble(line[1]);
id[1] = Double.parseDouble(line[2]);
allID[i++] = id;
}
return allID;
}
Is there a way to make this code more efficient? I want only one short method that reads the input file and returns a 2D array containing numbers only.
I don't think it's necessary to write 20 lines of code to do this.
You aren't really gaining any benefit on your use of a stream in savefromfile, since you are using it exactly like it was a plain for-loop. To make the code a bit cleaner, you could get rid of the local variable completely, and also the call to close() is unnecessary as you are using try-with-resources already.
private List<String[]> savefromfile(String filePath) throws IOException {
try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
return lines
.map(line -> line.trim().split("\\s+"))
.collect(Collectors.toCollection(LinkedList::new));
}
}
I don't know why you want to separate the parsing to double[][] into a separate method, as you could do it within your stream with a map:
private double[][] loadFromFile(String filePath) throws IOException {
try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
return lines
.map(line -> line.trim().split("\\s+"))
.map(line -> new double[] {
Double.parseDouble(line[1]),
Double.parseDouble(line[2])
})
.toArray(double[][]::new);
}
}
For performance, you'll just have to measure for yourself if using lower-level data types and loops would be worth the added complexity.
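To try the mapping without a file, here is a small sketch that applies the same pipeline to in-memory lines taken from the question's sample. ColumnParser and parse are hypothetical names, and the blank-line filter is an added assumption, not part of the answer above.

```java
import java.util.Arrays;
import java.util.stream.Stream;

class ColumnParser {
    // Drops the first column (the #id) and parses the next two as doubles.
    static double[][] parse(Stream<String> lines) {
        return lines
                .map(String::trim)
                .filter(line -> !line.isEmpty()) // assumption: skip blank lines
                .map(line -> line.split("\\s+"))
                .map(cols -> new double[] {
                        Double.parseDouble(cols[1]),
                        Double.parseDouble(cols[2])
                })
                .toArray(double[][]::new);
    }

    public static void main(String[] args) {
        double[][] ids = parse(Stream.of("#id1 1.2 3.4", "#id2 6.8 8.1"));
        System.out.println(Arrays.deepToString(ids)); // prints [[1.2, 3.4], [6.8, 8.1]]
    }
}
```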

Java File Parsing - Go word by word

I have a file content as Follows:
Sample.txt:
Hi my name is john
and I am an engineer. How are you
The output I want is an arrayList of string like [Hi,my,name,is,john,and,I,am,an,engineer,.,How,are,you]
The standard Java functions parse it line by line, so I would get an array containing the lines. I am confused as to which approach I should use to get the output above.
Any help is appreciated.
.nextLine() will get one whole line but .next() will go word by word
You could check out using the Scanner class with the .next() method.
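A minimal sketch of the Scanner approach, with a StringReader standing in for the file; WordScanner and words are hypothetical names. Scanner.next() splits on whitespace by default, so punctuation stays attached to the preceding word rather than becoming its own entry.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordScanner {
    // Reads whitespace-separated tokens from any Readable (a FileReader works too).
    static List<String> words(Readable source) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Hi my name is john\nand I am an engineer. How are you";
        System.out.println(words(new StringReader(text)));
        // prints [Hi, my, name, is, john, and, I, am, an, engineer., How, are, you]
    }
}
```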
This will read the file and collect all words into a list of strings.
Edit: updated to handle punctuation and the like as distinct words:
try (Stream<String> lines = Files.lines(Paths.get("/path/to/sample.txt"))) {
List<String> words = lines
.map(line -> line.split("\\b"))
.flatMap(Arrays::stream)
// pieces between words keep their whitespace, e.g. ". ", so trim them
.map(String::trim)
.filter(w -> !w.isEmpty())
.collect(Collectors.toList());
return words;
} catch (IOException e) {
// handle error
}
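To see what the word-boundary split produces, here is a sketch that runs it on the sample text from the question. WordSplitter is a hypothetical name. The sketch trims each token, because splitting on \b leaves the pieces between words with their whitespace attached (for example ". " after "engineer").

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordSplitter {
    // Splits every line at word boundaries and keeps the non-blank pieces.
    static List<String> words(Stream<String> lines) {
        return lines
                .flatMap(line -> Arrays.stream(line.split("\\b")))
                .map(String::trim)          // ". " becomes "."
                .filter(w -> !w.isEmpty())  // drop the pure-whitespace pieces
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> sample = Stream.of(
                "Hi my name is john",
                "and I am an engineer. How are you");
        System.out.println(words(sample));
        // prints [Hi, my, name, is, john, and, I, am, an, engineer, ., How, are, you]
    }
}
```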
If you are getting the strings as whole lines, but just want the words, you could use .split(" ") on each line, as this returns an array containing the individual words with no spaces. If you want to do this while reading the file, you could use something like the following...
public ArrayList<String> readWords(File file) throws IOException {
ArrayList<String> words = new ArrayList<String>();
String cLine = "";
BufferedReader reader = new BufferedReader(new FileReader(file));
while ((cLine = reader.readLine()) != null) {
for (String word : cLine.split(" ")) {words.add(word);}
}
reader.close();
return words;
}
which would return an ArrayList<String> containing all of the individual words in the file.
Hope this helps.

How to collect CSV row as array of strings using simpleflatmapper

I'm trying to collect CSV row as array of strings using simpleflatmapper:
try (Reader in = Files.newBufferedReader(Paths.get("path"))) {
return org.simpleflatmapper.csv.CsvParser
// .mapTo(String[].class)
.stream(in)
// .parallel()
// .flatMap(Arrays::stream)
.map(line -> {return new ArrayList<>(Arrays.asList(line));})
// .map(Arrays::asList)
.collect(Collectors.toList());
} catch (Exception e) {
e.printStackTrace();
}
As I debug, the line is a String[], but it holds the entire row as one element instead of many strings (one per cell). How can I get the array of cells?
The CSV file is nothing special. Ex:
a\t b\t 1\t 2
x\t y\t 3\t 4
The issue as I see in this code .map(line -> {return new ArrayList<>(Arrays.asList(line));}) that the line contains one string value that is the whole line (with tab, space, ...) instead of many strings (each string is the value of each cell).
The whole result I want is List<List<String>> (List of lines). Each line is List<String> (list of cells). The current result is list of lines (rows), each line/row is the whole string.
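Since the file is a plain delimited text file, you don't need an external library; you can read it like any other text file and split each line:
List<String[]> rows = new ArrayList<>();
try (Scanner scanner = new Scanner(new File("MyFile.csv"))) {
while (scanner.hasNextLine()) {
// split on the file's separator; use "\t" for a tab-separated file
rows.add(scanner.nextLine().split(","));
}
}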
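The same idea can be sketched with streams to get the List<List<String>> the question asks for. DelimitedRows and parse are hypothetical names, and trimming each cell is an assumption based on the sample rows, which show a space after each tab. This is a naive split: quoted fields that contain the separator are not handled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DelimitedRows {
    // Splits each line on the delimiter regex and trims every cell.
    static List<List<String>> parse(Stream<String> lines, String delimiterRegex) {
        return lines
                .map(line -> Arrays.stream(line.split(delimiterRegex))
                        .map(String::trim)
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> sample = Stream.of("a\t b\t 1\t 2", "x\t y\t 3\t 4");
        System.out.println(parse(sample, "\t")); // prints [[a, b, 1, 2], [x, y, 3, 4]]
    }
}
```

For a real file, the stream could come from Files.lines inside a try-with-resources block.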
I have found the solution:
return org.simpleflatmapper.csv.CsvParser
.separator('\t') //<-- solution
.stream(in)
.map(Arrays::asList)
.collect(Collectors.toList());
Thanks all!

How can I use Java 8 Streams with an InputStream?

I would like to wrap a java.util.stream.Stream around an InputStream to process one Byte or one Character at a time. I didn't find any simple way of doing this.
Consider the following exercise: We wish to count the number of times each letter appears in a text file. We can store this in an array so that tally[0] will store the number of times a appears in the file, tally[1] stores the number of times b appears and so on. Since I couldn't find a way of streaming the file directly, I did this:
int[] tally = new int[26];
Stream<String> lines = Files.lines(Paths.get(aFile)).map(s -> s.toLowerCase());
Consumer<String> charCount = new Consumer<String>() {
public void accept(String t) {
for(int i=0; i<t.length(); i++)
if(Character.isLetter(t.charAt(i)))
tally[t.charAt(i) - 'a']++;
}
};
lines.forEach(charCount);
Is there a way of accomplishing this without using the lines method? Can I just process each character directly as a Stream<Byte> or Stream<Character> instead of creating Strings for each line in the text file?
Can I more directly convert a java.io.InputStream into a java.util.stream.Stream?
First, you have to redefine your task. You are reading characters, hence you do not want to convert an InputStream but a Reader into a Stream.
You can’t re-implement the charset conversion that happens, e.g. in an InputStreamReader, with Stream operations as there can be n:m mappings between the bytes of the InputStream and the resulting chars.
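A small sketch of why bytes and chars do not map one to one, using UTF-8 as the example charset. CharsetMapping and utf8Bytes are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

class CharsetMapping {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "a";               // 1 byte, 1 char
        String accented = "\u00e9";       // e with acute accent: 2 bytes, 1 char
        String emoji = "\uD83D\uDE00";    // U+1F600: 4 bytes, 2 chars (a surrogate pair)

        System.out.println(utf8Bytes(ascii) + " byte(s), " + ascii.length() + " char(s)");
        System.out.println(utf8Bytes(accented) + " byte(s), " + accented.length() + " char(s)");
        System.out.println(utf8Bytes(emoji) + " byte(s), " + emoji.length() + " char(s)");
    }
}
```

A per-byte stream operation therefore cannot decide on its own where one character ends and the next begins, which is the job a Reader already does.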
Creating a stream out of a Reader is a bit tricky. You will need an iterator to specify a method for getting an item and an end condition:
PrimitiveIterator.OfInt it=new PrimitiveIterator.OfInt() {
int last=-2;
public int nextInt() {
if(last==-2 && !hasNext())
throw new NoSuchElementException();
try { return last; } finally { last=-2; }
}
public boolean hasNext() {
if(last==-2)
try { last=reader.read(); }
catch(IOException ex) { throw new UncheckedIOException(ex); }
return last>=0;
}
};
Once you have the iterator you can create a stream using the detour of a spliterator and perform your desired operation:
int[] tally = new int[26];
StreamSupport.intStream(Spliterators.spliteratorUnknownSize(
it, Spliterator.ORDERED | Spliterator.IMMUTABLE | Spliterator.NONNULL), false)
// now you have your stream and you can operate on it:
.map(Character::toLowerCase)
.filter(c -> c>='a'&&c<='z')
.map(c -> c-'a')
.forEach(i -> tally[i]++);
Note that while iterators are more familiar, implementing the new Spliterator interface directly simplifies the operation, as it doesn't require maintaining state between two methods that could be called in arbitrary order. Instead, we have just one tryAdvance method, which can be mapped directly to a read() call:
Spliterator.OfInt sp = new Spliterators.AbstractIntSpliterator(1000L,
Spliterator.ORDERED | Spliterator.IMMUTABLE | Spliterator.NONNULL) {
public boolean tryAdvance(IntConsumer action) {
int ch;
try { ch=reader.read(); }
catch(IOException ex) { throw new UncheckedIOException(ex); }
if(ch<0) return false;
action.accept(ch);
return true;
}
};
StreamSupport.intStream(sp, false)
// now you have your stream and you can operate on it:
…
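Putting the spliterator and the tally together, a complete sketch might look like the following. ReaderChars, chars and tally are hypothetical names, a StringReader stands in for the real input, and Long.MAX_VALUE is used as the size estimate here because the length is unknown (the snippet above uses 1000L).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.IntConsumer;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

class ReaderChars {
    // Wraps a Reader in a sequential IntStream of chars (UTF-16 code units).
    static IntStream chars(Reader reader) {
        Spliterator.OfInt sp = new Spliterators.AbstractIntSpliterator(
                Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
            @Override
            public boolean tryAdvance(IntConsumer action) {
                int ch;
                try {
                    ch = reader.read();
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
                if (ch < 0) {
                    return false; // end of input
                }
                action.accept(ch);
                return true;
            }
        };
        return StreamSupport.intStream(sp, false);
    }

    // Counts the letters a-z, case-insensitively.
    static int[] tally(Reader reader) {
        int[] tally = new int[26];
        chars(reader)
                .map(Character::toLowerCase)
                .filter(c -> c >= 'a' && c <= 'z')
                .map(c -> c - 'a')
                .forEach(i -> tally[i]++);
        return tally;
    }

    public static void main(String[] args) {
        int[] t = tally(new StringReader("Banana"));
        System.out.println(t[0] + " " + t[1] + " " + t[13]); // prints 3 1 2
    }
}
```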
However, note that if you change your mind and are willing to use Files.lines you can have a much easier life:
int[] tally = new int[26];
Files.lines(Paths.get(file))
.flatMapToInt(CharSequence::chars)
.map(Character::toLowerCase)
.filter(c -> c>='a'&&c<='z')
.map(c -> c-'a')
.forEach(i -> tally[i]++);
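When the input really is an InputStream rather than a file path, a similar shortcut is available through BufferedReader.lines(). The sketch below assumes the bytes are UTF-8 and uses a ByteArrayInputStream as stand-in input; InputStreamTally is a hypothetical name.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class InputStreamTally {
    // Decodes the bytes as UTF-8 (an assumption), streams the lines, and tallies a-z.
    static int[] tally(InputStream in) {
        int[] tally = new int[26];
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        reader.lines()
                .flatMapToInt(CharSequence::chars)
                .map(Character::toLowerCase)
                .filter(c -> c >= 'a' && c <= 'z')
                .map(c -> c - 'a')
                .forEach(i -> tally[i]++);
        return tally;
    }

    public static void main(String[] args) {
        byte[] data = "Banana\nab".getBytes(StandardCharsets.UTF_8);
        int[] t = tally(new ByteArrayInputStream(data));
        System.out.println(t[0] + " " + t[1] + " " + t[13]); // prints 4 2 2
    }
}
```

The sketch leaves closing the InputStream to the caller, who opened it.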
