dates.forEach(date -> {
    executeQuery(date, loads);
});

private void executeQuery(LocalDate date, ArrayList<Load> loads) {
    MapSqlParameterSource source = new MapSqlParameterSource();
    source.addValue("date", date.toString());
    Load load = namedJdbcTemplate.queryForObject(Constants.SQL_QUERY, source,
            new BeanPropertyRowMapper<>(Load.class));
    loads.add(load);
}
How can I use the streams concept for the above code?
Something like this should work:
// change your method like so
private Load executeQuery(LocalDate date) {
    MapSqlParameterSource source = new MapSqlParameterSource();
    source.addValue("date", date.toString());
    return namedJdbcTemplate.queryForObject(Constants.SQL_QUERY, source,
            new BeanPropertyRowMapper<>(Load.class));
}
// load your dates from somewhere
List<LocalDate> dates = getYourDates();

// now use the streams API to collect the query results into a new list
List<Load> loads = dates.stream()
        .map(this::executeQuery)
        .collect(Collectors.toList());
or
List<Load> loads = getYourDates().stream()
        .map(this::executeQuery)
        .collect(Collectors.toList());
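One caveat worth flagging (an assumption about your data, not something from the question): queryForObject throws EmptyResultDataAccessException when no row matches a date. If some dates may legitimately have no load, a hedged variant such as this hypothetical executeQuerySafe could skip them instead of failing the whole stream:

// Sketch of a null-safe variant; assumes a missing row per date is acceptable
private Optional<Load> executeQuerySafe(LocalDate date) {
    MapSqlParameterSource source = new MapSqlParameterSource();
    source.addValue("date", date.toString());
    try {
        return Optional.ofNullable(namedJdbcTemplate.queryForObject(Constants.SQL_QUERY, source,
                new BeanPropertyRowMapper<>(Load.class)));
    } catch (EmptyResultDataAccessException e) {
        return Optional.empty();
    }
}

List<Load> loads = getYourDates().stream()
        .map(this::executeQuerySafe)
        .filter(Optional::isPresent)
        .map(Optional::get)
        .collect(Collectors.toList());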
I am trying to write a lambda in Java that filters a list by month and adds the data from the current month to a new list, but when I try to collect the data I get a warning that the result of collect is ignored.
public String getMonthlyExpensesNew() {
    Functions functions = new Functions();
    List<ShoppingMgnt> monthlyData = new ArrayList<>();
    try {
        monthlyData = getRecordsAsList();
        monthlyData.stream().filter(date -> functions.checkForCurrentMonth(date.getPurchaseDate())).collect(Collectors.toList());
    } catch (SQLException sqlException) {
        System.err.println("Error in getMonthlyExpensesNew");
    }
    return String.valueOf(monthlyData);
}
public boolean checkForCurrentMonth(String givenDate) {
    LocalDate currentDate = LocalDate.now();
    LocalDate monthToCheck = LocalDate.parse(givenDate);
    return currentDate.getMonth().equals(monthToCheck.getMonth());
}
Your final function will be as follows:
public String getMonthlyExpensesNew() {
    Functions functions = new Functions();
    List<ShoppingMgnt> monthlyData = new ArrayList<>();
    try {
        monthlyData = getRecordsAsList();
        // put the returned list in the same defined list
        monthlyData = monthlyData.stream()
                .filter(date -> functions.checkForCurrentMonth(date.getPurchaseDate()))
                .collect(Collectors.toList());
    } catch (SQLException sqlException) {
        System.err.println("Error in getMonthlyExpensesNew");
    }
    // the return with updated list
    return String.valueOf(monthlyData);
}
Your initial code:
monthlyData.stream()
        .filter(date -> functions.checkForCurrentMonth(date.getPurchaseDate()))
        .collect(Collectors.toList());
On this line the collect operation returns a new List, but the result is discarded, which is exactly what the "collect is ignored" warning tells you. Store this List back into your monthlyData reference so it can be returned later:
monthlyData = monthlyData.stream()
        .filter(date -> functions.checkForCurrentMonth(date.getPurchaseDate()))
        .collect(Collectors.toList());
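A side note on checkForCurrentMonth (an observation, not part of the original question): it compares only the month, so a purchase from March 2023 also matches March 2024. If "the current month of the current year" is the intent, a sketch using YearMonth captures both fields at once:

// Compares year and month together, so 2023-03-15 does not match 2024-03-01
public boolean checkForCurrentMonth(String givenDate) {
    return YearMonth.from(LocalDate.parse(givenDate)).equals(YearMonth.now());
}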
I'm trying to convert this method to a reactive method:
@GetMapping(RestConstants.BASE_PATH_AUDIENCE + "/link")
public List<String> users() {
    List<String> list = new ArrayList<>();
    MongoCollection mongoCollection = mongoTemplate.getCollection("collection");
    DistinctIterable distinctIterable = mongoCollection.distinct("user_name", String.class);
    MongoCursor mongoCursor = distinctIterable.iterator();
    while (mongoCursor.hasNext()) {
        String user = (String) mongoCursor.next();
        list.add(user);
    }
    return list;
}
I have something like this, but I don't know how to convert the ArrayList to return a Mono<List<String>>:
@GetMapping(RestConstants.BASE_PATH_AUDIENCE + "/link")
public Mono<List<String>> usersReactive() {
    List<Mono<String>> list = new ArrayList<List>();
    MongoCollection mongoCollection = mongoTemplate.getCollection("collection");
    DistinctIterable distinctIterable = mongoCollection.distinct("user_name", String.class);
    MongoCursor mongoCursor = distinctIterable.iterator();
    while (mongoCursor.hasNext()) {
        String user = (String) mongoCursor.next();
        list.add(user);
    }
    return list;
}
If you really want a Mono, then just wrap the value that you want to transport in it:
return Mono.just(list);
But I doubt you really want to return a list in a Mono. Usually, reactive endpoints that return a number of items return a Flux:
return Flux.fromIterable(list);
But since your DistinctIterable is already an Iterable (the MongoCursor is just the Iterator it hands out), you can stream it directly into the Flux. This saves you from collecting all items into a list first.
return Flux.fromIterable(distinctIterable);
And finally, if you are trying to convert your application to be reactive, it is wise to use the Mongo driver with native support for reactive streams: https://docs.mongodb.com/drivers/reactive-streams/
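As a rough sketch of that last suggestion (the database field is an assumption, not something from the original code), the reactive streams driver exposes distinct values as a Publisher, which Reactor can adapt directly:

// Minimal sketch with the MongoDB Reactive Streams driver; "database" is an
// assumed injected com.mongodb.reactivestreams.client.MongoDatabase.
@GetMapping(RestConstants.BASE_PATH_AUDIENCE + "/link")
public Flux<String> usersReactive() {
    // distinct(...) returns a Publisher<String>; Flux.from adapts it,
    // so nothing is buffered into an intermediate list
    return Flux.from(database.getCollection("collection")
            .distinct("user_name", String.class));
}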
I wrote some code to look up movie names on IMDB, but if, for instance, I search for "Harry Potter", I will find more than one movie. I would like to use multithreading, but I don't have much knowledge in this area.
I am using the strategy design pattern to search across several websites, and inside one of the methods I have this code:
for (Element element : elements) {
    String searchedUrl = element.select("a").attr("href");
    String movieName = element.select("h2").text();
    if (movieName.matches(patternMatcher)) {
        Result result = new Result();
        result.setName(movieName);
        result.setLink(searchedUrl);
        result.setTitleProp(super.imdbConnection(movieName));
        System.out.println(movieName + " " + searchedUrl);
        resultList.add(result);
    }
}
which, for each element (a movie name), opens a new connection to IMDB to look up ratings and other details, on the super.imdbConnection(movieName) line.
The problem is that I would like to run all the connections at the same time, because with 5-6 movies found the process takes much longer than expected.
I am not asking for code, just ideas. I thought about creating an inner class that implements Runnable and using it, but I don't see much point in that.
How can I rewrite that loop to use multithreading?
I am using Jsoup for parsing; Element and Elements are from that library.
The simplest way is parallelStream():
List<Result> resultList = elements.parallelStream()
        .map(element -> {
            String searchedUrl = element.select("a").attr("href");
            String movieName = element.select("h2").text();
            if (movieName.matches(patternMatcher)) {
                Result result = new Result();
                result.setName(movieName);
                result.setLink(searchedUrl);
                result.setTitleProp(super.imdbConnection(movieName));
                System.out.println(movieName + " " + searchedUrl);
                return result;
            } else {
                return null;
            }
        })
        .filter(Objects::nonNull)
        .collect(Collectors.toList());
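One caveat (not in the original answer): parallelStream() runs on the shared ForkJoinPool.commonPool(), which is sized for CPU-bound work, so a handful of blocking IMDB connections can starve everything else that uses that pool. A common workaround, sketched here with an assumed pool size and a hypothetical scrapeOne helper wrapping the select/match logic above, is to run the terminal operation inside a dedicated pool:

// Run the blocking parallel stream in its own ForkJoinPool so the common
// pool stays free; the pool size of 8 is a guess to tune for your workload.
ForkJoinPool scrapePool = new ForkJoinPool(8);
try {
    List<Result> resultList = scrapePool.submit(() ->
            elements.parallelStream()
                    .map(this::scrapeOne) // hypothetical helper containing the select/match/Result logic
                    .filter(Objects::nonNull)
                    .collect(Collectors.toList())
    ).get();
    // use resultList here
} catch (InterruptedException | ExecutionException ex) {
    throw new IllegalStateException("scraping was interrupted or failed", ex);
} finally {
    scrapePool.shutdown();
}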
If you don't like parallelStream() and want to use threads, you can do this:
List<Element> elements = new ArrayList<>();

// create a function which returns an implementation of Callable
// input: Element
// output: Callable<Result>
// (written as a lambda so that super still refers to the enclosing class;
// inside an anonymous Callable, plain super would refer to Object)
Function<Element, Callable<Result>> scrapFunction = element -> () -> {
    String searchedUrl = element.select("a").attr("href");
    String movieName = element.select("h2").text();
    if (movieName.matches(patternMatcher)) {
        Result result = new Result();
        result.setName(movieName);
        result.setLink(searchedUrl);
        result.setTitleProp(super.imdbConnection(movieName));
        System.out.println(movieName + " " + searchedUrl);
        return result;
    } else {
        return null;
    }
};

// create a fixed pool of threads (one per element; cap this for large inputs)
ExecutorService executor = Executors.newFixedThreadPool(elements.size());

// submit a Callable<Result> for every Element
// by using scrapFunction.apply(...)
List<Future<Result>> futures = elements.stream()
        .map(e -> executor.submit(scrapFunction.apply(e)))
        .collect(Collectors.toList());

// collect all results from Callable<Result>
List<Result> resultList = futures.stream()
        .map(f -> {
            try {
                return f.get();
            } catch (Exception ignored) {
                return null;
            }
        })
        .filter(Objects::nonNull)
        .collect(Collectors.toList());

// shut the pool down so its threads do not keep the JVM alive
executor.shutdown();
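A related simplification (a suggestion, not part of the original answer): ExecutorService.invokeAll submits the whole batch and blocks until every task has completed, which removes the manual Future bookkeeping:

// Sketch using invokeAll; reuses the scrapFunction and executor from above.
// invokeAll blocks until all tasks are done (and throws InterruptedException,
// which the enclosing method must declare or handle).
List<Callable<Result>> tasks = elements.stream()
        .map(scrapFunction)
        .collect(Collectors.toList());

List<Result> resultList = new ArrayList<>();
for (Future<Result> future : executor.invokeAll(tasks)) {
    try {
        Result result = future.get(); // already completed; get() just unwraps
        if (result != null) {
            resultList.add(result);
        }
    } catch (InterruptedException | ExecutionException ignored) {
        // skip a failed scrape instead of aborting the whole batch
    }
}
executor.shutdown();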
I am currently trying to develop a Dataflow pipeline to replace some partitions of a partitioned table. I have a custom partition field, which is a date. The input of my pipeline is a file with potentially different dates.
I developed this pipeline:
PipelineOptionsFactory.register(BigQueryOptions.class);
BigQueryOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().as(BigQueryOptions.class);
Pipeline p = Pipeline.create(options);

PCollection<TableRow> rows = p.apply("ReadLines", TextIO.read().from(options.getFileLocation()))
        .apply("Convert To BQ Row", ParDo.of(new StringToRowConverter(options)));

ValueProvider<String> projectId = options.getProjectId();
ValueProvider<String> datasetId = options.getDatasetId();
ValueProvider<String> tableId = options.getTableId();
ValueProvider<String> partitionField = options.getPartitionField();
ValueProvider<String> columnNames = options.getColumnNames();
ValueProvider<String> types = options.getTypes();

rows.apply("Write to BQ", BigQueryIO.writeTableRows()
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withCustomGcsTempLocation(options.getGCSTempLocation())
        .to(new DynamicDestinations<TableRow, String>() {

            @Override
            public String getDestination(ValueInSingleWindow<TableRow> element) {
                TableRow date = element.getValue();
                String partitionDestination = (String) date.get(partitionField.get());
                SimpleDateFormat from = new SimpleDateFormat("yyyy-MM-dd");
                SimpleDateFormat to = new SimpleDateFormat("yyyyMMdd");
                try {
                    partitionDestination = to.format(from.parse(partitionDestination));
                    LOG.info("Table destination " + partitionDestination);
                    return projectId.get() + ":" + datasetId.get() + "." + tableId.get() + "$" + partitionDestination;
                } catch (ParseException e) {
                    e.printStackTrace();
                    return projectId.get() + ":" + datasetId.get() + "." + tableId.get() + "_rowsWithErrors";
                }
            }

            @Override
            public TableDestination getTable(String destination) {
                TimePartitioning timePartitioning = new TimePartitioning();
                timePartitioning.setField(partitionField.get());
                timePartitioning.setType("DAY");
                timePartitioning.setRequirePartitionFilter(true);
                TableDestination tableDestination = new TableDestination(destination, null, timePartitioning);
                LOG.info(tableDestination.toString());
                return tableDestination;
            }

            @Override
            public TableSchema getSchema(String destination) {
                return new TableSchema().setFields(buildTableSchemaFromOptions(columnNames, types));
            }
        })
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
);

p.run();
When I trigger the pipeline locally, it correctly replaces the partitions whose dates appear in the input file. However, when I deploy it on Google Cloud Dataflow and run the template with exactly the same parameters, it truncates all the data, and at the end I only have the file I wanted to upload in my table.
Do you know why there is such a difference?
Thank you!
You set BigQueryIO.Write.CreateDisposition to CREATE_IF_NEEDED and paired it with BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE, so even if the table exists, it may be recreated. This is why you see your table getting replaced.
See this document [1] for details.
[1] https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/io/BigQueryIO.Write.CreateDisposition#CREATE_IF_NEEDED
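If that diagnosis matches what you observe, a minimal sketch of the alternative (an assumption about your setup: the table and its DAY partitioning must already exist, since CREATE_NEVER never creates it) is to keep WRITE_TRUNCATE, which with a $yyyyMMdd decorator should only truncate the single partition being written, but stop BigQueryIO from recreating the table:

// Hedged sketch: same DynamicDestinations as in the question, but the table is
// never (re)created, so WRITE_TRUNCATE only replaces the targeted partitions.
rows.apply("Write to BQ", BigQueryIO.writeTableRows()
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
        .withCustomGcsTempLocation(options.getGCSTempLocation())
        .to(partitionDestinations)); // partitionDestinations: the DynamicDestinations instance shown above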
I have a function formatted like this:
private void verifyDatatypeTables(
        final DynamoDBMapper mapper,
        final List<Datatype> datatypeMissingEntries) {
    final List<Datatype> datatypeEntries = new ArrayList<>();
    this.mapToListDatatypeTables(datatypeEntries);
    final List<Datatype> datatypeEntriesInTable =
            this.dbUtilityDatatype.scanRecord(new DynamoDBScanExpression(), true);
}
This creates a small readability problem. I want it formatted like this:
private void verifyDatatypeTables(
        final DynamoDBMapper mapper,
        final List<Datatype> datatypeMissingEntries
) {
    final List<Datatype> datatypeEntries = new ArrayList<>();
    this.mapToListDatatypeTables(datatypeEntries);
    final List<Datatype> datatypeEntriesInTable =
            this.dbUtilityDatatype.scanRecord(new DynamoDBScanExpression(), true);
}
How can I achieve this formatting in Eclipse?
Preferences -> Java -> Code Style -> Formatter -> Edit -> Parentheses -> Method declaration -> Separate lines
combined with
Line Wrapping -> Method declarations -> Parameters -> Line wrapping policy -> Wrap all elements, every element on a new line