Functions instead of static utility methods - java

Although functions have been available in Java since Java 8, I only started playing with them recently, so please excuse me if this question sounds a little archaic.
To be clear, I am talking about a pure function in the Functional Programming sense: deterministic and free of side effects.
Say I frequently need to prepend a string with a static value, like the following:
private static Function<String, String> fnPrependString = (s) -> {
return "prefix_" + s;
};
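Using it is then just a matter of calling apply:
String result = fnPrependString.apply("abc"); // "prefix_abc"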
In the good old approach, a Helper class and its static methods would have done this job for me.
The question now is, whether I can create these functions once and reuse them just like helper methods.
One concern is thread safety, and I used the following simple JUnit test to check it:
package com.me.expt.lt.test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Consumer;
import java.util.function.Function;
import org.junit.jupiter.api.Test;
import com.vmlens.api.AllInterleavings;
public class TestFunctionThreadSafety {
private static Function<String, String> fnPrepend = (s) -> {
System.out.println(s);
return new StringBuffer("prefix_").append(s).toString();
};
@Test
public void testThreadSafety() throws InterruptedException {
try (AllInterleavings allInterleavings = new AllInterleavings(
TestFunctionThreadSafety.class.getCanonicalName());) {
ConcurrentMap<String, Integer> resultMap = new ConcurrentHashMap<String, Integer>();
while (allInterleavings.hasNext()) {
int runSize = 5;
Thread[] threads = new Thread[runSize];
ThreadToRun[] ttrArray = new ThreadToRun[runSize];
StringBuffer sb = new StringBuffer("0");
for (int i = 0; i < runSize; i++) {
if (i > 0)
sb.append(i);
ttrArray[i] = new ThreadToRun();
ttrArray[i].setS(sb.toString());
threads[i] = new Thread(ttrArray[i]);
}
for (int j = 0; j < runSize; j++) {
threads[j].start();
}
for (int j = 0; j < runSize; j++) {
threads[j].join();
}
System.out.println(resultMap);
StringBuffer newBuffer = new StringBuffer("0");
for (int j = 0; j < runSize; j++) {
if(j>0)
newBuffer.append(j);
assertEquals("prefix_" + newBuffer, ttrArray[j].getResult(), j + " fails");
}
}
}
}
private static class ThreadToRun implements Runnable {
private String s;
private String result;
public String getS() {
return s;
}
public void setS(String s) {
this.s = s;
}
public String getResult() {
return result;
}
@Override
public void run() {
this.result = fnPrepend.apply(s);
}
}
}
I am using vmlens. I can tune the test by setting the runSize variable to as large a number as I like, so that the interleavings are exercised. The objective is to see whether multiple threads using the same function mix up their inputs because of concurrent access. The test did not return any negative results. Please also comment on whether the test meets this goal.
I also tried to understand, from here, how lambdas are executed at the VM level. Even as I look for simpler articles that would help me understand these details faster, I have not found anything that says "lambdas will have thread-safety issues".
Assuming the test case meets my goal, the consequential questions are:
Can we replace static helper classes with function variables, i.e. immutable and deterministic functions like fnPrepend? The objective is simply to provide more readable code and, of course, to move away from the "not so object-oriented" criticism of static methods.
Is there a source with a simpler explanation of how lambdas work inside the VM?
Can the results above with a Function<InputType, ResultType> be applied to a Supplier<SuppliedType> and a Consumer<ConsumedType> also?
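Just to make that last question concrete, I mean stateless definitions like these (the names are only illustrative):
import java.util.function.Consumer;
import java.util.function.Supplier;

public class StatelessExamples {
    private static Supplier<String> fnDefaultPrefix = () -> "prefix_";
    private static Consumer<String> fnPrint = s -> System.out.println(s);

    public static void main(String[] args) {
        fnPrint.accept(fnDefaultPrefix.get() + "abc"); // prints prefix_abc
    }
}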
Some more familiarity with functions and the bytecode would probably help me answer these questions myself, but a knowledge-exchange forum like this may get me an answer faster, and the questions may trigger more ideas for readers.
Thanks in advance.
Rahul

I really don't think you, as a user, need to go to such lengths to prove the JVM's guarantees about lambdas. Basically, they are just like any other method to the JVM with no special memory or visibility effects :)
Here's a shorter function definition:
private static Function<String, String> fnPrepend = s -> "prefix_" + s;
this.result = fnPrepend.apply(s);
... but don't use a lambda just for the sake of it like this - it's just extra overhead for the same behaviour. Assuming the real use case has a requirement for a Function, we can use a method reference to the static method. This gets you the best of both worlds:
// Available as normal static method
public static String fnPrepend(String s) {
return "prefix_" + s;
}
// Takes a generic Function
public static void someMethod(UnaryOperator<String> prefixer) {
...
}
// Coerce the static method to a function
someMethod(Util::fnPrepend);
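For completeness, here is a hedged sketch of how such a someMethod might consume the operator - the Util class name and the method body are made up for illustration:
import java.util.function.UnaryOperator;

public class Util {
    // Available as a normal static method
    public static String fnPrepend(String s) {
        return "prefix_" + s;
    }

    // Accepts any String -> String operator
    public static String someMethod(UnaryOperator<String> prefixer) {
        return prefixer.apply("value");
    }

    public static void main(String[] args) {
        // Method reference and lambda are interchangeable here
        System.out.println(someMethod(Util::fnPrepend));   // prefix_value
        System.out.println(someMethod(s -> "prefix_" + s)); // prefix_value
    }
}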


Java, comparing Strings using parallelStream().anyMatch() and contains(): one version works, another doesn't

I aimed to rework the code found here: https://stackoverflow.com/a/8995988
But this proved unsuccessful.
I got an idea from here: https://www.logicbig.com/how-to/java/lambda-list-contains-a-substring.html
And my idea worked, but I suspect it is bad code and I'd like to know why the StackOverflow rework I did does not work as anticipated.
I'll present both bits of code in one block. Simply switch which "if" line is commented out to go between the working and non-working versions.
import java.util.Arrays;
import java.util.List;
public class Demo {
public static void main(String[] args) {
List<String> result0 = Arrays.asList("/Videos/Templates/file.mp4", "/Videos/Templates/file2.mp4", "/Videos/Templates/file3.mp4");
List<String> result2 = Arrays.asList("/Videos/Templates/file.mp4.sha256");
for (int i = 0; i < result0.size(); i++) {
List<String> finalResult = result0;
int finalI = i;
// if (result2.parallelStream().anyMatch(x -> x.contains(finalResult.get(finalI)))) {
if (result2.parallelStream().anyMatch(finalResult.get(finalI)::contains)){
System.out.println("sha matches files: " + result0.get(i));
}
}
}
}
If it turns out this question would serve better as a comment on https://stackoverflow.com/a/8995988 explaining that code, then I'm happy to convert it to that.
Because contains is not a commutative operator. For example, "lightning" contains "light", but "light" does not contain "lightning". In your case, "/Videos/Templates/file.mp4.sha256" contains "/Videos/Templates/file.mp4", but "/Videos/Templates/file.mp4" does not contain "/Videos/Templates/file.mp4.sha256".
(In case it isn't clear, foo::bar is equivalent to x -> foo.bar(x), so finalResult.get(finalI)::contains is equivalent to x -> finalResult.get(finalI).contains(x).)
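To make the asymmetry concrete, here is a small sketch using the strings from the question:
import java.util.Arrays;
import java.util.List;

public class ContainsDirection {
    public static void main(String[] args) {
        List<String> result2 = Arrays.asList("/Videos/Templates/file.mp4.sha256");
        String candidate = "/Videos/Templates/file.mp4";

        // x -> x.contains(candidate): "does the sha file name contain the candidate?" -> true
        System.out.println(result2.stream().anyMatch(x -> x.contains(candidate)));

        // candidate::contains is x -> candidate.contains(x): the direction is reversed -> false
        System.out.println(result2.stream().anyMatch(candidate::contains));
    }
}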
@Joseph is right! You simply inverted the receiver and the operand of the method call: String.contains should be called on the Strings of the result2 stream.
Moreover, you don't need a for loop to compute this result; both Lists can be turned into Streams, avoiding an unnecessary mix of imperative and functional styles in the same method.
Here is a Streams-only version of your program:
import java.util.Arrays;
import java.util.List;
public class Demo {
public static void main(String[] args) {
List<String> result0 = Arrays.asList("/Videos/Templates/file.mp4", "/Videos/Templates/file2.mp4", "/Videos/Templates/file3.mp4");
List<String> result2 = Arrays.asList("/Videos/Templates/file.mp4.sha256");
result0.parallelStream()
.filter(item0 -> result2.parallelStream()
.anyMatch(item2 -> item2.contains(item0)))
.forEach(item0 -> System.out.println("sha matches files: " + item0));
}
}

Affect some logic lists using java stream [closed]

I am trying to change the loop to Java streams.
For example,
interface Logic {
int apply(int value);
}
public class AddOneLogic implements Logic {
@Override
public int apply(int value) {
return value + 1;
}
}
public class AddTwoLogic implements Logic {
@Override
public int apply(int value) {
return value + 2;
}
}
Setting up the list of Logic instances looks like this:
List<Logic> logics = new ArrayList<>();
logics.add(new AddOneLogic());
logics.add(new AddTwoLogic());
int init = 1;
I want to change the loop below to streams. Is there a better way to do it?
int result = init;
for (Logic logic : logics) {
result = logic.apply(result);
}
As @duffymo mentioned in the comments, these classes aren't particularly useful and could be replaced with Function<Integer, Integer> instances defined by lambda expressions.
In that case, you may want to reduce a list/stream of Functions by Function::andThen,
Function<Integer, Integer> addOneFunction = i -> i + 1;
Function<Integer, Integer> addTwoFunction = i -> i + 2;
Function<Integer, Integer> function =
Stream.of(addOneFunction, addTwoFunction)
.reduce(Function.identity(), Function::andThen);
so you would get a composed function to work with
Integer result = function.apply(init);
// ((1 + 1) + 2) = 4
You can do it with a Stream, an AtomicInteger, and its getAndSet(int) method, as below:
AtomicInteger result = new AtomicInteger(1);
logics.stream().forEach(ele-> result.getAndSet(ele.apply(result.get())));
// result = ((1+1)+2)=4
A better option would be to use Function:
Function<Integer, Integer> addOne = i -> i + 1;
Function<Integer, Integer> addTwo = i -> i + 2;
List<Function<Integer, Integer>> logics = new ArrayList<>();
logics.add(addOne);
logics.add(addTwo);
AtomicInteger result = new AtomicInteger(1);
logics.stream().forEach(ele-> result.getAndSet(ele.apply(result.get())));
You can even avoid the logics list and use the andThen method, as below:
Function<Integer, Integer> add = addOne.andThen(addTwo);
result = add.apply(1);
Hope it helps..!!
As others have already mentioned: The intention behind the question might be distorted by the attempt to simplify the question so that it can be posted here. The Logic interface does not really make sense, because it could be replaced with an IntUnaryOperator.
Not with a Function<Integer, Integer> - that's a different thing!
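For what it's worth, here is a minimal sketch of that IntUnaryOperator route (not what was asked for, just to illustrate how it differs from Function<Integer, Integer>):
import java.util.function.IntUnaryOperator;
import java.util.stream.Stream;

public class IntUnaryOperatorExample {
    public static void main(String[] args) {
        IntUnaryOperator addOne = i -> i + 1;
        IntUnaryOperator addTwo = i -> i + 2;

        // Compose the primitive-int operators without boxing to Integer
        IntUnaryOperator combined = Stream.of(addOne, addTwo)
                .reduce(IntUnaryOperator.identity(), IntUnaryOperator::andThen);

        System.out.println(combined.applyAsInt(1)); // ((1 + 1) + 2) = 4
    }
}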
But I'll (also) make some assumptions when trying to answer the question:
The Logic interface is merely a placeholder for an interface that has to be retained in its current form
Several Logic instances can sensibly be combined in order to yield a new Logic
The goal is not to "apply streams for the streams sake", but to create sensible, usable classes and methods (and it's a pity that this is worth mentioning...)
If this is the case, then I'd suggest creating a CombinedLogic class that simply offers a method for combining several Logic objects to create the combined one.
It could also be a concrete class that internally stores a List<Logic>. This might be handy in order to modify a combined logic later, as in combinedLogic.setElement(42, new OtherLogic());. But a public class with a modifiable state should be thought through carefully...
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class CombinedLogicExample {
public static void main(String[] args) {
List<Logic> logics = new ArrayList<>();
logics.add(new AddOneLogic());
logics.add(new AddTwoLogic());
Logic combined = CombinedLogic.of(logics);
// Alternatively:
// Logic logic1 = new AddOneLogic();
// Logic logic2 = new AddTwoLogic();
// Logic combined = CombinedLogic.of(logic1, logic2);
int init = 1;
int result = combined.apply(init);
System.out.println(result);
}
}
class CombinedLogic {
static Logic of(Logic... logics) {
return of(Arrays.asList(logics));
}
static Logic of(Iterable<? extends Logic> logics) {
return a -> {
int result = a;
for (Logic logic : logics) {
result = logic.apply(result);
}
return result;
};
}
}
interface Logic {
int apply(int value);
}
class AddOneLogic implements Logic {
@Override
public int apply(int value) {
return value + 1;
}
}
class AddTwoLogic implements Logic {
@Override
public int apply(int value) {
return value + 2;
}
}

How good or bad is it to iterate over a hash-table after checking that it contains a value you want to remove?

I'm currently practicing Algorithm design on HackerRank.
This question pertains to the challenge found here:
https://www.hackerrank.com/challenges/ctci-ransom-note
I solved this problem fairly quickly. However, I ran into an issue that kind of bugs me: I can check for a value in my hash table using the contains(value) method, but I didn't see any way to retrieve the key(s) associated with it, so I was forced to iterate through the table until I found that value again.
While I see the usefulness of Hash Tables... I don't think I am going about solving the problem in an optimal way. I feel like it's a time waster to iterate through the table if I already know it contains the value I want to remove.
One idea I had was to make two tables that are "mirrored" versions of one another: the original map uses the numbers as keys and the words as values, while the mirrored copy uses the words as keys and the numbers as values. However, this seems impractical, and I have a feeling I'm just missing something essential in my knowledge of hash functions.
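Roughly, that mirrored idea would look like this untested sketch; it also shows why it feels impractical once values repeat:
import java.util.HashMap;
import java.util.Map;

public class MirroredMaps {
    public static void main(String[] args) {
        Map<Integer, String> byIndex = new HashMap<>();
        byIndex.put(0, "give");
        byIndex.put(1, "me");
        byIndex.put(2, "one");
        byIndex.put(3, "one"); // duplicate word

        // Mirror: word -> index, so removal by word is a lookup instead of a scan
        Map<String, Integer> byWord = new HashMap<>();
        byIndex.forEach((index, word) -> byWord.put(word, index));

        // The duplicate "one" keeps only one index, which is where this idea breaks down
        System.out.println(byWord); // e.g. {give=0, me=1, one=3}
    }
}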
One reason I'm thinking about this is that I recently made a program that uses an SQLite table to hold data. There I only need one loop to search for and delete these values, which makes it more efficient, doesn't it?
Could I please get an explanation of how to better achieve what my code below does?
import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;
public class Solution {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int m = in.nextInt();
int n = in.nextInt();
String isTrue = "Yes";
Hashtable<Integer, String> myTable = new Hashtable<>();
String magazine[] = new String[m];
for (int magazine_i = 0; magazine_i < m; magazine_i++) {
myTable.put(magazine_i, in.next());
}
Set<Integer> keySet = myTable.keySet();
for (int ransom_i = 0; ransom_i < n; ransom_i++) {
String temp = in.next();
//System.out.println("Line " + ransom_i);
if (!myTable.containsValue(temp)) {
isTrue = "No";
break;
} else {
for (int key: keySet) {
if (myTable.get(key).equals(temp)) {
myTable.remove(key);
//System.out.println("Found it");
break;
}
}
}
}
System.out.println(isTrue);
}
}
Here's an easy way to do it:
public class DenyReturn<K, T> implements Map<K, T> {
    private Map<K, T> m;
    private List<T> dontreturn;

    public DenyReturn(Map<K, T> m, List<T> dontreturn) {
        this.m = m;
        this.dontreturn = dontreturn;
    }

    public T get(Object key) {
        T val = m.get(key); // delegate to the wrapped map
        if (dontreturn.contains(val)) return null;
        return val;
    }

    // implement all other methods of Map by delegating to the inner map m
}

Parallelize search in a Java set

I have a List<String> called lines and a huge (~3G) Set<String> called voc. I need to find all lines from lines that are in voc. Can I do this in a multithreaded way?
Currently I have this straightforward code:
for(String line: lines) {
if (voc.contains(line)) {
// Great!!
}
}
Is there a way to search for several lines at the same time? Maybe there are existing solutions?
PS: I am using javolution.util.FastMap because it behaves better while being filled.
Here is a possible implementation. Please note that error/interruption handling has been omitted but this might give you a starting point. I included a main method so you could copy and paste this into your IDE for a quick demo.
Edit: Cleaned things up a bit to improve readability and List partitioning
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class ParallelizeListSearch {
public static void main(String[] args) throws InterruptedException, ExecutionException {
List<String> searchList = new ArrayList<String>(7);
searchList.add("hello");
searchList.add("world");
searchList.add("java");
searchList.add("debian");
searchList.add("linux");
searchList.add("jsr-166");
searchList.add("stack");
Set<String> targetSet = new HashSet<String>(searchList);
Set<String> matchSet = findMatches(searchList, targetSet);
System.out.println("Found " + matchSet.size() + " matches");
for(String match : matchSet){
System.out.println("match: " + match);
}
}
public static Set<String> findMatches(List<String> searchList, Set<String> targetSet) throws InterruptedException, ExecutionException {
Set<String> locatedMatchSet = new HashSet<String>();
int threadCount = Runtime.getRuntime().availableProcessors();
List<List<String>> partitionList = getChunkList(searchList, threadCount);
if(partitionList.size() == 1){
//if we only have one "chunk" then don't bother with a thread-pool
locatedMatchSet = new ListSearcher(searchList, targetSet).call();
}else{
ExecutorService executor = Executors.newFixedThreadPool(threadCount);
CompletionService<Set<String>> completionService = new ExecutorCompletionService<Set<String>>(executor);
for(List<String> chunkList : partitionList)
completionService.submit(new ListSearcher(chunkList, targetSet));
for(int x = 0; x < partitionList.size(); x++){
Set<String> threadMatchSet = completionService.take().get();
locatedMatchSet.addAll(threadMatchSet);
}
executor.shutdown();
}
return locatedMatchSet;
}
private static class ListSearcher implements Callable<Set<String>> {
private final List<String> searchList;
private final Set<String> targetSet;
private final Set<String> matchSet = new HashSet<String>();
public ListSearcher(List<String> searchList, Set<String> targetSet) {
this.searchList = searchList;
this.targetSet = targetSet;
}
@Override
public Set<String> call() {
for(String searchValue : searchList){
if(targetSet.contains(searchValue))
matchSet.add(searchValue);
}
return matchSet;
}
}
private static <T> List<List<T>> getChunkList(List<T> unpartitionedList, int splitCount) {
int totalProblemSize = unpartitionedList.size();
int chunkSize = (int) Math.ceil((double) totalProblemSize / splitCount);
List<List<T>> chunkList = new ArrayList<List<T>>(splitCount);
int offset = 0;
int limit = 0;
for(int x = 0; x < splitCount; x++){
limit = offset + chunkSize;
if(limit > totalProblemSize)
limit = totalProblemSize;
List<T> subList = unpartitionedList.subList(offset, limit);
chunkList.add(subList);
offset = limit;
}
return chunkList;
}
}
Simply splitting the lines among different threads will (on the Oracle JVM at least) spread the work across all CPUs, if that is what you are looking for.
I like using CyclicBarrier; it makes controlling those threads easier.
http://javarevisited.blogspot.cz/2012/07/cyclicbarrier-example-java-5-concurrency-tutorial.html
It's absolutely possible to parallelize this using multiple threads. You could do the following:
Break up the list into different "blocks," one per thread that will do the search.
Have each thread look over its block, checking whether each string is in the set, and if so adding the string to the resulting set.
For example, you might have the following thread routine:
public void scanAndAdd(List<String> allStrings, Set<String> toCheck,
Set<String> matches, int start, int end) { // matches should be a thread-safe Set, e.g. ConcurrentHashMap.newKeySet()
for (int i = start; i < end; i++) {
if (toCheck.contains(allStrings.get(i))) {
matches.add(allStrings.get(i));
}
}
}
You could then spawn off as many threads as you needed to run the above method and wait for all of them to finish. The resulting matches would then be stored in matches.
For simplicity, I've had the output set be a concurrent set (for example one obtained from ConcurrentHashMap.newKeySet()), which automatically eliminates race conditions due to writes. Since you are only doing reads on the list of strings and the set of strings to check against, no synchronization is required when reading from allStrings or performing lookups in toCheck.
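A hedged sketch of that spawn-and-wait step, folding the block scan into each thread (the chunk sizing and names are illustrative):
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class BlockSearch {
    public static Set<String> search(List<String> lines, Set<String> voc) throws InterruptedException {
        Set<String> matches = ConcurrentHashMap.newKeySet();
        int threadCount = Runtime.getRuntime().availableProcessors();
        int chunk = (lines.size() + threadCount - 1) / threadCount;
        List<Thread> threads = new ArrayList<>();

        for (int t = 0; t < threadCount; t++) {
            int start = Math.min(t * chunk, lines.size());
            int end = Math.min(start + chunk, lines.size());
            Thread thread = new Thread(() -> {
                for (int i = start; i < end; i++) {
                    if (voc.contains(lines.get(i))) {
                        matches.add(lines.get(i));
                    }
                }
            });
            thread.start();
            threads.add(thread);
        }
        for (Thread thread : threads) {
            thread.join(); // wait for every block to finish
        }
        return matches;
    }
}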
Hope this helps!
Another option would be to use Akka; it does these kinds of things quite simply.
Actually, having done some search work with Akka, I can tell you that it supports two ways of parallelizing such things: composable Futures or Agents. For what you want, composable Futures would be completely sufficient. Akka is actually not adding that much here: Netty provides the massively parallel IO infrastructure and Futures are part of the JDK, but Akka does make it super simple to put these two together and extend them when needed.

google-guava MapMaker .softValues() - values don't get GC-ed, OOME: HeapSpace follows

I am having trouble using the MapMaker from google-guava. Here is the code:
package test;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.Random;
import com.google.common.collect.MapEvictionListener;
import com.google.common.collect.MapMaker;
public class MapMakerTest {
private static Random RANDOM = new Random();
private static char[] CHARS =
("abcdefghijklmnopqrstuvwxyz" +
"ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
"1234567890-=!##$%^&*()_+").toCharArray();
public static void main(String[] args) throws Exception {
MapEvictionListener<String, String> listener = new MapEvictionListener<String, String>() {
@Override
public void onEviction(String key, String value) {
System.out.println(">>>>> evicted");
}
};
Map<String, String> map = new MapMaker().
concurrencyLevel(1).softValues().
evictionListener(listener).makeMap();
while (true) {
System.out.println(map.size());
String s = getRandomString();
map.put(s, s);
Thread.sleep(50);
}
}
private static String getRandomString() {
int total = 50000;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < total; ++i) {
sb.append(CHARS[RANDOM.nextInt(CHARS.length)]);
}
return sb.toString();
}
}
When Java is invoked as java -Xms2m -Xmx2m -cp guava-r09.jar:. test.MapMakerTest (the heap settings are intentionally this small to make it easier to see what happens), it blows up with an OutOfMemoryError (Java heap space) around the 60th iteration.
However, when the map is a Map<String, SoftReference<String>> (with the corresponding changes in the rest of the code: the listener and the put), I can see the evictions taking place, the code simply works, and the values do get garbage collected.
In all of the documentation, including this one: http://guava-libraries.googlecode.com/svn/tags/release09/javadoc/index.html, there is no explicit mention of SoftReference. Isn't the Map implementation supposed to wrap the values in a SoftReference when put is called? I am really confused about the intended usage.
I am using Guava r09.
Could anyone maybe explain what I am doing wrong, and why my assumptions are wrong?
Best regards,
wujek
You use the same object for key and value, therefore it is strongly reachable as a key and is not eligible for garbage collection despite the fact that value is softly reachable:
map.put(s, s);
Try to use different instances:
map.put(s, new String(s));
