I have a findAllPaths() function to find all possible paths in a graph (stored as a matrix):
public void findAllPaths(final Matrix matrix, final Index s, final Index d,
                         HashMap<Index, Boolean> isVisited, Collection<Index> localPathList) {
    // Mark the current node as visited
    isVisited.put(s, true);
    if (s.equals(d)) {
        this.pathsList.add(new ArrayList<>(localPathList));
        // A match was found, so there is no need to traverse deeper
        isVisited.put(s, false);
        return;
    }
    // Recurse into all neighbors of the current index
    for (Index i : matrix.getNeighbors(s)) {
        if (!isVisited.get(i)) {
            // Store the current node in the path
            localPathList.add(i);
            findAllPaths(matrix, i, d, isVisited, localPathList);
            // Remove the current node from the path (backtrack)
            localPathList.remove(i);
        }
    }
    // Unmark the current node
    isVisited.put(s, false);
}
I'm trying to make it run in parallel as much as possible.
Does anybody have an idea of how I can accomplish that?
You might want to use RecursiveTask or RecursiveAction from the java.util.concurrent package and run it in a ForkJoinPool. The general idea of these classes is that you decide in your code what portion of the work should be done in one thread; if the work is bigger than that portion, you fork part of it off as a new task. To collect the results of all tasks, use join(). These classes use a work-stealing algorithm: when a thread is done, it can steal work from another thread, which makes this highly efficient for heavy number crunching like this.
One prerequisite of RecursiveTask or RecursiveAction is that you should be able to split the work into parts, and that you know how to combine the finished parts into the final result. The factorial of a big number is one example that can be calculated using RecursiveTask: 10! can be split into (1 * 2 * 3 * 4 * 5) and (6 * 7 * 8 * 9 * 10). Both sub-results can then be multiplied for the final result.
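As a rough illustration of that split-and-join pattern (a minimal sketch, not tied to the path-finding code above; the class name and THRESHOLD value are arbitrary choices):

import java.math.BigInteger;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Computes lo * (lo+1) * ... * hi by splitting the range in half
// until it is small enough to multiply directly in one task.
public class FactorialTask extends RecursiveTask<BigInteger> {
    private static final int THRESHOLD = 5; // arbitrary cut-off for this sketch
    private final long lo, hi;

    public FactorialTask(long lo, long hi) { this.lo = lo; this.hi = hi; }

    @Override
    protected BigInteger compute() {
        if (hi - lo < THRESHOLD) {
            BigInteger result = BigInteger.ONE;
            for (long i = lo; i <= hi; i++)
                result = result.multiply(BigInteger.valueOf(i));
            return result;
        }
        long mid = (lo + hi) / 2;
        FactorialTask left = new FactorialTask(lo, mid);
        FactorialTask right = new FactorialTask(mid + 1, hi);
        left.fork();                                  // left half runs asynchronously
        return right.compute().multiply(left.join()); // right half runs here, then combine
    }

    public static void main(String[] args) {
        System.out.println(new ForkJoinPool().invoke(new FactorialTask(1, 10))); // 3628800
    }
}

The same fork/compute/join shape would apply to the path search: fork a subtask per neighbor (each with its own copy of isVisited and the path so far), join their path lists, and fall back to the sequential version below some depth so tasks don't become too fine-grained.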
I was tasked to perform the Dijkstra algorithm on big graphs (25 million nodes). These are represented as 2D arrays:
- each node as a double[] with latitude, longitude and offset (offset meaning the index of the first outgoing edge of that node)
- each edge as an int[] with sourceNodeId, targetNodeId and the weight of that edge
Below is the code; I used int[] as a tuple for the comparisons in the priority queue.
The algorithm works and gets the right results, HOWEVER it is required to finish in 15s and takes about 8min on my laptop. Is my algorithm fundamentally slow? Am I using the wrong data structures? Am I missing something? I tried my best optimizing as far as I saw fit.
Any help or any ideas would be greatly appreciated <3
public static int[] oneToAllArray(double[][] nodeList, int[][] edgeList, int sourceNodeId) {
    int[] distance = new int[nodeList[0].length]; // the array that will be returned
    // The priority queue holds arrays of length 2, representing [index, weight]
    // for each node, ordered by weight
    PriorityQueue<int[]> prioQueue = new PriorityQueue<>((a, b) -> a[1] - b[1]);
    int offset1; // used for determining the number of outgoing edges
    int offset2;
    int newWeight; // declared here so we don't declare it many times later (not sure if that makes a difference)
    // currentSourceNode here means the node that will be looked at for OUTGOING edges
    int[] currentSourceNode = {sourceNodeId, 0};
    prioQueue.add(currentSourceNode);
    // At the start we only add the source node, then we start the actual algorithm
    while (!prioQueue.isEmpty()) {
        if (prioQueue.size() % 55 == 2) {
            System.out.println(prioQueue.size()); // debug output
        }
        currentSourceNode = prioQueue.poll();
        int sourceIndex = currentSourceNode[0];
        if (sourceIndex == nodeList[0].length - 1) {
            offset1 = (int) nodeList[2][sourceIndex];
            offset2 = edgeList[0].length;
        } else {
            offset1 = (int) nodeList[2][sourceIndex];
            offset2 = (int) nodeList[2][sourceIndex + 1];
        }
        // Check every outgoing edge of the current node
        for (int i = offset1; i < offset2; i++) {
            int targetIndex = edgeList[1][i];
            // If the node hasn't been looked at yet, its distance is the weight of this edge + distance to the source node
            if (distance[targetIndex] == 0 && targetIndex != sourceNodeId) {
                distance[targetIndex] = distance[sourceIndex] + edgeList[2][i];
                int[] targetArray = {targetIndex, distance[targetIndex]};
                prioQueue.add(targetArray);
            } else if (prioQueue.stream().anyMatch(e -> e[0] == targetIndex)) {
                // The else-if above checks whether this index is already in the prioQueue
                newWeight = distance[sourceIndex] + edgeList[2][i];
                // If the new weight is better, we have to update the distance and the priority queue
                if (newWeight < distance[targetIndex]) {
                    distance[targetIndex] = newWeight;
                    int[] targetArray = prioQueue.stream().filter(e -> e[0] == targetIndex).toList().get(0);
                    prioQueue.remove(targetArray);
                    targetArray[1] = newWeight;
                    prioQueue.add(targetArray);
                }
            }
        }
    }
    return distance;
}
For each node that you process, you are doing a linear scan of the priority queue to see if something is already queued, and a second scan to find all the things that are queued if you have to update the distance. Instead, keep a separate multi-set of things that are in the queue.
This is not a proper Dijkstra implementation.
One of the key elements of Dijkstra's algorithm is that you mark nodes as "visited" when they have been evaluated and never look at them again, because their distance cannot improve. You are not doing that, so your algorithm does far more computations than necessary. The only place where a priority queue or sort is required is to pick the next node to visit from amongst the unvisited. You should re-read the algorithm, implement the visitation tracking, and re-formulate.
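A minimal sketch of that shape, under the assumption of an adjacency layout where graph[node] is an array of {target, weight} edges (not the question's offset arrays), using the common "lazy deletion" trick of skipping stale queue entries instead of searching the queue and updating entries in place:

import java.util.Arrays;
import java.util.PriorityQueue;

public class DijkstraSketch {

    // graph[node] is an array of edges, each edge an int[]{target, weight}.
    static int[] dijkstra(int[][][] graph, int source) {
        int n = graph.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        boolean[] visited = new boolean[n];
        PriorityQueue<int[]> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        dist[source] = 0;
        queue.add(new int[] {source, 0});
        while (!queue.isEmpty()) {
            int node = queue.poll()[0];
            if (visited[node]) continue;   // stale entry: node already finalized
            visited[node] = true;          // this node's distance is now final
            for (int[] edge : graph[node]) {
                int target = edge[0];
                int newDist = dist[node] + edge[1];
                if (!visited[target] && newDist < dist[target]) {
                    dist[target] = newDist;
                    queue.add(new int[] {target, newDist}); // old entry, if any, becomes stale
                }
            }
        }
        return dist; // unreachable nodes keep Integer.MAX_VALUE
    }
}

No stream scans of the queue are needed; each node is expanded at most once, giving the usual O(E log E) behavior.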
I want to implement a very simple sliding window. In other words, I will have some kind of list with objects inserted at the right end and dropped from the left end. On every insertion, the previous objects are left-shifted by one index. When the list gets filled with objects, every insertion at the right end drops an object from the left end (and the previous objects, as usual, are left-shifted by one index).
I had in mind either a LinkedList or an ArrayDeque - probably the latter is the better choice, since as far as I know both inserting AND removing at either end is constant effort O(1) for an ArrayDeque, which is not the case for a LinkedList. Is that right?
Moreover, I would like to ask the following: Left-shifting all the previous objects stored in the sliding window when I insert a new object is processing-intensive for a large sliding window with 100,000 or even 1,000,000 objects as in my case. Is there any other data structure which might perform better in my application?
NOTE: I use the term "sliding window" for what I want to implement, maybe there is some other term that describes it better, but I think is clear what I want to do from the above description.
ArrayDeque does what you want. It doesn't move the elements around; it moves the indexes of where the start and the end are. When you add an element, the end counter moves, and when you remove an element, the start counter moves.
One advantage of ArrayDeque is that it can use less memory and doesn't create garbage. On the down side, its backing array grows as needed but never shrinks; a LinkedList grows and shrinks.
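A minimal sketch of a fixed-capacity window built on ArrayDeque (the class name and the capacity handling are mine, not a standard API):

import java.util.ArrayDeque;

// Fixed-capacity sliding window: new elements enter on the right,
// the oldest element falls off the left once the window is full.
// No element is ever shifted; ArrayDeque just moves its head/tail indexes.
public class SlidingWindow<T> {
    private final ArrayDeque<T> deque = new ArrayDeque<>();
    private final int capacity;

    public SlidingWindow(int capacity) { this.capacity = capacity; }

    public void add(T element) {
        if (deque.size() == capacity)
            deque.removeFirst(); // drop the oldest element, O(1)
        deque.addLast(element);  // append the newest element, O(1)
    }

    public Iterable<T> contents() { return deque; } // oldest to newest
}

Both operations are O(1) regardless of whether the window holds 100,000 or 1,000,000 objects.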
BTW, if you want a lightweight sliding window for the average of some values, an exponentially weighted moving average is much cheaper, as you only need to record two values: the previous average and the last sample time.
e.g.
double last = 0;
long lastTime = 0;
double halfLife = 60 * 1000; // 60 seconds, for example

public double ewma(double sample, long time) {
    // The weight of the new sample grows with the time elapsed since the last one;
    // 1 - exp(...) keeps this consistent with the approximation below
    double alpha = 1 - Math.exp((lastTime - time) / halfLife);
    lastTime = time;
    return last = sample * alpha + last * (1 - alpha);
}
or you can approximate this to avoid calling Math.exp with
public double ewma(double sample, long time) {
    long delay = time - lastTime;
    double alpha = delay >= halfLife ? 1.0 : delay / halfLife;
    lastTime = time;
    return last = sample * alpha + last * (1 - alpha);
}
This is many times faster and for short intervals gives much the same result.
Are you talking about a Queue? Take a look at the java.util.LinkedList implementation, as it implements the Queue interface. Also, LinkedList's push and pop are both O(1), but its get is O(N).
Edit: This is the core of LinkedList's add method:
Link<ET> next = link.next;
Link<ET> newLink = new Link<ET>(object, link, next);
link.next = newLink;
next.previous = newLink;
link = newLink;
lastLink = null;
pos++;
expectedModCount++;
list.size++;
list.modCount++;
I am generating my world (random, infinite and 2D) in sections that are x by y; when I reach the end of x, a new section is formed. If section one has hills, how can I make it so that those hills continue in section two? Is there some kind of way that I could make this happen?
So it would look something like this
1221
1 = generated land
2 = non-generated land that will fill in between the two 1s
I get this now: [screenshot omitted]
Is there any way to make this flow better?
This seems like just an algorithm issue. Your generation mechanism needs a start point: on the initial call it would be, say, 0; on subsequent calls it would be the finishing position of the previous "chunk".
If I were doing this, I'd probably make the height of each next point plus or minus 0-3 from the previous one, using some sort of distribution - e.g. 10% of the time it's +/- 3, 25% of the time it's +/- 2, 25% of the time it's 0, and 40% of the time it's +/- 1.
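A minimal sketch of that idea (the method name, the chunk-width parameter and the exact percentages are illustrative assumptions, not a fixed recipe):

import java.util.Random;

// Generates the heights of one chunk, starting from the last height of the
// previous chunk so that the terrain continues seamlessly across sections.
static int[] generateChunk(int previousEndHeight, int width, Random random) {
    int[] heights = new int[width];
    int height = previousEndHeight;
    for (int i = 0; i < width; i++) {
        int roll = random.nextInt(100);
        int delta;
        if (roll < 10)      delta = 3; // 10%: big step
        else if (roll < 35) delta = 2; // 25%: medium step
        else if (roll < 60) delta = 0; // 25%: flat
        else                delta = 1; // 40%: small step
        if (delta != 0 && random.nextBoolean())
            delta = -delta;            // step up or down with equal probability
        height += delta;
        heights[i] = height;
    }
    return heights;
}

Passing the last element of one chunk as previousEndHeight of the next is what removes the seam between sections.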
If I understood your problem correctly, here is a solution:
If you generate the delta (difference) between the hills and cap it at a fixed value (so changes are never too big), then you can carry over the height of the last hill from the previous section when generating the new one, and apply the first randomly generated delta (of the new section) to the carried-over hill height.
If you're generating these "hills" sequentially, I would create an accessor method that provides the continuation of said hill with a value to begin the next section. It seems that you are creating a random height for the hill to be constrained by some value already when drawing a hill in a single section. Extend that functionality with this new accessor method.
My take on a possible implementation of this.
public class DrawHillSection {
    private int index;
    private int[] x = new int[50];

    public void drawHillSection() {
        for (int i = 0; i < 50; i++) {
            if (i == 0) {
                x[0] = getPreviousHillSectionHeight(index - 1);
            } else {
                // ...
                // Your current implementation to create a random
                // height with some delta-y limit.
                // ...
            }
        }
    }

    public int getPreviousHillSectionHeight(int index) {
        return x[49];
    }
}
I need to detect the presence of multiple blocks of columnar data given only their headings. Nothing else is known about the data except the heading words, which are different for every set of data.
Importantly, it is not known beforehand how many words are in each block nor, therefore, how many blocks there are.
Equally important, the word list is always relatively short - less than 20.
So, given a list or array of heading words such as:
Opt
Object
Type
Opt
Object
Type
Opt
Object
Type
what's the most processing-efficient way to determine that it consists entirely of the repeating sequence:
Opt
Object
Type
It must be an exact match, so my first thought is to search [1+] looking for matches to [0], calling their indexes n, m, ... Then, if they are equidistant, check [1] == [n+1] == [m+1], [2] == [n+2] == [m+2], etc.
EDIT: It must work for word sets where some of the words are themselves repeated within a block, so
Opt
Opt
Object
Opt
Opt
Object
is a set of 2
Opt
Opt
Object
If the list is made of x repeating groups such that each group contains n elements...
We know there is at least 1 group, so we first check whether there are 2 repeating groups by comparing the first half of the list with the second half.
1) If the halves are equal, we know that 2 is a factor of the solution.
2) If they are not, we move on to the next smallest prime that divides the total number of words...
At each step we check for equality among the sub-lists; when we find it, we know we have a solution with that factor in it.
We want to return the list of words for which we found equality among sub-lists using the greatest number of factors.
We then apply the same test to a sub-list, knowing all sub-lists are equal; the solution is therefore best expressed recursively, since we only need to consider the current sub-list in isolation.
The solution will be extremely efficient if loaded with a short table of primes... beyond that it becomes necessary to compute primes, but the input would have to be non-trivial to exhaust even a list of only a few dozen primes.
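A minimal sketch of that divide-and-recurse idea (class and method names are mine; it tries divisors in increasing order, so the smallest usable factor is found first, and recurses into one sub-list as soon as a split succeeds):

import java.util.List;

public class RepeatFinder {

    // Returns the shortest unit sequence that, repeated, produces the list.
    static <T> List<T> findUnit(List<T> words) {
        int n = words.size();
        for (int p = 2; p <= n; p++) {
            if (n % p != 0) continue; // p must divide the word count
            int partLength = n / p;
            if (isPeriodic(words, partLength))
                return findUnit(words.subList(0, partLength)); // recurse into one sub-list
        }
        return words; // no repetition found: the whole list is the unit
    }

    // True if the list consists of identical blocks of the given length.
    static <T> boolean isPeriodic(List<T> words, int blockLength) {
        for (int i = blockLength; i < words.size(); i++)
            if (!words.get(i).equals(words.get(i % blockLength)))
                return false;
        return true;
    }
}

For the edited example, findUnit(List.of("Opt", "Opt", "Object", "Opt", "Opt", "Object")) returns [Opt, Opt, Object]; blocks may repeat words internally without breaking the check.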
Can the unit sequence contain repetitions of its own? Do you know the length of the unit sequence?
e.g.
ABCABCABCDEFABCABCABCDEFABCABCABCDEF
where the unit sequence is ABCABCABCDEF
If the answer is yes, you've got a difficult problem, I think, unless you know the length of the unit sequence (in which case the solution is trivial: you just make a state machine that first stores the unit sequence, then verifies that each element of the rest of the sequence corresponds to the matching element of the unit sequence).
If the answer is no, use this variant of Floyd's cycle-finding algorithm to identify the unit sequence:
Initialize pointers P1 and P2 to the beginning of the sequence.
For each new element, increment pointer P1 every time, and increment pointer P2 every other time (keep a counter around to do this).
If P1 points to an element identical to the one P2 points to, you've found a candidate unit sequence.
Now repeat through the rest of the sequence to verify that it consists of duplicates.
UPDATE: you've clarified your problem to state that the unit sequence may contain repetitions of its own. In this case, use the cycle-finding algorithm, but it's only guaranteed to find potential cycles. Keep it running throughout the length of the sequence, and use the following state machine, starting in state 1:
State 1: no cycle found that works; keep looking. When the cycle-finding algorithm finds a potential cycle, verify that you've gotten 2 copies of a preliminary unit sequence, and go to state 2. If you reach the end of the input, go to state 4.
State 2: preliminary unit sequence found. Run through the input as long as the cycle repeats identically. If you reach the end of the input, go to state 3. If you find an input element that is different from the corresponding element of the unit sequence, go back to state 1.
State 3: The input is a repetition of a unit sequence if the end of the input consists of complete repetitions of the unit sequence. (If it ends midway through a unit sequence, e.g. ABCABCABCABCAB, then a unit sequence was found, but the input does not consist of complete repetitions.)
State 4: No unit sequence found.
In my example (repeating ABCABCABCDEF) the algorithm starts by finding ABCABC, which would put it in state 2, and it would stay there until it hit the first DEF, which would put it back in state 1, then probably jump back and forth between states 1 and 2, until it reached the 2nd ABCABCABCDEF, at which point it would re-enter state 2, and at the end of the input it would be in state 3.
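A rough sketch of that pointer-chasing step (names are mine; as noted above it only produces candidate unit lengths, so each candidate is verified against the whole input before being accepted):

import java.util.List;

public class CycleSketch {

    // P1 advances on every element, P2 on every other element. When both
    // pointers see equal elements, p1 - p2 is a candidate unit length.
    static <T> int findPeriod(List<T> seq) {
        int p1 = 0, p2 = 0, steps = 0;
        while (++p1 < seq.size()) {
            if (steps++ % 2 == 1) p2++; // advance P2 every other step
            if (p1 != p2 && seq.get(p1).equals(seq.get(p2))) {
                int candidate = p1 - p2;
                if (isRepetition(seq, candidate)) return candidate;
            }
        }
        return seq.size(); // no shorter unit sequence found
    }

    // Verifies that the input consists of complete repetitions of the period.
    static <T> boolean isRepetition(List<T> seq, int period) {
        if (seq.size() % period != 0) return false;
        for (int i = period; i < seq.size(); i++)
            if (!seq.get(i).equals(seq.get(i - period)))
                return false;
        return true;
    }
}

On the repeating ABCABCABCDEF example this rejects the early ABC candidates (they fail whole-input verification) and eventually accepts 12.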
A better answer than my other one: a Java implementation which works, should be straightforward to understand, and is generic:
package com.example.algorithms;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

interface Processor<T> {
    public void process(T element);
}

public class RepeatingListFinder<T> implements Processor<T> {
    private List<T> unit_sequence = new ArrayList<T>();
    private int repeat_count = 0;
    private int partial_matches = 0;
    private Iterator<T> iterator = null;

    /* Class invariant:
     *
     * The sequence of elements passed through process()
     * can be expressed as the concatenation of
     * the unit_sequence repeated "repeat_count" times,
     * plus the first "partial_matches" elements of the unit_sequence.
     *
     * The iterator points to the remaining elements of the unit_sequence,
     * or null if there have not been any elements processed yet.
     */

    public void process(T element) {
        if (unit_sequence.isEmpty() || !iterator.next().equals(element)) {
            revise_unit_sequence(element);
            iterator = unit_sequence.iterator();
            repeat_count = 1;
            partial_matches = 0;
        } else if (!iterator.hasNext()) {
            iterator = unit_sequence.iterator();
            ++repeat_count;
            partial_matches = 0;
        } else {
            ++partial_matches;
        }
    }

    /* Unit sequence has changed.
     * Restructure and add the new non-matching element.
     */
    private void revise_unit_sequence(T element) {
        if (repeat_count > 1 || partial_matches > 0) {
            List<T> new_sequence = new ArrayList<T>();
            for (int i = 0; i < repeat_count; ++i)
                new_sequence.addAll(unit_sequence);
            new_sequence.addAll(unit_sequence.subList(0, partial_matches));
            unit_sequence = new_sequence;
        }
        unit_sequence.add(element);
    }

    public List<T> getUnitSequence() {
        return Collections.unmodifiableList(unit_sequence);
    }

    public int getRepeatCount() { return repeat_count; }

    public int getPartialMatchCount() { return partial_matches; }

    public String toString() {
        return "(" + getRepeatCount()
                + (getPartialMatchCount() > 0
                        ? (" " + getPartialMatchCount()
                                + "/" + unit_sequence.size())
                        : "")
                + ") x " + unit_sequence;
    }

    /********** static methods below for testing **********/

    static public List<Character> stringToCharList(String s) {
        List<Character> result = new ArrayList<Character>();
        for (char c : s.toCharArray())
            result.add(c);
        return result;
    }

    static public <T> void test(List<T> list) {
        RepeatingListFinder<T> listFinder = new RepeatingListFinder<T>();
        for (T element : list)
            listFinder.process(element);
        System.out.println(listFinder);
    }

    static public void test(String testCase) {
        test(stringToCharList(testCase));
    }

    static public void main(String[] args) {
        test("ABCABCABCABC");
        test("ABCDFTBAT");
        test("ABABA");
        test("ABACABADABACABAEABACABADABACABAEABACABADABAC");
        test("ABCABCABCDEFABCABCABCDEFABCABCABCDEF");
        test("ABABCABABCABABDABABDABABC");
    }
}
This is a stream-oriented approach (with O(N) execution time and O(N) worst-case space requirements); if the List<T> to be processed already exists in memory, it should be possible to rewrite this class to process the List<T> without any additional space requirements, just keeping track of the repeat count and partial match count, using List.subList() to create a unit sequence that is a view of the first K elements of the input list.
My solution, which works as desired, is perhaps naive. It does have the advantage of being simple.
String[] wta;  // word text array
int ivl = 0;   // length of the repeating interval, once found
...
INTERVAL:
for (int xa = 1, max = (wta.length / 2); xa <= max; xa++) {
    if ((wta.length % xa) != 0) { continue; } // skip intervals which don't divide evenly into the words
    for (int xb = 0; xb < xa; xb++) { // iterate the words within the current interval
        for (int xc = xb + xa; xc < wta.length; xc += xa) { // iterate the corresponding words in each section
            if (!wta[xb].equalsIgnoreCase(wta[xc])) { continue INTERVAL; } // not a cycle
        }
    }
    ivl = xa;
    break;
}
I'm having some difficulties with the following problem:
I'm making a little game where you're at a specific spot and each spot has each some possible directions.
The available directions are N(orth), E(ast), S(outh) and W(est). I use the function getPosDirections to get the possible directions at a spot. The function returns the directions in an ArrayList<String>, e.g. for spot J3: [E,W]
Now the game goes like this: 2 dice are rolled, so you get a number between 2 and 12; this number represents the number of steps you can take.
What I want is an ArrayList of all the possible routes
clarification of all the possible routes:
When I'm at the current position I check what the possibilities are from there. Let's say they are: go East and go West. So we get 2 new positions, and from each of those we need to check the next possibilities again (until we have taken x directions)
(x equals the number thrown by the dice).
e.g.: I throw 3 and I'm currently at spot J3:
[[E,N,E],[E,N,S],[E,S,E],[E,S,S],[W,N,E],[W,N,S],[W,S,E],[W,S,S]]
How would I obtain the last mentioned Array(list)?
First, you might wish to think about your approach some more. In the worst case (a 12 is rolled, and all 4 directions are possible at every location), there will be 4^12, roughly 16.8 million, routes. Is it really necessary to iterate over them all? And is it necessary to hold the whole list, which can easily run to a gigabyte of memory, at once?
Next, it is probably a good idea to represent directions in a type-safe manner, for instance using an enum.
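For instance (a minimal sketch; the dx/dy offsets are an assumption about how a Location might apply a step):

// Type-safe directions, each carrying the grid offset it represents.
enum Direction {
    N(0, -1), E(1, 0), S(0, 1), W(-1, 0);

    final int dx, dy;

    Direction(int dx, int dy) {
        this.dx = dx;
        this.dy = dy;
    }
}

A Location.walk(Direction) method, as used below, can then simply add dx and dy to its coordinates.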
That being said, recursion is your friend:
private void iteratePaths(Location currentLoc, List<Direction> currentPath,
                          List<List<Direction>> allPaths, int pathLength) {
    if (currentPath.size() >= pathLength) {
        allPaths.add(new ArrayList<Direction>(currentPath));
        return;
    }
    for (Direction d : currentLoc.getPosDirections()) {
        currentPath.add(d);
        Location newLoc = currentLoc.walk(d);
        iteratePaths(newLoc, currentPath, allPaths, pathLength);
        currentPath.remove(currentPath.size() - 1);
    }
}

public List<List<Direction>> getAllPaths(Location loc, int length) {
    List<List<Direction>> allPaths = new ArrayList<List<Direction>>();
    List<Direction> currentPath = new ArrayList<Direction>();
    iteratePaths(loc, currentPath, allPaths, length);
    return allPaths;
}
You can treat your field of spots as a graph. Then you need to implement BFS or DFS with path saving.
You can implement any extra logic in either of these algorithms (like getting the list of possible directions from a certain node).