I'm accessing this API that gives me global weather:
https://callforcode.weather.com/doc/v3-global-weather-notification-headlines/
However, it takes lat/lng as an input parameter, and I need the data for the entire world.
I figure I can loop over every second degree of latitude and longitude, giving me a point roughly every 120 miles across and 100 miles north/south, which should cover everything in 16,200 API calls ((360/2) * (180/2)).
How can I do this effectively in Java?
I'd conceived something like this; but is there a better way of doing this?
// latitude runs from -90 to 90, longitude from -180 to 180
for (int lat = -90; lat < 90; lat += 2) {
    for (int lng = -180; lng < 180; lng += 2) {
        // call api with lat = lat, lng = lng
    }
}
It's somewhat of a paradigm shift, but I would NOT use a nested for-loop for this problem. In many situations where you are looking at iterating over an entire result set, it is often possible to trim the coverage dramatically without losing much or any effectiveness. Caching, trimming, prioritizing... these are the things you need: not a for-loop.
Cut sections entirely - maybe you can ignore ocean, maybe you can ignore Antarctica and the North Pole (since people there have better ways of checking weather anyway).
Change your search frequency based on population density. Maybe northern Canada doesn't need to be checked as thoroughly as Los Angeles or Chicago.
Rely on caching in low-usage areas - presumably you can track what areas are actually being used and can then more frequently refresh those sections.
So what you end up with is some sort of weighted caching system that takes into account population density, usage patterns, and other priorities to determine what latitude/longitude coordinates to check and how frequently.
High-level code might look something like this:
void executeUpdateSweep(List<CoordinateCacheItem> cacheItems)
{
    for (CoordinateCacheItem item : cacheItems)
    {
        if (shouldRefreshCache(item))
        {
            // call api with lat = item.y, lng = item.x
        }
    }
}
boolean shouldRefreshCache(CoordinateCacheItem item)
{
    long ageWeight = calculateAgeWeight(item); // how long since last update?
    long basePopulationWeight = item.getBasePopulationWeight(); // how many people (users and non-users) live here?
    long usageWeight = calculateUsageWeight(item); // how much is this item requested?
    return ageWeight + basePopulationWeight + usageWeight > someArbitraryThreshold;
}
I am working on an application that deals with moving objects from point A to point B in a 2D space. The job of the application is to animate that transition in a given number of steps (frames).
What I am currently doing is dividing the distance by the number of steps, which creates a very linear and boring movement in a straight line:
int frames = 25;
int fromX = 10;
int toX = 20;
double step = (double) (toX - fromX) / frames; // cast avoids integer division
List<Double> values = new ArrayList<>();
double next = fromX;
for (int i = 0; i < frames; i++) {
    values.add(next);
    next += step;
}
As a first improvement - since my poor users have to look at this misery - I would like that to be an accelerated and decelerated movement starting slow, picking up speed, then getting slower again until arrival at the destination.
For that particular case, I could probably figure out the math somehow but in the end, I want to be able to provide more complex animations that would go beyond my capabilities as a mathematician ;) I have many of the capabilities of e.g. PowerPoint or iMovie in mind.
My question is: Is there a library that would allow me to generate these sequences of coordinates? I found a few things, but they were often tied to some Graphics object etc., which I am not using. For me it's all about Lists of Doubles.
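For the slow-fast-slow movement described above, one common option is a "smoothstep" easing curve applied to the frame index. The following is only a minimal sketch under that assumption; the class and method names (EaseInOut, smoothstep, positions) are made up for illustration and do not come from any library.

```java
import java.util.ArrayList;
import java.util.List;

final class EaseInOut {
    // Smoothstep: maps t in [0, 1] to [0, 1], slow at both ends and fastest in the middle.
    static double smoothstep(double t) {
        return t * t * (3.0 - 2.0 * t);
    }

    // Returns frames + 1 positions from 'from' to 'to', including both end points.
    static List<Double> positions(double from, double to, int frames) {
        List<Double> values = new ArrayList<>(frames + 1);
        for (int i = 0; i <= frames; i++) {
            double t = (double) i / frames;          // linear progress
            values.add(from + (to - from) * smoothstep(t)); // eased progress
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(positions(10, 20, 25));
    }
}
```

Swapping smoothstep for another function that maps 0 to 0 and 1 to 1 gives a different acceleration profile without touching the loop.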
It's my first time with CGAL. Some of you may ask why I have to learn CGAL for something like this, but it's a new project that I must do (and yes, I must use CGAL and Java combined) :/ Long story short, I only have:
Two double arrays, representing x and y coordinates of my vertices. Let's call them double[] x, y;.
Both arrays have S random values.
Two vertices, u and w, are connected if distance(x[u], y[u], x[w], y[w]) < CONSTANT (of course I actually check distanceSquared(x[u], y[u], x[w], y[w]) < CONSTANT_SQUARED, so I avoid calling sqrt()).
x and y are filled randomly with values from 0 to UPPER_LIMIT; no other information is given.
Question: do x and y describe a connected graph?
Right now I have two algorithms:
Algorithm 1:
Build an adjacency list (ArrayList<Integer>[] adjLists;) for each vertex (only the upper triangular matrix is explored). Complexity O(|V|^2) (V = vertex set).
Recursive graph exploration with vertex marking and counting: if the number of visited vertices equals S, my graph has only one connected component, so it is connected. Complexity O(|E|) (E = edge set).
Algorithm 2:
private static boolean algorithmGraph(double[] x, double[] y) {
int unchecked, inside = 0, current = 0;
double switchVar;
while (current <= inside && inside != S - 1) {
unchecked = inside + 1;
while (unchecked < S) {
if ((x[current] - x[unchecked]) * (x[current] - x[unchecked]) + (y[current] - y[unchecked]) * (y[current] - y[unchecked]) <= CONSTANT_SQUARED) {
inside++;
// switch x coordinates | unchecked <-> inside
switchVar = x[unchecked];
x[unchecked] = x[inside];
x[inside] = switchVar;
// switch y coordinates | unchecked <-> inside
switchVar = y[unchecked];
y[unchecked] = y[inside];
y[inside] = switchVar;
}
unchecked++;
}
current++;
}
return inside == S - 1;
}
Funny thing: the second one is slower. It uses no data structures and the code is iterative and in-place, but the heavy use of swapping makes it slow as hell.
The problem spec changed and now I must do it with CGAL and Java, I'll read the whole "https://github.com/CGAL/cgal-swig-bindings" to learn how to use CGAL within Java.... but I'd like some help about this specific instance of CGAL code... Are there faster algorithms already implemented in CGAL?
Thank you for your time, guys! Happy coding!
I believe that, without some method of spatial indexing, the best you are going to achieve in the worst case (all vertices connected) is n*(n-1)/2 distance checks, i.e. O(n^2).
If you can afford to build a spatial index (that is, you have enough memory to pay for the boost in speed), you may consider an R-tree or one of its variants. Inserting a point and searching are both typically O(log2(n)), so building the index and running your distance-based neighbour checks for every point costs about O(n*log2(n)) overall.
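To make the idea of a spatial index concrete, here is a minimal sketch that swaps the R-tree for a simpler structure: a uniform grid whose cell side equals the connection distance, combined with union-find. It is plain Java, not CGAL, and every name in it (GridConnectivity, isConnected, key) is hypothetical. It assumes cell indices fit in 32 bits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GridConnectivity {
    // Union-find lookup with path halving.
    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    // Packs the cell of (px, py), shifted by (ox, oy) cells, into one long key.
    private static long key(double px, double py, double cell, int ox, int oy) {
        long cx = (long) Math.floor(px / cell) + ox;
        long cy = (long) Math.floor(py / cell) + oy;
        return (cx << 32) | (cy & 0xffffffffL);
    }

    // True if all vertices form one component under "distance < limit".
    static boolean isConnected(double[] x, double[] y, double limit) {
        int n = x.length;
        if (n <= 1) return true;
        double limitSq = limit * limit;
        // Bucket every vertex into a square cell of side 'limit'.
        Map<Long, List<Integer>> cells = new HashMap<>();
        for (int i = 0; i < n; i++) {
            cells.computeIfAbsent(key(x[i], y[i], limit, 0, 0), k -> new ArrayList<>()).add(i);
        }
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        int components = n;
        // A neighbour within 'limit' can only be in the same cell or one of the 8 around it.
        for (int i = 0; i < n; i++) {
            for (int ox = -1; ox <= 1; ox++) {
                for (int oy = -1; oy <= 1; oy++) {
                    List<Integer> bucket = cells.get(key(x[i], y[i], limit, ox, oy));
                    if (bucket == null) continue;
                    for (int j : bucket) {
                        if (j <= i) continue; // each pair is checked once
                        double dx = x[i] - x[j];
                        double dy = y[i] - y[j];
                        if (dx * dx + dy * dy < limitSq) {
                            int a = find(parent, i);
                            int b = find(parent, j);
                            if (a != b) {
                                parent[a] = b;
                                components--;
                            }
                        }
                    }
                }
            }
        }
        return components == 1;
    }
}
```

When points are spread out relative to the connection distance, each vertex is compared with only a handful of others; if all points fall into a few cells, this degrades back toward the quadratic case.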
I'm trying to write a time-efficient algorithm that can detect a group of overlapping circles and make a single circle in the "middle" of the group that will represent that group. The practical application of this is representing GPS locations on a map, but the conversion into Cartesian coordinates is already handled, so that's not relevant. The desired effect is that at different zoom levels clusters of close-together points just appear as a single circle (which will have the number of points printed in the centre in the final version).
In this example the circles just have a radius of 15, so the distance calculation (Pythagoras) is not being square rooted and is compared to 225 for the collision detection. I was trying anything to shave off time, but the problem is that this really needs to happen very quickly because it's a user-facing bit of code that needs to be snappy and good looking.
I've given this a go and it works pretty well with small data sets. There are 2 big problems: it takes too long, and it can run out of memory if all the points are on top of one another.
The route I've taken is to calculate distance between each point in a first pass, and then take the shortest distance first and start to combine from there, anything that's been combined becomes ineligible for combination on that pass, and the whole list is passed back around to the distance calculations again until nothing changes.
To be honest I think it needs a radical shift in approach and I think it's a little beyond me. I've refactored my code into one class for ease of posting and generated random points to give an example.
package mergepoints;
import java.awt.Point;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class Merger {
public static void main(String[] args) {
Merger m = new Merger();
m.subProcess(m.createRandomList());
}
private List<Plottable> createRandomList() {
List<Plottable> points = new ArrayList<>();
for (int i = 0; i < 50000; i++) {
Plottable p = new Plottable();
p.location = new Point((int) Math.floor(Math.random() * 1000),
(int) Math.floor(Math.random() * 1000));
points.add(p);
}
return points;
}
private List<Plottable> subProcess(List<Plottable> visible) {
List<PlottableTuple> tuples = new ArrayList<PlottableTuple>();
// create a tuple to store distance and matching objects together,
for (Plottable p : visible) {
PlottableTuple tuple = new PlottableTuple();
tuple.a = p;
tuples.add(tuple);
}
// work out each Plottable relative distance from
// one another and order them by shortest first.
// We may need to do this multiple times for one set so going in own
// method.
// this is the bit that takes ages
setDistances(tuples);
// Sort so that smallest distances are at the top.
// parse the set and combine any pair less than the smallest distance in
// to a combined pin.
// any plottable that's been combined is no longer eligible for combining
// so ignore on this parse.
List<PlottableTuple> sorted = new ArrayList<>(tuples);
Collections.sort(sorted);
Set<Plottable> done = new HashSet<>();
Set<Plottable> mergedSet = new HashSet<>();
for (PlottableTuple pt : sorted) {
if (!done.contains(pt.a) && pt.distance <= 225) {
Plottable merged = combine(pt, done);
done.add(pt.a);
for (PlottableTuple tup : pt.others) {
done.add(tup.a);
}
mergedSet.add(merged);
}
}
// if we haven't processed anything we are done just return visible
// list.
if (done.size() == 0) {
return visible;
} else {
// change the list to represent the new combined plottables and
// repeat the process.
visible.removeAll(done);
visible.addAll(mergedSet);
return subProcess(visible);
}
}
private Plottable combine(PlottableTuple pt, Set<Plottable> done) {
List<Plottable> plottables = new ArrayList<>();
plottables.addAll(pt.a.containingPlottables);
for (PlottableTuple otherTuple : pt.others) {
if (!done.contains(otherTuple.a)) {
plottables.addAll(otherTuple.a.containingPlottables);
}
}
int x = 0;
int y = 0;
for (Plottable p : plottables) {
Point position = p.location;
x += position.x;
y += position.y;
}
x = x / plottables.size();
y = y / plottables.size();
Plottable merged = new Plottable();
merged.containingPlottables.addAll(plottables);
merged.location = new Point(x, y);
return merged;
}
private void setDistances(List<PlottableTuple> tuples) {
System.out.println("pins: " + tuples.size());
int loops = 0;
// Start from the first item and loop through, then repeat but starting
// with the next item.
for (int startIndex = 0; startIndex < tuples.size() - 1; startIndex++) {
// Get the data for the start Plottable
PlottableTuple startTuple = tuples.get(startIndex);
Point startLocation = startTuple.a.location;
for (int i = startIndex + 1; i < tuples.size(); i++) {
loops++;
PlottableTuple compareTuple = tuples.get(i);
double distance = distance(startLocation, compareTuple.a.location);
setDistance(startTuple, compareTuple, distance);
setDistance(compareTuple, startTuple, distance);
}
}
System.out.println("loops " + loops);
}
private void setDistance(PlottableTuple from, PlottableTuple to,
double distance) {
if (distance < from.distance || from.others == null) {
from.distance = distance;
from.others = new HashSet<>();
from.others.add(to);
} else if (distance == from.distance) {
from.others.add(to);
}
}
private double distance(Point a, Point b) {
if (a.equals(b)) {
return 0.0;
}
double result = (((double) a.x - (double) b.x) * ((double) a.x - (double) b.x))
+ (((double) a.y - (double) b.y) * ((double) a.y - (double) b.y));
return result;
}
class PlottableTuple implements Comparable<PlottableTuple> {
public Plottable a;
public Set<PlottableTuple> others;
public double distance;
@Override
public int compareTo(PlottableTuple other) {
return Double.compare(distance, other.distance);
}
}
class Plottable {
public Point location;
private Set<Plottable> containingPlottables;
public Plottable(Set<Plottable> plots) {
this.containingPlottables = plots;
}
public Plottable() {
this.containingPlottables = new HashSet<>();
this.containingPlottables.add(this);
}
public Set<Plottable> getContainingPlottables() {
return containingPlottables;
}
}
}
Map all your circles on a 2D grid first. You then only need to compare the circles in a cell with the other circles in that cell and in its 8 neighbors (you can reduce that to five by using a brick pattern instead of a regular grid).
If you only need to be really approximate, then you can just group all the circles that fall into a cell together. You will probably also want to merge cells that only have a small number of circles together with their neighbors, but this will be fast.
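The approximate variant, where everything that falls into one cell becomes one marker, can be sketched roughly as follows. The names (GridClusters, Cluster, cluster) and the use of plain int arrays instead of the question's Plottable class are assumptions made to keep the example self-contained; cellSize would be chosen per zoom level.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GridClusters {
    // One merged marker: the centroid of a cell's points and how many points it represents.
    static final class Cluster {
        final int x;
        final int y;
        final int count;

        Cluster(int x, int y, int count) {
            this.x = x;
            this.y = y;
            this.count = count;
        }
    }

    // Groups points by grid cell in a single pass over the input.
    static List<Cluster> cluster(int[] xs, int[] ys, int cellSize) {
        // Per cell: [0] = sum of x, [1] = sum of y, [2] = number of points.
        Map<Long, long[]> cells = new HashMap<>();
        for (int i = 0; i < xs.length; i++) {
            long cx = Math.floorDiv(xs[i], cellSize);
            long cy = Math.floorDiv(ys[i], cellSize);
            long key = (cx << 32) | (cy & 0xffffffffL);
            long[] sums = cells.computeIfAbsent(key, k -> new long[3]);
            sums[0] += xs[i];
            sums[1] += ys[i];
            sums[2]++;
        }
        List<Cluster> result = new ArrayList<>(cells.size());
        for (long[] s : cells.values()) {
            result.add(new Cluster((int) (s[0] / s[2]), (int) (s[1] / s[2]), (int) s[2]));
        }
        return result;
    }
}
```

Two points that sit close together but on opposite sides of a cell border end up in different clusters, which is the price of the approximation; the neighbor-cell comparison described above avoids that.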
This problem is going to take a reasonable amount of computation no matter how you do it. The question then is: can you do all the computation up-front so that at run-time it's just doing a look-up? I would build a tree-like structure where each layer is all the points that need to be drawn for a given zoom level. It takes more computation up-front, but at run-time you are simply drawing a list of points, which is fast.
My idea is to decide what the resolution of each zoom level is (i.e. at zoom level 1 points closer than 15 get merged; at zoom level 2 points closer than 30 get merged), then go through your points making groups of points that are within 15 of each other and pick a point to represent that group at the higher zoom. Now you have a 2 layer tree. Then you pass over the second layer grouping all points that are within 30 of each other, and so on all the way up to your highest zoom level. Now save this tree structure to file, and at run-time you can very quickly change zoom levels by simply drawing all points at the appropriate tree level. If you need to add or remove points, that can be done dynamically by figuring out where to attach them to the tree.
There are two downsides to this method that come to mind: 1) it will take a long time to compute the tree, but you only have to do this once, and 2) you'll have to think really carefully about how you build the tree, based on how you want the groupings to be done at higher levels. For example, in the image below the top level may not be the right grouping that you want. Maybe instead of building the tree based off the previous layer, you always want to go back to the original points. That said, some loss of precision always happens when you're trying to trade off for faster run-time.
EDIT
So you have a problem which requires O(n^2) comparisons, you say it has to be done in real-time, can not be pre-computed, and has to be fast. Good luck with that.
Let's analyze the problem a bit; if you do no pre-computation then in order to decide which points can be merged you have to compare every pair of points, that's O(n^2) comparisons. I suggested building a tree before-hand, O(n^2 log n) once, but then runtime is just a lookup, O(1). You could also do something in between where you do some work before and some at run-time, but that's how these problems always go, you have to do a certain amount of computation, you can play games by doing some of it earlier, but at the end of the day you still have to do the computation.
For example, if you're willing to do some pre-computation, you could try keeping two copies of the list of points, one sorted by x-value and one sorted by y-value, then instead of comparing every pair of points, you can do 4 binary searches to find all the points within, say, a 30 unit box of the current point. More complicated so would be slower for a small number of points (say <100), but would reduce the overall complexity to O(n log n), making it faster for large amounts of data.
EDIT 2
If you're worried about multiple points at the same location, then why don't you do a first pass removing the redundant points, then you'll have a smaller "search list"
List<Point> searchList = new ArrayList<>();
for (Point pt1 : points) {
    boolean clean = true;
    for (Point pt2 : searchList) {
        if (pt1.distance(pt2) < epsilon) {
            clean = false;
            break;
        }
    }
    if (clean) {
        searchList.add(pt1);
    }
}
// Now you have a smaller list to act on with only 1 point per cluster
// ... I guess this is actually the same as my first suggestion if you make one of these search lists per zoom level. huh.
EDIT 3: Graph Traversal
A totally new approach would be to build a graph out of the points and do some sort of longest-edge-first graph traversal on them. So pick a point, draw it, and traverse its longest edge, draw that point, etc. Repeat this until you come to a point which doesn't have any untraversed edges longer than your zoom resolution. The number of edges per point gives you an easy way to tradeoff speed for correctness. If the number of edges per point was small and constant, say 4, then with a bit of cleverness you could build the graph in O(n) time and also traverse it to draw points in O(n) time. Fast enough to do it on the fly with no pre-computation.
Just a wild guess and something that occurred to me while reading responses from others.
Do a multi-step comparison. Assume your combining distance at the current zoom level is 20 meters. First, compute the absolute difference |X1 - X2|. If this is bigger than 20 meters then you are done: the points are too far apart. Next, compute |Y1 - Y2| and do the same thing to reject combining the points.
You could stop here and be happy if you are good with using only horizontal/vertical distances as your metric for combining. Much less math (no squaring or square roots). Pythagoras wouldn't be happy but your users might.
If you really insist on exact answers, do the two subtraction/comparison steps above. If the points are within horizontal and vertical limits, THEN you do the full Pythagoras check with square roots.
Assuming all your points are not highly clustered very close to the combining limit, this should save some CPU cycles.
This is still approximately an O(n^2) technique, but the math should be simpler. If you have the memory, you could store distances between each set of points and then you never have to compute it again. This could take up more memory than you have and also grows at a rate of approximately O(n^2), so be careful.
Also, you could make a linked list or sorted array of all your points, sorted in order of increasing X or increasing Y. (I don't think you need both, just one.) Then walk through the list in sorted order. For each point, check its neighbors until (X1 - X2) is bigger than your combining distance, and then stop. You don't have to compare each pair of points for O(N^2); you only have to compare neighbors that are close in one dimension to quickly prune your large list to a small one. As you move through the list, you only have to compare points that have a bigger X than your current candidate, because you already compared and combined with all previous X values. This gets you closer to the O(n) complexity you want. Of course, you would need to check the Y dimension and fully qualify the points to be combined before you actually do it. Don't just use the X distance to make your combining decision.
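The sorted walk described above, combined with the cheap X/Y rejection tests, might look something like this. It is a rough sketch only: the class and method names are invented, and it returns candidate pairs by original index rather than doing the actual combining.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SweepPairs {
    // Returns pairs of original indices {i, j} whose points are within 'limit' of each other.
    static List<int[]> closePairs(double[] x, double[] y, double limit) {
        // Sort indices by increasing x so the points themselves stay in place.
        Integer[] order = new Integer[x.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingDouble(i -> x[i]));

        double limitSq = limit * limit;
        List<int[]> pairs = new ArrayList<>();
        for (int a = 0; a < order.length; a++) {
            int i = order[a];
            // Only look at points with a bigger x; earlier ones were already handled.
            for (int b = a + 1; b < order.length; b++) {
                int j = order[b];
                double dx = x[j] - x[i];
                if (dx > limit) break;      // everything further along is too far in x
                double dy = Math.abs(y[j] - y[i]);
                if (dy > limit) continue;   // cheap rejection on y
                if (dx * dx + dy * dy <= limitSq) {
                    pairs.add(new int[] {i, j}); // full check, still without sqrt
                }
            }
        }
        return pairs;
    }
}
```

The sort costs O(n log n); the inner loop stays short only when few points share a narrow band of x values, so heavily stacked points still push it toward O(n^2).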
I've written an Adaline Neural Network. Everything that I have compiles, so I know that there isn't a syntax problem with what I've written, but how do I know that I have the algorithm correct? When I try training the network, my computer just says the application is running and it just keeps going. After about 2 minutes I stopped it.
Does training normally take this long (I have 10 parameters and 669 observations)?
Do I just need to let it run longer?
Here is my train method
public void trainNetwork()
{
int good = 0;
//train until all patterns are good.
while(good < trainingData.size())
{
for(int i=0; i< trainingData.size(); i++)
{
this.setInputNodeValues(trainingData.get(i));
adalineNode.run();
if(nodeList.get(nodeList.size()-1).getValue(Constants.NODE_VALUE) != adalineNode.getValue(Constants.NODE_VALUE))
{
adalineNode.learn();
}
else
{
good++;
}
}
}
}
And here is my learn method
public void learn()
{
Double nodeValue = value.get(Constants.NODE_VALUE);
double nodeError = nodeValue * -2.0;
error.put(Constants.NODE_ERROR, nodeError);
BaseLink link;
int count = inLinks.size();
double delta;
for(int i = 0; i < count; i++)
{
link = inLinks.get(i);
Double learningRate = value.get(Constants.LEARNING_RATE);
Double value = inLinks.get(i).getInValue(Constants.NODE_VALUE);
delta = learningRate * value * nodeError;
inLinks.get(i).updateWeight(delta);
}
}
And here is my run method
public void run()
{
double total = 0;
//find out how many input links there are
int count = inLinks.size();
for(int i = 0; i< count-1; i++)
{
//grab a specific link in sequence
BaseLink specificInLink = inLinks.get(i);
Double weightedValue = specificInLink.weightedInValue(Constants.NODE_VALUE);
total += weightedValue;
}
this.setValue(Constants.NODE_VALUE, this.transferFunction(total));
}
These functions are part of a library that I'm writing. I have the entire thing on Github here. Now that everything is written, I just don't know how I should go about actually testing to make sure that I have the training method written correctly.
I asked a similar question a few months ago.
Ten parameters with 669 observations is not a large data set. So there is probably an issue with your algorithm. There are two things you can do that will make debugging your algorithm much easier:
Print the sum of squared errors at the end of each iteration. This will help you determine if the algorithm is converging (at all), stuck at a local minimum, or just very slowly converging.
Test your code on a simple data set. Pick something easy like a two-dimensional input that you know is linearly separable. Will your algorithm learn a simple AND function of two inputs? If so, will it learn an XOR function (2 inputs, 2 hidden nodes, 2 outputs)?
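Both suggestions can be tried in isolation before touching the library. The sketch below is a stand-alone single Adaline unit trained with the LMS (delta) rule on a bipolar AND data set, printing the sum of squared errors after every epoch. It does not use the asker's classes; the names, the learning rate, and the stopping rule (stop when the SSE stops changing) are assumptions for illustration.

```java
final class AdalineAndDemo {
    // Linear output of the unit; the last weight is the bias.
    static double net(double[] w, double[] input) {
        double sum = w[w.length - 1];
        for (int i = 0; i < input.length; i++) {
            sum += w[i] * input[i];
        }
        return sum;
    }

    // Trains with the LMS rule and returns the weights {w1, ..., wn, bias}.
    static double[] train(double[][] inputs, double[] targets, double learningRate,
                          int maxEpochs, double tolerance) {
        double[] w = new double[inputs[0].length + 1];
        double previousSse = Double.MAX_VALUE;
        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            double sse = 0.0;
            for (int n = 0; n < inputs.length; n++) {
                double error = targets[n] - net(w, inputs[n]);
                sse += error * error;
                for (int i = 0; i < inputs[n].length; i++) {
                    w[i] += learningRate * error * inputs[n][i];
                }
                w[w.length - 1] += learningRate * error; // bias update
            }
            System.out.println("epoch " + epoch + " SSE = " + sse);
            // Stop when the error has settled, not when it is exactly zero.
            if (Math.abs(previousSse - sse) < tolerance) break;
            previousSse = sse;
        }
        return w;
    }

    public static void main(String[] args) {
        double[][] inputs = {{1, 1}, {1, -1}, {-1, 1}, {-1, -1}};
        double[] targets = {1, -1, -1, -1}; // bipolar AND
        double[] w = train(inputs, targets, 0.05, 1000, 1e-9);
        for (double[] in : inputs) {
            System.out.println(in[0] + " AND " + in[1] + " -> " + Math.signum(net(w, in)));
        }
    }
}
```

Because the output is linear, the SSE on this data settles at a nonzero value even though the sign of every output is correct, which is one reason a "train until every output matches exactly" loop can run forever.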
You should add debug/test-mode messages to watch whether the weights are saturating or converging. It is likely that good never reaches trainingData.size(), so the loop never ends.
Based on Double nodeValue = value.get(Constants.NODE_VALUE); I assume the node value is of type Double? If that's the case, then the comparison nodeList.get(nodeList.size()-1).getValue(Constants.NODE_VALUE) != adalineNode.getValue(Constants.NODE_VALUE) may never be satisfied exactly, because it is a double with a lot of other parameters involved in obtaining its value, and your convergence relies on it. Typically, while training a neural network, you stop when the error is within an acceptable limit (not on a strict equality like the one you are trying to check).
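A tolerance-based comparison could be as small as the helper below. The name closeEnough and the choice of tolerance are assumptions; the right tolerance depends on the scale of your outputs. Note also that if getValue returns a boxed Double, != compares object references rather than numeric values, so unboxing (or using a helper like this) matters there too.

```java
final class Approx {
    // True when the two values differ by no more than 'tolerance'.
    static boolean closeEnough(double expected, double actual, double tolerance) {
        return Math.abs(expected - actual) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println("exact match:  " + (sum == 0.3));
        System.out.println("within 1e-9:  " + closeEnough(0.3, sum, 1e-9));
    }
}
```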
Hope this helps
I was wondering if I could get some advice on increasing the overall efficiency of a program that implements a genetic algorithm. Yes this is an assignment question, but I have already completed the assignment on my own and am simply looking for a way to get it to perform better
Problem Description
My program at the moment reads a given chain made of two types of constituents, h or p (for example: hphpphhphpphphhpphph). For each H and P it generates a random move (Up, Down, Left, Right) and adds the move to an ArrayList contained in the "Chromosome" object. At the start the program generates 19 moves for each of 10,000 chromosomes.
SecureRandom sec = new SecureRandom();
byte[] sbuf = sec.generateSeed(8);
ByteBuffer bb = ByteBuffer.wrap(sbuf);
Random numberGen = new Random(bb.getLong());
int numberMoves = chromosoneData.length();
moveList = new ArrayList(numberMoves);
for (int a = 0; a < numberMoves; a++) {
int randomMove = numberGen.nextInt(4);
char typeChro = chromosoneData.charAt(a);
if (randomMove == 0) {
moveList.add(Move.Down);
} else if (randomMove == 1) {
moveList.add(Move.Up);
} else if (randomMove == 2) {
moveList.add(Move.Left);
} else if (randomMove == 3) {
moveList.add(Move.Right);
}
}
After this comes the selection of chromosomes from the population to cross over. My crossover function selects the first chromosome at random from the fittest 20% of the population and the other at random from outside of the top 20%. The chosen chromosomes are then crossed and a mutation function is called. I believe the area in which I am taking the biggest hit is calculating the fitness of each Chromosome. Currently my fitness function creates a 2D array to act as a grid, places the moves in order from the move list generated by the function shown above, and then loops through the array to do the fitness calculation. (I.e. having found an H at location [2,1], is the coordinate [1,1], [3,1], [2,0] or [2,2] also an H? If an H is found, it just increments the count of bonds found.)
After the calculation is complete the least fit chromosome is removed from my population and the new one is added and then the array list of chromosomes is sorted. Rinse and repeat until target solution is found
If you guys want to see more of my code to prove I actually did the work before asking for help, just let me know (I don't want to post too much, so other students can't just copy and paste my stuff).
As suggested in the comments, I have run the profiler on my application (I have never used it before, being only a first-year CS student) and my initial guess on where I am having issues was somewhat incorrect. From what the profiler is telling me, the big hotspots are:
When comparing the new chromosome to the others in the population to determine its position. I am doing this by implementing Comparable:
public int compareTo(Chromosome other) {
    if (this.fitness > other.fitness)
        return 1;
    if (this.fitness == other.fitness)
        return 0;
    else
        return -1;
}
The other problem area is my actual evolution function, which consumes about 40% of the CPU time. A code sample from that method is below:
double topPercentile = highestValue;
topPercentile = topPercentile * .20;
topPercentile = Math.ceil(topPercentile);
randomOne = numberGen.nextInt((int) topPercentile);
//Lower bound for random two so it comes from outside of top 20%
int randomTwo = numberGen.nextInt(highestValue - (int) topPercentile);
randomTwo = randomTwo + 25;
//System.out.println("Selecting First: " + randomOne + " Selecting Second: " + randomTwo);
Chromosome firstChrom = (Chromosome) populationList.get(randomOne);
Chromosome secondChrom = (Chromosome) populationList.get(randomTwo);
//System.out.println("Selected 2 Chromosones Crossing Over");
Chromosome resultantChromosome = firstChrom.crossOver(secondChrom);
populationList.add(resultantChromosome);
Collections.sort(populationList);
populationList.remove(highestValue);
Chromosome bestResult = (Chromosome) populationList.get(0);
The other main performance hit is the initial population seeding, which is performed by the first code sample in the post.
I believe the area in which I am taking the biggest hit is calculating the fitness of each Chromosome
If you are not sure then I assume you have not run a profiler on the program yet.
If you want to improve the performance, profiling is the first thing you should do.
Instead of repeatedly sorting your population, use a collection that maintains its contents already sorted. (e.g. TreeSet)
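One thing to watch with a TreeSet is that it treats elements whose comparison returns 0 as duplicates and silently drops them, so two chromosomes with equal fitness need a tie-breaker. A rough sketch, assuming higher fitness is better and using a hypothetical Candidate class with a unique id in place of the real Chromosome:

```java
import java.util.Comparator;
import java.util.TreeSet;

final class SortedPopulation {
    // Hypothetical stand-in for Chromosome: only a unique id and a fitness value.
    static final class Candidate {
        final long id;
        final int fitness;

        Candidate(long id, int fitness) {
            this.id = id;
            this.fitness = fitness;
        }
    }

    // Highest fitness first; the id breaks ties so equal-fitness candidates both stay.
    static final Comparator<Candidate> BY_FITNESS =
            Comparator.comparingInt((Candidate c) -> c.fitness).reversed()
                      .thenComparingLong(c -> c.id);

    private final TreeSet<Candidate> population = new TreeSet<>(BY_FITNESS);
    private final int capacity;

    SortedPopulation(int capacity) {
        this.capacity = capacity;
    }

    // Inserts in O(log n) and evicts the least fit member once over capacity.
    void add(Candidate c) {
        population.add(c);
        if (population.size() > capacity) {
            population.pollLast();
        }
    }

    Candidate best() {
        return population.first();
    }

    int size() {
        return population.size();
    }
}
```

A TreeSet has no get(index), so picking a random member from the top 20% needs a different approach than populationList.get(randomOne); whether that trade is worth it depends on how often you select versus insert.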
If your fitness measure is consistent across generations (i.e. not dependent on other members of the population) then I hope at least that you are storing it in the Chromosome object so you only calculate it once for each member of the population. With that in place you'd only be calculating fitness on the newly generated/assembled chromosome each iteration. Without more information on how fitness is calculated it's difficult to offer any optimisations in that area.
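Storing the fitness can be as simple as computing it lazily on first request. The class below is a hypothetical illustration, not the asker's Chromosome: the evaluator function stands in for the real grid-based fitness calculation, and the evaluations counter exists only to show that the expensive call happens once.

```java
import java.util.function.ToIntFunction;

final class CachedFitness {
    private final String moves;                    // stand-in for the chromosome's move list
    private final ToIntFunction<String> evaluator; // stand-in for the expensive fitness function
    private int fitness;
    private boolean evaluated;
    int evaluations;                               // how many times the evaluator actually ran

    CachedFitness(String moves, ToIntFunction<String> evaluator) {
        this.moves = moves;
        this.evaluator = evaluator;
    }

    // Computes the fitness on first use and returns the stored value afterwards.
    int fitness() {
        if (!evaluated) {
            fitness = evaluator.applyAsInt(moves);
            evaluated = true;
            evaluations++;
        }
        return fitness;
    }
}
```

This only works if the moves never change after construction; if mutation edits a chromosome in place, the cached value has to be reset at that point.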
Your random number generator seed doesn't need to be cryptographically strong.
Random numberGen = new Random();
A minor speedup when seeding your population is to remove all the testing and branching:
static Move[] moves = {Move.Down, Move.Up, Move.Left, Move.Right};
...
moveList.add(moves[randomMove]);