I was working on an application last night and came across a problem which I'm sure has an efficient algorithm to solve it. Could anyone suggest one?
Problem:
TL;DR: Maybe a picture will help: http://www.custom-foam-inserts.com/ . I have lots of items which fit in a range of compartments, and I want to minimise the number of cases I need to take.
I have a set of N items of expensive electronic equipment which I want to pack into specially designed protective boxes. These boxes each have many compartments, each of which can fit a single item: some are designed specially to fit a particular item (i.e. a camera-shaped hole) and some are generic (a rectangular hole). I know in advance that there are C different sizes of compartment and what sizes these are.
These boxes come in L different layouts, each with at least one compartment. A layout might be ‘two large rectangular compartments and 4 small circular compartments’.
Each compartment size is present on at least one layout, but not every item suits every compartment size. Each item fits at least one compartment and may fit into multiple different compartments: for example, my DSLR camera might be a tight fit in a ‘medium rectangle’ compartment, a loose fit in a ‘large rectangle’ and a perfect fit in a ‘DSLR camera compartment’, but won’t fit in a ‘small circle’. To this end I’ve made a list of which compartments are suitable for each item.
The items are moderately heterogeneous – for example there may be 50 items of one size and 20 items of another size.
Each box has two costs, Volume and Dollars (however, D is approximately proportional to V). I need to minimise one or both of these costs whilst fitting ALL of my items into the boxes. Due to the layouts of the boxes, the optimal solution may contain unused compartments. If two solutions have equal volume, select the one with the most unused compartments. Because each compartment is present on at least one layout and each item fits in at least one compartment, there is always a solution which fits all items.
Number of items: <=2000, average case 150.
Number of compartments: <= 1000.
Number of layouts: <= 1000.
Any ideas on this one? I've looked a bit at Knapsack and Bin Packing algorithms and I'm not sure they're the way to go. Help much appreciated.
From the problem description this does indeed seem to be a knapsack problem, since you have to maximise the space available while keeping in mind the weight of your options.
Depending on what you are after, you could also consider using a Genetic Algorithm. Since this problem is NP-complete, the running time of an exact algorithm will eventually explode should you need to add more items, so I would go with the exact approach primarily if I need the best solution available, irrespective of the time it takes.
On the other hand, the Genetic Algorithm should be able to provide some solution in a relatively short period of time. However, the solution it provides might not be as good as the one provided by the Knapsack algorithm, so I would choose a GA if I have restrictions on the time I have to provide a solution and I do not care if it is not the absolute best.
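One way to make the question's requirements precise is as a small integer program. This is only a sketch under assumptions that are not in the original posts: items are grouped into types $t$ with $n_t$ items each, $S(t)$ is the set of compartment sizes that fit type $t$, $k_{l,c}$ is the number of compartments of size $c$ in layout $l$, $V_l$ is the volume of one box of layout $l$, $x_l$ counts the boxes of layout $l$ taken, and $y_{t,c}$ counts the items of type $t$ placed in compartments of size $c$. All of these symbols are introduced here for illustration.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{l=1}^{L} V_l\, x_l \\
\text{s.t.}\quad & \sum_{c \in S(t)} y_{t,c} = n_t
    && \text{for every item type } t, \\
& \sum_{t \,:\, c \in S(t)} y_{t,c} \le \sum_{l=1}^{L} k_{l,c}\, x_l
    && \text{for every compartment size } c, \\
& x_l,\; y_{t,c} \in \mathbb{Z}_{\ge 0}.
\end{aligned}
```

The tie-break on unused compartments is left out of this sketch. The two constraint families could also serve as the feasibility check inside a GA fitness function.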
I am attempting to render around 500 points of data at a time for a hierarchical clustering operation as a demonstration using Java and OpenGL. In order to show its steps I would like to color individual points so that they are easily distinguishable, so that when clusters merge it's obvious which are merging.
I have used this list. But after separating out hard-to-distinguish colors and colors that are too light for my white background, I'm left with fewer than 50.
Is there a method to create unique, easily distinguished colors? I would need around 500 generated. I'd prefer a method if possible so that I do not have to handcode (or awk/sed) a list of them.
After some experimentation this problem seemed fairly difficult, if not downright impossible. Before I move on to numbering each cluster and hoping the numbers render correctly, I wanted to ask if this was possible, and additionally what the best method to achieve this would be.
We have two rooms of certain sizes (let's call it volume). We have a number of boxes that we have to fit in the two rooms. The boxes have certain sizes, and we cannot stack any boxes on top of each other. Our goal is to maximize the number of boxes in the two rooms using a backtracking algorithm. Any suggestions, please?
I guess my suggestion would be to think of this as a search problem in a tree or graph structure. What you need to do is keep trying different paths and save the "best solution" found so far. However, this could end up trying all of the possibilities and be O(n!). Therefore, I advise you to use some sort of pruning or logic so that this isn't the case, e.g. Alpha-Beta-style pruning, or don't pursue paths once they exceed some specification.
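A minimal sketch of such a pruned backtracking search, assuming each box consumes a single number (its size) from a room's capacity, which is one reading of the "no stacking" rule. The function name `max_boxes` and its parameters are hypothetical.

```python
def max_boxes(sizes, cap_a, cap_b):
    """Return the largest number of boxes that fit into two rooms."""
    sizes = sorted(sizes)  # small boxes first, so good solutions appear early
    n = len(sizes)
    best = 0

    def search(i, free_a, free_b, placed):
        nonlocal best
        best = max(best, placed)
        # Prune: even placing every remaining box cannot beat the best so far.
        if i == n or placed + (n - i) <= best:
            return
        # Prune: boxes are sorted, so if this one fits nowhere, no later one does.
        if sizes[i] > free_a and sizes[i] > free_b:
            return
        if sizes[i] <= free_a:
            search(i + 1, free_a - sizes[i], free_b, placed + 1)  # box i in room A
        if sizes[i] <= free_b:
            search(i + 1, free_a, free_b - sizes[i], placed + 1)  # box i in room B
        search(i + 1, free_a, free_b, placed)  # leave box i out

    search(0, cap_a, cap_b, 0)
    return best
```

For example, `max_boxes([4, 3, 3, 2], 5, 5)` places three boxes, since the four sizes add up to more than the two rooms can hold.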
For a current project, I want to use genetic algorithms. So far I have had a look at the Jenetics library.
How can I force some genes to be dependent on each other? I want to map CSS onto the genes, e.g. I have genes indicating whether an image is displayed and, in case it is, also its height and width. So I want to keep those genes together as a group, as it would make no sense if, after a crossover, the chromosome indicated something like "no image" - height 100px - width 0px.
Is there a method to do so? Or maybe another library (in Java) which supports this?
Many thanks!
You want to embed more knowledge into your system to reduce the search space.
If it were knowledge about the structure of the solution, I would propose taking a look at grammatical evolution (GE). Your knowledge appears to be more about valid combinations of codons, so GE is not easily applicable.
It might be possible to combine a few features into a single codon, but this may be undesirable and/or unfeasible (e.g. due to great number of possible combinations).
But in fact you don't have an issue here:
it's fine to have meaningless genotypes — they will be removed due to the selection pressure
it's fine to have meaningless codon sequences — it's called "bloat"; bloat is quite common to some evolutionary algorithms (usually discussed in the context of genetic programming) and is not strictly bad; fighting with bloat too much can reduce the search performance
If you know how your genome is encoded - that is, you know which sequences of genes form groups - then you could extend (since you mention Jenetics) io.jenetics.MultiPointCrossover to avoid splitting groups. (Source code available on GitHub.)
It could be as simple as storing the ranges of genes which form groups and, if one of the random cut indexes would split a group, adjusting the index to the nearest end of the group. (Of course this would cause a statistically higher likelihood of cuts at the ends of groups; it would probably be better to generate a new random location until it doesn't intersect a group.)
But it's also valid (as Pete notes) to have genes which aren't meaningful (ignored) based on other genes; if the combination is anti-survival it will be selected out.
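The group-preserving cut described above could be sketched as follows. This is plain Python for illustration, not Jenetics code: the function names, the list-based chromosomes and the `(start, end)` group ranges (end exclusive) are all assumptions. Choosing uniformly among the valid cut positions gives the same distribution as redrawing until the cut no longer intersects a group.

```python
import random


def splits_group(cut, groups):
    """True if cutting just before index `cut` separates genes of one group."""
    return any(start < cut < end for start, end in groups)


def pick_cut(length, groups, rng=random):
    """Pick a cut position that does not fall inside any group."""
    valid = [c for c in range(1, length) if not splits_group(c, groups)]
    if not valid:
        raise ValueError("every cut position would split a group")
    return rng.choice(valid)


def group_safe_crossover(a, b, groups, rng=random):
    """Single-point crossover of two equal-length lists that keeps groups whole."""
    cut = pick_cut(len(a), groups, rng)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]
```

With genes laid out as `[show_image, height, width, font_size]` and `groups = [(0, 3)]`, the only valid cut is before index 3, so the three image genes always travel together.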
Within a university course I have some features of images (as text files). I have to rank those images according to their diversity.
The idea I have in mind is to feed a k-means clusterer with the images and then compute the Euclidean distance from the images within a cluster to the cluster's centroid. Then rotate between clusters and always take the (next) closest image to the centroid. I.e., return closest to centroid 1, then closest to centroid 2, then 3.... then second closest to centroid 1, 2, 3 and so on.
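A minimal sketch of this round-robin ordering, assuming the cluster labels and centroids have already been computed by some k-means run. The function name `diversity_ranking` and its arguments are hypothetical, and it returns image indices rather than the images themselves.

```python
import math
from itertools import zip_longest


def diversity_ranking(features, labels, centroids):
    """Interleave clusters, each ordered by distance to its own centroid."""
    clusters = {}
    for idx, (vec, label) in enumerate(zip(features, labels)):
        dist = math.dist(vec, centroids[label])  # Euclidean distance
        clusters.setdefault(label, []).append((dist, idx))

    # One list of image indices per cluster, closest to the centroid first.
    ordered = [[idx for _, idx in sorted(members)]
               for _, members in sorted(clusters.items())]

    ranking = []
    # Round k takes the k-th closest image of every cluster that still has one.
    for round_ in zip_longest(*ordered):
        ranking.extend(i for i in round_ if i is not None)
    return ranking
```

Clusters of unequal size are handled by simply skipping a cluster once it runs out of images.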
First question: would this be a clever approach? Or am I on the wrong path?
Second question: I'm a bit confused. I thought I'd feed the data to Weka and it'd tell me "hey, if I were you, I'd split this data into 7 clusters", or something like that. I mean, that it'd be able to give me some information about the clusters I need. Instead, to use SimpleKMeans I'm supposed to know a priori how many clusters I'll use... how could I possibly know that?
One example of what I mean: let's say I have 3 mono-color images: light-blue, blue, red.
I thought Weka would notice that the 2 blues are similar and cluster them together.
Btw I'm kind of new to Weka (as you might have seen) so if you could provide some information on which functions I might want to use (and why :P) I'd be grateful!
Thank you!
Simple K-means is an algorithm where you have to specify the number of clusters in the data set up front.
If you don't know how many clusters there might be, it's better to use a different algorithm or to estimate the number of clusters first.
You can use X-means, where you don't need to specify the k parameter. (http://weka.sourceforge.net/doc.packages/XMeans/weka/clusterers/XMeans.html)
X-Means is K-Means extended by an Improve-Structure part. In this part of the algorithm the centers are attempted to be split in its region. The decision between the children of each center and itself is done comparing the BIC-values of the two structures.
Or you can inspect a cut-point chart based on AHC, the agglomerative hierarchical clustering algorithm (https://en.wikipedia.org/wiki/Hierarchical_clustering), and then deduce the number of clusters from it.
What I seek is to turn a grid into a somewhat "random" plane of tiles.
I tried just multiplying Math.random() individually with the width and height of the plane (in this case it's 800 x 600). The circles you see there are points that intersect each other and have been removed from the scene.
As you can see, it looks very far from an "evenly distributed" field of points. There are large holes and just as bad, clusters of points can be seen.
What I am looking for is a way to distribute these points better to have a minimum amount of clusters and holes. Ideally, to have a value that is the minimum distance between any two points, while having the maximum number of points that can fit in the area. I am fine with approximations of all kinds, I just don't want to attempt to do a greedy distribution.
Any ECMAScript solution you give is fine; I can convert it to ActionScript.
I have found a visual example. The left side is what I got and the right is what I aim for.
You can try Lloyd's algorithm, i.e. weighted centroidal Voronoi diagrams. Compute the Voronoi diagram and then the center of gravity of each cell. Replace the old points with these centers, then rinse and repeat: http://www-cs-students.stanford.edu/~amitp/game-programming/polygon-map-generation/.
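A rough sketch of this relaxation loop in Python. Note the swap: instead of computing an exact Voronoi diagram, it approximates each cell by assigning uniformly random sample points to their nearest site and moving the site to the mean of its samples. The function name `lloyd_relax` and all parameters are hypothetical.

```python
import random


def lloyd_relax(points, width, height, iterations=5, samples=4000, rng=random):
    """Approximate Lloyd relaxation using sampled (not exact) Voronoi cells."""
    pts = list(points)
    for _ in range(iterations):
        # Per site: running sum of x, sum of y, and sample count.
        sums = [[0.0, 0.0, 0] for _ in pts]
        for _ in range(samples):
            x, y = rng.uniform(0, width), rng.uniform(0, height)
            # The nearest site owns this sample, i.e. the sample lies in its cell.
            nearest = min(range(len(pts)),
                          key=lambda i: (pts[i][0] - x) ** 2 + (pts[i][1] - y) ** 2)
            s = sums[nearest]
            s[0] += x
            s[1] += y
            s[2] += 1
        # Move each site to the centroid of its samples; keep it if it got none.
        pts = [(sx / n, sy / n) if n else p
               for p, (sx, sy, n) in zip(pts, sums)]
    return pts
```

More samples give a closer approximation of the true cell centroids at the cost of run time; an exact version would need a Voronoi library and clipping of cells to the rectangle.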
In general, it is a non-trivial problem, and there are many different approaches.
One that I have liked, since it is fast and produces decent results, is the quasi-random number generator from this article: "The Unreasonable Effectiveness of Quasirandom Sequences"
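As a sketch of what that article's two-dimensional sequence (called R2 there) looks like in code: each coordinate advances by a fixed irrational step derived from the plastic number, wrapped to the unit interval, then scaled to the plane. The function name `r2_points` and the scaling to `width` and `height` are assumptions made for this example.

```python
# Plastic number: the real root of x**3 = x + 1.
PLASTIC = 1.32471795724474602596


def r2_points(n, width, height):
    """Return n quasirandom points covering a width x height rectangle."""
    a1 = 1.0 / PLASTIC
    a2 = 1.0 / (PLASTIC * PLASTIC)
    points = []
    for i in range(1, n + 1):
        u = (0.5 + a1 * i) % 1.0  # fractional part, in [0, 1)
        v = (0.5 + a2 * i) % 1.0
        points.append((u * width, v * height))
    return points
```

The sequence is deterministic and needs no neighbour checks, which is why it is fast; asking for more points simply extends the same pattern.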
Other approaches are generally iterative, where the more iterations you do, the better results. You could look up "Mitchell's Best Candidate", for one. Another is "Poisson Disc Sampling".
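A small sketch of Mitchell's best-candidate idea, under the assumption that a fixed number of candidates per new point is enough: for each new point, draw several random candidates and keep the one farthest from all points placed so far. The name `best_candidate_points` and its parameters are hypothetical.

```python
import math
import random


def best_candidate_points(n, width, height, candidates=10, rng=random):
    """Place n points, each chosen as the best of several random candidates."""
    if n <= 0:
        return []
    pts = [(rng.uniform(0, width), rng.uniform(0, height))]
    while len(pts) < n:
        best, best_dist = None, -1.0
        for _ in range(candidates):
            c = (rng.uniform(0, width), rng.uniform(0, height))
            # Distance from this candidate to its nearest existing point.
            d = min(math.dist(c, p) for p in pts)
            if d > best_dist:
                best, best_dist = c, d
        pts.append(best)
    return pts
```

Raising `candidates` spreads the points more evenly but costs more time, since every candidate is compared against every existing point.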
There are innumerable variations on the different algorithms depending on what you want — some applications demand certain frequencies of noise, for instance. But if you just want something that "looks okay", I think the quasirandom one is a good starting point.
Another cheap and easy one is a "jittered grid", where you evenly space the points on your plane, then randomly adjust each one a small amount.
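The jittered grid is short enough to sketch in full. Here `jitter` is a hypothetical parameter between 0 (a perfect grid) and 1 (each point may land anywhere in its own cell); the function name is likewise made up for this example.

```python
import random


def jittered_grid(cols, rows, width, height, jitter=0.5, rng=random):
    """Return cols * rows points: cell centres nudged by a random offset."""
    cell_w, cell_h = width / cols, height / rows
    points = []
    for r in range(rows):
        for c in range(cols):
            # Centre of the grid cell in column c, row r.
            cx, cy = (c + 0.5) * cell_w, (r + 0.5) * cell_h
            # Offset of at most jitter * half a cell in each direction.
            dx = rng.uniform(-0.5, 0.5) * jitter * cell_w
            dy = rng.uniform(-0.5, 0.5) * jitter * cell_h
            points.append((cx + dx, cy + dy))
    return points
```

Because every point stays inside its own cell when `jitter` is at most 1, large holes cannot form, although two points in neighbouring cells can still end up close together at high jitter values.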