I have made a small program that computes logic circuits' truth tables. In the representation I have chosen (out of ignorance, I have no schooling on the subject), I use a Circuit class, and a Connector class to represent "circuits" (including basic gates such as NOT, OR...) and wiring.
A Factory class is used to "solder the pins and wires" with statements looking like this
factory.addCircuit("OR0", CircuitFactory.OR);
factory.addConnector("N0OUT", "NOT0", 0, "AND1", 1);
When the circuit is complete
factory.createTruthTable();
computes the circuit's truth table.
Inputting the truth tables for OR, NOT and AND, the code has chained the creation of XOR, 1/2 ADDER, ADDER and 4-bit ADDER, reusing the previous step's truth table at each step.
It's all very fine and dandy for an afternoon's work, but it will obviously break on loops (as an example, flip-flops). Does anyone know of a convenient way to represent a logic circuit with loops? Ideal would be if it could be represented with a table, maybe a table with previous states, new states and delays.
Pointing me to literature describing such a representation would also be fine. One hour of internet searching brought up only a PhD thesis, a little beyond my understanding.
Thanks a lot!
Any loop must contain at least one node with "state", of which the flip-flop (or register) is the fundamental building block. An effective approach is to split all stateful nodes into two nodes; one acting as a data source, the other as a data sink. So you now have no loops.*
To simulate, on every clock cycle,** you propagate your data values from sources to sinks in a feedforward fashion. Then you update your stateful sources (from their corresponding sinks), ready for the next cycle.
* If you still have loops at this point, then you have an invalid circuit graph.
** I'm assuming you want to simulate synchronous logic, i.e. you have a clock, and state only updates on clock edges. If you want to simulate asynchronous logic, then things get trickier, as you need to start modelling propagation delays and so on.
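The split-and-propagate scheme above can be sketched in plain Java. This is a hypothetical illustration (FlipFlop and Simulator are made-up names, not part of the asker's Circuit/Connector classes): each stateful node is split into a source half that the combinational logic reads and a sink half that it writes, so the evaluated graph itself contains no loops.

```java
import java.util.ArrayList;
import java.util.List;

// Each flip-flop is split into a "source" half (read by the logic) and a
// "sink" half (written by the logic); state only commits on the clock edge.
class FlipFlop {
    private boolean state;      // the source half: presented to the logic
    private boolean nextState;  // the sink half: captured from the logic

    boolean read()        { return state; }      // source side
    void write(boolean v) { nextState = v; }     // sink side
    void clockEdge()      { state = nextState; } // commit on the clock edge
}

class Simulator {
    private final List<FlipFlop> flipFlops = new ArrayList<>();
    private Runnable combinational = () -> {};   // feed-forward evaluation

    FlipFlop addFlipFlop() {
        FlipFlop f = new FlipFlop();
        flipFlops.add(f);
        return f;
    }

    void setCombinational(Runnable logic) { combinational = logic; }

    // One clock cycle: propagate sources to sinks, then update all state.
    void step() {
        combinational.run();
        for (FlipFlop f : flipFlops) f.clockEdge();
    }
}
```

For example, a T flip-flop wired through a NOT gate (`t.write(!t.read())`) toggles every cycle; the loop goes through the flip-flop's state, so the feed-forward pass itself never cycles.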
Related
In a simulation I get data that looks like an arctan or tanh function.
I want to implement a function fit in Java to get the parameters of this function for optimization. For other functions I used, for example, the Apache code for fitting polynomial and Gaussian functions, but I couldn't find a solution for tangent.
To be honest, I don't know how to write such a function fit, so maybe someone can help me with this problem, or knows whether a function fit for such functions already exists.
There is an example model called "Calibration of agent based SIR model" that does what you are looking for: calibrate model parameters so the output matches a given function (not tangent in this example, but easy to adjust).
Short answer
AnyLogic does not have any data-fitting capabilities built-in, other than simple interpolation of discrete data (see Table Functions in the help). So
(a) if you needed to do it in-model (e.g., driven by some model state), you'd need to find a suitable Java library that did what was missing in what you'd already tried (Apache Commons), and call that from the AnyLogic model;
(b) if you could do it outside the model, use a data-fitting tool like Stat::Fit (which exists as a plug-in for some sim tools like Simul8, but not for AnyLogic).
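As a crude plain-Java illustration of what route (a) involves computationally, here is a sketch that fits y = a * tanh(b * x) by a coarse grid search over the squared error. TanhFit and the parameter ranges are made up for illustration; a real implementation would rather use a proper least-squares optimiser (e.g. from Apache Commons Math), called from the AnyLogic model.

```java
// Toy curve fit: search a coarse (a, b) grid for the pair minimizing the
// sum of squared errors of y = a * tanh(b * x) against the data points.
class TanhFit {
    static double[] fit(double[] xs, double[] ys) {
        double bestA = 0, bestB = 0, bestErr = Double.MAX_VALUE;
        for (double a = 0.1; a <= 5.0; a += 0.05) {
            for (double b = 0.1; b <= 5.0; b += 0.05) {
                double err = 0;
                for (int i = 0; i < xs.length; i++) {
                    double d = ys[i] - a * Math.tanh(b * xs[i]);
                    err += d * d;
                }
                if (err < bestErr) { bestErr = err; bestA = a; bestB = b; }
            }
        }
        return new double[] { bestA, bestB }; // best (a, b) on the grid
    }
}
```

A grid search only gets you within the grid resolution, of course; it is meant to show the shape of the problem, not to replace a real optimiser.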
Longer answer
Based on your additional explanatory comments, it sounds like this is a question where it's crucial to properly explain your context, and perhaps you don't need to use data-fitting at all (in which case there may be a more 'AnyLogic-centric' way of approaching it), particularly around the intended interaction between the simulation and (mathematical) Gurobi optimisation. Note that AnyLogic has built-in heuristic optimisation via OptQuest, so any normal discussion of 'optimisation' with AnyLogic refers to that.
On the one hand you seem to suggest you want to fit a function to some input data to your simulation. (You talk about having Excel inputs and wanting to fit a curve to it.)
On the other hand, you seem to suggest you want an approach where you are optimising at intermediate time intervals based on run-time model state. But what is the optimiser determining and how do its results affect the ongoing execution of the simulation? You say "So it is not about an optimization of the whole model but of intermediate results. Since I didn't find a solution for this". What 'solution' are you looking for? This sounds like an approach where you're modelling decisions for time period N being made inside the simulation, where those decisions are based on an optimisation using the outcomes from period N-1 as its inputs (and thus the optimisation is effectively based on a simplified emulation of the simulation using a function, since the simulation is already supposed to be the most-accurate computational representation of the real-world system).
So perhaps(?) you're saying that you are emulating/approximating the simulation as a function of its input data (where you happen to think a tangent function fits). In which case the original suggestion (a) is probably the only thing that makes sense. Though, even then, when you are optimising for anything after the first time period, the 'inputs' are no longer the original model inputs; they are some representation of the simulation's current state/outcomes (so it's not clear that this relates to the Excel input data directly, and so maybe I'm barking up the wrong tree).
I am working on a project where I was provided a Java matrix-multiplication program that can run in a distributed system, and which is run like so:
usage: java Coordinator maxtrix-dim number-nodes coordinator-port-num
For example:
java blockMatrixMultiplication.Coordinator 25 25 54545
Here's a snapshot of what the output looks like:
I want to extend this code with some kind of failsafe ability, and am curious how I would create checkpoints within a running matrix-multiplication calculation. The general idea is to recover to where it was in a computation (it doesn't need to be that fine-grained; recovering to the beginning, i.e. row 0, column 0, would do).
My first idea is to use log files (like Apache log4j), where I would log the relevant matrix status. Then, if the app is forcibly shut down in the middle of a calculation, we could recover to a reasonable checkpoint.
Should I use MySQL for such a task (or maybe a more lightweight database)? Or would a basic log file (and some useful Apache libraries) be good enough? Any tips appreciated, thanks!
source-code :
MatrixMultiple
Coordinator
Connection
DataIO
Worker
If I understand the problem correctly, all you need to do is recover your place in a single matrix calculation in the event of a crash or if the application is quit half way through.
Minimum Viable Solution
The simplest approach would be to recover just the two matrices you were actively multiplying, but none of your progress, and multiply them from the beginning the next time you load the application.
The Process:
At the beginning of public static int[][] multiplyMatrix(int[][] a, int[][] b) in your MatrixMultiple class, create a file, let's call it recovery_data.txt with the state of the two arrays being multiplied (parameters a and b). Alternatively, you could use a simple database for this.
At the end of public static int[][] multiplyMatrix(int[][] a, int[][] b) in your MatrixMultiple class, right before you return, clear the contents of the file, or wipe your database.
When the program is initially run, most likely near the beginning of main(String[] args), you should check whether the contents of the text file are non-empty, in which case you should multiply the contents of the file and display the output; otherwise proceed as usual.
Notes on implementation:
Using a simple text file or a full-fledged relational database is a decision you are going to have to make, mostly based on real-world details that only you know, but in my mind a text file wins out in most situations, and here are my reasons why. You are going to read the data sequentially to rebuild your matrices, so being relational is not that useful. Databases are harder to work with (not too hard, but compared to a text file there is no question), and since you would not make much use of querying, that isn't balanced out by the ways they usually make a programmer's life easier.
Consider how you are going to store your arrays. In a text file you have several options; my recommendation would be to store each row on a line of text, separated by spaces or commas or some other character, and then put an extra blank line before the second matrix. I think a similar approach is used in crAlexander's answer here, but I have not tested his code. Alternatively, you could use something more structured like JSON, but I think that would be too heavy-handed to justify. If you are using a database, the relational structure should make several logical arrangements for your data apparent as well.
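A minimal sketch of that file format: one matrix row per line, values separated by spaces, and a blank line between the two matrices. The Recovery class name is made up; the file name recovery_data.txt is the one suggested above, and int matrices match the multiplyMatrix signature.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

// Save and load two int matrices: rows as space-separated lines, a blank
// line separating the first matrix from the second.
class Recovery {
    static void save(File f, int[][] a, int[][] b) throws IOException {
        try (PrintWriter out = new PrintWriter(new FileWriter(f))) {
            writeMatrix(out, a);
            out.println();          // blank line separates the matrices
            writeMatrix(out, b);
        }
    }

    private static void writeMatrix(PrintWriter out, int[][] m) {
        for (int[] row : m) {
            StringBuilder sb = new StringBuilder();
            for (int v : row) sb.append(v).append(' ');
            out.println(sb.toString().trim());
        }
    }

    // Returns { matrixA, matrixB } parsed back from the file.
    static int[][][] load(File f) throws IOException {
        List<int[]> first = new ArrayList<>(), second = new ArrayList<>();
        List<int[]> current = first;
        try (BufferedReader in = new BufferedReader(new FileReader(f))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) { current = second; continue; }
                String[] parts = line.split("\\s+");
                int[] row = new int[parts.length];
                for (int i = 0; i < parts.length; i++) row[i] = Integer.parseInt(parts[i]);
                current.add(row);
            }
        }
        return new int[][][] { first.toArray(new int[0][]), second.toArray(new int[0][]) };
    }
}
```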
Strategic Checkpoints
You expressed interest in saving some computation by taking advantage of the possibility that some of the calculations will have already been handled the last time the program ran. Let's first look at the pros and cons of adding checkpoints after every row has been processed, as best I can see them.
Pros:
Save computation time next time the program is run, if the system had been closed.
Cons:
Making the extra writes will either use more nodes if distributed (more on that later) or increase overall latency, because you now have to perform a write operation for every checkpoint.
More complicated to implement (but probably not by too much)
If my comments on the implementation of the Minimum Viable Solution convinced you that you could get away with a text file, I take back the parts about not leveraging queries and everything being accessed sequentially; with checkpointing, a database is now perhaps the smarter choice.
I'm not saying that checkpoints are definitely not the better solution, just that I don't know if they are worth it, but here is what I would consider:
Do you expect people to be quitting half way through a calculation frequently relative to the total amount of calculations they will be running? If you think this feature will be used a lot, then the pro of adding checkpoints becomes much more significant relative to the con of it slowing down calculations as a whole.
Does it take a long time to complete a typical calculation that people are providing the program? If so, the added latency I mentioned in the cons is (percentage wise) smaller, and so perhaps more tolerable, but users are already less happy with performance, and so that cancels out some of the effect there. It also makes the argument for checkpointing more significant because it has the potential to save more time.
And so I would only recommend checkpointing like this if you expect a relatively large amount of instances where this is happening, and if it takes a relatively large amount of time to complete a calculation.
If you decide to go with checkpoints, then modify the approach to:
after every row of the output array has been processed, write the content of that row to your database, or, if you are using the text file, append it at the end after another empty line to separate it from the last matrix.
on startup, if you need to finish a calculation that was already begun, solve and distribute only the rows that have yet to be considered, and retrieve the content of the other rows from your database.
A quick point on implementing frequent checkpoints: You could greatly reduce the extra latency from adding in frequent checkpoints by pushing this task out to an additional thread. Doing this would use more processes, and there is always some latency in actually spawning the process or thread, but you do not have to wait for the entire write operation to be completed before proceeding.
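That background-thread idea could look something like the following sketch, using a single-threaded executor so writes stay ordered. AsyncCheckpointer is a made-up name, and an in-memory map stands in here for the real file or database write.

```java
import java.util.concurrent.*;

// Push checkpoint writes onto a background thread so the multiplication
// loop does not block on I/O.
class AsyncCheckpointer {
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    // In-memory stand-in for the real file/database checkpoint store.
    final ConcurrentMap<Integer, int[]> store = new ConcurrentHashMap<>();

    void checkpointRow(int rowIndex, int[] row) {
        int[] copy = row.clone(); // snapshot: the solver may keep mutating the row
        writer.submit(() -> store.put(rowIndex, copy));
    }

    // Flush pending writes before the program exits.
    void shutdown() throws InterruptedException {
        writer.shutdown();
        writer.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

Cloning the row before handing it to the executor matters: without the snapshot, the writer thread could observe a half-updated row.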
A quick warning on the implementation of any such failsafe method
If there is an unchecked edge case that means some sort of invalid matrix crashes the program, this failsafe now bricks the program entirely by retrying it on every start. To combat this, I see some obvious solutions, but perhaps a bit of thought would let you modify my approaches into something you prefer:
Use plenty of try/catch statements; if you get any sort of error that seems to be caused by malformed data, wipe your recovery file, or modify it to add a note that tells your program to treat it as a special case. A good treatment of this special case may be to display the two matrices at startup with an explanation that your program failed to multiply them, likely due to malformed content.
Add data in your file/database on how many times the program has quit while solving the current problem, if this is not the first resume, treat it like the special case in the above option.
I hope that this provided enough information for you to implement your failsafe in the way that makes the most sense given what you suspect the realistic use to be, and note that there are perhaps other ways you could approach this problem as well, and these could equally have their own lists of pros and cons to take into consideration.
I have 2D hydraulic data, which are multigigabyte text files containing depth and velocity information for each point in a grid, broken up into time steps. Each timestep contains a depth/velocity value for every point in the grid. So you could follow one point through each timestep and see how its depth/velocity changes. I want to read in this data one timestep at a time, calculating various things - the maximum depth a grid cell achieves, max velocity, the number of the first timestep where water is more than 2 feet deep, etc. The results of each of these calculations will be a grid - max depth at each point, etc.
So far, this sounds like the Decorator pattern. However, I'm not sure how to get the results out of the various calculations - each calculation produces a different grid. I would have to keep references to each decorator after I create it in order to extract the results from it, or else add a getResults() method that returns a map of different results, etc, neither of which sound ideal.
Another option is the Strategy pattern. Each calculation is a different algorithm that operates on a time step (current depth/velocity) and the results of previous rounds (max depth so far, max velocity so far, etc). However, these previous results are different for each computation - which means either the algorithm classes become stateful, or it becomes the caller's job to keep track of previous results and feed them in. I also dislike the Strategy pattern because the behavior of looping over the timesteps becomes the caller's responsibility - I'd like to just give the "calculator" an iterator over the timesteps (fetching them from the disk as needed) and have it produce the results it needs.
Additional constraints:
Input is large and being read from disk, so iterating exactly once, by time step, is the only practical method
Grids are large, so calculations should be done in place as much as possible
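To make the second option concrete, here is roughly what I have in mind (all names are placeholders): each calculation keeps its own state and result grid behind a common interface, and a single driver makes exactly one pass over the timesteps.

```java
import java.util.Iterator;
import java.util.List;

// One timestep hands each calculator the depth and velocity grids; each
// calculator accumulates its own result grid.
interface GridCalculator {
    void accept(double[][] depth, double[][] velocity); // one timestep
    double[][] results();
}

// Example calculation: maximum depth each grid cell achieves.
class MaxDepthCalculator implements GridCalculator {
    private double[][] max;

    public void accept(double[][] depth, double[][] velocity) {
        if (max == null) max = new double[depth.length][depth[0].length];
        for (int i = 0; i < depth.length; i++)
            for (int j = 0; j < depth[i].length; j++)
                max[i][j] = Math.max(max[i][j], depth[i][j]);
    }

    public double[][] results() { return max; }
}

class Driver {
    // Exactly one pass over the timesteps feeds every registered calculator.
    static void run(Iterator<double[][][]> timesteps, List<? extends GridCalculator> calcs) {
        while (timesteps.hasNext()) {
            double[][][] step = timesteps.next(); // step[0] = depth, step[1] = velocity
            for (GridCalculator c : calcs) c.accept(step[0], step[1]);
        }
    }
}
```

The iterator can fetch timesteps from disk lazily, so only one timestep is in memory at a time, and each calculator updates in place.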
If I understand your problem right, you have grid points, each of which has many timesteps, and each timestep has a depth and a velocity. And you have GBs of data.
I would suggest doing one pass over the data and storing the parsed data in an RDBMS, then running queries or stored procedures on it. That way, at least the application will not run out of memory.
First, maybe I've not well understood the issue and miss the point in my answer, in which case I apologize for taking your time.
At first sight I would think of an approach that's more akin to the "strategy pattern", in combination with a data-oriented base, something like the following pseudo-code:
foreach timeStep
    readGridData
    foreach activeCalculator in activeCalculators
        useCalculatorPointerListToAccessSpecificStoredDataNeededForNewCalculation
        performCalculationOnFreshGridData
        updateUpdatableData
        presentUpdatedResultsToUser
        storeGridResultsInDataPool(OfResultBaseClassType)
        discardNoLongerNeededStoredGridResults
    next calculator
next timeStep
Again, sorry if this is off the point.
I am trying to implement k-means on Hadoop 1.0.1 in Java. I am frustrated now. Although I found a GitHub link to a complete implementation of k-means, as a newbie in Hadoop I want to learn it without copying other people's code. I have basic knowledge of the map and reduce functions available in Hadoop. Can somebody give me an idea of how to implement the k-means mapper and reducer classes? Does it require iteration?
OK, I'll give it a go and tell you what I thought when implementing k-means in MapReduce.
This implementation differs from that of Mahout, mainly because it is to show how the algorithm could work in a distributed setup (and not for real production usage).
Also I assume that you really know how k-means works.
That being said, we have to divide the whole algorithm into three main stages:
Job level
Map level
Reduce level
The Job Level
The job level is fairly simple, it is writing the input (Key = the class called ClusterCenter and Value = the class called VectorWritable), handling the iteration with the Hadoop job and reading the output of the whole job.
VectorWritable is a serializable implementation of a vector, in this case from my own math library, but actually nothing else than a simple double array.
The ClusterCenter is mainly a VectorWritable, but with convenience functions that a center usually needs (averaging for example).
In k-means you have a seed set of k vectors that are your initial centers, and some input vectors that you want to cluster. That is exactly the same in MapReduce, but I am writing them to two different files. The first file only contains the vectors with a dummy key center, and the other file contains the real initial centers (namely cen.seq).
After all that is written to disk you can start your first job. This will of course first launch a Mapper which is the next topic.
The Map Level
In MapReduce it is always smart to know what is coming in and what is going out (in terms of objects).
So from the job level we know that we have ClusterCenter and VectorWritable as input, whereas the ClusterCenter is currently just a dummy. For sure we want to have the same as output, because the map stage is the famous assignment step from normal k-means.
You read the real centers file you created at the job level into memory for comparison between the input vectors and the centers. For that you have a distance metric defined; in the mapper it is hardcoded to the ManhattanDistance.
To be a bit more specific, you get a part of your input in map stage and then you get to iterate over each input "key value pair" (it is a pair or tuple consisting of key and value) comparing with each of the centers. Here you are tracking which center is the nearest and then assign it to the center by writing the nearest ClusterCenter object along with the input vector itself to disk.
Your output is then: n-vectors along with their assigned center (as the key).
Hadoop is now sorting and grouping by your key, so you get every assigned vector for a single center in the reduce task.
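Stripped of the Hadoop wiring (the Mapper subclass and Context.write), the assignment step described above looks roughly like this in plain Java; the class and method names are my own.

```java
// The k-means assignment step: find the nearest center for one input
// vector, using the Manhattan distance the answer says is hardcoded.
class Assignment {
    static double manhattan(double[] a, double[] b) {
        double d = 0;
        for (int i = 0; i < a.length; i++) d += Math.abs(a[i] - b[i]);
        return d;
    }

    // Returns the index of the nearest center; in the real mapper, that
    // center becomes the output key and the vector the output value.
    static int nearestCenter(double[] vector, double[][] centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double d = manhattan(vector, centers[c]);
            if (d < bestDist) { bestDist = d; best = c; }
        }
        return best;
    }
}
```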
The Reduce Level
As told above, you will have a ClusterCenter and its assigned VectorWritable's in the reduce stage.
This is the usual update step you have in normal k-means. So you are simply iterating over all vectors, summing them up and averaging them.
Now you have a new "mean" which you can compare to the mean it was assigned before. Here you can measure the difference between the two centers, which tells us how much the center moved. Ideally it wouldn't have moved at all, meaning it has converged.
The counter in Hadoop is used to track this convergence, the name is a bit misleading because it actually tracks how many centers have not converged to a final point, but I hope you can live with it.
Basically you are writing now the new center and all the vectors to disk again for the next iteration. In addition in the cleanup step, you are writing all the new gathered centers to the path used in the map step, so the new iteration has the new vectors.
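The update step, again in plain Java without the Reducer plumbing (the Update class name and the epsilon-based convergence test are my own; the real code compares centers and bumps a Hadoop counter when one is still moving):

```java
import java.util.List;

// The k-means update step: average the vectors assigned to one center,
// then check how far the center moved.
class Update {
    // The "averaging" convenience a ClusterCenter would provide.
    static double[] average(List<double[]> assigned) {
        double[] mean = new double[assigned.get(0).length];
        for (double[] v : assigned)
            for (int i = 0; i < v.length; i++) mean[i] += v[i];
        for (int i = 0; i < mean.length; i++) mean[i] /= assigned.size();
        return mean;
    }

    // Has the center moved less than eps (Manhattan distance)?
    static boolean converged(double[] oldCenter, double[] newCenter, double eps) {
        double moved = 0;
        for (int i = 0; i < oldCenter.length; i++)
            moved += Math.abs(oldCenter[i] - newCenter[i]);
        return moved < eps;
    }
}
```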
Back at the job level, the MapReduce job should now be done, and we inspect that job's counter to get the number of centers that haven't converged yet.
This counter is used in a while loop to determine whether the whole algorithm can come to an end or not.
If not, return to the Map Level paragraph again, but use the output from the previous job as the input.
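Putting it together, the whole iteration scheme can be sketched self-contained in plain Java, with a `moved` count playing the role of the Hadoop counter. This is a toy single-machine version to show the control flow, not the MapReduce code itself.

```java
// Toy k-means loop: each pass plays the role of one MapReduce job, and
// "moved" plays the role of the counter of not-yet-converged centers.
class KMeansLoop {
    static double[][] run(double[][] data, double[][] centers, double eps, int maxIter) {
        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step ("map"): nearest center by Manhattan distance.
            int[] assign = new int[data.length];
            for (int i = 0; i < data.length; i++) {
                double best = Double.MAX_VALUE;
                for (int c = 0; c < centers.length; c++) {
                    double d = 0;
                    for (int j = 0; j < data[i].length; j++)
                        d += Math.abs(data[i][j] - centers[c][j]);
                    if (d < best) { best = d; assign[i] = c; }
                }
            }
            // Update step ("reduce"): average the vectors per center.
            double[][] next = new double[centers.length][centers[0].length];
            int[] counts = new int[centers.length];
            for (int i = 0; i < data.length; i++) {
                counts[assign[i]]++;
                for (int j = 0; j < data[i].length; j++)
                    next[assign[i]][j] += data[i][j];
            }
            int moved = 0; // the "counter": centers that have not converged yet
            for (int c = 0; c < centers.length; c++) {
                if (counts[c] == 0) { next[c] = centers[c]; continue; } // keep empty centers
                double shift = 0;
                for (int j = 0; j < next[c].length; j++) {
                    next[c][j] /= counts[c];
                    shift += Math.abs(next[c][j] - centers[c][j]);
                }
                if (shift > eps) moved++;
            }
            centers = next;
            if (moved == 0) break; // every center has converged
        }
        return centers;
    }
}
```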
That was actually the whole voodoo.
For obvious reasons this shouldn't be used in production, because its performance is horrible. Better to use the more tuned version in Mahout. But for educational purposes this algorithm is fine ;)
If you have any more questions, feel free to write me a mail or comment.
I want to implement a reinforcement learning connect four agent.
I am unsure how to do so and how it should look. I am familiar with the theoretical aspects of reinforcement learning but don't know how they should be implemented.
How should it be done?
Should I use TD(lambda) or Q-learning, and how do minimax trees come into this?
How do my Q and V functions work (quality of an action and value of a state)? How do I score those things? What is the base policy that I improve, and what is my model?
Another thing: how should I save the states or state/action pairs (depending on the learning algorithm)? Should I use neural networks, and if yes, how?
I am using Java.
Thanks.
This might be a more difficult problem than you think, and here is why:
The action space for the game is the choice of column to drop a piece into. The state space for the game is an MxN grid. Each column contains up to M pieces distributed among the 2 players. This means there are (2^(M+1) - 1)^N states. For a standard 6x7 board, this comes out to about 10^15. It follows that you cannot apply reinforcement learning to the problem directly. The state value function is not smooth, so naïve function approximation would not work.
But not all is lost. For one thing, you could simplify the problem by separating the action space. If you consider the value of each column separately, based on the two columns next to it, you reduce N to 3 and the state space size to about 10^6. Now this is very manageable. You can create an array to represent this value function and update it using a simple RL algorithm, such as SARSA.
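A tabular SARSA update over such an array is tiny. The sketch below shows just the update rule Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a)); the state encoding, action selection and reward scheme are up to you, and the names here are made up.

```java
// Tabular SARSA: an array-backed Q function and its one-step update.
class Sarsa {
    final double[][] q;       // q[state][action]
    final double alpha, gamma; // learning rate and discount factor

    Sarsa(int states, int actions, double alpha, double gamma) {
        this.q = new double[states][actions];
        this.alpha = alpha;
        this.gamma = gamma;
    }

    // One SARSA step: observed (s, a, r, s', a').
    void update(int s, int a, double reward, int sNext, int aNext) {
        double target = reward + gamma * q[sNext][aNext];
        q[s][a] += alpha * (target - q[s][a]);
    }
}
```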
Note that the payoff for the game is very delayed, so you might want to use eligibility traces to accelerate learning.