I am working on a project in Android for my Signal Processing course. My aim is to find signal properties, such as periodicity, even/odd, causality etc, given a user-input function. Right now, I am stuck at trying to figure out how to programmatically calculate the periodicity of a given function. I know the basics behind periodicity: f(t+T) = f(t)
The only idea I have right now is to extensively calculate values for the function, and then check for repetition of values. I know the method is quite stupid, given the fact I don't know how many such values I need to calculate to determine if it is periodic or not.
I know this can be done easily in Matlab, but it is very difficult to port Matlab code to Java. Is there something I am missing? I have been searching a lot, but haven't found anything useful.
Thanks for any help, in advance!
If the function f is given as a symbolic expression, then the equation you stated can in some cases be solved symbolically. The amount of work required for this will depend on how your functions are described, what kinds of functions you allow, what libraries you use and so on.
If your only interaction with the function is evaluating it, i.e. if you treat the function description as a black box or obtain its values from some sensor, then your best bet would be a Fourier transformation of the data, to convert it from the time domain into the frequency domain. In particular, you probably want to choose the number of samples to analyze as a power of two, and then use an FFT to quickly obtain intensities for the various frequencies.
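The FFT approach could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class and method names (PeriodEstimator, estimatePeriod) are made up for this example, the sample count is assumed to be a power of two, and the samples are assumed to be evenly spaced dt apart. It reports the period of the strongest frequency component, which is not necessarily the fundamental period of the function (a strong harmonic would win), and its resolution is limited to one FFT bin.

```java
/** Sketch: estimate the dominant period of a sampled signal from its FFT peak. */
class PeriodEstimator {

    /** In-place iterative radix-2 FFT; the length must be a power of two. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Bit-reversal permutation.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Butterfly passes over blocks of growing length.
        for (int len = 2; len <= n; len <<= 1) {
            double ang = -2 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(ang * k), wi = Math.sin(ang * k);
                    int a = start + k, b = a + len / 2;
                    double xr = re[b] * wr - im[b] * wi;
                    double xi = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - xr; im[b] = im[a] - xi;
                    re[a] += xr;        im[a] += xi;
                }
            }
        }
    }

    /**
     * Returns the period (in the units of dt) of the strongest non-constant
     * frequency component, or NaN if the signal is essentially constant.
     */
    static double estimatePeriod(double[] samples, double dt) {
        int n = samples.length;
        double mean = 0;
        for (double s : samples) mean += s;
        mean /= n;
        double[] re = new double[n], im = new double[n];
        for (int i = 0; i < n; i++) re[i] = samples[i] - mean; // drop the DC offset
        fft(re, im);
        int peak = 0;
        double best = 1e-9 * n; // ignore bins that are only rounding noise
        for (int k = 1; k <= n / 2; k++) {
            double mag = Math.hypot(re[k], im[k]);
            if (mag > best) { best = mag; peak = k; }
        }
        return peak == 0 ? Double.NaN : n * dt / peak;
    }

    public static void main(String[] args) {
        double[] s = new double[1024];
        for (int i = 0; i < s.length; i++) s[i] = Math.sin(2 * Math.PI * 8 * i / 1024.0);
        System.out.println(estimatePeriod(s, 1.0 / 1024)); // 8 cycles per unit time
    }
}
```

A sharp, isolated peak suggests a periodic signal; a flat or smeared spectrum suggests there is no clear period within the sampled window.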
My goal is to generate a predictive model using TensorFlow in Java, but I first want to ensure that my goal is achievable. Firstly, if I have a bunch of parameters and each set of parameters is assigned an output is it possible to train a model to predict an output given similar parameters? I am able to get hundreds of thousands of samples (if needed) in order to train it, so is this possible?
Secondly, after the model is trained how fast can it actually generate results?
Lastly, assuming everything up until this point checks out, what is the best method in TensorFlow for Java to train a model with data that has multiple parameters associated with an outcome? Also, if a given piece of data satisfies two results, can both be returned as options, ordered from most likely to least?
Also, just to clarify: I am not asking someone to make this for me. I am just trying to make sure that a solution exists and is quick (if it’s slow I could just go back to brute forcing, which I am trying to move away from since it is kinda slow and resource intensive). Also, if you have any pointers on getting started tackling this, I would greatly appreciate it!
Your question is very, very general, but I'll try to offer some insight:
Firstly, if I have a bunch of parameters and each set of parameters is assigned an output is it possible to train a model to predict an output given similar parameters?
Taking a set of parameters (known as the feature set X) and making predictions of another set of parameters (known as the output set Y) is the primary purpose of machine learning. Doing this requires many steps, and doing it well takes a lot of experience. However, if you are asking whether it is possible in principle, that depends on the specific feature set X and output set Y.
I am able to get hundreds of thousands of samples (if needed) in order to train it so is this possible?
The trick to machine learning is that the data must be of sufficient quantity and quality. Judging whether it is takes domain-specific knowledge.
Are you able to provide any specifics about your data to help us understand?
In a simulation I get some data that looks like an arctan or tanh function.
I want to implement a function fit in Java to obtain the parameters of this function for an optimization. For other functions I used, for example, the Apache code for fitting polynomial and Gaussian functions, but I couldn't find a solution for tangent.
To be honest, I don't know how to write such a function fit, so maybe someone can help me fix this problem or knows whether a function fit already exists for such functions.
There is an example model called "Calibration of agent based SIR model" that does what you are looking for: Calibrate model parameters so the output matches a given function (not tangent in this example but easy to adjust)
Short answer
AnyLogic does not have any data-fitting capabilities built-in, other than simple interpolation of discrete data (see Table Functions in the help). So
(a) if you needed to do it in-model (e.g., driven by some model state), you'd need to find a suitable Java library that did what was missing in what you'd already tried (Apache Commons), and call that from the AnyLogic model;
(b) if you could do it outside the model, use a data-fitting tool like Stat::Fit (which exists as a plug-in for some sim tools like Simul8, but not for AnyLogic).
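For option (a), if no library turns out to fit, a small hand-written least-squares fit is also feasible. The sketch below fits the model y = a*tanh(b*(x - c)) + d with a basic Levenberg-Marquardt loop. The model form, the class name TanhFit and the starting guess are assumptions made for illustration; they do not come from the question. Like any local optimiser, it depends on a reasonable starting guess.

```java
/** Sketch: fit y = a*tanh(b*(x - c)) + d by damped least squares. */
class TanhFit {

    static double model(double[] p, double x) {
        return p[0] * Math.tanh(p[1] * (x - p[2])) + p[3];
    }

    static double sse(double[] x, double[] y, double[] p) {
        double s = 0;
        for (int i = 0; i < x.length; i++) {
            double r = y[i] - model(p, x[i]);
            s += r * r;
        }
        return s;
    }

    /** Gaussian elimination with partial pivoting; returns null if singular. */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            }
            double[] tr = a[col]; a[col] = a[piv]; a[piv] = tr;
            double tb = b[col]; b[col] = b[piv]; b[piv] = tb;
            if (Math.abs(a[col][col]) < 1e-300) return null;
            for (int r = col + 1; r < n; r++) {
                double f = a[r][col] / a[col][col];
                b[r] -= f * b[col];
                for (int c = col; c < n; c++) a[r][c] -= f * a[col][c];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = b[r];
            for (int c = r + 1; c < n; c++) s -= a[r][c] * x[c];
            x[r] = s / a[r][r];
        }
        return x;
    }

    /** Returns the fitted {a, b, c, d}, starting from the caller's guess. */
    static double[] fit(double[] x, double[] y, double[] start) {
        double[] p = start.clone();
        double lambda = 1e-3;
        double err = sse(x, y, p);
        for (int iter = 0; iter < 200; iter++) {
            double[][] jtj = new double[4][4];
            double[] jtr = new double[4];
            for (int i = 0; i < x.length; i++) {
                double t = Math.tanh(p[1] * (x[i] - p[2]));
                double s = 1 - t * t; // derivative of tanh
                double[] g = { t, p[0] * s * (x[i] - p[2]), -p[0] * s * p[1], 1.0 };
                double r = y[i] - (p[0] * t + p[3]);
                for (int u = 0; u < 4; u++) {
                    jtr[u] += g[u] * r;
                    for (int v = 0; v < 4; v++) jtj[u][v] += g[u] * g[v];
                }
            }
            for (int u = 0; u < 4; u++) jtj[u][u] *= 1 + lambda; // damping
            double[] step = solve(jtj, jtr);
            if (step == null) { lambda *= 10; continue; }
            double[] trial = new double[4];
            double norm = 0;
            for (int u = 0; u < 4; u++) {
                trial[u] = p[u] + step[u];
                norm += step[u] * step[u];
            }
            double trialErr = sse(x, y, trial);
            if (trialErr < err) {        // accept the step, trust the model more
                p = trial; err = trialErr; lambda = Math.max(lambda / 10, 1e-12);
            } else {                     // reject the step, damp harder
                lambda *= 10;
            }
            if (Math.sqrt(norm) < 1e-12) break;
        }
        return p;
    }

    public static void main(String[] args) {
        double[] x = new double[41], y = new double[41];
        for (int i = 0; i < 41; i++) {
            x[i] = -5 + 0.25 * i;
            y[i] = 2.0 * Math.tanh(0.8 * (x[i] - 0.5)) + 1.0; // synthetic data
        }
        System.out.println(java.util.Arrays.toString(fit(x, y, new double[] {1.8, 1.0, 0.3, 0.9})));
    }
}
```

A rough starting guess can often be read off the data: half the y-range for a, the mid-level for d, and the x position of the steepest rise for c.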
Longer answer
Based on your additional explanatory comments, it sounds like this is a question where it's crucial to properly explain your context, particularly the intended interaction between the simulation and the (mathematical) Gurobi optimisation; perhaps you don't need to use data-fitting at all (and there may be a more 'AnyLogic-centric' way of approaching it in that case). Note that AnyLogic has built-in heuristic optimisation via OptQuest, so any normal discussion of 'optimisation' with AnyLogic refers to that.
On the one hand you seem to suggest you want to fit a function to some input data to your simulation. (You talk about having Excel inputs and wanting to fit a curve to it.)
On the other hand, you seem to suggest you want an approach where you are optimising at intermediate time intervals based on run-time model state. But what is the optimiser determining and how do its results affect the ongoing execution of the simulation? You say "So it is not about an optimization of the whole model but of intermediate results. Since I didn't find a solution for this". What 'solution' are you looking for? This sounds like an approach where you're modelling decisions for time period N being made inside the simulation, where those decisions are based on an optimisation using the outcomes from period N-1 as its inputs (and thus the optimisation is effectively based on a simplified emulation of the simulation using a function, since the simulation is already supposed to be the most-accurate computational representation of the real-world system).
So perhaps(?) you're saying that you are emulating/approximating the simulation as a function of its input data (where you happen to think a tangent function fits). In which case the original suggestion (a) is probably the only thing that makes sense. Though, even then, when you are optimising for anything after the first time period, the 'inputs' are no longer the original model inputs; they are some representation of the simulation's current state/outcomes (so it's not clear that this relates to the Excel input data directly, and so maybe I'm barking up the wrong tree).
I'm trying to sort a collection of AnyLogic GISRegions by their geographical area. Said area is calculated using GISRegion.area(units), which is straightforward enough. The areas I'm using, however, are city-scale and the method returns a double. This appears to cause overflow problems:
I don't think I'm doing anything wrong with my code, so presumably this is an AnyLogic problem. For brevity, I've included a line that prints each region's area rather than the sorting steps:
// For each region of the Australian Capital Territory, print its area in km^2:
areas.forEach(next -> traceln(""+next.name+": " + next.gisRegion.area(SQ_KILOMETER)));
Has anyone encountered this issue? How did you get around it?
For non-AnyLogic users, I have all the lat-long points in each geoshape. How might I calculate the area using those points?
[Not really a full answer, but the ideas are too long for a comment.]
I assume you've raised an AnyLogic support request since it seems 100% a bug. Since this is just a basic 'calculate area' function, I can't see any way round it other than, as you suggest, calculating it in an alternative manner from the vertex lat/longs that you have, and can get via getPoints() on the GISRegion.
Since this is just an N-sided polygon, surely there must be standard Java libraries that could calculate that, though that's not allowing for the GIS projection (not sure what level of error that might introduce); you'd expect open GIS libraries to cope with the latter. Since a GISRegion has a createOMGraphicObject() method to create an OpenMap standard(?) format graphic, that could be useful if that's a standard format other libraries can work with.
There's code in glennon's answer to this GIS StackOverflow question that claims to perform the calculation (or you may be able to hook into PostGIS as in fmark's answer).
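For reference, one common way to approximate the area directly from the vertex lat/longs is a spherical polygon formula. The sketch below is not the code from the linked answers; it assumes a spherical Earth with a mean radius of 6371 km, vertices listed in order around the ring, and a polygon that neither crosses the antimeridian nor contains a pole. The class and method names are hypothetical, and how the coordinates are extracted from a GISRegion is not shown.

```java
/** Sketch: approximate area of a lat/long polygon on a spherical Earth. */
class SphericalArea {
    // Assumed mean Earth radius; an ellipsoidal model would be more accurate.
    static final double EARTH_RADIUS_KM = 6371.0;

    /**
     * Area in square kilometres. Coordinates are in degrees; the ring is
     * closed implicitly and may be listed in either orientation.
     */
    static double areaSqKm(double[] latDeg, double[] lonDeg) {
        int n = latDeg.length;
        double sum = 0;
        for (int i = 0; i < n; i++) {
            int j = (i + 1) % n; // next vertex, wrapping back to the first
            double dLon = Math.toRadians(lonDeg[j] - lonDeg[i]);
            sum += dLon * (2 + Math.sin(Math.toRadians(latDeg[i]))
                             + Math.sin(Math.toRadians(latDeg[j])));
        }
        return Math.abs(sum) * EARTH_RADIUS_KM * EARTH_RADIUS_KM / 2.0;
    }

    public static void main(String[] args) {
        // One-degree cell touching the equator.
        double[] lat = {0, 0, 1, 1};
        double[] lon = {0, 1, 1, 0};
        System.out.println(areaSqKm(lat, lon) + " km^2");
    }
}
```

At city scale the spherical assumption usually introduces only a small relative error, which should not matter for sorting regions by size.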
I am developing a financial manager in my free time with Java and a Swing GUI. When the user adds a new entry, he is prompted to fill in: amount of money, date, comment and Section (e.g. Car, Salary, Computer, Food,...)
The sections are created "on the fly". When the user enters a new section, it will be added to the section-jcombobox for further selection. The other point is, that the comments could be in different languages. So the list of hard coded words and synonyms would be enormous.
So, my question is: is it possible to analyse the comment (e.g. "Fuel", "Car service", "Lunch at **") and preselect a fitting Section?
My first thought was, do it with a neural network and learn from the input, if the user selects another section.
But my problem is, I don't know how to start at all. I tried "encog" with Eclipse and did some tutorials (XOR,...). But all of them only use doubles as input/output.
Could anyone give me a hint on how to start, or suggest any other possible solution for this?
Here is a runnable JAR (current development state, requires Java7) and the Sourceforge Page
Forget about neural networks. This is a highly technical and specialized field of artificial intelligence, which is probably not suitable for your problem, and requires solid expertise. Besides, there are plenty of simpler and better solutions for your problem.
The first obvious solution: build a list of words and synonyms for all your sections and scan the comments for them. You can then collect comments online for synonym analysis, or parse the comments/sections provided by your users to statistically detect relations between words, etc.
There is an infinite number of possible solutions, ranging from the simplest to the most overkill. Now you need to decide whether this feature of your system is critical (prefilling? probably not, then) and what any development effort will bring you. One hour of work could bring you an 80% satisfying feature, while aiming for 90% would cost one week of work. Is it really worth it?
Go for the simplest solution and tackle the real challenge of any dev project: delivering. Once your app is delivered, then you can always go back and improve as needed.
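// paramInput is assumed to be the comment text as a String.
// Normalise case so that "fuel", "Fuel" and "FUEL" all match.
String myString = paramInput.toUpperCase(java.util.Locale.ROOT);
if (myString.contains("FUEL")) {
    // do the fuel functionality, e.g. preselect the matching Section
}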
In a simple app with only a few specific sections, you can take the comment string, check whether it contains certain keywords, and set the Section accordingly.
If you have a lot of categories, I would use something like Apache Lucene, where you could index all the categories with their names and potential keywords/phrases that might appear in a user's description. Then you could simply run the description through Lucene and use the top matched category as a "best guess".
P.S. Neural Network inputs and outputs are typically doubles or floats, usually normalised to a range such as 0 to 1. As for how to implement String matching with one, I wouldn't even know where to start.
It seems to me that the following will do:
hard word statistics
maybe a stemming class (English/Spanish) which reduces a word like "lunches" to "lunch".
a list of the most frequent non-words (stop words such as the, at, a, for, ...)
Finding the best fit is a linear problem, so in theory it suits a neural net, but why not compute the numerical best fit directly?
A machine learning algorithm such as an Artificial Neural Network doesn't seem like the best solution here. ANNs can be used for multi-class classification (i.e. 'which of the provided pre-trained classes does the input belong to?', not just 'does the input represent an X?'), which fits your use case. The problem is that they are supervised learning methods, and as such you need to provide a list of pairs of keywords and classes (Sections) that spans every possible input that your users will provide. This is impossible; in practice, ANNs are re-trained when more data becomes available, to produce better results and a more accurate decision boundary / representation of the function that maps the inputs to outputs. This also assumes that you know all possible classes before you start and that you provide training input values for each of those classes.
The issue is that the input to your ANN (a list of characters or a numerical hash of the string) provides no context by which to classify. There's no higher level information provided that describes the word's meaning. This means that a different word that hashes to a numerically close value can be misclassified if there was insufficient training data.
(As maclema said, the output from an ANN will always be floats with each value representing proximity to a class - or a class with a level of uncertainty.)
A better solution would be to employ some kind of word-relation or synonym graph. A Bag of words model might be useful here.
Edit: In light of your comment that you don't know the Sections beforehand, an easy solution to program would be to provide a list of keywords in a file that gets updated as people use the program. Simply storing a mapping of provided comments -> Sections, which you will already have in your database, would allow you to filter out non-keywords (and, or, the, ...).
One option is then to find every Section that the typed keywords belong to, suggest multiple Sections and let the user pick one. The feedback that you get from user selections would enable better suggestions in the future. Another would be to calculate a Bayesian probability (the probability that this word belongs to Section X given the previously stored mappings) for all keywords and Sections, and either take the modal Section or normalise over each unique keyword and take the mean. The probabilities will need to be updated as you gather more information, of course; perhaps this could be done in a background thread with every new addition.
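The keyword-to-Section mapping with per-keyword probabilities could be sketched like this. It is a minimal illustration assuming Java 8 or later; the class name SectionSuggester, its methods and the tiny stop-word list are invented for the example. For each known keyword it estimates the share of past entries that went to each Section, sums those shares per Section (which ranks the same as taking the mean), and returns the Sections ordered from most to least likely.

```java
import java.util.*;

/** Sketch: suggest Sections for a comment from previously stored mappings. */
class SectionSuggester {
    // Illustrative stop-word list; a real one would be longer and per language.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "at", "a", "an", "for", "and", "or", "of", "in"));

    // keyword -> (Section -> number of entries where they appeared together)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    private static List<String> tokens(String comment) {
        List<String> out = new ArrayList<>();
        // Split on anything that is not a letter or digit, in any language.
        for (String w : comment.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!w.isEmpty() && !STOP_WORDS.contains(w)) out.add(w);
        }
        return out;
    }

    /** Records that the user filed this comment under the given Section. */
    void learn(String comment, String section) {
        for (String w : new HashSet<>(tokens(comment))) {
            counts.computeIfAbsent(w, k -> new HashMap<>()).merge(section, 1, Integer::sum);
        }
    }

    /** Returns known Sections for the comment, most likely first. */
    List<String> suggest(String comment) {
        Map<String, Double> score = new HashMap<>();
        for (String w : new HashSet<>(tokens(comment))) {
            Map<String, Integer> bySection = counts.get(w);
            if (bySection == null) continue; // keyword never seen before
            int total = 0;
            for (int c : bySection.values()) total += c;
            for (Map.Entry<String, Integer> e : bySection.entrySet()) {
                // Share of this keyword's past entries that went to the Section.
                score.merge(e.getKey(), e.getValue() / (double) total, Double::sum);
            }
        }
        List<String> ranked = new ArrayList<>(score.keySet());
        ranked.sort((s1, s2) -> Double.compare(score.get(s2), score.get(s1)));
        return ranked;
    }

    public static void main(String[] args) {
        SectionSuggester s = new SectionSuggester();
        s.learn("Fuel for the car", "Car");
        s.learn("Car service", "Car");
        s.learn("Lunch at work", "Food");
        System.out.println(s.suggest("Car lunch fuel"));
    }
}
```

Calling learn whenever the user confirms or corrects a Section keeps the suggestions improving, and because nothing is hard coded it works for Sections created on the fly and for comments in any language.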
I've been researching this off-and-on for a few months.
I'm looking for a library or working example code to detect the frequency in sound card audio input, or detect presence of a given set of frequencies. I'm leaning towards Java, but the real requirement is that it should be something higher-level/simpler than C, and preferably cross-platform. Linux will be the target platform but I want to leave options open for Mac or possibly even Windows. Python would be acceptable too, and if anyone knows of a language that would make this easier/has better pre-written libraries, I'd be willing to consider it.
Essentially I have a defined set of frequency pairs that will appear in the soundcard audio input and I need to be able to detect this pair and then... do something, such as for example record the following audio up to a maximum duration, and then perform some action. A potential run could feature say 5-10 pairs, defined at runtime, can't be compiled in: something like frequency 1 for ~ 1 second, a maximum delay of ~1 second, frequency 2 for ~1 second.
I found suggestions of either doing an FFT or Goertzel algorithm, but was unable to find any more than the simplest example code that seemed to give no useful results. I also found some limitations with Java audio and not being able to sample at a high enough rate to get the resolution I need.
Any suggestions for libraries to use or maybe working code? I'll admit that I'm not the most mathematically inclined, so I've been lost in some of the more technical descriptions of how the algorithms actually work.
If you are aiming at detecting frequency pairs then your job is very similar to a DTMF detector.
Try searching for DTMF on sites like SourceForge; you'll find detectors in many programming languages. The placement of the DTMF frequency pairs along the spectrum seems to be even more stringent than your specs, so you should be fine adapting a DTMF detector to your input.
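DTMF detectors are often built on the Goertzel algorithm mentioned in the question, which measures the energy at a single target frequency without computing a full FFT. A minimal sketch follows; the class name, the block length and the sample rate are illustrative assumptions, and capturing audio from the sound card is not shown.

```java
/** Sketch: Goertzel algorithm for the signal power at one target frequency. */
class Goertzel {

    /**
     * Returns the squared magnitude of the DFT bin nearest to targetHz
     * for one block of samples recorded at sampleRate (in Hz).
     */
    static double power(double[] samples, double sampleRate, double targetHz) {
        int n = samples.length;
        int k = (int) Math.round(n * targetHz / sampleRate); // nearest bin
        double coeff = 2 * Math.cos(2 * Math.PI * k / n);
        double s1 = 0, s2 = 0;
        for (double x : samples) {
            double s0 = x + coeff * s1 - s2; // second-order recurrence
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    public static void main(String[] args) {
        double rate = 8000;
        double[] block = new double[800]; // 0.1 s of audio
        for (int i = 0; i < block.length; i++) {
            block[i] = Math.sin(2 * Math.PI * 1000 * i / rate); // 1000 Hz test tone
        }
        System.out.println("power at 1000 Hz: " + power(block, rate, 1000));
        System.out.println("power at 1500 Hz: " + power(block, rate, 1500));
    }
}
```

To detect a frequency pair, one could run this on consecutive blocks for each target frequency, treat a frequency as present while its power stays above a threshold chosen by experiment, and then check that the second frequency follows the first within the allowed delay. Because the target frequencies are plain parameters, the pairs can be defined at runtime.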
Check out SNDPeek. It's a cross-platform C++ application that extracts all kinds of information from live audio: https://github.com/RobQuistNL/sndpeek