How to find the regression of a given input in java? - java

I have two arrays of input data; call them p and t. I need to find the regression function of these columns so I can accurately predict future values. Is there any way to do this in Java? If so, what is the best way?

I know of:
apache commons math (see stat.regression).
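For a straight-line fit, the calculation is short enough to write without a library. The sketch below is a dependency-free ordinary least-squares fit of t = intercept + slope * p; the class name `LinearFit` and its methods are made up for illustration. As far as I know, the `SimpleRegression` class in Commons Math's stat.regression package covers the same ground if you prefer the library.

```java
/** Ordinary least-squares line through (p[i], t[i]); all names here are illustrative. */
final class LinearFit {
    final double slope;
    final double intercept;

    private LinearFit(double slope, double intercept) {
        this.slope = slope;
        this.intercept = intercept;
    }

    /** Fits t = intercept + slope * p by minimizing the squared residuals. */
    static LinearFit fit(double[] p, double[] t) {
        if (p.length != t.length || p.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = p.length;
        double meanP = 0.0, meanT = 0.0;
        for (int i = 0; i < n; i++) {
            meanP += p[i];
            meanT += t[i];
        }
        meanP /= n;
        meanT /= n;

        double sxy = 0.0, sxx = 0.0; // co-deviation of p and t, squared deviation of p
        for (int i = 0; i < n; i++) {
            double dp = p[i] - meanP;
            sxy += dp * (t[i] - meanT);
            sxx += dp * dp;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all p values are identical");
        }
        double slope = sxy / sxx;
        return new LinearFit(slope, meanT - slope * meanP);
    }

    /** Predicts t for a new p using the fitted line. */
    double predict(double p) {
        return intercept + slope * p;
    }

    public static void main(String[] args) {
        LinearFit fit = fit(new double[] {1, 2, 3, 4}, new double[] {3, 5, 7, 9});
        System.out.println("slope=" + fit.slope + " intercept=" + fit.intercept);
        System.out.println("prediction at p=5: " + fit.predict(5));
    }
}
```

This only handles a linear relationship between p and t; if the data is curved, a polynomial or other model would be needed instead.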

Related

How to get all three roots from a cube root function in Java 8

I have to compute (-5.428271)^(1/3) and want the complex number "0.8787335 + 1.522011i" as well as the other two results.
Is there a method to do that?
If there isn't, how can I create it?
Apache Commons Math has an nthRoot method returning a list of complex values. If you do not want the dependency, the Javadoc has the formula.
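If you want to avoid the dependency, the roots follow from de Moivre's formula: a real x has modulus |x| and argument 0 (or pi when x is negative), and its n roots have modulus |x|^(1/n) and arguments (theta + 2*pi*k)/n for k = 0..n-1. A minimal sketch, where the class name `ComplexRoots` and the {re, im} array representation are my own choices rather than any library's API:

```java
/** Computes all complex n-th roots of a real number; names are illustrative. */
final class ComplexRoots {

    /** Returns n roots, each as a {real, imaginary} pair, starting with the principal root. */
    static double[][] nthRoots(double x, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        double modulus = Math.pow(Math.abs(x), 1.0 / n);
        double theta = x < 0 ? Math.PI : 0.0; // argument of a real number
        double[][] roots = new double[n][2];
        for (int k = 0; k < n; k++) {
            double angle = (theta + 2.0 * Math.PI * k) / n;
            roots[k][0] = modulus * Math.cos(angle);
            roots[k][1] = modulus * Math.sin(angle);
        }
        return roots;
    }

    public static void main(String[] args) {
        for (double[] root : nthRoots(-5.428271, 3)) {
            System.out.println(root[0] + " + " + root[1] + "i");
        }
    }
}
```

Expect tiny floating-point noise in parts that should be exactly zero, such as the imaginary part of the real root.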

Java library to calculate the relative difference between two Strings? [duplicate]

I'm looking for a way to programmatically detect the delta ratio between two strings. I can use string length, but this doesn't give much useful information for like-sized but different inputs. There is a Java diff tool on Google Code, Java Diff Utils, but it hasn't been updated since 2011 and I don't need to actually modify the strings themselves.
I'm attempting to do change detection with threshold values, for instance: Updated string is 42% different than existing string, are you sure you want to proceed?
Does anyone know of a library that could be used for this, or is java-diff-utils my only option? I couldn't find much in apache commons, and googling is returning irrelevant information.
You could use the Levenshtein distance to calculate how different two strings are from each other. The math behind it is fairly involved, but the actual code is rather short. You can easily rewrite the code in that wiki in Java.
The difference is measured as an integer: the number of steps it takes to turn one string into the other. A step may be a character addition, removal, or replacement with another character. It tells you the number of steps, but not which steps, nor in which order. Then again, since you only want to measure the total difference, that should be enough information for your needs.
edit: one of the commenters (kaos) provided a link to an implementation of Levenshtein Distance in the Apache Commons.
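To get from an edit distance to a "42% different" style threshold, one option is to divide the distance by the length of the longer string. The sketch below is a plain two-row dynamic-programming version of Levenshtein distance; the class name `StringDelta` and the choice of normalizing by the longer length are my assumptions, not something prescribed by the answers above.

```java
/** Levenshtein distance plus a percentage-style difference; names are illustrative. */
final class StringDelta {

    /** Number of single-character insertions, deletions, or replacements to turn a into b. */
    static int levenshtein(CharSequence a, CharSequence b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Distance as a percentage of the longer string's length (0 = identical, 100 = fully different). */
    static double percentDifferent(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 0.0;
        }
        return 100.0 * levenshtein(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(percentDifferent("existing string", "updated string") + "% different");
    }
}
```

The running time grows with the product of the two lengths, so for very long inputs it may be worth comparing line by line or word by word instead.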

How to check if two Strings are approximately equal?

I'm making a chat responder for a game and I want to know if there is a way to compare two strings and see if they are approximately equal to each other. For example:
if someone typed:
"Strength level?"
it would do a function..
then if someone else typed:
"Str level?"
it would do that same function. But I also want it so that if someone made a typo or something like that, it would automatically detect what they're trying to type. For example:
"Strength tlevel?"
would also make the function get called.
Is what I'm asking here something simple, or will it require me to write a big, giant, irritating function to check the strings?
If you've been baffled by my explanation (not really one of my strong points), then this is basically what I'm asking:
How can I check if two strings are similar to each other?
See this question and answer: Getting the closest string match
Using some heuristics and the Levenshtein distance algorithm, you can compute the similarity of two strings and take a guess at whether they're equal.
Your only option other than that would be a dictionary of accepted words similar to the one you're looking for.
You can use Levenshtein distance.
I believe you should use one of the edit distance algorithms to solve your problem. Here, for example, is a Levenshtein distance implementation in Java. You could use it to compare the words in the sentences and, if the sum of their edit distances is less than, say, 10% of the sentence length, consider them equal.
Perhaps what you need is a large dictionary of similar words and common spelling mistakes, which you would use to "translate" each word to a single entry or key.
This would be useful for custom words, so you could add "str" under the same key as "strength".
However, you could also add a few automated methods: when a word isn't found in the dictionary, search for entries that differ by one letter (either missing or replaced), and then recurse to deeper levels, i.e. two missing letters, etc.
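The dictionary idea can be sketched roughly like this: an alias map translates known variants to one key, and words that miss the map fall back to a scan for any entry within one edit. The class `WordNormalizer` and its method names are hypothetical, and the fallback stops at one missing, extra, or replaced letter rather than recursing deeper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Maps word variants and near-misses to a single key; names are illustrative. */
final class WordNormalizer {
    private final Map<String, String> aliases = new HashMap<>();

    /** Registers a key and any custom variants (e.g. "str" for "strength"). */
    void add(String key, String... variants) {
        aliases.put(key, key);
        for (String variant : variants) {
            aliases.put(variant, key);
        }
    }

    /** Exact lookup first, then any known word at most one edit away. */
    Optional<String> normalize(String word) {
        String w = word.toLowerCase();
        String exact = aliases.get(w);
        if (exact != null) {
            return Optional.of(exact);
        }
        for (Map.Entry<String, String> e : aliases.entrySet()) {
            if (withinOneEdit(w, e.getKey())) {
                return Optional.of(e.getValue());
            }
        }
        return Optional.empty();
    }

    /** True if a and b differ by at most one inserted, deleted, or replaced character. */
    static boolean withinOneEdit(String a, String b) {
        if (Math.abs(a.length() - b.length()) > 1) {
            return false;
        }
        String shorter = a.length() <= b.length() ? a : b;
        String longer = a.length() <= b.length() ? b : a;
        int i = 0, j = 0;
        boolean edited = false;
        while (i < shorter.length() && j < longer.length()) {
            if (shorter.charAt(i) == longer.charAt(j)) {
                i++;
                j++;
                continue;
            }
            if (edited) {
                return false; // second mismatch
            }
            edited = true;
            if (shorter.length() == longer.length()) {
                i++; // replacement: advance both strings
            }
            j++; // otherwise skip the extra character in the longer string
        }
        return true;
    }

    public static void main(String[] args) {
        WordNormalizer n = new WordNormalizer();
        n.add("strength", "str");
        n.add("level");
        System.out.println(n.normalize("tlevel"));
    }
}
```

Note that swapped adjacent letters count as two edits here, so such typos would need an explicit alias or a deeper search.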
I found a few projects that do text to phonemes translations, don't know which one is best
http://mary.dfki.de/
http://www2.eng.cam.ac.uk/~tpl/asp/source/Phoneme.java
http://java.dzone.com/announcements/announcing-phonemic-10
If you want to find similar word beginnings, you can use a stemmer. Stemmers reduce words to a common beginning. The best-known algorithm is the Porter Stemmer (http://tartarus.org/~martin/PorterStemmer).
Levenshtein, as pointed out above, is great, but computationally heavy for distances greater than one or two.

Program for three-point Gauss integration

I want to write a Java program to calculate an integral using three-point Gauss quadrature.
How can I calculate the result of an arbitrary function that is given as a string?
For example, I want to calculate F(x) = x^4 + cos(x) + e^2x
Evaluating a string is not an easy task by itself.
You would have to write your own interpreter with a lexer and a parser.
You could consider using third-party libraries for parsing and evaluating mathematical functions. I've never used any of them; a quick search turns up these:
JbcParser
JepParser
I'm sure there are a couple of others around...
Hope this helps
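The quadrature itself is independent of the parsing problem. Here is a minimal sketch of the three-point Gauss-Legendre rule that takes the function as a Java lambda instead of a string; the class name `GaussLegendre3` is made up, and the example reads "e^2x" as e^(2x), which is an assumption about the question. The rule uses nodes 0 and ±sqrt(3/5) with weights 8/9 and 5/9 on [-1, 1], mapped linearly onto [a, b].

```java
import java.util.function.DoubleUnaryOperator;

/** Three-point Gauss-Legendre quadrature on [a, b]; names are illustrative. */
final class GaussLegendre3 {
    // Nodes and weights of the three-point rule on the reference interval [-1, 1].
    private static final double[] NODES = {-Math.sqrt(3.0 / 5.0), 0.0, Math.sqrt(3.0 / 5.0)};
    private static final double[] WEIGHTS = {5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0};

    /** Approximates the integral of f over [a, b]; exact for polynomials up to degree 5. */
    static double integrate(DoubleUnaryOperator f, double a, double b) {
        double half = (b - a) / 2.0; // scale factor from [-1, 1] to [a, b]
        double mid = (a + b) / 2.0;  // midpoint of [a, b]
        double sum = 0.0;
        for (int i = 0; i < NODES.length; i++) {
            sum += WEIGHTS[i] * f.applyAsDouble(half * NODES[i] + mid);
        }
        return half * sum;
    }

    public static void main(String[] args) {
        // Assumes e^2x in the question means e^(2x).
        DoubleUnaryOperator f = x -> Math.pow(x, 4) + Math.cos(x) + Math.exp(2 * x);
        System.out.println(integrate(f, 0.0, 1.0));
    }
}
```

For wide intervals or rapidly changing functions, splitting [a, b] into subintervals and summing the results usually improves accuracy. A parsed expression from one of the libraries above could be plugged in by wrapping its evaluation call in the lambda.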

Text similarity algorithm

I have two subtitles files.
I need a function that tells whether they represent the same text or similar text.
Sometimes there are comments like "The wind is blowing... the music is playing" in one file only.
But 80% of the contents will be the same. The function must return TRUE (files represent the same text).
And sometimes there are misspellings like 1 instead of l (the digit one instead of a lowercase L), as here:
She 1eft the baggage.
Of course, it means function must return TRUE.
My comments:
The function should return percentage of the similarity of texts - AGREE
"all the people were happy" and "all the people were not happy" - here that'd be considered as a misspelling, so that'd be considered the same text. To be exact, the percentage the function returns will be lower, but high enough to say the phrases are similar
Do consider whether you want to apply Levenshtein on a whole file or just a search string - not sure about Levenshtein, but the algorithm must be applied to the file as a whole. It'll be a very long string, though.
Levenshtein algorithm: http://en.wikipedia.org/wiki/Levenshtein_distance
Anything other than a result of zero means the texts are not "identical". "Similar" is a measure of how far/near they are. The result is an integer.
For the problem you've described (i.e. comparing large strings), you can use cosine similarity, which returns a number between 0 (completely different) and 1 (identical), based on the term frequency vectors.
You might want to look at several implementations that are described here: Cosine Similarity
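A rough sketch of cosine similarity over term-frequency vectors, to make the idea concrete. The class name `CosineSimilarity`, the lowercase folding, and splitting on non-word characters are my assumptions; a real subtitle comparison would probably want better tokenization and some handling for misspellings like "1eft".

```java
import java.util.HashMap;
import java.util.Map;

/** Cosine similarity between the word-count vectors of two texts; names are illustrative. */
final class CosineSimilarity {

    /** Counts how often each lowercase word occurs in the text. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Returns dot(a, b) / (|a| * |b|), or 0 if either text has no words. */
    static double similarity(String a, String b) {
        Map<String, Integer> ta = termFrequencies(a);
        Map<String, Integer> tb = termFrequencies(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            Integer other = tb.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(ta);
        double normB = norm(tb);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    private static double norm(Map<String, Integer> tf) {
        double sumOfSquares = 0.0;
        for (int count : tf.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        System.out.println(similarity("all the people were happy", "all the people were not happy"));
    }
}
```

Because word order is ignored, this measures shared vocabulary rather than identical sequences, and a threshold for "same text" has to be chosen empirically.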
You're expecting too much here, it looks like you would have to write a function for your specific needs. I would recommend starting with an existing file comparison application (maybe diff already has everything you need) and improve it to provide good results for your input.
Have a look at approximate grep. It might give you pointers, though it's almost certain to perform abysmally on large chunks of text like you're talking about.
EDIT: The original version of agrep isn't open source, so you might get links to OSS versions from http://en.wikipedia.org/wiki/Agrep
There are many alternatives to the Levenshtein distance. For example the Jaro-Winkler distance.
The choice of algorithm depends on the language, the type of words, whether the words were entered by a human, and many other factors.
Here you find a helpful implementation of several algorithms within one library
If you are still looking for a solution, consider S-BERT (Sentence-BERT), a sentence-embedding model whose outputs are typically compared using cosine similarity.
