Trapez Method in Java

I found a formula on the Internet for the trapezoid method. It works as it should, but I do not see why the following lines in the trapez method are needed:
sum = 0.5 * bef + (h * sum);
i= i+ 2
The first iteration is performed by the following statement in main:
tra[0] = 0.5 * ((b - a) / n) * (function(a) + function(b));
//calculates the first step value
The trapez method for the subsequent iterations:
/**
* calculate the next step with trapez method
* @param a -lower limit
* @param b -upper limit
* @param bef -previous step value
* @param n -number of dividing points
* @return integral area
*/
public static double trapz(double a, double b,double bef, int n)
{
double sum = 0;
double h = ((b - a)/n);
for (int i = 1; i <= n; i = i + 2) {
sum += function(a + (i) * h);
}
sum = 0.5 * bef + (h * sum);
return sum;
}

The function would be used in conjunction with a driver loop that doubles the number of subintervals at each iteration, refining the estimated integral until the difference from one iteration to the next is less than some threshold criterion. It is desirable in such an endeavor to avoid repeating computations that have already been performed, and that's the point of the lines you asked about.
Consider the function values that are needed when applying the trapezoid rule on a given number of subintervals. Now consider the function values needed for splitting each subinterval in half and applying the trapezoid rule to those subintervals. Half (give or take 1) of the function values needed in the latter case are the same ones needed in the former. The code presented simply reuses the previously computed values (0.5 * bef), adding to them only the new values (i = i + 2). It must scale down the previous estimate by a factor of two to account for splitting the subintervals in two.
Note that for the code to be right, it appears that argument n must represent the number of subintervals of the integration region, not the number of dividing points as its documentation claims.
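To make the reuse concrete, here is a minimal sketch of such a driver loop. The names TrapezoidRefinement and integrate, the DoubleUnaryOperator parameter, and the iteration cap are assumptions made for illustration and are not part of the original code. n is treated as the number of subintervals after doubling, as the note above suggests.

```java
import java.util.function.DoubleUnaryOperator;

class TrapezoidRefinement {
    // Assumed safety cap so the loop cannot run forever on a non-converging integrand.
    static final int MAX_SUBINTERVALS = 1 << 24;

    // One refinement step: n is the number of subintervals AFTER doubling,
    // bef is the estimate that was obtained with n / 2 subintervals.
    static double trapz(DoubleUnaryOperator f, double a, double b, double bef, int n) {
        double h = (b - a) / n;
        double sum = 0;
        // Odd indices are the new midpoints; even indices were already used in bef.
        for (int i = 1; i < n; i += 2) {
            sum += f.applyAsDouble(a + i * h);
        }
        // Halving bef accounts for the halved step width of the old points.
        return 0.5 * bef + h * sum;
    }

    // Doubles n until two successive estimates differ by less than eps.
    static double integrate(DoubleUnaryOperator f, double a, double b, double eps) {
        double prev = 0.5 * (b - a) * (f.applyAsDouble(a) + f.applyAsDouble(b));
        for (int n = 2; n <= MAX_SUBINTERVALS; n *= 2) {
            double next = trapz(f, a, b, prev, n);
            if (Math.abs(next - prev) < eps) {
                return next;
            }
            prev = next;
        }
        throw new ArithmeticException("no convergence within the subinterval cap");
    }

    public static void main(String[] args) {
        System.out.println(integrate(Math::sin, 0, Math.PI, 1e-8));
    }
}
```

Each call to trapz evaluates the integrand only at the n / 2 new points, so the total work is roughly the same as a single evaluation at the finest level.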

Related

I'm really stumped on this: Methods and Recursion

So I'm not really sure what to do. Here is what I've already done:
/*
* a collection of methods to be completed
* complete each TODO body-code for the given methods
* use recursion for the first method and last method
*/
public class Methods {
/*
* method to compute value of f(n) where:
* f(1) = 1
* f(n) = n + f(n-1) for n>1, n is even
* f(n) = n * f(n-1) for n>1, n is odd
*/
public static int f(int n) {
//TODO
return 0; //dummy line, replace this
}
/*
* method to compute the sum of the proper divisors of a given positive integer, n
* e.g., if n is 12, the sum of the proper divisors is:
* 1 + 2 + 3 + 4 + 6 = 16
* note: proper divisors are all divisors of an integer other than itself
*/
public static int sumOfDivisors(int n) {
//TODO
return 0; //dummy line, replace this
}
/*
* method that returns a String indicating whether a given positive integer, n, is:
* "abundant" - sum of proper divisors is greater than n
* "deficient" - sum of proper divisors is less than n
* "perfect" - sum of proper divisors is equal to n
*/
public static String numberType(int n) {
//TODO
return "foo"; //dummy line, replace this
}
/*
* method that returns the sum of the digits of the positive integer, n
* e.g., if n = 5403, the method will return:
* 5+4+0+3 = 12
* note: the right-most (1's) digit can be found using n%10
* the remaining digits (all but the 1's digit) can be found using n/10
*/
public static int sumOfDigits(int n) {
//TODO
return 0; //dummy line, replace this
}
//a dummy main method, not used
public static void main(String[] args) {
System.out.println("This program is not meant to be run on its own.");
System.out.println("This is just a dummy main method.");
}
} //end Methods
So yeah, any help would be great.
so I'm not really sure what to do what I've already done
You actually wrote only method declarations.
Your task is really well documented. Try implementing the utilities (like a public int[] divisors(int n) and so on), then come back here and ask for help with specific concerns, posting either code or errors.
PS: This looks like an academic assignment. Have you checked some books or your notes?
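For comparison once you have made your own attempt, here is one possible reading of the documented contracts. It is a sketch, not the expected solution: the class name MethodsSketch is made up, and the handling of inputs below 1 is an assumption because the comments only cover positive integers.

```java
class MethodsSketch {
    // f(1) = 1; f(n) = n + f(n-1) for even n > 1; f(n) = n * f(n-1) for odd n > 1.
    static int f(int n) {
        if (n <= 1) {
            return 1; // base case (inputs below 1 are treated like 1, an assumption)
        }
        return (n % 2 == 0) ? n + f(n - 1) : n * f(n - 1);
    }

    // Sum of the proper divisors of n; no proper divisor exceeds n / 2.
    static int sumOfDivisors(int n) {
        int sum = 0;
        for (int d = 1; d <= n / 2; d++) {
            if (n % d == 0) {
                sum += d;
            }
        }
        return sum;
    }

    // Classifies n by comparing it with the sum of its proper divisors.
    static String numberType(int n) {
        int s = sumOfDivisors(n);
        if (s > n) {
            return "abundant";
        }
        return (s < n) ? "deficient" : "perfect";
    }

    // Recursive digit sum: last digit (n % 10) plus the digit sum of the rest (n / 10).
    static int sumOfDigits(int n) {
        if (n < 10) {
            return n;
        }
        return n % 10 + sumOfDigits(n / 10);
    }

    public static void main(String[] args) {
        System.out.println(f(5) + " " + sumOfDivisors(12) + " "
                + numberType(6) + " " + sumOfDigits(5403));
    }
}
```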

Efficiently computing a pair(base, exponent) representation of an integer n

I'm having trouble designing a method, for an assignment, that expresses an integer n as an integer pair (base, exponent) such that n == base ^ exponent. Below is the contract for the method. @pre specifies what n has to be, @post defines what the output has to be.
/**
* Writes a number as a power with maximal exponent.
*
* @param n the number to 'powerize'
* @return power decomposition of {@code n} with maximal exponent
* @throws IllegalArgumentException if precondition violated
* @pre {@code 2 <= n}
* @post {@code n == power(\result) &&
* (\forall int b, int e;
* 2 <= b && 1 <= e && n == b ^ e;
* e <= \result.exponent)}
*/
public static Power powerize(int n){}
Where the class Power is just a Pair instance.
I already managed to solve this problem using a naive approach, in which I compute the value of the base and exponent by computing log(n) / log(b) with 2 <= b <= sqrt(n). But the assignment description says I have to produce an elegant and efficient method, and I couldn't find a way to compute this more efficiently.
After consulting some books I designed the following solution:
Input int n:
Let $p_1, \dots, p_m$ be m unique primes.
Then we can express n as:
$n = p_1^{e_1} \times \dots \times p_m^{e_m}$.
Then compute the gcd d of $e_1, \dots, e_m$ using the Euclidean algorithm.
Then we express n as:
$n = (p_1^{e_1/d} \times \dots \times p_m^{e_m/d})^d$.
Now we have:
$b = p_1^{e_1/d} \times \dots \times p_m^{e_m/d}$
e = d
return new Power(b, e)
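The steps above could be sketched as follows. This is only an illustration under stated assumptions: Power here is a hypothetical stand-in for the assignment's pair class, the factorization uses plain trial division (simple, but not necessarily the most efficient choice), and the class name PowerizeSketch is made up.

```java
class PowerizeSketch {
    // Hypothetical stand-in for the assignment's Power pair.
    static final class Power {
        final int base;
        final int exponent;

        Power(int base, int exponent) {
            this.base = base;
            this.exponent = exponent;
        }
    }

    // Euclidean algorithm; gcd(0, e) == e, so 0 works as a neutral start value.
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    static Power powerize(int n) {
        if (n < 2) {
            throw new IllegalArgumentException("n must be >= 2");
        }
        // Pass 1: d = gcd of all prime exponents of n.
        int d = 0;
        int m = n;
        for (int p = 2; (long) p * p <= m; p++) {
            int e = 0;
            while (m % p == 0) {
                m /= p;
                e++;
            }
            if (e > 0) {
                d = gcd(d, e);
            }
        }
        if (m > 1) {
            d = gcd(d, 1); // one leftover prime factor with exponent 1
        }
        // Pass 2: base = product of p^(e/d) over the prime factors.
        int base = 1;
        m = n;
        for (int p = 2; (long) p * p <= m; p++) {
            int e = 0;
            while (m % p == 0) {
                m /= p;
                e++;
            }
            for (int i = 0; i < e / d; i++) {
                base *= p;
            }
        }
        if (m > 1) {
            base *= m; // d is 1 in this case, so the leftover prime enters once
        }
        return new Power(base, d);
    }

    public static void main(String[] args) {
        Power p = powerize(216);
        System.out.println(p.base + "^" + p.exponent);
    }
}
```

Factoring twice keeps the sketch free of collections; storing the primes and exponents from the first pass would avoid the repeated division.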

How to get while loop to work?

I'm having trouble getting the while loop to work the way I want it. I've written a method called trapezium (that calculates the area of a trapezium). I need the screen to print the area of the trapezium, then the area of a trapezium with the value of N doubled and then the difference between these two, which it does.
I then need the while loop to keep doubling N, inputting this in the formula, and printing the new difference UNTIL this new difference is less than or equal to a user inputted value called eps. It then needs to print to the screen the area found and the value of N required to do this.
double traparea = trapezium(a, b, N);
System.out.println(traparea + " using the trapezium rule");
double traparea2 = trapezium(a, b, 2 * N);
double difftrap = (traparea2 - traparea);
System.out.println(traparea2);
System.out.println(difftrap);
while (Math.abs(difftrap) < eps) {
N = 2 * N;
traparea2 = trapezium(a, b, N);
difftrap = traparea2 - traparea;
}
System.out.println("The integration from trapezium rule and the value of N are:");
System.out.print(traparea2 + " " + N);
UNTIL this new difference is less than or equal to a user inputted value called eps
while (Math.abs(difftrap) > eps) // keep looping as long as the difference is still greater than eps
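Putting that together, a minimal sketch of the whole loop might look like this. Besides the loop condition, it also carries the previous estimate forward on each pass, which the posted loop does not do (traparea is never updated there). The trapezium implementation, the integrand parameter, the class name TrapeziumLoop and the cap on N are assumptions for illustration, since the original method body was not shown.

```java
import java.util.function.DoubleUnaryOperator;

class TrapeziumLoop {
    // Assumed cap so that N cannot overflow if the estimates never settle.
    static final int MAX_N = 1 << 24;

    // Composite trapezium rule with n equal-width strips (assumed implementation).
    static double trapezium(DoubleUnaryOperator f, double a, double b, int n) {
        double h = (b - a) / n;
        double sum = 0.5 * (f.applyAsDouble(a) + f.applyAsDouble(b));
        for (int i = 1; i < n; i++) {
            sum += f.applyAsDouble(a + i * h);
        }
        return h * sum;
    }

    // Returns {area, N} once two successive estimates differ by at most eps.
    static double[] refine(DoubleUnaryOperator f, double a, double b, int n, double eps) {
        double prev = trapezium(f, a, b, n);
        n *= 2;
        double curr = trapezium(f, a, b, n);
        while (Math.abs(curr - prev) > eps && n < MAX_N) {
            prev = curr;                  // the newest estimate becomes the old one
            n *= 2;                       // double N
            curr = trapezium(f, a, b, n); // recompute with the doubled N
        }
        return new double[] { curr, n };
    }

    public static void main(String[] args) {
        double[] result = refine(x -> x * x, 0, 3, 1, 1e-6);
        System.out.println("The integration from trapezium rule and the value of N are:");
        System.out.println(result[0] + " " + (int) result[1]);
    }
}
```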

BigInteger: count the number of decimal digits in a scalable method

I need the count the number of decimal digits of a BigInteger. For example:
99 returns 2
1234 returns 4
9999 returns 4
12345678901234567890 returns 20
I need to do this for a BigInteger with 184948 decimal digits and more. How can I do this fast and scalable?
The convert-to-String approach is slow:
public String getWritableNumber(BigInteger number) {
// Takes over 30 seconds for 184948 decimal digits
return "10^" + (number.toString().length() - 1);
}
This loop-divide-by-ten approach is even slower:
public String getWritableNumber(BigInteger number) {
int digitSize = 0;
while (!number.equals(BigInteger.ZERO)) {
number = number.divide(BigInteger.TEN);
digitSize++;
}
return "10^" + (digitSize - 1);
}
Are there any faster methods?
Here's a fast method based on Dariusz's answer:
public static int getDigitCount(BigInteger number) {
double factor = Math.log(2) / Math.log(10);
int digitCount = (int) (factor * number.bitLength() + 1);
if (BigInteger.TEN.pow(digitCount - 1).compareTo(number) > 0) {
return digitCount - 1;
}
return digitCount;
}
The following code tests the numbers 1, 9, 10, 99, 100, 999, 1000, etc. all the way to ten-thousand digits:
public static void test() {
for (int i = 0; i < 10000; i++) {
BigInteger n = BigInteger.TEN.pow(i);
if (getDigitCount(n.subtract(BigInteger.ONE)) != i || getDigitCount(n) != i + 1) {
System.out.println("Failure: " + i);
}
}
System.out.println("Done");
}
This can check a BigInteger with 184,948 decimal digits and more in well under a second.
This looks like it is working. I haven't run exhaustive tests yet, nor have I run any timing tests, but it seems to have a reasonable run time.
public class Test {
/**
* Optimised for huge numbers.
*
* http://en.wikipedia.org/wiki/Logarithm#Change_of_base
*
* States that log[b](x) = log[k](x)/log[k](b)
*
* We can get log[2](x) as the bitLength of the number so what we need is
* essentially bitLength/log[2](10). Sadly that will lead to inaccuracies so
* here I will attempt an iterative process that should achieve accuracy.
*
* log[2](10) = 3.32192809488736234787 so if I divide by 10^(bitLength/4) we
* should not go too far. In fact repeating that process while adding (bitLength/4)
* to the running count of the digits will end up with an accurate figure
* given some twiddling at the end.
*
* So here's the scheme:
*
* While there are more than 4 bits in the number
* Divide by 10^(bits/4)
* Increase digit count by (bits/4)
*
* Fiddle around to accommodate the remaining digit - if there is one.
*
* Essentially - each time around the loop we remove a number of decimal
* digits (by dividing by 10^n) keeping a count of how many we've removed.
*
* The number of digits we remove is estimated from the number of bits in the
* number (i.e. log[2](x) / 4). The perfect figure for the reduction would be
* log[2](x) / 3.3219... so dividing by 4 is a good under-estimate. We
* don't go too far but it does mean we have to repeat it just a few times.
*/
private int log10(BigInteger huge) {
int digits = 0;
int bits = huge.bitLength();
// Serious reductions.
while (bits > 4) {
// 4 > log[2](10) so we should not reduce it too far.
int reduce = bits / 4;
// Divide by 10^reduce
huge = huge.divide(BigInteger.TEN.pow(reduce));
// Removed that many decimal digits.
digits += reduce;
// Recalculate bitLength
bits = huge.bitLength();
}
// Now 4 bits or less - add 1 if necessary.
if ( huge.intValue() > 9 ) {
digits += 1;
}
return digits;
}
// Random tests.
Random rnd = new Random();
// Limit the bit length.
int maxBits = BigInteger.TEN.pow(200000).bitLength();
public void test() {
// 100 tests.
for (int i = 1; i <= 100; i++) {
BigInteger huge = new BigInteger((int)(Math.random() * maxBits), rnd);
// Note start time.
long start = System.currentTimeMillis();
// Do my method.
int myLength = log10(huge);
// Record my result.
System.out.println("Digits: " + myLength+ " Took: " + (System.currentTimeMillis() - start));
// Check the result.
int trueLength = huge.toString().length() - 1;
if (trueLength != myLength) {
System.out.println("WRONG!! " + (myLength - trueLength));
}
}
}
public static void main(String args[]) {
new Test().test();
}
}
Took about 3 seconds on my Celeron M laptop so it should hit sub 2 seconds on some decent kit.
I think that you could use bitLength() to get a log2 value, then change the base to 10.
The result may be wrong, however, by one digit, so this is just an approximation.
However, if that's acceptable, you could always add 1 to the result and bound it to be at most. Or, subtract 1, and get at least.
You can first convert the BigInteger to a BigDecimal and then use this answer to compute the number of digits. This seems more efficient than using BigInteger.toString() as that would allocate memory for String representation.
private static int numberOfDigits(BigInteger value) {
return significantDigits(new BigDecimal(value));
}
private static int significantDigits(BigDecimal value) {
return value.scale() < 0
? value.precision() - value.scale()
: value.precision();
}
This is another way to do it that is faster than the convert-to-String method. Not the best run time, but still a reasonable 0.65 seconds versus 2.46 seconds with the convert-to-String method (at 180000 digits).
This method computes the integer part of the base-10 logarithm from the given value. However, instead of using loop-divide, it uses a technique similar to Exponentiation by Squaring.
Here is a crude implementation that achieves the runtime mentioned earlier:
public static BigInteger log(BigInteger base,BigInteger num)
{
/* The technique tries to get the products among the squares of base
* close to the actual value as much as possible without exceeding it.
* */
BigInteger resultSet = BigInteger.ZERO;
BigInteger actMult = BigInteger.ONE;
BigInteger lastMult = BigInteger.ONE;
BigInteger actor = base;
BigInteger incrementor = BigInteger.ONE;
while(actMult.multiply(base).compareTo(num)<1)
{
int count = 0;
while(actMult.multiply(actor).compareTo(num)<1)
{
lastMult = actor; //Keep the old squares
actor = actor.multiply(actor); //Square the base repeatedly until the value exceeds
if(count>0) incrementor = incrementor.multiply(BigInteger.valueOf(2));
//Update the current exponent of the base
count++;
}
if(count == 0) break;
/* If there is no way to multiply the "actMult"
* with squares of the base (including the base itself)
* without keeping it below the actual value,
* it is the end of the computation
*/
actMult = actMult.multiply(lastMult);
resultSet = resultSet.add(incrementor);
/* Update the product and the exponent
* */
actor = base;
incrementor = BigInteger.ONE;
//Reset the values for another iteration
}
return resultSet;
}
public static int digits(BigInteger num)
{
if(num.equals(BigInteger.ZERO)) return 1;
if(num.compareTo(BigInteger.ZERO)<0) num = num.multiply(BigInteger.valueOf(-1));
return log(BigInteger.valueOf(10),num).intValue()+1;
}
Hope this helps.

Efficient implementation of mutual information in Java

I'm looking to calculate mutual information between two features, using Java.
I've read Calculating Mutual Information For Selecting a Training Set in Java already, but that was a discussion of whether mutual information was appropriate for the poster, with only some light pseudo-code as to the implementation.
My current code is below, but I'm hoping there is a way to optimise it, as I have large quantities of information to process. I'm aware that calling out to another language/framework may improve speed, but would like to focus on solving this in Java for now.
Any help much appreciated.
public static double calculateNewMutualInformation(double frequencyOfBoth, double frequencyOfLeft,
double frequencyOfRight, int noOfTransactions) {
if (frequencyOfBoth == 0 || frequencyOfLeft == 0 || frequencyOfRight == 0)
return 0;
// supp = f11
double supp = frequencyOfBoth / noOfTransactions; // P(x,y)
double suppLeft = frequencyOfLeft / noOfTransactions; // P(x)
double suppRight = frequencyOfRight / noOfTransactions; // P(y)
double f10 = (suppLeft - supp); // P(x) - P(x,y)
double f00 = (1 - suppRight) - f10; // (1 - P(y)) - (P(x) - P(x,y))
double f01 = (suppRight - supp); // P(y) - P(x,y)
// -1 * ((P(x) * log(Px)) + ((1 - P(x)) * log(1-p(x)))
double HX = -1 * ((suppLeft * MathUtils.logWithoutNaN(suppLeft)) + ((1 - suppLeft) * MathUtils.logWithoutNaN(1 - suppLeft)));
// -1 * ((P(y) * log(Py)) + ((1 - P(y)) * log(1-p(y)))
double HY = -1 * ((suppRight * MathUtils.logWithoutNaN(suppRight)) + ((1 - suppRight) * MathUtils.logWithoutNaN(1 - suppRight)));
double one = (supp * MathUtils.logWithoutNaN(supp)); // P(x,y) * log(P(x,y))
double two = (f10 * MathUtils.logWithoutNaN(f10));
double three = (f01 * MathUtils.logWithoutNaN(f01));
double four = (f00 * MathUtils.logWithoutNaN(f00));
double HXY = -1 * (one + two + three + four);
return (HX + HY - HXY) / (HX == 0 ? MathUtils.EPSILON : HX);
}
public class MathUtils {
public static final double EPSILON = 0.000001;
public static double logWithoutNaN(double value) {
if (value == 0) {
return Math.log(EPSILON);
} else if (value < 0) {
return 0;
}
return Math.log(value);
}
}
I have found the following to be fast, but I have not compared it against your method - only that provided in weka.
It works on the premise of re-arranging the MI equation so that it is possible to minimise the number of floating point operations:
We start by defining each probability as a count (frequency) over the number of samples/transactions. So, we define the number of items as n, the number of times x occurs as |x|, the number of times y occurs as |y| and the number of times they co-occur as |x,y|. We then get,
$$I(X;Y) = \sum_{x,y} \frac{|x,y|}{n} \log\left(\frac{|x,y|/n}{(|x|/n)\,(|y|/n)}\right).$$
Now, we can re-arrange that by flipping the bottom of the inner divide, which gives us (n|x,y|)/(|x||y|). Also, compute N = 1/n once so we have one less divide operation. This gives us:
$$I(X;Y) = \sum_{x,y} N\,|x,y| \log\left(\frac{n\,|x,y|}{|x|\,|y|}\right).$$
This gives us the following code:
/**
* Computes MI between variables t and a. Assumes that a.length == t.length.
* @param a candidate variable a
* @param avals number of values a can take (max(a) == avals - 1)
* @param t target variable
* @param tvals number of values t can take (max(t) == tvals - 1)
* @return the mutual information between a and t
*/
static double computeMI(int[] a, int avals, int[] t, int tvals) {
double numinst = a.length;
double oneovernuminst = 1/numinst;
double sum = 0;
// longs are required here because of big multiples in calculation
long[][] crosscounts = new long[avals][tvals];
long[] tcounts = new long[tvals];
long[] acounts = new long[avals];
// Compute counts for the two variables
for (int i=0;i<a.length;i++) {
int av = a[i];
int tv = t[i];
acounts[av]++;
tcounts[tv]++;
crosscounts[av][tv]++;
}
for (int tv=0;tv<tvals;tv++) {
for (int av=0;av<avals;av++) {
if (crosscounts[av][tv] != 0) {
// Main fraction: (n|x,y|)/(|x||y|)
double sumtmp = (numinst*crosscounts[av][tv])/(acounts[av]*tcounts[tv]);
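// Log bit (|x,y|/n) and update product
// NOTE: log2 is not defined in this snippet; it is presumably a constant
// such as 1 / Math.log(2) that converts the natural log to base 2.
sum += oneovernuminst*crosscounts[av][tv]*Math.log(sumtmp)*log2;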
}
}
}
return sum;
}
This code assumes that the values of a and t are not sparse (i.e. min(t)=0 and tvals=max(t)+1) for it to be efficient. Otherwise (as commented) large and unnecessary arrays are created.
I believe this approach improves further when computing MI between several variables at once (the count operations can be condensed - especially that of the target). The implementation I use is one that interfaces with WEKA.
Finally, it might be even more efficient to take the log out of the summations. But I am unsure whether log or power will take more computation within the loop. This is done by:
Apply a*log(b) = log(b^a)
Move the log to outside the summations, using log(a)+log(b) = log(ab)
and gives:
$$I(X;Y) = N \log \prod_{x,y} \left(\frac{n\,|x,y|}{|x|\,|y|}\right)^{|x,y|}.$$
I am not a mathematician, but...
There are just a bunch of floating point calculations here. Some mathemagician might be able to reduce this to fewer calculation, try the Math SE.
Meanwhile, you should be able to use a static final double for Math.log(EPSILON)
Your problem might not be a single call but the volume of data for which this calculation has to be done. That problem is better solved by throwing more hardware at it.
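The suggestion to precompute Math.log(EPSILON) could look like the sketch below. The class name MathUtilsCached and the constant name LOG_EPSILON are made up for illustration; the behaviour is meant to match the logWithoutNaN from the question, and whether this gives a measurable speed-up would need profiling.

```java
class MathUtilsCached {
    static final double EPSILON = 0.000001;

    // Computed once at class initialisation instead of on every zero input.
    private static final double LOG_EPSILON = Math.log(EPSILON);

    static double logWithoutNaN(double value) {
        if (value == 0) {
            return LOG_EPSILON; // stand-in for log(0), which would be -Infinity
        } else if (value < 0) {
            return 0;           // log of a negative number would be NaN
        }
        return Math.log(value);
    }

    public static void main(String[] args) {
        System.out.println(logWithoutNaN(0) + " " + logWithoutNaN(-1) + " " + logWithoutNaN(1));
    }
}
```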
