Class to count variables design issue

I'm new to OO programming and am having a bit of trouble designing my program around its concepts. I have done the tutorials but am still having problems.
I have a recursion that takes a list of item prices (stocks in this example, but it could be anything) and figures out how many of each are needed to reach a specific total value (100 in this code). This part works, but I also want to know if a stock's weighting exceeds a threshold. Originally I approached this with a method that ran a for loop and recalculated the total over the entire list of values, but this is very inefficient because it does so on every step of the recursion. I thought this would be a good time to learn classes, because I could use a class to maintain state and just increment the value on each step, and it would let me know when the threshold is hit.
I think I have the code, but I don't fully understand how to design this problem with classes. So far it runs the loop on each step of the recursion because I'm initializing the class there. Is there a better way to design this? My end goal is to be notified when a weighting is exceeded (which I can somewhat already do), but I want to do it in a way that uses the fewest resources (avoiding inefficient or unnecessary for loops).
Code(Here's the entire code I have been using to learn but the problem is with the Counter class and its location within the findVariables method):
import java.util.Arrays;
public class LearningClassCounting {
public static int[] stock_price = new int[]{ 20,5,20};
public static int target = 100;
public static void main(String[] args) {
// takes items from the first list
findVariables(stock_price, 100, new int[] {0,0,0}, 0, 0);
}
public static void findVariables(int[] constants, int sum,
int[] variables, int n, int result) {
Counter Checker = new Counter(stock_price, variables);
if (n == constants.length) {
if (result == sum) {
System.out.println(Arrays.toString(variables));
}
} else if (result <= sum){ //keep going
for (int i = 0; i <= 100; i++) {
variables[n] = i;
Checker.check_total_percent(n, i);
findVariables(constants, sum, variables, n+1, result+constants[n]*i);
}
}
}
}
class Counter {
private int[] stock_price;
private int[] variables;
private int value_so_far;
public Counter(int[] stock_price, int[] variables) {
this.stock_price = stock_price;
this.variables = variables;
for (int location = 0; location < variables.length; location++) {
//System.out.println(variables[location] + " * " + stock_price[location] + " = " + (variables[location] * stock_price[location]) );
value_so_far = value_so_far + (variables[location] * stock_price[location]);
}
//System.out.println("Total value so far is " + value_so_far);
//System.out.println("************");
}
public void check_total_percent(int current_location, int percent) {
// Check to see if weight exceeds threshold
//System.out.println("we are at " + current_location + " and " + percent + " and " + Arrays.toString(variables));
//System.out.println("value is " + stock_price[current_location] * percent);
//formula I think I need to use is:
if (percent == 0) {
return;
}
int current_value = (stock_price[current_location] * percent);
int overall_percent = current_value/(value_so_far + current_value);
if (overall_percent > 50 ) {
System.out.println("item " + current_location + " is over 50%" );
}
}
}

What you're describing sounds like a variant of the famous knapsack problem. There are many approaches to these problems, which are inherently difficult to calculate.
Inherently, one may need to check "all the combinations". The so-called optimization comes from backtracking when a certain selection subset is already too large (e.g., if 10 given stocks are over my sum, there is no need to explore other combinations). In addition, one can cache certain subsets (e.g., if I know that X, Y and Z amount to some value V, I can reuse that value). You'll see a lot of discussion of how to approach these sorts of problems and how to design solutions.
That being said, my view is that while algorithmic problems of this sort may be important for learning how to program and structure code and data structures, they're generally a very poor choice for learning object-oriented design and modelling.
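As a rough illustration of the backtracking idea above, here is a minimal sketch applied to the question's example. The names `PrunedSearch`, `find` and `exceedsWeight` are made up for this sketch, and it assumes every price is positive (otherwise the loop would not terminate). The running total is passed down the recursion, so nothing has to be recomputed over the whole array at each step, and a branch is abandoned as soon as it would overshoot the target.

```java
import java.util.ArrayList;
import java.util.List;

class PrunedSearch {
    // Returns every combination of quantities whose total value equals target.
    // Assumption: all prices are positive.
    static List<int[]> find(int[] prices, int target) {
        List<int[]> solutions = new ArrayList<>();
        search(prices, target, new int[prices.length], 0, 0, solutions);
        return solutions;
    }

    private static void search(int[] prices, int target, int[] qty,
                               int n, int total, List<int[]> solutions) {
        if (n == prices.length) {
            if (total == target) {
                solutions.add(qty.clone());
            }
            return;
        }
        // Prune: stop raising the quantity once the running total would exceed the target.
        for (int q = 0; total + prices[n] * q <= target; q++) {
            qty[n] = q;
            search(prices, target, qty, n + 1, total + prices[n] * q, solutions);
        }
        qty[n] = 0; // restore state for the caller
    }

    // True if item `index` makes up more than thresholdPercent of the total value.
    // Cross-multiplies instead of dividing, so integer division cannot truncate the result.
    static boolean exceedsWeight(int[] prices, int[] qty, int total,
                                 int index, int thresholdPercent) {
        return prices[index] * qty[index] * 100 > thresholdPercent * total;
    }
}
```

Because every solution has a total equal to the target, the weight check only needs one multiplication per item and can be run on each solution as it is found. Note that an integer expression such as `current_value/(value_so_far + current_value)` evaluates to 0 or 1, which is why the comparison here is done by cross-multiplying.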

Related

Using Parallel array Show highest sales value, and the month in which it occurred

I'm trying to display in the Message dialog on the JOptionPane the highest number of sales from my array of sales.
And I also want to show in which month they happened, but I am failing to find a way to display the month.
public static void main(String[] args) {
int[] CarSales= {1234,2343,1456,4567,8768,2346,9876,4987,7592,9658,7851,2538};
String [] Months = {"January","February","March","April","May","June"
,"July ","August","September","October","November","December" };
int HighNum = CarSales[0];
for(int i = 0; i < CarSales.length; i++)
{
if(CarSales[i] > HighNum)
{
HighNum = CarSales[i];
}
}
JOptionPane.showMessageDialog(null,"The highest car sales value is :"+HighNum +
"-which happened in the month of");
}

Use the Power of Objects
Avoid using parallel arrays in Java. They bring unnecessary complexity and make the code brittle and less maintainable.
Your code doesn't automatically become object-oriented just because you're using an object-oriented language.
Objects give you an easy way of structuring your data and organizing the code (if you need to implement some functionality related to a particular piece of data, you know where it should go: its place is in the class representing that data).
So, to begin with, I advise you to implement a class, let's call it CarSale:
public static class CarSale {
private Month month;
private int amount;
// getters, constructor, etc
}
Or, if you don't need it to be mutable, it can be implemented as a Java 16 record. In a nutshell, a record is a specialized form of class whose instances are meant to be transparent carriers of data (you cannot change their properties after instantiation).
One of the greatest things about records is their concise syntax. The line below is an equivalent of the fully fledged class with getters, constructor, equals/hashCode and toString (all these would be generated for you by the compiler):
public record CarSale(Month month, int amount) {}
java.time.Month
You've probably noticed that in the code above, the property month is not a String. It's the standard enum Month that resides in the java.time package.
When you have a property that can take only a limited set of values, an enum is the preferred choice because, unlike a plain String, an enum guards you against typos, and enums also have extensive language support. So you don't need the array filled with month names.
Here's what your code might look like:
CarSale[] carSales = {
new CarSale(Month.JANUARY, 1234),
new CarSale(Month.FEBRUARY, 2343),
new CarSale(Month.MARCH, 1456),
// ...
};
// you might want to check that carSales is not empty before accessing its first element
CarSale best = carSales[0];
for (CarSale sale: carSales) {
if (sale.getAmount() > best.getAmount()) best = sale;
}
JOptionPane.showMessageDialog(null,
"The highest car sales value is :" + best.getAmount() +
" which happened in the month of " + best.getMonth());
Note
Try to keep your code aligned with Java naming conventions. Only names of classes and interfaces should start with a capital letter.
If you've heard from someone that parallel arrays can improve memory consumption, then I would advise you to examine this question more deeply and take a look at questions like Why use parallel arrays in Java? With arrays this tiny, the only thing you get is the disadvantage of fragile code.
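To tie the pieces above together, here is a small self-contained sketch of the record-based variant (it needs Java 16 or later). The wrapper class `CarSalesSketch` and the helper names `best` and `describe` are made up for illustration. One detail worth noting: a record's generated accessors are named after its components, so they are `amount()` and `month()` rather than `getAmount()` and `getMonth()`.

```java
import java.time.Month;
import java.time.format.TextStyle;
import java.util.Locale;

class CarSalesSketch {
    // Immutable carrier for one month's sales figure.
    record CarSale(Month month, int amount) {}

    // Returns the sale with the highest amount; rejects an empty array.
    static CarSale best(CarSale[] sales) {
        if (sales.length == 0) {
            throw new IllegalArgumentException("no sales to compare");
        }
        CarSale best = sales[0];
        for (CarSale sale : sales) {
            if (sale.amount() > best.amount()) {
                best = sale;
            }
        }
        return best;
    }

    // Builds the message text; the enum supplies a readable month name.
    static String describe(CarSale sale) {
        return "The highest car sales value is: " + sale.amount()
                + ", which happened in the month of "
                + sale.month().getDisplayName(TextStyle.FULL, Locale.ENGLISH);
    }
}
```

The string returned by `describe` can be passed straight to `JOptionPane.showMessageDialog(null, ...)` as in the question.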

There are multiple solutions, but I'll give you the simplest one based on your structure.
Just declare one String variable and assign the value whenever you change the highest num.
public static void main(String[] args) {
int[] CarSales= {1234,2343,1456,4567,8768,2346,9876,4987,7592,9658,7851,2538};
String [] Months = {"January","February","March","April","May","June"
,"July ","August","September","October","November","December" };
int HighNum = CarSales[0];
String month = Months[0];
for(int i = 0; i < CarSales.length; i++)
{
if(CarSales[i] > HighNum)
{
HighNum = CarSales[i];
month = Months[i];
}
}
JOptionPane.showMessageDialog(null,"The highest car sales value is :"+HighNum +
"-which happened in the month of " + month);
}

Keep an index. Whenever you change the highest found, update the index.
public static void main(String[] args) {
int[] CarSales= {1234,2343,1456,4567,8768,2346,
9876,4987,7592,9658,7851,2538};
String [] Months = {"January","February","March","April","May","June"
,"July ","August","September","October","November","December" };
int HighNum = CarSales[0];
int highMonth = 0;
for(int i = 0; i < CarSales.length; i++)
{
if(CarSales[i] > HighNum)
{
HighNum = CarSales[i];
highMonth = i;
}
}
JOptionPane.showMessageDialog
(null,"The highest car sales value is :"+HighNum +
"-which happened in the month of " + Months[highMonth]);
}

FASTEST way to truncate a float in Java

I have a program that takes in anywhere from 20,000 to 500,000 velocity vectors and must output these vectors multiplied by some scalar. The program allows the user to set a variable accuracy, which is basically just how many decimal places to truncate to in the calculations. The program is quite slow at the moment, and I discovered that it's not because of multiplying a lot of numbers, it's because of the method I'm using to truncate floating point values.
I've already looked at several solutions on here for truncating decimals, like this one, and they mostly recommend DecimalFormat. This works great for formatting decimals once or twice to print nice user output, but is far too slow for hundreds of thousands of truncations that need to happen in a few seconds.
What is the most efficient way to truncate a floating-point value to n number of places, keeping execution time at utmost priority? I do not care whatsoever about resource usage, convention, or use of external libraries. Just whatever gets the job done the fastest.
EDIT: Sorry, I guess I should have been more clear. Here's a very simplified version of what I'm trying to illustrate:
import java.util.*;
import java.lang.*;
import java.text.DecimalFormat;
import java.math.RoundingMode;
public class MyClass {
static class Vector{
float x, y, z;
@Override
public String toString(){
return "[" + x + ", " + y + ", " + z + "]";
}
}
public static ArrayList<Vector> generateRandomVecs(){
ArrayList<Vector> vecs = new ArrayList<>();
Random rand = new Random();
for(int i = 0; i < 500000; i++){
Vector v = new Vector();
v.x = rand.nextFloat() * 10;
v.y = rand.nextFloat() * 10;
v.z = rand.nextFloat() * 10;
vecs.add(v);
}
return vecs;
}
public static void main(String args[]) {
int precision = 2;
float scalarToMultiplyBy = 4.0f;
ArrayList<Vector> velocities = generateRandomVecs();
System.out.println("First 10 raw vectors:");
for(int i = 0; i < 10; i++){
System.out.print(velocities.get(i) + " ");
}
/*
This is the code that I am concerned about
*/
DecimalFormat df = new DecimalFormat("##.##");
df.setRoundingMode(RoundingMode.DOWN);
long start = System.currentTimeMillis();
for(Vector v : velocities){
/* Highly inefficient way of truncating*/
v.x = Float.parseFloat(df.format(v.x * scalarToMultiplyBy));
v.y = Float.parseFloat(df.format(v.y * scalarToMultiplyBy));
v.z = Float.parseFloat(df.format(v.z * scalarToMultiplyBy));
}
long finish = System.currentTimeMillis();
long timeElapsed = finish - start;
System.out.println();
System.out.println("Runtime: " + timeElapsed + " ms");
System.out.println("First 10 multiplied and truncated vectors:");
for(int i = 0; i < 10; i++){
System.out.print(velocities.get(i) + " ");
}
}
}
The reason it is very important to do this is that a different part of the program will store trigonometric values in a lookup table. The lookup table will be generated to n places beforehand, so any velocity vector that has a float value to 7 places (e.g. 5.2387471) must be truncated to n places before lookup. Truncation is needed instead of rounding because in the context of this program, it is OK if a vector is slightly less than its true value, but not greater.
Lookup table for 2 decimal places:
...
8.03 -> -0.17511085919
8.04 -> -0.18494742685
8.05 -> -0.19476549993
8.06 -> -0.20456409661
8.07 -> -0.21434223706
...
Say I wanted to look up the cosines of each element in the vector {8.040844, 8.05813164, 8.065688} in the table above. Obviously, I can't look up these values directly, but I can look up {8.04, 8.05, 8.06} in the table.
What I need is a very fast method to go from {8.040844, 8.05813164, 8.065688} to {8.04, 8.05, 8.06}

The fastest way, which will introduce rounding error, is going to be to multiply by 10^n, call Math.rint, and to divide by 10^n.
That's...not really all that helpful, though, considering the introduced error, and -- more importantly -- that it doesn't actually buy anything. Why drop decimal points if it doesn't improve efficiency or anything? If it's about making the values shorter for display or the like, truncate then, but until then, your program will run as fast as possible if you just use full float precision.
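Since the question asks for truncation rather than rounding, the same scale-and-divide idea can be sketched with Math.floor in place of Math.rint. This is only an illustration: the class name `TruncateSketch` and the precomputed `POW10` table are assumptions of this sketch, and it supports 0 to 6 decimal places. Math.floor rounds toward negative infinity, which matches the stated requirement that a result may be slightly less than the true value but never greater.

```java
class TruncateSketch {
    // Powers of ten computed once, so the hot loop makes no Math.pow calls.
    private static final double[] POW10 =
            {1, 10, 100, 1_000, 10_000, 100_000, 1_000_000};

    // Truncates value down to the given number of decimal places (0..6).
    static float truncate(float value, int places) {
        double scale = POW10[places];
        // float * double is evaluated in double, which limits extra error from scaling.
        return (float) (Math.floor(value * scale) / scale);
    }
}
```

Keep in mind that the result is still a binary float, so it is only the nearest representable value to, say, 8.04. A float whose stored value sits just below a decimal boundary will also be floored one step lower than its printed form suggests. If the truncated number is only used as a lookup key, using the scaled integer `(int) Math.floor(value * scale)` as the table index may avoid both issues.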

How to set an initial moving average value in Java?

I want to actively calculate the moving average of stock data using the formula below:
public class Average {
private static double usdJpy;
private int counter = 1;
private double movingAverageUsdJpy_ = 100.5;
public void calculateAverage(){
ReadData myData = new ReadData();
usdGbp = myData.getUsdGbp();
usdJpy = myData.getUsdJpy();
System.out.println("Before: " + movingAverageUsdJpy_);
movingAverageUsdJpy_ = (counter * movingAverageUsdJpy_ + usdJpy) / (counter + 1);
counter++;
System.out.println("Moving Average: " + movingAverageUsdJpy_);
}
}
-> Counter is the number of elements in the array.
My question is since the stock data already has a moving average, I want to set my initial movingAverage value to that (e.g 97.883). However, every time I call my method, the latest value that my program calculated for the movingAverage will be overwritten by the initial value I have set earlier, hence giving me the wrong result. I can't really use final because the movingAverage needs to be updated each time I call the method so really stuck!
Is there a way to fix this problem??

Your formula is incorrect. If counter is the not-yet incremented value, then use
movingAverage = (counter * movingAverage + latestRate) / (counter + 1)
Then increment counter by 1. Note that if you want counter to be fixed in size (as is quite common when reporting financial data like this), then you need to keep that number of elements in memory.
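For the fixed-size case mentioned above, a minimal sketch could keep the last N values in a queue together with their running sum. The class name `WindowedAverage` and its `add` method are invented for this example and are not part of the question's code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class WindowedAverage {
    private final int size;                               // maximum number of values kept
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum;                                   // sum of the values currently in the window

    WindowedAverage(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.size = size;
    }

    // Adds the latest value, evicts the oldest one if the window is full,
    // and returns the average of the values currently held.
    double add(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > size) {
            sum -= window.removeFirst();
        }
        return sum / window.size();
    }
}
```

Each update costs constant time because the sum is adjusted incrementally instead of being recomputed over the whole window.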

You probably have something like:
class Something{
public int calculateAverage(){
int movingAverage = 98888;
//more code to count average
}
}
What you need to do is:
class Something{
private int myAverage = 98888;
public int calculateAverage(){
//code to calculate using myAverage variable;
}
}

Create a new private field for storing the previous average. I'm not very familiar with moving averages per se, but using the formula you've provided I've adjusted things a bit. Note that an underscore is sometimes used to indicate that a class-level variable is private.
private double movingAverage_;
private double prevAverage_ = 97.883;
public void calculateMovingAverage()
{
movingAverage_ = prevAverage_ + (latestRate - prevAverage_)/counter;
prevAverage_ = movingAverage_;
// finish other logic
}
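Putting the field-based idea into a complete class, a sketch might look like the following. The name `SeededAverage` and the constructor parameters are assumptions made for illustration. The seed is assigned once in the constructor, so later calls to `add` update the stored average rather than resetting it, provided the same instance is reused between calls.

```java
class SeededAverage {
    private double average; // current average, seeded once in the constructor
    private long count;     // number of values the average represents so far

    // initialCount says how many values the seed average already stands for.
    SeededAverage(double initialAverage, long initialCount) {
        this.average = initialAverage;
        this.count = initialCount;
    }

    // Folds the latest value into the average and returns the new average.
    double add(double latest) {
        count++;
        // Algebraically the same as (oldCount * average + latest) / (oldCount + 1).
        average += (latest - average) / count;
        return average;
    }
}
```

For example, `new SeededAverage(97.883, 1)` would start from the published average and treat it as a single observation; how many observations the seed should stand for is a modelling choice this sketch leaves to the caller.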

Can't remember how to do this for some reason

This one should be fairly simple I think, I just can't remember how, when using get methods of an object, how to pull the highest double out of the pack and put it in the println.
So far I just get every object to print with its percentages. But for the life of me I just can't remember and I know I've done this before.
public void displayBookWithBiggestPercentageMarkup(){
Collection<Book> books = getCollectionOfItems();
Iterator<Book> it = books.iterator();
while(it.hasNext()){
Book b = it.next();
double percent = b.getSuggestedRetailPriceDollars() / b.getManufacturingPriceDollars() * 100.0;
System.out.println("Highest markup is " + percent + " " + b.getTitle() + " " + b.getAuthor().getName().getLastName());
}
}
I'm pretty sure I need another local variable but I can't seem to do anything but make it equal the other percent. I have removed the other variable for now as I try to think about it.

I won't go into a lot of detail because it's homework (well done for being up-front about that, by the way) but here's the key idea: keep track of the largest percentage you've seen so far as your loop runs. That's what you want in your other variable.

Good job posting what you've tried so far. You were on the right track. As you loop through your books, keep one variable continuously updated with the highest percent seen so far and another variable for the associated book. Output the result at the end, outside the loop, after the iteration is done. Also, don't forget to check the edge case of an empty list of books! Something like this should do the trick:
public void displayBookWithBiggestPercentageMarkup(){
Collection<Book> books = getCollectionOfItems();
if (books.size() == 0) {
return;
}
Iterator<Book> it = books.iterator();
double highestPercent = 0;
Book highestPercentBook = null;
while(it.hasNext()){
Book b = it.next();
double percent = b.getSuggestedRetailPriceDollars() / b.getManufacturingPriceDollars() * 100.0;
if (percent > highestPercent) {
highestPercent = percent;
highestPercentBook = b;
}
}
System.out.println("Highest markup is " + highestPercent
+ " " + highestPercentBook.getTitle()
+ " " + highestPercentBook.getAuthor().getName().getLastName());
}

Java : linear algorithm but non-linear performance drop, where does it come from?

I am currently having heavy performance issues with an application I'm developing in natural language processing. Basically, given texts, it gathers various data and does a bit of number crunching.
And for every sentence, it does EXACTLY the same. The algorithms applied to gather the statistics do not evolve with previously read data and therefore stay the same.
The issue is that the processing time does not evolve linearly at all: 1 min for 10k sentences, 1 hour for 100k and days for 1M...
I tried everything I could, from re-implementing basic data structures to object pooling to recycling instances. The behavior doesn't change. I get a non-linear increase in time that seems impossible to justify by a few more hashmap collisions, by IO waiting, or by anything else! Java starts to be sluggish when the data increases and I feel totally helpless.
If you want an example, just try the following: count the number of occurrences of each word in a big file. Some code is shown below. Doing this takes me 3 seconds over 100k sentences and 326 seconds over 1.6M, so a factor of about 110 instead of 16. As the data grows further, it just gets worse...
Here is a code sample:
Note that I compare strings by reference (for efficiency reasons), this can be done thanks to the 'String.intern()' method which returns a unique reference per string. And the map is never re-hashed during the whole process for the numbers given above.
public class DataGathering
{
SimpleRefCounter<String> counts = new SimpleRefCounter<String>(1000000);
private void makeCounts(String path) throws IOException
{
BufferedReader file_src = new BufferedReader(new FileReader(path));
String line_src;
int n = 0;
while (file_src.ready())
{
n++;
if (n % 10000 == 0)
System.out.print(".");
if (n % 100000 == 0)
System.out.println("");
line_src = file_src.readLine();
String[] src_tokens = line_src.split("[ ,.;:?!'\"]");
for (int i = 0; i < src_tokens.length; i++)
{
String src = src_tokens[i].intern();
counts.bump(src);
}
}
file_src.close();
}
public static void main(String[] args) throws IOException
{
String path = "some_big_file.txt";
long timestamp = System.currentTimeMillis();
DataGathering dg = new DataGathering();
dg.makeCounts(path);
long time = (System.currentTimeMillis() - timestamp) / 1000;
System.out.println("\nElapsed time: " + time + "s.");
}
}
public class SimpleRefCounter<K>
{
static final double GROW_FACTOR = 2;
static final double LOAD_FACTOR = 0.5;
private int capacity;
private Object[] keys;
private int[] counts;
public SimpleRefCounter()
{
this(1000);
}
public SimpleRefCounter(int capacity)
{
this.capacity = capacity;
keys = new Object[capacity];
counts = new int[capacity];
}
public synchronized int increase(K key, int n)
{
int id = System.identityHashCode(key) % capacity;
while (keys[id] != null && keys[id] != key) // if it's occupied, let's move to the next one!
id = (id + 1) % capacity;
if (keys[id] == null)
{
key_count++;
keys[id] = key;
if (key_count > LOAD_FACTOR * capacity)
{
resize((int) (GROW_FACTOR * capacity));
}
}
counts[id] += n;
total += n;
return counts[id];
}
public synchronized void resize(int capacity)
{
System.out.println("Resizing counters: " + this);
this.capacity = capacity;
Object[] new_keys = new Object[capacity];
int[] new_counts = new int[capacity];
for (int i = 0; i < keys.length; i++)
{
Object key = keys[i];
int count = counts[i];
int id = System.identityHashCode(key) % capacity;
while (new_keys[id] != null && new_keys[id] != key) // if it's occupied, let's move to the next one!
id = (id + 1) % capacity;
new_keys[id] = key;
new_counts[id] = count;
}
this.keys = new_keys;
this.counts = new_counts;
}
public int bump(K key)
{
return increase(key, 1);
}
public int get(K key)
{
int id = System.identityHashCode(key) % capacity;
while (keys[id] != null && keys[id] != key) // if it's occupied, let's move to the next one!
id = (id + 1) % capacity;
if (keys[id] == null)
return 0;
else
return counts[id];
}
}
Any explanations? Ideas? Suggestions?
...and, as said in the beginning, it is not for this toy example in particular but for the more general case. This same exploding behavior occurs for no reason in the more complex and larger program.

Rather than feeling helpless use a profiler! That would tell you where exactly in your code all this time is spent.

Bursting the processor cache and thrashing the Translation Lookaside Buffer (TLB) may be the problem.
For String.intern you might want to do your own single-threaded implementation.
However, I'm placing my bets on the relatively bad hash values from System.identityHashCode. It clearly isn't using the top bit, as you don't appear to get ArrayIndexOutOfBoundsExceptions. I suggest replacing that with String.hashCode.

String[] src_tokens = line_src.split("[ ,.;:?!'\"]");
Just an idea -- you are creating a new Pattern object for every line here (look at the String.split() implementation). I wonder if this is also contributing to a ton of objects that need to be garbage collected?
I would create the Pattern once, probably as a static field:
final private static Pattern TOKEN_PATTERN = Pattern.compile("[ ,.;:?!'\"]");
And then change the split line to this:
String[] src_tokens = TOKEN_PATTERN.split(line_src);
Or if you don't want to create it as a static field, at least create it only once as a local variable at the beginning of the method, before the while.

In get, when you search for a nonexistent key, search time is proportional to the size of the set of keys.
My advice: if you want a HashMap, just use a HashMap. They got it right for you.
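A minimal sketch of the word count using a plain HashMap follows. The class name `WordCountSketch` and the `count` method are invented for this example. It reuses the token pattern from the question, compiled once, and it skips empty tokens, which is an assumption on my part since the original code counts them as well. Strings are compared with equals here, so no interning is needed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

class WordCountSketch {
    // Compiled once and reused for every line.
    private static final Pattern TOKEN_PATTERN = Pattern.compile("[ ,.;:?!'\"]");

    // Counts how often each non-empty token occurs across all lines.
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String token : TOKEN_PATTERN.split(line)) {
                if (!token.isEmpty()) {
                    // merge inserts 1 for a new key, otherwise adds 1 to the existing count
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

Whether this removes the slowdown in the real program is something only measurement can tell, but it gives a baseline to compare the custom counter against.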

You are filling up the Perm Gen with the string intern. Have you tried viewing the -Xloggc output?

I would guess it's just memory filling up, growing outside the processor cache, memory fragmentation and the garbage collection pauses kicking in. Have you checked memory use at all? Tried to change the heap size the JVM uses?

(1) Try to do it in Python, and run the Python module from Java.
(2) Enter all the keys in the database, and then execute the following query:
select key, count(*)
from keys
group by key
Have you tried to only iterate through the keys without doing any calculations? is it faster? if yes then go with option (2).

Can't you do this? You can get your answer in no time.

It's me, the original poster, something went wrong during registration, so I post separately. I'll try the various suggestions given.
PS for Tom Hawtin: thanks for the hints. Perhaps 'String.intern()' takes more and more time as the vocabulary grows; I'll check that tomorrow, along with everything else.
