Calculating average in a TestLuck (Probability) Program - java

I'm a student trying to write a program that tests probability. It's called TestLuck, and it's supposed to generate a user-determined number of IntArrayLogs (ADTs) that are populated with random values. The program is supposed to calculate how many values were generated before there is a match to the first value.
Actual Problem:
"Create application TestLuck; have the user enter the upper limit of the random integer range (the book says 10,000, but you should also test with 365) as well as the number of times to run the test. Compute and output the average."
This is what I came up with, but I'm not getting the correct results for some reason. I tested the methods I use and they seem to work correctly, so I think it has something to do with how I'm keeping track of the counter.
for(int k=0; k<numTests; k++) {
for(int i=0; i<upperLimit; i++) {
arrLog.insert(n);
n = rand.nextInt(upperLimit);
if(arrLog.contains(arrLog.getElement(0))) {
totalCount += i;
break;
}
if(i == upperLimit-1)
totalCount +=i;
}
System.out.println("Total Count: " + totalCount);
arrLog.clear();
}
testAverage = totalCount/numTests;
System.out.println("Average tests before match: " + testAverage);
Contains Method:
// Returns true if element is in this IntLog,
// otherwise returns false.
public boolean contains(int element) {
int location = 0;
int counter = 0;
while (location <= lastIndex) {
if (element == log[location]) { // if they match
counter++;
location++;
if(counter == 2)
return true;
} else
location++;
}
return false;
}

You don't need a contains() method; it only spends extra time on something as simple as a single comparison.
The question is how many numbers have to be generated before one matches the first number, so you need to decide whether the count includes the first number itself. E.g. for {1,2,3,4,1} the count is 5 if the first number is included and 4 if it is not. Either way, this won't affect the logic of this answer:
If you rearrange your loop it will run much faster.
for(int k=0; k<numTests; k++){
for(int i=0; i<upperLimit; i++){
arrLog.insert(n);
if(arrLog.getElement(0) == n && i != 0){// i != 0 to prevent it from counting a match on the first iteration
totalCount += i;//totalCount += i+1 if you are counting the first number
break;
}
n = rand.nextInt(upperLimit);
}
System.out.println("Total Count: " + totalCount);
arrLog.clear();
}
testAverage = totalCount/numTests;
System.out.println("Average tests before match: " + testAverage);
If you are required to use a contains() method, let me know in the comments and I'll edit the answer.
I would also suggest not using any storage data structure at all, in this case the IntArrayLog ADT (again, I don't know whether you are required to use the ADT as part of your course), so that your program runs even faster:
int firstNum;
for(int k=0; k<numTests; k++){
firstNum = rand.nextInt(upperLimit);
for(int i=1; i<upperLimit; i++){//notice this starts in 1
n = rand.nextInt(upperLimit);
if(firstNum == n){
totalCount += i;//totalCount += i+1 if you are counting the first number
break;
}
}
System.out.println("Total Count: " + totalCount);
}
testAverage = totalCount/numTests;
System.out.println("Average tests before match: " + testAverage);
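For reference, the storage-free idea can be written as one self-contained method. This is only a sketch under assumptions that are not in the original: the names TestLuckSketch and averageDrawsUntilMatch are made up for illustration, the first number itself is not counted, the inner loop keeps drawing until a match instead of stopping at upperLimit, and the average is computed in floating point in case totalCount and numTests are both integers.

```java
import java.util.Random;

class TestLuckSketch {
    // Draws a first value, then counts how many further draws are needed
    // until one of them equals that first value. Returns the mean count.
    static double averageDrawsUntilMatch(int upperLimit, int numTests, Random rand) {
        long totalCount = 0;
        for (int k = 0; k < numTests; k++) {
            int firstNum = rand.nextInt(upperLimit);
            int draws = 1; // the draw made in the loop condition below
            while (rand.nextInt(upperLimit) != firstNum) {
                draws++;
            }
            totalCount += draws;
        }
        // Cast before dividing so the result is not truncated (assumption:
        // a fractional average is wanted).
        return (double) totalCount / numTests;
    }

    public static void main(String[] args) {
        Random rand = new Random();
        double average = averageDrawsUntilMatch(365, 10_000, rand);
        System.out.println("Average draws before match: " + average);
    }
}
```

With upperLimit = 365 the printed average should come out near 365, since each draw matches the first value with probability 1/365.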

I find some odd things in your code.
First, you are inserting n into arrLog before giving n a value.
Second, you test i == upperLimit-1 inside the for loop and add i to the counter. That branch is reached only when the last iteration ends without a match, because a match breaks out of the loop first, so a test that never finds a match is still counted as upperLimit-1.
Third, in the contains method you return true only if you find element twice. As I understand it, the first occurrence is at position 0 (the first element) and the second is the match you are testing for, yet you are passing the first element as the argument. You should probably start location at 1 (skipping the first element) and return on the first match:
for (location=1; location<=lastIndex; location++) {
if (element == log[location]) return true;
}
return false;
However, it should be easier just to compare n with arrLog.getElement(0).
P.S. I'm assuming everything else is properly initialized.
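The "skip position 0" check can be tried out in isolation. The sketch below is an assumption-laden stand-in: since the IntArrayLog class is not shown, it uses a plain int[] plus a lastIndex value, and the names FirstValueRepeatSketch and firstValueRepeats are hypothetical.

```java
class FirstValueRepeatSketch {
    // Returns true if the value stored at index 0 appears again somewhere
    // in log[1..lastIndex]. Index 0 itself is deliberately skipped.
    static boolean firstValueRepeats(int[] log, int lastIndex) {
        for (int location = 1; location <= lastIndex; location++) {
            if (log[location] == log[0]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] log = {5, 2, 5, 0};
        // Only the first three slots are in use here (lastIndex = 2).
        System.out.println(firstValueRepeats(log, 2));
    }
}
```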

Related

Significant Running time difference between two algorithms solving same task

In the process of learning algorithms, I have written code to compare the performance of two algorithms in terms of running time. The task of these algorithms is to find all the pairs of numbers in an array that add up to a specific number.
First approach - Brute force.
2 for loops to find the pairs of numbers that add up to the given number. Basically time complexity is O(n*n).
Second approach - Efficient
First sort the array, then have start and end as index to the beginning and end of array, and depending on the sum of these elements in the positions, move left or right to find pairs of numbers.
My question is -
I am printing the running time of each approach, but the brute-force approach appears to run faster than the efficient one. Why is this happening?
See the code here -
import java.util.Random;
public class MainRunner {
final private static int numberRange = 100000;
public static void generateRandomNumbers(int[] array, int[] dupArray) {
System.out.println("Generated Array: ");
Random random = new Random();
for (int i = 0; i < array.length; i++) {
int generatedRandomInt = random.nextInt(array.length) + 1;
array[i] = dupArray[i] = generatedRandomInt;
}
}
public static void main(String[] args) {
int[] array = new int[numberRange];
int[] dupArray = new int[numberRange];
generateRandomNumbers(array, dupArray);
Random random = new Random();
int sumToFind = random.nextInt(numberRange) + 1;
System.out.println("\n\nSum to find: " + sumToFind);
// Starting Sort and Find Pairs
final long startTimeSortAndFindPairs = System.currentTimeMillis();
new SortAndFindPairs().sortAndFindPairsOfNumbers(sumToFind, array);
final long durationSortAndFind = System.currentTimeMillis() - startTimeSortAndFindPairs;
// Starting Find Pairs
final long startTimeFindPairs = System.currentTimeMillis();
new FindPairs().findPairs(sumToFind, dupArray);
final long durationFindPairs = System.currentTimeMillis() - startTimeFindPairs;
System.out.println("Sort and Find Pairs: " + durationSortAndFind);
System.out.println("Find Pairs: " + durationFindPairs);
}
}
SortAndFindPairs.java
import java.util.Arrays;
public class SortAndFindPairs {
public void sortAndFindPairsOfNumbers(int argNumberToFind, int[] array) {
Arrays.sort(array);
System.out.println("\n\nResults of Sort and Find Pairs: \n");
int startIndex = 0;
int endIndex = array.length - 1;
while (startIndex < endIndex) {
int sum = array[startIndex] + array[endIndex];
if (argNumberToFind == sum) {
//System.out.println(array[startIndex] + ", " + array[endIndex]);
startIndex++;
endIndex--;
} else if (argNumberToFind > sum) {
startIndex++;
} else {
endIndex--;
}
}
}
}
And the FindPairs.java
public class FindPairs {
public void findPairs(int argNumberToFind, int[] array) {
System.out.println("\nResults of Find Pairs: \n");
int randomInt1 = 0;
int randomInt2 = 0;
for (int i = 0; i < array.length - 1; i++) {
for (int j = i + 1; j < array.length; j++) {
int sum = array[i] + array[j];
if (argNumberToFind == sum) {
//System.out.println(array[i] + ", " + array[j]);
//randomInt1++;
//randomInt2--;
}
}
}
}}
The expected running time difference only shows up when I add the two variable operations on randomInt1 and randomInt2 in FindPairs.java. Otherwise, the running time of FindPairs.java is much less than that of SortAndFindPairs.java. So why does adding just two variable operations increase the time by so much? Simple operations should take negligible time. Am I missing something here?
Results for numberRange = 1000000
Results of Find Pairs:
Sort and Find Pairs: 641
Find Pairs: 57
I think the problem is the compiler's optimizer (in Java, most likely the JIT compiler) playing tricks on you. I tried different permutations of your code and noticed that the double for loop in FindPairs does almost nothing, so the compiler may be stripping some of the code.
I got these numbers with an exact copy of your code:
Sort and Find Pairs: 43
Find Pairs: 13
Consistently (I ran it several times to double-check), Sort and Find was slower every time.
But then I changed the inner for loop to do nothing:
for (int j = i + 1; j < array.length; j++) {
//int sum = array[i] + array[j];
//if (argNumberToFind == sum) {
//System.out.println(array[i] + ", " + array[j]);
//randomInt1++;
//randomInt2--;
//}
And guess what? I got:
Sort and Find Pairs: 20
Find Pairs: 11
I tried several times and the numbers were pretty similar. With both loops removed, the runtime for Find Pairs went down to 1. So my guess is that the optimization step of the compiler assumes the code inside the inner loop has no effect and removes it. The code in Sort and Find is a little smarter, so it is kept.
Next I tried something different: I uncommented the increment of randomInt1, but left the sum and the if commented out,
for (int j = i + 1; j < array.length; j++) {
//int sum = array[i] + array[j];
//if (argNumberToFind == sum) {
//System.out.println(array[i] + ", " + array[j]);
randomInt1++;
//randomInt2--;
//}
and then I got:
Sort and Find Pairs: 42
Find Pairs: 5
Wow, suddenly it got faster! (Maybe the compiler replaced the for loop with an arithmetic calculation of randomInt1 based on the loop bounds?)
My last attempt: you may have noticed that this is not a fair comparison. Sort and Find has a lot of logic involved, while Find Pairs doesn't; it actually does nothing when it finds a pair. To compare apples to apples, we want to be sure Find Pairs actually does something, and that Sort and Find does the same extra amount of work (like adding the same number to both sides of an equation). So I changed both methods to count the matching pairs instead, like this:
System.out.println("\nResults of Find Pairs: \n");
long randomInt1 = 0;
int randomInt2 = 0;
int count = 0;
for (int i = 0; i < array.length - 1; i++) {
for (int j = i + 1; j < array.length; j++) {
int sum = array[i] + array[j];
if (argNumberToFind == sum) {
count++;
}
}
}
System.out.println("\nPairs found: " + count + "\n");
and
public void sortAndFindPairsOfNumbers(int argNumberToFind, int[] array) {
Arrays.sort(array);
System.out.println("\n\nResults of Sort and Find Pairs: \n");
int startIndex = 0;
int endIndex = array.length - 1;
int count = 0;
while (startIndex < endIndex) {
int sum = array[startIndex] + array[endIndex];
if (argNumberToFind == sum) {
//System.out.println(array[startIndex] + ", " + array[endIndex]);
startIndex++;
endIndex--;
count++;
} else if (argNumberToFind > sum) {
startIndex++;
} else {
endIndex--;
}
}
System.out.println("\nPairs found: " + count + "\n");
}
And then got:
Sort and Find Pairs: 38
Find Pairs: 4405
The time for Find Pairs blew up, while Sort and Find stayed in line with what we were seeing before.
So the most likely answer to your problem is that the compiler is optimizing something, and an almost empty for loop is exactly the kind of code it can optimize away, whereas the more complex logic in Sort and Find may make the optimizer hold back. Your algorithm lessons are fine; here Java is playing a trick on you.
One more thing you can try is to use different languages. I'm pretty sure you will find interesting stuff by doing so!
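To make a timing like this harder for the optimizer to distort, the measured method can return a result that is then used, and a few untimed warm-up runs can precede the timed one. The sketch below shows that idea only; the class name PairTimingSketch, the array size, and the number of warm-up runs are assumptions chosen for illustration, and a dedicated harness such as JMH would give more trustworthy numbers.

```java
import java.util.Random;

class PairTimingSketch {
    // Brute-force count of index pairs (i, j) with i < j whose values
    // add up to target. Returning the count keeps the loop's work "live".
    static long countPairsBruteForce(int[] values, int target) {
        long count = 0;
        for (int i = 0; i < values.length - 1; i++) {
            for (int j = i + 1; j < values.length; j++) {
                if (values[i] + values[j] == target) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Random random = new Random(1);
        int[] values = new int[20_000]; // assumed size, small enough to run quickly
        for (int i = 0; i < values.length; i++) {
            values[i] = random.nextInt(values.length) + 1;
        }
        int target = random.nextInt(values.length) + 1;

        // Untimed warm-up runs give the JIT a chance to compile the method first.
        for (int run = 0; run < 3; run++) {
            countPairsBruteForce(values, target);
        }

        long start = System.nanoTime();
        long pairs = countPairsBruteForce(values, target);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;

        // Printing the count means the result of the loop is actually used.
        System.out.println("Pairs found: " + pairs + " in " + elapsedMillis + " ms");
    }
}
```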
As stated by LIuxed, the sort operation takes some time. If you invest time in sorting, why not take advantage of the fact that the list items are sorted?
If the elements are sorted, you can use a binary search: start in the middle of the array and check whether to continue in the upper or the lower half. As a result, seeking a value is faster in a sorted array. Such an algorithm is already implemented in the Arrays.binarySearch methods.
See https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#binarySearch(int[],%20int)
You will notice the difference when you sort just once, but seek many times.
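As a rough illustration of sorting once and then searching, the sketch below checks whether any two elements add up to a target by looking up each element's complement with Arrays.binarySearch. The class and method names are hypothetical, it only answers yes or no rather than listing all pairs, and it ignores int overflow in target - value.

```java
import java.util.Arrays;

class SortedPairSearchSketch {
    // Returns true if two elements at different positions sum to target.
    static boolean hasPairWithSum(int[] values, int target) {
        int[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        for (int i = 0; i < sorted.length; i++) {
            int complement = target - sorted[i];
            int j = Arrays.binarySearch(sorted, complement);
            if (j < 0) {
                continue; // complement is not in the array
            }
            if (j != i) {
                return true; // found the complement at another position
            }
            // The search landed on the element itself, so a second copy of
            // the same value is needed; in a sorted array it must be adjacent.
            boolean duplicateBefore = i > 0 && sorted[i - 1] == complement;
            boolean duplicateAfter = i + 1 < sorted.length && sorted[i + 1] == complement;
            if (duplicateBefore || duplicateAfter) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasPairWithSum(new int[]{3, 1, 4}, 5));
    }
}
```

Each lookup costs O(log n), so the whole check is O(n log n) after the sort.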
Calling Arrays.sort(array) takes time of its own. For an int[] it is a dual-pivot quicksort, which typically runs in O(n log n), so it passes over the data many times and its cost grows with the array size.
That is why this method can take a noticeable amount of time, depending on the array size.
I removed everything from your sortAndFindPairsOfNumbers method, just kept
Arrays.sort(array);
and the time difference is still large.
This means most of the time is taken by the sort method.
So the assumption that the second approach is always the efficient one does not hold for your measurement; it all depends on the input size.
If you set numberRange to, let's say, 1000, then SortAndFindPairs will be faster.

Why does the if-block not run for the final case?

Consider the following code, which counts how many of each element an array has:
public static void getCounts(int[] list) {
int current = list[0];
int count = 0;
for (int i = 0; i < list.length; i++, count++) {
if (list[i] > current) {
System.out.println(current + " occurs " + count + timeOrTimes(count));
current = list[i];
count = 0;
}
}
System.out.println(current + " occurs " + count + timeOrTimes(count));
}
For this question, please assume list is sorted in ascending order. If list is [1, 1, 2, 3, 4, 4], for example, the output is:
1 occurs 2 times
2 occurs 1 time
3 occurs 1 time
4 occurs 2 times
Now, if I get rid of the println that comes after the for-loop, i.e.
public static void getCounts(int[] list) {
int current = list[0];
int count = 0;
for (int i = 0; i < list.length; i++, count++) {
if (list[i] > current) {
System.out.println(current + " occurs " + count + timeOrTimes(count));
current = list[i];
count = 0;
}
}
// System.out.println(current + " occurs " + count + timeOrTimes(count));
}
Then using the same example input, the output becomes:
1 occurs 2 times
2 occurs 1 time
3 occurs 1 time
In other words, the if block doesn't execute if list[i] is the maximum value of the array. Why is this the case? For the example above, the index of the first 4 is i = 4, and list[4] > 3, so the conditional statement is met, but it still won't execute.
How can I adjust the code so that the if block will run for all cases?
Thanks
The final println is necessary because you are triggering your print statement on a change in the value of list[i] and printing the result for the previous value. At the end of the program there's no last "change" to be detected, so you need to handle the last case separately.
This (the need for a final operation after the loop) is a standard coding pattern that occurs any time a variable change in a sequence triggers an operation at the end of a batch. One way of thinking about it is that there's a virtual value, one past the end of your array, that is always larger than any possible previous value and signals the end of data. There's no need to test for it or actually implement it, but you still have to code the operation (in your case a println).
The operation could be much more complex, in which case you'd encapsulate it in a method to avoid code duplication. Your code could benefit slightly from encapsulating the println in a method outputCount(int value, int count).
For example
private static void outputCount(int value, int count) {
System.out.println(value + " occurs " + count + timeOrTimes(count));
}
For your use case it's almost not worth it, but if the end-of-batch operation were much more than 1 line of code I would certainly write a method for it instead of repeating the code.
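The "flush the last batch after the loop" pattern can be sketched as follows. The names RunCountsSketch, describeCounts, and formatCount are made up for illustration, the lines are collected in a list instead of printed so they are easy to check, and timeOrTimes is a guess at the asker's helper, which is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;

class RunCountsSketch {
    // Assumed behaviour of the asker's helper: " time" for 1, " times" otherwise.
    static String timeOrTimes(int count) {
        return count == 1 ? " time" : " times";
    }

    static String formatCount(int value, int count) {
        return value + " occurs " + count + timeOrTimes(count);
    }

    // Expects the input sorted in ascending order, as in the question.
    static List<String> describeCounts(int[] sorted) {
        List<String> lines = new ArrayList<>();
        if (sorted.length == 0) {
            return lines;
        }
        int current = sorted[0];
        int count = 0;
        for (int value : sorted) {
            if (value != current) {
                // A change of value closes the previous batch.
                lines.add(formatCount(current, count));
                current = value;
                count = 0;
            }
            count++;
        }
        // No change follows the last batch, so it is flushed here.
        lines.add(formatCount(current, count));
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCounts(new int[]{1, 1, 2, 3, 4, 4})) {
            System.out.println(line);
        }
    }
}
```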

Find largest sequence within an arraylist

Some Background
Last week I did a problem in my textbook that told me to generate 20 random numbers and then put brackets around successive numbers that are equal.
Consider the following, which my program outputs:
697342(33)(666)(44)69(66)1(88)
What I need to do
The next problem was basically to find the longest sequence of these numbers and put brackets around it. If you have
1122345(6666)
you need to put brackets around the four 6's, since they form the longest run.
I've finished all other problems in the chapter I am studying ( Arrays and ArrayLists), however I can't seem to figure this one out.
Here is the solution that I have made for putting brackets around successive numbers:
class Seq
{
private ArrayList<Integer> nums;
private Random randNum;
public Seq()
{
nums = new ArrayList<Integer>();
randNum = new Random();
}
public void fillArrList()
{
for (int i = 0 ; i < 20 ; i++)
{
int thisRandNum = randNum.nextInt(9)+1;
nums.add(thisRandNum);
}
}
public String toString() {
StringBuilder result = new StringBuilder();
boolean inRun = false;
for (int i = 0; i < nums.size(); i++) {
if (i < nums.size() - 1 && nums.get(i).equals(nums.get(i + 1))) {
if (!inRun) {
result.append("(");
}
result.append(nums.get(i));
inRun = true;
} else {
result.append(nums.get(i));
if (inRun) {
result.append(")");
}
inRun = false;
}
}
return result.toString();
}
}
My Thoughts
Iterate through the whole list. Make a count variable that keeps track of how many successive numbers are equal. E.g. 22 would have a count of 2, and 444 a count of 3.
Next, make an oldCount and compare the current count to it. We only want to keep going if our new count is greater than oldCount.
After that we need a way to get the starting index of the largest count, as well as the end.
Is my way of thinking correct? I'm having trouble updating the oldCount and count variables while comparing them, since their values constantly change. I'm not looking for the code, but rather some valuable hints.
My count is resetting like this
int startIndex, endIndex = 0;
int count = 0;
int oldCount = 0;
for(int i = 0 ; i < nums.size(); i++)
{
if(nums.get(i) == nums.get(i+1) && count >= oldCount)
{
count++;
}
oldCount = count;
}
Only after walking through all the elements will you know the longest subsequence.
11222333333444555
11222(333333)444555
Hence you can insert both brackets only after the loop.
So you have to maintain the best result so far: its start index plus its length (or its last index).
In addition, you need the start index of the sequence you are currently in.
As asked:
The optimal state (sequence) and the current state are two different things. One cannot say in advance that any current state is the final optimal state.
public String toString() {
// Begin with as "best" solution the empty sequence.
int startBest = 0; // Starting index
int lengthBest = 0; // Length of sequence
// Determine sequences:
int startCurrent = 0; // Starting index of most current/last sequence
for (int i = 0; i < nums.size(); i++) {
// Can we add the current num to the current sequence?
if (i == startCurrent || nums.get(i).equals(nums.get(i - 1))) {
// We can extend the current sequence with this i:
int lengthCurrent = i - startCurrent + 1;
if (lengthCurrent > lengthBest) { // Current length better?
// New optimum:
startBest = startCurrent;
lengthBest = lengthCurrent;
}
} else {
// A different num, start here.
// As we had already a real sequence (i != 0), no need for
// checking for a new optimum with length 1.
startCurrent = i;
}
}
// Now we found the best solution.
// Create the result:
StringBuilder result = new StringBuilder();
for (int i = 0; i < nums.size(); i++) {
result.append(nums.get(i));
}
// Insert the right ')' first as its index changes by 1 after inserting '('.
result.insert(startBest + lengthBest, ")");
result.insert(startBest, "(");
return result.toString();
}
The first problem is how to find the end of a sequence, and set the correct start of the sequence.
The problem with the original algorithm is that it handles just one sequence (one subsequence start).
The way you have suggested could work. Then, if the new count is greater than oldCount, you'll want to store an additional number in another variable: the index where the longest sequence begins.
Later, you can insert the ( at the position of that index.
i.e. if you have 11223456666.
The biggest sequence starts with the first number 6. That is at index 7, so store that 7 in a variable.
I think you need to iterate over the entire list even when the current count is lower than oldCount. What about e.g. 111224444?
Keep 4 variables while iterating over the list: highestStartIndex, highestEndIndex, highestCount and currentCount. Iterate over the entire list and use currentCount to count equal neighbouring numbers. Update the highest* variables when a completed currentCount is higher than highestCount. Lastly, write the numbers out with parentheses using the *Index variables.
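A minimal sketch of that bookkeeping is given here, written as a standalone method rather than the asker's toString(). The names LongestRunSketch and bracketLongestRun are hypothetical, and two choices are assumptions: when runs tie, the first one wins, and a list without any repeats still gets brackets around its first element.

```java
import java.util.Arrays;
import java.util.List;

class LongestRunSketch {
    static String bracketLongestRun(List<Integer> nums) {
        int bestStart = 0;   // start index of the longest run seen so far
        int bestLength = 0;  // its length
        int runStart = 0;    // start index of the run we are currently in
        for (int i = 0; i < nums.size(); i++) {
            if (i > 0 && !nums.get(i).equals(nums.get(i - 1))) {
                runStart = i; // value changed, so a new run begins here
            }
            int runLength = i - runStart + 1;
            if (runLength > bestLength) {
                bestStart = runStart;
                bestLength = runLength;
            }
        }

        // Second pass: write the numbers and place the brackets.
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < nums.size(); i++) {
            if (i == bestStart) {
                result.append('(');
            }
            result.append(nums.get(i));
            if (i == bestStart + bestLength - 1) {
                result.append(')');
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 1, 2, 2, 3, 4, 5, 6, 6, 6, 6);
        System.out.println(bracketLongestRun(nums));
    }
}
```

Because the brackets are placed by element index while the string is built, this version does not depend on every number being a single digit.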

The loop is not adding items to lists as it is supposed to

I would like to cluster some letters based on a certain value called GAD. At each iteration I would like to add the letter that has the highest value for the current cluster, and this should continue until no letters are left.
The problem is that the code does the first iteration correctly (it adds the letter that has the highest value for cluster 0) and then stops, when it should go on to find the highest letter for the next cluster.
Note: the number of clusters is 4, and the variable 'clusters' is an array of objects where each object contains a list.
do {
if (count == 4) {
count = 0;
}
for (int j = 0; j < unassignedLetters.size(); j++) {
if (unassignedLetters.get(j).getGADVal(count) > max) {
max = unassignedLetters.get(j).getGADVal(count);
maxLetter = unassignedLetters.get(j);
System.out.println("maxLetter for cluster " + count + " is: " + maxLetter.getLetter());
} else if (unassignedLetters.get(j).getGADVal(count) == max) {
maxLetter = CLDMax(sheet, this.clusters[count], max, maxLetter, unassignedLetters.get(j));
}
}
this.clusters[count].addLetter(maxLetter);
unassignedLetters.remove(maxLetter);
System.out.println("Letter " + maxLetter.getLetter() + " has been added cluster " + count);
maxLetter = null;
count++;
} while (unassignedLetters.isEmpty());
Your while condition seems wrong :
do{
[...]
for (int j = 0; j < unassignedLetters.size(); j++) {
[...]
} while(unassignedLetters.isEmpty());
It should be :
while(!unassignedLetters.isEmpty());
For starters, your while condition is wrong.
You need while(!unassignedLetters.isEmpty());. Read: continue executing if there are still items in unassignedLetters.
Currently you have: continue executing if there are no items in unassignedLetters.
A do/while loop executes everything in the do{} block before it ever checks the while condition. So your code executes once and then exits the loop as soon as the while condition, unassignedLetters.isEmpty(), evaluates to false.
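The corrected loop condition can be seen in a stripped-down form. The sketch below is not the asker's algorithm: it hands out items in list order instead of picking the maximum GAD value, the names RoundRobinSketch and dealRoundRobin are hypothetical, and it uses a plain while loop so that an empty list is handled as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RoundRobinSketch {
    // Moves items into clusterCount lists in turn until none are left.
    static List<List<String>> dealRoundRobin(List<String> unassigned, int clusterCount) {
        List<List<String>> clusters = new ArrayList<>();
        for (int i = 0; i < clusterCount; i++) {
            clusters.add(new ArrayList<>());
        }
        List<String> remaining = new ArrayList<>(unassigned); // work on a copy
        int count = 0;
        // Keep going while there ARE still items, hence the negation.
        while (!remaining.isEmpty()) {
            clusters.get(count).add(remaining.remove(0));
            count = (count + 1) % clusterCount; // wrap back to cluster 0
        }
        return clusters;
    }

    public static void main(String[] args) {
        System.out.println(dealRoundRobin(Arrays.asList("a", "b", "c", "d", "e"), 4));
    }
}
```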

Finding the maximum number?

Ok, so I can't figure out why when I input an array of 1, 2, 100, 3, 9, 22, 58
the following code returns 100:
(this is just a snippet, this is part of a larger block of code)
double result = numbers[0];
for (int i = 0; i < numbers.length; i++)
if (numbers[i] > result)
result = numbers[i];
System.out.println("The max value is " + result);
But with the curly brackets on the if, it prints a list of numbers leading to the biggest one, starting from the first one, in this case: 1 2 100:
double result = numbers[0];
for (int i = 0; i < numbers.length; i++)
if (numbers[i] > result) {
result = numbers[i];
System.out.println("The max value is " + result);
}
Thanks for your help in advance, this is driving me crazy and it's probably very stupid.
In the second example you are printing within the if statement, so it prints a result every time the condition is true as it iterates through the list. In the first example, the print comes after the if and after the loop, because you don't use braces. When you don't use {} braces after a statement, only the very next statement is included in it.
You should learn good coding practices before you continue to code. It will help you avoid things like this later on when your code is much more complex. Additionally, stepping through the code will show you exactly what is happening, so you should also learn how to use the debugger.
In
if (numbers[i] > result)
result = numbers[i];
System.out.println("The max value is " + result);
the if block without braces only includes the immediately following statement.
The same goes for the for block.
Explanation:
The for does not have braces, so it only iterates over the immediately following statement, which is the if block.
The if block has no braces either, so it only controls the immediately following statement, which is
result = numbers[i];
So effectively your System.out.println("The max value is " + result); statement is outside both blocks in the first case and hence executes only once.
In the first case, result is being set to the current number (numbers[i]) if that number is greater than the previous value of result. This has the effect of updating result to the largest value found so far, as long as it's larger than the first value it's set to (numbers[0]). If you print this at the end of the loop, you're therefore printing the largest number found in the array (the max value). In the second case, you are always printing the largest number found so far in the array as you go through the numbers array, so you are printing the numbers in ascending order.
In the first example, the System.out.println is only executed once, at the end of the block. In the second example, it is executed every time a new highest-so-far number is encountered.
Adding braces to both examples should make the difference clear:
for (int i = 0; i < numbers.length; i++) {
if (numbers[i] > result) {
result = numbers[i];
}
}
System.out.println("The max value is " + result); // only ever called once
vs
for (int i = 0; i < numbers.length; i++) {
if (numbers[i] > result) {
result = numbers[i];
System.out.println("The max value is " + result); // called whenever numbers[i] > result
}
}
Whenever you find a number greater than result, you assign it to result and also print it out.
System.out.println("The max value is " + result);
should be right after the for loop (outside the curly braces), so that only the greatest number is printed once the loop is over.
double result = numbers[0];
for (int i = 0; i < numbers.length; i++)
{
if (numbers[i] > result) //this will check each array members to find the max
{
result = numbers[i];//assign the array member if it is the largest
}
}
System.out.println("The max value is " + result); //print the max value of the array
Read up on Java and its control flow statements.
If and for statements both have a body, which is the code that is controlled. It can be a single statement or a group of statements wrapped in braces.
Read this:
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/flow.html
I'd review how to step through code with a debugger as BobbyD17 suggested.
For eclipse refer to this link.
For netbeans.
