Most efficient way to create an array of counting numbers - java

What's the most efficient way to make an array of a given length, with each element containing its subscript?
Possible description with my dummy-level code:
/**
* The IndGen function returns an integer array with the specified dimensions.
*
* Each element of the returned integer array is set to the value of its
* one-dimensional subscript.
*
* Modeled on IDL's INDGEN function:
* @see <a href="http://idlastro.gsfc.nasa.gov/idl_html_help/INDGEN.html">INDGEN</a>
*
* @param size length of the array to create
* @return int[size], each element set to value of its subscript
* @author you
*
* */
public int[] IndGen(int size) {
int[] result = new int[size];
for (int i = 0; i < size; i++) result[i] = i;
return result;
}
Other tips, such as doc style, welcome.
Edit
I've read elsewhere how inefficient a for loop is compared to other methods, as for example in Copying an Array:
Using clone: 93 ms
Using System.arraycopy: 110 ms
Using Arrays.copyOf: 187 ms
Using for loop: 422 ms
I've been impressed by the imaginative responses to some questions on this site, e.g., Display numbers from 1 to 100 without loops or conditions. Here's an answer that might suggest some methods:
public class To100 {
public static void main(String[] args) {
String set = new java.util.BitSet() {{ set(1, 100+1); }}.toString();
System.out.append(set, 1, set.length()-1);
}
}
If you're not up to tackling this challenging problem, no need to vent: just move on to the next unanswered question, one you can handle.

Since it's infeasible to use terabytes of memory at once, and especially to do any calculation with them simultaneously, you might consider using a generator. (You were probably planning to loop over the array, right?) With a generator, you don't need to initialize an array (so you can start using it immediately) and almost no memory is used (O(1)).
I've included an example implementation below. It is bounded by the limitations of the long primitive.
import java.util.Iterator;
import java.util.NoSuchElementException;
public class Counter implements Iterator<Long> {
private long count;
private final long max;
public Counter(long start, long endInclusive) {
this.count = start;
this.max = endInclusive;
}
@Override
public boolean hasNext() {
return count <= max;
}
@Override
public Long next() {
if (this.hasNext())
return count++;
else
throw new NoSuchElementException();
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
}
Find a usage demonstration below.
Iterator<Long> i = new Counter(0, 50);
while (i.hasNext()) {
System.out.println(i.next()); // Prints 0 to 50
}
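On Java 8 or later, the same lazy counting is also available from the standard library: `LongStream.rangeClosed` produces the values on demand without a backing array. A minimal sketch; the class name `RangeDemo` and the helper `sumRange` are made up for illustration.

```java
import java.util.stream.LongStream;

final class RangeDemo {
    // Sums start..endInclusive lazily; no array of the values is allocated.
    static long sumRange(long start, long endInclusive) {
        return LongStream.rangeClosed(start, endInclusive).sum();
    }

    public static void main(String[] args) {
        // Prints 0 to 50, one per line, like the Counter loop above.
        LongStream.rangeClosed(0, 50).forEach(System.out::println);
    }
}
```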

The only thing I can think of is using "++i" instead of "i++", but I think the Java compiler already makes this optimization.
Other than that, this is pretty much the best algorithm there is.
You could make a class that acts as if it has an array yet doesn't, and simply returns the same number that it gets (i.e. the identity function), but that's not what you've asked for.
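The "class that acts as if it has an array" idea could look roughly like the sketch below, which builds a read-only `List<Integer>` on top of `java.util.AbstractList`. The name `IndexView` is hypothetical, and the view returns boxed `Integer` values, so it is not a drop-in replacement for an `int[]`.

```java
import java.util.AbstractList;
import java.util.List;

final class IndexView {
    // Returns a read-only List whose element at position i is i itself.
    // Nothing is stored, so memory use does not grow with size.
    static List<Integer> of(final int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be >= 0");
        }
        return new AbstractList<Integer>() {
            @Override
            public Integer get(int index) {
                if (index < 0 || index >= size) {
                    throw new IndexOutOfBoundsException("index: " + index);
                }
                return index; // identity: the value is the subscript
            }

            @Override
            public int size() {
                return size;
            }
        };
    }
}
```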

As others have said in their answers, your code is already close to the most efficient that I can think of, at least for small arrays. If you need to create those arrays many times and they are very big, instead of continuously iterating in a for loop you could create all the arrays once, and then copy them. The copy operation will be faster than iterating over the array if the array is very big. It would be something like this (in this example for a maximum of 1000 elements):
public static int[][] cache = {{0},{0,1},{0,1,2},{0,1,2,3},{0,1,2,3,4}, ..., {0,1,2,...,998,999}};
Then, from the code where you need to create those arrays a lot of times, you would use something like this:
int[] arrayOf50Elements = Arrays.copyOf(cache[49], 50);
Note that this way you are using a lot of memory to improve the speed. I want to emphasize that this will only be worth the complication when you need to create those arrays a lot of times, the arrays are very big, and maximum speed is one of your requirements. In most of the situations I can think of, the solution you proposed will be the best one.
Edit: I've just seen the huge amount of data and memory you need. The approach I propose would require memory of the order of n^2, where n is the maximum integer you expect to have. In this case that's impractical, due to the monstrous amount of memory you would need. Forget about this. I leave the post because maybe it is useful for others.
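The quadratic memory cost comes from storing every length separately. Since each shorter array is a prefix of the longest one, a single master array would be enough. The sketch below is a variation on the idea, not the answerer's code; `IndGenCache` and the bound `MAX` are assumptions, and whether the copy actually beats the plain loop would have to be measured.

```java
import java.util.Arrays;

final class IndGenCache {
    private static final int MAX = 1000; // assumed upper bound on requested sizes
    private static final int[] MASTER = new int[MAX];

    static {
        for (int i = 0; i < MAX; i++) {
            MASTER[i] = i;
        }
    }

    // Returns a fresh array {0, 1, ..., size-1} copied from the shared prefix.
    static int[] indGen(int size) {
        if (size < 0 || size > MAX) {
            throw new IllegalArgumentException("size out of range: " + size);
        }
        return Arrays.copyOf(MASTER, size);
    }
}
```

Memory use is O(n) for the master array instead of O(n^2) for the table of all prefixes.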

Related

HashMap performs better than array? [duplicate]

Is it (performance-wise) better to use Arrays or HashMaps when the indexes of the Array are known? Keep in mind that the 'objects array/map' in the example is just an example, in my real project it is generated by another class so I cant use individual variables.
ArrayExample:
SomeObject[] objects = new SomeObject[2];
objects[0] = new SomeObject("Obj1");
objects[1] = new SomeObject("Obj2");
void doSomethingToObject(String Identifier){
SomeObject object;
if(Identifier.equals("Obj1")){
object=objects[0];
}else if(Identifier.equals("Obj2")){
object=objects[1];
}
//do stuff
}
HashMapExample:
HashMap objects = new HashMap();
objects.put("Obj1",new SomeObject());
objects.put("Obj2",new SomeObject());
void doSomethingToObject(String Identifier){
SomeObject object = (SomeObject) objects.get(Identifier);
//do stuff
}
The HashMap one looks much much better but I really need performance on this so that has priority.
EDIT: Well Array's it is then, suggestions are still welcome
EDIT: I forgot to mention, the size of the Array/HashMap is always the same (6)
EDIT: It appears that HashMaps are faster
Array: 128ms
Hash: 103ms
When using fewer cycles, the HashMap was even twice as fast
test code:
import java.util.HashMap;
import java.util.Random;
public class Optimizationsest {
private static Random r = new Random();
private static HashMap<String,SomeObject> hm = new HashMap<String,SomeObject>();
private static SomeObject[] o = new SomeObject[6];
private static String[] Indentifiers = {"Obj1","Obj2","Obj3","Obj4","Obj5","Obj6"};
private static int t = 1000000;
public static void main(String[] args){
CreateHash();
CreateArray();
long loopTime = ProcessArray();
long hashTime = ProcessHash();
System.out.println("Array: " + loopTime + "ms");
System.out.println("Hash: " + hashTime + "ms");
}
public static void CreateHash(){
for(int i=0; i <= 5; i++){
hm.put("Obj"+(i+1), new SomeObject());
}
}
public static void CreateArray(){
for(int i=0; i <= 5; i++){
o[i]=new SomeObject();
}
}
public static long ProcessArray(){
StopWatch sw = new StopWatch();
sw.start();
for(int i = 1;i<=t;i++){
checkArray(Indentifiers[r.nextInt(6)]);
}
sw.stop();
return sw.getElapsedTime();
}
private static void checkArray(String Identifier) {
SomeObject object;
if(Identifier.equals("Obj1")){
object=o[0];
}else if(Identifier.equals("Obj2")){
object=o[1];
}else if(Identifier.equals("Obj3")){
object=o[2];
}else if(Identifier.equals("Obj4")){
object=o[3];
}else if(Identifier.equals("Obj5")){
object=o[4];
}else if(Identifier.equals("Obj6")){
object=o[5];
}else{
object = new SomeObject();
}
object.kill();
}
public static long ProcessHash(){
StopWatch sw = new StopWatch();
sw.start();
for(int i = 1;i<=t;i++){
checkHash(Indentifiers[r.nextInt(6)]);
}
sw.stop();
return sw.getElapsedTime();
}
private static void checkHash(String Identifier) {
SomeObject object = (SomeObject) hm.get(Identifier);
object.kill();
}
}
HashMap uses an array underneath so it can never be faster than using an array correctly.
Random.nextInt() is many times slower than what you are testing, even using array to test an array is going to bias your results.
The reason your array benchmark is so slow is due to the equals comparisons, not the array access itself.
HashTable is usually much slower than HashMap because it does much the same thing but is also synchronized.
A common problem with micro-benchmarks is the JIT, which is very good at removing code which doesn't do anything. If you are not careful, you will only be testing whether you have confused the JIT enough that it cannot work out that your code doesn't do anything.
This is one of the reasons you can write micro-benchmarks which outperform C++ systems: Java is a simpler language and easier to reason about, so it is easier to detect code which does nothing useful. This can lead to tests which show that Java does "nothing useful" much faster than C++ ;)
Arrays, when the indexes are known, are faster (HashMap uses an array of linked lists behind the scenes, which adds a bit of overhead on top of the array accesses, not to mention the hashing operations that need to be done).
And FYI, HashMap<String,SomeObject> objects = new HashMap<String,SomeObject>(); makes it so you won't have to cast.
For the example shown, HashTable wins, I believe. The problem with the array approach is that it doesn't scale. I imagine you want to have more than two entries in the table, and the condition branch tree in doSomethingToObject will quickly get unwieldy and slow.
Logically, HashMap is definitely a fit in your case. From a performance standpoint it also wins, since with arrays you need a number of string comparisons (in your algorithm), while with a HashMap you just use a hash code, provided the load factor is not too high. Both an array and a HashMap need to be resized if you add many elements, but a HashMap also has to redistribute its elements. In that use case HashMap loses.
Arrays will usually be faster than Collections classes.
PS. You mentioned HashTable in your post. HashTable has even worse performance than HashMap. I assume your mention of HashTable was a typo:
"The HashTable one looks much much better"
The example is strange. The key problem is whether your data is dynamic. If it is, you could not write your program that way (as in the array case). In other words, comparing your array and hash implementations is not fair. The hash implementation works for dynamic data, but the array implementation does not.
If you only have static data (6 fixed objects), array or hash just work as data holder. You could even define static objects.
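When the six identifiers are fixed at compile time, one option is to replace the String keys with an enum and index a plain array by `ordinal()`. The sketch below assumes the callers can pass an enum constant instead of a String; the `Id` enum and the nested `SomeObject` stand-in are hypothetical.

```java
final class EnumLookup {
    // Hypothetical identifiers standing in for "Obj1".."Obj6".
    enum Id { OBJ1, OBJ2, OBJ3, OBJ4, OBJ5, OBJ6 }

    // Stand-in for the question's SomeObject.
    static final class SomeObject {
        final String name;

        SomeObject(String name) {
            this.name = name;
        }
    }

    private static final SomeObject[] OBJECTS = new SomeObject[Id.values().length];

    static {
        for (Id id : Id.values()) {
            OBJECTS[id.ordinal()] = new SomeObject(id.name());
        }
    }

    // Plain array access: no string comparison and no hashing.
    static SomeObject get(Id id) {
        return OBJECTS[id.ordinal()];
    }
}
```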

How to properly use the remove() method in java for an array [duplicate]

Is there any fast (and nice looking) way to remove an element from an array in Java?
You could use commons lang's ArrayUtils.
array = ArrayUtils.removeElement(array, element)
commons.apache.org library:Javadocs
Your question isn't very clear. From your own answer, I can tell better what you are trying to do:
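public static String[] removeElements(String[] input, String deleteMe) {
List<String> result = new LinkedList<String>();
for(String item : input)
if(!deleteMe.equals(item))
result.add(item);
// Pass a zero-length array so the result is sized to the kept elements;
// passing input would reuse the original, longer array.
return result.toArray(new String[0]);
}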
NB: This is untested. Error checking is left as an exercise to the reader (I'd throw IllegalArgumentException if either input or deleteMe is null; an empty list on null list input doesn't make sense. Removing null Strings from the array might make sense, but I'll leave that as an exercise too; currently, it will throw an NPE when it tries to call equals on deleteMe if deleteMe is null.)
Choices I made here:
I used a LinkedList. Iteration should be just as fast, and you avoid any resizes, or allocating too big of a list if you end up deleting lots of elements. You could use an ArrayList, and set the initial size to the length of input. It likely wouldn't make much of a difference.
The best choice would be to use a collection, but if that is out for some reason, use arraycopy. You can use it to copy from and to the same array at a slightly different offset.
For example:
public void removeElement(Object[] arr, int removedIdx) {
System.arraycopy(arr, removedIdx + 1, arr, removedIdx, arr.length - 1 - removedIdx);
}
Edit in response to comment:
It's not another good way, it's really the only acceptable way--any tools that allow this functionality (like Java.ArrayList or the apache utils) will use this method under the covers. Also, you REALLY should be using ArrayList (or linked list if you delete from the middle a lot) so this shouldn't even be an issue unless you are doing it as homework.
To allocate a collection (creates a new array), then delete an element (which the collection will do using arraycopy) then call toArray on it (creates a SECOND new array) for every delete brings us to the point where it's not an optimizing issue, it's criminally bad programming.
Suppose you had an array taking up, say, 100mb of ram. Now you want to iterate over it and delete 20 elements.
Give it a try...
I know you ASSUME that it's not going to be that big, or that if you were deleting that many at once you'd code it differently, but I've fixed an awful lot of code where someone made assumptions like that.
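With in-place removal as in `removeElement` above, the array keeps its length and the last element is left duplicated, so the caller also has to track how many slots are still in use. A minimal sketch, assuming a separately maintained logical size (the `InPlaceRemove` class and the `size` parameter are illustrative, not from the answer):

```java
final class InPlaceRemove {
    // Shifts the tail left over arr[removedIdx] and clears the now-unused last
    // slot. Returns the new logical size; the array itself keeps its length.
    static int removeAt(Object[] arr, int size, int removedIdx) {
        if (size > arr.length || removedIdx < 0 || removedIdx >= size) {
            throw new IndexOutOfBoundsException(
                    "index: " + removedIdx + ", size: " + size);
        }
        System.arraycopy(arr, removedIdx + 1, arr, removedIdx, size - 1 - removedIdx);
        arr[size - 1] = null; // avoid a stale duplicate of the last element
        return size - 1;
    }
}
```

This is essentially the bookkeeping an ArrayList does internally, which is one more reason to prefer the collection unless the raw array is required.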
You can't remove an element from the basic Java array. Take a look at various Collections and ArrayList instead.
Nice looking solution would be to use a List instead of array in the first place.
List.remove(index)
If you have to use arrays, two calls to System.arraycopy will most likely be the fastest.
Foo[] result = new Foo[source.length - 1];
System.arraycopy(source, 0, result, 0, index);
if (source.length != index) {
System.arraycopy(source, index + 1, result, index, source.length - index - 1);
}
(Arrays.asList is also a good candidate for working with arrays, but it doesn't seem to support remove.)
I think the question was asking for a solution without the use of the Collections API. One uses arrays either for low level details, where performance matters, or for a loosely coupled SOA integration. In the latter, it is OK to convert them to Collections and pass them to the business logic as such.
For the low level performance stuff, it is usually already obfuscated by the quick-and-dirty imperative state-mingling by for loops, etc. In that case converting back and forth between Collections and arrays is cumbersome, unreadable, and even resource intensive.
By the way, TopCoder, anyone? Always those array parameters! So be prepared to be able to handle them when in the Arena.
Below is my interpretation of the problem, and a solution. It is different in functionality from both of the ones given by Bill K and jelovirt. Also, it gracefully handles the case when the element is not in the array.
Hope that helps!
public char[] remove(char[] symbols, char c)
{
for (int i = 0; i < symbols.length; i++)
{
if (symbols[i] == c)
{
char[] copy = new char[symbols.length-1];
System.arraycopy(symbols, 0, copy, 0, i);
System.arraycopy(symbols, i+1, copy, i, symbols.length-i-1);
return copy;
}
}
return symbols;
}
You could use the ArrayUtils API to remove it in a "nice looking way". It implements many operations (remove, find, add, contains,etc) on Arrays.
Take a look. It has made my life simpler.
okay, thx a lot
now i use sth like this:
public static String[] removeElements(String[] input, String deleteMe) {
if (input != null) {
List<String> list = new ArrayList<String>(Arrays.asList(input));
// Iterate backwards so that removing an element does not skip its successor
for (int i = list.size() - 1; i >= 0; i--) {
if (list.get(i).equals(deleteMe)) {
list.remove(i);
}
}
return list.toArray(new String[0]);
} else {
return new String[0];
}
}
Some more pre-conditions are needed for the ones written by Bill K and dadinn
Object[] newArray = new Object[src.length - 1];
if (i > 0){
System.arraycopy(src, 0, newArray, 0, i);
}
if (newArray.length > i){
System.arraycopy(src, i + 1, newArray, i, newArray.length - i);
}
return newArray;
You cannot change the length of an array, but you can change the value an index holds by storing a new value at an existing index.
For example, with 1 = mike, 2 = jeff, ..., 10 = george, the 11th entry goes to slot 1, overwriting mike.
Object[] array = new Object[10];
int count = -1;
public void myFunction(String string) {
count++;
if(count == array.length) {
count = 0; // overwrite first
}
array[count] = string;
}
Copy your original array into another array, without the element to be removed.
A simpler way to do that is to use a List, Set... and use the remove() method.
Swap the item to be removed with the last item, if resizing the array down is not an interest.
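The swap-with-last idea can be sketched as follows, assuming element order does not matter and the caller tracks a logical size (the `SwapRemove` class and the `size` parameter are illustrative):

```java
final class SwapRemove {
    // Overwrites arr[idx] with the last live element and returns the new
    // logical size. O(1), but the relative order of elements is not preserved.
    static int removeAt(int[] arr, int size, int idx) {
        if (size > arr.length || idx < 0 || idx >= size) {
            throw new IndexOutOfBoundsException("index: " + idx + ", size: " + size);
        }
        arr[idx] = arr[size - 1];
        return size - 1;
    }
}
```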
I hope you use the java collection / java commons collections!
With a java.util.ArrayList you can do things like the following:
yourArrayList.remove(someObject);
yourArrayList.add(someObject);
Use an ArrayList:
alist.remove(1); //removes the element at position 1
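One pitfall with a `List<Integer>`: `remove(int)` removes by position, while `remove(Object)` removes the first equal element, and an `int` argument selects the positional overload. A small sketch showing both (the `RemoveOverloads` class and its method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveOverloads {
    // Returns a copy of src without the element at the given position.
    static List<Integer> removeByIndex(List<Integer> src, int index) {
        List<Integer> copy = new ArrayList<Integer>(src);
        copy.remove(index); // remove(int): by position
        return copy;
    }

    // Returns a copy of src without the first element equal to value.
    static List<Integer> removeByValue(List<Integer> src, int value) {
        List<Integer> copy = new ArrayList<Integer>(src);
        copy.remove(Integer.valueOf(value)); // remove(Object): by equality
        return copy;
    }
}
```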
Sure, create another array :)

What is the fastest and most concise/correct way to implement this model class backed by values in a 2-dimensional array?

I solved this problem using a graph, but unfortunately now I'm stuck with having to use a 2d array and I have questions about the best way to go about this:
public class Data {
int[][] structure;
public Data(int x, int y){
structure = new int[x][y];
}
public <<TBD>> generateRandom() {
// This is what my question is about
}
}
I have a controller/event handler class:
public class Handler implements EventHandler {
@Override
public void onEvent(Event<T> e) {
this.dataInstance.generateRandom();
// ... other stuff
}
}
Here is what each method will do:
Data.generateRandom() will generate a random value at a random location in the 2d int array if there exists a value in the structure that is not initialized or a value exists that is equal to zero
If there is no available spot in the structure, the structure's state is final (i.e. in the literal sense, not the Java declaration)
This is what I'm wondering:
What is the most efficient way to check if the board is full? Using a graph, I was able to check if the board was full in O(1) and get an available yet also random location in worst-case O(n^2 - 1), best case O(1). Obviously, now with an array, improving on n^2 is tough, so I'm just focusing on execution speed and LOC. Would the fastest way now be to check the entire 2d array using streams, like:
Arrays.stream(board).flatMapToInt(tile -> tile.getX()).map(x -> x > 0).count() > board.getWidth() * board.getHeight()
(1) You can definitely use a parallel stream to safely perform read-only operations on the array. You can also use an anyMatch call, since (for the isFull check) you only care whether there exists any one space that hasn't been initialized. That could look like this:
Arrays.stream(structure)
.parallel()
.flatMapToInt(Arrays::stream) // flatten the int[][] rows into one IntStream
.anyMatch(i -> i == 0)
However, that is still an n^2 solution. What you could do, though, is keep a counter of the number of spaces possible that you decrement when you initialize a space for the first time. Then the isFull check would always be constant time (you're just comparing an int to 0).
public class Data {
private int numUninitialized;
private int[][] structure;
public Data(int x, int y) {
if (x <= 0 || y <= 0) {
throw new IllegalArgumentException("You can't create a Data object with an argument that isn't a positive integer.");
}
structure = new int[x][y];
numUninitialized = x * y; // assign the field; redeclaring it here would shadow it
}
public void generateRandom() {
if (isFull()) {
// do whatever you want when the array is full
} else {
// Calculate the random space you want to set a value for
int x = ThreadLocalRandom.current().nextInt(structure.length);
int y = ThreadLocalRandom.current().nextInt(structure[0].length);
if (structure[x][y] == 0) {
// A new, uninitialized space
numUninitialized--;
}
// Populate the space with a random value
structure[x][y] = ThreadLocalRandom.current().nextInt(Integer.MIN_VALUE, Integer.MAX_VALUE);
}
}
public boolean isFull() {
return 0 == numUninitialized;
}
}
Now, this is with my understanding that each time you call generateRandom you take a random space (including ones already initialized). If you are supposed to ONLY choose a random uninitialized space each time it's called, then you'd do best to hold an auxiliary data structure of all the possible grid locations so that you can easily find the next random open space and to tell if the structure is full.
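If only uninitialized spaces may be chosen, the auxiliary structure mentioned above could be a flat array of free cell positions from which a random entry is swap-removed. The sketch below is one possible shape under that assumption; the `FreeCellGrid` class, its encoding `row * cols + col`, and the choice to store only non-zero values are all illustrative.

```java
import java.util.concurrent.ThreadLocalRandom;

final class FreeCellGrid {
    private final int[][] structure;
    private final int[] free; // encoded positions (row * cols + col) of empty cells
    private final int cols;
    private int freeCount;

    FreeCellGrid(int rows, int cols) {
        if (rows <= 0 || cols <= 0) {
            throw new IllegalArgumentException("rows and cols must be positive");
        }
        this.cols = cols;
        this.structure = new int[rows][cols];
        this.free = new int[rows * cols];
        for (int i = 0; i < free.length; i++) {
            free[i] = i;
        }
        this.freeCount = free.length;
    }

    // O(1): just compares the counter with zero.
    boolean isFull() {
        return freeCount == 0;
    }

    // Fills one random empty cell with a random non-zero value in O(1).
    // Returns false when the grid is already full.
    boolean generateRandom() {
        if (isFull()) {
            return false;
        }
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        int pick = rnd.nextInt(freeCount);
        int cell = free[pick];
        free[pick] = free[--freeCount]; // swap-remove the chosen position
        structure[cell / cols][cell % cols] = rnd.nextInt(1, Integer.MAX_VALUE);
        return true;
    }

    int get(int row, int col) {
        return structure[row][col];
    }
}
```

Both the fullness check and the random pick are constant time; the cost is one extra int per cell.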
(2) What notification method is appropriate for letting other classes know the array is now immutable? It's kind of hard to say as it depends on the use case and the architecture of the rest of the system this is being used in. If this is an MVC application with a heavy use of notifications between the data model and a controller, then an observer/observable pattern makes a lot of sense. But if your application doesn't use that anywhere else, then perhaps just having the classes that care check the isFull method would make more sense.
(3) Java is efficient at creating and freeing short lived objects. However, since the arrays can be quite large I'd say that allocating a new array object (and copying the data) over each time you alter the array seems ... inefficient at best. Java has the ability to do some functional types of programming (especially with the inclusion of lambdas in Java 8) but only using immutable objects and a purely functional style is kind of like the round hole to Java's square peg.

Splitting vectors into subvectors - Java

I have a function that processes vectors. The size of the input vector can be anything up to a few million. The problem is that the function can only process vectors no bigger than 100k elements without problems.
I would like to call function in smaller parts if vector has too many elements
Vector<Stuff> process(Vector<Stuff> input) {
Vector<Stuff> output;
while(1) {
if(input.size() > 50000) {
output.addAll(doStuff(input.pop_front_50k_first_ones_as_subvector()));
}
else {
output.addAll(doStuff(input));
break;
}
}
return output;
}
How should I do this?
Not sure if a Vector with millions of elements is a good idea, but Vector implements List, and thus there is subList which provides a lightweight (non-copy) view of a section of the Vector.
You may have to update your code to work with the interface List instead of only the specific implementation Vector, though (because the sublist returned is not a Vector, and it is just good practice in general).
You probably want to rewrite your doStuff method to take a List rather than a Vector argument,
public Collection<Stuff> doStuff(List<Stuff> v) {
// calculation
}
(and notice that Vector<T> is a List<T>)
and then change your process method to something like
Vector<Stuff> process(Vector<Stuff> input) {
Vector<Stuff> output = new Vector<Stuff>();
int startIdx = 0;
while(startIdx < input.size()) {
int endIdx = Math.min(startIdx + 50000, input.size());
output.addAll(doStuff(input.subList(startIdx, endIdx)));
startIdx = endIdx;
}
return output;
}
this should work as long as the "input" Vector isn't being concurrently updated during the running of the process method.
If you can't change the signature of doStuff, you're probably going to need to wrap a new Vector around the result of subList,
output.addAll(doStuff(new Vector<Stuff>(input.subList(startIdx, endIdx))));
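The chunking itself can be separated from the processing. Below is a small generic sketch of that step using `subList`; the `Chunker` class and its `chunks` method are hypothetical helpers, not part of the answers above.

```java
import java.util.ArrayList;
import java.util.List;

final class Chunker {
    // Splits input into consecutive subList views of at most chunkSize elements.
    // The views share storage with input, so input must not be structurally
    // modified while they are in use.
    static <T> List<List<T>> chunks(List<T> input, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be > 0");
        }
        List<List<T>> result = new ArrayList<List<T>>();
        for (int start = 0; start < input.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, input.size());
            result.add(input.subList(start, end));
        }
        return result;
    }
}
```

Because `Vector` implements `List`, a `Vector<Stuff>` can be passed in directly, and each chunk can then be handed to `doStuff`.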

Time efficient implementation of generating probability tree and then sorting the results

I have some events, where each of them has a probability to happen, and a weight if they do. I want to create all possible combinations of probabilities of events, with the corresponding weights. In the end, I need them sorted in weight order. It is like generating a probability tree, but I only care about the resulting leaves, not which nodes it took to get them. I don't need to look up specific entries during the creation of the end result, just to create all the values and sort them by weight.
There will be only about 5-15 events, but since there are 2^n resulting possibilities with n events, and this is to be done very often, I don't want it to take unnecessarily long. Speed is much more important than the amount of storage used.
The solution I came up with works but is slow. Any idea for a quicker solution or some ideas for improvement?
class ProbWeight {
double prob;
double eventWeight;
public ProbWeight(double aProb, double aeventWeight) {
prob = aProb;
eventWeight = aeventWeight;
}
public ProbWeight(ProbWeight aCellProb) {
prob = aCellProb.getProb();
eventWeight = aCellProb.geteventWeight();
}
public double getProb(){
return prob;
}
public double geteventWeight(){
return eventWeight;
}
public void doesHappen(ProbWeight aProb) {
prob*=aProb.getProb();
eventWeight += aProb.geteventWeight();
}
public void doesNotHappen(ProbWeight aProb) {
prob*=(1-aProb.getProb());
}
}
//Data generation for testing
List<ProbWeight> dataList = new ArrayList<ProbWeight>();
for (int i =0; i<5; i++){
ProbWeight prob = new ProbWeight(Math.random(), 10*Math.random());
dataList.add(prob);
}
//The list where the results will end up
List<ProbWeight> resultingProbList = new ArrayList<ProbWeight>();
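// a temporary list to avoid modifying a list while looping through it
List<ProbWeight> tempList = new ArrayList<ProbWeight>();
// start from the empty combination (probability 1, weight 0) so that the first event also gets both branches
resultingProbList.add(new ProbWeight(1.0, 0.0));
for (ProbWeight data : dataList){ //for each event
tempList.clear(); // drop the combinations built for the previous event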
//go through the already created event combinations and create two new for each
for(ProbWeight listed: resultingProbList){
ProbWeight firstPossibility = new ProbWeight(listed);
ProbWeight secondPossibility = new ProbWeight(listed);
firstPossibility.doesHappen(data);
secondPossibility.doesNotHappen(data);
tempList.add(firstPossibility);
tempList.add(secondPossibility);
}
resultingProbList = new ArrayList<ProbWeight>(tempList);
}
// Then sort the list by weight using sort and a comparator
It is 50% about choosing an appropriate data structure and 50% about the algorithm. Data structure - I believe TreeBidiMap will do the magic for you. You will need to implement 2 Comparators - 1 for the weight and another for the probability.
Algorithm - trivial.
Good luck!
Just a few tricks to try to speed up your code:
- try to avoid unnecessary object allocations
- try to use the right constructor for your collections: in your code sample it seems that you already know the size of the collections, so pass it to the constructors to prevent useless resizing (and GC calls)
You may try to use a Set instead of a List in order to have the ordering done on the fly.
HTH
jerome
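As one way to apply the allocation advice, the leaves can be built in an array allocated once at its final size of 2^n rows, doubling the filled part for each event. This is a sketch under the question's model (independent events, weights add up); the `ProbTree` class and its `leaves` method are made-up names, and no timing claim is implied.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ProbTree {
    // Returns 2^n rows of {probability, weight}, sorted by ascending weight.
    static double[][] leaves(double[] probs, double[] weights) {
        int n = probs.length;
        if (weights.length != n) {
            throw new IllegalArgumentException("probs and weights differ in length");
        }
        if (n > 20) {
            throw new IllegalArgumentException("too many events for this sketch");
        }
        double[][] out = new double[1 << n][2];
        out[0][0] = 1.0; // empty combination: probability 1, weight 0
        int size = 1;
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < size; i++) {
                // branch where event k happens
                out[size + i][0] = out[i][0] * probs[k];
                out[size + i][1] = out[i][1] + weights[k];
                // branch where event k does not happen (weight unchanged)
                out[i][0] *= 1.0 - probs[k];
            }
            size *= 2;
        }
        Arrays.sort(out, new Comparator<double[]>() {
            @Override
            public int compare(double[] a, double[] b) {
                return Double.compare(a[1], b[1]);
            }
        });
        return out;
    }
}
```

For example, `ProbTree.leaves(new double[] {0.5, 0.25}, new double[] {1.0, 2.0})` yields four rows with weights 0, 1, 2 and 3, whose probabilities sum to 1.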
