A hashtable with a 1.0 max load factor, time complexity - java

Just a quick question to confirm my thoughts.
Would the complexity of a hash table using a load factor of 1.0 be quadratic time, i.e. O(n^2)? My reasoning is that the table would have to continuously resize and re-insert over and over. Please correct me if I'm wrong.
Thanks

The worst-case complexity of searching, inserting, and deleting in a hash table will almost always be O(n).

It sounds like you mean open addressing. With separate chaining, the expected complexity of all operations remains O(1), even at a load factor of 1.0.
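To make the amortization argument concrete, here is a minimal sketch (plain java.util, nothing beyond the standard constructor) showing that a load factor of 1.0 is an ordinary, legal setting for HashMap and does not lead to quadratic total work:

    import java.util.HashMap;
    import java.util.Map;

    public class LoadFactorDemo {
        public static void main(String[] args) {
            // HashMap takes an explicit load factor; 1.0f means the table
            // resizes only once the number of entries reaches its capacity.
            Map<Integer, String> map = new HashMap<>(16, 1.0f);

            // n insertions trigger only O(log n) doublings, and each doubling
            // copies the current contents, so the total resize work over n
            // inserts is O(n): amortized O(1) per insert, not O(n^2) overall.
            for (int i = 0; i < 1_000_000; i++) {
                map.put(i, "value-" + i);
            }
            System.out.println(map.size()); // 1000000
        }
    }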

Related

Time Complexity of the Java BitSet.flip(from_index, to_index)

I am interested in knowing the time complexity of BitSet.flip().
I tried various sources; the closest thing I found is this Oracle documentation, which doesn't mention anything about the time complexity of the internal operation.
If you are someone who would say it has a time complexity of O(n), where n is the distance between from_index and to_index, please have a look at this solution. That solution is evidence that BitSet.flip() has some internal optimisation. It may be setting the bits in bigger chunks or blocks; if that is the case, I want to know what the block size is.
There is literally no information about the time complexity or the internal implementation of BitSet.flip().
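For what it's worth, the OpenJDK source stores the bits in a long[], and flip(from, to) XORs whole 64-bit words using boundary masks, so the "block size" is a 64-bit word and the cost is on the order of (to - from)/64 word operations. A simplified sketch of that word-level idea (illustrative, not the actual JDK code):

    import java.util.Arrays;

    public class FlipSketch {
        // Simplified version of the word-level trick: bits live in a long[],
        // and the range [from, to) is flipped by XOR-ing whole 64-bit words,
        // with masks handling the partial words at the two ends.
        static void flipRange(long[] words, int from, int to) {
            if (from >= to) return;
            int firstWord = from >> 6;            // from / 64
            int lastWord  = (to - 1) >> 6;
            long firstMask = -1L << from;         // shift count is taken mod 64
            long lastMask  = -1L >>> -to;         // same trick the JDK source uses
            if (firstWord == lastWord) {
                words[firstWord] ^= (firstMask & lastMask);
            } else {
                words[firstWord] ^= firstMask;
                for (int i = firstWord + 1; i < lastWord; i++) {
                    words[i] ^= -1L;              // one op flips 64 bits at once
                }
                words[lastWord] ^= lastMask;
            }
        }

        public static void main(String[] args) {
            long[] words = new long[2];           // 128 bits, all zero
            flipRange(words, 3, 70);              // flip bits 3..69
            System.out.println(Arrays.toString(words));
        }
    }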

Ambiguity in a CodeForces Problem - usage of HashSet vs LinkedHashSet

I was solving a Codeforces problem yesterday. The problem's URL is this.
I will briefly explain the question below.
Given a binary string, divide it into a minimum number of subsequences in such a way that each character of the string belongs to exactly one subsequence and each subsequence looks like "010101..." or "101010..." (i.e. the subsequence must not contain two adjacent zeros or ones).
For this problem I submitted a solution during the contest (this is the solution). It was accepted provisionally, but on the final test cases it got a Time Limit Exceeded status.
Today I submitted another solution, and this one passed all the cases.
In the first solution I used HashSet, and in the second one I used LinkedHashSet. I want to know: why didn't HashSet clear all the cases? Does this mean I should use LinkedHashSet whenever I need a Set implementation? I saw this article and found that HashSet generally performs better than LinkedHashSet. So why doesn't my code work here?
This question would probably get more replies on Codeforces, but I'll answer it here anyway.
After a contest ends, Codeforces allows other users to "hack" solutions by writing custom inputs to run on other users' programs. If the defending user's program runs slowly on the custom input, the status of their code submission will change from "Accepted" to "Time Limit Exceeded".
The reason why your code, specifically, changed from "Accepted" to "Time Limit Exceeded" is that somebody created an "anti-hash test" (a test on which your hash function results in many collisions) on which your program ran slower than usual. If you're interested in how such tests are generated, you can find several posts on Codeforces, like this one: https://codeforces.com/blog/entry/60442.
As linked by @Photon, there's a post on Codeforces explaining why you should avoid using Java's HashSet and HashMap: https://codeforces.com/blog/entry/4876. This is essentially due to anti-hash tests. In some instances the extra log(n) factor from a balanced BST (TreeSet or TreeMap) may not be so bad: in many cases it won't make your code time out, and it gives you protection from anti-hash tests.
How do you determine whether your algorithm is fast enough to add the log(n) factor? I guess this comes with some experience, but most people suggest performing some sort of calculation. Most online judges (including Codeforces) show the time that your program is allowed to run on a particular problem (usually somewhere between one and four seconds), and you can use 10^9 constant-time operations per second as a rule of thumb when performing calculations.
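If you want to keep HashSet's O(1) behaviour instead of paying for a TreeSet, another common defence is to randomise the hashing yourself so a precomputed anti-hash test no longer applies. A minimal sketch of that idea (the salting scheme here is illustrative, not taken from the linked posts):

    import java.util.HashSet;
    import java.util.Set;
    import java.util.concurrent.ThreadLocalRandom;

    public class SaltedSet {
        // A fresh random salt on every run: an attacker cannot precompute
        // keys that collide, because the bucket layout changes each execution.
        private static final int SALT = ThreadLocalRandom.current().nextInt();

        public static void main(String[] args) {
            // Integer.hashCode() is the value itself, so XOR-ing every key
            // with the salt scrambles which bucket it lands in.
            Set<Integer> seen = new HashSet<>();
            int[] keys = {3, 1, 4, 1, 5, 9, 2, 6};
            for (int k : keys) {
                seen.add(k ^ SALT);                      // store the salted key
            }
            System.out.println(seen.contains(4 ^ SALT)); // true: salt lookups too
            System.out.println(seen.size());             // 7 (duplicate 1 collapses)
        }
    }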

Different types of sorting and their time

I know that as the number of elements is doubled, the time to sort with selection sort and insertion sort quadruples.
What about merge sort and quicksort?
Let's say it takes 2 seconds to sort 100 items using merge sort.
How long would it take to sort 200 items using merge sort, and using quicksort?
Merge sort is O(n log(n)) in every case. Quicksort is O(n log(n)) on average, but in the worst case it degrades to O(n^2). I'll leave the math to you, as it's fairly simple: scale the measured running time by the ratio of the two cost functions. The nice thing about common sorting algorithms is that they are very well documented online, and there are plenty of calculators that could give you specifics. As for how long it would actually take to run an algorithm, that depends largely on hardware and constant factors. You should be more concerned with the Big-O of whatever you're running, because that's the thing you can actually control as a programmer.
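To make the math concrete for the 100-vs-200 example above, here is a back-of-the-envelope estimate (it ignores constant factors and hardware, which is exactly why it is only an estimate):

    public class SortScaling {
        public static void main(String[] args) {
            // If 100 items take 2 s under an O(n log n) sort, estimate 200
            // items by scaling with the ratio (n2 log n2) / (n1 log n1).
            double t1 = 2.0, n1 = 100, n2 = 200;
            double t2 = t1 * (n2 * Math.log(n2)) / (n1 * Math.log(n1));
            System.out.printf("~%.2f seconds%n", t2); // ~4.60: a bit over double
        }
    }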

BOBYQA Algorithm

I am working with the BOBYQA algorithm for optimisation problems. I have a question about the efficiency of this algorithm and how to set the right parameters.
BOBYQA is based on a trust region, and for this purpose one needs to set the number of interpolation points used to approximate the function with a quadratic model. It is strongly recommended to choose this number between n+2 and 2n+1. I have some difficulty seeing how to choose the right one. I have a problem with 9 unknowns, and I selected the best number empirically. I have also noticed that this parameter can dramatically increase calculation time.
If someone could share their own experience with this algorithm, it would be helpful.
Thanks
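For reference, here is a minimal sketch of driving BOBYQA, assuming the Apache Commons Math 3 implementation (the number of interpolation points is the constructor argument; the sphere objective and the starting point are toy placeholders, not from the question):

    import java.util.Arrays;

    import org.apache.commons.math3.analysis.MultivariateFunction;
    import org.apache.commons.math3.optim.InitialGuess;
    import org.apache.commons.math3.optim.MaxEval;
    import org.apache.commons.math3.optim.PointValuePair;
    import org.apache.commons.math3.optim.SimpleBounds;
    import org.apache.commons.math3.optim.nonlinear.scalar.GoalType;
    import org.apache.commons.math3.optim.nonlinear.scalar.ObjectiveFunction;
    import org.apache.commons.math3.optim.nonlinear.scalar.noderiv.BOBYQAOptimizer;

    public class BobyqaSketch {
        public static void main(String[] args) {
            int n = 9;                          // number of unknowns
            int points = 2 * n + 1;             // the parameter in question: n+2 .. 2n+1

            MultivariateFunction sphere = x ->  // toy objective: sum of squares
                    Arrays.stream(x).map(v -> v * v).sum();

            double[] start = new double[n];
            Arrays.fill(start, 1.0);

            BOBYQAOptimizer opt = new BOBYQAOptimizer(points);
            PointValuePair result = opt.optimize(
                    new MaxEval(20_000),
                    new ObjectiveFunction(sphere),
                    GoalType.MINIMIZE,
                    new InitialGuess(start),
                    SimpleBounds.unbounded(n));

            System.out.println(result.getValue()); // ~0 at the minimum
        }
    }

More interpolation points generally means a better quadratic model per iteration but more function evaluations to build it, which matches the observation that this parameter can dramatically change calculation time.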

Subgraph matching (JUNG)

I have a set of subgraphs and I need to match them against the graph they were extracted from. I also need to count how many times each subgraph shows up in that graph (i.e. I need to store all possible matches). The match must be exact with respect to the edge labels of both subgraph and graph; the vertex labels, however, don't need to match. I built my system using the JUNG API, so I would like a solution (API, algorithm, etc.) that can deal with the Graph structure provided by JUNG. Any thoughts?
JUNG is very full-featured, so if there isn't a graph analysis algorithm in JUNG for what you need, there's usually a strong, theoretical reason for it. To me, your problem sounds like an instance of the Subgraph Isomorphism problem, which is NP-Complete:
http://en.wikipedia.org/wiki/Subgraph_isomorphism_problem
NP-Completeness may or may not be familiar to you (it took me 7 years of college and a Master's degree in Computer Science to understand it!), so I'll give a high-level description here. Certain problems, like sorting, can be solved in polynomial time with respect to their input size. For example, if I have a list of N elements, I can sort it in O(N log(N)) time. More importantly, solving a problem in polynomial time means I don't have to exhaust every possible solution. In the sorting case, I could traverse every possible permutation of the list and return the first one that happened to be sorted, but that is obviously not the fastest way to solve the problem. Some very clever mathematicians got comparison-based sorting down to its theoretical minimum of O(N log(N)), which is why we can sort really big lists of things quite quickly today.
On the flip side, NP-Complete problems are thought to have no polynomial-time solution (I say thought because no one has ever proven it, although the evidence strongly suggests it). What this means in practice is that you cannot definitively solve an NP-Complete problem without exhausting every possible solution. All known algorithms for NP-Complete problems take O(c^N) time or worse, where c is some constant greater than 1, so the time required to solve the problem grows exponentially with every incremental increase in problem size.
So what does this have to do with my problem???
What I'm getting at here is that, since the Subgraph Isomorphism problem is NP-Complete, the only known way to determine whether one graph is a subgraph of another is to exhaust every possible solution. So you can solve this, but probably only for graphs up to a few nodes or so, since the time complexity grows exponentially with every incremental increase in graph size. Beyond a certain graph size it becomes computationally infeasible: finding a solution would take practically forever.
More practically, if your boss asks you to do something that is provably NP-Complete, you can simply say it's impossible and he will have to listen to you. If your professor asks you to do something that is provably NP-Complete, show him that it's NP-Complete and you'll probably get an A for the course. If YOU are trying to do something NP-Complete of your own accord, it's better to just move on to the next project... ;)
Well, I ended up solving the problem by implementing it from scratch. I followed the strategy suggested in the topic "Any working example of VF2 algorithm?". So, if anyone else is stuck on this problem, I suggest taking a look at Rich Apodaca's answer in that topic.
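For anyone who lands here with the same problem, below is a bare-bones (and deliberately exponential) backtracking skeleton for labeled subgraph matching, independent of JUNG; VF2 is essentially this idea plus the pruning rules that make it practical. The map-based graph representation is just for illustration:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class SubgraphMatcher {
        // Graphs are adjacency maps: graph.get(u).get(v) is the label of
        // edge u -> v. Vertex labels are ignored, per the question.
        static int matches = 0;

        static void match(Map<Integer, Map<Integer, String>> pattern,
                          Map<Integer, Map<Integer, String>> target,
                          List<Integer> patternNodes,
                          Map<Integer, Integer> mapping) {
            if (mapping.size() == patternNodes.size()) {
                matches++;                                      // complete embedding found
                return;
            }
            int next = patternNodes.get(mapping.size());
            for (Integer candidate : target.keySet()) {
                if (mapping.containsValue(candidate)) continue; // keep mapping injective
                mapping.put(next, candidate);
                if (consistent(pattern, target, mapping)) {
                    match(pattern, target, patternNodes, mapping);
                }
                mapping.remove(next);                           // backtrack
            }
        }

        // Every pattern edge whose endpoints are both mapped must exist in
        // the target with exactly the same edge label.
        static boolean consistent(Map<Integer, Map<Integer, String>> pattern,
                                  Map<Integer, Map<Integer, String>> target,
                                  Map<Integer, Integer> mapping) {
            for (Map.Entry<Integer, Map<Integer, String>> out : pattern.entrySet()) {
                Integer mu = mapping.get(out.getKey());
                if (mu == null) continue;
                for (Map.Entry<Integer, String> edge : out.getValue().entrySet()) {
                    Integer mv = mapping.get(edge.getKey());
                    if (mv == null) continue;
                    Map<Integer, String> adj = target.get(mu);
                    if (adj == null || !edge.getValue().equals(adj.get(mv))) return false;
                }
            }
            return true;
        }

        public static void main(String[] args) {
            // Pattern: 0 -a-> 1. Target: 0 -a-> 1 -a-> 2, so two embeddings exist.
            Map<Integer, Map<Integer, String>> pattern =
                    Map.of(0, Map.of(1, "a"), 1, Map.of());
            Map<Integer, Map<Integer, String>> target =
                    Map.of(0, Map.of(1, "a"), 1, Map.of(2, "a"), 2, Map.of());
            match(pattern, target, List.of(0, 1), new HashMap<>());
            System.out.println(matches); // 2
        }
    }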
