Is there some resource for challenging multi-threading problems? I would like to pose these to interviewees if possible. I'm tired of asking the same wait-notify questions that everyone gets right these days, but I can't visualise a real scenario where multi-threading was employed.
The problem is that concurrent programming is a difficult topic. If you (the interviewer) are not fully on top of it, it will be difficult for you to tell if the interviewee knows their stuff. It is very easy to come up with solutions to concurrency problems that have subtle flaws. Conversely, it is unfair on candidates [1] if you reject them because you think their answers are wrong when they are actually correct.
[1] - and bad for your organisation. If the candidate actually knows more about multi-threading than you, then you arguably need to employ him. Other factors being equal, of course.
Java Concurrency in Practice. I like to know whether a candidate understands data races, CAS, the Michael-Scott queue and other concurrent data structures, and why thread safety becomes more important as the number of cores grows.
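For the CAS part, a minimal sketch of the kind of lock-free retry loop a candidate should be able to explain (in real code you would just call incrementAndGet, which does the same thing internally):

import java.util.concurrent.atomic.AtomicInteger;

public final class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    public int increment() {
        for (;;) {
            int current = value.get();  // read the current value
            int next = current + 1;
            // compareAndSet succeeds only if no other thread changed
            // the value since our read; otherwise reload and retry.
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }
}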
As multithreading is hard (as others have pointed out), I would suggest doing this in an actual programming session where the potential employee is given a programming problem, preferably based on something that has actually happened, alongside one of your experienced programmers, so you can actually SEE how the problem is attempted and the experienced programmer can evaluate what happened.
It must not be too complex, but complex enough that your experienced programmer gets enough information.
Well, if you want to have fun with the poor sap, ask him about Dekker's Algorithm (and Peterson's variation thereof). If you're feeling nasty, ask him if he has ever used either one on real multiprocessor hardware.
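If it helps to have something concrete on the table, here is a minimal Java sketch of Peterson's algorithm for two threads. The volatile modifiers are essential: without them the Java Memory Model allows reorderings that break the algorithm, which is also why naive ports of it fail on real multiprocessor hardware without memory barriers.

// Peterson's mutual exclusion for exactly two threads (ids 0 and 1).
public final class PetersonLock {
    private volatile boolean flag0, flag1; // "I want to enter"
    private volatile int turn;             // whose turn it is to wait

    public void lock(int self) {           // self must be 0 or 1
        if (self == 0) {
            flag0 = true;
            turn = 1;
            while (flag1 && turn == 1) { /* busy-wait */ }
        } else {
            flag1 = true;
            turn = 0;
            while (flag0 && turn == 0) { /* busy-wait */ }
        }
    }

    public void unlock(int self) {
        if (self == 0) flag0 = false; else flag1 = false;
    }
}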
If you feel extra-nasty, ask him to show you a technique suitable for lock-free true concurrent single-reader single-writer unidirectional communications, between two processors with shared memory, in which the only atomic operations are single-word reads and writes. There is no read-modify-write instruction, on either side, and the processor architectures need not be the same. (Yes, such a technique exists.)
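One such technique (the answer leaves it as a teaser, but this is the classic instance) is a single-producer/single-consumer ring buffer in the style of Lamport's queue, where the only shared atomic operations are single-word reads and writes of the two indices. A Java sketch, assuming one dedicated producer thread and one dedicated consumer thread:

public final class SpscQueue<T> {
    private final Object[] buffer;
    private volatile int head; // read index, written only by the consumer
    private volatile int tail; // write index, written only by the producer

    public SpscQueue(int capacity) {
        buffer = new Object[capacity + 1]; // one slot is always kept empty
    }

    public boolean offer(T item) {              // producer thread only
        int next = (tail + 1) % buffer.length;
        if (next == head) return false;         // queue is full
        buffer[tail] = item;                    // write the slot first...
        tail = next;                            // ...then publish via volatile write
        return true;
    }

    @SuppressWarnings("unchecked")
    public T poll() {                           // consumer thread only
        if (head == tail) return null;          // queue is empty
        T item = (T) buffer[head];
        buffer[head] = null;                    // free the slot
        head = (head + 1) % buffer.length;      // release it to the producer
        return item;
    }
}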
I wouldn't ask too specific or detailed questions. But the above-mentioned book Java Concurrency in Practice is a good helper. Just go through it chapter by chapter and pick out the key points, e.g.:
Explain the difference between mutable and immutable objects (see the sketch after this list)
What does it mean to share data in a concurrent setup?
What problems do you solve with concurrency?
etc.
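For the mutable/immutable question, a minimal example of the kind of class a good answer might sketch (the names are illustrative): final class, final fields, no setters, and "modifications" that return new instances. Such objects can be shared freely between threads without synchronization.

public final class Money {
    private final String currency;
    private final long cents;

    public Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    // "Mutators" return a new instance instead of changing this one.
    public Money plus(long moreCents) {
        return new Money(currency, cents + moreCents);
    }

    public String currency() { return currency; }
    public long cents() { return cents; }
}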
First try to get a real scenario clear in your own mind, and then ask the job seekers about it. For this you could pose a question like: what is a real scenario for multithreading? Hope it will help you.
I got one in an interview recently. Get the candidate to write a servlet that implements an accurate in-memory hit counter indexed by URL (to serve a JavaScript-style hit counter on a number of web pages). Try it for yourself; it's not as easy as it sounds. The solution is a cut-down implementation of the Memoizer pattern from Java Concurrency in Practice.
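A sketch of one thread-safe solution, not necessarily the Memoizer-based one the interviewer had in mind: ConcurrentHashMap.computeIfAbsent gives atomic create-if-missing, and LongAdder keeps increments cheap under contention. (Imports use javax.servlet; newer containers use jakarta.servlet instead.)

import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HitCounterServlet extends HttpServlet {
    // One counter per URL; the map and the adders are both thread-safe.
    private final ConcurrentHashMap<String, LongAdder> hits = new ConcurrentHashMap<>();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        LongAdder counter =
            hits.computeIfAbsent(req.getRequestURI(), k -> new LongAdder());
        counter.increment();
        resp.getWriter().print(counter.sum());
    }
}

The subtle part candidates usually get wrong is the check-then-act on a plain HashMap (two threads both see "no counter yet" and one increment is lost), which is exactly what computeIfAbsent avoids.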
Related
I currently have a disagreement going on with my second-year Java professor that I'm hoping y'all could help solve:
The code we started with was this:
public T peek()
{
    if (isEmpty())
        .........
}

public boolean isEmpty()
{
    return topIndex < 0;
}
And she wants us to remove the isEmpty() reference and place its code directly into the if statement (i.e. change the peek method contents to if (topIndex < 0).......) to "make the code more efficient". I have argued that a) the runtime/compile-time optimizer would most likely have inlined the isEmpty() call, b) even if it didn't, the 5-10 machine operations would be negligible in nearly every situation, and c) it's just bad style because it makes the program less readable and less changeable.
So, I guess my question is:
Is there any runtime efficiency gained by inlining logic as opposed to just calling a method?
I have tried simple profiling techniques (a long loop and a stopwatch), but the tests have been inconclusive.
EDIT:
Thank you everyone for the responses! I appreciate you all taking the time. Also, I appreciate those of you who commented on the pragmatism of arguing with my professor, and especially doing so without data. @Mike Dunlavey, I appreciate your insight as a former professor and your advice on the appropriate coding sequence. @ya_pulser, I especially appreciate the profiling advice and links you took the time to share.
You are correct in your assumptions about Java code behaviour, but it is impolite to argue with your professor without data :). Arguing without data is pointless; prove your assumptions with measurements and graphs.
You can use JMH ( http://openjdk.java.net/projects/code-tools/jmh/ ) to create a small benchmark and measure the difference between the following (see the sketch after this list):
inlined by hand (remove the isEmpty method and place its code at the call site)
inlined by the Java JIT compiler (HotSpot, after 100k (?) invocations - see the PrintCompilation output)
HotSpot inlining disabled entirely
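A minimal JMH sketch of the first two variants (the field value and method names are made up for illustration); the third variant comes from running the same benchmark with inlining-related JVM flags turned off:

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class PeekBenchmark {
    private int topIndex = 5; // hypothetical stack state

    private boolean isEmpty() {
        return topIndex < 0;
    }

    @Benchmark
    public boolean viaMethodCall() {
        return isEmpty(); // HotSpot is free to inline this tiny method
    }

    @Benchmark
    public boolean inlinedByHand() {
        return topIndex < 0; // the professor's "more efficient" version
    }
}

If the JIT does inline the call, the two scores should be indistinguishable once the benchmark has warmed up.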
Please read http://www.oracle.com/technetwork/java/whitepaper-135217.html#method
Useful parameters could be:
-Djava.compiler=NONE
-XX:+PrintCompilation
Plus, each JDK version has its own set of parameters to control the JIT.
If you create a set of graphs from the results of your research and politely present them to the professor, I think it will benefit you in the future.
I think that https://stackoverflow.com/users/2613885/aleksey-shipilev can help with JMH-related questions.
BTW: I had great success when I inlined plenty of methods into a single huge code loop to achieve maximum speed for a neural-network backpropagation routine, because Java is (was?) too lazy to inline methods within methods within methods. It was unmaintainable and fast :(.
Sad...
I agree with your intuitions about it, especially "the 5-10 machine operations would be negligible in nearly every situation".
I was a C.S. professor a long time ago.
On one hand, professors need all the slack you can give them.
Teaching is very demanding. You can't have a bad day.
If you show up for a class and you're not fully prepared, you're in for a rough ride.
If you give a test on Friday and don't have the grades on Monday the students will say "But you had all weekend!"
You can get satisfaction from seeing your students learn, but you yourself don't learn much, except how to teach.
On the other hand, few professors have much practical experience with real software.
So their opinions tend to be founded on various dogmatic certitudes rather than solid pragmatism.
Performance is a perfect example of this.
They tend to say "Don't do X. Do Y because it performs better." which completely misses the point about performance problems - you have to deal in fractions, not absolutes. Everything depends on what else is going on.
The way to approach performance is, as someone said "First make it right. Then make it fast."
And the way you make it fast is not by eyeballing the code (and wondering "should I do this, or should I do that"), but by running it and letting it tell you how it's spending time.
The basic idea of profiling is how you do this.
Now there is such a thing as bad profiling and good profiling, as explained in the second answer here (and usually when professors do teach profiling, they teach the bad kind), but that's the way to go.
As you say, the difference will be small, and in most circumstances the readability should be a higher priority. In this case though, since the extra method consists of a single line, I'm not sure this adds any real readability benefit unless you're calling the same method from elsewhere.
That said, remember that your lecturer's goal is to help you learn computer science, and that is a different priority from writing production code. In particular, she won't want you to leave optimization to automated tools, since that doesn't help your learning.
Also, just a practical note - in school and in professional development, we all have to adhere to coding standards we personally disagree with. It's an important skill, and really is necessary for team working, even if it does chafe.
Calling isEmpty is idiomatic and nicely readable. Manually inlining it would be a micro-optimization, something best done in performance-critical situations, and only after a bottleneck has been confirmed by benchmarking in the intended production environment.
Is there a real performance benefit in manually inlining? Theoretically yes, and maybe that's what the lecturer wanted to emphasize. In practice, I don't think you'll find an absolute answer. The automatic inlining behavior may be implementation dependent. Also keep in mind that benchmark results will depend on the JVM implementation, version, and platform.
For that reason, this kind of optimization can be useful in rare extreme situations, but in general it is detrimental to portability and maintainability. By the same logic, should we inline all methods, eliminating all indirection at the expense of duplicating large blocks of code? Definitely not. Where exactly you draw the line between decomposition and inlining may also depend, to some degree, on personal taste.
Another form of data to look at could be the code that is generated. See the -XX:+PrintAssembly option and friends. See How to see JIT-compiled code in JVM? for more information.
I'm confident that in this particular case the Hotspot JVM will inline the call to isEmpty and there would be no performance difference.
I've been looking across the web for a simple explanation about the differences between the two.
I understand composition is "bottom-up" design while decomposition is "top-down" design.
However, aside from that - are there any further differences?
If a program implements the "composability" principle, does it necessarily also implement the "decomposability" principle, and vice-versa?
It's obvious how these two can lead to different designs, but all in all, it seems they represent exactly the same thing from different points of view.
Clarifications will be highly appreciated.
Cheers!
Some reference links:
YorkU
Blog about Modular Composability
Blog about Modular Decomposability
As the first link you've provided shows, these two approaches are not incompatible. You just need to know when to use one or the other.
From my experience, top-down design is a good approach when you start designing, as you need to discover your system, understand the requirements, and make something work quickly. As you add more and more features, specific responsibilities start to emerge, and this is where decomposing your problem is required. This will prevent duplicating code from one feature to another and lower the effort required to compose new ones.
Choosing between one approach or the other is just a matter of figuring out the right design decision to take at the proper time. If you feel that some aspects of a problem are still unclear, there is no reason to decompose it. Just wait until your module cries out for modularization (for example, having a hard time understanding what you wrote some days ago would be a good sign, same for duplicated code).
Does this answer your question?
So do I really need to learn about them? Isn't there an interesting way to learn about stacks, linked lists, heaps, etc.? I find it a boring subject.
While posting this question it showed some warning. Am I not allowed to post such a question? Admins, please clarify and I will delete it :/
Warning: The question you're asking appears subjective and is likely to be closed.
Okay... I get it.
So what is THE best way to learn them? What book should I refer to? What website?
It's compulsory to learn about data structures if you want to be a programmer. Data structures are your bread and butter - if you don't understand things like the behavior, uses, and run-time complexity ('big-O') of at least the basic structures (arrays, linked lists, stacks, queues, trees (binary/n-ary, self-balancing varieties), hash tables, heaps, graphs) and the algorithms that run on them (insert/locate/delete), you won't know which is appropriate to use under what circumstances.
Every trade has its tools; these are ours. Data structures are the most basic underpinnings of almost any algorithm that you're going to learn. Unless you want to be a cargo cult programmer, you need to understand how they work.
Whether or not there are interesting ways to learn about them is a separate question entirely... :)
I would go even so far as to say that most of programming revolves around manipulating data structures; it is the foundation of computing, after all: you get some data, you process it, you possibly give output. All the data usually reside in data structures, and choosing inappropriate structures will have a bigger impact the bigger the project.
As you get more experience you will find that algorithms and datastructures are invaluable to your day-to-day development, and actually pretty interesting.
By learning about them now you will learn:
Which data structure is appropriate for which context, i.e. when to use a singly-linked list, when to use a stack, when to use a queue, when to use a tree
Which algorithms are appropriate for which purpose, such as tree depth-first search or breadth-first search
Space and time complexity of algorithms, for example why quicksort is sometimes the best solution, and sometimes heapsort
Overall it'll teach you the beginnings and fundamentals of computer science, even if you never have to implement a stack again you'll know the kind of thought and consideration that goes into it. If you then ever have to implement your OWN data structure (and chances are you will, quite often), you will know what to do and what not to do.
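To make the first of those points concrete, a tiny illustrative example: choosing a structure is really choosing a set of operations. Java's ArrayDeque, for instance, serves as either a stack or a queue depending on which end you work from:

import java.util.ArrayDeque;
import java.util.Deque;

public class StructureChoice {
    public static void main(String[] args) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop());        // "b" - last in, first out

        Deque<String> queue = new ArrayDeque<>();
        queue.addLast("a");
        queue.addLast("b");
        System.out.println(queue.pollFirst());  // "a" - first in, first out
    }
}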
If you want to be a successful programmer, data structure is a must. How will you program if you don't know data structures and algorithms?
Whether you like it or not, all programming is built around data structures. You may never have to write one, but you will have to choose which one to use many times. It is not really a requirement to programming in general, but if you want to excel in the field, understanding of the basics is a must.
Anyone can build a shed without knowledge of materials or construction techniques. You can even work on a house, placing bricks and mortar under the orders of someone else, but if you want to build a house yourself, you do need to understand materials and techniques.
Data structures are the programming materials. Algorithms are the techniques. Will you use data structures? You will use the simplest ones on a daily basis; every so often you will need to solve a problem where a specific data structure is required, and while you may manage not to build your own bricks, you will need to understand whether you need bricks or a concrete wall for your purposes.
If you want some evidence for the importance of data structures, take a look at the Google hiring process. Whatever you think about Google as a company, there is no denying that they have some very good people working for them. Their interview process is set up to determine the candidate's knowledge of data structures and algorithms. Because when it comes down to it, that is what is at the core of programming, no matter what language you are working in or what domain you are programming for.
If you are planning a career as a professional programmer, you need to know the fundamentals, not just how to crank out code that "works". Otherwise, you're just playing.
Is it compulsory to learn about arithmetic to be an engineer?
If you take the attitude of "is it compulsory" with regards to any of the building blocks of programming languages, you're probably not cut out to be a coder. Regardless of "compulsory" or not, you should always be looking for new concepts to learn and seeing if it'll improve your coding style/standard.
But in answer to your question: yes.
I'd say it's compulsory at some point in your development to have a firm grasp. I'm not necessarily sure the standard Data Structures course is the best way to learn. Sometimes, the best way to learn them is "I have problem X. For some reason, it's taking my algorithm a long time to solve X. How can I make this faster?"
One book I'd highly recommend is Programming Pearls. It has some really good analyses, backed by a lot of examples of where the real-world motivations for the solutions came from. It presents the problems in an interesting manner, and never teaches by giving you a laundry list of data structures.
Yes, 99% of books on data structures are boring and their exercises contrived. They feel like they are just making up problems that serve no practical purpose :(
This book is the one exception to the rule I've come across. You will have a naive but working RPG game by the end of the book:
Data Structures for Game Programmers
Read the above book and you will solve your chicken and egg problem and see you really can't do much without data structures after all.
Well, this may sound a little awkward, but I wouldn't say it is COMPULSORY to learn data structures to be a - regular - developer. Seriously! Of course, studying them hard will give you a lot of insight and knowledge on several programming aspects, and that's always good. But compulsory... well, I think that is just too much. VERY GOOD would be enough.
Let me explain why. It is not that common, these days, to write data structure code, because - let's face it - it would be RE-writing, RE-inventing what we have already known for so many years! What I would say IS compulsory is to study the general theory of them and the APIs/libraries that are already in common use (and tested and optimized), like the Collections API in Java. You must know by heart the differences between a List and a Set (in Java, for instance) and their capabilities and proper usage, but you don't need to know exactly HOW they are implemented - checking every private method and attribute - to deal with most common, day-to-day coding problems. You will do just fine without all the "guts" of data structures for the general stuff. We face different challenges now.
But don't get me wrong - and don't think I am crazy or naive either! Of course there are situations where you will need to implement some sort of custom data structure yourself (maybe your own BalancedBinaryTreeMap!). You've got to be prepared for everything.
I am just arguing about it being compulsory or not. Again, I don't think it is compulsory, but it is indeed very good.
Cheers.
I suppose you could learn programming without learning a whole lot about data structures or algorithms. To make an analogy, imagine a carpenter who knew how to build things but didn't know about measurements and such. Would he be able to get a career in carpentry? Possibly, but say he needed to know the exact material required to complete a project. He'd probably get fired because he doesn't know what sort of material or measurements to use.
So with data structures and algorithms, you could say it's the ability to give exact measurements of an application and to know what sort of performance you'll get out of it.
Like a musician learning scales, data structures are part of the tools of the software trade. Sure you can work as a programmer without the knowledge but you're handicapping yourself. If I'm interviewing two people for a position and one of them understands and uses structures and the other can't even explain what a stack is, my choice is pretty clear.
If you want to be judged to be a competent, employable programmer, you need to learn your craft.
Is it compulsory to learn them?
No, you can program without them, just as it's not compulsory to break your code into functions.
That being said, if you want to be an effective programmer that can write at least decent code without getting your car egged by your coworkers, you want to at least be able to make a decent selection of library classes.
Every programmer should understand the tradeoff between a LinkedList and an Array, or why binary searches and binary trees are useful for sorted data. This isn't just about performance - it's about correctness too since you can't just put anything into a tree set.
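A minimal illustration of that correctness point: a TreeSet keeps its elements sorted, so they must be mutually comparable (or you must supply a Comparator), and violating that fails at runtime, not at compile time.

import java.util.TreeSet;

public class TreeSetPitfall {
    static class Point { int x, y; } // does not implement Comparable

    public static void main(String[] args) {
        TreeSet<Point> points = new TreeSet<>();
        points.add(new Point()); // throws ClassCastException on current JDKs
    }
}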
Does it mean that you need to know how to implement your own AVL tree, build super-smart data structures, etc.? Not necessarily. It's a matter of how much you want to know what's going on "beneath the hood" and whether your tasks necessitate it.
I'm not a big fan of deep data structure and algorithms questions in interviews because the vast majority of developers don't need to implement these things, just to use library stuff. I prefer to ask job related questions in interviews. However, accept that if you don't learn those things you would face a tougher battle to get other jobs.
It should be, yes...
Nobody should force you to learn anything you don't want to learn.
If you're the type of person that is compelled to be the best that he/she can be at what he/she does, and you love what you do for a living, you'll learn everything there is to know on your own accord.
@happysoul: You should ask yourself WHY learning data structures bores you. Also, it would help if you identified what DOESN'T bore you.
If you at least love to learn about algorithms, I'm sure we can all suggest a perfect marriage of the two that would be exciting to learn!
My recommendation for best algorithm/data structure combo for the most fun learning experience: graphs.
I am trying to make a Java application thread-safe. Unfortunately, it was originally designed for a single-user mode and all the key classes are instantiated as singletons. To make matters worse, there is a whole bunch of interfaces working as constants containers and numerous static fields.
What would be considered as a good practice in this case?
There is a single entry point, so I could synchronize that and just use pooling (sort of), but if the calls take more than a minute on average, all other threads in the queue would have to wait for a long time...
Since the test code coverage is not quite optimal and I can't be sure if I overlooked something, some hints about bad implementation patterns (similar to those stated above) in this area would be useful.
I know the best thing would be to rewrite the whole structure, but this isn't an option.
It doesn't sound like there is a quick fix for this. You should probably start by refactoring the existing code to use good design patterns, with an eye for multi-threading it in the future. Implement the multi-threading as a later step, after you've cleaned it up.
@coldphusion, you'll have to read/analyze code. Using an automated tool, if such a tool exists, would be like shooting yourself in the foot.
Plus, not everything has to be thread-safe. If an object will never be accessed from multiple threads, no need to make it thread-safe. If an object is immutable, then it's already thread-safe.
Be ready to tell your boss "It won't take a few hours or a day, even you know it, so stop asking."
I recommend reading Java Concurrency In Practice.
As Jonathan mentions, it doesn't sound like there's a quick fix.
You could consider using ThreadLocal in order to provide a dedicated per-thread singleton. Obviously this may or may not be possible depending on the state stored within the singletons, whether it has to be shared/maintained, etc.
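A minimal sketch of that idea (the class and method names are made up for illustration): each thread that calls getInstance() lazily gets its own private instance, so the singleton's mutable state is never shared.

public final class PerThreadService {
    private static final ThreadLocal<PerThreadService> INSTANCE =
            ThreadLocal.withInitial(PerThreadService::new);

    private final StringBuilder state = new StringBuilder(); // mutable, but never shared

    private PerThreadService() { }

    public static PerThreadService getInstance() {
        return INSTANCE.get(); // each thread sees only its own instance
    }

    public void append(String s) {
        state.append(s); // safe without locks: confined to one thread
    }
}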
I will add to @nevermind's advice, since he/she made some very practical points.
Be practical about what you need to change to accomplish your task since there is no magic way. Your existing code, well designed or not, may only need small changes depending on how it is used. Of course this also means a complete redesign may also be in order.
There is no way for anyone here to know (unless they wrote the original code ;-)
For example, if you only need to make access to a single object (singleton or not) threadsafe, this is fairly easily accomplished, possibly without any coding impacts on the caller of such an object.
On the other hand, if you need to modify multiple objects at once to keep the integrity of your data/state, then your efforts will be considerably harder.
Singletons are not a bad thing and do not go against thread-safety, as long as they don't store any state. Just look at any J2EE app; lots of singletons, without any state (only references to other stateless singletons). All state is stored in sessions; you could maybe mimic that, but as others have said, there is no way to automagically transform your app; you will have to make some good analysis to determine how you will refactor it to separate all stateless beans from the stateful ones, maybe encapsulate state in some value objects, etc.
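To illustrate the stateless-vs-stateful distinction with a hypothetical example: the singleton below is trivially thread-safe because it works only on its parameters and locals; uncommenting the field would silently introduce a data race.

public final class TaxCalculator {
    public static final TaxCalculator INSTANCE = new TaxCalculator();

    private TaxCalculator() { }

    // Thread-safe: no fields are read or written.
    public long taxOn(long amountCents, int ratePercent) {
        return amountCents * ratePercent / 100;
    }

    // NOT thread-safe if added: a shared mutable field makes every
    // caller a participant in a data race.
    // private long lastResult;
}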
In case anyone else is interested in the topic - I found a pretty detailed tutorial on "what (not) to do", with common mistakes and best practices.
Unfortunately, it's only available in German at the moment :|
I am planning to attend a one-week course on this subject. I am primarily involved in Java projects and have decent knowledge of C and C++ too. I am interested in learning more about concurrent programming and would like to get feedback on this course. Has anyone read the book or found these concepts relevant in contemporary programming?
More information on the course:
http://www.amazon.com/Art-Multiprocessor-Programming-Maurice-Herlihy/dp/0123705916/
I would definitely suggest you go with it. But I would like to add another really important resource, specific to Java - as you labeled the question 'java' - which is Java Concurrency in Practice.
The concepts are very relevant.
I seem to recall I had a very quick "flick" through this book at some point. It covers some quite interesting material. But one slight thing that concerned me, as I recall, is that it presents various algorithm implementations that rely on access to volatile arrays, assuming that the individual elements have volatile access semantics. As far as I'm aware, the Java Memory Model doesn't offer this guarantee, so the implementations given may need some modification.
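For anyone unfamiliar with that pitfall: declaring an array volatile only makes the reference volatile, not its elements. On the JVM, per-element volatile semantics come from the java.util.concurrent.atomic array classes, e.g.:

import java.util.concurrent.atomic.AtomicIntegerArray;

public class VolatileArrayElements {
    // 'volatile' applies to the reference only: data[0] = 1 is a plain
    // write with no cross-thread visibility guarantee for the element.
    static volatile int[] data = new int[16];

    // Per-element volatile semantics:
    static AtomicIntegerArray safe = new AtomicIntegerArray(16);

    public static void main(String[] args) {
        data[0] = 1;                      // plain element write
        safe.set(0, 1);                   // volatile write
        System.out.println(safe.get(0));  // volatile read
    }
}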