I just had a discussion with a colleague who asked me why I would do a static HTTP request like this:
HttpClient.doGet(HashMap<String,String> Parameters);
instead of invoking an object of the class via default constructor and use a nonstatic method like this:
new HttpClient().doGet(HashMap<String,String> Parameters)
Assuming that the implementation of doGet uses only the parameters of the function and no member variables, would the static implementation be problematic in any way, e.g. with regard to thread safety?
It depends on what you mean by problematic, but going off just your given example, the answer is no, the static method call is not problematic, and is arguably better, since no object needs to be instantiated.
You mentioned thread safety, so I will touch on that. You only need to be concerned with thread safety if there is "mutable shared state" involved, mutable being the key word here. For example, if multiple threads were sharing the same instance of HttpClient, and that HttpClient was keeping track of some state by mutating one or more of its member variables, then that definitely has the potential to be problematic.
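To make the distinction concrete, here is a minimal sketch. The class StaticCallSketch, the helper buildQuery and the counter requestCount are hypothetical names invented for illustration (they are not part of any real HttpClient), and the query string is built without URL encoding to keep the example short.

```java
import java.util.Map;
import java.util.TreeMap;

public class StaticCallSketch {

    // Stateless: uses only its argument and local variables, so concurrent
    // callers cannot interfere with each other.
    static String buildQuery(Map<String, String> parameters) {
        StringBuilder sb = new StringBuilder(); // local to this call
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Mutable shared state: this is the kind of field that would need
    // synchronization (or an atomic type) once several threads call record().
    static int requestCount = 0;

    static void record() {
        requestCount++; // read-modify-write, not atomic
    }

    public static void main(String[] args) {
        Map<String, String> params = new TreeMap<>();
        params.put("q", "java");
        params.put("page", "1");
        System.out.println(buildQuery(params)); // page=1&q=java (TreeMap sorts keys)
    }
}
```

Whether the method is static or not makes no difference here; what matters is that buildQuery touches nothing outside its own stack frame, while record() does.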
... but also, every HTTP request has to go out on a network, to a physical computer someplace else, then to return, "at least many milliseconds later." So, there's really no point in "multi-threading" that chore. A single thread can be given the responsibility for sending out parallel I/O-requests to the remote hosts, receiving requests from the rest of your code by means of some thread-safe queue and returning the responses in like manner on another queue (or, queues).
It is wasteful to associate "a thread" with "a request." A very small pool of workers can be consuming the responses that come off of that reply-queue.
(And of course, there are plenty of existing Java open-source frameworks that implement all of this very-familiar plumbing for you.)
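As a rough sketch of the queue-based plumbing described above: a small fixed pool takes requests and posts results on a thread-safe reply queue. RequestQueueSketch, fetchAll and fakeFetch are hypothetical names, and fakeFetch is only a stand-in for a real HTTP call; a production setup would more likely use non-blocking I/O or one of the existing frameworks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class RequestQueueSketch {

    // Hypothetical stand-in for a real HTTP request.
    static String fakeFetch(String url) {
        return "response for " + url;
    }

    // Sends every URL through a small worker pool and collects the replies
    // from a thread-safe queue. Replies are sorted so the result is stable.
    static List<String> fetchAll(List<String> urls, int workers) {
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (String url : urls) {
            pool.execute(() -> replies.add(fakeFetch(url)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        List<String> out = new ArrayList<>();
        replies.drainTo(out);
        Collections.sort(out);
        return out;
    }
}
```

The number of workers is independent of the number of requests, which is the point being made: two or three threads can serve any number of queued requests.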
I have an Actor that - in its very essence - maintains a list of objects. It has three basic operations, an add, update and a remove (where sometimes the remove is called from the add method, but that aside), and works with a single collection. Obviously, that backing list is accessed concurrently, with add and remove calls interleaving each other constantly.
My first version used a ListBuffer, but I read somewhere it's not meant for concurrent access. I haven't gotten concurrent access exceptions, but I did note that finding & removing objects from it does not always work, possibly due to concurrency.
I was halfway rewriting it to use a var List, but removing items from Scala's default immutable List is a bit of a pain - and I doubt it's suitable for concurrent access.
So, basic question: What collection type should I use in a concurrent access situation, and how is it used?
(Perhaps secondary: Is an Actor actually a multithreaded entity, or is that just my wrong conception and does it process messages one at a time in a single thread?)
(Tertiary: In Scala, what collection type is best for inserts and random access (delete / update)?)
Edit: To the kind responders: Excuse my late reply, I'm making a nasty habit out of dumping a question on SO or mailing lists, then moving on to the next problem, forgetting the original one for the moment.
Take a look at the scala.collection.mutable.Synchronized* traits/classes.
The idea is that you mixin the Synchronized traits into regular mutable collections to get synchronized versions of them.
For example:
import scala.collection.mutable._
val syncSet = new HashSet[Int] with SynchronizedSet[Int]
val syncArray = new ArrayBuffer[Int] with SynchronizedBuffer[Int]
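For comparison, the same idea is available from the Java standard library, which Scala code can also use directly. The sketch below is an illustration only; SyncCollectionsSketch and fill are hypothetical names. As far as I know, later Scala releases deprecated the Synchronized* traits in favour of the java.util.concurrent collections, so check the documentation for your Scala version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SyncCollectionsSketch {

    // Adds the values 0..perThread-1 from several threads at once and
    // returns { list size, set size }.
    static int[] fill(int threads, int perThread) {
        // Wrapper that synchronizes every call on the underlying list.
        List<Integer> syncList = Collections.synchronizedList(new ArrayList<>());
        // Concurrent set backed by a ConcurrentHashMap.
        Set<Integer> concurrentSet = ConcurrentHashMap.newKeySet();

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    syncList.add(i);
                    concurrentSet.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { syncList.size(), concurrentSet.size() };
    }
}
```

Note that a synchronized wrapper only makes individual calls atomic. A compound action such as "find, then remove" still needs its own lock around both steps.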
You don't need to synchronize the state of the actors. The aim of the actors is to avoid tricky, error prone and hard to debug concurrent programming.
The actor model ensures that an actor consumes its messages one at a time, and that you will never have two threads processing messages for the same actor simultaneously.
Scala's immutable collections are suitable for concurrent usage.
As for actors, a couple of things are guaranteed, as explained in the Akka documentation:
the actor send rule: where the send of the message to an actor happens before the receive of the same actor.
the actor subsequent processing rule: where processing of one message happens before processing of the next message by the same actor.
You are not guaranteed that the same thread processes the next message, but you are guaranteed that the current message will finish processing before the next one starts, and also that at any given time, only one thread is executing the receive method.
So that takes care of a given Actor's persistent state. With regard to shared data, the best approach as I understand it is to use immutable data structures and lean on the Actor model as much as possible. That is, "do not communicate by sharing memory; share memory by communicating."
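The "one message at a time" guarantee can be imitated in plain Java to see why the backing list needs no lock. This is only a sketch of the idea, not how Akka or Scala actors are implemented: MailboxSketch is a hypothetical class that uses a single-thread executor as its mailbox, so its list is only ever touched by that one thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MailboxSketch {

    // Tasks submitted here run strictly one after another.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // Plain, unsynchronized list: only the mailbox thread reads or writes it.
    private final List<String> items = new ArrayList<>();

    public void add(String item) {
        mailbox.execute(() -> items.add(item));
    }

    public void remove(String item) {
        mailbox.execute(() -> items.remove(item));
    }

    // Asks the mailbox thread for a copy, so callers never see the live list.
    public List<String> snapshot() {
        Callable<List<String>> copy = () -> new ArrayList<String>(items);
        try {
            return mailbox.submit(copy).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public void shutdown() {
        mailbox.shutdown();
    }
}
```

Callers on any thread may invoke add and remove; the messages are queued and applied in order, which mirrors the subsequent processing rule quoted above.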
What collection type should I use in a concurrent access situation, and how is it used?
See @hbatista's answer.
Is an Actor actually a multithreaded entity, or is that just my wrong conception and does it process messages one at a time in a single thread
The second (though the thread on which messages are processed may change, so don't store anything in thread-local data). That's how the actor can maintain invariants on its state.
If a server handles multiple requests by running a single servlet, where do we need to take care of synchronization?
From "How does a single servlet handle multiple requests from client side" I learned how multiple requests are handled. But that raises another question: why do we need synchronization if all requests are handled separately?
Can you give some real life example how a shared state works and how a servlet can be dependent? I am not much interested in code but looking for explanation with example of any portal application? Like if there is any login page how it is accessed by n number of users concurrently.
If more than one request is handled by the server (from what I read, the server creates a pool of n threads to serve the requests, and I guess each thread has its own set of parameters to maintain the session), is there any chance that two or more threads (that is, two or more requests) can collide with each other?
Synchronization is required when multiple threads are modifying a shared resource.
So, when all your servlets are independent of each other, you don't worry about the fact that they run in parallel.
But, if they work on "shared state" somehow (for example by reading/writing values into some sort of centralized data store); then you have to make sure that things don't go wrong. Of course: the layer/form how to provide the necessary synchronization to your application depends on your exact setup.
Yes, my answer is pretty generic; but so is your question.
Synchronization in Java is only needed if the shared object is mutable. If your shared object is read-only or immutable, you don't need synchronization, even with multiple threads running. The same holds for what the threads do with an object: if all threads only read its value, you don't need synchronization.
Basically, if your servlet application is multi-threaded, then data associated with the servlet will not be thread safe. The common example given in many textbooks is something like a hit counter, stored as a private variable:
e.g
public class YourServlet implements Servlet {
    private int counter;

    public void service(ServletRequest req, ServletResponse res) {
        // This is not thread safe: counter++ is a read-modify-write on shared state.
        counter++;
    }
    // (remaining Servlet methods omitted)
}
This is because the servlet's service method is invoked by multiple threads as HTTP requests come in. The increment operator first has to read the current value, add one, and then write the value back. Another thread doing the same operation concurrently may increment the value after the first thread has read it but before it has written it back, resulting in a lost update.
So in this case you should use synchronisation, or even better, the AtomicInteger class included as part of Java Concurrency from 1.5 onwards.
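A minimal sketch of the AtomicInteger variant follows. To stay self-contained it leaves out the servlet API; HitCounter, hit and demo are hypothetical names, and in a real servlet the hit() call would sit inside service().

```java
import java.util.concurrent.atomic.AtomicInteger;

public class HitCounter {

    private final AtomicInteger counter = new AtomicInteger();

    // Atomic read-modify-write: no update can be lost between threads.
    public int hit() {
        return counter.incrementAndGet();
    }

    public int current() {
        return counter.get();
    }

    // Simulates concurrent requests: each thread calls hit() perThread times.
    static int demo(int threads, int perThread) {
        HitCounter c = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.hit();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.current();
    }
}
```

With a plain int field and counter++, the same demo could return less than threads * perThread; with AtomicInteger the total is always exact.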
I guess JAXB calls the zero-arg constructor and then starts filling the non volatile fields and adds stuff to the lists.
In my own code: Immediately after doing this (the unmarshalling) the generated beans get deported to some worker threads over some add method, but not through the constructor or any other way that would trigger the memory model to flush and refetch the data to and from shared area.
Is this safe? Or does JAXB do some magic trick behind the scenes? I can't think of any way in the java programming language that could enforce everything being visible for all threads. Does the user of JAXB generated beans have to worry about fields maybe not being visibly set in a concurrent setup?
Edit: Why are there so many downvotes? Nobody was yet able to explain how JAXB ensures this seemingly impossible task.
I won't bother to investigate the various "facts" in your question, I'll just paraphrase:
"Without references it ain't true!"
That said, anyone dealing with threads in Java these days will have to actually try to avoid establishing happens-before and happens-after relationships inadvertently. Any use of a volatile variable, a synchronized block, a Lock object or an atomic variable is bound to establish such a relationship. That immediately pulls in blocking queues, synchronized hash maps and a whole lot of other bits and pieces.
How are you so certain that the JAXB implementation actually manages to do the wrong thing?
That said, while objects obtained from JAXB are about as safe as any Java object once JAXB is done with them, the marshalling/unmarshalling methods themselves are not thread-safe. I believe that you do not have to worry unless:
Your threads share JAXB handler objects.
You are passing objects between your threads without synchronization: A decidedly unhealthy practice, regardless of where those objects came from...
EDIT:
Now that you have edited your question we can provide a more concrete answer:
JAXB-generated objects are as thread-safe as any other Java object, which is not at all. A direct constructor call offers no thread-safety on its own either. Without an established happens-before relationship, another thread that obtains the reference may observe a partially initialized object, even though new has already returned in the creating thread.
There are ways, namely via the use of final fields and immutable objects, to avoid this pitfall, but it is quite hard to get right, especially with JAXB, and it does not actually solve the issue of propagating the correct object reference so that all threads are looking at the same object.
Bottom line: it is up to you to move data among your threads safely, via the use of proper synchronization methods. Do not assume anything about the underlying implementation, except for what is clearly documented. Even then, it's better to play it safe and code defensively - it usually results in more clear interactions between the threads anyway. If at a later stage a profiler indicates a performance issue, then you should start thinking about fine-tuning your synchronization code.
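One common way to do that hand-off is a BlockingQueue: the java.util.concurrent documentation states that actions taken before placing an object into a concurrent collection happen-before actions following its removal in another thread. The sketch below assumes this pattern fits your setup; HandOffSketch and Bean are hypothetical names, with Bean standing in for a JAXB-generated class that has plain, non-volatile fields.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HandOffSketch {

    // Hypothetical stand-in for an unmarshalled bean: mutable, non-volatile fields.
    static class Bean {
        String name;
        int value;
    }

    // Fills a bean on the calling thread, hands it to a worker through a
    // BlockingQueue, and returns what the worker saw.
    static String handOff(String name, int value) {
        BlockingQueue<Bean> queue = new ArrayBlockingQueue<>(1);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<String> seen = worker.submit(() -> {
                Bean b = queue.take(); // blocks until the bean is published
                return b.name + "=" + b.value;
            });
            Bean bean = new Bean();
            bean.name = name;
            bean.value = value;
            queue.put(bean); // writes above are visible to the thread that takes it
            return seen.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdownNow();
        }
    }
}
```

Submitting the bean to an ExecutorService gives the same guarantee, so an "add method" that internally uses a concurrent queue or an executor is already a safe publication point.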
I'm implementing a service that does REST calls for multiple applications. The results of certain REST calls should be stored in a content provider.
I'm currently trying to use multiple threads that would each do the HTTP request, parse the result, and store the data in a content provider. In order to do this, I must pass the Context to each of the threads. I'm not sure this is a good idea, because I do not know whether it is OK to pass the Context to multiple threads given its size, thread safety, etc. I'm thinking that I'm only passing a reference to the Context object to each thread, so maybe it's not heavy to pass around?
Yes, this is fine. I don't believe that explicit synchronization is required, but many of the interesting things you can do with a Context must happen on the UI thread.
For this reason it is usually wise to do your HTTP request inside an AsyncTask, which will arrange to have your implementation of onPreExecute and onPostExecute run on the UI thread, as well as provide a nice interface for cancellation.
Java passes object references (the reference itself is copied by value), so handing the Context to each thread copies only a reference; it's not "heavyweight".
However, you'll need to be careful that your access to members of Context is synchronized appropriately or else you will have thread safety issues.
I'm using ThreadLocal variables (through Clojure's vars, but the following is the same for plain ThreadLocals in Java) and very often run into the issue that I can't be sure that a certain code path will be taken on the same thread or on another thread. For code under my control this is obviously not too big a problem, but for polymorphic third party code there's sometimes not even a way to statically determine whether it's safe to assume single threaded execution.
I tend to think this is an inherent issue with ThreadLocals, but I'd like to hear some advice on how to use them in a safe way.
Then don't use ThreadLocals! They are specifically for when you want a variable that's associated with a Thread, as if there were a Map<Thread,T>.
The typical use case (as far as I know) for a ThreadLocal is in a web application framework. An HTTP filter obtains a database connection on an incoming request, and stores the connection in a static ThreadLocal. All subsequent controllers needing the connection can easily obtain it from the framework using a static call. When the response is returned, the same filter releases the connection again.
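The filter pattern described above can be sketched as follows. ConnectionHolderSketch and its methods are hypothetical names, and a String stands in for a real database connection. The second method shows the pitfall from the question: code that ends up on a different thread does not see the value.

```java
public class ConnectionHolderSketch {

    // One slot per thread; a String stands in for a real connection object.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // What a controller would call to get "its" connection.
    static String current() {
        return CURRENT.get();
    }

    // Hypothetical controller that relies on the filter having set the value.
    static String controller() {
        return "using " + current();
    }

    // Plays the role of the filter: bind, run the request, always unbind.
    static String handleRequest(String connection) {
        CURRENT.set(connection);
        try {
            return controller();
        } finally {
            CURRENT.remove(); // avoid leaking the value to the next request on a pooled thread
        }
    }

    // Binds a value on the calling thread, then reads it from a new thread.
    static String readFromOtherThread(String connection) {
        CURRENT.set(connection);
        try {
            String[] seen = new String[1];
            Thread t = new Thread(() -> seen[0] = String.valueOf(current()));
            t.start();
            t.join();
            return seen[0]; // "null": the other thread has its own, empty slot
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            CURRENT.remove();
        }
    }
}
```

The pattern is safe only when the framework guarantees that the whole request runs on one thread. When third-party code may hop threads, passing the value explicitly as a parameter avoids the problem altogether.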