Not seeing full data in the StringBuffer in multi-threaded logging - java

I am using a ThreadLocal StringBuffer to store the entire log so that I can send it to Kibana at the end of each test case execution. But when I run cases in parallel, the log is losing some entries.
I write each log entry to a file before appending it to the StringBuffer, and I see all entries in the log file but not in the StringBuffer.
Any ideas are much appreciated.
I tried StringBuilder, but no luck. I also tried making the method synchronized, still no luck.
public abstract class Messages {
static ThreadLocal<StringBuffer> msg;
public static void init() {
msg = new ThreadLocal<StringBuffer>() {
@Override
protected StringBuffer initialValue() {
return new StringBuffer();
}
};
}
public static void addMsg(String text) {
msg.get().append(text + "\r\n");
System.out.println(text);
}
}
public class CallingClass {
public void callingMethod(String threadName){
Messages.init();
Messages.addMsg("Hi");
Messages.addMsg("This");
Messages.addMsg("Is");
Messages.addMsg("Testing");
Messages.addMsg("For");
Messages.addMsg("Multi");
Messages.addMsg("Thread");
Messages.addMsg("UI");
Messages.addMsg(threadName + "!!");
}
}
From my Cucumber tests, each thread calls the callingMethod method above.
I am running 10 parallel threads, and when I print msg at the end from all 10 threads, the results differ: for some threads the first few entries are missing.
Making addMsg synchronized did not help. In single-threaded execution the log is correct, and when I step through with the Eclipse debugger the log also comes out as expected.

Multiple threads call Messages.init(), and every call assigns a new ThreadLocal to the shared static field msg.
The first call replaces the initial null value of msg with a new ThreadLocal.
The next call replaces that with another new ThreadLocal, effectively discarding every StringBuffer created through the previous one, including entries that other threads have already appended.
The same happens on every later call.
There's no point in that init method. Get rid of it (and all calls to it) and just define msg like this:
static final ThreadLocal<StringBuffer> msg = new ThreadLocal<StringBuffer>() {
@Override
protected StringBuffer initialValue() {
return new StringBuffer();
}
};
The whole point of ThreadLocal is that you only need one instance of it to give each thread their own StringBuffer.
Also, if all access to the content is via those methods, then the synchronization/thread safety of StringBuffer isn't needed and you should use a StringBuilder instead.
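As a rough sketch of that advice (not the asker's actual code): the class below keeps one shared ThreadLocal created with ThreadLocal.withInitial and hands each thread its own StringBuilder. The names ThreadLogs, drain and runDemo are made up for illustration, and the remove() call in drain assumes the test runner may reuse threads between test cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ThreadLogs {
    // One shared ThreadLocal; each thread lazily gets its own StringBuilder.
    private static final ThreadLocal<StringBuilder> LOG =
            ThreadLocal.withInitial(StringBuilder::new);

    private ThreadLogs() {}

    static void addMsg(String text) {
        LOG.get().append(text).append("\r\n");
    }

    // Returns the calling thread's log and clears it, so a reused
    // (pooled) thread starts the next test case with an empty buffer.
    static String drain() {
        String result = LOG.get().toString();
        LOG.remove();
        return result;
    }

    // Hypothetical demo: start `threads` workers and collect each log by thread name.
    static Map<String, String> runDemo(int threads) {
        Map<String, String> logs = new ConcurrentHashMap<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            String name = "worker-" + i;
            Thread worker = new Thread(() -> {
                addMsg("Hi");
                addMsg("Testing");
                addMsg(name + "!!");
                logs.put(name, drain());
            }, name);
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return logs;
    }
}
```

Because the ThreadLocal is created exactly once, no thread can throw away another thread's buffer, and no synchronization is needed on the StringBuilder itself.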

Related

Synchronization block inside callback method

I'm using Apache MINA in one of my projects. The doDecode() method of CumulativeProtocolDecoder is called every time a chunk of data is received. I concatenate these chunks until I get a special character at the end of the string: I start concatenating when I receive $ as the first character and stop when I receive another $ character.
I want to make the concatenation part synchronized to avoid any unintended concatenations.
By wrapping the concatenation block in a synchronized() clause I can make this operation thread safe. My question is: while one thread is busy concatenating and another thread calls doDecode() with new data, will the data passed as an argument to doDecode() be lost because the synchronized block is busy, or will the call wait and keep its argument until the synchronized block is available again?
@Override
protected boolean doDecode(IoSession ioSession, IoBuffer ioBuffer, ProtocolDecoderOutput protocolDecoderOutput) throws Exception {
System.out.println("inside decoder");
try {
IoBuffer data = (IoBuffer) ioBuffer;
// create a byte array to hold the bytes
byte[] buf = new byte[data.limit()];
System.out.println("RESPONSE LENGTH: " + data.limit());
// pull the bytes out
data.get(buf);
// look at the message as a string
String messageString = new String(buf);
synchronized (messageString) {
// do concatenation operations and other stuff
}
}
catch (Exception e) {
e.printStackTrace();
}
return true;
}
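Synchronizing on a local variable won't do anything useful: each call creates its own messageString, so no two threads ever contend for the same lock. You can safely remove that block.
Every thread calling doDecode has its own copy of the method arguments, so you can be sure that no argument will be changed or lost in the meantime. A thread that finds a synchronized block busy simply waits, with its arguments intact, until the lock is free.
I'm guessing that concatenating these chunks means storing them in some member field of your decoder class.
In that case, you probably want to synchronize on a field, e.g.: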
private final Object lock = new Object();
@Override
protected boolean doDecode(IoSession ioSession, IoBuffer ioBuffer, ProtocolDecoderOutput protocolDecoderOutput) throws Exception {
// ...
synchronized (this.lock) {
// do concatenation operations and other stuff
}
// ...
}
I'm just not sure whether it's good practice to synchronize within a framework component that is probably meant to handle requests concurrently.
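To make the field-lock idea concrete without depending on MINA, here is a minimal standalone sketch. FrameAccumulator and its append method are hypothetical names, and it assumes, as the question describes, that a frame starts with $ and that the closing $ arrives as the last character of some chunk.

```java
final class FrameAccumulator {
    private final Object lock = new Object();
    // Shared state, only touched while holding `lock`.
    private final StringBuilder pending = new StringBuilder();

    /**
     * Appends a chunk. Returns the completed frame (without the '$'
     * delimiters) once the closing '$' has arrived, or null while the
     * frame is still incomplete.
     */
    String append(String chunk) {
        synchronized (lock) {
            pending.append(chunk);
            boolean complete = pending.length() >= 2
                    && pending.charAt(0) == '$'
                    && pending.charAt(pending.length() - 1) == '$';
            if (!complete) {
                return null;
            }
            String frame = pending.substring(1, pending.length() - 1);
            pending.setLength(0); // start fresh for the next frame
            return frame;
        }
    }
}
```

A second thread calling append while the first is inside the synchronized block just blocks at the lock; its chunk argument lives on that thread's own stack and is appended once the lock is released.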

Java start multiple threads in a class

I am consuming from a certain source (say Kafka) and periodically dumping the collected messages (to, say, S3). My class definition is as follows:
public class ConsumeAndDump {
private List<String> messages;
public ConsumeAndDump(){
messages = new ArrayList<>();
// initialize required resources
}
public void consume(){
// this runs continuously and keeps consuming from the source.
while(true){
final String message = ...// consume from Kafka
messages.add(message);
}
}
public void dump(){
while(true){
final String allMessages = String.join("\n", messages);
messages.clear(); // shown here simply, but i am synchronising this to avoid race conditions
// dump to destination (file, or S3, or whatever)
TimeUnit.SECONDS.sleep(60); // sleep for a minute
}
}
public void run() {
// This is where I don't know how to proceed.
// How do I start consume() and dump() as separate threads?
// Is it even possible in Java?
// start consume() as thread
// start dump() as thread
// wait for those to finish
}
}
I want to have two threads - consume and dump. consume should run continuously whereas dump wakes up periodically, dumps the messages, clears the buffer and then goes back to sleep again.
I am having trouble starting consume() and dump() as threads. Honestly, I don't know how to do that. Can we even run member methods as threads? Or do I have to make separate Runnable classes for consume and dump? If so, how would I share messages between those?
First of all, you can't really use ArrayList for this. ArrayList is not thread-safe. Check out BlockingQueue for example. You will have to deal with things like back pressure. Don't use an unbounded queue.
Starting a thread is pretty simple; you can use method references (or lambdas) for it.
public void run() {
new Thread(this::consume).start();
new Thread(this::dump).start();
}
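This should work (provided consume() and dump() handle checked exceptions such as InterruptedException themselves, since Runnable.run() cannot throw them), but it gives you little to no control over when those threads should end.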
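Putting the two suggestions together, a sketch might look like the following. This is only one possible shape, not a definitive design: QueueingDumper, accept, dumpOnce, start and stop are made-up names, the Kafka and S3 parts are left out, and the periodic dump is handed to a ScheduledExecutorService instead of a sleep loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class QueueingDumper {
    // Bounded queue: put() blocks when it is full, which gives simple back pressure.
    private final BlockingQueue<String> messages;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    QueueingDumper(int capacity) {
        this.messages = new ArrayBlockingQueue<>(capacity);
    }

    // Called by the consuming thread for every message read from the source.
    void accept(String message) throws InterruptedException {
        messages.put(message);
    }

    // Moves everything currently queued into one batch and joins it.
    String dumpOnce() {
        List<String> batch = new ArrayList<>();
        messages.drainTo(batch);
        return String.join("\n", batch);
    }

    // Starts the consumer loop on its own thread and schedules periodic dumps.
    // Printing stands in for writing to a file, S3, or wherever.
    void start(Runnable consumeLoop, long periodSeconds) {
        Thread consumer = new Thread(consumeLoop, "consumer");
        consumer.setDaemon(true);
        consumer.start();
        scheduler.scheduleAtFixedRate(
                () -> System.out.println(dumpOnce()),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

The stop method is what the bare new Thread(...).start() version lacks: it gives the caller a way to end the periodic dump.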

Java thread stuck after join

I have this Transmitter class, which contains one BufferedReader and one PrintWriter. The idea is for the main class to call Transmitter.receive() and Transmitter.transmit() on the main socket. The problem is:
public void receive() throws Exception {
// Reads from the socket
Thread listener = new Thread(new Runnable() {
public void run() {
String res;
try {
while((res = input.readLine()) != null) {
System.out.println("message received: " + res);
outputMessage = (res);
if (res.equals("\n")) {
break;
}
}
} catch (IOException e) {
e.printStackTrace();
}
};
});
listener.start();
listener.join();
}
The thread changes the 'outputMessage' value, which I can get using an auxiliary method. The problem is that, without join, my client gets the outputMessage but I want to use it several times on my main class, like this:
trans1.receive();
while(trans1.getOutput() == null);
System.out.println("message: " + trans1.getOutput());
But with join, this System.out.println call never executes because trans1.receive() is stuck... any thoughts?
Edit 1: here is the transmitter class https://titanpad.com/puYBvlVery
You might send \n; that doesn't mean that you will see it in your Java code.
As it says in the Javadoc for BufferedReader.readLine() (emphasis mine):
(Returns) A String containing the contents of the line, not including any line-termination characters
so "\n" will never be returned.
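A small sketch that shows this behaviour, using a StringReader in place of the socket (ReadLineDemo and readAll are illustrative names only):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ReadLineDemo {
    // Collects every value readLine() returns until it signals end of stream with null.
    static List<String> readAll(String input) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

For the input "first\n\nlast\n" this yields "first", an empty string, and "last". A blank line comes back as "", so a check like line.isEmpty() is what would end the loop, not res.equals("\n").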
Doing this:
{
Thread listener = new Thread(new Runnable() {
public void run() {
doSomeWork();
};
});
listener.start();
listener.join();
}
will create a new thread and then wait for it to do its work and finish. Therefore it's more or less the same as just directly doing:
doSomeWork();
The new thread doesn't serve any real purpose here.
Also, the extra thread introduces synchronization problems because your code doesn't make sure that access to shared variables such as outputMessage is synchronized.
Thirdly, your thread keeps reading lines from the input in a loop until there's nothing more to read, and unless the other side closes the stream, it will block on the readLine() call. What you see with getOutput() will be whichever line happens to be there at the moment you look; the next time you look it might be the same line or a completely different one, and some lines will be read and overwritten immediately without the main thread ever noticing them.
You can just call input.readLine() directly in your main thread when you actually need to get a new line message from the input, you don't need an extra reader thread. You could store the read messages into a Queue as yshavit suggests, if that's desirable, e.g. for performance reasons it might be better to read the messages as soon as they are available and have them ready in memory. But if you only need to read messages one by one then you can simply call input.readLine() only when you actually need it.
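If the queue variant is what you want, a rough sketch could look like this. LineQueueReader and its methods are hypothetical names, and the capacity of 1024 is an arbitrary choice for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class LineQueueReader {
    // Bounded so that a slow main thread cannot make the queue grow forever.
    private final BlockingQueue<String> lines = new LinkedBlockingQueue<>(1024);

    // Starts a daemon thread that pushes every line it reads onto the queue.
    void start(BufferedReader input) {
        Thread reader = new Thread(() -> {
            try {
                String line;
                while ((line = input.readLine()) != null) {
                    lines.put(line); // blocks while the queue is full
                }
            } catch (IOException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "line-reader");
        reader.setDaemon(true);
        reader.start();
    }

    // Waits up to the timeout for the next line; returns null if none arrived.
    String nextLine(long timeout, TimeUnit unit) throws InterruptedException {
        return lines.poll(timeout, unit);
    }
}
```

Unlike a single outputMessage field, the queue hands over every line exactly once and in order, and the main thread no longer needs a busy-wait loop.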

How to use Multithreading to effectively

I want to do a task that I've already completed except this time using multithreading. I have to read a lot of data from a file (line by line), grab some information from each line, and then add it to a Map. The file is over a million lines long so I thought it may benefit from multithreading.
I'm not sure about my approach here since I have never used multithreading in Java before.
I want to have the main method do the reading and then give each line that has been read to another thread, which will format a String and then pass it to another thread to put into a map.
public static void main(String[] args)
{
//Some information read from file
BufferedReader br = null;
String line = "";
try {
br = new BufferedReader(new FileReader("somefile.txt"));
while((line = br.readLine()) != null) {
// Pass line to another task
}
// Here I want to get a total from B, but I'm not sure how to go about doing that
} catch (IOException e) {
e.printStackTrace();
}
}
public class Parser extends Thread
{
private Mapper m1;
// Some reference to B
public Parser(Mapper m) {
m1 = m;
}
public void parse(String s, int i) {
// Do some work on s
String key = DoSomethingWithString(s);
m1.add(key, i);
}
}
public class Mapper extends Thread
{
private SortedMap<String, Integer> sm;
private String key;
private int value;
boolean hasNewItem;
public Mapper() {
sm = new TreeMap<String, Integer>();
hasNewItem = false;
}
public void add(String s, int i) {
hasNewItem = true;
key = s;
value = i;
}
public void run() {
while (!Thread.currentThread().isInterrupted()) {
try {
if (hasNewItem) {
// Find if street name exists in map
sm.put(key, value);
hasNewItem = false;
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
// I'm not sure how to give the Map back to main.
}
}
I'm not sure if I am taking the right approach. I also do not know how to terminate the Mapper thread and retrieve the map in the main. I will have multiple Mapper threads but I have only instantiated one in the code above.
I also just realized that my Parser class is not really a thread, just another class, as long as it does not override the run() method, so I am thinking that the Parser class should be some sort of queue.
Any ideas? Thanks.
EDIT:
Thanks for all of the replies. It seems that since I/O will be the major bottleneck, there would be little efficiency benefit from parallelizing this. However, for demonstration purposes, am I on the right track? I'm still a bit bothered by not knowing how to use multithreading.
Why do you need multiple threads? You only have one disk and it can only go so fast. Multithreading almost certainly won't help in this case, and if it does, the gain will be minimal from a user's perspective. Multithreading isn't your problem. Reading from a huge file is your bottleneck.
Frequently I/O will take much longer than the in-memory tasks. We refer to such work as I/O-bound. Parallelism may yield a marginal improvement at best, and can actually make things worse.
You certainly don't need a different thread to put something into a map. Unless your parsing is unusually expensive, you don't need a different thread for it either.
If you had other threads for these tasks, they might spend most of their time sitting around waiting for the next line to be read.
Even parallelizing the I/O won't necessarily help, and may hurt. Even if your CPUs support parallel threads, your hard drive might not support parallel reads.
EDIT:
All of us who commented on this assumed the task was probably I/O-bound -- because that's frequently true. However, from the comments below, this case turned out to be an exception. A better answer would have included the fourth comment below:
Measure the time it takes to read all the lines in the file without processing them. Compare to the time it takes to both read and process them. That will give you a loose upper bound on how much time you could save. This may be decreased by a new cost for thread synchronization.
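A minimal way to take that measurement might look like this. ReadTiming and timeLines are illustrative names, and a single run with System.nanoTime is only a coarse estimate, since JIT warm-up and the operating system's file cache can skew it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

final class ReadTiming {
    // Reads every line, hands it to `work`, and returns the elapsed nanoseconds.
    static long timeLines(Path file, Consumer<String> work) {
        long start = System.nanoTime();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                work.accept(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return System.nanoTime() - start;
    }
}
```

Call it once with a no-op consumer (line -> {}) and once with the real parsing step; the difference between the two results is roughly the time that parallelizing the processing could win back.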
You may wish to read about Amdahl's Law. Since the majority of your work is strictly serial (the I/O), you will get negligible improvements by multi-threading the remainder. Certainly not worth the cost of creating watertight multi-threaded code.
Perhaps you should look for a new toy-example to parallelise.
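To put a number on the Amdahl's Law argument, here is the formula as a tiny helper. The 20% figure used below is an assumed example, not a measurement of the asker's program.

```java
final class Amdahl {
    private Amdahl() {}

    // Amdahl's Law: speedup = 1 / ((1 - p) + p / n),
    // where p is the parallelizable fraction of the work and n the thread count.
    static double speedup(double parallelFraction, int threads) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / threads);
    }
}
```

If only 20% of the run time can be parallelized, Amdahl.speedup(0.2, 4) is about 1.18, and no number of threads can push it past 1 / 0.8 = 1.25.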

Java multithreading - when multiple threads access the print method, why is the execution of the while loop synchronized by default?

I have a question about multithreading.
The following is my main program; I am creating 10 threads that all run on the same object.
public class CallTest {
public static void main(String[] args) throws Exception {
Test t = new Test();
for (int i = 0; i < 10; i++) {
Thread t1 = new Thread(t);
t1.start();
}
}
}
The following is my program to read data from a file.
public class Test implements Runnable {
static int i;
public void run() {
try {
i++;
System.out.println("####Count" + i);
print();
} catch (Exception e) {}
}
public void print() {
try {
StringBuilder bufData = new StringBuilder();
File fileTest = new File("D:\\Work\\i466477");
BufferedReader bufferedReader1 = new BufferedReader(new FileReader(
fileTest));
String strRecord = new String();
while ((strRecord = bufferedReader1.readLine()) != null) {
bufData.append(strRecord);
bufData.append("\r");
bufData.append("\n");
}
bufferedReader1.close();
System.out.println("########");
System.out.println(bufData);
} catch (Exception exe) {
System.out.println(exe);
}
}
}
Here the code in the while loop appears to be synchronized by default. Is that because BufferedReader is thread-safe, or because each thread has its own copy of StringBuilder and BufferedReader? I can see that the contents are read and written properly.
No, that code won't be synchronized by default. Several threads could each be in the while loop at the same time. "Synchronized" isn't the same as "working without any problems" - did you think it was synchronized just because you didn't have any issues? In Java, synchronized is about only allowing one thread to execute certain critical pieces of code at a time in relation to a particular monitor.
Note that your access to i in the run method is unsafe, by the way. You should also close the BufferedReader in a finally block, and avoid catching Exception. Finally, your assignment of new String() to strRecord to start with is pointless. Hopefully these are just errors due to it being test code, but it's worth being aware of them.
Actually, System.out.println is synchronized. Try this again without those.
Each thread has its own StringBuilder, BufferedReader and FileReader (and operating system level file descriptor) so there won't be any interference at that level. (None of these classes is thread-safe, but the instances are thread-confined so that doesn't matter.)
When you are writing, note that System.out is a PrintStream, and in the standard implementation its print(...) and println(...) methods are synchronized. That explains why you don't see output from individual println calls mixed together. (PrintStream is thread-safe ... and needs to be.)
Note: if you changed your code to include the thread number in each println'ed string, you might occasionally see the output appearing in an unexpected order. Separate calls to a thread-safe method on the same object (the PrintStream) don't necessarily occur in "first come, first served" order.
The code that updates the static variable i is not thread-safe, and might give you unexpected (incorrect) results every now and then ... depending on what hardware / JVM you use. You should either do the update in a synchronized static method, or replace i with an AtomicInteger.
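As a sketch of the AtomicInteger variant (SafeCounter and runThreads are made-up names for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SafeCounter {
    private static final AtomicInteger COUNT = new AtomicInteger();

    private SafeCounter() {}

    // Starts `threads` workers that each increment the shared counter
    // `incrementsPerThread` times, waits for them, and returns the total.
    static int runThreads(int threads, int incrementsPerThread) {
        COUNT.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < incrementsPerThread; k++) {
                    COUNT.incrementAndGet(); // atomic read-modify-write, unlike i++
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return COUNT.get();
    }
}
```

With a plain static int and i++, two threads can read the same old value and one of the increments is lost; incrementAndGet performs the read, add and write as one atomic step.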
Local variables are thread-confined, but non-atomic operations (like i++) on the static variable i are not thread-safe.
The BufferedReader and StringBuilder instances are not shared between threads, so their use is thread-safe.
StringBuffer is thread safe to a degree, in that all its methods are synchronized. BufferedReader is not thread safe.
