I am developing an API request and I'm using multithreading. In the output I'm getting the same request generated twice, by two different threads. When I debugged, I saw that two threads are calling the same method. I need help resolving this issue.
This is my pseudo code:
public void run() {
    logger.debug("Thread " + Thread.currentThread().getName() + " Running");
    String message = "";
    Connection connection = null;
    InputStream fileinput = null;
    Properties properties = new Properties();
    try {
        File file = new File("/home/sridhar.anirudh/eclipse-workspace/API/Change.properties");
        fileinput = new FileInputStream(file);
        properties.load(fileinput);
        soapEndpointUrl = properties.getProperty("endpoint_url");
        soapAction = properties.getProperty("soap_action");
    } catch (Exception e) {
        e.printStackTrace();
    }
    try {
        connection = Database.getInstance().getConnection();
    } catch (SQLException e1) {
        logger.error("Failed To Get Connection " + e1.getMessage());
        return;
    }
    if (CATEGORY.equalsIgnoreCase("fraudrestriction")) {
        String soapResponse = callSoapWebServiceFraudRestriction(soapEndpointUrl, soapAction);
        String response_status = "";
        if (soapResponse.contains("<tns:Description>SUCCESS</tns:Description>") &&
                soapResponse.contains("<tns:Code>ERR_000</tns:Code>")) {
            response_status = "SUCCESS";
            // ... (snippet truncated in the question)
If you kick off two copies of the thread, they will both run, creating the effect you see.
You can create multiple worker threads, but you need to allocate the work between those workers such that each performs a subset of the total workload.
Since you're (seemingly) parsing and processing a file, and making a network service request in response to that file's contents, it's not clear how you intend to divide up the work. That's the key: to use multiple threads to improve throughput, you the programmer must devise a means of partitioning the work between those threads.
As an analogy, if you have one (human) worker working on a job, simply hiring a second worker won't get the job completed any faster unless the work is divided between those workers. That division is your problem. There's nothing magical about threads that can do this for you.
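To make the idea concrete, here is a minimal, self-contained sketch of partitioning a list of work items across a fixed pool so that each thread handles a distinct slice and no item is processed twice. The println stands in for your SOAP call, and the literal list stands in for the data parsed from your properties/input file:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PartitionedWork {
    public static void main(String[] args) {
        // Stand-in for the real workload (e.g. records parsed from a file)
        List<String> workItems = List.of("a", "b", "c", "d", "e", "f", "g", "h");
        int workers = 4;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        int chunk = (workItems.size() + workers - 1) / workers; // ceiling division
        for (int w = 0; w < workers; w++) {
            int from = w * chunk;
            int to = Math.min(from + chunk, workItems.size());
            if (from >= to) break; // no items left for this worker
            List<String> slice = workItems.subList(from, to);
            // Each worker only ever sees its own slice, so no item is handled twice
            pool.execute(() -> slice.forEach(item ->
                    System.out.println(Thread.currentThread().getName() + " -> " + item)));
        }
        pool.shutdown();
    }
}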
So what I'm trying to do is have a socket that receives input from the client, puts the client into a queue, and then returns a message to each client in the queue when my algorithm returns true.
The queue should support a few hundred clients at once, but at the same time not bottleneck the server, so it can actually do what it's supposed to do.
This is what I have so far:
private static final int PORT = 25566;
private static final int THREADS = 4;
private ExecutorService service;

public void init() throws IOException, IllegalStateException {
    ServerSocket serverSocket = new ServerSocket(PORT);
    service = Executors.newCachedThreadPool();
    while (true) {
        // The socket must be effectively final to be captured by the lambda below
        final Socket socket = serverSocket.accept();
        System.out.println("Connection established with " + socket.getInetAddress().toString());
        service.execute(() -> {
            Scanner scanner = null;
            PrintWriter output = null;
            String line = null;
            try {
                scanner = new Scanner(new InputStreamReader(socket.getInputStream()));
                output = new PrintWriter(socket.getOutputStream());
            } catch (IOException e) {
                e.printStackTrace();
            }
            try {
                if (scanner == null || output == null)
                    throw new IllegalStateException("Scanner/PrintWriter is null!");
                line = scanner.nextLine();
                while (line.compareTo("QUIT") != 0) {
                    /* This is where input comes in, queue for the algorithm,
                       algorithm happens then returns appropriate values */
                    output.flush();
                    line = scanner.nextLine();
                }
            } finally {
                try {
                    System.out.println("Closing connection with " + socket.getInetAddress().toString());
                    if (scanner != null) {
                        scanner.close();
                    }
                    if (output != null) {
                        output.close();
                    }
                    socket.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        });
    }
}
Now, what I think will happen with this: if the queue reaches high enough levels, my thread pool will completely bottleneck the server, as all of the threads are being used to handle the clients in the queue and there won't be enough processing left for the algorithm.
EDIT: After a bunch of testing, I think it will work out if the algorithm returns the value and then disconnects, not waiting for a user response but having the user's client reconnect after certain conditions are met.
Your bottleneck is unlikely to be processing power unless you are machine limited. What's more likely to happen is that all the threads in your thread pool are consumed and end up waiting on input from the clients. Your design can only handle as many clients at once as there are threads in the pool.
For a few hundred clients, you could consider simply creating a thread for each client. The limiting resource for the number of threads that can be supported is typically memory for the stack that each thread requires, not processing power; for a modern machine with ample memory, a thousand threads is not a problem, based on personal experience. There may be an operating system parameter limiting the number of threads which you may have to adjust.
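As a sketch, the thread-per-client version of your accept loop is about as simple as it gets; handleClient is a hypothetical method holding your per-connection Scanner/PrintWriter logic:

ServerSocket serverSocket = new ServerSocket(PORT);
while (true) {
    final Socket socket = serverSocket.accept();
    // One dedicated thread per connection; it can block on client input freely
    new Thread(() -> handleClient(socket)).start();
}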
If you need to handle a very large number of clients, you can set up your code to poll sockets for available input and do the processing only for those sockets that have input to be processed.
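For the polling approach, the standard tool is java.nio's Selector. Here is a minimal sketch, reusing the question's port; what you do with the received bytes is left as a placeholder:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class PollingServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(25566));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (true) {
            selector.select(); // blocks until at least one channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    int read = client.read(buffer);
                    if (read == -1) {
                        client.close(); // client disconnected
                    } else {
                        buffer.flip();
                        // hand the received bytes to the algorithm here
                    }
                }
            }
        }
    }
}

A single thread services every connection this way, so the rest of the machine stays free for the algorithm itself.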
I'm trying to create a java program that downloads certain asset files from an FTP server to a local file. Because my (free) FTP server doesn't support file sizes over a few megabytes, I decided to split up the files when they are uploaded and recombine them when the program downloads them. This works, but it is rather slow, because for each file, it has to get the InputStream, which takes some time.
The FTP server I use has a way to download the files without actually logging into the server, so I'm using this code to get the InputStream:
private static final InputStream getInputStream(String file) throws IOException {
    return new URL("http://site.website.com/path/" + file).openStream();
}
To get the InputStream of a part of the asset file I'm using this code:
public static InputStream getAssetInputStream(String asset, int num) throws IOException, FTPException {
    try {
        return getInputStream("assets/" + asset + "_" + num + ".raf");
    } catch (Exception e) {
        // error handling
        throw new IOException(e); // rethrow so the method always returns or throws
    }
}
Because the getAssetInputStream(String, int) method takes some time to run (especially if the file size is more than a megabyte), I decided to make the code that actually downloads the file multi-threaded. Here is where my problem lies.
final Map<Integer, Boolean> done = new HashMap<Integer, Boolean>();
final Map<Integer, byte[]> parts = new HashMap<Integer, byte[]>();
for (int i = 0; i < numParts; i++) {
    final int part = i;
    done.put(part, false);
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                InputStream is = FTP.getAssetInputStream(asset, part);
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                byte[] buf = new byte[DOWNLOAD_BUFFER_SIZE];
                int len = 0;
                while ((len = is.read(buf)) > 0) {
                    baos.write(buf, 0, len);
                    curDownload.addAndGet(len);
                    totAssets.addAndGet(len);
                }
                parts.put(part, baos.toByteArray());
                done.put(part, true);
            } catch (IOException e) {
                // error handling
            } catch (FTPException e) {
                // error handling
            }
        }
    }, "Download-" + asset + "-" + i).start();
}
while (done.values().contains(false)) {
    try {
        Thread.sleep(100);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
File assetFile = new File(dir, "assets/" + asset + ".raf");
assetFile.createNewFile();
FileOutputStream fos = new FileOutputStream(assetFile);
for (int i = 0; i < numParts; i++) {
    fos.write(parts.get(i));
}
fos.close();
This code works, but not always. When I run it on my desktop computer, it works almost always. Not 100% of the time, but often it works just fine. On my laptop, which has a far worse internet connection, it almost never works. The result is a file that is incomplete. Sometimes, it downloads 50% of the file. Sometimes, it downloads 90% of the file, it differs every time.
Now, if I replace the .start() with .run(), the code works just fine, 100% of the time, even on my laptop. It is, however, incredibly slow, so I'd rather not use .run().
Is there a way I could change my code so it does work multi-threaded? Any help will be appreciated.
Firstly, get your FTP server replaced; there are plenty of free FTP servers that can serve files of arbitrary size and offer additional features. But I digress...
Your code seems to have many unrelated problems that could potentially all cause the behavior you are seeing, addressed below:
You have race conditions arising from unprotected/unsynchronized access to the done and parts maps from multiple threads. This could cause data corruption and leave the threads with inconsistent views of these variables, potentially causing done.values().contains(false) to return true even when it shouldn't.
You are calling done.values().contains() repeatedly at a high frequency. While the javadoc doesn't explicitly say so, a hash map most likely traverses every value in O(n) fashion to check whether the map contains it. Coupled with the fact that other threads are modifying the map, you'll get undefined behavior. According to the values() javadoc:
If the map is modified while an iteration over the collection is in progress (except through the iterator's own remove operation), the results of the iteration are undefined.
You are calling new URL("http://site.website.com/path/" + file).openStream(); while stating that you are using FTP. The http:// in the URL determines the protocol openStream() uses, and http:// is not ftp://. It's not clear whether this is a typo, whether you meant HTTP, or whether you have an HTTP server serving identical files.
Any thread raising any type of Exception will cause the code to fail, given that not all parts will have "completed" (based on your busy-wait loop design). Granted, you may have redacted some other logic that guards against this, but otherwise it is a potential problem with the code.
You aren't closing any of the streams that you've opened. This could mean that the underlying sockets are also left open. Not only does this constitute a resource leak; if the server has some sort of limit on the number of simultaneous connections, you are causing new connections to fail because the old, completed transfers were never closed.
Based on the issues above, I propose moving the download logic into a Callable task and running them through an ExecutorService as follows:
LinkedList<Callable<byte[]>> tasksToExecute = new LinkedList<>();
// Populate tasks to run
for (int i = 0; i < numParts; i++) {
    final int part = i;
    // Lambda task that downloads one part and returns its bytes
    tasksToExecute.add(() -> {
        InputStream is = null;
        try {
            is = FTP.getAssetInputStream(asset, part);
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            byte[] buf = new byte[DOWNLOAD_BUFFER_SIZE];
            int len = 0;
            while ((len = is.read(buf)) > 0) {
                baos.write(buf, 0, len);
                curDownload.addAndGet(len);
                totAssets.addAndGet(len);
            }
            return baos.toByteArray();
        } catch (IOException e) {
            // handle exception
        } catch (FTPException e) {
            // handle exception
        } finally {
            if (is != null) {
                try {
                    is.close();
                } catch (IOException ignored) {}
            }
        }
        return null;
    });
}
// Retrieve an ExecutorService instance; note that newWorkStealingPool is Java 8 only.
// Substitute newFixedThreadPool(nThreads) on Java < 8, or for tight control over the
// number of simultaneous connections.
ExecutorService executor = Executors.newWorkStealingPool(4);
// Execute all the tasks and collect their futures. invokeAll blocks until every task
// has completed; it throws InterruptedException, which the enclosing method must
// handle or declare.
List<Future<byte[]>> resultFutures = executor.invokeAll(tasksToExecute);
// Populate the file
File assetFile = new File(dir, "assets/" + asset + ".raf");
assetFile.createNewFile();
try (FileOutputStream fos = new FileOutputStream(assetFile)) {
    // Iterate through the futures, writing them to the file in order
    for (Future<byte[]> result : resultFutures) {
        byte[] partData = result.get(); // may throw ExecutionException/InterruptedException
        if (partData == null) {
            // an exception occurred while downloading this part; handle appropriately
        } else {
            fos.write(partData);
        }
    }
} catch (IOException | InterruptedException | ExecutionException ex) {
    // handle exception
}
Using the executor service further tidies up the multi-threading scenario: the parts are written to the file strictly in order once the downloads complete, and the threads themselves are reused, saving thread-creation costs.
As mentioned, there could be the case where too many simultaneous connections cause the server to reject new ones (or, even more dangerously, write an EOF to make you think the part was downloaded). In that case, the number of worker threads can be tweaked via newFixedThreadPool(nThreads) to ensure that at any given time, only nThreads downloads happen concurrently.
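For example, to guarantee at most four downloads are in flight at once (four being an arbitrary cap):

// A fixed pool caps concurrent downloads at the pool size
ExecutorService executor = Executors.newFixedThreadPool(4);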
I need to make multiple GET requests to the same URL but with different queries. I will be doing this on a mobile device (Android), so I need to optimise as much as possible. I learnt from watching an Android web seminar by Google that it takes ~200 ms to connect to a server, and there are various other delays involved in making data calls. I'm just wondering if there's a way to optimise the process of making multiple requests to the same URL, to avoid some of these delays?
I have been using the method below so far, calling it six times, once for each GET request.
// Make a GET request to url with headers.
// The function returns the contents of the retrieved file.
public String getRequest(String url, String query, Map<String, List<String>> headers) throws IOException {
    String getUrl = url + "?" + query;
    BufferedInputStream bis = null;
    try {
        URLConnection connection = new URL(getUrl).openConnection();
        for (Map.Entry<String, List<String>> h : headers.entrySet()) {
            for (String s : h.getValue()) {
                connection.addRequestProperty(h.getKey(), s);
            }
        }
        bis = new BufferedInputStream(connection.getInputStream());
        StringBuilder builder = new StringBuilder();
        int byteRead;
        while ((byteRead = bis.read()) != -1) {
            builder.append((char) byteRead);
        }
        return builder.toString();
    } finally {
        if (bis != null) {
            bis.close(); // close the stream even if reading throws
        }
    }
}
If you expect a different result for every request, and you cannot combine requests by adding more than one GET variable to the same request, then you cannot avoid the 6 calls.
However, you can use multiple threads to run your requests simultaneously. You can use a thread-pool approach via the native ExecutorService in Java; I would propose using an ExecutorCompletionService to run the requests, as sketched below. As the processing time is not CPU-bound but network-bound, you can use more threads than you have CPUs.
For instance, in some of my projects I use 10+, sometimes 50+, threads (in a thread pool) to retrieve URL data simultaneously, even though I only have 4 CPU cores.
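As a rough sketch of that approach, reusing the getRequest method from the question, and assuming url, queries (a List<String>), and headers are in scope (the pool size of 6 is an arbitrary choice matching the six calls):

ExecutorService pool = Executors.newFixedThreadPool(6);
CompletionService<String> completion = new ExecutorCompletionService<>(pool);
// Submit all the requests at once; they run in parallel on the pool
for (String query : queries) {
    completion.submit(() -> getRequest(url, query, headers));
}
try {
    for (int i = 0; i < queries.size(); i++) {
        // take() returns the next future to finish, in completion order
        String body = completion.take().get();
        // process body...
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
} catch (ExecutionException e) {
    // one of the requests failed; inspect e.getCause()
} finally {
    pool.shutdown();
}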
I have the following code in my application which does two things:
Parses the file, which has 'n' records.
For each record in the file, there will be two web service calls.
public static List<String> parseFile(String fileName) {
    List<String> idList = new ArrayList<String>();
    try {
        BufferedReader cfgFile = new BufferedReader(new FileReader(new File(fileName)));
        String line = null;
        cfgFile.readLine(); // first line is read and discarded (presumably a header)
        while ((line = cfgFile.readLine()) != null) {
            if (!line.trim().equals("")) {
                String[] fields = line.split("\\|");
                idList.add(fields[0]);
            }
        }
        cfgFile.close();
    } catch (IOException e) {
        System.out.println(e + " Unexpected File IO Error.");
    }
    return idList;
}
When I try to parse a file having 1 million lines of records, the Java process fails after processing a certain amount of data, with a java.lang.OutOfMemoryError: Java heap space error. I can partly figure out that the Java process stops because of the huge amount of data being provided. Kindly suggest how I should proceed with data this large.
EDIT: Will this part of the code, new BufferedReader(new FileReader(new File(fileName)));, read the whole file, and is it affected by the size of the file?
The problem you have is that you are accumulating all the data in the list. The best way to approach this is to do it in a streaming fashion. This means you should not accumulate all the ids in the list, but instead call your web service on each row, or accumulate a smaller buffer and then make the calls.
Opening the file and creating the BufferedReader will have no impact on memory consumption, as the bytes from the file will be read (more or less) line by line. The problem is at the point in the code where idList.add(fields[0]); is called: the list will grow as large as the file, because you keep accumulating all of the file's data in it.
Your code should do something like this:
while ((line = cfgFile.readLine()) != null) {
    if (!line.trim().equals("")) {
        String[] fields = line.split("\\|");
        callToRemoteWebService(fields[0]);
    }
}
Increase your Java heap size using the -Xms and -Xmx options (for example, -Xmx2g for a 2 GB maximum heap). If not set explicitly, the JVM sets the heap size to ergonomic defaults, which in your case are not enough. Read this paper to find out more about tuning memory in the JVM: http://www.oracle.com/technetwork/java/javase/tech/memorymanagement-whitepaper-1-150020.pdf
EDIT: An alternative way of doing this in a producer-consumer fashion, to exploit parallel processing. The general idea is to create one producer thread that reads the file and queues tasks for processing, and n consumer threads that consume them. A very general idea (for illustrative purposes) is the following:
// blocking queue holding the tasks to be executed
final SynchronousQueue<Callable<Void>> queue = // ...

// reads the file and submits tasks for processing
final Runnable producer = new Runnable() {
    public void run() {
        BufferedReader in = null;
        try {
            in = new BufferedReader(new FileReader(new File(fileName)));
            String line = null;
            while ((line = in.readLine()) != null) {
                if (!line.trim().equals("")) {
                    final String[] fields = line.split("\\|");
                    // this will block if there are no consumer threads available to process it...
                    queue.put(new Callable<Void>() {
                        public Void call() {
                            process(fields);
                            return null;
                        }
                    });
                }
            }
        } catch (IOException e) {
            // handle the read failure...
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // close the buffered reader here...
        }
    }
};

// Consumes the tasks submitted by the producer. Consumers can be pooled
// for parallel processing.
final Runnable consumer = new Runnable() {
    public void run() {
        try {
            while (true) {
                // this method blocks if there are no items left for processing in the queue...
                Callable<Void> task = queue.take();
                task.call();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (Exception e) {
            // handle a failed task...
        }
    }
};
Of course, you have to write code that manages the lifecycle of the consumer and producer threads. The right way to do this is to implement it using an Executor, as sketched below.
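Here is a minimal sketch of the Executor-based version, under these assumptions: process(id) stands in for the two web-service calls per record, the file name is hypothetical, and the pool size and queue bound are arbitrary. The bounded queue plus CallerRunsPolicy give backpressure, so the reader never gets far ahead of the consumers:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class StreamingProcessor {

    // Hypothetical stand-in for the two web service calls per record
    static void process(String id) { /* ... */ }

    public static void main(String[] args) throws IOException {
        // 8 consumer threads; when the 100-slot queue is full, the reading
        // thread runs the task itself (CallerRunsPolicy), throttling the read
        ExecutorService pool = new ThreadPoolExecutor(
                8, 8, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(100),
                new ThreadPoolExecutor.CallerRunsPolicy());
        try (BufferedReader in = new BufferedReader(new FileReader("ids.cfg"))) {
            in.readLine(); // skip the header line, as in the original parseFile
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    final String id = line.split("\\|")[0];
                    pool.execute(() -> process(id));
                }
            }
        } finally {
            pool.shutdown(); // no new tasks; already-submitted ones still run
        }
    }
}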
When you want to work with big data, you have 2 choices:
Use a big enough heap to fit all the data. This will "work" for a while, but if your data size is unbounded, it will eventually fail.
Work with the data incrementally, keeping only part of the data (of a bounded size) in memory at any one time. This is the ideal solution, as it will scale to any amount of data.
I have a loop where I download images; I need to load, for example, 10 images and merge them into one image. I want to know when all of the images have been loaded. This is how I do it:
I have an executor to limit the thread count, and a CountDownLatch barrier which waits until all the images have been loaded.
CountDownLatch barrier = new CountDownLatch(images.size());
private static ExecutorService executorService = Executors.newFixedThreadPool(MAX_THREAD_POOL);

for (Image image : images) {
    executorService.execute(new ImageRunnable(image, barrier));
}
barrier.await();
In ImageRunnable I download the image like this, from a Google static map:
String url ="my url"
try {
URL target = new URL(url);
ImageIO.read(target);
barrier.countDown();
//exaggerated logic
} catch (IOException e) {
System.out.println("Can not load image, " + e);
}
Other people have told me that I can hit a case where all the threads in the executor are busy and my algorithm never ends, because it will wait forever for every thread to reach the barrier.await() point (a deadlock). They said this happens when ImageIO.read(target) is called and the connection is established, but the HTTP session is never closed (the response from the server never comes back). Can this happen? I thought that in this case I would get an exception and the affected thread would be interrupted. That is exactly what happens when I run my loop and cut the internet connection with a firewall at the third image: the output is a broken image, as if the network was closed and the image was not loaded to the end. Am I wrong?
The concern is that you may throw an exception and never count down your latch.
I would consider doing this:
String url ="my url"
try {
URL target = new URL(url);
ImageIO.read(target);
} catch (IOException e) {
System.out.println("Can not load image, " + e);
throw e;
} finally {
barrier.countDown();
}
Throw the exception to let the world know you've run into a problem and may not be able to complete (you know you can't recover from it), but at the very least let the barrier get lowered. I'd rather have to deal with an exception than a deadlock.
Just to flesh out my comment:
CompletionService<Image> service = new ExecutorCompletionService<Image>(
        Executors.newFixedThreadPool(nThreads));
for (Image image : images) {
    service.submit(new ImageRunnable(image), image);
}
try {
    for (int i = 0; i < images.size(); i++) {
        service.take();
    }
} catch (InterruptedException e) {
    // someone wants this thread to cancel peacefully; either exit the thread
    // or at a bare minimum do this to pass the interruption up
    Thread.currentThread().interrupt();
}
There. That's it.
If you're concerned about enforcing timeouts on the HTTP connection, my quick and dirty research suggests something like...
URL target = // whatever;
URLConnection connection = target.openConnection();
connection.setReadTimeout(timeoutInMilliseconds);
InputStream stream = null;
try {
    stream = connection.getInputStream();
    return ImageIO.read(stream);
} finally {
    if (stream != null) { stream.close(); }
}
Apart from moving barrier.countDown() into a finally block, as suggested by @corsiKa, make sure your code always finishes. Set a timeout on reading the URL and on await():
barrier.await(1, TimeUnit.MINUTES);