Fast and asynchronous way of making multiple HTTP requests in Java

I have a program that needs to make HTTP requests really fast. The requests should be made asynchronously so that they don't block the main thread.
So I have created a queue that is watched by 10 separate threads which make the HTTP requests. If something is inserted into the queue, the first thread that gets the data makes the request and processes the result.
The queue gets filled with thousands of items, so multithreading is really necessary to get the responses as fast as possible.
Since I have a lot of code, I'll give a short example.
Main class:
package fasthttp;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
public class FastHTTP {
public static void main(String[] args) {
ExecutorService executor = Executors.newFixedThreadPool(10);
LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
queue.add("http://www.lennar.eu/ip.php"); // for example
for (int i = 0; i < 10; i++) {
executor.execute(new HTTPworker(queue));
}
}
}
HTTPworker class:
package fasthttp;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.LinkedBlockingQueue;
public class HTTPworker implements Runnable {
private final LinkedBlockingQueue<String> queue;
public HTTPworker(LinkedBlockingQueue<String> queue) {
this.queue = queue;
}
private String getResponse(String url) throws IOException {
URL obj = new URL(url);
HttpURLConnection con = (HttpURLConnection) obj.openConnection();
StringBuilder response;
try (BufferedReader in = new BufferedReader(
new InputStreamReader(con.getInputStream()))) {
String inputLine;
response = new StringBuilder();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
}
return response.toString();
}
@Override
public void run() {
while (true) {
try {
String data = queue.take();
String response = getResponse(data);
//Do something with response
System.out.println(response);
} catch (InterruptedException | IOException ex) {
//Handle exception
}
}
}
}
Is there a better or faster way to make thousands of HTTP requests and process the responses asynchronously? Speed and performance are what I'm after.

Answering my own question. I tried Apache's asynchronous HTTP client, but after a while I started using Ning's async client and I am happy with it.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
public class RestService {
private final static Executor executor = Executors.newCachedThreadPool();
private final static CloseableHttpClient closeableHttpClient = HttpClientBuilder.create().build();
public static String sendSyncGet(final String url) {
return sendAsyncGet(List.of(url)).get(0);
}
public static List<String> sendAsyncGet(final List<String> urls){
List<GetRequestTask> tasks = urls.stream().map(url -> new GetRequestTask(url, executor)).collect(Collectors.toList());
List<String> responses = new ArrayList<>();
while(!tasks.isEmpty()) {
for(Iterator<GetRequestTask> it = tasks.iterator(); it.hasNext();) {
final GetRequestTask task = it.next();
if(task.isDone()) {
responses.add(task.getResponse());
it.remove();
}
}
//if(!tasks.isEmpty()) Thread.sleep(100); //avoid tight loop in "main" thread
}
return responses;
}
private static class GetRequestTask {
private final FutureTask<String> task;
public GetRequestTask(String url, Executor executor) {
GetRequestWork work = new GetRequestWork(url);
this.task = new FutureTask<>(work);
executor.execute(this.task);
}
public boolean isDone() {
return this.task.isDone();
}
public String getResponse() {
try {
return this.task.get();
} catch(Exception e) {
throw new RuntimeException(e);
}
}
}
private static class GetRequestWork implements Callable<String> {
private final String url;
public GetRequestWork(String url) {
this.url = url;
}
public String getUrl() {
return this.url;
}
public String call() throws Exception {
return closeableHttpClient.execute(new HttpGet(getUrl()), new BasicResponseHandler());
}
}
}
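For reference, here is a minimal sketch of the callback style that Ning's client (mentioned above) offers; the package and class names are assumed from the 1.x com.ning API, and newer releases live under org.asynchttpclient, so they may need adjusting:

import java.util.concurrent.Future;
import com.ning.http.client.AsyncCompletionHandler;
import com.ning.http.client.AsyncHttpClient;
import com.ning.http.client.Response;

public class NingSketch {
    public static void main(String[] args) throws Exception {
        // One client instance is meant to be created once and shared.
        AsyncHttpClient client = new AsyncHttpClient();
        Future<Response> whenDone = client.prepareGet("http://www.lennar.eu/ip.php")
                .execute(new AsyncCompletionHandler<Response>() {
                    @Override
                    public Response onCompleted(Response response) throws Exception {
                        // Invoked on an I/O thread when the response arrives.
                        System.out.println(response.getResponseBody());
                        return response;
                    }

                    @Override
                    public void onThrowable(Throwable t) {
                        t.printStackTrace();
                    }
                });
        whenDone.get(); // block here only to keep this demo JVM alive
        client.close();
    }
}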

Related

How to give file as input and work in multiple threads?

I have this code to find out how to get the status code from a URL:
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
/**
* @author Crunchify.com
*
*/
class j {
public static void main(String args[]) throws Exception {
String[] hostList = { "http://example.com", "http://example2.com","http://example3.com" };
for (int i = 0; i < hostList.length; i++) {
String url = hostList[i];
String status = getStatus(url);
System.out.println(url + "\t\tStatus:" + status);
}
}
public static String getStatus(String url) throws IOException {
String result = "";
try {
URL siteURL = new URL(url);
HttpURLConnection connection = (HttpURLConnection) siteURL
.openConnection();
connection.setRequestMethod("HEAD");
connection.connect();
int code = connection.getResponseCode();
result = Integer.toString(code);
} catch (Exception e) {
result = "->Red<-";
}
return result;
}
}
I have checked it with a small input and it works fine. But I have millions of domains that I need to scan, and I have a file containing them.
I want to know how I can give that file as input to this code.
I also want the code to work in multiple threads, say with a thread count of more than 20000, so that my output will be faster.
And how can I write the output to another file?
Kindly help me. If possible, I would also like to know the most bandwidth-savvy way to do the same job. I want to make the code faster anyway. How can I do these things with the code I have?
Java Version:
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
This does what you want:
Input list file (c://lines.txt)
http://www.adam-bien.com/
http://stackoverflow.com/
http://www.dfgdfgdfgdfgdfgertwsgdfhdfhsru.de
http://www.google.de
The Thread:
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Callable;
public class StatusThread implements Callable<String> {
String url;
public StatusThread(String url) {
this.url = url;
}
@Override
public String call() throws Exception {
String result = "";
try {
URL siteURL = new URL(url);
HttpURLConnection connection = (HttpURLConnection) siteURL.openConnection();
connection.setRequestMethod("HEAD");
connection.connect();
int code = connection.getResponseCode();
result = Integer.toString(code);
} catch (Exception e) {
result = "->Red<-";
}
return url + "|" + result;
}
}
And the main program:
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.stream.Stream;
public class CallableExample {
public static void main(String[] args) throws IOException {
// Number of threads
int numberOfThreads = 10;
// Input file
String sourceFileName = "c://lines.txt"; // Replace by your own
String targetFileName = "c://output.txt"; // Replace by your own
// Read input file into List
ArrayList<String> urls = new ArrayList<>();
try (Stream<String> stream = Files.lines(Paths.get(sourceFileName ))) {
stream.forEach((string) -> {
urls.add(string);
});
} catch (IOException e) {
e.printStackTrace();
}
// Create thread pool
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numberOfThreads);
List<Future<String>> resultList = new ArrayList<>();
// Launch threads
for(String url : urls) {
StatusThread statusGetter = new StatusThread(url);
Future<String> result = executor.submit(statusGetter);
resultList.add(result);
}
// Use results
FileWriter writer = new FileWriter(targetFileName);
for (Future<String> future : resultList) {
try {
String oneResult = future.get().split("\\|")[0] + " -> " + future.get().split("\\|")[1];
// Print the results to the console
System.out.println(oneResult);
// Write the result to a file
writer.write(oneResult + System.lineSeparator());
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
}
writer.close();
// Shut down the executor service
executor.shutdown();
}
}
Don't forget to:
Create your input file and point to it (c://lines.txt)
Change the number of threads to get the best result
You will have issues sharing a file across threads. Much better to read the file and then spawn a thread to process each record in the file.
Creating a thread is not trivial resource-wise, so a thread pool would be useful so that threads can be reused.
Do you want all threads to write to a single file?
I would do that using a shared list between the threads and the writer (see the sketch below). Others may have a better idea.
How to do all this depends on Java version.
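A minimal sketch of that idea, using a BlockingQueue as the shared structure and one dedicated writer thread that drains what the checker threads produce (the class name and the poison-pill marker are made up for illustration):

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.concurrent.BlockingQueue;

public class ResultWriter implements Runnable {
    static final String STOP = "__STOP__"; // poison pill that ends the writer loop
    private final BlockingQueue<String> results;
    private final String targetFile;

    public ResultWriter(BlockingQueue<String> results, String targetFile) {
        this.results = results;
        this.targetFile = targetFile;
    }

    @Override
    public void run() {
        try (PrintWriter out = new PrintWriter(new FileWriter(targetFile))) {
            while (true) {
                String line = results.take(); // blocks until a checker thread offers a result
                if (STOP.equals(line)) {
                    break;
                }
                out.println(line);
            }
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}

The checker threads would put lines such as url + " " + status onto the queue, and the main thread would put STOP once all of them have finished.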
You can use an ExecutorService and set the number of threads to use.
The ExecutorService instance will handle the thread management for you.
You just need to provide it the tasks to execute and invoke them all.
When all the tasks have completed you can get the results.
In the call() method of the Callable implementation we return a String with a separator indicating the URL and the response code of the request.
For example: http://example3.com||301, http://example.com||200, etc.
I have not written the code to read from a file and store the results of the tasks in another file. You should not have great difficulty implementing it.
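For that missing piece, a rough sketch of one way to do it with java.nio.file (the file names are placeholders, not from this answer):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class UrlFileIo {

    // Read one URL per line from the input file.
    static List<String> readUrls(String inputFile) throws IOException {
        return Files.readAllLines(Paths.get(inputFile));
    }

    // Write one "url||code" result per line to the output file.
    static void writeResults(String outputFile, List<String> results) throws IOException {
        Files.write(Paths.get(outputFile), results);
    }
}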
Here is the main class:
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class Main {
public static void main(String[] args) throws InterruptedException {
String[] hostList = { "http://example.com", "http://example2.com", "http://example3.com" };
int nbThreadToUse = Runtime.getRuntime().availableProcessors() - 1;
ExecutorService executorService = Executors.newFixedThreadPool(nbThreadToUse);
Set<Callable<String>> callables = new HashSet<Callable<String>>();
for (String host : hostList) {
callables.add(new UrlCall(host));
}
List<Future<String>> futures = executorService.invokeAll(callables);
for (Future<String> future : futures) {
try {
String result = future.get();
String[] keyValueToken = result.split("\\|\\|");
String url = keyValueToken[0];
String response = keyValueToken[1];
System.out.println("url=" + url + ", response=" + response);
} catch (ExecutionException e) {
e.printStackTrace();
}
}
executorService.shutdown();
}
}
Here is UrlCall, the Callable implementation that performs the call to the URL.
UrlCall takes the URL to test in its constructor.
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Callable;
public class UrlCall implements Callable<String> {
private String url;
public UrlCall(String url) {
this.url = url;
}
@Override
public String call() throws Exception {
return getStatus(url);
}
private String getStatus(String url) throws IOException {
try {
URL siteURL = new URL(url);
HttpURLConnection connection = (HttpURLConnection) siteURL.openConnection();
connection.setRequestMethod("HEAD");
connection.connect();
int code = connection.getResponseCode();
return url + "||" + code;
} catch (Exception e) {
//FIXME to log of course
return url + "||exception";
}
}
}
I agree with the thread pool approach presented here.
Multi-threading is about exploiting the time other threads spend waiting (in this case, I guess, waiting for the remote site's response). It does not multiply processing power, so about 10 threads seems reasonable (more depending on hardware).
An important point that seems to have been neglected in the answers I read is that the OP talks about millions of domains. So I would discourage loading the whole file into memory in a list that is iterated over afterwards. I would rather merge everything into a single loop driven by the file reading, instead of three (read, ping, write).
stream.forEach((url) -> {
StatusThread statusGetter = new StatusThread(url, outputWriter);
Future<String> result = executor.submit(statusGetter);
});
outputWriter would be a type with a synchronized method to write into an output stream.
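A minimal sketch of what such an outputWriter could look like (the class and method names are made up for illustration):

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;

public class OutputWriter {
    private final BufferedWriter out;

    public OutputWriter(Writer target) {
        this.out = new BufferedWriter(target);
    }

    // Synchronized so that many worker threads can append lines safely.
    public synchronized void writeLine(String line) throws IOException {
        out.write(line);
        out.newLine();
    }

    public synchronized void close() throws IOException {
        out.close();
    }
}

Each StatusThread would then call something like outputWriter.writeLine(url + " -> " + result) at the end of its call() method.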

Java - generate scheduled JSON

I started with this: http://examples.javacodegeeks.com/enterprise-java/rest/jersey/json-example-with-jersey-jackson/
and created a JSON response on localhost:8080.
I am not sure if there is any way to generate a new JSON every 3 or 5 seconds, for example just adding one digit to the new JSON every 3 seconds.
#Path("/LNU.se")
public class EntryPoint {
#GET
#Path("get")
#Produces(MediaType.TEXT_PLAIN)
public String test() {
return "mahdi Test";
}
#POST
#Path("post")
#Consumes(MediaType.TEXT_PLAIN)
public Response postTest(Track track){
String resault = " the track is saved mahdi: "+ track;
return Response.status(201).entity(resault).build();
}
}
and another class:
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;
public class App {
public static void main(String[] args) {
ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
context.setContextPath("/");
Server jettyServer = new Server(8080);
jettyServer.setHandler(context);
ServletHolder jerseyServlet = context.addServlet(org.glassfish.jersey.servlet.ServletContainer.class, "/*");
jerseyServlet.setInitOrder(0);
jerseyServlet.setInitParameter( "jersey.config.server.provider.classnames", EntryPoint.class.getCanonicalName());
try {
jettyServer.start();
System.out.println(" open your browser on http://localhost:8080/LNU.se/get");
jettyServer.join();
} catch (Exception e)
{
e.printStackTrace();
} finally {
jettyServer.destroy();
}
}
}
======================
UPDATE:
My server is:
package rest;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;
public class RestServer {
public static void main(String[] args) throws Exception {
ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
context.setContextPath("/");
Server jettyServer = new Server(8080);
jettyServer.setHandler(context);
ServletHolder jerseyServlet = context.addServlet(
org.glassfish.jersey.servlet.ServletContainer.class, "/*");
jerseyServlet.setInitOrder(0);
jerseyServlet.setInitParameter("jersey.config.server.provider.classnames",
Calculator.class.getCanonicalName());
try {
jettyServer.start();
System.out.println(" open the browser on http://localhost:8080/calculator/squareRoot?input=16");
jettyServer.join();
} finally {
jettyServer.destroy();
}
}
}
and this class:
package rest;
import javax.ws.rs.Consumes;
import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
#Path("calculator")
public class Calculator {
Result ress = new Result("mahdi84");
public static String resul1;
@GET
@Path("mahdi")
@Produces(MediaType.APPLICATION_JSON)
public Result mahdi(){
JSONObject json = new JSONObject();
json.put("validTime", "2016-02-24T11:00:00Z");
JSONArray jsonArray = new JSONArray();
JSONObject obj = new JSONObject();
obj.put("mcc", resul1);
obj.put("temprature", resul1+1);
obj.put("Humidity", resul1+10);
jsonArray.add(obj);
json.put("\n JSONdata --> ", jsonArray);
ress.setInput(Double.parseDouble(resul1));
ress.setOutput(Double.parseDouble(resul1));
ress.setTestVar(resul1);
ress.setTestVar2(json);
return ress;
}
@POST
@Path("/post")
@Consumes(MediaType.APPLICATION_JSON)
public Response createDataInJSON(String data) {
resul1= data;
return Response.status(201).entity(resul1).build();
}
static class Result{
double input;
double output;
String action;
Object testVar;
JSONObject testVar2;
public Result(){}
public JSONObject getTestVar2() {
return testVar2;
}
public void setTestVar2(JSONObject testVar2) {
this.testVar2 = testVar2;
}
public Object getTestVar() {
return testVar;
}
public void setTestVar(Object testVar) {
this.testVar = testVar;
}
public Result(String action) {
this.action = action;
}
public String getAction() {
return action;
}
public void setAction(String action) {
this.action = action;
}
public double getInput() {
return input;
}
public void setInput(double input) {
this.input = input;
}
public double getOutput() {
return output;
}
public void setOutput(double output) {
this.output = output;
}
}
}
and my client is:
import java.util.Timer;
import java.util.TimerTask;
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;
import com.sun.jersey.api.client.WebResource;
public class JerseyClientPost extends TimerTask{
public int cuntr = 0;
public static void main(String[] args) {
TimerTask mytask = new JerseyClientPost();
Timer timer = new Timer();
timer.schedule(mytask, 1000, 1000);
}
@Override
public void run() {
try {
Client client = Client.create();
WebResource webResource = client.resource("http://localhost:8080/calculator/post");
String input = Integer.toString(cuntr++);
ClientResponse response = webResource.type("application/json").post(ClientResponse.class, input);
if (response.getStatus() != 201) {
throw new RuntimeException("Failed : HTTP error code : "+ response.getStatus());
}
String output = response.getEntity(String.class);
}
catch (Exception e) {
e.printStackTrace();
}
}
}
The client posts to the server every 1 second and the server generates a new JSON at "http://localhost:8080/calculator/mahdi".
When I try to read from "http://localhost:8080/calculator/mahdi" with the Apache HTTP/1.1 client in another program:
package HttpClient;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
public class MyHttpClient {
public static void main(String[] args) throws Exception{
HttpGet request = new HttpGet("http://localhost:8080/calculator/mahdi");
DefaultHttpClient client = new DefaultHttpClient();
HttpResponse response = client.execute(request);
HttpEntity entity = response.getEntity();
BufferedReader br = new BufferedReader(new InputStreamReader(entity.getContent(), "UTF-8"));
System.out.println("Reading began ... ");
while (true) {
String line = br.readLine();
if (line != null && !line.isEmpty()) {
System.out.println(line);
}
}
}
}
Edit:
Thank you, I did that by using the TimerTask class.
It only prints the first JSON! But when I refresh the web link, it updates in the browser.
Would you please correct me if I am wrong? I want to see the stream of JSON in MyHttpClient, but it only shows the first JSON!
I think you are mixing things. The server should only be a server, and not extend TimerTask. The client should call the server at a specific interval.
Do the following:
Test your server in a web browser. Make sure that it responds.
Write a client either in Java or JavaScript.
If you are using Java, the client should have an HTTP client.
The client should have a TimerTask calling the server at a specific interval, as sketched below.
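A rough sketch of such a polling client, reusing the URL from the question (the 3-second interval is arbitrary); each tick issues a fresh GET instead of holding one connection open and looping on readLine():

import java.util.Timer;
import java.util.TimerTask;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;

public class PollingReader {
    public static void main(String[] args) {
        CloseableHttpClient client = HttpClientBuilder.create().build();
        new Timer().schedule(new TimerTask() {
            @Override
            public void run() {
                // Every tick fetches the current JSON; the connection is not kept open.
                try (CloseableHttpResponse response =
                        client.execute(new HttpGet("http://localhost:8080/calculator/mahdi"))) {
                    System.out.println(EntityUtils.toString(response.getEntity()));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }, 0, 3000); // poll every 3 seconds
    }
}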

Getting different results using main and JUnit in Java

When I tested a simple producer/consumer example, I got a very strange result, as shown below.
If I use main() to test the following code, I get the correct and expected result.
But with JUnit I only get the first directory correctly; the remaining work is dropped.
What is the exact reason?
Working code:
import java.io.File;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.junit.Test;
public class TestProducerAndConsumer {
public static void main(String[] args) {
BlockingQueue<File> queue = new LinkedBlockingQueue<File>(1000);
new Thread(new FileCrawler(queue, new File("C:\\"))).start();
new Thread(new Indexer(queue)).start();
}
}
Bad Code:
import java.io.File;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.junit.Test;
public class TestProducerAndConsumer {
@Test
public void start2() {
BlockingQueue<File> queue = new LinkedBlockingQueue<File>(1000);
new Thread(new FileCrawler(queue, new File("C:\\"))).start();
new Thread(new Indexer(queue)).start();
}
}
Other function code:
import java.io.File;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
public class FileCrawler implements Runnable {
private final BlockingQueue<File> fileQueue;
private final File root;
private int i = 0;
public FileCrawler(BlockingQueue<File> fileQueue, File root) {
this.fileQueue = fileQueue;
this.root = root;
}
@Override
public void run() {
try {
craw(root);
} catch (InterruptedException e) {
System.out.println("shit!");
e.printStackTrace();
Thread.currentThread().interrupt();
}
}
private void craw(File file) throws InterruptedException {
File[] entries = file.listFiles();
//System.out.println(Arrays.toString(entries));
if (entries != null && entries.length > 0) {
for (File entry : entries) {
if (entry.isDirectory()) {
craw(entry);
} else {
fileQueue.offer(entry);
i++;
System.out.println(entry);
System.out.println(i);
}
}
}
}
public static void main(String[] args) throws InterruptedException {
FileCrawler fc = new FileCrawler(null, null);
fc.craw(new File("C:\\"));
System.out.println(fc.i);
}
}
import java.io.File;
import java.util.concurrent.BlockingQueue;
public class Indexer implements Runnable {
private BlockingQueue<File> queue;
public Indexer(BlockingQueue<File> queue) {
this.queue = queue;
}
@Override
public void run() {
try {
while (true) {
indexFile(queue.take());
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
private void indexFile(File file) {
System.out.println("Indexing ... " + file);
}
}
JUnit is presumably allowing the JVM and its threads to terminate once the test method finishes -- thus your threads do not complete their work.
Try waiting for the threads to 'join':
Thread crawlerThread = new Thread(new FileCrawler(queue, new File("C:\\")));
Thread indexerThread = new Thread(new Indexer(queue));
crawlerThread.start();
indexerThread.start();
//
// wait for them to finish.
crawlerThread.join();
indexerThread.join();
This should help.
The other thing that can go wrong is that log output (via Log4j) can sometimes be truncated at the end of execution; flushing and pausing can help. But I don't think that will affect you here.

Collect links from web pages using a thread pool in Java

I am programming a link collector that works over a specified number of pages. To make it more efficient I am using a thread pool with a fixed size. Because I am really a newbie in the multithreading area, I have problems fixing some issues. My idea is that every thread does the same thing: connect to a page and collect every URL. After that, the URLs are added to a queue for the next thread.
But this doesn't work. At first the program analyzes the baseurl and adds the URLs from it. But really I want to start only with LinksToVisit.add(baseurl) and run it with the thread pool; however, it always polls the queue while the threads add nothing new, so the top of the queue is null. And I don't know why :(
I tried to do it with an ArrayBlockingQueue but with no success. Fixing it by analyzing the base URL first is not a good solution, because when the baseurl contains, for example, only one link, it doesn't follow it. So I think I am going about it the wrong way or missing something important. As the HTML parser I am using Jsoup. Thanks for answers.
Source (unnecessary methods removed):
package collector;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.text.DecimalFormat;
import java.util.Iterator;
import java.util.Map;
import java.util.Scanner;
import java.util.Map.Entry;
import java.util.concurrent.*;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class Collector {
private String baseurl;
private int links;
private int cvlinks;
private double time;
private int chcount;
private static final int NTHREADS = Runtime.getRuntime().availableProcessors()*2;
private ConcurrentLinkedQueue<String> LinksToVisit = new ConcurrentLinkedQueue<String>();
private ConcurrentSkipListMap<String, Double> SortedCharMap = new ConcurrentSkipListMap<String, Double>();
private ConcurrentHashMap<String, Double> CharMap = new ConcurrentHashMap<String, Double>();
public Collector(String url, int links) {
this.baseurl = url;
this.links = links;
this.cvlinks = 0;
this.chcount = 0;
try {
Document html = Jsoup.connect(url).get();
if(cvlinks != links){
Elements collectedLinks = html.select("a[href]");
for(Element link:collectedLinks){
if(cvlinks == links) break;
else{
String current = link.attr("abs:href");
if(!current.equals(url) && current.startsWith(baseurl)&& !current.contains("#")){
LinksToVisit.add(current);
cvlinks++;
}
}
}
}
AnalyzeDocument(html, url);
} catch (IOException e) {
e.printStackTrace();
}
CollectFromWeb();
}
private void AnalyzeDocument(Document doc,String url){
String text = doc.body().text().toLowerCase().replaceAll("[^a-z]", "").trim();
chcount += text.length();
String chars[] = text.split("");
CharCount(chars);
}
private void CharCount(String[] chars) {
for(int i = 1; i < chars.length; i++) {
if(!CharMap.containsKey(chars[i]))
CharMap.put(chars[i],1.0);
else
CharMap.put(chars[i], CharMap.get(chars[i]).doubleValue()+1);
}
}
private void CollectFromWeb(){
long startTime = System.nanoTime();
ExecutorService executor = Executors.newFixedThreadPool(NTHREADS);
CollectorThread[] workers = new CollectorThread[this.links];
for (int i = 0; i < this.links; i++) {
if(!LinksToVisit.isEmpty()){
int j = i+1;
System.out.println("Collecting from "+LinksToVisit.peek()+" ["+j+"/"+links+"]");
//Runnable worker = new CollectorThread(LinksToVisit.poll());
workers[i] = new CollectorThread(LinksToVisit.poll());
executor.execute(workers[i]);
}
else break;
}
executor.shutdown();
while (!executor.isTerminated()) {}
SortedCharMap.putAll(CharMap);
this.time =(System.nanoTime() - startTime)*10E-10;
}
class CollectorThread implements Runnable{
private Document html;
private String url;
public CollectorThread(String url){
this.url = url;
try {
this.html = Jsoup.connect(url).get();
} catch (IOException e) {
e.printStackTrace();
}
}
@Override
public void run() {
if(cvlinks != links){
Elements collectedLinks = html.select("a[href]");
for(Element link:collectedLinks){
if(cvlinks == links) break;
else{
String current = link.attr("abs:href");
if(!current.equals(url) && current.startsWith(baseurl)&& !current.contains("#")){
LinksToVisit.add(current);
cvlinks++;
}
}
}
}
AnalyzeDocument(html, url);
}
}
}
Instead of using the LinksToVisit queue, just call executor.execute(new CollectorThread(current)) directly from CollectorThread.run(). The ExecutorService has its own internal queue of tasks which it will run as threads become available.
The other problem here is that calling shutdown() after adding the first set of URLs to the queue will prevent new tasks from being added to the executor. You can fix this by instead making the executor shut down when it has emptied its queue:
class Queue extends ThreadPoolExecutor {
Queue(int nThreads) {
super(nThreads, nThreads, 0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>());
}
protected void afterExecute(Runnable r, Throwable t) {
if(getQueue().isEmpty()) {
shutdown();
}
}
}
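Put together, the collector would hold the pool as a field, seed it with the base URL, and let each task submit the links it finds itself (a sketch reusing the names from the question, not a drop-in replacement):

// In Collector: keep the pool in a field so the inner CollectorThread class can reach it.
private final Queue executor = new Queue(NTHREADS);

private void CollectFromWeb() throws InterruptedException {
    executor.execute(new CollectorThread(baseurl));
    // Wait for the pool to drain and shut itself down instead of spinning on isTerminated().
    executor.awaitTermination(10, TimeUnit.MINUTES);
    SortedCharMap.putAll(CharMap);
}

// And inside CollectorThread.run(), for every link that passes the filter:
// executor.execute(new CollectorThread(current));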

Multithreading with non-blocking sockets

I am trying to implement a TCP server in Java using NIO.
It simply uses the Selector's select method to get the ready keys, and then processes those keys if they are acceptable, readable, and so on. The server works just fine while I'm using a single thread. But when I try to use more threads to process the keys, the server's responses slow down and it eventually stops responding, say after 4-5 requests.
This is all I'm doing (pseudo-code):
Iterator<SelectionKey> keyIterator = selector.selectedKeys().iterator();
while (keyIterator.hasNext()) {
SelectionKey readyKey = keyIterator.next();
if (readyKey.isAcceptable()) {
//A new connection attempt, registering socket channel with selector
} else {
Worker.add( readyKey );
}
}
Worker is the thread class that performs Input/Output from the channel.
This is the code of my Worker class:
public class Worker implements Runnable {
private static List<SelectionKey> keyPool = Collections.synchronizedList(new LinkedList<SelectionKey>());
public static void add(SelectionKey key) {
synchronized (keyPool) {
keyPool.add(key);
keyPool.notifyAll();
}
}
public void run() {
while ( true ) {
SelectionKey myKey = null;
synchronized (keyPool) {
try {
while (keyPool.isEmpty()) {
keyPool.wait();
}
} catch (InterruptedException ex) {
}
myKey = keyPool.remove(0);
keyPool.notifyAll();
}
if (myKey != null && myKey.isValid() ) {
if (myKey.isReadable()) {
//Performing reading
} else if (myKey.isWritable()) {
//performing writing
myKey.cancel();
}
}
}
}
}
My basic idea is to add the key to the keyPool from which various threads can get a key, one at a time.
My BaseServer class itself runs as a thread. It creates 10 Worker threads before the event loop begins. I also tried to increase the priority of the BaseServer thread so that it gets more of a chance to accept the acceptable keys. Still, it stops responding after approximately 8 requests. Please help me find where I am going wrong. Thanks in advance. :)
You aren't removing anything from the selected-key set. You must do that every time around the loop, e.g. by calling keyIterator.remove() after you call next().
You need to read the NIO Tutorials.
First of all, you should not really be using wait() and notify() calls anymore since there exist good Standard Java (1.5+) wrapper classes in java.util.concurrent, such as BlockingQueue.
Second, it's suggested to do the IO in the selecting thread itself, not in the worker threads. The worker threads should just queue up reads and writes for the selector thread(s).
This page explains it pretty good and even provides working code samples of a simple TCP/IP server: http://rox-xmlrpc.sourceforge.net/niotut/
Sorry, I don't yet have time to look at your specific example.
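As an illustration of the BlockingQueue point above, the hand-rolled wait()/notify() pool from the question could be replaced by something like this (a sketch, not taken from the original answer):

import java.nio.channels.SelectionKey;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class Worker implements Runnable {
    // The queue does the blocking and the signalling; no explicit synchronization is needed.
    private static final BlockingQueue<SelectionKey> keyPool = new LinkedBlockingQueue<>();

    public static void add(SelectionKey key) {
        keyPool.offer(key);
    }

    @Override
    public void run() {
        while (true) {
            try {
                SelectionKey myKey = keyPool.take(); // blocks until a key is available
                if (myKey.isValid()) {
                    // ... perform the read or write as before ...
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}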
Try using the xSocket library. It saved me a lot of time reading forums.
Download: http://xsocket.org/
Tutorial: http://xsocket.sourceforge.net/core/tutorial/V2/TutorialCore.htm
Server Code:
import org.xsocket.connection.*;
/**
*
* @author wsserver
*/
public class XServer {
protected static IServer server;
public static void main(String[] args) {
try {
server = new Server(9905, new XServerHandler());
server.start();
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
protected static void shutdownServer(){
try{
server.close();
}catch(Exception ex){
System.out.println(ex.getMessage());
}
}
}
Server Handler:
import java.io.IOException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.util.*;
import org.xsocket.*;
import org.xsocket.connection.*;
public class XServerHandler implements IConnectHandler, IDisconnectHandler, IDataHandler {
private Set<ConnectedClients> sessions = Collections.synchronizedSet(new HashSet<ConnectedClients>());
Charset charset = Charset.forName("ISO-8859-1");
CharsetEncoder encoder = charset.newEncoder();
CharsetDecoder decoder = charset.newDecoder();
ByteBuffer buffer = ByteBuffer.allocate(1024);
@Override
public boolean onConnect(INonBlockingConnection inbc) throws IOException, BufferUnderflowException, MaxReadSizeExceededException {
try {
synchronized (sessions) {
sessions.add(new ConnectedClients(inbc, inbc.getRemoteAddress()));
}
System.out.println("onConnect"+" IP:"+inbc.getRemoteAddress().getHostAddress()+" Port:"+inbc.getRemotePort());
} catch (Exception ex) {
System.out.println("onConnect: " + ex.getMessage());
}
return true;
}
@Override
public boolean onDisconnect(INonBlockingConnection inbc) throws IOException {
try {
synchronized (sessions) {
sessions.remove(inbc);
}
System.out.println("onDisconnect");
} catch (Exception ex) {
System.out.println("onDisconnect: " + ex.getMessage());
}
return true;
}
@Override
public boolean onData(INonBlockingConnection inbc) throws IOException, BufferUnderflowException, ClosedChannelException, MaxReadSizeExceededException {
inbc.read(buffer);
buffer.flip();
String request = decoder.decode(buffer).toString();
System.out.println("request:"+request);
buffer.clear();
return true;
}
}
Connected Clients:
import java.net.InetAddress;
import org.xsocket.connection.INonBlockingConnection;
/**
*
* @author wsserver
*/
public class ConnectedClients {
private INonBlockingConnection inbc;
private InetAddress address;
//CONSTRUCTOR
public ConnectedClients(INonBlockingConnection inbc, InetAddress address) {
this.inbc = inbc;
this.address = address;
}
//GETERS AND SETTERS
public INonBlockingConnection getInbc() {
return inbc;
}
public void setInbc(INonBlockingConnection inbc) {
this.inbc = inbc;
}
public InetAddress getAddress() {
return address;
}
public void setAddress(InetAddress address) {
this.address = address;
}
}
Client Code:
import java.net.InetAddress;
import org.xsocket.connection.INonBlockingConnection;
import org.xsocket.connection.NonBlockingConnection;
/**
*
* @author wsserver
*/
public class XClient {
protected static INonBlockingConnection inbc;
public static void main(String[] args) {
try {
inbc = new NonBlockingConnection(InetAddress.getByName("localhost"), 9905, new XClientHandler());
while(true){
}
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
}
Client Handler:
import java.io.IOException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import org.xsocket.MaxReadSizeExceededException;
import org.xsocket.connection.IConnectExceptionHandler;
import org.xsocket.connection.IConnectHandler;
import org.xsocket.connection.IDataHandler;
import org.xsocket.connection.IDisconnectHandler;
import org.xsocket.connection.INonBlockingConnection;
/**
*
* @author wsserver
*/
public class XClientHandler implements IConnectHandler, IDataHandler,IDisconnectHandler, IConnectExceptionHandler {
Charset charset = Charset.forName("ISO-8859-1");
CharsetEncoder encoder = charset.newEncoder();
CharsetDecoder decoder = charset.newDecoder();
ByteBuffer buffer = ByteBuffer.allocate(1024);
@Override
public boolean onConnect(INonBlockingConnection nbc) throws IOException {
System.out.println("Connected to server");
nbc.write("hello server\r\n");
return true;
}
@Override
public boolean onConnectException(INonBlockingConnection nbc, IOException ioe) throws IOException {
System.out.println("On connect exception:"+ioe.getMessage());
return true;
}
@Override
public boolean onDisconnect(INonBlockingConnection nbc) throws IOException {
System.out.println("Dissconected from server");
return true;
}
@Override
public boolean onData(INonBlockingConnection inbc) throws IOException, BufferUnderflowException, ClosedChannelException, MaxReadSizeExceededException {
inbc.read(buffer);
buffer.flip();
String request = decoder.decode(buffer).toString();
System.out.println(request);
buffer.clear();
return true;
}
}
I spent a lot of time reading forums about this; I hope I can help you with my code.
