I'm trying to do a multi-get on my Redis data store, which is distributed across multiple shards. However, the keys I want to fetch do not belong to the same shard, so I can't use Redis's built-in multi-get.
Instead I'm trying to use futures to achieve this. But after checking the lookup times, it looks as if these cache calls are being made serially.
The server handles about 1.5k requests/sec with an average response time of 10 ms. Literature I've read says the thread pool size should be requests/sec * response time. Since I'm spawning 3 threads per request, this becomes 1500 * 0.010 * 3 = 45. I've tried thread pool sizes of 50, 100 and 300, but this hasn't helped either.
I'm using Jedis as a client. I thought it could be an issue with exceeding Jedis' max total/idle connection limit. But even after increasing this from 8 to 24 I see no difference in lookup times.
I understand that some overhead will be there since there will be context switches and the overhead of spawning new threads.
Can anyone help me figure out what I'm missing? Let me know if you need more info.
for(String recordKey : pidArr) {
//Adding futures. Max 3
if(count >= 3) {
break;
}
count++;
Callable<String> a = new FeedCacheCaller(recordKey);
Future<String> future = feedThreadPool.submit(a);
futureList.add(future);
}
//Getting the data from the futures
for(Future<String> foo : futureList) {
try {
String data = foo.get();
logger.debug(data);
feedDataList.add(parseInfo(data));
} catch (Exception e) {
logger.error("somethings going wrong in retrieval",e);
}
}
Here's the Callable class
public class FeedCacheCaller implements Callable<String> {
String pid = null;
FeedCache feedCache;
public FeedCacheCaller(String pid) {
this.pid = pid;
this.feedCache = new FeedCache();
}
@Override
public String call() throws Exception {
return feedCache.get(pid);
}
}
Edit 1:
Here's the Jedis side code.
public class FeedCache {
private ShardedJedisPool feedClient = RedisPool.getPool("feed");
public String get(String key) {
ShardedJedis client = null;
String value = null;
try {
client = feedClient.getResource();
byte[] valueByteArray = client.get(key.getBytes(Constants.CHARSET));
if (valueByteArray != null) {
value = new String(CacheUtils.decompress(valueByteArray),
Constants.CHARSET);
}
} catch (JedisConnectionException e) {
if (client != null) {
feedClient.returnBrokenResource(client);
client = null;
}
logger.error(e.getMessage());
} finally {
if (client != null) {
feedClient.returnResource(client);
}
}
return value;
}
}
Here is the code that initializes the ShardedJedisPool
public class RedisPool {
private static final Logger logger = LoggerFactory.getLogger(
RedisPool.class);
private static ConcurrentHashMap<String, ShardedJedisPool> redisPools = new ConcurrentHashMap<String, ShardedJedisPool>();
public static void initializePool(String poolName) {
List<JedisShardInfo> shards = new ArrayList<JedisShardInfo>();
ArrayList<String> servers = new ArrayList<String>(Arrays.asList(
Constants.config.getStringArray(
poolName + "_redis_servers")));
for (int i = 0; i < servers.size(); i++) {
JedisShardInfo shardInfo = new JedisShardInfo(servers.get(i).split(":")[0], Integer.parseInt(servers.get(i).split(":")[1]));
shards.add(shardInfo);
}
redisPools.putIfAbsent(poolName,
new ShardedJedisPool(new GenericObjectPoolConfig(), shards));
}
public static ShardedJedisPool getPool(String poolName) {
if (!redisPools.containsKey(poolName)) {
synchronized (RedisPool.class) {
if (!redisPools.containsKey(poolName)) {
initializePool(poolName);
}
}
}
return redisPools.get(poolName);
}
public static void shutdown(String poolName) {
ShardedJedisPool pool = getPool(poolName);
pool.destroy();
redisPools.remove(poolName);
}
public static void main(String args[]) {
initializePool("vizidtoud");
}
}
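One way to narrow this down is to time the same submit-then-get pattern against a stand-in lookup with a known latency, with Jedis taken out of the picture. The following is only a sketch: ParallelFetch, fetchAll and slowLookup are hypothetical names, and slowLookup merely sleeps to imitate a cache call. If the three lookups overlap here but not in the real code, the serialization is more likely happening inside the cache client or its connection pool than in the futures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelFetch {
    /** Runs one lookup per key in parallel and returns the values in key order. */
    static List<String> fetchAll(List<String> keys, Function<String, String> lookup) {
        // A pool per call keeps the sketch short; real code would reuse a shared pool.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, keys.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String key : keys) {
                tasks.add(() -> lookup.apply(key));
            }
            List<String> values = new ArrayList<>();
            // invokeAll blocks until every task is done and keeps the task order.
            for (Future<String> future : pool.invokeAll(tasks)) {
                values.add(future.get());
            }
            return values;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("lookup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Hypothetical stand-in for a cache call: sleeps 100 ms, then echoes the key. */
    static String slowLookup(String key) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-of-" + key;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> values = fetchAll(Arrays.asList("k1", "k2", "k3"), ParallelFetch::slowLookup);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // If the calls overlap, the elapsed time should be close to one lookup, not three.
        System.out.println(values + " in " + elapsedMs + " ms");
    }
}
```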
I am trying to operate on the same resource with two threads. I set up a typical producer-consumer problem for it. While the producer sets values in the resource class, I want the consumer to get those values one by one. The output I want should look like this:
Producer -> Setting data = 0
Consumer -> Getting data = 0
Producer -> Setting data = 1
Consumer -> Getting data = 1
Producer -> Setting data = 2
Consumer -> Getting data = 2
Producer -> Setting data = 3
Consumer -> Getting data = 3
Producer -> Setting data = 4
Consumer -> Getting data = 4
Here is my Resource class:
public class Resource{
private int value;
private boolean current = false;
public synchronized void setValue(int val) {
while(current == true) {
try {
wait();
}catch(Exception ex) {}}
value = val;
current = true;
notifyAll();
}
public synchronized int getValue() {
while(current == false) {
try {
wait();
}catch(Exception ex) {}}
current = false;
notifyAll();
return value;
}
}
And here are the main method and the Producer and Consumer classes:
class Producer extends Thread{
private Resource rs;
public Producer(Resource rs1) {
rs = rs1;
}
public void run() {
for(int i = 0 ; i < 5 ; i++) {
rs.setValue(i);
System.out.println("Producer -> Setting data = " + i);
try {
sleep(100);
}catch(Exception ex){
ex.printStackTrace();
}
}
}
}
class Consumer extends Thread{
private Resource rs;
public Consumer(Resource rs1) {
rs = rs1;
}
public void run() {
int value = 0;
for(int i = 0 ; i < 5; i++) {
value = rs.getValue();
System.out.println("Consumer -> Getting data= " + i);
try {
sleep(100);
}catch(Exception ex) {
ex.printStackTrace();
}
}
}
}
public class Dependent {
public static void main(String[] args) throws IOException {
Resource res = new Resource();
Producer p1 = new Producer(res);
Consumer c1 = new Consumer(res);
p1.start();
c1.start();
}
}
Although I use synchronized, wait and notifyAll in the methods of the Resource class, the threads continue to work without waiting for each other. Where am I making a mistake? I've seen a similar code sample in a Java book, and there doesn't seem to be a problem there.
When I write it without the current boolean variable, the code doesn't work at all. That's why I added it, following the book. Shouldn't the threads work in sync without checking the current value?
They do wait for each other, but the thread sync operations are much, much faster than Thread.sleep(100) so you can't tell. Your test code prints 'i' and not 'value', which is suspect. Get rid of Thread.sleep(100) in one of these threads (for example, in the consumer) and you'll find that the consumer nevertheless still requires about half a second to complete - as it will be waiting about 100 msec every time it invokes .getValue() on the resource, because that call will block (stuck in that wait() loop) until the producer calls .setValue which it only does about once every 100 msec.
Your Resource object 'works', for some value of 'works', but is very poorly designed: it re-creates classes that already exist, better implemented, in the core library, such as the hand-off classes in java.util.concurrent (for example SynchronousQueue), and it ignores interrupts and will blindly just keep waiting.
Its API is also a tad oddly named, in that a get call has considerable side effects. get is more of a get-and-clear operation: after a get, another get will block the thread forever, or at least until some thread sets a value.
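To illustrate the point about existing library classes, here is a minimal sketch of the same one-value-at-a-time hand-off built on java.util.concurrent.SynchronousQueue. HandOffDemo and run are hypothetical names, and this is one possible substitute rather than the only one. put blocks until a consumer takes the value, take blocks until a producer offers one, and both respond to interrupts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.SynchronousQueue;

class HandOffDemo {
    /** Hands the values 0..count-1 from a producer thread to the caller, one at a time. */
    static List<Integer> run(int count) {
        SynchronousQueue<Integer> queue = new SynchronousQueue<>();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i); // blocks until the consumer takes the value
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop producing when interrupted
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                received.add(queue.take()); // blocks until the producer offers a value
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            producer.interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println("Consumer received " + run(5));
    }
}
```

Because each put waits for the matching take, the producer can never get ahead of the consumer, which is the behaviour the question is after.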
What do you think?
import java.io.IOException;
class Resource {
private volatile Integer value;
public synchronized void setValue(int val) {
while(value != null && !value.equals(val)) {
try {
wait();
}catch(Exception ex) {}}
value = val;
notifyAll();
}
public synchronized int getValue() {
while(value == null) {
try {
wait();
}catch(Exception ex) {}}
int answer = value;
value = null;
notifyAll();
return answer;
}
}
class Producer extends Thread{
private Resource rs;
public Producer(Resource rs1) {
rs = rs1;
}
public void run() {
for(int i = 0 ; i < 5 ; i++) {
rs.setValue(i);
System.out.println("Producer -> Setting data = " + i);
try {
sleep(100);
}catch(Exception ex){
ex.printStackTrace();
}
}
}
}
class Consumer extends Thread{
private Resource rs;
public Consumer(Resource rs1) {
rs = rs1;
}
public void run() {
for(int i = 0 ; i < 5; i++) {
int value = rs.getValue();
System.out.println("Consumer -> Getting data= " + value);
try {
sleep(100);
}catch(Exception ex) {
ex.printStackTrace();
}
}
}
}
public class Dependent {
public static void main(String[] args) throws IOException {
Resource res = new Resource();
Producer p1 = new Producer(res);
Consumer c1 = new Consumer(res);
p1.start();
c1.start();
}
}
or
class Resource {
private static final int WAIT_VALUE = -1;
private volatile int value = WAIT_VALUE;
public synchronized void setValue(int val) {
while(value > WAIT_VALUE && value != val) {
try {
wait();
}catch(Exception ex) {}}
value = val;
notifyAll();
}
public synchronized int getValue() {
while(value == WAIT_VALUE) {
try {
wait();
}catch(Exception ex) {}}
int answer = value;
value = WAIT_VALUE;
notifyAll();
return answer;
}
}
I have a list of machines in an ArrayList. I am checking whether all those machines are up, so I make an HTTP call to each of them: if a machine responds it is up, and if it doesn't respond it is down. I consider a machine dead if it doesn't respond after 10 retries.
Now I need to iterate over that hostnames list in parallel, so I came up with the code below. Are there any multithreading issues, or is there anything that can be improved here? Am I using the executor service correctly?
private static final ExecutorService executorService = Executors.newFixedThreadPool(20);
public static void main(String[] args) throws Exception {
List<String> hostnames = new ArrayList<>();
//.. populating hostnames
List<Future<Void>> futures = new ArrayList<Future<Void>>();
for (final String machine : hostnames) {
Callable<Void> callable = new Callable<Void>() {
public Void call() throws Exception {
String url = "http://" + machine + ":8080/ruok";
if (!checkServer(url)) {
System.out.println("server down: " + machine);
}
return null;
}
};
futures.add(executorService.submit(callable));
}
executorService.shutdown();
for (Future<Void> future : futures) {
future.get();
}
System.out.println("All done!");
}
private static boolean checkServer(final String url) {
boolean isUp = false;
int retry = 10;
for (int i = 1; i <= retry; i++) {
try {
RestClient.getInstance().getClient().getForObject(url, String.class);
isUp = true;
break;
} catch (Exception ex) {
// log exception
}
}
return isUp;
}
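One possible variation, shown only as a sketch, is to have each task return whether the host is up so the caller gets a result per machine instead of printing from inside the task. HostChecker, isUp and checkAll are hypothetical names, and the probe predicate stands in for the real HTTP call. The pool is also shut down in a finally block so it is released even if collecting results fails.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

class HostChecker {
    /** Returns true as soon as one attempt succeeds, false after maxAttempts failures. */
    static boolean isUp(String host, Predicate<String> probe, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (probe.test(host)) {
                    return true;
                }
            } catch (RuntimeException ex) {
                // Treat an exception as a failed attempt and try again.
            }
        }
        return false;
    }

    /** Checks all hosts in parallel and returns host -> up/down in input order. */
    static Map<String, Boolean> checkAll(List<String> hosts, Predicate<String> probe,
                                         int maxAttempts, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Boolean>> futures = new LinkedHashMap<>();
            for (String host : hosts) {
                Callable<Boolean> task = () -> isUp(host, probe, maxAttempts);
                futures.put(host, pool.submit(task));
            }
            Map<String, Boolean> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> entry : futures.entrySet()) {
                try {
                    result.put(entry.getKey(), entry.getValue().get());
                } catch (ExecutionException ex) {
                    result.put(entry.getKey(), false); // the task itself failed
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    result.put(entry.getKey(), false);
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```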
I am working on an application that sends data to ZeroMQ. Below is what my application does:
I have a class SendToZeroMQ that sends data to ZeroMQ.
It adds the same data to retryQueue in the same class so that it can be retried later if no acknowledgement is received. retryQueue is a Guava cache with a maximumSize limit.
A separate thread receives acknowledgements from ZeroMQ for the data that was sent earlier. If an acknowledgement is not received, SendToZeroMQ retries sending that same piece of data. If an acknowledgement is received, we remove the record from retryQueue so that it is not retried again.
The idea is very simple, and I have to make sure my retry policy works so that I don't lose data in the rare case where we don't receive acknowledgements.
I am thinking of building two types of retry policy, but I don't understand how to build them to fit my program:
RetryNTimes: retries N times with a fixed sleep between retries, and after that drops the record.
ExponentialBackoffRetry: retries with exponentially increasing delays. We can set a max retry limit, after which it stops retrying and drops the record.
Below is my SendToZeroMQ class, which sends data to ZeroMQ, retries every 30 seconds from a background thread, and starts the ResponsePoller runnable, which keeps running forever:
public class SendToZeroMQ {
private final ScheduledExecutorService executorService = Executors.newScheduledThreadPool(5);
private final Cache<Long, byte[]> retryQueue =
CacheBuilder
.newBuilder()
.maximumSize(10000000)
.concurrencyLevel(200)
.removalListener(
RemovalListeners.asynchronous(new CustomListener(), executorService)).build();
private static class Holder {
private static final SendToZeroMQ INSTANCE = new SendToZeroMQ();
}
public static SendToZeroMQ getInstance() {
return Holder.INSTANCE;
}
private SendToZeroMQ() {
executorService.submit(new ResponsePoller());
// retry every 30 seconds for now
executorService.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
for (Entry<Long, byte[]> entry : retryQueue.asMap().entrySet()) {
sendTo(entry.getKey(), entry.getValue());
}
}
}, 0, 30, TimeUnit.SECONDS);
}
public boolean sendTo(final long address, final byte[] encodedRecords) {
Optional<ZMQSocketInfo> liveSockets = PoolManager.getInstance().getNextSocket();
if (!liveSockets.isPresent()) {
return false;
}
return sendTo(address, encodedRecords, liveSockets.get().getSocket());
}
public boolean sendTo(final long address, final byte[] encodedByteArray, final Socket socket) {
ZMsg msg = new ZMsg();
msg.add(encodedByteArray);
boolean sent = msg.send(socket);
msg.destroy();
// adding to retry queue
retryQueue.put(address, encodedByteArray);
return sent;
}
public void removeFromRetryQueue(final long address) {
retryQueue.invalidate(address);
}
}
Below is my ResponsePoller class, which polls for all acknowledgements from ZeroMQ. If we get an acknowledgement back, we remove that record from the retry queue so that it doesn't get retried; otherwise it will be retried.
public class ResponsePoller implements Runnable {
private static final Random random = new Random();
@Override
public void run() {
ZContext ctx = new ZContext();
Socket client = ctx.createSocket(ZMQ.PULL);
String identity = String.format("%04X-%04X", random.nextInt(), random.nextInt());
client.setIdentity(identity.getBytes(ZMQ.CHARSET));
client.bind("tcp://" + TestUtils.getIpaddress() + ":8076");
PollItem[] items = new PollItem[] {new PollItem(client, Poller.POLLIN)};
while (!Thread.currentThread().isInterrupted()) {
// Tick once per second, pulling in arriving messages
for (int centitick = 0; centitick < 100; centitick++) {
ZMQ.poll(items, 10);
if (items[0].isReadable()) {
ZMsg msg = ZMsg.recvMsg(client);
Iterator<ZFrame> it = msg.iterator();
while (it.hasNext()) {
ZFrame frame = it.next();
try {
long address = TestUtils.getAddress(frame.getData());
// remove from retry queue since we got the acknowledgment for this record
SendToZeroMQ.getInstance().removeFromRetryQueue(address);
} catch (Exception ex) {
// log error
} finally {
frame.destroy();
}
}
msg.destroy();
}
}
}
ctx.destroy();
}
}
Question:
As you can see above, I am sending encodedRecords to ZeroMQ using the SendToZeroMQ class, and each record is retried every 30 seconds depending on whether we got an acknowledgement back from the ResponsePoller class.
Each encodedRecords has a unique key called address, and that is what we get back from ZeroMQ as an acknowledgement.
How can I extend this example to build the two retry policies I mentioned above, so that I can pick which retry policy to use while sending data? I came up with the interface below, but I don't understand how to move forward to implement those retry policies and use them in the code above.
public interface RetryPolicy {
/**
* Called when an operation has failed for some reason. This method should return
* true to make another attempt.
*/
public boolean allowRetry(int retryCount, long elapsedTimeMs);
}
Can I use guava-retrying or failsafe here, because these libraries already have many retry policies that I could use?
I am not able to work out all the details regarding how to use the relevant APIs, but as for the algorithm, you could try:
the retry policy needs to have some sort of state attached to each message (at least the number of times the current message has been retried, possibly what the current delay is). You need to decide whether the RetryPolicy should keep that itself or if you want to store it inside the message.
instead of allowRetry, you could have a method calculating when the next retry should occur (in absolute time or as a number of milliseconds in the future), which will be a function of the state mentioned above
the retry queue should contain information on when each message should be retried.
instead of using scheduleAtFixedRate, find the message in the retry queue which has the lowest when_is_next_retry (possibly by sorting on absolute retry-timestamp and picking the first), and let the executorService reschedule itself using schedule and the time_to_next_retry
for each retry, pull it from the retry queue, send the message, use the RetryPolicy for calculating when the next retry should be (if it is to be retried) and insert back into the retry queue with a new value for when_is_next_retry (if the RetryPolicy returns -1, it could mean that the message shall not be retried any more)
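The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a drop-in implementation: RetrySketch, nextDelayMs, Pending and scheduleNext are hypothetical names, the per-message state lives in the Pending object, and a plain PriorityQueue is used for brevity, so the sketch is not thread safe (a real version would need a concurrent structure such as PriorityBlockingQueue or DelayQueue). The policy returns the delay before the next retry, or -1 when the record should be dropped.

```java
import java.util.PriorityQueue;

class RetrySketch {
    /** Returns the delay in ms before the given retry (1-based), or -1 to give up. */
    interface RetryPolicy {
        long nextDelayMs(int retry);
    }

    /** Retries up to maxRetries times with a fixed sleep between retries. */
    static RetryPolicy retryNTimes(int maxRetries, long sleepMs) {
        return retry -> retry <= maxRetries ? sleepMs : -1;
    }

    /** Doubles the delay on each retry, capped at maxDelayMs, up to maxRetries times. */
    static RetryPolicy exponentialBackoff(int maxRetries, long baseMs, long maxDelayMs) {
        return retry -> {
            if (retry > maxRetries) {
                return -1;
            }
            // The shift is capped so that very high retry numbers cannot wrap around.
            long delay = baseMs << Math.min(retry - 1, 20);
            return Math.min(delay, maxDelayMs);
        };
    }

    /** Per-message retry state, ordered by the time the next retry is due. */
    static final class Pending implements Comparable<Pending> {
        final long address;
        final RetryPolicy policy;
        int retries;        // retries scheduled so far
        long nextRetryAtMs; // absolute time of the next retry

        Pending(long address, RetryPolicy policy) {
            this.address = address;
            this.policy = policy;
        }

        @Override
        public int compareTo(Pending other) {
            return Long.compare(nextRetryAtMs, other.nextRetryAtMs);
        }
    }

    /** Queues the next retry for p; returns false if the policy says to drop it. */
    static boolean scheduleNext(PriorityQueue<Pending> queue, Pending p, long nowMs) {
        long delay = p.policy.nextDelayMs(p.retries + 1);
        if (delay < 0) {
            return false; // policy exhausted: the caller drops the record
        }
        p.retries++;
        p.nextRetryAtMs = nowMs + delay;
        queue.add(p);
        return true;
    }
}
```

A scheduler thread would then peek at the head of the queue, wait until its nextRetryAtMs, poll it, resend, and call scheduleNext again. An acknowledgement simply removes the entry.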
This isn't a perfect way, but it can also be achieved as follows.
public interface RetryPolicy {
public boolean allowRetry();
public void decreaseRetryCount();
}
Create two implementations. For RetryNTimes:
public class RetryNTimes implements RetryPolicy {
private int maxRetryCount;
public RetryNTimes(int maxRetryCount) {
this.maxRetryCount = maxRetryCount;
}
public boolean allowRetry() {
return maxRetryCount > 0;
}
public void decreaseRetryCount()
{
maxRetryCount = maxRetryCount-1;
}}
For ExponentialBackoffRetry
public class ExponentialBackoffRetry implements RetryPolicy {
private int maxRetryCount;
private final Date retryUpto;
public ExponentialBackoffRetry(int maxRetryCount, Date retryUpto) {
this.maxRetryCount = maxRetryCount;
this.retryUpto = retryUpto;
}
public boolean allowRetry() {
Date date = new Date();
if(maxRetryCount <= 0 || date.compareTo(retryUpto)>=0)
{
return false;
}
return true;
}
public void decreaseRetryCount() {
maxRetryCount = maxRetryCount-1;
}}
You need to make some changes in the SendToZeroMQ class:
public class SendToZeroMQ {
private final ScheduledExecutorService executorService = Executors.newScheduledThreadPool(5);
private final Cache<Long,RetryMessage> retryQueue =
CacheBuilder
.newBuilder()
.maximumSize(10000000)
.concurrencyLevel(200)
.removalListener(
RemovalListeners.asynchronous(new CustomListener(), executorService)).build();
private static class Holder {
private static final SendToZeroMQ INSTANCE = new SendToZeroMQ();
}
public static SendToZeroMQ getInstance() {
return Holder.INSTANCE;
}
private SendToZeroMQ() {
executorService.submit(new ResponsePoller());
// retry every 30 seconds for now
executorService.scheduleAtFixedRate(new Runnable() {
public void run() {
for (Map.Entry<Long, RetryMessage> entry : retryQueue.asMap().entrySet()) {
RetryMessage retryMessage = entry.getValue();
if(retryMessage.getRetryPolicy().allowRetry())
{
retryMessage.getRetryPolicy().decreaseRetryCount();
entry.setValue(retryMessage);
sendTo(entry.getKey(), retryMessage.getMessage(),retryMessage);
}else
{
retryQueue.asMap().remove(entry.getKey());
}
}
}
}, 0, 30, TimeUnit.SECONDS);
}
public boolean sendTo(final long address, final byte[] encodedRecords, RetryMessage retryMessage) {
Optional<ZMQSocketInfo> liveSockets = PoolManager.getInstance().getNextSocket();
if (!liveSockets.isPresent()) {
return false;
}
if(null==retryMessage)
{
RetryPolicy retryPolicy = new RetryNTimes(10);
retryMessage = new RetryMessage(retryPolicy,encodedRecords);
retryQueue.asMap().put(address,retryMessage);
}
return sendTo(address, encodedRecords, liveSockets.get().getSocket());
}
public boolean sendTo(final long address, final byte[] encodedByteArray, final ZMQ.Socket socket) {
ZMsg msg = new ZMsg();
msg.add(encodedByteArray);
boolean sent = msg.send(socket);
msg.destroy();
return sent;
}
public void removeFromRetryQueue(final long address) {
retryQueue.invalidate(address);
}}
Here is a working little simulation of your environment that shows how this can be done. Note the Guava cache is the wrong data structure here, since you aren't interested in eviction (I think). So I'm using a concurrent hashmap:
package experimental;
import static java.util.concurrent.TimeUnit.MILLISECONDS;
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
class Experimental {
/** Return the desired backoff delay in millis for the given retry number, which is 1-based. */
interface RetryStrategy {
long getDelayMs(int retry);
}
enum ConstantBackoff implements RetryStrategy {
INSTANCE;
@Override
public long getDelayMs(int retry) {
return 1000L;
}
}
enum ExponentialBackoff implements RetryStrategy {
INSTANCE;
@Override
public long getDelayMs(int retry) {
return 100 + (1L << retry);
}
}
static class Sender {
private final ScheduledExecutorService executorService = Executors.newScheduledThreadPool(4);
private final ConcurrentMap<Long, Retrier> pending = new ConcurrentHashMap<>();
/** Send the given data with given address on the given socket. */
void sendTo(long addr, byte[] data, int socket) {
System.err.println("Sending " + Arrays.toString(data) + "#" + addr + " on " + socket);
}
private class Retrier implements Runnable {
private final RetryStrategy retryStrategy;
private final long addr;
private final byte[] data;
private final int socket;
private int retry;
private Future<?> future;
Retrier(RetryStrategy retryStrategy, long addr, byte[] data, int socket) {
this.retryStrategy = retryStrategy;
this.addr = addr;
this.data = data;
this.socket = socket;
this.retry = 0;
}
synchronized void start() {
if (future == null) {
future = executorService.submit(this);
pending.put(addr, this);
}
}
synchronized void cancel() {
if (future != null) {
future.cancel(true);
future = null;
}
}
private synchronized void reschedule() {
if (future != null) {
future = executorService.schedule(this, retryStrategy.getDelayMs(++retry), MILLISECONDS);
}
}
@Override
synchronized public void run() {
sendTo(addr, data, socket);
reschedule();
}
}
long getVerifiedAddr() {
System.err.println("Pending messages: " + pending.size());
Iterator<Long> i = pending.keySet().iterator();
long addr = i.hasNext() ? i.next() : 0;
return addr;
}
class CancellationPoller implements Runnable {
@Override
public void run() {
while (!Thread.currentThread().isInterrupted()) {
try {
Thread.sleep(1000);
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
}
long addr = getVerifiedAddr();
if (addr == 0) {
continue;
}
System.err.println("Verified message (to be cancelled) " + addr);
Retrier retrier = pending.remove(addr);
if (retrier != null) {
retrier.cancel();
}
}
}
}
Sender initialize() {
executorService.submit(new CancellationPoller());
return this;
}
void sendWithRetriesTo(RetryStrategy retryStrategy, long addr, byte[] data, int socket) {
new Retrier(retryStrategy, addr, data, socket).start();
}
}
public static void main(String[] args) {
Sender sender = new Sender().initialize();
for (long i = 1; i <= 10; i++) {
sender.sendWithRetriesTo(ConstantBackoff.INSTANCE, i, null, 42);
}
for (long i = -1; i >= -10; i--) {
sender.sendWithRetriesTo(ExponentialBackoff.INSTANCE, i, null, 37);
}
}
}
You can use Apache Camel. It provides a component for ZeroMQ, and tools like an error handler, redelivery policy, dead letter channel and so on are provided natively.
I have an .rpt file from which I will be generating multiple reports in PDF format, using the Engine class from i-net Clear Reports. The process takes very long, as I have nearly 10000 reports to generate. Can I use multithreading or some other approach to speed up the process?
Any help on how this can be done would be appreciated.
My partial code:
//Loops
Engine eng = new Engine(Engine.EXPORT_PDF);
eng.setReportFile(rpt); //rpt is the report name
if (cn == null || cn.isClosed()) {
cn = ds.getConnection();
}
eng.setConnection(cn);
System.out.println(" After set connection");
eng.setPrompt(data[i], 0);
ReportProperties repprop = eng.getReportProperties();
repprop.setPaperOrient(ReportProperties.DEFAULT_PAPER_ORIENTATION, ReportProperties.PAPER_FANFOLD_US);
eng.execute();
System.out.println(" After execute");
try {
PDFExportThread pdfExporter = new PDFExportThread(eng, sFileName, sFilePath);
pdfExporter.execute();
} catch (Exception e) {
e.printStackTrace();
}
PDFExportThread execute
public void execute() throws IOException {
FileOutputStream fos = null;
try {
String FileName = sFileName + "_" + (eng.getPageCount() - 1);
File file = new File(sFilePath + FileName + ".pdf");
if (!file.getParentFile().exists()) {
file.getParentFile().mkdirs();
}
if (!file.exists()) {
file.createNewFile();
}
fos = new FileOutputStream(file);
for (int k = 1; k <= eng.getPageCount(); k++) {
fos.write(eng.getPageData(k));
}
fos.flush();
fos.close();
} catch (Exception e) {
e.printStackTrace();
} finally {
if (fos != null) {
fos.close();
fos = null;
}
}
}
This is very basic code. A ThreadPoolExecutor with a fixed number of threads is the backbone.
Some considerations:
The thread pool size should be less than or equal to the DB connection pool size. It should also be a number that is reasonable for running Engines in parallel.
The main thread should wait for sufficient time before killing all threads. I have put 1 hour as the wait time, but that's just an example.
You'll need to have proper Exception handling.
From the API doc, I saw stopAll and shutdown methods from the Engine class. So, I'm invoking that as soon as our work is done. That's again, just an example.
Hope this helps.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.sql.Connection;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
public class RunEngine {
public static void main(String[] args) throws Exception {
final String rpt = "/tmp/rpt/input/rpt-1.rpt";
final String sFilePath = "/tmp/rpt/output/";
final String sFileName = "pdfreport";
final Object[] data = new Object[10];
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(10);
for (int i = 0; i < data.length; i++) {
PDFExporterRunnable runnable = new PDFExporterRunnable(rpt, data[i], sFilePath, sFileName, i);
executor.execute(runnable);
}
executor.shutdown();
executor.awaitTermination(1L, TimeUnit.HOURS);
Engine.stopAll();
Engine.shutdown();
}
private static class PDFExporterRunnable implements Runnable {
private final String rpt;
private final Object data;
private final String sFilePath;
private final String sFileName;
private final int runIndex;
public PDFExporterRunnable(String rpt, Object data, String sFilePath,
String sFileName, int runIndex) {
this.rpt = rpt;
this.data = data;
this.sFilePath = sFilePath;
this.sFileName = sFileName;
this.runIndex = runIndex;
}
@Override
public void run() {
// Loops
Engine eng = new Engine(Engine.EXPORT_PDF);
eng.setReportFile(rpt); // rpt is the report name
Connection cn = null;
/*
* DB connection related code. Check and use.
*/
//if (cn == null || cn.isClosed()) {
//cn = ds.getConnection();
//}
eng.setConnection(cn);
System.out.println(" After set connection");
eng.setPrompt(data, 0);
ReportProperties repprop = eng.getReportProperties();
repprop.setPaperOrient(ReportProperties.DEFAULT_PAPER_ORIENTATION,
ReportProperties.PAPER_FANFOLD_US);
eng.execute();
System.out.println(" After execute");
FileOutputStream fos = null;
try {
String FileName = sFileName + "_" + runIndex;
File file = new File(sFilePath + FileName + ".pdf");
if (!file.getParentFile().exists()) {
file.getParentFile().mkdirs();
}
if (!file.exists()) {
file.createNewFile();
}
fos = new FileOutputStream(file);
for (int k = 1; k <= eng.getPageCount(); k++) {
fos.write(eng.getPageData(k));
}
fos.flush();
fos.close();
} catch (Exception e) {
e.printStackTrace();
} finally {
if (fos != null) {
try {
fos.close();
} catch (IOException e) {
e.printStackTrace();
}
fos = null;
}
}
}
}
/*
* Dummy classes to avoid compilation errors.
*/
private static class ReportProperties {
public static final String PAPER_FANFOLD_US = null;
public static final String DEFAULT_PAPER_ORIENTATION = null;
public void setPaperOrient(String defaultPaperOrientation, String paperFanfoldUs) {
}
}
private static class Engine {
public static final int EXPORT_PDF = 1;
public Engine(int exportType) {
}
public static void shutdown() {
}
public static void stopAll() {
}
public void setPrompt(Object singleData, int i) {
}
public byte[] getPageData(int k) {
return null;
}
public int getPageCount() {
return 0;
}
public void execute() {
}
public ReportProperties getReportProperties() {
return null;
}
public void setConnection(Connection cn) {
}
public void setReportFile(String reportFile) {
}
}
}
I will offer this "answer" as a possible quick & dirty solution to get you started on a parallelization effort.
One way or another you're going to build a render farm.
I don't think there is a trivial way to do this in Java; I would love to have someone post an answer that shows how to parallelize your example in just a few lines of code. But until that happens this will hopefully help you make some progress.
You're going to have limited scaling in the same JVM instance.
But... let's see how far you get with that and see if it helps enough.
Design challenge #1: restarting.
You will probably want a place to keep the status for each of your reports e.g. "units of work".
You want this in case you need to re-start everything (maybe your server crashes) and you don't want to re-run all of the reports thus far.
Lots of ways you can do this; database, check to see if a "completed" file exists in your report folder (not sufficient for the *.pdf to exist, as that may be incomplete... for xyz_200.pdf you could maybe make an empty xyz_200.done or xyz_200.err file to help with re-running any problem children... and by the time you code up that file manipulation/checking/initialization logic, seems like it may have been easier to add a column to your database which holds the list of work to-be-done).
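A minimal sketch of the marker-file variant, assuming one empty .done or .err file per report next to the PDF. ReportStatus and its method names are hypothetical and not part of the reporting library. The marker is written only after the PDF has been fully written, so its presence is a safer completion signal than the PDF itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ReportStatus {
    private final Path dir;

    ReportStatus(Path dir) {
        this.dir = dir;
    }

    /** Creates a tracker in a fresh temporary directory, handy for experiments. */
    static ReportStatus inTempDir(String prefix) {
        try {
            return new ReportStatus(Files.createTempDirectory(prefix));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if a previous run finished this report, for example xyz_200.done exists. */
    boolean isComplete(String reportName) {
        return Files.exists(dir.resolve(reportName + ".done"));
    }

    /** True if a previous run recorded a failure, for example xyz_200.err exists. */
    boolean hasFailed(String reportName) {
        return Files.exists(dir.resolve(reportName + ".err"));
    }

    void markDone(String reportName) {
        touch(reportName + ".done");
    }

    void markFailed(String reportName) {
        touch(reportName + ".err");
    }

    /** Creates (or truncates) an empty marker file in the status directory. */
    private void touch(String fileName) {
        try {
            Files.createDirectories(dir);
            Files.write(dir.resolve(fileName), new byte[0]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On restart, the main loop would skip any report for which isComplete returns true and re-run the rest.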
Design consideration #2: maximizing throughput (avoiding overload).
You don't want to saturate your system and run one thousand reports in parallel.
Maybe 10.
Maybe 100.
Probably not 5,000.
You will need to do some sizing research and see what gets you near 80 to 90% system utilization.
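For the sizing experiments, a fixed-size pool is one simple way to cap how many reports run at once. The sketch below is an assumption-laden toy: BoundedRunner and peakConcurrency are hypothetical names, and the short sleep stands in for rendering a report. It queues many tasks on a small pool and records the highest number that were running at the same time, which never exceeds the pool size.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedRunner {
    /** Runs taskCount dummy tasks on poolSize threads; returns the peak number running at once. */
    static int peakConcurrency(int taskCount, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the highest value seen
                try {
                    Thread.sleep(5); // stand-in for generating one report
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown(); // no new tasks; queued ones still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("Peak concurrency: " + peakConcurrency(50, 4));
    }
}
```

Trying a few pool sizes while watching CPU, memory and database load is one way to find the point where throughput stops improving.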
Design consideration #3: scaling across multiple servers
Overly complex, outside the scope of a Stack Exchange answer.
You'd have to spin up JVMs on multiple systems, each running something like the workers below, plus a report manager that can pull work items from a shared "queue" structure. Again, a database table is probably easier here than doing something file-based (or a network feed).
Sample Code
Caution: None of this code is well tested, it almost certainly has an abundance of typos, logic errors and poor design. Use at your own risk.
So anyway... I do want to give you the basic idea of a rudimentary task runner.
Replace your "// Loops" example in the question with code like the following:
main loop (original code example)
This is more or less doing what your example code did, modified to push most of the work into ReportWorker (new class, see below). Lots of stuff seems to be packed into your original question's example of "// Loop", so I'm not trying to reverse engineer that.
fwiw, it was unclear to me where "rpt" and "data[i]" are coming from so I hacked up some test data.
public class Main {
public static boolean complete( String data ) {
return false; // for testing nothing is complete.
}
public static void main(String args[] ) {
String data[] = new String[] {
"A",
"B",
"C",
"D",
"E" };
String rpt = "xyz";
// Loop
ReportManager reportMgr = new ReportManager(); // a new helper class (see below), it assigns/monitors work.
long startTime = System.currentTimeMillis();
for( int i = 0; i < data.length; ++i ) {
// complete is something you should write that knows if a report "unit of work"
// finished successfully.
if( !complete( data[i] ) ) {
reportMgr.assignWork( rpt, data[i] ); // so... where did values for your "rpt" variable come from?
}
}
reportMgr.waitForWorkToFinish(); // out of new work to assign, let's wait until everything in-flight completes.
long endTime = System.currentTimeMillis();
System.out.println("Done. Elapsed time = " + (endTime - startTime)/1000 +" seconds.");
}
}
ReportManager
This class is not thread safe. Have your original loop keep calling assignWork() until you're out of reports to assign, then call waitForWorkToFinish() to wait until all work is done, as shown above. (fwiw, I don't think you could say any of the classes here are especially thread safe.)
public class ReportManager {
public int polling_delay = 500; // wait 0.5 seconds for testing.
//public int polling_delay = 60 * 1000; // wait 1 minute.
// not high throughput millions of reports / second, we'll run at a slower tempo.
public int nWorkers = 3; // just 3 for testing.
public int assignedCnt = 0;
public ReportWorker workers[];
public ReportManager() {
// initialize our manager.
workers = new ReportWorker[ nWorkers ];
for( int i = 0; i < nWorkers; ++i ) {
workers[i] = new ReportWorker( i );
System.out.println("Created worker #"+i);
}
}
private ReportWorker handleWorkerError( int i ) {
// something went wrong, update our "report" status as one of the reports failed.
System.out.println("handleWorkerError(): failure in "+workers[i]+", resetting worker.");
workers[i].teardown();
workers[i] = new ReportWorker( i ); // just replace everything.
return workers[i]; // the new worker will, incidentally, be available.
}
private ReportWorker handleWorkerComplete( int i ) {
// this unit of work was completed, update our "report" status tracker as success.
System.out.println("handleWorkerComplete(): success in "+workers[i]+", resetting worker.");
workers[i].teardown();
workers[i] = new ReportWorker( i ); // just replace everything.
return workers[i]; // the new worker will, incidentally, be available.
}
private int activeWorkerCount() {
int activeCnt = 0;
for( int i = 0; i < nWorkers; ++i ) {
ReportWorker worker = workers[i];
System.out.println("activeWorkerCount() i="+i+", checking worker="+worker);
if( worker.hasError() ) {
worker = handleWorkerError( i );
}
if( worker.isComplete() ) {
worker = handleWorkerComplete( i );
}
if( worker.isInitialized() || worker.isRunning() ) {
++activeCnt;
}
}
System.out.println("activeWorkerCount() activeCnt="+activeCnt);
return activeCnt;
}
private ReportWorker getAvailableWorker() {
// check each worker to see if anybody recently completed...
// This (rather lazily) creates completely new ReportWorker instances.
// You might want to try pooling (salvaging and reinitializing them)
// to see if that helps your performance.
System.out.println("\n-----");
ReportWorker firstAvailable = null;
for( int i = 0; i < nWorkers; ++i ) {
ReportWorker worker = workers[i];
System.out.println("getAvailableWorker(): i="+i+" worker="+worker);
if( worker.hasError() ) {
worker = handleWorkerError( i );
}
if( worker.isComplete() ) {
worker = handleWorkerComplete( i );
}
if( worker.isAvailable() && firstAvailable==null ) {
System.out.println("Apparently worker "+worker+" is 'available'");
firstAvailable = worker;
System.out.println("getAvailableWorker(): i="+i+" now firstAvailable = "+firstAvailable);
}
}
return firstAvailable; // May (or may not) be null.
}
public void assignWork( String rpt, String data ) {
ReportWorker worker = getAvailableWorker();
while( worker == null ) {
System.out.println("assignWork: No workers available, sleeping for "+polling_delay);
try { Thread.sleep( polling_delay ); }
catch( InterruptedException e ) { System.out.println("assignWork: sleep interrupted, ignoring exception "+e); }
// any workers available now?
worker = getAvailableWorker();
}
++assignedCnt;
worker.initialize( rpt, data ); // or whatever else you need.
System.out.println("assignment #"+assignedCnt+" given to "+worker);
Thread t = new Thread( worker );
t.start( ); // that is pretty much it, let it go.
}
public void waitForWorkToFinish() {
int active = activeWorkerCount();
while( active >= 1 ) {
System.out.println("waitForWorkToFinish(): #active workers="+active+", waiting...");
// wait a minute....
try { Thread.sleep( polling_delay ); }
catch( InterruptedException e ) { System.out.println("waitForWorkToFinish: sleep interrupted, ignoring exception "+e); }
active = activeWorkerCount();
}
}
}
ReportWorker
public class ReportWorker implements Runnable {
int test_delay = 10*1000; //sleep for 10 seconds.
// (actual code would be generating PDF output)
public enum StatusCodes { UNINITIALIZED,
INITIALIZED,
RUNNING,
COMPLETE,
ERROR };
int id = -1;
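// volatile: these are written by the worker thread and polled by the manager thread.
volatile StatusCodes status = StatusCodes.UNINITIALIZED;
volatile boolean initialized = false;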
public String rpt = "";
public String data = "";
//Engine eng;
//PDFExportThread pdfExporter;
//DataSource_type cn;
public boolean isInitialized() { return initialized; }
public boolean isAvailable() { return status == StatusCodes.UNINITIALIZED; }
public boolean isRunning() { return status == StatusCodes.RUNNING; }
public boolean isComplete() { return status == StatusCodes.COMPLETE; }
public boolean hasError() { return status == StatusCodes.ERROR; }
public ReportWorker( int id ) {
this.id = id;
}
public String toString( ) {
return "ReportWorker."+id+"("+status+")/"+rpt+"/"+data;
}
// the example code doesn't make clear if there is a relationship between rpt & data[i].
public void initialize( String rpt, String data /* data[i] in original code */ ) {
try {
this.rpt = rpt;
this.data = data;
/* uncomment this part where you have the various classes available.
* I have it commented out for testing.
cn = ds.getConnection();
Engine eng = new Engine(Engine.EXPORT_PDF);
eng.setReportFile(rpt); //rpt is the report name
eng.setConnection(cn);
eng.setPrompt(data, 0);
ReportProperties repprop = eng.getReportProperties();
repprop.setPaperOrient(ReportProperties.DEFAULT_PAPER_ORIENTATION, ReportProperties.PAPER_FANFOLD_US);
*/
status = StatusCodes.INITIALIZED;
initialized = true; // want this true even if we're running.
} catch( Exception e ) {
status = StatusCodes.ERROR;
throw new RuntimeException("initialize(rpt="+rpt+", data="+data+")", e);
}
}
public void run() {
status = StatusCodes.RUNNING;
System.out.println("run().BEGIN: "+this);
try {
// delay for testing.
try { Thread.sleep( test_delay ); }
catch( InterruptedException e ) { System.out.println(this+".run(): test interrupted, ignoring "+e); }
/* uncomment this part where you have the various classes available.
* I have it commented out for testing.
eng.execute();
PDFExportThread pdfExporter = new PDFExportThread(eng, sFileName, sFilePath);
pdfExporter.execute();
*/
status = StatusCodes.COMPLETE;
System.out.println("run().END: "+this);
} catch( Exception e ) {
System.out.println("run().ERROR: "+this);
status = StatusCodes.ERROR;
throw new RuntimeException("run(rpt="+rpt+", data="+data+")", e);
}
}
public void teardown() {
if( ! isInitialized() || isRunning() ) {
System.out.println("Warning: ReportWorker.teardown() called but I am uninitialized or running.");
// should never happen, fatal enough to throw an exception?
}
/* commented out for testing.
try { cn.close(); }
catch( Exception e ) { System.out.println("Warning: ReportWorker.teardown() ignoring error on connection close: "+e); }
cn = null;
*/
// any need to close things on eng?
// any need to close things on pdfExporter?
}
}
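For comparison, the same assign-and-wait flow can be sketched with the JDK's java.util.concurrent.ExecutorService, which does the worker bookkeeping itself and needs no polling loop. This is only a rough sketch under assumptions: PooledReportRunner and runAll are made-up names, and generateReport is a hypothetical stand-in for the Engine/PDFExportThread calls from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

class PooledReportRunner {
    /** Hypothetical stand-in for the real report generation. */
    static String generateReport(String rpt, String data) {
        if (data.isEmpty()) {
            throw new IllegalArgumentException("no prompt value");
        }
        return rpt + "_" + data + ".pdf";
    }

    /** Runs one report per data item on nWorkers threads; returns the number of failures. */
    static int runAll(String rpt, List<String> dataItems, int nWorkers) {
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        List<Future<String>> results = new ArrayList<>();
        for (String data : dataItems) {
            results.add(pool.submit(() -> generateReport(rpt, data)));
        }
        pool.shutdown(); // accept no new work; reports already queued still run

        int failures = 0;
        for (Future<String> result : results) {
            try {
                System.out.println("finished " + result.get()); // blocks until done
            } catch (ExecutionException e) {
                failures++; // the report threw; its exception is the cause
                System.out.println("report failed: " + e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                pool.shutdownNow();
                break;
            }
        }
        return failures;
    }
}
```

Calling PooledReportRunner.runAll("xyz", Arrays.asList("A", "B", "C", "D", "E"), 3) plays the role of the assignWork() loop plus waitForWorkToFinish() above, with the pool size standing in for nWorkers.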
Following are some parts of my code, which uses threading. The purpose is to retrieve all the records from the database (approx. 500,000) and send each of them an alert email message. The problem I am facing is that the variable emailRecords becomes very large and sending the email messages takes too much time. How can I make it fast using multi-threading, so that the 500,000 records are processed in parallel? I tried to use ExecutorService but got confused implementing it. I got mixed up among the methods checkName(), getRecords() and sendAlert(), which all depend on each other. So, where should I use ExecutorService?
Please suggest how to proceed with the following code and which part needs editing. Thanks in advance!
public class sampledaemon implements Runnable {
private static List<String[]> emailRecords = new ArrayList<String[]>();
public static void main(String[] args) {
if (args.length != 1) {
return;
}
countryName = args[0];
try {
Thread t = null;
sampledaemon daemon = new sampledaemon();
t = new Thread(daemon);
t.start();
} catch (Exception e) {
e.printStackTrace();
}
}
public void run() {
Thread thisThread = Thread.currentThread();
try {
while (true) {
checkName(countryName);
Thread.sleep(TimeUnit.SECONDS.toMillis(10));
}
} catch (Exception e) {
e.printStackTrace();
}
}
public void checkName(String countryName) throws Exception {
Country country = CountryPojo.getDetails(countryName);
if (country != null) {
getRecords(country, connection);
}
}
private void getRecords(Country country, Connection con) {
String users[] = null;
while (rs.next()) {
users = new String[2];
users[0] = rs.getString("userid");
users[1] = rs.getString("emailAddress");
emailRecords.add(users);
if (emailRecords.size() > 0) {
sendAlert(date, con);
}
}
}
void sendAlert(String date, Connection con) {
for (int k = 0; k < emailRecords.size(); k++) {
//check the emailRecords and send email
}
}
}
From what I can tell, you would most likely want single-threaded data retrieval and multi-threaded e-mail sending. Roughly, you'd cycle through your result set and build a list of records. When that list hits a certain size, you make a copy, send that copy off to be processed in a thread, and clear the original list. At the end of the result set, check whether you have unprocessed records in your list, and send those to the pool as well.
Finally, wait for the threadpool to finish processing all records.
Something along these lines:
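// needs java.sql.*, java.util.*, java.util.concurrent.* (DbUtils is from Apache Commons DbUtils)
protected void processRecords(String countryName) throws SQLException, InterruptedException {
    // 10 threads, at most 5 batches queued. When the queue is full, CallerRunsPolicy makes
    // the reading thread send a batch itself, which throttles the database read.
    ThreadPoolExecutor executor = new ThreadPoolExecutor(10, 10, 10, TimeUnit.SECONDS,
            new ArrayBlockingQueue<Runnable>(5), new ThreadPoolExecutor.CallerRunsPolicy());
    List<String[]> emaillist = new ArrayList<String[]>(1000);
    ResultSet rs = ....
    try {
        while (rs.next()) {
            String[] user = new String[2];
            user[0] = rs.getString("userid");
            user[1] = rs.getString("emailAddress");
            emaillist.add(user);
            if (emaillist.size() == 1000) {
                // Copy the batch so that clearing emaillist does not affect the running task.
                final List<String[]> elist = new ArrayList<String[]>(emaillist);
                executor.execute(new Runnable() {
                    public void run() {
                        sendMail(elist);
                    }
                });
                emaillist.clear();
            }
        }
    }
    finally {
        DbUtils.close(rs);
    }
    if (! emaillist.isEmpty()) {
        // Copy here too: passing emaillist itself and then clearing it would empty the last batch.
        final List<String[]> elist = new ArrayList<String[]>(emaillist);
        executor.execute(new Runnable() {
            public void run() {
                sendMail(elist);
            }
        });
        emaillist.clear();
    }
    // wait for all the e-mails to finish.
    executor.shutdown();
    while (! executor.awaitTermination(10, TimeUnit.DAYS)) {
        // still sending; keep waiting.
    }
}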
The advantage of using a fixed thread pool is that you don't have to repeat the expensive process of creating threads again and again; it's done once at the beginning. See below.
ExecutorService executor = Executors.newFixedThreadPool(100);
List<String> arList = new ArrayList<String>(); // fill this with the e-mail addresses read from the DB
for (String s : arList) {
    executor.execute(new EmailAlert(s));
}
executor.shutdown(); // stop accepting new tasks; e-mails already queued are still sent

public class EmailAlert implements Runnable {
    private final String addr;

    public EmailAlert(String eAddr) {
        this.addr = eAddr;
    }

    public void run() {
        // Do the process of sending the email here..
    }
}
Creating a second thread to do all of the work, instead of doing the same work in the main thread, isn't going to help you avoid the problem of filling up the emailRecords list with 500,000 records before processing any of them.
It sounds like your goal is to be able to read from the database and send email in parallel. Instead of worrying about the code, first think of an algorithm for the work you want to accomplish. Something like this:
In one thread, query for the records from the database, and for each result, add one job to an ExecutorService
That job sends email to one person/address/record.
or alternatively
Read records from the database in batches of N (50, 100, 1000, etc)
Submit each batch to the executorService