Java Async NIO servlet not working as expected - java

I'm trying to get a simple servlet working (copied from another source). It's an async servlet that reads a POST request using NIO through a ReadListener. This is the code:
@WebServlet(name = "AsyncProxyServlet2", urlPatterns = "/*" , asyncSupported = true)
public class AsyncProxyServlet2 extends HttpServlet
{
private static final long serialVersionUID = 8458401860448619054L;
@Override
public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException
{
final AsyncContext acontext = request.startAsync();
final ServletInputStream input = request.getInputStream();
System.out.println("request.isAsyncStarted() = " + request.isAsyncStarted() + ", request.isAsyncSupported = " + request.isAsyncSupported() + ", main = " + Thread.currentThread().getName());
input.setReadListener(new ReadListener()
{
byte tmp[] = new byte[4*1024];
ByteArrayOutputStream buffer = new ByteArrayOutputStream(this.tmp.length);
@Override
public void onDataAvailable()
{
try
{
System.out.println("onDataAvailable = " + Thread.currentThread().getName());
int numBytesRead = -1;
while ( input.isReady() && !input.isFinished() )
{
if ( (numBytesRead = input.read(this.tmp)) > 0 ) this.buffer.write(this.tmp, 0, numBytesRead);
}
}
catch (IOException ex) { ex.printStackTrace(); }
}
@Override
public void onAllDataRead()
{
try
{
System.out.println("onAllDataRead = " + Thread.currentThread().getName() + ", buffer size = " + this.buffer.size());
acontext.getResponse().setContentType("text/xml; charset=utf-8");
acontext.getResponse().setContentLength(this.buffer.size());
acontext.getResponse().getOutputStream().write(this.buffer.toByteArray());
}
catch (Exception ex) { ex.printStackTrace(); }
acontext.complete();
}
@Override
public void onError(Throwable t) { t.printStackTrace(); }
});
System.out.println("final 1");
try { Thread.sleep(3000); } catch (Exception e) {}
System.out.println("final 2");
}
}
When I send a post request to this servlet, the output of this code is:
request.isAsyncStarted() = true, request.isAsyncSupported = true, main = http-nio-80-exec-1
final 1
<-- Here a 3 second pause
final 2
onDataAvailable = http-nio-80-exec-1
onAllDataRead = http-nio-80-exec-1, buffer size = 910
So it seems that everything is executed on the same thread. The request is not read until the 3-second pause ends, then the last 3 messages are printed.
Why is this not working properly?

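The 3-second pause comes from this line in doPost:
try { Thread.sleep(3000); } catch (Exception e) {}
As your output shows, the container does not invoke the ReadListener callbacks until doPost returns, so sleeping there only delays them. Just remove this line.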

Related

RMI does not return response over internet

I have a simple RMI server and RMI client. When I run the server and client on the same network, my server function returns the result properly. But when the server and client are on different networks and the processing time is more than 3-4 minutes, the client cannot get the result, although the server finishes the operation.
Here is my entire server code:
public class SimpleServer {
ServerRemoteObject mRemoteObject;
public static int RMIInPort = 27550;
public static int delay = 0;
public byte[] handleEvent(byte[] mMessage) throws Exception {
String request = new String(mMessage, "UTF-8");
// if ("hearthbeat".equalsIgnoreCase(request)) {
// System.out.println("returning for hearthbeat");
// return "hearthbeat response".getBytes("UTF-8");
// }
System.out.println(request);
Thread.sleep(delay);
System.out.println("returning response");
return "this is response".getBytes("UTF-8");
}
public void bindYourself(int rmiport) {
try {
mRemoteObject = new ServerRemoteObject(this);
java.rmi.registry.Registry iRegistry = LocateRegistry.getRegistry(rmiport);
iRegistry.rebind("Server", mRemoteObject);
} catch (Exception e) {
e.printStackTrace();
mRemoteObject = null;
}
}
public static void main(String[] server) {
int rmiport = Integer.parseInt(server[0]);
RMIInPort = Integer.parseInt(server[1]);
delay = Integer.parseInt(server[2]);
System.out.println("server java:" + System.getProperty("java.version"));
System.out.println("server started on:" + rmiport + "/" + RMIInPort);
System.out.println("server delay on:" + delay);
SimpleServer iServer = new SimpleServer();
iServer.bindYourself(rmiport);
while (true) {
try {
Thread.sleep(10000);
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
and here is my client code:
public class SimpleClient {
ISimpleServer iServer;
public SimpleClient(String p_strServerIp, String p_strCMName, int nRMIPort) {
try {
if (nRMIPort == 1099) {
iServer = (ISimpleServer) Naming.lookup("rmi://" + p_strServerIp + "/" + p_strCMName);
} else {
Registry rmiRegistry = null;
rmiRegistry = LocateRegistry.getRegistry(p_strServerIp, nRMIPort);
iServer = (ISimpleServer) rmiRegistry.lookup(p_strCMName);
}
} catch (Exception ex) {
ex.printStackTrace();
iServer = null;
}
}
public static void main(String... strings) {
String ip = strings[0];
int rmiport = Integer.parseInt(strings[1]);
System.out.println("client java:" + System.getProperty("java.version"));
System.out.println("client is looking for:" + ip + ":" + rmiport);
SimpleClient iClient = new SimpleClient(ip, "Server", rmiport);
try {
byte[] response = iClient.iServer.doaction("this is request".getBytes("UTF-8"));
System.out.println(new String(response, "UTF-8"));
} catch (Exception e) {
e.printStackTrace();
}
}
}
and here is my rmi-registry code:
public class SimpleRMI implements Runnable {
Registry mRegistry = null;
public SimpleRMI(int nPort) {
try {
mRegistry = new sun.rmi.registry.RegistryImpl(nPort);
} catch (RemoteException e1) {
e1.printStackTrace();
}
}
@Override
public void run() {
while (true) {
try {
Thread.sleep(360000);
} catch (Exception e) {
e.printStackTrace();
}
}
}
public static void main(String... strings) {
int rmiport = Integer.parseInt(strings[0]);
System.out.println("rmi java:" + System.getProperty("java.version"));
System.out.println("rmi started on:" + rmiport);
SimpleRMI iRegisry = new SimpleRMI(rmiport);
Thread tThread = new Thread(iRegisry);
tThread.start();
byte[] bytes = new byte[1];
while (true) {
try {
System.in.read(bytes);
if (bytes[0] == 13) {
try {
iRegisry.listRegistry();
} catch (Exception exc2) {
exc2.printStackTrace();
}
}
} catch (Exception exc) {
exc.printStackTrace();
}
}
}
private void listRegistry() {
String[] strList = null;
try {
strList = mRegistry.list();
if (strList != null) {
for (int i = 0; i < strList.length; i++) {
int j = i + 1;
String name = strList[i];
java.rmi.Remote r = mRegistry.lookup(name);
System.out.println(j + ". " + strList[i] + " -> "
+ r.toString());
}
}
System.out.println();
} catch (Exception exc) {
exc.printStackTrace();
}
}
}
and my remote interface and remote object:
public interface ISimpleServer extends java.rmi.Remote {
public byte[] doaction(byte[] message) throws java.rmi.RemoteException;
}
@SuppressWarnings("serial")
public class ServerRemoteObject extends UnicastRemoteObject implements ISimpleServer {
SimpleServer Server = null;
public ServerRemoteObject(SimpleServer pServer) throws RemoteException {
super(SimpleServer.RMIInPort);
Server = pServer;
}
@Override
public byte[] doaction(byte[] message) throws RemoteException {
try {
return Server.handleEvent(message);
} catch (Exception e) {
e.printStackTrace();
return null;
}
}
}
When I run the client and server on different networks (I run the client on my home network) and the delay is more than 3-4 minutes, the server prints "returning response" but the client still waits for the response. If the delay is only 1 minute, the client gets the result properly.
Can you please help me find where the problem is?

Missing data in serial communication

In serial communication, data is sometimes lost when receiving and sometimes lost when sending on the receiver side. I observe that data is sent and received successfully when I set the baud rate to 300800. Please check the following code.
Receiver:
I am using jSerialComm for communication.
I also open the port and set the baud rate, parity=1, NumDataBits=8, NumStopBits=1.
comPort.addDataListener(new SerialPortDataListener() {
@Override
public int getListeningEvents() {
{
return SerialPort.LISTENING_EVENT_DATA_AVAILABLE;
}
// return 0;
}
@Override
public void serialEvent(SerialPortEvent serialPortEvent) {
if (serialPortEvent.getEventType() != SerialPort.LISTENING_EVENT_DATA_AVAILABLE) {
logger.info("in HLIToModbusConversionController" + "event_type="
+ serialPortEvent.getEventType());
return;
}else {
logger.info("in HLIToModbusConversionController getCommandFromSerial() getEventType:" + serialPortEvent.getEventType());
}
ExecutorService executorService = null;
try {
//50 ml time out
executorService = Executors.newSingleThreadExecutor();
long sTime = System.currentTimeMillis();
int len = serialPortEvent.getSerialPort().bytesAvailable();
byte dataBuffer[] = new byte[len];
serialPortEvent.getSerialPort().readBytes(dataBuffer, len);
for (int i = 0; i < dataBuffer.length; i++) {
byte b = dataBuffer[i];
logger.info("in HLIToModbusConversionController" + String.format("%02x ", b));
}
logger.info("in HLIToModbusConversionController" + dataBuffer);
Future<byte[]> future = executorService
.submit(new TimeOutTask(hliToModbusService, dataBuffer));
while (!future.isDone()) {
long totalTime = System.currentTimeMillis() - sTime;
if (totalTime > configurationModel.getModbusTimeout()) {
logger.info("Task is taking long time to execute so cancelling it..");
future.cancel(true);
}
}
byte responseFrame[] =null;
//byte responseFrame[] = hliToModbusService.decodeHLICommand(dataBuffer); // service
try {
responseFrame = future.get((long) configurationModel.getModbusTimeoutSecond(), TimeUnit.SECONDS);
logger.info("result:"+responseFrame);
fileOperationUtil.writeFramFromCache(responseFrame);
} catch (Exception e) {
logger.info("50 millisecond time out frame takes from cache");
responseFrame =fileOperationUtil.readFramFromCache();
} // call
modbusRequest.setTimeout(false);
if (responseFrame != null) {
comPort.writeBytes(responseFrame, responseFrame.length);
logger.info("in HLIToModbusConversionController response frame sent" + responseFrame);
// wait for 100ms
} else {
logger.info(
"in HLIToModbusConversionController responseFrame is empty" + responseFrame);
}
String finalFrame="";
for(byte data:responseFrame) {
finalFrame=finalFrame+","+data;
}
logger.info(
"in HLIToModbusConversionController final responseFrame:" + finalFrame);
startTimer();
logger.info("in HLIToModbusConversionController returns from startTimer");
} catch (Exception e) {
logger.info("in HLIToModbusConversionController exception occurs");
e.printStackTrace();
}finally{
if(executorService != null){
executorService.shutdown();
}
}}
});
}
}
//comPort = SerialPort.getCommPorts()[0]; // take 1st port
} catch (Exception e) {
logger.info("in HLIToModbusConversionController exception occurs");
e.printStackTrace();
}
logger.info("in HLIToModbusConversionController");
logger.info("Server started");
logger.info("Waiting for a client ...");
return "Hli server started";

Multiple ExecutorCompletionService not working

I have a requirement to start and stop tasks from a Java application. I'm trying to use ExecutorService to create threads and ExecutorCompletionService to check the processing status of each thread. Startup and stop is a continuous activity, so in my test code I've created a while loop.
public class ProcessController {
String[] processArray = { "Process1", "Process2", "Process3", "Process4", "Process5", "Process6", "Process7" };
private List<String> processList = Arrays.asList(processArray);
public static void main(String[] args ) {
ExecutorService startUpExecutor = Executors.newFixedThreadPool(3);
ExecutorService cleanUpExecutor = Executors.newFixedThreadPool(3);
CompletionService<String> startUpCompletionService = new ExecutorCompletionService<>(startUpExecutor);
CompletionService<String> cleanUpCompletionService = new ExecutorCompletionService<>(cleanUpExecutor);
List<Future<String>> cleanupFutures = new ArrayList<Future<String>>();
List<Future<String>> startupFutures = new ArrayList<Future<String>>();
ProcessController myApp = new ProcessController();
int i = 0;
while (i++ < 3) {
System.out.println("**********Starting Iteration " + i + "************* =====> ");
if (!cleanupFutures.isEmpty()) cleanupFutures.clear();
myApp.processList.forEach(process -> cleanupFutures.add(cleanUpCompletionService.submit(new CleanupTask(process))));
myApp.processList.forEach(process -> startupFutures.add(startUpCompletionService.submit(new StartupTask(process))));
for (Future<String> f : cleanupFutures) {
try {
String result = cleanUpCompletionService.take().get();
System.out.println("Result from Cleanup thread : " + result);
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
for (Future<String> f1 : startupFutures) {
try {
String result = startUpCompletionService.take().get();
System.out.println("Result from startup thread : " + result);
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
System.out.println("**********Finished Iteration " + i + "************* =====> ");
}
startUpExecutor.shutdown();
cleanUpExecutor.shutdown();
}
}
CleanupTask class
public class CleanupTask implements Callable<String> {
private String task;
public CleanupTask(String task) {
this.task = task;
}
@Override
public String call() throws Exception {
checkIfAnyFinished();
return "finished clean up processing for " + getThreadId();
}
private void checkIfAnyFinished( )
{
System.out.println( getThreadId() + " Checking if task " + this.task + " is finished");
try {
isFinished();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private void isFinished() throws InterruptedException {
Thread.sleep(1000*4);
}
private String getThreadId()
{
return Thread.currentThread().getName();
}
}
Startup Task class
public class StartupTask implements Callable<String> {
private String processSchedule ;
public StartupTask(String processSchedule) {
this.processSchedule = processSchedule;
}
@Override
public String call() {
scheduleifdue();
return "finished start up up processing for " + getThreadId();
}
private void scheduleifdue()
{
System.out.println(getThreadId() + " Checking " + this.processSchedule + " is due or not");
try {
Thread.sleep(4000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private String getThreadId()
{
return Thread.currentThread().getName();
}
}
The above code successfully completes iteration 1 and starts the 2nd iteration, but it never finishes and keeps running.
When I run the same application with only one kind of task (either cleanup or startup), it runs without any issues. I'm not sure what is causing the issue.
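One thing visible in the code above: cleanupFutures is cleared at the start of every iteration, but startupFutures is not. In iteration 2 the second for loop therefore runs 14 times while only 7 new startup tasks were submitted, and take() blocks when no completed task is left in the queue. That is a likely reason for the hang. A minimal sketch of counting submissions per round instead, assuming trivial placeholder tasks (the names CompletionCountDemo and runIterations are hypothetical, not from the question):

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionCountDemo {

    // Submits tasksPerIteration tasks per round and takes exactly that many
    // results back, so take() never waits for a task that was not submitted.
    static int runIterations(int iterations, int tasksPerIteration)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        CompletionService<String> completionService = new ExecutorCompletionService<>(executor);
        int collected = 0;
        try {
            for (int round = 1; round <= iterations; round++) {
                int submitted = 0; // reset every round, unlike a list that keeps growing
                for (int t = 0; t < tasksPerIteration; t++) {
                    final String name = "Process" + (t + 1);
                    // Placeholder task standing in for StartupTask / CleanupTask.
                    completionService.submit(() -> "finished " + name);
                    submitted++;
                }
                for (int k = 0; k < submitted; k++) {
                    String result = completionService.take().get();
                    System.out.println("Round " + round + ": " + result);
                    collected++;
                }
            }
        } finally {
            executor.shutdown();
        }
        return collected;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Collected " + runIterations(3, 7) + " results");
    }
}
```

In the original code, the equivalent change would be to clear startupFutures at the top of the loop, the same way cleanupFutures is cleared.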

How to give message when Threadpool Executor is completed?

I am trying to show a pop-up alert message when my ThreadPoolExecutor has finished executing. It searches for email addresses on websites; once it is done I want an alert message saying "Completed". Here is my thread pool:
public class Constant
{
public static final int NUM_OF_THREAD = 60;
public static final int TIME_OUT = 10000;
}
ThreadPoolExecutor poolMainExecutor = (ThreadPoolExecutor) Executors.newFixedThreadPool
(Constant.NUM_OF_THREAD);
Here is my Searching Operation class :-
class SearchingOperation implements Runnable {
URL urldata;
int i;
Set<String> emailAddresses;
int level;
SearchingOperation(URL urldata, int i, Set<String> emailAddresses, int level) {
this.urldata = urldata;
this.i = i;
this.emailAddresses = emailAddresses;
this.level = level;
if (level != 1)
model.setValueAt(urldata.getProtocol() + "://" + urldata.getHost() + "/contacts", i, 3);
}
public void run() {
BufferedReader bufferreader1 = null;
InputStreamReader emailReader = null;
System.out.println(this.i + ":" + poolMainExecutor.getActiveCount() + ":" + level + ";" + urldata.toString());
try {
if (level < 1) {
String httpPatternString = "https?:\\/\\/(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)";
String httpString = "";
BufferedReader bufferreaderHTTP = null;
InputStreamReader httpReader = null;
try {
httpReader = new InputStreamReader(urldata.openStream());
bufferreaderHTTP = new BufferedReader(httpReader
);
StringBuilder rawhttp = new StringBuilder();
while ((httpString = bufferreaderHTTP.readLine()) != null) {
rawhttp.append(httpString);
}
if (rawhttp.toString().isEmpty()) {
return;
}
List<String> urls = getURL(rawhttp.toString());
for (String url : urls) {
String fullUrl = getMatchRegex(url, httpPatternString);
if (fullUrl.isEmpty()) {
if (!url.startsWith("/")) {
url = "/" + url;
}
String address = urldata.getProtocol() + "://" + urldata.getHost() + url;
fullUrl = getMatchRegex(address, httpPatternString);
}
if (!addressWorked.contains(fullUrl) && fullUrl.contains(urldata.getHost())) {
addressWorked.add(fullUrl);
sendToSearch(fullUrl);
}
}
} catch (Exception e) {
//System.out.println("652" + e.getMessage());
//e.printStackTrace();
return;
} finally {
try {
if (httpReader != null)
bufferreaderHTTP.close();
} catch (Exception e) {
// e.printStackTrace();
}
try {
if (httpReader != null)
httpReader.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
String someString = "";
emailReader = new InputStreamReader(urldata.openStream());
bufferreader1 = new BufferedReader(
emailReader);
StringBuilder emailRaw = new StringBuilder();
while ((someString = bufferreader1.readLine()) != null) {
if (someString.contains("@")) {
emailRaw.append(someString).append(";");
}
}
//Set<String> emailAddresses = new HashSet<String>();
String emailAddress;
//Pattern pattern = Pattern
//.compile("\\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z0-9.-]+\\b");
Pattern
pattern = Pattern
.compile("\\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z0-9.-]+\\b");
Matcher matchs = pattern.matcher(emailRaw);
while (matchs.find()) {
emailAddress = (emailRaw.substring(matchs.start(),
matchs.end()));
// //System.out.println(emailAddress);
if (!emailAddresses.contains(emailAddress)) {
emailAddresses.add(emailAddress);
// //System.out.println(emailAddress);
if (!foundItem.get(i)) {
table.setValueAt("Found", i, 4);
foundItem.set(i, true);
}
String emails = !emailAddresses.isEmpty() ? emailAddresses.toString() : "";
model.setValueAt(emails, i, 2);
model.setValueAt("", i, 3);
}
}
} catch (Exception e) {
//System.out.println("687" + e.getMessage());
} finally {
try {
if (bufferreader1 != null)
bufferreader1.close();
} catch (Exception e) {
e.printStackTrace();
}
try {
if (emailReader != null)
emailReader.close();
} catch (Exception e) {
e.printStackTrace();
}
Thread.currentThread().interrupt();
return;
}
}
After this the final snippet :-
private void sendToSearch(String address) throws Throwable {
SearchingOperation operation = new SearchingOperation(new URL(address), i,
emailAddresses, level + 1);
//operation.run();
try {
final Future handler = poolMainExecutor.submit(operation);
try {
handler.get(Constant.TIME_OUT, TimeUnit.MILLISECONDS);
} catch (TimeoutException e) {
e.printStackTrace();
handler.cancel(false);
}
} catch (Exception e) {
//System.out.println("Time out for:" + address);
} catch (Error error) {
//System.out.println("Time out for:" + address);
} finally {
}
}
Implement Callable<Void> instead of Runnable and wait for each task to terminate by calling Future<Void>.get():
class SearchingOperation implements Callable<Void>
{
public Void call() throws Exception
{
//same code as in run()
}
}
// submit the task, then block until it completes
Future<Void> future = poolMainExecutor.submit(new SearchingOperation());
future.get();
Use ThreadPoolExecutor.awaitTermination():
Blocks until all tasks have completed execution after a shutdown request, or the timeout occurs, or the current thread is interrupted, whichever happens first.
As in your code, you create your ThreadPoolExecutor first
ThreadPoolExecutor poolMainExecutor = (ThreadPoolExecutor) Executors.newFixedThreadPool(Constant.NUM_OF_THREAD);
Then, you need to add Tasks to it:
poolMainExecutor.execute(myTask);
poolMainExecutor.submit(myTask);
execute will return nothing, while submit will return a Future object. Tasks must implement Runnable or Callable. An object of SearchingOperation is a task for example. The thread pool will execute the tasks in parallel, but each task will be executed by one thread. That means to effectively use NUM_OF_THREAD Threads you need to add at least NUM_OF_THREAD Tasks.
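Once all tasks have been submitted, shut down your pool. This prevents new tasks from being submitted but does not affect tasks that were already submitted. awaitTermination can only return before its timeout once shutdown has been requested.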
poolMainExecutor.shutdown();
At the end, you need to wait for all Tasks to complete. The easiest way is by calling
poolMainExecutor.awaitTermination(Integer.MAX_VALUE, TimeUnit.DAYS);
Adjust the timeout to how long you are willing to wait for the tasks to finish; awaitTermination returns false if the timeout elapses first.
Now that the work is done, notify the user. A simple way is to call one of the Dialog presets from JOptionPane, like:
JOptionPane.showMessageDialog(null, "message", "title", JOptionPane.INFORMATION_MESSAGE);
It will popup a little window with title "title", the message "message", an "information" icon and a button to close it.
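The steps above can be put together as a minimal sketch. It assumes simple counter tasks in place of SearchingOperation and a pool of 4 threads; the names AwaitTerminationDemo and runAndWait are hypothetical. It prints to the console so it also runs without a display; in a Swing application the dialog call would go where the println is.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AwaitTerminationDemo {

    // Runs taskCount placeholder tasks and returns true if all of them
    // finished before the timeout expired.
    static boolean runAndWait(int taskCount, AtomicInteger completed, long timeoutSeconds)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int n = 0; n < taskCount; n++) {
            // Stand-in for a real task such as a SearchingOperation.
            pool.execute(completed::incrementAndGet);
        }
        // No new tasks are accepted; already submitted tasks still run.
        pool.shutdown();
        // Returns false on timeout instead of throwing an exception.
        return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        boolean finished = runAndWait(60, completed, 30);
        if (finished) {
            // In a Swing UI, show the JOptionPane here (on the event dispatch thread).
            System.out.println("Completed " + completed.get() + " tasks");
        } else {
            System.out.println("Timed out with " + completed.get() + " tasks done");
        }
    }
}
```

One caveat for the code in the question: tasks that submit further tasks to the same pool, as sendToSearch does, would have those submissions rejected after shutdown(), so shutdown should only be called once no more work will be added.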
This code can also be used; it checks every 2.5 seconds whether execution has completed.
do {
System.out.println("In Progress");
try {
Thread.sleep(2500);
} catch (InterruptedException e) {
e.printStackTrace();
}
} while (poolMainExecutor.getActiveCount() != 0);
System.out.println("Completed");

Splitting huge CSV by custom filter?

I have huge (>5GB) CSV file in format:
username,transaction
As output I want a separate CSV file for each user containing only that user's transactions, in the same format. I have a few ideas in mind, but I want to hear other ideas for an effective (fast and memory-efficient) implementation.
Here is what I have done so far. The first test reads, processes and writes in a single thread; the second test uses many threads. Performance is not that good, so I think I'm doing something wrong. Please correct me.
public class BatchFileReader {
private ICsvBeanReader beanReader;
private double total;
private String[] header;
private CellProcessor[] processors;
private DataTransformer<HashMap<String, List<LoginDto>>> processor;
private boolean hasMoreRecords = true;
public BatchFileReader(String file, DataTransformer<HashMap<String, List<LoginDto>>> processor) {
try {
this.processor = processor;
this.beanReader = new CsvBeanReader(new FileReader(file), CsvPreference.STANDARD_PREFERENCE);
header = CSVUtils.getHeader(beanReader.getHeader(true));
processors = CSVUtils.getProcessors();
} catch (IOException e) {
e.printStackTrace();
}
}
public void read() {
try {
readFile();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (beanReader != null) {
try {
beanReader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
private void readFile() throws IOException {
while (hasMoreRecords) {
long start = System.currentTimeMillis();
HashMap<String, List<LoginDto>> usersBatch = readBatch();
long end = System.currentTimeMillis();
System.out.println("Reading batch for " + ((end - start) / 1000f) + " seconds.");
total +=((end - start)/ 1000f);
if (processor != null && !usersBatch.isEmpty()) {
processor.transform(usersBatch);
}
}
System.out.println("total = " + total);
}
private HashMap<String, List<LoginDto>> readBatch() throws IOException {
HashMap<String, List<LoginDto>> users = new HashMap<String, List<LoginDto>>();
int readLoginCount = 0;
while (readLoginCount < CONFIG.READ_BATCH_SIZE) {
LoginDto login = beanReader.read(LoginDto.class, header, processors);
if (login != null) {
if (!users.containsKey(login.getUsername())) {
List<LoginDto> logins = new LinkedList<LoginDto>();
users.put(login.getUsername(), logins);
}
users.get(login.getUsername()).add(login);
readLoginCount++;
} else {
hasMoreRecords = false;
break;
}
}
return users;
}
}
public class BatchFileWriter<T> {
private final String file;
private final List<T> processedData;
public BatchFileWriter(final String file, List<T> processedData) {
this.file = file;
this.processedData = processedData;
}
public void write() {
try {
writeFile(file, processedData);
} catch (IOException e) {
e.printStackTrace();
} finally {
}
}
private void writeFile(final String file, final List<T> processedData) throws IOException {
System.out.println("START WRITE " + " " + file);
FileWriter writer = new FileWriter(file, true);
long start = System.currentTimeMillis();
for (T record : processedData) {
writer.write(record.toString());
writer.write("\n");
}
writer.flush();
writer.close();
long end = System.currentTimeMillis();
System.out.println("Writing in file " + file + " complete for " + ((end - start) / 1000f) + " seconds.");
}
}
public class LoginsTest {
private static final ExecutorService executor = Executors.newSingleThreadExecutor();
private static final ExecutorService procExec = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() + 1);
@Test
public void testSingleThreadCSVtoCSVSplit() throws InterruptedException, ExecutionException {
long start = System.currentTimeMillis();
DataTransformer<HashMap<String, List<LoginDto>>> simpleSplitProcessor = new DataTransformer<HashMap<String, List<LoginDto>>>() {
@Override
public void transform(HashMap<String, List<LoginDto>> data) {
for (String field : data.keySet()) {
new BatchFileWriter<LoginDto>(field + ".csv", data.get(field)).write();
}
}
};
BatchFileReader reader = new BatchFileReader("loadData.csv", simpleSplitProcessor);
reader.read();
long end = System.currentTimeMillis();
System.out.println("TOTAL " + ((end - start)/ 1000f) + " seconds.");
}
@Test
public void testMultiThreadCSVtoCSVSplit() throws InterruptedException, ExecutionException {
long start = System.currentTimeMillis();
System.out.println(start);
final DataTransformer<HashMap<String, List<LoginDto>>> simpleSplitProcessor = new DataTransformer<HashMap<String, List<LoginDto>>>() {
@Override
public void transform(HashMap<String, List<LoginDto>> data) {
System.out.println("transform");
processAsync(data);
}
};
final CountDownLatch readLatch = new CountDownLatch(1);
executor.execute(new Runnable() {
@Override
public void run() {
BatchFileReader reader = new BatchFileReader("loadData.csv", simpleSplitProcessor);
reader.read();
System.out.println("read latch count down");
readLatch.countDown();
}});
System.out.println("read latch before await");
readLatch.await();
System.out.println("read latch after await");
procExec.shutdown();
executor.shutdown();
long end = System.currentTimeMillis();
System.out.println("TOTAL " + ((end - start)/ 1000f) + " seconds.");
}
private void processAsync(final HashMap<String, List<LoginDto>> data) {
procExec.execute(new Runnable() {
@Override
public void run() {
for (String field : data.keySet()) {
writeASync(field, data.get(field));
}
}
});
}
private void writeASync(final String field, final List<LoginDto> data) {
procExec.execute(new Runnable() {
@Override
public void run() {
new BatchFileWriter<LoginDto>(field + ".csv", data).write();
}
});
}
}
Would it not be better to use unix commands to sort and then split the original file?
Something like: cat txn.csv | sort > txn-sorted.csv
From there get a listing of the unique usernames via grep and then grep the sorted file for each username
If you know Camel already, I'd write a simple Camel route to:
Read line from file
Parse the line
Write to the correct output file
It's a very simple route, but if you want it as fast as possible it is then trivially easy to make it multithreaded,
e.g. your route would look something like:
from("file:/myfile.csv")
.beanRef("lineParser")
.to("seda:internal-queue");
from("seda:internal-queue?concurrentConsumers=5")
.to("fileWriter");
If you don't know Camel then it's not worth learning it just for this one task. However, you are probably going to need to make it multithreaded to get the maximum performance. You'll have to experiment with where best to put the threading, as it will depend on which parts of the operation are slowest.
The multithreading will use up more memory so you'll need to balance memory efficiency against performance.
I would open/append a new output file for each user. If you wanted to minimize memory usage and incur more I/O overhead, you could do something like the following, though you'd probably want to use a real CSV parser like Super CSV (http://supercsv.sourceforge.net/index.html):
Scanner s = new Scanner(new File("/my/dir/users-and-transactions.txt"));
while (s.hasNextLine()) {
String line = s.nextLine();
String[] tokens = line.split(",");
String user = tokens[0];
String transaction = tokens[1];
PrintStream out = new PrintStream(new FileOutputStream("/my/dir/" + user, true));
out.println(transaction);
out.close();
}
s.close();
If you've got a reasonable amount of memory, you could create a Map of user name to OutputStream. Each time you see a user string, you could get the existing OutputStream for that user name or create a new one if none exists.
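A minimal sketch of that map-of-writers idea, using only the JDK. It assumes every line is "username,transaction" with no header row and no quoted commas, that user names are safe to use as file names, and that the number of distinct users stays below the operating system's open-file limit. The names CsvSplitter and split are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

class CsvSplitter {

    // Streams the input once and appends each line to "<username>.csv" in outputDir.
    // Returns the number of distinct users seen.
    static int split(Path input, Path outputDir) throws IOException {
        Map<String, BufferedWriter> writers = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int comma = line.indexOf(',');
                if (comma < 0) {
                    continue; // skip lines that are not "username,transaction"
                }
                String user = line.substring(0, comma);
                BufferedWriter writer = writers.get(user);
                if (writer == null) {
                    // First time this user appears: open (or append to) the user's file.
                    writer = Files.newBufferedWriter(outputDir.resolve(user + ".csv"),
                            StandardCharsets.UTF_8,
                            StandardOpenOption.CREATE, StandardOpenOption.APPEND);
                    writers.put(user, writer);
                }
                writer.write(line);
                writer.newLine();
            }
        } finally {
            // Flush and close every per-user file, even if reading failed.
            for (BufferedWriter writer : writers.values()) {
                writer.close();
            }
        }
        return writers.size();
    }

    public static void main(String[] args) throws IOException {
        int users = split(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Wrote files for " + users + " users");
    }
}
```

Only one line is held in memory at a time, so memory use depends on the number of open writers rather than on the file size. If there are too many users to keep every file open, the map could be replaced by a bounded cache that closes the least recently used writer and reopens it in append mode when needed.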
