When I tested a simple producer/consumer example, I got a very strange result, shown below.
If I use main() to run the following code, I get the correct and expected result.
But when I run it as a JUnit test, only the first directory is processed correctly; the remaining work is dropped.
What is the exact reason?
Working code:
import java.io.File;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.junit.Test;
public class TestProducerAndConsumer {
public static void main(String[] args) {
BlockingQueue<File> queue = new LinkedBlockingQueue<File>(1000);
new Thread(new FileCrawler(queue, new File("C:\\"))).start();
new Thread(new Indexer(queue)).start();
}
}
Bad Code:
import java.io.File;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.junit.Test;
public class TestProducerAndConsumer {
@Test
public void start2() {
BlockingQueue<File> queue = new LinkedBlockingQueue<File>(1000);
new Thread(new FileCrawler(queue, new File("C:\\"))).start();
new Thread(new Indexer(queue)).start();
}
}
Supporting classes:
import java.io.File;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
public class FileCrawler implements Runnable {
private final BlockingQueue<File> fileQueue;
private final File root;
private int i = 0;
public FileCrawler(BlockingQueue<File> fileQueue, File root) {
this.fileQueue = fileQueue;
this.root = root;
}
@Override
public void run() {
try {
craw(root);
} catch (InterruptedException e) {
System.out.println("shit!");
e.printStackTrace();
Thread.currentThread().interrupt();
}
}
private void craw(File file) throws InterruptedException {
File[] entries = file.listFiles();
//System.out.println(Arrays.toString(entries));
if (entries != null && entries.length > 0) {
for (File entry : entries) {
if (entry.isDirectory()) {
craw(entry);
} else {
fileQueue.offer(entry);
i++;
System.out.println(entry);
System.out.println(i);
}
}
}
}
public static void main(String[] args) throws InterruptedException {
FileCrawler fc = new FileCrawler(null, null);
fc.craw(new File("C:\\"));
System.out.println(fc.i);
}
}
import java.io.File;
import java.util.concurrent.BlockingQueue;
public class Indexer implements Runnable {
private BlockingQueue<File> queue;
public Indexer(BlockingQueue<File> queue) {
this.queue = queue;
}
@Override
public void run() {
try {
while (true) {
indexFile(queue.take());
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
private void indexFile(File file) {
System.out.println("Indexing ... " + file);
}
}
JUnit presumably allows the JVM and its threads to terminate as soon as the test method returns, so your threads never finish their work.
Try waiting for the threads to finish by calling join():
Thread crawlerThread = new Thread(new FileCrawler(queue, new File("C:\\")));
Thread indexerThread = new Thread(new Indexer(queue));
crawlerThread.start();
indexerThread.start();
//
// wait for them to finish.
crawlerThread.join();
indexerThread.join();
This should help.
The other thing that can go wrong is that log output (via Log4j) can sometimes be truncated at the end of execution; flushing and pausing can help, but I don't think that will affect you here.
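Putting the join suggestion together, the failing test might look like this (a sketch only: since Indexer loops on queue.take() forever, its join is given a timeout and the thread is interrupted afterwards; the timeout value is arbitrary):
@Test
public void start2() throws InterruptedException {
    BlockingQueue<File> queue = new LinkedBlockingQueue<File>(1000);
    Thread crawlerThread = new Thread(new FileCrawler(queue, new File("C:\\")));
    Thread indexerThread = new Thread(new Indexer(queue));
    crawlerThread.start();
    indexerThread.start();
    // The crawler terminates on its own once the directory tree has been walked.
    crawlerThread.join();
    // The indexer never returns from queue.take() on its own, so wait a bounded
    // amount of time for it to drain the queue, then interrupt it.
    indexerThread.join(10000);
    indexerThread.interrupt();
}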
As in several research papers [1], [2], [3], it seems appropriate to use the Principal Component Analysis (PCA) algorithm for feature extraction. To attempt this, I have written a traffic sniffer in Java using the jpcap library (snippet below), which extracts various features from live network traffic.
But I have not found a way to actually implement this, as existing approaches (snippets below) all seem to operate on non-live, pre-organised datasets such as ARFF files.
I have seen a number of research papers [4], [5], [6] that reference using the PCA algorithm in combination with the jpcap library, but I have not been able to find an explanation of how this was accomplished.
To be clear:
My question is: how can I implement PCA (using any approach) in Java to extract features from live packets (obtained via the jpcap library referenced above), as others appear to have accomplished (examples referenced above)?
Sniffer Example (using Jpcap Library)
Sniffer.java
import jpcap.JpcapCaptor;
import jpcap.NetworkInterface;
import jpcap.packet.Packet;
import jpcap.*;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.bind.DatatypeConverter;
import java.util.List;
public class Sniffer {
public static NetworkInterface[] NETWORK_INTERFACES;
public static JpcapCaptor CAP;
jpcap_thread THREAD;
public static int INDEX = 0;
public static int flag = 0;
public static int COUNTER = 0;
static boolean CaptureState = false;
public static int No = 0;
JpcapWriter writer = null;
List<Packet> packetList = new ArrayList<>();
public static ArrayList<Object[]> packetInfo = new ArrayList<>();
public static void CapturePackets() {
THREAD = new jpcap_thread() {
public Object construct() {
try {
CAP = JpcapCaptor.openDevice(NETWORK_INTERFACES[INDEX], 65535, false, 20);
// writer = JpcapWriter.openDumpFile(CAP, "captureddata");
if ("UDP".equals(filter_options.getSelectedItem().toString())) {
CAP.setFilter("udp", true);
} else if ("TCP".equals(filter_options.getSelectedItem().toString())) {
CAP.setFilter("tcp", true);
} else if ("ICMP".equals(filter_options.getSelectedItem().toString())) {
CAP.setFilter("icmp", true);
}
while (CaptureState) {
CAP.processPacket(1, new PacketContents());
packetList.add(CAP.getPacket());
}
CAP.close();
} catch (Exception e) {
System.out.print(e);
}
return 0;
}
public void finished() {
this.interrupt();
}
};
THREAD.start();
}
public static void main(String[] args) {
CaptureState = true;
CapturePackets();
}
public void saveToFile() {
THREAD = new jpcap_thread() {
public Object construct() {
writer = null;
try {
CAP = JpcapCaptor.openDevice(NETWORK_INTERFACES[INDEX], 65535, false, 20);
writer = JpcapWriter.openDumpFile(CAP, "captured_data.txt");
} catch (IOException ex) {
Logger.getLogger(Sniffer.class.getName()).log(Level.SEVERE, null, ex);
}
for (int i = 0; i < No; i++) {
writer.writePacket(packetList.get(i));
}
return 0;
}
public void finished() {
this.interrupt();
}
};
THREAD.start();
}
}
PacketContents.java
import jpcap.PacketReceiver;
import jpcap.packet.Packet;
import javax.swing.table.DefaultTableModel;
import jpcap.packet.TCPPacket;
import jpcap.packet.UDPPacket;
import java.util.ArrayList;
import java.util.List;
import jpcap.packet.ICMPPacket;
public class PacketContents implements PacketReceiver {
public static TCPPacket tcp;
public static UDPPacket udp;
public static ICMPPacket icmp;
@Override
public void receivePacket(Packet packet) {
if (packet instanceof TCPPacket) {
tcp = (TCPPacket) packet;
Sniffer.packetInfo.add(new Object[] { Sniffer.No, tcp.length, tcp.src_ip, tcp.dst_ip, "TCP", tcp.src_port,
tcp.dst_port, tcp.ack, tcp.ack_num, tcp.data, tcp.sequence, tcp.offset, tcp.header });
Sniffer.No++;
} else if (packet instanceof UDPPacket) {
udp = (UDPPacket) packet;
Sniffer.packetInfo.add(new Object[] { Sniffer.No, udp.length, udp.src_ip, udp.dst_ip, "UDP", udp.src_port,
udp.dst_port, udp.data, udp.offset, udp.header });
Sniffer.No++;
} else if (packet instanceof ICMPPacket) {
icmp = (ICMPPacket) packet;
Sniffer.packetInfo.add(new Object[] { Sniffer.No, icmp.length, icmp.src_ip, icmp.dst_ip, "ICMP",
icmp.checksum, icmp.header, icmp.offset, icmp.orig_timestamp, icmp.recv_timestamp,
icmp.trans_timestamp, icmp.data });
Sniffer.No++;
}
}
}
Using the Weka Library to execute the PCA algorithm on an ARFF file
WekaPCA.java
package project;
import weka.core.Instances;
import weka.core.converters.ArffLoader;
import weka.core.converters.ConverterUtils;
import weka.core.converters.TextDirectoryLoader;
import java.io.File;
import org.math.plot.FrameView;
import org.math.plot.Plot2DPanel;
import org.math.plot.PlotPanel;
import org.math.plot.plots.ScatterPlot;
import weka.attributeSelection.PrincipalComponents;
import weka.attributeSelection.Ranker;
public class PCA {
public static void main(String[] args) {
try {
// Load data
String InputFilename = "kdd99.arff";
ArffLoader loader = new ArffLoader();
loader.setSource(new File(InputFilename));
Instances data = loader.getDataSet();
// Perform PCA
PrincipalComponents pca = new PrincipalComponents();
pca.setVarianceCovered(1.0);
pca.setTransformBackToOriginal(false);
pca.buildEvaluator(data);
// Show transformed data
Instances transformedData = pca.transformedData();
System.out.println(transformedData);
}
catch (Exception e) {
e.printStackTrace();
}
}
}
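To illustrate the direction I have in mind (only a sketch, and unverified): instead of loading an ARFF file, the feature rows collected in Sniffer.packetInfo could be copied into an in-memory Instances object and handed to the same PrincipalComponents evaluator. The attribute choice, the row indices, and the Weka 3.7+ API used here (DenseInstance, transformedData(Instances)) are my own assumptions.
import java.util.ArrayList;
import weka.attributeSelection.PrincipalComponents;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
public class LivePCASketch {
    // Sketch only: build Weka Instances in memory from the captured packet
    // features instead of an ARFF file (assumes Weka 3.7+; the attribute choice
    // and row indices are illustrative, taken from the TCP layout above).
    public static Instances runPCA() throws Exception {
        ArrayList<Attribute> attrs = new ArrayList<>();
        attrs.add(new Attribute("length"));
        attrs.add(new Attribute("src_port"));
        attrs.add(new Attribute("dst_port"));
        Instances data = new Instances("live_packets", attrs, Sniffer.packetInfo.size());
        for (Object[] row : Sniffer.packetInfo) {
            double[] vals = {
                ((Number) row[1]).doubleValue(), // length
                ((Number) row[5]).doubleValue(), // src_port
                ((Number) row[6]).doubleValue()  // dst_port
            };
            data.add(new DenseInstance(1.0, vals));
        }
        PrincipalComponents pca = new PrincipalComponents();
        pca.setVarianceCovered(0.95);
        pca.buildEvaluator(data);
        return pca.transformedData(data); // older Weka versions use the no-argument transformedData()
    }
}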
I have noticed that some data is missing or merged when I write a CSV file from multiple threads at the same time. How can I avoid this missing or corrupted data?
My suggestion is to create a single writer object outside the threads and synchronize its println/write method, like this:
package snippet;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
public class MultiThreadWriter implements Runnable {
private SyncWriter sw;
public MultiThreadWriter(SyncWriter sw) {
this.sw = sw;
}
public void run() {
// ...
sw.println("whatever");
// ....
}
public static void main(String[] args) {
SyncWriter writer = null;
try {
writer = new SyncWriter("foo.csv");
ExecutorService pool = Executors.newFixedThreadPool(10);
pool.submit(new MultiThreadWriter(writer));
pool.submit(new MultiThreadWriter(writer));
pool.submit(new MultiThreadWriter(writer));
pool.submit(new MultiThreadWriter(writer));
pool.submit(new MultiThreadWriter(writer));
pool.submit(new MultiThreadWriter(writer));
pool.submit(new MultiThreadWriter(writer));
pool.shutdown();
// wait until all submitted tasks have finished
while (!pool.awaitTermination(1000, TimeUnit.MILLISECONDS)) {
// keep waiting
}
writer.close();
} catch (Exception e) {
// ..
}
}
}
class SyncWriter {
private PrintWriter pw;
public SyncWriter(String path) throws FileNotFoundException {
pw = new PrintWriter(path);
}
public void close() {
pw.close();
}
public synchronized void println(String x) {
pw.println(x);
}
}
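Another way to get the same guarantee (a sketch with my own naming, not part of the answer above) is to let the worker threads hand their rows to one dedicated writer thread over a BlockingQueue; only that thread ever touches the file, so lines cannot interleave:
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
public class QueueCsvWriter implements Runnable {
    private final BlockingQueue<String> rows = new LinkedBlockingQueue<>();
    private final PrintWriter pw;
    public QueueCsvWriter(String path) throws FileNotFoundException {
        this.pw = new PrintWriter(path);
    }
    // Called from any worker thread; the row is queued, not written directly.
    public void submit(String row) {
        rows.add(row);
    }
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                pw.println(rows.take()); // only this thread writes to the file
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pw.close();
        }
    }
}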
I have a program that needs to make HTTP requests very quickly. The requests should be made asynchronously so they don't block the main thread.
So I have created a queue that is watched by 10 separate threads which make the HTTP requests. When something is inserted into the queue, the first thread that takes the data makes the request and processes the result.
The queue gets filled with thousands of items, so multithreading is really necessary to get responses as fast as possible.
Since I have a lot of code, I'll give a short example.
main class
package fasthttp;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
public class FastHTTP {
public static void main(String[] args) {
ExecutorService executor = Executors.newFixedThreadPool(10);
for (int i = 0; i < 10; i++) {
LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
queue.add("http://www.lennar.eu/ip.php"); // for example
executor.execute(new HTTPworker(queue));
}
}
}
FastHTTP class
package fasthttp;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.LinkedBlockingQueue;
public class HTTPworker implements Runnable {
private final LinkedBlockingQueue<String> queue;
public HTTPworker(LinkedBlockingQueue<String> queue) {
this.queue = queue;
}
private String getResponse(String url) throws IOException {
URL obj = new URL(url);
HttpURLConnection con = (HttpURLConnection) obj.openConnection();
StringBuilder response;
try (BufferedReader in = new BufferedReader(
new InputStreamReader(con.getInputStream()))) {
String inputLine;
response = new StringBuilder();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
}
return response.toString();
}
@Override
public void run() {
while (true) {
try {
String data = queue.take();
String response = getResponse(data);
//Do something with response
System.out.println(response);
} catch (InterruptedException | IOException ex) {
//Handle exception
}
}
}
}
Is there a better or faster way to make thousands of HTTP requests and process their responses asynchronously? Speed and performance are what I'm after.
Answering my own question: I tried Apache's asynchronous HTTP client, but after a while I switched to Ning's async HTTP client and I am happy with it.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;
import org.apache.http.client.methods.HttpGet;
import java.util.Iterator;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
public class RestService {
private final static Executor executor = Executors.newCachedThreadPool();
private final static CloseableHttpClient closeableHttpClient = HttpClientBuilder.create().build();
public static String sendSyncGet(final String url) {
return sendAsyncGet(List.of(url)).get(0);
}
public static List<String> sendAsyncGet(final List<String> urls){
List<GetRequestTask> tasks = urls.stream().map(url -> new GetRequestTask(url, executor)).collect(Collectors.toList());
List<String> responses = new ArrayList<>();
while(!tasks.isEmpty()) {
for(Iterator<GetRequestTask> it = tasks.iterator(); it.hasNext();) {
final GetRequestTask task = it.next();
if(task.isDone()) {
responses.add(task.getResponse());
it.remove();
}
}
//if(!tasks.isEmpty()) Thread.sleep(100); //avoid tight loop in "main" thread
}
return responses;
}
private static class GetRequestTask {
private final FutureTask<String> task;
public GetRequestTask(String url, Executor executor) {
GetRequestWork work = new GetRequestWork(url);
this.task = new FutureTask<>(work);
executor.execute(this.task);
}
public boolean isDone() {
return this.task.isDone();
}
public String getResponse() {
try {
return this.task.get();
} catch(Exception e) {
throw new RuntimeException(e);
}
}
}
private static class GetRequestWork implements Callable<String> {
private final String url;
public GetRequestWork(String url) {
this.url = url;
}
public String getUrl() {
return this.url;
}
public String call() throws Exception {
return closeableHttpClient.execute(new HttpGet(getUrl()), new BasicResponseHandler());
}
}
}
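Assuming the class above is on the classpath, a caller can fetch several URLs and block until every response has arrived (the URLs here are placeholders):
import java.util.List;
public class RestServiceDemo {
    public static void main(String[] args) {
        // sendAsyncGet returns once all requests have completed.
        List<String> responses = RestService.sendAsyncGet(
                List.of("https://example.com/a", "https://example.com/b"));
        responses.forEach(System.out::println);
        // Convenience wrapper for a single request.
        System.out.println(RestService.sendSyncGet("https://example.com"));
    }
}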
I am currently developing an application that requires random access to many (60k-100k) relatively large files.
Since opening and closing streams is a rather costly operation, I'd prefer to keep the FileChannels for the largest files open until they are no longer needed.
The problem is that since this kind of behaviour is not covered by Java 7's try-with-resources statement, I have to close all the FileChannels manually.
But that is becoming increasingly complicated, since the same files can be accessed concurrently throughout the software.
I have implemented a ChannelPool class that keeps track of opened FileChannel instances for each registered Path. At regular intervals, the ChannelPool can then be told to close those channels whose Path is only weakly referenced by the pool itself.
I would prefer an event-listener approach, but I'd also rather not have to listen to the GC.
The FileChannelPool from Apache Commons doesn't address my problem, because channels still need to be closed manually.
Is there a more elegant solution to this problem? And if not, how can my implementation be improved?
import java.io.IOException;
import java.lang.ref.WeakReference;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;
public class ChannelPool {
private static final ChannelPool defaultInstance = new ChannelPool();
private final ConcurrentHashMap<String, ChannelRef> channels;
private final Timer timer;
private ChannelPool(){
channels = new ConcurrentHashMap<>();
timer = new Timer();
}
public static ChannelPool getDefault(){
return defaultInstance;
}
public void initCleanUp(){
// wait 2 seconds then repeat clean-up every 10 seconds.
timer.schedule(new CleanUpTask(this), 2000, 10000);
}
public void shutDown(){
// must be called manually.
timer.cancel();
closeAll();
}
public FileChannel getChannel(Path path){
ChannelRef cref = channels.get(path.toString());
System.out.println("getChannel called " + channels.size());
if (cref == null){
cref = ChannelRef.newInstance(path);
if (cref == null){
// failed to open channel
return null;
}
ChannelRef oldRef = channels.putIfAbsent(path.toString(), cref);
if (oldRef != null){
try{
// close new channel and let GC dispose of it
cref.channel().close();
System.out.println("redundant channel closed");
}
catch (IOException ex) {}
cref = oldRef;
}
}
return cref.channel();
}
private void remove(String str) {
ChannelRef ref = channels.remove(str);
if (ref != null){
try {
ref.channel().close();
System.out.println("old channel closed");
}
catch (IOException ex) {}
}
}
private void closeAll() {
for (Map.Entry<String, ChannelRef> e : channels.entrySet()){
remove(e.getKey());
}
}
private void maintain() {
// close channels for dereferenced paths
for (Map.Entry<String, ChannelRef> e : channels.entrySet()){
ChannelRef ref = e.getValue();
if (ref != null){
Path p = ref.pathRef().get();
if (p == null){
// gc'd
remove(e.getKey());
}
}
}
}
private static class ChannelRef{
private FileChannel channel;
private WeakReference<Path> ref;
private ChannelRef(FileChannel channel, WeakReference<Path> ref) {
this.channel = channel;
this.ref = ref;
}
private static ChannelRef newInstance(Path path) {
FileChannel fc;
try {
fc = FileChannel.open(path, StandardOpenOption.READ);
}
catch (IOException ex) {
return null;
}
return new ChannelRef(fc, new WeakReference<>(path));
}
private FileChannel channel() {
return channel;
}
private WeakReference<Path> pathRef() {
return ref;
}
}
private static class CleanUpTask extends TimerTask {
private ChannelPool pool;
private CleanUpTask(ChannelPool pool){
super();
this.pool = pool;
}
@Override
public void run() {
pool.maintain();
pool.printState();
}
}
private void printState(){
System.out.println("Clean up performed. " + channels.size() + " channels remain. -- " + System.currentTimeMillis());
for (Map.Entry<String, ChannelRef> e : channels.entrySet()){
ChannelRef cref = e.getValue();
String out = "open: " + cref.channel().isOpen() + " - " + cref.channel().toString();
System.out.println(out);
}
}
}
EDIT:
Thanks to fge's answer, I now have exactly what I needed.
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
public class Channels {
private static final LoadingCache<Path, FileChannel> channelCache =
CacheBuilder.newBuilder()
.weakKeys()
.removalListener(
new RemovalListener<Path, FileChannel>(){
@Override
public void onRemoval(RemovalNotification<Path, FileChannel> removal) {
FileChannel fc = removal.getValue();
try {
fc.close();
}
catch (IOException ex) {}
}
}
)
.build(
new CacheLoader<Path, FileChannel>() {
@Override
public FileChannel load(Path path) throws IOException {
return FileChannel.open(path, StandardOpenOption.READ);
}
}
);
public static FileChannel get(Path path){
try {
return channelCache.get(path);
}
catch (ExecutionException ex){}
return null;
}
}
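A minimal usage sketch for the class above (the path is a placeholder; note that get() returns null if opening the channel failed):
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
public class ChannelsDemo {
    public static void main(String[] args) throws Exception {
        Path path = Paths.get("data/sample.bin"); // placeholder path
        FileChannel fc = Channels.get(path);      // opened on first access, then cached
        if (fc != null) {
            ByteBuffer buf = ByteBuffer.allocate(128);
            fc.read(buf, 0); // positional read; the channel stays open in the cache
        }
        // Reusing the same Path instance returns the cached channel; once the key
        // is no longer referenced, the removal listener closes the channel.
    }
}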
Have a look here:
http://code.google.com/p/guava-libraries/wiki/CachesExplained
You can use a LoadingCache with a removal listener that closes the channel for you when the entry is evicted, and you can specify expiry after access or after write.
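For instance, the builder from the edit above could additionally bound the cache and expire idle entries, so channels are closed without waiting for the Path keys to be garbage-collected (a sketch only: the numbers are illustrative, and java.util.concurrent.TimeUnit needs to be imported in addition to the imports already shown):
private static final LoadingCache<Path, FileChannel> channelCache =
    CacheBuilder.newBuilder()
        .maximumSize(1000) // cap on simultaneously open channels
        .expireAfterAccess(10, TimeUnit.MINUTES) // close channels that sit idle
        .removalListener(
            new RemovalListener<Path, FileChannel>() {
                @Override
                public void onRemoval(RemovalNotification<Path, FileChannel> removal) {
                    try {
                        removal.getValue().close();
                    } catch (IOException ex) {}
                }
            }
        )
        .build(
            new CacheLoader<Path, FileChannel>() {
                @Override
                public FileChannel load(Path path) throws IOException {
                    return FileChannel.open(path, StandardOpenOption.READ);
                }
            }
        );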
I have made a program that continuously monitors a log file, but I don't know how to monitor multiple log files. This is what I did to monitor a single file. What changes should I make to the following code so that it also monitors multiple files?
package com.read;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FileWatcherTest {
public static void main(String args[]) {
final File fileName = new File("D:/logs/myFile.log");
// monitor a single file
TimerTask fileWatcherTask = new FileWatcher(fileName) {
long addFileLen = fileName.length();
FileChannel channel;
FileLock lock;
String a = "";
String b = "";
@Override
protected void onChange(File file) {
RandomAccessFile access = null;
try {
access = new RandomAccessFile(file, "rw");
channel = access.getChannel();
lock = channel.lock();
if (file.length() < addFileLen) {
access.seek(file.length());
} else {
access.seek(addFileLen);
}
} catch (Exception e) {
e.printStackTrace();
}
String line = null;
try {
while ((line = access.readLine()) != null) {
System.out.println(line);
}
addFileLen = file.length();
} catch (IOException ex) {
Logger.getLogger(FileWatcherTest.class.getName()).log(
Level.SEVERE, null, ex);
}
try {
lock.release();
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} // Close the file
try {
channel.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
};
Timer timer = new Timer();
// repeat the check every second
timer.schedule(fileWatcherTask, new Date(), 1000);
}
}
package com.read;
import java.util.*;
import java.io.*;
public abstract class FileWatcher extends TimerTask {
private long timeStamp;
private File file;
static String s;
public FileWatcher(File file) {
this.file = file;
this.timeStamp = file.lastModified();
}
public final void run() {
long timeStamp = file.lastModified();
if (this.timeStamp != timeStamp) {
this.timeStamp = timeStamp;
onChange(file);
}
}
protected abstract void onChange(File file);
}
You should use threads. Here's a good tutorial:
http://docs.oracle.com/javase/tutorial/essential/concurrency/
You would do something like:
public class FileWatcherTest {
public static void main(String args[]) {
(new Thread(new FileWatcherRunnable("first.log"))).start();
(new Thread(new FileWatcherRunnable("second.log"))).start();
}
private static class FileWatcherRunnable implements Runnable {
private String logFilePath;
// you should inject the file path of the log file to watch
public FileWatcherRunnable(String logFilePath) {
this.logFilePath = logFilePath;
}
public void run() {
// your code from main goes in here
}
}
}
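A sketch of what could go inside the skeleton's run() method, reusing the FileWatcher class from the question (the RandomAccessFile tail-reading logic from the original onChange() is elided; java.io.File, java.util.Date, java.util.Timer, and java.util.TimerTask need to be imported):
public void run() {
    File file = new File(logFilePath);
    TimerTask task = new FileWatcher(file) {
        @Override
        protected void onChange(File f) {
            // place the tail-reading code from the original onChange() here
            System.out.println(f + " changed");
        }
    };
    // Timer already runs its task on a background thread, so scheduling all
    // FileWatchers on one shared Timer would work just as well.
    new Timer().schedule(task, new Date(), 1000); // check this file every second
}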