I am interested to know whether the Observer Pattern is the correct approach for implementing code that monitors log files and their changes.
I am currently using it, but there seems to be an anomaly that I can't quite explain. Basically, I create a class called FileMonitor with a timer that fires and iterates a list of unique files, looking for a changed "lastModified" date.
Upon finding one, the list of listeners is iterated to find the one matching the file, and its
fileChanged event is fired. The listener then begins to process the lines that were added to the file.
So to make my question more succinct:
Does the Observer Pattern fit what I am trying to do? (Currently
I have one Listener per file)
Is there any possibility of 'concurrency issues' given that there is more than one File to
monitor?
Thanks
Java 7 introduced the WatchService, which watches registered objects for changes and events.
A Watchable object is registered with a watch service by invoking its
register method, returning a WatchKey to represent the registration.
When an event for an object is detected the key is signalled, and if
not currently signalled, it is queued to the watch service so that it
can be retrieved by consumers that invoke the poll or take methods to
retrieve keys and process events. Once the events have been processed
the consumer invokes the key's reset method to reset the key which
allows the key to be signalled and re-queued with further events.
File systems may report events faster than they can be retrieved or
processed and an implementation may impose an unspecified limit on the
number of events that it may accumulate. Where an implementation
knowingly discards events then it arranges for the key's pollEvents
method to return an element with an event type of OVERFLOW. This event
can be used by the consumer as a trigger to re-examine the state of
the object.
Example -
Path myDir = Paths.get("D:/test");
try {
    WatchService watcher = myDir.getFileSystem().newWatchService();
    myDir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE,
            StandardWatchEventKinds.ENTRY_DELETE, StandardWatchEventKinds.ENTRY_MODIFY);

    WatchKey watchKey = watcher.take();
    List<WatchEvent<?>> events = watchKey.pollEvents();
    for (WatchEvent<?> event : events) {
        if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
            System.out.println("Created: " + event.context().toString());
        }
        if (event.kind() == StandardWatchEventKinds.ENTRY_DELETE) {
            System.out.println("Delete: " + event.context().toString());
        }
        if (event.kind() == StandardWatchEventKinds.ENTRY_MODIFY) {
            System.out.println("Modify: " + event.context().toString());
        }
    }
} catch (Exception e) {
    System.out.println("Error: " + e.toString());
}
Reference - link
If you do not want to use Java 7, you can get the same behavior with Apache IO.
From the official documentation:
FileAlterationObserver represents the state of files below a root
directory, checking the filesystem and notifying listeners of create,
change or delete events.
Here is how you can add listeners to define operations to be executed when such events happen.
File directory = new File(new File("."), "src");
FileAlterationObserver observer = new FileAlterationObserver(directory);
observer.addListener(...);
observer.addListener(...);
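The addListener(...) placeholders stand for your own listener implementations. As a rough sketch (the class name LogFileListener is just an example), a listener can extend FileAlterationListenerAdaptor and override only the callbacks it cares about:

import java.io.File;
import org.apache.commons.io.monitor.FileAlterationListenerAdaptor;

public class LogFileListener extends FileAlterationListenerAdaptor {

    @Override
    public void onFileCreate(File file) {
        System.out.println("Created: " + file.getName());
    }

    @Override
    public void onFileChange(File file) {
        System.out.println("Changed: " + file.getName());
    }

    @Override
    public void onFileDelete(File file) {
        System.out.println("Deleted: " + file.getName());
    }
}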
You will have to register the observer(s) with a FileAlterationMonitor. Continuing from the same documentation:
long interval = ...
FileAlterationMonitor monitor = new FileAlterationMonitor(interval);
monitor.addObserver(observer);
monitor.start();
...
monitor.stop();
Where interval is the amount of time (in milliseconds) to wait between checks of the file system.
Look for the package named org.apache.commons.io.monitor in the library.
Does the Observer Pattern fit what I am trying to do? (Currently I
have one Listener per file)
Yes, it does.
Is there any possibility of 'concurrency issues' given that there is
more than one File to monitor?
If you have multiple threads removing and adding listeners to a list backed by an ArrayList, you run the risk of a ConcurrentModificationException. Use a CopyOnWriteArrayList instead (a small sketch follows below).
IIRC, Effective Java has an item with a good example of exactly this.
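For illustration, a minimal listener registry along those lines; the FileListener interface and fireFileChanged method are assumptions based on the question, not existing APIs:

import java.io.File;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class FileMonitor {

    // Iteration works on a snapshot, so listeners can be added or removed from other
    // threads while the timer thread is notifying, without ConcurrentModificationException.
    private final List<FileListener> listeners = new CopyOnWriteArrayList<FileListener>();

    public void addListener(FileListener listener) {
        listeners.add(listener);
    }

    public void removeListener(FileListener listener) {
        listeners.remove(listener);
    }

    // Called from the timer thread when a file's lastModified date has changed.
    void fireFileChanged(File file) {
        for (FileListener listener : listeners) {
            listener.fileChanged(file);
        }
    }

    // Assumed listener interface, mirroring the fileChanged event from the question.
    public interface FileListener {
        void fileChanged(File file);
    }
}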
I'd suggest going with NIO
and the file watcher service - Watching File For Changes
Related
I am running a hierarchical Spring Statemachine and - after walking through the initial transitions into state UP with the default substate STOPPED - want to use statemachine.getState(). Trouble is, it gives me only the parent state UP, and I cannot find an obvious way to retrieve both the parent state and the substate.
The machine has states constructed like so:
StateMachineBuilder.Builder<ToolStates, ToolEvents> builder = StateMachineBuilder.builder();
builder.configureStates()
    .withStates()
        .initial(ToolStates.UP)
        .state(ToolStates.UP, new ToolUpEventAction(), null)
        .state(ToolStates.DOWN)
        .and()
    .withStates()
        .parent(ToolStates.UP)
        .initial(ToolStates.STOPPED)
        .state(ToolStates.STOPPED, new ToolStoppedEventAction(), null)
        .state(ToolStates.IDLE)
        .state(ToolStates.PROCESSING,
            new ToolBeginProcessingPartAction(),
            new ToolDoneProcessingPartAction());
...
builder.build();
ToolStates and ToolEvents are just enums. In the client class, after running the builder code above, the state machine is started with statemachine.start(). When I subsequently call statemachine.getState().getId(), it gives me UP. No events are sent to the state machine before that call.
I have been up and down the Spring Statemachine docs and examples. I know from debugging that the entry actions of both states UP and STOPPED have been invoked, so I am assuming they are both "active" and would expect both states to be reported when querying the state machine. Is there a clean way to achieve this? I want to avoid storing the substate somewhere from inside the Action classes, since I believe I have delegated all state management issues to the freakin' state machine in the first place, and I would rather learn how to use its API for this purpose.
Hopefully this is something embarrassingly obvious...
Any advice most welcome!
The documentation describes getStates():
https://docs.spring.io/spring-statemachine/docs/current/api/org/springframework/statemachine/state/State.html
java.util.Collection<State<S,E>> getStates()
Gets all possible states this state knows about including itself and substates.
stateMachine.getState().getStates();
To wrap it up after SMA's most helpful advice: it turns out that stateMachine.getState().getStates() does, in my case, return a list of four elements:
a StateMachineState instance containing UP and STOPPED
three ObjectState instances containing IDLE, STOPPED and PROCESSING,
respectively.
This leads me to go forward, for the time being, with the following solution:
public List<ToolStates> getStates() {
    List<ToolStates> result = new ArrayList<>();
    Collection<State<ToolStates, ToolEvents>> states = this.stateMachine.getState().getStates();
    Iterator<State<ToolStates, ToolEvents>> iter = states.iterator();
    while (iter.hasNext()) {
        State<ToolStates, ToolEvents> candidate = iter.next();
        if (!candidate.isSimple()) {
            Collection<ToolStates> ids = candidate.getIds();
            Iterator<ToolStates> i = ids.iterator();
            while (i.hasNext()) {
                result.add(i.next());
            }
        }
    }
    return result;
}
This would maybe be more elegant with some streaming and filtering, but it does the trick for now. I don't like it much, though; it's a lot of error-prone logic, and I'll have to see whether it holds up in the future. I wonder why there isn't a function in Spring Statemachine that gives me a list of the enum values of all the currently active states, rather than giving me everything possible and forcing me to poke around in it with external logic...
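For reference, a stream-based variant of the same logic (just a sketch; it filters and flattens exactly as the loop above does):

public List<ToolStates> getStates() {
    // collect the ids of every composite (non-simple) state that is currently active
    return this.stateMachine.getState().getStates().stream()
            .filter(state -> !state.isSimple())
            .flatMap(state -> state.getIds().stream())
            .collect(java.util.stream.Collectors.toList());
}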
I create a job running a Spring bean class with this code:
MethodInvokingJobDetailFactoryBean jobDetail = new MethodInvokingJobDetailFactoryBean();
Class<?> businessClass = Class.forName(task.getBusinessClassType());
jobDetail.setTargetObject(applicationContext.getBean(businessClass));
jobDetail.setTargetMethod(task.getBusinessMethod());
jobDetail.setName(task.getCode());
jobDetail.setGroup(task.getGroup().getCode());
jobDetail.setConcurrent(false);
Object[] argumentArray = builArgumentArray(task.getBusinessMethodParams());
jobDetail.setArguments(argumentArray);
jobDetail.afterPropertiesSet();
CronTrigger trigger = TriggerBuilder.newTrigger().withIdentity(task.getCode() + "_TRIGGER", task.getGroup().getCode() + "_TRIGGER_GROUP")
.withSchedule(CronScheduleBuilder.cronSchedule(task.getCronExpression())).build();
dataSchedulazione = scheduler.scheduleJob((JobDetail) jobDetail.getObject(), trigger);
scheduler.start();
Sometimes the task stops responding. If I remove the trigger and the task from the scheduler, the job remains in
List ob = scheduler.getCurrentlyExecutingJobs();
The state of the trigger is NONE, but the job is still listed by scheduler.getCurrentlyExecutingJobs().
I have tried to implement InterruptableJob in a class that extends MethodInvokingJobDetailFactoryBean.
But when I use
scheduler.interrupt(jobKey);
it says that InterruptableJob is not implemented. I think this is because the job instance actually comes from the MethodInvokingJobDetailFactoryBean:
scheduler.scheduleJob((JobDetail) jobDetail.getObject(), trigger);
This is the code inside the Quartz scheduler:
job = jec.getJobInstance();
if (job instanceof InterruptableJob) {
    ((InterruptableJob) job).interrupt();
    interrupted = true;
} else {
    throw new UnableToInterruptJobException(
        "Job " + jobDetail.getKey() +
        " can not be interrupted, since it does not implement " +
        InterruptableJob.class.getName());
}
Is there another way to kill a single task?
I use Quartz 2.1.7 with Java 1.6 and Java 1.8.
TIA
Andrea
There is no magic way to force the JVM to stop executing some piece of code.
You can implement different ways to interrupt the job, but the most appropriate way is to implement InterruptableJob.
Implementing this interface is not sufficient, though. You have to implement the job in such a way that it actually reacts to such requests.
Example
Suppose your job is processing 1,000,000 records in a database or in a file, and it takes a relatively long time, say 1 hour. One possible implementation is the following: in the interrupt() method you set some flag (a member variable) to true, let's name it isInterruptionRequested. In the main logic that processes the 1,000,000 records you regularly check, e.g. every 5 seconds or after every 100 records, whether the flag isInterruptionRequested is set to true. If it is set, you exit from the method where you implemented the main logic (see the sketch below).
It is important that you don't check the condition too often. Otherwise, depending on the logic, checking whether job interruption was requested may take 80-90% of the CPU, much more than the actual logic :)
Thus, even when you implement the InterruptableJob interface properly, it doesn't mean that the job will be stopped immediately. It is just a hint, like "I would like this job to stop when possible". When it is stopped (if at all) depends on how you implement it.
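A minimal sketch of such a cooperative job; the record loop and the processRecord() helper are placeholders for your own logic, not part of the Quartz API:

import org.quartz.InterruptableJob;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.UnableToInterruptJobException;

public class LongRunningJob implements InterruptableJob {

    // written by the scheduler thread in interrupt(), read by the worker thread in execute()
    private volatile boolean isInterruptionRequested = false;

    public void execute(JobExecutionContext context) throws JobExecutionException {
        for (int i = 0; i < 1000000; i++) {
            processRecord(i); // placeholder for the real per-record work

            // check the flag only every 100 records to keep the overhead low
            if (i % 100 == 0 && isInterruptionRequested) {
                return; // stop gracefully at a safe point
            }
        }
    }

    public void interrupt() throws UnableToInterruptJobException {
        isInterruptionRequested = true; // only a hint; execute() decides when to stop
    }

    private void processRecord(int i) {
        // the actual business logic for one record goes here
    }
}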
In working with the WatchService, I found that if I delete a file in the directory being watched, it fires an ENTRY_MODIFY followed by an ENTRY_DELETE event.
I realize that, technically, a file may be modified before being deleted, but is it really the expected behavior that deleting a file triggers ENTRY_MODIFY (which presumably no one cares about)?
To handle this, I had to add a condition to check before passing along the ENTRY_MODIFY event:
if (eventKind == ENTRY_CREATE) {
    listener.fileCreated(file);
} else if (eventKind == ENTRY_MODIFY) {
    if (Files.exists(fullPath, LinkOption.NOFOLLOW_LINKS)) {
        listener.fileChanged(file);
    }
} else if (eventKind == ENTRY_DELETE) {
    listener.fileDeleted(file);
}
Is there a better way to handle this issue (feature)?
I can only confirm the issue. From the comments and from my own observations, the ENTRY_MODIFY event is fired just before the file is deleted and you have to deal with it.
Suppose we have two threads. One is doing Files.delete(), the other is watching the directory and trying to read modified files. Any of the following can happen:
Files.delete() just manages to modify and delete the file before the event is picked up by the watching thread. Then the technique of checking for the file's existence after ENTRY_MODIFY works.
the Files.delete() call can fail (throw an IOException), because the file is still opened by the watching thread.
The only resolution seems to be to ignore all IOExceptions in the watching thread and to retry the Files.delete() call a few times (a rough sketch of such a retry follows at the end of this answer).
I only tried to delete the files from the same JVM with Files.delete(). I did not try deleting from another process on the system. The problem reproduces on Windows 7~10 with NTFS, probably not on other OSes.
I encourage others to edit this answer and add their observations.
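Regarding the retry suggestion above, a rough sketch could look like this (the attempt count and back-off are arbitrary values, not from the original code):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public final class DeleteHelper {

    // Tries to delete the file a few times, backing off briefly between attempts.
    public static boolean deleteWithRetry(Path path, int maxAttempts) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                Files.deleteIfExists(path);
                return true;
            } catch (IOException e) {
                // the watching thread may still hold the file open; wait and try again
                Thread.sleep(50);
            }
        }
        return false;
    }
}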
General purpose of program
To read in a bash-style pattern and a specified location from the command line, and to find all files matching that pattern in that location - but I have to make the program multi-threaded.
General structure of the program
Driver/Main class, which parses arguments and initiates the other classes
ProcessDirectories class, which adds all directory addresses found under the specified root directory to a string array for processing later
DirectoryData class, which holds the addresses found by the class above
ProcessMatches class, which examines each directory found and adds any files inside that match the pattern to a string array for printing the results later
Main/Driver once again takes over and prints the results :)
The Problem
I need to be processing matches even while the ProcessDirectories class is still working (for efficiency, so I don't unnecessarily wait for the list to populate before doing work). To do this I try to: a) make the ProcessMatches threads wait() if DirectoryData is empty, and b) make ProcessDirectories notifyAll() when it has added a new entry.
The Question :)
Every tutorial I look at focuses on the producer and consumer being in the same object, or on dealing with just one data structure. How can I do this when I am using more than one data structure and more than one class for producing and consuming?
How about something like:
public static void main(String[] args) throws InterruptedException
{
    ProcessDirectories pd = ...
    BlockingQueue<DirectoryData> dirQueue = new LinkedBlockingQueue<DirectoryData>();
    new Thread(new Runnable(){public void run(){pd.addDirs(dirQueue);}}).start();

    ProcessMatches pm = ...
    BlockingQueue<File> fileQueue = new LinkedBlockingQueue<File>();
    new Thread(new Runnable()
    {
        public void run()
        {
            try
            {
                for (DirectoryData dir = dirQueue.take(); dir != DIR_POISON; dir = dirQueue.take())
                {
                    for (File file : dir.getFiles())
                    {
                        if (pm.matches(file))
                            fileQueue.add(file);
                    }
                }
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
            }
            fileQueue.add(FILE_POISON);
        }
    }).start();

    for (File file = fileQueue.take(); file != FILE_POISON; file = fileQueue.take())
    {
        output(file);
    }
}
This is just a rough idea, of course. ProcessDirectories.addDirs() would just add DirectoryData objects to the queue. In production you'd want to name the threads, perhaps use an executor to manage them, and perhaps use some mechanism other than a poison message to indicate the end of processing. Also, you might want to put a bound on the queue size.
Have one data structure that holds the data the two threads use to communicate with each other. This can be a queue that has "get data from the queue, waiting if empty" and "put data on the queue, waiting if full" operations. Those operations should internally call wait and notify on the queue itself, and they should be synchronized on that queue, as sketched below.
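For illustration, a minimal hand-rolled version of such a queue (a sketch only; in practice a java.util.concurrent.BlockingQueue, as used in the answer above, already provides this behaviour):

import java.util.LinkedList;

public class SharedQueue<T> {

    private final LinkedList<T> items = new LinkedList<T>();
    private final int capacity;

    public SharedQueue(int capacity) {
        this.capacity = capacity;
    }

    // put data on the queue, waiting if full
    public synchronized void put(T item) throws InterruptedException {
        while (items.size() == capacity) {
            wait();
        }
        items.addLast(item);
        notifyAll();
    }

    // get data from the queue, waiting if empty
    public synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        T item = items.removeFirst();
        notifyAll();
        return item;
    }
}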
I implemented JNotify to determine when a new file arrives in a particular directory and, when a file arrives, to send the filename over to another function, as follows:
public class FileDetector {

    MessageProcessor mp;

    class Listener implements JNotifyListener {
        public void fileCreated(int wd, String rootPath, String name) {
            print("created " + rootPath + " : " + name);
            mp.processMessage(rootPath + "\\" + name);
        }
    }
}
The function mp.processMessage tries to open the file, but I keep getting an error that the file is in use by another process. However, as the file has just been created, the only other process that might be using it is JNotify.
I put in a couple of print statements, and it appears that mp.processMessage is being called before the listener's print function. Does anyone have a suggestion for how I might resolve this, beyond putting the entire message processing inside the listener class?
@Eile What I think is that as soon as one process is copying the file, you are trying to read it; the 100 ms delay lets the copy complete first, and then you can read the file easily.
Here's what I've done so far: I added a 100 millisecond delay into mp.processMessage() before trying to open the file, and have had no issues with it. However, I am still puzzled as to why that would be necessary, and whether there is a better solution to this issue.
I have tried this and found that an arbitrary delay didn't work well for me. What I did was create a DelayQueue. I added each observed new file to the queue with a 100 ms delay. When the delay expired, I checked whether the file was readable/writable. If it was, I popped it from the queue. If not, I re-added it to the queue with another 100 ms delay. To check whether it was readable/writable I attempt to open a FileInputStream to the file. If there is no exception, I close the stream and pop the file.
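A rough sketch of that approach (class and method names are mine; the 100 ms delay matches the description above):

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class PendingFileQueue {

    // One queue entry: a file plus the absolute time at which it should be re-checked.
    static class DelayedFile implements Delayed {
        final File file;
        final long readyAt;

        DelayedFile(File file, long delayMs) {
            this.file = file;
            this.readyAt = System.currentTimeMillis() + delayMs;
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(readyAt - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
        }

        public int compareTo(Delayed other) {
            long diff = getDelay(TimeUnit.MILLISECONDS) - other.getDelay(TimeUnit.MILLISECONDS);
            return diff < 0 ? -1 : diff > 0 ? 1 : 0;
        }
    }

    private final DelayQueue<DelayedFile> queue = new DelayQueue<DelayedFile>();

    // Called from the JNotify listener when a new file is detected.
    public void offer(File file) {
        queue.put(new DelayedFile(file, 100));
    }

    // Called from a worker thread; blocks until a file's delay has expired and it can be opened.
    public File takeReadable() throws InterruptedException {
        while (true) {
            DelayedFile candidate = queue.take();
            try {
                FileInputStream in = new FileInputStream(candidate.file);
                in.close();
                return candidate.file; // the file could be opened, so the copy has finished
            } catch (IOException e) {
                // still locked by the copying process; re-add with another 100 ms delay
                queue.put(new DelayedFile(candidate.file, 100));
            }
        }
    }
}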
I am hoping that NIO.2 (JSR 203) does not have this same issue. If you can use Java 7, you might want to give it a try.