Monitoring directory for changes from web service - java

Don't know if it is clear from title, I'll explain it deeper.
First of all limitations: Java 1.5 IBM.
This is the situation:
I have spring web service that receives request with pdf document in it. I need to put this pdf into the some input directory that AFP application (not of the importance) monitors. This AFP application takes that pdf, do something with it and returns it to some output directory that I need to monitor. Monitoring of output directory would take some time, probably 30 seconds. Also, I know what is exact file name that I expect to appear in output directory. If nothing appears in 30 seconds than I would return some fault response.
Because of my poor knowledge of web services and multithreading I don't know in which possible problems I can fall into.
Also, searching the internet I realize that most of people recommend watchservice for directory monitoring, but this is introduced in Java 7.
Any suggestion, link, idea would be helpful.

So, the scenario is simple. In a main method, the following actions are done in order:
call the AFP service;
poll the directory for the output file;
deal with the output file.
We suppose here that outputFile is a File containing the absolute path to the generated file; this method returns void, adapt:
// We poll every second, so...
private static final int SAMPLES = 30;
public void dealWithAFP(whatever, arguments, are, there)
throws WhateverIsNecessary
{
callAfpService(here);
int i = 0;
try {
while (i < SAMPLES) {
TimeUnit.SECONDS.sleep(1);
if (outputFile.exists())
break;
}
throw new WhateverIsNecessary();
} catch (InterruptedException e) {
// Throw it back if the method does, otherwise the minimum is to:
Thread.currentThread().interrupt();
throw new WhateverIsNecessary();
}
dealWithOutputFile(outputFile);
}

Related

Encapsulating a multi-threaded operation in Java

I have a situation where I have a large number of classes that need to do file (read only) access. This is part of a web app running on top of OSGI, so there will be a lot of concurrent needs to access.
So I'm building an OSGI service to access the file system for all the other pieces that will need it and provide a centralized access as this also simplifies configuration of file locations, etc.
It occurs to me that a multi-threaded approach makes the most sense along with a thread pool.
So the question is this:
If I do this and I have a service with an interface like:
FileService.getFileAsClass(class);
and the method getFileAsClass(class) looks kinda like this: (this is a sketch it may not be perfect java code)
public < T> T getFileAsClass(Class< T> clazz) {
Future<InputStream> classFuture = threadpool.submit(new Callable< InputStream>() {
/* initialization block */
{
//any setup from configs.
}
/* implement Callable */
public InputStream call() {
InputStream stream = //new inputstream from file location;
boolean giveUp = false;
while(null == stream && !giveUp) {
//Code that tries to read in the file 4
// times with a Thread.sleep() then gives up
// this is here t make sure we aren't busy updating file.
}
return stream;
}
});
//once we have the file, convert it and return it.
return InputStreamToClassConverter< T>.convert(classFuture.get());
}
Will that correctly wait until the relevant operation is done to call InputStreamtoClassConverter.convert?
This is my first time writing multithreaded java code so I'm not sure what I can expect for some of the behavior. I don't care about order of which threads complete, only that the file handling is handled async and once that file pull is done, then and only then is the Converter used.

Why is it that RESTlet takes quite time to print the XML, sometimes

I am implementing REST through RESTlet. This is an amazing framework to build such a restful web service; it is easy to learn, its syntax is compact. However, usually, I found that when somebody/someprogram want to access some resource, it takes time to print/output the XML, I use JaxbRepresentation. Let's see my code:
#Override
#Get
public Representation toXml() throws IOException {
if (this.requireAuthentication) {
if (!this.app.authenticate(getRequest(), getResponse()))
{
return new EmptyRepresentation();
}
}
//check if the representation already tried to be requested before
//and therefore the data has been in cache
Object dataInCache = this.app.getCachedData().get(getURI);
if (dataInCache != null) {
System.out.println("Representing from Cache");
//this is warning. unless we can check that dataInCache is of type T, we can
//get rid of this warning
this.dataToBeRepresented = (T)dataInCache;
} else {
System.out.println("NOT IN CACHE");
this.dataToBeRepresented = whenDataIsNotInCache();
//automatically add data to cache
this.app.getCachedData().put(getURI, this.dataToBeRepresented, cached_duration);
}
//now represent it (if not previously execute the EmptyRepresentation)
JaxbRepresentation<T> jaxb = new JaxbRepresentation<T>(dataToBeRepresented);
jaxb.setFormattedOutput(true);
return jaxb;
}
AS you can see, and you might asked me; yes I am implementing Cache through Kitty-Cache. So, if some XML that is expensive to produce, and really looks like will never change for 7 decades, then I will use cache... I also use it for likely static data. Maximum time limit for a cache is an hour to remain in memory.
Even when I cache the output, sometimes, output are irresponsive, like hang, printed partially, and takes time before it prints the remaining document. The XML document is accessible through browser and also program, it used GET.
What are actually the problem? I humbly would like to know also the answer from RESTlet developer, if possible. Thanks

Major java libraries doesn't predict the case of “cyclic copy” of a file onto a destination differently mapped

In my experience and after repeated tests I've done and deep web researches, I've found that major java libraries (either "Apache Commons" or Google.coomons or Jcifs) doesn't predict the case of “cyclic copy” of a file onto a destination differently mapped (denoted with different RootPath according with newer java.nio package Path Class) that,at last end of mapping cycle,resolves into the itself origin file.
That's a situation of data losing, because Outputsream method nor jnio's GetChannel method prevents itself this case:the origin file and the destination file are in reality "the same file" and the result of these methods is that the file become lost, better said the size o file become 0 length.
How can one avoid this without get off at a lower filesystem level or even surrender to a more safe Runtime.exec, delegating the stuff at the underlying S.O.
Should I have to lock the destination file (the above methods not allowing this), perhaps with the aid of the oldest RandomAccessFile Class ?
You can test using those cited major libraries with a common "CopyFile(File origin,File dest)" method after having done:
1) the origin folder of file c:\tmp\test.txt mapped to to x: virtual drive via a cmd's [SUBST x: c:\tmp] thus trying to copy onto x:\test.txt
2) Similar case if the local folder c:\tmp has been shared via Windows share mechanism and the destination is represented as a UNC path ending with the same file name
3) Other similar network situations ...
I think there must be another better solution, but my experience of java is fairly few and so I ask for this to you all. Thanks in advance if interested in this “real world” discussion.
Your question is interesting, never thought about that. Look at this question: Determine Symbolic Links. You should detect the cycle before copying.
Perhaps you can try to approach this problem slightly differently and try to detect that source and destination files are the same by comparing file's metadata (name, size, date, etc) and perhaps even calculate hash of the files content as well. This would of course slow processing down.
If you have enough permissions you could also write 'marker' file with random name in destination and try to read it at the source to detect that they're pointing to the same place. Or try to check that file already exist at destination before copying.
I agree that it is unusual situations, but you will agree that files are a critical base of every IT system. I disagree that manipulating files in java is unusual: in my case I have to attach image files of products through FileChooser and copy them in ordered way to a repository ... but real world users (call them customers who buy your product) may fall in such situations and if it happens, one can not 'blame the devil of bad luck if your product does something "less" than expected.
It is a good practice learning from experience and try to avoid what one of Murphy's Laws says, more' or less: "if something CAN go wrong, it WILL go wrong sooner or later.
Is perhaps also for one of those a reason I believe the Java team at Sun and Oracle has enhanced the old java.io package for to the newest java.nio. I'm analyzing a the new java.nio.Files Class which I had escaped to attention, and soon I believe I've found the solution I wanted and expected. See you later.
Thank for the address from other experienced members of the community,and thanks also to a young member of my team, Tindaro, who helped me in the research, I've found the real solution in Jdk 1.7, which is made by reliable, fast, simple and almost definitively will spawn a pity veil on older java.io solutions. Despite the web is still plenty full of examples of copying files in java using In/out Streams I'll warmely suggest everyone to use a simple method : java.nio.Files.copy(Path origin, Path destination) with optional parameters for replacing destination,migrate metadata file attributes and even try a transactional move of files (if permitted by the underlying O.S.).
That's a really good Job, waited for so long!
You can easily convert code from copy(File file1, File file2) by appending a ".toPath()" to the File instance (e.g. file1.toPath(), file2.toPath().
Note also that the boolean method "isSameFile(file1.toPath(), file2.toPath())", is already used inside the above copy method but easily usable in every case you want.
For every case you can't upgrade to 1.7 using community libraries from Apache or Google is still suggested, but for reliable purpose, permit me to suggest the temporary workaround I've found before:
public static boolean isTheSameFile(File f1, File f2) {//throws Exception{
// minimum prerequisites !
if(f1.length()!=f2.length()) return false;
if (!file1.exists() || !file2.exists()) { return false; }
if (file1.isDirectory() || file2.isDirectory()){ return false; }
//if (file1.getCanonicalFile().equals(file2.getCanonicalFile())); //don't rely in this ! can even still fail
//new FileInputStream(f2).getChannel().lock();//exception, can lock only on OutputStream
RandomAccessFile rf1=null,rf2=null; //the only practicable solution on my own ... better than parse entire files
try {
rf1 = new RandomAccessFile(f1, "r");
rf2=new RandomAccessFile(f2, "rw");
} catch (FileNotFoundException e) {
e.printStackTrace();
return false;
}
try {
rf2.getChannel().lock();
} catch (IOException e) {
return false;
}
try {
rf1.getChannel().read(ByteBuffer.allocate(1));//reads 1 only byte
} catch (IOException e) {
//e.printStackTrace(); // if and if only the same file, the O.S. will throw an IOException with reason "file already in use"
try {rf2.close();} catch (IOException e1) {}
return true;
}
//close the still opened resources ...
if (rf1.getChannel().isOpen())
try {rf1.getChannel().close();} catch (IOException e) {}
try {
rf2.close();
} catch (IOException e) {
return false;
}
// done, files differs
return false;
}

JUnit tests fail when creating new Files

We have several JUnit tests that rely on creating new files and reading them. However there are issues with the files not being created properly. But this fault comes and goes.
This is the code:
#Test
public void test_3() throws Exception {
// Deletes files in tmp test dir
File tempDir = new File(TEST_ROOT, "tmp.dir");
if (tempDir.exists()) {
for (File f : tempDir.listFiles()) {
f.delete();
}
} else {
tempDir.mkdir();
}
File file_1 = new File(tempDir, "file1");
FileWriter out_1 = new FileWriter(file_1);
out_1.append("# File 1");
out_1.close();
File file_2 = new File(tempDir, "file2");
FileWriter out_2 = new FileWriter(file_2);
out_2.append("# File 2");
out_2.close();
File file_3 = new File(tempDir, "fileXXX");
FileWriter out_3 = new FileWriter(file_3);
out_3.append("# File 3");
out_3.close();
....
The fail is that the second file object, file_2, never gets created. Sometimes. Then when we try to write to it a FileNotFoundException is thrown
If we run only this testcase, everything works fine.
If we run this testfile with some ~40 testcases, it can both fail and work depending on the current lunar cycle.
If we run the entire testsuite, consisting of some 10*40 testcases, it always fails.
We have tried
adding sleeps (5sec) after new File, nothing
adding while loop until file_2.exists() is true but the loop never stopped
catching SecurityException, IOException and even throwable when we do the New File(..), but caught nothing.
At one point we got all files to be created, but file_2 was created before file_1 and a test that checked creation time failed.
We've also tried adding file_1.createNewFile() and it always returns true.
So what is going on? How can we make tests that depend on actual files and always be sure they exist?
This has been tested in both java 1.5 and 1.6, also in Windows 7 and Linux. The only difference that can be observed is that sometimes a similar testcase before fails, and sometimes file_1 isn't created instead
Update
We tried a new variation:
File file_2 = new File(tempDir, "file2");
while (!file_2.canRead()) {
Thread.sleep(500);
try {
file_2.createNewFile();
} catch (IOException e) {
e.printStackTrace();
}
}
This results in alot of Exceptions of the type:
java.io.IOException: Access is denied
at java.io.WinNTFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:883)
... but eventually it works, the file is created.
Are there multiple instances of your program running at once?
Check for any extra instances of javaw.exe running. If multiple programs have handles to the same file at once, things can get very wonky very quickly.
Do you have antivirus software or anything else running that could be getting in the way of file creation/deletion, by handle?
Don't hardcode your file names, use random names. It's the only way to abstract yourself from the various external situations that can occur (multiple access to the same file, permissions, file system error, locking problems, etc...).
One thing for sure: using sleep() or retrying is guaranteed to cause weird errors at some point in the future, avoid doing that.
I did some googling and based on this lucene bug and this board question seems to indicate that there could be an issue with file locking and other processes using the file.
Since we are running this on ClearCase it seems plausible that ClearCase does some indexing or something similar when the files are being created. Adding loops that repeat until the file is readable solved the issue, so we are going with that. Very ugly solution though.
Try File#createTempFile, this at least guarantees you that there are no other files by the same name that would still hold a lock.

Locking file across services

What is the best way to share a file between two "writer" services in the same application?
Edit:
Sorry I should have given more details I guess.
I have a Service that saves entries into a buffer. When the buffer gets full it writes all the entries to the file (and so on). Another Service running will come at some point and read the file (essentially copy/compress it) and then empty it.
Here is a general idea of what you can do:
public class FileManager
{
private final FileWriter writer = new FileWriter("SomeFile.txt");
private final object sync = new object();
public void writeBuffer(string buffer)
{
synchronized(sync)
{
writer.write(buffer.getBytes());
}
}
public void copyAndCompress()
{
synchronized(sync)
{
// copy and/or compress
}
}
}
You will have to do some extra work to get it all to work safe, but this is just a basic example to give you an idea of how it looks.
A common method for locking is to create a second file in the same location as the main file. The second file may contain locking data or be blank. The benefit to having locking data (such as a process ID) is that you can easily detect a stale lockfile, which is an inevitability you must plan for. Although PID might not be the best locking data in your case.
example:
Service1:
creates myfile.lock
creates/opens myfile
Service2:
Notices that myfile.lock is present and pauses/blocks/waits
When myfile.lock goes away, it creates it and then opens myfile.
It would also be advantageous for you to double-check that the file contains your locking information (identification specific to your service) right after creating it - just in case two or more services are waiting and create a lock at the exact same time. The last one succeeds and so all other services should notice that their locking data is no longer in the file. Also - pause a few milliseconds before checking its contents.

Categories