I am writing a piece of software that monitors the time a file was added to a specific directory. I need to do this in both C# and Java. However, I am not so much interested in when the files were created, as this could be days before they are actually moved into the directory of interest. I have been looking around, but have been unable to find anything. The closest I've found so far in Java is:
File file = new File(yourPathHere);
long lastModified = file.lastModified();
But that does not give me the time the file was moved into the folder. Thanks for help :)
If you are using Windows, have a look at these rules:
https://support.microsoft.com/en-us/kb/299648
It seems that when you move a file, it does not change its modification or creation date.
It's changed only when doing a copy.
As an alternative, you can scan your folder regularly, say every minute, and when you discover a new file, put it in a log and write down its discovery date.
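A minimal sketch of that scanning idea, assuming Java 8 or later. The class name `ArrivalScanner` and its methods are made up for illustration. It remembers the first time each file name was seen, so the recorded "arrival" time is only as precise as the scan interval.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: remembers when each file name was first seen in a directory. */
final class ArrivalScanner {
    private final Path dir;
    private final Map<Path, Instant> firstSeen = new HashMap<>();

    ArrivalScanner(Path dir) {
        this.dir = dir;
    }

    /** Scans once and returns only the names that no earlier scan has seen. */
    Map<Path, Instant> scan() throws IOException {
        Map<Path, Instant> discovered = new HashMap<>();
        Instant now = Instant.now();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                Path name = entry.getFileName();
                // putIfAbsent returns null only the first time a name is recorded
                if (firstSeen.putIfAbsent(name, now) == null) {
                    discovered.put(name, now);
                }
            }
        }
        return discovered;
    }

    /** Discovery time of a file name, or null if it has never been seen. */
    Instant arrivalOf(Path name) {
        return firstSeen.get(name);
    }
}
```

Calling `scan()` from a timer (for example a `ScheduledExecutorService`) once a minute and logging the returned map gives the discovery log. Note that this sketch never forgets a name, so a file that is removed and later moved in again under the same name is not reported a second time.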
As IInspectable says, FileSystemWatcher and FindFirstChangeNotification are probably the way to go if you want to avoid coding a scanner.
Related
I'm using Java to parse XML files that arrive via FTP. The problem is that the file I pick up may still be being copied/modified by the FTP transfer, so I need a method that can check whether the file has been completely written.
I've tried using the File::canWrite method (which didn't work at all) and looking for the closing tag of the XML file, but neither works correctly in every case. File::renameTo is pretty slow and doesn't look decent, although it works (not in every case either). Is there a good and fast way to check whether a file has been completely copied?
Thanks a lot!
Short answer: no. The best practice is to write to a file with a temporary name, for example somefile.part, and rename it when done. The writing program needs to do that. The workaround when you don't control the writing application is to check the modification time and ensure that some reasonable time, perhaps a minute, has passed since the most recent change. Then you assume that the file is complete.
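The workaround can be sketched like this, assuming Java 8 or later. `QuietPeriod` and `looksComplete` are hypothetical names, and the check is only a heuristic: a transfer that stalls for longer than the quiet period would still be treated as complete.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for the "wait until the file has been quiet for a while" heuristic. */
final class QuietPeriod {
    private QuietPeriod() {
    }

    /** True if the file's last modification lies at least quietPeriod before now. */
    static boolean looksComplete(Path file, Duration quietPeriod, Instant now) throws IOException {
        Instant lastModified = Files.getLastModifiedTime(file).toInstant();
        return !lastModified.plus(quietPeriod).isAfter(now);
    }
}
```

A caller would typically pass `Instant.now()` and something like `Duration.ofMinutes(1)`, and simply skip the file until the check returns true.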
I have a scenario where I need to detect the most recently moved file in a directory. The options I have found say to check the last modified date and pick the file up based on that.
But in my case, even when a new file arrives, its modified date may be older than that of files already in the directory.
So is there any code snippet we could write that detects just the most recently moved file, based only on the move itself and not on any last modified date?
Ideas...?
EDIT:
Operating System using: Windows
If you want to keep track of what is going on within one or more specific directories (folders) then use Java's Watch Service API which is part of the java.nio.file package.
This API can detect when a file is moved, copied, or created within the specified directory. It can also detect when a file in that directory is edited, modified, or deleted.
Read the information within the supplied Oracle link above and try the demo application which can be downloaded (copied) from within the Try It Out section. I think it's just what you may be looking for.
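For orientation, here is a minimal sketch of the Watch Service approach, assuming Java 8 or later. `ArrivalWatcher` and `awaitArrivals` are names invented here. It registers only for `ENTRY_CREATE` and stamps each event with the time it was noticed. As far as I know, a file moved into the directory is reported as a create event on the common platforms, but the exact behaviour is implementation-dependent.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper: reports files that appear in one directory, with the time we noticed them. */
final class ArrivalWatcher implements AutoCloseable {
    private final Path dir;
    private final WatchService watcher;

    ArrivalWatcher(Path dir) throws IOException {
        this.dir = dir;
        this.watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
    }

    /** Waits up to the timeout for create events; returns an empty map if nothing happened. */
    Map<Path, Instant> awaitArrivals(long timeout, TimeUnit unit) throws InterruptedException {
        Map<Path, Instant> arrivals = new LinkedHashMap<>();
        WatchKey key = watcher.poll(timeout, unit);
        if (key == null) {
            return arrivals;
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost; a full rescan would be needed here
            }
            Path name = (Path) event.context(); // file name relative to the watched directory
            arrivals.put(dir.resolve(name), Instant.now());
        }
        key.reset(); // re-arm the key so later events are delivered
        return arrivals;
    }

    @Override
    public void close() throws IOException {
        watcher.close();
    }
}
```

In a real application `awaitArrivals` would run in a loop on its own thread. The recorded instant is when the event was processed, not a timestamp stored by the file system, so it is lost unless you persist it yourself.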
Currently I am writing an application in java that looks through the files in a directory (call it 'topics'). Within this directory are some number of folders named after their respective topics, maybe 'dog', 'cat', etc.
I am currently using a ScheduledExecutorService to look through this directory every 30 seconds, going through each topic folder and performing some operation on the contents within the folder (we'll say some other independent piece of code is writing something to these topic folders, maybe a log file or something).
What would be the best way to only examine the new subdirectories each 30 seconds? If I start with just the topic dog, and somewhere between those 30 seconds the topics 'cat' and 'bird' are added, what would be the best way for me to only look through those new folders? I was thinking about comparing it to a HashSet or something, but I'm not sure what the most efficient way it would be to do this.
I ask because there could potentially be a great amount of subdirectories being created, and it seems to me problematic to try to loop through each one with something like directory.listFiles(). Any advice?
Use Java's WatchService API.
The WatchService API is fairly low level, allowing you to customize it. You can use it as is, or you can choose to create a high-level API on top of this mechanism so that it is suited to your particular needs. - Java Docs
When WatchService detects a file or directory change, it notifies you and gives you the path of the file or directory that changed. By using this API you don't have to worry about comparing two lists to see what changed, because the API provides the path of the changed file.
Take a look at this Java Tutorial "Watching a Directory for Changes" by Oracle on using the WatchService api.
I am writing a program that parses XML files that hold tourist attractions for cities. Each city has its own XML file, and the nodes have info like cost, address, etc. I want to have a thread on a timer to check for new XML files or more recent versions of existing ones in a specific directory. Creating the thread is not the problem; I just have no idea what the best way to check for these new or changed files is. Does anyone have any suggestions for an easy way to do that? I was thinking of creating a CSV file with the name and date-altered info for each file processed and then checking against this CSV file when I go to check for new or altered XML, but that seems overly complicated and I would like a better solution. I have no code to offer at this point for this mechanism; I am just looking for a direction to go in.
The idea is that as I get XML files for different cities fitting the schema, the program will update my DB automatically the next time it runs, or periodically if it is already running.
To avoid polling, you should watch the directory containing the XML files. Oracle has extensive documentation on the topic at Watching a Directory for Changes.
What you are describing looks like asynchronous feeding of new info. One common pitfall with such problems is the race condition: what happens if you try to read a file while it is being modified, or if something else tries to write a file while you are reading it? What happens if your app (or the app that edits your XML files) breaks in the middle of processing? To avoid such problems you should move files (change their name or directory) to reflect their status, because moves are atomic operations on normal file systems. If you want a bulletproof solution, you should have:
files being edited or transferred by an external party
files fully edited or transferred and ready to be read by your app
files being processed
files completely processed
files containing errors (tried to process them but could not complete processing)
The first two are the external party's responsibility (you just define an interface contract); the rest are yours. The cost is 4 or 5 directories (if you choose that solution); the gains are:
if there is any problem while editing or transferring an XML file, the external app just has to restart its operation
if a file can't be processed (syntax error, oversized, ...) it is set aside for further analysis but does not prevent processing of other files
you only have to watch almost empty directories
if your app breaks in the middle of processing a file, at next start it can restart its processing.
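The part of this scheme that is under your own responsibility might look roughly like the following sketch, assuming Java 8 or later. The class `StagedProcessor`, the `Handler` interface, and the directory names `processing`, `done`, and `failed` are all invented for illustration. `StandardCopyOption.ATOMIC_MOVE` makes the move fail with an exception rather than silently fall back to copy-and-delete, which means all directories have to live on the same file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: a file's directory reflects its processing status. */
final class StagedProcessor {
    /** Whatever the application does with a claimed file, e.g. parse it and update the DB. */
    interface Handler {
        void handle(Path file) throws Exception;
    }

    private final Path processing;
    private final Path done;
    private final Path failed;

    StagedProcessor(Path base) throws IOException {
        processing = Files.createDirectories(base.resolve("processing"));
        done = Files.createDirectories(base.resolve("done"));
        failed = Files.createDirectories(base.resolve("failed"));
    }

    /** Claims a ready file by moving it, runs the handler, then files it under done or failed. */
    Path process(Path readyFile, Handler handler) throws IOException {
        Path claimed = move(readyFile, processing);
        try {
            handler.handle(claimed);
        } catch (Exception e) {
            // set the file aside for later analysis; other files are unaffected
            return move(claimed, failed);
        }
        return move(claimed, done);
    }

    private static Path move(Path file, Path targetDir) throws IOException {
        return Files.move(file, targetDir.resolve(file.getFileName()),
                StandardCopyOption.ATOMIC_MOVE);
    }
}
```

After a crash, anything still sitting in `processing` is exactly the work that was interrupted, so a restart can move those files back to the ready directory or reprocess them directly.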
I am stuck with a problem when reading files from an FTP server. It appears that I get empty files, but I know (kind of for sure :-) ) that no empty files are uploaded. My strong feeling is that I start downloading when the file has not yet been completely uploaded.
Unfortunately I do not have the possibility to change the way files are uploaded. So I need to find a workaround from my side.
My idea was to check the mdate (last modification date) of the file. When it is more than 30s in the past, it would be safe to start downloading the file. During my tests I uploaded a file and checked the mdate. Unfortunately it was 13s in the future.
Now finally my question
Is there a way to get the current system time of the FTP server, so I could calculate an offset? In the SFTP framework I am using (com.jcraft.jsch) there are functions like "getExtension()", but I did not find any useful information on that method.
Cheers,
Christian
Before getting files from the remote server, do this. (1) Put an empty file in the remote location. (2) Get the last modified time of the file put in step-1. This roughly gives the system time. (3) List the actual files you want to get, get their last modified time, compare with the system time in step-2. (4) Delete the empty file created in step-1, if you do not like it being there.
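The arithmetic behind steps (2) and (3) can be sketched as follows, assuming Java 8 or later. `RemoteClock` and `isSettled` are hypothetical names, and the sketch deliberately leaves out the SFTP calls themselves. With JSch the modification times would presumably come from `SftpATTRS.getMTime()`, which reports whole seconds, so the estimate is only accurate to about a second plus network latency.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: estimates the server clock from the mtime of a freshly uploaded marker file. */
final class RemoteClock {
    private final Duration offset; // server clock minus local clock

    /**
     * @param markerMtime mtime the server reports for the marker file that was just uploaded
     * @param localNow    local time at which the marker was uploaded
     */
    RemoteClock(Instant markerMtime, Instant localNow) {
        this.offset = Duration.between(localNow, markerMtime);
    }

    Duration offset() {
        return offset;
    }

    /** True if the remote file has been untouched for at least quietPeriod, measured on the server's clock. */
    boolean isSettled(Instant remoteMtime, Instant localNow, Duration quietPeriod) {
        Instant remoteNow = localNow.plus(offset);
        return !remoteMtime.plus(quietPeriod).isAfter(remoteNow);
    }
}
```

With the numbers from the question, a marker whose mtime is 13s ahead of the local clock gives an offset of 13s, and a file with that same mtime counts as settled for a 30s quiet period once 30 more seconds have passed locally.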