Java copy directory slow - java

I have several folders larger than 2.5GB on my C drive, which is an SSD. From Java, I'm moving these folders to a shared drive, which is also an SSD, using FileUtils.copyDirectoryToDirectory(sourceDir, destiDir);
It works, but it is slow: about 30 mins, compared with about 5 mins for the default Windows move. I searched for a faster way to move directories from my Java program but had no luck. Can someone suggest the best way to move these directories?

OK, this is what I did.
I ran a robocopy command from Java to copy directories between the two locations. Tested with a ~9GB file, the copy took ~9 mins. Below is the code snippet:
String sourceFolder = new File("C:\\test\\robocopytest\\source\\20170925T213857460").toString();
String destFolder = new File("C:\\test\\robocopytest\\destination\\20170925T213857460").toString();
// Pass each argument separately so paths containing spaces need no manual quoting.
// /E also copies subdirectories (including empty ones); without it robocopy copies only top-level files.
ProcessBuilder pb = new ProcessBuilder("robocopy", sourceFolder, destFolder, "/E");
// Let robocopy write to this process's console so its output cannot fill a pipe and block
pb.inheritIO();
Process p = pb.start();
int exitCode = p.waitFor();
// robocopy exit codes below 8 indicate success; 8 or higher means at least one failure
if (exitCode >= 8) {
    throw new IOException("robocopy failed with exit code " + exitCode);
}
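For comparison, a pure-Java copy that avoids the external process can be sketched with the NIO file API. This is only a minimal sketch: the class name TreeCopy and the method copyTree are made up for illustration, and whether it comes close to robocopy on a shared drive would have to be measured.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of the original answer
class TreeCopy {
    /** Recursively copies sourceDir into targetDir, overwriting files that already exist. */
    static void copyTree(final Path sourceDir, final Path targetDir) throws IOException {
        Files.walkFileTree(sourceDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // Recreate each directory under the target before its files are copied
                Files.createDirectories(targetDir.resolve(sourceDir.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // Copy the file to the same relative location under the target
                Files.copy(file, targetDir.resolve(sourceDir.relativize(file)),
                        StandardCopyOption.REPLACE_EXISTING);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Calling TreeCopy.copyTree(Paths.get(sourceFolder), Paths.get(destFolder)) would copy the whole tree; the source is left in place, so a move would still need a delete pass afterwards.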

Related

Java code works on my laptop, but not on my desktop?

I use IntelliJ. I have a class that manages student grades. To edit a file, it writes a new .temp file, deletes the old file, and renames the .temp file to the old name. On my laptop (Mac) this works fine, but on my desktop (Windows) everything else works, yet the old file is not deleted and the .temp file is not renamed.
Below is my method to edit the files:
private static void editStuGrade() throws Exception {
System.out.println("Enter a course grade you want to change (q1,q2,q3,mid,final): ");
String editInput = command.next();
System.out.println("Enter a new score: ");
String newGrade = command.next();
Path p = Paths.get(System.getProperty("user.dir"),"src","Assignment1", student.name + ".txt");
File inputFile = new File(p.toString());
FileInputStream inputStream = new FileInputStream(inputFile);
InputStreamReader inputStreamReader = new InputStreamReader(inputStream);
BufferedReader reader = new BufferedReader(inputStreamReader);
File outputFile = new File(p + ".temp");
FileOutputStream outputStream = new FileOutputStream(outputFile);
OutputStreamWriter outputStreamWriter = new OutputStreamWriter(outputStream);
BufferedWriter writer = new BufferedWriter(outputStreamWriter);
if (editInput.equalsIgnoreCase("q1")) {
writer.write(student.name + "," + student.id + "," + newGrade + "," + student.quiz2
+ "," + student.quiz3 + "," + student.midterm + "," + student.finalTest);
writer.close();
} else if (editInput.equalsIgnoreCase("q2")) {
writer.write(student.name + "," + student.id + "," + student.quiz1 + "," + newGrade
+ "," + student.quiz3 + "," + student.midterm + "," + student.finalTest);
writer.close();
} else if (editInput.equalsIgnoreCase("q3")) {
writer.write(student.name + "," + student.id + "," + student.quiz1 + "," + student.quiz2
+ "," + newGrade + "," + student.midterm + "," + student.finalTest);
writer.close();
} else if (editInput.equalsIgnoreCase("mid")) {
writer.write(student.name + "," + student.id + "," + student.quiz1 + "," + student.quiz2
+ "," + student.quiz3 + "," + newGrade + "," + student.finalTest);
writer.close();
} else if (editInput.equalsIgnoreCase("final")) {
writer.write(student.name + "," + student.id + "," + student.quiz1 + "," + student.quiz2
+ "," + student.quiz3 + "," + student.midterm + "," + newGrade);
writer.close();
}
inputFile.delete();
outputFile.renameTo(new File(p.toString()));
System.out.println("Successful.");
}
Windows is a special snowflake. Unlike other OSes, Windows (or rather, the Windows file system, unlike other file systems) does not let you delete open files, and does not let you rename or delete directories if they contain any open files. Other OSes don't mind at all: files on disk are merely 'pointers', and any process that opens a file also gets a pointer. The file isn't truly removed from disk until all pointers are gone, so you can just delete files. If a file is still open, no problem: as long as it is open, the file doesn't disappear. It's very similar to Java garbage collection in that way.
But not so on Windows.
Your code has a bug in it: you aren't managing your resources. This leaves the files open, and then you try to delete them. That works on non-Windows file systems but isn't allowed on Windows, where you can't delete files even if you're the very process that still has them open.
Resources MUST be closed, and the responsibility to do this lies with you. Code can exit in many ways (not just by running to the end of a method, but also with return, by throwing an exception, by using break or continue for flow control, etc.), so trying to write code by hand that ensures your resource is closed on every possible exit path is annoying and error-prone. Don't. Use Java's language features:
Do not EVER open a resource unless you do so in a try-with-resources block.
Looks like this:
try (var outputStream = new FileOutputStream(outputFile)) {
// outputStream exists here and can be interacted with as normal
}
No matter how code flow 'exits' that try block, the resource is closed automatically once it does. This is good, not just because it lets you delete those files, but also because every process may open only a limited number of files, so if you fail to close them, any non-trivial app will soon hard-crash from having too many open files.
What are resources? The javadoc will tell you; also use common sense. Most InputStreams and OutputStreams are, and any type that implements AutoCloseable tends to be. If you new X() them up, you definitely have to close them. If you're invoking a method that sounds like it 'makes' the resource (example: socket.getInputStream or Files.newInputStream), you have to close them.
Use try () {} to do this.
Once you do so, you can delete these files just fine, even on Windows.
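Applied to the write-a-temp-file-then-rename pattern from the question, the advice could look roughly like the sketch below. The class GradeFileEditor and the method replaceContents are hypothetical names chosen for illustration, and the sketch assumes the whole file is replaced by a single line, as in the question's code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not the asker's actual class
class GradeFileEditor {
    /** Replaces the contents of target with newLine by writing a temp file and moving it into place. */
    static void replaceContents(Path target, String newLine) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + ".temp");
        // The writer is closed when the try block exits, whether normally or via an exception
        try (BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            writer.write(newLine);
        }
        // No handle opened by this method is still open here, so the move is not blocked on Windows
        Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Unlike File.delete() and File.renameTo(), which only return false on failure, Files.move throws an IOException that says what went wrong.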

The fastest ways to check if a file exists in java

Currently I am tasked with making a tool in Java that checks whether a link is correct. The links come from the Jericho HTML Parser, and my job is only to check whether the file exists, i.e. whether the link is correct. That part is done; the hard part is optimizing it, since my code runs (I have to say) rather sluggishly, at 65 ms per run.
public static String checkRelativeURL(String originalFileLoc, String relativeLoc){
StringBuilder sb = new StringBuilder();
String absolute = Common.relativeToAbsolute(originalFileLoc, relativeLoc); //built in function to replace the link from relative link to absolute path
sb.append(absolute);
sb.append("\t");
try {
Path path = Paths.get(absolute);
sb.append(Files.exists(path));
}catch (InvalidPathException | NullPointerException ex) {
sb.append(false);
}
sb.append("\t");
return sb.toString();
}
and these lines take 65 ms:
Path path = Paths.get(absolute);
sb.append(Files.exists(path));
I have tried using
File file = new File(absolute);
sb.append(file.isFile());
It still runs at around 65~100 ms.
So is there a faster way to check whether a file exists?
I am processing more than 70k HTML files and every millisecond counts, thanks :(
EDIT:
I tried listing all the files into a List, but it doesn't really help, since it takes more than 20 mins just to list all the files.
The code that I use to list all the files:
static public void listFiles2(String filepath){
Path path = Paths.get(filepath);
File file = null;
String pathString = new String();
try {
if(path.toFile().isDirectory()){
DirectoryStream<Path> stream = Files.newDirectoryStream(path);
for(Path entry : stream){
file = entry.toFile();
pathString = entry.toString();
if(file.isDirectory()){
listFiles2(pathString);
}
if (file.isFile()){
filesInProject.add(pathString);
System.out.println(pathString);
}
}
stream.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
If you know the set of target OSes in advance (which is usually the case), ultimately the fastest way to list this many files is through a shell, by invoking a process, e.g. using Runtime.exec.
On Windows you can use
dir /s /b
On Linux
ls -R -1
You can check which OS you are on and use the appropriate command (raise an error or fall back to a directory stream if it is not supported).
If you want simplicity and don't need to report progress, you can avoid dealing with the process I/O and store the list in a temporary file, e.g. ls -R -1 > /tmp/filelist.txt. Alternatively, you can read from the process output directly. Read with a buffered stream, a reader or the like, with a large enough buffer.
On an SSD it will complete in the blink of an eye, and on a modern HDD in seconds (half a million files is not a problem with this approach).
Once you have the list, you can approach it differently depending on the maximum file count and memory requirements. If requirements are loose, e.g. a desktop program, very simple code will do, e.g. pre-loading the complete file list into a HashSet and checking existence when needed. Shortening paths by removing the common root will require much less memory. You can also reduce memory by keeping only a hash of each file name instead of the full name (common root removal will probably save more).
Or you can optimize it further if you wish; the question now reduces to checking for the existence of a string in a list of strings stored in memory or in a file, which has many well-known optimal solutions.
Below is a very loose, simplistic sample for Windows. It executes dir on the root of an HDD (not SSD) drive with ~400K files, reads the list, and benchmarks (well, kind of) time and memory for the string set and md5 set approaches:
public static void main(String args[]) throws Exception {
final Runtime rt = Runtime.getRuntime();
System.out.println("mem " + (rt.totalMemory() - rt.freeMemory())
/ (1024 * 1024) + " Mb");
long time = System.currentTimeMillis();
// windows command: cd to t:\ and run recursive dir
Process p = rt.exec("cmd /c \"t: & dir /s /b > filelist.txt\"");
if (p.waitFor() != 0)
throw new Exception("command has failed");
System.out.println("done executing shell, took "
+ (System.currentTimeMillis() - time) + "ms");
System.out.println();
File f = new File("T:/filelist.txt");
// load into hash set
time = System.currentTimeMillis();
Set<String> fileNames = new HashSet<String>(500000);
try (BufferedReader reader = new BufferedReader(new InputStreamReader(
new FileInputStream(f), StandardCharsets.UTF_8),
50 * 1024 * 1024)) {
for (String line = reader.readLine(); line != null; line = reader
.readLine()) {
fileNames.add(line);
}
}
System.out.println(fileNames.size() + " file names loaded took "
+ (System.currentTimeMillis() - time) + "ms");
System.gc();
System.out.println("mem " + (rt.totalMemory() - rt.freeMemory())
/ (1024 * 1024) + " Mb");
time = System.currentTimeMillis();
// check files
for (int i = 0; i < 70_000; i++) {
StringBuilder fileToCheck = new StringBuilder();
while (fileToCheck.length() < 256)
fileToCheck.append(Double.toString(Math.random()));
if (fileNames.contains(fileToCheck.toString()))
System.out.println("to prevent optimization, never executes");
}
System.out.println();
System.out.println("hash set 70K checks took "
+ (System.currentTimeMillis() - time) + "ms");
System.gc();
System.out.println("mem " + (rt.totalMemory() - rt.freeMemory())
/ (1024 * 1024) + " Mb");
// Test memory/performance with MD5 hash set approach instead of full
// names
time = System.currentTimeMillis();
Set<String> nameHashes = new HashSet<String>(50000);
MessageDigest md5 = MessageDigest.getInstance("MD5");
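for (String name : fileNames) {
// Caution: decoding raw digest bytes as UTF-8 is lossy (invalid byte sequences become U+FFFD),
// so different digests can map to the same string; good enough for a rough benchmark only
String nameMd5 = new String(md5.digest(name
.getBytes(StandardCharsets.UTF_8)), StandardCharsets.UTF_8);
nameHashes.add(nameMd5);
}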
System.out.println();
System.out.println(fileNames.size() + " md5 hashes created, took "
+ (System.currentTimeMillis() - time) + "ms");
fileNames.clear();
fileNames = null;
System.gc();
Thread.sleep(100);
System.gc();
System.out.println("mem " + (rt.totalMemory() - rt.freeMemory())
/ (1024 * 1024) + " Mb");
time = System.currentTimeMillis();
// check files
for (int i = 0; i < 70_000; i++) {
StringBuilder fileToCheck = new StringBuilder();
while (fileToCheck.length() < 256)
fileToCheck.append(Double.toString(Math.random()));
String md5ToCheck = new String(md5.digest(fileToCheck.toString()
.getBytes(StandardCharsets.UTF_8)), StandardCharsets.UTF_8);
if (nameHashes.contains(md5ToCheck))
System.out.println("to prevent optimization, never executes");
}
System.out.println("md5 hash set 70K checks took "
+ (System.currentTimeMillis() - time) + "ms");
System.gc();
System.out.println("mem " + (rt.totalMemory() - rt.freeMemory())
/ (1024 * 1024) + " Mb");
}
Output:
mem 3 Mb
done executing shell, took 5686ms
403108 file names loaded took 382ms
mem 117 Mb
hash set 70K checks took 283ms
mem 117 Mb
403108 md5 hashes created, took 486ms
mem 52 Mb
md5 hash set 70K checks took 366ms
mem 48 Mb
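The common-root removal mentioned above is not part of the benchmark, but it could be sketched along these lines. The class FileIndex and its methods are hypothetical names; the sketch assumes every listed path starts with the same root string and that lookups are case-sensitive, which does not match Windows file-name semantics, so paths may need normalizing first.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper illustrating common-root removal
class FileIndex {
    private final String root;
    private final Set<String> relativePaths;

    /** Builds the index from absolute paths, storing only the part after the common root. */
    FileIndex(String root, Collection<String> absolutePaths) {
        this.root = root;
        // Oversize the set up front to limit rehashing while loading
        this.relativePaths = new HashSet<>(Math.max(16, absolutePaths.size() * 2));
        for (String path : absolutePaths) {
            if (path.startsWith(root)) {
                relativePaths.add(path.substring(root.length()));
            }
        }
    }

    /** Returns true if the given absolute path was in the listing. */
    boolean exists(String absolutePath) {
        return absolutePath.startsWith(root)
                && relativePaths.contains(absolutePath.substring(root.length()));
    }
}
```

The lines read from filelist.txt would be passed in as absolutePaths, and each link check then becomes a single in-memory lookup via exists.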

Issues with running Runtime.getRuntime().exec

I'm using process = Runtime.getRuntime().exec(cmd,null,new File(path));
to execute some SQL in a file (abz.sql).
The command is:
"sqlplus "+ context.getDatabaseUser() + "/"
+ context.getDatabasePassword() + "@"
+ context.getDatabaseHost() + ":"
+ context.getDatabasePort() + "/"
+ context.getSid() + " @"
+ "\""
+ script + "\"";
String path=context.getReleasePath()+ "/Server/DB Scripts";
It executes the file but never exits. Hence I tried using:
Writer out = new OutputStreamWriter(process.getOutputStream());
out.append("commit;\r\n");
out.append("exit \r\n");
System.out.println("---------"+out);
out.close();
This is the complete block that I'm using:
if ("ORACLE".equals(context.getConnectionField()))
{
String cmd=
"sqlplus "+ context.getDatabaseUser() + "/"
+ context.getDatabasePassword() + "@"
+ context.getDatabaseHost() + ":"
+ context.getDatabasePort() + "/"
+ context.getSid() + " @"
+ "\""
+ script +"\"";
String path=context.getReleasePath()+ "/Server/DB Scripts";
process = Runtime.getRuntime().exec(cmd,null,new File(path));
out = new OutputStreamWriter(process.getOutputStream());
out.append("commit;\r\n");
out.append("exit \r\n");
System.out.println("---------"+out);
out.close();
Integer result1 = null;
while (result1 == null) {
try {
result1 = process.waitFor();
}
catch (InterruptedException e) {}
}
if(process.exitValue() != 0)
return false;
return true;
}
The code shown fails to read the error stream of the Process. That might be blocking progress. ProcessBuilder was introduced in Java 1.5 and has a handy method to redirectErrorStream() - so that it is only necessary to consume a single stream.
For more general tips, read & implement all the recommendations of When Runtime.exec() won't.
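A minimal sketch of the ProcessBuilder approach follows. ProcessRunner, Result and run are hypothetical names, and the command is passed as a list of separate arguments, so it sidesteps command-string tokenizing as well. It assumes the child needs no input, which is why its stdin is closed immediately.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;

// Hypothetical helper, not from the original answers
class ProcessRunner {
    /** Exit code plus the combined stdout/stderr text of a finished process. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs the command in workingDir (null means the current directory) and captures its output. */
    static Result run(List<String> command, File workingDir)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workingDir);
        // Merge stderr into stdout so only one stream has to be consumed
        builder.redirectErrorStream(true);
        Process process = builder.start();
        // Close the child's stdin so it sees end-of-file instead of waiting for input
        process.getOutputStream().close();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            // Drain the output fully; a full pipe buffer would otherwise block the child
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        return new Result(process.waitFor(), output.toString());
    }
}
```

For the sqlplus case, the command list would hold "sqlplus", the connect string and the @script argument as three separate elements, so a password containing spaces is not split.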
I can see a few issues here. The version of 'exec' that you are using will tokenize the command string using StringTokenizer, so unusual characters in the password (like spaces) or the other parameters being substituted are accidents waiting to happen. I recommend switching to the version
Process exec(String[] cmdarray,
String[] envp,
File dir)
throws IOException
It is a bit more work to use but much more robust.
The second issue is that there are all kinds of caveats about whether or not exec will run concurrently with the Java process (see http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Process.html). So you need to say which operating system you're on. If it does not run concurrently, then your strategy of writing to the output stream cannot work!
The last bit of the program is written rather obscurely. I suggest ...
for (;;) {
    try {
        // waitFor() returns the exit value once the process has terminated
        return process.waitFor() == 0;
    } catch (InterruptedException e) {
        System.out.println("INTERRUPTED!"); // Debug only.
    }
}
This eliminates the superfluous variable result1, eliminates the superfluous boxing and highlights a possible cause of endless looping.
Hope this helps & good luck!

Java Web Start deploy on Windows startup

I have a Java application that I'm about to begin to use Web Start to deploy. But a new demand has made me rethink this, as I'm now required to add a piece of functionality that allows the end user to select whether or not they'd like to run this program on startup (of Windows, not cross-platform). But I'd still like to shy away from making this run as a service. Is there any way that this can be accomplished using Web Start, or should I explore other options to deploy this? Thanks in advance.
It actually works to put this in the JNLP file:
<shortcut online="true">
<desktop/>
<menu submenu="Startup"/>
</shortcut>
But that would still only work with English Windows versions. In German it is "Autostart"; in Spanish it was "Iniciar", I think. So it causes basically the same headache as going via the IntegrationService.
I have not tried it, but I wonder if you could use the new JNLP IntegrationService in combination with the javaws command line program. The idea being to programmatically create a shortcut in the Windows startup group (although that location is dependent on specific Windows version).
To get around the language problem for the Startup folder just use the registry. Here is some code that should work. This calls reg.exe to make registry changes.
import java.io.File;
import java.io.InputStream;
import java.io.StringWriter;
public class StartupCreator {
public static void setupStartupOnWindows(String jnlpUrl, String applicationName) throws Exception {
String foundJavaWsPath = findJavaWsOnWindows();
String cmd = foundJavaWsPath + " -Xnosplash \"" + jnlpUrl + "\"";
setRegKey("HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run", applicationName, cmd);
}
public static String findJavaWsOnWindows() {
// The paths where it will look for java
String[] paths = {
// first use the JRE that was used to launch this app, it will probably not reach the below paths
System.getProperty("java.home") + File.separator + "bin" + File.separator + "javaws.exe",
// it must check for the 64 bit path first because inside a 32-bit process system32 is actually syswow64
// 64 bit machine with 32 bit JRE
System.getenv("SYSTEMROOT") + File.separator + "syswow64" + File.separator + "javaws.exe",
// 32 bit machine with 32 bit JRE or 64 bit machine with 64 bit JRE
System.getenv("SYSTEMROOT") + File.separator + "system32" + File.separator + "javaws.exe",};
return findJavaWsInPaths(paths);
}
public static String findJavaWsInPaths(String[] paths) throws RuntimeException {
String foundJavaWsPath = null;
for (String p : paths) {
File f = new File(p);
if (f.exists()) {
foundJavaWsPath = p;
break;
}
}
if (foundJavaWsPath == null) {
throw new RuntimeException("Could not find path for javaws executable");
}
return foundJavaWsPath;
}
public static String setRegKey(String location, String regKey, String regValue) throws Exception {
String regCommand = "add \"" + location + "\" /v \"" + regKey + "\" /f /d \"" + regValue + "\"";
return doReg(regCommand);
}
public static String doReg(String regCommand) throws Exception {
final String REG_UTIL = "reg";
final String regUtilCmd = REG_UTIL + " " + regCommand;
return runProcess(regUtilCmd);
}
public static String runProcess(final String regUtilCmd) throws Exception {
    StringWriter sw = new StringWriter();
    Process process = Runtime.getRuntime().exec(regUtilCmd);
    // Read everything the command prints to stdout before waiting for it to finish
    try (InputStream is = process.getInputStream()) {
        int c;
        while ((c = is.read()) != -1) {
            sw.write(c);
        }
    }
    // reg.exe returns 0 on success and 1 on failure
    int exitCode = process.waitFor();
    if (exitCode != 0) {
        throw new Exception("reg command failed with exit code " + exitCode);
    }
    return sw.toString();
}
}

Java code for FTP upload doesn't work properly if launched as windows service

I'm using Java code to upload some files over FTP. When I compile and run the code, everything works perfectly, but if I launch it as a Windows service using Java Service Launcher, it doesn't connect to the FTP server at all (it just does the rest of the job, that is, moving the files to an archive folder).
By the way, is there a better way to test a child process's output than writing its output to a file and then parsing the file's content? Here's the code:
Runtime runtime = Runtime.getRuntime();
String[] cmd = {"c:\\ftp\\putskripta.bat"};
Process p1 = runtime.exec(cmd);
p1.waitFor();
File izlaz = new File("C:\\ftp\\izlaz.txt");
int arrlen = 10000;
byte[] infile = new byte[arrlen];
int filelength;
// Close the stream before izlaz.delete() is called; Windows cannot delete an open file
try (FileInputStream fis = new FileInputStream(izlaz)) {
    filelength = fis.read(infile);
}
// Use only the bytes actually read (read returns -1 for an empty file)
String filestring = new String(infile, 0, Math.max(filelength, 0));
CharSequence[] sekvenca = {"Invalid command", "Not connected"};
if (!filestring.contains(sekvenca[0]) && !filestring.contains(sekvenca[1]))
{
File uploads = new File("C:\\ftp\\Uploads");
File[] uploadfiles = uploads.listFiles();
int godina = java.util.Calendar.getInstance().get(java.util.Calendar.YEAR);
int mjesec = Calendar.getInstance().get(Calendar.MONTH)+1;
int dan = Calendar.getInstance().get(Calendar.DAY_OF_MONTH);
for (int i = 0; i < uploadfiles.length; i++) {
if (uploadfiles[i].getName().startsWith("ARTST") || uploadfiles[i].getName().startsWith("BESTE") || uploadfiles[i].getName().startsWith("LAGER") || uploadfiles[i].getName().startsWith("AVISE") || uploadfiles[i].getName().startsWith("KUNDE") || uploadfiles[i].getName().startsWith("BORDE") || uploadfiles[i].getName().startsWith("ENTLA")) {
File destinacijaFoldera = new File("C:\\ftp\\MovedUploads\\" + godina + "\\" + mjesec + "\\" + dan);
File destinacijaFajla = new File("C:\\ftp\\MovedUploads\\" + godina + "\\" + mjesec + "\\" + dan + "\\" + uploadfiles[i].getName());
if (!destinacijaFoldera.isDirectory()) {
destinacijaFoldera.mkdirs();
}
File temp = new File(destinacijaFoldera, uploadfiles[i].getName() + "_" + String.valueOf(Calendar.getInstance().get(Calendar.HOUR_OF_DAY)) + String.valueOf(Calendar.getInstance().get(Calendar.MINUTE)) + String.valueOf(Calendar.getInstance().get(Calendar.SECOND)));
if (!destinacijaFajla.exists()) {
uploadfiles[i].renameTo(destinacijaFajla);
}
else{
uploadfiles[i].renameTo(temp);
}
uploadfiles[i].renameTo(new File(destinacijaFoldera, uploadfiles[i].getName()));
}
}
}
else {izlaz.delete(); throw new Exception("Neuspio pokusaj uploada");}
izlaz.delete();
Just in case, here is the code for "putskripta.bat"
@echo off
cd c:\ftp\Uploads
ftp -s:c:\ftp\putkomande.txt -i localhost > c:\ftp\izlaz.txt
By default, Windows Services run under the LocalSystem account, which does not have network access. Reconfigure your service to use an account that has access to your FTP site, such as NetworkService, or an account you create.
I don't have a better way of getting the results back from the child process, but I have had success using ftp without creating a child process. The Apache Commons Net package has methods to do the FTP from within the java program.
