Get File Extension for special cases like tar.gz [closed] - java

I need to extract extensions from file names.
I know this can be done for single extensions like .gz or .tar by using filePath.lastIndexOf('.') or using utility methods like FilenameUtils.getExtension(filePath) from Apache commons-io.
But, what if I have a file with an extension like .tar.gz? How can I manage files with extensions that contain . characters?

If you know what extensions are important, you can simply check for them explicitly. You would have a collection of known extensions, like this:
List<String> EXTS = Arrays.asList("tar.gz", "tgz", "gz", "zip");
You could get the (first) longest matching extension like this:
String getExtension(String fileName) {
    String found = null;
    for (String ext : EXTS) {
        if (fileName.endsWith("." + ext)) {
            if (found == null || found.length() < ext.length()) {
                found = ext;
            }
        }
    }
    return found;
}
So calling getExtension("file.tar.gz") would return "tar.gz".
If you have mixed-case names, try changing the check to fileName.toLowerCase().endsWith("." + ext) inside the loop (with the known extensions stored in lower case).
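A minimal sketch of that case-insensitive variant, assuming the same list of known extensions kept in lower case. The class name KnownExtensions is made up for illustration, and Locale.ROOT is my addition to avoid locale-dependent lower-casing:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class KnownExtensions {
    // Known extensions, stored in lower case (same list as in the answer).
    static final List<String> EXTS = Arrays.asList("tar.gz", "tgz", "gz", "zip");

    // Returns the longest known extension that fileName ends with,
    // ignoring case, or null if none matches.
    static String getExtension(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        String found = null;
        for (String ext : EXTS) {
            if (lower.endsWith("." + ext)
                    && (found == null || found.length() < ext.length())) {
                found = ext;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(getExtension("file.TAR.GZ")); // tar.gz
        System.out.println(getExtension("notes.txt"));   // null
    }
}
```

Note that the returned extension is the lower-case entry from the list, not the original spelling in the file name.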

A file can just have one extension!
If you have a file test.tar.gz,
.gz is the extension and
test.tar is the Basename!
.tar in this case is part of the basename, not part of the extension!
If you want a file that is both tarred and gzipped, you should call it .tgz. Using .tar.gz is bad practice; if you need to handle these files, work around it, for example by renaming the file to test.tgz.

Found a simple way: use substring to get only the file name, then use indexOf instead of lastIndexOf to find the first '.' and take everything after it as the extension.
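A rough sketch of the first-dot idea, under the assumption that both '/' and '\' should count as directory separators. The class name FirstDotExtension is hypothetical:

```java
class FirstDotExtension {
    // Returns everything after the first '.' in the file-name part of path,
    // or "" when the name contains no dot.
    static String getExtension(String path) {
        // Drop the directory part; accept both '/' and '\' as separators.
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(sep + 1);
        int dot = name.indexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(getExtension("/tmp/archive.tar.gz")); // tar.gz
        System.out.println(getExtension("report.v1.2.zip"));     // v1.2.zip
    }
}
```

The second call shows the weakness of this approach: any dot in the base name (version numbers, dates) ends up in the "extension".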

You can get the filename part of the path, split on . and take the final 0, 1, or 2 elements in the array as the extension.
Of course if .tar.* (gz, bz2, etc.) is your only edge case it may be pragmatic to just build a solution that filters filenames for .tar. and use that as the point at which to extract the extension (to include the .tar portion).
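The ".tar." special case could look roughly like this. It is only a sketch: the class name TarAwareExtension is invented here, and the rule that exactly one suffix may follow ".tar." is my assumption about what counts as a compound extension:

```java
class TarAwareExtension {
    // Returns "tar.<suffix>" for names ending in ".tar.<suffix>",
    // otherwise the text after the last '.', or "" if there is no dot.
    static String getExtension(String path) {
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(sep + 1);

        // Treat ".tar.<suffix>" as compound only when <suffix> has no further dot.
        int tar = name.lastIndexOf(".tar.");
        if (tar >= 0 && name.indexOf('.', tar + 5) < 0) {
            return name.substring(tar + 1);
        }

        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(getExtension("backup.tar.bz2")); // tar.bz2
        System.out.println(getExtension("photo.v2.jpg"));   // jpg
    }
}
```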


File wildcard use *

I am trying to read a file named K2Ssal.timestamp.
I want to treat the timestamp part of the file name as a wildcard.
How can I achieve this?
I tried putting * after the file name, but it does not work:
var getK2SSal: Iterator[String] = Source.fromFile("C://Users/nrakhad/Desktop/Work/Data stage migration/Input files/K2Ssal.*").getLines()
You can use Files.newDirectoryStream with directory + glob:
import java.nio.file.{Paths, Files}
val yourFile = Files.newDirectoryStream(
  Paths.get("/path/to/the/directory"), // where is the file?
  "K2Ssal.*" // glob of the file name
).iterator.next // get first match
Misconception on your end: unless the library call is specifically implemented to do so, using a wildcard simply doesn't work like you expect it to.
Meaning: a file system doesn't know about wildcards. It only knows about existing files and folders. The fact that you can put * on certain commands, and that the wildcard is replaced with file names is a property of the tool(s) you are using. And most often, programming APIs that allow you to query the file system do not include that special wild card handling.
In other words: there is no sense in adding that asterisk like that.
You have to step back and write code that actively searches for files itself. Here are some examples for scala.
You can read the directory and filter on files based upon the string.
import java.io.File
val l = new File("""C://Users/nrakhad/Desktop/Work/Data stage migration/Input files/""").listFiles
val s = l.filter(_.getName.startsWith("K2Ssal."))
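For comparison, here is the same glob lookup sketched in Java, since Files.newDirectoryStream is a plain JDK API. The class and method names (GlobMatch, findMatches) are invented for this example. Unlike the one-liner above, it closes the stream and returns every match rather than only the first:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class GlobMatch {
    // Returns all entries of dir whose file name matches the glob pattern.
    static List<Path> findMatches(Path dir, String glob) throws IOException {
        List<Path> matches = new ArrayList<>();
        // try-with-resources closes the directory stream when done.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                matches.add(p);
            }
        }
        return matches;
    }
}
```

A call such as GlobMatch.findMatches(Paths.get("/path/to/the/directory"), "K2Ssal.*") would then give the candidate files; the caller still has to decide what to do when there are zero or several matches.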

How to replace String with different slashes? [duplicate]

I need to rename some paths stored in a database.
I renamed a folder:
String mainFolder = "D:\\test\\1\\data"; // folder renamed from fd
Then I need to rename all files and directories inside that folder:
String file1 = "D:\\test\\1\\fd\\dr.jpg";
String folder1 = "D:\\test\\1\\fd\\fd"; // in this case the last fd needs to be renamed
String folder2 = "src/fd/fd/"; // fake path that also needs to be renamed
What is the best and fastest way to rename those strings?
My thoughts about "/":
String folder2= "src/da/da";
String[] splittedFakePath = folder2.split("/");
splittedFakePath[splittedFakePath.length - 2] = "data";
StringBuffer newFakePath = new StringBuffer();
for (String str : splittedFakePath) {
    newFakePath.append(str).append("/");
}
String after rename: src/data/da/
But when I try to split by "\":
Arrays.toString(Pattern.compile(File.separator).split(folder1));
I receive:
java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
^
Look into java's String replace(...) method.
It is wonderful for string replacement, much better than attempting a regex.
Keep in mind that real directory handling has a few special cases that don't lend themselves well to direct string manipulation. For example, '//' often gets compacted to '/' on Unix-like systems. If you care about handling such corner cases properly, use the Java Path class.
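The exception in the question comes from the fact that on Windows File.separator is a single backslash, which on its own is an incomplete regex escape. Pattern.quote turns it into a literal. The sketch below combines that with a segment-wise rename. The class and method names are hypothetical, and it assumes every segment equal to the old name should be renamed, which may be broader than what the question needs:

```java
import java.util.regex.Pattern;

class SegmentRename {
    // Replaces every path segment equal to oldName with newName, keeping
    // whichever separator ('\' or '/') the input path already uses.
    static String renameSegment(String path, String oldName, String newName) {
        String sep = path.indexOf('\\') >= 0 ? "\\" : "/";
        // Pattern.quote makes the separator a literal regex;
        // limit -1 keeps a trailing empty segment, so "a/b/" stays "a/b/".
        String[] parts = path.split(Pattern.quote(sep), -1);
        for (int i = 0; i < parts.length; i++) {
            if (parts[i].equals(oldName)) {
                parts[i] = newName;
            }
        }
        return String.join(sep, parts);
    }

    public static void main(String[] args) {
        System.out.println(renameSegment("D:\\test\\1\\fd\\dr.jpg", "fd", "data")); // D:\test\1\data\dr.jpg
        System.out.println(renameSegment("src/fd/fd/", "fd", "data"));              // src/data/data/
    }
}
```

If only one known prefix has to change, a plain path.replace(oldPrefix, newPrefix) as suggested above is simpler and needs no regex at all.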

How to retrieve particular part of string

I have a file path as a String and I want to retrieve a particular part of it. The catch is that, being a path, it can vary in length.
I want to retrieve the file name from the string:
"C:\projects\Compiler\Compiler\src\JUnit\ExampleTest.java"
"C:\projects\ExampleTest.java"
In both cases I want to retrieve just ExampleTest. The file name can also change, so I need something like "the text before the first . and after the last \". Is there a way to do this with a regex or something similar?
Why not use Apache Commons FilenameUtils rather than coding your own regular expressions? From the doc:
This class defines six components within a filename (example
C:\dev\project\file.txt):
the prefix - C:\
the path - dev\project\
the full path - C:\dev\project\
the name - file.txt
the base name - file
the extension - txt
You're a lot better off using this. It's geared directly towards filenames, dirs etc. and given that it's a commonly used, well-defined component, it'll have been tested extensively and edge cases ironed out etc.
new File(thePath).getName()
or
int pos = thePath.lastIndexOf("\\");
return pos >= 0? thePath.substring(pos+1): thePath;
File file = new File("C:\\projects\\ExampleTest.java");
System.out.println(file.getAbsoluteFile().getName());
Java code
String test = "C:\\projects\\Compiler\\Compiler\\src\\JUnit\\ExampleTest.java";
String arr[] = test.split("\\Q"+"\\");
System.out.println(arr[arr.length-1].split("\\.")[0]);
This is the regex in C#, and it works in Java too :P Thanks to Perl. The match is in Group[1].
^.*\\(.*?)\..*?$
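In Java source the backslashes in that pattern have to be doubled again. A small sketch of using it with java.util.regex follows; the class name BaseNameRegex is made up, and returning null for non-matching input is my choice:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BaseNameRegex {
    // The regex ^.*\\(.*?)\..*?$ written as a Java string literal.
    private static final Pattern BASE_NAME =
            Pattern.compile("^.*\\\\(.*?)\\..*?$");

    // Returns the text between the last backslash and the first dot after it,
    // or null if the path contains no backslash followed by a dot.
    static String baseName(String path) {
        Matcher m = BASE_NAME.matcher(path);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(baseName("C:\\projects\\Compiler\\Compiler\\src\\JUnit\\ExampleTest.java")); // ExampleTest
        System.out.println(baseName("C:\\projects\\ExampleTest.java"));                                 // ExampleTest
    }
}
```

The pattern only understands backslash separators, so a path written with forward slashes will not match.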

Java - How to get the name of a file from the absolute path and remove its file extension?

I have a String that contains the value C:\Users\Ewen\AppData\Roaming\MyProgram\Test.txt, and I want to remove the C:\Users\Ewen\AppData\Roaming\MyProgram\ part so that only Test is left. So the question is: how can I remove part of the string?
Thanks for your time! :)
If you're working strictly with file paths, try this
String path = "C:\\Users\\Ewen\\AppData\\Roaming\\MyProgram\\Test.txt";
File f = new File(path);
System.out.println(f.getName()); // Prints "Test.txt"
Thanks but I also want to remove the .txt
OK then, try this
String fName = f.getName();
System.out.println(fName.substring(0, fName.lastIndexOf('.')));
Please see this for more information.
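One caveat with the substring call above: lastIndexOf('.') returns -1 for a name without a dot, and substring(0, -1) then throws StringIndexOutOfBoundsException. A guarded sketch follows. The class name BaseName is hypothetical, and treating a leading dot (as in .bashrc) as part of the name is my assumption:

```java
import java.io.File;

class BaseName {
    // Returns the file name without its last extension.
    // Names with no dot, or only a leading dot, are returned unchanged.
    static String baseName(String path) {
        // File.getName() splits on the platform's separator, so a
        // backslash-separated path is only split when running on Windows.
        String name = new File(path).getName();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    public static void main(String[] args) {
        System.out.println(baseName("docs/Test.txt")); // Test
        System.out.println(baseName("docs/README"));   // README
    }
}
```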
The String class has all the necessary power to deal with this. Methods you may be interested in:
String.split(), String.substring(), String.lastIndexOf()
Those 3, and more, are described here: http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/String.html
Give it some thought, and you'll have it working in no time :).
I recommend using FilenameUtils.getBaseName(String filename). The FilenameUtils class is a part of Apache Commons IO.
According to the documentation, the method "will handle a file in either Unix or Windows format". "The text after the last forward or backslash and before the last dot is returned" as a String object.
String filename = "C:\\Users\\Ewen\\AppData\\Roaming\\MyProgram\\Test.txt";
String baseName = FilenameUtils.getBaseName(filename);
System.out.println(baseName);
The above code prints Test.

Working with files with anchors, which options are there for Java?

I am working with files (reading, writing and copying) in my Java application; java.io.File and commons-io have been perfect for these kinds of tasks.
Right now, I can link to HTML in this way:
Z:\an absolute\path\to\a\file.html
But, I need to provide support for anchors too:
Z:\an absolute\path\to\a\file.html#anchor
while keeping the system independence obtained by using java.io.File. So I will need to extract the path and the anchor. I wonder whether it will be as easy as searching for the first '#' character.
java.io.File includes a constructor that accepts a URI, which can represent all kinds of resources, including URLs and local files (see the RFC). URIs also meet your requirements of supporting anchors and extracting path information (through instance.getPath()).
File f = new File("Z:\\path\\to\\a\\file.html#anchor");
String anchor = f.toURL().getRef(); //note: toURL is deprecated
If you look at the java source you will see that it is as simple as:
String file = "Z:\\path\\to\\a\\file.html#anchor";
int ind = file.indexOf('#');
String anchor = ind < 0 ? null: file.substring(ind + 1);
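Building on the indexOf approach, a small holder that returns both parts might look like this. The class PathWithAnchor is invented for illustration, and it assumes the first '#' always starts the anchor, which breaks for file names that themselves contain '#':

```java
class PathWithAnchor {
    final String path;   // part before the first '#'
    final String anchor; // part after the first '#', or null if there is none

    private PathWithAnchor(String path, String anchor) {
        this.path = path;
        this.anchor = anchor;
    }

    // Splits "some/file.html#anchor" into its path and anchor parts.
    static PathWithAnchor parse(String s) {
        int ind = s.indexOf('#');
        if (ind < 0) {
            return new PathWithAnchor(s, null);
        }
        return new PathWithAnchor(s.substring(0, ind), s.substring(ind + 1));
    }

    public static void main(String[] args) {
        PathWithAnchor p = parse("Z:\\an absolute\\path\\to\\a\\file.html#anchor");
        System.out.println(p.path);   // Z:\an absolute\path\to\a\file.html
        System.out.println(p.anchor); // anchor
    }
}
```

The path part can then be passed to new File(...) as before, so the rest of the file handling stays unchanged.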
