Generate URI from String provided by user via command line argument - java

This is such a simple question, I'm sure the answer is out there and I'm simply not searching with the proper lingo. I'm new to Java, using Java 8, and want to learn how to properly handle this, rather than rigging it together.
The application takes in arguments via command line.
$ MyApp /home/user/thefiletheywant.me
I have tried the following:
// Missing Scheme, I know I can just force ("file:" + args[0]) but is that proper?
URI fileIn = new URI(args[0]);
// I've learned this is the same thing as above
URI fileIn = URI.create(args[0]);
I've seen examples that take the string, check File.separator to verify it is "/" and, if not, replace it, then simply tack "file:" on the front. Which, again, seems sloppy.
What if the user added "http:"?
What if the user specifies a full path or a path relative to the directory they are currently in?
Do the builtin functions verify the path is proper? I'm aware of file.isFile() and file.exists(), which I can check myself easy enough.
If I knew exactly where the file was every time, of course the URI.create would be fine. But for future education, I want to know how to properly handle this very simple scenario. Please forgive me if in my searches I've simply somehow missed what I suspect is an easy solution.

You could just create a File object in Java, which handles paths in an OS-independent way (Windows uses backslashes, for instance), check that it exists, and use the handy toURI() method on it to create a valid URI object.
File myFile = new File(args[0]);
URI fileUri = null;
if(myFile.exists()) {
fileUri = myFile.toURI();
}
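To also cover the other cases the question raises (an argument that already carries a scheme such as http:, or a path relative to the current directory), one possible approach is sketched below. It is only a sketch: the class name ArgToUri and the method toUri are made up for illustration, and treating a one-letter scheme as a Windows drive letter is an assumption, not a rule from the JDK.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original answer.
final class ArgToUri {

    // Turns a command-line argument into a URI.
    // Arguments that already have a scheme (http:, file:, ...) are kept as they are;
    // everything else is treated as a file-system path.
    static URI toUri(String arg) {
        try {
            URI candidate = new URI(arg);
            String scheme = candidate.getScheme();
            // Assumption: a one-letter "scheme" is a Windows drive letter (C:/...), not a real scheme.
            if (scheme != null && scheme.length() > 1) {
                return candidate;
            }
        } catch (URISyntaxException e) {
            // Not valid URI syntax (spaces, backslashes, ...): fall through and treat it as a path.
        }
        // Relative paths are resolved against the current working directory.
        return Paths.get(arg).toAbsolutePath().normalize().toUri();
    }

    public static void main(String[] args) {
        for (String arg : args) {
            System.out.println(toUri(arg));
        }
    }
}
```

Whether the file exists is still a separate check, for example with Files.exists(Paths.get(uri)) once the scheme is known to be file.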

Related

checkmarx - How to resolve Stored Absolute Path Traversal issue?

Checkmarx - v 9.3.0 HF11
I am passing env value as data directory path in docker file which used in dev/uat server
ENV DATA /app/data/
In local, using following Environment variable
DATA=C:\projects\app\data\
getDataDirectory("MyDirectoryName"); // MyDirectoryName is present in data folder
public String getDataDirectory(String dirName)
{
String path = System.getenv("DATA");
if (path != null) {
path = sanitizePathValue(path);
path = encodePath(path);
dirName = sanitizePathValue(dirName);
if (!path.endsWith(File.separator)) {
path = path + File.separator;
} else if (!path.contains("data")) {
throw new MyRuntimeException("Data Directory path is incorrect");
}
} else {
return null;
}
File file = new File(dirName); // NOSONAR
if (!file.isAbsolute()) {
File tmp = new File(SecurityUtil.decodePath(path)); // NOSONAR
if (!tmp.getAbsolutePath().endsWith(Character.toString(File.separatorChar))) {
dirName = tmp.getAbsolutePath() + File.separatorChar + dirName;
} else {
dirName = tmp.getAbsolutePath() + dirName;
}
}
return dirName;
}
public static String encodePath(String path) {
try {
return URLEncoder.encode(path, "UTF-8");
} catch (UnsupportedEncodingException e) {
logger.error("Exception while encoding path", e);
}
return "";
}
public static String validateAndNormalizePath(String path) {
path = path.replaceAll("/../", "/");
path = path.replaceAll("/%46%46/", "/");
path = SecurityUtil.cleanIt(path);
path = FilenameUtils.normalize(path); // normalize path
return path;
}
public static String sanitizePathValue(String filename){
filename = validateAndNormalizePath(filename);
String regEx = "..|\\|/";
// compile the regex to create pattern
// using compile() method
Pattern pattern = Pattern.compile(regEx);
// get a matcher object from pattern
Matcher matcher = pattern.matcher(filename);
// check whether Regex string is
// found in actualString or not
boolean matches = matcher.matches();
if(matches){
throw new MyAppRuntimeException("filename:'"+filename+"' is bad.");
}
return filename;
}
[Attempt] - Updated code which I tried, with the help of a few members, to prevent the path traversal issue.
I tried to sanitize and normalize the string, but no luck; I am still getting the same issue.
How can I resolve the Stored Absolute Path Traversal issue?
Your first attempt is not going to work because escaping alone isn't going to prevent a path traversal. Replacing single quotes with double quotes won't do it either given you need to make sure someone setting a property/env variable with ../../etc/resolv.conf doesn't succeed in tricking your code into overwriting/reading a sensitive file. I believe Checkmarx won't look for StringUtils as part of recognizing it as sanitized, so the simple working example below is similar without using StringUtils.
Your second attempt won't work because it is a validator that uses control flow to prevent a bad input when it throws an exception. Checkmarx analyzes data flows. When filename is passed as a parameter to sanitizePathValue and returned as-is at the end, the data flow analysis sees this as not making a change to the original value.
There also appear to be some customizations in your system that recognize System.getProperty and System.getenv as untrusted inputs. By default, these are not recognized in this way, so anyone trying to scan your code probably would not have gotten any results for Absolute Path Traversal. It is possible that the risk profile of your application requires that you treat properties and environment variables as untrusted inputs, so you can't really just remove these and revert to the OOTB settings.
As Roman had mentioned, the logic in the query does look for values that are prepended to this untrusted input to remove those data flows as results. The below code shows how this could be done using Roman's method to trick the scanner. (I highly suggest you do not choose the route to trick the scanner.....very bad idea.) There could be other string literal values that would work using this method, but it would require some actions that control how the runtime is executed (like using chroot) to make sure it actually fixed the issue.
If you scan the code below, you should see only one vulnerable data path. The last example is likely something along the lines of what you could use to remediate the issues. It really depends on what you're trying to do with the file being created.
(I tested this on 9.2; it should work for prior versions. If it doesn't work, post your version and I can look into that version's query.)
// Vulnerable
String fn1 = System.getProperty ("test");
File f1 = new File(fn1);
// Path prepend - still vulnerable, tricks the scanner, DO NOT USE
String fn2 = System.getProperty ("test");
File f2 = new File(Paths.get ("", fn2).toString () );
// Path prepend - still vulnerable, tricks the scanner, DO NOT USE
String fn3 = System.getProperty ("test");
File f3 = new File("" + fn3);
// Path prepend - still vulnerable, tricks the scanner, DO NOT USE
String fn4 = System.getProperty ("test");
File f4 = new File("", fn4);
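// Sanitized by stripping path separator as defined in the JDK
// This would be the safest method
// Pattern.quote is needed because replaceAll takes a regex and File.separator is "\" on Windows
String fn5 = System.getProperty ("test");
File f5 = new File(fn5.replaceAll (Pattern.quote(File.separator), ""));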
So, in summary (TL;DR), replace the file separator in the untrusted input value:
String fn5 = System.getProperty ("test");
File f5 = new File(fn5.replaceAll (Pattern.quote(File.separator), "")); // quoted: "\" alone is not a valid regex
Edit
Updating for other Checkmarx users that may come across this in search of an answer.
After my answer, OP updated the question to reveal that the issue being found was due to a mechanism written for the code to run in different environments. Pre-docker, this would have been the method to use. The vulnerability would have still been detected but most courses of action would have been to say "our deployment environment has security measures around it to prevent a bad actor from injecting an undesired path into the environment variable where we store our base path."
But now, with Docker, this is a thing of the past. Generally the point of Docker is to create applications that run the same way everywhere they are deployed. Using a base path from the environment likely means OP is executing the code outside of a container for development (based on the update showing a Windows path) and inside the container for deployment. Why not just run the code in the container for development as well as deployment, as is intended by Docker?
Most of the answers tend to explain that OP should use a static path. This is because they are realizing that there is no way to avoid this issue because taking an untrusted input (from the environment) and prefixing it to a path is the exact problem of Absolute Path Traversal.
OP could follow the good advice of many posters here and put a static base path in the code then use Docker volumes or Docker bind mounts.
Is it difficult? Nope. If I were OP, I'd fix the base path prefix in code to a static value of /app/data and do a simple volume binding during development. (When you think about it, if there is storage of data in the container during a deployment then the deployment environment must be doing this exact thing for /app/data unless the data is not kept after the lifetime of the container.)
With the base path fixed at /app/data, one option for OP to run their development build is:
docker run -it -v"C:\\projects\\app\\data":/app/data {container name goes here}
All data written by the application would appear in C:\projects\app\data the same way it does when using the environment variables. The main difference is that there are no environment-variable-prefixed paths and thus no Absolute Path Traversal results from the static analysis scanner.
It depends on how Checkmarx comes to this point. Most likely because the value that is handed to File is still tainted. So make sure both /../ and /%46%46/ are replaced by /.
checkedInput = userInput.replaceAll("/../", "/");
Secondly, give File a parent directory to start with and later compare the path of the file you want to process. Some common example code is below. If the file doesn't start with the full parent directory, then it means you have a path traversal.
File file = new File(BASE_DIRECTORY, userInput);
if (file.getCanonicalPath().startsWith(BASE_DIRECTORY)) {
// process file
}
Checkmarx can only check if variables contain a tainted value and in some cases if the logic is correct. Please also think about the running process and file system permissions. A lot of applications have the capability of overwriting their own executables.
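One caveat with comparing canonical paths as strings: a base of /app/data would also accept /app/database, because String.startsWith looks at characters, not path elements. A variant using java.nio.file.Path, which compares whole name elements, might look like the sketch below. The class name ContainedPath and the method resolveInside are hypothetical, and normalize() only collapses ".." lexically; resolving symbolic links as well (for example with toRealPath()) is left out here on the assumption that the base directory contains none.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper illustrating the parent-directory check.
final class ContainedPath {

    // Resolves userInput against baseDir and rejects results outside baseDir.
    static Path resolveInside(Path baseDir, String userInput) {
        Path base = baseDir.toAbsolutePath().normalize();
        // If userInput is itself absolute, resolve() returns it unchanged,
        // so it fails the startsWith check unless it already lies under base.
        Path candidate = base.resolve(userInput).normalize();
        // Path.startsWith compares name elements, so /app/database is not "inside" /app/data.
        if (!candidate.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes base directory: " + userInput);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Path base = Paths.get("/app/data");
        System.out.println(resolveInside(base, "reports/a.txt"));
        try {
            resolveInside(base, "../database/x");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A static analysis tool may or may not recognize such a helper as a sanitizer; the sketch addresses the runtime behaviour only.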
If there is one thing to remember it is this
use allow lists not deny lists
(traditionally known as whitelists and blacklists).
For instance, consider replacing /../ with /, as suggested in another answer. My response would be an input containing the sequence /../../. You could pursue this iteratively, and I might run out of adversarial examples, but that doesn't mean there aren't any.
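The effect of a single replacement pass on such an input can be checked with a small sketch (the class DenyListDemo and the method stripOnce are made up for the demonstration):

```java
// Hypothetical demo of why a single deny-list replacement is not enough.
final class DenyListDemo {

    // One literal replacement pass, as a deny-list filter might do.
    static String stripOnce(String input) {
        return input.replace("/../", "/");
    }

    public static void main(String[] args) {
        // The first "/../" is removed, but the characters left behind form a new one.
        System.out.println(stripOnce("a/../../b")); // prints "a/../b"
    }
}
```

The second /../ shares its leading slash with the first match, so the scan skips it and the output still contains a traversal sequence.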
Another problem is knowing all the special characters. \0 used to truncate the file name. What happens to non-ASCII characters? I can't remember. Might other code be changed in future so that the path ends up on a command line, where other characters are special and, worse, depend on the OS and shell?
Canonicalisation has its problems too. It can be used, to some extent, to probe the file system (and perhaps beyond the machine).
So, choose what you allow. Say
if (filename.matches("[a-zA-Z0-9_]+")) {
return filename;
} else {
throw new MyException(...);
}
(No need to go through the whole Pattern/Matcher palaver in this situation.)
For this issue I would suggest you hard-code the absolute path of the directory that your program is allowed to work in, like this:
String separator = FileSystems.getDefault().getSeparator();
// should resolve to /app/workdir in linux
String WORKING_DIR = separator + "app"+separator +"workdir"+separator ;
then when you accept the parameter treat it as a relative path like this:
String filename = System.getProperty("test");
sanitize(filename);
filename = WORKING_DIR+filename;
File dictionaryFile = new File(filename);
To sanitize the user's input, make sure it contains neither .. nor \ nor /:
private static void sanitize(String filename) {
// "\\\\" in Java source is the regex \\, which matches one literal backslash
if (Pattern.compile("\\.\\.|\\\\|/").matcher(filename).find()) {
throw new RuntimeException("filename:'" + filename + "' is bad.");
}
}
Edit
If you are running the process on Linux, you can also change the root of the process using chroot; a bit of searching will show how to set that up.
How about using Java's Path to make the check ("../test1.txt" is the input from the user):
File base=new File("/your/base");
Path basePath=base.toPath();
// normalize() collapses ".." segments, so inputs such as "a/../../x" are caught as well
Path resolve = basePath.resolve("../test1.txt").normalize();
Path relativize = basePath.relativize(resolve);
if(relativize.startsWith("..")){
throw new Exception("invalid path");
}
Based on reading the Checkmarx query for the absolute path traversal vulnerability (and I believe this is one of the general mitigation approaches), the idea is to prepend a hard-coded path so that attackers cannot traverse the file system.
File has a two-argument constructor that lets you do this prepending:
String filename = System.getenv("test");
File dictionaryFile = new File("/home/", filename);
UPDATE:
The validateAndNormalizePath would have technically sufficed, but I believe Checkmarx is unable to recognize it as a sanitizer (it being a custom-written function). I would advise working with your App Security team so that they use CxAudit to override the base Stored Path Traversal Checkmarx query to recognize validateAndNormalizePath as a valid sanitizer.

Serving file resources contents from subfolder safely, securely

A user can submit a subfolder/filename to download.
The subfolder/filename will then be used to serve a file from a predetermined folder.
In the end, I am doing new File(folder, "subfolder/filename").
But before I do that, I also check that !"subfolder/filename".contains("..")
But is this enough? Is there possibly a scenario where two dots (..) may not come after each other, but still be interpreted as two dots when passed to new File(...) ?
Is there any other way a user can navigate back and reach content outside this folder?
Do I need to do something else to secure such subfolder/filename access within the folder?
One can get the canonical paths; this asks the OS, so it is a bit slow.
String folderPath = folder.getCanonicalPath() + File.separator;
File file = new File(folder, "subfolder/filename");
String path = file.getCanonicalPath();
if (!path.startsWith(folderPath)) {
log(Level.ERROR, "Security breach attempt: ...");
return;
}
A simple check would probably do too:
Pattern BREACH = Pattern.compile("\\.[\\\\]*\\.");
if (BREACH.matcher(path).find()) { ... }
Mind that if you use version control or other "protected" files/folders, names of files or folders starting with a dot should be treated as illegal too.
You can execute something like
cd ./\.\.
In Unix it will change directory to the parent. Maybe you can resolve the file and then check whether it is under the right parent?
UPD: it looks like in Java you cannot use the \.\. pattern (http://goo.gl/4Rszg5), but that does not mean that a check for ".." is sufficient. Better to check the canonical path.

Canonical path not working

So I'm trying to use the canonical path to access a sound file, but it does not seem to work. Here is my code:
// load wave data from buffer
WaveData wavefile = WaveData.create("/Users/spex/NetBeansProjects/spaceinvaders/src/spaceinvaders/spaceinvaders/" + path);
It appears that it is trying to get the path from the location of the class path. Is there a way to let it know that I want to input the canonical path rather than a local one?
Try using a URL instead, as stated in the javadocs:
WaveData wavefile = WaveData.create(new URL("file:/Users/spex/NetBeansProjects/spaceinvaders/src/spaceinvaders/spaceinvaders/" + path));
Alternatively, create an input stream from your file, and then call WaveData.create(inputStream).

Files, URIs, and URLs conflicting in Java

I am getting some strange behavior when trying to convert between Files and URLs, particularly when a file/path has spaces in its name. Is there any safe way to convert between the two?
My program has a file saving functionality where the actual "Save" operation is delegated to an outside library that requires a URL as a parameter. However, I also want the user to be able to pick which file to save to. The issue is that when converting between File and URL (using URI), spaces show up as "%20" and mess up various operations. Consider the following code:
//...user has selected file
File userFile = myFileChooser.getSelectedFile();
URL userURL = userFile.toURI().toURL();
System.out.println(userFile.getPath());
System.out.println(userURL);
File myFile = new File(userURL.getFile());
System.out.println(myFile.equals(userFile));
This will return false (due to the "%20" symbols), and is causing significant issues in my program because Files and URLs are handed off and often operations have to be performed with them (like getting parent/subdirectories). Is there a way to make File/URL handling safe for paths with whitespace?
P.S. Everything works fine if my paths have no spaces in them (and the paths look equal), but that is a user restriction I cannot impose.
The problem is that you use URL to construct the second file:
File myFile = new File(userURL.getFile());
If you stick to the URI, you are better off:
URI userURI = userFile.toURI();
URL userURL = userURI.toURL();
...
File myFile = new File(userURI);
or
File myFile = new File( userURL.toURI() );
Both ways worked for me, when testing file names with blanks.
Use instead..
System.out.println(myFile.toURI().toURL().equals(userURL));
That should return true.
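The difference between the two conversions can be reproduced with a short sketch. The class FileUrlRoundTrip, its method names, and the sample path are made up for illustration; the file does not need to exist.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical demo comparing the two File -> URL -> File round trips.
final class FileUrlRoundTrip {

    // Goes back through URL.toURI(), which decodes %20 into a space again.
    static File viaUri(File original) {
        try {
            URL url = original.toURI().toURL();
            return new File(url.toURI());
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    // Goes back through URL.getFile(), which returns the still-encoded path.
    static File viaGetFile(File original) {
        try {
            URL url = original.toURI().toURL();
            return new File(url.getFile());
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File original = new File("/tmp/my dir/some file.txt").getAbsoluteFile();
        System.out.println(viaUri(original).equals(original));
        System.out.println(viaGetFile(original).equals(original));
    }
}
```

The URI route restores the original path, while the getFile() route keeps the literal "%20" characters in the file name.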

Generate URL for File

The default output of File.toURL() is
file:/c:/foo/bar
These don't appear to work on windows, and need to be changed to
file:///c:/foo/bar
Does the format
file:/foo/bar
work correctly on Unix (I don't have a Unix machine to test on)? Is there a library that can take care of generating a URL from a File that is in the correct format for the current environment?
I've considered using a regex to fix the problem, something like:
fileUrl.replaceFirst("^file:/", "file:///")
However, this isn't quite right, because it will convert a correct URL like:
file:///c:/foo/bar
to:
file://///c:/foo/bar
Update
I'm using Java 1.4 and in this version File.toURL() is not deprecated and both File.toURL().toString() and File.toURI().toString() generate the same (incorrect) URL on windows
The File(String) expects a pathname, not an URL. If you want to construct a File based on a String which actually represents an URL, then you'll need to convert this String back to URL first and make use of File(URI) to construct the File based on URL#toURI().
String urlAsString = "file:/c:/foo/bar";
URL url = new URL(urlAsString);
File file = new File(url.toURI());
Update: since you're on Java 1.4 and URL#toURI() is actually a Java 1.5 method (sorry, overlooked that bit), better use URL#getPath() instead which returns the pathname, so that you can use File(String).
String urlAsString = "file:/c:/foo/bar";
URL url = new URL(urlAsString);
File file = new File(url.getPath());
The File.toURL() method is deprecated - it is recommended that you use the toURI() method instead. If you use that instead, does your problem go away?
Edit:
I understand: you are using Java 4. However, your question did not explain what you were trying to do. If, as you state in the comments, you are attempting to simply read a file, use a FileReader to do so (or a FileInputStream if the file is a binary format).
What do you actually mean with "Does the format file:/c:/foo/bar work correctly on Unix"?
Some examples from Unix.
File file = new File("/tmp/foo.txt"); // this file exists
System.out.println(file.toURI()); // "file:/tmp/foo.txt"
However, you cannot e.g. do this:
File file = new File("file:/tmp/foo.txt");
System.out.println(file.exists()); // false
(If you need a URL instance, do file.toURI().toURL() as the Javadoc says.)
Edit: how about the following, does it help?
URL url = new URL("file:/tmp/foo.txt");
System.out.println(url.getFile()); // "/tmp/foo.txt"
File file = new File(url.getFile());
System.out.println(file.exists()); // true
(Basically very close to BalusC's example which used new File(url.toURI()).)
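On the original question of file:/ versus file:///, java.net.URI parses both spellings to the same path, which a small sketch can confirm (the class FileUriForms and the method samePath are hypothetical names). Whether a non-Java program accepts the single-slash form is a separate matter that this does not answer.

```java
import java.net.URI;

// Hypothetical demo: both file URI spellings carry the same path component.
final class FileUriForms {

    // Compares only the path components of two URI strings.
    static boolean samePath(String first, String second) {
        return URI.create(first).getPath().equals(URI.create(second).getPath());
    }

    public static void main(String[] args) {
        System.out.println(samePath("file:/c:/foo/bar", "file:///c:/foo/bar")); // prints "true"
    }
}
```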
