Using the "file:" prefix to file paths in Java filename strings?


Getting resources for Java projects has always been fairly confusing to me, as the documentation doesn't explain it very well in my opinion, and I end up having to re-learn it every time I need to use it in a project. Most recently, using JavaFX, I was trying to load an image. The constructor requires a string representing the file path. I had come up with a very hacky method of doing this in the past, but I recently came across this StackOverflow post, and the accepted answer shows a very simple way of referencing the top level of the Eclipse project so that I can access source folders in the build path and easily locate my image files.
Is there a name for this particular delimiter? Are there other delimiters like it? Would there be problems using this notation when running this code in an executable JAR?
Any information would be greatly appreciated. And if this isn't the best way to approach this and someone could give me an adequate, simple explanation on how to do this in the future or a link to an article that explains it well, that would be great.

You can get resources using a relative path or an absolute path. Relative paths are resolved against the "working directory", which is the directory your application is launched from. An absolute path is unambiguous. File-based lookups of this kind only consult the file system, never the classpath. This is what is used in places like
new File(filepath).
When you run Java, there is also the classpath to consider. Another way some people look for resources is
ClassLoader.getResource(String)
and
ClassLoader.getResourceAsStream(String)
The ClassLoader can accept the String as a relative path and will check each location along the classpath.
https://docs.oracle.com/javase/7/docs/api/java/lang/ClassLoader.html#getResourceAsStream(java.lang.String)
People get results by putting their files within the src folder because that usually gets copied over to the place where the class files are generated, which is included on the classpath.
The "top level" that you speak of is the working directory. Eclipse sets the working directory to the project root by default.
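To make the difference between the two lookups concrete, here is a minimal sketch. The class name ResourceLookupSketch and the resource name used below are made up for illustration; only File.getAbsolutePath() and ClassLoader.getResource(String) are real API.

```java
import java.io.File;
import java.net.URL;

// Hypothetical helper class, for illustration only.
final class ResourceLookupSketch {
    private ResourceLookupSketch() {}

    /** Resolves a relative path the way new File(...) does: against the working directory. */
    static String resolveAgainstWorkingDir(String relativePath) {
        return new File(relativePath).getAbsolutePath();
    }

    /** Searches the classpath entries; returns null when none of them contains the name. */
    static URL findOnClasspath(String resourceName) {
        return ResourceLookupSketch.class.getClassLoader().getResource(resourceName);
    }
}
```

Calling resolveAgainstWorkingDir("images/logo.png") in Eclipse would normally give a path under the project root, while findOnClasspath("images/logo.png") only finds the file if it was copied to (or packaged in) a classpath location. The second form is the one that should keep working from inside an executable JAR.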

The relevant Wikipedia page covers the file: scheme nicely.
But we live in a messy world, and various pieces of software may conform to older standards, or not conform completely with any standards, or may include support for alternate syntaxes. Understanding the correct syntax to use with a particular software system is unfortunately not always straightforward.
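One way to sidestep guessing the syntax is to let the JDK produce the file: string for you. This is only a sketch; FileUriSketch and its method names are invented here, while Path.toUri() and Paths.get(URI) are standard API.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
final class FileUriSketch {
    private FileUriSketch() {}

    /** Turns a (possibly relative) path into an absolute URL string that starts with "file:". */
    static String toFileUrl(Path path) {
        return path.toAbsolutePath().toUri().toString();
    }

    /** Converts a file: URL string back into a Path. */
    static Path fromFileUrl(String url) {
        return Paths.get(URI.create(url));
    }
}
```

An API that expects a URL string, such as the JavaFX Image constructor mentioned in the question, could then be handed FileUriSketch.toFileUrl(Paths.get("images", "logo.png")), where the path is just an example.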

Related

Relative path vs absolute paths

I have a Java web application that reads files from the system. These files can be anywhere on the system.
I know that if the file is in the webapp itself it's always better to use a relative path, but for files outside it, is it better to use absolute or relative paths? I think using relative paths there is a bit pointless, but many senior developers have suggested I use them.
I would be interested in hearing thoughts on this, and what the usual practice is.

Normally, you should use relative paths where possible. This is the best practice.
But if you can be absolutely (get it?) sure that the file will stay there and is outside your actual application, you can refer to it with an absolute path and it should be fine.
There are no real rules for when to use which. I'd use absolute paths just for external stuff like this.
The advantage you get with relative paths, obviously, is that they are a lot more dynamic than absolute paths.

It just depends on what you know. You have a webapp - that means that it's deployed on a server (like glassfish).
Up to the level of your server's folder it will be better to use relative paths, because you know the file structure up to that point and can locate everything without extra knowledge about the rest of the filesystem.
For files outside that you can't guarantee anything (it's perfectly legal to put or move the server installation absolutely anywhere on the system), so absolute paths will be a more flexible choice.
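A common middle ground, sketched below, is to keep one absolute base directory in configuration and resolve everything else relative to it. BaseDirResolver is a hypothetical class and the directory names are examples; Path.resolve, normalize and startsWith are standard java.nio.file calls.

```java
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
final class BaseDirResolver {
    private final Path base;

    /** The base directory would typically come from configuration (an assumption of this sketch). */
    BaseDirResolver(Path base) {
        this.base = base.toAbsolutePath().normalize();
    }

    /** Resolves a name against the base directory and rejects anything that ends up outside it. */
    Path resolve(String relative) {
        Path resolved = base.resolve(relative).normalize();
        if (!resolved.startsWith(base)) {
            throw new IllegalArgumentException("escapes base directory: " + relative);
        }
        return resolved;
    }
}
```

Only the base directory is absolute, so moving the external files means changing one setting rather than every path in the code.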

Symbol to signify the root of a project

Is there a well accepted symbol in the programming world for the root of a project?
For example, the tilde ~ is the user's home directory, but this is not just convention; it is part of UNIX.
I am looking for a symbol that is merely convention.

If you are looking for a convention for use in communicating with a team, I'd suggest the project name followed by a /. This makes it clear as to what project you are referring to. If the project name is already implied by the context, it seems to be the convention to simply use a subdirectory name, with or without a trailing slash. See here and here for examples from Linux-kernel related documentation.

I'm not aware of any such convention. In Autoconf, the variables top_srcdir and abs_top_srcdir point to the root of a project. In git, this does the job:
git rev-parse --show-toplevel
However, if you are looking for a single-character symbol, I suggest borrowing the tee character: ⊤ (U+22A4, &#8868;). I don't think it has ever been used for that, but it captures the idea of top.

the root of a project
What does "the root of the project" mean exactly? In which context? For which types of projects? Are you talking about a deployed web project? The source tree of a web project? A command-line utility written in C? Or in Java? Or Go?
Each language and framework provides its own set of predefined structures to follow. The root of the project is then either the root of the VCS, which may store many assets not strictly related to the business of the software, or the root according to the framework or language you are working with. In the latter case, I assume it is safe to say it can be anything, because there are so many different frameworks for so many different concerns.

Windows vs. POSIX
POSIX is the Portable Operating System Interface, which UNIX-like systems follow.
Windows has C:\ or another drive letter as root, while POSIX has / as root.
To find out whether a path is absolute, you can use path.isAbsolute('PATH_HERE'); this will return true if it is an absolute path.
To find out whether Node is running on a Windows or POSIX platform, use process.platform.
To check whether you are running on Windows:
var isWin = /^win/.test(process.platform);
nodeJS Docs: https://nodejs.org/dist/latest-v6.x/docs/api/path.html#path_path_isabsolute_path
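For readers on the JVM, a rough Java analogue of the same checks might look like this. PathKindSketch is a made-up name, and the os.name test is a common heuristic rather than an official API for platform detection.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class PathKindSketch {
    private PathKindSketch() {}

    /** True if the path is absolute on the current platform. */
    static boolean isAbsolute(String path) {
        return Paths.get(path).isAbsolute();
    }

    /** Heuristic: the os.name system property starts with "Windows" on Windows JVMs. */
    static boolean isWindows() {
        return System.getProperty("os.name", "").toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** File system roots: "/" on POSIX systems, the available drive roots on Windows. */
    static List<Path> roots() {
        List<Path> roots = new ArrayList<>();
        for (Path root : FileSystems.getDefault().getRootDirectories()) {
            roots.add(root);
        }
        return roots;
    }
}
```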

I think people usually use a label for the root instead of a symbol, e.g., /server for the root of a Node app.

The Be-all, End-all
After doing the bare minimum of research and reading about 1/4 of a Wikipedia article on Root Directory I have come to the almighty, forever-binding conclusion that:
No, there is no standardized way of indicating you are in the root directory of an arbitrary project. (Apart from reading the path itself)
Here is another link pertaining to inodes farther down to make it seem like I did more research.
In that case, making a standard seems like fun doesn't it?
The standard you come up with doesn't have to be global, it can just apply to your dev team if you want it to. In that case, let's make 3 right off the top of our (my) head.
How about |->foo/bar/a.java? The | indicates a flat level, with nothing before it.
We could always try a boring (but useful... I guess): (foo)/bar/a.java
Or to spice things up a little bit, we could do...
I am gROOT
|foo|/bar/a.java
Whatever standard you choose (which is kinda funny, because the usage of standard implies that there's only one) you're now going to have to...
Implement it!
This is going to be the hard part. You're going to have to find some way to indicate to the OS that you're not only in an arbitrary directory, but that you're in a directory that holds slightly more significance than others. Maybe you add another section to the INODE (in *nix at least) that specifies that it's important. Maybe you don't fuss around with all the OS level stuff, and instead patch git to recognize the root of all git projects... which now that I think about it, kind of already happens.
Possible Implementation
Let's use git as an example. Git projects are denoted by a .git directory in the root directory. So let's take that a step further and put a .base file in every directory that is the root of a project (or what have you). The .base file doesn't even need to have anything in it; it just needs to be there. Now, patch up whatever terminal you're using to recognize the .base file as the root of an arbitrary project, and display it however you like! EZ-PZ
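The lookup itself could be as small as the sketch below, which walks up from a starting directory until it finds the marker. ProjectRootFinder is a hypothetical name; passing ".git" instead of ".base" as the marker would locate a git working tree in the same way.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
final class ProjectRootFinder {
    private ProjectRootFinder() {}

    /** Returns the nearest ancestor of start (or start itself) that contains the marker, if any. */
    static Optional<Path> findRoot(Path start, String markerName) {
        Path dir = start.toAbsolutePath().normalize();
        while (dir != null) {
            if (Files.exists(dir.resolve(markerName))) {
                return Optional.of(dir);
            }
            dir = dir.getParent(); // becomes null once the file system root has been checked
        }
        return Optional.empty();
    }
}
```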
Possible additions?
Some other thoughts here, maybe you could add some configuration to the .base file, like so:
proj_name=WorldTraveller
lang=java
other=stuff
can=go
here=whatever
which then drives how it's displayed in the terminal. The above configuration using my first suggested standard would be
|->WorldTraveller/Countries/France/a.java
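Since the proposed .base content is plain key=value lines, java.util.Properties can already read it. The sketch below assumes exactly that format and the proj_name key from the example; BaseFileLabel is an invented name.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class, for illustration only.
final class BaseFileLabel {
    private BaseFileLabel() {}

    /** Builds the "|->name/path" label from the content of a .base file. */
    static String label(Reader baseFile, String pathInsideProject) {
        Properties config = new Properties();
        try {
            config.load(baseFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Fall back to a placeholder when the file does not name the project.
        String name = config.getProperty("proj_name", "unnamed");
        return "|->" + name + "/" + pathInsideProject;
    }
}
```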
Note
I'm not trying to come off as a sarcastic D.i.a.B, so if I came off as one it wasn't my intention. I like to have fun answering questions sometimes.

What is the Python equivalent to using Class.getResource()? [duplicate]

In Java, I can use Class.getResource() to read a file that contains resource data for my algorithms so that the path is correctly referenced. How do I do the same in Python?
Clarification
I am trying to understand how in the Python world one packages data along with code in a module.
For example I might be writing some code that looks at a string and tries to classify the language the text is written in. For this to work I need to have a file that contains data about language models.
So when my code is called I would like to load a file (or files) that is packaged along with the module. I am not clear on how I should do that in Python.
TIA.

I think you may be looking for pkgutil.get_data(). The docs for this say:
pkgutil.get_data(package, resource)
Get a resource from a package.
This is a wrapper for the PEP 302 loader get_data() API. The package
argument should be the name of a package, in standard module format
(foo.bar). The resource argument should be in the form of a relative
filename, using / as the path separator. The parent directory name ..
is not allowed, and nor is a rooted name (starting with a /).
The function returns a binary string that is the contents of the
specified resource.
For packages located in the filesystem, which have already been
imported, this is the rough equivalent of:
d = os.path.dirname(sys.modules[package].__file__)
data = open(os.path.join(d, resource), 'rb').read()
If the package cannot be
located or loaded, or it uses a PEP 302 loader which does not support
get_data(), then None is returned.

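I think you are looking for imp.load_source. Note that the imp module is deprecated and was removed in Python 3.12; importlib.util covers the same use case:
import importlib.util
spec = importlib.util.spec_from_file_location('ModuleName', '/path/of/the/file.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
module.FooBar()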

For Pythonistas who don't know, the behaviour of Java's Class.getResource is basically: the supplied file name is (unless it's already an absolute path) transformed into a relative path by using the class' package (since the directory path to the class file is expected to mirror the explicit "package" declaration for the class). The ClassLoader that was used to load the class in the first place then gets to transform this path string, by its own logic, into a URL object that could encode a file name, a location on the WWW, etc.
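The name transformation described above can be written out as a few lines of Java. This is a sketch of the documented rule, not the JDK's actual source, and GetResourceNameSketch is a made-up name.

```java
// Hypothetical helper class, for illustration only.
final class GetResourceNameSketch {
    private GetResourceNameSketch() {}

    /**
     * Approximates how Class.getResource turns its argument into the name
     * that is handed to the ClassLoader (array classes are ignored here).
     */
    static String classLoaderName(Class<?> owner, String name) {
        if (name.startsWith("/")) {
            return name.substring(1); // "absolute": only the leading slash is dropped
        }
        String className = owner.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return name; // class in the default package
        }
        // Relative: prefix the package, with dots turned into slashes.
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }
}
```

Under this rule, String.class.getResource("Object.class") and String.class.getResource("/java/lang/Object.class") would be expected to ask the class loader for the same name, java/lang/Object.class.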
Python is not Java, so we have to approximate a few things and read intent into the question.
Python classes don't really explicitly go into packages, although you can create packages by putting them in folders with an additional __init__.py file.
Python does not really have anything quite like the URL class in its standard library; although there is plenty of support for connecting to the Internet, you're generally expected to just use strings to represent URLs (and file names) and format them appropriately. This is arguably an unfortunate missed opportunity for polymorphism (it would not be hard to make your own wrapper, though you might miss lots of special cases and useful functionality). Anyway, in normal cases with Java, you're not expecting to get a web URL from this process.
Python has a concept of a "working directory" that depends on how the Python process was launched. File paths are not necessarily relative to the directory where the "main class" (well, really, "main module", because Python doesn't make you put everything in a class) is found.
So what you really want, probably, is to get the absolute path on disk to the source file corresponding to the class. But that isn't really going to work out either. The problem is: given a class, you can get the name of the module it comes from, and then look up that name to get the actual module object, and then from the module object get the file name that the module was loaded from. However, that file name is relative to whatever the working directory was when the module was loaded, and that information isn't recorded. If the working directory has changed since then (with os.chdir), you're out of luck.
Please try to be more clear about what you're really trying to do.

Java Project Documentation

I need to document a Java project. I am a C# Programmer and Systems Analyst. But I am new to Java.
I have the directories checked out of SVN.
These directories include the source directories, WEB-INF and other files required for definition of the project, classpath etc.
I understand that the files essentially belong to one of the following three categories:
1. Source code files / directories that are based on the way the packages are structured (.java)
2. Directories / files required for project definition, compiler settings etc.
3. Files required for deployment.
The project is (as most Java projects are) an Eclipse-based project designed to be hosted on Tomcat.
Now, given the above information, I have decided to document the entire project in three different documents:
1. A document explaining the source code etc.
2. A document explaining the purpose of the files & directories that are required for compiler settings, project definitions etc.
3. A document that explains the deployment directory structure.
Or alternatively I could create a single document with three sections that explain 1-3 above.
Now, questions
Is this the right approach?
Are there any other methodologies that I can follow or borrow from?
Are there any other suggestions etc that you can add to this approach
Any additional info will be of use.
Thanks a ton in advance

I think you're on the right track. In a project you need to address three documentation needs
User Documentation
This includes a document stating what the application is about, and how to start it / access it.
Development Documentation
This includes at least the Javadocs, a description of the source code directory structure, the build process (i.e. how to compile the project), compile-time dependencies, development standards, how to set up a database for development, and how to get the source code from the repository. These are the minimum you need to get others to work on your project. Additionally, as the project complexity grows, I like to put together a series of "How To" guides for common tasks in the system (e.g. "How to leave an Audit Trail for a given Operation", "How to use the Logging framework", "How to manage exceptions", etc.) and a description of the main domain classes and their relationships. If you use a database, and the database schema is not exactly one-to-one with the domain classes, I'll add schema documentation.
Deployment Documentation
This is basically the installation manual of the application, describing any steps needed to make it run: putting the WAR in Tomcat, running scripts against a database, configuration files that need to be modified, etc.
As you see, you already partially addressed two of them. Start small and simple, and add the rest as the need arises.
It also helps to check if your organization has any documentation standard.

Try Javadocs. Written with proper planning, they will address all your points above.
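For someone coming from C#, Javadoc comments play roughly the role of XML documentation comments. Below is a small sketch of what they look like; AuditEntry is an invented example class, not something from the project in question.

```java
/**
 * Immutable record of one audited operation.
 *
 * <p>Hypothetical example class, used only to show the Javadoc layout.</p>
 */
final class AuditEntry {
    private final String user;
    private final String operation;

    /**
     * Creates an entry.
     *
     * @param user      login name of the person who performed the operation
     * @param operation short, human-readable name of the operation
     * @throws IllegalArgumentException if either argument is null or empty
     */
    AuditEntry(String user, String operation) {
        if (user == null || user.isEmpty() || operation == null || operation.isEmpty()) {
            throw new IllegalArgumentException("user and operation are required");
        }
        this.user = user;
        this.operation = operation;
    }

    /**
     * Formats the entry for a log line.
     *
     * @return the text {@code user: operation}
     */
    String describe() {
        return user + ": " + operation;
    }
}
```

Running the javadoc tool over the source tree turns comments like these into browsable HTML pages.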

A document explaining the source code etc.
Yes. Approach this as if your reader was someone trying to get familiar with the reasons why the project was written (why was this project created), as well as the overall architecture of the project.
The Javadocs on the source classes should explain what each class does. Your document should tie the Javadocs together, like a tutorial.
A document explaining the purpose of the files & directories that are required for compiler settings, project definitions etc.
Yes.
A document that explains the deployment directory structure.
I suppose that's what your build scripts do. Perhaps I don't understand what you expect this document to accomplish.
Are there any other suggestions etc that you can add to this approach
Unless this is the first time anyone in your development group has documented a Java project, there should be other documentation. See what they've done.
If you are the first, then I'd say this was a good start. I'd be most interested in the first document. Your new programmers would like the second document.

Using the ClassLoader method to retrieve all resources under classes as Input Streams

My problem is one that you would think is quite common, but I haven't so far managed to find a solution.
I'm building a Java web app under Tomcat 5.5 (although a requirement is that it can be deployed anywhere, e.g. under a WebLogic environment, hence the requirement to load resources as streams). Good practice dictates that resource files are placed under WEB-INF/classes and loaded using the ClassLoader's getResourceAsStream() method. All well and good when you know the name of the resource you want to load.
My problem is that I need to load everything (including recursively in non-empty sub-directories) that lives in a subdirectory of classes.
So, for example, if I have the following under WEB-INF/classes:
folderX/folderY
folderX/folderY/fileA.properties
folderX/fileB.properties
I need the fileA.properties and fileB.properties files to be loaded, without actually knowing their names before the application is started (i.e. I need the ability to arbitrarily load resources from any directory under WEB-INF/classes).
What is the most elegant way to do this? What object could I interrogate to find the information I need (the resource paths to each of the required resources)? A non-servlet specific solution would be best (keeping it all within the class loading framework if possible).
Thanks in advance!

As far as I am aware, there is no such ability, since the classloader only attempts to load things it is asked for. It doesn't pre-fetch all items on the classpath, or treat them as a directory structure.
The way I would solve the problem is to create a directory listing of all relevant resources in a text file at build time, include that in the WAR, and then walk through it.
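A minimal sketch of that idea follows. It assumes the build writes one resource name per line into an index file on the classpath; the index format, the class name ResourceIndex and the method names are all assumptions of this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
final class ResourceIndex {
    private ResourceIndex() {}

    /** Parses the index: one resource name per line, blank lines and '#' comments ignored. */
    static List<String> readIndex(InputStream in) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines()
                    .map(String::trim)
                    .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads every .properties resource named in the index through the given class loader. */
    static Map<String, Properties> loadAll(ClassLoader loader, String indexName) {
        InputStream index = loader.getResourceAsStream(indexName);
        if (index == null) {
            throw new IllegalStateException("index not found on classpath: " + indexName);
        }
        Map<String, Properties> loaded = new LinkedHashMap<>();
        for (String name : readIndex(index)) {
            try (InputStream in = loader.getResourceAsStream(name)) {
                if (in == null) {
                    throw new IllegalStateException("listed resource is missing: " + name);
                }
                Properties props = new Properties();
                props.load(in);
                loaded.put(name, props);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return loaded;
    }
}
```

Because everything goes through getResourceAsStream, this should behave the same whether the resources sit in an exploded directory or inside an archive.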

You can do that with some tricks :)
Get the resource as a URL and extract the protocol:
file protocol - get the URL path and you have a folder, scan for files.
jar/zip protocol - extract the jar/zip path and use JarFile to browse the files and extract everything under your path/package.
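Roughly, the two branches could look like the sketch below. ResourceScanner is a made-up name, and the jar branch assumes a plain JAR whose entry names are already relative to the classpath root; a container that serves resources through some other URL protocol would need its own branch.

```java
import java.io.File;
import java.io.IOException;
import java.net.JarURLConnection;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Stream;

// Hypothetical helper class, for illustration only.
final class ResourceScanner {
    private ResourceScanner() {}

    /**
     * Lists resource names under a folder. folderUrl is what getResource(folderName)
     * returned; the result is sorted and uses '/' as the separator.
     */
    static List<String> list(URL folderUrl, String folderName) throws IOException, URISyntaxException {
        String prefix = folderName.endsWith("/") ? folderName : folderName + "/";
        List<String> names = new ArrayList<>();
        String protocol = folderUrl.getProtocol();
        if ("file".equals(protocol)) {
            // Plain directory on disk: walk it recursively.
            Path root = Paths.get(folderUrl.toURI());
            try (Stream<Path> walk = Files.walk(root)) {
                walk.filter(Files::isRegularFile)
                    .map(p -> root.relativize(p).toString().replace(File.separatorChar, '/'))
                    .forEach(rel -> names.add(prefix + rel));
            }
        } else if ("jar".equals(protocol)) {
            JarURLConnection connection = (JarURLConnection) folderUrl.openConnection();
            connection.setUseCaches(false); // so closing the JarFile does not touch a shared, cached instance
            try (JarFile jar = connection.getJarFile()) {
                Enumeration<JarEntry> entries = jar.entries();
                while (entries.hasMoreElements()) {
                    JarEntry entry = entries.nextElement();
                    if (!entry.isDirectory() && entry.getName().startsWith(prefix)) {
                        names.add(entry.getName());
                    }
                }
            }
        } else {
            throw new IOException("unsupported protocol: " + protocol);
        }
        Collections.sort(names);
        return names;
    }
}
```

Each returned name can then be passed to getResourceAsStream as usual.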
