When I do ls for a file in Linux and see crw-rw----, what does the c imply?
I have read that c stands for character device, but does that have any consequence for which Java I/O API methods I should use (e.g. BufferedWriter) or must not use?
I think this should give sufficient advice. FileInputStream and FileOutputStream are sufficient to access the device file which in turn talks to the character device.
This Wikipedia article explains the usage of the c character.
crw-rw-r-- a character special file whose user and group classes have the read and write permissions and whose others class has only the read permission.
The c character will not affect your ability to read this file; however, you may want to check the write permissions if you are not in the correct group. There are much better ways to check read/write permissions on files than reading through the output of ls.
Use File.canWrite() and File.canRead() for checking file permissions. More info on these methods here - File - JavaDoc.
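As a rough sketch of that check (the class name DevicePermissionCheck and the default path /dev/ttyS0 are only placeholders for illustration, not something from the question), you could ask the JVM what the current process is allowed to do before opening the device file:

```java
import java.io.File;

final class DevicePermissionCheck {
    // Summarises what the current process may do with the given path:
    // "missing" if it does not exist, otherwise "r"/"-" followed by "w"/"-".
    static String describe(File f) {
        if (!f.exists()) {
            return "missing";
        }
        String r = f.canRead() ? "r" : "-";
        String w = f.canWrite() ? "w" : "-";
        return r + w;
    }

    public static void main(String[] args) {
        // "/dev/ttyS0" is only an illustrative default; pass a real path as the first argument.
        File f = new File(args.length > 0 ? args[0] : "/dev/ttyS0");
        System.out.println(f + ": " + describe(f));
    }
}
```

If this reports that reading or writing is allowed, a plain FileInputStream or FileOutputStream on the same path should be enough, as mentioned above.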
I never thought about it.
But if you read a file you can use for example this code.
FileReader fileReader = new FileReader("c:\\data\\input-text.txt");
int data = fileReader.read();
while(data != -1) {
data = fileReader.read();
}
But how is the end of the file actually recognised? Is it because the operating system knows the size of the file, or is there a special character? I think Java will call some C/C++ function from the operating system and this function will return -1, so Java knows the end of the file has been reached. But how does the operating system know that the end of the file has been reached? Which special character is used for this?
How is actually End of File detected in java?
Java doesn't detect it. The operating system does.
The meaning of end-of-file depends on the nature of the "file" that you are reading.
If the file is a regular file in a file system, then the operating system knows or can find out what the actual file size is. It is part of the file's metadata.
If the file is a Socket stream, then end-of-file means that all available data has been consumed, and the OS knows that there cannot be any more. Typically, the socket has been closed or half closed.
If the file is a Pipe, then end-of-file means that the other end of the Pipe has closed it, and there will be no more data.
If the file is a Linux/UNIX device file, then the precise end-of-file meaning will be device dependent. For example, if the device is a "tty" device on Linux/UNIX, it could mean:
the modem attached to the serial line has dropped out
the tty was in "cooked" mode and received the character that denotes EOF
and possibly other things.
It is common for a command shell to provide a way to signal an "end of file". Depending on the implementation, it may implement this itself, or it may be implemented at the device driver level. In either case, Java is not involved in the recognition.
I think java will call some C/C++ function from operating system and this function will return -1 , so java knows end of file is reached.
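On Linux / UNIX / macOS, the Java runtime calls the native read(fd, buffer, count) function. That returns 0 when the fd is at the end-of-file position (a return of -1 signals an error), and the Java runtime reports that condition to your code as a return value of -1.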
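Seen from the Java side, end-of-file is simply the moment a read call returns -1. Here is a minimal sketch (EofDemo and countBytes are made-up names, and an in-memory stream stands in for a real file) showing that no particular byte value in the data acts as an end marker:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class EofDemo {
    // Reads until the stream reports end-of-file and returns the number of bytes seen.
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // read(byte[]) returns -1 only at end-of-file; no sentinel byte is stored in the data.
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // The payload deliberately contains 0x00, 0x1A and 0xFF: none of them ends the stream.
        byte[] data = {0x00, 0x1A, (byte) 0xFF, 'a'};
        System.out.println(countBytes(new ByteArrayInputStream(data))); // prints 4
    }
}
```

The same loop works unchanged on a FileInputStream, a socket stream or a pipe; only the way the OS decides that "there is no more" differs.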
I see the chances of popular file systems like ext and NTFS using a delimiter or special character to mark the end of data as very slim. This is because files often have to store binary information rather than text, and if the delimiter appeared within the data, it could easily confuse the OS. In Linux, the VFS (Virtual Filesystem) layer offloads these details to the file system implementations themselves, and most of them construct a unique inode (a kind of metadata record) for every file that resides in the file system. Inodes tend to hold information on the blocks where the data is stored and also the exact size of the file, among other things. Detecting EOF becomes trivial when you have those.
When I call endsWith(".pdf"), would this open malware.pdf.exe or just malware.pdf?
String sFileName = request.getParameter("fName");
if (sFileName.toLowerCase().endsWith(".pdf"))
// open file
else
// don’t open the file
String.endsWith works as documented. However, there are a couple of obvious problems here.
A NUL character \0 will typically terminate the string as far as the OS file API is concerned (because it'll be using C strings).
If the file is served up, its content type may not be determined by the extension; it could be sniffed ("magicked") as a different type.
It's generally dangerous to run PDFs downloaded from the internet from the local filesystem. (Chrome warns of this and see Billy Rios on Content Smuggling).
.endsWith("string") will perform as you intend. However, that doesn't mean that the file is actually a pdf. Check out this SO question or others for more information on how to check the header.
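To make both points concrete, here is a small sketch. The names PdfSniffer, hasSafePdfName and looksLikePdf are invented for this example, and InputStream.readNBytes(int) assumes Java 11 or later. It rejects names containing a NUL character and checks that the content starts with the "%PDF-" signature. Passing both checks still does not prove the file is a valid or harmless PDF.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

final class PdfSniffer {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Name check: no embedded NUL, and the name really ends in ".pdf" (case-insensitive).
    static boolean hasSafePdfName(String name) {
        return name != null
                && name.indexOf('\0') < 0
                && name.toLowerCase(Locale.ROOT).endsWith(".pdf");
    }

    // Content check: true only if the stream begins with the bytes "%PDF-".
    static boolean looksLikePdf(InputStream in) throws IOException {
        byte[] head = in.readNBytes(PDF_MAGIC.length); // Java 11+
        return Arrays.equals(head, PDF_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hasSafePdfName("malware.pdf.exe")); // false: ends with ".exe"
        byte[] fake = "MZ not a pdf".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(new ByteArrayInputStream(fake))); // false
    }
}
```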
I am currently writing a program which takes user input and creates rows of a comma delimited .csv file. I am in need of a way to save this data in a way in which users are not able to easily edit this data. It does not need to be super secure, just enough so that it couldn't accidentally be edited. I also need another file (or the same file?) created to then be easily accessible (in the file system) by the user so that they may then email this file to a system admin who can then open the .csv file. I could provide this second person with a conversion program if necessary.
The file I save data in and the file to be sent can be two different files if there are any advantages to this. I was currently considering just using a file with a weird file extension, but saving it as a text file so that the user will only be able to open it if they know to try that. The other option being some sort of encryption, but I'm not sure if this is necessary and even if it was where I would start.
Thanks for the help :)
Edit: This file is meant to store the actual data being entered. Currently the data is being gathered on paper forms which are then sent to the admin to manually enter all of the data. This little app is meant to have someone else enter the data from the paper form and then tell them if they've entered it all correctly. After they've entered it all they then need to send the data to the admin. It would be preferable if the sending was handled automatically, but this app needs to be very simple and low budget and I don't want an internet connection to be a requirement.
You could store your data in a serializable object and save that. It would resist casual editing and be very simple to read and write from your app. This page should get you started: http://java.sun.com/developer/technicalArticles/Programming/serialization/
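A minimal sketch of that idea, assuming each CSV row is kept as a String (RowStore, save and load are hypothetical names, not part of any library):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

final class RowStore {
    // Writes the rows using Java's built-in object serialization.
    static void save(List<String> rows, File target) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(target))) {
            out.writeObject(new ArrayList<>(rows)); // ArrayList and String are both Serializable
        }
    }

    // Reads the rows back; the cast is unchecked because the file only records the raw type.
    @SuppressWarnings("unchecked")
    static List<String> load(File source) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(source))) {
            return (List<String>) in.readObject();
        }
    }
}
```

The resulting file is binary, so it does not open usefully in a text editor or spreadsheet, but your app (or a small conversion tool for the admin) can turn it back into CSV rows with load.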
From your question, I am guessing that the uneditable file's purpose is to store some kind of system config and you don't want it to get messed up easily. From your own suggestions, it seems that even knowing that the file has been edited would help you, since you can then avoid using it. If that is the case, then you can use simple checks, such as save the total number of characters in the line as the first or last comma delimited value. Then, before you use the file, you just run a small validation code on it to verify that the file is indeed unaltered.
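Something along these lines would do, as a sketch only (RowCheck, seal and isIntact are names made up here). It appends the character count of each row as the last comma-delimited value and verifies it before use:

```java
final class RowCheck {
    // Appends the character count of the row as a trailing field.
    static String seal(String row) {
        return row + "," + row.length();
    }

    // Returns true if the trailing field matches the length of the rest of the line.
    static boolean isIntact(String sealed) {
        int comma = sealed.lastIndexOf(',');
        if (comma < 0) {
            return false; // no trailing count field at all
        }
        String body = sealed.substring(0, comma);
        try {
            return Integer.parseInt(sealed.substring(comma + 1)) == body.length();
        } catch (NumberFormatException e) {
            return false; // trailing field is not a number
        }
    }

    public static void main(String[] args) {
        String sealed = seal("Smith,John,42");
        System.out.println(sealed + " -> " + isIntact(sealed));
    }
}
```

A length check only catches edits that change the length of a row; a real checksum would catch more, at the cost of a little extra code.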
Another approach may just be to use a ZIP (file) of a "plain text format" (CSV, XML, other serialization method, etc) and, optionally, utilize a well-known (to you) password.
This approach could be used with other stream/package types: the idea behind using a ZIP (as opposed to an object serializer directly) is so that one can open/inspect/modify said data/file(s) easily without special program support. This may or may not be a benefit and using a password may or may not even be required, see below.
Some advantages of using a ZIP (or CAB):
The ability for multiple resources (aids in extensibility)
The ability to save the actual data in a "text format" (XML, perhaps)
Maintain competitive file-sizes for "common data"
Re-use existing tooling support (also get checksum validation for free!)
Additionally, using a non-ZIP file extension will prevent most users from casually associating the file (a similar approach to what is presented in the original post, but subtly different because the ZIP format itself is not "plain text") with the ZIP format and being able to open it. A number of modern Microsoft formats utilize the fact that the file-extension plays an important role and use CAB (and sometimes ZIP) formats as the container format for the document. That is, an ".XSN" or ".WSP" or ".gadget" file can be opened with a tool like 7-zip, but are generally only done so by developers who are "in the know". Also, just consider ".WAR" and ".JAR" files as other examples of this approach, since this is Java we're in.
Traditional ZIP passwords are not secure, and a static password embedded in the program is even less so. However, if this is just a deterrent (i.e. not for "security"), then those issues are not important. Coupled with an "un-associated" file type/extension, I believe this offers the protection asked for in the question while remaining flexible. It may be possible to drop the password entirely and still prevent "accidental modifications" just by using a ZIP (or other) container format, depending upon requirements/desires.
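Here is a sketch of the password-less variant using java.util.zip, which has no password support of its own (a third-party library would be needed for that). ZipContainer, the entry name and the ".dat" extension in the usage are illustrative assumptions, and InputStream.readAllBytes() assumes Java 9 or later:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class ZipContainer {
    // Stores the CSV text as a single entry inside a ZIP file; the file extension is up to the caller.
    static void write(File target, String entryName, String csv) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream(target))) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(csv.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
    }

    // Reads the named entry back as UTF-8 text.
    static String read(File source, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(source)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new FileNotFoundException(entryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
            }
        }
    }
}
```

For example, ZipContainer.write(new File("entries.dat"), "data.csv", csvText) produces a file that most users will not associate with a ZIP tool, while the admin can still open it with 7-zip or read it back with ZipContainer.read.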
Happy coding.
Can you set file permissions to make it read-only?
Other than doing a binary output file, the file system that Windows runs (I know for sure it works from XP through x64 Windows 7) has a little trick that you can use to hide data from anyone simply perusing through your files:
Append your output and input files with a colon and then an arbitrary value, eg if your filename is "data.csv", make it instead "data.csv:42". Any existing or non-existing file can be appended to to access a whole hidden area (and every file for every value after the colon is distinct, so "data.csv:42" != "data.csv:carrots" != "second.csv:carrots").
If this file doesn't exist, it will be created and initialized with 0 bytes of data. If you open data.csv in Notepad, you will see that it holds exactly the data it held before you wrote to "data.csv:42", no more, no less; yet the data written to "data.csv:42" persists and can be read back later. This makes it a perfect place to hide data from any annoying user!
Caveats: If you delete "data.csv", all associated hidden data will be deleted too. Also, there are indeed programs that will find these files, but if your user goes through all that trouble to manually edit some csv file, I say let them.
I also have no idea if this will work on other platforms, I've never thought to try it.
I have to write code to upload/download a file to/from a remote machine. But when I upload the file, newlines are not preserved and some binary characters are inserted automatically. Also, I'm not able to save the file in its actual format; I have to save it as "filename.ser". I'm using Java's serialization/deserialization mechanism.
Thanks in advance.
How exactly are you transmitting the files? If you're using implementations of InputStream and OutputStream, they work on a byte-by-byte level so you should end up with a binary-equal output.
If you're using implementations of Reader and Writer, they convert the bytes to characters according to some character mapping, and then perform the reverse process when saving. Depending on the platform encodings of the various machines (and possibly other effects if you're not specifying the charset explicitly), you could well end up with differences in the binary file.
The fact that you mention newlines makes me think that you're using Readers to send strings (and possibly that you're stitching the strings back together yourself by manually adding newlines). If you want the files to be binary equal, then send them as a stream of bytes and store that stream verbatim. If you want them to be equal as strings in a given character set, then use Readers and Writers but specify the character set explicitly. If you want them to be transmitted as strings in the platform default set (not very useful), then accept that they're not going to be binary equal as files.
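Since no code was posted, the following is only a sketch of the two approaches under that assumption (Transfer, copyBytes and copyText are invented names). The first method copies bytes verbatim. The second decodes and re-encodes with explicitly named charsets.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

final class Transfer {
    // Copies bytes verbatim; the output is binary-identical to the input, newlines included.
    static void copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // Copies text, decoding with one explicit charset and encoding with another.
    static void copyText(InputStream in, Charset from, OutputStream out, Charset to) throws IOException {
        Reader reader = new InputStreamReader(in, from);
        Writer writer = new OutputStreamWriter(out, to);
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            writer.write(buf, 0, n);
        }
        writer.flush(); // push any buffered characters through the encoder
    }
}
```

With copyBytes the uploaded file can keep its original name and format; there is no need for a ".ser" wrapper unless you actually want to ship Java objects.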
(Also, your question really doesn't provide much information to solve it. To me, it basically reads "I wrote some code to do X, and it doesn't work. Where did I go wrong?" You seem to assume that your code is correct by not listing it, but at the same time recognise that it's not...)
When reading zipfiles (using Java ZipInputStream or any other library) from an unknown source is there any way of detecting which entries are "character data" (and if so the encoding) or "binary data". And, if binary, any way of determining any more information (MIME types, etc.)
EDIT: Does the byte order mark (BOM) occur in zip entries, and if so, do we have to handle it specially?
It basically boils down to heuristics for determining the contents of files. For instance, for text files (ASCII) it should be possible to make a fairly good guess by checking the range of byte values used in the file -- although this will never be completely fool-proof.
You should try to limit the classes of file types you want to identify, e.g. is it enough to discern between "text data" and "binary data" ? If so you should be able to get a fairly high success rate for detection.
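As a sketch of such a "text vs. binary" heuristic (ContentGuess and looksLikeText are made-up names, and the 5% threshold is an arbitrary assumption you would want to tune), you could sample the first bytes of each entry and count control characters:

```java
import java.nio.charset.StandardCharsets;

final class ContentGuess {
    // Heuristic: treat the sample as text if it has no NUL bytes and few other control bytes.
    static boolean looksLikeText(byte[] sample) {
        int suspicious = 0;
        for (byte b : sample) {
            int v = b & 0xFF;
            if (v == 0) {
                return false; // NUL bytes are very rare in text in ASCII-compatible encodings
            }
            boolean whitespace = v == '\t' || v == '\n' || v == '\r';
            if (v < 0x20 && !whitespace) {
                suspicious++;
            }
        }
        // Allow at most 5% control bytes; the threshold is arbitrary.
        return suspicious * 20 <= sample.length;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeText("hello\nworld".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(looksLikeText(new byte[]{0x50, 0x4B, 0x03, 0x04, 0, 0}));
    }
}
```

Note that this would misclassify UTF-16 text, which contains many NUL bytes; checking for a BOM first is one way to handle that case.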
For UNIX systems, there is always the file command which tries to identify file types based on (mostly) content.
Maybe implement a Java component that is capable of applying the rules defined in /usr/share/file/magic. I would love to have something like that. (You would basically have to be able to look at the first few bytes of each entry.)
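A very reduced sketch of that idea follows. It does not parse the magic file at all; it hard-codes four well-known signatures, and MagicSniffer and guessType are hypothetical names:

```java
final class MagicSniffer {
    // Guesses a MIME type from the first bytes of the data; falls back to a generic binary type.
    static String guessType(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "image/png";
        }
        if (startsWith(head, '%', 'P', 'D', 'F', '-')) {
            return "application/pdf";
        }
        if (startsWith(head, 'P', 'K', 0x03, 0x04)) {
            return "application/zip";
        }
        if (startsWith(head, 'G', 'I', 'F', '8')) {
            return "image/gif";
        }
        return "application/octet-stream";
    }

    // Compares the leading bytes (as unsigned values) against the expected signature.
    private static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With ZipInputStream you would read the first few bytes of each entry after getNextEntry() and pass them to guessType.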