Identify the file extension using Java

I have files in different formats stored in a database, and I want to copy them to my local machine.
How can I identify the file format (doc, xls, etc.)?
Regards,
krishna
Thanks for the suggestions. Based on them, I wrote the code and finished the task.
Please take a look at my blog, where I posted the code:
http://muralie39.wordpress.com/java-program-to-copy-files-from-oracle-to-localhost/
Thank you guys..
Thanks,
krishna

If your files are named according to convention, you can just parse the filename:
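String filename = "yourFileName";
// Find the last dot; everything after it is treated as the extension.
int dotPosition = filename.lastIndexOf('.');
String extension = "";
if (dotPosition != -1) {
    // Skip the dot itself so the result is "doc" rather than ".doc".
    extension = filename.substring(dotPosition + 1);
}
System.out.println("The file is of type: " + extension);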
That's the simplest approach, assuming your files follow some kind of standard naming convention. The extensions can even be proprietary to your system; as long as they follow a convention, this will work.
If you need to actually scan the file contents to get the format information, you will need to do some more investigation into document formats.

How are the files stored? Do you have filenames with extensions, or just the binary data?
Mime Util has tools to detect the format both from extensions and from magic headers, but of course that's never 100% reliable.
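To illustrate what "magic header" detection means when only the binary data is available, here is a minimal sketch using plain Java. The class name MagicSniffer and the family labels it returns are made up for this example, and it only knows three well-known signatures. Note that it identifies the container, not the exact type: .doc and .xls share the OLE2 signature, and .docx and .xlsx are both ZIP archives.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration: maps leading bytes to a coarse format family.
class MagicSniffer {

    // Returns "ole2", "zip", "pdf" or "unknown" for the given leading bytes.
    static String detect(byte[] head) {
        if (startsWith(head, 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1)) {
            return "ole2"; // legacy MS Office container (.doc, .xls, .ppt)
        }
        if (startsWith(head, 0x50, 0x4B, 0x03, 0x04)) {
            return "zip"; // ZIP local file header (.zip, .docx, .xlsx, .jar)
        }
        if (startsWith(head, 0x25, 0x50, 0x44, 0x46)) {
            return "pdf"; // the ASCII text "%PDF"
        }
        return "unknown";
    }

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(sample));
    }
}
```

With a BLOB column you would read just the first few bytes of the stream and pass them to detect; telling .doc from .xls needs a look inside the container, which is where a library helps.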

You can use the Apache Tika library.
As Dmitri pointed out, however, you may sometimes get incorrect results when detecting the MIME type from file headers or the file extension.

Related

unable to recognize file type

This is my first post, and I'm new to Java. I'm working on a file parser. I've tried to identify whether the input is CSV or another file format, but it doesn't look like a standard format. I'm working on an Apache Camel solution (my first and last idea :( ), but maybe some of you recognize this kind of file format? Additionally, my output is an .imp file.
Here is my example input:
NrDok:FS-2222/17/W
Data:12.02.2017
SposobPlatn:GOT
NazwaWystawcy:MAAKAI Gawron
AdresWystawcy:33-123 bABA
KodWystawcy:33-112
MiastoWystawcy:bABA
UlicaWystawcy:czysfa 8
NIPWystawcy:123-19-85-123
NazwaOdbiorcy:abc abc-HANDLOWO-USŁUGOWE
AdresOdbiorcy:33-123 fghd
KodOdbiorcy:33-123
MiastoOdbiorcy:Tdsfs
UlicaOdbiorcy:dfdfdA 39
NIPOdbiorcy:82334349
TelefonOdbiorcy:654-522-124
NrOdbiorcyWSieciSklepow:efdsS-sffgsA
IloscLinii:1
Linia:Nazwa{ĆWIARTKA KG}Kod{C1}Vat{5}Jm{kg.}Asortyment{dfgv}Sww{}PKWIU{10.12.10}Ilosc{3.40}Cena{n3.21}Wartosc{n11.83}IleWOpak{1}CenaSp{b0.00}
DoZaplaty:252.32
And here is my example output file:
FH 2015.07.31 2015.07.31 F04443 Gotowka
FO 812-123-45-11 P.a.b.Uc"fdad" abcd deffF UL.fdfgdfdA 12/33 33-123 afvdf
FS 779-19-06-082 badfdf S.A. ul. Wisniowa 89 60-003 Poznan
FP 00218746 CHRZAN TARTY EXTRA POLONAISE 180G SZT 32.00 2.21 8 10.39.17.0 32.00 5900138000055
Is there an easy way to convert the first file into the second file's format? Maybe you know what type of file this is? In the meantime, I'm continuing my work with Apache Camel.
Thanks in advance for your time and help!
I suggest you try https://tika.apache.org/1.1/detection.html#Mime_Magic_Detection
It's a very good library for file type recognition.
There is a simple example here: https://www.tutorialspoint.com/tika/tika_document_type_detection.htm
Your file can be read as a standard Java .properties file. This format allows both = and : as separators between key and value. However, the file contains characters outside ISO-8859-1, such as the Polish Ć, which may prevent Java from parsing it correctly if it is loaded through an InputStream (Properties decodes those as ISO-8859-1).
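As a rough sketch of that idea, Properties.load also accepts a Reader, so the charset can be chosen by the caller. The class name InvoiceProperties is made up for this example, and the sample text is a few lines copied from the question.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration: reads "key:value" lines as Java properties.
class InvoiceProperties {

    // Loads the properties from a Reader, so decoding is up to the caller.
    static Properties parse(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    public static void main(String[] args) throws IOException {
        String sample = "NrDok:FS-2222/17/W\n"
                + "Data:12.02.2017\n"
                + "NazwaWystawcy:MAAKAI Gawron\n";
        Properties props = parse(new StringReader(sample));
        System.out.println(props.getProperty("NrDok"));
        System.out.println(props.getProperty("NazwaWystawcy"));
    }
}
```

For a real file you could pass Files.newBufferedReader(path, StandardCharsets.UTF_8), assuming the file is UTF-8 (the question does not say). Two caveats: a repeated key such as Linia keeps only its last value, and backslashes in values are treated as escape characters.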
This line
Nazwa{ĆWIARTKA KG}Kod{C1}Vat{5}Jm{kg.}Asortyment{dfgv}Sww{}PKWIU{10.12.10}Ilosc{3.40}Cena{n3.21}Wartosc{n11.83}IleWOpak{1}CenaSp{b0.00}
seems to be some custom serialization format of an object, in the form
key1{value1}key2{value2}...
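If the format really is a flat sequence of key{value} pairs (an assumption; nested braces would break this), a regular expression is enough to split it. The class name LineItemParser is made up for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration: splits "key1{value1}key2{value2}..." into a map.
class LineItemParser {

    // A key is a letter followed by word characters; a value is anything without braces.
    private static final Pattern FIELD = Pattern.compile("([A-Za-z]\\w*)\\{([^{}]*)\\}");

    // Returns the fields in the order they appear; empty values such as Sww{} are kept.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher matcher = FIELD.matcher(line);
        while (matcher.find()) {
            fields.put(matcher.group(1), matcher.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> item = parse("Kod{C1}Vat{5}Jm{kg.}Sww{}Ilosc{3.40}Cena{n3.21}");
        System.out.println(item.get("Kod"));
        System.out.println(item.get("Ilosc"));
    }
}
```

The values stay as strings here; what the prefixes in n3.21 or b0.00 mean is not stated in the question, so the sketch does not try to interpret them.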
Your output file contains a lot of data that does not appear in the input, which makes me think that some data is queried from external systems to build the output. You should investigate that yourself. Nobody can guess the transformation from the input provided.

How to save file with custom file extension in java?

Dear brothers, I hope you are all well.
I'm designing a document program. Rather than saving files with a .text extension or using an MS Office API in Java, I want to create my own custom file format, with an extension such as ".sad", so that such files can only be read by my programs. How is this possible?
Your requirement seems ambiguous. Are you looking to make a program that creates MS Office Word documents or plain text files with a custom file extension?
In the case of the former, you can't have a custom extension as MS Word documents, by definition, have a .doc / .docx extension.
However, if you are looking to create a program that produces text files then you can easily have a custom extension. Just look at this tutorial: How to create a file in Java
I already stated why this is a bad idea. Yet I have a solution for you (more like a how-not-to-do-it)
Take your plain text you want to save, convert it to bytes and apply this "highly enthusiastic encryption nobody will ever be able to break" on it:
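import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;

String plainText = "yadayada";
byte[] bytesFromText = plainText.getBytes(StandardCharsets.UTF_8);
ByteArrayOutputStream encrypted = new ByteArrayOutputStream(bytesFromText.length * 2);
Random random = new Random();
for (int i = 0; i < bytesFromText.length; i++) {
    if (i % 2 == 0) {
        // Put one random junk byte in front of every even-indexed byte.
        encrypted.write(random.nextInt(256));
    }
    encrypted.write(bytesFromText[i]);
}
I'll leave it up to you to figure out why this is a bad idea and how to decrypt it. ;)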
You can create a file with any extension.
For example:
File f = new File("confidential.sad");
Hope this will work for you :)
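One thing worth adding: new File("confidential.sad") only creates a path object in memory, not a file on disk. A minimal sketch of actually writing and reading such a file might look like this; the class name SadFileDemo and the temporary directory are just assumptions for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo for illustration: the extension is just part of the file name.
class SadFileDemo {

    // Writes the text to the target path as UTF-8, creating or replacing the file.
    static void save(Path target, String text) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    // Reads the whole file back as UTF-8 text.
    static String load(Path source) throws IOException {
        return new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sad-demo");
        Path file = dir.resolve("confidential.sad");
        save(file, "my document text");
        System.out.println(load(file));
    }
}
```

The extension alone does not stop other programs from reading the file; any text editor can still open it.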
Working with custom files in Java
Here is a tutorial that will help you understand how to create your own files with a custom extension such as .doc or .sad, embed some information in them, and read that information back from the file after saving.
ZIP
Similar applications often use archives to store data. Consider MS Word and its documents with the .docx file extension. If you change the extension of any .docx file to .zip, you will find that the document is actually a zip archive with a different extension.
https://www.ict.social/java/files/working-with-custom-files-in-java-zip-archive
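The archive idea can be sketched with java.util.zip from the standard library. This is not the tutorial's code; the class name ZipDocument and the entry name content.txt are made up for the example, and it works in memory to keep it short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration: a "document" stored as a one-entry zip archive.
class ZipDocument {

    // Packs the text into a zip archive with a single named entry.
    static byte[] pack(String entryName, String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(text.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    // Returns the text of the named entry, or null if the archive has no such entry.
    static String unpack(byte[] archive, String entryName) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] chunk = new byte[4096];
                    int n;
                    while ((n = zip.read(chunk)) != -1) {
                        out.write(chunk, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = pack("content.txt", "Hello from a .sad file");
        System.out.println(unpack(archive, "content.txt"));
    }
}
```

Writing the returned bytes to a file named, say, document.sad gives a custom-extension file that your program reads back with unpack, much like .docx does with its own entries.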
I have published a library that saves files and handles everything in one line of code. You can find it here, along with its documentation:
GitHub repository
And the answer to your question is easy:
String path = FileSaver
.get()
.save(file,"file.custom");

Extract additional information from CSV/XML file in Java Code

I have a question for you.
I have an XML file (or a CSV file):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<City>
<Code>LO</Code>
<Name>London</Name>
</City>
and I want to extract additional information (for example, Author, Description, Creator, Comments, Format, ContentType, etc.) from it in Java code.
I read this similar question, but it is about an Excel file: How to set Author name to excel file using poi
Given a filename as input (for example, test.csv or test.xml), I would like to print the additional information (for example, System.out.println(getAuthor)).
Can anyone help me?
That information is not inside the file itself, which only contains its content (like your XML string). It depends on the operating system (which one are you using?). It is also a little unclear what you are looking for, so here is what you mentioned:
Author
Path path = Paths.get("C:/Users/Thomas/workspace_eclipse_java/Test/javassist-3.12.1.GA.jar");
FileOwnerAttributeView owner= Files.getFileAttributeView(path, FileOwnerAttributeView.class);
System.out.println("owner: " + owner.getOwner().getName());
Description
I have no idea what this should be. Never saw this on Windows or Linux.
Creator
Do you mean the Author again?
Comments
I have no idea what this should be. Never saw this on Windows or Linux.
Format
Check the file extension
ContentType
Check the file extension or take a look inside.
Generally
In general, you can check which attribute views are available like this:
FileSystem fileSystem = FileSystems.getDefault();
Set<String> fileSystemViews = fileSystem.supportedFileAttributeViews();
for (String fileSystemView : fileSystemViews) {
    System.out.println(fileSystemView);
}
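Putting the pieces above together, a small sketch that prints what the file system does know about a file could look like this. The class name FileMetadataDemo and the output format are made up for the example; Files.probeContentType is platform dependent and may return null, so treat its result as a hint only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical demo for illustration: collects file system metadata for one file.
class FileMetadataDemo {

    // Builds a one-line summary of owner, size, timestamps and guessed content type.
    static String describe(Path path) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(path, BasicFileAttributes.class);
        String owner = Files.getOwner(path).getName();
        String contentType = Files.probeContentType(path); // may be null
        return "owner=" + owner
                + ", size=" + attrs.size()
                + ", created=" + attrs.creationTime()
                + ", modified=" + attrs.lastModifiedTime()
                + ", contentType=" + contentType;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("test", ".xml");
        Files.write(file, "<City><Code>LO</Code></City>".getBytes(StandardCharsets.UTF_8));
        System.out.println(describe(file));
        Files.delete(file);
    }
}
```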

Comparing two files using commons or Javascript

I need to check whether any of the 4 uploaded files are the same. This check can be on the JSP or the Java servlet side.
I've been using
var FileName1 = document.getElementById('fileChooser1').value;
var FileName2 = document.getElementById('fileChooser2').value;
if(FileName1 == FileName2)
{
alert("same files cannot be uploaded");
}
But the problem is that this only compares file names, so it fails when files with the same content but different names are uploaded.
So, searching Apache Commons, I found that there is a Default Comparator, but I have no idea how to use it, or whether there is a better/simpler way to check for identical files.
How can I use the Default Comparator, and on what basis does it compare?
Is there a better/simpler solution to this problem in Java or JavaScript?
You can use the FileUtils.contentEquals method to compare the content of two files.
Example
System.out.println(FileUtils.contentEquals(file1, file2));
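With four uploads, comparing every pair gets repetitive. One alternative, sketched here with the standard library only, is to hash each upload and look for a repeated hash. The class name DuplicateUploadCheck is made up, and the sketch assumes the uploaded contents are already available as byte arrays on the servlet side.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration: detects identical uploads by content hash.
class DuplicateUploadCheck {

    // Returns the SHA-256 digest of the content, Base64-encoded so it can go in a Set.
    static String sha256(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return Base64.getEncoder().encodeToString(digest.digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // True if at least two uploads have the same content hash.
    static boolean hasDuplicates(List<byte[]> uploads) {
        Set<String> seen = new HashSet<>();
        for (byte[] upload : uploads) {
            if (!seen.add(sha256(upload))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<byte[]> uploads = Arrays.asList(
                "report".getBytes(), "invoice".getBytes(), "report".getBytes());
        System.out.println(hasDuplicates(uploads));
    }
}
```

Matching hashes are very strong evidence of identical content; if you need certainty, confirm a match with a byte-by-byte comparison such as FileUtils.contentEquals.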

Capture generated output file path and name using CSSDK

We are in the process of converting over to the XSLT compiler for page generation. I have a Xalan Java extension that uses the CSSDK to capture some metadata we have stored in the Extended Attributes for output to the page. There are no problems getting the EAs rendered to the output file.
The problem is that I don't know how to dynamically capture the file path and name of the output file.
So, just as a POC, I have the CSVPath hard-coded to the output file in my Java extension. Here's a code sample:
CSSimpleFile sourceFile = (CSSimpleFile)client.getFile(new CSVPath("/some-path-to-the-output.jsp"));
Can someone point me to where in the CSSDK I could capture the output file?
I found the answer.
First, get or create your CSClient. You can use the examples provided in the cssdk/samples. I tweaked one so that I captured the CSClient in the method getClientForCurrentUser(). Watch out for SOAP vs Java connections. In development, I was using a SOAP connection and for the make_toolkit build, the Java connection was required for our purposes.
Check the following snippet. The request CSClient is captured in the static variable client.
CSSimpleFile sourceFile = (CSSimpleFile)client.getFile(new CSVPath(XSLTExtensionContext.getContext().getOutputDirectory().toString() + "/" + XSLTExtensionContext.getContext().getOutputFileName()));
