Auto-Detect File Extension with APACHE JENA

I want to convert a file with any extension to .ttl (Turtle) and I need to use Apache Jena. I know how this can be done with RDF4J, but the output isn't as accurate as it is with Jena. How can I auto-detect the extension, or rather the file type, when I read a file from a directory and don't know its extension in advance? My code works when I hardcode the file name; I just need help auto-detecting the file type. My code is as follows:
public class Converter {
public static void main(String[] args) throws FileNotFoundException {
String fileName = "./abc.rdf";
Model model = ModelFactory.createDefaultModel();
//I know this is how it is done with RDF4J but I need to use Apache Jena.
/* RDFParser rdfParser = Rio.createParser(Rio.getWriterFormatForFileName(fileName).orElse(RDFFormat.RDFXML));
RDFWriter rdfWriter = Rio.createWriter(RDFFormat.TURTLE,
new FileOutputStream("./"+stripExtension(fileName)+".ttl"));*/
InputStream is = FileManager.get().open(fileName);
if (is != null) {
model.read(is, null, "RDF/XML");
model.write(new FileOutputStream("./converted.ttl"), "TURTLE");
} else {
System.err.println("cannot read " + fileName);
}
}
}
All help and advice will be highly appreciated.

There is functionality that handles reading from a file using the extension to determine the syntax:
RDFDataMgr.read(model, fileName);
It also handles compressed files e.g. "file.ttl.gz".
There is a registry of languages:
RDFLanguages.fileExtToLang(...)
RDFLanguages.filenameToLang(...)
For more control, see RDFParser:
RDFParser.create()
    .source(fileName)
    // ... many options, including forcing the language ...
    .parse(model);
https://jena.apache.org/documentation/io/rdf-input.html
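To make the idea of extension-based detection concrete, here is a minimal plain-Java sketch of what such a lookup involves. It is not Jena's implementation: the class name `SyntaxGuess`, the method `guessSyntax`, and the small extension table are all assumptions made up for illustration, and the syntax names are just labels. In real code you would call the Jena methods named above instead.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Illustrative only: a hypothetical stand-in for an extension registry.
final class SyntaxGuess {
    // Assumed subset of common RDF extensions; a real registry knows many more.
    private static final Map<String, String> EXT_TO_SYNTAX = Map.of(
            "ttl", "TURTLE",
            "rdf", "RDF/XML",
            "nt", "N-TRIPLES",
            "jsonld", "JSON-LD");

    // Returns the guessed syntax label, or empty if the extension is unknown.
    static Optional<String> guessSyntax(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        // Look through a compression suffix, as in "file.ttl.gz".
        if (name.endsWith(".gz")) {
            name = name.substring(0, name.length() - 3);
        }
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return Optional.empty();
        }
        return Optional.ofNullable(EXT_TO_SYNTAX.get(name.substring(dot + 1)));
    }
}
```

A caller can fall back to a default when nothing matches, for example `SyntaxGuess.guessSyntax(fileName).orElse("RDF/XML")`, which mirrors the `orElse(RDFFormat.RDFXML)` fallback in the RDF4J snippet from the question.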

Related

Why do I get an Excel warning about file format and extension mismatch when I try to download an excel file? [duplicate]

I have this application I'm developing in JSP and I wish to export some data from the database in XLS (MS Excel format).
Is it possible under tomcat to just write a file as if it was a normal Java application, and then generate a link to this file? Or do I need to use a specific API for it?
Will I have permission problems when doing this?

While you can use a full fledged library like JExcelAPI, Excel will also read CSV and plain HTML tables provided you set the response MIME Type to something like "application/vnd.ms-excel".
Depending on how complex the spreadsheet needs to be, CSV or HTML can do the job for you without a 3rd party library.

Don't use plain HTML tables with an application/vnd.ms-excel content type. You're then basically fooling Excel with a wrong content type, which causes failures and/or warnings in the latest Excel versions. It will also mess up the original HTML source when you edit and save it in Excel. Just don't do that.
CSV, in turn, is a standard format which enjoys default support from Excel without any problems and is in fact easy and memory-efficient to generate. Although libraries exist, you can easily write one yourself in fewer than 20 lines (fun for those who can't resist). You just have to adhere to the RFC 4180 spec, which basically contains only 3 rules:
Fields are separated by a comma.
If a comma occurs within a field, then the field has to be surrounded by double quotes.
If a double quote occurs within a field, then the field has to be surrounded by double quotes and the double quote within the field has to be escaped by another double quote.
Here's a kickoff example:
public static <T> void writeCsv (List<List<T>> csv, char separator, OutputStream output) throws IOException {
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(output, "UTF-8"));
for (List<T> row : csv) {
for (Iterator<T> iter = row.iterator(); iter.hasNext();) {
String field = String.valueOf(iter.next()).replace("\"", "\"\"");
if (field.indexOf(separator) > -1 || field.indexOf('"') > -1) {
field = '"' + field + '"';
}
writer.append(field);
if (iter.hasNext()) {
writer.append(separator);
}
}
writer.newLine();
}
writer.flush();
}
Here's an example how you could use it:
public static void main(String[] args) throws IOException {
List<List<String>> csv = new ArrayList<List<String>>();
csv.add(Arrays.asList("field1", "field2", "field3"));
csv.add(Arrays.asList("field1,", "field2", "fie\"ld3"));
csv.add(Arrays.asList("\"field1\"", ",field2,", ",\",\",\""));
writeCsv(csv, ',', System.out);
}
And inside a Servlet (yes, Servlet, don't use JSP for this!) you can basically do:
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
String filename = request.getPathInfo().substring(1);
List<List<Object>> csv = someDAO().findCsvContentFor(filename);
response.setHeader("content-type", "text/csv");
response.setHeader("content-disposition", "attachment;filename=\"" + filename + "\"");
writeCsv(csv, ';', response.getOutputStream());
}
Map this servlet on something like /csv/* and invoke it as something like http://example.com/context/csv/filename.csv. That's all.
Note that I added the possibility to specify the separator character separately, because it may depend on the locale used whether Excel accepts a comma , or a semicolon ; as the CSV field separator. Note that I also added the filename to the URL pathinfo, because a certain web browser developed by a team in Redmond otherwise wouldn't save the download with the proper filename.
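One gap worth knowing about: the kickoff writer above quotes a field only when it contains the separator or a double quote, while RFC 4180 also asks for quoting when a field contains a line break. The following is a minimal sketch of a per-field helper that covers that case too. The class name `CsvField` and the method `escape` are made up for this illustration and are not part of the answer's code.

```java
// Illustrative helper (hypothetical names): escapes one CSV field.
final class CsvField {
    // Wraps the field in double quotes when it contains the separator,
    // a double quote, or a CR/LF; embedded double quotes are doubled.
    static String escape(String field, char separator) {
        boolean needsQuotes = field.indexOf(separator) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        return '"' + field.replace("\"", "\"\"") + '"';
    }
}
```

Inside `writeCsv`, the quoting logic in the inner loop could then be replaced by something like `writer.append(CsvField.escape(String.valueOf(iter.next()), separator))`.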

You will probably need a library to manipulate Excel files, like JExcelAPI ("jxl") or POI. I'm more familiar with jxl and it can certainly write files. You can generate them, store them, and serve a URL to them, but I wouldn't. Generated files are a pain. They add complications in the form of concurrency, clean-up processes, etc.
If you can, generate the file on the fly and stream it to the client through the standard servlet mechanisms.
If it's generated many, many times or the generation is expensive, then you can cache the result somehow, but I'd be more inclined to keep it in memory than as a file. I'd certainly avoid, if you can, linking directly to the generated file by URL. If you go via a servlet, it'll allow you to change your implementation later. It's the same encapsulation concept as in OO design.

POI and JExcel are both good APIs. I personally prefer POI, and POI is constantly updated. Furthermore, there are more resources online about POI than about JExcel in case you have any questions. However, either of the two does a great job.

Maybe you should consider using a reporting tool with an option to export files in XLS format. My suggestion is JasperReports.

try {
    String absoluteDiskPath = "test.xls";
    File f = new File(absoluteDiskPath);
    String name = f.getName(); // file name without any directory part
    response.setContentType("application/vnd.ms-excel"); // MIME type matching the .xls extension
    response.setHeader("Content-Disposition", "attachment; filename=\"" + name + "\"");
    out.clear(); // clear the JSP writer buffer to prevent an IllegalStateException when writing binary data
    // try-with-resources closes both streams, even if copying fails
    try (InputStream in = new FileInputStream(f);
         ServletOutputStream outs = response.getOutputStream()) {
        int b;
        // read() returns -1 at end of file, so test the value before writing it
        while ((b = in.read()) >= 0) {
            outs.write(b);
        }
        outs.flush();
    }
} catch (Exception ex) {
    ex.printStackTrace();
}

I tried it as below in JSP, and it works fine.
<% String filename = "xyz.xls";
response.setContentType("application/octet-stream");
response.setHeader("Content-Disposition","attachment; filename=\"" + filename + "\"");
java.io.File excelFile=new java.io.File("C:\\Users\\hello\\Desktop\\xyz.xls");
java.io.FileInputStream fileInputStream=new java.io.FileInputStream(excelFile);
byte[] bytes = new byte[(int) excelFile.length()];
int offset = 0;
while (offset < bytes.length)
{
int result = fileInputStream.read(bytes, offset, bytes.length - offset);
if (result == -1) {
break;
}
offset += result;
}
javax.servlet.ServletOutputStream outs = response.getOutputStream();
outs.write(bytes);
outs.flush();
outs.close();
fileInputStream.close();
%>
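Both download snippets above either copy one byte at a time or load the whole file into memory first. A buffered copy loop avoids both. The sketch below uses only the JDK; the class name `StreamCopy`, the method `copy`, and the 8 KB buffer size are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative helper (hypothetical names): copies a stream in fixed-size chunks.
final class StreamCopy {
    // Copies everything from in to out and returns the number of bytes copied.
    // The caller remains responsible for closing both streams.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // read(byte[]) returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

In a servlet this would be called as `StreamCopy.copy(fileInputStream, response.getOutputStream())`, so memory use stays constant regardless of the size of the Excel file.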

SequenceFile Compactor of several small files in only one file.seq

New to HDFS and Hadoop:
I am developing a program that should get all the files in a specific directory, which contains several small files of any type.
It should take every file and append it to a compressed SequenceFile, where the key must be the path of the file and the value must be the content of the file. For now my code is:
import java.net.*;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.io.compress.BZip2Codec;
public class Compact {
public static void main (String [] args) throws Exception{
try{
Configuration conf = new Configuration();
FileSystem fs =
FileSystem.get(new URI("hdfs://quickstart.cloudera:8020"),conf);
Path destino = new Path("/user/cloudera/data/testPractice.seq");//test args[1]
if ((fs.exists(destino))){
System.out.println("exist : " + destino);
return;
}
BZip2Codec codec=new BZip2Codec();
SequenceFile.Writer outSeq = SequenceFile.createWriter(conf
,SequenceFile.Writer.file(fs.makeQualified(destino))
,SequenceFile.Writer.compression(SequenceFile.CompressionType.BLOCK,codec)
,SequenceFile.Writer.keyClass(Text.class)
,SequenceFile.Writer.valueClass(FSDataInputStream.class));
FileStatus[] status = fs.globStatus(new Path("/user/cloudera/data/*.txt"));//args[0]
for (int i=0;i<status.length;i++){
FSDataInputStream in = fs.open(status[i].getPath());
outSeq.append(new org.apache.hadoop.io.Text(status[i].getPath().toString()), new FSDataInputStream(in));
fs.close();
}
outSeq.close();
System.out.println("End Program");
}catch(Exception e){
System.out.println(e.toString());
System.out.println("File not found");
}
}
}
But after every execution I receive this exception:
java.io.IOException: Could not find a serializer for the Value class: 'org.apache.hadoop.fs.FSDataInputStream'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
File not found
I understand the error must be in the type of the file I am creating and the type of object I define for adding to the SequenceFile, but I don't know which type I should use. Can anyone help me?

FSDataInputStream, like any other InputStream, is not intended to be serialized. What would serializing an "iterator" over a stream of bytes even do?
What you most likely want to do is store the content of the file as the value. For example, you can switch the value type from FSDataInputStream to BytesWritable and just read all the bytes out of the FSDataInputStream. One drawback of using a key/value SequenceFile for this purpose is that the content of each file has to fit in memory. That could be fine for small files, but you have to be aware of this issue.
I am not sure what you are really trying to achieve, but perhaps you could avoid reinventing the wheel by using something like Hadoop Archives?

Thanks a lot for your comments. The problem was the serializer, as you said, and in the end I used BytesWritable:
FileStatus[] status = fs.globStatus(new Path("/user/cloudera/data/*.txt"));//args[0]
for (int i=0;i<status.length;i++){
    byte[] content = new byte [(int)fs.getFileStatus(status[i].getPath()).getLen()];
    // read the whole file into the buffer; the writer's valueClass must be BytesWritable.class
    try (FSDataInputStream in = fs.open(status[i].getPath())) {
        in.readFully(content);
    }
    outSeq.append(new org.apache.hadoop.io.Text(status[i].getPath().toString()), new org.apache.hadoop.io.BytesWritable(content));
}
outSeq.close();
There are probably better solutions in the Hadoop ecosystem, but this problem was an exercise for a degree I am working on, and for now we are reinventing the wheel to understand the concepts ;-).

How to convert HttpPostedFileBase file to Java.Io.InputStream?

I'm working on ASP.net with the MPXJ library. The .net version of MPXJ has been created using IKVM.
Currently, I have a big problem: after uploading a file (a Microsoft Project file, .mpp) to the server (I don't need to save it), I want to convert it from HttpPostedFileBase to the IKVM version of java.io.InputStream so that MPXJ can work with it, but I don't know how to implement this.
My code:
public ActionResult Upload(HttpPostedFileBase files)
{
// Todo: Convert from HttpPostedFileBase to Java.Io.InputStream
ProjectReader reader = new MPPReader();
ProjectFile projectObj = reader.read(Java.Io.InputStream);
}

You need a wrapper to provide a conversion between the IKVM Java type java.io.InputStream and a .net Stream instance. As luck would have it, IKVM ships with one...
Using the wrapper, your example will now look like this:
public ActionResult Upload(HttpPostedFileBase files)
{
ProjectReader reader = new MPPReader();
ProjectFile projectObj = reader.read(new ikvm.io.InputStreamWrapper(files.InputStream));
}

If you don't want to use the IKVM stream wrapper, you can read the upload into a byte array instead:
public ActionResult Upload(HttpPostedFileBase files)
{
    byte[] fileData = null;
    using (var binaryReader = new BinaryReader(files.InputStream))
    {
        fileData = binaryReader.ReadBytes(files.ContentLength);
    }
    ProjectReader reader = new MPPReader();
    ProjectFile projectObj = reader.read(new ByteArrayInputStream(fileData));
}

Aspose with RJB (Ruby Java Bridge) is not working

I have some Java code that opens an Excel template with the Aspose library (it runs perfectly):
import com.aspose.cells.*;
import java.io.*;
public class test
{
public static void main(String[] args) throws Exception
{
System.setProperty("java.awt.headless", "true");
FileInputStream fstream = new FileInputStream("/home/vmlellis/Testes/aspose-cells/template.xlsx");
Workbook workbook = new Workbook(fstream);
workbook.save("final.xlsx");
}
}
Then I run the same thing in Ruby with RJB (Ruby Java Bridge):
require 'rjb'
#RJM Loading
JARS = Dir.glob('./jars/*.jar').join(':')
print JARS
Rjb::load(JARS, ['-Xmx512M'])
system = Rjb::import('java.lang.System')
file_input = Rjb::import('java.io.File')
file_input_stream = Rjb::import('java.io.FileInputStream')
workbook = Rjb::import('com.aspose.cells.Workbook')
system.setProperty("java.awt.headless", "true")
file_path = "/home/vmlellis/Testes/aspose-cells/template.xlsx"
file = file_input.new(file_path)
fin = file_input_stream.new(file)
wb = workbook.new(fin)
I get this error:
test.rb:57:in `new': Can't find file: java.io.FileInputStream@693a317a. (FileNotFoundException)
from aspose-test.rb:57:in `<main>'
Why? I run the same code, but in Ruby it is not working! How do I fix this?
Update:
The documentation lists the initializer Workbook(java.io.InputStream stream), but it's not working in RJB. (How is this possible?)

Your program should have worked. I could not find the reason why it didn't, and I am looking into it.
Now the alternate approaches.
Approach 1
Use the Workbook(String) constructor instead of Workbook(FileInputStream). This worked flawlessly at my end. The sample code is:
require 'rjb'
#RJM Loading
JARS = Dir.glob('/home/saqib/cellslib/*.jar').join(':')
print JARS
Rjb::load(JARS, ['-Xmx512M'])
system = Rjb::import('java.lang.System')
workbook = Rjb::import('com.aspose.cells.Workbook')
system.setProperty("java.awt.headless", "true")
file_path = "/home/saqib/rjb/template.xlsx"
save_path = "/home/saqib/rjb/final.xlsx"
wb = workbook.new(file_path)
wb.save(save_path)
Approach 2
Write a new Java class library. Write all your Aspose.Cells related code in it. Expose very simple and basic methods that need to be called from Ruby (RJB).
Why?
It is easy to write the program in native Java. If you use RJB, you need to perform a lot of code conversions.
It is easy to debug and test in Java.
Usage of RJB will only be limited to calling methods from your own Java library. The RJB code will be small and basic.
Similar Example using own library
Create a new Java project, let's say "cellstest". Add a new public class to it.
package cellstest;
import com.aspose.cells.Workbook;
public class AsposeCellsUtil
{
public String doSomeOpOnWorkbook(String inFile, String outFile)
{
String result = "";
try
{
// Load the workbook
Workbook wb = new Workbook(inFile);
// Do some operation with this workbook
// ..................
// Save the workbook
wb.save(outFile);
// everything ok.
result = "ok";
}
catch(Exception ex)
{
// Return the exception to calling program
result = ex.toString();
}
return result;
}
}
Like this, add as many methods as you like, for each operation.
Build the project and copy "cellstest.jar" into the same folder where you copied the Aspose.Cells jar files. You can return a String from your methods and check the return value in the Ruby program for a success or error code. The Ruby program will now look like this:
require 'rjb'
#RJM Loading
JARS = Dir.glob('/home/saqib/cellslib/*.jar').join(':')
print JARS
Rjb::load(JARS, ['-Xmx512M'])
system = Rjb::import('java.lang.System')
AsposeCellsUtil = Rjb::import('cellstest.AsposeCellsUtil')
system.setProperty("java.awt.headless", "true")
file_path = "/home/saqib/rjb/template.xlsx"
save_path = "/home/saqib/rjb/final.xlsx"
# initialize instance
asposeCellsUtil = AsposeCellsUtil.new()
# call methods
result = asposeCellsUtil.doSomeOpOnWorkbook(file_path, save_path)
puts result
PS. I work for Aspose as Developer Evangelist.

In your Java code, you pass a file name string into FileInputStream() constructor:
FileInputStream fstream = new FileInputStream("/home/vmlellis/Testes/aspose-cells/template.xlsx");
In your Ruby code, you pass a file object:
file = file_input.new(file_path)
fin = file_input_stream.new(file)
Have you tried to do the same thing as in Java?
fin = file_input_stream.new(file_path)

Writing in the beginning of a text file Java

I need to write something at the beginning of a text file. I have a text file with content and I want to write something before this content. Say I have:
Good afternoon sir,how are you today?
I'm fine,how are you?
Thanks for asking,I'm great
After modifying, I want it to be like this:
Page 1-Scene 59
25.05.2011
Good afternoon sir,how are you today?
I'm fine,how are you?
Thanks for asking,I'm great
I just made up the content :) How can I modify a text file in this way?

You can't really modify it that way - file systems don't generally let you insert data in arbitrary locations - but you can:
Create a new file
Write the prefix to it
Copy the data from the old file to the new file
Move the old file to a backup location
Move the new file to the old file's location
Optionally delete the old backup file
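The steps above can be sketched with java.nio.file (Java 7+). This is only an illustration under a few assumptions: the class name `Prepend`, the ".bak" backup suffix, and the UTF-8 encoding of the prefix are my own choices, and the backup is always deleted at the end.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative helper (hypothetical names): prepends text by rewriting the file.
final class Prepend {
    static void prepend(Path file, String prefix) throws IOException {
        // Create the new file next to the old one so the final move is a simple rename.
        Path dir = file.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "prepend", ".tmp");
        Path backup = file.resolveSibling(file.getFileName() + ".bak");
        try {
            try (OutputStream out = Files.newOutputStream(temp)) {
                out.write(prefix.getBytes(StandardCharsets.UTF_8)); // write the prefix
                Files.copy(file, out);                               // copy the old data
            }
            Files.move(file, backup, StandardCopyOption.REPLACE_EXISTING); // old file to backup
            Files.move(temp, file);                                        // new file into place
            Files.delete(backup);                                          // drop the backup
        } finally {
            Files.deleteIfExists(temp); // no-op when the move succeeded
        }
    }
}
```

The old data is streamed rather than loaded into memory, so the size of the file does not matter. If you want to keep the backup, drop the `Files.delete(backup)` line.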

In case it is useful for someone, here is the full source code of a method that prepends lines to a file using the Apache Commons IO library. The code does not read the whole file into memory, so it will work on files of any size.
public static void prependPrefix(File input, String prefix) throws IOException {
LineIterator li = FileUtils.lineIterator(input);
File tempFile = File.createTempFile("prependPrefix", ".tmp");
BufferedWriter w = new BufferedWriter(new FileWriter(tempFile));
try {
w.write(prefix);
while (li.hasNext()) {
w.write(li.next());
w.write("\n");
}
} finally {
IOUtils.closeQuietly(w);
LineIterator.closeQuietly(li);
}
FileUtils.deleteQuietly(input);
FileUtils.moveFile(tempFile, input);
}

I think what you want is random access. Check out the related Java tutorial. However, I don't believe you can just insert data at an arbitrary point in the file; if I recall correctly, you'd only overwrite the data. If you wanted to insert, you'd have to have your code
1. copy a block,
2. overwrite with your new stuff,
3. copy the next block,
4. overwrite with the previously copied block,
5. return to 3 until no more blocks
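Here is a rough sketch of that block-shifting idea using java.io.RandomAccessFile. The class and method names are invented for illustration, and the block size is simply the length of the prefix. Note that, unlike the temp-file approach, a failure halfway through leaves the file in a partially shifted state.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Illustrative helper (hypothetical names): prepends bytes without a second file.
final class InPlacePrepend {
    // Reads until the buffer is full or end of file; returns the number of bytes read.
    private static int readBlock(RandomAccessFile raf, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = raf.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }

    static void prepend(Path file, byte[] prefix) throws IOException {
        if (prefix.length == 0) {
            return;
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            byte[] carry = prefix.clone();          // bytes waiting to be written at pos
            byte[] next = new byte[prefix.length];  // block about to be overwritten
            int carryLen = carry.length;
            long pos = 0;
            while (carryLen > 0) {
                raf.seek(pos);
                int read = readBlock(raf, next);    // save the block we are about to overwrite
                raf.seek(pos);
                raf.write(carry, 0, carryLen);      // overwrite it with the pending bytes
                pos += carryLen;
                byte[] tmp = carry;                 // the saved block becomes the pending bytes
                carry = next;
                next = tmp;
                carryLen = read;
            }
        }
    }
}
```

The loop ends once a read hits end of file and returns nothing to carry forward; the last write is what extends the file by the length of the prefix.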

As @atk suggested, java.nio.channels.SeekableByteChannel is a good interface, but it is only available from Java 1.7 onwards.
Update: if you have no objection to using FileUtils, then use
String fileString = FileUtils.readFileToString(file);

This isn't a direct answer to the question, but often files are accessed via InputStreams. If this is your use case, then you can chain input streams via SequenceInputStream to achieve the same result. E.g.
InputStream inputStream = new SequenceInputStream(new ByteArrayInputStream("my line\n".getBytes()), new FileInputStream(new File("myfile.txt")));

I'll leave this here in case anyone needs it:
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
try (FileInputStream fileInputStream1 = new FileInputStream(fileName1);
FileInputStream fileInputStream2 = new FileInputStream(fileName2)) {
while (fileInputStream2.available() > 0) {
byteArrayOutputStream.write(fileInputStream2.read());
}
while (fileInputStream1.available() > 0) {
byteArrayOutputStream.write(fileInputStream1.read());
}
}
try (FileOutputStream fileOutputStream = new FileOutputStream(fileName1)) {
byteArrayOutputStream.writeTo(fileOutputStream);
}
