I'm a Java beginner and I need to do the following:
- I have a txt file as input with text that I want to analyse in GATE;
- I want to get GATE to start automatically and run its linguistic analysis (Corpus Pipeline) on this text.
My idea is to open and read the txt file in Java and then convert it to a GATE document, but I have the following questions:
1) how do I convert the text to a GATE doc?
2) how do I get GATE to start automatically?
Thanks for helping me out.
In GATE, you don't have to worry about reading and converting common file types such as .txt, .pdf, or .html. GATE handles that automatically when it creates the document.
Initialize GATE like this:
import java.io.File;
import gate.CorpusController;
import gate.Gate;
import gate.util.persistence.PersistenceManager;

// These fields are assumed to be declared in the enclosing class.
private static File gappFile;
private static CorpusController gateApplication;

private static void initGateApplication(String gateXgappFileLoc, String gateHome) {
    try {
        // Point GATE at its installation directory before initialising it.
        if (Gate.getGateHome() == null) {
            Gate.setGateHome(new File(gateHome));
        }
        if (!Gate.isInitialised()) {
            Gate.init();
        }
        System.out.println("Initializing gate application...");
        // Load the saved application file as a corpus pipeline.
        gappFile = new File(gateXgappFileLoc);
        gateApplication = (CorpusController) PersistenceManager.loadObjectFromFile(gappFile);
    }
    catch (Exception e) {
        e.printStackTrace(System.out);
    }
}
And run your GATE pipeline with your text file:
public void extract(String inputFileName, String docID, CorpusController gateApplication) throws GateException, IOException
{
CorpusController application = gateApplication;
Corpus corpus = Factory.newCorpus("Sample Corpus");
application.setCorpus(corpus);
File docFile = new File(inputFileName);
System.out.print("Processing document " + docFile + "...");
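// File.toURL() is deprecated, so go through toURI() instead.
// "UTF-8" is an assumed encoding; use the one your file is saved in.
Document doc = Factory.newDocument(docFile.toURI().toURL(), "UTF-8");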
// add document to the corpus
corpus.add(doc);
// run the application
application.execute();
System.out.println("Done running GATE pipeline...");
// Now read the annotations from the 'doc' object
}
public void readList () {
try {
FileOutputStream writeData = new FileOutputStream("Accounts.txt");
ObjectOutputStream writeStream = new ObjectOutputStream(writeData);
writeStream.writeObject(AccountCredentials);
writeStream.close();
}
catch (IOException e) {
e.printStackTrace();
}
}
public void writeList() {
try {
FileInputStream readData = new FileInputStream("Accounts.txt");
ObjectInputStream readStream = new ObjectInputStream(readData);
AccountCredentials = (ArrayList <Accounts>) readStream.readObject();
readStream.close();
System.out.println(AccountCredentials.size());
}
catch (Exception e) {
e.printStackTrace();
}
}
My readList method seems to work fine: I have ¬í sr java.util.ArrayListxÒ™Ça I sizexp w
in the file. My writeList does not. I have a School folder inside the NetBeans folder, and Accounts.txt is in its main directory. Do I need to specify that? My Java file is in Schools/src. It always says my list size is 0.
Can you please share the exception or stack trace you are getting? Also, I would strongly recommend not using a flat file to store account credentials; use an identity management solution or database-driven account management instead. Did you also try debugging the line "ObjectInputStream readStream = new ObjectInputStream(readData);"?
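The thread does not identify the cause, but a small round trip can help narrow it down. The sketch below is only an illustration: the class name AccountStoreSketch, the file name Accounts.ser, and the use of ArrayList<String> in place of the asker's Accounts class are all assumptions. It writes a list, reads it back, and prints the absolute path so you can see which file a relative name actually resolves to. Note also that in the question, readList is the method that writes and writeList is the one that reads.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;

class AccountStoreSketch {

    // Serializes the list to the given file, replacing any previous content.
    static void save(Path file, ArrayList<String> accounts) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(accounts);
        }
    }

    // Reads back a list previously written by save().
    @SuppressWarnings("unchecked")
    static ArrayList<String> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (ArrayList<String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // A relative name is resolved against the working directory of the JVM.
        Path file = Paths.get("Accounts.ser").toAbsolutePath();
        System.out.println("Using file: " + file);

        save(file, new ArrayList<>(Arrays.asList("alice", "bob")));
        System.out.println("Loaded entries: " + load(file).size());
    }
}
```

If the size printed after loading matches what was saved, the serialization itself is fine and the remaining suspects are the file location and the content of the list at the time it was written.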
I am using Netbeans on OS X and cannot seem to write text to a text file that I have in a package named "assets".
Below is the way I tried to accomplish writing to the text file and so far my method of doing this is not working.
My approach was to convert a string to a URL and then convert the URL to a URI. I then passed the URI to the File constructor. After that, I tried to write a string using the PrintWriter class.
public class Experiment {
File createFile(String path) {
java.net.URL url = getClass().getResource(path);
URI uri;
try {
uri = url.toURI();
}
catch (URISyntaxException e) {
uri = null;
}
if ((url != null) && (uri != null)) {
System.out.println("file loading sucess");
return new File(uri);
}
else {
System.out.println("Error file has not been loaded");
return null;
}
}
File file = createFile("/assets/myfile.txt");
public static void main(String[] args) {
Experiment testrun = new Experiment();
try {
PrintWriter writer = new PrintWriter(new FileWriter(testrun.file));
writer.println("it works");
writer.flush();
writer.close();
System.out.println("string was written");
}
catch (IOException e) {
System.out.println("there was an error while writing");
}
}
}
The output from my try/catch statements suggests that the file-writing code was executed:
file loading sucess
string was written
BUILD SUCCESSFUL (total time: 2 seconds)
I have also tried using absolute string paths to create the new file, but without success. I am running out of ideas and hoping for some guidance or a solution from somebody.
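One possible explanation, not confirmed anywhere in this thread, is that getResource typically resolves to the copy of the file in the build output directory rather than the one in the source tree, so the write may be succeeding on a file you are not looking at. The sketch below sidesteps the classpath entirely: it writes to an ordinary path and reports the absolute location. The class name WritableFileSketch and the output/myfile.txt path are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class WritableFileSketch {

    // Writes one line to the target file and returns the absolute path used.
    static Path writeLine(Path target, String line) throws IOException {
        Path absolute = target.toAbsolutePath();
        Path parent = absolute.getParent();
        if (parent != null) {
            // Create missing directories so the write does not fail on them.
            Files.createDirectories(parent);
        }
        try (PrintWriter writer = new PrintWriter(
                Files.newBufferedWriter(absolute, StandardCharsets.UTF_8))) {
            writer.println(line);
        }
        return absolute;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location, relative to the working directory.
        Path written = writeLine(Paths.get("output", "myfile.txt"), "it works");
        System.out.println("string was written to " + written);
    }
}
```

Printing the absolute path is the useful part here: it shows exactly which file on disk received the text.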
I'm trying to copy files from the assets folder to the device folder using this function:
public static void copyJSON(Context aContext) {
AssetManager assetManager = aContext.getResources().getAssets();
String[] pFiles = null;
try {
pFiles = assetManager.list("ConfigurationFiles");
} catch (IOException e) {
Log.e("tag", "Failed to get asset file list.", e);
}
if (pFiles != null) for (String pJsonFileName : pFiles) {
InputStream tIn = null;
OutputStream tOut = null;
try {
tIn = assetManager.open("ConfigurationFiles" + File.separator + pJsonFileName);
String[] pList = aContext.getFilesDir().list(); //just for test
File pOutFile = new File(aContext.getFilesDir(), pJsonFileName);
tOut = new FileOutputStream(pOutFile);
if (pOutFile.exists()) {
copyFile(tIn, tOut);
}
} catch (IOException e) {
Log.e("tag", "Failed to copy asset file: " + pJsonFileName, e);
} finally {
if (tIn != null) {
try {
tIn.close();
} catch (IOException e) {
Log.e("tag", "Fail closing", e);
}
}
if (tOut != null) {
try {
tOut.close();
} catch (IOException e) {
Log.e("tag", "Fail closing", e);
}
}
}
}
}
If I delete the app and run the code, the variable pList is empty, as I expect, but pOutFile.exists() ALWAYS returns true.
I don't want to copy the files again every time I open my app. I'm doing this because my app uses JSON to navigate through all the screens: if I change a value in my database, a web service sends a new JSON file and the app responds accordingly (for example, a button is no longer needed). So the first time you download my app, I copy the original JSON. After that, if you use the app with an internet connection, you download a new JSON file that is more accurate than the one in the bundle, and it overwrites the copy. I do it this way because, as far as I know, I can't change the files in the assets folder.
Everything I have read says the same thing. Use this:
File pOutFile = new File(aContext.getFilesDir(), pJsonFileName);
And then ask for this:
pOutFile.exists()
I don't know what I'm doing wrong.
Thanks for all your help.
Put it this way:
File pOutFile = new File(aContext.getFilesDir(), pJsonFileName);
// Copy only when the file is not there yet.
if (!pOutFile.exists()) {
tOut = new FileOutputStream(pOutFile);
copyFile(tIn, tOut);
}
and everything should work fine. Remember that FileOutputStream creates the file it writes to if that file does not already exist and can be created.
The problem is you're essentially creating a file and then checking if it exists.
try {
tIn = assetManager.open("ConfigurationFiles" + File.separator + pJsonFileName);
String[] pList = aContext.getFilesDir().list(); //just for test
File pOutFile = new File(aContext.getFilesDir(), pJsonFileName);
// See here: you're creating a file right here
tOut = new FileOutputStream(pOutFile);
// And that file will be created in the exact location of the file
// you're trying to check:
if (pOutFile.exists()) { // Will always be true if FileOutputStream was successful
copyFile(tIn, tOut);
}
}
You should instead create your FileOutputStream AFTER you've done your existence check.
Source: http://docs.oracle.com/javase/7/docs/api/java/io/FileOutputStream.html
A file that you have just created without getting an exception always exists. The test is pointless. Remove it.
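The ordering described in these answers can be sketched in plain Java, outside Android. This is a minimal illustration under stated assumptions: the class name CopyIfMissingSketch and the helper copyIfMissing are hypothetical, and a ByteArrayInputStream stands in for the asset stream. The existence check runs before the FileOutputStream is opened, so an existing file is never touched.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class CopyIfMissingSketch {

    // Copies source to target only when target does not exist yet.
    // Returns true if a copy was made, false if the file was already there.
    static boolean copyIfMissing(InputStream source, File target) throws IOException {
        if (target.exists()) {
            return false; // check first: opening the stream would create the file
        }
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("config").toFile();
        File target = new File(dir, "screen.json");
        byte[] bundled = "{\"button\":true}".getBytes(StandardCharsets.UTF_8);

        System.out.println(copyIfMissing(new ByteArrayInputStream(bundled), target));
        System.out.println(copyIfMissing(new ByteArrayInputStream(bundled), target));
    }
}
```

The second call finds the file created by the first one and leaves it alone, which is the behaviour the question is after for files that a later download may have replaced.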
I need to crawl an online directory, for example http://svn.apache.org/repos/asf/, and whenever the crawl comes across a pdf, docx, txt, or odt file, I need to parse it and extract its text.
I am using Files.walk to crawl locally on my laptop and the Apache Tika library to parse the text, and it works just fine, but I don't know how to do the same with an online directory.
Here's the code that goes through my PC and parses the files just so you guys have an idea of what I'm doing:
public static void GetFiles() throws IOException {
//PathXml is the path directory such as "/home/user/" that
//is taken from an xml file .
Files.walk(Paths.get(PathXml)).forEach(filePath -> { //Crawling process (Using Java 8)
if (Files.isRegularFile(filePath)) {
if (filePath.toString().endsWith(".pdf") || filePath.toString().endsWith(".docx") ||
filePath.toString().endsWith(".txt")){
try {
TikaReader.ParsedText(filePath.toString());
} catch (IOException e) {
e.printStackTrace();
} catch (SAXException e) {
e.printStackTrace();
} catch (TikaException e) {
e.printStackTrace();
}
System.out.println(filePath);
}
}
});
}
and here's the TikaReader method:
public static String ParsedText(String file) throws IOException, SAXException, TikaException {
InputStream stream = new FileInputStream(file);
AutoDetectParser parser = new AutoDetectParser();
BodyContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
try {
parser.parse(stream, handler, metadata);
System.out.println(handler.toString());
return handler.toString();
} finally {
stream.close();
}
}
So again, how can I do the same thing with the given online directory above?
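One possible starting point, assuming the server returns a plain HTML index page with ordinary href links, is to download the listing, pull out the links, and keep the ones with the wanted extensions. The sketch below covers only that link-extraction step and runs on a sample string instead of a live server; the class name ListingLinksSketch, the regular expression, and the example.org URL are assumptions. Each resulting URL could then be opened with openStream() and handed to a parser that accepts an InputStream, as parser.parse does in the code above.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListingLinksSketch {

    // Naive href matcher; good enough for simple generated index pages only.
    private static final Pattern HREF =
            Pattern.compile("href=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns absolute URIs of the links in html that end in a wanted extension.
    static List<URI> documentLinks(URI base, String html) {
        List<URI> result = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            String href = matcher.group(1);
            String lower = href.toLowerCase(Locale.ROOT);
            if (lower.endsWith(".pdf") || lower.endsWith(".docx")
                    || lower.endsWith(".txt") || lower.endsWith(".odt")) {
                // Resolve relative links against the directory URL.
                result.add(base.resolve(href));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        URI base = URI.create("http://example.org/repos/");
        String html = "<a href=\"notes.txt\">notes</a> <a href=\"sub/\">sub</a>";
        for (URI link : documentLinks(base, html)) {
            System.out.println(link);
        }
    }
}
```

Links that end in a slash are subdirectories in such listings, so a full crawler would fetch those pages too and repeat the same step on them.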
I've hosted a text file which I would like to load into a string using java.
My code doesn't work and produces errors. Any help?
try {
dictionaryUrl = new URL("http://pluginstudios.co.uk/resources/studios/games/hangman/dictionary.dic");
} catch (MalformedURLException catchMalformedURLException) {
System.err.println("Error 3: Malformed URL exception.\n"
+ " Dictionary failed to load.");
}
// 'Dictionary' scanner setting to file
// 'src/Main/Dictionary.dic'
DictionaryS = new Scanner(new File(dictionaryUrl));
System.out.println("Default dictionary loaded.");
UPDATE 1: The file doesn't seem to load, and execution goes to the catch block, even though the file exists.
You could do something like what this tutorial does:
public class WebPageScanner {
public static void main(String[] args) {
try {
URLConnection connection =
new URL("http://java.net").openConnection();
String text = new Scanner(
connection.getInputStream()).
useDelimiter("\\Z").next();
} catch (IOException e) {
e.printStackTrace();
}
}
}
One option is to use HttpClient to retrieve the data as a String or StringBuffer,
then parse it or read it as you would a file.
Something like this should work in your case:
DictionaryS = new Scanner(dictionaryUrl.openStream());
JavaDoc tells us:
File(URI uri)
Creates a new File instance by converting the given file: URI into an abstract pathname.
We can't create and use a File instance for any other resource type (like http).
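The openStream() suggestion can be wrapped into a small helper that returns the whole resource as a String. This is a sketch under stated assumptions: the class name UrlTextSketch, the helper readAll, and the UTF-8 encoding are not from the thread. It works for any URL scheme the JVM can open, including http: and file:.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Scanner;

class UrlTextSketch {

    // Reads everything the URL serves into one String (UTF-8 is assumed).
    static String readAll(URL url) throws IOException {
        try (InputStream in = url.openStream();
             Scanner scanner = new Scanner(in, "UTF-8")) {
            // "\\A" matches only the start of input, so the first token
            // is the entire content.
            scanner.useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the URL to read as the first argument.
        if (args.length == 1) {
            System.out.println(readAll(new URL(args[0])));
        }
    }
}
```

For a dictionary file with one word per line, the returned String can then be split on line breaks, or the Scanner can be kept open and read line by line instead.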