I want to read the words in file.txt token by token, add a part-of-speech tag to each of them, and write the result to file2.txt. The content of file.txt is already tokenized. Here is my code:
public class PoSTagging {
@SuppressWarnings("resource")
public static void PoStagMethod() throws IOException {
FileInputStream fin= new FileInputStream("C:\\Users\\dell\\Desktop\\file.txt");
DataInputStream in = new DataInputStream(fin);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strline=br.readLine();
System.out.println(strline+"first");
try{
POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
POSTaggerME tagger = new POSTaggerME(model);
String input = strline;
@SuppressWarnings("deprecation")
ObjectStream<String> lineStream =new PlainTextByLineStream(new StringReader(input));
perfMon.start();
String line;
while ((line = lineStream.read()) != null) {
String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
String[] tags = tagger.tag(whitespaceTokenizerLine);
POSSample sample = new POSSample(whitespaceTokenizerLine, tags);
System.out.println(sample.toString()+"second");
//String t=sample.toString();
FileOutputStream fout=new FileOutputStream("C:\\Users\\dell\\Desktop\\file2.txt");
//fout.write(t.getBytes());
perfMon.incrementCounter();
fout.close();
}
perfMon.stopAndPrintFinalResult();
}
catch (IOException e) {
e.printStackTrace();
}
}
}
When PoStagMethod() is invoked from another class, only the first word of file.txt gets written to file2.txt. Why does it not read the other words in the file? What is wrong with my code?
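Your code calls readLine() only once, so only the first line of file.txt is ever tagged, and it reopens file2.txt on every loop iteration, which truncates the file each time. Instead, read file.txt line by line with a BufferedReader, tag each line with your POSModel, and write the output to file2.txt with a BufferedWriter that is opened once, before the loop. A snippet like the following might help: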
POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
POSTaggerME tagger = new POSTaggerME(model);
BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("C:\\Users\\dell\\Desktop\\file2.txt"));
BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\dell\\Desktop\\file.txt"));
String line = "";
while((line = bufferedReader.readLine()) != null){
String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
String[] tags = tagger.tag(whitespaceTokenizerLine);
// Pair each token with its tag, as in the question's code
POSSample sample = new POSSample(whitespaceTokenizerLine, tags);
bufferedWriter.write(sample.toString());
// Write one tagged line per input line
bufferedWriter.newLine();
}
//Do not forget to flush() and close() the streams after your job is done:
bufferedWriter.flush();
bufferedWriter.close();
bufferedReader.close();
Once this works, consider replacing the manual close() calls with try-with-resources, which was added in Java 7 and closes the resources automatically.
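As a rough sketch of the try-with-resources suggestion, the read/process/write loop could look like the following. The `LineProcessor` class, the `processLines` method and the `tagLine` parameter are hypothetical names (nothing here is part of OpenNLP); in the real code the function passed in would tokenize the line and call `tagger.tag(...)`. The `main` method uses temporary files and an upper-casing stand-in so that it runs without a model file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of OpenNLP.
class LineProcessor {

    // Applies tagLine to every line of input and writes the results to output.
    // Returns the number of lines processed.
    static int processLines(Path input, Path output, UnaryOperator<String> tagLine)
            throws IOException {
        int count = 0;
        // Both resources are closed automatically, even if an exception is thrown.
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(tagLine.apply(line));
                writer.newLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for file.txt and file2.txt.
        Path input = Files.createTempFile("file", ".txt");
        Path output = Files.createTempFile("file2", ".txt");
        Files.write(input, Arrays.asList("first line", "second line"), StandardCharsets.UTF_8);

        // Stand-in for the tagger: the real function would tag the tokens.
        UnaryOperator<String> fakeTagger = String::toUpperCase;

        int n = processLines(input, output, fakeTagger);
        System.out.println("Processed " + n + " lines");
        for (String line : Files.readAllLines(output, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```

Because the writer is opened once, outside the loop, every processed line ends up in the output file rather than only the last one.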
Also, if you need to write each word and its tag on a separate line, you may want an inner loop for writing to the file. It would be something like this:
POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
POSTaggerME tagger = new POSTaggerME(model);
BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("C:\\Users\\dell\\Desktop\\file2.txt"));
BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\dell\\Desktop\\file.txt"));
String line = "";
while((line = bufferedReader.readLine()) != null){
String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
String[] tags = tagger.tag(whitespaceTokenizerLine);
for(int i = 0; i < whitespaceTokenizerLine.length; i++){
// tags[i] is the tag for whitespaceTokenizerLine[i]; the "_" separator is an arbitrary choice
bufferedWriter.write(whitespaceTokenizerLine[i] + "_" + tags[i]);
// one word and its tag per line
bufferedWriter.newLine();
}
}
//Do not forget to flush() and close() the streams after your job is done:
bufferedWriter.flush();
bufferedWriter.close();
bufferedReader.close();
Hope this would be helpful,
Good Luck.
I have been trying to remove the column names that appear when I read the content of the response returned by an HTTP GET.
Initially I used HTTP GET to fetch the content, read it with an InputStream, and wrote it to the local disk as a CSV file using a FileOutputStream:
InputStream read_content = result.getEntity().getContent();
FileOutputStream writ = new FileOutputStream(new File(path));
byte[] buff = new byte[4096];
int length;
while ((length = read_content.read(buff)) > 0) {
writ.write(buff, 0, length);
}
Here result is the response I get from the HTTP GET. This works fine, but the response also contains the column names, which I want to remove.
After some modification I am now using this code, but the output is wrong:
InputStream read_content = result.getEntity().getContent();
BufferedReader reader =
new BufferedReader(new InputStreamReader(read_content));
FileWriter fstream = new FileWriter(path);
BufferedWriter out = new BufferedWriter(fstream);
reader.readLine();
while (reader.readLine() != null) {
out.write(reader.read());
}
When I run this modified code, I get garbage output. What am I doing wrong here, and how can I remove the table column names?
Your code should look something like this:
BufferedReader br = null ;
BufferedWriter out = null;
try{
InputStream is = new FileInputStream(new File("C:/Space/ConnTest/Test/input.txt"));
br = new BufferedReader(new InputStreamReader(is));
out = new BufferedWriter(new FileWriter(new File("C:/Space/ConnTest/Test/output.txt")));
System.out.println("This is first line ---"+br.readLine());
String str = "";
while ((str = br.readLine()) != null) {
out.write(str);
// readLine() strips the line terminator, so write it back
out.newLine();
}
System.out.println("Success");
}
catch(Exception e )
{
e.printStackTrace();
}
finally
{
if(br!=null)
{
br.close();
}
if(out!=null)
{
out.close();
}
}
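Don't be confused by the whole code; I just replaced your loop around out.write(reader.read()); with
while ((str = br.readLine()) != null) {
out.write(str);
out.newLine();
}
Your original loop produced garbage because the readLine() call in the loop condition read a line and discarded it, and out.write(reader.read()) then wrote only the single character that followed. I call br.readLine() once inside System.out.println before the loop, so the header line is consumed and skipped. The loop then writes every remaining line returned by br.readLine() to the file.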
If the header is a single line and the rest of the content is also line-based, use BufferedReader.readLine() and skip the first line.
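A minimal sketch of skipping the header, assuming the column names occupy exactly the first line. The `HeaderSkipper` class and `copyWithoutHeader` method are hypothetical names. The method accepts any Reader and Writer, so the reader could wrap the response stream from the question and the writer could be a FileWriter; the `main` method uses in-memory strings so that it runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper: copies line-based text while dropping the first line.
class HeaderSkipper {

    static void copyWithoutHeader(Reader source, Writer target) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        // Read and discard the header (assumed to be exactly one line).
        reader.readLine();
        String line;
        // Keep each line in a variable; calling readLine() only in the loop
        // condition would throw the line away.
        while ((line = reader.readLine()) != null) {
            target.write(line);
            // readLine() strips the line terminator, so add one back.
            target.write(System.lineSeparator());
        }
        target.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        copyWithoutHeader(new StringReader("id,name\n1,Ann\n2,Bob\n"), out);
        System.out.print(out);
    }
}
```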
I'm trying to delete the last four characters of all the lines in a text file. Let's say I have domain.txt and the content:
123.com
student.com
tech.net
and so on for hundreds of lines. How do I delete the last four characters (the extensions) so that the following remains:
123
student
tech
etc.
I hope this helps.
UPDATED
String a ="123.com";
System.out.println(a.substring(0, a.lastIndexOf(".")));
You can do it as below:
File file = new File("file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "",
newtext = "";
while((line = reader.readLine()) != null) {
int dot = line.lastIndexOf(".");
// Strip the extension only if the line contains a dot
if (dot >= 0) {
line = line.substring(0, dot);
}
newtext += line + "\n";
}
reader.close();
// Now write new Content
FileWriter writer = new FileWriter("file.txt");
writer.write(newtext);
writer.close();
Do not forget to handle the IOException with try..catch.
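The same idea can be sketched with java.nio.file.Files, which reads and writes all lines in one call. The `ExtensionStripper` class and its method names are hypothetical, and the sketch assumes the file is small enough to hold in memory and is UTF-8 encoded. Lines without a dot are left unchanged instead of causing an exception.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for removing the extension from every line of a file.
class ExtensionStripper {

    // Returns the text before the last dot, or the line unchanged if it has no dot.
    static String stripExtension(String line) {
        int dot = line.lastIndexOf('.');
        return dot >= 0 ? line.substring(0, dot) : line;
    }

    // Rewrites the file in place with the extension removed from each line.
    static void stripFile(Path file) throws IOException {
        List<String> result = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            result.add(stripExtension(line));
        }
        Files.write(file, result, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for domain.txt.
        Path file = Files.createTempFile("domain", ".txt");
        Files.write(file, Arrays.asList("123.com", "student.com", "tech.net"),
                StandardCharsets.UTF_8);
        stripFile(file);
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```

Using lastIndexOf rather than a fixed count of four characters also handles extensions of other lengths, at the cost of treating any final dot as the start of the extension.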
File employe = new File("E:five/emplo.xml");
File stud = new File("E:/one/two/student.xml");
How can I combine these two files into one File object?
If you want to merge two standard text files, you can just use a FileReader and a FileWriter.
I am assuming that this is not something XML-specific, as I am not experienced with XML.
Here is how to read a file (without the exception handling):
FileReader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr); // the only reason I use this is because I am used to line by line handling
String line;
while((line = br.readLine()) != null)
{
// do something with each line
}
You could read each file into an ArrayList of strings and then write them out using:
FileWriter fout = new FileWriter(file, toAppend);
fout.write(msg);
fout.close();
String[] filenames = new String[]{ "emplo.xml", "student.xml"};
OutputStream outputStream = new BufferedOutputStream(new FileOutputStream("merged.xml"));
for (String filename : filenames) {
InputStream inputStream = new BufferedInputStream(new FileInputStream(filename));
// Appends the bytes of this file to the merged output
org.apache.commons.io.IOUtils.copy(inputStream, outputStream);
inputStream.close();
}
outputStream.close();
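Note that concatenating two XML documents byte by byte produces a file with two root elements, which is not well-formed XML. If the result must be parsed as XML, you can use a parser such as SAXParser to merge the contents instead.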
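If adding Apache Commons IO is not an option, a plain concatenation can also be sketched with the standard library alone. The `FileMerger` class and `concatenate` method are hypothetical names, and this is only a byte-for-byte append of one file after another, not an XML-aware merge. The `main` method uses temporary files in place of emplo.xml and student.xml so that it runs on its own.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: appends the bytes of each input file to one output file.
class FileMerger {

    static void concatenate(List<Path> inputs, Path merged) throws IOException {
        // The output stream is closed automatically when the block exits.
        try (OutputStream out = Files.newOutputStream(merged)) {
            for (Path input : inputs) {
                // Files.copy streams the whole file into the open output stream.
                Files.copy(input, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the two input files and the merged file.
        Path first = Files.createTempFile("emplo", ".xml");
        Path second = Files.createTempFile("student", ".xml");
        Path merged = Files.createTempFile("merged", ".xml");
        Files.write(first, "<employees/>\n".getBytes(StandardCharsets.UTF_8));
        Files.write(second, "<students/>\n".getBytes(StandardCharsets.UTF_8));

        concatenate(Arrays.asList(first, second), merged);
        System.out.print(new String(Files.readAllBytes(merged), StandardCharsets.UTF_8));
    }
}
```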
I have a text file of words. I want to read the file:
FileInputStream fstream = new FileInputStream(s);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
MaxentTagger tagger = new MaxentTagger("tag/wsj-0-18-bidirectional-distsim.tagger");
String tagged = tagger.tagString(br);
My problem is that it should read the file, pass each line of the file to the tagger as a string, and print the result to an output file.
As both input and output are going to be text, I'd use Reader and Writer rather than streams. Something like:
try (
BufferedReader in = new BufferedReader(new FileReader("inputFile.txt"));
PrintWriter out = new PrintWriter(new FileWriter("outputFile.txt"));
) {
MaxentTagger tagger = new MaxentTagger("tag/wsj-0-18-bidirectional-distsim.tagger");
String line;
while ((line = in.readLine()) != null) {
String tagged = tagger.tagString(line);
out.println(tagged);
}
}
Note that this code uses the Java 7 try-with-resources statement, so in and out are closed automatically.
I want to write a simple java program to read in a text file and then write out a new file whenever a blank line is detected. I have seen examples for reading in files but I don't know how to detect the blank line and output multiple text files.
fileIn.txt:
line1
line2

line3
fileOut1.txt:
line1
line2
fileOut2.txt:
line3
Just in case your file has special characters, maybe you should specify the encoding.
FileInputStream inputStream = new FileInputStream(new File("fileIn.txt"));
InputStreamReader streamReader = new InputStreamReader(inputStream, "UTF-8");
BufferedReader reader = new BufferedReader(streamReader);
int n = 0;
PrintWriter out = new PrintWriter("fileOut" + ++n + ".txt", "UTF-8");
for (String line;(line = reader.readLine()) != null;) {
if (line.trim().isEmpty()) {
out.flush();
out.close();
out = new PrintWriter("fileOut" + ++n + ".txt", "UTF-8");
} else {
out.println(line);
}
}
out.flush();
out.close();
reader.close();
streamReader.close();
inputStream.close();
I don't know how to detect the blank line..
if (line.trim().length()==0) { // perform 'new File' behavior
.. and output multiple text files.
Do what is done for a single file, in a loop.
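One way to sketch "a single-file write, in a loop" is to group the lines first and then write one file per group. The `BlankLineSplitter` class and `splitOnBlankLines` method are hypothetical names, and skipping empty groups (so that consecutive blank lines do not produce empty files) is an assumption about the desired behavior. The `main` method writes into a temporary directory so that it runs on its own.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a list of lines into groups at blank lines.
class BlankLineSplitter {

    // Groups consecutive non-blank lines; a blank line ends the current group.
    static List<List<String>> splitOnBlankLines(List<String> lines) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                // Assumption: empty groups are skipped, so repeated blank
                // lines do not create empty output files.
                if (!current.isEmpty()) {
                    groups.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) throws IOException {
        // In-memory lines stand in for the content of fileIn.txt.
        List<String> lines = Arrays.asList("line1", "line2", "", "line3");
        Path dir = Files.createTempDirectory("split");
        int n = 0;
        for (List<String> group : splitOnBlankLines(lines)) {
            // The same single-file write, repeated once per group.
            Path out = dir.resolve("fileOut" + (++n) + ".txt");
            Files.write(out, group, StandardCharsets.UTF_8);
            System.out.println(out.getFileName() + ": " + group);
        }
    }
}
```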
You can detect an empty string to find out if a line is blank or not. For example:
if(str!=null && str.trim().length()==0)
Or, if you are using JDK 1.6 or later and only truly empty lines (not whitespace-only ones) should count as blank, you can do:
if(str!=null && str.isEmpty())
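The two checks above do not behave identically for lines that contain only spaces or tabs. A small sketch of the difference, using the hypothetical names `BlankCheck`, `isEmptyLine` and `isBlankLine`:

```java
// Hypothetical helper contrasting the two blank-line checks.
class BlankCheck {

    // True only for a zero-length string.
    static boolean isEmptyLine(String str) {
        return str != null && str.isEmpty();
    }

    // True for zero-length strings and for strings made up only of whitespace.
    static boolean isBlankLine(String str) {
        return str != null && str.trim().length() == 0;
    }

    public static void main(String[] args) {
        String[] samples = { "", "   ", "\t", "line1" };
        for (String s : samples) {
            System.out.println("[" + s + "] empty=" + isEmptyLine(s)
                    + " blank=" + isBlankLine(s));
        }
    }
}
```

For splitting a file at visually blank lines, the trim-based check is usually the safer choice, since stray spaces on an otherwise empty line are easy to miss.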
BufferedReader br = new BufferedReader(new FileReader("test.txt"));
String line;
int empty = 0;
while ((line = br.readLine()) != null) {
if (line.trim().isEmpty()) {
// Line is empty
}
}
The above code snippet can be used to detect if the line is empty and at that point you can create FileWriter to write to new file.
Something like this should do :
public static void main(String[] args) throws Exception {
writeToMultipleFiles("src/main/resources/fileIn.txt", "src/main/resources/fileOut.txt");
}
private static void writeToMultipleFiles(String fileIn, String fileOut) throws Exception {
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(new File(fileIn))));
String line;
int counter = 0;
BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(new File(fileOut))));
while((line=br.readLine())!=null){
if(line.trim().length()!=0){
wr.write(line);
wr.write("\n");
}else{
// Blank line: finish the current file and start the next one
wr.close();
counter++;
wr = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(fileOut + counter)));
}
}
wr.close();
}