How to combine these two xml files - java

File employe = new File("E:five/emplo.xml");
File stud = new File("E:/one/two/student.xml");
How can I combine these two files into one File object?

If you want to merge two ordinary text files, you can just use a FileReader and a FileWriter.
I am assuming this does not need any XML-specific handling, as I am not experienced with XML.
Here is how to read a file (without the exception handling):
FileReader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr); // the only reason I use this is because I am used to line by line handling
String line;
while((line = br.readLine()) != null)
{
// do something with each line
}
You could read each file into an ArrayList of strings and then write them out using:
FileWriter fout = new FileWriter(file, toAppend);
fout.write(msg);
fout.close();

String[] filenames = new String[]{ "emplo.xml", "student.xml" };
OutputStream outputStream = new BufferedOutputStream(new FileOutputStream("merged.xml"));
for (String filename : filenames) {
    InputStream inputStream = new BufferedInputStream(new FileInputStream(filename));
    // Copy the raw bytes of each file to the end of merged.xml (Apache Commons IO)
    org.apache.commons.io.IOUtils.copy(inputStream, outputStream);
    inputStream.close();
}
outputStream.close();
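Or you can also use SAXParser, which matters if the result must be well-formed XML: plain concatenation leaves two root elements in one file.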
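If the result has to be a single well-formed XML document, one option is to parse both inputs and attach their root elements under a new common root. The following is only a minimal sketch of that idea, using the JDK's DOM API rather than SAXParser. The class name XmlMerge and the root element name passed in are hypothetical, and it works on strings so that it stays self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class XmlMerge {
    // Places the root element of every input document under one new root element.
    public static String merge(String rootName, String... xmlDocs) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document merged = builder.newDocument();
            Element root = merged.createElement(rootName);
            merged.appendChild(root);
            for (String xml : xmlDocs) {
                Document doc = builder.parse(new InputSource(new StringReader(xml)));
                // importNode makes a deep copy that belongs to the merged document
                root.appendChild(merged.importNode(doc.getDocumentElement(), true));
            }
            // Serialize the merged DOM tree back to text, without an XML declaration
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(merged), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not merge XML", e);
        }
    }
}
```

For files on disk, builder.parse(new File(...)) could take the place of the StringReader-based call, and a StreamResult built from a File could write the merged document directly.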

Related

Unable to read clear data from a PDF file when the language is not English

I am trying to copy some data from a PDF to a .txt file. Here is the code:
public void readPDFFile() throws IOException {
InputStreamReader reader;
OutputStreamWriter writer;
FileInputStream inputstream;
FileOutputStream outputStream;
BufferedReader bufferedReader = null;
BufferedWriter bufferedWriter = null;
String str;
File rfile = new File(
"C://Documents and Settings/Administrator/My Documents/EGDownloads/source.pdf");
File wFile = new File("C://Documents and Settings/Administrator/My Documents/Folder/destination.txt");
try {
inputstream = new FileInputStream(rfile);
outputStream = new FileOutputStream(wFile);
reader = new InputStreamReader(inputstream, "UTF-8");
writer = new OutputStreamWriter(outputStream, "UTF-8");
bufferedReader = new BufferedReader(reader);
bufferedWriter = new BufferedWriter(writer);
while ((str = bufferedReader.readLine()) != null) {
writer.write(str);
}
} catch (IOException es) {
System.out.println(es.getMessage());
es.printStackTrace(System.out);
} finally {
if (bufferedReader != null) {
bufferedReader.close();
}
if (bufferedWriter != null)
bufferedWriter.close();
}
}
The expected output is in another language, but all I get is random boxes. I tried both UTF-8 and UTF-16.
I also tried PDFBox, but that still does not work: I only get the original language's accents along with English text.
Notes:
1. I am not printing the data to the console, only copying it from the PDF to a .txt file.
2. The source file contains non-English words.
Can anyone help me solve this, or point me to a link that might help?
Thanks.
PDF is a binary format, so it cannot be read like a text file. You must have a really special PDF if this works at all, since all the ones I know of are compressed in some way. Use a proper library to read it, be it PDFBox, iText or another. Be aware that some PDFs do not allow text extraction at all. You can check this with Acrobat: if Acrobat can't extract the text, other tools probably can't either.
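As a small illustration that a PDF is not plain text, a program can check for the "%PDF-" marker that PDF files begin with before deciding how to read a file. This is only a sketch using the standard library: the class name PdfSniffer is hypothetical, the Path overload assumes Java 9 or later, and it does not extract any text (that still needs a library such as PDFBox or iText).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class PdfSniffer {
    // PDF files start with the ASCII bytes "%PDF-" followed by a version number
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true when the given leading bytes start with the PDF marker.
    public static boolean looksLikePdf(byte[] head) {
        if (head.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(head, MAGIC.length), MAGIC);
    }

    // Reads only the first few bytes of the file and checks them (Java 9+ for readNBytes).
    public static boolean looksLikePdf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[MAGIC.length];
            int read = in.readNBytes(head, 0, head.length);
            return read == head.length && looksLikePdf(head);
        }
    }
}
```

If the check returns true, wrapping the file in an InputStreamReader with any charset will not produce readable text, whichever encoding is chosen.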

FileInputStream only reads the first word in a file

I want to read the words in file.txt token by token, add a part-of-speech tag to each of them, and write the result to file2.txt. The content of file.txt is already tokenized. Here is my code:
public class PoSTagging {
@SuppressWarnings("resource")
public static void PoStagMethod() throws IOException {
FileInputStream fin= new FileInputStream("C:\\Users\\dell\\Desktop\\file.txt");
DataInputStream in = new DataInputStream(fin);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strline=br.readLine();
System.out.println(strline+"first");
try{
POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
POSTaggerME tagger = new POSTaggerME(model);
String input = strline;
@SuppressWarnings("deprecation")
ObjectStream<String> lineStream =new PlainTextByLineStream(new StringReader(input));
perfMon.start();
String line;
while ((line = lineStream.read()) != null) {
String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
String[] tags = tagger.tag(whitespaceTokenizerLine);
POSSample sample = new POSSample(whitespaceTokenizerLine, tags);
System.out.println(sample.toString()+"second");
//String t=sample.toString();
FileOutputStream fout=new FileOutputStream("C:\\Users\\dell\\Desktop\\file2.txt");
//fout.write(t.getBytes());
perfMon.incrementCounter();
fout.close();
}
perfMon.stopAndPrintFinalResult();
}
catch (IOException e) {
e.printStackTrace();
}
}
}
When PoStagMethod() is invoked from another class, only the first word in file.txt file gets written into the file2.txt file. Why won't it read other words in the file? What is wrong with my code?
Your code calls br.readLine() only once, so only the first line of file.txt ever reaches the tagger. You can simply read file.txt line by line using a BufferedReader, process each line with your POSModel as you already do, and write the output to file2.txt using a BufferedWriter. A snippet like the one below might help:
POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
POSTaggerME tagger = new POSTaggerME(model);
BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("C:\\Users\\dell\\Desktop\\file2.txt"));
BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\dell\\Desktop\\file.txt"));
String line = "";
while((line = bufferedReader.readLine()) != null){
String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
String[] tags = tagger.tag(whitespaceTokenizerLine);
// Write the tagged tokens for this line, using the same POSSample call as in the question
bufferedWriter.write(new POSSample(whitespaceTokenizerLine, tags).toString());
// for adding new-lines in the output file, uncomment the following line:
//bufferedWriter.newLine();
}
//Do not forget to flush() and close() the streams after your job is done:
bufferedWriter.flush();
bufferedWriter.close();
bufferedReader.close();
Once this works, it is worth replacing the explicit close() calls with try-with-resources, which was added in Java 7 (1.7) and closes the resources automatically.
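A try-with-resources version of the read, process and write loop might look like the sketch below. To keep it runnable without OpenNLP, the tagging step is passed in as a plain function; the names LineProcessor, process and tagLine are hypothetical and stand in for whatever wraps the real tagger.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Function;

public class LineProcessor {
    // Reads every line from source, applies tagLine to it and writes the result
    // as one output line. Returns the number of lines processed.
    public static int process(Reader source, Writer target, Function<String, String> tagLine) {
        int count = 0;
        // Both resources are closed automatically, even if an exception is thrown
        try (BufferedReader in = new BufferedReader(source);
             BufferedWriter out = new BufferedWriter(target)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(tagLine.apply(line));
                out.newLine();
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

With real files, the call could be process(new FileReader("file.txt"), new FileWriter("file2.txt"), tagLine), where tagLine tokenizes the line and joins the tagger's output. Note that the writer is opened once, before the loop, so earlier lines are not overwritten.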
Also, if you need to write each word and its tag on separate lines, you may want an inner loop for writing to the file. It would be something like the following:
POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
POSTaggerME tagger = new POSTaggerME(model);
BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("C:\\Users\\dell\\Desktop\\file2.txt"));
BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\dell\\Desktop\\file.txt"));
String line = "";
while((line = bufferedReader.readLine()) != null){
String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
String[] tags = tagger.tag(whitespaceTokenizerLine);
for (int i = 0; i < whitespaceTokenizerLine.length; i++) {
// Write each word with its tag (word_TAG) on its own line
bufferedWriter.write(whitespaceTokenizerLine[i] + "_" + tags[i]);
bufferedWriter.newLine();
}
}
//Do not forget to flush() and close() the streams after your job is done:
bufferedWriter.flush();
bufferedWriter.close();
bufferedReader.close();
Hope this would be helpful,
Good Luck.

BufferedReader reads the filename only

I am building an Android app in which one feature writes to a .txt file and later tries to read it back. It is very simple, but when I read the file using a BufferedReader it only gives me the name of the file and not the content within. Below is my code (just the relevant portion):
File curFile = new File(getFilesDir().getAbsolutePath()+"/Notes/"+fileName);
BufferedReader br = new BufferedReader(new FileReader(curFile));
StringBuilder note = new StringBuilder(500);
String content;
while((content = br.readLine())!= null) {
ShowToast("Inside While: "+ content);//prints only fileName
note.append(content);
note.append('\n');
}
What am I doing wrong?

Good practice in Java File I/O

I am trying to read integers from a file, apply some operation to them, and write the resulting integers to another file.
// Input
FileReader fr = new FileReader("test.txt");
BufferedReader br = new BufferedReader(fr);
Scanner s = new Scanner(br);
// Output
FileWriter fw = new FileWriter("out.txt");
BufferedWriter bw = new BufferedWriter(fw);
PrintWriter pw = new PrintWriter(bw);
int i;
while(s.hasNextInt())
{
i = s.nextInt();
pw.println(i+5);
}
Is it good practice to wrap the input and output streams like this?
I am new to Java, and on the internet I saw lots of other ways to do file I/O. I want to stick to one approach, so is the above the best one?
- Consider shopping in a food mall. Do you pick up one item from the shelves, walk to the billing counter, go back to the shelves for the next item, and so on? Or do you collect all the items in a cart and then go to the billing counter once?
- It is similar in Java. A FileReader converts the file's bytes to characters, and a BufferedReader collects them in a buffer so the file is read in large chunks instead of one character at a time. The wrapping works well and adds no noticeable overhead.
So to Read the File:
File f = new File("Path");
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr);
So to Write the File:
File f = new File("Path");
FileWriter fw = new FileWriter(f);
BufferedWriter bw = new BufferedWriter(fw);
And when you use Scanner there is no need to use BufferedReader
Keep in mind that the design of those classes is based on the Decorator design pattern. A good practice is to close all instances of java.io.Closeable in a finally block. For example:
Reader r = null;
Scanner s = null;
try {
r = new FileReader("test.txt");
s = new Scanner(r);
// Do your stuff here.
} finally {
if (r != null)
r.close();
if (s != null)
s.close();
}
or, if you are using Java 7 or higher:
try (
Reader r = new FileReader("test.txt");
Scanner s = new Scanner(r)
) {
// Do your stuff here.
}
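Applied to the task in the question (read integers, add 5, write the results), the try-with-resources form could be sketched as follows. The class and method names are hypothetical, and the method takes a Reader and a Writer so that it can be tried without real files.

```java
import java.io.PrintWriter;
import java.io.Reader;
import java.io.Writer;
import java.util.Scanner;

public class AddFive {
    // Reads whitespace-separated tokens, adds 5 to every integer and writes
    // each result on its own line. Tokens that are not integers are skipped.
    public static void addFive(Reader source, Writer target) {
        // Scanner and PrintWriter are both closed automatically at the end of the block
        try (Scanner in = new Scanner(source);
             PrintWriter out = new PrintWriter(target)) {
            while (in.hasNext()) {
                if (in.hasNextInt()) {
                    out.println(in.nextInt() + 5);
                } else {
                    in.next(); // consume the non-integer token so the loop can advance
                }
            }
        }
    }
}
```

With the files from the question this would be called as addFive(new FileReader("test.txt"), new FileWriter("out.txt")), both of which can throw IOException when the files are opened.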
You don't really need a BufferedWriter when you use a PrintWriter to write character data; PrintWriter has a constructor that takes a FileWriter (or any other Writer) as an argument. You also don't need a Scanner to read from a file; you can achieve it with BufferedReader itself.
FileReader fr = new FileReader("test.txt");
BufferedReader br = new BufferedReader(fr);
String line; // declared here so the loop compiles
while ((line = br.readLine()) != null) {
    // do read operations here
}
br.close();
FileWriter fw = new FileWriter("out.txt");
PrintWriter pw = new PrintWriter(fw);
pw.println("write some data to the file");
pw.close(); // flushes and closes the underlying FileWriter
Scanner does not need the BufferedReader. You can wrap it over the FileReader.
Scanner s = new Scanner(new FileReader("test.txt"));
When using a Scanner, it is better to assume that the source may contain mixed content. It is also good to close the Scanner after using it.
while (s.hasNext()) {
    if (s.hasNextInt()) {
        int i = s.nextInt();
        // use i here
    } else {
        s.next(); // skip a token that is not an integer
    }
}
s.close();
I usually do this:
String inputFileLocation = "Write it here";
BufferedReader br = new BufferedReader(new FileReader(new File(inputFileLocation)));
String line;
while ((line = br.readLine()) != null) {
    // Scanner operations on line here
}
String outputFileLocation = "Here";
PrintWriter pr = new PrintWriter(new FileWriter(new File(outputFileLocation)));

Read an input file to POS tag

I have a text file of words. I want to read the file:
FileInputStream fstream = new FileInputStream(s);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
MaxentTagger tagger = new MaxentTagger("tag/wsj-0-18-bidirectional-distsim.tagger");
String tagged = tagger.tagString(br);
My problem is that it should read the file, pass it line by line as a string to the tagger, and print the result to an output file.
As both input and output are going to be text, I'd use Reader and Writer rather than streams. Something like:
try (
BufferedReader in = new BufferedReader(new FileReader("inputFile.txt"));
PrintWriter out = new PrintWriter(new FileWriter("outputFile.txt"));
) {
MaxentTagger tagger = new MaxentTagger("tag/wsj-0-18-bidirectional-distsim.tagger");
String line;
while ((line = in.readLine()) != null) {
String tagged = tagger.tagString(line);
out.println(tagged);
}
}
Note, this code uses Java 7 resource handling, so the in and out are closed automatically.
