ReplaceAll not working on XML input - java

I am working on a Java program that reads in XML and generates an output XML. I am having a problem replacing some of the characters in the file I read in.
The following is my method:
public void readTemplateXML() {
BufferedReader br = null;
try {
br = new BufferedReader(new InputStreamReader(new FileInputStream(
path), "UTF8"));
} catch (UnsupportedEncodingException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} catch (FileNotFoundException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
String line;
StringBuilder sb = new StringBuilder();
try {
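while ((line = br.readLine()) != null) {
sb.append(line);
sb.append(System.lineSeparator());
// No second readLine() here: the loop condition already advances the reader,
// and reading again would skip every other line.
}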
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
xml = sb.toString();
xml = xml.replaceAll("&lt;", "\\<"); //This is not working.
}
I am just outputting the "xml" string to an XML file and I am still getting "&lt;":
&lt;addressLine1&gt;Main Street&lt;/addressLine1&gt;
Is there any way I can replace these entities with <, >?
The encoding of the file is UTF-8.
EDIT:
The xml string is correct after the replaceAll calls. I am using it as the text content of an XML node in another method:
// inner request element
Element request = doc.createElement("con:request");
request.appendChild(doc.createTextNode(xml));
rootElement.appendChild(request);
After this the content is incorrect.
Any help would be greatly appreciated.

Short answer:
Syntax of the method:
public String replaceAll(String regex, String replacement)
Parameters:
regex -- the regular expression to which this string is to be matched.
replacement -- the string that replaces each match.
Code:
String xml="&lt;addressLine1&gt;Main Street&lt;/addressLine1&gt;&#13";
xml = xml.replaceAll("&lt;", "\\<");
xml = xml.replaceAll("&gt;", "\\>");
xml = xml.replaceAll("&#13", "");
System.out.println( xml );
Result:
<addressLine1>Main Street</addressLine1>
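The EDIT in the question suggests the string itself is fine after the replacements and that the entities come back once it is passed to createTextNode. That is expected DOM behaviour: a text node is character data, so the serializer escapes "<" again when writing the file. Below is a minimal sketch of the difference, assuming the goal is for the template to end up as real child elements. The class name EmbedXmlSketch, the element name "request" (instead of "con:request") and the helper names are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class EmbedXmlSketch {

    // Variant 1: the XML string is added as text, so "<" is escaped on output.
    static String asTextNode(String xml) {
        try {
            Document doc = newBuilder().newDocument();
            Element request = doc.createElement("request"); // hypothetical element name
            request.appendChild(doc.createTextNode(xml));
            doc.appendChild(request);
            return serialize(doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not build document", e);
        }
    }

    // Variant 2: the XML string is parsed and imported as a real child element.
    static String asChildElement(String xml) {
        try {
            DocumentBuilder builder = newBuilder();
            Document doc = builder.newDocument();
            Element request = doc.createElement("request"); // hypothetical element name
            doc.appendChild(request);
            Document parsed = builder.parse(new InputSource(new StringReader(xml)));
            // importNode copies the parsed element into the target document.
            request.appendChild(doc.importNode(parsed.getDocumentElement(), true));
            return serialize(doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not build document", e);
        }
    }

    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    // Writes the document to a string without the XML declaration.
    private static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<addressLine1>Main Street</addressLine1>";
        System.out.println(asTextNode(xml));      // markup appears escaped
        System.out.println(asChildElement(xml));  // markup appears as an element
    }
}
```

If the receiving system really expects the XML as escaped text inside the element, the first variant is already doing the right thing and no replaceAll is needed.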

Related

formatting while writing a document

I am reading a txt file into a string buffer and writing the content into a Word document using OutputStreamWriter.
The problem is that the formatting is not retained in the document: the spaces and line breaks do not match the text file. The txt file is formatted properly with spaces, page breaks, and tabs, and I want to replicate it in the Word document. How can the same formatting be retained? The link to the file is: http://s000.tinyupload.com/index.php?file_id=09876662859146558533.
This is the sample code:
private static String readTextFile() {
BufferedReader br = null;
String content = null;
try {
br = new BufferedReader(new FileReader("ORDER_INVOICE.TXT"));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
line = br.readLine();
sb.append(System.lineSeparator());
}
content = sb.toString();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return content;
}
private static void createDocument(String docName, String content) {
FileOutputStream fout = null;
try {
fout = new FileOutputStream(docName);
OutputStreamWriter out = new OutputStreamWriter(fout);
out.write(content);
out.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
Try changing your readTextFile() like this:
BufferedReader br = null;
String content = null;
try {
br = new BufferedReader(new FileReader("ORDER_INVOICE.TXT"));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while(line != null) {
// Append to the builder; "content += ..." on a null String would prefix the text with "null".
sb.append(line).append("\n");
line = br.readLine();
}
content = sb.toString();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return content;
If you are using Java 7 or later, you can use try-with-resources to reduce the number of lines in your code.
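A minimal sketch of what the try-with-resources version could look like. The class and method names are made up, UTF-8 is an assumption about the file, and UncheckedIOException needs Java 8 (on Java 7 any runtime exception would do).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class ReadTextFileSketch {

    // Reads everything from the given reader, keeping one line separator per line.
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        // The reader is closed automatically, even if readLine() throws.
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append(System.lineSeparator());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Convenience wrapper for a file on disk; UTF-8 is an assumption.
    static String readTextFile(String fileName) {
        try {
            return readAll(Files.newBufferedReader(Paths.get(fileName), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Compared with the original, the finally block and the nested try around close() disappear, and a missing file no longer leads to a NullPointerException in br.close().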
Avoid hard-coding \n. Windows uses \r\n, and line separators differ across platforms.
A more reliable way is to use PrintWriter; see
"How to write new line in Java FileOutputStream".
After the discussion in comments:
the source file has unix line breaks
the output file is expected to have Windows line breaks
we shall strip the 0x0c (form feed - i.e. move to next page on the printer) from the source file, as it is non-printable.
public static void main(String[] args) throws IOException {
String content = new String(Files.readAllBytes(Paths.get("f:\\order_invoice.txt")))
.replace("\u000c","");
PrintWriter printWriter=new PrintWriter(new FileWriter("f:\\new_order_invoice.txt"));
for (String line:content.split("\\n")) {
printWriter.println(line);
}
printWriter.close();
}
So:
read the file as it is into a String
get rid of the form feed (0x0c, Unicode U+000C)
split the string at Unix line breaks \n
write it out line by line using PrintWriter, which uses the platform default line ending (CR-LF on Windows).
You could also do this in one step: use a regexp to replace Unix line endings with Windows line endings in the string holding the whole file, then write the result out with Files.write. The solution presented here is probably a bit better, though, because it always uses the platform's native line separators.
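For completeness, the regexp variant could be sketched roughly as follows. The class and method names are made up, and UTF-8 is an assumption about the file. Unlike the PrintWriter version, this always writes CR-LF regardless of the platform.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LineEndingSketch {

    // Drops form feeds and turns every LF or CR-LF into CR-LF.
    static String toWindowsLineEndings(String content) {
        return content.replace("\f", "").replaceAll("\r?\n", "\r\n");
    }

    // Reads the whole source file, converts it and writes the target file.
    static void convert(Path source, Path target) throws IOException {
        String content = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        Files.write(target, toWindowsLineEndings(content).getBytes(StandardCharsets.UTF_8));
    }
}
```

The pattern matches an optional CR before each LF, so lines that already end in CR-LF are not turned into CR-CR-LF.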

Android reading text files(.txt) into an array of strings

This is the code:
void CreateWordList()
{
Toast.makeText(getBaseContext(), "Creating Word List...", Toast.LENGTH_SHORT).show();
InputStream is = getResources().openRawResource(R.raw.pass);
BufferedReader lines = null;
try {
lines = new BufferedReader(new InputStreamReader(is, "UTF-8"));
} catch (UnsupportedEncodingException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
ArrayList<String> list = new ArrayList<String>();
String line = null;
try {
while((line = lines.readLine()) !=null)list.add(line);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
wordlist = (String[]) list.toArray();
if (wordlist[1] == null)
{
Toast.makeText(getBaseContext(), "ERROR: Word List = null", Toast.LENGTH_SHORT).show();
}
}
I had an error at "line = lines.readLine();" that says "Unhandled exception type IOException", so I surrounded it with try/catch.
I had another error at "BufferedReader lines = new BufferedReader(new InputStreamReader(is, "UTF-8"));" that says "Unhandled exception type UnsupportedEncodingException", so I surrounded it with try/catch as well.
Now when I run the app, it crashes...
What am I doing wrong?
How can I read a text file and add each line to an array of strings?
PS: I have searched and found other similar questions and answers, but they did not help me...
Change this:
while(true){
line = lines.readLine();
to this:
while((line = lines.readLine()) !=null)
Try this:
void CreateWordList()throws IOException{
...}
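Another thing worth checking, as a guess since no stack trace is shown: the cast in "(String[]) list.toArray()" fails at runtime with a ClassCastException, because the no-argument toArray() returns an Object[]. Here is a minimal sketch that passes a typed array instead. The class and method names are made up, and the stream would come from getResources().openRawResource(R.raw.pass) as in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class WordListSketch {

    // Reads every line of the stream into a String[].
    static String[] readWordList(InputStream is) {
        List<String> list = new ArrayList<String>();
        // Charset.forName avoids the checked UnsupportedEncodingException.
        try (BufferedReader lines = new BufferedReader(
                new InputStreamReader(is, Charset.forName("UTF-8")))) {
            String line;
            while ((line = lines.readLine()) != null) {
                list.add(line);
            }
        } catch (IOException e) {
            throw new IllegalStateException("Could not read word list", e);
        }
        // The typed overload returns a real String[], so no cast is needed.
        return list.toArray(new String[0]);
    }
}
```

Checking wordlist.length instead of wordlist[1] == null would also avoid an ArrayIndexOutOfBoundsException when the file has fewer than two lines.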

Java: Create a KML File and insert elements in existing file

I'm developing an app that reads the GPS Exif information of photos and writes the tags (Lat/Lon, ...) to a KML or CSV file.
Creating the files if they don't exist, especially the CSV, is not the problem, but in this case I want to add a new KML placemark to an existing KML file.
So far I have created a method that checks whether the file already exists; if not (if-statement), it creates a new one.
If the file exists, it should add the information (else).
public void createKMLFile(){
String kmlstart = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
"<kml xmlns=\"http://www.opengis.net/kml/2.2\">\n";
String kmlelement ="\t<Placemark>\n" +
"\t<name>Simple placemark</name>\n" +
"\t<description>"+name+"</description>\n" +
"\t<Point>\n" +
"\t\t<coordinates>"+latlon[1]+","+latlon[0]+","+z+ "</coordinates>\n" +
"\t</Point>\n" +
"\t</Placemark>\n";
String kmlend = "</kml>";
ArrayList<String> content = new ArrayList<String>();
//content.add(0,kmlstart);
//content.add(1,kmlelement);
//content.add(2,kmlend);
String kmltest;
// For inserting a substring (another Placemark)
//String test = "</kml>";
//int index = kml.lastIndexOf(test);
File test = new File(datapath+"/"+name+".kml");
Writer fwriter;
if(test.exists() == false){
try {
content.add(0,kmlstart);
content.add(1,kmlelement);
content.add(2,kmlend);
kmltest = content.get(0) + content.get(1) + content.get(2);
fwriter = new FileWriter(datapath+"/"+name+".kml");
fwriter.write(kmltest);
//fwriter.append("HalloHallo", index, kml.length());
fwriter.flush();
fwriter.close();
}catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
}
else{
kmltest = content.get(0) + content.get(1) + content.get(2);
StringTokenizer tokenize = new StringTokenizer(kmltest, ">");
ArrayList<String> append = new ArrayList<String>();
while(tokenize.hasMoreTokens()){
append.add(tokenize.nextToken());
append.add(1, kmlelement);
String rewrite = append.toString();
try {
fwriter = new FileWriter(datapath+"/"+name+".kml");
fwriter.write(rewrite);
fwriter.flush();
fwriter.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
I don't get any logs in LogCat, but the app stops working if I try to update the existing file... Any suggestions?
Thanks in advance.
EDIT: OK, I see that content.add(0, kml...) has to be outside the try block... but that doesn't seem to be the main problem.
Modifying XML files is best done with a library of some sort. I maintain the XML-manipulation library called JDOM, which is designed to make this sort of manipulation as easy as it can be. Other options are the DOM library (which is already built into the Java runtime, making it much easier to integrate into your program) and SAX (which, in this case, I would not recommend, even though it may be faster). Other external libraries like JDOM exist that would also help, such as XOM, dom4j, etc. This Stack Overflow answer seems relevant: Best XML parser for Java
In JDOM, your code would look something like:
Document doc = null;
// Namespace has no public constructor; use the factory method.
Namespace kmlns = Namespace.getNamespace("http://www.opengis.net/kml/2.2");
Element position = new Element("Position", kmlns);
position.addContent(new Element("name", kmlns).setText(positionName));
position.addContent(new Element("desc", kmlns).setText(description));
// ..... add all the XML content needed for the Position ....
// create the XML Document in memory if the file does not exist
// otherwise read the file from the disk
if(!test.exists()){
// the document needs a root element before getRootElement() can be called
doc = new Document(new Element("kml", kmlns));
} else {
SAXBuilder sb = new SAXBuilder();
// build() throws JDOMException and IOException; handle them in real code
doc = sb.build(test);
}
Element root = doc.getRootElement();
// modify the XML as you need
// add Position Element
root.addContent(position);
try {
Writer fwriter = new FileWriter(datapath+"/"+name+".kml");
XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
xout.output(doc, fwriter);
fwriter.flush();
fwriter.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
EDIT: You ask what's wrong with your actual code. A few things are contributing to your problems, but you don't show an actual error or any other indication of how the program 'stops working'.
There are bugs in your code that should throw serious exceptions: kmltest = content.get(0) + content.get(1) + content.get(2); should throw IndexOutOfBoundsException because the content ArrayList is empty (the lines adding values to the ArrayList are commented out), but let's assume that they are not.
You never read the file you are changing, so how can you be changing it?
The StringTokenizer delimiter is ">", which is never a good way to parse XML.
You loop through the StringTokenizer on every '>' delimiter, but you never add the delimiter back into the output (i.e. your output is missing a lot of '>' characters).
You add the kmlelement Position content in the place of every '>' character in the document, not just the one that is important.
The FileWriter logic should be outside the loop: you do not want to rewrite the file for every token you process.
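For comparison, the same append can be sketched with the DOM API that ships with the Java runtime, which was mentioned above as an alternative to JDOM. This is only an illustration under a few assumptions: it works on strings rather than files to stay short, it mirrors the question's layout with the Placemark directly under the kml root, and the class and method names are made up.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class KmlAppendSketch {

    static final String KML_NS = "http://www.opengis.net/kml/2.2";

    // Parses existing KML, appends one Placemark to the root and returns the new XML.
    static String appendPlacemark(String kml, String name, String coordinates) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(kml)));

            Element placemark = doc.createElementNS(KML_NS, "Placemark");
            placemark.appendChild(textElement(doc, "name", name));
            Element point = doc.createElementNS(KML_NS, "Point");
            point.appendChild(textElement(doc, "coordinates", coordinates));
            placemark.appendChild(point);
            // Appending to the root puts the new Placemark right before the closing kml tag.
            doc.getDocumentElement().appendChild(placemark);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not update KML", e);
        }
    }

    // Creates a namespaced element that contains only text.
    private static Element textElement(Document doc, String tag, String text) {
        Element element = doc.createElementNS(KML_NS, tag);
        element.setTextContent(text);
        return element;
    }
}
```

Because the text is set through setTextContent, a name containing "<" or "&" is escaped by the serializer, which the string-concatenation approach does not do.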
It's working now, thanks for your input rolfl!
In my program I have implemented the method with the JDOM library, which is much more comfortable. Anyhow, here is the working code of my first try, in case someone is interested.
The output is not pretty-printed, but the KML file works.
public void createKMLFile(){
String kmlstart = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
"<kml xmlns=\"http://www.opengis.net/kml/2.2\">\n";
String kmlelement ="\t<Placemark>\n" +
"\t<name>Simple placemark</name>\n" +
"\t<description>"+name+"</description>\n" +
"\t<Point>\n" +
"\t\t<coordinates>"+latlon[1]+","+latlon[0]+","+z+ "</coordinates>\n" +
"\t</Point>\n" +
"\t</Placemark>\n";
String kmlend = "</kml>";
ArrayList<String> content = new ArrayList<String>();
content.add(0,kmlstart);
content.add(1,kmlelement);
content.add(2,kmlend);
String kmltest = content.get(0) + content.get(1) + content.get(2);
File testexists = new File(datapath+"/"+name+".kml");
Writer fwriter;
if(!testexists.exists()){
try {
fwriter = new FileWriter(datapath+"/"+name+".kml");
fwriter.write(kmltest);
fwriter.flush();
fwriter.close();
}catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
}
else{
// loop variable
String filecontent ="";
ArrayList<String> newoutput = new ArrayList<String>();
try {
BufferedReader in = new BufferedReader(new FileReader(testexists));
while((filecontent = in.readLine()) !=null)
newoutput.add(filecontent);
} catch (FileNotFoundException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
newoutput.add(2,kmlelement);
String rewrite ="";
for(String s : newoutput){
rewrite += s;
}
try {
fwriter = new FileWriter(datapath+"/"+name+".kml");
fwriter.write(rewrite);
fwriter.flush();
fwriter.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}

Reading unicode characters from csv file

I have a csv file which contains words in English followed by their Hindi translation. I am trying to read the csv file and do some further processing with it. The csv file looks like this:
English,,Hindi,,,
,,,,,
Cat,,बिल्ली,,,
Rat,,चूहा,,,
abandon,,छोड़ देना,त्याग देना,लापरवाही की स्वतन्त्रता,जाने देना
I am trying to read the csv file line by line and display what has been written. The code snippet (Java) is as follows:
//Step 2. Read csv file and get the string.
FileInputStream fis = null;
BufferedReader br = null;
try {
fis = new FileInputStream(new File(csvFile));
} catch (FileNotFoundException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
boolean startSeen = true;
if(fis != null) {
try {
br = new BufferedReader(new InputStreamReader(fis, "UTF-8"));
} catch (UnsupportedEncodingException e2) {
// TODO Auto-generated catch block
e2.printStackTrace();
System.out.print("Unsupported encoding");
}
String line = null;
if(br != null) {
try {
while((line = br.readLine()) != null) {
if(line.contains("English") == true) {
startSeen = true;
}
if((startSeen == true) && (line != null)) {
StringBuffer sbuf = new StringBuffer();
//Step 3. Parse the line.
sbuf.append(line);
System.out.println(sbuf.toString());
}
}
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
}
}
However, the following output is what I get:
English,,Hindi,,,
,,,,,
Cat,,??????,,,
Rat,,????,,,
abandon,,???? ????,????? ????,???????? ?? ???????????,???? ????
My Java is not that great and though I have gone through a number of posts on SO, I need more help in figuring out the exact cause of this problem.
For reading a text file it is better to use a character stream, e.g. java.util.Scanner, instead of using FileInputStream directly. As for the encoding, first make sure that the text file you want to read is actually saved as UTF-8. I also noticed that on my system I had to save my Java source file as UTF-8 as well for the Hindi characters to be shown properly.
Here is a simpler way to read the csv file:
// Pass the charset explicitly; otherwise Scanner uses the platform default.
Scanner scan = new Scanner(new File(csvFile), "UTF-8");
while(scan.hasNextLine()){
System.out.println(scan.nextLine());
}
I think your console cannot show Hindi chars. Try
System.out.println("Cat,,बिल्ली,,,");
to test
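If the console can display UTF-8 but System.out encodes with a different platform default, wrapping it in a PrintStream with an explicit encoding may help. This is a minimal sketch with made-up class and method names; whether the characters actually render still depends on the console and its font. The Hindi word is written with Unicode escapes so that the example does not depend on the encoding of the source file.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8ConsoleSketch {

    // Wraps any output stream in a PrintStream that encodes text as UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        // The escapes below spell the Hindi word for "cat" from the csv file.
        out.println("Cat,,\u092c\u093f\u0932\u094d\u0932\u0940,,,");
    }
}
```

If this prints the word correctly while plain System.out.println prints question marks, the problem is the output encoding rather than the way the file is read.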
So, as discussed in the answers above, the solution takes two steps:
1) Save your txt file as UTF-8
2) Change the properties of your Java source file to use UTF-8
In Eclipse, right-click the Java file, then:
Properties -> Resource -> Text File Encoding -> Other -> UTF-8
See the screenshot at
http://howtodoinjava.com/2012/11/27/how-to-compile-and-run-java-program-written-in-another-language/

SetText String[] in a TextView

I am trying to use setText, and I want to use a String array. First, I create a String [], then I assign data to String[0], then I want to .setText(String[0]) on my TextView, is this the right way?
Note : I'm using a StringTokenizer to split Strings in the textfile
try {
filename = "myk.txt";
FileReader filereader = new FileReader(Environment.getExternalStorageDirectory() + "/Q/" + filename);
BufferedReader bufferedreader = new BufferedReader(filereader);
try {
while ((text = bufferedreader.readLine()) != null){
sb.append(text);
sb.toString().split(";");
tokens = new StringTokenizer(sb.toString(), ";");
// NullPointerException here:
if (tokens.countTokens() > 0){questionfromfile[0] = tokens.nextToken();
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
// etc. ... and now:
textview.setText(question[0]);
Sure, you mean something like
String[] strings = new String [5];
strings[0] = "foobar";
component.setText(strings[0]);
Why do you have this line?
sb.toString().split(";");
Are you forgetting that a String is immutable? The standard API methods you call never change the string itself; they create new objects instead, so the result of split() is simply thrown away here.
About StringTokenizer, as the javadocs say:
StringTokenizer is a legacy class that is retained for compatibility
reasons although its use is discouraged in new code. It is recommended
that anyone seeking this functionality use the split method of String
or the java.util.regex package instead.
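Following that recommendation, the tokenizing step could be sketched with split as shown below. The class and method names are made up, and ";" as the separator is taken from the question. The NullPointerException on the marked line would be consistent with questionfromfile being declared but never created with new, although that part of the code is not shown; returning a list avoids the problem altogether.

```java
import java.util.ArrayList;
import java.util.List;

class SplitSketch {

    // Splits the file content at ";" and drops empty entries.
    static List<String> parseQuestions(String content) {
        List<String> questions = new ArrayList<String>();
        for (String part : content.split(";")) {
            String question = part.trim();
            if (!question.isEmpty()) {
                questions.add(question);
            }
        }
        return questions;
    }
}
```

With this in place, the first question could be shown with textview.setText(questions.get(0)) after checking that the list is not empty.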
