I am reading a text file using java Scanner.
try {
while(sc.hasNextLine()) {
//Read input from file
inputLine = sc.nextLine().toUpperCase();
System.out.println(inputLine);
}
The code above produces the output below, even though my text file contains only "aabbcc".
How can I keep the Scanner from reading the garbage?
Thanks.
{\RTF1\ANSI\ANSICPG1252\COCOARTF1265\COCOASUBRTF210
{\FONTTBL\F0\FSWISS\FCHARSET0 HELVETICA;}
{\COLORTBL;\RED255\GREEN255\BLUE255;}
\PAPERW11900\PAPERH16840\MARGL1440\MARGR1440\VIEWW10800\VIEWH8400\VIEWKIND0
\PARD\TX566\TX1133\TX1700\TX2267\TX2834\TX3401\TX3968\TX4535\TX5102\TX5669\TX6236\TX6803\PARDIRNATURAL
\F0\FS24 \CF0 AABBCC}
You are reading an RTF document. If you only want the text, you can read the file into a byte array and extract the text using Swing's RTFEditorKit.
Path path = Paths.get("path/to/file");
byte[] data = Files.readAllBytes(path);
RTFEditorKit rtfParser = new RTFEditorKit();
Document document = rtfParser.createDefaultDocument();
rtfParser.read(new ByteArrayInputStream(data), document, 0);
String text = document.getText(0, document.getLength());
I solved this by setting the TextEdit preference Format to "Plain text" and recreating the input file.
I now get the output without the garbage.
Source: File input in Java for Mac
The problem isn't that the Scanner is reading in garbage. It is that your file isn't plain text. From the looks of it, your file is actually "rich text", and that garbage contains formatting info. I was able to produce similar output by saving a .rtf using MS WordPad.
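A quick way to confirm this diagnosis before scanning is to look at the first few bytes of the file, since RTF documents start with the characters `{\rtf`. The following is only a minimal sketch; the class and method names (`RtfCheck`, `looksLikeRtf`) are made up for illustration and are not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class RtfCheck {
    // RTF documents begin with the ASCII characters {\rtf
    private static final byte[] RTF_MAGIC = "{\\rtf".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the file starts with the RTF header bytes. */
    static boolean looksLikeRtf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[RTF_MAGIC.length];
            int total = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return total == head.length && Arrays.equals(head, RTF_MAGIC);
        }
    }
}
```

If `looksLikeRtf` returns true, the file needs to be re-saved as plain text or parsed as RTF rather than read line by line.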
Related
My problem right now is:
In my text file I get some funky looking characters when trying to use ObjectOutputStream to write my object to my cases.txt file.
FileOutputStream file = new FileOutputStream("/cases.txt");
ObjectOutputStream data = new ObjectOutputStream(file);
DefectProduct s1 = new DefectProduct("test", 5, "test", "test",
"test", 50, 2019, 1, 24, "505", "test" );
data.writeObject(s1);
data.flush();
data.close();
System.out.println("Record added");
For now I'm trying to get the part of writing my object to a textfile to work. I read somewhere that ObjectOutputStream was the choice for writing objects to text files. Is this correct?
In the end my goal is to take a whole arraylist of objects and write the whole arraylist to my text file at once.
ObjectOutputStream does not produce a human-readable stream (i.e. you cannot open it in a text editor and expect to read the content). The text editor tries to show everything in the file as text, but the file does not contain valid text, which is why the editor shows unreadable characters.
The purpose of ObjectOutputStream is to generate data that can later be read back using ObjectInputStream, but nothing guarantees that this data is valid text.
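To illustrate the round trip, here is a small sketch that writes a whole ArrayList in one call and reads it back. The `Item` class is a hypothetical stand-in for the question's `DefectProduct` (whose definition is not shown); the only requirement is that the class implements `Serializable`.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

final class SerializationRoundTrip {
    /** Hypothetical stand-in for the question's DefectProduct class. */
    static final class Item implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int quantity;

        Item(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }
    }

    /** Writes the whole list as one serialized (binary, not text) object. */
    static void save(File file, ArrayList<Item> items) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(items);
        }
    }

    /** Reads the list back; this only works on files produced by save(). */
    @SuppressWarnings("unchecked")
    static ArrayList<Item> load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (ArrayList<Item>) in.readObject();
        }
    }
}
```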
It does not matter that your output file is named cases.txt. The file extension is just part of the name and a hint to the user or the operating system about what might be inside; it says nothing about what is actually stored in the file. You can name your output file cases.mp3, but that will not make it a music file.
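If the goal is a file you can actually read in a text editor, one option is to write each object as a line of text yourself. This is only a sketch under assumptions: `Item` is a hypothetical two-field class standing in for `DefectProduct`, and the comma-separated layout is a naive choice that does no escaping of commas inside values.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class TextExport {
    /** Hypothetical stand-in for the question's DefectProduct class. */
    static final class Item {
        final String name;
        final int quantity;

        Item(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }
    }

    /** Writes one "name,quantity" line per item as UTF-8 text. */
    static void writeAsText(Path file, List<Item> items) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Item item : items) {
            lines.add(item.name + "," + item.quantity);
        }
        // Files.write creates the file, or replaces its content if it exists
        Files.write(file, lines, StandardCharsets.UTF_8);
    }
}
```

The trade-off is that you then have to write the matching parsing code yourself, whereas ObjectInputStream restores serialized objects for you.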
I am facing a problem saving a text file in UTF-8 format using Java. When I click "Save As" for the generated text file, it shows ANSI as the text encoding instead of UTF-8. Below is the code I use to create the file:
String excelFile="";
excelFile = getFullFilePath(fileName);
File file = new File(excelFile + fileName,"UTF-8");
File output = new File(excelFile,"UTF-8");
FileUtils.writeStringToFile(file, content, "UTF-8");
While creating the text file, I am using UTF-8 encoding, but the file still shows the encoding as ANSI while saving.
Kindly help.
Instead of using File, create a FileOutputStream.
Try this.
Writer out = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("outfilename"), "UTF-8"));
try {
out.write(aString);
} finally {
out.close();
}
I had a very similar problem and solved it by saving the original file with UTF-8 encoding. In this case, open the Excel file and save it again, or create a new file and copy the content into it, and make sure that its encoding is UTF-8. This link has a tutorial on how to save an Excel file with UTF-8 encoding: https://help.surveygizmo.com/help/encode-an-excel-file-to-utf-8-or-utf-16. Apart from that, your code seems to be correct.
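Two things may be worth checking here; neither is confirmed by the question. First, `new File(String, String)` takes a parent path and a child name, so the "UTF-8" argument in the question's code is treated as a file name, not as an encoding. Second, text that contains only ASCII characters produces exactly the same bytes in UTF-8 as in the single-byte "ANSI" code pages, so an editor cannot tell the two apart and may report ANSI. The sketch below shows both a UTF-8 write and that byte comparison; `Utf8Demo` and its method names are made up for illustration, and ISO-8859-1 is used as a stand-in for an ANSI code page.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class Utf8Demo {
    /** Writes content to the given path as UTF-8 (without a byte order mark). */
    static void writeUtf8(Path file, String content) throws IOException {
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            out.write(content);
        }
    }

    /** True if the text has identical bytes in UTF-8 and in ISO-8859-1. */
    static boolean sameBytesInLatin1(String text) {
        return Arrays.equals(
                text.getBytes(StandardCharsets.UTF_8),
                text.getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

For example, `sameBytesInLatin1("aabbcc")` is true, while a string containing "ê" gives false because that character takes two bytes in UTF-8.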
I'm developing an Android app which is supposed to present large text files (books for example), for the user to browse, read, and search.
My question is as follows:
How should I read and present the text file, which is currently in either a PDF or Word format, and is formatted?
What file should the text be in (.doc, .txt, .xml, .html)?
What controls/elements and code should I use to read it on the app so that it should be presented efficiently and formatted correctly (TextView, WebView, PDF reader, or some other way)?
Thanks.
It really depends on your application and programming skills.
Grabbing text from PDFs, Word files, etc. can be done using libraries (lots are available).
You can save your files as raw text files (they would be readable manually). If you want to secure them, you can encrypt them before saving, and you can save them with a custom extension (.lib, .abc, etc.).
TextViews are the easiest option, as you can change the colors, fonts, and text size; they are really fast and easy to deal with.
Edit : example of reading a text file
File myFile = new File("/sdcard/filename.txt");
FileInputStream iStr = new FileInputStream(myFile);
BufferedReader fileReader = new BufferedReader(new InputStreamReader(iStr ));
String TextLine= "";
String TextBuffer = "";
while ((TextLine= fileReader.readLine()) != null) {
TextBuffer += TextLine+ "\n";
}
textView1.setText(TextBuffer );
fileReader.close();
example of writing a text file :
File myFile = new File("/sdcard/filename.txt");
myFile.createNewFile();
FileOutputStream oStr = new FileOutputStream(myFile);
OutputStreamWriter fileWriter= new OutputStreamWriter(oStr);
fileWriter.append(textView1.getText());
fileWriter.close();
oStr.close();
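For large files such as books, repeated String concatenation in the read loop gets slow, and relying on the default charset can garble non-ASCII text. A possible variant of the reading example, written as plain Java rather than Android-specific code, uses a StringBuilder, an explicit charset, and try-with-resources. The class name `TextFileReader` is made up, and UTF-8 is an assumption about how the file was saved.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class TextFileReader {
    /** Reads the whole file as UTF-8 text, ending every line with '\n'. */
    static String readAll(File file) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream) even on errors
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }
}
```

The returned string can then be passed to `textView1.setText(...)` as in the example above.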
How to write encoded text to a file using java/jsp with FileWriter?
FileWriter testfilewriter = new FileWriter(testfile, true);
testfilewriter.write(testtext);
testtext:- is text
testfile:- is String (Encoded)
What I'm trying to do is encode testfile with Base64 and store it in a file. What is the best way to do so?
Since your data is not plain text, you can't use FileWriter. What you need is FileOutputStream.
Encode your text as Base64:
byte[] encodedText = Base64.encodeBase64( testtext.getBytes("UTF-8") );
and write to file:
try (OutputStream stream = new FileOutputStream(testfile)) {
stream.write(encodedText);
}
or if you don't want to lose existing data, write in append mode by setting append boolean to true:
try (OutputStream stream = new FileOutputStream(testfile, true)) {
stream.write(encodedText);
}
You can do the encoding yourself and then write to the file as suggested by @Alper. Or, if you want a stream that encodes and decodes while writing to and reading from the file, the Apache Commons Codec library will come in handy; see Base64OutputStream and Base64InputStream.
Interestingly, Java 8 has a similar API, Base64.Encoder. Check out its wrap method.
Hope this helps.
The approach to follow depends on the algorithm you are using; writing the encoded file is the same as writing any other file in Java.
IMHO, if you are trying to do this in JSP, go with servlets instead. JSPs are not meant for the business layer; servlets are.
I'm not going to give the code, as it is pretty easy if you try it. I'll share the best way to do it as pseudocode. Here are the steps to write your encoded text:
Open input file in read mode & output file in append mode.
If input file isn't huge (it can fit in memory) then read whole file at once, otherwise read line-by-line.
Encode the text retrieved from file using Base64Encoder
Write in the output file in append mode.
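The steps above could be sketched roughly as follows for the small-file case, using `java.util.Base64` from Java 8. The class and method names are made up for illustration. Each call appends one Base64 line to the output file, which keeps separately encoded chunks distinguishable when decoding later; that line-per-chunk layout is a choice of this sketch, not something the steps prescribe.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Base64;

final class Base64FileEncoder {
    /** Reads the input file, Base64-encodes it, and appends the result as one line. */
    static void encodeAndAppend(Path input, Path output) throws IOException {
        // Read the whole input at once (assumes it fits in memory)
        byte[] raw = Files.readAllBytes(input);
        // Encode the bytes as Base64 text
        String encoded = Base64.getEncoder().encodeToString(raw);
        // Append to the output file, creating it if it does not exist yet
        Files.write(output,
                (encoded + "\n").getBytes(StandardCharsets.US_ASCII),
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND);
    }
}
```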
You can't use a FileWriter directly for this task.
You asked how you can do it, but you didn't give any information about which JDK and library you use, so here are a few solutions with the standard tools.
If you're using Java 8:
String testFile = "";
try (Writer writer = new OutputStreamWriter(
Base64.getEncoder().wrap(
java.nio.file.Files.newOutputStream(
Paths.get(testFile),
StandardOpenOption.APPEND)),
StandardCharsets.UTF_8)
) {
writer.write("text to be encoded in Base64");
}
If you're using Java 7 with Guava:
String testFile = "";
CharSink sink = BaseEncoding.base64()
.encodingSink(
com.google.common.io.Files.asCharSink(
new File(testFile),
StandardCharsets.UTF_8,
FileWriteMode.APPEND))
.asCharSink(StandardCharsets.UTF_8);
try (Writer writer = sink.openStream()) {
writer.write("text to be encoded in Base64");
}
If you're using Java 6 with Guava:
String testFile = "";
CharSink sink = BaseEncoding.base64()
.encodingSink(
com.google.common.io.Files.asCharSink(
new File(testFile),
Charsets.UTF_8,
FileWriteMode.APPEND))
.asCharSink(Charsets.UTF_8);
Closer closer = Closer.create();
try {
Writer writer = closer.register(sink.openStream());
writer.write("text to be encoded in Base64");
} catch (Throwable e) { // must catch Throwable
throw closer.rethrow(e);
} finally {
closer.close();
}
I don't have much knowledge about other libraries so I won't pretend I do and add another helper.
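For completeness, here is a sketch of reading such a file back with the Java 8 API, using `Base64.getDecoder().wrap(...)` as the counterpart of the encoder's `wrap`. The class and method names are made up for illustration. Unlike the snippets above, this version overwrites the file instead of appending, so that the file holds a single Base64 stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class Base64Streams {
    /** Writes text to the file as Base64, replacing any existing content. */
    static void writeEncoded(Path file, String text) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                Base64.getEncoder().wrap(Files.newOutputStream(file)),
                StandardCharsets.UTF_8)) {
            writer.write(text);
        } // closing the writer flushes the final Base64 padding
    }

    /** Reads a Base64 file and returns the decoded text. */
    static String readDecoded(Path file) throws IOException {
        try (InputStream in = Base64.getDecoder().wrap(Files.newInputStream(file))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```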
I have a CSV file that contains many records, and I noticed that some of the records contain French characters. My script reads each record, processes it, and inserts the processed record into an XML file. When we view the .csv file in a terminal using the Vim editor on a Fedora system, the French characters are displayed correctly. But after processing, these characters are no longer displayed properly. They are also not displayed properly when such a record is printed to the console.
For example:
String in .csv file : Crêpe Skirt
String in XML : Cr�pe Skirt
Code snippet for reading the file:
BufferedReader file = new BufferedReader(new FileReader(fileLocation));
String line = file.readLine();
Kindly suggest a way to handle such issue.
You need to know what encoding the file is in (probably UTF-8) and then specify the same encoding when you open the file in Java.
Try reading the file as a UTF-8 file, and declare the encoding of your XML file as UTF-8 too.
BufferedReader reader=new BufferedReader(new InputStreamReader(new FileInputStream(your-file-path),"UTF-8"));
String line="";
while((line=reader.readLine())!=null) {
//Do your work here
}
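If specifying UTF-8 still produces the replacement character, the file may be in a different encoding. The symptom in the question is what you get when bytes written in a single-byte encoding such as ISO-8859-1 are decoded as UTF-8; whether that is actually the case for this CSV file is an assumption, not something the question confirms. Note also that `FileReader(String)` uses the platform default charset, so the result can differ between machines. The following sketch reproduces the effect in memory; the class name is made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    /** Decodes raw bytes using the given charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "Crêpe Skirt" as it would be stored in an ISO-8859-1 file (ê is the single byte 0xEA)
        byte[] latin1Bytes = "Cr\u00EApe Skirt".getBytes(StandardCharsets.ISO_8859_1);

        // Wrong charset: 0xEA is not valid UTF-8 here, so it becomes U+FFFD
        System.out.println(decode(latin1Bytes, StandardCharsets.UTF_8));

        // Matching charset: the original text is recovered
        System.out.println(decode(latin1Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

The charset passed to `InputStreamReader` has to match the one the file was actually saved in, whichever that turns out to be.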