Weird characters when reading an .srt file as text - Java

I am trying to read a file into a string. I set the encoding to UTF-8, but the output still contains some weird characters.
Here is my function to read the file:
private static String readFile(String path, boolean isRaw) throws UnsupportedEncodingException, FileNotFoundException{
File fileDir = new File(path);
try{
BufferedReader in = new BufferedReader(
new InputStreamReader(
new FileInputStream(fileDir), "UTF-8"));
String str;
while ((str = in.readLine()) != null) {
System.out.println(str);
}
in.close();
return str;
}
catch (UnsupportedEncodingException e)
{
System.out.println(e.getMessage());
}
catch (IOException e)
{
System.out.println(e.getMessage());
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
return null;
}
The output of the first line is: ��1
Here is my test file: https://www.dropbox.com/s/2linqmdoni77e5b/How.to.Get.Away.with.Murder.S01E01.720p.HDTV.X264-DIMENSION.srt?dl=0
Thanks in advance.

This file is encoded in UTF-16LE and starts with a byte order mark (BOM), which helps to identify the encoding. Use the "UTF-16LE" charset (or StandardCharsets.UTF_16LE) and skip the first character of the file (for example, by calling str.substring(1) on the first line).
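A minimal sketch of that approach, assuming the whole file fits in memory. The class name, the helper name and the placeholder path "subtitles.srt" are made up for illustration; only StandardCharsets.UTF_16LE and the idea of dropping the first character come from the answer above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class Utf16LeReader {
    // Decodes UTF-16LE bytes and drops a leading byte order mark (U+FEFF) if one is present.
    static String decodeUtf16Le(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_16LE);
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    public static void main(String[] args) throws IOException {
        // "subtitles.srt" is a placeholder; pass the real path as the first argument.
        String path = args.length > 0 ? args[0] : "subtitles.srt";
        byte[] raw = Files.readAllBytes(Paths.get(path));
        System.out.println(decodeUtf16Le(raw));
    }
}
```

Checking for U+FEFF before calling substring(1) keeps the helper safe for files that happen to have no BOM.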

It looks like your file starts with a byte order mark (BOM). If you don't need to handle the BOM in code, open the file in Notepad++ and convert it to UTF-8 without BOM.
To handle a BOM in Java, take a look at BOMInputStream from Apache Commons IO.
Example:
// Requires Apache Commons IO: org.apache.commons.io.ByteOrderMark and org.apache.commons.io.input.BOMInputStream
private static String readFile(String path, boolean isRaw) throws UnsupportedEncodingException, FileNotFoundException {
    File fileDir = new File(path);
    try {
        BOMInputStream bomIn = new BOMInputStream(new FileInputStream(fileDir), ByteOrderMark.UTF_16LE);
        // You can also detect UTF-8, UTF-16BE, UTF-32LE and UTF-32BE by using the constructor below
        // BOMInputStream bomIn = new BOMInputStream(new FileInputStream(fileDir), ByteOrderMark.UTF_16LE,
        //         ByteOrderMark.UTF_16BE, ByteOrderMark.UTF_32LE, ByteOrderMark.UTF_32BE, ByteOrderMark.UTF_8);
        if (bomIn.hasBOM()) {
            System.out.println("Input file starts with a BOM; the BOM has been removed");
        }
        // Decode with the charset named by the BOM; fall back to UTF-8 when no BOM was found
        String charsetName = bomIn.hasBOM() ? bomIn.getBOMCharsetName() : "UTF-8";
        BufferedReader in = new BufferedReader(new InputStreamReader(bomIn, charsetName));
        StringBuilder content = new StringBuilder();
        String str;
        while ((str = in.readLine()) != null) {
            System.out.println(str);
            content.append(str).append('\n');
        }
        in.close();
        // Return the collected text; str itself is always null once the loop ends
        return content.toString();
    } catch (IOException e) {
        // Also covers FileNotFoundException and UnsupportedEncodingException
        System.out.println(e.getMessage());
    }
    return null;
}

Related

Find and replace words in a text file (Java GUI)

I'm trying to create a find-and-replace Java application that asks the user for a text file, writes its contents to a new file, and asks for a search word or phrase and a word to replace it with. Here is the code I have so far. I can read the contents of the first file just fine, but I cannot write them to another file. This all happens inside a GUI; the code is below.
String loc = jTextField1.getText(); //gets location of initial file or "source"
String file = jTextField4.getText(); //new file path
String find = jTextField2.getText(); //find word inputted by user
String word = jTextField3.getText(); //replace "find" with word inputted by user
String line = null;
try {
BufferedReader br = new BufferedReader(new FileReader(loc));
while ((line = br.readLine()) !=null)
} catch (FileNotFoundException ex) {
Logger.getLogger(Assign6GUI.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(Assign6GUI.class.getName()).log(Level.SEVERE, null, ex);
}
To write content to a file, you can use a BufferedWriter:
// LOGGER is assumed to be a logger defined elsewhere in the class
public static void writetoFile(String str, String FILE_PATH, String FILENAME) {
    BufferedWriter writer = null;
    try {
        File dir = new File(FILE_PATH);
        // if the directory doesn't exist, create it (including missing parents)
        if (!dir.exists()) {
            dir.mkdirs();
        }
        // new File(parent, child) inserts the path separator if it is missing
        File file = new File(dir, FILENAME);
        file.createNewFile();
        writer = new BufferedWriter(new FileWriter(file));
        writer.write(str);
    } catch (IOException e) {
        LOGGER.debug(e);
    } finally {
        try {
            if (writer != null) {
                writer.close();
            }
        } catch (Exception e) {
            LOGGER.debug(e);
        }
    }
}
To replace words in a string, you can use String.replace:
String str = someString.replace("OldText", "NewText");
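Putting the pieces together, the whole task might look roughly like the sketch below. It is only one possible arrangement: the class and method names are invented for illustration, the four values come from command-line arguments instead of the text fields in the question, and UTF-8 is assumed for both files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FindAndReplace {
    // Literal (non-regex) replacement of every occurrence of 'find' with 'word'.
    static String replaceAllOccurrences(String text, String find, String word) {
        return text.replace(find, word);
    }

    // Reads the source file, applies the replacement and writes the result to the target file.
    static void copyWithReplacement(Path source, Path target, String find, String word) throws IOException {
        String text = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        String replaced = replaceAllOccurrences(text, find, word);
        Files.write(target, replaced.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Expected arguments: source path, target path, search text, replacement text.
        copyWithReplacement(Paths.get(args[0]), Paths.get(args[1]), args[2], args[3]);
    }
}
```

In the GUI, the same call could be made from the button handler with the four jTextField values in place of args.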

Formatting while writing a document

I am reading a txt file into a String buffer and writing the content into a Word document using OutputStreamWriter.
The problem is that the formatting is not retained in the document: the spaces and line breaks of the text file are lost. The txt file is properly formatted with spaces, page breaks, and tabs, and I want to replicate it in the Word document. Please suggest how the same formatting can be retained. The link to the file is: http://s000.tinyupload.com/index.php?file_id=09876662859146558533.
This is the sample code:
private static String readTextFile() {
BufferedReader br = null;
String content = null;
try {
br = new BufferedReader(new FileReader("ORDER_INVOICE.TXT"));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
line = br.readLine();
sb.append(System.lineSeparator());
}
content = sb.toString();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return content;
}
private static void createDocument(String docName, String content) {
FileOutputStream fout = null;
try {
fout = new FileOutputStream(docName);
OutputStreamWriter out = new OutputStreamWriter(fout);
out.write(content);
out.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
Try changing your readTextFile() like this:
BufferedReader br = null;
String content = ""; // start from an empty string; starting from null would prepend the text "null"
try {
br = new BufferedReader(new FileReader("ORDER_INVOICE.TXT"));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while(line != null) {
content += line + "\n";
line = br.readLine();
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
return content;
If you are using Java 7 or later, you can use try-with-resources to shorten the code.
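As a rough illustration of try-with-resources, the reading method could be shortened along these lines. The class and method names are made up for this sketch, and wrapping the IOException in an UncheckedIOException is just one possible choice.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class ReadTextFileDemo {
    // Reads everything from the reader, ending each line with the platform line separator.
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        // The reader is closed automatically when the try block exits, even after an exception.
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append(System.lineSeparator());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // File name taken from the question.
        System.out.print(readAll(new FileReader("ORDER_INVOICE.TXT")));
    }
}
```

The finally block and the explicit br.close() disappear, along with the risk of calling close() on a null reader.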
Avoid hard-coding \n. Windows uses \r\n, and line separators differ across platforms.
A more reliable way is to use PrintWriter; see "How to write new line in Java FileOutputStream".
After the discussion in the comments:
the source file has Unix line breaks
the output file is expected to have Windows line breaks
we should strip the 0x0c character (form feed, i.e. move to the next page on the printer) from the source file, as it is non-printable.
public static void main(String[] args) throws IOException {
    String content = new String(Files.readAllBytes(Paths.get("f:\\order_invoice.txt")))
            .replace("\u000c", "");
    PrintWriter printWriter = new PrintWriter(new FileWriter("f:\\new_order_invoice.txt"));
    for (String line : content.split("\\n")) {
        printWriter.println(line);
    }
    printWriter.close();
}
So:
read the file as it is into a String
get rid of the form feed (0x0c, Unicode U+000C)
split the string at Unix line breaks (\n)
write it out line by line using PrintWriter, which uses the platform default line ending, i.e. CR-LF on Windows.
Note that you could also do this in a single statement: use a regexp to replace Unix line endings with Windows line endings in the string holding the whole file, and use Files.write to write the file out in one call. However, the solution presented above is probably a bit better, as it always uses the platform's native line separators.
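For comparison, the regexp variant mentioned above might look like the following sketch. The class and method names are invented here, the paths come from command-line arguments, and the platform default charset is used, as in the code above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class LineEndings {
    // Removes form feeds and turns every LF or CRLF line break into CRLF.
    static String toWindowsLineEndings(String content) {
        return content.replace("\f", "").replaceAll("\\r?\\n", "\r\n");
    }

    public static void main(String[] args) throws IOException {
        // Expected arguments: source path, target path.
        String content = new String(Files.readAllBytes(Paths.get(args[0])));
        Files.write(Paths.get(args[1]), toWindowsLineEndings(content).getBytes());
    }
}
```

Matching an optional \r before \n avoids doubling the carriage return when the input already contains Windows line breaks.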

How to write Arabic letters with accents to a text file? Java

I have a problem with the NetBeans file viewer. I have a string in Arabic that includes accents on top of each letter. When I remove the accents from the string, the letters display correctly. However, when I write the string with the accents, it gets somehow disordered (incorrect).
This is an example of what is happening:
Text without accents (correct): بسم الله الرحمن الرحيم
Text with accents (incorrect): it shows as broken in the viewer, but if I copy it here it displays correctly
It should be like this (correct): بِسْمِ اللَّهِ الرَّحْمَنِ الرَّحِيمِ
The code I wrote reads a text file that contains an Arabic string along with its accents, writes it to a new file, and at the end replaces the old file with the new one. This is the code:
public void arabicReformer(File disordered) {
File output = new File("data/temp2.txt");
try {
BufferedReader br = new BufferedReader(
new InputStreamReader(new FileInputStream(disordered), "UTF8"));
BufferedWriter bw = new BufferedWriter(
new OutputStreamWriter(new FileOutputStream(output), "UTF8"));
String line;
while ((line = br.readLine()) != null) {
bw.write(line.trim() + "\n");
}
br.close();
bw.close();
} catch (UnsupportedEncodingException e) {
System.out.println(e.getMessage());
} catch (IOException e) {
System.out.println(e.getMessage());
} catch (Exception e) {
System.out.println(e.getMessage());
}
output.renameTo(disordered);
}
PS: when I copy and paste the incorrect Arabic string with the accents here, it displays correctly!
Good Day my Friend :)
Try using this code to read and print Arabic characters, and make sure that your file is actually encoded in UTF-8.
public void unicodeShow(String fileName) throws UnsupportedEncodingException, FileNotFoundException, IOException{
Reader reader = new InputStreamReader(new FileInputStream(fileName), "utf-8");
BufferedReader br = new BufferedReader(reader);
String a=br.readLine();
System.out.println(a);
}
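Since the text displays correctly when pasted elsewhere, the file contents may well be intact and only the viewer's rendering may be off. One hedged way to check is to list the code points that were read, so the accents (combining marks) become visible regardless of the font. The class and method names below are invented for this sketch, and the sample string is just the first word of the example, written with escapes.

```java
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class DiacriticsCheck {
    // Lists every code point as U+XXXX so that combining marks become visible.
    static String describeCodePoints(String text) {
        return text.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    // Encodes to UTF-8 and decodes again; the result should equal the input.
    static String roundTripUtf8(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // beh + kasra, seen + sukun, meem + kasra
        String sample = "\u0628\u0650\u0633\u0652\u0645\u0650";
        System.out.println(describeCodePoints(roundTripUtf8(sample)));
    }
}
```

If the listed code points match the expected letters and marks after reading the file, the data is fine and the issue lies in how the viewer draws it.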

StringBuffer replace method doesn't work

I want to read a text file in the same folder as my Java program. I have a readFile() method that reads the content of the file line by line, and setName() then replaces a part of the content. The program compiles and runs without errors, but the file's content doesn't change at all.
Thank you
public StringBuffer readFile(){ //read file line by line
URL url = getClass().getResource("test.txt");
File f = new File(url.getPath());
StringBuffer sb = new StringBuffer();
String textinLine;
try {
FileInputStream fs = new FileInputStream(f);
InputStreamReader in = new InputStreamReader(fs);
BufferedReader br = new BufferedReader(in);
while (true){
textinLine = br.readLine();
if (textinLine == null) break;
sb.append(textinLine);
}
fs.close();
in.close();
br.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return sb;
}
public void setName(String newName){
StringBuffer sb = readFile();
int pos = sb.indexOf("UserName=");
sb.replace(pos, pos+newName.length(), newName);
}
You have to write the content back to the file for it to change. Your code only reads the file into a StringBuffer and modifies that in-memory copy; it never writes anything to disk.
Once you have changed the content, write the new content to the file, like this:
try{
FileWriter fwriter = new FileWriter(YourFile);
BufferedWriter bwriter = new BufferedWriter(fwriter);
bwriter.write(sb.toString());
bwriter.close();
}
catch (Exception e){
e.printStackTrace();
}
You don't change the content of the file; you change the content of the StringBuffer. If you print the StringBuffer (System.out.println(sb.toString())) before and after the sb.replace call, you will see what changes are being made.
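A sketch of the full read, modify and write-back cycle is shown below. It assumes the file contains a line of the form UserName=<value>, which the question does not actually state, and the class and method names are invented. It also replaces the value after "UserName=" rather than starting at the key itself, which is where the replace call in the question begins.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class UserNameEditor {
    // Replaces the value after "UserName=" up to the end of that line (or of the text).
    static String withUserName(String content, String newName) {
        String key = "UserName=";
        int pos = content.indexOf(key);
        if (pos < 0) {
            return content; // key not found, nothing to change
        }
        int start = pos + key.length();
        int end = content.indexOf('\n', start);
        if (end < 0) {
            end = content.length();
        }
        if (end > start && content.charAt(end - 1) == '\r') {
            end--; // keep the \r of a Windows line break
        }
        return new StringBuilder(content).replace(start, end, newName).toString();
    }

    public static void main(String[] args) throws IOException {
        // Expected arguments: file path, new user name.
        Path file = Paths.get(args[0]);
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        // Writing the modified text back is the step missing from the question's setName().
        Files.write(file, withUserName(content, args[1]).getBytes(StandardCharsets.UTF_8));
    }
}
```

Reading the whole file at once also keeps the line breaks, which the readLine() loop in the question drops.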

Android won't write new line in text file

I am trying to write a new line to a text file in Android.
Here is my code:
FileOutputStream fOut;
try {
String newline = "\r\n";
fOut = openFileOutput("cache.txt", MODE_WORLD_READABLE);
OutputStreamWriter osw = new OutputStreamWriter(fOut);
osw.write(data);
osw.write(newline);
osw.flush();
osw.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
I have tried \n and \r\n, and I also tried the system property for a new line; none of them work.
The data variable contains the previous data from the same file.
String data = "";
try {
FileInputStream in = openFileInput("cache.txt");
StringBuffer inLine = new StringBuffer();
InputStreamReader isr = new InputStreamReader(in, "ISO8859-1");
BufferedReader inRd = new BufferedReader(isr,8 * 1024);
String text;
while ((text = inRd.readLine()) != null) {
inLine.append(text);
}
in.close();
data = inLine.toString();
} catch (FileNotFoundException e1) {
e1.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
I had the same problems, tried every trick in the book.
My problem: the newlines were written, but they were removed while reading:
while (readString != null) {
datax.append(readString);
readString = buffreader.readLine();
}
The file was read line by line and concatenated, so the newlines disappeared.
I did not look at the original file in Notepad or something similar because I didn't know where to look on my phone, and my log screen used the code that removed the newlines :-(
So the simple solution was to put them back while reading:
while (readString != null) {
datax.append(readString);
datax.append("\n");
readString = buffreader.readLine();
}
I executed a similar program and it worked for me. I observed a strange behavior though. It added those new lines to the file, however the cursor remained at the first line. If you want to verify, write a String after your newline characters, you will see that the String is written just below those new lines.
I was having the same problem and was unable to write a newline. I now use BufferedWriter to write a new line into the file, and it works for me.
Here is a sample code snippet:
OutputStreamWriter out = new OutputStreamWriter(openFileOutput("cache.txt", 0));
BufferedWriter bwriter = new BufferedWriter(out);
// write the contents to the file
bwriter.write("Input String"); // Enter the string here
bwriter.newLine();
// close the writer so the buffered text is flushed to the file
bwriter.close();
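The same idea can be tried off the device with plain Java. The sketch below writes into a StringWriter instead of Android's openFileOutput, which is an assumption made only so that it runs anywhere; the class and method names are also made up.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class NewLineDemo {
    // Writes each line followed by the platform line separator and returns the result.
    static String joinWithNewLines(String... lines) {
        StringWriter target = new StringWriter();
        // try-with-resources closes the writer, which flushes its buffer into 'target'.
        try (BufferedWriter writer = new BufferedWriter(target)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinWithNewLines("first", "second"));
    }
}
```

On Android, the StringWriter would be replaced by new OutputStreamWriter(openFileOutput("cache.txt", 0)) as in the snippet above.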
