I'm trying to read a set of key/value pairs (like a dictionary) from a file. For this I'm using the following code:
InputStream is = this.getClass().getResourceAsStream(PROPERTIES_BUNDLE);
properties=new Hashtable();
InputStreamReader isr=new InputStreamReader(is);
LineReader lineReader=new LineReader(isr);
try {
while (lineReader.hasLine()) {
String line=lineReader.readLine();
if(line.length()>1 && line.substring(0,1).equals("#")) continue;
if(line.indexOf("=")!=-1){
String key=line.substring(0,line.indexOf("="));
String value=line.substring(line.indexOf("=")+1,line.length());
properties.put(key, value);
}
}
} catch (IOException e) {
e.printStackTrace();
}
And the readLine function.
public String readLine() throws IOException{
int tmp;
StringBuffer out=new StringBuffer();
//Read in data
while(true){
//Check the bucket first. If empty read from the input stream
if(bucket!=-1){
tmp=bucket;
bucket=-1;
}else{
tmp=in.read();
if(tmp==-1)break;
}
//If new line, then discard it. If we get a \r, we need to look ahead so can use bucket
if(tmp=='\r'){
int nextChar=in.read();
if(nextChar!='\n')bucket=nextChar;//Ignores \r\n, but not \r\r
break;
}else if(tmp=='\n'){
break;
}else{
//Otherwise just append the character
out.append((char) tmp);
}
}
return out.toString();
}
This works, but I also want it to handle special characters. For example, ó is encoded in the file as \u00F3, but the escape is not replaced with the correct character. What would be the way to do it?
EDIT: Forgot to say that since I'm using JavaME the Properties class or anything similar does not exist, that's why it may seem that I'm reinventing the wheel...
If it's encoded with UTF-16, can you not just use
InputStreamReader isr = new InputStreamReader(is, "UTF-16");
This would recognize your special characters right from the get-go and you wouldn't need to do any replacements.
You need to ensure that the character encoding set in your InputStreamReader matches that of the file. If it doesn't, some characters may be decoded incorrectly.
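If the file contains the literal six-character escape \u00F3 (as a java.util.Properties-style file would) rather than the encoded character itself, choosing a charset on the reader will not convert it; the escape has to be decoded after each line is read. A minimal sketch, assuming that escape format. The names UnicodeUnescaper and unescape are made up for this example, and it relies only on StringBuffer and Character.digit, which should be available on JavaME:

```java
final class UnicodeUnescaper {
    /** Replaces each well-formed backslash-u escape (four hex digits) with the character it denotes. */
    static String unescape(String in) {
        StringBuffer out = new StringBuffer(in.length());
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            // An escape needs 6 characters: backslash, 'u', four hex digits
            if (c == '\\' && i + 5 < in.length() && in.charAt(i + 1) == 'u') {
                int code = 0;
                boolean valid = true;
                for (int k = 2; k < 6; k++) {
                    int digit = Character.digit(in.charAt(i + k), 16);
                    if (digit < 0) {
                        valid = false; // not a hex digit, so this is not an escape
                        break;
                    }
                    code = code * 16 + digit;
                }
                if (valid) {
                    out.append((char) code);
                    i += 6;
                    continue;
                }
            }
            // Anything else is copied through unchanged
            out.append(c);
            i++;
        }
        return out.toString();
    }
}
```

In the parsing loop from the question, this would be applied to the value before storing it, e.g. properties.put(key, UnicodeUnescaper.unescape(value)).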
I am trying to read a binary file in Java using a BufferedReader. I wrote that file using "UTF-8" encoding. The code for writing the file:
byte[] inMsgBin=null;
try {
inMsgBin = String.valueOf(cypherText).getBytes("UTF-8");
//System.out.println("CIPHER TEXT:FULL:BINARY WRITE: "+inMsgBin);
} catch (UnsupportedEncodingException ex) {
Logger.getLogger(EncDecApp.class.getName()).log(Level.SEVERE, null, ex);
}
try (FileOutputStream out = new FileOutputStream(fileName+ String.valueOf(new SimpleDateFormat("yyyyMMddhhmm").format(new Date()))+ ".encmsg")) {
out.write(inMsgBin);
out.close();
} catch (IOException ex) {
Logger.getLogger(EncDecApp.class.getName()).log(Level.SEVERE, null, ex);
}
System.out.println("cypherText charCount="+cypherText.length());
Here 'cypherText' is a String with some content. The total number of characters written to the file is reported as 19. After writing, when I open the file in Notepad++, it shows some characters, and selecting all of the content gives a count of 19 characters in total.
Now when I read the same file using BufferedReader, using the following lines of code:
try
{
DecMessage obj2= new DecMessage();
StringBuilder cipherMsg=new StringBuilder();
try (BufferedReader in = new BufferedReader(new FileReader(filePath))) {
String tempLine="";
fileSelect=true;
while ((tempLine=in.readLine()) != null) {
cipherMsg.append(tempLine);
}
}
System.out.println("FROM FILE: charCount= "+cipherMsg.length());
Here the total no of characters read (stored in 'charCount') is 17 instead of 19.
How can I read all the characters of the file correctly?
Specify the same charset while reading file.
try (final BufferedReader br = Files.newBufferedReader(new File(filePath).toPath(),
StandardCharsets.UTF_8))
UPDATE
Now I understand your problem. Thanks for the file.
Again: your file is still readable by any text editor such as Notepad++. Because your content includes extended and control characters, some of them show up as unreadable symbols, but the file is still text (UTF-8 here, rather than plain ASCII).
Now back to your problem. There are two problems with your code.
First, when reading the file you should specify the correct charset. Readers are character readers: bytes are converted into characters as they are read. If you specify a charset, the reader uses it; otherwise it uses the system default charset. So you should create the BufferedReader as follows:
try (final BufferedReader br = Files.newBufferedReader(new File(filePath).toPath(),
StandardCharsets.UTF_8))
Second, your content includes control characters. When reading line by line, BufferedReader.readLine() treats \n, \r, and \r\n as line terminators (regardless of the system default) and strips them from the returned string. That is why you get 17 instead of 19: two of your characters are CR. To avoid this, read character by character:
int ch;
while ((ch = br.read()) > -1) {
buffer.append((char)ch);
}
Overall, the method below returns the proper text.
static String readCyberText() {
StringBuilder buffer = new StringBuilder();
try (final BufferedReader br = Files.newBufferedReader(new File("C:\\projects\\test2201404221017.txt").toPath(),
StandardCharsets.UTF_8)){
int ch;
while ((ch = br.read()) > -1) {
buffer.append((char)ch);
}
return buffer.toString();
}
catch (IOException e) {
e.printStackTrace();
return null;
}
}
And you can test by
String s = readCyberText();
System.out.println(s.length());
System.out.println(s);
and output as
19
ia#
m©Ù6ë<«9K()il
Note: the length of the String is 19, but the console appears to show only 17 characters, because it treats the two CR characters as line breaks and continues the output on a new line. The String itself contains all 19 characters.
I am trying to change the format of a text file that has line numbers every couple of lines, to make it cleaner and easier to read. I wrote a simple program that replaces the first three characters of each line with spaces; those three character positions are where the numbers can appear. The actual text doesn't start until a few more spaces in. When I print the end result, it contains diamonds with a question mark inside, which I assume stand for missing characters. Most of the missing characters seem to be apostrophes. If anyone could let me know how to fix this I would really appreciate it :)
public class Conversion {
public static void main(String args[]) throws IOException {
BufferedReader scan = null;
try {
scan = new BufferedReader(new FileReader(new File("C:\\Users\\Nasir\\Desktop\\Beowulftesting.txt")));
} catch (FileNotFoundException e) {
System.out.println("failed to read file");
}
String finalVersion = "";
String currLine;
while( (currLine = scan.readLine()) !=null){
if(currLine.length()>3)
currLine = " "+ currLine.substring(3);
finalVersion+=currLine+"\n";
}
scan.close();
System.out.println(finalVersion);
}
}
Instead of using FileReader, use an InputStreamReader with the correct text encoding. I think the strange characters are appearing because you're reading the file with the wrong encoding.
By the way, don't use += with strings in a loop, like you have. Instead, use a StringBuilder:
StringBuilder finalVersion = new StringBuilder();
String currLine;
while ((currLine = scan.readLine()) != null) {
if (currLine.length() > 3) {
finalVersion.append(" ").append(currLine.substring(3));
} else {
finalVersion.append(currLine);
}
finalVersion.append('\n');
}
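To illustrate the InputStreamReader suggestion, here is a minimal sketch, assuming the file's real encoding is known. The class name EncodedFileReader is made up for this example, and which charset is right for your file is a guess on my part: text with curly apostrophes saved on Windows is often windows-1252, otherwise try UTF-8.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

final class EncodedFileReader {
    /** Reads the whole file as text, decoding its bytes with the given charset. */
    static String readAll(File file, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        // Unlike FileReader, InputStreamReader lets the caller name the charset explicitly
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }
}
```

It would be called as EncodedFileReader.readAll(new File(path), Charset.forName("windows-1252")), or with StandardCharsets.UTF_8 if that turns out to be the actual encoding. If the replacement characters disappear, the charset was the problem.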
I'm working on a program that needs to update a line whose value depends on a line read later in the file. I thought I could use two BufferedReaders in Java: one stays positioned on the line to update while the other reads ahead to the line that determines the value (an unknown number of lines ahead). The problem is that I'm using two BufferedReaders on the same file, and even though I think I'm handling the indexes correctly, the results I see in the debugger don't seem reliable.
Here's the code:
String outFinalFileName=fileOut;
File fileDest=new File(outFinalFileName);
try {
fout = new BufferedWriter(
new OutputStreamWriter(
new FileOutputStream(fileDest)));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
FileReader inputFile=null;
try {
inputFile = new FileReader(inFileName);
} catch (FileNotFoundException e2) {
e2.printStackTrace();
}
BufferedReader fin = new BufferedReader(inputFile);
BufferedReader finChecker = new BufferedReader(inputFile); //Checks the file and matches record to change
String line="";
String lineC="";
int lineNumber=0;
String recordType="";
String statusCode="";
try {
while ((lineC = finChecker.readLine()) != null) {
lineNumber++;
if (lineNumber==1)
line=fin.readLine();
recordType=lineC.substring(0,3);//Gets current Record Type
if (recordType.equals("35")){
while(!line.equals(lineC)){
line=fin.readLine();
if (line==null)
break;
fout.write(line);
}
}else if (recordType.equals("32")){
statusCode=lineC.substring(4,7);
if(statusCode.equals("XX")){
updateRecordLine(line,fout);
}
}
}
returnVal=true;
} catch (IOException e) {
e.printStackTrace();
}
Thanks in advance.
Well, the BufferedReader only reads stuff, it doesn't have the ability to write data back out. So, what you would need is a BufferedReader to get stuff in, and a BufferedWriter that takes all the input from the BufferedReader, and outputs it to a temp file, with the corrected/appended data.
Then, when you're done (i.e. both BufferedReader and BufferedWriter streams are closed), you need to either discard the original file, or rename the temp file to the name of the original file.
You are basically copying the original file to a temp file, modifying the line in question in the temp file's output, and then copying/renaming the temp file over the original.
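The copy-to-temp-then-rename approach could look roughly like this. It is only a sketch: TempFileRewriter and LineFixer are names invented for the example, UTF-8 is an assumed encoding, and the per-line rule is left to the caller.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TempFileRewriter {
    /** Hypothetical per-line rule: returns the text to write in place of the input line. */
    interface LineFixer {
        String fix(String line);
    }

    static void rewrite(Path original, LineFixer fixer) throws IOException {
        // Create the temp file next to the original so the final move stays on one filesystem
        Path temp = Files.createTempFile(original.toAbsolutePath().getParent(), "rewrite", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(original, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(fixer.fix(line));
                out.newLine();
            }
        }
        // Both streams are closed here; replace the original with the corrected copy
        Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Since the value in your case depends on a later line, the fixer alone is probably not enough: you would likely hold the pending line (and the lines after it) in a list until the deciding record shows up, then write them all out.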
OK, I see a problem in your code, on exactly these lines:
recordType=lineC.substring(0,3);//Gets current Record Type
if (recordType.equals("35")){
On the first line, you take a substring of lineC and store it in recordType, so recordType has length 3. If lineC has fewer than 3 characters, substring throws a StringIndexOutOfBoundsException. So whenever no runtime exception occurs, recordType has length 3, and on the next line you compare it, using equals, with a string of 2 characters.
Will this if block ever run?
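A small sketch of one way to make the comparison work, assuming the record type really is the first two characters of the line (I am guessing at your record layout; RecordTypeCheck is just a name for the example):

```java
final class RecordTypeCheck {
    /** Returns the first two characters of the line, or the whole line if it is shorter. */
    static String recordType(String line) {
        // Guarding on the length avoids StringIndexOutOfBoundsException for short lines
        return line.length() >= 2 ? line.substring(0, 2) : line;
    }
}
```

With this, RecordTypeCheck.recordType(lineC).equals("35") can be true, whereas lineC.substring(0,3).equals("35") never is. lineC.startsWith("35") would do the same job without any substring at all.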
I'm reading numbers from a txt file using BufferedReader for analysis. The way I'm going about this now is: read a line using readLine(), then split that string into an array of strings using split().
public InputFile () {
fileIn = null;
//stuff here
fileIn = new FileReader((filename + ".txt"));
buffIn = new BufferedReader(fileIn);
return;
//stuff here
}
public String ReadBigStringIn() {
String line = null;
try { line = buffIn.readLine(); }
catch(IOException e){};
return line;
}
public ProcessMain() {
initComponents();
String[] stringArray;
String line;
try {
InputFile stringIn = new InputFile();
line = stringIn.ReadBigStringIn();
stringArray = line.split("[^0-9.+Ee-]+");
// analysis etc.
}
}
This works fine, but what if the txt file has multiple lines of text? Is there a way to output a single long string, or perhaps another way of doing it? Maybe use while(buffIn.readline != null) {}? Not sure how to implement this.
Ideas appreciated,
thanks.
You are right, a loop would be needed here.
The usual idiom (using only plain Java) is something like this:
public String ReadBigStringIn(BufferedReader buffIn) throws IOException {
StringBuilder everything = new StringBuilder();
String line;
while( (line = buffIn.readLine()) != null) {
everything.append(line);
}
return everything.toString();
}
This removes the line breaks - if you want to retain them, don't use the readLine() method, but simply read into a char[] instead (and append this to your StringBuilder).
Please note that this loop will run until the stream ends (and will block if it doesn't end), so if you need a different condition to finish the loop, implement it in there.
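For the variant that keeps the line breaks, reading into a char[] could be sketched like this (ReaderSlurper and the 4096 buffer size are arbitrary choices for the example):

```java
import java.io.IOException;
import java.io.Reader;

final class ReaderSlurper {
    /** Reads everything from the reader, keeping line breaks exactly as they appear. */
    static String readFully(Reader reader) throws IOException {
        StringBuilder everything = new StringBuilder();
        char[] buffer = new char[4096];
        int count;
        // read() returns the number of chars placed in the buffer, or -1 at end of stream
        while ((count = reader.read(buffer)) != -1) {
            everything.append(buffer, 0, count);
        }
        return everything.toString();
    }
}
```

Only the first count chars of the buffer are appended on each pass, because the last read usually fills the buffer only partially.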
I would strongly advise using a library here, but since Java 8 you can also do this using streams.
try (InputStreamReader in = new InputStreamReader(System.in);
BufferedReader buffer = new BufferedReader(in)) {
final String fileAsText = buffer.lines().collect(Collectors.joining());
System.out.println(fileAsText);
} catch (Exception e) {
e.printStackTrace();
}
Note also that this is quite efficient, as joining uses a StringBuilder internally.
If you just want to read the entirety of a file into a string, I suggest you use Guava's Files class:
String text = Files.toString(new File("filename.txt"), Charsets.UTF_8);
Of course, that's assuming you want to maintain the linebreaks. If you want to remove the linebreaks, you could either load it that way and then use String.replace, or you could use Guava again:
List<String> lines = Files.readLines(new File("filename.txt"), Charsets.UTF_8);
String joined = Joiner.on("").join(lines);
Sounds like you want Apache Commons IO's FileUtils:
String text = FileUtils.readFileToString(new File(filename + ".txt"), "UTF-8");
String[] stringArray = text.split("[^0-9.+Ee-]+");
If you create a StringBuilder, then you can append every line to it, and return the String using toString() at the end.
You can replace your ReadBigStringIn() with
public String ReadBigStringIn() {
StringBuilder b = new StringBuilder();
try {
String line = buffIn.readLine();
while (line != null) {
b.append(line);
line = buffIn.readLine();
}
}
catch(IOException e){};
return b.toString();
}
You have a file containing doubles. Looks like you have more than one number per line, and may have multiple lines.
Simplest thing to do is read lines in a while loop.
You could return null from your ReadBigStringIn method when last line is reached and terminate your loop there.
But more normal would be to create and use the reader in one method. Perhaps you could change to a method which reads the file and returns an array or list of doubles.
BTW, could you simply split your strings by whitespace?
Reading a whole file into a single String may suit your particular case, but be aware that it could cause a memory explosion if your file were very large. A streaming approach is generally safer for such I/O.
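A method that reads line by line and returns a list of doubles might look like the following. This is a sketch that assumes the numbers are separated by whitespace only; NumberFileReader is a made-up name, and a token that is not a number makes Double.parseDouble throw NumberFormatException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class NumberFileReader {
    /** Reads whitespace-separated numbers, one line at a time, into a list. */
    static List<Double> readDoubles(BufferedReader reader) throws IOException {
        List<Double> numbers = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) { // a blank line yields one empty token
                    numbers.add(Double.parseDouble(token));
                }
            }
        }
        return numbers;
    }
}
```

Only one line is held in memory at a time, so the size of the file matters much less than with the single-String approach.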
This creates one long string, with each line preceded by a single space (" "):
public String ReadBigStringIn() {
StringBuffer line = new StringBuffer();
try {
while(buffIn.ready()) {
line.append(" " + buffIn.readLine());
}
} catch(IOException e){
e.printStackTrace();
}
return line.toString();
}
I'm trying to make this work, I don't understand why it doesn't work since it makes sense to me, but it doesn't make sense to java it seems.
As you read the code, what I expect is _NAME to be replaced by TEST while maintaining the same structure of the text (keeping \n) to save it later(not done yet)
I also stored it using ArrayList, but the replace never took off either, so I'm clueless
try {
BufferedReader reader = new BufferedReader (new InputStreamReader (
new FileInputStream (temp), "utf-8"));
String line = reader.readLine();
StringBuffer text = new StringBuffer();
while(line != null) {
line.replace("[_NAME]", "TEST");
Logger.info(line);
line = reader.readLine();
}
reader.close();
} catch(FileNotFoundException ex) {
} catch(UnsupportedEncodingException ex) {
} catch(IOException ex ) {}
The correct line is
line = line.replace("_NAME", "TEST");
Note that String.replace matches its argument literally, so "[_NAME]" only matches text that actually contains the brackets. (With replaceAll, which takes a regular expression, the brackets would instead define a character class matching _, N, A, M or E individually.) You want to replace the whole placeholder.
Second, the replace method returns a new String containing the modified text. Remember that Strings in Java are immutable, so no method can modify the String it is called on; such methods always return a new object.
One possible problem is the fact that you have [] around _NAME, but I'm going to go with the "you forgot that replace returns the new string instead of changing it in-situ" option.
In other words, it should be changed from:
line.replace ( ...
to:
line = line.replace ( ...
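Putting the fix into the reading loop, and keeping one \n per line as the question wants, could be sketched as follows (PlaceholderReplacer and its parameter names are invented for the example):

```java
import java.io.BufferedReader;
import java.io.IOException;

final class PlaceholderReplacer {
    /** Replaces every literal occurrence of placeholder, keeping one '\n' per input line. */
    static String replaceAllLines(BufferedReader reader, String placeholder, String value)
            throws IOException {
        StringBuilder text = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            // replace() returns a new String, so its result is what gets appended
            text.append(line.replace(placeholder, value)).append('\n');
        }
        return text.toString();
    }
}
```

With the reader from the question this would be called as PlaceholderReplacer.replaceAllLines(reader, "_NAME", "TEST"), and the returned String is what you would later save.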