When I run the following code:
int i = 0;
try {
    FileWriter fstream = new FileWriter(filename);
    BufferedWriter out = new BufferedWriter(fstream);
    while (i < 100) {
        out.write("My Name is Bobby Bob");
        out.write(i);
        out.newLine();
        i++;
    }
    out.flush();
    out.close();
} catch (IOException e) {
    e.getClass();
}
I get the following in my output file:
My Name is Bobby Bob
x100
Each line is followed by a weird symbol: a male sign, a female sign, and so on.
My question is more of a curious one: what causes these weird symbols to appear? I was expecting numbers as it counted up. Where are these symbols pulled from?
Is out a RandomAccessFile?
I think you are calling write(int) instead of write(String), so you are writing the single character whose code is i; see an ASCII table for the representations.
Try
write(""+i);
Looking at the BufferedWriter Java API:
http://download.oracle.com/javase/1.4.2/docs/api/java/io/BufferedWriter.html#write(int)
it says that write(int) writes a single character specified by the given int.
If you want to print the character '0' you have to write 48, its ASCII code.
When you write
out.write(i);
it writes the character (char) i, not the number in i as text.
If you want to write i as a number, convert it to text first:
out.write(String.valueOf(i));
or
out.write("" + i);
(or wrap the writer in a PrintWriter and use print(i)).
It looks like you are writing single characters that are specified by the given int value, and not a character representation of the variable i.
The same trap exists with System.out, which is a PrintStream: the PrintStream.write(int) method writes a single byte with the specified value. So you're not writing the integers 0 through 99 as text, you're writing the bytes 0 through 99. You probably want print(int) instead of write(int).
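A minimal sketch of one possible fix, assuming the goal is one numbered line per iteration (the file name and loop bounds are illustrative, not taken from the question):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class NumberedLines {
    public static void main(String[] args) throws IOException {
        // try-with-resources closes the writer even if an exception is thrown
        try (BufferedWriter out = new BufferedWriter(new FileWriter("output.txt"))) {
            for (int i = 0; i < 100; i++) {
                out.write("My Name is Bobby Bob ");
                out.write(Integer.toString(i)); // write the digits, not the char with code i
                out.newLine();
            }
        }
    }
}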
Related
I'm trying to write code that picks up a word from a file according to an index entered by the user, but the method readChar() from the RandomAccessFile class is returning Japanese characters. I must admit it's not the first time I've seen this on my Lenovo laptop; sometimes installation wizards show normal characters mixed with Japanese characters. Do you think it comes from the laptop or from the code?
This is the code:
package com.project;

import java.io.*;
import java.util.StringTokenizer;

public class Main {
    public static void main(String[] args) throws IOException {
        int N, i = 0;
        char C;
        char[] charArray = new char[100];
        String fileLocation = "file.txt";
        BufferedReader buffer = new BufferedReader(new InputStreamReader(System.in));
        do {
            System.out.println("enter the index of the word");
            N = Integer.parseInt(buffer.readLine());
            if (N != 0) {
                RandomAccessFile word = new RandomAccessFile(new File(fileLocation), "r");
                do {
                    word.seek((2 * (N - 1)) + i);
                    C = word.readChar();
                    charArray[i] = C;
                    i++;
                } while (charArray[i - 1] != ' ');
                System.out.println("the word of index " + N + " is: ");
                for (char carTemp : charArray)
                    System.out.print(carTemp);
                System.out.print("\n");
            }
        } while (N != 0);
        buffer.close();
    }
}
I get this output:
瑯潕啰灰灥敲牃䍡慳獥攨⠩⤍ഊੴ瑯潌䱯潷睥敲牃䍡慳獥攨⠩⤍ഊ捯潭浣捡慴琨⡓却瑲物楮湧朩⤍ഊ捨桡慲牁䅴琨⡩楮湴琩⤍ഊੳ獵畢扳獴瑲物楮湧木⠠獴瑡慲牴琠楮湤摥數砬Ⱐ敮湤搠楮湤摥數砩⤍ഊੴ瑲物業洨⠩Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 100 out of bounds for length 100
at Main.main(Main.java:21)
There are many things wrong, all of which have to do with fundamental misconceptions.
First off: A file on your disk - never mind the File class in Java, or any other programming language; the file itself - does not and cannot store text. Ever. It stores bytes. That is, raw data, as (on every machine that's been relevant for decades, but historically there have been other ways to do it) quantified in bits, which are organized into groups of 8 that are called bytes.
Text is an abstraction; an interpretation of some particular sequence of byte values. It depends - fundamentally and unavoidably - on an encoding. Because this isn't a blog, I'll spare you the history lesson here, but suffice to say that Java's char type does not simply store a character of text. It stores an unsigned two-byte value, which may represent a character of text. Because there are more characters of text in Unicode than two bytes can represent, sometimes two adjacent chars in an array are required to represent a character of text. (And, of course, there is probably code out there that abuses the char type simply because someone wanted an unsigned equivalent of short. I may even have written some myself. That era is a blur for me.)
Anyway, the point is: using .readChar() is going to read two bytes from your file, and store them into a char within your char[], and the corresponding numeric value is not going to be anything like the one you wanted - unless your file happens to be encoded using the same encoding that Java uses natively, called UTF-16.
You cannot properly read and interpret the file without knowing the file encoding. Full stop. You can at best delude yourself into believing that you can read it. You also cannot have "random access" to a text file - i.e., indexing according to a number of characters of text - unless the encoding in question is constant width. (Otherwise, of course, you can't just calculate the distance-in-bytes into the file where a given character of text is; it depends on how many bytes the previous characters took up, which depends on which characters they are.) Many text encodings are not constant width. One of the most popular, UTF-8, which frankly is the sane default recommendation for most tasks these days, is not. In which case you are simply out of luck for the problem you describe.
At any rate, once you know the encoding of your file, the expected way to retrieve a character of text from a file in Java is to use one of the Reader classes, such as InputStreamReader:
An InputStreamReader is a bridge from byte streams to character streams: It reads bytes and decodes them into characters using a specified charset. The charset that it uses may be specified by name or may be given explicitly, or the platform's default charset may be accepted.
(Here, charset simply means an instance of the class that Java uses to represent text encodings.)
You may be able to fudge your problem description a little bit: seek to a byte offset, and then grab the text characters starting at that offset. However, there is no guarantee that the "text characters starting at that offset" make any sense, or in fact can be decoded at all. If the offset happens to be in the middle of a multi-byte encoding for a character, the remaining part isn't necessarily valid encoded text.
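As a rough sketch of that approach, assuming the file is UTF-8 (the path, charset, and word index below are placeholders, not taken from the question):

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class ReadWords {
    public static void main(String[] args) throws IOException {
        StringBuilder text = new StringBuilder();
        // Decode the bytes with an explicit charset instead of seeking into the raw file
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream("file.txt"), StandardCharsets.UTF_8))) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        String[] words = text.toString().split("\\s+");
        int n = 3; // 1-based word index entered by the user (hypothetical value)
        System.out.println("the word of index " + n + " is: " + words[n - 1]);
    }
}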
char is 16 bits, i.e. 2 bytes.
seek seeks to a byte boundary.
If the file contains chars then they are at even offsets: 0, 2, 4...
The expression (2*(N-1))+i is even iff i is even; if it is odd, you are sure to land in the middle of a char, and thus read garbage.
i starts at zero, but you increment it by 1, i.e., by half a character at a time.
Your seek argument should probably be (2*(N-1+i)).
Alternative explanation: your file does not contain chars at all; for example, you created an ASCII file in which a character is a single byte.
In that case, the error is attempting to read a single-byte encoding such as ASCII with readChar, which always consumes two bytes.
But if the file contains ASCII, the purpose of multiplying by 2 in the seek argument is obscure. It apparently serves no useful purpose.
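To make the two-bytes-per-char point concrete, here is a small sketch of what readChar() actually does, assuming a UTF-16BE file (big-endian, two bytes per character, no BOM); the file name and offset are placeholders:

import java.io.IOException;
import java.io.RandomAccessFile;

public class ReadCharDemo {
    public static void main(String[] args) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile("utf16be.txt", "r")) {
            raf.seek(2 * 3);            // characters live at even byte offsets: 0, 2, 4, ...
            char c = raf.readChar();    // reads two bytes: (firstByte << 8) | secondByte
            System.out.println(c);      // prints the 4th character of the file
        }
    }
}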
I changed the encoding of the file to UTF-16 and modified the program to display the right indexes (those that represent the beginning of each word). Now it works fine. Thank you guys.
import java.io.*;

public class Main {
    public static void main(String[] args) throws IOException {
        int N, i = 0, j = 0, k = 0;
        char C;
        char[] charArray = new char[100];
        String fileLocation = "file.txt";
        BufferedReader buffer = new BufferedReader(new InputStreamReader(System.in));
        DataInputStream in = new DataInputStream(new FileInputStream(fileLocation));
        boolean EOF = false;
        do {
            try {
                j++;
                C = in.readChar();
                if ((C == ' ') || (C == '\n')) {
                    System.out.print(j + 1 + "\t");
                }
            } catch (IOException e) {
                EOF = true;
            }
        } while (EOF != true);
        System.out.println("\n");
        do {
            System.out.println("enter the index of the word");
            N = Integer.parseInt(buffer.readLine());
            if (N != 0) {
                RandomAccessFile word = new RandomAccessFile(new File(fileLocation), "r");
                do {
                    word.seek(2 * (N - 1 + i));
                    C = word.readChar();
                    charArray[i] = C;
                    i++;
                } while (charArray[i - 1] != ' ' && charArray[i - 1] != '\n');
                System.out.print("the word of index " + N + " is: ");
                for (char carTemp : charArray)
                    System.out.print(carTemp);
                System.out.print("\n");
                i = 0;
                charArray = new char[100];
            }
        } while (N != 0);
        buffer.close();
    }
}
I'm trying to connect to a PHP script on a server and retrieve the text the script echoes. To accomplish this, I used the following code.
Code:
import java.net.*;
import java.io.*;

class con {
    public static void main(String[] args) {
        try {
            int c;
            URL tj = new URL("http://www.thejoint.cf/test.php");
            URLConnection tjcon = tj.openConnection();
            InputStream input = tjcon.getInputStream();
            while ((c = input.read()) != -1) {
                System.out.print((char) c);
            }
            input.close();
        } catch (Exception e) {
            System.out.println("Caught this Exception: " + e);
        }
    }
}
I do get the desired output, the text "You will be Very successful". But when I remove the (char) cast it yields a 76-digit-long
8911111732119105108108329810132118101114121321151179999101115115102117108108
number which I'm not able to make sense of. I read that getInputStream gives a byte stream, so shouldn't the output then be eight digits (bits) per byte?
Any insight would be very helpful. Thank you.
It does not print one number 76 digits long. You have a loop there; it prints many numbers, each up to three digits long (one byte each).
In ASCII, 89 = "Y", 111 = "o" ....
The cast to char that you removed interpreted each number as a character code and printed the corresponding character instead (also one at a time).
This way of reading text byte by byte is very fragile. It basically only works with ASCII. You should be using a Reader to wrap the InputStream. Then you can read char and String directly (and it will take care of character sets such as Unicode).
Oh I thought it would give out the byte representation of the individual letter.
But that's exactly what it does.
You can see it more clearly if you use println instead of print (then it will print each number on its own line).
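A minimal sketch of the Reader-based approach suggested above; the charset is an assumption (the server's actual encoding may differ), while the URL is the one from the question:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class ConReader {
    public static void main(String[] args) throws IOException {
        URLConnection connection = new URL("http://www.thejoint.cf/test.php").openConnection();
        // Wrap the byte stream in a Reader so the bytes are decoded into text for us
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}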
I have the following code
public static void main(String aed[]) {
    double d = 17.3;
    try {
        DataOutputStream out = null;
        out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream("new.txt")));
        out.writeDouble(d);
        out.flush();
    } catch (FileNotFoundException fnf) {
        fnf.printStackTrace();
    } catch (IOException io) {
        io.printStackTrace();
    }
}
Now I am writing this double value to the text file new.txt, but the following value ends up in the file:
#1LÌÌÌÌÍ
But when I use
out.writeUTF(""+d)
It works fine.
Please explain the encoding that is going on here.
In Java there are generally two kinds of variables, namely reference types and primitive types.
The primitive types are int, double, byte, char, boolean, long, short and float. Each stores a single value of a fixed size (char, for example, is a 16-bit UTF-16 code unit).
Reference types hold references to objects (String is a reference type), which is why writing a String produces readable text.
A binary file is not meant to be read by you but by a program that will fetch the values in the correct form and order. The methods you are using should be used solely for writing to a binary file (e.g. .dat), which holds actual data values in their respective forms (int, double, etc.). When writing to a text file (.txt), only text should be written, hence strings.
Writing to a text file:
try {
    // FileWriter with append = true; PrintWriter itself has no (String, boolean) constructor
    PrintWriter write = new PrintWriter(new FileWriter("your filepath", true));
    write.println("whatever needs to be written");
    write.close();
} catch (IOException e) {
    e.printStackTrace();
}
Reading:
try {
    Scanner read = new Scanner(new FileReader("your path"));
    while (read.hasNextLine()) {
        System.out.println(read.nextLine());
    }
    read.close();
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
With DataOutputStream you are writing bytes, the bytes that represent a double value (which is a number value) and not the readable version of that number.
Example:
int i = 8;
In binary, the value of i is 1000, and that's the value the computer manages. But you don't want to write the bits 1000, because you want something readable, not the raw value; you want the CHARACTER '8'. So you must transform the double to characters (to a String is also valid, because it is readable).
And that's what you are doing with ("" + d): transforming it to a String.
Use a Writer to write text files (BufferedWriter and FileWriter are available; see the java.io documentation for more details).
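A small sketch of that text-file route, assuming the goal is a human-readable new.txt (the file name follows the question; the rest is illustrative):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class WriteDoubleAsText {
    public static void main(String[] args) throws IOException {
        double d = 17.3;
        try (BufferedWriter out = new BufferedWriter(new FileWriter("new.txt"))) {
            out.write(Double.toString(d)); // writes the characters "17.3", not the raw 8 bytes
            out.newLine();
        }
    }
}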
The writeDouble(double) method does not use UTF-8 encoding. If you have written a double using writeDouble(), you should read it back using the readDouble() method of DataInputStream. These files are not meant to be modified or read manually. If you want plain text, stick to the writeUTF() method.
From the documentation:
writeDouble:
Converts the double argument to a long using the doubleToLongBits method in class Double, and then writes that long value to the underlying output stream as an 8-byte quantity, high byte first.
writeDouble (like writeByte, writeShort, etc., each with its corresponding number of bytes) writes the 8-byte representation of the double value. That's why the class is called DataOutputStream: it writes data, not text.
writeUTF writes 2 bytes of length followed by the actual string.
The java.io.DataOutputStream.writeUTF(String str) method writes two bytes of length information to the output stream, followed by the modified UTF-8 representation of every character in the string str.
writeDouble(double v)
Converts the double argument to a long using the doubleToLongBits
method in class Double, and then writes that long value to the
underlying output stream as an 8-byte quantity, high byte first.
Read the Javadoc:
https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html
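A round-trip sketch of the binary route, assuming you read the file back with DataInputStream rather than a text editor (the .dat file name is illustrative):

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class DoubleRoundTrip {
    public static void main(String[] args) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream("new.dat"))) {
            out.writeDouble(17.3); // 8 bytes, high byte first (doubleToLongBits)
        }
        try (DataInputStream in = new DataInputStream(new FileInputStream("new.dat"))) {
            System.out.println(in.readDouble()); // prints 17.3
        }
    }
}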
How can I write a binary number to a file without it cutting off the leading zeros?
I'm writing it like this:
byte[] b = new BigInteger("1011010101010110", 2).toByteArray();
FileOutputStream fos = new FileOutputStream("file",true);
fos.write(b);
But then, for example, when I write 0000001, it writes just 1 to the file and ignores the zeros. The same happens if I write 001001001000: it ignores the zeros on the left, reading 8 bits at a time from right to left.
What is the correct way to write binary digits to a file? If this is the correct way, I might be reading the file in the wrong way (I'm using read() of InputStream).
PS: 8 digits must be 1 byte, so writing it as a string is not an option, because each character would be 1 byte.
You can try something like this
String s = "0000001";
byte[] a = new byte[s.length()];
for (int i = 0; i < s.length(); i++) {
    a[i] = (byte) (s.charAt(i) & 1);
}
You don't want to write it as binary; you want to write it as a String representing the binary digits. The problem is that Java has no way to know you want it padded. I would suggest converting your binary numbers to a String, then left-padding with 0 (Apache Commons StringUtils will help with this).
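If the requirement really is one byte per 8 digits (as the question's PS says), a rough sketch is to left-pad the bit string to a multiple of 8 yourself and pack each 8-bit group into a byte; the input string and output file name here are placeholders:

import java.io.FileOutputStream;
import java.io.IOException;

public class WriteBits {
    public static void main(String[] args) throws IOException {
        String bits = "001001001000";

        // Left-pad with zeros so the length is a multiple of 8 (BigInteger would drop them)
        int padded = ((bits.length() + 7) / 8) * 8;
        StringBuilder sb = new StringBuilder();
        for (int i = bits.length(); i < padded; i++) {
            sb.append('0');
        }
        sb.append(bits);

        byte[] out = new byte[padded / 8];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(sb.substring(8 * i, 8 * i + 8), 2);
        }

        try (FileOutputStream fos = new FileOutputStream("bits.bin")) {
            fos.write(out); // exactly one byte per 8 binary digits, leading zeros preserved
        }
    }
}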
The method below returns the file size as 2. Since length() returns a long, I'm assuming the file size Java calculates is 2 × 64 bits. But actually I saved a 32-bit int + a 16-bit char = 48 bits. Why does Java do this conversion? Also, does Java implicitly store everything as a long in the file, no matter whether it is a char or an int? How do I get the accurate size of 48 bits?
public static void main(String[] args)
{
    File f = new File("C:/sam.txt");
    int a = 42;
    char c = '.';
    try {
        try {
            f.createNewFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
        PrintWriter pw = new PrintWriter(f);
        pw.write(a);
        pw.write(c);
        pw.close();
        System.out.println("file size:" + f.length());
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}
No. You wrote two characters. Writers are used for textual data, not for binary data. The documentation of write(int) says:
Writes a single character.
Since the default character encoding of your platform stores those two characters as a single byte (each), the file length is 2 (2 bytes: the length of a file is measured in bytes, as the documentation says). Open the file with a text editor, and see what's in there.
The Java API doc is really useful to know what a class or method does. You should read it.
Both calls to write are writing a char, which is 16 bits in memory, but since
new PrintWriter(f)
uses the default character set encoding (probably ASCII or UTF-8 on your system), it results in 2 bytes being written.
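If the goal really is a file holding the raw 32-bit int plus the 16-bit char (6 bytes = 48 bits), a sketch with DataOutputStream would look like the following; note the result is a binary file, not readable text, and the file name is illustrative:

import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class RawSize {
    public static void main(String[] args) throws IOException {
        File f = new File("sam.bin");
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(f))) {
            out.writeInt(42);   // 4 bytes
            out.writeChar('.'); // 2 bytes
        }
        System.out.println("file size: " + f.length()); // prints 6 (bytes), i.e. 48 bits
    }
}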